
Capacity of a pulse amplitude modulated direct detection photon channel

Prof. S. Shamai (Shitz), MSc, DSc

Indexing terms: Detection, Communication systems theory

Abstract: The classical direct detection optical channel is modelled by an observed Poisson process with intensity (rate) λ(t) + λ₀, where λ(t) is the information carrying input waveform and λ₀ represents the dark current. The capacity of this channel is considered within a restricted class of peak power λ(t) ≤ A and average power E(λ(t)) ≤ σ constrained, pulse amplitude modulated, input waveforms. Within this class, where λ(t) = λᵢ during the ith signalling interval iΔ ≤ t < (i + 1)Δ, the symbol duration Δ affects the spectral properties (bandwidth) of λ(t). The capacity achieving distribution of the symbols {λᵢ} is determined by setting {λᵢ} to be an independent identically distributed sequence of discrete random variables taking on a finite number of values. The two valued distribution of λ with mass points located at 0 and A is capacity achieving for σ = A (no average power constraint) and λ₀ = 0, in the region 0 < AΔ ≤ 3.3679. In the following region (3.3679 < AΔ < ξ) the ternary distribution is capacity achieving, with the additional mass point arising at 0.3839A.

1 Introduction

Consider a communication system where the message is transmitted by modulating the intensity λ(t) of a photon emitting source. In a direct detection photon channel (also referred to as a Poisson channel), the receiver records the exact arrival times of the individual photons, which follow a Poisson probability law with intensity λ(t) + λ₀, where λ₀ denotes the dark current. These channels are often used to model direct detection optical communication systems [1-14], gaining increased attention in recent years due to their practical importance. The capacity of this direct detection photon channel under peak and average constraints imposed on λ(t) was derived by Wyner [1], Davis [2] and Kabanov [3]. In Reference 1, a simple intuitively appealing method was used, whereas in References 2 and 3 martingale techniques were applied. The error exponent and the construction of specific codes achieving capacity were also reported [1] (see also Reference 15). It was shown [1] that a two level modulation λ(t) is capacity achieving. In all cases examined in References 1-3, no bandwidth-like constraints restricting the rate of variation of λ(t) were imposed.

Paper 76151 (E8, E13), first received 21st December 1988 and in revised form 30th April 1990.
Associate Prof. Shamai (Shitz) is with the Department of Electrical Engineering, Technion-Israel Institute of Technology, Haifa 32000, Israel.

Eventually, it can be shown that the capacity achieving modulating process has a transition in any interval ε > 0 with probability approaching one [1, 10]. This work is concerned with the direct detection photon channel where, in addition to the peak and average power constraints imposed on λ(t), the minimal time interval Δ over which λ(t) must remain constant is specified. A simple piecewise constant pulse amplitude modulated waveform satisfying the constraints is considered. In this case, λ(t) = λᵢ for the ith time interval t ∈ [iΔ, (i + 1)Δ). The achievable information rate with this specific amplitude modulation has been the subject of extensive study [11-14] (and references therein). However, in those previous works certain assumptions were imposed a priori on the probability distribution of λᵢ. Gordon [11] examined the case of no dark current (λ₀ = 0) and no peak limit constraint. He assumed a discrete distribution of λ and in particular examined the two-valued (binary) distribution. It is also shown [11] that asymptotically, for large average power, the capacity achieving distribution is approximated by an exponential probability density function (PDF) (see also Reference 10). The information rate for this specific case was evaluated by Goodwin and Bolgiano [12]. Jodin and Mandel [13] considered the case of no dark current with a peak power constraint imposed on λᵢ. The continuous truncated exponential distribution was used to evaluate the corresponding achievable information rates; it was assumed that this truncated distribution is close to optimal for large peak power values. It was also pointed out [14] that standard variational techniques fail to specify the capacity achieving distribution. Hisdal [14] assumed that λ takes on uniformly spaced equiprobable discrete values. It was numerically shown [14] that, under this assumption, for a given peak power constraint and a given λ₀ there is an optimum finite number of levels maximising the corresponding information rate.

In this work, it is determined that the capacity achieving probability law of the sequence {λᵢ} is specified by choosing {λᵢ} to be an independent identically distributed (IID) sequence where each λᵢ = λ is a discrete random variable that assumes a finite number of values. This result is determined without a priori restricting the class of possible distributions. It is further shown that the two valued distribution of λ is optimal in a certain region 0 < Δ ≤ Δ_c, where Δ_c > 0 is determined by λ₀ and by the peak and the average power constraints. The value of Δ_c is found explicitly for the case of no dark current and no average power constraint. The derivation of the main result relies basically on the method used by Smith [16] to determine the capacity of a scalar peak limited Gaussian channel. Various schemes of pulse position modulation for the direct detection photon channel have also gained increased attention in recent years [4, 8-10] (and references therein), due to their efficiency, especially in cases where a minimum pulse width constraint is imposed [4].
2 Problem formulation and main results

Consider the Poisson channel with an intensity (rate) λ(t) + λ₀, where λ₀ is the dark current intensity and

\lambda(t) = \sum_{i=0}^{\infty} \lambda_i \, U(t - i\Delta), \qquad 0 \le t < \infty   (1)

describes a piecewise constant pulse amplitude modulated waveform. The information carrying symbols are denoted by {λᵢ}, the symbol duration interval is Δ and U(t) stands for a rectangular pulse

U(t) = \begin{cases} 1 & 0 \le t < \Delta \\ 0 & \text{otherwise} \end{cases}   (2)

The specific waveform of eqn. 1 results by restricting the class of signals with a minimum transition interval of Δ to the subclass where the possible transitions take place at predetermined (synchronised) instants iΔ. Imposing peak and average power constraints on λ(t), i.e.

0 \le \lambda(t) \le A, \qquad E\bigl(\lambda(t)\bigr) \le \sigma   (3)

where A and σ (σ ≤ A) stand for the peak power and the average power (the peak and average number of photons per second, respectively), translates by eqn. 1 to the same constraints imposed on each λᵢ:

0 \le \lambda_i \le A, \qquad E\lambda_i \le \sigma, \qquad \forall i   (4)

The statistical average operator is denoted by E. The channel output ν(t) (given λ(t) + λ₀) is a Poisson process with intensity parameter λ(t) + λ₀; therefore ν(t) is an independent increment process satisfying [17]

\nu(0) = 0   (5a)

and, for 0 ≤ τ, t < ∞,

\Pr\{\nu(t + \tau) - \nu(t) = j\} = \frac{e^{-\Lambda}\,\Lambda^j}{j!}, \qquad j = 0, 1, 2, \ldots   (5b)

where

\Lambda = \int_t^{t+\tau} \bigl(\lambda(\xi) + \lambda_0\bigr)\, d\xi   (5c)

We are interested in C, the channel capacity in nats/s:

C = \lim_{T \to \infty} \sup \frac{1}{T}\, I\bigl(\nu_0^T ; \{\lambda_i\}_0^{N(T)}\bigr)   (6)

where I(·;·) denotes the mutual information functional [18], ν₀^T denotes the path ν(t), 0 ≤ t ≤ T, {λᵢ}₀^{N(T)} denotes the sequence {λᵢ}, 0 ≤ i ≤ N(T), with N(T) = ⌈T/Δ⌉, and ⌈·⌉ stands for the closest integer from above. The supremum in eqn. 6 is carried over all probability measures of the sequence {λᵢ}₀^{N(T)} satisfying the constraints of eqn. 4. It is shown in Section 3 that, as expected, the capacity achieving probability measure is a product measure such that the sequence {λᵢ} is composed of IID random variables (λᵢ = λ). Therefore the optimisation problem is reduced to determining F_λ*, the capacity achieving distribution of λ that maximises eqn. 6 out of all distributions F_λ ∈ ℱ(A, σ), where the set ℱ(A, σ) denotes the class of all distributions F_λ having their increase points in the interval [0, A], satisfying ∫₀^A dF_λ = 1 and satisfying the average power constraint ∫₀^A λ dF_λ ≤ σ. By lemma 4 (Appendix 7.1) we conclude that F_λ* exists and is unique. Furthermore, this distribution satisfies the Kuhn-Tucker conditions stated in corollary 1 in Section 3. The set of increase points of F_λ*, which are all located, of course, in the interval [0, A], is denoted by E_λ*. The main result is stated in theorem 1, which is proved in Section 3.

Theorem 1: E_λ* contains a finite set of points.

The direct conclusion is that the capacity achieving inputs {λᵢ} are IID discrete random variables assuming a finite number of values. In Reference 14 it was concluded by numerical examination that, under the assumption of λ being a discrete random variable assuming equally spaced equiprobable levels in the region [0, A] (σ = A), the optimising distribution has a finite number of increase points (the number of levels is finite). Here we have theoretically established that a discrete λ, assuming a finite number of levels, though not equiprobable and equispaced, is the capacity achieving distribution within the class ℱ(A, σ). The specific distribution for any values of the parameters (A, σ, Δ, λ₀) can be found by using standard finite vector optimisation packages, following the method used for the scalar Gaussian channel [16]. The basic efficient algorithm to evaluate the optimal distribution uses corollary 1 of Section 3 to verify optimality in an adaptive programming procedure detailed in Reference 16. A good initial point is AΔ → 0, as in this case the capacity achieving distribution F_λ^b (where the superscript b stands for binary) is two valued [1, 2]:

F_\lambda^b = (1 - \beta)\, S(\lambda) + \beta\, S(\lambda - A)   (7)

where S(u) is a step function

S(u) = \begin{cases} 1 & u \ge 0 \\ 0 & u < 0 \end{cases}   (8)

and where β is given, in this limiting case (AΔ → 0), by q*, the optimal duty cycle of the unconstrained-bandwidth channel [1, 2]:

q^* = \min\left\{ \frac{\sigma}{A},\; \Bigl(1 + \frac{\lambda_0}{A}\Bigr)^{1 + \lambda_0/A} \Bigl(\frac{\lambda_0}{A}\Bigr)^{-\lambda_0/A} e^{-1} - \frac{\lambda_0}{A} \right\}   (9)

The corresponding capacity [1, 2] is then given by

C = A\Bigl\{ q^*\Bigl(1 + \frac{\lambda_0}{A}\Bigr)\log\Bigl(1 + \frac{\lambda_0}{A}\Bigr) + (1 - q^*)\frac{\lambda_0}{A}\log\frac{\lambda_0}{A} - \Bigl(q^* + \frac{\lambda_0}{A}\Bigr)\log\Bigl(q^* + \frac{\lambda_0}{A}\Bigr) \Bigr\}   (10)

with natural logarithms used henceforth. Of special interest is the threshold value Δ_c, a function of A and σ, such that for any Δ ≤ Δ_c the two valued distribution of eqn. 7 is optimal. Note that β, if optimally chosen, is not necessarily equal to q* (as in general AΔ does not tend to 0). This threshold value Δ_c is determined by that Δ for which the two valued distribution (eqn. 7) ceases to satisfy corollary 1, which specifies the necessary and sufficient conditions for optimality. The calculations are extremely simplified in the case of no dark current, λ₀ = 0, and no average power constraint, σ = A. In this specific case β_opt, denoting the optimal β, satisfies

\beta_{\mathrm{opt}} = \Bigl[ e^{A\Delta/(e^{A\Delta} - 1)} + \bigl(1 - e^{-A\Delta}\bigr) \Bigr]^{-1}   (11)

and I(F_λ^b), the related mutual information per channel use, equals

I(F_\lambda^b) = \log\Bigl[ 1 + \bigl(1 - e^{-A\Delta}\bigr)\, e^{-A\Delta/(e^{A\Delta} - 1)} \Bigr]   (12)

Eqns. 11 and 12, derived in Section 3, agree with those reported in Reference 7. Of course, for AΔ → 0,

\beta_{\mathrm{opt}} \to q^* = e^{-1}

and

C = \lim_{\Delta \to 0} \frac{1}{\Delta}\, I(F_\lambda^b) = \frac{A}{e}

as expected from eqn. 10 on substituting λ₀ = 0 and σ = A.
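As a quick numerical check of eqns. 11 and 12 (this snippet and its naming are mine, not part of the original paper), the script below evaluates β_opt and I(F_λ^b) for several values of AΔ and confirms the limiting behaviour C → A/e quoted above.

```python
import math

def beta_opt(m):
    """Optimal probability of the 'on' level A for the binary input (eqn. 11); m = A*Delta."""
    return 1.0 / (math.exp(m / math.expm1(m)) + 1.0 - math.exp(-m))

def I_binary(m):
    """Mutual information per channel use, in nats, of the optimised binary input (eqn. 12)."""
    return math.log(1.0 + (1.0 - math.exp(-m)) * math.exp(-m / math.expm1(m)))

for m in (0.01, 1.0, 3.3679, 10.0):
    print(f"A*Delta = {m:7.4f}  beta_opt = {beta_opt(m):.4f}  "
          f"I = {I_binary(m):.4f} nats  I/(A*Delta) = {I_binary(m) / m:.4f}")

# As A*Delta -> 0: beta_opt -> 1/e and C = I/Delta -> A/e, i.e. I/(A*Delta) -> 1/e = 0.3679.
# As A*Delta -> infinity: beta_opt -> 0.5 and I -> log 2 per channel use.
```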


At the other extreme, AΔ ≫ 1, it is easily verified that β_opt → 0.5 and C → Δ⁻¹ log 2, as is also expected since no more than log 2 nats of information can be transmitted per channel use (each of Δ seconds) using a two valued distribution. Of course, for AΔ ≫ 1 this two valued distribution is no longer optimal. In the case of no dark current (λ₀ = 0) and no average power constraint (σ = A), the two valued distribution F_λ^b (eqn. 7) with β = β_opt (eqn. 11) is capacity achieving in the region

0 < A\Delta \le 3.3679   (13)

The capacity achieving distribution F_λ^t in the region 3.3679 ≤ AΔ < ξ, where ξ > 3.3679, is ternary (the superscript t stands for ternary):

F_\lambda^t = (1 - \beta_1 - \beta_2)\, S(\lambda) + \beta_1\, S(\lambda - 0.3839A) + \beta_2\, S(\lambda - A)   (14)

where β₁, β₂, 1 − β₁ − β₂ > 0. Eqn. 14 shows that the location of the additional mass point is at 0.3839A, which proves that equispaced mass points are not optimal in general. Unfortunately, analytical closed form expressions for ξ, the upper limit of the values of AΔ for which the ternary distribution is still optimal, and for the specific weights β₁, β₂ of the nonnull mass points, are difficult to find. However, standard computer optimisation procedures are directly applicable to calculate the optimal distribution F_λ* for any value of AΔ. It is self evident that the binary distribution F_λ^b and the ternary one F_λ^t are still capacity achieving also in cases of a 'trivial' average power constraint, meaning a constraint that is already satisfied by these distributions. In the limiting case AΔ ≫ 1, it is expected that the capacity achieving distribution will be close to a uniform one and that capacity will behave logarithmically with AΔ [14]. Where σΔ ≫ 1 and no effective peak constraint is imposed, that is A/σ → ∞, the capacity achieving distribution approaches the exponential law [11], and the capacity is approximated by the expression calculated in Reference 12. When only an average power constraint is imposed (AΔ = ∞, while σΔ is finite), it is not known whether the capacity achieving distribution remains discrete.
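One way to realise the 'standard computer optimisation procedures' mentioned above is sketched below (my own code and naming, not the adaptive procedure of Reference 16): the Blahut-Arimoto iteration applied to the per-symbol Poisson law p(n|λ) of eqn. 16 (Section 3), with the input restricted to a fine grid on [0, A], the output count truncated at a large value, λ₀ = 0 and no average power constraint (σ = A). For AΔ below roughly 3.37 the returned mass clusters around the two levels 0 and A; just above the threshold a third cluster appears near 0.38A.

```python
import math
import numpy as np

def poisson_pmf_row(mu, nmax):
    """Poisson(mu) probabilities for n = 0..nmax (mu = 0 gives a point mass at n = 0)."""
    if mu == 0.0:
        row = np.zeros(nmax + 1)
        row[0] = 1.0
        return row
    n = np.arange(nmax + 1)
    return np.exp(-mu + n * math.log(mu) - np.array([math.lgamma(k + 1) for k in n]))

def blahut_arimoto(A, delta, lam0=0.0, n_levels=201, nmax=80, iters=3000):
    """Capacity-achieving input law over a fine grid of intensities in [0, A] (sigma = A)."""
    levels = np.linspace(0.0, A, n_levels)
    W = np.vstack([poisson_pmf_row((lv + lam0) * delta, nmax) for lv in levels])  # p(n | level)
    q = np.full(n_levels, 1.0 / n_levels)             # current input distribution on the grid
    for _ in range(iters):
        p_out = q @ W                                  # P_y(n, F), cf. eqn. 19
        ratio = np.where(W > 0.0, W / p_out, 1.0)      # avoid log(0) where p(n | level) = 0
        i_lev = np.sum(W * np.log(ratio), axis=1)      # information density, cf. eqn. 21
        q = q * np.exp(i_lev)                          # Blahut-Arimoto multiplicative update
        q /= q.sum()
    return levels, q, float(q @ i_lev)                 # mutual information, nats per channel use

levels, q, I = blahut_arimoto(A=1.0, delta=3.0)        # A*Delta = 3.0, below the 3.3679 threshold
I_eqn12 = math.log(1 + (1 - math.exp(-3.0)) * math.exp(-3.0 / math.expm1(3.0)))
print("I =", round(I, 4), "nats per use (eqn. 12 gives", round(I_eqn12, 4), ")")
print("grid points carrying more than 0.1% of the mass:", levels[q > 1e-3])
```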

3 Derivation of results

It is easily shown [1, 17] that the observables yᵢ,

y_i = \nu\bigl((i + 1)\Delta\bigr) - \nu(i\Delta), \qquad i = 0, 1, 2, \ldots   (15)

form sufficient statistics here, enabling maximum likelihood decoding of the transmitted message. From eqn. 5 it is clear that, for a given sequence {λᵢ}, the corresponding yᵢ are statistically independent Poisson random variables, where the conditional probability density function p_{yᵢ|λᵢ}(n) equals

p_{y_i|\lambda_i}(n) = \Pr(y_i = n \mid \lambda_i) = \frac{e^{-\Lambda_i}\,\Lambda_i^n}{n!}, \qquad n = 0, 1, 2, \ldots   (16)

and where

\Lambda_i = (\lambda_i + \lambda_0)\Delta

The capacity C is given by

C = \lim_{N \to \infty} \sup \frac{1}{N\Delta}\, I\bigl(\{y_i\}_0^N ; \{\lambda_i\}_0^N\bigr)   (17)

where the supremum is carried out over all distributions of {λᵢ}₀^N satisfying the constraints specified by eqn. 4. Since yᵢ depends only on λᵢ (eqn. 16), using the same arguments as in theorem 4.2.1 of Reference 18, page 75, and symmetry, capacity is shown to be achieved by {λᵢ} being an IID random sequence. Therefore C is given by

C = \Delta^{-1} \sup_{F_\lambda \in \mathcal{F}(A, \sigma)} I(y : \lambda)   (18)

where the sup is carried over all F_λ, the distributions of the random variable λ, in the class ℱ(A, σ). The conditional PDF of y given λ is determined by eqn. 16 with the index i omitted. The unconditional PDF of y is denoted by P_y(n, F_λ),

P_y(n, F_\lambda) = \int_0^A \frac{e^{-(\lambda + \lambda_0)\Delta}\,[(\lambda + \lambda_0)\Delta]^n}{n!}\, dF_\lambda   (19)

explicitly emphasising the functional dependence of this distribution on F_λ. The mutual information per channel use I(y : λ), denoted also by I(F_λ), equals

I(F_\lambda) \triangleq I(y : \lambda) = H(y) - H(y \mid \lambda)
= -\sum_{n=0}^{\infty} P_y(n, F_\lambda) \log P_y(n, F_\lambda) + \int_0^A \sum_{n=0}^{\infty} p_{y|\lambda}(n) \log p_{y|\lambda}(n)\, dF_\lambda   (20)

where H(·) and H(·|·) are the unconditional and conditional entropies, respectively [18]. Interchange of the integral and the summation in eqn. 20, which is justified by the finiteness of i(λ, F_λ), yields

i(\lambda, F_\lambda) = \sum_{n=0}^{\infty} \frac{e^{-(\lambda + \lambda_0)\Delta}\,[(\lambda + \lambda_0)\Delta]^n}{n!}\, \log \frac{e^{-(\lambda + \lambda_0)\Delta}\,[(\lambda + \lambda_0)\Delta]^n}{n!\; P_y(n, F_\lambda)}   (21)

and, by Fubini's theorem,

I(F_\lambda) = \int_0^A i(\lambda, F_\lambda)\, dF_\lambda   (22)

where eqn. 18 is rewritten to emphasise explicitly the dependence of I on F_λ. The method used to determine the supremising distribution, denoted by F_λ*, is based on the technique used by Smith to find the capacity of a scalar Gaussian channel with peak and average power constrained inputs [16]. Where the proofs follow straightforwardly from Smith's results [16], only short notes are given. In Appendix 7.1, five lemmas following those stated by Smith [16] are presented, with short proofs emphasising the points relevant to our discussion.
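For concreteness, the following sketch (my own naming; the output alphabet is truncated at a finite count, as discussed in Section 4) evaluates the quantities of eqns. 19-22 for a discrete candidate distribution with mass points x_k and weights p_k, including a nonzero dark current.

```python
import math
import numpy as np

def poisson_pmf(n, mu):
    """Poisson(mu) pmf on the integer array n (mu = 0 is a point mass at 0)."""
    if mu == 0.0:
        return (n == 0).astype(float)
    return np.exp(-mu + n * math.log(mu) - np.array([math.lgamma(k + 1) for k in n]))

def channel_quantities(points, weights, delta, lam0=0.0, nmax=100):
    """P_y(n, F) of eqn. 19, i(lambda_k, F) of eqn. 21 at the mass points, and I(F) of eqn. 22."""
    n = np.arange(nmax + 1)
    W = np.vstack([poisson_pmf(n, (x + lam0) * delta) for x in points])   # p(n | lambda_k)
    p_y = weights @ W                                                     # eqn. 19
    ratio = np.where(W > 0.0, W / p_y, 1.0)
    i_k = np.sum(W * np.log(ratio), axis=1)                               # eqn. 21, truncated sum
    return p_y, i_k, float(weights @ i_k)                                 # eqn. 22

# example: binary input {0, A} with weight beta on A, and a nonzero dark current
A, delta, lam0, beta = 1.0, 2.0, 0.1, 0.4
p_y, i_k, I = channel_quantities(np.array([0.0, A]), np.array([1.0 - beta, beta]), delta, lam0)
print("i(0, F) =", round(i_k[0], 4), "  i(A, F) =", round(i_k[1], 4), "  I(F) =", round(I, 4), "nats")
```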

The following corollary is a consequence of these lemmas.

Corollary 1: F_λ* is a supremising distribution if and only if, for some δ ≥ 0, the following conditions are satisfied:

i(\lambda, F_\lambda^*) \le I(F_\lambda^*) + \delta(\lambda - \sigma) \qquad \forall \lambda \in [0, A]   (24)

i(\lambda, F_\lambda^*) = I(F_\lambda^*) + \delta(\lambda - \sigma) \qquad \forall \lambda \in E_\lambda^*   (25)

where E_λ* denotes the set of increase points of F_λ*.

Proof: This corollary states the 'Kuhn-Tucker' conditions and the proof is classical and identical to that appearing in Reference 16. It is easy to show [16] that, if eqns. 24 and 25 are not satisfied, another probability distribution F_λ ≠ F_λ*, constructed in the same way as in Reference 16, satisfies I(F_λ) > I(F_λ*).

On the basis of these results theorem 1 can be proved.

Proof of Theorem 1: The function i(λ, F_λ) (eqn. 21) is analytic in λ (viewed as a complex variable λ ∈ ℂ, the complex plane) in the region D = {λ : Re λ > −λ₀}. This is readily seen, as

\frac{e^{-(\lambda + \lambda_0)\Delta}\,[(\lambda + \lambda_0)\Delta]^n}{n!}

as well as log(λ + λ₀) are analytic functions of λ in the region D, for any distribution F_λ in the class ℱ(A, σ) [19]. Convergence of the sum defining i(λ, F_λ) (eqn. 21) is guaranteed, as i(λ, F_λ) and I(F_λ) (eqn. 22) are finite. Assume now that E_λ contains an infinite number of points in the interval [0, A] and that λ₀ > 0. By the Bolzano-Weierstrass theorem [20], E_λ must have a limit point in the interval [0, A], since all points of E_λ are located in the interval [0, A]. By the identity theorem of analytic functions [19], if two analytic functions (in some region D) agree on an infinite set of points in D along with their limit point (that is also located in D), then these functions are identical in the whole region D. By corollary 1 we conclude that

i(\lambda, F_\lambda^*) = I(F_\lambda^*) + \delta(\lambda - \sigma) = \delta\lambda + d_1 \qquad \forall \lambda \in D   (26)

where d₁ = I(F_λ*) − δσ is a constant (not a function of λ). In Appendix 7.3 we show that eqn. 26 cannot be satisfied even for real λ > −λ₀, hence reaching a contradiction, which concludes the proof for the case λ₀ > 0. For λ₀ = 0, the limit point can be at λ = 0, where i(λ, F_λ) is not necessarily analytic. Therefore, in principle, the capacity achieving distribution must be discrete but may have a countably infinite number of increase points in the interval [0, ε), where ε > 0 can be made as small as desired. This case is ruled out by observing first that capacity is a strictly decreasing function of λ₀. We know that I(F_λ) is a continuous function of F_λ (lemma 3, Appendix 7.1), and also that for any λ₀ > 0 the capacity achieving distribution is unique and discrete with a finite number of increase points. If the hypothetical distribution with a countably infinite number of points (having a limit point at zero) had been capacity achieving at λ₀ = 0, then, for a given ε, a value of λ₀ > 0 could be found small enough such that the capacity achieving distribution for this λ₀ > 0 would have contained an arbitrarily large number of increase points in the interval (0, ε). This is clearly a contradiction, by which the proof of theorem 1 is completed for any λ₀ ≥ 0.

Consider now the case of no dark current, λ₀ = 0, and assume that λ is a two valued random variable distributed according to F_λ^b (eqn. 7), so that P_y(n, F_λ^b) = (1 − β)δ_{n0} + βe^{-AΔ}(AΔ)^n/n!, where δ_{n0} denotes the Kronecker delta function. In this case i(λ, F_λ^b) (eqn. 21) is easily calculated to be

i(\lambda, F_\lambda^b) = \lambda\Delta \log\frac{\lambda}{eA} - \bigl(1 - e^{-\lambda\Delta}\bigr)\log\bigl(\beta e^{-A\Delta}\bigr) - e^{-\lambda\Delta}\log\bigl(1 - \beta(1 - e^{-A\Delta})\bigr)   (27)

I(F_λ^b) is easily found by substituting i(λ, F_λ^b) into eqn. 22:

I(F_\lambda^b) = -\beta\bigl(1 - e^{-A\Delta}\bigr)\log\beta - \beta (A\Delta)\, e^{-A\Delta} - \bigl(1 - \beta(1 - e^{-A\Delta})\bigr)\log\bigl(1 - \beta(1 - e^{-A\Delta})\bigr)   (28)

It is verified by differentiation that the optimal β maximising I(F_λ^b) is given by β_opt (eqn. 11), and the corresponding mutual information is expressed by eqn. 12. It is also verified that I(F_λ^b) (eqn. 28) and i(λ, F_λ^b) (eqn. 27) with β = β_opt satisfy

i(0, F_\lambda^b) = i(A, F_\lambda^b) = I(F_\lambda^b)\big|_{\beta = \beta_{\mathrm{opt}}}   (29)

as is demanded by eqn. 25 of corollary 1. Eqn. 29 can be used as an alternative method of finding β_opt (eqn. 11). Examine now the function

\chi(\lambda, \Delta) = i(\lambda, F_\lambda^b) - I(F_\lambda^b)   (30)

with β = β_opt (eqn. 11) substituted in all the relevant expressions. After some calculations the following results:

\chi(\lambda, \Delta) = \lambda\Delta \log\frac{\lambda}{eA} + \bigl(1 - e^{-\lambda\Delta}\bigr)\frac{A\Delta}{1 - e^{-A\Delta}}   (31)

We note that, as expected (eqn. 29), χ(0, Δ) = χ(A, Δ) = 0. Provided the only points in the interval λ ∈ [0, A] that satisfy eqn. 25 are λ = 0 and λ = A, while for all other points eqn. 24 is valid, it is guaranteed by corollary 1 that F_λ^b is the capacity achieving distribution. The first time F_λ^b ceases to be the optimal distribution occurs when a new point λ_c ∈ (0, A) satisfying χ(λ_c, Δ) = 0 appears. Explicitly, λ_c should satisfy

\lambda_c\Delta \log\frac{\lambda_c}{eA} + \bigl(1 - e^{-\lambda_c\Delta}\bigr)\frac{A\Delta}{1 - e^{-A\Delta}} = 0   (32)

which is interpreted as an equation describing AΔ as a function of λ_c. The minimum possible value of AΔ for which a real solution λ_c ∈ (0, A) exists is found by differentiating:

\frac{\partial}{\partial \lambda}\, \chi(\lambda, \Delta)\Big|_{\lambda = \lambda_c} = 0   (33)

Eqns. 32 and 33 are explicitly written as the following set of nonlinear equations:

\lambda_c\Delta \log\frac{\lambda_c}{eA} + \bigl(1 - e^{-\lambda_c\Delta}\bigr)\frac{A\Delta}{1 - e^{-A\Delta}} = 0   (34a)

\log\frac{\lambda_c}{A} + \frac{A\Delta\, e^{-\lambda_c\Delta}}{1 - e^{-A\Delta}} = 0   (34b)

the numerical solution of which is

A\Delta = 3.3679   (35a)

\lambda_c = 0.3839\, A   (35b)

These values determine exactly the threshold AΔ of eqn. 13 and the new increase point (added to those existing at 0 and A) of the ternary distribution F_λ^t (eqn. 14), which is capacity achieving in the region 3.3679 ≤ AΔ < ξ.
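The numbers in eqn. 35 are easy to reproduce numerically. Writing λ = uA and m = AΔ, eqn. 31 becomes χ = m[u log(u/e) + (1 − e^{−um})/(1 − e^{−m})] (in nats), and the threshold is the smallest m for which this expression touches zero at an interior u. The sketch below (my own code and naming) locates it by bisection on m.

```python
import math
import numpy as np

def chi_over_m(u, m):
    """chi(lambda, Delta)/(A*Delta) from eqn. 31, with u = lambda/A, m = A*Delta, lambda0 = 0."""
    return u * (np.log(u) - 1.0) + (1.0 - np.exp(-u * m)) / (1.0 - math.exp(-m))

U_GRID = np.linspace(1e-4, 1.0 - 1e-4, 20001)       # interior points of (0, A), in units of A

def interior_max(m):
    vals = chi_over_m(U_GRID, m)
    k = int(np.argmax(vals))
    return float(vals[k]), float(U_GRID[k])

# chi <= 0 on (0, A) means the binary law still satisfies corollary 1; bisect on m = A*Delta
lo, hi = 2.0, 5.0                                    # interior max is negative at 2, positive at 5
for _ in range(60):
    mid = 0.5 * (lo + hi)
    if interior_max(mid)[0] > 0.0:
        hi = mid
    else:
        lo = mid
m_star = 0.5 * (lo + hi)
print("threshold A*Delta =", round(m_star, 4))                           # ~ 3.3679  (eqn. 35a)
print("touching point    =", round(interior_max(m_star)[1], 4), "* A")   # ~ 0.3839  (eqn. 35b)
```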
4 Discussion and conclusions

Throughout the discussion on the capacity of the direct detection photon channel we have restricted the input signals λ(t), where λ(t) + λ₀ is the rate of the observed Poisson process (with dark current rate λ₀), to rectangular pulse amplitude modulated waveforms with symbol duration Δ. The parameter Δ⁻¹ is representative of the bandwidth of the input signal. Using these input signals, transitions occur only at predetermined (synchronised) time instants iΔ, where i is an integer. With this restriction and the additional peak power λ(t) ≤ A ∀t and average power E(λ(t)) ≤ σ constraints, it was determined that the capacity achieving inputs {λᵢ} are IID discrete random variables assuming a finite number of values. It was shown that, for no dark current (λ₀ = 0) and no average power constraint (σ = A), the two valued distribution with mass points at the extreme values 0 and A is capacity achieving not only for Δ → 0 (as is well known [1]) but in the whole region 0 < AΔ ≤ 3.3679. It was further determined that, for 3.3679 < AΔ < ξ, the capacity achieving distribution is ternary, having three mass points at the extreme values 0 and A and at an intermediate point 0.3839A. The main technique used to derive the results relies upon properties of analytic functions, and it is similar in principle to that introduced by Smith [16] to evaluate the capacity of a scalar Gaussian channel with both peak and average power constraints imposed on the input signal.

Note that the error probability P_e of an ideal uncoded, two-level modulated, direct detection optical scheme is governed by the quantum limit P_e = (1/2) exp(−AΔ) [21], where we refer specifically to the case of no dark current (λ₀ = 0) and no average power constraint (σ = A). Therefore, operating at low error probabilities, say P_e = 10⁻⁹, for which AΔ ≈ 20, in power limited practical systems (where A is limited in value) implies the use of pulse durations Δ considerably larger than those allowed by the switching capabilities (say Δ_min) of the lasers (or by other bandwidth-like constraints), thus considerably lowering the transmission rate Δ⁻¹. However, when optimally coded systems are considered, the pulse duration (coded channel symbol duration) Δ can be decreased to meet the switching capability Δ_min, providing an increased transmission rate determined ideally by C (eqn. 18). The optimised symbol distribution to be used in an ideally coded system is determined solely by the value AΔ_min (for λ₀ = 0 and σ = A), and for fast enough switching lasers, for which AΔ_min ≤ 3.3679, we have established the important conclusion that signalling with only the two extreme levels 0 and A is optimal.

The elegant techniques exploiting convexity properties of the mutual information functional and related theorems [22] cannot be applied directly in this case because, though the observed counting process is countable, it is not finite with probability 1. Nevertheless, for all practical purposes, for any set of parameters Δ > 0, A > 0 (and finite), λ₀ ≥ 0 and 0 < σ ≤ A, N₀(Δ, A, σ, λ₀) can be chosen large enough that counts larger than N₀ (appearing with vanishingly small probabilities) can be ignored. For example, N₀ = 1 is sufficient for AΔ → 0 [1].

Clearly this has a negligible effect on capacity. However, by Dubins' theorem [22] it is then guaranteed that the capacity achieving distribution is discrete and contains not more than N₀ + 1 mass points. This procedure is similar to quantising the output of a scalar continuous (Gaussian, for example) channel, turning it into a discrete channel with negligible loss in capacity if an optimal, fine enough quantisation is used. In the scalar Gaussian channel it is well known that, under an average power constraint imposed on the input, the capacity achieving distribution is Gaussian [18], whereas if an additional peak power constraint is imposed, the capacity achieving distribution becomes discrete, assuming a finite number of mass points [16]. In the direct detection photon channel where Δ is bounded away from zero, it is not known whether the capacity achieving distribution, for A → ∞ with σ kept finite, is continuous in nature. It is conjectured that in this case also the distribution is discrete, as is demonstrated in the limiting case σΔ → 0 (the corresponding capacity is then infinite). For the other extreme, when σΔ ≫ 1, the optimal distribution is approximated by an exponential PDF [11].

A promising direction for further research concerns the achievable rates of a direct detection photon channel using an extended class of signals satisfying a bandwidth-like constraint, specified (for example) by restricting the minimum time duration between successive signal changes to Δ, or by spectral-like constraints. This class of signals has gained interest in the context of additive white Gaussian channels [23]. A potential advantage of this class of signals, as compared to the (synchronised) pulse amplitude modulated waveforms examined here, is evidenced by results reported in Reference 4 for overlapped pulse position modulation. Bounds on the capacity of the Poisson channel, with either two-level input signals satisfying a minimal intertransition duration constraint, or with general input signals under a predetermined spectral type constraint, were recently reported in References 24 and 25, respectively.
5 Acknowledgment

The author is grateful for the help of H.J. Landau and acknowledges interesting discussions with J.E. Mazo, J. Ziv, A.D. Wyner and H.S. Witsenhausen.
6 References

1 WYNER, A.D.: 'Capacity and error exponent for the direct detection photon channel - Parts I and II', IEEE Trans., 1988, IT-34, pp. 1449-1471
2 DAVIS, M.A.: 'Capacity and cut-off rate for Poisson-type channels', IEEE Trans., 1980, IT-26, pp. 710-715
3 KABANOV, Y.M.: 'The capacity of a channel of the Poisson type', Theory of Probability and Applications, 1978, 23, pp. 143-147
4 BAR-DAVID, I., and KAPLAN, G.: 'Information rates of photon-limited overlapping pulse position modulation channels', IEEE Trans., 1984, IT-30, pp. 455-464
5 MAZO, J.E., and SALZ, J.: 'On optical data communication via direct detection of light pulses', BSTJ, 1976, 55, pp. 347-369
6 SNYDER, D.L., and RHODES, J.B.: 'Some implications of the cutoff-rate criterion for coded direct-detection optical communication channels', IEEE Trans., 1980, IT-26, pp. 327-338
7 PIERCE, J.R., POSNER, E.C., and RODEMICH, E.R.: 'The capacity of the photon counting channel', IEEE Trans., 1981, IT-27, pp. 61-77
8 BAR-DAVID, I.: 'Information in the time of arrival of a photon packet: capacity of PPM channels', JOSA, 1973, 63, pp. 166-170
9 MASSEY, J.L.: 'Capacity, cutoff rate, and coding for a direct-detection optical channel', IEEE Trans., 1981, COM-29, pp. 1615-1621
10 McELIECE, R.J.: 'Practical codes for photon communication', IEEE Trans., 1981, IT-27, pp. 393-398

11 GORDON, J.P.: 'Quantum effects in communications systems', Proc. IRE, 1962, 50, pp. 1898-1908
12 GOODWIN, B.E., and BOLGIANO, L.P., Jr.: 'Information capacity of a photoelectric detector', Proc. IEEE, 1965, 53, pp. 1745-1746
13 JODIN, R., and MANDEL, L.: 'Information rate in an optical communication channel', JOSA, 1971, 61, pp. 191-198
14 HISDAL, E.: 'Information on a photon beam vs. modulation-level spacing', JOSA, 1971, 61, (3), pp. 328-332
15 LANDAU, H.J., and WYNER, A.D.: 'Optimum waveform signal sets with amplitude and energy constraints', IEEE Trans., 1984, IT-30, pp. 615-622
16 SMITH, J.G.: 'The information capacity of amplitude and variance-constrained scalar Gaussian channels', Information and Control, 1971, 18, pp. 203-219 (see also SMITH, J.G.: 'On the information capacity of peak and average power constrained Gaussian channels'. PhD dissertation, Dept. of Electrical Engineering, Univ. of California, Berkeley, California, 1969)
17 SNYDER, D.L.: 'Random point processes' (John Wiley, 1975)
18 GALLAGER, R.G.: 'Information theory and reliable communication' (John Wiley, New York, 1968)
19 KNOPP, K.: 'Theory of functions' (Dover, New York, 1945)
20 BARTLE, R.G.: 'The elements of real analysis' (John Wiley, New York, 1964)
21 SALZ, J.: 'Coherent lightwave communication', AT&T Tech. J., 1985, 64, pp. 2153-2209
22 WITSENHAUSEN, H.S.: 'Some aspects of convexity useful in information theory', IEEE Trans., 1980, IT-26, pp. 265-271
23 BAR-DAVID, I., and SHAMAI (SHITZ), S.: 'On information transfer by envelope-constrained signals over the AWGN channel', IEEE Trans., 1988, IT-34, pp. 371-379
24 SHAMAI (SHITZ), S.: 'On the capacity of a direct detection photon channel with intertransition constrained binary input', submitted to IEEE Transactions on Information Theory
25 SHAMAI (SHITZ), S., and LAPIDOTH, A.: 'Bounds on the capacity of a spectrally constrained Poisson channel', submitted to IEEE Transactions on Information Theory
26 LOEVE, M.: 'Probability theory' (Van Nostrand, New York, 1950)
27 LUENBERGER, D.G.: 'Optimization by vector space methods' (John Wiley and Sons, New York, 1969)
28 RUDIN, W.: 'Functional analysis' (McGraw-Hill, 1973)
29 AKHIEZER, N.I.: 'The classical moment problem and some related questions in analysis' (Hafner Publishing Company, New York)

7 Appendixes

7.1 Appendix A

The following lemmas are helpful in proving the main results. Only the main ideas and steps of the proofs are presented; the detailed proofs follow the lines of those given in Reference 16.

Lemma 1: ℱ(A, σ) is a convex and compact space in the Levy metric.

Proof: The proof is identical to that given in Reference 16 and is based on Helly's weak compactness theorems and on the equivalence of weak convergence in a finite interval and convergence in the Levy metric [26].

Lemma 2: The functional I(F_λ), viewed as a functional from ℱ(A, σ) to the field of reals ℝ, is a strictly convex cap (i.e. strictly concave).

Proof: The proof follows directly [16], using the concavity of the mutual information functional [18]:

(1 - \theta)\, I(F_\lambda^1) + \theta\, I(F_\lambda^2) \le I\bigl((1 - \theta) F_\lambda^1 + \theta F_\lambda^2\bigr), \qquad 0 \le \theta \le 1   (36)

where F_λ¹ and F_λ² are both distributions belonging to ℱ(A, σ). The strict concavity follows directly from the property

P_y(n, F_\lambda^1) = P_y(n, F_\lambda^2) \ \ \forall n \quad \Longrightarrow \quad d_L(F_\lambda^1, F_\lambda^2) = 0   (37)

stating that if two PDFs of y, induced by F_λ¹ and F_λ² respectively (see eqn. 19), are equal, then the Levy metric d_L(F_λ¹, F_λ²) between these distributions equals zero. The relation of eqn. 37 is proven in Appendix 7.2.

Lemma 3: The functional I(F_λ): ℱ(A, σ) → ℝ is continuous. This means that, for any sequence F_λ^n → F_λ in the Levy metric (where F_λ^n, F_λ ∈ ℱ(A, σ)), I(F_λ^n) → I(F_λ) in the usual topology of the real line.

Proof: The proof follows the same lines as presented in Reference 16 and is based on the Helly-Bray theorem [26].

Lemma 4: The functional I(F_λ) is weakly differentiable in ℱ(A, σ), and its weak derivative (see details and definitions in References 16 and 27) at the point F_λ⁰, denoted by I'_{F_λ⁰}(F_λ), is given by

I'_{F_\lambda^0}(F_\lambda) = \int_0^A i(\lambda, F_\lambda^0)\, dF_\lambda - I(F_\lambda^0)   (38)

Proof: The proof is based on the definition of the weak derivative and on changing the order of a limit operation and an integral. The latter is justified in our case by elementary properties of Schwartz functions [28]. The use of Schwartz functions simplifies the proof considerably as compared with the proofs presented in Reference 16.

Lemma 5: The supremum of C is achieved by a unique distribution F_λ*. A necessary and sufficient condition for F_λ* to be that supremising distribution is

\int_0^A \bigl(i(\lambda, F_\lambda^*) - \delta\lambda\bigr)\, dF_\lambda \le I(F_\lambda^*) - \delta\sigma   (39)

for some nonnegative δ and any F_λ ∈ ℱ(A, A).

Proof: Define the functional J(F_λ): ℱ(A, A) → ℝ,

J(F_\lambda) = \int_0^A \lambda\, dF_\lambda - \sigma   (40)

where ℱ(A, A) (ℱ(A, σ) with σ replaced by A) denotes the set of all distributions F_λ having their points of increase in the interval [0, A]. This functional is linear and bounded, and it is a convex cup. The functional J(F_λ) is also weakly differentiable, with the weak derivative J'_{F_λ⁰}(F_λ) at the point F_λ⁰ given by

J'_{F_\lambda^0}(F_\lambda) = J(F_\lambda) - J(F_\lambda^0), \qquad \forall F_\lambda, F_\lambda^0 \in \mathcal{F}(A, A)   (41)

The new functional I(F_λ) − δJ(F_λ), viewed as a functional from ℱ(A, A) to ℝ, is continuous and strictly convex cap, due to lemmas 1, 2 and 3. Lemmas 1 and 4 and the above properties enable the application of the Lagrange optimisation theorem with constraints [27]. This theorem replaces the original optimisation problem by another one in which the functional I(F_λ) − δJ(F_λ) is supremised with respect to all distributions F_λ ∈ ℱ(A, A), where δ denotes the Lagrange coefficient. Lemma 5 is a direct consequence of this optimisation theorem.
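A small numerical sanity check of lemma 4 (a sketch under my own naming, not from the paper): for two discrete distributions F⁰ and F on a common support, the one-sided derivative of I((1 − θ)F⁰ + θF) at θ = 0 should equal ∫ i(λ, F⁰) dF − I(F⁰), as eqn. 38 states.

```python
import math
import numpy as np

def poisson_pmf(n, mu):
    """Poisson(mu) pmf on the integer array n (mu = 0 is a point mass at 0)."""
    if mu == 0.0:
        return (n == 0).astype(float)
    return np.exp(-mu + n * math.log(mu) - np.array([math.lgamma(k + 1) for k in n]))

def mutual_info(points, weights, delta, lam0=0.0, nmax=100):
    """I(F) in nats and the density i(lambda_k, F) at the mass points (eqns. 19, 21 and 22)."""
    n = np.arange(nmax + 1)
    W = np.vstack([poisson_pmf(n, (x + lam0) * delta) for x in points])
    p_y = weights @ W
    i_k = np.sum(W * np.log(np.where(W > 0.0, W / p_y, 1.0)), axis=1)
    return float(weights @ i_k), i_k

A, delta = 1.0, 2.0
pts = np.array([0.0, 0.5 * A, A])
w0 = np.array([0.6, 0.0, 0.4])                 # F0 (effectively binary)
w1 = np.array([0.2, 0.5, 0.3])                 # F, the perturbing distribution
I0, i0 = mutual_info(pts, w0, delta)
theta = 1e-5
I_mix, _ = mutual_info(pts, (1.0 - theta) * w0 + theta * w1, delta)
print("one-sided derivative (numerical):", (I_mix - I0) / theta)
print("eqn. 38 prediction              :", float(w1 @ i0) - I0)   # the two agree up to O(theta)
```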

7.2 Appendix B

Eqn. 37 results by showing that the coefficient set {C_n}, n = 0, 1, 2, ...,

C_n = \int_0^A \frac{e^{-(\lambda + \lambda_0)\Delta}\,[(\lambda + \lambda_0)\Delta]^n}{n!}\, dF_\lambda, \qquad n = 0, 1, 2, \ldots   (42)

uniquely determines the distribution F_λ (in the sense of the Levy metric). It is equivalent to show that the moments M_n of another distribution Q_λ,

M_n = \int_{\lambda_0\Delta}^{(A + \lambda_0)\Delta} \lambda^n\, dQ_\lambda, \qquad n = 0, 1, 2, \ldots   (43)

having its points of increase in the interval [λ₀Δ, (A + λ₀)Δ], uniquely determine this distribution. This is a classical moment problem [29]. By using Carleman's theorem (Reference 29, page 85), which is applicable in this case since the moments satisfy M_n ≤ [(A + λ₀)Δ]^n, so that the Carleman series Σ_n M_{2n}^{-1/(2n)} diverges, it is clear that the result follows.

7.3 Appendix C

Observing that

\sum_{n=0}^{\infty} \frac{e^{-(\lambda + \lambda_0)\Delta}\,[(\lambda + \lambda_0)\Delta]^n}{n!}\; n \log\bigl[(\lambda + \lambda_0)\Delta\bigr] = (\lambda + \lambda_0)\Delta \log\bigl[(\lambda + \lambda_0)\Delta\bigr]   (46)

we express i(λ, F_λ) (eqn. 21) by

i(\lambda, F_\lambda) = (\lambda + \lambda_0)\Delta \log\bigl[(\lambda + \lambda_0)\Delta\bigr] - (\lambda + \lambda_0)\Delta - \sum_{n=0}^{\infty} \frac{e^{-(\lambda + \lambda_0)\Delta}\,[(\lambda + \lambda_0)\Delta]^n}{n!}\, \gamma_n

where the coefficients γ_n are determined by

\gamma_n = \log\bigl[n!\, P_y(n, F_\lambda)\bigr] = \log\Bigl[ \int_0^A e^{-(\lambda + \lambda_0)\Delta}\,[(\lambda + \lambda_0)\Delta]^n\, dF_\lambda \Bigr]   (47)

and therefore

\gamma_n \le n \log\bigl[(A + \lambda_0)\Delta\bigr]   (49)

If eqn. 26 is to be satisfied for any complex λ ∈ D, it must also be satisfied for real λ > −λ₀, which is written explicitly as

\sum_{n=0}^{\infty} \frac{e^{-(\lambda + \lambda_0)\Delta}\,[(\lambda + \lambda_0)\Delta]^n}{n!}\, \gamma_n = (\lambda + \lambda_0)\Delta \log\bigl[(\lambda + \lambda_0)\Delta\bigr] - (\lambda + \lambda_0)\Delta - \delta\lambda - d_1, \qquad \forall \lambda > -\lambda_0   (48)

where λ is assumed to be a real variable. Observing further that

\sum_{n=0}^{\infty} \frac{e^{-(\lambda + \lambda_0)\Delta}\,[(\lambda + \lambda_0)\Delta]^n}{n!}\; n \log\bigl[(A + \lambda_0)\Delta\bigr] = (\lambda + \lambda_0)\Delta \log\bigl[(A + \lambda_0)\Delta\bigr]   (50)

the left hand side of eqn. 48 is, by eqn. 49, upper bounded by

(\lambda + \lambda_0)\Delta \log\bigl[(A + \lambda_0)\Delta\bigr]   (51)

Since, for large λ, the right hand side of eqn. 48 grows as λ log λ, whereas the left hand side of this equation cannot grow faster than kλ (eqn. 51), where k is a positive constant, a contradiction results. In conclusion, eqn. 48 and therefore eqn. 26 cannot be valid for any distribution F_λ having its increase points in the interval [0, A].
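The growth-rate argument can also be seen numerically (a sketch, my own naming): using eqn. 27 with λ₀ = 0, the information density i(λ, F_λ^b) evaluated at intensities far beyond A grows like λΔ log λ, i.e. i(λ, F_λ^b)/λ keeps increasing, so i(·, F) cannot coincide with any affine function δλ + d₁ on the whole half line, which is the content of the contradiction above.

```python
import math

def i_binary(lam, A, delta, beta):
    """Information density i(lambda, F_b) of eqn. 27 (no dark current), valid for any lambda > 0."""
    m = A * delta
    return (lam * delta * (math.log(lam / A) - 1.0)
            - (1.0 - math.exp(-lam * delta)) * (math.log(beta) - m)
            - math.exp(-lam * delta) * math.log(1.0 - beta * (1.0 - math.exp(-m))))

A, delta, beta = 1.0, 1.0, 0.4
for lam in (1.0, 10.0, 100.0, 1000.0):
    print(f"lambda = {lam:7.1f}   i(lambda, F_b)/lambda = {i_binary(lam, A, delta, beta) / lam:8.4f}")
# The ratio keeps increasing roughly like delta*log(lambda): i(., F) is superlinear in lambda,
# so it cannot equal an affine function of lambda, as eqn. 26 would require.
```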
