
CANADIAN JOURNAL OF ELECTRICAL AND COMPUTER ENGINEERING, VOL. 37, NO. 2, SPRING 2014

A High-Throughput VLSI Architecture for Real-Time Optical OFDM Systems With an Efficient Phase Equalizer
Architecture d'ITGE à haut-débit pour les systèmes optique à temps réel OFDM avec un égaliseur de phase efficace
Reza Ghanaatian, Mahdi Shabany, and Morteza H. Shoreh
Abstract: In this paper, a novel high-throughput very large scale integrated circuit architecture for a
real-time implementation of intensity modulation direct detection optical orthogonal frequency division
multiplexing system is proposed, achieving the highest throughput reported to date. The proposed
architecture utilizes a fast, pipelined, and parallel inverse fast Fourier transform/fast Fourier transform
in the transmitter/receiver, which is customized to satisfy the throughput requirements of the advanced
optical systems. In addition, an efficient high-accuracy equalization method is developed, improving the
system performance compared with the conventional linear equalizers. To evaluate the system performance,
the OptiSystem software is used to model the optical channel and a Virtex-6 ML-605 evaluation board
is used as the implementation platform. Moreover, the synthesis results in a 180-nm CMOS technology
prove that the proposed architecture achieves a sustained throughput of 22.5 Gb/s with a 4.89-mm² core
area.
Résumé: Dans ce papier, une architecture de circuit intégré à très grande échelle (ITGE) et à haut débit est
proposée pour une nouvelle mise en œuvre d'un système temps réel de modulation d'intensité optique
orthogonale pour la détection directe de la fréquence des systèmes de multiplexage, permettant ainsi
d'atteindre de plus haut débit jusqu'à nos jours. L'architecture proposée utilise une transformée de Fourier
rapide (Fast Fourier Transform, FFT) inverse, en pipeline, et parallèle dans l'émetteur/récepteur. Cette
dernière a été adaptée afin de satisfaire les contraintes de débit des systèmes optiques de pointe. Également,
un procédé d'égalisation de haute précision efficace est développé, ce qui améliore les performances du
système par rapport aux égaliseurs linéaires classiques. Pour évaluer les performances du système, le
logiciel OptiSystem est utilisé pour modéliser le canal optique et une carte d'évaluation Virtex-6 ML-605
est utilisée comme plate-forme de développement. De plus, les résultats de la synthèse avec une technologie
CMOS de 180 nm prouvent que l'architecture proposée permet d'obtenir un débit soutenu de 22.5 Gb/s
avec une puce microélectronique de 4.89 mm².
Index Terms: Efficient phase equalizer, intensity modulation direct detection (IMDD), optical orthogonal
frequency-division multiplexing (OOFDM), throughput, very large scale integrated circuit (VLSI)
architecture.

I. INTRODUCTION

ORTHOGONAL frequency-division multiplexing (OFDM) is widely used in both wired and wireless systems due to its various advantages, such as spectral efficiency, strong performance in multipath fading channels, and simple hardware implementation [1]. Recently, it has become the technology of choice for systems employing optical communications [2]. The tremendous increase in demand for network capacity due to the advent of new Internet applications, together with the development of digital signal processing (DSP) technology, which enables the implementation of sophisticated OFDM signal processing algorithms, has created a strong motivation to use OFDM in optical communication systems [3].

Manuscript received August 26, 2013; revised January 6, 2014 and March 2, 2014; accepted April 3, 2014. Date of current version August 15, 2014.
The authors are with the Department of Electrical Engineering, Sharif University of Technology, Tehran 11369, Iran (e-mail: reza.ghanaatian@gmail.com; mahdi@sharif.edu; m.h.shoreh@gmail.com).
Associate Editor managing this paper's review: Reza Heidari.
Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/CJECE.2014.2317756

Among optical OFDM (OOFDM) systems, the coherent
optical OFDM provides the ultimate performance on the
receiver sensitivity, spectral efficiency, and robustness against
dispersion [4]. On the other hand, the direct detection optical OFDM (DD-OOFDM) systems come with a lower cost,
appealing to various applications [4]. Intensity modulation
direct detection OOFDM (IMDD-OOFDM) is one type of
DD-OOFDM system, which is a promising solution toward
developing high-bandwidth access networks. This is mainly because IMDD-OOFDM offers a reasonable reduction in the overall network complexity. Meanwhile, the experimental demonstration of a real-time IMDD-OOFDM transceiver is vital to enable the practical realization of OFDM systems in next-generation optical networks. However, the implementation of an OOFDM system poses various challenges, including devising a high-throughput hardware architecture that can handle the parallel, complex, and computationally intense OFDM signal processing algorithms. The real-time implementation of OOFDM systems has been previously studied in [5]-[7]. However, an efficient architecture for the baseband part of the system at extremely high data rates is still a major challenge.

0840-8688 © 2014 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.
See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

GHANAATIAN et al.: HIGH-THROUGHPUT VLSI ARCHITECTURE FOR REAL-TIME OPTICAL OFDM SYSTEMS

In this paper, a real-time, high-throughput very large scale integrated circuit (VLSI) implementation of an IMDD-OOFDM transceiver is proposed, which achieves high performance by taking advantage of a high-speed, efficient FFT and a low-complexity symbol synchronizer in the transceiver. In addition, an efficient yet simple phase equalizer is developed based on the nature of optical channels, providing better channel estimation. This effectively enhances the system performance compared with conventional equalization approaches. The rest of this paper is organized as follows. In Section II, the OOFDM system architecture is reviewed and the proposed hardware architecture for the real-time implementation is discussed. The transmission performance and the implementation results are presented in Sections III and IV, respectively. Finally, Section V concludes this paper.
II. REAL-TIME OOFDM TRANSCEIVER ARCHITECTURE
A. System Model
Fig. 1 shows the block diagram of a real-time DD-OOFDM transceiver. On the transmitter side, the parallel input data are mapped onto the constellation points using M-ary quadrature amplitude modulation (M-QAM), and then fed to a 32-point parallel IFFT core. To generate real-valued samples for the intensity modulator, only half of the subcarriers are used for data (one zero component at the dc subcarrier and the others for data), while the remaining subcarriers carry their complex conjugates. A cyclic prefix (CP) of eight samples is used, resulting in a 40-sample OFDM symbol. The real OFDM electrical signal is then fed into a Mach-Zehnder modulator (MZM) to modulate the laser intensity. Following the modulator, an erbium-doped fiber amplifier (EDFA) is utilized to adjust the optical launch power at which the bit-error-rate (BER) performance is measured. Finally, the optical signal is transmitted through the optical link (see Section III for more details).
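The real-valued symbol construction described above can be sketched in a few lines of NumPy. This is a minimal illustration and not the authors' hardware datapath: the function name `build_ofdm_symbol` is hypothetical, and leaving the Nyquist bin (bin 16) at zero is an assumption that the text does not state explicitly. The 15 data bins follow from a 32-point IFFT with one zero dc bin and conjugate-symmetric loading.

```python
import numpy as np

N_FFT = 32                 # IFFT size stated in the text
N_CP = 8                   # cyclic-prefix length stated in the text
N_DATA = N_FFT // 2 - 1    # 15 data subcarriers (bins 1..15)


def build_ofdm_symbol(data_symbols):
    """Map 15 complex constellation points to one real 40-sample OFDM symbol."""
    data_symbols = np.asarray(data_symbols, dtype=complex)
    if data_symbols.shape != (N_DATA,):
        raise ValueError("expected exactly 15 data symbols")

    spectrum = np.zeros(N_FFT, dtype=complex)
    spectrum[1:N_DATA + 1] = data_symbols                     # bins 1..15: data
    spectrum[N_FFT - N_DATA:] = np.conj(data_symbols[::-1])   # bins 17..31: conjugates
    # Bin 0 (dc) stays zero as in the text; bin 16 is left at zero by assumption.

    time = np.fft.ifft(spectrum)
    # Hermitian symmetry makes the imaginary part vanish up to rounding error.
    assert np.allclose(time.imag, 0.0)
    body = time.real
    return np.concatenate([body[-N_CP:], body])               # prepend the CP
```

At the receiver, `np.fft.fft(symbol[N_CP:])[1:16]` returns the 15 data symbols again when the channel is ideal.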
On the receiver side, an EDFA is used to amplify the
received signal power followed by a simple photodetector to
perform the direct detection. The samples of the converted
photocurrent are then fed to the OFDM receiver, where the
symbol synchronizer, the FFT, the channel equalizer, and the
demodulator recover data in each subcarrier (Fig. 1).
With respect to the architecture in Fig. 1, in this paper, the OFDM transmitter and receiver are implemented on a Xilinx Virtex-6 field-programmable gate array (FPGA) (see Section IV for details). OptiSystem is used for the simulation of the optical channel and the other optical devices used in the system. The OFDM symbols are saved and imported into the software, and the output of the channel is exported and fed to the FPGA.

Fig. 1. Block diagram of the proposed real-time DD-OOFDM system.

B. Optical OFDM System Description
1) Symbol Synchronization: Symbol synchronization is one of the essential blocks at the receiver in an OFDM system and plays an important role in the overall system performance. One task of symbol synchronization is to find the symbol start. The conventional method to estimate the OFDM symbol start is based on correlation. Considering an OFDM symbol $y(n)$, the correlation of this signal with its shifted version can be defined as

$$\Lambda(n) = \sum_{i=\theta}^{\theta+N_g+1} y(n+i)\, y^{*}(n+N+i) \tag{1}$$

where $N$ represents the number of FFT points, $N_g$ is the length of the CP region, and $\theta$ denotes the initial random offset of the received OFDM samples. The result of the above correlation is maximized when the correlation is calculated in the CP region. This means that by detecting the peak absolute value of $\Lambda$, the beginning of the symbol can be found.
It can be shown that the performance of the above method deteriorates in noisy environments as well as in cases where the FFT size is small, as in OOFDM systems. Therefore, in this paper, it is proposed to use a folding technique, in which the value of $\Lambda$ is added to versions of itself shifted by multiples of the OFDM symbol length as follows:

$$\Lambda_k[n] = \frac{1}{k} \sum_{i=0}^{k-1} \Lambda[n - i N_s] \tag{2}$$

For example, the fifth-order folding can be derived as

$$\Lambda_5[n] = \frac{1}{5}\left\{ \Lambda[n] + \Lambda[n - N_s] + \Lambda[n - 2N_s] + \Lambda[n - 3N_s] + \Lambda[n - 4N_s] \right\} \tag{3}$$


where $N_s$ represents the length of the OFDM symbol, which is equal to $N + N_g$. In other words, folding increases the number of samples that contribute to (1), which improves the SNR. This is because more than one symbol is observed during the correlation, which improves the accuracy of the algorithm. Fig. 2 shows the $\Lambda$ and $\Lambda_5$ values calculated for three consecutive OFDM symbols. As can be seen, the peak positions of $\Lambda$ are not easily recognizable, while with the help of the folding technique, the peaks occur at the correct positions with a sharp pattern.

Fig. 2. Calculated values for three consecutive OFDM symbols. (a) $\Lambda$ value. (b) $\Lambda_5$ value.

Fig. 3. Received constellations for subcarriers 1, 5, 9, 13, and 15 (left to right). I = inphase and Q = quadrature.

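The correlation and folding steps of (1)-(3) can be sketched as follows. This is an illustrative NumPy model, not the paper's implementation: the function names `cp_correlation` and `fold` are hypothetical, and the correlation window is assumed to span exactly the CP length with a zero initial offset, which simplifies the summation limits of (1).

```python
import numpy as np


def cp_correlation(y, n_fft, n_cp):
    """corr[n] = sum_{i=0}^{n_cp-1} y[n+i] * conj(y[n+n_fft+i]) for all valid n."""
    y = np.asarray(y, dtype=complex)
    products = y[:-n_fft] * np.conj(y[n_fft:])
    # A length-n_cp moving sum of the lagged products.
    return np.convolve(products, np.ones(n_cp), mode="valid")


def fold(corr, n_sym, k):
    """folded[m] = (1/k) * sum_{i=0}^{k-1} corr[n - i*n_sym], with n = m + (k-1)*n_sym."""
    start = (k - 1) * n_sym          # first index with k symbols of history
    length = len(corr) - start
    acc = np.zeros(length, dtype=corr.dtype)
    for i in range(k):
        lo = start - i * n_sym
        acc += corr[lo:lo + length]
    return acc / k
```

The symbol start is then estimated as `np.argmax(np.abs(folded)) + (k - 1) * n_sym`, taken modulo the symbol length `n_sym`. Averaging over `k` symbols keeps the CP peak while the uncorrelated terms average toward zero, which is the effect the folding technique relies on.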
2) Channel Equalizer: The channel equalizer is used to mitigate the fiber nonlinearity and phase modulation effects [8]. In this paper, two methods of equalization are considered.
a) Conventional linear equalizer: In conventional linear equalizers, a channel response for each OFDM subcarrier is calculated based on the reference signals as follows [9]:

$$H_k = \frac{X_k}{X_{k,\mathrm{ref}}} \tag{4}$$

where $H_k$ is the estimated channel response in the kth subcarrier, and $X_k$ and $X_{k,\mathrm{ref}}$ are the received and the reference signals in the kth subcarrier, respectively. By multiplying the inverse of each subchannel response by the corresponding received signal, the equalized subcarrier $Y_k$ is computed as follows:

$$Y_k = (H_k)^{-1} X_k. \tag{5}$$

In other words, in this method, the calculation of the channel responses of the OFDM subcarriers requires the computation of 2M parameters, where M is the number of active subcarriers.
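A minimal sketch of the conventional per-subcarrier equalizer of (4) and (5) is given below. The function names are hypothetical and the model ignores noise and fixed-point effects; it only shows that one complex coefficient (two real parameters) is estimated and applied per subcarrier.

```python
import numpy as np


def estimate_channel(received_ref, known_ref):
    """Eq. (4): H_k = X_k / X_k,ref, one complex value per subcarrier."""
    received_ref = np.asarray(received_ref, dtype=complex)
    known_ref = np.asarray(known_ref, dtype=complex)
    return received_ref / known_ref


def linear_equalize(received, h_est):
    """Eq. (5): Y_k = (H_k)^-1 * X_k, applied subcarrier by subcarrier."""
    return np.asarray(received, dtype=complex) / h_est
```

For M active subcarriers, `estimate_channel` returns M complex values, which corresponds to the 2M real parameters mentioned above.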
b) Proposed equalizer: In this paper, an efficient three-stage phase and amplitude equalizer/compensator is proposed whose BER performance exceeds that of the conventional linear equalizer. In this method, the equalizer function (inverse channel response) is represented by

$$H^{-1}(f) = H_0\, e^{j \angle H^{-1}(f)} \tag{6}$$

where $H_0$ is the amplitude of the equalizer function. For the phase-shift keying scenario, since the data are modulated only on the phase of the signal, without loss of generality, it is assumed that $H_0 = 1$. In addition, based on simulation results, as shown in Fig. 3, each subcarrier has a constant phase shift relative to its adjacent subcarriers. Thus, the inverse channel response is modeled by a linear subcarrier-dependent phase shift as

$$\angle H^{-1}(f) = \alpha k + \beta \tag{7}$$

where $\alpha$ and $\beta$ are constants, which will be discussed in the following, and $k$ is the subcarrier index such that $f(k) = f_c + k\,\Delta f_{sc}$, in which $f_c$ is the carrier frequency and $\Delta f_{sc}$ is the subcarrier frequency spacing.
In higher order modulations, such as 16-QAM and 64-QAM, magnitude compensation is also necessary. Hence, the amplitude distortion of the received signal can be compensated by $H_0 = e^{\gamma}$, in which $\gamma$ is a constant whose calculation will be discussed in the following. As a result, the total equalization procedure is formulated as

$$Y_k = e^{\gamma + j(\alpha k + \beta)} X_k \tag{8}$$

where $Y_k$ is the equalized signal and $X_k$ is the received signal. In summary, the proposed method reduces equalization to calculating three constants for all subcarriers. In other words, in the conventional linear equalizer, a subchannel response must be calculated for each subcarrier; in the proposed method, however, the equalization of all subcarriers is performed using the three parameters $\alpha$, $\beta$, and $\gamma$. These three parameters are calculated in three steps, explained in the following.
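The three-parameter equalizer can be sketched as a single vectorized multiplication: each received subcarrier is scaled by a common gain and rotated by a phase that grows linearly with the subcarrier index. The function name `phase_equalize`, the parameter names `slope`, `rotation`, and `log_gain`, and the choice of 1 as the first subcarrier index are assumptions made for illustration.

```python
import numpy as np


def phase_equalize(received, slope, rotation, log_gain, first_index=1):
    """Apply Y_k = exp(log_gain + 1j*(slope*k + rotation)) * X_k to all subcarriers.

    slope    : phase increment per subcarrier index (radians)
    rotation : common phase rotation (radians)
    log_gain : natural logarithm of the common amplitude correction
    """
    received = np.asarray(received, dtype=complex)
    k = np.arange(first_index, first_index + received.shape[-1])
    coefficients = np.exp(log_gain + 1j * (slope * k + rotation))
    return coefficients * received
```

Because the coefficients depend only on three scalars, they can be computed once after training and reused for every OFDM symbol.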
Step 1 ($\alpha$ calculation): To compute $\alpha$, a training sequence is employed, and the BER is calculated for a range of $\alpha$ values. Then, the value of $\alpha$ that minimizes the BER is chosen. To achieve further performance improvement, this procedure is repeated over a shorter interval around the estimated $\alpha$. This approach continues until the computed $\alpha$ satisfies the desired precision.
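The coarse-to-fine search described in Step 1 can be sketched as a repeated grid search over a shrinking interval. This is only an illustration: `cost` is a hypothetical callable that would return the training-sequence BER for a candidate value, and the grid size and number of rounds are arbitrary choices rather than values from the paper.

```python
import numpy as np


def refine_search(cost, lo, hi, points=11, rounds=4):
    """Return the grid value minimizing cost(), refining the interval each round."""
    best = lo
    for _ in range(rounds):
        grid = np.linspace(lo, hi, points)
        costs = [cost(candidate) for candidate in grid]
        best = float(grid[int(np.argmin(costs))])
        step = (hi - lo) / (points - 1)
        # Next round: search one grid step on either side of the current best.
        lo, hi = best - step, best + step
    return best
```

Each round narrows the interval by a factor of `(points - 1) / 2`, so the number of rounds sets the final precision of the estimate.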
Step 2 ($\beta$ calculation): On the other hand, $\beta$ is chosen so that the whole constellation is rotated to the position at which the average phase of the estimated constellation points $\hat{Y}_n$ is equal to the phase of the transmitted constellation points $Y_n$. Therefore, by sending $N_t$ training symbols, $\beta$ is defined as the arithmetic mean of the phase error as follows:

$$\beta = \frac{1}{N_t} \sum_{n=1}^{N_t} \left[ \arg(Y_n) - \arg(\hat{Y}_n) \right]. \tag{9}$$

Fig. 4 shows the procedure for the computation of $\alpha$ and $\beta$.
Step 3 ($\gamma$ calculation): The parameter $\gamma$ is chosen so that the whole constellation is placed at the position at which the magnitude of the estimated constellation points is equal to that of the received constellation points. Thus, by sending $N_t$ training symbols, $\gamma$ is defined as the logarithm of the geometric mean of the absolute ratio between the amplitude of the estimated signal $\hat{Y}_n$ and the received signal $X_n$ as follows:

$$\gamma = \ln \sqrt[N_t]{\prod_{n=1}^{N_t} \mathrm{abs}\!\left( \frac{\hat{Y}_n}{X_n} \right)}. \tag{10}$$
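Steps 2 and 3 amount to averaging a phase error and a log-amplitude ratio over the training symbols. The sketch below is one possible reading: the function names are hypothetical, `reference` stands for the known training points, and the phase difference is taken from the product with the conjugate so that it stays within one turn, which is a small departure from a literal difference of two `arg` values.

```python
import numpy as np


def estimate_rotation(reference, estimated):
    """Mean phase error between reference points and current estimates (radians)."""
    reference = np.asarray(reference, dtype=complex)
    estimated = np.asarray(estimated, dtype=complex)
    # angle(a * conj(b)) equals arg(a) - arg(b), wrapped to (-pi, pi].
    return float(np.mean(np.angle(reference * np.conj(estimated))))


def estimate_log_gain(reference, received):
    """Natural log of the geometric mean of |reference / received|."""
    ratio = np.abs(np.asarray(reference)) / np.abs(np.asarray(received))
    # The log of a geometric mean equals the arithmetic mean of the logs.
    return float(np.mean(np.log(ratio)))
```

Working with the mean of logarithms avoids forming a long product explicitly, which would otherwise risk overflow or underflow for large training sets.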

It is interesting to note that the coefficients $\alpha$ and $\beta$ are linearly dependent on the frequency spacing among the subcarriers as well as the total length of the fiber used between the transmitter and the receiver. Therefore, once their values are known for one setup, they can be easily calculated for another setup with a different length and frequency spacing by simple arithmetic manipulations. Furthermore, considering the long-term stability of the optical channel, the overhead of the training sequence for DD-OOFDM systems is negligible [10]. In addition, because the proposed method only has to calculate three constants for all subcarriers, the computational effort of training is reduced as well.
In addition to this fact, the simulation results in Section III will show that the subcarrier-dependent phase equalizer fits the nature of the optical channel. This results in a higher accuracy in the channel estimation and better transmission performance compared with the conventional linear approach.
C. Proposed Hardware Architecture for the OOFDM System
From the digital design point of view, the main difference between the conventional wireless OFDM systems and the new optical OFDM systems is the system data rate. This implies that parallel data processing should be performed in both the transmitter and the receiver. Therefore, a fully unrolled datapath without any resource sharing is required for all of the sub-blocks. Clearly, such a datapath occupies significant area (resources) on the application-specific integrated circuit (ASIC) or FPGA platform, but it is necessary to meet the system throughput requirements. To obtain an efficient architecture for each part of the transceiver, all of the sub-blocks are designed using the Verilog hardware description language (Verilog-HDL), based on the optimum bitwidth derived from the MATLAB fixed-point golden model.

Fig. 4. Convergence to the coefficients of the proposed equalization method.

1) IFFT and FFT: In all OFDM systems, the FFT is one of the essential, computationally intensive parts of the system, and it can be implemented using different algorithms, depending on the FFT size and system requirements. In the proposed system, a 32-point FFT core is needed that produces all of the outputs simultaneously at every clock cycle. For this purpose, the radix-2 decimation-in-time algorithm with a parallel datapath is used [11]. Pipelining is applied along the entire feed-forward path of the FFT module so that its critical path is one simple multiplier. This multiplier can be mapped onto a DSP block of the FPGA and operates at a very high clock frequency.
To minimize the FFT core area, suitable bitwidth for each
stage needs to be carefully calculated using the fixed-point
tools, and truncation should be performed to omit unnecessary bits after each operation. For all of the operations,
minimum bitwidth for the fractional part is selected while the
BER performance of the fixed-point algorithm matches that of
the floating-point curve.
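A software model of the radix-2 decimation-in-time algorithm with optional truncation after each butterfly stage can help when exploring bitwidths. The sketch below is a recursive reference model, not the parallel pipelined core of the paper; the names `fft_radix2_dit`, `truncate`, and the `frac_bits` parameter are hypothetical, and the same number of fractional bits is assumed for every stage for simplicity.

```python
import numpy as np


def truncate(x, frac_bits):
    """Drop bits below 2**-frac_bits from the real and imaginary parts (floor)."""
    scale = 2.0 ** frac_bits
    return (np.floor(x.real * scale) + 1j * np.floor(x.imag * scale)) / scale


def fft_radix2_dit(x, frac_bits=None):
    """Radix-2 decimation-in-time FFT; the length must be a power of two."""
    x = np.asarray(x, dtype=complex)
    n = x.shape[0]
    if n == 1:
        return x
    if n % 2:
        raise ValueError("length must be a power of two")
    even = fft_radix2_dit(x[0::2], frac_bits)
    odd = fft_radix2_dit(x[1::2], frac_bits)
    twiddle = np.exp(-2j * np.pi * np.arange(n // 2) / n)
    out = np.concatenate([even + twiddle * odd, even - twiddle * odd])
    # Optional fixed-point model: truncate once per butterfly stage.
    return out if frac_bits is None else truncate(out, frac_bits)
```

For a 32-point transform the recursion has five butterfly stages. Comparing `fft_radix2_dit(x, frac_bits=b)` against `np.fft.fft(x)` for decreasing `b` gives a rough picture of how truncation error grows as fractional bits are removed.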
2) Symbol Synchronization: The synchronization block should perform the computations in (1) and (2). To calculate the correlation, a buffer is used to store the received symbols. This buffer, called the computation buffer, stores at least two consecutive symbols. In the buffer, samples of the OFDM signal are shifted at the rate of one sample per clock cycle. Therefore, instead of using 40 complex multipliers, only one multiplier and a delay buffer with the CP length are used (Fig. 5). After eight clock cycles, the first value of $\Lambda$ is calculated, and then one new value of $\Lambda$ is produced in each clock cycle. Finally, a simple comparator finds the maximum value and detects the peak position. In the proposed architecture, the number of multiplications is thus reduced to one. Reducing the number of functional units reduces the hardware complexity but also the throughput of the synchronizer, which may lead to some frame loss. This, however, is not an issue, as this procedure is performed only when the system is in the reset mode.
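The single-multiplier idea can be modeled in software as a running sum: each new sample triggers one complex multiplication, the new product is added to an accumulator, and the product that leaves the CP-length window is subtracted. The sketch below is a behavioral illustration under that assumption, not a description of the exact circuit in Fig. 5; the function name and the use of `deque` objects as delay lines are hypothetical.

```python
from collections import deque


def sliding_cp_correlator(samples, n_fft=32, n_cp=8):
    """Return the CP correlation for every lag using one multiply per sample."""
    sample_delay = deque(maxlen=n_fft)   # holds the previous n_fft samples
    product_delay = deque(maxlen=n_cp)   # holds the last n_cp lagged products
    acc = 0j
    out = []
    for y in samples:
        if len(sample_delay) == n_fft:
            # The single multiplication: y[m - n_fft] * conj(y[m]).
            prod = sample_delay[0] * complex(y).conjugate()
            if len(product_delay) == n_cp:
                acc -= product_delay[0]          # product leaving the window
            product_delay.append(prod)
            acc += prod
            if len(product_delay) == n_cp:
                out.append(acc)                  # one new value per sample
        sample_delay.append(y)
    return out
```

Once the window holds `n_cp` products, the first correlation value is available and one further value is produced per input sample, which mirrors the behavior described above.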
To implement the folding technique in the above architecture, a simple buffer with the length of the OFDM symbol is needed to store the $\Lambda$ values. After the first symbol is stored, the second symbol is added to the result, which is stored into this buffer again. This can continue as long as needed. In addition, the length of the computation buffer depends on the order of folding. For example, for a folding order of three, this buffer should store at least four OFDM symbols. Both the simulation results and the hardware constraints determine the optimum order of the folding.

Fig. 5. Proposed architecture to calculate the correlation.

Fig. 6. Architecture of the conventional linear equalizer.

Fig. 7. BER versus optical launch power for transceiver in 64-QAM mode (one fiber span).

3) Channel Equalizer:
a) Conventional linear equalizer: In the conventional linear equalizer, the equalization is performed by multiplying the inverse of the subchannel response by each subcarrier, as in (5). The inverse of the subchannel response is calculated as follows:

$$(H_k)^{-1} = \frac{1}{H_{k,\mathrm{real}} + j H_{k,\mathrm{imag}}} = \frac{H_{k,\mathrm{real}} - j H_{k,\mathrm{imag}}}{H_{k,\mathrm{real}}^2 + H_{k,\mathrm{imag}}^2} \tag{11}$$

which, based on (5), can be written as

$$Y_k = \frac{H_{k,\mathrm{real}} - j H_{k,\mathrm{imag}}}{H_{k,\mathrm{real}}^2 + H_{k,\mathrm{imag}}^2}\, X_k. \tag{12}$$

The equalization can be implemented using one complex multiplier and two dividers. Fig. 6 shows the VLSI architecture of this equalizer. This architecture is pipelined so that the critical path is reduced to one simple multiplier. It is worth mentioning that for the above system, all of the subcarriers should be equalized simultaneously; thus, 15 parallel equalizers are needed to meet the system throughput.

The conditional subtraction method is used to implement the division [12]. In this method, increasing the number of iterations results in higher accuracy. Comparing the floating-point system with the fixed-point model shows that a division operation with 13 integer bits and six fractional bits for the quotient is needed, resulting in 19 iterations to produce the result. In each divider, a fully unrolled datapath with 19 functional units is used to meet the throughput requirements. Consequently, a division result is available at each clock cycle.
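A bit-serial model of division by conditional subtraction shows why 13 integer bits plus six fractional bits lead to 19 iterations: each iteration decides one quotient bit. The sketch below handles unsigned integer operands only and is an assumption about the general technique, not the exact divider of the paper; the function name and the overflow check are hypothetical.

```python
def conditional_subtract_divide(dividend, divisor, int_bits=13, frac_bits=6):
    """Return floor(dividend * 2**frac_bits / divisor) as an unsigned integer.

    One quotient bit is resolved per iteration, from the most significant bit
    down, by subtracting the shifted divisor only when it fits.
    """
    if dividend < 0 or divisor <= 0:
        raise ValueError("operands must be non-negative, divisor non-zero")
    total_bits = int_bits + frac_bits            # 19 iterations by default
    remainder = dividend << frac_bits            # align to the fractional bits
    if remainder >= (divisor << total_bits):
        raise OverflowError("quotient does not fit in the requested bitwidth")
    quotient = 0
    for bit in range(total_bits - 1, -1, -1):
        trial = divisor << bit
        if remainder >= trial:                   # conditional subtraction
            remainder -= trial
            quotient |= 1 << bit
    return quotient
```

As a usage example, `conditional_subtract_divide(100, 7)` returns the fixed-point code 914, which represents 914 / 64 = 14.28125 as an approximation of 100 / 7. In a fully unrolled datapath, each loop iteration would correspond to one of the 19 functional units.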
b) Proposed efficient equalizer: In the proposed equalization method, a constant phase and amplitude for each subcarrier are calculated from the $\alpha$, $\beta$, and $\gamma$ parameters, using the three-stage method described in Section II-B [see (8)]. From the hardware implementation point of view, according to this equation, a complex-domain multiplier for each subcarrier is sufficient to perform this equalization.
III. SIMULATION RESULTS
A. Simulation Framework
The performance evaluation for the proposed system is performed based on the BER of the received signal. OptiSystem is used to simulate the optical channel and to consider the real effect of the optical devices, including the MZM, the EDFA, and the photodetector. The simulations are performed over an optical link that consists of 50-km standard single-mode fiber (SSMF) and 9.94-km dispersion compensation fiber (DCF) for the 64-QAM scenario. For the 16-QAM and 4-QAM scenarios, the optical channel comprises three and seven spans, respectively, each consisting of 70-km SSMF and 14.1-km DCF [13]. The SSMF has an attenuation of 0.2 dB/km, a dispersion of 16.75 ps/(nm·km), and a nonlinearity factor of 1.3 W⁻¹·km⁻¹. The DCF has an attenuation of 0.6 dB/km, a dispersion of −80 ps/(nm·km), and a nonlinearity factor of 5.0 W⁻¹·km⁻¹. In the simulations, by changing the optical launch power of the fiber input signal, the BER of the received signal is evaluated for different modulation scenarios.
B. Simulation Results
Fig. 7 shows the BER of the received signal versus the optical launch power for both the conventional linear and the proposed equalizer for 64-QAM. As shown in this figure, the proposed equalization method results in approximately 2.5-dB BER performance improvement for optical launch powers less than 5 dBm, compared with the conventional linear equalizer. Moreover, for launch powers greater than 5 dBm, a BER performance improvement of about half an order of magnitude is achieved.
Fig. 8 shows the constellations of the received signal for both equalizers in the 16-QAM and 4-QAM scenarios at an optical launch power of 6 dBm. In addition, the BER performance versus the optical launch power for both equalizers in the 16-QAM and 4-QAM schemes is shown in Figs. 9 and 10, respectively. As can be seen, the proposed equalization method offers a significant performance enhancement compared with the conventional linear equalizer in the optimum launch power range, leading to an improvement of more than 1.5 orders of magnitude for these scenarios.

Fig. 8. Received signals. (a) 16-QAM with the proposed equalizer. (b) 16-QAM with the linear equalizer. (c) 4-QAM with the proposed equalizer. (d) 4-QAM with the linear equalizer.

Fig. 9. BER versus optical launch power for transceiver in 16-QAM mode (three fiber spans).

Fig. 10. BER versus optical launch power for transceiver in 4-QAM mode (seven fiber spans).

Fig. 11. (a) Xilinx Virtex-6 ML605 evaluation board. (b) Chip layout.

IV. PROPOSED DESIGN SPECIFICATIONS


The proposed architecture for the transmitter and receiver
is realized using the Verilog-HDL and is synthesized and
implemented on FPGA as well as on the ASIC platform.
For the FPGA implementation, the design is implemented
and tested on the Xilinx Virtex-6 ML605 evaluation board
[Fig. 11(a)]. The results are verified with the golden model
generated from the MATLAB fixed-point outputs. For the
ASIC implementation, the design is synthesized with Synopsys
Design Compiler, and placed and routed with the Cadence
SoC Encounter using 180-nm CMOS standard cells. The chip
layout is shown in Fig. 11(b).

A. Complexity Analysis
Table I shows the synthesis results for the FPGA and ASIC implementations of the transmitter as well as the receiver. The synthesis results of the receiver utilizing each of the equalization approaches and the 64-QAM scheme are extracted and shown in this table.
For both the transmitter and receiver, the critical path of the design is one multiplier, resulting in a maximum operating clock frequency of 225 MHz on the FPGA and 250 MHz on the ASIC platform. By applying fine-grain pipelining to the multipliers, the critical path of the design can be further reduced and the maximum clock frequency increased, at the cost of larger latency and silicon area.
Although the proposed architecture is attractive for high-speed applications, targeted for optical systems, it can also be easily customized for any target application with different specifications. For instance, it can be used for applications with low power/area requirements by merging some functional units in the parallel IFFT and FFT.

TABLE I
IMPLEMENTATION RESULTS

B. System Throughput
From Table I, it is clear that the maximum achievable clock frequency of the transceiver for the FPGA implementation is 225 MHz, leading to a throughput of 9 GS/s for a system using a 32-point IFFT. Ultimately, this translates to a sustained throughput of 20.25 Gb/s in the 64-QAM scheme. The equivalent throughput for the ASIC implementation is 22.5 Gb/s at the operating clock frequency of 250 MHz.
C. Design Comparison
Table II summarizes the implementation results for state-of-the-art publications on real-time OOFDM systems. For a fair comparison between this paper and the design with the highest throughput to date [14], which used the Altera Stratix II as the target device, the proposed design was also synthesized on this device, and the results are shown in this table. The synthesis results show a maximum operating clock frequency of 153 MHz on the Stratix II platform, for both the transmitter and receiver, implying a sustained throughput of 13.77 Gb/s. In addition, the ASIC results are scaled to a 65-nm technology and shown in Table II, for a fair comparison with [15]. The resulting throughput and silicon area show a significant improvement over the result in [15]. It is worth mentioning that in this table, the reported area is the total area of the transmitter and receiver. The results in Table II can be summarized as follows.
1) Most of the real-time implementations of the OOFDM use direct detection for the system architecture. This is because of the lower complexity and easier implementation of the system compared with coherent detection.
2) The FPGA platform is more popular for the implementation of real-time OOFDM systems. This can be a good starting point for future high-throughput ASIC implementations of these systems.
3) Comparing this paper with [14], which is the fastest design to date on the FPGA platform, and scaling its throughput to a single band, the proposed design achieves almost 38% higher throughput on the same target device.
4) Compared with [15], which is the fastest ASIC design, and scaled to the same process, the design achieves a higher throughput and also a lower silicon area. In other words, the achieved throughput per area is 5.5× better than the best ASIC implementation reported to date.

TABLE II
COMPARISON OF REAL-TIME DD-OOFDM IMPLEMENTATIONS

V. CONCLUSION
A high-throughput VLSI architecture for a real-time IMDD-OOFDM system was introduced, which achieves a sustained throughput of 20.25 Gb/s on the Virtex-6 FPGA platform as well as 22.5 Gb/s in a 180-nm CMOS technology. In the proposed architecture, a high-speed, parallel FFT core was developed to satisfy the throughput requirements of optical systems. Moreover, a simple architecture for the synchronization block of the receiver was proposed, which uses far fewer functional units than the conventional methods. In addition, a high-accuracy subcarrier-dependent equalization method was introduced that fits the nature of optical channels, resulting in an improved system performance compared with the conventional linear equalizers. The implementation results confirmed that the proposed design achieves a higher throughput and occupies a lower area compared with the existing designs to date on both the FPGA and ASIC platforms.
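The throughput figures quoted in this paper can be reproduced with simple arithmetic, under the assumption that one complete OFDM symbol (40 samples carrying 15 data subcarriers) is processed per clock cycle and that 64-QAM carries 6 bits per subcarrier. The function names are hypothetical, and the 10-Gb/s single-band figure for [14] is an assumption obtained by halving its reported 20-Gb/s dual-band rate.

```python
def sample_rate_gsps(clock_mhz, samples_per_symbol=40):
    """Sample throughput in GS/s when one OFDM symbol is produced per clock."""
    return clock_mhz * 1e6 * samples_per_symbol / 1e9


def sustained_throughput_gbps(clock_mhz, data_subcarriers=15, bits_per_symbol=6):
    """Payload throughput in Gb/s (15 data subcarriers, 64-QAM by default)."""
    return clock_mhz * 1e6 * data_subcarriers * bits_per_symbol / 1e9


fpga_virtex6 = sustained_throughput_gbps(225)   # Virtex-6 at 225 MHz
asic_180nm = sustained_throughput_gbps(250)     # 180-nm ASIC at 250 MHz
stratix2 = sustained_throughput_gbps(153)       # Stratix II at 153 MHz
gain_over_single_band = stratix2 / 10.0 - 1.0   # versus an assumed 10 Gb/s
```

With these assumptions the sketch yields 9 GS/s and 20.25 Gb/s at 225 MHz, 22.5 Gb/s at 250 MHz, and 13.77 Gb/s at 153 MHz, the last being roughly 38% above 10 Gb/s.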


ACKNOWLEDGMENT
The authors would like to thank Prof. Salehi and
Dr. Beyranvand for their helpful suggestions throughout this work.
REFERENCES
[1] S. Tian, K. Panta, H. A. Suraweera, B. J. C. Schmidt, S. McLaughlin, and J. Armstrong, "A novel timing synchronization method for ACO-OFDM-based optical wireless communications," IEEE Trans. Wireless Commun., vol. 7, no. 12, pp. 4958-4967, Dec. 2008.
[2] J. Armstrong, "OFDM for optical communications," J. Lightw. Technol., vol. 27, no. 3, pp. 189-204, Feb. 1, 2009.
[3] Y. Tang, "High-speed optical transmission system using coherent optical orthogonal frequency-division multiplexing," Ph.D. dissertation, Dept. Electr. Electron. Eng., Univ. Melbourne, Melbourne, Australia, 2010.
[4] W. Shieh, "OFDM for flexible high-speed optical networks," J. Lightw. Technol., vol. 29, no. 10, pp. 1560-1577, May 15, 2011.
[5] E. Hugues-Salas et al., "Directly modulated VCSEL-based real-time 11.25-Gb/s optical OFDM transmission over 2000-m legacy MMFs," IEEE Photon. J., vol. 4, no. 1, pp. 143-154, Feb. 2012.
[6] Y. Benlachtar et al., "Real-time digital signal processing for the generation of optical orthogonal frequency-division-multiplexed signals," IEEE J. Sel. Topics Quantum Electron., vol. 16, no. 5, pp. 1235-1244, Sep./Oct. 2010.
[7] R. I. Killey et al., "Recent progress on real-time DSP for direct detection optical OFDM transceivers," in Proc. Conf. OFC/NFOEC, 2011.
[8] M. H. Shoreh, H. Beyranvand, and J. A. Salehi, "Mathematical modeling of nonlinearity impairments in optical OFDM communication systems using multiple optical phase conjugate," in Proc. IWCIT, May 2013, pp. 1-5.
[9] S. Coleri, M. Ergen, A. Puri, and A. Bahai, "Channel estimation techniques based on pilot arrangement in OFDM systems," IEEE Trans. Broadcast., vol. 48, no. 3, pp. 223-229, Sep. 2002.
[10] B. J. C. Schmidt, A. J. Lowery, and J. Armstrong, "Experimental demonstrations of electronic dispersion compensation for long-haul transmission using direct-detection optical OFDM," J. Lightw. Technol., vol. 26, no. 1, pp. 196-203, Jan. 1, 2008.
[11] R. Chassaing and D. Reay, Digital Signal Processing and Applications With the TMS320C6713 and TMS320C6416 DSK, 2nd ed. Hoboken, NJ, USA: Wiley, 2008.
[12] N. Kehtarnavaz, Real-Time Digital Signal Processing Based on the TMS320C6000. New York, NY, USA: Elsevier, 2004.
[13] E. Ip and J. M. Kahn, "Compensation of dispersion and nonlinear impairments using digital backpropagation," J. Lightw. Technol., vol. 26, no. 20, pp. 3416-3425, Oct. 15, 2008.
[14] E. Hugues-Salas, R. P. Giddings, and J. M. Tang, "First experimental demonstration of real-time adaptive transmission of 20 Gb/s dual-band optical OFDM signals over 500 m OM2 MMFs," in Proc. OFC/NFOEC, 2013, pp. 1-3.
[15] R. Bouziane et al., "Design studies for ASIC implementations of 28 GS/s optical QPSK- and 16-QAM OFDM transceivers," Opt. Exp., vol. 19, no. 21, pp. 20857-20864, 2011.


Reza Ghanaatian was born in Jahrom, Iran, on May 13, 1988. He received the B.Sc. degree in electrical engineering from the Khaje Nasir University of Technology (KNTU), Tehran, Iran, in 2010, and the M.Sc. degree in digital systems from the Department of Electrical Engineering, Sharif University of Technology (SUT), Tehran, in 2012.
He has been with the Advanced Integrated Circuit Design Laboratory (AICDL), Sharif University of Technology, since 2010. His current research interests include VLSI architecture design of digital signal processing algorithms for wireless and optical communication systems and field-programmable gate array-based systems.

Mahdi Shabany received the B.Sc. degree in electrical engineering from the Sharif University of Technology (SUT), Tehran, Iran, in 2002, and the M.Sc.
and Ph.D. degrees in electrical engineering from the
University of Toronto, Toronto, ON, Canada, in 2004
and 2008, respectively.
He is an Associate Professor with the Electrical Engineering Department, Sharif University of
Technology and also works with the University of
Toronto periodically as a Visiting Researcher. He
was with Redline Communications Co., Toronto,
from 2007 to 2008, where he developed and patented designs for WiMAX systems. He also served as a Postdoctoral Fellow with the University of Toronto in
2009. He holds three U.S. patents. His current research interests include digital
electronics, VLSI architecture/algorithm design for broadband communication
systems, efficient implementation of signal processing algorithms for various
applications including imaging, and bio-oriented systems.

Morteza H. Shoreh was born in Tehran, Iran, on January 9, 1988. He received the B.Sc. degree in electrical engineering from the University of Tehran (UT), in 2010, and the M.Sc. degree in communication systems from the Department of Electrical Engineering, Sharif University of Technology (SUT), Tehran, in 2012.
He has been with the Optical Networks Research Lab (ONRL) since 2011, under the supervision of Prof. Jawad A. Salehi. His current research interests include high-speed optical communications, all-optical networking devices, nonlinear optics, optical signal processing, optical networks (CO-OFDM and OFDM-CDMA), and wireless communications systems and networks.