
APRIL 2019 VOLUME 8 NUMBER 2 IWCLAF ISSN (2162-2345)
Optimal User Pairing for Downlink Non-Orthogonal Multiple Access (NOMA) . . . Lipeng Zhu, Jun Zhang, Zhenyu Xiao, Xianbin Cao, and Dapeng Oliver Wu 328
On the Capacity of Gaussian MIMO Channels Under the Joint Power Constraints . . . Sergey Loyka 332
Secure Transmission With Interleaver for Uplink Sparse Code Multiple Access System . . . Ke Lai, Lei Wen, Jing Lei, Gaojie Chen, Pei Xiao, and Amine Maaref 336
Hybrid Modulation Scheme Combining PPM With Differential Chaos Shift Keying Modulation . . . Meiyuan Miao, Lin Wang, Marcos Katz, and Weikai Xu 340
Double Shadowing the Rician Fading Model . . . Nidhi Simmons, Carlos Rafael Nogueira da Silva, Simon L. Cotton, Paschalis C. Sofotasios, and Michel Daoud Yacoub 344
Energy-Efficient Prefix Code Based Backscatter Communication for Wirelessly Powered Networks . . . Yufan Zhang, Ertao Li, Yi-Hua Zhu, Kaikai Chi, and Xianzhong Tian 348
Outage Constrained Robust Multigroup Multicast Beamforming for Multi-Beam Satellite Communication Systems . . . Li You, Ao Liu, Wenjin Wang, and Xiqi Gao 352
Low-Complexity Differential Spatial Modulation . . . Ruey-Yi Wei and Tzu-Yun Lin 356
Time-Expanded Graph-Based Resource Allocation Over the Satellite Networks . . . Peng Wang, Xiushe Zhang, Shun Zhang, Hongyan Li, and Tao Zhang 360
A Novel Frequency Allocation Scheme for In Band Full Duplex Systems in 5G Networks . . . Parthiban Annamalai, Jyotsna Bapat, and Debabrata Das 364
Optimal Transmission Scheduling in Small Multimodal Underwater Networks . . . Filippo Campagnaro, Paolo Casari, Michele Zorzi, and Roee Diamant 368
Placement Delivery Array Design via Attention-Based Sequence-to-Sequence Model With Deep Neural Network . . . Zhengming Zhang, Meng Hua, Chunguo Li, Yongming Huang, and Luxi Yang 372
Massive MIMO-OFDM Channel Estimation via Distributed Compressed Sensing . . . Abbas Akbarpour-Kasgari and Mehrdad Ardebilipour 376
Analysis of Unslotted IEEE 802.15.4 Networks With Heterogeneous Traffic Classes . . . J. Ortín, M. Cesana, A. E. C. Redondi, M. Canales, and J. R. Gállego 380

IEEE COMMUNICATIONS SOCIETY
The field of interest of the Communications Society consists of all telecommunications, including telephony, telegraphy, facsimile, and point-to-point television, by
electromagnetic propagation, including radio; wire; aerial; underground, coaxial, and submarine cables; waveguides, communication satellites, and lasers; in marine, aeronautical,
space, and fixed station services; repeaters, radio relaying, signal storage, and regeneration; telecommunication error detection and correction; multiplexing and carrier
techniques; communication switching systems; data communications; and communication theory. All members of the IEEE are eligible for membership in the Society upon
payment of the annual Society membership fee of $33.00. Members may receive this publication upon payment of an additional $20.00 (electronic). Society members can
subscribe to a Digital Library of Multiple Communications Society publications, including this one, for $178.00 (three years of access) or $277.00 (access from 1953). Please
contact www.comsoc.org, www.ieee.org, or write to IEEE at the address below. Member copies of Transactions/Journals are for personal use only.
IEEE WIRELESS COMMUNICATIONS LETTERS

Editorial Board
R. SCHOBER, Director of Journals, Friedrich-Alexander Univ. Erlangen-Nürnberg, Erlangen, Germany
WEI ZHANG, Editor-in-Chief, School of Electrical Eng. & Telecom., Univ. of New South Wales, Sydney, Australia
J. MILIZZO, Assistant Publisher, IEEE Communications Society, 3 Park Avenue, New York, NY 10016
SENIOR EDITORS
DUSIT NIYATO, Nanyang Technological Univ., Singapore
KAI KIT WONG, Dept. of Electron. & Elect. Eng., Univ. College London, London, United Kingdom
XIANG-GEN XIA, Dept. of Elect. and Comput. Eng., Univ. of Delaware, Newark, DE, USA
ASSOCIATE EDITORS
KOICHI ADACHI, The Univ. of Electro-Communications, Tokyo, Japan
SERGEY ANDREA, Tampere Univ. of Technology, Tampere, Finland
MOHAMAD ASSAAD, CentraleSupélec, France
LIN BAI, Beihang Univ., Beijing, China
CHAN-BYOUNG CHAE, Yonsei Univ., Seoul, Korea
YANJIAO CHEN, Wuhan Univ., Hubei, China
JINHO CHOI, Gwangju Institute of Science and Technology, Gwangju, Korea
KAEWON CHOI, Seoul National Univ. of Science and Technology, Seoul, Korea
CHUN TUNG CHOU, Univ. of New South Wales, Sydney, Australia
XIAOLI CHU, Univ. of Sheffield, Sheffield, United Kingdom
JUSTIN COON, Univ. of Oxford, Oxford, United Kingdom
SWADES DE, Indian Institute of Technology Delhi, New Delhi, India
TOMASO DE COLA, German Aerospace Center (DLR), Wessling, Germany
RODRIGO C. DE LAMARE, Pontifical Catholic Univ. of Rio de Janeiro, Rio de Janeiro, Brazil
HARPREET DHILLON, Virginia Tech, Blacksburg, VA, USA
PANAGIOTIS DIAMANTOULAKIS, Aristotle Univ. of Thessaloniki, Thessaloniki, Greece
PAWEL DMOCHOWSKI, Victoria Univ. of Wellington, Wellington, New Zealand
MIANXIONG DONG, Muroran Institute of Technology, Muroran, Japan
YUE GAO, Queen Mary Univ. of London, London, United Kingdom
WALAA HAMOUDA, Concordia Univ., Montreal, Canada
DINH THAI HOANG, Univ. of Technol. Sydney, Sydney, Australia
CHUAN HUANG, Univ. of Electronic Science and Technology of China, Chengdu, China
YONGMING HUANG, Southeast Univ., Nanjing, China
HARSHAN JAGADEESH, Indian Institute of Technology Delhi, Delhi, India
ABLA KAMMOUN, King Abdullah Univ. of Science and Technology, Thuwal, Saudi Arabia
MARIOS KOUNTOURIS, France Research Center, Huawei Technologies Co. Ltd., France
INGMAR LAND, France Research Center, Huawei Technologies Co. Ltd., France
JUNGWOO LEE, Seoul National Univ., Seoul, Korea
PAN LI, Case Western Reserve Univ., Cleveland, OH, USA
AN LIU, Hong Kong Univ. of Science and Technology, Clear Water Bay, Hong Kong SAR
SHAODAN MA, Univ. of Macau, Macau SAR
BEHROOZ MAKKI, Ericsson, Stockholm, Sweden
PANOS P. MARKOPOULOS, Rochester Institute of Technology, Rochester, NY, USA
JAN MIETZNER, Hamburg Univ. of Applied Sciences, Hamburg, Germany
SAIF KHAN MOHAMMED, Indian Institute of Technology Delhi, New Delhi, India
MOHAMMED NAFIE, Cairo Univ., Cairo, Egypt
LAKSHMI PRASAD NATARAJAN, Indian Institute of Technology Hyderabad, Telangana, India
HIEN NGO, Queen’s Univ. Belfast, Belfast, UK
AYÇA ÖZÇELIKKALE, Uppsala Univ., Uppsala, Sweden
PRZEMYSLAW PAWELCZAK, TU Delft, Delft, Netherlands
VASANTHAN RAGHAVAN, Qualcomm Flarion Technologies, Inc., Bridgewater, NJ, USA
TANELI RIIHONEN, Tampere Univ. of Technology, Tampere, Finland
CONG SHEN, Univ. of Sci. and Technol. of China, China
YUAN SHEN, Tsinghua Univ., Beijing, China
MIN SHENG, Xidian Univ., Xi’An, China
DANIEL KC SO, Univ. of Manchester, Manchester, United Kingdom
DANIELE TARCHI, Univ. of Bologna, Bologna, Italy
MANUEL VELEZ, Univ. of Basque Country, Vizcaya, Spain
RUI WANG, Southern Univ. of Sci. and Technol., Guangdong, China
RUI WANG, Tongji Univ., Shanghai, China
CHAO-KAI WEN, National Sun Yat-sen Univ., Kaohsiung, Taiwan
DIRK WÜBBEN, Univ. of Bremen, Bremen, Germany
JIE XU, Guangdong Univ. of Technology, Guangzhou, China
KOJI YAMAMOTO, Kyoto Univ., Kyoto, Japan
SHAOSHI YANG, Beijing Univ. of Posts and Telecommunications, Beijing, China
HIROYUKI YOMO, Kansai Univ., Osaka, Japan
GUANDING YU, Zhejiang Univ., Hangzhou, China
JIANHUA ZHANG, Beijing Univ. of Posts and Telecommunications, Beijing, China
HAIBO ZHOU, Nanjing Univ., Nanjing, China
SHENG ZHOU, Tsinghua Univ., Beijing, China
XIANGYUN ZHOU, The Australian National Univ., Canberra, Australia
YU ZHU, Fudan Univ., Shanghai, China
IEEE OFFICERS
JOSÉ M. F. MOURA, President
TOSHIO FUKUDA, President-Elect
KATHLEEN KRAMER, Secretary
JOSEPH V. LILLIE, Treasurer
JAMES A. JEFFERIES, Past President
WITOLD M. KINSNER, Vice President, Educational Activities
HULYA KIRKICI, Vice President, Publication Services and Products
FRANCIS B. GROSZ, JR., Vice President, Member and Geographic Activities
ROBERT S. FISH, President, Standards Association
K. J. RAY LIU, Vice President, Technical Activities
THOMAS M. COUGHLIN, President, IEEE-USA
VIJAY K. BHARGAVA, Director, Division III-Communications Technology
IEEE EXECUTIVE STAFF

STEPHEN P. WELBY, Executive Director & Chief Operating Officer
THOMAS SIEGERT, Business Administration
JULIE EVE COZIN, Corporate Governance
DONNA HOURICAN, Corporate Strategy
JAMIE MOESCH, Educational Activities
SOPHIA A. MUIRHEAD, General Counsel & Chief Compliance Officer
VACANT, Human Resources
CHRIS BRANTLEY, IEEE-USA
CHERIF AMIRAT, Information Technology
KAREN HAWKINS, Marketing
CECELIA JANKOWSKI, Member and Geographic Activities
MICHAEL FORSTER, Publications
KONSTANTINOS KARACHALIOS, Standards Association
MARY WARD-CALLAN, Technical Activities
IEEE Publishing Operations

Senior Director, Publishing Operations: DAWN MELLEY
Director, Editorial Services: KEVIN LISANKIE
Director, Production Services: PETER M. TUOHY
Associate Director, Editorial Services: JEFFREY E. CICHOCKI
Associate Director, Information Conversion and Editorial Support: NEELAM KHINVASARA
Managing Editor: CHRISTOPHER PERRY
Journals Coordinator: CATHERINE VAN SCIVER
IEEE WIRELESS COMMUNICATIONS LETTERS (ISSN 2162-2337) is published bi-monthly by the Institute of Electrical and Electronics Engineers, Inc. Responsibility for the
contents rests upon the authors and not upon the IEEE, the Society/Council, or its members. IEEE Corporate Office: 3 Park Avenue, 17th Floor, New York, NY 10016-5997.
IEEE Operations Center: 445 Hoes Lane, P.O. Box 1331, Piscataway, NJ 08855-1331. NJ Telephone: +1 732 981 0060. Price/Publication Information: Individual copies:
IEEE Members $20.00 (first copy only), nonmembers $91.00 per copy. Member and nonmember subscription prices available upon request. Available in microfiche and
microfilm. Copyright and Reprint Permissions: Abstracting is permitted with credit to the source. Libraries are permitted to photocopy for private use of patrons, provided
the per-copy fee indicated in the code at the bottom of the first page is paid through the Copyright Clearance Center, 222 Rosewood Drive, Danvers, MA 01923. For all
other copying, reprint, or republication permission, write to Copyrights and Permissions Department, IEEE Publications Administration, 445 Hoes Lane, P.O. Box 1331,
Piscataway, NJ 08855-1331. Copyright © 2019 by the Institute of Electrical and Electronics Engineers, Inc. All rights reserved. IEEE prohibits discrimination, harassment and
bullying. For more information visit http://www.ieee.org/nondiscrimination. Printed in U.S.A.
Digital Object Identifier 10.1109/LWC.2019.2905313

Full-Duplex Energy-Harvesting Enabled Relay Networks in Generalized Fading Channels . . . Khaled Rabie, Bamidele Adebisi, Galymzhan Nauryzbayev, Osamah S. Badarneh, Xingwang Li, and Mohamed-Slim Alouini 384
Connectivity and Blockage Effects in Millimeter-Wave Air-To-Everything Networks . . . Kaifeng Han, Kaibin Huang, and Robert W. Heath, Jr. 388
Receiver Design for OOK Modulation Over Turbulence Channels Using Source Transformation . . . Mohammad Taghi Dabiri and Seyed Mohammad Sajad Sadough 392
Handover Probability Analysis of Anchor-Based Multi-Connectivity in 5G User-Centric Network . . . Hongtao Zhang, Wanqing Huang, and Yi Liu 396
Traffic-Aware Relay Vehicle Selection in Millimeter-Wave Vehicle-to-Vehicle Communication . . . Bo Fan, Hui Tian, Shushan Zhu, Yanyan Chen, and Xuzhen Zhu 400
Decentralized Precoding for Cache-Enabled Ultra-Dense Radio Access Networks . . . Shiwen He, Yiyun Chen, Ju Ren, Yongming Huang, Luxi Yang, and Yaoxue Zhang 404
LoRa Throughput Analysis With Imperfect Spreading Factor Orthogonality . . . Antoine Waret, Megumi Kaneko, Alexandre Guitton, and Nancy El Rachkidy 408
An Adaptive Optimal Mapping Selection Algorithm for PNC Using Variable QAM Modulation . . . Tong Peng, Yi Wang, Alister G. Burr, and Mohammad Shikh-Bahaei 412
Deep Learning-Based CSI Feedback Approach for Time-Varying Massive MIMO Channels . . . Tianqi Wang, Chao-Kai Wen, Shi Jin, and Geoffrey Ye Li 416
A Dynamic Pricing Strategy for Vehicle Assisted Mobile Edge Computing Systems . . . Di Han, Wei Chen, and Yuguang Fang 420
Spectral-Energy Efficiency Pareto Front in Cellular Networks: A Stochastic Geometry Framework . . . Marco Di Renzo, Alessio Zappone, Thanh Tu Lam, and Mérouane Debbah 424
Rician K-Factor-Based Analysis of XLOS Service Probability in 5G Outdoor Ultra-Dense Networks . . . Hatim Chergui, Mustapha Benjillali, and Mohamed-Slim Alouini 428
Power Splitting-Based SWIPT Systems With Decoding Cost . . . Mohsen Abedi, Hamed Masoumi, and Mohammad Javad Emadi 432
On Iterative Compensation of Clipping Distortion in OFDM Systems . . . Shansuo Liang, Jun Tong, and Li Ping 436
Automatic Modulation Classification Using Cyclic Correntropy Spectrum in Impulsive Noise . . . Jitong Ma and Tianshuang Qiu 440
Vertical and Horizontal Building Entry Loss Measurement in 4.9 GHz Band by Unmanned Aerial Vehicle . . . Kentaro Saito, Qiwei Fan, Nopphon Keerativoranan, and Jun-ichi Takada 444
Multi-Slot Allocation Protocols for Massive IoT Devices With Small-Size Uploading Data . . . Tsung-Yen Chan, Yi Ren, Yu-Chee Tseng, and Jyh-Cheng Chen 448
On the Sum-Rate of Heterogeneous Networks With Low-Resolution ADC Quantized Full-Duplex Massive MIMO-Enabled Backhaul . . . Prince Anokye, Roger K. Ahiadormey, Changick Song, and Kyoung-Jae Lee 452
Efficient Computation of Multivariate Rayleigh and Exponential Distributions . . . Reneeta Sara Isaac and Neelesh B. Mehta 456
Adaptive Frequency Band and Channel Selection for Simultaneous Receiving and Sending in Multiband Communication . . . Ayaka Hanyu, Yuichi Kawamoto, Hiroki Nishiyama, Nei Kato, Naoto Egashira, Kazuto Yano, and Tomoaki Kumagai 460
Localization Using Blind RSS Measurements . . . Yongchang Hu, Jiani Liu, and Bingbing Zhang 464
Fast Analog Transmission for High-Mobility Wireless Data Acquisition in Edge Learning . . . Yuqing Du and Kaibin Huang 468
On Achieving the Maximum Streaming Rate in Hybrid Wired/Wireless Overlay Networks . . . Jianwei Zhang, Xinchang Zhang, Meng Sun, and Chunling Yang 472
Interleave-Division Multiple Access in High Rate Applications . . . Yang Hu, Chulong Liang, Lei Liu, Chunlin Yan, Yifei Yuan, and Li Ping 476
To Establish a Secure Channel From a Full-Duplex Transmitter to a Half-Duplex Receiver: An Artificial-Noise-Aided Scheme . . . Xinyue Hu, Caihong Kai, Shengli Zhang, Zhongyi Guo, and Jun Gao 480
Hybrid Precoding for Single Carrier Wideband Multi-Subarray Millimeter Wave Systems . . . Wei Huang, Zhaohua Lu, Yongming Huang, and Luxi Yang 484
Threshold Setting for Multiple Primary User Spectrum Sensing via Spherical Detector . . . Xi Yang, Kejun Lei, Shengliang Peng, Li Hu, Shu Li, and Xiuying Cao 488
High-Accuracy Entity State Prediction Method Based on Deep Belief Network Toward IoT Search . . . Puning Zhang, Xuyuan Kang, Dapeng Wu, and Ruyan Wang 492

High Rate CCK Modulation Design for Bandwidth Efficient Link Adaptation . . . Han Wang, Lianyou Jing, Chengbing He, and Zhi Ding 496
Sequential 0/1 for Cooperative Spectrum Sensing in the Presence of Strategic Byzantine Attack . . . Jun Wu, Yue Yu, Tiecheng Song, and Jing Hu 500
Spherical Wave Positioning Based on Curvature of Arrival by an Antenna Array . . . Siwei Zhang, Thomas Jost, Robert Pöhlmann, Armin Dammann, Dmitriy Shutin, and Peter Adam Hoeher 504
Molecular Communication: The First Arrival Position Channel . . . Nilay Pandey, Ranjan K. Mallik, and Brejesh Lall 508
Spatial Correlations of a 3-D Non-Stationary MIMO Channel Model With 3-D Antenna Arrays and 3-D Arbitrary Trajectories . . . Qiuming Zhu, Ying Yang, Cheng-Xiang Wang, Yi Tan, Jian Sun, Xiaomin Chen, and Weizhi Zhong 512
Protograph-Based Folded Spatially Coupled LDPC Codes for Burst Erasure Channels . . . Inayat Ali, Hyunjae Lee, Ayaz Hussain, and Sang-Hyo Kim 516
Distribution of the Number of Users per Base Station in Cellular Networks . . . Geordie George, Angel Lozano, and Martin Haenggi 520
Joint Power, Altitude, Location and Bandwidth Optimization for UAV With Underlaid D2D Communications . . . Wenhuan Huang, Zhaohui Yang, Cunhua Pan, Lu Pei, Ming Chen, Mohammad Shikh-Bahaei, Maged Elkashlan, and Arumugam Nallanathan 524
OTFS-Based Multiple-Access in High Doppler and Delay Spread Wireless Channels . . . Venkatesh Khammammetti and Saif Khan Mohammed 528
Optimal Hybrid Beamforming for Multiuser Massive MIMO Systems With Individual SINR Constraints . . . Guangda Zang, Ying Cui, Hei Victor Cheng, Feng Yang, Lianghui Ding, and Hui Liu 532
On the Error Rate Analysis of Coded OFDM Over Multipath Fading Channels . . . Jinho Choi 536
Adaptive AoA and Polarization Estimation for Receiving Polarized mmWave Signals . . . Hang Li, Thomas Q. Wang, Xiaojing Huang, and Y. Jay Guo 540
Antieigenvalue-Based Spectrum Sensing for Cognitive Radio . . . Chen Guo, Ming Jin, Qinghua Guo, and Youming Li 544
MRB Decoding of LT Codes Over AWGN Channels . . . Valerio Bioglio 548
Spectral and Energy Efficient Resource Allocation for Massive MIMO HetNets With Wireless Backhaul . . . Bo Huang and Aihuang Guo 552
Practical User Selection With Heterogeneous Bandwidth and Antennas for MU-MIMO WLANs . . . Sulei Wang, Zhe Chen, Yuedong Xu, Xin Wang, and Qingsheng Kong 556
Coupling Information Transmission With Window Decoding . . . Alireza Karami and Dmitri Truhachev 560
Secure UAV-to-UAV Systems With Spatially Random UAVs . . . Jia Ye, Chao Zhang, Hongjiang Lei, Gaofeng Pan, and Zhiguo Ding 564
Flexible-Rate SIC-Free NOMA for Downlink VLC Based on Constellation Partitioning Coding . . . Chen Chen, Wen-De Zhong, Helin Yang, Pengfei Du, and Yanbing Yang 568
Meta Distribution of Downlink Non-Orthogonal Multiple Access (NOMA) in Poisson Networks . . . Konpal Shaukat Ali, Hesham ElSawy, and Mohamed-Slim Alouini 572
PAPR Reduction Based on Parallel Tabu Search for Tone Reservation in OFDM Systems . . . Yajun Wang, Renjie Zhang, Jun Li, and Feng Shu 576
Resource Allocation in UAV-Assisted M2M Communications for Disaster Rescue . . . Xilong Liu and Nirwan Ansari 580
Coded Redundant Message Transmission Schemes for Low-Power Wide Area IoT Applications . . . Samuel Montejo-Sánchez, Cesar A. Azurdia-Meza, Richard Demo Souza, Evelio Martin Garcia Fernandez, Ismael Soto, and Arliones Hoeller, Jr. 584
On Optimizing Effective Rate for Random Linear Network Coding Over Burst-Erasure Relay Links . . . Huangnan Wu, Ye Li, Yingdong Hu, Bin Tang, and Zhihua Bao 588
Different Power Adaption Methods on Fluctuating Two-Ray Fading Channels . . . Hui Zhao, Zhedong Liu, and Mohamed-Slim Alouini 592
Optimal Dynamic Capacity Allocation for High Throughput Satellite Communications Systems . . . Anargyros J. Roumeliotis, Charilaos I. Kourogiorgas, and Athanasios D. Panagopoulos 596
Learning-Based Wireless Powered Secure Transmission . . . Dongxuan He, Chenxi Liu, Hua Wang, and Tony Q. S. Quek 600

New Analytical Approach in the SER Evaluation of CSIN-Assisted AF Dual-Hop Wireless Systems . . . Yazid M. Khattabi 604
DF-CSPG: A Potential Game Approach for Device-Free Localization Exploiting Joint Sparsity . . . Sixing Yang, Yan Guo, Ning Li, and Dagang Fang 608
Fundamentals on Base Stations in Urban Cellular Networks: From the Perspective of Algebraic Topology . . . Ying Chen, Rongpeng Li, Zhifeng Zhao, and Honggang Zhang 612
Modified Conjugate Beamforming for Cell-Free Massive MIMO . . . Masoud Attarifar, Aliazam Abbasfar, and Angel Lozano 616
Divergence-Optimal Fixed-to-Fixed Length Distribution Matching With Shell Mapping . . . Patrick Schulte and Fabian Steiner 620
SCR-Based Tone Reservation Schemes With Fast Convergence for PAPR Reduction in OFDM System . . . Jingqi Wang, Xin Lv, and Wen Wu 624
Average Age of Information in Wireless Powered Sensor Networks . . . Ioannis Krikidis 628
User Cooperation in Wireless-Powered Backscatter Communication Networks . . . Bin Lyu, Dinh Thai Hoang, and Zhen Yang 632
Tag Cardinality Estimation Using Expectation-Maximization in ALOHA-Based RFID Systems With Capture Effect and Detection Error . . . Chuyen T. Nguyen, Van-Dinh Nguyen, and Anh T. Pham 636
Pilot Allocation and Computationally Efficient Non-Iterative Estimation of Phase Noise in OFDM . . . Ville Syrjälä, Toni Levanen, Tero Ihalainen, and Mikko Valkama 640
Energy-Perceptive MAC for Wireless Power and Information Transfer . . . Youngil Cho, Yunmin Kim, and Tae-Jin Lee 644
COMMENTS AND CORRECTIONS

Corrections to “Outage Analysis for Decode-and-Forward Multirelay Systems Allowing Intra-Link Errors” . . . Albrecht Wolf, Diana Cristina González, Meik Dörpinghaus, Luciano Leonel Mendes, José Cândido Silveira Santos Filho, and Gerhard Fettweis 648
328 IEEE WIRELESS COMMUNICATIONS LETTERS, VOL. 8, NO. 2, APRIL 2019
Optimal User Pairing for Downlink Non-Orthogonal Multiple Access (NOMA)

Lipeng Zhu , Jun Zhang, Zhenyu Xiao , Senior Member, IEEE, Xianbin Cao , Senior Member, IEEE,
and Dapeng Oliver Wu, Fellow, IEEE
Abstract: In this letter, we explore user pairing in a downlink non-orthogonal multiple access (NOMA) network. As power allocation inherently intertwines with user pairing, a joint user pairing and power allocation problem is considered to optimize the achievable sum rate (ASR) with a minimum rate constraint for each user, which is a mixed integer programming problem. To solve this non-convex problem, we first obtain the optimal power allocation in a NOMA system with only 2 users; then analyze the user pairing problem in a simplified situation, i.e., a NOMA system with four users. Finally, we obtain the closed-form globally optimal solution in a general NOMA system. Extensive performance evaluations are conducted to compare the ASRs of the NOMA and OMA systems. Results show that the performance of the NOMA system with the proposed optimal user pairing is significantly better than that of the OMA system, as well as the performance of the NOMA system with random user pairing.

Index Terms: Non-orthogonal multiple access (NOMA), user pairing, power allocation.

Manuscript received June 4, 2018; accepted June 28, 2018. Date of publication July 9, 2018; date of current version April 9, 2019. This work was supported in part by the National Key Research and Development Program under Grant 2016YFB1200100, in part by the National Natural Science Foundation of China under Grant 61571025 and Grant 91538204, and in part by the Open Research Fund of Key Laboratory of Space Utilization, Chinese Academy of Sciences under Grant LSU-DZXX-2017-02. The associate editor coordinating the review of this paper and approving it for publication was R. C. de Lamare. (Corresponding author: Jun Zhang.)
L. Zhu, Z. Xiao, and X. Cao are with the School of Electronic and Information Engineering, Beihang University, Beijing 100191, China, also with the Key Laboratory of Advanced technology of Near Space Information System, Ministry of Industry and Information Technology of China, Beijing 100191, China, and also with the National Engineering Laboratory for Big Data Application Technologies for Comprehensive Traffic, Beijing 100191, China.
J. Zhang is with the Advanced Research Institute of Multidisciplinary Science, Beijing Institute of Technology, Beijing 100081, China (e-mail: buaazhangjun@vip.sina.com).
D. O. Wu is with the Department of Electrical and Computer Engineering, University of Florida, Gainesville, FL 32611 USA.
Digital Object Identifier 10.1109/LWC.2018.2853741

I. INTRODUCTION

NON-ORTHOGONAL multiple access (NOMA) is considered a key candidate technology for the fifth generation (5G) networks [1]–[4]. The basic idea of NOMA is to serve multiple users in the same resource (time/frequency/code) block (RB). As the signals of different users are superimposed in the power domain, the receivers exploit successive interference cancellation (SIC) to separate them. Thus both the number of users and the spectrum efficiency can be improved manyfold. In general, 2-user NOMA is a typical scenario, where the number of users performing NOMA in a single RB is 2. The decoding complexity and delay at the receivers are lower than with more-user NOMA. For analytical tractability, we consider 2-user NOMA in this letter [3], [5]–[7], while the generalization of 2-user NOMA to p-user NOMA is also considered in Section III-D.

When applying 2-user NOMA to mobile cellular networks, the achievable benefit is highly dependent on user pairing [3], [7]. In [3], user pairing under two cases, i.e., NOMA with fixed power allocation (F-NOMA) and cognitive-radio-inspired NOMA (CR-NOMA), was studied. A general criterion was given to design distributed approaches for dynamic user pairing/grouping, but an explicit user pairing strategy was not given [3]. User pairing in the CR-NOMA system was further studied in [7], where the conventional distributed matching algorithm (DMA) was adopted to maximize the achievable sum rate (ASR). In this letter, we also explore the user pairing problem, but with a minimum rate constraint for all the NOMA users, so as to guarantee the users' quality of service (QoS). With this system model, we obtain the closed-form globally-optimal solution.

II. SYSTEM MODEL AND PROBLEM FORMULATION

A. System Model

Without loss of generality, we consider a downlink NOMA system. There are $2K$ users uniformly deployed in a disc, namely $D$ with the radius $d$. The base station (BS) is located at the center of $D$. For the sake of improving the spectrum efficiency, the users are paired into $K$ clusters. The paired users transmit information in the same RB, but users from different pairs should transmit in different RBs. For each pair of users, the BS transmits a superimposed signal as

$$s = \sqrt{\alpha_m P}\, s_m + \sqrt{\alpha_n P}\, s_n, \tag{1}$$

where $s_k$ ($k = m, n$) is the signal for User-$k$, and $E(|s_k|^2) = 1$. $P$ is the total transmission power for each pair of users. $\alpha_k$ ($k = m, n$) denotes the coefficient of signal power for User-$k$, and $\alpha_m + \alpha_n = 1$.

The received signals at the users are

$$\begin{aligned} y_m &= h_m\left(\sqrt{\alpha_m P}\, s_m + \sqrt{\alpha_n P}\, s_n\right) + \hat{n}_m, \\ y_n &= h_n\left(\sqrt{\alpha_m P}\, s_m + \sqrt{\alpha_n P}\, s_n\right) + \hat{n}_n, \end{aligned} \tag{2}$$

where $h_k$ ($k = m, n$) denotes the channel gain between the BS and User-$k$, which is assumed to be Rayleigh distributed. Without loss of generality, we assume that the channel gain obeys the standard Rayleigh distribution at the node 100 m away from the BS. Additionally, the pathloss is defined as $1/d_{ab}^2$, where $d_{ab}$ is the distance between node $a$ and node $b$. $\hat{n}_k$ denotes the white noise at User-$k$ with power $\sigma^2$.

Without loss of generality, assume the channel gains of the users are sorted as $|h_1|^2 \le |h_2|^2 \le \cdots \le |h_{2K}|^2$. In particular,
2162-2345 c 2018 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.
See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
ZHU et al.: OPTIMAL USER PAIRING FOR DOWNLINK NOMA 329
User-$m$ and User-$n$ are two of the $2K$ users with $|h_m|^2 \le |h_n|^2$. The channel gain of User-$n$ is higher, so User-$n$ can first decode and remove $s_m$, and then decode $s_n$. By contrast, User-$m$ directly decodes $s_m$, and meanwhile $s_n$ is treated as noise. The SIC technique is only employed at the user with higher channel gain in each pair. Then the achievable rates of the users are

$$\begin{aligned} R_m^{(m,n)} &= \log_2\left(1 + \frac{|h_m|^2 (1-\alpha_n)}{|h_m|^2 \alpha_n + \gamma^{-1}}\right), \\ R_n^{(m,n)} &= \log_2\left(1 + |h_n|^2 \alpha_n \gamma\right), \end{aligned} \tag{3}$$

where $\gamma = P/\sigma^2$, and $\alpha_m$ is replaced by $(1-\alpha_n)$.

The achievable rate of the user in an OMA system is given by

$$R_k^{(\mathrm{OMA})} = \frac{1}{2}\log_2\left(1 + |h_k|^2 \gamma\right), \tag{4}$$

where the factor $\frac{1}{2}$ is due to the fact that conventional OMA results in a multiplexing loss of $\frac{1}{2}$.

B. Problem Formulation

First, it is necessary to define a $2K$-dimension matrix $\mathbf{U}$ to represent the pairing relationship among the users. The formation rule of $\mathbf{U}$ is

$$u_{m,n} = \begin{cases} 1, & \text{User-}m \text{ pairs User-}n; \\ 0, & \text{others}; \end{cases} \tag{5}$$

where $u_{m,n}$ is the $m$-th row and $n$-th column element. It is obvious that $\mathbf{U}^T = \mathbf{U}$ due to the definition, and the diagonal elements of $\mathbf{U}$ are zero because one user cannot pair with itself. Each user can pair with one and only one user, so the summation of the elements in each row or column of $\mathbf{U}$ is 1. Furthermore, there are minimal rate constraints for the users, i.e., the achievable rate of the user in the NOMA system should be no less than that of this user in the OMA system. Then the optimization problem is formulated as

$$\begin{aligned} \underset{\{\alpha_n,\, u_{m,n}\}}{\text{Maximize}} \quad & \sum_{m=1}^{2K} \sum_{n=m+1}^{2K} u_{m,n}\left(R_m^{(m,n)} + R_n^{(m,n)}\right) \\ \text{Subject to} \quad & R_m^{(m,n)} \ge u_{m,n} R_m^{(\mathrm{OMA})}, \\ & R_n^{(m,n)} \ge u_{m,n} R_n^{(\mathrm{OMA})}, \\ & 0 \le \alpha_n \le 1, \quad 1 \le n \le 2K, \\ & u_{m,n} \in \{0, 1\}, \quad 1 \le m, n \le 2K, \\ & u_{m,n} = u_{n,m}, \quad 1 \le m, n \le 2K, \\ & u_{m,m} = 0, \quad 1 \le m \le 2K, \\ & \sum_{m=1}^{2K} u_{m,n} = 1, \quad 1 \le n \le 2K, \\ & \sum_{n=1}^{2K} u_{m,n} = 1, \quad 1 \le m \le 2K. \end{aligned} \tag{6}$$

Problem (6) is a mixed integer programming problem. The complexity of direct search for the optimal user pairing is $O((2K-1)!!)$, which is rather high. In this letter, we will obtain the closed-form solution of Problem (6) by analytical derivation.

III. SOLUTION OF THE PROBLEM

In order to solve the original problem, we commence from the cases with fewer users, i.e., $2K = 2, 4$, which are easier and instructive for investigating the general case.

A. Power Allocation for NOMA With 2 Users

In a NOMA system with 2 users, we do not need to consider user pairing. Problem (6) is simplified as

$$\begin{aligned} \underset{\{\alpha_n\}}{\text{Maximize}} \quad & R_m^{(m,n)} + R_n^{(m,n)} \\ \text{Subject to} \quad & R_m^{(m,n)} \ge R_m^{(\mathrm{OMA})}, \\ & R_n^{(m,n)} \ge R_n^{(\mathrm{OMA})}, \\ & 0 \le \alpha_n \le 1. \end{aligned} \tag{7}$$

There is only one variable, $\alpha_n$, in Problem (7). We give the optimal solution in the following theorem.

Theorem 1: In a NOMA system with 2 users, the optimal power allocation of Problem (7) is

$$\alpha_n^{(m,n)} = \frac{\sqrt{1 + |h_m|^2 \gamma} - 1}{|h_m|^2 \gamma}. \tag{8}$$

Proof: From (3), the derivative of the objective function in Problem (7) is

$$\frac{d\left(R_m^{(m,n)} + R_n^{(m,n)}\right)}{d\alpha_n} = \frac{1}{\ln 2} \cdot \frac{\left(|h_n|^2 - |h_m|^2\right)\gamma^{-1}}{\left(|h_m|^2 \alpha_n + \gamma^{-1}\right)\left(|h_n|^2 \alpha_n + \gamma^{-1}\right)}. \tag{9}$$

Note that we have assumed $|h_m|^2 \le |h_n|^2$, thus the objective function is nondecreasing in $\alpha_n$. The range of $\alpha_n$ can be obtained from the constraints directly.

$$\begin{aligned} & R_m^{(m,n)} \ge R_m^{(\mathrm{OMA})} \\ \Leftrightarrow\ & \log_2\left(1 + \frac{|h_m|^2 (1-\alpha_n)}{|h_m|^2 \alpha_n + \gamma^{-1}}\right) \ge \frac{1}{2}\log_2\left(1 + |h_m|^2 \gamma\right) \\ \Leftrightarrow\ & \alpha_n \le \frac{\sqrt{1 + |h_m|^2 \gamma} - 1}{|h_m|^2 \gamma}. \end{aligned} \tag{10}$$

$$\begin{aligned} & R_n^{(m,n)} \ge R_n^{(\mathrm{OMA})} \\ \Leftrightarrow\ & \log_2\left(1 + |h_n|^2 \alpha_n \gamma\right) \ge \frac{1}{2}\log_2\left(1 + |h_n|^2 \gamma\right) \\ \Leftrightarrow\ & \alpha_n \ge \frac{\sqrt{1 + |h_n|^2 \gamma} - 1}{|h_n|^2 \gamma}. \end{aligned} \tag{11}$$

Thus the optimal solution of $\alpha_n$ is the upper bound, i.e.,

$$\alpha_n^{(m,n)} = \frac{\sqrt{1 + |h_m|^2 \gamma} - 1}{|h_m|^2 \gamma}. \tag{12}$$

Finally, we verify the range of $\alpha_n^{(m,n)}$. As $|h_m|^2 \gamma > 0$, we have $\sqrt{1 + |h_m|^2 \gamma} > 1$. Thus,

$$0 < \frac{\sqrt{1 + |h_m|^2 \gamma} - 1}{|h_m|^2 \gamma} < \frac{\left(1 + |h_m|^2 \gamma\right) - 1}{|h_m|^2 \gamma} = 1.$$

The superscript of $\alpha_n^{(m,n)}$ denotes the indices of the paired users. It is worth noting that the achievable rate of the user with lower channel gain is $R_m^{(m,n)} = R_m^{(\mathrm{OMA})}$ when the power allocation is optimal. That is to say, only necessary power is allocated to the user with a worse channel
to satisfy the rate constraint, and all the remaining power is allocated to the user with a better channel to maximize the ASR.

B. User Pairing for NOMA With 4 Users

In a NOMA system with 4 users, if User-1 pairs with one user, then the other two users must be a pair. For this reason, there are three cases of user pairing, which are

Case 1: User-1 pairs User-2, i.e., $\{u_{1,2} = 1, u_{3,4} = 1\}$. The ASR with the optimal power allocation in (12) is

$$\begin{aligned} R_{\text{case1}} &= R_1^{(1,2)} + R_2^{(1,2)} + R_3^{(3,4)} + R_4^{(3,4)} \\ &= R_1^{(\mathrm{OMA})} + R_2^{(1,2)} + R_3^{(\mathrm{OMA})} + R_4^{(3,4)}. \end{aligned} \tag{13}$$

Case 2: User-1 pairs User-3, i.e., $\{u_{1,3} = 1, u_{2,4} = 1\}$. The ASR with the optimal power allocation in (12) is

$$\begin{aligned} R_{\text{case2}} &= R_1^{(1,3)} + R_2^{(2,4)} + R_3^{(1,3)} + R_4^{(2,4)} \\ &= R_1^{(\mathrm{OMA})} + R_2^{(\mathrm{OMA})} + R_3^{(1,3)} + R_4^{(2,4)}. \end{aligned} \tag{14}$$

Case 3: User-1 pairs User-4, i.e., $\{u_{1,4} = 1, u_{2,3} = 1\}$. The ASR with the optimal power allocation in (12) is

$$\begin{aligned} R_{\text{case3}} &= R_1^{(1,4)} + R_2^{(2,3)} + R_3^{(2,3)} + R_4^{(1,4)} \\ &= R_1^{(\mathrm{OMA})} + R_2^{(\mathrm{OMA})} + R_3^{(2,3)} + R_4^{(1,4)}. \end{aligned} \tag{15}$$

To compare the ASRs of the three cases, we have the following theorem.

Theorem 2: In a NOMA system with 4 users, the order of ASRs is

$$R_{\text{case1}} \le R_{\text{case2}} \le R_{\text{case3}}. \tag{16}$$

Proof: Note that the expression of $\alpha_n^{(m,n)}$ in (12) is not influenced by index $n$, thus we have

$$\begin{cases} \alpha_2^{(1,2)} = \alpha_3^{(1,3)} = \alpha_4^{(1,4)} = \dfrac{\sqrt{1 + |h_1|^2 \gamma} - 1}{|h_1|^2 \gamma} \triangleq \beta_1, \\[2mm] \alpha_3^{(2,3)} = \alpha_4^{(2,4)} = \dfrac{\sqrt{1 + |h_2|^2 \gamma} - 1}{|h_2|^2 \gamma} \triangleq \beta_2, \\[2mm] \alpha_4^{(3,4)} = \dfrac{\sqrt{1 + |h_3|^2 \gamma} - 1}{|h_3|^2 \gamma} \triangleq \beta_3. \end{cases} \tag{17}$$

Define $\varphi(x) = \frac{\sqrt{1+x} - 1}{x}$ and it is easy to find that $\varphi'(x) < 0$ when $x > 0$. For this reason, according to the assumption that $|h_1|^2 \le |h_2|^2 \le |h_3|^2$, the order of $\beta_k$ ($k = 1, 2, 3$) is

$$\beta_3 \le \beta_2 \le \beta_1. \tag{18}$$

Then, we have

$$\begin{aligned} R_{\text{case3}} - R_{\text{case2}} &= \left(R_4^{(1,4)} + R_3^{(2,3)}\right) - \left(R_3^{(1,3)} + R_4^{(2,4)}\right) \\ &= \log_2\left(\frac{\left(1 + |h_4|^2 \beta_1 \gamma\right)\left(1 + |h_3|^2 \beta_2 \gamma\right)}{\left(1 + |h_3|^2 \beta_1 \gamma\right)\left(1 + |h_4|^2 \beta_2 \gamma\right)}\right) \\ &= \log_2\left(1 + \frac{\left(|h_4|^2 - |h_3|^2\right)\left(\beta_1 - \beta_2\right)\gamma}{\left(1 + |h_3|^2 \beta_1 \gamma\right)\left(1 + |h_4|^2 \beta_2 \gamma\right)}\right) \ge 0. \end{aligned} \tag{19}$$

Define $\psi(x) = \log_2\left(\frac{(1 + \beta_1 x)^2}{1 + x}\right)$ and it is easy to find that $\psi'(x) \ge 0$ when $x \ge |h_1|^2 \gamma$. Thus we have

$$\begin{aligned} R_{\text{case2}} - R_{\text{case1}} &= \left(R_3^{(1,3)} + R_4^{(2,4)} + R_2^{(\mathrm{OMA})}\right) - \left(R_2^{(1,2)} + R_4^{(3,4)} + R_3^{(\mathrm{OMA})}\right) \\ &= \log_2\left(\frac{\left(1 + |h_3|^2 \beta_1 \gamma\right)\left(1 + |h_4|^2 \beta_2 \gamma\right)}{\left(1 + |h_2|^2 \beta_1 \gamma\right)\left(1 + |h_4|^2 \beta_3 \gamma\right)} \sqrt{\frac{1 + |h_2|^2 \gamma}{1 + |h_3|^2 \gamma}}\right) \\ &\ge \log_2\left(\frac{1 + |h_3|^2 \beta_1 \gamma}{1 + |h_2|^2 \beta_1 \gamma} \sqrt{\frac{1 + |h_2|^2 \gamma}{1 + |h_3|^2 \gamma}}\right) \\ &= \frac{1}{2}\left[\log_2\left(\frac{\left(1 + \beta_1 |h_3|^2 \gamma\right)^2}{1 + |h_3|^2 \gamma}\right) - \log_2\left(\frac{\left(1 + \beta_1 |h_2|^2 \gamma\right)^2}{1 + |h_2|^2 \gamma}\right)\right] \ge 0, \end{aligned} \tag{20}$$

where the first inequality follows from $\beta_2 \ge \beta_3$ and the second from the monotonicity of $\psi(x)$.

C. User Pairing for NOMA With 2K Users

Theorem 1 shows the optimal power allocation for each pair of users, and meanwhile Theorem 2 shows the optimal pairing for a NOMA system with 4 users. If there is an optimal user pairing strategy for a NOMA system with $2K$ users, arbitrary 2 pairs of the users with the optimal user pairing strategy will constitute a NOMA system with 4 users, and the ASR should always be optimal, i.e., Case 3 in (16). We give the following theorem to solve the original problem.

Theorem 3: In a NOMA system with $2K$ users, the optimal user pairing is

$$u_{m,n} = \begin{cases} 1, & |m + n| = 2K + 1; \\ 0, & \text{others}. \end{cases} \tag{21}$$

Another expression of Theorem 3 is that User-$k$ ($1 \le k \le K$) pairs User-$(2K-k+1)$.

Proof: We can prove this theorem by mathematical induction:

1) When $i = 1$, we use contradiction to prove that $u_{1,2K} = 1$. Assume User-1 pairs User-$m$ ($2 \le m \le 2K-1$) and User-$(2K)$ pairs User-$n$ ($2 \le n \le 2K-1$). User-1, User-$m$, User-$n$ and User-$(2K)$ constitute a NOMA system with 4 users, and the user pairing among them is Case 1 ($m < n$) or Case 2 ($m > n$). Both of them are worse than Case 3. Thus the assumption does not hold. In other words, User-1 inevitably pairs User-$(2K)$, i.e., $u_{1,2K} = 1$.

2) When $i = k$ ($k \ge 1$), we assume that $u_{1,2K} = 1, u_{2,2K-1} = 1, \ldots, u_{k,2K-k+1} = 1$, and we also use contradiction to prove that $u_{k+1,2K-k} = 1$. Assume User-$(k+1)$ pairs User-$m$ ($k+2 \le m \le 2K-k-1$) and User-$(2K-k)$ pairs User-$n$ ($k+2 \le n \le 2K-k-1$). User-$(k+1)$, User-$m$, User-$n$ and User-$(2K-k)$ constitute a NOMA system with 4 users, and the user pairing among them is Case 1 ($m < n$) or Case 2 ($m > n$). Both of them are worse than Case 3. Thus the assumption does not hold. In other words, User-$(k+1)$ inevitably pairs User-$(2K-k)$, i.e., $u_{k+1,2K-k} = 1$.

3) Finally, we can conclude that $u_{1,2K} = 1, u_{2,2K-1} = 1, \ldots, u_{K,K+1} = 1$.

Thus, the closed-form solution of Problem (6) is obtained in (12) and (21), which is globally optimal. Given the order
of the channel gains, the optimal user pairing is determined according to the closed-form expression of (21). The main computational complexity is the calculation of power allocation, which is shown in (12). Thus, the computational complexity of the proposed approach is $O(K)$.

D. Generalization Consideration

The proposed approach solves the joint user pairing and power allocation problem in a 2-user NOMA system. When the number of users in the same RB increases, it becomes a joint user grouping and power allocation problem. The basic idea of analyzing the user grouping problem from 1 group, 2 groups, to $K$ groups is also instructive. First, the optimal power allocation can be obtained in the 1-group case. Then, the user grouping among two and more groups of users can be analyzed analogously. However, it may be difficult to find a closed-form optimal solution of user grouping, because user grouping has a more complicated structure than user pairing for 2-user NOMA. Concretely, if we group $2p$ users into two groups, there are in total $C_{2p-2}^{p-1}$ combinations. The exponential increase of alternatives makes it difficult to find the optimal user grouping like Theorem 2 with an analytical approach. Instead, an optimization approach may be used to find the optimal or a suboptimal solution of the user grouping problem. This topic will be studied in detail in our future work.

IV. PERFORMANCE SIMULATION

In this section, we evaluate the performance of the proposed optimal user pairing scheme with minimal rate constraint in a 2-user NOMA network. As [3] and [7] did not consider a minimal rate constraint, we compare the performance of NOMA with the proposed optimal user pairing with that of NOMA utilizing random user pairing but optimal power allocation, as well as that of the conventional OMA system.

Figs. 1 and 2 show the comparison of ASRs between NOMA with the proposed optimal user pairing (NOMA optimal), NOMA with random user pairing (NOMA random) and OMA, with varying number of users and varying total power to noise ratio, respectively. Each point in these figures is the average performance based on $10^3$ user distributions and channel realizations. It can be observed from the two figures that the ASR for optimal user pairing is significantly better than that for random user pairing and OMA.

Fig. 1. Comparison of ASRs between NOMA with optimal user pairing, NOMA with random user pairing and OMA with varying number of users, where $P/\sigma^2 = 20$ dB and $d = 500$ m.

Fig. 2. Comparison of ASRs between NOMA with optimal user pairing, NOMA with random user pairing and OMA with varying total power to noise ratio, where $K = 16$ and $d = 500$ m.

V. CONCLUSION

In this letter we have studied user pairing and power allocation in a downlink 2-user NOMA network. As joint user pairing and power allocation to optimize the ASR is a mixed integer programming problem, we first obtain the optimal power allocation in a NOMA system with 2 users; then analyze the user pairing problem in a simplified situation, i.e., a scenario with 4 users. Finally, we obtain the globally-optimal closed-form solution in a general NOMA system. Performance evaluations are conducted to compare the ASRs of NOMA with the proposed optimal user pairing, NOMA with random user pairing and OMA. Results validate our analysis and show that the proposed user pairing scheme is optimal among the alternatives.

REFERENCES

[1] L. Dai et al., “Non-orthogonal multiple access for 5G: solutions, challenges, opportunities, and future research trends,” IEEE Commun. Mag., vol. 53, no. 9, pp. 74–81, Sep. 2015.
[2] A. Benjebbour et al., “Concept and practical considerations of non-orthogonal multiple access (NOMA) for future radio access,” in Proc. Int. Symp. Intell. Signal Process. Commun. Syst., Naha, Japan, Nov. 2013, pp. 770–774.
[3] Z. Ding, P. Fan, and H. V. Poor, “Impact of user pairing on 5G nonorthogonal multiple-access downlink transmissions,” IEEE Trans. Veh. Technol., vol. 65, no. 8, pp. 6010–6023, Aug. 2016.
[4] Z. Xiao, L. Zhu, J. Choi, P. Xia, and X.-G. Xia, “Joint power allocation and beamforming for non-orthogonal multiple access (NOMA) in 5G millimeter wave communications,” IEEE Trans. Wireless Commun., vol. 17, no. 5, pp. 2961–2974, May 2018.
[5] Z. Ding, P. Fan, and H. V. Poor, “Random beamforming in millimeter-wave NOMA networks,” IEEE Access, vol. 5, pp. 7667–7681, 2017.
[6] Q. Sun, S. Han, C.-L. I, and Z. Pan, “On the ergodic capacity of MIMO NOMA systems,” IEEE Wireless Commun. Lett., vol. 4, no. 4, pp. 405–408, Aug. 2015.
[7] W. Liang, Z. Ding, Y. Li, and L. Song, “User pairing for downlink non-orthogonal multiple access networks using matching algorithm,” IEEE Trans. Commun., vol. 65, no. 12, pp. 5319–5332, Dec. 2017.
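As a numerical illustration of the letter above, the following minimal Python sketch evaluates the power allocation in (12) and the pairing rule in (21), then compares the result against an exhaustive search over all pairings for a small system. It is only a sketch under stated assumptions: the six values of $|h_k|^2 \gamma$ are hypothetical, and the function names are illustrative and do not come from the letter.

```python
import math


def beta(g):
    """Power fraction of the stronger user from (12); g = |h_m|^2 * gamma of the weaker user."""
    return (math.sqrt(1.0 + g) - 1.0) / g


def oma_rate(g):
    """OMA rate from (4), with g = |h_k|^2 * gamma."""
    return 0.5 * math.log2(1.0 + g)


def pair_rates(g_m, g_n):
    """NOMA rates (3) of a pair with g_m <= g_n under the allocation (12)."""
    a_n = beta(g_m)
    r_m = math.log2(1.0 + g_m * (1.0 - a_n) / (g_m * a_n + 1.0))
    r_n = math.log2(1.0 + g_n * a_n)
    return r_m, r_n


def sum_rate(gains, pairing):
    """ASR of a pairing given as index pairs (i, j), i < j, into the sorted gains."""
    return sum(sum(pair_rates(gains[i], gains[j])) for i, j in pairing)


def theorem3_pairing(n_users):
    """Pair the k-th weakest user with the k-th strongest one, as in (21)."""
    return [(k, n_users - 1 - k) for k in range(n_users // 2)]


def all_pairings(users):
    """Yield every perfect matching of the index list; there are (2K-1)!! of them."""
    if not users:
        yield []
        return
    first, rest = users[0], users[1:]
    for idx, partner in enumerate(rest):
        remaining = rest[:idx] + rest[idx + 1:]
        for tail in all_pairings(remaining):
            yield [(first, partner)] + tail


# Hypothetical values of |h_k|^2 * gamma for 2K = 6 users, sorted in increasing order.
gains = sorted([0.5, 2.0, 7.0, 15.0, 40.0, 120.0])
candidates = list(all_pairings(list(range(len(gains)))))
best = max(candidates, key=lambda p: sum_rate(gains, p))
print(len(candidates), sorted(best), theorem3_pairing(len(gains)))
```

For these gains the exhaustive search visits $5!! = 15$ pairings and should select the same pairing as (21), while the weaker user of every pair receives exactly its OMA rate, as noted after Theorem 1. The brute-force search is included only as a cross-check; its cost grows as $(2K-1)!!$, which is why the closed-form rule matters for larger $K$.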
On the Capacity of Gaussian MIMO Channels Under the Joint Power Constraints

Sergey Loyka

Abstract: The capacity and optimal signaling over a fixed Gaussian MIMO channel are considered under the joint total and per-antenna power constraints. While the general case remains an open problem, a closed-form full-rank solution is obtained along with its sufficient and necessary conditions. The conditions for each constraint to be inactive are established. The high and low-SNR regimes are studied. Isotropic signaling is shown to be optimal in the former case while rank-1 signaling (beamforming) is not necessarily optimal in the latter case. Unusual properties of optimal covariance under the joint constraints are pointed out.

Index Terms: MIMO, channel capacity, power constraint, optimal signalling.

Manuscript received June 7, 2018; revised August 10, 2018; accepted August 27, 2018. Date of publication August 30, 2018; date of current version April 9, 2019. The associate editor coordinating the review of this paper and approving it for publication was L. P. Natarajan.
The author is with the School of Electrical Engineering and Computer Science, University of Ottawa, Ottawa, ON K1N 6N5, Canada (e-mail: sergey.loyka@ieee.org).
Digital Object Identifier 10.1109/LWC.2018.2867858
2162-2345 © 2018 Crown Copyright

I. INTRODUCTION

MULTI-ANTENNA (MIMO) systems have been widely accepted by both academia and industry due to their high spectral efficiency. Presently, MIMO systems experience a resurgence of interest in the form of massive MIMO, which is considered a key technology for future 5G systems to meet ever-increasing traffic demand when a limited bandwidth is available [1]. The capacity of a fixed Gaussian MIMO channel and its optimal signaling strategy are well-known under the total transmit (Tx) power constraint (TPC): the optimal signaling is on the channel eigenmodes with power allocation given by the water-filling (WF) procedure [2], [3]. While the TPC is motivated by a limited power (energy) supply, individual per-antenna powers can also be limited when each antenna is equipped with its own amplifier (of limited power), in either collocated or distributed implementations, hence motivating the per-antenna power constraint (PAC) for single-user as well as multi-user systems, as in [4]–[9]. While a number of iterative optimization algorithms have been proposed [4]–[6], closed-form solutions are known only in some special cases. The capacity and optimal signaling for a fixed Gaussian MISO channel under the PAC have been established in [7]; the solution is significantly different from the standard WF solution and is equivalent to the equal-gain transmission (EGT) with phases adjusted to compensate for the channel phase shifts. This problem remains open in the general MIMO case while a numerical algorithm was proposed in [8] and a closed-form full-rank solution was obtained in [9].

The joint constraints, i.e., the TPC and the PAC simultaneously, are motivated by the scenario with limited overall power budget and where each antenna is equipped with its own power amplifier. The capacity of a fixed Gaussian MISO channel under the joint TPC and PAC has been established in [10], where it was shown that the optimal signaling is a combination of EGT and maximum ratio transmission (MRT), with phase shifts adjusted to compensate channel-induced phase shifts. Following the remark in [9, Sec. II-B], the MISO result can be also adapted to any rank-1 MIMO channel. This result was further extended to fading MIMO channels, where it was shown that isotropic signaling is optimal if the fading distribution is right-unitary-invariant [10].

While an iterative algorithm to compute an optimal covariance was developed in [11] for the general MIMO case under the joint constraints, it provides limited insights (due to its iterative nature) and no closed-form solution is yet known. The key difficulty is the fact that, unlike the TPC only case, the feasible set of Tx covariance matrices is not isotropic anymore (due to the PAC) and hence the tools developed under the TPC (which exploit this symmetry) cannot be used anymore. New tools are needed.

This letter partially closes this gap by obtaining a closed-form full-rank solution for the optimal signaling (Tx covariance matrix) and respective capacity of a fixed full-rank Gaussian MIMO channel under the joint constraints when the constraint powers exceed certain thresholds, thus extending earlier analytical results in [9] and [10]. Sufficient and necessary conditions for optimal signaling to be of full rank are also established. Optimal signaling under the joint constraints is shown to have properties significantly different from those under the TPC only, see Section VI. It is the inter-play between the TPC and PAC that induces these unusual properties. The conditions when either the TPC or the PAC is inactive are given. The high and low-SNR regimes are studied. Isotropic signaling is shown to be optimal under the joint constraints in the former case while rank-1 signaling (beamforming) is not necessarily optimal in the latter case (in contrast to the standard WF signaling).

Notations: Bold lower-case letters denote column vectors while bold capitals denote matrices; $\mathbf{R}^+$ is the Hermitian conjugation of $\mathbf{R}$; $r_{ii}$ denotes the $i$-th diagonal entry of $\mathbf{R}$; $(\mathbf{R})_{ij}$ is the $ij$-th entry of $\mathbf{R}$; $\lambda_i(\mathbf{R})$ is the $i$-th eigenvalue of $\mathbf{R}$; unless indicated otherwise, eigenvalues are in decreasing order: $\lambda_1 \ge \lambda_2 \ge \ldots$; $\mathbf{R} \ge 0$ means that $\mathbf{R}$ is positive semi-definite; $|\mathbf{R}|$ is the determinant of $\mathbf{R}$; $\mathbf{I}$ is the identity matrix of appropriate size.

II. CHANNEL MODEL

Let us consider a discrete-time model of a fixed Gaussian MIMO channel:

$$\mathbf{y} = \mathbf{H}\mathbf{x} + \boldsymbol{\xi} \tag{1}$$
LOYKA: ON CAPACITY OF GAUSSIAN MIMO CHANNELS UNDER JOINT POWER CONSTRAINTS 333
where $\mathbf{y}$, $\mathbf{x}$, $\boldsymbol{\xi}$ and $\mathbf{H}$ are the received and transmitted signals, noise and channel, respectively; $m$ is the number of transmit antennas. The noise is Gaussian circularly-symmetric with zero mean and unit variance, so that power is also the SNR. The channel $\mathbf{H}$ is fixed and known to the transmitter and the receiver (Rx). Under the Tx power constraint(s), Gaussian signaling is known to be optimal in this setting [2], [3] so that finding the channel capacity $C$ and optimal signaling amounts to finding an optimal Tx covariance matrix $\mathbf{R}$:

$$C = \max_{\mathbf{R} \in S_R} \ln |\mathbf{I} + \mathbf{W}\mathbf{R}| \tag{2}$$

where $\mathbf{W} = \mathbf{H}^+ \mathbf{H}$, $S_R$ is the constraint set. In the case of the TPC constraint only, it takes the form $S_R = \{\mathbf{R} : \mathbf{R} \ge 0, \operatorname{tr}\mathbf{R} \le P_T\}$, where $P_T$ is the maximum total Tx power, and the optimal covariance is well-known: the optimal signaling is on the channel eigenmodes with optimal power allocation via the water-filling, which can be compactly expressed as

$$\mathbf{R}^*_{WF} = (\lambda^{-1}\mathbf{I} - \mathbf{W}^{-1})_+ \tag{3}$$

where $(\mathbf{A})_+$ retains positive eigenmodes of a Hermitian matrix,

$$(\mathbf{A})_+ = \sum_{i:\, \lambda_i(\mathbf{A}) > 0} \lambda_i(\mathbf{A})\, \mathbf{u}_i \mathbf{u}_i^+ \tag{4}$$

where $\mathbf{u}_i$ is the $i$-th eigenvector of $\mathbf{A}$; $\lambda > 0$ is determined from the TPC $\operatorname{tr}\mathbf{R}^*_{WF} = P_T$.

Under the PA constraints, $S_R = \{\mathbf{R} : \mathbf{R} \ge 0, r_{ii} \le P\}$, where $r_{ii}$ is the $i$-th diagonal entry of $\mathbf{R}$ (the Tx power of the $i$-th antenna), $P$ is the PA power constraint. No closed-form solution is known for the optimal covariance in the general case under this constraint, while such solutions are available in the MISO case [7] and in the MIMO case when the optimal covariance is of full-rank [9].

The joint power constraints, i.e., TPC and PAC, are motivated by practical designs where each antenna has its own amplifier (and hence PAC) while limited total power/energy supply motivates TPC. The optimal signaling and capacity have been obtained in closed form under the joint constraints for the MISO channel in [10], while the general MIMO case remains an open problem. The next section provides a closed-form full-rank solution for the MIMO case as well as sufficient and necessary conditions for this solution to hold and some related properties.

III. OPTIMAL SIGNALING AND CAPACITY

Following the standard arguments, see [2], [3], Gaussian signaling is still optimal under the joint constraints and the channel capacity $C$ is as in (2), where the constraint set $S_R$ is as follows:

$$S_R = \{\mathbf{R} : \mathbf{R} \ge 0,\ \operatorname{tr}\mathbf{R} \le P_T,\ r_{ii} \le P\} \tag{5}$$

and $P_T$, $P$ are the total and per-antenna constraint powers.

Unfortunately, no closed-form solution is known for the optimal covariance in (2) under the constraints in (5) in the general case. The following Theorem partially closes this gap and gives a closed-form full-rank solution for optimal signaling in this setting. To this end, let $D(\mathbf{W})$ be the diagonal matrix retaining only diagonal entries of $\mathbf{W}$ (with all off-diagonal entries set to zero), and $\bar{D}(\mathbf{W}) = \mathbf{W} - D(\mathbf{W})$ retaining off-diagonal entries only; $d_i = (\mathbf{W}^{-1})_{ii}$, where, without loss of generality, $d_1 \le d_2 \le \ldots$, i.e., in increasing order.

Theorem 1: Let the channel matrix in (1) be of full column rank, $\mathbf{W} = \mathbf{H}^+ \mathbf{H} > 0$, and let the per-antenna and total transmit constraint powers be sufficiently high,

$$P > \lambda_m^{-1}(\mathbf{W}), \quad P_T > m\lambda_m^{-1}(\mathbf{W}) - \operatorname{tr}\mathbf{W}^{-1} \tag{6}$$

Then, the (unique) optimal Tx covariance $\mathbf{R}^*$ in (2) under the TPC and PAC in (5) is of full-rank and is given by

$$\mathbf{R}^* = \min\left(P\mathbf{I},\ \lambda^{-1}\mathbf{I} - D(\mathbf{W}^{-1})\right) - \bar{D}(\mathbf{W}^{-1}) \tag{7}$$

$$\phantom{\mathbf{R}^*} = \lambda^{-1}\mathbf{I} - \mathbf{W}^{-1} - \left((\lambda^{-1} - P)\mathbf{I} - D(\mathbf{W}^{-1})\right)_+ \tag{8}$$

where the operator min applies entry-wise; $\lambda = 0$ if $mP \le P_T$, otherwise,

$$\lambda^{-1} = \frac{1}{m-k}\left(P_T + \sum_{i=k+1}^{m} d_i - kP\right), \tag{9}$$

$k$ is the number of active PACs, determined as the largest integer satisfying

$$\alpha_k \ge mP - P_T \tag{10}$$

where $\alpha_k = \sum_{i=k}^{m} (d_i - d_k)$. The capacity is

$$C = \ln|\mathbf{W}| + \sum_{i=1}^{m} \ln \min(\lambda^{-1}, P + d_i) \tag{11}$$

Proof: See the Appendix.

Note that $\alpha_k$ is decreasing in $k$ so that (10) can be efficiently solved (even for massive MIMO) by verifying the condition for $k$ in increasing order and stopping at the largest $k$ satisfying it. The number $k$ of active PACs decreases with $mP - P_T$.

The expression in (8) has the following interpretation: its first part $\lambda^{-1}\mathbf{I} - \mathbf{W}^{-1}$ is the standard full-rank WF solution under the TPC only, and its 2nd part $(\lambda^{-1}\mathbf{I} - D(\mathbf{W}^{-1}) - P\mathbf{I})_+$ is a correction term accounting for the PAC.

It follows from (7) that per-antenna powers are as follows:

$$r_{ii} = \min\left(P,\ \lambda^{-1} - (\mathbf{W}^{-1})_{ii}\right) > 0 \tag{12}$$

which also has an insightful interpretation: these powers are the minimum of those under the PAC and TPC individually (1st and 2nd term in the min operator, respectively).

Next, we observe that the solution in Theorem 1 reduces to the known solutions in some special cases.

Corollary 1: In Theorem 1, if the per-antenna constraint power $P$ is sufficiently high,

$$P > m^{-1}(P_T + \alpha_1), \tag{13}$$

then all PACs are inactive and (8) reduces to the standard WF solution, $\mathbf{R}^* = \lambda^{-1}\mathbf{I} - \mathbf{W}^{-1}$, where $\lambda^{-1} = m^{-1}(P_T + \operatorname{tr}\mathbf{W}^{-1})$.

Corollary 2: In Theorem 1, if the TPC power $P_T$ is sufficiently high, $P_T \ge mP$, then all PACs are active, the TPC is inactive and (7) reduces to the PAC-only full-rank solution in [9]: $\mathbf{R}^* = P\mathbf{I} - \bar{D}(\mathbf{W}^{-1})$.
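The search for $k$ in (10) and the resulting per-antenna powers in (12) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the author's implementation: the function names are hypothetical, the inputs are the sorted diagonal entries $d_i$ of $\mathbf{W}^{-1}$, and the numerical values ($\mathbf{W} = \mathrm{diag}\{2, 1\}$, $P = 2$, $P_T = 3.8$) are chosen here only so that the conditions in (6) hold. Only the diagonal of $\mathbf{R}^*$ is computed; by (7), its off-diagonal entries are those of $-\mathbf{W}^{-1}$.

```python
import math


def joint_constraint_powers(d, P, PT):
    """Per-antenna powers r_ii from (12), with lambda^{-1} from (9) and k from (10).

    d  : diagonal entries d_i = (W^{-1})_ii, sorted in increasing order
    P  : per-antenna constraint power
    PT : total constraint power
    Returns (k, lam_inv, powers).
    """
    m = len(d)
    if m * P <= PT:
        # TPC inactive (lambda = 0): every antenna transmits at its limit P.
        return m, math.inf, [P] * m
    gap = m * P - PT
    k = 0
    for j in range(1, m + 1):
        # alpha_j = sum_{i=j}^{m} (d_i - d_j), using 1-based indices as in (10).
        alpha_j = sum(d[i] - d[j - 1] for i in range(j - 1, m))
        if alpha_j >= gap:
            k = j
        else:
            break  # alpha_j is decreasing in j, so no larger j can satisfy (10)
    lam_inv = (PT + sum(d[k:]) - k * P) / (m - k)
    powers = [min(P, lam_inv - di) for di in d]
    return k, lam_inv, powers


def capacity(log_det_W, d, P, lam_inv):
    """Capacity in nats from (11)."""
    return log_det_W + sum(math.log(min(lam_inv, P + di)) for di in d)


# Hypothetical example: W = diag{2, 1}, so d = (0.5, 1.0) and ln|W| = ln 2.
d = [0.5, 1.0]
k, lam_inv, powers = joint_constraint_powers(d, P=2.0, PT=3.8)
C = capacity(math.log(2.0), d, 2.0, lam_inv)
print(k, lam_inv, powers, C)
```

In this example one PAC is active ($k = 1$), the first antenna transmits at the limit $P$, and the second takes the remaining total power, so that the powers sum to $P_T$. Because $\mathbf{W}$ is diagonal here, the capacity can be cross-checked directly as $\ln|\mathbf{I} + \mathbf{W}\mathbf{R}^*| = \ln(5 \cdot 2.8) = \ln 14$.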
Corollary 3: In Theorem 1, the $i$-th PAC is active if and only if

$$(\mathbf{W}^{-1})_{ii} < \lambda^{-1} - P \tag{14}$$

It should be pointed out that while the conditions in (6) are sufficient for the optimal signaling to be of full-rank, they are not necessary, i.e., there are cases where the optimal signaling is of full-rank even when (6) does not hold. The following proposition gives necessary conditions for an optimal covariance to be of full-rank.

Proposition 1: Let $\mathbf{W} > 0$. The necessary conditions for optimal covariance $\mathbf{R}^*$ to be of full rank are as follows:

$$P > \lambda_1(\bar{D}(\mathbf{W}^{-1})), \quad \lambda < \lambda_m(\mathbf{W}) \tag{15}$$

where the 1st condition is also sufficient if $mP \le P_T$ (inactive TPC), and $\lambda$ is determined from (9).

Proof: Using (7), $\mathbf{R}^* \le P\mathbf{I} - \bar{D}(\mathbf{W}^{-1})$, so that $\mathbf{R}^* > 0$ implies $P\mathbf{I} > \bar{D}(\mathbf{W}^{-1})$ and hence the 1st condition in (15). The 2nd condition is obtained from $0 < \mathbf{R}^* \le \lambda^{-1}\mathbf{I} - \mathbf{W}^{-1}$.

Based on this, the following procedure can be used to establish whether optimal covariance is of full-rank in general:

1. If $P_T \ge mP$, then the 1st condition in (15) is both sufficient and necessary for $\mathbf{R}^* > 0$ and (7) applies.
2. If $P_T < mP$, define $\mathbf{R}^*(\lambda)$ for a given $\lambda > 0$ from (7) and find $\lambda$ from (9). If $\mathbf{R}^*(\lambda) > 0$, then it is a solution; otherwise, optimal covariance is rank-deficient.

This procedure gives an exhaustive characterization of all cases when $\mathbf{R}^*$ is of full rank for a full-rank channel (since it follows from KKT conditions which are necessary for optimality).

In the following, we characterize the conditions when some constraints are inactive for a full-rank channel (even when optimal covariance is not of full rank).

Proposition 2: Let $\mathbf{W} > 0$. If the TPC is inactive, then all PACs are active. Hence, (i) when at least one PAC is inactive, the TPC is active; (ii) the TPC is inactive if and only if $mP \le P_T$.

Proof: Follows from the stationarity condition in (24).

It should be noted that this Proposition does not hold if the channel is rank-deficient, as the example below demonstrates.

IV. HIGH-SNR REGIME

It is well-known that isotropic signaling is optimal at high SNR for the standard WF solution (under the TPC only) in a full-rank channel,

$$\mathbf{R}^*_{WF} \approx \frac{P_T}{m}\mathbf{I} \tag{16}$$

when $P_T \gg m\lambda_m^{-1}(\mathbf{W})$. In this section, we establish the optimality of isotropic signaling under the joint constraints. As a first step, the following proposition shows that isotropic signaling is optimal at high SNR under the PAC.

Proposition 3: Consider a full column-rank channel ($\mathbf{W} > 0$). Isotropic signaling is optimal in this channel under the PAC in the high-SNR regime, i.e., when $P \gg \lambda_m^{-1}(\mathbf{W})$,

$$\mathbf{R}^*_{PAC} \approx P\mathbf{I}, \quad C_{PAC} \approx \ln|\mathbf{W}| + m \ln P \tag{17}$$

Proof: First, observe that

$$C_{PAC} \ge C(P\mathbf{I}) \tag{18}$$

where $C(\mathbf{R}) = \ln|\mathbf{I} + \mathbf{W}\mathbf{R}|$, since $\mathbf{R} = P\mathbf{I}$ is feasible under the PAC. Next,

$$C_{PAC} \le C_{TPC}(mP) \tag{19}$$

where $C_{TPC}(mP)$ is the capacity under the TPC with the total power $P_T = mP$, since any feasible $\mathbf{R}$ under the PAC, $r_{ii} \le P$, is also feasible under the TPC with $\operatorname{tr}\mathbf{R} \le mP$. Using (16), one obtains at high SNR $C_{TPC}(mP) \approx C(P\mathbf{I})$ and hence $C_{PAC} \approx C(P\mathbf{I})$ and (17) follow.

We are now in a position to establish the optimality of isotropic signaling under the joint (TPC + PAC) constraints at high SNR.

Proposition 4: Consider a full column-rank channel. Let $P^* = \min(P, P_T/m)$. Isotropic signaling is optimal in this channel under the joint constraints (TPC + PAC) in the high-SNR regime, i.e., when $P^* \gg \lambda_m^{-1}(\mathbf{W})$,

$$\mathbf{R}^* \approx P^*\mathbf{I}, \quad C \approx \ln|\mathbf{W}| + m \ln P^* \tag{20}$$

Proof: First, observe that $C \ge C(P^*\mathbf{I})$ since $\mathbf{R} = P^*\mathbf{I}$ is feasible under the joint constraints: $\operatorname{tr}\mathbf{R} \le P_T$ and $r_{ii} \le P$. Next,

$$C \le \min(C_{TPC}, C_{PAC}) \tag{21}$$

and, at high SNR, $C_{TPC} \approx C(P_T\mathbf{I}/m)$, $C_{PAC} \approx C(P\mathbf{I})$, and hence $C \approx C(P^*\mathbf{I})$, as desired. The inequality $P^* \gg \lambda_m^{-1}(\mathbf{W})$ comes from the approximation $\ln(1 + x) \approx \ln x$, which holds if $x \gg 1$.

It is remarkable that, for any of the constraints considered here, isotropic signaling is optimal at high SNR. This simplifies the system design significantly as no feedback and no elaborate precoding are necessary for this signaling strategy. This also complements the respective result in [10] obtained for the right-unitary-invariant fading channel.

V. LOW-SNR REGIME

In this section, we consider the behaviour of optimal covariance in the low-SNR regime, namely, when

$$\min(mP, P_T) \ll \lambda_1^{-1}(\mathbf{W}) \tag{22}$$

It is well-known that, for the standard WF solution (under the TPC only), the optimal signaling is beamforming (rank-1) at low SNR, $\mathbf{R}^*_{WF} \approx P_T \mathbf{u}_1 \mathbf{u}_1^+$, where $\mathbf{u}_1$ is the eigenvector of $\mathbf{W}$ corresponding to its largest eigenvalue. As the following example shows, this does not necessarily hold under the joint constraints.

Example: Let $P_T = 1.5 \cdot 10^{-2}$, $P = 10^{-2}$, and $\mathbf{W} = \mathrm{diag}\{2, 1\}$. It is straightforward to see that the optimal covariance is $\mathbf{R}^* = 10^{-2} \cdot \mathrm{diag}\{1, 0.5\}$ in this case, i.e., it is of full rank and beamforming is not optimal, no matter how low the SNR is. If, however, the per-antenna constraint power is increased to $P \ge 1.5 \cdot 10^{-2}$, all PACs become inactive and beamforming is optimal: $\mathbf{R}^* = 10^{-2} \cdot \mathrm{diag}\{1.5, 0\}$.

Hence, we conclude that it is the interplay between the TPC and the PAC that makes a significant difference at low SNR while having negligible impact at high SNR: while the optimal
signaling under the TPC, the PAC and the joint constraints are
CPAC ≥ C (P I ) (18) all isotropic at high SNR, they are quite different at low SNR.
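The low-SNR example can be checked numerically. The following is a minimal sketch, assuming that the search may be restricted to diagonal covariances because $W$ is diagonal in this example (by the Hadamard inequality, off-diagonal entries cannot increase $\ln|I + WR|$ here). The function names and the grid resolution are hypothetical choices made for illustration and are not part of the letter.

```python
import math

def rate(w, r):
    """ln|I + W R| for diagonal W = diag(w) and diagonal R = diag(r)."""
    return sum(math.log(1.0 + wi * ri) for wi, ri in zip(w, r))

def best_diagonal_covariance(w, p_total, p_antenna, steps=3000):
    """Grid search over r11 for a 2x2 diagonal channel under TPC and PAC.

    Assumption: R is diagonal. For a fixed r11 the rate is increasing
    in r22, so r22 is set to its largest feasible value.
    """
    best = (-1.0, 0.0, 0.0)
    r11_max = min(p_antenna, p_total)
    for k in range(steps + 1):
        r11 = r11_max * k / steps
        r22 = max(0.0, min(p_antenna, p_total - r11))
        c = rate(w, (r11, r22))
        if c > best[0]:
            best = (c, r11, r22)
    return best

# Example from the text: PT = 1.5e-2 and W = diag{2, 1}
_, a1, a2 = best_diagonal_covariance((2.0, 1.0), 1.5e-2, 1.0e-2)  # P = 1e-2
_, b1, b2 = best_diagonal_covariance((2.0, 1.0), 1.5e-2, 1.5e-2)  # P = 1.5e-2
print(a1, a2)  # diagonal of R* with active PAC on the first antenna
print(b1, b2)  # diagonal of R* once the PACs are inactive
```

With the tighter per-antenna constraint the search settles on a full-rank diagonal covariance, and with the relaxed one it puts all power on the stronger eigenmode, in line with the example.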
LOYKA: ON CAPACITY OF GAUSSIAN MIMO CHANNELS UNDER JOINT POWER CONSTRAINTS 335
VI. PROPERTIES OF OPTIMAL COVARIANCE
As it was shown in the previous sections, optimal signaling under the joint constraints can be significantly different from that under the TPC only. In the following, we point out additional significant differences.
1. The TPC can be inactive.
2. An optimal covariance is not necessarily unique (if $W$ is rank-deficient).
3. An optimal covariance can be of full rank even when the channel is not.
4. Optimal signaling is not on the eigenmodes of $W$, unless it is diagonal or all PACs are inactive, and the capacity depends not only on its eigenvalues, but also on its eigenvectors.
These unusual properties should be contrasted with those under the TPC only, where (i) the TPC is always active (unless $C = 0$, a trivial case not considered here), (ii) the optimal covariance is always unique, (iii) the optimal covariance is rank-deficient in a rank-deficient channel, and (iv) optimal signaling is on the channel eigenmodes and the capacity is independent of channel eigenvectors.
While Property 4 follows from Theorem 1, the following example illustrates Properties 1-3.
Example: Let $W = \mathrm{diag}\{1, 0\}$, $P_T = 2$, $P = 1$. It is straightforward to see that
$$R^* = \mathrm{diag}\{1, a\}, \quad 0 \le a \le 1 \tag{23}$$
so that (i) $R^*$ is not unique, (ii) it is of full rank when $a > 0$, even though the channel is not, and (iii) the TPC is inactive if $a < 1$ (so that minimizing $a$ will minimize the total Tx power). Note however that if the channel is enhanced to a full-rank one, $W = \mathrm{diag}\{1, b\}$, $b > 0$, then $R^* = I$ and all unusual properties disappear.

APPENDIX
PROOF OF THEOREM 1
Since the problem in (2) is convex and Slater's condition holds (as long as $P, P_T > 0$), its KKT conditions are both sufficient and necessary for optimality [12]. The KKT conditions for this problem are as follows:
$$-(I + WR)^{-1}W - M + \lambda I + \Lambda = 0 \tag{24}$$
$$MR = 0, \quad \lambda(\mathrm{tr}R - P_T) = 0, \quad \lambda_i(r_{ii} - P) = 0 \tag{25}$$
$$\mathrm{tr}R \le P_T, \quad r_{ii} \le P, \quad R \ge 0 \tag{26}$$
$$M \ge 0, \quad \lambda \ge 0, \quad \lambda_i \ge 0 \tag{27}$$
where $\lambda$, $\lambda_i$ are Lagrange multipliers (dual variables) responsible for the TPC and PAC, $M$ is the (matrix) Lagrange multiplier responsible for $R \ge 0$, and $\Lambda = \mathrm{diag}\{\lambda_i\}$. The key difficulty in solving these conditions analytically is that they are a system of non-linear matrix equalities and inequalities, and the PACs make the feasible set $S_R$ non-isotropic so that standard tools (e.g., Hadamard inequality) cannot be used. However, when $R$ is of full rank, the stationarity condition simplifies to
$$(R + W^{-1})^{-1} = \lambda I + \Lambda \tag{28}$$
since $M = 0$ (from $MR = 0$), so that
$$R = (\lambda I + \Lambda)^{-1} - W^{-1} \tag{29}$$
where $\Lambda$ is determined from the PACs
$$r_{ii} = (\lambda + \lambda_i)^{-1} - (W^{-1})_{ii} \le P \tag{30}$$
and complementary slackness $\lambda_i(r_{ii} - P) = 0$ so that $\lambda_i > 0$ (active PAC) implies $r_{ii} = P$ and hence
$$\lambda_i = (P + (W^{-1})_{ii})^{-1} - \lambda > 0 \tag{31}$$
Combining this with the case of inactive PAC $\lambda_i = 0$, one obtains $r_{ii} = \lambda^{-1} - (W^{-1})_{ii} \le P$ and hence
$$\lambda_i = ((P + (W^{-1})_{ii})^{-1} - \lambda)_{+} \ge 0 \tag{32}$$
where $(x)_{+} = \max(x, 0)$. It follows from (28) that off-diagonal parts of $R$ and $W^{-1}$ are the opposite of each other: $\bar{D}(R) = -\bar{D}(W^{-1})$ and, from (30)-(32), that
$$r_{ii} = \min(P, \lambda^{-1} - (W^{-1})_{ii}) > 0 \tag{33}$$
from which (7) follows. Equation (8) is a straightforward manipulation of (7). Equation (9) follows from the TPC $\sum_i r_{ii} = P_T$ and (10) follows from $P \le \lambda^{-1} - (W^{-1})_{ii}$ for all active PACs.
It remains to show that $R^* > 0$. To this end, observe the following:
$$R^* = \min(PI, \lambda^{-1}I - D(W^{-1})) - \bar{D}(W^{-1}) > \min(\lambda_w^{-1}I, \lambda_w^{-1}I - D(W^{-1})) - \bar{D}(W^{-1}) = \lambda_w^{-1}I - W^{-1} \ge 0 \tag{34}$$
where $\lambda_w = \lambda_m(W)$. The last inequality follows from $\lambda_w I \le W$, while the 1st inequality follows from $P > \lambda_w^{-1}$ and $\lambda^{-1} > \lambda_w^{-1}$, where the latter inequality follows from (6) and (10) used in (9). This also implies that $r_{ii} > 0$ in (33). The uniqueness of $R^*$ is due to the strict concavity of $\ln|I + WR|$ when $W > 0$.

REFERENCES
[1] M. Shafi et al., “5G: A tutorial overview of standards, trials, challenges, deployment, and practice,” IEEE J. Sel. Areas Commun., vol. 35, no. 6, pp. 1201–1221, Jun. 2017.
[2] B. S. Tsybakov, “Capacity of vector Gaussian memoryless channel,” Prob. Inf. Transm., vol. 1, no. 1, pp. 26–40, 1965.
[3] I. E. Telatar, “Capacity of multi-antenna Gaussian channels,” Eur. Trans. Telecommun., vol. 10, no. 6, pp. 1–28, Dec. 1999.
[4] W. Yu and T. Lan, “Transmitter optimization for the multi-antenna downlink with per-antenna power constraint,” IEEE Trans. Signal Process., vol. 55, no. 6, pp. 2646–2660, Jun. 2007.
[5] A. Wiesel, Y. C. Eldar, and S. Shamai, “Zero-forcing precoding and generalized inverses,” IEEE Trans. Signal Process., vol. 56, no. 9, pp. 4409–4418, Sep. 2008.
[6] S. Shi, M. Schubert, and H. Boche, “Per-antenna power constrained rate optimization for multiuser MIMO systems,” in Proc. Int. ITG Workshop Smart Antennas, Vienna, Austria, 2008, pp. 270–277.
[7] M. Vu, “MISO capacity with per-antenna power constraint,” IEEE Trans. Commun., vol. 59, no. 5, pp. 1268–1274, May 2011.
[8] M. Vu, “MIMO capacity with per-antenna power constraint,” in Proc. IEEE Globecom, Kathmandu, Nepal, Dec. 2011, pp. 1–5.
[9] D. Tuninetti, “On the capacity of the AWGN MIMO channel under per-antenna power constraints,” in Proc. ICC, Sydney, NSW, Australia, Jun. 2014, pp. 2153–2157.
[10] S. Loyka, “The capacity of Gaussian MIMO channels under total and per-antenna power constraints,” IEEE Trans. Commun., vol. 65, no. 3, pp. 1035–1043, Mar. 2017.
[11] P. L. Cao and T. J. Oechtering, “Optimal transmit strategy for MIMO channels with joint sum and per-antenna power constraints,” in Proc. ICASSP, New Orleans, LA, USA, Mar. 2017, pp. 3569–3573.
[12] S. Boyd and L. Vandenberghe, Convex Optimization. Cambridge, U.K.: Cambridge Univ. Press, 2004.
Secure Transmission With Interleaver for Uplink Sparse Code Multiple Access System
Ke Lai, Lei Wen, Jing Lei, Gaojie Chen, Senior Member, IEEE, Pei Xiao, Senior Member, IEEE, and Amine Maaref, Senior Member, IEEE

Abstract: Sparse code multiple access (SCMA) is a promising air interface candidate technique for next generation mobile networks. By introducing the Tent map from Chaos theory, we propose in this letter a novel physical layer transmission scheme with codeword level interleaving at the transmitter, which is termed interleaver-based SCMA (I-SCMA). Simulation results and analysis show that I-SCMA can provide high security performance without any loss in performance and transmission rate, and thus constitutes a viable solution for the next generation wireless networks to provide secure communications.
Index Terms: 5G, SCMA, secure transmission, interleaving.

Manuscript received August 10, 2018; accepted September 3, 2018. Date of publication September 13, 2018; date of current version April 9, 2019. This work was supported in part by the National Natural Science Foundation of China under Grant 61502518, Grant 61372098, and Grant 61702536, in part by the National Defense Technology Foundation under Grant 3101168, in part by the Hunan Natural Science Foundation under Grant 2017JJ2303, and in part by the Natural Science Research Project of National University of Defense Technology Research of New Multiple Access Techniques Based on Joint Sparse Graph. The associate editor coordinating the review of this paper and approving it for publication was R. Wang. (Corresponding author: Lei Wen.)
K. Lai and J. Lei are with the Department of Communication Engineering, College of Electronic Science and Engineering, National University of Defense Technology, Changsha 410072, China.
L. Wen is with the Department of Communication Engineering, College of Electronic Science and Engineering, National University of Defense Technology, Changsha 410072, China, and also with the Institute for Communication Systems, Home of the 5G Innovation Centre, University of Surrey, Guildford GU2 7XH, U.K. (e-mail: newton1108@126.com).
G. Chen is with the Department of Engineering, University of Leicester, Leicester LE1 7RH, U.K. (e-mail: gaojie.chen@leicester.ac.uk).
P. Xiao is with the Institute for Communication Systems, Home of the 5G Innovation Centre, University of Surrey, Guildford GU2 7XH, U.K. (e-mail: p.xiao@surrey.ac.uk).
A. Maaref is with Huawei Technologies Canada Co. Ltd., Ottawa, ON K2K 3J1, Canada (e-mail: amine.maaref@huawei.com).
Digital Object Identifier 10.1109/LWC.2018.2869792

I. INTRODUCTION
SCMA [1] is a code domain non-orthogonal multiple access (NOMA) scheme that is considered to be a promising 5G candidate due to its excellent ability to support massive quantities of users under heavily loaded conditions.
In the possible application scenarios for SCMA, such as massive machine type communication (mMTC), millions of nodes must be accessed. Since ubiquitous mobile devices are required to access the Internet of Things (IoT) in mMTC, an unprecedented amount of private and sensitive data is transmitted over wireless channels in an SCMA network. From this perspective, research on the secrecy issue of SCMA is of significant importance. Moreover, owing to requirements such as ultra low latency and low power consumption for the IoT, it is also challenging to ensure security in such a network.
Physical layer security (PLS), which is regarded as a promising supplement to cryptographic techniques, has been widely studied in recent years [2]–[5]. The first NOMA scheme was defined on the power domain, and its secrecy outage probability was derived in [6]. In [7], a secure transmission scheme that maximizes the minimum confidential information rate among users was proposed. In [8], we proposed a secure transmission scheme for the downlink SCMA system with extra phase rotations in the design of the codebook, called randomized constellation rotation based SCMA (RCR-SCMA). However, in that work, extra uplink and downlink communications are necessary, and the randomized codebooks are not fully optimized; thus secure communication is achieved with a satisfying transmission rate but with possible performance loss.
In this letter, we mainly focus on increasing the computational complexity for the eavesdropper to recover the transmitted data, thus achieving security for the system. Consequently, a novel physical layer secure transmission scheme with a chaotic map and codeword level interleaving, which is denoted as I-SCMA, is proposed. To ensure security, interleavers that are constructed according to the channel phase in association with the Tent map from Chaos theory are introduced. The basic idea of I-SCMA is to amplify the randomness of the limited information that can be extracted from the channel state information (CSI), thus leading to asymmetric knowledge of the interleavers between the legitimate users (LUs) and the eavesdropper, since the interleavers disperse the order of codewords for each LU. From this perspective, the proposed scheme can be regarded as a physical layer encryption, which is a combination of conventional ciphers and physical layer security. Therefore, the security of I-SCMA is guaranteed by the unacceptable computational complexity at the eavesdropper side.
The rest of this letter is organized as follows. Section II describes the system model of I-SCMA. In Section III, the proposed I-SCMA transmission scheme is discussed. Numerical results and conclusion are presented in Sections IV and V, respectively.

II. SECURE TRANSMISSION WITH INTERLEAVER
The system model of I-SCMA along with the construction of the interleaver are presented in this section.

A. Interleaver Based SCMA
As shown in Fig. 1, we consider the uplink transmissions where J single-antenna LUs transmit signals to the same base station (BS) in the presence of an eavesdropper. We assume that the eavesdropper is an external node, and the BS can
LAI et al.: SECURE TRANSMISSION WITH INTERLEAVER FOR UPLINK SCMA SYSTEM 337
distinguish it from LUs, which can be simply realized via authentication in practice. As for the SCMA transmitter, each function element is allocated to $d_f$ users, and each user occupies $d_u$ function elements. The encoder of I-SCMA is the same as that of conventional SCMA (C-SCMA), which is defined by a codebook $X_j$ that maps $\log_2(C)$ binary bits to a K-dimensional complex codeword $x_j$ selected from the dedicated codebook $X_j$ corresponding to user j, where $|X_j| = C$, and C is the size of the constellation.

Fig. 1. Illustration of I-SCMA system with an external eavesdropper.

At any transmission time, the received signal between a single LU and the BS can be expressed as
$$y_r(t) = h_r(t)x(t) + n(t), \tag{1}$$
where $x(t)$ is the transmitted signal, $h_r(t)$ is the channel gain and $n(t)$ is the additive white Gaussian noise. Without loss of generality, $h_r(t)$ can be modeled as a complex Gaussian random variable, and its polar form is $h_r(t) = |h_r(t)|e^{j\phi_r(t)}$, where $\phi_r(t) \in [0, 2\pi]$. At each transmission time instant, the BS and each LU perform channel estimation in the same coherent interval and thus they can obtain the same $\phi_r$ from the CSI.
In I-SCMA, $b_j$ and $\hat{b}_j$ denote the coded information bits and candidate decoded bits of user j, respectively, where the code length is N. After encoding by the SCMA codebook mapper, there are $N/\log_2(C)$ codewords for each user. Subsequently, the codewords are conveyed to the interleaver $\pi_j$ to scramble the sequence of codewords. Note that each user utilizes a different interleaving pattern and each data block can be encrypted with a different $\pi_j$ at different transmission times so that the security can be enhanced. The construction of the interleaver will be demonstrated in the next subsection. After the transmitted chips are received, a de-interleaver is utilized before the forward error correction (FEC) decoder. As the BS estimates the channel in the same interval as each user, it can recover the interleaving patterns utilized by the LUs and further attain the original transmitted data. It should be noted that, as a codeword level interleaver is applied and the detection at the receiver is codeword by codeword, the received signals can be directly detected. Subsequently, the de-interleaver should be applied to the detected bits of each user since the codewords are scrambled.
As can be observed from Fig. 1, the security of I-SCMA is based on the channel independence; moreover, the channel independence is amplified via the Tent map for the eavesdropper while the BS can recover the data since the CSI is assumed to be unchanged within the coherent interval.

B. Interleaver Construction
The key principle of I-SCMA is that $\pi_j$ is derived from the CSI of each LU to the BS; hence, $\pi_j$ ($j \in 1, \ldots, J$) are unique for different users. It should be noted that the interleaver in this letter has a different usage from the ones of Turbo codes or interleave-division multiple-access (IDMA): it is utilized to disperse the order of SCMA codewords and thus make the detected bits unpredictable for the eavesdropper. As such, the eavesdropper cannot recover the original messages without the knowledge of the interleaver, and therefore the security can be ensured. As the channels are assumed to be independent, and the Tent map can generate quasi-random sequences that are irreversible, $\pi_j$ are independent and random. Consequently, the transmitted data can be fully scrambled by the interleaver, i.e., the eavesdropper is unable to obtain any information even though it can receive the signals transmitted by LUs, since the sequence of the data is different from the original one. It is obvious that the key of I-SCMA is the construction of $\pi_j$, which requires sufficient randomness subject to the very limited information that can be used in $\phi_r$.
To construct an interleaver that can satisfy our demands, we employ a Tent map from Chaos theory [9], which can generate a random sequence with a few initial parameters. The original Tent map is expressed as follows:
$$f_\mu := \mu \min\{x, 1 - x\} \tag{2}$$
For the values of the parameter $\mu \in [0, 2]$, $f_\mu$ maps the unit interval [0, 1] into itself, thus defining a recurrence relation. In particular, iterating a given point $x_0$ in [0, 1] gives rise to a sequence $x_n$:
$$x_{n+1} = f_\mu(x_n) = \begin{cases} \mu x_n & \text{for } x_n < \frac{1}{2} \\ \mu(1 - x_n) & \text{for } x_n \ge \frac{1}{2} \end{cases} \tag{3}$$
Therefore, once the initial parameters $\mu$ and $x_0$ are given, a random sequence taking values that range from 0 to 1 can be obtained. This is mainly because, for initial parameters within the given range, the Tent map defines a dynamical system whose output changes from predictable to chaotic. It should be noted that such a Tent map is irreversible, i.e., the original input cannot be obtained even if the generated sequence is known to the eavesdropper.
The range of $\phi_r$ is $[0, 2\pi]$ while the ranges of $\mu$ and $x_0$ are [0, 2] and [0, 1], respectively; hence, a mapping from $\phi_r$ to $\mu$ and $x_0$ has to be constructed. Note that there exist a number of such transformations and they are performed locally and silently at the BS and each LU such that no signals will be radiated. Consequently, the eavesdropper cannot intercept f and g. However, due to the sensitivity of the Tent map to $\mu$ and $x_0$, i.e., a very minor variation of $\mu$ and $x_0$ can generate a totally different sequence, and for the robustness of I-SCMA, f and g should have the ability to enlarge the digits of $\phi_r$, which can further exploit the sensitivity of the Tent map.
For ease of implementation, we select two basic functions, which can be written as:
$$f: \phi_r \to \mu = |\sin\phi_r| + |\cos\phi_r| \tag{4}$$
and
$$g: \phi_r \to x_0 = \frac{1}{2}(\sin\phi_r + 1) \tag{5}$$
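The recursion in (3) and the mappings (4) and (5) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the sequence is taken to start at $x_1$ rather than $x_0$, and the permutation is obtained by sorting the chaotic sequence and relabeling the blocks by the sorted order, following the interleaver construction procedure (Algorithm 1) of this letter.

```python
import math

def phase_to_params(phi):
    """Map the channel phase to Tent-map parameters via (4) and (5)."""
    mu = abs(math.sin(phi)) + abs(math.cos(phi))  # (4): lies in [1, sqrt(2)]
    x0 = 0.5 * (math.sin(phi) + 1.0)              # (5): lies in [0, 1]
    return mu, x0

def tent_sequence(mu, x0, n):
    """Iterate the Tent map (3) n times and return x_1, ..., x_n."""
    seq, x = [], x0
    for _ in range(n):
        x = mu * x if x < 0.5 else mu * (1.0 - x)
        seq.append(x)
    return seq

def build_interleaver(phi, n_blocks):
    """Permutation obtained by sorting the chaotic sequence (assumed rule)."""
    mu, x0 = phase_to_params(phi)
    seq = tent_sequence(mu, x0, n_blocks)
    return sorted(range(n_blocks), key=lambda i: seq[i])

def interleave(blocks, perm):
    """Position pos of the output carries block perm[pos] of the input."""
    return [blocks[i] for i in perm]

def deinterleave(blocks, perm):
    """Inverse of interleave for the same permutation."""
    out = [None] * len(perm)
    for pos, i in enumerate(perm):
        out[i] = blocks[pos]
    return out

phi = 1.0                            # hypothetical channel phase in radians
perm = build_interleaver(phi, 128)   # 128 = N / log2(C) for N = 256, C = 4
data = list(range(100, 228))         # stand-in for 128 codeword blocks
restored = deinterleave(interleave(data, perm), perm)
```

Because the BS and the LU are assumed to obtain the same phase, both sides can rebuild the same permutation locally, which is why `deinterleave` needs no side information beyond `perm`.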
It is obvious that (4) and (5) can map $\phi_r$ to the range of the initial parameters in (2). After $\mu$ and $x_0$ are obtained, substituting them into (2) and (3) yields a sequence with a certain length (equal to the number of symbols of each user). Therefore, each value of the generated sequence corresponds to an index of the transmitted codewords. For simplicity, the construction of the interleaver is given in Algorithm 1:

Algorithm 1: Interleaver Construction of I-SCMA
Output: Interleavers $\pi_j$
1 foreach transmission time do
2   LUs and BS start the channel estimation process;
3   LUs and BS obtain the channel phase $\phi_r(t)$ from the CSI;
4   Each LU uses its $\phi_r(t)$ to calculate the initial parameters of the Tent map according to (4) and (5);
5   Each LU uses the $\mu$ and $x_0$ generated in the last step to generate a Chaos sequence according to (3);
6   Label each block from 1 to $N/\log_2 C$;
7   Sort the generated Chaos sequence and rearrange the label of each block according to the sorted order;
8   Each user generates its own interleaver $\pi_j$.

From the discussion above, inaccurate estimation will result in catastrophic results. However, since the digits of the channel phase can be enlarged by f and g, the required accuracy of the extracted channel phase can be decreased. Furthermore, a number of secret key negotiation and correction techniques have been reported [10].

III. ANALYSIS AND DISCUSSION
A. Decryption Complexity of I-SCMA
The decryption complexity for an eavesdropper is a vital secrecy performance metric. From the previous discussion, if the eavesdroppers intend to intercept the data by guessing the interleavers of each user, then the complexity equals $J \cdot (N/\log_2(C))!$. Note that $\log_2(C)$ typically equals 2 or 4. Consequently, the search space S can be approximated as:
$$S = J \cdot (N/\log_2(C))! \approx 10^{n(\ln n - 1)}, \tag{6}$$
where $n = N/\log_2(C)$. As can be observed from (6), even if the code length N is moderate, the computational complexity is certainly unaffordable for the eavesdroppers.
In contrast to a random attacker that enumerates the interleavers by brute force, another attacker model considered in this letter, called the intelligent attacker [11], is able to estimate the CSI with a certain accuracy. However, such attackers still encounter the following difficulties: (i) According to the features of the Tent map in Chaos theory, the generated sequences are very sensitive to the initial parameters $\mu$ and $x_0$. As reported in [9], a variation of 10 digits after the decimal point can even generate totally different sequences. Therefore, the intelligent attacker should have the ability to approximate the CSI with extremely high accuracy. (ii) The mappings from the channel phase $\phi_r$ to $\mu$ and $x_0$ are diverse, and the mappings f and g are unknown to the eavesdroppers; hence, it is impossible for the eavesdropper to intercept the initial parameters that are used to generate the interleavers even though it can guess the channel phase with a low error margin. (iii) The interleavers can be changed over a period of time as the CSI is time-varying, which can further enhance the security of I-SCMA.
Therefore, the eavesdropper cannot recover the interleavers to intercept the transmitted data as the BS does, even if the CSI can be estimated with a certain accuracy since the pilots are broadcast.
To demonstrate the advantages of I-SCMA, we also compare the decryption complexity of RCR-SCMA and C-SCMA with I-SCMA by using brute force searching. In Table I, the frame length equals the code length N, C is the complexity of the MPA to detect a data block, and the step size of searching for a correct codebook is also taken into account. As reported in [8], the security cannot be enhanced by further reducing this step size, and thus I-SCMA can achieve a better secrecy performance than C-SCMA and RCR-SCMA in terms of decryption complexity.

TABLE I
SEARCHING SPACE COMPARISON PER FRAME

B. Entropy Analysis of Post-Processing Attack
Aiming at evaluating the performance of the post-processing attacker (PPA) [11], we analyze the entropy of the received signal at the PPA side ($H_{PPA}$), and compare it to the entropy of the transmitted messages ($H_T$).
We assume that each user is independent as the codewords are interleaved to be uncorrelated. According to the definition of entropy, for a message with N bits, $H_T$ equals N. Considering the various sizes of codebook in C-SCMA, a PPA has to judge the codebook size first; hence, the calculation of the entropy is:
$$H_{PPA} = -\frac{1}{K}\sum_{k=1}^{K}\frac{N}{\log_2 C_k}\sum_{i=1}^{I_k}\frac{1}{K I_k}\log_2\frac{1}{K I_k} \tag{7}$$
where K is the number of possible codebook sizes that the LUs can use; $C_k$ is the size of the kth candidate codebook; $I_k$ is the number of all possible interleavers. For simplicity and considering the worst scenario, we assume that the codebook size is known to the eavesdropper; then (7) degenerates to:
$$H_{PPA} = -\frac{N}{\log_2 C}\sum_{I_k}\frac{1}{I_k}\log_2\frac{1}{I_k} = \frac{N}{\log_2 C}\cdot\log_2(I_k) \tag{8}$$
By applying Stirling's approximation, (8) can be further written as:
$$H_{PPA} \approx \left(\frac{N}{\log_2 C}\right)^{2}\cdot\log_2\left(\frac{\sqrt{2\pi}}{e}\left(\frac{N}{\log_2 C}\right)^{\frac{3}{2}}\right) \tag{9}$$
Note that $\frac{\sqrt{2\pi}}{e} \approx 1$, therefore,
$$H_{PPA} \approx \frac{3}{2}\left(\frac{N}{\log_2 C}\right)^{2}\cdot\log_2\frac{N}{\log_2 C} \tag{10}$$
In general, $N \gg C$, thus from (10), $H_{PPA} \gg H_T$, which indicates that the entropy of the secret keys is larger than that of the plaintext; hence, the proposed I-SCMA can reach perfect secrecy for the PPA.
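The magnitudes involved in (6) and (8) can be evaluated directly. The sketch below is illustrative only: it assumes that $I_k = (N/\log_2 C)!$, i.e., that every permutation of the codeword blocks is an admissible and equiprobable interleaver, and the function names are hypothetical. The log-gamma function is used so that the factorial never has to be formed explicitly.

```python
import math

def blocks_per_frame(code_len, const_size):
    """n = N / log2(C), the number of codeword blocks per user."""
    return code_len // int(math.log2(const_size))

def log2_factorial(n):
    """log2(n!) via the log-gamma function."""
    return math.lgamma(n + 1) / math.log(2.0)

def search_space_log10(n_users, code_len, const_size):
    """log10 of S = J * (N / log2 C)!, the brute-force search space in (6)."""
    n = blocks_per_frame(code_len, const_size)
    return math.log10(n_users) + math.lgamma(n + 1) / math.log(10.0)

def entropy_ppa(code_len, const_size):
    """H_PPA from (8), assuming I_k = (N / log2 C)! interleavers."""
    n = blocks_per_frame(code_len, const_size)
    return n * log2_factorial(n)

N, C, J = 256, 4, 6                   # values used in the simulations
h_ppa = entropy_ppa(N, C)             # entropy at the PPA side, in bits
h_t = float(N)                        # entropy of an N-bit message
s_log10 = search_space_log10(J, N, C)
print(h_ppa > h_t, s_log10)
```

Even for this moderate code length the search space has more than two hundred decimal digits, which is the quantitative content behind the claim that brute-force enumeration is unaffordable.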
IV. SIMULATION RESULTS AND DISCUSSION
The simulation results and security analysis of I-SCMA are discussed in this section.
As noted in [8] and [12], the error rate performance can be used to assess the secrecy performance of a system; thus we evaluate the error rate performance from the perspective of the LUs and the eavesdropper, respectively, to demonstrate the validity of the proposed scheme.
Average error rate comparisons of each user for 4-point and 16-point I-SCMA and C-SCMA are shown in Fig. 2 (simulation parameters $K = 4$; $J = 6$; $d_f = 3$; $d_u = 2$; $\max(N_{iter}) = 6$; $\lambda = 150\%$; $N = 256$ and code rate $r = 0.5$). As can be seen from the figure, the performance of I-SCMA is slightly better than that of C-SCMA, especially in the high SNR region, which indicates that I-SCMA will not suffer from performance loss compared to the existing secure transmission for SCMA in [8]. This is mainly because the interleavers disperse the coded sequences so that the adjacent blocks are approximately uncorrelated. Furthermore, it is clear that eavesdroppers cannot obtain any information as they cannot estimate $\phi_r$ with very high accuracy and the mappers are unknown to them. Note that the symbol error rate (SER) of I-SCMA for eavesdroppers approximates 0.75 and 0.94 for 4-point and 16-point I-SCMA, respectively, which follows from the fact that each symbol can be wrongly detected with probability:
$$\Pr\{x_i \ne x\} = \frac{C - 1}{C} \tag{11}$$

Fig. 2. Error rate performance comparison of legitimate users and an eavesdropper for I-SCMA.

In Table II, the error rate performance of LUs under the condition that the eavesdropper can approximate the CSI with a low error margin is presented. We assume that the estimated $\phi_r$ of the eavesdropper follows the distribution $\tilde{\phi}_r \sim N(\phi_r, \sigma^2)$. As can be observed from the table, although the eavesdropper can estimate the CSI with a certain accuracy, the BER and SER of the eavesdropper are still too high to detect the correct transmitted information, which follows from the fact that the Tent map is very sensitive to the input value. Note that the results of 16-point I-SCMA are similar to those of 4-point I-SCMA.

TABLE II
BER PERFORMANCE OF EAVESDROPPER UNDER CERTAIN ESTIMATED ACCURACY $\sigma^2 = 0.04$

By combining the scheme with higher-layer cryptography, the eavesdropper would have to guess both the key and the random interleavers introduced by the CSI, which leads to a significant increase in the search space when performing cryptanalysis. The proposed I-SCMA is a viable solution for secure transmissions in an SCMA network.

V. CONCLUSION
In this letter, we propose a novel secure transmission scheme for the uplink SCMA system, which is called I-SCMA. As indicated by the simulation results and analysis, I-SCMA can achieve a good secrecy performance without any performance loss. Furthermore, I-SCMA does not impose any overhead in terms of extra uplink and downlink communications compared to the existing physical layer security transmission schemes. In conclusion, our scheme serves as a valuable supplement to conventional cryptographic technologies.

REFERENCES
[1] H. Nikopour and H. Baligh, “Sparse code multiple access,” in Proc. IEEE 24th Int. Symp. Pers. Indoor Mobile Radio Commun. (PIMRC), London, U.K., Sep. 2013, pp. 332–336.
[2] J. Qiao, H. Zhang, X. Zhou, and D. Yuan, “Joint beamforming and time switching design for secrecy rate maximization in wireless-powered FD relay systems,” IEEE Trans. Veh. Technol., vol. 67, no. 1, pp. 567–579, Jan. 2018.
[3] J. Qiao, H. Zhang, F. Zhao, and D. Yuan, “Secure transmission and self-energy recycling with partial eavesdropper CSI,” IEEE J. Sel. Areas Commun., to be published, doi: 10.1109/JSAC.2018.2825541.
[4] G. Chen, Y. Gong, P. Xiao, and J. A. Chambers, “Dual antenna selection in secure cognitive radio networks,” IEEE Trans. Veh. Technol., vol. 65, no. 10, pp. 7993–8002, Oct. 2016.
[5] G. Chen, J. Coon, and M. D. Renzo, “Secrecy outage analysis for downlink transmissions in the presence of randomly located eavesdroppers,” IEEE Trans. Inf. Forensics Security, vol. 12, no. 5, pp. 1195–1206, May 2017.
[6] Y. Liu, Z. Qin, M. Elkashlan, Y. Gao, and L. Hanzo, “Enhancing the physical layer security of non-orthogonal multiple access in large-scale networks,” IEEE Trans. Wireless Commun., vol. 16, no. 3, pp. 1656–1672, Mar. 2017.
[7] B. He, A. Liu, N. Yang, and V. K. N. Lau, “On the design of secure non-orthogonal multiple access systems,” IEEE J. Sel. Areas Commun., vol. 35, no. 10, pp. 2196–2206, Oct. 2017.
[8] K. Lai et al., “Secure transmission with randomized constellation rotation for downlink sparse code multiple access system,” IEEE Access, vol. 6, pp. 5049–5063, 2018.
[9] P. Collet and J. P. Eckmann, Iterated Maps on the Interval As Dynamical Systems. Boston, MA, USA: Birkhäuser, 1980.
[10] Y. Liu, H.-H. Chen, and L. Wang, “Physical layer security for next generation wireless networks: Theories, technologies, and challenges,” IEEE Commun. Surveys Tuts., vol. 19, no. 1, pp. 347–376, 1st Quart., 2017.
[11] S. Althunibat, V. Sucasas, and J. Rodriguez, “A physical-layer security scheme by phase-based adaptive modulation,” IEEE Trans. Veh. Technol., vol. 66, no. 11, pp. 9931–9942, Aug. 2017.
[12] I. M. Kim, B.-H. Kim, and J. K. Ahn, “BER-based physical layer security with finite codelength: Combining strong converse and error amplification,” IEEE Trans. Commun., vol. 64, no. 9, pp. 3844–3857, Sep. 2016.
Hybrid Modulation Scheme Combining PPM With Differential

Chaos Shift Keying Modulation
Meiyuan Miao, Lin Wang , Senior Member, IEEE, Marcos Katz, Member, IEEE,
and Weikai Xu , Member, IEEE
Abstract—In conventional M-ary differential chaos shift keying (DCSK) modulation systems, the distance between constellation points shrinks as M increases, resulting in poor performance. A hybrid modulation scheme based on pulse position modulation (PPM) and DCSK is proposed in this letter to improve bit-error-rate (BER) performance. In this scheme, one part of the bits is modulated by PPM while the other part is modulated by DCSK. Thus, information-bearing signals are simultaneously modulated by the information bit and the selected pulse position of PPM, which is determined by extra information bits. The analytical BER performance of the proposed scheme is derived and verified by simulations. Results show that the considered scheme outperforms conventional M-DCSK, code index modulation DCSK, and commutation code index DCSK in additive white Gaussian noise and multipath Rayleigh fading channels.

Index Terms—Differential chaos shift keying modulation (DCSK), pulse position modulation (PPM), hybrid modulation, bit error rate (BER).

Manuscript received August 29, 2018; accepted September 10, 2018. Date of publication September 19, 2018; date of current version April 9, 2019. This work was supported by the National Natural Science Foundation of China under Grant 61671395. The associate editor coordinating the review of this paper and approving it for publication was J. Coon. (Corresponding author: Lin Wang.)
M. Miao, L. Wang, and W. Xu are with the Department of Communication Engineering, Xiamen University, Xiamen 361005, China (e-mail: meiyuanmiao@foxmail.com; wanglin@xmu.edu.cn; xweikai@xmu.edu.cn).
M. Katz is with the Centre for Wireless Communications, University of Oulu, 90014 Oulu, Finland (e-mail: marcos.katz@ee.oulu.fi).
Digital Object Identifier 10.1109/LWC.2018.2871137

I. INTRODUCTION

CHAOTIC communication has gained increasing attention as it can be used widely in spread spectrum communication systems due to its low power, low complexity, and excellent anti-fading capabilities [1], [2]. Differential chaos shift keying (DCSK), a modulation scheme proposed for chaotic communications [3], is characterized by its simple transceiver configuration and excellent performance over multipath fading channels [4], only requiring a simple non-coherent demodulator without channel estimators and equalization [5]. Some variants of DCSK, such as permutation index DCSK (PI-DCSK) [6] and code index modulation DCSK (CIM-DCSK, CCI-DCSK) [1], [7], which use index modulation [8], [9], have also been proposed. A multi-resolution M-ary DCSK with an M-ary phase-shift-keying (MPSK) constellation is proposed based on quadrature chaotic shift keying (QCSK) [10], which offers better BER performance by changing the structure of the constellation.

Due to its simplicity and high performance, pulse position modulation (PPM) is also widely used in spread spectrum communications. The idea of hybrid modulation using PPM was first proposed in optical communications [11]. A hybrid PPM-BPSK scheme for transmitted reference pulse cluster (TRPC) systems is proposed in [12] to improve performance. Similarly, DCSK can also be combined with PPM for better performance in spread spectrum communication systems.

To improve the BER performance of conventional M-ary DCSK, a hybrid PPM-DCSK modulation scheme is proposed in this letter. The main contributions of this letter are summarized as follows. Firstly, based on a hybrid two-dimensional modulation scheme, a mixed modulation scheme with PPM and DCSK is proposed, in which the PPM part carries $m_c$ bits. Secondly, the bit error rate (BER) expression for the considered system is obtained analytically and then validated by simulations. Results show that the proposed system has better BER performance than conventional M-DCSK, multidimensional CIM-DCSK and CCI-DCSK schemes.

This letter is organized as follows. Section II presents the system model of the PPM-DCSK system. The BER expressions of the proposed system are derived in Section III. Simulation results and discussions are presented in Section IV. Section V concludes this letter.

II. SYSTEM MODEL

The transmitter of the proposed system is shown in Fig. 1, where the total number of transmitted bits per symbol is $m_c + 1$ and the symbol duration is $(2^{m_c} + 1)R$. In the proposed system, $m_c$ bits are mapped into a PPM position. The transmitted signal $s_l$ can be expressed as
$$s_l = [\,\underbrace{c_x}_{\text{reference}},\ \underbrace{b_l\, s_{\mathrm{PPM}} \otimes c_x}_{\text{information-bearing}}\,] \tag{1}$$
where $c_x$ is an $R$-length chaotic signal, $b_l \in \{-1, 1\}$ is the information bit, and $\otimes$ in (1) is the Kronecker operator. In the proposed system, the information bit is transmitted on one position in the PPM frame. The position is determined by the mapping bits. The PPM signal is $s_{\mathrm{PPM}} = [0, 0, \ldots, 1_{a_l}, \ldots, 0]_{1 \times P}$ ($P = 2^{m_c}$), where $1_{a_l}$ indicates that the $a_l$th position of $s_{\mathrm{PPM}}$ is 1. $a_l$ is a position index modulation symbol which is converted from the mapping bits.

Assuming that the transmitted signal is corrupted by a multipath Rayleigh fading channel, the received signal can be written as
$$r_l = \sum_{l=1}^{L} \alpha_l\, \delta(t - \tau_l) \otimes s_l + n_l. \tag{2}$$
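The frame structure in (1) can be illustrated with a minimal sketch. The Chebyshev map used to generate the chaotic chips, the MSB-first mapping of bits to a 0-based position index, and the function names `chebyshev_chips` and `ppm_dcsk_frame` are assumptions made for illustration only; the letter does not specify them.

```python
def chebyshev_chips(R, x0=0.37):
    """Generate R chaotic chips with the map x -> 1 - 2x^2 (an assumed choice of map)."""
    x, out = x0, []
    for _ in range(R):
        x = 1.0 - 2.0 * x * x
        out.append(x)
    return out


def ppm_dcsk_frame(map_bits, info_bit, chaos):
    """Build one frame [c_x, b_l * (s_PPM kron c_x)] as in (1).

    map_bits: list of m_c >= 1 bits selecting the PPM position (assumed MSB first).
    info_bit: 0 or 1, mapped here to b_l = -1 or +1 (assumed convention).
    chaos:    list of R chaotic chips c_x.
    """
    m_c = len(map_bits)
    P = 2 ** m_c                       # number of PPM positions
    R = len(chaos)                     # chips per slot
    a = int("".join(str(bit) for bit in map_bits), 2)   # 0-based position index
    b = 1 if info_bit == 1 else -1
    frame = list(chaos)                # reference slot
    for pos in range(P):
        if pos == a:
            frame.extend(b * c for c in chaos)   # information-bearing slot
        else:
            frame.extend([0.0] * R)              # silent PPM slots
    return frame, a, b


if __name__ == "__main__":
    chips = chebyshev_chips(8)
    frame, a, b = ppm_dcsk_frame([1, 0], 0, chips)
    # Frame length is (2^m_c + 1) * R chips: one reference slot plus P PPM slots.
    print(len(frame), a, b)
```

With $m_c = 2$ and $R = 8$ the frame holds $(2^2 + 1) \cdot 8 = 40$ chips, of which only the reference slot and one PPM slot are non-zero.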
MIAO et al.: HYBRID MODULATION SCHEME COMBINING PPM WITH DCSK MODULATION 341
Fig. 1. Block diagram of PPM-DCSK transmitter.
Fig. 2. Block diagram of PPM-DCSK receiver.

In (2), $L$ is the number of paths, $\alpha_l$ and $\tau_l$ are the channel coefficient and the path delay of the $l$th path, respectively, and $\otimes$ denotes the convolution operator. Moreover, the channel coefficients are constant over each symbol duration and the maximum multipath delay is much shorter than $R$, i.e., $R \gg \tau_{l,\max}$. $n_l$ is additive white Gaussian noise (AWGN) with zero mean and variance $N_0/2$. The block diagram of the receiver is shown in Fig. 2. The receiver needs to detect not only the modulated bit from the DCSK signal but also the index position of PPM. That is, the received reference signal needs to be correlated with each $R$-length part of the received $s_{\mathrm{PPM}}$; thus the $a_l$th part of the information-bearing signal in the PPM signal can be written as
$$r_{\mathrm{inf}} = r_{a_l}, \quad 1 \le a_l \le P. \tag{3}$$
The decision variable $I_m$ of the $m$th branch when $m = a_l$ can be expressed as
$$I_m = \sum_{i=1}^{R} \left( \sum_{l=1}^{L} \alpha_l c_x + n_r \right) \left( \sum_{l=1}^{L} \alpha_l b_l c_x + n_{r-R} \right). \tag{4}$$
Similarly, when $m \neq a_l$, the decision variable of the $m$th branch can be expressed as
$$I_{m'} = \sum_{i=1}^{R} \left( \sum_{l=1}^{L} \alpha_l c_x + n_r \right) n_{r-R} = \sum_{i=1}^{R} \sum_{l=1}^{L} \alpha_l c_x n_{r-R} + \sum_{i=1}^{R} n_r n_{r-R}, \tag{5}$$
where the symbols $a_l$ and $b_l$ can be estimated respectively as
$$\hat{a}_l = \arg\max_{m=1,\ldots,P} \left( |I_m|, |I_{m'}| \right), \qquad \hat{b}_l = \operatorname{sign}(I_{\hat{a}_l}). \tag{6}$$
It can be seen from (6) that the branch with the largest magnitude is taken as the position of the information-bearing signal in the PPM frame, and $b_l$ is obtained from the sign of the corresponding correlator output. The bits mapped by position are then recovered by decimal-to-binary conversion. For $m_c$ mapped bits, the symbol durations of PPM-DCSK, CIM-DCSK and CCI-DCSK are $(2^{m_c} + 1)R$ [2], $(2^{m_c} + 1)R$ and $2^{n}R$ ($n \ge m_c$) [7], respectively. Thus they have similar bandwidth efficiency.

III. PERFORMANCE ANALYSIS OF PPM-DCSK

A. System BER Analysis
The total system BER is a function of the BER of the modulated bit, $P_{em}$, and the BER of the mapped bits, $P_{ecim}$, which in turn is a function of the PPM detection error probability $P_{ed}$. We set $m_c$ as the number of bits mapped to one PPM position index symbol. The symbol $a_l$ is mapped from $m_c$ bits, and each of the remaining $P - 1$ incorrect positions is equally likely to be detected. Thus, the expected number of bit errors can be expressed as
$$Q = \sum_{i=1}^{m_c} i \binom{m_c}{i} \frac{1}{P - 1}, \tag{7}$$
where $\frac{1}{P-1}$ is the probability of each of the $P - 1$ incorrect positions and $\binom{n}{m} = \frac{n!}{m!(n-m)!}$. Thus the BER of the mapped bits is calculated as
$$P_{ecim} = \frac{Q}{m_c} P_{ed}, \tag{8}$$
and the BER of the modulated bit is given by
$$P_{em} = P_e (1 - P_{ed}) + 0.5 P_{ed}, \tag{9}$$
where $P_e$ is the bit error probability of DCSK, and the modulated bit is in error with probability 50% when the PPM position is detected wrongly. Therefore, the error probability of the total system is expressed by
$$P_{sys} = \frac{m_c}{m_c + 1} P_{ecim} + \frac{1}{m_c + 1} P_{em}. \tag{10}$$

B. Derivation of $P_{ed}$
Assume that the modulating symbol $b_l = +1$ and $a_l = \hat{m}$ (which selects the position index for PPM) are transmitted. Then the means and variances of $I_m$ and $I_{m'}$ are, respectively,
$$\mu_1 = E\{I_m\} = \frac{\sum_{l=1}^{L} \alpha_l^2 E_s}{2}, \qquad \mu_2 = E\{I_{m'}\} = 0,$$
$$\sigma_1^2 = \mathrm{Var}\{I_m\} = \frac{\sum_{l=1}^{L} \alpha_l^2 E_s N_0}{2} + \frac{N_0^2 R}{4},$$
$$\sigma_2^2 = \mathrm{Var}\{I_{m'}\} = \frac{\sum_{l=1}^{L} \alpha_l^2 E_s N_0}{4} + \frac{N_0^2 R}{4} = E_s N_0 \underbrace{\left( \frac{\sum_{l=1}^{L} \alpha_l^2}{4} + \frac{R}{4 r_s} \right)}_{\lambda} \tag{11}$$
where $E_s = 2R\,E\{x^2\}$ is the symbol energy of PPM-DCSK, $r_s = \sum_{l=1}^{L} \alpha_l^2 E_s / N_0$ is the signal-to-noise ratio, $E\{\cdot\}$ is the expectation operator, and $\mathrm{Var}\{\cdot\}$ is the variance operator. The random variables $|I_m|$ and $|I_{m'}|$ then both follow the folded normal distribution, so the probability density function of $|I_m|$ and the cumulative distribution function of $|I_{m'}|$ are respectively
$$f_{|I_m|}(y) = \frac{1}{\sqrt{2\pi \sigma_{|I_m|}^2}} \left\{ e^{-\frac{(y - \mu_{|I_m|})^2}{2\sigma_{|I_m|}^2}} + e^{-\frac{(y + \mu_{|I_m|})^2}{2\sigma_{|I_m|}^2}} \right\}, \tag{12}$$
$$F_{|I_{m'}|}(y) = \operatorname{erf}\left( \frac{y}{\sqrt{2\sigma_2^2}} \right). \tag{13}$$

Fig. 3. Simulated and analytical BER of the PPM-DCSK system over AWGN and multipath Rayleigh fading channels with spreading factor SF = 64; $m_c$ is the number of PPM-modulated bits and $P = 2^{m_c}$.
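The way (7)–(10) combine the two error probabilities can be sketched in a few lines. This is an illustrative sketch only: the function names are hypothetical, and $P_e$ and $P_{ed}$ are taken as given inputs rather than computed from the channel.

```python
from math import comb


def expected_bit_errors(m_c):
    """Q in (7): expected number of wrong mapped bits when a wrong PPM position is chosen."""
    P = 2 ** m_c
    return sum(i * comb(m_c, i) for i in range(1, m_c + 1)) / (P - 1)


def system_ber(p_e, p_ed, m_c):
    """Total BER from (8)-(10), given the DCSK bit error probability p_e
    and the PPM position detection error probability p_ed."""
    p_ecim = expected_bit_errors(m_c) / m_c * p_ed     # (8): BER of the mapped bits
    p_em = p_e * (1.0 - p_ed) + 0.5 * p_ed             # (9): BER of the modulated bit
    return (m_c * p_ecim + p_em) / (m_c + 1)           # (10): weighted over m_c + 1 bits


if __name__ == "__main__":
    # Arbitrary illustrative inputs, not values from the letter.
    print(system_ber(p_e=1e-3, p_ed=1e-2, m_c=3))
```

Since $\sum_{i} i \binom{m_c}{i} = m_c 2^{m_c - 1}$, the sum in (7) reduces to $Q = m_c 2^{m_c-1}/(2^{m_c}-1)$, so $Q/m_c$ approaches $1/2$ for large $m_c$.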
$\mu_{|I_m|}$ and $\sigma_{|I_m|}^2$ are the mean and variance of $|I_m|$, which can be expressed as
$$\mu_{|I_m|} = \sqrt{\frac{2\sigma_1^2}{\pi}}\, e^{-\frac{\mu_1^2}{2\sigma_1^2}} - \mu_1 \operatorname{erf}\left( \frac{-\mu_1}{\sqrt{2\sigma_1^2}} \right) = \sqrt{E_s N_0}\, \gamma, \tag{14}$$
where
$$\gamma = \sqrt{\frac{1}{2\pi} + \frac{R}{4\pi r_s}}\; e^{-\left( \frac{1}{\frac{4}{r_s} + \frac{2R}{r_s^2}} \right)} - \frac{\sqrt{r_s}}{2} \operatorname{erf}\left( -\sqrt{\frac{1}{\frac{4}{r_s} + \frac{2R}{r_s^2}}} \right), \tag{15}$$
and
$$\sigma_{|I_m|}^2 = \mu_1^2 + \sigma_1^2 - \mu_{|I_m|}^2 = E_s N_0 \underbrace{\left( \frac{r_s}{4} + \frac{1}{2} + \frac{R}{4 r_s} - \gamma^2 \right)}_{\rho}. \tag{16}$$
Let $Y = |I_m|$ and $X = \max\{|I_{m'}|\}$, $m' = 1, 2, \ldots, P - 1$. The PPM position detection error probability is then calculated by
$$P_{ed} = 1 - \Pr\{Y \ge X\} = \int_0^\infty \left[ 1 - \Pr\{y \ge X\} \right] f_{|I_m|}(y)\, dy$$
$$= \frac{1}{\sqrt{2\pi \sigma_{|I_m|}^2}} \int_0^\infty \left[ 1 - \left[ \operatorname{erf}\left( \frac{y}{\sqrt{2\sigma_2^2}} \right) \right]^{P-1} \right] \left\{ e^{-\frac{(y - \mu_{|I_m|})^2}{2\sigma_{|I_m|}^2}} + e^{-\frac{(y + \mu_{|I_m|})^2}{2\sigma_{|I_m|}^2}} \right\} dy. \tag{17}$$
Let $\mu = \frac{y}{\sqrt{E_s N_0}}$; then the position detection error probability is derived as
$$P_{ed} = \frac{1}{\sqrt{2\pi\rho}} \int_0^\infty \left[ 1 - \left( \operatorname{erf}\left( \frac{\mu}{\sqrt{2\lambda}} \right) \right)^{P-1} \right] \left[ e^{-\frac{(\mu - \gamma)^2}{2\rho}} + e^{-\frac{(\mu + \gamma)^2}{2\rho}} \right] d\mu. \tag{18}$$
$P_e$ is the bit error probability of DCSK, which can be calculated as
$$P_e = \frac{1}{2} \operatorname{erfc}\left( \left( \frac{4}{r_s} + \frac{2R}{r_s^2} \right)^{-\frac{1}{2}} \right). \tag{19}$$
Finally, substituting the above into (10), we obtain the instantaneous BER of the total PPM-DCSK system over a Rayleigh fading channel, and the BER over the multipath Rayleigh fading channel is given as
$$P_{mul} = \int_0^\infty P_e \cdot f(r_s)\, dr_s, \tag{20}$$
where $f(r_s)$ is the PDF of $r_s$, which can be found in [5].

IV. SIMULATION RESULTS AND DISCUSSION

In this section, simulations are carried out to evaluate the performance of the proposed system over AWGN and multipath Rayleigh fading channels. These results are then compared with the theoretical results derived in the previous analysis for both AWGN and multipath Rayleigh fading channels. In all figures, SF represents the spreading factor and $m_c$ is the number of bits mapped into one position of the PPM modulation. In the multipath Rayleigh fading channel, three paths ($L = 3$) are considered, having equal average power gain $E[\alpha_1^2] = E[\alpha_2^2] = E[\alpha_3^2] = 1/3$, with path delays $\tau_1 = 0$, $\tau_2 = 1$, $\tau_3 = 2$. Note that $m_c$ in PPM-DCSK is the number of bits carried by the PPM part, so the total number of bits per symbol is $m_c + 1$, whereas M-DCSK carries only $m_c$ bits per symbol.

The analytical and simulated results for PPM-DCSK are shown in Fig. 3, where the analytical results match the simulations at high SNR. As can also be seen, the BER always decreases as $m_c$ increases and as SF decreases in AWGN, while these parameters have less influence in multipath fading. The reason for this behavior is that more bits are transmitted with the same symbol energy as $m_c$ increases, which means each transmitted bit needs less energy. Fig. 4 shows performance comparisons of PPM-DCSK, CIM-DCSK, CCI-DCSK and M-DCSK over the AWGN channel. Note that PPM-DCSK exhibits lower BER than the others; moreover, the BER of M-DCSK increases as $m_c$ increases, while that of PPM-DCSK decreases as $m_c$ increases. The gain of PPM-DCSK over M-DCSK is at least about 4 dB as $m_c$ increases, and the gains over CIM-DCSK and CCI-DCSK are about 0.3 dB and 2.5 dB, respectively, at a BER of $10^{-4}$. The performance of PPM-DCSK is significantly improved over M-DCSK. As shown in Fig. 5, the BER performance of PPM-DCSK over the multipath fading channel is better than that of the other schemes when SF = 64. The BER of all schemes increases with $m_c$ in a similar way, but that of PPM-DCSK increases less than the others. Note the good performance at high $m_c$ of the proposed PPM-DCSK. Fig. 6 shows the BER performance versus the number of modulated bits for these schemes over AWGN and multipath Rayleigh fading channels. The BER of M-DCSK increases monotonically as $m_c$ increases. As $m_c$ increases, the BER of PPM-DCSK, CIM-DCSK and CCI-DCSK increases in fading but decreases in AWGN.

Fig. 4. Performance comparisons of PPM-DCSK, CIM-DCSK, CCI-DCSK, M-DCSK over AWGN channel, SF = 64.
Fig. 5. BER performance comparisons of PPM-DCSK, CIM-DCSK, CCI-DCSK, M-DCSK over Rayleigh multipath fading channel with SF = 64, L = 3.
Fig. 6. Effect of the number of modulated bits on BER of PPM-DCSK, CIM-DCSK, CCI-DCSK, M-DCSK over AWGN and Rayleigh multipath fading channel with L = 3, SF = 64.

V. CONCLUSION

In this letter, a novel hybrid modulation scheme combining PPM and DCSK is considered. The scheme avoids the problem of inaccurately distinguishing the constellation points in M-DCSK systems, especially for large values of M. In the proposed system, the information-bearing signal is carried by PPM, which conveys both the DCSK-modulated bit and the extra $m_c$ bits by position modulation. The theoretical BER expression was derived and validated by simulations. A comparison of BER performance with conventional M-DCSK, CIM-DCSK and CCI-DCSK shows that the proposed hybrid PPM-DCSK system has superior BER performance over AWGN and multipath Rayleigh fading channels.

REFERENCES

[1] Y. Tan, W. Xu, T. Huang, and L. Wang, “A multilevel code shifted differential chaos shift keying scheme with code index modulation,” IEEE Trans. Circuits Syst. II, Exp. Briefs, to be published, doi: 10.1109/TCSII.2017.2764916.
[2] W. Xu, Y. Tan, F. C. M. Lau, and G. Kolumbán, “Design and optimization of differential chaos shift keying scheme with code index modulation,” IEEE Trans. Commun., vol. 66, no. 5, pp. 1970–1980, Mar. 2018.
[3] G. Kolumbán, B. Vizvári, W. Schwarz, and A. Abel, “Differential chaos shift keying: A robust coding for chaos communication,” in Proc. NDES, Seville, Spain, 1996, pp. 87–92.
[4] M. Dawa, G. Kaddoum, and Z. Sattar, “A generalized lower bound on the bit error rate of DCSK systems over multi-path Rayleigh fading channels,” IEEE Trans. Circuits Syst. II, Exp. Briefs, vol. 65, no. 3, pp. 321–325, Mar. 2018.
[5] G. Cheng, L. Wang, W. Xu, and G. Chen, “Carrier index differential chaos shift keying modulation,” IEEE Trans. Circuits Syst. II, Exp. Briefs, vol. 64, no. 8, pp. 907–911, Aug. 2017.
[6] M. Herceg, G. Kaddoum, D. Vranješ, and E. Soujeri, “Permutation index DCSK modulation technique for secure multiuser high-data-rate communication systems,” IEEE Trans. Veh. Technol., vol. 67, no. 4, pp. 2997–3011, Apr. 2018.
[7] M. Herceg, D. Vranješ, G. Kaddoum, and E. Soujeri, “Commutation code index DCSK modulation technique for high-data-rate communication systems,” IEEE Trans. Circuits Syst. II, Exp. Briefs, to be published, doi: 10.1109/TCSII.2018.2817930.
[8] G. Kaddoum, M. F. A. Ahmed, and Y. Nijsure, “Code index modulation: A high data rate and energy efficient communication system,” IEEE Commun. Lett., vol. 19, no. 2, pp. 175–178, Feb. 2015.
[9] G. Kaddoum, Y. Nijsure, and H. Tran, “Generalized code index modulation technique for high-data-rate communication systems,” IEEE Trans. Veh. Technol., vol. 65, no. 9, pp. 7000–7009, Sep. 2016.
[10] L. Wang, G. Cai, and G. R. Chen, “Design and performance analysis of a new multiresolution M-ary differential chaos shift keying communication system,” IEEE Trans. Wireless Commun., vol. 14, no. 9, pp. 5197–5208, Sep. 2015.
[11] L. Bosotti and G. Pirani, “A PAM-PPM signalling format in optical fibre digital communications,” Opt. Quantum Electron., vol. 11, no. 1, pp. 71–86, Jan. 1979.
[12] Y. Dai and X. Dong, “Hybrid PPM-BPSK for transmitted reference pulse cluster systems in UWB and 60-GHz channels,” IEEE Wireless Commun. Lett., vol. 3, no. 6, pp. 657–660, Dec. 2014.
Double Shadowing the Rician Fading Model
Nidhi Simmons, Student Member, IEEE, Carlos Rafael Nogueira da Silva, Simon L. Cotton, Senior Member, IEEE, Paschalis C. Sofotasios, Senior Member, IEEE, and Michel Daoud Yacoub, Member, IEEE
Abstract—In this letter, we consider a Rician fading envelope which is impacted by dual shadowing processes. We conveniently refer to this as the double shadowed Rician fading model which can appear in two different formats, each underpinned by a different physical signal reception model. The first format assumes a Rician envelope where the dominant component is fluctuated by a Nakagami-m random variable (RV) which is preceded (or succeeded) by a secondary round of shadowing brought about by an inverse Nakagami-m RV. The second format considers that the dominant component and scattered waves of a Rician envelope are perturbed by two different shadowing processes. In particular, the dominant component experiences variations characterized by the product of a Nakagami-m and an inverse Nakagami-m RV, whereas the scattered waves are subject to fluctuations influenced by an inverse Nakagami-m RV. Using the relationship between the shadowing properties of the two formats, we develop unified closed-form and analytical expressions for their probability density function, cumulative distribution function, moment-generating function and moments. All derived expressions are validated through Monte Carlo simulations and reduction to a number of special cases.

Index Terms—Composite fading, fading channels, inverse Nakagami-m distribution, shadowed Rician model.

Manuscript received June 13, 2018; revised September 11, 2018; accepted September 13, 2018. Date of publication September 24, 2018; date of current version April 9, 2019. This work was supported in part by the U.K. Engineering and Physical Sciences Research Council under Grant EP/L026074/1, in part by the Department for the Economy, Northern Ireland under Grant USI080, and in part by Khalifa University of Science and Technology under Grant 8474000122 and Grant 8474000137. The associate editor coordinating the review of this paper and approving it for publication was A. Kammoun. (Corresponding author: Simon L. Cotton.)
N. Simmons and S. L. Cotton are with the Institute of Electronics, Communications and Information Technology, Queen’s University Belfast, Belfast BT3 9DT, U.K. (e-mail: nsimmons01@qub.ac.uk; simon.cotton@qub.ac.uk).
C. R. N. da Silva and M. D. Yacoub are with the Wireless Technology Laboratory, School of Electrical and Computer Engineering, University of Campinas, Campinas 13083-970, Brazil (e-mail: carlosrn@decom.fee.unicamp.br; michel@decom.fee.unicamp.br).
P. C. Sofotasios is with the Department of Electrical and Computer Engineering, Khalifa University of Science and Technology, Abu Dhabi 127788, UAE, and also with the Department of Electronics and Communications Engineering, Tampere University of Technology, 33720 Tampere, Finland (e-mail: p.sofotasios@ieee.org).
Digital Object Identifier 10.1109/LWC.2018.2871677

I. INTRODUCTION

SEVERAL statistical distributions have been proposed to characterize fading in wireless channels [1]. Shadowing is commonly modeled using the lognormal distribution [1] whilst multipath fading is described by the Rayleigh, Rice, Nakagami-m, and more recently κ-μ and η-μ [2] distributions. Nevertheless, these models are unable to account for fluctuations of the line-of-sight (LOS) or scattered signal contributions brought about by shadowing. Hence, several composite fading models have been proposed which address these shortcomings. The shadowing in these is LOS if the dominant component of the envelope is shadowed, and multiplicative when the total power of the dominant (if present) and scattered components are shadowed. A number of multiplicative shadow fading models were proposed in [3]–[6]. These include the Nakagami-m/gamma [3], κ-μ/gamma [4], η-μ/gamma [5], κ-μ/inverse gamma and η-μ/inverse gamma models [6].

Here, we focus on the shadowed Rician model [7] which considers a Rician envelope in which the LOS is perturbed by shadowing shaped by a Nakagami-m random variable (RV). This model has good analytical properties [8] and provides an excellent fit to data obtained from land mobile satellite and underwater acoustic channels [7], [9]. Motivated by this, we introduce the double shadowed Rician fading model which can appear in two formats. The first format considers a Rician signal in which the LOS undergoes variations influenced by a Nakagami-m RV. It also assumes that the root mean square (rms) power of the dominant component and scattered waves undergo a secondary round of shadowing shaped by an inverse Nakagami-m RV. The second format assumes that the dominant component of a Rician envelope undergoes fluctuations characterized by the product of a Nakagami-m and an inverse Nakagami-m RV, whilst the scattered waves are fluctuated by an inverse Nakagami-m RV. The PDF, cumulative distribution function (CDF), moment generating function (MGF) and moments are derived which are coincidentally identical for both formats, differing only in the interpretation of the underlying physical phenomena. These results are then used to obtain amount of fading (AF) and outage probability (OP).

II. THE PHYSICAL MODEL

The PDF of the shadowed Rician fading model [7] is
$$f_X(x) = \left( \frac{2\sigma^2 m_d}{2\sigma^2 m_d + d^2} \right)^{m_d} \frac{x}{\sigma^2}\, e^{-\frac{x^2}{2\sigma^2}}\, {}_1F_1\!\left( m_d; 1; \frac{d^2 x^2}{2\sigma^2 (2\sigma^2 m_d + d^2)} \right) \tag{1}$$
where ${}_1F_1(\cdot; \cdot; \cdot)$ denotes the confluent hypergeometric function [10, eq. (9.210.1)], $m_d$ denotes the shape parameter of the Nakagami-m RV, $2\sigma^2$ is the average power of the scattered component, and $d^2$ is the average power of the LOS component. Here, $k = \frac{d^2}{2\sigma^2}$ is the Rician k parameter, and $\hat{x} = \sqrt{E[X^2]} = \sqrt{2\sigma^2 + d^2}$ represents the rms power of $x$, where $E[\cdot]$ is the expectation operator. We rewrite (1) as
$$f_X(x) = \frac{m_d^{m_d}}{(k + m_d)^{m_d}}\, \frac{2x(1+k)}{\hat{x}^2}\, e^{-(1+k)\frac{x^2}{\hat{x}^2}}\, {}_1F_1\!\left( m_d; 1; \frac{k(1+k)x^2}{\hat{x}^2 (m_d + k)} \right). \tag{2}$$
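The equivalence of (1) and (2) under $k = d^2/(2\sigma^2)$ and $\hat{x}^2 = 2\sigma^2 + d^2$ can be checked numerically. The following is a minimal sketch, not part of the letter: the function names are hypothetical, and the confluent hypergeometric function is evaluated with a plain truncated power series, which is assumed adequate only for the moderate arguments used here.

```python
from math import exp


def hyp1f1_c1(a, z, terms=400):
    """Truncated series for 1F1(a; 1; z) = sum_n (a)_n z^n / (n!)^2."""
    total, term = 1.0, 1.0
    for n in range(terms):
        term *= (a + n) * z / ((n + 1) ** 2)   # ratio of consecutive series terms
        total += term
    return total


def pdf_sigma_form(x, sigma2, d2, m_d):
    """Shadowed Rician PDF written as in (1); sigma2 = sigma^2, d2 = d^2."""
    base = 2 * sigma2 * m_d / (2 * sigma2 * m_d + d2)
    z = d2 * x * x / (2 * sigma2 * (2 * sigma2 * m_d + d2))
    return base ** m_d * (x / sigma2) * exp(-x * x / (2 * sigma2)) * hyp1f1_c1(m_d, z)


def pdf_k_form(x, k, x_hat2, m_d):
    """Shadowed Rician PDF written as in (2); x_hat2 is the squared rms value."""
    z = k * (1 + k) * x * x / (x_hat2 * (m_d + k))
    return ((m_d / (m_d + k)) ** m_d * 2 * x * (1 + k) / x_hat2
            * exp(-(1 + k) * x * x / x_hat2) * hyp1f1_c1(m_d, z))


if __name__ == "__main__":
    sigma2, d2, m_d = 0.4, 1.7, 2.3            # arbitrary illustrative parameters
    k, x_hat2 = d2 / (2 * sigma2), 2 * sigma2 + d2
    for x in (0.5, 1.0, 2.0):
        print(x, pdf_sigma_form(x, sigma2, d2, m_d), pdf_k_form(x, k, x_hat2, m_d))
```

Both parameterizations should print the same density at each point, up to floating-point rounding.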
SIMMONS et al.: DOUBLE SHADOWING THE RICIAN FADING MODEL 345
The first format of the double shadowed Rician model assumes a Rician fading channel which undergoes LOS shadowing followed by a secondary round of composite shadowing or vice versa. Physically, this may arise when the signal power delivered through the direct path between the transmitter and receiver is subject to varying levels of shadowing, whilst further shadowing of the received power (combined scattered multipath and LOS) is due to obstacles moving in the vicinity of either the transmitter or receiver. Its signal envelope, $R$, is
$$R^2 = A^2 \left[ (I + \xi\mu_x)^2 + (Q + \xi\mu_y)^2 \right] \tag{3}$$
where $I$ and $Q$ are mutually independent Gaussian random processes with mean $E[I] = E[Q] = 0$ and variance $E[I^2] = E[Q^2] = \sigma^2$, and $\mu_x$ and $\mu_y$ are the mean values of the in-phase and the quadrature phase components, respectively. In (3), $\xi$ is a Nakagami-m RV with shape parameter $m_d$ where $E[\xi^2] = 1$, and $A$ is an inverse Nakagami-m RV with shape parameter $m_s$, where $E[A^2] = 1$, whose PDF is given by
$$f_A(\alpha) = \frac{2 (m_s - 1)^{m_s}}{\Gamma(m_s)\, \alpha^{2m_s + 1}}\, e^{-\frac{(m_s - 1)}{\alpha^2}}. \tag{4}$$
Here, $\Gamma(\cdot)$ represents the Gamma function [10, eq. (8.310.1)].

The second format of the double shadowed Rician model assumes a Rician faded signal in which the dominant component and scattered waves are subject to two different shadowing processes. More precisely, the dominant component experiences variations characterized by the product of a Nakagami-m and an inverse Nakagami-m RV, whilst the scattered waves are subject to fluctuations influenced by an inverse Nakagami-m RV. Its signal envelope, $R$, is
$$R^2 = (AI + B\mu_x)^2 + (AQ + B\mu_y)^2 \tag{5}$$
where $B = A\xi$, and $A$ and $\xi$ are as defined above. It is worth highlighting that, as shown in [11], $B^2$ follows a Fisher-Snedecor F distribution. Now, substituting for $B$ in (5), we obtain (3). Note that although (3) and (5) are mathematically identical, their physical meanings differ as explained above.

III. STATISTICAL CHARACTERISTICS

Exploiting the mathematical relationship above and selecting (3) as our starting point, the distribution of the received signal envelope, $R$, in a double shadowed Rician channel can be obtained by determining the conditional probability
$$f_R(r) = \int_0^\infty f_{R|A}(r|\alpha)\, f_A(\alpha)\, d\alpha \tag{6}$$
where
$$f_{R|A}(r|\alpha) = \frac{2r(1+k)\, e^{-(1+k)\frac{r^2}{\alpha^2 \hat{r}^2}}\, {}_1F_1\!\left( m_d; 1; \frac{k(1+k) r^2}{\alpha^2 \hat{r}^2 (m_d + k)} \right)}{m_d^{-m_d} (m_d + k)^{m_d}\, \alpha^2 \hat{r}^2}. \tag{7}$$

Theorem 1: For $k, m_d, \hat{r}^2, r \in \mathbb{R}^+$ and $m_s > 1$ the PDF of the double shadowed Rician fading model can be written as
$$f_R(r) = \frac{2r\, \hat{r}^{2m_s}\, m_s (m_s - 1)^{m_s} (1+k)}{\left( r^2(1+k) + (m_s - 1)\hat{r}^2 \right)^{m_s + 1}} \left( \frac{m_d}{m_d + k} \right)^{m_d} {}_2F_1\!\left( m_d, m_s + 1; 1; \frac{k(1+k) r^2}{(m_d + k)\left( r^2(1+k) + (m_s - 1)\hat{r}^2 \right)} \right) \tag{8}$$
where ${}_2F_1(\cdot, \cdot; \cdot; \cdot)$ is the Gauss hypergeometric function [10].
Proof: See Appendix A.

Letting $\gamma$ represent the instantaneous signal-to-noise-ratio (SNR) of the double shadowed Rician fading model, the PDF of its instantaneous SNR, $f_\gamma(\gamma)$, is obtained from the envelope PDF in (8) via a transformation of variables ($r = \sqrt{\gamma \hat{r}^2 / \bar{\gamma}}$),
$$f_\gamma(\gamma) = \frac{\bar{\gamma}^{m_s}\, m_s (m_s - 1)^{m_s} (1+k)}{\left( \gamma(1+k) + (m_s - 1)\bar{\gamma} \right)^{m_s + 1}} \left( \frac{m_d}{m_d + k} \right)^{m_d} {}_2F_1\!\left( m_d, m_s + 1; 1; \frac{k(1+k)\gamma}{(m_d + k)\left( \gamma(1+k) + (m_s - 1)\bar{\gamma} \right)} \right) \tag{9}$$
where $\bar{\gamma} = E[\gamma]$ denotes the corresponding average SNR.

Lemma 1: For $k, m_d, \bar{\gamma}, \gamma \in \mathbb{R}^+$, and $m_s > 1$ the CDF of the double shadowed Rician fading model, $F_\gamma(\gamma)$, can be obtained such that $F_\gamma(\gamma) = \int_0^\gamma f_\gamma(t)\, dt$, as
$$F_\gamma(\gamma) = \left( \frac{m_d}{m_d + k} \right)^{m_d} \sum_{i=0}^{\infty} \left( \frac{k}{m_d + k} \right)^{i} \frac{(m_d)_i\, (i+1)_{m_s}}{\Gamma(m_s)\Gamma(i+2)} \left( \frac{\gamma(1+k)}{(m_s - 1)\bar{\gamma}} \right)^{i+1} {}_2F_1\!\left( i+1, i+m_s+1; i+2; -\frac{\gamma(1+k)}{(m_s - 1)\bar{\gamma}} \right) \tag{10}$$
where $(x)_n = \frac{\Gamma(x+n)}{\Gamma(x)}$ denotes the Pochhammer symbol [10]. When $(m_s - 1)\bar{\gamma} > \gamma(1+k)$,
$$F_\gamma(\gamma) = \frac{m_d^{m_d}}{(m_d + k)^{m_d}}\, \frac{m_s \gamma (1+k)}{(m_s - 1)\bar{\gamma}} \times F^{2,1,0}_{1,1,0}\!\left( \begin{matrix} m_s + 1, 1;\ m_d;\ -; \\ 2;\ 1;\ -; \end{matrix}\ \frac{k(1+k)\gamma}{(m_d + k)\bar{\gamma}(m_s - 1)},\ \frac{-(1+k)\gamma}{\bar{\gamma}(m_s - 1)} \right) \tag{11}$$
where $F^{\cdot,\cdot,\cdot}_{\cdot,\cdot,\cdot}(\cdot)$ denotes the Kampé de Fériet function [12]. On the contrary, when $(m_s - 1)\bar{\gamma} < \gamma(1+k)$,
$$F_\gamma(\gamma) = \frac{m_d^{m_d}}{(m_d + k)^{m_d}} \left[ F^{1,1,1}_{0,1,1}\!\left( \begin{matrix} 1;\ m_d;\ 0; \\ -;\ 1;\ 1 - m_s; \end{matrix}\ \frac{k}{m_d + k},\ \frac{-(m_s - 1)\bar{\gamma}}{\gamma(1+k)} \right) - \zeta\, F^{1,1,1}_{0,1,1}\!\left( \begin{matrix} m_s + 1;\ m_d;\ m_s; \\ -;\ 1;\ 1 + m_s; \end{matrix}\ \frac{k}{m_d + k},\ \frac{-(m_s - 1)\bar{\gamma}}{\gamma(1+k)} \right) \right] \tag{12}$$
where $\zeta = \frac{((m_s - 1)\bar{\gamma})^{m_s}\, \Gamma(m_s + 1)}{m_s (\gamma(1+k))^{m_s}\, \Gamma(m_s)}$.
Proof: See Appendix A.

Lemma 2: For $k, m_d, \bar{\gamma}, \gamma \in \mathbb{R}^+$, and $m_s > 1$ the MGF of the double shadowed Rician fading model, $M_\gamma(s)$, can be obtained such that $M_\gamma(s) \triangleq E[e^{-s\gamma}] = \int_0^\infty e^{-s\gamma} f_\gamma(\gamma)\, d\gamma$, as
$$M_\gamma(s) = \left( \frac{m_d}{m_d + k} \right)^{m_d} \left[ \psi_1\!\left( 1, m_d, 1, 1 - m_s; \frac{k}{m_d + k}, \zeta \right) + \frac{\zeta^{m_s}\, \Gamma(-m_s)}{B(m_s, 1)}\, \psi_1\!\left( 1 + m_s, m_d, 1, 1 + m_s; \frac{k}{m_d + k}, \zeta \right) \right] \tag{13}$$
where $\zeta = \frac{\bar{\gamma}(m_s - 1)s}{1+k}$ and $\psi_1(\cdot, \cdot, \cdot, \cdot; \cdot, \cdot)$ is the Humbert $\psi_1$ function [13].
Proof: See Appendix B.

Lemma 3: For $k, m_d, \bar{\gamma}, \gamma \in \mathbb{R}^+$, and $m_s > 1$ the n-th order moment of the double shadowed Rician fading model, $E[\gamma^n]$, can be obtained such that $E[\gamma^n] \triangleq \int_0^\infty \gamma^n f_\gamma(\gamma)\, d\gamma$, as
$$E[\gamma^n] = \frac{\Gamma(m_s - n)\, \Gamma(1+n)\, {}_2F_1\!\left( m_d, n+1; 1; \frac{k}{m_d + k} \right)}{m_d^{-m_d} (m_d + k)^{m_d}\, \Gamma(m_s)\, (1+k)^n\, [(m_s - 1)\bar{\gamma}]^{-n}}. \tag{14}$$
Proof: See Appendix B.
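The moment expression (14) lends itself to a quick numerical check, since its first moment must return the average SNR $\bar{\gamma}$. The sketch below is illustrative only: the function names are hypothetical, and the Gauss hypergeometric function is summed from its defining power series with a simple relative-size stopping rule, which is assumed acceptable because the argument $k/(m_d+k)$ always lies in $(0, 1)$.

```python
from math import gamma


def hyp2f1_c1(a, b, x, tol=1e-15, max_terms=100000):
    """Power series for 2F1(a, b; 1; x) = sum_n (a)_n (b)_n x^n / (n!)^2, for 0 <= x < 1."""
    total, term = 1.0, 1.0
    for n in range(max_terms):
        term *= (a + n) * (b + n) * x / ((n + 1) ** 2)
        total += term
        if abs(term) < tol * abs(total):   # simple stopping rule (an assumption of this sketch)
            break
    return total


def snr_moment(n, k, m_d, m_s, gamma_bar):
    """n-th moment of the instantaneous SNR as written in (14); needs m_s > n."""
    x = k / (m_d + k)
    scale = ((m_s - 1.0) * gamma_bar / (1.0 + k)) ** n
    return (gamma(m_s - n) * gamma(1.0 + n) / gamma(m_s)
            * (m_d / (m_d + k)) ** m_d * scale
            * hyp2f1_c1(m_d, n + 1.0, x))


if __name__ == "__main__":
    # Arbitrary illustrative parameters: the first moment should reproduce gamma_bar.
    print(snr_moment(1, k=2.0, m_d=1.5, m_s=3.0, gamma_bar=4.0))
```

Higher moments from the same function can be combined as $E[\gamma^2]/E[\gamma]^2 - 1$ to obtain the amount of fading numerically.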
IV. PERFORMANCE ANALYSIS

A. Amount of Fading
Corollary 1: For $k \in \mathbb{R}^+$, $m_s > 2$, the AF of the double shadowed Rician fading model is obtained such that $AF \triangleq \frac{V[\gamma]}{E[\gamma]^2} = \frac{E[\gamma^2]}{E[\gamma]^2} - 1$, where $V(\cdot)$ denotes the variance operator, as
$$AF = \frac{m_d m_s (1 + 2k) + (m_d + m_s - 1) k^2}{m_d (m_s - 2)(1 + k)^2}. \tag{15}$$

B. Outage Probability
Corollary 2: For $k, m_d, \bar{\gamma} \in \mathbb{R}^+$ and $m_s > 1$ the OP of the double shadowed Rician fading model can be obtained such that $P_{OP}(\gamma_{th}) \triangleq P[0 \le \gamma \le \gamma_{th}] = F_\gamma(\gamma_{th})$, as
$$P_{OP}(\gamma_{th}) = \left( \frac{m_d}{m_d + k} \right)^{m_d} \sum_{i=0}^{\infty} \left( \frac{k}{m_d + k} \right)^{i} \frac{(m_d)_i\, (i+1)_{m_s}}{\Gamma(m_s)\Gamma(i+2)} \left( \frac{\gamma_{th}(1+k)}{(m_s - 1)\bar{\gamma}} \right)^{i+1} {}_2F_1\!\left( i+1, i+m_s+1; i+2; -\frac{\gamma_{th}(1+k)}{(m_s - 1)\bar{\gamma}} \right) \tag{16}$$
where $\gamma_{th}$ is the threshold SNR.

Proposition 1: For $(m_s - 1)\bar{\gamma}(m_d + k) > \gamma_{th} k (1+k)$, the truncation error, $T$, for the infinite series in (16) is given as
$$T \le \frac{\gamma_{th}(1+k)}{(m_s - 1)\bar{\gamma}}\, {}_2F_1\!\left( T_0 + 1, T_0 + m_s + 1; T_0 + 2; -\frac{\gamma_{th}(1+k)}{(m_s - 1)\bar{\gamma}} \right) \times m_s\, {}_2F_1\!\left( m_d, 1 + m_s; 2; \frac{\gamma_{th} k (1+k)}{(m_s - 1)\bar{\gamma}(m_d + k)} \right). \tag{17}$$
Proof: See Appendix C.

V. SPECIAL CASES AND NUMERICAL RESULTS

The results presented here encompass the statistics of the shadowed Rician, shadowed Rayleigh, Nakagami-q, Rician and Rayleigh fading models. For example, letting $m_s \to \infty$ in (8) we obtain the PDF of the shadowed Rician model, and allowing $m_d \to 0$ in (8) we obtain the PDF of the shadowed Rayleigh model. Allowing $m_s \to \infty$ and $m_d = 0.5$ in (8) we obtain the PDF of the Nakagami-q (Hoyt) model, while letting $m_s \to \infty$ and $m_d \to \infty$ in (8), the Rician PDF is obtained, followed by the Rayleigh fading model when $k \to 0$. Fig. 1 shows these special cases alongside Monte-Carlo simulations.

To provide some insights into the effect of shadowing upon the dominant and scattered multipath signal in double shadowed Rician fading channels, Fig. 2 shows the calculated AF for different values of $m_d$ and $m_s$. Choosing the first format of the double shadowed Rician model, it is observed from Fig. 2 that the greatest AF occurs for severe shadowing of the multiplicative parameter (low $m_s$), when compared to the shadowing of the LOS component ($m_d$). For instance, the AF observed when $\{m_d, m_s, k\} = \{3.5, 2.5, 20.6\}$ is 3.05, which is greater than the AF of 1.42 observed when $\{m_d, m_s, k\} = \{2.5, 3.5, 20.6\}$. From Fig. 3 we observe that the OP increases for severe shadowing of the LOS component (low $m_d$) and multiplicative parameter (low $m_s$), and for low values of the Rician k parameter. Moreover, the rate at which the OP decreases is faster as the $m_d$, $m_s$ and $k$ parameters grow large.

Fig. 1. Double shadowed Rician PDF alongside special cases. Lines represent analytical results, and circle markers represent simulation results ($\hat{r} = 0.9$).
Fig. 2. The AF in double shadowed Rician fading channels for a range of $m_s$ and $m_d$ when $k = 0.5$ and $20.6$.
Fig. 3. OP versus $\bar{\gamma}$ for different values of $m_d$, $m_s$ and $k$. Here $\gamma_{th} = 0$ dB.
VI. C ONCLUSION A PPENDIX C

The double shadowed Rician fading model has been P ROOF OF (17) - T RUNCATION E RROR
proposed in conjunction with two underlying signal models. T for the series in (16) if T0 − 1 terms are used, is
It was shown that although the two formats have differ- ∞ i
k (md )i (i + 1)ms γth (1 + k ) i+1
ent physical meanings, mathematically they are identical. T =
Consequently, fundamental statistics such as the PDF, md + k Γ(i + 2)Γ(ms ) (ms − 1)γ̄
i=T0
CDF, MGF, and moments were obtained, while impor-
γ (1 + k )
tant performance measures such as the AF, and OP were ×2 F1 i + 1, i + ms + 1; i + 2; − th . (19)
(ms − 1)γ̄
derived.
Since the Gauss hypergeometric function in (19) is monoton-
A PPENDIX A ically decreasing with respect to i, T can be bounded as

P ROOF OF (8), (10), (11) AND (12) γ (1 + k )
T ≤ 2 F1 T0 + 1, T0 + ms + 1; T0 + 2; − th
The PDF of the double shadowed Rician model shown in (8) (ms − 1)γ̄
∞
i
is obtained by first substituting (4) and (7) in (6), followed by k (md )i (i +1)ms γth (1+k ) i+1
solving the resultant integral using [10, eq. (7.621.4)]. × . (20)
Replacing the Gauss hypergeometric function with md +k Γ(i +2)Γ(ms ) (ms − 1)γ̄
i=T0
[14, eq. (07.23.02.0001.01)] in (9), substituting the resultant Since we add up strictly positive terms, we have
expression in Fγ (γ) = 0γ fγ (t)dt, and solving the inte- ∞ i
gral using [10, eq. (3.194.5)] we obtain the CDF shown k (md )i (i + 1)ms γth (1 + k ) i
in (10). Now, substituting the Gauss hypergeometric function md + k Γ(i + 2)Γ(ms ) (ms − 1)γ̄
i=T0
with [14, eq. (07.23.02.0001.01)] (for (ms − 1)γ̄ > γ(1 + k ))
∞ i
in (10), using the Pochhammer symbol identities, and finally k (md )i (i + 1)ms γth (1 + k ) i
≤ . (21)
using the definition of Kampé de Feriét function [12], we md + k Γ(i + 2)Γ(ms ) (ms − 1)γ̄
i=0
obtain (11). On the contrary, substituting the Gauss hyper-
geometric function in (10) with [14, eq. (07.23.02.0004.01)] When (ms − 1)γ̄(md + k ) > γth k (1 + k ), sim-
(for (ms − 1)γ̄ < γ(1 + k )), using the Pochhammer symbol plifying (20) using Pochhammer symbol identities
identities, and finally using the definition of Kampé de Feriét and [14, eq. (07.23.02.0001.01)], we obtain (17).
function, we obtain (12).
APPENDIX B
PROOF OF (13), (14) AND (15)

Substituting (9) in $M_\gamma(s) = \int_0^{\infty} e^{-s\gamma} f_\gamma(\gamma)\,d\gamma$, followed by replacing the Gauss hypergeometric function with [14, eq. (07.23.02.0001.01)], we obtain an integral similar to [10, eq. (3.383.5)]. Now substituting for the generalized Laguerre polynomial [10] given by $L_n^m(x) = \frac{\Gamma(m+n+1)}{\Gamma(m+1)\Gamma(n+1)}\,{}_1F_1(-n; m+1; x)$ and simplifying, yields

$$M_\gamma(s) = \left(\frac{m_d}{m_d+k}\right)^{m_d}\sum_{i=0}^{\infty}\left(\frac{k}{m_d+k}\right)^{i}\frac{(m_d)_i\,(i+1)_{m_s}}{\Gamma(m_s)\,i!}\left[B(m_s, i+1)\,{}_1F_1\!\left(i+1;\,1-m_s;\,\frac{\bar{\gamma}(m_s-1)s}{1+k}\right) + \left(\frac{\bar{\gamma}(m_s-1)s}{1+k}\right)^{m_s}\Gamma(-m_s)\,{}_1F_1\!\left(i+m_s+1;\,1+m_s;\,\frac{\bar{\gamma}(m_s-1)s}{1+k}\right)\right]. \tag{18}$$

By substituting the Kummer confluent hypergeometric function with [14, eq. (07.20.02.0001.01)] in (18), using the Pochhammer symbol identities, and the definition of the Humbert $\psi_1$ function [13], we obtain (13).

The $n$-th order moment is obtained by first substituting [14, eq. (07.23.02.0001.01)] in (9), then substituting the resultant expression in $E[\gamma^n] = \int_0^{\infty}\gamma^n f_\gamma(\gamma)\,d\gamma$ to obtain an integral identical to [10, eq. (3.194.3)]. Now, using [14, eq. (07.23.02.0001.01)] and simplifying, yields (14).

Substituting $n = 1$ and $2$ into (14), we obtain the first and second moments of the double shadowed Rician model. Utilizing these in the AF formulation (see corollary 1), followed by using [14, eq. (07.23.02.0001.01)] and finally simplifying the resultant expression, yields the AF in (15).

REFERENCES
[1] M.-S. Alouini and M. K. Simon, “Dual diversity over correlated log-normal fading channels,” IEEE Trans. Commun., vol. 50, no. 12, pp. 1946–1959, Dec. 2002.
[2] M. D. Yacoub, “The κ-μ distribution and the η-μ distribution,” IEEE Antennas Propag. Mag., vol. 49, no. 1, pp. 68–81, Feb. 2007.
[3] P. M. Shankar, “Error rates in generalized shadowed fading channels,” Wireless Pers. Commun., vol. 28, no. 3, pp. 233–238, 2004.
[4] P. C. Sofotasios and S. Freear, “The κ-μ/gamma composite fading model,” in Proc. IEEE Int. Conf. Wireless Inf. Technol. Syst., Honolulu, HI, USA, Aug. 2010, pp. 1–4.
[5] P. C. Sofotasios and S. Freear, “The η-μ/gamma composite fading model,” in Proc. IEEE Int. Conf. Wireless Inf. Technol. Syst., Honolulu, HI, USA, Aug. 2010, pp. 1–4.
[6] S. K. Yoo et al., “The κ-μ/inverse gamma and η-μ/inverse gamma composite fading models: Fundamental statistics and empirical validation,” IEEE Trans. Commun., to be published, doi: 10.1109/TCOMM.2017.2780110.
[7] A. Abdi, W. C. Lau, M.-S. Alouini, and M. Kaveh, “A new simple model for land mobile satellite channels: First- and second-order statistics,” IEEE Trans. Wireless Commun., vol. 2, no. 3, pp. 519–528, May 2003.
[8] J. F. Paris, “Closed-form expressions for Rician shadowed cumulative distribution function,” Electron. Lett., vol. 46, no. 13, pp. 952–953, Jun. 2010.
[9] F. Ruiz-Vega, M. C. Clemente, P. Otero, and J. F. Paris, “Ricean shadowed statistical characterization of shallow water acoustic channels for wireless communications,” arXiv preprint arXiv:1112.4410, 2011.
[10] I. S. Gradshteyn and I. M. Ryzhik, Table of Integrals, Series, and Products, 7th ed. New York, NY, USA: Academic, 2007.
[11] S. K. Yoo et al., “The Fisher–Snedecor F distribution: A simple and accurate composite fading model,” IEEE Commun. Lett., vol. 21, no. 7, pp. 1661–1664, Jul. 2017.
[12] Listing of Mathematical Notations, Wolfram Res., Inc., Champaign, IL, USA, 2017, Accessed: Aug. 31, 2018. [Online]. Available: http://functions.wolfram.com/Notations/5
[13] P. Humbert, “Sur les fonctions hypercylindriques,” Comptes Rendus Hebdomadaires Des Séances De L’Académie Des Sci., vol. 171, pp. 490–492, 1920.
[14] (2016). Wolfram Research, Inc. Accessed: Feb. 3, 2017. [Online]. Available: http://functions.wolfram.com/id
Energy-Efficient Prefix Code Based Backscatter Communication for Wirelessly Powered Networks

Yufan Zhang, Ertao Li, Yi-Hua Zhu, Senior Member, IEEE, Kaikai Chi, Member, IEEE, and Xianzhong Tian

Abstract—Backscatter communications have been widely applied in wirelessly powered networks. Energy constraints force the nodes to backscatter only a few bits at a time, which prevents backscatter communications from being used in applications that need to deliver more data (e.g., an image). It is therefore important to save energy in backscatter communications so that the nodes can deliver more data with their limited energy. In this letter, the energy-efficient code based backscatter communication (CBBC) is proposed, which makes use of the energy consumption disparity between transmitting/receiving bit 0 and bit 1 in existing backscatter communications. The energy-efficient prefix codebook is derived from a formulated energy consumption minimization problem. In the CBBC, the codebook is shared by the sender and the receiver of a backscatter link; the sender breaks the original bit stream into equal-length blocks and delivers the energy-consuming blocks by using their corresponding codewords, from which the receiver decodes the original data. The experiments show that the proposed CBBC can save energy for backscatter communication.

Index Terms—Backscatter communication, wirelessly powered networks, energy conservation, prefix code.

Manuscript received July 31, 2018; revised September 15, 2018; accepted September 24, 2018. Date of publication September 28, 2018; date of current version April 9, 2019. This work was supported by the National Natural Science Foundation of China under Grant 61432015, Grant 61772470, Grant 61872322, and Grant 61672465. The associate editor coordinating the review of this paper and approving it for publication was H. T. Dinh. (Corresponding author: Yi-Hua Zhu.)
The authors are with the School of Computer Science and Technology, Zhejiang University of Technology, Hangzhou 310023, China (e-mail: yhzhu@zjut.edu.cn; kkchi@zjut.edu.cn; txz@zjut.edu.cn).
Digital Object Identifier 10.1109/LWC.2018.2872538
2162-2345 © 2018 IEEE. Translations and content mining are permitted for academic research only. Personal use is also permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

I. INTRODUCTION

The nodes in wirelessly powered networks are powered by dedicated wireless chargers or harvest energy from the environment. Backscatter communications, which have emerged as a promising solution to achieve green communication for the future Internet of Things (IoT) [1], have been widely applied in delivering data in wirelessly powered networks. The applications of backscatter communications include radio frequency identification (RFID) systems [2], backscatter sensors [3], and more. In RFID systems, tags are powered by a reader and transmit their data to the reader through backscatter communications. Energy constraints cause the nodes to backscatter only a few bits at a time. This prevents backscatter communications from being used in applications that need to deliver more data, such as backscattering an image. Therefore, it is important to let the energy-restricted nodes deliver more data through backscatter communications.

In backscatter communications, data are transmitted bit by bit. FM0 baseband encoding (a bi-phase space baseband encoding) and Miller modulation have been widely adopted in RFID systems [2]. We found that there exists an Energy Consumption Disparity (ECD) between transmitting/receiving bit 0 and bit 1 in backscatter communications adopting FM0, due to an additional mid-symbol phase inversion in backscattering bit 0 [4]. That is, the energy consumed in delivering a single bit 0 almost suffices to deliver two bit 1s. In fact, ECD is also present in Miller modulation. To cope with the ECD, the Energy-Efficient Data Delivery Scheme (EEDDS) was proposed in our previous work [4]; it reduces energy consumption by using a codebook shared by the sender and the receiver. Under the EEDDS, the sender breaks the original bit stream into multiple $m$-bit data blocks, finds the corresponding codewords of the blocks, and then transmits the codewords, from which the receiver recovers the original data. The EEDDS has the following shortcomings. Firstly, for a smaller ECD, say less than 2, the EEDDS saves almost no energy when $m < 12$; that is, it requires a larger $m$ to gain energy savings. Thus, the number of codewords (i.e., the codebook size) in the EEDDS may be too large to fit in the nodes' tightly constrained memory, since the codebook size is $2^m$, which makes the EEDDS hard to implement. In fact, it is unnecessary to encode all the $m$-bit data blocks, because some of them contain many bit 1s and can be delivered directly with less energy consumption. Secondly, each codeword has length greater than $m$, which causes each data block to be delivered with a longer delay. Hence, it is critical to develop an energy-efficient and easily implemented scheme for the backscatter communications used in wirelessly powered networks. This is the main motivation of this letter.

The main contributions of this letter are as follows. 1) We propose the prefix code based backscatter communication (CBBC) scheme, which supports directly delivering the original data blocks whose number of bit 0s is less than an encoding threshold, in addition to delivering codewords. To inform the receiver of whether a data block is backscattered using its codeword or not, we let the sender adopt two different data rates. 2) We formulate the energy consumption minimization problem to derive the codebook, in which only the blocks with the number of energy-consuming bits equal to or greater than the threshold are encoded. This leads to a smaller codebook size so that the codebook can be entirely stored in the nodes' memory. That is, we emphasize the number of energy-consuming bits in designing the codebook and only encode the blocks that bring energy savings. 3) We evaluate the CBBC experimentally on the wireless identification sensing platform (WISP) [5], a programmable, sensing and computationally enhanced platform compatible with GS1 RFID protocols [2]. The CBBC significantly extends the EEDDS and outperforms the EEDDS
in terms of the number of bits transmitted by the WISP with the same amount of energy.

The remainder of this letter is organized as follows. We outline the CBBC in Section II, design the prefix codebook in Section III, evaluate the performance of the CBBC in Section IV, and conclude this letter in Section V.

II. THE CBBC

The proposed CBBC consists of the procedures Proc_sender and Proc_receiver, which are applied in the sender and the receiver, respectively. The CBBC delivers $m$ bits from the sender to the receiver at a time. An $m$-bit block is transmitted using a codeword in the designed codebook if it contains $z$ or more bit 0s, where $z$ is a preset constant called the encoding threshold. Under the CBBC, the sender uses two different data rates $r_1$ and $r_2$ to transmit codewords and uncoded $m$-bit blocks, respectively. The sender and the receiver share the designed codebook, which will be derived in Section III.

The procedure Proc_sender contains the following steps:
Step 1. The sender sets the optimal $z$, denoted by $z^*$, which is determined by the optimization problem (OP) in (11).
Step 2. The sender divides the original data into multiple $m$-bit blocks.
Step 3. For each $m$-bit block, if the number of bit 0s it contains is equal to or greater than $z^*$, the sender searches for its corresponding codeword in the designed codebook, and then transmits the found codeword using data rate $r_1$. Otherwise, it transmits the original $m$-bit block using data rate $r_2$.

The procedure Proc_receiver is as follows. When the receiver receives a packet, it treats the received packet as a codeword if the packet is received with data rate $r_1$ and as an original $m$-bit block otherwise. For a received codeword, the receiver searches the codebook for the corresponding original $m$-bit block.

The CBBC is illustrated in Fig. 1. In the figure, the sender can deliver data either in the coding mode or in the non-coding mode. For instance, blocks 1 and 4 are directly delivered, whereas blocks 2, 3, and 5 are delivered via the corresponding codewords in the designed codebook.

Fig. 1. Block diagram of the proposed CBBC.

III. CODEBOOK DESIGN

We use $e_{t0}$ and $e_{t1}$ to represent the energy consumptions of transmitting bit 0 and bit 1, respectively. In addition, $e_{r0}$ and $e_{r1}$ stand for the energy consumptions of receiving bit 0 and bit 1, respectively. Considering that the ECD is a constant for a given backscatter communication, we use the constant $\alpha\,(>1)$ to reflect the ECD between transmitting/receiving bits 0 and 1 such that $e_{t0} = \alpha e_{t1}$ and $e_{r0} = \alpha e_{r1}$. Denoting the energy consumptions of delivering bits 0 and 1 by $\varepsilon_0$ and $\varepsilon_1$, respectively, we have $\varepsilon_0 = e_{t0} + e_{r0}$, $\varepsilon_1 = e_{t1} + e_{r1}$, and

$$\varepsilon_0 = \alpha\varepsilon_1. \tag{1}$$

For a given $m$, there are $2^m$ different $m$-bit data blocks, from which we choose and encode the ones whose number of bit 0s is equal to or greater than the encoding threshold $z$. Therefore, the number of codewords in the codebook is

$$N_c = \sum_{u=z}^{m}\binom{m}{u}. \tag{2}$$

Necessarily, there exists a one-to-one correspondence between the encoded $m$-bit blocks and the codewords in the codebook. Then, the number of $m$-bit blocks that are not encoded is $2^m - N_c$. Accordingly, the original $m$-bit blocks can be classified into encoded and uncoded ones. Assume the probabilities of bit 0 and bit 1 occurring in the original data are identical. Then, the probabilities of the sender transmitting a codeword and an uncoded $m$-bit block can be expressed as in (3) and (4), respectively.

$$P_c = \frac{N_c}{2^m} = \frac{1}{2^m}\sum_{u=z}^{m}\binom{m}{u}, \tag{3}$$

$$P_{uc} = 1 - P_c = 1 - \frac{1}{2^m}\sum_{u=z}^{m}\binom{m}{u}. \tag{4}$$

A. Energy Consumption of the Common Backscatter Communication

We first calculate the energy consumption for the common backscatter communication (ComBC). For a given $m$, there are $2^m$ $m$-bit blocks in total, in which the numbers of bit 0s and bit 1s are each equal to $(2^m m)/2 = m2^{m-1}$. Hence, the average energy consumption of the sender and the receiver in delivering one $m$-bit block under the ComBC is

$$E_{cbc} = \frac{m2^{m-1}\varepsilon_0 + m2^{m-1}\varepsilon_1}{2^m} = \frac{1}{2}m(1+\alpha)\varepsilon_1. \tag{5}$$

B. Energy Consumption of Delivering Uncoded Data Blocks Under the CBBC

Before deriving the codebook, we consider the energy consumption under the CBBC without using any codebook. With the CBBC, there are in total $2^m - N_c$ uncoded $m$-bit blocks
delivered without using the codebook. These uncoded blocks include $\sum_{u=1}^{z-1} u\binom{m}{u}$ bit 0s and $(2^m - N_c)m - \sum_{u=1}^{z-1} u\binom{m}{u}$ bit 1s. Thus, using (1), we obtain the average energy consumption of the sender and the receiver in delivering one $m$-bit block as follows

$$E_{uc} = \frac{\varepsilon_0\sum_{u=1}^{z-1} u\binom{m}{u} + \varepsilon_1\left[m(2^m - N_c) - \sum_{u=1}^{z-1} u\binom{m}{u}\right]}{2^m - N_c} = \varepsilon_1 m + \frac{\varepsilon_1(\alpha-1)}{2^m - N_c}\sum_{u=1}^{z-1} u\binom{m}{u}. \tag{6}$$

C. Average Energy Consumption of Delivering Encoded Data Blocks Under the CBBC

The CBBC applies an energy-efficient prefix codebook $C$ that includes $N_c$ prefix codewords. Here, prefix codewords are also called prefix-free codewords, which have the property that no codeword is a prefix of another one.

For a codeword $c_i \in C$, we use $|c_i|$ to represent the number of bits in $c_i$, such that the codeword can be expressed as $c_i = b_{i,1}b_{i,2}\cdots b_{i,|c_i|}$, where $b_{i,j} \in \{0, 1\}$, $i = 1, 2, \ldots, N_c$, $j = 1, 2, \ldots, |c_i|$. Thus, codeword $c_i$ has $n_{i,1} = \sum_{j=1}^{|c_i|} b_{i,j}$ bit 1s and $n_{i,0} = |c_i| - \sum_{j=1}^{|c_i|} b_{i,j}$ bit 0s. Hence, delivering $c_i$ causes the sender and the receiver to expend energy of $n_{i,1}\varepsilon_1 + n_{i,0}\varepsilon_0 = [|c_i|\alpha + (1-\alpha)n_{i,1}]\varepsilon_1$. As a result, the energy consumption per codeword is

$$E_c = \frac{\varepsilon_1}{N_c}\sum_{i=1}^{N_c}\left[|c_i|\alpha + (1-\alpha)\sum_{j=1}^{|c_i|} b_{i,j}\right]. \tag{7}$$

Therefore, the expected energy consumption in the CBBC using the prefix codebook is $E_{cbbc} = P_cE_c + P_{uc}E_{uc}$, i.e.,

$$E_{cbbc} = \frac{\varepsilon_1}{2^m N_c}\sum_{u=z}^{m}\binom{m}{u}\sum_{i=1}^{N_c}\left[|c_i|\alpha + (1-\alpha)\sum_{j=1}^{|c_i|} b_{i,j}\right] + \frac{2^m - \sum_{u=z}^{m}\binom{m}{u}}{2^m}\left[\varepsilon_1 m + \frac{\varepsilon_1(\alpha-1)}{2^m - N_c}\sum_{u=1}^{z-1} u\binom{m}{u}\right]. \tag{8}$$

Clearly, to benefit from using the codebook, we need

$$E_{cbbc} < E_{cbc}. \tag{9}$$

In the case when the original data blocks are delivered through codewords, each $m$-bit block corresponds to a unique one of the $N_c$ codewords in $C$. Hence, we obtain the average delivery time of an $m$-bit block as $\frac{1}{r_1}\left(\sum_{i=1}^{N_c}|c_i|/N_c\right)$, so that the average delivery time per bit is $\frac{1}{mr_1N_c}\sum_{i=1}^{N_c}|c_i|$, where $r_1$ is the data rate applied in the sender and the receiver. Additionally, in the case when the original data blocks are delivered directly (the blocks are uncoded), the delivery time per bit in the $m$-bit block is $1/r_2$. Hence, the average transmission time per bit in the original block under the CBBC is

$$t_b = P_c\frac{1}{mr_1N_c}\sum_{i=1}^{N_c}|c_i| + P_{uc}\frac{1}{r_2} = \frac{1}{r_2} + \frac{1}{2^m}\cdot\frac{1}{r_1m}\left(\sum_{i=1}^{N_c}|c_i| - \frac{r_1m}{r_2}N_c\right). \tag{10}$$

We use OP (11) to generate the prefix codebook $C$, where the notation $\oplus$ stands for the bitwise exclusive-OR (i.e., XOR) operation. In the OP, Constraint (11a) is obtained by substituting (5) and (8) into (9). Eq. (11b) reflects the prefix property, where $c_{i,j} = \min\{|c_i|, |c_j|\}$. Eq. (11c) is the Kraft inequality. Eqs. (11b) and (11c) are required in designing a prefix code [6]. Constraint (11d) comes from $t_b \le \theta/r$ using (10), where $\theta > 0$ is a preset constant and $r = \max\{r_1, r_2\}$. Here, $1/r$ represents the duration of a bit transmitted using the higher of the rates $r_1$ and $r_2$. Hence, Constraint (11d) limits the bit duration $t_b$ so that it does not exceed $\theta$ times the bit duration $1/r$. Aiming to accelerate the delivery of the original data over the link, we set $r_1 = r$ if $\sum_{i=1}^{N_c}|c_i| > m(2^m - N_c)$ (i.e., the number of bits in the codebook is larger than that in the uncoded blocks), and $r_2 = r$ otherwise. In Constraint (11e), $\hat{m}$ is the upper bound of $m$ that ensures the derived codebook can be stored in the nodes' restricted memory. In (11f), $z = 0$ causes all the $m$-bit data blocks to be encoded, and $z = m + 1$ causes no data block to be encoded.

$$\min\ E_{cbbc}\qquad \text{w.r.t.: } m,\ z,\ b_{i,k}\ (i = 1, 2, \ldots, N_c;\ k = 1, 2, \ldots, |c_i|) \tag{11}$$

s.t.:

$$\left\{\frac{1}{N_c}\sum_{i=1}^{N_c}\left[|c_i|\alpha + (1-\alpha)\sum_{j=1}^{|c_i|} b_{i,j}\right] - \frac{\alpha-1}{2^m - N_c}\sum_{u=1}^{z-1} u\binom{m}{u} - m\right\}\sum_{u=z}^{m}\binom{m}{u} + \frac{2^m(\alpha-1)}{2^m - N_c}\sum_{u=1}^{z-1} u\binom{m}{u} < 2^{m-1}m(\alpha-1); \tag{11a}$$

$$\sum_{k=1}^{c_{i,j}}\left(b_{i,k}\oplus b_{j,k}\right) \ne 0,\quad \forall i, j \in \{1, 2, \ldots, N_c\}\ (i \ne j); \tag{11b}$$

$$\sum_{i=1}^{N_c}\frac{1}{2^{|c_i|}} \le 1; \tag{11c}$$

$$\frac{1}{r_2} + \frac{1}{2^m}\cdot\frac{1}{r_1m}\left(\sum_{i=1}^{N_c}|c_i| - \frac{r_1m}{r_2}N_c\right) \le \frac{\theta}{r}; \tag{11d}$$

$$1 \le m \le \hat{m}; \tag{11e}$$

$$b_{i,k} \in \{0, 1\},\ i = 1, 2, \ldots, N_c;\ k = 1, 2, \ldots, |c_i|;\qquad z \in \{0, 1, \ldots, m+1\}. \tag{11f}$$

Noticing that a prefix codebook corresponds to a binary tree, we use the pruning and expanding (PEO) operation [6] on binary trees to find the solution of OP (11). It has been proved in [6] that the PEO has complexity $O(N_c^4)$ in the worst case. In solving the OP, the PEO is conducted for each pair of $z$ and $m$, where $z \in \{0, 1, \ldots, m+1\}$, $m \in \{1, 2, \ldots, \hat{m}\}$. That is, the number of times the PEO is conducted is $(m+2)\hat{m} \le \hat{m}(\hat{m}+2)$. Thus, the complexity of solving the OP is $O(\hat{m}^2N_c^4)$. This complexity is acceptable because $\hat{m}$ is usually very small in practice and the OP can be solved offline without a strict time constraint.
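The energy model in (2) and (5)-(8), together with the prefix and Kraft conditions behind (11b) and (11c), can be checked numerically with a short script. The following is a minimal sketch rather than the authors' implementation: the function names, the representation of codewords as '0'/'1' strings, and the five-codeword example codebook are illustrative assumptions, and the codebook is hand-picked instead of being produced by the PEO search.

```python
from math import comb, isclose

def n_codewords(m, z):
    """Eq. (2): number of m-bit blocks containing at least z zeros."""
    return sum(comb(m, u) for u in range(z, m + 1))

def e_combc(m, alpha, eps1=1.0):
    """Eq. (5): average energy per m-bit block without any coding."""
    return 0.5 * m * (1 + alpha) * eps1

def e_uncoded(m, z, alpha, eps1=1.0):
    """Eq. (6): average energy of a block sent uncoded (fewer than z zeros)."""
    n_uc = 2**m - n_codewords(m, z)
    if n_uc == 0:  # z = 0: every block is encoded, nothing is sent uncoded
        return 0.0
    zeros = sum(u * comb(m, u) for u in range(1, z))
    return eps1 * m + eps1 * (alpha - 1) * zeros / n_uc

def e_codeword(codebook, alpha, eps1=1.0):
    """Eq. (7): average energy per codeword."""
    if not codebook:  # z = m + 1: no block is encoded
        return 0.0
    total = sum(len(c) * alpha + (1 - alpha) * c.count("1") for c in codebook)
    return eps1 * total / len(codebook)

def e_cbbc(m, z, codebook, alpha, eps1=1.0):
    """Eq. (8): E_cbbc = P_c * E_c + P_uc * E_uc."""
    p_c = n_codewords(m, z) / 2**m
    return (p_c * e_codeword(codebook, alpha, eps1)
            + (1 - p_c) * e_uncoded(m, z, alpha, eps1))

def is_prefix_free(codebook):
    """Prefix property: after sorting, a prefix is always adjacent to a word it prefixes."""
    words = sorted(codebook)
    return all(not b.startswith(a) for a, b in zip(words, words[1:]))

def kraft_sum(codebook):
    """Left-hand side of the Kraft inequality (11c)."""
    return sum(2.0 ** -len(c) for c in codebook)

# Hypothetical example: m = 4, z = 3, alpha = 1.64 (the measured value in Section IV).
m, z, alpha = 4, 3, 1.64
codebook = ["11", "10", "01", "001", "000"]  # hand-picked, N_c = 5 codewords
assert len(codebook) == n_codewords(m, z)
print(is_prefix_free(codebook), kraft_sum(codebook))
print(e_cbbc(m, z, codebook, alpha), e_combc(m, alpha))
```

For this toy setting, the coded scheme costs less energy per block than the ComBC, so condition (9) holds. A real codebook would additionally have to satisfy the delay constraint (11d) and the memory bound (11e).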
We note that the codebook designed in [4], which is applied in the EEDDS, consists of equal-length codewords. Obviously, an equal-length codebook is a special prefix codebook, since its codewords satisfy the conditions of prefix codewords. Therefore, the prefix code based CBBC outperforms the EEDDS presented in [4]. This is validated by simulations, but we omit the simulation results due to space limitations.

IV. PERFORMANCE EVALUATION

We conduct the experiments on two WISPs, in which one WISP backscatters its data to the other. To evaluate the performance of the CBBC, in each experiment, the capacitor in the WISP is fully charged (i.e., its voltage reaches 4.25 V) and then the WISP transmits to the other WISP until the capacitor exhausts its energy (i.e., its voltage drops to 2.0 V). We focus on the effect of the CBBC on the sender. From the experiments, we obtain $\alpha = 1.64$, and thus we set $\varepsilon_1 = 1$ unit of energy (UoE) and $\varepsilon_0 = 1.64$ UoE. All the experimental results shown below are the average of 10 experiments.

To compare the CBBC with the ComBC, we choose the percentage of increased number of transmitted bits (PoINTB) and the percentage of reduced throughput (PoRT) as metrics. Here, the PoINTB is defined as the ratio $(a_1 - a_2)/a_2$, where $a_1$ and $a_2$ stand for the numbers of bits transmitted with the CBBC and the ComBC, respectively; the PoRT is similarly defined.

First, we apply data rates of 1 and 3 kbps in the CBBC. For a fair comparison, in the ComBC we let the sender transmit with the maximum data rate $r = 3$ (in kbps). The impact of $\theta$ on PoINTB and PoRT is shown in Fig. 2. From the upper part of the figure, we have the following two observations. Firstly, the CBBC achieves a PoINTB higher than 20% when $\theta \ge 1.2$. That is, with the same amount of energy, the CBBC can deliver over 20% more bits than the ComBC. This observation indicates that the proposed CBBC consumes less energy than the ComBC. Secondly, the number of transmitted bits increases as $\theta$ grows. In other words, more bits can be delivered when the limitation on the bit duration (i.e., $t_b$) is relaxed. The lower part of the figure illustrates that the gain in the number of transmitted bits comes at the cost of throughput. The reason is that the CBBC needs to apply two different data rates, and the lower rate leads to throughput reduction.

Fig. 2. PoINTB and PoRT vs $\theta$ when $\hat{m} = 12$.

Next, we consider the impact of data rates on PoINTB and PoRT by using the data rate pair $(r, r/3)$. We set the higher data rate $r = 1.0, 1.5, \ldots, 3.0$. The experimental results are shown in Fig. 3. The upper part of the figure indicates that PoINTB increases slightly as $r$ grows, i.e., a higher rate brings a higher number of transmitted bits, while the lower part illustrates that PoRT remains nearly constant (i.e., the values differ by less than 4%). This is because the ratio of the two applied data rates, i.e., $r_1/r_2$, remains unchanged when the higher rate $r$ varies. Additionally, we experiment by varying $\hat{m}$, which leads to the observation that too large an $\hat{m}$ does not bring more energy savings (e.g., no more energy saving is gained when $\hat{m} > 4$ for $r = 3$ kbps).

Fig. 3. PoINTB and PoRT vs $r$ when $\hat{m} = 12$ and $\theta = 1.6$.

In summary, the proposed CBBC can improve energy efficiency in backscatter communications, which is measured by the number of bits transmitted with the same amount of energy consumption.

V. CONCLUSION

Backscatter communications are widely used in wirelessly powered networks. The proposed CBBC makes use of the ECD in backscattering bits 0 and 1 so as to save energy by delivering codewords instead of the original data blocks that contain a considerable number of energy-consuming bits. The CBBC can be applied in nodes with restricted memory. It is well suited to the popular RFID systems. In the future, we will study how to apply the CBBC in WISPCam, a battery-free RFID camera, to backscatter images energy-efficiently.

REFERENCES
[1] G. Yang, Y.-C. Liang, R. Zhang, and Y. Pei, “Modulation in the air: Backscatter communication over ambient OFDM carrier,” IEEE Trans. Commun., vol. 66, no. 3, pp. 1219–1233, Mar. 2018.
[2] GS1. EPC™ Radio-Frequency Identity Protocols Generation-2 UHF RFID Standard. Accessed: Oct. 1, 2018. [Online]. Available: https://www.gs1.org/standards/epc-rfid/uhf-air-interface-protocol/2-0-1.
[3] G. Zhu, S.-W. Ko, and K. Huang, “Inference from randomized transmissions by many backscatter sensors,” IEEE Trans. Wireless Commun., vol. 17, no. 5, pp. 3111–3127, May 2018.
[4] Y.-H. Zhu, E. Li, and K. Chi, “Encoding scheme to reduce energy consumption of delivering data in radio frequency powered battery-free wireless sensor networks,” IEEE Trans. Veh. Technol., vol. 67, no. 4, pp. 3085–3097, Apr. 2018.
[5] A. P. Sample, D. J. Yeager, P. S. Powledge, A. V. Mamishev, and J. R. Smith, “Design of an RFID-based battery-free programmable sensing platform,” IEEE Trans. Instrum. Meas., vol. 57, no. 11, pp. 2608–2615, Nov. 2008.
[6] K. Chi, Y.-H. Zhu, X. Jiang, and V. C. M. Leung, “Energy-efficient prefix-free codes for wireless nano-sensor networks using OOK modulation,” IEEE Trans. Wireless Commun., vol. 13, no. 5, pp. 2670–2682, May 2014.
Outage Constrained Robust Multigroup Multicast Beamforming for Multi-Beam Satellite Communication Systems

Li You, Member, IEEE, Ao Liu, Wenjin Wang, Member, IEEE, and Xiqi Gao, Fellow, IEEE

Abstract—We investigate outage constrained robust multigroup multicast beamforming for multi-beam satellite communication systems with full frequency reuse. Based on a satellite downlink beam domain channel model with channel phase uncertainty taken into account, we first investigate robust multigroup multicast beamforming with the aim to maximize the worst-case outage signal-to-interference-plus-noise ratio under the outage and the per-beam power constraints. We then cast the outage constrained robust beamforming design into the convex optimization framework with some approximation techniques. Simulation results show that the proposed robust multigroup multicast beamformer can provide significant performance gains in terms of multicast rate and outage probability over the conventional approach.

Index Terms—Multi-beam satellite communication systems, robust transmission, multigroup multicast beamforming, outage probability, channel state information (CSI).

Manuscript received August 30, 2018; accepted September 25, 2018. Date of publication October 1, 2018; date of current version April 9, 2019. This work was supported in part by the National Natural Science Foundation of China under Grant 61320106003, Grant 61761136016, Grant 61801114, Grant 61471113, Grant 61631018, and Grant 61521061, in part by the Natural Science Foundation of Jiangsu Province under Grant BK20170688, in part by the National Science and Technology Major Project of China under Grant 2017ZX03001002-004, and in part by the Civil Aerospace Technologies Research Project under Grant D010109. The associate editor coordinating the review of this paper and approving it for publication was J. Coon. (Corresponding author: Wenjin Wang.)
The authors are with the National Mobile Communications Research Laboratory, Southeast University, Nanjing 210096, China (e-mail: liyou@seu.edu.cn; ao_liu@seu.edu.cn; wangwj@seu.edu.cn; xqgao@seu.edu.cn).
Digital Object Identifier 10.1109/LWC.2018.2872710

I. INTRODUCTION

Multi-beam satellite communication has received extensive research interest recently due to its potential to increase the satellite transmission rate and provide seamless connectivity in a wide coverage area [1]. Meanwhile, aggressive spectrum reuse among beams is desirable to achieve high transmission rates in multi-beam satellite communication systems [2], [3]. As a result, interference mitigation techniques become mandatory in multi-beam satellite systems with aggressive frequency reuse to reduce the inter-beam interference. Inter-beam interference mitigation can be performed at either the satellite side or the user side. In this letter, we focus on investigating linear beamforming performed at the satellite side, as it can effectively manage the inter-beam interference with a relatively low complexity.

For beamforming design in multi-beam satellite communication systems, several practical issues should be taken into account. Firstly, one beamformer is applied to several users in the same frame to cope with the framing structure of the current satellite standards, e.g., DVB-S2 [4] and DVB-S2X [5], which casts the beamforming design in multi-beam satellite communication systems into a multigroup multicast beamforming optimization problem. In addition, the channel phase uncertainty due to, e.g., the long propagation delays in satellite communication systems [6] motivates a robust beamforming design. Moreover, the per-beam power constraints should be taken into account due to the limitation of on-board inter-beam power sharing.

Motivated by the above practical issues, we investigate robust multigroup multicast beamforming for multi-beam satellite communications in this letter. Most of the previous works on multigroup multicast precoding, see [7], usually assumed perfect channel state information at the transmitter (CSIT), which, however, is difficult to obtain in practical satellite communication systems. For the imperfect CSIT case, the expectation-based robust precoding designs were investigated in unicast [8] and multicast [9] multi-beam satellite transmissions, respectively. In addition, the outage constrained robust precoding designs were investigated in unicast [6] and single-group multicast [10] transmissions, respectively. Satellite relaying systems were investigated in [17] and [18].

In this letter, we investigate outage-based robust multigroup multicast beamforming for multi-beam satellite communication systems. Based on a satellite downlink beam domain channel model with the channel phase uncertainty taken into account, we first investigate outage constrained robust multigroup multicast beamforming subject to the per-beam power constraints. We further cast the outage constrained robust beamforming design into the convex optimization framework with some reformulation. Simulation results demonstrate the performance gains of our proposed robust approach over the conventional approach.

II. SATELLITE CHANNEL MODEL

We consider a multi-beam satellite communication system with full frequency reuse, where $N_t$ beams are utilized to simultaneously serve $N_u$ single-antenna users. Consider the multigroup multicast transmission where the number of multicast clusters is $K = N_t$ and each user belongs to only one cluster [11]. We denote $\mathcal{K} = \{1, 2, \ldots, K\}$ as the total cluster set and $\mathcal{B} = \{1, 2, \ldots, N_t\}$ as the total beam set. Denote $\mathcal{U}_k$ as the $k$th multicast cluster. We focus on the signal model in the beam domain. The signal received by the $i$th user in cluster $\mathcal{U}_k$ can be expressed as

$$y_i = \mathbf{h}_i^H\mathbf{w}_k s_k + \sum_{\ell \ne k}\mathbf{h}_i^H\mathbf{w}_\ell s_\ell + n_i,\quad i \in \mathcal{U}_k, \tag{1}$$

where $\mathbf{h}_i \in \mathbb{C}^{N_t \times 1}$ is the downlink beam domain channel vector from the $N_t$ beams to the $i$th user, $\mathbf{w}_k \in \mathbb{C}^{N_t \times 1}$ is the beam domain precoder for cluster $\mathcal{U}_k$, $s_k$ is the signal intended for users in cluster $\mathcal{U}_k$ with unit power, and $n_i \sim \mathcal{CN}(0, N_0)$ is the additive noise. Note that the radiation power from the $n$th beam is given by $\left[\sum_{k=1}^{K}\mathbf{w}_k\mathbf{w}_k^H\right]_{n,n}$.
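To make the signal model (1) and the per-beam radiation power concrete, the following minimal sketch evaluates both for a toy two-beam, two-cluster setting. It assumes unit-power symbols, so that the useful and interfering powers read off (1) are $|\mathbf{h}_i^H\mathbf{w}_k|^2$ and $\sum_{\ell \ne k}|\mathbf{h}_i^H\mathbf{w}_\ell|^2$. The function names and all numerical values are hypothetical and chosen only for illustration.

```python
from math import isclose

def per_beam_power(precoders):
    """Diagonal of sum_k w_k w_k^H: beam n radiates sum_k |w_k[n]|^2."""
    n_beams = len(precoders[0])
    return [sum(abs(w[n]) ** 2 for w in precoders) for n in range(n_beams)]

def inner(h, w):
    """h^H w for two complex vectors stored as Python lists."""
    return sum(hn.conjugate() * wn for hn, wn in zip(h, w))

def sinr(h, precoders, k, noise_power):
    """Ratio of useful power to interference plus noise for a user in cluster k."""
    signal = abs(inner(h, precoders[k])) ** 2
    interference = sum(abs(inner(h, w)) ** 2
                       for l, w in enumerate(precoders) if l != k)
    return signal / (interference + noise_power)

# Hypothetical toy setting: N_t = K = 2, one user in cluster 0.
precoders = [[1 + 0j, 0j], [0j, 1j]]   # w_0 and w_1 (illustrative values)
h_user = [1 + 0j, 0.5j]                # channel of the user (illustrative)
N0 = 0.25                              # noise power (illustrative)

print(per_beam_power(precoders))       # power radiated by each beam
print(sinr(h_user, precoders, 0, N0))  # SINR of the user in cluster 0
```

In this example each beam radiates unit power, and the user sees a useful power of 1 against an interference power of 0.25 and a noise power of 0.25.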
YOU et al.: OUTAGE CONSTRAINED ROBUST MULTIGROUP MULTICAST BEAMFORMING FOR MULTI-BEAM SATELLITE COMMUNICATION SYSTEMS 353
The downlink channel vector between the satellite and the $i$th user in the beam domain can be modeled as [2], [8]

$$\mathbf{h}_i = \psi_i\,\mathbf{b}_i^{\frac{1}{2}} \odot \mathbf{r}_i^{\frac{1}{2}} \odot \exp\{j\boldsymbol{\theta}_i\}, \tag{2}$$

where $\odot$ denotes the Hadamard product, $\psi_i$ is the large scale fading coefficient, $\mathbf{b}_i$ denotes the far-field beam radiation pattern [12], $\mathbf{r}_i$ represents the rain attenuation with elements obeying the lognormal distribution [13], and $\boldsymbol{\theta}_i$ denotes the channel phase vector with elements independently and uniformly distributed between 0 and $2\pi$ [3].

For the $i$th user, the channel vector is estimated at instant $t_0$ and fed back to the gateway. Then the CSIT is used at instant $t_1$ after the propagation delays plus the processing delays [6], [14]. Due to the temporal invariance of the amplitude, we model the channel phase at $t_1$ as follows

$$\boldsymbol{\theta}_i(t_1) = \boldsymbol{\theta}_i(t_0) + \mathbf{e}_i, \tag{3}$$

where $\mathbf{e}_i \sim \mathcal{N}(\mathbf{0}, \sigma_i^2\mathbf{I})$ is the channel phase error and $\sigma_i^2$ is the variance of the phase error vector [8], [14]. We denote the estimated channel at $t_0$ and the actual channel at $t_1$ by $\hat{\mathbf{h}}_i$ and $\mathbf{h}_i$, respectively. Then $\mathbf{h}_i$ can be modeled as [6]

$$\mathbf{h}_i = \hat{\mathbf{h}}_i \odot \mathbf{q}_i = \mathrm{diag}(\hat{\mathbf{h}}_i)\,\mathbf{q}_i, \tag{4}$$

where $\mathbf{q}_i \triangleq \exp\{j\mathbf{e}_i\}$. Let $\mathbf{Q}_i \triangleq \mathbf{q}_i\mathbf{q}_i^H$, then the long term correlation matrix of $\mathbf{q}_i$ can be expressed as

$$\mathbf{A}_i = E\{\mathbf{q}_i\mathbf{q}_i^H\} = E\{\mathbf{Q}_i\}. \tag{5}$$

It is not difficult to obtain the elements of $\mathbf{A}_i$ as follows [9]

$$[\mathbf{A}_i]_{m,n} = \begin{cases} 1, & m = n, \\ \exp\{-\sigma_i^2\}, & \text{otherwise.} \end{cases} \tag{6}$$

III. ROBUST MULTIGROUP MULTICAST BEAMFORMING

In this section, we investigate outage constrained robust multigroup multicast beamforming for frame-based satellite communication systems. From the signal model in (1), the signal-to-interference-plus-noise ratio (SINR) at the $i$th user in cluster $\mathcal{U}_k$ can be represented as [6]

$$\mathrm{SINR}_i \triangleq \frac{\mathbf{h}_i^H\mathbf{W}_k\mathbf{h}_i}{\sum_{\ell \ne k}\mathbf{h}_i^H\mathbf{W}_\ell\mathbf{h}_i + N_0},\quad \forall i \in \mathcal{U}_k,\ k, \ell \in \mathcal{K}, \tag{7}$$

where $\mathbf{W}_k \triangleq \mathbf{w}_k\mathbf{w}_k^H$.

Note that for the imperfect CSI case, it is difficult to design a beamformer that guarantees a target SINR all the time due to the channel uncertainty. Thus, we are interested in the robust beamforming design, and we aim to maximize the worst-case outage SINR of all users with high probability, which can be formulated as the following problem

$$\begin{aligned} \mathcal{F}:\ \max_{\{\mathbf{W}_k\}_{k=1}^{K}}\ \min_i\ & \gamma_i \\ \text{s.t.}\ & p_i \triangleq \Pr\{\mathrm{SINR}_i \ge \gamma_i\} \ge \alpha_i, \\ & \left[\sum_{k=1}^{K}\mathbf{W}_k\right]_{n,n} \le P_n,\ \forall n \in \mathcal{B}, \\ & \mathbf{W}_k \succeq \mathbf{0},\ \mathrm{rank}(\mathbf{W}_k) = 1,\ \forall k \in \mathcal{K}. \end{aligned} \tag{8}$$

The outage constrained robust multigroup multicast beamforming design formulated in (8) takes several practical issues in multi-beam satellite communication systems into account. Firstly, the multigroup multicast beamforming formulation naturally embraces the framing structure of the existing satellite communication standards [4], [5]. The per-beam power constraints are also taken into account in the problem formulation. In addition, a feasible solution to the problem $\mathcal{F}$ can guarantee the quality of service of all users under the channel uncertainty, which is practically meaningful for satellite communications.

The problem $\mathcal{F}$ is in general difficult to handle. We first decompose the problem $\mathcal{F}$ into a sequence of outage constrained power minimization problems. In particular, for the predetermined SINR thresholds $\{\gamma_i\}_i$, the outage constrained power minimization robust multigroup multicast beamforming design can be formulated as follows

$$\begin{aligned} \mathcal{Q}:\ \min_{\{\mathbf{W}_k\}_{k=1}^{K}}\ & r \\ \text{s.t.}\ & p_i \triangleq \Pr\{\mathrm{SINR}_i \ge \gamma_i\} \ge \alpha_i, \\ & \frac{1}{P_n}\left[\sum_{k=1}^{K}\mathbf{W}_k\right]_{n,n} \le r,\ \forall n \in \mathcal{B}, \\ & \mathbf{W}_k \succeq \mathbf{0},\ \mathrm{rank}(\mathbf{W}_k) = 1,\ \forall k \in \mathcal{K}. \end{aligned} \tag{9}$$

Note that the feasible set of the problem $\mathcal{Q}$ shrinks as $\min_i\{\gamma_i\}$ increases. Then, the optimum objective value of $\mathcal{Q}$ is monotonically non-decreasing in $\min_i\{\gamma_i\}$. Thus, using a classic bisection search approach [15], the problem $\mathcal{F}$ can be solved via iteratively solving the problem $\mathcal{Q}$ with different SINR thresholds $\{\gamma_i\}_i$. In the following, we will focus on the outage constrained power minimization robust beamforming design problem $\mathcal{Q}$.

We first investigate the expression of the non-outage probability $p_i = \Pr\{\mathrm{SINR}_i \ge \gamma_i\}$. Define the auxiliary variables $\mathbf{Z}_k \triangleq \mathbf{W}_k - \gamma_i\sum_{\ell \ne k}\mathbf{W}_\ell$ and $\mathbf{R}_i \triangleq \mathbf{h}_i\mathbf{h}_i^H = \mathrm{diag}(\hat{\mathbf{h}}_i)\,\mathbf{Q}_i\,\mathrm{diag}(\hat{\mathbf{h}}_i)^H$, then the non-outage probability $p_i$ in (9) can be expressed as

$$p_i = \Pr\left\{\mathrm{Tr}(\mathbf{R}_i\mathbf{W}_k) \ge \gamma_i\sum_{\ell \ne k}\mathrm{Tr}(\mathbf{R}_i\mathbf{W}_\ell) + \gamma_iN_0\right\} = \Pr\{\mathrm{Tr}(\mathbf{R}_i\mathbf{Z}_k) \ge \gamma_iN_0\}. \tag{10}$$

Define $x_i \triangleq \mathrm{Tr}(\mathbf{R}_i\mathbf{Z}_k)$, and $x_i$ can be rewritten as

$$x_i = \mathrm{Tr}\left(\mathrm{diag}(\hat{\mathbf{h}}_i)\,\mathbf{Q}_i\,\mathrm{diag}(\hat{\mathbf{h}}_i^H)\,\mathbf{Z}_k\right) = \mathrm{Tr}(\mathbf{C}_k\mathbf{Q}_i), \tag{11}$$

where $\mathbf{C}_k = \mathrm{diag}(\hat{\mathbf{h}}_i)\,\mathbf{Z}_k\,\mathrm{diag}(\hat{\mathbf{h}}_i^H)$. We can observe from (11) that $x_i$ is real-valued and is the sum of statistically independent random variables. Thus, by the central limit theorem, $x_i$ can be approximated by a real-valued Gaussian distribution when $N_t$ is sufficiently large. The mean and variance of the real-valued random variable $x_i$ can be obtained as follows

$$\mu_i = E\{x_i\} = \mathrm{Tr}(\mathbf{C}_kE\{\mathbf{Q}_i\}) = \mathrm{Tr}(\mathbf{C}_k\mathbf{A}_i), \tag{12}$$

$v_i^2 = E\{(\mathrm{Tr}(\mathbf{C}_k\mathbf{Q}_i))^2\} - \mu_i^2$
= vecH (CH H H 2
k )E{vec(Qi )vec (Qi )}vec(Ck ) − μi
where αi is the non-outage probability threshold for user i H
and Pn is the power budget of the nth beam. We focus on the = vecH (CH ∗ ∗ H 2
k )E{(qi ⊗ qi )(qi ⊗ qi ) }vec(Ck ) − μi
case where αi > 0.5, which is of more practical interest [6]. = vecH (CH T H 2
k )E{Qi ⊗ Qi }vec(Ck ) − μi , (13)
where $\mathrm{vec}(\cdot)$ denotes the vectorization operation and $\otimes$ denotes the Kronecker product.

Define $\mathbf{G}_i \triangleq E\{\mathbf{Q}_i^T \otimes \mathbf{Q}_i\}$, it is not difficult to show that the $(m, n)$th element of $\mathbf{G}_i$ with $m = (m_1 - 1)K + m_2$ and $n = (n_1 - 1)K + n_2$ is given by

$$[\mathbf{G}_i]_{m,n} = \begin{cases} 1, & m_1 = n_1 \text{ and } m_2 = n_2, \\ \exp(-2\sigma_i^2), & m_1 \neq n_1 \text{ and } m_2 \neq n_2, \\ \exp(-\sigma_i^2), & \text{otherwise.} \end{cases}$$

With the above derivations, we can then obtain the expression of the non-outage probability $p_i = \Pr\{x_i \ge \gamma_i N_0\}$ as follows

$$p_i = \frac{1}{2} + \frac{1}{2}\,\mathrm{erf}\!\left(\frac{\mu_i - \gamma_i N_0}{\sqrt{2}\,v_i}\right), \quad \text{for } \gamma_i N_0 \le \mu_i, \tag{14}$$

where $\mathrm{erf}(\cdot)$ is the Gaussian error function. Note that in the evaluation of the non-outage probability, we only focus on the case where $p_i > 0.5$, which is of interest for practical satellite communication links.

From (12), (13) and (14), the outage constraint $p_i \ge \alpha_i$ can be rewritten as

$$\big\|\mathbf{G}_i^{1/2}\mathrm{vec}(\mathbf{C}_k^H)\big\|_2^2 \le \frac{1}{b_i^2}\left(\sqrt{b_i^2 + 1}\,\mathrm{Tr}(\mathbf{C}_k\mathbf{A}_i) - \frac{a_i}{\sqrt{b_i^2 + 1}}\right)^2 + \frac{a_i^2}{b_i^2 + 1}, \tag{15}$$

where $a_i = \gamma_i N_0$ and $b_i = \sqrt{2}\,\mathrm{erf}^{-1}(2\alpha_i - 1)$. Note that (15) can be transformed into a convex constraint when we drop the term $a_i^2/(b_i^2 + 1)$. Then problem $\mathcal{Q}$ can be reformulated as

$$\begin{aligned} \mathcal{Q}_s:\ \min_{\{\mathbf{W}_k\}_{k=1}^{K}}\ & r \\ \text{s.t.}\ & \big\|\mathbf{G}_i^{1/2}\mathrm{vec}(\mathbf{C}_k^H)\big\|_2 \le \frac{1}{b_i}\left(\sqrt{b_i^2 + 1}\,\mathrm{Tr}(\mathbf{C}_k\mathbf{A}_i) - \frac{a_i}{\sqrt{b_i^2 + 1}}\right), \\ & \frac{1}{P_n}\Big[\sum_{k=1}^{K}\mathbf{W}_k\Big]_{n,n} \le r,\ \forall n \in \mathcal{B}, \\ & \mathbf{W}_k \succeq \mathbf{0},\ \mathrm{rank}(\mathbf{W}_k) = 1,\ \forall k \in \mathcal{K}. \end{aligned} \tag{16}$$

As the term $a_i^2/(b_i^2 + 1) > 0$, then the feasible set of the transformed problem $\mathcal{Q}_s$ is a subset of the beamformer set that satisfies (15).

Using a semidefinite relaxation technique [16], we relax the non-convex rank-one constraint, and then the problem $\mathcal{Q}_s$ can be rewritten as follows

$$\begin{aligned} \mathcal{Q}_f:\ \min_{\{\mathbf{W}_k\}_{k=1}^{K}}\ & r \\ \text{s.t.}\ & \big\|\mathbf{G}_i^{1/2}\mathrm{vec}(\mathbf{C}_k^H)\big\|_2 \le \frac{1}{b_i}\left(\sqrt{b_i^2 + 1}\,\mathrm{Tr}(\mathbf{C}_k\mathbf{A}_i) - \frac{a_i}{\sqrt{b_i^2 + 1}}\right), \\ & \frac{1}{P_n}\Big[\sum_{k=1}^{K}\mathbf{W}_k\Big]_{n,n} \le r,\ \forall n \in \mathcal{B}, \\ & \mathbf{W}_k \succeq \mathbf{0},\ \forall k \in \mathcal{K}. \end{aligned} \tag{17}$$

Problem $\mathcal{Q}_f$ is a convex optimization problem with a linear objective and second-order cone and semidefinite programming constraints and thus can be efficiently solved [15]. We observe from extensive simulation that the problem $\mathcal{Q}_f$ formulated in (17) yields rank-one solutions in most of the cases. For other cases, classic randomization approaches [16] can be adopted to address the rank issue.

IV. SIMULATION RESULTS

We present simulation results to illustrate the performance of the proposed outage constrained robust multigroup multicast beamforming for frame-based multi-beam satellite communication systems. Conventional approach which adopts the outdated CSI directly as the true CSI is also considered for performance comparison [6]. The simulation results are based on $10^6$ channel realizations.

The simulation setup is presented as follows. We adopt the channel model presented in Section II, and detailed parameter values are listed in Table I. Assume that the channel phase error variances for different users are identical and given by $\sigma_i^2 = \sigma^2$. The non-outage probability and SINR thresholds for all users are set to be equal, i.e., $\alpha_i = \alpha$, $\gamma_i = \gamma_{th}$. Assume that there are 7 beams covering the target area in which a total of 35 users are uniformly distributed.

TABLE I. SIMULATION SETUP PARAMETERS

We first evaluate the achievable sum multicast rate given by

$$R_{\mathrm{sum}} = \sum_{k=1}^{K} N_k \cdot r_k, \tag{18}$$

where $N_k$ is the number of users in cluster $k$, and $r_k$ is the multicast rate in cluster $k$ given by

$$r_k = E\Big\{\log_2\big(1 + \min_{i \in \mathcal{U}_k}\mathrm{SINR}_i\big)\Big\}. \tag{19}$$

We can observe from Fig. 1 that the proposed outage constrained robust approach shows multicast rate performance gains over the conventional approach. In addition, the performance gain becomes larger as the non-outage probability threshold $\alpha$ increases or the SINR threshold $\gamma_{th}$
YOU et al.: OUTAGE CONSTRAINED ROBUST MULTIGROUP MULTICAST BEAMFORMING FOR MULTI-BEAM SATELLITE COMMUNICATION SYSTEMS 355
increases. For the case with $\alpha = 0.8$ and $\gamma_{th} = -4$ dB, the proposed robust approach can provide approximately 46% multicast rate gain over the conventional approach.

Fig. 1. Comparison of the achievable sum multicast rate between the proposed robust and conventional beamforming approaches. Results are shown versus the SINR threshold $\gamma_{th}$ for different values of non-outage probability threshold $\alpha$ with $\sigma^2 = 30^\circ$.

In Fig. 2, the outage performance of the proposed robust and conventional approaches are depicted. We can observe that the proposed robust approach can provide significant reduction in outage probability than the conventional beamforming approach.1 In particular, for the case with $\alpha = 0.8$ and $\gamma_{th} = -3$ dB, the proposed robust approach can provide approximately 65% outage performance gain over the conventional approach.

Fig. 2. Comparison of the outage probability between the proposed robust and conventional beamforming approaches. Results are shown versus the SINR threshold $\gamma_{th}$ for different values of non-outage probability threshold $\alpha$ with $\sigma^2 = 30^\circ$.

1 Note that for larger values of $\gamma_{th}$, infeasibility might arise in the considered problem due to, e.g., the channel uncertainty. For these scenarios, joint design of beamforming and admission control should be considered.

V. CONCLUSION

In this letter, we have investigated outage constrained robust multigroup multicast beamforming for multi-beam satellite communication systems. With the channel phase uncertainty taken into account, we formulated a robust beamformer design with the aim to maximize the worst-case outage SINR of all users subject to the outage and the per-beam power constraints. We then reformulated the outage SINR maximization robust beamforming problem into a convex optimization problem. Simulation results demonstrated that the proposed outage constrained robust beamforming approach can provide significant performance gains in multicast rate and outage probability over the conventional approach.

REFERENCES

[1] M. Á. Vázquez et al., “Precoding in multibeam satellite communications: Present and future challenges,” IEEE Wireless Commun., vol. 23, no. 6, pp. 88–95, Dec. 2016.
[2] G. Zheng, S. Chatzinotas, and B. Ottersten, “Generic optimization of linear precoding in multibeam satellite systems,” IEEE Trans. Wireless Commun., vol. 11, no. 6, pp. 2308–2320, Jun. 2012.
[3] V. Joroughi, M. Á. Vázquez, and A. Pérez-Neira, “Generalized multicast multibeam precoding for satellite communications,” IEEE Trans. Wireless Commun., vol. 16, no. 2, pp. 952–966, Feb. 2017.
[4] Digital Video Broadcasting (DVB); Second Generation Framing Structure, Channel Coding and Modulation Systems for Broadcasting, Interactive Services, News Gathering and Other Broad-Band Satellite Applications (DVB-S2), V1.4.1, ETSI Standard EN 302 307-1, Nov. 2014.
[5] Digital Video Broadcasting (DVB); Second Generation Framing Structure, Channel Coding and Modulation Systems for Broadcasting, Interactive Services, News Gathering and Other Broad-Band Satellite Applications. Part 2: DVB-S2 Extensions (DVB-S2X), V1.1.1, ETSI Standard EN 302 307-2, Oct. 2014.
[6] A. Gharanjik, M. R. B. Shankar, P.-D. Arapoglou, M. Bengtsson, and B. Ottersten, “Robust precoding design for multibeam downlink satellite channel with phase uncertainty,” in Proc. IEEE ICASSP, Brisbane, QLD, Australia, Apr. 2015, pp. 3083–3087.
[7] D. Christopoulos, S. Chatzinotas, and B. Ottersten, “Weighted fair multicast multigroup beamforming under per-antenna power constraints,” IEEE Trans. Signal Process., vol. 62, no. 19, pp. 5132–5142, Oct. 2014.
[8] A. Gharanjik, M. R. B. Shankar, P.-D. Arapoglou, M. Bengtsson, and B. Ottersten, “Precoding design and user selection for multibeam satellite channels,” in Proc. IEEE SPAWC, Stockholm, Sweden, Jun. 2015, pp. 420–424.
[9] W. Wang et al., “Robust multigroup multicast transmission for frame-based multi-beam satellite systems,” IEEE Access, vol. 6, pp. 46074–46083, 2018.
[10] M.-C. Yue, S. X. Wu, and A. M.-C. So, “A robust design for MISO physical-layer multicasting over line-of-sight channels,” IEEE Signal Process. Lett., vol. 23, no. 7, pp. 939–943, Jul. 2016.
[11] D. Christopoulos, S. Chatzinotas, and B. Ottersten, “Multicast multigroup precoding and user scheduling for frame-based satellite communications,” IEEE Trans. Wireless Commun., vol. 14, no. 9, pp. 4695–4707, Sep. 2015.
[12] C. Caini, G. E. Corazza, G. Falciasecca, M. Ruggieri, and F. Vatalaro, “A spectrum- and power-efficient EHF mobile satellite system to be integrated with terrestrial cellular systems,” IEEE J. Sel. Areas Commun., vol. 10, no. 8, pp. 1315–1325, Oct. 1992.
[13] T. Maseng and P. Bakken, “A stochastic dynamic model of rain attenuation,” IEEE Trans. Commun., vol. COM-29, no. 5, pp. 660–669, May 1981.
[14] G. Taricco, “Linear precoding methods for multi-beam broadband satellite systems,” in Proc. 20th Eur. Wireless Conf., Barcelona, Spain, May 2014, pp. 1–6.
[15] S. Boyd and L. Vandenberghe, Convex Optimization. New York, NY, USA: Cambridge Univ. Press, 2004.
[16] Z.-Q. Luo, W.-K. Ma, A. M.-C. So, Y. Ye, and S. Zhang, “Semidefinite relaxation of quadratic optimization problems,” IEEE Signal Process. Mag., vol. 27, no. 3, pp. 20–34, May 2010.
[17] M. K. Arti and M. R. Bhatnagar, “Two-way mobile satellite relaying: A beamforming and combining based approach,” IEEE Commun. Lett., vol. 18, no. 7, pp. 1187–1190, Jul. 2017.
[18] M. K. Arti, “Imperfect CSI based multi-way satellite relaying,” IEEE Wireless Commun. Lett., to be published, doi: 10.1109/LWC.2018.2833471.
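The step from the Gaussian non-outage probability (14) to the constraint (15), and the tightening to the second-order cone form in (16), can be checked numerically. The sketch below is only an illustration under stated assumptions: it works directly with scalar values of $\mu_i$ and $v_i$ rather than with beamforming matrices, it uses the identity $\|\mathbf{G}_i^{1/2}\mathrm{vec}(\mathbf{C}_k^H)\|_2^2 = v_i^2 + \mu_i^2$ implied by (13), and the function names are hypothetical. It relies on the fact that $\sqrt{2}\,\mathrm{erf}^{-1}(2\alpha - 1)$ equals the standard normal quantile at $\alpha$.

```python
import math
from statistics import NormalDist


def non_outage_prob(mu, v, a):
    """Gaussian approximation (14) of Pr{x_i >= a}, with a = gamma_i * N_0."""
    return 0.5 + 0.5 * math.erf((mu - a) / (math.sqrt(2.0) * v))


def b_of(alpha):
    """b_i = sqrt(2) * erfinv(2*alpha - 1), i.e. the standard normal quantile."""
    return NormalDist().inv_cdf(alpha)


def holds_15(mu, v, a, alpha):
    """Constraint (15); by (13) the squared norm on the left is v^2 + mu^2."""
    b = b_of(alpha)
    root = math.sqrt(b * b + 1.0)
    rhs = ((root * mu - a / root) / b) ** 2 + a * a / (b * b + 1.0)
    return v * v + mu * mu <= rhs


def holds_16(mu, v, a, alpha):
    """Second-order cone constraint of Q_s in (16), i.e. (15) without a^2/(b^2+1)."""
    b = b_of(alpha)
    root = math.sqrt(b * b + 1.0)
    return math.hypot(v, mu) <= (root * mu - a / root) / b


if __name__ == "__main__":
    a, alpha = 1.0, 0.9  # illustrative values only, not taken from the letter
    for mu, v in [(3.0, 1.0), (3.0, 1.5), (3.0, 2.0)]:
        print(mu, v,
              non_outage_prob(mu, v, a) >= alpha,
              holds_15(mu, v, a, alpha),
              holds_16(mu, v, a, alpha))
```

For $\mu_i \ge a_i$ the first two flags always agree, while the third can be false when the second is true, which is the subset relation between the feasible sets of $\mathcal{Q}_s$ and (15) noted in Section III.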
Low-Complexity Differential Spatial Modulation

Ruey-Yi Wei, Senior Member, IEEE, and Tzu-Yun Lin

Abstract—Differential spatial modulation (DSM) is a multi-antenna technique which transmits additional data bits by selecting indexes of antennas and avoids pilot overhead. However, DSM is complicated in two aspects and we aim to reduce the complexity of DSM in this letter. First, the complexity of the noncoherent maximum-likelihood (ML) detector increases exponentially with the number of transmitter antennas. For this problem, we propose a new symbol-based ML detector whose complexity is roughly proportional to the number of transmitter antennas. The other problem is that the signal constellation of the transmitted signal has unlimited points due to complex-valued antenna-index matrices. We propose a systematic design of complex-valued antenna-index matrices for which the constellation of the transmitted signal has a few signal points only. Both the proposed techniques decrease the complexity without sacrificing error performance.

Index Terms—Spatial modulation, differential encoding, differential detection.

Manuscript received September 3, 2018; accepted September 26, 2018. Date of publication October 1, 2018; date of current version April 9, 2019. This work was supported by the Ministry of Science and Technology, Taiwan, under Grant MOST107-2221-E-008-025. The associate editor coordinating the review of this paper and approving it for publication was H. Q. Ngo. (Corresponding author: Ruey-Yi Wei.)
The authors are with the Department of Communication Engineering, National Central University, Taoyuan 320, Taiwan (e-mail: rywei@ncu.edu.tw).
Digital Object Identifier 10.1109/LWC.2018.2872990

I. INTRODUCTION

BECAUSE of growing demand for wireless communications, various techniques of wireless communications have been proposed. Among them, spatial modulation (SM) [1]–[3] attracts much attention. SM is a multi-antenna technique which uses a single transmitter antenna each time, so it is able to transmit additional data bits by selecting indexes of antennas. SM offers flexibility in reducing RF circuits and power consumption.

Differential SM (DSM) [4], [5], which avoids pilot overhead, has attracted much attention in recent years. DSM is considered as a special case of differential space-time modulation (DSTM) [12], [13] which tests all data matrices one-by-one at the noncoherent maximum-likelihood (ML) receiver. For DSM, more transmitter antennas mean higher transmission rate. However, the number of data matrices increases exponentially with the number of transmitter antennas, so the complexity is too high to be realized when there are many transmitter antennas. Therefore, low-complexity but non-ML detectors for DSM were proposed in [6]–[8], even though simulation results in [8] show that the sphere decoder in [8] has identical error performance to the noncoherent ML detector. In this letter, we separate the matrix-based ML detector into a symbol-based ML detector. The idea is based on the fact that each transmitted symbol of DSM is related to one symbol of the previous matrix only, not a whole matrix. By doing so, the complexity is roughly proportional to the number of transmitter antennas only.

On the other hand, DSM with complex-valued antenna-index matrices in [5] has better error performance than DSM in [4]. But the matrices of full-rate DSM are obtained by random searches, and there are unlimited possibilities of transmitted signals after differential encoding. This also increases the hardware complexity [10]. Complex space-time matrices for reduced-rate (with increased diversity) DSM were designed in [9] and [10]. In this letter, we propose a systematic design of complex space-time matrices for full-rate DSM using MPSK signal. By properly selecting the complex value, the constellation of the transmitted signal only contains a few signal points.

Notation: $(\cdot)^T$, $(\cdot)^\dagger$ and $\|\cdot\|$ denote the transpose, the conjugate transpose and the Frobenius norm of a matrix, respectively. $\mathrm{diag}\{\cdot\}$ represents the operation from a row vector to a diagonal matrix. $\lfloor\cdot\rfloor$ denotes the floor function. $\mathcal{CN}(0, \sigma^2)$ denotes the zero-mean, $\sigma^2$-variance, complex Gaussian distribution.

II. REVIEW OF DSM

Consider a communication system with $N_T$ transmitter antennas and $N_R$ receiver antennas. The channels between antenna pairs are Rayleigh-fading and independent of each other. If the symbol $s$ is transmitted via the $m$th antenna, the constellation vector is represented by a column-vector $[0, \ldots, 0, s, 0, \ldots, 0]^T$ where the only nonzero entry $s$ is the $m$th element. Each block of DSM contains $N_T$ time slots, and all $N_T$ column-vectors form an $N_T \times N_T$ matrix of transmitted signals $\mathbf{S}(t)$ for the $t$th block which satisfies two restrictions: (1) At each time slot, only one antenna is activated, i.e., there is only one nonzero entry in each column of $\mathbf{S}(t)$. (2) In each block, each transmitter antenna is activated exactly once, i.e., there is only one nonzero entry in each row of $\mathbf{S}(t)$. For the $t$th block, the $N_R \times N_T$ matrix of received signals is

$$\mathbf{Y}(t) = \mathbf{H}(t)\mathbf{S}(t) + \mathbf{N}(t) \tag{1}$$

where $\mathbf{H}(t)$ is the $N_R \times N_T$ matrix of channel coefficients whose entries are $\mathcal{CN}(0, 1)$, and $\mathbf{N}(t)$ is the $N_R \times N_T$ matrix of AWGN with $\mathcal{CN}(0, N_0)$ entries. The entry in the $i$th row and the $j$th column of $\mathbf{Y}(t)$ is denoted by $y_{ij}(t)$.

The number of permutations of the activation order is $N_T!$, but only $Q = 2^{\lfloor \log_2 N_T! \rfloor}$ permutations are used. The signal constellation for data symbols is $M$-ary PSK (phase-shift keying) where $M = 2^b$ and $b$ is an integer. For full-rate DSM, there are in total $\log_2 Q + N_T b$ data bits in each block, so the spectral efficiency is $\frac{\log_2 Q}{N_T} + b$ bits/s/Hz. For the $t$th block, $\log_2 Q$ bits determine an antenna-index matrix $\mathbf{A}(t) \in \mathcal{A} = \{\mathbf{A}_1, \mathbf{A}_2, \ldots, \mathbf{A}_Q\}$ and $N_T b$ bits decide $N_T$ data
symbols $\mathbf{x}(t) = [x_1(t), x_2(t), \ldots, x_{N_T}(t)]$. An $N_T \times N_T$ data matrix $\mathbf{X}(t)$ is calculated by

$$\mathbf{X}(t) = \mathrm{diag}\{\mathbf{x}(t)\}\mathbf{A}(t). \tag{2}$$

Note that $\forall q \in \{1, 2, \ldots, Q\}$, $\mathbf{A}_q$ should satisfy the two restrictions, and the definitions of nonzero entries of $\mathbf{A}_q$ in [4] and [5] are different. In [4], nonzero elements are always 1, while in [5], to improve the error performance, nonzero elements are complex numbers with unit absolute values.

Because $\mathbf{X}(t)$ is a unitary matrix, differential encoding and differential detection of DSTM can be applied to DSM and thus DSM is a special case of DSTM. At the transmitter, $\mathbf{S}(t)$ is determined by

$$\mathbf{S}(t) = \mathbf{S}(t-1)\mathbf{X}(t). \tag{3}$$

The initial reference matrix $\mathbf{X}(0)$ is the identity matrix, so the transmitted matrix $\mathbf{S}(t)$ is unitary and satisfies the two restrictions. At the receiver, the noncoherent ML detection is

$$\hat{\mathbf{X}}(t) = \arg\min_{\tilde{\mathbf{X}} \in \mathcal{X}} \|\mathbf{Y}(t) - \mathbf{Y}(t-1)\tilde{\mathbf{X}}\|^2 \tag{4}$$

where $\mathcal{X}$ denotes the set of all possible values of $\mathbf{X}(t)$. To obtain $\hat{\mathbf{X}}(t)$, the receiver has to try all $Q \times M^{N_T}$ elements in $\mathcal{X}$. The signal constellation for $\mathbf{x}(t)$ is MPSK, but the signal constellation for $\mathbf{S}(t)$ obtained from $\mathbf{x}(t)$ and $\mathbf{A}(t)$ by (2) and (3) may contain extremely many signal points if nonzero elements of $\mathbf{A}(t)$ are complex-valued.

III. PROPOSED ML DETECTION

In (4), data symbols $x_1(t), x_2(t), \ldots, x_{N_T}(t)$ are jointly detected. In this section, we propose to detect the data symbols separately instead. Let $p^{(k)} \in \{1, 2, \ldots, N_T\}$ represent the position of the only nonzero entry in the $k$th column of $\mathbf{A}(t)$ where $k \in \{1, 2, \ldots, N_T\}$, and define the permutation order $\mathbf{p} = (p^{(1)}, p^{(2)}, \ldots, p^{(N_T)})$. According to the differential encoding (3), in the $k$th time slot of the $t$th block, the activated antenna is the antenna used in the $p^{(k)}$th time slot of the $(t-1)$th block, and the transmitted data symbol is $x_{p^{(k)}}(t)$. In other words, the transmitted symbol in the $k$th time slot of the $t$th block, denoted by $s_k(t)$, is $s_k(t) = s_{p^{(k)}}(t-1)\,x_{p^{(k)}}(t)\,e^{j\theta_k}$ where $\theta_k$ is the phase of the nonzero element in the $k$th column of $\mathbf{A}(t)$. Therefore, the receiver can use $y_{i p^{(k)}}(t-1)$ and $y_{ik}(t)$ only to detect $x_{p^{(k)}}(t)$ where $i \in \{1, 2, \ldots, N_R\}$.

Example 1: Assume $N_T = 4$, $N_R = 1$, and

$$\mathbf{A}(t) = \begin{pmatrix} 0 & e^{j\theta_2} & 0 & 0 \\ 0 & 0 & 0 & e^{j\theta_4} \\ e^{j\theta_1} & 0 & 0 & 0 \\ 0 & 0 & e^{j\theta_3} & 0 \end{pmatrix}$$

whose $\mathbf{p} = (3\ 1\ 4\ 2)$, so

$$\mathbf{X}(t) = \begin{pmatrix} 0 & x_1(t)e^{j\theta_2} & 0 & 0 \\ 0 & 0 & 0 & x_2(t)e^{j\theta_4} \\ x_3(t)e^{j\theta_1} & 0 & 0 & 0 \\ 0 & 0 & x_4(t)e^{j\theta_3} & 0 \end{pmatrix}.$$

If $\mathbf{S}(t-1)$ is

$$\begin{pmatrix} 0 & s_2(t-1) & 0 & 0 \\ 0 & 0 & 0 & s_4(t-1) \\ 0 & 0 & s_3(t-1) & 0 \\ s_1(t-1) & 0 & 0 & 0 \end{pmatrix},$$

then in $\mathbf{S}(t)$, we have $s_1(t) = s_3(t-1)x_3(t)e^{j\theta_1}$, $s_2(t) = s_1(t-1)x_1(t)e^{j\theta_2}$, $s_3(t) = s_4(t-1)x_4(t)e^{j\theta_3}$ and $s_4(t) = s_2(t-1)x_2(t)e^{j\theta_4}$. For this $\mathbf{A}(t)$, $\mathbf{Y}(t-1)\mathbf{X}(t) = [y_3(t-1)x_3(t)e^{j\theta_1},\ y_1(t-1)x_1(t)e^{j\theta_2},\ y_4(t-1)x_4(t)e^{j\theta_3},\ y_2(t-1)x_2(t)e^{j\theta_4}]$, so $[y_3(t-1), y_1(t)]$, $[y_1(t-1), y_2(t)]$, $[y_4(t-1), y_3(t)]$ and $[y_2(t-1), y_4(t)]$ are utilized for detecting $x_3(t)$, $x_1(t)$, $x_4(t)$ and $x_2(t)$, respectively.

Let $p_q^{(k)}$ represent the position of the nonzero entry, $e^{j\theta_{q,k}}$, in the $k$th column of $\mathbf{A}_q$ where $k \in \{1, 2, \ldots, N_T\}$ and $q \in \{1, 2, \ldots, Q\}$. With $p_q^{(k)}$ and $\tilde{\mathbf{X}} = \mathrm{diag}\{[\tilde{x}_1, \tilde{x}_2, \ldots, \tilde{x}_{N_T}]\}\mathbf{A}_q$, the metric in (4) becomes

$$\|\mathbf{Y}(t) - \mathbf{Y}(t-1)\tilde{\mathbf{X}}\|^2 = \sum_{i=1}^{N_R}\sum_{k=1}^{N_T}\left|y_{ik}(t) - y_{i p_q^{(k)}}(t-1)\,\tilde{x}_{p_q^{(k)}}\,e^{j\theta_{q,k}}\right|^2.$$

According to the discussion above, the detected value of $x_{p_q^{(k)}}(t)$ for $\mathbf{A}_q$ is

$$\hat{x}^{(q)}_{p_q^{(k)}}(t) = \arg\min_{\tilde{x}} \sum_{i=1}^{N_R}\left|y_{ik}(t) - y_{i p_q^{(k)}}(t-1)\,\tilde{x}\,e^{j\theta_{q,k}}\right|^2 \tag{5}$$

and the detected data symbols for $\mathbf{A}_q$ are represented by $\hat{\mathbf{x}}_q(t) = [\hat{x}_1^{(q)}(t), \hat{x}_2^{(q)}(t), \ldots, \hat{x}_{N_T}^{(q)}(t)]$. The metric of $\mathbf{A}_q$ is

$$m_q(t) = \sum_{k=1}^{N_T}\sum_{i=1}^{N_R}\left|y_{ik}(t) - y_{i p_q^{(k)}}(t-1)\,\hat{x}^{(q)}_{p_q^{(k)}}(t)\,e^{j\theta_{q,k}}\right|^2 \tag{6}$$

and the detected value of $\mathbf{A}(t)$ is

$$\hat{\mathbf{A}}(t) = \arg\min_{\mathbf{A}_q \in \mathcal{A}} m_q(t). \tag{7}$$

Based on the detected value of $q$ corresponding to $\hat{\mathbf{A}}(t)$, denoted by $\hat{q}$, the detection of $\mathbf{x}(t)$ is $\hat{\mathbf{x}}(t) = \hat{\mathbf{x}}_{\hat{q}}(t)$.

The minimization of (5) and (7) requires $Q \times (M N_T + 1)$ tests, which is much less than the number for (4). For example, in the case of 8PSK and $N_T = 4$ which has $Q = 16$, the number of tests for (4) is 65536 but the number for (5) and (7) is 528 only. The complexity reduction of this example exceeds 99%. For comparison, an example of the sphere decoder for $N_T = 4$ in [8] has complexity reduction less than 75%. In the case of 8PSK and $N_T = 8$, the number of tests for (4) is $2^{24}Q$ while the number for (5) and (7) is only $65Q$. To further reduce the complexity, the complexity can be made independent of the constellation size $M$ with the aid of hard-limiting [14].

IV. PROPOSED DESIGN OF COMPLEX SPACE-TIME MATRICES

Let $\mathbf{X} = \mathrm{diag}\{\mathbf{x}\}\mathbf{A}$ and $\mathbf{X}' = \mathrm{diag}\{\mathbf{x}'\}\mathbf{A}'$ represent two data matrices in (2) where $\mathbf{x} = [e^{j\phi_1}, e^{j\phi_2}, \ldots, e^{j\phi_{N_T}}]$ and $\mathbf{x}' = [e^{j\phi'_1}, e^{j\phi'_2}, \ldots, e^{j\phi'_{N_T}}]$. There are two possible cases for $\mathbf{X} \neq \mathbf{X}'$: (i) $\mathbf{A} = \mathbf{A}'$ and $\mathbf{x} \neq \mathbf{x}'$ (ii) $\mathbf{A} \neq \mathbf{A}'$. The transmitter diversity in case (i) is one, independent of the design of $\mathcal{A}$. In this section, we only consider case (ii).

There are at least two different elements of the permutation order $\mathbf{p}$ between $\mathbf{A}$ and $\mathbf{A}'$ in case (ii). If all nonzero entries of $\mathbf{A}$ and $\mathbf{A}'$ are 1, then the transmitter diversity order in case (ii) is only one. Therefore, we aim to design two matrices $\mathbf{A}$ and $\mathbf{A}'$ with only two different elements in $\mathbf{p}$ in the following. Without loss of generality, they are represented by

$$\mathbf{A} = \begin{pmatrix} e^{j\theta_1} & 0 & 0 & \cdots & 0 \\ 0 & e^{j\theta_2} & 0 & \cdots & 0 \\ 0 & 0 & e^{j\theta_3} & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & e^{j\theta_{N_T}} \end{pmatrix}$$

and
$$\mathbf{A}' = \begin{pmatrix} 0 & e^{j\theta'_2} & 0 & \cdots & 0 \\ e^{j\theta'_1} & 0 & 0 & \cdots & 0 \\ 0 & 0 & e^{j\theta'_3} & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & e^{j\theta'_{N_T}} \end{pmatrix}.$$

According to the design guideline of DSTM in [13], the rank of $\mathbf{X} - \mathbf{X}'$ which represents the transmitter diversity should be maximized first and then the determinant of $(\mathbf{X} - \mathbf{X}')(\mathbf{X} - \mathbf{X}')^\dagger$ should be maximized [15]. It can be shown that this determinant is

$$|(\mathbf{X} - \mathbf{X}')(\mathbf{X} - \mathbf{X}')^\dagger| = 2^{N_T - 1}[1 - \cos(\psi_1 + \psi_2)]\prod_{i=3}^{N_T}(1 - \cos\psi_i) \tag{8}$$

where $\psi_i = \phi_i + \theta_i - \phi'_i - \theta'_i$ $\forall i \in \{1, 2, \ldots, N_T\}$.

Because in general $\mathbf{A}$ and $\mathbf{A}'$ may differ in any two columns, all nonzero entries of $\mathbf{A}$ ($\mathbf{A}'$) should be the same. Therefore, in the proposed design, we choose $\theta_i = \theta$ and $\theta'_i = 0$ $\forall i \in \{1, 2, \ldots, N_T\}$. Because $\phi_i$ and $\phi'_i$ are phases of MPSK symbols, the value of $\theta$ which maximizes the minimum value of the determinant, denoted by $\theta_{opt}$, should maximize $[1 - \cos(2\theta + \frac{2k\pi}{M})][1 - \cos(\theta + \frac{2l\pi}{M})]^{N_T - 2}$ where $k$ and $l$ are arbitrary integers. The values of $\theta$ which minimize $\cos(2\theta + \frac{2k\pi}{M})$ and $\cos(\theta + \frac{2l\pi}{M})$ are $\frac{\pi}{2M}$ and $\frac{\pi}{M}$, respectively. Curves of $1 - \cos(2\theta + \frac{2k\pi}{M})$ and $1 - \cos(\theta + \frac{2l\pi}{M})$ for $k, l = 0, -1, -2$ are shown in Fig. 1, which indicates that $\theta_{opt}$ is the value of $\theta$ ($\frac{\pi}{2M} \le \theta \le \frac{\pi}{M}$) maximizing

$$f(\theta) = \Big[1 - \cos\Big(2\theta - \frac{2\pi}{M}\Big)\Big](1 - \cos\theta)^{N_T - 2}. \tag{9}$$

Fig. 1. The values of $1 - \cos(2\theta)$, $1 - \cos(2\theta - \frac{2\pi}{M})$, $1 - \cos(2\theta - \frac{4\pi}{M})$, $1 - \cos(\theta)$ and $1 - \cos(\theta - \frac{2\pi}{M})$ with $M = 8$ for $0 \le \theta \le \frac{2\pi}{M}$.

For $N_T = 2$, we have $f(\theta) = (1 - \cos 2\theta)$ so $\theta_{opt} = \frac{\pi}{2M}$ and the minimum value of $|(\mathbf{X} - \mathbf{X}')(\mathbf{X} - \mathbf{X}')^\dagger|$ is $2(1 - \cos\frac{\pi}{M})$. For $N_T > 2$, the values of $\theta_{opt}$ can be found by mathematical software. With $M = 4$, the determinant of (8) is 0.0892 at $\theta_{opt} = 22.925^\circ$ for $N_T = 3$ (in simulations we replace it by $22.5^\circ$ for simplicity), and 0.0192 at $\theta_{opt} = 30^\circ$ for $N_T = 4$.

For the design of [4] whose nonzero entries of antenna-index matrices are all 1 ($\theta = 0$), the diversity between $\mathbf{X}$ and $\mathbf{X}'$ is 1. Consequently, increasing $N_T$ cannot increase diversity order. For the proposed design of $\mathbf{A}$ and $\mathbf{A}'$, the diversity between $\mathbf{X}$ and $\mathbf{X}'$ is $N_T$ because $\mathbf{X} - \mathbf{X}'$ has full rank, i.e., $|(\mathbf{X} - \mathbf{X}')(\mathbf{X} - \mathbf{X}')^\dagger| \neq 0$, no matter what the value of $\theta$ ($\theta$ cannot be 0 or a multiple of $\frac{2\pi}{M}$) is. Because of $s_k(t) = s_{p^{(k)}}(t-1)\,x_{p^{(k)}}(t)\,e^{j\theta_k}$ where $\theta_k = \theta$ or 0, the phase of nonzero entries of $\mathbf{S}(t)$ in our design is $\frac{2k\pi}{M} + l\theta$ where $k$ and $l$ are integers.

The complex-valued matrices in $\mathcal{A}$ in [5] were obtained by random computer searches, but were not shown in [5] or other references except an example of reduced-rate DSM in [11] to the best of our knowledge. The transmitter diversity of the example in [11] is 2, i.e., $\phi_1 = \phi_2 = \phi$ and $\phi'_1 = \phi'_2 = \phi'$. This example in [11] has

$$\mathbf{X} = \begin{pmatrix} e^{j(1.041\pi + \phi)} & 0 \\ 0 & e^{j(1.571\pi + \phi)} \end{pmatrix} \text{ and } \mathbf{X}' = \begin{pmatrix} 0 & e^{j(1.503\pi + \phi')} \\ e^{j(1.609\pi + \phi')} & 0 \end{pmatrix},$$

or

$$\mathbf{X} = \begin{pmatrix} e^{j(0.633\pi + \phi)} & 0 \\ 0 & e^{j(1.182\pi + \phi)} \end{pmatrix} \text{ and } \mathbf{X}' = \begin{pmatrix} 0 & e^{j(1.304\pi + \phi')} \\ e^{j(0.011\pi + \phi')} & 0 \end{pmatrix}$$

with $M = 4$.1 Note that (9) is invalid for reduced-rate DSM. For this example, (8) becomes

$$|(\mathbf{X} - \mathbf{X}')(\mathbf{X} - \mathbf{X}')^\dagger| = 2[1 - \cos(2\phi - 2\phi' + \theta_1 + \theta_2 - \theta'_1 - \theta'_2)]. \tag{10}$$

Due to $2\phi - 2\phi' = \frac{4k\pi}{M}$ where $k$ is an arbitrary integer, (10) has the maximum value when

$$\theta_1 + \theta_2 - \theta'_1 - \theta'_2 = \pm\frac{2\pi}{M} = \pm\frac{\pi}{2}. \tag{11}$$

We find that in $\mathcal{A}$ of the example, $\theta_1 + \theta_2 - \theta'_1 - \theta'_2$ is $(1.041 + 1.571 - 1.503 - 1.609)\pi = -0.5\pi$ or $(0.633 + 1.182 - 1.304 - 0.011)\pi = 0.5\pi$ which coincide with (11). In the proposed design, in which $\theta_i = 0$ and $\theta'_i = \theta$ $\forall i = 1, 2$, $\mathcal{A}$ is simply

$$\left\{\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \begin{pmatrix} 0 & e^{j\pi/4} \\ e^{j\pi/4} & 0 \end{pmatrix}\right\}.$$

Unlike [11] whose constellation for $\mathbf{S}(t)$ has extremely many points, this $\mathcal{A}$ has the same coding gain but the signal constellation for $\mathbf{S}(t)$ is 8PSK only. Note that the unbounded differential constellation size issue makes the employment of a high-resolution digital-to-analog converter (DAC) imperative, which is both expensive and power hungry [10]. For the proposed design, low-cost DACs are sufficient to implement the transmitter.

1 The DSM is different from conventional DSM. To avoid confusion or a lot of explanation, we modify its form of presentation herein.

Next, we propose a systematic construction for $\mathcal{A}$ of full-rate DSM. The initial element in $\mathcal{A}$ is the identity matrix denoted by $\mathbf{A}_1$. Define $\Omega_0 = \{\mathbf{A}_1\}$. First, $\binom{N_T}{2}$ matrices in $\Omega_1 = \{\mathbf{A}_2, \mathbf{A}_3, \ldots, \mathbf{A}_{\binom{N_T}{2}+1}\}$ are obtained by interchanging any two columns of $\mathbf{A}_1$ and replacing all entries 1 with $e^{j\theta}$. Then matrices in $\Omega_2$ are obtained by interchanging any two columns of $\mathbf{A}_2, \mathbf{A}_3, \ldots, \mathbf{A}_{\binom{N_T}{2}+1}$ and replacing all entries $e^{j\theta}$ with 1. If an element in $\Omega_2$ has the same permutation order $\mathbf{p}$ as any element in $\Omega_0$, $\Omega_1$, delete it. If there are two identical elements in $\Omega_2$, remove one of them. $\forall i \ge 1$, matrices in $\Omega_i$ are obtained by interchanging any two columns of matrices in $\Omega_{i-1}$, and replacing 1 with $e^{j\theta}$ for odd $i$ or replacing $e^{j\theta}$ with 1 for even $i$. Delete elements in $\Omega_i$ whose permutation
orders have appeared before. This process is continued until all $N_T!$ antenna-index matrices are obtained. If necessary, remove matrices in $\Omega_i$ where $i \ge 2$ such that all matrices in $\Omega_i$ differ in at least three columns. Finally, arbitrarily choose $Q$ matrices from $\Omega_0, \Omega_1, \cdots$ to form $\mathcal{A}$.

Fig. 2. Simulation results of Example 2 and Example 3 for $N_R = 1$.

Lemma: In the proposed construction, if two matrices in $\mathcal{A}$ have only two different elements in $\mathbf{p}$, then one of them has nonzero entries 1 and the other has nonzero entries $e^{j\theta}$.

Proof: Consider all possible cases of $\mathbf{A}$ and $\mathbf{A}'$: (1) $\mathbf{A} \in \Omega_i$ and $\mathbf{A}' \in \Omega_{i+1}$ for $i \ge 0$; (2) $\mathbf{A} \in \Omega_i$ and $\mathbf{A}' \in \Omega_{i+j}$ for $i \ge 0$ and $j \ge 2$; (3) $\mathbf{A}, \mathbf{A}' \in \Omega_i$ for $i \ge 1$. Obviously in case (1) one matrix has nonzero entries 1 and the other has nonzero entries $e^{j\theta}$. In case (2) $\mathbf{A}$ and $\mathbf{A}'$ differ in at least three columns; otherwise $\mathbf{A}'$ must be in $\Omega_{i+1}$. Finally in case (3) with $i = 1$, because both $\mathbf{A}$ and $\mathbf{A}'$ are modified from $\mathbf{A}_1$, $\mathbf{A}$ and $\mathbf{A}'$ always differ in at least three columns; in case (3) with $i > 1$, all matrices in $\Omega_i$ differ in at least three columns according to the construction process.
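The construction of the sets $\Omega_0, \Omega_1, \ldots$ can be sketched in a few lines of code. The sketch below is an illustration rather than the authors' implementation: it represents each antenna-index matrix only by its permutation order $\mathbf{p}$, treats "interchanging two columns" as swapping two elements of $\mathbf{p}$, and omits the optional pruning step. The function names (`build_layers`, `uses_theta`, `lemma_holds`, `as_strings`) are hypothetical. For $N_T = 3$ and $N_T = 4$ it yields the sets listed in Examples 2 and 3 of this letter, and it checks the statement of the Lemma by brute force.

```python
from itertools import combinations


def build_layers(n_t):
    """Return [Omega_0, Omega_1, ...] as sets of permutation orders p (1-based tuples)."""
    identity = tuple(range(1, n_t + 1))
    layers, seen = [{identity}], {identity}
    while True:
        nxt = set()
        for p in layers[-1]:
            for a, b in combinations(range(n_t), 2):
                q = list(p)
                q[a], q[b] = q[b], q[a]       # interchange two columns
                if tuple(q) not in seen:       # drop orders that appeared before
                    nxt.add(tuple(q))          # the set removes identical elements
        if not nxt:
            return layers
        seen |= nxt
        layers.append(nxt)


def uses_theta(layer_index):
    """Odd-indexed sets carry nonzero entries e^{j*theta}; even-indexed sets carry 1."""
    return layer_index % 2 == 1


def lemma_holds(layers):
    """Orders differing in exactly two positions must carry different nonzero entries."""
    tagged = [(p, i) for i, layer in enumerate(layers) for p in layer]
    for (p, i), (q, j) in combinations(tagged, 2):
        differing = sum(x != y for x, y in zip(p, q))
        if differing == 2 and uses_theta(i) == uses_theta(j):
            return False
    return True


def as_strings(layer):
    """Format a set of permutation orders like the examples, e.g. '213'."""
    return sorted("".join(map(str, p)) for p in layer)


if __name__ == "__main__":
    for n_t in (3, 4):
        layers = build_layers(n_t)
        for i, layer in enumerate(layers):
            entry = "e^{j*theta}" if uses_theta(i) else "1"
            print(n_t, i, entry, as_strings(layer))
        print("lemma holds:", lemma_holds(layers))
```

Choosing $Q$ matrices for $\mathcal{A}$ then amounts to picking $Q$ permutation orders from the union of these sets and attaching the phase indicated by `uses_theta`.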
Based on the Lemma, for any two matrices in $\mathcal{A}$ with only two different elements in $\mathbf{p}$, the transmitter diversity between them is $N_T$ provided that $\theta$ satisfies $f(\theta) \neq 0$ in (9). Two examples of the proposed construction are given as follows. Showing all antenna-index matrices would need too much space, so we present the permutation order $\mathbf{p}$ instead.

Example 2: For $N_T = 3$, we have $\Omega_0 = \{123\}$, $\Omega_1 = \{213, 132, 321\}$ whose nonzero entries are $e^{j\theta}$, and $\Omega_2 = \{231, 312\}$. For $M = 4$, $\theta$ is $\pi/8$ so the signal constellation for $\mathbf{S}(t)$ is 16PSK.

Example 3: For $N_T = 4$, we have $\Omega_0 = \{1234\}$, $\Omega_1 = \{1243, 1324, 1432, 2134, 3214, 4231\}$ whose nonzero entries are $e^{j\theta}$, $\Omega_2 = \{1342, 1423, 2143, 2314, 2431, 3124, 3241, 3412, 4132, 4213, 4321\}$ whose nonzero entries are 1, and $\Omega_3 = \{2341, 2413, 3142, 3421, 4123, 4312\}$ whose nonzero entries are $e^{j\theta}$. For $M = 4$, $\theta$ is $\pi/6$ so the signal constellation for $\mathbf{S}(t)$ contains 12 points only.

In the above two examples, all matrices in $\Omega_i$ where $i \ge 2$ differ in at least three columns, so it is unnecessary to delete matrices. Compared with complex-valued antenna-index matrices in [5], the proposed design has two benefits: systematic construction rather than computer search, and fewer points of $\mathbf{S}(t)$.

The two examples are compared to matrices with entries 0 or 1 in [4] by computer simulations. In [4], the $Q$ lexicographically smallest permutations are chosen from all $N_T!$ permutations, so we choose the same $Q$ permutations for our examples. Simulation results are shown in Fig. 2 where solid curves represent $N_T = 3$ and dashed curves denote $N_T = 4$. The proposed $\mathcal{A}$ in both examples outperforms matrices with entries 0 or 1. When $N_T$ is increased from 3 to 4, matrices with entries 0 or 1 become worse because diversity is not increased as indicated earlier but the number of codewords is increased. The proposed design is only slightly better for increased $N_T$ because in case (i) diversity is not increased.

V. CONCLUSION

In this letter, we propose a new noncoherent ML detector whose complexity is decreased significantly compared with the conventional one. Besides, we propose a systematic design of complex-valued antenna-index matrices whose phases of complex numbers are all the same. The optimal value of the phase is derived and the constellation of the transmitted signals is simple.

REFERENCES

[1] R. Y. Mesleh, H. Haas, S. Sinanovic, C. W. Ahn, and S. Yun, “Spatial modulation,” IEEE Trans. Veh. Technol., vol. 57, no. 4, pp. 2228–2241, Jul. 2008.
[2] M. D. Renzo, H. Haas, and P. M. Grant, “Spatial modulation for multiple-antenna wireless systems: A survey,” IEEE Commun. Mag., vol. 49, no. 12, pp. 182–191, Dec. 2011.
[3] N. Ishikawa, S. Sugiura, and L. Hanzo, “50 years of permutation, spatial and index modulation: From classic RF to visible light communications and data storage,” IEEE Commun. Surveys Tuts., vol. 20, no. 3, pp. 1905–1938, 3rd Quart., 2018.
[4] Y. Bian et al., “Differential spatial modulation,” IEEE Trans. Veh. Technol., vol. 64, no. 7, pp. 3262–3268, Jul. 2015.
[5] N. Ishikawa and S. Sugiura, “Unified differential spatial modulation,” IEEE Wireless Commun. Lett., vol. 3, no. 4, pp. 337–340, Aug. 2014.
[6] L. Xiao et al., “A low-complexity detection scheme for differential spatial modulation,” IEEE Commun. Lett., vol. 19, no. 9, pp. 1516–1519, Sep. 2015.
[7] M. Wen, X. Cheng, Y. Bian, and H. V. Poor, “A low-complexity near-ML differential spatial modulation detector,” IEEE Signal Process. Lett., vol. 22, no. 11, pp. 1834–1838, Nov. 2015.
[8] Z. Li et al., “A low-complexity optimal sphere decoder for differential spatial modulation,” in Proc. IEEE Glob. Commun. Conf., San Diego, CA, USA, 2015, pp. 1–6.
[9] R. Rajashekar, N. Ishikawa, S. Sugiura, K. V. S. Hari, and L. Hanzo, “Full-diversity dispersion matrices from algebraic field extensions for differential spatial modulation,” IEEE Trans. Veh. Technol., vol. 66, no. 1, pp. 385–394, Jan. 2017.
[10] R. Rajashekar et al., “Algebraic differential spatial modulation is capable of approaching the performance of its coherent counterpart,” IEEE Trans. Commun., vol. 65, no. 10, pp. 4260–4273, Oct. 2017.
[11] N. Ishikawa and S. Sugiura, “Rectangular differential spatial modulation for open-loop noncoherent massive-MIMO downlink,” IEEE Trans. Wireless Commun., vol. 16, no. 3, pp. 1908–1920, Mar. 2017.
[12] B. L. Hughes, “Differential space-time modulation,” IEEE Trans. Inf. Theory, vol. 46, no. 7, pp. 2567–2578, Nov. 2000.
[13] B. M. Hochwald and W. Swelden, “Differential unitary space-time modulation,” IEEE Trans. Commun., vol. 48, no. 12, pp. 2041–2052, Dec. 2000.
[14] H. Men and M. Jin, “A low-complexity ML detection algorithm for spatial modulation systems with M PSK constellation,” IEEE Commun. Lett., vol. 18, no. 8, pp. 1375–1378, Aug. 2014.
[15] V. Tarokh, N. Seshadri, and A. R. Calderbank, “Space-time codes for high data rate wireless communication: Performance criterion and code construction,” IEEE Trans. Inf. Theory, vol. 44, no. 2, pp. 744–765, Mar. 1998.
Time-Expanded Graph-Based Resource Allocation Over the Satellite Networks

Peng Wang, Xiushe Zhang, Shun Zhang, Member, IEEE, Hongyan Li, Member, IEEE, and Tao Zhang

Abstract—In this letter, we propose a transceiver resource allocation scheme based on the time-expanded graph (TEG) for satellite networks. Because the topology of a satellite network is time-varying, efficient transceiver resource allocation requires multiple nodes to make joint decisions across time. To address this problem, the distribution of the transceiver resources is first depicted by the TEG. Then, we add some virtual nodes and edges to the TEG to represent the resources, and obtain the modified TEG, which transforms the nodes' transceiver resources into link constraints. Finally, the joint resource allocation is implemented by solving the max-flow problem over the modified TEG. Numerical simulation results are presented to verify the proposed scheme.

Index Terms—Resource allocation, satellite network, transceiver resource representation, time-expanded graph.

Manuscript received August 11, 2018; revised September 23, 2018; accepted September 24, 2018. Date of publication October 1, 2018; date of current version April 9, 2019. This work was supported in part by the National Natural Science Foundation of China under Grant 91638202, Grant 61871456, Grant 61401326, and Grant 61571351, in part by the National Key Research and Development Program of China under Grant 2016YFB0501004 and Grant 6140518010101, in part by the National S&T Major Project under Grant 2015ZX03002006, in part by 111 Project under Grant B08038, and in part by the Natural Science Basic Research Plan in Shaanxi Province of China under Grant 2016JQ6054. The associate editor coordinating the review of this paper and approving it for publication was M. Nafie. (Corresponding author: Xiushe Zhang.)
P. Wang, S. Zhang, H. Li, and T. Zhang are with the State Key Laboratory of Integrated Services Networks, Xidian University, Xi’an 710071, China (e-mail: pengwangclz@163.com; zhangshunsdu@xidian.edu.cn; hyli@mail.xidian.edu.cn; taozhangfsz@gmail.com).
X. Zhang is with the Communication Engineering Department, Research Institute of Navigation Technology, Xi’an 710068, China (e-mail: xszhang163@163.com).
Digital Object Identifier 10.1109/LWC.2018.2872996

I. INTRODUCTION

SATELLITE networks are typical delay-tolerant networks, and possess long delay, time-varying topology and discontinuous links [1]. The networks have been widely applied for disaster monitoring, meteorological watch, and hot spot observation. As shown in Fig. 1, satellite node S is connected with nodes A and B in time interval $\tau_1$, and A and B are connected with D in $\tau_2$. To implement data transmission from node S to D, S must allocate its transceiver resources to links S−A and S−B in $\tau_1$ to separately transmit data to A and B. A and B store the received data until D allocates resources for A−D and B−D in $\tau_2$. During the process, for links S−B and B−D, no resource is wasted if and only if the transceiver resources assigned to them are equal, which requires S and D to allocate resources jointly. Thus, it is necessary to develop an efficient joint resource allocation scheme to maximize the resource utilization. Besides, due to the intermittently connected links, the satellite networks must be precisely modeled to support across-time data transmission and resource allocation.

Fig. 1. An example of the topology of a satellite network within two time intervals.

With respect to data transmission over satellite networks, there are some important results. Araniti et al. [2] designed a contact graph based routing scheme to find the optimal path in the networks without considering the transceiver constraints. Alaoui and Ramamurthy [3] constructed a modified temporal graph, and proposed the earliest arrival optimal delivery ratio (EAODR) algorithm. Although EAODR could effectively reduce the time delay, it did not involve transceiver resource allocation. Zhang et al. [4] proposed an efficient temporal centrality-balanced routing algorithm that reduces network congestion and enhances the robustness of networks. However, transceiver resource allocation was still not involved. In fact, with the multiple access technologies of the satellite networks [5], any satellite's transceiver resources could be allocated to its multiple incoming or outgoing links instead of being occupied by only one link.

Therefore, we propose a graph based transceiver resource allocation scheme to maximize network resource utilization. Firstly, the capacity of the inter-satellite communication links, and the satellites' storage and transceiver resources, are carefully considered. Then, we characterize satellite networks with the TEG [6]. Furthermore, an efficient representation rule for the transceiver resources is devised through simple reconstruction operations on the TEG. Specifically, some virtual nodes and edges are put into the original TEG to obtain the modified TEG, which incorporates the constraints of the transceiver resources into virtual links and nodes. Finally, the joint resource allocation is implemented by solving the max-flow problem over the modified TEG.

II. SYSTEM MODEL

A. Satellite Network Model

Since the trajectories of the moving satellites are determined, the time windows for the satellite communication, i.e., the network topology, can be predicted. Moreover, the connections between satellites are discontinuous. In summary,
WANG et al.: TEG-BASED RESOURCE ALLOCATION OVER SATELLITE NETWORKS 361
the satellite network is time-varying and predictable. Besides, since our main concern is the allocation of satellite resources and the communications between satellites, ground stations are not involved in the proposed model.

With the properties of the satellite networks, we adopt the directed TEG to depict the snapshot of the network for each time interval. For a given time interval $T = (t_0, t_n]$, we can split it into n small time intervals $\{\tau_1, \ldots, \tau_p, \ldots, \tau_n\}$, where $\tau_p = (t_{p-1}, t_p]$, and the network topology is fixed during $\tau_p$ but may change in $\tau_{p'}$, $p \ne p'$, $p, p' = 1, 2, \ldots, n$. Since the nodes may store data during $\tau_p$ and transmit it in $\tau_{p+1}$, the caching links are defined as the edges connecting the same satellite in two successive time intervals, where the capacity of the link represents the satellite's buffer size. Specifically, the directed time-expanded graph can be formulated as TEG = $\{(T, V_T, E, C)\}$, where:

• $V_T = \{V_{\tau_1}, \ldots, V_{\tau_p}, \ldots, V_{\tau_n}\}$, a set of nodes, is constructed by the replicas of the given satellites in each time interval. The element in $V_{\tau_p}$ can be denoted as $u_{\tau_p}$, which represents the satellite u in the time interval $\tau_p$. It can be checked that the size of $V_{\tau_p}$ is the same as that of $V_{\tau_{p'}}$, $p \ne p'$.
• E, a set of directed edges, contains the caching and transmission links. The caching links are across two successive intervals and are denoted as $\{(u_{\tau_m}, u_{\tau_{m+1}}) \mid \forall m \in [1, n-1], \forall u_{\tau_m} \in V_{\tau_m}\}$, while the transmission links lie in the snapshot of each time interval and are written as $\{(u_{\tau_p}, v_{\tau_p}) \mid \forall p \in [1, n], u \ne v\}$.
• C is the capacity set, and each element corresponds to the capacity of one specific edge in E. For the caching link $(u_{\tau_m}, u_{\tau_{m+1}})$, the corresponding element in C can be written as $c^{u_{\tau_m}, u_{\tau_{m+1}}}$, and represents the available buffer of satellite u between time intervals $\tau_m$ and $\tau_{m+1}$. In terms of the transmission link $(u_{\tau_m}, v_{\tau_m})$, the element is $c^{u_{\tau_m}, v_{\tau_m}}$, and can be defined by the satellite u's transmission resources and v's receiving resources in $\tau_m$. The satellite's link bandwidth can be adopted to characterize the satellite u's transceiver resources in $\tau_m$. Without loss of generality, we assume any satellite's transmission resources are equal to its receiving ones. Thus, in the following, both the transmission and receiving resources of the node u in $\tau_m$ can be written as $B_{tr}^{u_{\tau_m}}$. Moreover, the bandwidth of link $(u_{\tau_m}, v_{\tau_m})$ is limited by the smaller of $B_{tr}^{u_{\tau_m}}$ and $B_{tr}^{v_{\tau_m}}$:

$$B^{u_{\tau_m}, v_{\tau_m}} = \min\{B_{tr}^{u_{\tau_m}}, B_{tr}^{v_{\tau_m}}\}. \tag{1}$$

Then, we can define:

$$c^{u_{\tau_m}, v_{\tau_m}} = \tau_m B^{u_{\tau_m}, v_{\tau_m}}, \quad \forall (u_{\tau_m}, v_{\tau_m}) \in E. \tag{2}$$

Fig. 2. Satellite network modeled by time-expanded graph within 3 time intervals.

B. Transceiver Resource Allocation Condition

As shown in Fig. 2, the network of 4 satellites (S, A, B, D) within the time horizon ($T = \{\tau_1, \tau_2, \tau_3\}$, τ = 8 s) can be modeled by the TEG with the above definitions. The parameters at the nodes represent the nodes' transceiver resources and the unit is Mbps. Besides, the link capacity of transmission link $(S_{\tau_1}, B_{\tau_1})$ is $c^{S_{\tau_1}, B_{\tau_1}}$ = 8 MB, which indicates that the maximum amount of data S can transmit to B in $\tau_1$ is 8 MB. With respect to the caching link $(S_{\tau_1}, S_{\tau_2})$, its link capacity is $c^{S_{\tau_1}, S_{\tau_2}}$ = 10 MB. It can be checked that the amount of data that $S_{\tau_2}$ can transmit is $\tau_2 B_{tr}^{S_{\tau_2}}$ = 12 MB, which is less than $c^{S_{\tau_2}, A_{\tau_2}} + c^{S_{\tau_2}, B_{\tau_2}}$ = 14 MB. Therefore, it is necessary to allocate the transceiver resources of satellite S to its outgoing transmission links $(S_{\tau_2}, A_{\tau_2})$ and $(S_{\tau_2}, B_{\tau_2})$ in $\tau_2$. In general, for satellite u in $\tau_p$, the allocation of the transmission resources should be implemented if the following inequality holds:

$$\tau_p B_{tr}^{u_{\tau_p}} < \sum_{v_{\tau_p} \in V_{\tau_p}} c^{u_{\tau_p}, v_{\tau_p}}, \tag{3}$$

while the receiving resource allocation will be necessary if:

$$\tau_p B_{tr}^{u_{\tau_p}} < \sum_{v_{\tau_p} \in V_{\tau_p}} c^{v_{\tau_p}, u_{\tau_p}}. \tag{4}$$

Remark 1: Since the transceiver resources of a satellite are fixed during a small time interval, and by (1) and (2) the capacity of any single transmission link of $u_{\tau_p}$ is at most $\tau_p B_{tr}^{u_{\tau_p}}$, the conditions in (3) or (4) can hold only if the node has at least two outgoing or incoming transmission links, respectively; they cannot hold if the number is 1 or 0.

III. RESOURCE ALLOCATION SCHEME

A. Problem Formulation

For one TEG, the source and destination nodes can be collected into $\mathcal{S} = \{s_{\tau_p} \mid \forall p \in [1, n]\}$ and $\mathcal{D} = \{d_{\tau_p} \mid \forall p \in [1, n]\}$, respectively, and the flow through any link $(u_{\tau_p}, v_{\tau_q}) \in E$ is $f^{u_{\tau_p}, v_{\tau_q}}$. Similar to the work in [7], our objective of the transceiver resource allocation is to transmit as much data as possible from the node s to the node d within the given time T. It can be formulated as the maximal flow problem:

$$\max f_T = \max \sum_{p=1,\, u \in V - \mathcal{S}}^{n} f^{s_{\tau_p}, u_{\tau_p}} = \max \sum_{p=1,\, u \in V - \mathcal{D}}^{n} f^{u_{\tau_p}, d_{\tau_p}}, \tag{5}$$

where $f^{u_{\tau_p}, v_{\tau_q}}$ should satisfy the following constraints.

Satellite buffer constraint: The size of the stored data at the node u from the time interval $\tau_p$ to $\tau_{p+1}$ cannot exceed u's available buffer size:

$$0 \le f^{u_{\tau_p}, u_{\tau_{p+1}}} \le c^{u_{\tau_p}, u_{\tau_{p+1}}}, \quad \forall (u_{\tau_p}, u_{\tau_{p+1}}) \in E. \tag{6}$$

Flow conservation: Since the caching links are introduced in the TEG, the incoming flow of each node in $V - \{\mathcal{S}, \mathcal{D}\}$ should be equal to its outgoing flow:

$$\sum_{u_{\tau_p} \in V} f^{u_{\tau_p}, v_{\tau_q}} = \sum_{u_{\tau_p} \in V} f^{v_{\tau_q}, u_{\tau_p}}, \quad \forall v_{\tau_q} \in V - \{\mathcal{S}, \mathcal{D}\}, \tag{7}$$

where A − B contains the elements in the set A but not in B, and {A, B} collects the elements of both A and B.
Transceiver resource constraints: The flow of each transmission link should be no more than the capacity of that link, which can be described as:

$$0 \le f^{u_{\tau_p}, v_{\tau_p}} \le c^{u_{\tau_p}, v_{\tau_p}}, \quad \forall (u_{\tau_p}, v_{\tau_p}) \in E. \tag{8}$$

Furthermore, to assure the feasibility of the allocation scheme, two further constraints should be satisfied:

$$\sum_{v_{\tau_p} \in V_{\tau_p}} f^{u_{\tau_p}, v_{\tau_p}} \le \tau_p B_{tr}^{u_{\tau_p}}, \quad \forall u_{\tau_p} \in V_T, \tag{9}$$

$$\sum_{v_{\tau_p} \in V_{\tau_p}} f^{v_{\tau_p}, u_{\tau_p}} \le \tau_p B_{tr}^{u_{\tau_p}}, \quad \forall u_{\tau_p} \in V_T, \tag{10}$$

where (9) means that the flow out of a satellite u during $\tau_p$ should be within its transmission ability, while (10) states that the flow into this node in $\tau_p$ should be no more than its receiving capacity.

Fig. 3. Representation of transceiver resources.
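The node budget $\tau_p B_{tr}^{u_{\tau_p}}$ appears both in the allocation conditions (3)-(4) and in the constraints (9)-(10). The following Python sketch is only an illustration of that bookkeeping, not the authors' implementation. It assumes that Mbps means megabits per second and MB means megabytes (hence the division by 8), that $S_{\tau_2}$ has 12 Mbps (inferred from the 12 MB quoted for Fig. 2), and a hypothetical 6 MB + 8 MB split of the 14 MB total, since the individual link capacities are not given in the text.

```python
def link_capacity_mb(tau_s, b_u_mbps, b_v_mbps):
    """Capacity of a transmission link following (1)-(2), in megabytes."""
    bandwidth_mbps = min(b_u_mbps, b_v_mbps)   # eq. (1)
    return tau_s * bandwidth_mbps / 8.0        # eq. (2); assumed Mbit -> MB conversion


def node_budget_mb(tau_s, b_tr_mbps):
    """tau_p * B_tr: data a node can send (or receive) in one interval, in MB."""
    return tau_s * b_tr_mbps / 8.0


def needs_allocation(tau_s, b_tr_mbps, link_caps_mb):
    """Condition (3)/(4): the links could carry more than the node can serve."""
    return node_budget_mb(tau_s, b_tr_mbps) < sum(link_caps_mb)


def within_budget(tau_s, b_tr_mbps, link_flows_mb):
    """Constraint (9)/(10): the allocated flows must fit in the node budget."""
    return sum(link_flows_mb) <= node_budget_mb(tau_s, b_tr_mbps)


# Fig. 2 style numbers: tau = 8 s, S has 12 Mbps in tau_2 (assumed).
cap_example = link_capacity_mb(8, 8, 12)          # limited by the 8 Mbps end
must_allocate = needs_allocation(8, 12, [6, 8])   # 12 MB budget vs 14 MB of links
single_link = needs_allocation(8, 12, [8])        # one link can never exceed the budget
feasible = within_budget(8, 12, [4, 8])           # 12 MB of flow fits
infeasible = within_budget(8, 12, [6, 8])         # 14 MB of flow does not
```

With these assumed numbers, the single-link case never triggers the condition, which matches Remark 1.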
B. Representation Rule of Transceiver Resources

The existing max-flow algorithms [8] do not consider the nodes' transceiver resources, and cannot be directly applied to solve the problem in (5). To overcome this bottleneck, some virtual nodes and edges are added to the TEG to depict the nodes' transceiver resources. Theoretically, only the nodes with at least 2 incoming or outgoing transmission links need to be considered, which follows from the necessary condition of the allocation scheme in Remark 1.

For simplicity of illustration, with the satellite network in Fig. 2, we present two examples in Fig. 3. Fig. 3(a) deals with the transmission resources, while Fig. 3(b) is the representation of the receiving resources. In Fig. 3(a), $S_{\tau_2}$ is first disconnected from the nodes $A_{\tau_2}$ and $B_{\tau_2}$. Secondly, we add the virtual node $S'_{\tau_2}$ to this TEG, and connect it with the nodes $A_{\tau_2}$ and $B_{\tau_2}$. Thirdly, $S_{\tau_2}$ is linked to $S'_{\tau_2}$ to transform its transmission resources into the link capacity of the added virtual edge:

$$c^{S_{\tau_2}, S'_{\tau_2}} = \tau_2 B_{tr}^{S_{\tau_2}}. \tag{11}$$

Finally, we set $c^{S'_{\tau_2}, A_{\tau_2}} = c^{S_{\tau_2}, A_{\tau_2}}$ and $c^{S'_{\tau_2}, B_{\tau_2}} = c^{S_{\tau_2}, B_{\tau_2}}$.

With respect to the receiving resources in Fig. 3(b), similar operations can be implemented to obtain the modified TEG. Specifically, the added virtual edge is $(D'_{\tau_1}, D_{\tau_1})$, and its link capacity is:

$$c^{D'_{\tau_1}, D_{\tau_1}} = \tau_1 B_{tr}^{D_{\tau_1}}, \tag{12}$$

which equivalently quantizes the constraint of the receiving resources at node $D_{\tau_1}$.

For completeness, we list the detailed steps in Algorithm 1 for the representation of transceiver resources in TEG.

Algorithm 1 The Representation of Transceiver Resources Over TEG
Input: TEG = $\{T, V_T, E, C\}$.
Output: The modified TEG with the representation of transceiver resources.
Define $D_{in}(u_{\tau_p})$ and $D_{out}(u_{\tau_p})$ as the number of the incoming transmission links and that of the outgoing transmission links at the node $u_{\tau_p}$, respectively, $\forall u_{\tau_p} \in V_T$;
The set $A(\overrightarrow{u_{\tau_p}})$ contains the head nodes $h_{\tau_p}$ of the transmission links $(h_{\tau_p}, u_{\tau_p})$, while the set $A(\overleftarrow{u_{\tau_p}})$ is formed by the tail nodes $l_{\tau_p}$ of the transmission links $(u_{\tau_p}, l_{\tau_p})$, where $h_{\tau_p}, u_{\tau_p}, l_{\tau_p} \in V_T$;
for each node $u_{\tau_p} \in V_T$ with $D_{out}(u_{\tau_p}) \ge 2$ do
    Separately add the virtual node $u'_{\tau_p}$ and the virtual edge $(u_{\tau_p}, u'_{\tau_p})$ into $V_T$ and E, and set $c^{u_{\tau_p}, u'_{\tau_p}} = \tau_p B_{tr}^{u_{\tau_p}}$;
    for $v_{\tau_p} \in A(\overleftarrow{u_{\tau_p}})$ do
        Plug $(u'_{\tau_p}, v_{\tau_p})$ into E, $c^{u'_{\tau_p}, v_{\tau_p}} = c^{u_{\tau_p}, v_{\tau_p}}$;
        Delete $(u_{\tau_p}, v_{\tau_p})$ from E;
    end for
end for
for each node $v_{\tau_p} \in V_T$ with $D_{in}(v_{\tau_p}) \ge 2$ do
    Add the virtual node $v'_{\tau_p}$ and the virtual edge $(v'_{\tau_p}, v_{\tau_p})$ into $V_T$ and E, respectively, and fix $c^{v'_{\tau_p}, v_{\tau_p}} = \tau_p B_{tr}^{v_{\tau_p}}$;
    for $u_{\tau_p} \in A(\overrightarrow{v_{\tau_p}})$ do
        Put $(u_{\tau_p}, v'_{\tau_p})$ into E, and define $c^{u_{\tau_p}, v'_{\tau_p}} = c^{u_{\tau_p}, v_{\tau_p}}$;
        Remove $(u_{\tau_p}, v_{\tau_p})$ from E;
    end for
end for

Remark 2: With some virtual nodes and edges added in the TEG, the nodes' transceiver constraints in (9) and (10) are transformed into the added edges' link constraints (8) and the added nodes' flow conservation (7). For the modified TEG, we can formulate the max-flow problem with the objective (5) and the constraints (6), (7) and (8). Then, the joint transceiver resource allocation for multiple nodes can be achieved by solving the max-flow problem.

Remark 3: With the expansion of the network topology along the time, the max-flow problem of the single-source single-destination over the satellite network will
be changed into that of the multiple-source multiple-destination under the modified TEG, which is an important characteristic of the TEG. We can add super-source and super-destination nodes into the modified TEG to solve the max-flow problem with the Ford-Fulkerson algorithm [9].

IV. SIMULATIONS

We conduct our simulations over a predictable low-orbit satellite network, which consists of 6 orbital planes with one satellite on each plane. All the satellites are arbitrarily selected from the Iridium constellation, at a height of 780 km and an inclination of 86.4°. Moreover, we utilize the Satellite Tool Kit (STK) simulator to generate the contact topology. The small time interval τ is set as 1 minute.

To explore the performance of our scheme, we conduct two simulation cases. In the first scenario, the time T is set as 12 hours, and each satellite possesses infinite storage size, with transceiver resources set as 30 Mbps. In the second case, T = 6 hours, and the storage size of any satellite varies from 0 MB to 4500 MB. In this case, three transceiver resource settings, i.e., 20 Mbps, 30 Mbps, and 40 Mbps, are examined. Both the source and the destination are randomly selected.

Fig. 4. The curves of the maximum flow versus the given time period T for 6 satellites.

In Fig. 4, we compare our proposed TEG-based allocation scheme with the random and the average allocation schemes. For the random scheme, the nodes randomly allocate the transceiver resources to their outgoing links when (3) holds and to their incoming transmission links when (4) holds, while the average scheme allocates the transceiver resources to the corresponding links evenly. As shown in Fig. 4, the network maximum flow of all methods gradually increases with T, because more contacts are generated as time increases. It can also be seen that our proposed scheme achieves a larger network flow than the other two schemes. This is expected: our proposed scheme jointly considers the transceiver constraints at multiple nodes, while both the random and the average schemes deal with these constraints independently, one node at a time.

Fig. 5. The curves of the maximum flow versus the different satellite caching with given time period T = 6 hours.

We present the impact of the nodes' caching on the network's max-flow in Fig. 5, where the parameters of the second scenario are used. It can be seen from Fig. 5 that the network flow increases with the caching size when the caching size is small. For large caching sizes, however, enlarging the cache brings no further gain in network flow. This observation can guide network design.

V. CONCLUSION

In this letter, we propose a TEG-based transceiver resource allocation scheme for satellite networks to maximize the network flow and resource utilization. We construct an effective representation rule of the network's transceiver resources, which is based on adding both virtual nodes and edges. With this rule, we obtain the modified TEG, and implement joint multi-node resource allocation by solving the max-flow problem over the modified TEG.

REFERENCES

[1] C. Caini, H. Cruickshank, S. Farrell, and M. Marchese, “Delay- and disruption-tolerant networking (DTN): An alternative solution for future satellite networking applications,” Proc. IEEE, vol. 99, no. 11, pp. 1980–1997, Nov. 2011.
[2] G. Araniti et al., “Contact graph routing in DTN space networks: Overview, enhancements and performance,” IEEE Commun. Mag., vol. 53, no. 3, pp. 38–46, Mar. 2015.
[3] S. E. Alaoui and B. Ramamurthy, “Routing optimization for DTN-based space networks using a temporal graph model,” in Proc. IEEE ICC, May 2016, pp. 1–6.
[4] Z. Zhang, C. Jiang, S. Guo, Y. Qian, and Y. Ren, “Temporal centrality-balanced traffic management for space satellite networks,” IEEE Trans. Veh. Technol., vol. 67, no. 5, pp. 4427–4439, May 2018.
[5] X. Zhu, C. Jiang, L. Kuang, N. Ge, and J. Lu, “Non-orthogonal multiple access based integrated terrestrial-satellite networks,” IEEE J. Sel. Areas Commun., vol. 35, no. 10, pp. 2253–2267, Oct. 2017.
[6] L. R. Ford and D. R. Fulkerson, Flows in Networks. Princeton, NJ, USA: Princeton Univ. Press, 1962.
[7] Y. Li, C. Song, D. Jin, and S. Chen, “A dynamic graph optimization framework for multihop device-to-device communication underlaying cellular networks,” IEEE Wireless Commun., vol. 21, no. 5, pp. 52–61, Oct. 2014.
[8] A. V. Goldberg and R. E. Tarjan, “Efficient maximum flow algorithms,” Commun. ACM, vol. 57, no. 8, pp. 82–89, Aug. 2014.
[9] L. R. Ford and D. R. Fulkerson, “Constructing maximal dynamic flows from static flows,” Oper. Res., vol. 6, no. 3, pp. 419–433, 1958.
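As a rough end-to-end illustration of Algorithm 1 and Remark 3 in the letter above, the Python sketch below splits every node that has two or more outgoing (or incoming) transmission links, adds a super-source and a super-destination, and runs a breadth-first augmenting-path max-flow (Edmonds-Karp, one variant of the Ford-Fulkerson method). It is not the authors' code. The node naming, the `_out`/`_in` suffixes, the `SRC`/`SNK` labels and all capacities are hypothetical, and the topology only loosely follows Fig. 1.

```python
from collections import defaultdict, deque


def split_nodes(cap, budget):
    """Node splitting in the spirit of Algorithm 1.

    cap    : {(u, v): capacity}; a node is a (satellite, interval) tuple
    budget : {node: tau_p * B_tr}, the per-interval transceiver budget
    """
    cap = dict(cap)

    def tx_links():
        # Transmission links join two nodes of the same interval.
        return [(u, v) for (u, v) in cap if u[1] == v[1]]

    # Outgoing side: u -> u_out -> heads, with c(u, u_out) = budget[u].
    heads = defaultdict(list)
    for u, v in tx_links():
        heads[u].append(v)
    for u, vs in heads.items():
        if len(vs) >= 2:
            u_out = (u[0] + "_out", u[1])
            cap[(u, u_out)] = budget[u]
            for v in vs:
                cap[(u_out, v)] = cap.pop((u, v))

    # Incoming side: tails -> v_in -> v, with c(v_in, v) = budget[v].
    # Tails are recomputed because some may now be virtual "_out" nodes.
    tails = defaultdict(list)
    for u, v in tx_links():
        tails[v].append(u)
    for v, us in tails.items():
        if v in budget and len(us) >= 2:
            v_in = (v[0] + "_in", v[1])
            cap[(v_in, v)] = budget[v]
            for u in us:
                cap[(u, v_in)] = cap.pop((u, v))
    return cap


def max_flow(cap, source, sink):
    """Edmonds-Karp: repeatedly augment along a shortest residual path."""
    residual = defaultdict(lambda: defaultdict(float))
    for (u, v), c in cap.items():
        residual[u][v] += c
        residual[v][u] += 0.0          # make sure the reverse edge exists
    total = 0.0
    while True:
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, r in residual[u].items():
                if r > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return total
        path, v = [], sink
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        push = min(residual[u][v] for u, v in path)
        for u, v in path:
            residual[u][v] -= push
            residual[v][u] += push
        total += push


def teg_max_flow(cap, sources, sinks):
    """Attach super nodes (Remark 3) and solve the max-flow problem."""
    full = dict(cap)
    big = sum(cap.values())            # finite stand-in for "unlimited"
    for s in sources:
        full[("SRC", s)] = big
    for d in sinks:
        full[(d, "SNK")] = big
    return max_flow(full, "SRC", "SNK")


# Hypothetical two-interval network: S reaches A, B in tau_1; A, B reach D in tau_2.
S1, S2, A1, A2 = ("S", 1), ("S", 2), ("A", 1), ("A", 2)
B1, B2, D1, D2 = ("B", 1), ("B", 2), ("D", 1), ("D", 2)
cap = {(S1, A1): 8, (S1, B1): 8, (A2, D2): 8, (B2, D2): 8,      # transmission links
       (S1, S2): 10, (A1, A2): 10, (B1, B2): 10, (D1, D2): 10}  # caching links
budget = {node: 12 for node in (S1, S2, A1, A2, B1, B2, D1, D2)}

plain = teg_max_flow(cap, [S1, S2], [D1, D2])                       # ignores budgets
joint = teg_max_flow(split_nodes(cap, budget), [S1, S2], [D1, D2])  # respects (9)-(10)
```

In this toy instance the unmodified TEG would report more flow than S and D can actually serve, while the modified TEG caps the result at the 12-unit node budget.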
A Novel Frequency Allocation Scheme for In Band

Full Duplex Systems in 5G Networks
Parthiban Annamalai, Jyotsna Bapat, Member, IEEE, and Debabrata Das, Senior Member, IEEE

Abstract—In band full duplex (IBFD) is an emerging transceiver technology that facilitates simultaneous transmission and reception on the same frequency to double spectral efficiency. Self-interference (SI) from its own transmitter is the biggest challenge in IBFD radios, causing significant degradation of receiver performance. SI cancellation (SIC) techniques proposed in the literature involve considerable hardware and software complexity for practical realization of IBFD radios. In this letter, a hybrid cellular architecture is proposed, composed of an IBFD base station and legacy half duplex user equipments (UEs), thus avoiding the complex SIC requirement at the UE. A novel frequency allocation scheme is proposed for this hybrid architecture that allows sharing of carrier frequencies among UEs based on distance criteria. As the number of UEs increases, the chances of finding UEs that satisfy the distance criteria also increase steadily. Simulation results reveal that the probability of 100% frequency reuse exceeds 0.9 for as few as 8 UEs, demonstrating the potential of the proposed idea.

Index Terms—In band full duplex, frequency reuse, spectral efficiency, data rate, 5G network.

Manuscript received August 15, 2018; accepted September 25, 2018. Date of publication October 1, 2018; date of current version April 9, 2019. The associate editor coordinating the review of this paper and approving it for publication was W. Zhang. (Corresponding author: Parthiban Annamalai.)
P. Annamalai is with Intel, Bengaluru 560103, India, and also a Research Scholar with the International Institute of Information Technology Bangalore, Bengaluru 560100, India (e-mail: parthiban.annamalai@intel.com).
J. Bapat and D. Das are with the International Institute of Information Technology Bangalore, Bengaluru 560100, India (e-mail: jbapat@iiitb.ac.in; ddas@iiitb.ac.in).
Digital Object Identifier 10.1109/LWC.2018.2873316

I. INTRODUCTION

IMPROVING Spectral Efficiency (SE) is a preferred approach to meet the ever increasing data rate requirement of rapidly emerging applications, and it receives special focus in upcoming 5G [1]. Traditional cellular systems, being Half Duplex (HD), require two different carrier frequencies for uplink (UL) and downlink (DL), respectively. An emerging In Band Full Duplex (IBFD) or Full Duplex (FD) technique proposes the usage of the same carrier frequency for both UL and DL at the same time [2]. This can effectively double SE and has thus gained significant attention recently. It is a promising alternative to legacy HD systems and a potential candidate for upcoming 5G [3].

Self-Interference (SI) is the biggest challenge that an IBFD radio must overcome. SI is caused by its own strong UL signal interfering with a relatively weaker DL signal present in the same frequency band at the same time. To understand the magnitude of SI, a practical example from LTE-A is considered, where a Wide Area (Macro) BS can transmit at a maximum power of 46 dBm and its receiver sensitivity power limit (REFSENS) is −101.5 dBm [4]. For an assumed transmit-receive RF path isolation of 15 dB [2], the SI power at the receiver is 46 − 15 = 31 dBm, i.e., 31 − (−101.5) = 132.5 dB above the BS's REFSENS. Thus, for an FD receiver to achieve the same link level Signal to Noise Ratio (SNR) as that of an HD device, SI power should be suppressed by at least 133 dB to make IBFD systems realistic.

Recent breakthroughs in SI cancellation (SIC) techniques have demonstrated the feasibility of IBFD by suppressing SI to a tolerable limit. SIC generally happens at three stages: propagation domain, analog domain and digital domain [5]. The additional hardware and intensive processing required for SIC result in increased cost, size and power consumption. These complexities make it difficult to realize an IBFD radio on every User Equipment (UE), and this may not be possible in the near future. However, making a BS IBFD capable is quite possible since the restrictions on cost, size and power are not as stringent for a BS. SIC of up to 110 dB has been achieved so far, as mentioned in [6]. Considering the total cancellation requirement of 133 dB for the LTE-A scenario, the remaining gap of 23 dB could be closed using larger TX/RX antenna spacing and other shielding mechanisms possible at the BS. Taking the above factors into account, a hybrid cellular architecture composed of an IBFD capable BS and legacy HD UEs is proposed [7].

A frequency allocation scheme for the hybrid architecture must consider the varying capabilities of BS and UEs to maximize SE. A hybrid architecture composed of an IBFD relay and HD UEs is discussed in [8]. The relay operates either in FD or HD mode based on the instantaneous Channel State Information (CSI). The allocation scheme chooses a suitable mode locally for the relay to improve SE. Another hybrid architecture discussed in [9] divided all HD UEs into two groups and assumed that the two groups are sufficiently separated to avoid in-band interference between them. Each group is configured to operate either in UL only or DL only mode and carrier frequencies were allocated accordingly.

As a first of its kind, a novel frequency allocation scheme is proposed in this letter for the hybrid cellular architecture. This allocation algorithm runs at the BS and works generically for all random distribution scenarios of HD UEs. The proposed scheme allocates one frequency for UL of UEi and DL to UEj and another frequency for UL of UEj and DL to UEi (two frequencies for two HD UEs) to improve frequency reuse for better SE, while a legacy HD system requires four frequencies for two HD UEs. To prevent potential in-band interference, UEi and UEj must be sufficiently separated. It should be noted that each frequency can be allocated only twice between a pair of UEs, and 100% frequency reuse is achieved in a BS when all UEs share the frequencies. Simulation results reveal that, for as few as 8 UEs, the probability of 100% frequency reuse ($p_{100\%}$) exceeds 0.9. As the number of UEs increases,
ANNAMALAI et al.: NOVEL FREQUENCY ALLOCATION SCHEME FOR IBFD SYSTEMS IN 5G NETWORKS 365
TABLE I
F REQUENCY A LLOCATION P OSSIBILITIES FOR T WIN AND T RIPLET S HARING FOR N = 2 AND N = 3
levels as well as interference from neighboring cells, only UEs

within Rg are considered eligible for frequency sharing.
Let N (≤ M) be the total number of UEs that are present
within Rg . BS uses the same location information of UEs to
compute the inter-UE distance (dij ) among all N UEs and
populate the distance matrix A as,
⎡ ⎤
d11 d12 . . . d1N
⎢ d21 d22 . . . d2N ⎥
⎢ ⎥
AN ×N = ⎢ . . .. .. ⎥. (1)
⎣ .. .. . . ⎦
dN 1 dN 2 ... dNN
It can be observed that this matrix is symmetric with its diag-
onal elements equal to ‘0’. A set D containing only the upper
Fig. 1. Illustration of Proposed Cellular Architecture. triangular elements of A with cardinality |D| = N(N-1)/2 is
defined as,
D = {d12 , . . . d1N , d23 , . . . , d2N , dN −1N }. (2)
for every UE, the probability of finding another UE located at
sufficient distance also increases rapidly. This results in p100% This inter-UE distance set D in (2) forms a cornerstone for
approaching towards 1, illustrating the great potential of our the proposed allocation scheme.
proposed idea. Restricting the complex SIC requirement only To improve frequency reuse within a BS, a carrier frequency
to BS and leaving the legacy HD UEs unaffected while still is allocated for UL to UEi as well as for DL to UEj . Such an
achieving SE close to complete IBFD (both BS and UEs are allocation may result in potential in-band interference if these
IBFD capable) makes our solution far more attractive. two UEs are closely located. The minimum distance between
This letter is organized as follows. Section II describes the two UEs that ensures this in-band interference attenuating
the proposed system model in detail. Frequency alloca- to an accepted level is termed as dmin . It depends upon the
tion methodology of the proposed scheme is elaborated in maximum transmit power of UE, its REFSENS, channel char-
Section III. Section IV discusses the results obtained. Finally, Section V concludes this letter with a list of future work items.

II. PROPOSED SYSTEM MODEL

A typical LTE-A cellular network is shown in Fig. 1, where M UEs are assumed to be attached to a BS and are randomly distributed within its coverage area (Re). For the proposed scheme, the BS should be aware of the geographical location coordinates (x, y, z) of all M UEs in the region R^3, updated at periodic intervals. There are well-defined UE positioning techniques in 3GPP [10] that the BS can use to find the location of the UEs. The BS uses this absolute location information to compute the distance (di) of each UEi from itself, in order to distinguish the UEs in the good coverage region (Rg) as shown in Fig. 1. Since cell edge UEs generally suffer from low signal

acteristics and path loss parameters. The detailed relationship for dmin, with an example computation, is given in the Appendix. It is important to mention that a carrier frequency can only be used twice: once for the UL of UEi and once for the DL of UEj.

In the case of two UEs, only two possible cases exist, as shown in Table I. If d12 ≥ dmin, the two UEs will be paired to achieve 100% frequency reuse; otherwise they will be left unpaired. The probability that two UEs satisfy the dmin criterion depends upon their geographical locations within Rg and the total area of Rg itself. In this letter, for simplicity, we have assumed both cases to be equi-probable, i.e., p(d12 ≥ dmin) = p(d12 < dmin) = 1/2, but our system is provisioned to accommodate any variations. For N = 3, four cases (|D| + 1) are possible, as listed in Table I. To achieve 100% frequency reuse, all three inter-UE distances must satisfy the dmin criterion, and that probability is 1/8. For a higher number of UEs, it is not necessary that all dij's ∈ D meet the dmin criterion; only the right set of dij's ∈ D must do so to achieve 100% frequency reuse. E.g., for N = 4, it is sufficient that the dij's ∈ D of any two disjoint UE pairs ({d12, d34} or {d13, d24} or ...) satisfy the dmin criterion. The primary task of the proposed scheme is to identify such UE sets based on their inter-UE distances.
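The dmin threshold used in the pairing rule follows from the path loss model (A.1) and the link budget (A.2) in the Appendix. A minimal sketch of that computation is given here. It assumes that d in (A.1) is expressed in kilometers, which is the usual COST-231 Hata convention, and the function and parameter names are hypothetical, chosen only for illustration.

```python
import math

def cost_hata_path_loss_db(d_km, f_mhz=1960.0, h_b=1.5, h_r=1.5, c_db=3.0):
    """Path loss of (A.1) in dB. ASSUMPTION: d_km is the UE-to-UE distance in km."""
    a = (1.1 * math.log10(f_mhz) - 0.7) * h_r - (1.56 * math.log10(f_mhz) - 0.8)
    return (46.3 + 33.9 * math.log10(f_mhz) - 13.82 * math.log10(h_b) - a
            + (44.9 - 6.55 * math.log10(h_b)) * math.log10(d_km) + c_db)

def d_min_m(p_tx_dbm=23.0, refsens_dbm=-97.7, gains_db=0.0, h_b=1.5, **model):
    """Distance in meters at which the received power of (A.2) equals REFSENS."""
    # Path loss that brings the transmitted power down to REFSENS.
    target_loss_db = p_tx_dbm + gains_db - refsens_dbm
    # (A.1) is linear in log10(d): PL(d) = PL(1 km) + slope * log10(d).
    loss_at_1km = cost_hata_path_loss_db(1.0, h_b=h_b, **model)
    slope = 44.9 - 6.55 * math.log10(h_b)
    return 1000.0 * 10 ** ((target_loss_db - loss_at_1km) / slope)

if __name__ == "__main__":
    print(f"d_min = {d_min_m():.1f} m")
```

With the Appendix parameters (1960 MHz, 1.5 m antenna heights, C = 3 dB, 23 dBm transmit power, REFSENS of −97.7 dBm and 0 dB antenna gains), this sketch evaluates to roughly 137 m, which is consistent with the value quoted in the Appendix.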
Fig. 2. Twin(s) and Triplet sharing examples for different N from 2 to 5.

TABLE II
COMBINATORIAL ANALYSIS FOR DIFFERENT N

III. FREQUENCY ALLOCATION SCHEME

Consider a set S ⊂ D containing the inter-UE distances dij ∈ D that satisfy dij ≥ dmin, defined as

$S = \{ d_{ij} \mid 1 \leq i \leq N-1;\ 2 \leq j \leq N \}$   (3)

where its cardinality |S| can vary from 0 to |D|. As per the proposed allocation scheme, UEs are grouped as either a twin or a triplet to share frequencies among themselves. In twin sharing, a carrier frequency fp is allocated for UL to UEi as well as for DL to UEj, and another carrier frequency fq is allocated for UL to UEj as well as for DL to UEi, provided dij ≥ dmin. It is also possible to share three carrier frequencies among three UEs, provided all three inter-UE distances satisfy the above dmin criterion (dij, dik, djk ≥ dmin). Such a sharing is referred to as triplet sharing. In this case, carrier frequency fp is allocated for UL to UEi and DL to UEj, fq is allocated for UL to UEj and DL to UEk, and fr is allocated for UL to UEk and DL to UEi. Frequency sharing among more than three UEs does not offer any additional advantage.
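The twin and triplet rules above amount to a cyclic assignment: each UE uploads on its own carrier, and that carrier is reused for the downlink of the next UE in the group. The following is a minimal sketch of that rule; the helper names `is_eligible` and `allocate` and the dictionary-based distance table are assumptions made for illustration and are not taken from the letter.

```python
from itertools import combinations

def is_eligible(group, dist, d_min):
    """True if every inter-UE distance inside the group is at least d_min."""
    return all(dist[frozenset(pair)] >= d_min for pair in combinations(group, 2))

def allocate(group, carriers):
    """Map each UE of a twin or triplet to its UL and DL carrier.

    The UE at position m uploads on carriers[m]; the same carrier serves the
    DL of the next UE, so the UE at position m receives on carriers[m - 1].
    """
    k = len(group)
    if k not in (2, 3) or len(carriers) != k:
        raise ValueError("expected a twin or triplet with one carrier per UE")
    return {ue: {"UL": carriers[m], "DL": carriers[(m - 1) % k]}
            for m, ue in enumerate(group)}

if __name__ == "__main__":
    # Hypothetical distances in meters between three UEs.
    dist = {frozenset(("UEi", "UEj")): 150.0,
            frozenset(("UEi", "UEk")): 300.0,
            frozenset(("UEj", "UEk")): 160.0}
    if is_eligible(("UEi", "UEj", "UEk"), dist, d_min=137.0):
        print(allocate(("UEi", "UEj", "UEk"), ("fp", "fq", "fr")))
```

For a triplet (UEi, UEj, UEk) with carriers (fp, fq, fr), the mapping gives fp to the UL of UEi and the DL of UEj, fq to the UL of UEj and the DL of UEk, and fr to the UL of UEk and the DL of UEi, matching the description above.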
For even N, the necessary condition to achieve 100% frequency reuse is that, for a set E ⊂ S in (4), its cardinality should be such that |E| ≥ N/2.

$E = \{ d_{ij}, d_{kl}, \ldots, d_{N-1\,N} \in S \mid i \neq j \neq \cdots \neq N \} \neq \{\emptyset\};\quad N \geq 2$   (4)

For odd N (≥ 3), complete twin sharing is not possible. Only N−3 UEs can be twin shared (as (N−3)/2 twins), while the remaining 3 UEs must be shared as a triplet. In order to achieve 100% frequency reuse for odd N, for a set O ⊂ S in (5), its cardinality should be such that |O| ≥ 3 + (N−3)/2.

$O = \{ d_{lm}, d_{ln}, d_{mn}, d_{ij}, d_{kl}, \ldots, d_{N-1\,N} \in S \mid l \neq m \neq n \neq i \neq j \neq \cdots \neq N \} \neq \{\emptyset\};\quad N \geq 3$   (5)

There are multiple combinations of dij's ∈ S that can satisfy the cardinality conditions mentioned in (4) and (5). Fig. 2 illustrates an example set of dij's ∈ S achieving 100% frequency reuse, composed of twin(s) and/or triplet sharing, for N varying from 2 to 5. A double-colored circle in these figures symbolically denotes a UE, with the right color referring to the UL frequency and the left color to the DL frequency.

As the number of eligible UEs (N) increases, identifying the right set of UEs for twin and/or triplet sharing becomes a challenging task. For N UEs, the number of inter-UE distances is |D| and the number of all combinations (whether each dij ∈ D is greater or less than dmin, as in Table I) is denoted by KN (= 2^|D|). Of these KN combinations, let CN be the number of all possible combinations of dij's ∈ S that can achieve 100% frequency reuse. It can be seen from Table II that, as N increases, the value of KN also increases exponentially, making the determination of CN computationally intensive. Values of CN for N up to 8 have been determined after exhaustive search. These values show a clear trend that, as N increases, the number of combinations that would lead to 100% frequency reuse also increases rapidly.

IV. RESULTS AND DISCUSSION

The allocation scheme proposed in this letter facilitates frequency reuse among UEs to maximize SE for LTE-A. It can achieve 100% frequency reuse when all UEs are within Rg of the BS (N = M) and are either twin or triplet shared. As a measure of the efficiency of the proposed scheme, the probability of 100% frequency reuse (p100%) is defined and estimated as the ratio of CN to KN, as in Table II. Simulations were carried out to identify all 100% frequency reuse possibilities to determine CN. Fig. 3 depicts the estimated p100% for N up to 8, assuming all UEs are within Rg (N = M). For N = 2, p100% is 1/2. As N increases, both CN and KN grow exponentially, and p100% tends toward 1 since the number of UE pairing possibilities also increases. This probability remains 0 for legacy HD systems, as no frequency reuse is possible. For odd N, p100% is lower than that of the preceding even N. E.g., p100% is 0.36 for N = 5 while it is 0.58 for N = 4, resulting in a saw-tooth shaped curve. This phenomenon can be explained as follows: the probability of achieving twin sharing for N = 2 is 1/2, while that of triplet sharing for N = 3 is much lower, i.e., 1/8 (Table I). For odd N (≥ 3), there must be at least one triplet, and the remaining N−3 UEs should be twin shared to achieve 100% frequency reuse. As N increases, the contribution percentage of the triplet reduces, as seen in Fig. 4, and the difference between even and odd N starts to narrow, as observed in Fig. 3. Even with only 8 UEs, p100% already exceeds 0.9, illustrating the potential of the proposed idea.
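The exhaustive search behind CN, KN and p100% can be sketched for small N as follows. Every one of the 2^|D| outcomes (each distance either meeting dmin or not) is enumerated, and an outcome counts toward CN when all UEs can be grouped into twins and triplets whose internal distances all meet dmin. This is an illustrative sketch only: the function names are hypothetical, the outcomes are treated as equi-probable as in the text, and allowing any mix of twins and triplets for every N is an assumption that may be broader than the conditions (4) and (5).

```python
from itertools import combinations

def can_partition(ues, ok):
    """True if all UEs can be grouped into twins/triplets with eligible pairs only."""
    if not ues:
        return True
    first, rest = ues[0], ues[1:]
    for j in rest:  # try the twin {first, j}
        if ok(first, j) and can_partition(tuple(u for u in rest if u != j), ok):
            return True
    for j, k in combinations(rest, 2):  # try the triplet {first, j, k}
        if ok(first, j) and ok(first, k) and ok(j, k):
            if can_partition(tuple(u for u in rest if u not in (j, k)), ok):
                return True
    return False

def count_full_reuse(n):
    """Return (C_N, K_N) by enumerating all outcomes of the |D| = n(n-1)/2 distances."""
    pairs = list(combinations(range(n), 2))
    k_n = 2 ** len(pairs)
    c_n = 0
    for mask in range(k_n):
        # Bit b of mask says whether distance pairs[b] is >= d_min.
        eligible = {p for b, p in enumerate(pairs) if (mask >> b) & 1}
        if can_partition(tuple(range(n)),
                         lambda u, v: (min(u, v), max(u, v)) in eligible):
            c_n += 1
    return c_n, k_n

if __name__ == "__main__":
    for n in range(2, 6):
        c_n, k_n = count_full_reuse(n)
        print(n, c_n, k_n, round(c_n / k_n, 2))
```

Under these assumptions the sketch gives 1/2 for N = 2, 1/8 for N = 3 and 37/64 ≈ 0.58 for N = 4, in line with the values stated above. The enumeration grows as 2^(N(N−1)/2), so plain Python is only practical for small N; N = 8 already involves 2^28 outcomes.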
ANNAMALAI et al.: NOVEL FREQUENCY ALLOCATION SCHEME FOR IBFD SYSTEMS IN 5G NETWORKS 367
Fig. 3. Improvement in Probability of 100% Frequency Reuse (p100%).

Fig. 4. Twin-Triplet Sharing Contribution.

V. CONCLUSION AND FUTURE WORK

The frequency allocation scheme proposed in this letter allows the BS to allocate a single carrier frequency to two UEs to potentially double SE. The BS identifies such UEs based on their inter-UE distances for frequency sharing. Grouping UEs in smaller units (of either two or three) is found to be most efficient, and accordingly twin/triplet sharing is proposed. The achievable frequency reuse depends upon the geographical distribution of UEs; better reuse is achieved when UEs are randomly distributed within the good coverage area of the BS. The proposed solution has the potential to reach 100% frequency reuse as the number of UEs (N) increases. It is important to note that these improvements are achieved while retaining legacy HD UEs, which are less complex than IBFD UEs, and requiring IBFD capability only at the BS. The additional complexity and cost to realize IBFD capability at the BS can be justified by the SE improvement gained by the proposed approach. Since the costs of acquiring additional spectrum are far higher, the proposed approach will be more attractive to service providers despite the additional costs required for IBFD capability in the BS.

In this letter, the combinatorial analysis based on inter-UE distances to allow frequency sharing was done using an exhaustive search algorithm for up to 8 UEs. This needs to be extended systematically to obtain similar metrics for any higher values of N. In addition, the mobility of UEs must be accommodated to identify new sets of UEs for frequency sharing at regular scheduling intervals.

APPENDIX
EXAMPLE COMPUTATION OF dmin

The COST HATA path loss model (between two UEs) given in (A.1) is used to compute the received power at the LTE-A UE.

$PL_{dB} = 46.3 + 33.9\log_{10}(f) - 13.82\log_{10}(h_B) - a(h_R, f) + (44.9 - 6.55\log_{10}(h_B))\log_{10}(d) + C$   (A.1)

where $a(h_R, f) = (1.1\log_{10}(f) - 0.7)\,h_R - (1.56\log_{10}(f) - 0.8)$ and
f - Carrier Frequency (1960 MHz, E-UTRA Band 2)
d - Distance between the UEs (in km when evaluating (A.1); dmin is quoted below in meters)
hB, hR - Height of Base Station/UE (1.5 meters)
C - 3 dB for the metropolitan area considered

The received power (dBm) at UEj from UEi is given by

$P_{Rj} = P_{Ti} + G_{Rj} + G_{Ti} - PL_{ij}$   (A.2)

where
PTi, GTi - Transmit power and antenna gain (0 dB) of UEi
PRj, GRj - Receive power and antenna gain (0 dB) of UEj
PLij - Path loss between UEi and UEj as defined in (A.1)

The minimum value of d for which the UE's received power (A.2), after substituting the path loss from (A.1), falls to its REFSENS is termed dmin. For an LTE-A UE operating at 10 MHz bandwidth in E-UTRA Band 2, the REFSENS is −97.7 dBm for QPSK and the maximum transmit power is 23 dBm [11]. To pair UEi with UEj, dmin between them should be such that the UL power of 23 dBm transmitted by UEi attenuates below the REFSENS of −97.7 dBm when received at UEj; this distance is estimated as 137 meters.

REFERENCES

[1] “IMT vision—Framework and overall objectives of future development of IMT for 2020 and beyond,” Int. Telecommun. Union, Geneva, Switzerland, ITU-Recommendation M.2083-0, Sep. 2015.
[2] A. Sabharwal et al., “In-band full-duplex wireless: Challenges and opportunities,” IEEE J. Sel. Areas Commun., vol. 32, no. 9, pp. 1637–1652, Sep. 2014.
[3] S. Hong et al., “Applications of self-interference cancellation in 5G and beyond,” IEEE Commun. Mag., vol. 52, no. 2, pp. 114–121, Feb. 2014.
[4] Base Station (BS) Radio Transmission and Reception, V15.3.0, 3GPP Standard TS 36.104, Jul. 2018.
[5] Z. Zhang, K. Long, A. V. Vasilakos, and L. Hanzo, “Full-duplex wireless communications: Challenges, solutions, and future research directions,” Proc. IEEE, vol. 104, no. 7, pp. 1369–1409, Jul. 2016.
[6] D. Bharadia, E. McMilin, and S. Katti, “Full duplex radios,” in Proc. ACM SIGCOMM, pp. 375–386, Aug. 2013.
[7] D. Korpi et al., “Full-duplex mobile device: Pushing the limits,” IEEE Commun. Mag., vol. 54, no. 9, pp. 80–87, Sep. 2016.
[8] T. Riihonen, S. Werner, and R. Wichman, “Hybrid full-duplex/half-duplex relaying with transmit power adaptation,” IEEE Trans. Wireless Commun., vol. 10, no. 9, pp. 3074–3085, Sep. 2011.
[9] A. C. Cirik, K. Rikkinen, R. Wang, and Y. Hua, “Resource allocation in full-duplex OFDMA systems with partial channel state information,” in Proc. IEEE China Summit Int. Conf. Signal Inf. Process., pp. 711–715, Jul. 2015.
[10] Stage 2 Functional Specification of User Equipment (UE) Positioning in E-UTRAN, V14.3.0, 3GPP Standard TS 36.305, Oct. 2017.
[11] User Equipment (UE) Radio Transmission and Reception, V14.7.0, 3GPP Standard TS 36.101, Apr. 2018.
Optimal Transmission Scheduling in Small Multimodal Underwater Networks

Filippo Campagnaro, Paolo Casari, Senior Member, IEEE, Michele Zorzi, Fellow, IEEE, and Roee Diamant, Senior Member, IEEE

Abstract—We describe a scheduling protocol for multimodal networks of relatively limited size, whose nodes encompass various underwater communication technologies. For such a case, we show that significant improvement in the network operations is possible when the transmission schedule is set to jointly utilize all communication technologies. Our solution is based on per-technology time-division multiple access frames, whose time slots are determined optimally to maximize the overall channel utilization while preserving flow limitations and maintaining fairness in resource allocation. Our numerical simulations and experimental results for multimodal networks with several acoustic technologies show that, while maintaining a fair resource allocation, our scheduling solution provides both high throughput and low packet delivery delay.

Index Terms—Underwater communication networks, underwater acoustic communications, multi-modal systems, transmission scheduling.

Manuscript received July 25, 2018; revised September 10, 2018; accepted September 17, 2018. Date of publication October 1, 2018; date of current version April 9, 2019. This work was supported by the NATO Science for Peace and Security Programme under Grant G5293. The associate editor coordinating the review of this paper and approving it for publication was S. De. (Corresponding author: Filippo Campagnaro.)
F. Campagnaro and M. Zorzi are with the Department of Information Engineering, University of Padova, 35131 Padua, Italy (e-mail: campagn1@dei.unipd.it; zorzi@dei.unipd.it).
P. Casari is with IMDEA Networks, 28198 Madrid, Spain (e-mail: paolo.casari@imdea.org).
R. Diamant is with the Department of Marine Technologies, University of Haifa, Haifa 3498838, Israel (e-mail: roeed@univ.haifa.ac.il).
Digital Object Identifier 10.1109/LWC.2018.2873329

I. INTRODUCTION AND RELATED WORK

UNDERWATER communications have gradually become the enabler of several types of submerged operations. Submarines, divers, autonomous underwater vehicles (AUVs), and floaters are often endowed with underwater communication capabilities. This has progressively led to the formation of underwater networks, where devices share their measured data and act to collaborate with other sensors. The increase of communication systems and underwater operations will soon result in multimodal networks, where a node incorporates multiple physical layer (PHY) technologies, e.g., acoustic, optical and radio-frequency (RF). This will enable new applications, such as data muling and wireless telemetry for hybrid vehicles.

With several heterogeneous nodes, employing a different sub-network for each technology may result in disconnections and poor data transfer performance. Instead, it would be possible to increase the throughput, decrease the communication delay via simultaneous transmissions, and reduce the occurrence of bottlenecks by properly leveraging the full set of PHYs. The main challenge, and the focus of this letter, is how to optimally schedule the transmissions of the available PHYs.

Different from multi-band scheduling, which aims mostly at interference avoidance, multimodal networks enable diversity in terms of the PHY itself. For example, low-frequency acoustic communications achieve kbit/s transmission rates over ranges of a few kilometers, high-frequency acoustics tops tens of kbit/s over up to a few hundred meters, whereas optical communications yield Mbit/s links over ranges of a few meters. Different PHYs face different challenges: for example, acoustic communications are sensitive to time-varying multipath, whereas water turbidity and ambient light hamper optical communications [1]. Hence, efficient multimodal networking requires a specific scheduling solution with different properties than the schemes in single-technology network domains.

The approaches designed for multi-radio or multi-channel wireless radio networks rely on frequent communications or feedback, and target the management of voice calls rather than data transmission [2]. Scheduling poses different requirements in multimodal underwater networks, where the available PHYs have widely different communication capabilities, and the scheduler must account for PHY-dependent adjacency and interference matrices while avoiding bottlenecks.

For underwater multimodal operations, it has been proposed to use visual image processing [3] or signaling to identify the fastest available PHY [4]. The MURAO protocol [5] organizes the nodes in clusters, using optics for intra-cluster communications and acoustics for cluster management. The above approaches are tailored to specific scenarios, or offer solutions for stable networks. However, they may suffer from bottlenecks and delays in realistic multimodal networks, where PHY performance changes over time and space due to mobility and environmental conditions, and does so in different ways for different PHYs. A scheduling mechanism that can optimize the use of the multimodal network’s resources is therefore needed.

To address the above challenges, we propose the optimal multimodal scheduling (OMS) protocol. OMS manages transmissions through any set of PHYs by jointly setting transmission time slots in a per-technology time-division multiple access (TDMA) fashion, and divides the data load among the PHYs to optimize link utilization and transmission delay. In addition, OMS organizes transmission slots to favor packet routing and enforce a fair number of transmission opportunities per node. We tested the performance of OMS against benchmark schemes in numerical simulations and in a sea experiment using multimodal nodes encompassing different acoustic PHYs. The results show that OMS achieves better
CAMPAGNARO et al.: OPTIMAL TRANSMISSION SCHEDULING IN SMALL MULTIMODAL UNDERWATER NETWORKS 369
throughput, packet delivery delay, and fairness in resource allocation.

II. THE OMS ALGORITHM

A. System Model

Our system consists of N network nodes equipped with one or more of T underwater PHYs. The set of PHYs is arranged in the N × T technology matrix T such that $T_{i,n} = 1$ if node i has PHY n. Call M the adjacency matrix, where $M_{i,j,n} = 1$ if node i is connected to node j via PHY n. The number of neighbors of i through technology n is $D_{i,n} = \sum_j M_{i,j,n}$. We assume that T is given, and that M can be obtained via preliminary link probing [6]. We remark that the difference between the communication and interference range is limited in underwater networks, due to the very fast power decay incurred for increasing range by any PHY technology [4]. Hence, to harness spatial reuse for performance gain, we allow collisions in OMS.

OMS organizes orthogonal multimodal PHYs via per-technology TDMA frames. As different PHYs are characterized by diverse transmission rates and may incur different propagation delays (e.g., optics vs. acoustics), the duration of the time slots is also set per-technology. We choose TDMA since it allows a simple time slot alignment via guard intervals. This is specifically important in multimodal systems, where different PHYs have a diverse outage capacity. Additionally, in TDMA-based schemes the transmission delay is known in advance, making it possible to plan the load allocated to each PHY. This is in contrast to handshake-based schemes (where the delay depends also on the receiver) and to fully random access (where collisions may trigger an unpredictable number of retransmissions). The synchronization of the low latency technologies can be achieved either via atomic clocks or through the network time protocol (NTP). For acoustics, however, we can simply rely on guard times: as the time slot duration is at least as long as the maximum propagation delay, such guard times are negligible.

We impose traffic constraints by allowing a node i to transmit in at least $c_i > C$ time slots. Calling $R_i$ the number of bits transmitted in each time slot of the slowest communication technology of node i, constraint $c_i$ ensures the transmission of at least $c_i R_i$ bits within a given frame. Fairness then results from setting $c_i$ such that nodes with lower $R_i$ receive a higher $c_i$ value.

B. OMS Scheduling Solution

Our solution allocates transmission time slots, organized in n TDMA frames of N slots and duration $\tau^{\mathrm{sl}}_n$: one frame for each PHY. We synchronize transmissions by considering a TDMA super-frame of length $\tau^{\mathrm{fr}}$, such that for some PHYs several (not necessarily full) TDMA cycles are possible every $\tau^{\mathrm{fr}}$ seconds. The input to OMS is the PHY matrix T, the adjacency matrix M, the per-PHY communication capacity, and the number of slots N and time slot duration $\tau^{\mathrm{sl}}_n$. The output is the minimum allowed value of $\tau^{\mathrm{fr}}$, and a matrix S, where $S_{i,t,n} = 1$ if node i can transmit in slot t via PHY n. The transmission slot indices are arranged in a vector $t^{\mathrm{Tx}}_{i,n} = \{ r : S_{i,r,n} = 1 \}$ for node i over technology n. Our objective is to maximize channel utilization, measured via the total number of transmissions over a given time period. The schedule also considers collisions among neighboring nodes and facilitates the forwarding of packets across multiple hops. OMS requires the knowledge of the adjacency matrix M obtained, e.g., via [6]. This includes the existing connections and the available per-node PHY technologies.

Let ∨ be the logical “or” and ∧ the logical “and” operators. The optimal schedule $\mathbf{S}^\star$ with time frame length $\tau^{\mathrm{fr}\star}$ is the solution of the following problem:

$\mathbf{S}^\star, \tau^{\mathrm{fr}\star} = \arg\min_{\tau^{\mathrm{fr}}} \max \sum_i \sum_t \sum_n c_i S_{i,t,n}$   (1a)

s.t. $T_{i,n} = 0 \implies S_{i,t,n} = 0$   (1b)

$S_{i,t,n} = 1 \wedge S_{j,t,n} = 1 \implies (M_{i,p,n} + M_{j,p,n} = 0) \vee (M_{i,p,n} + M_{j,p,n} = 1 \wedge S_{p,t,n} = 0) \vee (M_{i,p,n} + M_{j,p,n} = 2 \wedge S_{p,t,n} = 1) \quad \forall p \neq i, j$   (1c)

$\sum_n \sum_t \frac{S_{i,t,n}}{D_{i,n}} \geq c_i \geq C \quad \forall i, n$   (1d)

$\exists\, t \ \text{s.t.}\ S_{i,t,n} = 1,\ S_{j,t,n} = 0 \quad \forall i, j, n \ \text{s.t.}\ M_{i,j,n} = 1$   (1e)

$\exists\, n \ \text{s.t.}\ M_{p,j,n} = 1 \wedge M_{j,i,n} = 1 \wedge M_{p,i,n} = 0 \wedge \max(t^{\mathrm{Tx}}_{j,n}) > \min(t^{\mathrm{Tx}}_{p,n})$   (1f)

The solution can be obtained via branch-and-bound, which completes in polynomial time on average [7].

Constraint (1b) prevents transmissions on technology n if node i does not have it. Constraint (1c) allows simultaneous transmissions by two nodes i and j in the same slot using technology n only if: ∀p ≠ i, j, the links i ↔ p and j ↔ p do not exist for technology n; or if one of the two links exists and p does not transmit in the same slot (lest p would be deaf to i’s or j’s transmission); or otherwise, if both links exist, p also transmits in the same slot (so that i and j’s transmissions would not collide at p). Constraint (1d) specifies that more slots are given to nodes with more neighbors, so that the total number of slots per neighbor is at least $c_i \geq C$. Constraint (1e) imposes that node i be the only transmitter in at least one slot t over each technology n. Hence, although (1c) allows primary conflicts, there exists at least one slot for each node to transmit free from interference. Finally, constraint (1f) facilitates that the same packet can propagate further than one hop within the same frame, and is achieved by allowing a node j located at an intermediate position between two nodes p and i to have at least one transmission slot later than node p’s slot, i.e., $\max(t^{\mathrm{Tx}}_{j,n}) > \min(t^{\mathrm{Tx}}_{p,n})$.

The formalization in (1) shows that OMS optimizes link utilization by allocating the data flows across all PHYs of a node, while considering possible bottlenecks and packet delays due to the different capabilities of the various PHYs. However, OMS is a centralized solution and thus fits the case of small networks. Still, by sharing the adjacency matrix M, OMS avoids the use of a centralized hub.
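To make the role of the feasibility constraints more concrete, the following sketch checks a candidate schedule against (1b), (1c) and (1e) exactly as they are written above. It is not the OMS optimizer (the letter solves (1) via branch-and-bound), and the function name, the nested-list data layout and the three-node example are assumptions made for illustration.

```python
def check_schedule(S, T, M):
    """Check (1b), (1c), (1e) for 0/1 nested lists S[i][t][n], T[i][n], M[i][j][n]."""
    nodes, slots, techs = len(S), len(S[0]), len(S[0][0])
    for n in range(techs):
        # (1b): a node never transmits on a PHY it does not have.
        for i in range(nodes):
            if T[i][n] == 0 and any(S[i][t][n] for t in range(slots)):
                return False
        # (1c): for every pair transmitting in the same slot, inspect every other node p.
        for t in range(slots):
            tx = [i for i in range(nodes) if S[i][t][n]]
            for a in range(len(tx)):
                for b in range(a + 1, len(tx)):
                    i, j = tx[a], tx[b]
                    for p in range(nodes):
                        if p in (i, j):
                            continue
                        links = M[i][p][n] + M[j][p][n]
                        # One link: p must be silent. Two links: p must transmit too.
                        if links == 1 and S[p][t][n] != 0:
                            return False
                        if links == 2 and S[p][t][n] != 1:
                            return False
        # (1e): on each link i -> j, some slot has i transmitting while j is silent.
        for i in range(nodes):
            for j in range(nodes):
                if i != j and M[i][j][n] == 1:
                    if not any(S[i][t][n] == 1 and S[j][t][n] == 0 for t in range(slots)):
                        return False
    return True

if __name__ == "__main__":
    # Hypothetical line topology 0 - 1 - 2 with a single PHY.
    M = [[[0], [1], [0]], [[1], [0], [1]], [[0], [1], [0]]]
    T = [[1], [1], [1]]
    one_slot_each = [[[1], [0], [0]], [[0], [1], [0]], [[0], [0], [1]]]
    ends_together = [[[1], [0]], [[0], [1]], [[1], [0]]]
    print(check_schedule(one_slot_each, T, M), check_schedule(ends_together, T, M))
```

In the line topology of the example, giving each node its own slot is feasible, whereas letting the two end nodes transmit together violates (1c), because the middle node has links to both and is not transmitting in that slot.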
TABLE I
SIMULATIONS: CHARACTERISTICS OF THE PHY TECHNOLOGIES

Fig. 1. CDF of the throughput (4) for OMS and Aloha.

Fig. 2. CDF of packet delivery delay for OMS and Aloha. Total case.

Fig. 3. Sketch of the network deployment in Hadera, Israel.

III. SIMULATION RESULTS

Unlike many other scheduling solutions [2], OMS focuses on managing transmissions effectively through diverse PHYs. We therefore compare the performance of OMS with the only two benchmark schemes we found to be appropriate for multimodal underwater scheduling: the Aloha protocol [8, Sec. 4.2], where a packet is sent as soon as it becomes available (except that no transmission can start if a reception is in progress, in order to replicate the behavior of actual acoustic transceivers); and the TDMA scheme in [4]. In both cases, to transmit on a given link, a node employs the PHY providing the highest bit rate, among those integrated by both itself and the receiver. We consider the packet delivery ratio (PDR),

$\mathrm{PDR} = N^{\mathrm{rx}} / N^{\mathrm{tx}}$,   (2)

where $N^{\mathrm{tx}}$ and $N^{\mathrm{rx}}$ are the total number of packets transmitted and received, respectively; the packet delivery delay (PDD), calculated as the time elapsed from the packet generation until its reception; and the service fairness, which we define by Jain’s fairness index [9] for the PDR,

$J = \left( \sum_{i=1}^{N} \mathrm{PDR}_i \right)^{2} \left( N \cdot \sum_{i=1}^{N} \mathrm{PDR}_i^{2} \right)^{-1}$,   (3)

where $\mathrm{PDR}_i = N_i^{\mathrm{rx}} \left( \sum_{k=1}^{N} N_{k,i}^{\mathrm{tx}} \right)^{-1}$ is the PDR of the packets received by node i, and $N_{k,i}^{\mathrm{tx}}$ is the number of packets transmitted by node k to node i. For $L_p$ bits in a packet, we also consider the network throughput

$\mathrm{THR} = N^{\mathrm{rx}} \cdot L_p / T_s$,   (4)

where $T_s$ is the duration of the simulation.

We deploy a multimodal network of 4 nodes uniformly at random over an area of 2 × 2 km² and of depth 100 m. We consider three PHYs, based on low-, mid- and high-frequency acoustics (respectively LF, MF and HF for short). The PHY characteristics are summarized in Table I. Due to the random deployment, the MF and HF modems may not form fully connected subnetworks. Every node incorporates an LF modem. At random, two nodes also have an MF modem, and three nodes have an HF modem. We perform a Monte-Carlo set of 600 runs, each with a different random topology realization, using DESERT Underwater [10]. We set $L_p$ = 1000 bytes, and the number of slots N = 12. Guard times have been chosen according to the propagation time and the bit rate of each PHY.

In Fig. 1, we show the cumulative distribution function (CDF) of the throughput for each technology and for their combination (“Total”). The poor performance of TDMA proves that it is unable to exploit all technologies. Since OMS optimally utilizes all available links, in total its throughput performance always exceeds that of Aloha, with a gain of roughly 100% in more than 50% of the cases. We also observe that the gain increases with frequency, since high frequency translates into a higher bit rate, and the link utilization becomes more effective.

From the CDF of the PDD (Fig. 2), we observe that OMS outperforms both Aloha and TDMA by a significant figure of 3 s and 4 s, respectively, providing a delay that is 50% and 66% lower than the other two MAC schemes. Moreover, OMS proves less sensitive to specific topologies than Aloha and TDMA. This is due to constraint (1d), which enforces interference-free slots for all nodes.

IV. SEA EXPERIMENT

We demonstrated OMS in a sea experiment in May 2017 in Hadera, Israel. The deployment (see Fig. 3) involved four stations: nodes 2 and 4 lowered from a pier stretching 2 km eastwards from the shore, and nodes 1 and 3 placed on boats. The water depth was 25 m.
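The evaluation metrics (2), (3) and (4) defined in Section III can be computed directly from packet counters. The short sketch below illustrates them; the function names and the numbers in the usage example are hypothetical and are not measurements from the simulations or from the sea experiment.

```python
def pdr(n_rx, n_tx):
    """Packet delivery ratio (2): received over transmitted packets."""
    return n_rx / n_tx

def jain_index(values):
    """Jain's fairness index (3) over the per-node PDR values."""
    n = len(values)
    return sum(values) ** 2 / (n * sum(v * v for v in values))

def throughput(n_rx, packet_bits, duration_s):
    """Network throughput (4) in bit/s."""
    return n_rx * packet_bits / duration_s

if __name__ == "__main__":
    # Hypothetical per-node counters: (packets received, packets addressed to the node).
    counters = [(30, 40), (28, 40), (35, 40), (20, 40)]
    per_node = [pdr(rx, tx) for rx, tx in counters]
    total_rx = sum(rx for rx, _ in counters)
    print(per_node, jain_index(per_node), throughput(total_rx, 8000, 1200.0))
```

Jain's index equals 1 when every node sees the same PDR and drops to 1/N when a single node receives all the successfully delivered traffic, which is why it is a convenient summary of service fairness.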
Fig. 4. Topology A: PDR (2), PDR fairness (3), and throughput (4).

Fig. 5. Topology B: PDR (2), PDR fairness (3), and throughput (4).

We used EvoLogics underwater modems operating in three frequency bands: 7–17 kHz (LF, up to 6.9 kbit/s), 18–34 kHz (MF, up to 13.9 kbit/s), and 48–78 kHz (HF, up to 31.2 kbit/s). Each node lowered its modems to the same depth, and connected to them via Ethernet from a single laptop. A sea state of 3 resulted in a low PDR.

By changing the locations of nodes 1 and 3, we tested two network topologies, each for a total of 20 min. To achieve intense network traffic, we let each node transmit a packet whenever possible. Considering the poor performance of TDMA in the simulations, we only focused on the OMS and Aloha protocols in the experiment.

In Fig. 4, we show the PDR, fairness, and throughput performance for Topology A. The differences among the nodes are mostly due to the sparse topology, where nodes have a different number of one-hop neighbors. We observe that, except for the LF case, OMS’s PDR is consistently better than Aloha’s. Similarly, the bottom panels of Fig. 4 show that OMS’s transmission fairness and throughput are also better. We note that, due to its channel utilization, the experimental results of OMS also exceed those of an ideal (theoretical) TDMA with perfect PDR.

Fig. 5 shows the performance for Topology B. Compared to Topology A, more LF links are available. This diversity is utilized by OMS. Thus, unlike in Fig. 4, here the PDR of OMS is better than Aloha’s also for the LF case. While the fairness performance follows the same trend, OMS’s throughput gain decreases. This is mostly because in Topology A there are fewer connection possibilities. Hence, Aloha experienced more collisions than in Topology B. Still, relative to Aloha, in the sea experiment OMS demonstrated a significant performance gain in all metrics.

Based on the data sheets of the manufacturer, the transmission power of the LF, MF, and HF modems is Ptx,LF = 40 W, Ptx,MF = 35 W, and Ptx,HF = 18 W, respectively. Thus, for the transmissions executed during the experiment, where the packet duration was t = 0.4 s, the energy consumed (in Wh) is E = t · (Ptx,LF · Ntx,LF + Ptx,MF · Ntx,MF + Ptx,HF · Ntx,HF)/3600, where Ntx,LF, Ntx,MF and Ntx,HF are the numbers of packets transmitted over each PHY. Considering the number of packets transmitted, we calculate for Topology A an energy consumption of 2.2 Wh by OMS and 4.0 Wh by Aloha. This energy consumption gain of OMS increased for Topology B, where OMS consumed 2.3 Wh, and Aloha consumed 5.1 Wh.

V. CONCLUSION

We described OMS, a new scheduling protocol for multimodal underwater networks. OMS maximizes the channel utilization while providing a fair quality of service to all nodes, and guaranteeing that at least some of the slots will be free from interference. We have tested OMS both in simulations and in a sea experiment. The results show that OMS fully utilizes the multimodal network, and thus achieves gains in both throughput and packet delivery delay. Future work will include the adaptation of multimodal PHY technologies to network flow requirements.

REFERENCES

[1] T.-C. Wu et al., “Blue laser diode enables underwater communication at 12.4 Gbps,” Nat. Sci. Rep., vol. 7, Jan. 2017, Art. no. 40480.
[2] V. Gabale et al., “A classification framework for scheduling algorithms in wireless mesh networks,” IEEE Commun. Surveys Tuts., vol. 15, no. 1, pp. 199–222, 1st Quart., 2013.
[3] I. Vasilescu et al., “Data collection, storage and retrieval with an underwater optical and acoustical sensor network,” in Proc. ACM Sensys, San Diego, CA, USA, Nov. 2005, pp. 154–165.
[4] F. Campagnaro, F. Favaro, P. Casari, and M. Zorzi, “On the feasibility of fully wireless remote control for underwater vehicles,” in Proc. 48th Asilomar Conf. Signals Syst. Comput., Pacific Grove, CA, USA, Nov. 2014, pp. 33–38.
[5] T. Hu and Y. Fei, “MURAO: A multi-level routing protocol for acoustic-optical hybrid underwater wireless sensor networks,” in Proc. IEEE SECON, Seoul, South Korea, Jun. 2012, pp. 218–226.
[6] R. Diamant et al., “Topology-efficient discovery: A topology discovery algorithm for underwater acoustic networks,” IEEE J. Ocean. Eng., to be published. [Online]. Available: https://ieeexplore.ieee.org/document/7962167
[7] W. Zhang and R. E. Korf, “An average-case analysis of branch-and-bound with applications: Summary of results,” in Proc. Nat. Conf. AI (AAAI), 1992, pp. 545–550.
[8] A. S. Tanenbaum, Computer Network. Englewood Cliffs, NJ, USA: Prentice-Hall, 2002.
[9] R. Jain et al., “A quantitative measure of fairness and discrimination for resource allocation in shared computer systems,” Digit. Equipment Corporat., Maynard, MA, USA, Rep. DEC-TR-301, Sep. 1984.
[10] P. Casari et al., “Open-source suites for underwater networking: WOSS and DESERT underwater,” IEEE Netw., vol. 28, no. 5, pp. 38–46, Sep./Oct. 2014.
Placement Delivery Array Design via Attention-Based Sequence-to-Sequence

Model With Deep Neural Network
Zhengming Zhang , Meng Hua , Chunguo Li , Senior Member, IEEE,
Yongming Huang , Senior Member, IEEE, and Luxi Yang , Member, IEEE
Abstract—Recently, coded caching has been proposed for its ability to alleviate the load of networks. In particular, the placement delivery array (PDA), used to characterize coded caching schemes, has attracted wide attention. In this letter, a deep neural architecture is proposed for the first time to learn the construction of PDAs with reduced computational complexity. The problem of the variable size of PDAs is solved using neural attention and reinforcement learning. Unlike previous works, which use combinatorial optimization algorithms to obtain PDAs, the proposed deep neural architecture uses a sequence-to-sequence model to learn to construct PDAs. Numerical results demonstrate that the proposed method can effectively implement coded caching while reducing the computational complexity.

Index Terms—Coded caching, placement delivery array, deep learning, neural attention.

I. INTRODUCTION

DUE TO the exponential growth in the number of smart mobile devices and innovative high-rate mobile data services (such as video streaming for mobile gaming and road condition monitoring), 5G networks should accommodate overwhelming wireless traffic demands. Deploying intelligent caching is an efficient strategy that can satisfy the rate requirements of users by enabling quick information acquisition [1].

The gain from traditional (uncoded) caching approaches derives from making content available locally, and it is constrained by the limited memory available at each individual user. In the seminal work [2], Maddah-Ali and Niesen proposed a coded caching scheme for the centralized caching system, which is referred to as the AN scheme in this letter. The AN scheme can create multicast opportunities depending on the cumulative memory available at all users. It has been used in many scenarios, for example, finite file size caching systems [3] and wireless networks with unequal link rates [4]. However, in order to implement the coded caching scheme proposed in [2], each file must be split into F packets. The number of packets generally increases exponentially with the number of users. In order to reduce F, the placement delivery array (PDA) was proposed to describe the placement and delivery phases [5].

PDA construction is completely equivalent to coded caching scheme design, because a PDA indicates in a single array what should be cached by users and what should be sent by the server. Numerous methods have been proposed to address the construction of PDAs. The method proposed in [5] significantly decreases F while suffering only a slight sacrifice. Yan et al. [6] found the connection between strong edge coloring of bipartite graphs and PDA construction, and proposed a PDA design algorithm using the graph theory presented in [7].

However, finding an optimal strong edge coloring for a bipartite graph is an NP-hard problem [8]. Although new PDAs can be discovered by finding strong edge colorings, only some sporadic results exist for the general case [7], [8]. In this letter, we revisit the PDA design problem from a simpler perspective, i.e., the sequence-to-sequence (Seq2Seq) learning model, which is widely used in natural language processing [9], [10]. This inspires us to present a deep neural architecture to devise coded caching schemes. The architecture has three key technologies, i.e., Seq2Seq learning [9], content-based input attention [11], and reinforcement learning.

The main contributions of this letter are summarized as follows:
(i) A deep neural architecture is proposed for the first time to learn the construction of PDAs, which allows us to realize coded caching using deep learning technology.
(ii) An attention model is used to deal with the fundamental problem of representing PDAs of variable size. Reinforcement signals are used to accelerate deep neural network training.
Our results demonstrate that this approach can achieve approximate solutions to PDA construction problems that are computationally intractable.

The rest of this letter is organized as follows. In Section II, we introduce the background of PDAs to motivate our Seq2Seq learning problem. In Section III, we propose the attention-based Seq2Seq learning algorithm for the construction of PDAs. Finally, numerical results and a discussion are presented in Section IV, and a conclusion is reached in Section V.
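The exponential growth of F mentioned in the Introduction can be made concrete with a short calculation. The sketch below assumes the standard subpacketization of the scheme in [2], F = C(K, t) with t = KM/N an integer; this formula is not restated in the letter, and the function name `an_subpacketization` is illustrative only.

```python
from math import comb


def an_subpacketization(K: int, t: int) -> int:
    """Packets per file under the assumed rule F = C(K, t), with t = K*M/N."""
    return comb(K, t)


# Keep the cache ratio fixed at M/N = 1/2 (so t = K/2) and double the users.
for K in (4, 8, 16, 32):
    print(f"K = {K:2d}  ->  F = {an_subpacketization(K, K // 2)}")
```

Under this assumption, going from K = 4 to K = 16 users raises F from 6 to 12870 packets per file, which is the growth that PDA-based designs aim to curb.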
Manuscript received September 3, 2018; accepted September 23, 2018. Date of publication October 1, 2018; date of current version April 9, 2019. This work was supported by the National Natural Science Foundation of China under Grant 61671144. The associate editor coordinating the review of this paper and approving it for publication was C. Shen. (Corresponding authors: Yongming Huang; Luxi Yang.)
The authors are with the National Mobile Communications Research Laboratory, School of Information Science and Engineering, Southeast University, Nanjing 210096, China (e-mail: zmzhang@seu.edu.cn; mhua@seu.edu.cn; chunguoli@seu.edu.cn; huangym@seu.edu.cn; lxyang@seu.edu.cn).
Digital Object Identifier 10.1109/LWC.2018.2873334

ZHANG et al.: PDA DESIGN VIA ATTENTION-BASED SEQ2SEQ MODEL WITH DEEP NEURAL NETWORK 373

II. SYSTEM MODEL AND BACKGROUND

We consider a caching system composed of one server and N files $\mathcal{W} = \{W_1, W_2, \ldots, W_N\}$. The server is connected to K users through an error-free shared link, and the set of all users is denoted by $\mathcal{K} = \{1, 2, \ldots, K\}$ (N > K). We assume that all files have equal size and that each user is equipped with a cache of size M. The caching system is parameterized by K, M, and N, and it is called a (K, M, N) caching system. According
to the AN scheme, the caching system has two phases:

Placement Phase: Each file is subdivided into F equal packets, i.e., $W_i = \{W_{i,j} : j \in [1, F]\}$. The size of each packet is 1/F. The placement phase, in which these packets are placed in the users' cache memories, is independent of the users' demands. This phase is performed during off-peak times.

Delivery Phase: Each user independently requests one file from $\mathcal{W}$ at random. The requests constitute $d = (d_1, d_2, \ldots, d_K)$, where $d_k \in [1, N]$ means that user $k \in \mathcal{K}$ requests the file $W_{d_k}$. Once the server receives d, it broadcasts a coded signal of at most RF packets (where R is called the delivery rate) to the users, such that each user's demand is satisfied.

The goal is to minimize the load of RF packets. The AN model can be reformulated as a PDA design problem.

Definition 1 (Placement Delivery Array, [5]): For positive integers K, F and nonnegative integers Z and S with F ≥ Z, an F × K array $P = (p_{i,j})$, $i \in [1, F]$, $j \in [1, K]$, composed of a specific symbol ∗ and S nonnegative integers 1, 2, ..., S, is called a (K, F, Z, S) placement delivery array (PDA) if it satisfies the following conditions:
C1: The symbol ∗ appears Z times in each column;
C2: For any two distinct entries $p_{i_1,j_1}$ and $p_{i_2,j_2}$, $p_{i_1,j_1} = p_{i_2,j_2} = s$ is an integer only if
a. $i_1 \neq i_2$, $j_1 \neq j_2$, i.e., they lie in distinct rows and distinct columns; and
b. $p_{i_1,j_2} = p_{i_2,j_1} = \ast$.

Based on a (K, F, Z, S) PDA P, the caching system with M/N = Z/F is called a (K, F, Z, S) caching system and it can be obtained as follows:
1. Placement Phase: Each file is split into F packets, i.e., $W_i = \{W_{i,j} : j \in [1, F]\}$, $\forall i \in [1, N]$, and user k caches the packets
$$\mathcal{C}_k = \{W_{i,j} : p_{j,k} = \ast, \ \forall i \in [1, N]\}. \tag{1}$$
2. Delivery Phase: The server receives the request d; at time slot s, it broadcasts
$$\bigoplus_{p_{j,k} = s,\ j \in [1, F],\ k \in [1, K]} W_{d_k, j}, \tag{2}$$
where the operation ⊕ is the bitwise exclusive OR (XOR) operation.

Yan et al. [6] reviewed the definitions from graph theory and found the connection (Lemma 1) between strong edge coloring of bipartite graphs and PDAs.

Lemma 1: The array P composed of the symbol ∗ and 1, 2, ..., S is a PDA if and only if its corresponding colored bipartite graph $G(\mathcal{K}, \mathcal{F}, \mathcal{E})$ satisfies
1. Each vertex in $\mathcal{K}$ has a constant degree;
2. The corresponding coloring is a strong edge coloring.
Proof: Please refer to [6].

Lemma 1 shows that PDA construction suffers from a high complexity. This letter aims to reduce the complexity using the Seq2Seq model and the following conclusion.

Theorem 1: Assume a PDA P corresponds to the colored bipartite graph $G(\mathcal{K}, \mathcal{F}, \mathcal{E})$, and each vertex in $\mathcal{K}$ has the constant degree Δ = F − Z. For each vertex $v \in \mathcal{K}$, randomly select δ (0 < δ < Δ) edges; then we get a new colored bipartite graph $G'$, and its corresponding array $P'$ is a PDA.
Proof: It is easy to find that $G'$ satisfies the two conditions of Lemma 1, thus $P'$ is a PDA.

Obviously, Theorem 1 does not guarantee that the resulting PDA is optimal, but it is useful for our learning model because it can expand our trainable data set.

Fig. 1. Attention-Based Seq2Seq Placement Delivery Network.

Seq2Seq learning problem for PDA design: From the AN scheme and Lemma 1, we can find that the placement phase based on a (K, F, Z, S) caching system whose PDA is P can produce an F × K adjacency matrix $A = (a_{i,j})$, where $a_{i,j} = 1$ if $p_{i,j} \neq \ast$ and $a_{i,j} = \mathrm{Inf}$ if $p_{i,j} = \ast$. Define $E = (e_1, e_2, \ldots, e_L)$ as an ordered sequence composed of the adjacent edges in A, where $e_l = (i, j)$ with $a_{i,j} = 1$, and L is the number of edges of A. Assume we have another sequence $C = (c_1, c_2, \ldots, c_L)$, where $c_l \in \{1, 2, \ldots, S\}$ is the color of the edge $e_l$. Then we can generate an array $P' = (p'_{i,j})$ with
$$p'_{i,j} = \begin{cases} c_l, & \text{if } a_{i,j} = 1 \\ \ast, & \text{if } a_{i,j} = \mathrm{Inf}. \end{cases} \tag{3}$$
The Seq2Seq learning problem for PDA design is: given the sequence E, find the sequence C such that the array $P'$ is a PDA.

Remark 1: Different (K, F, Z, S) caching systems have PDAs of different sizes and S ≤ L; thus the size of the output dictionary of the sequence C is related to the length of the input sequence. Traditional Seq2Seq learning methods require the size of the output dictionary to be fixed. Therefore, we cannot directly apply this framework to the PDA design problem.

III. SOLVE PDA LEARNING PROBLEM

We first review the Seq2Seq and input-attention models that are the baselines for this letter, and then describe our model using attention like [12] and reinforcement learning like [13].

Seq2Seq model: Assume we have a training sequence pair $(E, C^E)$; the Seq2Seq model computes the conditional probability $p(C^E | E; \theta)$. A learnable model with parameters θ (in this letter we use the gated recurrent unit, i.e., GRU) is used to estimate the terms of the probability chain rule
$$p(C^E | E; \theta) = \prod_{i=1}^{L} p_\theta(c_i | c_1, \ldots, c_{i-1}, E; \theta), \tag{4}$$
where $E = (e_1, e_2, \ldots, e_L)$ is a sequence of L vectors and $C^E = (c_1, c_2, \ldots, c_L)$ is a sequence of L indices, each $c_l$ belonging to {1, 2, ..., S}. If we take K samples from the training set, we maximize the conditional probabilities to learn the
parameters of the model, i.e.,
$$\theta^* = \arg\max_{\theta} \sum_{k=1}^{K} \log p(C_k^E | E_k; \theta). \tag{5}$$
We use a GRU to model $p_\theta(c_i | c_1, \ldots, c_{i-1}, E; \theta)$. A GRU is formulated as
$$\begin{cases} r_t = \sigma(U^r x_t + W^r y_{t-1} + b^r), \\ z_t = \sigma(U^z x_t + W^z y_{t-1} + b^z), \\ \tilde{y}_t = g(U^s x_t + r_t \circ W^s y_{t-1} + b^s), \\ y_t = z_t \circ y_{t-1} + (1 - z_t) \circ \tilde{y}_t, \end{cases} \tag{6}$$
where $x_t$ is the input variable at time t; $U^*$ and $W^*$ are the weight matrices applied to the input and hidden units, respectively; σ(·) and g(·) are the sigmoid and hyperbolic tangent activation functions, respectively; $b^*$ is the bias; $y_t$ is the output; and ∘ denotes the element-wise product. The GRU is fed $E_t$ at each time step t until the end of the input sequence is reached, at which time a special symbol ⇒ is input to the model. The model then switches to the generation mode.

In this model, the output dictionary size for all $c_i$ is fixed and equal to S, since the outputs are chosen from {1, 2, ..., S}. This means that the output dimensionality is fixed by the dimensionality of the problem and it is the same during training and inference [9]. This prevents us from learning solutions to problems that have an output dictionary with a size that depends on the input sequence length [12].

Attention model: The Seq2Seq model constrains the amount of information and computation that can arrive at any part of the generative model. This problem can be ameliorated by using an attention model. The attention vector at time step t is given by
$$\begin{cases} u_j^t = \beta^T \tanh(W^1 s_j + W^2 d_t), \\ a_j^t = \mathrm{softmax}(u_j^t), \\ d'_t = \sum_{j=1}^{L} a_j^t s_j, \end{cases} \tag{7}$$
where $j \in \{1, 2, \ldots, L\}$; $s_j$ and $d_t$ are the encoder and decoder hidden states, respectively; $u^t$ is the attention mask over the inputs; and β, $W^1$ and $W^2$ are learnable parameters. Usually, $d'_t$ and $d_t$ are used as the hidden states which are fed to the next time step in the attention model.

Our model: We would like the color of each edge to take into account not only the preceding edges, but also the following edges. Hence, we use a bidirectional GRU (B-GRU) [11], which consists of forward and backward GRUs. The forward GRU reads the input sequence as it is ordered and calculates the forward hidden states $(\overrightarrow{s_1}, \ldots, \overrightarrow{s_L})$. The backward GRU reads the sequence in the reverse order and calculates the backward hidden states $(\overleftarrow{s_1}, \ldots, \overleftarrow{s_L})$. Finally, we obtain the hidden states s by concatenating the forward hidden state and the backward one, i.e., $s_l = [\overrightarrow{s_l}; \overleftarrow{s_l}]$.

As mentioned in Remark 1, for the PDA design problem, the output size is related to the number of elements in the input sequence. To address this problem, we use the attention scheme of [12] to model $p(c_l | c_1, \ldots, c_{l-1}, E)$ as follows
$$\begin{aligned} u_l^t &= \beta^T \tanh(W^1 s_l + W^2 d_t), \\ p(c_l | c_1, \ldots, c_{l-1}, E) &= \mathrm{softmax}(u_l^t), \end{aligned} \tag{8}$$
where softmax normalizes $u_l^t$ to be an output distribution over the inputs.

Traditional Seq2Seq models are often difficult to train due to the lack of an accurate assessment algorithm. In our problem, we can use Definition 1 as an evaluator to speed up the convergence of the algorithm. Policy gradient reinforcement learning [14] is used like [13] to achieve this. The environment state is the generated array P, the action is the color assigned to each edge, the policy is the coloring scheme ($\mu_\theta = p(\cdot | E; \theta)$), and the reward R(P) is 1 if P is a PDA and 0 otherwise. Thus, the objective function of our model is
$$f = \frac{1}{K} \sum_{k=1}^{K} R(P_k) \log p(C_k^E | E_k; \theta). \tag{9}$$
The gradient of (9) is formulated using the well-known REINFORCE algorithm [14]
$$\nabla_\theta f = \frac{1}{K} \sum_{k=1}^{K} R(P_k) \nabla_\theta \log p(C_k^E | E_k; \theta). \tag{10}$$

Our attention-based Seq2Seq placement delivery network is shown in Fig. 1. It contains three parts: the placement phase, the attention-based Seq2Seq model, and the delivery phase. The detailed processes of our scheme are shown in Algorithm 1. In the training step, the policy gradient method and stochastic gradient descent (SGD) are used to optimize the parameters. Our Seq2Seq model comprises two B-GRU modules, an encoder (light blue square in Fig. 1) and a decoder (brown square in Fig. 1). The encoder network reads the input sequence E, one $e_l \in E$ at a time, and transforms it into a sequence of latent memory states $(s_1, \ldots, s_L)$. The decoder network also maintains its latent memory states $(d_1, \ldots, d_L)$ and, at each step, uses (8) to produce a color distribution for the next edge to be colored. Once the next edge is colored, it is passed as the input to the next decoder step. When all edges have been colored, the model outputs an array P according to (3).

Algorithm 1 Seq2Seq Placement Delivery Network
1: Training: Train the attention-based Seq2Seq neural network μ.
2: Placement: Use the placement of the AN scheme to broadcast some file packets and get A.
3: Generating: Use μ and A to get the PDA P.
4: Delivery: Use P and the delivery scheme of [5] to broadcast the remaining packets.

Note that PDAs are scarce; only a small amount of training data can be obtained by using the schemes proposed in [5] and [6]. During training, we first pre-train our deep neural network $\mu_\theta$ using the data generated by Theorem 1. Then, we randomly draw K i.i.d. sample adjacency matrices $A_1, \ldots, A_K$ and get $E_1, \ldots, E_K$. Next, we draw K i.i.d. solutions $C_1, \ldots, C_K$ using the current policy $\mu_\theta = p(\cdot | E; \theta)$. Finally, using SGD and (10), we update θ with learning rate 0.001, i.e., $\theta \leftarrow \mathrm{SGD}(\theta, \nabla_\theta f)$.

IV. SIMULATION RESULT AND DISCUSSION

In this section, simulation results are provided to illustrate the effectiveness of the proposed method with the parameters
listed in Table I. Seq2Seq [9] and Seq2Seq with attention [10] are used as benchmarks for comparison. We also compare the complexity with [6], [8], and [15].

TABLE I. SIMULATION PARAMETERS

Fig. 2 shows the convergence behavior of the loss function. Fig. 3 shows the training accuracy of our model. We find that, regardless of (K, F, M, N), as long as L is less than 60 or 120, our algorithm shows good convergence performance and the training accuracy can exceed 90%.

Fig. 2. Training Loss.

Fig. 3. Training Accuracy.

To use the Seq2Seq or Seq2Seq with attention methods to solve the learning problem for PDA design, the lengths of the input and output must be fixed. Notice that this is not necessary for our method. The experimental results in Table II show that our method achieves accuracies of 88.96% and 82.86% when L is less than 60 and 120, respectively. This demonstrates that our proposed method is clearly better than the other two methods. This is due to the combination of B-GRU [11], neural attention [12], and reinforcement learning [14]. In fact, the pre-trained network already has a certain ability to work. Reinforcement learning fine-tunes the pre-trained network, which is very helpful.

TABLE II. NUMERICAL COMPARISONS

To construct a (K, F, Z, S) PDA (K × F > 4) which corresponds to a colored bipartite graph $G(\mathcal{K}, \mathcal{F}, \mathcal{E})$ (with degree Δ ≥ 2), we compare the following schemes:
1) Scheme 1: Use the coloring scheme in [6]. The complexity is $O\big(\min\{C_m^{a+b-\lambda} C_{m-(a+b-\lambda)}^{\lambda},\ C_m^{a-\lambda} C_{m-(a-\lambda)}^{b-\lambda}\}\,|\mathcal{E}|\,(a \log a + b \log b)\big)$, where $C_m^a = K$, $C_m^b = F$, $0 \le \lambda \le \min\{a, b\}$, and $|\mathcal{E}|$ is the size of the edge set.
2) Scheme 2: Use the strong coloring scheme in [8]. The coloring of G, whose degree is no greater than 3, can be obtained in polynomial time $O((K + F)^2 |\mathcal{E}|)$.
3) Scheme 3: Use the strong coloring scheme in [15]. The complexity is $O((K + F)\Delta |\mathcal{E}|^2)$.
4) Scheme 4: Use our method. If the attention-based Seq2Seq neural network has been trained, the complexity of our PDA design method is $O(|\mathcal{E}| \log(|\mathcal{E}|))$.

V. CONCLUSION

In this letter, we established the connection between the placement delivery array in coded caching and sequence-to-sequence learning. We first proposed a learning method to construct PDAs using an attention model and reinforcement learning. Then a new coded caching scheme was constructed based on the deep neural architecture. Numerical results demonstrated that the proposed method can effectively implement coded caching with low complexity.

REFERENCES
[1] S. Wang et al., “A survey on mobile edge networks: Convergence of computing, caching and communications,” IEEE Access, vol. 5, pp. 6757–6779, 2017.
[2] M. A. Maddah-Ali and U. Niesen, “Fundamental limits of caching,” IEEE Trans. Inf. Theory, vol. 60, no. 5, pp. 2856–2867, May 2014.
[3] S. Jin, Y. Cui, H. Liu, and G. Caire, “Order-optimal decentralized coded caching schemes with good performance in finite file size regime,” in Proc. IEEE Glob. Commun. Conf. (GLOBECOM), Dec. 2016, pp. 1–7.
[4] A. Tang, S. Roy, and X. Wang, “Coded caching for wireless backhaul networks with unequal link rates,” IEEE Trans. Commun., vol. 66, no. 1, pp. 1–13, Jan. 2018.
[5] Q. Yan, M. Cheng, X. Tang, and Q. Chen, “On the placement delivery array design for centralized coded caching scheme,” IEEE Trans. Inf. Theory, vol. 63, no. 9, pp. 5821–5833, Sep. 2017.
[6] Q. Yan, X. Tang, Q. Chen, and M. Cheng, “Placement delivery array design through strong edge coloring of bipartite graphs,” IEEE Commun. Lett., vol. 22, no. 2, pp. 236–239, Feb. 2018.
[7] J. J. Quinn and A. T. Benjamin, “Strong chromatic index of subset graphs,” J. Graph Theory, vol. 24, no. 3, pp. 267–273, 1997.
[8] J. Bensmail, A. Lagoutte, and P. Valicov, “Strong edge-coloring of (3, δ)-bipartite graphs,” Discr. Math., vol. 339, no. 1, pp. 391–398, Jan. 2016.
[9] I. Sutskever, O. Vinyals, and Q. V. Le, “Sequence to sequence learning with neural networks,” Adv. Neural Inf. Process. Syst., vol. 2, pp. 3104–3112, 2014.
[10] J. Su et al., “A hierarchy-to-sequence attentional neural machine translation model,” IEEE/ACM Trans. Audio Speech Lang. Process., vol. 26, no. 3, pp. 623–632, Mar. 2018.
[11] D. Bahdanau, K. Cho, and Y. Bengio, “Neural machine translation by jointly learning to align and translate,” in Proc. Int. Conf. Learn. Representat. (ICLR), 2015, pp. 1–15.
[12] O. Vinyals, M. Fortunato, and N. Jaitly, “Pointer networks,” in Proc. Int. Conf. Neural Inf. Process. Syst., 2015, pp. 2692–2700.
[13] L. Yu, W. Zhang, J. Wang, and Y. Yu, “SeqGAN: Sequence generative adversarial nets with policy gradient,” in Proc. AAAI Conf. Artif. Intell., San Francisco, CA, USA, Feb. 2017, pp. 2852–2858.
[14] R. S. Sutton, D. McAllester, S. Singh, and Y. Mansour, “Policy gradient methods for reinforcement learning with function approximation,” in Proc. Adv. Neural Inf. Process. Syst., vol. 12, 1999, pp. 1057–1063.
[15] G. J. Chang and N. Narayanan, “Strong chromatic index of 2-degenerate graphs,” J. Graph Theory, vol. 73, no. 2, pp. 119–126, 2013.
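Definition 1 and the array construction in (3) of the preceding letter lend themselves to a mechanical check, which is also what the binary reward R(P) in (9) requires. The following is a minimal sketch, assuming a PDA is stored as a list of F rows of length K with the string "*" for the star symbol; the names `is_pda`, `pda_to_sequences`, `sequences_to_array` and the 3 × 3 example array are illustrative and do not come from the letter.

```python
from itertools import combinations

STAR = "*"  # assumed encoding of the special symbol


def is_pda(P, Z):
    """Check conditions C1 and C2 of Definition 1 for an F x K array P."""
    F, K = len(P), len(P[0])
    # C1: the star appears exactly Z times in each column.
    for k in range(K):
        if sum(1 for j in range(F) if P[j][k] == STAR) != Z:
            return False
    # C2: equal integers must lie in distinct rows and columns, and the
    # two "opposite corner" entries must both be stars.
    cells = [(i, j) for i in range(F) for j in range(K) if P[i][j] != STAR]
    for (i1, j1), (i2, j2) in combinations(cells, 2):
        if P[i1][j1] != P[i2][j2]:
            continue
        if i1 == i2 or j1 == j2:
            return False
        if P[i1][j2] != STAR or P[i2][j1] != STAR:
            return False
    return True


def pda_to_sequences(P):
    """Edge sequence E (non-star positions, row-major) and color sequence C."""
    E, C = [], []
    for i, row in enumerate(P):
        for j, value in enumerate(row):
            if value != STAR:
                E.append((i, j))
                C.append(value)
    return E, C


def sequences_to_array(E, C, F, K):
    """Rebuild the array as in (3): colored edges get c_l, the rest a star."""
    P = [[STAR] * K for _ in range(F)]
    for (i, j), color in zip(E, C):
        P[i][j] = color
    return P


example = [[STAR, 1, 2],
           [1, STAR, 3],
           [2, 3, STAR]]
E, C = pda_to_sequences(example)
rebuilt = sequences_to_array(E, C, F=3, K=3)
reward = int(is_pda(rebuilt, Z=1))  # plays the role of R(P) in (9)
print(E, C, reward)
```

The pairwise loop is quadratic in the number of edges, which is adequate for the small arrays considered here; a production evaluator would more likely group entries by integer first.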
Massive MIMO-OFDM Channel Estimation via Distributed Compressed Sensing
Abbas Akbarpour-Kasgari and Mehrdad Ardebilipour
Abstract—Massive multiple-input multiple-output orthogonal frequency division multiplexing (mMIMO-OFDM) channel estimation has recently been addressed using compressed sensing (CS)-based methods. Here, we propose to exploit the joint sparsity of mMIMO-OFDM channels using a stage-wise forward-backward pursuit (StFBP) algorithm. To increase the speed of convergence and the accuracy of estimation, we propose to gather multiple good atoms in each step and to exploit the common sparsity in the system model, respectively. Furthermore, the backward steps improve the accuracy by omitting bad atoms gathered previously. Simulation results show the superiority of the proposed StFBP approach over conventional CS-based and non-CS-based approaches.

Index Terms—Channel estimation, compressed sensing, massive multiple input multiple output-orthogonal frequency division multiplexing (mMIMO-OFDM), stage-wise forward backward pursuit (StFBP).

Manuscript received September 3, 2018; accepted September 21, 2018. Date of publication October 1, 2018; date of current version April 9, 2019. The associate editor coordinating the review of this paper and approving it for publication was J. Choi. (Corresponding author: Mehrdad Ardebilipour.)
The authors are with the Department of Electrical and Computer Engineering, K. N. Toosi University of Technology, Tehran 19697, Iran (e-mail: mehrdad@eetd.kntu.ac.ir).
Digital Object Identifier 10.1109/LWC.2018.2873339

I. INTRODUCTION

MASSIVE Multiple-Input Multiple-Output (mMIMO) is one of the promising approaches for future 5G telecommunication systems. By employing a large number of antennas at the Base Station (BS), it facilitates the implementation of high-throughput wireless systems. In order to increase the data availability, accurate channel estimation is mandatory. Hence, researchers have recently devoted much study to channel estimation in MIMO-OFDM systems [1]–[6].

The sparse behavior of the mMIMO-Orthogonal Frequency Division Multiplexing (OFDM) channel ensembles, which results from a small number of significant scatterers in the medium, is utilized in Compressed Sensing (CS)-based channel estimation methods. Moreover, the small spacing of the antenna pairs on the mMIMO node relative to the considerable distance traveled by the wave leads to a unique and common support for all the channel ensembles between two mMIMO nodes. Hence, using common sparsity channels, compressed channel estimation is improved significantly [1]–[8]. A channel estimation scheme based on training sequence (TS) design and optimization with high accuracy and spectral efficiency is investigated in the framework of structured compressive sensing in [2]. Channel estimation for indoor mMIMO systems is decomposed in the angular domain, and a unified channel estimation for TDD and FDD systems is proposed in [3]. Moreover, to reduce the required pilots and obtain accurate channel estimation in mMIMO systems by exploiting spatio-temporal sparsity, an adaptive structured subspace pursuit (ASSP) is introduced in [1]. The ASSP algorithm estimates the sparsity of the channel and the channel coefficients simultaneously. Choi et al. [5] utilize the temporal correlation of the channels to develop the locally common support (LCS) algorithm for channel estimation. Bayesian estimation was used via a marginal-based SCS method to improve the channel estimation accuracy in [6]. Stage-wise OMP (StOMP) was utilized in block sparsity mode to estimate the channel state information of MIMO-OFDM rapidly and accurately in [7]. In each step of the OMP algorithm, the estimation is based on just one atom, while in StOMP the estimation uses multiple atoms. Hence, StOMP-based algorithms converge more rapidly than OMP. OMP and its derivatives, such as StOMP, suffer from an inherent drawback caused by not eliminating bad atoms gathered in previous steps. This drawback is compensated in the Forward-Backward Pursuit (FBP) algorithm introduced in [9].

In this letter, we propose to utilize the common sparsity together with the large-scale behavior of the fading channel. First, we consider the common sparsity of all the channel ensembles by using the μ-norm. Then, in order to exploit the large-scale behavior of the channel, the channel is modeled by a weighted cost function based on the distance of the channel coefficients from the origin. Hence, the channel coefficients will be extracted from the random measurements optimally. In order to solve the resultant optimization problem, which addresses the common sparsity and the large-scale behavior, we propose a Stage-wise FBP (StFBP) algorithm in which multiple good atoms are gathered in the forward steps, while in the conventional FBP method only one atom is selected in each forward step. Consequently, the proposed method increases the speed of convergence. Moreover, exploiting backward steps makes it possible to omit bad atoms gathered previously; as a consequence, it improves the accuracy of estimation relative to StOMP and OMP. To exploit the proposed StFBP, we design the channel matrix and measurement matrix to exploit the common sparsity using the μ-norm which is introduced. The system model is formulated using matrix representation to exploit the common sparsity of the channel. Then, a StFBP algorithm-based channel estimation is developed to estimate the channel coefficients in the time domain.

The remainder of this letter is as follows. In Section II the system model is represented, and the problem is formulated.
In Section III, the proposed channel estimation method is introduced. Numerical results and concluding remarks are presented in Sections IV and V, respectively.

AKBARPOUR-KASGARI AND ARDEBILIPOUR: mMIMO-OFDM CHANNEL ESTIMATION VIA DISTRIBUTED CS 377

II. SYSTEM MODEL AND PROBLEM FORMULATION

A. System Model

Consider an mMIMO BS A and a user terminal (UT) B, which are equipped with $N_A$ and $N_B$ transmit-receive antennas, respectively. Each terminal employs OFDM signaling on each u-th antenna, with $u = 0, 1, \ldots, N_A (N_B) - 1$, to exchange data with the other terminal. The emitted OFDM symbol is constructed using N subcarriers which are spaced by $\Delta f$ hertz and prefaced by an $N_G$-sample cyclic prefix (CP). The signal received by the v-th antenna of the receiver, with $v = 0, 1, \ldots, N_B (N_A) - 1$, passes through a frequency-selective channel which can be described by a Finite Impulse Response (FIR) filter and consists of multiple paths. $T_s = 1/(N \Delta f)$ is used as the sampling period in the receiver. Ignoring the chips that fall in the CP and transforming the other chips into the Fourier domain, the received signal is ready to be selected on the pilot subcarriers. This selection is made on each of the antennas and the respective pilot sequence. As a consequence, the received signal on the n-th pilot symbol can be derived as
$$y_{uv}(n) = \sum_{l=0}^{L-1} h_{uv}(l)\, e(l, u, n) + w_{uv}(n) \tag{1}$$
where $n = 0, 1, \ldots, N_p - 1$ is the pilot index, $l = 0, 1, \ldots, L - 1$ is the path index and $e(l, u, n) = e^{-j 2\pi l p_u(n)/N}$. Moreover, $p_u(n)$ represents the n-th pilot of the u-th transmit antenna with $n = 0, 1, \ldots, N_p$. $w_{uv}(n)$ denotes the received Additive White Gaussian Noise (AWGN) sample between the u-th transmit antenna and the v-th receive antenna in the n-th pilot sample, which obeys $\mathcal{CN}(0, \sigma_w^2)$. Besides, $h_{uv}(l)$ is the l-th channel tap between the u-th transmit antenna and the v-th receive antenna. The vector representation of (1) can be formulated as
$$y_{uv} = \Phi_u h_{uv} + w_{uv} \tag{2}$$
where $y_{uv} = [y_{uv}(0), y_{uv}(1), \ldots, y_{uv}(N_p - 1)]^T$ is the received pilot vector, $\Phi_u = \mathrm{diag}\{x_u\} F_u$ is the measurement matrix with $x_u = [x_u(0), x_u(1), \ldots, x_u(N_p - 1)]^T$ as the transmitted pilot vector and $F_u$ as the partial Fourier matrix with $N_p$ rows corresponding to the pilot sequence of transmit antenna u and the first L columns, $h_{uv} = [h_{uv}(0), h_{uv}(1), \ldots, h_{uv}(L - 1)]^T$ is the channel coefficient vector and $w_{uv} = [w_{uv}(0), w_{uv}(1), \ldots, w_{uv}(N_p - 1)]^T$ is the received noise vector. Besides, in this formulation the superscript T denotes the transpose and diag{.} denotes the diagonal matrix formed from the corresponding vector.

B. Channel Model

As mentioned, the channel between a transmit-receive pair, denoted by $h_{uv}$, consists of L resolvable paths. These resolvable paths result from L scatterers encountered by the signal traveling from transmitter antenna u to receiver antenna v. K of these L scatterers are significant scatterers, where K ≪ L. Consequently, the channel can be modeled using sparse vectors. Moreover, the distance traveled by the signal is large relative to the transmit-receive antenna spacing in each terminal. Hence, the scatterers encountered in each chip period are identical across different antennas. In other words, the delays of different paths are the same in all the channel ensembles between two terminals. Thus, the sparsity pattern of different channel pairs can be assumed to be the same while the channel attenuation is different. Since each path's channel attenuation is composed of multiple distinct scatterers with zero-mean, independent and identically distributed (i.i.d.) subpaths, it is assumed to be $\mathcal{CN}(0, \sigma^2)$. Hence, the channel coefficient can be represented as
$$h_{uv}(l) = \sum_{i=0}^{I-1} \alpha_{uv}(i)\, g(lT - \tau(i)) \tag{3}$$
where $l \in [0, 1, \ldots, L]$ is the channel path index, $\tau(I-1) \ge \cdots \ge \tau(1) \ge \tau(0)$ are the respective path delays, $\alpha_{uv}$ is the corresponding path gain, and g(.) is the shaping pulse in the continuous domain. The shaping pulse is zero outside the interval $[0, T_g]$, where $T_g$ is an integer multiple of the chip time T. Without loss of generality, we assume that the $\tau(i)$ are integer multiples of T. Thus, the number of channel paths, caused by the channel itself and the shaping filter, is given by $L = \tau(I-1)/T + T_g/T + 1$. Furthermore, we assume that L is lower than $T_G/T$. Using the mentioned notations, we can represent the channel impulse response using $h_{uv} = [h_{uv}(0), h_{uv}(1), \ldots, h_{uv}(L-1)]^T$.

C. Problem Formulation

Up to now, the channel model and the system model have been presented. In order to estimate the channel impulse response, we have developed a formulation that utilizes the channel common sparsity together with the large-scale characteristic. To develop the model we consider Multiple Input and Single Output (MISO), since it is practical for mMIMO systems. As mentioned earlier, the channel impulse response can be represented as $h_{uv}$, wherein for MISO channels $u = 0, 1, \ldots, N_B - 1$ and $v = 0$. Using the channel ensembles in the MISO system, we can represent the channel matrix as $H = [h_{00}, h_{10}, \ldots, h_{(N_A - 1)0}]$, where the size of H is $L \times N_A$. By collecting all the received pilot sequences, we can represent the $N_p \times N_A$ received pilot matrix as $Y = [y_{00}, y_{10}, \ldots, y_{(N_B - 1)0}]$. Hence, the extension of (2) can be represented in matrix form for the MISO case as
$$Y = \Phi H + W \tag{4}$$
where W is the AWGN matrix whose columns correspond to the $w_{u0}$. Φ is the measurement matrix with the size of $N_p \times L$. For the sake of simplicity, we drop the index 0 for the UT antenna. Defining the μ-norm for matrices as $\mu(H) = \mathrm{card}\{\|H_u\|_2 \neq 0\}$, where $H_u$ is the u-th column of H and card{S} is the number of elements in the set S, estimating the channel can be accomplished by the following optimization criterion.
$$\min \ \|Y - \Phi H\|_2^2 \quad \text{s.t.} \quad \mu(H) \le K. \tag{5}$$
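The two quantities in (5) can be evaluated directly for small matrices. The sketch below is illustrative only: it assumes matrices are stored as lists of rows of (possibly complex) numbers, reads the squared norm of the residual matrix entrywise (the sum of squared magnitudes), and follows the text in counting the columns $H_u$ with nonzero norm. The names `mu_norm`, `objective` and the tolerance `tol` are hypothetical.

```python
def mu_norm(H, tol=1e-12):
    """mu(H): number of columns of H whose l2 norm is nonzero (above tol)."""
    n_cols = len(H[0])
    count = 0
    for u in range(n_cols):
        energy = sum(abs(row[u]) ** 2 for row in H)
        if energy > tol:
            count += 1
    return count


def objective(Y, Phi, H):
    """F(H) = ||Y - Phi H||^2, summed over all entries of the residual."""
    n_rows, n_inner, n_cols = len(Phi), len(H), len(H[0])
    total = 0.0
    for n in range(n_rows):
        for u in range(n_cols):
            predicted = sum(Phi[n][l] * H[l][u] for l in range(n_inner))
            total += abs(Y[n][u] - predicted) ** 2
    return total


# Toy instance: identity measurement matrix, one active column in H.
Phi = [[1, 0], [0, 1]]
H = [[1 + 1j, 0], [2, 0]]
Y = [[1 + 1j, 0], [2, 0]]
print(mu_norm(H), objective(Y, Phi, H))
```

A candidate H is feasible for (5) when `mu_norm(H) <= K`, and among feasible candidates the one with the smallest `objective` value is preferred.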
Apparently, $K$ is the maximum sparsity of the columns of $H$. We denote the objective function by $F(H) = \|Y - \Phi H\|_2^2$. Utilizing $\mu(H)$, we can exploit the joint sparsity of the channel ensembles in the system. The objective function in (5) represents the error of the channel estimation method, and the constraint controls the sparsity order of the channel ensembles in $H$. Moreover, through the $\mu$-norm definition, the block sparsity of the channels is exploited.

III. PROPOSED STAGE-WISE FORWARD-BACKWARD PURSUIT
To solve the optimization problem in Eq. (5), the pseudo-inverse of the matrix $\Phi$ is needed. Calculating the channel coefficients without taking the channel characteristics into account causes difficulties in the inversion of the matrix, since this inversion does not consider any characteristics of the channels. To account for the channel impulse response, we consider the large-scale fading phenomenon. The power of each path of the channel changes according to an exponential distribution based on the distance from the origin. In fact, each path can be described by the following equation

$$Z(H) = \sum_{l=0}^{L-1} \|h_l\|_2^2 \, \omega_l \tag{6}$$

where

$$\omega_l = \begin{cases} 1, & \text{if } l = 0 \\ \dfrac{\sigma}{l^{\alpha}}, & \text{otherwise} \end{cases} \tag{7}$$

where $0 \le l \le L-1$, $\sigma$ is the coefficient of the pathloss model and $\alpha$ is the environment factor. Accordingly, by defining the new vector $g_l = [h_1(l), h_2(l), \ldots, h_{N_A}(l)]$ and consequently the matrix $G = [g_0^T, g_1^T, \ldots, g_{L-1}^T]^T$, $Z(H)$ can be written as

$$Z(H) = \mathrm{Tr}(G \Omega G^H) \tag{8}$$

where $\Omega = \mathrm{diag}\{\omega_0, \omega_1, \ldots, \omega_{L-1}\}$. Thus, in order to consider the channel ensembles, we change the optimization problem in Eq. (5) to

$$\min \|Y - \Phi H\|_2^2 + \lambda \mathrm{Tr}(G \Omega G^H) \quad \text{s.t.} \quad \mu(H) \le K \tag{9}$$

where $\lambda$ is the regularization factor. To solve the problem, we set the gradient of the objective to zero, which gives

$$\Phi^H \Phi H + \frac{\lambda}{2} H \Omega = \Phi^H Y \tag{10}$$

which is a Lyapunov equation [10]. Hence, by solving a Lyapunov equation, we can calculate the channel coefficients.
In order to handle the optimization in (9), we have proposed a Stage-wise Forward-Backward Pursuit (StFBP) based on [9], where the $\ell_0$ norm was used. The algorithm is presented in detail in Algorithm 1. Eq. (9) could be solved using three different families of methods: convex relaxation, greedy processes, and message-passing algorithms. StFBP, which is based on the message-passing algorithm, is used here because of its forward selection and backward fixing. Specifically, StOMP, which is a greedy algorithm, is a particular case of StFBP in which the forward selection is present but the backward fixing is absent; consequently, it cannot fix its own mistakes made in previous steps. Moreover, the FBP algorithm constructs the new subspace by adding just one atom to the previous subspace, and in the backward steps it reconstructs the subspace by omitting bad atoms. To increase the speed of convergence, we have proposed to add multiple elite atoms in each forward step and to exclude several atoms in the backward step as well. In this case, the speed of convergence increases, while the accuracy is guaranteed by the backward steps. As a consequence, the proposed StFBP algorithm can be compared with its greedy counterpart StOMP, which is the particular case of StFBP without the backward steps that increase the estimation accuracy.

Algorithm 1 StFBP-Based Channel Estimation
1: $R^{(0)} = Y$, $\lambda^{(0)} = \emptyset$ and $t = 0$
2: while stop criterion not met do
3: Proxy of the signal is formed by $P = \Phi^H R^{(t-1)}$
4: $r = P\theta$
5: Select good atoms according to $\{i^{(t)} : r_i \ge \tau\}$
6: Merging supports of the previous iteration and the present one $\lambda^{(t)} = \lambda^{(t-1)} \cup i^{(t)}$
7: Calculate $H^{(t)}$ by solving Eq. (10)
8: $\delta_F^{(t)} = F(H^{(t-1)}) - F(H^{(t)})$
9: while 1 do
10: $j^{(t)} = \arg\min_{j \in \lambda^{(t)}} F(H^{(t)} - H_j^{(t)})$
11: $\delta_B^{(t)} = F(H_j^{(t)}) - F(H^{(t)})$
12: if $\delta_B^{(t)} \ge 0.5\,\delta_F^{(t)} / |i^{(t)}|$ then
13: Update residual $R^{(t)} = Y - \Phi_{\lambda^{(t)}} H^{(t)}$
14: break
15: end if
16: Exclude bad atom $\lambda^{(t)} = \lambda^{(t)} - j^{(t)}$
17: Update $H^{(t)}$ by solving Eq. (10).
18: Update residual $R^{(t)} = Y - \Phi_{\lambda^{(t)}} H^{(t)}$
19: end while
20: end while

In Algorithm 1, $|i^{(t)}|$ denotes the number of atoms selected in the forward step, and $H_j^{(t)}$ represents $H^{(t)}$ with its $j^{(t)}$-th column omitted. Furthermore, $\tau$ is the selection threshold in the forward steps and $\theta$ is the $N_B \times 1$ ones vector (i.e., all the elements equal to one).
Each step of Algorithm 1 involves a number of multiplications, which are listed in the following. The proxy matrix costs $4N_A N_p$ real multiplications. Moreover, in the 9th step $4N_A N_p$ real multiplications are required. In each step $t$ the cardinality of $\lambda^{(t)}$ is denoted by $|\lambda^{(t)}|$; hence, the overall cost of the 10th step is $4|\lambda^{(t)}| + 4N_B$. In the backward stages, calculating $j^{(t)}$ and $\delta_B^{(t)}$ costs $4(|\lambda^{(t)}| - 1) + 4N_B$. Furthermore, computing the residual in the 15th and 20th steps costs $4|\lambda^{(t)}|$ real multiplications. Finally, updating $H^{(t)}$ in each step consumes $4N_A N_p$ real multiplications.

IV. NUMERICAL RESULTS
The simulation results are presented in this section to compare the performance of the proposed method with other
AKBARPOUR-KASGARI AND ARDEBILIPOUR: mMIMO-OFDM CHANNEL ESTIMATION VIA DISTRIBUTED CS 379
known approaches. In the simulations, the number of subcarriers was N = 2048 with 15 kHz spacing between them. The BS was equipped with a large number of antennas. Moreover, in each transmitter, the interleaved bits are modulated using 16 Quadrature-Amplitude-Modulation (16-QAM). The transmit power of the data and pilot subcarriers is the same. Besides, the total available power is divided among the transmitting antennas. The channels are distributed according to the Extended Pedestrian A (EPA) model available in the LTE-A standard channel model [11]. Moreover, the channel taps are scaled to support a total channel power of $N_A N_B$. In the detection, after channel estimation, the zero-forcing equalizer removes the impact of the channel. Furthermore, the simulation results are averaged over 2000 independent runs.
At first, the proposed channel estimation is compared with another well-known DCS-based channel estimation method, ABSP from [4]. The results are shown in Fig. 1. This simulation is carried out using $N_p = 32$ pilot subcarriers for each transmitting antenna. As shown in Fig. 1, the proposed StFBP approach outperforms the other method. From the NMSE perspective, the proposed approach is superior to the ABSP method by almost 7 dB. Moreover, increasing the SNR beyond 30 dB is seemingly useless for ABSP. The main advantage of the proposed StFBP approach over the well-known ABSP is the fixing process, which is absent in ABSP. As a consequence, bad atoms selected in previous steps are omitted and the NMSE becomes lower, while in ABSP the bad atoms selected in previous steps are not omitted and remain in the final estimated channel impulse response. Hence, adjusting a reasonable threshold for selecting atoms in each step is essential in ABSP, while in the proposed StFBP this is not as important as in ABSP, since in the fixing process we can omit the bad atoms chosen in previous steps.
Fig. 1. Comparison of different estimation algorithms using $N_p = 32$ pilots and $N_A = 20$ antennas from the NMSE viewpoint.
Fig. 2 compares the NMSE for different numbers of BS antennas. The NMSE changes by only about 3 dB as the number of antennas increases, which demonstrates that the proposed method is applicable to a BS with a large number of antennas. Moreover, the proposed channel estimation is superior to the ABSP method by approximately 5 dB at lower SNRs and 15 dB at higher SNRs for all the numbers of antennas.
Fig. 2. Comparison of different numbers of antennas in the BS using $N_p = 32$ pilots from the NMSE viewpoint.

V. CONCLUSION
In this letter, channel estimation for the mMIMO-OFDM system using the DCS approach is considered. To utilize the channel block sparsity together with the large-scale behavior of the MIMO channel, we have proposed the StFBP algorithm. The proposed estimation approach utilizes the channel block sparsity and adds multiple atoms in each forward step while omitting badly selected atoms. Removing bad atoms makes it possible to improve the precision of the channel estimation, which is clearly appreciable in comparison with ABSP. In order to increase the speed of convergence, we have proposed to select multiple atoms in each forward step.

REFERENCES
[1] Z. Gao, L. Dai, W. Dai, B. Shim, and Z. Wang, “Structured compressive sensing-based spatio-temporal joint channel estimation for FDD massive MIMO,” IEEE Trans. Commun., vol. 64, no. 2, pp. 601–617, Feb. 2016.
[2] X. Ma, F. Yang, S. Liu, J. Song, and Z. Han, “Design and optimization on training sequence for mmWave communications: A new approach for sparse channel estimation in massive MIMO,” IEEE J. Sel. Areas Commun., vol. 35, no. 7, pp. 1486–1497, Jul. 2017.
[3] D. Fan, F. Gao, G. Wang, Z. Zhong, and A. Nallanathan, “Angle domain signal processing aided channel estimation for indoor 60GHz TDD/FDD massive MIMO systems,” IEEE J. Sel. Areas Commun., vol. 35, no. 9, pp. 1948–1961, Sep. 2017.
[4] Y. Nan, L. Zhang, and X. Sun, “Efficient downlink channel estimation scheme based on block-structured compressive sensing for TDD massive MU-MIMO systems,” IEEE Wireless Commun. Lett., vol. 4, no. 4, pp. 345–348, Aug. 2015.
[5] J. W. Choi, B. Shim, and S.-H. Chang, “Downlink pilot reduction for massive MIMO systems via compressed sensing,” IEEE Commun. Lett., vol. 19, no. 11, pp. 1889–1892, Nov. 2015.
[6] M. Masood, L. H. Afify, and T. Y. Al-Naffouri, “Efficient coordinated recovery of sparse channels in massive MIMO,” IEEE Trans. Signal Process., vol. 63, no. 1, pp. 104–118, Jan. 2015.
[7] D. Lee, “MIMO OFDM channel estimation via block stagewise orthogonal matching pursuit,” IEEE Commun. Lett., vol. 20, no. 10, pp. 2115–2118, Oct. 2016.
[8] A. Akbarpour-Kasgari and M. Ardebilipour, “Probability-based pilot allocation for MIMO relay distributed compressed sensing based channel estimation,” EURASIP J. Adv. Signal Process., vol. 2018, no. 1, pp. 1–18, Mar. 2018.
[9] T. Zhang, “Adaptive forward-backward greedy algorithm for learning sparse representations,” IEEE Trans. Inf. Theory, vol. 57, no. 7, pp. 4689–4708, Jul. 2011.
[10] A. Jameson, “Solution of the equation AX + XB = C by inversion of an M × M or N × N matrix,” SIAM J. Appl. Math., vol. 16, no. 5, pp. 1020–1023, Sep. 1968.
[11] User Equipment Radio Transmission and Reception (Rel. 12), V12.9.0, 3GPP Standard TS 36.101, Oct. 2015.
Analysis of Unslotted IEEE 802.15.4 Networks With Heterogeneous Traffic Classes
J. Ortín, M. Cesana, A. E. C. Redondi, M. Canales, and J. R. Gállego

Abstract—We propose a modeling framework composed of a Markov chain and the related coupling equations to evaluate the performance of unslotted IEEE 802.15.4 wireless sensor networks based on CSMA/CA medium access control. Different from the related literature, the proposed model is able to capture heterogeneous classes of nodes with class-specific traffic generation rate and includes a more refined calculation of the probability of finding the channel busy during the carrier sensing process. The proposed model is used to derive class-specific performance figures including the probability of a successful transmission and the average delay for a successful/unsuccessful transmission.
Index Terms—Wireless sensor networks, IEEE 802.15.4, medium access control (MAC).

Manuscript received September 12, 2018; accepted September 24, 2018. Date of publication October 1, 2018; date of current version April 9, 2019. This work was supported in part by the Spanish Government from the Mobility Program of the Ministerio de Educacion, Cultura y Deporte under Grant CAS17/00624, in part by the Ministerio de Ciencia e Innovación under Project TEC2014-52969-R, in part by the Universidad de Zaragoza under Project UZ2018-TEC-04, and in part by the Centro Universitario de la Defensa under Project CUD2017-18. The associate editor coordinating the review of this paper and approving it for publication was H. Zhou. (Corresponding author: J. Ortín.)
J. Ortín is with the Centro Universitario de la Defensa, 50090 Zaragoza, Spain (e-mail: jortin@unizar.es).
M. Cesana and A. E. C. Redondi are with the Dipartimento di Elettronica, Informazione e Bioingegneria, Politecnico di Milano, 20133 Milan, Italy (e-mail: matteo.cesana@polimi.it; alessandroenrico.redondi@polimi.it).
M. Canales and J. R. Gállego are with the Instituto de Investigación en Ingeniería de Aragón, Universidad de Zaragoza, 50018 Zaragoza, Spain (e-mail: mcanales@unizar.es; jrgalleg@unizar.es).
Digital Object Identifier 10.1109/LWC.2018.2873347
2162-2345 © 2018 IEEE. Translations and content mining are permitted for academic research only. Personal use is also permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

I. INTRODUCTION
The diffusion of wireless sensor networks based on the IEEE 802.15.4 standard has stimulated research efforts on the performance evaluation of its Medium Access Control (MAC) scheme. For instance, a Markov chain model for unslotted, acknowledged 802.15.4 CSMA/CA is proposed in [1] for unsaturated nodes considering a deterministic idle time after every transmission. Markov chains are also used to model the prioritized contention access for industrial application of IEEE 802.15.4 [2] or to analyze the performance of a modified MAC mechanism assuming homogeneous and Poisson traffic [3]. A different approach is followed in [4], where the modeling of unslotted CSMA/CA is carried out using Event Chains Computation instead of Markov chains, showing that this approach better models the case where nodes start reporting data simultaneously when an event is detected.
The number of works on heterogeneous 802.15.4 networks is more limited. The main works modeling heterogeneous traffic in IEEE 802.15.4 networks are [5] for acknowledged unslotted 802.15.4 and [6] for unacknowledged slotted 802.15.4. To fill this gap and based on [5], we propose here a Markovian model to evaluate the performance of heterogeneous unslotted 802.15.4 networks with different “classes” of nodes (each class generating traffic according to a class-specific rate). The main contributions of this letter with respect to [5] are as follows. First, instead of using a different Markov chain to model each node, we use a Markov chain for each class of nodes, thus decreasing the complexity of the model. Second, all the aforementioned related works assume that the probability of finding the channel busy during a Clear Channel Assessment (CCA) does not depend on the backoff stage of the node; although this effect is not relevant when all nodes are saturated, the accuracy of the model is compromised when saturated and unsaturated nodes coexist in the same network; to this extent, we show how to keep track of the backoff stage when deriving the probability of finding the channel busy. Third, we derive a more accurate expression for the collision probability experienced by the nodes. Specifically, we consider that if there is a collision, the probability that any other node performs CCA must be conditioned by the fact that the channel is not busy.

II. PROPOSED MODEL
We assume a scenario with $M$ classes of nodes, each class $l$ formed by $N_l$ nodes generating packets according to a Poisson process of rate $\lambda_l$. All the nodes access the medium according to the unslotted IEEE 802.15.4 CSMA/CA protocol; when a node tries to transmit a new packet, it waits for a random number of backoff slots in the range $[0, 2^{BE} - 1]$, where $BE$ is the backoff exponent, initialized to $m_{min}$. When the backoff counter is 0, the node performs CCA to determine whether the transmission channel is empty. If not, $BE$ is increased by 1 until it reaches the limiting value $m_{max}$ and the node waits for a new random backoff period generated with the new value of $BE$. This process is repeated until the number of failed CCAs exceeds the parameter $m$. In that case, the packet is discarded due to a channel access failure. On the contrary, if the channel is empty, the node switches from the listening mode to the transmitting mode, transmits the packet and waits for the reception of the ACK. If the ACK is not received, then the packet is retransmitted following the CSMA/CA mechanism described above. This process can be repeated up to $n$ times. When this value is exceeded, the packet is discarded due to a collision failure.
To model the backoff, sensing and transmitting states of the nodes, we rely on the Markov chain model shown in Fig. 1 and proposed in [5]. A state in the chain is the tuple $(i, j, r)$, with $i$ the backoff stage, $j$ the backoff counter and $r$ the retransmission counter. The backoff stage and the retransmission counter are limited by the parameters $m$ and $n$ respectively. Similarly, $j$ ranges from 0 to $W_i = 2^{BE_i} - 1$, with $BE_i$ the backoff exponent corresponding to the backoff stage $i$. In the states with
ORTÍN et al.: ANALYSIS OF UNSLOTTED IEEE 802.15.4 NETWORKS WITH HETEROGENEOUS TRAFFIC CLASSES 381
Fig. 1. Markov chain model of the CSMA/CA algorithm of a transmitting node of class $l$ for unslotted IEEE 802.15.4 MAC.
$j = 0$ the node performs CCA. We call $\alpha_l$ the probability that a node of class $l$ finds the channel busy upon CCA, and $P_{c,l}$ the collision probability for nodes of class $l$.
The states $(-1, j, r)$ represent the transmission of a packet, with $0 \le j < L_s$, and $L_s$ the duration in slots of a successful transmission.¹ Similarly, the states $(-2, j, r)$ represent the collision of a packet, with $0 \le j < L_c$, and $L_c$ the duration in slots of a collided transmission.² Note that the meaning of the counter $j$ depends on the value of $i$, as it can be the backoff counter (when $i \ge 0$), or a counter going through the different slots of a successful/collided transmission (when $i = -1, -2$).
¹ This value is $L_s = L + t_{ack} + L_{ack} + IFS$, with $L$ the total transmission time of a packet, $t_{ack}$ the ACK waiting time, $L_{ack}$ the transmission time of the ACK and $IFS$ the Inter-Frame Spacing.
² This value is $L_c = L + t_{m,ack}$, with $t_{m,ack}$ the timeout of the ACK.
The traffic generation for a node of class $l$ is modeled with a packet generation probability in idle state $q_l$. We also include in the model the probabilities of having a packet ready to be transmitted after a successful transmission $q_{suc,l}$, after a channel access failure $q_{cf,l}$ and after a collision failure $q_{cr,l}$. The expressions of these probabilities are derived afterward.
From this model, we can compute the probability $\tau_l$ that a node belonging to class $l$ performs CCA in a randomly chosen time slot. Following [5], this probability is

$$\tau_l = \frac{1 - \alpha_l^{m+1}}{1 - \alpha_l} \, \frac{1 - y_l^{n+1}}{1 - y_l} \, p_l(0,0,0), \tag{1}$$

where $p_l(0,0,0)$ is the steady state probability of state $(0,0,0)$ for nodes of class $l$ and $y_l = P_{c,l}(1 - \alpha_l^{m+1})$. The expression for $p_l(0,0,0)$ is given in Eq. (2) at the top of the next page.
The collision probability for nodes of class $l$ is the probability that any other node performs a CCA in a $2t_{ta}$ period, with $t_{ta}$ the turnaround time to the transmitting mode,

$$P_{c,l} = 1 - \big(1 - (1+\gamma)\tilde{\tau}_l\big)^{N_l - 1} \prod_{i \in M,\, i \neq l} \big(1 - (1+\gamma)\tilde{\tau}_i\big)^{N_i}, \tag{3}$$

where $\gamma = \frac{2t_{ta} - t_b}{t_b}$ is a corrective factor that introduces into the model the fact that the length of the period where collisions can happen, $2t_{ta}$, is longer than a backoff slot $t_b$.
Additionally, the term $\tilde{\tau}_l$ represents the probability that a node of class $l$ performs a CCA, conditioned by the fact that the channel is not busy (if it were busy, the CCA would have failed and there could not be a collision). From the Markov chain of Fig. 1 and assuming that $L_c \approx L_s$, this probability is

$$\tilde{\tau}_l = \frac{\tau_l}{1 - \tau_l (1 - \alpha_l) L_s}. \tag{4}$$

In order to compute $\alpha_l$, previous works have assumed that this probability does not depend on the specific backoff stage of the node. Nevertheless, this is not accurate when the traffic load is low. In that case, the probability of finding the channel busy in the first CCA is low, but if the channel is found busy in the first attempt, it is likely that the packet that caused the first CCA failure is still occupying the channel on the second attempt, thus increasing the probability of finding the channel busy in that attempt.
The introduction of this effect into the model would imply the use of a different $\alpha_l^{(i)}$ for each backoff stage $i$ of the Markov chain in Fig. 1, which is too complex. In order to reduce the complexity of the model, we propose to use a unique term $\alpha_l$ in the Markov chain and compute it with

$$\alpha_l = \frac{\sum_{i=0}^{m} \prod_{k=0}^{i} \alpha_l^{(k)}}{1 + \sum_{i=0}^{m-1} \prod_{k=0}^{i} \alpha_l^{(k)}} \approx \alpha_l^{(0)} \frac{1 + \alpha_l^{(1)}}{1 + \alpha_l^{(0)}}. \tag{5}$$

The probability $\alpha_l^{(0)}$ is given by $\alpha_l^{(0)} = \alpha_{l,pkt} + \alpha_{l,ack}$, where $\alpha_{l,pkt}$ and $\alpha_{l,ack}$ are the probabilities of finding the channel busy during the first CCA because of the transmission of a data packet and an ACK respectively.
To compute $\alpha_{l,pkt}$, let $T_i$ be the event that at least one node of class $i$ is transmitting when a node of class $l$ performs CCA. Then

$$\alpha_{l,pkt} = P\Big(\bigcup_{i=1}^{M} T_i\Big) = \sum_{i=1}^{M} P\Big(T_i \cap \bigcap_{j=1}^{i-1} T_j^C\Big), \tag{6}$$

with

$$P\Big(T_i \cap \bigcap_{j=1}^{i-1} T_j^C\Big) = L \big(1 - (1 - \tau_i)^{N'_i}\big)(1 - \alpha_i) \prod_{j=1}^{i-1} (1 - \tau_j)^{N'_j}, \tag{7}$$

where $N'_i = N_i$ if $i \neq l$ and $N'_i = N_i - 1$ if $i = l$. On the other hand, $\alpha_{l,ack}$ is the probability of finding the channel busy because of the successful transmission of a packet, which happens when only one packet is being transmitted

$$\alpha_{l,ack} = L_{ack} \sum_{i \in M} N'_i \tau_i (1 - \tau_i)^{N'_i - 1} \prod_{j \in M,\, j \neq i} (1 - \tau_j)^{N'_j}. \tag{8}$$
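To make Eqs. (3) and (4) concrete, the following minimal sketch evaluates the conditioned CCA probability and the class-specific collision probability. The function names, the zero-based class indexing and all numeric inputs are hypothetical choices for illustration; only the turnaround time and backoff slot (12 and 20 symbols of 16 μs) and the value $L_s = 9$ implied by the MAC parameters of the results section are taken from the letter.

```python
def conditioned_cca_prob(tau, alpha, L_s):
    """Eq. (4): CCA probability conditioned on the channel not being busy."""
    return tau / (1.0 - tau * (1.0 - alpha) * L_s)


def collision_prob(l, tau_tilde, N, t_ta, t_b):
    """Eq. (3): collision probability seen by a node of class l.

    tau_tilde[i] is the conditioned CCA probability of class i and
    N[i] the number of nodes in class i (classes indexed from 0 here).
    """
    gamma = (2.0 * t_ta - t_b) / t_b  # corrective factor of Eq. (3)
    # Probability that none of the other N[l] - 1 nodes of class l senses
    no_cca = (1.0 - (1.0 + gamma) * tau_tilde[l]) ** (N[l] - 1)
    # ... and that no node of any other class senses either
    for i in range(len(N)):
        if i != l:
            no_cca *= (1.0 - (1.0 + gamma) * tau_tilde[i]) ** N[i]
    return 1.0 - no_cca


# Illustrative inputs only: class 0 has 1 node, class 1 has 50 nodes.
tau = [0.05, 0.002]    # hypothetical CCA probabilities
alpha = [0.10, 0.10]   # hypothetical busy-channel probabilities
tau_tilde = [conditioned_cca_prob(t, a, L_s=9) for t, a in zip(tau, alpha)]
print(collision_prob(0, tau_tilde, N=[1, 50], t_ta=12, t_b=20))
```

In the full model these values are not free inputs: $\tau_l$, $\alpha_l$ and $P_{c,l}$ are coupled and have to be solved for jointly.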
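$$
p_l(0,0,0) =
\begin{cases}
\begin{aligned}
&\bigg[ \frac{1}{2}\bigg( \frac{1-(2\alpha_l)^{m+1}}{1-2\alpha_l} W_0 + \frac{1-\alpha_l^{m+1}}{1-\alpha_l} \bigg) \frac{1-y_l^{n+1}}{1-y_l} + \big(L_s(1-P_{c,l}) + L_c P_{c,l}\big)\big(1-\alpha_l^{m+1}\big) \frac{1-y_l^{n+1}}{1-y_l} \\
&\quad + \frac{1-q_{cf,l}}{q_l} \frac{\alpha_l^{m+1}\big(1-y_l^{n+1}\big)}{1-y_l} + \frac{1-q_{cr,l}}{q_l} y_l^{n+1} + \frac{1-q_{suc,l}}{q_l} (1-P_{c,l}) \frac{\big(1-\alpha_l^{m+1}\big)\big(1-y_l^{n+1}\big)}{1-y_l} \bigg]^{-1},
\end{aligned}
& \text{if } m < \hat{m} = m_{max} - m_{min} \\[2ex]
\begin{aligned}
&\bigg[ \frac{1}{2}\bigg( \frac{1-(2\alpha_l)^{\hat{m}+1}}{1-2\alpha_l} W_0 + \frac{1-\alpha_l^{\hat{m}+1}}{1-\alpha_l} + \big(2^{m_b+1} + 1\big) \alpha_l^{\hat{m}+1} \frac{1-\alpha_l^{m-\hat{m}}}{1-\alpha_l} \bigg) \frac{1-y_l^{n+1}}{1-y_l} + \big(L_s(1-P_{c,l}) + L_c P_{c,l}\big)\big(1-\alpha_l^{m+1}\big) \\
&\quad \times \frac{1-y_l^{n+1}}{1-y_l} + \frac{1-q_{cf,l}}{q_l} \frac{\alpha_l^{m+1}\big(1-y_l^{n+1}\big)}{1-y_l} + \frac{1-q_{cr,l}}{q_l} y_l^{n+1} + \frac{1-q_{suc,l}}{q_l} (1-P_{c,l}) \frac{\big(1-\alpha_l^{m+1}\big)\big(1-y_l^{n+1}\big)}{1-y_l} \bigg]^{-1},
\end{aligned}
& \text{otherwise}
\end{cases}
\tag{2}
$$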
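The two forms of Eq. (5), which aggregates the per-stage busy probabilities $\alpha_l^{(k)}$ into the single $\alpha_l$ that appears in Eq. (2), can be compared numerically with the sketch below. The function names and input values are hypothetical; the per-stage probabilities are passed as a plain list indexed by backoff stage.

```python
def busy_prob_exact(alpha_stage):
    """First expression of Eq. (5); alpha_stage[k] is alpha_l^(k), k = 0..m."""
    m = len(alpha_stage) - 1
    prods = []
    running = 1.0
    for a in alpha_stage:
        running *= a
        prods.append(running)  # prods[i] = prod_{k=0}^{i} alpha_l^(k)
    return sum(prods) / (1.0 + sum(prods[:m]))


def busy_prob_approx(alpha0, alpha1):
    """Second expression of Eq. (5), using only the first two stages."""
    return alpha0 * (1.0 + alpha1) / (1.0 + alpha0)


# Hypothetical per-stage values with m = 4 (five backoff stages).
stages = [0.05, 0.40, 0.40, 0.40, 0.40]
print(busy_prob_exact(stages), busy_prob_approx(stages[0], stages[1]))
```

As a sanity check, when every stage has the same busy probability the exact expression collapses to that common value, and with only two stages the two expressions coincide.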
The term $\alpha_l^{(1)}$ can be computed as

$$\alpha_l^{(1)} \approx 1 \cdot P(E_0) + \alpha_l^{(0)} \big(1 - P(E_0)\big), \tag{9}$$

where $E_0$ is the event that the packet that has caused the CCA failure in the first attempt is still being transmitted when the second CCA is performed. From the Markov chain of Fig. 1, the probabilities of the states $(-1, j, r)$, with $0 \le j < L_s$ and a fixed $r$, are all equal. Therefore, if a device performs CCA and the channel is busy with a successful transmission, the node that is transmitting can be at any of the states of the form $(-1, j, r)$ with equal probability (i.e., it follows a discrete uniform distribution in $[0, L_s - 1]$). Likewise, if the channel is busy with a collided transmission, it can be at any of the states of the form $(-2, j, r)$ with equal probability (i.e., it follows a discrete uniform distribution in $[0, L - 1]$). On the other hand, the first stage of the backoff also follows a discrete uniform distribution in $[0, W_0 - 1]$. Therefore, if $S = U(0, L_s - 1)$, $C = U(0, L - 1)$ and $B_0 = U(0, W_0 - 1)$ are discrete uniform random variables, then

$$P(E_0) = P_{c,l} P(C > B_0) + (1 - P_{c,l}) P(S > B_0), \tag{10}$$

with

$$P(C > B_0) = \begin{cases} \dfrac{(W_0 - 1)/2 + L - W_0}{L}, & \text{if } L > W_0 \\ \dfrac{L - 1}{2W_0}, & \text{otherwise,} \end{cases} \tag{11}$$

and

$$P(S > B_0) = \begin{cases} \dfrac{(W_0 - 1)/2 + L_s - W_0}{L_s}, & \text{if } L_s > W_0 \\ \dfrac{L_s - 1}{2W_0}, & \text{otherwise.} \end{cases} \tag{12}$$

Eqs. (1), (3) and (5) form a system of coupled nonlinear equations with variables $\tau_l$, $\alpha_l$ and $P_{c,l}$ that can be solved numerically to obtain the point of operation of the network. From them, different performance metrics can be obtained.
First, we derive the probabilities of a discarded packet due to a collision failure, $P_{cr,l}$, and due to a channel access failure, $P_{cf,l}$. From the Markov chain of Fig. 1, we have

$$P_{cf,l} = \frac{\alpha_l^{m+1} \Big(1 - \big(P_{c,l}(1 - \alpha_l^{m+1})\big)^{n+1}\Big)}{1 - P_{c,l}\big(1 - \alpha_l^{m+1}\big)} \tag{13}$$

and

$$P_{cr,l} = \big(P_{c,l}(1 - \alpha_l^{m+1})\big)^{n+1}. \tag{14}$$

With this, the probability of a successful transmission is $P_{suc,l} = 1 - P_{cf,l} - P_{cr,l}$.
We show now the expressions for the average delay experienced by a packet in a successful transmission and when it is discarded due to a channel access failure or a retry limit.³
Let $T_{suc,l}$ be the delay of a packet transmitted successfully, $C_j$ the event of having a successful transmission after $j$ previous collisions, and $T_{suc,l}^{(j)}$ the delay experienced by a packet when the event $C_j$ occurs. Following [5],

$$E[T_{suc,l}] = \sum_{j=0}^{n} P(C_j) E\big[T_{suc,l}^{(j)}\big], \tag{15}$$

with

$$P(C_j) = \frac{\Big(1 - P_{c,l}\big(1 - \alpha_l^{m+1}\big)\Big) \Big(P_{c,l}\big(1 - \alpha_l^{m+1}\big)\Big)^{j}}{1 - \Big(P_{c,l}\big(1 - \alpha_l^{m+1}\big)\Big)^{n+1}} \tag{16}$$

and

$$E\big[T_{suc,l}^{(j)}\big] = L_s + t_{TA} + j L_c + \sum_{h=0}^{j} E[T_b], \tag{17}$$

with $T_b$ the random time that a node spends in backoff or sensing states during the CSMA/CA procedure. The expected value of $T_b$ is

$$E[T_b] = \sum_{i=0}^{m} P(D_i) E[T_{b,i}], \tag{18}$$

where $P(D_i)$ is the probability of finding the channel idle at the $(i+1)$-th attempt, given that the channel has been found busy in the preceding $i$ attempts and the packet has not been discarded due to a channel access failure; and $E[T_{b,i}]$ is the expected time a node spends in backoff or sensing states given the event $D_i$. $P(D_i)$ can be calculated as

$$P(D_i) = \frac{\alpha_l^{i}}{\sum_{k=0}^{m} \alpha_l^{k}} = \alpha_l^{i} \frac{1 - \alpha_l}{1 - \alpha_l^{m+1}}, \tag{19}$$

while

$$E[T_{b,i}] = (i + 1) t_{CCA} + t_b \sum_{k=0}^{i} \frac{W_k - 1}{2}, \tag{20}$$

with $t_b$ and $t_{CCA}$ the durations of a backoff slot and CCA.
Regarding the delay suffered by a packet when it is discarded due to a channel access failure, $T_{cf,l}$, it can be derived following the same approach used to compute $T_{suc,l}$

$$E[T_{cf,l}] = \sum_{j=0}^{n} P(F_j) E\big[T_{cf,l}^{(j)}\big], \tag{21}$$

³ We consider only the time from the instant the packet is ready to be transmitted until an ACK is received or until it is discarded because of the aforementioned failures (i.e., we do not include queuing time in this analysis).
Fig. 2. Performance metrics for a network with 1 saturated station and 50 unsaturated stations as a function of the traffic rate of the unsaturated stations.
where $F_j$ is the event of having a channel access failure after $j$ previous collisions and $T_{cf,l}^{(j)}$ is the delay suffered by a packet on the occurrence of the event $F_j$. It can be easily derived that $P(F_j) = P(C_j)$ and

$$E\big[T_{cf,l}^{(j)}\big] = \sum_{h=0}^{j-1} E[T_b] + j T_c + (m + 1) t_{CCA} + t_b \sum_{k=0}^{m} \frac{W_k - 1}{2}. \tag{22}$$

For $j = 0$, the term $\sum_{h=0}^{j-1} E[T_b]$ is 0.
The delay suffered by a packet when it is discarded due to a retry failure $T_{cr,l}$ is

$$E[T_{cr,l}] = (n + 1) L_c + E[T_b]. \tag{23}$$

Finally, the probabilities of having a packet ready to be transmitted in idle state, after a successful transmission, after a channel access failure and after a retry limit failure are $q_l = 1 - e^{-\lambda_l t_b}$, $q_{suc,l} = \lambda_l E[T_{suc,l}]$, $q_{cf,l} = \lambda_l E[T_{cf,l}]$ and $q_{cr,l} = \lambda_l E[T_{cr,l}]$. A detailed explanation of their derivation can be found in [5]. Note that in case a station is saturated, $q_{suc,l} = q_{cr,l} = q_{cf,l} = 1$ and the idle state in the Markov chain of Fig. 1 is removed.

III. RESULTS AND MODEL VALIDATION
To stress-test the proposed model, we have considered a network scenario with 50 sensor nodes generating packets at a low packet rate, and one node with a backlog of saturated traffic. The results obtained through the model are validated against a system-level, discrete-event simulator of the IEEE 802.15.4 PHY/MAC layers. All the simulated results represent the average of $10^8$ generated packets. The MAC parameters used in the simulations are $m_{min} = 4$, $m_{max} = 7$, $m = 4$, $n = 0$, $L = 7$, $L_{ack} = t_{m,ack} = 2$, $IFS = t_{ack} = 0$, $t_b = 20 \cdot 16$ μs, $t_{CCA} = 8 \cdot 16$ μs, and $t_{ta} = 12 \cdot 16$ μs.
Fig. 2 shows the performance of the different classes of nodes as the traffic generation rate of the unsaturated nodes varies, comparing the results of our model with those of the one proposed in [5]. Fig. 2(a) depicts the probability of a successful transmission for the saturated and unsaturated nodes. As expected, this probability is close to 1 for the saturated node when the traffic of the unsaturated nodes is very low, as it finds the channel idle most of the time. On the contrary, this probability starts approximately at 0.82 for the unsaturated ones, as they have to contend with the saturated one. As $\lambda$ increases, both probabilities converge since the unsaturated nodes tend to behave like the saturated one.
Fig. 2(b) shows the average delay incurred by a packet in a successful transmission, whereas Fig. 2(c) shows the average delay suffered by a packet when it is discarded due to a channel access failure or a retry limit. This delay is

$$T_{unsuc,l} = \frac{P_{cf,l}}{P_{cf,l} + P_{cr,l}} T_{cf,l} + \frac{P_{cr,l}}{P_{cf,l} + P_{cr,l}} T_{cr,l}. \tag{24}$$

In both cases, the delay is higher for unsaturated nodes as they will find the channel busy more frequently and will have to perform more backoffs.
The validity of the proposed model has also been tested varying the rates of each class as well as the number of nodes per class, obtaining similar results in terms of accuracy.

IV. CONCLUSION
We have studied the performance of unslotted 802.15.4 with heterogeneous classes of nodes and class-specific packet generation rate. We have validated our model in a scenario with a saturated node and a fixed number of unsaturated nodes with varying traffic rate. The results show that the proposed model reflects the simulated performance of the reference network scenario better than previous state-of-the-art approaches.

REFERENCES
[1] K. Govindan, A. P. Azad, K. Bynam, S. Patil, and T. Kim, “Modeling and analysis of non beacon mode for low-rate WPAN,” in Proc. 12th Annu. IEEE Consum. Commun. Netw. Conf. (CCNC), Jan. 2015, pp. 549–555.
[2] M. P. R. S. Kiran and P. Rajalakshmi, “Performance analysis of CSMA/CA and PCA for time critical industrial IoT applications,” IEEE Trans. Ind. Informat., vol. 14, no. 5, pp. 2281–2293, May 2018.
[3] S. R. Pattanaik, P. K. Sahoo, and S.-L. Wu, “Performance analysis of modified IEEE 802.15.4e MAC for wireless sensor networks,” in Proc. 14th ACM Symp. Perform. Eval. Wireless Ad Hoc Sensor Ubiquitous Netw., Nov. 2017, pp. 25–31.
[4] D. D. Guglielmo, F. Restuccia, G. Anastasi, M. Conti, and S. K. Das, “Accurate and efficient modeling of 802.15.4 unslotted CSMA/CA through event chains computation,” IEEE Trans. Mobile Comput., vol. 15, no. 12, pp. 2954–2968, Dec. 2016.
[5] P. D. Marco, P. Park, C. Fischione, and K. H. Johansson, “Analytical modeling of multi-hop IEEE 802.15.4 networks,” IEEE Trans. Veh. Technol., vol. 61, no. 7, pp. 3191–3208, Sep. 2012.
[6] J. Zhu, Z. Tao, and C. Lv, “Performance evaluation for a beacon enabled IEEE 802.15.4 scheme with heterogeneous unsaturated conditions,” Int. J. Electron. Commun., vol. 66, no. 2, pp. 93–106, 2012.
Full-Duplex Energy-Harvesting Enabled Relay Networks in Generalized Fading Channels
Khaled Rabie, Member, IEEE, Bamidele Adebisi, Senior Member, IEEE, Galymzhan Nauryzbayev, Member, IEEE, Osamah S. Badarneh, Member, IEEE, Xingwang Li, Member, IEEE, and Mohamed-Slim Alouini, Fellow, IEEE

Abstract—This letter analyzes the performance of a full-duplex decode-and-forward relaying network over the generalized κ-μ fading channel. The relay is energy-constrained and relies entirely on harvesting the power signal transmitted by the source based on the time-switching relaying protocol. A unified analytical expression for the ergodic outage probability is derived for the system under consideration. This is then used to derive closed-form analytical expressions for three special cases of the κ-μ fading model, namely, Rice, Nakagami-m and Rayleigh. Monte Carlo simulations are provided throughout to validate our analysis.
Index Terms—Decode-and-forward (DF) relaying, energy harvesting, full-duplex (FD), generalized κ-μ fading.

Manuscript received June 6, 2018; revised August 4, 2018; accepted August 31, 2018. Date of publication October 1, 2018; date of current version April 9, 2019. The associate editor coordinating the review of this paper and approving it for publication was L. Bai. (Corresponding author: Khaled Rabie.)

I. INTRODUCTION
Simultaneous wireless information and power transfer (SWIPT) in full-duplex (FD) relaying networks has recently attracted a great deal of research attention. For instance, Zhong et al. [1] considered a dual-hop FD SWIPT network with both amplify-and-forward (AF) and decode-and-forward (DF) relaying protocols equipped with a single antenna. Several analytical expressions of the achievable throughput were derived. Instead of the single-antenna relay, Mohammadi et al. [2] extended the work in [1] to include multiple-input multiple-output (MIMO) FD relaying. Unlike [1] and [2], which focused on Rayleigh fading channels, the work in [3] analyzed the performance of a FD SWIPT system in indoor environments characterized by log-normal fading. The study in [4] considered the outage probability of FD SWIPT networks over α-μ fading channels. Furthermore, Okandeji et al. [5] studied physical layer security in FD two-way relaying for SWIPT. More specifically, AF relaying with multiple antennas and zero-forcing was deployed in that work. Other studies exploiting jamming signals for energy-harvesting (EH) have recently appeared in [6] and [7].
None of the aforementioned works considered FD EH-enabled relay networks over generalized κ-μ fading channels. In contrast, and motivated by this lack of analysis, we present in this letter a thorough performance evaluation of FD EH-enabled relay networks over such generalized fading channels. Specifically, DF and time-switching relaying (TSR) protocols are deployed at the relay. The motivations of our work come from the following two factors. Firstly, the κ-μ model is a small-scale fading model and is able to characterize the scattering cluster in homogeneous communication environments including Rice (κ = k, μ = 1), Nakagami-m (κ → 0, μ = m) and Rayleigh (κ → 0, μ = 1) [8]–[10]. Secondly, FD communication allows devices to operate on the same frequency, which potentially doubles the spectral efficiency, and, because of this, FD has become a viable option for next generation wireless communication networks. Thus, the network studied herein is meaningful and valuable for consideration.
The main contribution of this letter resides in deriving a novel unified analytical expression for the ergodic outage probability of the proposed system over the generalized κ-μ fading channel. In addition, closed-form expressions for the aforementioned special cases of the κ-μ fading scenario are presented. The derived expressions were used to investigate the impact of several system and fading parameters on the performance. Results show that the system performance can be enhanced considerably as the fading parameters κ and μ are increased. It is also shown that as the loop-back interference associated with FD relaying increases, the outage probability performance deteriorates drastically.
II. S YSTEM M ODEL
K. Rabie and B. Adebisi are with the School of Engineering, The considered network consists of a source (S), a relay (R)
Manchester Metropolitan University, Manchester M15 6BH, U.K. (e-mail:
k.rabie@mmu.ac.uk; b.adebisi@mmu.ac.uk).
and a destination (D). The end nodes are equipped with a
G. Nauryzbayev is with the Division of Information and single-antenna whereas R, based on DF, has two antennas
Computing Technology, College of Science and Engineering, Hamad and operates in the FD mode. It is assumed that there is no
Bin Khalifa University, Qatar Foundation, Doha, Qatar (e-mail:
nauryzbayevg@gmail.com).
direct link between the end nodes due to severe shadowing and
O. S. Badarneh is with the Electrical and Communication Engineering path-loss effect [1]–[3]; hence, all communication is accom-
Department, School of Electrical Engineering and Information plished over two phases. The S-to-R, R-to-D and loop-back
Technology, German Jordanian University, Amman 11180, Jordan (e-mail:
osamah.badarneh@gju.edu.jo).
interference channel coefficients, denoted as h1 , h2 and h3 ,
X. Li is with the School of Physical and Electronics Engineering, respectively, are assumed to be independent but not necessar-
Henan Polytechnic University, Zhengzhou 454000, China (e-mail: ily identical following the κ-μ distribution with a probability
lixingwang@hpu.edu.cn).
density function (PDF)
M.-S. Alouini is with the Computer, Electrical and Mathematical Science

and Engineering (CEMSE) Division, King Abdullah University of Science and μi −1 φi z φi z
Technology, Thuwal 23955, Saudi Arabia (e-mail: slim.alouini@kaust.edu.sa). fh 2 (z ) = Υi z 2 exp − Iμi −1 2μi , (1)
Digital Object Identifier 10.1109/LWC.2018.2873360 i Ωi Ωi
RABIE et al.: FD EH ENABLED RELAY NETWORKS IN GENERALIZED FADING CHANNELS 385
where $i \in \{1, 2, 3\}$, $\Omega_i = \mathbb{E}[h_i^2]$, $\phi_i = \mu_i(1+\kappa_i)$, $\Upsilon_i = \dfrac{\mu_i(1+\kappa_i)^{\frac{\mu_i+1}{2}}}{\exp(\mu_i\kappa_i)\,\kappa_i^{\frac{\mu_i-1}{2}}\,\Omega_i^{\frac{\mu_i+1}{2}}}$, $I_p[\cdot]$ is the modified Bessel function of the first kind with arbitrary order $p$ [11, eq. (9.6.20)], $\mu_i$ represents the number of the multipath clusters and $\kappa_i > 0$ denotes the ratio between the total powers of the dominant components and the scattered waves. The path-loss exponents for the S-to-R and R-to-D links are denoted by $\xi_1$ and $\xi_2$, respectively. It is assumed that perfect channel state information (CSI) is available at all receiving nodes, and that R has no power supply and operates by harvesting the RF signal coming from S. The energy used for information processing at R is negligible and hence all the harvested energy will be utilized to forward the source information.

As mentioned earlier, the TSR protocol is used for EH at R, in which the time frame $T$ is divided into two consecutive time slots: $\alpha T$ and $(1-\alpha)T$, which are used for EH and S-to-R / R-to-D information transmissions, respectively, where $0 \le \alpha \le 1$ is the EH time factor.

For the sake of brevity, we omit the mathematical modeling of the received signals at R and D and present only the corresponding signal-to-noise ratios (SNRs). Readers may refer to [1] and [3] for more details. The SNRs at R and D nodes are expressed, respectively, as

$$\gamma_r = \frac{P_s h_1^2}{P_r\, d_1^{\xi_1} h_3^2} = \frac{1}{\zeta h_3^2}, \quad \text{and} \tag{2}$$

$$\gamma_d = \frac{P_r h_2^2}{d_2^{\xi_2}\sigma_d^2} = \frac{\zeta P_s h_1^2 h_2^2}{d_1^{\xi_1} d_2^{\xi_2}\sigma_d^2}, \tag{3}$$

where $P_s$ is the source transmit power, $P_r = \eta\alpha P_s h_1^2/((1-\alpha)d_1^{\xi_1})$ is the relay transmit power, $\eta$ is the efficiency of the energy harvester, $\sigma_d^2$ is the noise variance at D, $\zeta = \frac{\eta\alpha}{1-\alpha}$, and $d_1$ and $d_2$ are the S-to-R and R-to-D distances, respectively.

The instantaneous capacity of the first, $C_r$, and second, $C_d$, links can be given by

$$C_i = (1-\alpha)\log_2(1+\gamma_i), \quad i \in \{r, d\}. \tag{4}$$

With this in mind, the ergodic outage probability, which is defined as the probability that the instantaneous capacity falls below a certain threshold ($C_{th}$), can be calculated as

$$P_{\mathrm{out}} = \Pr(\min\{C_r, C_d\} < C_{th}). \tag{5}$$

III. PERFORMANCE ANALYSIS

In this section, we derive analytical expressions of the ergodic outage probability in generalized κ-μ fading and its special cases. To begin with, we substitute (2) and (3) into (4) and then into (5), with some mathematical manipulations, to obtain

$$P_{\mathrm{out}}(\upsilon) = \Pr\!\left(\min\left\{\frac{1}{\zeta Z},\ \frac{\zeta P_s W}{d_1^{\xi_1} d_2^{\xi_2}\sigma_d^2}\right\} < \upsilon\right), \tag{6}$$

where $Z = h_3^2$, $W = XY$, $X = h_1^2$, $Y = h_2^2$ and $\upsilon = 2^{\frac{C_{th}}{1-\alpha}} - 1$.

Because the random variables (RVs) $Z$ and $W$ are independent, we can calculate the probability in (6) as

$$P_{\mathrm{out}}(\upsilon) = 1 - \bar{F}_Z\!\left(\frac{1}{\zeta\upsilon}\right)\bar{F}_W\!\left(\frac{d_1^{\xi_1} d_2^{\xi_2}\sigma_d^2\,\upsilon}{\zeta P_s}\right), \tag{7}$$

where $\bar{F}_Z(\cdot)$ is the complementary cumulative distribution function (CCDF) of $Z$, which can be obtained by integrating (1) with appropriate notation changes. In order to arrive at a tractable expression, we use the series representation of $I_{\mu_3-1}(\cdot)$ [12, eq. (8.445)]; that is

$$I_{\mu_3-1}(2\Lambda) = \sum_{q=0}^{\infty}\frac{1}{\Gamma(\mu_3+q)\,q!}\,\Lambda^{\mu_3-1+2q}, \tag{8}$$

where $\Lambda = \mu_3\sqrt{\kappa_3(1+\kappa_3)z}$ and $\Gamma(\cdot)$ is the Gamma function [12, eq. (8.310.1)].

Now, replacing (8) in (1) and then integrating, we can express the cumulative distribution function (CDF) of $Z$, $F_Z(\upsilon)$, as

$$F_Z(\upsilon) = \Upsilon_3\sum_{q=0}^{\infty}\frac{\left(\mu_3\sqrt{\kappa_3(1+\kappa_3)}\right)^{\mu_3-1+2q}}{\Gamma(\mu_3+q)\,q!}\,J_0, \tag{9}$$

where

$$J_0 = \int_0^{b/\upsilon} z^{\mu_3+q-1}\exp(-\phi_3 z)\,dz \overset{(\beta)}{=} \phi_3^{-\mu_3-q}\,\gamma\!\left(\mu_3+q,\ \frac{\phi_3 b}{\upsilon}\right), \tag{10}$$

where $b = \frac{1-\alpha}{\eta\alpha}$ and $\gamma(\cdot,\cdot)$ is the lower incomplete Gamma function [12, eq. (8.350.1)]. Note that $(\beta)$ in (10) is obtained with the help of [12, eq. (3.351.1)].

Substituting (10) into (9), along with some straightforward manipulations, yields

$$F_Z(\upsilon) = \frac{1}{\exp(\kappa_3\mu_3)}\sum_{q=0}^{\infty}\frac{(\mu_3\kappa_3)^q}{\Gamma(\mu_3+q)\,q!}\,\gamma\!\left(\mu_3+q,\ \frac{\phi_3 b}{\upsilon}\right). \tag{11}$$

Definition 1: For any two independent RVs $U$ and $V$, the CDF of their product is defined as $P(UV \le x) = \int F_V\!\left(\frac{x}{u}\right) f_U(u)\,du$.

Using this definition, we can determine the CDF of $W$ as

$$F_W(r) = \int_0^{\infty} F_X\!\left(\frac{r}{u}\right) f_Y(u)\,du, \tag{12}$$

where $f_Y(u)$ has the distribution as in (1) and $F_X\!\left(\frac{r}{u}\right)$ can be obtained from (11) with the appropriate change of notations. With this in mind, (12) can be expressed as

$$F_W(\upsilon) = \frac{\Upsilon_2}{\exp(\kappa_1\mu_1)}\sum_{n=0}^{\infty}\frac{(\kappa_1\mu_1)^n}{\Gamma(\mu_1+n)\,n!}\left[\Gamma(\mu_1+n)\,J_1 - J_2\right], \tag{13}$$

where

$$J_1 = \int_0^{\infty} z^{\frac{\mu_2-1}{2}}\exp(-\phi_2 z)\,I_{\mu_2-1}\!\left(2\Delta_2\sqrt{z}\right)dz, \tag{14}$$

$$J_2 = \int_0^{\infty} z^{\frac{\mu_2-1}{2}}\exp(-\phi_2 z)\,I_{\mu_2-1}\!\left(2\Delta_2\sqrt{z}\right)\Gamma\!\left(\mu_1+n,\ \frac{\phi_1\upsilon}{az}\right)dz, \tag{15}$$

where $\Delta_2 = \mu_2\sqrt{\kappa_2(1+\kappa_2)}$, $a = \frac{\eta\alpha P_s}{(1-\alpha)\,d_1^{\xi_1} d_2^{\xi_2}\sigma_d^2}$ and $\Gamma(\cdot,\cdot)$ indicates the upper incomplete Gamma function [12, eq. (8.350.2)].

To the best of the authors' knowledge, there is no analytical solution for the integral $J_1$. Hence, to solve this integral, we use the infinite series representation of $I_{\mu_2-1}(\cdot)$ [12, eq. (8.445)]. Thus, $J_1$ can be rewritten as

$$J_1 = \sum_{m=0}^{\infty}\frac{\Delta_2^{\mu_2+2m-1}}{\Gamma(\mu_2+m)\,m!}\int_0^{\infty} z^{\mu_2+m-1}\exp(-\phi_2 z)\,dz \overset{(\beta)}{=} \sum_{m=0}^{\infty}\frac{\left(\mu_2\sqrt{\kappa_2(\kappa_2+1)}\right)^{\mu_2+2m-1}}{\phi_2^{\mu_2+m}\,m!}, \tag{16}$$
where $(\beta)$ is obtained with the help of [12, eq. (3.351.3)] followed by some basic algebraic manipulations.

Similarly, to solve the integral $J_2$, we first replace $I_{\mu_2-1}(\cdot)$ and $\Gamma(\cdot,\cdot)$ with their series representations using [12, eq. (8.445)] and [12, eq. (8.352.2)], respectively, as follows

$$I_{\mu_2-1}\!\left(2\Delta_2\sqrt{z}\right) = \sum_{l=0}^{\infty}\frac{1}{\Gamma(\mu_2+l)\,l!}\left(\Delta_2\sqrt{z}\right)^{\mu_2-1+2l}, \tag{17}$$

$$\Gamma\!\left(\mu_1+n,\ \frac{\phi_1\upsilon}{az}\right) = \psi!\,\exp\!\left(-\frac{\phi_1\upsilon}{az}\right)\sum_{k=0}^{\psi}\frac{1}{k!}\left(\frac{\phi_1\upsilon}{az}\right)^{k}, \tag{18}$$

where $\psi = \mu_1 + n - 1$.

Using (17) and (18), we can rewrite $J_2$ as

$$J_2 = \psi!\sum_{l=0}^{\infty}\sum_{k=0}^{\psi}\left(\frac{\phi_1\upsilon}{a}\right)^{k}\frac{\Delta_2^{\mu_2+2l-1}}{\Gamma(\mu_2+l)\,l!\,k!}\,J_3, \tag{19}$$

where

$$J_3 = \int_0^{\infty} z^{\mu_2+l-k-1}\exp\!\left(-\phi_2 z - \frac{\phi_1\upsilon}{az}\right)dz \overset{(\beta)}{=} 2\left(\frac{\phi_2 a}{\phi_1\upsilon}\right)^{\frac{1}{2}(k-l-\mu_2)} K_{k-l-\mu_2}\!\left(2\sqrt{\frac{\phi_1\phi_2\upsilon}{a}}\right), \tag{20}$$

where $K_p[\cdot]$ is the modified Bessel function of the second kind with arbitrary order $p$ [11, eq. (9.6.22)]. Note that $(\beta)$ is accomplished by means of [12, eq. (3.471.12)], along with some mathematical manipulations.

Substituting (20) into (19) and then (16) and (19) into (13), we obtain the expression for $F_W(\cdot)$ given in (21). Finally, using (7), (11) and (21), along with basic algebraic manipulations, we obtain an accurate and unified expression for the ergodic outage probability of the dual-hop FD-DF relaying system over the generalized κ-μ fading channel. This is given by (22).

$$\begin{aligned} F_W(\upsilon) = {} & \frac{\mu_2(1+\kappa_2)^{\frac{\mu_2+1}{2}}}{\kappa_2^{\frac{\mu_2-1}{2}}\prod_{i\in\{1,2\}}\exp(\kappa_i\mu_i)}\Bigg[\sum_{n=0}^{\infty}\sum_{m=0}^{\infty}\frac{\left(\mu_2\sqrt{\kappa_2(1+\kappa_2)}\right)^{\mu_2+2m-1}}{\left((1+\kappa_2)\mu_2\right)^{\mu_2+m}\,m!}\,\frac{(\kappa_1\mu_1)^n}{n!} \\ & - 2\sum_{n=0}^{\infty}\sum_{l=0}^{\infty}\sum_{k=0}^{\psi}\frac{(\kappa_1\mu_1)^n\,(\mu_1+n-1)!}{n!\,k!\,l!\,\Gamma(\mu_1+n)\Gamma(\mu_2+l)}\left(\mu_2\sqrt{\kappa_2(1+\kappa_2)}\right)^{\mu_2+2l-1}\left(\frac{\phi_1\upsilon}{a}\right)^{k}\left(\frac{a\phi_2}{\upsilon\phi_1}\right)^{-\frac{1}{2}(\mu_2+l-k)} K_{k-l-\mu_2}\!\left(2\sqrt{\frac{\phi_1\phi_2\upsilon}{a}}\right)\Bigg] \end{aligned} \tag{21}$$

$$\begin{aligned} P_{\mathrm{out}}(\upsilon) = 1 - {} & \left[\frac{1}{\exp(\kappa_3\mu_3)}\sum_{q=0}^{\infty}\frac{(\mu_3\kappa_3)^q}{\Gamma(\mu_3+q)\,q!}\,\gamma\!\left(\mu_3+q,\ \frac{\phi_3 b}{\upsilon}\right)\right]\Bigg\{1 - \frac{\mu_2(1+\kappa_2)^{\frac{\mu_2+1}{2}}}{\kappa_2^{\frac{\mu_2-1}{2}}\prod_{i\in\{1,2\}}\exp(\kappa_i\mu_i)} \\ & \times\Bigg[\sum_{n=0}^{\infty}\sum_{m=0}^{\infty}\frac{\Delta_2^{\mu_2+2m-1}}{\phi_2^{\mu_2+m}}\,\frac{(\kappa_1\mu_1)^n}{m!\,n!} - 2\sum_{n=0}^{\infty}\sum_{l=0}^{\infty}\sum_{k=0}^{\psi}\frac{(\kappa_1\mu_1)^n\,\Delta_2^{\mu_2+2l-1}\,\psi!}{\Gamma(\mu_1+n)\Gamma(\mu_2+l)\,n!\,k!\,l!}\left(\frac{\phi_1\upsilon}{a}\right)^{k}\left(\frac{\phi_2 a}{\phi_1\upsilon}\right)^{-\frac{1}{2}(\mu_2+l-k)} K_{k-l-\mu_2}\!\left(2\sqrt{\frac{\phi_1\phi_2\upsilon}{a}}\right)\Bigg]\Bigg\} \end{aligned} \tag{22}$$

Now, substituting $\mu_1 = \mu_2 = \mu_3 = 1$ in (22), we get an analytical expression of the outage probability for the Rice fading scenario, given in (23).

$$\begin{aligned} P_{\mathrm{out}}^{\{\mathrm{Ric}\}}(\upsilon) = 1 - {} & \left[\frac{1}{\exp(\kappa_3)}\sum_{q=0}^{\infty}\frac{\kappa_3^{q}}{\Gamma(q+1)\,q!}\,\gamma\!\left(q+1,\ \frac{(1+\kappa_3)\,b}{\upsilon}\right)\right]\Bigg\{1 - \frac{1}{\exp(\kappa_1+\kappa_2)}\Bigg[\sum_{n=0}^{\infty}\sum_{m=0}^{\infty}\frac{\kappa_1^{n}\kappa_2^{m}}{n!\,m!} \\ & - 2\sum_{n=0}^{\infty}\sum_{l=0}^{\infty}\sum_{k=0}^{n}\frac{\kappa_1^{n}\kappa_2^{l}}{n!\,l!\,k!\,l!}\left(\frac{(\kappa_1+1)(\kappa_2+1)\,\upsilon}{a}\right)^{\frac{1}{2}(l+k+1)} K_{k-l-1}\!\left(2\sqrt{\frac{(1+\kappa_1)(1+\kappa_2)\,\upsilon}{a}}\right)\Bigg]\Bigg\} \end{aligned} \tag{23}$$

To obtain a mathematical expression for the Nakagami-m fading case, we start from (22) and substitute $\kappa_1 = \kappa_2 = \kappa_3 \to 0$, $\mu_1 = m_1$, $\mu_2 = m_2$ and $\mu_3 = m_3$. Note that because $\kappa_1 = \kappa_2 = \kappa_3 \to 0$, only the first terms of all infinite series will have non-zero values, except the last summation. Thus, the resultant closed-form expression of the ergodic outage probability in Nakagami-m fading can be given as

$$P_{\mathrm{out}}^{\{\mathrm{Nak}\}} = 1 - \frac{2}{\Gamma(m_2)\Gamma(m_3)}\,\gamma\!\left(m_3,\ \frac{m_3 b}{\upsilon}\right)\sum_{k=0}^{m_1-1}\frac{1}{k!}\left(\frac{m_1 m_2\upsilon}{a}\right)^{\frac{m_2+k}{2}} K_{m_2-k}\!\left(2\sqrt{\frac{m_1 m_2\upsilon}{a}}\right). \tag{24}$$

When $m_1 = m_2 = m_3 = 1$, the expression in (24) reduces to the Rayleigh fading scenario as

$$P_{\mathrm{out}}^{\{\mathrm{Ray}\}} = 1 - 2\left(1 - \exp\!\left(-\frac{b}{\upsilon}\right)\right)\sqrt{\frac{\upsilon}{a}}\;K_1\!\left(2\sqrt{\frac{\upsilon}{a}}\right). \tag{25}$$

We know that $K_1(z)$ can be approximated by $1/z$ when $z \ll 1$. Based on this, at high SNR, (25) can be simplified to $P_{\mathrm{out}}^{\{\mathrm{Ray}\}} \approx \exp\!\left(-\frac{1-\alpha}{\eta\alpha\upsilon}\right)$, which indicates that the system performance improves as we increase $\eta$ and/or decrease $\upsilon$.

IV. NUMERICAL RESULTS AND DISCUSSIONS

All our evaluations in this section, unless we specify otherwise, are based on: $\xi_1 = \xi_2 = \xi_3 = 2.7$, $d_1 = d_2 = 4$ m, $\sigma_d = 0.01$ W, $\eta = 1$ and $C_{th} = 0.2$ bits/s/Hz [13]. To begin with, in Fig. 1 we show a 3D plot for the analytical and simulated ergodic outage probability as a function of the fading parameters κ and μ. Note that these analytical results are obtained using (22) while considering the first 20 terms of all series. It is clear that the performance improves as κ and/or μ is increased. This is because increasing κ indicates an increase in the ratio between the total powers of the dominant components and the scattered waves, and increasing μ implies increasing the number of multipath clusters.

Now, to illustrate the influence of the EH time factor we present in Fig. 2 the ergodic outage probability with respect to α for the two special cases of the κ-μ fading model: Rice fading in Fig. 2(a) with different values of $\kappa_i$ and Nakagami-m fading in Fig. 2(b) for various values of $\mu_i$ where $i \in \{1, 2, 3\}$. Note that the numerical results in Figs. 2(a) and 2(b) are obtained from (23) and (24), respectively. It is clear that, for all fading scenarios, when α is either too high or too small, the performance degrades significantly; hence, this parameter must be selected carefully to minimize the outage probability. It is worthwhile pointing out that the results represented by
the symbol (+) in Fig. 2(b) are for Rayleigh fading, obtained from (25).

Fig. 1. Ergodic outage probability versus the fading parameters $\kappa_i$ and $\mu_i$, $i \in \{1, 2, 3\}$, when α = 0.06 and $P_s$ = 0.5 W.

Fig. 2. Ergodic outage probability versus α for the FD-DF relay system with different κ and μ values.

To illustrate the impact of the loop-back interference channel on the system performance, we plot in Fig. 3 the ergodic outage probability as a function of $P_s$ in Nakagami-m fading for different values of $m_3$ when $C_{th}$ = 0.3 bits/s/Hz, α = 0.6, $d_1 = 2d_2$ and $m_1 = m_2 = 5$. It can be seen that as we improve the loop-back interference channel, i.e., increase $m_3$, the performance deteriorates. It is also noticeable that the performance improves as the transmit power is increased and worsens as the end-to-end distance increases. Furthermore, Fig. 4 depicts some numerical results of the optimal EH time factor versus the fading parameters $m_1$ and $m_2$ with different values of η when $P_s$ = 1 W and $m_3$ = 3. It is apparent that increasing $m_i$, $i \in \{1, 2\}$, and/or η will reduce the optimal EH time factor, which is intuitive.

Fig. 3. Ergodic outage probability with respect to $P_s$ for different loop-back interference scenarios. Note that $d = d_1 + d_2$.

Fig. 4. Optimal EH time factor versus the fading parameter $m_i$ for different values of η in Nakagami-m fading.

V. CONCLUSION

This letter analyzed the performance of an FD-DF EH-enabled relaying network over κ-μ fading channels. Accurate mathematical expressions were derived for the ergodic outage probability. Three special cases of the κ-μ fading were investigated, namely, Rice, Nakagami-m and Rayleigh. Using the derived expressions, the impact of several system parameters was examined, such as the fading parameters, loop-back interference channel, end-to-end distance and source transmit power.

REFERENCES

[1] C. Zhong et al., “Wireless information and power transfer with full duplex relaying,” IEEE Trans. Commun., vol. 62, no. 10, pp. 3447–3461, Oct. 2014.
[2] M. Mohammadi, H. A. Suraweera, G. Zheng, C. Zhong, and I. Krikidis, “Full-duplex MIMO relaying powered by wireless energy transfer,” in Proc. IEEE Int. Workshop Signal Process. Adv. Wireless Commun. (SPAWC), Jun. 2015, pp. 296–300.
[3] K. M. Rabie et al., “Half-duplex and full-duplex AF and DF relaying with energy-harvesting in log-normal fading,” IEEE Trans. Green Commun. Netw., vol. 1, no. 4, pp. 468–480, Dec. 2017.
[4] G. Nauryzbayev et al., “Outage probability of the EH-based full-duplex AF and DF relaying systems in α-μ environment,” in Proc. IEEE Conf. Veh. Technol. (VTC-Fall), to be published. [Online]. Available: arXiv:1808.02570
[5] A. A. Okandeji et al., “Secure full-duplex two-way relaying for SWIPT,” IEEE Wireless Commun. Lett., vol. 7, no. 3, pp. 336–339, Jun. 2017.
[6] J. Guo et al., “Exploiting adversarial jamming signals for energy harvesting in interference networks,” IEEE Trans. Wireless Commun., vol. 16, no. 2, pp. 1267–1280, Feb. 2017.
[7] N. Zhao et al., “Artificial noise assisted secure interference networks with wireless power transfer,” IEEE Trans. Veh. Technol., vol. 67, no. 2, pp. 1087–1098, Feb. 2018.
[8] J. F. Paris, “Outage probability in η-μ/η-μ and κ-μ/η-μ interference-limited scenarios,” IEEE Trans. Commun., vol. 61, no. 1, pp. 335–343, Jan. 2013.
[9] S. Kumar and S. Kalyani, “Coverage probability and rate for κ-μ/η-μ fading channels in interference-limited scenarios,” IEEE Trans. Wireless Commun., vol. 14, no. 11, pp. 6082–6096, Nov. 2015.
[10] M. D. Yacoub, “The α-η-κ-μ fading model,” IEEE Trans. Antennas Propag., vol. 64, no. 8, pp. 3597–3610, Aug. 2016.
[11] M. Abramowitz and I. A. Stegun, Handbook of Mathematical Functions With Formulas, Graphs and Mathematical Tables. New York, NY, USA: Wiley, 1972.
[12] I. S. Gradshteyn and I. M. Ryzhik, Table of Integrals, Series, and Products. Amsterdam, The Netherlands: Academic, 2007.
[13] A. A. Nasir et al., “Relaying protocols for wireless energy harvesting and information processing,” IEEE Wireless Commun., vol. 12, no. 7, pp. 3622–3636, Jul. 2013.
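Illustrative numerical check of the Rayleigh special case. The sketch below is not the authors' simulation code; it only shows one way the closed form (25) of the FD-DF relaying letter could be compared against a Monte Carlo estimate of the outage event in (5) and (6). It assumes unit-mean channel powers (Ω_i = 1, so X, Y and Z are unit-mean exponential in Rayleigh fading), evaluates K_1 through a simple trapezoidal approximation of the integral K_ν(x) = ∫_0^∞ exp(-x cosh t) cosh(νt) dt, and uses a noise variance of 1e-4 purely as an illustrative value. All function and parameter names are hypothetical.

```python
import math
import random


def bessel_k1(x, t_max=20.0, steps=4000):
    """Approximate K_1(x) via K_nu(x) = int_0^inf exp(-x cosh t) cosh(nu t) dt
    (trapezoidal rule on [0, t_max]); intended for moderate x > 0 only."""
    h = t_max / steps
    total = 0.5 * math.exp(-x)  # t = 0 endpoint, half weight
    for i in range(1, steps + 1):
        c = math.cosh(i * h)
        weight = 0.5 if i == steps else 1.0
        total += weight * math.exp(-x * c) * c
    return total * h


def link_constants(alpha, p_s, eta=1.0, c_th=0.2, d1=4.0, d2=4.0,
                   xi1=2.7, xi2=2.7, noise_var=1e-4):
    """Return (upsilon, a, b) as defined around (6), (15) and (10).
    noise_var is an assumed illustrative value for sigma_d^2."""
    upsilon = 2.0 ** (c_th / (1.0 - alpha)) - 1.0
    zeta = eta * alpha / (1.0 - alpha)
    a = zeta * p_s / (d1 ** xi1 * d2 ** xi2 * noise_var)
    b = 1.0 / zeta
    return upsilon, a, b


def outage_rayleigh(alpha, p_s, **kw):
    """Closed-form Rayleigh outage probability, eq. (25)."""
    upsilon, a, b = link_constants(alpha, p_s, **kw)
    s = math.sqrt(upsilon / a)
    return 1.0 - 2.0 * (1.0 - math.exp(-b / upsilon)) * s * bessel_k1(2.0 * s)


def outage_monte_carlo(alpha, p_s, trials=200_000, seed=1, **kw):
    """Empirical P(min{gamma_r, gamma_d} < upsilon) for unit-mean
    exponential X, Y, Z (Rayleigh fading with Omega_i = 1)."""
    upsilon, a, b = link_constants(alpha, p_s, **kw)
    rng = random.Random(seed)
    outages = 0
    for _ in range(trials):
        x, y, z = (rng.expovariate(1.0) for _ in range(3))
        # gamma_r = 1/(zeta*Z) < upsilon  <=>  Z > b/upsilon   (eq. (2))
        # gamma_d = a*X*Y < upsilon                             (eq. (3))
        if z > b / upsilon or a * x * y < upsilon:
            outages += 1
    return outages / trials


if __name__ == "__main__":
    alpha, p_s = 0.3, 1.0
    print("closed form (25):", round(outage_rayleigh(alpha, p_s), 4))
    print("Monte Carlo     :", round(outage_monte_carlo(alpha, p_s), 4))
    grid = [i / 100 for i in range(1, 100)]
    best = min(grid, key=lambda al: outage_rayleigh(al, p_s))
    print("alpha minimizing (25) on a 0.01 grid:", best)
```

The final lines sweep the EH time factor α over a coarse grid, which mirrors the observation in Section IV that α must be neither too small nor too large; the grid resolution and parameter values are arbitrary choices made for this sketch.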
Connectivity and Blockage Effects in Millimeter-Wave Air-To-Everything Networks

Kaifeng Han, Kaibin Huang, and Robert W. Heath, Jr.

Abstract: Millimeter-wave (mmWave) offers high data rate and bandwidth for air-to-everything (A2X) communications including air-to-air, air-to-ground, and air-to-tower. MmWave communication in the A2X network is sensitive to building blockage effects. In this letter, we propose an analytical framework to define and characterise the connectivity for an aerial access point (AAP) by jointly using stochastic geometry and random shape theory. The buildings are modelled as a Boolean line-segment process with fixed height. The blocking area for an arbitrary building is derived and minimized by optimizing the altitude of AAP. A lower bound on the connectivity probability is derived as a function of the altitude of AAP and different parameters of users and buildings including their densities, sizes, and heights. This letter yields guidelines on practical mmWave A2X networks deployment.

Index Terms: A2X communications, mmWave networks, blockage effects, network connectivity, stochastic geometry, random shape theory.

Manuscript received August 1, 2018; revised September 5, 2018; accepted September 20, 2018. Date of publication October 1, 2018; date of current version April 9, 2019. The work of K. Han and K. Huang was supported by Hong Kong Research Grants Council under Grant 17209917 and Grant 17259416. The work of R. W. Heath was supported by the National Science Foundation under Grant ECCS-1711702 and Grant CNS-1731658. The associate editor coordinating the review of this paper and approving it for publication was S. Zhou. (Corresponding author: Robert W. Heath, Jr.)
K. Han and K. Huang are with the Department of EEE, University of Hong Kong, Hong Kong (e-mail: kfhan@eee.hku.hk; haungkb@eee.hku.hk).
R. W. Heath is with the University of Texas at Austin, Austin, TX 78712 USA (e-mail: rheath@utexas.edu).
Digital Object Identifier 10.1109/LWC.2018.2873361

I. INTRODUCTION

AIR-TO-EVERYTHING (A2X) communications can leverage aerial access points (AAPs) mounted on unmanned aerial vehicles (UAVs) to provide seamless wireless connectivity to various types of users [1] (see Fig. 1). Millimeter-wave (mmWave) communication is one way to provide high data rate for aerial platforms [2]. Unfortunately, mmWave communication is sensitive to building blockages [3], which are widely expected in urban deployments of AAPs. In this letter, we define and characterize the connectivity for an AAP, using tools from stochastic geometry and random shape theory.

Leveraging UAVs as AAPs has been studied in [4]–[10]. A single-UAV network was proposed in [4], where the network coverage was maximized by optimizing the UAVs' altitudes. The coverage performance can also be maximized via optimizing the placement of UAVs [5]. In [6], the coverage probability of a finite 3D multi-UAV network was calculated via a stochastic geometric approach. Both network coverage and the sum-rate of a hybrid A2G-D2D network were investigated in [7]. An analytical framework in which UAVs use ground BSs for wireless backhaul was proposed in [8], together with an analysis of the success probability of establishing a backhaul link as well as the backhaul data rate. In [9], multiple-input multiple-output (MIMO) non-orthogonal multiple access (NOMA) techniques were used in a UAV network, and the outage probability and ergodic rate of the network were studied based on a stochastic geometry model. In [4], [7], and [8], the blockage effects were characterized by a statistical model where the link-level line-of-sight (LOS) probability is approximated as a simple sigmoid function. The parameters of the sigmoid function are determined by the buildings' density, sizes, and height distribution. The model is unsuitable for mmWave A2X networks since it fails to capture the fact that multiple nearby links could be simultaneously blocked by the same building and does not consider the diversity in user types (e.g., their different heights). In [10], a mathematical framework was proposed for studying mmWave A2A networks, in which multiple aerial-users are equipped with antenna arrays. Blocking effects were not included since the A2A scenario was assumed to be well above the blockages.

In this letter, we develop an analytical framework for characterizing the blockage effects and connectivity of a mmWave A2X network covered by a single AAP. The 3D buildings are modelled as a Boolean line-segment process with fixed height. Given an arbitrary building, the corresponding blocking area is derived as a function of altitude of AAP, users and buildings' parameters including their density, sizes, and heights. Based on the model, the AAP coverage area is maximized (or equivalently the blocking area is minimized) by optimizing the altitude of AAP. Furthermore, both upper and lower bounds on the blocking area and a suboptimal result of AAP's altitude are derived in closed-form. The spatial average connectivity probability of a typical A2X network is obtained, which is maximized by optimizing the AAP's altitude.

II. SYSTEM MODEL AND PERFORMANCE METRIC

Consider a mmWave A2X network as illustrated in Fig. 1. In this letter, we focus on the downlink communication from a typical low-altitude AAP to users with different heights.

A. Channel Model Between AAP and Users

The mmWave channel between the AAP and the different types of users is assumed to be LOS or blocked by a building. For simplicity, we assume the non-LOS (NLOS) signals are completely blocked due to severe propagation loss from penetration and limited reflection, diffraction, or scattering [3]. For the LOS case, the channel is assumed to have path-loss without small-scale fading [11]. We assume perfect 3D beam alignment between AAP and users for maximal directivity gain [10]. For the path-loss model, we assume the reference distance is 1 m. The AAP transmission with power $P$ over propagation distance $r$ is attenuated according to the path-loss model $r^{-\alpha}$, where $\alpha$ is the path-loss exponent [11]. Let $\sigma^2$ be the thermal noise
HAN et al.: CONNECTIVITY AND BLOCKAGE EFFECTS IN mmWAVE A2X NETWORKS 389
power normalized by the transmit power $P$. The corresponding signal-to-noise ratio (SNR) received at the user is defined as $P_r = \frac{G r^{-\alpha}}{\sigma^2}$, where $G$ denotes the beamforming gain. We assume that the user is connected to the AAP if the receive SNR exceeds a given threshold $\gamma$. We say that the AAP has a maximal coverage sphere with the radius $R_{\max} = \left(\frac{G}{\sigma^2\gamma}\right)^{\frac{1}{\alpha}}$. The 2D projection of the coverage sphere of the AAP into the plane with user's height $H_u$ forms a disk with the radius $\Lambda_H = \sqrt{R_{\max}^2 - (H_a - H_u)^2}$, called the efficient coverage disk and denoted by $O(\Lambda_H)$. The user is connected to the AAP if its 2D location is inside the efficient coverage disk and the link between the user and the AAP is LOS. For higher user heights, i.e., larger $H_u$, the efficient coverage disk is larger. Let the center of the efficient coverage disk, i.e., the 2D projection of AAP's location, be the origin denoted by $o \in \mathbb{R}^2$.

Fig. 1. An illustration of the A2X communications network. A typical (central) AAP provides wireless connectivity to different types of users, including ground-users (e.g., mobile) via air-to-ground (A2G), tower-users (e.g., base station, BS) via air-to-tower (A2T), and airborne-users (e.g., UAV) via air-to-air (A2A) communications. The altitude of the AAP is denoted by $H_a$, and the height of users is denoted by $H_u$. The 3D buildings are modelled as a Boolean line-segment process with fixed height $H_b$.

Fig. 2. 2D projection of an arbitrary building modelled by a line-segment $pq$. Some geometrical relations are described as follows. $d_S = \|o - q\|$, $d_L = \|o - p\|$, $\Lambda_H = \|o - u\| = \|o - s\|$, $\omega = \angle qxh$, $\theta = \angle qop$, and $\angle hxo = \pi/2$. The grey area covered by $qpsvt$ is the blocking area $S_b(x)$ and the area covered by $qop$ (blue area) and $tvu$ (green area) is the coverage area. Specifically, the green area covered by $tvu$ denotes the coverage gain $S_{gain}$ due to the fact that a higher altitude of AAP can cover more LOS area.

B. 3D Building Model

A 3D building model is adopted to characterize blockage effects where buildings are modelled as a Boolean line-segment process with the same fixed height $H_b$ for tractability [12]–[14]. Adding randomness to the buildings' height is left to future work. The blocking effects of randomly distributed buildings are approximated as line segments with random length and orientation on the 2D plane. Although the buildings have polygon shapes in practice, we are interested in their 1D intersections with the communication links. Therefore, assuming that the buildings' shapes are lines is a reasonable approximation. The effectiveness of the Boolean line-segment model has been validated with real building data in [12] to ensure the derived insight is not changed for practical building deployment. The center locations of the line-segments are modelled as a homogeneous PPP $\Phi = \{x\}$ on the $\mathbb{R}^2$ plane with density $\lambda_b$. The lengths $\{\ell\}$ and orientations $\{\omega\}$ of blockage line-segments are independent identically distributed random variables. Let $f_L(\ell)$ be the distribution of $\ell$ and let $f_\Theta(\omega)$ be that of $\omega$. The Boolean line-segment model can be extended to other models as discussed in Remark 2.

C. Connectivity and Performance Metric

We assume that all the users can be simultaneously connected to the A2X network if they are in the AAP's coverage sphere and the link between user and AAP is LOS without being blocked by any building. Consider an arbitrary building whose 2D line-segment center is located at $x \in \mathbb{R}^2$. The building results in a blocking area $S_b(x)$ where the links between users and AAP are fully blocked by the building (see Fig. 2). To measure the network performance, we define the spatial average connectivity probability, denoted by $p_c$, as the spatial average fraction of the A2X network that is connectable at any time [15]. The $p_c$ is mathematically expressed as

$$p_c = 1 - \mathbb{E}\!\left(\frac{\sum_{x\in\{\Phi\cap O(\Lambda_H)\}} S_b(x)}{|O(\Lambda_H)|}\right), \tag{1}$$

where $|O(\Lambda_H)| = \pi\Lambda_H^2$ denotes the size of $O(\Lambda_H)$.

III. ANALYSIS FOR NETWORK CONNECTIVITY

A. Size of Blocking Area

We begin by calculating the size of blocking area $S_b(x)$ for an arbitrary building whose 2D line-segment center is located at $x$. We first fix the length $\ell$ and orientation $\omega$ of the typical building. Let $d_x$ be the distance between $x$ and $o$. Let $d_S$ be the minimal (shortest) distance between $o$ and the line-segment (2D projection of building) and $d_L$ be the maximal (longest) distance (see the lines $oq$ and $op$ in Fig. 2). If the AAP's altitude does not exceed the building's height, i.e., $H_a \le H_b$, the size of blocking area $S_b(x)$ (see the gray area covered by $qpsvt$ in Fig. 2) is calculated by $\frac{1}{2}\left[\theta\Lambda_H^2 - d_S d_L\sin\theta\right]$, where $\theta = \arccos\!\left(\frac{d_x^2 - \frac{1}{4}\ell^2}{d_S d_L}\right)$ and

$$\begin{cases} d_S = \sqrt{\frac{1}{4}\ell^2 + d_x^2 - \ell d_x\sin\omega}, \\ d_L = \min\left\{\Lambda_H,\ \sqrt{\frac{1}{4}\ell^2 + d_x^2 + \ell d_x\sin\omega}\right\}. \end{cases} \tag{2}$$

If $H_a > H_b$, the blocking area $S_b(x)$ could be further reduced since the AAP covers more area via LOS links due to the benefit of higher altitude. Compared with the coverage area of AAP when $H_a \le H_b$, we define this additional coverage area due to $H_a > H_b$ as the coverage gain, denoted by $S_{gain}(x)$ (see the area covered by $tvu$ (green area) in Fig. 2). Based on geometric calculations, $S_{gain}(x)$ is calculated as

$$S_{gain}(x) = \frac{1}{2}\left[\int_0^{\theta}\frac{(d_S\cos\beta)^2}{\cos^2(\varphi+\beta)}\,d\varphi + \theta\Lambda_H^2\right], \tag{3}$$
dS cos β
where θ = arccos( H −H ) and β = disk with radius diameter (i.e., a cylinder in 3D), the block-
(1− Hb −Hu ΛH )
d
a u aging area is recalculated as Sb = 2θ Λ2H − (dx + 18 2 (θ +
cos θ− dS π)) − 1(Ha > Hb )Sgain (x ), where Sgain (x ) is lower bounded
arctan( sin θ L ). Then, Sb (x ) is calculated as follows. (−)
Lemma 1 (Size of Sb (x)): The blocking area is by Sgain (x ) = 2θ [Λ2H − ( H1b −Hu (dx + 12 ))2 ]+ . The case
1− H
a −Hu
1 2 that building has a rectangular shape in 2D can be analyzed
Sb (x ) = θΛH − dS dL sin θ − 1(Ha > Hb )Sgain (x ), similarly (e.g., [14]).
2
(4)
B. Network Connectivity Probability
where 1(·) denotes the indicator function and Sgain (x ) is given
in (3). In this section, we calculate the connectivity probability
The calculations follow from geometry, the detailed proof defined in (1). Notice that the spatial correlation between
is omitted due to limited space. To simplify the result in different buildings exists such as the blocking areas of
Lemma 1 and obtain more insights therein, both upper and multiple buildings may overlap with each other. For analyt-
lower bounds of Sgain (x ) are derived. By assuming the dis- ical tractability, we ignore the spatial correlation of buildings
tance between any point on building’s line-segment and o has due to overlap in the blocking area of multiple buildings.
the same value dL or dS given in (2), the lower and upper This assumption is accurate when density of buildings is not
(−) (+) very high, which has been validated in [3]. We derive a lower
bounds of Sgain (x ), denoted by Sgain (x ) and Sgain (x ), are
bound of pc by jointly using Campbell’s theorem, random
derived as
shape theory, with the results given in Lemmas 1 and 2.
⎡ 2 ⎤+ Theorem 1 (Connectivity Probability of AAP): The connec-
(−) θ dL (−)
Sgain (x ) = ⎣Λ2H − ⎦ , (5) tivity probability pc is lower bounded by pc as
2 1− H b −Hu
Ha −Hu 2
ΛH
(+) (−) (−) πλ θ
and Sgain (x ) is obtained by replacing dL in Sgain (x ) with pc = 1 − 2b F (r , , ω)r drfΘ (ω)dωfL ()d, (7)
ΛH
dS . The result for the bounds of Sgain (x ) is summarized as L Θ 0
follows. where
Lemma 2 (Bounds of Sgain (x)): The coverage gain +
Sgain (x ) can be upper or lower bounded as follows. θ 2 2 2
F(r , , ω) = Λ − 1(Ha > Hb ) ΛH − (dL + ΩH )
2 H
(−) (+)
Sgain (x ) ≤ Sgain (x ) ≤ Sgain (x ), (6) 1
− dL2 sin θ, (8)
2
where ΛH is specified in Lemma 1 and [A]+ = max [0, A].
The bounds for Sgain can be treated as the bounds for Sb via ΩH = (1 − H b −Hu −1 2
Ha −Hu ) , and ΛH is specified in Lemma 1.
(−) (+) Proof: See the Appendix.
substituting (6) into (4): Sb (x ) ≤ Sb (x ) ≤ Sb (x ), where (−)
(−) (+) Remark 3: The lower bound pc becomes tighter when
Sb (x ) = 12 (θΛ2H − dS dL sin θ) − 1(Ha > Hb )Sgain (x ) and density of buildings, i.e., λb , becomes smaller. This is because
(+) (+) (−)
Sb (x ) is obtained via replacing Sgain (x ) in Sb (x ) with sparsely deployed buildings result in less spatial correlation.
(−) Remark 4: Based on the discussion in Remark 1 and
Sgain (x ). expression of F(r , , ω), the connectivity probability pc can
Remark 1 (Optimal Altitude of AAP): A larger AAP’s also be maximized by optimizing the AAP’s altitude Ha.
altitude Ha can effectively increase the coverage (LOS)
area, while shrinking the radius of effective coverage disk.
So there exists an optimal Ha∗ to maximize the size of IV. S IMULATION R ESULTS
coverage gain Sgain , which can be calculated by solving In this section, we validate the analytical results via Monte
Ha∗ = arg maxHa Sgain (x ). To obtain a simple result with Carlo simulation. The radius of maximal coverage sphere
closed-form, we characterize this behavior by optimizing is Rmax = 100 m. The height of building is Hb = 30 m
(−) and that of user is Hu = 2 m. The density of buildings is
instead Sgain (x ) to obtain a suboptimal solution for Ha . When
dL 2 (−) λb = 2×10−4 m2 . The length and orientation ω of building’s
( H −H ) < Λ2H , we have H̃a∗ = arg maxHa Sgain (x ) = line-segments follow independently and uniformly distribu-
1− Hb −Hu
a u
1 tions. Specifically, is uniformly distributed in (0, 15 m] and
(dL2 (Hb −Hu )) 3 +Hb . Substituting H̃a∗ into (3) gives the sub- ω is uniformly distributed in (0, π].
optimal solution of Sb (x ). It will be shown in Fig. 3 that the Fig. 3 shows the coverage gain Sgain calculated via (3) and
derived suboptimal AAP altitude H̃a∗ is close to the optimal (+) (−)
one Ha∗ via numerical calculation. its bounds Sgain , Sgain calculated via Lemma 2 versus the
Remark 2: Extending the current building model to any altitude of AAP Ha . It is observed that Sgain is well bounded
(+) (−)
model that each building has a random size in 2D projec- by Sgain and Sgain . The lower bound becomes tighter when
tion, such as rectangle [3] or disk, follows a similar analytical Ha is small and the upper bound becomes tighter when Ha is
structure. The main difference is that the area of buildings large. This agrees with the intuition because larger or smaller
should be included into blocking area Sb . Also, Sgain needs altitude of AAP results in larger or smaller coverage gain,
to be recalculated based on different building’s shape. For respectively, which makes the bound tighter. Moreover, the
instance, if the 2D projection of a building is modelled as a AAP’s altitude that maximizes Sgain given by Ha∗ and the
HAN et al.: CONNECTIVITY AND BLOCKAGE EFFECTS IN mmWAVE A2X NETWORKS 391
suboptimal $\tilde{H}_a^*$ are close, which confirms the discussion given in Remark 1.

In Fig. 4, we validate the lower bound of the connectivity probability $p_c$, i.e., $p_c^{(-)}$, given in Theorem 1 by comparing it with the exact value obtained via Monte Carlo simulation. It is observed that both $p_c$ and $p_c^{(-)}$ decrease with the building density $\lambda_b$. More importantly, $p_c^{(-)}$ becomes tighter when buildings are sparsely deployed, which aligns with the discussion in Remark 3.

Fig. 3. The effect of the AAP's altitude on the coverage gain $S_{\mathrm{gain}}$. The coverage gain is shown to be a concave function of the AAP's altitude. The parameters of the building are set as $\{d_x, \ell, \omega\} = \{25\ \mathrm{m}, 6\ \mathrm{m}, \pi/4\}$. The upper and lower bounds are plotted based on (6). It is observed that $S_{\mathrm{gain}}$ is well bounded, that the lower bound becomes tighter when the AAP's altitude is small, and that the upper bound becomes tighter when the AAP's altitude is large. Moreover, both the optimal and suboptimal altitudes of the AAP, $H_a^*$ and $\tilde{H}_a^*$ given in Remark 1, are highlighted.

Fig. 4. The effect of building density on the connectivity probability $p_c$. The exact value of $p_c$ is plotted via Monte Carlo simulation. The lower bound $p_c^{(-)}$ is plotted based on (7). It is observed that the connectivity probability decreases with building density and the lower bound becomes tighter when the building density is small.

V. CONCLUSION AND FUTURE WORK

In this letter, we propose an analytical framework to define and characterize the connectivity in a mmWave A2X network. Based on the blockage model in which buildings are modelled by the Boolean line-segment process with fixed height, we calculate the blocking area due to an arbitrary building and the connectivity probability of an AAP. Moreover, the AAP's altitude can be optimized to maximize the coverage area as well as the network connectivity. Future work will focus on studying the effects of spatial correlation of buildings on network connectivity and modelling an A2X network including multi-AAP connections.

APPENDIX
PROOF OF THEOREM 1

By omitting the spatial correlations between $\{S_b(x)\}$, the connectivity probability $p_c$ defined in (1) is lower bounded as follows.

$$p_c \ge 1 - \mathrm{E}\left[\frac{\sum_{x \in \{\Phi \cap O(\Lambda_H)\}} S_b(x)}{|O(\Lambda_H)|}\right] \overset{(a)}{=} 1 - \frac{\pi\lambda_b\theta}{\Lambda_H^2} \int_{L}\int_{\Theta}\int_0^{\Lambda_H} S_b(r)\, r\, dr\, f_\Theta(\omega)\, d\omega\, f_L(\ell)\, d\ell, \tag{9}$$

where (a) follows from Campbell's theorem and random shape theory [12]. Based on (4), the blocking area $S_b(r)$ can be upper bounded by $\frac{1}{2}[\theta\Lambda_H^2 - d_S^2 \sin\theta] - \mathbb{1}(H_a > H_b)\, S_{\mathrm{gain}}^{(-)}(r)$. Substituting the result above into (9) gives $p_c^{(-)}$.

REFERENCES

[1] Y. Zeng, R. Zhang, and T. J. Lim, “Wireless communications with unmanned aerial vehicles: Opportunities and challenges,” IEEE Commun. Mag., vol. 54, no. 5, pp. 36–42, May 2016.
[2] Z. Xiao, P. Xia, and X.-G. Xia, “Enabling UAV cellular with millimeter-wave communication: Potentials and approaches,” IEEE Commun. Mag., vol. 54, no. 5, pp. 66–73, May 2016.
[3] J. G. Andrews et al., “Modeling and analyzing millimeter wave cellular systems,” IEEE Trans. Commun., vol. 65, no. 1, pp. 403–430, Jan. 2017.
[4] A. Al-Hourani, S. Kandeepan, and S. Lardner, “Optimal LAP altitude for maximum coverage,” IEEE Wireless Commun. Lett., vol. 3, no. 6, pp. 569–572, Dec. 2014.
[5] J. Lyu, Y. Zeng, R. Zhang, and T. J. Lim, “Placement optimization of UAV-mounted mobile base stations,” IEEE Commun. Lett., vol. 21, no. 3, pp. 604–607, Mar. 2017.
[6] V. V. Chetlur and H. S. Dhillon, “Downlink coverage analysis for a finite 3-D wireless network of unmanned aerial vehicles,” IEEE Trans. Commun., vol. 65, no. 10, pp. 4543–4558, Oct. 2017.
[7] M. Mozaffari, W. Saad, M. Bennis, and M. Debbah, “Unmanned aerial vehicle with underlaid device-to-device communications: Performance and tradeoffs,” IEEE Trans. Wireless Commun., vol. 15, no. 6, pp. 3949–3963, Jun. 2016.
[8] B. Galkin, J. Kibilda, and L. DaSilva. (2017). A Stochastic Geometry Model of Backhaul and User Coverage in Urban UAV Networks. [Online]. Available: https://arxiv.org/abs/1710.03701
[9] T. Hou, Y. Liu, Z. Song, X. Sun, and Y. Chen. (2018). Multiple Antenna Aided NOMA in UAV Networks: A Stochastic Geometry Approach. [Online]. Available: https://arxiv.org/abs/1805.04985
[10] T. Cuvelier and R. W. Heath, Jr. (2018). MmWave MU-MIMO for Aerial Networks. [Online]. Available: https://arxiv.org/abs/1804.03295
[11] K. Han, Y. Cui, Y. Wu, and K. Huang, “The connectivity of millimeter wave networks in urban environments modeled using random lattices,” IEEE Trans. Wireless Commun., vol. 17, no. 5, pp. 3357–3372, May 2018.
[12] A. K. Gupta, J. G. Andrews, and R. W. Heath, Jr., “Macrodiversity in cellular networks with random blockages,” IEEE Trans. Wireless Commun., vol. 17, no. 2, pp. 996–1010, Feb. 2018.
[13] X. Li, T. Bai, and R. W. Heath, Jr., “Impact of 3D base station antenna in random heterogeneous cellular networks,” in Proc. IEEE WCNC, Apr. 2014, pp. 2254–2259.
[14] T. Bai, R. Vaze, and R. W. Heath, “Using random shape theory to model blockage in random cellular networks,” in Proc. IEEE SPCOM, 2012, pp. 1–5.
[15] J. G. Andrews, F. Baccelli, and R. K. Ganti, “A tractable approach to coverage and rate in cellular networks,” IEEE Trans. Commun., vol. 59, no. 11, pp. 3122–3134, Nov. 2011.
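
As a numerical companion to Remark 1 of the letter above, the following Python sketch compares the closed-form suboptimal altitude $\tilde{H}_a^* = (d_L^2 (H_b - H_u))^{1/3} + H_b$ with a brute-force maximization of the lower bound in (5). It is only an illustration under stated assumptions: the relation $\Lambda_H^2 = R_{\max}^2 - (H_a - H_u)^2$ is assumed here because Lemma 1 is not reproduced in this excerpt, and the values `D_L` and `THETA` are hypothetical rather than taken from the letter's simulations. The function names are likewise illustrative.

```python
def s_gain_lower(h_a, d_l, theta, h_b=30.0, h_u=2.0, r_max=100.0):
    """Lower bound S_gain^(-) of (5) for an AAP at altitude h_a (metres)."""
    if h_a <= h_b:
        return 0.0  # no coverage gain unless the AAP is above the building
    # Assumption (not stated in this excerpt): Lambda_H^2 = R_max^2 - (H_a - H_u)^2.
    lam_sq = r_max ** 2 - (h_a - h_u) ** 2
    # Ground distance reached by the shadow of the far building end, d_L * Omega_H.
    shadow_reach = d_l / (1.0 - (h_b - h_u) / (h_a - h_u))
    return max(0.0, 0.5 * theta * (lam_sq - shadow_reach ** 2))


def suboptimal_altitude(d_l, h_b=30.0, h_u=2.0):
    """Closed-form maximizer of S_gain^(-) quoted in Remark 1."""
    return (d_l ** 2 * (h_b - h_u)) ** (1.0 / 3.0) + h_b


def grid_argmax(d_l, theta, h_b=30.0, h_u=2.0, r_max=100.0, step=0.01):
    """Brute-force search over altitudes between H_b and H_u + R_max."""
    best_h, best_val = h_b, -1.0
    n_steps = int((h_u + r_max - h_b) / step)
    for i in range(1, n_steps):
        h = h_b + i * step
        val = s_gain_lower(h, d_l, theta, h_b, h_u, r_max)
        if val > best_val:
            best_h, best_val = h, val
    return best_h


D_L = 27.2    # hypothetical distance to the far building end, in metres
THETA = 0.16  # hypothetical angle subtended by the building, in radians

closed = suboptimal_altitude(D_L)
numeric = grid_argmax(D_L, THETA)
print(f"closed form: {closed:.2f} m, grid search: {numeric:.2f} m")
```

Under the assumed form of $\Lambda_H$, the angle $\theta$ only scales the objective, so the maximizing altitude depends on $d_L$, $H_b$ and $H_u$ alone, which is why the closed form in Remark 1 contains no $\theta$.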
Receiver Design for OOK Modulation Over Turbulence Channels Using Source Transformation

Mohammad Taghi Dabiri and Seyed Mohammad Sajad Sadough

Abstract: For coherent detection of ON-OFF keying (OOK) symbols in free-space optical (FSO) communications, the receiver requires the instantaneous channel fading coefficients. To accurately detect OOK symbols over FSO channels without requiring transmission of pilot symbols, in the first part of this letter, we propose an expectation-maximization (EM)-based sequence detection (SD) method with low implementation complexity which is shown to be particularly suitable for fast FSO communication systems. In the second part of this letter, to remove the error floor and to improve the accuracy of the proposed detector, by utilizing a source information transformation we propose an EM-based SD for this case which significantly outperforms the classical EM-based method.

Index Terms: Free-space optics (FSO), data detection, expectation-maximization (EM) algorithm, ON-OFF keying (OOK) modulation, source transformation.

Manuscript received September 6, 2018; accepted September 24, 2018. Date of publication October 1, 2018; date of current version April 9, 2019. The associate editor coordinating the review of this paper and approving it for publication was P. P. Markopoulos. (Corresponding author: Seyed Mohammad Sajad Sadough.)
The authors are with the Department of Electrical Engineering, Shahid Beheshti University G. C., Tehran 1983969411, Iran (e-mail: s_sadough@sbu.ac.ir).
Digital Object Identifier 10.1109/LWC.2018.2873382

I. INTRODUCTION

Free space optical (FSO) communications have recently attracted a great deal of research activity owing to numerous advantages over conventional radio-frequency (RF) transmission, such as very high optical bandwidth, low implementation cost and high security [1]. However, despite these advantages, the main source of impairment in FSO communications is fading due to atmospheric turbulence [2].

Intensity modulation with direct detection (IM/DD) by using pulse-position modulation (PPM) or on-off keying (OOK) has attracted a great deal of attention from both academia and industry and is used in most current commercial FSO systems [2]. The main advantage of using PPM is that there is no need to perform threshold adjustment at the receiver for signal demodulation. On the other hand, OOK modulation offers a better bandwidth efficiency but it requires adaptive threshold setting under channel fading conditions [3]. Although data detection issues have been widely addressed in the context of classical RF wireless communication [4], these results are not directly applicable to OOK-based optical systems. Recently, different techniques for FSO receiver design have been proposed and analyzed in the context of optical wireless communication systems [5]–[17].

As transmission of pilot symbols reduces the bandwidth efficiency of FSO links [6], in most of the aforementioned references, maximum-likelihood sequence detection (MLSD) and its generalizations are advocated. However, implementing these works involves computation of cumbersome and time-consuming integrals, which compromises real-time exploitation. As an alternative to MLSD detection methods, generalized likelihood ratio test (GLRT) sequence detection (SD) is proposed in [8]. The method in [8] uses GLRT as a metric inside the Viterbi algorithm to detect the sequence of OOK symbols. Recently, a GLRT-based SD was proposed in [15] for the more general scenario of non-return-to-zero (NRZ)-OOK symbols where the variances of transmitted one-bit and zero-bit are not necessarily equal. The GLRT-based SD method proposed in [15] achieves performance close to the MLSD receiver for a sufficiently large length of the observation window, with considerably lower computational complexity than MLSD.

Notice that due to the very high data rate of FSO communication, the speed of opto-electronic devices is the main limiting factor in implementing an FSO link. In this letter, to detect OOK symbols over an FSO channel, we use the expectation-maximization (EM) algorithm. The use of EM-based detection in the context of optical wireless systems was first proposed in [6] and then analyzed in more depth in [16]. Here, without resorting to the transmission of pilot symbols, we first propose an EM-based SD for the more general case of NRZ-OOK symbols where the variances of transmitted one-bit and zero-bit are not necessarily equal. We will show that the proposed EM-based SD can achieve performance very close to that achieved with perfect channel state information (CSI) for a sufficiently large length of the observation window, while its processing is even faster than that of the GLRT-based SD recently introduced in [15]. The main challenge of using EM- and GLRT-based methods is the error floor due to the transmission of all-zero sequences [16]. To overcome this limitation, in the second part of this letter, by utilizing a specified source information transformation (SIT), we propose an EM-based SD for this case that, by removing the all-zero sequence, achieves a significantly better performance compared to similar methods.

II. SYSTEM MODEL

Similar to the system model of [15], we consider an IM/DD FSO link with NRZ-OOK modulation over an atmospheric turbulence channel. The received signal, denoted $r_k$, can be expressed at any discrete time $k$ as $r_k = P_t s_k h + n_k$, where $h$ is the fading channel coefficient that is assumed constant over a large number of transmitted bits and $s_k \in \{\alpha_e, 1\}$ is the transmitted NRZ-OOK symbol: $s_k = \alpha_e$ represents the digital symbol $s'_k = 0$ while $s_k = 1$ represents the digital symbol $s'_k = 1$ [1]. Without loss of generality, we assume that the
DABIRI AND SADOUGH: RECEIVER DESIGN FOR OOK MODULATION OVER TURBULENCE CHANNELS USING SOURCE TRANSFORMATION 393
transmitted energy $P_t$ is normalized to one. Moreover, we assume that $n_k$ is the additive white Gaussian noise with zero mean and variance $\sigma_1^2$ due to the transmitted one-bit and $\sigma_0^2$ due to the transmitted zero-bit [7], [14]. Moreover, the average electrical SNR is defined as $\frac{2\mathrm{E}\{|s_k h|^2\}}{\sigma_1^2 + \sigma_0^2}$, where $\mathrm{E}\{\cdot\}$ denotes statistical expectation.

In FSO links, the channel state can be formulated as $h = h_l h_a$, where $h_l$ is the deterministic propagation loss and $h_a$ is the attenuation due to atmospheric turbulence. For a weak to strong range of atmospheric turbulence, the gamma-gamma turbulence model has emerged as a useful model for FSO communication applications [18], [19]. The gamma-gamma distribution can be expressed as [18]

$$f_{GG}(h_a) = \frac{2(\alpha\beta)^{\frac{\alpha+\beta}{2}}}{\Gamma(\alpha)\Gamma(\beta)}\, h_a^{\frac{\alpha+\beta}{2}-1}\, k_{\alpha-\beta}\!\left(2\sqrt{\alpha\beta h_a}\right),$$

where $\Gamma(\cdot)$ is the gamma function, $k_m(\cdot)$ is the modified Bessel function of the second kind of order $m$, and $1/\beta$ and $1/\alpha$ are the variances of the small and large scale eddies, respectively.

III. EM-BASED SEQUENCE DETECTION AND PERFORMANCE ANALYSIS

In what follows, we assume that the channel state information (CSI) is not available at the receiver. Moreover, we assume that the received data is gathered in an observation window composed of $L$ intervals, $r = \{r_1, r_2, \ldots, r_L\}$, corresponding to $L$ transmitted signals, $s = \{s_1, s_2, \ldots, s_L\}$, and $L$ transmitted digital signals, $s' = \{s'_1, s'_2, \ldots, s'_L\}$, during which the channel is assumed to remain unchanged (i.e., we assume a quasi-static channel model).

According to the terminology of the EM algorithm, the received sequence $r$ and $y = (r, h)$ are referred to as incomplete and complete data sets, respectively. After dropping some unnecessary terms, the log-likelihood function for the complete data is expressed as

$$\log p(r \mid s, h) = -\sum_{k=1}^{L}\left[\frac{(r_k - h s_k)^2}{\sigma_1^2 s'_k + \sigma_0^2 (1 - s'_k)} + s'_k \ln\sigma_1 + (1 - s'_k)\ln\sigma_0\right]. \tag{1}$$

The E-step finds the so-called auxiliary function $Q(s, s^i)$, which is defined as the expectation of the log-likelihood function for the complete data set conditioned on the received sequence $r$ and the $i$th estimate of the transmitted sequence $s$, denoted $s^i = \{s_1^i, \ldots, s_L^i\}$. Moreover, we denote the $i$th estimate of the transmitted digital sequence $s'$ by $s'^i = \{s'^i_1, \ldots, s'^i_L\}$. Starting from (1), we get $Q(s, s^i) = -\sum_{k=1}^{L}\left(R_k^1 s'_k + R_k^0 (1 - s'_k)\right)$, where

$$R_k^0 = \frac{(r_k - \alpha_e h^i)^2}{\sigma_0^2} + \ln\sigma_0, \quad R_k^1 = \frac{(r_k - h^i)^2}{\sigma_1^2} + \ln\sigma_1, \tag{2}$$

$$h^i = \mathrm{E}\left[h \mid s_k^i, r_k\right] = \frac{\sum_{k=1}^{L}\left(\sigma_0^2 r_k s'^i_k + \alpha_e \sigma_1^2 r_k (1 - s'^i_k)\right)}{\sum_{k=1}^{L}\left(\sigma_0^2 s'^i_k + \alpha_e^2 \sigma_1^2 (1 - s'^i_k)\right)}. \tag{3}$$

The M-step calculates the $(i+1)$th estimate of $s$, i.e., $s^{i+1}$, that maximizes $Q(s, s^i)$. More precisely, we have

$$s^{i+1} = \arg\max_{s} Q\left(s, s^i\right). \tag{4}$$

The iterative detection method given in (4) requires an exhaustive search over all $2^L$ possible transmitted sequences $s$ at each iteration, which incurs an exponentially increasing degree of complexity. Since the transmitted sequence is independently generated, maximizing $Q(s, s^i)$ results in maximizing each of its elements $Q_k(s_k, s^i) = -R_k^1 s'_k - R_k^0 (1 - s'_k)$. Hence, the complexity of the sequence detection method based on (4) can be reduced to that of a symbol-by-symbol detection method. We have

$$R_k^1 \underset{s'^{(i+1)}_k = 0}{\overset{s'^{(i+1)}_k = 1}{\lessgtr}} R_k^0 \quad \text{for } k \in \{1, 2, \ldots, L\}. \tag{5}$$

It is well known that an accurate initial estimate of $s$ is necessary for the iterative steps of the EM algorithm to converge to the global maximum. For large values of the observation window length $L$, the number of bits ‘1’ and the number of bits ‘0’ tend to their expected values, i.e., $\mathrm{E}\{\sum_{k=1}^{L} s'_k\} = \mathrm{E}\{\sum_{k=1}^{L} (1 - s'_k)\} = L/2$. Accordingly, we estimate the channel without requiring any pilot symbols as $h^1 = \frac{2}{L(1+\alpha_e)}\sum_{k=1}^{L} r_k$. Then, by substituting $h^1$ in (2) and using (5), the initial estimate of $s$ is obtained.

A. Performance and Complexity Analysis

In this subsection, we provide a performance and complexity comparison of the proposed method with similar existing works in the literature of FSO communications. Throughout our analysis, the receiver with perfect CSI is considered as a lower bound benchmark. The average bit error rate (BER) of the receiver with perfect CSI is derived in [15, eqs. (5)–(7)]. For deriving our numerical results, we consider a link length equal to 1 km, $\alpha_e = 0.2$, $\frac{\sigma_1}{\sigma_0} = \sqrt{7}$ and we set $\alpha = 11.7$ and $\beta = 10.2$.

Fig. 1. BER of proposed EM-based SD versus SNR for different values of L, compared to the BER achieved with GLRT-based method proposed in [15].

In Fig. 1, we have depicted the performance achieved with the EM-based SD introduced in this letter by using Monte-Carlo simulations for different values of $L$. For comparison, we have also plotted the BER curves of the GLRT-based SD introduced in [15]. There are two important observations which can be drawn from Fig. 1. First, we observe that after only two iterations, the proposed EM-based SD converges (i.e., it attains the global maximum) and further iterations result in a negligible performance improvement. Second, we observe that with increasing length of the observation window, the performance of the proposed EM-based detector and that of the GLRT-based method proposed in [15] become close to
the performance achieved with perfect CSI case. In addition, we observe that for a given length of the observation window $L$, the GLRT-based detector outperforms the EM-based detector.

Fig. 2. BER of proposed EM-based SD compared to the BER achieved with GLRT-based method for SNR = 27 dB (a) versus L, and (b) versus computational complexity.

Let us now address the computational complexity of EM and GLRT-based SD. Notice that the GLRT-based SD proposed in [15] has a lower computational complexity than MLSD. However, to detect a sequence of received symbols composed of $L$ bits, the GLRT-based SD in [15] calculates the metric [15, eq. (11)] for $2^L$ possible bit sequences. As the computational complexity of the metric [15, eq. (11)] is proportional to $L$, to detect $L$ bits of the received sequence, the total computational complexity of GLRT-based SD is proportional to $O_G \propto L \times 2^L$. On the other hand, according to (5), to detect $L$ bits of the received sequence, the proposed EM-based method makes $L$ decisions at each iteration and, according to (2) and (3), the computational complexity of each decision is proportional to $L$. Moreover, as shown in Fig. 1, the proposed EM-based method converges after only two iterations. Hence, to detect $L$ bits of the received sequence, the total computational complexity of the EM-based SD is proportional to $O_E \propto 2 \times L^2$.

Figure 2a shows the BER performance versus $L$ at a fixed SNR of 27 dB for the GLRT and EM detectors. This allows us to evaluate the length of the observation window necessary to achieve a certain target BER. We observe that the GLRT detector requires a window of 12 symbols ($L = 12$) to achieve a BER of $10^{-4}$ while the EM detector attains this BER for a window of 18 symbols ($L = 18$). On the other hand, according to the above discussion, the computational complexity of the GLRT detector is proportional to $O_G \propto 12 \times 2^{12} = 49152$ for $L = 12$ whereas the computational complexity of the EM detector is proportional to $O_E \propto 2 \times 18^2 = 648$ for $L = 18$. Indeed, although the proposed EM-based method requires larger values of $L$ to achieve a certain target BER performance, its computational complexity is lower than that of the GLRT-based method. As observed from Fig. 2b, the EM detector is particularly interesting since even with a larger value of $L$, the BER is improved without increasing the computational complexity too much compared to the GLRT method.

IV. REMOVING ERROR FLOOR BY USING SOURCE INFORMATION TRANSFORMATION

The main challenge of using EM- and GLRT-based methods is the error floor due to the transmission of all-zero sequences. To overcome this problem, we utilize the SIT method proposed in [14] which is suitable for fast FSO communications. According to this SIT method, for odd values of $L'$, a binary information sequence of length $L'$ is mapped to a binary information sequence of length $L = L' + 1$. Hence, by reducing the data transmission rate to $\frac{L-1}{L T_s}$, where $T_s$ is the slot duration, the all-zero sequence can be easily removed. In this case, writing the E-step and the M-step of the EM algorithm, (5) can be simplified as

$$R'^1_k \underset{s'^{(i+1)}_k = 0}{\overset{s'^{(i+1)}_k = 1}{\lessgtr}} R'^0_k \quad \text{for } k \in \{1, 2, \ldots, L\}, \tag{6}$$

where $R'^0_k = \sigma_1^2\left(r_k - \alpha_e \frac{\sum_{k=1}^{L} r_k s'^i_k}{\sum_{k=1}^{L} s'^i_k}\right)^2 + \sigma_0^2\sigma_1^2 \ln\sigma_0$, and $R'^1_k = \sigma_0^2\left(r_k - \frac{\sum_{k=1}^{L} r_k s'^i_k}{\sum_{k=1}^{L} s'^i_k}\right)^2 + \sigma_0^2\sigma_1^2 \ln\sigma_1$.

To show the superiority of this proposed method, in the sequel, we provide a proof that the proposed scheme does not floor. Notice that the worst case of this proposed scheme occurs when the sequence has only one bit ‘1’. In this case, errors occur only for zero-bits. Let us denote by $M_L$ the number of ‘0’ bits that after the first decision are detected mistakenly as ‘1’ bits. At high SNR (i.e., $\sigma_1 \ll 1$), for $s'_k = 0$ in the considered transmitted sequence, (6) can be approximated as

$$\alpha_e^2 h^2 \frac{\sigma_1^2}{\sigma_0^2}\left(1 - \frac{1 + M_L\alpha_e}{M_L + 1}\right)^2 \underset{s'^{(i+1)}_k = 1}{\overset{s'^{(i+1)}_k = 0}{\gtrless}} \sigma_1^2 \ln\frac{\sigma_1}{\sigma_0}. \tag{7}$$

Equation (7) confirms that for each small value of $h$, there is a small value of $\sigma_1$ that satisfies $\alpha_e^2 h^2 \frac{\sigma_1^2}{\sigma_0^2}\left(1 - \frac{1 + M_L\alpha_e}{M_L + 1}\right)^2 - \sigma_1^2 \ln\frac{\sigma_1}{\sigma_0} > 0$ and consequently after the second iteration $M_L = 0$.

For performance comparison, in Fig. 3 we have contrasted the BER of the proposed EM-based SD using SIT to the BER of the simple proposed EM-based SD and GLRT-based SD. As we can observe, the proposed EM-based SD using SIT outperforms competitive methods at the expense of using additional hardware for SIT.

The occurrence probability of an all-zero sequence for an observation window composed of $L$ bits is equal to $1/2^L$. Hence, for a given target BER $P_{e,tar}$, $L$ must be chosen as $L \gg -\log_2(P_{e,tar})$. It is well known that the $P_{e,tar}$ of FSO communications is commonly lower than $10^{-9}$ [16]; for instance, for two common values of $P_{e,tar} = 10^{-9}$ and $10^{-12}$, $L$ must satisfy $L \gg 30$ and 40, respectively. Hence, for $P_{e,tar} = 10^{-9}$
and $P_{e,tar} = 10^{-12}$, the computational complexity of the EM-based method is proportional to 1800 and 3200, respectively, and the computational complexity of the GLRT-based method is proportional to $3.2 \times 10^{10}$ and $4.4 \times 10^{13}$, respectively.

According to the methodology of the SIT algorithm, $2^{L'}$ possible information sequences of length $L'$ are mapped to $2^{L'}$ different sequences of length $L = L' + 1$ for which each sequence of length $L$ contains at least one bit “1” [12]. For such a method, the transceiver requires a one-to-one mapping with $2^{L'}$ inputs and $2^{L'}$ outputs. Hence, the complexity of the SIT method is proportional to $O_S \propto 2^{L'} = 2^{L-1}$. Now, considering the joint effect of SIT and EM, the overall computational complexity of the EM-SIT algorithm is proportional to $O_{ES} = O_E + 2O_S \propto 2 \times L^2 + 2^L$, where the complexity of SIT is multiplied by two as SIT is performed at both transmitter and receiver sides. On the other hand, according to the results depicted in Fig. 3, the EM-SIT algorithm has a performance comparable to the receiver with perfect CSI for $L = 4$ and consequently its computational complexity is proportional to $2 \times 4^2 + 2^4 = 48$, which is much lower than the complexity of EM- and GLRT-based methods at practical target BERs.

Fig. 3. BER of proposed EM-based SD with SIT versus SNR for L = 4, compared to the BER achieved with simple proposed EM-based SD and GLRT-based method of [15].

V. CONCLUSION

In the first part of this letter, we proposed an EM-based detector over FSO channels for the general case of NRZ OOK symbols where the variances of transmitted one-bit and zero-bit are not necessarily equal. The proposed detector has a high spectral efficiency since it does not require transmission of any pilot symbols for channel estimation. We showed that the proposed EM-based detector is particularly suitable for use in fast FSO communications because it has BER performance close to that achieved with perfect CSI. Moreover, it was shown that EM-based SD outperforms GLRT as it provides higher quality of service in terms of target BER. Although compared to GLRT the proposed EM-based method requires larger values of $L$ to achieve performance close to the perfect CSI case, its computational complexity is lower than that of the GLRT-based method. The main challenge of using EM- and GLRT-based SD methods was the error floor due to the transmission of all-zero sequences. To overcome this error floor, in the second part of this letter, we proposed a modified EM algorithm for a system that removes the all-zero sequence by using a SIT. It was verified that the proposed scheme significantly outperforms other SD methods without considerably increasing the computational complexity at the receiver.

REFERENCES

[1] Z. Ghassemlooy, W. Popoola, and S. Rajbhandari, Optical Wireless Communications. Boca Raton, FL, USA: CRC Press, 2012.
[2] M. A. Khalighi and M. Uysal, “Survey on free space optical communication: A communication theory perspective,” IEEE Commun. Surveys Tuts., vol. 16, no. 4, pp. 2231–2258, 4th Quart., 2014.
[3] M. T. Dabiri, S. M. S. Sadough, and M. A. Khalighi, “FSO channel estimation for OOK modulation with APD receiver over atmospheric turbulence and pointing errors,” Opt. Commun., vol. 402, pp. 577–584, Nov. 2017.
[4] Y. Liu, Z. Tan, H. Hu, L. J. Cimini, and G. Y. Li, “Channel estimation for OFDM,” IEEE Commun. Surveys Tuts., vol. 16, no. 4, pp. 1891–1908, 4th Quart., 2014.
[5] X. Zhu and J. M. Kahn, “Free-space optical communication through atmospheric turbulence channels,” IEEE Trans. Commun., vol. 50, no. 8, pp. 1293–1300, Aug. 2002.
[6] N. D. Chatzidiamantis, M. Uysal, T. A. Tsiftsis, and G. K. Karagiannidis, “Iterative near maximum-likelihood sequence detection for MIMO optical wireless systems,” J. Lightw. Technol., vol. 28, no. 7, pp. 1064–1070, Apr. 1, 2010.
[7] H. Moradi, H. H. Refai, and P. G. LoPresti, “Thresholding-based optimal detection of wireless optical signals,” J. Opt. Commun. Netw., vol. 2, no. 9, pp. 689–700, Sep. 2010.
[8] T. Song and P.-Y. Kam, “A robust GLRT receiver with implicit channel estimation and automatic threshold adjustment for the free space optical channel with IM/DD,” J. Lightw. Technol., vol. 32, no. 3, pp. 369–383, Feb. 1, 2014.
[9] T. Song and P.-Y. Kam, “Robust data detection for the photon-counting free-space optical system with implicit CSI acquisition and background radiation compensation,” J. Lightw. Technol., vol. 34, no. 4, pp. 1120–1132, Feb. 15, 2016.
[10] M. M. Abadi, Z. Ghassemlooy, M.-A. Khalighi, S. Zvanovec, and M. R. Bhatnagar, “FSO detection using differential signaling in outdoor correlated-channels condition,” IEEE Photon. Technol. Lett., vol. 28, no. 1, pp. 55–58, Jan. 1, 2016.
[11] M. M. Abadi et al., “Impact of link parameters and channel correlation on the performance of FSO systems with the differential signaling technique,” IEEE/OSA J. Opt. Commun. Netw., vol. 9, no. 2, pp. 138–148, Feb. 2017.
[12] L. Yang, X. Song, J. Cheng, and J. F. Holzman, “Free-space optical communications over lognormal fading channels using OOK with finite extinction ratios,” IEEE Access, vol. 4, pp. 574–584, 2016.
[13] L. Yang, J. Cheng, and J. F. Holzman, “Maximum likelihood estimation of the lognormal-rician FSO channel model,” IEEE Photon. Technol. Lett., vol. 27, no. 15, pp. 1656–1659, Aug. 1, 2015.
[14] L. Yang, B. Zhu, J. Cheng, and J. F. Holzman, “Free-space optical communications using on–off keying and source information transformation,” J. Lightw. Technol., vol. 34, no. 11, pp. 2601–2609, Jun. 1, 2016.
[15] M. T. Dabiri, S. M. S. Sadough, and H. Safi, “GLRT-based sequence detection of OOK modulation over FSO turbulence channels,” IEEE Photon. Technol. Lett., vol. 29, no. 17, pp. 1494–1497, Sep. 1, 2017.
[16] M. T. Dabiri and S. M. S. Sadough, “Performance analysis of EM-based blind detection for ON–OFF keying modulation over atmospheric optical channels,” Opt. Commun., vol. 413, pp. 299–303, Apr. 2018.
[17] M. T. Dabiri and S. M. S. Sadough, “Generalized blind detection of OOK modulation for free-space optical communication,” IEEE Commun. Lett., vol. 21, no. 10, pp. 2170–2173, Oct. 2017.
[18] L. C. Andrews and R. L. Phillips, Laser Beam Propagation Through Random Media, vol. 1. Bellingham, WA, USA: SPIE Press, 2005.
[19] M. T. Dabiri, S. M. S. Sadough, and M. A. Khalighi, “Channel modeling and parameter optimization for hovering UAV-based free-space optical links,” IEEE J. Sel. Areas Commun., to be published, doi: 10.1109/JSAC.2018.2864416.
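
To make the detector of the letter above more concrete, the following Python sketch walks through the pilot-free initial estimate $h^1 = \frac{2}{L(1+\alpha_e)}\sum_k r_k$, the channel update (3) and the symbol-by-symbol rule (5), and also reproduces the complexity figures quoted in the text ($O_G \propto L \times 2^L$, $O_E \propto 2L^2$, $O_{ES} \propto 2L^2 + 2^L$). It is a minimal sketch rather than the authors' implementation: the function names, the fixed channel value used in `simulate`, and the noise levels in the usage note are assumptions chosen for illustration, and the source information transformation itself is not implemented.

```python
import math
import random


def decide(r, h, alpha_e, sigma0, sigma1):
    """Symbol-by-symbol rule (5): choose bit 1 when R1_k < R0_k, with metrics from (2)."""
    bits = []
    for rk in r:
        r0 = (rk - alpha_e * h) ** 2 / sigma0 ** 2 + math.log(sigma0)
        r1 = (rk - h) ** 2 / sigma1 ** 2 + math.log(sigma1)
        bits.append(1 if r1 < r0 else 0)
    return bits


def em_detect(r, alpha_e, sigma0, sigma1, iterations=2):
    """Blind detection of NRZ-OOK bits from the received window r."""
    # Pilot-free initial estimate, using E[r_k] = h (1 + alpha_e) / 2.
    h = 2.0 * sum(r) / (len(r) * (1.0 + alpha_e))
    bits = decide(r, h, alpha_e, sigma0, sigma1)
    for _ in range(iterations):
        # Channel update of (3), given the current bit decisions.
        num = sum(sigma0 ** 2 * rk * b + alpha_e * sigma1 ** 2 * rk * (1 - b)
                  for rk, b in zip(r, bits))
        den = sum(sigma0 ** 2 * b + alpha_e ** 2 * sigma1 ** 2 * (1 - b)
                  for b in bits)
        h = num / den
        bits = decide(r, h, alpha_e, sigma0, sigma1)
    return bits, h


def simulate(bits, h, alpha_e, sigma0, sigma1, rng):
    """r_k = h s_k + n_k with P_t = 1; the noise level depends on the transmitted bit."""
    return [h * (1.0 if b else alpha_e) + rng.gauss(0.0, sigma1 if b else sigma0)
            for b in bits]


def glrt_complexity(L):
    """O_G proportional to L * 2^L."""
    return L * 2 ** L


def em_complexity(L, iterations=2):
    """O_E proportional to (number of iterations) * L^2."""
    return iterations * L ** 2


def em_sit_complexity(L):
    """O_ES = O_E + 2 O_S with O_S proportional to 2^(L-1)."""
    return em_complexity(L) + 2 * 2 ** (L - 1)
```

A possible usage, with hypothetical noise levels that keep $\sigma_1/\sigma_0 = \sqrt{7}$ and $\alpha_e = 0.2$ as in the numerical setup: draw a balanced bit sequence, call `simulate(bits, 1.0, 0.2, 0.01, 0.01 * math.sqrt(7), random.Random(7))`, and pass the result to `em_detect`. The complexity helpers return the proportionality constants only, so they are meaningful for comparing the detectors and not as operation counts.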
Handover Probability Analysis of Anchor-Based Multi-Connectivity

in 5G User-Centric Network
Hongtao Zhang , Senior Member, IEEE, Wanqing Huang, Student Member, IEEE,
and Yi Liu , Senior Member, IEEE
Abstract—In order to reduce the handover cost due to

network densification, this letter proposes an anchor-based multi-
connectivity (MC) architecture and derives compact expressions
of handover probabilities (HOPs) through stochastic geometry
analysis in user-centric network (UCN). For MC a given user
connects with multiple access points (APs), and the best one is
chosen as a handover anchor to provide control-plane, which
reduces the handover rate. Moreover, HOPs are quantified for a
typical user moving with a random direction and a fixed speed
in an irregular UCN with APs modeled as a Poisson point pro-
cess. The simulation and analytical results show that the HOP
in control-plane achieves a decrease of more than 40% over the
traditional handover scheme in the LTE system.
Index Terms—User-centric network, multi-connectivity, han-
dover, Poisson point process.
Manuscript received August 18, 2018; accepted September 23, 2018. Date of publication October 1, 2018; date of current version April 9, 2019. This work was supported by the National Science Foundation of China under Grant 61302090 and Grant 61671341. The associate editor coordinating the review of this paper and approving it for publication was X. Chu. (Corresponding author: Hongtao Zhang.)
H. Zhang and W. Huang are with the Key Laboratory of Universal Wireless Communications, Ministry of Education of China, Beijing University of Posts and Telecommunications, Beijing 100876, China (e-mail: htzhang@bupt.edu.cn; hwq2013@bupt.edu.cn).
Y. Liu is with the State Key Laboratory of Integrated Service Network, Xidian University, Xi’an 710071, China (e-mail: yliu@xidian.edu.cn).
Digital Object Identifier 10.1109/LWC.2018.2873389

I. INTRODUCTION
Densifying access points (APs) is deemed a promising solution to improve system capacity in 5G [1]. Extensive studies have been conducted to quantify the handover performance in dense networks using stochastic geometry methods [2], [3], revealing that the handover rate increases with base station (BS) density. Therefore, mobility enhancement has become significant for dense networks. Fortunately, by introducing the philosophy of the network serving the user and the “de-cellular” method, the user-centric network (UCN) is defined, which eliminates the traditional cell boundaries by organizing a dynamic AP group (APG) [4]. To form the dynamic APG, dual connectivity (DC) [5] and multi-connectivity (MC) [6] schemes are introduced, decreasing the handover failure (HOF) rate without a loss in the throughput performance gain.
Surprisingly, little research on MC using stochastic geometry can be found. The downlink rate for multiple association is investigated in [7] for stationary users. However, handover patterns are more complicated for MC, as the handovers involve changes of a set of APs in the user’s vicinity. In light of this, [8] calculates the handover rate for multiple association by characterizing the boundaries of the cooperative cluster of the serving APs; however, the handover rate increases due to the larger cluster size. The “de-cellular” architecture of UCN [4], which implies the decoupling of the control plane and user plane (CP/UP), could be a potential solution to lessen the network control overheads and reduce the handover delay [9]. It can be concluded that MC with the CP/UP split architecture is a feasible solution to enhance user mobility in UCN.
In contrast to the existing literature, this letter aims to improve the mobility performance for dense networks with the anchor-based MC architecture. The main contribution is modeling the handover in both UP and CP for MC in UCN, and deriving compact expressions of handover probabilities (HOPs) via stochastic geometry.

Fig. 1. A UE connects with the nearest 3 APs. Handover in CP occurs when AP C is leaving the AS.

II. SYSTEM MODEL
In this letter, the locations of APs and UEs are modeled as two independent 2-D Poisson point processes (PPPs), $\Phi_B$ with density $\lambda_B$ and $\Phi_u$ with density $\lambda_u$, respectively. For each user $u$, the collection of the $M$ APs with the strongest downlink received signal strength (DL-RSS), i.e., the nearest $M$ APs, forms a transmission group termed the active set (AS). The best AP in the AS (e.g., the AP with the largest backhaul capacity), which takes charge of the CP as a handover anchor, is termed the control point (AP C).
The user moves linearly in a short unit of time with a fixed velocity and changes its direction randomly in the next unit of time. Thus, the user’s trajectory consists of several line segments. The UE sends measurement reports to the AP C with the reference signal received power (RSRP) from neighboring APs. Then, the AP C will decide whether to
ZHANG et al.: HANDOVER PROBABILITY ANALYSIS OF ANCHOR-BASED MC IN 5G UCN 397
update the APs in the AS (defined as a UP handover) according to the filtered RSRP, i.e., DL-RSS, to avoid the ping-pong effect. Hence, the AS is always formed by the nearest $M$ APs. In particular, before the serving AP C leaves the AS, it chooses a target AP C among all the serving APs and sends a handover request (defined as a CP handover). During the handover, the other APs in the AS keep serving the UE. Fig. 1 illustrates the handover when the AS size is 3, where AP 2 is leaving the AS after the movement of the UE. In particular, if AP 2 takes charge of the control plane, it will choose the best serving AP as the target AP C, leading to the CP handover process.

III. ANALYTICAL MODEL AND PRELIMINARY ANALYSIS
In this letter, AP $j$ represents the $j$-th nearest AP to the UE. The UE moves a distance $v$ in a unit of time, at angle $\theta$ with respect to the direction of the connection with AP $j$, from the initial location $l_1$ to a new location $l_2$. Note that AP $M$ is not always the furthest serving AP after the movement of the UE. For the sake of unity in the analysis, AP $(M+1)$ is taken into account. $r_j$ ($R$) is the distance between $l_1$ ($l_2$) and AP $j$. $C$ ($C_j$) denotes the circle with its center at $l_1$ and radius $r_{M+1}$ ($r_j$), and $A$ denotes the circle with its center at $l_2$ and radius $R$. By the law of cosines, $R^2 = r_j^2 + v^2 + 2 v r_j \cos\theta$.
In our model, the positional relation between circles $A$ and $C$ depends on the values of $r_j$, $r_{M+1}$ and $\theta$. However, the handover occurs only if circle $A$ intersects circle $C$, which limits the integration domain used in the following sections from $\{(r_j, r_{M+1}, \theta) \mid 0 < r_j < r_{M+1},\ 0 < \theta < \pi\}$ to $\Omega$, given by the following proposition.
Proposition 1: Based on the system model in Section II, the necessary condition of the handover is that circle $A$ intersects circle $C$. Let AP $j$ be the furthest serving AP in the AS after the movement of the UE. Then $r_j$, $r_{M+1}$ and $\theta$ are limited to the domain $\Omega$, given by

$$\iiint_{(\Omega)} = \iiint_{(\Omega_1)} + \iiint_{(\Omega_2)} + \iiint_{(\Omega_3)}, \tag{1}$$

where

$$\Omega_1 = \left\{ (r_j, r_{M+1}, \theta) \,\middle|\, 0 < r_j < v,\ \ r_j < r_{M+1} < 2v - r_j,\ \ 0 < \theta < \pi \right\},$$

$$\Omega_2 = \left\{ (r_j, r_{M+1}, \theta) \,\middle|\, 0 < r_j < v,\ \ 2v - r_j < r_{M+1} < 2v + r_j,\ \ 0 < \theta < \cos^{-1}\frac{r_{M+1}^2 - 2 v r_{M+1} - r_j^2}{2 v r_j} \right\},$$

$$\Omega_3 = \left\{ (r_j, r_{M+1}, \theta) \,\middle|\, r_j > v,\ \ r_j < r_{M+1} < r_j + 2v,\ \ 0 < \theta < \cos^{-1}\frac{r_{M+1}^2 - 2 v r_{M+1} - r_j^2}{2 v r_j} \right\}.$$

Proof: See Appendix A.
The joint distribution of the user-to-AP distances is also considered in the analysis. The following proposition gives the joint probability density function (PDF) $f_{j,M}(r_j, r_M)$.
Proposition 2: Consider a network with APs distributed according to a PPP, and let $r_j$ be the distance between a typical user and its $j$-th nearest AP. The joint PDF of $r_j$ and $r_M$ is given by

$$f_{j,M}(r_j, r_M) = \frac{4 (\pi\lambda_B)^M r_M\, r_j^{2j-1}}{\Gamma(j)\,\Gamma(M-j)} \left(r_M^2 - r_j^2\right)^{M-j-1} e^{-\lambda_B \pi r_M^2}, \tag{2}$$

where $\Gamma(\cdot)$ is the gamma function and $r_j \le r_M$.
Proof: See Appendix B.

IV. HANDOVER PROBABILITY IN USER-PLANE
The UP handover event is represented by $H_u$, and $F_j$ denotes the event that AP $j$ ($1 \le j \le M$) becomes the furthest serving AP in the AS after the movement of the UE. According to the system model, the UP HOP is given by

$$\begin{aligned} P(H_u) &= \sum_{j=1}^{M} \iiint_{(\Omega)} P(H_u \mid F_j, r_{M+1}, r_j, \theta)\, P(F_j \mid r_{M+1}, r_j, \theta)\, f_{j,M+1}(r_j, r_{M+1} \mid \theta)\, f_\theta(\theta)\, \mathrm{d}r_{M+1}\, \mathrm{d}r_j\, \mathrm{d}\theta \\ &\overset{(a)}{=} \frac{1}{\pi} \sum_{j=1}^{M} \iiint_{(\Omega)} P(H_u \mid F_j, r_{M+1}, r_j, \theta)\, P(F_j \mid r_{M+1}, r_j, \theta)\, f_{j,M+1}(r_j, r_{M+1})\, \mathrm{d}r_{M+1}\, \mathrm{d}r_j\, \mathrm{d}\theta, \end{aligned} \tag{3}$$

where (a) uses the assumption that $\theta$ is independent of $r_j$ and is uniformly distributed with PDF $f_\theta(\theta) = 1/\pi$ due to the symmetry. The integration domain $\Omega$ and $f_{j,M+1}(r_j, r_{M+1})$ are given by Propositions 1 and 2, respectively.
$F_j$ is satisfied if no APs are left in the region $C \setminus (A \cap C)$. This region has to be divided into two independent regions: $S_{\mathrm{I}} = C_j \setminus (C_j \cap A)$, with at most $(j-1)$ APs, and $S_{\mathrm{II}} = (C \setminus (A \cap C)) \setminus S_{\mathrm{I}}$, with at most $(M-j)$ APs. The locations of the APs then follow two independent binomial point processes (BPPs) on the region $S_{\mathrm{I}}$ and the region $S_{\mathrm{II}}$ [10]. Thus, the conditional probability of $F_j$ is given by

$$\begin{aligned} P(F_j \mid r_{M+1}, r_j, \theta) &= P(N(|S_{\mathrm{I}}|) = 0)\, P(N(|S_{\mathrm{II}}|) = 0) \\ &= \left(1 - \frac{|S_{\mathrm{I}}|}{|C_j|}\right)^{j-1} \left(1 - \frac{|S_{\mathrm{II}}|}{|C \setminus C_j|}\right)^{M-j} \\ &= \left(\frac{S_\cap(r_j, R, v)}{\pi r_j^2}\right)^{j-1} \left(\frac{S_\cap(r_{M+1}, R, v) - S_\cap(r_j, R, v)}{\pi r_{M+1}^2 - \pi r_j^2}\right)^{M-j}, \end{aligned} \tag{4}$$

where $|S|$ denotes the measure of $S$, $N(\cdot)$ is the number of APs in the specified area, and the function $S_\cap(r, R, v)$, which represents the common area of two intersecting circles with radii $r$ and $R$ and central distance $v$, is given by

$$S_\cap(r, R, v) = r^2 \cos^{-1}\frac{r^2 + v^2 - R^2}{2 v r} + R^2 \cos^{-1}\frac{R^2 + v^2 - r^2}{2 v R} - \frac{1}{2}\sqrt{(r + R - v)(r + R + v)(v + r - R)(v - r + R)}. \tag{5}$$

Conditioned on $F_j$, the UP handover does not occur only if two independent conditions are fulfilled: (a) no new AP is in the region $A \setminus (A \cap C)$, which is denoted by $S_{\mathrm{III}}$, and (b) the arc of circle $C$ that is included in area $A$ does not contain AP $(M+1)$, an event denoted by $N_{arc}$. The probability of condition (a) can be expressed by

$$P(N(|S_{\mathrm{III}}|) = 0) = e^{-\lambda_B \left[\pi R^2 - S_\cap(r_{M+1}, R, v)\right]}. \tag{6}$$

The event $N_{arc}$ is a necessary condition for the event that the UP handover does not occur. Since the positions of the APs follow a PPP, AP $(M+1)$ is uniformly distributed on circle $C$. Based on the ratio of the central angle to $2\pi$, the probability of $N_{arc}$ is given by

$$P(N_{arc}) = \frac{1}{\pi} \cos^{-1}\frac{-r_{M+1}^2 + r_j^2 + 2 v r_j \cos\theta}{2 v r_{M+1}}. \tag{7}$$

Therefore, according to (6) and (7), the conditional probability of the UP handover is given by

$$\begin{aligned} P(H_u \mid F_j, r_{M+1}, r_j, \theta) &= 1 - P(N(|S_{\mathrm{III}}|) = 0)\, P(N_{arc}) \\ &= 1 - \frac{1}{\pi}\, e^{-\lambda_B \left[\pi R^2 - S_\cap(r_{M+1}, R, v)\right]} \cos^{-1}\frac{-r_{M+1}^2 + r_j^2 + 2 v r_j \cos\theta}{2 v r_{M+1}}. \end{aligned} \tag{8}$$

By substituting (4) and (8) into (3), the UP HOP is obtained.
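The integrand factors of (3) can be evaluated numerically. The following is a minimal sketch, not the authors' implementation: it codes the circle-intersection area (5), the conditional probability (4), and the conditional UP handover probability (8). The function names (`lens_area`, `moved_distance`, `p_furthest`, `p_up_handover`) and the clamping of the arccosine arguments outside the domain Ω are assumptions made for illustration.

```python
import math

def clamp(x, lo=-1.0, hi=1.0):
    """Keep acos arguments inside [-1, 1] despite rounding or points outside Omega."""
    return max(lo, min(hi, x))

def lens_area(r, R, v):
    """Common area of two circles (radii r and R, centers v apart), eq. (5)."""
    if v >= r + R:                       # circles do not overlap
        return 0.0
    if v <= abs(r - R):                  # smaller circle lies inside the larger one
        return math.pi * min(r, R) ** 2
    a = r * r * math.acos(clamp((r * r + v * v - R * R) / (2 * v * r)))
    b = R * R * math.acos(clamp((R * R + v * v - r * r) / (2 * v * R)))
    c = 0.5 * math.sqrt(max(0.0, (r + R - v) * (r + R + v) * (v + r - R) * (v - r + R)))
    return a + b - c

def moved_distance(r_j, v, theta):
    """Distance R between the new UE location and AP j (law of cosines)."""
    return math.sqrt(r_j ** 2 + v ** 2 + 2 * v * r_j * math.cos(theta))

def p_furthest(j, M, r_j, r_m1, v, theta):
    """P(F_j | r_{M+1}, r_j, theta), last line of eq. (4)."""
    R = moved_distance(r_j, v, theta)
    s_j = lens_area(r_j, R, v)
    s_c = lens_area(r_m1, R, v)
    inner = s_j / (math.pi * r_j ** 2)
    outer = (s_c - s_j) / (math.pi * (r_m1 ** 2 - r_j ** 2))
    return inner ** (j - 1) * outer ** (M - j)

def p_up_handover(r_j, r_m1, v, theta, lam):
    """P(H_u | F_j, r_{M+1}, r_j, theta), eq. (8); lam is the AP density."""
    R = moved_distance(r_j, v, theta)
    p_no_new_ap = math.exp(-lam * (math.pi * R ** 2 - lens_area(r_m1, R, v)))   # eq. (6)
    arg = (-r_m1 ** 2 + r_j ** 2 + 2 * v * r_j * math.cos(theta)) / (2 * v * r_m1)
    p_arc = math.acos(clamp(arg)) / math.pi                                      # eq. (7)
    return 1.0 - p_no_new_ap * p_arc
```

When circle A lies entirely inside circle C (a point outside Ω), the sketch returns a conditional handover probability of zero, which agrees with the role of Proposition 1 in restricting the integration domain.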

Fig. 2. HOPs versus the user velocity for different AS size M when AP density is 0.001 cells/m²: a) UP HOP; b) CP HOP.
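Curves of this kind can be approximated by simulation. The sketch below is an illustrative Monte Carlo estimate of the UP HOP under the system model of Section II, and it is not the authors' simulator. It assumes that a UP handover occurs whenever the set of the M nearest APs changes after the UE moves a distance v. APs are generated nearest first, using the fact that for a planar PPP the quantities πλr² of the ordered distances form a unit-rate Poisson process. All function names are hypothetical.

```python
import math
import random

def up_handover_once(lam, M, v, rng):
    """One trial: does the set of the M nearest APs change after a move of length v?"""
    pts = []
    s = 0.0            # pi * lam * r^2 of the most recently generated AP
    limit = None       # no AP beyond r_M + 2v can enter the new active set
    while True:
        s += rng.expovariate(1.0)
        r = math.sqrt(s / (math.pi * lam))
        if limit is not None and r > limit:
            break
        phi = rng.uniform(0.0, 2.0 * math.pi)
        pts.append((r * math.cos(phi), r * math.sin(phi)))
        if len(pts) == M:
            limit = r + 2.0 * v
    old_set = set(range(M))                     # APs were generated nearest first
    psi = rng.uniform(0.0, 2.0 * math.pi)       # random direction of movement
    ux, uy = v * math.cos(psi), v * math.sin(psi)
    order = sorted(range(len(pts)),
                   key=lambda i: math.hypot(pts[i][0] - ux, pts[i][1] - uy))
    return set(order[:M]) != old_set

def up_hop(lam, M, v, trials=20000, seed=1):
    """Monte Carlo estimate of the UP handover probability per unit of time."""
    rng = random.Random(seed)
    hits = sum(up_handover_once(lam, M, v, rng) for _ in range(trials))
    return hits / trials

if __name__ == "__main__":
    for speed in (1.0, 5.0, 10.0):
        print(speed, up_hop(lam=0.001, M=3, v=speed))
```

Generating only the APs within r_M + 2v of the initial location is sufficient, because every AP in the new active set is at most r_M + v from the new location.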
V. HANDOVER PROBABILITY IN CONTROL-PLANE
Different from the analysis in Section IV, the CP handover may occur whether or not the AP C becomes the furthest serving AP. $C_j$ represents the event that AP $j$ ($1 \le j \le M$) is the AP C. Hence, the CP HOP $P(H_c)$ can be calculated by

$$\begin{aligned} P(H_c) &= \sum_{j=1}^{M} \iiint_{(\Omega)} P(H_c \mid C_j, r_j, r_{M+1}, \theta)\, P(C_j \mid r_j, r_{M+1}, \theta)\, f_{j,M+1}(r_j, r_{M+1})\, f_\theta(\theta)\, \mathrm{d}r_j\, \mathrm{d}r_{M+1}\, \mathrm{d}\theta \\ &= \frac{1}{M\pi} \sum_{j=1}^{M} \iiint_{(\Omega)} P(H_c \mid C_j, r_j, r_{M+1}, \theta)\, f_{j,M+1}(r_j, r_{M+1})\, \mathrm{d}r_j\, \mathrm{d}r_{M+1}\, \mathrm{d}\theta, \end{aligned} \tag{9}$$

where the integration domain $\Omega$ and $f_{j,M+1}(r_j, r_{M+1})$ are given by Propositions 1 and 2, respectively, and we assume $P(C_j) = 1/M$ for all $j$. For single connection, i.e., $M = 1$, the CP HOP is equal to the UP HOP. In this section, only the case $M \ge 2$ is considered. After the movement of the UE, if the number of APs in the region $A \setminus (A \cap C)$ (and on its periphery) is larger than that in the region $C \setminus (A \cap C)$, the CP handover occurs. Hence, we have

$$\begin{aligned} P(\bar{H}_c \mid C_j, r_j, r_{M+1}, \theta) &= P(N_{arc})\, P\big(N(|S_{\mathrm{III}}|) \le N(|S_{\mathrm{I}}|) + N(|S_{\mathrm{II}}|)\big) + P(\bar{N}_{arc})\, P\big(N(|S_{\mathrm{III}}|) \le N(|S_{\mathrm{I}}|) + N(|S_{\mathrm{II}}|) - 1\big) \\ &= \sum_{n=1}^{M-1} P\big(N(|S_{\mathrm{I}}|) + N(|S_{\mathrm{II}}|) = n\big) \sum_{k=0}^{n-1} P\big(N(|S_{\mathrm{III}}|) = k\big) + P(N_{arc}) \sum_{n=0}^{M-1} P\big(N(|S_{\mathrm{III}}|) = n\big)\, P\big(N(|S_{\mathrm{I}}|) + N(|S_{\mathrm{II}}|) = n\big), \end{aligned} \tag{10}$$

and the APs in $S_{\mathrm{I}}$ and $S_{\mathrm{II}}$ follow independent BPPs, thus

$$P\big(N(|S_{\mathrm{I}}|) + N(|S_{\mathrm{II}}|) = n\big) = \sum_{i=\max(0,\, n-M+j)}^{\min(n,\, j-1)} C_{j-1}^{i} \left(\frac{|S_{\mathrm{I}}|}{\pi r_j^2}\right)^{i} \left(1 - \frac{|S_{\mathrm{I}}|}{\pi r_j^2}\right)^{j-i-1} \cdot C_{M-j}^{n-i} \left(\frac{|S_{\mathrm{II}}|}{\pi\left(r_{M+1}^2 - r_j^2\right)}\right)^{n-i} \left(1 - \frac{|S_{\mathrm{II}}|}{\pi\left(r_{M+1}^2 - r_j^2\right)}\right)^{M-n-j+i}, \tag{11}$$

where the upper and lower bounds of $i$ guarantee that the numbers of APs in $S_{\mathrm{I}}$ and $S_{\mathrm{II}}$ are non-negative, and $C_{M-1}^{n} = \frac{(M-1)!}{n!\,(M-n-1)!}$ is the binomial coefficient. According to the property of the PPP, we have

$$P\big(N(|S_{\mathrm{III}}|) = n\big) = \frac{\left(\left[\pi R^2 - S_\cap(r_{M+1}, R, v)\right] \lambda_B\right)^{n}}{n!}\, e^{-\left(\pi R^2 - S_\cap(r_{M+1}, R, v)\right) \lambda_B}. \tag{12}$$

Since $P(H_c \mid C_j, r_j, r_{M+1}, \theta)$ is the complement of (10), the expression for the conditional CP HOP follows. So far, all the factors for the CP HOP are derived.

Fig. 3. HOPs versus the AP density for different AS size M when user velocity is 5 m/s: a) UP HOP; b) CP HOP.

VI. NUMERICAL RESULTS
The simulations in Figs. 2 and 3 validate the theoretical analysis, where (A) and (S) in the legends denote analytical and simulation results, respectively. The HOP in CP for MC is at least 40% lower than that of the model in [2] when $\lambda_M = 0.1\lambda_B$, where $\lambda_M$ is the macro cell density. In particular, when $\lambda_M = 0$, the scenario in [2] is the same as single connectivity (i.e., $M = 1$) in this letter.
Fig. 2 shows the HOPs versus the user velocity. Before the HOPs reach 0.5, which implies that the HOP is approximately equal to the number of handovers in a unit time, they are almost proportional to the user velocity. The increases of the UP HOP and the decreases of the CP HOP brought by a larger AS size are diminishing, since the radius of the AS, $r_M$, follows a Gamma distribution for which the difference between the peak values of $r_M$ diminishes with $M$.
Fig. 3 illustrates the HOP versus the AP density in logarithmic coordinates. The HOPs are approximately proportional to the logarithm of the AP density before the AP density reaches 0.01 cells/m². This is because the HOP is approximately equal to the number of handovers in a unit time in that case, and the logarithm of the number of APs in the specified area is proportional to the AP density. The HOPs converge to 1 for larger AP density, which is consistent with our expectations.
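The conditional CP probabilities of Section V, equations (10) to (12), can be evaluated as in the following sketch. It is an illustration under stated assumptions, not the authors' code: the first line of (10) is read with N(|S_I|) + N(|S_II|) on the right-hand side, which is the reading that agrees with its second line, the areas |S_I|, |S_II| and |S_III| are computed from the circle-intersection area (5), and all function names are hypothetical.

```python
import math

def lens_area(r, R, v):
    """Common area of two circles (radii r and R, centers v apart), eq. (5)."""
    if v >= r + R:
        return 0.0
    if v <= abs(r - R):
        return math.pi * min(r, R) ** 2
    a = r * r * math.acos(max(-1.0, min(1.0, (r * r + v * v - R * R) / (2 * v * r))))
    b = R * R * math.acos(max(-1.0, min(1.0, (R * R + v * v - r * r) / (2 * v * R))))
    c = 0.5 * math.sqrt(max(0.0, (r + R - v) * (r + R + v) * (v + r - R) * (v - r + R)))
    return a + b - c

def regions(r_j, r_m1, v, theta):
    """Areas |S_I|, |S_II|, |S_III| and P(N_arc) of eq. (7)."""
    R = math.sqrt(r_j ** 2 + v ** 2 + 2 * v * r_j * math.cos(theta))
    s_j = lens_area(r_j, R, v)
    s_c = lens_area(r_m1, R, v)
    s1 = max(0.0, math.pi * r_j ** 2 - s_j)
    s2 = max(0.0, math.pi * (r_m1 ** 2 - r_j ** 2) - (s_c - s_j))
    s3 = max(0.0, math.pi * R ** 2 - s_c)
    arg = (-r_m1 ** 2 + r_j ** 2 + 2 * v * r_j * math.cos(theta)) / (2 * v * r_m1)
    p_arc = math.acos(max(-1.0, min(1.0, arg))) / math.pi
    return s1, s2, s3, p_arc

def p_removed(n, j, M, r_j, r_m1, s1, s2):
    """P(N(S_I) + N(S_II) = n), eq. (11): sum of two independent binomial counts."""
    p1 = s1 / (math.pi * r_j ** 2)
    p2 = s2 / (math.pi * (r_m1 ** 2 - r_j ** 2))
    total = 0.0
    for i in range(max(0, n - M + j), min(n, j - 1) + 1):
        total += (math.comb(j - 1, i) * p1 ** i * (1 - p1) ** (j - 1 - i)
                  * math.comb(M - j, n - i) * p2 ** (n - i) * (1 - p2) ** (M - j - n + i))
    return total

def p_new(n, lam, s3):
    """P(N(S_III) = n), eq. (12): Poisson count with mean lam * |S_III|."""
    mu = lam * s3
    return mu ** n / math.factorial(n) * math.exp(-mu)

def p_no_cp_handover(j, M, r_j, r_m1, v, theta, lam):
    """P(no CP handover | C_j, r_j, r_{M+1}, theta), second line of eq. (10)."""
    s1, s2, s3, p_arc = regions(r_j, r_m1, v, theta)
    rem = [p_removed(n, j, M, r_j, r_m1, s1, s2) for n in range(M)]
    new = [p_new(n, lam, s3) for n in range(M)]
    first = sum(rem[n] * sum(new[:n]) for n in range(1, M))
    second = p_arc * sum(new[n] * rem[n] for n in range(M))
    return first + second
```

The conditional CP HOP that enters (9) is then `1 - p_no_cp_handover(...)`.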
Fig. 4. UP HOP and CP HOP versus AS size M for different AP density and different user velocity.

Fig. 4 illustrates the differences between the UP HOP and the CP HOP. The HOP in CP decreases significantly with the AS size $M$ and converges gradually at $M = 3$ for $v = 1$ m/s and $M = 6$ for $v = 5$ m/s. Note that the UP handover in our model is equivalent to the traditional handover scheme in the LTE system. Hence, the difference between the UP HOP and the CP HOP is the gain brought by the multi-connectivity scheme, which increases with the AS size.

VII. CONCLUSION
In this letter, an anchor-based MC mobility model has been proposed in the 5G UCN environment to enhance user mobility robustness. By grouping an AS for a UE, the UE maintains connections with other APs when one or more APs leave its AS, which provides robust and reliable service for mobile users. Based on the developed model, the UP HOP and the CP HOP are derived, showing a significant decrease compared to the traditional network. Furthermore, the AS size can be optimized for different user velocities and AP densities to achieve a tradeoff between handover rate and signaling overheads.

APPENDIX A
PROOF OF PROPOSITION 1
The event that circle $A$ intersects circle $C$ after the user moves a distance $v$ is the necessary condition of the handover. Since the distance between the centers of circle $A$ and circle $C$ is $v$, the intersection condition reads

$$\begin{aligned} A \cap C \neq \emptyset &\Leftrightarrow |r_{M+1} - R| < v < r_{M+1} + R \\ &\Leftrightarrow (r_{M+1} - v)^2 < R^2 < (r_{M+1} + v)^2 \\ &\Leftrightarrow r_{M+1}^2 - 2 v r_{M+1} < r_j^2 + 2 v r_j \cos\theta \\ &\Leftrightarrow \cos\theta > \frac{r_{M+1}^2 - 2 v r_{M+1} - r_j^2}{2 v r_j}. \end{aligned} \tag{13}$$

Therefore, the integration domain is narrowed down from $\{(r_j, r_{M+1}, \theta) \mid 0 < r_j < r_{M+1},\ 0 < \theta < \pi\}$ to (1).

APPENDIX B
PROOF OF PROPOSITION 2
For $j < M$, the conditional complementary CDF of $r_M$ given $r_j$ is the probability that at most $(M - j - 1)$ nodes lie in the annulus $C_M \setminus C_j$, i.e., farther than $r_j$ but closer than $r_M$, which is given by

$$\begin{aligned} P\big(N(|C_M \setminus C_j|) \le M - j - 1\big) &= \sum_{k=0}^{M-j-1} P\big(N(|C_M \setminus C_j|) = k\big) \\ &= \sum_{k=0}^{M-j-1} \frac{\left[\left(\pi r_M^2 - \pi r_j^2\right) \lambda_B\right]^{k}}{k!}\, e^{-\lambda_B \left(\pi r_M^2 - \pi r_j^2\right)}. \end{aligned} \tag{14}$$

Similar to the analysis of [11, Th. 1], for $k \ge 1$,

$$\begin{aligned} \frac{\mathrm{d}}{\mathrm{d}r_M} P\big(N(|C_M \setminus C_j|) = k\big) &= 2 r_M \left[\frac{(\pi\lambda_B)^{k}}{(k-1)!} \left(r_M^2 - r_j^2\right)^{k-1} - \frac{(\pi\lambda_B)^{k+1}}{k!} \left(r_M^2 - r_j^2\right)^{k}\right] e^{-\lambda_B \pi \left(r_M^2 - r_j^2\right)} \\ &\triangleq 2 r_M \left(S_{k-1} - S_k\right) e^{-\lambda_B \pi \left(r_M^2 - r_j^2\right)}, \end{aligned} \tag{15}$$

where $S_k = \frac{(\pi\lambda_B)^{k+1}}{k!} \left(r_M^2 - r_j^2\right)^{k}$. For $k = 0$,

$$\frac{\mathrm{d}}{\mathrm{d}r_M} P\big(N(|C_M \setminus C_j|) = 0\big) = -2 r_M S_0\, e^{-\lambda_B \pi \left(r_M^2 - r_j^2\right)}. \tag{16}$$

Hence, the conditional PDF of $r_M$ is given by

$$\begin{aligned} f_{M|j}(r_M \mid r_j) &= -\frac{\mathrm{d}}{\mathrm{d}r_M} P\big(N(|C_M \setminus C_j|) \le M - j - 1\big) \\ &= 2 r_M \left(\sum_{k=0}^{M-j-1} S_k - \sum_{k=0}^{M-j-2} S_k\right) e^{-\lambda_B \pi \left(r_M^2 - r_j^2\right)} \\ &= \frac{2 r_M (\pi\lambda_B)^{M-j}}{\Gamma(M-j)} \left(r_M^2 - r_j^2\right)^{M-j-1} e^{-\lambda_B \pi \left(r_M^2 - r_j^2\right)}. \end{aligned} \tag{17}$$

So far, $f_{j,M}(r_j, r_M)$ can be obtained by $f_{M|j}(r_M \mid r_j) \cdot f_j(r_j)$.

REFERENCES
[1] H. Zhang, Y. Chen, Z. Yang, and X. Zhang, “Flexible coverage for backhaul-limited ultra-dense heterogeneous networks: Throughput analysis and η-optimal biasing,” IEEE Trans. Veh. Technol., vol. 67, no. 5, pp. 4161–4172, May 2018.
[2] S.-S. Hsueh and K.-H. Liu, “An equivalent analysis for handoff probability in heterogeneous cellular networks,” IEEE Commun. Lett., vol. 21, no. 6, pp. 1405–1408, Jun. 2017.
[3] Y. Teng, M. Liu, and M. Song, “Effect of outdated CSI on handover decisions in dense networks,” IEEE Commun. Lett., vol. 21, no. 10, pp. 2238–2241, Oct. 2017.
[4] H. Zhang, Z. Yang, Y. Liu, and X. Zhang, “Power control for 5G user-centric network: Performance analysis and design insight,” IEEE Access, vol. 4, pp. 7347–7355, 2016.
[5] H. Zhang, N. Meng, Y. Liu, and X. Zhang, “Performance evaluation for local anchor-based dual connectivity in 5G user-centric network,” IEEE Access, vol. 4, pp. 5721–5729, 2016.
[6] F. B. Tesema, A. Awada, I. Viering, M. Simsek, and G. Fettweis, “Evaluation of context-aware mobility robustness optimization and multi-connectivity in intra-frequency 5G ultra dense networks,” IEEE Wireless Commun. Lett., vol. 5, no. 6, pp. 608–611, Dec. 2016.
[7] M. Kamel, W. Hamouda, and A. Youssef, “Performance analysis of multiple association in ultra-dense networks,” IEEE Trans. Commun., vol. 65, no. 9, pp. 3818–3831, Sep. 2017.
[8] W. Bao and B. Liang, “Optimizing cluster size through handoff analysis in user-centric cooperative wireless networks,” IEEE Trans. Wireless Commun., vol. 17, no. 2, pp. 766–778, Nov. 2017.
[9] H. Ibrahim, H. ElSawy, U. T. Nguyen, and M. S. Alouini, “Mobility-aware modeling and analysis of dense cellular networks with C-plane/U-plane split architecture,” IEEE Trans. Commun., vol. 64, no. 11, pp. 4879–4894, Nov. 2016.
[10] M. Haenggi, Stochastic Geometry for Wireless Networks. Cambridge, U.K.: Cambridge Univ. Press, 2012.
[11] M. Haenggi, “On distances in uniformly random networks,” IEEE Trans. Inf. Theory, vol. 51, no. 10, pp. 3584–3586, Oct. 2005.
Traffic-Aware Relay Vehicle Selection in Millimeter-Wave Vehicle-to-Vehicle Communication
Bo Fan, Hui Tian, Shushan Zhu, Yanyan Chen, and Xuzhen Zhu
Abstract—Line-of-sight (LOS) blockage is a crucial problem in millimeter-wave vehicle-to-vehicle communication due to its severe penetration loss and high vehicle mobility. To overcome the LOS blockage problem, in this letter we propose using neighbor vehicles as relays to forward the blocked traffic flows. Specifically, a traffic-aware relay vehicle selection is investigated. First, the analytic hierarchy process (AHP) is adopted to handle the situation in which different traffic types have different preferences on the performances of rate, delay and data dropping ratio. Second, we introduce a coalitional game (CG) to evaluate the relative performances of the relay vehicles. Finally, by combining the results of AHP and CG, a heuristic relay selection scheme is devised that selects the relay vehicle with the best rationality degree. Simulation results show that the proposed scheme can adapt to the requirements of different traffic types and improve the performances of rate, delay and data dropping ratio when LOS blockage occurs.
Index Terms—LOS blockage, Millimeter-wave V2V communication, Relay selection.

Manuscript received August 27, 2018; revised September 26, 2018; accepted September 27, 2018. Date of publication October 3, 2018; date of current version April 9, 2019. This work was supported by the National Key Research and Development Program of China under Grant 2017YFC0803903. The associate editor coordinating the review of this paper and approving it for publication was H. Q. Ngo. (Corresponding author: Yanyan Chen.)
B. Fan and Y. Chen are with the Beijing Key Laboratory of Traffic Engineering, Beijing University of Technology, Beijing 100022, China (e-mail: fanbo@bjut.edu.cn; cdyan@bjut.edu.cn).
H. Tian and X. Zhu are with the State Key Laboratory of Networking and Switch Technology, Beijing University of Posts and Telecommunications, Beijing 100000, China (e-mail: tianhui@bupt.edu.cn; zhuxuzhen@bupt.edu.cn).
S. Zhu is with the School of Mechanical Engineering, Beijing Institute of Technology, Beijing 100081, China (e-mail: 13520870832@126.com).
Digital Object Identifier 10.1109/LWC.2018.2873585

I. INTRODUCTION
Vehicle-to-vehicle (V2V) communication is an important technology to support future automotive services including intelligent driving, in-vehicle infotainment, etc. Many of these services require a high data rate. For example, self-driving vehicles can generate up to 1 TB of data per driving hour [1] and vehicle laser radars produce high-resolution maps requiring 10x Mbps data rate [2]. To satisfy the high data rate requirement, the adoption of millimeter-wave (mmWave) in V2V communications has attracted considerable attention [3].
In mmWave V2V communications, one significant challenge is the high possibility of inter-vehicular line-of-sight (LOS) blockages, which are caused by the dynamic topology of V2V networks. Boban et al. [4] analyze the LOS blockage for V2V communications via realistic traffic information. Results indicate that the probability of V2V LOS blockage increases notably, as a growing traffic density raises the blockage probability of mobile and static objects (vehicles, buildings, trees, etc.).
To overcome the blockage effect, a self-organized V2V association strategy is proposed in [5]. When blockage interrupts the data transmission, either a new data packet queue or a new V2V association is triggered, leaving the unfinished data packets dropped. Through the proposed strategy, the performance of data rate and delay can be improved, but the reliability cannot be guaranteed, especially when the dropped data packets have high priority. To improve the reliability, Wu et al. [6] propose using mmWave relays to forward the blocked data packets. Considering the vehicle mobility and mmWave propagation conditions, Deng et al. [7] propose a low-complexity network control framework for fast relay selection and resource allocation. The relay is selected by opportunistically discovering the relays that meet a predefined signal-to-interference-noise (SINR) threshold.
In existing studies, however, the requirements of the specific traffic services in V2V communications are ignored. For example, intelligent driving services are designed to help drivers avoid collisions and increase road safety [8]. Infotainment services provide audio and video entertainment to vehicle users. That is to say, intelligent driving needs highly reliable communications while infotainment services are more sensitive to rate and delay. Therefore, when determining the relay selection, the preference of different traffic types for different network performances is important. However, the existing studies in [6] and [7] only consider a single performance without the actual traffic type.
In this letter, we devise a traffic-aware relay vehicle selection to handle the multi-traffic and multi-performance relay selection problem and meet the preference of different traffic types. First, the analytic hierarchy process (AHP) is adopted to model the preference of different traffic types for different network performances. AHP provides an efficient method for quantifying the traffic preference through a pre-defined pair-wise comparison matrix and evaluating the performance weights according to specific traffic types [9]. However, the subjectiveness of the pair-wise comparison weakens the accuracy of the AHP method. To further improve the accuracy of the AHP method, a coalitional game (CG) is introduced where relay vehicles are modeled as players. CG provides a comparatively fair method to evaluate the relative contribution of the performance capabilities of each player in the coalition [10]. Finally, we combine the results of AHP and CG to assess the rationality degree of the relay vehicles and choose the most rational one. Simulation results show that the proposed solution can overcome the LOS
FAN et al.: TRAFFIC-AWARE RELAY VEHICLE SELECTION IN mmWAVE V2V COMMUNICATION 401
blockage effect by improving the performance of rate, delay and data dropping ratio. Moreover, the proposed relay selection proves to be a promising method of guaranteeing the quality of service (QoS) according to different traffic conditions.

II. PROPOSED RELAY VEHICLE TRANSMISSION MODEL
Consider a bi-directional four-lane urban road section, where the moving vehicles are equipped with vehicular transmitters (vTx) and vehicular receivers (vRx) operating in the mmWave band. In Fig. 1(a), the communication link between vTx $i$ and vRx $k$ is blocked by an obstacle between them, causing a sudden decrease of transmission rate or received signal strength (RSS). An RSS decrease exceeding the maximum threshold triggers the process of relay discovery and relay selection.

Fig. 1. The proposed relay vehicle selection model.

Relay vehicle discovery: Source vehicle $i$ and destination vehicle $k$ simultaneously search their neighbors for candidate relays. For example, vehicles can send directional mmWave beacon signals to help a neighbor discover them and estimate the channel between them. If the neighbor is in both the source and destination vehicles’ beam coverage (either LOS or NLOS), it sends back an acknowledgement to the source vehicle, informing it that the neighbor can serve as a candidate relay.
Relay vehicle selection: The source vehicle $i$ decodes the acknowledgement signals to obtain information such as neighbor vehicle ID, channel state, antenna sector ID, etc. Note that the antenna sector ID can help the neighbor estimate its relative location and reduce the beam alignment complexity. After decoding all the acknowledgement signals, source vehicle $i$ serves as a central coordinator and performs the relay vehicle selection.

III. PROBLEM FORMULATION AND SOLUTION
Let $\mathcal{J} \triangleq \{1, 2, \ldots, J\}$ denote the set of candidate relay vehicles with cardinality $J$. $q_n^s$ represents the size of the $n$th traffic flow with service type $s$. Each relay vehicle $j$ is equipped with a limited buffer size $Q_j$. Let $\alpha_{n,j}^s$ be a binary relay association variable such that $\alpha_{n,j}^s = 1$ means relay vehicle $j$ is selected to transmit traffic flow $n$, and $\alpha_{n,j}^s = 0$ otherwise. Because different traffic types have different preferences on different communication performances, we formulate the relay selection problem as a multi-attribute decision making (MADM) problem as follows.
Transmission Rate: In Fig. 1(a), consider a directional two-dimensional ideal sectored antenna. The transmission and reception antenna gains $g_{i,k}^{\wp}$ ($\wp \in \{tx, rx\}$) are given by [11]

$$g_{i,k}^{\wp} = \begin{cases} \dfrac{2\pi - \left(2\pi - \varphi_{i,k}^{\wp}\right) g}{\varphi_{i,k}^{\wp}}, & \text{if } \theta_{i,k} \le \varphi_{i,k}^{\wp}/2, \\ g, & \text{otherwise,} \end{cases} \tag{1}$$

where $\theta_{i,k}$ is the alignment error angle between vTx $i$ and vRx $k$, $\varphi_{i,k}^{\wp}$ is the half-power beamwidth at the transmission ($\wp = tx$) and reception ($\wp = rx$) sides, and $0 < g \ll 1$ is the non-negligible sidelobe power. Let $\ell_{i,k}$ be the data link between vTx $i$ and vRx $k$. The channel gain $g_{i,k}$ of link $\ell_{i,k}$ in decibels is given by [12]

$$g_{i,k} = 10 A \log_{10}(|\ell_{i,k}|) + C + 15\,|\ell_{i,k}|/1000, \tag{2}$$

where the third term refers to the atmospheric attenuation at 60 GHz. The pathloss exponent $A$ and the constant $C$ depend on the number of obstacles that obstruct link $\ell_{i,k}$. We take $A = 2.1$ and $C = 75.1$ for LOS propagation, and $A = 1.22$ and $C = 94.6$ for NLOS propagation. The achievable data rate of link $\ell_{i,k}$ can be obtained by

$$\gamma_{i,k} = B \log\left(1 + \frac{p_i\, g_{i,k}^{tx}\, g_{i,k}\, g_{i,k}^{rx}}{\sum_{z \ne i} p_z\, g_{z,k}^{tx}\, g_{z,k}\, g_{z,k}^{rx} + N_0 B}\right), \tag{3}$$

with $p_i$ being the transmission power of vTx $i$, and $g_{i,k}^{tx}$ and $g_{i,k}^{rx}$ being the transmission and reception antenna gains, respectively. $N_0$ is the Gaussian white noise power density and $B$ is the bandwidth of the mmWave band. The term $\sum_{z \ne i} p_z\, g_{z,k}^{tx}\, g_{z,k}\, g_{z,k}^{rx}$ in (3) represents the interference received at vRx $k$ from vTx $z$, where $z \ne i$.
Utilizing the decode-and-forward (DF) relay protocol, the effective data rate $R_{n,j}$ of transmitting the $n$th traffic flow on relay $j$ is bounded by the lower rate of link $\ell_{i,j}$ and link $\ell_{j,k}$:

$$R_{n,j} = \alpha_{n,j}^s \min(\gamma_{i,j}, \gamma_{j,k}). \tag{4}$$

Delay: The delay of transmitting the $n$th traffic flow through relay $j$ is

$$D_{n,j} = d_{i,j} + d_{j,k}, \tag{5}$$

where $d_{i,j}$ and $d_{j,k}$ denote the delay of link $\ell_{i,j}$ and $\ell_{j,k}$, respectively. The delay includes two parts, transmission delay and beam alignment delay:

$$d_{i,j} = \alpha_{n,j}^s\, q_n^s / \gamma_{i,j} + d_{i,j}^{alignment}, \tag{6a}$$

$$d_{j,k} = \alpha_{n,j}^s\, q_n^s / \gamma_{j,k} + d_{j,k}^{alignment}. \tag{6b}$$

Since the antenna sector ID has already been learned via the relay vehicle discovery, only fine-granularity beam-level alignment is performed within the sector. Thus the delay of beam alignment can be effectively reduced, and it can be estimated by [5]

$$d_{i,j}^{alignment} = \frac{\psi_i^{tx}\, \psi_j^{rx}}{\varphi_{i,j}^{tx}\, \varphi_{i,j}^{rx}}\, T_p, \tag{7a}$$

$$d_{j,k}^{alignment} = \frac{\psi_j^{tx}\, \psi_k^{rx}}{\varphi_{j,k}^{tx}\, \varphi_{j,k}^{rx}}\, T_p, \tag{7b}$$
where $d_{i,j}^{alignment}$ and $d_{j,k}^{alignment}$ are the beam alignment delays of link $\ell_{i,j}$ and $\ell_{j,k}$, respectively. In (7a), $\psi_i^{tx}$ and $\psi_j^{rx}$ denote the sector-level beamwidths of vTx $i$ and vRx $j$, and $T_p$ denotes the pilot transmission duration. $\psi_j^{tx}$ and $\psi_k^{rx}$ in (7b) denote the sector-level beamwidths of vTx $j$ and vRx $k$.
Data dropping ratio: If the relay’s buffer size is smaller than the traffic size $q_n^s$, the excess data will be dropped. Hence the data dropping ratio is

$$\Gamma_{n,j} = 1 - \min\left(q_n^s,\ \alpha_{n,j}^s Q_j\right) / q_n^s. \tag{8}$$

To this end, we have obtained the multiple performances of the relay transmission. Next we write the relay selection problem as a MADM problem

$$\begin{aligned} &\max R,\quad \min D,\quad \min \Gamma \\ \text{s.t.}\quad &\text{C1: } \alpha_{n,j}^s \in \{0, 1\},\ \forall n, j \\ &\text{C2: } \sum_{j=1}^{J} \alpha_{n,j}^s = 1,\ \forall n \end{aligned} \tag{9}$$

where $R = \sum_{n=1}^{N} \sum_{j=1}^{J} R_{n,j}$, $D = \sum_{n=1}^{N} \sum_{j=1}^{J} D_{n,j}$, and $\Gamma = \sum_{n=1}^{N} \sum_{j=1}^{J} \Gamma_{n,j}$. The decision variable is $\alpha_{n,j}^s$. The objectives include maximizing the relay transmission rate and minimizing the delay and data dropping ratio. Constraint C2 ensures that each traffic flow is allocated to exactly one relay vehicle.

A. Evaluating Objective Weights by AHP
We introduce AHP to evaluate the weights of different optimization objectives. First, the importance level of each objective is compared with that of another according to the importance numerical scales listed in [9]. Thus, the comparison matrix $A = [a_{u,v}^s]_{3 \times 3}$ is constructed as follows: $a_{u,v}^s$ denotes the relative importance of objective $u$ compared to objective $v$, with $a_{u,v}^s = 1/a_{v,u}^s$ and $a_{u,u}^s = 1$ ($u, v \in \{R, D, \Gamma\}$). The largest eigenvalue $\lambda_{max}$ of $A$ and its corresponding normalized eigenvector $W$ can be calculated as follows

$$\bar{w}_u^s = \sqrt[3]{\prod_{v} a_{u,v}^s}, \quad u \in \{R, D, \Gamma\}, \tag{10a}$$

$$w_u^s = \bar{w}_u^s \Big/ \sum_{v} \bar{w}_v^s, \quad u \in \{R, D, \Gamma\}, \tag{10b}$$

$$W = [w_R^s, w_D^s, w_\Gamma^s]^T, \tag{10c}$$

$$\lambda_{max} = \sum_{u} \left(\sum_{v} a_{u,v}^s w_v^s \Big/ 3 w_u^s\right). \tag{10d}$$

Then we can use $w_u^s$ to represent the weight of the corresponding objective $u \in \{R, D, \Gamma\}$. Since AHP is a subjective evaluation method, the consistency of $A$ should be checked to guarantee accuracy. First, the consistency index C.I. is calculated as $(\lambda_{max} - |A|)/(|A| - 1)$, where $|A|$ denotes the order of the comparison matrix. The random consistency index R.I. is 0.58 when $|A| = 3$ [9]. The consistency constraint is C.I./R.I. < 0.1.

B. Modeling the Relay Selection As CG
Though AHP has helped us to evaluate the objective weights according to different traffic types, its weakness is that the choice of the comparison matrix is subjective. Therefore, we introduce the Shapley value in a coalitional game (CG) to reduce the subjectiveness of AHP. In the game, relay vehicles are modeled as players with different capabilities in terms of the optimization objectives in (9). When a coalition of players cooperates, a certain overall gain can be obtained from the coalition. The Shapley value calculates the contribution, or the relative importance, that each player makes to the overall gain [10].
First, we model relay vehicles as players and define the coalitional set as $S \subseteq \mathcal{J}$. To evaluate the capabilities to transmit the $n$th traffic flow, we define the game by $(S, \Phi_u)$, $u \in \{R, D, \Gamma\}$. $\Phi_u$ is the characteristic function of coalition $S$, which takes three different forms

$$\Phi_R(S) = \sum_{j \in S} R_{n,j}, \tag{11a}$$

$$\Phi_D(S) = \max_{j \in S}(D_{n,j}), \tag{11b}$$

$$\Phi_\Gamma(S) = \min_{j \in S}(\Gamma_{n,j}). \tag{11c}$$

$\Phi_R$, $\Phi_D$ and $\Phi_\Gamma$ respectively represent the transmission rate, delay and data dropping ratio when the relay vehicles in $S$ simultaneously transmit the same traffic flow. The Shapley value calculates the average marginal contribution (relative importance) of a certain player in the coalitional game as follows

$$\phi_{u,j} = \sum_{S\,(j \in S)} \frac{(J - |S|)!\,(|S| - 1)!}{J!} \big(\Phi_u(S) - \Phi_u(S - \{j\})\big), \tag{12}$$

where $|S|$ denotes the cardinality of set $S$. The Shapley value is expressed as a weighted sum of player $j$’s marginal contributions $\Phi_u(S) - \Phi_u(S - \{j\})$, under the assumption that the players join the coalition in a completely random order. $\frac{(J - |S|)!\,(|S| - 1)!}{J!}$ is a weighting factor that assigns an equal share of the generated contribution to each coalition of interest.
To this end, we can get a characteristic matrix $[\phi_{u,j}]_{3 \times J}$. To ensure that all the elements are the-smaller-the-better, we convert the first row of the matrix by replacing $\phi_{R,j}$ with $1/\phi_{R,j}$. Further, to make the values of different characteristic types comparable, we normalize the elements of $[\phi_{u,j}]_{3 \times J}$ by dividing each $\phi_{u,j}$ by $\max_j(|\phi_{u,j}|)$. To determine the most suitable relay vehicle, we introduce the definition of rationality degree (RD) by combining the results of AHP and CG as follows

$$RD_j = \sum_{u=1}^{3} w_u\, \phi_{u,j}, \tag{13}$$

where for each traffic flow, the relay vehicle with the minimum RD value is selected.

IV. SIMULATION AND ANALYSIS
The simulation parameters are listed in Table I. We conduct 1000 simulation periods and take the average value. Two benchmarks are used: transmission without relay and the opportunistic relay selection in [7]. Intelligent driving and infotainment traffic is generated with equal probability (50%) following a Poisson distribution. The traffic arrival rate is 0.8 Mbits/ms in Figs. 2(a), 3(a), 4(a) and the half-power beamwidth is 15° in Figs. 2(b), 3(b), 4(b).
From Figs. 3–4, we can observe that, compared to the relay selection in [7], the delay and data dropping ratio of the proposed scheme are reduced by 0.43 ms and 8.82% on average for infotainment and intelligent driving, respectively. While Fig. 2(a), (b) indicate that the benchmark in [7] achieves the
FAN et al.: TRAFFIC-AWARE RELAY VEHICLE SELECTION IN mmWAVE V2V COMMUNICATION 403
best rate performance. The reason is that the proposed scheme puts emphasis on different performances according to different traffic types, whereas the benchmark in [7] only considers SINR and ignores the traffic-to-performance preference.

An interesting observation drawn from Fig. 3(a) is that the proposed scheme tends to achieve the minimum delay at 15°–20° beamwidth, where the alignment delay decrease and the transmission delay increase approach comparable values. Wider beamwidths lead to lower alignment complexity but increase the transmission delay. This tradeoff between alignment delay and transmission delay directs our future work towards fine-grained beamwidth optimization jointly with relay selection. Note that the delay is upper-bounded by a maximum threshold of 5 ms set in the simulation.

In Fig. 4(a), the beamwidth has minimal effect on the data dropping ratio because the LOS relay transmission can guarantee successful data delivery within the stringent delay threshold, whereas the first baseline suffers a severe data dropping ratio due to LOS blockages. However, the data dropping ratio of the proposed scheme is subject to the relay vehicle's buffer size, causing the sudden increase of data loss in Fig. 4(b).

TABLE I
SIMULATION PARAMETERS

Fig. 2. Rate versus (a) Source vTx/Destination vRx beamwidth, (b) Average traffic arrival rate.

Fig. 3. Delay versus (a) Source vTx/Destination vRx beamwidth, (b) Average traffic arrival rate.

Fig. 4. Data dropping ratio versus (a) Source vTx/Destination vRx beamwidth, (b) Average traffic arrival rate.

V. CONCLUSION AND LIMITATIONS

In this letter, a traffic-aware relay selection is proposed to mitigate LOS blockages in mmWave V2V communications. AHP is introduced to model the traffic's preference on different network performances and CG is utilized to assess the relay's capability in each performance. Although AHP has helped us to quantify the traffic-to-performance preference, the assignment of the comparison matrix still lacks rationality. In the future, we will utilize learning theories to devise a self-adaptive score assignment mechanism to increase the comparison rationality.

REFERENCES

[1] A. D. Angelica. (2013). Google's Self-Driving Car Gathers Nearly 1 GB/Sec. [Online]. Available: http://www.kurzweilai.net/googles-self-driving-car-gathers-nearly-1-gbsec
[2] C. Perfecto, J. Del Ser, M. Bennis, and M. N. Bilbao, “Beyond WYSIWYG: Sharing contextual sensing data through mmWave V2V communications,” in Proc. Eur. Conf. Netw. Commun. (EuCNC), Oulu, Finland, 2017, pp. 1–6.
[3] J. Choi et al., “Millimeter-wave vehicular communication to support massive automotive sensing,” IEEE Commun. Mag., vol. 54, no. 12, pp. 160–167, Dec. 2016.
[4] M. Boban, X. Gong, and W. Xu, “Modeling the evolution of line-of-sight blockage for V2V channels,” in Proc. IEEE Veh. Technol. Conf. (VTC), Montreal, QC, Canada, 2016, pp. 1–7.
[5] C. Perfecto, J. Del Ser, and M. Bennis, “Millimeter-wave V2V communications: Distributed association and beam alignment,” IEEE J. Sel. Areas Commun., vol. 35, no. 9, pp. 2148–2162, Sep. 2017.
[6] S. Wu, R. Atat, N. Mastronarde, and L. Liu, “Improving the coverage and spectral efficiency of millimeter-wave cellular networks using device-to-device relays,” IEEE Trans. Commun., vol. 66, no. 5, pp. 2251–2265, May 2018.
[7] J. Deng, O. Tirkkonen, R. Freij-Hollanti, T. Chen, and N. Nikaein, “Resource allocation and interference management for opportunistic relaying in integrated mmWave/sub-6 GHz 5G networks,” IEEE Commun. Mag., vol. 55, no. 6, pp. 94–101, Jun. 2017.
[8] M. C. G. Quintero and P. A. C. Cuervo, “Intelligent driving assistant based on accident risk maps analysis and intelligent driving diagnosis,” in Proc. IEEE Intell. Veh. Symp. (IV), Los Angeles, CA, USA, 2017, pp. 914–919.
[9] R. W. Saaty, “The analytic hierarchy process—What it is and how it is used,” Math. Model., vol. 9, nos. 3–5, pp. 161–176, 1987.
[10] M. Pulido, J. S. Soriano, and N. Llorca, “Game theory techniques for university management: An extended bankruptcy model,” Ann. Oper. Res., vol. 109, nos. 1–4, pp. 129–142, 2002.
[11] J. Wildman, P. H. J. Nardelli, M. Latva-Aho, and S. Weber, “On the joint impact of beamwidth and orientation error on throughput in directional wireless Poisson networks,” IEEE Trans. Wireless Commun., vol. 13, no. 12, pp. 7072–7085, Dec. 2014.
[12] A. Yamamoto, K. Ogawa, T. Horimatsu, A. Kato, and M. Fujise, “Path-loss prediction models for intervehicle communication at 60 GHz,” IEEE Trans. Veh. Technol., vol. 57, no. 1, pp. 65–78, Jan. 2008.
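As an illustration of the relay-selection letter above, the AHP weighting in (10a)–(10d), the consistency check, and the RD rule in (13) can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names `ahp_weights` and `select_relay` are hypothetical, the comparison matrix is assumed to be 3×3 (so R.I. = 0.58), and the characteristic matrix passed to `select_relay` is assumed to be already inverted and normalized as described in the text.

```python
import math

RANDOM_INDEX_N3 = 0.58  # R.I. for a 3x3 comparison matrix, as quoted from [9]


def ahp_weights(A):
    """Return (weights, lambda_max, consistency_ratio) for a 3x3 matrix A.

    Follows the geometric-mean form of (10a)-(10d); names are illustrative.
    """
    n = len(A)
    # (10a): geometric mean of each row
    w_bar = [math.prod(row) ** (1.0 / n) for row in A]
    # (10b): normalize so that the weights sum to one
    total = sum(w_bar)
    w = [x / total for x in w_bar]
    # (10d): estimate of the largest eigenvalue
    lam = sum(
        sum(A[u][v] * w[v] for v in range(n)) / (n * w[u]) for u in range(n)
    )
    # Consistency index and ratio (valid here for n = 3 only)
    ci = (lam - n) / (n - 1)
    return w, lam, ci / RANDOM_INDEX_N3


def select_relay(w, phi):
    """Return (index, rd) where rd[j] = sum_u w[u] * phi[u][j], as in (13).

    phi is assumed to be 'smaller is better' and already normalized.
    """
    num_relays = len(phi[0])
    rd = [sum(w[u] * phi[u][j] for u in range(len(w))) for j in range(num_relays)]
    best = min(range(num_relays), key=rd.__getitem__)
    return best, rd


# Hypothetical, perfectly consistent comparison matrix (rate : delay : drop = 4 : 2 : 1)
A = [[1.0, 2.0, 4.0], [0.5, 1.0, 2.0], [0.25, 0.5, 1.0]]
weights, lam_max, cr = ahp_weights(A)
# Hypothetical normalized characteristic matrix for two candidate relays
phi = [[1.0, 0.5], [0.2, 1.0], [1.0, 0.4]]
best_relay, rd_values = select_relay(weights, phi)
```

A matrix passes the check in the text when the returned consistency ratio is below 0.1; the relay with the smallest RD value is the one selected.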
Decentralized Precoding for Cache-Enabled Ultra-Dense Radio Access Networks

Shiwen He, Member, IEEE, Yiyun Chen, Ju Ren, Member, IEEE, Yongming Huang, Senior Member, IEEE, Luxi Yang, Member, IEEE, and Yaoxue Zhang, Senior Member, IEEE

Abstract—This letter studies the fully decentralized optimization of the precoding matrices for the downlink of multiple-input-multiple-output cache-enabled ultra-dense radio access network. To avoid signalling exchange between coordinated edge radio access nodes (RANs), we first construct a virtual downlink cache-enabled ultra-dense radio access network. Then, a new concept of signal-to-interference-leakage-plus-noise ratio is defined. Based on this concept, a virtual weighted sum rate maximization problem is formulated to optimize the precoders. We then propose a computationally efficient decentralized optimization algorithm that achieves a stationary point of the problem. Numerical results show that the proposed algorithm can achieve more than 96.15% of the performance of the centralized optimization without any signalling exchange between coordinated edge RANs for multiple-input single-output cases.

Index Terms—Interference network, multiple-input multiple-output, edge caching, decentralized optimization.

Manuscript received September 1, 2018; revised September 28, 2018; accepted September 29, 2018. Date of publication October 3, 2018; date of current version April 9, 2019. This work was supported in part by the National Natural Science Foundation of China under Grant 61471120 and Grant 61711540305, and in part by the National Science and Technology Major Project of China under Grant 2018ZX03001002-003. The associate editor coordinating the review of this paper and approving it for publication was X. Chu. (Corresponding author: Yongming Huang.)
S. He, J. Ren, and Y. Zhang are with the School of Information Science and Engineering, Central South University, Changsha 410083, China, and also with the School of Information Technology, Jiangxi University of Finance and Economics, Nanchang 330032, China (e-mail: shiwen.he.hn@csu.edu.cn; renju@csu.edu.cn; zyx@csu.edu.cn).
Y. Chen, Y. Huang, and L. Yang are with the School of Information Science and Engineering, Southeast University, Nanjing 210096, China (e-mail: 220160742@seu.edu.cn; huangym@seu.edu.cn; lxyang@seu.edu.cn).
Digital Object Identifier 10.1109/LWC.2018.2873671

I. INTRODUCTION

THE INCREASING number of mobile devices and surging user demands for multi-media and delay-sensitive services have triggered massive amounts of data in wireless networks, which brings about a huge burden on fronthaul/backhaul links. Recently, caching of popular content during off-peak traffic periods at various levels of the wireless network architecture and network densification have emerged as promising ways to satisfy low latency demands [1]. From the perspective of network architecture, cache-enabled multiple-input multiple-output (MIMO) coordinated design is very similar to the traditional network MIMO and coordinated beamforming [2]. Unfortunately, to obtain the performance gain provided by network MIMO and coordinated beamforming, a large amount of information, such as the channel state information (CSI) or data streams of users, needs to be exchanged between radio access nodes (RANs). In the last decades, some decentralized optimization methods were designed for the conventional network MIMO system to maximize certain criteria, such as the weighted sum-rate maximization (WSRMax), and to reduce the amount of signalling exchange between coordinated RANs [3]–[5]. In [6], the high SINR approximation method was also used to decouple the WSR maximization problem.

Edge caching and signal processing at the networks are regarded as a promising way to reduce the burden on the fronthaul links and the delivery latency by caching files at the local cache in close proximity to the mobile terminals [7]–[9]. Park et al. [7] investigated the joint design of cloud and edge processing for cache-enabled radio access network. Cache-enabled physical layer security video transmission was studied and was shown to simultaneously achieve a low secrecy outage probability and a high power efficiency [8]. Hybrid content caching design that does not require the knowledge of content popularity was investigated for hierarchical network architecture [9]. One common point of the aforementioned methods is that signalling exchange per iteration among coordinated RANs is needed [3]–[5], [7]–[9].

In addition, in ultra-dense networks, the lack of a direct link (wireline or wireless) among RANs prohibits the signalling exchange per iteration between RANs [10]. Furthermore, signalling exchange per iteration between RANs requires strict synchronization and causes a considerable delay and signalling overhead that are determined by the convergence speed. It also implies that the conventional network MIMO technologies that do not consider edge caching and signal processing cannot satisfy the requirements of low latency and low fronthaul/backhaul overhead in wireless communication systems.

Aiming to reduce delivery latency, edge caching and signal processing generally limit the operation of edge nodes to non-cooperative transmission strategies. How to design a fully decentralized transmission scheme has become a key problem for cache-enabled ultra-dense communication systems. In this letter, we investigate the fully decentralized optimization of the precoding matrices for the downlink of cache-enabled ultra-dense radio access network. To avoid the additional delay caused by signalling exchange among coordinated edge RANs, we first construct a virtual coordinated downlink of cache-enabled ultra-dense radio access network and introduce a new concept of signal-to-interference-leakage-plus-noise ratio, which depends only on local CSI. Using local CSI, the optimization of the precoding matrices is formulated as a series of parallel WSRMax problems. Finally,
HE et al.: DECENTRALIZED PRECODING FOR CACHE-ENABLED ULTRA-DENSE RADIO ACCESS NETWORKS 405
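a fully decentralized optimization algorithm that is independently implemented at each edge RAN without any signalling exchange is proposed to address the considered problem.

II. SYSTEM MODEL

Consider the downlink K-cell multiuser MIMO time division duplex (TDD) cache-enabled ultra-dense radio access network where each cell includes one cache-enabled multiple-antenna RAN at the edge of the network and a plurality of users equipped with multiple antennas. Each RAN has cached the requested contents at its local cache and each user is associated with and served by only one RAN. Let $k$ and $l$ be the RAN indices, and $i$ and $j$ be the user indices. RAN $k$, $k \in \mathcal{K}_b = \{1, \ldots, K\}$, is equipped with $M_k$ transmit antennas and serves $I_k$ users located in cell $k$. Let us define $i_k$ to be the $i$-th user in cell $k$ and $N_{i_k}$ be the number of antennas of user $i_k$. The received signal of user $i_k$ is given by:

$$\mathbf{y}_{i_k} = \sum_{l \in \mathcal{K}_b} \sum_{j \in \mathcal{K}_u^l} \mathbf{H}_{i_k,l} \mathbf{W}_{j_l} \mathbf{x}_{j_l} + \mathbf{z}_{i_k}, \tag{1}$$

where $\mathcal{K}_u^k$ is the set of the users located in cell $k$, $\forall k \in \mathcal{K}_b$. $\mathbf{H}_{j_l,k} \in \mathbb{C}^{N_{j_l} \times M_k}$ denotes the flat fading channel coefficient between RAN $k$ and user $j_l$. $\mathbf{W}_{i_k} \in \mathbb{C}^{M_k \times d_{i_k}}$ and $d_{i_k}$ denote the precoding matrix and the number of data streams for user $i_k$, respectively. $\mathbf{x}_{i_k} \in \mathbb{C}^{d_{i_k} \times 1}$ denotes the information signal intended for user $i_k$ with distribution $\mathcal{CN}(\mathbf{0}, \mathbf{I}_{d_{i_k}})$. $\mathbf{z}_{i_k} \in \mathbb{C}^{N_{i_k} \times 1}$ denotes the additive white Gaussian noise with distribution $\mathcal{CN}(\mathbf{0}, \sigma_{i_k}^2 \mathbf{I}_{N_{i_k}})$ and $\mathbf{I}_d$ is the $d \times d$ identity matrix.

In this letter, we assume that there is no direct link among edge RANs, i.e., signalling exchange among coordinated edge RANs is impossible. Therefore, we need to explore the fully decentralized design of precoders without any signalling exchange among coordinated edge RANs for cache-enabled ultra-dense communication systems. To design the precoding matrices $\mathbf{W}_{i_k} \in \mathbb{C}^{M_k \times d_{i_k}}$, $\forall i \in \mathcal{K}_u^k$, $k \in \mathcal{K}_b$, the WSRMax problem is formulated as

$$\max_{\{\mathcal{W}_k\}} \sum_{k \in \mathcal{K}_b} \sum_{i \in \mathcal{K}_u^k} \alpha_{i_k} R_{i_k}, \quad \text{s.t.} \ \sum_{i \in \mathcal{K}_u^k} \mathrm{Tr}\big(\mathbf{W}_{i_k} \mathbf{W}_{i_k}^H\big) \le P_k, \tag{2}$$

where $\mathcal{W}_k$ denotes the set of all precoding matrices $\mathbf{W}_{i_k}$, and the weight $\alpha_{i_k}$ denotes the priority of user $i_k$, $\forall i \in \mathcal{K}_u^k$, $\forall k \in \mathcal{K}_b$. The achievable rate $R_{i_k}$ of user $i_k$ is given by:

$$R_{i_k} = \ln \det\big(\mathbf{I}_{N_{i_k}} + \mathbf{H}_{i_k,k} \mathbf{W}_{i_k} \mathbf{W}_{i_k}^H \mathbf{H}_{i_k,k}^H \boldsymbol{\Omega}_{i_k}^{-1}\big). \tag{3}$$

In (3), $\boldsymbol{\Omega}_{i_k}$ denotes the interference-plus-noise matrix including intra-cell inter-user interference, inter-cell inter-user interference, and the additive white Gaussian noise, given by

$$\boldsymbol{\Omega}_{i_k} = \sum_{j \in \mathcal{K}_u^k \setminus \{i\}} \mathbf{H}_{i_k,k} \mathbf{W}_{j_k} \mathbf{W}_{j_k}^H \mathbf{H}_{i_k,k}^H + \sum_{l \in \mathcal{K}_b \setminus \{k\}} \sum_{j \in \mathcal{K}_u^l} \mathbf{H}_{i_k,l} \mathbf{W}_{j_l} \mathbf{W}_{j_l}^H \mathbf{H}_{i_k,l}^H + \sigma_{i_k}^2 \mathbf{I}_{N_{i_k}}. \tag{4}$$

It is well known that problem (2) is non-convex and that its global optimal solution is difficult to obtain. Furthermore, problem (2) is in general solved in a centralized or decentralized way which needs signalling exchange per iteration among coordinated RANs [3]–[5]. Signalling exchange per iteration is prohibitive in practical communication systems, especially for delay-sensitive traffic in ultra-dense radio access network.

III. DECENTRALIZED OPTIMIZATION OF PRECODERS

In this section, we focus on exploring a fully decentralized optimization method to design the precoding matrices at each RAN independently by defining a virtual WSRMax (VWSRMax) problem. We assume that local CSI is available at each RAN, namely, RAN $k$ knows the local channel matrices $\mathbf{H}_{j_l,k}$, $\forall j \in \mathcal{K}_u^l$, $\forall l \in \mathcal{K}_b$, which can be obtained by exploiting the downlink-uplink reciprocity in TDD systems.

To realize the fully decentralized optimization of the precoding matrices and avoid signalling exchange among coordinated edge RANs, we construct a virtual downlink cache-enabled ultra-dense radio access network by regarding RAN $k$ as $I_k$ virtual RANs, $\forall k \in \mathcal{K}_b$. The corresponding channel coefficients are defined as $\mathbf{G}_{i_k,i_k} = [\mathbf{H}_{i_k,k}^H, \mathbf{0}_{M_k \times (\bar{N}_{i_k} - N_{i_k})}]^H$ and $\mathbf{G}_{j_k,i_k} = \mathbf{G}_{j_k,j_k}$, $\forall j \in \mathcal{K}_u^k \setminus \{i\}$, and the channel coefficient from virtual RAN $i_k$, $\forall i \in \mathcal{K}_u^k$, $\forall k \in \mathcal{K}_b$, to user $j_l$ served by virtual RAN $l$ is defined as

$$\mathbf{G}_{j_l,i_k} = \big[\mathbf{H}_{j_l,k}^H, \mathbf{0}_{M_k \times (\bar{N}_{i_k} - N_{j_l})}\big]^H, \quad \forall j \in \mathcal{K}_u^l, \ \forall l \in \mathcal{K}_b \setminus \{k\},$$

where $\mathbf{0}_{M \times N}$ is an $M \times N$ zero matrix and $\bar{N}_{i_k} = \max\big(N_{i_k}, \max_{l \in \mathcal{K}_b \setminus \{k\}} \max_{j \in \mathcal{K}_u^l} N_{j_l}\big)$. The virtual received signal at user $i_k$ can be defined as:

$$\tilde{\mathbf{y}}_{i_k} = \mathbf{G}_{i_k,i_k} \mathbf{W}_{i_k} \mathbf{x}_{i_k} + \sum_{j \in \mathcal{K}_u^k \setminus \{i\}} \mathbf{G}_{i_k,i_k} \mathbf{W}_{j_k} \mathbf{x}_{j_k} + \sum_{l \in \mathcal{K}_b \setminus \{k\}} \sum_{j \in \mathcal{K}_u^l} \mathbf{G}_{j_l,i_k} \mathbf{W}_{i_k} \mathbf{x}_{i_k} + \tilde{\mathbf{z}}_{i_k}, \tag{5}$$

where $\tilde{\mathbf{z}}_{i_k} = [\mathbf{z}_{i_k}^H, \mathbf{0}_{1 \times (\bar{N}_{i_k} - N_{i_k})}]^H$. Then, for RAN $k$, $\forall k \in \mathcal{K}_b$, a VWSRMax is defined as

$$\max_{\mathcal{W}_k} \sum_{i=1}^{I_k} \alpha_{i_k} \tilde{R}_{i_k}, \quad \text{s.t.} \ \sum_{i \in \mathcal{K}_u^k} \mathrm{Tr}\big(\mathbf{W}_{i_k} \mathbf{W}_{i_k}^H\big) \le P_k. \tag{6}$$

Note that problem (6) is not equivalent to problem (2). The virtual achievable rate $\tilde{R}_{i_k}$ of user $i_k$ is defined as:

$$\tilde{R}_{i_k} = \ln \det\big(\mathbf{I}_{\bar{N}_{i_k}} + \mathbf{G}_{i_k,i_k} \mathbf{W}_{i_k} \mathbf{W}_{i_k}^H \mathbf{G}_{i_k,i_k}^H \tilde{\boldsymbol{\Omega}}_{i_k}^{-1}\big). \tag{7}$$

In (7), $\tilde{\boldsymbol{\Omega}}_{i_k}$ denotes the interference-plus-noise matrix including the intra-cell inter-user interference, inter-cell leakage interference, and the noise covariance, which is different from the signal-to-leakage-noise ratio defined in [12], given by

$$\tilde{\boldsymbol{\Omega}}_{i_k} = \sum_{j \in \mathcal{K}_u^k \setminus \{i\}} \mathbf{G}_{i_k,i_k} \mathbf{W}_{j_k} \mathbf{W}_{j_k}^H \mathbf{G}_{i_k,i_k}^H + \sum_{l \in \mathcal{K}_b \setminus \{k\}} \sum_{j \in \mathcal{K}_u^l} \mathbf{G}_{j_l,i_k} \mathbf{W}_{i_k} \mathbf{W}_{i_k}^H \mathbf{G}_{j_l,i_k}^H + \sigma_{i_k}^2 \boldsymbol{\Phi}_{i_k}, \tag{8}$$

where matrix $\boldsymbol{\Phi}_{i_k} \in \mathbb{C}^{\bar{N}_{i_k} \times \bar{N}_{i_k}}$ is the covariance of $\tilde{\mathbf{z}}_{i_k}$. When the inter-cell leakage interference is not considered, i.e., removing the second term on the right side of (8), called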
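Proposed Algorithm (wo), problem (6) is simplified to maximizing the per-cell sum rate. Different from the research on the maximization of SLNR, the power allocation in this letter needs to be optimized for each user, i.e., $\mathrm{Tr}(\mathbf{W}_{i_k} \mathbf{W}_{i_k}^H)$, $i \in \mathcal{K}_u^k$, $k \in \mathcal{K}_b$, under the per-cell transmit power constraint.

Due to the coupling between optimization variables in the objective of problem (6), the problem is non-convex and its global optimum is difficult to obtain. Furthermore, unlike the optimization problem investigated in [5], we cannot directly use the relation between $\tilde{R}_{i_k}$ and the minimum mean square error (MMSE) of $\mathbf{x}_{i_k}$ to simplify the optimization problem (6). Therefore, to obtain a tractable form, we first introduce a virtual optimization problem, i.e.,

$$\min_{\mathcal{W}_k, \mathcal{U}_k} \sum_{i \in \mathcal{K}_u^k} \mathrm{Tr}\big(\mathbf{E}_{i_k}\big), \quad \text{s.t.} \ \sum_{i \in \mathcal{K}_u^k} \mathrm{Tr}\big(\mathbf{W}_{i_k} \mathbf{W}_{i_k}^H\big) \le P_k, \tag{9}$$

where the virtual MMSE matrix $\mathbf{E}_{i_k}$ is defined as:

$$\mathbf{E}_{i_k} = \mathbf{I}_{d_{i_k}} - \mathbf{U}_{i_k}^H \mathbf{G}_{i_k,i_k} \mathbf{W}_{i_k} - \mathbf{W}_{i_k}^H \mathbf{G}_{i_k,i_k}^H \mathbf{U}_{i_k} + \mathbf{U}_{i_k}^H \bar{\boldsymbol{\Omega}}_{i_k} \mathbf{U}_{i_k},$$

where $\mathbf{U}_{i_k} \in \mathbb{C}^{\bar{N}_{i_k} \times d_{i_k}}$ is an auxiliary variable, $\forall i \in \mathcal{K}_u^k$, and $\mathcal{U}_k$ is the set of all auxiliary variables. $\bar{\boldsymbol{\Omega}}_{i_k}$ is given by

$$\bar{\boldsymbol{\Omega}}_{i_k} = \mathbf{G}_{i_k,i_k} \mathbf{W}_{i_k} \mathbf{W}_{i_k}^H \mathbf{G}_{i_k,i_k}^H + \tilde{\boldsymbol{\Omega}}_{i_k}. \tag{10}$$

In problem (9), fixing the precoding matrices $\mathcal{W}_k$, the optimal solution of $\mathbf{U}_{i_k}$ is

$$\mathbf{U}_{i_k}^* = \bar{\boldsymbol{\Omega}}_{i_k}^{-1} \mathbf{G}_{i_k,i_k} \mathbf{W}_{i_k}. \tag{11}$$

Substituting (11) into the definition of $\mathbf{E}_{i_k}$, we have

$$\mathbf{E}_{i_k}^* = \mathbf{I}_{d_{i_k}} - \mathbf{W}_{i_k}^H \mathbf{G}_{i_k,i_k}^H \bar{\boldsymbol{\Omega}}_{i_k}^{-1} \mathbf{G}_{i_k,i_k} \mathbf{W}_{i_k}. \tag{12}$$

Applying the Woodbury matrix identity to (7), we have $\tilde{R}_{i_k} = \ln \det\big((\mathbf{E}_{i_k}^*)^{-1}\big)$ and can obtain the following conclusion.

Proposition 1: Let $\mathbf{V}_{i_k} \in \mathbb{C}^{d_{i_k} \times d_{i_k}} \succeq \mathbf{0}$ be a weight matrix for user $i_k$. Problem (6) is equivalent to the following problem:

$$\min_{\mathcal{W}_k, \mathcal{U}_k, \mathcal{V}_k} \sum_{i \in \mathcal{K}_u^k} \alpha_{i_k} \big(\mathrm{Tr}(\mathbf{V}_{i_k} \mathbf{E}_{i_k}) - \ln \det \mathbf{V}_{i_k}\big), \quad \text{s.t.} \ \sum_{i \in \mathcal{K}_u^k} \mathrm{Tr}\big(\mathbf{W}_{i_k} \mathbf{W}_{i_k}^H\big) \le P_k, \tag{13}$$

where $\mathcal{V}_k$ denotes the set of $\mathbf{V}_{i_k}$ for cell $k$.

In what follows, we focus on addressing problem (13) instead of directly solving problem (6). The objective function of problem (13) is not jointly convex with respect to the optimization variables $\mathcal{W}_k$, $\mathcal{U}_k$, and $\mathcal{V}_k$, but it is convex in each of them. Therefore, we propose to use the block coordinate descent method to solve problem (13). Specifically, we address problem (13) sequentially by fixing two of the three variables $\mathcal{W}_k$, $\mathcal{U}_k$, $\mathcal{V}_k$, and updating the third. Fixing the other variables, the optimal solution of $\mathbf{V}_{i_k}$ is

$$\mathbf{V}_{i_k}^* = \mathbf{E}_{i_k}^{-1} \tag{14}$$

and the optimal solution of $\mathbf{U}_{i_k}$ is given by (11). Now, we focus on optimizing problem (13) with fixed $\mathcal{U}_k$ and $\mathcal{V}_k$ via Lagrange duality theory. The Lagrange function is given by

$$L(\mathcal{W}_k, \lambda_k) = \sum_{i \in \mathcal{K}_u^k} \alpha_{i_k} \big(\mathrm{Tr}(\mathbf{V}_{i_k} \mathbf{E}_{i_k}) - \ln \det \mathbf{V}_{i_k}\big) + \lambda_k \Big(\sum_{i \in \mathcal{K}_u^k} \mathrm{Tr}\big(\mathbf{W}_{i_k} \mathbf{W}_{i_k}^H\big) - P_k\Big), \tag{15}$$

where $\lambda_k \ge 0$ is the Lagrange multiplier associated with the transmit power constraint of RAN $k$. The first-order optimality condition of $L(\mathcal{W}_k, \lambda_k)$ with respect to each $\mathbf{W}_{i_k}$ yields

$$\mathbf{W}_{i_k}^* = \alpha_{i_k} \boldsymbol{\Xi}_{i_k}^{-1} \mathbf{G}_{i_k,i_k}^H \mathbf{U}_{i_k} \mathbf{V}_{i_k}. \tag{16}$$

In (16), $\lambda_k$ should be chosen such that the complementary slackness condition of the power constraint is satisfied, and $\boldsymbol{\Xi}_{i_k}$ is

$$\boldsymbol{\Xi}_{i_k} = \lambda_k \mathbf{I}_{M_k} + \sum_{j \in \mathcal{K}_u^k} \alpha_{j_k} \mathbf{G}_{j_k,j_k}^H \mathbf{U}_{j_k} \mathbf{V}_{j_k} \mathbf{U}_{j_k}^H \mathbf{G}_{j_k,j_k} + \sum_{l \in \mathcal{K}_b \setminus \{k\}} \sum_{j \in \mathcal{K}_u^l} \alpha_{i_k} \mathbf{G}_{j_l,i_k}^H \mathbf{U}_{i_k} \mathbf{V}_{i_k} \mathbf{U}_{i_k}^H \mathbf{G}_{j_l,i_k}. \tag{17}$$

For simplicity, let $\mathbf{W}_{i_k}(\lambda_k)$ be the right side of (16). When $\sum_{i \in \mathcal{K}_u^k} \mathrm{Tr}\big(\mathbf{W}_{i_k}(0) \mathbf{W}_{i_k}(0)^H\big) \le P_k$ and the matrix $\boldsymbol{\Xi}_{i_k} - \lambda_k \mathbf{I}_{M_k}$ is invertible, then $\mathbf{W}_{i_k}^* = \mathbf{W}_{i_k}(0)$; otherwise we must have $\sum_{i \in \mathcal{K}_u^k} \mathrm{Tr}\big(\mathbf{W}_{i_k}(\lambda_k) \mathbf{W}_{i_k}(\lambda_k)^H\big) = P_k$. Let $\mathbf{Q}_{i_k} \boldsymbol{\Lambda}_{i_k} \mathbf{Q}_{i_k}^H$ be the eigenvalue decomposition of $\boldsymbol{\Xi}_{i_k} - \lambda_k \mathbf{I}_{M_k}$, then we have

$$\sum_{i \in \mathcal{K}_u^k} \mathrm{Tr}\Big(\big(\boldsymbol{\Lambda}_{i_k} + \lambda_k \mathbf{I}_{M_k}\big)^{-2} \boldsymbol{\Psi}_{i_k}\Big) = P_k, \tag{18}$$

where $\boldsymbol{\Psi}_{i_k} = \alpha_{i_k}^2 \mathbf{Q}_{i_k}^H \mathbf{G}_{i_k,i_k}^H \mathbf{U}_{i_k} \mathbf{V}_{i_k} \mathbf{V}_{i_k}^H \mathbf{U}_{i_k}^H \mathbf{G}_{i_k,i_k} \mathbf{Q}_{i_k}$. We further have

$$\sum_{i \in \mathcal{K}_u^k} \sum_{m=1}^{M_k} \frac{[\boldsymbol{\Psi}_{i_k}]_{m,m}}{\big([\boldsymbol{\Lambda}_{i_k}]_{m,m} + \lambda_k\big)^2} = P_k, \tag{19}$$

where $[\mathbf{A}]_{m,n}$ denotes the $(m, n)$-th entry of matrix $\mathbf{A}$. Note that the left side of (19) is a decreasing function of $\lambda_k$. Hence, the optimal $\lambda_k$ can be found by a one-dimensional search.

The detailed steps used to optimize problem (13) are summarized as Algorithm 1 for RAN $k$, $\forall k \in \mathcal{K}_b$, where $\eta$ is a predefined stop threshold. A similar procedure can be used to extend the algorithm in [12] to multi-cell multiuser power allocation systems, referred to as Extended leakage Algorithm.

Algorithm 1 Decentralized Optimization Algorithm
1: Initialize $\mathcal{W}_k$ such that the transmit power constraint is met. Compute $\tilde{R}_{i_k}$ with $\mathcal{W}_k$, $\forall i \in \mathcal{K}_u^k$, and let flag = 1.
2: while flag == 1 do
3: $\mathbf{U}_{i_k} \leftarrow \bar{\boldsymbol{\Omega}}_{i_k}^{-1} \mathbf{G}_{i_k,i_k} \mathbf{W}_{i_k}$.
4: $\mathbf{V}_{i_k} \leftarrow \big(\mathbf{I} - \mathbf{U}_{i_k}^H \mathbf{G}_{i_k,i_k} \mathbf{W}_{i_k}\big)^{-1}$.
5: $\mathbf{W}_{i_k} \leftarrow \alpha_{i_k} \boldsymbol{\Xi}_{i_k}^{-1} \mathbf{G}_{i_k,i_k}^H \mathbf{U}_{i_k} \mathbf{V}_{i_k}$ with optimal $\lambda_k$.
6: If $\big|\sum_{i \in \mathcal{K}_u^k} \big(\ln \det \mathbf{V}_{i_k} - \tilde{R}_{i_k}\big)\big| \le \eta$, let flag = 0, otherwise let $\tilde{R}_{i_k} = \ln \det \mathbf{V}_{i_k}$ and go to Step 3.
7: end while

Following a procedure similar to [5, Th. 3], we have:

Proposition 2: Any limit point $\mathcal{W}_k^*$, $\mathcal{U}_k^*$, and $\mathcal{V}_k^*$ of the iterations generated by Algorithm 1 is a stationary point of problem (13), and the corresponding $\mathcal{W}_k^*$ is a stationary point of problem (6). Conversely, if $\mathcal{W}_k^*$ is a stationary point of problem (6), then $\mathcal{U}_k^*$ calculated by (11), $\mathcal{V}_k^*$ given by (14), and $\mathcal{W}_k^*$ given by (16) are a stationary point of problem (13).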
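The one-dimensional search for $\lambda_k$ mentioned after (19) can be illustrated with a bisection. The following is a minimal sketch rather than the authors' implementation: the names `power_at` and `solve_lambda` are hypothetical, the diagonal entries $[\boldsymbol{\Psi}_{i_k}]_{m,m}$ and $[\boldsymbol{\Lambda}_{i_k}]_{m,m}$ of all users are assumed to be flattened into two plain lists, and the eigenvalues are assumed strictly positive so that the left side of (19) is finite and decreasing for $\lambda_k \ge 0$.

```python
def power_at(lam, psi, eig):
    """Left side of (19): sum of psi[m] / (eig[m] + lam)^2 over all entries."""
    return sum(p / (e + lam) ** 2 for p, e in zip(psi, eig))


def solve_lambda(psi, eig, power_budget, tol=1e-10, max_iter=200):
    """Bisection for lambda >= 0 such that power_at(lambda) == power_budget.

    Assumes psi[m] >= 0 and eig[m] > 0. Returns 0.0 when the power
    constraint already holds at lambda = 0 (inactive constraint).
    """
    if power_at(0.0, psi, eig) <= power_budget:
        return 0.0
    lo, hi = 0.0, 1.0
    # Grow the upper bracket until the transmit power drops below the budget
    while power_at(hi, psi, eig) > power_budget:
        hi *= 2.0
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if power_at(mid, psi, eig) > power_budget:
            lo = mid  # still too much power, lambda must increase
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


# Hypothetical scalar example: 4 / (1 + lambda)^2 = 1 gives lambda = 1
lam = solve_lambda([4.0], [1.0], 1.0)
```

Bisection is only one possible choice for the one-dimensional search; it is used here because the monotonicity of (19) in $\lambda_k$ makes the bracket update unambiguous.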
IV. NUMERICAL RESULTS

In this section, we investigate the performance of the proposed fully decentralized optimization method for cache-enabled RANs. Consider a cooperative cluster of $K = 3$ adjacent 120° sectors, each consisting of one RAN and multiple users. The cell radius is set to 300 m and all users are randomly distributed in the coverage area of each cell. The flat fading channel matrix $\mathbf{H}_{j_l,k}$ from RAN $k$ to user $j_l$ is generated as $\mathbf{H}_{j_l,k} = \sqrt{\theta_{j_l,k}}\, \mathbf{H}^w_{j_l,k}$, where $\mathbf{H}^w_{j_l,k}$ denotes the small-scale fading channel matrix whose entries are independent and identically distributed Gaussian with zero mean and unit variance, and the channel power $\theta_{j_l,k}$ is given as $\theta_{j_l,k} = 1/(1 + (d_{j_l,k}/d_0)^\beta)$ with $d_{j_l,k}$ denoting the distance between user $j_l$ and access node $k$. We set the parameters $d_0 = 30$ m and $\beta = 2.5$. All RANs have the same transmit power $P_k = P$, the same number of transmit antennas $M_k = M_t$, and serve the same number of users, $I_k = I$, $\forall k \in \mathcal{K}_b$. All users have the same number of receive antennas $N_{i_k} = N_r$, the same number of data streams $d_{i_k} = d$, and the weight factor $\alpha_{i_k}$ is unity, $\forall i \in \mathcal{K}_u^k$, $\forall k \in \mathcal{K}_b$. The noise variance $\sigma_{i_k}^2$ is assumed to be equal for all users, i.e., $\sigma^2$, $\forall i \in \mathcal{K}_u^k$, $\forall k \in \mathcal{K}_b$, and $\eta = 10^{-3}$.

Fig. 1 and Fig. 2 illustrate the average achievable sum rate (ASR) performance versus the SNR $P/\sigma^2$ for various precoding matrix optimization methods. Numerical results show that, without any signalling exchange between coordinated RANs, the proposed algorithm achieves more than 96.15% of the ASR of the centralized optimization method [5] in the MISO scenario. However, for the MIMO case, the centralized optimization method improves the system performance by up to 4.08% in terms of the ASR in the low-to-middle SNR region, at the cost of a large amount of signalling exchange. In the high SNR region, the interference suppression ability of the fully decentralized optimization method is clearly weaker. In addition, one can see that the proposed algorithm outperforms the distributed algorithm developed in [6] in terms of the average ASR. This is because the distributed algorithm developed in [6] is based on the assumption of high SNR and a single inter-cell interferer. Compared to the conventional zero-forcing method [11], the proposed algorithm treats the inter-cell interference as a whole, which provides more degrees of freedom to design better precoding matrices.

Fig. 1. ASR Comparisons, $M_t = 32$, $N_r = d = 1$, and $I = 10$.

Fig. 2. ASR Comparisons, $M_t = 16$, $N_r = I = 4$ and $d = 2$.

V. CONCLUSION

In this letter, a fully decentralized optimization of precoding matrices was investigated for the downlink cache-enabled ultra-dense radio access network. To avoid signalling exchange between coordinated RANs, a virtual downlink cache-enabled RAN was first introduced and then a new concept of signal-to-interference-leakage-plus-noise ratio was defined. Finally, a computationally efficient decentralized iterative algorithm that achieves a stationary point of the problem was presented. Numerical results show that in the low-to-middle SNR region, the presented algorithm can achieve more than 96.15% of the performance of the centralized algorithm without any signalling exchange between coordinated edge RANs.

REFERENCES

[1] A. Sengupta, R. Tandon, and O. Simeone, “Fog-aided wireless networks for content delivery: Fundamental latency tradeoffs,” IEEE Trans. Inf. Theory, vol. 63, no. 4, pp. 6650–6678, Oct. 2017.
[2] E. Björnson, N. J. Jaldén, M. Bengtsson, and B. Ottersten, “Optimality properties, distributed strategies, and measurement-based evaluation of coordinated multicell OFDMA transmission,” IEEE Trans. Signal Process., vol. 59, no. 12, pp. 6086–6101, Dec. 2011.
[3] Y. Huang et al., “Distributed multicell beamforming with limited intercell coordination,” IEEE Trans. Signal Process., vol. 59, no. 2, pp. 728–738, Feb. 2011.
[4] Y.-F. Liu, Y.-H. Dai, and Z.-Q. Luo, “Coordinated beamforming for MISO interference channel: Complexity analysis and efficient algorithms,” IEEE Trans. Signal Process., vol. 59, no. 3, pp. 1142–1157, Mar. 2011.
[5] Q. Shi, M. Razaviyayn, Z. Luo, and C. He, “An iteratively weighted MMSE approach to distributed sum-utility maximization for a MIMO interfering broadcast channel,” IEEE Trans. Signal Process., vol. 59, no. 9, pp. 4331–4340, Sep. 2011.
[6] H.-J. Choi, S.-H. Park, S.-R. Lee, and I. Lee, “Distributed beamforming techniques for weighted sum-rate maximization in MISO interfering broadcast channels,” IEEE Trans. Wireless Commun., vol. 11, no. 4, pp. 1314–1320, Apr. 2012.
[7] S.-H. Park, O. Simeone, and S. Shamai, “Joint optimization of cloud and edge processing for fog radio access networks,” IEEE Trans. Wireless Commun., vol. 15, no. 11, pp. 7621–7632, Nov. 2016.
[8] L. Xiang, D. K. Ng, R. Schober, and V. W. S. Wong, “Cache-enabled physical layer security for video streaming in backhaul-limited cellular networks,” IEEE Trans. Wireless Commun., vol. 17, no. 2, pp. 736–751, Feb. 2018.
[9] J. Kwak, Y. Kim, L. Le, and S. Chong, “Hybrid content caching in 5G wireless networks: Cloud versus edge caching,” IEEE Trans. Wireless Commun., vol. 17, no. 5, pp. 3030–3045, Jul. 2018.
[10] J. An et al., “Achieving sustainable ultra-dense heterogeneous networks for 5G,” IEEE Commun. Mag., vol. 55, no. 12, pp. 84–90, Dec. 2017.
[11] Q. H. Spencer, A. L. Swindlehurst, and M. Haardt, “Zero-forcing methods for downlink multiplexing in multiuser MIMO channels,” IEEE Trans. Signal Process., vol. 52, no. 2, pp. 461–471, Feb. 2004.
[12] M. Sadek, A. Tarighat, and A. H. Sayed, “A leakage-based precoding scheme for downlink multi-user MIMO channels,” IEEE Trans. Wireless Commun., vol. 6, no. 5, pp. 1711–1721, May 2007.
LoRa Throughput Analysis With Imperfect Spreading Factor Orthogonality

Antoine Waret, Megumi Kaneko, Alexandre Guitton, and Nancy El Rachkidy

Abstract—LoRa is one of the promising techniques for enabling low power wide area networks for future Internet-of-Things devices. Although LoRa allows flexible adaptations of coverage and data rates, it is subject to intrinsic types of interferences: co-SF interferences due to collisions among end-devices with the same spreading factor (SF), and inter-SF interferences due to collisions among devices with different SFs. Most current works have considered perfect orthogonality among different SFs. We thus provide a theoretical analysis of the achievable LoRa throughput in uplink, where all LoRa-specific capture conditions are included. Results show the accuracy of our analysis despite approximations, and the throughput losses from imperfect SF orthogonality, under different SF allocations. Our analysis will enable the design of specific SF allocation mechanisms, in view of further throughput enhancements.

Index Terms—LoRa, spreading factor, uplink throughput, imperfect orthogonality.

Manuscript received June 20, 2018; revised September 22, 2018; accepted September 24, 2018. Date of publication October 3, 2018; date of current version April 9, 2019. This work was supported in part by the Grant-in-Aid for Scientific Research (Kakenhi) from the Ministry of Education, Science, Sports, and Culture of Japan under Grant 17K06453, and in part by the NII Research Grant. The associate editor coordinating the review of this paper and approving it for publication was T. De Cola. (Corresponding author: Megumi Kaneko.)
A. Waret and M. Kaneko are with the National Institute of Informatics, Tokyo 101-8430, Japan (e-mail: antoine.waret@grenoble-inp.org; megkaneko@nii.ac.jp).
A. Guitton and N. El Rachkidy are with CNRS, LIMOS, University Clermont Auvergne, 63000 Clermont-Ferrand, France (e-mail: alexandre.guitton@uca.fr; nancy.el_rachkidy@uca.fr).
Digital Object Identifier 10.1109/LWC.2018.2873705

I. INTRODUCTION

AS THE amount of mobile data traffic will rapidly increase during the upcoming years (studies forecast 50 billion Internet of Things (IoT) devices by 2020), new spectrum access strategies adapted to high device densities are ever more crucial. LoRa [1] is one of the prominent candidates for Low Power Wide Area Networks (LPWANs), providing wide communication coverage with low power consumption, at the expense of data rate. Operating in license-free ISM bands (e.g., 868 MHz in Europe), the LoRa PHY layer uses a chirp spread-spectrum modulation where different Spreading Factors (SFs) tune the chirp modulation rates. Lower SFs such as SF7 allow for higher data rates but reduced transmission range, whereas higher SFs such as SF12 provide longer range at lower data rates. On top of the LoRa PHY layer, the higher layers were defined by the LoRa Alliance and are referred to as LoRaWAN. In particular, the MAC protocol is based on a pure ALOHA access with duty cycle limitations. The LoRaWAN network architecture is a star-like topology where end-devices communicate with gateways over several channels.

Most studies on LoRa scalability so far assumed a perfect orthogonality among SFs, thereby creating virtual channels where multiple users with different SFs could simultaneously operate in the same channel and hence boost the achievable system throughput. Thus, a number of works have considered the effect of co-SF interference only, where end-devices using the same SF on the same channel are subject to collisions [2], [3]. In particular, the outage probability of a LoRa system under co-SF interference was analyzed in [2], where a signal could be captured if its Signal-to-Interference-plus-Noise Ratio (SINR) was higher than 6 dB. As the number of devices increased, it was shown that those co-SF interferences were causing a scalability limit. However, recent studies have pointed out that SFs are not perfectly orthogonal among themselves [4]. Thus, the effect of inter-SF collisions was investigated through computer simulations and/or experiments. Namely, [4], [5] showed that inter-SF interferences could considerably decrease LoRa performance, especially for high SFs where frames have a greater time on air.

In this letter, we propose a theoretical analysis of the achievable throughput on the uplink of a LoRa network, encompassing the effects of co- and inter-SF interferences. To ensure a successful transmission, a packet must thus satisfy three conditions: 1) its SNR is above the reception threshold, 2) its SINR is above the co-SF capture threshold if there is co-SF interference, and 3) its SINR is above the inter-SF capture threshold if there is inter-SF interference. Considering two different types of SF allocations, we theoretically derive the achievable throughput expressions for both perfect and imperfect SF orthogonality. Simulation results show the accuracy of our analytical expressions despite approximations, as well as the impact of the various types of interferences and SF allocations on the overall system performance.

II. SYSTEM MODEL

We consider one cell of radius $R$ with one gateway located at its center, as depicted in Fig. 1. There are $N$ end-devices uniformly distributed within the cell. We denote by $d_i$ the distance from end-device $i$ to the gateway. Since the goal of our analysis is to derive the achievable rate by LoRa, we assume that all end-devices transmit in a single channel of bandwidth $BW = 125$ kHz and that they all have packets to transmit. This corresponds to the pure ALOHA access as in LoRaWAN with saturated traffic.1 We consider $M = 6$ SFs, for $m \in \mathcal{M}$, $\mathcal{M} = \{m_{\min}, \ldots, m_{\max}\}$, with $m_{\min} = 7$ and $m_{\max} = 12$, with symbol times $T_m = 2^m/BW$. The bit-rate $R_m$ of SF $m$ [1] is

$$R_m = \frac{m \times CR}{2^m / BW}, \tag{1}$$

where $CR$ is the coding rate defined as $4/(4+n)$ with $n \in \{1, 2, 3, 4\}$. Lower SFs allow higher data rate but lower communication range whereas higher SFs provide longer range at the expense of data rate (see Table I).

Two types of SF allocation will be investigated. In the first one, the SFs are uniformly distributed, i.e., every end-device

1 Our analysis can be easily applied to multiple channels and duty cycles.
WARET et al.: LoRa THROUGHPUT ANALYSIS WITH IMPERFECT SF ORTHOGONALITY 409
has a probability $p_m = \frac{1}{M}$ of selecting SF$_m$. We refer to this allocation as SF-random. In the second type of allocation, referred to as SF-distance, SFs are assigned according to the distance $d_i$. A device located inside the annulus defined by the smaller and larger circle radii $l_{m-1}$ and $l_m$, respectively, has SF$_m$. The distance threshold $l_m$ for SF$_m$ is given by $l_m = \left(\frac{P_0 A(f_c)}{\theta_{rx_m}}\right)^{1/\alpha}$, where $A(f_c) = (f_c^2 \times 10^{-2.8})^{-1}$ is the deterministic loss in the path loss model $L_i = \frac{A(f_c)}{d_i^{\alpha}}$, $f_c$ the carrier frequency and $\alpha$ the path loss exponent. $\theta_{rx_m}$ is the receiver sensitivity of SF$_m$ (see Table I). All nodes transmit at equal power $P_0$. We assigned to $l_6$ and $l_{12}$ the origin of the cell and its radius respectively, i.e., $l_6 = 0$ and $l_{12} = R$. The ranges for each SF are given in Table I for $\alpha = 4$, for urban scenarios as in [6]. The probability of selecting SF$_m$ for the SF-distance allocation is then given by $p_m = \int_{l_{m-1}}^{l_m} h(r)\,dr$, where $h(r)$ is the pdf of the position of an end-device in the cell at distance $r$ from the gateway. For uniform distribution of devices within a cell of radius $R$, we get $h(r) = \frac{2r}{R^2}$.

The instantaneous SNR $\gamma_i$ of end-device $i$ is defined as $\gamma_i = P_0 |h_i|^2 L_i / \sigma_n^2$, where $|h_i|^2$ is the channel gain between end-device $i$ and the gateway (for Rayleigh fading, $h_i \sim \mathcal{CN}(0, 1)$). $\sigma_n^2 = -174 + NF + 10\log_{10}(BW)$ [dBm] is the AWGN power and $NF$, the receiver noise figure.

Based on [4], it is assumed that in the event of a collision between frames of different SFs, one signal is received successfully if its SINR is higher than its "InterSF capture threshold" in Table I.^2 Moreover, if there are several signals with equal SFs transmitting on the same frequency simultaneously, the gateway is able to successfully receive one of them if its SINR is higher than 6 dB, for any SF$_m$ [2], [7]. Therefore, both types of interferences will be considered.

2 In [4], $q_{iSF_m}$ for each SF$_m$ depends on the colliding SF as only one colliding frame is assumed. We consider the worst case (the worst thresholds of [4] for each SF) as multiple frames with different SFs might collide.

Fig. 1. LoRa system setup - Case of SF-distance allocation.

TABLE I. LoRa characteristics at BW = 125 kHz, α = 4 and R = 1 km.

III. PROPOSED THROUGHPUT ANALYSIS

An end-device's uplink signal is successfully received at the gateway if the following conditions are jointly fulfilled:

1) Reception condition: Signal power must be above the SF-specific threshold $q_{SF_m}$,

$$P_{cap_{rx}}^{(i,m)} = P(\gamma_i \ge q_{SF_m}), \tag{2}$$

which is the probability that a received signal from end-device $i$ at a distance $d_i$ from the gateway has an SNR $\gamma_i$ above the threshold $q_{SF_m}$ (Table I).

2) Co-SF and Inter-SF capture conditions: The SINR of end-device $i$ is defined as

$$SINR_i = \frac{\gamma_i}{\Gamma_{coSF} + \Gamma_{iSF} + 1} = \frac{\gamma_i}{\sum_{j \in \mathcal{N}_{coSF}} \gamma_j + \sum_{k \in \mathcal{N}_{iSF}} \gamma_k + 1}. \tag{3}$$

$\mathcal{N}_{coSF}$ is the set of devices with the same SF as device $i$ and $\mathcal{N}_{iSF}$, the set of devices on other SFs. $\Gamma_{coSF}$ will be referred to as co-SF interference and $\Gamma_{iSF}$ as inter-SF interference. We need to consider three possible capture cases:

2-a) Co-SF capture only: under orthogonal SF assumptions as in previous works, or if there are no devices allocated to other SFs, the capture condition when $\mathcal{N}_{coSF} \neq \emptyset$ is given by

$$P_{cap_{coSF}}^{(i,m)} = P\left(\frac{\gamma_i}{\sum_{j \in \mathcal{N}_{coSF}} \gamma_j + 1} \ge q_{coSF}\right). \tag{4}$$

2-b) Inter-SF capture only: if only one device is allocated to each SF, device $i$ will be only subject to inter-SF interferences. The capture condition when $\mathcal{N}_{iSF} \neq \emptyset$ is thus

$$P_{cap_{iSF}}^{(i,m)} = P\left(\frac{\gamma_i}{\sum_{k \in \mathcal{N}_{iSF}} \gamma_k + 1} \ge q_{iSF_m}\right). \tag{5}$$

2-c) Co-SF and Inter-SF captures: in the general case, both co- and inter-SF interferences will be present, i.e., $\mathcal{N}_{coSF} \neq \emptyset$ and $\mathcal{N}_{iSF} \neq \emptyset$, giving the general condition

$$P_{cap_{co\&iSF}}^{(i,m)}(j) = P\left(\frac{\gamma_i}{\sum_{j \in \mathcal{N}_{coSF}} \gamma_j + \sum_{k \in \mathcal{N}_{iSF}} \gamma_k + 1} \ge \max\{q_{coSF_m}, q_{iSF_m}\}\right). \tag{6}$$

Defining $S_m$ as the event of successful frame transmission for a device with SF$_m$, the sum-rate is $R_m$ if only SF$_m$ has a successful transmission, $R_m + R_l$ if only (SF$_m$, SF$_l$) are successful, etc., up to $\sum_m R_m$ if all SFs are successful. Denoting $S^c_{\{i,\ldots,j\}}$ the joint event where SFs with indices in set $\{i, \ldots, j\}$ are all unsuccessful, and summing over all possible combinations of successful SFs, the sum-rate is expressed as

$$\tau = \sum_m R_m P(S_m; S^c_{\{\forall m' \neq m\}}) + \sum_{(m,l),\, m \neq l} (R_m + R_l) P(S_m, S_l; S^c_{\{\forall m' \neq m,l\}}) + \ldots + \sum_m R_m P(S_1, S_2, \ldots, S_M). \tag{7}$$

By decomposing for each $R_m$, we get a sum of weighted elements $\phi_m = R_m P_{succ}(SF_m)$, where $P_{succ}(SF_m) =$
$P(S_m; S^c_{\{\forall m' \neq m\}}) + \sum_{l \neq m} P(S_m, S_l; S^c_{\{\forall m' \neq m,l\}}) + \ldots + P(S_1, S_2, \ldots, S_M)$, i.e., the sum probabilities of all events with a success for SF$_m$, but any state for all other SF$_{m'}$. $P_{succ}(SF_m)$ is hence the marginal probability $P(S_m; S_{m'} \cup S^c_{m'}, \forall m' \neq m)$,^3 detailed in the following sections.

3 This can be also easily verified by direct calculation.

Finally, the uplink throughput $\tau$ can be expressed as

$$\tau = \sum_{m \in \mathcal{M}} R_m \times P_{succ}(SF_m), \tag{8}$$

where $R_m$ is the bit-rate of SF$_m$ and $P_{succ}(SF_m)$, the probability of a successful transmission.

Next, we analyze the throughput under perfect and imperfect SF-orthogonality, for the SF-distance case.^4

4 Analysis for SF-random case are omitted due to lack of space; they will be detailed in an extended journal version.

A. Perfect Orthogonality

We assume first that SFs are perfectly orthogonal: no end-device suffers inter-SF interferences. Hence, the probability of a successful transmission $P_{succ}(SF_m)$ of (8) is given by

$$P_{succ}(SF_m) = \sum_{j=1}^{N} \binom{N}{j} P_j(cap_{rx}, cap_{coSF}), \tag{9}$$

where $j$ denotes the total number of end-devices at SF$_m$ among $N$ and $P_j(cap_{rx}, cap_{coSF})$ is the joint probability for reception condition and co-SF capture.

1) For $j = 1$: there are no co-SF interferences, thus only the reception condition for device $i$ with SF$_m$ among $N-1$ devices with different SFs needs to be satisfied,

$$P_1(cap_{rx}, cap_{coSF}) = P_1(cap_{rx}) = (1 - p_m)^{N-1} P_{cap_{rx}}^{(i,m)}. \tag{10}$$

We determine $P_{cap_{rx}}^{(i,m)}$ for the SF-distance case. Given our assumptions, the SNR $\gamma_i$ is modeled as an exponential random variable with mean $\bar{\gamma}_i$. Therefore,

$$P_{cap_{rx}}^{(i,m)} = P(\gamma_i \ge q_{SF_m} \mid \bar{\gamma}_i) \times P(\bar{\gamma}_i), \tag{11}$$

where $q_{SF_m}$ is the specific threshold of SF$_m$. Defining $\bar{\gamma}_i = \frac{c}{r_i^{\alpha}}$, where $c = \frac{P_0 \times A(f_c)}{\sigma_n^2}$ is the path-loss constant, we obtain

$$P_{cap_{rx}}^{(i,m)} = \int_{l_{m-1}}^{l_m} \exp\left(-\frac{q_{SF_m} r_i^{\alpha}}{c}\right) \times \frac{2 r_i}{R^2}\,dr_i. \tag{12}$$

Although this integral cannot be expressed in closed form, it can be efficiently determined by numerical methods.

2) For $j \ge 2$: both the reception and co-SF conditions must be fulfilled. As $q_{SF_m} \le 1$ in linear for all SFs whereas $q_{coSF} = 4$ (6 dB) for all SFs as explained in Section II, if co-SF capture is satisfied, so is the reception condition, hence

$$P_j(cap_{rx}, cap_{coSF}) = P_j(cap_{coSF}) = (1 - p_m)^{N-j} P_{cap_{coSF}}^{(i,m)}(j). \tag{13}$$

In case of co-SF interferences, there are $j-1$ interferers, $P_{cap_{coSF}}^{(i,m)}(j) = P\left(\frac{\gamma_i}{\sum_{k=1}^{j-1} \gamma_k + 1} \ge q_{coSF}\right)$, which is developed using random instantaneous SNR variables $\gamma_k$ and random average SNR (position) variables $\bar{\gamma}_k$ as $P\left(\frac{\gamma_i}{\sum_{k=1}^{j-1} \gamma_k + 1} \ge q_{coSF} \mid \gamma_1, \ldots, \gamma_i, \ldots, \gamma_{j-1}, \bar{\gamma}_1, \ldots, \bar{\gamma}_{j-1}\right) \times P(\gamma_1, \ldots, \gamma_i, \ldots, \gamma_{j-1}, \bar{\gamma}_1, \ldots, \bar{\gamma}_{i-1}, \bar{\gamma}_{i+1}, \ldots, \bar{\gamma}_{j-1})$.

Marginalizing over $\gamma_1, \ldots, \gamma_{j-1}$ and making the change of variable $\bar{\gamma}_i = \frac{c}{r_i^{\alpha}}$, we get by independency of user channels,

$$P_{cap_{coSF}}^{(i,m)}(j) = \int_{l_{m-1}}^{l_m} e^{-\frac{q_{coSF} r_i^{\alpha}}{c}} \prod_{p=1,\, p \neq i}^{j-1} \int_{l_{m-1}}^{l_m} \frac{h(r_p)\,dr_p}{1 + q_{coSF}\left(\frac{r_i}{r_p}\right)^{\alpha}}\; h(r_i)\,dr_i. \tag{14}$$

We define $I(r_i) = \int_{l_{m-1}}^{l_m} \frac{h(r)\,dr}{1 + q_{coSF}\left(\frac{r_i}{r}\right)^{\alpha}}$. Since all user positions, i.e., distance random variable $r_p$ for user $p$, are statistically equivalent as they follow the same probability density function $h(r)$, we can write $\prod_{p=1}^{j-1} I(r_i) = [I(r_i)]^{j-1}$. After similar derivations as [8], the expression becomes

$$P_{cap_{coSF}}^{(i,m)}(j) = \int_{l_{m-1}}^{l_m} \exp\left(-\frac{q_{coSF} r_i^{\alpha}}{c}\right) [I(r_i)]^{j-1} h(r_i)\,dr_i. \tag{15}$$

In particular, for $\alpha = 4$, the primitive function of $I(r_i)$ is

$$J(r_i, r) = \frac{r^2}{R^2} - \frac{r_i^2}{R^2}\sqrt{q_{coSF}}\,\arctan\left(\frac{r^2}{r_i^2 \sqrt{q_{coSF}}}\right). \tag{16}$$

In case of co-SF interferences, the interferers are the end-devices with the same SF as end-device $i$, i.e., with the same distance boundaries. The expression of $I(r_i)$ becomes

$$I(r_i) = J(r_i, l_m) - J(r_i, l_{m-1}). \tag{17}$$

Therefore, (9) can be written for SF-distance allocation with perfect SF orthogonality as,

$$P_{succ}(SF_m) = \binom{N}{1}(1 - p_m)^{N-1} \times P_{cap_{rx}}^{(i,m)} + \sum_{j=2}^{N} \binom{N}{j}(1 - p_m)^{N-j} \times P_{cap_{coSF}}^{(i,m)}(j), \tag{18}$$

with $P_{cap_{rx}}^{(i,m)}$ given in (12) and $P_{cap_{coSF}}^{(i,m)}(j)$ in (15).

B. Imperfect Orthogonality

In reality, SFs are not perfectly orthogonal, so all three capture conditions are to be satisfied to achieve a successful transmission. Thus, $P_{succ}(SF_m)$ of (8) becomes

$$P_{succ}(SF_m) = \sum_{j=1}^{N} \binom{N}{j} P_j(cap_{rx}, cap_{coSF}, cap_{iSF}), \tag{19}$$

where $P_j(cap_{rx}, cap_{coSF}, cap_{iSF})$ is the joint probability for reception condition, co-SF capture and inter-SF capture, when there are $j$ devices among $N$ with SF$_m$.

1) For $j = 1$: the device is only subject to inter-SF capture and reception conditions. From Table I, the dominant condition depends on each SF$_m$. Thus, we may approximate

$$P_1(cap_{rx}, cap_{coSF}, cap_{iSF}) = P_1(cap_{rx}, cap_{iSF}) \approx \min\left((1 - p_m)^{N-1} P_{cap_{rx}}^{(i,m)},\; P_{cap_{iSF}}^{(i,m)}(1)\right). \tag{20}$$

The expression of $P_{cap_{iSF}}^{(i,m)}(j)$ is similar to $P_{cap_{coSF}}^{(i,m)}(j)$, but with different thresholds and number of interferers. Inter-SF interferences are caused by the devices that are not in the annulus corresponding to SF$_m$, i.e., they have different SFs. If $j$ is the number of devices at SF$_m$, there are $N-j$ devices with other SFs. Therefore,

$$P_{cap_{iSF}}^{(i,m)}(j) = \int_{l_{m-1}}^{l_m} \exp\left(-\frac{q_{iSF_m} r_i^{\alpha}}{c}\right) [I(r_i)]^{N-j} h(r_i)\,dr_i. \tag{21}$$
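As an illustration of how the integrals in (12) and (15) to (17) might be evaluated numerically for $\alpha = 4$, the following Python sketch uses a plain midpoint rule. The function names, the integrator, and the number of integration steps are assumptions made for this example and are not taken from the letter; `c` stands for the path-loss constant defined with (12), and all thresholds are in linear scale.

```python
import math

def h(r, R):
    """pdf of the distance r for devices uniform in a disc of radius R."""
    return 2.0 * r / R**2

def J(r_i, r, q, R):
    """Primitive function (16) for alpha = 4, with a linear threshold q."""
    if r == 0.0:
        return 0.0
    s = math.sqrt(q) * r_i**2
    return (r**2 - s * math.atan(r**2 / s)) / R**2

def integrate(f, a, b, n=2000):
    """Composite midpoint rule (an assumed, simple choice of integrator)."""
    step = (b - a) / n
    return step * sum(f(a + (k + 0.5) * step) for k in range(n))

def p_cap_rx(l_lo, l_hi, q_sf, c, R, alpha=4):
    """Reception probability (12) over the annulus [l_lo, l_hi]."""
    return integrate(lambda r: math.exp(-q_sf * r**alpha / c) * h(r, R),
                     l_lo, l_hi)

def p_cap_cosf(j, l_lo, l_hi, q_co, c, R):
    """Co-SF capture probability (15), with I(r_i) taken from (17)."""
    def integrand(r_i):
        I = J(r_i, l_hi, q_co, R) - J(r_i, l_lo, q_co, R)
        return math.exp(-q_co * r_i**4 / c) * I**(j - 1) * h(r_i, R)
    return integrate(integrand, l_lo, l_hi)

if __name__ == "__main__":
    # Hypothetical annulus and path-loss constant, for illustration only.
    print(p_cap_rx(0.2, 0.5, 0.5, 1.0, 1.0))
    print(p_cap_cosf(3, 0.2, 0.5, 4.0, 1.0, 1.0))
```

The midpoint rule never evaluates the integrand at $r_i = 0$, which avoids the division by zero inside the arctangent of (16) when the innermost annulus starts at the gateway.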
In (21), $I(r_i) = \int_{\mathcal{R} \setminus \mathcal{R}_m} \frac{h(r)\,dr}{1 + q_{iSF_m}\left(\frac{r_i}{r}\right)^{\alpha}}$, where $\mathcal{R} \setminus \mathcal{R}_m$ is the whole cell area excluding the area of SF$_m$. From (16), we get

$$I(r_i) = J(r_i, R) - J(r_i, 0) - \left[J(r_i, l_m) - J(r_i, l_{m-1})\right]. \tag{22}$$

$J(r_i, r)$ is given by Eq. (16) by replacing $q_{coSF}$ by $q_{iSF_m}$.

2) For $j \ge 2$: all capture conditions are to be considered. As in the perfect orthogonality case, the reception condition derives from the co-SF capture condition, thus

$$P_j(cap_{rx}, cap_{coSF}, cap_{iSF}) = P_j(cap_{coSF}, cap_{iSF}) \doteq P_{cap_{co\&iSF}}^{(i,m)}(j). \tag{23}$$

As there are both co-SF and inter-SF interferences, we can write from (6), $P_{cap_{co\&iSF}}^{(i,m)}(j) = P\left(\frac{\gamma_i}{\sum_{p=1}^{j-1} \gamma_p + \sum_{k=j+1}^{N-j} \gamma_k + 1} \ge \max(q_{coSF}, q_{iSF_m})\right)$. As $\max(q_{coSF}, q_{iSF_m}) = q_{coSF}$ for all SF$_m$, we can derive

$$P_{cap_{co\&iSF}}^{(i,m)}(j) = \int_{l_{m-1}}^{l_m} e^{-\frac{q_{coSF} r_i^{\alpha}}{c}} [I(r_i)]^{j-1} [\hat{I}(r_i)]^{N-j} h(r_i)\,dr_i, \tag{24}$$

where $\hat{I}(r_i) = \int_{\mathcal{R} \setminus \mathcal{R}_m} \frac{h(r)\,dr}{1 + q_{coSF}\left(\frac{r_i}{r}\right)^{\alpha}}$, hence

$$\hat{I}(r_i) = J(r_i, R) - J(r_i, 0) - \left[J(r_i, l_m) - J(r_i, l_{m-1})\right], \tag{25}$$

with $J(r_i, r)$ defined in (16).

Finally, using $P_{cap_{co\&iSF}}^{(i,m)}(j)$ in (24), (19) can be written for SF-distance allocation with imperfect orthogonality as,

$$P_{succ}(SF_m) = \binom{N}{1} \min\left((1 - p_m)^{N-1} P_{cap_{rx}}^{(i,m)},\; P_{cap_{iSF}}^{(i,m)}(1)\right) + \sum_{j=2}^{N} \binom{N}{j} P_{cap_{co\&iSF}}^{(i,m)}(j). \tag{26}$$

IV. NUMERICAL RESULTS

Simulation parameters are $f_c = 868$ MHz, $BW = 125$ kHz and transmit power $P_0 = 14$ dBm. The path loss exponent was set to $\alpha = 4$ as in [2] (urban) and $R = 1$ km [6]. Simulation results are averaged over 100000 Rayleigh fading channel realizations and uniformly distributed device positions.

Fig. 2 shows the simulated and analytical throughput performances for SF-distance and SF-random allocations against varying numbers of devices transmitting simultaneously. We observe that our derived throughput expressions approach almost perfectly the simulation results, showing the validity of our analysis despite approximation. Next, for SF-distance, we can see that inter-SF interferences cause a notable decrease of performance compared to perfect orthogonality. However, as the number of devices increases, the gap narrows down, as co-SF interferences lead to the scalability limit. These results show the impact of imperfect SF orthogonality over the system throughput, up to 50% loss. Note that all devices transmit at 100% duty cycle. So at 1% duty cycle, the number of end-devices in the figure is 100-fold.

Fig. 2. Throughput performance of SF-distance and SF-random allocations.

Next, we compare SF-distance and SF-random allocations in the imperfect case. For lower numbers of devices, a significantly higher throughput is achieved with SF-distance (up to 100% gain), as devices are more likely to satisfy SINR capture thresholds. On the other hand, the gap between SF-distance and SF-random allocations tightens for larger $N$, as SF-distance performance is hindered due to higher densities of co-SF devices. These results suggest that even a simple SF-random policy can provide similar throughput levels for a large number of devices. Note that the advantage of SF-random allocation is that distance knowledge is not required at each device, nor at the gateway, for SF attribution. Our analysis will be useful to devise new SF allocation and MAC policies encompassing the effects of inter-SF interferences, given system requirements.

V. CONCLUSION

We have considered the uplink of a single gateway LPWAN based on LoRa physical layer, for which theoretical throughput expressions were derived. Unlike most previous works, our analysis encompasses all three conditions required for successful transmission: SNR reception level, SINR level for co-SF and inter-SF captures. Results have shown the non-negligible impact of SF imperfect orthogonality, as well as the effects of SF allocations on the overall throughput. Our analytical framework hence provides a precious tool for designing tailored SF allocations depending on environments and requirements, by predicting their impact on system performance.

REFERENCES

[1] LoRa Modulation Basics, AN1200.22, Revision 2, Semtech Corporation, Camarillo, CA, USA, May 2015. [Online]. Available: www.semtech.com
[2] O. Georgiou and U. Raza, "Low power wide area network analysis: Can LoRa scale?" IEEE Wireless Commun. Lett., vol. 6, no. 2, pp. 162–165, Apr. 2017.
[3] M. C. Bor, U. Roedig, T. Voigt, and J. M. Alonso, "Do LoRa low-power wide-area networks scale?" in Proc. ACM MSWiM, Valletta, Malta, Nov. 2016, pp. 59–67.
[4] D. Croce, M. Gucciardo, I. Tinnirello, D. Garlisi, and S. Mangione, "Impact of spreading factor imperfect orthogonality in LoRa communications," in Towards a Smart and Secure Future Internet, vol. 766. Cham, Switzerland: Springer, Sep. 2017.
[5] G. Zhu, C.-H. Liao, M. Suzuki, Y. Narusue, and H. Morikawa, "Evaluation of LoRa receiver performance under co-technology interference," in Proc. IEEE CCNC, 2018, pp. 1–7.
[6] M. Centenaro, L. Vangelista, A. Zanella, and M. Zorzi, "Long-range communications in unlicensed bands: The rising stars in the IoT and smart city scenarios," IEEE Wireless Commun., vol. 23, no. 5, pp. 60–67, Oct. 2016.
[7] C. Goursaud and J.-M. Gorce, "Dedicated networks for IoT: PHY/MAC state of the art and challenges," EAI Endorsed Trans. IoT, vol. 1, no. 1, pp. 15–26, 2015.
[8] M. Zorzi and R. R. Rao, "Capture and retransmission control in mobile radio," IEEE J. Sel. Areas Commun., vol. 12, no. 8, pp. 1289–1298, Oct. 1994.
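For reference, the bit-rate of (1), the SF-distance selection probabilities $p_m$ obtained from $h(r) = 2r/R^2$, and the throughput sum (8) of the LoRa analysis above can be sketched in Python as follows. The function names and the example annulus boundaries are hypothetical; the actual ranges of Table I are not reproduced here, and the success probabilities passed to `throughput` would come from (18) or (26).

```python
def bit_rate(m, bw_hz=125e3, n=1):
    """Bit-rate R_m of (1) in bit/s, with coding rate CR = 4 / (4 + n)."""
    cr = 4.0 / (4.0 + n)
    return m * cr * bw_hz / 2**m

def sf_distance_probs(bounds, R):
    """p_m for SF-distance allocation with h(r) = 2r / R^2.

    bounds = [l_6, l_7, ..., l_12], with l_6 = 0 and l_12 = R.
    """
    return [(hi**2 - lo**2) / R**2 for lo, hi in zip(bounds, bounds[1:])]

def throughput(p_succ, bw_hz=125e3, n=1):
    """Throughput (8): sum of R_m * P_succ(SF_m) for SF7 to SF12."""
    return sum(bit_rate(m, bw_hz, n) * p
               for m, p in zip(range(7, 13), p_succ))

if __name__ == "__main__":
    # Hypothetical annulus boundaries in km (assumption, not Table I).
    bounds = [0.0, 0.3, 0.45, 0.6, 0.75, 0.9, 1.0]
    print([round(bit_rate(m), 1) for m in range(7, 13)])
    print(sf_distance_probs(bounds, 1.0))
```

With the default $n = 1$ (CR = 4/5) and BW = 125 kHz, `bit_rate(7)` evaluates (1) to roughly 5.47 kbit/s, and the rate roughly halves with each increment of the SF.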
An Adaptive Optimal Mapping Selection Algorithm for PNC Using Variable QAM Modulation

Tong Peng, Yi Wang, Alister G. Burr, and Mohammad Shikh-Bahaei

Abstract: Fifth generation wireless networks will need to serve much higher user densities than existing 4G networks, and will therefore require an enhanced radio access network (RAN) infrastructure. Physical layer network coding (PNC) has been shown to enable such high densities with much lower backhaul load than approaches, such as Cloud-RAN and coordinated multipoint. In this letter, we present an engineering applicable PNC scheme which allows different cooperating users to use different modulation schemes, according to the relative strength of their channels to a given access point. This is in contrast with compute-and-forward and previous PNC schemes which are designed for the two-way relay channel. A two-stage search algorithm to identify the optimum PNC mappings for given channel state information and modulation is proposed in this letter. Numerical results show that the proposed scheme achieves low bit error rate with reduced backhaul load.

Index Terms: Adaptive PNC, industrial applicable, backhaul load, unambiguous detection.

Manuscript received May 14, 2018; revised August 14, 2018 and September 14, 2018; accepted September 30, 2018. Date of publication October 4, 2018; date of current version April 9, 2019. This work was supported in part by EPSRC NetCoM Project under Grant EP/K040006/1, and in part by EPSRC IoSIRE Project under Grant EP/P022723/1. The associate editor coordinating the review of this paper and approving it for publication was K. Adachi. (Corresponding author: Tong Peng.)
T. Peng and M. Shikh-Bahaei are with the Centre for Telecommunications Research, Department of Informatics, King's College London, London WC2R 2LS, U.K. (e-mail: tong.peng@kcl.ac.uk; m.sbahaei@kcl.ac.uk).
Y. Wang and A. G. Burr are with the Department of Electronics, University of York, York YO8 8QF, U.K. (e-mail: yi.wang@york.ac.uk; alister.burr@york.ac.uk).
Digital Object Identifier 10.1109/LWC.2018.2874052

I. INTRODUCTION

THE CONCEPT of network multiple input, multiple output (N-MIMO) [1] has been known for some time as a means to overcome the inter-cell interference in fifth generation (5G) dense cellular networks, by allowing multiple access points (APs) to cooperate to serve multiple mobile terminals (MTs). This was implemented in the coordinated multipoint (CoMP) approach standardized in Long Term Evolution (LTE)-A, and the Cloud Radio Access Network (C-RAN) concept has been proposed to achieve similar goals [2]. However these approaches result in large loads on the backhaul network (also referred to as fronthaul in C-RAN) between APs and the central processing unit (CPU), many times the total user data rate.

While there has been previous work addressing backhaul load reduction in CoMP and C-RAN, using, for example, Wyner-Ziv compression [3] or compressive sensing [4], the resulting total backhaul load remains typically several times the total users' data rate. A novel approach was introduced in [5] and [6], based on physical layer network coding (PNC) and its more general form called lattice network coding [7], which reduces the total backhaul load to a level in the range of the total users' data rate. However the research on PNC has mostly focused on two-way relay channel (TWRC) application [8], [9] or lattice code-based PNC design [10], which have significant disadvantages in terms of engineering applicability in N-MIMO networks. Moreover all these schemes have assumed that the same modulation order is used by all MTs sharing an AP, and that the signal strength at an AP is the same from all the MTs.

Different from our previous work in [12], in which a PNC scheme for MTs using a quadrature amplitude modulation (QAM) with the same modulation order and equal transmission power is presented, we consider a more practical scenario in which the MTs may transmit with different powers in this letter. In this case, different modulation schemes are employed at each MT to maintain the required bit error rate (BER) performance. In the simulations, a comparison of the proposed algorithm with the CoMP approaches is given, along with the impact of the estimated channel coefficients on the accuracy of optimal matrix selection. The primary contributions of this letter are summarized as follows: (1) an adaptive mapping selection algorithm when a different QAM scheme is employed at each MT; (2) binary mapping matrices are employed so that the proposed algorithm can be implemented in current practical systems to achieve engineering applicability; (3) the dimension of mapping matrices is minimised to achieve lower computational complexity and higher degree of freedom.

Fig. 1. The uplink system diagram.

II. SYSTEM MODEL

A two-stage uplink system model for N-MIMO is illustrated in Fig. 1, where $u$ MTs are served by $n$ APs, and the APs are connected to a CPU via a backhaul network. A single antenna is provided at all MTs and APs. The first stage in the uplink is called the multi-access stage, in which all MTs broadcast symbols to APs at the same time. Each AP maps the superimposed signal to a PNC codeword vector and forwards it to the CPU via a backhaul network. The multi-access stage is assumed
to use wireless communications, whilst the backhaul link is a lossless but capacity-limited 'bit-pipe'. Note that the backhaul link can be wireless or fixed and the proposed scheme can be implemented in both cases.

A $2^m$-QAM scheme is employed at each MT with a modulation function of $\mathcal{M}: \mathbb{F}_{2^m} \rightarrow \Omega$, where $m$ stands for a modulation order, in bits per symbol, and $\Omega$ stands for the set of all possible modulated symbols. In the multi-access stage, $u_j$ MTs are served by an AP where the MTs transmit their signals simultaneously, yielding a received signal at the $j$th AP of

$$r_j = \sum_{i=1}^{u_j} h_{j,i} s_i + z_j, \quad \text{for } j = 1, 2, \ldots, n, \tag{1}$$

where $h_{j,i}$ denotes the complex Gaussian distributed random channel coefficient between the $j$th AP and the $i$th MT, and $s_i = \mathcal{M}(\mathbf{b}_i)$ is the $2^m$-QAM signal, where $\mathbf{b}_i$ denotes the binary data vector with a dimension of $m \times 1$. $z_j$ denotes the additive white Gaussian noise with zero mean and variance $\sigma^2$.

For the purpose of industrial applicability, binary mapping matrices are employed in the proposed PNC encoder and decoder. Define a mapping function $\mathbf{G}_j$ at the $j$th AP, the PNC encoding is then given by

$$\mathbf{x}_j = \mathbf{G}_j \otimes \mathbf{b}, \tag{2}$$

where $\mathbf{b} \triangleq [\mathbf{b}_1, \mathbf{b}_2, \ldots, \mathbf{b}_{u_j}]^T$ denotes the $m_s \times 1$ binary joint message vector with $\mathbf{b} \in \mathbb{F}_2^{m_s \times 1}$ and $m_s = \sum_{i=1}^{u_j} m_i$, where $m_i$ denotes the modulation order at the $i$th MT. $\mathbf{G}_j$ denotes a binary matrix with dimensions of $l_j \times m_s$, where $l_j$ denotes the number of rows of the mapping matrix used at the $j$th AP, and $\otimes$ denotes matrix multiplication over $\mathbb{F}_2$. $\mathbf{x}_j \in \mathbb{F}_2^{l_j \times 1}$ is the network codeword vector (NCV) detected by the AP which consists of $l_j$ linear combinations of the original binary data.

We define $\mathbf{s}_{c_j}$ as a set which contains all possible combinations of the modulated signals from the MTs, where $u_j$ denotes the number of MTs served by the $j$th AP. Given a $1 \times u_j$ channel coefficient vector $\mathbf{h}_j \triangleq [h_{j,1}, h_{j,2}, \ldots, h_{j,u_j}]$ at the $j$th AP, the vector containing all $u_j^{m_s}$ possible superimposed signals can be calculated by

$$\mathbf{s}_{j,sc} \triangleq [s_{j,sc}^{(1)}, s_{j,sc}^{(2)}, \ldots, s_{j,sc}^{(u_j^{m_s})}] = \mathbf{h}_j \mathbf{s}_{c_j}, \quad \text{for } j = 1, 2, \ldots, n. \tag{3}$$

According to the definition, the modulation function $\mathcal{M}(\cdot)$ is a one-to-one bijective mapping function which maps all possible combinations of binary data to the complex symbols $s \in \Omega$. In (2), the joint message vector $\mathbf{b}$ can be mapped to an NCV $\mathbf{x}$ by a binary matrix $\mathbf{G}$. Then with the expression in (3), we can always find a surjective PNC mapping function $\mathcal{N}_j$ that maps the superimposed constellation point $s_{j,sc}^{(k)}$ to an NCV, mathematically given by

$$\mathcal{N}_j(s_{j,sc}^{(k)}) = \mathbf{x}_j, \quad \text{for } k = 1, 2, \ldots, u_j^{m_s}. \tag{4}$$

At each AP, an estimator calculates the conditional probability of each possible NCV given the mapping function $\mathcal{N}_j$ and the channel coefficients. The estimator returns the log-likelihood ratio (LLR) of each bit of $\mathbf{x}_j$ which is then applied to a soft decision decoder. In the backhaul stage, the NCVs will be forwarded to the CPU from each AP, and then by concatenating them and multiplying by the inverse of the binary PNC mapping matrix, the original data from each MT will be recovered. Note that the global mapping matrix $\mathbf{G} \triangleq [\mathbf{G}_1\ \mathbf{G}_2\ \cdots\ \mathbf{G}_{u_j}]^T$ must be non-singular in order to unambiguously recover the source data [12]. Maximum likelihood (ML) detection can also be used for PNC decoding.

III. DESIGN CRITERION FOR BINARY PNC

Singular fading in the multi-access stage is a serious problem which affects detection performance at the CPU. We give a simple example here with 2 MTs to illustrate this problem and the singular fading is defined as a situation in which different pairs of transmitted signals cannot be distinguished at the receiver, mathematically given by

$$h_{j,1} s_1 + h_{j,2} s_2 = h_{j,1} s_1' + h_{j,2} s_2', \tag{5}$$

where $s_i$ and $s_i'$ stand for the QAM signals at the $i$th MT, and $s_i \neq s_i'$. The special channel coefficient vector $\mathbf{h}_{sf} \triangleq [h_{j,1}, h_{j,2}]$ is defined as a singular fade state (SFS). Note that the solution of (5) is not unique, since there is more than one SFS for each QAM scheme. (5) implies that superimposed constellation points corresponding to two different MT data combinations coincide, and hence this data cannot be unambiguously decoded at this AP.

Besides the coincident symbols, there is a set of superimposed symbols that would map to the same NCV; this set is defined as a cluster and denoted by $\mathbf{s}_{cl} = [s_{j,sc}^{(1)}, s_{j,sc}^{(2)}, \ldots]$. We then define the minimum distance between different clusters as

$$d_{\min} = \min_{\mathcal{N}_j(s_{j,sc}^{(\tau_i)}) \neq \mathcal{N}_j(s_{j,sc}^{(\tau_k')})} |s_{j,sc}^{(\tau_i)} - s_{j,sc}^{(\tau_k')}|^2, \quad \forall s_{j,sc}^{(\tau_i)} \in \mathbf{s}_{cl}^{(\tau)},\ \forall s_{j,sc}^{(\tau_k')} \in \mathbf{s}_{cl}^{(\tau')},\ \text{for } i = 1, 2, \ldots,\ k = 1, 2, \ldots. \tag{6}$$

In the coincident symbols cases $d_{\min} = 0$. The design criterion is to employ a mapping function that labels the constellation points within a clash to the same NCV with maximised $d_{\min}$ in order to achieve unambiguous decoding at the CPU. A detailed mathematical proof is derived in [12] and the design criterion holds in multiple-MT case.

IV. ADAPTIVE MAPPING FUNCTION SELECTION ALGORITHM

In this section, we describe an adaptive PNC mapping matrix selection algorithm based on the design criterion introduced in the previous section. According to the criterion, the separate mapping matrices used at each AP should encode the constellation points within one cluster to the same NCV and additionally, the global mapping matrix $\mathbf{G}$ formed by the concatenation of the selected mapping matrices at each AP should be invertible for unambiguous decoding at the CPU. We define an $L \times u_j$ matrix $\mathbf{H}_{sfs} = [\mathbf{h}_{sfs_1}, \ldots, \mathbf{h}_{sfs_L}]^T$ whose rows contain all special channel vectors that cause different SFSs for $u_j$ modulation scheme combinations. Like the work in [6], our proposed algorithm comprises two procedures, the first of which is an Off-line search and the second is an On-line search. The proposed Off-line and On-line algorithms are described in Algorithm 1.
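To make the binary encoding of (2) and the non-singularity requirement on the global mapping matrix concrete, the following Python sketch implements matrix-vector multiplication over $\mathbb{F}_2$ and a rank test by Gaussian elimination. The function names and the two example mapping matrices are assumptions for illustration only and are not matrices selected by the proposed algorithm; full column rank is used here as the invertibility test, which coincides with non-singularity when the stacked matrix is square.

```python
from itertools import product

def gf2_matvec(G, b):
    """x = G (x) b over F2: each row of G dotted with b, modulo 2."""
    return [sum(g * v for g, v in zip(row, b)) % 2 for row in G]

def gf2_rank(M):
    """Rank of a binary matrix over F2, by Gaussian elimination on a copy."""
    rows = [row[:] for row in M]
    rank = 0
    for col in range(len(rows[0])):
        pivot = next((r for r in range(rank, len(rows)) if rows[r][col]), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        for r in range(len(rows)):
            if r != rank and rows[r][col]:
                rows[r] = [x ^ y for x, y in zip(rows[r], rows[rank])]
        rank += 1
    return rank

def is_unambiguous(G_list):
    """Stack the per-AP matrices G_j and test for full column rank."""
    G = [row for Gj in G_list for row in Gj]
    return gf2_rank(G) == len(G[0])

if __name__ == "__main__":
    # Hypothetical example: two MTs with 2 bits each (m_s = 4), two APs.
    G1 = [[1, 0, 1, 0], [0, 1, 0, 1]]   # XOR of the two MTs' bit pairs
    G2 = [[1, 0, 0, 0], [0, 1, 0, 0]]   # first MT's bits only
    print(gf2_matvec(G1, [1, 0, 1, 1]))
    print(is_unambiguous([G1, G2]), is_unambiguous([G1, G1]))
    # Every joint message maps to a distinct pair of NCVs when G is invertible.
    codes = {tuple(gf2_matvec(G1 + G2, list(b))) for b in product((0, 1), repeat=4)}
    print(len(codes))
```

In this example each AP forwards only two bits per symbol pair, yet the CPU can still recover all four source bits because the stacked matrix has rank 4.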
Algorithm 1 Binary Matrices Selection (Off/On-Line Search Algorithm)
Off-line Search
1: Define $\mathbf{b}_s = [\mathbf{b}_{s_1} \cdots \mathbf{b}_{s_u}]$ as a $u^{m_s} \times m_s$ matrix containing all possible combinations of binary data
2: Define $\mathbf{G}_{ofl}$ as a set that contains all $l \times m_s$ binary matrices for $l \in [1, m_s]$, where the number of matrices is $N_{ofl}$
3: $\mathbf{s}_c = [\mathcal{M}_1(\mathbf{b}_{s_1}), \cdots, \mathcal{M}_2(\mathbf{b}_{s_u})]^T$
4: for $i = 1 : L$ do    ▷ each SFS
5:     $\mathbf{s}_{i,sfs} \triangleq [s_{i,sfs}^{(1)}, \cdots, s_{i,sfs}^{(u^{m_s})}] = \mathbf{h}_{sfs_i} \mathbf{s}_c$
6:     $d = |s_{i,sfs}^{(n_1)} - s_{i,sfs}^{(n_2)}|^2$, $\forall n_1, n_2$, $n_1 \neq n_2$
7:     Find $N_{cl}$ different clashes $\mathbf{s}_{cl}$ containing the $s_{i,sfs}$ with $d = 0$ and the corresponding binary data combinations $\mathbf{b}_{cl}$
8:     for $n_{ofl} = 1 : N_{ofl}$ do    ▷ each binary matrix
9:         for $n_{cl} = 1 : N_{cl}$ do    ▷ each clash
10:            $\mathbf{x}_{cl}^{(n_{cl})} = \mathbf{G}_{ofl}^{(n_{ofl})} \otimes \mathbf{b}_{cl}^{(n_{cl})}$    ▷ NCV calculation
11:        end for
12:        if all $\mathbf{x}_{cl}^{(n_{cl})}$ are the same then
13:            Store $\mathbf{G}_{ofl}^{(n_{ofl})}$ as a matrix candidate
14:        end if
15:    end for
16: end for
On-line Search
17: Define $\mathbf{G}_{onl}$ as the set that contains the $N_{sel}$ selected mapping matrix candidates
18: for $j = 1 : J$ do    ▷ each AP
19:     $\mathbf{s}_{j,sc} = \mathbf{h}_j \mathbf{s}_c$    ▷ All possible superimposed signals
20:     for $n_{sel} = 1 : N_{sel}$ do    ▷ each selected matrix
21:         $\mathbf{x}_{sel}^{(n_{sel})} = \mathbf{G}_{onl}^{(n_{sel})} \otimes \mathbf{b}_s$    ▷ NCV calculation
22:         Find the set of clusters from the positions of repeated columns in $\mathbf{x}_{sel}^{(n_{sel})}$, and among all pairs of clusters:
23:         $d_{\min} = \min_{\forall i,k,\, i \neq k} |s_{j,sc}^{(cl_i)} - s_{j,sc}^{(cl_k)}|^2$, where $cl_i$ and $cl_k$ denote the cluster indexes
24:     end for
25: end for
26: The non-singular global mapping matrix selection: $\mathbf{G} = [\mathbf{G}_1 \cdots \mathbf{G}_J]^T$ with maximum $d_{\min}$

The main differences between the proposed design and the work in [6] and [12] are as follows. In [6], MTs employ the same modulation scheme and the channels have the same average path loss, and the dimension of the mapping matrices stored at each AP is equal to $m_s/2$ so that the sizes of the NCVs at each AP are the same. In [12], a design guideline for an arbitrary number of cooperating users employing the same QAM scheme and transmission power is presented. The approach presented in this letter focuses on the situation in which the modulated signals from MTs are received at an AP with different powers, then in order to maintain a good performance in terms of error probability, different modulation schemes should be used. In order to achieve a higher degree of freedom, mapping matrices with different dimensions may be utilised. The non-singular global mapping matrix used for PNC decoding at the CPU is formed by concatenating the mapping matrices used at each AP with the maximum $d_{\min}$.

When the channel qualities between the MTs and the $j$th AP are poor, the value of $d_{\min}$ may be much smaller than that at the other APs, and this leads to performance degradation. Our solution is to employ $l_j \times m_s$ mapping matrices at the $j$th AP with $l_j \in [1, m_s]$. An $l \times m_s$ matrix with $l > m_s/2$ at one AP and $l < m_s/2$ at the other may give a greater overall minimum inter-cluster distance $d_{\min}$ over both APs than choosing $l = m_s/2$ at both.

Our proposed algorithm provides a solution to reduce backhaul load in N-MIMO networks with a high user density by employing PNC technique. However, the calculation of singular fading with multiple MTs in PNC is still an open question in binary systems. In order to apply the proposed algorithm in practical scenarios, orthogonal frequency division multiplexing (OFDM) [15] could be utilised by allocating a pair of MTs to the same frequency in the multiple access stage. In [13], the proposed algorithm in an OFDM scheme using multiple Universal Software Radio Peripherals (USRPs) is presented. We focused on network design and hardware implementation, such as the work in [11], and studies of how to address the synchronisation, channel estimation and interference issues in the perspective of hardware design.

According to [12], the number of SFSs increases exponentially with increasing modulation order index $m$ which leads to an increase in the number of matrices returned by the Off-line search and makes the On-line search algorithm impractical. However, the $2^m$-QAM schemes are all defined in $\mathbb{F}_{2^m}$ and the same constellation points can be found in different constellation books. This property allows us to use the same mapping matrix to resolve different SFSs, and the number of mapping matrices being stored reduces dramatically. We term SFSs which can be resolved by the same matrix as another SFS, image SFSs. Then the original number of SFSs in quadrature phase shift keying (QPSK) is 13 and that in 16-QAM is 389 but the number is reduced to 5 in QPSK and 169 in 16-QAM after the image SFSs are removed. A detailed study of the impact on the network performance with fewer SFSs is derived in [12] with computational complexity analysis.

V. NUMERICAL RESULTS AND DISCUSSION

In this section, we illustrate the simulation results of the proposed algorithm in the 5-node system shown in Fig. 1. We assume the multi-access links are wireless and the backhaul is a capacity-limited wired link. For simulation simplicity, two MTs are served by two APs and we assume a 3 dB difference in average path loss between each AP and the two MTs. The MT with better channel quality employs a higher order modulation scheme, corresponding to a higher rate. A convolutional code is used in the simulation while other channel codes, such as low-density parity-check (LDPC) code [14], may also be employed. As benchmarks we use ideal CoMP with an unrestricted number of bits transmitted in the backhaul network and non-ideal CoMP in which the soft information on the backhaul is quantized to a total of 12, 6 and 2 bits per symbol, resulting in the total backhaul load of 48, 24 and 8 respectively (since 4 LLRs are calculated at each AP).

In Fig. 2, the BER performances of different approaches are illustrated. We assume block fading in the multi-access stage and that perfect channel information can be obtained to obtain accurate SFSs. As shown in the figure, a 2.5dB and 3dB
PENG et al.: ADAPTIVE OPTIMAL MAPPING SELECTION ALGORITHM FOR PNC USING VARIABLE QAM MODULATION 415
improvement can be observed when employing the proposed algorithm with QPSK+BPSK and 16-QAM+QPSK compared to employing the algorithm in [12] using QPSK+QPSK and 16-QAM+16-QAM, respectively. The ideal CoMP is the best among the approaches due to the availability of unrestricted bits in the backhaul network, while quantization with limited precision results in a performance degradation in non-ideal CoMP. In the QPSK+BPSK case, the proposed algorithm outperforms CoMP with 24 bits, while requiring a total of only 9 backhaul bits.
Fig. 2. BER of the proposed algorithm with 3 dB difference in channel quality.
The channel coefficients obtained at each AP are important to the proposed algorithm because its performance relies on the accurate identification of the closest SFS. In practical communication networks, the trade-off between the length of information symbols and that of pilot symbols for channel estimation has been discussed in [15]. We have illustrated the effect of channel estimation on the proposed algorithm in terms of the probability of selecting a non-optimal matrix, which is shown in Fig. 3. In the simulation, different numbers of pilot symbols have been employed to estimate the channel, which affects the accuracy of identification of the SFS calculated at each AP. As illustrated in Fig. 3(a), when a short pilot sequence is employed, there is a maximum of 70% probability that the optimal mapping matrix will not be selected in a low $E_b/N_0$ scenario, which corresponds to a 5 dB degradation compared to that when perfect channel knowledge is employed. With a pilot sequence length of 5, and at $E_b/N_0$ values sufficient to ensure a low BER, the mis-mapping probability drops below 10% and the performance degradation reduces to 3 dB. According to Fig. 3, a pilot sequence with a length between 5 and 10 is enough for the proposed adaptive selection algorithm.
Fig. 3. Non-optimal mapping matrices selection probability with estimated channel.
VI. CONCLUSION
In this letter, an engineering-applicable approach is presented to implement adaptive PNC in binary N-MIMO systems when different MTs transmit with different powers. Unlike our work in [12] and [13], we focus on applying the proposed algorithm in practical scenarios in which multiple MTs could employ different modulation schemes. We also illustrate how the estimated channel affects the optimal matrix selection accuracy. The simulation results illustrate the benefits of the proposed PNC approach in terms of backhaul load reduction and BER improvement compared to the non-ideal CoMP with quantized bits in the backhaul channel.
REFERENCES
[1] M. V. Clark et al., “Distributed versus centralized antenna arrays in broadband wireless networks,” in Proc. Spring IEEE Veh. Technol. Conf., May 2001, pp. 33–37.
[2] D. Lee et al., “Coordinated multipoint transmission and reception in LTE-advanced: Deployment scenarios and operational challenges,” IEEE Commun. Mag., vol. 50, no. 2, pp. 148–155, Feb. 2012.
[3] L. Zhou and W. Yu, “Uplink multicell processing with limited backhaul via per-base-station successive interference cancellation,” IEEE J. Sel. Areas Commun., vol. 31, no. 10, pp. 1981–1993, Oct. 2013.
[4] Y. Wang, Z. Chen, and M. Shen, “Compressive sensing for uplink cloud radio access network with limited backhaul capacity,” in Proc. 4th ICCSNT, Harbin, China, Dec. 2015, pp. 898–902.
[5] Q. T. Sun, J. Yuan, T. Huang, and K. W. Shum, “Lattice network codes based on Eisenstein integers,” IEEE Trans. Commun., vol. 61, no. 7, pp. 2713–2725, Jul. 2013.
[6] A. G. Burr and D. Fang, “Linear physical-layer network coding for 5G radio access networks,” in Proc. 1st Int. Conf. 5GU, Äkäslompolo, Finland, Nov. 2014, pp. 116–221.
[7] Y. Wang et al., “A multilevel framework to lattice network coding,” IEEE Trans. Inf. Theory, submitted for publication. [Online]. Available: https://arxiv.org/abs/1511.03297
[8] T. Koike-Akino, P. Popovski, and V. Tarokh, “Optimized constellations for two-way wireless relaying with physical network coding,” IEEE J. Sel. Areas Commun., vol. 27, no. 5, pp. 773–787, Jun. 2009.
[9] D. Fang and A. G. Burr, “Uplink of distributed MIMO: Wireless network coding versus coordinated multipoint,” IEEE Commun. Lett., vol. 19, no. 7, pp. 1229–1232, Jul. 2015.
[10] B. Nazer and M. Gastpar, “Compute-and-forward: Harnessing interference through structured codes,” IEEE Trans. Inf. Theory, vol. 57, no. 10, pp. 6463–6486, Oct. 2011.
[11] G. Chiurco, M. Mazzotti, F. Zabini, D. Dardari, and O. Andrisano, “FPGA design and performance evaluation of a pulse-based echo canceller for DVB-T/H,” IEEE Trans. Broadcast., vol. 58, no. 4, pp. 660–668, Dec. 2012.
[12] T. Peng et al., “Adaptive wireless network coding in network MIMO: A new design for 5G and after,” IEEE Trans. Commun., submitted for publication. [Online]. Available: https://arxiv.org/abs/1801.07061
[13] Y. Chu et al., “Implementation of uplink network coded modulation for two-hop networks,” IEEE Access, submitted for publication. [Online]. Available: https://arxiv.org/abs/1808.07354
[14] A. G. Burr, Modulation and Coding for Wireless Communications. Harlow, U.K.: Prentice-Hall, 2000.
[15] H. D. Taun, H. H. Kha, H. H. Nguyen, and V.-J. Luong, “Optimized training sequences for spatially correlated MIMO-OFDM,” IEEE Trans. Wireless Commun., vol. 9, no. 9, pp. 2768–2778, Sep. 2011.
Deep Learning-Based CSI Feedback Approach for Time-Varying Massive MIMO Channels
Tianqi Wang, Chao-Kai Wen, Shi Jin, and Geoffrey Ye Li
Abstract: Massive multiple-input multiple-output (MIMO) systems rely on channel state information (CSI) feedback to perform precoding and achieve performance gain in frequency division duplex networks. However, the huge number of antennas poses a challenge to the conventional CSI feedback reduction methods and leads to excessive feedback overhead. In this letter, we develop a real-time CSI feedback architecture, called CsiNet-long short-term memory (LSTM), by extending a novel deep learning (DL)-based CSI sensing and recovery network. CsiNet-LSTM considerably enhances recovery quality and improves the tradeoff between compression ratio (CR) and complexity by directly learning spatial structures combined with time correlation from training samples of time-varying massive MIMO channels. Simulation results demonstrate that CsiNet-LSTM outperforms existing compressive sensing-based and DL-based methods and is remarkably robust to CR reduction.
Index Terms: Massive MIMO, FDD, CSI feedback, compressive sensing, deep learning.
Manuscript received July 16, 2018; revised September 8, 2018; accepted September 24, 2018. Date of publication October 5, 2018; date of current version April 9, 2019. This work was supported in part by the National Science Foundation (NSFC) for Distinguished Young Scholars of China under Grant 61625106, and in part by NSFC under Grant 61531011. The work of C.-K. Wen was supported in part by the Ministry of Science and Technology of Taiwan under Grant MOST 106-2221-E-110-019 and in part by ITRI in Hsinchu, Taiwan. The associate editor coordinating the review of this paper and approving it for publication was A. Liu. (Corresponding author: Shi Jin.)
T. Wang and S. Jin are with the National Mobile Communications Research Laboratory, Southeast University, Nanjing 210096, China (e-mail: wangtianqi@seu.edu.cn; jinshi@seu.edu.cn).
C.-K. Wen is with the Institute of Communications Engineering, National Sun Yat-sen University, Kaohsiung 80424, Taiwan (e-mail: ckwen@ieee.org).
G. Y. Li is with the School of Electrical and Computer Engineering, Georgia Institute of Technology, Atlanta, GA 30332 USA (e-mail: liye@ece.gatech.edu).
Digital Object Identifier 10.1109/LWC.2018.2874264
I. INTRODUCTION
Massive multiple-input multiple-output (MIMO) systems have been recognized as a critical development for future wireless communications. With downlink channel state information (CSI), a base station (BS) with massive antennas can use channel-adaptive techniques to eliminate inter-user interference and increase channel capacity. In frequency division duplex (FDD) networks, downlink CSI can only be estimated at user equipment (UE) and fed back to the BS. The excessive overhead has motivated many feedback reduction techniques, such as vector quantization and codebook-based approaches [1]. However, quantization errors pose a challenge to CSI-sensitive applications, whereas the huge number of antennas complicates the codebook design and accordingly increases feedback overhead.
The compressive sensing (CS)-based CSI feedback approaches proposed recently address the aforementioned problems by using the spatial and temporal correlation of CSI. These methods sparsify CSI under certain bases to apply CS for feedback and reconstruction [2] or distributed compressive channel estimation [4]. In reality, CSI is only approximately sparse under elaborate base selection or sparsity modeling. Many existing CS algorithms experience difficulty in CSI compression and recovery if there is a model mismatch.
The time correlation property of slow-varying channels has been considered in [2] to further reduce feedback quantity. This method reuses the previously retained channel information for subsequent CSI recovery if the error is under a certain threshold. However, the reused information only provides an estimate and is hard to update in real time. As a result, resolution degrades and the feedback overhead cannot be reduced any further in fast-changing channels.
Recently, deep learning (DL) methods have been successfully applied in wireless communications [5]–[7]. A CSI feedback network, called CsiNet [8], uses an autoencoder-like architecture to mimic the CS and reconstruction processes. It uses an encoder to obtain a compressed representation (codewords) by directly learning channel structures from the training data and a decoder to recover CSI via one-off feedforward multiplication. CsiNet remarkably outperforms the CS-based methods. However, it reconstructs CSI independently and ignores time correlation in time-varying channels.
In this letter, we improve the architecture by considering time correlation. This letter is motivated by the recurrent convolutional neural network (RCNN) architecture that has been successfully used in video representation and reconstruction [9]. The basic idea is to use a convolutional neural network (CNN) and a recurrent neural network (RNN) to extract spatial features and interframe correlation, respectively. Our contributions in this letter are summarized as follows.
• We propose a DL-based CSI feedback protocol for FDD MIMO systems by extending CsiNet with a long short-term memory (LSTM) network, which is a classic type of RNN. The proposed network, called CsiNet-LSTM, modifies the CNN-based CsiNet for CSI compression and initial recovery and uses LSTM to extract time correlation for further improvement in resolution.
• The experiment results demonstrate that CsiNet-LSTM achieves the best recovery quality and outperforms state-of-the-art CS methods in terms of complexity. CsiNet-LSTM exhibits remarkable robustness to compression ratio (CR) reduction and enables real-time and extensible CSI feedback applications without considerably increasing overhead compared with CsiNet.
II. SYSTEM MODEL
Consider an FDD downlink massive MIMO-orthogonal frequency division multiplexing (OFDM) system with $N_c$
WANG et al.: DL-BASED CSI FEEDBACK APPROACH FOR TIME-VARYING MASSIVE MIMO CHANNELS 417
subcarriers. The BS deploys a uniform linear array (ULA) with $N_t$ transmit antennas. In a time-varying channel caused by UE mobility, the received signal at time $t$ on the $n$th subcarrier for a UE with a single receiver antenna can be modeled as

$$y_{n,t} = \mathbf{h}_{n,t}^{T} \mathbf{v}_{n,t} x_{n,t} + z_{n,t}, \tag{1}$$

where $\mathbf{h}_{n,t} \in \mathbb{C}^{N_t \times 1}$, $x_{n,t} \in \mathbb{C}$, and $z_{n,t} \in \mathbb{C}$ denote the instantaneous channel vector in the frequency domain, transmit data symbol, and additive noise, respectively, and $\mathbf{v}_{n,t} \in \mathbb{C}^{N_t \times 1}$ is the beamforming or precoding vector designed by the BS based on the received downlink CSI. We denote the CSI matrix at time $t$ in the spatial-frequency domain as $\tilde{\mathbf{H}}_t = [\mathbf{h}_{1,t}, \ldots, \mathbf{h}_{N_c,t}]^{T} \in \mathbb{C}^{N_c \times N_t}$. In practice, the UE continuously estimates and feeds instantaneous CSI (i.e., $\tilde{\mathbf{H}}_t, \tilde{\mathbf{H}}_{t+1}, \ldots$) back to the BS to track the time-varying characteristics of the channel. To reduce feedback overhead, we can exploit the following observations.
Observation 1 (Angular-Delay Domain Sparsity): $\tilde{\mathbf{H}}_t$ can be transformed into an approximately sparsified matrix $\mathbf{H}'_t$ in the angular-delay domain via 2D discrete Fourier transform (2D-DFT) [8] by $\mathbf{H}'_t = \mathbf{F}_d \tilde{\mathbf{H}}_t \mathbf{F}_a$, where $\mathbf{F}_d \in \mathbb{C}^{N_c \times N_c}$ and $\mathbf{F}_a \in \mathbb{C}^{N_t \times N_t}$ are two DFT matrices. First, due to limited multipath time delay, performing DFT on frequency domain channel vectors (i.e., column vectors of $\tilde{\mathbf{H}}_t$) can transform $\tilde{\mathbf{H}}_t$ into a sparse matrix in the delay domain, with only the first $N'_c$ ($< N_c$) rows having distinct non-zero values. Secondly, as proved in [10], the channel matrix is sparse in a defined angle domain by performing DFT on spatial domain channel vectors (i.e., row vectors of $\tilde{\mathbf{H}}_t$) if the number of transmit antennas is very large, $N_t \to +\infty$. Usually, $\mathbf{H}'_t$ is only approximately sparse for finite $N_t$, which challenges conventional CS methods. Therefore, we propose a DL-based feedback architecture without a sparsity prior constraint. We perform the sparsity transformation to decrease parameter overhead and training complexity. We retain the first $N'_c$ non-zero rows and truncate $\mathbf{H}'_t$ to an $N'_c \times N_t$ matrix, $\mathbf{H}_t$, which reduces the total number of parameters for feedback to $N = N'_c N_t$.
Observation 2 (Correlation Within Coherence Time): UE motion during communication results in a Doppler spread, i.e., time-varying characteristics of wireless channels. With the maximum movement speed denoted as $v$, coherence time can be calculated as

$$\Delta t = \frac{c}{2 v f_0}, \tag{2}$$

where $f_0$ is the carrier frequency and $c$ is the speed of light [3]. The CSI within $\Delta t$ is considered correlated with one another. Therefore, instead of independently recovering CSI, the BS can combine the feedback and previous channel information to enhance the subsequent reconstruction. We set the feedback time interval as $\delta t$ and place $T$ adjacent instantaneous angular-delay domain channel matrices into a channel group, i.e., $\{\mathbf{H}_t\}_{t=1}^{T} = \{\mathbf{H}_1, \ldots, \mathbf{H}_t, \ldots, \mathbf{H}_T\}$. The group exhibits the correlation property as long as $T$ satisfies $0 \le \delta t \cdot T \le \Delta t$.
In this letter, we design an encoder, $\mathbf{s}_t = f_{\mathrm{en}}(\mathbf{H}_t)$, at the UE to compress each complex-valued $\mathbf{H}_t$ of $\{\mathbf{H}_t\}_{t=1}^{T}$ into an $M$-dimensional real-valued codeword vector $\mathbf{s}_t$ ($M < N$). If two real number matrices are used to represent the real and imaginary parts of $\mathbf{H}_t$, then CR will be $M/(2N)$. We also design a decoder with a memory that can extract time correlation from the previously recovered channel matrices, $\hat{\mathbf{H}}_1, \ldots, \hat{\mathbf{H}}_{t-1}$, and combine them with the received $\mathbf{s}_t$ for current reconstruction, $\hat{\mathbf{H}}_t = f_{\mathrm{de}}(\mathbf{s}_t; \hat{\mathbf{H}}_1, \ldots, \hat{\mathbf{H}}_{t-1})$, where $1 \le t \le T$. Then, inverse 2D-DFT is performed to obtain the original spatial frequency channel matrix.
III. CSINET-LSTM
The CsiNet in [8] demonstrates remarkable performance in CSI sensing and reconstruction. However, the resolution degrades at low CR because it only focuses on angular-delay domain sparsity (Observation 1) and ignores the time correlation (Observation 2) of time-varying massive MIMO channels. The two observations in Section II are similar to the spatial structure and interframe correlation of videos, respectively. Motivated by RCNN, which excels in extracting spatial-temporal features for video representation [9], we extend CsiNet with LSTM to improve the trade-off between CR and recovery quality. We also adopt the multi-CR strategy of [9] to implement variable CRs on different channel matrices.
The proposed CsiNet-LSTM is illustrated in Fig. 1(b), with CsiNet shown in Fig. 1(a). Our model includes the following two steps: angular-delay domain feature extraction, and correlation representation and final reconstruction.
1) Angular-Delay Domain Feature Extraction: We apply CsiNet with two different CRs to $\{\mathbf{H}_t\}_{t=1}^{T}$ to learn the angular-delay domain structure and perform sensing and initial reconstruction. A high-CR CsiNet transforms the first channel $\mathbf{H}_1$ into an $M_1 \times 1$ codeword vector that retains sufficient structure information for high resolution recovery. A low-CR CsiNet encoder operates on the remaining $T - 1$ channel matrices to generate a series of $M_2 \times 1$ codewords ($M_1 > M_2$), given that less information is required due to channel correlation. The $T - 1$ codewords are all concatenated with the first $M_1 \times 1$ codeword before being fed into the low-CR CsiNet decoder to fully utilize feedback information. Each CsiNet outputs two matrices with size $N'_c \times N_t$ as extracted features from the angular-delay domain.
All low-CR CsiNets shown in Fig. 1(b) share the same network parameters, i.e., weights and bias, because they perform the same work, which dramatically reduces parameter overhead. Furthermore, the architecture can be easily rescaled to operate on channel groups with different $T$ if the value of $T$ changes to adapt to the channel-changing speed and feedback frequency. In practice, a low-CR CsiNet will be reused $T - 1$ times instead of making $T - 1$ copies. The grey blocks in Fig. 1(b) load parameters from the original CsiNets as pretraining before end-to-end training with the entire architecture. This method can alleviate vanishing gradient problems due to long paths from CsiNets to LSTMs.
2) Correlation Representation and Final Reconstruction: We use LSTMs to extend the CsiNet decoders for time correlation extraction and final reconstruction. LSTMs have inherent memory cells and can keep the previously extracted information for a long period for later prediction. In particular, the outputs of the CsiNet decoders form a sequence of length $T$ before being fed into three-layer LSTMs. Each LSTM has $2N'_c N_t$ hidden units, which is the same as the output dimension. The final outputs are then reshaped into two $N'_c \times N_t$ matrices as the final recovered $\hat{\mathbf{H}}_t$. The spatial frequency domain CSI can then be obtained via inverse 2D-DFT. At each time step, the LSTMs implicitly learn time correlation from the previous inputs and then merge them with the current inputs to increase low CR recovery quality. Correlation information is updated after each step due to the nature of LSTM. The experimental results show that the highly compressed $T - 1$ matrices can achieve better recovery accuracy than $\mathbf{H}_1$ as a benefit from LSTMs.
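As a rough numerical check of Observation 2, the sketch below evaluates the coherence time of (2) and the largest group size $T$ that satisfies $\delta t \cdot T \le \Delta t$. The function names `coherence_time` and `max_group_size` are hypothetical, the speed of light is rounded to $3 \times 10^8$ m/s, and the carrier frequencies, UE speeds, and feedback interval are the values reported in Section IV; the code is an illustration under these assumptions and is not taken from the letter.

```python
# Minimal sketch of Eq. (2) and the grouping condition 0 <= dt * T <= Delta_t.
# All names are illustrative; c is rounded to 3e8 m/s (an assumption).
SPEED_OF_LIGHT = 3.0e8  # m/s


def coherence_time(speed_kmh: float, carrier_hz: float) -> float:
    """Eq. (2): Delta_t = c / (2 * v * f0), with v converted from km/h to m/s."""
    speed_ms = speed_kmh / 3.6
    return SPEED_OF_LIGHT / (2.0 * speed_ms * carrier_hz)


def max_group_size(feedback_interval_s: float, coh_time_s: float) -> int:
    """Largest integer T such that feedback_interval_s * T <= coh_time_s."""
    return int(coh_time_s // feedback_interval_s)


if __name__ == "__main__":
    feedback_interval = 0.04  # seconds, delta t in Section IV
    scenarios = {
        "indoor (5.3 GHz, 0.0036 km/h)": (0.0036, 5.3e9),
        "outdoor (300 MHz, 3.24 km/h)": (3.24, 300e6),
    }
    for name, (speed, freq) in scenarios.items():
        dt = coherence_time(speed, freq)
        t_max = max_group_size(feedback_interval, dt)
        print(f"{name}: coherence time {dt:.2f} s, largest T = {t_max}")
```

With these inputs the outdoor coherence time evaluates to about 0.56 s and the indoor one to roughly 28 s (the letter quotes 30 s), so the group size $T = 10$ used in Section IV fits within the coherence time in both scenarios.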
Fig. 1. (a) CsiNet architecture presented in [8]. It comprises an encoder with a 3 × 3 conv layer and an $M$-unit dense layer for sensing and a decoder with a $2N'_c N_t$-unit dense layer and two RefineNet units for reconstruction. Each RefineNet unit contains four 3 × 3 conv layers with different channel sizes. (b) Overall architecture of CsiNet-LSTM. $\mathbf{H}_1$ and the remaining $T - 1$ channel matrices are compressed by high-CR and low-CR CsiNet encoders, respectively. Codewords are concatenated before being fed into the low-CR CsiNet decoder, and final reconstruction is performed by three $2N'_c N_t$-unit LSTMs.
We use end-to-end learning to obtain all parameters for the encoder and the decoder, denoted as $\Theta = \{\Theta_{\mathrm{en}}, \Theta_{\mathrm{de}}\}$. Notably, $\mathbf{H}_t$ are normalized with all elements scaled into the [0, 1] range before being fed into the network. This normalization is required for CsiNet. For details, we refer to [8]. Let $f$ denote the final trained network defined as

$$\hat{\mathbf{H}}_t = f(\mathbf{H}_t; \Theta) = f_{\mathrm{de}}\big(f_{\mathrm{en}}(\mathbf{H}_1; \Theta_{\mathrm{en}}), \ldots, f_{\mathrm{en}}(\mathbf{H}_t; \Theta_{\mathrm{en}}); \Theta_{\mathrm{de}}\big). \tag{3}$$

We select ADAM as the optimization algorithm and use mean-squared error (MSE) as the loss function, which is defined as

$$L(\Theta) = \frac{1}{M} \sum_{m=1}^{M} \sum_{t=1}^{T} \left\| f(\mathbf{H}_t; \Theta) - \mathbf{H}_t \right\|_2^2, \tag{4}$$

where $M$ here denotes the total number of samples in the training set and $\|\cdot\|_2$ is the Euclidean norm.
The procedure for CsiNet-LSTM is described as follows. Multiple CR CsiNet encoders are deployed at the UE, whereas the CsiNet decoders and LSTMs are deployed at the BS. Each side has a counter. At the beginning, $\mathbf{H}_1$ is compressed with high CR at the UE, recovered by a high-CR CsiNet decoder, and used to initialize the LSTMs at the BS. In the subsequent time step $t$ ($2 \le t \le T$), $\mathbf{H}_t$ is transformed into a lower-dimensional codeword $\mathbf{s}_t$ at the UE, which is expected to contain the learned correlation information. The lower-dimensional codeword $\mathbf{s}_t$ is then concatenated with the first one, $\mathbf{s}_1$, and inversely transformed by the LSTMs at the BS. After each time step, the counters increase by one. Similar operations continue until the counters reach $T$, at which point the LSTMs are reset for the subsequent channel group recovery.
IV. SIMULATION RESULTS AND ANALYSIS
We use the COST 2100 model [11] to simulate time-varying MIMO channels and generate training samples. We set the MIMO-OFDM system to work on a 20 MHz bandwidth with $N_c = 256$ subcarriers and use a ULA with $N_t = 32$ antennas at the BS. The angular-delay domain channel matrix is truncated to a size of 32 × 32. Two scenarios are considered: the indoor scenario at 5.3 GHz with UE speed $v = 0.0036$ km/h and the outdoor scenario at 300 MHz with UE speed $v = 3.24$ km/h. Data sets are generated by randomly setting different start places with two uniform velocities. Therefore, $\Delta t$ is 30 s and 0.56 s, respectively. Compressed CSI is fed back every $\delta t = 0.04$ s. We set the channel group size $T = 10$, which satisfies $\delta t \cdot T < \Delta t$ in both scenarios. We perform experiments at CR values of 1/16, 1/32, and 1/64, with the first channel $\mathbf{H}_1$ compressed at a CR of 1/4.
Training, validation, and testing sets have 75,000, 12,500, and 12,500 samples, respectively, for offline training. Some parameters are preloaded from the CsiNet for initialization. The number of epochs is adjusted according to convergence and ranges from 500 to 1,000. The batch size is 100 and the learning rates are 0.001 and 0.0001 for the former and latter epochs, respectively.
We compare our architecture with three state-of-the-art CS-based algorithms, namely, LASSO $\ell_1$-solver [12], TVAL3 [13], and BM3D-AMP [14], and the DL-based CsiNet [8]. LASSO uses simple sparsity priors but achieves good performance. TVAL3 is a minimum total variation method that provides remarkable recovery quality with high computing efficiency. BM3D-AMP achieves the most accurate recovery performance on natural images and is 10 times faster than other iterative methods.
We use the default configuration in the open source codes of the aforementioned methods for simulation. When comparing with CsiNet, we account for the slight difference between datasets and refine the CsiNet parameters on our training set for several epochs for fairness. We run the conventional CS-based methods on an Intel Core i7-6700 CPU due to the lack of a GPU solution. CsiNet and CsiNet-LSTM are trained and tested on an Nvidia GeForce GTX 1080 Ti GPU.
Normalized MSE (NMSE) is used to evaluate the recovery performance, which is defined as follows:

$$\mathrm{NMSE} = E\left\{ \frac{1}{T} \sum_{t=1}^{T} \frac{\| \mathbf{H}_t - \hat{\mathbf{H}}_t \|_2^2}{\| \mathbf{H}_t \|_2^2} \right\}. \tag{5}$$

To compare with CsiNet, the following cosine similarity is also calculated:

$$\rho = E\left\{ \frac{1}{T} \frac{1}{N_c} \sum_{t=1}^{T} \sum_{n=1}^{N_c} \frac{| \hat{\mathbf{h}}_{n,t}^{H} \mathbf{h}_{n,t} |}{\| \hat{\mathbf{h}}_{n,t} \|_2 \, \| \mathbf{h}_{n,t} \|_2} \right\}, \tag{6}$$
where $\hat{\mathbf{h}}_{n,t}$ denotes the reconstructed channel vector of the $n$th subcarrier at time $t$. $\rho$ can measure the quality of the beamforming vector when the vector is set as $\mathbf{v}_{n,t} = \hat{\mathbf{h}}_{n,t} / \|\hat{\mathbf{h}}_{n,t}\|_2$, since the UE will achieve the equivalent channel $\hat{\mathbf{h}}_{n,t}^{H} \mathbf{h}_{n,t} / \|\hat{\mathbf{h}}_{n,t}\|_2$.
TABLE I. PERFORMANCE IN NMSE (dB), COSINE SIMILARITY ρ, AND RUNTIME (sec)
The performance comparison of NMSE, $\rho$, and runtime is summarized in Table I. From the table, the DL-based CsiNet and CsiNet-LSTM considerably outperform all CS-based methods. Fig. 2 shows a reconstruction result of the 5th channel matrix of a certain channel group in the outdoor scenario as an example, which represents the average performance at different CRs. Apparently, CsiNet and CsiNet-LSTM continue to offer adequate beamforming gain at low CRs, where CS-based methods fail to work. In particular, CsiNet-LSTM achieves the lowest NMSE at all CRs, and its NMSE is several times lower than that of CsiNet, especially when CR is low.
Fig. 2. (a) Pseudo-gray plots of an original channel generated by the COST 2100 model in the outdoor scenario, showing the real part, imaginary part, and absolute values, respectively. (b) Absolute values of reconstructed images, which are produced by different methods from the original channel given in (a) at different CRs.
Notably, CsiNet-LSTM has the least performance loss as CR decreases, with only 8% and 10% for indoor and outdoor, respectively. The simulation results indicate that the remaining channel matrices $\{\mathbf{H}_t\}_{t=2}^{T}$ recovered from a low CR exhibit similar recovery quality and are better than the first channel matrix $\mathbf{H}_1$ from a high CR, which is −14.74 dB and −8.35 dB on average for the indoor and outdoor scenarios, respectively. This result is mainly attributed to the correlation of the channel matrices in time, which can be inherently retained by LSTMs. Moreover, the remaining $T - 1$ channel matrices achieve better recovery quality since codewords are concatenated to offer more information before being fed into the low-CR decoder. Furthermore, the DL-based methods benefit from GPU acceleration due to their feedforward and fast matrix-vector multiplication nature, and run approximately a thousand times faster than the CS-based methods. Compared with CsiNet, CsiNet-LSTM slightly loses time efficiency. However, its NMSE and $\rho$ are significantly improved. In addition, runtime is considerably shorter than the feedback interval $\delta t = 0.04$ s, which makes real-time reconstruction possible.
V. CONCLUSION
In this letter, we have proposed a real-time and end-to-end CSI feedback framework by extending the DL-based CsiNet with LSTM. CsiNet-LSTM achieves a remarkable trade-off among CR, recovery quality, and complexity by utilizing the time correlation and structure properties of time-varying massive MIMO channels. Thus, CsiNet-LSTM outperforms CsiNet and CS-based methods. We believe that this framework has the potential for practical deployment on real systems.
REFERENCES
[1] D. J. Love et al., “An overview of limited feedback in wireless communication systems,” IEEE J. Sel. Areas Commun., vol. 26, no. 8, pp. 1341–1365, Oct. 2008.
[2] P.-H. Kuo, H. T. Kung, and P.-A. Ting, “Compressive sensing based channel feedback protocols for spatially-correlated massive antenna arrays,” in Proc. IEEE WCNC, Shanghai, China, Apr. 2012, pp. 492–497.
[3] A. Paulraj, R. Nabar, and D. Gore, Introduction to Space-Time Wireless Communications. Cambridge, U.K.: Cambridge Univ. Press, 2003.
[4] X. Rao and V. K. N. Lau, “Distributed compressive CSIT estimation and feedback for FDD multi-user massive MIMO systems,” IEEE Trans. Signal Process., vol. 62, no. 12, pp. 3261–3271, Jun. 2014.
[5] T. Wang et al., “Deep learning for wireless physical layer: Opportunities and challenges,” China Commun., vol. 14, no. 11, pp. 92–111, Nov. 2017.
[6] H. Ye, G. Y. Li, and B.-H. Juang, “Power of deep learning for channel estimation and signal detection in OFDM systems,” IEEE Wireless Commun. Lett., vol. 7, no. 1, pp. 114–117, Feb. 2018.
[7] H. He, C.-K. Wen, S. Jin, and G. Y. Li, “Deep learning-based channel estimation for beamspace mmwave massive MIMO systems,” IEEE Wireless Commun. Lett., to be published, doi: 10.1109/LWC.2018.2832128.
[8] C.-K. Wen, W.-T. Shih, and S. Jin, “Deep learning for massive MIMO CSI feedback,” IEEE Wireless Commun. Lett., to be published, doi: 10.1109/LWC.2018.2818160.
[9] K. Xu and F. Ren, “CSVideoNet: A real-time end-to-end learning framework for high-frame-rate video compressive sensing,” in Proc. IEEE WACV, Mar. 2018, pp. 1680–1688.
[10] C.-K. Wen, S. Jin, K.-K. Wong, J.-C. Chen, and P. Ting, “Channel estimation for massive MIMO using Gaussian-mixture Bayesian learning,” IEEE Trans. Wireless Commun., vol. 14, no. 3, pp. 1356–1368, Mar. 2015.
[11] L. Liu et al., “The COST 2100 MIMO channel model,” IEEE Wireless Commun., vol. 19, no. 6, pp. 92–99, Dec. 2012.
[12] I. Daubechies, M. Defrise, and C. D. Mol, “An iterative thresholding algorithm for linear inverse problems with a sparsity constraint,” Commun. Pure Appl. Math., vol. 75, no. 11, pp. 1412–1457, Nov. 2004.
[13] C. Li, W. Yin, and Y. Zhang, “User’s guide for TVAL3: TV minimization by augmented Lagrangian and alternating direction algorithms,” CAAM Rep., vol. 20, pp. 46–47, 2009.
[14] C. A. Metzler, A. Maleki, and R. G. Baraniuk, “From denoising to compressed sensing,” IEEE Trans. Inf. Theory, vol. 62, no. 9, pp. 5117–5144, Sep. 2016.
A Dynamic Pricing Strategy for Vehicle Assisted

Mobile Edge Computing Systems
Di Han , Wei Chen , Senior Member, IEEE, and Yuguang Fang , Fellow, IEEE
Abstract—The idle computing resources of parked vehicles could be utilized to improve performance by assisting task executions in mobile edge computing (MEC) systems. As a result, the owner of a vehicle could be compensated, resulting in a win-win situation. A dynamic pricing strategy is proposed to minimize the average cost of the MEC system under constraints on quality of service (QoS) by adjusting the price constantly based on the current system state. To do so, a cost minimization problem is solved to obtain the optimal dynamic pricing strategy efficiently. Finally, the optimization results are validated with extensive simulations.

Index Terms—Mobile edge computing, dynamic pricing strategy, autonomous vehicle, Markov chain.

I. INTRODUCTION

IN FUTURE smart cities, vehicles equipped with powerful capabilities in communication, computing, and storage, e.g., autonomous vehicles, could be viewed as important networking resources to handle the explosively growing wireless data traffic [1]. To ensure powerful performance, an expensive communication and computing unit must be installed in these vehicles. Nowadays, existing computing solutions for level 4 autonomous driving often cost tens of thousands of dollars [2]. However, the current utilization rate of vehicles is not very high, e.g., the average driving time per day in the U.S. in 2016 was only 50.6 minutes according to the survey of the AAA Foundation for Traffic Safety. Thus, these computing units will be left idle most of the time. To make full use of these idle computing resources, we could incentivize owners of these vehicles to allow their vehicles to be used for processing computing tasks.

MEC is a promising paradigm to enable mobile devices to enjoy resourceful computing power with lower latency [3], and dynamic allocation of the computing resources in the MEC is an interesting research issue to be addressed in the future [4]. It could be a potential scenario in future smart cities where the vehicles with computing units could be utilized as temporary servers of an MEC system, particularly when the computing resources owned by the MEC system itself are not sufficient to guarantee QoS. A reasonable solution is to allow computing units of parked vehicles, e.g., autonomous vehicles, to be leased to the MEC system to execute computing tasks and exchange data with the MEC system through vehicle-to-infrastructure (V2I) communication, as shown in Fig. 1. This will be a win-win situation where not only the MEC system could achieve better performance but also the owners of these vehicles could gain economic benefits from the operator of the MEC system, especially when these vehicles are not energy-hungry, e.g., electric vehicles equipped with large battery packs.

Fig. 1. A vehicle-assisted MEC system.

However, the arrival of computing tasks and the locations of vehicles, i.e., entries to and exits from the coverage range of the MEC system, are highly stochastic and uncertain, and thus hard to predict and control accurately. Thus, the performance of a fixed price strategy is often very poor since it does not take the real-time dynamic change into consideration, e.g., the number of tasks in execution and the number of parked vehicles in the coverage range of the MEC system. A dynamic pricing strategy could provide a more attractive approach by adjusting the price constantly, which has attracted great attention in both academia and industry. In [5], by implementing a dynamic parking pricing strategy, the travel delay of cruising and the generic congestion can be effectively curtailed in urban networks. Moreover, time-varying pricing strategies are widely used in electricity use, which charge more for energy use on peak to reduce peak demand [6]. Similarly, we could raise the price to attract more parked vehicles when the servers are not sufficient to support computation tasks, and vice versa. Thus, there exists a tradeoff between the average cost, i.e., the average reward paid by the MEC system, and the QoS of the MEC system.

In this letter, a dynamic pricing strategy is proposed to minimize the average cost with constraints on QoS based on a probabilistic scheduling approach. Each task is assumed to be in an outage state, and is dropped, if it cannot be served when it arrives or if it has to be dropped before its execution is completed, and the packet loss rate is considered as the performance metric. The system could be modeled by a two-dimensional Markov chain whose state is determined by the system state, i.e., the number of tasks in execution and the number of parked vehicles in the MEC system. Our objective is to find the optimal pricing strategy and, based on this objective, the optimization problem can be converted into a linear program so that the optimal dynamic pricing strategy can be obtained efficiently.

Manuscript received September 7, 2018; accepted October 2, 2018. Date of publication October 9, 2018; date of current version April 9, 2019. This work was supported in part by the National Science Foundation of China under Grant 61671269 and Grant 61621091, and in part by the National Program for Special Support for Eminent Professionals (10000-Talent Program). The work of Y. Fang was partially supported by the U.S. National Science Foundation under Grant CNS-1717736. The associate editor coordinating the review of this paper and approving it for publication was K. W. Choi. (Corresponding author: Di Han.)
D. Han and W. Chen are with the Department of Electronic Engineering, Tsinghua University, Beijing 100084, China (e-mail: hd15@mails.tsinghua.edu.cn; wchen@tsinghua.edu.cn).
Y. Fang is with the Department of Electrical and Computer Engineering, University of Florida, Gainesville, FL 32611 USA (e-mail: fang@ece.ufl.edu).
Digital Object Identifier 10.1109/LWC.2018.2874635

HAN et al.: DYNAMIC PRICING STRATEGY FOR VEHICLE ASSISTED MEC SYSTEMS 421

II. SYSTEM MODEL

As shown in Fig. 1, we consider an MEC system that consists of an AP node (integrated with $n_0$ MEC servers) and a parking lot where at most $N_0$ vehicles equipped with computing units could park. Assume that the parked vehicles could be utilized as temporary MEC servers. To simplify the analysis, we assume that each vehicle has the same computing capability [7] as an MEC server.

Time is divided into time slots. Let $N[t]$ and $M[t]$ denote the number of parked vehicles and the number of computing tasks being executed in the MEC system at the beginning of the $t$-th time slot. Let $a_v[t]$ and $d_v[t]$ denote the number of vehicles arriving and departing in the $t$-th time slot. Likewise, $a_u[t]$ and $d_u[t]$ denote the number of tasks arriving and departing in the $t$-th time slot. Then the dynamic system state can be expressed as

$$N[t+1] = \left(\min\{N[t] + a_v[t] - d_v[t],\; N_0\}\right)^+,$$
$$M[t+1] = \left(\min\{N[t+1] + n_0,\; M[t] + a_u[t] - d_u[t]\}\right)^+, \tag{1}$$

where the superscript '+' denotes the nonnegative part, i.e., $a^+ = \max\{a, 0\}$. When the available servers cannot support all tasks in the system, packet loss occurs. The number of tasks dropped at the beginning of the $t$-th time slot is given by

$$l[t] = \max\{0,\; M[t] - N[t] - n_0\}. \tag{2}$$

We model task arrivals and departures as a Bernoulli process [8], [9]. A new computing task arrives at the MEC system at the beginning of the $t$-th time slot with arrival rate $\lambda_u$, i.e.,

$$\Pr\{a_u[t] = 1\} = \lambda_u, \qquad \Pr\{a_u[t] = 0\} = 1 - \lambda_u. \tag{3}$$

At the end of each time slot, each task could be completed and depart from the system with departure rate $\mu_u$. The departures of the tasks are independent of each other. Thus, we have

$$\Pr\{d_u[t] = d\} = C(m, d)\,\mu_u^{d}\,(1 - \mu_u)^{m-d}, \quad \forall\, 0 \le d \le m, \tag{4}$$

where $m = M[t] > 0$ and $C(m, d)$ is the binomial coefficient. Otherwise, if $M[t] = 0$, we have $\Pr\{d_u[t] = 0\} = 1$.

Likewise, the distributions of $a_v[t]$ and $d_v[t]$ depend on the arrival rate $\lambda_v[t]$ and departure rate $\mu_v[t]$ of the parked vehicles, and are given by

$$\Pr\{a_v[t] = 1\} = \lambda_v[t], \qquad \Pr\{a_v[t] = 0\} = 1 - \lambda_v[t], \tag{5}$$

and

$$\Pr\{d_v[t] = d\} = C(n, d)\,\mu_v[t]^{d}\,(1 - \mu_v[t])^{n-d}, \quad \forall\, 0 \le d \le n, \tag{6}$$

where $n = N[t] > 0$, and $\lambda_v[t]$ and $\mu_v[t]$ depend on the price $c[t]$ at the current time slot. The price $c[t]$ is defined as the payment obtained by each vehicle in the $t$-th time slot. Consider that $K$ price standards are available, whose set is denoted by $\mathcal{C} = \{c_1, c_2, \ldots, c_K\}$ with $c_i < c_j$ for any $i < j$. Let $\lambda_k$ and $\mu_k$ denote the arrival rate and departure rate of vehicles with a given price standard $c_k$, i.e., $\lambda_v[t] = \lambda_k$ and $\mu_v[t] = \mu_k$ when $c[t] = c_k$. It is reasonable to assume that a higher price will attract vehicles to arrive and park with a higher probability and for a longer time. Thus, we have $\lambda_i < \lambda_j$ and $\mu_i < \mu_j$ for any $c_i < c_j$.

III. DYNAMIC PRICING STRATEGY

According to Eq. (1), the system state $M[t+1]$ and $N[t+1]$ in the next time slot only depends on the current state $M[t]$ and $N[t]$, and not on the state at the previous slots. Therefore, the system state can be formulated as a two-dimensional Markov chain with $M[t]$ and $N[t]$. We denote the state $(m, n)$ as the system state $M[t] = m$ and $N[t] = n$.

At the beginning of each time slot, the state of the Markov chain could make a transition to other states. For ease of understanding, an instance with the transition diagram of one state $(1, 1)$ is given in Fig. 2. To keep the figure legible, we denote $(a_u[t], d_u[t], a_v[t], d_v[t])$ as $(a_1, d_1, a_2, d_2)$ for each link. The state $(1, 1)$ cannot make a transition to the states in the next time slot that do not have a link with it, e.g., $(3, 2)$. Since packet drop occurs when $M[t] > N[t] + n_0$, the MEC system will have to drop one packet when state $(1, 1)$ transfers to state $(2, 0)$, which is illustrated in Fig. 2.

Fig. 2. Transition diagram of state (1, 1) with $N_0 = 3$ and $n_0 = 1$.

The dynamic pricing strategy adjusts the price at the beginning of each time slot, and is determined by the probability $f_{m,n}^{k}$ of choosing price $c_k$ given state $(m, n)$, i.e.,

$$f_{m,n}^{k} = \Pr\{c[t] = c_k \mid M[t] = m,\; N[t] = n\}. \tag{7}$$

The normalization condition always holds for each $0 \le m \le N_0 + n_0$ and $0 \le n \le N_0$,

$$\sum_{k=1}^{K} f_{m,n}^{k} = 1. \tag{8}$$

Let $R_{m,n}^{k}$ denote the average number of packets dropped in the current time slot at state $(m, n)$ with price standard $c_k$, which is given by

$$R_{m,n}^{k} = \mathbb{E}\{l[t] \mid M[t] = m,\; N[t] = n,\; c[t] = c_k\}. \tag{9}$$

Denote $q_{m,n}(\Delta m, \Delta n)$ as the transition probability from the state $(m, n)$ to state $(m + \Delta m, n + \Delta n)$ in the
Markov chain. $R_{m,n}^{k}$ and $q_{m,n}(\Delta m, \Delta n)$ could be obtained according to Eqs. (1-4) in Section II and details are omitted due to space limit. Based on this, the transition matrix of the Markov chain $H$ could be obtained. Denote the steady-state distribution of this Markov chain by $\pi = \{\pi_{0,0}, \pi_{0,1}, \ldots, \pi_{0,N_0}, \ldots, \pi_{n+n_0,n}, \ldots, \pi_{N_0+n_0,N_0}\}$, which satisfies

$$\mathbf{1}^{T} \pi = 1, \qquad H \pi = \pi. \tag{10}$$

In the $t$-th time slot, when the system state is $(M[t], N[t]) = (m, n)$, the cost and the expectation of packets dropped are given by $n c_k$ and $R_{m,n}^{k}$ with probability $f_{m,n}^{k}$. Thus, the average cost and the average expectation of tasks dropped are given by

$$C_{\mathrm{ava}} = \sum_{n=0}^{N_0} \sum_{m=0}^{n+n_0} \sum_{k=1}^{K} n c_k\, \pi_{m,n} f_{m,n}^{k} \tag{11}$$

and

$$E_{\mathrm{los}} = \sum_{n=0}^{N_0} \sum_{m=0}^{n+n_0} \sum_{k=1}^{K} R_{m,n}^{k}\, \pi_{m,n} f_{m,n}^{k}. \tag{12}$$

Furthermore, the expected number of tasks arriving in the $t$-th time slot is the arrival rate $\lambda_u$. Thus, the packet loss rate, which is the probability that a computing task is dropped before it is completed, is given by

$$P_{\mathrm{los}}^{\mathrm{ava}} = \frac{E_{\mathrm{los}}}{\lambda_u}. \tag{13}$$

IV. OPTIMAL COST-QOS TRADEOFF

In typical systems, the QoS of users should be guaranteed first, i.e., the packet loss rate $P_{\mathrm{los}}^{\mathrm{ava}}$ cannot exceed the tolerance $P_{\mathrm{los}}^{\mathrm{th}}$. Furthermore, it is necessary to reduce the cost of the MEC system, i.e., the average cost $C_{\mathrm{ava}}$, as much as possible. Thus, we have the following optimization problem:

$$\min_{\pi,\, f_{m,n}^{k}} \; \sum_{n=0}^{N_0} \sum_{m=0}^{n+n_0} \sum_{k=1}^{K} n c_k\, \pi_{m,n} f_{m,n}^{k} \tag{14.a}$$
$$\text{s.t.} \quad \frac{1}{\lambda_u} \sum_{n=0}^{N_0} \sum_{m=0}^{n+n_0} \sum_{k=1}^{K} R_{m,n}^{k}\, \pi_{m,n} f_{m,n}^{k} \le P_{\mathrm{los}}^{\mathrm{th}} \tag{14.b}$$
$$\mathbf{1}^{T} \pi = 1 \tag{14.c}$$
$$H \pi = \pi \tag{14.d}$$
$$\sum_{k=1}^{K} f_{m,n}^{k} = 1 \quad \forall m, n \tag{14.e}$$
$$f_{m,n}^{k} \ge 0 \quad \forall m, n, k \tag{14.f}$$
$$\pi_{m,n} \ge 0 \quad \forall m, n, \tag{14.g}$$

where constraints (14.b) and (14.c-d) denote the constraints on QoS tolerance and the steady-state condition, respectively. The objective function and constraints in optimization (14) are linear combinations of $\{\pi_{m,n} f_{m,n}^{k}\}$, $\{f_{m,n}^{k}\}$, or $\{\pi_{m,n}\}$. By recalling the normalization condition of $\{f_{m,n}^{k}\}$ in Eq. (8), $\pi_{m,n}$ can also be expressed as

$$\pi_{m,n} = \sum_{k=1}^{K} \pi_{m,n} f_{m,n}^{k} = \sum_{k=1}^{K} y_{m,n}^{k}. \tag{15}$$

By substituting Eq. (15) into problem (14), the steady-state condition can be expressed as constraint (16.c) and a matrix equation $Qy = 0$ with $y_{m,n}^{k} = \pi_{m,n} f_{m,n}^{k}$ as variables, where the constant matrix is denoted by $Q$ and can be derived from $H$. In this way, the optimization (14) is converted into a linear program, which is summarized as follows:

$$\min_{\pi,\, f_{m,n}^{k}} \; \sum_{n=0}^{N_0} \sum_{m=0}^{n+n_0} \sum_{k=1}^{K} n c_k\, y_{m,n}^{k} \tag{16.a}$$
$$\text{s.t.} \quad \frac{1}{\lambda_u} \sum_{n=0}^{N_0} \sum_{m=0}^{n+n_0} \sum_{k=1}^{K} R_{m,n}^{k}\, y_{m,n}^{k} \le P_{\mathrm{los}}^{\mathrm{th}} \tag{16.b}$$
$$\sum_{n=0}^{N_0} \sum_{m=0}^{n+n_0} \sum_{k=1}^{K} y_{m,n}^{k} = 1 \tag{16.c}$$
$$Q y = 0 \tag{16.d}$$
$$y_{m,n}^{k} \ge 0 \quad \forall m, n, k \tag{16.e}$$

This problem can be solved efficiently in polynomial time using the interior-point method [10]. After the optimal solution $y_{m,n}^{k*}$ of the linear program (16) is obtained, the corresponding steady-state distribution can be represented as

$$\pi_{m,n}^{*} = \sum_{k=1}^{K} y_{m,n}^{k*}. \tag{17}$$

To obtain the cost-optimal strategy, we can derive $f_{m,n}^{k*}$ from $y_{m,n}^{k*}$, as given below.

Case 1: When $\pi_{m,n}^{*} \neq 0$, the optimal strategy is given by

$$f_{m,n}^{k*} = \frac{y_{m,n}^{k*}}{\pi_{m,n}^{*}}. \tag{18}$$

Case 2: When $\pi_{m,n}^{*} = 0$, the state $(m, n)$ is a transient state. Then, a simple strategy can be used, i.e.,

$$f_{m,n}^{k*} = \frac{1}{K}. \tag{19}$$

The time complexity of deriving $f_{m,n}^{k*}$ from $y_{m,n}^{k*}$ is also polynomial. In conclusion, the cost-optimal dynamic pricing strategy could be obtained in polynomial time. Based on this result, the optimal cost-QoS tradeoff can be achieved.

Remark 1: Our proposed approach could be extended and applied to a more generalized scenario where the priority of tasks and vehicles with different computing capabilities are considered. By modeling each type of task and vehicle as a queue, the system could be formulated as a Markov chain with more dimensions. Then, the cost-optimal dynamic pricing strategy could be obtained using the proposed approach.

V. NUMERICAL RESULTS

In this section, we validate our theoretical results via simulation studies, and explain the outcomes in a more comprehensive way. Throughout this section, we set $n_0 = 3$, $K = 7$, $\mu_u = 0.2$, and other parameters are summarized in Table I. By solving the optimization problem (16), the optimal dynamic pricing strategy can be obtained.
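The recovery step in Eqs. (17)-(19), together with the objective (16.a), can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the dictionary layout `y[(m, n)]`, the function names, and the tolerance `tol` used to decide whether the steady-state probability is zero are choices made here and are not part of the letter; the linear program itself is assumed to have been solved elsewhere, and Eq. (19) is read as assigning probability 1/K to each price.

```python
def recover_policy(y, tol=1e-12):
    """Recover (pi, f) from LP variables y[(m, n)] = [y^1, ..., y^K]."""
    pi, f = {}, {}
    for state, y_k in y.items():
        K = len(y_k)
        pi[state] = sum(y_k)                          # Eq. (17)
        if pi[state] > tol:                           # Case 1: state is visited
            f[state] = [v / pi[state] for v in y_k]   # Eq. (18)
        else:                                         # Case 2: transient state
            f[state] = [1.0 / K] * K                  # Eq. (19), uniform choice
    return pi, f


def average_cost(y, prices):
    """Objective (16.a): sum of n * c_k * y^k_{m,n} over states and prices."""
    return sum(n * c * v
               for (m, n), y_k in y.items()
               for c, v in zip(prices, y_k))


# Toy values for illustration only (hypothetical, not taken from the letter).
y_example = {(0, 1): [0.2, 0.3], (1, 1): [0.5, 0.0], (2, 1): [0.0, 0.0]}
pi_example, f_example = recover_policy(y_example)
cost_example = average_cost(y_example, prices=[1.0, 2.0])
```

Each recovered row of `f_example` sums to one, so the normalization condition (8) holds by construction, including for the transient state.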
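A Monte-Carlo check of the kind used in this section can be sketched as follows for the simplest case of a single fixed price. The sketch follows the recursion in Eq. (1) with the Bernoulli and binomial draws of Eqs. (3)-(6). Several points are assumptions made here rather than statements from the letter: the function names, the vehicle rates `lam_v` and `mu_v` in the usage line, and the convention that a task is counted as dropped whenever the updated task count exceeds the available servers $N[t+1] + n_0$.

```python
import random


def binomial(n, p, rng):
    """Number of successes in n independent Bernoulli(p) trials."""
    return sum(1 for _ in range(n) if rng.random() < p)


def simulate_fixed_price(T, N0, n0, lam_u, mu_u, lam_v, mu_v, seed=0):
    """Simulate T slots; return (packet loss rate, average parked vehicles)."""
    rng = random.Random(seed)
    N, M = 0, 0                    # parked vehicles, tasks in execution
    arrivals = drops = parked_sum = 0
    for _ in range(T):
        parked_sum += N            # a price c costs n * c in this slot
        a_v = 1 if rng.random() < lam_v else 0   # Eq. (5)
        d_v = binomial(N, mu_v, rng)             # Eq. (6)
        a_u = 1 if rng.random() < lam_u else 0   # Eq. (3)
        d_u = binomial(M, mu_u, rng)             # Eq. (4)
        N_next = max(min(N + a_v - d_v, N0), 0)  # Eq. (1), first line
        demand = M + a_u - d_u
        capacity = N_next + n0
        drops += max(0, demand - capacity)       # assumed drop convention
        M_next = max(min(capacity, demand), 0)   # Eq. (1), second line
        arrivals += a_u
        N, M = N_next, M_next
    loss = drops / arrivals if arrivals else 0.0
    return loss, parked_sum / T


# Usage with n0, mu_u, N0 and lam_u as in this section; lam_v and mu_v are
# hypothetical values for one price standard.
loss, avg_parked = simulate_fixed_price(
    T=20_000, N0=4, n0=3, lam_u=0.8, mu_u=0.2, lam_v=0.3, mu_v=0.1)
```

Multiplying `avg_parked` by the fixed price gives the simulated average cost, which can then be compared with the value predicted by Eq. (11) for the same strategy.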
TABLE I
PARAMETERS USED FOR SIMULATION

Fig. 3 compares the numerical results between the proposed dynamic pricing strategy, another dynamic pricing strategy based on Lyapunov optimization, and the fixed pricing strategy with $N_0 = 4$ and $\lambda_u = 0.8$. The basic idea of the Lyapunov optimization [11] is to minimize its Lyapunov drift-plus-penalty function, whose objective is to stabilize the virtual queue $N[t] + n_0 - M[t]$ while optimizing the average cost. In Fig. 3, simulation results of the proposed dynamic pricing strategy are given by Monte-Carlo simulation, and they match the optimization results well. It can be seen that with the decrease of the packet loss rate constraint $P_{\mathrm{los}}^{\mathrm{th}}$, i.e., a higher QoS requirement, the required cost rises in all strategies. Additionally, when the cost is large enough, the packet loss rate in all strategies will decrease to a minimum, which verifies the performance improvement by utilizing the computing units of parked vehicles as temporary MEC servers. Moreover, for any $P_{\mathrm{los}}^{\mathrm{th}}$, the average cost in the fixed pricing strategy is always higher than that in the proposed dynamic pricing strategy. The average cost of the strategy based on Lyapunov optimization is also higher since it does not take the distribution of arrival and departure of the tasks and vehicles into consideration. Thus, the performance improvement of the proposed dynamic pricing strategy has been verified.

Fig. 3. Optimal cost-QoS tradeoffs between different strategies.

Fig. 4 presents the optimal cost versus the size of the parking lot with different task arrival rates $\lambda_u$. The packet loss rate constraint $P_{\mathrm{los}}^{\mathrm{th}}$ is set to 0.15. As expected, the cost decreases when the size of the parking lot $N_0$ increases and then approaches different asymptotes, where the line with higher $\lambda_u$ exhibits higher average cost. For a given $N_0$, the system with higher $\lambda_u$ requires higher cost. Therefore, for a busy MEC system with higher $\lambda_u$, a larger parking lot is required or otherwise higher cost has to be incurred.

Fig. 4. Optimal cost versus parking lot size N.

VI. CONCLUSION

In this letter, we have investigated a dynamic pricing strategy for vehicle assisted MEC systems. By adjusting the price dynamically to control the arrival and departure rates of vehicles, the cost of the MEC system is minimized under a given constraint on QoS, which is evaluated by the packet drop rate. The system is modeled as a two-dimensional Markov chain. Then the average cost and QoS can be obtained by analyzing the steady-state distribution of the Markov chain. Based on these, the optimization problem is formulated and solved. Moreover, the cost-optimal dynamic pricing strategy can be obtained to minimize the cost of the MEC system and achieve the optimal cost-QoS tradeoff, which yields a significant performance improvement compared with the fixed pricing strategy. Finally, our theoretical results are validated by comprehensive simulations.

REFERENCES

[1] H. Ding, C. Zhang, Y. Cai, and Y. Fang, “Smart cities on wheels: A newly emerging vehicular cognitive capability harvesting network for data transportation,” IEEE Wireless Commun., vol. 25, no. 2, pp. 160–169, Apr. 2018.
[2] S. Liu, J. Tang, Z. Zhang, and J.-L. Gaudiot, “Computer architectures for autonomous driving,” IEEE Comput., vol. 50, no. 8, pp. 18–25, Aug. 2017.
[3] M. T. Beck, M. Werner, S. Feld, and T. Schimper, “Mobile edge computing: A taxonomy,” in Proc. Int. Conf. Adv. Future Internet, Nov. 2018, pp. 48–55.
[4] P. Mach and Z. Becvar, “Mobile edge computing: A survey on architecture and computation offloading,” IEEE Commun. Surveys Tuts., vol. 19, no. 3, pp. 1628–1656, 3rd Quart., 2017.
[5] N. Zheng and N. Geroliminis, “Modeling and optimization of multimodal urban networks with limited parking and dynamic pricing,” Transp. Res. B Methodol., vol. 83, pp. 36–58, Jan. 2016.
[6] G. R. Newsham and B. G. Bowker, “The effect of utility time-varying pricing and load control strategies on residential summer peak electricity use: A review,” Energy Policy, vol. 38, no. 7, pp. 3289–3296, Jul. 2010.
[7] X. Hou, Y. Li, M. Chen, D. Wu, and S. Chen, “Vehicular fog computing: A viewpoint of vehicles as the infrastructures,” IEEE Trans. Veh. Technol., vol. 65, no. 6, pp. 3860–3873, Jun. 2016.
[8] J. Liu, Y. Mao, J. Zhang, and K. B. Letaief, “Delay-optimal computation task scheduling for mobile-edge computing systems,” in Proc. IEEE ISIT, Jul. 2016, pp. 1451–1455.
[9] M. Jia, W. Liang, Z. Xu, and M. Huang, “Cloudlet load balancing in wireless metropolitan area networks,” in Proc. IEEE INFOCOM, Apr. 2016, pp. 1–9.
[10] S. P. Boyd and L. Vandenberghe, Convex Optimization. New York, NY, USA: Cambridge Univ. Press, 2004.
[11] W. Sun, J. Liu, Y. Yue, and H. Zhang, “Double auction-based resource allocation for mobile edge computing in industrial Internet of Things,” IEEE Trans. Ind. Informat., vol. 14, no. 10, pp. 4692–4701, Oct. 2018, doi: 10.1109/TII.2018.2855746.

Spectral-Energy Efficiency Pareto Front in Cellular Networks: A Stochastic Geometry Framework
Marco Di Renzo, Senior Member, IEEE, Alessio Zappone, Senior Member, IEEE, Thanh Tu Lam, Student Member, IEEE, and Mérouane Debbah, Fellow, IEEE

Abstract—We compute the spectral-energy efficiency Pareto front in Poisson cellular networks, by formulating a spectral-energy efficiency bi-objective optimization problem as a function of either the transmit power or the density of the base stations. Capitalizing on fundamental theoretical results on weighted Tchebycheff optimization problems applied to strictly quasi-concave functions, we derive analytical expressions of the unique Pareto-optimal solution of the bi-objective problem. We prove that the Pareto front is constituted by a subset of the spectral-energy efficiency tradeoff and that it can be formulated in analytical terms. We identify new functional relations between the Pareto-optimal transmit power and the density of base stations.

Index Terms—Cellular network, Pareto front, point process.

I. INTRODUCTION

THE SPECTRAL Efficiency (SE) and Energy Efficiency (EE) are important performance metrics that guide the optimization of cellular networks. Under typical operating conditions, however, they are conflicting objective functions [1]: There exists no single solution that simultaneously optimizes each of them. There exist, on the other hand, several optimal solutions for which none of the two objective functions can be improved without degrading the other objective. The SE-EE pairs that fulfill the latter optimality condition are referred to as Pareto-optimal solutions, and the corresponding SE-EE curve is known as the Pareto front [2, Definition 2.2.1].

The aim of this letter is to derive a complete and explicit formulation of the SE-EE Pareto front in cellular networks. We focus our attention on analytically formulating the SE-EE Pareto front from the system-level standpoint, i.e., by taking the average with respect to the irregular deployments of the cellular Base Stations (BSs) and the random locations of the Mobile Terminals (MTs) within the cells. Several authors have studied the SE-EE trade-off in wireless networks, e.g., [3] and [4], and some of them have recently analyzed the SE-EE trade-off in cellular networks from the system-level standpoint [5]–[7]. The contribution of this letter is, however, different: We are not interested in analyzing the SE-EE trade-off but in characterizing the SE-EE Pareto front, which is the solution of a bi-objective optimization problem [2]. Recently, Aydin et al. [8] and Hao et al. [9] have computed the SE-EE Pareto front in cellular networks with the aid of multi-objective optimization theory. Therein, however, numerical methods are used, and, thus, no analytical formulation of the SE-EE Pareto front is given.

Against this background, we derive an explicit analytical formulation of the SE-EE Pareto front in cellular networks, which is obtained by solving a SE-EE bi-objective optimization problem, as a function of either the transmit power or the density of the BSs. This major contribution is obtained by capitalizing on the approach recently introduced in [10] for computing, in closed-form, the SE and EE in Poisson cellular networks, and on fundamental theoretical results on the existence and uniqueness of Pareto-optimal solutions in weighted Tchebycheff optimization problems [2].

II. BI-OBJECTIVE PROBLEM FORMULATION

We consider a cellular network whose BSs and MTs are distributed according to two mutually independent homogeneous Poisson Point Processes (PPPs) of density $\lambda_{\mathrm{BS}}$ and $\lambda_{\mathrm{MT}}$, respectively. The same system model as in [10] is assumed. By using the notation in Table I, the SE (bit/sec/m$^2$) and the EE (bit/Joule) can be formulated, respectively, as follows:

$$\mathrm{SE} = B_W\, \rho_D\, \frac{\lambda_{\mathrm{BS}}\, L(\lambda_{\mathrm{MT}}/\lambda_{\mathrm{BS}})}{1 + \Upsilon L(\lambda_{\mathrm{MT}}/\lambda_{\mathrm{BS}})}\; Q\!\left(\lambda_{\mathrm{BS}},\, P_{\mathrm{tx}},\, \frac{\lambda_{\mathrm{MT}}}{\lambda_{\mathrm{BS}}}\right), \tag{1}$$

$$\mathrm{EE} = \frac{\mathrm{SE}}{\lambda_{\mathrm{BS}} (P_{\mathrm{tx}} - P_i)\, L(\lambda_{\mathrm{MT}}/\lambda_{\mathrm{BS}}) + \lambda_{\mathrm{MT}} P_c + \lambda_{\mathrm{BS}} P_i}, \tag{2}$$

where the denominator in (2), i.e., $P_{\mathrm{grid}}$, is the power consumption (Watt/m$^2$) of the cellular network [10].

TABLE I
NOTATION ($\alpha = 3.5$, $\delta = 2/\beta$, $\beta > 2$, $\eta = \kappa \sigma_N^2 \gamma_A$, $\dot{f}$: 1ST DERIVATIVE)

In Table II, we summarize important properties of the SE and EE, as a function of $P_{\mathrm{tx}}$ and $\lambda_{\mathrm{BS}}$, that are used next. For generality, we use the symbol $\xi$ to denote either $P_{\mathrm{tx}}$ or $\lambda_{\mathrm{BS}}$.

Manuscript received August 8, 2018; accepted September 29, 2018. Date
work was supported in part by the EC through the Projects 5Gwireless, BESMART, and CacheMire. The associate editor coordinating the review of this paper and approving it for publication was M. Sheng. (Corresponding author: Marco Di Renzo.)
M. Di Renzo, A. Zappone, and T. T. Lam are with the Laboratoire des Signaux et Systèmes, CNRS, CentraleSupélec, Univ Paris Sud, Université Paris-Saclay, 91192 Gif-sur-Yvette, France (e-mail: marco.direnzo@l2s.centralesupelec.fr).
M. Debbah is with the Laboratoire des Signaux et Systèmes, CNRS, CentraleSupélec, Univ Paris Sud, Université Paris-Saclay, 91192 Gif-sur-Yvette, France, and also with the Mathematical and Algorithmic Sciences Lab, Huawei Technologies, 92100 Boulogne-Billancourt, France.
Digital Object Identifier 10.1109/LWC.2018.2874642
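The notion of Pareto-optimality used in this letter (no objective can be improved without degrading the other) can be illustrated numerically with a brute-force dominance filter. The sketch below is only an illustration under stated assumptions: the functions `se` and `ee` are hypothetical toy stand-ins, chosen so that one is strictly increasing and the other is unimodal in $\xi$, and they are not the expressions in (1) and (2); the grid and all names are likewise assumptions.

```python
import math


def se(xi):
    """Toy SE, strictly increasing in xi (hypothetical, not Eq. (1))."""
    return math.log(1.0 + xi)


def ee(xi):
    """Toy EE, unimodal in xi (hypothetical, not Eq. (2))."""
    return se(xi) / (1.0 + xi)


def pareto_front(points):
    """Keep the (xi, SE, EE) triples that no other triple dominates."""
    front = []
    for xi, s, e in points:
        dominated = any(
            s2 >= s and e2 >= e and (s2 > s or e2 > e)
            for _, s2, e2 in points
        )
        if not dominated:
            front.append((xi, s, e))
    return front


grid = [i / 10 for i in range(1, 101)]            # xi sampled in [0.1, 10]
points = [(xi, se(xi), ee(xi)) for xi in grid]
front = pareto_front(points)
xi_o = max(grid, key=ee)                          # grid maximizer of the toy EE
```

For these toy functions, the surviving grid points are the ones at or beyond the EE maximizer `xi_o`: below it both objectives increase together, so those points are dominated.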
DI RENZO et al.: SPECTRAL-EE PARETO FRONT IN CELLULAR NETWORKS: STOCHASTIC GEOMETRY FRAMEWORK 425
TABLE II
$\xi = P_{\mathrm{tx}}$ OR $\xi = \lambda_{\mathrm{BS}}$, $\xi^{(\min)}$ & $\xi^{(\max)}$ ARE THE MIN & MAX OF $\xi$, $\Omega = [\xi^{(\min)}, \xi^{(\max)}]$, $\Omega^{(\min)} = [\xi^{(\min)}, \xi^{(o)}]$, $\Omega^{(\max)} = [\xi^{(o)}, \xi^{(\max)}]$

In particular, the unique maximizer of EE, $\xi^{(o)}$, is [10]:

$$\xi^{(o)} = \xi^{(\mathrm{EE,opt})} = \max\left\{\xi^{(\min)},\; \min\left\{\xi^{(*)},\; \xi^{(\max)}\right\}\right\} \tag{3}$$

where $\xi^{(*)}$ is the unique unconstrained maximizer of the EE. We note that the SE and EE are continuous functions in $\xi \in \Omega$. We are interested in solving, as a function of $\xi = P_{\mathrm{tx}}$ or $\xi = \lambda_{\mathrm{BS}}$, the following bi-objective optimization problem [2]:

$$\mathcal{P}: \; \max_{\xi \in \Omega}\; [\mathrm{SE}(\xi),\, \mathrm{EE}(\xi)] \tag{4}$$

Definition 1: The bi-objective optimization problem in (4) is said to be non-trivial or meaningful if $\xi^{(o)} < \xi^{(\max)}$.

Remark 1: If $\xi^{(o)} \ge \xi^{(\max)}$, $\mathcal{P}$ in (4) is trivial because, based on Table II, both the SE and EE are increasing functions in $\xi$. Thus, they are not conflicting objectives and are both maximized for $\xi = \xi^{(\max)}$, i.e., the (trivial) solution of (4).

Remark 2: Strictly monotonically increasing and unimodal functions are strictly quasi-concave functions [11].

From Remark 2, we evince that the SE and EE are continuous and strictly quasi-concave functions. We focus our attention only on non-trivial bi-objective optimization problems.

III. TCHEBYCHEFF SCALARIZATION

To solve $\mathcal{P}$ and compute the Pareto-front, we employ the scalarization approach [2]. Two methods are used: 1) the conventional weighted Tchebycheff optimization ($\mathcal{P}_T$) [2, Sec. 3.4], and 2) the simplified weighted Tchebycheff optimization ($\mathcal{P}_{ST}$). $\mathcal{P}_{ST}$ is introduced in this letter and proved to be equivalent to $\mathcal{P}_T$, but shown to be instrumental to analytically formulate the Pareto front. $\mathcal{P}_T$ is studied first in order to mathematically prove the equivalence with $\mathcal{P}_{ST}$. We start with some preliminary results on strictly quasi-concave functions.

Lemma 1: Let $f(\xi)$ and $g(\xi)$ be two strictly quasi-concave functions in $\xi \in \Omega$. Then, the point-wise minimum function, $X(\xi) = \min\{f(\xi), g(\xi)\}$, is strictly quasi-concave in $\xi \in \Omega$.

Proof: Let us define $\xi_a = a\xi_1 + (1 - a)\xi_2$, $X_f(\xi_1, \xi_2) = \min\{f(\xi_1), f(\xi_2)\}$, $X_g(\xi_1, \xi_2) = \min\{g(\xi_1), g(\xi_2)\}$. By virtue of strict quasi-concavity, $f(\xi_a) > X_f(\xi_1, \xi_2)$, $g(\xi_a) > X_g(\xi_1, \xi_2)$ for $\xi_1 \neq \xi_2 \in \Omega$ and $a \in (0, 1)$. Then, $X(\xi_a) = \min\{f(\xi_a), g(\xi_a)\} > \min\{X_f(\xi_1, \xi_2), X_g(\xi_1, \xi_2)\} = \min\{X(\xi_1), X(\xi_2)\}$, where the inequality holds true because the point-wise minimum is an increasing function, and $f(\cdot)$ and $g(\cdot)$ are strictly quasi-concave functions.

Lemma 2: Let $f(\xi)$ be a strictly quasi-concave function in $\xi \in \Omega$. Then, $f(\xi)$ has a unique maximizer in $\xi \in \Omega$.

Proof: It follows from [11, Corollary 2.5.1].

Proposition 1: Let $f(\xi)$ and $g(\xi)$ be two strictly quasi-concave functions in $\xi \in \Omega$. Then, the optimization problem:

$$\mathcal{P}_{\max\text{-}\min}: \; \max_{\xi \in \Omega}\; \{\min\{f(\xi),\, g(\xi)\}\} \tag{5}$$

has a unique solution in $\xi \in \Omega$.

Proof: It follows from Lemma 1 and Lemma 2.

Let $f(\xi)$ be a strictly quasi-concave function in $\xi \in \Omega$. Similar to the notation in Table II, the unique maximizer of $f(\cdot)$ and its corresponding maximum objective value are denoted by $\xi^{(f,\mathrm{opt})}$ and $f_{\mathrm{opt}} = f(\xi^{(f,\mathrm{opt})})$, respectively.

A. Weighted Tchebycheff Method

Unless otherwise stated, we assume $f(\xi) = \mathrm{SE}(\xi)$ and $g(\xi) = \mathrm{EE}(\xi)$ in $\xi \in \Omega$. From Table II, we have $f_{\mathrm{opt}} = \mathrm{SE}_m$ and $g_{\mathrm{opt}} = \mathrm{EE}_o$. Let $\mu \in [0, 1]$, the weighted Tchebycheff optimization problem is defined as follows [2, Sec. 3.4]:

$$\mathcal{P}_T: \; \max_{\xi \in \Omega}\; \{\min\{\mu F(\xi),\, (1 - \mu) G(\xi)\}\} \tag{6}$$

where $F(\xi) = f(\xi)/f_{\mathrm{opt}} - 1$ and $G(\xi) = g(\xi)/g_{\mathrm{opt}} - 1$.

Lemma 3: Let $f(\xi)$ and $g(\xi)$ be two strictly quasi-concave functions in $\xi \in \Omega$. Then, $\mathcal{P}_T$ in (6) has, for each $\mu \in [0, 1]$, a unique solution that is (strong) Pareto-optimal.

Proof: It follows from Prop. 1 and [11, Corollary 3.4.4].

Proposition 2: According to Lemma 3, let $\xi^{(\mu)}$ be the unique solution of $\mathcal{P}_T$ for $\mu \in [0, 1]$. The Pareto front of $\mathcal{P}$ is given by the pairs $(f(\xi^{(\mu)}), g(\xi^{(\mu)}))$ obtained varying $\mu$ in $[0, 1]$.

Proof: It follows from [11, Th. 3.4.5].

Lemma 4: According to Lemma 3, let $\xi^{(\mu)}$ be the unique solution of $\mathcal{P}_T$ for $\mu \in [0, 1]$. Then, $\xi^{(\mu)} \in [\xi^{(o)}, \xi^{(\max)}]$.

Proof: In the range $\xi^{(\mu)} < \xi^{(o)}$, the SE and EE are increasing functions. Based on the definition of Pareto-optimality, the pairs $(f(\xi^{(\mu)}), g(\xi^{(\mu)}))$ cannot be Pareto-optimal.

Lemma 5: According to Lemma 3, let $\xi^{(\mu)} \in [\xi^{(o)}, \xi^{(\max)}]$ be the unique solution of $\mathcal{P}_T$ for $\mu \in [0, 1]$. Then: i) $\xi^{(\mu)} = \xi^{(g,\mathrm{opt})} = \xi^{(o)}$ if and only if $\mu = 0$, ii) $\xi^{(\mu)} = \xi^{(f,\mathrm{opt})} = \xi^{(m)}$ if and only if $\mu = 1$, iii) $\xi^{(\mu)}$ is the unique solution of $\mu F(\xi^{(\mu)}) = (1 - \mu) G(\xi^{(\mu)})$ if and only if $\mu \in (0, 1)$.

Proof: If $\mu = 0$ and $\mu = 1$, $\mathcal{P}_T$ in (6) is equivalent to maximizing $G(\cdot)$ and $F(\cdot)$, respectively, since, by definition, they are both negative functions. If $\mu \in (0, 1)$, the SE is monotonically increasing and the EE is monotonically decreasing in the range $[\xi^{(o)}, \xi^{(\max)}]$. Also, $\mu F(\xi^{(o)}) < (1 - \mu) G(\xi^{(o)}) = 0$ and $(1 - \mu) G(\xi^{(\max)}) < \mu F(\xi^{(\max)}) = 0$. So, the functions $\mu F(\xi)$ and $(1 - \mu) G(\xi)$ cross each other exactly once in $[\xi^{(o)}, \xi^{(\max)}]$. By definition of max-min optimization, this unique crossing point is the solution of $\mathcal{P}_T$ in (6).

Lemma 6: Let $f(\xi)$ and $g(\xi)$ be two continuous and strictly quasi-concave functions in $\xi \in \Omega$. Then, there exists a continuous and strictly decreasing function, $C$, that expresses the objective function $g$ in terms of the objective function $f$, i.e., $g = C(f)$, where $f(\xi^{(g,\mathrm{opt})}) \le f \le f_{\mathrm{opt}} = f(\xi^{(f,\mathrm{opt})})$ and $g(\xi^{(f,\mathrm{opt})}) \le g \le g_{\mathrm{opt}} = g(\xi^{(g,\mathrm{opt})})$.

Proof: It follows from [12, Ths. 2.1 and 2.2], since $f(\cdot)$ and $g(\cdot)$ are continuous and strictly quasi-concave functions, and $\Omega$ is a non-empty, compact, and convex set.

Remark 3: Proposition 2, Lemma 5, and Lemma 6 provide us with fundamental properties of the SE-EE Pareto front. They, however, have the following limitations: i) Proposition 2 does not yield an explicit analytical formulation of the Pareto front, which is parameterized as a function of $\mu$, ii) if $\mu \in (0, 1)$, it is not straightforward to obtain a closed-form expression of $\xi^{(\mu)}$ from Lemma 5, and iii) Lemma 6 asserts the existence of the curve $C$ but does not provide an explicit
formula for it. Also, no insight for system design is obtained Remark 6: From Proposition 3, we evince that the SE-EE
from them. Pareto front of PT in (6) can be obtained from PST in (7). It
These limitations are overcome with the aid of PST . is constituted, in particular, by: i) the point (SEm , EE(ξ (m) ))
if $w = w_l$ (extreme right value), ii) the point $({\rm SE}(\xi^{(o)}), {\rm EE}_o)$ if $w = w_u$ (extreme left value), and iii) the continuous set of points obtained by varying $w$ in the range $(w_l, w_u)$.

B. Simplified Weighted Tchebycheff Method
Unless otherwise stated, we assume $f(\xi) = {\rm SE}(\xi)$ and $g(\xi) = {\rm EE}(\xi)$ in $\xi \in \Omega$. We introduce the simplified weighted Tchebycheff problem as follows (for any $w \in (0, 1)$):

$$P_{ST}: \; \max_{\xi \in \Omega^{(\max)}} \min\{ w F(\xi), (1 - w) G(\xi) \} \tag{7}$$

where $F(\xi) = {\rm SE}(\xi)/{\rm SE}_m$ and $G(\xi) = {\rm EE}(\xi)/{\rm EE}_o$.
Remark 4: The optimization problem in (7) is restricted over the set $\xi \in \Omega^{(\max)}$ by virtue of Lemma 4, i.e., only the values $\xi \ge \xi^{(o)}$ are admissible in order for the pairs $({\rm SE}(\xi), {\rm EE}(\xi))$ not to contradict the definition of Pareto-optimality.
Remark 5: $P_{ST}$ in (7) fulfills the conditions stated in Proposition 1: It has a unique solution in $\xi \in \Omega^{(\max)}$ for every $w \in (0, 1)$. No conclusion, however, can be drawn about its Pareto-optimality, i.e., Lemma 3 is, in general, not true.
The following lemma and proposition provide us with sufficient conditions under which $P_T$ and $P_{ST}$ describe the same SE-EE Pareto front as a function of $\mu$ and $w$, respectively.
Lemma 7: Define $w_l = (1 + {\rm EE}_o/{\rm EE}(\xi^{(m)}))^{-1} \in (0, 1)$ and $w_u = (1 + {\rm SE}(\xi^{(o)})/{\rm SE}_m)^{-1} \in (0, 1)$. The unique solution, $\xi^{(w)}$, of $P_{ST}$ in (7) is: i) $\xi^{(w)} = \xi^{(m)}$ if and only if $w \le w_l$, ii) $\xi^{(w)} = \xi^{(o)}$ if and only if $w \ge w_u$, and iii) the unique solution of the equation $w F(\xi^{(w)}) = (1 - w) G(\xi^{(w)})$ in $\xi^{(w)} \in \Omega^{(\max)}$ if and only if $w \in (w_l, w_u)$.
Proof: In $\Omega^{(\max)} = [\xi^{(o)}, \xi^{(\max)}]$, the SE and EE are monotonically increasing and decreasing, respectively. For any $w \in (0, 1)$, three cases are possible. i) If $w F(\xi^{(\max)}) \le (1 - w) G(\xi^{(\max)})$, i.e., the maximum of $w F(\xi)$ in $\xi \in \Omega^{(\max)}$ is less than the minimum of $(1 - w) G(\xi)$ in $\xi \in \Omega^{(\max)}$, we have $\min\{w F(\xi), (1 - w) G(\xi)\} = F(\xi)$. This occurs if $w \le w_l$. ii) If $(1 - w) G(\xi^{(o)}) \le w F(\xi^{(o)})$, i.e., the maximum of $(1 - w) G(\xi)$ in $\xi \in \Omega^{(\max)}$ is less than the minimum of $w F(\xi)$ in $\xi \in \Omega^{(\max)}$, we have $\min\{w F(\xi), (1 - w) G(\xi)\} = G(\xi)$. This occurs if $w \ge w_u$. iii) If $w F(\xi^{(o)}) < (1 - w) G(\xi^{(o)})$ and $w F(\xi^{(\max)}) > (1 - w) G(\xi^{(\max)})$, the functions $w F(\xi)$ and $(1 - w) G(\xi)$ cross each other exactly once in $\Omega^{(\max)}$, since the SE and EE are monotonically increasing and decreasing, respectively, in $\Omega^{(\max)}$. By definition of max-min optimization, this unique crossing point is the solution of $P_{ST}$ in (7). By definition, ${\rm EE}_o/{\rm EE}(\xi^{(m)}) > 1$, ${\rm SE}(\xi^{(o)})/{\rm SE}_m < 1$. So, $w_l < w_u$.
Proposition 3: Let $\xi^{(w)}$ be the unique solution of $P_{ST}$ in (7) according to Lemma 7. The pairs $(f(\xi^{(w)}), g(\xi^{(w)}))$ obtained by varying $w$ in $[w_l, w_u]$ describe the same SE-EE Pareto front as $P_T$ in (6). Given $w$, $(f(\xi^{(w)}), g(\xi^{(w)}))$ is obtained from $P_T$ in (6) by choosing $1/\mu = 1 + (1/w - \varphi_+^{(w)})(1/\varphi_-^{(w)})$, where $\varphi_\pm^{(w)} = 1 \pm 1/G(\xi^{(w)})$.
Proof: From Lemma 5 and Lemma 7, $P_T$ and $P_{ST}$ yield the same $(f(\xi^{(w)}), g(\xi^{(w)})) = (f(\xi^{(\mu)}), g(\xi^{(\mu)}))$ pairs if and only if the two equations $\mu F(\xi^{(\mu)}) = (1 - \mu) G(\xi^{(\mu)})$ and $w F(\xi^{(w)}) = (1 - w) G(\xi^{(w)})$ are simultaneously satisfied. By imposing this condition, we obtain $1/\mu = 1 + (1/w - \varphi_+^{(w)})(1/\varphi_-^{(w)})$. From this latter formula, we evince that: i) $\mu = 1$ if and only if $w = w_l$, ii) $\mu = 0$ if and only if $w = w_u$, and iii) $\mu \in (0, 1)$ if and only if $w \in (w_l, w_u)$, since $\mu$ decreases monotonically as $w$ increases.
Compared with $P_T$ in (6), the advantage of $P_{ST}$ in (7) is that, with the exception of the extreme left and right points of the SE-EE Pareto front that are known, the other points are the unique solution of $w F(\xi^{(w)}) = (1 - w) G(\xi^{(w)})$ for $w \in (w_l, w_u)$ and $\xi^{(w)} \in \Omega^{(\max)}$. This is the fundamental result that allows us to compute an explicit analytical formulation of the SE-EE Pareto front, as proved in the next section.

IV. SE-EE PARETO FRONT
The following sections provide an explicit formulation of the SE-EE Pareto front for $\xi = P_{\rm tx}$ and $\xi = \lambda_{\rm BS}$. We prove, in particular, that the SE-EE Pareto front can be computed by knowing only $\xi^{(m)} = \xi^{(\max)}$ and $\xi^{(o)}$ available from [10].

A. Case Study $\xi = P_{\rm tx}$
Lemma 8: Let $\xi = P_{\rm tx}$ and $w \in [w_l, w_u]$, where $w_l$ and $w_u$ are defined in Lemma 7. Define $P_{\rm tx}^{(o)} = \xi^{(o)}$. The unique solution, $\xi^{(w)} = P_{\rm tx}^{(w)}$, of $P_{ST}$ in (7) is $P_{\rm tx}^{(w)} = P_{\rm tx}^{(o)}$ if $w = w_u$ and $P_{\rm tx}^{(w)} = P_{\rm tx}^{(\max)}$ if $w = w_l$. If $w \in (w_l, w_u)$, it is:

$$P_{\rm tx}^{(w)} = \frac{(1/w - 1)({\rm SE}_m/{\rm EE}_o) - T}{\lambda_{\rm BS} L(\lambda_{\rm MT}/\lambda_{\rm BS})} \in \left(P_{\rm tx}^{(o)}, P_{\rm tx}^{(\max)}\right) \tag{8}$$

where $T = \lambda_{\rm MT} P_c + \lambda_{\rm BS} P_i (1 - L(\lambda_{\rm MT}/\lambda_{\rm BS}))$.
Proof: The cases $w = w_l$ and $w = w_u$ follow from Lemma 7. As for $w \in (w_l, w_u)$, (8) follows by inserting (1) and (2) in $w F(\xi^{(w)}) = (1 - w) G(\xi^{(w)})$. Also, $P_{\rm tx}^{(w)} \in [P_{\rm tx}^{(o)}, P_{\rm tx}^{(\max)}]$ by virtue of the continuity of the Pareto front (Lemma 6).
Remark 7: From (8), the following remarks can be made: i) $P_{\rm tx}^{(w)}$ is given in an analytical form. This is not possible, in general, from $P_T$ in (6); and ii) the functional relation between $P_{\rm tx}^{(w)}$ and $\lambda_{\rm BS}$ is different compared with the case study where the coverage probability is maximized. In, e.g., the highly-loaded regime, i.e., $L(\lambda_{\rm MT}/\lambda_{\rm BS}) \approx 1$, (8) yields $P_{\rm tx}^{(w)} \propto ({\rm SE}_m/{\rm EE}_o)/\lambda_{\rm BS}$, where the ratio ${\rm SE}_m/{\rm EE}_o = {\rm SE}_m(\lambda_{\rm BS})/{\rm EE}_o(\lambda_{\rm BS})$ implicitly depends on $\lambda_{\rm BS}$. If the coverage probability is maximized, on the other hand, the functional relation between transmit power and density of the BS is $P_{\rm tx} \propto \lambda_{\rm BS}^{-\beta/2}$, where $\beta$ is the path-loss slope [10].
Theorem 1: As a function of $P_{\rm tx}$, the SE-EE Pareto front can be obtained from (2) by setting $P_{\rm tx}$ as follows:

$$P_{\rm tx} = \left( -\frac{\eta^{2/\beta}}{S(\lambda_{\rm BS})} \ln\left( 1 - \frac{S(\lambda_{\rm MT}/\lambda_{\rm BS})\,{\rm SE}}{L(\lambda_{\rm BS})} \right) \right)^{\beta/2} \tag{9}$$

where $S(\lambda_{\rm MT}/\lambda_{\rm BS}) = 1 + \Upsilon L(\lambda_{\rm MT}/\lambda_{\rm BS})$, $S(\lambda_{\rm BS}) = \pi \lambda_{\rm BS} S(\lambda_{\rm MT}/\lambda_{\rm BS})$, $L(\lambda_{\rm BS}) = {\rm BW}\, \rho_D \lambda_{\rm BS} L(\lambda_{\rm MT}/\lambda_{\rm BS})$, and SE lies in the range ${\rm SE} \in [{\rm SE}(\xi^{(o)}), {\rm SE}_m]$.
Proof: It follows by inserting $P_{\rm tx}^{(w)}$ in (8) into (1) and (2), and by expressing $w$ as a function of the SE from (1). The range of values ${\rm SE} \in [{\rm SE}(\xi^{(o)}), {\rm SE}_m]$ follows by virtue of the continuity of the Pareto front (as stated in Lemma 6).
In conclusion, the SE-EE Pareto front is obtained by inserting (9) into (2) and by plotting the curve for ${\rm SE} \in [{\rm SE}(\xi^{(o)}), {\rm SE}_m]$, which is decreasing in SE (Lemma 6).
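The weights of Lemma 7 and the closed form (8) of Lemma 8 lend themselves to a short numerical sketch. The Python below is illustrative only: the function and argument names are hypothetical, the load factor L(λMT/λBS) is passed in as a plain number, and the helper `grid_power` assumes the power expression implied by (8), namely λBS L Ptx + T, rather than a definition quoted from the text.

```python
def tchebycheff_weights(se_m, ee_o, se_at_o, ee_at_m):
    """Lemma 7: extreme weights w_l and w_u.

    se_m    : maximum SE, SE(xi^(m))
    ee_o    : maximum EE, EE(xi^(o))
    se_at_o : SE evaluated at the EE-optimal point xi^(o)
    ee_at_m : EE evaluated at the SE-optimal point xi^(m)
    """
    w_l = 1.0 / (1.0 + ee_o / ee_at_m)
    w_u = 1.0 / (1.0 + se_at_o / se_m)
    return w_l, w_u


def ptx_pareto(w, se_m, ee_o, lam_bs, lam_mt, p_c, p_i, load):
    """Eq. (8): Pareto-optimal transmit power for w in (w_l, w_u).

    `load` stands for L(lam_mt / lam_bs) and is supplied by the caller.
    """
    t = lam_mt * p_c + lam_bs * p_i * (1.0 - load)
    return ((1.0 / w - 1.0) * (se_m / ee_o) - t) / (lam_bs * load)


def grid_power(p_tx, lam_bs, lam_mt, p_c, p_i, load):
    """ASSUMPTION: power term implied by (8), i.e. lam_bs * L * P_tx + T."""
    t = lam_mt * p_c + lam_bs * p_i * (1.0 - load)
    return lam_bs * load * p_tx + t
```

A quick consistency check is that the power obtained from (8), once inserted back into `grid_power`, returns (1/w − 1)(SEm/EEo), which is the balance condition wF = (1 − w)G written in terms of power.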
DI RENZO et al.: SPECTRAL-EE PARETO FRONT IN CELLULAR NETWORKS: STOCHASTIC GEOMETRY FRAMEWORK 427
B. Case Study ξ = λBS

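Lemma 9: Let $\xi = \lambda_{\rm BS}$ and $w \in [w_l, w_u]$, where $w_l$ and $w_u$ are defined in Lemma 7. Define $\lambda_{\rm BS}^{(o)} = \xi^{(o)}$. The unique solution, $\xi^{(w)} = \lambda_{\rm BS}^{(w)}$, of $P_{ST}$ in (7) is $\lambda_{\rm BS}^{(w)} = \lambda_{\rm BS}^{(o)}$ if $w = w_u$ and $\lambda_{\rm BS}^{(w)} = \lambda_{\rm BS}^{(\max)}$ if $w = w_l$. If $w \in (w_l, w_u)$, $\lambda_{\rm BS}^{(w)} \in (\lambda_{\rm BS}^{(o)}, \lambda_{\rm BS}^{(\max)})$ is the unique solution of the equation:

$$P_{\rm grid} = P_{\rm grid}\big(\lambda_{\rm BS}^{(w)}\big) = (1/w - 1)({\rm SE}_m/{\rm EE}_o) \tag{10}$$

where $P_{\rm grid} = P_{\rm grid}(\lambda_{\rm BS}^{(w)})$ is defined in Table I.
Proof: The proof is similar to Lemma 8. Also, $\lambda_{\rm BS}^{(w)} \in [\lambda_{\rm BS}^{(o)}, \lambda_{\rm BS}^{(\max)}]$ by virtue of continuity stated in Lemma 6.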
As opposed to $P_{\rm tx}^{(w)}$ in Lemma 8, $\lambda_{\rm BS}^{(w)}$ cannot be formulated, in general, in closed-form. Two exceptions are as follows.
Corollary 1: If $L(\lambda_{\rm MT}/\lambda_{\rm BS}) \approx 1$ (highly-loaded regime), $\lambda_{\rm BS}^{(w)} = ((1/w - 1)({\rm SE}_m/{\rm EE}_o) - \lambda_{\rm MT} P_c)/P_{\rm tx}$. If $L(\lambda_{\rm MT}/\lambda_{\rm BS}) \approx \lambda_{\rm MT}/\lambda_{\rm BS}$ (lightly-loaded regime), $\lambda_{\rm BS}^{(w)} = ((1/w - 1)({\rm SE}_m/{\rm EE}_o) - \lambda_{\rm MT}(P_{\rm tx} + P_c - P_i))/P_i$.
Proof: It follows from Lemma 9 by solving (10).
Theorem 2: As a function of $\lambda_{\rm BS}$, the SE-EE Pareto front is constituted by the pairs $({\rm SE}(\lambda_{\rm BS}^{(w)}), {\rm EE}(\lambda_{\rm BS}^{(w)}))$ for $\lambda_{\rm BS}^{(w)} \in [\lambda_{\rm BS}^{(o)}, \lambda_{\rm BS}^{(\max)}]$, where the SE and EE are given in (1) and (2).
Proof: It directly follows from Lemma 9.
It is worth noting that, even though $\lambda_{\rm BS}^{(w)}$ is, in general, not explicitly available, there is no need to compute it numerically. This is because, from Lemma 9 and by the continuity of the Pareto front, we have proved $\lambda_{\rm BS}^{(w)} \in [\lambda_{\rm BS}^{(o)}, \lambda_{\rm BS}^{(\max)}]$. As opposed to Theorem 1, it is not possible to express the EE as a function of the SE, because it is difficult to write $\lambda_{\rm BS}$ as a function of the SE from (1). Nevertheless, the SE-EE Pareto front is formulated without using any numerical methods.

V. NUMERICAL RESULTS
The findings in Theorems 1, 2 are validated in Fig. 1. Setup: $\lambda_{\rm MT} = 121 \cdot 10^{-6}$ MTs/m², $\gamma_D = \gamma_A = 5$ dB, BW = 20 MHz, $\beta = 3.5$, $P_c = 20$ dBm, $P_i = 10$ dBm, $\kappa$ and $N_0$ are set as in [10]. If $\xi = P_{\rm tx}$, we set $\lambda_{\rm BS} = (\pi 500^2)^{-1}$ BSs/m², $P_{\rm tx}^{(\min)} = 0$ dBm, $P_{\rm tx}^{(\max)} = 43$ dBm. If $\xi = \lambda_{\rm BS}$, we set $P_{\rm tx} = 25$ dBm, $\lambda_{\rm BS}^{(\min)} = (\pi 500^2)^{-1}$ BSs/m², $\lambda_{\rm BS}^{(\max)} = (\pi 5^2)^{-1}$ BSs/m².
The curve “SE-EE tradeoff” follows from (1), (2) for $P_{\rm tx} \in [P_{\rm tx}^{(\min)}, P_{\rm tx}^{(\max)}]$ in Fig. 1(a) and $\lambda_{\rm BS} \in [\lambda_{\rm BS}^{(\min)}, \lambda_{\rm BS}^{(\max)}]$ in Fig. 1(b). The curve “Pareto front (theory)” is computed from Theorems 1, 2. The markers “Pareto front (Monte Carlo)” are obtained by solving $P_T$ in (6) via exhaustive search. The other markers show the left and right extreme values of the Pareto front (Remark 6) and the “ideal point” $({\rm SE}_m, {\rm EE}_o)$ (see footnote 1) [2].
Fig. 1 confirms the findings in Theorems 1, 2: The SE-EE Pareto front is constituted by a subset of points of the SE-EE trade-off curve, i.e., $\xi^{(w)} \in [\xi^{(o)}, \xi^{(\max)}] \subseteq [\xi^{(\min)}, \xi^{(\max)}]$. This is a major finding, since the SE-EE Pareto front is the solution of P while the SE-EE trade-off is obtained without solving any optimization problems. In general, thus, the SE-EE Pareto front and the SE-EE trade-off are different.

Fig. 1. SE-EE Pareto front: Monte Carlo simulations vs. theory.

Footnote 1: The ideal point is achievable if the SE and EE are independent of each other. In general, thus, it is not achievable in non-trivial bi-objective optimization problems. It is reported as a reference point, as suggested in [2].

VI. CONCLUSION
In this letter, we have proved that the SE-EE Pareto front is constituted by a subset of points of the SE-EE trade-off curve, and that the functional dependency between the Pareto-optimal values of $P_{\rm tx}$ and $\lambda_{\rm BS}$ is, in general, not the same as if the coverage is optimized. Potential applications of the SE-EE Pareto front for network optimization are elaborated in [2]. Promising generalizations of this letter include the analysis of non-Poisson spatial models, heterogeneous and ad hoc networks, and multi-objective utility functions.

REFERENCES
[1] A. Zappone and E. Jorswieck, Energy Efficiency in Wireless Networks via Fractional Programming Theory, vol. 11. Boston, MA, USA: Now, Jun. 2015.
[2] K. Miettinen, Nonlinear Multiobjective Optimization. Boston, MA, USA: Kluwer, 1999.
[3] C. Xiong, G. Y. Li, S. Zhang, Y. Chen, and S. Xu, “Energy- and spectral-efficiency tradeoff in downlink OFDMA networks,” IEEE Trans. Wireless Commun., vol. 10, no. 11, pp. 3874–3886, Nov. 2011.
[4] Y. Li, M. Sheng, C. Yang, and X. Wang, “Energy Efficiency and spectral efficiency tradeoff in interference-limited wireless networks,” IEEE Commun. Lett., vol. 17, no. 10, pp. 1924–1927, Oct. 2013.
[5] D. Tsilimantos, J.-M. Gorce, K. Jaffrès-Runser, and H. V. Poor, “Spectral and energy efficiency trade-offs in cellular networks,” IEEE Trans. Wireless Commun., vol. 15, no. 1, pp. 54–66, Jan. 2016.
[6] G. Zhao, S. Chen, L. Zhao, and L. Hanzo, “Joint energy-spectral-efficiency optimization of CoMP and BS deployment in dense large-scale cellular networks,” IEEE Trans. Wireless Commun., vol. 16, no. 7, pp. 4832–4847, Jul. 2017.
[7] A. M. Alam, P. Mary, J.-Y. Baudais, and X. Lagrange, “Asymptotic analysis of area spectral efficiency and energy efficiency in PPP networks with SLNR precoder,” IEEE Trans. Commun., vol. 65, no. 7, pp. 3172–3185, Jul. 2017.
[8] O. Aydin, E. A. Jorswieck, D. Aziz, and A. Zappone, “Energy-spectral efficiency tradeoffs in 5G multi-operator networks with heterogeneous constraints,” IEEE Trans. Wireless Commun., vol. 16, no. 9, pp. 5869–5881, Sep. 2017.
[9] Y. Hao, Q. Ni, H. Li, and S. Hou, “On the energy and spectral efficiency tradeoff in massive MIMO-enabled HetNets with capacity-constrained backhaul links,” IEEE Trans. Commun., vol. 65, no. 11, pp. 4720–4733, Nov. 2017.
[10] M. Di Renzo, A. Zappone, T. T. Lam, and M. Debbah, “System-level modeling and optimization of the energy efficiency in cellular networks: A stochastic geometry framework,” IEEE Trans. Wireless Commun., vol. 17, no. 4, pp. 2539–2556, Apr. 2018.
[11] A. Cambini and L. Martein, Generalized Convexity and Optimization. Heidelberg, Germany: Springer, 2009.
[12] A. Daniilidis, N. Hadjisavvas, and S. Schaible, “Connectedness of the efficient set for three-objective quasiconcave maximization problems,” J. Optim. Theory Appl., vol. 93, no. 3, pp. 517–524, 1997.
Rician K-Factor-Based Analysis of XLOS Service Probability in 5G Outdoor Ultra-Dense Networks

Hatim Chergui, Member, IEEE, Mustapha Benjillali, Senior Member, IEEE, and Mohamed-Slim Alouini, Fellow, IEEE

Abstract: In this letter, we introduce the concept of Rician K-factor-based radio resource and mobility management for fifth generation (5G) ultra-dense networks (UDN), where the information on the gradual visibility between the new radio node B and the user equipment (UE), dubbed X-line-of-sight (XLOS), would be required. We therefore start by presenting the XLOS service probability as a new performance indicator, taking into account both the UE serving and neighbor cells. By relying on a lognormal K-factor model, a parametric expression of the XLOS service probability in a 5G outdoor UDN is derived, where the link between network parameters and the availability of an XLOS condition is established. The obtained formula is given in terms of the multivariate Fox H-function, wherefore we develop a fast graphical processing unit-enabled MATLAB code. Residue theory is then applied to infer the relevant asymptotic behavior and show its practical implications. Finally, numerical results are provided for various network configurations, and underpinned by extensive Monte-Carlo simulations.

Index Terms: 5G, GPU, multivariate Fox H-function, Rician K-factor, UDN, XLOS service probability.

Manuscript received August 25, 2018; accepted October 4, 2018. Date of publication October 9, 2018; date of current version April 9, 2019. The associate editor coordinating the review of this paper and approving it for publication was T. Riihonen. (Corresponding author: Hatim Chergui.)
H. Chergui and M. Benjillali are with the Communication Systems Department, INPT, Rabat 10100, Morocco (e-mail: chergui@ieee.org; benjillali@ieee.org).
M.-S. Alouini is with the Computer, Electrical and Mathematical Science and Engineering Division, King Abdullah University of Science and Technology, Thuwal 23955-6900, Saudi Arabia (e-mail: slim.alouini@kaust.edu.sa).
Digital Object Identifier 10.1109/LWC.2018.2874654

I. INTRODUCTION
THE EMERGENCE of 5G ultra-dense networks [1] will certainly prompt the reshaping of radio resource and mobility management algorithms, wherefore a new set of measured quantities might be required as inputs. In this context, the Rician K-factor can serve as an accurate channel metric to measure the gradual visibility condition of a radio link, termed X-line-of-sight (XLOS) here, and encompassing LOS, obstructed-LOS (OLOS) and non-LOS (NLOS) as discrete regimes, where generally the K-factor grows from NLOS to OLOS to LOS, with $K_{\rm OLOS} < K_{\rm LOS}$ [2]. In localization services for instance, while the availability of a LOS path is quintessential for the classical triangulation-based schemes such as time-of-arrival (TOA) and direction-of-arrival (DOA), the massive multiple-input multiple-output (MIMO)-based space-time processing approaches can deliver very concise localization thanks to the high angular resolution of the large scale antennas, and may therefore operate in the worst OLOS/NLOS conditions, yet at the expense of a higher complexity [3]. To optimize the computational cost, an operator may adopt a hybrid network configuration where, according to a fine-tuned target K-factor threshold, the 5G gNB can switch between the simpler conventional methods and the massive-MIMO ones. On the other hand, in future UDNs with co-located sub-6GHz/mmWave deployment, the imbalance between uplink (UL) and downlink (DL) would urge the adoption of decoupling strategies. In that case, we may set a K-factor threshold to associate the UL to a LOS/OLOS gNB where, thanks to the minimal path-loss, the UE can reduce its transmit power allowing the reduction of UL signal to interference plus noise ratio (SINR) variance, which translates into more efficient and effective UL schedulers and performance gains [4]. Zooming out from the applications, a mathematical characterization of XLOS is yet to be established.
In this letter, we propose the XLOS service probability as a performance indicator, and start by introducing a broader definition of the concept thereof, accommodating the monitoring of both the UE serving and neighbor cells. Under the general framework of 5G large-scale parameters (LSPs) [5], we invoke a lognormal K-factor model for outdoor UEs [6] to conduct a closed-form analysis of the XLOS service probability in a 5G multi-tier heterogeneous network (HetNet). The obtained formula includes the different network parameters such as gNB density, height and antenna beamwidth, and is expressed in terms of the multivariate Fox H-function [7, A.1], wherefore we provide a fast GPU-enabled MATLAB code. Finally, we study the asymptotic behavior highlighting the effect of different network and channel parameters on the availability of LOS conditions.

II. SYSTEM MODEL
Consider an outdoor 2 GHz orthogonal frequency division multiple access (OFDMA)-based 5G [8] N-tier UDN, where each cell class $n$ ($n = 1, \ldots, N$) is modeled as a homogeneous Poisson point process (PPP) $\Phi_n$, and distinguished by its deployment density $\lambda_n$, maximum transmit power per resource element (RE) $P_n$, antenna height $h_n$ and beamwidth $\theta_n$. The corresponding channel presents a large scale fading, with constant path-loss exponent $\nu$ and lognormal shadowing $X_n$ of mean $\mu_n$ and standard deviation $\sigma_n$. Assuming that UE locations follow an independent PPP $\Phi_u$ of density $\lambda_u$, the downlink analysis is performed at a typical UE located at the origin [9].

A. Cell Monitoring Criteria
As we are dealing with an outdoor context, we suppose that the cells of all tiers are open access (including femtocells). We also adopt a reference signal receive power (RSRP)-based cell selection, wherein each UE periodically monitors the collection of the M strongest cells, dubbed here monitoring set, and ends up connecting to the best server. Since UE measurements rely on the long-term frequency-domain post-equalization receive power, small-scale fading variations do not impact cell selection/reselection and are not, therefore, reflected in the actual RSRP that reads

$$P_{x_n} = P_n X_n \|x_n\|^{-\nu}, \tag{1}$$

where $\|x_n\|^{-\nu}$ stands for the standard path-loss between a typical UE and an $n$th-tier BS located at $x_n \in \Phi_n$.
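The RSRP model (1) and the monitoring-set rule of Section II-A can be illustrated with a small simulation. The sketch below is not the authors' code: it truncates each PPP to a disk of finite radius around the typical UE (an assumption made only to keep the example finite), treats the shadowing mean and standard deviation as dB values, and uses hypothetical function names.

```python
import math
import random


def poisson(mean, rng):
    """Poisson sample: count unit-rate exponential arrivals in [0, mean]."""
    n, t = 0, rng.expovariate(1.0)
    while t < mean:
        n += 1
        t += rng.expovariate(1.0)
    return n


def sample_rsrp(tiers, nu, radius, rng):
    """One network realisation; returns a list of (rsrp, tier_index, distance).

    tiers : list of (density, power, mu_db, sigma_db), one entry per tier
    nu    : path-loss exponent
    ASSUMPTION: each PPP is truncated to a disk of the given radius.
    """
    cells = []
    for idx, (density, power, mu_db, sigma_db) in enumerate(tiers):
        for _ in range(poisson(density * math.pi * radius ** 2, rng)):
            # Uniform point in the disk; 1 - random() avoids a zero distance.
            r = radius * math.sqrt(1.0 - rng.random())
            shadow = 10.0 ** (rng.gauss(mu_db, sigma_db) / 10.0)
            cells.append((power * shadow * r ** (-nu), idx, r))
    return cells


def monitoring_set(cells, m):
    """The M strongest cells by RSRP; the first entry is the best server."""
    return sorted(cells, key=lambda c: c[0], reverse=True)[:m]
```

For instance, `monitoring_set(sample_rsrp(tiers, 3.0, 1000.0, random.Random(1)), 3)` returns at most three cells sorted by decreasing RSRP. Because of shadowing and unequal powers, the best server is not necessarily the nearest cell, which is the motivation for the equivalent formulation introduced later in the letter.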
CHERGUI et al.: RICIAN K-FACTOR-BASED ANALYSIS OF XLOS SERVICE PROBABILITY IN 5G OUTDOOR UDN 429
B. K-Factor Model
The K-factor, like all large scale parameters (LSPs), follows a lognormal distribution (see [5] and references therein). Without loss of generality, let us adopt the findings of [6], where we assume that the narrowband K-factor periodically measured by a UE at independent positions can be empirically modeled for the $n$th-tier as,

$$K_{x_n} = K_n \gamma_n \|x_n\|^{-\alpha}, \tag{2}$$

where $\alpha > 0$, $K_n$ is the K-factor intercept defined as

$$K_n = (h_n/h_0)^{\kappa_1} (\theta_n/\theta_0)^{\kappa_2} K_0, \tag{3}$$

with $\kappa_1 > 0$, $\kappa_2 < 0$, $K_0 > 0$, and $\gamma_n$ is an independent lognormal variable, whose decibel value is zero mean with a standard deviation $\sigma_K$. Accurate values of these model parameters can be obtained through a calibration process according to the target environment. New Jersey's measurement campaign in [6], for instance, yields $h_0 = 3$ m, $\theta_0 = 17^\circ$, $\alpha = 0.5$, $\kappa_1 = 0.46$, $\kappa_2 = -0.62$, $K_0 = 10$, and $\sigma_K = 8$ dB. Note that this model involves also a seasonal factor $F_s$ that reflects the vegetation. For the sake of simplicity and without loss of generality, we consider the Summer's dense vegetation case $F_s = 1$.

C. Equivalent Formulation
Since manipulating distances in PPPs is easier, let us transform the RSRP process (1) into a simple unit-power PPP $\tilde{\Phi}_n$, where the strongest power would correspond to the nearest neighbor cell to the typical UE. By invoking the random displacement theorem [9, eq. (1.3.9)], [10, Corollary 3] shows that the two-dimensional (2D) process (1) is equivalent to another 2D process $P_{y_n} = \|y_n\|^{-\nu}$, such that $y_n \in \tilde{\Phi}_n$ with density $\tilde{\lambda}_n = \lambda_n \Omega_n$, where $\Omega_n = P_n^{2/\nu}\, E[X_n^{2/\nu}]$ and the finite lognormal fractional moment $E[X_n^{2/\nu}] = \exp\left[ \frac{\ln 10}{5} \frac{\mu_n}{\nu} + \frac{1}{2} \left( \frac{\ln 10}{5} \frac{\sigma_n}{\nu} \right)^2 \right]$. By means of the mapping theorem [9, eq. (1.3.11)], the K-factor can also be re-expressed as

$$K_{y_n} = K_n \gamma_n \Omega_n^{-\alpha/2} \|y_n\|^{-\alpha}, \quad y_n \in \tilde{\Phi}_n. \tag{4}$$

III. XLOS SERVICE PROBABILITY
XLOS service probability in the vicinity of a UE, $P_{\rm XLOS}$, is defined as the probability that at least one cell in the monitoring set presents a K-factor higher than a threshold, say $K_{\rm th}$, that can be fine-tuned depending on the target service, i.e.,

$$P_{\rm XLOS}(K_{\rm th}) \triangleq \Pr\left[ \bigcup_{m=1}^{M} K_{y_{n_m}} > K_{\rm th},\; \mathbf{n} \in \mathcal{M} \right], \tag{5}$$

where $\mathbf{n} = (n_1, \ldots, n_M)$ and $\mathcal{M} = \{1, \ldots, N\}^M$. In the sequel, we derive a closed-form expression for the XLOS service probability and study its asymptotic behavior.

A. Closed-Form Analysis
Using the total probability theorem as well as the independence between $\gamma_{n_m}$, $m = 1, \ldots, M$, the definition (5) can be rewritten as

$$P_{\rm XLOS}(K_{\rm th}) = 1 - \Pr\left[ \bigcap_{m=1}^{M} K_{y_{n_m}} \le K_{\rm th},\; \mathbf{n} \in \mathcal{M} \right] = 1 - \sum_{\mathbf{n} \in \mathcal{M}} \Pr\left[ y_{n_m} \in \tilde{\Phi}_{n_m},\; m = 1, \ldots, M \right] \int_0^{z_{n_2}} \!\!\cdots \int_0^{z_{n_M}} \!\int_0^{+\infty} \prod_{m=1}^{M} {\rm CDF}_{\gamma_{n_m}}\!\left( \frac{K_{\rm th}\, \Omega_{n_m}^{\alpha/2}\, z_{n_m}}{K_{n_m}} \right) f(z_{n_1}, \ldots, z_{n_M})\, dz_{n_1} \ldots dz_{n_M}, \tag{6}$$

where $z_{n_m} = \|y_{n_m}\|^{\alpha}$, $f(\cdot)$ is the joint probability density function (PDF) whose variables verify $0 \le z_{n_1} \le z_{n_2} \le \ldots \le z_{n_M}$, and CDF stands for the cumulative distribution function. Moreover, the independence between the homogeneous PPPs $\tilde{\Phi}_{n_m}$ as well as the superposition theorem [9, eq. (1.3.3)] imply that the sampling probability $\Pr[y_{n_m} \in \tilde{\Phi}_{n_m},\, m = 1, \ldots, M] = \prod_{m=1}^{M} \rho_{n_m}$, where $\rho_{n_m} = \tilde{\lambda}_{n_m}/\tilde{\lambda}_T$ and $\tilde{\lambda}_T = \sum_{n=1}^{N} \tilde{\lambda}_n$. To further develop (6), let us introduce the following new theorem.
Theorem 1 (Unified Expression for the Product of Lognormal CDFs; see footnote 1): Consider $M$ independent lognormal random variables $\gamma_m$ ($m = 1, \ldots, M$), with mean $\mu_m$ (dB) and standard deviation $\sigma_m$ (dB). A unified expression for the product of their individual CDFs, which is equal to their joint CDF ${\rm CDF}_{\gamma_1, \ldots, \gamma_M}(\gamma_{{\rm th},1}, \ldots, \gamma_{{\rm th},M})$, is given by

$$\prod_{m=1}^{M} {\rm CDF}_{\gamma_m}(\gamma_{{\rm th},m}) = \frac{1}{\pi^{M/2}} \sum_{l=1}^{L} w_l \prod_{m=1}^{M} H_{1,1}^{0,1}\!\left[ \frac{\gamma_{{\rm th},m}}{\omega_{l,m}} \,\middle|\, \begin{matrix} (1, 1) \\ (0, 1) \end{matrix} \right], \tag{7}$$

where $\omega_{l,m} = 10^{(\sqrt{2} \sigma_m u_{l,m} + \mu_m)/10}$ for $l \in \{1, \ldots, L\}$, $w_l$ and $(u_{l,1}, \ldots, u_{l,M})$ are respectively the weight and the $M$ abscissas of the $L$th-order $M$-dimensional Gaussian weight Stroud monomial cubature [12], with $\sum_{l=1}^{L} w_l = \pi^{M/2}$ and $H_{\cdot,\cdot}^{\cdot,\cdot}[\cdot|\cdot]$ stands for the Fox H-function.
Proof: See Appendix A.
Footnote 1: This theorem can be viewed as a generalization of the well-established Gauss-Hermite representations of the lognormal PDF and CDF (see [11]).
On the other hand, an explicit expression of the joint PDF $f(\cdot)$ can be obtained via the following corollary.
Corollary 1 (of Theorem [13, Appendix]): In a multi-tier random network modeled in terms of $N$ independent PPPs $\tilde{\Phi}_n$ ($n = 1, \ldots, N$) with densities $\tilde{\lambda}_n$, let $z_m = r_m^{\alpha}$ ($m = 1, \ldots, M$), such that $r_m$ is the distance of the $m$th neighbor with respect to a certain origin. The joint PDF of $z_1, \ldots, z_M$ unconditionally to $\{\tilde{\Phi}_n\}$ reads

$$f(z_1, \ldots, z_M) = \left( \frac{2 \pi \tilde{\lambda}_T}{\alpha} \right)^{M} e^{-\pi \tilde{\lambda}_T z_M^{2/\alpha}} \prod_{m=1}^{M} z_m^{2/\alpha - 1}, \tag{8}$$

where $\tilde{\lambda}_T = \sum_{n=1}^{N} \tilde{\lambda}_n$.
By making use of the aforementioned sampling probability as well as Theorem 1 and Corollary 1, the XLOS service probability (6) can be rewritten after some algebraic manipulations as,

$$P_{\rm XLOS}(K_{\rm th}) = 1 - \left( \frac{2 \sqrt{\pi}}{\alpha} \right)^{M} \sum_{\mathbf{n} \in \mathcal{M}} \prod_{m=1}^{M} \tilde{\lambda}_{n_m} \sum_{l=1}^{L} w_l \times I_1. \tag{9}$$
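Definition (5) can also be estimated by direct simulation of (1) and (2), which gives a reference against which a closed-form evaluation may be compared. The sketch below is a minimal Monte-Carlo estimator under stated assumptions: each PPP is truncated to a finite disk, the threshold is given in linear scale, an empty monitoring set counts as no XLOS service, and every function name is hypothetical. The default calibration values in `k_intercept` are the New Jersey figures quoted in Section II-B.

```python
import math
import random


def k_intercept(h, theta, h0=3.0, theta0=17.0, k1=0.46, k2=-0.62, k0=10.0):
    """Eq. (3): K-factor intercept for antenna height h (m) and beamwidth theta (deg)."""
    return (h / h0) ** k1 * (theta / theta0) ** k2 * k0


def _poisson(mean, rng):
    """Poisson sample: count unit-rate exponential arrivals in [0, mean]."""
    n, t = 0, rng.expovariate(1.0)
    while t < mean:
        n += 1
        t += rng.expovariate(1.0)
    return n


def best_k_samples(tiers, m, nu, alpha, sigma_k_db, radius, trials, seed=0):
    """Largest K-factor inside the monitoring set, one value per realisation.

    tiers : list of (density, power, mu_db, sigma_db, k_n)
    ASSUMPTION: PPPs truncated to a disk; empty realisations yield 0.0.
    """
    rng = random.Random(seed)
    best = []
    for _ in range(trials):
        cells = []
        for density, power, mu_db, sigma_db, k_n in tiers:
            for _ in range(_poisson(density * math.pi * radius ** 2, rng)):
                r = radius * math.sqrt(1.0 - rng.random())  # distance in (0, radius]
                rsrp = power * 10.0 ** (rng.gauss(mu_db, sigma_db) / 10.0) * r ** (-nu)  # eq. (1)
                k = k_n * 10.0 ** (rng.gauss(0.0, sigma_k_db) / 10.0) * r ** (-alpha)   # eq. (2)
                cells.append((rsrp, k))
        cells.sort(reverse=True)  # strongest RSRP first
        best.append(max((k for _, k in cells[:m]), default=0.0))
    return best


def xlos_probability(best_k, k_th):
    """Empirical version of (5): fraction of realisations with some K > K_th."""
    return sum(1 for k in best_k if k > k_th) / len(best_k)
```

Reusing one list of `best_k` samples for several thresholds keeps the estimated curve non-increasing in the threshold, which is convenient when plotting it against an analytical curve.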
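In (9), the multidimensional integral $I_1$ is expressed as

$$I_1 = \int_0^{z_{n_2}} \!\!\cdots \int_0^{z_{n_M}} \!\int_0^{+\infty} \prod_{m=1}^{M} z_{n_m}^{2/\alpha - 1}\, H_{1,1}^{0,1}\!\left[ \frac{K_{\rm th}\, \Omega_{n_m}^{\alpha/2}\, z_{n_m}}{\omega_{l,m} K_{n_m}} \,\middle|\, \begin{matrix} (1, 1) \\ (0, 1) \end{matrix} \right] e^{-\pi \tilde{\lambda}_T z_{n_M}^{2/\alpha}}\, dz_{n_1} \ldots dz_{n_M}. \tag{10}$$

To derive a closed-form solution for (10), let us recall the representation of the involved Fox H-functions in terms of Mellin-Barnes integrals [7, eq. (1.1.1)], i.e.,

$$H_{1,1}^{0,1}\!\left[ z \,\middle|\, \begin{matrix} (1, 1) \\ (0, 1) \end{matrix} \right] = \frac{1}{2 \pi j} \int_{C_m} \phi(\zeta_m)\, z^{\zeta_m}\, d\zeta_m, \tag{11}$$

where $\phi(\zeta_m) = \Gamma(\zeta_m)/\Gamma(1 + \zeta_m)$, and contours $C_m$ ($m = 1, \ldots, M$) are defined such that ${\rm Re}(\zeta_m) > 0$; the highest pole on the left. Combining (11) with (10) and interchanging the order of the real and contour integrals, which is permissible given the absolute convergence of the involved integrals, we obtain,

$$I_1 = \left( \frac{1}{2 \pi j} \right)^{M} \int_{C_1} \!\!\cdots \int_{C_M} \Psi(\zeta_1, \ldots, \zeta_M) \prod_{m=1}^{M} \phi(\zeta_m) \left( \frac{K_{\rm th}\, \Omega_{n_m}^{\alpha/2}}{\omega_{l,m} K_{n_m}} \right)^{\zeta_m} d\zeta_1 \ldots d\zeta_M, \tag{12}$$

with the multivariate term $\Psi$ given by

$$\Psi(\zeta_1, \ldots, \zeta_M) = \int_0^{z_{n_2}} \!\!\cdots \int_0^{z_{n_M}} \!\int_0^{+\infty} e^{-\pi \tilde{\lambda}_T z_{n_M}^{2/\alpha}} \prod_{m=1}^{M} z_{n_m}^{2/\alpha + \zeta_m - 1}\, dz_{n_1} \ldots dz_{n_M}. \tag{13}$$

Given that $\alpha \in \mathbb{R}^{+*}$ and ${\rm Re}(\zeta_m) > 0$, and using the identity $1/a = \Gamma(a)/\Gamma(1 + a)$, the iterated integrals with respect to $z_{n_1}, \ldots, z_{n_{M-1}}$ in (13) can be successively resolved by induction. The resulting integral relating to $z_{n_M}$ is then obtained using [14, eq. (3.478.1)], which leads to

$$\Psi(\zeta_1, \ldots, \zeta_M) = \frac{\alpha}{2} (\pi \tilde{\lambda}_T)^{-\left(M + \frac{\alpha}{2} \sum_{m=1}^{M} \zeta_m\right)}\, \Gamma\!\left( M + \frac{\alpha}{2} \sum_{m=1}^{M} \zeta_m \right) \prod_{i=1}^{M-1} \frac{\Gamma\!\left( \frac{2i}{\alpha} + \sum_{m=1}^{M} 1_{m \le i}\, \zeta_m \right)}{\Gamma\!\left( 1 + \frac{2i}{\alpha} + \sum_{m=1}^{M} 1_{m \le i}\, \zeta_m \right)}. \tag{14}$$

By plugging (14) into (12), we recognize that integral $I_1$ can be re-expressed in terms of the multivariate Fox H-function [7, A.1] as given by (15), where parameter $\Lambda_{n_m} \triangleq \Omega_{n_m}/\pi \tilde{\lambda}_T$ is encompassing network density, power and shadowing effects:

$$I_1 = \frac{\alpha}{2} (\pi \tilde{\lambda}_T)^{-M}\, H_{M,\,M-1\,:\,1,1\,:\,\ldots\,:\,1,1}^{0,\,M\,:\,0,1\,:\,\ldots\,:\,0,1} \left[ \begin{matrix} \frac{K_{\rm th} \Lambda_{n_1}^{\alpha/2}}{\omega_{l,1} K_{n_1}} \\ \vdots \\ \frac{K_{\rm th} \Lambda_{n_M}^{\alpha/2}}{\omega_{l,M} K_{n_M}} \end{matrix} \,\middle|\, \begin{matrix} \left( 1 - \frac{2i}{\alpha};\, 1_{1 \le i}, \ldots, 1_{M \le i} \right)_{1 \le i \le M-1},\; \left( 1 - M;\, \frac{\alpha}{2}, \ldots, \frac{\alpha}{2} \right) : (1, 1); \ldots; (1, 1) \\ \left( -\frac{2i}{\alpha};\, 1_{1 \le i}, \ldots, 1_{M \le i} \right)_{1 \le i \le M-1} : (0, 1); \ldots; (0, 1) \end{matrix} \right] \tag{15}$$

where the $H$-function indices $0,1$ and $1,1$, the entry $\alpha/2$, and the parameter pairs $(1, 1)$ and $(0, 1)$ are each repeated $M$ times. Finally, a closed-form expression for $P_{\rm XLOS}$ is deduced by substituting (15) in (9).

B. Asymptotic Behavior
TABLE I: XLOS SERVICE PROBABILITY ASYMPTOTIC EXPRESSIONS
As depicted in Table I, the two asymptotic regimes of the ratio $K_{\rm th} \Lambda_{n_m}^{\alpha/2}/\omega_{l,m} K_{n_m}$ (in $W^{\alpha/\nu} m^2$) reflect many practical scenarios, wherefore it is interesting to establish the corresponding XLOS service probability expressions; denoted $\bar{P}_{\rm XLOS}$ in the sequel. Let $H$ stand for the multivariate Fox H-function in (15) where

$$H = \left( \frac{1}{2 \pi j} \right)^{M} \int_{C_1} \!\!\cdots \int_{C_M} F(\zeta_1, \ldots, \zeta_M)\, d\zeta_1 \ldots d\zeta_M. \tag{16}$$

In view of the series representations of the monovariate Fox H-function [15, Th. 1.2] (while noticing the inverted definition of the H-function therein), an asymptotic expression of (15) is obtained as follows.
Low Ratio Regime: Since the integrand $F$ has no poles on the right of the $M$ individual contours in (16), [15, eq. (1.2.23)] implies that $H \approx 0$, and thereby $\bar{P}_{\rm XLOS} = 1$.
High Ratio Regime: By applying [15, eq. (1.2.22)] to the $M$ individual contour integrals, an approximation of $H$ is given in terms of the residues of $F$ as

$$H \approx {\rm Res}[F, (0, \ldots, 0)] + {\rm Res}\left[ F, \left( -\frac{2}{\alpha}, 0, \ldots, 0 \right) \right] \approx \lim_{\zeta_M \to 0} \ldots \lim_{\zeta_1 \to 0} \prod_{m=1}^{M} \zeta_m F(\zeta_1, \ldots, \zeta_M) + \lim_{\zeta_M \to 0} \ldots \lim_{\zeta_2 \to 0} \lim_{\zeta_1 \to -\frac{2}{\alpha}} \left( \zeta_1 + \frac{2}{\alpha} \right) \prod_{m=2}^{M} \zeta_m F(\zeta_1, \ldots, \zeta_M), \tag{17}$$

which evaluates to

$$H \approx \left( \frac{\alpha}{2} \right)^{M-1} \left[ 1 - \left( \frac{K_{\rm th} \Lambda_{n_1}^{\alpha/2}}{\omega_{l,1} K_{n_1}} \right)^{-2/\alpha} \right]. \tag{18}$$

Finally, combining (9), (15) and (18), as well as recalling that $\sum_{l=1}^{L} w_l = \pi^{M/2}$, we obtain after some algebraic manipulations

$$\bar{P}_{\rm XLOS} = \frac{\tilde{\lambda}_T}{\pi^{M/2 - 1}} \sum_{\mathbf{n} \in \mathcal{M}} \frac{1}{\Omega_{n_1}} \prod_{m=1}^{M} \rho_{n_m} \sum_{l=1}^{L} w_l \left( \frac{\omega_{l,1} K_{n_1}}{K_{\rm th}} \right)^{2/\alpha}. \tag{19}$$

IV. NUMERICAL RESULTS AND MATHEMATICAL SOFTWARE
To validate our theoretical findings, we conduct Monte-Carlo simulations for three practical scenarios as depicted in Table II, and we adopt New Jersey's calibration presented in II-B with $\sigma_K = 3$ dB. The analytical expressions are evaluated via a degree-11 Stroud cubature for which $L = (4M^5 - 20M^4 + 140M^3 - 130M^2 + 96M + 15)/15$.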
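The node count of the degree-11 cubature grows quickly with the size M of the monitoring set, which helps explain why a fast evaluation routine matters. The short sketch below evaluates the formula for L given above and, as a rough cost indicator, multiplies it by the number of tier tuples in the sum of (9); the function names are hypothetical and the cost figure is only an operation count implied by (9), not a measured runtime.

```python
def stroud_degree11_points(m):
    """Number of cubature nodes L for an M-dimensional degree-11 rule."""
    if m < 1:
        raise ValueError("M must be a positive integer")
    # The polynomial is a multiple of 15 for every integer M.
    return (4 * m**5 - 20 * m**4 + 140 * m**3 - 130 * m**2 + 96 * m + 15) // 15


def integral_evaluations(n_tiers, m):
    """Count of I_1 evaluations implied by (9): N^M tier tuples times L nodes."""
    return n_tiers**m * stroud_degree11_points(m)


if __name__ == "__main__":
    for m in (1, 2, 3):
        print(m, stroud_degree11_points(m), integral_evaluations(2, m))
```

For a two-tier network, moving from M = 1 to M = 3 raises the count of multivariate H-function evaluations from 14 to 1208 under this reading of (9).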
To that end, we make use of Stenger's tabulations [16] to update the MATLAB code in [17]. Moreover, we introduce in [18] an efficient GPU-oriented MATLAB routine to calculate the multivariate Fox H-function.

TABLE II: NETWORK AND TRANSMISSION SETTINGS

Fig. 1. XLOS probability versus $K_{\rm th}$ for UDN, HetNet (Macro/Femto) and LDN. Path-loss exponent $\nu = 3$ and shadowing mean $\mu_n = 0$ for all tiers $n = 1, \ldots, N$.

Fig. 1 shows that, in a UDN with light shadowing, LOS links are easily established (e.g., $K_{\rm th} = 12$ dB is obtained with probability 1). Conversely, the low density network (LDN) scenario unfolds in NLOS situations with non-negligible probability (e.g., $K < -5$ dB with probability 0.5). By considering the neighboring cells ($M = 2, 3$) in the HetNet (Macro/Femto) case for instance, we remark that a substantial increase of the XLOS probability is achieved only in the non-asymptotic regime. Indeed, a high $K_{\rm th}$ requirement can be fulfilled merely by the serving cell, since the K-factors of neighbor cells become limited by the corresponding path-losses as implied by (2).

V. CONCLUSION
In this letter, we have introduced the XLOS service probability as a new K-factor-based performance indicator, and provided its analytical and asymptotic expressions that unveil the effect of the variation of 5G network and transmission parameters on the gradual visibility condition of radio links. By tweaking a K-factor threshold $K_{\rm th}$, the XLOS metric can be used by network optimization algorithms as the probability of e.g., reconfiguring the uplink in a LOS/OLOS gNB. As a perspective, the adopted K-factor model from [6] can be extended to the vehicular case in future works.

APPENDIX A
PROOF OF THEOREM 1
First, by making a simple variable change, the product of lognormal PDFs $p_{\gamma_m}$, $p = \prod_{m=1}^{M} p_{\gamma_m}$, can be reformulated as

$$p = \frac{1}{\pi^{M/2}} \int_{\mathbb{R}^M} e^{-(u_1^2 + \ldots + u_M^2)}\, Q(u_1, \ldots, u_M)\, du_1 \ldots du_M, \tag{20}$$

where $Q(u_1, \ldots, u_M) = \prod_{m=1}^{M} \delta\left( \gamma_m - 10^{(\sqrt{2} \sigma_m u_m + \mu_m)/10} \right)$. By applying the Gaussian-weight Stroud monomial cubature [12] to (20), and recalling that $\delta(\gamma_m - a) = H_{0,0}^{0,0}\left[ \frac{\gamma_m}{a} \right]$, we get

$$p = \frac{1}{\pi^{M/2}} \sum_{l=1}^{L} w_l \prod_{m=1}^{M} H_{0,0}^{0,0}\left[ \frac{\gamma_m}{\omega_{l,m}} \right], \tag{21}$$

where $w_l$ and $(u_{l,1}, \ldots, u_{l,M})$ are respectively the $l$th weight and abscissas of the $M$-dimensional cubature, and $\omega_{l,m} = 10^{(\sqrt{2} \sigma_m u_{l,m} + \mu_m)/10}$. Finally, by invoking [7, eq. (2.53)], the integration of (21) with respect to $\gamma_m$ from 0 to $\gamma_{{\rm th},m}$ ($m = 1, \ldots, M$) leads to (7).

APPENDIX B
PROOF OF COROLLARY 1
It immediately follows from applying the superposition theorem [9, eq. (1.3.3)] to the equivalent PPP $\tilde{\Phi}_T = \bigcup_{n=1}^{N} \tilde{\Phi}_n$, and performing a PDF transformation to the joint distance distribution of the first $M$ neighbors given by theorem [13, Appendix].

REFERENCES
[1] S. Sun, T. S. Rappaport, R. W. Heath, A. Nix, and S. Rangan, “MIMO for millimeter-wave wireless communications: Beamforming, spatial multiplexing, or both?” IEEE Commun. Mag., vol. 52, no. 12, pp. 110–121, Dec. 2014.
[2] L. Bernadó, T. Zemen, F. Tufvesson, A. F. Molisch, and C. F. Mecklenbräuker, “Time- and frequency-varying K-factor of non-stationary vehicular channels for safety relevant scenarios,” IEEE Trans. Intell. Transp. Syst., vol. 16, no. 2, pp. 1007–1017, Apr. 2015.
[3] N. Garcia, H. Wymeersch, E. G. Larsson, A. M. Haimovich, and M. Coulon, “Direct localization for massive MIMO,” IEEE Trans. Signal Process., vol. 65, no. 10, pp. 2475–2487, May 2017.
[4] H. Elshaer et al., “Decoupled uplink and downlink access in heterogeneous networks,” in 5G Wireless Technologies. London, U.K.: IET Digit. Library, 2017, ch. 8.
[5] “Study on channel model for frequencies from 0.5 to 100 GHz (release 14),” 3GPP, Sophia Antipolis, France, Rep. TR 38.901, May 2017.
[6] L. J. Greenstein, S. S. Ghassemzadeh, V. Erceg, and D. G. Michelson, “Ricean K-factors in narrow-band fixed wireless channels: Theory, experiments, and statistical models,” IEEE Trans. Veh. Technol., vol. 58, no. 8, pp. 4000–4012, Oct. 2009.
[7] A. M. Mathai, R. K. Saxena, and H. J. Haubold, The H-Function: Theory and Applications. New York, NY, USA: Springer, 2010.
[8] NR: Physical Channels and Modulation (Release 15), 3GPP Standard TS 38.211, Sep. 2017.
[9] F. Baccelli and B. Blaszczyszyn, Stochastic Geometry and Wireless Networks, Volume I: Theory. Boston, MA, USA: Now, 2009.
[10] P. Madhusudhanan et al. (2012). Downlink Performance Analysis for a Generalized Shotgun Cellular Systems. [Online]. Available: arxiv.org/abs/1002.3943
[11] F. Yilmaz and M.-S. Alouini, “A novel unified expression for the capacity and bit error probability of wireless communication systems over generalized fading channels,” IEEE Trans. Commun., vol. 60, no. 7, pp. 1862–1876, Jul. 2012.
[12] A. H. Stroud, Approximate Calculation of Multiple Integrals. Englewood Cliffs, NJ, USA: Prentice-Hall, 1971.
[13] H. R. Thompson, “Distribution of distance to Nth neighbour in a population of randomly distributed individuals,” Ecology, vol. 37, no. 2, pp. 391–394, Apr. 1956.
[14] I. S. Gradshteyn and I. M. Ryzhik, Table of Integrals, Series, and Products, 7th ed. Amsterdam, The Netherlands: Academic, 2007.
[15] A. A. Kilbas and M. Saigo, H-Transforms: Theory and Applications (Analytical Methods and Special Functions). Boca Raton, FL, USA: Chapman & Hall, 2004.
[16] F. Stenger, “Tabulation of certain fully symmetric numerical integration formulas of degree 7, 9 and 11,” Math. Comput., vol. 25, no. 116, pp. 935 and S58–S125, Oct. 1971.
[17] J. Burkardt. (2010). Stroud Numerical Integration in M Dimensions. [Online]. Available: people.sc.fsu.edu/∼jburkardt/m_src/stroud/stroud.html
[18] H. Chergui, M. Benjillali, and M.-S. Alouini, GPU-Enabled Multivariate Fox H-Function Code, Zenodo, Geneva, Switzerland, Aug. 2018, doi: 10.5281/zenodo.1400403.
Power Splitting-Based SWIPT Systems With Decoding Cost

Mohsen Abedi, Hamed Masoumi, and Mohammad Javad Emadi

Abstract—We study optimal power splitting-based simultaneous wireless information and power transfer (PS-SWIPT) for two communication systems wherein all receiving nodes are subject to a decoding cost. First, for a point-to-point (PtP) channel, the optimal achievable rate is obtained, and then, for a two-hop half-duplex (HD) decode-and-forward (DF) relay channel, the achievable data rate is maximized. Finally, the performances of the PtP system and the relay channel with DF and amplify-and-forward (AF) relaying schemes are discussed, where we observe that the DF relay channel outperforms AF relaying and the PtP channel when the decoding cost at the relay is sufficiently lower than that at the destination.

Index Terms—PS-SWIPT, PtP, two-hop DF relay, decoding cost, power splitting.

Manuscript received August 28, 2018; accepted October 2, 2018. Date of publication October 9, 2018; date of current version April 9, 2019. The associate editor coordinating the review of this paper and approving it for publication was K. W. Choi. (Corresponding author: Mohammad Javad Emadi.)
The authors are with the Department of Electrical Engineering, Amirkabir University of Technology (Tehran Polytechnic), Tehran 1591634311, Iran (e-mail: mohsenabedi@aut.ac.ir; hamed_masoomy@aut.ac.ir; mj.emadi@aut.ac.ir).
Digital Object Identifier 10.1109/LWC.2018.2874886

I. INTRODUCTION

ONE OF the promising aspects of the evolving fifth generation of wireless networks is to reduce and manage the power consumption at inexpensive energy harvesting (EH) nodes constrained by the amount of battery power, while achieving higher data rates. Thus, SWIPT and cooperative relaying are the two key enabling techniques to prolong the lifetime and to increase the throughput of the system, respectively. By employing SWIPT, each EH node can adaptively manage its received signal for EH and/or information decoding (ID) [1]. Recently, SWIPT-enabled cooperative communication systems have attracted research interest from different aspects. Optimal PS-SWIPT schemes for two-way DF and full-duplex (FD) DF relaying strategies are analyzed in [2] and [3], respectively. Reference [4] studies a two-way massive multi-antenna relay channel wherein the users employ the power splitting technique to harvest energy for data transmission. The bit error rate of amplify-and-forward (AF) SWIPT relaying with a direct link over Nakagami-m channels is derived in [5]. In [6], a SWIPT two-way AF relay network is considered to maximize the secrecy sum-rate subject to attaining a minimum energy requirement at the EH node.

On the other hand, in contrast to conventional communication systems, at low-power EH nodes the power consumption for ID, named the decoding cost, may become non-negligible. Recently, the energy efficiency (EE) of SWIPT in Internet of Things (IoT) systems subject to constant circuit power consumption has been studied in [7] and [8]. Moreover, optimal power management for EH wireless networks with decoding cost is studied in [9]–[11], wherein the decoding cost function (DCF) is assumed to be an increasing convex function of the data rate, which simplifies the optimization problems. To the best of our knowledge, the impact of decoding cost in a SWIPT-enabled cooperative communication system has not been studied in the literature. In this letter, we analyze PtP and two-hop cooperative communication systems wherein the destination (and relay) are PS-SWIPT based nodes with inherent powers and subject to a non-decreasing DCF. From an implementation point of view, increasing the rate does not necessarily mean exponential power consumption for ID. Hence, unlike the existing works, which assume an increasing convex DCF for optimization simplicity, we assume a more general non-decreasing DCF. The main contributions of this letter are twofold. First, by considering the decoding cost at the receiving nodes, the achievable rates are presented; then, for the non-convex optimization problem, we analytically derive closed-form expressions of the optimal data rates. Our numerical evaluations for the PS-SWIPT systems subject to the decoding cost reveal that DF relaying outperforms AF relaying and PtP transmission.

II. POINT TO POINT CHANNEL WITH DECODING COST

A. System Model

Assume a PS-SWIPT PtP transmission where the destination splits the received signal into two parts: one for EH and the other for ID. Since decoding the signal consumes power, which is named the decoding cost, the destination utilizes the harvested energy for ID. So, this node must optimally harvest energy to maximize the achievable data rate. The power required for ID, i.e., the decoding cost, is given by a non-decreasing function $\varphi_d(R)$, where $R$ indicates the achievable data rate. The received signal at the destination is given by
\[ Y_d = h_{sd} X_s + Z_d, \tag{1} \]
where $h_{sd}$ indicates the source-destination channel coefficient. Throughout this letter, it is assumed that the local channel state information (CSI) is available at the destination (and the relay).1 $X_s$ is the transmitted symbol with $E\{|X_s|^2\} = P_s$, where $E\{\cdot\}$ denotes the statistical expectation, and $Z_d \sim \mathcal{CN}(0, N_d)$ is an independent complex Gaussian noise. Thus, the two split signals for EH and ID, denoted by $Y_{\mathrm{EH}_d}$ and $Y_{\mathrm{ID}_d}$, are respectively given by
\[ Y_{\mathrm{EH}_d} = \sqrt{\rho_d}\, Y_d, \qquad Y_{\mathrm{ID}_d} = \sqrt{1 - \rho_d}\, Y_d + Z_{p,d}, \tag{2} \]
where $\rho_d \in [0, 1]$ denotes the power splitting ratio and $Z_{p,d} \sim \mathcal{CN}(0, N_{p,d})$ represents the processing noise at the destination. Thus, the harvested power is
\[ P_{\mathrm{EH}_d} = \eta_d \rho_d \left(|h_{sd}|^2 P_s + N_d\right), \tag{3} \]
where $\eta_d \in [0, 1]$ denotes the energy conversion efficiency.

1 By assuming imperfect CSI, using similar steps as presented in this letter, one can obtain the optimal solutions, and the achievable rate decreases as the channel estimation quality degrades.

B. Optimization Problem

The optimization problem is formulated as
\[ \max_{\rho_d}\; R_t \quad \text{s.t.} \quad R_t = \min\{R_{sd}, R_\varphi\}, \tag{4} \]
ABEDI et al.: PS-SWIPT SYSTEMS WITH DECODING COST 433
where $R_{sd}$ and $R_\varphi$ are the data rate transmitted by the source and the decodable rate at the destination (due to the decoding cost), respectively:
\[ R_{sd} = \log_2\left(1 + \frac{(1 - \rho_d)|h_{sd}|^2 P_s}{(1 - \rho_d) N_d + N_{p,d}}\right), \tag{5} \]
\[ R_\varphi = \varphi_d^{-1}\left(P_{\mathrm{EH}_d} + P_d^0\right). \tag{6} \]
Equation (6) indicates the decoding cost constraint imposed by the destination, where $P_d^0$ denotes the inherent (pre-dedicated) power at the destination.

Theorem 1: The optimal achievable rate is
\[ R_t^* = \min\left\{\xi\left(\Omega^{-1}(a)\right),\; \xi(0)\right\}, \tag{7} \]
where
\[ \Omega(x) = \frac{1}{\eta_d x}\left[\varphi_d\big(\xi(x)\big) - P_d^0\right], \tag{8} \]
\[ \xi(x) = \log_2\left(1 + \frac{(1 - x)|h_{sd}|^2 P_s}{(1 - x) N_d + N_{p,d}}\right), \tag{9} \]
\[ a = |h_{sd}|^2 P_s + N_d. \tag{10} \]

Sketch of Proof: Considering the $R_t$–$\rho_d$ plane, $R_{sd}$ is a decreasing function of $\rho_d$, while $R_\varphi$ is an increasing function of $\rho_d$. As a result, the maximum rate is attained at the intersection of the curves $R_{sd}$ and $R_\varphi$, if the intersection exists. Otherwise, the optimal rate is obtained by intersecting the curve $R_{sd}$ with the $R_t$ axis.

Remark 1: If $P_d^0 \ge \varphi_d[\xi(0)]$, then $R_t^* = \xi(0)$. That is, the destination has sufficient inherent power, thus EH is not required.

Fig. 1. Two-hop SWIPT DF relay system with decoding costs at the relay and the destination.

III. RELAY CHANNEL WITH DECODING COST

A. System Model

A fading two-hop DF relay channel is depicted in Fig. 1. It is assumed that the relay and the destination employ adaptive power splitting to partly harvest energy from their received signals. The relay utilizes the harvested energy for both ID and signal retransmission to the destination. The destination also partly harvests energy from the received signal to be able to perfectly decode the received signal. These nodes must optimally harvest energy for decoding and/or retransmitting the signal to maximize the end-to-end achievable data rate. The powers required for ID at the relay and the destination are denoted by non-decreasing functions $\varphi_i(R)$, where $i \in \{r, d\}$, respectively. The received signals at the nodes are given by
\[ Y_r = h_{sr} X_s + Z_r, \tag{11} \]
\[ Y_d = h_{rd} X_r + Z_d, \tag{12} \]
where $h_{sr}$ and $h_{rd}$ indicate the source-relay and the relay-destination channel coefficients, respectively. $X_s$ and $X_r$ are the symbols transmitted by the source and the relay with $E\{|X_s|^2\} = P_s$ and $E\{|X_r|^2\} = P_r$, where $Z_r \sim \mathcal{CN}(0, N_r)$ and $Z_d \sim \mathcal{CN}(0, N_d)$ are independent complex Gaussian noises at the relay and the destination, respectively. The split signals for EH and ID at the relay and destination are given by
\[ Y_{\mathrm{EH}_i} = \sqrt{\rho_i}\, Y_i, \qquad Y_{\mathrm{ID}_i} = \sqrt{1 - \rho_i}\, Y_i + Z_{p,i}, \qquad i \in \{r, d\}, \tag{13} \]
where $\rho_r, \rho_d \in [0, 1]$ denote the power splitting ratios and $Z_{p,r} \sim \mathcal{CN}(0, N_{p,r})$, $Z_{p,d} \sim \mathcal{CN}(0, N_{p,d})$ represent the processing noise at the relay and the destination, respectively. Thus, the harvested powers at the relay and the destination are
\[ P_{\mathrm{EH}_r} = \eta_r \rho_r \left(|h_{sr}|^2 P_s + N_r\right), \tag{14} \]
\[ P_{\mathrm{EH}_d} = \eta_d \rho_d \left(|h_{rd}|^2 P_r + N_d\right), \tag{15} \]
where $\eta_r, \eta_d \in [0, 1]$ denote the energy conversion efficiencies.

B. Optimization Problem

The optimization problem is formulated as
\[ \max_{\rho_r, \rho_d}\; R_t \quad \text{s.t.} \quad R_t = \tfrac{1}{2} \min\{R_{sr}, R_{rd}, R_\varphi\}, \tag{16} \]
where $R_{sr}$, $R_{rd}$, $R_\varphi$ are the data rates for the source-relay and the relay-destination channels and the possible decodable rate at the destination (due to the decoding cost), respectively:
\[ R_{sr} \le \log_2\left(1 + \frac{(1 - \rho_r)|h_{sr}|^2 P_s}{(1 - \rho_r) N_r + N_{p,r}}\right) =: 2 f_1(\rho_r), \tag{17} \]
\[ R_{rd} \le \log_2\left(1 + \frac{(1 - \rho_d)|h_{rd}|^2 P_r}{(1 - \rho_d) N_d + N_{p,d}}\right) =: 2 f_2(\rho_r, \rho_d, R_t), \tag{18} \]
\[ R_\varphi \le \varphi_d^{-1}\left(P_{\mathrm{EH}_d} + P_d^0\right) =: 2 f_3(\rho_r, \rho_d, R_t), \tag{19} \]
where $P_r = P_{\mathrm{EH}_r} - \varphi_r(R_t) + P_r^0 \ge 0$, in which $P_r^0$ and $P_d^0$ denote the inherent powers at the relay and destination, respectively.

Theorem 2: The optimal achievable rate is
\[ R_t^* = \tfrac{1}{2} \min\left\{\xi\left(\Omega_1^{-1}(a)\right),\; \xi\left(\Omega_2^{-1}(a)\right),\; \xi(0)\right\}, \tag{20} \]
where
\[ \Omega_1(x) = \frac{1}{\eta_r x}\left[\frac{\varphi_d\big(\xi(x)\big) - P_d^0}{\eta_d \psi(x) |h_{rd}|^2} + \varphi_r\big(\tfrac{1}{2}\xi(x)\big) - P_r^0 - \frac{N_d}{|h_{rd}|^2}\right], \tag{21} \]
\[ \Omega_2(x) = \frac{1}{\eta_r x}\left[\frac{(N_d + N_{p,d})\left(2^{\xi(x)} - 1\right)}{|h_{rd}|^2} + \varphi_r\big(\tfrac{1}{2}\xi(x)\big) - P_r^0\right], \tag{22} \]
\[ \psi(x) = 1 - \left[\frac{|h_{rd}|^2}{N_{p,d}}\left(\frac{\eta_r x a - \varphi_r\big(\tfrac{1}{2}\xi(x)\big) + P_r^0}{2^{\xi(x)} - 1} - \frac{N_d}{|h_{rd}|^2}\right)\right]^{-1}, \tag{23} \]
\[ \xi(x) = \log_2\left(1 + \frac{(1 - x)|h_{sr}|^2 P_s}{(1 - x) N_r + N_{p,r}}\right), \tag{24} \]
\[ a = |h_{sr}|^2 P_s + N_r. \tag{25} \]

Sketch of Proof: Equation (16) can be reformulated as
\[ \max_{\rho_r, \rho_d}\; R_t \quad \text{s.t.} \quad R_t \le f_1(\rho_r), \quad R_t \le f_2(\rho_r, \rho_d, R_t), \quad R_t \le f_3(\rho_r, \rho_d, R_t), \tag{26} \]
where $f_1$ is decreasing, while $f_2$ and $f_3$ are increasing in $\rho_r$. The functions $f_2$ and $f_3$ are decreasing and increasing in $\rho_d$, respectively, while both functions are decreasing in $R_t$. Let $\rho_r^*$, $\rho_d^*$ and $R_t^*$ be the optimal solutions of optimization problem (26). Noting that $\rho_r^*, \rho_d^* \neq 1$, the first assumption is that $0 < \rho_r^*, \rho_d^* < 1$. Then, we prove that $R_t^* = f_1(\rho_r^*) = f_2(\rho_r^*, \rho_d^*, R_t^*) = f_3(\rho_r^*, \rho_d^*, R_t^*)$. To this aim, we consider four cases. The first case is that $R_t^* < f_1$. Then, we can increase $\rho_r^*$ by a small enough positive value such that $R_t^* < f_1$, $R_t^* < f_2$ and $R_t^* < f_3$. So, $R_t^*$ can be increased by a small enough value such that all three constraints are still satisfied with inequality, which contradicts the optimality of (26). Thus, $R_t^* = f_1$ is assumed in the next three cases. The second case is $R_t^* < f_2$ and $R_t^* < f_3$. Here, $\rho_r^*$ can be decreased by a small enough value such that all three constraints are satisfied with inequality. Then, $R_t^*$ can be increased in a similar way while keeping the three constraints satisfied with inequality, which contradicts the optimality. The third case is $R_t^* = f_2$ and $R_t^* < f_3$. Then, $\rho_d^*$ can be decreased by a small enough value so that both constraints are satisfied with inequality, which is non-optimal as it is the same as the second case. The fourth case is $R_t^* < f_2$ and $R_t^* = f_3$. The non-optimality of this case can be proven similarly to the third case by increasing $\rho_d^*$ by a small enough value. As a result, all three constraints are satisfied with equality. The second assumption is that $\rho_d^* = 0$ and $0 < \rho_r^* < 1$. Then, taking the same steps as in the first assumption, it can be proven that $R_t^* = f_1$ and $R_t^* = f_2$. As a result, $R_t^* = f_1(\rho_r^*) = f_2(\rho_r^*, \rho_d^*, R_t^*)$. The third assumption is $\rho_r^* = 0$. Here, using a similar argument, $R_t^* = f_1(\rho_r^*)$. The solution for (26) is also unique. In the following, the uniqueness is discussed.

Assuming the $R_t$–$\rho_r$–$\rho_d$ space, we have three surfaces as below.
• $R_t = f_1(\rho_r)$ is decreasing in $\rho_r$ and independent of $\rho_d$.
• Taking the derivative of the implicit equation $R_t = f_2(\rho_r, \rho_d, R_t)$ with respect to $\rho_r$,
\[ \frac{dR_t}{d\rho_r}\left[2 \ln 2 \cdot 2^{2R_t}\big((1 - \rho_d) N_d + N_{p,d}\big) + (1 - \rho_d)|h_{rd}|^2 \varphi_r'(R_t)\right] = (1 - \rho_d)|h_{rd}|^2 \eta_r \left(|h_{sr}|^2 P_s + N_r\right). \tag{27} \]
The coefficient of $dR_t/d\rho_r$ and the right-hand side of (27) are positive. Moreover, by taking the derivative of the implicit equation $R_t = f_2(\rho_r, \rho_d, R_t)$ with respect to $\rho_d$,
\[ \frac{dR_t}{d\rho_d}\left[2 \ln 2 \cdot 2^{2R_t}\big((1 - \rho_d) N_d + N_{p,d}\big) + (1 - \rho_d)|h_{rd}|^2 \varphi_r'(R_t)\right] = -|h_{rd}|^2\left[\eta_r \rho_r \left(P_s |h_{sr}|^2 + N_r\right) - \varphi_r(R_t) + P_r^0\right] + \left(2^{2R_t} - 1\right) N_d. \tag{28} \]
The coefficient of $dR_t/d\rho_d$ is positive, and from (18), the right-hand side of (28) is negative. So, $R_t$ in $R_t = R_{rd}$ is increasing in $\rho_r$ and decreasing in $\rho_d$.
• Similarly, $R_t = f_3(\rho_r, \rho_d, R_t)$ is increasing in $\rho_r$ and $\rho_d$.
In the above equations, the chain rule in Leibniz's notation is employed, i.e., $\frac{d\varphi_i(R_t)}{d\rho_i} = \frac{d\varphi_i(R_t)}{dR_t}\frac{dR_t}{d\rho_i} = \varphi_i'(R_t)\frac{dR_t}{d\rho_i}$.

Thus, the intersection of the surfaces $R_t = f_1(\rho_r)$, $R_t = f_2(\rho_r, \rho_d, R_t)$ and $R_t = f_3(\rho_r, \rho_d, R_t)$ in the first assumption, the intersection of the surfaces $R_t = f_1(\rho_r)$, $R_t = f_2(\rho_r, \rho_d, R_t)$ and the plane $\rho_d = 0$ in the second assumption, and the intersection of the surface $R_t = f_1(\rho_r)$ and the plane $\rho_r = 0$ in the third assumption are each unique.

For an illustration, $\min(R_{sr}, R_{rd}, R_\varphi)$ versus $\rho_r$ and $\rho_d$ is depicted in Fig. 2. Note that the power required for the decoding cost and retransmission at the relay causes a larger $\rho_r^*$.

Fig. 2. Minimum of the three surfaces, $R_t = \min(R_{sr}, R_{rd}, R_\varphi)$ vs. $\rho_r$ and $\rho_d$ for $\varphi_r(R_t) = \varphi_d(R_t) = 10^{-1.5}(2^{R_t} - 1)$ and $P_r^0 = P_d^0 = 0$.

Remark 2: If the following two conditions hold, then $\rho_r^* = 0$:
\[ P_r^0 \ge \frac{(1 - \rho_d^*) N_d + N_{p,d}}{(1 - \rho_d^*)|h_{rd}|^2}\left(2^{\xi(0)} - 1\right) + \varphi_r\big(\tfrac{1}{2}\xi(0)\big), \qquad P_d^0 \ge \varphi_d\big(\xi(0)\big) - \eta_d \rho_d^*\left[|h_{rd}|^2\left(P_r^0 - \varphi_r\big(\tfrac{1}{2}\xi(0)\big)\right) + N_d\right]. \tag{29} \]
Also, for the following inherent power values, we have $\rho_d^* = 0$:
\[ P_r^0 \ge \frac{N_d + N_{p,d}}{|h_{rd}|^2}\left(2^{\xi(\rho_r^*)} - 1\right) + \varphi_r\big(\tfrac{1}{2}\xi(\rho_r^*)\big) + \eta_r \rho_r^* a, \qquad P_d^0 \ge \varphi_d\big(\xi(\rho_r^*)\big). \tag{30} \]

Remark 3: In Remark 2, by letting $\rho_d^* = 0$ in (29) or $\rho_r^* = 0$ in (30), one can obtain the following conditions for $\rho_d^* = \rho_r^* = 0$:
\[ P_r^0 \ge \frac{N_d + N_{p,d}}{|h_{rd}|^2}\left(2^{\xi(0)} - 1\right) + \varphi_r\big(\tfrac{1}{2}\xi(0)\big), \quad \text{and} \quad P_d^0 \ge \varphi_d\big(\xi(0)\big). \tag{31} \]

IV. NUMERICAL RESULTS AND DISCUSSIONS

For the PtP scenario, the distance between the source and the destination is $D = 10$. For the cooperative scenario, we assume that the relay node is placed at distances $d$ and $D - d$ from the source and the destination, respectively, where $1 \le d \le (D - 1)$. The channel between each pair of nodes at a distance $l$ is modeled as $l^{-2}$. We also assume that $P_s = 10$, $\eta_i = 0.9$, $N_i = 1$, $N_{p,i} = 2$, and $\beta_d = 5$. It is presumed that the destination nodes are cheap, low-power sensors such as IoT devices, while the relay nodes can be designed meticulously to have a lower decoding cost, $\beta_r$, with much more inherent power than the low-power equipment. Accordingly, we set $P_d^0 = 2$ and $P_r^0 = 6$. Let the energy efficiency (EE) be $\mathrm{EE} = \frac{R_t}{P_s + P_d^0}$ for the PtP channel, and $\mathrm{EE} = \frac{R_t}{P_s + P_r^0 + P_d^0}$ for the relay one. Fig. 3 depicts the EE versus the achievable rate for the exponential DCF $\varphi_i(R) = \beta_i(2^R - 1)$ for $i \in \{r, d\}$ and $\beta_d = 5$. It is shown that both DF-HD and DF-FD schemes outperform the PtP and AF ones,2 except for the very large value $\beta_r = 100$, for which the performance of the DF relaying schemes dramatically degrades.

2 For better performance comparison, numerical results are also provided for DF-FD and AF relaying schemes. Following the same steps, one can obtain the results for these two schemes.

As mentioned before, increasing the rate does not necessarily cause exponential processing power consumption. Thus,
besides considering the exponential DCF, linear and constant ones are studied in Fig. 4 to analyze the effects of different types of DCFs on data rates for different schemes with fixed hardware quality at the destination, i.e., $\beta_d = 5$. For $i \in \{r, d\}$, let the DCFs be $\varphi_i(R) = \beta_i(2^R - 1)$ for the exponential, $\varphi_i(R_t) = m\beta_i R_t$ for the linear, and $\varphi_i(R_t) = c_i$ for the constant ones, and let $m = 0.5$, $c_d = 2.5$, $c_r = \beta_r$. For the given parameters, all schemes with the linear DCF outperform those with the other cost functions. Note that the superiority of the performance strongly depends on the values of $m$ and $c_i$.

Fig. 5 investigates the joint effects of $\beta_r$ and $\beta_d$ on the optimal achievable rate for different schemes. For fixed $\beta_d$, increasing $\beta_r$ increases the cost of decoding at the relay, and therefore results in a performance degradation only for the DF relaying. Moreover, for fixed $\beta_r$, increasing $\beta_d$ increases the decoding cost at the destination, which causes a performance loss for all schemes.

Note that the previous figures are obtained for the optimum placement of the relay. Fig. 6 addresses the impact of the relay's inherent power and location for $\beta_r = \beta_d$. For small values of $P_r^0$, the scarcity of the relay's inherent power forces this node to move as close as possible towards the source, so it can harvest more energy to support higher data rates. By increasing $P_r^0$, the relay has enough power for decoding and signal retransmission with higher power; nevertheless, the large relay-destination distance is the main obstacle to achieving high data rates. Hence, the relay moves toward the destination. At this place, increasing $P_r^0$ increases the data rate until it saturates due to the limited data rate of the source-relay link, which stems from the limited source power $P_s$ and high attenuation. To alleviate this, owing to the limited $P_s$, the only option for the relay is moving back toward the source to maximize the data rate.

Fig. 3. Energy efficiency vs. data rate.
Fig. 4. $R_t$ versus $\beta_r$ for fixed $\beta_d$.
Fig. 5. Achievable rate versus hardware qualities $\beta_r$, $\beta_d$.
Fig. 6. $R_t$ versus source-relay distance and $P_r^0$.

V. CONCLUSION

The achievable data rates of PS-SWIPT-based point-to-point and two-hop DF relay channels subject to non-decreasing decoding costs at the receiving nodes are derived and optimized. The numerical evaluations presented for the DF/AF relay channel and the PtP system indicate that the two-hop DF relay channel outperforms AF relaying and the PtP channel when the decoding cost at the relay is sufficiently lower than that of the destination, subject to optimal placement of the relay.

REFERENCES

[1] I. Krikidis et al., “Simultaneous wireless information and power transfer in modern communication systems,” IEEE Commun. Mag., vol. 52, no. 11, pp. 104–110, Nov. 2014.
[2] C. Peng et al., “Optimal power splitting in two-way decode-and-forward relay networks,” IEEE Commun. Lett., vol. 21, no. 9, pp. 2009–2012, Sep. 2017.
[3] H. Liu et al., “Power splitting-based SWIPT with decode-and-forward full-duplex relaying,” IEEE Trans. Wireless Commun., vol. 15, no. 11, pp. 7561–7577, Nov. 2016.
[4] X. Wang et al., “Wireless power transfer-based multi-pair two-way relaying with massive antennas,” IEEE Trans. Wireless Commun., vol. 16, no. 11, pp. 7672–7684, Nov. 2017.
[5] Y. Lou et al., “Performance of SWIPT-based differential AF relaying over Nakagami-m fading channels with direct link,” IEEE Wireless Commun. Lett., vol. 7, no. 1, pp. 106–109, Feb. 2018.
[6] Q. Li et al., “Secure relay beamforming for SWIPT in amplify-and-forward two-way relay networks,” IEEE Trans. Veh. Technol., vol. 65, no. 11, pp. 9006–9019, Nov. 2016.
[7] Y. Huang et al., “Energy-efficient SWIPT in IoT distributed antenna systems,” IEEE Internet Things J., vol. 5, no. 4, pp. 2646–2656, Aug. 2018.
[8] J. Tang et al., “Energy efficiency optimization with SWIPT in MIMO broadcast channels for Internet of Things,” IEEE Internet Things J., vol. 5, no. 4, pp. 2605–2619, Aug. 2018.
[9] C. Qin, W. Ni, H. Tian, R. P. Liu, and Y. J. Guo, “Joint beamforming and user selection in multiuser collaborative MIMO SWIPT systems with nonnegligible circuit energy consumption,” IEEE Trans. Veh. Technol., vol. 67, no. 5, pp. 3909–3923, May 2018.
[10] A. Arafa and S. Ulukus, “Optimal policies for wireless networks with energy harvesting transmitters and receivers: Effects of decoding costs,” IEEE J. Sel. Areas Commun., vol. 33, no. 12, pp. 2611–2625, Dec. 2015.
[11] A. Arafa et al., “Energy harvesting two-way channels with decoding and processing costs,” IEEE Trans. Green Commun. Netw., vol. 1, no. 1, pp. 3–16, Mar. 2017.
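As a numerical companion to Theorem 1 and Remark 1 of the letter above, the following minimal sketch evaluates the optimal PtP power splitting ratio for the exponential DCF $\varphi_d(R) = \beta_d(2^R - 1)$ of Section IV. It is only an illustration under stated assumptions: the function names are hypothetical, bisection is used here in place of a closed-form inverse $\Omega^{-1}$, and the channel power gain `gain` (standing for $|h_{sd}|^2$) is set to an arbitrary value that is not taken from the letter.

```python
import math

def xi(x, gain, ps, nd, npd):
    """Rate expression (9) as a function of the power splitting ratio x."""
    return math.log2(1.0 + (1.0 - x) * gain * ps / ((1.0 - x) * nd + npd))

def phi_exp(rate, beta):
    """Exponential decoding cost beta * (2^R - 1), the DCF used in Section IV."""
    return beta * (2.0 ** rate - 1.0)

def optimal_ptp_rate(gain, ps, nd, npd, eta, beta, p0, tol=1e-12):
    """Return (rho_d, R_t) following Theorem 1 and Remark 1 (exponential DCF assumed)."""
    a = gain * ps + nd  # definition (10)
    if p0 >= phi_exp(xi(0.0, gain, ps, nd, npd), beta):
        # Remark 1: the inherent power already covers the decoding cost.
        return 0.0, xi(0.0, gain, ps, nd, npd)

    # Otherwise solve phi_d(xi(rho)) = eta * rho * a + p0, i.e. Omega(rho) = a.
    def gap(rho):
        return eta * rho * a + p0 - phi_exp(xi(rho, gain, ps, nd, npd), beta)

    lo, hi = 0.0, 1.0  # gap(lo) < 0 < gap(hi), and gap is increasing in rho
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gap(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    rho = 0.5 * (lo + hi)
    return rho, xi(rho, gain, ps, nd, npd)

# Illustrative parameters: P_s, N_d, N_p,d, eta_d, beta_d, P_d^0 follow Section IV;
# gain = 1.0 is a hypothetical value chosen so that harvesting is actually needed.
rho_star, rate_star = optimal_ptp_rate(gain=1.0, ps=10.0, nd=1.0, npd=2.0,
                                       eta=0.9, beta=5.0, p0=2.0)
print(f"rho_d* = {rho_star:.4f}, R_t* = {rate_star:.4f} bit/s/Hz")
```

The bisection relies only on the monotonicity used in the sketch of proof ($R_{sd}$ decreasing and $R_\varphi$ increasing in $\rho_d$), so the same routine would apply to any other non-decreasing DCF by replacing `phi_exp`.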
On Iterative Compensation of Clipping Distortion in OFDM Systems

Shansuo Liang, Jun Tong, and Li Ping, Fellow, IEEE

Abstract—We consider an iterative compensation method to treat clipping distortion in orthogonal frequency division multiplexing systems with clipping. A conventional approach is using extrinsic information (EI) from a decoder to estimate and cancel clipping distortion iteratively. In this letter, feedback in the form of a posteriori information (API) from the decoder is investigated. Correlation may be introduced among messages in iterative processing when API is directly used as feedback. We show that carefully choosing a linearization model can suppress such correlation. Simulation results show that the API approach outperforms the EI one.

Index Terms—OFDM, clipping, linearization, iterative compensation, decorrelation.

Manuscript received August 18, 2018; accepted October 2, 2018. Date [...] work was supported by the University Grants Committee of the Hong Kong SAR, China under Project CityU 11280216 and Project CityU 11216817. The associate editor coordinating the review of this paper and approving it for publication was J. Mietzner. (Corresponding author: Shansuo Liang.)
S. Liang and L. Ping are with the Department of Electronic Engineering, City University of Hong Kong, Hong Kong (e-mail: ssliang3-c@my.cityu.edu.hk; eeliping@cityu.edu.hk).
J. Tong is with the School of Electrical, Computer and Telecommunications Engineering, University of Wollongong, Wollongong, NSW 2522, Australia (e-mail: jtong@uow.edu.au).
Digital Object Identifier 10.1109/LWC.2018.2874935

I. INTRODUCTION

HIGH peak-to-average power ratio (PAPR) is a well-known problem in orthogonal frequency division multiplexing (OFDM) systems. Various methods have been investigated to handle the problem [1], [2]. Among them, clipping is a simple way to achieve very low PAPR [1], [3]. Compared to other alternatives, such as selective mapping and tone reservation, clipping incurs no rate loss and has low transmitter implementation complexity [4], [5]. However, clipping causes nonlinear distortion in the transmitted signals. Iterative compensation (IC) has been shown to be effective in treating the clipping distortion at the expense of an increased receiver complexity [6]–[8]. Various symbol detection approaches can be used in iterative processing: hard decision is used in [6], while soft decision is used in [8]. Compared to hard decision, soft decision offers noticeable performance improvement [8].

Correlation among messages is a main problem in iterative processing. The standard treatment is using extrinsic information (EI) from a soft-input soft-output decoder [8], [9]. It has been reported that a posteriori information (API) may outperform EI in some iterative systems [10]. Most related works are experimental. There is still a lack of theoretical treatments for the API approach.

In this letter, we provide a mathematical justification for using API in iterative compensation. We show that correlation in the iterative process can be suppressed by carefully choosing a linearization model. Our results are based on the recent progress of orthogonal approximate message passing (OAMP) [11].

II. ITERATIVE COMPENSATION

A. Transmitter Structure

Fig. 1 (a) illustrates a coded OFDM system with $N$ subcarriers. At the transmitter, information bits are first encoded by a binary encoder (ENC) and then randomly interleaved. Coded bits are mapped to a sequence of $N$ modulated symbols $X = [X_1, X_2, \dots, X_N]^T$ using a constellation $S \equiv \{s_j, j = 1, 2, \dots, 2^B\}$ with average power normalized to one. An inverse discrete Fourier transform (IDFT) is then applied as
\[ x = F^\dagger X, \tag{1} \]
where $F$ is the $N \times N$ unitary discrete Fourier transform (DFT) matrix and $(\cdot)^\dagger$ denotes conjugate transpose. The entries of $x$ are approximately complex Gaussian distributed and thus have high PAPR [12]. Clipping can be employed to reduce PAPR [1]. Specifically, a clipping function is defined by
\[ f(x_i) = \begin{cases} A x_i / |x_i|, & |x_i| > A \\ x_i, & |x_i| \le A \end{cases}, \tag{2} \]
where $A > 0$ is the clipping threshold. For simplicity, we will omit the standard OFDM operations involving adding and stripping off cyclic prefixes. The time-domain received signal $r$ is given by
\[ r = F^\dagger \Lambda F \cdot f(x) + w, \tag{3} \]
where $w \sim \mathcal{CN}(0, \sigma^2 I)$ is a sequence of independent identically distributed (IID) Gaussian noise and $\Lambda = \mathrm{diag}\{H_1, H_2, \dots, H_N\}$ is a diagonal matrix with $H_i$ being the channel coefficient on the $i$-th subcarrier. Applying DFT to $r$, we obtain a frequency-domain signal vector $R$ as
\[ R = \Lambda F \cdot f(F^\dagger X) + W, \tag{4} \]
where $W = Fw$ contains IID Gaussian noise samples.

B. Linearization of the Clipping Function

The clipping function $f(\cdot)$ in Fig. 1 (a) is nonlinear, which makes symbol detection difficult. To circumvent this difficulty, we linearize $f(x)$ as [8]
\[ f(x) = \alpha x + d, \tag{5} \]
where $\alpha$ is a constant scalar and $d = f(x) - \alpha x$ is the clipping distortion. Substituting (5) into (4), we have
\[ R = \Lambda F \cdot (\alpha F^\dagger X + d) + W. \tag{6} \]
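To make the transmitter operations (1) and (2) concrete, the following minimal sketch applies a unitary IDFT to one block of QPSK symbols, clips the result, and compares the PAPR before and after clipping. It is an illustration only: the function names, the block length $N = 64$, the QPSK constellation, and the threshold $A = 1$ are assumptions chosen for the example, not values taken from this letter.

```python
import cmath
import math
import random

def unitary_idft(X):
    """x = F^dagger X for the unitary DFT matrix F, as in (1)."""
    n = len(X)
    return [sum(X[k] * cmath.exp(2j * math.pi * k * m / n) for k in range(n)) / math.sqrt(n)
            for m in range(n)]

def clip(x, a):
    """Clipping function (2): keep the phase, limit the amplitude to a."""
    return [a * v / abs(v) if abs(v) > a else v for v in x]

def papr_db(x):
    """Peak-to-average power ratio of one block, in dB."""
    power = [abs(v) ** 2 for v in x]
    return 10.0 * math.log10(max(power) / (sum(power) / len(power)))

random.seed(0)
N = 64                                   # number of subcarriers (assumed)
qpsk = [complex(i, q) / math.sqrt(2) for i in (-1, 1) for q in (-1, 1)]
X = [random.choice(qpsk) for _ in range(N)]
x = unitary_idft(X)                      # unit average power, by Parseval
A = 1.0                                  # clipping threshold (assumed)
y = clip(x, A)

print(f"PAPR before clipping: {papr_db(x):.2f} dB")
print(f"PAPR after clipping:  {papr_db(y):.2f} dB")
```

Because every sample is scaled by at most the same factor as the peak, clipping in this form can never increase the block PAPR; the price is the distortion $d = f(x) - \alpha x$ that the receiver has to compensate.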
LIANG et al.: ON IC OF CLIPPING DISTORTION IN OFDM SYSTEMS 437
Fig. 1. (a) A coded OFDM system with clipping. (b) Diagram of iterative compensation at the receiver, where ext and APP represent the extrinsic and a posteriori information, respectively.

Defining $D = Fd$ in (6), we have
\[ R = \alpha \Lambda X + \Lambda D + W. \tag{7} \]
We aim at estimating $X$ based on (7). Since $\Lambda$ is diagonal, we can perform symbol-by-symbol detection. The difficulty is that $D$ might in general be correlated with $X$. Such correlation could originate from $d$ in the linearization model (5). We will address this problem in Section II-D.

Fig. 1 (b) shows the proposed receiver structure. The basic idea is estimating $X$ and $d$ iteratively according to (7) and (5), respectively, as detailed next.

C. Iterative Process

Assumption 1 below underpins the proposed iterative receiver shown in Fig. 1 (b).

Assumption 1: The entries of $D$ can be approximated by Gaussian random variables with mean $\bar{D}$ and equal variance $V_D$. Further, the entries of $D$ are independent of those of $X$ and $W$.

Since $D = Fd$, the entries of $D$ are weighted sums of many uncorrelated items. The Gaussian approximation for $D$ in Assumption 1 can be justified by the central limit theorem [12]. The independence assumption will be verified in Section II-D.

The iterative receiver in Fig. 1 (b) involves iterations between the following steps.

Step 1 (Estimating X via (7)): Since $\Lambda$ is diagonal, the system model (7) is decoupled into parallel linear channels as
\[ R_n = \alpha H_n X_n + H_n D_n + W_n, \quad n = 1, 2, \dots, N. \tag{8} \]
The modulated symbols $\{X_n\}$ are assumed to be IID random variables taken from the constellation $S = \{s_j\}$ with a priori probabilities $\{\Pr(X_n = s_j), \forall j, n\}$. At the beginning of the iterative process, we set $\{\Pr(X_n = s_j) = 1/2^B, \forall j, n\}$ when there is no decoding feedback. We also set $\bar{D} = 0$, and $V_D$ can be evaluated numerically. More details on the subsequent iterations will be given in Step 2. Based on (8), the a posteriori probabilities (APP) of the modulated symbols can be evaluated by (9) (at the top of the next page).

The a posteriori probabilities $\{\Pr(X_n = s_j | R_n)\}$ are transformed into bit log-likelihood ratios (LLRs) by a demapper (DEM). Then a soft-input soft-output decoder (DEC) takes the LLRs and performs standard APP decoding. The outputs of DEM-DEC are two types of messages that are denoted as extrinsic probabilities $\{\Pr(X_n = s_j)_{\mathrm{EI}}\}$ and a posteriori probabilities $\{\Pr(X_n = s_j)_{\mathrm{API}}\}$, respectively. The operations of DEM-DEC follow standard principles of bit-interleaved coded modulation with iterative decoding (BICM-ID) [13].

Step 2 (Estimating d via (5)): With $\{\Pr(X_n = s_j)_{\mathrm{API}}\}$ from DEM-DEC, the a posteriori mean and variance of $X_n$ are, respectively, given by
\[ \bar{X}_n = \sum_{j=1}^{2^B} s_j \cdot \Pr(X_n = s_j)_{\mathrm{API}}, \tag{10a} \]
\[ V[X_n] = \sum_{j=1}^{2^B} |s_j - \bar{X}_n|^2 \cdot \Pr(X_n = s_j)_{\mathrm{API}}. \tag{10b} \]
Since $x = F^\dagger X$, the a posteriori mean and variance of $x$, denoted as $\bar{x}$ and $V_x$ respectively, are given by
\[ \bar{x} = F^\dagger \bar{X} \quad \text{and} \quad V_x = \frac{1}{N} \sum_{n=1}^{N} V[X_n]. \tag{11} \]
Now we write (5) in a symbol-by-symbol form as
\[ d_n = f(x_n) - \alpha x_n, \quad n = 1, 2, \dots, N. \tag{12} \]
Given the mean and variance of $x$ in (11), we model $\{p(x_n | \bar{x}_n) = \mathcal{CN}(\bar{x}_n, V_x), \forall n\}$ by a Gaussian approximation, which can be justified due to the IDFT operation [12]. Then the conditional mean and variance of each $d_n$ are, respectively, given by
\[ \bar{d}_n = E_{x_n | \bar{x}_n}[d_n] = E_{x_n | \bar{x}_n}[f(x_n)] - \alpha \bar{x}_n, \tag{13a} \]
\[ V[d_n] = E_{x_n | \bar{x}_n}\left[|d_n - \bar{d}_n|^2\right]. \tag{13b} \]
Since $D = Fd$, the mean and variance of $D$ are, respectively, computed as1
\[ \bar{D} = F \bar{d} \quad \text{and} \quad V_D = \frac{1}{N} \sum_{n=1}^{N} V[d_n]. \tag{14} \]

Overall Iterative Process: Steps 1 and 2 above constitute one iteration. In the next iteration, $\bar{D}$ computed in (14) is fed back to (9) as the a priori mean, and $\{\Pr(X_n = s_j)_{\mathrm{EI}}\}$ outputted from the DEM-DEC are used to update $\{\Pr(X_n = s_j)\}$ in (9). The process then continues iteratively (see Fig. 1(b)). Note that (9) in Step 1 relies on the independence in Assumption 1. It does not necessarily hold during the iterative process since $\bar{d}$ is a function of $\bar{x}$, as seen from (13). This issue will be examined carefully in Section II-D.

1 Following the method in [8], $V_D$ can be evaluated using a look-up table method. A two-dimensional table (obtained by the Monte Carlo method) can be built beforehand to characterize the relationship of $V_D$ and $V_x$ for the clipping function $f(\cdot)$ with a specific clipping threshold. The detailed table generation and analysis can be found in [8] and [14].

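\[ \Pr(X_n = s_j | R_n) = \frac{\exp\left(-\dfrac{|R_n - H_n \bar{D}_n - \alpha H_n s_j|^2}{\sigma^2 + |H_n|^2 V_D}\right) \cdot \Pr(X_n = s_j)}{\sum_{j'=1}^{2^B} \exp\left(-\dfrac{|R_n - H_n \bar{D}_n - \alpha H_n s_{j'}|^2}{\sigma^2 + |H_n|^2 V_D}\right) \cdot \Pr(X_n = s_{j'})}, \quad j = 1, 2, \dots, 2^B, \; n = 1, 2, \dots, N. \tag{9} \]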
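The per-subcarrier computations in Step 1 and Step 2 can be sketched as follows. This is a minimal illustration under stated assumptions rather than the receiver of this letter: the function names and all numerical values are hypothetical, and the symbol-wise posterior from (9) is fed directly into (10a) and (10b) as a stand-in for the API probabilities that would normally come from the DEM-DEC.

```python
import math

def app_probs(r, h, d_bar, v_d, alpha, sigma2, constellation, priors):
    """Evaluate (9) for one subcarrier: posterior over the constellation points."""
    eff_var = sigma2 + abs(h) ** 2 * v_d          # noise plus residual distortion
    weights = [math.exp(-abs(r - h * d_bar - alpha * h * s) ** 2 / eff_var) * p
               for s, p in zip(constellation, priors)]
    total = sum(weights)
    return [w / total for w in weights]

def mean_and_variance(constellation, probs):
    """Evaluate the a posteriori mean (10a) and variance (10b) of one symbol."""
    mean = sum(s * p for s, p in zip(constellation, probs))
    var = sum(abs(s - mean) ** 2 * p for s, p in zip(constellation, probs))
    return mean, var

# Hypothetical single-subcarrier example with unit-power QPSK.
qpsk = [complex(i, q) / math.sqrt(2) for i in (1, -1) for q in (1, -1)]
priors = [0.25] * 4                               # no decoding feedback yet
h, alpha, sigma2 = 0.8 + 0.3j, 0.9, 0.1           # assumed channel, gain, noise
d_bar, v_d = 0.0, 0.05                            # initial D-bar = 0, assumed V_D
r = alpha * h * qpsk[0] + 0.05                    # qpsk[0] sent, small perturbation

post = app_probs(r, h, d_bar, v_d, alpha, sigma2, qpsk, priors)
x_mean, x_var = mean_and_variance(qpsk, post)
print("posterior:", [round(p, 4) for p in post])
print("mean:", x_mean, "variance:", round(x_var, 4))
```

For a unit-power constellation the variance in (10b) equals $1 - |\bar{X}_n|^2$, so it shrinks toward zero as the posterior concentrates on one constellation point.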

D. The Correlation Problem

The following three propositions establish the orthogonality between $\bar{D}$ and $\bar{X}$. We will comment on its implication at the end of this sub-section.

Proposition 1: Let $f(x)$ be a continuous function with $x \sim \mathcal{N}(\varepsilon \mu_x, \sigma_x^2)$. We treat $\mu_x$ as a variable and treat $\varepsilon$ and $\sigma_x^2$ as constant values. Then we have
\[ \frac{d\,E[f(x)]}{d\mu_x} = \varepsilon \cdot E\left[\frac{df(x)}{dx}\right]. \tag{15} \]
Proof: Take the derivative of $E[f(x)]$ as
\[ \frac{d\,E[f(x)]}{d\mu_x} = \frac{d}{d\mu_x} \int f(x)\, \mathcal{N}(x; \varepsilon\mu_x, \sigma_x^2)\, dx \tag{16a} \]
\[ = \int f(x)\, \frac{d}{d\mu_x} \mathcal{N}(x; \varepsilon\mu_x, \sigma_x^2)\, dx \tag{16b} \]
\[ = -\varepsilon \int f(x)\, \frac{d}{dx} \mathcal{N}(x; \varepsilon\mu_x, \sigma_x^2)\, dx \tag{16c} \]
\[ = -\varepsilon\, f(x)\, \mathcal{N}(x; \varepsilon\mu_x, \sigma_x^2)\Big|_{-\infty}^{\infty} + \varepsilon \int \mathcal{N}(x; \varepsilon\mu_x, \sigma_x^2)\, df(x) \tag{16d} \]
\[ = \varepsilon \int \frac{df(x)}{dx}\, \mathcal{N}(x; \varepsilon\mu_x, \sigma_x^2)\, dx \tag{16e} \]
\[ = \varepsilon \cdot E\left[\frac{df(x)}{dx}\right]. \tag{16f} \]

Returning to (13), $\bar{d}_n$ is a function of $\bar{x}_n$ if we treat $V_x$ as a parameter. Since $\{x_n\}$ have the same variance $V_x$, we denote $\bar{d} = \eta(\bar{x})$ with $\eta$ being an entry-wise function of $\bar{x}$.

Proposition 2: Let $\alpha = E[f'(x)]$ in (5). Then we have
\[ E\left[\eta'(\bar{x})\right] = 0, \tag{17} \]
where the expectation is taken on the distribution of $\bar{x}$.

Proof: For simplicity, we drop the subscript $n$ in (13) and take the derivative of $\eta(\cdot)$ as
\[ \frac{d}{d\bar{x}} \eta(\bar{x}) = \frac{d}{d\bar{x}} \left(E_{x|\bar{x}}[f(x)] - \alpha \bar{x}\right) \tag{18a} \]
\[ = \frac{d}{d\bar{x}} E_{x|\bar{x}}[f(x)] - \alpha \tag{18b} \]
\[ = E_{x|\bar{x}}\left[\frac{d}{dx} f(x)\right] - \alpha, \tag{18c} \]
where in the last step we have applied Proposition 1.2 The expectation in (18) is taken over the distribution of $x$ conditional on fixed $\bar{x}$. Now we consider all entries in $\bar{x}$ and treat $\bar{x}$ as a random variable. We take the expectation of both sides in (18) over the distribution of $\bar{x}$:
\[ E_{\bar{x}}\left[\frac{d}{d\bar{x}} \eta(\bar{x})\right] = E_{\bar{x}}\left[E_{x|\bar{x}}\left[\frac{d}{dx} f(x)\right]\right] - \alpha \tag{19a} \]
\[ = E_{x;\bar{x}}\left[f'(x)\right] - \alpha \tag{19b} \]
\[ = E_{x}\left[f'(x)\right] - \alpha, \tag{19c} \]
where the last step holds since $f(x)$ is not a function of $\bar{x}$. For $\alpha = E[f'(x)]$, we have $E_{\bar{x}}[\eta'(\bar{x})] = 0$ from (19c).

We model $\bar{X}$ in (10) as [15]
\[ \bar{X} = \rho \cdot X + Z, \tag{20} \]
where $\rho = E[X^\dagger \bar{X}] / E[\|X\|^2]$. It can be verified that, with such $\rho$, $Z$ is statistically uncorrelated with $X$. Defining $z = F^\dagger Z$, and according to (11), we have
\[ \bar{x} = \rho \cdot x + z. \tag{21} \]
Since $F^\dagger$ is unitary, $z$ and $x$ remain statistically uncorrelated with each other.

The following approximation is widely used for the treatment of mutually independent signals after DFT or IDFT [12].

Approximation 1: The entries of $x$, $\bar{x}$, $z$, $D$ and $\bar{D}$ are approximately Gaussian distributed.

Since $z$ and $x$ are statistically uncorrelated, Approximation 1 implies that the entries of $z$ are independent of those of $x$.

Proposition 3: Under Approximation 1 and $\alpha = E[f'(x)]$, we have
\[ E\left[(\bar{x} - x)^* \cdot (\bar{d} - d)\right] = 0, \tag{22} \]
where the expectation is taken over the joint distribution of $\bar{x}$ and $x$.

Proof: Based on Proposition 2, we have $E_{\bar{x}}[\eta'(\bar{x})] = 0$ if $\alpha = E[f'(x)]$. Then Proposition 3 is a direct consequence of [11, Definition 2 and Proposition 2].3

We can further show that $E[(\bar{x} - x)^* \cdot (\bar{d} - d)] = 0$ using the generalized Stein's lemma [17], provided that $(\bar{x} - x)$ and $(\bar{x} - x)$ are jointly Gaussian distributed. Rigorous proof of this result requires more dedicated analysis, which is beyond the scope of this letter. Nevertheless, we conjecture that $\bar{d}$ remains statistically orthogonal to $\bar{x}$ during the iterative process. Since $F$ is unitary, $\bar{D}$ and $\bar{X}$ are also statistically orthogonal to each other:
\[ E\left[(\bar{X} - X)^* \cdot (\bar{D} - D)\right] = 0. \tag{23} \]
By Approximation 1, the entries of $(\bar{D} - D)$ can be approximated by zero-mean Gaussian random variables. Combining with (23), we can see that $\bar{D}$ and $\bar{X}$ are independent of each other if $X$ is Gaussian distributed. The Gaussian requirement for $X$ can be approximately ensured using, e.g., superposition coded modulation [14]. So far, we are still unable to provide a rigorous justification for Assumption 1 if $X$ is not Gaussian. Nevertheless, we observed numerically that API works well in general.

2 It can be verified that Proposition 1 holds for complex Gaussian inputs.
3 The proof in [11] is for real Gaussian inputs. A similar proof for complex Gaussian inputs can be found in [16].
LIANG et al.: ON IC OF CLIPPING DISTORTION IN OFDM SYSTEMS 439
Fig. 2. The BER performance of API and EI in different scenarios. The frame length is 4096 and CR = 0 dB. For PDF, we consider a downlink OFDMA system with four users in which only one target user performs decoding. The number of iterations is 5. We observed numerically that more than 5 iterations lead to only marginal improvement.

III. APPLICATIONS AND SIMULATION RESULTS

A. Applications to Various OFDM Scenarios
The proposed method is flexible in terms of the performance-complexity tradeoff. We consider the following strategies (from high to low complexity):
• Full Decoding Feedback (FDF): All information bits are decoded.
• Partial Decoding Feedback (PDF): Only some information bits are decoded. For the remaining symbols, the a posteriori mean in (10) is calculated by replacing {Pr(Xn = sj)API} with {Pr(Xn = sj|Rn)} in (9). PDF is suitable for a multi-user downlink OFDMA system, where a target user does not want to decode other users' messages.
• No Decoding Feedback (NDF): The a posteriori mean (10) is calculated using {Pr(Xn = sj|Rn)} in (9). Then, after iterative processing, the DEC performs decoding only once at the final stage.

B. Simulation Results
Consider an OFDM system with N = 128 sub-carriers. We use a rate-1/2 convolutional code $(23, 35)_8$ and a 16-QAM signaling scheme with Gray mapping. The system rate is R = 2 bits/symbol. Define $E_b/N_0$ in decibels as $E_b/N_0 = 10\log_{10}(E[|f(z)|^2]/(R\sigma^2))$ and the clipping ratio (CR) in decibels as $\mathrm{CR} = 10\log_{10}(A^2/E[|x|^2])$. The transmitted signals go through Rayleigh fading channels with L = 4 taps.
In Fig. 2, we compare the bit error rate (BER) performance of API and EI in the scenarios discussed in Section III-A. "No treatment" in Fig. 2 means directly detecting and decoding while ignoring the clipping effect. We can see that its performance is quite poor. From Fig. 2, we have the following observations:
• For FDF, the advantage of API over EI is small since the EI performance is already close to the ideal unclipped one.
• For PDF, the advantage of API over EI is noticeable since EI, without decoder feedback for the unwanted users, performs poorly in this case.
• For NDF, there is a crossover point between the BER performance curves of API and EI. We observed numerically that this crossover occurs much earlier for the frame error rate (FER) curves. We are still seeking a satisfactory explanation for this observation.
Note that CR = 0 dB in Fig. 2 corresponds to a much lower PAPR than that reported for other alternative techniques [4], [5]. This shows the attractiveness of the proposed method for OFDM systems with very low PAPR requirements.

IV. CONCLUSION
We investigated an iterative API approach to treat clipping distortion in clipped OFDM systems. We showed that correlation among messages can be suppressed by a proper linearization model. Simulation results showed that the API approach outperforms the EI one, especially for PDF.

REFERENCES
[1] G. Wunder, R. F. H. Fischer, H. Boche, S. Litsyn, and J.-S. No, “The PAPR problem in OFDM transmission: New directions for a long-lasting problem,” IEEE Signal Process. Mag., vol. 30, no. 6, pp. 130–144, Nov. 2013.
[2] Y. Rahmatallah and S. Mohan, “Peak-to-average power ratio reduction in OFDM systems: A survey and taxonomy,” IEEE Commun. Surveys Tuts., vol. 15, no. 4, pp. 1567–1592, 4th Quart., 2013.
[3] H. Ochiai and H. Imai, “Performance analysis of deliberately clipped OFDM signals,” IEEE Trans. Commun., vol. 50, no. 1, pp. 89–101, Jan. 2002.
[4] S. Y. Le Goff, S. S. Al-Samahi, B. K. Khoo, C. C. Tsimenidis, and B. S. Sharif, “Selected mapping without side information for PAPR reduction in OFDM,” IEEE Trans. Wireless Commun., vol. 8, no. 7, pp. 3320–3325, Jul. 2009.
[5] A. Behravan and T. Eriksson, “Tone reservation to reduce the envelope fluctuations of multicarrier signals,” IEEE Trans. Wireless Commun., vol. 8, no. 5, pp. 2417–2423, May 2009.
[6] H. Chen and A. M. Haimovich, “Iterative estimation and cancellation of clipping noise for OFDM signals,” IEEE Commun. Lett., vol. 7, no. 7, pp. 305–307, Jul. 2003.
[7] D. Hao and P. Hoeher, “Iterative estimation and cancellation of clipping noise for multi-layer IDMA systems,” in Proc. 7th Int. ITG Conf. Source Channel Coding, Jan. 2008, pp. 1–6.
[8] J. Tong, L. Ping, Z. Zhang, and V. K. Bhargava, “Iterative soft compensation for OFDM systems with clipping and superposition coded modulation,” IEEE Trans. Commun., vol. 58, no. 10, pp. 2861–2870, Oct. 2010.
[9] C. Berrou and A. Glavieux, “Near optimum error correcting coding and decoding: Turbo-codes,” IEEE Trans. Commun., vol. 44, no. 10, pp. 1261–1271, Oct. 1996.
[10] A. Movahed, M. C. Reed, N. Aboutorab, and S. E. Tajbakhsh, “EXIT chart analysis of turbo compressed sensing using message passing dequantization,” IEEE Trans. Signal Process., vol. 64, no. 24, pp. 6600–6612, Dec. 2016.
[11] J. Ma and L. Ping, “Orthogonal AMP,” IEEE Access, vol. 5, pp. 2020–2033, 2017.
[12] T. Araujo and R. Dinis, “On the accuracy of the Gaussian approximation for the evaluation of nonlinear effects in OFDM signals,” IEEE Trans. Commun., vol. 60, no. 2, pp. 346–351, Feb. 2012.
[13] A. Chindapol and J. A. Ritcey, “Design, analysis, and performance evaluation for BICM-ID with square QAM constellations in Rayleigh fading channels,” IEEE J. Sel. Areas Commun., vol. 19, no. 5, pp. 944–957, May 2001.
[14] J. Tong, L. Ping, and X. Ma, “Superposition coded modulation with peak-power limitation,” IEEE Trans. Inf. Theory, vol. 55, no. 6, pp. 2562–2576, Jun. 2009.
[15] X. Wang and H. V. Poor, “Iterative (turbo) soft interference cancellation and decoding for coded CDMA,” IEEE Trans. Commun., vol. 47, no. 7, pp. 1046–1061, Jul. 1999.
[16] K. Takeuchi, “Rigorous dynamics of expectation-propagation-based signal recovery from unitarily invariant measurements,” in Proc. IEEE Int. Symp. Inf. Theory (ISIT), Jun. 2017, pp. 501–505.
[17] C. M. Stein, “Estimation of the mean of a multivariate normal distribution,” Ann. Stat., vol. 9, no. 6, pp. 1135–1151, Nov. 1981.
Automatic Modulation Classification Using Cyclic Correntropy Spectrum in Impulsive Noise
Jitong Ma and Tianshuang Qiu
Abstract—Automatic modulation classification (AMC) plays an important role in many military and civilian communication applications. However, it remains a challenging task to support such AMC mechanisms under impulsive noise environments. Aiming at improving the classification performance in impulsive noise, in this letter, a novel modulation classification method is proposed by using the cyclic correntropy spectrum (CCES). In the proposed method, CCES is introduced into AMC for effectively suppressing impulsive noise. Specifically, it is verified that modulation types can be distinguished through CCES. Then, multi-slices are extracted at different cycle-frequencies from CCES as the original features for AMC. Following the extraction, principal component analysis is applied to these slices to further optimize the original features. Finally, a radial basis function neural network is used as a classifier to perform modulation classification. Monte Carlo simulations demonstrate that the proposed algorithm outperforms other existing schemes in impulsive noise cases, especially with a low generalized signal to noise ratio.

Index Terms—Automatic modulation classification (AMC), cyclic correntropy spectrum (CCES), impulsive noise.

Manuscript received August 29, 2018; accepted October 1, 2018. Date of publication October 9, 2018; date of current version April 9, 2019. This work was supported by the National Natural Science Foundation of China under Grant 61671105, Grant 61139001, Grant 61172108, and Grant 81241059. The associate editor coordinating the review of this paper and approving it for publication was R. C. de Lamare. (Corresponding author: Tianshuang Qiu.)
The authors are with the Faculty of Electronic Information and Electrical Engineering, Dalian University of Technology, Dalian 116024, China (e-mail: mjt@mail.dlut.edu.cn; qiutsh@dlut.edu.cn).
Digital Object Identifier 10.1109/LWC.2018.2875001

I. INTRODUCTION
Automatic modulation classification (AMC) is usually utilized as the intermediate process between signal detection and demodulation to identify the modulation type of an unknown received signal even with little prior knowledge about the transmitted signal [1]. Over the past few years, AMC has gained widespread attention due to its wide applications in many military and civilian fields [2]. Generally, the typical AMC approaches can be divided into two categories: likelihood-based (LB) methods and feature-based (FB) methods [1]. Compared with LB methods, FB methods are more prominent in practical implementations [3]. In FB methods, there are mainly two crucial steps, namely feature extraction and classification. Some discriminating features have been adopted in previous AMC methods. For example, in [4], different slices were taken from the spectral coherence function as the features for AMC. Meanwhile, various classifiers have also been widely studied, such as support vector machines, genetic algorithms and RBF networks [5]. In these AMC methods, the wireless communication channel noise is often assumed to be AWGN (additive white Gaussian noise). However, in real communication scenarios, due to the influence of natural or man-made signal sources, such as multiuser interference, car ignitions, and atmospheric noise, the channel noise is non-Gaussian and has impulsive characteristics [6]. Although many AMC methods have been designed for optimal performance in Gaussian noise, these algorithms typically exhibit significantly worse performance when impulsive noise is present. Effective modulation classification in impulsive noise remains a challenge.
In order to deal with impulsive noise, Chavali and da Silva [7] made use of the composite hypothesis testing approach to estimate the impulsive noise parameters. The performance of this LB method depends on the estimation of unknown parameters, which are difficult to model precisely. In [8], the Kolmogorov-Smirnov test was used to reach a faster decision in impulsive noise. A variational mode decomposition technique was proposed in [9] to identify the modulation types. A low order wavelet packet decomposition technique was proposed in [10] to extract characteristic parameters for modulation classification in impulsive noise. In [11], sparse signal decomposition was employed to remove the impulsive noise. Then cyclostationary features were extracted for AMC. However, at low GSNR, the above methods were unable to perform AMC with high accuracy, and their performance still needs to be further improved.
Recently, a novel signal analysis concept, named cyclic correntropy (CCE), has emerged to suppress impulsive noise in [12]–[14]. CCE combines the cyclostationarity method and the concept of correntropy. It has been demonstrated that CCE can achieve better performance in the suppression of impulsive noise than other conventional methods [14]. Although CCE has advantages in cyclostationary signal processing, its applications have yet to be developed and reinforced. To the best of our knowledge, CCE has only been used in frequency estimation and spectrum sensing and has never been explored for AMC. In this letter, we attempt to extend its application to AMC in impulsive noise so that the advantage of CCE can be exploited to obtain promising performance.
Inspired by CCE, we propose a novel AMC method based on CCES for impulsive noise environments. In this method, at first, a theoretical analysis is carried out to verify that different modulation types have different CCES patterns even in impulsive noise environments. Hence, CCES is introduced into AMC for effectively suppressing impulsive noise. Through CCES, the received signals are transformed into the cyclic-frequency domain. Then, multi-slices at different cycle-frequencies are extracted from CCES as original features of these received signals. In particular, PCA is adopted to optimize these original features. Finally, an RBF neural network is employed as a classifier to perform the modulation classification. The simulation results show that the proposed method yields better performance than other methods
MA AND QIU: AMC USING CCES IN IMPULSIVE NOISE 441
mentioned previously in impulsive noise, especially with a

low GSNR.
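II. IMPULSIVE NOISE AND CCES

A. Impulsive Noise
Impulsiveness is a common feature of the noise in communication channels, especially in complicated electromagnetic environments. The alpha stable distribution model is well suited to describing impulsive noise, and it is usually represented by its characteristic function
\[
\phi(t) =
\begin{cases}
\exp\left\{ jat - \gamma^{\alpha}|t|^{\alpha}\left[1 - j\beta\,\mathrm{sgn}(t)\tan\dfrac{\alpha\pi}{2}\right]\right\}, & \alpha \neq 1 \\[2ex]
\exp\left\{ jat - \gamma|t|\left[1 + j\beta\,\mathrm{sgn}(t)\dfrac{2}{\pi}\log|t|\right]\right\}, & \alpha = 1
\end{cases}
\tag{1}
\]
where
\[
0 < \alpha \le 2, \quad -1 \le \beta \le 1, \quad \gamma > 0, \quad -\infty < a < +\infty \tag{2}
\]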
Here, α is the characteristic exponent. The smaller α is, the more impulsive the distribution becomes. β is the symmetry parameter, γ is the dispersion parameter, and a is the location parameter. When α < 2, the alpha stable distribution no longer has finite second- or higher-order statistics, so the mean square error (MSE) criterion can hardly be used to design AMC algorithms in the traditional way. This seriously degrades the performance of traditional AMC algorithms.

B. Cyclic Correntropy Spectrum
The correntropy function $V_x(t,\tau)$ of a cyclostationary modulated signal x(t) can be expressed as a Fourier series:
\[
V_x(t,\tau) = \sum_{\xi=-\infty}^{+\infty} V_x^{\xi}(\tau)\, e^{j2\pi\xi t} \tag{3}
\]
where $\xi = 1/T_0$ is the cyclic frequency and $T_0$ is the period of $V_x(t,\tau)$. The Fourier series coefficient, $V_x^{\xi}(\tau)$, is defined as the CCE, given by:
\[
V_x^{\xi}(\tau) = \frac{1}{T_0}\int_{-T_0/2}^{T_0/2} V_x(t,\tau)\, e^{-j2\pi\xi t}\, dt \tag{4}
\]
Employing the definition of the correntropy function $V_x(t,\tau)$,
\[
V_x(t,\tau) = E\left[\kappa_\sigma\big(x(t) - x(t+\tau)\big)\right] \tag{5}
\]
the CCE can be rewritten as:
\[
V_x^{\xi}(\tau) = \left\langle \kappa_\sigma\big(x(t) - x(t+\tau)\big)\, e^{-j2\pi\xi t} \right\rangle_t \tag{6}
\]
where the parameter σ is the kernel size and $\kappa_\sigma(x(t) - x(t+\tau))$ is usually a Gaussian kernel, given as:
\[
\kappa_\sigma\big(x(t) - x(t+\tau)\big) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-(x(t)-x(t+\tau))^2/2\sigma^2} \tag{7}
\]
The operator $\langle\cdot\rangle_t$ is defined as follows:
\[
\langle\cdot\rangle_t = \lim_{T\to\infty}\frac{1}{2T}\int_{-T}^{T}(\cdot)\, dt \tag{8}
\]
The cyclic correntropy spectrum (CCES) is defined as the Fourier transform of the CCE, given as:
\[
S_x^{\xi}(f) = \int_{-\infty}^{\infty} V_x^{\xi}(\tau)\, e^{-j2\pi f\tau}\, d\tau \tag{9}
\]

Fig. 1. The proposed AMC algorithm system.

III. PROPOSED ALGORITHM
In this section, we propose a novel AMC algorithm using CCES, as shown in Fig. 1. The details are presented as follows.

A. CCES Analysis and Calculation
In a complicated electromagnetic environment, effective feature extraction is required to eliminate the influence of impulsive noise. In our method, the features extracted from CCES are used to identify modulation types. In order to explain the effectiveness of CCES in feature extraction, we carry out a theoretical analysis taking the received signal s(t) = x(t) + v(t), where v(t) denotes impulsive noise, as an example. For the impulsive noise, substituting v(t) into Eq. (6), we obtain its CCE function $V_v^{\xi}(\tau)$, given as:
\[
V_v^{\xi}(\tau) = \left\langle \kappa_\sigma\big(v(t) - v(t+\tau)\big)\, e^{-j2\pi\xi t} \right\rangle_t \tag{10}
\]
Eq. (10) contains the Gaussian kernel function of the impulsive noise, $\kappa_\sigma(v(t) - v(t+\tau))$. When a significant impulse occurs in v(t) at the moment t, the value of $(v(t) - v(t+\tau))^2$ increases suddenly, and then $\kappa_\sigma(v(t) - v(t+\tau))$ goes to zero according to Eq. (7). Therefore, impulsive noise can be suppressed by the CCE. For the modulated signal x(t), expanding the correntropy function in Eq. (5) as a Taylor series gives:
\[
V_x(t,\tau) = \frac{1}{\sqrt{2\pi}\,\sigma}\sum_{n=0}^{\infty}\frac{(-1)^n}{2^n\sigma^{2n}n!}\, E\left[\big(x(t) - x(t+\tau)\big)^{2n}\right] \tag{11}
\]
The Taylor expansion of correntropy involves all the even-order statistical moments of the modulated signal x(t), and these moments are weighted by $(-1)^n/(2^n\sigma^{2n}n!)$. We use $\zeta_\sigma(t,\tau)$ to denote the collection of the higher-order terms, i.e., those in which the exponent of σ in the denominator is greater than two. Then, the function can be rewritten as:
\[
\begin{aligned}
V_x(t,\tau) &= \frac{1}{\sqrt{2\pi}\,\sigma}\, E\left[1 - \frac{1}{2\sigma^2}\big(x(t) - x(t+\tau)\big)^2 + \zeta_\sigma(t,\tau)\right] \\
&= \frac{1}{\sqrt{2\pi}\,\sigma}\left[1 - \frac{1}{\sigma^2}R_x(t;0) + \frac{1}{\sigma^2}R_x(t;\tau) + \zeta_\sigma(t,\tau)\right]
\end{aligned}
\tag{12}
\]
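The suppression argument that follows Eq. (10) can be checked numerically. The sketch below is only an illustration under stated assumptions: a finite sample average stands in for the limit in Eq. (8), the test tone, the impulse amplitude and all function names are hypothetical, and nothing here is taken from the authors' implementation. It compares how much one large outlier perturbs a cyclic correntropy estimate and a conventional second-order cyclic autocorrelation estimate.

```python
import cmath
import math


def gaussian_kernel(u, sigma):
    """Gaussian kernel of Eq. (7), evaluated at the difference u = x(t) - x(t + tau)."""
    return math.exp(-u * u / (2.0 * sigma ** 2)) / (math.sqrt(2.0 * math.pi) * sigma)


def cyclic_correntropy(x, lag, xi, sigma):
    """Finite-sample stand-in for Eq. (6): mean of kernel * exp(-j*2*pi*xi*n)."""
    n_terms = len(x) - lag
    total = 0j
    for n in range(n_terms):
        total += gaussian_kernel(x[n] - x[n + lag], sigma) * cmath.exp(-2j * math.pi * xi * n)
    return total / n_terms


def cyclic_autocorrelation(x, lag, xi):
    """Second-order counterpart: mean of x[n] * x[n + lag] * exp(-j*2*pi*xi*n)."""
    n_terms = len(x) - lag
    total = 0j
    for n in range(n_terms):
        total += x[n] * x[n + lag] * cmath.exp(-2j * math.pi * xi * n)
    return total / n_terms


# Hypothetical test signal: a real tone at 0.1 cycles/sample with one large outlier.
F0, LAG, SIGMA = 0.1, 1, 1.0
clean = [math.cos(2.0 * math.pi * F0 * n) for n in range(200)]
noisy = list(clean)
noisy[50] += 1000.0  # single impulse standing in for v(t)

XI = 2.0 * F0  # cyclic frequency at twice the tone frequency
delta_cce = abs(cyclic_correntropy(noisy, LAG, XI, SIGMA)
                - cyclic_correntropy(clean, LAG, XI, SIGMA))
delta_caf = abs(cyclic_autocorrelation(noisy, LAG, XI)
                - cyclic_autocorrelation(clean, LAG, XI))
print("change in cyclic correntropy estimate:    ", delta_cce)
print("change in cyclic autocorrelation estimate:", delta_caf)
```

Because each kernel value lies between zero and $1/(\sqrt{2\pi}\sigma)$, the outlier can shift the correntropy estimate by at most two bounded terms, whereas its contribution to the autocorrelation estimate grows in proportion to the impulse amplitude.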
Fig. 2. CCES of different modulation types.

In Eq. (12), $R_x(t;\tau)$ represents the autocorrelation function of the stochastic process x(t). Substituting (12) into (6) gives
\[
V_x^{\xi}(\tau) = \frac{1}{\sqrt{2\pi}\,\sigma}\left[2\pi\delta(t) + \frac{1}{\sigma^2}\big(R_x^{\xi}(\tau) - R_x^{\xi}(0)\big) + \left\langle \zeta_\sigma^{\xi}(t,\tau) \right\rangle_t\right] \tag{13}
\]
where $R_x^{\xi}(\tau)$ is the cyclic autocorrelation function, δ(·) is the unit impulse function, and $\zeta_\sigma^{\xi}(t,\tau) = \zeta_\sigma(t,\tau)\cdot e^{-j2\pi\xi t}$. Substituting (13) into (9), the CCES can be formulated as follows:
\[
S_x^{\xi}(f) = \frac{1}{\sqrt{2\pi}\,\sigma^3}\left[S_a(f) - 2\pi R_x^{\xi}(0)\,\delta(f)\right] + \frac{\sqrt{2\pi}}{\sigma} + \psi(\zeta, e) \tag{14}
\]
where ψ(ζ, e) is the remainder determined by $\zeta_\sigma^{\xi}$, and $S_a(f)$ is the Fourier transform of $R_x^{\xi}(\tau)$, i.e., the cyclic correlation spectrum of x(t). As shown in Eq. (14), the CCES mainly relies on $S_a(f)$, which differs between modulation types. For different modulation types, such as 2ASK, BPSK, QPSK, and 16QAM, the cyclic correlation spectra $S_a(f)$ are different. Thus, according to Eq. (14), there are significant differences in their CCESs, as shown in Fig. 2, especially at cyclic frequencies $\xi = 0, \pm f_c, \pm 2f_c, \ldots$, where $f_c$ is the carrier frequency. Due to these differences, the features extracted from the CCES can be used to identify the modulation types.

B. Feature Extraction and Optimization
Following the calculation of the CCES, ξ-sections extraction and PCA optimization are applied to obtain more discriminating features. The steps of the CCES ξ-sections extraction are as follows:
• Calculate the f-profile of the CCES by taking the maximum value along the cyclic frequency ξ for each f of the CCES, which is formulated as follows:
\[
\mathrm{profile}(f) = \max_{\xi} S_x^{\xi}(f) \tag{15}
\]
According to the relationship between the cyclic frequency and the carrier frequency in the f-profile, that is, $\xi = \pm 2f_c$, the carrier frequency can be obtained.
• Set the cyclic frequency to $\xi = 0, f_c, \pm 2f_c, 4f_c$, respectively; then five sections can be extracted from the CCES as original features to distinguish the modulation type.
After the ξ-sections extraction, the PCA algorithm is adopted to optimize these original features and reduce their dimensionality. This is based on the consideration that the original features contain much duplicated information, which only increases the computational load without improving the effectiveness of AMC. Assume that s is the dimension of an original data set $D_{ori}$ and r is the number of principal components. Then the scattering matrix is given as:
\[
B = \sum_{i}\left(X_i - \bar{X}\right)\left(X_i - \bar{X}\right)^{T} \tag{16}
\]
where $X_i$ denotes the i-th sample and $\bar{X}$ is the mean of the samples. By performing an eigenvalue decomposition of the scattering matrix B, we obtain the conversion matrix $W \in \mathbb{R}^{s\times r}$, which is formed by the eigenvectors corresponding to the largest r eigenvalues of B. Then, a lower dimension data set $D = W_{s\times r}^{T} \times D_{ori}$ can be generated as features for AMC.

C. Classification
After feature extraction and optimization, we employ a multi-layer RBF network as the classifier, which has achieved good performance in AMC [5]. The feature set x is used as the input, and the output ϕ(x) is given by:
\[
\varphi(x) = \sum_{i=1}^{q} w_i\, \rho(x, c_i) \tag{17}
\]
where q is the number of hidden layer neurons, and $c_i$ and $w_i$ are the centre and the weight of the i-th hidden layer neuron, respectively. $\rho(x, c_i)$ denotes the RBF, which is defined in terms of the Euclidean distance between x and $c_i$, given by:
\[
\rho(x, c_i) = e^{-\mu_i \|x - c_i\|^2} \tag{18}
\]
The RBF neural network is trained in two steps. First, a random sampling approach is used to calculate $c_i$. Then the back-propagation algorithm is employed to determine $w_i$ and $\mu_i$.

IV. SIMULATION RESULTS
In this section, the performance of our AMC method in impulsive noise is evaluated using Monte Carlo simulations. Four common modulation types are investigated, including 2ASK, BPSK, QPSK, and 16QAM. Moreover, because the variance is infinite under the alpha stable distribution, the GSNR in dB is defined to measure the strength of impulsive noise as:
\[
\mathrm{GSNR} = 10\lg(P_s/\gamma) \tag{19}
\]
where $P_s$ denotes the power of the modulated signal x(t). When 1 < α < 2, the alpha stable distribution is sufficient to describe the impulsive noise, so the value of α is set in this range. Besides, we compare our method with the existing AMC methods, including the fractional lower-order covariance cyclic stationary statistics based AMC method [8], the low order wavelet packet decomposition based AMC method [10], and the cyclostationary features based AMC method [11]. The best performance of these AMC methods is denoted by “Existing” in the result figures. The comparison experiments are carried out for different GSNRs and characteristic exponents. The results are shown in Fig. 3 and Fig. 4, respectively.
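To make the GSNR definition in Eq. (19) concrete, the following sketch inverts it to obtain γ for a target GSNR and then draws symmetric alpha stable noise samples. It is a minimal illustration under explicit assumptions, not the simulation code used in the letter: the Chambers-Mallows-Stuck sampler is a standard technique swapped in here, only the symmetric case (β = 0, a = 0) is covered, γ is treated as a scale factor so that the characteristic function is $\exp(-(\gamma|t|)^{\alpha})$, and the function names are hypothetical.

```python
import math
import random


def dispersion_for_gsnr(signal_power, gsnr_db):
    """Invert Eq. (19), GSNR = 10*lg(Ps/gamma), to get gamma for a target GSNR in dB."""
    return signal_power / (10.0 ** (gsnr_db / 10.0))


def symmetric_alpha_stable(alpha, gamma, size, rng):
    """Draw symmetric alpha stable samples with the Chambers-Mallows-Stuck method.

    Assumptions: beta = 0, a = 0, 1 < alpha <= 2, and gamma acts as a scale
    factor, i.e. the characteristic function is exp(-(gamma*|t|)**alpha).
    """
    samples = []
    for _ in range(size):
        u = rng.uniform(-math.pi / 2.0, math.pi / 2.0)  # uniform angle
        w = rng.expovariate(1.0)                         # unit-mean exponential
        s = (math.sin(alpha * u) / math.cos(u) ** (1.0 / alpha)
             * (w / math.cos((1.0 - alpha) * u)) ** ((alpha - 1.0) / alpha))
        samples.append(gamma * s)
    return samples


rng = random.Random(0)  # fixed seed so the sketch is repeatable
gamma = dispersion_for_gsnr(signal_power=1.0, gsnr_db=0.0)
noise = symmetric_alpha_stable(alpha=1.8, gamma=gamma, size=2000, rng=rng)
print("gamma:", gamma, "largest |sample|:", max(abs(v) for v in noise))
```

For α = 2 the sampler reduces to a Gaussian with variance $2\gamma^2$, which gives a quick sanity check; for smaller α occasional very large samples appear, which is the impulsive behaviour the GSNR is meant to quantify.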
Fig. 3. Classification performance in different GSNR.

Fig. 4. Classification performance in different α.

A. Performance Comparisons in Different GSNR
In this subsection, we conduct the experiment with the GSNR ranging from -5 dB to 10 dB. In addition, 2000 samples are generated for every modulation type at each GSNR value, and there are 2000 symbols in each sample. Half of the samples are used as the training set for the RBF network and the others are treated as the testing set. The characteristic exponent α is set to 1.8. In Fig. 3(a), the correct recognition percentage is shown for different modulation types. It is apparent from Fig. 3(a) that the proposed algorithm outperforms the existing algorithms under low GSNR conditions. The correct recognition percentage of the proposed algorithm converges to 1 when GSNR > 0 dB, while the existing algorithm requires the GSNR to be at least 4 dB to achieve the same performance. Furthermore, the probability of correct classification (Pcc) for the aforementioned four modulation types is illustrated in Fig. 3(b). It shows that our AMC method performs better than the other compared methods. Thus, our algorithm is more suitable for AMC in low GSNR conditions.

B. Performance Comparisons in Different α
Fig. 4(a) shows the classification performance of the proposed and existing algorithms with α = 1.3. As shown in Fig. 4(a), the proposed method performs better than the other compared methods, especially in low GSNR conditions. In comparison with Fig. 3(a), for both α = 1.8 and α = 1.3, the proposed AMC method achieves higher accuracy than the other compared methods. To further illustrate the performance of the proposed method, the classification accuracies over α at GSNR = 1 dB are shown in Fig. 4(b). Clearly, compared with the other methods, the proposed method has higher accuracy, especially for lower characteristic exponents. Thus, the proposed AMC method can be used to solve the AMC problem in impulsive noise environments.

C. Comparisons of Computational Complexity
Computing the CCES from m symbols requires $O(m^2)$ operations. Computing the optimized features from an s-dimensional data set for each class requires $O(s^2)$ operations. In the classification, only a small calculation effort is required after the RBF network training. Hence, our algorithm requires $O(m^2)$ computations. Compared with the other FB methods, the complexity of our method is higher than that of the method in [10], which is $O(m)$, and is at the same level as that of the methods in [9] and [11], which is $O(m^2)$. The difference between these calculation efforts is not big enough to influence the real-time performance of the computation.

V. CONCLUSION
In this letter, a novel AMC method based on the cyclic correntropy spectrum is proposed. It is verified that modulation types can be distinguished by CCES, and that features extracted from CCES are more effective in AMC, especially in impulsive noise environments. Besides, the PCA algorithm and an RBF neural network are utilized to optimize the features and classify the modulation types, respectively. Simulation results indicate that the proposed method leads to more accurate and robust modulation classification in impulsive noise.

REFERENCES
[1] O. A. Dobre, A. Abdi, Y. Bar-Ness, and W. Su, “Survey of automatic modulation classification techniques: Classical approaches and new trends,” IET Commun., vol. 1, no. 2, pp. 137–156, Apr. 2007.
[2] M. W. Aslam, Z. Zhu, and A. K. Nandi, “Automatic modulation classification using combination of genetic programming and KNN,” IEEE Trans. Wireless Commun., vol. 11, no. 8, pp. 2742–2750, Aug. 2012.
[3] X. Yan, G. Feng, H.-C. Wu, W. Xiang, and Q. Wang, “Innovative robust modulation classification using graph-based cyclic-spectrum analysis,” IEEE Commun. Lett., vol. 21, no. 1, pp. 16–19, Jan. 2017.
[4] A. Fehske, J. Gaeddert, and J. H. Reed, “A new approach to signal classification using spectral correlation and neural networks,” in Proc. 1st IEEE Int. Symp. New Front. Dyn. Spectr. Access Netw. (DySPAN), 2005, pp. 144–150.
[5] J. Li, C. He, J. Chen, and D. Wang, “Automatic digital modulation recognition based on Euclidean distance in hyperspace,” IEICE Trans. Commun., vol. 89, no. 8, pp. 2245–2248, Aug. 2006.
[6] Y. Liu, T. Qiu, and H. Sheng, “Time-difference-of-arrival estimation algorithms for cyclostationary signals in impulsive noise,” Signal Process., vol. 92, no. 9, pp. 2238–2247, Sep. 2012.
[7] V. G. Chavali and C. R. C. M. da Silva, “Classification of digital amplitude-phase modulated signals in time-correlated non-Gaussian channels,” IEEE Trans. Commun., vol. 61, no. 6, pp. 2408–2419, Jun. 2013.
[8] F. Wang and X. Wang, “Fast and robust modulation classification via Kolmogorov–Smirnov test,” IEEE Trans. Commun., vol. 58, no. 8, pp. 2324–2332, Aug. 2010.
[9] T. Dutta, U. Satija, B. Ramkumar, and M. S. Manikandan, “A novel method for automatic modulation classification under non-Gaussian noise based on variational mode decomposition,” in Proc. IEEE 22nd Nat. Conf. Commun. (NCC), Mar. 2016, pp. 1–6.
[10] Y. Hu, M. Liu, C. Cao, and B. Li, “Modulation classification in alpha stable noise,” in Proc. IEEE 13th Int. Conf. Signal Process. (ICSP), Nov. 2016, pp. 1275–1278.
[11] U. Satija, M. Mohanty, and B. Ramkumar, “Cyclostationary features based modulation classification in presence of non Gaussian noise using sparse signal decomposition,” Wireless Pers. Commun., vol. 96, no. 4, pp. 5723–5741, Oct. 2017.
[12] S. Luan, T. Qiu, Y. Zhu, and L. Yu, “Cyclic correntropy and its spectrum in frequency estimation in the presence of impulsive noise,” Signal Process., vol. 120, pp. 503–508, Mar. 2016.
[13] A. I. R. Fontes, J. B. A. Rego, A. de Medeiros Martins, L. F. Q. Silveira, and J. C. Principe, “Cyclostationary correntropy: Definition and applications,” Expert Syst. Appl., vol. 69, pp. 110–117, Mar. 2017.
[14] T. Liu, T. Qiu, and S. Luan, “Cyclic correntropy: Foundations and theories,” IEEE Access, vol. 6, pp. 34659–34669, Jul. 2018.
Vertical and Horizontal Building Entry Loss Measurement in 4.9 GHz Band by Unmanned Aerial Vehicle
Kentaro Saito, Member, IEEE, Qiwei Fan, Nopphon Keerativoranan, and Jun-ichi Takada, Senior Member, IEEE
Abstract—User traffic of mobile wireless communication is BEL characteristics, typically through building windows, were
rapidly increasing in urban areas. Thus, service cell planning investigated by up to the 2 GHz band [4], [5], and 3.5 GHz
becomes an important issue for efficient radio resource usage. band [6], 5 GHz band [7], [8], 8 GHz band [9], 10 GHz
Since users are also inside high buildings in those areas, it is
important to know the building entry loss (BEL) characteristics band [10], and 38 GHz band [11]. The results showed that
from the outside base stations of various locations for the pur- the BEL characteristics approximately follow these models.
pose. In this letter, we measured the penetration loss of buildings However, in most of the literature, only the horizontal or ver-
with large windows in the 4.9 GHz band by using a system with tical domain characteristics were investigated, and combined
an unmanned aerial vehicle. Our contribution is to clarify the characteristics in the vertical-horizontal domain were investi-
vertical-horizontal BEL characteristics through exhaustive mea-
surements. We also extended the COST 231 BEL model for the gated by using a simulation such as a knife-edge diffraction
vertical-horizontal BEL characteristics, and estimated the model model [12] because of the limited variety of measurement con-
parameters from the measurements. The results showed that the ditions. This limitation comes from the physical difficulties of
vertical-horizontal domain BEL characteristics were quite different, but that they were not independent. The proposed model improved the root mean squared error of the BEL prediction by approximately 3 dB. This is expected to be utilized for the improvement of cell planning efficiency.

Index Terms—Building entry loss, outdoor-to-indoor propagation, propagation loss measurement, unmanned aerial vehicle, vertical and horizontal domain loss, window penetration loss.

Manuscript received August 18, 2018; accepted October 3, 2018. Date of
was supported in part by the Fujikura Foundation and in part by the Support Center for Advanced Telecommunications Foundation. The associate editor coordinating the review of this paper and approving it for publication was W. Zhang. (Corresponding author: Kentaro Saito.)
The authors are with the School of Environment and Society, Tokyo Institute of Technology, Tokyo 152-8550, Japan (e-mail: saitouken@tse.ens.titech.ac.jp; fan.q.aa@m.titech.ac.jp; keerativoranan.n.aa@m.titech.ac.jp; takada@ide.titech.ac.jp).
Digital Object Identifier 10.1109/LWC.2018.2875003

I. INTRODUCTION
Owing to the widespread use of various application services such as video streaming and cloud services, user traffic in cellular networks is rapidly increasing. Thus, for efficient radio resource usage, service cell planning is an important issue. In particular, cell planning becomes complex in urban areas because many users are also inside high buildings, which are sometimes higher than the base stations (BSs). Therefore, three-dimensional (3D) cell planning is considered in those areas [1].
Knowledge of the building entry loss (BEL) characteristics of radio waves is important for this purpose. It is known that the BEL characteristics change according to the incident angle of the radio wave to the building. The COST 231 BEL model [2] and ITU-R P.2109 model [3] specified the azimuth and elevation angle characteristics, respectively, and various radio measurements were conducted for the validation. The

having to place the BS antenna outside.
We thought that exhaustive BEL measurements are needed to construct the vertical-horizontal domain BEL model to enable 3D cell planning. Recently, there has been much progress in robotic technologies such as unmanned aerial vehicles (UAVs). This has attracted interest even in the communications research area [13], [14]. We developed a radio measurement system using a multicopter to solve the physical limitation problem of BS placement [15]. The contribution of this letter is to clarify the vertical-horizontal domain penetration loss characteristics of buildings with large windows through exhaustive outdoor-to-indoor (O2I) measurements by using the system. The measurement sites were in front of a research building and a cafeteria, and the frequency was the 4.9 GHz band for the low super high frequency (SHF) band communication of the 5G cellular system. The results showed that the vertical-horizontal domain BEL characteristics were quite different, but they were not independent. We also extended the COST 231 BEL model for the vertical-horizontal domain BEL characteristics by using the effective azimuth and elevation angles from the window center, and estimated the model parameters from the measurements. The proposed model improved the root mean squared error (RMSE) of the BEL prediction by approximately 3 dB.

II. 3D EXTENSION OF COST 231 BEL MODEL
The COST 231 BEL model [2] is the baseline model for this letter. In the model, the propagation loss $L_{\mathrm{total}}$ from the outdoor BS to the indoor mobile station (MS) is modeled by the outdoor path loss $L_{\mathrm{out}}$, the indoor path loss $L_{\mathrm{in}}$, and the building penetration loss $L_{\mathrm{tw}}$ as follows:

$L_{\mathrm{total}} = L_{\mathrm{out}} + L_{\mathrm{in}} + L_{\mathrm{tw}}$ (1)
$L_{\mathrm{out}} = 32.44 + 20\log(f) + 20\log(S + d)$ (2)
$L_{\mathrm{in}} = \alpha d$ (3)
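As a numerical illustration of (1)-(3), the following Python sketch evaluates the outdoor, indoor, and total loss terms. It is only a sketch under assumptions that the letter does not state explicitly: the logarithms are taken as base 10, f is in GHz, and S and d are in meters (the constant 32.44 is consistent with these units). The function names and the example values of S, d, and the penetration loss are hypothetical; the 4.9 GHz frequency and the coefficient of 0.6 dB/m are values used in this letter.

```python
import math


def outdoor_loss_db(f_ghz: float, s_m: float, d_m: float) -> float:
    """Eq. (2): free-space-type loss over the total path length S + d.

    Assumption: f in GHz, distances in meters, base-10 logarithms.
    """
    return 32.44 + 20.0 * math.log10(f_ghz) + 20.0 * math.log10(s_m + d_m)


def indoor_loss_db(alpha_db_per_m: float, d_m: float) -> float:
    """Eq. (3): linear indoor loss with coefficient alpha in dB/m."""
    return alpha_db_per_m * d_m


def total_loss_db(f_ghz: float, s_m: float, d_m: float,
                  alpha_db_per_m: float, l_tw_db: float) -> float:
    """Eq. (1): sum of outdoor, indoor, and building penetration losses."""
    return (outdoor_loss_db(f_ghz, s_m, d_m)
            + indoor_loss_db(alpha_db_per_m, d_m)
            + l_tw_db)


if __name__ == "__main__":
    # Hypothetical geometry: S = 30 m, d = 5 m, penetration loss of 6 dB.
    print(round(total_loss_db(4.9, 30.0, 5.0, 0.6, 6.0), 2))
```

The penetration loss is passed in as a plain number here, so the same helper can be reused with any model for that term.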
SAITO et al.: VERTICAL AND HORIZONTAL BEL MEASUREMENT IN 4.9 GHz BAND BY UAV 445
Fig. 1. Spatial parameters of 3D BEL model extension.

$L_{\mathrm{tw}} = W_e + WG_e \left(1 - \frac{D}{S}\right)^2 = W_e + WG_e (1 - \cos\theta)^2$ (4)

Here, f is the carrier frequency, S is the distance from the BS to the window center, d is the distance from the window center to the indoor MS, and D is the perpendicular distance from the BS to the external wall of the building. The spatial relations of those parameters are shown in Fig. 1. $\alpha$, $W_e$, and $WG_e$ are the model parameters. $W_e$ is the wall penetration loss for the perpendicular incident angle, and $WG_e$ specifies the additional loss when the incident angle increases. Although it is proposed to use the 3D incident angle $\theta$ in (4) when the BS is placed in a diagonal direction from the window [5], the model has a limitation in that the vertical and horizontal domain characteristics cannot be modeled separately.
In this letter, we propose a 3D extension of the COST 231 BEL model for vertical-horizontal domain window penetration loss modeling. In the proposal, the penetration loss consists of an azimuth domain term and an elevation domain term.

$L_{\mathrm{tw}} = W_e + WG_{e,\mathrm{azm}} (1 - \cos\theta_{\mathrm{azm}})^{n_{\mathrm{azm}}} + WG_{e,\mathrm{elv}} (1 - \cos\theta_{\mathrm{elv}})^{n_{\mathrm{elv}}}$ (5)

Here, $\theta_{\mathrm{azm}} = \cos^{-1}(D_{\mathrm{azm}}/S)$ and $\theta_{\mathrm{elv}} = \cos^{-1}(D_{\mathrm{elv}}/S)$ are the effective azimuth and elevation angles from the window center. The effective azimuth angle is defined in the plane formed by the vector S and $D_{\mathrm{azm}}$. It is thought that the penetration loss is mainly caused by the diffraction of the radio wave on the window edge. The horizontal domain loss profile becomes flatter as the height difference between the BS and the window increases because the incident angle variation of the radio wave to the window edge becomes small. These characteristics are modeled in (5) by introducing the effective azimuth angle. We also extended the model by including the exponents $n_{\mathrm{azm}}$ and $n_{\mathrm{elv}}$ of each term as the model parameters to represent the various shapes of angular profiles.

III. 4.9 GHz BAND RADIO MEASUREMENT OF BUILDING ENTRY LOSS FROM WINDOW
A. Measurement Environment and Measurement Method
We conducted radio measurements in the 4.9 GHz band to investigate the vertical-horizontal domain penetration loss characteristics of buildings with large windows. The measurement areas and photos are shown in Fig. 2. The transmitter (Tx) was set indoors, and the receiver (Rx) was mounted on the UAV. These were regarded as the indoor MS and virtual outdoor BS, respectively, because the reciprocity is satisfied in the measurement. One measurement environment was a high-rise research building. The indoor MSs were fixed at three points at different distances from the window in the conference room on the sixth floor. The floor height was 20 m from the ground. Another environment was a cafeteria. The indoor MS was fixed at the center of the room on the third floor and on the terrace for the calibration, whose floor height was 7 m from the ground. The MS antenna height was 1.7 m from the floor in both cases. The outside measurement areas were grass fields in front of the buildings.
The horizontal measurement courses were set parallel to the building external walls. The measurement plane was 20 m horizontal length by 15 m height in front of the buildings. We obtained the propagation loss on the plane to investigate the BEL characteristics from various BSs. For example, the low BS height represents the street-cell BSs, and the high BS height represents the urban micro-cell BSs. Because the BEL model is an even function of $\theta_{\mathrm{azm}}$ and $\theta_{\mathrm{elv}}$, the measurement was conducted only in one orthant. To simplify the measurement procedure, we divided the horizontal course into 2 m intervals. At each measurement point, the BS ascended to approximately 15 m in height and subsequently descended slowly. Other measurement parameters are shown in Table I.

B. System Calibration and Data Analysis Method
Before the measurement, the measured value of the Rx was calibrated by connecting the RF cables between the Tx and the Rx directly. The UAV position was obtained from the flight log [15], and the moving average of the receiving power was calculated to eliminate the multi-path fading effect. The antenna radiation pattern of the UAV station was measured in an anechoic chamber in advance, and the antenna elevation directivity effect was canceled from the measured data based on the grazing angle. For the validation of the system, we measured the propagation loss in the Line-of-Sight (LoS) case where the MS was fixed on the terrace of the cafeteria. The mean and standard deviation of the measurement error from the free space path loss were 0.1 dB and 1.4 dB, respectively.
For the data analysis, the model parameters for both the COST 231 BEL model and the 3D extended BEL model were estimated from the measurement results to minimize the dB scale RMSE $E = \left(\frac{1}{I}\sum_{i=1}^{I}(\hat{y}_i - y_i)^2\right)^{1/2}$ of the model data $\hat{y}_i$ from measured data $y_i$. Additionally, the indoor propagation loss coefficient $\alpha$ was fixed at 0.6 dB/m owing to an insufficient number of indoor measurement points.

IV. BEL MEASUREMENT RESULTS
The vertical-horizontal domain receiving power profile of both the measured data and the modeling result of the research building environment are shown in Fig. 3(a) and 3(b). The MS position is also shown. The receiving power increased as the BS position approached the MS position. Regarding the
Fig. 2. Measurement maps and photos of (a) research building (map), (b) research building (outdoor view), (c) research building (indoor view), (d) UAV
station, (e) cafeteria (map), (f) cafeteria (outdoor view), and (g) cafeteria (indoor view).
TABLE I
MEASUREMENT PARAMETERS

horizontal domain characteristics, the receiving power changed more significantly in the high altitude area compared to the low altitude area, as described in Section II. A similar trend was observed in the vertical domain profile. In Fig. 3(b), these characteristics were modeled well in the proposed model by introducing the effective incident angles in both domains.
All estimated model parameters are summarized in Table II. The 5.25 GHz band measurement result reported in [7] is also shown. In the proposed model, the penetration loss at the perpendicular angle $W_e$ ranged from 5 dB to 6 dB, and these results corresponded to those in [2] and [7] for a glass window case. Although the azimuth angular dependent loss coefficient $WG_{e,\mathrm{azm}}$ was similar to that in [7] for the research building environment, it was 0 dB in the cafeteria environment. This is because the cafeteria was a glass-walled circular room that did not have a specific azimuth angular dependency. The elevation angular dependent loss coefficient $WG_{e,\mathrm{elv}}$ was larger than $WG_{e,\mathrm{azm}}$ in our measurements. The reason is thought to be that the window was horizontally long and caused a more significant diffraction loss in the vertical plane. On the other hand, these BEL characteristics could not be modeled by a single parameter $WG_e$ of the COST 231 model. As a result, $W_e$ tended to increase to compensate for the large estimation error of the angular characteristics. Fig. 3(c) shows the cumulative distribution function (CDF) of the modeling error. In the proposed method, the modeling error was significantly improved from that of the COST 231 model. The median error was improved by approximately 3 dB.
The measurement results showed that the vertical-horizontal domain BEL characteristics were different, and that the proposed 3D BEL model improved the modeling accuracy compared to the COST 231 model.

V. SUMMARY
In this letter, we presented vertical-horizontal BEL measurement results from the developed system using a UAV. The 4.9 GHz band penetration loss characteristics of buildings with large windows were clarified in research building
TABLE II
MODEL PARAMETER ESTIMATION RESULTS

Fig. 3. Measurement results: (a) receiving power profile (research building, measurement data), (b) receiving power profile (research building, proposed model), and (c) modeling error CDF.

and cafeteria environments. The results showed that the vertical domain loss was more significant because of the window shapes. In addition, the horizontal domain loss varied more drastically as the difference between the BS height and the window height decreased. We also extended the COST 231 BEL model for the vertical-horizontal BEL characteristics and estimated the model parameters from the exhaustive measurements. The median modeling error was improved by 3 dB by our proposal. Further extensions of the BEL model, such as the indoor angular characteristics and the detailed angular profile shapes, are left for future work. This letter is expected to be utilized in future 3D service cell planning.

REFERENCES
[1] H. Omote, M. Miyashita, and R. Yamaguchi, “Measurement of time-spatial characteristics between indoor spaces in different LOS buildings,” in Proc. Int. Symp. Antennas Propag. (ISAP), Nov. 2015, pp. 1–4.
[2] COST Action 231—Digital Mobile Radio Towards Future Generation Systems—Final Report. Luxembourg City, Luxembourg: Office Official Publ. Eur. Commun., 1999.
[3] “Prediction of building entry loss,” Int. Telecommun. Union, Geneva, Switzerland, ITU-Recommendation P.2109, 2017. [Online]. Available: https://www.itu.int/rec/R-REC-P.2109/en
[4] Y. L. C. de Jong, M. H. J. L. Koelen, and M. H. A. J. Herben, “A building-transmission model for improved propagation prediction in urban microcells,” IEEE Trans. Veh. Technol., vol. 53, no. 2, pp. 490–502, Mar. 2004.
[5] D. I. Axiotis and M. E. Theologou, “An empirical model for predicting building penetration loss at 2 GHz for high elevation angles,” IEEE Antennas Wireless Propag. Lett., vol. 2, pp. 234–237, 2003.
[6] K. L. Chee, A. Anggraini, T. Kaiser, and T. Kürner, “Outdoor-to-indoor propagation loss measurements for broadband wireless access in rural areas,” in Proc. 5th Eur. Conf. Antennas Propag. (EUCAP), Apr. 2011, pp. 1376–1380.
[7] M. Alatossava, E. Suikkanen, J. M. H. Veli-Matti, and J. Ylitalo, “Extension of COST 231 path loss model in outdoor-to-indoor environment to 3.7 GHz and 5.25 GHz,” in Proc. Int. Symp. Wireless Pers. Multimedia Commun., Sep. 2008, pp. 1–4.
[8] J. Medbo, J. Furuskog, M. Riback, and J.-E. Berg, “Multi-frequency path loss in an outdoor to indoor macrocellular scenario,” in Proc. 3rd Eur. Conf. Antennas Propag., Mar. 2009, pp. 3601–3605.
[9] H. Okamoto, K. Kitao, and S. Ichitsubo, “Outdoor-to-indoor propagation loss prediction in 800-MHz to 8-GHz band for an urban area,” IEEE Trans. Veh. Technol., vol. 58, no. 3, pp. 1059–1067, Mar. 2009.
[10] A. Roivainen, V. Hovinen, N. Tervo, and M. Latva-Aho, “Outdoor-to-indoor path loss modeling at 10.1 GHz,” in Proc. 10th Eur. Conf. Antennas Propag. (EuCAP), Apr. 2016, pp. 1–4.
[11] T. Imai et al., “Outdoor-to-Indoor path loss modeling for 0.8 to 37 GHz band,” in Proc. 10th Eur. Conf. Antennas Propag. (EuCAP), Apr. 2016, pp. 1–4.
[12] I. Rodriguez et al., “A novel geometrical height gain model for line-of-sight urban micro cells below 6 GHz,” in Proc. Int. Symp. Wireless Commun. Syst. (ISWCS), Sep. 2016, pp. 393–398.
[13] D. W. Matolak and R. Sun, “Unmanned aircraft systems: Air-ground channel characterization for future applications,” IEEE Veh. Technol. Mag., vol. 10, no. 2, pp. 79–85, Jun. 2015.
[14] Y. Zeng, R. Zhang, and T. J. Lim, “Wireless communications with unmanned aerial vehicles: Opportunities and challenges,” IEEE Commun. Mag., vol. 54, no. 5, pp. 36–42, May 2016.
[15] K. Saito, Q. Fan, N. Keerativoranan, and J. Takada, “Outdoor-to-indoor radio propagation loss measurement by using an unmanned aerial vehicle in 4.9 GHz band,” in Proc. IEICE TC-SRW, Aug. 2017, pp. 25–30.
Multi-Slot Allocation Protocols for Massive IoT Devices With Small-Size Uploading Data
Tsung-Yen Chan, Yi Ren, Yu-Chee Tseng, Fellow, IEEE, and Jyh-Cheng Chen, Fellow, IEEE
Abstract—The emergence of Internet of Things applications introduces new challenges such as massive connectivity and small data transmission. In traditional data transmission protocols, an ID (i.e., IP address or MAC address) is usually included in a packet so that its receiver is able to know who sent the packet. However, this introduces the big head-small body problem for light payload. To address this problem, the Hint protocols have been proposed. The main idea is to “encode” information in a tiny broadcast Hint message that allows devices to “decode” their transmission slots. Thus, it can significantly reduce transmission and contention overheads. In this letter, we extend eHint to support multi-slot data transmissions. Several efficient protocols are proposed. Our simulation results validate that the protocols can significantly increase the number of successfully transmitted devices, channel utilization, and payload of transmitted devices compared with eHint.

Index Terms—Internet of Things (IoT), machine-to-machine (M2M) communication, multi-slot allocation, random access, wireless networks.

Manuscript received September 5, 2018; accepted October 4, 2018. Date of publication October 11, 2018; date of current version April 9, 2019. The associate editor coordinating the review of this paper and approving it for publication was M. Dong. (Corresponding author: Tsung-Yen Chan.)
T.-Y. Chan, Y.-C. Tseng, and J.-C. Chen are with the Department of Computer Science, College of Computer Science, National Chiao Tung University, Hsinchu 30010, Taiwan (e-mail: tychan@cs.nctu.edu.tw; yctseng@cs.nctu.edu.tw; jcc@cs.nctu.edu.tw).
Y. Ren was with the Department of Computer Science, National Chiao Tung University, Hsinchu 30010, Taiwan. He is now with the School of Computing Science, University of East Anglia, Norwich NR4 7TJ, U.K. (e-mail: e.ren@uea.ac.uk).
Digital Object Identifier 10.1109/LWC.2018.2875455

I. INTRODUCTION
Internet of Things (IoT) traffic characterized by small data transmission introduces new challenges to the research community. There are various types of IoT devices ranging from small tags and sensors to complicated actuators and machines [1]. Statistics show that 50% of IoT packets are less than 100 bytes [2]. Collecting small data from such massively connected IoT devices introduces new challenges. In most protocol designs, an ID (e.g., IP address or MAC address) is included in a packet so that its receiver is able to identify the sender. When IoT data is small, however, the overhead of its ID becomes relatively large, leading to the big head-small body problem.
Connection-oriented network architectures have been studied in [3]–[6]. When a device intends to send data to the Base Station (BS), a connection is established by using the Random Access (RA) procedure. As the number of devices increases, it may cause significant collision in RA [7], leading to a long delay. To solve the problem, several solutions have been proposed, including Access Class Barring (ACB) [3], dynamic resource allocation [4], slotted access [5], and deep learning scheduling [6]. The idea of ACB is to control the number of User Equipment (UE) terminals intending to join RA. The BS broadcasts a parameter (e.g., probability) such that UEs can perform RA probabilistically. In dynamic resource allocation, the BS can dynamically allocate resources of RA channel and data channel. It also derives an optimal trade-off problem to maximize the Machine-to-Machine (M2M) throughput. In slotted access, each Machine-Type Communication (MTC) device is assigned a dedicated RA slot to access RA. The dilemma is that short RA cycles may lead to collision while long RA cycles may lead to long delay.
To address the big head-small body problem, a series of Hint protocols [8]–[10], which remove IDs from data packets, have been proposed. In these protocols, a tiny Hint message is broadcast to allow IoT devices to decode their assigned transmission resources. Interestingly, the assigned location of these resources also implies the sender’s ID, thus eliminating a large part of the packet header. Therefore, even though the device’s ID is not transmitted, its receiver (usually a BS) is still able to know its identity. In [8], we proposed a set of Hint-based frameworks for small data transmission. Later, a Chinese remainder theorem based Hint protocol [9] was applied to LTE-A networks to reduce the signaling cost in random access procedures. However, both Hint protocols [8], [9] are based on a strong assumption that the small data has the same size and is carried by the same size channel resource, i.e., a time slot. To relax the assumption, the Hint protocols were then enhanced by supporting multi-slot data transmission in [10]. However, when the number of transmitting devices is large, the multi-slot Hint protocol requires intensive computation capacity to find a satisfactory seed, which may lead to transmission latency. To address this issue, in this letter we use a novel iterative approach to reduce computation overhead and enable more devices to transmit. Through extensive simulations, we demonstrate that the proposed iterative approach significantly outperforms eHint [10] in terms of the number of successfully transmitted devices, channel utilization, and the total payload of transmitted devices.

II. MULTI-SLOT ALLOCATION PROBLEM
We consider a set $D = \{d_1, d_2, \ldots, d_m\}$ of m IoT devices covered by a BS. Each IoT device $d_i$, $i = 1..m$, needs to report its data at a regular pattern to the BS. Our goal is to allocate radio resources to D to transmit data to the BS. We make the following assumptions:
1) The value of m is quite large.
2) Each $d_i$ switches between two modes, active and sleep. When intending to transmit data, $d_i$ goes to the active mode; otherwise, it switches to the sleep mode.
3) The active pattern of $d_i$ is denoted by a binary periodical function $P_i(t)$, where t is time (by frame count). $P_i(t) = 1$ if $d_i$ intends to transmit data at the t-th frame;
CHAN et al.: MULTI-SLOT ALLOCATION PROTOCOLS FOR MASSIVE IoT DEVICES WITH SMALL-SIZE UPLOADING DATA 449

otherwise, $P_i(t) = 0$. For example, if $d_i$ is attached to a temperature sensor which needs to report every 3 minutes, $P_i(t)$ has a period of 3 minutes. If one more humidity sensor which needs to report every 5 minutes is attached to $d_i$, $P_i(t)$ is a function of the combination of two periodical functions with periods = 3 and 5 minutes, respectively.
4) Whenever $P_i(t) = 1$, device $d_i$ needs to send $n_i$ slots of data to the BS at t. Note that $n_i$ is also quite small (such as less than 3 or 5 slots).
5) When our Hint protocol starts, the BS is informed of its $P_i(t)$ and $n_i$, for each $d_i \in D$.
To solve the multi-slot allocation problem, the communication channel is divided into a sequence of fixed-length frames. Each frame consists of three parts: (1) Broadcast (Bcast): It is for the BS to broadcast and announce resource allocation information (i.e., Hint) to devices. (2) Allocated (Alloc): It consists of multiple slots for uplink data transmission (no transmitter’s ID required in our case). (3) Random (Rand): It is for any unscheduled/unpredicted transmission not arranged in Alloc. Our goal is to design an efficient access protocol for such transmission.
Fig. 1 shows the frame structure. In this example, the active periods of $d_1$, $d_2$, and $d_3$ are $T_1$, $T_2$, and $T_3$, respectively. Their requirements are $n_1 = 2$, $n_2 = 1$, and $n_3 = 3$ slots, respectively. At frame t, all devices will transmit. At t + 1, only $d_1$ will transmit. At t + 2, $d_1$ and $d_2$ will transmit. The BS will schedule their transmissions in Alloc, through the announcement in Bcast. Details will be discussed later. For exceptions (such as transmission errors or emergency traffic), devices can use Rand.

Fig. 1. The proposed frame structure.

III. PROPOSED PROTOCOLS
The main idea of the Hint protocol is to arrange a common function shared by the BS and all devices. The function takes two inputs: (i) a small piece of information computed by the BS, and (ii) a device’s ID. The BS will broadcast the information computed in (i) by using Bcast. Each device then can use the broadcast information and its ID to retrieve the transmission slots allocated to it through the function. The broadcast Hint is thus more efficient than typical 1-by-1 notifications. Next, we first review the VF proposed in [10]. We then propose two more efficient protocols called 2VF and IVF.

A. Review: Virtual Frame (VF)
Let h(s, ID) be a hash function which takes a device ID and a seed s as inputs. In frame t, the Hint message = <s, v>, where v is a vector. The length of v is $|v| \ge \sum_{d_i \in M(t)} n_i$, where $M(t) = \{d_i \mid P_i(t) = 1\}$. The i-th element of v (resp., Alloc) is denoted by v[i] (resp., Alloc[i]). The value of v[i] falls in {0, 1, 2}. v is for devices to decode whether they can transmit or not. If it is safe to transmit in $n_i$ continuous slots in Alloc, the corresponding elements in v will be “1 $\underbrace{2 \cdots 2}_{n_i - 1}$”, where “1” means “starting slot” and “2” means “continuous slot”. If it is not safe to transmit or a slot is not assigned to any device, a “0” is used.
Next, we review the VF briefly:
1) The BS randomly picks a seed s and computes $h(s, d_i) \bmod |v|$ for each device $d_i \in M(t)$. Next, the BS selects a subset $M' \subseteq M(t)$ of devices that can correctly decode v for safe transmission.
2) The BS executes Step 1 a few times with different seeds and then selects the s from the iteration leading to the largest number of safe transmission slots.
3) After selecting, the BS encodes v.
4) The BS broadcasts <s, v> in Bcast.
5) Each device $d_i \in M(t)$ checks whether it is allowed to transmit or not.
6) If $d_i \in M(t)$ can transmit safely, it uploads its data at the corresponding Alloc. Otherwise, $d_i$ can use random access to contend for transmission in Rand.
Discussion: In VF, some devices may not receive slots for transmission due to collision in v. Given a fixed v, more devices may lead to higher collisions. This not only reduces transmission opportunities in Alloc, but also lacks flexibility. These shortcomings motivate us to design two new protocols, 2VF and IVF.

B. Two Virtual Frame (2VF)
To reduce computation overhead and enable more devices to transmit in Alloc, the 2VF protocol divides Alloc into two parts, Alloc_1 and Alloc_2, and uses two seeds $s_1$ and $s_2$ to achieve this goal. Two vectors $v_1$ and $v_2$ will be used. Also, the lengths of $v_1$ and $v_2$ are set to $|v_1|$ and $|v_2|$, respectively, where $|v_1| \ge \sum_{d_i \in M(t)} n_i$ and $|v_2| \ge \sum_{d_i \in M(t) - M'} n_i$. It works as follows:
1) The BS repeats the following steps a few times.
• Randomly pick a seed $s_1$ and compute $h(s_1, d_i) \bmod |v_1|$ for each device $d_i \in M(t)$.
• Select a subset $M' \subseteq M(t)$ of devices such that only devices in $M'$ can correctly decode $v_1$ for safe transmission.
• A seed $s_2$ is randomly picked. Compute $h(s_2, d_i) \bmod |v_2|$ for each device $d_i \in M(t) - M'$.
• Select a subset $M'' \subseteq M(t) - M'$ of devices such that only devices in $M''$ can correctly decode $v_2$ for safe transmission.
2) The BS then selects the $s_1$ and $s_2$ from the iteration in Step 1, leading to the largest number of safe transmission slots. (Note: once $s_1$ and $s_2$ are selected, the corresponding $M'$ and $M''$ are also selected.)
3) $v_1$ and $v_2$ are encoded as follows:
• (Collision-free) For each $d_i \in M'$ such that $k = h(s_1, d_i) \bmod |v_1|$, set $v_1[k]$ = “1” and $v_1[k + 1 : k + n_i - 1]$ = “$\underbrace{2 \cdots 2}_{n_i - 1}$”. For each $d_i \in M''$ such
that $x = h(s_2, d_i) \bmod |v_2|$, set $v_2[x]$ = “1” and $v_2[x + 1 : x + n_i - 1]$ = “$\underbrace{2 \cdots 2}_{n_i - 1}$”.
• (Empty/Collision) The rest of the elements of $v_1$ and $v_2$ are all set to “0”.
4) The BS broadcasts <$s_1$, $v_1$, $s_2$, $v_2$> in Bcast to all devices.
5) For each device $d_i \in M(t)$, receiving <$s_1$, $v_1$, $s_2$, $v_2$> in Bcast, $d_i$ computes $k = h(s_1, d_i) \bmod |v_1|$ and $x = h(s_2, d_i) \bmod |v_2|$. To check whether $d_i$ is allowed to transmit or not, there are two cases:
• Case of $n_i = 1$: If $v_1[k]$ = “1” and $v_1[k + 1]$ = “0” or “1”, $d_i$ is allowed to transmit; otherwise, it checks $v_2$. If $v_2[x]$ = “1” and $v_2[x + 1]$ = “0” or “1”, $d_i$ is allowed to transmit; otherwise, $d_i$ cannot transmit.
• Case of $n_i \ge 2$: If $v_1[k]$ = “1”, $v_1[k + 1 : k + n_i - 1]$ = “$\underbrace{2 \cdots 2}_{n_i - 1}$”, and $v_1[k + n_i]$ = “0” or “1”, $d_i$ is allowed to transmit; otherwise, it checks $v_2$. If $v_2[x]$ = “1”, $v_2[x + 1 : x + n_i - 1]$ = “$\underbrace{2 \cdots 2}_{n_i - 1}$”, and $v_2[x + n_i]$ = “0” or “1”, $d_i$ is allowed to transmit; otherwise, it cannot transmit.
6) For each device $d_i \in M(t)$ that is allowed to transmit safely in $v_1$, $d_i$ uploads its data at Alloc$[j : j + n_i - 1]$, where j is the number of 1’s and 2’s in $v_1[0 : k - 1]$ and $k = h(s_1, d_i) \bmod |v_1|$. If $d_i$ is not allowed to transmit in $v_1$, but is allowed to transmit in $v_2$, $d_i$ uploads its data at Alloc$[z : z + n_i - 1]$, where z is the number of 1’s and 2’s in $v_1$ and in $v_2[0 : x - 1]$ and $x = h(s_2, d_i) \bmod |v_2|$. If $d_i$ is not allowed to transmit in $v_1$ and $v_2$, it may contend for transmission in Rand.

Fig. 2. (a) The message flow of the 2VF. (b) An example of 2VF.
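The encoding in Step 3 and the device-side checks in Steps 5 and 6 can be illustrated with a short Python sketch. It is not the authors' implementation, and the function names are hypothetical. Vectors are modeled as strings over "0", "1", and "2", and the hashed positions are passed in directly instead of being computed with h. Two points that the letter leaves open are assumed here: a grant that would run past the end of a vector is treated as not allowed (no wrap-around), and reaching the end of the vector counts as a valid boundary after a grant.

```python
from typing import Optional, Sequence, Tuple


def encode(length: int, grants: Sequence[Tuple[int, int]]) -> str:
    """Step 3: build a vector from (position, n_i) pairs of selected devices."""
    v = ["0"] * length
    for pos, n in grants:
        v[pos] = "1"                       # starting slot
        for j in range(pos + 1, pos + n):
            v[j] = "2"                     # continuous slots
    return "".join(v)


def allowed(v: str, pos: int, n: int) -> bool:
    """Step 5: '1' at pos, then n-1 '2's, then '0', '1', or the end of v."""
    end = pos + n
    if end > len(v) or v[pos] != "1":
        return False
    if v[pos + 1:end] != "2" * (n - 1):
        return False
    return end == len(v) or v[end] in "01"


def used(v: str) -> int:
    """Number of 1's and 2's, i.e., the Alloc slots consumed by v."""
    return sum(c in "12" for c in v)


def alloc_slots(vectors: Sequence[str], positions: Sequence[int],
                n: int) -> Optional[Tuple[int, int]]:
    """Step 6: inclusive Alloc range of a device, or None (fall back to Rand)."""
    offset = 0
    for v, pos in zip(vectors, positions):
        if allowed(v, pos, n):
            start = offset + used(v[:pos])
            return start, start + n - 1
        offset += used(v)                  # skip the slots granted by this vector
    return None


if __name__ == "__main__":
    # Vectors of the Fig. 2(b) example; grant positions are read off the strings.
    v1 = encode(13, [(1, 4), (6, 2), (9, 2)])   # "0122201201200"
    v2 = encode(5, [(0, 1), (2, 2), (4, 1)])    # "10121"
    # Position 0 in v1 is a hypothetical failing hash for the devices below.
    for name, x, n in [("d6", 0, 1), ("d2", 2, 2), ("d3", 4, 1), ("d5", 3, 1)]:
        print(name, alloc_slots([v1, v2], [0, x], n))
```

With the vectors of the Fig. 2(b) example, the sketch yields Alloc[8] for d6, Alloc[9 : 10] for d2, Alloc[11] for d3, and no grant for d5, matching the text. Because `alloc_slots` simply walks a list of vectors, the same loop would also cover more than two vectors.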
Fig. 2(a) shows the message flow of 2VF. As shown in Fig. 2(b), there are 7 devices intending to transmit in frame t. Hence, $|v_1| = n_1 + n_2 + n_3 + n_4 + n_5 + n_6 + n_7 = 13$. Suppose that devices $d_1$, $d_4$, and $d_7$ are selected to transmit in $v_1$ (i.e., $M' = \{d_1, d_4, d_7\}$). Then $v_1$ = “0122201201200”. Although $d_2$, $d_3$, $d_5$, and $d_6$ are not allowed to transmit in $v_1$, they can perform VF again. Hence, $|v_2|$ is set to $n_2 + n_3 + n_5 + n_6 = 5$. After hashing, $d_5$’s slot and $d_2$’s second slot are overlapped. Suppose that $d_2$, $d_3$, and $d_6$ are allowed to transmit in $v_2$ (i.e., $M'' = \{d_2, d_3, d_6\}$). Then $v_2$ = “10121”. $d_6$ will transmit in Alloc[8] because there are eight 1’s or 2’s in $v_1$. Also, $d_2$ will transmit in Alloc[9 : 10] because there are nine 1’s or 2’s in $v_1$ and $v_2[0 : 1]$. Similarly, $d_3$ will transmit in Alloc[11] because there are eleven 1’s or 2’s in $v_1$ and $v_2[0 : 3]$. Since device $d_5$ finds $v_2[3]$ = “2”, it knows that it cannot transmit.

C. Iterative-Virtual-Frame (IVF)
2VF enables most of the devices to transmit data in Alloc by running VF twice. However, as the number of IoT devices becomes larger, the difficulty of finding satisfactory seeds also grows accordingly. Also, the computation power of the BS has its limitation. There are two main concepts in IVF (refer to Fig. 3):
• Iterative: The VF process is executed l times. The k-th iteration takes the devices unable to transmit in the previous iterations and tries to find a seed $s_k$, where $k = 1, 2, \ldots, l$.
• Time-bound: A timer $T_k$ is set for iteration k such that the search for $s_k$ would stop even if a satisfactory $s_k$ cannot be found after this time bound. In this case, the best $s_k$ so far will be used. Note that this only sacrifices the transmission ratio, but our protocol still works correctly.

Fig. 3. Alloc vector of IVF.

We summarize how IVF works as follows:
1) The BS computes Alloc_k for $k = 1, 2, \ldots, l$ as follows. In iteration k, we define the length of vector $v_k$ as

$|v_k| \ge \begin{cases} \sum_{d_i \in M(t)} n_i & \text{if } k = 1 \\ \sum_{d_i \in M(t) - \bigcup_{\hat{k}=1}^{k-1} M_{\hat{k}}} n_i & \text{if } k = 2, 3, \ldots, l \end{cases}$, (1)

where $M_{\hat{k}}$ is the set of devices that can transmit in Alloc_$\hat{k}$ in iteration $\hat{k}$. The BS repeatedly chooses a seed $s_k$ and computes $v_k$ until (i) a satisfactory seed $s_k$ is found, or (ii) the time bound $T_k$ spent on the search has been reached. In the latter case, the best seed $s_k$ so far and the corresponding $v_k$ are selected. After these l iterations, the BS broadcasts $\{s_k, v_k\}_{k=1}^{l}$ in Bcast.
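Step 1 can be sketched on the BS side as follows. This is a minimal Python illustration under several assumptions that do not come from the letter: the hash h is a hypothetical stand-in built on SHA-256, the subset of devices granted per seed is chosen by a simple greedy rule (unique position and length, no overlap), a seed is called satisfactory when it grants at least a fraction `good` of the slots in the vector, and the time bound is a wall-clock budget in seconds. Seeds are drawn as 16-bit values, matching the seed length used in the evaluation.

```python
import hashlib
import random
import time
from collections import Counter
from typing import Dict, List, Optional, Tuple


def h(seed: int, dev_id: str) -> int:
    """Hypothetical stand-in for the shared hash function h(s, ID)."""
    digest = hashlib.sha256(f"{seed}:{dev_id}".encode()).digest()
    return int.from_bytes(digest[:8], "big")


def allowed(v: str, pos: int, n: int) -> bool:
    """Device-side check: '1', then n-1 '2's, then '0', '1', or end of v."""
    end = pos + n
    if end > len(v) or v[pos] != "1" or v[pos + 1:end] != "2" * (n - 1):
        return False
    return end == len(v) or v[end] in "01"


def try_seed(need: Dict[str, int], seed: int, length: int) -> Tuple[str, Dict[str, int]]:
    """Greedy grant (an assumption): unique (position, n_i) and no overlap."""
    pos = {d: h(seed, d) % length for d in need}
    dup = Counter((pos[d], n) for d, n in need.items())
    v, granted, free_from = ["0"] * length, {}, 0
    for d in sorted(need, key=lambda dev: pos[dev]):
        p, n = pos[d], need[d]
        if dup[(p, n)] == 1 and p >= free_from and p + n <= length:
            v[p:p + n] = ["1"] + ["2"] * (n - 1)
            granted[d], free_from = p, p + n
    return "".join(v), granted


def ivf_plan(need: Dict[str, int], iterations: int, budget_s: float = 0.01,
             good: float = 0.6, rng: Optional[random.Random] = None
             ) -> List[Tuple[int, str, Dict[str, int]]]:
    """Return one (seed, vector, granted positions) triple per iteration."""
    rng = rng or random.Random(0)
    remaining, hints = dict(need), []
    for _ in range(iterations):
        if not remaining:
            break
        length = sum(remaining.values())         # |v_k| chosen with equality
        deadline, best = time.monotonic() + budget_s, None
        while True:
            seed = rng.randrange(1 << 16)
            v, granted = try_seed(remaining, seed, length)
            slots = sum(remaining[d] for d in granted)
            if best is None or slots > best[0]:
                best = (slots, seed, v, granted)
            if best[0] >= good * length or time.monotonic() >= deadline:
                break                            # satisfactory seed or time bound
        _, seed, v, granted = best
        hints.append((seed, v, granted))
        remaining = {d: n for d, n in remaining.items() if d not in granted}
    return hints
```

The greedy rule only grants a device whose (position, length) pair is unique among the remaining devices, so that no ungranted device can mistake another device's grant for its own.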
Fig. 4. The different effects of the number of successfully transmitted devices, |Alloc|, and CU with different parameters.
2) Upon receiving Bcast, a device di with Pi (t) = 1 can payload to transmit. The values of CU gradually converge for
transmit in one of the l spaces (Alloc_1, Alloc_2, VF, 2VF, and IVF with the increase of |M(t)|.
. . . , Alloc_l ) according to its hash results h(sk , di ) Overall, Figs. 4(a)-(f) demonstrate that the iteration
and vectors vk . If it is not allowed to transmit in Alloc, based Hint protocol has much better performance compared
it can contend in Rand. with [10] in terms of the number of successfully transmitted
devices, the total payload of transmitted devices, and channel
utilization.
IV. PERFORMANCE EVALUATION

We evaluate 2VF and IVF through simulations in terms of three metrics: the number of successfully transmitted devices in M(t), the length of Alloc (i.e., |Alloc|), and Channel Utilization (CU). The results are compared with VF and with the traditional polling protocol, which collects data from each device one by one. Each slot is assumed to be 32 bits and the length of s is set to 16 bits. The device address is assumed to be 64 bits. The value of $n_i$ is small and falls randomly in the range of [1∼3] or [1∼5] slots.

The CU is defined as
$$\mathrm{CU} = \frac{\text{payload}}{\text{payload} + \text{packet header}}.$$
For 2VF and IVF,
$$\mathrm{CU} = \frac{|Alloc|}{|Alloc| + 2 \times 16 + 2 \times (|v_1| + |v_2|)}$$
and
$$\mathrm{CU} = \frac{|Alloc|}{|Alloc| + l \times 16 + 2 \times (|v_1| + |v_2| + \cdots + |v_l|)},$$
respectively. For VF,
$$\mathrm{CU} = \frac{|Alloc|}{|Alloc| + 16 + 2 \times |v|}.$$
For traditional polling protocols,
$$\mathrm{CU} = \frac{|Alloc|}{|Alloc| + 64 \times |M(t)|}.$$
For IVF, we execute VF until all the devices are allocated in Alloc successfully.

Fig. 4(a) and Fig. 4(b) show the number of successfully transmitted devices in M(t) with $n_i$ = [1∼3] and [1∼5], respectively. Both figures show that IVF and 2VF outperform VF significantly. Taking Fig. 4(b) as an example, when |M(t)| = 200, IVF and 2VF deliver 44% and 30% more transmitted devices than VF, respectively.

Fig. 4(c) and Fig. 4(d) show the impact of |M(t)| on |Alloc| for $n_i$ = [1∼3] and [1∼5], respectively. We observe that both 2VF and IVF outperform VF by large margins, and that IVF performs best. Specifically, |Alloc| increases as the number of devices intending to transmit grows. We also observe that the three curves are linear. This means that the proposed protocols are scalable and resilient to increases in |M(t)|.

Fig. 4(e) and Fig. 4(f) show the impact of |M(t)| on CU when $n_i$ = [1∼3] and [1∼5], respectively. Overall, the Hint protocols (i.e., VF, 2VF, and IVF) outperform the traditional protocol significantly, since we use a Hint message instead of spending 64 × |M(t)| bits on notifying the devices individually. The CU of the traditional scheme is around 0.5 and 0.6 in Fig. 4(e) and Fig. 4(f), respectively, when |M(t)| increases. The traditional protocol with [1∼3] has higher CU than [1∼5] since [1∼5] has more

V. CONCLUSION

In this letter, we propose two advanced Hint protocols, 2VF and IVF, which enable more devices to upload their data in a collision-free manner. Compared to eHint [10], the proposed 2VF and IVF provide more flexibility and address the issue that arises when the number of devices intending to transmit is large. We demonstrate, through extensive simulations, that the proposed 2VF and IVF outperform both eHint [10] and the traditional polling protocol in terms of CU and slot allocation capability. As future work, designing a smart seed with low computation cost is worth exploring.

REFERENCES

[1] H. Li, K. Ota, and M. Dong, “Energy cooperation in battery-free wireless communications with radio frequency energy harvesting,” ACM Trans. Embedded Comput. Syst., vol. 17, no. 2, p. 44, 2018.
[2] W. John and S. Tafvelin, “Analysis of Internet backbone traffic and header anomalies observed,” in Proc. ACM SIGCOMM Conf. Internet Meas., 2007, pp. 111–116.
[3] Z. Wang and V. W. S. Wong, “Optimal access class barring for stationary machine type communication devices with timing advance information,” IEEE Trans. Wireless Commun., vol. 14, no. 10, pp. 5374–5387, Oct. 2015.
[4] D. T. Wiriaatmadja and K. W. Choi, “Hybrid random access and data transmission protocol for machine-to-machine communications in cellular networks,” IEEE Trans. Wireless Commun., vol. 14, no. 1, pp. 33–46, Jan. 2015.
[5] “Study on RAN improvements for machine-type communications, (Release 11),” 3GPP, Sophia Antipolis, France, Rep. TR 37.868, Sep. 2011.
[6] H. Li, K. Ota, and M. Dong, “Learning IoT in edge: Deep learning for the Internet of Things with edge computing,” IEEE Netw., vol. 32, no. 1, pp. 96–101, Jan./Feb. 2018.
[7] “Study on new radio (NR) access technology,” 3GPP, Sophia Antipolis, France, Rep. TR 38.912, 2017.
[8] Y. Ren, R.-J. Wu, T.-W. Huang, and Y.-C. Tseng, “Give me a hint: An ID-free small data transmission protocol for dense IoT devices,” in Proc. IEEE Wireless Days, 2017, pp. 121–126.
[9] T.-W. Huang, Y. Ren, K. C.-J. Lin, and Y.-C. Tseng, “r-Hint: A message-efficient random access response for mMTC in 5G networks,” in Proc. IEEE PIMRC, 2017, pp. 1–6.
[10] T.-Y. Chan, Y. Ren, Y.-C. Tseng, and J.-C. Chen, “eHint: An efficient protocol for uploading small-size IoT data,” in Proc. IEEE WCNC, 2017, pp. 1–6.
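The channel-utilization expressions of the performance evaluation above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: |Alloc| is taken to be the payload in bits (slots times 32), the function names are hypothetical, and the vector lengths passed to the Hint variants are made-up inputs rather than simulation results from the letter.

```python
SLOT_BITS = 32      # slot size stated in the evaluation setup
SEED_BITS = 16      # length of the seed s
ADDRESS_BITS = 64   # device address length used by polling


def cu_hint(alloc_bits, vector_lengths):
    """CU of a Hint-style protocol with one seed and one vector per round.

    One round corresponds to VF, two rounds to 2VF, and l rounds to IVF:
    header = rounds * SEED_BITS + 2 * (sum of the vector lengths).
    """
    rounds = len(vector_lengths)
    header = rounds * SEED_BITS + 2 * sum(vector_lengths)
    return alloc_bits / (alloc_bits + header)


def cu_polling(alloc_bits, num_devices):
    """CU of traditional polling: one 64-bit address per polled device."""
    return alloc_bits / (alloc_bits + ADDRESS_BITS * num_devices)


# Hypothetical scenario: 200 devices, each sending 2 slots (the mean of [1~3]).
devices = 200
payload = devices * 2 * SLOT_BITS          # assumed meaning of |Alloc| in bits
polling_cu = cu_polling(payload, devices)
vf_cu = cu_hint(payload, [400])            # assumed vector length |v| = 400
two_vf_cu = cu_hint(payload, [400, 150])   # assumed |v1| = 400, |v2| = 150
print(polling_cu, vf_cu, two_vf_cu)
```

With two slots per device on average the polling CU evaluates to 0.5, and with three slots per device to 0.6, which is in line with the levels quoted for the traditional scheme in Fig. 4(e) and Fig. 4(f).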
On the Sum-Rate of Heterogeneous Networks With Low-Resolution ADC Quantized Full-Duplex Massive MIMO-Enabled Backhaul

Prince Anokye, Roger K. Ahiadormey, Changick Song, and Kyoung-Jae Lee
Abstract—This letter analyzes the sum-rate of a heterogeneous network with wireless backhaul supported by low-resolution analog-to-digital converters (ADCs) quantized full-duplex massive multiple-input multiple-output. Communication is achieved in two phases. First phase: the macro-cell base station (BS) is equipped with massive receive antennas and a few transmit antennas, and the small-cell BSs deploy massive receive antennas and a single transmit antenna. Second phase: the roles of the antennas are switched using a circulator. We derive closed-form expressions for the uplink/downlink rates by assuming imperfect channel state information. It is shown that the quantization noise (QN) due to the use of low-resolution ADCs degrades the rate. This rate loss is efficiently minimized by using massive receive antennas in the first phase. However, in the second phase, the QN is of the same order as the desired signal. Therefore the massive number of transmit antennas is unable to effectively suppress the QN.

Index Terms—Massive multiple-input multiple-output, full-duplex, heterogeneous network, quantization noise.

Manuscript received September 12, 2018; accepted October 6, 2018. Date of publication October 15, 2018; date of current version April 9, 2019. This work was supported in part by the IITP through the Korea Government (MSIT) under Grant 2018-0-00218, and in part by the NRF through Korea Government (MSIT) under Grant NRF-2016R1C1B2011921 and Grant NRF-2018R1D1A1B07049824. The associate editor coordinating the review of this paper and approving it for publication was W. Hamouda. (Corresponding author: Kyoung-Jae Lee.)
P. Anokye, R. K. Ahiadormey, and K.-J. Lee are with the Department of Electronics and Control Engineering, Hanbat National University, Daejeon 34158, South Korea (e-mail: princemcanokye@yahoo.com; rogerkwao@gmail.com; kyoungjae@hanbat.ac.kr).
C. Song is with the Department of Electronic Engineering, Korea National University of Transportation, Chungju 27469, South Korea (e-mail: c.song@ut.ac.kr).
Digital Object Identifier 10.1109/LWC.2018.2875907

I. INTRODUCTION

Massive multiple-input multiple-output (MIMO), full-duplex (FD) and heterogeneous networks (HetNets) have attracted attention as enabling technologies for next-generation communication. In massive MIMO, base stations (BSs) are equipped with antennas such that the number of antennas is much higher than the number of user terminals (UTs) [1]. The degrees of freedom offered by the massive antennas excellently mitigate fast fading, non-coherent interference and noise while improving spectral efficiency (SE) [1]. Also, with massive MIMO, simple processing such as maximum ratio combining/transmission (MRC/MRT) achieves high SE [1].

Of paramount interest also is in-band FD, where BS transceivers simultaneously transmit and receive on the same frequency band [2]. Thus FD systems can double the SE of half-duplex (HD) at the expense of self-interference (SI). Multi-antenna techniques such as minimum mean square error (MMSE), zero-forcing (ZF) and null space projection have been applied to cancel SI [3], [4]. Considering the advantages of massive MIMO and FD, combining both technologies is only a natural consequence. Massive MIMO FD is studied by [2] in a decode-and-forward relay, where it is shown that SI effects diminish with a large number of antennas. An SI-aware ZF precoder is proposed for massive MIMO FD in [5].

Recently, massive MIMO has been introduced into HetNets to provide wireless backhaul support. A HetNet describes a system where a macro-cell (MC) is overlaid with low-powered small-cells (SCs) serving stationary UTs. By densifying the network, the distance to BSs is lessened, leading to reduced path losses and improved SE [6]. Sanguinetti et al. [7] studied interference management of a massive MIMO-enabled backhaul in a two-tier HetNet. Anokye et al. [8] analyzed the sum-rate of a two-tier HetNet where the backhaul is supported by FD massive MIMO in both the MC BS and the SC BSs.

The use of a high number of receive antennas in massive MIMO leads to a significant increase in power consumption due to the high-resolution analog-to-digital converters (ADCs). To elucidate this, an ADC with d-bit resolution and sampling frequency f performs $2^d \cdot f$ computations per second. This implies that the power increases linearly with the sampling frequency and exponentially with the resolution. Thus, if high-resolution ADCs are employed, massive MIMO would be financially infeasible. It is therefore imperative to research quantized massive MIMO where the receivers use low-resolution ADCs. Meanwhile, the aforementioned papers do not consider the impact of quantization noise (QN), which occurs due to low-resolution ADCs. Fan et al. [9] studied the uplink (UL) rate of quantized HD massive MIMO by assuming perfect channel state information (CSI). To extend [9], [10] assumes imperfect CSI in Rician channels. FD massive MIMO with low-resolution ADCs in an amplify-and-forward relay was studied in [11].

This letter studies a HetNet where the backhaul is supported by low-resolution quantized FD massive MIMO. The MC BS possesses a dedicated wired backbone connection but supports the SCs through FD wireless links. Data communication is achieved in two phases. During the first phase, the MC BS deploys a large number of receive antennas and a few transmit antennas. The SC BSs have massive receive antennas and a single transmit antenna. For the second phase, the roles of the receive and transmit antennas are switched using a circulator [12]. This configuration enables the BSs to serve UL UTs during the first phase and downlink (DL) UTs in the second phase. A similar scenario was studied in [8] but the influence of QN was ignored. Here, the aim is to equip receive antennas with low-resolution ADCs and then study the combined effects of SI and QN. Analytic solutions are derived for the UL/DL rate and the impact of low-resolution ADCs is characterized.

II. SYSTEM MODEL

Consider a two-tier HetNet where an MC is overlaid with K SCs [8]. For the first phase, the MC BS is equipped with massive receive antennas $M_{\mathrm{rx}}$ and a few transmit antennas $M_{\mathrm{tx}}$ fixed such that $M_{\mathrm{tx}} = K$ (i.e., $M_{\mathrm{rx}} \gg M_{\mathrm{tx}}$). This allows
the MC BS to send independent streams to the SCs. Each SC BS deploys massive receive antennas $N_{\mathrm{rx}}$ and a single transmit antenna. In the second phase, the antenna roles are switched, i.e., the MC BS now has massive transmit antennas $M_{\mathrm{tx}}$ and a few receive antennas $M_{\mathrm{rx}} = K$ and the SC BSs possess massive transmit antennas $N_{\mathrm{tx}}$ and a single receive antenna. The received signals are quantized using low-resolution ADCs. For a coherence time $T$, $\tau$ ($< T$) slots are assigned for channel estimation and the remaining slots are used for the first and second phases. Assuming a time division duplex protocol, the same channel estimate employed for decoding in the first phase is used to precode in the second phase.

A. First Phase

Each MC BS transmit antenna sends data to a corresponding SC BS in the DL and the k-th SC BS transmits to the MC BS in the UL, simultaneously. The MC BS receives the desired signal plus its own transmit signals causing SI. The k-th SC BS receives the desired signal from the MC BS together with its own signal and unintended transmissions from other SC BSs, i.e., SI and SC-to-SC interference. The received signals at the MC BS and k-th SC BS are, respectively, given by [8]

$$\mathbf{y}^{(1)} = \sqrt{p_s}\,\mathbf{H}\mathbf{x} + \sqrt{p_m}\,\mathbf{Q}\mathbf{s} + \mathbf{n}, \tag{1}$$

$$\mathbf{r}_k^{(1)} = \sqrt{p_m}\,\mathbf{g}_k s_k + \sqrt{p_m}\sum_{j=1,\,j\neq k}^{K}\mathbf{g}_j s_j + \sqrt{p_s}\,\mathbf{q}_{s,k} x_k + \sqrt{p_s}\sum_{j=1,\,j\neq k}^{K}\mathbf{q}_{c,kj} x_j + \mathbf{v}_k, \tag{2}$$

where $\mathbf{H} = [\mathbf{h}_1, \ldots, \mathbf{h}_K] \in \mathbb{C}^{M_{\mathrm{rx}} \times K}$, $\mathbf{g}_k \in \mathbb{C}^{N_{\mathrm{rx}} \times 1}$, $\mathbf{Q} \in \mathbb{C}^{M_{\mathrm{rx}} \times M_{\mathrm{tx}}}$, $\mathbf{q}_{s,k} \in \mathbb{C}^{N_{\mathrm{rx}} \times 1}$, and $\mathbf{q}_{c,kj} \in \mathbb{C}^{N_{\mathrm{rx}} \times 1}$ denote the SC BS to the MC BS channel, MC BS to the k-th SC BS channel, SI channel at the MC BS, SI channel at the k-th SC BS, and the SC-to-SC interference channel from the j-th SC BS to the k-th SC BS, respectively. $\mathbf{x} = [x_1, \ldots, x_K]^T \in \mathbb{C}^{K \times 1}$, $\mathbf{s} = [s_1, \ldots, s_K]^T \in \mathbb{C}^{K \times 1}$, $\mathbf{n} \in \mathbb{C}^{M_{\mathrm{rx}} \times 1}$, $\mathbf{v}_k \in \mathbb{C}^{N_{\mathrm{rx}} \times 1}$, $p_m$, and $p_s$ indicate the transmit signals from the SC BSs, the MC BS transmit signals, noise at the MC BS, noise at the k-th SC BS, the transmit power of the MC BS, and the SC BSs, respectively. The elements of $\mathbf{n}$, $\mathbf{v}_k$, $\mathbf{x}$, and $\mathbf{s}$ are modeled by $\mathcal{CN}(0, 1)$. The channels $\mathbf{h}_k \sim \mathcal{CN}(\mathbf{0}, \beta_k \mathbf{I}_{M_{\mathrm{rx}}})$ and $\mathbf{g}_k \sim \mathcal{CN}(\mathbf{0}, \zeta_k \mathbf{I}_{N_{\mathrm{rx}}})$ consider both small-scale and large-scale fading, where $\beta_k$ and $\zeta_k$ describe the large-scale fading, which is assumed to remain unchanged over a coherence interval. The above channels depend on the favorable propagation assumption [2]. The elements of $\mathbf{Q}$, $\mathbf{q}_{s,k}$, and $\mathbf{q}_{c,kj}$ are distributed as $\mathcal{CN}(0, \sigma_m^2)$, $\mathcal{CN}(0, \sigma_{s,k}^2)$, and $\mathcal{CN}(0, \sigma_{c,kj}^2)$, respectively [8].

For tractability, we assume the additive quantization noise model [9]. After quantizing, the signals at the MC BS and the k-th SC BS are, respectively, expressed as

$$\mathbf{y}_q^{(1)} = \varepsilon_1 \mathbf{y}^{(1)} + \mathbf{n}_q, \tag{3}$$

$$\mathbf{r}_{q,k}^{(1)} = \theta_1 \mathbf{r}_k^{(1)} + \mathbf{v}_{q,k}, \tag{4}$$

where $\varepsilon_1 = 1 - \alpha$ and $\theta_1 = 1 - \alpha$ indicate the ADC resolution at the MC BS and the SC BSs, respectively. $\mathbf{n}_q$ and $\mathbf{v}_{q,k}$ denote the QN of the MC BS and the k-th SC BS with covariances $\mathbf{N}_q = \varepsilon_1(1 - \varepsilon_1)\,\mathrm{diag}(\mathbb{E}\{\mathbf{y}^{(1)}\mathbf{y}^{(1)H}\})$ and $\mathbf{V}_{q,k} = \theta_1(1 - \theta_1)\,\mathrm{diag}(\mathbb{E}\{\mathbf{r}_k^{(1)}\mathbf{r}_k^{(1)H}\})$, respectively. The $\alpha$ values for a d-bit resolution ADC are given in Table I. For $d > 5$, $\alpha = \frac{\pi\sqrt{3}}{2} \cdot 2^{-2d}$ [9].

[TABLE I: α for different d-bit ADC resolution]

Using an MRC decoder, the outputs of the MC BS and the SC BSs are written, respectively, as

$$y_{q,k}^{(1)} = \varepsilon_1\sqrt{p_s}\,\hat{\mathbf{h}}_k^H\mathbf{h}_k x_k + \varepsilon_1\sqrt{p_s}\sum_{j=1,\,j\neq k}^{K}\hat{\mathbf{h}}_k^H\mathbf{h}_j x_j + \varepsilon_1\sqrt{p_m}\,\hat{\mathbf{h}}_k^H\mathbf{Q}\mathbf{s} + \varepsilon_1\hat{\mathbf{h}}_k^H\mathbf{n} + \hat{\mathbf{h}}_k^H\mathbf{n}_q, \tag{5}$$

$$r_{q,k}^{(1)} = \hat{\mathbf{g}}_k^H \mathbf{r}_{q,k}^{(1)}, \tag{6}$$

where $\hat{\mathbf{h}}_k$ and $\hat{\mathbf{g}}_k$ are the estimates of $\mathbf{h}_k$ and $\mathbf{g}_k$, respectively.

B. Second Phase

The MC BS precodes data and transmits to the K SC BSs. Simultaneously, the k-th SC BS precodes data and sends to the k-th MC BS receive antenna. Assuming MRT, the received signals at the MC BS and the k-th SC BS are, respectively, represented by

$$y_k^{(2)} = \sqrt{\rho_{s,k}}\,\mathbf{g}_k^H\hat{\mathbf{g}}_k x_k + \sum_{j=1,\,j\neq k}^{K}\sqrt{\rho_{s,j}}\,\mathbf{g}_k^H\hat{\mathbf{g}}_j x_j + \sqrt{\rho_m}\,\mathbf{z}_{m,k}\hat{\mathbf{H}}\mathbf{s} + n_k, \tag{7}$$

$$r_k^{(2)} = \sqrt{\rho_m}\,\mathbf{h}_k^H\hat{\mathbf{h}}_k s_k + \sqrt{\rho_m}\sum_{j=1,\,j\neq k}^{K}\mathbf{h}_k^H\hat{\mathbf{h}}_j s_j + \sqrt{\rho_{s,k}}\,\mathbf{z}_{s,k}\hat{\mathbf{g}}_k x_k + \sum_{j=1,\,j\neq k}^{K}\sqrt{\rho_{s,j}}\,\mathbf{z}_{c,kj}\hat{\mathbf{g}}_j x_j + v_k, \tag{8}$$

where $\mathbf{z}_{m,k} \in \mathbb{C}^{1 \times M_{\mathrm{tx}}}$, $\mathbf{z}_{s,k} \in \mathbb{C}^{1 \times N_{\mathrm{tx}}}$, and $\mathbf{z}_{c,kj} \in \mathbb{C}^{1 \times N_{\mathrm{tx}}}$ indicate the SI channel at the k-th receive antenna of the MC BS, SI channel at the k-th SC BS, and SC-to-SC interference channel from the j-th SC BS to the k-th SC BS, respectively. Again, we have $\hat{\mathbf{H}} = [\hat{\mathbf{h}}_1, \ldots, \hat{\mathbf{h}}_K]$, $\rho_m = p_m K / \mathbb{E}\{\mathrm{tr}(\hat{\mathbf{H}}\hat{\mathbf{H}}^H)\}$, and $\rho_{s,k} = p_s / \mathbb{E}\{\mathrm{tr}(\hat{\mathbf{g}}_k\hat{\mathbf{g}}_k^H)\}$. The elements of $\mathbf{z}_{m,k}$, $\mathbf{z}_{s,k}$, and $\mathbf{z}_{c,kj}$ are $\mathcal{CN}(0, \eta_{m,k}^2)$, $\mathcal{CN}(0, \eta_{s,k}^2)$, and $\mathcal{CN}(0, \eta_{c,kj}^2)$, respectively. After quantization, the signals at the MC BS and the k-th SC BS are given, respectively, as

$$y_{q,k}^{(2)} = \varepsilon_2 y_k^{(2)} + n_{q,k}, \tag{9}$$

$$r_{q,k}^{(2)} = \theta_2 r_k^{(2)} + v_{q,k}, \tag{10}$$

where $\varepsilon_2$ and $\theta_2$ indicate the ADC resolution at the MC BS and the SC BSs, respectively. $n_{q,k}$ and $v_{q,k}$ denote the QN at the MC BS and the k-th SC BS with covariances $N_{q,k} = \varepsilon_2(1 - \varepsilon_2)\mathbb{E}\{|y_k^{(2)}|^2\}$ and $V_{q,k} = \theta_2(1 - \theta_2)\mathbb{E}\{|r_k^{(2)}|^2\}$, respectively.

C. Channel Estimation

Each MC BS transmit antenna sends a mutually orthogonal pilot signal to the corresponding SC BS for the DL channel estimation and the SC BSs send orthogonal sequences of equal length to the MC BS for the UL channel estimation. This dictates that the pilot length $\tau \geq 2K$ [8].¹

¹ During the channel estimation phase, the SC BSs can transmit their mutually orthogonal pilot sequences to the MC BS while the MC BS keeps silent and vice-versa as in [11]. Another technique is that both the MC BS and the SC BSs transmit the pilot signals simultaneously as explained in [2]. However, both approaches achieve equal performance.

By using
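MMSE estimation, the k-th SC BS to MC BS true channel is decomposed as $\mathbf{h}_k = \hat{\mathbf{h}}_k + \tilde{\mathbf{h}}_k$, where $\tilde{\mathbf{h}}_k \sim \mathcal{CN}(\mathbf{0}, \tilde{\beta}_k^2 \mathbf{I}_{M_{\mathrm{rx}}})$ denotes the error vector for $\hat{\mathbf{h}}_k \sim \mathcal{CN}(\mathbf{0}, \hat{\beta}_k^2 \mathbf{I}_{M_{\mathrm{rx}}})$ with $\hat{\beta}_k^2 = \frac{\varepsilon_1 \tau p_\tau \beta_k^2}{1 + \tau p_\tau \beta_k}$ and $\tilde{\beta}_k^2 = \beta_k - \hat{\beta}_k^2$ [11]. $p_\tau$ is the pilot power. Similarly, the MC BS to the k-th SC BS channel is given as $\mathbf{g}_k = \hat{\mathbf{g}}_k + \tilde{\mathbf{g}}_k$, where $\tilde{\mathbf{g}}_k \sim \mathcal{CN}(\mathbf{0}, \tilde{\zeta}_k^2 \mathbf{I}_{N_{\mathrm{rx}}})$ indicates the error, and $\hat{\mathbf{g}}_k \sim \mathcal{CN}(\mathbf{0}, \hat{\zeta}_k^2 \mathbf{I}_{N_{\mathrm{rx}}})$ for $\hat{\zeta}_k^2 = \frac{\theta_1 \tau p_\tau \zeta_k^2}{1 + \tau p_\tau \zeta_k}$, $\tilde{\zeta}_k^2 = \zeta_k - \hat{\zeta}_k^2$.

III. SUM-RATE ANALYSIS

To derive the rate, we refer to a technique employed in [2], where the received signal is expressed as a known mean gain multiplied by the desired signal plus an uncorrelated additive noise. Specifically, to derive the rate at the MC BS for the first phase, (5) is rewritten as

$$y_{q,k}^{(1)} = \varepsilon_1\sqrt{p_s}\,\mathbb{E}\{\hat{\mathbf{h}}_k^H\mathbf{h}_k\}x_k + \varepsilon_1\sqrt{p_s}\,(\hat{\mathbf{h}}_k^H\mathbf{h}_k - \mathbb{E}\{\hat{\mathbf{h}}_k^H\mathbf{h}_k\})x_k + \varepsilon_1\sqrt{p_s}\sum_{j=1,\,j\neq k}^{K}\hat{\mathbf{h}}_k^H\mathbf{h}_j x_j + \varepsilon_1\sqrt{p_m}\,\hat{\mathbf{h}}_k^H\mathbf{Q}\mathbf{s} + \varepsilon_1\hat{\mathbf{h}}_k^H\mathbf{n} + \hat{\mathbf{h}}_k^H\mathbf{n}_q,$$

where the first term denotes the desired signal and the remaining terms describe the effective noise, which are uncorrelated. Noting that the independent Gaussian noise with the same variance is the worst-case uncorrelated additive noise [2], a lower bound on the achievable rate at the MC BS is

$$R_{m,k}^{(1)} = \log_2\left(1 + \frac{\varepsilon_1^2 p_s |\mathbb{E}\{\hat{\mathbf{h}}_k^H\mathbf{h}_k\}|^2}{\varepsilon_1^2 p_s\,\mathrm{var}(\hat{\mathbf{h}}_k^H\mathbf{h}_k) + \kappa_{m,k}^{(1)} + \mu_{m,k}^{(1)}}\right), \tag{11}$$

where $\kappa_{m,k}^{(1)} = \varepsilon_1^2\big(p_s\sum_{j=1,\,j\neq k}^{K}\mathbb{E}\{|\hat{\mathbf{h}}_k^H\mathbf{h}_j|^2\} + p_m\mathbb{E}\{|\hat{\mathbf{h}}_k^H\mathbf{Q}|^2\} + \mathbb{E}\{\|\hat{\mathbf{h}}_k\|^2\}\big)$ and $\mu_{m,k}^{(1)} = \varepsilon_1(1 - \varepsilon_1)\mathbb{E}\{\hat{\mathbf{h}}_k^H\,\mathrm{diag}(p_s\mathbf{H}\mathbf{H}^H + p_m\mathbf{Q}\mathbf{Q}^H + \mathbf{I}_{M_{\mathrm{rx}}})\hat{\mathbf{h}}_k\}$. Using a similar technique as above, the achievable rate at the k-th SC BS for the first phase is expressed as

$$R_{s,k}^{(1)} = \log_2\left(1 + \frac{\theta_1^2 p_m |\mathbb{E}\{\hat{\mathbf{g}}_k^H\mathbf{g}_k\}|^2}{\theta_1^2 p_m\,\mathrm{var}(\hat{\mathbf{g}}_k^H\mathbf{g}_k) + \kappa_{s,k}^{(1)} + \mu_{s,k}^{(1)}}\right), \tag{12}$$

where $\kappa_{s,k}^{(1)} = \theta_1^2\big(p_m\sum_{j=1,\,j\neq k}^{K}\mathbb{E}\{|\hat{\mathbf{g}}_k^H\mathbf{g}_j|^2\} + p_s\mathbb{E}\{|\hat{\mathbf{g}}_k^H\mathbf{q}_{s,k}|^2\} + p_s\sum_{j=1,\,j\neq k}^{K}\mathbb{E}\{|\hat{\mathbf{g}}_k^H\mathbf{q}_{c,kj}|^2\} + \mathbb{E}\{\|\hat{\mathbf{g}}_k\|^2\}\big)$ and $\mu_{s,k}^{(1)} = \theta_1(1 - \theta_1)\mathbb{E}\{\hat{\mathbf{g}}_k^H\,\mathrm{diag}(p_m\mathbf{g}_k\mathbf{g}_k^H + p_m\sum_{j\neq k}^{K}\mathbf{g}_j\mathbf{g}_j^H + p_s\mathbf{q}_{s,k}\mathbf{q}_{s,k}^H + p_s\sum_{j\neq k}^{K}\mathbf{q}_{c,kj}\mathbf{q}_{c,kj}^H + \mathbf{I}_{N_{\mathrm{rx}}})\hat{\mathbf{g}}_k\}$. To derive the rates for the second phase, we refer to (9) and (10) for the UL/DL, respectively, and applying a similar approach, we have

$$R_{m,k}^{(2)} = \log_2\left(1 + \frac{\varepsilon_2^2\rho_{s,k}|\mathbb{E}\{\mathbf{g}_k^H\hat{\mathbf{g}}_k\}|^2}{\varepsilon_2^2\rho_{s,k}\,\mathrm{var}(\mathbf{g}_k^H\hat{\mathbf{g}}_k) + \kappa_{m,k}^{(2)} + N_{q,k}}\right), \tag{13}$$

$$R_{s,k}^{(2)} = \log_2\left(1 + \frac{\theta_2^2\rho_m|\mathbb{E}\{\mathbf{h}_k^H\hat{\mathbf{h}}_k\}|^2}{\theta_2^2\rho_m\,\mathrm{var}(\mathbf{h}_k^H\hat{\mathbf{h}}_k) + \kappa_{s,k}^{(2)} + V_{q,k}}\right), \tag{14}$$

where $\kappa_{m,k}^{(2)} = \varepsilon_2^2\big(\rho_{s,k}\sum_{j=1,\,j\neq k}^{K}\mathbb{E}\{|\hat{\mathbf{g}}_k^H\mathbf{g}_j|^2\} + \rho_m\mathbb{E}\{|\mathbf{z}_{m,k}\hat{\mathbf{H}}|^2\} + 1\big)$ and $\kappa_{s,k}^{(2)} = \theta_2^2\big(\rho_m\sum_{j=1,\,j\neq k}^{K}\mathbb{E}\{|\hat{\mathbf{h}}_k^H\mathbf{h}_j|^2\} + \rho_{s,k}\mathbb{E}\{|\mathbf{z}_{s,k}\hat{\mathbf{g}}_k|^2\} + \sum_{j=1,\,j\neq k}^{K}\rho_{s,j}\mathbb{E}\{|\mathbf{z}_{c,kj}\hat{\mathbf{g}}_j|^2\} + 1\big)$.

A. Large System Analysis

To determine the closed-form expressions, we assume that for the first phase, the MC BS receive antennas $M_{\mathrm{rx}} \to \infty$ and the transmit antennas are fixed at $M_{\mathrm{tx}} = K$. At the k-th SC BS, $N_{\mathrm{rx}} \to \infty$ with a single transmit antenna. For the second phase, the MC BS transmit antennas $M_{\mathrm{tx}} \to \infty$ with the receive antennas fixed at $M_{\mathrm{rx}} = K$. For the k-th SC BS, $N_{\mathrm{tx}} \to \infty$ and a single receive antenna are considered.

Theorem 1: With the MRC receive filter in the first phase and under the assumption of imperfect CSI, the UL/DL rates of the wireless backhaul link supported by a low-resolution ADC quantized FD massive MIMO are approximated by

$$R_{m,k}^{(1)} = \log_2\left(1 + \frac{\varepsilon_1 p_s\hat{\beta}_k^2 M_{\mathrm{rx}}}{\tilde{\kappa}_{m,k}^{(1)} + \tilde{\mu}_{m,k}^{(1)}}\right), \tag{15}$$

$$R_{s,k}^{(1)} = \log_2\left(1 + \frac{\theta_1 p_m\hat{\zeta}_k^2 N_{\mathrm{rx}}}{\tilde{\kappa}_{s,k}^{(1)} + \tilde{\mu}_{s,k}^{(1)}}\right), \tag{16}$$

where $\tilde{\kappa}_{m,k}^{(1)} = \varepsilon_1\big(p_s\sum_{j=1}^{K}\beta_j + p_m\sigma_m^2 K + 1\big)$, $\tilde{\mu}_{m,k}^{(1)} = (1 - \varepsilon_1)\big(p_s\hat{\beta}_k^2 + p_s\sum_{j=1}^{K}\beta_j + p_m\sigma_m^2 K + 1\big)$, $\tilde{\kappa}_{s,k}^{(1)} = \theta_1\big(p_m\sum_{j=1}^{K}\zeta_j + p_s\sigma_{s,k}^2 + \sum_{j=1,\,j\neq k}^{K}p_s\sigma_{c,kj}^2 + 1\big)$, and $\tilde{\mu}_{s,k}^{(1)} = (1 - \theta_1)\big(p_m\hat{\zeta}_k^2 + p_m\sum_{j=1}^{K}\zeta_j + p_s\sigma_{s,k}^2 + \sum_{j=1,\,j\neq k}^{K}p_s\sigma_{c,kj}^2 + 1\big)$.

Proof: Please see the Appendix.

Note that Theorem 1 exposes the influence of the ADC resolution at the receivers, the SI, and the SC-to-SC interference strength together with the number of receive antennas and the large-scale fading. Again, the rates are strongly impacted by the number of SCs due to the enhancements in the SI, SC-to-SC interference, and QN terms.

Theorem 2: By using MRT precoding for the second phase and under the assumption of imperfect CSI, the UL/DL rates of the low-resolution quantized FD massive MIMO wireless backhaul link are given by

$$R_{m,k}^{(2)} = \log_2\left(1 + \frac{\varepsilon_2 p_s\hat{\zeta}_k^2 N_{\mathrm{tx}}}{\tilde{\kappa}_{m,k}^{(2)} + \tilde{N}_{q,k}}\right), \tag{17}$$

$$R_{s,k}^{(2)} = \log_2\left(1 + \frac{\theta_2 p_m K\,\dfrac{\hat{\beta}_k^4}{\sum_{j=1}^{K}\hat{\beta}_j^2}\,M_{\mathrm{tx}}}{\tilde{\kappa}_{s,k}^{(2)} + \tilde{V}_{q,k}}\right), \tag{18}$$

where $\tilde{\kappa}_{m,k}^{(2)} = \varepsilon_2\big(p_s\zeta_k K + p_m K\eta_{m,k}^2 + 1\big)$, $\tilde{N}_{q,k} = (1 - \varepsilon_2)\big(p_s\hat{\zeta}_k^2 N_{\mathrm{tx}} + p_s K\zeta_k + p_m K\eta_{m,k}^2 + 1\big)$, $\tilde{\kappa}_{s,k}^{(2)} = \theta_2\big(p_m K\beta_k + p_s\eta_{s,k}^2 + \sum_{j=1,\,j\neq k}^{K}p_s\eta_{c,kj}^2 + 1\big)$, and $\tilde{V}_{q,k} = (1 - \theta_2)\Big(\frac{K p_m}{\sum_{j=1}^{K}\hat{\beta}_j^2}\hat{\beta}_k^4 M_{\mathrm{tx}} + p_m K\beta_k + p_s\eta_{s,k}^2 + \sum_{j=1,\,j\neq k}^{K}p_s\eta_{c,kj}^2 + 1\Big)$.

Proof: This is easily proved by following the Appendix.

Here, it should be emphasized that the QN has the same order as the desired signal.

IV. NUMERICAL RESULTS AND CONCLUSION

In this section, we validate the analytic results through Monte Carlo simulations. All simulations are conducted over $10^5$ channel realizations. We define the signal-to-noise ratio (SNR) as $\mathrm{SNR} \triangleq p_m = p_s$. The following parameters are used throughout. First phase: $\beta_k = \beta = 0.1\ \forall k$, $\zeta_k = \zeta = 0.1\ \forall k$, $\sigma_m^2 = 0.3$, $\sigma_{s,k}^2 = \sigma_s^2 = 0.3\ \forall k$, and $\sigma_{c,kj}^2 = \sigma_c^2 = 0.3\ \forall k, j$. Second phase: $\eta_{m,k}^2 = \eta_m^2 = 0.3\ \forall k$, $\eta_{s,k}^2 = \eta_s^2 = 0.3\ \forall k$, and $\eta_{c,kj}^2 = \eta_c^2 = 0.3\ \forall k, j$. Here, $K = 4$, $p_\tau = 10$ dB, $\tau = 2K$, and $p_s = p_m = 10$ dB are set. Please refer to Table I for values of $\theta_1$, $\varepsilon_1$, $\theta_2$, and $\varepsilon_2$. For instance, with ADC resolution $d = 1$, $\alpha = 0.3634$, and thus $\theta_1 = 0.6366$. Fig. 1(a) shows a plot of the sum-rate against the number of receive antennas for the first phase. The first subplot shows the UL sum-rate at the MC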
BS and the second subplot shows the DL sum-rate at the SC BSs. As shown, our analytic results are exact. The sum-rate generally improves with an increasing number of receive antennas but reduces as the ADC resolution decreases. With only 2-bit ADC resolution, the performance gap against the perfect system (infinite resolution) is extremely low. To achieve a rate of 10 bps/Hz at the MC BS, the perfect ADC requires about 90 receive antennas while the 2-bit ADC needs about 115 receive antennas. Thus the rate loss due to QN is effectively compensated by increasing the number of receive antennas.

Fig. 1(b) illustrates the sum-rate versus the number of transmit antennas for the second phase. The first subfigure indicates the UL sum-rate at the MC BS versus the number of transmit antennas employed by the SC BSs, while the second subplot represents the sum-rate versus the number of transmit antennas at the MC BS. The sum-rate of the low-resolution ADC system saturates rapidly with increasing transmit antennas, and the performance gap with the infinite-resolution system grows as the transmit antennas increase in number. This observation is consistent with our analysis of Theorem 2. Therefore utilizing massive transmit antennas cannot effectively compensate for the loss. This problem can be circumvented by utilizing high-resolution ADCs during the second phase. This is feasible since far fewer receive antennas are used.

[Fig. 1. Sum-rate vs number of receive antennas (a) first phase and (b) second phase.]

Finally, we illustrate the sum-rate versus the number of quantization bits. For the first phase, $M_{\mathrm{rx}} = 300$, $M_{\mathrm{tx}} = K$, and $N_{\mathrm{rx}} = 200$. For the second phase, because of switching antenna roles, $M_{\mathrm{tx}} = 300$, $M_{\mathrm{rx}} = K$, and $N_{\mathrm{tx}} = 200$ with $K = 4$. For each SC BS to MC BS link, we add the UL and DL rates. The result is then aggregated over all K SCs. As shown in Fig. 2, the sum-rate for the first phase and the second phase increases with the number of bits. This is intuitive, as the increase in quantization bits reduces the QN effects and the sum-rate subsequently improves. Again, when all system parameters are set at equal values, in the low bit regime, the sum-rate of the first phase outperforms that of the second phase. This is because, in the second phase, the desired signal and the QN are of the same order. Therefore with low-resolution bits, the QN is more pronounced and significantly degrades the sum-rate. This result differs from [8], where it is shown that the sum-rates of the first and second phases are equal when all parameters are equal.

[Fig. 2. Sum-rate vs number of quantization bits ($\beta = \zeta = 0.1$, $\sigma_m^2 = \sigma_s^2 = \sigma_c^2 = 0.3$, $\eta_m^2 = \eta_s^2 = \eta_c^2 = 0.3$).]

In this letter, it is shown that the QN which arises from the use of low-resolution ADCs degrades the performance. For the first phase, the rate loss due to QN and SI is effectively compensated by increasing the number of receive antennas. However, in the second phase, it is noted that the desired signal and the QN are of the same order. Thus the large number of transmit antennas may not effectively suppress the QN.

APPENDIX

To prove (15), we first refer to the term corresponding to the QN in (11), i.e., $\mu_{m,k}^{(1)}$. We can write

$$[\mathrm{diag}(p_s\mathbf{H}\mathbf{H}^H + p_m\mathbf{Q}\mathbf{Q}^H + \mathbf{I}_{M_{\mathrm{rx}}})]_{mm} = p_s\sum_{i=1}^{K}|h_{mi}|^2 + p_m\sum_{t=1}^{M_{\mathrm{tx}}}|q_{mt}|^2 + 1,$$

where $h_{mi}$ is the element on the m-th row and i-th column of $\mathbf{H}$ and $q_{mt}$ is the mt-th entry of $\mathbf{Q}$. Therefore,

$$\mathbb{E}\{\hat{\mathbf{h}}_k^H\,\mathrm{diag}(p_s\mathbf{H}\mathbf{H}^H + p_m\mathbf{Q}\mathbf{Q}^H + \mathbf{I}_{M_{\mathrm{rx}}})\hat{\mathbf{h}}_k\} = \mathbb{E}\Big\{\sum_{m=1}^{M_{\mathrm{rx}}}|\hat{h}_{mk}|^2\Big(p_s\sum_{i=1}^{K}|h_{mi}|^2 + p_m\sum_{t=1}^{M_{\mathrm{tx}}}|q_{mt}|^2 + 1\Big)\Big\}.$$

Expanding the right hand side yields

$$\mathbb{E}\Big\{p_s\sum_{m=1}^{M_{\mathrm{rx}}}|\hat{h}_{mk}|^2|h_{mk}|^2 + p_s\sum_{m=1}^{M_{\mathrm{rx}}}|\hat{h}_{mk}|^2\sum_{i\neq k}^{K}|h_{mi}|^2 + p_m\sum_{m=1}^{M_{\mathrm{rx}}}|\hat{h}_{mk}|^2\sum_{t=1}^{M_{\mathrm{tx}}}|q_{mt}|^2 + \sum_{m=1}^{M_{\mathrm{rx}}}|\hat{h}_{mk}|^2\Big\}.$$

Noting that $h_{mk} = \hat{h}_{mk} + \tilde{h}_{mk}$, we have

$$\mathbb{E}\Big\{p_s\sum_{m=1}^{M_{\mathrm{rx}}}|\hat{h}_{mk}|^4 + p_s\sum_{m=1}^{M_{\mathrm{rx}}}|\hat{h}_{mk}|^2|\tilde{h}_{mk}|^2 + p_s\sum_{m=1}^{M_{\mathrm{rx}}}\sum_{i\neq k}^{K}|\hat{h}_{mk}|^2|\hat{h}_{mi}|^2 + p_s\sum_{m=1}^{M_{\mathrm{rx}}}\sum_{i\neq k}^{K}|\hat{h}_{mk}|^2|\tilde{h}_{mi}|^2 + p_m\sum_{m=1}^{M_{\mathrm{rx}}}\sum_{t=1}^{M_{\mathrm{tx}}}|\hat{h}_{mk}|^2|q_{mt}|^2 + \sum_{m=1}^{M_{\mathrm{rx}}}|\hat{h}_{mk}|^2\Big\}.$$

By using $\mathbb{E}\{|\hat{h}_{mk}|^4\} = 2\hat{\beta}_{mk}^4$, $\mathbb{E}\{|\hat{h}_{mk}|^2|\tilde{h}_{mk}|^2\} = \hat{\beta}_{mk}^2\tilde{\beta}_{mk}^2$, $\mathbb{E}\{|\hat{h}_{mk}|^2|q_{mt}|^2\} = \hat{\beta}_{mk}^2\sigma_{mt}^2$, and substituting into the above expression, we obtain the approximation $\tilde{\mu}_{m,k}^{(1)}$ of $\mu_{m,k}^{(1)}$. The remaining expectations in (11) are obtained from [8] to attain (15). Following a similar approach, (16) is derived.

REFERENCES

[1] T. L. Marzetta, “Noncooperative cellular wireless with unlimited numbers of base station antennas,” IEEE Trans. Wireless Commun., vol. 9, no. 11, pp. 3590–3600, Nov. 2010.
[2] H. Q. Ngo, H. A. Suraweera, M. Matthaiou, and E. G. Larsson, “Multipair full-duplex relaying with massive arrays and linear processing,” IEEE J. Sel. Areas Commun., vol. 32, no. 9, pp. 1721–1737, Sep. 2014.
[3] D. Kim, H. Lee, and D. Hong, “A survey of in-band full-duplex transmission: From the perspective of PHY and MAC layers,” IEEE Commun. Surveys Tuts., vol. 17, no. 4, pp. 2017–2046, 4th Quart., 2015.
[4] T. Riihonen, S. Wagner, and R. Wichman, “Mitigation of loopback self-interference in full-duplex MIMO relays,” IEEE Trans. Signal Process., vol. 59, no. 12, pp. 5983–5993, Dec. 2011.
[5] A. Shojaeifard et al., “Massive MIMO-enabled full-duplex cellular networks,” IEEE Trans. Commun., vol. 65, no. 11, pp. 4734–4750, Nov. 2017.
[6] S. Singh, H. S. Dhillon, and J. G. Andrews, “Offloading in heterogeneous networks: Modeling, analysis, and design insights,” IEEE Trans. Wireless Commun., vol. 12, no. 5, pp. 2484–2496, May 2013.
[7] L. Sanguinetti, A. L. Moustakas, and M. Debbah, “Interference management in 5G reverse TDD HetNets with wireless backhaul: A large system analysis,” IEEE J. Sel. Areas Commun., vol. 33, no. 6, pp. 1187–1200, Jun. 2015.
[8] P. Anokye, R. K. Ahiadormey, C. Song, and K.-J. Lee, “Achievable sum-rate analysis of massive MIMO full-duplex wireless backhaul links in heterogeneous cellular networks,” IEEE Access, vol. 6, pp. 23456–23469, May 2018.
[9] L. Fan, S. Jin, C.-K. Wen, and H. Zhang, “Uplink achievable rate for massive MIMO systems with low-resolution ADC,” IEEE Commun. Lett., vol. 19, no. 12, pp. 2186–2189, Dec. 2015.
[10] J. Zhang, L. Dai, S. Sun, and Z. Wang, “On the spectral efficiency of massive MIMO systems with low-resolution ADCs,” IEEE Commun. Lett., vol. 20, no. 5, pp. 842–845, May 2016.
[11] C. Kong et al., “Full-duplex massive MIMO relaying systems with low-resolution ADCs,” IEEE Trans. Wireless Commun., vol. 16, no. 6, pp. 5033–5047, Aug. 2017.
[12] G. Carchon and B. Nanwelaers, “Power and noise limitations of active circulators,” IEEE Trans. Microw. Theory Tech., vol. 48, no. 2, pp. 316–319, Feb. 2000.
Efficient Computation of Multivariate Rayleigh and Exponential Distributions

Reneeta Sara Isaac and Neelesh B. Mehta, Senior Member, IEEE
Abstract—We propose an efficient approach for the computation of cumulative distribution functions of N correlated Rayleigh or exponential random variables (RVs) for arbitrary covariance matrices, which arise in the design and analysis of many wireless systems. Compared to the approaches in the literature, it employs a fast and accurate randomized quasi-Monte Carlo method that markedly reduces the computational complexity by several orders of magnitude as N or the correlation among the RVs increases. Numerical results show that the distribution can now be computed for values of N that are an order of magnitude larger. Its application to the performance analysis of selection combining is also shown.

Index Terms—Multivariate, Rayleigh, exponential, correlated fading, cumulative distribution function.

Manuscript received July 20, 2018; accepted October 7, 2018. Date of publication October 15, 2018; date of current version April 9, 2019. This work was supported by the Department of Telecommunications, Ministry of Communications, India, as part of the Indigenous 5G Test Bed Project and the Qualcomm Innovation Fellowship. The associate editor coordinating the review of this paper and approving it for publication was A. Kammoun. (Corresponding author: Neelesh B. Mehta.)
The authors are with the Department of Electrical Communication Engineering, Indian Institute of Science, Bengaluru 560012, India (e-mail: reneetaisaac@gmail.com; nbmehta@iisc.ac.in).
Digital Object Identifier 10.1109/LWC.2018.2875999

I. INTRODUCTION

Multivariate or correlated Rayleigh and exponential distributions arise in the design and analysis of many wireless systems. An example of this is a multiple input multiple output (MIMO) system when the transmit or receive antennas are spaced close together, which leads to the channel gains of the different transmit-receive antenna pairs being correlated [1, Ch. 6], [2, Ch. 13]. Another example is a sensor network, in which the measurements of different sensor nodes are correlated. Alternately, when the sensor nodes or the relay nodes in a cooperative relay system are close to each other or when there is a common scatterer in the propagation environment, the channel gains seen by the different nodes are correlated [3], [4]. These distributions also arise in orthogonal frequency division multiplexing (OFDM) systems, where the subchannel gains of a user are correlated [2, Ch. 19].

Performance analyses of communication techniques employed in these systems often lead to expressions that involve the multivariate cumulative distribution function (CDF) of such correlated channel gains. For example, the outage probability of an N-branch selection combiner (SC) in a multi-antenna system [1, Ch. 9] or of an N-relay cooperative system [4] is written in terms of the multivariate CDF of the channel gains.

Considerable effort has been made in the literature to derive tractable mathematical representations for the multivariate probability density functions (PDFs) and CDFs of these distributions. One class of approaches considers arbitrary covariance matrices, but severely limits the number of correlated random variables (RVs) N. For example, infinite series representations are derived for N = 2 in [5] and N = 3 and 4 in [6]–[8] for the arbitrary correlation model.

A second class of approaches limits itself to special forms of the covariance matrix. For example, for the constant correlation model, the CDF $F_{\mathbf{E}}(\mathbf{a})$ of N exponential RVs $\mathbf{E} = (E_1, \ldots, E_N)$ evaluated at $\mathbf{a} = (a_1, \ldots, a_N)$ is written in terms of an infinite series as follows [9]:

$$F_{\mathbf{E}}(\mathbf{a}) = \frac{1 - \rho}{1 + (N - 1)\rho}\sum_{n=0}^{\infty}\left(\frac{\rho}{1 + (N - 1)\rho}\right)^{n} \times \sum_{\substack{(l_1, \ldots, l_N) \\ l_1 \geq 0, \ldots, l_N \geq 0 \\ l_1 + \cdots + l_N = n}}\binom{n}{l_1, \ldots, l_N}\prod_{i=1}^{N}\gamma\left(\frac{a_i}{c(1 - \rho)},\, l_i + 1\right), \tag{1}$$

where $\rho^2$ is the correlation coefficient between $E_i$ and $E_j$, for $i \neq j$, $c$ is the mean of $E_i$, for $1 \leq i \leq N$, and $\gamma(x, k)$ is the lower incomplete gamma function [10, eq. (6.5.2)]. Computing the nth term of this series requires enumerating all $\binom{N + n - 1}{N - 1}$ combinations of the vector of non-negative integers $(l_1, \ldots, l_N)$ that satisfy $\sum_{i=1}^{N} l_i = n$. This entails a computational complexity of $O(n^{N-1})$, which increases exponentially with N. This significant complexity challenge also arises while evaluating the CDF of correlated Nakagami-m RVs that is derived in [11].

When the inverse of the covariance matrix of the underlying Gaussian RVs has a tridiagonal structure, the multivariate Rayleigh CDF simplifies [12]. However, it is still in the form of an infinite series, evaluating which entails $O(n^{N-1})$ complexity. The one exception is the specialized correlation model of [13], in which the multivariate PDF and CDF are given in single-integral forms. These can be numerically integrated using Gauss-Laguerre quadrature [10, eq. (25.4.45)].

A third class of approaches considers arbitrary covariance matrices and applies to any N. In [14], the multivariate Nakagami-m CDF for an arbitrary correlation matrix is given as a multi-dimensional integration of the PDF, which itself is in the form of an infinite series. In [15], the Green’s approach is used to approximate any arbitrary covariance matrix to a matrix whose inverse has a tridiagonal form. However, the multivariate Rayleigh CDF again involves (N − 1) nested infinite series, which entails a computational complexity of $O(n^{N-1})$ if the series is truncated after n terms. Therefore, results for N > 5 are seldom available in the literature for general correlation models. However, larger values of N are of interest in practice. For example, more antennas for next generation systems and wireless systems with more relays and
ISAAC AND MEHTA: EFFICIENT COMPUTATION OF MULTIVARIATE RAYLEIGH AND EXPONENTIAL DISTRIBUTIONS 457
sensors are being considered. In many systems, the covariance matrix need not take a specialized form.

Contributions: In this letter, we first propose a novel approach to numerically compute the multivariate CDF of N Rayleigh or exponential RVs with an arbitrary covariance matrix. Its key ideas are as follows. It employs a series of transformations to first convert the multivariate PDF of Rayleigh or exponential RVs to the multivariate PDF of correlated Gaussian RVs. The CDF expression is then accurately and quickly computed using the randomized quasi-Monte Carlo (RQMC) method [16, Ch. 4]. In contrast to the above approaches, we show that it can compute the CDF easily even for N = 50, which is much larger than the values considered in [5]–[9], [11], [12], [14], and [15]. Notably, it can do so even when the RVs are highly correlated. This is unlike the above approaches, which require more terms to be computed as the RVs become more correlated since the rate of convergence of the series decreases. We then illustrate the utility of our approach by presenting results for the performance analysis of an N-branch SC, which is a popular receive diversity combining technique [1, Ch. 9], for a range of values of N.

Outline: In Section II, we present the proposed approach. In Section III, we compare its accuracy and computational complexity with existing approaches. In Section IV, we analyze the outage probability of SC. Our conclusions follow in Section V.

Notations: The probability of an event A is denoted by Pr(A). The expectation of an RV X is denoted by $\mathbb{E}[X]$ and its variance is denoted by var(X). The joint CDF of a random vector $\mathbf{X} = (X_1, \ldots, X_N)$ evaluated at $\mathbf{x} = (x_1, \ldots, x_N)$ is denoted by $F_{\mathbf{X}}(\mathbf{x}) = \Pr(X_1 \le x_1, \ldots, X_N \le x_N)$. The joint PDF of $\mathbf{X}$ is denoted by $f_{\mathbf{X}}(\cdot)$. The transpose of a vector $\mathbf{x}$ is denoted by $\mathbf{x}^T$.

II. COMPUTATION OF MULTIVARIATE EXPONENTIAL AND RAYLEIGH CDF

We first consider the multivariate exponential distribution and then the multivariate Rayleigh distribution.

A. Multivariate Exponential CDF

Consider two real Gaussian random vectors $\mathbf{X} = (X_1, \ldots, X_N)^T$ and $\mathbf{Y} = (Y_1, \ldots, Y_N)^T$ with zero mean whose second-order moments are given in general by
\[
\begin{aligned}
\mathbb{E}[X_k^2] &= \mathbb{E}[Y_k^2] = \sigma_k^2, && \text{for } 1 \le k \le N,\\
\mathbb{E}[X_k X_j] &= \mathbb{E}[Y_k Y_j] = \rho_{kj}\sigma_k\sigma_j, && \text{for } k \ne j,\\
\mathbb{E}[X_k Y_j] &= 0, && \text{for } 1 \le k, j \le N.
\end{aligned} \tag{2}
\]
Thus, $\rho_{kj}$ is the correlation coefficient between $X_k$ and $X_j$ or $Y_k$ and $Y_j$. The multivariate Gaussian PDF $f_{\mathbf{X},\mathbf{Y}}(\cdot,\cdot)$ of $\mathbf{X}$ and $\mathbf{Y}$ is
\[
f_{\mathbf{X},\mathbf{Y}}(\mathbf{x},\mathbf{y}) = \frac{1}{\sqrt{|\boldsymbol{\Sigma}|(2\pi)^{2N}}}\exp\left(-\frac{1}{2}\begin{bmatrix}\mathbf{x}^T & \mathbf{y}^T\end{bmatrix}\boldsymbol{\Sigma}^{-1}\begin{bmatrix}\mathbf{x}\\ \mathbf{y}\end{bmatrix}\right), \tag{3}
\]
where $\mathbf{x} = (x_1, \ldots, x_N)^T$, $\mathbf{y} = (y_1, \ldots, y_N)^T$, $\boldsymbol{\Sigma} = \mathbb{E}\left[\begin{bmatrix}\mathbf{X}\\ \mathbf{Y}\end{bmatrix}\begin{bmatrix}\mathbf{X}^T & \mathbf{Y}^T\end{bmatrix}\right]$, and $|\cdot|$ denotes determinant.

Converting to polar coordinates, let
\[
E_i = X_i^2 + Y_i^2 \quad\text{and}\quad \phi_i = \tan^{-1}(Y_i/X_i). \tag{4}
\]
It can be seen that $E_i$ is an exponential RV with mean $2\sigma_i^2$ and $\phi_i$ is uniformly distributed between 0 and $2\pi$. Furthermore, $E_i$ and $\phi_j$ are mutually independent for all i and j (including i = j) and $(\mathbb{E}[E_i E_j] - 4\sigma_i^2\sigma_j^2)/\sqrt{\mathrm{var}(E_i)\,\mathrm{var}(E_j)} = \rho_{ij}^2$.

The multivariate CDF $F_{\mathbf{E}}(\cdot)$ of $\mathbf{E} = (E_1, \ldots, E_N)^T$ evaluated at $\mathbf{a} = (a_1, \ldots, a_N)^T$ is then
\[
F_{\mathbf{E}}(\mathbf{a}) = \int_0^{2\pi}\cdots\int_0^{2\pi}\int_0^{a_1}\cdots\int_0^{a_N} f_{\mathbf{E},\boldsymbol{\Phi}}(\mathbf{e},\boldsymbol{\theta})\, d\mathbf{e}\, d\boldsymbol{\theta}, \tag{5}
\]
where $\boldsymbol{\Phi} = (\phi_1, \ldots, \phi_N)^T$ and $f_{\mathbf{E},\boldsymbol{\Phi}}(\cdot)$ is the joint PDF of $\mathbf{E}$ and $\boldsymbol{\Phi}$.

Using the rules governing transformation of variables, we get the following from (3) and (4):
\[
F_{\mathbf{E}}(\mathbf{a}) = \int_{l_1}^{u_1}\cdots\int_{l_{2N}}^{u_{2N}} \frac{f_{\mathbf{X},\mathbf{Y}}(\mathbf{x},\mathbf{y})}{\sqrt{x_1^2 + y_1^2}\cdots\sqrt{x_N^2 + y_N^2}}\, d\mathbf{x}\, d\mathbf{y}, \tag{6}
\]
where $l_1 = -\sqrt{a_1}, \ldots, l_N = -\sqrt{a_N}$, $l_{N+1} = -\sqrt{a_1 - x_1^2}, \ldots, l_{2N} = -\sqrt{a_N - x_N^2}$, $u_1 = \sqrt{a_1}, \ldots, u_N = \sqrt{a_N}$, $u_{N+1} = \sqrt{a_1 - x_1^2}, \ldots, u_{2N} = \sqrt{a_N - x_N^2}$. Next, we show how to efficiently compute the integral in (6) using the RQMC method of [16, Ch. 4].

B. RQMC Method

It transforms the integral in (6) into an integral over a unit hyper-cube. This is then numerically computed using quasi-Monte Carlo techniques. It is as follows.

1) Take the Cholesky decomposition of $\boldsymbol{\Sigma} = \mathbf{C}\mathbf{C}^T$, where $\mathbf{C} = [c_{ij}]$ is a lower triangular matrix with $c_{ii} > 0$. Using the variable transformation $\mathbf{z} = \mathbf{C}^{-1}\begin{bmatrix}\mathbf{x}^T & \mathbf{y}^T\end{bmatrix}^T$ changes (6) to
\[
F_{\mathbf{E}}(\mathbf{a}) = \frac{1}{\sqrt{(2\pi)^{2N}}}\int_{l'_1}^{u'_1}\cdots\int_{l'_{2N}}^{u'_{2N}} e^{-z_1^2/2}\cdots e^{-z_{2N}^2/2}\, g(\mathbf{z})\, d\mathbf{z}, \tag{7}
\]
where $l'_i = (l_i - \sum_{j=1}^{i-1} c_{ij} z_j)/c_{ii}$ and $u'_i = (u_i - \sum_{j=1}^{i-1} c_{ij} z_j)/c_{ii}$, for $1 \le i \le 2N$, and $g(\mathbf{z}) = \prod_{i=1}^{N}\left[\left(\sum_{j=1}^{i} c_{ij} z_j\right)^2 + \left(\sum_{j=1}^{N+i} c_{(N+i)j} z_j\right)^2\right]^{-1/2}$.

2) Using $v_i = \Psi(z_i)$, where $\Psi(z) = \int_{-\infty}^{z} e^{-\theta^2/2}\, d\theta/\sqrt{2\pi}$, yields
\[
F_{\mathbf{E}}(\mathbf{a}) = \int_{b_1}^{e_1}\cdots\int_{b_{2N}}^{e_{2N}} \prod_{i=1}^{N}\left[\left(\sum_{j=1}^{i} c_{ij}\Psi^{-1}(v_j)\right)^2 + \left(\sum_{j=1}^{N+i} c_{(N+i)j}\Psi^{-1}(v_j)\right)^2\right]^{-1/2} d\mathbf{v}, \tag{8}
\]
where $b_i = \Psi\left([l_i - \sum_{j=1}^{i-1} c_{ij}\Psi^{-1}(v_j)]/c_{ii}\right)$ and $e_i = \Psi\left([u_i - \sum_{j=1}^{i-1} c_{ij}\Psi^{-1}(v_j)]/c_{ii}\right)$, for $1 \le i \le 2N$.
3) Applying the transformation $w_i = (v_i - b_i)/(e_i - b_i)$ changes (8) to the following integral over a unit hyper-cube:
\[
F_{\mathbf{E}}(\mathbf{a}) = \int_0^1\cdots\int_0^1 f(\mathbf{w})\, d\mathbf{w}, \tag{9}
\]
where
\[
f(\mathbf{w}) = \prod_{k=1}^{2N}(e_k - b_k) \times \prod_{i=1}^{N}\left[\left(\sum_{j=1}^{i} c_{ij}\Psi^{-1}(b_j + w_j(e_j - b_j))\right)^2 + \left(\sum_{j=1}^{N+i} c_{(N+i)j}\Psi^{-1}(b_j + w_j(e_j - b_j))\right)^2\right]^{-1/2}. \tag{10}
\]

This integral is computed using the RQMC method that uses a carefully selected, deterministic sequence of P samples $\mathbf{s}_1, \ldots, \mathbf{s}_P$ for the vector $\mathbf{w}$, where the qth sample $\mathbf{s}_q$ is the vector $(s_{q,1}, \ldots, s_{q,2N})$. An example of this is the Kronecker sequence, which is given by $s_{q,i} = \{q\sqrt{p_i}\}$, where $p_i$ is the ith prime number and $\{\cdot\}$ denotes the remainder modulo 1. To each sample $\mathbf{s}_q$, a random shift $\mathbf{u} = (u_1, \ldots, u_{2N})$ is added, where $u_1, \ldots, u_{2N}$ are independent and identically distributed (i.i.d.) RVs that are uniformly distributed between 0 and 1, to get the shifted sample $\mathbf{s}_q^{\text{shift}} = \{\mathbf{s}_q + \mathbf{u}\}$. The integrand is averaged over M such random shifts, where M is typically between 8 and 12. To speed up convergence, $f(\mathbf{w})$ is replaced with $(f(\mathbf{w}) + f(\mathbf{1} - \mathbf{w}))/2$ and $\mathbf{w}$ with $|2\mathbf{w} - \mathbf{1}|$ in the integrand in (9). Thus, we get
\[
F_{\mathbf{E}}(\mathbf{a}) \approx \frac{1}{2MP}\sum_{i=1}^{M}\sum_{q=1}^{P}\Big[f\big(|2\{\mathbf{s}_q + \mathbf{u}_i\} - \mathbf{1}|\big) + f\big(\mathbf{1} - |2\{\mathbf{s}_q + \mathbf{u}_i\} - \mathbf{1}|\big)\Big]. \tag{11}
\]
The RQMC method has a convergence rate of $O(1/P)$ [16, Ch. 4], which is better than the convergence rate of $O(1/\sqrt{P})$ of the conventional Monte Carlo method that takes an average over any P random samples of $\mathbf{w}$. The computational complexity of the proposed approach is $O(MP)$.

C. Multivariate Rayleigh Distribution

Let $\mathbf{R} = (R_1, \ldots, R_N)^T$ be the vector of correlated Rayleigh RVs generated from $\mathbf{X}$ and $\mathbf{Y}$, such that $R_i = (X_i^2 + Y_i^2)^{1/2}$, for $1 \le i \le N$. Then, the multivariate CDF $F_{\mathbf{R}}(\mathbf{r})$ of $\mathbf{R}$ evaluated at $\mathbf{r} = (r_1, \ldots, r_N)^T$ can be expressed in terms of the exponential CDF $F_{\mathbf{E}}(\cdot)$ as follows:
\[
F_{\mathbf{R}}(\mathbf{r}) = \Pr(R_1 \le r_1, \ldots, R_N \le r_N) = \Pr(R_1^2 \le r_1^2, \ldots, R_N^2 \le r_N^2) = F_{\mathbf{E}}(r_1^2, \ldots, r_N^2). \tag{12}
\]
Therefore, the multivariate Rayleigh CDF can also be easily computed using the above approach.

III. COMPUTATIONAL COMPLEXITY AND COMPARISON

We now compare our approach with the infinite series approach (ISA) of [11] and [15] given that it applies to arbitrary correlation models. Consider the following correlation model for Gaussian random vectors $\mathbf{X}$ and $\mathbf{Y}$, in which $\rho_{kj} = \rho^{|k-j|}$ and $\sigma_k = 1$, for $1 \le k, j \le N$. For it, the multivariate Rayleigh CDF of $\mathbf{R}$ is given in the form of the following infinite series [11], [15]:
\[
\begin{aligned}
F_{\mathbf{R}}(r_1, \ldots, r_N) = (1 - \rho^2)\sum_{i_1, \ldots, i_{N-1} = 0}^{\infty} &\frac{G_N\, \rho^{2(i_1 + \cdots + i_{N-1})}}{\prod_{j=1}^{N-1}(i_j!)^2}\;
\gamma\!\left(i_1 + 1, \frac{r_1^2}{2(1 - \rho^2)}\right)\gamma\!\left(i_1 + i_2 + 1, \frac{r_2^2(1 + \rho^2)}{2(1 - \rho^2)}\right)\\
&\times\cdots\times\gamma\!\left(i_{N-2} + i_{N-1} + 1, \frac{r_{N-1}^2(1 + \rho^2)}{2(1 - \rho^2)}\right)\gamma\!\left(i_{N-1} + 1, \frac{r_N^2}{2(1 - \rho^2)}\right),
\end{aligned} \tag{13}
\]
where $G_N = (1 + \rho^2)^{-[i_1 + 2i_2 + \cdots + 2i_{N-2} + i_{N-1} + N - 2]}$, for $N \ge 3$, and $G_N = 1$, for N = 2.

TABLE I
COMPARISON OF CDFS COMPUTED AT PRE-SPECIFIED POINTS (GIVEN IN [11, TBL. 1]) USING PROPOSED APPROACH AND ISA, AND COMPUTATIONAL COMPLEXITY OF ISA

Table I tabulates the number of terms required by ISA and the CDFs of ISA and the proposed approach to an accuracy of 4 decimal places at pre-specified points. The pre-specified points and correlation coefficients are chosen because they were used in [11, Table 1] for N = 3 and 4. For N = 5, $r_1$ is set as 1 and $r_2, \ldots, r_5$ are set to be the same as $r_1, \ldots, r_4$ for N = 4. For ISA, the number of terms that need to be summed over increases exponentially as N or ρ increases. Even for N as small as 5, it exceeds $10^5$ for ρ = 0.9. The number of terms required by the proposed approach for M = 8 and P = 1000, which are sufficient for all the considered values of N and ρ, is only MP = 8000. A comparison for larger N is not shown since ISA becomes computationally infeasible.

IV. ILLUSTRATIVE APPLICATION TO N-BRANCH SC

Consider a wireless communication system that consists of a transmitter with one antenna and a receiver with N antennas. Let $\gamma_k = R_k^2\,\omega_s/N_0$ be the instantaneous signal-to-noise ratio (SNR) of the signal received at the kth antenna, where $R_k$ is its channel amplitude, which is a Rayleigh RV, $\omega_s$ is the transmitted symbol energy, and $N_0$ is the additive white Gaussian noise power spectral density. After selection combining, the
output SNR $\gamma_{SC}$ is given by $\gamma_{SC} = \max\{\gamma_1, \ldots, \gamma_N\}$. The outage probability of SC for an SNR threshold $\gamma_{th}$ is given by
\[
P_{out} = \Pr(\gamma_{SC} \le \gamma_{th}) = F_{\mathbf{R}}\left(\sqrt{\frac{\gamma_{th} N_0}{\omega_s}}, \ldots, \sqrt{\frac{\gamma_{th} N_0}{\omega_s}}\right). \tag{14}
\]

Exponential Correlation Model: In this model, $\rho_{kj} = \rho^{|k-j|}$ and $\sigma_k = 1$, for $1 \le k, j \le N$. It arises in equi-spaced diversity antennas [1]. Fig. 1 plots the outage probability of SC as a function of N for different ρ. To compute the multivariate CDF in (11), we use M = 8 and P = 1000. Also shown are results from simulations, in which $10^6$ realizations of the vector of N correlated Rayleigh RVs $\mathbf{R} = (R_1, \ldots, R_N)$ are generated and the average number of outage occurrences is measured. The analysis and simulation results match well even for N as large as 50. For a given ρ, $P_{out}$ decreases as N increases. However, as ρ increases, $P_{out}$ increases.

Fig. 1. Exponential correlation model: Outage probability of SC as a function of N for different ρ ($\gamma_{th}$ = 3 dB and $\omega_s/N_0$ = 1).

Arbitrary Correlation Model: To evaluate the utility for an arbitrary correlation model, the correlation coefficients are taken to be linearly spaced between $\rho_{min}$ and $\rho_{max}$ with $\sigma_k = 1$, for $1 \le k \le N$. Specifically, $\rho_{kj} = \rho_{max} - (|k - j| - 1)(\rho_{max} - \rho_{min})/(N - 2)$. It applies to linear arrays with antennas spaced unevenly [11], [17]. Fig. 2 plots $P_{out}$ as a function of $\gamma_{th}$ for this model when $\rho_{min} = 0.5$ and $\rho_{max} = 0.9$. The analysis and simulation results are in good agreement even for N as large as 30. Similar to Fig. 1, $P_{out}$ decreases as N increases. Compared to the exponential correlation model, $P_{out}$ of this model is higher for a given $\gamma_{th}$ and N. Intuitively, this is because the RVs are more correlated in the arbitrary correlation model than in the exponential correlation model for the parameters used.

Fig. 2. Arbitrary correlation model: Outage probability of SC as a function of $\gamma_{th}$ for different N.

V. CONCLUSION

The complexity of computing the multivariate CDF with an arbitrary covariance matrix using conventional approaches increased exponentially as N increased. We presented a novel approach for computing the multivariate Rayleigh and exponential CDFs for any N and for an arbitrary correlation structure. It had a much lower computational complexity that no longer increased as N or ρ increased. We then demonstrated the utility of our approach by analyzing the outage probability of an N-branch SC receiver.

REFERENCES

[1] M. Simon and M.-S. Alouini, Digital Communication Over Fading Channels, 2nd ed. Hoboken, NJ, USA: Wiley-Intersci., 2005.
[2] A. F. Molisch, Wireless Communications. Chichester, U.K.: Wiley, 2005.
[3] A. Attarkashani and W. Hamouda, “Throughput maximization using cross-layer design in wireless sensor networks,” in Proc. ICC, May 2017, pp. 1–6.
[4] B. V. Nguyen, R. O. Afolabi, and K. Kim, “Dependence of outage probability of cooperative systems with single relay selection on channel correlation,” IEEE Commun. Lett., vol. 17, no. 11, pp. 2060–2063, Nov. 2013.
[5] C. C. Tan and N. C. Beaulieu, “Infinite series representations of the bivariate Rayleigh and Nakagami-m distributions,” IEEE Trans.
[6] K. D. P. Dharmawansa, R. M. A. P. Rajatheva, and C. Tellambura, “Infinite series representations of the trivariate and quadrivariate Nakagami-m distributions,” in Proc. ICC, Jun. 2007, pp. 1114–1118.
[7] K. Peppas and N. C. Sagias, “A trivariate Nakagami-m distribution with arbitrary covariance matrix and applications to generalized-selection diversity receivers,” IEEE Trans. Commun., vol. 57, no. 7, pp. 1896–1902, Jul. 2009.
[8] Y. Chen and C. Tellambura, “Infinite series representations of the trivariate and quadrivariate Rayleigh distribution and their applications,” IEEE Trans. Commun., vol. 53, no. 12, pp. 2092–2101, Dec. 2005.
[9] R. K. Mallik, “On multivariate Rayleigh and exponential distributions,” IEEE Trans. Inf. Theory, vol. 49, no. 6, pp. 1499–1515, Jun. 2003.
[10] M. Abramowitz and I. A. Stegun, Handbook of Mathematical Functions: With Formulas, Graphs, and Mathematical Tables, 9th ed. New York, NY, USA: Dover, 1964.
[11] G. K. Karagiannidis, D. A. Zogas, and S. A. Kotsopoulos, “On the multivariate Nakagami-m distribution with exponential correlation,” IEEE Trans. Commun., vol. 51, no. 8, pp. 1240–1244, Aug. 2003.
[12] L. E. Blumenson and K. S. Miller, “Properties of generalized Rayleigh distributions,” Ann. Math. Stat., vol. 34, no. 3, pp. 903–910, Sep. 1963.
[13] N. C. Beaulieu and K. T. Hemachandra, “Novel simple representations for Gaussian class multivariate distributions with generalized correlation,” IEEE Trans. Inf. Theory, vol. 57, no. 12, pp. 8072–8083, Dec. 2011.
[14] R. A. A. de Souza and M. D. Yacoub, “On the multivariate Nakagami-m distribution with arbitrary correlation and fading parameters,” in Proc. IMOC, Oct. 2007, pp. 812–816.
[15] G. K. Karagiannidis, D. A. Zogas, and S. A. Kotsopoulos, “An efficient approach to multivariate Nakagami-m distribution using Green’s matrix approximation,” IEEE Trans. Wireless Commun., vol. 2, no. 5, pp. 883–889, Sep. 2003.
[16] A. Genz and F. Bretz, Computation of Multivariate Normal and t Probabilities, 1st ed. Heidelberg, Germany: Springer, 2009.
[17] Q. T. Zhang, “Maximal-ratio combining over Nakagami fading channels with an arbitrary branch covariance matrix,” IEEE Trans. Veh. Technol., vol. 48, no. 4, pp. 1141–1150, Jul. 1999.
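As a companion to Section II-B, the estimator in (11) can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the function names `first_primes`, `rqmc_integrate`, and `product_integrand` are hypothetical, NumPy is assumed to be available, and the toy integrand stands in for the integrand $f(\mathbf{w})$ of (10), which would also need the Cholesky factor and the sequentially computed limits $b_i$ and $e_i$.

```python
import numpy as np


def first_primes(count):
    """Return the first `count` primes using simple trial division."""
    primes = []
    candidate = 2
    while len(primes) < count:
        if all(candidate % p for p in primes if p * p <= candidate):
            primes.append(candidate)
        candidate += 1
    return np.array(primes, dtype=float)


def rqmc_integrate(f, dim, P=1000, M=8, seed=0):
    """Estimate the integral of f over [0, 1]^dim following the form of (11).

    f must accept an array of shape (P, dim) and return P integrand values.
    """
    rng = np.random.default_rng(seed)
    q = np.arange(1, P + 1)[:, None]              # sample index q = 1, ..., P
    s = (q * np.sqrt(first_primes(dim))) % 1.0    # Kronecker points {q * sqrt(p_i)}
    total = 0.0
    for _ in range(M):
        u = rng.random(dim)                       # one uniform random shift per repetition
        w = np.abs(2.0 * ((s + u) % 1.0) - 1.0)   # |2{s_q + u} - 1|
        total += np.sum(f(w) + f(1.0 - w))        # antithetic pair f(w) + f(1 - w)
    return total / (2.0 * M * P)


def product_integrand(w):
    """Toy integrand prod_i 2*w_i (an assumption for illustration); its exact integral is 1."""
    return np.prod(2.0 * w, axis=1)


estimate = rqmc_integrate(product_integrand, dim=4)
print(f"RQMC estimate: {estimate:.4f} (exact value is 1)")
```

The defaults P = 1000 and M = 8 mirror the values used in Sections III and IV, so each call evaluates the integrand at MP = 8000 shifted points and their antithetic counterparts.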
Adaptive Frequency Band and Channel Selection for Simultaneous Receiving and Sending in Multiband Communication
Ayaka Hanyu, Student Member, IEEE, Yuichi Kawamoto, Member, IEEE, Hiroki Nishiyama, Senior Member, IEEE, Nei Kato, Fellow, IEEE, Naoto Egashira, Member, IEEE, Kazuto Yano, Member, IEEE, and Tomoaki Kumagai, Member, IEEE
Abstract—The demand for relay-transmission technologies has recently been increasing. We study relay transmission using
multiband to reduce transmission delay and meet the increas-
ing demand. In the relay transmission, the relay node initiates
relay transmission while receiving data on another frequency
band. However, despite its potential, the difference of data rates
between reception and forwarding causes data loss from buffer
overflow. To solve this problem, we propose an algorithm that
selects appropriate channels, such that the transmission rates
become as high as possible without buffer overflow, and evaluate
the effectiveness of the proposal.
Index Terms—Device-to-device (D2D), relay transmission,
multiband, simultaneous data receiving and sending, channel
selection.
I. INTRODUCTION

IN RECENT years, the demand for device-to-device (D2D) communication, which enables communication between mobile devices, has been increasing. To realize D2D communication, several methods have been studied, such as the maximization of the time-average total throughput of the network [1], efficient performance of the D2D network [2], maximization of the bounds on the capacity [3], and significant performance gains in average energy efficiency [4]. In particular, relay-transmission technology, which conveys information to a destination node through multiple devices, has attracted much attention for D2D communication [5].

In one existing relay-transmission technology, a relay node that receives a frame from a source node performs demodulation and decoding, then forwards the decoded data to its destination node after re-encoding and modulation, as shown in Fig. 1 [6]. This relay transmission method is called “Decode-and-Forward (DF)” and is adopted by wireless local area network (WLAN) standards, such as IEEE 802.11s [7].

Fig. 1. Overview of the existing and proposed network.

The DF method causes a large delay because of the reception and forwarding process in the relay node (Fig. 1). The delay can be reduced by performing data reception and forwarding at the same time. We propose a relay transmission method using multiple bands (Fig. 1) to realize simultaneous data reception and forwarding at the relay node [8], [9]. Using multiple bands, the relay method performs the relay transmission in another frequency band while the relay node receives a data frame. Moreover, channels used for reception and forwarding can be chosen among multiple bands, and spectral resources can be efficiently utilized.

However, for the relay using multiple bands, channel conditions, such as propagation loss and fading, differ in each band. This causes a mismatch of the data rates between reception and forwarding at the relay node. Buffer overflow and packet loss will occur if the data rate between the source and relay nodes is greater than that between the relay and destination nodes. Therefore, this letter proposes an adaptive frequency band and channel selection method that minimizes the transmission delay while avoiding buffer overflow. In the proposal, the relay node calculates the sending and receiving rates based on the signal-to-interference plus noise power ratio (SINR) of each channel and selects an appropriate channel combination at which the sending and receiving rates become as high as possible.

The remainder of this letter is organized as follows: Section II explains the relay transmission using the multiple

Manuscript received June 22, 2018; revised August 29, 2018; accepted October 8, 2018. Date of publication October 16, 2018; date of current version April 9, 2019. A part of this letter was performed by the research contract entrusted by the Ministry of Internal Affairs and Communications “Research and Development of Spectral-Efficiency Improvement Technology Employing Simultaneous Transmission over Multiple License-Exempt Bands” and the national program, Cross-ministerial Strategic Innovation Promotion Program (SIP), supported by the Cabinet Office, Japan. The associate editor coordinating the review of this paper and approving it for publication was Y. Gao. (Corresponding author: Ayaka Hanyu.)
A. Hanyu, Y. Kawamoto, H. Nishiyama, and N. Kato are with the Graduate School of Information Sciences, Tohoku University, Sendai 980-8579, Japan (e-mail: ayaka.hanyu@it.is.tohoku.ac.jp; youpsan@it.is.tohoku.ac.jp; hiroki.nishiyama.1983@ieee.org; kato@it.is.tohoku.ac.jp).
N. Egashira and K. Yano are with the Wave Engineering Laboratories, Advanced Telecommunications Research Institute International, Kyoto 619-0288, Japan (e-mail: egashira.naoto@atr.jp; kzyano@atr.jp).
T. Kumagai was with the Wave Engineering Laboratories, Advanced Telecommunications Research Institute International, Kyoto 619-0288, Japan. He is now with the Enterprise Solutions Business Headquarters, NTT Advanced Technology Corporation, Kawasaki 212-0014, Japan.
Digital Object Identifier 10.1109/LWC.2018.2876258
HANYU et al.: ADAPTIVE FREQUENCY BAND AND CHANNEL SELECTION FOR SIMULTANEOUS RECEIVING AND SENDING 461
bands and its issue; Section III proposes an algorithm that selects the channels used for the relay transmission to avoid buffer overflow; Section IV evaluates the performance of the proposed method; and finally, Section V presents our conclusions.

II. CONSIDERED ARCHITECTURE AND PROBLEM FORMULATION
A. Assumed Network Environment
In this letter, we assume the IEEE 802.11-based WLAN.
Fig. 1 depicts our assumed system model. Our considered relay
transmission technology is operated in 920 MHz, 2.4 GHz, and
5 GHz bands [8]. The relay transmission can use the idle chan-
nels in these bands, and more radio resources are available
compared with the single-band case. Moreover, the relay node
starts sending data on another channel while receiving data. In
other words, the relay node simultaneously receives and sends
data. Therefore, compared to the case where transmission is
initiated after completely receiving data, the delay at the relay
node can be reduced [8].
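The delay benefit described above can be made concrete with a small numerical sketch. It rests on an idealized fluid model that is an assumption of this example rather than the letter's model: decoding and re-encoding delays, headers, and retransmissions are ignored, and forwarding is assumed to start as soon as data arrives. The function name `relay_timing` and the numbers in the usage lines are hypothetical. The backlog term uses the expression (F/r_SR)·(r_SR − r_RD) that the letter gives in Section III-A.

```python
def relay_timing(data_bits, r_sr, r_rd):
    """Compare store-and-forward and simultaneous relaying for one data block.

    data_bits: size F of the transmitted data in bits
    r_sr:      receiving (S-R) rate in bit/s
    r_rd:      sending (R-D) rate in bit/s
    Returns (store_and_forward_time, simultaneous_time, backlog_bits).
    """
    t_sr = data_bits / r_sr                    # time to receive the block on S-R
    t_rd = data_bits / r_rd                    # time to send the block on R-D
    store_and_forward = t_sr + t_rd            # forwarding starts only after full reception
    simultaneous = max(t_sr, t_rd)             # idealized: the slower link sets the pace
    backlog = max(0.0, t_sr * (r_sr - r_rd))   # data left in the relay when S-R finishes
    return store_and_forward, simultaneous, backlog


# Hypothetical example: an 8 Mbit block, 20 Mbit/s on S-R and 10 Mbit/s on R-D.
saf, sim, backlog = relay_timing(8e6, 20e6, 10e6)
print(f"store-and-forward: {saf:.2f} s, simultaneous: {sim:.2f} s, backlog: {backlog:.0f} bits")
```

When r_SR exceeds r_RD, the backlog is positive, and it is this quantity that must stay within the relay buffer, which motivates the problem formulation that follows.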
B. Problem Formulation
Fig. 1 demonstrates a simplified relay transmission system in which the source, relay, and destination nodes are indicated by S, R, and D, respectively. The communication link between the source and relay nodes is represented by S-R, while that between the relay and destination nodes is denoted by R-D. As mentioned earlier, we consider simultaneous data receiving and sending in different bands. The receiving (S-R) rate $r_{SR}$ and the sending (R-D) rate $r_{RD}$ depend on several factors, such as propagation loss. The data accumulated in the relay node may exceed its buffer size when $r_{SR}$ is larger than $r_{RD}$. Data that overflows the buffer is lost, which increases the transmission delay because the lost data must be retransmitted.

III. PROPOSED ADAPTIVE FREQUENCY BAND AND CHANNEL SELECTION METHOD

A. Algorithm

We propose herein an algorithm that selects the appropriate channels for receiving and sending based on their SINR to avoid data loss caused by the difference between $r_{SR}$ and $r_{RD}$. This letter considers the employment of spatial reuse by adjusting a threshold of acceptable interference as discussed for IEEE 802.11ax wireless LAN [10]. In the proposed algorithm, the channels are selected such that the receiving and sending rates are as high as possible, and the transmission delay is reduced. Fig. 2 shows the flowchart of our proposal. In the algorithm, the SINR of each channel is first calculated for S-R and R-D (how to calculate the SINR is described in Section III-B), and the modulation and coding scheme (MCS) is chosen by looking up the MCS table that contains an appropriate MCS corresponding to the SINR. The receiving and sending rates are calculated as follows based on the MCS:
\[
r = \frac{m \cdot N_{SD} \cdot r_c}{T}, \tag{1}
\]
where T is the symbol length of an orthogonal frequency-division multiplexing (OFDM) signal [11]; m is the number of bits included in a complex symbol; $N_{SD}$ is the number of data subcarriers; and $r_c$ is the coding rate. Next, the R-D channel with the largest transmission rate is selected, and its sending rate is $r_{RD}$. To select the S-R channel, we make a list of channels arranged in descending order of SINR, and the channel selected for R-D is removed from the list to avoid loop interference. Then, the S-R channel with the largest transmission rate is selected, and the receiving rate on the selected channel is $r_{SR}$.

Fig. 2. Proposed algorithm for selecting the appropriate channel combination.

We assume that the size of the transmitted data is F. The amount of data that accumulates in the relay node when the S-R transmission is finished is represented as $(F/r_{SR}) \cdot (r_{SR} - r_{RD})$. The data that overflows from the buffer will be lost when it exceeds the buffer size b of the relay node. To avoid this situation, the channel used for S-R is re-selected until the condition $(F/r_{SR}) \cdot (r_{SR} - r_{RD}) \le b$ is satisfied. If no channel matches the condition, the rate is adjusted according to the MCS. Furthermore, if no MCS matches the condition, we use the DF method with the channels that have the largest transmission rates for S-R and R-D. After the channels for S-R and R-D are determined, the relay node informs the source and destination nodes which channels are to be used and starts the data transmission. In addition, when no channel is available or no channel has a communicable SINR on either S-R or R-D, the transmission cannot be executed, so the algorithm is executed again after a lapse of time.

B. SINR Calculation

The SINR in dB of each channel is calculated as follows:
\[
\mathrm{SINR} = 10\log_{10}\frac{P_S}{P_N + P_I}, \tag{2}
\]
where $P_S$, $P_I$, and $P_N$ represent the desired signal power, interference signal power, and noise power, respectively, in each channel. We assume the free-space propagation model; hence, both $P_S$ and $P_I$ can be calculated as follows, and $P_S$
is given by:
\[
P_S = P_t\left(\frac{c}{4\pi d f}\right)^2, \tag{3}
\]
where $P_t$ is the transmission power; c is the speed of light; d is the distance between the nodes; and f is the frequency of the channel in use.

We now explain how to obtain $P_I$. $P_I$ is the total interference signal power from the nodes that use the focused channel, except for S, R, and D. We consider an infinitesimal area dS in the circles centered at both R and D (Fig. 3). The former and the latter correspond to the areas where some devices interfere with the S-R and R-D links, respectively. We set $n_{all}$ as the total number of nodes distributed around the nodes of interest. Moreover, $n_{all}$ is set separately for each channel. Among them, the nodes within the communicable distance $d_{MAX}$ from the focused nodes interfere when the focused nodes use the channel. We define $\rho(x, y)$ as the probability density function of the interfering nodes in the area. The total amount of interference is given as the expectation of the sum of the interference arriving from within $d_{MAX}$:
\[
P_I = \iint_{\Omega} P_t\left(\frac{c}{4\pi d f}\right)^2 n_{all}\cdot\rho(x, y)\, dS, \qquad \Omega: (x - x')^2 + (y - y')^2 = d^2 \le d_{MAX}^2, \tag{4}
\]
where $(x', y')$ is the center of the assumed circle.

Fig. 3. Assumption of interference arrival.

The channels in the 920 MHz, 2.4 GHz, and 5 GHz bands are denoted as $\alpha_i$, $\beta_j$, and $\gamma_k$, respectively, where i, j, and k are the channel numbers in each frequency band. The channels are arranged without overlapping in the 920 MHz and 5 GHz bands; thus, we neglect the impact of the adjacent-channel interference in these bands. Therefore, the interference signal power is expressed as follows when the transmission power is $P_t$ in all frequency bands:
\[
P_{I\alpha_i} = \iint_{\Omega} P_t\left(\frac{c}{4\pi d f_{\alpha_i}}\right)^2 n_{all\,\alpha_i}\cdot\rho_{\alpha_i}(x, y)\, dS, \tag{5}
\]
\[
P_{I\gamma_k} = \iint_{\Omega} P_t\left(\frac{c}{4\pi d f_{\gamma_k}}\right)^2 n_{all\,\gamma_k}\cdot\rho_{\gamma_k}(x, y)\, dS. \tag{6}
\]

Meanwhile, in the 2.4 GHz band, the interference from the nodes using overlapping adjacent channels is considered because we assume the IEEE 802.11-based WLAN. To calculate $P_{I\beta_j}$, which includes the adjacent channel interference, we first calculate the interference $P_{I\beta_m}$ from the nodes on channel $\beta_m$. $P_{I\beta_m}$ is given by:
\[
P_{I\beta_m} = \iint_{\Omega} P_t\left(\frac{c}{4\pi d f_{\beta_m}}\right)^2 n_{all\,\beta_m}\cdot\rho_{\beta_m}(x, y)\, dS. \tag{7}
\]
The total interference power $P_{I\beta_j}$ can be expressed as follows using $P_{I\beta_m}$:
\[
P_{I\beta_j} = \sum_{m=1}^{j_{MAX}} olf_{|j - m|}\cdot P_{I\beta_m}, \tag{8}
\]
where $j_{MAX}$ is the maximum channel index in the 2.4 GHz band; $olf_h$ is a coefficient that defines how much the adjacent channel affects the channel of interest; and h represents the channel difference between the channel of interest and the adjacent channel [12].

IV. PERFORMANCE EVALUATION

A. Assumptions

The assumed environment in this performance evaluation is first described. While many nodes perform D2D communication, we pay attention to three nodes, namely a single source, a relay, and a destination node. Table I shows the simulation parameters. The two-dimensional (2D) normal distribution is used herein to represent the distribution of interfering nodes; hence, $n_\mu$, which is the number of nodes at $(\mu_x, \mu_y)$, is randomly set for each channel between 2 and 3. Here, $(\mu_x, \mu_y)$ is the mean coordinate. $n_\mu$ is used to calculate the total number of interfering nodes as $n_{all} = \lfloor 2\pi\sigma^2 n_\mu \rfloor$ ($\lfloor a \rfloor$ maps a to the greatest integer less than or equal to a), where $\sigma^2$ is the variance. In this evaluation, S is located at the origin. R and D are randomly located within a circle with a radius of 100 m from S and R, respectively. Table II shows the MCS table used in this evaluation.

TABLE I

We assume that in the environment, the node distribution between S-R and R-D is different. Therefore, in this evaluation, the probability density of the nodes is represented by the 2D normal distribution given as follows:
\[
\rho(x, y) = \frac{1}{2\pi\sigma^2}\exp\left(-\frac{(x - \mu_x)^2 + (y - \mu_y)^2}{2\sigma^2}\right). \tag{9}
\]
In this evaluation, $(\mu_x, \mu_y)$ is the coordinate within a circle with a radius of 300 m from the origin. $(\mu_x, \mu_y)$, σ, and $n_\mu$ are uniformly selected at random within the range of the specified value, and each parameter is separately determined for each channel. The number of channels in 920 MHz, 2.4 GHz, and 5 GHz bands is 11, 13, and 19, respectively. The bandwidth
HANYU et al.: ADAPTIVE FREQUENCY BAND AND CHANNEL SELECTION FOR SIMULTANEOUS RECEIVING AND SENDING 463
TABLE II
MCS TABLE Fig. 4(b) shows the performance of the data transmis-
sion success rate from S to D. In the proposed and DF
methods, the success rate was 1.0 because the optimal
channel was always selected. In contrast, the random-
channel selection method significantly degraded the success
rate because it selected the channels, regardless of the
channel conditions, and, therefore, cannot avoid the buffer
overflow.
V. C ONCLUSION
TABLE III This letter introduced the relay transmission that performed
C HANNEL - TO -C HANNEL OVERLAP FACTORS
simultaneous data receiving and sending to reduce the trans-
mission delay at the relay node using multiple frequency
bands. An issue remains to be solved in the relay transmis-
sion, that is, data loss from buffer overflow is caused by the
difference between the receiving and sending rates at the relay
node. Therefore, we proposed herein an algorithm for balanc-
ing the sending and receiving rates by selecting an appropriate
channel according to its SINR to avoid buffer overflow. The
evaluation results showed that the proposed method reduces
the transmission delay compared with two conventional meth-
ods and attains a transmission success rate of 1.0 by avoiding
buffer overflow.
Fig. 4. Evaluation results.

is 1 MHz in the 920 MHz band and 20 MHz in the 2.4 GHz and 5 GHz bands. Table III shows the coefficient olf h for the 2.4 GHz band [12].

B. Evaluation Results

We evaluated the proposed method in terms of the data transmission time and the success rate from S to D. In this evaluation, only the transmission loss caused by the buffer overflow was considered. The average, maximum, and minimum values of the simulation were plotted in each graph. The performances of two other relay-transmission methods were also evaluated for comparison. One is a random-channel selection method that uses randomly selected channels, and we assumed that it simultaneously receives and sends data. The other is the DF method [13]. In the DF method, the channels with the least interference for both S-R and R-D are used.

First, the data transmission time from S to D was evaluated. Fig. 4(a) shows the result. Compared to the conventional methods, the proposed method reduced the average transmission time. Moreover, the proposed method reduced the fluctuation of the data transmission time. In other words, it was more stable than the conventional methods: with random selection, the SINR of the selected channel varies significantly, which causes a large fluctuation in the transmission time. Moreover, in the DF method, R starts transmitting data to D only after completing the data reception; hence, additional waiting time is required before sending to D starts. Therefore, the transmission time of the DF method is nearly double that of the proposed method, which effectively utilizes the idle frequency band that the DF method cannot use.

REFERENCES

[1] J. Li and S. Huang, “Delay-aware power control for D2D communication with successive interference cancellation and hybrid energy source,” IEEE Wireless Commun. Lett., vol. 6, no. 6, pp. 806–809, Dec. 2017.
[2] A. J. Roumeliotis, S. E. Sagkriotis, A. Z. Papafragkakis, and A. D. Panagopoulos, “D2D communication for adaptive streaming exploiting white spaces in transmissions of the cellular network,” IEEE Wireless Commun. Lett., vol. 7, no. 1, pp. 58–61, Feb. 2018.
[3] K. Lee, A. Yener, and X. He, “Resource allocation for the multiband relay channel: A building block for hybrid wireless networks,” EURASIP J. Wireless Commun. Netw., vol. 2010, Mar. 2010, Art. no. 792410.
[4] Z. Zhou, K. Ota, M. Dong, and C. Xu, “Energy-efficient matching for resource allocation in D2D enabled cellular networks,” IEEE Trans. Veh. Technol., vol. 66, no. 6, pp. 5256–5268, Jun. 2017.
[5] L. Yang, W. Zhang, and S. Jin, “Interference alignment in device-to-device LAN underlaying cellular networks,” IEEE Trans. Wireless Commun., vol. 14, no. 7, pp. 3715–3723, Jul. 2015.
[6] A. Chaaban and A. Sezgin, “Multi-hop relaying: An end-to-end delay analysis,” IEEE Trans. Wireless Commun., vol. 15, no. 4, pp. 2552–2561, Apr. 2016.
[7] IEEE Standard for Information Technology - Telecommunications and Information Exchange Between Systems Local and Metropolitan Area Networks - Specific Requirements - Part 11: Wireless LAN Medium Access Control (MAC) and Physical Layer (PHY) Specifications, IEEE Standard 802.11-2016, pp. 183–245, Dec. 14, 2016.
[8] N. Egashira, K. Yano, S. Tsukamoto, J. Webber, and T. Kumagai, “Low latency relay processing scheme for WLAN systems employing multiband simultaneous transmission,” in Proc. IEEE Wireless Commun. Netw. Conf. (WCNC), San Francisco, CA, USA, Mar. 2017, pp. 1–6.
[9] Z. M. Fadlullah et al., “Multi-hop wireless transmission in multi-band WLAN systems: Proposal and future perspective,” IEEE Wireless Commun., to be published, doi: 10.1109/MWC.2017.1700148.
[10] R. Stacey, Specification Framework for TGax, IEEE Standard 802.11-15/0132r15, Jan. 2016.
[11] R. Prasad, OFDM for Wireless Communications Systems. Boston, MA, USA: Artech House, 2004.
[12] M. Burton, “Channel overlap calculations for 802.11b networks,” Cirond Technol., Inc., Long Beach, NY, USA, White Paper, Nov. 2002.
[13] A. Ettefagh, M. Kuhn, I. Hammerstrom, and A. Wittneben, “On the range performance of decode-and-forward relays in IEEE 802.11 WLANs,” in Proc. IEEE 17th Int. Symp. Pers. Indoor Mobile Radio Commun., Sep. 2006, pp. 1–5.
Localization Using Blind RSS Measurements

Yongchang Hu, Member, IEEE, Jiani Liu, Student Member, IEEE, and Bingbing Zhang, Student Member, IEEE

Abstract: Localization using received signal strength (RSS) measurements has become popular due to the simplicity of its practical implementation. Traditional RSS measurements are obtained after successful demodulation such that the impact of the background noise (BGN) is ignored. However, critical information for demodulation might be expensive or difficult to obtain in hostile or harsh environments. In this case, the RSS measurements need to be blindly collected without demodulation and are hence characterized by a recent model with the BGN power (already validated by real-life data). This kind of measurement is referred to as “blind RSS measurement”. In this letter, we introduce four models for localization using the blind RSS measurements, respectively considering the BGN power and the transmit power to be known or unknown. A general semi-definite programming solution that applies to all these models is proposed. The corresponding Cramér–Rao lower bounds are presented, indicating a significant impact of the BGN power on the estimation accuracy. Numerical results show that the proposed method yields a good and reliable performance with different models.

Index Terms: Localization, received signal strength, background noise, semidefinite programming, Cramér-Rao lower bound.

Manuscript received July 19, 2018; revised September 10, 2018; accepted October 12, 2018. Date of publication October 16, 2018; date of current version April 9, 2019. The associate editor coordinating the review of this paper and approving it for publication was D. So. (Corresponding author: Yongchang Hu.)
Y. Hu is with the Cloud Core Network Product Line/Research and Development Department, Huawei Technologies Company, Ltd., Xi’an 710000, China (e-mail: hycforever2000@gmail.com).
J. Liu is with the Faculty of Electrical Engineering, Mathematics, and Computer Science, Delft University of Technology, 2628 CD Delft, The Netherlands (e-mail: j.liu-1@tudelft.nl).
B. Zhang is with the School of Electronic Science, National University of Defense Technology, Changsha 410073, China (e-mail: zbbzb@nudt.edu.cn).
Digital Object Identifier 10.1109/LWC.2018.2876319

I. INTRODUCTION

LOCATION awareness is nowadays ubiquitously required in many aspects of the commercial, public service, and military sectors. Thus, source localization techniques have become prevalent in recent years. The commonly used measurements are time-of-arrival (TOA), time-difference-of-arrival (TDOA), angle-of-arrival (AOA) and received signal strength (RSS) [1]. Owing to the simplicity of practical implementation, source localization using the RSS measurement has drawn much attention, resulting in many notable contributions on this topic. Some maximum likelihood (ML) solutions were introduced in [2], but they require an appropriate initialization and incur a high computational complexity. Some least squares estimators were proposed in [3], which however are very susceptible to a large measurement noise. Furthermore, some semidefinite programming (SDP) approaches were also reported in [4] and [5]. This kind of solution has gradually become the new favourite due to its very robust localization performance [6].

Most RSS-based localization approaches are based on an underlying assumption that the signal can be successfully demodulated. Hence, the impact of the background noise (BGN), which stems from the interference, the environmental noise, etc., is significantly mitigated and reasonably ignored. However, this requires some critical configuration information such as the details of modulation, which might be very expensive or difficult to access in unattended or hostile environments. For instance, in military scenarios, it is hard to demodulate the signal from adversaries if it is used for localization. An alternative solution has been reported in [7]: the blind RSS measurement may be collected by directly integrating the observed power spectral density (PSD). Here we use the term “blind” rather than “non-cooperative” as in [7] to avoid confusion with cooperative/non-cooperative localization with multiple/single target(s). The drawback though is that both the signal and the BGN contribute to the measurement, i.e., the BGN power has to be taken into account. Thus, the traditional log-normal fading model is not applicable to this kind of RSS measurement. Fortunately, based on real-life experiment data, R. Martin et al. reported a sound and solid model for the blind RSS measurement that includes the BGN power in [7, eq. (17)]. Based on this model, an ML solution for localization was accordingly presented therein. Nevertheless, this ML solution suffers from the same disadvantages as mentioned before and, more importantly, there is still no other method that can blindly estimate the target location without demodulating the signal. Therefore, the topic of blind RSS-based localization is still in its infancy.

This letter is aimed at enriching the research on this topic. First, Section II presents the model for the blind RSS measurement that includes the BGN power. In Section III, four linear models are introduced, respectively considering the BGN power and the transmit power to be known or unknown. In this section, a general SDP approach that applies to all these models is also proposed and the corresponding Cramér-Rao lower bounds (CRLBs) are derived. Finally, to evaluate those models and the SDP solution, numerical simulations have been conducted, leading to some conclusions in Section IV.

II. BLIND RSS MODEL

This section presents the blind RSS model that includes the BGN and is used throughout this letter. Assume the received signal $y(t)$ at the time instance $t$ can be expressed as $y(t) = x(t) \ast h(t) + n(t)$, where $\ast$ denotes the convolution operator, $x(t)$ is the transmitted signal, $h(t)$ indicates the channel response and $n(t)$ is the zero-mean white Gaussian BGN. After the signal is successfully demodulated, the impact of $n(t)$ can be largely alleviated, turning into relatively trivial demodulation errors. Since the collection of the traditional
HU et al.: LOCALIZATION USING BLIND RSS MEASUREMENTS 465
RSS measurement estimates the expected value of the instantaneous signal power $r(t)^2 \triangleq |x(t) \ast h(t)|^2$ by averaging the samples (see [6, Appendix A]), the demodulation errors can further be mitigated. Therefore, it is very reasonable that the traditional RSS measurement is modelled by the famous log-normal fading that ignores the impact of $n(t)$ [8]. To be specific, for localization, suppose the target node is located at $\mathbf{x} \in \mathbb{R}^d$ and $N$ anchors are pre-deployed with known locations. The traditional RSS measurement in dB scale associated with the $i$-th anchor located at $\mathbf{s}_i \in \mathbb{R}^d$ is expressed as

$$P_i = \bar{P}_i + \chi_i, \tag{1}$$

where $\bar{P}_i \triangleq P_0 - 10\gamma\log_{10}(d_i)$ with $d_i \triangleq \|\mathbf{x} - \mathbf{s}_i\|$, $P_0$ is the transmit power, $\gamma$ is the path-loss exponent (PLE) and $\chi_i \sim \mathcal{N}(0, \sigma^2)$ is the shadowing effect.

However, when the signal is very difficult or expensive to demodulate, we can do nothing but integrate the observed PSD of $y(t)$ to obtain the blind RSS measurement (in watts) as $P_r \triangleq \frac{1}{T}\int |y(t)|^2\,dt$, where $T$ is the time window, which obviously includes the BGN power. Note that the model for this kind of measurement, which we will introduce next, has been reported before and already evaluated by real-life experiments in the literature. We will not add more discussion here; readers interested in the details are referred to [7] and the references therein.

Accordingly, the blind RSS measurement in dB scale can reasonably be modelled as

$$P_i = 10\log_{10}\!\left(10^{\bar{P}_i/10} + 10^{\tau_{bg}/10}\right) + \chi_i, \tag{2}$$

where $\tau_{bg}$ indicates the BGN power in dB scale. Here, we assume the PLE is known a priori, since the PLE awareness has become an indispensable and possible feature for efficiently designing wireless systems owing to the self-estimation techniques in [9]. The impact of the BGN power $\tau_{bg}$ can be readily seen from the model in (2): there exists a power floor for $E(P_i)$, i.e., $E(P_i) > \tau_{bg}$, where $E(\cdot)$ indicates the expectation operator. Additionally, the BGN also affects the localization performance, which will be studied later in Section IV.

III. PROPOSED LOCALIZATION APPROACHES

This section introduces four linear models for blind RSS-based localization with known/unknown $P_0$ or $\tau_{bg}$. $P_0$ may be known as a standard configuration, or unknown due to power control for energy saving or security purposes. Similarly, $\tau_{bg}$ may be constantly detected, or unknown in dynamic environments. Then, a general SDP solution is proposed, which applies to all these models. The Cramér-Rao lower bounds (CRLBs) are also studied.

A. Linear Models

1) Both $\tau_{bg}$ and $P_0$ Are Known: We first convert the blind RSS measurement $P_i$ in (2) back into watts as

$$P_i' = \chi_i'\left(10^{\bar{P}_i/10} + \tau_{bg}'\right) \;\Rightarrow\; P_i' = \chi_i'\left(P_0' d_i^{-\gamma} + \tau_{bg}'\right), \tag{3}$$

where $P_i' \triangleq 10^{P_i/10}$, $\chi_i' \triangleq 10^{\chi_i/10}$, $P_0' \triangleq 10^{P_0/10}$ and $\tau_{bg}' \triangleq 10^{\tau_{bg}/10}$. Recalling $d_i \triangleq \|\mathbf{x} - \mathbf{s}_i\|$, we need to collect the squared distance term $d_i^2$ for convenience of unfolding the Euclidean norm. Thus, (3) is further reformulated into

$${P_0'}^{-\frac{2}{\gamma}}\|\mathbf{x} - \mathbf{s}_i\|^2 = \left(\frac{P_i'}{\chi_i'} - \tau_{bg}'\right)^{-\frac{2}{\gamma}}. \tag{4}$$

For a sufficiently small $\chi_i$, the right-hand side (RHS) of (4) can be approximated by applying the first-order Taylor expansion as $\left(\frac{P_i'}{\chi_i'} - \tau_{bg}'\right)^{-\frac{2}{\gamma}} \approx (P_i' - \tau_{bg}')^{-\frac{2}{\gamma}} + \frac{\ln(10)}{5\gamma}P_i'(P_i' - \tau_{bg}')^{-\frac{2}{\gamma}-1}\chi_i$, resulting in

$${P_0'}^{-\frac{2}{\gamma}}\|\mathbf{x} - \mathbf{s}_i\|^2 \approx (P_i' - \tau_{bg}')^{-\frac{2}{\gamma}} + \frac{\ln(10)}{5\gamma}P_i'(P_i' - \tau_{bg}')^{-\frac{2}{\gamma}-1}\chi_i. \tag{5}$$

Then, rearranging (5) and stacking both sides lead to our first linear model with both known $\tau_{bg}$ and $P_0$ as

$$\mathbf{y}_1 = \mathbf{F}_1\boldsymbol{\theta}_1 + \mathbf{D}_1\boldsymbol{\chi}, \tag{6}$$

where $\boldsymbol{\theta}_1 \triangleq [\mathbf{x}^T, \|\mathbf{x}\|^2]^T$, $\boldsymbol{\chi}$ stacks all values of $\chi_i$, $\mathbf{D}_1 \triangleq \mathrm{diag}\!\left(\left[\cdots, -\frac{\ln(10)}{5\gamma}P_i'(P_i' - \tau_{bg}')^{-\frac{2}{\gamma}-1}, \cdots\right]^T\right)$ with $\mathrm{diag}(\cdot)$ the diagonal matrix,

$$\mathbf{F}_1 \triangleq {P_0'}^{-\frac{2}{\gamma}}\begin{bmatrix} \vdots & \vdots \\ -2\mathbf{s}_i^T & 1 \\ \vdots & \vdots \end{bmatrix}$$

and $\mathbf{y}_1 \triangleq \left[\cdots, (P_i' - \tau_{bg}')^{-\frac{2}{\gamma}} - {P_0'}^{-\frac{2}{\gamma}}\|\mathbf{s}_i\|^2, \cdots\right]^T$.

2) $P_0$ Is Unknown: We rewrite (5) as

$$-2\mathbf{s}_i^T\mathbf{x} + \|\mathbf{x}\|^2 - (P_i' - \tau_{bg}')^{-\frac{2}{\gamma}}{P_0'}^{\frac{2}{\gamma}} \approx -\|\mathbf{s}_i\|^2 + {P_0'}^{\frac{2}{\gamma}}\frac{\ln(10)}{5\gamma}P_i'(P_i' - \tau_{bg}')^{-\frac{2}{\gamma}-1}\chi_i \tag{7}$$

and observe that the unknown transmit power $P_0$ issue can easily be tackled by estimating an unknown ${P_0'}^{\frac{2}{\gamma}}$. Thus, stacking both sides of (7) leads to our second linear model with unknown $P_0$ as

$$\mathbf{y}_2 = \mathbf{F}_2\boldsymbol{\theta}_2 + \mathbf{D}_2\boldsymbol{\chi}, \tag{8}$$

where $\boldsymbol{\theta}_2 \triangleq [\mathbf{x}^T, \|\mathbf{x}\|^2, {P_0'}^{\frac{2}{\gamma}}]^T$, $\mathbf{D}_2 \triangleq {P_0'}^{\frac{2}{\gamma}}\mathbf{D}_1$, $\mathbf{y}_2 \triangleq [\cdots, -\|\mathbf{s}_i\|^2, \cdots]^T$ and

$$\mathbf{F}_2 \triangleq \begin{bmatrix} \vdots & \vdots & \vdots \\ -2\mathbf{s}_i^T & 1 & -(P_i' - \tau_{bg}')^{-\frac{2}{\gamma}} \\ \vdots & \vdots & \vdots \end{bmatrix}.$$

3) $\tau_{bg}$ Is Unknown: We equivalently consider an unknown $\tau_{bg}'$ and notice that $\tau_{bg}'$ is still not linear in the RHS of (5). To cope with that, we use the fact that $\tau_{bg}' \ll P_0'$ in watts and introduce a new unknown variable $\alpha$ to replace the unknown $\tau_{bg}'$ as $\tau_{bg}' = \alpha P_0'$. Since $\alpha$ is sufficiently small, we can approximate the following terms in (5) using their first-order Taylor expansions around $\alpha = 0$, i.e.,

$$(P_i' - \alpha P_0')^{-\frac{2}{\gamma}} \approx {P_i'}^{-\frac{2}{\gamma}} + \frac{2}{\gamma}P_0'{P_i'}^{-\frac{2}{\gamma}-1}\alpha \tag{9}$$

and

$$(P_i' - \alpha P_0')^{-\frac{2}{\gamma}-1} \approx {P_i'}^{-\frac{2}{\gamma}-1} + \frac{2+\gamma}{\gamma}P_0'{P_i'}^{-\frac{2+2\gamma}{\gamma}}\alpha. \tag{10}$$
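As a quick numerical sanity check of the first-order expansions (9) and (10), the following minimal sketch compares each exact term with its approximation. The function name `taylor_check` and all numeric values (a received power of 1 µW, a transmit power of 1 W, and a ratio α of 1e-9) are hypothetical choices for illustration, not values taken from the letter.

```python
def taylor_check(p_i=1e-6, p_0=1.0, alpha=1e-9, gamma=4.0):
    """Relative errors of the first-order expansions (9) and (10).

    p_i and p_0 are powers in watts; alpha = tau'_bg / P'_0.
    All default values are hypothetical.
    """
    e = -2.0 / gamma
    # Expansion (9): exponent -2/gamma
    exact_9 = (p_i - alpha * p_0) ** e
    approx_9 = p_i ** e + (2.0 / gamma) * p_0 * p_i ** (e - 1.0) * alpha
    # Expansion (10): exponent -2/gamma - 1
    exact_10 = (p_i - alpha * p_0) ** (e - 1.0)
    approx_10 = (p_i ** (e - 1.0)
                 + ((2.0 + gamma) / gamma) * p_0
                 * p_i ** (-(2.0 + 2.0 * gamma) / gamma) * alpha)
    return abs(approx_9 / exact_9 - 1.0), abs(approx_10 / exact_10 - 1.0)

err_9, err_10 = taylor_check()
print(f"relative error of (9): {err_9:.2e}, of (10): {err_10:.2e}")
```

Increasing `alpha` (i.e., a stronger BGN relative to the transmit power) makes both errors grow, which is in line with the later observation that the approximations become less effective under a large BGN power.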
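Substituting (9) and (10) into the RHS of (5) and ignoring the high-order small term $O(\alpha\chi_i)$ result in

$$(P_i' - \tau_{bg}')^{-\frac{2}{\gamma}} + \frac{\ln(10)}{5\gamma}P_i'(P_i' - \tau_{bg}')^{-\frac{2}{\gamma}-1}\chi_i \approx {P_i'}^{-\frac{2}{\gamma}} + \frac{2}{\gamma}P_0'{P_i'}^{-\frac{2}{\gamma}-1}\alpha + \frac{\ln(10)}{5\gamma}{P_i'}^{-\frac{2}{\gamma}}\chi_i. \tag{11}$$

Plugging (11) into (5), we similarly obtain our third linear model with an unknown $\tau_{bg}$ as

$$\mathbf{y}_3 = \mathbf{F}_3\boldsymbol{\theta}_3 + \mathbf{D}_3\boldsymbol{\chi}, \tag{12}$$

where the new parameter vector becomes $\boldsymbol{\theta}_3 \triangleq [\mathbf{x}^T, \|\mathbf{x}\|^2, \alpha]^T$, $\mathbf{D}_3 \triangleq \mathrm{diag}\!\left(\left[\cdots, -\frac{\ln(10)}{5\gamma}{P_i'}^{-\frac{2}{\gamma}}, \cdots\right]^T\right)$, $\mathbf{y}_3 \triangleq \left[\cdots, {P_i'}^{-\frac{2}{\gamma}} - {P_0'}^{-\frac{2}{\gamma}}\|\mathbf{s}_i\|^2, \cdots\right]^T$ and

$$\mathbf{F}_3 \triangleq \begin{bmatrix} \vdots & \vdots & \vdots \\ -2{P_0'}^{-\frac{2}{\gamma}}\mathbf{s}_i^T & {P_0'}^{-\frac{2}{\gamma}} & -\frac{2}{\gamma}P_0'{P_i'}^{-\frac{2}{\gamma}-1} \\ \vdots & \vdots & \vdots \end{bmatrix}.$$

4) Both $\tau_{bg}$ and $P_0$ Are Unknown: Considering the approximations (9) and (10) and omitting the high-order small term $O(\alpha\chi_i)$, we rewrite (7) into

$$-2\mathbf{s}_i^T\mathbf{x} + \|\mathbf{x}\|^2 - {P_i'}^{-\frac{2}{\gamma}}{P_0'}^{\frac{2}{\gamma}} - \frac{2}{\gamma}{P_i'}^{-\frac{2}{\gamma}-1}\alpha{P_0'}^{\frac{2}{\gamma}+1} \approx -\|\mathbf{s}_i\|^2 + {P_0'}^{\frac{2}{\gamma}}\frac{\ln(10)}{5\gamma}{P_i'}^{-\frac{2}{\gamma}}\chi_i, \tag{13}$$

which readily leads to our fourth linear model with unknown $\tau_{bg}$ and $P_0$ as

$$\mathbf{y}_4 = \mathbf{F}_4\boldsymbol{\theta}_4 + \mathbf{D}_4\boldsymbol{\chi}, \tag{14}$$

where $\boldsymbol{\theta}_4 \triangleq [\mathbf{x}^T, \|\mathbf{x}\|^2, {P_0'}^{\frac{2}{\gamma}}, \alpha{P_0'}^{\frac{2}{\gamma}+1}]^T$, $\mathbf{D}_4 \triangleq {P_0'}^{\frac{2}{\gamma}}\mathbf{D}_3$, $\mathbf{y}_4 \triangleq \mathbf{y}_2$, and

$$\mathbf{F}_4 \triangleq \begin{bmatrix} \vdots & \vdots & \vdots & \vdots \\ -2\mathbf{s}_i^T & 1 & -{P_i'}^{-\frac{2}{\gamma}} & -\frac{2}{\gamma}{P_i'}^{-\frac{2}{\gamma}-1} \\ \vdots & \vdots & \vdots & \vdots \end{bmatrix}.$$

B. Semidefinite Programming Solution

For convenience, our four linear models in (6), (8), (12) and (14) are formulated into the following general form

$$\mathbf{y}_I = \mathbf{F}_I\boldsymbol{\theta}_I + \mathbf{D}_I\boldsymbol{\chi}, \quad I = 1, \ldots, 4, \tag{15}$$

where the dimensions of $\mathbf{F}_I$ and $\boldsymbol{\theta}_I$ change according to the number of unknown parameters to estimate. Obviously, the model noise $\mathbf{D}_I\boldsymbol{\chi}$ is colored as its covariance matrix is $\boldsymbol{\Sigma}_{\mathbf{D}_I\boldsymbol{\chi}} = \sigma^2\mathbf{D}_I^2$. In order to whiten the model noise, we can optimally construct the weight matrix $\mathbf{W}_I$ as $\mathbf{W}_I \triangleq \mathbf{D}_I^{-2}$, since the covariance matrix of $\mathbf{W}_I^{1/2}\mathbf{D}_I\boldsymbol{\chi}$ becomes a (scaled) identity. Notice that, since both $\mathbf{D}_2 \triangleq {P_0'}^{\frac{2}{\gamma}}\mathbf{D}_1$ and $\mathbf{D}_4 \triangleq {P_0'}^{\frac{2}{\gamma}}\mathbf{D}_3$ are simply scaled by an unknown ${P_0'}^{\frac{2}{\gamma}}$, we choose their weight matrices as $\mathbf{W}_2 = \mathbf{W}_1$ and $\mathbf{W}_4 = \mathbf{W}_3$.

Then, considering the dependence of the elements in $\boldsymbol{\theta}_I$, we can formulate our weighted optimization problem for all the introduced linear models as

$$\min_{\boldsymbol{\theta}_I}\; (\mathbf{y}_I - \mathbf{F}_I\boldsymbol{\theta}_I)^T\mathbf{W}_I(\mathbf{y}_I - \mathbf{F}_I\boldsymbol{\theta}_I) \tag{16a}$$
$$\text{s.t.}\; [\boldsymbol{\theta}_I]_{1:d}^T[\boldsymbol{\theta}_I]_{1:d} = [\boldsymbol{\theta}_I]_{d+1}. \tag{16b}$$

Using the Schur complement and constructing some linear matrix inequalities (LMIs) results in our SDP optimization problem

$$\min_{\boldsymbol{\theta}_I, t_I}\; t_I \tag{17a}$$
$$\text{s.t.}\; \begin{bmatrix} \mathbf{I}_N & \mathbf{W}_I^{1/2}(\mathbf{y}_I - \mathbf{F}_I\boldsymbol{\theta}_I) \\ (\mathbf{y}_I - \mathbf{F}_I\boldsymbol{\theta}_I)^T\mathbf{W}_I^{1/2} & t_I \end{bmatrix} \succeq 0 \tag{17b}$$
$$\boldsymbol{\Theta}(\boldsymbol{\theta}_I) \triangleq \begin{bmatrix} \mathbf{I}_d & [\boldsymbol{\theta}_I]_{1:d} \\ [\boldsymbol{\theta}_I]_{1:d}^T & [\boldsymbol{\theta}_I]_{d+1} \end{bmatrix} \tag{17c}$$
$$\boldsymbol{\Theta}(\boldsymbol{\theta}_I) \succeq 0, \tag{17d}$$

where $t_I$ is an auxiliary slack variable, $\mathbf{W}_I^{1/2} = \mathbf{D}_1^{-1}$ for $I = 1, 2$ and $\mathbf{W}_I^{1/2} = \mathbf{D}_3^{-1}$ for $I = 3, 4$. Note that (17) drops the rank constraint, i.e., $\mathrm{rank}(\boldsymbol{\Theta}(\boldsymbol{\theta}_I)) = d$, in order to guarantee a convex set for $\boldsymbol{\theta}_I$. This is called the semidefinite relaxation (SDR). Finally, the proposed SDP optimization problem in (17) can be efficiently solved by CVX and the estimate of the target location is hence given by $\hat{\mathbf{x}} = [\hat{\boldsymbol{\theta}}_I]_{1:d}$. Considering the worst-case complexity, i.e., the interior-point algorithm, the complexity for each iteration is $O[d^2N]$ and the iteration number is bounded by $O[\sqrt{N}\ln(1/\xi)]$, where $\xi$ is the iteration tolerance; thus the total complexity is $O[d^2N^{1.5}\ln(1/\xi)]$.

C. CRLBs

According to (2), we stack all the measurements $P_i$ as $\mathbf{p} \sim \mathcal{N}(\boldsymbol{\mu}, \boldsymbol{\Sigma}_\chi)$, where $\mathbf{p} \triangleq [\cdots, P_i, \cdots]^T$, $\boldsymbol{\mu} \triangleq [\cdots, \mu_i, \cdots]^T$ with $\mu_i \triangleq 10\log_{10}(\bar{P}_i' + \tau_{bg}')$ and $\bar{P}_i' \triangleq 10^{\bar{P}_i/10}$, and the covariance matrix of the measurement noise is $\boldsymbol{\Sigma}_\chi = \sigma^2\mathbf{I}_N$. When the elements of the parameter vector $\boldsymbol{\theta}$ are mutually independent, the Fisher information matrix (FIM) can be computed as

$$[\mathbf{J}]_{n,m} = \left[\frac{\partial\boldsymbol{\mu}}{\partial\theta_n}\right]^T\boldsymbol{\Sigma}_\chi^{-1}\left[\frac{\partial\boldsymbol{\mu}}{\partial\theta_m}\right] = \frac{1}{\sigma^2}\left[\frac{\partial\boldsymbol{\mu}}{\partial\theta_n}\right]^T\left[\frac{\partial\boldsymbol{\mu}}{\partial\theta_m}\right],$$

where the parameter vector $\boldsymbol{\theta}$ could be $\boldsymbol{\theta} = \mathbf{x}$, $\boldsymbol{\theta} = [\mathbf{x}^T, P_0]^T$, $\boldsymbol{\theta} = [\mathbf{x}^T, \tau_{bg}]^T$ or $\boldsymbol{\theta} = [\mathbf{x}^T, P_0, \tau_{bg}]^T$, and $\frac{\partial\boldsymbol{\mu}}{\partial\theta_n} \triangleq \left[\cdots, \frac{\partial[\boldsymbol{\mu}]_i}{\partial\theta_n}, \cdots\right]^T$. Letting $\mathbf{x} = [x_1, \ldots, x_d]^T$ and $\mathbf{s}_i = [s_{i,1}, \ldots, s_{i,d}]^T$, we obtain

$$\frac{\partial[\boldsymbol{\mu}]_i}{\partial x_k} = -\frac{10\gamma}{\ln(10)}\,\frac{\bar{P}_i'}{\bar{P}_i' + \tau_{bg}'}\,\frac{x_k - s_{i,k}}{\|\mathbf{x} - \mathbf{s}_i\|^2}, \quad k = 1, \ldots, d,$$

$$\frac{\partial[\boldsymbol{\mu}]_i}{\partial P_0} = \frac{\bar{P}_i'}{\bar{P}_i' + \tau_{bg}'} \quad\text{and}\quad \frac{\partial[\boldsymbol{\mu}]_i}{\partial\tau_{bg}} = \frac{\tau_{bg}'}{\bar{P}_i' + \tau_{bg}'}.$$

Now, we can easily calculate the FIMs for different $\boldsymbol{\theta}$ and hence the corresponding CRLBs. In this letter, the main concern is estimating the target location $\mathbf{x}$, for which the CRLBs with different $\boldsymbol{\theta}$ are given by $\mathrm{CRLB} = \sum_{k=1}^{d}[\mathbf{J}^{-1}]_{k,k}$.

IV. NUMERICAL RESULTS

Monte Carlo simulations with 1000 trials have been conducted to study the CRLBs and to evaluate the introduced models and the SDP solution, where the target node is randomly placed in a 50 m × 50 m field and 10 anchors are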
pre-deployed with known locations (50, 50), (50, 0), (0, 50), (0, 0), (25, 7), (25, 43), (12, 33), (12, 16), (37, 33) and (33, 16). The PLE is set to 4 and the transmit power is 30 dBm. The root mean square error (RMSE) is used to evaluate the estimation accuracy and the average CRLB is calculated from different target locations. Due to the page limit, we will only focus on the localization performance, not the estimation of other unknown parameters.

Fig. 1. The blue solid curves are the CRLBs for the traditional RSS-based localization, while the red dashed curves represent the CRLBs for the blind RSS-based localization.

Fig. 2. Performance of the proposed SDP estimators compared with a traditional LLS estimator that ignores the BGN.

A. CRLBs

As shown in Fig. 1a, the blind source localization based on the RSS measurement with the BGN power cannot outperform the traditional one, which however can be understood as the price of avoiding signal demodulation. Obviously, the impact of the BGN is rather significant, especially with a large BGN power. Nonetheless, when the BGN power is small, the blind RSS-based localization can yield almost the same accuracy as the traditional one, in which case the absence of signal demodulation becomes a remarkable advantage, especially for military scenarios. Next, let us focus on the blind RSS-based localization and observe that, with more unknown parameters $\tau_{bg}$ or $P_0$, the localization performance deteriorates. Particularly, the unknown $P_0$ yields a more severe impact than the unknown $\tau_{bg}$. Interestingly, although considering the BGN power is very important, its exact knowledge is relatively less significant. Finally, from Fig. 1b, all the CRLBs become worse with a large shadowing effect.

B. Localization Performance

The proposed SDP estimators associated with our models in (6), (8), (12) and (14) are respectively denoted as SDP1, SDP2, SDP3 and SDP4. A recent constrained weighted least squares (CWLS) estimator for the traditional RSS-based localization that ignores the BGN power is also considered for comparison [3]. As shown in Fig. 2, our proposed SDP estimators significantly outperform the traditional CWLS estimator, with an RMSE that is at least 1 m lower, which again strongly indicates that the BGN power cannot be ignored. In stark contrast, even the performance degradation caused by unknown parameters does not seem that significant. Furthermore, with small $\tau_{bg}$ or $\sigma$, the proposed methods yield a performance that is close to the corresponding CRLBs. Admittedly, under large $\tau_{bg}$ or $\sigma$, the approximations in the derivations become less effective, thus resulting in a gap to the CRLBs, even though the proposed methods are still better than the traditional CWLS estimator. Actually, this is a common issue for most localization problems [6]. However, due to the page limit, we leave this issue for future research.

V. CONCLUSION

In this letter, we consider the localization using the blind RSS measurements that include the BGN power. As a kick-off of this topic, we introduce four models with the transmit power and the BGN power being known or unknown. We also propose a general SDP solution and derive the corresponding CRLBs, which indicate the importance of considering the BGN power. The numerical results show that the proposed models and the general SDP solution are very reliable and suffice to serve this new kind of localization. This letter casts light on this new kind of localization, and our future challenges include coping with an unknown PLE and system parameter errors.

REFERENCES

[1] F. Gustafsson and F. Gunnarsson, “Mobile positioning using wireless networks: Possibilities and fundamental limitations based on available wireless network measurements,” IEEE Signal Process. Mag., vol. 22, no. 4, pp. 41–53, Jul. 2005.
[2] X. Li, “RSS-based location estimation with unknown pathloss model,” IEEE Trans. Wireless Commun., vol. 5, no. 12, pp. 3626–3633, Dec. 2006.
[3] Z. Li, “Constrained weighted least squares location algorithm using received signal strength measurements,” China Commun., vol. 13, no. 4, pp. 81–88, Apr. 2016.
[4] R. M. Vaghefi, M. R. Gholami, R. M. Buehrer, and E. G. Strom, “Cooperative received signal strength-based sensor localization with unknown transmit powers,” IEEE Trans. Signal Process., vol. 61, no. 6, pp. 1389–1403, Mar. 2013.
[5] X. Guo, L. Chu, and X. Sun, “Accurate localization of multiple sources using semidefinite programming based on incomplete range matrix,” IEEE Sensors J., vol. 16, no. 13, pp. 5319–5324, Jul. 2016.
[6] Y. Hu and G. Leus, “Robust differential received signal strength-based localization,” IEEE Trans. Signal Process., vol. 65, no. 12, pp. 3261–3276, Jun. 2017.
[7] R. K. Martin et al., “Modeling and mitigating noise and nuisance parameters in received signal strength positioning,” IEEE Trans. Signal Process., vol. 60, no. 10, pp. 5451–5463, Oct. 2012.
[8] T. S. Rappaport, Wireless Communications: Principles and Practice (Prentice Hall Communications Engineering and Emerging Technologies Series), London, U.K.: Dorling Kindersley, 2009.
[9] Y. Hu and G. Leus, “Self-estimation of path-loss exponent in wireless networks and applications,” IEEE Trans. Veh. Technol., vol. 64, no. 11, pp. 5091–5102, Nov. 2015.
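As an illustration of the CRLB expressions in Section III-C of the preceding letter, the following minimal sketch evaluates the bound for the simplest case in which only the 2-D location is unknown (both the transmit power and the BGN power are known). The anchor layout, the PLE of 4, and the 30 dBm transmit power follow the simulation setup described in the letter; the function name `crlb_location`, the target position, the shadowing standard deviation of 3 dB, and the BGN power levels are hypothetical choices.

```python
import math

# Anchor positions in metres, as listed in the simulation setup.
ANCHORS = [(50, 50), (50, 0), (0, 50), (0, 0), (25, 7),
           (25, 43), (12, 33), (12, 16), (37, 33), (33, 16)]

def crlb_location(x, p0_dbm=30.0, tau_bg_dbm=-40.0, gamma=4.0, sigma_db=3.0):
    """CRLB (in m^2) on a 2-D target location x when P0 and tau_bg are known.

    sigma_db and tau_bg_dbm are assumed values; powers are converted
    from dBm to milliwatts so that both use the same linear unit.
    """
    tau_lin = 10 ** (tau_bg_dbm / 10)
    g11 = g12 = g22 = 0.0
    for s in ANCHORS:
        dx, dy = x[0] - s[0], x[1] - s[1]
        d2 = dx * dx + dy * dy
        # Noise-free received power in linear scale: 10 * gamma * log10(d)
        # equals 5 * gamma * log10(d^2).
        p_bar_lin = 10 ** ((p0_dbm - 5 * gamma * math.log10(d2)) / 10)
        # Common factor of the derivative of mu_i with respect to x_k.
        scale = -(10 * gamma / math.log(10)) * p_bar_lin / (p_bar_lin + tau_lin) / d2
        gx, gy = scale * dx, scale * dy
        g11 += gx * gx
        g12 += gx * gy
        g22 += gy * gy
    # J = G^T G / sigma^2, so trace(J^-1) = sigma^2 * (g11 + g22) / det(G^T G).
    det = g11 * g22 - g12 * g12
    return sigma_db ** 2 * (g11 + g22) / det

target = (20.0, 20.0)  # hypothetical target position
for tau in (-200.0, -40.0, -20.0):
    bound = math.sqrt(crlb_location(target, tau_bg_dbm=tau))
    print(f"tau_bg = {tau:6.1f} dBm -> RMSE bound {bound:.3f} m")
```

A very low BGN power (here -200 dBm) effectively recovers the traditional RSS bound, because the factor that multiplies each derivative tends to one; larger BGN powers shrink that factor and raise the bound.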
Fast Analog Transmission for High-Mobility Wireless Data Acquisition in Edge Learning

Yuqing Du and Kaibin Huang

Abstract: By implementing machine learning at the network edge, edge learning trains models by leveraging rich data distributed at edge devices and in return endows them with capabilities of seeing, listening, and reasoning. In edge learning, the need for high-mobility wireless data acquisition arises in scenarios where edge devices (or even servers) are mounted on ground or aerial vehicles. In this letter, we present a novel solution, called fast analog transmission (FAT), for high-mobility data acquisition in edge-learning systems, which has several key features. First, FAT incurs low latency. Specifically, FAT requires no source-and-channel coding and no channel training via the proposed technique of Grassmann analog encoding (GAE) that encodes data samples into subspace matrices. Second, FAT supports spatial multiplexing by directly transmitting analog vector data over an antenna array. Third, FAT can be seamlessly integrated with edge learning (i.e., training of a classifier model in this letter). In particular, by applying a Grassmannian-classification algorithm from computer vision, the received GAE encoded data can be directly applied to training the model without decoding and conversion. This design is found by simulation to outperform conventional schemes in learning accuracy at a moderate-SNR range under the high-mobility scenario due to its robustness against data distortion induced by fast fading.

Index Terms: Edge learning, fast analog transmission, data acquisition, high mobility.

Manuscript received October 3, 2018; accepted October 11, 2018. Date of publication October 16, 2018; date of current version April 9, 2019. This work was supported by Hong Kong Research Grants Council under Grant 17209917 and Grant 17259416. The associate editor coordinating the review of this paper and approving it for publication was J. Choi. (Corresponding author: Kaibin Huang.)
The authors are with the Department of Electrical and Electronic Engineering, University of Hong Kong, Hong Kong (e-mail: yqdu@eee.hku.hk; huangkb@eee.hku.hk).
Digital Object Identifier 10.1109/LWC.2018.2876344

I. INTRODUCTION

ENVISIONED as an evolution in computing, edge learning refers to the implementation of machine learning at the network edge so as to leverage enormous data distributed at edge devices (e.g., smartphones and sensors) for training models [1]. Subsequently, the models are applied to empowering edge devices with the capabilities of seeing, listening and reasoning. While computing speeds are growing rapidly, the latency in wireless data acquisition has emerged to be the bottleneck of fast edge learning [1]. This issue is exacerbated in high-mobility scenarios where edge devices (or even edge servers) are mounted on ground or aerial vehicles as illustrated in Fig. 1(a) [2]. High-mobility data acquisition faces several challenges: 1) robustness against fast fading, 2) low latency given short connection time, 3) seamless integration with learning algorithms. To tackle these challenges, we present a novel solution, called fast analog transmission (FAT).

Fig. 1. (a) A scenario of high-mobility wireless data acquisition for edge learning where edge devices are mounted on ground vehicles or unmanned aerial vehicles (UAVs); (b) Illustration of communication latency caused by channel training.

The design of the FAT scheme builds on three ideas from communication and learning. The first idea is analog transmission based on linear analog modulation that has been deployed previously in different settings such as fast transfer of channel-state information (CSI) [3] and over-the-air functional computation in sensor networks [4]. Compared with digital transmission, the analog design does not require source-and-channel coding and decoding, thereby reducing computation complexity. Moreover, direct transmission of analog data instead of a quantized bit stream shortens the transmission duration. In terms of learning performance, our findings suggest that a customized design of analog transmission targeting learning (e.g., FAT) can be more robust than digital counterparts against data distortion by fast fading at high mobility.

The second idea is blind multiple-input-multiple-output (MIMO) transmission without CSI. This idea was first developed in the classic area of non-coherent MIMO, which is a digital space-time modulation scheme [5], [6]. Its unique feature is that a modulation constellation comprises a set of subspace matrices. The transmitted space-time symbol in the form of such a matrix is invariant to rotation by a block fading channel that remains constant within each symbol duration but varies over different durations. Thus the matrix can be transmitted and detected even without CSI at either side, referred to hereafter as the channel-invariant property [5]. Consequently, channel training is unnecessary, thereby reducing the transmission latency and overhead as illustrated in Fig. 1(b). On the other hand, non-coherent MIMO cannot support spatial multiplexing like its coherent counterpart. The resultant low data rates make the former less popular in practice and its applications are limited to low-rate ultra-fast machine-type applications [2], [7]. In contrast, the proposed scheme retains the advantages of both technologies, namely the channel-invariant property and (analog) spatial multiplexing. The said property is achieved by the proposed Grassmannian analog encoding (GAE), a key FAT component, which encodes a data sample (an analog vector) into a subspace matrix by projection onto a
DU AND HUANG et al.: FAT FOR HIGH-MOBILITY WIRELESS DATA ACQUISITION IN EDGE LEARNING 469
point on the Grassmann manifold, thereby giving the name of

the technique. On the other hand, FAT supports spatial mul-
tiplexing by directly transmitting analog data vectors instead
of a single constellation point as in non-coherent MIMO. The
differences between FAT and conventional MIMO schemes are
summarized in Table I. Fig. 2. An edge-learning system based on FAT.
In this letter, we consider a typical edge-learning task of
training a classifier model. The last idea pertains to edge B. Simulation Models
learning and is to apply a Grassmann classification algo-
rithm for classifying the received GAE encoded training data. Simulation for evaluating learning performance is based on
Such algorithms were originally developed for computer vision the following data and channel models. The data at different
where image features or motions are represented as subspaces edge devices are assumed to be independent and identically
or equivalently points on a Grassmann manifold, referred to distributed (i.i.d) based on the classic mixture of Gaussian
as Grassmann data [8]. Via the application of such an algo- (MoG) model, which is widely adopted in the machine-
rithm, classification can be seamlessly integrated with FAT learning literature. Each data sample is a 1-by-L complex
since the received GAE encoded data can be directly used in random vector. Let M denote the number of data classes. Then
learning without decoding and conversion. Furthermore, the the i-th sample, denoted as s(i) , from the m-th class can be
integration leads to accurate edge learning with robustness in modelled as
data acquisition against fast fading due to its property of blind s(i) = µm + z(i) , ∀i , (2)
transmission and detection, avoiding detection errors caused
by inaccurate CSI. In summary, an edge learning system based where µm is the mean of the m-th class and z(i) ∈ C1×L a
on FAT comprises the following three components. deviation vector comprising i.i.d. CN (0, σs2 ) elements.
1) Grassmann analog encoding: At each edge device, the proposed GAE encodes data samples into subspace matrices by projection onto a Grassmann manifold to enable blind MIMO transmission and robust edge learning.
2) Analog transmission and detection: The GAE encoded data is transmitted using linear analog modulation and blindly detected at the edge server without channel knowledge.
3) Edge learning: At the edge server, the received Grassmann data is used for training a classifier model using a Grassmann-classification algorithm from computer vision.
By evaluating the classification performance of a model and transmission latency using simulation (see Section V), the proposed FAT scheme is found to substantially outperform the conventional coherent (analog and digital) MIMO transmission at high mobility.

II. SYSTEM AND SIMULATION MODELS
A. System Model
Consider the edge-learning system illustrated in Fig. 2 where an edge server trains a classifier using a training dataset transmitted by multiple edge devices. The transmissions by devices are based on time sharing and independent of channels given no CSI. All nodes are equipped with antenna arrays, resulting in a set of narrow-band MIMO channels. Let $N_t$ and $N_r$ denote the numbers of transmit and receive antennas, respectively. Time is divided into baseband sampling intervals, called (time) slots. Then the slot-$t$ realization of the MIMO channel from an active device to the server can be represented by the $N_r \times N_t$ matrix $\mathbf{H}_t$. Given an analog vector-symbol $\mathbf{g}_t$ transmitted by the active device, the received signal is
$$\mathbf{y}_t = \sqrt{P}\, \mathbf{H}_t \mathbf{g}_t + \mathbf{w}_t \tag{1}$$
where $P$ is the transmission power and $\mathbf{w}_t$ the additive-white-Gaussian-noise (AWGN) vector. In this letter, we focus on transmission of data samples that dominates the data acquisition process. Their labels have finite values and naturally can be transmitted using digital non-coherent MIMO modulation over a low-rate channel, called label channel, orthogonal to the high-rate data channel. Due to its low rate, the label channel can be reasonably assumed to be noiseless, similarly to the CSI feedback channel.

Next, high mobility induces temporally correlated MIMO channels. Assuming rich scattering, the classic Clarke's model is applied, which translates a speed into the level of channel temporal correlation. Specifically, within the duration of transmitting a data sample, two realizations of the $(m, n)$-th coefficient of the channel $\mathbf{H}_t$ separated by $\tau$ slots are correlated with the correlation function given as
$$\mathsf{E}\left[\left(h_t^{(m,n)}\right)^{*} h_{t+\tau}^{(m,n)}\right] = J_0(2\pi f_D \tau), \tag{3}$$
where $f_D = f_c v / c$ with $v$ being the speed, $f_c$ the carrier frequency and $c$ the speed of light, and $J_0$ is the zeroth-order Bessel function of the first kind.

III. FAST ANALOG TRANSMISSION SCHEME
In this section, we discuss two key algorithms in the proposed FAT scheme, namely GAE and blind analog transmission and detection (see Fig. 2). The received Grassmannian dataset is used for training a classifier model using an existing Grassmannian classification algorithm such as sample Karcher mean [9], which is adopted in simulation. The details are omitted for brevity.
Grassmann Analog Encoding: To facilitate exposition, some mathematical notions are defined as follows. The $(n, m)$ Grassmann manifold is a set of all $m$-dimensional subspaces in $\mathbb{C}^n$, denoted by $\mathcal{G}_{n,m}$ [9]. For the special case of $\mathcal{G}_{3,1}$, each point on the manifold geometrically corresponds to a unique line passing through the origin as illustrated in Fig. 3. For ease of notation, a point on $\mathcal{G}_{n,m}$ that is a subspace is usually represented by an arbitrary basis matrix spanning the subspace, denoted as $\boldsymbol{\Upsilon}$. The subspace distance between two points $\boldsymbol{\Upsilon}$ and $\boldsymbol{\Upsilon}'$ on the Grassmannian $\mathcal{G}_{n,m}$, denoted as $d_p(\boldsymbol{\Upsilon}, \boldsymbol{\Upsilon}')$, is measured using the commonly used metric of Procrustes distance for its better performance in simulation:
$$d_p^2(\boldsymbol{\Upsilon}, \boldsymbol{\Upsilon}') = m - \mathrm{tr}\left(\boldsymbol{\Upsilon}\boldsymbol{\Upsilon}^H \boldsymbol{\Upsilon}' (\boldsymbol{\Upsilon}')^H\right). \tag{4}$$
As discussed, GAE at the active device endows FAT with the channel-invariant property, thereby enabling blind analog transmission with robustness against fast fading. As illustrated in Fig. 3, the mathematical principle of GAE is to project original data samples (vectors in the Euclidean space) onto
the Grassmann manifold, generating subspace matrices as the output. The GAE algorithm is described as follows.
Step 1 (Vector-to-Matrix Conversion): Consider a data sample that is a $1 \times L$ row vector, say $\mathbf{s}^{(i)}$, with $L$ being an integer multiple of $N_t$. Then $\mathbf{s}^{(i)}$ can be divided into $1 \times T$ sub-vectors with $T = L/N_t$: $\mathbf{s}^{(i)} = [\mathbf{s}_1^{(i)}, \mathbf{s}_2^{(i)}, \ldots, \mathbf{s}_{N_t}^{(i)}]$. It follows that $\mathbf{s}^{(i)}$ can be converted into an $N_t \times T$ data matrix $\mathbf{X}^{(i)}$ having $\{\mathbf{s}_n^{(i)}\}$ as rows. The matrix $\mathbf{X}^{(i)}$ so constructed is typically fat ($N_t < T$) since a data-sample vector is usually long ($L \gg N_t$). For the case where $L$ is not an integer multiple of $N_t$, zero-padding can be applied to lengthen $\mathbf{s}^{(i)}$ so that the integer-multiple constraint is met.
Step 2 (Projection onto Grassmannian): The key step in encoding is to project the matrix $\mathbf{X}^{(i)}$ constructed in the preceding step onto a single point on the Grassmannian $\mathcal{G}_{T,N_t}$. To this end, decompose the matrix $\mathbf{X}^{(i)}$ by singular value decomposition (SVD) as $\mathbf{X}^{(i)} = \mathbf{U}^{(i)} \boldsymbol{\Sigma}^{(i)} \mathbf{G}^{(i)}$. Then $\mathbf{G}^{(i)}$ is an $N_t \times T$ basis matrix spanning the row space of $\mathbf{X}^{(i)}$ and thus a point on $\mathcal{G}_{T,N_t}$. The encoder uses $\mathbf{G}^{(i)}$ as the output from encoding the data sample $\mathbf{s}^{(i)}$.
In summary, the advantages of GAE are threefold: 1) enabling blind transmission and detection as discussed in the next sub-section, 2) endowing edge learning with robustness against data distortion by fast fading as shown in simulation results, and 3) allowing seamless integration with learning on Grassmannian without signal decoding and conversion.

TABLE I
COMPARISON OF DIFFERENT MIMO TRANSMISSION SCHEMES

Fig. 3. Principle of Grassmann analog encoding.

Blind Analog Transmission and Detection: Given Grassmann encoding, the procedures for blind analog transmission and detection in FAT are described as follows.
Step 1 (Analog Transmission): After encoding each data sample, say $\mathbf{s}^{(i)}$, into a subspace basis matrix $\mathbf{G}^{(i)}$, the $N_t \times T$ matrix $\mathbf{G}^{(i)}$ is directly transmitted by the active device over $T$ slots using linear analog modulation and the array of $N_t$ antennas. To reflect channel temporal variation, it is necessary to write $\mathbf{G}^{(i)}$ in terms of its columns: $\mathbf{G}^{(i)} = [\mathbf{g}_1^{(i)}, \mathbf{g}_2^{(i)}, \ldots, \mathbf{g}_T^{(i)}]$. Then the received signal due to the transmission of $\mathbf{G}^{(i)}$ can be represented by the $N_r \times T$ matrix $\mathbf{Y}^{(i)} = [\mathbf{y}_1^{(i)}, \mathbf{y}_2^{(i)}, \ldots, \mathbf{y}_T^{(i)}]$ with $\mathbf{y}_t^{(i)}$ given as
$$\mathbf{y}_t^{(i)} = \sqrt{P}\, \mathbf{H}_t^{(i)} \mathbf{g}_t^{(i)} + \mathbf{w}_t^{(i)}. \tag{5}$$
For continuous time-shared distributed uploading of total $N$ data samples, $t = 1, 2, \ldots, NT$. It is important to observe from (5) that due to high mobility, the channel $\{\mathbf{H}_t^{(i)}\}$ varies in the $T$-slot transmission duration of a single data sample, which has a negative effect on decoding as discussed in the sequel.
Step 2 (Grassmann Analog Detection): The detection of the transmitted encoded analog space-time symbol $\mathbf{G}^{(i)}$ involves the extraction of the row space, denoted by the $N_t \times T$ unitary matrix $\hat{\mathbf{G}}^{(i)}$, from the SVD of the received $N_r \times T$ space-time signal $\mathbf{Y}^{(i)}$ specified in (5), namely $\mathbf{Y}^{(i)} = \mathbf{V}^{(i)} \boldsymbol{\Pi}^{(i)} \hat{\mathbf{G}}^{(i)}$. Consider the special case of zero noise and static channel. The detected symbol is $\hat{\mathbf{G}}^{(i)} = \mathbf{O}\mathbf{G}^{(i)}$ where $\mathbf{O}$ is an $N_t \times N_t$ rotation (unitary) matrix. In other words, $\hat{\mathbf{G}}^{(i)}$ and $\mathbf{G}^{(i)}$ are the identical point on the Grassmannian, corresponding to perfect detection. In the presence of noise and channel variation, they are two different points and the resultant detection error affects learning. Based on the above detection procedure, the output training dataset, called Grassmann dataset, is a sequence of $N$ labeled subspace matrices (points on the Grassmannian), $[\hat{\mathbf{G}}^{(1)}, \hat{\mathbf{G}}^{(2)}, \ldots, \hat{\mathbf{G}}^{(N)}]$, whose labels are acquired by the server via the said low-rate label channel.
Remark 1 (Blind Transmission and Detection): Both the analog transmission and detection in the above steps are independent of the channel. In particular, the detection of Grassmann dataset involves SVDs of the received array observations that do not require any channel knowledge.

IV. UNDERSTANDING THE DESIGN
A. Grassmann Analog Encoding Preserves Clustering
An important reason FAT supports edge classification is that GAE retains the class structure in the original dataset. This property is illustrated in Fig. 4 where the high-dimensional datasets are visualized in the 2D plane using a well-known visualization algorithm, t-distributed stochastic neighbour embedding (t-SNE). As discussed in the sequel, GAE incurs DoF loss in the dataset. Consequently, one can observe from Fig. 4 that data classes are less compact after GAE, sacrificing some discriminability of the dataset. The loss, nevertheless, yields communication advantages discussed shortly.

B. Trading DoF Loss for Robustness and Low Latency
The Grassmann encoding design leads to the DoF loss, which may make data points among different classes that are well-separated in the Euclidean space become much closer or even overlapped on the Grassmannian. The loss has a negative effect on the classification performance. The phenomenon is illustrated by the following example. Besides SVD, an alternative method for GAE is LQ decomposition. Consider the LQ decomposition of two data matrices $\mathbf{X} = \mathbf{L}_X \mathbf{G}_X$ and $\mathbf{Y} = \mathbf{L}_Y \mathbf{G}_Y$, where the unitary matrices $\mathbf{G}_X$ and $\mathbf{G}_Y$ represent identical subspaces (or identical encoding outputs) as the SVD counterparts and $\mathbf{L}_X$ and $\mathbf{L}_Y$ are lower triangular matrices. They represent the DoF loss for $\mathbf{X}$ and $\mathbf{Y}$ in the GAE process. If $\mathrm{span}(\mathbf{G}_X) = \mathrm{span}(\mathbf{G}_Y)$ but $\mathbf{L}_X \neq \mathbf{L}_Y$, the Euclidean distance between $\mathbf{X}$ and $\mathbf{Y}$ is $d_E^2(\mathbf{X}, \mathbf{Y}) \neq 0$. However, based on (4), the Procrustes distance between the GAE encoded data samples $\mathbf{G}_X$ and $\mathbf{G}_Y$ is $d_p^2(\mathbf{G}_X, \mathbf{G}_Y) = 0$.
A key finding in this letter is that the DoF loss of GAE is more than compensated by its robustness against fast fading that can cause severe errors in data transmission without GAE.
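The encoding, blind detection, and DoF-loss example described above can be checked numerically. The following is a minimal sketch rather than the authors' implementation: it assumes Gram-Schmidt orthonormalisation of the rows (the LQ route mentioned in the text) in place of an SVD, a static noiseless channel, and hypothetical sample values and channel coefficients. The function names are illustrative.

```python
# Sketch of GAE and blind detection in pure Python (complex arithmetic).
# Assumption: Gram-Schmidt on the rows replaces the SVD; both give an
# orthonormal basis of the same row space, i.e. the same Grassmannian point.

def row_space_basis(rows, tol=1e-9):
    """Orthonormalise the rows; (numerically) dependent rows are dropped."""
    basis = []
    for row in rows:
        v = list(row)
        for q in basis:
            # Coefficient of v along q: sum_k v_k * conj(q_k).
            coeff = sum(a * b.conjugate() for a, b in zip(v, q))
            v = [a - coeff * b for a, b in zip(v, q)]
        norm = sum(abs(a) ** 2 for a in v) ** 0.5
        if norm > tol:
            basis.append([a / norm for a in v])
    return basis


def gae_encode(sample, n_t):
    """Zero-pad, reshape to an n_t x T matrix, return a basis of its row space."""
    padded = list(sample) + [0j] * (-len(sample) % n_t)
    t = len(padded) // n_t
    x = [padded[r * t:(r + 1) * t] for r in range(n_t)]
    return row_space_basis(x)


def procrustes_sq(g1, g2):
    """Eq. (4) for bases stored as orthonormal rows: m - ||G1 G2^H||_F^2."""
    overlap = sum(
        abs(sum(a * b.conjugate() for a, b in zip(r1, r2))) ** 2
        for r1 in g1 for r2 in g2
    )
    return len(g1) - overlap


n_t, n_r = 2, 4
s = [complex(k + 1, (-1) ** k * (k % 3)) for k in range(12)]  # hypothetical sample, L = 12
g = gae_encode(s, n_t)  # 2 x 6 matrix with orthonormal rows

# Static, noiseless 4 x 2 channel with hypothetical coefficients: Y = H G.
h = [[1 + 1j, 0.5], [0.2j, -1], [2, 1j], [-0.3, 0.7 - 0.1j]]
y = [[sum(h[r][a] * g[a][t] for a in range(n_t)) for t in range(len(g[0]))]
     for r in range(n_r)]
g_hat = row_space_basis(y)  # blind detection: the channel h is never used

# DoF loss: Z = L_Z X with an invertible lower-triangular L_Z = [[2, 0], [1, 3]].
z = [2 * a for a in s[:6]] + [a + 3 * b for a, b in zip(s[:6], s[6:])]
euclid_sq = sum(abs(a - b) ** 2 for a, b in zip(s, z))

print("detection distance:", procrustes_sq(g, g_hat))
print("Euclidean distance^2 between X and Z:", euclid_sq)
print("Procrustes distance^2 after GAE:", procrustes_sq(g, gae_encode(z, n_t)))
```

Under these assumptions the detected subspace coincides with the transmitted one up to rounding error, and two samples that differ in the Euclidean space map to the same Grassmannian point, which is the DoF loss the text describes.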
As a result, GAE leads to a net performance gain over conventional schemes at high mobility. Furthermore, GAE also leads to transmission-latency reduction as it eliminates channel-training overhead and enables analog transmission faster than digital counterparts.

Fig. 4. The clustering structure of a binary MoG data set (a) before and (b) after GAE.

Fig. 5. The channel-training overhead versus normalized Doppler shift for the target classification error rate of $1 \times 10^{-3}$.

V. SIMULATION RESULTS
The simulation parameters are set as follows. The number of Gaussian classes is $M = 2$ (the same observations have also been made from additional simulation results for large values of $M$) with source data parametric ratio, i.e., $\|\boldsymbol{\mu}_m\|^2/\sigma_s^2$, being 15 dB and the dimension of each data sample is $L = 48$. The $4 \times 2$ MIMO channel is temporally correlated with the variation speed specified by the normalized Doppler shift $f_D T_s = 0.01$, where $T_s$ is the baseband sampling interval. The training and test datasets are generated based on the discussed MoG model and comprise 200 and 2000 samples, respectively. The performance of FAT is benchmarked against two high-rate coherent schemes: digital and analog MIMO transmission, both of which assume an MMSE linear receiver and thus require channel training to acquire the needed CSI. Like FAT, analog MIMO transmits data samples directly by linear analog modulation. On the other hand, digital MIMO quantizes data samples into 8 bits per coefficient and modulates each symbol using QPSK before MIMO transmission. All considered schemes have no error control coding.
Communication Latency Performance: While FAT is free of channel training, benchmark schemes incur training overhead that can be quantified by the fraction of a frame allocated for the purpose, i.e., the ratio $P/(P + D)$ with $P$ and $D$ illustrated in Fig. 1(b). The curves of overhead versus Doppler shift are displayed in Fig. 5 for FAT and two benchmarking schemes given the classification-error rate of $1 \times 10^{-3}$. One can observe that the overhead grows monotonically with the Doppler shift as the channel fading becomes faster. For high mobility with Doppler approaching $10^{-2}$, the overhead can be more than 12% and 6% for digital and analog coherent MIMO, respectively. Furthermore, given the same performance, digital coherent MIMO (with QPSK modulation and 8-bit quantization) requires 4 times more frames for transmitting the training dataset than the two analog schemes. This suggests that analog transmission is preferable for data acquisition in edge learning.

Fig. 6. Learning performance comparison for two cases: (a) a varying Doppler shift with the average transmit SNR equal to 15 dB; (b) a varying average transmit SNR with the normalized Doppler shift fixed at 0.01.

Learning Performance: Classifier models discussed in Section III are trained using the training dataset acquired using different transmission schemes and then evaluated using the test dataset. The resultant classification error rates are compared in Fig. 6 by varying Doppler shift and average transmit SNR. Several observations can be made. In the range of moderate to large Doppler shift (i.e., larger than $6 \times 10^{-3}$), the proposed FAT outperforms the benchmarking schemes, supporting the former's intended application in high-mobility data acquisition. Furthermore, at high mobility (i.e., Doppler equal to 0.01), FAT achieves the best performance in the practical SNR range (0-17 dB). This is because the performance gain for FAT due to its robustness against fast fading exceeds the degradation due to DoF loss in the practical SNR range (i.e., 0-17 dB), where FAT outperforms the baseline schemes as observed in Fig. 6(b). Outside this range, the reverse holds. The above observations reconfirm the conclusion that FAT is a promising solution for high-mobility data acquisition for edge learning.

VI. CONCLUDING REMARKS
In this letter, we have proposed a novel transmission scheme, namely FAT, for high-mobility data acquisition in edge learning systems. In particular, this scheme allows blind transmission and detection using GAE, which has a wide range of applications in subspace-based learning tasks such as motion tracking, face recognition, etc. This letter also points to the promising new research area of signal encoding for edge learning.

REFERENCES
[1] H. B. McMahan et al. (2016). Communication-Efficient Learning of Deep Networks From Decentralized Data. [Online]. Available: https://arxiv.org/pdf/1602.05629.pdf
[2] C. Bockelmann et al., “Massive machine-type communications in 5G: Physical and MAC-layer solutions,” IEEE Commun. Mag., vol. 54, no. 9, pp. 59–65, Sep. 2016.
[3] T. L. Marzetta and B. M. Hochwald, “Fast transfer of channel state information in wireless systems,” IEEE Trans. Signal Process., vol. 54, no. 4, pp. 1268–1278, Apr. 2006.
[4] M. Goldenbaum and S. Stanczak, “Robust analog function computation via wireless multiple-access channels,” IEEE Trans. Commun., vol. 61, no. 9, pp. 3863–3877, Sep. 2013.
[5] B. M. Hochwald and T. L. Marzetta, “Unitary space-time modulation for multiple-antenna communications in Rayleigh flat fading,” IEEE Trans. Inf. Theory, vol. 46, no. 2, pp. 543–564, Mar. 2000.
[6] W. Yang, G. Durisi, and E. Riegler, “On the capacity of large-MIMO block-fading channels,” IEEE J. Sel. Areas Commun., vol. 31, no. 2, pp. 117–132, Feb. 2013.
[7] F. Boccardi, R. Heath, A. Lozano, T. Marzetta, and P. Popovski, “Five disruptive technology directions for 5G,” IEEE Commun. Mag., vol. 52, no. 2, pp. 74–80, Feb. 2014.
[8] J. Hamm and D. Lee, “Grassmann discriminant analysis: A unifying view on subspace-based learning,” in Proc. ACM 25th Int. Conf. Mach. Learn., 2008, pp. 376–383.
[9] Y. Du, G. Zhu, J. Zhang, and K. Huang. (2018). Automatic Recognition of Space-Time Constellations by Learning on the Grassmann Manifold. [Online]. Available: https://arxiv.org/pdf/1804.03593.pdf

On Achieving the Maximum Streaming Rate in Hybrid Wired/Wireless Overlay Networks
Jianwei Zhang, Member, IEEE, Xinchang Zhang, Member, IEEE, Meng Sun, and Chunling Yang
Abstract—In this letter, we study the multicast streaming

problem in a fully connected hybrid wired/wireless overlay. We
derive explicit formulas for the maximum streaming rate and
the hybrid multiplicity using a flow-based method. The consid-
ered model supports both full-duplex and half-duplex modes, and
incorporates equal service and differentiated service in a unified
framework. The obtained results have theoretical and practical
significance in wired/wireless content distribution.
Index Terms—Overlay, multicast, streaming, wireless.
Manuscript received September 6, 2018; revised October 9, 2018; accepted October 11, 2018. Date of publication October 17, 2018; date of current version April 9, 2019. This work was supported in part by the National Natural Science Foundation of China under Grant 61472230, and in part by the Shandong Provincial Natural Science Foundation of China under Grant ZR2016YL001, Grant ZR2016YL004, and Grant ZR2016YL008. The associate editor coordinating the review of this paper and approving it for publication was M. Velez. (Corresponding author: Jianwei Zhang.)
J. Zhang, X. Zhang, and M. Sun are with the Shandong Provincial Key Laboratory of Computer Networks, Shandong Computer Science Center (National Supercomputer Center in Jinan), Qilu University of Technology (Shandong Academy of Sciences), Jinan 250101, China (e-mail: janyway@outlook.com).
C. Yang is with the Information Technology Center, Zhejiang University, Hangzhou 310027, China.
Digital Object Identifier 10.1109/LWC.2018.2876536

I. INTRODUCTION
RECENTLY, multicast streaming applications in hybrid wired/wireless overlays have become popular. However, the performance bounds have not been sufficiently studied. Related works mainly consider wired networks. Some are flow-based; others are chunk-based. For flow-based methods, when assuming that the network topology has no degree bound, we can usually obtain analytical results [1]–[3]. After adding simple constraints, the problem becomes NP-hard [4]. In comparison, most chunk-based methods use probabilistic models [5]. In flow-based methods, minimizing the finish time in file sharing and maximizing the streaming rate in live streaming are essentially equivalent. Hence the results derived from one can be directly extended to the other. In file sharing scenarios, the derivation of the minimum finish time assumes no limit on the downlink bandwidths of the helpers [1], as does the multiplicity theorem introduced in [3]. In [6], the minimum finish time in half-duplex wireless networks was first deduced, but only the equal service was studied.
In this letter, we first give an explicit expression of the maximum streaming rate in a hybrid wired/wireless overlay. We then calculate the hybrid multiplicity, a system-level performance metric, to measure the potential streaming capability of the overlay. The obtained results can constitute a basic component for the optimization of multi-file distribution or multi-channel streaming applications [6], [7].

Fig. 1. Inflow ($A \subset L$) and outflow ($A \supseteq L$). The aim is to exploit the bandwidth of $L$ to maximize the achievable streaming rate of $A$.

II. SYSTEM MODEL
Fig. 1 shows a fully-connected hybrid wired/wireless overlay. Being consistent with [6], we assume that the wired nodes use the full-duplex mode and that the wireless nodes use the half-duplex mode. In the full-duplex mode, the uplink (UL) rate and downlink (DL) rate of a node $i$ are bounded by its UL bandwidth¹ $u_i$ and DL bandwidth $d_i$, respectively. In the half-duplex mode, the sum of the UL rate and DL rate of a node $i$ is bounded by its bandwidth $c_i$ [6]. The duplex modes, not the node types, dominate the analysis in the model. For example, in wired networks, the twisted-pair cable and the fiber optic can operate in the full-duplex mode, while the digital subscriber line (DSL) uses the frequency-division multiplexing (FDM) to emulate the full-duplex mode over a half-duplex link; thus the end users perceive fixed UL and DL bandwidths. In Wi-Fi networks, wireless nodes under an access point (AP) use the carrier-sense multiple access with collision avoidance (CSMA/CA) to contend for the channel. The dynamic adjustment of DL/UL ratio can be implemented via the data link layer [8] and driven by the end-user applications; thus it can be handled by the half-duplex mode. In cellular networks, the long-term evolution frequency-division duplex (LTE-FDD) has a fixed DL/UL ratio. In the long-term evolution time-division duplex (LTE-TDD), there are limited choices of subframe configurations, leading to a few fixed DL/UL ratios. Hence, LTE can be treated the same way as the full-duplex mode in the model. Compared to LTE, the upcoming 5G provides not only the potential use of the in-band full-duplex, but the envisioned flexible duplex via an enhancement of TDD and FDD [9]. Although the protocol overheads and interference are not considered in the model, the results can still provide valuable insights on the benefits obtained from a flexible duplex mode over a fixed one.
¹ In this letter, bandwidth refers to capacity, measured in bits per second.
We assume that the data transmission between nodes, whether wired or wireless, regardless of how geographically
ZHANG et al.: ON ACHIEVING MAXIMUM STREAMING RATE IN HYBRID WIRED/WIRELESS OVERLAY NETWORKS 473
TABLE I
NOTATIONS

Based on Eq. (2) and $u(\cdot)$ and $c(\cdot)$ defined in Table I, the following notations are introduced for ease of presentation:
$$
\begin{cases}
u(L, A) = u(A) + \dfrac{|A|-1}{|A|}\, u(B)_A, & A \subset L \\[2mm]
c(L, A) = c(A) + \dfrac{|A|-1}{|A|+1}\, c(B), & A \subset L
\end{cases} \tag{3}
$$
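As an illustration of Eq. (3), consider a hypothetical inflow instance; all numbers are made up for this example and share one bandwidth unit. It assumes, in line with Eq. (2), that $u(\cdot)$ sums the UL bandwidths of the wired nodes in a set and $c(\cdot)$ sums the bandwidths of its wireless nodes, since Table I is not reproduced here. Let $A$ contain a wired node with $u_v = 4$ and a wireless node with $c_w = 12$, and let $B$ contain a wired node with $u_x = 6$, $d_x = 2$ and a wireless node with $c_y = 3$, so that $|A| = 2$:

```latex
\begin{align*}
u(B)_A  &= \min\{|A|\, d_x,\; u_x\} = \min\{2 \cdot 2,\; 6\} = 4, \\
u(L, A) &= u(A) + \frac{|A|-1}{|A|}\, u(B)_A = 4 + \frac{1}{2} \cdot 4 = 6, \\
c(L, A) &= c(A) + \frac{|A|-1}{|A|+1}\, c(B) = 12 + \frac{1}{3} \cdot 3 = 13.
\end{align*}
```

In this instance the wired helper contributes only 4 of its 6 units of UL bandwidth because its DL bandwidth limits how much of the stream it can receive and replicate.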
III. MAIN RESULTS
A. Maximum (Achievable) Streaming Rate
In this section, we give the upper bound expression of the achievable streaming rate for $A$ in terms of the bandwidths of all nodes. In the proof, we develop a rate allocation and routing scheme to achieve the bound.
Theorem 1: For a hybrid wired/wireless overlay, the maximum (achievable) streaming rate $r_{\max}$ is given by:
$$
r_{\max} =
\begin{cases}
\min\left\{ r_A,\; u_0,\; \dfrac{u_0 + u(L, A) + c(L, A)}{|A| + |A_{wl}|} \right\}, & A \subset L \\[3mm]
\min\left\{ r_A,\; u_0,\; \dfrac{u_0 + u(L) + c(L)}{|A| + |L_{wl}|} \right\}, & A \supseteq L
\end{cases} \tag{4}
$$
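The bound of Theorem 1 can be evaluated directly from Eqs. (1)-(4). The sketch below is illustrative only: the `Node` class, the function name, and all bandwidth values are hypothetical, and it assumes that $u(\cdot)$ and $c(\cdot)$ are sums over the wired and wireless nodes of a set and that $d_{\min}(\cdot)$ and $c_{\min}(\cdot)$ are the corresponding minima, since Table I is not reproduced here.

```python
# Sketch of Theorem 1. Assumed meanings of the Table I notation:
# u(S), c(S): sums over wired / wireless nodes in S; d_min, c_min: minima.
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    wired: bool
    u: float = 0.0  # UL bandwidth (wired, full-duplex)
    d: float = 0.0  # DL bandwidth (wired, full-duplex)
    c: float = 0.0  # shared bandwidth (wireless, half-duplex)


def max_streaming_rate(u0, nodes, A, L):
    """Return r_max of Eq. (4); A and L are sets of keys into `nodes`."""
    def wired(S):
        return [nodes[k] for k in S if nodes[k].wired]

    def wireless(S):
        return [nodes[k] for k in S if not nodes[k].wired]

    n = len(A)
    # Eq. (1): no receiver can take in more than its DL bandwidth / bandwidth.
    r_A = min([x.d for x in wired(A)] + [x.c for x in wireless(A)])
    if A < L:  # inflow: the helpers B = L \ A feed A
        B = L - A
        u_B_A = sum(min(n * x.d, x.u) for x in wired(B))          # Eq. (2)
        u_LA = sum(x.u for x in wired(A)) + (n - 1) / n * u_B_A   # Eq. (3)
        c_LA = (sum(x.c for x in wireless(A))
                + (n - 1) / (n + 1) * sum(x.c for x in wireless(B)))
        bound = (u0 + u_LA + c_LA) / (n + len(wireless(A)))
    elif A >= L:  # outflow, including the equal service A = L
        supply = sum(x.u for x in wired(L)) + sum(x.c for x in wireless(L))
        bound = (u0 + supply) / (n + len(wireless(L)))
    else:
        raise ValueError("Theorem 1 covers only A subset of L or A superset of L")
    return min(r_A, u0, bound)


# Hypothetical four-node overlay.
nodes = {
    "v": Node(True, u=4, d=10),
    "w": Node(False, c=12),
    "x": Node(True, u=6, d=2),
    "y": Node(False, c=3),
}
inflow = max_streaming_rate(10, nodes, {"v", "w"}, {"v", "w", "x", "y"})
outflow = max_streaming_rate(10, nodes, {"v", "w", "x", "y"}, {"v", "w"})
print(inflow, outflow)
```

In the inflow call the third term of Eq. (4) is the binding one, whereas in the outflow call the receiver with the smallest DL bandwidth limits the rate through $r_A$.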
close they are, must go through the core network, i.e., there are no direct communications among wireless nodes connecting to the same AP or base station (BS). The bottleneck lies in the access network rather than the core network, which usually has abundant bandwidth resources [1], [5], [6].
There are two node sets $A, L \subseteq N$, where $N$ denotes the universal node set. Each of them may contain wired and wireless nodes. $B$ denotes the difference of $A$ and $L$. The server sends a stream to the overlay at the rate equal to its UL bandwidth $u_0$, referred to as the original streaming rate. All nodes in $A$ want to concurrently receive the stream. All nodes in $L$ always contribute their UL bandwidths and forward the stream. There are two cases $A \subset L$ and $A \supseteq L$. We refer to the former as the inflow differentiated service, i.e., all the bandwidth of $L$ is aggregated into $A$, and the latter as the outflow differentiated service, i.e., all the bandwidth of $L$ is spread across $A$. Without loss of generality, we incorporate the equal service case $A = L$ into the outflow case. The first objective is to calculate the maximum (achievable) streaming rate $r_{\max}$ for $A$ when $L$ and $A$ are fixed.
Further, note that $r_{\max}$ may not always reach $u_0$. In live streaming, how many nodes at most can reach $u_0$ is a crucial performance metric. Suppose that $L$ is fixed and that all nodes in $N$ are arranged in some order. The second objective is to calculate the largest integer $|A|$, defined as the hybrid multiplicity $M_h$, such that the first $|A|$ nodes can reach $u_0$.
Since all nodes in $A$ receive the stream at the same rate, using $d_{\min}(\cdot)$ and $c_{\min}(\cdot)$ defined in Table I, the maximum streaming rate that $A$ can achieve is bounded by:
$$r_A = \min\{d_{\min}(A),\; c_{\min}(A)\}. \tag{1}$$
When $A \subset L$, suppose that a wired node $i$ in $B$ wants to forward a substream from the server to all nodes in $A$. The substream that it receives from the server is at most at the rate equal to $d_i$. Thus, the total UL bandwidth that node $i$ can offer to $A$ is bounded by $\min\{|A| d_i, u_i\}$. Similarly, the total UL bandwidth that $B_{wd}$ can offer to $A$ is calculated as:
$$u(B)_A = \sum_{i \in B_{wd}} \min\{|A| d_i,\; u_i\}, \quad A \subset L. \tag{2}$$

Proof: The proof can be illustrated by Fig. 1.
Step 1: We prove that Eq. (4) is an upper bound. Obviously, $r_{\max}$ is bounded by $r_A$ and $u_0$. Thus, we only prove how $r_{\max}$ is bounded by the third term, which can be intuitively understood as the average aggregate bandwidths that the system provides to all nodes in $A$.
1) Case $A \subset L$: For a wired node $v$ or a wireless node $w$ in $A$, it forwards a substream from the server to all other $|A| - 1$ nodes in $A$. Suppose that all nodes in $A$ receive the substream at $r_{\max}$. The upper bounds of the total available UL bandwidths of $A_{wd}$ and $A_{wl}$ are $u(A)$ and $c(A) - |A_{wl}| r_{\max}$, respectively.
For a wired node $x$ in $B$, it forwards a substream from the server to all $|A|$ nodes in $A$. Similar to Eq. (2), the available UL bandwidth of node $x$ is thus limited by $\min\{|A| d_x, u_x\}$. To fully exploit its UL bandwidth, the server should provide $\frac{1}{|A|} \min\{|A| d_x, u_x\}$ of the UL bandwidth to node $x$. Summing up all wired nodes in $B$, $u(B)_A$ of their UL bandwidths can be used at the cost that $\frac{1}{|A|} u(B)_A$ of the server's UL bandwidth is consumed.
For a wireless node $y$ in $B$, all its UL bandwidth can be utilized when it forwards a substream from the server at rate $\frac{1}{|A|+1} c_y$ to all nodes in $A$. Summing up all wireless nodes in $B$, $\frac{|A|}{|A|+1} c(B)$ of their UL bandwidths can be used while $\frac{1}{|A|+1} c(B)$ of the server's UL bandwidth is consumed.
Combining the wired and wireless cases, $r_{\max}$ satisfies:
$$r_{\max} \le \frac{1}{|A|} \left[ u_0 + u(A) + c(A) - |A_{wl}| r_{\max} + \frac{|A|-1}{|A|}\, u(B)_A + \frac{|A|-1}{|A|+1}\, c(B) \right]. \tag{5}$$
Plugging Eq. (3) into the above equation and moving $r_{\max}$ to the left-hand side, we have:
$$r_{\max} \le \frac{1}{|A| + |A_{wl}|} \left[ u_0 + u(L, A) + c(L, A) \right]. \tag{6}$$
2) Case $A \supseteq L$: For a wired node $v$ or a wireless node $w$ in $L$, it forwards a substream from the server to all other $|A| - 1$ nodes in $A$. Consequently, all nodes in $B$ can reach the same streaming rate as the nodes in $A$, without contributing their UL
bandwidths. Similar to the previous case, the upper bounds on the total available UL bandwidths of $L_{wd}$ and $L_{wl}$ are $u(L)$ and $c(L) - |L_{wl}| r_{\max}$, respectively.
Combining the wired and wireless cases, $r_{\max}$ satisfies:
$$r_{\max} \le \frac{1}{|A|} \left[ u_0 + u(L) + c(L) - |L_{wl}| r_{\max} \right]. \tag{7}$$
Moving $r_{\max}$ to the left-hand side, we can obtain:
$$r_{\max} \le \frac{1}{|A| + |L_{wl}|} \left[ u_0 + u(L) + c(L) \right]. \tag{8}$$
Step 2: We prove that the upper bound can be achieved.
1) Case $A \subset L$: Denote $r = \frac{u_0 + u(L, A) + c(L, A)}{|A| + |A_{wl}|}$. There are six cases according to the relationship among $r$, $u_0$ and $r_A$. We only prove the fundamental case $r \le u_0 \le r_A$ where the performance bottleneck lies in the whole overlay rather than the server or individual nodes; other cases can be proven similarly using the methods in [1]. As shown in Fig. 1, we construct five types of substreams sent by the server:
T1: A substream traverses a wired node $v$ in $A$ and is copied to other $|A| - 1$ nodes in $A$. The streaming rate is: $s_v^1 = \frac{1}{|A|-1} u_v$.
T2: A substream traverses a wireless node $w$ in $A$ and is copied to other $|A| - 1$ nodes in $A$. The streaming rate is: $s_w^2 = \frac{1}{|A|-1} (c_w - r_{\max})$.
T3: A substream traverses a wired node $x$ in $B$ and is copied to $|A|$ nodes in $A$. The streaming rate is: $s_x^3 = \frac{1}{|A|} \min\{|A| d_x, u_x\}$.
T4: A substream traverses a wireless node $y$ in $B$ and is copied to all nodes in $A$. The streaming rate is: $s_y^4 = \frac{1}{|A|+1} c_y$.
T5: A substream is directly sent to all nodes in $A$. The streaming rate is: $s_0^5 = \frac{1}{|A|} \left[ u_0 - \sum_{v \in A_{wd}} s_v^1 - \sum_{w \in A_{wl}} s_w^2 - \sum_{x \in B_{wd}} s_x^3 - \sum_{y \in B_{wl}} s_y^4 \right]$.
Each node $t$ in $A$ receives the original stream at the rate:
$$r = \sum_{v \in A_{wd}} s_v^1 + \sum_{w \in A_{wl}} s_w^2 + \sum_{x \in B_{wd}} s_x^3 + \sum_{y \in B_{wl}} s_y^4 + s_0^5. \tag{9}$$
Through a simple calculation, we can obtain $r = r_{\max}$.
2) Case $A \supseteq L$: Denote $r = \frac{u_0 + u(L) + c(L)}{|A| + |L_{wl}|}$. For the same reason as in the previous case, we only consider the fundamental case $r \le u_0 \le r_A$. As shown in Fig. 1, we construct three types of substreams sent by the server:
T1: A substream traverses a wired node $v$ in $L$ and is copied to $|A| - 1$ other nodes in $A$. The streaming rate is: $s_v^1 = \frac{1}{|A|-1} u_v$.
T2: A substream traverses a wireless node $w$ in $L$ and is copied to $|A| - 1$ other nodes in $A$. The streaming rate is: $s_w^2 = \frac{1}{|A|-1} (c_w - r_{\max})$.
T3: A substream is directly sent to all nodes in $A$. The streaming rate is: $s_0^3 = \frac{1}{|A|} \left[ u_0 - \sum_{v \in L_{wd}} s_v^1 - \sum_{w \in L_{wl}} s_w^2 \right]$.
Each node $t$ in $A$ receives the original stream at the rate:
$$r = \sum_{v \in L_{wd}} s_v^1 + \sum_{w \in L_{wl}} s_w^2 + s_0^3. \tag{10}$$
Through a simple calculation, we can obtain $r = r_{\max}$.
Remark 1: When there are only wired nodes using the inflow differentiated service and assuming the DL bandwidths of nodes in $B_{wd}$ are sufficiently large, i.e., $A \subset L$, $c(L, A) = 0$, $A_{wl} = \emptyset$, and $|A| d_i \ge u_i$, $i \in B_{wd}$, Eq. (4) is reduced to the case in [1]. When there are only wireless nodes using the equal service, i.e., $A = L$ and $|A| = |L_{wl}|$, Eq. (4) is reduced to the case in [6].

B. Hybrid Multiplicity
Based on Theorem 1, we can easily calculate the hybrid multiplicity $M_h$ as follows.
Theorem 2: Suppose that all nodes in $N$ are numbered from 1 to $|N|$. $L$ contains the first $|L|$ nodes and $A$ the first $|A|$ nodes. Define the hybrid multiplicity function as:
$$
f(A) =
\begin{cases}
+\infty, & |A| = 1 \\[1mm]
\dfrac{u(L, A) + c(L, A)}{|A| + |A_{wl}| - 1}, & |A| \in (1, |L|) \\[3mm]
\dfrac{u(L) + c(L)}{|A| + |L_{wl}| - 1}, & |A| \in [|L|, |N|]
\end{cases} \tag{11}
$$
Then the first $|A|$ nodes can reach $u_0$ if and only if:
$$u_0 \le \min\{f(A),\; r_A\}. \tag{12}$$
The hybrid multiplicity $M_h$ can be computed as the largest integer $|A|$ such that Eq. (12) holds. If Eq. (12) cannot be satisfied in any case, $M_h$ equals zero.
Proof: Since the case $|A| = 1$ is trivial, we only prove the other two cases.
1) Case $|A| \in (1, |L|)$. Eq. (11) and Eq. (12) yield:
$$u_0 \le \frac{1}{|A| + |A_{wl}|} \left[ u_0 + u(L, A) + c(L, A) \right]. \tag{13}$$
2) Case $|A| \in [|L|, |N|]$. Eq. (11) and Eq. (12) yield:
$$u_0 \le \frac{1}{|A| + |L_{wl}|} \left[ u_0 + u(L) + c(L) \right]. \tag{14}$$
Eqs. (13) and (14) imply that the average aggregate bandwidths that the system provides to the first $|A|$ nodes are larger than or equal to $u_0$. Specifically, as long as Eq. (12) holds, the streaming rate of the first $|A|$ nodes can reach $u_0$.
Remark 2: Once the node order is given, $|A|$ can uniquely determine the nodes contained by $A$. Therefore, given $|L|$ and $u_0$, the node order uniquely determines $M_h$.
The model in this letter has wide applicability in media streaming and file sharing. For inflow, the nodes in $B$ can serve as helpers to accelerate the streaming rate in $A$. For outflow, the premium users in $B$, without sharing any bandwidths to others, can reach the same streaming rate as the common users in $L$. With the hybrid multiplicity, the service provider can measure how many users at most it can cover with the highest quality service. In practice, multiple media streams may be concurrently requested by multiple node classes. Fortunately, the corresponding $n$-class problem can often be decomposed into a series of two-class problems investigated in this letter [6]; the hybrid multiplicity can play a crucial role in file sharing [2], [3].

IV. NUMERICAL EXAMPLES
To analyze the obtained results in the previous section, we set $u_0 \in [3, 12]$, $|N| = 10$, $|L| = 6$, and $|A| \in [1, 10]$ in the three networks. The bandwidth profile is given in Table II.
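The hybrid multiplicity of Theorem 2 can be computed by scanning $|A|$ and testing Eq. (12). The sketch below is illustrative only: the tuple encoding of nodes, the function name, and the five-node bandwidth profile are hypothetical and are not the profile of Table II, and $r_A$ is taken as the smallest DL bandwidth (wired) or bandwidth (wireless) among the first $|A|$ nodes, which is an assumed reading of Eq. (1).

```python
# Sketch of Theorem 2 (Eqs. (11)-(12)) on a hypothetical ordered node list.
# Node encoding (an assumption): ("wd", u, d) for wired, ("wl", c) for wireless.
import math


def hybrid_multiplicity(u0, nodes, size_L):
    """Largest |A| such that the first |A| nodes can all reach rate u0."""
    L = nodes[:size_L]
    best = 0
    for n in range(1, len(nodes) + 1):
        A = nodes[:n]
        a_wl = sum(1 for x in A if x[0] == "wl")
        # Eq. (1): smallest DL bandwidth (wired) or bandwidth (wireless) in A.
        r_A = min(x[2] if x[0] == "wd" else x[1] for x in A)
        if n == 1:
            f = math.inf
        elif n < size_L:  # inflow: B = L \ A acts as helpers
            B = L[n:]
            u_LA = sum(x[1] for x in A if x[0] == "wd") + (n - 1) / n * sum(
                min(n * x[2], x[1]) for x in B if x[0] == "wd")
            c_LA = sum(x[1] for x in A if x[0] == "wl") + (n - 1) / (n + 1) * sum(
                x[1] for x in B if x[0] == "wl")
            f = (u_LA + c_LA) / (n + a_wl - 1)
        else:  # outflow, including A = L
            l_wl = sum(1 for x in L if x[0] == "wl")
            # x[1] is u for wired and c for wireless nodes, so this is u(L) + c(L).
            f = sum(x[1] for x in L) / (n + l_wl - 1)
        if u0 <= min(f, r_A):  # Eq. (12)
            best = n
    return best


# Hypothetical profile; L consists of the first three nodes.
profile = [("wd", 6, 12), ("wl", 10), ("wd", 4, 12), ("wl", 8), ("wd", 2, 12)]
for u0 in (4, 6, 8, 13):
    print(u0, hybrid_multiplicity(u0, profile, 3))
```

For this profile the multiplicity shrinks as $u_0$ grows and drops to zero once $u_0$ exceeds the DL bandwidth of the first node, which mirrors the trade-off between $u_0$ and $M_h$ discussed for the numerical examples.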
Fig. 2. $r_{\max}$ and $M_h$. (a) Wired network. (b) Wireless network. (c) Hybrid network. (d) Scalability, $u_0 = 8$.

Fig. 3. Effect of node combinations. $u_0 = 50$. The horizontal dashed line separates the inflow case and outflow case. (a) Wired network, with infinite DL bandwidths $d_i$. (b) Wired network. (c) Wireless network. (d) Hybrid network.

TABLE II
PROFILE FOR WIRED, WIRELESS, AND HYBRID NETWORKS

Figs. 2a–2c visualize the two theorems. When $u_0 = 10$, $M_h$ of the three networks is 1, 4, and 2, respectively. When $u_0 > 10$, $M_h$ becomes zero in Fig. 2a, while it is always positive in Fig. 2b. $M_h$ in Fig. 2c falls in the middle compared to the other two networks. There is a tradeoff between the performance metrics $\{r_{\max}, M_h\}$ and the system parameters $\{u_0, |L|, |A|\}$. From the perspective of the server, the smaller the value of $u_0$, the more nodes can reach it. The larger the size of $A$, the smaller the maximum streaming rate that the nodes in $A$ can reach. $M_h$ can be viewed as the critical point of the system. When $|A| > M_h$, no nodes in $A$ can reach $u_0$, although the bandwidths are fully utilized. When $|A| \le M_h$, all nodes in $A$ can reach $u_0$, but some bandwidths may be wasted.
In Fig. 2d, we increase the network size by replicating each node in Table II from 1 to 100 times (scaling factor) while keeping their order unchanged. When $|L|$ increases proportionally, $M_h$ grows linearly as the network size increases.

possibly larger $r_{\max}$, not necessarily always so, because the node combination has a significant effect.

V. CONCLUSION
In this letter, we have derived explicit formulas for the maximum streaming rate and the hybrid multiplicity in a fully-connected hybrid wired/wireless overlay. Since the bandwidths of all nodes are heterogeneous, the obtained results are practical and generalizable. In the future, this letter can be extended to more flexible duplex modes, or can be adapted to other networking paradigms with similar topologies and data transmission processes.

REFERENCES
[1] R. Kumar and K. Ross, “Peer-assisted file distribution: The minimum distribution time,” in Proc. IEEE HOTWEB, Boston, MA, USA, 2006, pp. 1–11.
[2] G. M. Ezovski, A. Tang, and L. L. H. Andrew, “Minimizing average finish time in P2P networks,” in Proc. IEEE INFOCOM, Rio de Janeiro, Brazil, 2009, pp. 594–602.
[3] M. Mehyar, W. Gu, S. H. Low, M. Effros, and T. Ho, “Optimal strategies for efficient peer-to-peer file sharing,” in Proc. IEEE ICASSP, Honolulu, HI, USA, 2007, pp. 1337–1340.
[4] S. Sengupta et al., “Peer-to-peer streaming capacity,” IEEE Trans. Inf. Theory, vol. 57, no. 8, pp. 5072–5087, Aug. 2011.
[5] J. Zhang, X. Zhang, and C. Yang, “Towards the multi-request mechanism in pull-based peer-to-peer live streaming systems,” Comput. Netw., vol. 138, pp. 77–89, Jun. 2018.
[6] X. Meng, “Bandwidth partition strategies for minimizing peer-to-peer multi-file distribution time,” M.S. thesis, Dept. Elect. Electron. Eng., Univ. Hong Kong, Hong Kong, 2013. [Online]. Available:
In Fig. 3, we fix L and study the effect of node combina- http://hub.hku.hk/handle/10722/192851
|L| [7] A. Ghaderzadeh, M. Kargahi, and M. Reshadi, “ReDePoly: Reducing
tions within A. There are |A| combinations for inflow and delays in multi-channel P2P live streaming systems using distributed
|N |−|L| intelligence,” Telecommun. Syst., vol. 67, no. 2, pp. 231–246, Feb. 2018.
|A|−|L|
combinations for outflow. Fig. 3a shows that when [8] Y. Gao and L. Dai, “Optimal downlink/uplink throughput allocation for
there is no limitation on di (or similarly, di ui ), a smaller IEEE 802.11 DCF networks,” IEEE Wireless Commun. Lett., vol. 2,
no. 6, pp. 627–630, Dec. 2013.
|A| leads to a strictly larger rmax than does a greater |A|, [9] Q. Liao, “Dynamic uplink/downlink resource management in flexible
regardless of the node combinations. In the other three fig- duplex-enabled wireless networks,” in Proc. IEEE ICC Workshops, Paris,
ures where di are limited, a smaller |A| can only guarantee a France, 2017, pp. 625–631.
Interleave-Division Multiple Access in High Rate Applications

Yang Hu, Chulong Liang, Lei Liu, Chunlin Yan, Yifei Yuan, and Li Ping, Fellow, IEEE
Manuscript received July 19, 2018; revised October 5, 2018; accepted October 9, 2018. Date of publication October 17, 2018; date of current version April 9, 2019. This work was supported in part by the Research Fund of ZTE Corporation, and in part by the University Grants Committee of the Hong Kong Special Administrative Region, China, under Project CityU 11280216 and Project CityU 11216817. The associate editor coordinating the review of this paper and approving it for publication was W. Zhang. (Corresponding author: Lei Liu.)
Y. Hu, C. Liang, L. Liu, and L. Ping are with the Department of Electronic Engineering, City University of Hong Kong, Hong Kong (e-mail: yhu228-c@my.cityu.edu.hk; chuliang@cityu.edu.hk; leiliu@cityu.edu.hk; eeliping@cityu.edu.hk).
C. Yan and Y. Yuan are with the Algorithm Department, ZTE Corporation, Shenzhen 518057, China (e-mail: yan.chunlin@zte.com.cn; yifei.yuan@ztetx.com).
Digital Object Identifier 10.1109/LWC.2018.2876538

Abstract—Interleave-division multiple access (IDMA) is a multiple access scheme that has been considered in several recent proposals for the 5th generation cellular system. In this letter, based on evolution analysis, we show that the performance of IDMA can be enhanced using the transfer function matching principle. Such matching can be realized by superposition coded modulation, power control, repetition coding, and zero padding. Zero padding together with cyclic shifting also leads to reduced implementation complexity. Our analysis is based on additive white Gaussian noise channels and we show by simulations that the matching techniques can also provide impressive performance in fading channels.

Index Terms—IDMA, evolution technique, system design.

I. INTRODUCTION
Interleave-division multiple access (IDMA) [1] is inspired by the success of low-density parity-check (LDPC) codes [2]. Recently, IDMA has been discussed for the 5th generation (5G) cellular system [3]–[5].
For LDPC codes, decoding performance can be optimized by matching the transfer functions of local decoders [2]. This matching principle was later extended to different iterative systems [6]–[8]. An IDMA receiver also involves two local processors, namely an elementary signal estimator (ESE) and a decoder (DEC). (See Section II.) As shown in [9], the performance of IDMA can be improved by tuning an underlying LDPC code for better matching between ESE and DEC. There are, however, some obstacles for this strategy. First, in 5G, the LDPC code used has already been specified [10], so alternatives other than altering the code structure should be used for system optimization. Second, there is also a lack of efficient matching methods when high order modulation is involved for high rate applications. Third, matching for multiuser systems is generally a difficult problem. Very limited progress has been made in this direction.
In this letter, we consider IDMA system design in high sum-rate situations. We first derive the achievable rate for IDMA using the matching principle. We show that, with perfect matching, IDMA is potentially capacity approaching in nonfading additive white Gaussian noise (AWGN) channels. We outline several practical matching techniques including modulation, power control, repetition coding and zero padding. Incidentally, we also show that zero padding together with cyclic shifting can reduce the implementation cost related to user-specific interleaving in IDMA. Our analysis is based on AWGN channels and we will provide experimental results for fading channels. We will show that the proposed techniques can provide noticeable performance enhancement.

II. SYSTEM MODEL AND EVOLUTION ANALYSIS
A. Transmitter Principles
Consider a K-user up-link multiple access system with received symbols:

$$ y(j) = \sum_{k=1}^{K} h_k x_k(j) + \eta(j), \quad j = 1, 2, \ldots, J, \tag{1} $$

where $h_k$ is the channel coefficient of user k, $x_k(j)$ a transmitted symbol, $\eta(j)$ a complex AWGN sample with mean zero and variance $\sigma^2$, and $J$ the frame length. We assume an underlying orthogonal frequency division multiplexing (OFDM) layer that resolves the intersymbol interference problem and a quasi-static channel that remains unchanged over a frame.
The principle of IDMA is illustrated graphically in Fig. 1. The graph is randomized with user-specific interleaving, which is illustrated in Fig. 1 by the shuffled edge connections between $\{c_k(j)\}$ and $\{x_k(j)\}$. Fig. 1 can be seen as a graphic extension of a single-user LDPC code to a multiuser system. The randomness resulting from interleaving reduces short cycles in the graph, which facilitates low-cost message passing decoding. More details can be found in [1] and [11].
Fig. 1 involves a user-specific interleaver for each user. This interleaver can be combined with the inherent interleaver in the LDPC code involved. This is equivalent to the scheme in [12], in which each user employs a unique interleaver for its code. Later we will show that such an interleaver can be realized by cyclic shifting (see Fig. 5 below), which further reduces the hardware implementation cost for IDMA.

B. Receiver Principles
We divide an iterative detector for the system in Fig. 1 into two local processors: an ESE and a DEC. The iterative process is outlined below.
Initialization: Assume that the modulation constellation of $\{x_k(j)\}$ has zero mean and unit average power. Then $E(x_k(j))$ and $\mathrm{Var}(x_k(j))$ are respectively initialized to 0 and 1, $\forall k, j$.
ESE Operations: We rewrite (1) as

$$ y(j) = h_k x_k(j) + \zeta_k(j), \tag{2a} $$
HU et al.: IDMA IN HIGH RATE APPLICATIONS 477
Fig. 1. A factor graph of a 2-user IDMA system with LDPC coding. $J = 8$. $\{c_k(j), j = 1, 2, \ldots, J\}$ is a codeword that is interleaved and modulated to produce $\{x_k(j)\}$. Circles represent variables and squares constraints. Three types of constraints are presented: a white square for an LDPC coding constraint, a square with “×” for a modulation constraint and a square with “+” for a multiple access constraint defined in (1).

where

$$ \zeta_k(j) = y(j) - h_k x_k(j) = \sum_{k'=1,\, k' \neq k}^{K} h_{k'} x_{k'}(j) + \eta(j) \tag{2b} $$

includes the interference and noise seen by user k. We model $\zeta_k(j) \sim \mathcal{CN}(\mu_k(j), v_{\zeta,k}(j))$ using Gaussian approximation (GA). Given a priori $\{E(x_k(j))\}$ and $\{\mathrm{Var}(x_k(j))\}$, we can evaluate $\mu_k(j)$ and $v_{\zeta,k}(j)$ and then estimate $x_k(j)$ based on (2a). For example, for binary phase shift keying (BPSK) modulation, each $x_k(j) \in \{-1, +1\}$ and the estimation outputs are log-likelihood ratios (LLRs):

$$ \mathrm{LLR}(x_k(j)) = \ln\frac{\Pr(x_k(j) = +1)}{\Pr(x_k(j) = -1)} = \frac{\mathrm{Re}\big(2h_k^{*}(y(j) - \mu_k(j))\big)}{v_{\zeta,k}(j)/2}. \tag{3} $$

Similar results can be obtained for quadrature phase shift keying (QPSK) and other modulations.
DEC Operations: The DEC is further divided into K constituent decoders {DEC 1, DEC 2, ..., DEC K}, one for each user. The LLRs in (3) are used as the inputs to the DEC. Assume soft-output decoding, by which $\{E(x_k(j))\}$ and $\{\mathrm{Var}(x_k(j))\}$ are updated. For an LDPC or a turbo code, such decoding follows the standard procedures in [2], [13], and [14].
Iterative Process: The DEC outputs $\{E(x_k(j))\}$ and $\{\mathrm{Var}(x_k(j))\}$ are fed back to the ESE. Then (3) is re-evaluated and the iterative process continues.

C. Evolution Analysis
Let $v_k$ be the average of $\mathrm{Var}(x_k(j))$ over $j$ and $\mathrm{snr}_k$ the average signal-to-noise-ratio (SNR) related to $\{\mathrm{LLR}(x_k(j))\}$ in (3). From [1], $\mathrm{snr}_k$ can be expressed as a function:

$$ \mathrm{snr}_k = E\big(|x_k(j)|^2\big) / E\big(v_{\zeta,k}(j)\big) \equiv \phi_k(v_1, \ldots, v_{k-1}, v_{k+1}, \ldots, v_K). \tag{4} $$

Now let the input-output relationship of DEC k be characterized by a function $v_k = \psi_k(\mathrm{snr}_k)$ that can be generated numerically. The behavior of the iterative process can be characterized by the following recursions (initialized to $v_k = 1$):

$$ \text{ESE: } \mathrm{snr}_k = \phi_k(v_1, \ldots, v_{k-1}, v_{k+1}, \ldots, v_K), \tag{5a} $$
$$ \text{DEC: } v_k = \psi_k(\mathrm{snr}_k). \tag{5b} $$

If the system is symmetric for all k, we can drop index k and use two common functions to characterize the behaviors of all users as:

$$ \text{ESE: } \mathrm{snr} = \phi(v), \tag{6a} $$
$$ \text{DEC: } v = \psi(\mathrm{snr}). \tag{6b} $$

In this case, the final performance is determined by the first fixed point of $\phi$ and $\psi$ as shown in Fig. 2.

III. IDMA DESIGN TECHNIQUES BASED ON EVOLUTION ANALYSIS
A. Matching Principle
For an LDPC or a turbo code, it is known that the overall performance can be optimized by matching the transfer functions of the two local decoders [2], [14]. Following this principle, a symmetric IDMA system can be optimized by matching $\phi$ and $\psi$ in (6). The matching condition is given by

$$ \psi(z) = \phi^{-1}(z), \tag{7} $$

where $\phi^{-1}$ is the inverse function of $\phi$. The analytical treatment of general situations is beyond the scope of this letter. We will only provide a derivation for AWGN channels and rely on simulation results for fading channels below.

B. Optimality of IDMA in AWGN Channels
Consider an AWGN channel in which $h_k = 1$ in (1) for all users. Assume the same code for all users. In this case, (6a) is given by

$$ \mathrm{snr} = \phi(v) = 1/\big((K - 1)v + \sigma^2\big). \tag{8} $$

Following the minimum mean square error (MMSE)-SNR relationship developed in [6], [8], and [15], the achievable rate of each user is given by

$$ R = \int_{0}^{+\infty} \mathrm{mmse}(\mathrm{snr})\, d\,\mathrm{snr}, \tag{9} $$

where mmse is the MMSE at the output of DEC k with snr at its input. Then, following [8], we have

$$ \mathrm{mmse}(\mathrm{snr}) = \big(\mathrm{snr} + (\psi(\mathrm{snr}))^{-1}\big)^{-1}, \tag{10} $$
which includes the contributions of both a priori information (related to snr) and extrinsic information (related to $\psi(\mathrm{snr})$). From (7) and (8), we have

$$ \psi(\mathrm{snr}) = \begin{cases} 1, & 0 \le \mathrm{snr} < 1/(K - 1 + \sigma^2), \\ \dfrac{1 - \sigma^2\,\mathrm{snr}}{(K - 1)\,\mathrm{snr}}, & 1/(K - 1 + \sigma^2) \le \mathrm{snr} \le 1/\sigma^2. \end{cases} \tag{11} $$

Here the first equation in (11) results from the constraint $\mathrm{Var}(x_k(j)) \le 1$ so that $\psi(\mathrm{snr}) \le 1$.

Fig. 2. An example of evolution trajectory.

Substituting (10) and (11) into (9) and with some straightforward manipulations, we have the achievable rate per user

$$ R = K^{-1} \log(1 + K/\sigma^2). \tag{12} $$

For K users, the sum-rate is

$$ R_{sum} = KR = \log(1 + K/\sigma^2), \tag{13} $$

which achieves the K-user channel capacity. This demonstrates the optimality of IDMA under perfect matching.
Equation (13) extends the conclusions in [8] from single-user systems to multiuser ones in AWGN channels. With practical coding constraints, it can be difficult to design perfectly matched $\phi$ and $\psi$. Also, the situation is much more complicated if the received powers are random variables due to fading. Detailed analytical discussions on the matching principle for fading channels are beyond the scope of this letter. Nevertheless, the matching condition in (7) suggests a way for performance optimization. Below, we will discuss some empirical results along this direction.

C. Design Strategies Based on the Matching Principle
The following are some strategies to shape either $\phi$ or $\psi$, so as to improve matching.
1) A standard approach is to shape $\psi$ by adjusting the degree polynomials for an LDPC code [2], [9]. This method requires different code structures for different situations, which may cause difficulty in hardware implementation. In this letter, we will examine options with fixed codes.
2) Properly designed modulation and labeling methods can be used to shape $\psi$ in the high rate region. In particular, with superposition coded modulation (SCM), two or more layers of modulated codes (with, say, QPSK modulation) are scaled to different power levels and then linearly superimposed, as demonstrated in [16].
3) Other simple options include zero padding (ZP) and repetition coding. With ZP, some users are silent on some symbols. Then (8) becomes

$$ \mathrm{snr} = \phi(v) = 1/\big((K' - 1)v + \sigma^2\big), \tag{14} $$

where $K'$ is the number of nonzero symbols transmitted simultaneously (assuming that the ratio of zero symbols is uniform over all resources). ZP can shape $\phi$ to a certain extent and can also lead to reduced multiuser detection complexity. Repetition coding does not provide coding gain. However, it can be useful in multiuser systems to shape $\psi$.

IV. NUMERICAL RESULTS
We follow the 3GPP NR requirement [17] of a frame length of 864 symbols with $J_{info} = 616$ information bits per user. We consider AWGN first. Fig. 3 shows the $\phi$ and $\psi$ functions in AWGN channels. $\phi$ is the ESE function for $K = 4$. $\psi^{(1)}$ is the DEC function of the 3GPP NR LDPC code with QPSK modulation. We can clearly see poor matching between $\psi^{(1)}$ and $\phi$. $\psi^{(2)}$ is the DEC function of a two-layer SCM scheme with the 3GPP NR LDPC code followed by rate-1/4 repetition coding. (The power levels of the two SCM layers are obtained by exhaustive search.) We can see a staircase shape of $\psi^{(2)}$, which is due to the different convergence behaviors of different SCM layers with different powers: the higher-power layer leads the convergence in the first stage (see Fig. 3) and then the lower-power layer follows in the second stage. Clearly, such a staircase shape improves the matching between $\psi^{(2)}$ and $\phi$.

Fig. 3. Evolution functions for the 3GPP NR LDPC coded IDMA system. $\sigma^2$ is set to −9 dB. Power levels of the two SCM layers are 0.2 and 0.8 for $\psi^{(2)}$.

Fig. 4 shows the frame error rates (FERs) of different schemes in a 4-user AWGN system. Schemes 1 and 2 correspond to $\psi^{(1)}$ and $\psi^{(2)}$ in Fig. 3 respectively. We can see that scheme 1 does not work at all, while scheme 2 works well due to better matching. Let $M$ be the number of SCM layers. Although increasing $M$ may improve matching in theory, there is a limit in practice since the value of $M$ will influence the coding gain of each layer under a given $J_{info}$. We observed that, for $J_{info} = 616$, $M = 2$ represents the best compromise.
We now consider fading channels, in which the fluctuations of channel gains provide more diversity to support more users. ZP can be used to reduce receiver complexity. A cyclic shifted ZP (CSZP) scheme is shown in Fig. 5. Each user is active only in half of a frame (i.e., with 50% ZP) and the active half of a frame for user k is cyclically shifted by an amount of $(k - 1)J/K$ where $J$ is the frame length. Assume
that a random interleaver is used in the underlying LDPC code. The user-specific shifting in Fig. 5 provides an equivalent realization for the user-specific interleaving in Fig. 1, since a shifted version of a random interleaver can be seen as a different random interleaver [11]. Fig. 5 also involves a two-layer SCM scheme. To distinguish the two layers of each user, each layer is further independently and cyclically shifted within its active frame duration. Such a shifting technique, both user-specific and layer-specific, significantly reduces the hardware implementation cost for interleavers in IDMA.

Fig. 4. IDMA in AWGN channels. $K = 4$. SNR per user = $1/\sigma^2$. Sum-rate = $KJ_{info}/864 = 2.85$.

Fig. 5. An example of CSZP. $K = 4$. Two SCM layers are used.

Fig. 6 shows the FERs with SCM and CSZP in the TDL-C fading channels [17]. Two antennas are assumed at the receiver. We consider the following two settings:
• Case 1: $J_{info} = 616$, $K = 10$ and sum-rate = 7.13;
• Case 2: $J_{info} = 496$, $K = 12$ and sum-rate = 6.89.
The power levels of the two SCM layers given in the caption of Fig. 6 are obtained by exhaustive search. Iterative linear MMSE detection [11] is used at the receiver. As a reference, Fig. 6 also includes the performance of sparse code multiple access (SCMA) [17]. The sparse codebook in SCMA is an equivalent form of ZP in IDMA. The performance advantage of IDMA over SCMA is mainly attributed to the freedom of curve shaping using SCM.

Fig. 6. IDMA vs. SCMA in TDL-C channels. 3GPP NR LDPC code is used for both IDMA and SCMA. IDMA involves SCM (with power levels of 0.865 and 0.135 for both cases) and CSZP. Performances of SCMA are from [17].

In conclusion, in this letter we examined several design techniques for IDMA following the transfer function matching principle. We showed that IDMA is capacity approaching in AWGN channels under perfect matching. We also showed by simulations that the matching techniques for IDMA can provide impressive performance in fading channels. Some of the software used in this letter is available at the following site: http://www.ee.cityu.edu.hk/%7Eliping/Research/Simulationpackage/.

REFERENCES
[1] L. Ping, L. Liu, K. Wu, and W. K. Leung, “Interleave division multiple-access,” IEEE Trans. Wireless Commun., vol. 5, no. 4, pp. 938–947, Apr. 2006.
[2] T. Richardson and R. Urbanke, Modern Coding Theory. Cambridge,
[3] LLS Results for RDMA, GOCA, RSMA and IDMA Schemes, document R1-167536, 3GPP TSG RAN WG1 Meeting #86, MediaTek Inc., Gothenburg, Sweden, Aug. 2016.
[4] On the Performance of IDMA and RSMA Multiple Access Schemes, document R1-167561, 3GPP TSG RAN WG1 Meeting #86, InterDigit. Commun., Gothenburg, Sweden, Aug. 2016.
[5] C. Yan, Z. Yuan, W. Li, and Y. Yuan, “Non-orthogonal multiple access schemes for 5G,” ZTE Commun., vol. 14, no. 4, pp. 11–16, Oct. 2016.
[6] K. Bhattad and K. R. Narayanan, “An MSE-based transfer chart for analyzing iterative decoding schemes using a Gaussian approximation,” IEEE Trans. Inf. Theory, vol. 53, no. 1, pp. 22–38, Jan. 2007.
[7] K. Wu, K. Anwar, and T. Matsumoto, “BICM-ID-based IDMA using extended mapping,” IEICE Trans. Commun., vols. E97–B, no. 7, pp. 1483–1492, Jul. 2014.
[8] X. Yuan, L. Ping, C. Xu, and A. Kavcic, “Achievable rates of MIMO systems with linear precoding and iterative LMMSE detection,” IEEE Trans. Inf. Theory, vol. 60, no. 11, pp. 7073–7089, Nov. 2014.
[9] Y. Zhang, K. Peng, and J. Song, “Enhanced IDMA with rate-compatible raptor-like quasi-cyclic LDPC code for 5G,” in Proc. IEEE GC Wkshps, Singapore, Dec. 2017, pp. 1–6.
[10] Technical Specification Group Radio Access Network; NR; Multiplexing and Channel Coding (Release 15), V2.0.0, 3GPP Standard TS 38.212, Dec. 2017.
[11] Y. Hu, C. Liang, J. Hu, and L. Ping, “Low-cost implementation techniques for interleave division multiple access,” IEEE Wireless Commun. Lett., to be published. [Online]. Available: https://ieeexplore.ieee.org/abstract/document/8398463
[12] R. Zhang, L. Xu, S. Chen, and L. Hanzo, “Repeat accumulate code division multiple access and its hybrid detection,” in Proc. IEEE ICC, Beijing, China, May 2008, pp. 4790–4794.
[13] C. Berrou, A. Glavieux, and P. Thitimajshima, “Near Shannon limit error-correcting coding and decoding: Turbo-codes. 1,” in Proc. IEEE ICC, vol. 2, Geneva, Switzerland, May 1993, pp. 1064–1070.
[14] S. T. Brink, “Convergence behavior of iteratively decoded parallel concatenated codes,” IEEE Trans. Commun., vol. 49, no. 10, pp. 1727–1737, Oct. 2001.
[15] D. Guo, S. Shamai, and S. Verdú, “Mutual information and minimum mean-square error in Gaussian channels,” IEEE Trans. Inf. Theory, vol. 51, no. 4, pp. 1261–1282, Apr. 2005.
[16] L. Ping, J. Tong, X. Yuan, and Q. Guo, “Superposition coded modulation and iterative linear MMSE detection,” IEEE J. Sel. Areas Commun., vol. 27, no. 6, pp. 995–1004, Aug. 2009.
[17] Preliminary LLS Evaluation Results for mMTC Scenario, document R1-1803665, 3GPP TSG RAN WG1 Meeting #92bis, Huawei, HiSilicon, Shenzhen, China, Apr. 2018.
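As a numerical sanity check on the AWGN result of the IDMA letter above, the following sketch integrates the MMSE expression (10) under the matched transfer function (11) and compares the outcome with the closed form (12). It is only an illustration under stated assumptions: the function names, the midpoint integration rule, the step count, and the example values K = 4 and σ² = −9 dB are choices made here and are not taken from the letter; the logarithm in (12) is read as the natural logarithm, so rates are in nats.

```python
import math


def psi_matched(snr, K, sigma2):
    """Matched DEC transfer function psi(snr) from (11); assumes K >= 2."""
    if snr < 1.0 / (K - 1 + sigma2):
        return 1.0  # variance is capped at 1 (unit-power constellation)
    if snr <= 1.0 / sigma2:
        return (1.0 - sigma2 * snr) / ((K - 1) * snr)  # inverse of phi in (8)
    return 0.0  # beyond 1/sigma2 the matched curve has reached zero variance


def mmse(snr, K, sigma2):
    """MMSE in (10): (snr + 1/psi(snr))^-1, taken as 0 when psi is 0."""
    v = psi_matched(snr, K, sigma2)
    if v <= 0.0:
        return 0.0
    return 1.0 / (snr + 1.0 / v)


def achievable_rate_numeric(K, sigma2, steps=100_000):
    """Approximate the integral (9) over [0, 1/sigma2] with the midpoint rule."""
    upper = 1.0 / sigma2  # the integrand vanishes above this point
    h = upper / steps
    return h * sum(mmse((i + 0.5) * h, K, sigma2) for i in range(steps))


def achievable_rate_closed_form(K, sigma2):
    """Per-user rate (12), in nats."""
    return math.log(1.0 + K / sigma2) / K


if __name__ == "__main__":
    K = 4                      # illustrative number of users
    sigma2 = 10 ** (-9 / 10)   # illustrative noise variance of -9 dB
    print("numeric    :", achievable_rate_numeric(K, sigma2))
    print("closed form:", achievable_rate_closed_form(K, sigma2))
```

The two printed values should agree closely, and multiplying the per-user rate by K gives the sum-rate in (13).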
To Establish a Secure Channel From a Full-Duplex Transmitter to a Half-Duplex

Receiver: An Artificial-Noise-Aided Scheme
Xinyue Hu, Caihong Kai, Shengli Zhang, Zhongyi Guo, and Jun Gao
Manuscript received September 10, 2018; revised October 9, 2018; accepted October 11, 2018. Date of publication October 17, 2018; date of current version April 9, 2019. This work was supported by the National Natural Science Foundation of China under Grant 61571178 and Grant 61771315. The associate editor coordinating the review of this paper and approving it for publication was X. Zhou. (Corresponding author: Caihong Kai.)
X. Hu and C. Kai are with the School of Computer Science and Information Engineering, Hefei University of Technology, Hefei 230009, China, and also with the Anhui Province Key Laboratory of Industry Safety and Emergency Technology, Hefei 230601, China (e-mail: huxinyue@mail.hfut.edu.cn; chkai@hfut.edu.cn).
S. Zhang is with the School of Information Engineering, Shenzhen University, Shenzhen 516080, China (e-mail: zsl@szu.edu.cn).
Z. Guo and J. Gao are with the School of Computer Science and Information Engineering, Hefei University of Technology, Hefei 230009, China (e-mail: guozhongyi@hfut.edu.cn; gaojun@hfut.edu.cn).
Digital Object Identifier 10.1109/LWC.2018.2876540

Abstract—This letter considers a new secure communication scenario where a full-duplex transmitter (Alan) needs to transmit confidential information to a half-duplex receiver (Bob), with an eavesdropper (Eve) that tries to overhear the confidential information. To realize secure communication, we design an effective artificial-noise-aided secure scheme (ANAS) which is composed of two transmission phases: in Phase 1, Alan and Bob transmit two independent artificial noises (ANs) simultaneously, while in Phase 2, Alan superimposes the AN received in Phase 1 with its confidential signal and sends out the mixed signal. Since the AN superimposed by Alan in Phase 2 can be effectively cancelled by Bob while it remains an interference to Eve, a secrecy rate can then be achieved. Importantly, we derive the approximate closed-form solutions of the average secrecy rate and secrecy outage probability of ANAS under a Rayleigh block-fading channel. Numerical results show that the secrecy rate of ANAS is about twice that of the benchmark scheme, even though in ANAS half of the time is used to transmit ANs.

Index Terms—Physical-layer security, artificial noise, secrecy rate, outage probability, full-duplex.

I. INTRODUCTION
Recently, with the progress of full-duplex (FD) communications, the adoption of the FD technique to realize secure communications at the physical layer has been widely studied [1]–[8]. However, most prior work considered a scenario in which either the relay or the receiver has the FD communication capability, and then the relay or receiver could receive the information signals and transmit artificial noises (AN) simultaneously to confuse the eavesdropper.
It is important to note that in practical downlink transmissions, the transmitter (e.g., the base station in a cellular network or the access point in a WLAN) usually has more powerful capabilities (e.g., supporting the FD mode) than the user devices. Thus, for providing secure downlink services it is also nontrivial to establish a secure channel from an FD transmitter to a half-duplex (HD) receiver. Observing that most prior schemes such as [1]–[8] cannot be applied to the downlink communication scenario, in this letter we design an effective artificial-noise-aided secure scheme (ANAS) for the practical secure communication scenario where an FD transmitter (Alan) needs to transmit confidential information to an HD receiver (Bob), with an eavesdropper (Eve) that tries to overhear the information.
Specifically, an effective two-phase ANAS scheme is proposed: in Phase 1, Alan transmits an AN $n_A$ and Bob sends another independent AN $n_B$ simultaneously, while in Phase 2, Alan superimposes the signal received in Phase 1 with the confidential signal and sends it to Bob. After Phase 2, since Bob knows $n_B$, it can cancel $n_B$ from the received signal. By contrast, in Phase 1 Eve only receives the mixture of $n_B$ and $n_A$, and thus cannot cancel $n_B$ from the signal received in Phase 2, so it suffers more severe interference than Bob. By doing so, a secure communication channel can be established. As can be seen, our ANAS scheme is simple to implement and has no risk of the AN-leakage problem in which the injected AN could be overheard by the eavesdropper [9], [10].
Importantly, we move forward to analyze the average secrecy rate and the secrecy outage probability of ANAS under a Rayleigh block-fading channel. Through Gauss-Laguerre quadrature, the approximate closed-form expressions of the average secrecy rate and the secrecy outage probability are derived. Numerical results show that ANAS can achieve good secrecy performance in terms of both secrecy rate and outage probability, and thus provides a good solution to secure communication from an FD transmitter to an HD receiver.

II. SYSTEM MODEL AND THE ANAS SCHEME
A. System Model
We consider a wireless communication system with three communication nodes: a transmitter with FD capability (Alan), an HD intended receiver (Bob) and an HD eavesdropper (Eve). We assume all nodes are equipped with a single antenna. In the system, Alan needs to send confidential information to Bob, and Eve tries to eavesdrop on the information.
Let $h_1$ and $h_3$ be the channel coefficients of the main channel (between Alan and Bob) and the eavesdropper channel (between Alan and Eve), respectively, $h_2$ be the channel coefficient between Bob and Eve, and $h_{SI}$ be the residual self-interference (RSI) channel coefficient of Alan, which is modeled as Rayleigh block-fading [8]. We assume the reciprocity of the forward and backward channels and a Rayleigh block-fading channel model where each channel coefficient remains unchanged for a time-slot duration of $T$ seconds. In the system, since Alan and Bob are active nodes, by listening to the pilot signal Alan (Bob) transmits, Bob (Alan) could obtain $h_1$, and similarly Eve could obtain $h_2$ and $h_3$. However, whether $h_2$ and $h_3$ can be obtained by Alan and Bob depends on whether Eve is a silent node.

B. The ANAS Scheme
Phase 1: Bob sends an artificial Gaussian noise $\sqrt{P_B}\, n_B$ to Alan and Alan synchronously sends another independent
HU et al.: TO ESTABLISH SECURE CHANNEL FROM FD TRANSMITTER TO HD RECEIVER: ANAS 481
√
Gaussian noises PA1 nA to prevent Eve from over-
artificial√ B. The Rate of the Eavesdropper Channel
hearing PB nB .1 We have nB , nA ∼CN (0, 1), PB and PA1 For achieving higher achievable rate, we assume Eve is
are the transmit power of Bob and Alan. smart enough and has two methods to jointly process yE 1
After Phase 1,2 denote the signals received by Alan and Eve in (2) and yE 2 in (5): 1) maximum interference cancellation;
by yA and yE 1 , respectively. We have
2) joint decoding. The details of the two methods are given
yA = h1 PB nB + hSI PA1 nA + nA1 , (1) as follows:
1) Maximum Interference Cancellation: Eve uses yE 1 in
yE 1 = h2 PB nB + h3 PA1 nA + nE 1 , (2)
(2) to reduce the interference in yE 2 as much as possible.
where nA1 and nE 1 are the additive white Gaussian
√ noises Specifically, Eve multiplies yE 1 with a complex coefficient
(AWGN) with variance N0 and the term hSI PA1 nA in (1) hx , then removes hx yE 1 from yE 2 . After cancelling the
is introduced by the RSI. √ interference, the signal obtained by Eve is
Phase 2: Instead of decoding PB nB 3 from √ yA , Alan
superimposes its confidential information signal Ps sA with yES = h3 Ps sA + βh3 h1 PB nB + βh3 nA1 + nE 2
yA , where sA ∼CN (0, 1) and Ps is the transmit power of
sA . We assume sA and yA have the same length in the time N

domain. The superimposed signal is then − hx h2 PB nB + h3 PA1 nA + nE 1 .
(8)
xA = Ps sA + βyA , (3)
hx yE 1
Δ 2 2
where β = PA2 /(|h1 | PB + |hSI | PA1 + N0 ). As can be
seen in (3), the transmit power of $y_A$ is scaled to a predetermined power $P_{A2}$ by $\beta$. For synchronization, Alan then adds a pilot signal ahead of $x_A$ and sends it to Bob. Denote the received signals at Bob and Eve in Phase 2 by $y_B$ and $y_{E2}$. We have

$$y_B = h_1\sqrt{P_s}\,s_A + \beta h_1 h_1\sqrt{P_B}\,n_B + \beta h_1 h_{SI}\sqrt{P_{A1}}\,n_A + \beta h_1 n_{A1} + n_{B2}, \tag{4}$$

$$y_{E2} = h_3\sqrt{P_s}\,s_A + \beta h_3 h_1\sqrt{P_B}\,n_B + \beta h_3 h_{SI}\sqrt{P_{A1}}\,n_A + \beta h_3 n_{A1} + n_{E2}, \tag{5}$$

where $n_{B2}$ and $n_{E2}$ are AWGNs with variance $N_0$. Following [4], we assume the adopted FD technique could well suppress the self-interference (i.e., $|h_{SI}|^2$ is very small)⁴ and we neglect the term $\beta h_3 h_{SI}\sqrt{P_{A1}}\,n_A$ in the rate analysis. Also, since $|h_1|^2 P_B$ is much larger than $|h_{SI}|^2 P_{A1}$ and $N_0$ in most cases, to ease analysis, we assume $\beta \approx \sqrt{P_{A2}/(|h_1|^2 P_B)}$.

Since Bob knows $\sqrt{P_B}\,n_B$, it could cancel the term $\beta h_1 h_1\sqrt{P_B}\,n_B$ from $y_B$. By contrast, it is difficult for Eve to detect $\sqrt{P_B}\,n_B$ from $y_{E1}$ because of the interference of $\sqrt{P_{A1}}\,n_A$. Thus, Eve cannot cancel the $\beta h_3 h_1\sqrt{P_B}\,n_B$ term from $y_{E2}$ and therefore suffers more interference than Bob; ANAS thus builds up a secure channel from Alan to Bob.

III. ANALYSIS ON SECRECY RATE AND OUTAGE PROBABILITY

A. The Rate of the Main Channel

Since Bob knows $\sqrt{P_B}\,n_B$ and $h_1$, it can cancel the interference term $\beta h_1 h_1\sqrt{P_B}\,n_B$ from $y_B$ and obtain $y_{BS}$⁵

$$y_{BS} = h_1\sqrt{P_s}\,s_A + \beta h_1 h_{SI}\sqrt{P_{A1}}\,n_A + \beta h_1 n_{A1} + n_{B2}. \tag{6}$$

Let $g_1 = |h_1|^2$, $g_2 = |h_2|^2$, $g_3 = |h_3|^2$ and $g_{SI} = |h_{SI}|^2$. The achievable rate of Bob is then

$$R_B = \log_2\left(1 + \frac{g_1 P_s}{\frac{P_{A2}}{P_B} g_{SI} P_{A1} + \frac{P_{A2}}{P_B} N_0 + N_0}\right). \tag{7}$$

¹ It is important to note that $\sqrt{P_B}\,n_B$ and $\sqrt{P_{A1}}\,n_A$ are only known by Bob and Alan, respectively.
² For synchronization, in Phase 1, a pilot signal is added ahead of $\sqrt{P_B}\,n_B$.
³ After Phase 1, neither Alan nor Eve will try to decode $\sqrt{P_B}\,n_B$. Instead, Alan uses (1) to randomize the confidential information, and Eve records (2) for cancelling the interference as much as possible.
⁴ We note that compared with $\beta h_3 h_1\sqrt{P_B}\,n_B$, $\beta h_3 h_{SI}\sqrt{P_{A1}}\,n_A$ has little effect on $y_{E2}$ in (5).
⁵ We cannot neglect the term $h_1 h_{SI}\sqrt{P_{A1}}\,n_A$ in (6), since its power may be at the same level as the power of $h_1 n_{A1}$.

As for how $h_x$ is determined to maximize the signal-to-interference-and-noise ratio (SINR) of Eve, we have the following lemma:

Lemma 1: To achieve the highest SINR, Eve can select⁶

$$\hat{h}_x = \frac{\beta P_B |h_1 h_2 h_3|}{g_2 P_B + g_3 P_{A1} + N_0}\, e^{(\theta_1 + \theta_3 - \theta_2)i} \tag{9}$$

in which $\theta_1$, $\theta_2$ and $\theta_3$ are the phases of $h_1$, $h_2$ and $h_3$, respectively.

Proof: For achieving the highest SINR, $\hat{h}_x$ must be

$$\hat{h}_x = \arg\max_{h_x} \mathrm{SINR}_{y_{ES}}, \tag{10}$$

in which

$$\mathrm{SINR}_{y_{ES}} = \frac{g_3 P_s}{\underbrace{\mathrm{Var}(N - h_x y_{E1})}_{E_N}}, \tag{11}$$

and we have

$$E_N = |\beta h_3 h_1 - h_x h_2|^2 P_B + |h_x|^2 g_3 P_{A1} + |h_x|^2 N_0 + \beta^2 g_3 N_0 + N_0. \tag{12}$$

From (11), we have $\hat{h}_x = \arg\min_{h_x}(E_N)$. It is easy to see that when the phase of $\beta h_3 h_1$ is equal to that of $h_x h_2$, the $|\beta h_3 h_1 - h_x h_2|$ term in (12) is the smallest. That is, $\hat{\theta}_x = \theta_1 + \theta_3 - \theta_2$.

After $\hat{\theta}_x$ is determined, we can write $E_N$ as

$$E_N = (g_2 P_B + g_3 P_{A1} + N_0)|h_x|^2 - 2\beta |h_1||h_2||h_3| P_B |h_x| + \beta^2 g_1 g_3 P_B + \beta^2 g_3 N_0 + N_0, \tag{13}$$

which is a convex function of $|h_x|$, and it is easy to verify that $|\hat{h}_x| = \frac{\beta P_B |h_1 h_2 h_3|}{g_2 P_B + g_3 P_{A1} + N_0}$.

To ease expression, we define $M$ as

$$M = \frac{P_B g_2 g_3}{g_2 P_B + g_3 P_{A1} + N_0}. \tag{14}$$

By substituting $\hat{h}_x$ into (8), the achievable rate of Eve is

$$R_E = \log_2\left(1 + \frac{g_3 P_s}{\beta^2 g_3 N_0 + N_0 + P_{A2} g_3 - P_{A2} M}\right). \tag{15}$$

⁶ Eve could estimate $h_1$ by correlating $y_{E1}$ and $y_{E2}$, and obtain $\hat{h}_x$.
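The closed form in Lemma 1 can be checked numerically. The following is a minimal sketch, not the authors' code: it assumes the powers $P_B = P_{A1} = P_{A2} = 200$ and $N_0 = 1$ used later in the numerical results, draws one random channel realization, and uses illustrative function names (`interference_power`, `optimal_hx`, `cn01`) that do not appear in the letter. It evaluates $E_N$ from (12) at the $\hat{h}_x$ of (9), compares it with the denominator of (15), and confirms that randomly perturbed choices of $h_x$ do not give a smaller $E_N$.

```python
import cmath
import math
import random


def interference_power(hx, h1, h2, h3, beta, PB, PA1, N0):
    """E_N from (12): residual interference-plus-noise power at Eve."""
    g3 = abs(h3) ** 2
    return (abs(beta * h3 * h1 - hx * h2) ** 2 * PB
            + abs(hx) ** 2 * (g3 * PA1 + N0)
            + beta ** 2 * g3 * N0 + N0)


def optimal_hx(h1, h2, h3, beta, PB, PA1, N0):
    """Closed form (9): magnitude from (13), phase theta1 + theta3 - theta2."""
    g2, g3 = abs(h2) ** 2, abs(h3) ** 2
    magnitude = beta * PB * abs(h1 * h2 * h3) / (g2 * PB + g3 * PA1 + N0)
    phase = cmath.phase(h1) + cmath.phase(h3) - cmath.phase(h2)
    return cmath.rect(magnitude, phase)


def cn01(rng):
    """One sample of a CN(0, 1) channel coefficient."""
    return complex(rng.gauss(0, 1), rng.gauss(0, 1)) / math.sqrt(2)


rng = random.Random(0)  # fixed seed, chosen only for reproducibility
h1, h2, h3 = cn01(rng), cn01(rng), cn01(rng)
PB = PA1 = PA2 = 200.0  # assumed powers (normalized to N0 = 1)
N0 = 1.0
g1, g2, g3 = abs(h1) ** 2, abs(h2) ** 2, abs(h3) ** 2
beta = math.sqrt(PA2 / (g1 * PB))  # the approximation of beta used in the text

hx_hat = optimal_hx(h1, h2, h3, beta, PB, PA1, N0)
en_min = interference_power(hx_hat, h1, h2, h3, beta, PB, PA1, N0)

# Denominator of (15), with M taken from (14).
M = PB * g2 * g3 / (g2 * PB + g3 * PA1 + N0)
en_closed = beta ** 2 * g3 * N0 + N0 + PA2 * g3 - PA2 * M

# E_N at randomly perturbed h_x; none should fall below en_min.
perturbed = [
    interference_power(hx_hat + 0.05 * cn01(rng), h1, h2, h3, beta, PB, PA1, N0)
    for _ in range(1000)
]
print(en_min, en_closed, min(perturbed))
```

Because $E_N$ is a strictly convex quadratic in the complex variable $h_x$, the minimizer is unique, which is why the perturbed values all exceed `en_min`.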
2) Joint Decoding: We write the two observations $y_{E1}$ and $y_{E2}$ at Eve as a signal vector, and the received signal vector at Eve is

$$\mathbf{y}_E = \begin{bmatrix} y_{E1} \\ y_{E2} \end{bmatrix} = \begin{bmatrix} 0 \\ h_3 \end{bmatrix}\sqrt{P_s}\,s_A + \begin{bmatrix} h_2\sqrt{P_B}\,n_B + h_3\sqrt{P_{A1}}\,n_A + n_{E1} \\ \beta h_1 h_3\sqrt{P_B}\,n_B + \beta h_3 n_{A1} + n_{E2} \end{bmatrix}. \tag{16}$$

Eqn. (16) is a MIMO Gaussian channel model. The information covariance matrix of (16) is

$$\mathbf{C}_{info} = \begin{bmatrix} 0 & 0 \\ 0 & g_3 P_s \end{bmatrix}, \tag{17}$$

and the interference plus noise covariance matrix of (16) is

$$\mathbf{C}_{ipn} = \begin{bmatrix} g_2 P_B + g_3 P_{A1} + N_0 & \beta h_1^* h_2 h_3^* P_B \\ \beta h_1 h_2^* h_3 P_B & \beta^2 g_1 g_3 P_B + \beta^2 g_3 N_0 + N_0 \end{bmatrix}. \tag{18}$$

Consequently, the achievable rate of Eve is given by

$$R_{E2} = \log_2 \det\left(\mathbf{I}_2 + \mathbf{C}_{info}\mathbf{C}_{ipn}^{-1}\right). \tag{19}$$

Remark 1: It is interesting and also important to note that after further determinant operation, the right hand side of (19) and that of (15) are exactly the same. That is, no matter which method Eve adopts, it will obtain the same overhearing capability.

C. Average Secrecy Rate and Secrecy Outage Probability

In this section, we analyze the average secrecy rate and the secrecy outage probability⁷ of ANAS. Since the channels are Rayleigh fading, the PDF (Probability Density Function) of $g_j$, $j \in \{1, 2, 3, SI\}$ is

$$P(g_j) = \frac{1}{\sigma_j^2} e^{-\frac{g_j}{\sigma_j^2}}, \quad j \in \{1, 2, 3, SI\}, \tag{20}$$

where $\sigma_j^2$ is the expectation of $g_j$. According to [11], the instantaneous achievable secrecy rate is

$$R_S(g_1, g_2, g_3, g_{SI}) = [R_B - R_E]^+. \tag{21}$$

To achieve an expected secrecy rate $r_s$ (i.e., $R_B - R_E \ge r_s$), with some mathematical derivations, we could obtain

$$g_1 \ge g_{1L}(g_2, g_3, g_{SI}, r_s) \triangleq \frac{-B + \sqrt{B^2 - 4AC}}{2A}, \tag{22}$$

in which

$$A = P_s (N_0 + P_{A2} g_3 - P_{A2} M), \tag{23}$$

$$B = \left[(1 - 2^{r_s})(N_0 + P_{A2} g_3 - P_{A2} M) - 2^{r_s} P_s g_3\right] \cdot (P_{A2} g_{SI} P_{A1}/P_B + P_{A2} N_0/P_B + N_0) + P_s P_{A2} g_3 N_0/P_B, \tag{24}$$

and

$$C = (1 - 2^{r_s})(P_{A2} g_{SI} P_{A1}/P_B + P_{A2} N_0/P_B + N_0) \cdot (N_0 + P_{A2} g_3 - P_{A2} M)\, g_3 N_0. \tag{25}$$

That is, $g_{1L}(g_2, g_3, g_{SI}, r_s)$ defined in (22) is the lower bound of $g_1$ to guarantee the achievable secrecy rate is equal or greater than an expected secrecy rate $r_s$. For expressing briefly, we use $g_{1L}$ instead of $g_{1L}(g_2, g_3, g_{SI}, r_s)$ in the rest of this letter.

⁷ Alan needs to know $g_1$, $g_2$, $g_3$ and $g_{SI}$ for determining the instantaneous secrecy rate $R_S$ calculated by (21) in each time block and then obtain the average secrecy rate $E(R_S)$ over a long period of time. In practical systems, if Eve is an active node, Alan and Bob could estimate $g_3$ and $g_2$ respectively, and Bob could periodically feedback $g_2$ to Alan. Also, Alan could estimate $g_{SI}$ by sensing the received power. However, if Eve keeps silent, Alan could hardly obtain $g_3$ and $g_2$. In this case, giving an expected secrecy rate $r_s$, we then compute the secrecy outage probability $P_{out}(r_s)$.

Lemma 2: The average secrecy rate and the secrecy outage probability of ANAS can be approximately computed by

$$E(R_S) \approx 0.5 \sum_{i=1}^{n}\sum_{j=1}^{n}\sum_{k=1}^{n}\sum_{l=1}^{n} \omega_i \omega_j \omega_k \omega_l\, e^{-\frac{\sigma_1^2 x_i + g_{1L}(\sigma_2^2 x_j,\, \sigma_3^2 x_k,\, \sigma_{SI}^2 x_l,\, 0)}{\sigma_1^2}} \cdot R_S\left(\sigma_1^2 x_i + g_{1L}\left(\sigma_2^2 x_j, \sigma_3^2 x_k, \sigma_{SI}^2 x_l, 0\right),\ \sigma_2^2 x_j,\ \sigma_3^2 x_k,\ \sigma_{SI}^2 x_l\right), \tag{26}$$

and

$$P_{out}(r_s) \approx 1 - \sum_{i=1}^{n}\sum_{j=1}^{n}\sum_{k=1}^{n} \omega_i \omega_j \omega_k\, e^{-\frac{g_{1L}(\sigma_2^2 x_i,\, \sigma_3^2 x_j,\, \sigma_{SI}^2 x_k,\, r_s)}{\sigma_1^2}}, \tag{27}$$

respectively, where $x_i$ is the $i$th root of the $n$-order Laguerre polynomial $L_n(x) = \frac{e^x}{n!}\frac{d^n}{dx^n}(e^{-x}x^n)$, the weight $\omega_i = \frac{x_i}{(n+1)^2 (L_{n+1}(x_i))^2}$ [12] and we set $n = 15$.

Proof: The average secrecy rate, $E(R_S)$ and the secrecy outage probability, $P_{out}(r_s) = 1 - P(g_1 \ge g_{1L})$ of ANAS can be computed by

$$E(R_S) = 0.5 \int_0^\infty\!\!\int_0^\infty\!\!\int_0^\infty\!\!\int_{g_{1L}}^\infty e^{-\frac{g_1}{\sigma_1^2}} e^{-\frac{g_2}{\sigma_2^2}} e^{-\frac{g_3}{\sigma_3^2}} e^{-\frac{g_{SI}}{\sigma_{SI}^2}} \cdot \frac{1}{\sigma_1^2 \sigma_2^2 \sigma_3^2 \sigma_{SI}^2} R_S(g_1, g_2, g_3, g_{SI})\, dg_1\, dg_2\, dg_3\, dg_{SI}$$

$$= 0.5 \int_0^\infty\!\!\int_0^\infty\!\!\int_0^\infty\!\!\int_0^\infty e^{-\frac{z}{\sigma_1^2}} e^{-\frac{g_2}{\sigma_2^2}} e^{-\frac{g_3}{\sigma_3^2}} e^{-\frac{g_{1L}}{\sigma_1^2}} e^{-\frac{g_{SI}}{\sigma_{SI}^2}} \cdot \frac{1}{\sigma_1^2 \sigma_2^2 \sigma_3^2 \sigma_{SI}^2} R_S(z + g_{1L}, g_2, g_3, g_{SI})\, dz\, dg_2\, dg_3\, dg_{SI}, \tag{28}$$

and

$$P_{out}(r_s) = 1 - \int_0^\infty\!\!\int_0^\infty\!\!\int_0^\infty e^{-\frac{g_2}{\sigma_2^2}} e^{-\frac{g_3}{\sigma_3^2}} e^{-\frac{g_{SI}}{\sigma_{SI}^2}} e^{-\frac{g_{1L}}{\sigma_1^2}} \frac{1}{\sigma_2^2 \sigma_3^2 \sigma_{SI}^2}\, dg_2\, dg_3\, dg_{SI}, \tag{29}$$

respectively. The factor 0.5 in (28) is due to that only half of the time is used for information transmission in the ANAS scheme.

Although the closed-form solutions of (28) and (29) are hard to obtain, we next give the approximate computations. A general method for approximating a definite integral is $\int_a^b \omega(x) f(x)\,dx \approx \sum_{i=1}^{n} \omega_i f(x_i)$. When $\omega(x)$ is in the form of $e^{-x}$ and $a = 0$, $b = \infty$, the closed-form solution to the integral can be found using the Gauss Laguerre quadrature (GLQ) [12] (more details can be found in [12, Sec. 25.4.45]). By rewriting (28) and (29) into the form of GLQ, the closed-form solutions can be approximately given by (26) and (27), respectively.

D. An ON/OFF Scheme for Reducing Outage Probability

Since $g_2$ and $g_3$ could be unknown to Alan, we next design a simple ON/OFF scheme similar to [13] to further reduce the outage probability. Specifically, we set a threshold as $g_{th}(r_s) = \min_{g_2, g_3, g_{SI}}(g_{1L})$ for a given $r_s$. Only when $g_1 \ge g_{th}(r_s)$, Alan transmits confidential information to Bob at the rate of $r_s$. The threshold is the required minimum channel gain of the main channel under which Alan has chances to achieve the secure rate $r_s$, regardless of $g_2$ and $g_3$. In other words, when $g_1 < g_{th}(r_s)$ Alan can never achieve the secrecy rate $r_s$ for any $g_2$ and $g_3$.

From (22), we can examine that $g_{1L}$ increases with $g_2$ and $g_{SI}$, and $\arg\min_{g_2, g_3, g_{SI}}(g_{1L})$ must be $(g_2 = 0, g_3, g_{SI} = 0)$. When $g_2 = 0$, we can prove that $g_{1L}$ increases with $g_3$, thus the minimum $g_{1L}$ is $g_{1L}(0, 0, 0, r_s)$. However, $g_3 = 0$ is equivalent to that Eve does not exist in the system, which does not make sense. Fortunately, although $g_{1L}$ increases with
HU et al.: TO ESTABLISH SECURE CHANNEL FROM FD TRANSMITTER TO HD RECEIVER: ANAS 483
$g_3$, there is an upper bound when $g_3 \to \infty$, and thus we set $g_{th}(r_s) = \lim_{g_3 \to \infty} g_{1L}(0, g_3, 0, r_s)$ as the threshold of the ON/OFF scheme. The outage probability with the ON/OFF scheme can be approximately calculated by the Bayes' formula

$$P_{out}^{ON/OFF}(r_s) = P(g_1 \le g_{1L} \mid g_1 \ge g_{th}(r_s)) \approx \frac{P_{out}(r_s) - (1 - P(g_1 \ge g_{th}(r_s)))}{P(g_1 \ge g_{th}(r_s))}. \tag{30}$$

IV. NUMERICAL RESULTS

We next present both numerical results and Monte Carlo simulations to evaluate our ANAS scheme. Monte Carlo simulations are based on $10^5$ independent channel realizations with $h_j \sim \mathcal{CN}(0, 1)$, $j \in \{1, 2, 3\}$. To serve as a benchmark, we give the average secrecy rate of the traditional scheme [14], in which Alan knows $g_1$ and $g_3$, and only when $g_1 > g_3$, Alan transmits confidential information to Bob with power $P_{bm}$. For fair comparison of the two schemes in terms of power consumption, we set $P_{bm}$ to be $P_s + P_{A1} + P_{A2} + P_B$.⁸ Then, the secrecy rate of the benchmark is $\log_2(1 + g_1 P_{bm}) - \log_2(1 + g_3 P_{bm})$.

Fig. 1 presents the secrecy capacities versus $P_s$ with respect to different $\sigma_{SI}^2$, in which $P_{A1} = P_{A2} = P_B = 200$,⁹ where the numerical results are computed by (28) and the GLQ approximate results are given by (26). As can be seen, our ANAS achieves a much higher average secrecy rate than the benchmark, even though in ANAS half of the time is used to transmit ANs. Also, our analytical results agree with the Monte Carlo simulations, and the GLQ approximate computation stays only about 0.02 bps/Hz below the numerical results, also revealing high accuracy. Moreover, as $\sigma_{SI}^2$ increases, the average secrecy rate decreases, since a larger $\sigma_{SI}^2$ corresponds to more severe residual interference in Alan's FD transmission. It is also worthwhile to note that when $P_s$ is small, ANAS is worse than the benchmark, because the ratio of the noise power caused by RSI to $P_s$ is large. However, as long as $P_s$ is not too small, our ANAS remains better than the benchmark.

Fig. 2 shows the outage probability with respect to $r_s$, where $P_s = 400$, $P_{AN} \triangleq P_{A1} = P_{A2} = P_B$ and $\sigma_{SI}^2 = 0.0001$. The Monte Carlo simulations validate our analysis and the GLQ approximate computation keeps high accuracy. It is straightforward to see that as $r_s$ increases, the outage probability increases. Furthermore, as the transmit power of ANs, $P_{AN}$, increases, the outage probability decreases, since the interference caused by ANs at Eve becomes more effective. Last but not least, in the ON/OFF scheme, since Alan does not transmit confidential information when $g_1$ is below the threshold as described in Section III-D, the outage probability can be further reduced.

Fig. 1. Average secrecy rate versus $P_s$ with different $\sigma_{SI}^2$, in which $P_{A1} = P_{A2} = P_B = 200$.

Fig. 2. Outage probability versus $r_s$ with different transmit power of ANs.

⁸ From Section II-B, by ignoring the RSI and AWGN power of Alan, the average transmit power of ANAS is $(P_s + P_{A1} + P_{A2} + P_B)/2$. In the benchmark we have $P(g_1 > g_3) = 0.5$. Thus, for fair comparison, we set $P_{bm}$ to be $P_s + P_{A1} + P_{A2} + P_B$.
⁹ The transmit powers are normalized with AWGN power $N_0 = 1$. That is, the transmit power $P$ means the transmit power is $10\log_{10} P$ dB higher than the AWGN power.

V. CONCLUSION

This letter has proposed a two-phase transmission scheme, ANAS, to achieve secure transmission from an FD transmitter to an HD receiver. The average secrecy rate and the outage probability of ANAS under Rayleigh fading channels have been analyzed, and numerical results verified ANAS's secrecy performance. We note that in this letter, we assume all the ANs are transmitted with the same power. As future work, it would be interesting to investigate the power allocation for transmitting ANs, by which the secrecy performance could be further improved.

REFERENCES

[1] T. Guo, B. Wang, W. Wu, and P. Deng, “Secrecy-oriented antenna assignment optimization at full-duplex receiver with self-interference,” IEEE Wireless Commun. Lett., vol. 7, no. 4, pp. 562–565, Aug. 2018.
[2] S. Yan, X. Zhou, N. Yang, T. D. Abhayapala, and A. L. Swindlehurst, “Secret channel training to enhance physical layer security with a full-duplex receiver,” IEEE Trans. Inf. Forensics Security, vol. 13, no. 11, pp. 2788–2800, Nov. 2018.
[3] C. Liu, L.-L. Yang, and W. Wang, “Secure spatial modulation with a full-duplex receiver,” IEEE Wireless Commun. Lett., vol. 6, no. 6, pp. 838–841, Dec. 2017.
[4] G. Chen, Y. Gong, P. Xiao, and J. A. Chambers, “Dual antenna selection in secure cognitive radio networks,” IEEE Trans. Veh. Technol., vol. 65, no. 10, pp. 7993–8002, Oct. 2016.
[5] B. Zhong and Z. Zhang, “Secure full-duplex two-way relaying networks with optimal relay selection,” IEEE Commun. Lett., vol. 21, no. 5, pp. 1123–1126, May 2017.
[6] T.-X. Zheng, H.-M. Wang, J. Yuan, Z. Han, and M. H. Lee, “Physical layer security in wireless ad hoc networks under a hybrid full-/half-duplex receiver deployment strategy,” IEEE Trans. Wireless Commun., vol. 16, no. 6, pp. 3827–3839, Jun. 2017.
[7] T.-X. Zheng, H.-M. Wang, Q. Yang, and M.-H. Lee, “Safeguarding decentralized wireless networks using full-duplex jamming receivers,” IEEE Trans. Wireless Commun., vol. 16, no. 1, pp. 278–292, Jan. 2017.
[8] B. V. Nguyen, H. Jung, and K. Kim, “Physical layer security schemes for full-duplex cooperative systems: State of the art and beyond,” IEEE Commun. Mag., to be published. [Online]. Available: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=8337816, doi: 10.1109/MCOM.2017.1700588.
[9] Y. Zou, “Physical-layer security for spectrum sharing systems,” IEEE Trans. Wireless Commun., vol. 16, no. 2, pp. 1319–1329, Feb. 2017.
[10] R. Zhang, L. Song, Z. Han, and B. Jiao, “Physical layer security for two-way untrusted relaying with friendly jammers,” IEEE Trans. Veh. Technol., vol. 61, no. 8, pp. 3693–3704, Oct. 2012.
[11] A. D. Wyner, “The wire-tap channel,” Bell Syst. Tech. J., vol. 54, no. 8, pp. 1355–1387, Oct. 1975.
[12] M. Abramowitz, I. Stegun, and A. Donald, Handbook of Mathematical Functions, New York, NY, USA: Dover, 1970.
[13] S. Yan, N. Yang, I. Land, R. Malaney, and J. Yuan, “Three artificial-noise-aided secure transmission schemes in wiretap channels,” IEEE Trans. Veh. Technol., vol. 67, no. 4, pp. 3669–3673, Apr. 2018.
[14] P. K. Gopala, L. Lai, and H. E. Gamal, “On the secrecy capacity of fading channels,” IEEE Trans. Inf. Theory, vol. 54, no. 10, pp. 4687–4698, Oct. 2008.
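As a numerical illustration of the Gauss-Laguerre quadrature (GLQ) used in Lemma 2 of the letter above, the sketch below builds the $n = 15$ nodes and weights from the stated formula $\omega_i = x_i/((n+1)^2 (L_{n+1}(x_i))^2)$ and applies them to a one-dimensional expectation over an exponentially distributed channel gain, as in (20). It is a simplified, single-integral example under stated assumptions and not the four-fold sum of (26). The root finder (bisection between the interlacing roots of lower-order Laguerre polynomials, with $4k + 3$ taken as an upper bracket for the largest root) and all function names are choices made here for illustration.

```python
import math


def laguerre(n, x):
    """Evaluate the Laguerre polynomial L_n(x) by the three-term recurrence."""
    prev, curr = 0.0, 1.0  # placeholder for L_{-1}, and L_0
    for j in range(n):
        prev, curr = curr, ((2 * j + 1 - x) * curr - j * prev) / (j + 1)
    return curr


def laguerre_roots(n, iters=100):
    """Roots of L_n, bracketed by the interlacing roots of L_{n-1}."""
    roots = []
    for k in range(1, n + 1):
        edges = [0.0] + roots + [4.0 * k + 3.0]  # assumed upper bracket
        new_roots = []
        for lo, hi in zip(edges[:-1], edges[1:]):
            f_lo = laguerre(k, lo)
            for _ in range(iters):  # plain bisection on the sign change
                mid = 0.5 * (lo + hi)
                f_mid = laguerre(k, mid)
                if (f_mid > 0) == (f_lo > 0):
                    lo, f_lo = mid, f_mid
                else:
                    hi = mid
            new_roots.append(0.5 * (lo + hi))
        roots = new_roots
    return roots


def glq_nodes_weights(n=15):
    """Nodes x_i and weights w_i = x_i / ((n+1)^2 L_{n+1}(x_i)^2)."""
    xs = laguerre_roots(n)
    ws = [x / ((n + 1) ** 2 * laguerre(n + 1, x) ** 2) for x in xs]
    return xs, ws


def exp_mean(f, sigma2, n=15):
    """Approximate E[f(g)] for g exponentially distributed with mean sigma2.

    Substituting g = sigma2 * x turns the expectation into the integral of
    exp(-x) * f(sigma2 * x) over [0, inf), which is the GLQ form.
    """
    xs, ws = glq_nodes_weights(n)
    return sum(w * f(sigma2 * x) for x, w in zip(xs, ws))


# Example: average of log2(1 + g) over a unit-mean Rayleigh-fading power gain.
avg_rate = exp_mean(lambda g: math.log2(1.0 + g), 1.0)
print(avg_rate)
```

The same substitution $g_j = \sigma_j^2 x$ applied to each integration variable is what turns the multi-dimensional integrals (28) and (29) into nested weighted sums.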
Hybrid Precoding for Single Carrier Wideband Multi-Subarray Millimeter Wave Systems

Wei Huang, Member, IEEE, Zhaohua Lu, Yongming Huang, Senior Member, IEEE, and Luxi Yang, Senior Member, IEEE

Abstract: This letter studies digital/analog hybrid precoding for wideband millimeter wave (mmWave) systems with multi-subarray architecture. By exploiting the mmWave propagation property that the dominant signal energy arrives at the receivers through some concentrated directions, we first present the single carrier transmission with time delay compensation technique performed at the base station, which can efficiently suppress inter-symbol interference. Then, in order to address the limitation on the number of radio-frequency chains, we propose a joint subarray selection and precoding design scheme by using the group sparse approach. Numerical results also demonstrate the effectiveness of the proposed scheme.

Index Terms: Millimeter wave systems, single-carrier, compressive sensing, time delay compensation.

Manuscript received October 5, 2018; accepted October 14, 2018. Date of publication October 22, 2018; date of current version April 9, 2019. This work was supported by the National Natural Science Foundation of China under Grant 61720106003. The associate editor coordinating the review of this paper and approving it for publication was R. Wang. (Corresponding author: Yongming Huang.)
W. Huang, Y. Huang, and L. Yang are with the School of Information Science and Engineering, Southeast University, Nanjing 210096, China (e-mail: huangwei2013@seu.edu.cn; huangym@seu.edu.cn; lxyang@seu.edu.cn).
Z. Lu is with the Algorithm Department, Wireless Product Research and Development Institute, Wireless Product Operation, ZTE Corporation, Shenzhen 518000, China (e-mail: lu.zhaohua@zte.com.cn).
Digital Object Identifier 10.1109/LWC.2018.2877358

I. INTRODUCTION

Millimeter wave (mmWave) communication has been viewed as a promising candidate for future cellular communication systems [1], [2], owing to its abundant spectrum. Although mmWave signals typically suffer from severe path loss and absorption, a large-scale antenna array can be packed into an area with small form. This makes the implementation of large-scale antenna arrays at the transceivers possible, which can provide enough array gain to overcome the severe free-space path loss at mmWave frequencies. Nevertheless, realizing large-scale array mmWave communication is non-trivial. Equipping one radio-frequency (RF) chain for each antenna element (the conventional fully-digital array systems) generally means high energy consumption and cost. Therefore, a hybrid architecture with fewer RF chains than antennas has been proposed to overcome the issue.

Recently, hybrid architecture based two-layer processing has been studied for mmWave systems [3], [4]. For the fully connected architecture, hybrid precoding divides the precoding into analog and digital parts so as to balance the system performance and RF chain overhead, where the analog and digital precoding are generally implemented by phase shifters and dedicated digital baseband hardware, respectively. In the fully-connected architecture, each RF chain is connected to all antenna elements via the same number of phase shifters. This results in high energy consumption for a large-scale antenna array. Hence, hybrid precoding with a multi-subarray structure, where each RF chain is only connected with a pre-determined collection of antennas, tends to be more attractive for some mmWave communication scenarios [5].

Studies on hybrid precoding in wideband mmWave frequency-selective systems are relatively limited. In [6] and [7], hybrid precoding over mmWave multiple input multiple output with orthogonal frequency division multiplexing (MIMO-OFDM) systems was considered. However, in the OFDM mode, it is difficult to design a common analog precoder shared across all the subcarriers. Besides, OFDM transmission usually has a high peak-to-average power ratio (PAPR) and is power inefficient compared with the single carrier (SC) transmission mode.

SC transmission schemes were proposed for lens [8], [9] and fully connected array systems [10], where path and time delay compensation strategies are proposed to suppress the ISI in the mmWave frequency selective channels, respectively. Compared with the subarray structures, the beamforming capability of lens systems is weaker due to its physical structure constraints, and the power consumption in the fully connected architecture is much larger. Different from [8] and [9], where only the multi-path sparsity of mmWave MIMO channels is applied for the path compensation technique, in this letter, by exploiting the channel characteristic of some mmWave scenarios that the line-of-sight (LoS) component of signals is dominant and the non-LoS (NLoS) component has much lower energy, we study the hybrid precoding for wideband mmWave multi-subarray systems with frequency selective MIMO channels. Specifically, we propose the SC with time delay pre-compensation approach to suppress ISI for LoS paths between all mobile stations (MS) and BS. For the residual ISI and inter-user interference (IUI) due to the arrival of weak energy in NLoS directions at the receivers, we formulate an optimization problem to maximize the minimum signal-to-interference-plus-noise-ratio (SINR) of all users. Different from the two layer hybrid precoding designs in [10], for the considered subarray structure, the extra subarray selection matrix results in a nonlinear mixed integer problem. To tackle the issue, we transform the problem into the sparse precoding optimization design problem and solve it by relaxing the $l_0$ norm by the $l_{1,\infty}$ norm.

Notations: In this letter, boldface lower- and upper-case letters denote vectors and matrices, respectively. $\mathbb{C}^{M \times N}$ denotes the space of $M \times N$ complex-valued matrices. $\|\mathbf{x}\|_0$, $\|\mathbf{x}\|_1$, $\|\mathbf{x}\|_{1,\infty}$ and $\|\mathbf{x}\|$ denote the $l_0$, $l_1$, $l_{1,\infty}$ and $l_2$ norm of vector $\mathbf{x}$, respectively. $\mathrm{Re}(\cdot)$ denotes the real part of a variable.
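To make the sparsity-related norms in the notation concrete, the following is a minimal sketch under an explicit assumption: the $l_{1,\infty}$ norm is evaluated over a matrix whose rows are treated as groups, i.e., the sum over rows of the largest entry magnitude in each row. The grouping by rows, the small example matrix `W` and the function names are illustrative choices made here, not definitions taken from this point of the text.

```python
def l0_norm(values, tol=1e-12):
    """Number of entries whose magnitude exceeds a small tolerance."""
    return sum(1 for v in values if abs(v) > tol)


def row_peaks(matrix):
    """Largest entry magnitude in each row (the l_inf norm of each group)."""
    return [max(abs(entry) for entry in row) for row in matrix]


def l1_inf_norm(matrix):
    """Sum over rows of the per-row l_inf norm (assumed row grouping)."""
    return sum(row_peaks(matrix))


# Hypothetical 3 x 2 coefficient matrix; the second row (group) is all zero.
W = [[1 + 1j, 0.5],
     [0, 0],
     [0.2, -2.0]]

active_rows = l0_norm(row_peaks(W))  # number of non-zero groups
group_norm = l1_inf_norm(W)          # convex surrogate for that count
print(active_rows, group_norm)
```

The $l_0$ count of non-zero groups is combinatorial, whereas the $l_{1,\infty}$ value is a convex function of the entries, which is the usual reason for relaxing one by the other.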
HUANG et al.: HYBRID PRECODING FOR SC WIDEBAND MULTI-SUBARRAY mmWAVE SYSTEMS 485
II. CHANNEL AND SC TRANSMISSION MODEL

We focus on a multi-user mmWave MISO communication system, where the BS serves $K$ single antenna users, as shown in Fig. 1. To exploit the benefit of massive antennas with low RF chain overhead, the BS is equipped with multi-subarrays with $M$ antennas but only $M_{RF}$ RF chain units, which indicates that every RF chain unit is connected selectively to one subarray out of $N$ subarrays via a switch network. For simplicity of exposition, the number of antennas in each subarray is assumed to be fixed, $M_{sub} = M/N$, where $K \le M_{RF} \le N \le M$.

Fig. 1. Illustration of mmWave MISO with SC and time delay compensation system model.

A. Channel Model

The channel impulse response $\mathbf{h}_k^H(t) \in \mathbb{C}^{1 \times M}$ from the BS to MS $k$ can be modeled as

$$\mathbf{h}_k^H(t) = \sum_{l=1}^{L_k} \mathbf{h}_{k,l}^H(t)\,\delta(t - \tau_{k,l}), \tag{1}$$

where $\tau_{k,l}$ is the delay of the $l$th path; $\mathbf{h}_{k,l}^H(t)$ denotes the channel complex gain over the $l$th path between the BS and MS $k$ at $t$. In practice, communication systems usually use discrete-time signals. For this purpose, the equivalent discrete-time channel response should be represented mathematically by sampling the continuous channel response in (1), which is re-modeled as

$$\mathbf{h}_k^H[n] = \sum_{l=1}^{L_k} \mathbf{h}_{k,l}^H[n]\,\delta[n - n_{k,l}]. \tag{2}$$

In this letter, we assume that the BS has perfect channel state information (CSI), while the acquisition of CSI with hybrid architecture for frequency selective channels is a challenging problem. In [8], a channel estimation scheme for frequency selective channels has been proposed, which can also be applied for the SC multi-subarrays system.

B. SC Transmission With Time Delay Pre-Compensation

In order to suppress the multi-path effect of the time domain channel in the case of SC transmission, by using the mmWave signal propagation characteristic, we execute the time delay pre-compensation at the BS for the LoS path before performing the baseband precoding, as shown in Fig. 1. As a result, the signals arrive at all users via the LoS path simultaneously, so that the transmitted signal $\mathbf{x}_k[n]$ can be expressed as

$$\mathbf{x}_k[n] = \mathbf{F}\boldsymbol{\Lambda}_k \mathbf{w}_k s_k[n + n_{k,LoS}], \tag{3}$$

where $\mathbf{F} \in \mathbb{C}^{M \times N}$ and $\mathbf{w}_k$ represent the analog precoding matrix and digital precoding vector for MS $k$, respectively. The subarray selection matrix $\boldsymbol{\Lambda}$ with the size of $N \times M_{RF}$ is a diagonal matrix and the entries of the matrix belong to the set $\{0, 1\}$. Since each RF chain only connects to a subset of antennas subject to the number of RF chains in the multi-subarrays structure, the non-zero elements in matrix $\boldsymbol{\Lambda}$ are no more than $M_{RF}$. $s_k[n]$ denotes the information symbol transmitted to MS $k$, $n$ is the symbol index, and $n_{k,LoS}$ is the time delay compensation index over the LoS path. Then, the signal $\mathbf{x}[n]$ transmitted by the BS is given by

$$\mathbf{x}[n] = \sum_{k=1}^{K} \mathbf{F}\boldsymbol{\Lambda}_k \mathbf{w}_k s_k[n + n_{k,LoS}]. \tag{4}$$

Thus, the signal $y_k[n]$ received at MS $k$ can be expressed as

$$y_k[n] = \sum_{l=1}^{L_k} \mathbf{h}_{k,l}^H \mathbf{x}[n] + z_k[n], \tag{5}$$

where $\mathbf{h}_{k,l}^H$ is the channel gain of the $l$th path from the BS to MS $k$, $L_k$ is the total number of paths between MS $k$ and the BS, and $z_k[n]$ is the additive white Gaussian noise. Then, we separate the LoS paths from all paths among the BS and MSs so that $y_k[n]$ in (5) can be equivalently written as

$$y_k[n] = \mathbf{h}_{k,LoS}^H \mathbf{x}[n] + \sum_{l \ne LoS} \mathbf{h}_{k,l}^H \mathbf{x}[n - n_{k,l}] + z_k[n]. \tag{6}$$

By combining (4) with (6), we have

$$y_k[n] = \mathbf{h}_{k,LoS}^H \mathbf{x}_k[n - n_{k,LoS}] + \sum_{l \ne LoS} \mathbf{h}_{k,l}^H \mathbf{x}_k[n - n_{k,l}] + \sum_{k' \ne k}\sum_{l=1}^{L_k} \mathbf{h}_{k,l}^H \mathbf{x}_{k'}[n - n_{k,l}] + z_k[n]. \tag{7}$$

Furthermore, by combining (3) with (7), the received signal at MS $k$ can be further written as:

$$y_k[n] = \underbrace{\mathbf{h}_{k,LoS}^H \mathbf{F}\boldsymbol{\Lambda}_k \mathbf{w}_k s_k[n]}_{\text{desired signal}} + \underbrace{\sum_{l \ne LoS} \mathbf{h}_{k,l}^H \mathbf{F}\boldsymbol{\Lambda}_k \mathbf{w}_k s_k[n - \Omega_{l,k}^{k,LoS}]}_{\text{ISI}} + \underbrace{\sum_{k' \ne k}\sum_{l=1}^{L_k} \mathbf{h}_{k,l}^H \mathbf{F}\boldsymbol{\Lambda}_{k'} \mathbf{w}_{k'} s_{k'}[n - \Omega_{l,k}^{k',LoS}]}_{\text{IUI}} + z_k[n], \tag{8}$$

where $\Omega_{l,k}^{k,LoS} \triangleq n_{k,l} - n_{k,LoS}$ represents the delay interval between the LoS and $l$th NLoS path of MS $k$. Then, we can obtain the SINR expression for MS $k$, that is

$$\mathrm{SINR}_k = \frac{|\mathbf{h}_{k,LoS}^H \mathbf{F}\boldsymbol{\Lambda}_k \mathbf{w}_k|^2}{\sum_{l \ne LoS} |\mathbf{h}_{k,l}^H \mathbf{F}\boldsymbol{\Lambda}_k \mathbf{w}_k|^2 + \sum_{k' \ne k}\sum_{l=1}^{L_k} |\mathbf{h}_{k,l}^H \mathbf{F}\boldsymbol{\Lambda}_{k'} \mathbf{w}_{k'}|^2 + \sigma_k^2}. \tag{9}$$

III. FORMULATION AND PRECODING DESIGN

In the previous section, by using the time delay pre-compensation technique, the frequency selective channel is transformed into an approximately flat-fading channel, while the IUI and residual ISI resulting from the NLoS paths need to be further eliminated by precoding. Hence, according to the
derived SINR expression in (9), we formulate the problem of maximizing the minimum SINR received by any of the $K$ users via joint subarray selection and precoding design under the limited RF chains environment, which can be expressed as

$$\max_{\mathbf{F}, \{\boldsymbol{\Lambda}_k, \mathbf{w}_k\}_{k=1}^{K}} \min_k\ \mathrm{SINR}_k \quad \text{s.t.}\ \boldsymbol{\Lambda}_k \in \{0, 1\},\ \mathbf{F} \in \mathcal{F}_{array},\ \sum_{k=1}^{K} \|\mathbf{F}\boldsymbol{\Lambda}_k \mathbf{w}_k\|^2 \le P, \tag{10}$$

where $\mathcal{F}_{array}$ denotes the beam set transmitted by all arrays and $P$ denotes the maximum transmit power at the BS. The switch network makes the formulated problem a mixed integer nonlinear programming problem, which is challenging because of several difficulties, including the non-convex objective function, the combinatorial characteristic of the subarray selection matrix, and the non-linear coupling of the subarray selection matrix, digital and analog precoders.

Usually, the analog precoder $\mathbf{F}$ is selected from a predefined codebook [11]. In this letter, the number of subarrays is larger than that of RF chains and the beams transmitted by subarrays only align at LoS paths of all MSs, so the analog precoding matrix can use the DFT codebook $\bar{\mathbf{F}}$. Furthermore, we use a large dimension precoding vector $\bar{\mathbf{w}}_k = \boldsymbol{\Lambda}_k \mathbf{w}_k \in \mathbb{C}^{N \times 1}$ instead of the original $M_{RF}$ baseband precoding vector. Then, we rearrange the original precoding matrix $\bar{\mathbf{W}} \in \mathbb{C}^{N \times K}$

$$\bar{\mathbf{W}} = [\bar{\mathbf{w}}_1, \ldots, \bar{\mathbf{w}}_K] = [\underbrace{\bar{w}_{11}, \ldots, \bar{w}_{1K}}_{\tilde{\mathbf{w}}_1}; \ldots; \underbrace{\bar{w}_{N1}, \ldots, \bar{w}_{NK}}_{\tilde{\mathbf{w}}_N}], \tag{11}$$

where $\tilde{\mathbf{w}}_n \in \mathbb{C}^{1 \times K}$ denotes the precoding coefficient corresponding to the $n$th codeword. Next, we define the $N \times 1$ vector

$$\ddot{\mathbf{w}} = [\|\tilde{\mathbf{w}}_1\|_q, \|\tilde{\mathbf{w}}_2\|_q, \ldots, \|\tilde{\mathbf{w}}_n\|_q, \ldots, \|\tilde{\mathbf{w}}_N\|_q], \tag{12}$$

where $q \ge 2$; in this letter, we set $q = \infty$. If the $n$th codeword is not to be selected from the codebook, all entries of the vector $\tilde{\mathbf{w}}_n$ need to be set to zero simultaneously, and problem (10) is recast into the sparse precoding design problem

$$\max_{\{\bar{\mathbf{w}}_k\}_{k=1}^{K}} \min_k\ \mathrm{SINR}_k \quad \text{s.t.}\ \|\ddot{\mathbf{w}}\|_0 \le M_{RF},\ \sum_{k=1}^{K} \|\bar{\mathbf{w}}_k\|^2 \le P. \tag{13}$$

Note that in (13), the $l_0$ norm constraint $\|\ddot{\mathbf{w}}\|_0 \le M_{RF}$ guarantees that the number of the selected subarrays is no more than that of the total RF chains at the BS. Moreover, the $\mathrm{SINR}_k$ is defined as

$$\mathrm{SINR}_k = \frac{|\bar{\mathbf{h}}_{k,LoS}^H \bar{\mathbf{w}}_k|^2}{\sum_{l \ne LoS} |\bar{\mathbf{h}}_{k,l}^H \bar{\mathbf{w}}_k|^2 + \sum_{k' \ne k}\sum_{l=1}^{L_k} |\bar{\mathbf{h}}_{k,l}^H \bar{\mathbf{w}}_{k'}|^2 + \sigma_k^2},$$

where the effective channel $\bar{\mathbf{h}}_{k,LoS}^H = \mathbf{h}_{k,LoS}^H \bar{\mathbf{F}}$ can be viewed as a beamspace channel [12]. Problem (13) is also hard to solve because of the non-convexity of the objective function and the $l_0$ norm constraint. To tackle the issue, we bring in a slack variable $t$; then, the sparse optimization problem (13) can be equivalently written as

$$\max_{\{\bar{\mathbf{w}}_k\}_{k=1}^{K},\, t} t \quad \text{s.t.}\ \mathrm{SINR}_k \ge t,\ \sum_{k=1}^{K} \|\bar{\mathbf{w}}_k\|^2 \le P,\ \|\ddot{\mathbf{w}}\|_0 \le M_{RF}. \tag{14}$$

Problem (14) is non-convex since both the SINR and $l_0$ norm constraints are non-convex. Nevertheless, when the target value $t \ge 0$ is fixed, we obtain the following feasibility problem

$$\text{Find}: \{\bar{\mathbf{w}}_k\}_{k=1}^{K} \quad \text{s.t.}\ \mathrm{SINR}_k \ge t,\ \sum_{k=1}^{K} \|\bar{\mathbf{w}}_k\|^2 \le P,\ \|\ddot{\mathbf{w}}\|_0 \le M_{RF}. \tag{15}$$

When the target value $t$ is attainable, problem (15) is feasible and the optimal value of $t$ can be obtained by the bisection search method. Therefore, the optimal solution of problem (14) can be derived by solving the following feasibility problem:

$$\min_{\{\bar{\mathbf{w}}_k\}_{k=1}^{K}} \|\ddot{\mathbf{w}}\|_0 \quad \text{s.t.}\ \mathrm{SINR}_k \ge t,\ \sum_{k=1}^{K} \|\bar{\mathbf{w}}_k\|^2 \le P. \tag{16}$$

Because of the non-convexity of the $l_0$ norm, problem (16) is still difficult to solve directly. A regularization norm can be used to promote sparsity for all entries of $\tilde{\mathbf{w}}_n$. We can approximate the $\|\ddot{\mathbf{w}}\|_0$ norm by $\|\ddot{\mathbf{w}}\|_1$. According to the definition of $\ddot{\mathbf{w}}$ in (12), we have

$$\|\ddot{\mathbf{w}}\|_1 = \sum_{n=1}^{N} \|\tilde{\mathbf{w}}_n\|_\infty = \sum_{n=1}^{N} \max_k |\bar{w}_{nk}|. \tag{17}$$

Note that in (17), the $l_{1,\infty}$ norm of the original precoding matrix $\bar{\mathbf{W}}$ defined in (11) can be replaced by the group sparse inducing norm of vector $\ddot{\mathbf{w}}$. Therefore, optimization problem (16) can be further approximated as

$$\min_{\{\bar{\mathbf{w}}_k\}_{k=1}^{K}} \|\ddot{\mathbf{w}}\|_1 \tag{18a}$$

$$\text{s.t.}\ t\left(\sum_{l \ne LoS} |\bar{\mathbf{h}}_{k,l}^H \bar{\mathbf{w}}_k|^2 + \sum_{k' \ne k}\sum_{l=1}^{L_k} |\bar{\mathbf{h}}_{k,l}^H \bar{\mathbf{w}}_{k'}|^2 + \sigma_k^2\right) \le |\bar{\mathbf{h}}_{k,LoS}^H \bar{\mathbf{w}}_k|^2, \tag{18b}$$

$$\sum_{k=1}^{K} \|\bar{\mathbf{w}}_k\|^2 \le P. \tag{18c}$$

Up to now, the cost function in (18) is convex. Observe that there exists an optimal precoding vector $\{\bar{\mathbf{w}}_k\}_{k=1}^{K}$ to (18) that makes $\bar{\mathbf{h}}_{k,LoS}^H \bar{\mathbf{w}}_k$ a real number. Consequently, the SINR constraint in (18b) can be written in a second order cone form. Then, problem (18) can be equivalently rewritten as

$$\min_{\{\bar{\mathbf{w}}_k\}_{k=1}^{K}} \|\ddot{\mathbf{w}}\|_1$$

$$\text{s.t.}\ \left\| \left[ \mathbf{G}_{kk}^H \bar{\mathbf{w}}_k; \mathbf{G}_{k1}^H \bar{\mathbf{w}}_1; \mathbf{G}_{k2}^H \bar{\mathbf{w}}_2; \ldots; \mathbf{G}_{k(k-1)}^H \bar{\mathbf{w}}_{k-1}; \mathbf{G}_{k(k+1)}^H \bar{\mathbf{w}}_{k+1}; \ldots; \mathbf{G}_{kK}^H \bar{\mathbf{w}}_K; \sigma_k \right] \right\| \le \sqrt{\frac{1}{t}}\, \mathrm{Re}\left(\bar{\mathbf{h}}_{k,LoS}^H \bar{\mathbf{w}}_k\right),$$

$$\sum_{k=1}^{K} \|\bar{\mathbf{w}}_k\|^2 \le P, \tag{19}$$
where $\mathbf{G}_{kk}^H \in \mathbb{C}^{(L_k - 1) \times N}$ and $\mathbf{G}_{kk'}^H \in \mathbb{C}^{L_k \times N}$ are defined as

$$\mathbf{G}_{kk}^H = [\bar{\mathbf{h}}_{k,1}^H; \bar{\mathbf{h}}_{k,2}^H; \ldots; \bar{\mathbf{h}}_{k,LoS-1}^H; \bar{\mathbf{h}}_{k,LoS+1}^H; \ldots; \bar{\mathbf{h}}_{k,L_k}^H],$$

$$\mathbf{G}_{kk'}^H = [\bar{\mathbf{h}}_{k,1}^H; \bar{\mathbf{h}}_{k,2}^H; \ldots; \bar{\mathbf{h}}_{k,L_k}^H].$$

Note that problem (19) is convex and the complete algorithm is summarized in Algorithm 1, where $t_{max}$ and $t_{min}$ represent the upper and lower bound, respectively.

Algorithm 1 Joint Subarrays Selection and Precoding Design
1: Initialize $t_{min}$, $t_{max}$;
2: Repeat
3:   $t = (t_{min} + t_{max})/2$;
4:   Solve problem (19) and obtain the optimal precoding vector $\{\bar{\mathbf{w}}_k\}_{k=1}^{K}$ corresponding to $\ddot{\mathbf{w}}$ in (12);
5:   Judging: if $\|\ddot{\mathbf{w}}\|_0 \le M_{RF}$ then set $t_{min} = t$, else set $t_{max} = t$;
6: Until $t_{max} - t_{min} \le \epsilon$;
7: Output: Solution $\{\hat{\mathbf{w}}_k\}_{k=1}^{K}$.

IV. SIMULATION RESULTS

In this section, we verify the performance of our developed scheme. We assume that each MS is equipped with only one antenna, whereas the total number of antennas at the BS is $M = 64$, and further assume that the number of subarrays is equal to that of RF chains, $N = M_{RF} = 8$. Besides, the carrier frequency of the mmWave system is set to 28 GHz and the number of channel paths for each MS is $L_k = 3$. Moreover, the time delay index is uniformly distributed in $[0, T_m]$, where $T_m = 100$ ns denotes the maximum time delay, and the total available bandwidth is 500 MHz. Besides, the initial upper bound $t_{max}$ and lower bound $t_{min}$ are set to 50 and 0 in Algorithm 1, respectively.

Fig. 2(a) shows the average minimum rate for different schemes, assuming that the number of MSs is $K = 5$. It is shown that the proposed SC with time delay compensation scheme has a better average minimum rate than the conventional MISO-OFDM scheme. This is because the SC scheme can save the cyclic prefix overhead of the OFDM mode and effectively suppress the ISI and IUI after time delay pre-compensation and precoding processing. However, in the high SNR area, the MISO-OFDM scheme is superior to the proposed SC scheme. This is expected: with the increase of SNR, the ISI and IUI caused by the NLoS paths become even worse for the proposed scheme, but the MISO-OFDM scheme can completely eliminate the ISI. In addition, we also found that the proposed sparse precoding schemes based on time delay pre-compensation with a limited number of RF chains can achieve the performance of the full RF chains environment.

Fig. 2(b) shows the convergence of Algorithm 1 for the sparse optimization precoding design, assuming that the BS is equipped with $M_{RF} = 5$ RF chains. We observe that Algorithm 1 rapidly converges to the total number of RF chains at the BS, which indicates that the proposed $l_{1,\infty}$ norm can accurately approximate the $l_0$ norm. Furthermore, the convergence rate is faster when the number of antennas is smaller.

Fig. 2. Max-min rate versus SNR and convergence curve of Algorithm 1.

V. CONCLUSION

In this letter, we investigated the wideband mmWave communication system with subarray architecture and presented the SC with time delay compensation transmission scheme. Then, by considering the residual ISI and IUI, we derived the resulting SINR expression. According to the SINR, we transformed the joint subarray selection and precoding problem into the sparse precoding design problem, which overcomes the limited number of RF chains. Additionally, simulation results demonstrated the effectiveness of the SC scheme.

REFERENCES

[1] Y. Huang, J. Zhang, and M. Xiao, “Constant envelope hybrid precoding for directional millimeter-wave communications,” IEEE J. Sel. Areas Commun., vol. 36, no. 4, pp. 845–859, Apr. 2018.
[2] C. Zhang, Y. Huang, Y. Jing, S. Jin, and L. Yang, “Sum-rate analysis for massive MIMO downlink with joint statistical beamforming and user scheduling,” IEEE Trans. Wireless Commun., vol. 16, no. 4, pp. 2181–2194, Apr. 2017.
[3] X. Zhang, A. F. Molisch, and S.-Y. Kung, “Variable-phase-shift-based RF-baseband codesign for MIMO antenna selection,” IEEE Trans. Signal Process., vol. 53, no. 11, pp. 4091–4103, Nov. 2005.
[4] Z. Xiao, P. Xia, and X.-G. Xia, “Codebook design for millimeter-wave channel estimation with hybrid precoding structure,” IEEE Trans. Wireless Commun., vol. 16, no. 1, pp. 141–153, Jan. 2017.
[5] S. Park, A. Alkhateeb, and R. W. Heath, “Dynamic subarrays for hybrid precoding in wideband mmWave MIMO systems,” IEEE Trans. Wireless Commun., vol. 16, no. 5, pp. 2907–2920, May 2017.
[6] A. Alkhateeb and R. W. Heath, “Frequency selective hybrid precoding for limited feedback millimeter wave systems,” IEEE Trans. Commun., vol. 64, no. 5, pp. 1801–1818, May 2016.
[7] C. G. Tsinos, S. Maleki, S. Chatzinotas, and B. Ottersten, “On the energy-efficiency of hybrid analog–digital transceivers for single- and multi-carrier large antenna array systems,” IEEE J. Sel. Areas Commun., vol. 35, no. 9, pp. 1980–1995, Sep. 2017.
[8] Y. Zeng, L. Yang, and R. Zhang, “Multi-user millimeter wave MIMO with full-dimensional lens antenna array,” IEEE Trans. Wireless Commun., vol. 17, no. 4, pp. 2800–2814, Apr. 2018.
[9] W. Huang, Y. Huang, Y. Zeng, and L. Yang, “Wideband millimeter wave communication with lens antenna array: Joint beamforming and antenna selection with group sparse optimization,” IEEE Trans. Wireless Commun., vol. 17, no. 10, pp. 6575–6589, Oct. 2018.
[10] W. Huang, Y. Huang, R. Zhao, S. He, and L. Yang, “Wideband millimeter wave communication: Single carrier based hybrid precoding with sparse optimization,” IEEE Trans. Veh. Technol., vol. 67, no. 10, pp. 9696–9710, Oct. 2018.
[11] H. Seleem, A. I. Sulyman, and A. Alsanie, “Hybrid precoding-beamforming design with hadamard RF codebook for mmwave large-scale MIMO systems,” IEEE Access, vol. 5, pp. 6813–6823, 2017.
[12] P. V. Amadori and C. Masouros, “Low RF-complexity millimeter-wave beamspace-MIMO systems by beam selection,” IEEE Trans. Commun., vol. 63, no. 6, pp. 2212–2223, Jun. 2015.
Threshold Setting for Multiple Primary User Spectrum Sensing via Spherical Detector
Xi Yang, Member, IEEE, Kejun Lei, Shengliang Peng, Li Hu, Shu Li, and Xiuying Cao
Abstract—It is known that the spherical detector (SD) is very efficient for multiple primary user spectrum sensing in cognitive radios. However, numerical results showed that the decision threshold obtained by the existing approximation methods has high accuracy in large sample scenarios, but this is not the case in small sample situations. In this letter, we develop the SD for practical applications with a small number of observations. Here, two methods to calculate the decision threshold of the SD are proposed. In the first approach, the accurate expression, in the form of Meijer’s G-function, for the probability of false-alarm is derived. Based on this, the accurate threshold can be obtained numerically when a target false-alarm probability is given. In the second method, an approximation is presented that leads to a simple expression for the decision threshold, which enables us to directly compute the threshold with low computation complexity and high accuracy. Simulation results verify the effectiveness of the two proposed methods.

Index Terms—Cognitive radio (CR), spectrum sensing, multiple primary users, spherical detector (SD), decision threshold.

Manuscript received May 12, 2018; revised July 30, 2018 and September 22, 2018; accepted October 18, 2018. Date of publication October 22, 2018; date of current version April 9, 2019. This work was supported in part by the National Natural Science Foundation of China under Grant 61861019 and Grant 61362018, in part by the Post-Doctoral Scientific Research Project of Jiangsu Province under Grant 1402041B, and in part by the Project of Hunan Provincial Department of Education under Grant 16A174. The associate editor coordinating the review of this paper and approving it for publication was P. Pawelczak. (Corresponding author: Kejun Lei.)
X. Yang is with the College of Information Science and Engineering, Jishou University, Jishou 416000, China, and also with the National Mobile Communications Research Laboratory, Southeast University, Nanjing 210096, China (e-mail: ynkej@163.com).
K. Lei, L. Hu, and S. Li are with the College of Information Science and Engineering, Jishou University, Jishou 416000, China (e-mail: leikejun-123@163.com; huli2000neu@163.com; dawning2008@qq.com).
S. Peng is with the School of Information Science and Engineering, Huaqiao University, Xiamen 361021, China (e-mail: peng.shengliang@hqu.edu.cn).
X. Cao is with the National Mobile Communications Research Laboratory, Southeast University, Nanjing 210096, China (e-mail: cao_xy@seu.edu.cn).
Digital Object Identifier 10.1109/LWC.2018.2877361

I. INTRODUCTION
SPECTRUM sensing is one fundamental and key technology in cognitive radio networks (CRNs). Existing work on spectrum sensing is mainly based on the assumption that only one active primary user (PU) shares the licensed spectrum. However, the single PU assumption may be too simple in some practical CRNs, where multiple primary users may share the same frequency band [1], [2].
Using existing single PU spectrum sensing algorithms in multiple PU detection scenarios cannot obtain optimal performance. As a result, the multiple PU spectrum sensing problem has attracted the attention of many researchers. Recently, a multiple PU spectrum sensing method based on the spherical detector (SD) has been proposed in [2]. The SD performs spectrum sensing by evaluating whether the population covariance matrix differs from a matrix proportional to the identity matrix. The SD is a blind detection scheme, which performs spectrum sensing without any prior information on the noise power, channel gains, signal powers and the number of PU signals. Moreover, the SD is the optimal detector in the generalized likelihood ratio test (GLRT) sense when the covariance matrix of the received primary signal is positive definite [1]. Although the SD is very efficient for multiple PU detection, numerical results showed that the decision threshold obtained by the existing approximation methods has high accuracy in large sample scenarios, but this is not the case in small sample situations [3]. The latter makes the whole decision-making process of the SD inaccurate when only a small number of observations are available due to a time-varying channel and/or detection in the shortest possible time [4].
In this letter, we focus on the SD for the more practical applications with a small number of observations. Here, two methods are proposed to compute the theoretical decision threshold for the SD. In the first approach, the accurate expression using Meijer’s G-function is derived for the probability of false-alarm. Based on this, the accurate threshold for any parameter setting can be obtained numerically. In the second method, an approximate but simple expression for the sensing threshold is provided, which can be utilized to directly calculate the threshold for any given false-alarm probability. The second approach has much lower computation complexity than the first one, which is very attractive for real-time applications. Simulation results show that, compared with the classical Beta and chi-square approximation methods [2], [5], our approximation method yields a more accurate decision threshold for sensing scenarios with a small number of observations.

II. SYSTEM MODEL
Consider $K$ sensors that cooperatively detect the presence of $P$ ($P \geq 1$) PU signals [2]. The received signal is denoted as $\mathbf{x} = \mathbf{H}\mathbf{s} + \boldsymbol{\eta}$, where $\mathbf{x} \in \mathbb{C}^{K}$, $\mathbf{H} = [\mathbf{h}_1, \mathbf{h}_2, \ldots, \mathbf{h}_P]$ is the $K \times P$ channel gain matrix between the $K$ sensors and the $P$ PUs, $\mathbf{s} = [s_1, s_2, \ldots, s_P]^T$ is the zero mean PU signal vector, and $\boldsymbol{\eta}$ is the complex Gaussian noise vector with zero mean and covariance matrix $\sigma^2 \mathbf{I}$. Without loss of generality, we assume that PU signals and noise are independent of each other. Let $H_1$ ($H_0$) denote the presence (absence) of PU signals. Denote $\mathbf{X} = [\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_N]$ as the received data matrix. Here, $N$ denotes the number of observations. Note that the population covariance matrix can be written as $\boldsymbol{\Sigma} \triangleq E\{\mathbf{X}\mathbf{X}^{\dagger}\}/N = \sigma^2 \mathbf{I}$ under $H_0$, and $\boldsymbol{\Sigma} = \sigma^2 \mathbf{I} + \sum_{p=1}^{P} \gamma_p \mathbf{h}_p \mathbf{h}_p^{\dagger}$ under $H_1$, where $\gamma_p = E\{s_p s_p^{\dagger}\}$ is the transmission power of
YANG et al.: THRESHOLD SETTING FOR MULTIPLE PU SPECTRUM SENSING VIA SD 489
the $p$-th PU. Invoking the results of [2], the decision rule can be written as

$$T_{SD} \triangleq \frac{\det(\hat{\mathbf{R}}_x)}{\left(\frac{1}{K}\,\mathrm{tr}(\hat{\mathbf{R}}_x)\right)^{K}} \underset{H_1}{\overset{H_0}{\gtrless}} \gamma, \tag{1}$$

where $\hat{\mathbf{R}}_x = \mathbf{X}\mathbf{X}^{\dagger}/N$ is the sample covariance matrix, and $T_{SD}$ and $\gamma$ represent the test statistic and the threshold, respectively. Here, the condition $N > K$ should be satisfied to guarantee the positive definiteness of $\hat{\mathbf{R}}_x$ [5].

III. ACCURATE THRESHOLD SETTING FOR SD
According to the result of [2], the $n$-th moment of $T_{SD}$ under $H_0$ can be written as

$$E(T_{SD}^{n}) = \frac{\Gamma(KN)\, K^{Kn}\, \tilde{\Gamma}_K(N+n)}{\tilde{\Gamma}_K(N)\, \Gamma(KN+Kn)}, \tag{2}$$

where $\Gamma(\alpha) = \int_0^{\infty} x^{\alpha-1} e^{-x}\,dx$ and $\tilde{\Gamma}_K(\alpha)$ is the complex multivariate Gamma function. Using the fact that

$$\tilde{\Gamma}_K(\alpha) = \pi^{K(K-1)/2}\, \Gamma(\alpha)\Gamma(\alpha-1)\cdots\Gamma(\alpha-K+1) \tag{3}$$

for $\mathrm{Re}(\alpha) > K-1$, the moment of $T_{SD}$ can be rewritten as

$$E(T_{SD}^{n}) = \frac{K^{Kn}\,\Gamma(KN)}{\Gamma(KN+Kn)} \prod_{i=0}^{K-1} \frac{\Gamma(N+n-i)}{\Gamma(N-i)}. \tag{4}$$

Using the product theorem of the Gamma function [6]

$$\Gamma(K\alpha) = (2\pi)^{\frac{1-K}{2}}\, K^{K\alpha-\frac{1}{2}} \prod_{i=0}^{K-1} \Gamma\!\left(\alpha+\frac{i}{K}\right), \tag{5}$$

the $n$-th moment of $T_{SD}$ can be further expressed as

$$E(T_{SD}^{n}) = \prod_{i=1}^{K-1} \frac{\Gamma(N+i/K)}{\Gamma(N+n+i/K)}\, \frac{\Gamma(N+n-i)}{\Gamma(N-i)}. \tag{6}$$

Denote the random variable $T_{SD}$ under $H_0$ by $X$, i.e., $X \triangleq T_{SD}|H_0$. Then, the Mellin transform of $X$ can be written as $M_X(z) = E(X^{z-1})$. According to the result of (6), we have

$$M_X(z) = \prod_{i=1}^{K-1} \frac{\Gamma(N+z-1-i)}{\Gamma(N+z-1+i/K)}\, \frac{\Gamma(N+i/K)}{\Gamma(N-i)}. \tag{7}$$

As a result, by using the inverse Mellin transform, the probability density function (PDF) of $X$ can be written as

$$f_X(x) = \frac{1}{2\pi j}\int_{c-j\infty}^{c+j\infty} M_X(z)\, x^{-z}\,dz = \prod_{i=1}^{K-1}\frac{\Gamma(N+i/K)}{\Gamma(N-i)} \times \frac{1}{2\pi j}\int_{c-j\infty}^{c+j\infty} \prod_{i=1}^{K-1}\frac{\Gamma(N+z-1-i)}{\Gamma(N+z-1+i/K)}\, x^{-z}\,dz. \tag{8}$$

By using the definition of Meijer’s G-function [6], [7], the exact PDF of $T_{SD}$ under $H_0$ can be obtained as

$$f_X(x) = \Omega(N,K)\, G_{K-1,K-1}^{K-1,0}\!\left( x \,\middle|\, \begin{matrix} a_1, \ldots, a_{K-1} \\ b_1, \ldots, b_{K-1} \end{matrix} \right) \tag{9}$$

with $\Omega(N,K) \triangleq \prod_{i=1}^{K-1} \frac{\Gamma(N+i/K)}{\Gamma(N-i)}$, $a_i = N-1+i/K$ and $b_i = N-1-i$. Thus the exact expression of the probability of false-alarm is obtained as

$$P_f = P(T_{SD} < \gamma \,|\, H_0) = P(X < \gamma) = \int_0^{\gamma} f_X(x)\,dx, \tag{10}$$

where the last equality results from the fact that $0 < T_{SD} \leq 1$. Substituting (9) into (10), we get

$$P_f = \Omega(N,K) \int_0^{\gamma} G_{K-1,K-1}^{K-1,0}\!\left( x \,\middle|\, \begin{matrix} a_1, \ldots, a_{K-1} \\ b_1, \ldots, b_{K-1} \end{matrix} \right) dx. \tag{11}$$

Using the result [7]

$$\int G_{p,q}^{m,n}\!\left( x \,\middle|\, \begin{matrix} a_1, \ldots, a_n, a_{n+1}, \ldots, a_p \\ b_1, \ldots, b_m, b_{m+1}, \ldots, b_q \end{matrix} \right) dx = G_{p+1,q+1}^{m,n+1}\!\left( x \,\middle|\, \begin{matrix} 1, a_1+1, \ldots, a_n+1, a_{n+1}+1, \ldots, a_p+1 \\ b_1+1, \ldots, b_m+1, 0, b_{m+1}+1, \ldots, b_q+1 \end{matrix} \right),$$

the definite integration in (11) can be further calculated as

$$P_f = \Omega(N,K) \left[ G_{K,K}^{K-1,1}\!\left( \gamma \,\middle|\, \begin{matrix} 1, a_1+1, \ldots, a_{K-1}+1 \\ b_1+1, \ldots, b_{K-1}+1, 0 \end{matrix} \right) - G_{K,K}^{K-1,1}\!\left( 0 \,\middle|\, \begin{matrix} 1, a_1+1, \ldots, a_{K-1}+1 \\ b_1+1, \ldots, b_{K-1}+1, 0 \end{matrix} \right) \right]. \tag{12}$$

Plugging $a_i = N-1+i/K$ and $b_i = N-1-i$ into (12) yields

$$P_f = \Omega(N,K) \left[ G_{K,K}^{K-1,1}\!\left( \gamma \,\middle|\, \begin{matrix} 1, N+\frac{1}{K}, \ldots, N+\frac{K-1}{K} \\ N-1, \ldots, N-K+1, 0 \end{matrix} \right) - G_{K,K}^{K-1,1}\!\left( 0 \,\middle|\, \begin{matrix} 1, N+\frac{1}{K}, \ldots, N+\frac{K-1}{K} \\ N-1, \ldots, N-K+1, 0 \end{matrix} \right) \right]. \tag{13}$$

Correspondingly, the exact decision threshold $\gamma$ can be calculated by numerical solution when $P_f$ is given.
The computational complexity of determining the accurate threshold via the solution of (13) mainly comes from two parts: the computation of $\Omega(N,K)$, which involves the calculation of $2(K-1)$ Gamma functions, and the computation of the Meijer’s G-function, which requires a complicated integral operation of the product of $2(K-1)$ Gamma functions on the complex plane. As a result, the computation complexity of this method increases dramatically as $K$ and $N$ increase. In addition, $\Omega(N,K)$ becomes very large while $G_{K,K}^{K-1,1}(\cdot)$ becomes very small when $K$ increases. For instance, $\Omega(N,K) = O(10^{302})$ and $G_{K,K}^{K-1,1}(\cdot\,|\,\gamma) = O(10^{-304})$ when $N = 40$, $K = 20$ and $P_f = 0.1$. Obviously, how to maintain the stability of the numerical computation is another troublesome issue. Alternatively, we present a direct and efficient method to obtain the approximate threshold with high accuracy in the next section.

IV. APPROXIMATE THRESHOLD SETTING WITH HIGH PRECISION FOR SD
From (4), the $Nh$-th moment of $T_{SD}$ can be written as

$$E(T_{SD}^{Nh}) = \frac{K^{KNh}\,\Gamma(KN)}{\Gamma(KN+KNh)} \prod_{i=0}^{K-1} \frac{\Gamma(N+Nh-i)}{\Gamma(N-i)}. \tag{14}$$

Taking the natural logarithm of both sides of (14) yields

$$\ln E(T_{SD}^{Nh}) = \ln K^{KNh} + \vartheta(0) - \vartheta(h) + \sum_{i=0}^{K-1} \left( g_i(h) - g_i(0) \right), \tag{15}$$

where $\vartheta(h) = \ln\Gamma(KN+KNh)$ and $g_i(h) = \ln\Gamma(N+Nh-i)$. Expanding $\vartheta(h)$ and $g_i(h)$ by using Stirling’s formula for the log-Gamma function [8], we obtain the series expressions

$$\vartheta(h) = \ln\left[KN(1+h)\right]^{KN(1+h)-\frac{1}{2}} - KN(1+h) + \ln(2\pi)^{\frac{1}{2}} + \sum_{l=1}^{L} \frac{(-1)^{l+1} B_{l+1}(0)}{l(l+1)\left(KN(1+h)\right)^{l}} + O\!\left(\frac{1}{(KN)^{L+1}}\right) \tag{16}$$

$$g_i(h) = \ln\left[N(1+h)\right]^{N(1+h)-i-\frac{1}{2}} - N(1+h) + \ln(2\pi)^{\frac{1}{2}} + \sum_{l=1}^{L} \frac{(-1)^{l+1} B_{l+1}(-i)}{l(l+1)\left(N(1+h)\right)^{l}} + O\!\left(\frac{1}{N^{L+1}}\right) \tag{17}$$

where $B_n(x) = \sum_{k=0}^{n} \binom{n}{k} B_k x^{n-k}$ and $B_k$ is the Bernoulli number [6]. Usually, $L = 7$ is chosen to guarantee the
accuracy in practical applications. Plugging (16) and (17) into (15) yields

$$\ln E(T_{SD}^{Nh}) \approx \ln (1+h)^{-\frac{K^2-1}{2}} + \sum_{l=1}^{L} \beta_l \left[ (1+h)^{-l} - 1 \right], \tag{18}$$

where the terms $O\!\left(\frac{1}{(KN)^{L+1}}\right)$ and $O\!\left(\frac{1}{N^{L+1}}\right)$ are omitted and

$$\beta_l = \frac{(-1)^{l} \left[ B_{l+1}(0) - K^{l} \sum_{i=0}^{K-1} B_{l+1}(-i) \right]}{l(l+1)(KN)^{l}}. \tag{19}$$

From (18), the approximate moment of $T_{SD}$ can be given as

$$E(T_{SD}^{Nh}) \approx (1+h)^{-\frac{K^2-1}{2}} \exp\left( \sum_{l=1}^{L} \beta_l \left[ (1+h)^{-l} - 1 \right] \right). \tag{20}$$

Further, the above equation can be approximated as

$$E(T_{SD}^{Nh}) \approx (1+h)^{-\frac{K^2-1}{2}} \left( 1 + \sum_{l=1}^{L} \beta_l \left[ (1+h)^{-l} - 1 \right] \right). \tag{21}$$

Note that $E(T_{SD}^{Nh})|_{h=-2jt} = E(\exp(jt(-2N\ln T_{SD})))$ is the characteristic function (CF) of $Y \triangleq -2N \ln T_{SD}$, and $(1+h)^{-l}|_{h=-2jt}$ is the CF of the $\chi^2$-distribution with $2l$ degrees of freedom. From (21), the cumulative distribution function (CDF) of $Y$ can then be written as

$$F_Y(y) \approx F_{\chi^2_{K^2-1}}(y) + \sum_{l=1}^{L} \beta_l \left[ F_{\chi^2_{K^2-1+2l}}(y) - F_{\chi^2_{K^2-1}}(y) \right], \tag{22}$$

where $F_{\chi^2_{K^2-1}}(y)$ is the CDF of $\chi^2_{K^2-1}$. Compared with the classical chi-square approximation in [5], the approximation accuracy of $F_Y(y)$ is raised from the order of $O(1/N^2)$ to $O(1/N^7)$, which helps to obtain a high-precision threshold in small sample scenarios. Using the properties of the $\chi^2$-distribution [9], we can verify that

$$F_{\chi^2_{K^2-1+2l}}(y) - F_{\chi^2_{K^2-1}}(y) = f_{\chi^2_{K^2-1}}(y)\, \Phi_l(y), \tag{23}$$

where

$$f_{\chi^2_{K^2-1}}(y) = \frac{1}{2^{\frac{K^2-1}{2}}\, \Gamma\!\left(\frac{K^2-1}{2}\right)} \exp\left(-\frac{y}{2}\right) y^{\frac{K^2-3}{2}}, \tag{24}$$

$$\Phi_l(y) = \Phi_{l-1}(y) - \frac{2}{\prod_{q=1}^{l} (K^2+2q-3)/y} \tag{25}$$

with $\Phi_0(y) = 0$. Substituting (23) into (22) yields

$$F_Y(y) \approx F_{\chi^2_{K^2-1}}(y) + f_{\chi^2_{K^2-1}}(y) \sum_{l=1}^{L} \beta_l \Phi_l(y). \tag{26}$$

Note that the probability of false-alarm can be written as $P_f = P(T_{SD} < \gamma) = P(Y > -2N\ln\gamma)$. Thus, we have

$$P_f = 1 - F_Y(-2N\ln\gamma) \approx 1 - F_{\chi^2_{K^2-1}}(-2N\ln\gamma) - f_{\chi^2_{K^2-1}}(-2N\ln\gamma) \sum_{l=1}^{L} \beta_l \Phi_l(-2N\ln\gamma). \tag{27}$$

It can be seen from (22) that $Y$ converges to the limiting distribution $F_{\chi^2_{K^2-1}}(y)$ as $N$ increases. As a result, we can use the result of [10] to determine the relation between the quantiles of $F_Y(y)$ and $F_{\chi^2_{K^2-1}}(y)$. Let $\gamma_0$ and $y_0$ respectively be the $P_f$ quantiles of $F_{\chi^2_{K^2-1}}(y)$ and $F_Y(y)$. Then we have

$$P_f = P(\chi^2_{K^2-1} > \gamma_0) = 1 - F_{\chi^2_{K^2-1}}(\gamma_0), \tag{28}$$

$$P_f = P(Y > y_0) = 1 - F_Y(y_0). \tag{29}$$

Using the result of [10, eq. (23)], we obtain

$$y_0 \approx \gamma_0 - \left( \beta_1\Phi_1(\gamma_0) + \beta_2\Phi_2(\gamma_0) + \cdots + \beta_L\Phi_L(\gamma_0) \right), \tag{30}$$

where the term of $O(N^{-2})$ is omitted. Comparing (27) with (29), the threshold $\gamma$ corresponding to $P_f$ can be given by $\gamma = \exp(-y_0/2N)$. According to (30), we have

$$\gamma \approx \exp\left( 0.5 N^{-1} \left( -\gamma_0 + \sum_{l=1}^{L} \beta_l \Phi_l(\gamma_0) \right) \right). \tag{31}$$

The computational complexity of the approximated method to compute the threshold mainly comes from the computation of $\beta_l$ and $\Phi_l(\gamma_0)$ $(l = 1, \ldots, L)$. The former requires $\frac{3KL(L+1)}{2} + (4K+3)L$ multiplications and $\frac{KL(L+1)}{2} + (2K+1)L$ additions, and the latter requires $L(L+1)-1$ multiplications and $L(L+1)-1$ additions. As mentioned earlier, $L = 7$ is usually chosen to obtain a good approximation result. Obviously, the computation cost for the approximated threshold is much lower than that of the exact one in Section III.

V. SIMULATION RESULTS
Fig. 1 shows the decision threshold of the SD versus $N$ when $K = 6$ and the target $P_f$ is 0.05 and 0.1. It can be seen that the accurate thresholds obtained via the numerical solution of (13) match the Monte-Carlo results well, which verifies the effectiveness of our theoretical analysis in Section III. In addition, we can also observe that there is a small loss in accuracy of (31) compared to the accurate threshold. However, as mentioned earlier, the computational burden of the new approximation method is relieved because it avoids the calculation of Meijer’s G-function. On the other hand, both the classical Beta [2] and chi-square [5] approximation methods produce sufficiently accurate decision thresholds when the number of samples $N$ is relatively large. However, when $N$ is small, the error of the thresholds calculated via them cannot be ignored.
As analyzed before, the problem of computation complexity and numerical stability for the accurate threshold calculation via (13) is very difficult to solve when $K$ increases. Thus, the approximation method to calculate the threshold with high accuracy is very attractive. Fig. 2 compares the decision thresholds obtained by the proposed approximation method and the Beta and chi-square approximation methods when $K$ increases to 20. It is shown that the thresholds obtained using
(31) match the Monte-Carlo results very well even in the case with a small number of samples. On the contrary, both the classical Beta and chi-square approximation methods cannot give the decision threshold with high accuracy when $N$ is relatively small. Furthermore, by comparing Fig. 1 and Fig. 2, it is easy to find that as $K$ and $N$ increase, the accuracy of the proposed method is further improved. The reason is that the accuracy of the approximated distribution of the test statistic $T_{SD}$ increases with $K$ and $N$, which can be seen from (16) and (17). By contrast, the accuracy of the classical Beta and chi-square approximation methods decreases when $K$ and $N$ increase.

Fig. 1. Threshold versus N when K = 6: (a) Pf = 0.05; (b) Pf = 0.1.
Fig. 2. Threshold versus N when K = 20: (a) Pf = 0.05; (b) Pf = 0.1.

Fig. 3 and Fig. 4 illustrate the effect of threshold setting on the actual sensing performance of the SD. Here, we assume that the received SNRs of all sensors are the same and the target $P_f$ is 0.05. It is noted that the more accurate the threshold is, the closer the corresponding $P_f$ and the detection probability ($P_d$) are to the simulation curves. From Fig. 3, we can see that the actual $P_d$ and $P_f$ are in very good agreement with the Monte-Carlo results when the accurate threshold is used. We can also see that, when $N$ is relatively small, the new approximated threshold obtained via (31) yields a practical $P_f$ that is slightly lower than the target value, with a slight loss in the detection probability. By contrast, the decision thresholds via the Beta and chi-square approximation methods produce a non-negligible false-alarm probability deviation, which leads to unreliable sensing results. When $K$ increases to 20, Fig. 4 shows that the actual $P_d$ and $P_f$ obtained using the new approximated threshold match the Monte-Carlo results very well. However, the gaps between the actual $P_d$ and $P_f$ curves via the approximation methods in [2] and [5] and the Monte-Carlo results are further increased. For instance, compared with the actual performance, the detection probability via the Beta approximation is reduced by about 36.79% when $N = 25$, whereas the detection probability via the chi-square approximation is falsely raised by about 14.85%. Obviously, both of the two classical approximation methods produce very unreliable decision results.

Fig. 3. Actual Pd and Pf versus N (K = P = 6, SNR = 0 dB).
Fig. 4. Actual Pd and Pf versus N (K = P = 20, SNR = −5 dB).

VI. CONCLUSION
In this letter, two methods to calculate the threshold of the SD have been proposed. In the first method, the accurate false-alarm probability expression using the Meijer’s G-function was derived. Based on this, the accurate threshold can be obtained via numerical solution. In the second method, an approximation method with high accuracy but low computation complexity was proposed to directly calculate the threshold. Compared with the classical Beta and chi-square approximation methods, the proposed approximation method improves the threshold accuracy and can obtain reliable sensing results even when the number of observations is small. Numerical results verified the effectiveness of our methods.

REFERENCES
[1] L. Wei, P. Dharmawansa, and O. Tirkkonen, “Multiple primary user spectrum sensing in the low SNR regime,” IEEE Trans. Commun., vol. 61, no. 5, pp. 1720–1731, May 2013.
[2] L. Wei and O. Tirkkonen, “Spectrum sensing in the presence of multiple primary users,” IEEE Trans. Commun., vol. 60, no. 5, pp. 1268–1277, May 2012.
[3] D. A. Guimarães, R. A. A. de Souza, and G. P. Aquino, “Multiantenna spectrum sensing in the presence of multiple primary users over fading and nonfading channels,” Int. J. Antennas Propag., vol. 2015, pp. 1–14, Jun. 2015.
[4] X. Yang, K. Lei, S. Peng, and X. Cao, “Blind detection for primary user based on the sample covariance matrix in cognitive radio,” IEEE Commun. Lett., vol. 15, no. 1, pp. 40–42, Jan. 2011.
[5] R. J. Muirhead, Aspects of Multivariate Statistical Theory. New York, NY, USA: Wiley, 1982.
[6] I. S. Gradšhteyn, I. Ryžhik, A. Jeffrey, and D. Zwillinger, Table of Integrals, Series, and Products. Waltham, MA, USA: Academic, 2007.
[7] Wolfram Research, Inc. MeijerG Function. Accessed: Jul. 10, 2018. [Online]. Available: http://functions.wolfram.com/PDF/MeijerG.pdf
[8] T. W. Anderson, An Introduction to Multivariate Statistical Analysis, 3rd ed. Hoboken, NJ, USA: Wiley, 2003.
[9] K. Krishnamoorthy, Handbook of Statistical Distributions With Applications. Boca Raton, FL, USA: CRC Press, 2006.
[10] G. W. Hill and A. W. Davis, “Generalized asymptotic expansions of Cornish-Fisher type,” Ann. Math. Stat., vol. 39, no. 4, pp. 1264–1273, Dec. 1968.
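As an illustration of the approximate threshold in (31) of the preceding letter, the following Python sketch evaluates the coefficients in (19), the recursion in (25), and the resulting threshold. It is a minimal sketch under stated assumptions, not the authors' implementation: all function names are hypothetical, the chi-square quantile is obtained here by bisection on a power series for the regularized incomplete Gamma function (a choice made only to stay within the standard library), and Bernoulli numbers use the convention B_1 = -1/2. The two log-moment helpers compare the Gamma-function form (14) with the truncated expansion (18) as a consistency check.

```python
import math
from fractions import Fraction

def bernoulli_numbers(n_max):
    """Bernoulli numbers B_0..B_n_max as exact fractions (B_1 = -1/2)."""
    B = [Fraction(0)] * (n_max + 1)
    B[0] = Fraction(1)
    for m in range(1, n_max + 1):
        B[m] = -sum(math.comb(m + 1, k) * B[k] for k in range(m)) / (m + 1)
    return B

def bernoulli_poly(n, x, B):
    """Bernoulli polynomial B_n(x) = sum_k C(n, k) B_k x^(n-k)."""
    return sum(math.comb(n, k) * B[k] * Fraction(x) ** (n - k) for k in range(n + 1))

def beta(l, N, K, B):
    """Coefficient beta_l from (19)."""
    s = sum(bernoulli_poly(l + 1, -i, B) for i in range(K))
    num = (-1) ** l * (bernoulli_poly(l + 1, 0, B) - K ** l * s)
    return float(num / (l * (l + 1) * (K * N) ** l))

def phi_terms(L, y, K):
    """Phi_1(y)..Phi_L(y) from the recursion (25) with Phi_0 = 0."""
    out, acc, prod = [], 0.0, 1.0
    for q in range(1, L + 1):
        prod *= (K * K + 2 * q - 3) / y
        acc -= 2.0 / prod
        out.append(acc)
    return out

def chi2_cdf(x, dof):
    """Chi-square CDF via the series of the regularized lower incomplete Gamma."""
    a, z = dof / 2.0, x / 2.0
    if z <= 0:
        return 0.0
    term = total = 1.0 / a
    n = 0
    while abs(term) > 1e-16 * abs(total):
        n += 1
        term *= z / (a + n)
        total += term
    return total * math.exp(-z + a * math.log(z) - math.lgamma(a))

def chi2_upper_quantile(p, dof):
    """gamma_0 with P(chi2_dof > gamma_0) = p, as in (28), found by bisection."""
    lo, hi = 0.0, dof + 10.0 * math.sqrt(2.0 * dof) + 50.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if 1.0 - chi2_cdf(mid, dof) > p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def approx_threshold(pf, N, K, L=7):
    """Approximate SD threshold from (31)."""
    B = bernoulli_numbers(L + 1)
    gamma0 = chi2_upper_quantile(pf, K * K - 1)
    phis = phi_terms(L, gamma0, K)
    corr = sum(beta(l, N, K, B) * phis[l - 1] for l in range(1, L + 1))
    return math.exp(0.5 / N * (-gamma0 + corr))

def log_moment_exact(h, N, K):
    """ln E(T_SD^{Nh}) under H0 from the Gamma-function form (14)."""
    s = K * N * h * math.log(K) + math.lgamma(K * N) - math.lgamma(K * N * (1 + h))
    return s + sum(math.lgamma(N * (1 + h) - i) - math.lgamma(N - i) for i in range(K))

def log_moment_approx(h, N, K, L=7):
    """Truncated expansion (18) of the same log-moment."""
    B = bernoulli_numbers(L + 1)
    s = -0.5 * (K * K - 1) * math.log1p(h)
    return s + sum(beta(l, N, K, B) * ((1 + h) ** (-l) - 1) for l in range(1, L + 1))
```

A call such as `approx_threshold(0.05, 20, 6)` returns a threshold in (0, 1) for $K = 6$, $N = 20$ and a target $P_f$ of 0.05. Working with log-Gamma values, as in `log_moment_exact`, is one way to sidestep the very large and very small factors that the letter reports for the exact expression (13).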
High-Accuracy Entity State Prediction Method Based on Deep Belief Network Toward IoT Search
Puning Zhang, Member, IEEE, Xuyuan Kang, Dapeng Wu, and Ruyan Wang, Senior Member, IEEE
Abstract—The state of a physical entity in the Internet of Things (IoT) has an obvious time-varying characteristic. Preliminarily selecting candidate entities by predicting their current state when searching for matching entities among massive ones can effectively reduce the communication overhead of an IoT search system. The existing methods are all based on shallow learning theories whose performance is very limited. Thus, a high-accuracy entity state prediction method (HESPM) based on deep learning theory is proposed. The model of HESPM is built by utilizing the deep belief network. Then the contrastive divergence algorithm is adopted to train the model. Therefore, the dynamic evolution trend of the entity state can be accurately perceived and the future entity state can be precisely predicted. Simulation results demonstrate the effectiveness of HESPM in enhancing the prediction accuracy and communication overhead performances.

Index Terms—IoT, IoT search, entity state prediction, DBN.

Manuscript received August 18, 2018; accepted October 10, 2018. Date of publication October 23, 2018; date of current version April 9, 2019. This work was supported in part by the National Natural Science Foundation of China under Grant 61771082, in part by the Program for Innovation Team Building at Institutions of Higher Education in Chongqing under Grant CXTDX201601020, and in part by the Science and Technology Research Program of Chongqing Municipal Education Commission under Grant KJQN201800615. The associate editor coordinating the review of this paper and approving it for publication was M. Velez. (Corresponding author: Puning Zhang.)
The authors are with the School of Communication and Information Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065, China (e-mail: zhangpn@cqupt.edu.cn).
Digital Object Identifier 10.1109/LWC.2018.2877639

I. INTRODUCTION
WITH the in-depth applications of IoT, users have increasing demands for conveniently and effectively obtaining entity state information in the physical world [1]–[3], which gives birth to the IoT search service [4]. The objects of IoT search are physical entities whose state is dynamically changing, while the traditional Internet search engines focus on static virtual information resources in the cyberspace [5]–[7]. Thus, the traditional Internet searching methods cannot be directly applied in IoT search.
Typical IoT search prototype systems, like Microsearch [8] and Snoogle [9], stored static keyword descriptions of a physical entity in its associated sensor, which required accessing all sensors in the search system, according to keyword constraints provided by the user, to search for desired entities that meet the keyword requirements. As a result, this brings tremendous communication overhead to the IoT search system, which consumes a large amount of the energy resources of IoT sensors. Besides, it cannot search for the dynamic state of an entity. To reduce the communication overhead and support the search for entity dynamic state, [10] proposed Dyser, which perceived the transition periodicity of entity dynamic state to predict its state at the search moment. Then Dyser estimates the matching probability of an entity with the search request based on the predicted state, and accesses only part of the entities in descending order of matching probability rather than all of them. However, Dyser can only search for the qualitative state of an entity. To meet the search needs for the quantitative state of an entity, [11] presented CSS, which was based on the fuzzy set theory to estimate the matching probability of an entity by utilizing the quantitative sensor output that was non-periodic. MSE, which is based on the Grey system, and MPM, which is based on LS-SVM, were respectively proposed in [12] and [13] to predict the future entity state by exploring the temporal correlations between entity state data. However, these three methods are all based on shallow learning theory, which results in limited prediction precision of the entity quantitative state. It further leads to large communication overhead and energy consumption of IoT sensors.
In this letter, a high-accuracy entity state prediction method (HESPM) is proposed. The model of HESPM is built based on DBN [14] and the CD [15] algorithm is adopted to train the model, so as to accurately predict the future entity state and effectively reduce the communication overhead of IoT sensor networks. To our knowledge, this is the first research on an entity state prediction method based on deep learning theory in IoT search. The remaining parts are organized as follows. Section II describes the search mechanism. Section III proposes the HESPM. Section IV discusses the performance evaluation. Section V finally concludes this letter.

II. SEARCH MECHANISM DESCRIPTION
As shown in Fig. 1, the IoT search system consists of client, server, IoT gateway, database, sensor and entity. The entity is the target being searched. One sensor is associated with one entity and is responsible for observing the entity state. The server responds to the search request and issues it to the appropriate gateway. The IoT gateway is used for managing sensor networks, collecting entity state data reported by the sensor, and predicting the future entity state based on the HESPM method proposed in Section III. The database stores the predicted future entity state. The user can publish search requests via the client.
The IoT entity search mechanism is divided into two stages, the offline predicting (OP) stage and the online searching (OS) stage. In the OP stage, the gateway receives entity state data and predicts the future state. Multiple search requests can be initiated during the OS stage. Different from the traversal search method discussed above, our search mechanism finds matching entities by preliminarily predicting the entity state at the search moment rather than accessing all entities, which greatly reduces the communication overhead.
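The online searching (OS) stage described above can be sketched as follows. This is only an illustrative sketch, not the letter's implementation: the function name `online_search`, the `read_sensor` callback, the `margin` parameter, and the verification of each candidate by one sensor access are all assumptions introduced here. It shows how predictions stored during the OP stage restrict sensor accesses to candidate entities instead of traversing all of them.

```python
from typing import Callable, Dict, List, Tuple

def online_search(predicted: Dict[str, float],
                  read_sensor: Callable[[str], float],
                  lo: float, hi: float,
                  margin: float = 0.0) -> Tuple[List[str], int]:
    """Return (matching entity ids, number of sensors accessed).

    predicted   : entity id -> state predicted in the OP stage (hypothetical store)
    read_sensor : callback that fetches the current state of one entity
    [lo, hi]    : search range of the request
    margin      : assumed tolerance for prediction error when picking candidates
    """
    # Candidate selection uses only stored predictions, so it costs no sensor traffic.
    candidates = [e for e, x in predicted.items() if lo - margin <= x <= hi + margin]
    matches, accesses = [], 0
    for entity in candidates:
        accesses += 1                      # one sensor access per candidate
        if lo <= read_sensor(entity) <= hi:
            matches.append(entity)
    return matches, accesses

# Hypothetical data: three entities, search range [20, 25].
predicted = {"e1": 20.5, "e2": 31.0, "e3": 24.9}   # states stored by the OP stage
actual = {"e1": 21.0, "e2": 30.2, "e3": 25.4}      # what the sensors would report
matches, accesses = online_search(predicted, actual.__getitem__, 20.0, 25.0)
```

In this toy case only two of the three sensors are accessed, which is the kind of saving the mechanism aims for. A traversal search would access all three.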
ZHANG et al.: HESPM BASED ON DEEP BELIEF NETWORK TOWARD IoT SEARCH 493
Fig. 1. IoT entity search mechanism.
Fig. 2. The model of HESPM.

III. HESPM DESCRIPTION
In this part, we propose the HESPM method to perceive the dynamic evolution feature of the entity state and accurately predict the future entity state.

A. Model Building
The HESPM method is based on DBN theory and uses several Restricted Boltzmann Machines (RBMs) to build its model, which consists of the input layer, RBMs, and the output layer, as shown in Fig. 2.
The RBM is composed of the visible layer and the hidden layer. $v_i$ is the $i$-th neuron of the visible layer and $h_j$ is the $j$-th neuron of the hidden layer. The entity state data $x(t-\tau), x(t-2\tau), \ldots, x(t-n\tau)$ are used as the input of the visible layer, where $\tau$ is the sample interval and $n$ is the number of neurons in the visible layer. The hidden layer can be regarded as a feature extractor of entity state data. Neurons in adjacent layers are fully connected and there are no connections within the same layer.
Denote the state vector of the RBM visible layer neurons as $\mathbf{v} = [v_1, v_2, \ldots, v_n]$. The state vector of neurons in the hidden layer is $\mathbf{h} = [h_1, h_2, \ldots, h_m]$, where $m$ is the number of neurons in the hidden layer. Then for a given state combination $(\mathbf{v}, \mathbf{h})$, the energy contained in the RBM system is

$$E(\mathbf{v}, \mathbf{h}|\theta) = -\sum_{i=1}^{n} a_i v_i - \sum_{j=1}^{m} b_j h_j - \sum_{i=1}^{n}\sum_{j=1}^{m} v_i w_{ij} h_j \tag{1}$$

where $\theta = \{\mathbf{a}, \mathbf{b}, \mathbf{w}\}$ is the parameter set of the RBM system. $a_i \in \mathbf{a}$ is the bias of the visible layer neuron, $b_j \in \mathbf{b}$ is the bias of the hidden layer neuron, and $w_{ij} \in \mathbf{w}$ is the weight between $v_i$ and $h_j$. When $\theta$ is given, the joint probability distribution of $(\mathbf{v}, \mathbf{h})$ can be calculated as

$$P(\mathbf{v}, \mathbf{h}|\theta) = \frac{e^{-E(\mathbf{v},\mathbf{h}|\theta)}}{Z(\theta)}, \quad Z(\theta) = \sum_{\mathbf{v},\mathbf{h}} e^{-E(\mathbf{v},\mathbf{h}|\theta)} \tag{2}$$

The key to predicting the future state of a given entity lies in solving the probability distribution of the entity state data $x(t-\tau), x(t-2\tau), \ldots, x(t-n\tau)$ as

$$P(\mathbf{v}|\theta) = \sum_{\mathbf{h}} P(\mathbf{v}, \mathbf{h}|\theta) = \frac{\sum_{\mathbf{h}} e^{-E(\mathbf{v},\mathbf{h}|\theta)}}{Z(\theta)} \tag{3}$$

The computation complexity of the normalization factor $Z(\theta)$ is extremely high, so $P(\mathbf{v}|\theta)$ cannot be calculated by solving $Z(\theta)$. However, since neurons in adjacent layers are fully connected and neurons in the same layer are not connected, if the state vector of neurons in the visible layer $\mathbf{v}$ is known, the activation probability of the $j$-th neuron in the hidden layer is

$$P(h_j = 1|\mathbf{v}, \theta) = \frac{1}{1 + \exp\left(-b_j - \sum_i v_i w_{ij}\right)} \tag{4}$$

As the RBM has a symmetrical structure, when the state vector of hidden layer neurons $\mathbf{h}$ is given, the probability of the $i$-th neuron in the visible layer being activated can be computed as

$$P(v_i = 1|\mathbf{h}, \theta) = \frac{1}{1 + \exp\left(-a_i - \sum_j h_j w_{ij}\right)} \tag{5}$$

Thus the probability distribution of the entity state data can be estimated and the entity state data can be reconstructed. The entity state data are taken as the input of the visible layer neurons in RBM1, and the parameters in $\theta$ of RBM1 are tuned to reconstruct the input. The output of the hidden layer in RBM1 is then taken as the input of RBM2. By continuing this process, the entity state at time $t$, $x(t)$, can be predicted in the output layer as $\hat{x}(t)$. Obviously, the fine-tuning method for solving $\theta$ in each RBM is the key to making predictions.

B. Model Training
In this part, we adopt the CD algorithm to solve $\theta$. In general, the optimal solution of $\theta$ can be obtained by maximizing the log-likelihood function $\ell(\theta)$ of entity state data reconstructed by the RBM as

$$\theta^{*} = \arg\max_{\theta} \ell(\theta) = \arg\max_{\theta} \sum_{t=1}^{T} \log \sum_{\mathbf{h}} P\left(\mathbf{v}^{(t)}, \mathbf{h}|\theta\right) = \arg\max_{\theta} \sum_{t=1}^{T} \left( \log \sum_{\mathbf{h}} e^{-E(\mathbf{v}^{(t)},\mathbf{h}|\theta)} - \log \sum_{\mathbf{v}}\sum_{\mathbf{h}} e^{-E(\mathbf{v},\mathbf{h}|\theta)} \right) \tag{6}$$
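To make the RBM quantities above concrete, the following Python sketch evaluates the energy in (1) and the conditional activation probabilities in (4) and (5), and performs one parameter update in the style of one-step contrastive divergence (CD-1). It is a minimal sketch under stated assumptions rather than the letter's implementation: the function names are hypothetical, the inputs are assumed to be scaled to [0, 1], the reconstruction uses activation probabilities instead of sampled visible states, and the CD-1 step follows the commonly described form of the algorithm.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def energy(v, h, a, b, w):
    """Energy E(v, h | theta) of a joint state, as in (1)."""
    return (-sum(ai * vi for ai, vi in zip(a, v))
            - sum(bj * hj for bj, hj in zip(b, h))
            - sum(v[i] * w[i][j] * h[j]
                  for i in range(len(v)) for j in range(len(h))))

def p_hidden(v, b, w):
    """P(h_j = 1 | v, theta) for every hidden neuron, as in (4)."""
    return [sigmoid(b[j] + sum(v[i] * w[i][j] for i in range(len(v))))
            for j in range(len(b))]

def p_visible(h, a, w):
    """P(v_i = 1 | h, theta) for every visible neuron, as in (5)."""
    return [sigmoid(a[i] + sum(h[j] * w[i][j] for j in range(len(h))))
            for i in range(len(a))]

def cd1_step(v0, a, b, w, eps=0.1, rng=random):
    """One CD-1 update of (a, b, w) in place; returns the reconstruction of v0."""
    ph0 = p_hidden(v0, b, w)                       # statistics driven by the data
    h0 = [1.0 if rng.random() < p else 0.0 for p in ph0]
    v1 = p_visible(h0, a, w)                       # one-step reconstruction
    ph1 = p_hidden(v1, b, w)                       # statistics after reconstruction
    for i in range(len(a)):
        a[i] += eps * (v0[i] - v1[i])
        for j in range(len(b)):
            w[i][j] += eps * (v0[i] * ph0[j] - v1[i] * ph1[j])
    for j in range(len(b)):
        b[j] += eps * (ph0[j] - ph1[j])
    return v1

# Hypothetical toy RBM with n = 2 visible and m = 1 hidden neuron.
a, b, w = [0.5, -0.2], [0.1], [[0.3], [0.7]]
e = energy([1.0, 0.0], [1.0], a, b, w)             # -0.5 - 0.1 - 0.3
```

Because each conditional in (4) and (5) factorizes over neurons, both directions cost only a matrix-vector product, which is what makes the alternating reconstruction cheap even though $Z(\theta)$ itself is intractable.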
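Assume $\varphi$ to be one of the parameters in $\theta$. Taking the partial derivative of the log-likelihood function with respect to $\varphi$ gives

$$\frac{\partial \ell(\theta)}{\partial \varphi} = \sum_{t=1}^{T} \frac{\partial}{\partial \varphi} \left( \log \sum_{\mathbf{h}} e^{-E(\mathbf{v}^{(t)},\mathbf{h}|\theta)} - \log \sum_{\mathbf{h}}\sum_{\mathbf{v}} e^{-E(\mathbf{v},\mathbf{h}|\theta)} \right) = \sum_{t=1}^{T} \left( \left\langle \frac{\partial\left(-E(\mathbf{v}^{(t)},\mathbf{h}|\theta)\right)}{\partial \varphi} \right\rangle_{P(\mathbf{h}|\mathbf{v}^{(t)},\theta)} - \left\langle \frac{\partial\left(-E(\mathbf{v},\mathbf{h}|\theta)\right)}{\partial \varphi} \right\rangle_{P(\mathbf{v},\mathbf{h}|\theta)} \right) \tag{7}$$

where $\langle \bullet \rangle_{P(\bullet)}$ denotes the expectation of $\bullet$ with regard to the probability distribution $P(\bullet)$. As described above, $Z(\theta)$ is difficult to calculate. Thus the partial derivative $\partial \ell(\theta)/\partial \varphi$ cannot be solved directly, which further implies that the solution of $\theta$ cannot be obtained by using the traditional stochastic gradient ascent method. Therefore, we adopt the CD algorithm to solve $\theta$. The entity state data are continuously input to the visible layer neurons and the input data are reconstructed according to (4) and (5), while the solution of $\theta$ is iteratively adjusted to minimize the reconstruction errors. The updating method of each parameter in $\theta$ is as follows

$$\begin{aligned} \Delta a_i &= \varepsilon\left( \langle v_i \rangle_{AC} - \langle v_i \rangle_{RC} \right) \\ \Delta b_j &= \varepsilon\left( \langle h_j \rangle_{AC} - \langle h_j \rangle_{RC} \right) \\ \Delta w_{ij} &= \varepsilon\left( \langle v_i h_j \rangle_{AC} - \langle v_i h_j \rangle_{RC} \right) \end{aligned} \tag{8}$$

where $\varepsilon$ is the learning rate, $\langle \bullet \rangle_{AC}$ denotes the expectation with respect to the probability distribution of the actual entity state data, and $\langle \bullet \rangle_{RC}$ denotes the expectation with respect to the probability distribution of the entity state data after being one-step reconstructed by the RBM.

that need to be returned to the user. The search range $[a, b]$ is randomly distributed within the defined interval $[x_{Min}, x_{Max}]$, in which $x_{Min}$ is the minimum and $x_{Max}$ is the maximum in the dataset. We perform the experiments on a PC with an Intel Core i7 CPU and 8 GB RAM under MATLAB R2016a. Simulation parameters are set as shown below. All simulation results are the mean of 50 runs.

TABLE I. SIMULATION PARAMETERS SETTING
Fig. 3. Comparisons about single-step prediction performances.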
We use the above method to train RBM 1 layer by layer, and A. Validations on Prediction Accuracy
then train the RBM 2 until all RBMs are trained, so as to obtain In this part, we validate the prediction accuracy of 52 sen-
the best solution of θ. Furthermore, we construct a multi-layer sors when using different prediction methods under single-step
neural network (NN) that is consistent with the designed DBN and multi-step conditions and compare the mean absolute per-
model, and assign θ as initial values of the NN parameters. centage error (MAPE) performances of HESPM, MSE [12],
Input the entity state data x (t − τ ), x (t − 2τ ), . . . , x (t − nτ ) and MPM [13] methods, which is defined as
into the NN and we can obtain the predicted entity state value
N
x̂ (t) through the feedforward of NN. Based on the difference 1 x̂ (t) − x (t)
MAPE = (9)
between the actual value x(t) and predictive value x̂ (t), the BP N − Ntr t=Ntr +1 x (t)
(back-propagating) [15] method is executed to fine-tuning the
where N is the total number of data and Ntr is the training
weight of NN to reduce the prediction error. Keep executing
data size. The smaller MAPE indicates the better prediction
this process until the NN reaches the convergence condition.
accuracy.
Therefore, after training its model HESPM can predict the
As shown in Fig. 3, the single-step prediction MAPEs of
future state of entity according to the historical data. Finally,
these three methods are all relatively stable for different nodes.
the search mechanism can find match entities based on their
The average MAPE of HESPM is 1.09, which is much less
predictive values.
than MPM, 1.91, and far less than MSE, 3.26, indicating that
HESPM has better single-step prediction accuracy than the
IV. P ERFORMANCE E VALUATION other two methods. From Fig. 4, we see that the MAPEs of
We use Intel Lab [16] dataset, which includes 54 tempera- these three methods all present rising trends with the increase
ture and humidity sensors, and select 52 sensors whose dataset of predictive step. This is because the predictive value will be
are relatively complete to verify the performance of HESPM. taken as input for making new predictions, which result in the
The search form submitted by the user is Q = ([a, b], Rq ), accumulations of prediction errors. Thus the prediction error
which means the search need for entities whose observations becomes increasingly larger. The multi-step prediction average
is between [a, b] right now. Rq is the number of match entities MAPE of HESPM is 3.71, which is slightly less than MPM,
ZHANG et al.: HESPM BASED ON DEEP BELIEF NETWORK TOWARD IoT SEARCH 495
4.18, and superior to MSE, 4.85. It indicates that HESPM has better multi-step prediction performance than MPM and MSE.

Fig. 4. Prediction performance under different predictive steps.

B. Comparisons About Search Performances

In this part, we compare the performances of the HESPM-based, MPM-based, and MSE-based search mechanisms under different query range and predictive step conditions. As mentioned above, the proposed search mechanism includes the entity state validation process to ensure the reliability of search results. Thus the efficiency of the search mechanism is directly related to the number of entities that need to be verified, n_a, when R_q is given. Therefore, we define the communication overhead (CO) as CO = 1 − R_q/n_a. A smaller CO indicates higher efficiency.

In Fig. 5, as the query range increases, the CO of the three mechanisms all show gradual downward trends. Compared with MPM and MSE, the proposed HESPM reduces the average CO by about 25.4% and 35.5%, respectively, under different query ranges. As seen from Fig. 6, with the increase of the predictive step, the CO of these three mechanisms gradually rises. We also see that HESPM can improve the average CO by 14.6% and 21.7%, respectively, in comparison with MPM and MSE.

Fig. 5. CO performances under different query ranges.

Fig. 6. CO performances under different predictive steps.

V. CONCLUSION

To reduce the communication overhead of the IoT search system, an entity state prediction method, HESPM, is proposed in this letter to sense the dynamic evolution characteristic of the entity state and predict its future state with high accuracy. Therefore, those plentiful entities whose state is dynamically changing can be found efficiently at the cost of low communication overhead. Simulation results show the effectiveness of HESPM. We assume that the entity and sensor are one-to-one associated. In the future, we plan to study collaborative search technology in one-to-many applications.

REFERENCES

[1] D. P. Wu, S. S. Si, S. E. Wu, and R. Y. Wang, “Dynamic trust relationships aware data privacy protection in mobile crowd-sensing,” IEEE Internet Things J., vol. 5, no. 4, pp. 2958–2970, Aug. 2018.
[2] C. Y. Zhang and W. Zhang, “Spectrum sharing for drone networks,” IEEE J. Sel. Areas Commun., vol. 35, no. 1, pp. 136–144, Jan. 2017.
[3] P. Zhang and J. Ma, “Channel characteristic aware privacy protection mechanism in WBAN,” Sensors, vol. 18, no. 8, p. 2403, Aug. 2018.
[4] N. K. Tran, Q. Z. Sheng, M. A. Babar, and L. Yao, “Searching the Web of Things: State of the art, challenges, and solutions,” ACM Comput. Surveys, vol. 50, no. 4, pp. 1–34, Nov. 2017.
[5] D. P. Wu, J. J. Yan, H. G. Wang, D. L. Wu, and R. Y. Wang, “Social attribute aware incentive mechanism for device-to-device video distribution,” IEEE Trans. Multimedia, vol. 19, no. 8, pp. 1908–1920, Aug. 2017.
[6] D. P. Wu, Q. R. Liu, H. G. Wang, D. L. Wu, and R. Y. Wang, “Socially aware energy-efficient mobile edge collaboration for video distribution,” IEEE Trans. Multimedia, vol. 19, no. 10, pp. 2197–2209, Oct. 2017.
[7] P. Zhang, X. Kang, Y. Liu, and H. Yang, “Cooperative willingness aware collaborative caching mechanism towards cellular D2D communication,” IEEE Access, to be published, doi: 10.1109/ACCESS.2018.2873662.
[8] C. C. Tan, B. Sheng, H. D. Wang, and Q. Li, “Microsearch: A search engine for embedded devices used in pervasive computing,” ACM Trans. Embedded Comput. Syst., vol. 9, no. 4, pp. 1–29, Mar. 2010.
[9] H. D. Wang, C. C. Tan, and Q. Li, “Snoogle: A search engine for pervasive environments,” IEEE Trans. Parallel Distrib. Syst., vol. 21, no. 8, pp. 1188–1202, Aug. 2010.
[10] K. Romer, B. Ostermaier, F. Mattern, M. Fahrmair, and W. Kellerer, “Real-time search for real-world entities: A survey,” Proc. IEEE, vol. 98, no. 11, pp. 1887–1902, Nov. 2010.
[11] C. Truong and K. Römer, “Content-based sensor search for the Web of Things,” in Proc. IEEE Glob. Commun. Conf., Dec. 2013, pp. 2654–2660.
[12] P. N. Zhang, Y.-A. Liu, F. Wu, and B. H. Tang, “Matching state estimation scheme for content-based sensor search in the Web of Things,” Int. J. Distrib. Sensor Netw., vol. 2015, pp. 1–15, Jan. 2015.
[13] P. N. Zhang, Y. A. Liu, F. Wu, S. Y. Liu, and B. H. Tang, “Low-overhead and high-precision prediction model for content-based sensor search in the Internet of Things,” IEEE Commun. Lett., vol. 20, no. 4, pp. 720–723, Apr. 2016.
[14] G. E. Hinton, S. Osindero, and Y.-W. Teh, “A fast learning algorithm for deep belief nets,” Neural Comput., vol. 18, no. 7, pp. 1527–1554, Jul. 2006.
[15] D. E. Rumelhart, G. E. Hinton, and R. J. Williams, “Learning representations by back-propagating errors,” Nature, vol. 323, no. 9, pp. 533–536, Oct. 1986.
[16] Intel Lab Data. (Oct. 2016). [Online]. Available: http://db.lcs.mit.edu/labdata/labdata.html
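The two evaluation metrics of this letter, the MAPE in (9) and the communication overhead CO = 1 − R_q/n_a of Section IV-B, can be sketched as follows. This is only an illustration under assumptions: the function names are hypothetical, the series are plain arrays indexed from the first sample, and the MAPE is returned as a fraction exactly as (9) is written, although the reported values may well be percentages.

```python
import numpy as np

def mape(actual, predicted, n_train):
    # Mean absolute percentage error over the test samples t = N_tr + 1, ..., N
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    test_actual = actual[n_train:]
    test_predicted = predicted[n_train:]
    return float(np.mean(np.abs(test_predicted - test_actual) / test_actual))

def communication_overhead(r_q, n_a):
    # CO = 1 - R_q / n_a, where n_a entities are verified to return R_q matches
    return 1.0 - r_q / n_a

# Toy usage with made-up numbers (not data from the letter)
example_mape = mape([10, 20, 40, 50], [10, 20, 44, 45], n_train=2)
example_co = communication_overhead(r_q=5, n_a=20)
print(example_mape, example_co)
```

A predictor that ranks candidates well needs fewer verifications n_a for the same R_q, which is how prediction accuracy translates into a lower CO.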
High Rate CCK Modulation Design for Bandwidth Efficient Link Adaptation
Han Wang, Lianyou Jing, Chengbing He, and Zhi Ding, Fellow, IEEE
Abstract—This letter presents a novel design of a high rate complementary code keying (CCK) modulation scheme (named QAM-CCK) that is suitable for bandwidth efficient wireless link adaptation. The proposed QAM-CCK can increase the bandwidth efficiency of CCK signaling by simultaneously implementing quadrant and phase modulations. We introduce an optimal design criterion based on weighted minimum Euclidean distance to optimize the QAM-CCK mapping with minimum average bit error probability. In order to utilize the inherent coding gain of QAM-CCK, we propose an iterative soft maximum likelihood receiver to enhance the decoding performance. Compared with traditional QPSK of the same bandwidth efficiency, the proposed QAM-CCK achieves a 4-dB SNR gain under additive white Gaussian noise channels when used in conjunction with the proposed iterative receiver.

Index Terms—Complementary code keying (CCK), modulation design, link adaptation, bandwidth efficiency.

Manuscript received August 13, 2018; accepted October 12, 2018. Date of publication October 23, 2018; date of current version April 9, 2019. The work of H. Wang and C. He was supported by the National Natural Science Foundation of China under Grant 61771396 and Grant 61471298. The associate editor coordinating the review of this paper and approving it for publication was C. Huang. (Corresponding author: Han Wang.)
H. Wang and C. He are with the School of Marine Science and Technology, Northwestern Polytechnical University, Xi’an 710072, China (e-mail: whan@mail.nwpu.edu.cn; hcb@nwpu.edu.cn).
L. Jing is with the School of Information and Communication Engineering, Dalian University of Technology, Dalian 116024, China (e-mail: lyjing@dlut.edu.cn).
Z. Ding is with the Department of Electrical and Computer Engineering, University of California at Davis, Davis, CA 95616 USA (e-mail: zding@ucdavis.edu).
Digital Object Identifier 10.1109/LWC.2018.2877648

I. INTRODUCTION

IN RECENT decades, there have been only a few advances in modulation technology to achieve high bandwidth efficiency in wireless communications despite the tremendous growth of data network traffic. At the start of WiFi network standardization, complementary code keying (CCK) was adopted as a high rate spread spectrum technology for indoor wireless local area networks (WLANs) [1] with substantial success. Although CCK is not among the most popular modulation techniques, practical applications in [2]–[4] demonstrate its advantages, such as flexible rate and inherent coding or spreading gain, under distorted channels.

However, the achievable bandwidth efficiency of traditional CCK modulation is lower than that of the commonly applied PSK or QAM modulations without spreading. For example, IEEE 802.11b provided a CCK algorithm offering a rate of 1 bit/symbol, and a higher rate of 1.5 bits/symbol was reported in [5]. Nevertheless, this rate is still below the 2 bits/symbol rate of plain QPSK signaling. Therefore, one of our objectives is to improve the bandwidth efficiency of CCK modulation in wireless communication links.

Another challenge in designing a new high bandwidth efficiency CCK modulation is the corresponding detection performance. Under additive white Gaussian noise (AWGN) channels, the optimum receiver based on the maximum a posteriori (MAP) detection principle achieves the minimum probability of error for data symbols [6]. As is commonly known, MAP detection simplifies to minimum distance detection under AWGN channels [7]. Thus, the Euclidean distance between codewords is an important code design metric against AWGN channels. We are motivated to optimize the new CCK-based modulation by maximizing the weighted Euclidean distance between codewords. Moreover, inspired by the concept of turbo decoding by Berrou et al. [8], soft information exchange has demonstrated significant performance improvement in many MAP based communication receivers [9]. In particular, the turbo receiver is a high performance joint equalization and decoding scheme that successfully utilizes the principle of iterative soft information exchange [10]. To take advantage of the coding gain in CCK, we are motivated to adopt soft information exchange in our receivers.

In this letter, we propose a high rate CCK modulation scheme that is bandwidth efficient, increasing the information bit load conveyed in each transmitted symbol to match that of basic QPSK. The proposed CCK modulation is named QAM-CCK as it adopts the same constellation diagram as 16-QAM. Furthermore, we propose a weighted minimum Euclidean distance as the design criterion to optimize the QAM-CCK performance. By utilizing the proposed iterative turbo receiver, which is capable of soft-information exchange, the optimized QAM-CCK design demonstrates superior performance in both analytical and numerical simulation results.

The rest of this letter is organized as follows. Section II describes a novel QAM-CCK modulation and the corresponding turbo-structure iterative receiver. Section III presents the details of QAM-CCK design optimization based on a weighted minimum Euclidean distance. Section IV provides simulation results of the proposed QAM-CCK modulation. Finally, concluding remarks are given in Section V.

II. SYSTEM MODEL

A. Basic CCK Modulation

In the conventional CCK modulation technique, a sequence of binary bits for transmission is first grouped into subsequences of length 8 and each subsequence is encoded independently afterwards. Denote a vector corresponding to such a subsequence as b = [b_1, b_2, ..., b_8]. Each information vector b is then encoded as one CCK codeword c = [c_1, c_2, ..., c_8]
consisting of 8 phase-modulated (QPSK) symbols

$$c_i = \exp(j\theta_i), \quad i = 1, 2, \ldots, 8, \tag{1}$$

where θ_i ∈ {0, π/2, π, 3π/2}. Each c_i is a chip in the CCK codeword. Obviously, the coded symbol vector c is chosen from a code set C with cardinality |C| = 2^8 = 256. The chip phase vector θ = [θ_1, ..., θ_8]^T is coded according to a linear block encoding strategy [11]

$$\boldsymbol{\theta} = G\boldsymbol{\varphi} + \boldsymbol{\phi}, \tag{2}$$

in which the generating matrix G is defined as

$$G^{T} = \begin{bmatrix}
1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 \\
1 & 0 & 1 & 0 & 1 & 0 & 1 & 0 \\
1 & 1 & 0 & 0 & 1 & 1 & 0 & 0 \\
1 & 1 & 1 & 1 & 0 & 0 & 0 & 0
\end{bmatrix}. \tag{3}$$

The cover code phase vector ϕ = [0 0 0 π 0 0 π 0]^T rotates the fourth and seventh chips by π. ϕ optimizes the sequence correlation properties and minimizes the DC offset within the codewords [11]. The mapping from the payload information bits b to the phase vector φ = [φ_1, φ_2, φ_3, φ_4]^T can also be found in [11].

B. Proposed QAM-CCK Modulation

In order to increase the rate and improve the bandwidth-efficiency of CCK modulation without expanding the code set C, it is intuitive to enlarge the number of information bits conveyed in a single chip. Unlike the QPSK chips adopted by conventional CCK, we propose a new CCK modulation algorithm whose chips are QAM symbols. This new modulation technique is named QAM-CCK because it has the same constellation diagram as 16-QAM. The proposed modulation has a rate of 2 bits/symbol as a result of compacting the original 8 chips into 4 chips. We now describe the specific modulation steps.

The input bits b are first divided into subsequences of length 8 and then modulated into CCK codewords c = [c_1, ..., c_8] through a conventional CCK modulator defined in Eqs. (1)-(3). Note that c has the constellation of QPSK modulation.

QAM improves the bandwidth-efficiency by simultaneously applying quadrant and phase modulations. This strategy is inherited in our QAM-CCK modulation to raise the number of bits conveyed in each chip. Hence, the 8 chips of each original CCK codeword are equally divided into two groups, c^Q and c^S, used for quadrant and phase modulation, respectively. Because each chip has 4 possible values, c^Q first defines the new centroid from 4 different points on the circle with a radius of 2√2 from the origin. Then c^S determines one of the four possible phase rotations on the unit circle with respect to the new circle centroid selected.

The QAM-CCK codeword x is jointly modulated by c^Q and c^S. This procedure can be described mathematically as follows,

$$x_n = 2\sqrt{2} \cdot c_n^{Q} + \sqrt{2} \cdot c_n^{S}, \quad n = 1, 2, 3, 4, \tag{4}$$

where c_n^Q and c_n^S are the chips of the original CCK codewords. It is worth noting that different divisions of c^Q and c^S result in diverse bit error performance. Our simulations suggest that the performance loss among different divisions can be as much as 5 dB. Thus, the selection and arrangement of c^Q and c^S are crucial for satisfactory performance. The detailed principle for selecting c^Q and c^S will be presented in Section III.

C. Iterative Soft Maximum Likelihood Detection

QAM-CCK inherits the block coding property of conventional CCK modulation, hence the modulation/demodulation process can be regarded as a coding/decoding process. After introducing channel encoding before QAM-CCK modulation, the resulting system can be treated as a serially concatenated coding procedure. A commonly applied decoder for concatenated codes is the iterative detector based on the turbo structure. Thus, we propose an iterative soft maximum likelihood detector as shown in Fig. 1 to exploit the coding property.

Fig. 1. System Model.

Under an AWGN channel, the received signal can be denoted as

$$y = x + n, \tag{5}$$

where n is the additive white Gaussian noise with variance σ².

The iterative soft detector communicates reliability or soft information between the demodulator and the decoder. Compared with the hard decision method, soft iterative detection can raise the confidence level by using the soft information at the cost of additional computations, delay and storage. The generally adopted soft information for exchange is the log likelihood ratio (LLR). The a posteriori LLR for QAM-CCK is denoted by L_P(b_n|y) in (6),

$$\begin{aligned}
L_P(b_n \mid y) &= \ln \frac{P(b_n = 0 \mid y)}{P(b_n = 1 \mid y)} \\
&= \ln \frac{\sum_{\forall x: b_n = 0} \sum_{\forall \varphi: \varphi \to x} p(y \mid x, \varphi, b_n = 0) \cdot p(x \mid \varphi, b_n = 0) \cdot p(\varphi \mid b_n = 0) \cdot p(b_n = 0)}{\sum_{\forall x: b_n = 1} \sum_{\forall \varphi: \varphi \to x} p(y \mid x, \varphi, b_n = 1) \cdot p(x \mid \varphi, b_n = 1) \cdot p(\varphi \mid b_n = 1) \cdot p(b_n = 1)} \\
&= \ln \frac{\sum_{\forall x: b_n = 0} \sum_{\forall \varphi: \varphi \to x} p(y \mid x, \varphi, b_n = 0) \cdot p(x \mid \varphi, b_n = 0) \cdot p(\varphi \mid b_n = 0)}{\sum_{\forall x: b_n = 1} \sum_{\forall \varphi: \varphi \to x} p(y \mid x, \varphi, b_n = 1) \cdot p(x \mid \varphi, b_n = 1) \cdot p(\varphi \mid b_n = 1)} + L(b_n) \\
&= L_E(b_n \mid y) + L(b_n),
\end{aligned} \tag{6}$$

where L_E(b_n|y) and L(b_n) represent the extrinsic LLR and the prior LLR, respectively. Under AWGN channels, p(y|x, φ, b_n) ∼ N(0, σ²) is a Gaussian random variable,

$$p(y \mid x, \varphi, b_n) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left( \frac{-|x - y|^2}{2\sigma^2} \right). \tag{7}$$
Note that p(x|φ), p(φ|bn ) and L(bn ) are fixed once the code
design method is determined. LP (bn |y) is mainly affected
by p(y|x, φ, bn ), which is decided by the Euclidean distance
|x − y|2 with certain noise variance σ 2 . We are able to increase
the reliability of LP (bn |y) by enlarging the Euclidean dis-
tance between different codewords. Consequently, maximizing
Euclidean distance is beneficial for improving the detection
performance.
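The encoding chain in (1)-(4) can be sketched as below. Several details are assumptions rather than the authors' implementation: the dibit-to-phase table stands in for the bit mapping that the letter defers to [11], φ_1 is mapped directly instead of differentially, the default split of chips into c^Q and c^S is an arbitrary (non-optimized) choice, and all function names are hypothetical.

```python
import itertools
import numpy as np

# Generating matrix G (8 x 4), written via its transpose as in (3)
G = np.array([[1, 1, 1, 1, 1, 1, 1, 1],
              [1, 0, 1, 0, 1, 0, 1, 0],
              [1, 1, 0, 0, 1, 1, 0, 0],
              [1, 1, 1, 1, 0, 0, 0, 0]]).T
# Cover code: rotates the fourth and seventh chips by pi
COVER = np.array([0, 0, 0, np.pi, 0, 0, np.pi, 0])
# Assumed dibit-to-phase table (placeholder for the mapping in [11])
DIBIT_PHASE = {(0, 0): 0.0, (0, 1): np.pi / 2, (1, 0): np.pi, (1, 1): 3 * np.pi / 2}

def cck_encode(bits):
    # 8 bits -> 4 phases -> chip phases theta = G*phi + cover, as in (2) -> chips (1)
    phi = np.array([DIBIT_PHASE[(bits[2 * k], bits[2 * k + 1])] for k in range(4)])
    theta = (G @ phi + COVER) % (2 * np.pi)
    return np.exp(1j * theta)

def qam_cck_encode(bits, q_idx=(0, 1, 2, 3), s_idx=(4, 5, 6, 7)):
    # Combine quadrant chips c^Q and phase chips c^S into 4 QAM chips, as in (4)
    c = cck_encode(bits)
    c_q = c[list(q_idx)]
    c_s = c[list(s_idx)]
    return 2 * np.sqrt(2) * c_q + np.sqrt(2) * c_s

def codebook(q_idx=(0, 1, 2, 3), s_idx=(4, 5, 6, 7)):
    # All 256 QAM-CCK codewords for a given chip split
    return [qam_cck_encode(bits, q_idx, s_idx)
            for bits in itertools.product([0, 1], repeat=8)]

x_zero = qam_cck_encode([0] * 8)
print(np.round(x_zero, 3))
```

Each QAM chip x_n determines its pair (c_n^Q, c_n^S) uniquely, so the 8 input bits still map to 256 distinct codewords while occupying only 4 chips.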
III. DESIGN OPTIMIZATION BASED ON MAXIMUM MINIMUM EUCLIDEAN DISTANCE

Substantially different QAM-CCK performances are observed with various separations of c^Q and c^S during simulations. In addition, Euclidean distance is an important metric for enhancing receiver performance, as demonstrated in Sections I and II-C. Thus, we introduce a criterion based on Euclidean distance to optimize the design of QAM-CCK modulation.

Let x = [x_1, ..., x_4] and x̂ = [x̂_1, ..., x̂_4] denote two QAM-CCK codewords modulated from any pair of bit sequences with Hamming distance k, k ≠ 0. The Euclidean distance E_d between x and x̂ is defined as

$$E_d = \sqrt{\sum_{n=1}^{4} |x_n - \hat{x}_n|^2}. \tag{8}$$

For simplicity, we focus on E_d² hereafter.

In order to minimize the probability of detection error, we should recognize that the probability of error is typically dominated by the worst codeword pair. For example, the transmitted codeword x will be mistakenly detected as x̂ if |x̂ − y|² ≤ |x − y|². Such an error may happen if the Euclidean distance E_d is too small to separate the two codewords accurately in the presence of noise. In other words, codewords separated by the minimum distance tend to inflict the worst damage on the detection error probability. Thus, we should maximize the Minimum Euclidean Distance (MED) to decrease the code error probability.

Furthermore, the corresponding number of bit errors will be k if the detector makes a wrong decision on two codewords with Hamming distance k. The error probability P{x → x̂} that a transmitted codeword x is mis-detected as x̂ is called the Pair-wise Error Probability (PEP). The upper bound of the average bit error probability can be expressed as (9) by using the PEP [12]

$$P_e \le \sum_{\hat{x}, x \in C} U(x, \hat{x})\, P\{x\}\, P\{x \to \hat{x}\}, \tag{9}$$

where U(x, x̂) represents the Hamming distance between two codewords and P{x} is the probability of transmitting codeword x.

Under AWGN channels with variance σ², the upper bound of P_e is simplified as

$$\begin{aligned}
P_e &\le \sum_{\hat{x}, x \in C} U(x, \hat{x})\, P\{x\} \exp\left( -\frac{\sum_{n=1}^{4} |x_n - \hat{x}_n|^2}{4\sigma^2} \right) \\
&= P_x \sum_{\hat{x}, x \in C} U(x, \hat{x}) \exp\left( -\frac{E_d^2}{4\sigma^2} \right).
\end{aligned} \tag{10}$$

In (10), the probability P{x} for all codewords reduces to the same P_x under equal-probability transmission. Thus, for the same P_x and noise variance σ², the error probability P_e is simultaneously affected by the Hamming distance U(x, x̂) and the Euclidean distance E_d. In order to lower the overall bit error rate, the Hamming distance should be smaller between codewords having shorter MED, whereas the Hamming distance could be larger between codewords with larger MED. Consequently, we shall apply a weighted minimum Euclidean distance metric for selections of c^Q and c^S to minimize P_e. Assume d_k^MED represents the minimum Euclidean distance between two codewords with Hamming distance k. Then the proposed weighted MED is defined as

$$D_w = \frac{1}{\sum_{k=1}^{8} k} \sum_{k=1}^{8} k \cdot d_k^{MED}. \tag{11}$$

The maximum value of k is set to 8 since the length of the bit sequence in one QAM-CCK codeword is 8.

Let an index set be represented as U = {1, 2, 3, 4, 5, 6, 7, 8}. Then, the indices of the selected CCK symbols c^Q and c^S can be denoted as I = {i_1, i_2, i_3, i_4} and P = {p_1, p_2, p_3, p_4}, respectively. Note that I ∪ P = U and I ∩ P = ∅. Let F(·) denote a nonlinear function. Because d_k^MED is determined by I and P, D_w ∝ F(I, P). In order to indicate the selection criterion, the relationship between D_w and the Es/N0 needed for BER ≤ 10^−5 is depicted in Fig. 2. It can be seen that the larger D_w is, the smaller the Es/N0 required for achieving a BER of ≤ 10^−5, representing better bit error performance under the same noise level. This relationship remains the same for larger values of BER.

Fig. 2. Relationship between D_w and Es/N0 for BER ≤ 10^−5.

Consequently, the optimized design of QAM-CCK should satisfy

$$(I, P)_{optimized} = \arg\max (D_w). \tag{12}$$

The selection procedure can be conducted through exhaustive search since the total number of selections is only 1680. The required computational load is not high under this circumstance. Although the noise variance might affect the optimized selection, our simulations indicate that the optimized selection remains generally the same for different values of σ².
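A minimal sketch of the weighted MED in (11) and the search in (12) is given below. It relies on assumptions that the letter does not spell out: the same placeholder dibit-to-phase table as in a plain 802.11b-style mapping, the squared distance E_d² used as d_k^MED, and the 1680 selections interpreted as ordered choices of I with P holding the remaining indices in ascending order. Indices are zero-based and the function names are hypothetical.

```python
import itertools
import numpy as np

G = np.array([[1, 1, 1, 1, 1, 1, 1, 1],
              [1, 0, 1, 0, 1, 0, 1, 0],
              [1, 1, 0, 0, 1, 1, 0, 0],
              [1, 1, 1, 1, 0, 0, 0, 0]]).T                 # 8 x 4 generating matrix
COVER = np.array([0, 0, 0, np.pi, 0, 0, np.pi, 0])
PHASES = np.array([0.0, np.pi / 2, np.pi, 3 * np.pi / 2])  # assumed map for dibits 00,01,10,11

BITS = np.array(list(itertools.product([0, 1], repeat=8)))      # 256 x 8 bit sequences
DIBITS = 2 * BITS[:, 0::2] + BITS[:, 1::2]                       # 256 x 4, values 0..3
CCK = np.exp(1j * (PHASES[DIBITS] @ G.T + COVER))                # 256 x 8 CCK chips
HAMMING = (BITS[:, None, :] != BITS[None, :, :]).sum(axis=2)     # 256 x 256 bit distances

def weighted_med(q_idx, s_idx):
    # Build the QAM-CCK codebook for this split, then evaluate D_w as in (11)
    x = 2 * np.sqrt(2) * CCK[:, list(q_idx)] + np.sqrt(2) * CCK[:, list(s_idx)]
    d2 = (np.abs(x[:, None, :] - x[None, :, :]) ** 2).sum(axis=2)  # squared E_d per pair
    ks = np.arange(1, 9)
    d_med = np.array([d2[HAMMING == k].min() for k in ks])         # d_k^MED for k = 1..8
    return float((ks * d_med).sum() / ks.sum())

def selections():
    # Assumed enumeration: ordered I (c^Q indices), remaining indices form P (c^S)
    for q_idx in itertools.permutations(range(8), 4):
        s_idx = tuple(i for i in range(8) if i not in q_idx)
        yield q_idx, s_idx

def best_selection(candidates):
    # Exhaustive search of (12) over the supplied candidates
    return max(candidates, key=lambda sel: weighted_med(*sel))

first = next(selections())
print(first, weighted_med(*first))
```

Passing `selections()` itself to `best_selection` runs the full 1680-candidate search; each candidate costs one 256 × 256 distance table, so the whole sweep stays small, in line with the remark on computational load above.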
Fig. 3. Experimental results: (a) EXIT chart analysis (I represents the mutual information); (b) BER after three iterations for SISO system under AWGN channels; (c) FER after three iterations for SISO system under AWGN channels.

IV. NUMERICAL RESULTS

In this section, we investigate the bit-error-rate (BER) and frame-error-rate (FER) performance of the proposed QAM-CCK under AWGN channels. We set the length of each data frame to 1024 and apply a rate 1/2 convolutional code for forward error correction (FEC). Our Monte Carlo simulation uses 10000 transmitted frames. For performance comparison, we provide the receiver performances of both QAM-CCK modulation and conventional QPSK modulation. We also apply the same FEC code in both systems.

We use the extrinsic information transfer (EXIT) chart [13] as shown in Fig. 3a to demonstrate the convergence of the proposed iterative receiver. The arrows in the figure indicate the direction of the exchange of extrinsic information between the demodulator and decoder. Although different values of Es/N0 may require different numbers of iterations to converge, it can be seen that the detector typically converges within 3 iterations with Es/N0 = 7 dB. Thus, we set the iteration number to 3 in the following BER and FER comparison.

Fig. 3b and Fig. 3c demonstrate the receiver output BER and FER performance after 3 iterations, respectively. In order to demonstrate the effectiveness of the proposed QAM-CCK design criterion in Eq. (12), we provide the performance of the best selected QAM-CCK along with a random mapping QAM-CCK as a benchmark. It is clear that the optimized QAM-CCK outperforms the random codeword design both in terms of BER and FER. Furthermore, the optimized QAM-CCK achieves approximately 4 dB gain at the 10^−5 level after the third iteration compared with plain QPSK modulation.

V. CONCLUSION

This letter proposes a novel QAM-CCK modulation scheme to achieve high bandwidth-efficiency by optimizing the codeword selection based on maximized weighted minimum inter-codeword Euclidean distance. The proposed QAM-CCK scheme inherits the spread spectrum coding gain of CCK while providing a higher bandwidth-efficiency. We also introduce a weighted minimum Euclidean distance based on both the Hamming distance of the data sequence and the Euclidean distances between codewords to optimize the bit error performance. We further present an iterative receiver based on soft-information exchange between QAM-CCK detection and decoding to enhance the detection accuracy. In order to understand the convergence property of the proposed iterative receiver, we use the EXIT chart to verify that our receiver is able to converge within 3 iterations. Furthermore, our simulation results demonstrate substantially improved BER and FER performance for the proposed QAM-CCK design when compared against traditional QPSK modulation or QAM-CCK without codeword optimization. Although this letter only focuses on optimized design under the AWGN channel, the proposed methodology can be similarly applied to fading channels using a different PEP upper bound.

REFERENCES

[1] C. Andren, “CCK modulation delivers 11Mbps for high rate IEEE 802.11 extension,” in Proc. Wireless Symp. Portable Design Conf. Spring, 1999.
[2] M. V. Clark, K. K. Leung, B. McNair, and Z. Kostic, “Outdoor IEEE 802.11 cellular networks: Radio link performance,” in Proc. IEEE Int. Conf. Commun., vol. 1, 2002, pp. 512–516.
[3] V. Arneson, K. Ovsthus, O. I. Bentstuen, and J. Sander, “Field trials with IEEE 802.11b-based UHF tactical wideband radio,” in Proc. IEEE Mil. Commun. Conf., vol. 1, 2005, pp. 493–498.
[4] L. Jing, H. Wang, C. He, and Z. Ding, “Spatial CCK modulation and iterative detection over frequency-selective fading channels,” IEEE Wireless Commun. Lett., vol. 6, no. 4, pp. 506–509, Aug. 2017.
[5] C. He, J. Huang, and Z. Ding, “A variable-rate spread-spectrum system for underwater acoustic communications,” IEEE J. Ocean. Eng., vol. 34, no. 4, pp. 624–633, Oct. 2009.
[6] J. G. Proakis, Digital Communications. New York, NY, USA: McGraw-Hill, 1995.
[7] T. K. Moon, Error Correction Coding: Mathematical Methods and Algorithms. New York, NY, USA: Wiley, 2005.
[8] C. Berrou, A. Glavieux, and P. Thitimajshima, “Near Shannon limit error-correcting coding and decoding: Turbo-codes,” in Proc. IEEE Int. Conf. Commun., vol. 2, May 1993, pp. 1064–1070.
[9] S. ten Brink, “Convergence behavior of iteratively decoded parallel concatenated codes,” IEEE Trans. Commun., vol. 49, no. 10, pp. 1727–1737, Oct. 2001.
[10] C. Douillard, M. Jezequel, C. Berrou, A. Picart, and P. Didier, “Iterative correction of intersymbol interference: Turbo-equalization,” in Proc. Eur. Trans. Telecommun., vol. 6, no. 5, pp. 507–511, 1995.
[11] IEEE Standard for Information Technology—Telecommunications and Information Exchange Between Systems—Local and Metropolitan Networks—Specific Requirements—Part 11: Wireless LAN Medium Access Control (MAC) and Physical Layer (PHY) Specifications: Higher Speed Physical Layer (PHY) Extension in the 2.4 GHz Band, IEEE Standard 802.11b-1999, pp. 11–30, Jan. 2000.
[12] D. Divsalar and M. K. Simon, “The design of trellis coded MPSK for fading channels: Performance criteria,” IEEE Trans. Commun., vol. 36, no. 9, pp. 1004–1012, Sep. 1988.
[13] M. Tüchler, R. Koetter, and A. C. Singer, “Turbo equalization: Principles and new results,” IEEE Trans. Commun., vol. 50, no. 5, pp. 754–767, May 2002.
Sequential 0/1 for Cooperative Spectrum Sensing in the

Presence of Strategic Byzantine Attack
Jun Wu, Yue Yu, Student Member, IEEE, Tiecheng Song, and Jing Hu, Member, IEEE
Abstract—Cooperative spectrum sensing (CSS) is regarded as a promising approach to identifying the available spectrum. However, it not only consumes substantial communication resources, but is also highly vulnerable to Byzantine attack. In this letter, starting with the sequential voting rule, we propose a low-complexity sequential 0/1 (S0/1) scheme for CSS in the presence of strategic Byzantine attack, which requires neither strong assumptions nor any prior knowledge, and does not depend on careful threshold selection. Compared with existing approaches, simulation results show that S0/1 reduces the sample size while achieving a higher correct sensing ratio, even in the blind scenario.

Index Terms—Cooperative spectrum sensing, Byzantine attack, sample size, correct sensing ratio.

Manuscript received September 3, 2018; accepted October 15, 2018. Date of publication October 23, 2018; date of current version April 9, 2019. This work was supported in part by the National Natural Science Foundation of China under Grant 61771126, and in part by the Key Research and Development Plan of Jiangsu Province under Grant BE2018108. The associate editor coordinating the review of this paper and approving it for publication was W. Hamouda. (Corresponding author: Jun Wu.)
The authors are with the National Mobile Commutation Research Laboratory, Southeast University, Nanjing 211189, China (e-mail: wojames2011@163.com; yuyue@seu.edu.cn; songtc@seu.edu.cn; louy@seu.edu.cn).
Digital Object Identifier 10.1109/LWC.2018.2877665

I. INTRODUCTION

COGNITIVE radio (CR), a collection of intelligent methods designed for allowing secondary users (SUs) to opportunistically access spectral bands unused by the primary user (PU), has been considered as a solution to improve spectrum utilization. Cooperative spectrum sensing (CSS) is a key technology of CR systems that enhances the PU detection performance by exploiting the SUs' spatial diversity.

However, the process of CSS requires substantial communication resources for reporting sensing results, particularly in a large cognitive radio network (CRN), which limits or even compromises the available cooperative gain [1], [2]. On this account, the authors of [2]–[4] studied the performance of hard-decision CSS with sequential reporting based on K-out-of-N, which needs fewer samples for decision-making and thus avoids consuming too much communication resource, but none of them considers Byzantine attack, which significantly degrades CSS performance. To protect CSS from Byzantine attack, Chen et al. [5], [6] combined node reputation with the sequential probability ratio test (SPRT) and depended on a careful threshold to identify attackers. Furthermore, [7]–[10] moved forward with the weighted sequential probability ratio test (WSPRT) against Byzantine attack. All of them reduce the sample size to some extent while greatly increasing the computational complexity. Unfortunately, they rely on strong assumptions (such as there being only a few attackers, which act according to a simplified strategy) and require a priori information about the PU. More importantly, these studies cannot effectively address sophisticated Byzantine attack while also substantially reducing the sample size.

Motivated by these issues, we construct a strategic Byzantine attack model by observing malicious behaviors. Based on the sequential voting rule (SVR), we propose a simple yet efficient sequential 0/1 (S0/1) scheme against sophisticated attackers, which requires neither strong assumptions nor any prior knowledge, and does not depend on careful threshold selection.

Fig. 1. Byzantine attack model.

II. SYSTEM MODEL

A. Spectrum Sensing

Consider a network of $N$ SUs cooperatively detecting the PU signal through local spectrum sensing, where a fraction $\rho$ (the attack ratio) of the $N$ collaborating SUs are attackers. In general, local spectrum sensing can be formulated as a binary hypothesis test. Let $H_0$ and $H_1$ be the two hypotheses, with $P(H_0)$ and $P(H_1)$ denoting the associated a priori probabilities. The local sensing performance, i.e., the false alarm and miss detection probabilities, is assumed to be the same for every SU irrespective of whether it is reliable or malicious, and is denoted by $P_f$ and $P_m$, respectively. Accordingly, the detection probability is $P_d = 1 - P_m$.

After each SU independently performs local spectrum sensing, the fusion center (FC) is responsible for the global decision: it collects the sensing results from the cooperating SUs and aggregates them via a fusion rule. Multiple SUs usually access the control channel in time division multiple access mode, so their sensing results need to be collected one by one rather than in parallel [4]. The communication channels between the SUs and the FC are assumed to be error-free in this letter.

B. Byzantine Sophistication

In the process of CSS, attackers intentionally falsify their own sensing results before submitting them to the FC in an attempt to mislead the global decision regarding the presence of the PU, thereby undermining the premise of CR technology. Previous malicious detection and suppression algorithms focus only on the simple "always attack" strategy (i.e., attackers always report false information to the FC), since the "always attack" strategy can be easily identified. In fact, attackers may act in a strategic manner, such as misleading the network occasionally but behaving correctly during the rest of the time, as depicted
WU et al.: SEQUENTIAL 0/1 FOR CSS IN PRESENCE OF STRATEGIC BYZANTINE ATTACK 501
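$$Q_{f,\mathrm{cvr}} = \sum_{i=K}^{N} \left( \binom{N}{i} \sum_{k_m=0}^{\rho N} \left( \binom{\rho N}{k_m} P_{fa}^{k_m} (1-P_{fa})^{\rho N-k_m} \sum_{k_n=K-k_m}^{N-\rho N} \left( \binom{N-\rho N}{k_n} P_f^{k_n} (1-P_f)^{N-\rho N-k_n} \right) \right) \right) \tag{4}$$

$$Q_{m,\mathrm{cvr}} = 1 - \sum_{i=K}^{N} \left( \binom{N}{i} \sum_{k_m=0}^{\rho N} \left( \binom{\rho N}{k_m} P_{da}^{k_m} (1-P_{da})^{\rho N-k_m} \sum_{k_n=K-k_m}^{N-\rho N} \left( \binom{N-\rho N}{k_n} P_d^{k_n} (1-P_d)^{N-\rho N-k_n} \right) \right) \right) \tag{5}$$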
in Fig. 1, a strategic Byzantine attack model is described as In CVR, N report results are required to submit to the FC.
follows Such a fixed sample size is inefficient to proceed with fus-
ing sensing information. For this reason, we first consider a
P (r = 1|s = 0) = α sequential approach using CVR to reduce the sample size in
(1)
P (r = 0|s = 1) = β the following section.
where s is the sensing result and r is the report result, α is the
III. S EQUENTIAL 0/1 FOR C OOPERATIVE
false alarm attack probability, the false alarm attack implies
S PECTRUM S ENSING
that the attacker has detected the absence of PU but submits
the presence decision 1 to the FC, its aim is to prevent reliable A. Sequential Voting Rule
SUs from using the idle channel. β is the miss detection attack Encouraged by sequential decision rule, we start with
probability. The miss detection attack implies that the attacker an objective to develop SVR. In SVR, SUs’ report results
has detected the PU’s presence but submits the absence deci- sequentially arrive at the FC until the decision condition is
sion 0, thereby alluring them to access the channels in use satisfied. That is to say, the FC receives K report result 1s or
and causing excessive interference to the PU. In the proposed N−K+1 report result 0s, the rest of report results do not need
attack model, the attacker has a certain probability, varying to be submitted. Obviously, the decision condition of SVR
from 0 to 1, to conduct various attack strategies. Thus, the is consistent with CVR, therefore, the cooperative sensing
false alarm and miss detection probabilities of the attacker performance of SVR is also identical to CVR.
can be represented as The next thing comes into consideration is the sample size.
Assume that P1 and P̂1 denote the probability of an individual
Pfa = P (s = 0|H0 )P (r = 1|s = 0) report result 1 for the reliable SU and the attacker, respectively,
+ P (s = 1|H0 )P (r = 1|s = 1) thus the average number of samples to satisfy the voting rule
= (1 − Pf )α + Pf (1 − β) (2) can be obtained by
Pma = P (s = 0|H1 )P (r = 0|s = 0) N

+ P (s = 1|H1 )P (r = 0|s = 1) ψ(N , K , P1 , P̂1 ) = ϕ(i , K , P1 , P̂1 )
i=1
= Pm (1 − α) + (1 − Pm )β (3)
+ ϕ(i , N − K + 1, 1 − P1 , 1 − P̂1 )
Besides, the detection probability of the attacker can be (6)
computed by Pda = 1 − Pma . Through above attack model, n−1
ρN ρN km N −ρN
if possible, attackers would want to make the FC completely where ϕ(n, t, p1 , p̂1 ) = t−1 km =0 km P̂1 ) t−km
t−km
ρN −km ρN −km km )(
n−t−(ρN −km )

unable to decide on a particular decision and the performance ·P1 (
km
(1 − P̂ 1 ) kn =0
km =0
at the FC can be no better than just a random guess of the state n−t−(ρN −k m)
k

of channel. In other words, the report results received by the kn (1 − P1 ) n denotes the probability of
FC are completely independent of the hypothesis test. That is, Negative Binomial (NB) distribution.1 When a global deci-
if the attack ratio and a pairwise of attack probabilities satisfy sion is made in SVR, there must be either K report result 1s
ρ ≥ 1/(α + β), which can make the FC blind. More details or N−K+1 report result 0s received at the FC, thus
of the blind scenario are given in [11]. N

ϕ(i , K , P1 , P̂1 )
C. Conventional Voting Rule i=1
Considering that voting rule (a.k.a. K-out-of-N) can be real- + ψ(i , N − K + 1, 1 − P1 , 1 − P̂1 ) = 1 (7)
ized in the low complexity without any prior knowledge on the LHS in (7) represents the sum of Qm,cvr and Qd,cvr , and
PU signal, it is utilized as a fusion rule in this letter. In con-
ventional voting rule (CVR), more than K in N report results certainly equals to one as RHS. Therefore, ψ(N , K , P1 , P̂1 ) ≤
are declared as the PU’s presence, then the FC broadcasts the N , which implies a smaller average number of samples
channel is busy, otherwise the channel is idle. Through sum- required for CSS with SVR.
ming the possibility of report results of satisfying K-out-of-N According to the Bayes theorem, the average number of
rule, when there exist ρN attackers in the network and ρN ≤ samples required at the FC regarding the primary signal can
K, the false alarm and miss detection probabilities of CVR can 1 ϕ(n, t, p , p̂ ) represents the probability of n trials that occur for a given
1 1
be obtained as (4) and (5) shown at the top of this page. As number of t successes, p1 and p̂1 refers to the “success” probability of two
for the case of ρN > K, similar results are also obtained. participants, respectively
be given by

$$\bar{N}_{svr} = \psi(N, K, P_d, P_{da} | H_1) P(H_1) + \psi(N, K, P_f, P_{fa} | H_0) P(H_0) \tag{8}$$

where $\psi(N, K, P_d, P_{da} | H_1)$ and $\psi(N, K, P_f, P_{fa} | H_0)$ can be calculated by (6).

Overall, the main advantage of SVR is that it requires, on average, fewer samples to achieve the same performance as a fixed sample size test.

B. Sequential 0/1

Though SVR reduces the sample size without performance loss, it fails to consider the malicious presence. Under the assumption of a simple attack strategy, [5]–[10] have come up with the sequential hypothesis test to defend against Byzantine attack, but its advantage is attained at the expense of additional computation and prior knowledge of the primary signal. Such limitations motivate a sequential approach that does not require any prior information to reduce the sample size and computational complexity. Therefore, we introduce S0/1 using SVR to counteract strategic Byzantine attacks.

In S0, only the report result 0 is allowed to be submitted to the FC, whereas the report result 1 is not. To be specific, the report result 0s among the $N$ report results are collected sequentially. If the cumulative number of 0s satisfies the decision condition, i.e., the number of 0s is $N-K+1$, the remaining report result 0s are not required and the global decision is 0. Otherwise, if the FC has received $N$ report results that still cannot satisfy the decision condition, the global decision is automatically declared as 1. In contrast with S0, S1 only allows the SU to submit the report result 1. If the cumulative number of report result 1s exceeds $K$, the remaining 1s are not required and the global decision is 1. Otherwise, if the cumulative number of 1s still cannot exceed $K$ after $N$ report results arrive at the FC, the global decision is announced as 0.

From an implementation point of view, although the PU has no obligation to provide any channel information to the SUs, the network administrator basically knows the business circumstances of the network coverage area, for example, that the primary network is busy during a certain period of time and idle during another period. Before proceeding with S0/1, the network administrator can observe each SU's performance over a sensing observation. Then, in the real operation of S0/1, the network administrator will not tell SUs with poor performance to choose S0/1 unless they return to normal performance in the sequential spectrum sensing. Additionally, cooperative spectrum sensing can be verified by an additional monitoring process. For a centralized CRN, the monitoring process is easy to perform by the FC itself. If the global decision is reached after $X$ reports ($X < N$), the global decision will be sent along with the stop-reporting message to all SUs.

Our proposed S0/1 not only has a natural advantage in defending against Byzantine attack, but also reduces the sample size. In detail, if the report result 1 occurs during S0, this implies that there must exist an attacker which has falsified the original report result 0. In S1, if the report result 0 appears, it represents the presence of an attacker falsifying the original report result 1. Consequently, the attacker identification policy is easily implemented and does not depend on careful threshold selection. This feature enables S0/1 to cope with various Byzantine attacks, but without any alteration of the decision condition. Therefore, it is not surprising that S0/1 performs as well as CVR does in the absence of attackers.

As for the sample size, the FC only requires $N-K+1$ report result 0s in support of the global decision 0 in S0, or $K$ report result 1s in support of the global decision 1 in S1. Hence, according to (6), the average number of samples of S0 can be computed by

$$\bar{N}_{s0} = \psi(N-K+1, N-K+1, 1-P_d, 1-P_{da} | H_1) P(H_1) + \psi(N-K+1, N-K+1, 1-P_f, 1-P_{fa} | H_0) P(H_0) \tag{9}$$

For S1, the average number of samples is easily obtained as follows

$$\bar{N}_{s1} = \psi(K, K, P_d, P_{da} | H_1) P(H_1) + \psi(K, K, P_f, P_{fa} | H_0) P(H_0). \tag{10}$$

IV. SIMULATION RESULTS

In this section, we simulate 1000 sensing intervals to corroborate the proposed S0/1. The related parameters are set as follows: $N = 100$, $P(H_0) = 0.8$ and $P(H_1) = 0.2$, $P_f = P_m = 0.1$. In SPRT-based approaches, the tolerated false alarm and miss detection probabilities are $10^{-3}$ and $10^{-4}$ [8], [9], respectively. Considering the blind and non-blind scenarios, we simulate the correct sensing ratio (the ratio of correctly detecting the PU's presence or absence during a certain sensing interval) and the sample size in the context of a strategic attack $(\rho, \alpha, \beta)$ such as (0.5, 0.5, 0.5) and (0.8, 0.8, 0.8).

In Fig. 2, as anticipated, SVR and CVR have the same correct sensing ratio, which nevertheless starts to deteriorate owing to the malicious presence, particularly in the blind scenario. The difference, however, is that SVR requires fewer samples than CVR. To be specific, the sample size of SVR increases at first and decreases at last, but the high correct sensing ratio cannot be guaranteed. In contrast, the correct sensing ratio of S0/1 is always consistent with that of CVR in the absence of attackers, and is not affected by the blind problem. In addition, as $K$ increases, the sample size of S0 decreases while that of S1 increases, as depicted in Fig. 3.

Fig. 2. Correct sensing ratio for various voting rules.

One interesting effect is that the sample size of S0 is greater than that of S1 for $K < 75$. This is because the low spectrum underutilization makes S1 more advantageous when $P(H_0) =$
0.8. $K$ is usually used as an optimization variable to obtain the optimal performance. It can be observed that for $K > 75$, $K$ report result 1s are required for S1 to make the global decision, while $N-K+1$ report result 0s are required for S0. For instance, in S0/1, a small or large value of $K$ (such as $K < 20$ or $K > 80$) results in a low correct sensing ratio, so the optimal $K$ should be between 20 and 80. Simulation results provide a criterion for choosing the optimal $K$ to counteract Byzantine attack and reduce the sample size.

Fig. 3. Sample size for various voting rules.

Through the above analyses, it is known that the choice of S0 or S1 depends on whether the PU is active or inactive for a certain period of time. Given reliable performance, a smaller $K$ gives a lower sample size in S1, while a larger $K$ leads to a smaller sample size in S0. Based on this criterion, Figs. 4 and 5 respectively compare the correct sensing ratio and sample size of S0/1 against SPRT, WSPRT, and RWSPRT across a wide range of attack ratios when $\alpha = \beta = 1$. Further detail on SPRT, WSPRT, and RWSPRT can be found in [8] and [9].

Fig. 4. Correct sensing ratio versus attack ratio.

In Fig. 4, the correct sensing ratios of the other three schemes decline to varying degrees as the attack ratio increases. Specifically, the correct sensing ratios of SPRT and WSPRT are approximately 0.5 after the attack ratio exceeds 50%, which confirms that Byzantine attack makes the FC blind once $\rho \ge 1/(\alpha + \beta)$. As the attack ratio further increases, the correct sensing ratio of SPRT gradually approaches 0 and that of WSPRT converges to $\alpha P(H_0) + \beta P(H_1) = 0.2$. Although RWSPRT is better than SPRT and WSPRT, it also begins to decline when $\rho = 70\%$. In contrast, the attack ratio has no effect on the correct sensing ratio of S0/1, which benefits from the anti-attack property of S0/1.

Fig. 5. Sample size versus attack ratio.

After the selection of S0/1, only a low sample size is required with the optimal $K$. In contrast, the other three approaches always require more samples than S0/1 and still cannot guarantee the performance. Undoubtedly, the malicious presence increases the uncertainty of the sequential test, which in turn demands more samples. In particular, for a fixed local sensing performance, the average number of samples of S0/1 is constant and only depends on $N$ and $K$, which agrees well with the simulation result in Fig. 5.

V. CONCLUSION

In this letter, we analyze Byzantine attack behavior in the process of CSS and develop a sophisticated attack model. Subsequently, a simple yet effective S0/1 based on SVR is presented to counteract strategic Byzantine attack. Simulation results show that, in contrast to existing approaches, the proposed S0/1 not only strongly reduces the sample size but also provides a higher correct sensing ratio, even in the blind scenario.

REFERENCES

[1] M. S. Khan, M. Usman, V.-V. Hiep, and I. Koo, "Efficient selection of users' pair in cognitive radio network to maximize throughput using simultaneous transmit-sense approach," IEICE Trans. Commun., vol. 100, no. 2, pp. 380–389, 2017.
[2] H. Luan, O. Li, and X. Zhang, "Cooperative spectrum sensing with energy-efficient sequential decision fusion rule," in Proc. IEEE 23rd Wireless Opt. Commun. Conf. (WOCC), Newark, NJ, USA, 2014, pp. 1–4.
[3] A. A. Alkheir and H. T. Mouftah, "Sequential hard-decision fusion for agile cooperative spectrum sensing," in Proc. IEEE Int. Conf. Commun. Workshop (ICCW), London, U.K., 2015, pp. 1014–1019.
[4] S. Peng, W. Zheng, R. Gao, and K. Lei, "Fast cooperative energy detection under accuracy constraints in cognitive radio networks," Wireless Commun. Mobile Comput., vol. 2017, pp. 1–8, Sep. 2017.
[5] R. Chen, J.-M. J. Park, and K. Bian, "Robustness against Byzantine failures in distributed spectrum sensing," Comput. Commun., vol. 35, no. 17, pp. 2115–2124, 2012.
[6] C.-Y. Chen, Y.-H. Chou, H.-C. Chao, and C.-H. Lo, "Secure centralized spectrum sensing for cognitive radio networks," Wireless Netw., vol. 18, no. 6, pp. 667–677, 2012.
[7] A. A. Sharifi and J. M. Niya, "Securing collaborative spectrum sensing against malicious attackers in cognitive radio networks," Wireless Pers. Commun., vol. 90, no. 1, pp. 75–91, 2016.
[8] J. Wu et al., "Robust cooperative spectrum sensing against probabilistic SSDF attack in cognitive radio networks," in Proc. IEEE 86th Veh. Technol. Conf. (VTC-Fall), Toronto, ON, Canada, 2017, pp. 1–6.
[9] J. Wu, T. Song, Y. Yu, C. Wang, and J. Hu, "Sequential cooperative spectrum sensing in the presence of dynamic Byzantine attack for mobile networks," PLoS ONE, vol. 13, no. 7, 2018, Art. no. e0199546.
[10] Z. Hu, Y. Bai, L. Cao, M. Huang, and M. Xie, "A sequential compressed spectrum sensing algorithm against SSDH attack in cognitive radio networks," J. Elect. Comput. Eng., vol. 2018, no. 8, pp. 1–9, 2018.
[11] B. Kailkhura, Y. S. Han, S. Brahma, and P. K. Varshney, "Distributed Bayesian detection in the presence of Byzantine data," IEEE Trans. Signal Process., vol. 63, no. 19, pp. 5250–5263, Oct. 2015.
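
As a recap of the mechanics used throughout the letter above, the attacker model in (2) and (3) and the SVR stopping condition (stop at $K$ report result 1s or $N-K+1$ report result 0s) can be sketched in Python. This is only an illustrative sketch under stated assumptions: the function names (`attacker_probs`, `svr_decide`, `simulate_reports`) and the way reports are drawn and shuffled are hypothetical choices and are not taken from the letter.

```python
import random


def attacker_probs(p_f, p_m, alpha, beta):
    """Effective false-alarm / miss-detection probabilities of an attacker.

    alpha = P(r=1 | s=0) and beta = P(r=0 | s=1), as in (1).
    """
    p_fa = (1 - p_f) * alpha + p_f * (1 - beta)   # equation (2)
    p_ma = p_m * (1 - alpha) + (1 - p_m) * beta   # equation (3)
    return p_fa, p_ma


def svr_decide(reports, k):
    """Sequential voting rule over a list of 0/1 reports.

    Stops as soon as K ones or N-K+1 zeros have arrived and returns
    (global_decision, number_of_reports_used).
    """
    n = len(reports)
    ones = zeros = 0
    for used, r in enumerate(reports, start=1):
        ones += r
        zeros += 1 - r
        if ones >= k:
            return 1, used
        if zeros >= n - k + 1:
            return 0, used
    raise ValueError("reports must be non-empty")


def simulate_reports(n, rho, p_one_reliable, p_one_attacker, rng):
    """Draw N reports, round(rho*N) of them from attackers (assumed i.i.d.)."""
    n_att = round(rho * n)
    reports = [int(rng.random() < p_one_attacker) for _ in range(n_att)]
    reports += [int(rng.random() < p_one_reliable) for _ in range(n - n_att)]
    rng.shuffle(reports)  # hypothetical: arrival order is random
    return reports


if __name__ == "__main__":
    rng = random.Random(0)
    p_f = p_m = 0.1
    p_fa, p_ma = attacker_probs(p_f, p_m, alpha=0.5, beta=0.5)
    # Under H0 a report is 1 with probability P_f (reliable) or P_fa (attacker).
    used = [svr_decide(simulate_reports(100, 0.5, p_f, p_fa, rng), 50)[1]
            for _ in range(1000)]
    print("average SVR sample size under H0:", sum(used) / len(used))
```

Because the loop always terminates by the $N$-th report when $1 \le K \le N$, the number of reports used never exceeds $N$, which mirrors the bound $\psi(N, K, P_1, \hat{P}_1) \le N$ stated in Section III-A.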
Spherical Wave Positioning Based on Curvature of Arrival by an Antenna Array

Siwei Zhang, Member, IEEE, Thomas Jost, Member, IEEE, Robert Pöhlmann, Member, IEEE, Armin Dammann, Member, IEEE, Dmitriy Shutin, Member, IEEE, and Peter Adam Hoeher, Fellow, IEEE
Abstract—Array processing is a key technology for emerging mobile networks, especially in short to moderate range and line-of-sight scenarios. In these scenarios, the incoming wavefront can be modeled by a spherical wave. The wavefront curvature, i.e., curvature of arrival (CoA), contains position information of the transmitter and is observable by an antenna array potentially asynchronous and non-coherent to the transmitter. We derive a simplified expression of the spherical wave positioning (SWP) Cramér-Rao bound for arbitrary centro-symmetric arrays, which provides a geometrical inference about the achievable performance. Additionally, a low complexity CoA positioning algorithm is proposed. In contrast to conventional methods, the proposed algorithm requires neither multiple anchors nor coordination between devices. It also outperforms the Fresnel approximation based SWP algorithms by overcoming the model mismatch. Therefore, the proposed CoA positioning algorithm is promising for precise positioning in future mobile networks.

Index Terms—Joint DoA/distance estimation, 5G positioning, near field, spherical wavefront, antenna array, Cramér-Rao bound.

Manuscript received August 31, 2018; revised October 11, 2018; accepted October 20, 2018. Date of publication October 25, 2018; date of current version April 9, 2019. This work was supported by the German Research Foundation (DFG) under Contract FI 2176/1-1 and Contract HO 2226/17-1. The associate editor coordinating the review of this paper and approving it for publication was Y. Shen. (Corresponding author: Siwei Zhang.)
S. Zhang, T. Jost, R. Pöhlmann, A. Dammann, and D. Shutin are with the German Aerospace Center, Institute of Communications and Navigation, 82234 Wessling, Germany (e-mail: siwei.zhang@dlr.de).
P. A. Hoeher is with the Chair of Information and Coding Theory, University of Kiel, 24143 Kiel, Germany (e-mail: ph@tf.uni-kiel.de).
Digital Object Identifier 10.1109/LWC.2018.2877971
1 The coordinates of a specific point $P$ in a coordinate system $C^{(\xi\psi)}$ are defined as $\mathbf{p}^{(\xi\psi)} = [\xi, \psi]^T$, where $\xi$ and $\psi$ are the two dimensions of that coordinate system. The subscript is omitted for generic points $P$.

I. INTRODUCTION

UBIQUITOUS realtime position information is envisaged as a key feature of future mobile networks, for example the 5th generation (5G) networks, due to emerging device-centric applications [1]. Studies have been conducted on 5G positioning, where 5G networks provide opportunities for precise positioning in global navigation satellite system (GNSS)-impaired environments [2]. Most traditional positioning techniques exploit either time of arrival (ToA) or direction of arrival (DoA) estimates and demand multiple anchors [3]. Recent research has focused on positioning with a single anchor. The simultaneous localization and mapping (SLAM) algorithm in [4] utilizes multipath components for positioning. However, it requires memory-intensive storage and a static environment. In combined DoA and ToA estimation, an antenna array is used to position a transmitter. With the far-field assumption, distance information is obtained solely from the propagation delay of the baseband pilot signal, whereas the DoA is estimated from the carrier phase differences between antennas [5], [6]. For ToA estimation, the pilot signal structure must be known to the receiver. Additionally, synchronization between the transmitter and the receiver must be ensured.

Large antenna arrays are widely foreseen for 5G applications, varying from dozens of antennas at mobile stations [5] to over one hundred at local access points [7] and up to a few thousand at base stations [1]. These arrays provide communication coverage, from their radiating near field (a few meters) to the beginning of the Fraunhofer region (up to a hundred meters), to mobile devices under line-of-sight (LoS) condition. The signal wavefront received by the array is modeled by a spherical wave. Under this model, not only DoA but also distance information of the transmitter is contained in the carrier phase, which enables SWP of the transmitter [7]–[10].

Most previous works apply the Fresnel approximation to arrays with special geometries, e.g., uniform linear arrays (ULAs) [8]–[10], and introduce a model mismatch. This mismatch has recently been noticed to jeopardize the achievable positioning precision [11]. In [12] a lookup table is used for ULA model correction. The maximum likelihood (ML) algorithm in [7] exploits the exact model, but includes a computationally expensive recursion. In [13] the Cramér-Rao bound (CRB) of SWP is analyzed for special array geometries.

In this letter, a simplified expression of the CRB of SWP is derived for arbitrary centro-symmetric arrays (CSAs), which only depends on the relative geometry and characteristics of the antennas' spatial distribution. This expression provides a geometric inference about the achievable performance and brings further insights into array design. Additionally, we propose a method dubbed CoA positioning, which extracts the transmitter position information directly from the wavefront curvature. CoA positioning overcomes the model mismatch introduced by the Fresnel approximation, while maintaining low complexity for realtime operation.

II. POSITION INFORMATION IN SPHERICAL WAVE

A single transmitter antenna is placed at point $P_s$, which radiates a single-carrier signal at carrier frequency $f_c$ with an unknown real-valued amplitude $S$ and an unknown phase $\phi_\delta$. The signal propagates under LoS condition to a generic point $P$ at distance $d$ with speed of light $c_0$. The signal phase at $P$, $\phi = \phi_\delta - \omega_c d / c_0$, is a continuous function in space, where $\omega_c = 2\pi f_c$. The transmitter's position information w.r.t. an observation point $P_o$ can be extracted from the continuous wavefield. We define a two-dimensional (2D) Cartesian coordinate system¹ $C^{(xy)}$ that originates at point $P_o$
ZHANG et al.: SWP BASED ON CoA BY ANTENNA ARRAY 505
and includes point $P_s$. The corresponding polar coordinate system is defined as $C^{(d\theta)}$. We consider the 2D positioning problem, i.e., estimating $\mathbf{p}_s^{(d\theta)} = [d_s, \theta_s]^T$, where $d_s$ is the distance between points $P_s$ and $P_o$, and $\theta_s$ is the DoA w.r.t. the positive x-axis of $C^{(xy)}$. For the rest of this letter, we use $C^{(d\theta)}$ as the default coordinate system and omit the superscript $(d\theta)$ for simplicity. The transmission phase $\phi_\delta$ and signal amplitude $\alpha_s$ after propagation are not of interest, but need to be estimated jointly with $\mathbf{p}_s$. The total parameter vector to be estimated is $\boldsymbol{\chi} = [\mathbf{p}_s^T, \phi_\delta, \alpha_s]^T$. The spherical wave intersects the xy-plane with co-phase circles centered at $P_s$. The curvature of these circles contains information about the distance to the transmitter. We apply a coordinate transformation $C^{(xy)} \xrightarrow{P_o, \theta_s} C^{(uv)}$, where the new Cartesian coordinate system $C^{(uv)}$ originates at $P_o$ and the u-axis is aligned with $\theta_s$.

Definition 1 (Signal CoA): The signal CoA $\kappa_o$ at point $P_o$ is defined as the extrinsic curvature of $-\phi c_0 / \omega_c$ along the v-axis of $C^{(uv)}$. With the spherical wave model, CoA is proportional to the absolute value of the phase's second-order derivative and equals the reciprocal of $d_s$

$$\kappa_o \triangleq -\frac{c_0}{\omega_c} \left. \frac{\partial^2 \phi}{\partial v^2} \right|_{P_o} = \frac{1}{d_s}. \tag{1}$$

The distance information can be extracted from the second-order derivative of the signal phase. The DoA needs to be estimated prior to coordinate transformation. In practice, a 2D antenna array on the xy-plane, potentially asynchronous and non-coherent to the transmitter, with $L$ elements centered at $P_o$ is used to sample the continuous wavefield at discrete spatial points. An element $l$, $l = 1, \ldots, L$, located at point $P_l$, receives the baseband signal²

$$r_l(t) = \alpha_s e^{j\phi_\delta} e^{-j\omega_c d_{sl}/c_0} + n_l(t), \tag{2}$$

where $n_l(t) \sim \mathcal{CN}(0, \sigma^2)$ is an i.i.d. circularly-symmetric complex normally distributed noise process. According to the geometry under investigation, $d_{sl}$ can be expressed as

$$d_{sl} = \sqrt{d_s^2 + d_l^2 - 2 d_s d_l \cos(\theta_l - \theta_s)}. \tag{3}$$

The received samples $r_l$ are acquired at an arbitrary time point, coherently at all elements, with a received sample phase $\phi_l$. The concept of the SWP with CoA is illustrated in Fig. 1.

Fig. 1. Spherical wave positioning based on CoA.

2 We assume the array aperture to be much smaller than $d_s$. Therefore, the distance-related attenuation differences among elements are negligible.

III. FUNDAMENTAL LIMITS OF SWP

The Fisher information matrix (FIM) of $\boldsymbol{\chi}$ can be calculated from the given model (2) and (3), similarly as in [13]. The covariance of the transmitter position estimate $\mathrm{COV}[\hat{\mathbf{p}}_s]$ is bounded by the CRB, $\mathrm{CRB}[\mathbf{p}_s]$, which is obtained by applying the Schur complement [6] to the sub-matrix of the FIM corresponding to the position. Assuming free-space pathloss, i.e., $\alpha_s = S c_0 / (2 \omega_c d_s)$, the positioning CRB states

$$\mathrm{COV}[\hat{\mathbf{p}}_s] \succeq \mathrm{CRB}[\mathbf{p}_s] = \frac{2\sigma^2 d_s^2}{S^2} \times \left( \sum_{l=1}^{L} \frac{\partial d_{sl}}{\partial \mathbf{p}_s} \frac{\partial d_{sl}}{\partial \mathbf{p}_s^T} - \frac{1}{L} \sum_{l=1}^{L} \sum_{m=1}^{L} \frac{\partial d_{sl}}{\partial \mathbf{p}_s} \frac{\partial d_{sm}}{\partial \mathbf{p}_s^T} \right)^{-1}. \tag{4}$$

To infer the geometry impacts on SWP, we first investigate the symmetric linear array (SLA) case. An SLA is deployed along the x-axis, with $L$ elements and an aperture length $A$. We define the $k$-th moment of the normalized antennas' spatial distribution $M_k = \sum_{l=1}^{L} (d_l / A)^k / L$, and the effective aperture length $\tilde{A} = A \sin\theta_s$, to characterize the array geometry.

Theorem 1 (CRB of SWP for SLA): For the SLA, assuming $L \gg 1$ and $d_s \gg A$, the CRB of the DoA estimate can be approximated by

$$\mathrm{CRB}[\theta_s] \approx \frac{2\sigma^2 d_s^2}{S^2} \frac{1}{L \tilde{A}^2 M_2}, \tag{5}$$

whereas the distance CRB is approximated by

$$\mathrm{CRB}[d_s] \approx \frac{2\sigma^2 d_s^2}{S^2} \frac{4 d_s^4}{L \tilde{A}^4 (M_4 - M_2^2)}. \tag{6}$$

Proof: See the Appendix.

The array's aperture is often physically constrained by applications. For a fixed $A$, more elements can be deployed for higher carrier frequencies, without resulting in severe antenna mutual coupling. Both CRBs in (5) and (6) decrease linearly with the number of elements $L$, which shows a benefit of higher $f_c$, such as foreseen in 5G. The term $2\sigma^2 d_s^2 / S^2$ shows the effect of the signal-to-noise ratio (SNR). The CRB for DoA decreases quadratically with $\tilde{A}$ and linearly with $M_2$, the antennas' spatial spread. The distance CRB experiences a quartic growth with the ratio $d_s / \tilde{A}$, indicating a strong impact from the relative geometry. Additionally, it decreases with $M_4 - M_2^2$, i.e., the shape of the antennas' spatial distribution. More importantly, when $\theta_s = 0^\circ$, both CRBs approach infinity. Hence the array's aperture expanded in the u direction contains no information about the transmitter's position. With the last observation, we extend Theorem 1 to arbitrary 2D CSAs. Many typical arrays are centro-symmetric, e.g., uniform circular/linear arrays, the ones in [13], as well as the uniform rectangular array (URA) illustrated in Fig. 1.

Corollary 1: A CSA centered at $P_o$ can be projected on the v-axis, forming a virtual SLA. The positioning CRB can be approximated by applying Theorem 1 to the virtual SLA.

Proof: Since the aperture expanded in the u direction does not contain position information, the projected virtual linear array along the v-axis is equivalent to the original CSA in the sense of SWP. By the definition of centro-symmetry, for any non-centered element $l$ with position $\mathbf{p}_l = [d_l, \theta_l]^T$, there exists an element $m$ with position $\mathbf{p}_m = [d_l, \theta_l + \pi]^T$. Elements $l$ and $m$ are projected on the v-axis at $\pm d_l \sin(\theta_l - \theta_s)$ respectively and are symmetric w.r.t. $P_o$. Hence the projected array is an SLA, which meets the condition of Theorem 1.
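
To make the scaling behavior of Theorem 1 concrete, the following Python sketch evaluates the approximations (5) and (6) for a uniformly spaced SLA. It is only an illustration under stated assumptions: the function names (`moments`, `crb_sla`, `uniform_sla`), the uniform element placement, and the numeric values are hypothetical and not taken from the letter, and the element positions $d_l$ are taken as signed offsets from the array center (the sign is irrelevant for the even moments $M_2$ and $M_4$).

```python
import math


def moments(positions, aperture, k):
    """k-th moment M_k of the normalized element positions d_l / A."""
    return sum((d / aperture) ** k for d in positions) / len(positions)


def uniform_sla(num_elements, aperture):
    """Signed element offsets of a uniformly spaced, centered linear array."""
    step = aperture / (num_elements - 1)
    return [-aperture / 2 + i * step for i in range(num_elements)]


def crb_sla(positions, aperture, d_s, theta_s, sigma2, amp):
    """Approximate CRBs of Theorem 1, valid for L >> 1 and d_s >> A.

    Returns (CRB[theta_s], CRB[d_s]) following (5) and (6).
    """
    num = len(positions)
    m2 = moments(positions, aperture, 2)
    m4 = moments(positions, aperture, 4)
    a_eff = aperture * math.sin(theta_s)          # effective aperture
    snr_term = 2.0 * sigma2 * d_s ** 2 / amp ** 2  # common SNR factor
    crb_theta = snr_term / (num * a_eff ** 2 * m2)
    crb_dist = snr_term * 4.0 * d_s ** 4 / (num * a_eff ** 4 * (m4 - m2 ** 2))
    return crb_theta, crb_dist


if __name__ == "__main__":
    # Hypothetical setup: 21 elements, broadside transmitter at 20 m.
    small = crb_sla(uniform_sla(21, 1.5), 1.5, 20.0, math.pi / 2, 1.0, 1.0)
    large = crb_sla(uniform_sla(21, 3.0), 3.0, 20.0, math.pi / 2, 1.0, 1.0)
    print("DoA CRB ratio (A doubled):     ", small[0] / large[0])
    print("distance CRB ratio (A doubled):", small[1] / large[1])
```

Doubling the aperture while keeping the normalized element distribution fixed leaves $M_2$ and $M_4$ unchanged, so the sketch reproduces the quadratic dependence of (5) and the quartic dependence of (6) on $\tilde{A}$. For $\theta_s = 0$ the effective aperture vanishes and the expressions divide by zero, matching the observation that both bounds diverge in that case.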
IV. LOW COMPLEXITY COA POSITIONING ALGORITHM

We propose a low complexity CoA positioning algorithm, which avoids recursions and reduces the model error from the Fresnel approximation. It can be applied either directly as a realtime positioning variant or to initialize a recursive algorithm like an ML estimator [7].

We define tiles $T_i$ composed of at least three adjacent antenna elements and centered at points $P_i$. The estimated local DoA $\hat{\theta}_{si}$ can be calculated by traditional far-field DoA estimation methods [14], applying the plane wave model on all applicable antenna pairs $l, m$ in $T_i$

$$\phi_{lm} \approx -\mathbf{e}_{si}^T \mathbf{p}_{lm} \omega_c / c_0, \quad \forall d_{lm} \omega_c / c_0 < \pi/2, \tag{7}$$

where $\mathbf{p}_{lm} = \mathbf{p}_l^{(xy)} - \mathbf{p}_m^{(xy)}$, $\phi_{lm} = \phi_l - \phi_m$ and $\mathbf{e}_{si} = [\cos\theta_{si}, \sin\theta_{si}]^T$. To estimate the tile's curvature $\kappa_i$, a coordinate system $C^{(u_i v_i)}$ local to $T_i$ is defined as $C^{(xy)} \xrightarrow{P_i, \theta_{si}} C^{(u_i v_i)}$. The second order derivative of phase local to $T_i$ can be approximated by a double difference with three adjacent elements $l$, $m$ and $n$, which leads to a curvature estimate as

$$\tilde{\kappa}_{lmn} = 2 \frac{\Delta_{lm} - \Delta_{mn}}{v_{lm} + v_{mn}}, \quad \text{where } \Delta_{gh} = \frac{\phi_{gh} c_0 / \omega_c + u_{gh}}{v_{gh}},$$

$$\begin{bmatrix} u_{gh} \\ v_{gh} \end{bmatrix} = \begin{bmatrix} \hat{\mathbf{e}}_{si}^T \\ \hat{\mathbf{e}}_{si,\perp}^T \end{bmatrix} \mathbf{p}_{gh} \quad \text{and} \quad \hat{\mathbf{e}}_{si,\perp} = \begin{bmatrix} \sin\hat{\theta}_{si} \\ -\cos\hat{\theta}_{si} \end{bmatrix}.$$

The coarse estimate of the tile's curvature $\tilde{\kappa}_i$ is obtained by
averaging $\tilde{\kappa}_{lmn}$ over all the effective combinations of $l$, $m$ and $n$, i.e., $\forall\, l, m, n$, where $|v_{lm}|$, $|v_{mn}|$ and $|v_{lm} + v_{mn}| \gg 0$. The curvature estimated from a single tile is heavily distorted by noise. To get a stable estimate, an extra smoothing step is applied, exploiting the geometry equality

$$\mathbf{p}_s^{(xy)} = \kappa_i^{-1} \mathbf{e}_{si} + \mathbf{p}_i^{(xy)} = \frac{\sum_{\forall T_i} \left( \mathbf{e}_{si} + \kappa_i \mathbf{p}_i^{(xy)} \right)}{\sum_{\forall T_i} \kappa_i}. \tag{8}$$

The tile's curvature estimate can be refined as

$$\hat{\kappa}_i = \left\| \left( \sum_{\forall T_j} \tilde{\kappa}_j \right)^{-1} \sum_{\forall T_j} \left( \hat{\mathbf{e}}_{sj} + \tilde{\kappa}_j \mathbf{p}_j^{(xy)} \right) - \mathbf{p}_i^{(xy)} \right\|^{-1}. \tag{9}$$

Finally, the transmitter's position can be estimated by replacing $\theta_{si}$ and $\kappa_i$ in (8) with their estimates $\hat{\theta}_{si}$ and $\hat{\kappa}_i$.

Fig. 2. A URA along x-axis with $A_x = 1.5$ m, $A_y = 0.3$ m, $f_c = 1$ GHz and spacing of $\lambda/4$.

first decrease due to a decreasing model error and then increase because of the reducing SNR and worsening geometry. For small $d_s$, the model error varies with $\theta_s$, but only leads to small estimation errors. By applying an ML estimator in addition, the RMSEs approach the CRBs. When both $d_s$ and $\theta_s$ are small, a slight Taylor approximation error from (10) is observed for the distance CRB. At most of the evaluation points, approximated and exact CRBs coincide, which verifies Corollary 1.

Next we compare the CoA positioning to traditional low complexity SWP algorithms with the ULAs, since most algorithms apply the Fresnel approximation on ULAs [8], [10]. To remove the outliers occurring at small $\theta_s$, the RMSEs are calculated only for DoAs ranging between $30^\circ$ and $90^\circ$. Fig. 3
V. N UMERICAL R ESULTS shows the performance of CoA positioning, the Fresnel based
We simulate SWP with three different arrays orientated approach in [10], the CoA initialized ML estimator and the
along the x-axis: a URA with aperture size in each dimen- CRBs for different ds . The Fresnel based approach estimates
sion Ax = 1.5 m, Ay = 0.3 m and two ULAs with aperture the DoA with the plane wave model, like the traditional far-
lengths 0.3 m and 1.5 m. All arrays have λ/4 antenna spacing. field DoA estimation. The Fresnel based approach has a larger
The tiles are constructed by 3 × 3 elements for the URA and model error for larger arrays. In contrast, the CoA positioning
1 × 3 for the ULAs. A single antenna transmitter is deployed only experiences model mismatch within individual tiles, inde-
at distances from 1 m to 100 m and with DoAs from 0◦ to pendently of the total aperture. Therefore, the CoA positioning
90◦ , transmitting a single-carrier signal with 10 dBm trans- outperforms the Fresnel based approach for shorter distances
mit power at 1 GHz carrier frequency. Free-space pathloss and larger arrays. At larger distances, the model error is no
and a noise variance of −123.2 dBm are assumed. For each longer dominant and all algorithms perform similarly along
parameter set 100 Monte Carlo runs have been conducted. the CRBs. For the 0.3 m ULA at small θs and ds > 50 m,
First the performance of the URA is assessed. Fig. 2 shows the ratio ds /Ã is so small, that none of the three algorithms is
the root mean square errors (RMSEs) of CoA positioning and a able to effectively estimate the distance. As a final result, for a
CoA initialized ML estimator, as well as the exact and approxi- 1.5 m sized array, the distance estimate can be achieved with
mated CRBs. With increasing ds , the CoA positioning RMSEs a sub-meter RMSE by the CoA positioning up to 50 m, which
ZHANG et al.: SWP BASED ON CoA BY ANTENNA ARRAY 507
Fig. 3. Two ULAs along x-axis with A = 0.3 m and 1.5 m, $f_c = 1$ GHz and spacing of λ/4. (a) DoA estimation error. (b) Distance estimation error.

meets the accuracy expectation of 5G suggested in [2]. The CoA initialized ML estimator extends the applicable distance to 100 m.

VI. CONCLUSION

We proposed a CoA positioning algorithm, where the transmitter position is directly estimated from the wavefront curvature. Compared to conventional methods, CoA positioning does not require multiple anchors, synchronization, or coordination between communication entities. Simplified CRBs show that for an arbitrary CSA, the achievable SWP performance only depends on the relative geometry and moments of the antennas' spatial distribution. Numerical results show that the low complexity CoA positioning is effective for the considered applications and outperforms the Fresnel approximation based algorithms by overcoming the model mismatch. Hence, the proposed CoA positioning is suitable for realtime transmitter positioning in 5G.

APPENDIX

We apply the second-order Taylor expansion to $d_{sl}$ at $d_l = 0$

$$d_{sl} \approx d_s - d_l \cos(\theta_l - \theta_s) + \frac{1}{2} \sin^2(\theta_l - \theta_s)\, d_l^2 / d_s \tag{10}$$

and define $a_l \triangleq 1 - \dfrac{d_l^2 \sin^2\theta_s}{2 d_s^2}$ and $b_l \triangleq \dfrac{d_l^2 \sin\theta_s \cos\theta_s}{d_s}$. By exploiting the symmetry of SLAs, we can write

$$\sum_{l=1}^{L} \sum_{m=1}^{L} \frac{\partial d_{sl}}{\partial \mathbf{p}_s} \frac{\partial d_{sm}}{\partial \mathbf{p}_s^T} \approx \begin{pmatrix} \left( \sum_{l=1}^{L} a_l \right)^2 & \sum_{l=1}^{L} \sum_{m=1}^{L} a_l b_m \\ \sum_{l=1}^{L} \sum_{m=1}^{L} a_l b_m & \left( \sum_{l=1}^{L} b_l \right)^2 \end{pmatrix}$$

and

$$\sum_{l=1}^{L} \frac{\partial d_{sl}}{\partial \mathbf{p}_s} \frac{\partial d_{sl}}{\partial \mathbf{p}_s^T} \approx \sum_{l=1}^{L} \begin{pmatrix} a_l^2 & a_l b_l \\ a_l b_l & d_l^2 \sin^2\theta_s + b_l^2 \end{pmatrix}.$$

The CRB of $\mathbf{p}_s$ can be derived as

$$\mathrm{CRB}[\mathbf{p}_s] \approx \frac{2\sigma^2 d_s^2}{S^2 L \tilde{A}^4 (M_4 - M_2^2)} \times \begin{pmatrix} d_s^{-4}/4 & \cot\theta_s\, d_s^{-3}/2 \\ \cot\theta_s\, d_s^{-3}/2 & \dfrac{\tilde{A}^{-2} M_2}{M_4 - M_2^2} + \cot^2\theta_s\, d_s^{-2} \end{pmatrix}^{-1}. \tag{11}$$

The CRB of DoA in (5) can be directly obtained by taking the second diagonal entry of (11). The distance CRB is derived by taking the first diagonal entry of (11)

$$\mathrm{CRB}[d_s] \approx \frac{8\sigma^2 d_s^6}{S^2 L \tilde{A}^4 (M_4 - M_2^2)} \left( 1 + \frac{(M_4 - M_2^2) \cot^2\theta_s}{(d_s/\tilde{A})^2 M_2} \right). \tag{12}$$

Equation (6) is obtained from (12) with the assumption $d_s \gg \tilde{A}$, which completes the proof.

REFERENCES

[1] F. Boccardi, R. W. Heath, A. Lozano, T. L. Marzetta, and P. Popovski, “Five disruptive technology directions for 5G,” IEEE Commun. Mag., vol. 52, no. 2, pp. 74–80, Feb. 2014.
[2] J. del Peral-Rosado, R. Raulefs, J. López-Salcedo, and G. Seco-Granados, “Survey of cellular mobile radio localization methods: From 1G to 5G,” IEEE Commun. Surveys Tuts., vol. 20, no. 2, pp. 1124–1148, 2nd Quart., 2018.
[3] M. Z. Win et al., “Network localization and navigation via cooperation,” IEEE Commun. Mag., vol. 49, no. 5, pp. 56–62, May 2011.
[4] C. Gentner et al., “Multipath assisted positioning with simultaneous localization and mapping,” IEEE Trans. Wireless Commun., vol. 15, no. 9, pp. 6104–6117, Sep. 2016.
[5] A. Shahmansoori, G. E. Garcia, G. Destino, G. Seco-Granados, and H. Wymeersch, “Position and orientation estimation through millimeter-wave MIMO in 5G systems,” IEEE Trans. Wireless Commun., vol. 17, no. 3, pp. 1822–1835, Mar. 2018.
[6] Y. Han, Y. Shen, X.-P. Zhang, M. Z. Win, and H. Meng, “Performance limits and geometric properties of array localization,” IEEE Trans. Inf. Theory, vol. 62, no. 2, pp. 1054–1075, Feb. 2016.
[7] X. Yin, S. Wang, N. Zhang, and B. Ai, “Scatterer localization using large-scale antenna arrays based on a spherical wave-front parametric model,” IEEE Trans. Wireless Commun., vol. 16, no. 10, pp. 6543–6556, Oct. 2017.
[8] E. Grosicki, K. Abed-Meraim, and Y. Hua, “A weighted linear prediction method for near-field source localization,” IEEE Trans. Signal Process., vol. 53, no. 10, pp. 3651–3660, Oct. 2005.
[9] J. Chen, X. Zhu, and X. Zhang, “A new algorithm for joint range-DOA-frequency estimation of near-field sources,” EURASIP J. Adv. Signal Process., vol. 2004, no. 3, Mar. 2004, Art. no. 105173.
[10] K. Deng, Q. Yin, and H. Wang, “Closed form parameters estimation for near field sources,” in Proc. IEEE Int. Symp. Circuits Syst., May 2007, pp. 3251–3254.
[11] Y.-S. Hsu, K. T. Wong, and L. Yeh, “Mismatch of near-field bearing-range spatial geometry in source-localization by a uniform linear array,” IEEE Trans. Antennas Propag., vol. 59, no. 10, pp. 3658–3667, Oct. 2011.
[12] P. R. Singh, Y. Wang, and P. Chargé, “Performance enhancement of approximated model based near-field sources localisation techniques,” IET Signal Process., vol. 11, no. 7, pp. 825–829, Sep. 2017.
[13] J.-P. Delmas, M. N. El Korso, H. Gazzah, and M. Castella, “CRB analysis of planar antenna arrays for optimizing near-field source localization,” Signal Process., vol. 127, pp. 117–134, Oct. 2016.
[14] T. E. Tuncer and B. Friedlander, Classical and Modern Direction-of-Arrival Estimation. New York, NY, USA: Academic, 2009.

Molecular Communication: The First Arrival Position Channel

Nilay Pandey, Student Member, IEEE, Ranjan K. Mallik, Fellow, IEEE, and Brejesh Lall, Member, IEEE

Abstract—In this letter, we consider a molecular communication channel where the fluid medium has a constant drift from the transmitter toward the receiver and the information to be transmitted is the release position of the information molecules. We use the Green's function formulation and the method of images to derive the first arrival position density of the molecule at the receiver plane. We then analyze this first arrival position channel in terms of symbol error probability and capacity in the information theoretic sense.

Index Terms—Brownian motion, convection-diffusion, first arrival position (FAP), mutual information, symbol error probability (SEP).

Manuscript received September 25, 2018; accepted October 19, 2018. Date of publication October 26, 2018; date of current version April 9, 2019. This work was supported by the Visvesvaraya Ph.D. Scheme of Ministry of Electronics and Information Technology, Government of India, being implemented by Digital India Corporation. The associate editor coordinating the review of this paper and approving it for publication was C.-B. Chae. (Corresponding author: Nilay Pandey.)
N. Pandey is with the Bharti School of Telecommunication Technology and Management, Indian Institute of Technology–Delhi, Hauz Khas, New Delhi 110016, India (e-mail: nilaypandey03@gmail.com).
R. K. Mallik and B. Lall are with the Department of Electrical Engineering, Indian Institute of Technology-Delhi, Hauz Khas, New Delhi 110016, India (e-mail: rkmallik@ee.iitd.ernet.in; brejesh@ee.iitd.ernet.in).
Digital Object Identifier 10.1109/LWC.2018.2878206

I. INTRODUCTION

NANOMACHINES are devices having size in the range of 0.1 to 10 μm and consisting of components of size less than 100 μm along at least one dimension. Electromagnetic communication at such a small scale is challenging. For such devices, molecular communication (MC) has emerged as a new field of communication research where messages are conveyed through exchange of molecules between the transmitter and the receiver [1]–[3].

To transmit the information using signaling molecules, we can modulate certain properties of the transmitted molecules, such as concentration [3], type [4], number [5], or time of release [6]. In the case of molecular timing channels, the first arrival time follows an inverse Gaussian distribution if there is a net drift from the transmitter to the receiver and a Lévy distribution is obtained when there is no drift [7]. In the case of propagation without drift, the first passage time can be very large. To overcome this issue, particles having finite lifetime have been used [8]. The performance of such timing channels can be further improved by introducing diversity in the channel by the use of multiple signaling molecules [9] or transmitters [10].

For a diffusion-based MC in 2 or 3-dimensional space, apart from the first arrival time, there is another important piece of information that can be used to design an MC system: the first arrival position (FAP). This information can also be used along with the first arrival time to introduce diversity in the design of the MC channel. In this letter, we use the Green's function formulation and the method of images to derive the first arrival position density of the molecule at the receiver plane. Using this FAP density, we analyze the FAP channel in terms of the symbol error probability (SEP) and the capacity in the information theoretic sense.

II. SYSTEM MODEL

A. System Model

We consider an MC system consisting of a transmitter-receiver pair, information carrying molecules, and a molecular channel filled with fluid having a constant drift velocity v. For the sake of simplicity, we consider a 2-dimensional system described by the spatial dimensions x and y, but the generalization to the third dimension is straightforward. The MC system is confined in the x ≥ 0 half space by a perfectly absorbing boundary (receiver) situated at x = 0, forming a semi-infinite slab. The transmitter can release a single molecule or multiple molecules simultaneously into the fluid medium from multiple locations along the line x = d, where d is the distance between the transmitter and the receiver planes. The receiver is perfectly absorbing and is assumed to have the ability to correctly measure the time and position of arrival of each molecule. In our MC channel, the release position of the molecules is modulated to convey the information to the receiver. The transmitted molecules follow independent and identically distributed (iid) paths to the receiver. The system model is depicted in Fig. 1.

Fig. 1. Representation of the considered system model. For an M-ary scheme, the transmitter uses M transmission nodes located at positions (d, (n − 1)a), where 1 ≤ n ≤ M, and a is the separation between the nodes. A particular node is used for transmitting a particular symbol, such that the release position of the information molecule is being modulated to convey the information. A constant drift is present in the fluid medium from the transmitter to the receiver.

B. Convection-Diffusion Equation

The transport of molecules through diffusion in the presence of a drift in the fluid medium is described by the convection-diffusion equation

$$\partial_t c(\mathbf{r}, \mathbf{r}_0, t) + \mathbf{v}(\mathbf{r}, t) \cdot \nabla c(\mathbf{r}, \mathbf{r}_0, t) = D \nabla^2 c(\mathbf{r}, \mathbf{r}_0, t), \tag{1}$$

where $c(\mathbf{r}, \mathbf{r}_0, t)$ is the local concentration of the transported and diffusing quantity (the concentration field) at position $\mathbf{r}$ and time t, $\mathbf{r}_0$ is the point where the diffusion starts, $\mathbf{v}$ is the velocity field of the fluid medium which is assumed to be incompressible, $\nabla$ and $\nabla^2$ are the gradient and the Laplace operators, respectively, and D is the diffusion coefficient.
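At the level of a single molecule, (1) with a constant drift corresponds to a Brownian motion with drift. The following is a minimal sketch of that picture along the drift axis only, assuming free space without the absorbing boundary. The function name `simulate_displacements` and the parameter values are illustrative choices, not taken from the letter. In free space the displacement after time t should have mean vt and variance 2Dt, which the sample statistics can be compared against.

```python
import math
import random


def simulate_displacements(v, D, t_end, n_steps, n_particles, seed=0):
    """Euler-Maruyama walk for dx = v dt + sqrt(2 D) dW along the drift axis.

    Free space is ASSUMED: no absorbing receiver is modelled in this sketch.
    """
    rng = random.Random(seed)
    dt = t_end / n_steps
    sigma = math.sqrt(2.0 * D * dt)   # standard deviation of one diffusive step
    finals = []
    for _ in range(n_particles):
        x = 0.0
        for _ in range(n_steps):
            x += v * dt + rng.gauss(0.0, sigma)
        finals.append(x)
    return finals


# Hypothetical values (micrometres and seconds).
xs = simulate_displacements(v=5.0, D=50.0, t_end=1.0, n_steps=20, n_particles=5000)
mean = sum(xs) / len(xs)
var = sum((x - mean) ** 2 for x in xs) / (len(xs) - 1)
print(mean, var)   # sample mean and variance, to compare with v*t = 5 and 2*D*t = 100
```

The displacement along the y-axis could be generated the same way with v set to zero, since the drift is assumed to act along x only.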
PANDEY et al.: MC: FAP CHANNEL 509
The value of D is determined by the temperature, the fluid viscosity, and the molecule's Stokes radius [3].

The solution of the convection-diffusion equation in (1) is given in terms of the diffusion propagator $G(\mathbf{r}, \mathbf{r}_0, t)$ (also known as heat kernel or Green's function). For an absorbing boundary condition, $G(\mathbf{r}, \mathbf{r}_0, t)$ satisfies the initial condition $G(\mathbf{r}, \mathbf{r}_0, t)|_{t=0} = \delta(\mathbf{r} - \mathbf{r}_0)$, where $\delta(\cdot)$ is the Dirac delta function, stating that $\mathbf{r}_0$ is the starting point of the Brownian motion; it also satisfies the Dirichlet boundary condition, $G(\mathbf{r}, \mathbf{r}_0, t) = 0$ at $\mathbf{r} \in S$, where S represents the set of points forming the receiver surface.

The velocity field is assumed to remain constant near the boundary, such that the fluid velocity does not vanish near the receiver. This leads to an outflow boundary condition. Here, the receiver acts like an imaginary surface where the concentration becomes zero, without affecting the flow.

III. CHANNEL CHARACTERIZATION

For a 2-dimensional system, having a constant drift velocity v from the transmitter to the receiver along the x-axis, (1) can be rewritten as

$$\frac{\partial c(x, y, t)}{\partial t} + v \frac{\partial c(x, y, t)}{\partial x} = D \left( \frac{\partial^2 c(x, y, t)}{\partial x^2} + \frac{\partial^2 c(x, y, t)}{\partial y^2} \right), \tag{2}$$

where $c(x, y, 0) = \delta(x - x_0)\delta(y - y_0)$ and $c(x, y, t) = 0$ for $(x, y) \in S$.

Using the separation of variables, the concentration field $c(x, y, t)$ can be written as

$$c(x, y, t) = c_1(x, t) \cdot c_2(y, t). \tag{3}$$

Using (3), we get

$$\frac{\partial^2 c}{\partial x^2} = c_2 \frac{\partial^2 c_1}{\partial x^2}, \quad \frac{\partial^2 c}{\partial y^2} = c_1 \frac{\partial^2 c_2}{\partial y^2}, \tag{4}$$

and

$$\frac{\partial c}{\partial t} = c_1 \frac{\partial c_2}{\partial t} + c_2 \frac{\partial c_1}{\partial t}. \tag{5}$$

Using (4) and (5) in (2) and rearranging the terms gives

$$c_1 \left( \frac{\partial c_2}{\partial t} - D \frac{\partial^2 c_2}{\partial y^2} \right) + c_2 \left( \frac{\partial c_1}{\partial t} + v \frac{\partial c_1}{\partial x} - D \frac{\partial^2 c_1}{\partial x^2} \right) = 0. \tag{6}$$

Since $c_1, c_2 \neq 0$, we have the relations

$$\frac{\partial c_1}{\partial t} + v \frac{\partial c_1}{\partial x} - D \frac{\partial^2 c_1}{\partial x^2} = 0, \quad \frac{\partial c_2}{\partial t} - D \frac{\partial^2 c_2}{\partial y^2} = 0. \tag{7}$$

Thus, $c_1(x, t)$ is the solution of the 1-dimensional convection-diffusion equation, and $c_2(y, t)$ is the solution of the 1-dimensional diffusion equation in (7). Using the Laplace transform method as in [11] for the initial condition $c(x, y, 0) = \delta(x - x_0)\delta(y - y_0)$, $c_1(x, t)$ and $c_2(y, t)$ are obtained as

$$c_1(x, t) = \frac{1}{\sqrt{4\pi D t}} \exp\left( -\frac{(x - x_0 - vt)^2}{4Dt} \right), \tag{8}$$

and

$$c_2(y, t) = \frac{1}{\sqrt{4\pi D t}} \exp\left( -\frac{(y - y_0)^2}{4Dt} \right). \tag{9}$$

Using (3), (8) and (9), the Green's function can be written as

$$G(x, y, x_0, y_0, t) = \frac{1}{4\pi D t} \exp\left( \frac{-(x - x_0 - vt)^2 - (y - y_0)^2}{4Dt} \right). \tag{10}$$

To implement the condition of an absorbing receiver at x = 0, we use the method of images to construct the Green's function for an absorbing wall. Using this method, many boundary conditions can be automatically satisfied, and it works particularly well for boundaries with flat surfaces [11]. In this method, it is assumed that for each particle released at $(x_0, y_0)$, an image source of negative mass is released at $(-x_0, y_0)$. These two particles diffuse freely, giving a zero resultant concentration at x = 0. The Green's function in the case of an absorbing wall is then

$$G_{ab}(x, y, x_0, y_0, t) = G(x, y, x_0, y_0, t) - a(x_0)\, G(x, y, -x_0, y_0, t). \tag{11}$$

The absorption condition at the receiver requires that $G_{ab}(0, y, x_0, y_0, t) = 0$, giving

$$a(x_0) = \frac{G(0, y, x_0, y_0, t)}{G(0, y, -x_0, y_0, t)} = \exp\left( -\frac{x_0 v}{D} \right), \tag{12}$$

which is independent of t and can be interpreted as the mass of the negative particle. Thus, the Green's function for the convection-diffusion equation in (2) for an absorbing receiver at x = 0 can be written as

$$G_{ab}(x, y, x_0, y_0, t) = \frac{1}{4\pi D t} \left[ \exp\left( -\frac{(x - x_0 - vt)^2 + (y - y_0)^2}{4Dt} \right) - \exp\left( -\frac{x_0 v}{D} - \frac{(x + x_0 - vt)^2 + (y - y_0)^2}{4Dt} \right) \right]. \tag{13}$$

Using (13), the diffusive flux at (0, y) is then

$$\begin{aligned} J(0, y, t) &= -D \left. \frac{\partial G_{ab}(x, y, x_0, y_0, t)}{\partial x} \right|_{x=0} \\ &= -\frac{1}{8\pi D} \left[ \frac{(x_0 + vt)}{t^2} \exp\left( -\frac{(x_0 + vt)^2 + (y - y_0)^2}{4Dt} \right) - \frac{(x_0 - vt)}{t^2} \exp\left( -\frac{x_0 v}{D} - \frac{(x_0 - vt)^2 + (y - y_0)^2}{4Dt} \right) \right] \\ &= -\frac{v}{4\pi D t} \exp\left( -\frac{(x_0 + vt)^2 + (y - y_0)^2}{4Dt} \right). \end{aligned} \tag{14}$$

The diffusive flux J acts as the joint probability density function (PDF) for the exit time and the exit position from the half plane. Thus, the arrival position density of the Brownian particle irrespective of the time (also known as the harmonic measure) can be calculated as the marginal density for position by integrating over time. Since the receiver surface is perfectly absorbing, this arrival position will always be the first arrival position. Let $X = (x_0, y_0)$ denote the point of release of the information molecule and $Y = (x, y)$ its arrival position at the receiver boundary. The conditional PDF of Y at a receiver located at the origin (x = 0) is given as

$$f_{Y|X}(y|x) = \left. \int_0^{\infty} J(x, y, t)\, dt \right|_{x=0}. \tag{15}$$
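The image-source construction in (10) to (12) can be checked numerically. The sketch below evaluates the free-space propagator of (10), subtracts the weighted image term of (11) with the weight a(x0) from (12), and confirms that the result vanishes on the wall x = 0. The function names `green_free` and `green_absorbing` and all parameter values are illustrative assumptions.

```python
import math


def green_free(x, y, x0, y0, t, v, D):
    """Free-space propagator of (10) for drift v along x and diffusion coefficient D."""
    exponent = -((x - x0 - v * t) ** 2 + (y - y0) ** 2) / (4.0 * D * t)
    return math.exp(exponent) / (4.0 * math.pi * D * t)


def green_absorbing(x, y, x0, y0, t, v, D):
    """Image-source construction of (11) with the image weight a(x0) from (12)."""
    a = math.exp(-x0 * v / D)
    return green_free(x, y, x0, y0, t, v, D) - a * green_free(x, y, -x0, y0, t, v, D)


# Hypothetical parameter values (micrometres and seconds).
v, D, x0, y0 = 5.0, 50.0, 10.0, 0.0
wall_values = []     # propagator evaluated on the absorbing wall x = 0
inside_values = []   # propagator evaluated at an interior point x = 2
for t in (0.1, 1.0, 5.0):
    for y in (-3.0, 0.0, 4.0):
        wall_values.append(green_absorbing(0.0, y, x0, y0, t, v, D))
        inside_values.append(green_absorbing(2.0, y, x0, y0, t, v, D))

print(max(abs(w) for w in wall_values))   # zero up to floating-point rounding
print(min(inside_values))                 # strictly positive inside the half space
```

Because a(x0) depends on neither y nor t, a single image weight cancels the propagator along the whole wall at all times, which is what makes the construction work with drift.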
Using (14) and (15), we can write

$$f_{Y|X}(y|x) = \int_0^{\infty} -\frac{v}{4\pi D t} \exp\left( -\frac{(x_0 + vt)^2 + (y - y_0)^2}{4Dt} \right) dt. \tag{16}$$

To solve the integral in (16), we rearrange the terms to get

$$f_{Y|X}(y|x) = \frac{-v}{4\pi D} \exp\left( -\frac{x_0 v}{2D} \right) \times \int_0^{\infty} \frac{1}{t} \exp\left( -\frac{x_0^2 + (y - y_0)^2}{4Dt} - \frac{v^2 t}{4D} \right) dt. \tag{17}$$

If $K_s(z)$ is the modified Bessel function of the second kind, then for positive constants a and b, we have the relation

$$\int_0^{\infty} t^{-s-1} \exp\left( -\left( \frac{a}{t} + bt \right) \right) dt = 2 \left( \frac{b}{a} \right)^{s/2} K_s\left( 2\sqrt{ab} \right), \tag{18}$$

where s and z are real and complex numbers, respectively. Using (17) and (18) and substituting $x_0 = d$, we get

$$f_{Y|X}(y|x) = \frac{-v}{4\pi D} \exp\left( \frac{-dv}{2D} \right) K_0\left( \frac{\sqrt{v^2 \left( d^2 + (y - y_0)^2 \right)}}{2D} \right). \tag{19}$$

From (19), we note that the conditional density depends only on $y - y_0$, hence the FAP channel is an additive channel.

Fig. 2. The FAP density. The number of molecules used in simulation = 100000. The individual arrival positions are grouped into 100 bins, and the height of each bin is normalised by the total number of molecules.

IV. PERFORMANCE ANALYSIS

In this section, we analyze the performance of the FAP channel discussed in the previous section in terms of the SEP and the channel capacity. Since the motivation behind this letter is to explore the first arrival position of an information molecule and derive its distribution, we have considered only a single particle channel to keep the analysis simple and provide a basis for further studies. Furthermore, we consider a time-slotted channel. To transmit a symbol, the transmitter releases a molecule into the fluid medium at the start of each time slot. We assume that the time slot is long enough to ensure that the molecule does arrive at the receiver so that the timing based errors are negligible.

A. SEP Analysis

Let the input and the output alphabets for an M-ary communication be $\mathcal{X} = \{\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_M\}$ and $\mathcal{Y} = \{\mathbf{y}_1, \mathbf{y}_2, \ldots, \mathbf{y}_M\}$, respectively, where $\mathbf{x}_i = (d, x_i)$ and $\mathbf{y}_i = (0, y_i)$. For simplicity, we assume that $0 < x_1 < x_2 < \ldots < x_M$. The SEP within a time slot is given as

$$P_e = \sum_{i=1}^{M} p_i \Pr\{\mathrm{err} \mid X = x_i\}, \tag{20}$$

where, for each slot, $p_i$ is the a priori probability of sending the symbol $x_i$ and $\Pr\{\mathrm{err} \mid X = x_i\}$ denotes the probability of error when symbol $x_i$ is transmitted and is given as

$$\Pr\{\mathrm{err} \mid X = x_i\} = \int_{-\infty}^{\eta_{i-1}} f_{Y|X}(y \mid X = x_i)\, dy + \int_{\eta_i}^{\infty} f_{Y|X}(y \mid X = x_i)\, dy. \tag{21}$$

Under the assumption $0 < x_1 < x_2 < \ldots < x_M$, the maximum likelihood (ML) decision rule reduces to the pairwise comparisons of the conditional PDFs for equal a priori probabilities, and thus, the decision threshold $\eta_i \in \{\eta_1, \eta_2, \ldots, \eta_{M-1}\}$ is simply

$$\eta_i = \frac{x_i + x_{i+1}}{2}, \tag{22}$$

with $\eta_0 = -\infty$ and $\eta_M = \infty$. Here, (22) follows from the fact that the PDF in (19) is uni-modal and, with the flow only along the x-axis, the average value of displacement of the Brownian particle along the y-axis is zero.

B. Capacity Analysis

Consider an M-ary modulation scheme with input and output alphabets given by $\mathcal{X}, \mathcal{Y} = \{1, 2, \ldots, M\}$. We denote the conditional probability $\Pr(Y = y \mid X = x)$ by $p(y|x)$. The mutual information between input X and output Y within a slot is

$$I(X; Y) = \sum_{y=1}^{M} \sum_{x=1}^{M} p_x\, p(y|x) \log_2 \frac{p(y|x)}{\sum_{x'=1}^{M} p_{x'}\, p(y|x')}, \tag{23}$$

where $\{p_x\}$ is the set of a priori probabilities and

$$p(y|x) = \int_{\eta_{y-1}}^{\eta_y} f_{Y|X}(y \mid X = x)\, dy, \quad x, y = 1, \ldots, M. \tag{24}$$

The channel capacity is given as

$$C = \max_{\{p_x\}} I(X; Y), \tag{25}$$

where the maximization is with respect to the set of a priori probabilities $\{p_x\}$.

We use the method of Lagrange multipliers to maximize the mutual information subject to the constraint $p_1 + p_2 + \cdots + p_M = 1$. The Lagrange function for our case can be written as

$$\Lambda(p_1, \ldots, p_M, \lambda) = I(X; Y) + \lambda (p_1 + p_2 + \cdots + p_M - 1), \tag{26}$$
Fig. 3. SEP vs diffusion coefficient D for v = 5 μm/s, 10 μm/s.

Fig. 4. Capacity vs diffusion coefficient D for v = 5 μm/s, 10 μm/s.

where λ is the Lagrange multiplier. Taking the partial derivatives of the Lagrange function with respect to $p_i$, $1 \le i \le M$, and equating them to zero, we get the set of equations

$$\sum_{y=1}^{M} \left[ \sum_{x=1}^{M} p(y|i) \log_2 \frac{p(y|x)}{\sum_{x'=1}^{M} p_{x'}\, p(y|x')} + \frac{p_x}{\ln 2}\, p(y|x) \left( \frac{p(y|i)}{\sum_{x'=1}^{M} p_{x'}\, p(y|x')} \right) \right] = -\lambda, \tag{27}$$

where $i = 1, \ldots, M$.

We solve the set of equations in (27) numerically to obtain the optimum $(\bar{p}_1, \bar{p}_2, \ldots, \bar{p}_M, \bar{\lambda})$. At this optimum, both the Lagrange multiplier as well as the mutual information (23) attain their respective maximum values. Using (23), the capacity per channel use can be obtained as

$$C = \sum_{i=1}^{M} \bar{p}_i \left[ \sum_{y=1}^{M} \sum_{x=1}^{M} -\frac{\bar{p}_x}{\ln 2}\, p(y|x) \left( \frac{p(y|i)}{\sum_{x'=1}^{M} \bar{p}_{x'}\, p(y|x')} \right) - \bar{\lambda} \right]. \tag{28}$$

V. NUMERICAL RESULTS

In this section, numerical results are obtained for typical parameter values for short-range molecular communication [1]. We consider the diffusion coefficient D ∈ [10, 100] μm²/s, drift velocity v = 5, 10 μm/s, and distance d = 10 μm. The different release points for the information molecules signifying different symbols of the M-ary scheme are given by $x_i = (d, i)$ for $i = 0, 1, \ldots, M$ (i.e., a = 1), with all the distances being in micrometers. The variation of SEP with diffusion coefficient D is shown in Fig. 3, for two values of the drift velocity v. At a constant v, the SEP increases with increase in D. This behavior results because the drift and the diffusion components start to interact for larger values of D. For a given value of D, the SEP decreases with increase of v, since for higher v, the drift component dominates the diffusion component. Furthermore, the SEP increases with increase of M.

The capacity decreases with increase in diffusion coefficient D, as illustrated in Fig. 4. The capacity increases with increase of v, as higher drift diminishes the impact of the randomness introduced due to D. Furthermore, the capacity for an M-ary scheme increases with increase of M. In both the figures, the change in SEP as well as the change in capacity is more pronounced for smaller values of D, and then the increase flattens out. Also, the SEP and the channel capacity have opposite behavior with respect to M in the case of an M-ary scheme, necessitating a proper trade-off depending upon the application.

VI. CONCLUSION

In this letter, we investigate the FAP of the information molecule for an MC channel. We derive a closed form expression for the arrival position density for the case where the fluid medium has a positive drift. We then consider an MC channel where the release position of the information molecules is modulated to convey information. We call this the FAP channel and analyze this MC channel in terms of SEP and capacity in the information theoretic sense.

REFERENCES

[1] T. Suda, M. Moore, T. Nakano, R. Egashira, and A. Enomoto, “Exploratory research on molecular communication between nanomachines,” in Proc. Genet. Evol. Comput. Conf. (GECCO), Washington, DC, USA, Jun. 2005, pp. 25–29.
[2] I. F. Akyildiz, F. Brunetti, and C. Blázquez, “Nanonetworks: A new communication paradigm,” Comput. Netw., vol. 52, no. 12, pp. 2260–2279, Aug. 2008.
[3] N. Farsad, H. B. Yilmaz, A. Eckford, C.-B. Chae, and W. Guo, “A comprehensive survey of recent advancements in molecular communication,” IEEE Commun. Surveys Tuts., vol. 18, no. 3, pp. 1887–1919, 3rd Quart., 2016.
[4] N.-R. Kim and C.-B. Chae, “Novel modulation techniques using isomers as messenger molecules for nano communication networks via diffusion,” IEEE J. Sel. Areas Commun., vol. 31, no. 12, pp. 847–856, Dec. 2013.
[5] M. S. Kuran, H. B. Yilmaz, T. Tugcu, and I. F. Akyildiz, “Modulation techniques for communication via diffusion in nanonetworks,” in Proc. IEEE Int. Conf. Commun., Kyoto, Japan, Jun. 2011, pp. 1–5.
[6] K. V. Srinivas, A. W. Eckford, and R. S. Adve, “Molecular communication in fluid media: The additive inverse Gaussian noise channel,” IEEE Trans. Inf. Theory, vol. 58, no. 7, pp. 4678–4692, Jul. 2012.
[7] N. Farsad, W. Guo, C.-B. Chae, and A. W. Eckford, “Stable distributions as noise models for molecular communication,” in Proc. IEEE Glob. Commun. Conf. (GLOBECOM), San Diego, CA, USA, Dec. 2015, pp. 1–6.
[8] N. Pandey, R. K. Mallik, and B. Lall, “Truncated Lévy statistics for diffusion based molecular communication,” in Proc. IEEE Glob. Commun. Conf. (GLOBECOM), Singapore, Dec. 2017, pp. 1–6.
[9] Y. Murin, M. Chowdhury, N. Farsad, and A. Goldsmith, “Diversity gain of one-shot communication over molecular timing channels,” in Proc. IEEE Glob. Commun. Conf. (GLOBECOM), Singapore, Dec. 2017, pp. 1–6.
[10] Y. Huang, M. Wen, L. Yang, C. B. Chae, and F. Ji. (2018). Spatial Modulation for Molecular Communication. [Online]. Available: https://arxiv.org/abs/1807.01468
[11] S. Redner, A Guide to First-Passage Processes. Cambridge, U.K.: Cambridge Univ. Press, 2001.

Spatial Correlations of a 3-D Non-Stationary MIMO Channel Model With 3-D Antenna Arrays and 3-D Arbitrary Trajectories

Qiuming Zhu, Member, IEEE, Ying Yang, Cheng-Xiang Wang, Fellow, IEEE, Yi Tan, Jian Sun, Member, IEEE, Xiaomin Chen, and Weizhi Zhong

Abstract—By considering the 3-D antenna arrays and 3-D arbitrary trajectories of a mobile station, a generic non-stationary geometry-based stochastic model for multiple-input multiple-output channels is proposed. Under 3-D non-isotropic von Mises-Fisher scattering scenarios, the theoretical and approximate expressions of time-variant spatial correlation function (SCF) are also derived and analyzed. Simulation results show that the SCFs of the proposed model match well with the ones of existing models for the special cases of 1-D linear and 2-D curve trajectories. In addition, the derived theoretical SCFs also have good agreements with simulated and measured results.

Index Terms—Non-stationary MIMO channels, geometry-based stochastic model (GBSM), spatial correlation (SC), von Mises-Fisher (VMF), arbitrary trajectories.

Manuscript received July 27, 2018; revised September 4, 2018 and October 13, 2018; accepted October 17, 2018. Date of publication October 26, 2018; date of current version April 9, 2019. This work was supported by EU H2020 ITN 5G Wireless Project under Grant 641985, in part by EU H2020 RISE TESTBED Project under Grant 734325, in part by EPSRC TOUCAN Project under Grant EP/L020009/1, in part by the National Key Scientific Instrument and Equipment Development Project under Grant 2013YQ200607, in part by NSF of China under Grant 61631020 and Grant 61827801, in part by the Open Foundation for Graduate Innovation of NUAA under Grant KFJJ20170405, in part by the Taishan Scholar Program of Shandong Province, and in part by Fundamental Research Funds of Shandong University under Grant 2017JC009. The associate editor coordinating the review of this paper and approving it for publication was V. Raghavan. (Corresponding author: Cheng-Xiang Wang.)
Q. Zhu is with the Key Laboratory of Dynamic Cognitive System of Electromagnetic Spectrum Space, College of Electronic and Information Engineering, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, China, and also with the Institute of Sensors, Signals and Systems, School of Engineering and Physical Sciences, Heriot-Watt University, Edinburgh EH14 4AS, U.K. (e-mail: zhuqiuming@nuaa.edu.cn).
Y. Yang, X. Chen, and W. Zhong are with the Key Laboratory of Dynamic Cognitive System of Electromagnetic Spectrum Space, College of Electronic and Information Engineering, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, China (e-mail: yingy@nuaa.edu.cn; chenxm402@nuaa.edu.cn; zhongwz@nuaa.edu.cn).
C.-X. Wang is with the National Mobile Communications Research Laboratory, School of Information Science and Engineering, Southeast University, Nanjing 210096, China, and also with the Institute of Sensors, Signals and Systems, School of Engineering and Physical Sciences, Heriot-Watt University, Edinburgh EH14 4AS, U.K. (e-mail: chxwang@seu.edu.cn).
Y. Tan is with the Institute of Sensors, Signals and Systems, School of Engineering and Physical Sciences, Heriot-Watt University, Edinburgh EH14 4AS, U.K. (e-mail: yi.tan@hw.ac.uk).
J. Sun is with the Shandong Provincial Key Laboratory of Wireless Communication Technologies, School of Information Science and Engineering, Shandong University, Qingdao 266237, China (e-mail: sunjian@sdu.edu.cn).
Digital Object Identifier 10.1109/LWC.2018.2878210

I. INTRODUCTION

MULTIPLE-INPUT multiple-output (MIMO) technologies have drawn attention for their ability to increase spectral efficiency and system capacity significantly [1]. Meanwhile, despite difficulty and high price in realization, the three dimensional (3D) antenna array is a promising solution to improve directional performance of MIMO systems. However, insufficient antenna spacing or lack of scattering would reduce these benefits due to increased spatial correlation (SC). Therefore, the exploitation of SC is vital for design, optimization, and performance evaluation of wireless MIMO communication systems. Especially, the closed-form expressions of spatial correlation functions (SCFs) are essential to derive the theoretical results of system performance [2], [3], i.e., capacity, energy efficiency, and bit error rate (BER).

Most of the SCFs in [3] and [4] were investigated for the stationary MIMO channels with wide-sense stationarity (WSS) assumption. Measurement campaigns have proved that the WSS assumption is only valid for short time intervals and the non-stationarity should be considered [1]. Recently, a few 3D non-stationary MIMO channel models have been presented in [5]–[10]. However, the models in [5]–[8] only considered 3D scattering environments but assumed the mobile station (MS) moving with a constant velocity (or 1D linear trajectory). Bian et al. [9] and Borhani et al. [10] took the velocity variations of MS into account, but they only considered 2D curve trajectories on the azimuth plane. For simplicity, the MS in [5]–[10] was configured with a uniform linear array or 2D antenna array. In addition, the corresponding SCFs were usually analyzed numerically due to complex mathematical derivations. This letter aims to fill these research gaps. Overall, the major contributions and novelties of this letter are summarized as follows:

1) This letter develops a new generic 3D non-stationary geometry-based stochastic model (GBSM) for MIMO channels. The proposed GBSM allows for 3D scattering environments, 3D antenna arrays, and 3D arbitrary trajectories of the MS, which makes it more realistic and suitable for a wide variety of communication scenarios.

2) We derive the theoretical and approximate expressions of time-variant SCFs for our proposed GBSM under 3D von Mises-Fisher (VMF) scattering scenarios which are flexible and allow the dependence between azimuth angles and elevation angles.

The remainder of this letter is organized as follows. Section II gives a new generic 3D non-stationary GBSM. In Section III, the theoretical and approximate expressions of SCF for our proposed GBSM under 3D VMF scattering scenarios are derived. In Section IV, the SCFs of the proposed model for three typical trajectories are simulated and validated. Finally, some conclusions are given in Section V.

ZHU et al.: SCs OF 3-D NON-STATIONARY MIMO CHANNEL MODEL WITH 3-D ANTENNA ARRAYS AND 3D ARBITRARY TRAJECTORIES 513
II. THE GENERIC 3D NON-STATIONARY GBSM

Let us consider a non-stationary MIMO channel under 3D scattering environments between a base station (BS) equipped with $S$ antennas and an MS equipped with $U$ antennas. There are two coordinate systems, i.e., the BS coordinate system $\tilde{x}\tilde{y}\tilde{z}$ and the MS coordinate system $xyz$. In the BS coordinate system, $\mathbf{d}_s^{\mathrm{BS}} = [d_s^{\mathrm{BS},\tilde{x}}, d_s^{\mathrm{BS},\tilde{y}}, d_s^{\mathrm{BS},\tilde{z}}]^T$ denotes the 3D location of BS antenna element $s$, while $\phi_{n,m}^{\mathrm{BS}}(t)$ and $\theta_{n,m}^{\mathrm{BS}}(t)$ represent the azimuth angles of departure (AAoDs) and elevation angles of departure (EAoDs) of the $m$th ray within the $n$th path, respectively. Note that the MS coordinate origin is the center of the MS and the $x$ axis is the initial movement direction of the MS. The time-variant velocity vector of the MS is denoted by $\mathbf{v}^{\mathrm{MS}}(t)$. Consequently, the 3D location of MS antenna element $u$ in the MS coordinate system is time-variant and is denoted as $\mathbf{d}_u^{\mathrm{MS}}(t) = [d_u^{\mathrm{MS},x}(t), d_u^{\mathrm{MS},y}(t), d_u^{\mathrm{MS},z}(t)]^T$. The azimuth angles of arrival (AAoAs) and the elevation angles of arrival (EAoAs) of the $m$th ray within the $n$th path are denoted by $\phi_{n,m}^{\mathrm{MS}}(t)$ and $\theta_{n,m}^{\mathrm{MS}}(t)$, respectively. In this letter, the twin-cluster approach in [11] is adopted. The first and last clusters, denoted by $A_n$ and $Z_n$, respectively, are considered to have random velocities denoted as $\mathbf{v}^{A_n}(t)$ and $\mathbf{v}^{Z_n}(t)$, respectively, while $v^{A_n/Z_n}$, $\phi^{A_n/Z_n}$, and $\theta^{A_n/Z_n}$ denote their amplitudes, azimuth angles, and elevation angles, respectively. The rest of the clusters are abstracted by several virtual links. Note that the mean angles of AAoDs and EAoDs are denoted by $\bar{\phi}_n^{\mathrm{BS}}(t)$ and $\bar{\theta}_n^{\mathrm{BS}}(t)$, respectively, while the mean angles of AAoAs and EAoAs are denoted by $\bar{\phi}_n^{\mathrm{MS}}(t)$ and $\bar{\theta}_n^{\mathrm{MS}}(t)$.
The small-scale fading channel between the BS and MS can be expressed by a $U \times S$ complex matrix. Each element $h_{u,s}(t,\tau)$ denotes the complex channel impulse response (CIR) between the $s$th BS antenna and the $u$th MS antenna, and it can be modeled as [8]
$$h_{u,s}(t,\tau) = \sum_{n=1}^{N(t)} \sqrt{P_n(t)}\,\tilde{h}_{u,s,n}(t)\,\delta(\tau - \tau_n(t)) \tag{1}$$
where the $N(t)$ multiple paths are characterized by a path delay $\tau_n(t)$, path power $P_n(t)$, and channel coefficient $\tilde{h}_{u,s,n}(t)$. It should be mentioned that the channel parameters of (1) such as $N(t)$, $P_n(t)$, and $\tau_n(t)$ are time-invariant in the WINNER+ model. Here, they are upgraded to be time-variant to capture the non-stationarity of real channels. Moreover, our previous 3D non-stationary GBSMs in [8] as well as other existing GBSMs only considered 1D or 2D trajectories and antenna arrays of the MS. In order to take 3D arbitrary trajectories and antenna arrays into account, this letter models $\tilde{h}_{u,s,n}(t)$ as
$$\tilde{h}_{u,s,n}(t) = \lim_{M\to\infty} \sqrt{\frac{1}{M}} \sum_{m=1}^{M} e^{j\left(\Phi_{n,m}^{D}(t) + \Phi_{n,m}^{L}(t) + \Phi_{n,m}^{I}\right)} \tag{2}$$
where $M$ denotes the number of rays, and $\Phi_{n,m}^{I}$ is a random initial phase uniformly distributed over $[0, 2\pi)$. In (2), $\Phi_{n,m}^{D}(t)$ and $\Phi_{n,m}^{L}(t)$ denote the time-variant phases caused by Doppler frequency variations and antenna location variations, respectively, and they can be further modeled as [12]
$$\Phi_{n,m}^{D}(t) = k \int_0^t \left(\mathbf{v}^{\mathrm{MS}}(t') - \mathbf{v}^{Z_n}(t')\right) \cdot \tilde{\mathbf{s}}_{n,m}^{\mathrm{MS}}(t')\,dt' \tag{3}$$
and
$$\Phi_{n,m}^{L}(t) = k\left((\tilde{\mathbf{s}}_{n,m}^{\mathrm{MS}}(t))^T \cdot \mathbf{R}_V(t) \cdot \mathbf{d}_u^{\mathrm{MS},t_0} + (\tilde{\mathbf{s}}_{n,m}^{\mathrm{BS}}(t))^T \cdot \mathbf{d}_s^{\mathrm{BS}}\right) \tag{4}$$
where $k$ is the wave number, $\mathbf{d}_u^{\mathrm{MS},t_0}$ denotes the initial 3D location of the $u$th MS antenna, and $\tilde{\mathbf{s}}_{n,m}^{\mathrm{MS}}(t)$ and $\tilde{\mathbf{s}}_{n,m}^{\mathrm{BS}}(t)$ are the equivalent spherical unit vectors of the $m$th ray within the $n$th path of the arrival and departure signals, respectively, which can be expressed as
$$\tilde{\mathbf{s}}\left(\phi_{n,m}^{\mathrm{BS/MS}}(t), \theta_{n,m}^{\mathrm{BS/MS}}(t)\right) = \begin{bmatrix} \cos(\theta_{n,m}^{\mathrm{BS/MS}}(t)) \cos(\phi_{n,m}^{\mathrm{BS/MS}}(t)) \\ \cos(\theta_{n,m}^{\mathrm{BS/MS}}(t)) \sin(\phi_{n,m}^{\mathrm{BS/MS}}(t)) \\ \sin(\theta_{n,m}^{\mathrm{BS/MS}}(t)) \end{bmatrix}. \tag{5}$$
Note that the time-variant rotation matrix $\mathbf{R}_V(t)$ in (4) takes the effect of 3D arbitrary trajectories into account and can be calculated by
$$\mathbf{R}_V(t) = \begin{bmatrix} \cos\theta_V(t)\cos\phi_V(t) & -\sin\phi_V(t) & -\sin\theta_V(t)\cos\phi_V(t) \\ \cos\theta_V(t)\sin\phi_V(t) & \cos\phi_V(t) & -\sin\theta_V(t)\sin\phi_V(t) \\ \sin\theta_V(t) & 0 & \cos\theta_V(t) \end{bmatrix} \tag{6}$$
where $\phi_V(t)$ and $\theta_V(t)$ denote the azimuth and elevation angles of $\mathbf{v}^{\mathrm{MS}}(t)$, respectively.

III. THE SCFS UNDER 3D VMF SCATTERING SCENARIOS
A. Theoretical SCFs of the Generic 3D Non-Stationary GBSM
The normalized time-variant SCF of the $n$th path between two different pairs of antennas can be defined as
$$\rho_{u_1,s_1,n}^{u_2,s_2,n}(t;\Delta\mathbf{d}) = E\left[\tilde{h}_{u_1,s_1,n}(t)\,\tilde{h}_{u_2,s_2,n}^{*}(t)\right]. \tag{7}$$
Since clusters $A_n$ and $Z_n$ are usually considered independent, the normalized SCF can be rewritten as
$$\rho_{u_1,s_1,n}^{u_2,s_2,n}(t;\Delta\mathbf{d}) = \rho_{s_1,s_2,n}^{\mathrm{BS}}(t;\Delta\mathbf{d}^{\mathrm{BS}}) \cdot \rho_{u_1,u_2,n}^{\mathrm{MS}}(t;\Delta\mathbf{d}^{\mathrm{MS}}) \tag{8}$$
where $\rho_{s_1,s_2,n}^{\mathrm{BS}}(t;\Delta\mathbf{d}^{\mathrm{BS}})$ and $\rho_{u_1,u_2,n}^{\mathrm{MS}}(t;\Delta\mathbf{d}^{\mathrm{MS}})$ denote the normalized SCFs at the BS and the MS, respectively. By substituting (2)–(4) into (7), they can be calculated as
$$\rho_{s_1,s_2,n}^{\mathrm{BS}}(t;\Delta\mathbf{d}^{\mathrm{BS}}) = \int_{-\pi}^{\pi}\int_{-\pi}^{\pi} e^{jk(\tilde{\mathbf{s}}_n^{\mathrm{BS}}(t))^T \cdot \Delta\mathbf{d}^{\mathrm{BS}}} \, p_{\phi_{n,m}^{\mathrm{BS}}(t),\theta_{n,m}^{\mathrm{BS}}(t)}\left(\phi_n^{\mathrm{BS}}(t), \theta_n^{\mathrm{BS}}(t)\right) d\phi_n^{\mathrm{BS}}\, d\theta_n^{\mathrm{BS}} \tag{9}$$
$$\rho_{u_1,u_2,n}^{\mathrm{MS}}(t;\Delta\mathbf{d}^{\mathrm{MS}}) = \int_{-\pi}^{\pi}\int_{-\pi}^{\pi} e^{jk(\tilde{\mathbf{s}}_n^{\mathrm{MS}}(t))^T \cdot \mathbf{R}_V(t) \cdot \Delta\mathbf{d}^{\mathrm{MS},t_0}} \, p_{\phi_{n,m}^{\mathrm{MS}}(t),\theta_{n,m}^{\mathrm{MS}}(t)}\left(\phi_n^{\mathrm{MS}}(t), \theta_n^{\mathrm{MS}}(t)\right) d\phi_n^{\mathrm{MS}}\, d\theta_n^{\mathrm{MS}} \tag{10}$$
where $\Delta\mathbf{d}^{\mathrm{BS}} = \mathbf{d}_{s_1}^{\mathrm{BS}} - \mathbf{d}_{s_2}^{\mathrm{BS}}$ denotes the space lag at the BS, which does not change over time, $\Delta\mathbf{d}^{\mathrm{MS}}(t) = \mathbf{R}_V(t) \cdot \Delta\mathbf{d}^{\mathrm{MS},t_0}$ denotes the time-variant space lag at the MS, and $\Delta\mathbf{d}^{\mathrm{MS},t_0} = \mathbf{d}_{u_1}^{\mathrm{MS},t_0} - \mathbf{d}_{u_2}^{\mathrm{MS},t_0}$ is the initial space lag. In (9) and (10), $p_{\phi_{n,m}^{\mathrm{BS/MS}}(t),\theta_{n,m}^{\mathrm{BS/MS}}(t)}(\phi_n^{\mathrm{BS/MS}}(t), \theta_n^{\mathrm{BS/MS}}(t))$ represents the time-variant joint probability density function (PDF) of the random angles, i.e., AAoDs $\phi_n^{\mathrm{BS}}(t)$ and EAoDs $\theta_n^{\mathrm{BS}}(t)$, or AAoAs $\phi_n^{\mathrm{MS}}(t)$ and EAoAs $\theta_n^{\mathrm{MS}}(t)$.

TABLE I. PARAMETERS OF THREE DIFFERENT TRAJECTORIES
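The geometry in (4)–(6) can be checked numerically. The following is a minimal Python sketch, not the authors' implementation: it builds the unit vector of (5) and the rotation matrix of (6), then evaluates the antenna-location phase of (4). The function names, the 0.125 m wavelength, and all antenna coordinates and angles are illustrative assumptions.

```python
import numpy as np

def unit_vector(phi, theta):
    """Spherical unit vector of (5) for azimuth phi and elevation theta (radians)."""
    return np.array([np.cos(theta) * np.cos(phi),
                     np.cos(theta) * np.sin(phi),
                     np.sin(theta)])

def rotation_matrix(phi_v, theta_v):
    """Rotation matrix R_V of (6) for the azimuth/elevation of the MS velocity."""
    cp, sp = np.cos(phi_v), np.sin(phi_v)
    ct, st = np.cos(theta_v), np.sin(theta_v)
    return np.array([[ct * cp, -sp, -st * cp],
                     [ct * sp,  cp, -st * sp],
                     [st,      0.0,  ct]])

def location_phase(k, s_ms, s_bs, R, d_u0, d_s):
    """Antenna-location phase of (4): k * (s_ms^T R d_u0 + s_bs^T d_s)."""
    return k * (s_ms @ R @ d_u0 + s_bs @ d_s)

# Hypothetical example values (assumptions, not taken from the letter).
wavelength = 0.125                       # metres
k = 2.0 * np.pi / wavelength             # wave number
R = rotation_matrix(np.deg2rad(30.0), np.deg2rad(10.0))

# The MS x axis is the initial movement direction, so R maps it to the
# current heading, i.e. the unit vector along the velocity.
heading = R @ np.array([1.0, 0.0, 0.0])

d_u0 = np.array([0.0, wavelength / 2.0, 0.0])   # initial MS antenna location
d_s = np.array([0.0, 0.0, wavelength / 2.0])    # BS antenna location
s_ms = unit_vector(np.deg2rad(45.0), np.deg2rad(5.0))
s_bs = unit_vector(np.deg2rad(-20.0), np.deg2rad(15.0))
phase = location_phase(k, s_ms, s_bs, R, d_u0, d_s)
```

Because the three columns of (6) are mutually orthogonal unit vectors, `R` is a proper rotation: it changes the orientation of the initial antenna offset but not its length.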
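B. Approximate Expressions of SCFs

For several standard channel models, e.g., the WINNER+ model and the 3GPP-3D model, the azimuth and elevation angles of AoAs and AoDs are supposed to be independent and obey Gaussian and Laplacian distributions, respectively. However, Mammasis et al. [4] demonstrated that they are related under some scenarios and the flexible VMF distribution can fit measurement data better. The PDF of the VMF distribution is defined as
$$p(\phi,\theta) = \frac{\kappa}{4\pi\sinh(\kappa)} \cos\theta \; e^{\kappa\left(\cos\bar{\theta}\cos\theta\cos(\phi-\bar{\phi}) + \sin\bar{\theta}\sin\theta\right)} \tag{11}$$
where $-\pi \le \phi \le \pi$, $-\pi/2 \le \theta \le \pi/2$, $\bar{\phi}$ and $\bar{\theta}$ represent the mean values of the azimuth and elevation angles, respectively, and $\kappa$ controls the concentration of the VMF distribution.
By substituting (11) into (9) and (10), the theoretical SCFs at the BS and MS under VMF scattering scenarios can be obtained. However, it is difficult to derive the accurate closed-form expressions of SCFs. In the following, we will derive the corresponding approximate expressions, which are very helpful to investigate the impact of 3D arbitrary trajectories on the SC. Let us take $\rho_{u_1,u_2,n}^{\mathrm{MS}}(t;\Delta\mathbf{d}^{\mathrm{MS}})$ as an example. By setting $\theta_n^{\mathrm{MS}}(t) = \bar{\theta}_n^{\mathrm{MS}}(t) + \zeta$ and $\phi_n^{\mathrm{MS}}(t) = \bar{\phi}_n^{\mathrm{MS}}(t) + \upsilon$ and substituting them into (10), it yields
$$\rho_{u_1,u_2,n}^{\mathrm{MS}}(t;\Delta\mathbf{d}^{\mathrm{MS}}) = \iint_{\{\zeta,\upsilon\}} \frac{\kappa\cos(\bar{\theta}_n^{\mathrm{MS}}(t)+\zeta)\,A^{\mathrm{MS}} B^{\mathrm{MS}}}{4\pi\sinh(\kappa)}\,d\zeta\,d\upsilon \tag{12}$$
where
$$A^{\mathrm{MS}} = e^{\kappa\left(\cos(\bar{\theta}_n^{\mathrm{MS}}(t)+\zeta)\cos\bar{\theta}_n^{\mathrm{MS}}(t)\cos(\upsilon) + \sin(\bar{\theta}_n^{\mathrm{MS}}(t)+\zeta)\sin\bar{\theta}_n^{\mathrm{MS}}(t)\right)} \tag{13a}$$
$$B^{\mathrm{MS}} = e^{jk\,\tilde{\mathbf{s}}\left(\bar{\phi}_n^{\mathrm{MS}}(t)+\upsilon,\;\bar{\theta}_n^{\mathrm{MS}}(t)+\zeta\right)\mathbf{R}_V(t)\Delta\mathbf{d}^{\mathrm{MS},t_0}}. \tag{13b}$$
The measured data has shown that the angle spreads are small under some scenarios, which means $\zeta$ and $\upsilon$ are small or the parameter $\kappa$ is big [4]. Under this condition, we have $\cos(\zeta) \approx \cos(\upsilon) \approx 1$, $\sin(\zeta) \approx \zeta$, $\sin(\upsilon) \approx \upsilon$, $\kappa\cos(\upsilon) \approx \kappa(1-\upsilon^2/2)$, and $\kappa\cos(\zeta) \approx \kappa(1-\zeta^2/2)$. By using the integration formula, we can obtain the approximate expression of (12) as
$$\rho_{u_1,u_2,n}^{\mathrm{MS}}(t;\Delta\mathbf{d}^{\mathrm{MS}}) \approx \frac{-\mathrm{Im}(\mathrm{erfi}(G^{\mathrm{MS}}(t)))\,\mathrm{Im}(\mathrm{erfi}(H^{\mathrm{MS}}(t)))}{8\sinh(\kappa)} \times e^{jk\,\tilde{\mathbf{s}}(\bar{\phi}_n^{\mathrm{MS}}(t),\bar{\theta}_n^{\mathrm{MS}}(t))\mathbf{R}_V(t)\Delta\mathbf{d}^{\mathrm{MS},t_0}} \times e^{-(k\mathbf{R}_V(t)\Delta\mathbf{d}^{\mathrm{MS},t_0})^2\left(\left(\frac{\mathbf{E}^{\mathrm{MS}}(t)}{\sqrt{2\kappa}}\right)^2 + \left(\frac{\mathbf{D}^{\mathrm{MS}}(t)}{\sqrt{2\kappa}}\right)^2\right) + \kappa} \tag{14}$$
where $\mathrm{erfi}(\cdot)$ is the imaginary error function,
$$\mathbf{D}^{\mathrm{MS}}(t) = [-\sin\bar{\theta}_n^{\mathrm{MS}}(t)\cos\bar{\phi}_n^{\mathrm{MS}}(t),\; -\sin\bar{\theta}_n^{\mathrm{MS}}(t)\sin\bar{\phi}_n^{\mathrm{MS}}(t),\; \cos\bar{\theta}_n^{\mathrm{MS}}(t)] \tag{15a}$$
$$\mathbf{E}^{\mathrm{MS}}(t) = [-\sin\bar{\phi}_n^{\mathrm{MS}}(t),\; \cos\bar{\phi}_n^{\mathrm{MS}}(t),\; 0] \tag{15b}$$
$$G^{\mathrm{MS}}(t) = \frac{k\mathbf{E}^{\mathrm{MS}}(t)\mathbf{R}_V(t)\Delta\mathbf{d}^{\mathrm{MS},t_0} - j\kappa\cos^2(\bar{\theta}_n^{\mathrm{MS}}(t))\Delta_\phi^{\mathrm{MS}}}{\sqrt{2\kappa\cos^2(\bar{\theta}_n^{\mathrm{MS}}(t))}} \tag{15c}$$
$$H^{\mathrm{MS}}(t) = \frac{k\mathbf{D}^{\mathrm{MS}}(t)\mathbf{R}_V(t)\Delta\mathbf{d}^{\mathrm{MS},t_0} - j\kappa\Delta_\theta^{\mathrm{MS}}}{\sqrt{2\kappa}}. \tag{15d}$$
In a similar way, we can derive the approximate expression of $\rho_{s_1,s_2,n}^{\mathrm{BS}}(t;\Delta\mathbf{d}^{\mathrm{BS}})$ as
$$\rho_{s_1,s_2,n}^{\mathrm{BS}}(t;\Delta\mathbf{d}^{\mathrm{BS}}) \approx \frac{-\mathrm{Im}(\mathrm{erfi}(G^{\mathrm{BS}}(t)))\,\mathrm{Im}(\mathrm{erfi}(H^{\mathrm{BS}}(t)))}{8\sinh(\kappa)} \times e^{jk\,\tilde{\mathbf{s}}(\bar{\phi}_n^{\mathrm{BS}}(t),\bar{\theta}_n^{\mathrm{BS}}(t))\Delta\mathbf{d}^{\mathrm{BS},t_0} - (k\Delta\mathbf{d}^{\mathrm{BS},t_0})^2\left(\left(\frac{\mathbf{E}^{\mathrm{BS}}(t)}{\sqrt{2\kappa}}\right)^2 + \left(\frac{\mathbf{D}^{\mathrm{BS}}(t)}{\sqrt{2\kappa}}\right)^2\right) + \kappa} \tag{16}$$
where
$$\mathbf{D}^{\mathrm{BS}}(t) = [-\sin\bar{\theta}_n^{\mathrm{BS}}(t)\cos\bar{\phi}_n^{\mathrm{BS}}(t),\; -\sin\bar{\theta}_n^{\mathrm{BS}}(t)\sin\bar{\phi}_n^{\mathrm{BS}}(t),\; \cos\bar{\theta}_n^{\mathrm{BS}}(t)] \tag{17a}$$
$$\mathbf{E}^{\mathrm{BS}}(t) = [-\sin\bar{\phi}_n^{\mathrm{BS}}(t),\; \cos\bar{\phi}_n^{\mathrm{BS}}(t),\; 0] \tag{17b}$$
$$G^{\mathrm{BS}}(t) = \frac{k\mathbf{E}^{\mathrm{BS}}(t)\Delta\mathbf{d}^{\mathrm{BS},t_0} - j\kappa\cos^2(\bar{\theta}_n^{\mathrm{BS}}(t))\Delta_\phi^{\mathrm{BS}}}{\sqrt{2\kappa\cos^2(\bar{\theta}_n^{\mathrm{BS}}(t))}} \tag{17c}$$
$$H^{\mathrm{BS}}(t) = \frac{k\mathbf{D}^{\mathrm{BS}}(t)\Delta\mathbf{d}^{\mathrm{BS},t_0} - j\kappa\Delta_\theta^{\mathrm{BS}}}{\sqrt{2\kappa}}. \tag{17d}$$
Finally, combining (8) and (14)–(17), the approximate SCF of the proposed generic 3D non-stationary GBSM can be obtained.

IV. SIMULATION RESULTS AND ANALYSIS
In the simulation, the time-variant speed and direction models in [13] are adopted, i.e., $v^{\mathrm{MS}}(t) = v_0 + a \cdot t$, $\phi_V(t) = \phi_0 + \omega_\phi \cdot t$, and $\theta_V(t) = \theta_0 + \omega_\theta \cdot t$, where $v_0$ is the initial speed, $a$ is the acceleration, $\phi_0$ and $\theta_0$ are the initial azimuth and elevation of movement, respectively, and $\omega_\phi$ and $\omega_\theta$ are the corresponding angular speeds. To verify the effectiveness of the proposed generic GBSM and the derived theoretical SCF, three kinds of trajectories with the same start point and end point, i.e., a 1D linear trajectory, a 2D curve trajectory, and a 3D arbitrary trajectory, were selected, and the remaining simulation parameters are summarized in Table I. Note that our proposed model is suitable for all three paths, while the models in [8]–[10] chosen for comparison only support Path I or Path II, as shown in Table I.
According to the definitions in (9) and (10), the absolute values of the theoretical SCFs of our generic 3D non-stationary GBSM for Paths I–III at t = 4 s are shown in Fig. 1. For comparison, the SCFs of the 3D non-stationary GBSMs in [8]–[10] for Path I or Path II at t = 4 s are also given. It can be observed that the SCFs of our model agree well with the one in [8] for Path I, and the ones in [9] and [10] for Path II, which means the proposed generic GBSM is compatible with existing ones under 1D and 2D trajectories. In addition, it also shows that different trajectories have a great impact on the correlation properties of channel models.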
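As a numerical companion to the theoretical SCF in (9) under the VMF PDF (11), the following Python sketch evaluates the double integral with a midpoint rule. It is a sketch under stated assumptions rather than the authors' simulator: the function names, the grid sizes, the unit wavelength, and the choice to integrate the elevation over $[-\pi/2, \pi/2]$ (the support stated for (11)) are all assumptions; $\kappa = 65$ is the value quoted in the figure captions.

```python
import numpy as np

def unit_vectors(phi, theta):
    """Unit vectors of (5) for arrays of azimuth/elevation; last axis has size 3."""
    return np.stack([np.cos(theta) * np.cos(phi),
                     np.cos(theta) * np.sin(phi),
                     np.sin(theta)], axis=-1)

def vmf_pdf(phi, theta, phi_mean, theta_mean, kappa):
    """VMF PDF of (11). Uses kappa / (4*pi*sinh(kappa)) * exp(kappa*c)
    = kappa / (2*pi*(1 - exp(-2*kappa))) * exp(kappa*(c - 1)) to avoid overflow."""
    c = (np.cos(theta_mean) * np.cos(theta) * np.cos(phi - phi_mean)
         + np.sin(theta_mean) * np.sin(theta))
    norm = kappa / (2.0 * np.pi * (1.0 - np.exp(-2.0 * kappa)))
    return norm * np.cos(theta) * np.exp(kappa * (c - 1.0))

def scf_numeric(delta_d, wavelength, phi_mean, theta_mean, kappa,
                n_phi=720, n_theta=360):
    """Midpoint-rule evaluation of (9) for a fixed space lag delta_d (3-vector)."""
    k = 2.0 * np.pi / wavelength
    d_phi = 2.0 * np.pi / n_phi
    d_theta = np.pi / n_theta
    phi = -np.pi + (np.arange(n_phi) + 0.5) * d_phi
    theta = -np.pi / 2.0 + (np.arange(n_theta) + 0.5) * d_theta
    P, T = np.meshgrid(phi, theta, indexing="ij")
    phase = k * (unit_vectors(P, T) @ np.asarray(delta_d, dtype=float))
    integrand = np.exp(1j * phase) * vmf_pdf(P, T, phi_mean, theta_mean, kappa)
    return integrand.sum() * d_phi * d_theta

# Hypothetical setup: unit wavelength, mean direction along x, lags along y.
kappa = 65.0
rho_zero = scf_numeric([0.0, 0.0, 0.0], 1.0, 0.0, 0.0, kappa)
rho_half = scf_numeric([0.0, 0.5, 0.0], 1.0, 0.0, 0.0, kappa)
rho_two = scf_numeric([0.0, 2.0, 0.0], 1.0, 0.0, 0.0, kappa)
```

At zero lag the integrand reduces to the PDF itself, so `rho_zero` is a check that (11) integrates to one. For the MS side of (10), the same routine can be reused by passing the rotated lag $\mathbf{R}_V(t)\Delta\mathbf{d}^{\mathrm{MS},t_0}$ as `delta_d`.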
Fig. 1. Absolute values of the theoretical SCFs of our proposed model and other models for three paths at t = 4 s (UMi NLOS scenario, $\kappa = 65$, $v^{A_n}$ and $v^{Z_n} \sim U(0,5)$ m/s, $\phi^{A_n}$ and $\phi^{Z_n} \sim U(-\pi,\pi)$, $\theta^{A_n}$ and $\theta^{Z_n} \sim U(-\pi/2,\pi/2)$).
Fig. 2. Absolute values of the theoretical, approximated, simulated SCFs of the proposed model for Path III at three time instants and the measured SCF in [14] (UMi NLOS scenario, $\kappa = 65$, $v^{A_n}$ and $v^{Z_n} \sim U(0,5)$ m/s, $\phi^{A_n}$ and $\phi^{Z_n} \sim U(-\pi,\pi)$, $\theta^{A_n}$ and $\theta^{Z_n} \sim U(-\pi/2,\pi/2)$).

The theoretical results of SCFs at t = 0 s, 4 s, and 8 s of the proposed GBSM for Path III are compared with the simulated and approximate results in Fig. 2. It clearly shows that SCFs change over time due to the movement of the MS. Meanwhile, the measurement results in [14] are also shown in Fig. 2. The good agreement between measured, theoretical, approximate, and simulated SCFs shows the correctness of both the proposed model and theoretical derivations.

V. CONCLUSION
We have proposed a generic 3D non-stationary GBSM incorporating 3D arbitrary trajectories and 3D antenna arrays of the MS in this letter. The proposed model is compatible with the existing 3D non-stationary GBSMs for 1D and 2D trajectories. In addition, the corresponding theoretical and approximate expressions of SCFs under VMF scattering scenarios have also been derived and verified by simulations and measurements. These results can significantly improve the efficiency of analyzing the SC and performance of MIMO communication systems. In the future, we will further analyze other time-varying statistical properties of this proposed GBSM, i.e., the autocorrelation function (ACF) and Doppler power spectrum density (DPSD), and investigate the impact of different trajectories on the system performance.

REFERENCES
[1] C.-X. Wang et al., “Recent advances and future challenges for massive MIMO channel measurements and models,” Sci. China Inf. Sci., vol. 59, no. 2, pp. 1–16, Feb. 2016.
[2] J.-A. Tsai, R. M. Buehrer, and B. D. Woerner, “BER performance of a uniform circular array versus a uniform linear array in a mobile radio environment,” IEEE Trans. Wireless Commun., vol. 3, no. 3, pp. 695–700, May 2004.
[3] Q.-U.-A. Nadeem, A. Kammoun, M. Debbah, and M.-S. Alouini, “A generalized spatial correlation model for 3D MIMO channels based on the Fourier coefficients of power spectrums,” IEEE Trans. Signal Process., vol. 63, no. 14, pp. 3671–3686, Jul. 2015.
[4] K. Mammasis, R. W. Stewart, and J. S. Thompson, “Spatial fading correlation model using mixtures of Von Mises Fisher distributions,” IEEE Trans. Wireless Commun., vol. 8, no. 4, pp. 2046–2055, Apr. 2009.
[5] J. Bian et al., “A WINNER+ based 3-D non-stationary wideband MIMO channel model,” IEEE Trans. Wireless Commun., vol. 17, no. 3, pp. 1755–1767, Mar. 2018.
[6] S. Wu, C.-X. Wang, M. Aggoune, M. M. Alwakeel, and X. You, “A general 3D non-stationary 5G wireless channel model,” IEEE Trans. Commun., vol. 66, no. 7, pp. 3065–3078, Jul. 2018.
[7] A. G. Zajić, “Impact of moving scatterers on vehicle-to-vehicle narrow-band channel characteristics,” IEEE Trans. Veh. Technol., vol. 63, no. 7, pp. 3094–3106, Sep. 2014.
[8] Q. Zhu et al., “A novel 3D non-stationary wireless MIMO channel simulator and hardware emulator,” IEEE Trans. Commun., vol. 66, no. 9, pp. 3865–3878, Sep. 2018.
[9] J. Bian, C.-X. Wang, M. Zhang, X. Ge, and X. Gao, “A 3-D non-stationary wideband MIMO channel model allowing for velocity variations of the mobile station,” in Proc. ICC, Paris, France, Jul. 2017, pp. 1–6.
[10] A. Borhani, G. L. Stüber, and M. Pätzold, “A random trajectory approach for the development of nonstationary channel models capturing different scales of fading,” IEEE Trans. Veh. Technol., vol. 66, no. 1, pp. 2–14, Jan. 2017.
[11] H. Hofstetter, A. F. Molisch, and N. Czink, “A twin-cluster MIMO channel model,” in Proc. EuCAP, Nice, France, Nov. 2006, pp. 1–8.
[12] B. Boashash, Time-Frequency Signal Analysis and Processing: A Comprehensive Reference. Amsterdam, The Netherlands: Academic, 2015.
[13] W. Dahech, M. Pätzold, C. A. Gutiérrez, and N. Youssef, “A non-stationary mobile-to-mobile channel model allowing for velocity and trajectory variations of the mobile stations,” IEEE Trans. Wireless Commun., vol. 16, no. 3, pp. 1987–2000, Mar. 2017.
[14] S. Payami and F. Tufvesson, “Channel measurements and analysis for very large array systems at 2.6 GHz,” in Proc. Antennas Propag. (EUCAP), Prague, Czech Republic, Mar. 2012, pp. 433–437.
Protograph-Based Folded Spatially Coupled LDPC Codes for Burst Erasure Channels
Inayat Ali, Hyunjae Lee, Ayaz Hussain, and Sang-Hyo Kim

Abstract—In this letter, protograph-based folded spatially coupled (FSC) LDPC codes are proposed. The new folded-type structure is obtained by folding the spatial coupling chain of a conventional spatially coupled (SC) LDPC protograph and interlacing the nodes at staggered spatial positions. The proposed codes outperform the SC LDPC codes over single and multiple random-burst erasure channels. We extend the construction of the folded-type structure by connecting multiple one-sided SC LDPC chains for higher resilience to burst erasure channels. The FSC LDPC codes are also compatible with windowed decoder and outperform conventional SC LDPC codes.
Index Terms—Spatially coupled LDPC codes, windowed decoder, burst erasure channels, protograph-based codes.

Manuscript received August 17, 2018; accepted October 21, 2018. Date of publication October 30, 2018; date of current version April 9, 2019. This work was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education under Grant (NRF-2015R1D1A1A01058975) and Grant (NRF-2018R1A2B6004195). The associate editor coordinating the review of this paper and approving it for publication was I. Land. (Corresponding author: Sang-Hyo Kim.)
The authors are with the College of Information and Communication Engineering, Sungkyunkwan University, Suwon 16419, South Korea (e-mail: inayat@skku.edu; dlguswo77@skku.edu; ayaz@skku.edu; iamshkim@skku.edu).
Digital Object Identifier 10.1109/LWC.2018.2878562

I. INTRODUCTION
SPATIALLY coupled (SC) LDPC codes have capacity achieving performance over binary-input memoryless symmetric-output (BMS) channels [1]. The belief propagation (BP) threshold of these codes saturates to the maximum a posteriori (MAP) threshold of same degree distribution uncoupled block-LDPC codes when the coupling length ‘L’ tends to infinity, which was proven for BMS channels in [2]. Good performance of these codes is due to the wave-like propagation of reliable information from two terminated sides of the graph during BP decoding. This wave-like progress of decoding can halt when a burst of erasures is introduced. Therefore, it was shown that, despite their good performance in memoryless channels, the SC LDPC codes perform far from the Singleton bound for burst erasure channels [3]. An extensive finite-length analysis of SC LDPC codes over different burst erasure channels was done in [8] and tight bounds on erasure probability are shown.
To improve the burst erasure correcting capability of SC LDPC codes, convolutional interleavers are used to scatter the erasures in a burst [4]. Multidimensional SC LDPC codes are introduced in [5] to overcome burst erasures, but these codes have a higher rate loss than the conventional SC LDPC codes. The complexity of designing finite-length multidimensional SC LDPC codes is also higher. In [6], the band-splitting permutation technique for the SC LDPC codes is proposed, and asymptotic optimality is shown for burst erasure channels. The splitting of the convolutional structure into multiple bands at different location makes windowed decoding impossible; so the low latency continuous decoding property of SC LDPC codes is lost by this splitting technique. Asymmetric spatially coupled LDPC codes are proposed in [7] to enhance the burst erasure correcting capability.
In this letter, we propose a protograph-based design of folded-type SC LDPC codes, which are formed by simply folding the spatial structure of the conventional SC LDPC codes. The sub-blocks, each of which is a subset of variable nodes at a spatial position, are interlaced between each other, which introduces an interleaving effect when codewords are transmitted over the burst erasure channels. The newly structured codes outperform the conventional SC LDPC codes in the asymptotic analysis and simulations. With the folded structure, the low latency windowed decoding can still be conducted while keeping regularity in the protograph structure. It should be noted that the idea of folding was also suggested in [11] to get one-sided SC LDPC codes, which have a smaller rate loss. However, we use a different idea of folding that comes with interlacing. The scheme is extended by constructing generalized-FSC LDPC structure from multiple one-sided SC LDPC codes for better resiliency to burst erasures.

II. CODE CONSTRUCTION
A. Folded SC LDPC Codes
Spatial coupling of multiple LDPC protographs results in a base matrix with a convolutional diagonal band structure [9]. Let us consider a regular (J, K) LDPC protograph with base matrix B, where, J and K are variable and check node degrees, respectively. The diagonal band in the base matrix depends on the edge spreading pattern. Different ensembles of SC LDPC codes can be obtained by using different edge spreading rules [9]. To preserve the degree distribution of the underlying (J, K) LDPC base matrix, the valid edge spreading condition that must be satisfied is given by
$$\sum_{i=0}^{w} \mathbf{B}_i = \mathbf{B}, \tag{1}$$
where $\mathbf{B}_i$'s are component base matrices that define edge spreading. In this letter, two (3, 6) ensembles are considered for performance evaluation. Ensemble A is defined by component base matrices $\mathbf{B}_0 = \mathbf{B}_1 = \mathbf{B}_2 = [1\ 1]$, whereas ensemble B by $\mathbf{B}_0 = [2\ 2]$, $\mathbf{B}_1 = [0\ 1]$, $\mathbf{B}_2 = [1\ 0]$. For constructing finite length codes, we follow the procedure defined in [9] and [10].
The FSC LDPC codes are obtained by folding the chain structure of SC LDPC codes as illustrated in Fig. 1. Two
wings of the coupled chain are aligned to be parallel and then interlaced in such a way that the spatial positions are indexed alternatively from the wing-tips to the pivot.

Fig. 1. FSC LDPC protograph constructed by folding and interlacing an L = 8 SC LDPC protograph of ensemble A.
Fig. 2. Generalized FSC LDPC protograph constructed from m one-sided SC LDPC chains ($L_i = 4$).

Let $\mathbf{x} = [x_j]$ be a proto-codeword of the conventional SC LDPC code, where $x_j$ denotes the set of protograph symbol nodes at spatial position $j \in \{1, \ldots, L\}$. Note that $x_j$ will later become a sub-block of an explicit code constructed by code lifting. Let $\mathbf{z} = [z_k]$ be a proto-codeword of FSC LDPC codes, whose entry is then determined by
$$z_{\sigma(j)} = x_j, \qquad \sigma(j) = \begin{cases} 2j - 1, & \text{for } j \le L/2, \\ 2(L - j + 1), & \text{otherwise,} \end{cases}$$
where $\sigma(\cdot)$ is bijective in $[1, L]$. The new proto-codeword $\mathbf{z}$ is transmitted symbol by symbol sequentially. This permutation can be regarded as a special interleaving. However, the new code does not require an interleaver in the encoder nor a deinterleaver in the decoder, because the structure is compatible with sequential encoding and windowed decoding thanks to the diagonal band structure in the corresponding base matrix. Note that the base matrix $\mathbf{B}^{A}_{[1,L]}$ of FSC LDPC ensemble A has a diagonal band:
$$\mathbf{B}^{A}_{[1,L]} = \begin{bmatrix} 11 & & & & & \\ 00 & 11 & & & & \\ 11 & 00 & \ddots & & & \\ 00 & 11 & \ddots & 11 & & \\ 11 & 00 & \ddots & 00 & 11 & \\ & 11 & \ddots & 11 & 00 & 11 \\ & & & 00 & 11 & 11 \\ & & & 11 & 11 & 11 \end{bmatrix} \tag{2}$$
Lifting is applied to construct a practical FSC LDPC code. Each $z_k$ at spatial position $k$ forms a sub-block of variable nodes in the lifted graph.
Since the two terminated sides of conventional SC LDPC codes are now positioned at the same side after folding the structure, the BP decoding wave progresses from only one side to the other. In conventional SC LDPC codes, the windowed decoding starts from only one terminated side of the code. But for the FSC LDPC codes, the windowed decoding is equivalent to a parallel realization of two windowed decoders with a half-sized window that starts from the two terminated sides of the conventional SC LDPC codes. The smaller window size may cause performance degradation under a random erasure environment.

B. Generalization by Using One-Sided SC LDPC Code Chains
The FSC LDPC codes introduced in the previous subsection can be regarded as the folded graph obtained by connecting two one-sided SC LDPC codes. Two one-sided SC LDPC chains are overlaid by taking node positions alternately and their non-terminated sides are connected by exchanging edges. The FSC LDPC code construction can be generalized to a construction from $m$ one-sided SC LDPC chains of length $L_i$ as depicted in Fig. 2, where $i \in \{1, \ldots, m\}$ is the index of chains. We denote the generalized ensemble by $\mathcal{L}(J, K, m, L_i)$. Fig. 2 shows an example of edge connections which is a special case of possible connections. The entry of the proto-codeword of ensemble $\mathcal{L}(J, K, m, L_i)$ is determined by
$$\hat{z}_{\sigma(j,i)} = \hat{x}_{(j,i)}, \qquad \sigma(j,i) = m(j-1) + i, \quad \forall\, j \in \{1, \ldots, L_i\}.$$
This interlacing from $m$ copies of SC LDPC chains gives a distance of $m$ between adjacent sub-blocks. The burst erasure correction capability is increased due to this spacing in the constructed FSC LDPC code.

III. CHANNEL MODELS AND ASYMPTOTIC ANALYSIS OVER BURST ERASURE CHANNELS
We consider the channels with single or multiple bursts of erasures as a class of correlated erasure channels. In a single random-burst erasure channel (SBC), a single block of consecutive bits of length $l_b$ in the codeword undergoes bursty erasures (i.e., with a high erasure rate $\varepsilon_b$), whereas other codeword bits undergo a standard random erasure channel with erasure rate $\varepsilon$ [4]. Therefore, SBC is characterized as a two-state binary erasure channel (BEC) with $\varepsilon < \varepsilon_b$. The position of the starting bit of the burst is uniformly random, with a boundary condition such that the burst length remains constant. Thus, the last possible position of the burst is $N - l_b + 1$ (where $N$ is the code length). We also considered a multiple random-burst erasure channel (MBC), where $n_b$ random bursts of fixed length $l_b$ and burst erasure rate $\varepsilon_b$ occur uniformly at random over the range of burst positions. However, the bursts can overlap in this channel model, so the number of resolvable bursts can be smaller than $n_b$. Note that the two channel models have a similarity of nonergodicity with the wireless block-fading channels [13].
The asymptotic behavior of $\mathcal{L}(J, K, m, L_i)$ code ensemble over burst erasure channels can be analyzed by the density evolution (DE) technique [9] for protograph-based ensembles.
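The construction rules of Section II can be exercised in a few lines. The Python sketch below is illustrative only and is not the authors' code: it checks the edge-spreading condition (1) for ensembles A and B, builds the conventional (unfolded) SC base matrix, and evaluates the folding permutation $\sigma(j)$ and the generalized interlacing $\sigma(j,i) = m(j-1)+i$. All function names are hypothetical, and the block-row placement of $\mathbf{B}_i$ (component $i$ of column position $j$ goes to row position $j+i$) is the usual terminated-chain convention, assumed here.

```python
import numpy as np

def sc_base_matrix(components, L):
    """Conventional terminated SC base matrix: component B_i of column
    position j is placed at row position j + i (assumed convention)."""
    w = len(components) - 1
    rows, cols = components[0].shape
    B = np.zeros(((L + w) * rows, L * cols), dtype=int)
    for j in range(L):
        for i, Bi in enumerate(components):
            B[(j + i) * rows:(j + i + 1) * rows, j * cols:(j + 1) * cols] = Bi
    return B

def fold_sigma(L):
    """sigma(j) for j = 1..L: 2j - 1 if j <= L/2, else 2(L - j + 1)."""
    return [2 * j - 1 if j <= L // 2 else 2 * (L - j + 1) for j in range(1, L + 1)]

def generalized_sigma(m, Li):
    """sigma(j, i) = m(j - 1) + i for chain index i = 1..m and position j = 1..Li."""
    return {(j, i): m * (j - 1) + i for i in range(1, m + 1) for j in range(1, Li + 1)}

ensemble_A = [np.array([[1, 1]])] * 3
ensemble_B = [np.array([[2, 2]]), np.array([[0, 1]]), np.array([[1, 0]])]

# Edge-spreading condition (1): the components must sum to the (3, 6) base matrix.
base_A = sum(ensemble_A)
base_B = sum(ensemble_B)

# Folding an L = 8 chain: position k of z carries x_{sigma^{-1}(k)}.
sigma = fold_sigma(8)
tx_order = [sigma.index(k) + 1 for k in range(1, 9)]

# Design rate of the terminated chain with L = 50 coupled positions.
B50 = sc_base_matrix(ensemble_A, 50)
design_rate = 1.0 - B50.shape[0] / B50.shape[1]
```

For L = 8 the transmission order alternates between the two wing-tips and moves toward the pivot, which is the interlacing described above. For L = 50 the conventional base matrix has 52 check rows and 100 variable columns, so the design rate is 1 − 52/100 = 0.48, the same value as the code rate reported in Section IV.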
TABLE I. SC & FSC LDPC THRESHOLDS FOR DIFFERENT $l_b$ & $\varepsilon_b$
TABLE II. AVERAGE THRESHOLDS FOR RANDOM BURSTS
Fig. 3. BEC threshold ($\varepsilon^*$) vs burst erasure rate ($\varepsilon_b$) for a single burst located at the center of the protograph ($l_b = 6$).
Fig. 4. BEC threshold ($\varepsilon^*$) calculated for a burst at different locations in the protograph ($l_b = 4$, $\varepsilon_b = 0.6$ and 0.57 for ensemble A and B, respectively).

In asymptotic settings (i.e., when $M \to \infty$, here $M$ is lifting factor of graph), we assume that an erasure burst applies in terms of only complete spatial positions. Therefore, $l_b$ is the number of spatial positions that a burst spans. The erasure rate threshold for a given burst erasure rate $\varepsilon_b$ is defined as
$$\varepsilon^{*} = \sup\left\{\varepsilon \in [0,1] \;\middle|\; P_i^{BP}(\varepsilon,\varepsilon_b) \xrightarrow{\,i\to\infty\,} 0,\ \varepsilon < \varepsilon_b\right\}, \tag{3}$$
where, $P_i^{BP}(\varepsilon,\varepsilon_b)$ is the resulting erasure probability at the $i$th iteration of DE. The thresholds of SC and FSC LDPC codes are compared for different $l_b$ and $\varepsilon_b$ in Table I. To find these thresholds, the position of the erasure burst is fixed at the center of the protograph such that the starting position of the burst is $j = k = (L - l_b)/2$. It can be noted that the FSC LDPC codes have higher thresholds when $l_b$ and $\varepsilon_b$ increase. Because of the interlacing, the disjoint nodes between the connected nodes split the burst into two from the perspective of the original SC LDPC chain. However, if the erasure burst is located at the pivot of the FSC LDPC protograph, the split bursts are not separated.
Figure 3 shows the improvement in burst erasure correcting capability of proposed FSC LDPC codes. The threshold $\varepsilon^*$ of SC LDPC codes decreases to zero at low $\varepsilon_b$ (i.e., $\varepsilon_b = 0.5573$ and 0.538 for ensemble A and B, respectively), whereas it can be seen that the FSC LDPC codes outperform SC LDPC codes by attaining the same threshold at a much higher burst erasure rate. More considerable gain is attained with the ensemble $\mathcal{L}(3, 6, 3, 51)$. The $l_b = 6$ and the burst location is at the center of the protograph for Fig. 3. The thresholds for different burst locations are shown in Fig. 4. It is observed that $\varepsilon^*$ gets higher when the burst is located at the terminated side for both SC and FSC LDPC codes. Although $\varepsilon^*$ becomes lower for FSC LDPC codes when the burst location is at the pivot of the folded structure, this case does not dominate the overall performance due to the random location of a burst in the SBC. The average threshold $\varepsilon^*_{avg}$ for SC and FSC LDPC codes are shown in Table II. From these observations, we note that FSC LDPC codes perform better on average, but do not help in the worst case, i.e., the pivot location. Moreover, it can be noted that ensemble $\mathcal{L}(3, 6, 3, 51)$ has a relatively better threshold at pivot because of the degree of freedom to choose better connectivity among multiple chains. The thresholds of convolutional interleaved SC LDPC codes [4] are also shown in Fig. 3 and 4 for comparison. We selected the structure of convolutional interleaver such that it has the same memory as that of FSC LDPC codes. However, this interleaver incorporates some delay due to its structural weakness and it also requires more hardware block to interleave and deinterleave the data.

IV. NUMERICAL RESULTS
The performance of FSC LDPC codes is evaluated and compared with that of conventional SC LDPC codes over SBC and MBC. The finite length codes from ensemble A and B are constructed with the code parameters set as L = 50 and M = 512, so that we obtain code length N = 51200. The girth of the constructed codes was kept larger or equal to 10. For all the constructed codes, the code rate R = 0.48 is obtained because
performance in different environments. Windowed decoding performances are shown in Fig. 6 for varying window size W, where $\varepsilon_b = 0.8$. As can be noted, FSC LDPC codes outperform conventional SC LDPC codes in full block BP and windowed decoding. Figure 7 exhibits the FER performances confirming the superiority of the proposed FSC LDPC codes over the MBC, where the number of random bursts and $\varepsilon_b$ are set at 6 and 0.7, respectively.

Fig. 5. FER performances for full block BP decoding of SC and FSC LDPC codes with m = 2 constructed from ensembles A and B over SBC.
Fig. 6. FER performances for windowed decoding of SC and FSC LDPC codes with m = 2 constructed from ensembles A and B over SBC ($\varepsilon_b = 0.8$).
Fig. 7. FER performances for full block BP decoding of SC and FSC LDPC codes with m = 2 constructed from ensembles A and B over MBC with 6 random bursts and $\varepsilon_b = 0.7$.

V. CONCLUSION
We propose a family of windowed-decodable codes called FSC LDPC codes, whose structure is formed by folding and interlacing the protograph of conventional SC LDPC codes. The DE analysis showed that the FSC LDPC codes are suitable for transmission over burst erasure channels. We also proposed a construction method of generalized structures based on the idea of folding SC LDPC chains. The generality of the result is confirmed by examining multiple protograph-based ensembles. By comparing the simulated FER curves, a significant gain in performance of FSC LDPC codes is observed over single and multiple random-burst erasure channels under full block BP and windowed decoding.

REFERENCES
[1] S. Kudekar, T. Richardson, and R. L. Urbanke, “Spatially coupled ensembles universally achieve capacity under belief propagation,” IEEE Trans. Inf. Theory, vol. 59, no. 12, pp. 7761–7813, Dec. 2013.
[2] S. Kumar, A. J. Young, N. Macris, and H. D. Pfister, “Threshold saturation for spatially coupled LDPC and LDGM codes on BMS channels,” IEEE Trans. Inf. Theory, vol. 60, no. 12, pp. 7389–7415, Dec. 2014.
[3] A. R. Iyengar et al., “Windowed decoding of protograph-based LDPC convolutional codes over erasure channels,” IEEE Trans. Inf. Theory, vol. 58, no. 4, pp. 2303–2320, Apr. 2012.
[4] R. A. Ashrafi and A. E. Pusane, “Spatially-coupled communication system for the correlated erasure channel,” IET Commun., vol. 7, no. 8, pp. 755–765, May 2013.
[5] R. Ohashi, K. Kasai, and K. Takeuchi, “Multi-dimensional spatially-coupled codes,” in Proc. IEEE Int. Symp. Inf. Theory (ISIT), Istanbul, Turkey, Jul. 2013, pp. 2448–2452.
[6] H. Mori and T. Wadayama, “Band splitting permutations for spatially coupled LDPC codes achieving asymptotically optimal burst erasure immunity,” IEICE Trans. Fundam., vol. E100-A, no. 2, pp. 663–669, Feb. 2017.
[7] Z. Zhang and Y. Li, “Asymmetric spatially coupled LDPC codes for multiple-burst erasure channels,” IEEE Commun. Lett., vol. 21, no. 8, pp. 1695–1698, Aug. 2017.
[8] V. Aref, N. Rengaswamy, and L. Schmalen, “Finite-length analysis of spatially-coupled regular LDPC ensembles on burst-erasure channels,” IEEE Trans. Inf. Theory, vol. 64, no. 5, pp. 3431–3449, May 2018.
[9] D. G. M. Mitchell, M. Lentmaier, and D. J. Costello, “Spatially coupled LDPC codes constructed from protographs,” IEEE Trans. Inf. Theory, vol. 61, no. 9, pp. 4866–4889, Sep. 2015.
[10] I. Ali, J.-H. Kim, S.-H. Kim, H. Kwak, and J.-S. No, “Improving windowed decoding of SC LDPC codes by effective decoding termination, message reuse, and amplification,” IEEE Access, vol. 6, pp. 9336–9346, 2017.
[11] M. R. Sanatkar and H. D. Pfister, “Increasing the rate of spatially-
coupled codes via optimized irregular termination,” in Proc. 9th Int.
of the termination. We apply the standard flooding schedule Symp. Turbo Codes Iterative Inf. Process. (ISTC), Brest, France,
for full block BP and windowed decoding [10]. The maximum Sep. 2016, pp. 31–35.
[12] P. M. Olmos, D. G. M. Mitchell, D. Truhachev, and D. J. Costello, “A
number of iterations (IMAX ) is set as 100. finite length performance analysis of LDPC codes constructed by con-
Figure 5 shows the frame error rate (FER) performances of necting spatially coupled chains,” in Proc. IEEE Inf. Theory Workshop
SC and FSC LDPC codes from ensemble A and B, respec- (ITW), Sevilla, Spain, Sep. 2013, pp. 1–5.
[13] I. Andriyanova, N. Ul Hassan, M. Lentmaier, and G. P. Fettweis,
tively, decoded with full block BP decoder over the SBC. “SC-LDPC codes over the block-fading channel: Robustness to a syn-
We simulated the transmission of the codes over the SBC chronisation offset,” in Proc. IEEE Int. Black Sea Conf. Commun. Netw.
with εb = 0.8, 0.9 and lb = 2048, 2560 to investigate the (BlackSeaCom), Constanta, Romania, May 2015, pp. 97–101.
Distribution of the Number of Users per Base Station in Cellular Networks

Geordie George, Member, IEEE, Angel Lozano, Fellow, IEEE, and Martin Haenggi, Fellow, IEEE
Abstract—We consider the number of users associating with each base station in a cellular
network. Extending and unifying the characterizations for certain settings available in the
literature, we derive a result that is asymptotic in the strength of the shadowing, yet
otherwise universally valid: it holds for every network geometry and shadowing distribution.
We then illustrate how this result provides excellent representations in various classes of
networks and with realistic shadowing strengths, evidencing broad applicability.
Index Terms—Cellular networks, user association, number of users, shadowing, stochastic
geometry, multiuser MIMO, Poisson point process, lattice networks, PPP networks.

Manuscript received August 1, 2018; revised September 28, 2018; accepted October 22, 2018.
Date of publication October 30, 2018; date of current version April 9, 2019. This work was
supported by MINECO/FEDER, UE under Project TEC2015-66228-P, in part by the European
Research Council through the H2020 Framework Programme/ERC under Grant 694974, and in part
by the U.S. NSF under Award CCF 1525904. The associate editor coordinating the review of
this paper and approving it for publication was C. Shen. (Corresponding author: Geordie
George.)
G. George and A. Lozano are with the Department of Information and Communication
Technologies, Universitat Pompeu Fabra, 08018 Barcelona, Spain (e-mail:
geordie.george@upf.edu; angel.lozano@upf.edu).
M. Haenggi is with the Department of Electrical Engineering, University of Notre Dame,
Notre Dame, IN 46556 USA (e-mail: mhaenggi@nd.edu).
Digital Object Identifier 10.1109/LWC.2018.2878579

I. MOTIVATION
THE ANALYSIS of cellular networks via Poisson point process (PPP) modeling of the base
station (BS) locations is a welcome complement, and sometimes even an outright alternative,
to the Monte-Carlo simulations that had long dominated system-level performance evaluations
[1]–[3]. Such analysis, seemingly fitting only for ad hoc networks, happens to be highly
relevant to cellular networks by virtue of the following result: regardless of the BS
locations, provided only that they are agnostic to the radio propagation, the distribution
of powers received at any user converges (asymptotically in the strength of the shadowing)
to what would be received from a PPP field of BSs [4]–[6]. This convergence, moreover, is
very evident for practical strengths of the shadowing. In hexagonal lattice networks, for
instance, PPP-based analyses are highly representative for shadowing standard deviations on
the order of 10 dB [6]–[8], well in line with the typical values encountered in
macrocellular deployments.
An issue that arises in the analysis of cellular networks is the modeling of the number of
users associating with each BS. While hidden if each BS is assumed to communicate with a
single user per signaling resource, this issue becomes material once that assumption is
removed, say in the face of multiantenna BSs capable of implementing multiuser
multiple-input multiple-output transmission [9]. Even in a lattice network whose cells are
of equal size, the shadowing and the stochastic nature of the user locations would induce
disparities in the number of users associating with different BSs, and such disparities are
bound to increase in irregular networks.
This letter addresses the stochastic modeling of the number of users per BS. An approximate
characterization available in the literature is discussed, and a new asymptotic
characterization is provided and tested.

II. MODELING FEATURES
Our focus is on cellular networks where users associate with the BS from which they enjoy
the strongest large-scale channel gain. Let us next describe the essential modeling features
of the networks to which the considerations in the sequel apply.

A. Geometries
In terms of the positions of BSs and users, virtually every cellular scenario of relevance
is encompassed. The BS locations may conform to any stationary and ergodic point process
$\Phi_b \subset \mathbb{R}^2$ of density λb, or any realization thereof, say a lattice
network. This implies that the density of BSs within any region converges to λb > 0 as this
region’s area grows [6]. Meanwhile, the user positions $\Phi_u \subset \mathbb{R}^2$ may
belong to any independent point process of density λu that is also stationary and ergodic.
Without loss of generality, a specific BS is set at the origin. In the random case, we
condition on a BS to be located at the origin; under expectation over Φb, this becomes the
typical BS. In the deterministic case, we pick an arbitrary BS and translate the coordinate
system so that this BS is located at the origin. In both cases, we label this BS at the
origin as the 0th BS. Denoting by K the number of users associating with such 0th BS, our
purpose is to inspect the distribution of K under expectations over Φu and Φb.

B. Large-Scale Gains
The large-scale channel gains include path loss with exponent η > 2 and shadowing that is
IID across links. Particularly, between the ℓth BS and the kth user served by the 0th BS,
the large-scale gain is

$$G_{\ell,(k)} = \frac{L_{\mathrm{ref}}}{r_{\ell,(k)}^{\eta}}\,\chi_{\ell,(k)}, \qquad \ell \in \mathbb{N}_0,\; k \in \{0, \ldots, K-1\}, \tag{1}$$

with $L_{\mathrm{ref}}$ the path loss intercept at a unit distance, $r_{\ell,(k)}$ the link
distance, and $\chi_{\ell,(k)}$ the shadowing coefficient having standard deviation σdB. The
shadowing can be arbitrarily distributed, with the only mild restriction that
$\mathbb{E}[\chi^{2/\eta}] < \infty$ to guarantee the asymptotic behavior advanced in
Section I.
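The association rule of Section II (each user picks the BS with the strongest large-scale gain in (1)) can be sketched as follows. This is a minimal illustration, assuming log-normal shadowing with 10 log10(χ) distributed as N(0, σdB²) and $L_{\mathrm{ref}} = 1$; the function names and the small lattice in the usage note are hypothetical and not taken from the letter.

```python
import numpy as np

def large_scale_gains(bs_xy, user_xy, eta=4.0, sigma_db=10.0, l_ref=1.0, rng=None):
    """Gain matrix G[l, k] = l_ref * chi[l, k] / r[l, k]**eta, following (1)."""
    rng = np.random.default_rng() if rng is None else rng
    # Pairwise BS-user distances, shape (num_bs, num_users).
    diff = bs_xy[:, None, :] - user_xy[None, :, :]
    r = np.linalg.norm(diff, axis=-1)
    # Assumed log-normal shadowing, IID across links: 10*log10(chi) ~ N(0, sigma_db^2).
    chi = 10.0 ** (sigma_db * rng.standard_normal(r.shape) / 10.0)
    return l_ref * chi / r**eta

def associate(gains):
    """Index of the serving BS for each user (strongest large-scale gain)."""
    return np.argmax(gains, axis=0)

def users_per_bs(gains):
    """Number of users associated with each BS."""
    return np.bincount(associate(gains), minlength=gains.shape[0])
```

With `sigma_db=0.0` the rule reduces to nearest-BS (Voronoi) association; for example, `users_per_bs(large_scale_gains(bs, users, sigma_db=0.0))` counts the users in each Voronoi cell of a hypothetical BS array `bs`.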
GEORGE et al.: DISTRIBUTION OF NUMBER OF USERS PER BS IN CELLULAR NETWORKS 521
III. DISTRIBUTION OF K

A. Shadowless Networks
Suppose that there is no shadowing (σdB = 0), such that the users associating with a BS are
strictly the ones positioned within its Voronoi cell. Given the area of the Voronoi cell,
K is a random variable with mean λu times that area [10]. From Φu and the distribution of
the cell area, one can then determine how K is distributed. For instance, if Φu is PPP and
the BSs conform to a regular lattice, then K is Poisson-distributed with mean K̄ = λu/λb
[10]. In turn, if both Φu and Φb are PPPs, then the probability mass function (PMF) of K is
tightly approximated by [11], [12]

$$f_K(k) \approx \frac{\Gamma(k + c)}{\Gamma(k + 1)\,\Gamma(c)}\,\frac{\bar{K}^{k} c^{c}}{(c + \bar{K})^{k+c}}, \qquad k \in \mathbb{N}_0 \tag{2}$$

where c = 3.575 and K̄ = λu/λb is again the mean.
The characterization in (2) is appropriate to represent deployments where the BS locations
are truly irregular and shadowing is minimal. However, as developed next, care must be
exercised in other situations. In particular, (2) turns out to be inadequate for situations
where the PPP model for Φb intends to abstract the impact of shadowing on the propagation
rather than the irregularity of the actual BS locations themselves.

B. Impact of Shadowing
With shadowing, users need not associate with the BS in whose Voronoi cell they are
located, and hence the premise underpinning the foregoing subsection ceases to hold. Users
may now associate with more distant BSs, increasingly so as σdB grows large, and
characterizing the exact distribution of K in broad generality appears unwieldy.
To bypass this obstacle, we proceed to establish the distribution of K for σdB → ∞ and then
test the validity of the result for values of interest for σdB.
Lemma 1: For σdB → ∞, the distribution of K becomes Poisson with mean K̄ = λu/λb, i.e.,

$$f_K(k) = \frac{\bar{K}^{k} e^{-\bar{K}}}{k!}, \qquad k \in \mathbb{N}_0. \tag{3}$$

Proof: Consider a region of finite area A on $\mathbb{R}^2$ having Aλb BSs and Aλu users
placed arbitrarily. As σdB → ∞, the number of users associated with each BS becomes
binomially distributed,

$$K \sim \mathcal{B}\!\left(A\lambda_u, \frac{1}{A\lambda_b}\right), \tag{4}$$

because in the limit each user has equal probability, $\frac{1}{A\lambda_b}$, to associate
with any of the Aλb BSs. Letting A → ∞ while keeping λu/λb constant, the binomial
distribution converges to the Poisson distribution with mean λu/λb [10].
We hasten to emphasize that the foregoing result, while asymptotic, is general in terms of
the network geometry: it holds for arbitrary placements of BSs and users because, as the
shadowing strengthens without bound, it comes to dominate over the path loss and,
ultimately, all BSs become equally likely to be the one that a user associates with.

IV. APPLICABILITY
Next, we examine the applicability of Lemma 1 to relevant classes of settings with σdB
having values of practical interest. In every case, the user locations Φu conform to a
homogeneous PPP and the shadowing is log-normal. In the histograms obtained through
Monte-Carlo, the number of network snapshots is set to ensure a 95% confidence interval of
±0.07% (absolute value) in the corresponding CDFs.

A. Deterministic Lattice Networks
Let the BSs be located on a lattice, in which case, as indicated earlier, K is
Poisson-distributed in the absence of shadowing (σdB = 0). Since each user associates with
one and only one BS, the number of users-BS associations needs to be conserved regardless of
σdB. Also, the cells are of equal size. Intuitively then, by sheer symmetry, the
distribution of K must remain unchanged for σdB > 0 because the probability that the 0th BS
loses the association of some number of users due to unfavorable shadowing necessarily
equals the probability of gaining the same number of users from other cells due to
favorable shadowing.
Indeed, for every σdB, the users that stay with the 0th BS form an independent thinning of
the users associating with that BS for σdB = 0, so their number is always Poisson.
Similarly, the users newly associating with the 0th BS form an independent thinning of the
other users and thus that number is also Poisson. The sum of two independent Poisson
quantities is itself Poisson, and the mean must stay constant by symmetry.
For this setting, therefore, the applicability of Lemma 1 extends to every value of σdB. No
matter the shadow fading, K is Poisson-distributed.

B. Deterministic Double-Lattice Networks
Consider now the network defined by

$$\Phi_b = \mathbb{Z}^2 \cup (2\mathbb{Z})^2 + (1/2, 1/2) \tag{5}$$

and depicted in the inset of Fig. 2. Amounting to a superposition of two lattices, this
network features two distinct cell sizes with an area ratio of 7/4. While, without
shadowing, each cell size maps to a distinct Poisson distribution for K, for σdB → ∞ all
cells must abide by a common Poisson distribution as per Lemma 1.
Example 1: For η = 4 and K̄ = 10, with the PPP of users realized over the network described
by (5), histograms of K for the two cell sizes and different values of σdB are plotted in
Fig. 1. Also shown is a Poisson PMF with mean K̄, which is the limiting distribution for
both cell sizes.
Example 1 illustrates the rapid transition to the result spelled by Lemma 1, and thus the
broad scope of validity thereof. This observation is bolstered by Fig. 2, where the
convergence is demonstrated in terms of the variance of K. For σdB = 0 dB, such variance
equals 10.94 and 6.25, respectively for large and small cells, with the ratio of these
values equaling that of the areas, 7/4. But, already for σdB = 10–14 dB, the variance has
closely approached the limiting value of 10 for both cell sizes.
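The two characterizations of Section III can be compared numerically. The sketch below evaluates the approximation (2) and the Poisson limit (3) for K̄ = 10, computes their means and variances by direct summation, and checks how close the binomial in (4) is to the Poisson PMF for a finite region. The function names, the truncation point `k_max`, and the region size (1000 BSs, 10000 users) are illustrative assumptions, not values from the letter.

```python
import math

def pmf_ppp_shadowless(k, k_bar, c=3.575):
    """Approximation (2): PMF of K for PPP BSs and PPP users without shadowing."""
    log_p = (math.lgamma(k + c) - math.lgamma(k + 1) - math.lgamma(c)
             + k * math.log(k_bar) + c * math.log(c)
             - (k + c) * math.log(c + k_bar))
    return math.exp(log_p)

def pmf_poisson(k, k_bar):
    """Limit (3): Poisson PMF with mean k_bar."""
    return math.exp(k * math.log(k_bar) - k_bar - math.lgamma(k + 1))

def pmf_binomial(k, n, p):
    """Binomial PMF appearing in (4)."""
    return math.comb(n, k) * p**k * (1.0 - p) ** (n - k)

def mean_and_variance(pmf, k_max=400):
    """Mean and variance of a PMF on {0, 1, ...}, truncated at k_max (assumed ample)."""
    mean = sum(k * pmf(k) for k in range(k_max))
    second = sum(k * k * pmf(k) for k in range(k_max))
    return mean, second - mean**2

k_bar = 10.0
m2, v2 = mean_and_variance(lambda k: pmf_ppp_shadowless(k, k_bar))
m3, v3 = mean_and_variance(lambda k: pmf_poisson(k, k_bar))

# Hypothetical finite region for (4): A*lambda_b = 1000 BSs, A*lambda_u = 10000 users.
n_bs, n_users = 1000, 10000
gap = max(abs(pmf_binomial(k, n_users, 1.0 / n_bs) - pmf_poisson(k, k_bar))
          for k in range(60))
print(m2, v2, m3, v3, gap)
```

Both PMFs share the mean K̄, but (2) is a negative binomial law whose variance is K̄ + K̄²/c, roughly 37.97 for K̄ = 10, against 10 for the Poisson limit. This gap is what the variance curves in the letter track as σdB grows.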
Fig. 1. In solid, histograms of K in a double-lattice network with η = 4 and σdB = 0, 10
and 14 dB (in shaded and in clear for the small and large cells, respectively); users are
PPP distributed with K̄ = 10. In dashed, a Poisson PMF with mean K̄.

Fig. 2. In solid, variance of K as a function of σdB in a double-lattice network with
η = 4; users are PPP distributed with K̄ = 10. In dashed, variance of a Poisson PMF with
mean K̄.

C. PPP Networks
Suppose now that the BS locations conform themselves to a PPP, specifically Φb = Φ ∪ {o}
where $\Phi \subset \mathbb{R}^2$ is a homogeneous PPP and o denotes the origin [13]. Then,
by Slivnyak’s theorem, the central BS becomes the typical BS under expectation over Φb. In
the example that follows, BSs (1000 of them on average) are randomly placed around the
central one and the number of users associating with that central one is counted over
Monte-Carlo snapshots.
Example 2: For η = 4 and K̄ = 10, with the PPP of users realized over a PPP network of BSs,
Fig. 3 contrasts the histogram of K for different values of σdB against (2) and against a
Poisson PMF with mean K̄, corresponding respectively to the distribution of K for σdB = 0
and σdB → ∞.
Example 2 again illustrates how the distribution of K evolves with σdB. For σdB = 0, it is
very well approximated by (2), but, as the shadowing intensifies, it quickly morphs into a
Poisson distribution as per Lemma 1. For σdB = 10 dB, and decidedly for σdB = 14 dB, the
Poisson distribution is already an excellent match.
A complementary perspective on the evolution from (2) to the Poisson distribution within
Example 2 is provided in Fig. 4, which depicts the variance of K as a function of σdB. The
convergence of that variance to the variance of the Poisson PMF is almost complete for
σdB = 10–14 dB. Notice the slight crossover in the vicinity of σdB = 0 dB, which serves as a
measure of the (high) accuracy of (2) in this limit.
Altogether then, as in the lattice and double-lattice cases, Lemma 1 is seen to apply to
practical values of σdB also in PPP networks. Moreover, all these network types are extreme
cases. For randomly perturbed lattices, which have been shown to tightly fit empirical data
from cellular operators [14], this conclusion is reinforced even further.

Fig. 3. In solid, histograms of K in a PPP network with η = 4 and σdB = 0, 10 and 14 dB;
users are PPP distributed with K̄ = 10. In dotted, the PMF in (2). In dashed, a Poisson PMF
with mean K̄.

Fig. 4. In solid, variance of K as a function of σdB in a PPP network with η = 4; users are
PPP distributed with K̄ = 10. In dotted, variance of the PMF in (2). In dashed, variance of
a Poisson PMF with mean K̄.

V. DISCUSSION
Recapitulating, we can conclude the following in terms of how to stochastically model the
number of users associating with each BS in a cellular network.
• For highly irregular networks subject to minimal shadowing, say certain indoor
microcellular systems, the distribution in (2) represents an appropriate and precise
approximation.
• For arbitrary networks subject to moderate or strong shadowing, say macrocellular
systems, the Poisson distribution is the pertinent model. In the special case that the
network conforms to a lattice, this is the case even if the shadowing is weak or
nonexistent.
Since, as the shadowing intensifies, the number of users per BS in any network
progressively behaves as in networks with equal-size cells (where it is always
Poisson-distributed), we can further affirm that the shadowing acts as an equalizer of the
effective cell areas. This phenomenon, whereby the number of users per BS reduces its
variance and becomes more predictable, is beneficial in terms of resource provisioning. In
this respect, shadowing turns out to be operationally beneficial.
We hasten to recall that all the foregoing observations rely on the premise that the BS
locations are agnostic to the radio propagation, dominated by deployment opportunities and
restrictions associated with terrain, permits, infrastructure, power supply, and backhaul.
At the other extreme in terms of planning we would have networks whose BS locations are
optimized for coverage and service, with none of those restrictions. Such networks are best
modeled by regular lattices with no shadowing [15], and again the number of users per BS
would then be Poisson-distributed.
The broad suitability of a model as simple as the Poisson distribution is a welcome
finding, especially if a further layer of modeling is to be overlaid so as to incorporate
admission control and user selection. Such further modeling, beyond the scope of this
letter, is an interesting follow-up problem.

REFERENCES
[1] J. G. Andrews, F. Baccelli, and R. K. Ganti, “A tractable approach to coverage and rate
in cellular networks,” IEEE Trans. Commun., vol. 59, no. 11, pp. 3122–3134, Nov. 2011.
[2] H. ElSawy, E. Hossain, and M. Haenggi, “Stochastic geometry for modeling, analysis, and
design of multi-tier and cognitive cellular wireless networks: A survey,” IEEE Commun.
Surveys Tuts., vol. 15, no. 3, pp. 996–1019, 3rd Quart., 2013.
[3] B. Błaszczyszyn, M. Haenggi, P. Keeler, and S. Mukherjee, Stochastic Geometry Analysis
of Cellular Networks. Cambridge, U.K.: Cambridge Univ. Press, 2018.
[4] H. P. Keeler, N. Ross, and A. Xia, “When do wireless network signals appear Poisson?”
Bernoulli, vol. 24, no. 3, pp. 1973–1994, Aug. 2018.
[5] N. Ross and D. Schuhmacher, “Wireless network signals with moderately correlated
shadowing still appear Poisson,” IEEE Trans. Inf. Theory, vol. 63, no. 2, pp. 1177–1198,
Feb. 2017.
[6] B. Błaszczyszyn, M. K. Karray, and H. P. Keeler, “Wireless networks appear Poissonian
due to strong shadowing,” IEEE Trans. Wireless Commun., vol. 14, no. 8, pp. 4379–4390,
Aug. 2015.
[7] G. George, R. K. Mungara, A. Lozano, and M. Haenggi, “Ergodic spectral efficiency in
MIMO cellular networks,” IEEE Trans. Wireless Commun., vol. 16, no. 5, pp. 2835–2849,
May 2017.
[8] G. George, A. Lozano, and M. Haenggi, “Massive MIMO forward link analysis for cellular
networks,” 2018 [Online]. Available: https://arxiv.org/abs/1811.00110
[9] R. W. Heath and A. Lozano, Foundations of MIMO Communication. Cambridge, U.K.:
Cambridge Univ. Press, 2018.
[10] M. Haenggi, Stochastic Geometry for Wireless Networks. Cambridge, U.K.: Cambridge
Univ. Press, 2012.
[11] H. ElSawy, A. Sultan-Salem, M. S. Alouini, and M. Z. Win, “Modeling and analysis of
cellular networks using stochastic geometry: A tutorial,” IEEE Commun. Surveys Tuts.,
vol. 19, no. 1, pp. 167–203, 1st Quart., 2017.
[12] H. ElSawy and E. Hossain, “Two-tier HetNets with cognitive femtocells: Downlink
performance modeling and analysis in a multichannel environment,” IEEE Trans. Mobile
Comput., vol. 13, no. 3, pp. 649–663, Mar. 2014.
[13] Y. Wang, M. Haenggi, and Z. Tan, “The meta distribution of the SIR for cellular
networks with power control,” IEEE Trans. Commun., vol. 66, no. 4, pp. 1745–1757,
Apr. 2018.
[14] A. Guo and M. Haenggi, “Spatial stochastic models and metrics for the structure of
base stations in cellular networks,” IEEE Trans. Wireless Commun., vol. 12, no. 11,
pp. 5800–5812, Nov. 2013.
[15] A. Guo and M. Haenggi, “Joint spatial and propagation models for cellular networks,”
in Proc. IEEE Glob. Commun. Conf., San Diego, CA, USA, Dec. 2015, pp. 1–6.
Joint Power, Altitude, Location and Bandwidth Optimization for UAV With
Underlaid D2D Communications
Wenhuan Huang, Zhaohui Yang, Cunhua Pan, Lu Pei, Ming Chen, Mohammad Shikh-Bahaei,
Maged Elkashlan, and Arumugam Nallanathan, Fellow, IEEE
Abstract—In this letter, we aim to maximize the rate of a device-to-device (D2D) pair for a
downlink unmanned aerial vehicle (UAV)-aided wireless communication system, where D2D users
coexist in an underlaying manner. We jointly optimize the transmit power of the UAV and D2D
users, the flying altitude and location of the UAV and ground terminals’ allocated
bandwidth. To solve this problem, an iterative algorithm with low complexity is accordingly
proposed. Simulation results show that the altitude of the UAV has an important impact on
the system performance.
Index Terms—UAV communications, D2D communications, altitude and location optimization,
power allocation, bandwidth allocation.

Manuscript received October 1, 2018; accepted October 25, 2018. Date of
was supported in part by the National Natural Science Foundation of China under Grant
61871128, Grant 61521061, and 61571125, and in part by the Engineering and Physical Science
Research Council through SENSE under Grant EP/P003486/1. The work of M. Elkashlan was
supported by the EPSRC under Project EP/N029666/1. The work of A. Nallanathan was supported
by EPSRC under Project EP/M016145/2. The associate editor coordinating the review of this
paper and approving it for publication was C. Huang. (Corresponding author: Cunhua Pan.)
W. Huang, L. Pei, and M. Chen are with the National Mobile Communications Research
Laboratory, Southeast University, Nanjing 210096, China (e-mail: huang_wenhuan@seu.edu.cn;
peilu@seu.edu.cn; chenming@seu.edu.cn).
Z. Yang and M. Shikh-Bahaei are with the Centre for Telecommunications Research, King’s
College London, London WC2B 4BG, U.K. (e-mail: yang.zhaohui@kcl.ac.uk;
m.sbahaei@kcl.ac.uk).
C. Pan, M. Elkashlan, and A. Nallanathan are with the School of Electronic Engineering and
Computer Science, Queen Mary University of London, London E1 4NS, U.K. (e-mail:
c.pan@qmul.ac.uk; maged.elkashlan@qmul.ac.uk; a.nallanathan@qmul.ac.uk).
Digital Object Identifier 10.1109/LWC.2018.2878706

I. INTRODUCTION
RECENTLY, wireless communication assisted by unmanned aerial vehicles (UAVs) has been
regarded as a promising technique which can provide economical wireless access for mobile
devices without deploying fixed network infrastructure [1]. Different from conventional
terrestrial communications, UAVs act as flying base stations (BSs) in UAV-aided wireless
communications and bring plenty of benefits. Owing to their agility and mobility, UAVs can
be deployed to support temporary or urgent events over a wide area, which enhances the
quality of service for ground terminals (GTs). Moreover, links between UAVs and GTs are
dominated by line-of-sight (LoS) connections, leading to enhanced data rate.
To fully reap the benefits of UAV-aided communications, it is crucial to exploit the UAV
mobility in a three-dimensional space. To address the UAV deployment challenge, an
efficient deployment approach based on the circle packing theory was proposed in [2]. For
capacity enhancement, Sharma et al. [3] presented a cost function based multiple UAVs
deployment model. By taking beamwidth into account, a joint UAV altitude and beamwidth
optimization problem for UAV-aided multiuser communication systems was studied in [4].
Through jointly optimizing altitude, beamwidth and bandwidth, the sum power was further
minimized in [5].
Apart from UAV-aided wireless communications, device-to-device (D2D) communication has been
regarded as one of the crucial technologies in future wireless communication networks [6].
D2D communication allows direct transmissions between users in proximity, which is helpful
in offloading network traffic and reducing end-to-end delay. Compared to the previous
investigations on D2D communication underlaying cellular networks [7], the coexistence of
UAVs and underlaid D2D communications will introduce new interference management
challenges. Unlike fixed BSs, the altitude of the UAV is adjustable and will influence
channel characteristics. Moreover, the impacts of the mobility of the UAV on D2D
communications should be analyzed. In [8], a UAV flight pattern selection problem for D2D
communications in disaster areas was studied. Mozaffari et al. [9] focused on the
performance analysis of the coexistence between the UAV and an underlaid D2D communication
network in a downlink scenario. However, to our best knowledge, there is no existing work
studying the performance of UAVs with underlaid D2D communications from the optimization
point of view.
In this letter, we aim to maximize the rate of a D2D pair for a downlink UAV-aided wireless
communication system, where D2D users coexist in an underlaying manner. We formulate the
problem of joint power, altitude, location and bandwidth optimization. In order to solve
this nonconvex rate maximization problem, we propose a low-complexity iterative algorithm.
It turns out to have attractive closed-form solutions for power allocation subproblem and
altitude planning subproblem.

II. SYSTEM MODEL AND PROBLEM FORMULATION
We consider a downlink UAV-aided wireless communication system¹ with one flying UAV, K GTs
and one D2D pair which coexists in an underlaying manner. The horizontal and vertical
locations of the D2D receiver, D2D transmitter and GT k are denoted by $\boldsymbol{r} = (0, 0)$,
$\boldsymbol{t} = (t(1), t(2))$ and $\boldsymbol{g}_k = (g_k(1), g_k(2))$, respectively. The
altitude of the D2D pair and GTs are assumed to be zero. The UAV is deployed as a flying BS
at an altitude H with horizontal and vertical location $\boldsymbol{u} = (x, y)$.

¹ We consider the system where all terminals are equipped with a single isotropic antenna.
When each user is equipped with one antenna, the optimization problem becomes more
tractable. Moreover, it is appealing to equip each device with only one antenna to reduce
the implementation complexity.
This work is licensed under a Creative Commons Attribution 3.0 License. For more information, see http://creativecommons.org/licenses/by/3.0/
HUANG et al.: JOINT POWER, ALTITUDE, LOCATION AND BANDWIDTH OPTIMIZATION FOR UAV WITH UNDERLAID D2D COMMUNICATIONS 525
We consider the case that GTs and the D2D receiver are III. S OLUTION A PPROACH
located outdoors, and the channel between the UAV and each Due to (4a) and (4d), Problem (4) is a nonconvex problem.
GT (D2D receiver) is dominated by the LoS path. The down- It is difficult to obtain its globally optimal solution. In the
link channel power gain between the UAV and GT k is given following, we propose a low-complexity iterative algorithm.
by [4] Since (4a) is a monotonically decreasing function of pku , the
β0 data rate constraints in (4d) for GTs should hold with equality
hku = , (1) at the optimal point. As a result, we have
u − g k 2 + H 2
Rmin
where β0 denotes the channel power gain between the UAV 2 ak
−1
and GTs (D2D receiver) at the unit distance, and · is pku ∗ = (u − g k 2 + H 2 ) pd hkd + σ 2 , (5)
β0
the Euclid norm. Similarly, the downlink channel power gain
between the UAV and the D2D receiver is where pku ∗ denotes the optimal solution of pku .
β0 To ensure that Problem (4) is feasible, we employ the fea-
h0u = 2
. (2) sibility checking algorithm. The feasibility checking problem
u + H 2 is the minimization of transmit power of the UAV subject to
The downlink achievable rate of GT k can be expressed as constraints (4b), (4d)-(4g). If the minimal sum power of the
UAV is larger than Pumax , Problem (4) is infeasible. Since pku ∗
g pku β0 is the minimal value of pku according to (4d), the feasibility
rk = ak log2 1 + , (3)
pd hkd + σ 2 u − g k 2 + H 2 checking problem is equivalent to obtain the minimum value
v ∗ of K p
k =1 k
u ∗ , when p = 0, H = H
d min , and constrains
where ak denotes the allocated bandwidth proportion for GT (4f) and (4g) are satisfied. An exhaustive algorithm can be
k, pku denotes the transmit power from UAV to GT k, pd is adopted to solve it. With fixed u, the optimal bandwidth allo-
the transmit power of the D2D transmitter, hkd is the channel cation a can be obtained via the interior point method. The
power gain between the D2D transmitter and GT k, and σ 2 is optimal u is obtained via the two-dimensional (2D) exhaustive
the power of the additive white Gaussian noise. search. As a result, Problem (4) is feasible if and only if when
We aim for maximizing the rate of the D2D pair while Pumax ≥ v ∗ .
satisfying the minimal rate requirements of all GTs via power, altitude, location and bandwidth optimization. Mathematically, the D2D pair achievable rate maximization problem is

$$\max_{p_d,\,\mathbf{p}_u,\,\mathbf{a},\,\mathbf{u},\,H} \ \log_2\!\left(1 + \frac{p_d h_0^d}{\sum_{k=1}^{K} a_k p_k^u \frac{\beta_0}{H^2 + \|\mathbf{u}\|^2} + \sigma^2}\right) \tag{4a}$$
$$\text{s.t.} \quad 0 \le p_d \le P_d^{\max}, \tag{4b}$$
$$0 \le \sum_{k=1}^{K} p_k^u \le P_u^{\max}, \tag{4c}$$
$$r_k^g \ge R_{\min}, \quad \forall k = 1, \ldots, K, \tag{4d}$$
$$H_{\min} \le H \le H_{\max}, \tag{4e}$$
$$a_k \ge 0, \quad \forall k = 1, \ldots, K, \tag{4f}$$
$$\sum_{k=1}^{K} a_k = 1, \tag{4g}$$

where $\mathbf{p}_u = (p_1^u, \ldots, p_K^u)$, $\mathbf{a} = (a_1, \ldots, a_K)$, $h_0^d$ is the channel power gain between the D2D transmitter and the corresponding receiver, and $[H_{\min}, H_{\max}]$ denotes the feasible region of the UAV's altitude $H$. It should be noted that the Doppler frequency shift is low and can be neglected for the slow speed of D2D devices, and that the D2D channel is frequency-flat over the whole bandwidth according to [10] and [11]. Note that the D2D pair reuses the whole bandwidth, which indicates that $a_k$ is interpreted as the probability of receiving interference from GT $k$, and the rate of the D2D pair can be modeled as (4a) according to [11]. The power restrictions for the D2D pair and the GTs are respectively formulated in (4b) and (4c). Equation (4d) ensures that each GT satisfies the minimum rate requirement. Equations (4f) and (4g) indicate the bandwidth allocation requirements.

Based on (5) and the fact that $\log_2(1+x)$ is a monotonically increasing function, Problem (4) is equivalent to

$$\max_{p_d,\,\mathbf{a},\,\mathbf{u},\,H} \ \frac{p_d h_0^d}{\sum_{k=1}^{K} a_k \left(2^{\frac{R_{\min}}{a_k}} - 1\right)\left(p_d h_k^d + \sigma^2\right) \frac{\|\mathbf{u} - \mathbf{g}_k\|^2 + H^2}{H^2 + \|\mathbf{u}\|^2} + \sigma^2} \tag{6a}$$
$$\text{s.t.} \quad 0 \le \sum_{k=1}^{K} \left(2^{\frac{R_{\min}}{a_k}} - 1\right) \frac{\left(\|\mathbf{u} - \mathbf{g}_k\|^2 + H^2\right)\left(p_d h_k^d + \sigma^2\right)}{\beta_0} \le P_u^{\max}, \tag{6b}$$
$$\text{(4b), (4e)-(4g).} \tag{6c}$$

To solve Problem (6), an iterative algorithm is proposed.

A. Optimal Power Allocation
With fixed $\mathbf{a}$, $\mathbf{u}$ and $H$ in Problem (6), the power allocation problem is given by

$$\max_{p_d} \ \frac{p_d h_0^d D_0}{p_d M + \sigma^2 (N + D_0)} \tag{7a}$$
$$\text{s.t.} \quad 0 \le p_d \le \bar{P}_d^{\max}, \tag{7b}$$

where $D_0 = \|\mathbf{u}\|^2 + H^2$, $M = \sum_{k=1}^{K} a_k A_k D_k h_k^d$, $A_k = 2^{\frac{R_{\min}}{a_k}} - 1$, $D_k = \|\mathbf{u} - \mathbf{g}_k\|^2 + H^2$, $N = \sum_{k=1}^{K} a_k A_k D_k$, and $\bar{P}_d^{\max} = \min\left\{P_d^{\max}, \frac{\beta_0 P_u^{\max} - \sigma^2 \sum_{k=1}^{K} A_k D_k}{\sum_{k=1}^{K} A_k D_k h_k^d}\right\}$.
Observing that (7a) is an increasing function of $p_d$, the optimal power solution of Problem (7) is $p_d^* = \bar{P}_d^{\max}$.

B. Altitude and Location Planning
Then, we investigate the altitude and location planning with fixed D2D transmit power and bandwidth allocation. Since $\frac{p_d h_0^d}{x + \sigma^2}$ is a decreasing function with $x > 0$, the altitude and
location planning problem can be formulated as

$$\min_{\mathbf{u},\,H} \ \sum_{k=1}^{K} B_k \frac{\|\mathbf{u} - \mathbf{g}_k\|^2 + H^2}{H^2 + \|\mathbf{u}\|^2} \tag{8a}$$
$$\text{s.t.} \quad \sum_{k=1}^{K} A_k C_k \left(\|\mathbf{u} - \mathbf{g}_k\|^2 + H^2\right) \le \beta_0 P_u^{\max}, \tag{8b}$$
$$H_{\min} \le H \le H_{\max}, \tag{8c}$$

where $C_k = p_d h_k^d + \sigma^2$ and $B_k = a_k A_k C_k$. To solve Problem (8), we first obtain the optimal $H$ under given $\mathbf{u}$, and then adopt the 2D exhaustive search to find the optimal solution of $\mathbf{u}$ to Problem (8). Under fixed $\mathbf{u}$, constraint (8b) is

$$H \le \sqrt{\frac{\beta_0 P_u^{\max} - \sum_{k=1}^{K} A_k \|\mathbf{u} - \mathbf{g}_k\|^2 C_k}{\sum_{k=1}^{K} A_k C_k}} \triangleq H_0. \tag{9}$$

It should be noticed that in order to make the altitude planning problem feasible, $H_{\min}$ should satisfy $H_{\min} \le H_0$. Assuming it is satisfied, constraint (8c) is transferred to

$$H_{\min} \le H \le \bar{H}_{\max}, \tag{10}$$

where $\bar{H}_{\max} = \min\{H_{\max}, H_0\}$. Therefore, the altitude planning problem with fixed $\mathbf{u}$ becomes

$$\min_{H} \ \sum_{k=1}^{K} B_k \frac{\|\mathbf{u} - \mathbf{g}_k\|^2 + H^2}{H^2 + \|\mathbf{u}\|^2} \tag{11a}$$
$$\text{s.t.} \quad \text{(10)}. \tag{11b}$$

Defining function $b(x) = \sum_{k=1}^{K} B_k \frac{\|\mathbf{u} - \mathbf{g}_k\|^2 + x}{x + \|\mathbf{u}\|^2}$, we have

$$b'(x) = \frac{\|\mathbf{u}\|^2 \sum_{k=1}^{K} B_k - \sum_{k=1}^{K} B_k \|\mathbf{u} - \mathbf{g}_k\|^2}{\left(x + \|\mathbf{u}\|^2\right)^2}, \tag{12}$$

and we consider the optimal altitude planning in the following two cases.
1) Case 1: $\sum_{k=1}^{K} B_k \|\mathbf{u}\|^2 \ge \sum_{k=1}^{K} B_k \|\mathbf{u} - \mathbf{g}_k\|^2$.
In this case, we have $b'(x) \ge 0$, and $b(x)$ is an increasing function. Since $H^2$ is an increasing function of $H$ when $H_{\min} \le H \le \bar{H}_{\max}$, the optimal altitude of Problem (11) is

$$H^*(\mathbf{u}) = H_{\min}. \tag{13}$$

2) Case 2: $\sum_{k=1}^{K} B_k \|\mathbf{u}\|^2 < \sum_{k=1}^{K} B_k \|\mathbf{u} - \mathbf{g}_k\|^2$.
In this case, $b(x)$ is a monotonically decreasing function. The optimal altitude of Problem (11) is

$$H^*(\mathbf{u}) = \bar{H}_{\max}. \tag{14}$$

After obtaining the optimal $H^*(\mathbf{u})$, we adopt a 2D exhaustive search to find the optimal solution of $\mathbf{u}$ to Problem (8).

C. Optimal Bandwidth Allocation
For Problem (6) with fixed transmit power, altitude and location, the bandwidth allocation problem is given by

$$\min_{\mathbf{a}} \ \sum_{k=1}^{K} a_k \left(2^{\frac{R_{\min}}{a_k}} - 1\right) C_k E_k \tag{15a}$$
$$\text{s.t.} \quad \sum_{k=1}^{K} \left(2^{\frac{R_{\min}}{a_k}} - 1\right) C_k Q_k \le \beta_0 P_u^{\max}, \tag{15b}$$
$$\text{(4f), (4g)}, \tag{15c}$$

where $E_k = \frac{Q_k}{H^2 + \|\mathbf{u}\|^2}$ and $Q_k = \|\mathbf{u} - \mathbf{g}_k\|^2 + H^2$. We define function $f(x) = x\, 2^{\frac{R_{\min}}{x}} - x$ for $x > 0$, and have

$$f''(x) = \frac{(\ln 2)^2 R_{\min}^2\, 2^{\frac{R_{\min}}{x}}}{x^3} > 0. \tag{16}$$

Based on (16), we observe that the objective function of Problem (15) is a convex function. Since constraint (15b) is convex, Problem (15) is a convex problem and it can be solved by analyzing the Karush-Kuhn-Tucker conditions with the same method adopted in [12, Appendix A].

D. Iterative Algorithm and Complexity Analysis
The iterative algorithm for solving Problem (4) is given in Algorithm 1, which yields a suboptimal solution. Due to the fact that the optimal solution is obtained for each subproblem in each step, Algorithm 1 is guaranteed to converge. For the feasibility checking in Algorithm 1, the complexity mainly lies in the interior point method. Since the dimension of the variables of the bandwidth allocation problem is $K$, the complexity of solving it by using the interior point method is $O(L_i K^3)$ [13, p. 487 and 569], where $L_i$ denotes the number of iterations for the interior point method. Let $X$ and $Y$ respectively denote the maximum searching distances of the horizontal and vertical location, and let $\Delta l_x$ and $\Delta l_y$ respectively denote the searching steps of the horizontal and vertical location. Since the 2D exhaustive search method is used, the complexity of the feasibility checking algorithm is $O\!\left(\frac{L_i K^3 XY}{\Delta l_x \Delta l_y}\right)$. For each iteration in Algorithm 1, the major complexity lies in solving the altitude and location planning problem. Since solving Problem (8) with fixed $\mathbf{u}$ involves a complexity of $O(K)$ according to (13) and (14), the complexity of solving Problem (8) by the 2D exhaustive search method is $O\!\left(\frac{KXY}{\Delta l_x \Delta l_y}\right)$. Therefore, the total complexity of Algorithm 1 is $O\!\left(\frac{KXY}{\Delta l_x \Delta l_y}\left(L_o + L_i K^2\right)\right)$, where $L_o$ is the number of outer iterations.

Algorithm 1: Iterative Algorithm
1: Check the feasibility of Problem (4). If Problem (4) is infeasible, terminate the algorithm. Otherwise, initialize a feasible solution $(p_d^{(0)}, \mathbf{a}^{(0)}, \mathbf{u}^{(0)}, H^{(0)})$ and $(p_k^u)^{(0)}$ according to (5). Set the iteration number $n = 1$.
2: repeat
3: With fixed $\mathbf{a}^{(n-1)}$, $\mathbf{u}^{(n-1)}$ and $H^{(n-1)}$, obtain the optimal $p_d^{(n)}$ of Problem (7).
4: With fixed $p_d^{(n)}$ and $\mathbf{a}^{(n-1)}$, obtain the optimal $\mathbf{u}^{(n)}$ and $H^{(n)}$ of Problem (8).
5: With fixed $p_d^{(n)}$, $\mathbf{u}^{(n)}$ and $H^{(n)}$, obtain the optimal $\mathbf{a}^{(n)}$ of Problem (15), and $(p_k^u)^{(n)}$ according to (5).
6: Set $n = n + 1$.
7: until the objective function (4a) converges.
8: Output $\mathbf{p}_u^* = \left((p_1^u)^{(n)}, \cdots, (p_K^u)^{(n)}\right)$, $p_d^* = p_d^{(n)}$, $\mathbf{u}^* = \mathbf{u}^{(n)}$, $H^* = H^{(n)}$ and $\mathbf{a}^* = \mathbf{a}^{(n)}$.
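The altitude and location planning step of Section III-B can be sketched in a few lines. The following is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`optimal_altitude`, `objective_8a`, `search_location`) and the grid arguments `xs`, `ys` are hypothetical, and the per-GT quantities $A_k$, $B_k$, $C_k$ are assumed to be precomputed from the current power and bandwidth allocation.

```python
import numpy as np


def optimal_altitude(u, g, A, B, C, beta0, Pu_max, H_min, H_max):
    """Closed-form altitude for a fixed horizontal location u, following (9)-(14).

    u: (2,) UAV horizontal location; g: (K, 2) GT locations.
    A, B, C: (K,) arrays holding A_k, B_k = a_k A_k C_k and C_k.
    Returns H*(u), or None when constraint (8b) cannot be met at this u.
    """
    d2 = np.sum((g - u) ** 2, axis=1)            # ||u - g_k||^2
    slack = beta0 * Pu_max - np.sum(A * C * d2)  # numerator under the root in (9)
    if slack < 0:
        return None
    H0 = np.sqrt(slack / np.sum(A * C))          # upper bound H_0 from (9)
    if H_min > H0:
        return None                              # interval (10) would be empty
    H_bar = min(H_max, H0)
    # The sign of b'(x) in (12) decides between Case 1 and Case 2.
    if np.dot(u, u) * np.sum(B) >= np.sum(B * d2):
        return H_min                             # Case 1, eq. (13)
    return H_bar                                 # Case 2, eq. (14)


def objective_8a(u, H, g, B):
    """Objective (8a) evaluated at (u, H)."""
    d2 = np.sum((g - u) ** 2, axis=1)
    return np.sum(B * (d2 + H ** 2)) / (H ** 2 + np.dot(u, u))


def search_location(xs, ys, g, A, B, C, beta0, Pu_max, H_min, H_max):
    """2D exhaustive search over a grid; returns (value, u, H) or None."""
    best = None
    for x in xs:
        for y in ys:
            u = np.array([x, y], dtype=float)
            H = optimal_altitude(u, g, A, B, C, beta0, Pu_max, H_min, H_max)
            if H is None:
                continue                         # grid point violates (8b)
            val = objective_8a(u, H, g, B)
            if best is None or val < best[0]:
                best = (val, u, H)
    return best
```

Each grid point costs $O(K)$ operations, which matches the per-location complexity stated in the text; the grid resolution plays the role of $\Delta l_x$ and $\Delta l_y$.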
HUANG et al.: JOINT POWER, ALTITUDE, LOCATION AND BANDWIDTH OPTIMIZATION FOR UAV WITH UNDERLAID D2D COMMUNICATIONS 527
Fig. 1. Number of iterations using the proposed method.
Fig. 2. D2D achievable rate versus minimal rate demand.

IV. NUMERICAL RESULTS
We consider a scenario where there are K = 3 GTs and one D2D pair randomly distributed in a circular area with radius 300 m. The distance between the D2D transmitter and the corresponding receiver is set to 30 m. We set the total bandwidth of the system as 10 MHz, $\beta_0 = 1.42 \times 10^{-4}$, $\sigma^2 = -169$ dBm/Hz, $H_{\min} = 50$ m, $H_{\max} = 500$ m, $P_d^{\max} = 10$ dBm and $P_u^{\max} = 30$ dBm.
Similar to [14], the channel power gain $h_k^d$ is modeled as $\beta_0 \eta d^{-3}$, $k = 0, 1, \ldots, K$, where $d$ represents distance and $\eta$ represents the Rayleigh fading coefficient, which follows the exponential distribution with unit mean. We compare the proposed algorithm with the following three algorithms: fixed altitude and location algorithm with optimized bandwidth (labeled as ‘FAL’), fixed bandwidth algorithm with optimized altitude and location (labeled as ‘FB’), and the near globally optimal algorithm via running the proposed iterative algorithm with 1000 initial points (labeled as ‘NGO’).
Firstly, we study the convergence behaviour of the proposed algorithm. Fig. 1 illustrates the achievable rate of the D2D pair versus the number of iterations when $H_{\min} = 50$ m. It shows that the achievable rate of the D2D pair increases monotonically and converges rapidly.
The achievable rate of the D2D pair with different approaches versus $R_{\min}$ when $H_{\min} = 50$ m and $H_{\min} = 200$ m is illustrated in Fig. 2. We observe that the achievable rate of the D2D pair decreases when $H_{\min}$ is increased for all algorithms. Moreover, the altitude of the UAV cannot be too high or too low. If the altitude of the UAV is too high, the coverage region of the UAV is enhanced while the channel gain is weak, which leads to high transmit power from the UAV to satisfy the rate demand of GTs and severe interference to the D2D pair. If the altitude of the UAV is too low, the channel gains are high from the UAV to GTs and the D2D receiver, which will result in severe interference to the D2D pair. From Fig. 2, we observe that the proposed algorithm respectively achieves 20% and 10% average gain over the FB algorithm and the FAL algorithm when $H_{\min} = 50$ m. When $H_{\min} = 200$ m, the proposed algorithm respectively achieves 30% and 8% average gain over the FB algorithm and the FAL algorithm. This is because the proposed algorithm jointly optimizes transmit power, altitude, location and bandwidth. Considering the pre-allocated bandwidth of each GT, the FB algorithm performs worst among all approaches, which indicates that the bandwidth allocation dominates the altitude and location planning in enhancing the D2D achievable rate. It is also observed that the NGO algorithm outperforms the proposed algorithm at the cost of some additional computations.

V. CONCLUSION
In this letter, we aim to maximize the rate of a D2D pair with the coexistence between the UAV and an underlaid D2D communication network in a downlink scenario. Numerical results show that the proposed algorithm always outperforms the algorithms which partially optimize altitude, location or bandwidth. Moreover, the altitude of the UAV cannot be too high or too low because the altitude of the UAV has an important influence on the UAV-aided networks with underlaid D2D communications.

REFERENCES
[1] Y. Zeng, R. Zhang, and T. J. Lim, “Wireless communications with unmanned aerial vehicles: Opportunities and challenges,” IEEE Commun. Mag., vol. 54, no. 5, pp. 36–42, May 2016.
[2] M. Mozaffari, W. Saad, M. Bennis, and M. Debbah, “Efficient deployment of multiple unmanned aerial vehicles for optimal wireless coverage,” IEEE Commun. Lett., vol. 20, no. 8, pp. 1647–1650, Aug. 2016.
[3] V. Sharma, M. Bennis, and R. Kumar, “UAV-assisted heterogeneous networks for capacity enhancement,” IEEE Commun. Lett., vol. 20, no. 6, pp. 1207–1210, Jun. 2016.
[4] H. He, S. Zhang, Y. Zeng, and R. Zhang, “Joint altitude and beamwidth optimization for UAV-enabled multiuser communications,” IEEE Commun. Lett., vol. 22, no. 2, pp. 344–347, Feb. 2018.
[5] Z. Yang et al., “Joint altitude, beamwidth, location, and bandwidth optimization for UAV-enabled communications,” IEEE Commun. Lett., vol. 22, no. 8, pp. 1716–1719, Aug. 2018.
[6] M. N. Tehrani, M. Uysal, and H. Yanikomeroglu, “Device-to-device communication in 5G cellular networks: Challenges, solutions, and future directions,” IEEE Commun. Mag., vol. 52, no. 5, pp. 86–92, May 2014.
[7] C.-H. Yu, K. Doppler, C. B. Ribeiro, and O. Tirkkonen, “Resource sharing optimization for device-to-device communication underlaying cellular networks,” IEEE Trans. Wireless Commun., vol. 10, no. 8, pp. 2752–2763, Aug. 2011.
[8] E. Christy et al., “Optimum UAV flying path for device-to-device communications in disaster area,” in Proc. IEEE Int. Conf. Signal Syst., Denpasar, Indonesia, May 2017, pp. 318–322.
[9] M. Mozaffari, W. Saad, M. Bennis, and M. Debbah, “Unmanned aerial vehicle with underlaid device-to-device communications: Performance and tradeoffs,” IEEE Trans. Wireless Commun., vol. 15, no. 6, pp. 3949–3963, Jun. 2016.
[10] Q. Wu, G. Y. Li, W. Chen, and D. W. K. Ng, “Energy-efficient D2D overlaying communications with spectrum-power trading,” IEEE Trans. Wireless Commun., vol. 16, no. 7, pp. 4404–4419, Jul. 2017.
[11] Z. Yang et al., “Downlink resource allocation and power control for device-to-device communication underlaying cellular networks,” IEEE Commun. Lett., vol. 20, no. 7, pp. 1449–1452, Jul. 2016.
[12] Z. Yang, C. Pan, W. Xu, H. Xu, and M. Chen, “Joint time allocation and power control in multicell networks with load coupling: Energy saving and rate improvement,” IEEE Trans. Veh. Technol., vol. 66, no. 11, pp. 10470–10485, Nov. 2017.
[13] S. Boyd and L. Vandenberghe, Convex Optimization. Cambridge, U.K.: Cambridge Univ. Press, 2004.
[14] T. D. Hoang, L. B. Le, and T. Le-Ngoc, “Energy-efficient resource allocation for D2D communications in cellular networks,” IEEE Trans. Veh. Technol., vol. 65, no. 9, pp. 6972–6986, Sep. 2016.
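The channel model and power budgets of Section IV can be reproduced with a short helper. This is only a sketch under stated assumptions: the names `dbm_to_watt` and `d2d_channel_gains` are hypothetical, powers are assumed to be converted from dBm to watts before use, and the fading power $\eta$ is drawn from a unit-mean exponential distribution as described in the text.

```python
import numpy as np


def dbm_to_watt(p_dbm):
    """Convert a power level from dBm to watts."""
    return 10.0 ** ((p_dbm - 30.0) / 10.0)


def d2d_channel_gains(distances_m, beta0=1.42e-4, rng=None):
    """Draw channel power gains beta0 * eta * d^-3 with eta ~ Exp(1).

    distances_m: array of link distances d in metres.
    beta0 defaults to the value quoted in Section IV.
    """
    rng = np.random.default_rng() if rng is None else rng
    d = np.asarray(distances_m, dtype=float)
    eta = rng.exponential(scale=1.0, size=d.shape)  # unit-mean fading power
    return beta0 * eta * d ** -3
```

For instance, `dbm_to_watt(30.0)` gives the 1 W UAV budget and `dbm_to_watt(10.0)` the 10 mW D2D budget used in the simulations.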
OTFS-Based Multiple-Access in High Doppler and Delay Spread Wireless Channels
Venkatesh Khammammetti and Saif Khan Mohammed

Abstract: We propose a multiple-access (MA) method in the uplink of an orthogonal time frequency space modulation-based wireless communication system where the channel has high Doppler and delay spread. Each user terminal (UT) is allocated delay-Doppler (DD) resource blocks which are spaced at equal intervals in the DD domain. This limits the corresponding time-frequency (TF) transmit signal to a sub-domain of the entire TF domain. By allocating non-overlapping portions to the TF transmit signal of different UTs, multi-user interference (MUI) is avoided. The proposed MA method is analytically shown to be MUI free and therefore has a significantly higher sum spectral efficiency when compared to other methods proposed in literature which utilize guard bands in the DD domain to reduce MUI.

Index Terms: Doppler, delay spread, OTFS, multiple-access.

I. INTRODUCTION
ACHIEVING high data rates in high mobility and delay spread wireless channels$^1$ is an objective of IMT-2020 [1]. In Orthogonal Frequency Division Multiple Access (OFDMA), high mobility results in inter-carrier interference which degrades channel capacity [3]. Recently, Orthogonal Time Frequency Space Modulation (OTFS) has been introduced, which has been shown to be more robust to Doppler spread as compared to OFDM based systems [4], [5]. OTFS modulation spreads each information symbol in the Delay Doppler (DD) domain over the entire time-frequency (TF) domain (unlike OFDMA systems) due to which full time-frequency diversity is realizable [5]. Each information symbol sees the same constant channel gain, which greatly simplifies the transmitter and receiver design [6]–[8]. The constant channel gain across the entire DD domain also helps in reducing the overhead of frequent channel estimation and feedback.
In this letter, we consider OTFS based uplink multiple-access (MA) with a single antenna at the base station (BS) and at the user terminals (UTs). As this topic is recent, there is little prior literature available. In [9], UTs have been multiplexed by allocating non-overlapping regions in the DD domain. When the delay and Doppler spreads are high, the information transmitted on a Delay-Doppler resource block (DDRB) could spread to neighbouring DDRBs, leading to severe multi-user interference (MUI). In [9] guard bands have been proposed to increase the separation between the DDRBs allocated to different UTs so as to reduce MUI. This however reduces the effective system capacity as guard bands are an overhead.
In this letter, we propose a novel MA method for OTFS based systems in high mobility and delay spread scenarios, which multiplexes UTs in both the DD and the TF domains in such a manner that there is no MUI and no guard bands are required. The DDRBs allocated to a UT are spaced at equal intervals in both the delay and the Doppler domain. Due to this, the corresponding TF signal occupies only a portion of the entire TF domain. This enables us to allocate non-overlapping portions of the TF domain to different UTs, which allows the BS to separate the received TF signals of different UTs, due to which there is no MUI. We derive the effective channel model for the proposed method and analytically show that it is MUI free. Simulations show that under high delay and Doppler spread the sum spectral efficiency (SE) achieved by the method in [9] saturates with increase in the power transmitted by each UT and is significantly smaller than that achieved by the proposed method.

II. SYSTEM MODEL
In OTFS modulation, the information symbols are transmitted in the delay Doppler (DD) domain instead of the time-frequency domain (as in OFDM). The DD domain is $T$ seconds wide along the delay domain and $\Delta f = 1/T$ Hz wide along the Doppler domain. The delay domain and the Doppler domain are further sub-divided into $M$ and $N$ equal parts respectively. A DD resource block (DDRB) is a combination of a sub-division along the delay domain together with a sub-division along the Doppler domain (see Fig. 1a). Information symbols are transmitted on DDRBs, and hence they get delayed and Doppler shifted by known integer multiples of $T/M$ and $\Delta f/N$ respectively, prior to transmission.$^2$
The received time domain signal is transformed back to the DD domain. Each multipath of the wireless channel induces delay and Doppler shift, and therefore an information symbol transmitted on a single DDRB could be received at multiple DDRBs. These received symbols can then be coherently combined thereby achieving delay and Doppler diversity, and hence robustness towards Doppler shift. However, the maximum possible delay spread $\tau_{\max}$ of the channel must be less than $T$ since otherwise two different paths having a delay difference of $T$ seconds would appear to the OTFS receiver as a

Manuscript received June 13, 2018; revised September 26, 2018; accepted October 24, 2018. Date of publication October 30, 2018; date of current version April 9, 2019. This work was supported in part by the Visvesvaraya Ph.D. Scheme of Ministry of Electronics and IT, Government of India, being implemented by Digital India Corporation and in part by EMR Funding from the Science and Engineering Research Board, Department of Science and Technology, Government of India. The associate editor coordinating the review of this paper and approving it for publication was W. Zhang. (Corresponding author: Saif Khan Mohammed.)
The authors are with the Department of Electrical Engineering, Indian Institute of Technology Delhi, New Delhi 110016, India (e-mail: saifkmohammed@gmail.com).
$^1$ One such scenario is that of an urban macro base station (BS) with a large coverage area serving mobile UTs travelling at high speed. An example of this is the standardized 3GPP Extended Typical Urban (ETU 300) channel model where the delay spread is 5 μs and the maximum Doppler shift is 300 Hz [2].
$^2$ As the delay resolution is $T/M$, the transmitted time domain signal can significantly change in a time scale of the order of $T/M$ seconds and hence its bandwidth is $M/T = M\Delta f$ Hz. The minimum possible Doppler shift is $\Delta f/N$ and hence for the receiver to identify such a small shift, the total time duration of the OTFS frame is $N/\Delta f = NT$ seconds (see Fig. 1b).
KHAMMAMMETTI AND MOHAMMED: OTFS-BASED MA IN HIGH DOPPLER AND DELAY SPREAD WIRELESS CHANNELS 529
single multipath and similarly in the Doppler domain the maximum possible Doppler shift $\nu_{\max}$ must be less than $\Delta f$, i.e., $T$ is chosen such that $\Delta f = 1/T > \nu_{\max}$ and $T > \tau_{\max}$.$^3$

Fig. 1. Proposed allocation in the Delay Doppler (DD) and Time Frequency (TF) domains.

Let $x_q[k,l]$ denote the information symbol transmitted by the $q$-th UT on the $(k,l)$-th DDRB. Using the Inverse Symplectic Finite Fourier Transform (Inverse SFFT), the DD information symbols are transformed to the TF domain ($NT$ sec × $M\Delta f$ Hz). The modulated TF symbol transmitted by the $q$-th UT on the $(m,n)$-th time-frequency RB (TFRB) is given by $X_q[n,m] = \frac{1}{MN} \sum_{k=0}^{N-1} \sum_{l=0}^{M-1} x_q[k,l]\, e^{-j2\pi\left(\frac{ml}{M} - \frac{nk}{N}\right)}$, $m = 0, 1, \ldots, (M-1)$, $n = 0, 1, \ldots, (N-1)$. These are then converted to time domain and transmitted [4], i.e., $s_q(t) = \sum_{m=0}^{M-1} \sum_{n=0}^{N-1} X_q[n,m]\, g_{tx}(t - nT)\, e^{j2\pi m \Delta f (t - nT)}$ where the transmit pulse $g_{tx}(\cdot)$ satisfies $\int g_{tx}^*(t - nT)\, g_{tx}(t)\, e^{-j2\pi m \Delta f (t - nT)}\, dt = \delta(m)\delta(n)$. The time-domain signal received at the BS is given by [4]

$$r(t) = \sum_{q=0}^{Q-1} \iint h_q(\tau,\nu)\, s_q(t-\tau)\, e^{j2\pi\nu(t-\tau)}\, d\nu\, d\tau + w(t) \tag{1}$$

where $Q$ is the total number of uplink UTs, $h_q(\tau,\nu)$ is the DD channel between the $q$-th UT and the BS, and $w(t)$ is the AWGN at the BS. At the BS, $r(t)$ is transformed to the TF domain using the Wigner transform, i.e., $Y[n,m] = \int g_{rx}^*(t - nT)\, r(t)\, e^{-j2\pi m \Delta f (t - nT)}\, dt$, which is then transformed to the DD domain through SFFT, i.e.,

$$\hat{x}[k',l'] = \sum_{n=0}^{N-1} \sum_{m=0}^{M-1} Y[n,m]\, e^{j2\pi\left(\frac{ml'}{M} - \frac{nk'}{N}\right)}. \tag{2}$$

Just as in [4], $h_q(\tau,\nu)$ has bounded support, i.e.,

$$h_q(\tau,\nu) = 0, \quad |\tau| > \tau_{\max} \ \text{or} \ |\nu| > \nu_{\max} \tag{3}$$

and that the transmit (Tx) and receive (Rx) pulse satisfy

$$\int g_{rx}^*(t-\tau)\, g_{tx}(t)\, e^{-j2\pi\nu(t-\tau)}\, dt = 0, \quad |\tau - nT| \le \tau_{\max},\ |\nu - m\Delta f| \le \nu_{\max},\ (m,n) \ne (0,0). \tag{4}$$

Under the conditions in (3), (4), from [4] we have

$$\hat{x}[k',l'] = \sum_{q=0}^{Q-1} \sum_{k=0}^{N-1} \sum_{l=0}^{M-1} x_q[k,l]\, H_q\big[(k'-k)_N, (l'-l)_M\big] + w[k',l'] \tag{5}$$

where $(x)_N$ is the unique non-negative value smaller than $N$ and congruent to $x$ modulo $N$.$^4$ We also assume that the Rx pulse satisfies $\int g_{rx}^*(t - nT)\, g_{rx}(t)\, e^{-j2\pi m \Delta f (t - nT)}\, dt = \delta(m)\delta(n)$ and therefore $w[k',l']$ in (5) are i.i.d. complex normal random variables having zero mean and say unit variance, i.e., $w[k',l'] \sim \mathcal{CN}(0,1)$. The advantage of OTFS is that the channel $H_q[\cdot,\cdot]$ in (5) is constant over the entire TF domain. However, from (5) it follows that the information symbols in the DD domain undergo 2-D circular convolution which could lead to severe MUI in high delay and Doppler spread scenarios.$^5$

III. PROPOSED OTFS-MA SCHEME
In this letter, we propose a novel MA scheme for $Q = g_1 g_2$ UTs. We assume that $g_1$ and $g_2$ divide $M$ and $N$ respectively. The $q$-th UT ($0 \le q < g_1 g_2$) is allocated DDRBs in the set $\mathcal{R}_q \triangleq \big\{(k,l) \mid k = \lfloor q/g_1 \rfloor + g_2 u,\ l = (q)_{g_1} + g_1 v,\ 0 \le u < N/g_2,\ 0 \le v < M/g_1\big\}$. The corresponding TF symbols are given by

$$X_q[n,m] = \frac{1}{MN} \sum_{(k,l)\in\mathcal{R}_q} x_q[k,l]\, e^{-j2\pi\left(\frac{ml}{M} - \frac{nk}{N}\right)} \overset{(a)}{=} \lambda(m,n) \sum_{u=0}^{\frac{N}{g_2}-1} \sum_{v=0}^{\frac{M}{g_1}-1} \tilde{x}_q[u,v]\, e^{-j2\pi\left(\frac{mv}{M/g_1} - \frac{nu}{N/g_2}\right)},$$
$$n = 0, 1, \ldots, ((N/g_2) - 1), \quad m = 0, 1, \ldots, ((M/g_1) - 1) \tag{6}$$

where $\lambda(m,n) \triangleq e^{-j2\pi\left(\frac{m (q)_{g_1}}{M} - \frac{n \lfloor q/g_1 \rfloor}{N}\right)}/(MN)$ and step (a) follows from the fact that $x_q(k,l)$ is zero for all $(k,l) \notin \mathcal{R}_q$. The information symbols of the $q$-th UT $\tilde{x}_q[u,v] \triangleq x_q(k = \lfloor q/g_1 \rfloor + g_2 u,\ l = (q)_{g_1} + g_1 v)$ are spaced apart by $g_1$ RBs along the delay domain and by $g_2$ RBs along the Doppler domain (see Fig. 1a), due to which from (6) it follows that for any integers $(t_1, t_2)$, $X_q[n + t_1(N/g_2), m + t_2(M/g_1)] = e^{-j2\pi\left(\frac{t_2 (q)_{g_1}}{g_1} - \frac{t_1 \lfloor q/g_1 \rfloor}{g_2}\right)} X_q[n,m]$, i.e., the TF modulated symbols are invariant to translations by integer multiples of $N/g_2$ and $M/g_1$ in the time and frequency domain respectively. This implies that the TF symbols can be restricted to a $(NT/g_2)$ sec × $(M\Delta f/g_1)$ Hz region of the TF domain. In the proposed scheme we limit the TF modulated symbols of the $q$-th UT to the region $\big[(NT/g_2)(q)_{g_2},\ (NT/g_2)((q)_{g_2} + 1)\big)$ sec

$^3$ Also, the transmission time $NT$ of the OTFS frame should be sufficiently small in order that the change in the delay and Doppler shift of each multipath is insignificant compared to $T/M$ and $\Delta f/N$ respectively, during this period.
$^4$ Since the transformation of DD domain information symbols to the time-domain signal and its inverse are both linear, the single-antenna channel model in (5) can be extended to a scenario where both the UTs and BS have multiple antennas and each UT can send distinct DD domain information symbols from each of its antennas. The DD domain symbol received on a DDRB at a particular BS antenna will simply be a sum of the symbols received on that DDRB at the same BS antenna from all the transmit antennas of all the UTs.
$^5$ As the effective channel in the DD domain is a 2-D circular convolution (see (5)), a symbol transmitted by a UT on a DDRB can interfere with another symbol transmitted by another UT on an adjacent DDRB either along the delay domain if the delay spread is larger than $T/M$, or along the Doppler domain if the maximum Doppler shift is more than $\Delta f/N$, leading to MUI.
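× $\big[(M/g_1)\lfloor q/g_2 \rfloor \Delta f,\ (M/g_1)(\lfloor q/g_2 \rfloor + 1)\Delta f\big)$ Hz of the TF domain (see Fig. 1b). The corresponding time-domain signal transmitted from the $q$-th UT is

$$s_q(t) \triangleq \sum_{m=0}^{M/g_1-1} \sum_{n=0}^{N/g_2-1} X_q[n,m]\, g_{tx}\big(t - (n + (N/g_2)(q)_{g_2})T\big)\, e^{j2\pi\left(m + (M/g_1)\lfloor q/g_2 \rfloor\right)\Delta f \left(t - (n + (N/g_2)(q)_{g_2})T\right)}. \tag{7}$$

Let the DD channel for the $q$-th UT be [6]

$$h_q(\tau,\nu) = \sum_{i=1}^{p_q} h_{q,i}\, \delta(\tau - \tau_{q,i})\, \delta(\nu - \nu_{q,i}) \tag{8}$$

where $\delta(\cdot)$ is the impulse function, $\tau_{q,i}$ and $\nu_{q,i}$ are the delay and Doppler along the $i$-th multipath and $p_q$ is the number of paths. Further, we assume that $h_q(\tau,\nu)$ satisfies (3), i.e.,

$$|\tau_{q,i}| \le \tau_{\max}, \quad |\nu_{q,i}| \le \nu_{\max}. \tag{9}$$

At the BS, for the $q'$-th UT the BS computes

$$Y_{q'}[\tilde{n},\tilde{m}] = \int g_{rx}^*\big(t - (\tilde{n} + (N/g_2)(q')_{g_2})T\big)\, r(t)\, e^{-j2\pi\left(\tilde{m} + (M/g_1)\lfloor q'/g_2 \rfloor\right)\Delta f \left(t - (\tilde{n} + (N/g_2)(q')_{g_2})T\right)}\, dt,$$
$$\tilde{n} = 0, 1, \ldots, (N/g_2 - 1), \quad \tilde{m} = 0, 1, \ldots, (M/g_1 - 1). \tag{10}$$

Theorem 1: For the $q'$-th UT, $Y_{q'}[\tilde{n},\tilde{m}]$ computed at the BS is free from MUI and is given by

$$Y_{q'}[\tilde{n},\tilde{m}] = X_{q'}[\tilde{n},\tilde{m}]\, \tilde{H}_{q'}[\tilde{n},\tilde{m}] + W_{q'}[\tilde{n},\tilde{m}]$$
$$\tilde{H}_{q'}[\tilde{n},\tilde{m}] \triangleq \sum_{i=1}^{p_{q'}} h_{q',i}\, e^{-j2\pi\nu_{q',i}\tau_{q',i}}\, e^{j2\pi\nu_{q',i}\left(\tilde{n} + (N/g_2)(q')_{g_2}\right)T}\, e^{-j2\pi\tau_{q',i}\Delta f\left(\tilde{m} + (M/g_1)\lfloor q'/g_2 \rfloor\right)} \tag{11}$$

where $W_{q'}[\tilde{n},\tilde{m}]$ is the additive noise.
Proof: Using the expression for $s_q(t)$ from (7) and that of $h_q(\tau,\nu)$ from (8) in (1), we get the expression for $r(t)$ using which in (10) we get

$$Y_{q'}[\tilde{n},\tilde{m}] = \sum_{n=0}^{N/g_2-1} \sum_{m=0}^{M/g_1-1} \sum_{q=0}^{Q-1} \sum_{i=1}^{p_q} X_q[n,m]\, h_{q,i}\, e^{-j2\pi\nu_{q,i}\tau_{q,i}}\, e^{-j2\pi\tau_{q,i}\Delta f\left(m + (M/g_1)\lfloor q/g_2 \rfloor\right)}\, F_i(\tilde{n},\tilde{m},n,m,q,q') + W_{q'}[\tilde{n},\tilde{m}]$$
$$F_i(\tilde{n},\tilde{m},n,m,q,q') \triangleq \int g_{rx}^*\big(t - (\tilde{n} + (N/g_2)(q')_{g_2})T\big)\, g_{tx}\big(t - \tau_{q,i} - nT - (N/g_2)(q)_{g_2}T\big)\, e^{j2\pi t\left(\nu_{q,i} + \Delta f\left(m - \tilde{m} + (M/g_1)(\lfloor q/g_2 \rfloor - \lfloor q'/g_2 \rfloor)\right)\right)}\, dt. \tag{12}$$

Since $n, \tilde{n} \in [0, N/g_2)$, $m, \tilde{m} \in [0, M/g_1)$, $(\tau_{q,i}, \nu_{q,i})$ satisfy (9) and for $q \ne q'$, $((q)_{g_2}, \lfloor q/g_2 \rfloor) \ne ((q')_{g_2}, \lfloor q'/g_2 \rfloor)$, from the property of Tx and Rx pulse in (4) we get $F_i(\tilde{n},\tilde{m},n,m,q,q') = e^{j2\pi\nu_{q',i}T\left(\tilde{n} + (N/g_2)(q')_{g_2}\right)}\, \delta(m - \tilde{m})\, \delta(n - \tilde{n})\, \delta(q - q')$ using which in (12) we get (11).

For the $q$-th UT the BS transforms $Y_q[\tilde{n},\tilde{m}]$ back to the DD domain through SFFT, i.e., for all $0 \le \tilde{u} < N/g_2$ and $0 \le \tilde{v} < M/g_1$ we have

$$\hat{x}_q[\tilde{u},\tilde{v}] \triangleq \sum_{\tilde{n}=0}^{N/g_2-1} \sum_{\tilde{m}=0}^{M/g_1-1} Y_q[\tilde{n},\tilde{m}]\, e^{j2\pi\left(\frac{\tilde{m}\tilde{v}}{M/g_1} - \frac{\tilde{n}\tilde{u}}{N/g_2}\right)}. \tag{13}$$

Theorem 2: The output of the proposed SFFT in (13), i.e., $\hat{x}_q[\cdot,\cdot]$ is related to the information symbols transmitted by the $q$-th UT (i.e., $\tilde{x}_q[\cdot,\cdot]$) through the 2-D circular convolution

$$\hat{x}_q[\tilde{u},\tilde{v}] = \sum_{u=0}^{N/g_2-1} \sum_{v=0}^{M/g_1-1} \tilde{x}_q[u,v]\, \tilde{h}_q\big[(\tilde{u}-u)_{N/g_2}, (\tilde{v}-v)_{M/g_1}\big] + w_q[\tilde{u},\tilde{v}],$$
$$\tilde{h}_q[k,l] \triangleq \sum_{i=1}^{p_q} h_{q,i}\, e^{-j2\pi\left(\nu_{q,i}\tau_{q,i} + \frac{\tau_{q,i}}{T}\frac{M}{g_1}\lfloor q/g_2 \rfloor - \frac{\nu_{q,i}}{\Delta f}\frac{N}{g_2}(q)_{g_2}\right)}\, D_{q,i}[l]\, E_{q,i}[k]$$
$$D_{q,i}[l] \triangleq \frac{1}{M} \sum_{m=0}^{M/g_1-1} e^{-j2\pi m\left(\frac{(q)_{g_1}}{M} - \frac{l}{M/g_1} + \frac{\tau_{q,i}}{T}\right)}$$
$$E_{q,i}[k] \triangleq \frac{1}{N} \sum_{n=0}^{N/g_2-1} e^{j2\pi n\left(\frac{\lfloor q/g_1 \rfloor}{N} - \frac{k}{N/g_2} + \frac{\nu_{q,i}}{\Delta f}\right)} \tag{14}$$

where $w_q[\tilde{u},\tilde{v}]$ is additive noise.
Proof: We substitute the expression of $X_q[n,m]$ from (6) into (11) of Theorem 1, which gives us an expression for $Y_q[\tilde{n},\tilde{m}]$ in terms of $\tilde{x}_q[\cdot,\cdot]$. We then substitute this expression of $Y_q[\tilde{n},\tilde{m}]$ into the R.H.S. of (13), which gives us (14).$^6$
In (14), $w_q(\tilde{u},\tilde{v}) \sim \mathcal{CN}(0, 1/g_1 g_2)$ since the summation in (13) is only over $(MN/g_1 g_2)$ terms as compared to $MN$ terms in (2) ($w[k',l'] \sim \mathcal{CN}(0,1)$ in (5)). The 2-D convolution in (14) can be written as $\hat{\mathbf{x}}_q = \mathbf{H}_q \tilde{\mathbf{x}}_q + \tilde{\mathbf{w}}_q$ where $\tilde{\mathbf{x}}_q \triangleq \big(\tilde{x}_q[0,0], \ldots, \tilde{x}_q[\tfrac{N}{g_2}-1, 0], \ldots, \tilde{x}_q[0, \tfrac{M}{g_1}-1], \ldots, \tilde{x}_q[\tfrac{N}{g_2}-1, \tfrac{M}{g_1}-1]\big)^T$ and similarly $\hat{\mathbf{x}}_q \in \mathbb{C}^{\frac{MN}{g_1 g_2} \times 1}$ is related to $\hat{x}_q[\cdot,\cdot]$. The matrix $\mathbf{H}_q \in \mathbb{C}^{\frac{MN}{g_1 g_2} \times \frac{MN}{g_1 g_2}}$ is derived from the 2-D convolution in (14). With i.i.d. $\mathcal{CN}(0,\rho)$ symbols $\tilde{x}_q[u,v]$ and Gaussian distributed noise, the sum SE (bits/s/Hz) is given by

$$R \triangleq \frac{1}{MN} \sum_{q=0}^{Q-1} \log_2 \left| \mathbf{I} + \rho\, g_1 g_2\, \mathbf{H}_q \mathbf{H}_q^H \right|. \tag{15}$$

We consider an OTFS based communication system where $\Delta f = 1/T = 15$ kHz, $M = 36$, $N = 18$ and the channel model is the same for all the UTs. We consider the 3GPP standardized Extended Typical Urban (ETU 300) channel model [2], where the maximum Doppler shift is $\nu_{\max} = 300$ Hz.$^7$ The channel

$^6$ Using the fact that the effective single-antenna channel model in (5) can be extended to the multi-antenna scenario (where UTs and BS have multiple antennas), the analysis in this section can be extended to show that the main results of this letter, i.e., MUI free attribute of the proposed MA scheme in Theorem 1 and the 2-D circular convolution in Theorem 2 are preserved even for the multi-antenna scenario.
$^7$ The delay profile $\{\tau_{q,i}\}_{i=1}^{p_q = 9}$ is [0, 50, 120, 200, 230, 500, 1600, 2300, 5000] ns and the corresponding power profile $\{E[|h_{q,i}|^2]\}_{i=1}^{p_q = 9}$ is [−1, −1, −1, 0, 0, 0, −3, −5, −7] dB. We further normalize this power profile such that $\sum_{i=1}^{9} E[|h_{q,i}|^2] = 1$.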
gains $h_{q,i}$ in (8) are modeled as independent Rayleigh faded random variables. The Doppler shift for the $i$-th path is taken to be $\nu_{q,i} = \nu_{\max} \cos(\theta_{q,i})$ where $\theta_{q,i}$ are independent and uniformly distributed in $[0, 2\pi)$ [7]. Note that $\nu_{\max} < \Delta f$ and $\tau_{\max} < T$, i.e., (3) is satisfied.
We compare the average sum SE achieved by the proposed MA method (i.e., $R$ in (15) averaged over the channel statistics) to that achieved by the method proposed in [9], as a function of the received signal to noise ratio (SNR). The received SNR is the ratio of the received signal power from a UT (i.e., $E[\int |s_q(t)|^2 dt]/(NT)$) to the noise power in the total bandwidth of $M\Delta f$ Hz. For the proposed method, from the expressions derived in Section III it follows that the average received SNR is $\rho/Q$. The MA method in [9] allocates non-overlapping contiguous RBs in the DD domain to each UT, with DDRBs allocated to distinct UTs being separated by guard bands either in the delay domain or in the Doppler domain. For example, a guard band in the delay domain consists of a contiguous region in the DD domain which is $G$ RBs wide along the delay domain and $N$ RBs wide along the Doppler domain (see Fig. 2). For the method in [9], $G$ is chosen so as to maximize the sum SE.

Fig. 2. MA Method in [9] (M = N = 9, Q = 3, G = 2).

In Fig. 3 we plot the average sum SE versus the received SNR. For a fixed number of UTs, the sum SE achieved by the method in [9] saturates with increasing SNR, and also for a fixed SNR it decreases with increasing number of UTs. This is because the method in [9] spreads the information of each UT over the entire TF domain, due to which they are inseparable in the TF domain. When transformed back to the DD domain, the high Doppler and delay spread limits the sum SE, since either the width of the guard band $G$ has to be sufficiently large, which in turn would reduce the number of available DDRBs, or else the MUI will increase. In contrast, from Fig. 3 it is clear that for a fixed number of UTs, the sum SE of the proposed method increases steadily with increasing SNR, and also that for a fixed SNR the sum SE is roughly the same irrespective of the number of UTs.$^8$ This is because the proposed MA method is MUI free (see Theorem 1).$^9$

Fig. 3. Sum Spectral Efficiency versus average received SNR.

V. CONCLUSION
In this letter, we have proposed a novel Multiple-Access method for OTFS based communication systems which is analytically shown to be MUI free. Numerical simulations confirm that the proposed method is able to achieve high sum spectral efficiency in high Doppler and delay spread channels.

REFERENCES
[1] “IMT vision – Framework and overall objectives of the future deployment of IMT for 2020 and beyond,” Int. Telecommun. Union, Geneva, Switzerland, ITU-Recommendation M-2083-0, Sep. 2015. [Online]. Available: www.itu.int
[2] LTE, Evolved Universal Terrestrial Radio Access (E-UTRA); Base Station (BS) Radio Transmission and Reception, Version 8.6.0, 3GPP Standard ETSI TS 36.104, Jul. 2009.
[3] T. Wang, J. G. Proakis, E. Masry, and J. R. Zeidler, “Performance degradation of OFDM systems due to doppler spreading,” IEEE Trans. Wireless Commun., vol. 5, no. 6, pp. 1422–1432, Jun. 2006.
[4] R. Hadani et al., “Orthogonal time frequency space modulation,” in Proc. IEEE Wireless Commun. Netw. Conf. (WCNC), San Francisco, CA, USA, Mar. 2017.
[5] R. Hadani and A. Monk, “OTFS: A new generation of modulation addressing the challenges of 5G,” arXiv:1802.02623[cs.IT], Feb. 2018. [Online]. Available: www.arxiv.org
[6] K. R. Murali and A. Chockalingam, “On OTFS modulation for high-doppler fading channels,” in Proc. Inf. Theory Workshop Appl. (ITA), Feb. 2018, pp. 1–10.
[7] R. Raviteja, K. T. Phan, Q. Jin, Y. Hong, and E. Viterbo, “Low-complexity iterative detection for orthogonal time frequency space modulation,” arXiv:1709.09402[cs.IT], Sep. 2017. [Online]. Available: www.arxiv.org
[8] A. Farhang, A. R. Reyhani, L. E. Doyle, and B. Farhang-Boroujeny, “Low complexity modem structure for OFDM-based orthogonal time frequency space modulation,” IEEE Wireless Commun. Lett., vol. 7, no. 3, pp. 344–347, Jun. 2017.
[9] S. Rakib and R. Hadani, “Multiple access in wireless telecommunications system for high-mobility applications,” U.S. Patent 9 722 741 B1, Aug. 2017.
[10] A. El Gamal and Y. H. Kim, Network Information Theory. Cambridge, U.K.: Cambridge Univ. Press, 2011.

$^8$ With a fixed received SNR for each UT, although the number of DDRBs allocated to each UT decreases with increasing $Q$ (number of UTs), the energy of each transmitted information symbol increases proportionately. This along with the MUI free attribute of the proposed MA method explains our observation on the sum SE being roughly the same irrespective of $Q$. This also implies that the per-UT SE decreases with increasing $Q$, which is however a fundamental fact from information theory of multiple access channels where the BS has a single antenna, which limits the degrees of freedom to one [10].
$^9$ As the MUI free attribute of the proposed MA method holds also for the multi-antenna scenario, the proposed MA method is expected to achieve higher sum SE than that achieved by the MUI limited guard band based method even in the multi-antenna scenario.
Optimal Hybrid Beamforming for Multiuser Massive MIMO Systems With Individual SINR Constraints

Guangda Zang, Ying Cui, Hei Victor Cheng, Feng Yang, Lianghui Ding, and Hui Liu

Manuscript received September 1, 2018; revised October 18, 2018; accepted October 20, 2018. Date of publication October 30, 2018; date of current version April 9, 2019. This work was supported in part by the National Natural Science Foundation of China under Grant 61401272, Grant 61771309, Grant 61671301, Grant 61420106008, and Grant 61521062, in part by the Shanghai Key Laboratory Funding under Grant STCSM15DZ2270400, in part by the CETC Key Laboratory of Data Link Technology Foundation under Grant CLDL-20162306, and in part by the Medical Engineering Cross Research Foundation of Shanghai Jiao Tong University under Grant YG2017QN47. The associate editor coordinating the review of this paper and approving it for publication was P. D. Diamantoulakis. (Corresponding author: Feng Yang.)
G. Zang is with the Department of Electronic Engineering, Shanghai Jiao Tong University, Shanghai 200240, China, also with the Department of New Technology of Communication, Shanghai Microwave Research Institute, Shanghai 200331, China, and also with the CETC Key Laboratory of Data Link Technology, China Electronics Technology Group Corporation, Xi’an 710068, China (e-mail: zangguangda@sjtu.edu.cn).
Y. Cui, F. Yang, L. Ding, and H. Liu are with the Department of Electronic Engineering, Shanghai Jiao Tong University, Shanghai 200240, China (e-mail: cuiying@sjtu.edu.cn; yangfeng@sjtu.edu.cn; lhding@sjtu.edu.cn; huiliu@sjtu.edu.cn).
H. V. Cheng is with the Department of Electrical Engineering, Linköping University, 581 83 Linköping, Sweden (e-mail: hei.cheng@liu.se).
Digital Object Identifier 10.1109/LWC.2018.2878766

Abstract—In this letter, we consider optimal hybrid beamforming design to minimize the transmission power under individual signal-to-interference-plus-noise ratio (SINR) constraints in a multiuser massive multiple-input-multiple-output (MIMO) system. This results in a challenging non-convex optimization problem. We consider two cases. In the case where the number of users is smaller than or equal to that of radio frequency (RF) chains, we propose a low-complexity method to obtain a globally optimal solution and show that it achieves the same transmission power as an optimal fully-digital beamformer. In the case where the number of users is larger than that of RF chains, we propose a low-complexity globally convergent alternating algorithm to obtain a stationary point.

Index Terms—Multiuser massive MIMO, hybrid beamforming, power minimization, penalty method.

I. INTRODUCTION

WITH a large number of antennas deployed in massive multiple-input-multiple-output (MIMO) systems, power consumption and cost of devices increase significantly and may not be affordable for practical implementation. To address these issues, hybrid analog/digital structure with a reduced number of radio frequency (RF) chains has been regarded as a promising solution. Analog beamforming refers to the analog operations applied to a signal before being transmitted through antennas, and digital beamforming refers to the baseband signal processing applied to a signal before being sent to RF chains.

Hybrid beamforming technologies have been widely studied in both point-to-point and multiuser massive MIMO systems [1]–[3]. It is desirable to consider individual signal-to-interference-plus-noise ratio (SINR) constraints to guarantee quality of service (QoS) requirements for different users in multiuser massive MIMO systems. However, most previous works on multiuser hybrid beamforming design fail to consider individual SINR constraints. Geng et al. [4] consider a non-convex multiuser hybrid beamforming design problem with individual SINR constraints and propose a semidefinite relaxation-based alternating (SDR-Alt) algorithm to obtain a feasible solution. In particular, in each iteration, a digital beamforming design problem is solved by computing a semi-closed form solution, and an analog beamforming design problem is solved with complexity $O(M^{4.5} N^{4.5})$ using standard techniques for semidefinite programming (SDP), where $M$ denotes the number of antennas and $N$ denotes the number of RF chains. Moreover, most of previous works (e.g., [4]) focus on the case where the number of users is no greater than that of RF chains, and hence cannot provide meaningful solutions for the emerging massive connectivity applications. To our knowledge, hybrid beamformer optimizations with individual SINR constraints in multiuser massive MIMO systems have not been successfully solved.

In this letter, we consider a multiuser massive MIMO system with $K$ users and $N$ RF chains and assume perfect channel state information (CSI). We study optimal hybrid beamforming design to minimize the transmission power subject to individual SINR constraints. The resulting challenging non-convex problem is solved in two cases. In the case of $K \leq N$, we propose a low-complexity method to obtain a globally optimal solution and show that it achieves the same transmission power as an optimal fully-digital beamformer with a reduced number of RF chains, by connecting the original optimization problem to a fully-digital beamforming design problem. In the case of $K > N$, we propose a low-complexity globally convergent alternating algorithm to obtain a stationary point, based on problem transformation and a penalty method. To the best of our knowledge, the proposed solutions are so far the most promising ones in the two cases in terms of computational complexity and theoretical guarantee. Finally, numerical results show that the proposed solutions have much lower computational complexity than the SDR-Alt algorithm.

II. SYSTEM MODEL AND PROBLEM FORMULATION

Consider a downlink multiuser massive MIMO system with one multi-antenna base station (BS) and $K$ single-antenna users, denoted by $\mathcal{K} \triangleq \{1, \ldots, K\}$. The BS has $M$ ($\geq K$) antennas and $N$ RF chains. To reduce hardware cost and power consumption, we consider hybrid beamforming with a reduced number of RF chains (i.e., $N < M$). As illustrated in Fig. 1, we adopt the widely used fully-connected structure, where each RF chain is connected to all $M$ antennas. Thus, the output signal of each antenna can be seen as a linear combination of all RF signals. Let $\mathbf{W} \triangleq [\mathbf{w}_1, \ldots, \mathbf{w}_K] \in \mathbb{C}^{N \times K}$ denote the digital beamformer, where $\mathbf{w}_k \in \mathbb{C}^{N \times 1}$ denotes the digital beamforming vector for user $k$. Let $\mathbf{V} \in \mathbb{C}^{M \times N}$ denote the analog beamformer. As in [5] and [6], we do not impose modulus constraints on the analog beamformer. Note that an analog beamformer without modulus constraints can be implemented using vector modulators [5] or double phase shifter structure [6].¹
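¹ Note that our proposed methods can be extended to the case with modulus constraints by first relaxing the modulus constraints and then projecting the obtained solutions onto the set with modulus constraints.

Fig. 1. Hybrid beamformer structure.

We consider a narrowband system and assume a block fading channel model. Let $\mathbf{g}_k^H \in \mathbb{C}^{1 \times M}$ denote the channel of user $k \in \mathcal{K}$, and let $\mathbf{G} \triangleq [\mathbf{g}_1, \ldots, \mathbf{g}_K]^H \in \mathbb{C}^{K \times M}$ denote the channels of the $K$ users, where the superscript $H$ denotes the Hermitian transpose of a matrix. In this letter, we assume perfect CSI at the BS. The received signal of user $k$ is given by $y_k = \mathbf{g}_k^H \mathbf{V} \mathbf{w}_k s_k + \sum_{i \in \mathcal{K}, i \neq k} \mathbf{g}_k^H \mathbf{V} \mathbf{w}_i s_i + n_k$, where $s_k$ and $n_k \sim \mathcal{CN}(0, \sigma_k^2)$ denote the transmitted information symbol and the additive Gaussian noise of user $k$, respectively. We assume that $s_k$, $k \in \mathcal{K}$ are independent and with zero mean and unit variance. Thus, the transmission power is given by $\|\mathbf{V}\mathbf{W}\|_F^2$, where $\|\cdot\|_F$ denotes the Frobenius norm. To capture the QoS requirement for user $k \in \mathcal{K}$, we require that the instantaneous SINR of user $k$ is above a threshold $\eta_k$, i.e.,

$$\frac{|\mathbf{g}_k^H \mathbf{V} \mathbf{w}_k|^2}{\sum_{i \in \mathcal{K}, i \neq k} |\mathbf{g}_k^H \mathbf{V} \mathbf{w}_i|^2 + \sigma_k^2} \geq \eta_k, \quad k \in \mathcal{K}. \tag{1}$$

Our goal is to optimize the digital beamformer $\mathbf{W}$ and the analog beamformer $\mathbf{V}$ to minimize the transmission power $\|\mathbf{V}\mathbf{W}\|_F^2$ under the individual SINR constraints in (1). Thus, we have the following hybrid beamforming design problem

$$\mathcal{P}_{\mathrm{Ori}}: \quad \min_{\mathbf{V}, \mathbf{W}} \ \|\mathbf{V}\mathbf{W}\|_F^2 \quad \text{s.t.} \ (1).$$

Problem $\mathcal{P}_{\mathrm{Ori}}$ is a challenging non-convex problem. In Section III and Section IV, we shall solve Problem $\mathcal{P}_{\mathrm{Ori}}$ for two cases, i.e., $K \leq N$ and $K > N$, respectively.

III. SOLUTION FOR THE CASE OF K ≤ N

In this section, we study the case of $K \leq N$, and obtain a globally optimal solution of the non-convex Problem $\mathcal{P}_{\mathrm{Ori}}$, by connecting it to a fully-digital beamforming design problem.

First, letting $\mathbf{W}_D = \mathbf{V}\mathbf{W} \in \mathbb{C}^{M \times K}$, Problem $\mathcal{P}_{\mathrm{Ori}}$ can be transformed to the following fully-digital beamforming (with $M$ RF chains) design problem

$$\mathcal{P}_{\mathrm{FD}}: \quad \min_{\mathbf{W}_D} \ \|\mathbf{W}_D\|_F^2$$
$$\text{s.t.} \quad \frac{|[\mathbf{g}_k^H \mathbf{W}_D]_k|^2}{\sum_{i \in \mathcal{K}, i \neq k} |[\mathbf{g}_k^H \mathbf{W}_D]_i|^2 + \sigma_k^2} \geq \eta_k, \quad k \in \mathcal{K}, \tag{2}$$

where $\mathbf{W}_D$ can be viewed as the digital beamformer and $[\cdot]_i$ denotes the $i$-th element of the argument. Although Problem $\mathcal{P}_{\mathrm{FD}}$ is non-convex, it can be solved optimally using several methods, such as the method proposed in [7] which is based on a semi-closed form solution obtained from KKT conditions and has low computational complexity compared to other methods. Let $\mathbf{W}_D^\star$ denote a globally optimal solution of Problem $\mathcal{P}_{\mathrm{FD}}$.

Next, we construct a globally optimal solution of Problem $\mathcal{P}_{\mathrm{Ori}}$ based on $\mathbf{W}_D^\star$. Specifically, we randomly generate an $N \times K$ matrix with linearly independent columns, denoted by $\mathbf{W}^\star \in \mathbb{C}^{N \times K}$, and calculate $\mathbf{V}^\star = \mathbf{W}_D^\star ((\mathbf{W}^\star)^H \mathbf{W}^\star)^{-1} (\mathbf{W}^\star)^H \in \mathbb{C}^{M \times N}$.

Algorithm 1 Globally Optimal Design for the Case of K ≤ N
1: Find $\mathbf{W}_D^\star$ by solving Problem $\mathcal{P}_{\mathrm{FD}}$ using the method in [7];
2: Construct $\mathbf{W}^\star \in \mathbb{C}^{N \times K}$ with linearly independent columns;
3: Calculate $\mathbf{V}^\star = \mathbf{W}_D^\star ((\mathbf{W}^\star)^H \mathbf{W}^\star)^{-1} (\mathbf{W}^\star)^H \in \mathbb{C}^{M \times N}$.

Lemma 1: When $K \leq N$, $(\mathbf{V}^\star, \mathbf{W}^\star)$ is a globally optimal solution of Problem $\mathcal{P}_{\mathrm{Ori}}$, and $\|\mathbf{V}^\star \mathbf{W}^\star\|_F^2 = \|\mathbf{W}_D^\star\|_F^2$.

Proof: It is clear that $\mathbf{V}^\star \mathbf{W}^\star = \mathbf{W}_D^\star$ and $(\mathbf{V}^\star, \mathbf{W}^\star)$ satisfies (1). Thus, $(\mathbf{V}^\star, \mathbf{W}^\star)$ is a feasible solution of Problem $\mathcal{P}_{\mathrm{Ori}}$ and achieves the same transmission power as $\mathbf{W}_D^\star$. Note that the optimal value of Problem $\mathcal{P}_{\mathrm{FD}}$ is no greater than that of Problem $\mathcal{P}_{\mathrm{Ori}}$. Thus, $(\mathbf{V}^\star, \mathbf{W}^\star)$ is a globally optimal solution of Problem $\mathcal{P}_{\mathrm{Ori}}$.

The key steps are summarized in Algorithm 1. To the best of our knowledge, this is the first work providing a globally optimal solution of Problem $\mathcal{P}_{\mathrm{Ori}}$ and showing that the optimal hybrid beamformer (with at least $K$ RF chains) can achieve the same transmission power as the optimal fully-digital beamformer (with $M$ ($> N$) RF chains) in the case of $K \leq N$. As Algorithm 1 requires only computing a semi-closed form solution and some simple matrix operations, it has much lower computational complexity than the SDR-Alt algorithm [4].

IV. SOLUTION FOR THE CASE OF K > N

In this section, we consider the case of $K > N$ and propose a globally convergent alternating algorithm based on a penalty method to obtain a stationary solution of Problem $\mathcal{P}_{\mathrm{Ori}}$.

A. Equivalent Problem

First, consider the following problem.

$$\mathcal{P}_{\mathrm{Eq}}: \quad \min_{\mathbf{X}} \ \|\mathbf{S}_v \mathbf{X} \mathbf{S}_w\|_F^2$$
$$\text{s.t.} \quad \left\| \begin{bmatrix} (\mathbf{g}_k^H \mathbf{S}_v \mathbf{X} \mathbf{S}_w)^H \\ \sigma_k \end{bmatrix} \right\|_2 \leq \sqrt{\frac{1 + \eta_k}{\eta_k}} \, \mathbf{g}_k^H \mathbf{S}_v \mathbf{X} \mathbf{d}_k, \quad k \in \mathcal{K}, \tag{3}$$
$$\mathbf{g}_k^H \mathbf{S}_v \mathbf{X} \mathbf{d}_k \geq 0, \quad k \in \mathcal{K}, \tag{4}$$
$$\mathbf{X} \succeq \mathbf{0}, \tag{5}$$
$$\mathrm{rank}(\mathbf{X}) \leq N, \tag{6}$$

where $\mathbf{S}_v \triangleq [\mathbf{I}_{M \times M}, \mathbf{0}_{M \times K}] \in \mathbb{C}^{M \times (M+K)}$, $\mathbf{S}_w \triangleq [\mathbf{0}_{K \times M}, \mathbf{I}_{K \times K}]^T \in \mathbb{C}^{(M+K) \times K}$ and $\mathbf{d}_k \in \mathbb{C}^{(M+K) \times 1}$ denotes the vector with the $(M+k)$-th element being 1 and the rest being 0.²

² We denote the identity matrix and zero matrix of appropriate size by $\mathbf{I}$ and $\mathbf{0}$, respectively.

Any feasible solution $\mathbf{X}$ of Problem $\mathcal{P}_{\mathrm{Eq}}$ can be decomposed as $\mathbf{X} = \mathbf{U}\mathbf{U}^H$ (as $\mathbf{X}$ satisfies the constraints in (5) and (6)). We can rewrite $\mathbf{U}$ as $\mathbf{U} = [\mathbf{V}^H, \mathbf{W}]^H$, where $\mathbf{V} \in \mathbb{C}^{M \times N}$ and $\mathbf{W} \in \mathbb{C}^{N \times K}$, i.e., $\mathbf{V} = \mathbf{S}_v \mathbf{U}$ and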
$\mathbf{W} = \mathbf{U}^H \mathbf{S}_w$. The following result shows the relationship between Problem $\mathcal{P}_{\mathrm{Ori}}$ and Problem $\mathcal{P}_{\mathrm{Eq}}$.

Theorem 1: If $\mathbf{X}$ is a globally optimal solution of Problem $\mathcal{P}_{\mathrm{Eq}}$, $(\mathbf{V}, \mathbf{W})$ is a globally optimal solution of Problem $\mathcal{P}_{\mathrm{Ori}}$. Furthermore, if $\mathbf{X}$ is a stationary point of Problem $\mathcal{P}_{\mathrm{Eq}}$, $(\mathbf{V}, \mathbf{W})$ is a stationary point of Problem $\mathcal{P}_{\mathrm{Ori}}$.

Proof: See the Appendix.

B. Penalty Method

Based on Theorem 1, we can solve Problem $\mathcal{P}_{\mathrm{Eq}}$ instead of Problem $\mathcal{P}_{\mathrm{Ori}}$. The rank-$N$ constraint in (6) is non-convex and non-smooth, and hence is hard to deal with. To address this challenge, instead of (6), we consider the following constraint

$$\mathrm{trace}(\mathbf{X}) - \sum_{i=1}^{N} \lambda_i(\mathbf{X}) \leq 0, \tag{7}$$

where $\lambda_i(\cdot)$ denotes the $i$-th largest eigenvalue of the argument. As $\mathrm{trace}(\mathbf{X}) \geq \sum_{i=1}^{N} \lambda_i(\mathbf{X})$ holds for any $\mathbf{X} \succeq \mathbf{0}$, (7) implies $\mathrm{trace}(\mathbf{X}) = \sum_{i=1}^{N} \lambda_i(\mathbf{X})$, which means that $\mathbf{X}$ has at most $N$ nonzero eigenvalues, i.e., (6) holds. Then we incorporate (7) as a penalty for violation and obtain

$$\mathcal{P}_{\mathrm{Pen}}: \quad \min_{\mathbf{X}} \ \|\mathbf{S}_v \mathbf{X} \mathbf{S}_w\|_F^2 + \mu \Big( \mathrm{trace}(\mathbf{X}) - \sum_{i=1}^{N} \lambda_i(\mathbf{X}) \Big) \quad \text{s.t.} \ (3), (4), (5).$$

Using similar arguments in [8], we have the following result.

Theorem 2: There exists $\mu_0 \in (0, +\infty)$ such that for all $\mu > \mu_0$, $\mathrm{trace}(\mathbf{X}) - \sum_{i=1}^{N} \lambda_i(\mathbf{X}) = 0$ and $(\mathbf{V}, \mathbf{W})$ is a stationary point of Problem $\mathcal{P}_{\mathrm{Ori}}$, where $\mathbf{X}$ is a stationary point of Problem $\mathcal{P}_{\mathrm{Pen}}$.

Based on Theorem 2, we first solve Problem $\mathcal{P}_{\mathrm{Pen}}$ for any given $\mu$. Let $\Phi_{M+K,N} \triangleq \{\mathbf{P} \in \mathbb{S}^{M+K},\ \mathbf{0} \preceq \mathbf{P} \preceq \mathbf{I},\ \mathrm{trace}(\mathbf{P}) = M+K-N\}$ denote the convex hull of the rank-$(M+K-N)$ projection matrices. As

$$\mathrm{trace}(\mathbf{X}) - \sum_{i=1}^{N} \lambda_i(\mathbf{X}) = \min_{\mathbf{P} \in \Phi_{M+K,N}} \mathrm{trace}(\mathbf{P}^T \mathbf{X}) \tag{8}$$

holds [9], Problem $\mathcal{P}_{\mathrm{Pen}}$ can be rewritten as

$$\mathcal{P}_{\mathrm{Alt}}: \quad \min_{\mathbf{X}} \ \min_{\mathbf{P} \in \Phi_{M+K,N}} \ \|\mathbf{S}_v \mathbf{X} \mathbf{S}_w\|_F^2 + \mu \, \mathrm{trace}(\mathbf{P}^T \mathbf{X}) \quad \text{s.t.} \ (3), (4), (5),$$

which can be solved alternately. Specifically, let $\mathbf{X}^{(i)}$ denote the estimate of $\mathbf{X}$ at the $i$-th iteration. Then, the estimates of $\mathbf{P}$ and $\mathbf{X}$ at the $(i+1)$-th iteration are updated as

$$\mathbf{P}^{(i+1)} = \arg\min_{\mathbf{P} \in \Phi_{M+K,N}} \mathrm{trace}(\mathbf{P}^T \mathbf{X}^{(i)}) \tag{9}$$
$$\mathbf{X}^{(i+1)} = \arg\min_{\mathbf{X}} \ \|\mathbf{S}_v \mathbf{X} \mathbf{S}_w\|_F^2 + \mu \, \mathrm{trace}((\mathbf{P}^{(i+1)})^T \mathbf{X}) \quad \text{s.t.} \ (3), (4), (5). \tag{10}$$

An optimal solution of the convex problem in (9) is given by $\mathbf{P}^{(i+1)} = \mathbf{Q}\mathbf{Q}^H$, where $\mathbf{Q} \in \mathbb{C}^{(M+K) \times (M+K-N)}$ is composed of the $M+K-N$ eigenvectors corresponding to the smallest $M+K-N$ eigenvalues of $\mathbf{X}^{(i)}$ [9] and can be obtained by standard matrix decomposition methods such as singular value decomposition. The convex SDP problem in (10) can be solved with complexity $O((M+K)^{4.5})$ using the standard interior-point toolboxes such as SeDuMi. Thus, it is clear that the iterative alternating procedure for given $\mu$ has much lower computational complexity than the SDR-Alt algorithm [4]. Since $\|\mathbf{S}_v \mathbf{X}^{(i)} \mathbf{S}_w\|_F^2 + \mu \, \mathrm{trace}(\mathbf{P}^T \mathbf{X}^{(i)})$ is nonnegative and is monotonically non-increasing with $i$, the iterative alternating procedure for given $\mu$ converges to a limit point. As the constraint sets of the two problems are disjoint, the limit point is a stationary point of Problem $\mathcal{P}_{\mathrm{Alt}}$ [10]. A sufficiently large $\mu$ ($> \mu_0$) can be found by increasing $\mu$ until $\mathrm{trace}(\mathbf{X}) - \sum_{i=1}^{N} \lambda_i(\mathbf{X}) = 0$.

The details are summarized in Algorithm 2. By Theorem 2 and by the equivalence between Problem $\mathcal{P}_{\mathrm{Alt}}$ and Problem $\mathcal{P}_{\mathrm{Pen}}$, we know that a stationary point of Problem $\mathcal{P}_{\mathrm{Ori}}$ can be obtained by Algorithm 2. As far as we know, this is the first work providing a convergent stationary point of Problem $\mathcal{P}_{\mathrm{Ori}}$ in the case of $K > N$.

Algorithm 2 Solution for the Case of K > N
1: while $\mathrm{trace}(\mathbf{X}) > \sum_{i=1}^{N} \lambda_i(\mathbf{X})$ do
2:   construct $\mathbf{X}^{(0)}$ with random values and set $i := 0$
3:   repeat
4:     Obtain $\mathbf{P}^{(i+1)}$ by solving the problem in (9);
5:     Obtain $\mathbf{X}^{(i+1)}$ by solving the problem in (10);
6:     $i \leftarrow i + 1$;
7:   until convergence criterion is met;
8:   $\mu := 2\mu$;
9: end while

V. NUMERICAL RESULTS

In this section, we provide numerical results to illustrate the performance of Algorithm 1 and Algorithm 2. In the simulations, the one-ring channel model is used by setting the angular spread as $\Delta = 15^\circ$ and assuming the azimuth angle of arrival for user $k$ as $\theta_k = -180^\circ + \Delta + (k-1)\frac{360^\circ}{K}$. We choose $\eta_k = \sqrt{2} - 1$ and $\sigma_k^2 = 1$. We consider four baselines for comparison. The first baseline is the hybrid beamformer obtained using the SDR-Alt algorithm in [4] for solving Problem $\mathcal{P}_{\mathrm{Ori}}$. The other three baselines are three typical fully-digital beamformers ($N = M$), i.e., the optimal solution $\mathbf{W}_D^\star$ of Problem $\mathcal{P}_{\mathrm{FD}}$ (optimal fully-digital beamformer), fully-digital beamformer based on zero-forcing (ZF) and fully-digital beamformer based on maximum-ratio-transmission (MRT), which satisfy the SINR constraints in (2). In evaluating the two proposed algorithms and the SDR-Alt algorithm in [4], we use the same convergence criterion; we generate 30 random channels (same for all schemes), and show the mean and standard deviation (see vertical bar at each point) of the performance. We compare the normalized average power consumption which is unit-less.

Fig. 2 illustrates the average power versus the number of users $K$. We can observe that, in the case of $K \leq N$, Algorithm 1 achieves the same average power as the optimal fully-digital beamformer. In the case of $K > N$, Algorithm 2 outperforms the fully-digital beamformers based on ZF and MRT, and achieves similar average power compared to the optimal fully-digital beamformer. These indicate that hybrid beamforming can achieve most of the beamforming performance with reduced hardware cost. In Fig. 2, we do not provide results for the SDR-Alt algorithm, as its computational complexity at $N = 36$ and $M = 96$ is not acceptable. In Fig. 3, we compare the average power and simulation time (reflecting computational complexity) of the proposed algorithms and the SDR-Alt algorithm at small $N$ and $M$. The proposed algorithms achieve the same average power as the SDR-Alt algorithm with much lower computational complexity. In addition, the computational complexity of the proposed algorithms almost does
not change with $M$, while the computational complexity of the SDR-Alt algorithm increases dramatically with $M$. Thus, Fig. 2 and Fig. 3 demonstrate the advantages of the proposed algorithms over the SDR-Alt algorithm.

Fig. 2. Average power versus K at N = 36 and M = 96.

Fig. 3. Average power and simulation time.

VI. CONCLUSION

In this letter, we considered the optimal hybrid beamforming design in a multiuser massive MIMO system to minimize the total transmission power under individual SINR constraints. By exploring structural properties of the problem, we proposed two low-complexity algorithmic solutions to solve the challenging non-convex problem in two cases depending on the number of users and the number of RF chains. The computational complexity of the proposed algorithms is dramatically reduced compared to the existing SDR-Alt algorithm.

APPENDIX
PROOF OF THEOREM 1

First, it can be verified that when a feasible point $\mathbf{V}\mathbf{W}$ of Problem $\mathcal{P}_{\mathrm{Ori}}$ is multiplied on the right by a diagonal phase scaling $\mathrm{diag}(e^{j\phi_i})$, where $\phi_i$, $i = 1, \ldots, K$ are arbitrary phase values and $\mathrm{diag}(x_i)$ denotes a diagonal matrix with $x_i$ being the $i$-th diagonal element, the feasibility and objective value of Problem $\mathcal{P}_{\mathrm{Ori}}$ do not change. If $\mathbf{V}^\star \mathbf{W}^\star$ is an optimal solution, then $\mathbf{V}^\star \mathbf{W}^\star \mathrm{diag}(e^{j\phi_i})$ is also an optimal solution. Thus, we can restrict the $k$-th diagonal element of $\mathbf{G}\mathbf{V}\mathbf{W}$, i.e., $\mathbf{g}_k^H \mathbf{V} \mathbf{w}_k$, to the non-negative real domain and impose

$$\mathbf{g}_k^H \mathbf{V} \mathbf{w}_k \geq 0, \quad k \in \mathcal{K}. \tag{11}$$

By (1), we have

$$\sum_{i \in \mathcal{K}, i \neq k} |\mathbf{g}_k^H \mathbf{V} \mathbf{w}_i|^2 + \sigma_k^2 \leq \frac{1}{\eta_k} |\mathbf{g}_k^H \mathbf{V} \mathbf{w}_k|^2, \quad k \in \mathcal{K},$$
$$\Rightarrow \sum_{i \in \mathcal{K}} |\mathbf{g}_k^H \mathbf{V} \mathbf{w}_i|^2 + \sigma_k^2 \leq \Big(1 + \frac{1}{\eta_k}\Big) |\mathbf{g}_k^H \mathbf{V} \mathbf{w}_k|^2, \quad k \in \mathcal{K},$$
$$\Rightarrow \left\| \begin{bmatrix} (\mathbf{g}_k^H \mathbf{V}\mathbf{W})^H \\ \sigma_k \end{bmatrix} \right\|_2^2 \leq \Big(1 + \frac{1}{\eta_k}\Big) |\mathbf{g}_k^H \mathbf{V} \mathbf{w}_k|^2, \quad k \in \mathcal{K}.$$

Taking square root on both sides of the above inequality and by (11), we have

$$\left\| \begin{bmatrix} (\mathbf{g}_k^H \mathbf{V}\mathbf{W})^H \\ \sigma_k \end{bmatrix} \right\|_2 \leq \sqrt{\frac{1 + \eta_k}{\eta_k}} \, \mathbf{g}_k^H \mathbf{V} \mathbf{w}_k, \quad k \in \mathcal{K}. \tag{12}$$

Next, letting $\mathbf{U} \triangleq [\mathbf{V}^H, \mathbf{W}]^H \in \mathbb{C}^{(M+K) \times N}$, we have $\mathbf{V} = \mathbf{S}_v \mathbf{U}$, $\mathbf{W} = \mathbf{U}^H \mathbf{S}_w$ and $\mathbf{w}_k = \mathbf{U}^H \mathbf{d}_k$. Thus, (11) and (12) can be rewritten as

$$\mathbf{g}_k^H \mathbf{S}_v \mathbf{U}\mathbf{U}^H \mathbf{d}_k \geq 0, \quad k \in \mathcal{K}, \tag{13}$$
$$\left\| \begin{bmatrix} (\mathbf{g}_k^H \mathbf{S}_v \mathbf{U}\mathbf{U}^H \mathbf{S}_w)^H \\ \sigma_k \end{bmatrix} \right\|_2 \leq \sqrt{\frac{1 + \eta_k}{\eta_k}} \, \mathbf{g}_k^H \mathbf{S}_v \mathbf{U}\mathbf{U}^H \mathbf{d}_k, \quad k \in \mathcal{K}. \tag{14}$$

Then Problem $\mathcal{P}_{\mathrm{Ori}}$ can be equivalently transformed to

$$\mathcal{P}_{\mathrm{Re}}: \quad \min_{\mathbf{U}} \ \|\mathbf{S}_v \mathbf{U}\mathbf{U}^H \mathbf{S}_w\|_F^2 \quad \text{s.t.} \ (13), (14).$$

Note that $\mathbf{X}$ can be rewritten as $\mathbf{X} = \mathbf{U}\mathbf{U}^H \in \mathbb{C}^{(M+K) \times (M+K)}$ for some $\mathbf{U}$ if and only if $\mathbf{X}$ satisfies constraints (5) and (6). Thus, if $\mathbf{X}$ is a globally optimal solution of Problem $\mathcal{P}_{\mathrm{Eq}}$, $(\mathbf{V}, \mathbf{W})$ is a globally optimal solution of Problem $\mathcal{P}_{\mathrm{Ori}}$. Furthermore, it can be verified that if $\mathbf{X}$ satisfies the KKT system of Problem $\mathcal{P}_{\mathrm{Eq}}$, $\mathbf{U}$ also satisfies the KKT system of Problem $\mathcal{P}_{\mathrm{Re}}$. Thus, if $\mathbf{X}$ is a stationary point of Problem $\mathcal{P}_{\mathrm{Eq}}$, $\mathbf{U}$ is a stationary point of Problem $\mathcal{P}_{\mathrm{Re}}$. Besides, by similar calculations provided in [7, Proposition 3], if $\mathbf{U}$ satisfies the KKT system of Problem $\mathcal{P}_{\mathrm{Re}}$, $(\mathbf{V}, \mathbf{W})$ satisfies the KKT system of Problem $\mathcal{P}_{\mathrm{Ori}}$. Thus, if $\mathbf{U}$ is a stationary point of Problem $\mathcal{P}_{\mathrm{Re}}$, $(\mathbf{V}, \mathbf{W})$ is a stationary point of Problem $\mathcal{P}_{\mathrm{Ori}}$. Therefore, if $\mathbf{X}$ is a stationary point of Problem $\mathcal{P}_{\mathrm{Eq}}$, $(\mathbf{V}, \mathbf{W})$ is a stationary point of Problem $\mathcal{P}_{\mathrm{Ori}}$.

REFERENCES

[1] F. Sohrabi and W. Yu, “Hybrid digital and analog beamforming design for large-scale antenna arrays,” IEEE J. Sel. Topics Signal Process., vol. 10, no. 3, pp. 501–513, Apr. 2016.
[2] T. E. Bogale, L. B. Le, A. Haghighat, and L. Vandendorpe, “On the number of RF chains and phase shifters, and scheduling design with hybrid analog-digital beamforming,” IEEE Trans. Wireless Commun., vol. 15, no. 5, pp. 3311–3326, May 2016.
[3] J. Mao, Z. Gao, Y. Wu, and M. Alouini, “Over-sampling codebook-based hybrid minimum sum-mean-square-error precoding for millimeter-wave 3D-MIMO,” IEEE Wireless Commun. Lett., to be published.
[4] J. Geng, W. Xiang, Z. Wei, N. Li, and D. Yang, “Multi-user hybrid analogue/digital beamforming for relatively large-scale antenna systems,” IET Commun., vol. 8, no. 17, pp. 3038–3049, Nov. 2014.
[5] F. Ellinger, U. Lott, and W. Bachtold, “An antenna diversity MMIC vector modulator for HIPERLAN with low power consumption and calibration capability,” IEEE Trans. Microw. Theory Techn., vol. 49, no. 5, pp. 964–969, May 2001.
[6] X. Yu, J. Zhang, and K. B. Letaief, “Alternating minimization for hybrid precoding in multiuser OFDM mmWave systems,” in Proc. 50th Asilomar Conf. Signals Syst. Comput., Nov. 2016, pp. 281–285.
[7] A. Wiesel, Y. C. Eldar, and S. Shamai, “Linear precoding via conic optimization for fixed MIMO receivers,” IEEE Trans. Signal Process., vol. 54, no. 1, pp. 161–176, Jan. 2006.
[8] A. H. Phan, H. D. Tuan, H. H. Kha, and D. T. Ngo, “Nonsmooth optimization for efficient beamforming in cognitive radio multicast transmission,” IEEE Trans. Signal Process., vol. 60, no. 6, pp. 2941–2951, Jun. 2012.
[9] R. A. Delgado, J. C. Agüero, and G. C. Goodwin, “A rank-constrained optimization approach: Application to factor analysis,” IFAC Proc. Vol., vol. 47, no. 3, pp. 10373–10378, 2014.
[10] L. Grippo and M. Sciandrone, “On the convergence of the block nonlinear Gauss–Seidel method under convex constraints,” Oper. Res. Lett., vol. 26, no. 3, pp. 127–136, 2000.
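
As a small illustration of steps 2 and 3 of Algorithm 1 in the letter above (Section III), the following sketch checks numerically that the constructed pair reproduces a given fully-digital beamformer, so the transmission power is unchanged. It is only a sketch under stated assumptions: the matrix standing in for the optimum of Problem P_FD is random (the method of [7] is not implemented), and all helper names (`random_matrix`, `matmul`, `hermitian`, `inverse`, `frob2`, `hybrid_from_fully_digital`) are hypothetical and written in plain Python for self-containment.

```python
import random


def random_matrix(rows, cols, rng):
    """Complex Gaussian matrix as a list of rows (illustrative helper)."""
    return [[complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(cols)]
            for _ in range(rows)]


def matmul(A, B):
    """Plain matrix product of two list-of-rows matrices."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def hermitian(A):
    """Conjugate transpose."""
    return [[x.conjugate() for x in col] for col in zip(*A)]


def inverse(A):
    """Gauss-Jordan inverse with partial pivoting (adequate for small matrices)."""
    n = len(A)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular to working precision")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def frob2(A):
    """Squared Frobenius norm."""
    return sum(abs(x) ** 2 for row in A for x in row)


def hybrid_from_fully_digital(W_D, N, rng):
    """Steps 2-3 of Algorithm 1: return (V, W) such that V W equals W_D."""
    K = len(W_D[0])
    if K > N:
        raise ValueError("the construction requires K <= N")
    W = random_matrix(N, K, rng)  # step 2: columns independent with probability 1
    W_H = hermitian(W)
    V = matmul(matmul(W_D, inverse(matmul(W_H, W))), W_H)  # step 3
    return V, W


rng = random.Random(0)
M, N, K = 6, 4, 3                      # small sizes chosen only for illustration
W_D = random_matrix(M, K, rng)         # assumption: stand-in for the optimum of P_FD
V, W = hybrid_from_fully_digital(W_D, N, rng)
VW = matmul(V, W)
max_err = max(abs(VW[i][j] - W_D[i][j]) for i in range(M) for j in range(K))
print(max_err, frob2(VW), frob2(W_D))
```

The check relies only on the identity $((\mathbf{W}^\star)^H \mathbf{W}^\star)^{-1} (\mathbf{W}^\star)^H \mathbf{W}^\star = \mathbf{I}$, which is why any full-column-rank choice in step 2 works and why the SINR constraints carry over from the fully-digital solution.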
On the Error Rate Analysis of Coded OFDM Over Multipath Fading Channels

Jinho Choi

Manuscript received August 10, 2018; revised October 15, 2018; accepted October 26, 2018. Date of publication October 30, 2018; date of current version April 9, 2019. The associate editor coordinating the review of this paper and approving it for publication was J. Mietzner.
The author is with the School of Information Technology, Deakin University, Burwood, VIC 3125, Australia (e-mail: jinho.choi@deakin.edu.au).
Digital Object Identifier 10.1109/LWC.2018.2878820

Abstract—In this letter, an (approximate) closed-form expression for the probability of codeword error is derived in coded orthogonal frequency division multiplexing over multipath fading channels. The derived error probability depends on the channel realization and can be used to decide the values of key parameters of finite-length codes and the signal-to-noise ratio so that a high reliability can be guaranteed in terms of the error probability for a given multipath fading channel.

Index Terms—Ultra-reliable and low-latency communications (URLLC), orthogonal frequency division multiplexing, multipath diversity.

I. INTRODUCTION

IN ORDER to support mission-critical communications in 5th generation (5G) systems (for industrial automation and control, remote control of real-time operations, and so on), the notion of ultra-reliable and low-latency communications (URLLC) has been considered [1], [2]. For URLLC, it is expected to use short packets based on finite-length codes [3], [4]. While a good channel code has to be employed for ultra-reliability, optimal or soft-decision decoding is desirable to avoid any performance loss due to hard-decision. For low-latency, it might be desirable to transmit packets without re-transmissions, where each packet can be transmitted with a high reliability over fading channels. To this end, we can consider orthogonal frequency division multiplexing (OFDM) where a set of subcarriers can be used simultaneously to transmit a block of bits or a packet as an OFDM symbol. For a high reliability, an OFDM symbol can be a codeword. In [5] and [6], it is shown that coded OFDM can have a good coded performance by exploiting the frequency diversity gain that is due to multipath fading.

Although coded OFDM can be an attractive approach to URLLC, there are a few difficulties. For example, the performance of coded OFDM is highly dependent on the channel realization. Thus, in order to guarantee a certain reliability, a closed-form expression for the probability of codeword error is highly desirable, since it allows the values of key parameters such as the code rate and transmit power to be decided in advance, depending on the realization of a multipath fading channel. That is, at a receiver, the channel state information (CSI) can be estimated, and using the expression for the probability of codeword error with the estimated CSI, the receiver can determine the values of key parameters and feed them back to the transmitter at the initial link setup (i.e., prior to data transmission) so that URLLC becomes possible with a guaranteed target probability of codeword error. Note that in order to use such a scheme, it is expected that the variation of CSI is negligible within the feedback interval.

The performance of coded OFDM over frequency-selective (or multipath) fading channels has been studied in the literature. For example, a performance analysis of coded OFDM is studied when hard-decision decoding is employed in [7]. In [8], based on an information-theoretic approach, an approximate outage probability is studied to be used to estimate the probability of codeword error for coded signals. In [9], the performance of coded OFDM is studied in the presence of an interferer with an expression that takes into account all possible error patterns. Unfortunately, existing approaches do not provide a reasonable estimate of the probability of codeword error in terms of key parameters for given channel realization when soft-decision decoding is employed at a receiver.

In this letter, we derive an upper-bound on the pairwise error probability (PEP) of a pair of codewords with a certain distance for a given channel realization. Using this upper-bound, key parameters can be decided to guarantee a certain target error rate. An estimate of the probability of codeword error is also obtained as a closed-form expression of key parameters, which can be used for any block codes. The closed-form expression is compared with simulation results that are obtained using the Bose, Chaudhuri, and Hocquenghem (BCH) code [10] with a soft-decision decoding technique. From the results, we can see that the derived expression can be used to predict the probability of codeword error (in terms of key parameters) for given channel realization, which is vital to guarantee a high reliability in terms of the probability of codeword error for coded OFDM over a multipath fading channel.

Notation: $\mathbb{E}[\cdot]$ denotes the statistical expectation and variance. $\mathcal{CN}(\mathbf{a}, \mathbf{R})$ represents the distribution of circularly symmetric complex Gaussian (CSCG) random vectors with mean vector $\mathbf{a}$ and covariance matrix $\mathbf{R}$. The Q-function is given by $Q(x) = \int_x^\infty \frac{1}{\sqrt{2\pi}} e^{-\frac{t^2}{2}}\,dt$.

II. SYSTEM MODEL

Suppose that an OFDM symbol corresponds to a codeword in coded OFDM. The length of an OFDM symbol is denoted by $L$ and the $l$th element is denoted by $s_l \in \mathcal{S}$, where $\mathcal{S}$ represents the signal constellation. Note that $L$ is also the number of subcarriers in the OFDM system. Then, the received signal at a receiver can be given by

$$y_l = H_l s_l + n_l, \quad l = 1, \ldots, L, \tag{1}$$
where $n_l \sim \mathcal{CN}(0, N_0)$ is the background noise and $H_l$ is the $l$th frequency-domain channel coefficient that is given by

$$H_l = \sum_{p=1}^{P} h_p e^{-\frac{j 2\pi (p-1)(l-1)}{L}}, \quad l = 1, \ldots, L. \tag{2}$$

Here, $P$ represents the length of the time-invariant channel impulse response (CIR), $\{h_p\}$, and $h_p$ represents the $p$th channel coefficient of a multipath fading channel. For convenience, let $\mathbf{f}_l = [1 \ e^{\frac{j 2\pi (l-1)}{L}} \ \ldots \ e^{\frac{j 2\pi (P-1)(l-1)}{L}}]^T$, where the superscript T (H) represents the transpose (the conjugate transpose, resp.). Then, $H_l = \mathbf{f}_l^H \mathbf{h}$, where $\mathbf{h} = [h_1 \ \ldots \ h_P]^T$.

It is assumed that a codeword is (randomly) interleaved and each group of $\log_2 |\mathcal{S}|$ bits is mapped into a symbol, $s_l$, as in bit-interleaved coded modulation (BICM) [11]. In addition, we assume a finite-length binary code. We also assume perfect CSI at the receiver. If the receiver can find the probability of codeword error for given CSI as a function of the key parameters of a given channel code and signal-to-noise ratio (SNR), the receiver can determine the required SNR or values of the parameters of the code and feed them back to the transmitter in the initial link setup so that a guaranteed performance in terms of the probability of codeword error can be achieved, as mentioned earlier.

III. ERROR RATE ANALYSIS OVER MULTIPATH FADING CHANNELS

In this section, we derive an (approximate) upper-bound on the probability of codeword error over a multipath fading channel when a finite-length code is employed.

A. PEP

Throughout this letter, we assume that quadrature phase shift keying (QPSK) is employed, i.e., $s_l \in \mathcal{S} = \{\pm\sqrt{E_s/2} \pm j\sqrt{E_s/2}\}$, where $E_s$ represents the symbol energy. For decoding, the phases of the channel coefficients are restored as follows:

$$x_l = e^{-j\theta_l} y_l = |H_l| s_l + e^{-j\theta_l} n_l,$$

where $\theta_l = \angle H_l$. Let $\mathbf{r} = [\Re(x_1) \ \Im(x_1) \ \ldots \ \Re(x_L) \ \Im(x_L)]^T$. Then, we have

$$\mathbf{r} = \sqrt{\frac{E_s}{2}} \mathbf{A}\mathbf{c} + \bar{\mathbf{n}}, \tag{3}$$

where $\mathbf{A} = \mathrm{diag}(|H_1|, |H_1|, \ldots, |H_L|, |H_L|)$ and $\bar{\mathbf{n}} \sim \mathcal{N}(\mathbf{0}, \frac{N_0}{2}\mathbf{I})$. Here, $\mathbf{c}$ is a sequence of $+1$ and $-1$, which can be seen as an interleaved codeword and is referred to as coded OFDM symbol. For convenience, denote by $a_l$ and $c_l$ the $l$th diagonal element of $\mathbf{A}$ and the $l$th element of $\mathbf{c}$, respectively. Let $N$ denote the length of the codeword. If $N$ is even, the length of $\mathbf{c}$ is $N = 2L$. On the other hand, if $N$ is odd, we assume that $N = 2L - 1$, where the last element of $\mathbf{c}$ is fixed (or set to 0).

Provided that there are only the two coded OFDM symbols, $\mathbf{c}$ and $\mathbf{c}'$, the PEP (which leads to the probability of codeword error when the maximum likelihood (ML) decoding is employed [10]) when $\mathbf{c}$ is transmitted and $\mathbf{c}'$ is decoded is given by

$$P(\mathbf{c} \to \mathbf{c}' \mid \mathbf{A}) = \Pr\Big( \big\|\mathbf{r} - \sqrt{\tfrac{E_s}{2}} \mathbf{A}\mathbf{c}\big\|^2 > \big\|\mathbf{r} - \sqrt{\tfrac{E_s}{2}} \mathbf{A}\mathbf{c}'\big\|^2 \Big) = Q\left( \sqrt{\frac{E_s \|\mathbf{A}(\mathbf{c} - \mathbf{c}')\|^2}{4 N_0}} \right). \tag{4}$$

B. Upper-Bound

Suppose that there are $d$ different elements between $\mathbf{c}$ and $\mathbf{c}'$ (here, $d$ can be seen as the Hamming distance between two codewords, $\mathbf{c}$ and $\mathbf{c}'$). Let $\mathcal{C}_d$ be the set of the difference vectors of all the possible pairs, $(\mathbf{c}, \mathbf{c}')$, with $d$ different elements. For convenience, let $\mathbf{e} = \mathbf{c} - \mathbf{c}'$. Clearly, $\mathcal{C}_d$ can be seen as the set of the permutations of

$$[\underbrace{\pm 2 \ \ldots \ \pm 2}_{d\text{-time}} \ \underbrace{0 \ \ldots \ 0}_{(N-d)\text{-time}}]^T.$$

Let

$$Y = \|\mathbf{A}(\mathbf{c} - \mathbf{c}')\|^2 = \|\mathbf{A}\mathbf{e}\|^2. \tag{5}$$

Then, the average PEP can be given by

$$P(d \mid \mathbf{A}) = \mathbb{E}_{\mathbf{c}, \mathbf{c}'}[P(\mathbf{c} \to \mathbf{c}' \mid \mathbf{A})] = \mathbb{E}_{\mathbf{e}}\left[ Q\left( \sqrt{\frac{E_s Y}{4 N_0}} \right) \right] \approx \mathbb{E}_Y\left[ \frac{1}{12} e^{-\frac{\gamma_b Y}{4}} + \frac{1}{4} e^{-\frac{\gamma_b Y}{3}} \right], \tag{6}$$

where the approximation is due to [12] and $\gamma_b = \frac{E_s}{2 N_0} = \frac{E_b}{N_0}$. Here, $E_b$ is the bit energy, which is $E_b = \frac{E_s}{2}$ since QPSK is employed (one symbol is equivalent to two bits).

Lemma 1:

$$\mathbb{E}[e^{-\lambda Y}] \leq 2 \prod_{n=1}^{N} \left( e^{-\kappa} + (1 - e^{-\kappa}) e^{-\lambda 4 |a_n|^2} \right), \tag{7}$$

where $\kappa = \frac{d}{N}$.

Proof: From (5), we can show that $Y = 4 \sum_{n=1}^{N} |a_n|^2 D_n$, where the random variables, $D_n \in \{0, 1\}$, are not independent as there should be $d$ non-zero elements. Due to random interleaving, $\{D_n\}$ can be seen as a random permutation of $d$ 1’s and $N - d$ 0’s. For a bound, we consider random variables $\{W_1, \ldots, W_N\}$, where $W_n$ is the number of balls in bin $n$ when $d$ balls are randomly thrown. Define

$$\zeta(W_n) = \begin{cases} 1, & \text{if } W_n \geq 1 \\ 0, & \text{o.w.} \end{cases}$$

The number of non-zero elements in $\{\zeta(W_1), \ldots, \zeta(W_N)\}$ is $d$ if each bin has at most one ball. However, there might be bins with more than one ball. Thus, the number of non-zero elements in $\{\zeta(W_1), \ldots, \zeta(W_N)\}$ is smaller than or equal to that in $\{D_n\}$. Thus, we have

$$\mathbb{E}[f(D_1, \ldots, D_N)] \leq \mathbb{E}[f(W_1, \ldots, W_N)], \tag{8}$$

where

$$f(X_1, \ldots, X_N) = e^{-\lambda \sum_{n=1}^{N} 4 |a_n|^2 \zeta(X_n)}.$$
We can see that $f(W_1, \ldots, W_N)$ is a nonnegative function and $E[f(W_1, \ldots, W_N)]$ is monotonically decreasing in κ or d (since the probability that $W_n \ge 1$ increases with d). Thus, according to [13, Th. 5.10], the following inequality can be obtained:

$$E[f(W_1, \ldots, W_N)] \le 2E[f(Z_1, \ldots, Z_N)], \tag{9}$$

where $Z_n$ is an independent Poisson random variable with mean $\kappa = \frac{d}{N}$. Since

$$E[e^{-\lambda 4|a_n|^2 \zeta(Z_n)}] = e^{-\kappa} + (1 - e^{-\kappa})\, e^{-\lambda 4|a_n|^2}, \tag{10}$$

we have (7) by substituting (10) into (9).
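As a quick sanity check of Lemma 1, the sketch below compares the exact left-hand side of (7), obtained by enumerating every placement of d differing positions, with the right-hand side. The gain values in `gains` and the function names are hypothetical and chosen only for illustration; the enumeration is practical only for small N.

```python
import itertools
import math

def lemma1_exact_lhs(a, d, lam):
    """Exact E[exp(-lam * Y)] with Y = 4 * sum_n |a_n|^2 * D_n, averaged over
    all placements of d ones among N positions (small N only)."""
    a2 = [abs(x) ** 2 for x in a]
    values = [math.exp(-lam * 4.0 * sum(a2[n] for n in idx))
              for idx in itertools.combinations(range(len(a2)), d)]
    return sum(values) / len(values)

def lemma1_rhs(a, d, lam):
    """Right-hand side of (7) with kappa = d / N."""
    ek = math.exp(-d / len(a))
    return 2.0 * math.prod(
        ek + (1.0 - ek) * math.exp(-lam * 4.0 * abs(x) ** 2) for x in a)

# Hypothetical channel gains |a_n| (not taken from the letter).
gains = [0.3, 0.8, 1.1, 0.5, 1.4, 0.9, 0.2, 1.0]
lhs = lemma1_exact_lhs(gains, d=3, lam=0.25)
rhs = lemma1_rhs(gains, d=3, lam=0.25)
print(lhs, rhs, lhs <= rhs)
```

The check only illustrates the inequality for one set of gains; it does not replace the proof.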
Fig. 1. PEP as a function of distance d for a channel realization with P = 5 when L = 128, N = 256, and Eb/N0 = −6 dB.

For convenience, let

$$\psi(x, \kappa) = \prod_{n=1}^{N} \left(e^{-\kappa} + (1 - e^{-\kappa})\, e^{-|a_n|^2 x}\right). \tag{11}$$

Then, applying (7) to (6), we have

$$P(d\,|\,A) \le \frac{\psi(\gamma_b, \kappa)}{6} + \frac{\psi\!\left(\frac{4\gamma_b}{3}, \kappa\right)}{2}. \tag{12}$$
Note that (12) is an upper bound on the conditional PEP for
given A. If the number of paths, P, is sufficiently large, the
average PEP over A can approximate the conditional PEP
(thanks to the law of large numbers [13]). However, for a
small P, since the PEP is highly dependent on the channel
realization, in order to firmly guarantee a certain error rate, we
should consider the conditional PEP for given CSI or channel
realization as (12).
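The bound in (12) is simple to evaluate once the diagonal gains are known. The following is a minimal sketch, assuming the receiver holds the gains $|a_n|$ as a plain list; the function names `psi` and `pep_bound` and the example gains are illustrative and not part of the letter.

```python
import math

def psi(x, kappa, a):
    """psi(x, kappa) from (11): product over n of
    e^-kappa + (1 - e^-kappa) * exp(-|a_n|^2 * x)."""
    ek = math.exp(-kappa)
    return math.prod(ek + (1.0 - ek) * math.exp(-(abs(an) ** 2) * x) for an in a)

def pep_bound(gamma_b, d, a):
    """Upper bound (12) on the conditional PEP for distance d, with kappa = d / N."""
    kappa = d / len(a)
    return psi(gamma_b, kappa, a) / 6.0 + psi(4.0 * gamma_b / 3.0, kappa, a) / 2.0

# Hypothetical gains; Eb/N0 = -6 dB converted to linear scale.
example_gains = [0.4, 1.2, 0.9, 0.7, 1.5, 0.3, 1.0, 0.8]
gamma_b = 10 ** (-6 / 10)
print(pep_bound(gamma_b, d=3, a=example_gains))
```

Because every factor in (11) decreases as x grows, the bound decreases monotonically with $\gamma_b$ for fixed d and A.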
Suppose that the number of codewords is $2^K$ (here, K is the number of message bits) and the minimum distance is $d_{\min}$. That is, an $(N, K, d_{\min})$ code is assumed. Then, the probability of codeword error (for the ML decoding) is bounded and approximated as [10]

$$P_{\rm err} \le \sum_{c} \sum_{c' \ne c} P(c \to c'\,|\,A) \Pr(c) \approx B_{\min} \left(\frac{\psi(\gamma_b, \kappa_{\min})}{6} + \frac{\psi\!\left(\frac{4\gamma_b}{3}, \kappa_{\min}\right)}{2}\right), \tag{13}$$

where $\kappa_{\min} = \frac{d_{\min}}{N}$ and $B_{\min}$ is the weight of the minimum distance $d_{\min}$. If (13) is sufficiently tight, it can be used at the receiver to predict the performance as a function of $(N, K, d_{\min})$ as well as SNR, $\gamma_b$. Thus, as mentioned earlier, the receiver is able to decide the values of parameters (of code or SNR) and feed them back to the transmitter so that it can set the values of parameters for a target performance in terms of the probability of codeword error, which is usually low for highly reliable transmissions.

IV. SIMULATION RESULTS

In this section, we present simulation results to see whether or not the bound in (12) and the approximate probability of codeword error in (13) are reasonably tight. For simulations, the time-domain channel coefficients, $\{h_p\}$, are assumed to be zero-mean CSCG random variables with $E[|h_p|^2] = \frac{1}{P}$.

Fig. 2. Average PEP as a function of SNR (over 100 different channel realizations with P = 5) when L = 128, N = 256, and d = 31.

Fig. 1 shows the PEP for a channel realization, i.e., for a fixed CIR, $\{h_p\}$, with P = 5, as a function of d when L = 128, N = 256, and Eb/N0 = −6 dB. The locations of d different elements between c and c′ are random (for each value of d, an averaged PEP over different realizations of pairs (c, c′) with d differences is obtained). The bound in Fig. 1 is given by (12) (for the same channel realization). We can see that the upper bound in (12) is reasonably tight when d is not too large. In Fig. 2, the average PEP is shown as a function of SNR when P = 5, L = 128, N = 256, and d = 31. For each average PEP, we use 100 different channel realizations. We can also see that the bound is reasonably tight for a wide range of SNR.

In [4], among various existing finite-length channel codes, it is shown that the BCH codes outperform other well-known codes. Furthermore, although the ML decoding is computationally infeasible, the BCH codes have reasonably good low-complexity soft-decision decoding techniques such as the ordered statistics based decoding approach proposed in [14]. Thus, we consider short-length BCH codes for simulations. In Fig. 3, we present simulation results with two BCH codes,
CHOI: ON ERROR RATE ANALYSIS OF CODED OFDM OVER MULTIPATH FADING CHANNELS 539
i.e., $(N, K, d_{\min})$ = (127, 50, 27) and (127, 36, 31), when P = 5, L = 64, and N = 127, where the ordered statistics based decoding approach with order 2 and 3 is used. We consider a number of different channel realizations ($10^5$ different channel realizations) to obtain the average (codeword) error probability. For the bound, we use (13). While the decoding performance can be improved with a higher computational complexity (from order 2 to order 3), it is not the ML performance [14]. As a result, the bound obtained from (13) can be lower than simulation results at high SNRs. In general, from Fig. 3, we can see that (13) can be used to predict the probability of codeword error for a wide range of its value (from $10^{-1}$ to $10^{-5}$).

Fig. 3. Average probability of codeword error of BCH codes (with the bound from (13)) as a function of SNR when P = 5 and L = 64: (a) (127, 50, 27) BCH code; (b) (127, 36, 31) BCH code.

V. CONCLUDING REMARKS

In this letter, we studied coded OFDM over multipath fading channels and derived an upper-bound on the PEP for given CSI. Based on the upper-bound, we also derived an approximate (upper-bound on) probability of codeword error, from which key parameters of the block code can be decided to guarantee highly reliable transmissions. Simulation results showed that the derived expression for the probability of codeword error can predict the performance of (short-length) BCH codes with soft-decision decoding for a wide range of target probability of codeword error (between $10^{-1}$ and $10^{-5}$).

REFERENCES

[1] C. She, C. Yang, and T. Q. S. Quek, “Radio resource management for ultra-reliable and low-latency communications,” IEEE Commun. Mag., vol. 55, no. 6, pp. 72–78, Jun. 2017.
[2] P. Popovski et al., “Wireless access for ultra-reliable low-latency communication: Principles and building blocks,” IEEE Netw., vol. 32, no. 2, pp. 16–23, Mar./Apr. 2018.
[3] G. Durisi, T. Koch, and P. Popovski, “Toward massive, ultrareliable, and low-latency wireless communication with short packets,” Proc. IEEE, vol. 104, no. 9, pp. 1711–1726, Sep. 2016.
[4] M. Shirvanimoghaddam et al., “Short block-length codes for ultra-reliable low-latency communications,” CoRR, vol. abs/1802.09166, Sep. 2018.
[5] H. Sari, G. Karam, and I. Jeanclaude, “Transmission techniques for digital terrestrial TV broadcasting,” IEEE Commun. Mag., vol. 33, no. 2, pp. 100–109, Feb. 1995.
[6] J. Choi, Adaptive and Iterative Signal Processing in Communications. Cambridge, U.K.: Cambridge Univ. Press, 2006.
[7] Y. H. Kim, I. Song, H. G. Kim, T. Chang, and H. M. Kim, “Performance analysis of a coded OFDM system in time-varying multipath Rayleigh fading channels,” IEEE Trans. Veh. Technol., vol. 48, no. 5, pp. 1610–1615, Sep. 1999.
[8] J. Zheng and S. L. Miller, “Performance analysis of coded OFDM systems over frequency-selective fading channels,” in Proc. IEEE Glob. Telecommun. Conf. (GLOBECOM), vol. 3, Dec. 2003, pp. 1623–1627.
[9] C. Snow, L. Lampe, and R. Schober, “Error rate analysis for coded multicarrier systems over Quasi-static fading channels,” IEEE Trans. Commun., vol. 55, no. 9, pp. 1736–1746, Sep. 2007.
[10] S. Lin and D. Costello, Error Control Coding: Fundamentals and Applications, 2nd ed. Upper Saddle River, NJ, USA: Prentice-Hall, 2004.
[11] A. Guillén i Fàbregas, A. Martinez, and G. Caire, “Bit-interleaved coded modulation,” in Foundations and Trends on Communications and Information Theory. vol. 5, Boston, MA, USA: Now, 2008, pp. 1–153.
[12] M. Chiani and D. Dardari, “Improved exponential bounds and approximation for the Q-function with application to average error probability computation,” in Proc. IEEE Glob. Telecommun. Conf. (GLOBECOM), vol. 2, Nov. 2002, pp. 1399–1402.
[13] M. Mitzenmacher and E. Upfal, Probability and Computing: Randomized Algorithms and Probability Analysis. Cambridge, U.K.: Cambridge Univ. Press, 2005.
[14] M. P. C. Fossorier and S. Lin, “Soft-decision decoding of linear block codes based on ordered statistics,” IEEE Trans. Inf. Theory, vol. 41, no. 5, pp. 1379–1396, Sep. 1995.
Adaptive AoA and Polarization Estimation for Receiving Polarized mmWave Signals

Hang Li, Thomas Q. Wang, Xiaojing Huang, Senior Member, IEEE, and Y. Jay Guo, Fellow, IEEE

Abstract—This letter proposes a novel hybrid dual-polarized antenna array which exploits two orthogonally collocated dipoles to capture the full power of a polarized millimeter wave signal. To maximize the received signal-to-noise ratio (SNR), we study the adaptive angle-of-arrival and polarization state estimation, and develop a differential beam tracking algorithm and a cross-correlation-to-power ratio polarization tracking algorithm for interleaved hybrid dual-polarized arrays. Simulation results verify the superior performance of the proposed algorithms, and confirm the significant improvement of SNR obtained by using the proposed array and algorithms.

Index Terms—Hybrid dual-polarized antenna array, angle-of-arrival estimation, polarization state estimation, and mmWave.

I. INTRODUCTION

MILLIMETER wave (mmWave) communication is an enabling technology for future wireless systems [1]. Ranging from 30 to 300 GHz, the spectrum of mmWave provides an unprecedented resource to support extremely fast data transmission. In addition, the vast spectrum is an effective supplement to the currently saturated radio frequency (RF) bands (700 MHz to 2.6 GHz) for wireless communications. As a result, a wide range of applications based on mmWave, e.g., multiple-input multiple-output communications [2] and target tracking [3], have been increasingly studied in recent years.

The short wavelength of mmWave enables massive antenna arrays in the transceivers to overcome the path loss, and to provide large-scale spatial multiplexing and highly directional beamforming. A massive array, consisting of hundreds of antenna elements, can typically be configured as either a digital array or a hybrid array. The digital array has each antenna connected to a dedicated RF chain, allowing joint digital signal processing among different antennas. This potentially leads to the best performance in terms of data rate, but results in prohibitive costs in hardware implementation and real-time signal processing [1]. Alternatively, the array can employ hybrid architectures to reduce the associated costs. For hybrid architectures, analog beamforming is performed on top of the signal processing in the digital domain. It is typically implemented by using phase shifters to connect the RF chains to the antenna elements. Depending on the connections, a hybrid array can be configured to be either fully connected [1] or partially connected [2]. In this letter, we focus on a partially connected array because it has the lowest hardware complexity compared with other architectures.

The existing massive arrays typically employ single-polarized antennas to receive the electric field of an incident wave. Since the microwave channels are rich in scatterers, the antennas can normally receive signals with sufficient power regardless of polarization matching. This, however, is not applicable in mmWave channels where the line-of-sight (LOS) component dominates [2] and thus necessitates the alignment of polarization directions between the antennas and the incident wave. To address this issue, dual-polarized antennas have been used to collect the entire signal power for the subsequent processing [4]. Crossed dipole antennas are an effective approach to providing dual polarization with a wide frequency range from RF to mmWave [5]. Therefore, we propose to employ antenna elements with dual-polarized crossed dipoles for a hybrid antenna array.

The use of dual-polarized antennas in an mmWave hybrid array is at an early stage where many open problems need to be addressed. These include the angle-of-arrival (AoA) and polarization state estimations which play critical roles in optimizing the signal-to-noise ratio (SNR) at the decoder. In the existing literature, the estimations have been studied in two contexts: the fully digital arrays with dual-polarized antennas [6]–[8] and the hybrid arrays with single-polarized antennas [3]. For the former, the AoA and polarization state are estimated by using ESPRIT and MUSIC-based algorithms. However, these algorithms require the computation of the covariance matrix and its singular value decomposition, the cost of which grows cubically with the total number of antennas. As a result, they can only be applied to arrays of modest size. For the latter, only AoA can be estimated because of the use of single-polarized antennas. Recent research has reported a variety of methods in this context, including cross-correlation [3], [9] and MUSIC-based [10] algorithms. It is shown that the cross-correlation based algorithm requires only scalar multiplications and additions, and thus has much lower computational complexity than its counterparts based on ESPRIT and MUSIC. For polarized signals, the accuracy of the estimation depends on the received power which may suffer a loss due to the polarization mismatch. This issue can be addressed by using dual-polarized antennas.

In this letter, we propose a hybrid dual-polarized adaptive antenna array to mitigate the degradation of SNR caused by the polarization mismatch. We present the estimations of AoA and polarization state, and demonstrate the associated mean square errors (MSEs). We show that by using the hybrid dual-polarized antenna array, improved accuracy of AoA estimation can be achieved compared with the existing work [3]. It is also shown that the polarization state of an arbitrarily polarized wave can be estimated, leading to an enhanced SNR at the decoder after combining with the AoA estimation.

Manuscript received September 18, 2018; revised October 22, 2018; accepted October 26, 2018. Date of publication November 1, 2018; date of current version April 9, 2019. This work was supported by the Australian Research Council Discovery Project under Grant DP160101693. The associate editor coordinating the review of this paper and approving it for publication was Y. Gao. (Corresponding author: Hang Li.)
The authors are with the Global Big Data Technologies Centre, University of Technology Sydney, Ultimo, NSW 2007, Australia (e-mail: hang.li@uts.edu.au; qian.wang@uts.edu.au; xiaojing.huang@uts.edu.au; jay.guo@uts.edu.au).
Digital Object Identifier 10.1109/LWC.2018.2879010
LI et al.: ADAPTIVE AoA AND POLARIZATION ESTIMATION FOR RECEIVING POLARIZED mmWAVE SIGNALS 541
Fig. 1. Illustration of a linear hybrid dual-polarized antenna array of two interleaved subarrays.

II. SYSTEM MODELS

A. Hybrid Dual-Polarized Array of Subarrays

We study a hybrid antenna array which consists of M analog subarrays, each having N isotropic antenna elements with omni-directional radiation patterns. Unlike the conventional hybrid antenna array in [3], each element employs two spatially collocated dipoles, denoted by x-axis and y-axis dipoles and placed orthogonally along the directions of x-axis and y-axis in Cartesian coordinates, respectively. This indicates that an incident wave can simultaneously stimulate two signals in an antenna, each from a dipole. As a result, with an antenna array employing M subarrays, two sets of signals collected from all the x- and y-axis dipoles respectively can be formed. From the perspective of parameter estimation, they can be used to estimate the AoA and polarization state after being separately and identically processed in analog and digital domains. In addition, they also represent two replicas of a signal, and thus can be combined using the estimates to enhance the SNR at the decoder. An example of a linear hybrid dual-polarized antenna array with two interleaved subarrays [3] is shown in Fig. 1, where the RF and down conversion components are omitted for simplicity. The signal processing modules for x- and y-axis dipoles are also included in the figure, with those for x-axis dipoles elaborated in the dashed boxes and those for y-axis dipoles enclosed in the solid ones.

B. Received Signal Models

We consider a transverse electromagnetic wave with a central wavelength of λ. It carries a narrow-band signal, $\tilde{s}(t)$, and is incident onto the array with an elevation angle θ. Here we ignore the multipath components as the mmWave channel is typically dominated by LOS. The electric field vector e of the wave is expressed in Cartesian coordinates as $e = e_x v_x + e_y v_y + e_z v_z$ [7], where v is a unit vector along the subscript's coordinate, and $[e_x, e_y, e_z] = [\sin\gamma \cos\theta\, e^{j\eta},\ \cos\gamma,\ -\sin\gamma \sin\theta\, e^{j\eta}]$ denote the responses of the corresponding subscripts. $\gamma \in [0, \frac{\pi}{2})$ and $\eta \in [-\pi, \pi)$ represent the auxiliary polarization angle and the polarization phase difference [7] respectively, which uniquely determine the polarization state of a wave.

We assume the phase shifters take the values $\alpha_0, \ldots, \alpha_{N-1}$ in all the subarrays, and ignore the mutual couplings between the elements [3] and between the dipoles [7]. The signals received by x- and y-axis dipoles are passed through the analog beamformers and downconverted to baseband, followed by analog-to-digital (A/D) conversion. The samples associated with the mth beamformer can be expressed as

$$[s_x^m[i], s_y^m[i]] = [e_x, e_y] \cdot \tilde{s}[i]\, e^{j2\pi f_D T i}\, P_s(\theta)\, e^{j\frac{2\pi}{\lambda} m d_s \sin\theta} + [n_x^m[i], n_y^m[i]], \tag{1}$$

where $f_D$, T and $d_s$ denote the Doppler frequency shift, the sampling interval and the adjacent subarray spacing, respectively. $P_s(\theta)$ given by

$$P_s(\theta) = \frac{1}{N} \sum_{n=0}^{N-1} e^{j\left(\frac{2\pi}{\lambda} n d_e \sin\theta + \alpha_n\right)} = \frac{\sin\!\left(N \frac{\pi}{\lambda} d_e (\sin\theta - \sin\theta')\right)}{N \sin\!\left(\frac{\pi}{\lambda} d_e (\sin\theta - \sin\theta')\right)}$$

denotes the normalized subarray radiation pattern. Given the spacing between the neighbouring elements in a subarray, $d_e$, its value depends on those of the phase shifters, $\alpha_n = -\frac{2\pi}{\lambda} n d_e \sin(\theta')$, $n = 0, \ldots, N-1$, where θ′ denotes the angle in which the main beam of the subarray is directed. $\tilde{s}[i]$ denotes the sampled signal of $\tilde{s}(t)$ with average power $E[|\tilde{s}[i]|^2] = \sigma_{\tilde{s}}^2$, and $[n_x^m[i], n_y^m[i]]$ are the zero-mean complex additive white Gaussian noises corresponding to the x- and y-axis dipoles at the outputs of the mth subarray with identical power $\sigma_n^2$.

As shown in Fig. 1, the samples from the outputs of each beamformer are used to estimate the AoA and update the phase shifters accordingly. The samples from x- and y-axis dipoles are weighted and summed separately by digital beamformers as $[s_x[i], s_y[i]] = \sum_{m=0}^{M-1} w_m [s_x^m[i], s_y^m[i]]$, where $w_0, \ldots, w_{M-1}$ denote the weights used to align the phases of the outputs of analog beamformers. The resulting samples, $[s_x[i], s_y[i]]$, are used to estimate the polarization state and also for maximal ratio combining (MRC) which produces a combined sample given by $s[i] = \kappa_x s_x[i] + \kappa_y s_y[i]$, where $\kappa_x$ and $\kappa_y$ are the MRC coefficients. It is seen from (1) that $[\kappa_x, \kappa_y]$ depend on both the AoA and polarization state which can be estimated using the proposed array.

III. AOA AND POLARIZATION ESTIMATIONS

To achieve the maximum received SNR at the decoder, the array gradually adjusts its analog and digital beamformers and the MRC combiner based on the knowledge of the AoA and polarization state estimated from the received signals. The adjustment is conducted by using an iterative procedure. In each iteration, the estimations of AoA and polarization state are performed followed by the adjustment of the array.

A. Differential Beam Tracking

Now we study the AoA estimation using an interleaved hybrid dual-polarized array. From (1), it can be seen that the AoA of the received signal is included in the phases of the output signals of the subarrays. Therefore, the cross-correlations of any two adjacent subarrays can be used to estimate the AoA. This is known as the differential beam tracking (DBT) which is developed in the context of single-polarized antennas in [3]. When cross-dipole antennas are used, signals from both x- and y-axis dipoles can be used for the estimation. Assuming that the noise components in different subarrays are independent, the cross-correlations can be accordingly expressed as

$$[R_x, R_y] = \left[E[(s_x^m[i])^* s_x^{m+1}[i]],\ E[(s_y^m[i])^* s_y^{m+1}[i]]\right] = [|e_x|^2, |e_y|^2] \cdot \sigma_{\tilde{s}}^2\, |P_s(\theta)|^2\, e^{ju}, \tag{2}$$
where $(\cdot)^*$ and $|(\cdot)|$ represent the conjugate and absolute value of $(\cdot)$, respectively. The AoA is contained in both $R_x$ and $R_y$ in the form of $u = \frac{2\pi}{\lambda} d_s \sin\theta$.

The expectation with respect to (w.r.t.) the statistics of the signals shown in (2) can be approximated by the arithmetic average w.r.t. the time and subarrays. Using adaptive filtering theory, the cross-correlation evaluated based on the first i received symbols, $\hat{R}^{(i)}$, can be expressed as a weighted sum of that on the first i−1 symbols, $\hat{R}^{(i-1)}$, and an update, $\Delta\hat{R}^{(i)}$, depending on the ith symbol only, i.e.,

$$\hat{R}^{(i)} = (1 - \mu)\hat{R}^{(i-1)} + \mu \Delta\hat{R}^{(i)}, \tag{3}$$

where μ denotes the updating coefficient, ranging from zero to one. The updates for x- and y-axis dipole signals can be expressed as $\Delta\hat{R}_x^{(i)} = \sum_{m=0}^{M-1} (s_x^m[i])^* s_x^{m+1}[i]$ and $\Delta\hat{R}_y^{(i)} = \sum_{m=0}^{M-1} (s_y^m[i])^* s_y^{m+1}[i]$, respectively [3]. They both consist of a wanted component with a phase u (see (2)), and a noise component with random and different phases. Therefore, the sum of $\Delta\hat{R}_x^{(i)}$ and $\Delta\hat{R}_y^{(i)}$ will have the wanted components added in phase, leading to enhanced SNR for the estimation of AoA included in the phase. As a result, $\Delta\hat{R}^{(i)}$ can be formed as $\Delta\hat{R}_x^{(i)} + \Delta\hat{R}_y^{(i)}$ which results in the estimated AoA based on the first i symbols given by $\hat{u} = \arg\{\hat{R}^{(i)}\}$. Note that compared with a conventional array [3], the computation of the cross-correlation involves approximately twice the computing load. However, we will show in simulation results that this leads to improved estimation performance and SNRs in return.

B. Cross-Correlation-to-Power Ratio Polarization Tracking

The AoA estimation is followed by the adjustment of digital and analog beamformers. Using the adjusted weights given by $\hat{w}_m = \frac{1}{M} e^{-jm\hat{u}}$, $m = 0, \ldots, M-1$, the outputs of the digital beamformer, $[s_x[i], s_y[i]]$, can be expressed as

$$[s_x[i], s_y[i]] = \underbrace{[e_x, e_y] \cdot \tilde{s}[i]\, e^{j2\pi f_D T i}\, P_s(\theta) \cdot \frac{1}{M} \sum_{m=0}^{M-1} e^{jm(u - \hat{u})}}_{\text{signal component}} + \underbrace{\frac{1}{M} \sum_{m=0}^{M-1} [n_x^m[i], n_y^m[i]]\, e^{-jm\hat{u}}}_{\text{noise component}}, \tag{4}$$

which indicates that the polarization state information is contained in the complex amplitude of the signal component, the cross-correlation and powers of which are given by

$$[Q_{xy}, P_x, P_y] = [e_x e_y^*, |e_x|^2, |e_y|^2]\, \frac{\sigma_{\tilde{s}}^2 |P_s(\theta)|^2}{M^2} \left|\sum_{m=0}^{M-1} e^{jm(u - \hat{u})}\right|^2. \tag{5}$$

Therefore, when $\gamma \ne 0$ and $\theta \ne \pm\frac{\pi}{2}$, the polarization state, η and γ, can be evaluated by using either one of the two cross-correlation-to-power ratios: 1) $W_x = \frac{Q_{xy}}{P_x} = \frac{e_y^*}{e_x^*} = \frac{e^{j\eta}}{\tan\gamma \cos\theta}$ and 2) $W_y = \frac{Q_{xy}}{P_y} = \frac{e_x}{e_y} = \tan\gamma \cos\theta\, e^{j\eta}$, which lead to the estimated polarization state respectively given by

$$\gamma = \arctan\!\left(\frac{|W_x|^{-1}}{\cos\theta}\right), \quad \eta = \arg\{W_x\} \tag{6}$$

and

$$\gamma = \arctan\!\left(\frac{|W_y|}{\cos\theta}\right), \quad \eta = \arg\{W_y\}, \tag{7}$$

where cos θ can be obtained from $\hat{u}$, as $\cos\theta = \sqrt{1 - \hat{u}^2/\pi^2}$. As a result, it is referred to as cross-correlation-to-power ratio polarization tracking (CPRPT).

In the presence of the noise component, the cross-correlation and powers, $[Q_{xy}, P_x, P_y]$, must be estimated. Denoting their estimates after receiving i symbols by $[\hat{Q}_{xy}^{(i)}, \hat{P}_x^{(i)}, \hat{P}_y^{(i)}]$, they can be expressed as

$$[\hat{Q}_{xy}^{(i)}, \hat{P}_x^{(i)}, \hat{P}_y^{(i)}] = (1 - \beta)[\hat{Q}_{xy}^{(i-1)}, \hat{P}_x^{(i-1)}, \hat{P}_y^{(i-1)}] + \beta\left[s_x[i](s_y[i])^*,\ |s_x[i]|^2 - \frac{\sigma_n^2}{M},\ |s_y[i]|^2 - \frac{\sigma_n^2}{M}\right], \tag{8}$$

where β (0 < β < 1) is the updating coefficient, and $\frac{\sigma_n^2}{M}$ is the power of the noise component in (4). The noise component also results in different values for the estimates returned by (6) and (7). Their accuracies depend on the SNRs of $s_x[i]$ and $s_y[i]$, respectively, which can be estimated by using the estimates, $\hat{P}_x^{(i)}$ and $\hat{P}_y^{(i)}$, obtained from (8). As identical power is assumed for the noise component in $s_x[i]$ and $s_y[i]$, $\hat{P}_x^{(i)}$ and $\hat{P}_y^{(i)}$ can be used to indicate the SNRs. Therefore, (6) is used for the estimation if $\hat{P}_x^{(i)} \ge \hat{P}_y^{(i)}$, and (7), otherwise.

The AoA and polarization state estimation algorithms are summarized in Algorithm 1. It is noted that the proposed algorithms are blind adaptive since no knowledge about the reference signal is assumed. As shown in (2) and (5), the Doppler frequency shift $f_D$ does not affect the cross-correlations $[R_x, R_y]$ in DBT, or the cross-correlation and powers $[Q_{xy}, P_x, P_y]$ in CPRPT, since the phase shift produced by $f_D$ is the same for all subarray signals corresponding to both x- and y-axis dipoles. As a result, both AoA and polarization state estimations are Doppler resilient. In addition, the proposed algorithms can be applied for tracking the AoA and polarization state. The selection of updating coefficients has an impact on both the convergence speed and variance of the estimated parameters. In general, a larger coefficient will speed up convergence but lead to a larger variance, whereas a smaller coefficient will slow convergence but reduce the variance. To maximize the received SNR, we coherently combine the two beamformed output signals $s_x[i]$ and $s_y[i]$. After applying the MRC coefficients $[\kappa_x, \kappa_y] = [e_x^*, e_y^*]/\sqrt{|e_x|^2 + |e_y|^2}$ to $[s_x[i], s_y[i]]$, the SNR of the combined signal s[i] equals the sum of two output SNRs.

Algorithm 1 The AoA and Polarization State Estimation
Initialization: $\hat{u}^{(0)}$, $\hat{R}^{(0)}$, $\hat{Q}_{xy}^{(0)}$, $\hat{P}_x^{(0)}$ and $\hat{P}_y^{(0)}$;
For i = 1 : I (I is the number of iterations) do
  1. Set $\hat{\alpha}_n^{(i)} = -nM\hat{u}^{(i-1)}$, n = 0, ..., N − 1;
  2. Update $[s_x^m[i], s_y^m[i]]$ using $\hat{\alpha}_n^{(i)}$ in (1);
  3. Calculate $\hat{u}^{(i)} = \arg\{\hat{R}^{(i)}\}$ using (3);
  4. Update $\hat{w}_m^{(i)} = \frac{1}{M} e^{-jm\hat{u}^{(i)}}$, m = 0, ..., M − 1;
  5. Update $[s_x[i], s_y[i]]$ using $\hat{w}_m^{(i)}$ in (4);
  6. Update $\hat{Q}_{xy}^{(i)}$, $\hat{P}_x^{(i)}$, $\hat{P}_y^{(i)}$ using (8);
  7. Determine $\hat{\gamma}^{(i)}$ and $\hat{\eta}^{(i)}$ using (6) or (7).
End for
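The recursions of Algorithm 1 can be sketched in a few lines. The sketch below is a simplified illustration under stated assumptions, not the authors' simulator: it assumes $d_s = \lambda/2$, takes $P_s(\theta) = 1$ (so the analog phase-shifter update in steps 1 and 2 is skipped), omits the Doppler term, and sums the cross-correlation over adjacent subarray pairs m = 0, ..., M − 2. The function names `cgauss` and `track` and all default parameter values are hypothetical.

```python
import cmath
import math
import random

def cgauss(rng, power):
    """Zero-mean circular complex Gaussian sample with the given average power."""
    s = math.sqrt(power / 2.0)
    return complex(rng.gauss(0.0, s), rng.gauss(0.0, s))

def track(theta, gamma, eta, M=4, sigma_n2=0.01, iters=2000,
          mu=0.01, beta=0.01, seed=1):
    """Run DBT (3) and CPRPT (6)-(8); returns (theta_hat, gamma_hat, eta_hat)."""
    rng = random.Random(seed)
    # Field components from the received signal model.
    ex = math.sin(gamma) * math.cos(theta) * cmath.exp(1j * eta)
    ey = complex(math.cos(gamma), 0.0)
    u = math.pi * math.sin(theta)  # u = (2*pi/lambda)*d_s*sin(theta), d_s = lambda/2
    R = Q = 0j
    Px = Py = 0.0
    u_hat = 0.0
    for _ in range(iters):
        s = cgauss(rng, 1.0)  # unit-power signal sample
        # Subarray outputs as in (1), assuming P_s(theta) = 1 and no Doppler.
        sx = [ex * s * cmath.exp(1j * m * u) + cgauss(rng, sigma_n2) for m in range(M)]
        sy = [ey * s * cmath.exp(1j * m * u) + cgauss(rng, sigma_n2) for m in range(M)]
        # DBT: correlate adjacent subarrays, add x and y updates, apply (3).
        dR = sum(sx[m].conjugate() * sx[m + 1] + sy[m].conjugate() * sy[m + 1]
                 for m in range(M - 1))
        R = (1.0 - mu) * R + mu * dR
        u_hat = cmath.phase(R)
        # Digital beamforming with weights w_m = exp(-j*m*u_hat) / M.
        w = [cmath.exp(-1j * m * u_hat) / M for m in range(M)]
        bx = sum(wm * v for wm, v in zip(w, sx))
        by = sum(wm * v for wm, v in zip(w, sy))
        # CPRPT: recursive estimates (8), removing the noise power sigma_n^2 / M.
        Q = (1.0 - beta) * Q + beta * bx * by.conjugate()
        Px = (1.0 - beta) * Px + beta * (abs(bx) ** 2 - sigma_n2 / M)
        Py = (1.0 - beta) * Py + beta * (abs(by) ** 2 - sigma_n2 / M)
    cos_t = math.sqrt(max(0.0, 1.0 - (u_hat / math.pi) ** 2))
    if Px >= Py:  # use ratio W_x and (6)
        W = Q / Px
        gamma_hat = math.atan(1.0 / (abs(W) * cos_t))
    else:         # use ratio W_y and (7)
        W = Q / Py
        gamma_hat = math.atan(abs(W) / cos_t)
    return math.asin(u_hat / math.pi), gamma_hat, cmath.phase(W)

theta_hat, gamma_hat, eta_hat = track(0.3, 3 * math.pi / 10, -math.pi / 2)
print(theta_hat, gamma_hat, eta_hat)
```

Starting the recursions from zero does not bias the estimates here, since the phase of $\hat{R}^{(i)}$ and the ratios $\hat{Q}_{xy}/\hat{P}_x$ and $\hat{Q}_{xy}/\hat{P}_y$ are unaffected by a common scale factor.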
IV. SIMULATION RESULTS

In this section, we present the simulation results of the proposed algorithms. A linear hybrid array with four subarrays, each consisting of 16 dual-polarized antennas with $d_s = \frac{\lambda}{2}$, is studied. An elliptically-polarized wave is assumed with polarization state and AoA given by γ = 3π/10 = 0.9425, η = −π/2 = −1.5708 and θ = 0, respectively. Let $\gamma_s$ denote the equivalent average SNR per antenna. The received signals are assumed to be Gaussian distributed because of the absence of synchronization between the transceivers [3].

Fig. 2. Estimated parameters ($\hat{\theta}$, $\hat{\gamma}$, $\hat{\eta}$) versus the number of iterations.

Fig. 3. MSEs versus the average SNR per antenna $\gamma_s$ (dB).

Fig. 4. The combined received SNR versus the number of iterations.

Fig. 2 shows some realizations of the estimations of AoA and polarization state using the proposed array and algorithm with $\gamma_s$ = −5 dB. As a comparison, the AoA estimation using a conventional array [3] with four subarrays each consisting of 16 single (y-axis) dipole antennas is also plotted. The updating coefficient, μ, is set to be 0.01 in both arrays. As shown in the figure, the proposed array outperforms the conventional one in terms of the speed of convergence. It can be seen that the proposed array can achieve very consistent estimate of AoA after a few iterations, whereas the estimate converges slowly for the conventional one. The estimations of γ and η are also shown in the figure with various updating coefficients (β = 0.001 and 0.05). After converging, they return results with high precision. The estimates {$\hat{\gamma}$, $\hat{\eta}$} for β = 0.001 and 0.05 are given by {0.9231, −1.5860} and {0.9005, −1.6047}, respectively. The MSEs of the estimates are shown as a function of $\gamma_s$ in Fig. 3, where the results are obtained after 200 iterations averaged over 5000 independent simulations. As shown in the figure, the proposed array and algorithm outperform the conventional array in estimating the AoA. We can also see that the MSEs of AoA and polarization state reduce with the increase of $\gamma_s$.

Fig. 4 shows the converging processes of the SNRs achieved by the proposed and conventional arrays with $\gamma_s$ = 0 dB. It can be seen that as the iteration proceeds, all the arrays considered are capable of achieving improved SNRs. However, because of the misalignment of polarization states, the conventional arrays (denoted by blue and red curves) suffer significant losses after convergence. The loss can be avoided when the proposed array and algorithms are employed. From the figure, we can see that the maximum achievable SNR equalling 10 lg(4 × 16) + 0 = 18 dB can be achieved by the proposed array with N = 16. This maximum SNR drops with decreasing number of antennas in a subarray. As shown in the figure, the maximum SNR is only 10 lg(4 × 8) = 15 dB with N = 8.

V. CONCLUSION

A hybrid dual-polarized adaptive antenna array is proposed to exploit the polarization diversity to improve the reliability of the mmWave system. With the interleaved subarray configuration, the DBT and CPRPT algorithms are developed to estimate the AoA and polarization state. Simulation results of MSEs show that the proposed algorithms are effective in terms of both parameter estimation and diversity combining.

REFERENCES

[1] R. W. Heath, Jr. et al., “An overview of signal processing techniques for millimeter wave MIMO systems,” IEEE J. Sel. Topics Signal Process., vol. 10, no. 3, pp. 436–453, Apr. 2016.
[2] J. A. Zhang et al., “Massive hybrid antenna array for millimeter-wave cellular communications,” IEEE Wireless Commun., vol. 22, no. 1, pp. 79–87, Feb. 2015.
[3] X. Huang et al., “A hybrid adaptive antenna array,” IEEE Trans. Wireless Commun., vol. 9, no. 5, pp. 1770–1779, May 2010.
[4] O. Jo et al., “Exploitation of dual-polarization diversity for 5G millimeter-wave MIMO beamforming systems,” IEEE Trans. Antennas Propag., vol. 65, no. 12, pp. 6646–6655, Dec. 2017.
[5] S. X. Ta et al., “Crossed dipole antennas: A review,” IEEE Antennas Propag. Mag., vol. 57, no. 5, pp. 107–122, Oct. 2015.
[6] J. Li and R. T. Compton, “Angle and polarization estimation using ESPRIT with a polarization sensitive array,” IEEE Trans. Antennas Propag., vol. 39, no. 9, pp. 1376–1383, Sep. 1991.
[7] K. T. Wong and M. D. Zoltowski, “Uni-vector-sensor ESPRIT for multisource azimuth, elevation, and polarization estimation,” IEEE Trans. Antennas Propag., vol. 45, no. 10, pp. 1467–1474, Oct. 1997.
[8] L. Wan et al., “DOA and polarization estimation for non-circular signals in 3-D millimeter wave polarized massive MIMO systems.” Accessed: Dec. 15, 2017. [Online]. Available: https://arxiv.org/abs/1712.05587
[9] K. Wu et al., “Robust unambiguous estimation of angle-of-arrival in hybrid array with localized analog subarrays,” IEEE Trans. Wireless Commun., vol. 17, no. 5, pp. 2987–3002, May 2018.
[10] F. Shu et al., “Low-complexity and high-resolution DOA estimation for hybrid analog and digital massive MIMO receive array,” IEEE Trans. Commun., vol. 66, no. 6, pp. 2487–2501, Jun. 2018.
Antieigenvalue-Based Spectrum Sensing for Cognitive Radio

Chen Guo, Ming Jin, Qinghua Guo, and Youming Li

Abstract—In this letter, we investigate the application of antieigenvalues to spectrum sensing for the first time and design an antieigenvalue-based detector. A theoretical expression for the decision threshold of the proposed detector is also derived. It is shown that the mean-to-square extreme eigenvalue detector is a special case of the proposed one. Numerical simulations validate the theoretical analysis, and demonstrate the superior performance of the proposed detector.

Index Terms—Cognitive radio, spectrum sensing, sample covariance matrix.

Manuscript received October 1, 2018; accepted October 30, 2018. Date of publication November 2, 2018; date of current version April 9, 2019. This work was supported in part by the Natural Science Foundation of China under Grant 61871246 and Grant 61571250, in part by the Zhejiang Natural Science Foundation under Grant LY18F010008, in part by the Ningbo Leading and Top-Notch Talented Training Project under Grant NBLJ201801001, in part by the Australian Research Council’s DECRA under Grant DE120101266, and in part by the K. C. Wong Magna Fund in Ningbo University. The associate editor was K. W. Choi. (Corresponding author: Ming Jin.)
C. Guo, M. Jin, and Y. Li are with the Faculty of Electrical Engineering and Computer Science, Ningbo University, Ningbo 315211, China (e-mail: 1971018124@qq.com; jinming@nbu.edu.cn; liyouming@nbu.edu.cn).
Q. Guo is with the School of Electrical, Computer and Telecommunications Engineering, University of Wollongong, Wollongong, NSW 2522, Australia (e-mail: qguo@uow.edu.au).
Digital Object Identifier 10.1109/LWC.2018.2879339

I. INTRODUCTION
COGNITIVE radio (CR), which allows secondary users (SUs) to utilize spectrum holes, has been recognized as a promising technology for alleviating the problem of shortage of spectral resources. One of the most important requirements for CR is the capability of spectrum sensing so that SUs can probe spectrum holes and opportunistically use them without causing harmful interference to licensed primary users (PUs). A variety of spectrum sensing methods have been proposed in [1]. Among these methods, eigenvalue-based detectors have been extensively investigated due to their high detection performance at low signal-to-noise ratio (SNR). The eigenvalue-based detection mainly includes maximum eigenvalue (ME), maximum-minimum eigenvalue (MME), maximum eigenvalue to arithmetic mean (MEAM), arithmetic to geometric mean (AGM), eigenvalue-moment-ratio (EMR) and eigenvalue weighting (EW) detectors [2]–[8]. ME was derived based on the rank-one assumption of primary signals, and the theoretical expression for its decision threshold was analyzed in [2]. However, it turns out that ME is sensitive to noise uncertainty because of the dependence of its decision threshold on the noise power [2]. MME and MEAM can be regarded as modified versions of ME by replacing the noise power in ME with the minimum eigenvalue and the average of all eigenvalues of sample covariance matrices (equivalently, the trace of sample covariance matrices), respectively. In [3], the expression for the decision threshold of MME was derived by exploiting approximate independence between the maximum and minimum eigenvalues. In [4], an asymptotic expression for the decision threshold of MEAM was derived by using an approximation with chi-square distribution. Without the rank-one assumption, AGM, EMR and EW detectors, which exploit all (weighted) eigenvalues, were proposed in [5]–[7]. However, obtaining all eigenvalues requires high computational complexity in the eigen-decomposition of sample covariance matrices, especially with high dimensions. Recently, to avert the computational cost of AGM, the mean-to-square extreme eigenvalue (MSEE) detector was proposed by using only part of the eigenvalues [9], which is also the focus of this letter.
The above mentioned eigenvalue-based detectors employ either all eigenvalues or the maximum eigenvalue. Using all eigenvalues with full eigen-decomposition involves high computational complexity, especially when the matrix dimension is large [10]. Using only the largest eigenvalue, in contrast, yields inferior performance because the covariance matrix of primary signals often has several large eigenvalues [11]. Moreover, detectors employing multiple large eigenvalues are vulnerable to noise uncertainty, as is the ME detector.
To circumvent these issues, in this letter, we investigate the application of antieigenvalues to spectrum sensing, and design an antieigenvalue-based detector which employs multiple large and small eigenvalues with partial eigen-decomposition [10]. It is shown that the MSEE detector is a special case of the proposed one. Moreover, a theoretical expression for the decision threshold of the proposed detector is derived. Numerical simulations are provided to validate the theoretical analysis and demonstrate the superior performance of the proposed detector compared to existing eigenvalue-based detectors. In particular, compared with AGM which requires all eigenvalues, the proposed detector delivers a similar or even better performance but with lower computational complexity.
The rest of this letter is organized as follows. The signal model and the preliminaries about antieigenvalue are introduced in Section II. The new detector and the theoretical expression for its decision threshold are derived in Section III. Simulation results are provided in Section IV, followed by conclusions drawn in Section V.

II. PRELIMINARIES AND SIGNAL MODEL
A. Signal Model
Spectrum sensing is usually treated as a binary hypothesis problem, and a decision needs to be made on whether primary signals are present or not. Let x(n) ∈ C^{M×1} denote the received discrete-time complex baseband signal vector from an
GUO et al.: ANTIEIGENVALUE-BASED SPECTRUM SENSING FOR CR 545
SU or multiple closely deployed SUs. Each SU is equipped with one or multiple antennas. The signal vector x(n) can be obtained from either multiple antennas at one time instant [4] or one antenna by stacking multiple time instants [11]. The received signal x(n) can be expressed as

$$
x(n) = \begin{cases} w(n), & H_0 \\ s(n) + w(n), & H_1 \end{cases}, \quad n = 0, 1, \ldots, N-1 \tag{1}
$$

where H0 and H1 represent the absence and presence of primary users, respectively; s(n) denotes the observed primary signals, which are assumed to have mean zero and covariance Rs [2]; and w(n) denotes the additive white circularly symmetric complex Gaussian noise with mean zero and covariance $\sigma_w^2 I_M$, i.e., $w(n) \sim \mathcal{CN}(0, \sigma_w^2 I_M)$. In addition, we assume that s(n) and w(n) are independent of each other [4]. It has already been shown that the covariance matrix of real-world signals has a small number of large eigenvalues [11]. Here we assume that Rs has a rank of P which can be much smaller than the size of Rs.

B. Eigenvalue and Antieigenvalue
In this section, we briefly introduce the definition of antieigenvalues and the relationship between antieigenvalues and eigenvalues. Interested readers can refer to [12] and [13] for more details. Given a Hermitian positive definite matrix A with size M × M, we use θ to denote the angle between a non-zero vector x and Ax. There are in total M pairs of eigenvalue and eigenvector (λk, xk) satisfying Axk = λk xk (i.e., cos θ = 1 for xk and Axk). In other words, eigenvalues are obtained by maximizing cos θ. Without loss of generality, we assume that λ1 ≥ λ2 ≥ · · · ≥ λM > 0.
In contrast, antieigenvalues are obtained by minimizing cos θ, and are defined as [12], [13]

$$
\nu_k = \min_{x_k^* \perp S_k^*} \frac{(x_k^*)^H A x_k^*}{\|x_k^*\| \cdot \|A x_k^*\|}, \quad k \le \frac{M}{2} \tag{2}
$$

where x∗k denotes the antieigenvector corresponding to the kth smallest antieigenvalue, Sk∗ = {x∗1, x∗2, . . . , x∗k−1} with S1∗ = ∅, and x∗k ⊥ Sk∗ denotes that x∗k is orthogonal to the elements in Sk∗. The relationship between the νk and the λk is given by [13]

$$
\nu_k = \frac{2\sqrt{\lambda_k \lambda_{M-k+1}}}{\lambda_k + \lambda_{M-k+1}}. \tag{3}
$$

III. ANTIEIGENVALUE-BASED DETECTOR AND ITS DECISION THRESHOLD
In this section, we make use of antieigenvalues to distinguish whether the population covariance matrix is a scaled identity one or not, and then propose a new detector based on the antieigenvalues of the sample covariance matrix. Moreover, we derive a theoretical expression for the false-alarm probability and the decision threshold of the proposed detector.

A. Antieigenvalue-Based Detector
The population covariance matrix of x(n) is given by

$$
R_x = \begin{cases} \sigma_w^2 I_M, & H_0 \\ R_s + \sigma_w^2 I_M, & H_1 \end{cases} \tag{4}
$$

which is a scaled identity matrix under H0 while not under H1. According to (3), we can find that the antieigenvalues of Rx are all equal to one under H0, while the P small antieigenvalues of Rx are smaller than one under H1. This characteristic can be employed to distinguish whether Rx is a scaled identity matrix or not, and further to decide whether x(n) contains primary signals.
With N samples, the sample covariance matrix of the received signals is given by

$$
\hat{R}_x = \frac{1}{N} \sum_{n=0}^{N-1} x(n) x^H(n). \tag{5}
$$

Let λ̂1 ≥ λ̂2 ≥ · · · ≥ λ̂M be the ordered eigenvalues of R̂x. The antieigenvalues of R̂x are given by

$$
\hat{\nu}_k = \frac{2\sqrt{\hat{\lambda}_k \hat{\lambda}_{M-k+1}}}{\hat{\lambda}_k + \hat{\lambda}_{M-k+1}}. \tag{6}
$$

It can be seen that the P small antieigenvalues are distinguishable under H0 and H1, while it is difficult to distinguish other antieigenvalues under H0 and H1. Hence, we take the summation of the P small antieigenvalues as the test-statistic¹

$$
T = \sum_{k=1}^{P} \hat{\nu}_k = \sum_{k=1}^{P} \frac{2\sqrt{\hat{\lambda}_k \hat{\lambda}_{M-k+1}}}{\hat{\lambda}_k + \hat{\lambda}_{M-k+1}}. \tag{7}
$$

¹In [7], the optimal eigenvalues weighting (OEW) detection uses all eigenvalues which are weighted properly. When the covariance matrix of primary signals is not full rank, the detector actually uses the summation of weighted large eigenvalues. In contrast, the proposed detector in this letter employs the summation of P small antieigenvalues as the test-statistic.

Algorithm 1 Antieigenvalue-Based Spectrum Sensing
Input: Sample covariance matrix R̂x and an estimated rank P̂.
Output: Decision on whether primary signals are present or not.
1: Compute the P̂ largest and the P̂ smallest eigenvalues of R̂x, i.e., λ̂1, · · · , λ̂P̂, λ̂M−P̂+1, · · · , λ̂M, by using partial eigen-decomposition methods [10];
2: Compute the P̂ smallest antieigenvalues of R̂x as in (6), i.e., ν̂1, · · · , ν̂P̂;
3: Compute the test-statistic T in (7) with P = P̂;
4: Set a decision threshold γ according to a false-alarm probability Pf, and make a decision: if T < γ, primary signal exists; otherwise, primary signal does not exist.

The use of the antieigenvalue-based detector is detailed in Algorithm 1. It is noted that perfect knowledge of the rank P may not be available in practice, and it can be estimated by many classical methods, such as the Gerschgorin radii criterion [14]. Let P̂ denote the estimated rank of the covariance matrix of primary signals. It is shown in Section IV that the proposed detector is robust against inaccurate knowledge of the rank. We can see that the MSEE detector is a special case of the proposed one with P̂ = 1. In Algorithm 1, setting the decision threshold γ is key to controlling the false-alarm probability. Next, we investigate the false-alarm probability of the proposed detector and decision threshold setting.
B. False-Alarm Probability and Decision Threshold

According to [15], we can obtain that the antieigenvalues of R̂x under H0 approximately follow a Beta distribution, i.e.,

$$
\hat{\nu}_k \mid H_0 \sim \mathrm{Beta}(\alpha_k, \beta_k) \tag{8}
$$

where the parameters αk and βk can be obtained through numerical fitting with an elliptical law as presented in [15]. It has been shown in [16] that the eigenvalues of R̂x under H0 are approximately independent of each other, and so are the antieigenvalues. Thus, the test-statistic T under H0 is the summation of independent central Beta random variables, and it approximately follows a scaled Beta distribution as [17]

$$
T \mid H_0 \sim \hat{P} \times \mathrm{Beta}(\alpha, \beta) \tag{9}
$$

where the parameters α and β can be obtained by solving the following equations

$$
\frac{\hat{P}\,\alpha}{\alpha + \beta} = \sum_{k=1}^{\hat{P}} \frac{\alpha_k}{\alpha_k + \beta_k} \tag{10}
$$

and

$$
\frac{\hat{P}^2 \alpha \beta}{(\alpha + \beta)^2 (\alpha + \beta + 1)} = \sum_{k=1}^{\hat{P}} \frac{\alpha_k \beta_k}{(\alpha_k + \beta_k)^2 (\alpha_k + \beta_k + 1)}. \tag{11}
$$

Letting

$$
a = \frac{1}{\hat{P}} \sum_{k=1}^{\hat{P}} \frac{\alpha_k}{\alpha_k + \beta_k} \tag{12}
$$

and

$$
b = \frac{1}{\hat{P}^2} \sum_{k=1}^{\hat{P}} \frac{\alpha_k \beta_k}{(\alpha_k + \beta_k)^2 (\alpha_k + \beta_k + 1)}, \tag{13}
$$

α and β are respectively given by

$$
\alpha = \frac{a (a - a^2 - b)}{b} \tag{14}
$$

and

$$
\beta = \frac{(1 - a)(a - a^2 - b)}{b}. \tag{15}
$$

Therefore, the probability density function (PDF) of T|H0 is given by [17]

$$
f_{T|H_0}(u) = \frac{\left(\frac{u}{\hat{P}}\right)^{\alpha - 1} \left(1 - \frac{u}{\hat{P}}\right)^{\beta - 1}}{\hat{P}\, B(\alpha, \beta)} \tag{16}
$$

where

$$
B(\alpha, \beta) = \frac{\Gamma(\alpha)\Gamma(\beta)}{\Gamma(\alpha + \beta)} \tag{17}
$$

with Γ(·) being the Gamma function. With a decision threshold γ, the false-alarm probability is given by

$$
P_f = \mathrm{Prob}(T < \gamma \mid H_0) = \int_0^{\gamma} f_{T|H_0}(u)\, du = I_{\gamma/\hat{P}}(\alpha, \beta) \tag{18}
$$

where $I_{\gamma/\hat{P}}(\alpha, \beta)$ denotes the regularized incomplete beta function with respect to γ/P̂. On the other hand, given a Pf, the decision threshold γ can be obtained from (18).

Fig. 1. False-alarm probability versus decision threshold for (a) P = 5 and (b) P = 15 when M = 40 and N = 1000.

C. Complexity Comparison
The complexity of the eigenvalue-based detectors mainly lies in the eigen-decomposition. The complexity of full eigen-decomposition is O(M^3), while the complexity of partial eigen-decomposition for obtaining P large and small eigenvalues is only O(PM^2) [10]. Once the rank P is estimated, it can be used for a number of sensing durations because the rank is stationary for a long time relative to the sensing duration [19]. Hence, the average computational complexity of estimating P per sensing duration can be very small, and we only consider the complexity of partial eigen-decomposition for the proposed detector. Then, the complexity of the proposed detector is far smaller than that of AGM when P is small. It is worth mentioning, as shown in [11], that P is often far smaller than M in practice.

IV. NUMERICAL SIMULATIONS
In this section, we first validate the theoretical expression for the false-alarm probability versus decision threshold of the proposed detector, then evaluate the detection performance of the proposed detector compared to existing eigenvalue-based detectors. In simulations, we consider the scenario of CR-based IoT networks, as spectrum shortage is an important issue for IoT networks with a large number of SUs being closely deployed. The SNR is defined as trace(Rs)/(M σ_w^2).
Fig. 1(a) shows false-alarm probability versus decision threshold of the proposed detector when M = 40, N = 1000 and P = 5, and Fig. 1(b) shows the results when M = 40, N = 1000 and P = 15. It can be observed from Fig. 1 that the theoretical result matches the Monte Carlo results well.
Fig. 2(a) shows the antieigenvalues under H0 and H1 when M = 40, N = 1000, P = 5 and SNR = 0 dB. In the simulations, the P eigenvalues of the covariance matrix are generated with the assumption that they are uniformly distributed, and the summation of their expectations equals the noise power. It can be observed from Fig. 2(a) that the antieigenvalue ν̂k is far smaller under H1 than H0 when k ≤ P. Moreover, antieigenvalues ν̂k under H1 and H0 are almost identical for k > P. This implies that P antieigenvalues may be used for spectrum sensing and an overestimate of P has only a small effect on the proposed detector. Fig. 2(b) shows the antieigenvalues under H0 and H1 when M = 40, N = 1000, P = 15 and SNR = 0 dB, and the same conclusions as those from Fig. 2(a) can also be drawn.
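Returning to the threshold of Section III-B, the moment matching in (12)-(15) and the inversion of (18) can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the per-antieigenvalue parameters (αk, βk) are hypothetical inputs (the letter obtains them by numerical fitting as in [15]), the regularized incomplete beta function is approximated by simple midpoint integration (adequate when α, β ≥ 1), and the threshold is found by bisection rather than a library inverse.

```python
import math

def match_beta(params):
    """Moment-match a sum of independent Beta(a_k, b_k) variables to P * Beta(alpha, beta), per (12)-(15)."""
    p = len(params)
    a = sum(ak / (ak + bk) for ak, bk in params) / p
    b = sum(ak * bk / ((ak + bk) ** 2 * (ak + bk + 1)) for ak, bk in params) / p ** 2
    common = (a - a * a - b) / b              # equals alpha + beta
    return a * common, (1.0 - a) * common

def beta_cdf(x, alpha, beta, steps=4000):
    """Approximate I_x(alpha, beta) by midpoint integration of the Beta PDF."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    log_norm = math.lgamma(alpha + beta) - math.lgamma(alpha) - math.lgamma(beta)
    h = x / steps
    total = 0.0
    for i in range(steps):
        u = (i + 0.5) * h
        total += math.exp(log_norm + (alpha - 1.0) * math.log(u)
                          + (beta - 1.0) * math.log1p(-u))
    return total * h

def threshold(pf, params):
    """Solve I_{gamma/P}(alpha, beta) = pf for gamma by bisection, following (18)."""
    alpha, beta = match_beta(params)
    lo, hi = 0.0, 1.0
    for _ in range(50):
        mid = 0.5 * (lo + hi)
        if beta_cdf(mid, alpha, beta) < pf:
            lo = mid
        else:
            hi = mid
    return len(params) * 0.5 * (lo + hi)
```

With a single component the matching returns the input parameters unchanged, which is a convenient sanity check on (14) and (15). In practice a library routine for the inverse regularized incomplete beta function would replace the integration and bisection used here.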
Fig. 2. Antieigenvalues under H0 and H1 for (a) P = 5 and (b) P = 15 when M = 40, N = 1000 and SNR = 0 dB.

Fig. 3. Pm versus SNR for detectors when M = 40, N = 1000 and P = 5 at Pf = 0.01.

Fig. 4. Pm versus SNR for detectors when M = 40, N = 1000 and P = 15 at Pf = 0.01.

Fig. 3 shows miss-detection probability (Pm) for various detectors (including volume-based detector (VD) and covariance absolute value (CAV) [18]) when M = 40, N = 1000 and P = 5 at Pf = 0.01. Note that Pm = Prob(T > γ|H1). For the proposed detector, the results with different estimates of P are given. It can be observed from Fig. 3 that the proposed detector outperforms other detectors. This is because the proposed detector takes advantage of the rank information and exploits large eigenvalues effectively, while the other detectors employ either only the largest or all eigenvalues. Moreover, we can also see that inaccurate knowledge of the rank of primary signals has a small effect on the proposed detector.
Fig. 4 shows Pm versus SNR when M = 40, N = 1000 and P = 15 at Pf = 0.01. It again demonstrates the superior performance of the proposed detector. It is observed that, when P is large, AGM has slightly better performance than the proposed one at the cost of high complexity in obtaining full eigenvalues. This is because the distinguishability of the first several antieigenvalues under H0 and H1 decreases with P (which can also be concluded from comparing the results in Fig. 2(a) and Fig. 2(b)).

V. CONCLUSION
In this letter, we have proposed an antieigenvalue-based detector, and derived the theoretical expression for the decision threshold and false-alarm probability of the proposed detector. Numerical simulations have been provided to verify the theoretical analysis and demonstrate the superior performance of the proposed detector. It has also been shown that the proposed detector is robust to noise uncertainty (as the decision threshold does not depend on noise power) and to the estimation error of the rank of the covariance matrix of primary signals.

REFERENCES
[1] A. Ali and W. Hamouda, “Advances on spectrum sensing for cognitive radio networks: Theory and applications,” IEEE Commun. Survey Tuts., vol. 19, no. 2, pp. 1277–1304, 2nd Quart., 2017.
[2] Z. Li, D. Wang, P. Qi, and B. Hao, “Maximum-eigenvalue-based sensing and power recognition for multiantenna cognitive radio system,” IEEE Trans. Veh. Technol., vol. 65, no. 10, pp. 8218–8229, Oct. 2016.
[3] F. Penna, R. Garello, and M. A. Spirito, “Cooperative spectrum sensing based on the limiting eigenvalue ratio distribution in Wishart matrices,” IEEE Commun. Lett., vol. 13, no. 7, pp. 507–509, Jul. 2009.
[4] P. Wang, J. Fang, N. Han, and H. Li, “Multiantenna-assisted spectrum sensing for cognitive radio,” IEEE Trans. Veh. Technol., vol. 59, no. 4, pp. 1791–1800, May 2010.
[5] R. Zhang, T. J. Lim, Y.-C. Liang, and Y. Zeng, “Multi-antenna based spectrum sensing for cognitive radios: A GLRT approach,” IEEE Trans. Commun., vol. 58, no. 1, pp. 84–88, Jan. 2010.
[6] L. Huang, J. Fang, K. Liu, H. C. So, and H. Li, “An eigenvalue-moment-ratio approach to blind spectrum sensing for cognitive radio under sample-starving environment,” IEEE Trans. Veh. Technol., vol. 64, no. 8, pp. 3465–3480, Aug. 2015.
[7] C. Liu, H. Li, J. Wang, and M. Jin, “Optimal eigenvalue weighting detection for multi-antenna cognitive radio networks,” IEEE Trans. Wireless
[8] L. Wei and O. Tirkkonen, “Spectrum sensing in the presence of multiple primary users,” IEEE Trans. Commun., vol. 60, no. 5, pp. 1268–1277, May 2012.
[9] K. Bouallegue, I. Dayoub, M. Gharbi, and K. Hassan, “Blind spectrum sensing using extreme eigenvalues for cognitive radio networks,” IEEE Commun. Lett., vol. 22, no. 7, pp. 1386–1389, Jul. 2018.
[10] W. Murase and M. Lindenbaum, “Partial eigenvalue decomposition of large images using spatial temporal adaptive method,” IEEE Trans. Image Process., vol. 4, no. 5, pp. 620–629, May 1995.
[11] P. Zhang and R. Qiu, “GLRT-based spectrum sensing with blindly learned feature under rank-1 assumption,” IEEE Trans. Commun., vol. 61, no. 1, pp. 87–96, Jan. 2013.
[12] K. E. Gustafson, “The geometrical meaning of the Kantorovich–Wielandt inequalities,” Linear Algebra Appl., vol. 296, nos. 1–3, pp. 143–151, 1999.
[13] R. Khattree, “On the calculation of antieigenvalues and antieigenvectors,” J. Interdiscipl. Math., vol. 4, nos. 2-3, pp. 195–199, 2001.
[14] H.-T. Wu, J.-F. Yang, and F.-K. Chen, “Source number estimators using transformed Gerschgorin radii,” IEEE Trans. Signal Process., vol. 43, no. 6, pp. 1325–1333, Jun. 1995.
[15] F. Ferrari, Antieigenvalues and Sample Covariance Matrices, Linköping Univ., Linköping, Sweden, Jul. 2016.
[16] M. A. Girshick, “On the sampling theory of roots of determinantal equations,” Ann. Math. Stat., vol. 10, no. 3, pp. 203–224, Sep. 1939.
[17] B. Jóhannesson and N. Giri, “On approximations involving the beta distribution,” Commun. Stat. Simulat. Comput., vol. 24, no. 2, pp. 489–503, 1985.
[18] L. Huang, C. Qian, Y. Xiao, and Q. T. Zhang, “Performance analysis of volume-based spectrum sensing for cognitive radio,” IEEE Trans. Wireless Commun., vol. 14, no. 1, pp. 317–330, Jan. 2015.
[19] V. Tawil. (May 2006). 51 Captured DTV Signal. [Online]. Available: http://grouper.ieee.org/groups/802/22/Meeting_documents/2006_May/Informal_Documents
MRB Decoding of LT Codes Over AWGN Channels

Valerio Bioglio
Abstract—In this letter we propose a novel decoder for Luby

transform codes over noisy channel based on Gaussian elimina-
tion algorithm. The proposed soft-on-the-fly Gaussian elimination
algorithm permits to perform most reliable basis decoding while
distributing the complexity all along the symbols reception, and
reducing the latency due to new decoding attempts in case of
decoding failures.
Index Terms—Block codes, binary codes, AWGN channels.
I. I NTRODUCTION Fig. 1. Wireless fountain model.
ATELESS codes, also known as fountain codes, are

R designed to provide an unlimited number of coded sym-
bols generated on the fly. Luby Transform (LT) codes [1] have
limited its application for LDPC codes to hybrid decoders [8],
and in general to short codes [9]. Nevertheless, we show that
been the first realization of rateless codes. These codes, orig-
the liquid nature of LT codes permits to distribute this cost
inally developed for erasure channels, permit the decoding of
all along symbols reception by freezing the decoding process
the K input symbols after the reception of 1 K coded
while waiting for the reception of new symbols. We show that
symbols, where the overhead approaches zero when the
the proposed approach exhibits superior BLER performance
code dimension K grows. A hard-decision decoding based on
compared to BP while reducing the decoding latency.
message-passing Belief Propagation (BP) is used over erasure
channels.
The hard-decision BP decoder can be easily extended to a II. LT C ODES OVER N OISY C HANNELS
soft-decision decoder, more suitable for noisy channels. As a We consider a digital fountain scheme over a wireless chan-
consequence, since their discovery, the capabilities of LT codes nel for which a single transmitter wants to broadcast a message
over noisy channel have been investigated, revealing good u u1 , . . . , uK ½ of length K symbols to multiple receivers
performance in the waterfall regime but suffering from error over a noisy channel. A Cyclic Redundancy Check (CRC)
floors [2], [3]. Similarly to LDPC codes, this performance genie is appended to the message to mark legitimate content,
loss is due to the existence of trapping and absorbing sets resulting in the message u u1 , . . . , uK of length K K
in the structure of the code [4]. Detection and elimination of to be transmitted. Message u is then encoded through an LT
these sets may be difficult: various modifications of the Robust code into a stream of encoded symbols c c1 , . . . , cN , . . . .
Soliton Distribution (RSD) have been proposed to reduce their Every encoded symbol ci is then modulated and transmitted
impact and mitigate the error floor phenomenon, in particular independently as si . The channel introduces additive Gaussian
by altering the size of the ripple of degree-one symbols during noise. Every receiver can join the streaming and start receiving
the decoding [5], however with limited results. Alternating BP information at any time, without retransmission of lost data,
with Gaussian Elimination (GE) can prevent premature decod- collecting symbols s̃i si ni with ni N 0, σ 2 . Soft
ing failure due to trapping sets, at the cost of an augmented demodulation is performed on the received symbols, and the
complexity [6]. demodulated symbols c̃ are calculated as log-likelihood ratios
In this letter, we abandon BP decoders for a most reliable basis (MRB) approach [7]. MRB displays better performance in terms of block error rate (BLER) compared to BP in the decoding of LDPC codes [7], and a similar behaviour can be expected for LT codes; however, its high computational cost
(LLRs) of the received symbols as

$$
\tilde{c}_i = L(c_i) = \log \frac{P(\tilde{s}_i \mid c_i = 0)}{P(\tilde{s}_i \mid c_i = 1)}. \tag{1}
$$

If binary symbols are transmitted through a BI-AWGN channel and BPSK modulation, $\tilde{c}_i = 2\tilde{s}_i / \sigma^2$. LLRs are then passed to the
Manuscript received May 31, 2018; revised September 25, 2018; accepted soft-input/hard-output LT decoder. The result of the decod-
October 28, 2018. Date of publication November 2, 2018; date of current
version April 9, 2019. The associate editor coordinating the review of this ing is an estimation û of the transmitted message. Now a
paper and approving it for publication was W. Zhang. CRC check is performed: if passed, the decoder outputs the
The author is with the Mathematical and Algorithmic Sciences estimation û and the receiver disconnects, otherwise a novel
Laboratory Paris Research Center, Huawei Technologies SASU, 92100
Boulogne-Billancourt, France (e-mail: valerio.bioglio@huawei.com). decoding attempt is performed after the reception of new sym-
Digital Object Identifier 10.1109/LWC.2018.2879359 bols. In the rest of this letter, we will use the notation x, x̃
BIOGLIO: MRB DECODING OF LT CODES OVER AWGN CHANNELS 549
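We recall that the operation ⊞, introduced in [10] as $\tilde{x}_1 \boxplus \tilde{x}_2 = L(x_1 \oplus x_2)$, is defined as

$$
\tilde{x}_1 \boxplus \tilde{x}_2 = 2 \operatorname{arctanh}\left( \tanh\frac{\tilde{x}_1}{2} \tanh\frac{\tilde{x}_2}{2} \right) \tag{5}
$$

$$
\approx \operatorname{sign}(\tilde{x}_1)\, \operatorname{sign}(\tilde{x}_2)\, \min(|\tilde{x}_1|, |\tilde{x}_2|). \tag{6}
$$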
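The two forms of this operation can be compared numerically with a short sketch. The function names are illustrative, and the small clamp before `atanh` is an added numerical safeguard that is not part of the definition.

```python
import math

def boxplus(x1, x2):
    """Exact LLR of the XOR of two independent bits, as in (5)."""
    t = math.tanh(x1 / 2.0) * math.tanh(x2 / 2.0)
    # Clamp so atanh stays finite when both inputs saturate tanh to +/-1.
    t = max(min(t, 1.0 - 1e-15), -1.0 + 1e-15)
    return 2.0 * math.atanh(t)

def boxplus_minsum(x1, x2):
    """Min-sum approximation of the same operation, as in (6)."""
    sign = math.copysign(1.0, x1) * math.copysign(1.0, x2)
    return sign * min(abs(x1), abs(x2))
```

The exact result never exceeds the less reliable input in magnitude, and an input LLR of zero (a completely unreliable bit) forces the output to zero, which is why the combined symbol inherits the reliability of its weakest component.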
Fig. 2. LT code structure. Circles and squares represent variable and check
nodes respectively for the BP decoder. Variable to check messages Li j are calculated later, using
different estimations of variable node uj made by the check
nodes connected to it. The different estimations are supposed
to be independent, hence according to [10] we have that

Li j Ll j , (7)
l Ej ,l i
where Ej is the set of the indices of the encoded symbols

containing bit uj , i.e., such that gjl 1. Finally, an estimation
of the variable node uj is calculated as 12 sign ũj 1, where
Fig. 3. Messages calculation in BP algorithm.

ũj Ll j . (8)
and x̂ to represent alphabet symbols, symbol LLRs and hard
l Ej
decisions on symbol LLRs respectively.
Decoding stops if the CRC is met, while decoding fails if
A. LT Encoding the CRC is not checked after imax iterations; a new decoding
attempt is then executed after the reception of new symbols.
The K information symbols u u1 , . . . , uK are encoded
into a stream of symbols c c1 , c2 , . . . , cN , . . . . Every
III. P ROPOSED A LGORITHM
encoded symbol cj is generated independently drawing a
degree dj N0 according to a degree distribution ρ d and
The goal of a most reliable basis (MRB) algorithm is to
selecting di distinct integers Di t1 , . . . , tdi
at random select the K most reliable linearly independent received sym-
with 1 t1 tdi K , such that bols and invert the matrix formed by their encoding equations
to perform the LT decoding. These two procedures, namely the
ci u t1 u td
i
ut . (2) symbols sorting and the matrix inversion, have a large compu-
t Di tational cost. Moreover, matrix inverse has to be recalculated
Robust Soliton Distribution (RSD) [1] is employed for erasure from the scratch in case of decoding failure, after the recep-
channels, with appropriate modifications for AWGN [5]. If we tion of more reliable symbols. In order to reduce the decoding
call encoding equation of ci the vector g i g1i , . . . , gK
i with
complexity of classical MRB for LT codes, we propose to
modify the On-the-Fly Gaussian Elimination (OFG) algorithm,
1 if j Di
gji (3) proposed in [11] for the decoding of LT codes over erasure
0 otherwise
channels, in order to handle soft inputs. In the following, we
then (2) becomes ci g i u T , and the matrix G formed by show that an appropriate modification transforms OFG into a
vectors g i as rows is the generator matrix of the LT codes. MRB decoder, permitting to triangularize the matrix composed
by the encoding equations of the most reliable information on
B. LT Soft Decoding via BP the fly.
Soft decoding of LT codes can be performed through Belief
Propagation (BP) by exchanging messages, representing LLR A. sOFG Algorithm Description
estimates, between check nodes (CNs) and variable nodes The proposed sOFG algorithm is presented as Algorithm 1
(VNs). In LLR-based sum-product implementation of the BP and works as follows. We suppose the K K decoding matrix
algorithm, variable nodes are initialized with the correspond- G to be partially upper triangular after the reception of i1
ing received symbol LLR. A check to variable message Li j symbols, i.e., either a row G i has the leftmost nonzero element
represents the LLR estimation of the variable node uj made by on the diagonal or it is all-zero. Each row G i is connected
a check node ci , while a variable to check message Li j rep- to a symbol ṽi , representing the LLR of the encoded sym-
resents the LLR estimation of the variable node uj excluding bol having encoding equation G i ; ṽi 0 if G i is empty.
check node ci . This process is depicted in Figure 3. A new encoded symbol with LLR c̃i is then received, along
Messages flowing from check nodes to variable nodes are with the corresponding encoding equation g i . The position s
calculated first. According to (2), uj ci ut1 utα¡1 of the leftmost 1 of g i is located in order to insert g i in the
utα 1 utdj with j tα , hence Li j can be calculated s-th row of G, keeping it upper triangular. If the row is empty,
as the LLR of the xor of di independent symbols as g i is inserted, and its LLR c̃i is stored as ṽs . If the row is
t j t j

full, the following new swap heuristic is used to decide which
encoding equation will be stored: if the received symbol is
Li j L ci ut c̃i Li t . (4)
t Di t Di more reliable, i.e., c̃i ṽs , then the row and the received
Algorithm 1 Soft On-the-Fly Gaussian Elimination

1: Initialize K K matrix G and K-vectors ṽ to 0
2: EmptyRows K
3: while EmptyRows 0 and CRCcheck û False do
4: receive K-vector g i and symbol LLR c̃i
5: s FirstOne g i
6: while IsEmpty si False do
7: if c̃i ṽs then
8: EmptyRows EmptyRows 1 Gss
9: swap g i , G s
10: swap c̃i , ṽs
11: end if
12: gi gi Gs
13: c̃i c̃i ṽs
14: s FirstOne g i
15: end while Fig. 4. Normalized
and per symbol vs. G matrix filling.
16: v̂ HardDecision ṽ
17: û BackSub G, v̂
18: end while reception. This is possible due to the nature of LT codes, for
19: return û which new encoded symbols are received continuously. During
the triangularization, the reliabilities of the intermediate sym-
bols are calculated on the fly by performing operations
equation are swapped, along with their reliabilities, in order to between stored symbol LLRs and the received symbol LLRs.
keep only the most reliable information in the decoding pro- The most reliable linearly independent intermediate symbols
cess. At this point, an attempt to store the received equation are kept in the decoding process thanks to the proposed swap
in another row is performed, namely by moving the leftmost 1 heuristic. In fact, according to (5) ṽi takes the value of the
to the right. This can be done by xoring g i and G s , obtaining less reliable received symbol used to form Gi ; this symbol is
a new equation g i g i G s with leftmost 1 in position then expelled from the matrix by the swap heuristic if a most
s s. This new equation represents a novel subset of input reliable combination of received symbols is found. As a result,
bits, whose LLR can be calculated as c̃i c̃i ṽs . This pro- the matrix is filled by the most reliable received information,
cedure continues until either the received equation is inserted that are already combined to accelerate the matrix inversion.
or zeroed, and hence discarded.
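The insertion step described above can be sketched as follows. This is a rough illustration written from the prose description only, under several assumptions: rows are plain lists of bits, an empty row is represented by `None`, reliabilities are compared through LLR magnitudes, and the min-sum form (6) stands in for the exact LLR combination. All names are hypothetical, and hard decision, back substitution and the CRC check are not shown.

```python
import math

def boxplus_minsum(x1, x2):
    """Min-sum approximation (6) of the LLR of the XOR of two bits."""
    sign = math.copysign(1.0, x1) * math.copysign(1.0, x2)
    return sign * min(abs(x1), abs(x2))

def first_one(row):
    """Index of the leftmost 1 in a bit list, or -1 if the row is all-zero."""
    for j, bit in enumerate(row):
        if bit:
            return j
    return -1

def sofg_insert(G, v, g, c):
    """Try to store equation g (bit list) with LLR c in the triangular matrix G.

    G[s] is None for an empty row, otherwise a bit list whose leftmost 1 is at s;
    v[s] holds the LLR attached to row s. Returns True if stored, False if zeroed.
    """
    g = list(g)
    s = first_one(g)
    while s >= 0:
        if G[s] is None:                      # empty row: insert and stop
            G[s], v[s] = g, c
            return True
        if abs(c) > abs(v[s]):                # swap heuristic: keep the more reliable one
            G[s], g = g, G[s]
            v[s], c = c, v[s]
        g = [a ^ b for a, b in zip(g, G[s])]  # xor moves the leftmost 1 to the right
        c = boxplus_minsum(c, v[s])           # reliability of the combined equation
        s = first_one(g)
    return False                              # equation reduced to zero: discard
```

Because every stored row has its leftmost 1 on the diagonal, each xor strictly moves the leftmost 1 of the working equation to the right, so the loop ends after at most K steps.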
Whenever the decoding matrix is filled, i.e., when there are C. Complexity Considerations
no more empty rows, hard-decisions vector v̂ is calculated as The proposed sOFG decoder requires to execute and
sign ṽj 1 operations, while BP requires and operations. As shown
v̂j , (9)
2 in Figure 4, the number of needed to insert a symbol
and back substitution is run on v̂ to calculate the hard decisions in the decoding matrix increases with the filling percentage
vector û. Finally, a CRC check is performed on û. If the CRC of the matrix, however remaining below K/3 per symbol in
is passed, the decoding stops, otherwise new encoded symbols the worst case. As a consequence, we can upper bound the
are collected and inserted in the decoding matrix. number of for the triangularization phase, and hence the
number of operations, by N K/3, while the back substitu-
B. Algorithm Analysis tion requires on average K 2 4 bit . On the other hand, the
number of operations of BP depends on the number of edges of
Decoding of LT codes can be performed when K linearly
the LT encoding graph depicted in Figure 2, that is on average
independent symbols are collected, i.e., if their encoding equa-
2Nlog K for RSD. Every iteration requires to perform 2Nlog
tions form a full rank matrix in F2 . The goal of a MRB
K and , each one involving on average 2log K elements;
algorithm is to find the K most reliable linearly independent
as a consequence, BP requires on average 2imax N log2 K
received symbols and invert the matrix G formed by their
and operations.
decoding equations as rows, calculating H G 1 . Once
At first glance, the proposed sOFG looks more complex than
these symbols are selected, the encoding system of equations
BP; however, the execution of the most demanding oper-
G u T c T can be inverted in the decoding system of equa-
ations is distributed all along symbols reception, while back
tions H c T u T , so that input symbols can be retrieved as
substitution requires only simple bit . In fact, the incremen-
ui H i c, with LLR
K tal insertion of the received equation in the decoding matrix
i

Hji 1 permits to distribute the majority of the computational effort
ũi LH c L Hji cj c̃j . (10) of the decoding along all symbols reception. Since at most
j 1 1 j K K/3 operations are executed at every symbol reception, the
The proposed sOFG decoder obtains H through GE, dis- result of the decoding can be known almost immediately after
tributing the triangularization phase of GE during the packets the reception of each symbol. In this way, sOFG reduces the
BIOGLIO: MRB DECODING OF LT CODES OVER AWGN CHANNELS 551
impact of decoding failures on the decoding complexity, due to the reuse of the decoding matrix and hence of the computational effort spent in the matrix diagonalization. This is an advantage over BP, for which every decoding failure represents a waste of computational power since the decoder has to be rerun from scratch.

IV. NUMERICAL RESULTS
In the following we present simulations of the block error rate (BLER) performance of the proposed sOFG in comparison to classical BP for the decoding of short LT codes over AWGN channels. In our simulations we used two LT codes of dimension $K = 100$ and $K = 500$ designed according to the RSD with parameters $\delta = 0.001$ and $c = 0.03$, while $i_{max} = 20$ for the BP decoder. Codes are assessed at fixed lengths, hence the addition of a CRC is not necessary to evaluate the decoder performance. Figure 5 shows the decoding performance of both decoders for $K = 100$ under various overheads. The proposed sOFG decoder exhibits a significant gain compared to BP due to the absence of trapping sets. This outcome is obtained at the cost of a small increase of the total decoding complexity, which is however distributed along the reception of the symbols, resulting in a significant reduction of the decoding latency. Figure 6, showing BLER performance for $K = 500$ under various overheads, suggests that the proposed sOFG outperforms the BP algorithm for small overheads or for short codes. In fact, as the code length increases, iterative decoders like BP make the most of the received information, while an MRB approach simply discards the redundant information.

Fig. 5. BLER for $K = 100$ and various $N$.

Fig. 6. BLER for $K = 500$ and various $N$.

V. CONCLUSION
In this letter we presented sOFG, a novel decoder for LT codes over noisy channels. Its MRB approach improves the BLER performance of the code by reducing the impact of trapping sets on the decoding algorithm. The incremental nature of sOFG mitigates the effect of decoding failures. Finally, the decoding complexity is distributed along the reception of the symbols, making this algorithm suitable for time-demanding scenarios, where the decoding latency impacts the quality of service. The proposed approach is shown to outperform iterative decoders for short LT codes.

REFERENCES
[1] M. Luby, “LT codes,” in Proc. IEEE Symp. Found. Comput. Sci., Vancouver, BC, Canada, Nov. 2002, pp. 271–280.
[2] R. Palanki and J. S. Yedidia, “Rateless codes on noisy channels,” in Proc. IEEE Int. Symp. Inf. Theory (ISIT), Chicago, IL, USA, Jun. 2004, p. 37.
[3] H. Jenkač, T. Mayer, T. Stockhammer, and W. Xu, “Soft decoding of LT-codes for wireless broadcast,” in Proc. IST Mobile Commun. Summit, Dresden, Germany, Jun. 2005, pp. 1–5.
[4] V. L. Orozco and S. Yousefi, “Trapping sets of fountain codes,” IEEE Commun. Lett., vol. 14, no. 8, pp. 755–757, Aug. 2010.
[5] J. H. Sorensen, T. Koike-Akino, P. Orlik, J. Ostergaard, and P. Popovski, “Ripple design of LT codes for BIAWGN channels,” IEEE Trans. Commun., vol. 62, no. 2, pp. 434–441, Feb. 2014.
[6] A. Kharel and L. Cao, “Decoding of short LT codes over BIAWGN channels with Gauss–Jordan elimination-assisted belief propagation method,” in Proc. IEEE Wireless Telecommun. Symp. (WTS), New York, NY, USA, Apr. 2015, pp. 1–6.
[7] M. P. C. Fossorier, “Iterative reliability-based decoding of low-density parity check codes,” IEEE J. Sel. Areas Commun., vol. 19, no. 5, pp. 908–917, May 2001.
[8] M. Baldi, N. Maturo, E. Paolini, and F. Chiaraluce, “On the applicability of the most reliable basis algorithm for LDPC decoding in telecommand links,” in Proc. IEEE Int. Conf. Inf. Commun. Syst. (ICICS), Amman, Jordan, Apr. 2015, pp. 1–6.
[9] J. Van Wonterghem, A. Alloum, J. J. Boutros, and M. Moeneclaey, “Performance comparison of short-length error-correcting codes,” in Proc. IEEE Symp. Commun. Veh. Technol. (SCVT), Mons, Belgium, Nov. 2016, pp. 1–6.
[10] J. Hagenauer, E. Offer, and L. Papke, “Iterative decoding of binary block and convolutional codes,” IEEE Trans. Inf. Theory, vol. 42, no. 2, pp. 429–445, Mar. 1996.
[11] V. Bioglio, M. Grangetto, R. Gaeta, and M. Sereno, “On the fly Gaussian elimination for LT codes,” IEEE Commun. Lett., vol. 13, no. 12, pp. 953–955, Dec. 2009.
Spectral and Energy Efficient Resource Allocation for Massive MIMO HetNets With Wireless Backhaul
Bo Huang and Aihuang Guo

Abstract—This letter provides a spectral and energy efficiency evaluation framework for massive MIMO heterogeneous networks with wireless backhaul. The framework maximizes the weighted sum of spectral efficiency and energy efficiency while guaranteeing quality of service for users, interference mitigation, and sufficient capacity in the wireless backhaul. To solve this non-convex problem, we develop a novel alternating optimization algorithm, which combines the Lagrange dual method with successive convex approximation. Numerical results verify the effectiveness of the framework in improving the energy and spectral efficiency in comparison with other schemes.

Index Terms—Energy efficiency, resource allocation, spectral efficiency, heterogeneous networks, wireless backhaul.

Manuscript received September 12, 2018; revised October 22, 2018; accepted October 28, 2018. Date of publication November 5, 2018; date of current version April 9, 2019. This work was supported in part by the National Natural Science Foundation of China under Grant 61331009, in part by the Open Project of State Key Laboratory of Millimeter Waves under Grant K201935, and in part by the National Science and Technology Major Project under Grant 2017zx05005001-005. The associate editor coordinating the review of this paper and approving it for publication was A. Ozcelikkale. (Corresponding author: Bo Huang.)
The authors are with the Department of Information and Communication Engineering, Tongji University, Shanghai 201804, China (e-mail: hb0533@tongji.edu.cn; tjgah@tongji.edu.cn).
Digital Object Identifier 10.1109/LWC.2018.2879428

I. INTRODUCTION
Due to the spectrum scarcity and growing attention to energy consumption, the joint optimization of spectral efficiency (SE) and energy efficiency (EE) has become an essential issue in 5G communication networks [1]. Massive MIMO and heterogeneous networks (HetNets) are recognized as two key technologies to improve SE and EE [2]. Moreover, wireless backhaul is preferred for 5G HetNets owing to its easy deployment and low cost. This letter investigates the tradeoff between SE and EE in massive MIMO HetNets with wireless backhaul.

The joint design of downlink beamforming and power allocation in wireless backhaul HetNets to maximize system EE was investigated in [3], but the bandwidth allocation of wireless backhaul was not considered. Wang et al. [4] and Liu et al. [5] jointly optimized user association and backhaul bandwidth allocation to maximize SE, but neither of them involved EE. Although Zhang et al. [6] analyzed EE in the massive MIMO enabled HetNet with wireless backhaul by jointly optimizing bandwidth and power allocation, the SE and interference mitigation were not considered.

In this letter, we provide an optimization framework to maximize SE and EE simultaneously in massive MIMO HetNets with wireless backhaul while guaranteeing the minimum quality of service (QoS) requirement of each user, interference mitigation, and sufficient capacity in the wireless backhaul. To solve this non-convex problem, a new iterative resource allocation algorithm is developed by applying alternating optimization (AO) to decouple the original problem into the beamforming and power allocation subproblem and the wireless backhaul bandwidth allocation subproblem, which are then solved by applying successive convex approximation (SCA) and the Lagrange dual method, respectively.

II. SYSTEM MODEL
Consider a downlink HetNet consisting of one macro base station (MBS) with $N$ antennas and $J$ single-antenna small base stations (SBSs). For convenience, we denote the set of all BSs by $\mathcal{B}_0 \equiv \mathcal{B} \cup \{0\}$, where $\mathcal{B} = \{1, \ldots, J\}$ is the set of SBSs and the index 0 is for the MBS. The MBS serves $M$ macro users (MUEs) and each SBS serves the same number $K$ of small cell users (SUEs) within its coverage. Assume that the HetNet operates in the reverse time division duplex (RTDD) mode. Specifically, the uplink/downlink time slots of the MBS are simultaneously the downlink/uplink time slots of the SBSs [7]. With it, the interference from the MBS to SUEs in the downlink can be avoided. Meanwhile, the MBS provides in-band wireless backhaul to the SBSs. Consequently, define $\beta \in [0, 1]$ as the bandwidth allocation ratio of the wireless backhaul; the rest, $(1 - \beta)$, is allocated to the wireless access links of the BSs.

Consider the downlink transmission of the MBS, which serves MUEs and SBSs. Let $\mathcal{F} = \{\mathcal{B}, \mathcal{M}\} = \{(1, \ldots, J), (J+1, \ldots, J+M)\}$ be the index set of receivers for the MBS. The received signal at the $k$th receiver is
$$y_{0,k} = \mathbf{h}_{0,k}^T \mathbf{w}_{0,k} x_{0,k} + \sum_{k' \in \mathcal{F}\setminus k} \mathbf{h}_{0,k}^T \mathbf{w}_{0,k'} x_{0,k'} + n_{0,k} \tag{1}$$
where $\mathbf{w}_{0,k} \in \mathbb{C}^{N \times 1}$ and $\mathbf{h}_{0,k} \in \mathbb{C}^{N \times 1}$ are the beamforming vector and the channel vector to the $k$th receiver, respectively, $x_{0,k}$ is the transmitted information, and $n_{0,k}$ is the white Gaussian noise. Regularized zero-forcing (RZF) precoding is employed by the MBS, which is given by
$$\mathbf{u}_{0,k} = \frac{(\mathbf{H}\mathbf{H}^T + \alpha \mathbf{I})^{-1} \mathbf{h}_{0,k}}{\left\| (\mathbf{H}\mathbf{H}^T + \alpha \mathbf{I})^{-1} \mathbf{h}_{0,k} \right\|},$$
where $\mathbf{H} = [\mathbf{h}_{0,1}, \ldots, \mathbf{h}_{0,J+M}] \in \mathbb{C}^{N \times (J+M)}$ is the channel matrix and $\alpha > 0$ is the regularization parameter. $\mathbf{u}_{0,k}$ satisfies the power constraint $\sum_{k \in \mathcal{F}} \mathbf{u}_{0,k}^T \mathbf{u}_{0,k} v_k \le P^{max}$, where $v_k$ denotes the transmit power to the $k$th receiver, and the corresponding beamforming vector can be computed as $\mathbf{w}_{0,k} = \mathbf{u}_{0,k} \sqrt{v_k}$. Thereby the spectral efficiency from the MBS to the $k$th receiver, determined by its SINR, is
$$r_{0,k} = \log\left(1 + \frac{v_k g_{0,k}}{\sum_{k' \in \mathcal{F}\setminus k} v_{k'} g_{0,k'} + N_0}\right),$$
where $g_{0,k} = |\mathbf{h}_{0,k}^T \mathbf{u}_{0,k}|^2$ and $N_0$ is the power of the noise, and the corresponding achievable rate is
$$R_{0,k} = \begin{cases} \dfrac{1-\beta}{M}\, r_{0,k}, & k \in \mathcal{M} \\ \beta\, r_{0,k}, & k \in \mathcal{B} \end{cases} \tag{2}$$
Denote by $p_j$ the transmit power of the $j$th SBS and by $h_{j,k}$ the channel from the $j$th SBS to the $k$th SUE. The achievable rate
HUANG AND GUO: SPECTRAL AND ENERGY EFFICIENT RESOURCE ALLOCATION FOR MASSIVE MIMO HetNets WITH WIRELESS BACKHAUL 553
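can be expressed as
$$R_{j,k} = \frac{1-\beta}{K}\, r_{j,k} = \frac{1-\beta}{K} \log\left(1 + \frac{p_j |h_{j,k}|^2}{\sum_{j' \in \mathcal{B}\setminus j} p_{j'} |h_{j',k}|^2 + N_0}\right). \tag{3}$$

III. PROBLEM FORMULATION
Let us define the system EE as the ratio between the sum access rate and the power consumption, which can be modeled as
$$\psi_{EE}(\mathbf{p}, \mathbf{v}, \beta) = \frac{\sum_{j \in \mathcal{B}_0} \sum_{k \in \mathcal{K}} R_{j,k}}{\sigma_0 \sum_{k \in \mathcal{F}} v_k + \sum_{j \in \mathcal{B}} \sigma_j p_j} \tag{4}$$
where $P_C(\mathbf{p}, \mathbf{v}, \beta) = \sigma_0 \sum_{k \in \mathcal{F}} v_k + \sum_{j \in \mathcal{B}} \sigma_j p_j$ is the aggregation of the emitted powers, and the constant $\sigma_j \ge 1$, $\forall j \in \mathcal{B}_0$, accounts for the power amplifier at the transmitter. $\psi_{SE}(\mathbf{p}, \mathbf{v}, \beta) = \sum_{j \in \mathcal{B}_0} \sum_{k \in \mathcal{K}} R_{j,k}$ is the system SE. Our target is to study the tradeoff between SE and EE, so we formulate a multi-objective optimization (MOO) problem as
$$\max_{\mathbf{v}, \mathbf{p}, \beta} \; [\psi_{EE}(\mathbf{p}, \mathbf{v}, \beta),\; \psi_{SE}(\mathbf{p}, \mathbf{v}, \beta)] \tag{5a}$$
$$\text{s.t.} \quad R_{0,k} \ge R_{0,k}^{min}, \; \forall k \in \mathcal{M} \tag{5b}$$
$$R_{j,k} \ge R_{j,k}^{min}, \; \forall j \in \mathcal{B}, k \in \mathcal{K} \tag{5c}$$
$$\sum_{k \in \mathcal{K}} R_{j,k} \le \beta\, r_{0,k}, \; \forall j \in \mathcal{B} \tag{5d}$$
$$\sum_{k \in \mathcal{F}} \mathbf{u}_{0,k}^T \mathbf{u}_{0,k} v_k \le P^{max}, \quad p_j \le P_j^{max}, \; \forall j \in \mathcal{B} \tag{5e}$$
$$0 \le \beta \le 1 \tag{5f}$$
where constraints (5b) and (5c) specify the minimum QoS requirement for users. Equation (5d) is the downlink wireless backhaul constraint: the backhaul rate of the $j$th small cell should be no smaller than the access data transmission rate from SBS $j$ to its intended SUEs.

As explained in [8], (5a), which represents the tradeoff between SE and EE, can be equivalently converted into the following problem of maximizing SE while minimizing total power consumption
$$\max_{\mathbf{p}, \mathbf{v}, \beta} \; [\psi_{SE}(\mathbf{p}, \mathbf{v}, \beta),\; -P_C(\mathbf{p}, \mathbf{v}, \beta)] \tag{6a}$$
$$\text{s.t.} \quad \text{(5b)–(5f)} \tag{6b}$$
The two objectives in (6a) conflict, and the most widely accepted way to handle such a problem is to obtain Pareto-optimal points. To achieve Pareto optimality, we introduce a parameter $\eta \in [0, 1]$ to adjust the importance dynamically towards $\psi_{SE}$ or $P_C$ depending on the network situation. During peak hours, increasing $\psi_{SE}$ is more important than reducing $P_C$, in order to satisfy the demand of more users. On the other hand, during off-peak hours, minimizing $P_C$ is more important than increasing $\psi_{SE}$, in order to increase $\psi_{EE}$. The problem then becomes
$$\max_{\mathbf{p}, \mathbf{v}, \beta} \; \frac{1-\eta}{R}\, \psi_{SE}(\mathbf{p}, \mathbf{v}, \beta) - \frac{\eta}{P}\, P_C(\mathbf{p}, \mathbf{v}, \beta) \quad \text{s.t. (5b)–(5f)} \tag{7}$$
where $R$ and $P$ are normalization factors for the achievable rate and power.

Problem (7) is a difficult non-convex problem due to the strong coupling among variables and the interference term in $R_{j,k}$, $\forall j \in \mathcal{B}_0$; such problems are generally NP-hard. In what follows, we address problem (7) using a novel alternating optimization algorithm, which improves the solution at each iteration.

IV. ALGORITHM DEVELOPMENT
In this section, we use AO to decouple the original problem into (i) the beamforming and power allocation subproblem and (ii) the wireless backhaul bandwidth allocation subproblem, and further apply SCA to approximate the former by a convex one and utilize the Lagrange dual method to solve the latter.

A. Beamforming and Power Allocation
Given $\beta$, the original problem (7) can be rewritten as follows
$$\max_{\mathbf{p}, \mathbf{v}} \; f(\mathbf{p}, \mathbf{v}, \beta) = \frac{1-\eta}{R}\, \psi_{SE}(\mathbf{p}, \mathbf{v}, \beta) - \frac{\eta}{P}\, P_C(\mathbf{p}, \mathbf{v}, \beta) \quad \text{s.t. (5b)–(5e)} \tag{8}$$
We first see that problem (8) is still not convex because of the interference term in $R_{j,k}$, $\forall j \in \mathcal{B}_0$. By introducing an additional slack variable $\mathbf{l} = [l_1, \ldots, l_J]$, the constraint (5d) is equivalent to the following pair of constraints
$$\sum_{k \in \mathcal{K}} R_{j,k} \le \beta\, l_j, \; \forall j \in \mathcal{B} \tag{9a}$$
$$\log\left(1 + \frac{v_k g_{0,k}}{\sum_{k' \in \mathcal{F}\setminus k} v_{k'} g_{0,k'} + N_0}\right) \ge l_j \tag{9b}$$
It is straightforward to see that (9b) and (5b)–(5c) have the same non-convex constraint form. Equation (9b) can be rewritten as
$$\log\Big(\sum_{k' \in \mathcal{F}} v_{k'} g_{0,k'} + N_0\Big) \ge l_j + \log\Big(\sum_{k' \in \mathcal{F}\setminus k} v_{k'} g_{0,k'} + N_0\Big) \tag{10}$$
where both sides of (10) are concave functions. We apply the first-order Taylor approximation to the right side of (10) around the point $\mathbf{v}^{(t)}$ as
$$\log\Big(\sum_{k' \in \mathcal{F}} v_{k'} g_{0,k'} + N_0\Big) \ge l_j + \log\Big(\sum_{k' \in \mathcal{F}\setminus k} v_{k'}^{(t)} g_{0,k'} + N_0\Big) + \frac{\sum_{k' \in \mathcal{F}\setminus k} \big(v_{k'} - v_{k'}^{(t)}\big) g_{0,k'}}{\sum_{k' \in \mathcal{F}\setminus k} v_{k'}^{(t)} g_{0,k'} + N_0} \tag{11}$$
where the superscript $t$ denotes the $t$th iteration of the iterative procedure. Similarly, (5b)–(5c) can also be transformed into the same convex form as (11), i.e., (5b*)–(5c*), which is omitted for brevity.

Now we turn our attention to $r_{j,k}$, $\forall j \in \mathcal{B}_0$, which is non-convex with respect to $\mathbf{p}$ and $\mathbf{v}$. We utilize the SCA method to derive its convex bounds in the following result, where $\bar{g}_{j,k} = g_{j,k}/N_0$ denotes the noise-normalized channel gain used in the Appendix.

Proposition 1: The convex lower bound of $r_{j,k}$ is
$$L_{j,k}^{(t)}(\mathbf{p}) \triangleq \frac{2 \sum_{j' \in \mathcal{B}} p_{j'} \bar{g}_{j',k}}{\sum_{j' \in \mathcal{B}} p_{j'} \bar{g}_{j',k} + 2} - \log\Big(1 + \sum_{j' \in \mathcal{B}\setminus j} p_{j'}^{(t)} \bar{g}_{j',k}\Big) - \frac{1}{\sum_{j' \in \mathcal{B}\setminus j} p_{j'}^{(t)} \bar{g}_{j',k} + 1} \sum_{j' \in \mathcal{B}\setminus j} \bar{g}_{j',k} \big(p_{j'} - p_{j'}^{(t)}\big) \tag{12}$$
and the convex upper bound is
$$U_{j,k}^{(t)}(\mathbf{p}) \triangleq \log\Big(1 + \sum_{j' \in \mathcal{B}} p_{j'}^{(t)} \bar{g}_{j',k}\Big) + \sum_{j' \in \mathcal{B}} \bar{g}_{j',k} \big(p_{j'} - p_{j'}^{(t)}\big) \times \frac{1}{\sum_{j' \in \mathcal{B}} p_{j'}^{(t)} \bar{g}_{j',k} + 1} - \frac{2 \sum_{j' \in \mathcal{B}\setminus j} p_{j'} \bar{g}_{j',k}}{\sum_{j' \in \mathcal{B}\setminus j} p_{j'} \bar{g}_{j',k} + 2} \tag{13}$$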
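Similarly, the lower bound $L_{0,k}^{(t)}(\mathbf{v})$ of $r_{0,k}$ can also be obtained and is omitted for brevity.
Proof: See the Appendix.

By applying the convex bounds $L_{j,k}^{(t)}(\mathbf{p})$, $U_{j,k}^{(t)}(\mathbf{p})$, and $L_{0,k}^{(t)}(\mathbf{v})$ to (8), the approximate convex program is given by
$$\max_{\mathbf{p}, \mathbf{v}, \mathbf{l} \ge 0} \; \tilde{f}^{(t)}(\mathbf{p}, \mathbf{v}, \beta) = \frac{(1-\eta)(1-\beta)}{R} \left( \sum_{k \in \mathcal{M}} \frac{L_{0,k}^{(t)}(\mathbf{v})}{M} + \sum_{j \in \mathcal{B}} \sum_{k \in \mathcal{K}} \frac{L_{j,k}^{(t)}(\mathbf{p})}{K} \right) - \frac{\eta}{P}\, P_C(\mathbf{p}, \mathbf{v}, \beta) \tag{14a}$$
$$\text{s.t.} \quad \frac{1-\beta}{K} \sum_{k \in \mathcal{K}} U_{j,k}^{(t)}(\mathbf{p}) \le \beta\, l_j, \; \forall j \in \mathcal{B} \tag{14b}$$
$$\text{(5b*), (5c*), (5e), (11)} \tag{14c}$$

Theorem 1: Starting from a feasible initial point $\{\mathbf{p}^{(0)}, \mathbf{v}^{(0)}\}$, the sequence $\{\mathbf{p}^{(t)}, \mathbf{v}^{(t)}\}$, $t = 1, 2, \ldots$, generated recursively by the optimal solutions of the convex problem (14) consists of improved solutions of problem (8) and converges to a KKT point.
Proof: Since $\tilde{f}^{(t)}(\mathbf{p}, \mathbf{v}, \beta)$ is a global lower bound of $f(\mathbf{p}, \mathbf{v}, \beta)$, it follows that
$$f\big(\Omega^{(t+1)}\big) \ge \tilde{f}^{(t)}\big(\Omega^{(t+1)}\big) \ge \tilde{f}^{(t)}\big(\Omega^{(t)}\big) = f\big(\Omega^{(t)}\big) \tag{15}$$
which means that the feasible point $\Omega^{(t+1)} = (\mathbf{p}^{(t+1)}, \mathbf{v}^{(t+1)}, \beta)$ is better than $\Omega^{(t)}$ for optimizing (8). As the sequence $\{\Omega^{(t)}\}$ is bounded, by the Cauchy theorem there is a subsequence $\{\Omega^{(t_v)}\}_{v=1}^{\infty}$ that converges to a limit point $\bar{\Omega}$, so $\lim_{v \to +\infty} f(\Omega^{(t_v)}) = f(\bar{\Omega})$. For every $t$, there is $v$ such that $t_v \le t \le t_{v+1}$. By (15), $f(\bar{\Omega}) = \lim_{v \to +\infty} f(\Omega^{(t_v)}) \le \lim_{v \to +\infty} f(\Omega^{(t)}) \le \lim_{v \to +\infty} f(\Omega^{(t_{v+1})}) = f(\bar{\Omega})$, showing that $\lim_{t \to +\infty} f(\Omega^{(t)}) = f(\bar{\Omega})$. Therefore, $\bar{\Omega}$ is a KKT point according to [9].

B. Wireless Backhaul Bandwidth Allocation
When $\mathbf{p}$ and $\mathbf{v}$ are fixed, we obtain the wireless backhaul bandwidth subproblem as
$$\max_{\beta} \; \frac{1-\eta}{R}\, \psi_{SE}(\mathbf{p}, \mathbf{v}, \beta) \quad \text{s.t. (5b)–(5d), (5f)} \tag{16}$$
Defining Lagrange multipliers $\boldsymbol{\lambda}_1 = [\lambda_{0,1}, \ldots, \lambda_{0,M}]^T$ for (5b), $\boldsymbol{\lambda}_2 = [\lambda_{1,1}, \ldots, \lambda_{J,K}]^T$ for (5c), and $\boldsymbol{\mu} = [\mu_1, \ldots, \mu_J]^T$ for (5d), the Lagrange function of (16) is
$$L(\beta, \boldsymbol{\lambda}, \boldsymbol{\mu}) = \sum_{j \in \mathcal{B}} \sum_{k \in \mathcal{K}} \lambda_{j,k} \big(R_{j,k} - R_{j,k}^{min}\big) + \sum_{k \in \mathcal{M}} \lambda_{0,k} \big(R_{0,k} - R_{0,k}^{min}\big) + \frac{1-\eta}{R} \Big( \sum_{k \in \mathcal{M}} R_{0,k} + \sum_{j \in \mathcal{B}} \sum_{k \in \mathcal{K}} R_{j,k} \Big) + \sum_{j \in \mathcal{B}} \mu_j \Big( \beta\, r_{0,k} - \sum_{k \in \mathcal{K}} R_{j,k} \Big). \tag{17}$$
Thus, the corresponding Lagrange dual problem can be represented as
$$\min_{\boldsymbol{\lambda}, \boldsymbol{\mu} \ge 0} \; h(\boldsymbol{\lambda}, \boldsymbol{\mu}), \qquad h(\boldsymbol{\lambda}, \boldsymbol{\mu}) = \max_{0 \le \beta \le 1} L(\beta, \boldsymbol{\lambda}, \boldsymbol{\mu}) \tag{18}$$

Algorithm 1 Joint Beamforming, Power and Bandwidth Allocation
1: Select feasible initial values for $\mathbf{p}^{(t)}$, $\mathbf{v}^{(t)}$;
2: Set $t = 0$;
3: repeat
4:   Compute the optimum $\beta^*$ according to (19);
5:   repeat
6:     Obtain $\mathbf{l}^*$, $\mathbf{p}^*$ and $\mathbf{v}^*$ according to (14) with the increment $t = t + 1$;
7:   until convergence of $\mathbf{p}^{(t)}$, $\mathbf{v}^{(t)}$;
8: until convergence of objective function (7)

TABLE I

According to (18), given $\boldsymbol{\lambda}^{(l+1)}$ and $\boldsymbol{\mu}^{(l+1)}$ obtained from the $l$th inner iteration, the update of $\beta$ is expressed as follows
$$\beta = -\sum_{j \in \mathcal{B}} \sum_{k \in \mathcal{K}} \left( \frac{(1-\eta)\, r_{j,k}}{K R} + \frac{\lambda_{j,k}\, r_{j,k}}{K} \right) + \sum_{k \in \mathcal{M}} \frac{\eta\, r_{0,k}}{M R} + \sum_{j \in \mathcal{B}} \mu_j \Big( r_{0,k} + \sum_{k \in \mathcal{K}} \frac{r_{j,k}}{K} \Big) - \sum_{k \in \mathcal{M}} \frac{\lambda_{0,k}\, r_{0,k}}{M} \tag{19}$$
Finally, the dual problem (18) can be solved by adopting the subgradient method as follows
$$\lambda_{0,k}^{(l+1)} = \lambda_{0,k}^{(l)} - \delta_1^{(l)} \Big( \big(1 - \beta^{(l)}\big) r_{0,k}/M - R_{0,k}^{min} \Big) \tag{20}$$
$$\lambda_{j,k}^{(l+1)} = \lambda_{j,k}^{(l)} - \delta_2^{(l)} \Big( \big(1 - \beta^{(l)}\big) r_{j,k}/K - R_{j,k}^{min} \Big) \tag{21}$$
$$\mu_j^{(l+1)} = \mu_j^{(l)} - \delta_3^{(l)} \Big( \beta^{(l)} r_{0,k} - \sum_{k \in \mathcal{K}} \big(1 - \beta^{(l)}\big) r_{j,k}/K \Big) \tag{22}$$

C. Joint Beamforming, Power and Bandwidth Allocation Algorithm
The proposed joint beamforming, power and bandwidth allocation algorithm is shown in Algorithm 1, which requires only polynomial complexity: the complexities of solving problems (14) and (17) are $O((2JK + 3JN + 3MN)^3)$ and $O(JK + 3J + M)$, respectively [10].

V. SIMULATION RESULTS
In the simulations, we consider a downlink HetNet with one MBS and four SBSs. The coverage radius of the macro cell is 0.5 km, and that of a small cell is 0.04 km. The pathloss between the MBS and its receivers and between an SBS and its SUEs are defined as $127 + 30 \log_{10} d$ and $128.1 + 37.6 \log_{10} d$, respectively, with $d$ in km. The noise power spectral density is −174 dBm/Hz, and the detailed simulation parameters are presented in Table I.

In Fig. 1(a), we study the impact of different rate constraints on the SE-EE tradeoff, where $K = 3$ and $M = 8$. The SE curves decrease to the predetermined constraints as $\eta$ rises to 1. On the other hand, we observe that as $R^{min}$ increases, the EE of the whole system decreases. In addition, the EE curves increase first and then decrease as $\eta$ increases. That is because the transmission rate first drops more slowly than the power consumption, and then more quickly.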
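As a rough illustration of the simulation setup above, the following Python sketch evaluates the two quoted pathloss models and converts the noise power spectral density into a noise power. The function names and the 10 MHz bandwidth are assumptions made for this example only; the actual bandwidth is listed in Table I, whose contents are not reproduced here.

```python
import math

def pathloss_mbs_db(d_km: float) -> float:
    """Pathloss between the MBS and a receiver, 127 + 30 log10(d), d in km."""
    return 127.0 + 30.0 * math.log10(d_km)

def pathloss_sbs_db(d_km: float) -> float:
    """Pathloss between an SBS and an SUE, 128.1 + 37.6 log10(d), d in km."""
    return 128.1 + 37.6 * math.log10(d_km)

def noise_power_dbm(bandwidth_hz: float, psd_dbm_per_hz: float = -174.0) -> float:
    """Noise power over a bandwidth, from a flat power spectral density."""
    return psd_dbm_per_hz + 10.0 * math.log10(bandwidth_hz)

if __name__ == "__main__":
    # Hypothetical value: the bandwidth used in the letter is not given in the text.
    ASSUMED_BANDWIDTH_HZ = 10e6
    print(f"MBS pathloss at the 0.5 km cell edge: {pathloss_mbs_db(0.5):.2f} dB")
    print(f"SBS pathloss at the 0.04 km cell edge: {pathloss_sbs_db(0.04):.2f} dB")
    print(f"Noise power over the assumed bandwidth: {noise_power_dbm(ASSUMED_BANDWIDTH_HZ):.1f} dBm")
```

Both models return 127 dB and 128.1 dB at a distance of 1 km, so the constant terms can be read as the reference loss at 1 km and the slopes (30 and 37.6) as ten times the pathloss exponent.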
Fig. 1. (a) Energy and spectrum efficiency tradeoff for different QoS. (b) Comparison of energy and spectrum efficiency for different strategies. (c) Energy and spectrum efficiency tradeoff with limited CSI.

Fig. 1(b) compares the performance of the proposed Algorithm 1 with the optimal algorithm and the existing algorithm of [6]. Note that the optimal algorithm solves the original problem (7) by exhaustive search. We see that the proposed algorithm performs similarly to the optimal algorithm, and shows an SE-EE performance improvement in comparison with the algorithm of [6]. The reason is that the lower and upper convex bounds of the non-convex term that we derived are closer to the optimal solution.

In Fig. 1(c), we evaluate the performance of Algorithm 1 with limited channel state information (CSI). The imperfect CSI can be modeled as [11] $\mathbf{h}_{0,k} = \sqrt{1 - \tau_{0,k}^2}\, \hat{\mathbf{h}}_{0,k} + \tau_{0,k} \mathbf{e}_{0,k}$, $\forall k \in \mathcal{F}$, and $h_{j,k} = \sqrt{1 - \tau_{j,k}^2}\, \hat{h}_{j,k} + \tau_{j,k} e_{j,k}$, $\forall j \in \mathcal{B}, k \in \mathcal{K}$, where $\hat{\mathbf{h}}_{0,k}$ and $\hat{h}_{j,k}$ are the estimated CSI, $\mathbf{e}_{0,k}$ and $e_{j,k}$ are the channel noise, and $\tau_{j,k} \in [0, 1]$ indicates the channel estimation error of the CSI. We show the SE-EE tradeoff for error values $\tau_{j,k} = 0, 0.2, 0.3$, with $R_{0,k}^{min} = R_{j,k}^{min} = 0.6$. We observe that both SE and EE degrade as $\tau_{j,k}$ increases.

VI. CONCLUSION
In this letter, we proposed a spectral and energy efficiency evaluation framework for massive MIMO HetNets with wireless backhaul while guaranteeing user QoS, interference mitigation and sufficient capacity in the wireless backhaul. We developed an efficient algorithm to solve the non-convex optimization problem based on the Lagrange dual method and SCA. Numerical results demonstrated the effectiveness of the framework in improving the energy and spectral efficiency.

APPENDIX
PROOF OF PROPOSITION 1
By simple algebraic manipulation, $r_{j,k}$ is rewritten as
$$r_{j,k} = \log\Big(N_0 + \sum_{j' \in \mathcal{B}} p_{j'} g_{j',k}\Big) - \log\Big(N_0 + \sum_{j' \in \mathcal{B}\setminus j} p_{j'} g_{j',k}\Big) \tag{23}$$
Since $\log(x)$ is a concave function for $x > 0$, we bound the first term on the right side of (23) by its first-order Taylor expansion around the point $p_j^{(t)}$ as
$$\log\Big(1 + \sum_{j' \in \mathcal{B}} p_{j'} \bar{g}_{j',k}\Big) \le \log\Big(1 + \sum_{j' \in \mathcal{B}} p_{j'}^{(t)} \bar{g}_{j',k}\Big) + \frac{\sum_{j' \in \mathcal{B}} \bar{g}_{j',k} \big(p_{j'} - p_{j'}^{(t)}\big)}{\sum_{j' \in \mathcal{B}} p_{j'}^{(t)} \bar{g}_{j',k} + 1} \tag{24}$$
where $\bar{g}_{j,k} = g_{j,k}/N_0$. According to [12], $\frac{2x}{2+x} < \log(1+x)$ for all $x > 0$; it follows that
$$\log\Big(1 + \sum_{j' \in \mathcal{B}\setminus j} p_{j'} \bar{g}_{j',k}\Big) > \frac{2 \sum_{j' \in \mathcal{B}\setminus j} p_{j'} \bar{g}_{j',k}}{2 + \sum_{j' \in \mathcal{B}\setminus j} p_{j'} \bar{g}_{j',k}} \tag{25}$$
Equation (13) can be proved by combining (24) and (25). Similarly, (12) can be derived with minor modifications to (24) and (25).

REFERENCES
[1] K. N. R. S. V. Prasad, E. Hossain, and V. K. Bhargava, “Energy efficiency in massive MIMO-based 5G networks: Opportunities and challenges,” IEEE Wireless Commun., vol. 24, no. 3, pp. 86–94, Jun. 2017.
[2] J. G. Andrews et al., “What will 5G be?” IEEE J. Sel. Areas Commun., vol. 32, no. 6, pp. 1065–1082, Jun. 2014.
[3] T. M. Nguyen, A. Yadav, W. Ajib, and C. Assi, “Centralized and distributed energy efficiency designs in wireless backhaul HetNets,” IEEE Trans. Wireless Commun., vol. 16, no. 7, pp. 4711–4726, Jul. 2017.
[4] N. Wang, E. Hossain, and V. K. Bhargava, “Joint downlink cell association and bandwidth allocation for wireless backhauling in two-tier HetNets with large-scale antenna arrays,” IEEE Trans. Wireless Commun., vol. 15, no. 5, pp. 3251–3268, May 2016.
[5] Y. Liu, L. Lu, G. Y. Li, Q. Cui, and W. Han, “Joint user association and spectrum allocation for small cell networks with wireless backhauls,” IEEE Wireless Commun. Lett., vol. 5, no. 5, pp. 496–499, Oct. 2016.
[6] H. Zhang, H. Liu, J. Cheng, and V. C. M. Leung, “Downlink energy efficiency of power allocation and wireless backhaul bandwidth allocation in heterogeneous small cell networks,” IEEE Trans. Commun., vol. 66, no. 4, pp. 1705–1716, Apr. 2018.
[7] L. Sanguinetti, A. L. Moustakas, and M. Debbah, “Interference management in 5G reverse TDD HetNets with wireless backhaul: A large system analysis,” IEEE J. Sel. Areas Commun., vol. 33, no. 6, pp. 1187–1200, Jun. 2015.
[8] O. Amin, E. Bedeer, M. H. Ahmed, and O. A. Dobre, “Energy efficiency–spectral efficiency tradeoff: A multiobjective optimization approach,” IEEE Trans. Veh. Technol., vol. 65, no. 4, pp. 1975–1981, Apr. 2016.
[9] B. R. Marks and G. P. Wright, “A general inner approximation algorithm for nonconvex mathematical programs,” Oper. Res., vol. 26, no. 4, pp. 681–683, 1978.
[10] A. Ben-Tal and A. Nemirovski, Lectures on Modern Convex Optimization: Analysis, Algorithms, and Engineering Applications, vol. 2. Philadelphia, PA, USA: SIAM, 2001.
[11] T. K. Vu, M. Bennis, S. Samarakoon, M. Debbah, and M. Latva-Aho, “Joint load balancing and interference mitigation in 5G heterogeneous networks,” IEEE Trans. Wireless Commun., vol. 16, no. 9, pp. 6032–6046, Sep. 2017.
[12] E. R. Love, “64.4 some logarithm inequalities,” Math. Gazette, vol. 64, no. 427, pp. 55–57, 1980.
Practical User Selection With Heterogeneous Bandwidth and Antennas for MU-MIMO WLANs
Sulei Wang, Zhe Chen, Yuedong Xu, Xin Wang, and Qingsheng Kong

Abstract—User selection is one of the most important components for next generation multi-user multiple-input-multiple-output wireless local area networks. However, state-of-the-art approaches neglect the heterogeneity of users in the available bandwidth and the number of antennas, which diminishes their performance considerably. To tackle this challenge, we formulate a novel integer optimization framework to select the antennas of heterogeneous users simultaneously. With the signal-to-interference-and-noise ratio of users estimated via channel vector projection, we propose a low-complexity branch-and-prune algorithm to search for the near-optimal combinations of user antennas. Our algorithm is compatible with legacy 802.11ac and is implemented on a software defined radio system. Extensive experiments show that our algorithm achieves around 95% of the optimal throughput and outperforms a benchmark scheme with a 1.18× gain in realistic indoor environments.

Index Terms—MU-MIMO, user selection, heterogeneity, branch-and-prune.

Manuscript received September 7, 2018; revised October 26, 2018; accepted October 29, 2018. Date of publication November 6, 2018; date of current version April 9, 2019. This work was supported in part by the Natural Science Foundation of China under Grant 61772139, in part by the Shanghai–Hong Kong Collaborative Project under Grant 18510760900, and in part by the CERNET Innovation Project under Grant NGII20170209. The associate editor coordinating the review of this paper and approving it for publication was C. T. Chou. (Corresponding author: Yuedong Xu.)
S. Wang, Y. Xu, and Q. Kong are with the Research Center of Smart Networks and Systems, School of Information Science and Technology, Fudan University, Shanghai 200433, China (e-mail: wangsl16@fudan.edu.cn; ydxu@fudan.edu.cn; qskong@fudan.edu.cn).
Z. Chen and X. Wang are with the School of Computer Science, Fudan University, Shanghai 200433, China (e-mail: zhechen13@fudan.edu.cn; xinw@fudan.edu.cn).
Digital Object Identifier 10.1109/LWC.2018.2879668

I. INTRODUCTION
MU-MIMO is a key enabling technology to scale up the capacity of 802.11ac wireless local area networks (WLANs) [1]. Equipped with multiple antennas, an access point (AP) is capable of transmitting multiple data streams to different users or receive antennas concurrently, achieving a spatial reuse gain up to the number of transmit antennas [2]. This spatial reuse gain depends on the channel orthogonality among users, and such orthogonality cannot always be preserved [3]. Hence, the AP needs to schedule the transmission of a group of users wisely in each slot so as to reduce the inter-user interference, especially when the candidate pool is large.

The design of user selection strategies has attracted much attention recently in 802.11ac WLANs. Xie and Zhang [4] proposed an orthogonality probing based user selection scheme named OPUS. SIEVE [5] balanced the trade-off between performance and complexity with a scalable multiuser selection module. Recently, MUSE [6] was designed to perform user selection with limited CSI feedback on commodity Wi-Fi devices.

Although the above seminal approaches perform well, they neglect a crucial phenomenon, that is, user heterogeneity. User heterogeneity stems from the diversity of devices accessing WLANs, e.g., laptops, cellphones, wearables, and smart-home devices. The users possess different numbers of antennas and support different maximum bandwidths. User heterogeneity poses new challenges to user selection in MIMO systems. For instance, grouping all the antennas of one multiple-antenna user does not always yield a high throughput. If two users are selected whose bandwidths are 20 MHz and 40 MHz respectively, the AP will transmit in the 20 MHz bandwidth. Therefore, the user selection strategy must be capable of handling user heterogeneity, as it is significant in practice and coupled with the mitigation of inter-user interference.

In this letter, we formulate the user selection as an integer programming problem. The objective is to maximize the aggregate throughput in each slot, constrained by the number of transmit antennas and the available bandwidth of users. Finding the optimal set of users incurs a prohibitive computational complexity, and is thus infeasible for online processing. To circumvent this difficulty, we first adopt a channel vector projection method [7] to estimate the SINR of user antennas. A branch-and-prune algorithm [8] is further applied to select the user antennas incrementally. Specifically, our algorithm maintains multiple candidate user antenna combinations that enlarge the search space for better throughput. The algorithm is compatible with legacy 802.11ac and advanced techniques including CSI compression. We implement the proposed algorithm on the software defined radio platform WARP [9], and extensive experiments demonstrate its effectiveness.

II. PROBLEM FORMULATION
Suppose that the AP is equipped with $S$ antennas and there are $K$ users contending for transmission, where user $k$ is equipped with $n_k$ antennas and supports the maximum available bandwidth $B_k$. With the binary selection mask $x_{s,k,i}$ indicating whether antenna $i$ at user $k$ is selected and served by data stream $s$ or not, we denote the receiving SINR as $\mathrm{SINR}_{s,k,i}$. The corresponding throughput $U_{s,k,i}$ is given by $U_{s,k,i} = B \log(1 + \mathrm{SINR}_{s,k,i})$, where $B$ is the transmission bandwidth. Thus the problem is formulated as follows:
$$\max_{x, B} \; \sum_{s=1}^{S} \sum_{k=1}^{K} \sum_{i=1}^{n_k} x_{s,k,i}\, U_{s,k,i} \tag{1}$$
$$\text{s.t.} \quad \sum_{s=1}^{S} \sum_{k=1}^{K} \sum_{i=1}^{n_k} x_{s,k,i} \le S, \tag{2}$$
$$B = \min_{s,k,i} \{x_{s,k,i} B_k\} \setminus \{0\}, \tag{3}$$
$$x_{s,k,i} \in \{0, 1\}. \tag{4}$$
Eq. (2) means that the $S$-antenna AP can serve no more than $S$ user antennas concurrently. The users support diverse maximum bandwidths, while the AP can only transmit on a central frequency with one channel bandwidth in each transmission slot. Therefore, once the set of user antennas is selected, the AP transmits using the lowest available
WANG et al.: PRACTICAL USER SELECTION WITH HETEROGENEOUS BANDWIDTH AND ANTENNAS FOR MU-MIMO WLANs 557
Fig. 1. SINR degrades after projection.
bandwidth that can be supported by them (Eq. (3)). For

example, if three user antennas supporting up to 20MHz,
40MHz, and 40MHz are selected, the actual bandwidth for
transmission is 20MHz.
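The bandwidth rule of Eq. (3) and the objective of Eq. (1) can be illustrated with a minimal sketch for one fixed selection. The function names, the MHz unit and the base-2 logarithm are assumptions made here for illustration; the letter does not fix the base of the logarithm.

```python
from math import log2

def transmission_bandwidth(selected_bandwidths_mhz):
    """Eq. (3): the AP uses the lowest maximum bandwidth among the selected antennas."""
    return min(selected_bandwidths_mhz)

def sum_throughput(selected_bandwidths_mhz, selected_sinrs):
    """Eq. (1) for one fixed selection, with U = B * log2(1 + SINR) (base 2 assumed)."""
    b = transmission_bandwidth(selected_bandwidths_mhz)
    return sum(b * log2(1.0 + sinr) for sinr in selected_sinrs)

print(transmission_bandwidth([20, 40, 40]))  # 20, as in the example above
```

With equal linear SINRs of 1, the three-antenna selection yields 60 (in units of MHz times bit/s/Hz) while dropping the 20 MHz antenna yields 80, which is why the bandwidth has to be part of the selection decision.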
Fig. 2. Greedy user selection with pruning.

Denote by N the total number of user antennas, i.e., $N = \sum_{k=1}^{K} n_k$. There exist $\sum_{T=1}^{S} \binom{N}{T}$ possible user antenna
combinations. Exhaustively searching for the optimal combi-
nation entails a prohibitively high complexity, which is not
suitable for online computation in MIMO WLANs.
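To give a feel for the size of this search space, the count of candidate combinations can be evaluated directly. This is a small sketch; the function name is hypothetical.

```python
from math import comb

def search_space_size(n_user_antennas, n_ap_antennas):
    """Number of non-empty user antenna subsets with at most S members."""
    return sum(comb(n_user_antennas, t) for t in range(1, n_ap_antennas + 1))

# Setup used later in the letter's experiments: N = 12 user antennas, S = 4 AP antennas.
print(search_space_size(12, 4))  # 793
```

The count grows quickly with N and S, so the same function can be used to check how fast exhaustive search becomes impractical for larger deployments.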
III. PROPOSED ALGORITHM

In this section, we first introduce a lightweight method to estimate the SINR of user antennas. Then we adopt a low-complexity branch-and-prune algorithm to select user antennas incrementally. The complexity of our algorithm is analyzed and the feasibility of practical implementation is described.

Fig. 3. WARP platform.

A. SINR Inference

The prerequisite of designing a user selection algorithm is to infer the SINR of possible user antenna combinations. Here, we use an example in Fig. 1 to illustrate how the channel vector projection can achieve this goal. The two-antenna AP transmits signals in a two-dimensional space where $h_{11}, h_{12}, h_{21}, h_{22}$ are complex channel gains between the AP and the user antennas. When precoding a symbol for U2, the AP nullifies the interference from U1 by projecting the symbol for U2 in a direction orthogonal to $h_1$. This projection process leads to a degradation in the SINR of U2, which is reflected in the reduced length of the blue vector. The SINR reduction decreases as the inter-user reception angle θ increases. In the special case θ = 90°, the channel vectors of U1 and U2 are perfectly orthogonal to each other, hence there is no SINR reduction.

Denote by $\mathrm{SINR}_{\mathrm{orig}}$ the SINR of a user antenna when it is served alone; then the SINR after projection is given by
$$\mathrm{SINR}_{\mathrm{proj}} = \sin^2(\theta) \cdot \mathrm{SINR}_{\mathrm{orig}}. \tag{5}$$

Generally, if an AP equipped with S antennas serves m user antennas (m < S) and a new antenna is added into the group, the SINR reduction of the new antenna can also be obtained by Eq. (5). In this case, sin θ is obtained by
$$\sin\theta = \frac{|h_{\perp} \cdot h|}{\|h_{\perp}\| \cdot \|h\|} \tag{6}$$
where h is the channel vector of the new user and $h_{\perp}$ is the vector orthogonal to the subspace spanned by the channel vectors of the m selected users.

B. Greedy Search With Pruning

A conventional greedy algorithm selects, in each step, the user from the candidate pool that yields the highest throughput increment. However, there is no remedy when the first several steps deviate far away from optimality. In our problem, a simple greedy approach may lead to the improper downgrading of bandwidth or assignment of antennas. We employ the branch-and-prune algorithm [8] to perform user selection for MU-MIMO WLANs. We keep M candidate user antenna combinations instead of selecting only one and rejecting all the others in each incremental step, which distinguishes our algorithm from previous methods.

The high-level operation flow of our algorithm is as follows:
1) Initialization: Each of N user antennas is initialized as a candidate user antenna combination, generating N candidates.
2) Branching: For each candidate user antenna combination, we add all the user antennas that have not been included in it respectively.
3) Pruning: We compute the sum throughput of all the branched user antenna combinations, keep the top-M user antenna combinations and reject all the others.
4) Looping: The above branching and pruning process repeats until all the antennas of the AP have been assigned. We then select the user antenna combination with the maximum sum throughput in the final M choices.

Fig. 2 provides an illustrative example of our algorithm. $A_i$ (i = 1, ..., N) refers to the ith antenna. In this tree, each node represents a candidate user antenna combination. The arrows indicate the incremental user selection process. An arrow pointing to a node means that an additional user antenna is added. A red solid arrow means that the node is branched and retained, while the black dashed arrows point to the pruned nodes. The number of red arrows in each step, M, is a tunable parameter; at M = 1 the branch-and-prune algorithm reduces to a naive greedy approach. Here, we let M be 3 so that only three combinations are retained in each step.

In the initialization step, each user antenna is initialized as a candidate combination itself. Later on, each step contains a branching phase and a pruning phase. In the branching phase, we add each remaining user antenna in the set of candidate antennas. For instance, when each node $A_i$ is added to $\{A_1\}$ for $i \neq 1$, we obtain N−1 branches of user antenna combinations, i.e., $\{A_1, A_2\}, \ldots, \{A_1, A_N\}$. Hence, there are N(N−1) branches in total at the first step. In the pruning phase, the total throughput of each user antenna combination is computed. The M user antenna combinations with highest throughput
are retained, i.e., $\{A_1, A_3\}$, $\{A_2, A_N\}$ and $\{A_N, A_1\}$, and the remaining branches are pruned subsequently.

Fig. 4. Throughput comparison.
Fig. 5. Impact of search space.

The branch-and-prune process terminates automatically when the number of selected user antennas equals the number of antennas at the AP. Since only one user antenna is added in each incremental step, the algorithm will terminate at Step S. We then select the user antenna combination yielding the maximum sum throughput in the final M choices $\Omega_i$ (i = 1, ..., M).

Note that a combination with more user antennas does not always lead to a higher total throughput. There are two reasons accounting for this phenomenon: 1) the channel of a newly added user antenna is highly correlated with the existing user antennas; 2) the newly added user antenna that is of lower bandwidth may force the existing users to use the lower bandwidth, causing a downgraded throughput. When a candidate user antenna combination outperforms all its branching combinations, we mark it as a special combination and let it compare with the top-M combinations in the final step. The branch-and-prune loop remains unchanged. As shown in Fig. 2, $\Omega_{M+1} = \{A_N, A_1\}$ is such a user antenna combination, and is put in the final candidate pool.

Fairness Control: To maintain fairness among user antennas in each scheduling round, we divide all the user antennas into an active set and an inactive set. The user antennas in the active set are moved to the inactive set after being selected. The user antennas in the inactive set are restored to the active set when it is empty. This mechanism is easy to implement and can effectively ensure the fairness among different user antennas.

Complexity: Our user antenna selection algorithm has a polynomial-time computational complexity. In the initialization step, N branches are created and maintained without any pruning. Therefore, N(N−1) candidate combinations are branched at step 1. The selection of the highest M combinations yields a complexity order $O(MN^2)$. After step 1, only M branches are maintained until the end of the branching of all the steps. Hence, there exist M(N−1) branches in each following step where the selection of the top-M combinations has a complexity order $O(M^2 N)$. Considering that the above procedure is executed for S − 1 times since Step 2, our branch-and-prune algorithm has a complexity order of $O(MN^2 + SNM^2)$, which is lower than exhaustive searching.

Compatibility: Our algorithm does not require any modification on vanilla 802.11ac medium access control (MAC) protocols. The algorithm is executed based on the CSI feedback, and the data streams are precoded based on the selection results. It is compatible with techniques including CSI feedback compression and frame aggregation, as long as effective CSI feedback is provided.

modulation/coding, channel estimation and ZFBF. In the MAC layer, we implement the aforementioned CSI feedback mechanism. Our experiments are conducted in a typical office environment. The users are placed randomly and sometimes move at a walking speed for evaluating the mobility scenario.

B. Evaluation

1) Effectiveness of Our Algorithm: To evaluate the effectiveness of our algorithm, we set up a MU-MIMO WLAN with a four-antenna AP and a random number of heterogeneous users that possess twelve antennas in total. Based on the SINR calculated from the received preambles, we obtain the system throughput via the SNR-to-rate lookup table [2]. For comparison, we implement a greedy user selection algorithm (M = 1). Besides, we implement MUSE [6], a heuristic user grouping scheme for MU-MIMO with the consideration of bandwidth heterogeneity. Moreover, we calculate the optimal user selection results in SU-MIMO and MU-MIMO modes offline. Our experiments are conducted in one hundred different scenarios, and each experiment has a runtime of 100 rounds.

Fig. 4 plots the CDFs of the sum throughput. The optimal result in SU-MIMO mode is the worst, verifying the necessity of MU-MIMO. The greedy algorithm is also not satisfactory, because the AP has no knowledge of the inter-user interference of unselected users when performing user selection incrementally. Our user selection algorithm achieves a total median throughput of 336.675 Mbps, outperforming MUSE by 18%. Less than 20% of experiments have a throughput below 300 Mbps with our algorithm, while more than 40% of them do with MUSE. The proposed algorithm is also repeated without considering bandwidth heterogeneity. The resulting throughput is much worse than that with bandwidth knowledge, implying the significance of taking bandwidth heterogeneity into consideration. Furthermore, our algorithm reaches around 95% of the throughput of the optimal result in MU-MIMO mode.

2) Impact of Search Space: We hereby evaluate the proposed branch-and-prune algorithm with different scales of search space. Recall that the search space is determined by M, the number of the candidate user combinations. Fig. 5 illustrates the CDF of sum throughput when M increases from one to six. A larger M yields a better sum throughput at the cost of increased computational complexity. When M changes from one to five, one can witness a nearly 20% throughput gain in most of the experiments; when it increases from five to six, the throughput gain becomes negligible. Hence, an optimistic message on our algorithm is that a small search space (e.g., M is chosen to be five) might be good enough.

3) Execution Time: We record the execution time under different numbers of user antennas and different search space.
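To make the dependence of the workload on N and M concrete, the branch-and-prune loop of Section III-B can be sketched together with the projection-based SINR estimate of Eqs. (5)-(6). This is a minimal sketch under stated assumptions, not the authors' implementation: channels are real-valued vectors, $h_\perp$ is taken as the residual of h after removing its projection onto the selected channels, throughput is B·log2(1 + SINR) instead of the SNR-to-rate lookup table, duplicate combinations are merged, and the special-combination rule is simplified to remembering the best combination seen so far. All function and parameter names are hypothetical.

```python
from math import log2

def residual_fraction(h, others):
    """sin^2(theta) of Eqs. (5)-(6): share of |h|^2 left after projecting out span(others)."""
    basis = []
    for v in others:                      # Gram-Schmidt on the already selected channels
        w = list(v)
        for b in basis:
            c = sum(x * y for x, y in zip(w, b))
            w = [x - c * y for x, y in zip(w, b)]
        norm = sum(x * x for x in w) ** 0.5
        if norm > 1e-12:
            basis.append([x / norm for x in w])
    r = list(h)
    for b in basis:                       # remove the component of h inside the subspace
        c = sum(x * y for x, y in zip(r, b))
        r = [x - c * y for x, y in zip(r, b)]
    return sum(x * x for x in r) / sum(x * x for x in h)

def sum_throughput(combo, chans, sinr_orig, bw):
    """Sum throughput of one combination; bandwidth is the minimum over its members."""
    b = min(bw[a] for a in combo)
    total = 0.0
    for a in combo:
        others = [chans[o] for o in combo if o != a]
        total += b * log2(1.0 + residual_fraction(chans[a], others) * sinr_orig[a])
    return total

def branch_and_prune(chans, sinr_orig, bw, num_ap_antennas, m_keep):
    """Returns (best combination, its throughput, number of branches generated)."""
    n = len(chans)
    score = lambda c: sum_throughput(c, chans, sinr_orig, bw)
    candidates = [frozenset([a]) for a in range(n)]   # initialization: N singletons
    best = max(candidates, key=score)
    branches = 0
    for _ in range(min(num_ap_antennas, n) - 1):
        scored = {}
        for combo in candidates:                      # branching
            for a in range(n):
                if a not in combo:
                    branches += 1
                    new = combo | {a}
                    if new not in scored:
                        scored[new] = score(new)
        candidates = sorted(scored, key=scored.get, reverse=True)[:m_keep]  # pruning
        if scored[candidates[0]] > score(best):
            best = candidates[0]
    return best, score(best), branches
```

For three user antennas and a two-antenna AP the first step generates N(N−1) = 6 branches, matching the count in the complexity analysis; later steps only branch from the M retained combinations.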
IV. IMPLEMENTATION AND EVALUATION

A. Implementation and Experimental Setup

We implement our user selection algorithm on the software defined radio platform WARP [9]. The PHY layer follows the 802.11ac specifications, consisting of OFDM,

As shown in Fig. 6, the execution time grows almost linearly with the number of user antennas, but fortunately with gentle slopes. Meanwhile, the execution time is proportional to the search space. Compared with the greedy algorithm (M = 1), the proposed algorithm costs more runtime, but leads to better throughput performance. Note that this part of experiments is conducted on a laptop configured with Intel Core i7-4500U
1.80 GHz CPU using MATLAB. We believe that a binary code implementation on a commercial AP will be much faster.

4) Effectiveness of Fairness Control: We further test our user and antenna selection algorithm with fairness consideration. The preceding experiments are repeated except that the fairness control scheme is activated. For comparison, we implement the proportional fairness scheduling of PF-11ac+ in [10]. We adopt Jain's fairness index to quantify the fairness of all the receive antennas, where the CDFs of average throughput and fairness index are shown in Fig. 7. One can observe that our fairness control method balances the transmission opportunities well among all the receive antennas. PF-11ac+ achieves better fairness performance because it is designed for long-term fairness. However, enforcing fairness control usually reduces the sum throughput. Both our fairness control method and PF-11ac+ degrade the sum throughput slightly, because some users with poor channel conditions or low bandwidth are grouped for fairness guarantee.

5) Compatibility: The proposed algorithm also works under CSI compression mechanisms that intend to reduce the feedback overhead. We implement AFC, a representative CSI compression mechanism proposed in [11], and fix the configurations under three different compression levels for verifiable comparison: conservative CSI compression (sharing CSI across 10 ms, one subcarrier and quantizing numerical values into 8 bits), median CSI compression (20 ms/two subcarriers/6 bits) and aggressive CSI compression (40 ms/four subcarriers/4 bits). Under the same experimental setup as before (four-antenna AP and twelve receive antennas), AFC reduces the overhead from 3.364 ms to 0.589 ms, 0.131 ms and 0.029 ms under different compression levels. However, we can see from Fig. 8 that the conservative compression has almost negligible impact on system throughput, while the median compression and the aggressive compression experience 7.3% and 15.4% throughput loss respectively.

C. Large-Scale Simulation

To evaluate the scalability of our algorithm, we consider a large scale WLAN that consists of an AP with eight to sixteen antennas and a certain amount of users with a total number of fifty receive antennas. Due to the restriction of the SDR platform (expensive price and up to four antennas on each board), we collect realistic wireless transmit and receive traces using our system at different locations, and then emulate the selection procedure offline. Fig. 9 shows the mean throughputs of MUSE, the optimal scheme and our algorithm. Our algorithm outperforms optimal SU-MIMO and MUSE by 78.7% and 40.9%, respectively. The performance gap between our algorithm and the optimal MU-MIMO is nearly 10%, slightly larger than that in Fig. 4. When the AP is equipped with more antennas, e.g., sixteen, our algorithm still exhibits remarkable gains and remains close to the optimality. This manifests that our algorithm possesses an excellent scalability.

V. CONCLUSION

In this letter, we address the user selection problem in MU-MIMO WLANs with heterogeneous maximum bandwidth and number of receive antennas. A novel branch-and-prune algorithm is proposed to achieve the low-complexity selection of user antennas. Our algorithm is compatible with legacy 802.11ac, and is implemented on the software defined radio platform WARP. Experimental results demonstrate that our algorithm outperforms a very recent counterpart by a factor of 1.18 and reaches around 95% of the optimal results.

Fig. 6. Execution time.
Fig. 7. Fairness and throughput w/ and w/o fairness control.
Fig. 8. Compatibility.
Fig. 9. Large-scale simulation.

REFERENCES

[1] IEEE Draft Standard for IT—Telecommunications and Information Exchange Between Systems—LAN/MAN—Specific Requirements—Part 11: Wireless LAN Medium Access Control and Physical Layer Specifications—AMD 4: Enhancements for Very High Throughput for Operation in Bands Below 6GHz, IEEE Standard P802.11ac/D3.0, Jun. 2012.
[2] M. Gast, 802.11ac: A Survival Guide, 1st ed. Beijing, China: O’Reilly Media Inc., 2013.
[3] T. Yoo et al., “Multi-antenna downlink channels with limited feedback and user selection,” IEEE J. Sel. Areas Commun., vol. 25, no. 7, pp. 1478–1491, Sep. 2007.
[4] X. Xie and X. Zhang, “Scalable user selection for MU-MIMO networks,” in Proc. IEEE INFOCOM, 2015, pp. 808–816.
[5] W. Shen et al., “SIEVE: Scalable user grouping for large MU-MIMO systems,” in Proc. IEEE INFOCOM, 2015, pp. 1975–1983.
[6] S. Sur et al., “Practical MU-MIMO user selection on 802.11ac commodity networks,” in Proc. ACM MobiCom, 2016, pp. 122–134.
[7] W. Shen et al., “Rate adaptation for 802.11 multiuser MIMO network,” in Proc. ACM MobiCom, 2012, pp. 29–40.
[8] J. Porta et al., “A branch-and-prune algorithm for solving systems of distance constraints,” in Proc. IEEE ICRA, 2003, pp. 342–348.
[9] WARP Project. Accessed: Oct. 26, 2018. [Online]. Available: http://warpproject.org
[10] K. Lee and C. Kim, “User scheduling for MU-MIMO transmission with active CSI feedback,” EURASIP J. Wireless Commun. Netw., vol. 2015, p. 112, Apr. 2015.
[11] X. Xie et al., “Adaptive feedback compression for MIMO networks,” in Proc. ACM MobiCom, 2013, pp. 477–488.
Coupling Information Transmission With Window Decoding

Alireza Karami and Dmitri Truhachev

Abstract—We consider a coupling information transmission modulation format utilized for communication over a multiple-access channel. A multitude of packets is transmitted by a number of sources to a common receiver. Each packet is modulated via simple repetition, interleaving, and an application of a signature sequence. The packets are transmitted with time offsets that initiate coupling of the transmitted signals into a received signal that lends itself to an efficient window-based iterative estimation and interference cancellation decoding. We present the window decoding formulation, a theoretic analysis, and numerical results for the achievable sum-rate, gap to the channel capacity and the optimal window size.

Index Terms—Multiuser detection, spatial graph coupling, multiple access.

Manuscript received September 28, 2018; accepted October 28, 2018. Date of publication November 6, 2018; date of current version April 9, 2019. This work was supported by the Natural Sciences and Engineering Research Council of Canada. The associate editor coordinating the review of this paper and approving it for publication was J. Choi. (Corresponding author: Dmitri Truhachev.)
The authors are with the Department of Electrical and Computer Engineering, Dalhousie University, Halifax, NS B3J 1Z1, Canada (e-mail: akarami@dal.ca; dmitry@dal.ca).
Digital Object Identifier 10.1109/LWC.2018.2879840

I. INTRODUCTION

MULTIPLE-ACCESS communications has numerous practical applications starting from cellular systems to ad-hoc and sensor type networks. A significant interest is dedicated to Internet of Things (IoT) and machine-to-machine (MtoM) communications in a non-orthogonal multiple-access (NOMA) framework for 5G. Systems that can sustain a large number of simultaneously arriving packets, where each packet may be encoded via a number of simple operations, are desirable. The receiver, at the same time, may use sophisticated techniques to extract the information from all arriving packets.

We focus on the coupling information transmission technique in which each packet is encoded via repetition, interleaving and signature sequence operations. Each packet can be represented as a bipartite graph that connects variable nodes and modulated symbol nodes, a format which is rooted in sparse code-division multiple-access type techniques [1]–[3]. To take advantage of the spatial-graph coupling threshold saturation phenomenon [4], [5] the packets are transmitted with time offsets that enable spatial graph coupling of the resulting packet graphs at the receiver. The receiver performs iterative interference cancellation and symbol estimation exploiting the coupling structure of the underlying message graph. As a result the technique is capacity-achieving in a number of regimes [6].

In this letter we study a window decoding approach, a method that can be used in a practical receiver implementation for such a coupled system with optimized complexity and decoding delay. Window decoding has been applied to a number of coupled systems including spatially-coupled low-density parity-check codes (SC-LDPCs) [7]–[10], sliding window superposition coding [11] and modulation [12] proposed for broadcast in 5G cellular networks to reduce co-channel interference. This letter focuses on a system operating over a multiple-access channel (MAC) rather than broadcast and on iterative sliding window decoding. We present a window decoding algorithm, and determine the optimal window sizes using both theoretical analysis and system simulation. Finally, we present simulation results for the data rates achievable with window decoding of the coupled systems and compare them to the decoding in uncoupled regimes and the channel capacity.

II. SYSTEM MODEL

Consider a communications scenario in which multiple transmitters communicate packets to a common receiver. In order to encode and modulate the kth packet, k = 1, 2, ..., a K-bit long information vector $u_k$ is encoded by an outer error-correction code to produce a vector $v_k$ which consists of N coded bits (code rate R = K/N). Each bit in $v_k$ is repeated $M_1$ times. The resulting vector is then permuted by a permutor of size $NM_1$. Finally each bit is multiplied by a signature sequence $s_k = (s_{k,1}, s_{k,2}, \ldots, s_{k,M_2})$, $s_{k,j} \in \{\pm 1\}$, $j = 1, 2, \ldots, M_2$, producing a BPSK modulated MN-long vector $\tilde{v}_k = (\tilde{v}_{k,1}, \tilde{v}_{k,2}, \ldots, \tilde{v}_{k,MN})$, where $M = M_1 M_2$. The above operations can be represented in a matrix form
$$\tilde{v}_k = B_k v_k. \tag{1}$$

The matrix $B_k = S_k P_k R$, where R is the MN × N bit repetition matrix repeating each bit M times, $P_k$ is a binary permutation matrix and $S_k$ is the diagonal signature sequence matrix that multiplies each repeated data bit by a bit of the signature sequence. The kth data stream (packet) is transmitted with the time offset $\theta_k$. We consider transmission over a real-valued additive white Gaussian noise (AWGN) channel and assume that the packets are symbol-synchronous at the receiver. The received signal $r = (r_1, r_2, \ldots)$ is given by
$$r_\tau = \frac{1}{\sqrt{L}} \sum_{k=1}^{\infty} \tilde{v}_{k,\tau-\theta_k} + n_\tau, \qquad \tau = 1, 2, \ldots \tag{2}$$
where $n_\tau$ are iid AWGN samples with zero mean and variance $\sigma^2$. Here we assume that $\tilde{v}_{k,j} = 0$ for any j < 1 or j > MN. If, on average, L data streams arrive within one packet length we have a system load of α = L/M. Hence, the total transmit power is normalized to one.

The received signal can be expressed using the matrix notation as
$$r = Av + n \tag{3}$$
KARAMI AND TRUHACHEV: COUPLING INFORMATION TRANSMISSION WITH WINDOW DECODING 561
Fig. 1. Matrix representation of the entire coupled system with $A^w_k$ corresponding to the current decoding window, and $A^w_{k+L}$ to the next decoding window.

where $v = (v_1, v_2, \ldots)$ and n is the vector of the noise samples. The matrix A is a band-diagonal matrix depicted in Fig. 1 with the matrices of the transmitted packets $B_1, B_2, \ldots$ on the diagonal.

III. WINDOW DECODING

The decoder performs iterative data estimation and interference cancellation operations on a sliding window that covers LW received data packets or approximately W packet lengths of the received sequence. The kth decoding window is the received sequence segment
$$r^w_k = (r_{\theta_k}, r_{\theta_k+1}, \ldots, r_{\theta_{k+LW-1}+MN-1})$$
that consists of received packets from k to k + LW − 1. In matrix form,
$$r^w_k = A^w_k v^w_k + n^w_k + \tilde{v}_b + \tilde{v}_a, \qquad v^w_k = (v_k, v_{k+1}, \ldots, v_{k+LW-1}) \tag{4}$$
where $v^w_k$ is the data vector of the LW received packets contained in the current decoding window, and $n^w_k$ is the corresponding vector of the noise samples. Fig. 1 shows the window matrix $A^w_k$, composed of the individual packet matrices $B_j$, j = k, k + 1, ..., k + LW − 1. The vectors $\tilde{v}_b$ and $\tilde{v}_a$ are the residual from the previous window and the beginning of the next window respectively. Once I decoding iterations are performed the decoding window is shifted to accommodate the next L received packets.

The first iteration of the window decoder starts with
$$\tilde{r}^w_k = r^w_k - \hat{v}_b \tag{5}$$
where the impact of the data bits decoded and estimated in the previous window is subtracted. Let us define $\tilde{r}_j$ to be the segment of the received sequence $\tilde{r}^w_k$ corresponding to the jth packet. We also denote the same segment of the received sequence after i − 1 interference cancellation iterations by $\tilde{r}_j^{(i-1)}$. Iteration i starts with computation of log-likelihood ratios (LLRs) of the data bits of the jth packet
$$\lambda_j^{(i-1)} = L^{-1/2} \, \tilde{r}_j^{(i-1)} / \sigma^2_{j,i-1} \tag{6}$$
where $\sigma^2_{j,i-1}$ is the vector of noise and interference power computed for the jth packet based on $r_j^{(i-1)}$ (see below). The data bits of the jth packet are then estimated via a conditional expectation estimator of binary symbols (soft bits)
$$\hat{v}_j^{(i)} = \tanh\left( \left(R R^T - I_{MN}\right) P_j^T S_j \lambda_j^{(i-1)} \right). \tag{7}$$

The above equation demonstrates that the LLRs $\lambda_j^{(i-1)}$ are first multiplied by the signature sequence, then inverse-permuted ($P_j^T S_j$). Finally, for each of the M replicas of each data bit we sum up the LLRs, all except one so as not to reuse self-information in the iterative process, and apply the tanh operation to compute the conditional expectation estimates of the data bits.

The next step is the interference cancellation operation that computes
$$r_j^{(i)} = \tilde{r}_j - \sum_{j' \neq j} S_{j'} P_{j'} \hat{v}_{j'}^{(i-1)} \tag{8}$$
that is, we subtract the impact of all packets except packet j from the received sequence segment $\tilde{r}_j$ that corresponds to the jth packet. If we subtract the jth estimated packet from $r_j^{(i)}$ as well, we can use the residual to compute the noise and interference power vectors $\sigma^2_{j,i}$.

In the last (Ith) data estimation iteration all LLRs are used to compute the estimated data bits
$$\hat{v}_j = \operatorname{sign}\left( R^T P_j^T S_j \lambda_j^{(i-1)} \right) \tag{9}$$
j = k, k + 1, ..., k + L − 1, that are forwarded to the outer error-correction decoder. Modulated bits $B_j \hat{v}_j$ for j = k, k + 1, ..., k + L − 1 as well as $B_j \hat{v}_j^{(I)}$, j = k + L, k + L + 1, ..., k + WL − 1, are used to compute the initial interference cancellation term $\hat{v}_b$ for the next window (see (5)).

Instead of a decoding schedule with the fixed number of iterations I per decoding window we can let the number of iterations vary from window to window and apply a stopping criterion: the iterative process stops once the estimated signal-to-noise and interference ratio (SINR) computed based on the $\sigma^2_{j,i}$ of packet k + L reaches the SINR for packet k.

An alternative decoding algorithm is the approximate message passing algorithm (AMP) [13] that has been derived for (un-windowed) coupled transmission in [14]. Instead of passing M and L messages from each channel and variable node of the system's graph at each iteration we only pass a single message from either node and include a correction factor
accounting for the difference in the messages. For the windowed case we follow equations (5)–(9) without excluding the self message ($j' = j$) and apply the correction factors as in [14] to formulate the algorithm as
$$r_k^{w,(i)} = \tilde{r}^w_k - A^w_k \hat{v}^{(i)} + \frac{r_k^{w,(i-1)}}{\sigma^2_{i-1}} \circ A_k^{w(2)} \operatorname{sech}^2\left(w^{(i-1)}\right)$$
$$w^{(i)} = \left(A^w_k\right)^T \frac{r_k^{w,(i)}}{\sigma^2_i} + \left(A_k^{w(2)}\right)^T \frac{1}{\sigma^2_i} \circ \hat{v}^{(i)}$$
$$\hat{v}^{(i)} = \tanh\left(w^{(i-1)}\right)$$
where $\hat{v}^{(i)}$ is the estimate of the concatenated data vector of packets from k to k + WL − 1 at iteration i and ◦ denotes the Hadamard (component-wise) product of two vectors or matrices, $A_k^{w(2)} = A^w_k \circ A^w_k$; vector inverses are componentwise.

To quantify the decoding complexity we count the number of multiplication, division, and group summation operations executed at each iteration, which is approximately WN(3L + 5M) for the AMP and WN[(3M + 1)L + M] for the MP decoder. The operations are performed during $I_{av}$ iterations on a window which includes LMW data bits (latency) to output LN decoded data bits. Different window sizes require different numbers of iterations to converge to the multi-user interference-free case. Hence, we consider normalized complexity $\eta = I_{av} W / LN$ to identify the optimal decoding window.

Fig. 2. Achievable rate for window decoding, analytic results.

IV. WINDOW DECODING ANALYSIS

For the purpose of analysis assume that each packet consists of n equal-length sections that contain MN/n bits each. We also assume that the packets arrive at the receiver in groups of L/n packets at the beginning of each section, that is $\theta_1 = \theta_2 = \cdots = \theta_{L/n} = 1$, $\theta_{L/n+1} = \theta_{L/n+2} = \cdots = \theta_{2L/n} = MN/n + 1$ and so on. We now count the time units in terms of sections (one packet spans n time units) and recall that the total system power equals 1, the noise power is given by $\sigma^2$, and the system load equals α = L/M. The noise and interference power at iteration i for the time section t is denoted by $x_t^i$ while the variance of the estimated data bits (soft-bit variance) is denoted by $y_t^i$. In vector form we have
$$x^i = (x_1^i, x_2^i, x_3^i, \cdots), \qquad y^i = (y_1^i, y_2^i, y_3^i, \cdots).$$
The recursion is initialized by setting
$$x_t^0 = \frac{t}{n} + \sigma^2 \text{ for } t \le n \quad \text{and} \quad x_t^0 = 1 + \sigma^2 \text{ for } t > n, \tag{10}$$
since the packets start to arrive at the receiver gradually, in groups of L/n. At iteration i the noise and interference power $x_t^i$ is composed of the Gaussian noise and the soft-bit variance contributions of all packets that are interfering at time t
$$x_t^i = \frac{1}{n} \sum_{j=1}^{n} y_{t+j-1}^i + \sigma^2. \tag{11}$$
The soft bit variance is determined by the variance of the conditional expectation estimate of a Bernoulli random variable in an AWGN channel with SNR γ given by $g(\gamma) = E[(1 - \tanh(\gamma + \xi\sqrt{\gamma}))^2]$ where E denotes expectation and ξ is the standard normal random variable, ξ ∼ N(0, 1). Hence
$$y_{t+n}^i = g\left( \frac{1}{\alpha n} \sum_{j=1}^{n} \frac{1}{x_{t+j-1}^{i-1}} \right)$$
where the argument of the function g(·) is the average SINR of the LLRs in equation (6) indexed in terms of sections t, t + 1, ..., t + n − 1.

Consider now a decoding window of size W that translates to Wn time instances (packet sections) and consider I decoding iterations per window before the window is shifted forward by n time units. We obtain
$$y^i_{[t+n-1,\,t+Wn-2]} = g\left( \frac{1}{\alpha n} C_w \frac{1}{x^{i-1}_{[t,\,t+Wn-1]}} \right) \tag{12}$$
$$x^i_{[t,\,t+(W-1)n-1]} = \frac{1}{n} C_w \, y^i_{[t,\,t+Wn-1]} + \sigma^2 \tag{13}$$
and $C_w$ is of size (W − 1)n × Wn and for i = 1, 2, ..., (W − 1)n and j = 0, 1, 2, ..., n we have $C_{i,i+j} = 1$ while the rest of the elements are zeros. We also note that the other components of the vectors $x^i$ and $y^i$ stay unchanged, i.e.,
$$y^i_{[1,\,t+n-2]} = y^{i-1}_{[1,\,t+n-2]}, \qquad y^i_{[t+Wn-1,\,\infty)} = y^{i-1}_{[t+Wn-1,\,\infty)},$$
$$x^i_{[1,\,t-1]} = x^{i-1}_{[1,\,t-1]}, \qquad x^i_{[t+(W-1)n,\,\infty)} = x^{i-1}_{[t+(W-1)n,\,\infty)}.$$

The results of the analytic evaluation are given in Fig. 2. The magenta curve with dots represents the AWGN channel capacity plotted as a function of the SNR. The solid magenta curves plot the capacities of the AWGN channel with 4, 8, 16, 32-PAM input (from bottom to top). The blue curve gives the sum-rate achievable with non-windowed decoding taken as an asymptotic limit for M, N → ∞. The dashed red, black, and green curves show the maximum achievable sum-rate with window decoding for n = 20 and W = 3, 4, 10 and 400 iterations (from bottom to top).
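The recursion behind Eqs. (10)-(11) and the soft-bit variance update can be tried numerically with a short sketch. It is an illustration under assumptions that are not taken from the letter: the un-windowed schedule is used, the packet stream is finite (a hypothetical parameter `groups` counts the arriving groups), indices are 0-based with group q occupying sections q, ..., q + n − 1, and g(γ) is evaluated with a plain trapezoidal rule.

```python
from math import tanh, sqrt, exp, pi

def g(gamma, steps=400, lim=8.0):
    """g(gamma) = E[(1 - tanh(gamma + xi*sqrt(gamma)))^2], xi ~ N(0, 1), trapezoidal rule."""
    h = 2.0 * lim / steps
    total = 0.0
    for s in range(steps + 1):
        xi = -lim + s * h
        weight = 0.5 if s in (0, steps) else 1.0
        pdf = exp(-0.5 * xi * xi) / sqrt(2.0 * pi)
        total += weight * (1.0 - tanh(gamma + xi * sqrt(gamma))) ** 2 * pdf
    return total * h

def evolve(alpha, sigma2, n, groups, iterations):
    """Iterates the section-wise noise-plus-interference power x and soft-bit variance y."""
    sections = groups + n - 1
    y = [1.0] * groups                      # nothing is known before decoding starts
    for _ in range(iterations):
        x = []
        for t in range(sections):           # Eq. (11): average over the groups active at t
            active = range(max(0, t - n + 1), min(groups - 1, t) + 1)
            x.append(sum(y[q] for q in active) / n + sigma2)
        # soft-bit variance from the average SINR over the n sections of each group
        y = [g(sum(1.0 / x[t] for t in range(q, q + n)) / (alpha * n))
             for q in range(groups)]
    return y

y_final = evolve(alpha=0.5, sigma2=0.1, n=4, groups=10, iterations=20)
print(max(y_final) < 1e-3)
```

At this light load the soft-bit variances collapse towards zero within a few iterations; raising `alpha` shows where the recursion stalls, which is the quantity the achievable-rate curves summarize.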
The results demonstrate that the window decoder, in both the message-passing iterative interference cancellation and estimation version and the AMP version, provides a tool for establishing near-capacity communication with high system loads.

VI. CONCLUSION

In this letter we propose a window-based decoding algorithm for the coupling information transmission multi-user communication format. The decoding window size and the number of decoding iterations per window determine the complexity of the decoding and the resulting achievable rate. Both theoretic and numerical results demonstrate that a window of size four is sufficient to achieve near-capacity communications and at the same time allow for a larger number of simultaneously arriving packets (higher system loads) than traditional multi-user communication approaches, such as equal-power dense CDMA with maximum load α = 1.49 [15] and α = 2.07 for partition-spreading CDMA [2].

Fig. 3. The achievable sum-rate for various values of M and L. The size of the decoding window is W = 4.

TABLE I
AVERAGE NUMBER OF ITERATIONS, NORMALIZED COMPLEXITY VS. W
R EFERENCES
[1] L. Ping, L. Liu, K. Wu, and W. K. Leung, “Interleave division multiple-
access,” IEEE Trans. Wireless Commun., vol. 5, no. 4, pp. 938–947,
Apr. 2006.
[2] D. Truhachev, C. Schlegel, and L. Krzymien, “Low-complexity capacity
V. S IMULATION R ESULTS achieving two-stage demodulation/decoding for random matrix chan-
nels,” in Proc. IEEE Inf. Theory Workshop, Lake Tahoe, CA, USA,
We start with an experiment that looks for the optimal win- Sep. 2007, pp. 584–589.
dow size and compares it to the findings suggested by the [3] P. A. Hoeher and T. Wo, “Superposition modulation: Myths and facts,”
IEEE Commun. Mag., vol. 49, no. 12, pp. 110–116, Dec. 2011.
analysis above. We consider repetition factor M = 25 and the [4] M. Lentmaier, A. Sridharan, D. J. Costello, and K. S. Zigangirov,
average number of L = 65 packets interfering at each time at “Iterative decoding threshold analysis for LDPC convolutional codes,”
the receiver giving systems load of α = 2.6 which is close to IEEE Trans. Inf. Theory, vol. 56, no. 10, pp. 5274–5289, Oct. 2010.
[5] S. Kudekar, T. J. Richardson, and R. L. Urbanke, “Threshold satura-
the limits for 20dB total system SNR (as we will see below). tion via spatial coupling: Why convolutional LDPC ensembles perform
We choose a decoding schedule with the stopping criteria (see so well over the BEC,” IEEE Trans. Inf. Theory, vol. 57, no. 2,
Section III) and estimate the average number of iterations per pp. 803–834, Feb. 2011.
[6] D. Truhachev, “Universal multiple access via spatially coupling data
window Iav and compute the normalized complexity η (see transmission,” in Proc. IEEE Int. Symp. Inf. Theory, Istanbul, Turkey,
Table I). The packet size is chosen to be N = 400 bits. We Jul. 2013, pp. 1884–1888.
notice that W = 3 is insufficiently long and results in high [7] M. Lentmaier, M. M. Prenda, and G. P. Fettweis, “Efficient message
passing scheduling for terminated LDPC convolutional codes,” in Proc.
complexity while window size W = 4 leads to the smallest IEEE Int. Symp. Inf. Theory, St. Petersburg, Russia, Jul./Aug. 2011,
complexity overall. pp. 1826–1830.
Fig. 3 shows the achievable system sum-rate as a function [8] A. R. Iyengar et al., “Windowed decoding of protograph-based LDPC
convolutional codes over erasure channels,” IEEE Trans. Inf. Theory,
of total system SNR (in dB) for the message passing and the vol. 58, no. 4, pp. 2303–2320, Apr. 2012.
AMP decoding algorithms with window size W = 4 and dif- [9] A. R. Iyengar, P. H. Siegel, R. L. Urbanke, and J. K. Wolf, “Windowed
ferent spreading factors. For each SNR point we choose the decoding of spatially coupled codes,” IEEE Trans. Inf. Theory, vol. 59,
no. 4, pp. 2277–2292, Apr. 2013.
highest system load α = L/M such that the system converges [10] N. U. Hassan, A. E. Pusane, M. Lentmaier, G. P. Fettweis, and
to the approximately interference-free case, compute the post D. J. Costello, Jr., “Non-uniform window decoding schedules for spa-
interference cancellation SNR and compute the achievable nor- tially coupled LDPC codes,” IEEE Trans. Commun., vol. 65, no. 2,
pp. 501–510, Feb. 2017.
malized sum-rate as LR/M = αR where the error-correction [11] L. Wang, E. Şaşoğlu, and Y.-H. Kim, “Sliding-window superposition
code rate R is selected equal the capacity on a binary-input coding for interference networks,” in Proc. IEEE Int. Symp. Inf. Theory
AWGN channel with the post interference cancellation SNR. (ISIT), Jun./Jul. 2014, pp. 2749–2753.
The achievable rate is compared to the capacity of the AWGN [12] K. T. Kim et al., “Interference management via sliding-window coded
modulation for 5G cellular networks,” IEEE Commun. Mag., vol. 54,
channel. For M = 250 AMP the achievable sum-rate closely no. 11, pp. 82–89, Nov. 2016.
follows the channel capacity curve with a gap of about 1.6dB. [13] D. L. Donoho, A. Maleki, and A. Montanari, “Message-passing algo-
The M = 50 message passing algorithm performance is simi- rithms for compressed sensing,” in Proc. Nat. Acad. Sci. USA, vol. 106,
no. 45, pp. 18914–18919, Jul. 2009.
lar for the SNR range from 15 to 22 dB. For smaller repetition [14] D. Truhachev and D. McNutt, “Coupling information transmission with
factors slight performance degradation with respect to capacity approximate message-passing,” IEEE Commun. Lett., vol. 20, no. 10,
occurs at high SNRs. This is in agreement with the theo- pp. 1995–1998, Oct. 2016.
[15] T. Tanaka, “A statistical-mechanics approach to large-system analysis of
retic evaluations suggesting that higher repetition factors are CDMA multiuser detectors,” IEEE Trans. Inf. Theory, vol. 48, no. 11,
required for near-capacity performance at high SNRs. pp. 2888–2910, Nov. 2002.
Secure UAV-to-UAV Systems With Spatially Random UAVs

Jia Ye, Chao Zhang, Hongjiang Lei, Gaofeng Pan, Member, IEEE, and Zhiguo Ding, Senior Member, IEEE

Abstract—In this letter, we investigate the secrecy performance of an unmanned aerial vehicle (UAV)-to-UAV system, where a UAV acts as the source (S) transmitting information to a legitimate UAV receiver while a group of UAVs tries to eavesdrop on the information delivery between S and the legitimate UAV receiver. The locations of the legitimate UAV receiver and the eavesdropping UAVs are randomly distributed in the coverage space of S. We first characterize the statistics of the signal-to-noise ratio over the links from S to the legitimate UAV receiver; the closed-form analytical expressions for the secrecy outage probability and the average secrecy capacity are then derived accordingly. Finally, Monte-Carlo simulations are carried out to verify our proposed analytical models.

Index Terms—Average secrecy capacity, secrecy outage probability, stochastic geometry, unmanned aerial vehicles.

Manuscript received August 17, 2018; revised October 25, 2018; accepted November 1, 2018. Date of publication November 6, 2018; date of current version April 9, 2019. This work was supported in part by the U.K. EPSRC under Grant EP/P009719/2, and in part by H2020-MSCA-RISE-2015 under Grant 690750. The work of H. Lei was supported by the Project of Fundamental Science and Frontier Technology Research Plan of Chongqing under Grant cstc2017jcyjAX0204. The associate editor coordinating the review of this paper and approving it for publication was P. A. Dmochowski. (Corresponding author: Gaofeng Pan.)
J. Ye, C. Zhang, and G. Pan are with the Chongqing Key Laboratory of Nonlinear Circuits and Intelligent Information Processing, Southwest University, Chongqing 400715, China (e-mail: gfpan@swu.edu.cn).
H. Lei is with the Chongqing Key Laboratory of Mobile Communications Technology, Chongqing University of Posts and Telecommunications, Chongqing 400065, China.
Z. Ding is with the School of Electrical and Electronic Engineering, University of Manchester, Manchester M13 9PL, U.K.
Digital Object Identifier 10.1109/LWC.2018.2879842

I. INTRODUCTION
RECENTLY, unmanned aerial vehicles (UAVs) have attracted growing research interest, as they are regarded as an effective complement to aerial communications that can provide robust and reliable communication networks [1]. UAVs not only play a key role in both military and civilian areas [2], but also find application in wireless communications [3]. Mozaffari et al. [4] analyzed the deployment of a UAV as a flying base station in a given geographical communication area. Reference [5] studied the optimum relay UAV placement to maximize reliability. The applicability of UAV networks was investigated in [6].

However, most of the existing literature has not considered the security of the confidential information transmitted from the transmitting UAV to a legitimate UAV. A promising method is physical-layer (PHY) security, which can protect information delivery from eavesdropping [7], [8]. Several techniques have been investigated to achieve positive secrecy rates for UAV communication systems, such as artificial noise [9], power control [10], and so on. Zhang et al. [11] studied the maximum secrecy rate and the optimization of the UAV's trajectory considering transmit power over a finite horizon. Reference [12] discussed how spatial communication security was affected by the communication through theoretical analysis and simulation results. In [13], hybrid outage probability was derived to examine the security of UAV-aided communication systems in which the eavesdropper performs eavesdropping and malicious jamming simultaneously. A tight lower bound of the ergodic capacity was obtained for UAV-aided cellular communication systems [14].

However, most of the existing works focus on UAV-to-ground systems in 2-dimensional (2D) space, which only have one direction of communication from the ground to the air. The security of UAV-to-UAV (A2A) systems in 3-dimensional (3D) space is more complicated since the receivers or eavesdroppers work in all directions, leading to difficulties in the mathematical derivations during the performance modeling, as shown in Sections II–IV. The 3D security model has not been well investigated and understood, and remains an open issue.

Furthermore, most of the existing research only considers fixed locations and a fixed number of UAVs, although UAVs can move freely and flexibly in the 3D space. In practical scenarios, the positions of UAVs may vary due to the conditions of the airspace or the tasks scheduled to the UAVs. In particular, malicious UAVs that want to eavesdrop on the transmitted information will always change their positions to cover up their eavesdropping behavior. Thus, in this letter we consider a general case with randomness of the terminals' positions, as well as a random number of eavesdropping UAVs. Moreover, many previous works have considered 2D space and are not suitable for A2A systems, because UAVs may be distributed anywhere in the airspace.

Motivated by the above observations, in this letter we investigate the secrecy performance of an A2A communication system with a legitimate UAV and a group of eavesdropping UAVs in the line-of-sight (LoS) 3D space. It is reasonable to assume that channels mainly experience LoS fading in the open space [15], because UAVs flying above buildings and shadows are more likely to observe radio path clearance from other UAVs in the surrounding areas. Also, in some application scenarios in which UAVs fly in the stratosphere, there is no reflected signal and LoS fading plays the main role during the transmissions [16].¹ We also consider the randomness of the locations of all UAVs and of the number of eavesdropping UAVs by using stochastic geometry theory. The main contributions of this letter are summarized as: 1) We characterize the probability density function (PDF) and cumulative distribution function (CDF) of the signal-to-noise ratio (SNR) over the A2A links; 2) The closed-form analytical expressions for the secrecy outage probability (SOP) and the average secrecy capacity (ASC) are derived.

¹ In Section V, we present some simulation results for the case with both LoS and non-line-of-sight (NLoS, e.g., Nakagami-m) fading to show the impact of NLoS fading on the secrecy performance of the considered system. We leave the analysis of this case for future work due to its complexity and the page limitation.

Notations: $B(x, y)$ and $f(x, y)$ denote the Beta function and the joint PDF of $x$ and $y$, respectively. ${}_3F_2(\cdot)$ and ${}_2F_1(\cdot)$ denote Gauss
YE et al.: SECURE UAV-TO-UAV SYSTEMS WITH SPATIALLY RANDOM UAVs 565
hypergeometric functions. $\ln(\cdot)$ denotes the natural logarithm. $C_{N-1}^{j} = \frac{(N-1)!}{j!\,(N-1-j)!}$ with $N > 1$.

Fig. 1. System model.

II. SYSTEM MODEL
In this letter, we consider an A2A communication system, as shown in Fig. 1, where a UAV with a single omni transmitting antenna acts as the source (S) trying to transmit its information to a legitimate UAV.² Furthermore, there is also a group of UAVs distributed in the coverage space of S (in which the received signal strength is equal to or above the threshold at the receiver to demodulate and decode the received signal) trying to eavesdrop on the information transmission between S and the legitimate UAV. For tractability, in this letter we treat the coverage space of S as a sphere, $V$, with radius $D$ ($D > 0$) m, where S is located at the center of the sphere. Without loss of generality, we assume that the legitimate UAV and $N$ ($N \ge 1$) eavesdropping UAVs can be modeled as a set of independent and identically uniformly distributed points in the sphere $V$, denoted by $W$; the eavesdropping UAVs do not cooperate with each other, in order to protect their eavesdropping activities. The number of eavesdropping receivers is Poisson distributed with density $\lambda$, i.e., $P\{N = k\} = (\mu_V^k / k!) \exp(-\mu_V)$, where $\mu_V = \frac{4\pi D^3}{3}\lambda$ is the mean measure.

In the following, we name the legitimate receiver the 0th receiver to facilitate the analysis. Therefore, the distance between S and the UAVs can be calculated from $W$, the PDF of which can be given by using [17, eq. (1)] as $f_W(w) = \frac{3}{4\pi D^3}$. In this letter, we assume that the communication channels from S to the UAVs are dominated by LoS links rather than other channel impairments in the open airspace, such as shadowing or small-scale fading, as treated in [3]. Thus, the channel power gain from S to the $i$th ($0 \le i \le N$) UAV follows the free-space path loss model, which can be given by $g_i = \beta d_i^{-2}$, where $\beta$ denotes the channel power at the reference distance $d = 1$ m, whose value depends on the carrier frequency, antenna gain, etc., and $d_i$ is the link distance between S and the $i$th UAV.

Let $P$ denote the transmit power at S. The received SNR at the $i$th UAV from S can be expressed as $\gamma_i = \frac{P g_i}{\delta^2} = \frac{\gamma P}{d_i^2}$, where $\delta^2$ denotes the noise power, and $\gamma = \frac{\beta}{\delta^2}$ represents the reference SNR. In the following, we also denote $v_{\min} = \frac{\gamma P}{D^2} \le \gamma_i \le v_{\max} = \infty$ for simplification, where $v_{\min}$ and $v_{\max}$ are the minimum and maximum values of $\gamma_i$, respectively.

Lemma 1: The CDF of $\gamma_i = \frac{\gamma P}{d_i^2}$ can be expressed as

$$F_{\gamma_i}(x) = \begin{cases} 1 - \dfrac{(\sqrt{\gamma P})^{3}}{x^{3/2} D^{3}}, & x \ge \dfrac{\gamma P}{D^{2}} \\ 0, & x < \dfrac{\gamma P}{D^{2}} \end{cases}. \tag{1}$$

Proof: Using $f_W(w) = \frac{3}{4\pi D^3}$, we have the CDF of $d_i$ as

$$F_{d_i}(x) = \int_0^{x}\!\int_0^{\pi}\!\int_0^{2\pi} \frac{3}{4\pi D^3} \sin\phi_i \, d_i^{2} \,\mathrm{d}\theta_i \,\mathrm{d}\phi_i \,\mathrm{d}d_i = \begin{cases} 0, & x < 0 \\ \dfrac{x^3}{D^3}, & 0 \le x \le D \\ 1, & x > D \end{cases}. \tag{2}$$

Therefore, we can obtain the CDF of $\gamma_i$ as

$$F_{\gamma_i}(x) = \Pr\{\gamma_i \le x\} = 1 - \Pr\left\{ d_i \le \sqrt{\frac{\gamma P}{x}} \right\}. \tag{3}$$

Substituting (2) into (3) yields (1). Then, the proof is completed.

Corollary 1: Accordingly, the PDF of $\gamma_i$ can be expressed as

$$f_{\gamma_i}(x) = \begin{cases} \dfrac{3(\sqrt{\gamma P})^{3}}{2 x^{5/2} D^{3}}, & x \ge \dfrac{\gamma P}{D^{2}} \\ 0, & x < \dfrac{\gamma P}{D^{2}} \end{cases}. \tag{4}$$

Proof: The PDF of $\gamma_i$ is the derivative of (1).

Lemma 2: The CDF of $\gamma_{\max}$ with $N$ eavesdropping UAVs can be derived as

$$F_{\gamma_{\max}}^{N}(x) = \begin{cases} \left[ 1 - \dfrac{(\sqrt{\gamma P})^{3}}{x^{3/2} D^{3}} \right]^{N}, & x \ge \dfrac{\gamma P}{D^{2}} \\ 0, & x < \dfrac{\gamma P}{D^{2}} \end{cases}. \tag{5}$$

Proof: Using probability theory, we have $F_z(z) = \Pr\{\max\{x_1, x_2, \ldots, x_M\} \le z\}$, if $x_1, x_2, \ldots, x_M$ are $M$ ($M > 1$) independent variables. Then, $F_z(z) = \prod_{i=1}^{M} F_{x_i}(z)$, where $F_{x_i}(x)$ is the CDF of $x_i$. As the eavesdropping channels are independent of each other, we can derive the CDF of $\gamma_{\max}$ as (5).

Corollary 2: Accordingly, the PDF of $\gamma_{\max}$ with $N$ eavesdropping UAVs can be expressed as

$$f_{\gamma_{\max}}^{N}(x) = \begin{cases} N \left[ 1 - \dfrac{(\sqrt{\gamma P})^{3}}{x^{3/2} D^{3}} \right]^{N-1} \dfrac{3(\sqrt{\gamma P})^{3}}{2 x^{5/2} D^{3}}, & x \ge \dfrac{\gamma P}{D^{2}} \\ 0, & x < \dfrac{\gamma P}{D^{2}} \end{cases} = \begin{cases} N \dfrac{3(\sqrt{\gamma P})^{3}}{2 D^{3}} \displaystyle\sum_{j=0}^{N-1} (-1)^{j} C_{N-1}^{j} \dfrac{(\sqrt{\gamma P})^{3j}}{x^{\frac{3}{2}j + \frac{5}{2}} D^{3j}}, & x \ge \dfrac{\gamma P}{D^{2}} \\ 0, & x < \dfrac{\gamma P}{D^{2}} \end{cases}. \tag{6}$$

Proof: The PDF of $\gamma_{\max}$ is the derivative of (5).
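Lemmas 1 and 2 can be checked numerically by drawing UAV positions uniformly in the sphere and comparing empirical CDFs with (1) and (5). The following is a minimal sketch under stated assumptions, not the simulation code used in the letter: the helper names, the random seed, the trial count, and the fixed number of eavesdroppers `n_eaves = 5` are illustrative, and the value γP = 10⁹ simply combines the γ = 80 dB and P = 10 dBW quoted in the numerical results section with D = 500 m.

```python
import numpy as np


def sample_distances(rng, radius, size):
    """Distances from the centre for points uniform in a ball of given radius.

    For a uniform point in a 3-D ball, Pr{d <= x} = (x / radius)^3, as in (2),
    so inverse-transform sampling gives d = radius * U^(1/3).
    """
    return radius * rng.random(size) ** (1.0 / 3.0)


def cdf_gamma(x, gamma_p, radius):
    """Analytical CDF of gamma_i from Lemma 1, eq. (1)."""
    x = np.asarray(x, dtype=float)
    v_min = gamma_p / radius ** 2
    # Clamp x at v_min so the unused branch of np.where stays finite.
    tail = (np.sqrt(gamma_p) / radius) ** 3 / np.maximum(x, v_min) ** 1.5
    return np.where(x >= v_min, 1.0 - tail, 0.0)


def cdf_gamma_max(x, gamma_p, radius, n_eaves):
    """Analytical CDF of the strongest eavesdropper SNR from Lemma 2, eq. (5)."""
    return cdf_gamma(x, gamma_p, radius) ** n_eaves


# Illustrative parameters (assumptions, see the lead-in).
rng = np.random.default_rng(0)
gamma_p, radius, n_eaves, trials = 1.0e9, 500.0, 5, 200_000

# One row per trial, one column per eavesdropper.
distances = sample_distances(rng, radius, (trials, n_eaves))
snr = gamma_p / distances ** 2
snr_max = snr.max(axis=1)

# Evaluate both CDFs at a few multiples of v_min = gamma_p / radius^2.
grid = (gamma_p / radius ** 2) * np.array([1.5, 3.0, 10.0, 50.0])
emp_single = np.array([(snr[:, 0] <= g).mean() for g in grid])
emp_max = np.array([(snr_max <= g).mean() for g in grid])

err_single = np.abs(emp_single - cdf_gamma(grid, gamma_p, radius)).max()
err_max = np.abs(emp_max - cdf_gamma_max(grid, gamma_p, radius, n_eaves)).max()
print("max |empirical - (1)|:", err_single)
print("max |empirical - (5)|:", err_max)
```

Sampling the distance directly through the inverse of (2) avoids generating full 3-D coordinates, since only the distance to S enters the SNR under the LoS path loss model.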
² In this letter, we only consider an omni transmitting antenna to introduce the analysis method for A2A systems; the system with a directional antenna is a special case of the one with an omni transmitting antenna: the coverage space of a directional antenna is a part of that of an omni antenna, leading to a portion of the sphere for eavesdroppers to distribute, as indicated in Fig. 1.

III. THE SECRECY OUTAGE PROBABILITY
In this letter, SOP is defined as the probability that the instantaneous secrecy capacity is below a threshold secrecy

N = ∞ ln(1 + x )f (x ) x

rate, Cth (Cth ≥ 0). Therefore, the instantaneous secrecy where C̄s1 υmin γ0 υmin fγmax (y)dydx and

N = ∞ ln(1 + y)f
∞
capacity from S to the legitimate UAV is C̄s2 υmin γmax (y) y γ0 (x )dx dy.
f
N
Thus, we can calculate C̄s1 as
CS (γ0 , γmax ) = max{log2 (1 + γ0 ) − log2 (1 + γmax ), 0}, (7)
∞
where γmax = maxi∈{1,...,N } {γi }. N
C̄s1 = ln(1 + x )f0 (x )
Then, let ν = 2Cth . SOP can be written as vmin
√ 3 N −1 √ 3j
1 + γ0 Cth
x
3 γP γP
Pr (Cth ) = Pr{Cs ≤ Cth } = Pr ≤2 × N (−1) j j
CN dydx
1 + γmax 2D 3 −1 3 5
y 2 j + 2 D 3j
vmin j =0
= Pr{γ0 ≤ νγmax + ν − 1}. (8) √ 3j +6
N
−1 ∞
j 9N (−1)j γP 5
Theorem 1: Using the Corollary 1 and Corollary 2, we can = CN −1 3 j+ 3 x − 2 ln(1 + x )dx
j =0 2(3j + 3)vmin 2 2 D 3j +6
express Pr (Cth ) as follows: vmin
⎡ N
−1 j √ 3j +6 ∞
√ 3j +6 j 9N (−1) γP 3
∞ N −1 − CN x − 2 j −4 ln(1 + x )dx .
μV −μV ⎢ j j 3N γP −1
2(3j + 3)D 3j +6
Pr (Cth ) = e ⎣ (−1) CN −1 3 j =0 vmin
N =1
N ! j =0 (3j + 3)vmin 2 j +3 D 3j +6
(13)
N −1 √ 3j +6
j 3N γP
− (−1)j CN In orderto facilitate the analysis, we define a new function
g1 (m) = v∞
−1 3
j =0 2ν 2 D 3j +6 x m ln(1 + x )dx . Using [19, eq. (2.6.10.47)],
3 ⎤ min
B j + 3, 1 2 F1
3
+ 52 , 1; 32 j + 32 ;
j 3 we can easily calculate g1 (m) as
2 2 2vmin ⎥
× ⎦, (9)
3 5
vmin 2 j + 2 1 − ν1 + vmin g1 (m) = vmin m+1 B (1, −m − 1)[ln(vmin ) + ψ(−m)
− ψ(−m − 1)] + vmin m B (1, −m)
Proof: As the main and eavesdropping channels are inde-
1
pendent with each other, we can calculate Pr (Cth ) with N × 3 F2 −m, 1, 1; 2, −m + 1; − . (14)
vmin
eavesdropping UAVs as
N as
Then, using (14) in (13), we can easily derive C̄s1
∞
νy+ν−1
PrN (Cth ) = fγ0 (x )fγmax (y)dx dy N −1 √ 3j +6 5

N
j 9N (−1)j γP
vmin vmin C̄s1 = CN −1 3 3
g1 −
2(3j + 3)vmin 2 j + 2 D 3j +6 2
∞
νy+ν−1 √ 3 j =0
3 γP √ 3j +6
= fγmax (y) 5 dx dy N
−1
9N (−1)j γP 3
2x 2 D 3 j
vmin vmin − CN −1 g1 − j −4 , (15)
2(3j + 3)D 3j +6 2
N −1 ∞ √ 3j +6 j =0
j 3N γP
(−1)j CN
= −1 3
2vmin D 3j +6 y 2 j + 2
2
3 5 dy. (10) where g1 (m) = v∞ min
x m ln(1 + x )dx .
j =0 vmin N by using (14) as
Similarly, we can also derive C̄s2
Using [18, eq. (3.197.2)] and taking the Poisson distributed vmax √ 3
∞
number of eavesdroppers into consideration, Pr (Cth ) can be N 3 γP
obtained as (9). C̄s2 = ln(1 + y)fγmax (y) 5 dx dy
2x 2 D 3
vmin y
N −1 √ 3j +6 ∞
IV. T HE AVERAGE S ECRECY C APACITY j j 3N γP 3
= (−1) CN −1 y − 2 j −4 ln(1 + y)dy
In this letter, ASC is defined the expected value of secrecy 2D 3j +6
j =0 vmin
capacity as follows3 √ 3j +6
N
−1
j 3N γP 3j
C̄s (γ0 , γmax ) = E [Cs (γ0 , γmax )] = (−1)j CN −1 g1 − − 4 . (16)
2D 3j +6 2
∞ ∞ j =0
= Cs (γ0 , γmax )f (γ0 , γmax )dγ0 dγmax . Finally, ASC can be obtained by substituting (15) and (16)
υmin υmin into (12).
(11)
V. N UMERICAL R ESULTS
As the main and eavesdropping channels are independent In this section, Monte Carlo simulations are carried out to
with each other, we can rewrite ASC as validate our proposed analytical expressions for SOP and ASC
∞
under dominated LoS fading channels and Nakagami-m fad-
μV N N
C̄s (γ0 , γmax ) = exp(−μV )(C̄s1 − C̄s2 )/ ln 2, (12) ing. The main adopted parameters are set as γ = 80 dB [11],
N!
N =1 P = 10 dBW, Cth = 1 bits/s/Hz and the expectation of chan-
nel power gain of the Nakagami-m fading channel is 1 dB.
3 The ergodic secrecy capacity aims at a case that the source can gain the
Moreover, the coverage distance of the source UAV is set
channel state information (CSI) of the eavesdropping link and transmits the from hundreds of meters to tens of km to reflect the practical
information only when the legitimate channel outperforms the eavesdropping
channel. Therefore, the ergodic secrecy capacity is a special case of ASC, as scenarios of UAVs in civil and military applications.
ASC is calculated for the case that no matter the CSI of the eavesdropping Fig. 2(a) shows the SOP versus λ for various Cth with
channel is available or not at the source. D = 500 m. Since a high threshold secrecy rate means the
system requires a large channel secrecy rate, we can see that the system with a small $C_{th}$ outperforms the one with a large $C_{th}$. Fig. 2(b) shows the SOP versus $D$ for various $\lambda$. Clearly, there are more eavesdropping UAVs when $\lambda$ and/or $D$ increase, resulting in degraded SOP, because the diversity gain of the eavesdropping increases as the number of eavesdropping UAVs grows.

Fig. 2. SOP.

Fig. 3. ASC.

Fig. 3(a) presents the ASC versus $D$ for various $\lambda$, in which one can see that the ASC can be improved in the low $\lambda$ region and degrades in the large $\lambda$ region when $\lambda$ decreases or $D$ increases. This is because low $\lambda$ and large $D$ improve the probability that the distance between S and the legitimate UAV is shorter than the distances between S and the eavesdropping UAVs. The legitimate UAV will then suffer weaker eavesdropping, compared to the case with high $\lambda$ and small $D$. However, the opposite holds for the ASC in the high $\lambda$ and/or low $D$ regions, since there will be more eavesdropping UAVs, leading to poorer secrecy performance. As shown in Fig. 3(b) with $\lambda = -130$ dB, we can also see that the reference SNR $\gamma$ cannot play a significantly positive role, since the received SNR over the eavesdropping links will also be improved when $\gamma$ increases.

Moreover, for the case suffering both LoS and Nakagami-m fading, we also present the SOP performance with $m = 2$ and $m = 3$ in Fig. 2(a) and Fig. 2(b), and the ASC with $m = \sqrt{6}$ and $m = \sqrt{2}$ in Fig. 3(a) and Fig. 3(b), respectively. One can see that the SOP and ASC suffering both LoS propagation and Nakagami-m fading outperform the ones only experiencing LoS propagation. This observation can be explained as follows: benefiting from the diversity gain of multiple eavesdropping UAVs, the equivalent received SNR at the eavesdroppers, $\gamma_{\max}$, will degrade more slowly than the received SNR at the legitimate UAV, $\gamma_0$, when the channels get worse. Therefore, we can see that the SOP only suffering LoS fading acts as the upper bound of the SOP, and the ASC only experiencing LoS fading serves as the lower bound of the ASC.

Finally, we can also clearly see from Figs. 2-3 that the simulation and analysis results match very well with each other, which verifies the correctness of our proposed analysis models.

VI. CONCLUSION
In this letter, we have studied the secrecy performance of an A2A communication system and derived the closed-form analytical expressions for SOP and ASC. We consider the randomness of the number and the positions of all UAVs to make our system more practical. From the numerical results, we observe that the density of the eavesdropping UAVs and the radius of the coverage space of S exhibit a negative impact on SOP. However, the ASC can be improved by increasing the radius of the coverage space of the source UAV.

REFERENCES
[1] S. Hayat, E. Yanmaz, and R. Muzaffar, “Survey on unmanned aerial vehicle networks for civil applications: A communications viewpoint,” IEEE Commun. Surveys Tuts., vol. 18, no. 4, pp. 2624–2661, 4th Quart., 2016.
[2] M. Erdelj, E. Natalizio, K. R. Chowdhury, and I. F. Akyildiz, “Help from the sky: Leveraging UAVs for disaster management,” IEEE Pervasive Comput., vol. 16, no. 1, pp. 24–32, Jan./Mar. 2017.
[3] Y. Zeng, R. Zhang, and T. J. Lim, “Wireless communications with unmanned aerial vehicles: Opportunities and challenges,” IEEE Commun. Mag., vol. 54, no. 5, pp. 36–42, May 2016.
[4] M. Mozaffari, W. Saad, M. Bennis, and M. Debbah, “Unmanned aerial vehicle with underlaid device-to-device communications: Performance and tradeoffs,” IEEE Trans. Wireless Commun., vol. 15, no. 6, pp. 3949–3963, Jun. 2016.
[5] Y. Chen, W. Feng, and G. Zheng, “Optimum placement of UAV as relays,” IEEE Commun. Lett., vol. 22, no. 2, pp. 248–251, Feb. 2018.
[6] T. Andre et al., “Application-driven design of aerial communication networks,” IEEE Commun. Mag., vol. 52, no. 5, pp. 129–137, May 2014.
[7] G. Pan, J. Ye, and Z. Ding, “On secure VLC systems with spatially random terminals,” IEEE Commun. Lett., vol. 21, no. 3, pp. 492–495, Mar. 2017.
[8] G. Pan, J. Ye, and Z. Ding, “Secure hybrid VLC-RF systems with light energy harvesting,” IEEE Trans. Commun., vol. 65, no. 10, pp. 4348–4359, Oct. 2017.
[9] S. Goel and R. Negi, “Guaranteeing secrecy using artificial noise,” IEEE Trans. Wireless Commun., vol. 7, no. 6, pp. 2180–2189, Jun. 2008.
[10] P. K. Gopala, L. Lai, and H. E. Gamal, “On the secrecy capacity of fading channels,” IEEE Trans. Inf. Theory, vol. 54, no. 10, pp. 4687–4698, Oct. 2008.
[11] G. Zhang, Q. Wu, M. Cui, and R. Zhang, “Securing UAV communications via trajectory optimization,” in Proc. GLOBECOM, Singapore, 2017, pp. 1–6.
[12] S.-W. Kim and S.-W. Seo, “Cooperative unmanned autonomous vehicle control for spatially secure group communications,” IEEE J. Sel. Areas Commun., vol. 30, no. 5, pp. 870–882, Jun. 2012.
[13] C. Liu, T. Q. S. Quek, and J. Lee, “Secure UAV communication in the presence of active eavesdropper (invited paper),” in Proc. WCSP, Nanjing, China, 2017, pp. 1–6.
[14] S. Hu, J. Flordelis, F. Rusek, and O. Edfors, “Unmanned aerial vehicle assisted cellular communication,” arXiv:1803.05763, 2018.
[15] K. Welch, “Evolving cellular technologies for safer drone operation,” San Diego, CA, USA, Qualcomm 5G, White Paper, Oct. 2016.
[16] Iskandar, “Wireless channel characteristic and its performance for stratospheric platform communication,” Ph.D. dissertation, Waseda Univ., Tokyo, Japan, 2007.
[17] G. Pan, H. Lei, Z. Ding, and Q. Ni, “On 3-D hybrid VLC-RF systems with light energy harvesting and OMA scheme over RF links,” in Proc. GLOBECOM, Singapore, 2017, pp. 1–6.
[18] I. S. Gradshteyn and I. M. Ryzhik, Table of Integrals, Series and Products, 7th ed. San Diego, CA, USA: Academic, 2007.
[19] A. P. Prudnikov, Y. A. Brychkov, and O. I. Marichev, Integrals and Series: Elementary Functions, vol. 1, 3rd ed. New York, NY, USA: Gordon Breach Sci. Publ., 1992.
Flexible-Rate SIC-Free NOMA for Downlink VLC Based on Constellation Partitioning Coding

Chen Chen, Wen-De Zhong, Senior Member, IEEE, Helin Yang, Pengfei Du, and Yanbing Yang

Abstract—Visible light communication (VLC) systems utilizing conventional superposition coding/successive interference cancellation (SPC/SIC)-based non-orthogonal multiple access (NOMA) suffer from the error propagation effect due to imperfect SIC. In this letter, we propose a flexible-rate SIC-free NOMA technique for downlink VLC systems, based on constellation partitioning coding (CPC) and uneven constellation demapping (UCD). By using CPC/UCD, SIC is not required and hence error propagation can be eliminated in NOMA-based VLC systems. Moreover, by selecting a proper bit allocation scheme, flexible-rate multiple access can be supported in the VLC system applying CPC/UCD-based NOMA. Proof-of-concept two-user VLC experiments verify that, compared with conventional SPC/SIC-based NOMA, the bit error rate performance of the near user can be greatly improved by using CPC/UCD-based NOMA, and hence the effective power allocation ratio range can be substantially extended.

Index Terms—Visible light communication, non-orthogonal multiple access, constellation partitioning coding.

Manuscript received September 9, 2018; revised October 19, 2018; accepted November 3, 2018. Date of publication November 8, 2018; date of current version April 9, 2019. This work was supported in part by the Delta Electronics Inc., and in part by the National Research Foundation, Singapore, through the Corp Lab@University Scheme. The associate editor coordinating the review of this paper and approving it for publication was B. Makki. (Corresponding author: Chen Chen.)
C. Chen, W.-D. Zhong, H. Yang, and P. Du are with the School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore 639798 (e-mail: chen0884@e.ntu.edu.sg).
Y. Yang is with the College of Computer Science, Sichuan University, Chengdu 610065, China (e-mail: yangyanbing@scu.edu.cn).
Digital Object Identifier 10.1109/LWC.2018.2879924

I. INTRODUCTION
IN RECENT years, visible light communication (VLC) using white light-emitting diodes (LEDs) has gained tremendous attention, due to the dual use of white LEDs for simultaneous illumination and communication in indoor environments [1]. As a complementary technology to traditional radio-frequency technologies such as Wi-Fi, VLC, also known as Li-Fi [2], has many inherent advantages such as huge and unregulated spectrum, low-cost front-ends and no electromagnetic interference emission [3]. Nevertheless, the achievable capacity of VLC systems falls far short of expectations, due to the limited modulation bandwidth of off-the-shelf white LEDs [4]. To overcome the bandwidth limitation, many capacity-enhancing techniques have been proposed so far, such as bandwidth extension based on pre- or post-equalization in the frequency domain [5], [6], spectral efficiency improvement using orthogonal frequency division multiplexing (OFDM) with high-order quadrature amplitude modulation (QAM) constellations [7], diversity or multiplexing gain achievement via multiple-input multiple-output (MIMO) transmission [8]–[10], etc.

As a promising candidate for 5G systems, power-domain non-orthogonal multiple access (NOMA) has been considered for capacity improvement in downlink VLC systems. Marshoud et al. [11] applied NOMA in VLC systems and proposed a gain ratio power allocation strategy. Advanced power allocation strategies were proposed for NOMA-based VLC systems by considering user fairness and the quality of service (QoS) constraint [12], [13]. An in-depth evaluation of NOMA in VLC systems was further performed by Yin et al. [14]. In [15], NOMA was applied in MIMO VLC systems. In these works, superposition coding and successive interference cancellation (SPC/SIC) are adopted and perfect SIC is generally assumed. However, perfect SIC cannot always be guaranteed in practical NOMA-based VLC systems. It has been shown that imperfect SIC might cause error propagation and hence degrade the bit error rate (BER) performance [16]. To address this issue, Li et al. [17] proposed symmetric SPC with symmetric SIC for error propagation mitigation. Nevertheless, the error propagation effect cannot be completely eliminated since SIC is still required and only fixed-rate multiple access can be supported.

Moreover, NOMA has also been considered in uplink VLC systems. In [18], phase pre-distortion was applied to improve the BER performance of uplink NOMA-based VLC systems. In [19], a joint detection scheme was presented, which is SIC-free and maximum likelihood optimal. Nevertheless, according to [20], joint detection requires bit-level joint maximum likelihood calculations and hence has relatively high complexity.

In this letter, for the first time, we propose and demonstrate a novel NOMA technique based on constellation partitioning coding (CPC) and uneven constellation demapping (UCD) for downlink VLC systems. By using CPC/UCD, user decoding can be realized without SIC and hence the error propagation effect due to imperfect SIC can be eliminated, resulting in improved BER performance. Moreover, by selecting a proper bit allocation scheme, flexible-rate multiple access can also be achieved. The feasibility of applying the proposed flexible-rate SIC-free NOMA technique in practical VLC systems has been successfully verified through two-user VLC experiments.

II. TWO-USER VLC USING CPC/UCD-BASED NOMA
In this section, we introduce a downlink VLC system with two users using the proposed flexible-rate SIC-free NOMA technique based on CPC and UCD. Fig. 1 illustrates the block diagram of the system. As we can see, the input bits of both
CHEN et al.: FLEXIBLE-RATE SIC-FREE NOMA FOR DOWNLINK VLC BASED ON CPC 569
the near and far users are fed into the CPC block to generate the CPC-coded constellation; the detailed principle of CPC is described in Section II-A. Subsequently, an inverse fast Fourier transform (IFFT) is executed and Hermitian symmetry (HS) is imposed to obtain a real-valued OFDM signal. The resultant digital signal is converted to an analog signal via digital-to-analog conversion (DAC) and a direct-current (DC) bias is added to ensure the non-negativity of the LED-driving signal. After propagation over the indoor VLC channel, the light is converted into an electrical analog signal via photodetection (PD) at each user. The obtained analog signals are converted back to digital signals via analog-to-digital conversion (ADC). Next, a fast Fourier transform (FFT) and frequency-domain equalization (FDE) are performed to obtain the respective constellations of both users. At each user, uneven Gray-coded QAM demapping is first performed and then the desired bits for each user are extracted. The principle of UCD with adaptive thresholds is discussed in Section II-B.

Fig. 1. Block diagram of a downlink two-user VLC system using CPC-based flexible-rate SIC-free NOMA.

Fig. 2. Comparison of (a) Gray-coded 8-QAM constellation for CPC and (b) non-Gray-coded 8-QAM constellation generated by SPC.

A. Constellation Partitioning Coding (CPC)

As shown in Fig. 1, the general rule of CPC for two-user NOMA consists of two parts. The first part is Gray-coded $(M_n \times M_f)$-QAM mapping, where $M_n$ and $M_f$ denote the orders of the QAM constellations desired by the near user and the far user, respectively. The second part includes two steps: one is constellation partitioning, which adaptively partitions a Gray-coded QAM constellation into multiple sub-constellations according to a pre-defined bit allocation scheme, and the other is power allocation, which is executed according to a pre-defined power allocation strategy. In the following, without loss of generality, we introduce CPC by taking the 8-QAM constellation as an example, i.e., $M_n \times M_f = 8$.

The Gray-coded 8-QAM constellation for CPC is depicted in Fig. 2(a), where there is only a one-bit difference between any two adjacent constellation points. However, for the non-Gray-coded 8-QAM constellation generated by SPC, as shown in Fig. 2(b), there is a two-bit difference between the constellation points representing bits "010" and "100", and the ones representing bits "011" and "101". Therefore, the near user using SPC/SIC-based NOMA might suffer from the adverse effect of severe error propagation due to the non-orthogonality of the superposed constellation [17].

Fig. 3. Flexible-rate partitioning of Gray-coded 8-QAM constellation by CPC for (a) $b_n = b_2 b_3$, $b_f = b_1$ and (b) $b_n = b_2$, $b_f = b_1 b_3$.

When using the 8-QAM constellation, a total of three bits ($b_1 b_2 b_3$) can be transmitted per 8-QAM symbol. For a two-user VLC system, the three bits are allocated to the near user and the far user. Specifically, two bit allocation schemes can be adopted according to the users' data rate requirements: (1) the near user is allocated two bits while the far user is allocated one bit, i.e., $M_n = 4$ and $M_f = 2$; (2) the near user is allocated one bit while the far user is allocated two bits, i.e., $M_n = 2$ and $M_f = 4$. Hence, flexible-rate multiple access can be achieved by employing CPC/UCD-based NOMA.

Let $b_n$ and $b_f$ denote the bit/bits allocated to the near and far users, respectively. Fig. 3(a) illustrates the first bit allocation scheme with $b_n = b_2 b_3$ and $b_f = b_1$, where the 8-QAM constellation is divided into two 4-QAM subconstellations. Fig. 3(b) shows the second bit allocation scheme with $b_n = b_2$ and $b_f = b_1 b_3$, where the 8-QAM constellation is partitioned into four binary phase-shift keying (BPSK) subconstellations. As shown in Figs. 3(a) and (b), for both bit allocation schemes, the electrical powers allocated to the far user and the near user are $a^2$ and $b^2$, respectively. Hence, the power allocation ratio between the electrical powers allocated to the far user and the near user for both bit allocation schemes is obtained by

\[ \rho = \frac{a^2}{b^2}. \tag{1} \]

Therefore, the resultant $(M_n \times M_f)$-QAM symbol after CPC can be represented by

\[ x_{\mathrm{CPC}} = \sqrt{\frac{P}{1+\rho}}\, x_n + \sqrt{\frac{\rho P}{1+\rho}}\, x_f, \tag{2} \]

where $x_n$ and $x_f$ are the symbols desired by the near and far users, respectively, and $P$ is the total input electrical power for the two users at the LED transmitter.

Although only the 8-QAM constellation is considered here as an example, the general rule of CPC for two-user NOMA is applicable to a QAM constellation with an arbitrary order, i.e., the values of $M_n$ and $M_f$ can be arbitrary.
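As a minimal sketch of (1) and (2), assuming unit-power user symbols and amplitude weights that follow from the power split (near-user power $P/(1+\rho)$, far-user power $\rho P/(1+\rho)$), the superposition can be written in Python as follows. The function names and the example symbols are hypothetical illustrations and are not taken from the authors' implementation.

```python
import math


def cpc_amplitudes(rho, total_power=1.0):
    """Return (near_amp, far_amp) so that far_amp**2 / near_amp**2 == rho
    and near_amp**2 + far_amp**2 == total_power, as implied by (1) and (2)."""
    near_amp = math.sqrt(total_power / (1.0 + rho))
    far_amp = math.sqrt(rho * total_power / (1.0 + rho))
    return near_amp, far_amp


def cpc_symbol(x_near, x_far, rho, total_power=1.0):
    """Superpose unit-power near and far symbols with the weights of (2)."""
    near_amp, far_amp = cpc_amplitudes(rho, total_power)
    return near_amp * x_near + far_amp * x_far


# Hypothetical example: rho = 2, unit total power, arbitrary unit-power symbols.
near_amp, far_amp = cpc_amplitudes(rho=2.0, total_power=1.0)
symbol = cpc_symbol(x_near=1 + 0j, x_far=1j, rho=2.0)
print(near_amp ** 2, far_amp ** 2, symbol)
```

The two squared amplitudes sum to the total power and their ratio equals $\rho$, which is the only property of the power allocation step that this sketch relies on.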
B. Uneven Constellation Demapping (UCD)

Due to the use of CPC, the transmitted constellation remains Gray-coded and hence maintains quasi-orthogonality. Therefore, UCD can be performed to recover the output bits for both the near and far users. Unlike conventional even Gray-coded QAM demapping, which uses fixed thresholds, adaptive thresholds are required when performing UCD. It can be observed from Fig. 3 that the adaptive thresholds can be obtained by slightly modifying the fixed thresholds of the conventional even Gray-coded QAM constellation. Specifically, the threshold sets adopted to decode the CPC-coded 8-QAM constellation are given as follows:

\[ T = \begin{cases} \{I = 0;\ Q = 0, \pm a\}, & \text{if } b_n = b_2 b_3,\ b_f = b_1 \\ \{I = 0;\ Q = 0, \pm a/\sqrt{2}\}, & \text{if } b_n = b_2,\ b_f = b_1 b_3. \end{cases} \tag{3} \]

Since three bits can be obtained from each input symbol, bit extraction is required by users to recover their desired output bits. Similarly, the principle of UCD can be easily generalized to a QAM constellation with an arbitrary order.

It can be found that SIC is no longer required when adopting CPC with UCD. Consequently, the adverse error propagation effect induced by imperfect SIC can be successfully eliminated due to the quasi-orthogonality of the received constellation.

III. EXPERIMENTAL SETUP AND RESULTS

To verify the feasibility of applying the CPC/UCD-based NOMA technique in practical downlink VLC systems, a proof-of-concept experimental demonstration is conducted here and the experimental setup of a two-user VLC system is depicted in Fig. 4. The transmitted two-user OFDM-NOMA signal with a power allocation ratio $\rho$ as defined by (1) is generated offline by MATLAB and uploaded to an arbitrary waveform generator (AWG, Tabor WW2074) with a sampling rate of 50 MSa/s. Subsequently, a 300-mA DC bias current is added to the obtained signal via a bias-tee (bias-T) and the resultant signal is then used to drive a white LED (Luxeon SR-12 Rebel Star/O). After 100-cm free-space propagation, the light is detected by two users, where the near user is assumed to face towards the LED while the far user has a position offset of 10 cm from the near user. Each user is individually equipped with a blue filter (BF) and an avalanche photodiode (APD, Hamamatsu S8664-50K). The APD has a responsivity of about 15 A/W at 450 nm and an active area of 19.6 mm². Due to the hardware limitation, we adjust the position of the receiver to detect the signals of the two users. The detected signals are recorded by a mixed domain oscilloscope (MDO, Tektronix MDO3104) with a sampling rate of 250 MSa/s and further processed offline.

Fig. 4. Experimental setup of the two-user VLC system. Insets: (a) electrical spectrum of the received OFDM-NOMA signal, (b) photo of the transmitter part, and (c) photo of the receiver part.

The OFDM-NOMA signal is generated offline with an IFFT size of 512, where a total of 154 (2nd to 155th) subcarriers are utilized to modulate valid data. Therefore, the bandwidth of the OFDM-NOMA signal is 15 MHz. When using the 8-QAM constellation, the sum data rate of the two users is 45 Mbit/s. No cyclic prefix (CP) is used and a total of 200 symbols are transmitted through the VLC system for BER measurement. Inset (a) in Fig. 4 shows the electrical spectrum of the received signal. The photos of the transmitter part and the receiver part are shown by insets (b) and (c), respectively.

In the following, we evaluate the BER performance of the two-user VLC system using the conventional SPC/SIC-based NOMA and the proposed CPC/UCD-based NOMA. Moreover, to demonstrate that flexible rates can be achieved for the two users, the two bit allocation schemes discussed in Section II-A are investigated. Fig. 5(a) shows the measured BER as a function of the power allocation ratio $\rho$ for the 8-QAM constellation with $b_n = b_2 b_3$, $b_f = b_1$, where the input peak-to-peak voltage of the AWG is set to 2.4 V and $\rho$ is in the range from 1.4 to 2.6. As can be observed, nearly the same BER can be achieved for the far user using either conventional SPC/SIC-based NOMA or CPC/UCD-based NOMA, and this BER is gradually reduced with the increase of $\rho$. However, for the near user, the BER is first reduced and then increased with the increase of $\rho$ when using conventional SPC/SIC-based NOMA. More specifically, the BER is below the 7% forward error correction (FEC) overhead limit of $3.8 \times 10^{-3}$ only when the value of $\rho$ is around 1.8. The deteriorated BER performance at small $\rho$ values, i.e., $\rho < 1.8$, is mainly due to the adverse error propagation effect caused by imperfect SIC. In contrast, when applying the proposed CPC/UCD-based NOMA, SIC is no longer required and hence error propagation can be eliminated. As a result, the BER is gradually reduced with the decrease of $\rho$, suggesting that the BER performance of the near user is robust against the interference from the signal intended for the far user. Moreover, when $\rho$ becomes relatively large, the near user can achieve almost the same BER performance using either conventional SPC/SIC-based NOMA or CPC/UCD-based NOMA, because the interference caused by the far user's signal becomes negligible. The measured BER performance versus the power allocation ratio $\rho$ for the 8-QAM constellation with $b_n = b_2$, $b_f = b_1 b_3$ is shown in Fig. 5(b), where the input peak-to-peak voltage of the AWG is 3.1 V and $\rho$ is in the range from 5 to 11. Similarly, the BER performance of the near user can be substantially improved by applying the proposed CPC/UCD-based NOMA when $\rho$ is relatively small. In addition, the insets in Figs. 5(a) and (b) show the corresponding constellation diagrams.

In order to achieve error-free downlink transmission in the two-user VLC system, the BERs of the two users should both be below the FEC overhead limit, i.e., $3.8 \times 10^{-3}$. Here, we define
the effective power allocation ratio $\rho_e$ as the value of $\rho$ which guarantees that both the near and far users can achieve BERs below $3.8 \times 10^{-3}$. For the 8-QAM constellation with $b_n = b_2 b_3$, $b_f = b_1$, $\rho_e$ is in a very small range between 1.75 and 1.9 when using conventional SPC/SIC-based NOMA. However, when CPC/UCD-based NOMA is applied, the range of $\rho_e$ is extended to (1.68, 2.22). Similarly, for the 8-QAM constellation with $b_n = b_2$, $b_f = b_1 b_3$, the range of $\rho_e$ is extended from (6.85, 8.51) to (5.81, 9.29) when conventional SPC/SIC-based NOMA is replaced by CPC/UCD-based NOMA. From the practical implementation point of view, a wider effective power allocation ratio range indicates the potential to support higher data rates and the flexibility of overall system design.

Fig. 5. Measured BER vs. power allocation ratio for 8-QAM constellation with (a) $b_n = b_2 b_3$, $b_f = b_1$ and (b) $b_n = b_2$, $b_f = b_1 b_3$.

IV. CONCLUSION

In this letter, we have proposed and experimentally verified a novel CPC/UCD-based NOMA technique for downlink VLC systems. By applying CPC with UCD in NOMA, the desired signals for all users in the VLC system can be successfully decoded without SIC. Therefore, the error propagation effect induced by imperfect SIC can be efficiently mitigated. A two-user VLC system achieving a sum data rate of 45 Mbit/s has been experimentally demonstrated to verify the feasibility of the proposed CPC/UCD-based NOMA technique. The obtained results show that flexible rates can be achieved for both users by selecting a proper bit allocation scheme. Moreover, the BER performance of the near user can be substantially improved when the power allocation ratio is relatively small, by using CPC/UCD-based NOMA in comparison to conventional SPC/SIC-based NOMA. Meanwhile, the BER performance of the far user remains the same for both NOMA techniques. In consequence, an extended effective power allocation ratio range can be achieved by the VLC system using CPC/UCD-based NOMA, which enables the potential of higher data rate transmission and more flexible system design.

REFERENCES

[1] T. Komine and M. Nakagawa, “Fundamental analysis for visible-light communication system using LED lights,” IEEE Trans. Consum. Electron., vol. 50, no. 1, pp. 100–107, Feb. 2004.
[2] H. Haas, L. Yin, Y. Wang, and C. Chen, “What is LiFi?” J. Lightw. Technol., vol. 34, no. 6, pp. 1533–1544, Mar. 15, 2016.
[3] L. Grobe et al., “High-speed visible light communication systems,” IEEE Commun. Mag., vol. 51, no. 12, pp. 60–66, Dec. 2013.
[4] Z. Ghassemlooy, L. N. Alves, S. Zvanovec, and M.-A. Khalighi, Visible Light Communications: Theory and Applications. Boca Raton, FL, USA: CRC Press, Jul. 2017.
[5] H. Le Minh et al., “100-Mb/s NRZ visible light communications using a postequalized white LED,” IEEE Photon. Technol. Lett., vol. 21, no. 15, pp. 1063–1065, Aug. 1, 2009.
[6] C. Chen, W.-D. Zhong, and D. Wu, “Indoor OFDM visible light communications employing adaptive digital pre-frequency domain equalization,” in Proc. Conf. Lasers Elect. Opt. (CLEO), Jun. 2016, pp. 1–2.
[7] M. Z. Afgani, H. Haas, H. Elgala, and D. Knipp, “Visible light communication using OFDM,” in Proc. Int. Conf. Testbeds Res. Infrastruct. Develop. Netw. Commun. (TRIDENTCOM), Mar. 2006, pp. 129–134.
[8] L. Zeng et al., “High data rate multiple input multiple output (MIMO) optical wireless communications using white LED lighting,” IEEE J. Sel. Areas Commun., vol. 27, no. 9, pp. 1654–1662, Dec. 2009.
[9] T. Fath and H. Haas, “Performance comparison of MIMO techniques for optical wireless communications in indoor environments,” IEEE Trans. Commun., vol. 61, no. 2, pp. 733–742, Feb. 2013.
[10] C. Chen, W.-D. Zhong, and D. Wu, “On the coverage of multiple-input multiple-output visible light communications [Invited],” IEEE/OSA J. Opt. Commun. Netw., vol. 9, no. 9, pp. D31–D41, Sep. 2017.
[11] H. Marshoud, V. M. Kapinas, G. K. Karagiannidis, and S. Muhaidat, “Non-orthogonal multiple access for visible light communications,” IEEE Photon. Technol. Lett., vol. 28, no. 1, pp. 51–54, Jan. 1, 2016.
[12] X. Zhang, Q. Gao, C. Gong, and Z. Xu, “User grouping and power allocation for NOMA visible light communication multi-cell networks,” IEEE Commun. Lett., vol. 21, no. 4, pp. 777–780, Apr. 2017.
[13] Z. Yang, W. Xu, and Y. Li, “Fair non-orthogonal multiple access for visible light communication downlinks,” IEEE Wireless Commun. Lett., vol. 6, no. 1, pp. 66–69, Feb. 2017.
[14] L. Yin, W. O. Popoola, X. Wu, and H. Haas, “Performance evaluation of non-orthogonal multiple access in visible light communication,” IEEE Trans. Commun., vol. 64, no. 12, pp. 5162–5175, Dec. 2016.
[15] C. Chen, W.-D. Zhong, H. Yang, and P. Du, “On the performance of MIMO-NOMA-based visible light communication systems,” IEEE Photon. Technol. Lett., vol. 30, no. 4, pp. 307–310, Feb. 15, 2018.
[16] S. M. R. Islam, N. Avazov, O. A. Dobre, and K.-S. Kwak, “Power-domain non-orthogonal multiple access (NOMA) in 5G systems: Potentials and challenges,” IEEE Commun. Surveys Tuts., vol. 19, no. 2, pp. 721–742, 2nd Quart., 2017.
[17] H. Li, Z. Huang, Y. Xiao, S. Zhan, and Y. Ji, “Solution for error propagation in a NOMA-based VLC network: Symmetric superposition coding,” Opt. Exp., vol. 25, no. 24, pp. 29856–29863, Nov. 2017.
[18] X. Guan, Q. Yang, Y. Hong, and C. C.-K. Chan, “Non-orthogonal multiple access with phase pre-distortion in visible light communication,” Opt. Exp., vol. 24, no. 22, pp. 25816–25823, Oct. 2016.
[19] X. Guan, Q. Yang, and C.-K. Chan, “Joint detection of visible light communication signals under non-orthogonal multiple access,” IEEE Photon. Technol. Lett., vol. 29, no. 4, pp. 377–380, Feb. 15, 2017.
[20] K. Ando, Y. Sanada, and T. Saba, “Joint maximum likelihood detection in far user of non-orthogonal multiple access,” IEICE Trans. Commun., vol. 100, no. 1, pp. 177–186, Jan. 2017.
Meta Distribution of Downlink Non-Orthogonal Multiple Access (NOMA) in Poisson Networks

Konpal Shaukat Ali, Hesham ElSawy, and Mohamed-Slim Alouini

Abstract: We study the meta distribution (MD) of the coverage probability (CP) in downlink non-orthogonal-multiple-access (NOMA) networks. Two schemes are assessed based on the location of the NOMA users: 1) anywhere in the network and 2) cell-center users only. The moments of the MD for both schemes are derived and the MD is approximated via the beta distribution. Closed-form moments are derived for the first scheme; for the second scheme exact and approximate moments, to simplify the integral calculation, are derived. We show that restricting NOMA to cell-center users provides significantly higher mean, lower variance and better percentile performance for the CP.

Index Terms: Stochastic geometry, meta distribution, non-orthogonal multiple access (NOMA).

Manuscript received September 12, 2018; accepted November 5, 2018. Date of publication November 8, 2018; date of current version April 9, 2019. The associate editor coordinating the review of this paper and approving it for publication was M. Kountouris. (Corresponding author: Konpal Shaukat Ali.)
K. S. Ali and M.-S. Alouini are with the Computer, Electrical, and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology, Thuwal 23955-6900, Saudi Arabia (e-mail: konpal.ali@kaust.edu.sa; slim.alouini@kaust.edu.sa).
H. ElSawy is with the Department of Electrical Engineering, King Fahd University of Petroleum and Minerals, Dhahran 31261, Saudi Arabia (e-mail: hesham.elsawy@kfupm.edu.sa).
Digital Object Identifier 10.1109/LWC.2018.2880210

I. INTRODUCTION

Conventionally, orthogonal multiple access (OMA) is used for transmissions to different users (UEs) served by the same base station (BS). OMA assigns different time-frequency resource blocks (TF-RBs) to each UE to avoid intracell interference. However, spectrum scarcity and the increasing capacity demand call for more efficient spectrum utilization. In this regard, non-orthogonal multiple access (NOMA) is a technique that improves spectral efficiency by superposing the messages of multiple UEs on one TF-RB. Successive interference cancellation (SIC) is used for NOMA decoding. The superiority of NOMA over OMA schemes in a noise-limited regime is well established from an information theoretic perspective [1].

Using stochastic geometry, the superiority of NOMA has also been established for large-scale interference-prone networks [2]–[5]. Such studies usually focus on the spatially averaged coverage probability (SCP), which averages the coverage probability (CP) over all fading, activity, and network realizations. However, network operators are usually more interested in the percentile performance of UEs, where the fading and activity change while the network realization is kept constant. The CP given a fixed network realization is defined as the conditional CP (CCP) [6]. The complementary cdf of the CCP, denoted as the meta distribution (MD), reveals the percentile performance across an arbitrary network realization. Reference [7] studies the MD for uplink and downlink NOMA with NOMA UEs located everywhere in the network; however, the joint decoding associated with SIC is not taken into account.

This letter characterizes the MD in downlink cellular networks for two NOMA schemes, namely, everywhere NOMA (E-NOMA) and cell-center NOMA (C-NOMA). E-NOMA utilizes NOMA for UEs located everywhere in the network [5], [7], while C-NOMA restricts NOMA to cell-center UEs only [2], [3]. We derive closed-form expressions for the moments of the MD in E-NOMA. Integral expressions are obtained for the moments in C-NOMA; consequently, we propose accurate approximate moments to simplify the integral calculation. The MD is then approximated using the beta distribution via moment matching to characterize the UEs' percentile performance. Different from [7], we derive and compare the statistics of the MD for two NOMA schemes, and consider joint decoding for all SIC phases. To the best of our knowledge, NOMA works in the literature employ one scheme and do not compare different schemes. Our results show that C-NOMA not only provides higher SCP, but also reduces the variance of the CP across the UEs in the network when compared to E-NOMA.

II. SYSTEM MODEL

We consider a downlink cellular network where BSs are distributed according to a homogeneous PPP $\Phi$ with intensity $\lambda$. Each BS serves $N$ UEs in one TF-RB by multiplexing the signals for each UE with different power levels using a total power budget $P = 1$. A Rayleigh fading environment is assumed such that the fading coefficients are i.i.d. with a unit mean exponential distribution. A power-law path-loss model is considered where the signal decays at the rate $r^{-\eta}$ with distance $r$, $\eta > 2$ denotes the path-loss exponent and $\delta = 2/\eta$.

SIC requires ordering the UEs according to some measure of link strength [2]. For $i \in \{1, \ldots, N\}$, the $i$th strongest UE is referred to as UE$_i$. In this letter, we order the UEs based on the link distance $R$. The ordered link distance of UE$_i$ is denoted by $R_i$; consequently, UE$_i$ is nearer to the BS and therefore stronger than UE$_j$ for $i < j$ (i.e., $R_i < R_j$). Exploiting SIC, UE$_i$ decodes and cancels messages intended for all weaker UEs before decoding its own message. On the other hand, messages for stronger UEs are treated as noise and contribute to the intracell interference. We incorporate imperfect SIC into our analysis by considering a fraction $\beta$ of residual intracell interference from the canceled messages of weaker UEs. Let $P_i$ and $\log(1 + \theta_i)$ denote the power allocated and target rate for UE$_i$; the corresponding signal-to-interference ratio (SIR) threshold for the message of UE$_i$ is $\theta_i$. Note that due to the power budget, $\sum_{i=1}^{N} P_i = 1$. For feasible SIC, proper resource allocation (RA), i.e., power allocation and rate adaptation (e.g., $P_i \le P_j$ and/or $\theta_i \ge \theta_j$ for $i < j$), for all UEs is required.

Lemma 1: For any ascending ordered statistic like $R_i$, based on the statistics of the unordered counterpart $R$, the pdf is

\[ f_{R_i}(r) = \binom{N-1}{i-1} N f_R(r) \left(F_R(r)\right)^{i-1} \left(1 - F_R(r)\right)^{N-i}. \tag{1} \]
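A quick numerical sanity check of Lemma 1 can be sketched in Python. The unordered distribution used here (uniform on [0, 1]) is an assumption chosen only because its order statistics have the well-known mean $i/(N+1)$; it is not one of the link-distance distributions of this letter, and the helper names are hypothetical.

```python
import math


def ordered_pdf(r, i, n, pdf, cdf):
    """pdf of the i-th smallest of n i.i.d. draws, following the form of (1)."""
    return (math.comb(n - 1, i - 1) * n * pdf(r)
            * cdf(r) ** (i - 1) * (1.0 - cdf(r)) ** (n - i))


def integrate(func, lo, hi, steps=20000):
    """Midpoint-rule integral of func over [lo, hi]."""
    width = (hi - lo) / steps
    return width * sum(func(lo + (k + 0.5) * width) for k in range(steps))


def uniform_pdf(r):
    """Illustrative unordered pdf (assumption): uniform on [0, 1]."""
    return 1.0


def uniform_cdf(r):
    """cdf matching uniform_pdf on [0, 1]."""
    return r


n_users = 3
for i in range(1, n_users + 1):
    mass = integrate(
        lambda r: ordered_pdf(r, i, n_users, uniform_pdf, uniform_cdf), 0.0, 1.0)
    mean = integrate(
        lambda r: r * ordered_pdf(r, i, n_users, uniform_pdf, uniform_cdf), 0.0, 1.0)
    print(i, mass, mean)
```

Each ordered pdf should integrate to one, and the printed means should be close to $i/(N+1)$ for this uniform example.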
ALI et al.: MD OF DOWNLINK NOMA IN POISSON NETWORKS 573
In terms of components larger than $i$, (1) can be rewritten as

\[ f_{R_i}(r) = \hat{f}_{R_i}(r) + \sum_{m=i+1}^{N} \binom{m-1}{i-1} (-1)^{m-i} \hat{f}_{R_m}(r), \tag{2} \]

where $\hat{f}_{R_j}(r) = \binom{N-1}{j-1} N f_R(r) (F_R)^{j-1}$ for $i \le j \le N$. In terms of components smaller than $i$, (1) can be rewritten as

\[ f_{R_i}(r) = \tilde{f}_{R_i}(r) + \sum_{m=1}^{i-1} \frac{(N-m)!\,(-1)^{i-m}}{(m-1)!\,(i-m)!}\, \tilde{f}_{R_m}(r), \tag{3} \]

where $\tilde{f}_{R_j}(r) = \binom{N-1}{j-1} N f_R(r) (1 - F_R)^{N-j}$ for $1 \le j \le i$.

We denote the distance between a BS and its nearest neighboring BS by $\rho$. Since $\Phi$ is a PPP, the pdf of $\rho$ is $f_\rho(x) = 2\pi\lambda x e^{-\pi\lambda x^2}$, $x \ge 0$. Consider a disk around each BS located at $x$ with radius $\rho/2$, i.e., $b(x, \rho/2)$; we refer to this as the in-disk. The in-disk is the largest disk centered at a BS that fits inside its Voronoi cell. We study and compare NOMA for the following two schemes.

1) Everywhere NOMA (E-NOMA): $N$ UEs are distributed uniformly and independently in each Voronoi cell. Consequently, the distribution of the unordered link distance $R$ follows $f_R(r) = 2\pi\lambda r e^{-\pi\lambda r^2}$, $r \ge 0$. Using this pdf and its cdf $F_R(r)$, the ordered distance distribution $f_{R_i}(r)$, $r \ge 0$, in the E-NOMA scheme follows (1).

2) Cell-Center NOMA (C-NOMA): $N$ UEs are distributed uniformly and independently in the in-disk $b(x, \rho/2)$ of each BS at $x$ [3]. Consequently, the link distance $R$, conditioned on $\rho$, follows $f_{R|\rho}(r \mid \rho) = 8r/\rho^2$, $0 \le r \le \rho/2$. Using (1), the pdf of $R_i$, conditioned on $\rho$, in the C-NOMA scheme follows

\[ f_{R_i|\rho}(r \mid \rho) = \binom{N-1}{i-1} \frac{8rN}{\rho^2} \left(\frac{4r^2}{\rho^2}\right)^{i-1} \left(1 - \frac{4r^2}{\rho^2}\right)^{N-i}, \quad 0 \le r \le \frac{\rho}{2}. \tag{4} \]

Remark: C-NOMA restricts the link distance to $\rho/2$; the notion is that NOMA is better suited for UEs that are closer to the serving BS. UEs with relatively larger link distances are better served in their own resource block without sharing [2].

III. SIR ANALYSIS

SIC requires a UE to successfully decode all of the messages intended for weaker UEs. Consider a randomly selected BS located at $x_0$ and its associated UEs; the SIR at UE$_i$ of the message intended for UE$_j$ for $i \le j \le N$ is

\[ \mathrm{SIR}^{i}_{j} = \frac{h_i R_i^{-\eta} P_j}{h_i R_i^{-\eta} \left( \sum_{m=1}^{j-1} P_m + \beta \sum_{k=j+1}^{N} P_k \right) + \sum_{x \in \Phi \setminus x_0} g_{y_i} \|y_i\|^{-\eta}}, \]

where $y_i = x - u_i$, $u_i$ is the location of UE$_i$, $\|\cdot\|$ denotes the Euclidean norm, and $h_i$ ($g_{y_i}$) is the fading power gain from the serving (interfering) BS to UE$_i$.

Accordingly, due to SIC decoding, coverage at UE$_i$ is defined via the following joint event

\[ C_i = \bigcap_{j=i}^{N} \left\{ \mathrm{SIR}^{i}_{j} > \theta_j \right\} = \bigcap_{j=i}^{N} \left\{ h_i > R_i^{\eta} \frac{\theta_j}{\tilde{P}_j} \sum_{x \in \Phi \setminus x_0} g_{y_i} \|y_i\|^{-\eta} \right\}, \tag{5} \]

where $\tilde{P}_j = P_j - \theta_j \left( \sum_{m=1}^{j-1} P_m + \beta \sum_{k=j+1}^{N} P_k \right)$. We rewrite (5) as $C_i = \left\{ h_i > R_i^{\eta} M_i \sum_{x \in \Phi \setminus x_0} g_{y_i} \|y_i\|^{-\eta} \right\}$ using $M_i = \max_{i \le j \le N} \theta_j / \tilde{P}_j$.

For a fixed, yet arbitrary, realization of the network, the CCP of UE$_i$ in a randomly selected cell, $P_{C_i}$, is

\[ P_{C_i} = \mathbb{P}(C_i \mid \Phi) \overset{(a)}{=} \mathbb{E}_{g_{y_i}}\!\left[ \exp\!\left( -R_i^{\eta} M_i \sum_{x \in \Phi \setminus x_0} g_{y_i} \|y_i\|^{-\eta} \right) \Bigm| \Phi \right] \overset{(b)}{=} \prod_{x \in \Phi \setminus x_0} \frac{1}{1 + R_i^{\eta} M_i \|y_i\|^{-\eta}}, \tag{6} \]

where (a) follows using the cdf of $h_i \sim \exp(1)$ and (b) follows from the MGF of the independent RVs $g_{y_i} \sim \exp(1)$.

Denote the $b$th moment of the CCP of UE$_i$ across all links in an arbitrary fixed realization of the network by $M_{i,b}$. Then,

\[ M_{i,b} = \mathbb{E}\!\left[ \prod_{x \in \Phi \setminus x_0} \left( 1 + R_i^{\eta} M_i \|y_i\|^{-\eta} \right)^{-b} \right]. \tag{7} \]

Remark: If $\tilde{P}_j < 0$, the CCP is zero. Henceforth we assume RA such that $\tilde{P}_j \ge 0$.

Note: If $b = 1$ in (7), we obtain the SCP of UE$_i$.

Through moment matching, the MD of UE$_i$ is approximated using the beta distribution [6] as follows

\[ \bar{F}_{P_{C_i}}(\alpha) = \mathbb{P}\!\left( P_{C_i} > \alpha \right) \approx 1 - I_\alpha\!\left( \frac{\beta_i M_{i,1}}{1 - M_{i,1}}, \beta_i \right), \tag{8} \]

where $\beta_i = \frac{(M_{i,1} - M_{i,2})(1 - M_{i,1})}{M_{i,2} - M_{i,1}^2}$ and $I_\alpha(a, b) = \int_0^\alpha l^{a-1} (1 - l)^{b-1}\, dl$. The variance of the MD of UE$_i$ is defined as $\sigma_i^2 = M_{i,2} - M_{i,1}^2$.

The ordered relative distance process (RDP) for UE$_i$, which is the RDP in [8] using ordered link distance, is

\[ \mathcal{R}_i = \left\{ x \in \Phi \setminus \{x_0\} : R_i / \|y_i\| \right\}. \tag{9} \]

Using the PGFL of the PPP in (a), the PGFL of $\mathcal{R}_i$ is

\[ G_{\mathcal{R}_i}[f] = \mathbb{E}\!\left[ \prod_{x \in \mathcal{R}_i} f(x) \right] = \mathbb{E}\!\left[ \prod_{x \in \Phi \setminus \{x_0\}} f\!\left( \frac{R_i}{\|y_i\|} \right) \right] \overset{(a)}{=} \mathbb{E}_{R_i}\!\left[ \exp\!\left( -2\pi\lambda \int_{R_i}^{\infty} \left( 1 - f\!\left( \frac{R_i}{a} \right) \right) a\, da \right) \right]. \tag{10} \]

Using the ordered RDP for UE$_i$, the expectation in (7) can also be evaluated as

\[ M_{i,b} = \mathbb{E}\!\left[ \prod_{y \in \mathcal{R}_i} \left( 1 + M_i y^{\eta} \right)^{-b} \right]. \tag{11} \]

A. E-NOMA Scheme

We characterize the PGFL of the ordered RDPs and obtain closed-form expressions for $M_{i,b}$.

Lemma 2: The PGFL of $\mathcal{R}_i$ for $1 \le i \le N$ in E-NOMA is

\[ G_{\mathcal{R}_i}[f] = \tilde{G}_{\mathcal{R}_i}[f] + \sum_{m=1}^{i-1} \frac{(N-m)!\,(-1)^{i-m}}{(m-1)!\,(i-m)!}\, \tilde{G}_{\mathcal{R}_m}[f], \tag{12} \]

where for $1 \le j \le i$

\[ \tilde{G}_{\mathcal{R}_j}[f] = \frac{\binom{N-1}{j-1} N}{(N - j + 1) + 2 \int_1^{\infty} \left( 1 - f(y^{-1}) \right) y\, dy}. \tag{13} \]

Proof: We obtain (12) using (3) in (10). Also using (10),

\[ \tilde{G}_{\mathcal{R}_j}[f] = \int_0^{\infty} \tilde{f}_{R_j}(x) \exp\!\left( -2\pi\lambda \int_x^{\infty} \left( 1 - f\!\left( \frac{x}{a} \right) \right) a\, da \right) dx \overset{(a)}{=} \pi\lambda N \binom{N-1}{j-1} \int_0^{\infty} e^{-2\pi\lambda m \int_1^{\infty} (1 - f(y^{-1})) y\, dy}\, e^{-\pi\lambda (N-j+1) m}\, dm \]
where (a) is obtained by changing variables and (13) is obtained using the MGF of $m \sim \exp(\pi\lambda(N - j + 1))$.

Corollary 1: $M_{i,b}$ for $1 \le i \le N$ in E-NOMA is

\[ M_{i,b} = \tilde{M}_{i,b} + \sum_{m=1}^{i-1} \frac{(N-m)!\,(-1)^{i-m}}{(m-1)!\,(i-m)!}\, \tilde{M}_{m,b}, \tag{14} \]

where for $1 \le j \le i$

\[ \tilde{M}_{j,b} = \frac{\binom{N-1}{j-1} N}{N - j + {}_2F_1(b, -\delta, 1 - \delta, -M_i)}. \tag{15} \]

Proof: (14) is obtained using (12), where we define using (11)

\[ \tilde{M}_{j,b} = \tilde{G}_{\mathcal{R}_j}\!\left[ \frac{1}{(1 + M_i y^{\eta})^{b}} \right] \overset{(a)}{=} \frac{\binom{N-1}{j-1} N}{N - j + 1 + 2 \int_1^{\infty} \left( 1 - (1 + M_i y^{-\eta})^{-b} \right) y\, dy}. \]

We obtain (a) using (13), and (15) follows by $y \to g^{-1}$.

B. C-NOMA Scheme

We obtain integral expressions for $M_{i,b}$. We also propose approximate PGFLs of the ordered RDP and use these to evaluate $M_{i,b}$ in a simpler form.

Lemma 3: The $b$th moment of the CCP for UE$_i$ in the C-NOMA scheme is

\[ M_{i,b} \approx \mathbb{E}_{\rho, R_i}\!\left[ e^{-2\pi\lambda \int_{\rho - R_i}^{\infty} \left( 1 - \left( 1 + \frac{M_i R_i^{\eta}}{r^{\eta}} \right)^{-b} \right) r\, dr} \left( 1 + \frac{M_i R_i^{\eta}}{\rho^{\eta}} \right)^{-b} \right]. \tag{16} \]

Proof: In the C-NOMA model each UE is conditioned to have an interferer $\rho$ away from the serving BS. Hence, using (7)

\[ M_{i,b} = \mathbb{E}\!\left[ \prod_{\substack{x \in \Phi \setminus x_0 \\ \|x - x_0\| > \rho}} \left( 1 + M_i \frac{R_i^{\eta}}{\|y_i\|^{\eta}} \right)^{-b} \times \prod_{\substack{x \in \Phi \setminus x_0 \\ \|x - x_0\| = \rho}} \left( 1 + M_i \frac{R_i^{\eta}}{\|y_i\|^{\eta}} \right)^{-b} \right]. \]

We obtain the first term in (16) using the PGFL of the PPP and the guard zone $b(u_i, \rho - R_i)$ in the C-NOMA scheme. The average location of a UE distributed uniformly in the in-disk is the center of the disk, i.e., $x_0$. Accordingly, we approximate the average distance between a UE and the BS $\rho$ away from $x_0$ as $\rho$; hence, the second term in (16) is obtained. This approximation has been validated to be tight in [2] and [3].

Consider the following two approximations:
• A1: UE$_i$ is guaranteed to have no interfering BS in $b(u_i, R_i)$, which is not the largest guard zone around the UE.
• A2: Deconditioning on the BS $\rho$ away from the serving BS.

Remark: The two approximations have opposing effects; A1 overestimates intercell interference while A2 underestimates it.

Calculating $M_{i,b}$ using Lemma 3 requires a triple integral. However, exploiting A1 and A2, we provide an approximation to calculate $M_{i,b}$ that requires a single integration.

Lemma 4: Using A1 and A2, the PGFL of $\mathcal{R}_i$ conditioned on $\rho$ for $1 \le i \le N$ in the C-NOMA scheme is

\[ G_{\mathcal{R}_i|\rho}[f] = \hat{G}_{\mathcal{R}_i|\rho}[f] + \sum_{m=i+1}^{N} \binom{m-1}{i-1} (-1)^{m-i}\, \hat{G}_{\mathcal{R}_m|\rho}[f], \tag{17} \]

where for $i \le j \le N$

\[ \hat{G}_{\mathcal{R}_j|\rho}[f] = \binom{N-1}{j-1} N\, \frac{\Gamma(j) - \Gamma\!\left( j, \frac{\pi\lambda\rho^2}{2} \int_1^{\infty} \left( 1 - f\!\left( \frac{1}{y} \right) \right) y\, dy \right)}{\left( \frac{1}{2} \rho^2 \pi\lambda \int_1^{\infty} \left( 1 - f\!\left( \frac{1}{y} \right) \right) y\, dy \right)^{j}}. \tag{18} \]

Proof: We obtain (17) using (2) in (10). Also using (10),

\[ \hat{G}_{\mathcal{R}_j|\rho}[f] = \int_0^{\infty} \hat{f}_{R_j|\rho}(x) \exp\!\left( -2\pi\lambda \int_x^{\infty} \left( 1 - f\!\left( \frac{x}{a} \right) \right) a\, da \right) dx \overset{(a)}{=} \binom{N-1}{j-1} N \frac{4^{j}}{\rho^{2j}} \int_0^{\rho^2/4} e^{-2\pi\lambda m \int_1^{\infty} (1 - f(y^{-1})) y\, dy}\, m^{j-1}\, dm, \]

where (a) follows by changing variables, and (18) by integration.

We approximate $M_{i,b}$ by substituting the approximate PGFL of $\mathcal{R}_i$, conditioned on $\rho$, into (11) and averaging over $\rho$.

Corollary 2: Using A1 and A2, $M_{i,b}$ for $1 \le i \le N$ in C-NOMA is

\[ M_{i,b} = \hat{M}_{i,b} + \sum_{m=i+1}^{N} \binom{m-1}{i-1} (-1)^{m-i}\, \hat{M}_{m,b}, \tag{19} \]

where for $i \le j \le N$

\[ \hat{M}_{j,b} = \mathbb{E}_{\rho}\!\left[ \binom{N-1}{j-1} N\, 4^{j}\, \frac{\Gamma(j) - \Gamma\!\left( j, \frac{\pi\lambda\rho^2}{4} \left( {}_2F_1(b, -\delta, 1 - \delta, -M_i) - 1 \right) \right)}{(\pi\lambda)^{j} \rho^{2j} \left( {}_2F_1(b, -\delta, 1 - \delta, -M_i) - 1 \right)^{j}} \right]. \tag{20} \]

Proof: (19) is obtained using (17), where we define using (11)

\[ \hat{M}_{j,b} = \mathbb{E}_{\rho}\!\left[ \hat{G}_{\mathcal{R}_j|\rho}\!\left[ (1 + M_i y^{\eta})^{-b} \right] \right] \overset{(a)}{=} \mathbb{E}_{\rho}\!\left[ \binom{N-1}{j-1} N\, \frac{\Gamma(j) - \Gamma\!\left( j, \frac{\pi\lambda\rho^2}{2} \int_1^{\infty} \left( 1 - (1 + M_i y^{-\eta})^{-b} \right) y\, dy \right)}{\left( \frac{1}{2} \rho^2 \pi\lambda \int_1^{\infty} \left( 1 - (1 + M_i y^{-\eta})^{-b} \right) y\, dy \right)^{j}} \right]. \]

We obtain (a) using (18), and (20) follows by $y \to g^{-1}$.

IV. RESULTS

In this section, we select the following parameters: $\lambda = 10$, $\eta = 4$, $\beta = 0$ and $N = 2$, unless stated otherwise. Simulations are repeated 50,000 times. Since the power budget is $P = 1$, $P_2 = 1 - P_1$. Unless stated otherwise, Lemma 3 is used for the moments of the CCP in the C-NOMA model.

Fig. 1 verifies the approximation of the MD in (8) using simulations for both schemes with different values of $P_1$. The approximation is tighter (looser) for C-NOMA (E-NOMA) because of its larger (smaller) interference-exclusion disk with radius $\rho - R_i$ ($R_i$). The fraction of UE$_i$ that attain a given CP is always much larger for C-NOMA when compared to E-NOMA, which highlights the superiority of restricting NOMA to cell-center UEs. When $P_1 = 0.5$, 98.9% (92.1%) of UE$_1$ (UE$_2$) achieve a CP of at least 0.5 in C-NOMA, while only 61.5% (19.9%) of UE$_1$ (UE$_2$) achieve the same CP in E-NOMA. Decreasing $P_1$ worsens the performance of UE$_1$ and improves UE$_2$; consequently, decreasing $P_1$ in Fig. 1 increases the fraction of UE$_2$ that attains a certain CP at the expense of reducing the fraction of UE$_1$ achieving a given CP.
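To illustrate how Corollary 1 feeds the moment matching in (8), the following Python sketch evaluates the first two E-NOMA moments for $N = 2$ with the Fig. 1 settings ($\eta = 4$, $\beta = 0$, $\theta_1 = 1$, $\theta_2 = 0.5$, $P_1 = P_2 = 0.5$) and derives the two beta parameters. It is a rough sketch under stated assumptions: all function names are hypothetical, the hypergeometric term is replaced by a midpoint-rule evaluation of the integral identity used in the proof of Corollary 1 (after the substitution $t = y^{-2}$, which is well behaved for $\eta = 4$), and the code is restricted to $N = 2$. It does not attempt to reproduce the plotted curves; evaluating the beta cdf in (8) would additionally need an incomplete beta routine, which is left out here.

```python
import math


def hyp_term(b, m_i, eta, steps=20000):
    """Approximate 2F1(b, -delta; 1 - delta; -m_i), delta = 2/eta, through
    1 + 2*int_1^inf (1 - (1 + m_i*y**-eta)**-b) y dy, rewritten with t = y**-2."""
    width = 1.0 / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * width
        total += (1.0 - (1.0 + m_i * t ** (eta / 2.0)) ** (-b)) / (t * t)
    return 1.0 + width * total


def sic_factor(i, powers, thetas, beta):
    """M_i = max over i <= j <= N of theta_j / P~_j (UE indices are 1-based)."""
    ratios = []
    for j in range(i, len(powers) + 1):
        stronger = sum(powers[: j - 1])   # P_1 .. P_{j-1}
        weaker = sum(powers[j:])          # P_{j+1} .. P_N
        p_tilde = powers[j - 1] - thetas[j - 1] * (stronger + beta * weaker)
        if p_tilde <= 0:
            raise ValueError("infeasible resource allocation")
        ratios.append(thetas[j - 1] / p_tilde)
    return max(ratios)


def enoma_moments_two_users(b, m_1, m_2, eta):
    """b-th CCP moments of UE1 and UE2 from (14) and (15) with N = 2."""
    n = 2
    f_1 = hyp_term(b, m_1, eta)
    f_2 = hyp_term(b, m_2, eta)
    moment_ue1 = n / (n - 1 + f_1)                      # tilde M_{1,b} at M_1
    moment_ue2 = n / (n - 2 + f_2) - n / (n - 1 + f_2)  # tilde M_{2,b} - tilde M_{1,b} at M_2
    return moment_ue1, moment_ue2


def beta_parameters(first, second):
    """Parameters of the matched beta distribution used in (8)."""
    beta_i = (first - second) * (1.0 - first) / (second - first ** 2)
    return beta_i * first / (1.0 - first), beta_i


powers, thetas = [0.5, 0.5], [1.0, 0.5]
m_1 = sic_factor(1, powers, thetas, beta=0.0)
m_2 = sic_factor(2, powers, thetas, beta=0.0)
mean_1, mean_2 = enoma_moments_two_users(1, m_1, m_2, eta=4.0)
sec_1, sec_2 = enoma_moments_two_users(2, m_1, m_2, eta=4.0)
for name, mean, sec in (("UE1", mean_1, sec_1), ("UE2", mean_2, sec_2)):
    print(name, "SCP:", mean, "variance:", sec - mean ** 2,
          "beta parameters:", beta_parameters(mean, sec))
```

The restriction to $N = 2$ keeps the sums in (14) down to a single correction term, which makes the sketch easy to check by hand against (15).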
Fig. 1. MD vs. $\alpha$ with $\theta_1 = 1$ and $\theta_2 = 0.5$. Solid lines represent $P_1 = 0.5$, dashed $P_1 = 0.1$, markers show Monte Carlo simulations.

Fig. 2. SCP and variance of the MD vs. $\theta$ (identical target rate for all UEs) with $P_1 = 1/3$ for the C-NOMA scheme using the exact and approximate moments of CP.

Fig. 3. SCP and variance of the MD vs. $\theta$ (identical target rate for all UEs) with $P_1 = 1/3$ for both schemes. Solid lines are for $\beta = 0$, dashed for $\beta = 0.1$, dash-dot for $\beta = 0.2$. Note: the weakest NOMA UE is unaffected by $\beta$.

Fig. 4. SCP and variance of the MD vs. $\theta_1$. For C-NOMA: TMR = 0.1 (black) uses $P_2 = 0.18$ and $\theta_2 = -9$ dB, TMR = 0.4 (blue) uses $P_2 = 0.54$ and $\theta_2 = -0.7$ dB. For E-NOMA: TMR = 0.1 (red) uses $P_2 = 0.47$ and $\theta_2 = -7$ dB.

Fig. 4 plots the mean and variance of the MD for an optimized power and rate adaptation for UE$_2$ such that the total rate is maximized subject to a threshold minimum rate (TMR) constraint. The rate of a UE is defined as the SCP times target rate. RA is done according to the algorithm in [3] and results in UE$_2$ having rate equal to the TMR. We also plot the rate of UE$_1$ in Fig. 4. In C-NOMA (and E-NOMA, not shown for brevity), increasing the TMR increases $\sigma_2^2$ while the peak $\sigma_1^2$ occurs at lower $\theta_1$ but does not change in value. When the TMR is 0.1, the SCP of UE$_2$ and $\sigma_2^2$ are worse for E-NOMA. Although the peak $\sigma_1^2$ is higher for C-NOMA than E-NOMA, at the optimum $\theta_1$ that maximizes the rate of UE$_1$, $\sigma_1^2$ is lower for C-NOMA. Other than highlighting the superiority of the C-NOMA scheme, this also emphasizes the importance of optimum RA for not just the SCP, but also for higher moments of the MD.

V. CONCLUSION

We study the MD of the CCP of NOMA UEs distributed according to two models. Closed-form expressions for the moments of the MD in the E-NOMA scheme are derived. The C-NOMA scheme requires a triple integral so we propose approximate moments that reduce to a single integration. Our results show that employing NOMA for cell-center users is significantly more beneficial than using it for all UEs in a cell, thereby motivating the works of [2] and [3]. We also emphasize the importance of RA in NOMA.
Fig. 2 plots the mean and variance of the MD for the NOMA R EFERENCES
UEs in the C-NOMA scheme. We compare using the moments
obtained with and without the approximations A1 and A2. [1] D. Tse and P. Viswanath, Fundamentals of Wireless Communication.
Cambridge, U.K.: Cambridge Univ. Press, 2004.
We observe that the approximation is tight for the SCP and [2] K. S. Ali et al., “Downlink non-orthogonal multiple access (NOMA) in
overestimates the variance, particularly for UE2 near the peak. Poisson networks,” IEEE Trans. Commun., to be published.
Fig. 3 plots the mean and variance of the MD of the UEs for [3] K. S. Ali et al., “Analyzing non-orthogonal multiple access (NOMA) in
both schemes using identical RA. We observe that C-NOMA downlink Poisson cellular networks,” in Proc. IEEE Int. Conf. Commun.
outperforms the E-NOMA scheme in terms of both SCP and (ICC), 2018, pp. 1–6.
[4] H. Tabassum et al.,“Modeling and analysis of uplink non-orthogonal
variance. Increasing β deteriorates performance of the non- multiple access (NOMA) in large-scale cellular networks using Poisson
weakest UEs, decreasing SCP and increasing variance. For cluster processes,” IEEE Trans. Commun., vol. 65, no. 8, pp. 3555–3570,
a given β, the higher SCP of the C-NOMA scheme can be Aug. 2017.
attributed to the fact that the UEs are closer to the BS on [5] K. S. Ali et al., “Non-orthogonal multiple access for large-scale
average than the E-NOMA scheme. The lower variance is also 5G networks: Interference aware design,” IEEE Access, vol. 5,
pp. 21204–21216, 2017.
due to the limited vicinity leading to lower disparity than the [6] M. Haenggi, “The meta distribution of the SIR in Poisson bipolar
E-NOMA model. Furthermore, σi2 peaks at high θ for the and cellular networks,” IEEE Trans. Wireless Commun., vol. 15, no. 4,
C-NOMA scheme (corresponding to low SCP); which is not pp. 2577–2589, Apr. 2016.
the case for the E-NOMA scheme. This implies the existence [7] M. Salehi et al. Meta Distribution of the SIR in Large-Scale Uplink and
of θ with high SCP and low σi2 in C-NOMA, thereby high- Downlink NOMA Networks. Accessed: Apr. 2018. [Online]. Available:
https://arxiv.org/abs/1804.02710
lighting its superiority with careful RA. The C-NOMA is also [8] R. K. Ganti and M. Haenggi, “Asymptotics and approximation of the
a more consistent scheme as both SCP and variance are better SIR distribution in general cellular networks,” IEEE Trans. Wireless
for UE1 than UE2 ; this is not the case for the E-NOMA. Commun., vol. 15, no. 3, pp. 2130–2143, Mar. 2016.
PAPR Reduction Based on Parallel Tabu Search for Tone Reservation in OFDM Systems
Yajun Wang, Renjie Zhang, Jun Li, Senior Member, IEEE, and Feng Shu, Member, IEEE
Abstract—In this letter, we focus on the high peak to average power ratio (PAPR) reduction problem in tone reservation-based orthogonal frequency division multiplexing systems. We first propose a parallel Tabu search (PTS)-based scheme to find a sub-optimal peak reduction tone (PRT) set. After finding the sub-optimal PRT set, we apply it in an adaptive iterative clipping and filtering (AICF) method for PAPR reduction. Furthermore, PAPR reduction and bit error rate (BER) performances are compared among the AICF method, the adaptive scaling, the adaptive amplitude clipping and the fast iterative shrinkage-thresholding algorithm schemes. Simulation results verify that the PTS-based PRT scheme can obtain better secondary peaks with lower computational complexity, and the AICF scheme effectively reduces PAPR with a faster convergence speed while its BER performance is only slightly worse than for the existing methods.
Index Terms—OFDM, PAPR, tone reservation, parallel tabu search.
Manuscript received July 21, 2018; revised September 9, 2018 and October 6, 2018; accepted November 5, 2018. Date of publication November 9, 2018; date of current version April 9, 2019. This work was supported in part by the National Natural Science Foundation of China under Grant 61727802, Grant 61872184, and Grant 61501238, in part by the Jiangsu Provincial Science Foundation under Project BK20150786, in part by the Specially Appointed Professor Program in Jiangsu Province, 2015, in part by the Fundamental Research Funds for the Central Universities under Grant 30916011205, and in part by the Open Research Fund of National Mobile Communications Research Laboratory, Southeast University, under Grant 2017D04. The associate editor coordinating the review of this paper and approving it for publication was J. Mietzner. (Corresponding author: Jun Li.)
Y. Wang is with the Department of Information and Computational Sciences, Jiangsu University of Science and Technology, Zhangjiagang 215600, China (e-mail: wangyj1859@just.edu.cn).
R. Zhang and F. Shu are with the School of Electronic and Optical Engineering, Nanjing University of Science and Technology, Nanjing 210094, China (e-mail: renjie.zhang@njust.edu.cn; shufeng@njust.edu.cn).
J. Li is with the School of Electronic and Optical Engineering, Nanjing University of Science and Technology, Nanjing 210094, China, also with the National Mobile Communications Research Laboratory, Southeast University, Nanjing, China, and also with the School of Computer Science and Robotics, National Research Tomsk Polytechnic University, Tomsk 634050, Russia (e-mail: jun.li@njust.edu.cn).
Digital Object Identifier 10.1109/LWC.2018.2880432
I. INTRODUCTION
ORTHOGONAL frequency division multiplexing (OFDM) has numerous advantages because its channels are orthogonal to each other, which helps to avoid narrow-band interference and multi-path fading [1]. However, OFDM also has drawbacks, such as a high peak to average power ratio (PAPR). In conjunction with a non-linear power amplifier, a high PAPR degrades the orthogonality of the transmitted signals. Numerous conventional methods [2] have been proposed to reduce a high PAPR. Among them, the tone reservation (TR) method, first proposed by Tellado [3], is simple and efficient and does not require the transmission of side information.
The PAPR reduction performance of TR-clipping-based schemes depends on the choice of the peak reduction tone (PRT) set and of the clipping threshold. However, finding the optimal PRT set is a nondeterministic polynomial-time (NP)-hard problem [4]. Sub-optimal solutions are therefore preferred, such as the genetic algorithm (GA)-PRT [4], cross entropy (CE)-PRT [5] and invasive weed optimization and particle swarm optimization (IWOPSO)-PRT [6]. These methods converge slowly and have high computation complexity (CC).
To obtain a low PAPR signal, an adaptive scaling (AS)-TR algorithm [7] was proposed to reduce PAPR using a pre-determined clipping threshold. However, for the AS-TR it is hard to select an optimal clipping threshold. An adaptive amplitude clipping (AAC)-TR [4] method was proposed to improve AS-TR performance. A fast iterative shrinkage-thresholding algorithm (FISTA) scheme [8] was presented to address PAPR with power control on the magnitude of the reserved tones. Although the AAC-TR and FISTA schemes achieve better PAPR performance, their CCs are also higher. The equation-based approach considers only in-band distortion at the Nyquist sampling rate and does not take the effect of out-of-band noise into account [9].
In this letter, we focus on finding the PRT set efficiently and on reducing the CC of PAPR reduction methods. Specifically, we first propose a novel method based on parallel tabu search (PTS) to search for a sub-optimal PRT set. Then, we propose an AICF algorithm to reduce PAPR after finding the sub-optimal PRT set. We compare the AICF algorithm with traditional methods in terms of PAPR reduction and bit error ratio (BER). Simulation results validate the effectiveness of our proposed scheme in PAPR reduction and BER performance.
II. OFDM SYSTEMS AND TONE RESERVATION
A. OFDM Systems and PAPR
In OFDM systems, N independent data symbols $X_k$ are modulated through phase shift keying (PSK) or quadrature amplitude modulation (QAM) on a set of N orthogonal subcarriers with the oversampling factor J. The OFDM block is expressed as $X = [X_0, X_1, \ldots, X_{N-1}]^T$, where $(\cdot)^T$ means the transpose of a vector. After inverse fast Fourier transform (IFFT), the discrete time domain OFDM signal is generated as
$$x_m = \frac{1}{\sqrt{JN}} \sum_{k=0}^{JN-1} X_k \, e^{j 2\pi m k / (JN)}, \quad m = 0, 1, \ldots, JN-1. \qquad (1)$$
WANG et al.: PAPR REDUCTION BASED ON PTS FOR TR IN OFDM SYSTEMS 577
The PAPR of x is defined as the ratio of the maximum instantaneous power to the average power,
$$\mathrm{PAPR}(x) = \frac{\max_{0 \le m < JN} |x_m|^2}{E[|x_m|^2]}, \qquad (2)$$
where $x = [x_0, x_1, \ldots, x_{JN-1}]^T$. The complementary cumulative distribution function (CCDF) is used to measure the capability of PAPR reduction. The CCDF is the probability that the PAPR of an OFDM symbol exceeds the predetermined threshold $\mathrm{PAPR}_0$,
$$\mathrm{CCDF} = \Pr(\mathrm{PAPR} > \mathrm{PAPR}_0). \qquad (3)$$
B. Tone-Reservation
For the TR-based technology, M tones are selected to be a PRT set for PAPR reduction. The peak-cancelling signal $c = [c_0, c_1, \ldots, c_{JN-1}]^T$ is produced by the M reserved tones. Then the peak reduced signal a is generated by the peak-cancelling signal c plus the original signal x in the time domain,
$$a = x + c = Q(X + C), \qquad (4)$$
where Q is the IFFT matrix. To prevent signal distortion, C and X are orthogonal in the frequency domain.
With TR [3], PAPR is redefined as
$$\mathrm{PAPR}(a) = \frac{\max_{0 \le n < JN} |x_n + c_n|^2}{E[|x_n|^2]}, \qquad (5)$$
where c should be selected to minimize the peak amplitude of the signal a. According to [3], c is updated as follows:
$$c^{(k+1)} = c^{(k)} - \lambda_k \, p[((j - j_k))_{JN}], \qquad (6)$$
where $\lambda_k$ is a scaling factor, $p[((j - j_k))_{JN}]$ is a circular shift of p to the right by a value $j_k$, and $p = QP$ is a time domain kernel. The frequency domain kernel $P = [P_0, P_1, \ldots, P_{N-1}]^T$, which is closely related to the PRT set, is defined by
$$P_n = \begin{cases} 0, & n \in \mathcal{R}^C, \\ 1, & n \in \mathcal{R}, \end{cases} \qquad (7)$$
where $\mathcal{R}$ means the index set of the PRT and $\mathcal{R}^C$ is the complementary set of $\mathcal{R}$ in $\mathcal{N} = \{0, 1, \ldots, N-1\}$. To obtain the optimal PRT set or the optimal p, we need to address the optimization problem:
$$\mathcal{R}^* = \arg\min_{\mathcal{R}} \left\| [p_1, \ldots, p_{JN-1}]^T \right\|_\infty. \qquad (8)$$
The secondary peak (SP) of the time domain kernel p is used as a metric to evaluate the performance of the PRT set. It is defined as
$$\mathrm{SP} = \left\| [p_1, \ldots, p_{JN-1}]^T \right\|_\infty. \qquad (9)$$
According to (5)-(9), the performance for PAPR reduction relies on the choice of the time domain kernel p or frequency domain kernel P. We will present a parallel tabu search algorithm to solve (8) in the next section.
III. PARALLEL TABU SEARCH ALGORITHM AND ADAPTIVE ITERATIVE CLIPPING AND FILTERING
A. Parallel Tabu Search Algorithm
Tabu search (TS) was first proposed by Glover [10]. It evolved from classical local search methods to deal with combinatorial optimization problems. Like previous local search methods, the TS algorithm explores the neighbourhood to find a better solution. The neighbourhood is generated according to one Hamming distance between the initial solution and neighbour solutions. Unlike them, TS keeps a tabu list to avoid local search cycling. The tabu list contains all best solutions of recent searches. The solutions included in the tabu list will not be searched in the following iterations.
In the parallel tabu search (PTS) algorithm, several TS algorithms run simultaneously, which keeps the TS algorithm from falling into a local optimal solution and makes it possible to find a better solution. Compared with the TS, the PTS can shorten the search time and improve the search efficiency. In this letter, we combine the crossover and mutation operations from the GA with the PTS to further produce better solutions and accelerate the rate of search. To the best of our knowledge, this is the first application of the PTS to solve the PRT set problem. In order to better understand the PTS and GA algorithms, readers can refer to [4], [10], and [11].
B. PTS-Based PRT Position Search
In this subsection, we use the PTS to find the sub-optimal PRT set. In the PTS, four TSs work simultaneously to find better PRT sets.
Firstly, four initial binary sequences are randomly generated. Each sequence consists of M ones and N − M zeros. The positions of the PRT set are marked by ones. We then compute the SP of every initial binary sequence. Secondly, each initial sequence undergoes TS independently. The process of TS is to find the sequence of smallest SP in the neighbourhood solution space. Thirdly, these sequences replace the initial four sequences, respectively. Fourthly, the tabu list is updated by adding the sequence of smallest SP in each iteration. Therefore, the solutions in the tabu list are not searched in the following iterations even if these solutions are in the neighbourhood solution space. The length of the tabu list is U. After $K_1$ iterations, the four parallel tabu searches end and yield four better sequences.
Then, crossover and mutation operations are used to generate new solutions from the better sequences above with crossover probability $p_c$ and mutation probability $p_m$. After the crossover and mutation operations, the best sequence with the smallest SP is selected.
The whole procedure above is called a cycle. The PTS algorithm finds the best sequence after K cycles. Eventually, the PRT set is provided as output. In conclusion, the PRT search algorithm based on the PTS is summarized in Algorithm 1.
Algorithm 1 PTS Algorithm for PRT Set
1: Input N, M, U, $K_1$, $p_c$, $p_m$ and K.
2: Randomly generate four initial sequences. Do the TS for every sequence.
3: Update initial sequences and tabu list.
4: Repeat $K_1$ times and obtain four better sequences.
5: Do crossover and mutation to the four better sequences above. Select the best sequence of smallest SP.
6: Repeat K cycles. Output the final PRT set with the smallest SP.
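To make the search loop of Section III-B more concrete, the following Python sketch evaluates the SP in (9) for a candidate PRT set and runs several independent tabu searches. It is a minimal illustration under stated assumptions, not the authors' implementation: all function names are hypothetical, the kernel is scaled so that its main peak equals 1, the oversampled spectrum is zero-padded at the end, a neighbour is formed by swapping one reserved tone for one unreserved tone (which keeps M fixed), only a random sample of neighbours is evaluated per iteration, the runs execute one after another rather than in parallel, and the crossover and mutation step is omitted.

```python
import cmath
import random


def secondary_peak(prt, n_sub, oversample):
    """Return max |p_m| for m >= 1, with the kernel scaled so that p_0 = 1."""
    tones = sorted(prt)
    size = n_sub * oversample
    peak = 0.0
    for m in range(1, size):
        acc = sum(cmath.exp(2j * cmath.pi * n * m / size) for n in tones)
        peak = max(peak, abs(acc) / len(tones))
    return peak


def tabu_search(start, n_sub, oversample, iters, tabu_len, rng, samples=20):
    """One tabu search run over PRT sets of fixed size."""
    current = frozenset(start)
    best, best_sp = current, secondary_peak(current, n_sub, oversample)
    tabu = [current]
    for _ in range(iters):
        free = [n for n in range(n_sub) if n not in current]
        candidates = []
        for _ in range(samples):
            # Assumed move: swap one reserved tone for one unreserved tone.
            neighbour = (current - {rng.choice(sorted(current))}) | {rng.choice(free)}
            if neighbour not in tabu:
                sp = secondary_peak(neighbour, n_sub, oversample)
                candidates.append((sp, sorted(neighbour)))
        if not candidates:
            continue
        sp, tones = min(candidates)
        current = frozenset(tones)
        tabu = (tabu + [current])[-tabu_len:]  # keep only the most recent solutions
        if sp < best_sp:
            best, best_sp = current, sp
    return sorted(best), best_sp


def parallel_tabu_search(n_sub, n_prt, oversample, runs=4, iters=10, tabu_len=10, seed=0):
    """Run several independent tabu searches and keep the smallest SP found."""
    rng = random.Random(seed)
    results = [
        tabu_search(rng.sample(range(n_sub), n_prt), n_sub, oversample, iters, tabu_len, rng)
        for _ in range(runs)
    ]
    return min(results, key=lambda item: item[1])
```

For example, `parallel_tabu_search(16, 4, 2)` returns a sorted list of four tone indices together with its SP. The direct evaluation of the kernel costs on the order of JN·M operations per candidate, so a practical implementation would likely use an FFT instead.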
C. AICF Method to Reduce PAPR
After finding the sub-optimal PRT set by the PTS algorithm, we apply the AICF algorithm to reduce PAPR. The AICF algorithm is similar to the AS-TR [7] and AAC-TR methods [4], which also consist of iterative clipping and filtering operations.
Firstly, the initial clipping ratio (CR) $\gamma_i$ is set and the initial clipping threshold $T_i$ is computed as
$$T_i = \gamma_i * P_{av}, \qquad (10)$$
where $P_{av}$ is the average power of signal $x_i$ and i is the iteration number. Then the AICF clips the time signal $x_i$ to get a clipping noise $v_i$ when the amplitude of $x_i$ exceeds $T_i$,
$$v_i = (|x_i| - T_i)\, e^{j \arg x_i}, \qquad (11)$$
where $|x_i|$ denotes the amplitude of $x_i$ and $\arg x_i$ represents the phase of $x_i$.
Secondly, because the clipping noise $v_i$ must meet the tone reservation constraints, it is first converted to the frequency domain signal $V_i$ by a fast Fourier transform (FFT), i.e., $V_i = \mathrm{FFT}(v_i)$. Thirdly, $V_i$ is projected onto the reserved PRT set, which removes the out-of-band components of $V_i$; this can be achieved by a filtering operation. Specifically, we construct a length-JN filter F whose components are 1 in the positions of the PRT set and 0 in other locations. As $V_i$ passes the filter F, we obtain a filtered clipping noise $H_i = V_i * F$ in the frequency domain. Fourthly, by carrying out an IFFT for $H_i$, we obtain a filtered clipping noise $h_i$ in the time domain, i.e., $h_i = \mathrm{IFFT}(H_i)$. Then, the PAPR reduced signal is given by
$$x_{i+1} = x_i - h_i. \qquad (12)$$
The CR $\gamma_i$ is updated [12] by
$$\gamma_i = \frac{A_{max}}{A_{ave}}, \qquad (13)$$
where $A_{max}$ and $A_{ave}$ denote the maximum and average amplitude of $x_{i+1}$, respectively. The proposed AICF-based PAPR reduction algorithm is summarized in Algorithm 2.
Algorithm 2 AICF-Based Algorithm for PAPR Reduction
1: Input OFDM symbols, initial $\gamma_i$, i, maximal iteration number $i_{max}$.
2: Calculate $P_{av}$ and $T_i$.
3: Calculate $v_i$, $V_i$, $H_i$ and $h_i$.
4: Get PAPR reduced signal $x_{i+1}$ by (12).
5: Update the CR $\gamma_i$ by (13).
6: Set i = i + 1. The algorithm ends when i = $i_{max}$.
D. Complexity Analysis of AICF Algorithm
In the AICF algorithm, the main computational costs depend on the computations of the frequency domain noise $V_i$ and the clipping noise $h_i$ after filtering. These computations require an FFT-IFFT pair, so the complexity of the AICF algorithm is O(JN log(JN)), which is the same as for the AAC-TR and AS-TR algorithms. However, the CR γ of the AAC-TR algorithm is invariable, whereas the CR γ of the AICF algorithm is related to the maximum amplitude and average amplitude of the PAPR reduced signal. This leads to the faster convergence of the AICF algorithm, which will be verified in the next section.
TABLE I
SECONDARY PEAK (SP) AND COMPUTATION COMPLEXITY (CC) COMPARISON
Fig. 1. Comparison of average secondary peak for different PRT sets.
Fig. 2. CCDF comparison with PTS PRT.
IV. SIMULATION RESULTS
To compare the performance of the PTS and other algorithms, i.e., GA, CE and IWOPSO, $10^5$ OFDM symbols with N = 512, M = 32 and J = 4 are randomly generated. For the parameters of GA, CE and IWOPSO in the simulation, we refer to [4]–[6], respectively. The PTS takes 4 independent tabu searches, $K_1$ = 10 iterations for each tabu search, the length of the tabu list is U = 10, crossover probability $p_c$ = 0.9, mutation probability $p_m$ = 0.05, 6 crossover and mutation operations are carried out within each cycle, and the number of total cycles is K = 20.
The PRT set obtained by the PTS is PTS-PRT = {9, 20, 21, 26, 56, 88, 93, 136, 151, 153, 176, 177, 245, 273, 285, 294, 295, 299, 313, 314, 324, 330, 339, 370, 391, 404, 406, 428, 472, 478, 485, 495}.
From Table I, the SP achieved by the CE algorithm is the smallest one among these four algorithms. However, the CC gap between the CE and PTS algorithms is 20400 − 920 = 19480, while the SPs of the two algorithms only have a gap of 0.0043, which can be ignored. Therefore, the proposed method based on the PTS is better than the other three algorithms in finding the PRT set because it has the lowest CC.
In Fig. 1, the average SPs of the four algorithms are compared. Clearly, the PTS algorithm converges fastest and the
CC of the PTS one is the lowest within 100 iterations among the four algorithms. Above 100 iterations, the gap between the average SPs of the four algorithms is small. Our aim is to quickly obtain a better PRT set with a lower computational complexity, so the PTS algorithm is the best choice among the four schemes.
Fig. 3. Average PAPR comparison with different iteration numbers.
Fig. 4. BER performance comparison among various methods.
In Fig. 2, the PAPR reduction performances of various schemes are compared with the identical PTS-PRT. The maximum number of iterations of the AICF-TR and FISTA schemes is 3, while that for the AAC-TR and AS-TR is 10. Other parameters of the AAC-TR, AS-TR and FISTA are taken from [4], [7], and [8], respectively. When the CCDF equals $10^{-4}$, the PAPR of the original OFDM signal, the AS-TR, AAC-TR, FISTA and AICF-TR are 11.9 dB, 9.3 dB, 8.6 dB, 6 dB and 4.5 dB, respectively. Clearly, the PAPR reduction performance of the AICF-TR scheme is superior to the others.
In Fig. 3, the average PAPRs of the four schemes are compared as the iteration number increases. The initial CR γ = 4.5 dB is the same for the AAC-TR, AS-TR and AICF-TR. When the iteration number is 2, the average PAPR of the AICF-TR converges to 4.5 dB while the AAC-TR, AS-TR and FISTA are still decreasing. When the iteration number is 10, the average PAPRs are 4.5 dB, 6.2 dB, 7.2 dB and 4.2 dB for the AICF-TR, AAC-TR, AS-TR and FISTA, respectively. Fig. 2 and Fig. 3 show that the AICF-TR algorithm can achieve better PAPR reduction, lower CC and a faster convergence rate than the AAC-TR, AS-TR and FISTA schemes within 6 iterations.
To evaluate the BER performance of the whole OFDM system, the processed signal passes through a solid-state power amplifier (SSPA) model [1],
$$S_o = \frac{|S_i|}{\left[1 + \left(\frac{|S_i|}{A}\right)^{2p}\right]^{\frac{1}{2p}}} \, e^{j\Theta}, \qquad (14)$$
where $S_o$ and $S_i$ mean the output and input signals, respectively. p is 3 and A is 0.6 in the simulation.
Fig. 4 shows the comparison of the BER performance among the optimized signals over an additive white Gaussian noise (AWGN) channel. With the ideal SSPA (A taking max $|S_i|$ and p → ∞), the original signal reaches a BER of $10^{-6}$ when the signal-to-noise ratio Eb/No is 10.3 dB. However, with the SSPA model specified above, Eb/No reaches 11.7 dB, 11.7 dB, 11.6 dB and 11.5 dB for the AICF-TR, AAC-TR, AS-TR and FISTA algorithms, respectively. At the BER of $10^{-6}$, the BER performance of the AICF-TR is the same as that of the AAC-TR, while it is inferior by 0.1 dB and 0.2 dB compared with the AS-TR and FISTA schemes.
V. CONCLUSION
In this letter, we propose a novel method based on parallel tabu search (PTS) to find the sub-optimal PRT set for PAPR reduction in OFDM systems. Compared with existing algorithms, the PTS scheme can find better PRT sets with lower computational complexity. Then, we adopt the AICF algorithm to decrease PAPR. Simulation results reveal that the proposed AICF algorithm achieves better PAPR reduction, a faster convergence rate and a BER comparable to that of the AS-TR and FISTA algorithms.
REFERENCES
[1] R. Van Nee and R. Prasad, OFDM for Wireless Multimedia Communications. Boston, MA, USA: Artech House, Mar. 2000.
[2] T. Jiang and Y. Wu, “An overview: Peak-to-average power ratio reduction techniques for OFDM signals,” IEEE Trans. Broadcast., vol. 54, no. 2, pp. 257–268, Jun. 2008.
[3] J. Tellado, “Peak to average power reduction for multicarrier modulation,” Ph.D. dissertation, Dept. Elect. Eng., Stanford Univ., Stanford, CA, USA, 2000.
[4] Y. Wang, W. Chen, and C. Tellambura, “Genetic algorithm based nearly optimal peak reduction tone set selection for adaptive amplitude clipping PAPR reduction,” IEEE Trans. Broadcast., vol. 58, no. 3, pp. 462–471, Sep. 2012.
[5] J.-C. Chen and C.-P. Li, “Tone reservation using near-optimal peak reduction tone set selection algorithm for PAPR reduction in OFDM systems,” IEEE Signal Process. Lett., vol. 17, no. 11, pp. 933–936, Nov. 2010.
[6] H.-L. Hung, C.-H. Cheng, and Y.-F. Huang, “PAPR reduction of OFDM using invasive weed optimization-based optimal peak reduction tone set selection,” EURASIP J. Wireless Commun. Netw., vol. 244, pp. 1–12, Oct. 2013.
[7] L. Wang and C. Tellambura, “Analysis of clipping noise and tone-reservation algorithms for peak reduction in OFDM systems,” IEEE Trans. Veh. Technol., vol. 57, no. 3, pp. 1675–1694, May 2008.
[8] Y. J. Wang, S. Xie, and Z. B. Xie, “FISTA-based PAPR reduction method for tone reservation’s OFDM system,” IEEE Wireless Commun. Lett., vol. 7, no. 3, pp. 300–303, Jun. 2018.
[9] N. Bibi, A. Kleerekoper, N. Muhammad, and B. Cheetham, “Equation-method for correcting clipping errors in OFDM signals,” SpringerPlus, vol. 5, no. 1, p. 931, Jun. 2016, doi: 10.1186/s40064-016-2413-0.
[10] F. Glover, “Future paths for integer programming and links to artificial intelligence,” Comput. Oper. Res., vol. 13, no. 3, pp. 533–549, May 1986.
[11] J. Hou, C. Tellambura, and J. Ge, “Tone injection for PAPR reduction using parallel Tabu search algorithm in OFDM systems,” in Proc. IEEE Glob. Commun. Conf., Anaheim, CA, USA, 2012, pp. 4899–4904.
[12] K. Anoh, C. Tanriover, and B. Adebisi, “On the optimization of iterative clipping and filtering for PAPR reduction in OFDM systems,” IEEE Access, vol. 5, pp. 12004–12013, 2017.
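As a companion to Algorithm 2 in Section III-C of this letter, the following Python sketch performs one clipping-and-filtering pass, i.e., (10) to (12), and also returns the ratio in (13) for the updated signal. It is a hedged illustration only: the function names are hypothetical, γ is treated as a linear ratio rather than a value in dB, the threshold follows (10) exactly as printed, the reserved tones are given as bin indices of the length-JN spectrum, and a direct O(n²) DFT stands in for the FFT so that the block needs no external library. How the returned ratio is fed back over the iterations of Algorithm 2 is left to the caller.

```python
import cmath
import math


def dft(values, inverse=False):
    """Direct O(n^2) DFT, scaled by 1/sqrt(n) in both directions."""
    n = len(values)
    sign = 1 if inverse else -1
    return [
        sum(v * cmath.exp(sign * 2j * cmath.pi * k * m / n) for k, v in enumerate(values))
        / math.sqrt(n)
        for m in range(n)
    ]


def papr_db(x):
    """PAPR in dB: peak sample power over mean sample power, as in (2)."""
    powers = [abs(v) ** 2 for v in x]
    return 10.0 * math.log10(max(powers) / (sum(powers) / len(powers)))


def clip_filter_step(x, prt_bins, gamma):
    """One pass of (10)-(12); also returns the ratio in (13) for the new signal."""
    n = len(x)
    p_av = sum(abs(v) ** 2 for v in x) / n
    threshold = gamma * p_av  # (10), as printed
    noise = [
        (abs(v) - threshold) * cmath.exp(1j * cmath.phase(v)) if abs(v) > threshold else 0j
        for v in x
    ]  # (11): clipping noise, zero where the amplitude stays below the threshold
    spectrum = dft(noise)  # V_i = FFT(v_i)
    kept = [s if k in prt_bins else 0j for k, s in enumerate(spectrum)]  # H_i: PRT bins only
    h = dft(kept, inverse=True)  # h_i = IFFT(H_i)
    x_next = [a - b for a, b in zip(x, h)]  # (12)
    amps = [abs(v) for v in x_next]
    gamma_next = max(amps) / (sum(amps) / n)  # (13)
    return x_next, gamma_next
```

Because the correction `h` only occupies the reserved bins, the data-bearing tones of `x_next` are identical to those of `x`, which is the tone reservation constraint mentioned in Section III-C. `papr_db` can be applied before and after the step to inspect the effect on a given symbol.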
Resource Allocation in UAV-Assisted M2M Communications for Disaster Rescue
Xilong Liu, Student Member, IEEE, and Nirwan Ansari, Fellow, IEEE
Abstract—Internet of Things (IoT) promotes the awareness about our world and eases human life. In IoT, machine-to-machine (M2M) communications empowers machine-type-devices (MTDs) to cooperatively exchange information and perform actions. Current M2M communications primarily leverages cellular networks to provision reliable services. When a disruptive disaster destroys the local cellular infrastructure, unmanned aerial vehicle mounted base stations (UBS) can be deployed to assist rescue by facilitating M2M communications among the human portable/wearable MTDs (HMTDs). The UBS network access and resource allocation scheme is proposed to maximize the number of HMTDs to establish communications. We have validated the proposed scheme through extensive simulations.
Index Terms—Internet of Things, machine-to-machine communications, unmanned aerial vehicle mounted base station, disaster rescue, network access and resource allocation.
Manuscript received September 21, 2018; revised October 31, 2018; accepted November 5, 2018. Date of publication November 9, 2018; date of current version April 9, 2019. This work was supported by NSF under Grant CNS-1814748. The associate editor coordinating the review of this paper and approving it for publication was M. Nafie. (Corresponding author: Xilong Liu.)
The authors are with the Advanced Networking Laboratory, Helen and John C. Hartmann Department of Electrical and Computer Engineering, New Jersey Institute of Technology, Newark, NJ 07102 USA (e-mail: xl249@njit.edu; nirwan.ansari@njit.edu).
Digital Object Identifier 10.1109/LWC.2018.2880467
I. INTRODUCTION
INTERNET of Things (IoT) promotes the level of awareness about our world and helps run modern lives more efficiently. Physical things (referred to as machine-type-devices (MTDs) in IoT) are embedded with sensing and transmission ability, and are thus enabled to gather and share information. Home appliances, smart meters, vehicles, sensors, and human portable/wearable devices are examples of such MTDs, which are expected to be the main users in IoT [1]. Machine-to-machine (M2M) communications empowers various applications, such as health care, proximal social networking, facilities monitoring, transportation, security and disaster rescue.
It has been predicted that by 2020, more than 50 billion MTDs will be connected for IoT [2] and in a smart city, the number of MTDs per square kilometer will be in the order of tens of thousands [3]. Owing to the pervasiveness of MTDs and IoT applications, most current M2M communications relies on cellular (or licensed Low Power Wide Area (LPWA)) infrastructure because base stations (BSs) provision centralized scheduling and resource management, interference mitigation, quality of service (QoS) guarantees and secured services. However, when a disaster destroys the existing cellular infrastructure, local IoT services will be disrupted, thus hampering people from seeking help.
Recently, unmanned aerial vehicles (UAVs) have been incorporated into the cellular system to assist mobile networks. UAV-mounted-BSs (UBSs) can provide coverage over a radius of 5 km and enable high quality voice calls and real-time video streaming [4]. In addition, UBS provisions faster and flexible deployment and connectivity, and can boost network throughput and enhance QoS, attributed to the presence of line-of-sight links.
Since the conventional network access scheme (i.e., Random Access Scheme (RAS)) in most cellular systems is not designed for disaster rescue [5] and the resource allocation schemes in UAV-assisted M2M communications are not well investigated, in this letter, we propose to deploy UBS in the disaster area to assist disaster rescue by facilitating M2M communications for the human portable/wearable MTDs (HMTDs). In this disaster rescue scenario, the objective is to enable the maximum number of HMTDs to establish connectivity and send rescue messages with required data rates. Hence, we propose the UBS Network Access and Resource Allocation (UNARA) scheme to maximize the number of HMTDs in provisioning transmissions. As the UBS carries a limited capacity battery, the energy consumption of the UBS is a primary concern and the hovering time is a key factor for performing rescue, and so minimizing the transmission power for relaying data at the UBS to tangibly extend the hovering time is critical and considered in this letter. We have also validated the proposed UNARA scheme through extensive simulations.
II. SYSTEM MODEL
The UBS deployment scenario is depicted in Fig. 1. Given the location (latitude and longitude coordinate) of the disaster area, which loses local cellular infrastructure, the UBS first reaches the airspace above the particular area. Once the altitude of the UBS is determined (the altitude selection is beyond the scope of this letter), its effectively covered HMTDs are determined [6]. UBS will select the unused (empty) cellular spectrum band in that area (to avoid the mutual interference if there exist surrounding working BSs) to facilitate local M2M communications among HMTDs. The UBS communication system utilizes Long-Term Evolution (LTE) connectivity and is time-slotted.
Within the coverage of a UBS, we assume that the real-time channel state information (CSI) between the HMTDs, and that between the HMTDs and the UBS are known by the UBS [7]. In every time slot, according to CSI, the HMTDs in the coverage are partitioned into two groups. If the channel gain between the source and destination (SD) HMTD is better than that between the source HMTD and the UBS, we classify this SD pair into the direct M2M group (for simplicity, D
LIU AND ANSARI: RESOURCE ALLOCATION IN UAV-ASSISTED M2M COMMUNICATIONS FOR DISASTER RESCUE 581
group); the source HMTD in this D group will directly transmit data to its destination HMTD. Otherwise, this SD pair is classified into the relay M2M group (for simplicity, R group); the source HMTD will first send data to the UBS, and then the UBS will forward the data to the destination HMTD.
Fig. 1. The UAV-assisted M2M communications network.
A. Direct M2M Communications Model
Denote $D_i$ as the index of HMTD SD pairs in the D group. Since the transmissions of the HMTDs adopt the single-carrier frequency division multiple access (SC-FDMA), within the coverage of the UBS, there is no interference among the SD pairs. According to the Shannon Theorem [8], the transmission data rate of the $D_i$th SD pair is
$$C^{SD}_{D_i} = W^{S}_{D_i} \log_2\left(1 + \frac{P^{S}_{D_i} g^{SD}_{D_i}}{N_0 W^{S}_{D_i}}\right). \qquad (1)$$
Here, $P^{S}_{D_i}$ is the $D_i$th source HMTD’s transmission power, $g^{SD}_{D_i}$ the channel gain between the $D_i$th source and destination HMTD, $N_0$ the power spectral density of additive white Gaussian noise (AWGN), and $W^{S}_{D_i}$ the bandwidth allocated to the $D_i$th source HMTD.
to the $R_h$th destination HMTD [9], and hence the effective data rate of this SD pair in the R group is
$$C^{SD}_{R_h} = \frac{1}{2} \min\left\{ C^{SU}_{R_h}, C^{UD}_{R_h} \right\}. \qquad (4)$$
III. PROBLEM FORMULATION
When a disaster disrupts the local cellular infrastructure, the most important task is to enable the maximum number of HMTDs to establish connection and send rescue messages with required data rates; this is the objective in our formulated problem. The destination HMTDs in this scenario can be the devices of the rescue team that entered the disaster area, the vehicle-carried BSs at the edge of the disaster area or other HMTDs in the disaster area. In disaster rescue, HMTDs have higher priority in transmission than other types of source MTDs. The UBS first recognizes the HMTDs through the control signal based on the MTDs’ Media Access Control (MAC) addresses, and then allocates network resources to them according to our proposed UNARA scheme.
A. Resource Allocation for the Direct M2M Group
In a time slot, suppose there are I active HMTDs, each indexed by $D_i$, in the D group. In each time slot, $P^{S}_{D_i}$ of each HMTD is assumed fixed and known by the UBS through the control signal. We assume that the required data rate of each SD pair (denoted as $C^{SD}_{D_i^{req}}$) is also fixed and known by the UBS. According to Eq. (1), the UBS resource allocation should fulfill each SD pair’s required data rate $C^{SD}_{D_i^{req}}$, i.e.,
$$C^{SD}_{D_i^{req}} = C^{SD}_{D_i} = W^{S}_{D_i} \log_2\left(1 + \frac{P^{S}_{D_i} g^{SD}_{D_i}}{N_0 W^{S}_{D_i}}\right). \qquad (5)$$
Here, $W^{S}_{D_i}$ (i.e., the $D_i$th SD pair’s required bandwidth) can be calculated based on Eq. (5). $W^{S}_{D_i}$ is the only variable in Eq. (5). For a fixed $P^{S}_{D_i}$, according to Shannon Theorem [8],
B. Relay M2M Communications Model by increasing its transmission bandwidth WDSi , the data rate
will be increased accordingly. It can also be seen from Eq. (5)
Denote Rh as the index of SD pairs in the R group. Since that CDSD is a concave increasing function of W S for a fixed
the orthogonal channels are available, the transmission data i Di
PD S [10].
rate from the Rh th source HMTD to the UBS (SU) [8] is i
Suppose, at the UBS, the total available spectrum band-
S g SU
PR width is WM 2M . In order to enable the maximum number of
h Rh
CRSU = WRSh log2 (1 + ). (2) HMTDs to effectively transmit data in the D group, we pro-
h
N0 WRS
h pose the UBS Network access and Resource allocation (UNR)
Here, PRS is the R th source HMTD’s transmission power, algorithm (i.e., Algorithm 1).
h h
SU the channel gain between the R th source HMTD and By first allocating the spectrum to the HMTDs with rela-
gR h h tively small required bandwidth, the UNR algorithm enables
the UBS, and WRSh the bandwidth allocated to the Rh th source the maximum number of the HMTDs to transmit data.
HMTD. The computational complexity of the UNR algorithm is
U and W U are the transmission power and
Similarly, PR h Rh O(I 2 log I ), and thus the UNR algorithm is a polynomial time
bandwidth of the UBS for serving the Rh th destination HMTD algorithm, i.e., its solution can be obtained efficiently.
(UD), respectively; gRUD is the channel gain between the UBS
h
and the destination HMTD; the data rate is B. Resource Allocation for the Relay M2M Group
U g UD
PR h Rh
As the occupied bandwidth
in the D group is already
CRUD = WRUh log2 (1 + ). (3) obtained, WDirect = M W S
h
N0 WRU Dm =1 Dm , if WDirect < WM 2M ,
h the remaining bandwidth, i.e., WRelay = WM 2M − WDirect ,
For the relay M2M communications, in the first half time is allocated to the R group. According to Eq. (2), Eq. (4)
slot, the Rh th source HMTD sends the data to the UBS, and and the HMTDs’ required data rates, in the first half time
then, in the second half time slot, the UBS forwards the data slot, the UBS first executes Algorithm 1 (by replacing the
$W_{M2M}$ in Line 6 with $W_{Relay}$) to maximize the number of HMTDs to transmit data in the R group and allocate the HMTDs their required bandwidth. Suppose there are $H$ HMTDs, each indexed by $R_h$, which are facilitated in the first half time slot; the allocated bandwidth $W_{R_h}^{S}$ for each source HMTD leads to the corresponding data rate $C_{R_h}^{SU}$ ($C_{R_h}^{SU}$ should be larger than or equal to $2C_{R_h,req}^{SD}$, according to Eq. (4)) for the SU link.

Algorithm 1 UNR Algorithm
1: Define two sets: Set 1 is an empty set and Set 2 includes all the $I$ active HMTDs.
2: for $D_i = 1$ to $I$ do
3:   Find the HMTD $D_i$ with the least required bandwidth $W_{D_i}^{S}$ in Set 2, add this HMTD into Set 1 and remove it from Set 2;
4:   Update the number of HMTDs, $I$, in Set 2; denote the number of HMTDs in Set 1 as $M$ and index the HMTDs by $D_m$, and update $M$;
5:   Calculate $\sum_{D_m=1}^{M} W_{D_m}^{S}$;
6:   if $\sum_{D_m=1}^{M} W_{D_m}^{S} < W_{M2M}$ and Set 2 is not empty then
7:     Continue to the next iteration;
8:   else
9:     Stop the iteration;
10:  end if
11: end for
12: UBS facilitates transmissions of the HMTDs in Set 1 by allocating their required bandwidth $W_{D_m}^{S}$.

Fig. 2. The number of HMTDs that can transmit vs. UBS available bandwidth.

In the second half time slot, $W_{Relay}$ is fully used again to facilitate transmissions from the UBS to the destination HMTDs. Since the UBS is equipped with much higher transmission power than that of an HMTD [11], the data rate of the UD link can be guaranteed such that $C_{R_h}^{UD} \ge C_{R_h}^{SU}$, even if the channel condition of the UD link may not be as good as that of the SU link. As the UBS carries a limited-capacity battery, the hovering time is a key factor in rescue. Thus, in this half time slot, the objective is to minimize the UBS's transmission power, while fulfilling the required data rates, i.e.,

$$\min_{\{P_{R_h}^{U},\, W_{R_h}^{U}\}} \sum_{R_h=1}^{H} P_{R_h}^{U} \tag{6}$$
$$\text{s.t.}\quad C_{R_h}^{SU} \le C_{R_h}^{UD} \tag{7}$$
$$\sum_{R_h=1}^{H} W_{R_h}^{U} \le W_{Relay} \tag{8}$$
$$\sum_{R_h=1}^{H} P_{R_h}^{U} \le P_{Total} \tag{9}$$
$$0 < W_{R_h}^{U} \tag{10}$$
$$0 < P_{R_h}^{U}. \tag{11}$$

Here, Eq. (9) constrains the total transmission power for relaying data not to exceed the maximum transmission power $P_{Total}$ allowed by the UBS's hardware. It can be proved that this optimization problem is convex. The optimal solution can be obtained efficiently.

Proposition 1: The UBS resource allocation optimization problem in the second half time slot is a convex problem.

Proof: According to Eq. (3), the transmission power $P_{R_h}^{U}$ of the UBS for serving the $R_h$th destination HMTD can be expressed as

$$P_{R_h}^{U} = \left(2^{C_{R_h}^{UD} / W_{R_h}^{U}} - 1\right) \cdot \frac{N_0 W_{R_h}^{U}}{g_{R_h}^{UD}}. \tag{12}$$

Thus, the objective function Eq. (6) and its constraints can be re-written as

$$\min_{\{W_{R_h}^{U},\, C_{R_h}^{UD}\}} \sum_{R_h=1}^{H} \left(2^{C_{R_h}^{UD} / W_{R_h}^{U}} - 1\right) \cdot \frac{N_0 W_{R_h}^{U}}{g_{R_h}^{UD}} \tag{13}$$
$$\text{s.t.}\quad C_{R_h}^{SU} \le C_{R_h}^{UD} \tag{14}$$
$$\sum_{R_h=1}^{H} W_{R_h}^{U} \le W_{Relay} \tag{15}$$
$$\sum_{R_h=1}^{H} P_{R_h}^{U} \le P_{Total} \tag{16}$$
$$0 < W_{R_h}^{U} \tag{17}$$
$$0 < C_{R_h}^{UD}. \tag{18}$$

Here, $W_{R_h}^{U}$ and $C_{R_h}^{UD}$ are the only variables in Eq. (13). Calculating the second-order derivatives of Eq. (12) shows that its Hessian matrix is positive semidefinite. That is, Eq. (12) is convex [10]. The summation of convex functions, Eq. (13), is still a convex function [10]. At the same time, the inequality constraints (14)-(18) are convex. Hence, this optimization problem is a convex problem.

IV. SIMULATIONS

Simulations are set up as follows. The radius of the UBS's coverage is 500 m. The UBS is placed 80 m above the center of the coverage area. The distribution of the HMTDs is generated by a Poisson Point Process in this area. In simulations, the distance-dependent channel model [7] is adopted. The HMTDs' locations are fixed within each time slot of 10 ms. The transmission power of each HMTD is arbitrarily assigned among 8 mW, 10 mW and 20 mW. In the coverage of the UBS, the pairing of the SD pairs is arbitrarily assigned. The SD pairs' required data rates are arbitrarily assigned
among 150 Kbps, 200 Kbps and 300 Kbps. The total available bandwidth at the UBS is 10 MHz.

Fig. 2 shows the average number of HMTDs that can transmit data, facilitated by our proposed UNARA scheme and the conventional RAS [5] (i.e., HMTDs are randomly selected to transmit), respectively, versus different amounts of UBS total available bandwidth. In this simulation, the total number of active HMTD SD pairs is set as 620. When the UBS has more available bandwidth, a larger number of HMTDs can transmit. In each setting of the available bandwidth, the UNARA scheme outperforms the conventional RAS because our proposed scheme preferentially allocates the spectrum to the SD pairs which require small bandwidth, thus enabling more HMTDs to access the spectrum to transmit. In contrast, with the conventional RAS, the UBS randomly selects the active HMTDs in its coverage, and some of them may demand large bandwidth, and so the total number of HMTDs that can access the spectrum is not maximized.

Fig. 3 shows the spectrum efficiency of the UBS network with different amounts of UBS available bandwidth. Note that with different amounts of UBS available bandwidth, the spectrum efficiency of our proposed UNARA scheme and that of RAS are almost the same because the spectrum in both schemes is efficiently allocated once they select the HMTDs. Even though the data rate that can be transmitted over the available bandwidth in both schemes is nearly the same, our proposed UNARA scheme enables more HMTDs to transmit (Fig. 2) because the UNARA scheme first allocates the spectrum to the SD pairs which require small bandwidth, thus allowing more people to tap on the spectrum to send rescue messages. In general, the spectrum efficiency of RAS is slightly lower than that of UNARA because in RAS, some of the HMTDs may require a large bandwidth; when the available bandwidth cannot accommodate all the HMTDs' requests, the UBS will randomly disconnect one or multiple HMTDs; some bandwidth may not be allocated at the end of the resource allocation. With the increase of available bandwidth at the UBS, the spectrum efficiency of RAS slowly ascends because RAS can, with more available bandwidth, more likely accommodate those HMTDs that require large bandwidth.

Fig. 4 shows the Monte Carlo result of the UBS's average transmission power with different numbers of SD pairs classified in the relay M2M group. As the number of SD pairs in the relay M2M group increases, the UBS's transmission power increases because the UBS has to serve more UD links to relay data for the source HMTDs. The UBS's transmission power in RAS increases rapidly because in RAS, the HMTDs are randomly selected and some HMTDs may require high data rates. In order to satisfy their required data rates, the UBS has to increase its transmission power at each UD link. The UNARA scheme minimizes the transmission power at the UBS by wisely selecting HMTDs and allocating the network resource, thus prolonging the hovering time of the UBS for performing rescue as much as possible.

Fig. 3. Spectrum efficiency vs. UBS available bandwidth.

Fig. 4. UBS's transmission power vs. the number of HMTDs in the relay M2M group.

V. CONCLUSION

In this letter, we have proposed to adopt the UBS to assist M2M communications for disaster rescue. In order to enable the maximum number of HMTDs to transmit data, we have proposed the UNARA scheme to wisely select the HMTDs and allocate network resources to them accordingly. Meanwhile, the UNARA scheme minimizes the transmission power at the UBS when it relays data for the HMTDs in the relay M2M group, to prolong its hovering time as much as possible. The performance of the proposed UNARA scheme has been validated through extensive simulations.

REFERENCES

[1] A. Rico-Alvarino et al., “An overview of 3GPP enhancements on machine to machine communications,” IEEE Commun. Mag., vol. 54, no. 6, pp. 14–21, Jun. 2016.
[2] D. Lake, A. Rayes, and M. Morrow, “The Internet of Things,” Internet Protocol J., vol. 15, no. 3, pp. 10–19, 2012.
[3] Vodafone, “RACH intensity of time controlled devices,” 3rd Gener. Partnership Project, Sophia Antipolis, France, Rep. R2-102296, 2012.
[4] L. Zhang, Q. Fan, and N. Ansari, “3-D drone-base-station placement with in-band full-duplex communications,” IEEE Commun. Lett., vol. 22, no. 9, pp. 1902–1905, Sep. 2018.
[5] Y.-H. Chen, E. H.-K. Wu, C.-H. Lin, and G.-H. Chen, “Bandwidth-satisfied and coding-aware multicast protocol in MANETs,” IEEE Trans. Mobile Comput., vol. 17, no. 8, pp. 1778–1790, Aug. 2018.
[6] M. Alzenad, A. El-Keyi, F. Lagum, and H. Yanikomeroglu, “3-D placement of an unmanned aerial vehicle base station (UAV-BS) for energy-efficient maximal coverage,” IEEE Wireless Commun. Lett., vol. 6, no. 4, pp. 434–437, Aug. 2017.
[7] F. Wang, C. Xu, L. Song, and Z. Han, “Energy-efficient resource allocation for device-to-device underlay communication,” IEEE Trans. Wireless Commun., vol. 14, no. 4, pp. 2082–2092, Apr. 2015.
[8] C. E. Shannon and W. Weaver, The Mathematical Theory of Communication. Champaign, IL, USA: Univ. Illinois Press, 2002.
[9] G. Zhao, C. Yang, G. Y. Li, D. Li, and A. C. K. Soong, “Power and channel allocation for cooperative relay in cognitive radio networks,” IEEE J. Sel. Topics Signal Process., vol. 5, no. 1, pp. 151–159, Feb. 2011.
[10] S. P. Boyd and L. Vandenberghe, Convex Optimization. Cambridge,
[11] L. Zhang and N. Ansari, “On the number and 3-D placement of in-band full-duplex enabled drone-mounted base-stations,” IEEE Wireless Commun. Lett., to be published, doi: 10.1109/LWC.2018.286750.
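The bandwidth computation of Eq. (5) and the smallest-first admission of Algorithm 1 in the letter above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the bisection solver, and the rule of admitting a pair only while the running total still fits within the available bandwidth are all hypothetical choices made for the example.

```python
import math


def shannon_rate(w_hz, p_watt, gain, n0):
    """Eq. (1): data rate in bit/s when w_hz of bandwidth is allocated."""
    return w_hz * math.log1p(p_watt * gain / (n0 * w_hz)) / math.log(2.0)


def required_bandwidth(c_req, p_watt, gain, n0, tol=1e-6):
    """Solve Eq. (5) for W by bisection; return None if c_req is unreachable.

    The rate grows with W (concave, increasing) but saturates at
    P*g / (N0 * ln 2), so larger required rates have no solution.
    """
    if c_req >= p_watt * gain / (n0 * math.log(2.0)):
        return None
    lo, hi = 0.0, 1.0
    while shannon_rate(hi, p_watt, gain, n0) < c_req:
        hi *= 2.0                      # grow the bracket until it contains the root
    while hi - lo > tol * hi:
        mid = 0.5 * (lo + hi)
        if shannon_rate(mid, p_watt, gain, n0) < c_req:
            lo = mid
        else:
            hi = mid
    return hi


def unr_select(required_bw, total_bw):
    """Greedy admission in the spirit of Algorithm 1 (assumed interpretation).

    Pairs are visited from the smallest required bandwidth upward and
    admitted while the cumulative allocation does not exceed total_bw.
    Returns the admitted indices and the bandwidth they occupy.
    """
    order = sorted((w, i) for i, w in enumerate(required_bw) if w is not None)
    admitted, used = [], 0.0
    for w, i in order:
        if used + w > total_bw:
            break
        admitted.append(i)
        used += w
    return admitted, used
```

Calling `unr_select` a second time with the leftover bandwidth (the letter's $W_{Relay}$) and the relay group's required bandwidths mirrors the first-half-slot step of Section III-B. Sorting once up front is a design choice of this sketch; the letter's Algorithm 1 instead searches for the minimum in every iteration.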
Coded Redundant Message Transmission Schemes for Low-Power Wide Area IoT Applications

Samuel Montejo-Sánchez, Member, IEEE, Cesar A. Azurdia-Meza, Member, IEEE, Richard Demo Souza, Senior Member, IEEE, Evelio Martin Garcia Fernandez, Member, IEEE, Ismael Soto, Member, IEEE, and Arliones Hoeller, Jr., Member, IEEE

Abstract: We propose a novel transmission scheme, suitable to Internet of Things, which sends coded redundant information in independent packets. The results indicate that this scheme outperforms, in terms of reliability, transmit power, and coverage, the typical direct transmission strategy, schemes based on replications, and methods that embed coded redundant information in packets that contain new information. Moreover, we demonstrate the impact of selecting the most appropriate transmission scheme according to the target information outage probability and node location.

Index Terms: Coded redundant message transmissions, energy efficiency, Internet of Things (IoT), LPWAN.

Manuscript received September 27, 2018; revised November 2, 2018; accepted November 5, 2018. Date of publication November 12, 2018; date of current version April 9, 2019. This work was supported in part by FONDECYT Post-Doctoral under Grant 3170021, in part by FONDECYT Iniciación under Grant 11160517, in part by FONDEF under Grant IT17M10012, in part by ERANet-LAC under Grant ELAC2015/T10-0761, in part by CNPq under Grant 304503/2017-7, and in part by the Araucaria Foundation, Brazil. The associate editor coordinating the review of this paper and approving it for publication was A. Ozcelikkale. (Corresponding author: Samuel Montejo-Sánchez.)
S. Montejo-Sánchez is with the Programa Institucional de Fomento a la I+D+i, Universidad Tecnológica Metropolitana, Santiago 8940577, Chile (e-mail: smontejo@utem.cl).
C. A. Azurdia-Meza is with the Department of Electrical Engineering, Universidad de Chile, Santiago 8370451, Chile (e-mail: cazurdia@ing.uchile.cl).
R. D. Souza is with the Department of Electrical Engineering, Federal University of Santa Catarina, Florianópolis 88034500, Brazil (e-mail: richard.demo@ufsc.br).
E. M. G. Fernandez is with the Department of Electrical Engineering, Federal University of Paraná, Curitiba 81531-990, Brazil (e-mail: evelio@ufpr.br).
I. Soto is with the Department of Electrical Engineering, Universidad de Santiago de Chile, Santiago 9170124, Chile (e-mail: ismael.soto@usach.cl).
A. Hoeller is with the Department of Telecommunications Engineering, Federal Institute for Education, Science and Technology of Santa Catarina, São José 88103-310, Brazil (e-mail: arliones.hoeller@ifsc.edu.br).
Digital Object Identifier 10.1109/LWC.2018.2880959

I. INTRODUCTION

THE INTERNET-OF-THINGS (IoT) aims at providing connectivity for thousands of devices [1]. An IoT base station (BS) should cover many devices in a relatively simple manner, thus reducing infrastructure cost [2], [3]. Typically, the BS collects information from many devices [1]–[5], which transmit a small amount of data with short duty cycle. Ensuring energy efficient and reliable communications in resource constrained networks is a primary concern in IoT [1]. That is why Low Power Wide Area Networks (LPWANs), such as LoRa and SigFox, attract so much interest [2]. For instance, by using ultra narrow band (UNB) and low data rates, SigFox utilizes bandwidth efficiently and experiences low noise levels, resulting in high sensitivity and low power consumption [2].

Reliability is always a concern in wireless communications, and it can be increased by retransmission and redundancy schemes [3]–[6]. A feedback channel is usually employed to request retransmissions. However, LPWANs avoid the continuous use of the downlink channel, since this link is often near congested, as the BS covers several nodes at a low data rate [1]–[5]. So, as in [3]–[5], this possibility is not considered in this letter. In contrast, redundancy-based schemes do not need any feedback information since the devices transmit redundant packets without prior request. Remarkable redundancy-based schemes are fountain codes [7], [8] and network coding [8], [9]. But since in LPWANs the one-hop link to the BS is essential to reduce energy consumption, solutions based on cooperative diversity [8] are typically not well suited [2]–[5].

Packet-level erasure coding on top of pure physical-layer coding is known to be beneficial in block fading channels [6]. In [3], by taking into account the randomness in time and frequency domains, a redundancy-based scheme using simple packet replications has been considered in a UNB based LPWAN. However, solutions based on coded packet redundancy [4], [5], [7] can bring increased reliability for the same rate. In [4], inspired by network coding, we proposed a non-cooperative transmission scheme based on coded redundant information, while in [5], a convolutional fountain erasure coding scheme for data recovery called DaRe is proposed.

In this letter, first we extend the scheme addressed in [4] to embed redundancy in more than just the next data packet and to include more than just one coded message per packet. Then, and more importantly, we propose a novel coded transmission scheme, which sends redundant information in new independent packets, unlike the aforesaid schemes [4], [5], which focus on the case where redundancy is embedded in the next data packets. Besides, we establish a fair comparison between the transmission schemes, imposing a maximum delay constraint, taking into account the impact of the protocol overhead, and ensuring the same channel time utilization.

The main contribution of this letter is a novel transmission scheme, based on independent coded redundant information, that outperforms direct transmission, methods based on replications [3], and schemes based on embedded redundant messages [4], [5], in terms of reliability, transmit power, and coverage, while ensuring the same channel time utilization.

II. SYSTEM MODEL

We consider the uplink between node U and the BS in an LPWAN. We assume additive white Gaussian noise
MONTEJO-SÁNCHEZ et al.: CODED REDUNDANT MESSAGE TRANSMISSION SCHEMES FOR LOW-POWER WIDE AREA IoT APPLICATIONS 585
channels subject to quasi-static Rayleigh fading in which the instantaneous signal-to-noise ratio (SNR) at the BS is $\gamma = \frac{g h^2 P_t}{N_o W}$, where $P_t$ is the transmit power of node U, $N_o$ is the unilateral noise power spectral density, $W$ is the channel bandwidth, $h$ is the channel fading coefficient, independent and identically distributed (i.i.d.), where $h^2$ follows an exponential distribution with unit energy, $g = K d^{-\alpha}$ is the path loss [10], where $K$ is a frequency dependent constant, $d$ is the distance between U and BS, and $\alpha$ is the path loss exponent.

The outage probability is the probability that a message from U is not recovered at the BS. A transmission fails whenever the instantaneous SNR is below a target SNR $\gamma_{DT}$, required to guarantee reliable communication at a rate $R_{DT}$. Then, the link outage probability between U and the BS is [10]

$$O_{DT} = \Pr\{\gamma < \gamma_{DT}\} = 1 - e^{-\frac{N_o W \left(2^{R_{DT}/W} - 1\right)}{g P_t}}. \tag{1}$$

A. Direct Transmission

In the typical direct transmission (DT) scheme, each message is sent in a single packet with a header of size $l_h$ and a payload of size $l_m$. The probability of information outage is associated with the loss of the packet conveying that information, and consequently to the link outage. As shown in Fig. 1-a, when the reception of the $k$th transmitted packet fails, the $k$th message is lost. Since each transmitted packet contains a single information message, then the information outage probability is simply $I_{DT} = O_{DT}$.

B. Replication Schemes

The simplest time diversity scheme uses repetition coding, sending replicas of the original message. This technique is referred here as Replication Transmission (RT). The replicas can be sent in new independent packets, which is referred here as Independent RT (RT-I), or embedded in the payload of the successive data packets, Embedded RT (RT-E). As in Fig. 1-b, with RT-I the $k$th message can be contained in $n$ successive independent replicas. Note that RT-I is the technique discussed in [3] and utilized by SigFox. In these scenarios, most of the applications have long time intervals between messages [1], [2], which allows the replicas to be sufficiently spaced in time to guarantee statistical independence. For the sake of fairness, considering the same channel time utilization according to DT, we assume the transmission rate to be a function of the number of replicas. Consequently, the rate of RT-I is^1

$$R_{RT\text{-}I,n} = (1 + n) R_{DT}. \tag{2}$$

Moreover, RT-E sends replicas embedded in the successive data packets that contain new information, which avoids the increase of protocol overhead at the expense of increasing the payload. As in Fig. 1-c, when RT-E with one replica is used the $k$th message is contained in both the $k$th and the $(k+1)$th data packets (note that the $(k+1)$th data packet also contains new information). Generalizing the idea, the $k$th message could be embedded in the following $n$ packets, increasing $n$ times the payload size of the respective packets. Consequently, the transmission rate required by RT-E is

$$R_{RT\text{-}E,n} = \left(1 + n \frac{l_m}{l_h + l_m}\right) R_{DT}. \tag{3}$$

Note that, by adjusting the transmission rate we ensure the same channel time utilization by every scheme, even with different number of replicas, thus establishing a fair comparison.

Since both techniques have the same redundancy, $n$ replicated messages, their information outage probabilities are

$$I_{RT\text{-}I,n} = O_{RT\text{-}I,n}^{(1+n)} \quad \text{and} \quad I_{RT\text{-}E,n} = O_{RT\text{-}E,n}^{(1+n)}, \tag{4}$$

so that, for each scheme, increasing the number of replicas has a beneficial impact in the exponent of the outage probability, but also a negative impact in the required transmit rate.

Fig. 1. Transmission schemes, for n = 1. The top scale is consistent with the binary length of the headers and messages, while the bottom scale is consistent with the temporal length.

1 $R_{X,n}$ and $O_{X,n}$ are the transmission rate and link outage probability when scheme X ∈ {RT-I, RT-E, CT-E, CT-I} is used, with $n$ redundant messages. $O_{X,n}$ is written by replacing $R_{DT}$ in (1) by $R_{X,n}$ according to (2) or (3).
2 To limit the decoding delay and the memory requirements, at most three redundant messages per information message are assumed, and so we only present the analysis between the $(k-3)$th and $(k+3)$th data packets. For instance, taking into account that SigFox messages are limited to 140 uplink messages per day [2], this limits the decoding delay to at most half an hour.

III. CODED TRANSMISSION SCHEMES

The schemes discussed in Section II are inefficient since we must double (RT-I) or almost double (RT-E) the transmission rate, compared to DT, to send a single replica. Thus, inspired by network coding and fountain codes, we propose the use of Coded Transmission (CT), in which successive replicas are combined before transmission, increasing the spectral efficiency. The use of linear combinations (e.g., XOR operation) of previous messages increases the redundancy with limited cost in terms of spectral efficiency. Just as in RT, the CT schemes can be of the independent (CT-I) and embedded (CT-E) types. Moreover, the transmission rates of CT-I and CT-E are obtained by evaluating (2) and (3), respectively.

A. Extending the Coded Transmission Scheme in [4]

Let $n = 1$, as in Fig. 1-d, so that the redundancy is embedded in the next data packet. Then, the $k$th packet contains the original $k$th message and a linear combination of the $(k-1)$th and $(k-2)$th messages. If the decoding of the $k$th packet fails, then the $k$th message can be obtained from the $(k-1)$th and $(k+1)$th or $(k+1)$th and $(k+2)$th or $(k+2)$th and $(k+3)$th packets. This example defines CT-E, which was introduced in [4], but the analysis ignored the effects of the protocol overhead and channel time fairness, while a formulation as a function of $n$ was not presented.

If $n = 2$ is used, each message contains two linear combinations of the previous messages (in this case the $k$th packet contains the original $k$th message, a linear combination of the $(k-1)$th and $(k-2)$th messages, and a linear combination of the $(k-1)$th and $(k-3)$th messages) but the number of independent packets is the same, while the required transmission rate increases. Thus, the required transmission rate of CT-E is the same as of RT-E. By enumerating^2 the events that lead to
an outage for different values of $n$,

$$I_{CT\text{-}E,n} = O_{CT\text{-}E,n}^{3} \left(1 + O_{CT\text{-}E,n}^{n-1} - O_{CT\text{-}E,n}^{n} + O_{CT\text{-}E,n}^{2n-2} - O_{CT\text{-}E,n}^{2n-1}\right). \tag{5}$$

Note that the diversity order is not increasing with $n$, making CT-E worse than RT-E, when $n \ge 2$. DaRe [5] is a similar embedded scheme, which uses linear combinations of previous messages, but performs better with higher window size values, limiting its application in scenarios with delay constraints.^3

3 DaRe [5] with a code rate equal to 1/2 and W = 2 is equivalent to CT-E with n = 1, while if W = 1, then it is equivalent to RT-E with n = 1.

B. Novel Independent Coded Transmission Scheme

Fig. 1-e illustrates the novel CT-I scheme with $n = 1$. The $k$th message is sent in one independent data packet, but it is also contained in two additional packets: i) the coded message sent after the $k$th data packet, in a linear combination of the $(k-1)$th and the $k$th messages; ii) the coded message sent after the $(k+1)$th data packet, in a linear combination of the $k$th and the $(k+1)$th messages. Table I lists the events (one per row), between the $(k-3)$th and $(k+3)$th data packets, that allow the $k$th message decoding, according to the successful (S) or failed (F) decoding of that particular packet, where M and R are new message and coded redundant information, respectively. Thus, if the direct decoding of the $k$th packet fails, the $k$th message can still be obtained from the other events, which can be classified in two sets containing: i) the events that depend on the linear combination with the $(k-1)$th message, denoted as $\mathcal{E}_1$ and highlighted in red; ii) the events that depend on the linear combination with the $(k+1)$th message, denoted as $\mathcal{E}_2$ and highlighted in green. From Table I we can determine the probability of occurrence of both independent sets of events, which are $P(\mathcal{E}_1) = P(\mathcal{E}_2) = E_1$,

$$E_1 = (1 - O_{CT\text{-}I,1})^{2} + O_{CT\text{-}I,1} (1 - O_{CT\text{-}I,1})^{3} + O_{CT\text{-}I,1}^{2} (1 - O_{CT\text{-}I,1})^{4}. \tag{6}$$

TABLE I
EVENTS FOR SUCCESSFUL DECODING OF CT-I WITH n = 1

Then, considering the union of the events in Table I, the information outage probability of CT-I with $n = 1$ is

$$I_{CT\text{-}I,1} = 1 - \left[(1 - O_{CT\text{-}I,1}) + O_{CT\text{-}I,1} P\left(\mathcal{E}_1 \cup \mathcal{E}_2\right)\right] = O_{CT\text{-}I,1} - O_{CT\text{-}I,1} \left(2E_1 - E_1^{2}\right) = O_{CT\text{-}I,1}^{3} F_1^{2}, \tag{7}$$

$$F_1 = 1 + O_{CT\text{-}I,1} + O_{CT\text{-}I,1}^{2} - 5 O_{CT\text{-}I,1}^{3} + 4 O_{CT\text{-}I,1}^{4} - O_{CT\text{-}I,1}^{5}. \tag{8}$$

If $n = 2$ is used, the $k$th message is sent in one independent data packet, but it is also contained in four additional packets, which leads to four independent sets of events that allow message decoding with probability of occurrence $E_2$ and depending on: i) one coded message sent after the $k$th data packet, in a linear combination of the $(k-2)$th and the $k$th messages; ii) another coded message sent after the $k$th data packet, in a linear combination of the $(k-1)$th and the $k$th messages; iii) the coded message sent after the $(k+1)$th data packet, in a linear combination of the $k$th and the $(k+1)$th messages; iv) the coded message sent after the $(k+2)$th data packet, in a linear combination of the $k$th and the $(k+2)$th messages. Generalizing this idea, the number of complementary independent sets of events that allow decoding of the $k$th message is $2n$ and $P(\mathcal{E}_j) = E_n,\ \forall j \in \{1, 2, \ldots, 2n\}$. So, the information outage probability of CT-I can be formulated considering the union of these independent sets of events [11, eq. (2.15)], $I_{CT\text{-}I,n} = 1 - \left[(1 - O_{CT\text{-}I,n}) + O_{CT\text{-}I,n} P\left(\bigcup_{j=1}^{2n} \mathcal{E}_j\right)\right]$. This union can be rewritten using the binomial function, and then by an algebraic approach based on the binomial theorem [11, eq. (6.4)] we determine a closed form expression for the information outage probability of CT-I as

$$I_{CT\text{-}I,n} = O_{CT\text{-}I,n} - O_{CT\text{-}I,n} \sum_{j=1}^{2n} \binom{2n}{j} (-1)^{j+1} E_n^{j} = O_{CT\text{-}I,n} - O_{CT\text{-}I,n} \left(1 - (1 - E_n)^{2n}\right) = O_{CT\text{-}I,n}^{2n+1} F_n^{2n}, \tag{9}$$

where

$$E_n = (1 - O_{CT\text{-}I,n})^{2} + O_{CT\text{-}I,n} (1 - O_{CT\text{-}I,n})^{3} + O_{CT\text{-}I,n}^{2} (1 - O_{CT\text{-}I,n})^{4}, \tag{10}$$

$$F_n = 1 + O_{CT\text{-}I,n} + O_{CT\text{-}I,n}^{2} - 5 O_{CT\text{-}I,n}^{3} + 4 O_{CT\text{-}I,n}^{4} - O_{CT\text{-}I,n}^{5}. \tag{11}$$

Note that, from (9) the diversity order of CT-I is $2n + 1$.

TABLE II
SYSTEM PARAMETER VALUES

We evaluate the proposed schemes in terms of information outage probability, transmit power consumption, and coverage. All closed-form expressions were validated by Monte-Carlo simulations. Unless stated otherwise, we use the parameters in Table II, which coincide with those in the Sigfox standard [12]. Fig. 2 shows the information outage probability as a function of $n$, which is the number of replicas sent by RT or the number of coded messages sent by CT. Several interesting points arise: i) For all redundancy-based schemes, the optimum $n$ depends on the communication distance, except for CT-E whose optimum value is $n = 1$; ii) For the other schemes, the optimal value of $n$ decreases as $d$ increases, $n = 3$ for RT-I and CT-I, as well as $n = 5$ for RT-E when $d = 5$ km (see Fig. 2-a), while $n = 2$ for RT-I and CT-I, as well as $n = 3$ for RT-E when $d = 10$ km (see Fig. 2-b); iii) For the same $d$, $n$ of RT-E is greater than $n$ of RT-I and CT-I, due to the protocol
Fig. 2. Information outage as a function of the number of replicas or coded messages (n), for (a) d = 5 km and (b) d = 10 km.
Fig. 4. (a) Minimum transmit power ($P_{t,min}$) and (b) Maximum communication distance ($d_{max}$) as a function of the information outage.
for more demanding target information outages, such as $I_o \le 10^{-3}$, CT-I with n = 2 is the best option.
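The closed-form expressions (1)-(5) and (9)-(11) behind these comparisons can be evaluated numerically. The sketch below is a minimal illustration, assuming the forms of the equations as given in this letter; the function names and all numeric inputs are hypothetical, since the Table II parameter values are not reproduced in this text.

```python
import math


def link_outage(rate, w, g, pt, n0):
    """Eq. (1): link outage probability of one packet sent at `rate` bit/s."""
    return 1.0 - math.exp(-n0 * w * (2.0 ** (rate / w) - 1.0) / (g * pt))


def info_outage(scheme, n, r_dt, lh, lm, w, g, pt, n0):
    """Information outage probability for DT, RT-I, RT-E, CT-E or CT-I.

    n is the number of replicas (RT) or coded messages (CT); it is
    ignored for DT. lh and lm are the header and payload sizes.
    """
    if scheme == "DT":
        return link_outage(r_dt, w, g, pt, n0)
    if scheme in ("RT-I", "CT-I"):
        factor = 1 + n                          # Eq. (2)
    elif scheme in ("RT-E", "CT-E"):
        factor = 1 + n * lm / (lh + lm)         # Eq. (3)
    else:
        raise ValueError(f"unknown scheme: {scheme}")
    o = link_outage(factor * r_dt, w, g, pt, n0)
    if scheme in ("RT-I", "RT-E"):
        return o ** (1 + n)                     # Eq. (4)
    if scheme == "CT-E":                        # Eq. (5)
        return o ** 3 * (1 + o ** (n - 1) - o ** n
                         + o ** (2 * n - 2) - o ** (2 * n - 1))
    f = 1 + o + o ** 2 - 5 * o ** 3 + 4 * o ** 4 - o ** 5   # Eq. (11)
    return o ** (2 * n + 1) * f ** (2 * n)      # Eq. (9)
```

Sweeping `n` or `pt` with this function for each scheme gives curves of the kind discussed around Figs. 2 and 3; the absolute values depend entirely on the assumed parameters.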
The results shown here vary quantitatively in real-world sce-
narios, but the performance metrics trends do remain. Besides
note that, as seen in Fig. 1, when the original message trans-
mission fails and the system must rely on a replica, RT-I
decodes the k th message within the k th duty cycle, CT-I can
decode the message within the same duty cycle or in the fol-
lowing duty cycles, but RT-E and CT-E always decode the
message in the following duty cycles.
Fig. 3. Information outage as a function of (a) the transmit power ($P_t$) and (b) the protocol overhead.
V. CONCLUSION
We proposed CT-I, which outperforms direct transmission,
methods based on replications, and another coded transmission
efficiency of RT-E; iv) CT-I outperforms all other schemes in scheme based on embedded messages, in terms of reliabil-
terms of information outage with n . ity, transmit power, and coverage, with the same channel
Fig. 3-a shows the information outage probability as a time utilization. CT-I transmits redundancy with less addi-
function of the transmit power. Several interesting points tional packets than the other schemes. The optimum number
arise: i) All redundancy-based schemes outperform DT, despite of redundant messages depends on range, protocol overhead,
being their respective link outage probabilities greater than transmit power and information outage requirements.
ODT ; ii) CT-I with n = 2 outperforms all other schemes in
terms of information outage, when 12 dBm ≤ Pt ≤ 18 dBm; R EFERENCES
iii) The optimum n depends on Pt , for instance for CT-I if [1] G. A. Akpakwu, B. J. Silva, G. P. Hancke, and A. M. Abu-Mahfouz,
Pt < 12 dBm then n = 1, but if Pt > 18 dBm then n = 3. “A survey on 5G networks for the Internet of Things: Communication
technologies and challenges,” IEEE Access, vol. 6, pp. 3619–3647, 2018.
Fig. 3-b shows the information outage probability as a func- [2] U. Raza, P. Kulkarni, and M. Sooriyabandara, “Low power wide area
tion of the protocol overhead, defined as the ratio between the networks: An overview,” IEEE Commun. Surveys Tuts., vol. 19, no. 2,
header and the packet sizes. In this experiment, we set the DT pp. 855–873, 2nd Quart., 2017.
[3] Y. Mo, M.-T. Do, C. Goursaud, and J.-M. Gorce, “Optimization of the
packet size to 12 bytes, so the protocol overhead varies accord- predefined number of replications in a ultra narrow band based IoT
ing to lh and lm (e.g., if lh = 4 bytes, then lm = 8 bytes and the network,” in Proc. IEEE Wireless Days, 2016, pp. 1–6.
[4] S. Montejo-Sánchez et al., “An alternative non-cooperative transmis-
protocol overhead is 33%). Two additional points: i) The per- sion scheme based on coded redundant information,” in Proc. IEEE
formance of schemes based on embedded messages improves LATINCOM, 2017, pp. 1–6.
when the protocol overhead increases, due to (3); ii) CT-I with [5] P. J. Marcelis, V. S. Rao, and R. V. Prasad, “DaRe: Data recovery through
application layer coding for LoRaWAN,” in Proc. IEEE IoTDI, 2017,
n = 2 outperforms all other schemes in terms of information pp. 97–108.
outage, when the protocol overhead is less than 50 %, from [6] T. A. Courtade and R. D. Wesel, “Optimal allocation of redun-
which RT-E with n = 3 becomes the best option. dancy between packet-level erasure coding and physical-layer channel
coding in fading channels,” IEEE Trans. Commun., vol. 59, no. 8,
Fig. 4-a shows the minimum transmit power required to pp. 2101–2109, Aug. 2011.
sustain a target information outage probability. The proposed [7] J. W. Byers, M. Luby, and M. Mitzenmacher, “A digital fountain
scheme, CT-I with n = 2 and n = 1, as well as CT-E with approach to asynchronous reliable multicast,” IEEE J. Sel. Areas
n = 1, require half the transmit power of RT-I with n = 2 [8] R. R. Borujeny and M. Ardakani, “Fountain code design for the Y-
to meet Io = 10−3 . Requiring less Pt to meet the same Io network,” IEEE Commun. Lett., vol. 19, no. 5, pp. 703–706, May 2015.
[9] G. Angelopoulos, A. Paidimarri, A. P. Chandrakasan, and M. Médard,
and channel time utilization implies greater energy efficiency, “Experimental study of the interplay of channel and network cod-
since the additional computational cost of CT-I is minimal ing in low power sensor applications,” in Proc. IEEE ICC, 2013,
when compared to the RF power consumption and can thus pp. 5126–5130.
[10] A. Goldsmith, Wireless Communications. Cambridge, U.K.: Cambridge
be neglected. If information outage requirements are mild, then Univ. Press, 2005.
it is more convenient to use few replicas. Fig. 4-b shows the [11] T. T. Soong, Fundamentals of Probability and Statistics for Engineers.
Hoboken, NJ, USA: Wiley, 2004.
maximum range as a function of Io . For the case of a relatively [12] SigFox. (2018). Lightweight Protocol. [Online]. Available: https://www.
large Io = 10−2 , CT-E with n = 1 is the best option, while sigfox.com/en/sigfox-iot-technology-overview
On Optimizing Effective Rate for Random Linear Network Coding Over Burst-Erasure Relay Links
Huangnan Wu, Ye Li, Yingdong Hu, Bin Tang, and Zhihua Bao
Abstract—This letter proposes to optimize the effective data rate of random linear network coding (RLNC) over finite-buffer burst-erasure relay links. An absorbing Markov chain model is exploited to analyze the expected transmission completion times of using different finite field sizes. Through analysis and simulations, we show that the RLNC effective data rate considering the coefficient overhead can be maximized by optimization over the field size for coding.
Index Terms—Network coding, coding coefficient, finite field size, absorbing Markov chain.

Manuscript received September 28, 2018; revised November 5, 2018; accepted November 9, 2018. Date of publication November 13, 2018; date of current version April 9, 2019. This work was supported in part by the National Natural Science Foundation of China (NSFC) under Grant 61771263, Grant 61771264, Grant 61801248, and Grant 61871241, in part by the Natural Science Foundation of Jiangsu Province under Grant BK20180943, and in part by the Natural Science Foundation of Jiangsu Higher Education Institutes under Grant 18KJB510036 and Grant 18KJB510037. The work of B. Tang was supported by the NSFC under Grant 61501221 and Grant 61872171. The associate editor coordinating the review of this paper and approving it for publication was L. P. Natarajan. (Corresponding author: Zhihua Bao.)
H. Wu, Y. Li, Y. Hu, and Z. Bao are with the School of Electronics and Information, Nantong University, Nantong 226019, China (e-mail: bao.zh@ntu.edu.cn).
B. Tang is with the National Key Laboratory for Novel Software Technology, Nanjing University, Nanjing 210023, China.
Digital Object Identifier 10.1109/LWC.2018.2881157

I. INTRODUCTION
Relay-aided transmissions play a vital role in wireless communication systems. For example, in the ultra-dense cellular networks of the fifth generation (5G), relay-aided transmission is a key technique to improve the system throughput and expand the transmission range [1]. Random linear network coding (RLNC) [2] is a promising technique for enhancing relay-aided transmission. With RLNC, a relay node is allowed to recode received packets, and the transmission time can be reduced without the need of acknowledgments between nodes. This fountain property is especially useful in scenarios where the link qualities are poor or the feedback channel is expensive.
However, one drawback of RLNC is that coding coefficients need to be carried in the coded packet header. When the size of the finite field used for coding is large, the increased coefficient overhead may negatively affect the transmission efficiency when the data payload is small, which is a typical characteristic of many 5G applications such as the Internet of Things (IoT). While splitting the source packets into multiple subsets and performing RLNC within each subset may reduce the overhead, feedback from the decoder is required for the source to move to the transmission of the next subset, which would destroy the fountain property of RLNC and might not be practical in scenarios where feedback is expensive or has long delay. Using a pseudo-random number generator (PRNG) to avoid sending the coding coefficients might be a possible solution for end-to-end scenarios, but it cannot be used for relay transmissions, where recoding needs to be performed so as to approach the max-flow capacity [3].
To solve the problem, we consider optimizing the finite field size to improve the efficiency of RLNC. In this letter, we propose to focus on the effective data rate, which accounts for the coefficient overhead in the end-to-end rate. Given the tradeoff that using a smaller finite field may lower the coefficient overhead but could also increase the linear dependency among coded packets and hence decrease the effective rate, we analyze the relationship between the finite field size and the completion time for finite-buffer relay transmission scenarios. Based on the analysis, the field size is optimized via numerical search to maximize the effective data rate. We focus on the Gilbert-Elliott (GE) channel model as the packet loss of wireless channels is usually bursty [4].
Previously, the effects of finite field sizes have been investigated in [5] and [6] for end-to-end transmissions where the overall path is modeled as a Bernoulli erasure channel. However, the effects of recoding with limited buffer size at intermediate nodes are not characterized. In [7], the relay transmission with Bernoulli erasures was analyzed under the assumptions that the relay buffer and the field sizes are sufficiently large. In [8], a finite buffer was considered for a special GE burst erasure channel where the packet loss rate is 0 when the channel is in the Good state. Again, the assumed field size in [8] is sufficiently large. Compared to the previous works, the main contributions of this letter are that we analyze the completion time of RLNC over a generic GE erasure relay link by generalizing the analysis of [8] to account for the effects of a finite field size q, and that we determine the optimal q that maximizes the effective end-to-end data rate of RLNC via an efficient numerical algorithm.

Fig. 1. A two-hop line network with GE channels.

II. SYSTEM MODEL
A. RLNC Transmission Model
We consider a relay link consisting of a source node S, a relay node R and a destination node D. Each hop is modeled as a burst erasure channel which may be described using a two-state GE model as shown in Fig. 1. In the model, there are two states, i.e., Good and Bad. On the l-th hop (l ∈ {1, 2}), the transition probabilities from Good to Bad and from Bad to Good are denoted as $P_{gl}$ and $P_{bl}$, respectively. The probabilities of successful transmission in Good and Bad states are $\alpha_l < 1$ and
WU et al.: ON OPTIMIZING EFFECTIVE RATE FOR RLNC OVER BURST-ERASURE RELAY LINKS 589
$\beta_l < 1$, respectively. The average packet loss rate of this model is $P_{el} = \frac{P_{bl}}{P_{gl}+P_{bl}}(1-\alpha_l) + \frac{P_{gl}}{P_{gl}+P_{bl}}(1-\beta_l)$ [4].
Suppose that S has M source packets $s_1, \ldots, s_M$ to send to D. Each packet contains K bytes. The relay node R connects S and D, and has a finite buffer of size $m \ll M$ packets. We assume that the oldest buffered packet would be replaced by the new incoming packet if the buffer is full. The two links transmit in a time division manner. We assume that S-R transmits in odd time slots, and R-D transmits in even time slots. The packet is received immediately by the downstream node if it is not erased over the channel.
With RLNC, coded packets are sent from S in the form of $p = \sum_{i=1}^{M} g_i s_i$, where the coding coefficients $g_i$'s are uniformly randomly chosen from the finite field $\mathbb{F}_q$ of size q.

of an S-R transmission, D has received Y innovative packets, and R has X ≤ m packets buffered which are innovative with respect to the Y innovative packets at D. Over the S-R link, the probability that R receives an innovative packet from S is then $P_1 = 1 - \frac{q^{X+Y}}{q^M}$. The term $\frac{q^{X+Y}}{q^M}$ is the probability that a uniformly randomly generated M-dimensional EV over $\mathbb{F}_q$ lies in the span of the X + Y linearly independent M-dimensional vectors that have been received by R and D [9]. D receives an innovative packet if and only if X > 0 before the R-D transmission and the recoding coefficients are not all zero. Denote the probability that a recoded packet from R is innovative as $P_2$. We have $P_2 = 1 - \frac{1}{q^X}$ if the latest S-R transmission did not increase X, and $P_2 = 1 - \frac{1}{q^{X+1}}$ otherwise (i.e., X → X + 1 in the latest S-R transmission).
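The two innovation probabilities are simple enough to evaluate exactly. The following is a minimal sketch, assuming the formulas for P1 and P2 as stated in the text; the function names and the example values (q = 4, M = 8) are illustrative and not taken from the letter.

```python
from fractions import Fraction


def p_relay_innovative(q: int, M: int, X: int, Y: int) -> Fraction:
    """P1: probability that a fresh coded packet from S is innovative at R.

    A random M-dimensional encoding vector over F_q lies in the span of the
    X + Y independent vectors already held by R and D with probability
    q^(X+Y) / q^M.
    """
    if not 0 <= X + Y <= M:
        raise ValueError("need 0 <= X + Y <= M")
    return 1 - Fraction(q ** (X + Y), q ** M)


def p_recoded_innovative(q: int, X: int, x_just_increased: bool) -> Fraction:
    """P2: probability that a recoded packet from R is innovative at D.

    The exponent is X + 1 when the latest S-R transmission raised X by one.
    """
    exponent = X + 1 if x_just_increased else X
    return 1 - Fraction(1, q ** exponent)


# Illustrative values only: q = 4 and M = 8 source packets.
p1 = p_relay_innovative(q=4, M=8, X=2, Y=5)
p2_same = p_recoded_innovative(q=4, X=2, x_just_increased=False)
p2_new = p_recoded_innovative(q=4, X=2, x_just_increased=True)
```

Exact fractions are used here only to keep small cases easy to check by hand; floating point would be the natural choice for large M.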
The vector $[g_1, \ldots, g_M]$ is the encoding vector (EV) of the coded packet. Suppose that R has received k coded packets $p_1, \ldots, p_k$, $0 < k \le m$; recoded packets are sent from R in the form of $\sum_{j=1}^{k} h_j p_j = \sum_{j=1}^{k} h_j \left(\sum_{i=1}^{M} g_{i,j} s_i\right) = \sum_{i=1}^{M} \left(\sum_{j=1}^{k} g_{i,j} h_j\right) s_i$, where $g_{i,j}$'s are the coding coefficients of $p_j$, $h_j$'s are uniformly randomly chosen from $\mathbb{F}_q$, and $\sum_{j=1}^{k} g_{i,j} h_j$ are the coding coefficients of the recoded packet. The coding coefficients are delivered in the header of each coded/recoded packet. When D receives a sufficient number of coded packets which contain M innovative (i.e., linearly independent) EVs, the source packets can be recovered by Gaussian elimination.
B. Performance Metrics
The achieved end-to-end rate is $R = \frac{M}{E\{T\}}$, where $E\{T\}$ is the expected completion time, i.e., the expected number of time slots required for the decoder to recover all the M source packets. The minimum expected completion time is $T_{\min} = 2M/(1 - \max\{P_{e1}, P_{e2}\})$ when the max-flow capacity is achieved, where 2 is due to the time division transmission, and $P_{e1}$ and $P_{e2}$ are the average packet loss rates of the two links, respectively. The effective rate is defined as
$$R_e = \frac{K}{K + \frac{M \log_2 q}{8}} \cdot \frac{M}{E\{T\}}, \quad (1)$$
where K is the data payload size of each coded packet in bytes. $R_e$ measures the effects of coefficient overhead on the rate.
III. ANALYZING AND OPTIMIZING EFFECTIVE RATE
We exploit a Markov chain model to analyze the expected completion time of using RLNC with finite field size q. The analysis is in part inspired by [8]. Based on the analysis, we then optimize q to maximize the effective rate of RLNC.
A. Analysis
The model tracks the numbers of innovative packets at R and D,¹ which are then used to determine the expected transition steps that are required to achieve the transmission completion.² Specifically, we consider that at the beginning
¹ We note that during the transmission we do not actually check the innovativeness of packets at R to avoid complicating the node. However, this model simplifies the analysis, and will be shown to match the simulations.

In the model, we have $X \in \{0, 1, \ldots, m\}$, $Y \in \{0, 1, \ldots, M\}$, and $X + Y \le M$. In addition, each link can be in one of two states, Good and Bad, abbreviated as G and B, respectively. The two-hop transmission process can therefore be fully characterized by a Markov chain with state $(X, Y, S_1, S_2)$, where $S_1$ and $S_2$ denote the channel states of S-R and R-D, respectively, $S_1 \in \{G, B\}$, $S_2 \in \{G, B\}$. The total number of the $(X, Y, S_1, S_2)$ states is
$$\eta = 16\left[(m+1)(M-m+1) + \frac{m(m+1)}{2}\right], \quad (2)$$
where the factor 16 = 4 × 4 arises because there are 4 different $(S_1, S_2)$ combinations, each of which may transit to 4 different states. Depending on the different numbers of innovative packets at R and D, the state transition probabilities after every two time slots can be determined as follows.
For simplicity, here we only present the transition probabilities for one possible change of the channel states, (G, B) → (G, B). As mentioned, a total of 16 such channel state transitions are possible. The $(X, Y, S_1, S_2)$ transition probabilities corresponding to the other channel state transitions can be calculated similarly, and are omitted to save space.
Case 1: X = 0, Y < M. Since no innovative packet is buffered at R at the beginning of the S-R transmission, Y would not increase regardless of the R-D transmission if the S-R link cannot transmit an innovative packet successfully (i.e., either the S-R transmission fails or the transmission succeeds but the packet is non-innovative). When R receives an innovative packet successfully, Y increases and X would remain 0 if R-D transmits an innovative packet successfully. Otherwise, Y would not increase and X would increase by 1. Therefore, we may have the following transition probabilities:
$$\Pr\{(X, Y, G, B) \to (X, Y+1, G, B)\} = (1-P_{g1})\alpha_1 P_1 (1-P_{b2})\beta_2 P_2,$$
$$\Pr\{(X, Y, G, B) \to (X, Y, G, B)\} = [(1-P_{g1})\alpha_1(1-P_1) + (1-P_{g1})(1-\alpha_1)](1-P_{b2}),$$
$$\Pr\{(X, Y, G, B) \to (X+1, Y, G, B)\} = (1-P_{g1})\alpha_1 P_1 [(1-P_{b2})(1-\beta_2) + (1-P_{b2})\beta_2(1-P_2)],$$
where we recall that $\alpha_1$ and $\beta_2$ are the probabilities of successful transmission in Good and Bad states, respectively. Note that $\alpha_2$ and $\beta_1$ do not appear here because the S-R link remains Good and the R-D link remains Bad.
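A quick consistency check on the Case 1 expressions is that the three outcomes exhaust what can happen while both channels keep their states, so they should add up to $(1-P_{g1})(1-P_{b2})$. The sketch below evaluates the three expressions as written; the channel parameters are the ones quoted in the simulation section, while q, M and Y are illustrative choices and the function name is hypothetical.

```python
import math


def case1_transitions(pg1, pb2, alpha1, beta2, p1, p2):
    """Case 1 (X = 0, Y < M) probabilities for the channel move (G, B) -> (G, B)."""
    to_y_plus_1 = (1 - pg1) * alpha1 * p1 * (1 - pb2) * beta2 * p2
    to_same = ((1 - pg1) * alpha1 * (1 - p1) + (1 - pg1) * (1 - alpha1)) * (1 - pb2)
    to_x_plus_1 = (1 - pg1) * alpha1 * p1 * (
        (1 - pb2) * (1 - beta2) + (1 - pb2) * beta2 * (1 - p2)
    )
    return {"Y+1": to_y_plus_1, "unchanged": to_same, "X+1": to_x_plus_1}


# Illustrative state: X = 0 and Y = 10 out of M = 64 packets, with q = 4.
q, M, Y = 4, 64, 10
p1 = 1 - q ** Y / q ** M   # X = 0, so R and D together span Y dimensions
p2 = 1 - 1 / q             # X has just increased from 0 to 1
probs = case1_transitions(pg1=0.2, pb2=0.2, alpha1=0.8, beta2=0.6, p1=p1, p2=p2)
total = sum(probs.values())  # should equal (1 - pg1) * (1 - pb2)
```

The same check applies to the other cases and to the remaining 15 channel state transitions once their probabilities are written out.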
² The analysis can also be extended to a line network with multiple relays by tracking the numbers of innovative packets at the relay nodes. However, the state space would be significantly enlarged. Since this letter focuses on the single-relay link, we omit the extension to the multi-relay scenario.
Case 2: 0 < X < m, Y < M − X. On the one hand, when the S-R cannot transmit an innovative packet successfully, Y would increase (and X would decrease) if the R-D transmits an innovative packet successfully. Otherwise, both X and Y are not changed. On the other hand, when the S-R transmits an innovative packet successfully, X would increase and Y is not changed if the R-D could not transmit an innovative packet successfully. Otherwise, Y would increase and X is not changed. We may have the following transition probabilities:
$$\Pr\{(X, Y, G, B) \to (X-1, Y+1, G, B)\} = [(1-P_{g1})\alpha_1(1-P_1) + (1-P_{g1})(1-\alpha_1)](1-P_{b2})\beta_2 P_2,$$
$$\Pr\{(X, Y, G, B) \to (X, Y, G, B)\} = [(1-P_{g1})\alpha_1(1-P_1) + (1-P_{g1})(1-\alpha_1)][(1-P_{b2})(1-\beta_2) + (1-P_{b2})\beta_2(1-P_2)],$$
$$\Pr\{(X, Y, G, B) \to (X+1, Y, G, B)\} = (1-P_{g1})\alpha_1 P_1 [(1-P_{b2})(1-\beta_2) + (1-P_{b2})\beta_2(1-P_2)],$$
$$\Pr\{(X, Y, G, B) \to (X, Y+1, G, B)\} = (1-P_{g1})\alpha_1 P_1 (1-P_{b2})\beta_2 P_2.$$
Case 3: X = m or X + Y = M. When X = m, R’s buffer is full, and X would not increase after the S-R transmission. Y increases by 1 and X decreases by 1 accordingly if and only if the R-D transmits an innovative packet successfully. When X + Y = M, R and D have obtained all the needed innovative packets, and X would not increase anymore. Therefore, the state transition probabilities are the same as X = m. We may have the following transition probabilities:
$$\Pr\{(X, Y, G, B) \to (X-1, Y+1, G, B)\} = (1-P_{g1})(1-P_{b2})\beta_2 P_2,$$
$$\Pr\{(X, Y, G, B) \to (X, Y, G, B)\} = (1-P_{g1})[(1-P_{b2})(1-\beta_2) + (1-P_{b2})\beta_2(1-P_2)].$$

Algorithm 1 Search for the Optimal q
  q ← 2, q̂ ← q
  Calculate $R_e(q)$ according to equations (1) and (4)
  $R_e^* \leftarrow R_e(q)$ (placeholder of the maximum $R_e$)
  while 2q ≤ q* do
    q ← 2q
    if $R_e(q) > R_e^*$ then
      $R_e^* \leftarrow R_e(q)$
      q̂ ← q (optimal q)
    end if
  end while

expected number of times of staying in the corresponding state, given that the random process starts at (0, 0, G, G). E is an (η − 4) × (η − 4) identity matrix. The variance of the completion time can also be obtained as
$$\mathrm{Var}\{T\} = e_{1,\eta-4}[(2H - E)t] - (e_{1,\eta-4}\, t)^2, \quad (5)$$
where $e_{1,\eta-4}$ is a row vector of length (η − 4) whose elements are all zero except the 16th element, which is 1, and t = H1, where 1 is a column vector of length (η − 4) with all entries equal to 1.
B. Optimizing Effective Rate
It is clear that the completion time depends on the finite field size q. From (1), q also affects the coefficient overhead. We can therefore optimize q based on the above analysis results to maximize the effective data rate. We consider that the nodes have limited computation capability and storage space, and therefore the maximum supported finite field size is limited.
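The expected completion time in (4) and the variance in (5) both come from the fundamental matrix $H = (E - A)^{-1}$ of an absorbing Markov chain. The sketch below shows the mechanics on a deliberately tiny chain rather than the letter's $(X, Y, S_1, S_2)$ chain: it assumes two transient states and a hypothetical link that succeeds with probability 0.5, and it solves linear systems instead of forming H explicitly. All names are illustrative.

```python
def solve_linear(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [list(row) + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def absorption_mean_and_variance(A, start):
    """Mean and variance of the number of steps to absorption from `start`.

    A is the transient-to-transient block of the transition matrix.
    """
    n = len(A)
    e_minus_a = [[(1.0 if r == c else 0.0) - A[r][c] for c in range(n)]
                 for r in range(n)]
    t = solve_linear(e_minus_a, [1.0] * n)   # t = H 1 (row sums of H)
    h_t = solve_linear(e_minus_a, t)         # H t
    mean = t[start]
    variance = 2.0 * h_t[start] - t[start] - t[start] ** 2  # e[(2H - E)t] - (e t)^2
    return mean, variance


# Toy chain: two packets to deliver, each attempt succeeds with probability 0.5.
A_toy = [[0.5, 0.5],
         [0.0, 0.5]]
mean_steps, var_steps = absorption_mean_and_variance(A_toy, start=0)
```

For this toy chain the step count is a sum of two geometric waiting times, which gives an independent way to verify the result by hand.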
Note that we have 4 absorbing states, i.e., (0, M, G, G), (0, M, G, B), (0, M, B, G), (0, M, B, B). Let ζ = M − m + 1 and label the 4 channel state combinations $(S_1, S_2)$, i.e., (G, G), (G, B), (B, G), (B, B), as W = 0, 1, 2, 3, respectively. We can

Let q* denote the maximum allowed finite field size. Without loss of generality, the allowed finite fields are all extensions of $\mathbb{F}_2$, i.e., $q = 2^k$, $k \in \mathbb{Z}^+$. We can use Algorithm 1 to traverse all q's in [2, q*] to find the optimal q that maximizes the effective data rate. The optimal q is denoted as q̂.
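The search in Algorithm 1 can be sketched in a few lines once a routine for E{T} is available. In the sketch below, `effective_rate` follows (1) and `search_optimal_q` follows Algorithm 1; the optional early-exit flag mirrors the simplified search the letter describes when discussing Fig. 3 and is only safe if the rate has a single maximum. The function `single_hop_expected_T` is a hypothetical stand-in that assumes one Bernoulli erasure hop with no relay; it is not the letter's two-hop GE Markov chain result, and the values K = 200, M = 16 and success probability 0.8 are illustrative.

```python
import math


def effective_rate(K, M, q, expected_T):
    """Eq. (1): payload share of each packet times the end-to-end rate M / E{T}."""
    coefficient_bytes = M * math.log2(q) / 8
    return K / (K + coefficient_bytes) * M / expected_T


def search_optimal_q(rate_of_q, q_max, stop_at_first_peak=False):
    """Algorithm 1: scan q = 2, 4, 8, ... up to q_max and keep the best rate."""
    q = 2
    best_q, best_rate = q, rate_of_q(q)
    while 2 * q <= q_max:
        q *= 2
        rate = rate_of_q(q)
        if rate > best_rate:
            best_rate, best_q = rate, q
        elif stop_at_first_peak:
            break  # assumption: the rate has a single maximum in q
    return best_q, best_rate


def single_hop_expected_T(M, q, success_prob):
    """HYPOTHETICAL stand-in for E{T}: one Bernoulli erasure hop, no relay.

    With y innovative packets received, a new packet is innovative with
    probability 1 - q^(y - M), so each level is a geometric waiting time.
    """
    return sum(1.0 / (success_prob * (1.0 - q ** (y - M))) for y in range(M))


K, M = 200, 16
rate = lambda q: effective_rate(K, M, q, single_hop_expected_T(M, q, 0.8))
q_hat, rate_hat = search_optimal_q(rate, q_max=256)
q_hat_fast, _ = search_optimal_q(rate, q_max=256, stop_at_first_peak=True)
```

Passing the rate as a callable keeps the search independent of how E{T} is obtained, so the stand-in can be replaced by a routine that evaluates (4) for the full chain.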
obtain the transition matrix of the Markov chain by ordering all the four-tuple states. Define
$$f(X, Y, S_1, S_2) = \begin{cases} 16[(m+1)Y + X + 1] - W, & Y \le \zeta, \\ 16\left[(m+1)\zeta + \sum_{k=0}^{Y-\zeta-1}(m-k) + X + 1\right] - W, & Y > \zeta, \end{cases}$$
where W identifies the channel states. Let ΔX and ΔY denote the change of X and Y after every two time slots, respectively, ΔX, ΔY ∈ {−1, 0, 1}. By assigning index $\lambda = f(X, Y, S_1, S_2)$ and $\omega = f(X + \Delta X, Y + \Delta Y, S_1, S_2)$, we can write the η × η transition matrix of $(X, Y, S_1, S_2)$ as
$$\Pi = \begin{bmatrix} A & Z \\ 0 & I_{4\times 4} \end{bmatrix}. \quad (3)$$
The (λ, ω)-th entry of matrix Π is $[\Pi]_{\lambda,\omega} = \Pr\{(X, Y, S_1, S_2) \to (X + \Delta X, Y + \Delta Y, S_1, S_2)\}$. A is an (η − 4) × (η − 4) matrix and $I_{4\times 4}$ is a 4 × 4 identity matrix which corresponds to the 4 absorbing states.
Given Π, and assuming that the transmission starts at the 16-th state, i.e., (0, 0, G, G), the expected completion time is
$$E\{T\} = \sum_{b=1}^{\eta-4} [H]_{16,b}, \quad (4)$$
where $H = (E - A)^{-1}$ is the fundamental matrix of the Markov chain [8]. Each entry in the 16-th row of H is the

IV. SIMULATION RESULTS
In this section, we evaluate the performance of the proposed scheme. Throughout the coded transmission simulations, we use packets each containing K = 200 bytes given the fact that many applications such as IoT use small payload sizes. The results are based on averaging over 100,000 trials.
We first validate the proposed analytic model. Suppose that the number of source packets M = 64, and the channel state transition probabilities of the links are $P_{g1} = P_{b1} = P_{g2} = P_{b2} = 0.2$. The successful transmission probabilities on each link in the Good and Bad states are $\alpha_1 = \alpha_2 = 0.8$ and $\beta_1 = \beta_2 = 0.6$, respectively. We observe that the simulation and analysis results (see equations (4), (5)) are closely matched in Fig. 2, where the vertical bars are the standard deviations. Unsurprisingly, the completion time decreases as the finite field size increases. The average completion time decreases more significantly when the finite field size is small (i.e., q ≤ 4), whereas it remains almost unchanged when q is large (i.e., q ≥ 8). Note that there is a gap between the achieved and the minimum achievable completion times. The reason is that the results are in the finite regime M = 64. The completion time would achieve the minimum as M → ∞ and q → ∞.
Fig. 3 shows the effective rates of RLNC using q = {2, 4, 8, . . . , 256}, respectively. The maximum rate is achieved at q = 4. Although q = 256 has the smallest completion time as shown in Fig. 2, the effective rate is the lowest due to the
Fig. 2. Completion times with various q.
Fig. 3. Effective rates with various q.
Fig. 4. Optimal effective rate with various M.
high coefficient overhead. We observe that $R_e$ has only one maximum in Fig. 3. This implies that if using Algorithm 1 to optimize q, we may actually exit the search after evaluating q = {2, 4, 8}, i.e., breaking out of the loop after finding the first q such that $R_e(q/2) < R_e(q) > R_e(2q)$. This may save the computational cost significantly. Through extensive numerical evaluations and simulations for various M and q, we have confirmed that such a simplified algorithm is indeed effective.³
In Fig. 4, we show the simulated effective rates for various M using the q's found by Algorithm 1 and those found by the simplified algorithm, respectively. The corresponding optimal q's are marked. As a comparison, we also plot the calculated optimal $R_e^*$. We see that the optimized q values found by the original and the simplified Algorithm 1 match, and the achieved rates only slightly deviate from the analyzed maximum $R_e^*$. As M increases, the overhead increases, and therefore the effective rate gradually decreases. However, the rates are higher than the routing rate, which is equal to $\frac{1}{2}(1 - P_{e1})(1 - P_{e2})$.
V. CONCLUSION
In this letter, we showed that, by analyzing the completion time of RLNC over finite-buffer burst-erasure relay links using a Markov chain model and then optimizing the finite field size, we can increase the effective rate of RLNC.
³ Unfortunately, we are not able to mathematically prove that $R_e$ has only one maximum since we cannot obtain T as a closed-form function of q.
REFERENCES
[1] X. Ge, S. Tu, G. Mao, C.-X. Wang, and T. Han, “5G ultra-dense cellular networks,” IEEE Wireless Commun., vol. 23, no. 1, pp. 72–79, Feb. 2016.
[2] T. Ho et al., “A random linear network coding approach to multicast,” IEEE Trans. Inf. Theory, vol. 52, no. 10, pp. 4413–4430, Oct. 2006.
[3] Z. Liu, C. Wu, B. Li, and S. Zhao, “UUSee: Large-scale operational on-demand streaming with random network coding,” in Proc. IEEE INFOCOM, San Diego, CA, USA, 2010, pp. 1–9.
[4] X. Xu, Y. L. Guan, and Y. Zeng, “Batched network coding with adaptive recoding for multi-hop erasure channels with memory,” IEEE Trans. Commun., vol. 66, no. 3, pp. 1042–1052, Mar. 2018.
[5] J. Heide, M. V. Pedersen, F. H. P. Fitzek, and M. Medard, “On code parameters and coding vector representation for practical RLNC,” in Proc. IEEE Int. Conf. Commun. (ICC), 2011, pp. 1–5.
[6] M. Nistor, D. E. Lucani, T. T. V. Vinhoza, R. A. Costa, and J. Barros, “On the delay distribution of random linear network coding,” IEEE J. Sel. Areas Commun., vol. 29, no. 5, pp. 1084–1093, May 2011.
[7] X. Shi, M. Medard, and D. E. Lucani, “Whether and where to code in the wireless packet erasure relay channel,” IEEE J. Sel. Areas Commun., vol. 31, no. 8, pp. 1379–1389, Aug. 2013.
[8] Y. Li et al., “A low-complexity coded transmission scheme over finite-buffer relay links,” IEEE Trans. Commun., vol. 66, no. 7, pp. 2873–2887, Jul. 2018.
[9] O. Trullols-Cruces, J. M. Barcelo-Ordinas, and M. Fiore, “Exact decoding probability under random linear network coding,” IEEE Commun. Lett., vol. 15, no. 1, pp. 67–69, Jan. 2011.
Different Power Adaption Methods on Fluctuating Two-Ray Fading Channels
Hui Zhao, Student Member, IEEE, Zhedong Liu, and Mohamed-Slim Alouini, Fellow, IEEE
Abstract—In this letter, we consider a typical scenario where the transmitter employs different power adaption methods, including the optimal rate and power algorithm, optimal rate adaption, channel inversion and truncated channel inversion, to enhance the ergodic capacity (EC) with an average transmit power constraint over fluctuating two-ray fading channels. In particular, we derive exact closed-form expressions for the EC under different power adaption methods, as well as corresponding asymptotic formulas for the EC valid in the high signal-to-noise ratio region. Finally, we compare the performance of the EC under different power adaption methods, and this also validates the accuracy of our derived expressions for the exact and asymptotic EC.
Index Terms—Asymptotic ergodic capacity, ergodic capacity, fluctuating two-ray fading channel, power adaption.

Manuscript received September 7, 2018; revised October 25, 2018; accepted November 8, 2018. Date of publication November 14, 2018; date of current version April 9, 2019. The associate editor coordinating the review of this paper and approving it for publication was I. Krikidis. (Corresponding author: Hui Zhao.)
The authors are with the Computer, Electrical, and Mathematical Science and Engineering Division, King Abdullah University of Science and Technology, Jeddah 23955-6900, Saudi Arabia (e-mail: hui.zhao@kaust.edu.sa; zhedong.liu@kaust.edu.sa; slim.alouini@kaust.edu.sa).
Digital Object Identifier 10.1109/LWC.2018.2881158

I. INTRODUCTION
Due to the exponential increase in aggregate traffic, millimeter-wave (mmWave) has been recently used to overcome the wireless spectrum shortage. Although some conventional fading channels, such as Rayleigh and Rician fading channels, have been verified to sometimes suit mmWave radio communications, the fluctuation suffered by the received signal cannot always be modeled accurately by conventional fading models. In view of this issue, [1] introduced the fluctuating two-ray (FTR) fading model consisting of random phase plus a diffuse component, which is the natural generalization of the two-wave with diffuse (TWDP) model in [2] where the specular components of the TWDP model are just constant amplitudes, and also can reduce to many conventional fading models, such as Rician and Nakagami-m fading models (for more special cases, refer to [1, Table I]). Subsequently, [3], [4] extended the work of [1] in terms of elementary functions and coefficients consisting of fading parameters, where the parameter m in FTR fading can take an arbitrary positive real value, rather than only positive integers as in [1]. However, there is no derivation of asymptotic ergodic capacity (AEC) and asymptotic ergodic secrecy capacity (AESC) in the high signal-to-noise ratio (SNR) region in [3] and [4], which leaves out some insights and makes the calculation of the ergodic capacity (EC) and ergodic secrecy capacity (ESC) in the high SNR region less efficient. Further, the authors in [3] and [4] only consider the optimal rate adaption (ORA) case, where the transmit power is fixed over the whole transmission.
In practice, there is an average transmit power constraint at the transmitter, which will involve power adaption according to instantaneous channel state to enhance the EC. Optimal power and rate algorithm (OPRA) and channel inversion (CI) are two common methods of power adaption [5]–[7], where the performance of OPRA is much better than that of CI, because CI requires a large amount of the transmit power to compensate for the deep channel fading. To improve the performance of CI, [8] proposed a kind of truncated CI (TCI), where the cutoff level can be selected to achieve a specified outage probability.
In this letter, we derive exact and corresponding asymptotic closed-form expressions for the EC over FTR fading channels under different power adaption methods, namely OPRA, ORA, CI and TCI, and compare the EC among them by simulation.
II. SYSTEM MODEL
The EC is defined as
$$C = E\{\ln(1 + \gamma_d)\} = \int_0^{\infty} \ln(1 + \gamma_d) f_{\gamma_d}(\gamma_d)\, d\gamma_d, \quad (1)$$
where $\gamma_d$ is the instantaneous SNR at the destination, and $f_{\gamma_d}(\cdot)$ is the probability density function (PDF) of $\gamma_d$. In the following sections, we will investigate several adaptive transmission methods to improve the EC.
The PDF and cumulative distribution function (CDF) of $\gamma_d$ over FTR fading channels are given by [3]
$$f_{\gamma_d}(\gamma_d) = \frac{m_d^{m_d}}{\Gamma(m_d)} \sum_{j_d=0}^{\infty} \frac{K_d^{j_d} d_{j_d}}{j_d!\, j_d!} \frac{\gamma_d^{j_d} \exp\left(-\frac{\gamma_d}{2\sigma_d^2}\right)}{(2\sigma_d^2)^{j_d+1}}, \quad (2)$$
$$F_{\gamma_d}(\gamma_d) = \frac{m_d^{m_d}}{\Gamma(m_d)} \sum_{j_d=0}^{\infty} \frac{K_d^{j_d} d_{j_d}}{j_d!\, j_d!} \Upsilon\left(j_d + 1, \frac{\gamma_d}{2\sigma_d^2}\right), \quad (3)$$
respectively, where Γ(·) and Υ(·, ·) denote the Gamma function and lower incomplete Gamma function [9], respectively. $K_d$ is the average power ratio of the dominant waves and remaining diffuse multipath, $m_d$ is the parameter of Gamma distribution with unit mean, $\sigma_d^2$ is the variance of the real (or imaginary) diffuse component, and
$$d_{j_d} \triangleq \sum_{k=0}^{j_d} \binom{j_d}{k} \left(\frac{\Delta_d}{2}\right)^k \sum_{l=0}^{k} \binom{k}{l} \Gamma(j_d + m_d + 2l - k) \cdot \exp\left(\frac{\pi(2l-k)i}{2}\right) \left[(m_d + K_d)^2 - (K_d \Delta_d)^2\right]^{-\frac{j_d + m_d}{2}} \cdot P_{j_d + m_d - 1}^{k-2l}\left(\frac{m_d + K_d}{\sqrt{(m_d + K_d)^2 - (K_d \Delta_d)^2}}\right), \quad (4)$$
in which $\Delta_d$, i, and P(·) denote a ratio defined by (4) in [3], the imaginary unit, and the Legendre function of the first kind [9],
ZHAO et al.: DIFFERENT POWER ADAPTION METHODS ON FTR FADING CHANNELS 593
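respectively. From $F_{\gamma_d}(\infty) = 1$ and [3, eqs. (7) and (8)], we can easily derive $\frac{m_d^{m_d}}{\Gamma(m_d)} \sum_{j_d=0}^{\infty} \frac{K_d^{j_d} d_{j_d}}{j_d!} = 1$, and further rewrite $F_{\gamma_d}(\cdot)$ as
$$F_{\gamma_d}(\gamma_d) = 1 - \frac{m_d^{m_d}}{\Gamma(m_d)} \sum_{j_d=0}^{\infty} \frac{K_d^{j_d} d_{j_d}}{j_d!} \exp\left(-\frac{\gamma_d}{2\sigma_d^2}\right) \sum_{n=0}^{j_d} \frac{1}{n!}\left(\frac{\gamma_d}{2\sigma_d^2}\right)^n. \quad (5)$$
III. ERGODIC CAPACITY UNDER OPRA
A. Exact EC Under OPRA
Our goal is to adjust the transmit power according to the instantaneous channel state to maximize the EC subject to a certain average transmit power ($\bar{P}_t$) by using OPRA, where the EC in the integral form is given by (7) in [5]
$$C_{opra} = \int_{\gamma_0}^{\infty} \ln\left(\frac{\gamma_d}{\gamma_0}\right) f_{\gamma_d}(\gamma_d)\, d\gamma_d, \quad (6)$$
where $\gamma_0 = \lambda \bar{P}_t$, in which $\lambda \ge 0$ is the corresponding

In view of the definition of upper incomplete Gamma function Γ(·, ·), we finally have the constraint condition as
$$\frac{m_d^{m_d}}{\Gamma(m_d)} \sum_{j_d=0}^{\infty} \frac{K_d^{j_d} d_{j_d}}{j_d!\, j_d!} \left[\frac{1}{\gamma_0} \Gamma\left(j_d + 1, \frac{\gamma_0}{2\sigma_d^2}\right) - \frac{1}{2\sigma_d^2} \Gamma\left(j_d, \frac{\gamma_0}{2\sigma_d^2}\right)\right] = 1. \quad (12)$$
Let
$$f(\gamma_0) = \frac{1}{\gamma_0}\left[1 - F_{\gamma_d}(\gamma_0)\right] - \int_{\gamma_0}^{\infty} \frac{f_{\gamma_d}(\gamma_d)}{\gamma_d}\, d\gamma_d. \quad (13)$$
Differentiating $f(\gamma_0)$ with respect to $\gamma_0$ by using Leibniz Rule, we have
$$\frac{\partial f(\gamma_0)}{\partial \gamma_0} = -\frac{1 - F_{\gamma_d}(\gamma_0)}{\gamma_0^2}. \quad (14)$$
For $\gamma_0 > 0$, $\frac{\partial f(\gamma_0)}{\partial \gamma_0} < 0$. Thus, $f(\gamma_0)$ is monotonically decreasing over $\gamma_0 \in [0, \infty)$. When $\gamma_0 \to \infty$, (13) will converge to 0, and when $\gamma_0 \to 0$, it will go to infinity. To summarize, there exists a unique $\gamma_0$ satisfying the identity of (12).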
Lagrangian multiplier. Substituting the PDF of γd into (6), When 2σd2 → ∞, we have (15), shown on the top of the
we can derive the EC under OPRA as next page. Our numerical results show that γ0 increases as
mdmd ∞ j
Kdd djd γ D increases, and as such γ0 will lie in the interval [0, 1].
C opra = 2 jd +1
Γ(md )
jd =0 jd !jd ! 2σd B. Asymptotic EC Under OPRA
∞
Γ(s,x )
γd jd γd By using the limit identity lim xs = − 1s for Re(s) < 0
· ln γ exp − 2 d γd , (7) x →0
γ0 γ0 d 2σd and lim Γ(0, x ) = − ln x + ψ(1), we have
x →0
I1
Γ −p + l − 1, 1 2σd2 1
where lim 2 −p+l−1 = , (16)
2
1/(2σd )→0 p + 1−l
∞ 1 2σd
x + γ0 jd x + γ0
I1 = ln (x + γ0 ) exp − dx and
0 γ0 2σd2
jd −p p
jd
γ0 1 γ0
jd jd −p γ0 lim + Γ 0, 2
= γ exp − 2 1/(2σd2 )→0 2σd2 p+1−l 2σd
p 0 2σd l=1
p=0 ⎧
∞ ⎨0, p < jd ;
x x
p
· x ln 1 + exp − 2 dx . (8) =
⎩ψ(p + 1) − ln γ0
, p = jd ,
(17)
0 γ0 2σd 2σ 2 d
By using the integral identity derived in [5, Appendix B], I1 where ψ(·) denotes the digamma function [9]. In view of this
can be easily solved in closed-form as limit relationship, the AEC under OPRA can be derived by
jd p+1
jd
2 l j +1−l γ0 ∞ md
md ∞
j
Kdd djd
I1 = p! 2σd γ0 d
Γ −p − 1 + l , 2 , (9) C opra = ln 2σd2 − ln γ0 + ψ(jd + 1). (18)
p=0
p
l=1
2σd Γ(md )
jd =0
jd !
where Γ(·, ·) denotes the complementary upper incomplete ∞

Finally, C opra can be further rewritten by using the relation-
Gamma function [5]. ship between σd and γ D , i.e., γ D = N Eb
0
2σd2 (1 + Kd )rd−ηd ,
The corresponding constraint condition for the transmit given by (3) in [4], where Eb , N0 , ηd and rd are the energy
power can be written as [5] per bit, noise power, path-loss exponent, and distance between
∞
1 1 the transmitter and destination, respectively,
− fγd (γd )d γd
γ0 γ0 γd −η
∞ ∞ Eb (1 + Kd )rd d
1 1 C opra = ln γ D − ln
= 1 − Fγd (γ0 ) − fγd (γd )d γd = 1. (10) N0
γ0 γ0 γd
mdmd ∞
Kdjd djd
Substituting the CDF and PDF of γd over FTR fading channels − ln γ0 + ψ(jd + 1). (19)
into (10), we can derive (11) shown on the top of the next page. Γ(md ) jd !
jd =0
m ∞ j jd m ∞ j ∞
1 md d Kdd djd γ0 1 γ0 n md d Kdd djd jd −1 γd
exp − 2 − γ exp − d γd = 1 (11)
γ0 Γ(md ) j =0 jd ! 2σd n=0 n! 2σd2 Γ(md ) j =0 jd !jd !(2σ 2 )jd +1 γ0 d 2σd2
d d d

mdmd ∞
Kdjd djd 1 γ0 1 γ0
m ∞
1 md d Kdd djd
j
Γ jd + 1, 2 − 2 Γ jd , 2 = = 1 ⇒ γ0 = 1 (15)
Γ(md ) j !j ! γ0 2σd 2σd 2σd γ0 Γ(md ) jd !
jd =0 d d jd =0

=1
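The uniqueness argument around (12)-(14) suggests that the cutoff $\gamma_0$ can be found by a simple bisection on $(0, 1]$. The following is a minimal sketch of that idea, not the authors' implementation. It assumes the normalized series coefficients $w_{j} = \frac{m_d^{m_d}}{\Gamma(m_d)} \frac{K_d^{j_d} d_{j_d}}{j_d!}$ are already available as a truncated list `weights` (a hypothetical input that sums to one), and that `theta` stands for $2\sigma_d^2$. All function names are illustrative. The single-term case `weights = [1.0]` keeps only the $j_d = 0$ term, for which the SNR is exponentially distributed.

```python
import math

EULER_GAMMA = 0.5772156649015329


def exp_integral_e1(x, terms=200):
    """E1(x) = Gamma(0, x) from its power series (adequate for 0 < x up to about 10)."""
    total = -EULER_GAMMA - math.log(x)
    term = 1.0  # holds (-x)^k / k!
    for k in range(1, terms + 1):
        term *= -x / k
        total -= term / k
    return total


def upper_gamma(j, x):
    """Upper incomplete Gamma(j, x) for an integer j >= 0."""
    if j == 0:
        return exp_integral_e1(x)
    partial = sum(x ** n / math.factorial(n) for n in range(j))
    return math.factorial(j - 1) * math.exp(-x) * partial


def constraint_lhs(gamma0, theta, weights):
    """Left-hand side of (12); weights[j] is the assumed coefficient w_j."""
    x = gamma0 / theta
    total = 0.0
    for j, w in enumerate(weights):
        total += w / math.factorial(j) * (
            upper_gamma(j + 1, x) / gamma0 - upper_gamma(j, x) / theta
        )
    return total


def solve_cutoff(theta, weights, tol=1e-12):
    """Bisection for gamma0: the left-hand side of (12) decreases in gamma0."""
    lo, hi = 1e-9, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if constraint_lhs(mid, theta, weights) > 1.0:
            lo = mid  # root lies to the right
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    for theta in (1.0, 100.0, 1e6):
        print(theta, solve_cutoff(theta, [1.0]))
```

In this sketch the returned cutoff grows with `theta` and approaches one for large `theta`, which is the behavior stated in (15).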
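We can easily see that the slope of EC in high SNRs is unity with respect to $\ln \bar{\gamma}_D$, and the power offset of EC in high SNRs is independent of $\bar{\gamma}_D$, because when $\bar{\gamma}_D \to \infty$, $\gamma_0 \to 1$, which means that $\ln \gamma_0 \to 0$, and therefore, there is no impact of $\bar{\gamma}_D$ on the power offset in high SNRs.

IV. ERGODIC CAPACITY UNDER ORA

A. Exact EC Under ORA

In the ORA case, the transmitter cannot employ the channel state to adjust its transmit power instantaneously, and just uses a constant power, i.e., the average transmit power, to transmit the signal to the destination. From [3, Lemma 2], the EC under ORA is

$$\bar{C}_{ora} = \frac{m_d^{m_d}}{\Gamma(m_d)} \sum_{j_d=0}^{\infty} \frac{K_d^{j_d} d_{j_d}}{j_d!} \exp\left(\frac{1}{2\sigma_d^2}\right) \sum_{l=1}^{j_d+1} \left(2\sigma_d^2\right)^{-j_d-1+l}\, \Gamma\left(-j_d-1+l, \frac{1}{2\sigma_d^2}\right). \tag{20}$$

B. Asymptotic EC Under ORA

We can use $\lim_{x \to 0} \frac{\Gamma(s,x)}{x^{s}} = -\frac{1}{s}$ for $\mathrm{Re}(s) < 0$ and $\lim_{x \to 0} \Gamma(0, x) = -\ln x + \psi(1)$ to derive the AEC under ORA in high SNR

$$\bar{C}_{ora}^{\infty} = \frac{m_d^{m_d}}{\Gamma(m_d)} \sum_{j_d=0}^{\infty} \frac{K_d^{j_d} d_{j_d}}{j_d!} \left[\psi(j_d+1) + \ln\left(2\sigma_d^2\right)\right] = \ln\left(2\sigma_d^2\right) + \frac{m_d^{m_d}}{\Gamma(m_d)} \sum_{j_d=0}^{\infty} \frac{K_d^{j_d} d_{j_d}}{j_d!} \psi(j_d+1). \tag{21}$$

This result is the same as (18), because $\gamma_0 \to 1$ as $\bar{\gamma}_D \to \infty$ in (18).

Another way to derive the AEC is to use the moments of $\gamma_d$, where the $n$th moment of $\gamma_d$ is given by

$$E\{\gamma_d^{n}\} = \int_{0}^{\infty} x^{n} f_{\gamma_d}(x)\, dx = \frac{m_d^{m_d}}{\Gamma(m_d)} \sum_{j_d=0}^{\infty} \frac{K_d^{j_d} d_{j_d}}{j_d!\, j_d!} \Gamma(n+j_d+1) \left(2\sigma_d^2\right)^{n}. \tag{22}$$

The first derivative of the $n$th moment with respect to $n$ is

$$\frac{\partial E\{\gamma_d^{n}\}}{\partial n} = \frac{m_d^{m_d}}{\Gamma(m_d)} \sum_{j_d=0}^{\infty} \frac{K_d^{j_d} d_{j_d}}{j_d!\, j_d!} \left(2\sigma_d^2\right)^{n} \Gamma(n+j_d+1) \cdot \left[\ln\left(2\sigma_d^2\right) + \psi(j_d+n+1)\right]. \tag{23}$$

In view of [10, eq. (20)], the AEC can be derived by

$$\bar{C}_{ora}^{\infty} = \left.\frac{\partial E\{\gamma_d^{n}\}}{\partial n}\right|_{n=0} = \ln\left(2\sigma_d^2\right) + \frac{m_d^{m_d}}{\Gamma(m_d)} \sum_{j_d=0}^{\infty} \frac{K_d^{j_d} d_{j_d}}{j_d!} \psi(j_d+1), \tag{24}$$

which is the same as (21).

V. ERGODIC CAPACITY UNDER CHANNEL INVERSION

A. Exact EC Under CI

In the CI case, the transmit power is adjusted according to the channel state to maintain a constant SNR at the receiver, and the corresponding EC is given by (46) in [5]

$$\bar{C}_{ci} = \ln\left(1 + \frac{1}{E\{1/\gamma_d\}}\right), \tag{25}$$

where

$$E\left\{\frac{1}{\gamma_d}\right\} = \frac{m_d^{m_d}}{\Gamma(m_d)} \sum_{j_d=0}^{\infty} \frac{K_d^{j_d} d_{j_d}}{j_d!\, j_d!} \frac{\Gamma(j_d)}{2\sigma_d^2}. \tag{26}$$

When $d_0$ is not equal to zero or $K_d$ is not infinity, $E\{\frac{1}{\gamma_d}\}$ will become infinity, because $\Gamma(j_d) \to \infty$ for $j_d = 0$, and thus the EC under CI will be zero.

B. Exact EC Under TCI

We consider the TCI case where a cutoff level $\gamma_0$ is selected to achieve a specified outage probability. In this case, the EC is given by [8, eq. (12)]

$$\bar{C}_{tci} = \ln\left(1 + \frac{1}{\int_{\gamma_0}^{\infty} \frac{1}{x} f_{\gamma_d}(x)\, dx}\right) \bar{F}_{\gamma_d}(\gamma_0), \tag{27}$$

where $\bar{F}_{\gamma_d}(\cdot)$ is the complementary CDF of $\gamma_d$, and

$$\int_{\gamma_0}^{\infty} \frac{1}{x} f_{\gamma_d}(x)\, dx = \frac{m_d^{m_d}}{\Gamma(m_d)} \sum_{j_d=0}^{\infty} \frac{K_d^{j_d} d_{j_d}}{j_d!\, j_d!} \frac{1}{2\sigma_d^2} \Gamma\left(j_d, \frac{\gamma_0}{2\sigma_d^2}\right). \tag{28}$$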
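Returning briefly to the moment-based route of (22)-(24), the equivalence with (21) can be checked numerically. The sketch below is an illustration under stated assumptions, not part of the original letter: `weights[j]` is a hypothetical, already truncated list standing for $\frac{m_d^{m_d}}{\Gamma(m_d)} \frac{K_d^{j_d} d_{j_d}}{j_d!}$ (summing to one), `theta` stands for $2\sigma_d^2$, and the derivative at $n = 0$ is approximated by a central difference.

```python
import math

EULER_GAMMA = 0.5772156649015329


def digamma_int(n):
    """psi(n) for a positive integer n, using psi(n) = -gamma + H_{n-1}."""
    return -EULER_GAMMA + sum(1.0 / k for k in range(1, n))


def moment(n, theta, weights):
    """n-th moment as in (22); weights[j] is the assumed coefficient w_j."""
    return sum(
        w * math.gamma(n + j + 1) / math.factorial(j) * theta ** n
        for j, w in enumerate(weights)
    )


def aec_from_moments(theta, weights, h=1e-5):
    """Central-difference approximation of the derivative in (24) at n = 0."""
    return (moment(h, theta, weights) - moment(-h, theta, weights)) / (2.0 * h)


def aec_closed_form(theta, weights):
    """Right-hand side of (21) and (24), assuming the weights sum to one."""
    return math.log(theta) + sum(
        w * digamma_int(j + 1) for j, w in enumerate(weights)
    )


if __name__ == "__main__":
    example_weights = [0.5, 0.3, 0.2]  # hypothetical coefficients
    print(aec_from_moments(50.0, example_weights))
    print(aec_closed_form(50.0, example_weights))
```

The two printed values should agree to within the accuracy of the finite difference, which is what (24) asserts.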
The EC under TCI over the FTR fading channel is

$$\bar{C}_{tci} = \frac{m_d^{m_d}}{\Gamma(m_d)} \sum_{j_d=0}^{\infty} \frac{K_d^{j_d} d_{j_d}}{j_d!\, j_d!} \Gamma\left(j_d+1, \frac{\gamma_0}{2\sigma_d^2}\right) \cdot \ln\left(1 + \left(\frac{m_d^{m_d}}{\Gamma(m_d)} \sum_{j_d=0}^{\infty} \frac{K_d^{j_d} d_{j_d}}{j_d!\, j_d!} \frac{1}{2\sigma_d^2} \Gamma\left(j_d, \frac{\gamma_0}{2\sigma_d^2}\right)\right)^{-1}\right). \tag{29}$$

C. Asymptotic EC Under TCI

For $2\sigma_d^2 \to \infty$, it is easy to see that $\lim_{2\sigma_d^2 \to \infty} \bar{F}_{\gamma_d}(\gamma_d) \to 1$. Then, we consider the limit identity

$$\lim_{2\sigma_d^2 \to \infty} \frac{\Gamma\left(j_d, \frac{\gamma_0}{2\sigma_d^2}\right)}{2\sigma_d^2} = \begin{cases} \frac{\Gamma(j_d)}{2\sigma_d^2}, & j_d > 0; \\ \frac{-\ln(\gamma_0) + \ln(2\sigma_d^2) + \psi(1)}{2\sigma_d^2}, & j_d = 0, \end{cases} \tag{30}$$

where we truncate the Taylor expansion for $\Gamma(j_d, \frac{\gamma_0}{2\sigma_d^2})/(2\sigma_d^2)$ at $2\sigma_d^2 = \infty$ up to the first order term. Let $\xi = -\ln(\gamma_0) + \ln(2\sigma_d^2) + \psi(1)$, and finally, the AEC under TCI is given by

$$\bar{C}_{tci}^{\infty} = \ln\left(\frac{2\sigma_d^2\, \Gamma(m_d)}{m_d^{m_d} \left(d_0 \xi + \sum_{j_d=1}^{\infty} \frac{K_d^{j_d} d_{j_d} \Gamma(j_d)}{j_d!\, j_d!}\right)}\right) = \ln\left(2\sigma_d^2\right) + \ln\frac{\Gamma(m_d)}{m_d^{m_d}} - \ln\left(d_0 \xi + \sum_{j_d=1}^{\infty} \frac{K_d^{j_d} d_{j_d} \Gamma(j_d)}{j_d!\, j_d!}\right). \tag{31}$$

From (31), we can see that the EC under TCI is not a linear function of $\ln(2\sigma_d^2)$ or $\ln(\bar{\gamma}_D)$ in the high SNR region. However, the slope of (31) with respect to $\ln(2\sigma_d^2)$ changes very slowly in the high SNR region, as shown in Fig. 1 in the simulation section.

VI. SIMULATION

In this section, we use Monte-Carlo simulation to validate our derived closed-form expressions for the EC and AEC in the OPRA, ORA, CI, and TCI cases. In calculating the infinite summation terms in the PDF and CDF of $\gamma_d$ over FTR fading channels, we truncate the infinite summations to a finite number of terms, where the resulting truncation error can be evaluated by (6) in [4, Sec. II-B].

Fig. 1. Ergodic capacity versus $\bar{\gamma}_D$ for $K_d = 10$, $\Delta_d = 0.5$, $P_t = 0$ dB, and $r_d = 1$ m, where the circle (star, triangle, and square) symbols, solid lines, and dashed lines denote simulation, analytical, and asymptotic results under OPRA (ORA, CI, and TCI), respectively.

As shown in Fig. 1, the EC increases as $\bar{\gamma}_D$ increases, because of the improved channel state between the transmitter and receiver. It is also clear that the EC declines as $m_d$ decreases, due to heavier fading, which is reflected by the larger power offset (intercept on the horizontal axis). Moreover, the EC under OPRA is better than that under ORA, because the transmitter adjusts its transmit power according to the instantaneous channel state in the OPRA case, rather than just using a fixed power (the average power) to transmit the signal as in the ORA case. It is also worth noting that the EC in the OPRA and ORA cases converges in the high SNR region, due to the fact that $\gamma_0 \to 1$ as $\bar{\gamma}_D \to \infty$, which means that the transmit power under OPRA is close to the average power.

The EC under TCI, where the cutoff level $\gamma_0$ is 0.1, is the smallest except for the one under CI in the medium and high SNR regions, while the EC under TCI is larger than that under ORA in the low SNR region. It is interesting to note that the EC under CI is zero, and this can be explained by the fact that $\Gamma(j_d) \to \infty$ for $j_d = 0$ while $d_0$ is not equal to zero or $K_d$ is not infinity in (26).

From Fig. 1, our derived asymptotic results match the simulation and analytical results very well in the high SNR region, and the slope of the EC under OPRA and ORA is always unity with respect to $\ln \bar{\gamma}_D$, regardless of parameter settings.

REFERENCES

[1] J. M. Romero-Jerez, F. J. Lopez-Martinez, J. F. Paris, and A. J. Goldsmith, “The fluctuating two-ray fading model: Statistical characterization and performance analysis,” IEEE Trans. Wireless Commun., vol. 16, no. 7, pp. 4420–4432, Jul. 2017.
[2] L. Wang, N. Yang, M. Elkashlan, P. L. Yeoh, and J. Yuan, “Physical layer security of maximal ratio combining in two-wave with diffuse power fading channels,” IEEE Trans. Inf. Forensics Security, vol. 9, no. 2, pp. 247–258, Feb. 2014.
[3] J. Zhang, W. Zeng, X. Li, Q. Sun, and K. P. Peppas, “New results on the fluctuating two-ray model with arbitrary fading parameters and its applications,” IEEE Trans. Veh. Technol., vol. 67, no. 3, pp. 2766–2770, Mar. 2018.
[4] W. Zeng, J. Zhang, S. Chen, K. P. Peppas, and B. Ai, “Physical layer security over fluctuating two-ray fading channels,” IEEE Trans. Veh. Technol., vol. 67, no. 9, pp. 8949–8953, Sep. 2018, doi: 10.1109/TVT.2018.2842126.
[5] M.-S. Alouini and A. J. Goldsmith, “Capacity of Rayleigh fading channels under different adaptive transmission and diversity-combining techniques,” IEEE Trans. Veh. Technol., vol. 48, no. 4, pp. 1165–1181, Jul. 1999.
[6] A. Laourine, M.-S. Alouini, S. Affes, and A. Stephenne, “On the capacity of generalized-K fading channels,” IEEE Trans. Wireless Commun., vol. 7, no. 7, pp. 2441–2445, Jul. 2008.
[7] G. Pan, E. Ekici, and Q. Feng, “Capacity analysis of log-normal channels under various adaptive transmission schemes,” IEEE Commun. Lett., vol. 16, no. 3, pp. 346–348, Mar. 2012.
[8] A. J. Goldsmith and P. P. Varaiya, “Capacity of fading channel with channel side information,” IEEE Trans. Inf. Theory, vol. 43, no. 6, pp. 1986–1992, Nov. 1997.
[9] I. S. Gradshteyn and I. M. Ryzhik, Table of Integrals, Series, and Products, 7th ed. San Diego, CA, USA: Academic, 2007.
[10] F. Yilmaz, O. Kucur, and M.-S. Alouini, “A novel framework on exact average symbol error probabilities of multihop transmission over amplify-and-forward relay fading channels,” in Proc. 7th ISWCS, York, U.K., Sep. 2010, pp. 546–550.
Optimal Dynamic Capacity Allocation for High Throughput Satellite Communications Systems

Anargyros J. Roumeliotis, Charilaos I. Kourogiorgas, and Athanasios D. Panagopoulos, Senior Member, IEEE

Abstract: In this letter, a simple engineering scheme for achieving optimal dynamic capacity allocation in geostationary and medium Earth orbit multi-beam high throughput satellite systems is presented. Exploiting a smart gateway diversity setup and considering the users' requested and the gateways' offered capacities, we propose a capacity allocation scheme, theoretically optimal on the basis of Monge arrays, that minimizes both the system's capacity losses and the rate matching performance metric. This scheme has low complexity, since it only requires an appropriate sorting of the system's offered and requested capacities. Finally, simulation results confirm its optimality by showing that its performance is identical to that of the much more complicated and time-consuming exhaustive mechanism.

Index Terms: High throughput satellite systems, dynamic resource allocation, Monge arrays, optimization.

Manuscript received October 17, 2018; accepted November 7, 2018. Date of publication November 16, 2018; date of current version April 9, 2019. The associate editor coordinating the review of this paper and approving it for publication was C. Shen. (Corresponding author: Athanasios D. Panagopoulos.)
The authors are with the School of Electrical and Computer Engineering, National Technical University of Athens, 15780 Athens, Greece (e-mail: aroumeliot@mail.ntua.gr; harkour@mail.ntua.gr; thpanag@ece.ntua.gr).
Digital Object Identifier 10.1109/LWC.2018.2881693

I. INTRODUCTION

TOWARDS the upcoming 5G networks, the combination of huge traffic demand with higher users' quality of service (QoS) requirements and the lack of available spectrum is very challenging for the designers of wireless communication systems. The demanding wireless environment is also highlighted by Cisco, which reports that in 2021 Wi-Fi and mobile devices will account for 63% of IP traffic and the number of devices connected to IP networks will be three times as high as the global population [1].

Satellite networks constitute a crucial pillar to assist the 5G terrestrial infrastructure in meeting the aforementioned challenges [2]. Specifically, multi-beam high throughput satellite (HTS) systems, in which different user terminals or user equipment (UE) beams, also called UEs below, are served by different gateways (GWs), can achieve rates of the order of Terabit/s [3] and can satisfy the demanding high QoS users' requirements. These satellite systems use the Ka band (20/30 GHz) in the satellite-UE beam link, called the user link or downlink, and the Q/V band (40/50 GHz) in the gateway-satellite link, called the feeder link. The signal in these links is mainly deteriorated by atmospheric phenomena [4], including rain, clouds, gases and tropospheric turbulence. Spatial diversity is the most efficient fade mitigation technique [5]; it uses multiple propagation paths to increase the availability of the satellite network [4] and can be implemented either on the side of the feeder links (uplink transmission) or as downlink reception diversity [4].

The spatial diversity of multiple feeder links in a Smart Gateway Diversity (SGD) concept, for resource management in HTS systems, similar to this letter, has recently been applied in [6] and [7] considering the system's capacity losses. In [6] three different allocation algorithms, namely rate matching, load balance and fairness, are examined under a combinatorial solution framework regarding the number of time slots for the connection of GW-UE beam pairs. Moreover, in [7] a GW-UE allocation scheme based on the deferred acceptance algorithm from matching theory is proposed. Finally, for satellite downlinks, the authors of [8] apply the Lagrangian approach and present an optimal multi-beam power and spot-beam allocation.

In this letter we investigate the optimal dynamic allocation of capacity to the user data demands by Medium Earth Orbit (MEO) and Geostationary Orbit (GEO) HTS systems applying the SGD framework. Moreover, we consider the GWs' offered capacities, originating from the realistic spatial-temporal total atmospheric attenuation MEO and GEO stochastic dynamic channels presented in [9] and [10] respectively, and the UEs' requested capacities, and propose a very low complexity, optimal, scalable, dynamic resource allocation mechanism between the GWs and UEs that minimizes two different performance metrics, i.e., the important objective of the system's capacity losses and the rate matching cited in [6]. The optimality of the proposed resource allocation scheme follows theoretically from the fact that both performance metrics constitute Monge arrays [11], [12]. Finally, this optimality is confirmed in the numerical results by the identical performance of the proposed mechanism and the corresponding time-consuming exhaustive scheme.

For the rest of this letter, Section II presents the system model, the dynamic capacity allocation problems and the proposed optimal matching algorithm. In Section III, numerical results on the performance of the proposed matching algorithm are given, and in Section IV conclusions are drawn.

II. OPTIMAL DYNAMIC CAPACITY ALLOCATION MECHANISM

In this letter, we consider an HTS system with M GWs that can serve N UEs as presented in Fig. 1. The satellite is considered non-regenerative and in a frame period, containing a number of time-slots, the M-active time multiplexing SGD context is investigated. On a time-slot basis each UE can be served by one GW and each GW serves either only one UE, assuming M = N, or many UEs simultaneously in the case M < N, where frequency and time multiplexing in a frame period are considered. In the former case each $GW_k$, with $k = 1, \ldots, M$, serves $q_k = 1$ UE, while in the latter case $q_k > 1$ UEs are served simultaneously by $GW_k$ satisfying the
ROUMELIOTIS et al.: OPTIMAL DYNAMIC CAPACITY ALLOCATION FOR HIGH THROUGHPUT SATELLITE COMMUNICATIONS SYSTEMS 597
relation $\sum_{k=1}^{M} q_k = N$. Hence, the number of each $GW_k$ in the system "virtually" increases from one to $q_k$ by offering at the same time $OC'_k = OC_k / q_k$ capacity to UEs, where $OC_k$ in bps is the total offered capacity of $GW_k$. Therefore, also in the case of M < N, the problem of resource allocation is transformed into a problem of N GWs serving N UE beams. The scenario where satellite operators have to satisfy more UEs with fewer GWs is an important scenario for the design of future satellite systems.

Fig. 1. System Configuration.

The total offered capacity is determined by the Shannon formula, i.e., $OC_k = B_C \log_2(1 + \gamma_k)$, where $B_C$ is the carrier bandwidth and $\gamma_k^{-1} = CNIR_{up,k}^{-1} + CNIR_{dn}^{-1}$. In particular, $\gamma_k$ is the end-to-end or total carrier to noise plus interference ratio $CNIR_k$, including both the CNIR of feeder link k and the CNIR of the downlink, which is assumed to be an average value over each UE beam's coverage area and is considered invariant for all UEs since we focus on the feeder links of the SGD framework, similar to [6]. Finally, $CNIR_{up,k} = CNIR_{CS,k}\, 10^{-A_k/10}$, where $CNIR_{CS,k}$ is the CNIR in clear sky conditions for the feeder link k and $A_k$ is the total atmospheric attenuation of the same link. In this study the total atmospheric attenuation includes gaseous attenuation, scintillation, attenuation due to clouds and rain attenuation [9], [10].

Consequently, according to the prior analysis, the resource allocation problem is investigated in a system with an equal number, N, of UEs and GWs, where each $UE_j$ with $j = 1, \ldots, N$ requests capacity $RC_j$ in bps to satisfy its need for data and each $GW_i$ offers capacity $OC_i$ with $i = 1, \ldots, N$.

Our objective is to propose low complexity optimal GW-UE pairs, based on the OCs and RCs, in order to minimize both the total system's capacity losses, L, defined as:

$$L = \sum_{j=1}^{N} \max\left(RC_j - OC_j^{*}, 0\right), \tag{1}$$

and the rate matching, RM, defined as [6]:

$$RM = \sum_{j=1}^{N} \left|RC_j - OC_j^{*}\right|, \tag{2}$$

where $OC_j^{*}$ is the capacity offered to the $j$th UE. The aforementioned minimization problems can be formally written as in (3), constituting linear sum assignment problems (LSAPs), considering the constraints: (a) the $N \times N$ pairing array $\mathbf{X}^m = (x_{ij}^m)$ is binary and $x_{ij}^m = 1$ only for the $GW_i - UE_j$ pair, (b) $\sum_{j=1}^{N} x_{ij}^m = 1, \forall i$ and (c) $\sum_{i=1}^{N} x_{ij}^m = 1, \forall j$, where (b)/(c) declare that each GW/UE is paired with one UE/GW respectively, and the index m is L or RM, representing problems (1) or (2) respectively.

$$\min_{\mathbf{X}^m} \sum_{i=1}^{N} \sum_{j=1}^{N} c_{ij}^m x_{ij}^m, \quad \text{s.t. (a), (b), (c)}, \tag{3}$$

where the elements of the $N \times N$ array $\mathbf{C}^m = (c_{ij}^m)$ take the form (4), (5) for the problems (1), (2) respectively.

$$c_{ij}^{L} = \max\left(RC_j - OC_i, 0\right), \tag{4}$$

$$c_{ij}^{RM} = \left|RC_j - OC_i\right|. \tag{5}$$

Generally, LSAP problems have higher complexity than our scenario, which belongs to a special case in which the optimal pairing is easily extracted and known in advance, because the cost array $\mathbf{C}^m$, as shown in the analysis below, has the specific form of a Monge array. Specifically, its elements satisfy the Monge property in (6), described in (2.11) of [12], for all $i_1, i_2, j_1, j_2$ with $i_1 < i_2$ and $j_1 < j_2$, namely inequality (6) is valid for every $2 \times 2$ sub-array of $\mathbf{C}^m$.

$$c_{i_1 j_1}^m + c_{i_2 j_2}^m \le c_{i_1 j_2}^m + c_{i_2 j_1}^m. \tag{6}$$

To prove that $\mathbf{C}^m$ is a Monge array, based on (6), and thereby solve the problems in (3) optimally, we sort all OCs and RCs in ascending ordered lists as follows:

$$OC_{\sigma(1)} \le OC_{\sigma(2)} \le \cdots \le OC_{\sigma(N)}, \tag{7}$$

$$RC_{\lambda(1)} \le RC_{\lambda(2)} \le \cdots \le RC_{\lambda(N)}. \tag{8}$$

In (7), (8) $\sigma(\bullet)$ and $\lambda(\bullet)$ refer to mapping functions $\sigma, \lambda : \{1, \ldots, N\} \to \{1, \ldots, N\}$ that set the OCs and RCs in ascending ordered lists respectively. These can also be set in descending ordered lists without loss of generality. To prove that (7), (8) result in an optimal solution for the problems in (3), we construct the arrays $\mathbf{C}^{L}$, $\mathbf{C}^{RM}$ assuming that the rows $1, \ldots, N$ represent the $\sigma(1), \ldots, \sigma(N)$ GWs and the columns $1, \ldots, N$ represent the $\lambda(1), \ldots, \lambda(N)$ UEs, meaning that row i corresponds to $OC_{\sigma(i)}$ and column j to $RC_{\lambda(j)}$, i.e., $i \triangleq \sigma(i)$ and $j \triangleq \lambda(j)$ in the following analysis. In the Appendix we prove that these arrays are Monge arrays and their $p$th diagonal element corresponds to the $\sigma(p)$ GW and the $\lambda(p)$ UE. Hence, considering [12, Corollary 2.3], which presents the identical permutation as an optimal pairing in the assignment problem with a Monge cost array, the optimal $(GW_{\sigma(p)}, UE_{\lambda(p)})$ pairing with $p = 1, \ldots, N$, originating from (7), (8) with complexity of $O(N \log N)$, minimizes both the system's capacity losses and the rate matching, i.e., $\mathbf{X}^{L} \triangleq \mathbf{X}^{RM}$.

Finally, the sorting mechanism can be easily applied in a simple reconfigurable satellite segment, avoiding complicated optimization-based resource allocation methods.

III. SIMULATION RESULTS AND DISCUSSION

For the MEO satellite scenario we assume 2 GWs in Nemea in Greece and for the GEO satellite scenario one in Nemea and another in Harwell in the U.K. For the MEO scenario, an 8-MEO
constellation at the equatorial plane is considered and the time series of total atmospheric attenuation are derived using the multi-dimensional synthesizer presented in [9]. The separation distance between the 2 GWs of the MEO scenarios is 20 km in the North-South direction. For the scenario of the GEO system, the satellite is at 9° East and the total atmospheric attenuation time series are derived by employing multi-dimensional Stochastic Differential Equations [10]. In both cases, the operating frequencies are in the Q/V bands. The bandwidth is $B_C = 1$ GHz, the uplink CNIR in clear sky conditions, $CNIR_{CS,up}$, is 25 dB for both feeder links, while the corresponding CNIR for the downlink, $CNIR_{CS,dn}$, is 13 dB. We study two different scenarios: a) with an equal number of GWs and UEs, i.e., 2 UEs in the system, and b) with 8 UEs in the system where each GW simultaneously serves 4 UEs. The links' propagation conditions are considered constant during the frame period of 1 s, and the RCs in the same period are drawn from a uniform distribution, as in [6], in the range $(0, (B_C/q)\log_2(1 + CNIR_{total\_max}))$ bps, with the upper bound related to $CNIR_{total\_max}^{-1} = CNIR_{CS,up}^{-1} + CNIR_{CS,dn}^{-1}$, where q is the number of UEs simultaneously served by each GW; both metrics in (1), (2) are estimated over this time duration. Finally, (1), (2) are investigated over a year in order to capture the long-term statistics of the total attenuation. Hence, the evaluation of the proposed dynamic allocation algorithm using realistic spatial-temporal total atmospheric attenuation channels is another important contribution of this letter.

The complementary cumulative distribution functions (CCDFs) of the system's capacity losses and rate matching for both scenarios are presented in Figs. 2(a), 3(a) for the MEO satellite and in Figs. 2(b), 3(b) for the GEO satellite. It is clear that the same Sorting of OCs and RCs and the resulting GW-UE pairing proposed in Section II have identical performance, as proven in the Appendix, to the more complicated Exhaustive technique, which examines all the possible system's capacity losses and rate matchings in order to find the minimum ones.

Fig. 2. CCDF of losses as the number of UEs increases for Sorting and Exhaustive mechanisms in: (a) MEO and (b) GEO HTS systems.

Fig. 3. CCDF of rate matching as the number of UEs increases for Sorting and Exhaustive mechanisms in: (a) MEO and (b) GEO HTS systems.

IV. CONCLUSION

In this letter an optimal dynamic capacity allocation scheme is proposed, based on Monge arrays, to minimize both the losses and the rate matching performance metrics. For the GW-UE pairing we consider the same sorting of OCs and RCs in scenarios with future MEO and GEO HTS systems. For the OCs we use realistic spatial-temporal total atmospheric attenuation channels. The low complexity $O(N \log N)$ of the proposed algorithm and its optimality, as proved theoretically and shown by simulations, make it a very promising general, fast, scalable and optimal solution for dynamic satellite environments considering multi-beam HTS systems. A subject of future work, in the scenarios where GWs are fewer than UEs, is the investigation of the appropriate selection of which GW is to be divided and also the determination of the number of simultaneously served UEs.
APPENDIX

Firstly, we prove that $\mathbf{C}^{L}$ is a Monge array, and to do that we show that its elements based on (4) satisfy inequality (6) for m = L. We set w/v equal to the right/left hand sides of (6) respectively for simpler mathematical notation. We prove (6) for arbitrary $i_1, i_2, j_1, j_2$ with $i_1 < i_2$ and $j_1 < j_2$, hence the result holds for all $i_1 < i_2$ and $j_1 < j_2$. Considering the specific construction of $\mathbf{C}^{L}$ as described in Section II and that $i \triangleq \sigma(i)$, $j \triangleq \lambda(j)$, we have for $i_1 < i_2$ that $OC_{\sigma(i_1)} \le OC_{\sigma(i_2)}$ and for $j_1 < j_2$ that $RC_{\lambda(j_1)} \le RC_{\lambda(j_2)}$. The inequalities (9)-(12) then follow easily.

$$RC_{\lambda(j_1)} - OC_{\sigma(i_1)} \ge RC_{\lambda(j_1)} - OC_{\sigma(i_2)}, \tag{9}$$

$$RC_{\lambda(j_2)} - OC_{\sigma(i_1)} \ge RC_{\lambda(j_2)} - OC_{\sigma(i_2)}, \tag{10}$$

$$RC_{\lambda(j_2)} - OC_{\sigma(i_1)} \ge RC_{\lambda(j_1)} - OC_{\sigma(i_1)}, \tag{11}$$

$$RC_{\lambda(j_2)} - OC_{\sigma(i_2)} \ge RC_{\lambda(j_1)} - OC_{\sigma(i_2)}. \tag{12}$$

Regarding (4), (6) and setting $d_{\sigma(i),\lambda(j)} = RC_{\lambda(j)} - OC_{\sigma(i)}$ for a more compact analysis we have:

$$v = \max\left(d_{\sigma(i_1),\lambda(j_1)}, 0\right) + \max\left(d_{\sigma(i_2),\lambda(j_2)}, 0\right), \tag{13}$$

$$w = \max\left(d_{\sigma(i_1),\lambda(j_2)}, 0\right) + \max\left(d_{\sigma(i_2),\lambda(j_1)}, 0\right). \tag{14}$$

We prove that $w \ge v$ according to (6), starting without loss of generality from (14) and examining all the possible cases.

Case I: $d_{\sigma(i_1),\lambda(j_2)} > 0$ (15) and $d_{\sigma(i_2),\lambda(j_1)} > 0$ (16)
From (16) and (9) we have $d_{\sigma(i_1),\lambda(j_1)} > 0$ (17) and from (16) and (12) that $d_{\sigma(i_2),\lambda(j_2)} > 0$ (18). Thus, substituting (15), (16) in (14) and (17), (18) in (13), we conclude that $w = v$ from the commutative property.

Case II: $d_{\sigma(i_1),\lambda(j_2)} > 0$ (15) and $d_{\sigma(i_2),\lambda(j_1)} \le 0$ (19)
Substituting (15), (19) in (14) we have $w = d_{\sigma(i_1),\lambda(j_2)}$ (20). Since (15), (19) do not determine (13), the possible values of (13) in the four different cases of v, and the corresponding comparisons with (20), are:
1) From (17), (18) and the commutative property we have $v = d_{\sigma(i_1),\lambda(j_2)} + d_{\sigma(i_2),\lambda(j_1)}$, hence $v \le w$ due to (15) and (19).
2) From (18) and $d_{\sigma(i_1),\lambda(j_1)} \le 0$ (21) we have $v = d_{\sigma(i_2),\lambda(j_2)}$, hence $v \le w$ due to (10).
3) From (17) and $d_{\sigma(i_2),\lambda(j_2)} \le 0$ (22) we have $v = d_{\sigma(i_1),\lambda(j_1)}$, hence $v \le w$ due to (11).
4) From (21) and (22) we have $v = 0 < w$ due to (15).

Case III: $d_{\sigma(i_1),\lambda(j_2)} \le 0$ (23)
From (23) and (11) we have (21), from (23) and (10) we have (22) and from (21) and (9) we have (19). Substituting (21), (22) in (13) and (19), (23) in (14) we have $w = v = 0$.

In all the above cases we showed that $v \le w$ is valid for arbitrary $i_1, i_2, j_1, j_2$ with $i_1 < i_2$ and $j_1 < j_2$, hence the same holds for all $i_1, i_2, j_1, j_2$ with $i_1 < i_2$ and $j_1 < j_2$. We conclude that (6) is valid for m = L, hence $\mathbf{C}^{L}$ is a Monge array.

Now, we prove that $\mathbf{C}^{RM}$ is a Monge array by showing that its elements based on (5) satisfy inequality (6) for m = RM. In the following analysis we use the relations (24), (25):

$$c_{ij}^{RM} = \left|RC_j - OC_i\right| = 2 \max\left(RC_j, OC_i\right) - RC_j - OC_i, \tag{24}$$

$$\max\left(RC_j, OC_i\right) = \max\left(RC_j - OC_i, 0\right) + OC_i. \tag{25}$$

Following the same approach as before and the same sorting as in (7), (8), the array $\mathbf{C}^{RM}$ is a Monge array if and only if the inequality (6) is satisfied for all $i_1, i_2, j_1, j_2$ with $i_1 < i_2$ and $j_1 < j_2$. Substituting (24) in (6) we obtain:

$$\max\left(RC_{j_1}, OC_{i_1}\right) + \max\left(RC_{j_2}, OC_{i_2}\right) \le \max\left(RC_{j_2}, OC_{i_1}\right) + \max\left(RC_{j_1}, OC_{i_2}\right). \tag{26}$$

Continuing our analysis and substituting (25) in (26) we have:

$$\max\left(RC_{j_1} - OC_{i_1}, 0\right) + \max\left(RC_{j_2} - OC_{i_2}, 0\right) \le \max\left(RC_{j_2} - OC_{i_1}, 0\right) + \max\left(RC_{j_1} - OC_{i_2}, 0\right). \tag{27}$$

The inequality (27) is true because it is the same as (6) for m = L, which was proven in the analysis for the $\mathbf{C}^{L}$ Monge array considering the sorting in (7), (8). Since (27) is true, inequality (6) for m = RM is also valid for all $i_1, i_2, j_1, j_2$ with $i_1 < i_2$ and $j_1 < j_2$, and thus $\mathbf{C}^{RM}$ is also a Monge array.

ACKNOWLEDGMENT

The postdoctoral research of Charilaos Kourogiorgas was implemented with a scholarship from IKY, funded by the Program “Strengthening Postdoctoral Researchers /Researchers” from the resources of the Human Resources Development, Education and Lifelong Learning Program with priority axes 6,8,9 and co-funded by European Social Fund-ESF and national Greek funds.

REFERENCES

[1] “Cisco visual networking index: Forecast and methodology, 2016–2021,” San Jose, CA, USA, Cisco, White Paper, Jun. 2017.
[2] B. Evans, O. Onireti, T. Spathopoulos, and M. Ali Imran, “The role of satellites in 5G,” in Proc. 23rd Eur. Signal Process. Conf. (EUSIPCO), Nice, France, 2015, pp. 2756–2760.
[3] N. Jeannin et al., “Smart gateways for terabit/s satellite,” Int. J. Satell. Commun. Netw., vol. 32, no. 2, pp. 93–106, 2014.
[4] A. D. Panagopoulos, P.-D. M. Arapoglou, and P. G. Cottis, “Satellite communications at Ku, Ka, and V bands: Propagation impairments and mitigation techniques,” IEEE Commun. Surveys Tuts., vol. 6, no. 3, pp. 2–14, 3rd Quart., 2004.
[5] A. D. Panagopoulos, P.-D. Arapoglou, J. D. Kanellopoulos, and P. G. Cottis, “Long-term rain attenuation probability and site diversity gain prediction formulas,” IEEE Trans. Antennas Propag., vol. 53, no. 7, pp. 2307–2313, Jul. 2005.
[6] A. Kyrgiazos, B. G. Evans, and P. Thompson, “On the gateway diversity for high throughput broadband satellite systems,” IEEE Trans. Wireless
[7] A. J. Roumeliotis, C. I. Kourogiorgas, A. Kyrgiazos, and A. D. Panagopoulos, “Flexible capacity allocation in smart gateway diversity satellite systems using matching theory,” in Proc. 9th Int. Conf. Wireless Satell. Syst. (WISATS), Oxford, U.K., 2017, pp. 195–204.
[8] J. P. Choi and V. W. S. Chan, “Optimum power and beam allocation based on traffic demands and channel conditions over satellite downlinks,” IEEE Trans. Wireless Commun., vol. 4, no. 6, pp. 2983–2993, Nov. 2005.
[9] C. I. Kourogiorgas et al., “Capacity statistics evaluation for next generation broadband MEO satellite systems,” IEEE Trans. Aerosp. Electron. Syst., vol. 53, no. 5, pp. 2344–2358, Oct. 2017.
[10] G. A. Karagiannis, A. D. Panagopoulos, and J. D. Kanellopoulos, “Multidimensional rain attenuation stochastic dynamic modeling: Application to earth–space diversity systems,” IEEE Trans. Antennas Propag., vol. 60, no. 11, pp. 5400–5411, Nov. 2012.
[11] R. E. Burkard, B. Klinz, and R. Rudolf, “Perspectives of Monge properties in optimization,” Discrete Appl. Math., vol. 70, no. 2, pp. 95–161, Sep. 1996.
[12] P. Brucker, Scheduling Algorithms, 5th ed. Berlin, Germany: Springer-Verlag, 2007.
Learning-Based Wireless Powered Secure Transmission

Dongxuan He, Chenxi Liu, Member, IEEE, Hua Wang, Member, IEEE, and Tony Q. S. Quek, Fellow, IEEE

Abstract: In this letter, we propose a learning-based wireless powered secure transmission, in which a source utilizes energy harvested from a power beacon to communicate with a legitimate receiver, in the presence of an eavesdropper. In order to confuse the eavesdropper, we assume that the source transmits the artificial noise signals, in addition to the information signals. We first characterize the effective secrecy throughput of our system, showing its dependence on the transmission parameters, including the fraction of time allocated for wireless power transfer, the fraction of power allocated to the information signals, as well as the wiretap code rates. We then leverage the deep feedforward neural network to learn how the optimal transmission parameters that jointly maximize the effective secrecy throughput can be obtained. Through numerical results, we demonstrate that our learning-based scheme can achieve almost the same secrecy performance as the optimal solution obtained from the exhaustive search, while requiring much less computational complexity.

Index Terms: Wireless power transfer, artificial noise, physical layer security, deep feedforward neural network.

constraint, considering that only the imperfect channel state information (CSI) are available at the legitimate system. In [4], the use of artificial noise (AN) was considered to enhance physical layer security in wireless powered systems. The joint design of the power allocation for the AN signals as well as the time fraction between the power transfer (PT) phase and the information transfer (IT) phase under the harvest-then-transmit protocol was examined in [5]. However, these works often require complicated algorithms, which may lead to large latency impact in real-world deployments.

In this letter, we propose a learning-based wireless powered secure transmission, in which we exploit the potential of machine learning in rapidly configuring the wireless powered systems so as to maximize the effective secrecy throughput (EST). In fact, machine learning has been shown to be effective in many wireless network applications with secrecy constraints, such as anti-jamming [6] and secure transmit antenna selection [7]. Compared to these works (e.g., [6] and [7]), our
I. I NTRODUCTION contributions are summarized as follows. First, we focus on
IRELESS power transfer has been envisaged as a the scenario of the wireless powered secure transmission, and
W promising solution to fulfill the ever-increasing demand
for energy in the fifth-generation (5G) and beyond wireless
characterize the impacts of various transmission parameters on
the EST. Second, we show how the deep feedforward neural
networks, since it can harvest energy from the radio frequency network (DFNN) can be utilized to efficiently determine the
signals without relying on the location or the climate [1]. On optimal transmission parameters that maximize the EST.
the other hand, security is another important issue in wireless
communications, due to the broadcasting nature of wire-
II. S YSTEM M ODEL AND P ROBLEM F ORMULATION
less medium. In particular, the decentralized modern wireless
networks have introduced significant challenges to traditional We consider a wireless powered secure communication
key-based cryptographic techniques, such as key generation, system, which consists of a power beacon (PB), a source
distribution, and management. To tackle this problem, physi- (Alice), a legitimate receiver (Bob), and an eavesdropper
cal layer security [2] has been proposed as an alternative for (Eve). In this system, Alice utilizes energy collected from PB
cryptographic techniques, since it can achieve the information- to communicate with Bob in the presence of Eve. We consider
theoretic secure communications without using secret keys. that Alice is equipped with Na antennas, while PB, Bob, and
Recently, wireless powered secure communication has also Eve are all equipped with a single antenna. We assume that
been receiving an increasing research attention [3]–[5]. In [3], the CSI between PB and Alice and the CSI between Alice
robust beamforming schemes were proposed to maximize the and Bob are perfectly known at PB and Alice, respectively,
throughput of wireless powered systems with the secrecy while only statistical information on Eve’s channel is available
to the legitimate nodes. We also assume that all the chan-
Manuscript received October 3, 2018; accepted November 14, 2018. Date nels are subject to identical and independent distributed (i.i.d)
of publication November 19, 2018; date of current version April 9, 2019. This Rayleigh fading. We denote hij and dij as the channel and dis-
work was supported in part by the China Scholarship Council ([2017] 3109)
and in part by the National Natural Science Foundation of China under
tance between node i, i ∈ {p, a}, and node j, j ∈ {a, b, e},
Grant 61471037, Grant 61771048, and Grant 61201181. The associate editor respectively.
coordinating the review of this paper and approving it for publication was
L. P. Natarajan. (Corresponding author: Chenxi Liu.)
D. He and H. Wang are with the School of Information and A. Wireless Powered Secure Transmission
Electronics, Beijing Institute of Technology, Beijing 100081, China
(e-mail: hdxbit@bit.edu.cn; wanghua@bit.edu.cn). We now detail the wireless powered secure transmission
C. Liu and T. Q. S. Quek are with the Information Systems Technology considered in this letter. The transmission between Alice and
and Design Pillar, Singapore University of Technology and Design, Singapore
487372 (e-mail: chenxi_liu@sutd.edu.sg; tonyquek@sutd.edu.sg). Bob consists of two phases, namely, the PT phase and the IT
Digital Object Identifier 10.1109/LWC.2018.2881976 phase. In the PT phase, Alice harvests energy from PB, and
then utilizes all the energy harvested in the PT phase to communicate with Bob in the IT phase. We denote $\tau$ as the fraction of time allocated to the PT phase. As such, the available transmit power at Alice in the IT phase can be expressed as

$$P_a = \phi \hat{P}, \tag{1}$$

where $\phi = \frac{\tau}{1-\tau}$, $\hat{P} = \xi P d_{pa}^{-\eta} |h_{pa}|^2$, $0 < \xi < 1$ denotes the energy harvest efficiency at Alice, $P$ denotes the transmit power at PB, and $\eta$ denotes the path loss exponent.

In the IT phase, we assume that Alice transmits the AN signals, in addition to the information signals, in order to confuse the eavesdropper. As such, we express the transmitted signal at Alice as

$$x_s = w t_s + G t_{san}, \tag{2}$$

where $w$ and $G$ denote the $N_a \times 1$ beamforming vector used to transmit the information signal $t_s$ and the $N_a \times (N_a - 1)$ beamforming matrix used to transmit the AN signal $t_{san}$, respectively. In order to degrade the quality of the received signal at Eve, while maintaining the quality of the received signal at Bob, we choose $w = h_{ab}^{\dagger} / \|h_{ab}\|$ and $G$ as the projection matrix onto the null space of $h_{ab}$, respectively [8], [9]. As such, we have $h_{ab} G = 0$, and $w$ and the columns of $G$ form an orthonormal basis. In addition, we denote $\alpha$ as the fraction of power allocated to the information signals. As such, we have $E[|t_s|^2] = \alpha P_a$ and $E[t_{san} t_{san}^{\dagger}] = \frac{(1-\alpha) P_a}{N_a - 1} I_{N_a - 1}$.

Based on (1) and (2), we express the received signal-to-interference-plus-noise ratios (SINRs) at Bob and Eve in the IT phase, respectively, as

$$\gamma_b = \alpha \phi \bar{\gamma}_b |h_{ab}|^2, \tag{3}$$

$$\gamma_e = \frac{\alpha \phi \bar{\gamma}_e |h_{ae} w|^2}{\frac{1-\alpha}{N_a - 1} \phi \bar{\gamma}_e \|h_{ae} G\|^2 + 1}, \tag{4}$$

where $\bar{\gamma}_b = \hat{P} d_{ab}^{-\eta} / \sigma_b^2$, $\bar{\gamma}_e = \hat{P} d_{ae}^{-\eta} / \sigma_e^2$, and $\sigma_b^2$ and $\sigma_e^2$ denote the variance of the additive white Gaussian noise at Bob and Eve, respectively. According to (3) and (4), the secrecy capacity of our system is expressed as

$$C_s = \{C_b - C_e\}^{+}, \tag{5}$$

where $\{\cdot\}^{+}$ denotes $\max\{0, \cdot\}$, and $C_b = \log_2(1 + \gamma_b)$ and $C_e = \log_2(1 + \gamma_e)$ denote the capacity of Bob’s channel and the capacity of Eve’s channel, respectively.

B. Problem Formulation

In this letter, we employ the well-known Wyner’s wiretap code [2] with the parameter pair $(R_b, R_e)$ to perform secure transmissions, where $R_b$ denotes the transmission rate of the wiretap code and $R_e$ denotes the redundancy rate of the wiretap code, representing the cost of preventing eavesdropping. When $R_b > C_b$, the transmitted signal from Alice cannot be reliably decoded at Bob, and a transmission outage occurs. When $R_e \le C_e$, the information on the transmitted signal is leaked to Eve, and a secrecy outage occurs. Since we assume that $h_{ab}$ is perfectly known at Alice, we can set $R_b = C_b$. As such, with the aid of [8], the secrecy outage probability of our system can be derived as

$$P_{so}(\tau, \alpha, R_e) = \Pr(\gamma_e > \kappa_e) = \left(1 + \frac{(1-\alpha)\kappa_e}{\alpha (N_a - 1)}\right)^{-(N_a - 1)} e^{-\frac{\kappa_e}{\alpha \phi \bar{\gamma}_e}}, \tag{6}$$

where $\kappa_e = 2^{R_e} - 1$.

In order to evaluate the secrecy performance of our system, we adopt the modified EST [8], [9] as the performance metric, given by

$$T_s(\tau, \alpha, R_e) = (1 - \tau)(R_b - R_e)(1 - P_{so}(\tau, \alpha, R_e)). \tag{7}$$

We can see that the EST in (7) is a function of the fraction of the time allocated to the PT phase $\tau$, the fraction of power allocated to the information signals $\alpha$, as well as the redundancy rate of the wiretap code $R_e$. The key goal of this letter is to find the optimal transmission parameter tuple $(\tau^*, \alpha^*, R_e^*)$ that achieves the maximum EST $T_s^*$ per transmission block. Mathematically, this problem can be formulated as

$$\max_{\tau, \alpha, R_e} \; T_s(\tau, \alpha, R_e), \tag{8a}$$

$$\text{s.t.} \quad 0 < \tau < 1, \quad 0 < \alpha \le 1, \quad 0 < R_e \le R_b. \tag{8b}$$

However, the maximization problem in (8) is non-convex. Therefore, it is difficult to analytically obtain the optimal transmission parameter tuple $(\tau^*, \alpha^*, R_e^*)$. Traditionally, (8) can be solved numerically via an exhaustive search, but this requires a high computational complexity. To tackle this problem, we propose a learning-based scheme that is capable of solving (8) in a far more efficient way compared to the exhaustive search.¹

III. PROPOSED LEARNING-BASED SCHEME

In this section, we present details on our proposed learning-based scheme that solves (8) efficiently.² Specifically, we utilize the DFNN to learn the nonlinear mapping from $h_{ab}$ to the optimal transmission parameter tuple $(\tau^*, \alpha^*, R_e^*)$ that achieves the maximum EST, which can be expressed as

$$(\tau^*, \alpha^*, R_e^*) = f^*(h_{ab}). \tag{9}$$

We note that the DFNN has been shown to be effective in approximating any measurable function to any desired degree of accuracy [11], and thus is an appropriate method to solve (8).³

¹ Note that in practice the choice of transmission parameters is finite. We will show in Section IV that, even in such scenarios, our proposed learning-based scheme still outperforms the exhaustive search in terms of the computational complexity.
² Besides wireless powered communication systems, our learning-based scheme can also be applied to enhance the security of various communication systems. For example, in secure communication systems with the cooperative jammers (as in [10]), the proposed learning-based scheme can be utilized to determine the optimal power allocation at the cooperative jammers that maximizes the secrecy rate.
³ We note that the curve fitting method can also be applied to approximate the mapping from $h_{ab}$ to $(\tau^*, \alpha^*, R_e^*)$. However, it may not achieve the same degree of accuracy as our proposed learning-based scheme.
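The secrecy outage probability in (6), the EST in (7), and the exhaustive search over the feasible set (8b) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the common grid step, and the use of linear-scale values for the average SNRs (`gamma_b_bar`, `gamma_e_bar`) and the channel gain (`h_ab_sq`) are assumptions made here for the example.

```python
import math


def secrecy_outage(tau, alpha, r_e, n_a, gamma_e_bar):
    """Secrecy outage probability P_so from (6); assumes n_a >= 2."""
    phi = tau / (1.0 - tau)            # phi = tau / (1 - tau), see (1)
    kappa_e = 2.0 ** r_e - 1.0         # kappa_e = 2^{R_e} - 1
    base = 1.0 + (1.0 - alpha) * kappa_e / (alpha * (n_a - 1))
    return base ** (-(n_a - 1)) * math.exp(-kappa_e / (alpha * phi * gamma_e_bar))


def bob_rate(tau, alpha, gamma_b_bar, h_ab_sq):
    """R_b = C_b = log2(1 + gamma_b), with gamma_b taken from (3)."""
    phi = tau / (1.0 - tau)
    return math.log2(1.0 + alpha * phi * gamma_b_bar * h_ab_sq)


def est(tau, alpha, r_e, n_a, gamma_b_bar, gamma_e_bar, h_ab_sq):
    """Effective secrecy throughput T_s from (7); returns 0 outside (8b)."""
    if not (0.0 < tau < 1.0 and 0.0 < alpha <= 1.0):
        return 0.0
    r_b = bob_rate(tau, alpha, gamma_b_bar, h_ab_sq)
    if not (0.0 < r_e <= r_b):
        return 0.0
    p_so = secrecy_outage(tau, alpha, r_e, n_a, gamma_e_bar)
    return (1.0 - tau) * (r_b - r_e) * (1.0 - p_so)


def exhaustive_search(n_a, gamma_b_bar, gamma_e_bar, h_ab_sq, step=0.05):
    """Grid search for (8) using one hypothetical step size for all parameters."""
    n = round(1.0 / step)
    best_val, best_params = 0.0, None
    for i in range(1, n):              # tau on a grid inside (0, 1)
        tau = i / n
        for k in range(1, n + 1):      # alpha on a grid inside (0, 1]
            alpha = k / n
            r_b = bob_rate(tau, alpha, gamma_b_bar, h_ab_sq)
            for q in range(1, int(r_b / step) + 1):   # R_e on a grid inside (0, R_b]
                r_e = q * step
                val = est(tau, alpha, r_e, n_a, gamma_b_bar, gamma_e_bar, h_ab_sq)
                if val > best_val:
                    best_val, best_params = val, (tau, alpha, r_e)
    return best_val, best_params


if __name__ == "__main__":
    # 10 dB average SNRs correspond to 10.0 on a linear scale.
    value, params = exhaustive_search(n_a=4, gamma_b_bar=10.0, gamma_e_bar=10.0, h_ab_sq=1.0)
    print(value, params)
```

The three nested loops make the cost of the search grow with the product of the grid sizes, which is the motivation the letter gives for replacing the search with a learned mapping.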
Fig. 1. The structure of the DFNN.

A. Deep Feedforward Neural Network

As shown in Fig. 1, our adopted DFNN consists of three layers, i.e., the input layer, the hidden layer, and the output layer. Specifically, we choose $|h_{ab}|$ as the input of our DFNN since the input of the neural network must be a real-valued scalar or vector [12]. We also choose $(\tau^{\circ}, \alpha^{\circ}, R_e^{\circ})$ as the output of our DFNN. We note that there are typically multiple layers in the hidden layer, and the output of one layer is the input of the subsequent layer. As such, the output of the $i$-th layer in the hidden layer can be expressed as

$$x_i = g(W_i x_{i-1} + b_i), \tag{10}$$

where $g(z)$ denotes the activation function of the hidden layer, and $W_i$ and $b_i$ denote the weight matrix and the bias of the $i$-th layer, respectively. In this letter, we choose the rectified linear unit [13] as the activation function, given by $g(z) = \max\{0, z\}$.

We denote $l$ as the depth of the hidden layer and $W = [W_1, W_2, \ldots, W_l]$. Then, the mapping relationship between $|h_{ab}|$ and $(\tau^{\circ}, \alpha^{\circ}, R_e^{\circ})$ can be expressed as

$$(\tau^{\circ}, \alpha^{\circ}, R_e^{\circ}) = f(|h_{ab}|, W). \tag{11}$$

Note that $f(|h_{ab}|, W)$ in (11) is different from $f^*(h_{ab})$ in (9). In this letter, the target of our DFNN is to train $W$ such that $f(|h_{ab}|, W)$ approaches $f^*(h_{ab})$. To this end, we adopt the mean squared error (MSE) as the learning performance metric [12]. As such, a well-trained $W$ should satisfy the following constraint

$$\mathcal{J}(W) = E_{h_{ab}}\left[ |f(|h_{ab}|, W) - f^*(h_{ab})|^2 \right] \le \epsilon, \tag{12}$$

where $\epsilon$ denotes the target training error.

B. Training of DFNN

We now detail how our DFNN can be trained to obtain the optimal transmission parameter tuple $(\tau^*, \alpha^*, R_e^*)$ that maximizes the EST of our system. Specifically, the training process of DFNN is performed through two steps: 1) Generate the training set and 2) Use the generated training set to train the DFNN such that $W$ satisfying (12) is obtained.

1) Training Set Generation: In this step, we generate the training set of $M$ training examples. Each training example consists of the channel gain and the corresponding optimal transmission parameter tuple, given by $S_m = \{|h_{ab,m}| \to (\tau_m^*, \alpha_m^*, R_{e,m}^*)\}$, where $m = 1, \ldots, M$. We note that $(\tau_m^*, \alpha_m^*, R_{e,m}^*)$ for each $h_{ab,m}$ is obtained through the exhaustive search.

2) Training of W: In this step, we use the generated training set $S = \{S_1, S_2, \ldots, S_M\}$ to train the DFNN such that the weight matrix $W$ satisfies the constraint in (12). To this end, we adopt the Levenberg-Marquardt method [14], which is suitable for training the neural network when the learning performance metric is the sum of squares of nonlinear functions, to iteratively update $W$. As such, $W$ in the $(k+1)$-th update can be expressed as

$$W^{k+1} = W^{k} - \frac{1}{2\mu_k} \nabla F(W^{k}), \tag{13}$$

where $\mu_k$ is the training parameter and

$$\nabla F(W^{k}) = 2 J(W^{k}) e(W^{k}). \tag{14}$$

In (14), $J(\cdot)$ denotes the Jacobian matrix and $e(W^{k}) = [e_1(W^{k}), e_2(W^{k}), \ldots, e_M(W^{k})]$, where $e_m(W^{k}) = |f(|h_{ab,m}|, W^{k}) - f^*(h_{ab,m})|$.

We keep updating $W$ until the constraint in (12) is satisfied. Then, we can utilize the trained DFNN to determine the optimal transmission parameters that maximize the EST. Specifically, we use a new channel gain $|h_{ab}|$ as the input of the trained DFNN, and the optimal transmission parameter tuple $(\tau^*, \alpha^*, R_e^*)$ that maximizes the EST can then be directly obtained from the output of the DFNN.

IV. NUMERICAL RESULTS

In this section, we present numerical results to validate the effectiveness of our proposed learning-based scheme. Specifically, we first examine the impact of transmission parameters (i.e., $\tau$, $\alpha$, and $R_e$) on the EST of our system. We then compare the secrecy performance achieved by our proposed learning-based scheme with that of the optimal solution obtained from the exhaustive search. Finally, we examine the computational demands required by our proposed learning-based scheme and the exhaustive search.

In Fig. 2, we plot $T_s(\tau, \alpha, R_e)$ versus the transmission parameters for $\bar{\gamma}_b = \bar{\gamma}_e = 10$ dB for a realization of $h_{ab}$. The curves in Fig. 2(a)–2(c) are generated from (7). Fig. 2(a) shows the impacts of $\tau$ and $\alpha$ on $T_s(\tau, \alpha, R_e)$ when the optimal $R_e^*$ is selected. We can see that $T_s(\tau, \alpha, R_e)$ first increases then decreases as $\tau$ (or $\alpha$) increases for a given $\alpha$ (or $\tau$), and there is a unique $(\tau, \alpha)$ that maximizes $T_s(\tau, \alpha, R_e)$. The same behavior can be observed when examining the impacts of $(\tau, R_e)$ and $(\alpha, R_e)$ on $T_s(\tau, \alpha, R_e)$ in Fig. 2(b) and Fig. 2(c), respectively. This demonstrates that there is a unique $(\tau^*, \alpha^*, R_e^*)$ that maximizes $T_s(\tau, \alpha, R_e)$ for each realization of $h_{ab}$.

In Fig. 3, we plot the average maximum EST, denoted by $\bar{T}_s = E_{h_{ab}}[T_s(\tau^*, \alpha^*, R_e^*)]$, versus $\bar{\gamma}_b$ for different values of $N_a$ with the optimal transmission parameter tuple $(\tau^*, \alpha^*, R_e^*)$ being selected for each realization of $h_{ab}$. In this figure, we consider two schemes, namely, our proposed learning-based scheme and the optimal solution obtained from the exhaustive search. For our proposed learning-based scheme, we use $M = 10^3$ training examples to train the DFNNs. To examine the impact of the depth of the hidden layer (denoted by $l$) on the performance of the proposed learning-based scheme, we consider three DFNNs with $l = 2$, 4, and 6, respectively. The number of neurons in the hidden layers of these DFNNs is (10, 10), (10, 10, 5, 5), and (10, 10, 10, 10, 5, 5), respectively. We see that, for all the values of $N_a$, the secrecy
performance achieved by our proposed learning-based scheme is almost the same as that of the optimal solution obtained from the exhaustive search, demonstrating the effectiveness of our proposed learning-based scheme. We also see that the DFNN with $l = 2$ achieves a worse performance than the DFNN with $l = 4$, while the DFNN with $l = 6$ achieves almost the same performance as the DFNN with $l = 4$. This indicates that our learning-based scheme can achieve a better trade-off between the performance and the complexity by selecting hyperparameters.

Fig. 2. $T_s$ versus the transmission parameters for $\bar{\gamma}_b = \bar{\gamma}_e = 10$ dB.

Fig. 3. $\bar{T}_s$ versus $\bar{\gamma}_b$ for different values of $N_a$.

Finally, we evaluate the computational demands of different schemes in Table I.⁴ Specifically, we first compare the computational complexities of our proposed learning-based scheme and the exhaustive search. To this end, we define $N_\tau = 1/\delta_\tau$, $N_\alpha = 1/\delta_\alpha$, and $N_R = R_b/\delta_R$, where $\delta_\tau$, $\delta_\alpha$, and $\delta_R$ denote the search step size for $\tau$, $\alpha$, and $R_e$, respectively. We find that the computational complexity of our learning-based scheme (i.e., $O(1)$) is significantly less than that of the exhaustive search (i.e., $O(N_\tau N_\alpha N_R)$). This is because a well-trained DFNN only needs finite steps of calculation to obtain the optimal transmission parameter tuple $(\tau^*, \alpha^*, R_e^*)$, while the exhaustive search needs to go through every point in the search space. This finding can be verified by the running times of our learning-based scheme and the exhaustive search on MATLAB on a 6-core 64-bit 2.5 GHz Intel E5-2640 microprocessor with the target training error $\epsilon = 10^{-7}$ and the search step size $\delta_\tau = \delta_\alpha = \delta_R = 10^{-2}$. We see that the running time of our learning-based scheme for a realization of $h_{ab}$ is much less than that of the exhaustive search for all the values of $N_a$. We also see that the running time of the exhaustive search increases as $N_a$ increases, while the running time of our learning-based scheme remains relatively the same when $N_a$ varies. These observations indicate that our learning-based scheme can be performed in real-time with negligible latency impact, while the exhaustive search is not practical in real-world deployments.

TABLE I
COMPLEXITIES AND RUNNING TIMES OF DIFFERENT SCHEMES

⁴ Since the training phase of our learning-based scheme is completed offline, its computational demands are not included in this table.

V. CONCLUSION

In this letter, we proposed a learning-based wireless powered secure transmission, where the source uses energy collected from a power beacon to communicate with a legitimate receiver, in the presence of an eavesdropper. In our scheme, we assumed that the source transmits the AN signals together with the information signals to confuse the eavesdropper. We characterized the EST of our system, and showed how the EST can be maximized by judiciously selecting the transmission parameters. Furthermore, we exploited the DFNN to learn the nonlinear mapping from Bob’s CSI to the optimal transmission parameters that maximize the EST. Compared to the optimal solution obtained from the exhaustive search, we showed that our proposed learning-based scheme can deliver almost the same secrecy performance with much less computational complexity.

REFERENCES

[1] S. Bi, C. K. Ho, and R. Zhang, “Wireless powered communication: Opportunities and challenges,” IEEE Commun. Mag., vol. 53, no. 4, pp. 117–125, Apr. 2015.
[2] A. D. Wyner, “The wire-tap channel,” Bell Syst. Tech. J., vol. 54, no. 8, pp. 1355–1387, Oct. 1975.
[3] D. W. K. Ng, E. S. Lo, and R. Schober, “Robust beamforming for secure communication in systems with wireless information and power transfer,” IEEE Trans. Wireless Commun., vol. 13, no. 8, pp. 4599–4615, Aug. 2014.
[4] H. Xing, K.-K. Wong, A. Nallanathan, and R. Zhang, “Wireless powered cooperative jamming for secrecy multi-AF relaying networks,” IEEE Trans. Wireless Commun., vol. 15, no. 12, pp. 7971–7984, Dec. 2016.
[5] C. Guo, B. Liao, D. Feng, C. He, and X. Ma, “Minimum secrecy throughput maximization in wireless powered secure communications,” IEEE Trans. Veh. Technol., vol. 67, no. 3, pp. 2571–2581, Mar. 2018.
[6] L. Xiao, Y. Li, C. Dai, H. Dai, and H. V. Poor, “Reinforcement learning-based NOMA power allocation in the presence of smart jamming,” IEEE Trans. Veh. Technol., vol. 67, no. 4, pp. 3377–3389, Apr. 2018.
[7] D. He, C. Liu, T. Q. S. Quek, and H. Wang, “Transmit antenna selection in MIMO wiretap channels: A machine learning approach,” IEEE Wireless Commun. Lett., vol. 7, no. 4, pp. 634–637, Aug. 2018.
[8] N. Yang et al., “Artificial noise: Transmission optimization in multi-input single-output wiretap channels,” IEEE Trans. Commun., vol. 63, no. 5, pp. 1771–1783, May 2015.
[9] C. Liu, N. Yang, J. Yuan, and R. Malaney, “Location-based secure transmission for wiretap channels,” IEEE J. Sel. Areas Commun., vol. 33, no. 7, pp. 1458–1470, Jul. 2015.
[10] K. Cumanan, G. C. Alexandropoulos, Z. Ding, and G. K. Karagiannidis, “Secure communications with cooperative jamming: Optimal power allocation and secrecy outage analysis,” IEEE Trans. Veh. Technol., vol. 66, no. 8, pp. 7495–7505, Aug. 2017.
[11] K. Hornik, M. Stinchcombe, and H. White, “Multilayer feedforward networks are universal approximators,” Neural Netw., vol. 2, no. 5, pp. 359–366, 1989.
[12] I. Goodfellow, Y. Bengio, and A. Courville, Deep Learning. Cambridge, MA, USA: MIT Press, 2016.
[13] V. Nair and G. E. Hinton, “Rectified linear units improve restricted Boltzmann machines,” in Proc. ICML, Haifa, Israel, Jun. 2010, pp. 807–814.
[14] M. T. Hagan, H. B. Demuth, and M. Beale, Neural Network Design. Boston, MA, USA: PWS, 1995.
New Analytical Approach in the SER Evaluation of CSIN-Assisted AF Dual-Hop Wireless Systems

Yazid M. Khattabi, Member, IEEE

Abstract—Among the different relaying modes employed in amplify-and-forward (AF) dual-hop wireless cooperative systems, the channel-state-information (CSI)-and-noise (CSIN)-assisted mode generates the optimal amplification-gain that limits the relay’s retransmitted power. Despite that, and up to this moment, its symbol-error-rate (SER) performance has been evaluated in literature only through approximate analyses. In this letter, we introduce a new analytical approach to analyze its SER exactly. More specifically, we consider a CSIN-assisted AF dual-hop system operating in Rayleigh fading environment with high nodes’ mobility and incorrect CSI estimates, and derive a novel, generic, and tractable exact expression for its per-frame-average SER performance. The derived expression is also valid for non-moving nodes and perfect CSI estimation scenarios. We also derive the irreducible error floors that appear due to nodes’ motion and incorrect CSI estimation.

Index Terms—Cooperative systems, amplify-and-forward, error rate performance, fading channels.

Manuscript received October 16, 2018; accepted November 12, 2018. Date of publication November 20, 2018; date of current version April 9, 2019. The associate editor coordinating the review of this paper and approving it for publication was A. Kammoun.
The author is with the Department of Electrical Engineering, The University of Jordan, Amman 11942, Jordan (e-mail: y.khattabi@ju.edu.jo).
Digital Object Identifier 10.1109/LWC.2018.2882205

I. INTRODUCTION

COOPERATIVE techniques have been identified as a promising technology for future wireless systems as a means of significantly improving system performance [1]. Cooperation modes are classified as amplify-and-forward (AF) and decode-and-forward (DF) [2], [3]. Depending on how source-relay channel-state-information (CSI) and noise knowledge are used in producing relay’s gain, AF modes can be further subdivided as CSI-and-noise (CSIN)-assisted variable gain (the optimal) [1], CSI-assisted variable gain [3], and fixed gain [2]. Effective end-to-end SNRs for these different modes can share the generic formula $\gamma_{eq} = \gamma_1 \gamma_2 / (a\gamma_1 + b\gamma_2 + c)$ where $\gamma_1$ and $\gamma_2$ are the effective SNRs over the source-relay and relay-destination hops, respectively. The factors $a$, $b$, and $c$ are non-negative real, whose values depend upon the AF mode, see Table I.

TABLE I
AF MODE CATEGORIES: END-TO-END SNR FORMULA

During the past two decades, tremendous research contributions have been made in the performance evaluation of AF-based dual-hop systems. However, SER performance of such systems has been analyzed exactly only for the fixed and the CSI-assisted protocols and has not been done for the CSIN-assisted protocol. This is because of the mathematical intractability arising due to the existence of the non-zero factor $c$ in the CSIN-assisted protocol SNR’s denominator. The broadly used approach to tackle this issue starts from neglecting $c$ [4]–[8]. Results obtained based on this approach are approximate, and thus, they do not give insight into the realistic error rate performance of the CSIN-assisted AF systems.

Therefore, in this letter, we consider such a system operating in Rayleigh fading environment and present an analytical approach to derive its exact SER. As the scenarios of nodes’ mobility as well as imperfect CSI estimation are more practical [9], [10], we conduct the proposed analytical work under the effects of these scenarios. More specifically, we follow the same time-selective fading and the incorrect CSI estimation models as in [9], and then start from the system’s exact effective SNR to derive an exact SER expression in terms of a quickly convergent infinite summation. It should be noted that, in [9], system error rate performance has been derived approximately starting from approximating the system’s effective SNR as $\gamma_{eq} \approx \min(\gamma_1, \gamma_2)$. The derived expression in this letter is generic and valid for the scenarios of static nodes and perfect CSI estimates. To give more insight, we show that the system’s exact error performance is affected by mobility and incorrect CSI estimates, and that it experiences irreducible floors, which are quantitatively determined. In addition, we derive a closed-form approximate expression for the system’s SER, which shows acceptable accuracy as compared to the one derived based on the approximate literature method.

This letter is organized as follows. System model is presented in Section II. In Section III, system SER performance is analyzed using previous and proposed approaches. Numerical results are discussed in Section IV. Finally, conclusions are drawn in Section V.

II. SYSTEM MODEL AND PRELIMINARY RESULTS

We consider a dual-hop AF wireless cooperative system with source S communicating with destination D through relay R. On the basis of [9], we take the nodes’ high mobility and the imperfect CSI estimation assumptions into account. S transmits consecutive data frames each with $J$ symbols, where, due to mobility, each symbol experiences its own small-scale fading conditions. Let $h_1(j) \sim \mathcal{CN}(0, \sigma_1^2)$ and $h_2(j) \sim \mathcal{CN}(0, \sigma_2^2)$ denote the Rayleigh fading coefficients for the first (S-R) and second (R-D) hops, respectively, and corresponding to the $j$th signaling period. The two fading hops are assumed to be independent with average powers of $\sigma_1^2 = E[|h_1|^2]$ and $\sigma_2^2 = E[|h_2|^2]$. To take the nodes’ mobility effect into account, any two time-adjacent fading coefficients of the $i$th hop ($i = 1, 2$) are related to each other as [9, eq. (4)]

$$h_i(j) = \rho_i h_i(j-1) + \sqrt{1 - \rho_i^2}\, \beta_i(j-1) \tag{1}$$
where βi ∼ CN (0, σi2 ) is the i.i.d varying-component of techniques, (5) and (6) can be expressed as
the i th hop; and ρi = J0 (2πfc νi Ts /c) is its correlation- √
PSK a1 a2 ∞ −a2 x 2
parameter where J0 (.) is the zeroth-order Bessel function of P e (j ) = √ e Fγ (x 2 ) dx (7)
the first kind, νi is the relative speed between the i th hop’s π 0
√
communicating nodes, Ts is the symbol duration, fc is the
b1 √b2 ∞
M
−1
2
carrier frequency, and c is the speed of light [9]. According M−QAM 2
Pe (j ) = √ e −b2 x Fγ (x 2 ) dx . (8)
to the pilot-assisted estimation method [11], we assume that, π 0
for each frame, the system’s receivers estimate their fading m=0
coefficients only over the 1st signaling period, which are ĥ1 (1) We continue the system’s SER analyses using (7), where the
and ĥ2 (1). In addition, we assume imperfect CSI estimation, results could be then extended directly to the M-QAM case.
i.e., ĥi (1) = hi (1) + i (1) where i (1) ∼ CN (0, σe2i ) is the
estimation error [11]. Over the j th signaling period, S trans- A. SER Using Literature Method
mits the modulated symbol x(j) with energy Es to R, which Based on the approximate method commonly adopted in lit-
in turn amplifies the received signal by the CSIN-assisted erature to evaluated SER performance of dual-hop AF systems,
amplification gain [9, eq. (6)] like in [4]–[8], we can approximate the SNR in (3) as

γ γ1 (j )γ2 (j )
α(j ) = 2(j −1) 2(j −1) 2
(2) γeq (j ) ≈ γ̂eq (j ) = (9)
ρ1 2
|ĥ1 (1)| + (1 − ρ1 )σ1 γ + 1 γ1 (j ) + γ2 (j )
where γ = Es /No and No is the white noise power of the which has cdf obtained from (4) by replacing c(j ) = 0 as
first hop. After that, R transmits the amplified signal towards
1
1

D, which uses ĥ1 (1) and ĥ2 (1) in the detection process [9]. Fγ̂eq (j ) (γ) = 1 − 2γd(j ) e
2 −z(j ) γ
K1 2γd(j ) .
2
(10)
Based on this, the system’s equivalent end-to-end SNR over
the j th signaling period can be obtained at D as [9, eq. (15)]
By plugging (10) into (7) along with making use of [13,
γ1 (j )γ2 (j ) eq. (6.621.3)], and under the assumption of equiprobabale
γeq (j ) = (3)
γ1 (j ) + γ2 (j ) + c(j ) symbols in each frame, the PSK per-frame-average SER can
2(j −1)
be approximately obtained by
ρi γ|ĥi (1)|2
where γi (j ) = ωi (j )
, ωi (j ) = 1 + [(1 − 1
J 6π a12 a2 d(j ) A(j ) − 2d(j2 )
2(j −1) 2 2(j −1) 2 φ(j ) PSK 0.5
ρi )σi + ρi σei ]γ, and c(j ) = ω (j )ω (j ) with Pe
J j =1
a1 − 1 F 2.5, 1.5; 2; 1
1 2
2(j −1) 2 2 2(j −1) 2 2(j −1) 2 (A(j ) + 2d(j2 ) )2.5 A(j ) + 2d(j2 )
φ(j ) = (1 − ρ1 )σ1 γ [(1 − ρ2 )σ2 + ρ2 σe2 +
(11)
1 ] + (1 − ρ2(j −1) )σ 2 γ[ρ2(j −1) σ 2 γ + 1] + ρ2(j −1) σ 2 γ + 1 +
γ 2 2 2 e1 2 e2
2(j −1) 2 2(j −1) 2 2 where A(j ) = a2 + z(j ) , and F(a, b; c; z) is the Gauss’ hyper-
ρ1 σe1 ρ2 σe2 γ . If the nodes are static and the esti-
geometric function [13, eq. (9.100)]. Although (11) is obtained
mation is perfect, c(j ) = 1. As γi (j ) is exponential R.V with
ϑ (j ) 2(j −1) 2
based on the literature method, it is new and generalizes its
mean γ i (j ) = ωi (j ) , where ϑi (j ) = ρi σi γ, the exact literature counterparts under the impacts of nodes’ motion and
i
cdf of γeq (j ) can be obtained as incorrect CSI estimates. Unde these impacts, the system’s SER

−z(j ) γ
suffers from error floors. By taking the limit of (11) as γ → ∞,
Fγ (γ) = 1 − 2 (γ 2 + c(j ) γ)d(j ) e K1 2 (γ 2 + c(j ) γ)d(j )
eq (j ) these floors can be given as
(4)
J
√
floor a1 3πa1 a2 e1 e2 a2 + λ2
where K1 (·) is the 1st -order
modified-Bessel function of the Pe =
2
−
f1 f2 (a2 + λ1 )2.5
F 2.5, 1.5; 2;
a2 + λ1
(12)
2nd -type, d(j ) = 1/(γ 1 (j )γ 2 (j )), and z(j ) = d(j ) [γ 1 (j ) + j =1
γ 2 (j )]. 2(j −1) 2(j −1) 2(j −1)

where ei = (1−ρi )σi2 +ρ
i σe2i , fi = ρi σi2 , and
f1 e2 +f2 e1 i−1 e1 e2
III. S YSTEM SER P ERFORMANCE λi = f1 f2 + (−1) 2 f1 f2 , ∀i = 1, 2. Equation (12)
Over the j th signaling period, a commonly used SER reduces to 0 in case of static nodes and perfect CSI estimates.
formula for PSK based wireless systems with SNR γ(j ) is [12]
PSK
B. SER Using Proposed Method
P e (j ) = Eγ a1 Q 2a2 γ(j ) (5)
1) Proposed Exact SER:
where E[·] denotes the statistical mean operator, Q(·) is the Theorem 1: The exact per-frame-average SER for the
Gaussian Q-function; and a1 = a2 = 1 for BPSK, and a1 = system model under study is given by (13) in the bottom of
2 and a2 = sin2 (π/M ) for M-ary PSK. For M-ary QAM the next page, where χ(j ) = a2 ϑ1 (j )ϑ2 (j ) + ω1 (j )ϑ2 (j ) +
systems, the following can be accurately used [12, eq. (8.15)] ω2 (j )ϑ1 (j ), V(k , ) = Γ(k + + 1.5)(ψ(k + + 1.5) −
√ 2ψ(k + 1) − k +1 1 ), Γ(·) is the gamma function, ψ(·) is the psi
M
−1
M−QAM
2

(Digamma) function [13, eq. (8.360)], (·)!denotes the facto-
Pe Eγ b1 Q 2b2 γ(j ) n a1 ,...,ap
(j ) = (6) a!
rial operator, Cba = b!(a−b)! , and G m,
p, q b1 ,...,bq z is the
m=0
√ √ Meijer’s G-function [13, eq. (9.301)].
with b1 = 4( M − 1)/ M and b2 = 3(2m + 1)2 /2(M − Proof: By substituting the exact cdf in (4) into (7) and
1). With the help of by-parts and by-substitution integration then assuming equiprobable symbols per frame, the exact

∞
per-frame-average SER can be given by −A(j ) u k ++ 12 u
× e u ln 1 + du . (21)
J
√ 0 c(j )
a1 1 a1 a2
Pe = − √
2 J π I6
j =1
Upon solving the integral in (19) using [13, eq. (3.371)], we
∞ x 4 + c(j ) x 2 x 4 + c(j ) x 2
−A(j ) x 2
× 2e K1 2 dx . (14) have
0 γ 1 (j )γ 2 (j ) γ 1 (j )γ 2 (j )
+1 C k +1 d k +1 c k +1− 2(k + ) + 1 !!
√ k (j ) (j )
I1
I3 = π (22)
By denoting u = x 2 , we can write I1 as =0 2k + +1 Ak(j+ +1.5
)
=f (u) where (·)!! denotes the double-factorial operator. Alternatively
k ++1
I1 =
∞
e −A(j ) u 2
d(j ) (u + c(j ) )K1 2 d(j ) (u + c(j ) u) du. (15) writing (2(k + ) + 1)!! = 2 √π Γ(k + + 32 ) yields I3 to be
0 ultimately given as
We are unaware of a closed-form solution to I1 . However, k
+1 C k +1 d(jk +1 c k +1− Γ(k + + 32 )
exploiting the expansion of Kν (·) [13, eq. (8.446)] gives I3 =
) (j )
. (23)
k + +1.5

K
=∞
(d(j ) (u 2 + c(j ) u))k + 2
1
=0 A(j )
1
K1 f (u) = +
2 d(j ) (u 2 + c(j ) u) k !(k + 1)! Solving the integral in (20) using [13, eq. (4.352.1)] yields
k =0

1 k
+1 Ck +1 d(j
k +1 k +1−
c(j ) Γ(k + + 32 ) ψ(k + + 32 ) − ln A(j )
× ln d(j ) (u 2 + c(j ) u) − ψ(k + 1) + ψ(k + 2) . (16) I4 =
)
.
2 k ++ 3
=0 A(j ) 2
Plugging (16) into (15) and using ln(·) properties gives

(24)
K=∞
∞
e −A(j ) u 1 u
I1 = √ du + (−ψ(k + 1) − ψ(k + 2)) In I6 in (21), by replacing ln(1 + c ) by it equivalent rep-
4u 2k !(k + 1)! (j )
0 k =0 1, 2 1, 1 u
π
resentation G 2, 2 1, 0 c , and then making use of [13,
I2 = 4A(j )
(j )
∞ eq. (7.813.1)], we obtain I6 in closed-form as
1
e −A(j ) u u k + 2 (d(j ) (u + c(j ) ))k +1 du

−(k + + 32 ) 1, 3 −(k + + 1 ), 1, 1 1
0 I6 = A(j ) G 3, 2 2 . (25)

I3 1, 0 c(j ) A(j )
∞ 1
+ e −A(j ) u u k + 2 (d(j ) (u + c(j ) ))k +1 ln(u) du By substituting (25) and (23) into (21) and then (along
0
with (23) and (24)) into (17); and finally into (14) along with
I4
∞ some arrangements, we obtain P e in (13).
k+ 1
+ e −A(j ) u u 2 (d(j ) (u + c(j ) ))k +1 ln(d(j ) (u + c(j ) ))du . Notice that the infinite series in (13) converges rapidly,
0 which is due to the k !(k +1)! in the denominator. SER expres-
I5 sion for the special cases of non-moving nodes and correct
(17) CSI estimates can be directly obtained from (13) by substi-
tuting ρ1 = ρ2 = 1 and σe21 = σe22 = 0. As (11) is a tight
By using the binomial power series expansion [13, eq. (1.111)] upper bound for (13) over the high γ region, the error floors
we may have for (13) are exactly given by (12). It is worthwhile mentioning
k
+1 that computing (13) for higher modulation orders requires rel-
(d(j ) (u + c(j ) ))k +1 = C k +1 d(jk +1 c k +1− u .
) (j )
(18) atively higher running-time. This is due to the appearance of
=0
the Meijer’s G-function, which, for general parameters, does
not have polynomials representation. The following corollary
By substituting (18) into I3 , I4 , and I5 given in (17) we obtain helps in tackling this issue.
k
+1 ∞ Corollary 1: The Meijer’s G-function in (13) can be accu-
1
I3 = Ck +1 d(j
k +1 k +1−
) c(j ) e −A(j ) u u k ++ 2 du (19) rately represented by (26), shown at the bottom of the
=0 0 this page, where n = k + , ε is a very small number,
k+1 ∞ 1
p Fq (α1 , . . . , αp ; β1 , . . . , βq ; x ) is the generalized hypergeo-
I4 = Ck +1 d(j
k +1 k +1−
) c(j ) e −A(j ) u u k ++ 2 ln(u)du (20) metric function [13, eq. (9.14.1)] and γ(α, x ) is the incomplete
=0 0 gamma function [13, eq. (8.350.1)] and [13, eq. (8.354.1)].
k
+1 Proof: First, without affecting accuracy we propose to write
I5 = ln(d(j ) c(j ) )I3 + Ck +1 d(j
k +1 k +1−
) c(j )

−(n+ 1 ),1,1

−(n+ 1 ),1,1+ε

G 13 ,3
,2
2 F (j ) = G 13 ,3
,2
2 F (j ) . (27)
=0 1,0 1,0
J
∞ √ k+1 k +1 1
a1 1 a12 a2 ϑ1 (j )ϑ2 (j ) a1 a2 C φ(j )k +1− (ω1 (j )ω2 (j )) (ϑ1 (j )ϑ2 (j )) + 2
Pe = − + √ 3 V(k , )
2 J 4χ(j ) 2 πk !(k + 1)! (χ(j ))k + + 2
j =1 k =0 =0

φ(j ) ω1 (j )ω2 (j )ϑ1 (j )ϑ2 (j )
1, 3 −( 21 +k + ),1,1
+ ln Γ(k + + 1.5) + G 3, 2 F(j ) , with F(j ) = (13)
χ(j ) 1,0 φ(j )χ(j )

−(n+ 12 ),1,1 πγ(n + 1.5, −1/F (j )) Γ(n + 1.5 + ε)
G 13 ,3
,2 F (j ) = √ + Γ(−ε) Γ(n + 1.5) − 1 F 1 (−ε; −(n + ε + 0.5); 1/F (j )) (26)
1,0 −1 (1/F (j ))ε
KHATTABI: NEW ANALYTICAL APPROACH IN SER EVALUATION OF CSIN-ASSISTED AF DUAL-HOP WIRELESS SYSTEMS 607
By expanding the right-hand side of (27) using [13, eq. (9.304)] and then applying the facts that ${}_2F_2(a, 0; b, c; z) = 1$, ${}_2F_2(a, b; a, c; z) = {}_1F_1(b; c; z)$, $\Gamma(-x) = -\pi / [\sin(\pi x)\,\Gamma(1 + x)]$, and $\Gamma(1 + x) = x\,\Gamma(x)$, and finally using [13, eq. (8.351.2)], we obtain (26).

Computing (13) by adopting Corollary 1 reduces its running time by up to 91%, taking 16-PSK as an example.

2) Proposed Approximate SER:

Corollary 2: The per-frame-average SER can be evaluated using the following closed-form approximate expression

$$\bar{P}_e \approx \frac{a_1}{2J}\sum_{j=1}^{J}\left[1 - \sqrt{\frac{a_2}{A_{(j)}}} - \sqrt{a_2}\,d_{(j)}\left(\frac{c_{(j)}\ln\!\big(\sqrt{d_{(j)}}/A_{(j)}\big)}{A_{(j)}^{1.5}} + \frac{0.114\,c_{(j)}}{A_{(j)}^{1.5}} + \frac{3\big[0.78 + \ln\!\big(\sqrt{d_{(j)}}/A_{(j)}\big)\big]}{2A_{(j)}^{2.5}}\right)\right]. \tag{28}$$

Proof: By taking only the first term ($k = 0$) of the infinite sum in (17), along with letting $\ln(d_{(j)}(u + c_{(j)})) \approx \ln(d_{(j)}u/c_{(j)})$ in $I_5$, $I_1$ can be approximately obtained as

$$I_1 \approx \sqrt{\frac{\pi}{4A_{(j)}}} + \frac{d_{(j)}}{2}\Bigg\{\big[\ln(d_{(j)}) - \psi(1) - \psi(2)\big]\left[\int_0^{\infty} e^{-A_{(j)}u}\,u^{3/2}\,du + c_{(j)}\int_0^{\infty} e^{-A_{(j)}u}\sqrt{u}\,du\right] + 2\int_0^{\infty} e^{-A_{(j)}u}\Big[c_{(j)}\sqrt{u}\,\ln(u) + u^{3/2}\ln(u)\Big]du\Bigg\}. \tag{29}$$

Solving the first two integrals using [13, eq. (3.371)] and the last one using [13, eq. (4.352.1)], and then substituting the result into (14), along with some simplifications, gives (28).

IV. NUMERICAL RESULTS

Fig. 1 shows that, compared with exact simulation, the SER results obtained using (11) (the old method) are unacceptably loose, especially for low modulation orders and over the low and medium SNR regions, while the new exact results computed using (13) match the simulation results perfectly. The best performance improvement achieved by the proposed approach over the old approximation is about 2.1 dB and 1.2 dB over the low and medium SNR regions, respectively. The number of terms used to evaluate the infinite summation in (13) is labeled on the figure and indicates a comfortable convergence rate, especially for the low modulation orders. Further, the new approximate results obtained using (28) show a satisfying accuracy level and outperform (11) for the low modulation orders. We can also observe from Fig. 1 that, under the effects of node mobility and imperfect CSI estimation, the overall system error performance is degraded and experiences irreducible floors (plotted using (12)). Fig. 2 shows SER plots for the same system considering M-QAM modulation, where it is clear that, similarly, the new exact results match the simulation and outperform the old results. It is notable from Fig. 2 that, because the nodes are static and the estimation is perfect, the SER plots behave normally and do not suffer from floors.

Fig. 1. M-PSK SER vs. $\gamma$ with 65 mph speed, $\sigma_1^2 = \sigma_2^2 = 1$, and $\sigma_e^2 = 0.001$.

Fig. 2. M-QAM SER vs. $\gamma$ with static nodes, $\sigma_1^2 = \sigma_2^2 = 1$, and $\sigma_e^2 = 0$.

V. CONCLUSION

In this letter, a new analytical arrangement has been proposed to derive novel exact as well as approximate expressions for the SER performance of a Rayleigh fading CSIN-assisted dual-hop AF cooperative system with mobile nodes and imperfect CSI estimates. The exact expression involves a quickly convergent infinite summation, while the approximate one is in closed form and shows very satisfying results. As future work, the proposed arrangement could be easily extended to other network models.

REFERENCES

[1] J. N. Laneman, D. N. C. Tse, and G. W. Wornell, “Cooperative diversity in wireless networks: Efficient protocols and outage behavior,” IEEE Trans. Inf. Theory, vol. 50, no. 12, pp. 3062–3080, Dec. 2004.
[2] M. O. Hasna and M.-S. Alouini, “A performance study of dual-hop transmissions with fixed gain relays,” IEEE Trans. Wireless Commun., vol. 3, no. 6, pp. 1963–1968, Nov. 2004.
[3] M. O. Hasna and M.-S. Alouini, “End-to-end performance of transmission systems with relays over Rayleigh-fading channels,” IEEE Trans. Wireless Commun., vol. 2, no. 6, pp. 1126–1131, Nov. 2003.
[4] Y. Xiao et al., “Forwarding strategy selection in dual-hop NOMA relaying systems,” IEEE Commun. Lett., vol. 22, no. 8, pp. 1644–1647, Aug. 2018.
[5] L. Yang, M. O. Hasna, and I. S. Ansari, “Unified performance analysis for multiuser mixed η-μ and M-distribution dual-hop RF/FSO systems,” IEEE Trans. Commun., vol. 65, no. 8, pp. 3601–3613, Aug. 2017.
[6] N. S. Ferdinand and N. Rajatheva, “Unified performance analysis of two-hop amplify-and-forward relay systems with antenna correlation,” IEEE Trans. Wireless Commun., vol. 10, no. 9, pp. 3002–3011, Sep. 2011.
[7] D. Senaratne and C. Tellambura, “Unified exact performance analysis of two-hop amplify-and-forward relaying in Nakagami fading,” IEEE Trans. Veh. Technol., vol. 59, no. 3, pp. 1529–1534, Mar. 2010.
[8] G. Farhadi and N. C. Beaulieu, “A general framework for symbol error probability analysis of wireless systems and its application in amplify-and-forward multihop relaying,” IEEE Trans. Veh. Technol., vol. 59, no. 3, pp. 1505–1511, Mar. 2010.
[9] Y. M. Khattabi and M. M. Matalgah, “Performance analysis of multiple-relay AF cooperative systems over Rayleigh time-selective fading channels with imperfect channel estimation,” IEEE Trans. Veh. Technol., vol. 65, no. 1, pp. 427–434, Jan. 2016.
[10] N. Varshney, A. K. Jagannatham, and P. K. Varshney, “Cognitive MIMO-RF/FSO cooperative relay communication with mobile nodes and imperfect channel state information,” IEEE Trans. Cogn. Commun. Netw., vol. 4, no. 3, pp. 544–555, Sep. 2018.
[11] X. Wu, H. Claussen, M. Di Renzo, and H. Haas, “Channel estimation for spatial modulation,” IEEE Trans. Commun., vol. 62, no. 12, pp. 4362–4372, Dec. 2014.
[12] M. K. Simon and M.-S. Alouini, Digital Communication Over Fading Channels, vol. 95. Hoboken, NJ, USA: Wiley, 2005.
[13] I. Gradshteyn and I. Ryzhik, Table of Integrals, Series and Products, 5th ed., A. Jeffrey and D. Zwillinger, Eds. New York, NY, USA: Academic, 2007.
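The rapid convergence noted for the infinite series in (13) can be traced to the $k!(k+1)!$ denominator in the expansion (16) of $K_1(\cdot)$. The following minimal Python sketch illustrates this, assuming the standard power series of the first-order modified Bessel function of the second kind with integer-argument digamma values. The names `digamma_int` and `bessel_k1_series` are illustrative only and do not appear in the letter.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def digamma_int(n: int) -> float:
    """psi(n) for a positive integer n: -gamma + sum_{m=1}^{n-1} 1/m."""
    return -EULER_GAMMA + sum(1.0 / m for m in range(1, n))


def bessel_k1_series(x: float, terms: int = 20) -> float:
    """Truncated power series of K_1(x) for x > 0.

    K_1(x) = 1/x + sum_k (x/2)^(2k+1) / (k! (k+1)!)
                   * [ln(x/2) - psi(k+1)/2 - psi(k+2)/2]
    """
    half = x / 2.0
    total = 1.0 / x
    for k in range(terms):
        coeff = half ** (2 * k + 1) / (math.factorial(k) * math.factorial(k + 1))
        total += coeff * (math.log(half)
                          - 0.5 * digamma_int(k + 1)
                          - 0.5 * digamma_int(k + 2))
    return total


# Partial sums settle after only a few terms because of the k!(k+1)! factor.
for n_terms in (1, 3, 5, 20):
    print(n_terms, bessel_k1_series(1.0, terms=n_terms))
```

In this sketch, three terms already agree with the 20-term value at $x = 1$ to about four decimal places, which is consistent with the convergence behavior described for (13).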
DF-CSPG: A Potential Game Approach for Device-Free Localization Exploiting Joint Sparsity

Sixing Yang, Yan Guo, Ning Li, and Dagang Fang, Fellow, IEEE

Abstract: Device-free localization (DFL) plays an increasingly dominant role in some security and military applications, aiming at locating targets that carry no extra devices. Most research utilizes received signal strength, which yields poor accuracy due to insufficient information. In this letter, we propose DF-CSPG, a new DFL algorithm using compressive sensing (CS) theory and the potential game (PG) approach to exploit channel state information (CSI). It is the first time that the PG approach is applied in DFL. First, the CS-based localization model is built by exploiting CSI and then investigated from a game-theoretic perspective in which the subcarriers are regarded as players. Then, we prove that the proposed DFL game is an exact PG. Based on this, best response theory is utilized to obtain the equilibria. Finally, simulation results show that DF-CSPG can achieve more accurate and robust performance.

Index Terms: Device-free localization, compressive sensing, potential game, Nash equilibrium, best response.

Manuscript received October 7, 2018; revised November 4, 2018; accepted November 29, 2018. Date of publication December 5, 2018; date of current version April 9, 2019. This work was supported in part by the National Natural Science Foundation of China under Grant 61871400 and Grant 61571463, and in part by the Natural Science Foundation of Jiangsu Province under Grant BK20171401. The associate editor coordinating the review of this paper and approving it for publication was Y. Shen. (Corresponding author: Yan Guo.)
S. Yang, Y. Guo, and N. Li are with the College of Communications Engineering, Army Engineering University of PLA, Nanjing 210000, China (e-mail: guoyan_1029@sina.com).
D. Fang is with the School of Electronic Engineering and Optoelectronic Technology, Nanjing University of Science and Technology, Nanjing 210094, China.
Digital Object Identifier 10.1109/LWC.2018.2885052

I. INTRODUCTION

As an emerging technique, Device-Free Localization (DFL) utilizes ubiquitous wireless signals to estimate locations without requiring any devices to be attached to the targets [1]. In addition, it is the foundation of device-free wireless sensing, with a great effect on estimating the motion, activity, and gestures of a person. DFL has drawn considerable attention and plays a significant role in security monitoring and emergency rescue.

In DFL, a person within the deployment area influences the wireless signal in a predictable way, which makes it feasible to sense the target location by analysing the wireless patterns and characteristics [2]. Besides, the positioning accuracy is greatly affected by the number of wireless links distributed in the monitoring area: the more wireless links are deployed, the more accurate the localization. However, most wireless devices are energy-constrained, which heavily limits the scale and lifetime of a DFL system. Considering these issues, Wang et al. [4] applied Compressive Sensing (CS) theory [3] to DFL, greatly reducing the number of measurements required to guarantee reliable localization accuracy. Since then, much research has concentrated on CS-based DFL, covering dictionary refinement [5], model design [6], human-effort saving [7], and so on. Most existing methods are based on Received Signal Strength (RSS) because it is easy to obtain. However, RSS is often too noisy, which may cause inaccurate localization. As an alternative, Channel State Information (CSI) consists of amplitude and phase information from multiple subcarriers, and it is convenient to measure as long as the system is based on Orthogonal Frequency Division Multiplexing (OFDM).

In summary, CS can be applied to reduce the number of wireless devices required to guarantee the localization accuracy, while CSI can provide more localization information. Therefore, the localization algorithm DF-CSPG, based on CS theory and the Potential Game (PG) approach, is proposed. The PG approach has been used elsewhere for its advantages [8]. However, this is the first time that the PG approach is applied in DFL. Firstly, we build the CS-based localization model using CSI and formulate it as a Nash game. Secondly, the DFL game is proved to be an Exact Potential Game (EPG) whose potential function is the global optimization objective. According to the definition in [9], we can optimize the utility function of the DFL game instead of the global objective, which is complex. Thirdly, Best Response (BR) theory is introduced as one of the decision rules for reaching pure Nash Equilibrium (NE) points. Fourthly, CS theory is utilized to solve the problem obtained from the BR dynamics. Finally, several simulations are conducted to verify the proposed algorithm.

II. MODEL DESIGNING

A. CS-Based Localization Model Using CSI

This letter focuses on CS-based multi-target localization using CSI. CSI provides both amplitude and phase information from multiple subcarriers, of which the amplitude information can be converted into RSS following [10]. As illustrated in Fig. 1, the $l \times l$ two-dimensional area is divided into $N$ grids and all $K$ targets are located at the grid points. By exploiting the inherent sparsity of the localization problem, the target locations can be represented as a $K$-sparse vector:

$$\mathbf{w} = [0, 1, 0, \ldots, 0, 1]^T, \tag{1}$$

in which the nonzero coordinates of the location vector correspond to the target locations. Additionally, $M$ wireless links are constructed by $2M$ uniformly deployed wireless nodes,
YANG et al.: DF-CSPG: PG APPROACH FOR DFL EXPLOITING JOINT SPARSITY 609
which can only transmit or receive wireless signals. The element $A_{mn}$ of the sensing matrix is taken as the shadowing effect on wireless link $m$ caused by a target located at grid point $n$. According to [11], the saddle surface model is applied to approximate the RSS changes. The affected area is an ellipse; for a target outside it, $A_{mn} = 0$; otherwise:

$$A_{mn} = Sa(P_{mn}^{x}, P_{mn}^{y}), \tag{2}$$

where $Sa$ is the function of the saddle surface model, which is defined to approximate the shadowing effect, and $(P_{mn}^{x}, P_{mn}^{y})$ is the coordinate of grid point $n$ in the coordinate system of wireless link $m$, in which the line between the emitter and the receiver is defined as the X axis. Then, we can obtain the sensing equation:

$$\mathbf{y}_{M \times 1} = \mathbf{A}_{M \times N}\,\mathbf{w}_{N \times 1}, \tag{3}$$

where $\mathbf{y}$ is the measurement vector, representing the shadowing effect on the wireless links.

In practice, CSI values are collected from $L$ subcarriers, and targets appearing in the monitoring area have different shadowing effects on different subcarriers. So, we can obtain the following equation:

$$[\mathbf{y}_1, \mathbf{y}_2, \ldots, \mathbf{y}_L] = [\mathbf{A}_1\mathbf{w}, \mathbf{A}_2\mathbf{w}, \ldots, \mathbf{A}_L\mathbf{w}], \tag{4}$$

where $\mathbf{A}_i$ and $\mathbf{y}_i$ are the sensing matrix and measurement vector of subcarrier $i$, respectively.

Fig. 1. System of multi-target device-free localization.

B. The DFL Game Design

Different from the existing DFL approaches, we investigate this issue from a game-theoretic perspective. The DFL game is defined as $\Gamma = (\mathcal{N}, S, \{U_i\}_{i \in \mathcal{N}})$, where $\mathcal{N} = \{1, 2, \ldots, L\}$ is the set of $L$ players, $S$ is the strategy space, and $\mathbf{w}_i \in S$ is the strategy of player $i$, i.e., its estimate of the location vector. Additionally, $\mathbf{w}_{-i}$ denotes the strategies of all players other than player $i$. The utility function $U_i$ maps the strategy of player $i$ to a real number according to different needs. In this letter, $U_i$ is defined in terms of the deviation of personal estimation $E_i^p$ and the deviation of consensual estimation $E_i^c$.

Deviation of personal estimation: for the DFL game $\Gamma$, the location vector $\mathbf{w}_i$ estimated by player $i$ should be consistent with its physical measurement $\mathbf{y}_i$. So, the deviation is defined as:

$$E_i^p = \|\mathbf{y}_i - \mathbf{A}_i\mathbf{w}_i\|_2^2. \tag{5}$$

Deviation of consensual estimation: for the DFL game $\Gamma$, because all the players make decisions about the same set of targets, the location vectors estimated by different players should be as similar as possible. So, the deviation is defined as:

$$E_i^c = \frac{1}{L-1}\sum_{l=1,\,l \neq i}^{L} \|\mathbf{A}_l\mathbf{w}_l - \mathbf{A}_l\mathbf{w}_i\|_2^2. \tag{6}$$

Both $E_i^p$ and $E_i^c$ for player $i$ should be as small as possible to ensure the localization accuracy. So, the utility function $U_i$ for player $i$ is designed as follows:

$$U_i(\mathbf{w}_i, \mathbf{w}_{-i}) = (\xi - 1)E_i^c - \xi E_i^p \quad (0 < \xi < 1)$$
$$= \frac{\xi - 1}{L-1}\sum_{l=1,\,l \neq i}^{L} \|\mathbf{A}_l\mathbf{w}_l - \mathbf{A}_l\mathbf{w}_i\|_2^2 - \xi\|\mathbf{y}_i - \mathbf{A}_i\mathbf{w}_i\|_2^2, \tag{7}$$

where $\xi$ controls the trade-off between $E_i^c$ and $E_i^p$. All $L$ players try to minimize their deviations, which means obtaining the location vector by maximizing the utility function. According to (7), the global utility function $U_0$ is given by:

$$U_0 = \sum_{i=1}^{L} U_i(\mathbf{w}_i, \mathbf{w}_{-i}). \tag{8}$$

III. ALGORITHM DESIGN VIA POTENTIAL GAME

In this section, the DFL scheme is designed from a game-theoretic perspective, and the details are as follows.

Definition 1 (Nash Equilibrium [12]): An action profile $S = (\mathbf{w}_1^*, \mathbf{w}_2^*, \ldots, \mathbf{w}_L^*)$ is a pure strategy NE if and only if no player can improve its utility by deviating unilaterally, i.e.,

$$U_i(\mathbf{w}_i^*, \mathbf{w}_{-i}^*) \geq U_i(\mathbf{w}_i, \mathbf{w}_{-i}^*), \quad \forall i \in \mathcal{N},\ \mathbf{w}_i, \mathbf{w}_i^* \in S,\ \mathbf{w}_i \neq \mathbf{w}_i^*. \tag{9}$$

Definition 2 (Exact Potential Game [9]): The game $\Gamma$ is an exact potential game if and only if a potential function $F(\mathbf{w}_i, \mathbf{w}_{-i})$ satisfies the following condition:

$$F(\mathbf{w}_i^*, \mathbf{w}_{-i}) - F(\mathbf{w}_i, \mathbf{w}_{-i}) = U_i(\mathbf{w}_i^*, \mathbf{w}_{-i}) - U_i(\mathbf{w}_i, \mathbf{w}_{-i}), \quad \forall i \in \mathcal{N},\ \mathbf{w}_i, \mathbf{w}_i^* \in S. \tag{10}$$

An EPG has at least one pure strategy NE, and any global or local maximum of the potential function is a pure strategy NE.

Theorem 1: If $F$ is a potential function for the exact potential game $\Gamma = (\mathcal{N}, S, \{U_i\}_{i \in \mathcal{N}})$, then the set of NE of $\Gamma$ coincides with the set of NE of the identical interest game $\Upsilon = (\mathcal{N}, S, \{F\}_{i \in \mathcal{N}})$:

$$\mathrm{NESet}(\Gamma) = \mathrm{NESet}(\Upsilon), \tag{11}$$

where NESet denotes the set of NEs of a game. For an EPG, the utility function shares the same NE points with the potential function. That is to say, once the DFL game is proved to be an EPG, we can optimize the utility function $U_i(\mathbf{w}_i, \mathbf{w}_{-i})$ instead of the potential function $F(\mathbf{w}_1, \mathbf{w}_2, \ldots, \mathbf{w}_L)$.
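To make the definitions in (5) to (8) concrete, the following minimal Python sketch evaluates the personal deviation, the consensual deviation, the utility of one player, and the global utility for a toy profile. The sensing matrices, the value $\xi = 0.4$, and all function names are hypothetical choices made for illustration (they are not taken from the letter), and at least two subcarriers are assumed so that $L - 1 > 0$.

```python
from typing import List, Sequence

Vector = List[float]
Matrix = List[List[float]]


def matvec(a: Matrix, x: Sequence[float]) -> Vector:
    """Return the matrix-vector product A x."""
    return [sum(a_rc * x_c for a_rc, x_c in zip(row, x)) for row in a]


def sq_dist(u: Sequence[float], v: Sequence[float]) -> float:
    """Return the squared Euclidean distance ||u - v||_2^2."""
    return sum((p - q) ** 2 for p, q in zip(u, v))


def personal_deviation(a_i: Matrix, y_i: Vector, w_i: Vector) -> float:
    """E_i^p in (5): mismatch between measurement and own estimate."""
    return sq_dist(y_i, matvec(a_i, w_i))


def consensual_deviation(mats: List[Matrix], profile: List[Vector], i: int) -> float:
    """E_i^c in (6): average mismatch with the other players' estimates."""
    others = [l for l in range(len(mats)) if l != i]
    total = sum(sq_dist(matvec(mats[l], profile[l]), matvec(mats[l], profile[i]))
                for l in others)
    return total / len(others)


def utility(mats, meas, profile, i: int, xi: float) -> float:
    """U_i in (7) for 0 < xi < 1."""
    return ((xi - 1.0) * consensual_deviation(mats, profile, i)
            - xi * personal_deviation(mats[i], meas[i], profile[i]))


def global_utility(mats, meas, profile, xi: float) -> float:
    """U_0 in (8): sum of all players' utilities."""
    return sum(utility(mats, meas, profile, i, xi) for i in range(len(mats)))


# Toy setting (hypothetical): L = 2 subcarriers, M = 2 links, N = 3 grid points.
A = [[[1.0, 0.5, 0.0], [0.0, 1.0, 0.5]],
     [[0.8, 0.4, 0.0], [0.0, 0.8, 0.4]]]
w_true = [0.0, 1.0, 0.0]
Y = [matvec(a, w_true) for a in A]   # noiseless measurements
XI = 0.4

agree = [w_true, w_true]             # both players hold the true location vector
deviate = [[1.0, 0.0, 0.0], w_true]  # player 0 picks a wrong grid point

u_agree = utility(A, Y, agree, 0, XI)
u_deviate = utility(A, Y, deviate, 0, XI)
print(u_agree, u_deviate, global_utility(A, Y, deviate, XI))
```

In this toy case both deviations vanish when every player holds the true location vector, so the utility attains its maximum value of zero, whereas a wrong grid point is penalized through both terms of (7).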
So, we design the potential function $F$ using the global utility function $U_0$, which reflects the global optimization objective:

$$F(\mathbf{w}_1, \mathbf{w}_2, \ldots, \mathbf{w}_L) = \sum_{i=1}^{L} U_i. \tag{12}$$

Then, we certify that the DFL game $\Gamma$ is an EPG whose potential function is $F$. The details are as follows. When player $k$ changes its strategy from $\mathbf{w}_k$ to $\mathbf{w}_k^*$, the change of the potential function $F$ is calculated as:

$$F(\mathbf{w}_k^*, \mathbf{w}_{-k}) - F(\mathbf{w}_k, \mathbf{w}_{-k}) = U_k(\mathbf{w}_k^*, \mathbf{w}_{-k}) - U_k(\mathbf{w}_k, \mathbf{w}_{-k}) + \left\{\sum_{i=1,\,i \neq k}^{L} \big[U_i(\mathbf{w}_i^*, \mathbf{w}_{-i}) - U_i(\mathbf{w}_i, \mathbf{w}_{-i})\big]\right\}, \tag{13}$$

where the change of the potential function includes two parts: one is the change of the utility function of player $k$, and the other is the sum of the changes of the other players' utility functions. In the DFL game, it is considered that when player $k$ deviates from its strategy, player $i$ ($i \neq k$) remains in its original state, making its utility function $U_i$ unchanged. As a result, the term in braces in (13) is zero. According to Definition 2, the DFL game $\Gamma$ is proved to be an EPG. So, $\mathrm{NESet}(\Upsilon) = \mathrm{NESet}(\Gamma)$, and according to Theorem 1 we maximize $U_i$ to obtain the NE points instead of optimizing $F$, which is complex.

Algorithm 1 Algorithm of DF-CSPG
Require: $\mathbf{y}_i$, $\mathbf{A}_i$, $\sigma$, $\xi$, $n = 0$, $\tau = 0$, $\hat{\mathbf{w}}_i(0) = \mathbf{0}$
1: Wireless links send the measured CSI to the data fusion center.
2: while $\tau < 10$ do
3:   Update the iteration number $n = n + 1$.
4:   for $i = 1$ to $L$ do
5:     Update $\mathbf{b}_i(n) = \frac{1-\xi}{L-1}\sum_{l=1,\,l \neq i}^{L} \mathbf{A}_l^T\mathbf{A}_l\hat{\mathbf{w}}_l(n-1) + \xi\mathbf{A}_i^T\mathbf{y}_i$.
6:     Update $\mathbf{D}_i(n) = \frac{1-\xi}{L-1}\sum_{l=1,\,l \neq i}^{L} \mathbf{A}_l^T\mathbf{A}_l + \xi\mathbf{A}_i^T\mathbf{A}_i$.
7:     Estimate the location vector $\hat{\mathbf{w}}_i(n)$ by CS.
8:   end for
9:   if $\frac{1}{L}\sum_{i=1}^{L}\|\hat{\mathbf{w}}_i(n) - \hat{\mathbf{w}}_i(n-1)\|_2 < \sigma$ then
10:     $\tau = \tau + 1$
11:   else
12:     $\tau = 0$
13:   end if
14: end while
15: Compute the final location vector $\hat{\mathbf{w}} = \frac{1}{L}\sum_{i=1}^{L}\hat{\mathbf{w}}_i(n)$
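The loop structure of Algorithm 1 can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the letter does not specify which CS solver is used in step 7, so a small orthogonal matching pursuit routine (`omp`) stands in for it; the sparsity level `k` is assumed known; `max_iter` is a safety cap that is not part of Algorithm 1; each player's update uses the previous iterate of the other players; and the toy sensing matrices are scaled identities chosen purely so that the example is easy to follow. All function names are hypothetical.

```python
import math
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def matvec(a: Matrix, x: Vector) -> Vector:
    """Return A x."""
    return [sum(v * xj for v, xj in zip(row, x)) for row in a]


def rmatvec(a: Matrix, y: Vector) -> Vector:
    """Return A^T y."""
    return [sum(row[p] * yr for row, yr in zip(a, y)) for p in range(len(a[0]))]


def gram(a: Matrix) -> Matrix:
    """Return A^T A."""
    n = len(a[0])
    return [[sum(row[p] * row[q] for row in a) for q in range(n)] for p in range(n)]


def solve(g: Matrix, rhs: Vector) -> Vector:
    """Gaussian elimination with partial pivoting (assumes g is nonsingular)."""
    n = len(rhs)
    m = [row[:] + [b] for row, b in zip(g, rhs)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[piv] = m[piv], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))) / m[r][r]
    return x


def omp(d: Matrix, b: Vector, k: int) -> Vector:
    """Greedy K-sparse solution of D w = b (assumed stand-in for the CS step)."""
    n = len(b)
    support: List[int] = []
    coef: Vector = []
    residual = b[:]
    for _ in range(k):
        corr = rmatvec(d, residual)
        best = max((j for j in range(n) if j not in support),
                   key=lambda j: abs(corr[j]))
        support.append(best)
        sub = [[row[j] for j in support] for row in d]
        coef = solve(gram(sub), rmatvec(sub, b))     # least squares on the support
        residual = [bi - fi for bi, fi in zip(b, matvec(sub, coef))]
    w = [0.0] * n
    for j, cj in zip(support, coef):
        w[j] = cj
    return w


def df_cspg(mats, meas, k, xi, sigma, patience=10, max_iter=500) -> Vector:
    """Best-response loop following Algorithm 1; returns the averaged estimate."""
    num, n_grid = len(mats), len(mats[0][0])
    grams = [gram(a) for a in mats]
    scale = (1.0 - xi) / (num - 1)
    w = [[0.0] * n_grid for _ in range(num)]
    tau = 0
    for _ in range(max_iter):            # safety cap (an assumption of this sketch)
        if tau >= patience:              # "while tau < 10" in Algorithm 1
            break
        prev, w = w, []
        for i in range(num):
            b = [xi * v for v in rmatvec(mats[i], meas[i])]   # xi * A_i^T y_i
            d = [[xi * v for v in row] for row in grams[i]]   # xi * A_i^T A_i
            for l in range(num):
                if l != i:
                    gw = matvec(grams[l], prev[l])
                    b = [bv + scale * gv for bv, gv in zip(b, gw)]
                    d = [[dv + scale * gv for dv, gv in zip(dr, gr)]
                         for dr, gr in zip(d, grams[l])]
            w.append(omp(d, b, k))       # step 7
        change = sum(math.sqrt(sum((p - q) ** 2 for p, q in zip(wi, pi)))
                     for wi, pi in zip(w, prev)) / num
        tau = tau + 1 if change < sigma else 0
    return [sum(wi[j] for wi in w) / num for j in range(n_grid)]


# Toy example (hypothetical): L = 3 subcarriers with scaled-identity sensing matrices.
N = 6
w_true = [0.0, 1.0, 0.0, 0.0, 1.0, 0.0]
mats = [[[s if r == c else 0.0 for c in range(N)] for r in range(N)]
        for s in (1.0, 0.8, 1.2)]
meas = [matvec(a, w_true) for a in mats]
w_hat = df_cspg(mats, meas, k=2, xi=0.4, sigma=1e-4)
targets = sorted(sorted(range(N), key=lambda j: -w_hat[j])[:2])
print(targets, [round(v, 4) for v in w_hat])
```

With realistic, non-orthogonal sensing matrices and noisy measurements, the greedy solver may not recover the support, so a dedicated CS reconstruction routine would normally replace `omp`.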
In game theory, there are several kinds of decision rules for reaching pure NE points, such as Fictitious Play (FP), Spatial Adaptive Play (SAP), Best Response (BR), and so on. We use BR dynamics here.

Lemma 1: For game $\Gamma$, the BR dynamics can guarantee that the end solution gets closer to an actual NE in a finite number of steps.

In the BR concept, players update their strategies according to the best utility in a round-robin manner. Each time, only one player is chosen to update its strategy while all other players remain unchanged, until the stopping criterion is met. Based on BR theory, the location vector $\mathbf{w}_i$ is optimized as follows:

$$\mathbf{w}_i = \arg\max_{\mathbf{w}_i} U_i(\mathbf{w}_i, \mathbf{w}_{-i})$$
$$= \arg\min_{\mathbf{w}_i}\left(\frac{-2(1-\xi)}{L-1}\sum_{l=1,\,l \neq i}^{L}\mathbf{w}_l^T\mathbf{A}_l^T\mathbf{A}_l - 2\xi\mathbf{y}_i^T\mathbf{A}_i\right)\mathbf{w}_i + \mathbf{w}_i^T\left(\frac{1-\xi}{L-1}\sum_{l=1,\,l \neq i}^{L}\mathbf{A}_l^T\mathbf{A}_l + \xi\mathbf{A}_i^T\mathbf{A}_i\right)\mathbf{w}_i, \tag{14}$$

where the first-order and second-order derivatives of the utility function $U_i$ are calculated as follows:

$$\frac{\partial U_i(\mathbf{w}_i, \mathbf{w}_{-i})}{\partial \mathbf{w}_i} = \frac{-2(1-\xi)}{L-1}\sum_{l=1,\,l \neq i}^{L}\mathbf{A}_l^T\mathbf{A}_l\mathbf{w}_l - 2\xi\mathbf{A}_i^T\mathbf{y}_i + 2\left(\frac{1-\xi}{L-1}\sum_{l=1,\,l \neq i}^{L}\mathbf{A}_l^T\mathbf{A}_l + \xi\mathbf{A}_i^T\mathbf{A}_i\right)\mathbf{w}_i, \tag{15}$$

$$\frac{\partial^2 U_i(\mathbf{w}_i, \mathbf{w}_{-i})}{\partial \mathbf{w}_i\,\partial \mathbf{w}_i^T} = 2\left(\frac{1-\xi}{L-1}\sum_{l=1,\,l \neq i}^{L}\mathbf{A}_l^T\mathbf{A}_l + \xi\mathbf{A}_i^T\mathbf{A}_i\right), \tag{16}$$

in which the second-order derivative is easily proved to be positive semidefinite, meaning that there are pure NE points which satisfy $\partial U_i(\mathbf{w}_i, \mathbf{w}_{-i})/\partial \mathbf{w}_i = 0$.

Let $\mathbf{b}_i = \frac{1-\xi}{L-1}\sum_{l=1,\,l \neq i}^{L}\mathbf{A}_l^T\mathbf{A}_l\mathbf{w}_l + \xi\mathbf{A}_i^T\mathbf{y}_i$ and $\mathbf{D}_i = \frac{1-\xi}{L-1}\sum_{l=1,\,l \neq i}^{L}\mathbf{A}_l^T\mathbf{A}_l + \xi\mathbf{A}_i^T\mathbf{A}_i$; then (14) can be converted into:

$$\mathbf{b}_i = \mathbf{D}_i\mathbf{w}_i, \tag{17}$$

which may have infinitely many solutions $\mathbf{w}_i$. A CS reconstruction algorithm is used here to find the sparse solution.

Based on the above analysis, the new algorithm DF-CSPG is summarized as Algorithm 1. In the proposed algorithm, $\sigma = 0.1L$ is used to control the speed of convergence, and $\tau = 10$ is set empirically to ensure that convergence has been reached and not by accident.

IV. SIMULATION RESULTS

To evaluate the proposed algorithm, we conduct the following simulations. The monitoring area is divided into $N = 400$ grids and $K = 4$ targets are randomly located at the grid points. Besides, $M = 40$ wireless links are uniformly deployed in the monitoring area. To model the noise, white Gaussian noise is added, whose level is expressed by the Signal-to-Noise Ratio (SNR).

We first test the convergence of the proposed algorithm DF-CSPG. Fig. 2 presents the evolution of the Euclidean distance of the recovered location vector. As shown in the figure, DF-CSPG converges for most values of $\xi$, except for the case $\xi = 0.01$. The main reason is that DF-CSPG with $\xi = 0.01$ hardly takes the deviation of personal estimation into consideration. In more detail: firstly, convergence is fast, and the converging variants settle by the time the iteration count reaches 15. Secondly, most variants converge to a similar level of Euclidean distance except $\xi = 0.01$, which
may ensure the localization accuracy even when the best $\xi$ is not chosen.

Next, we test the localization accuracy of DF-CSPG for different numbers of targets. The algorithms Basis Pursuit De-Noising (BPDN), Greedy Matching Pursuit (GMP) [4], Orthogonal Matching Pursuit (OMP), and Bayes Compressive Sensing (BCS) are all compared with DF-CSPG. As shown in Fig. 3, the localization error of all the algorithms increases as the number of targets increases. For $\xi = 0.01$, where the estimated location vector is mainly determined by the consensual estimation, the localization error is always much larger than that of DF-CSPG with other values of $\xi$. In addition, no matter how the number of targets changes, DF-CSPG with suitable $\xi$, such as $\xi = 0.2$, $\xi = 0.4$, and $\xi = 0.6$, keeps the localization error below that of the other algorithms.

Finally, we test the localization accuracy of DF-CSPG for different SNR values. As shown in Fig. 4, the localization error decreases for all the algorithms as the SNR changes from 5 dB to 35 dB. DF-CSPG with $\xi = 0.01$ may cause a higher localization error because it considers almost only the deviation of consensual estimation. DF-CSPG with $\xi = 0.2$ performs better than the other algorithms, demonstrating the effectiveness and robustness of the proposed DF-CSPG algorithm.

Note that although only a limited number of $\xi$ values are selected to verify DF-CSPG for lack of resources, there must be a $\xi$ offering better localization accuracy, which can be found roughly by experience.

Fig. 2. Euclidean distance of $\hat{\mathbf{w}}_i(n)$ under different $\xi$.

Fig. 3. Localization error versus target number.

Fig. 4. Localization error versus noise.

V. CONCLUSION

In this letter, we investigate the CS-based DFL problem using CSI. Different from existing algorithms, we solve the DFL problem from a game-theoretic perspective, which formulates DFL as a potential game with each CSI subcarrier as a player. Accordingly, we prove that the DFL game is an EPG. Following this idea, we develop a new algorithm, DF-CSPG, using CS and BR theory to find the NE points. Finally, simulations have shown that our method achieves better localization accuracy and robustness.

REFERENCES

[1] N. Patwari and J. Wilson, “RF sensor networks for device-free localization: Measurements, models, and algorithms,” Proc. IEEE, vol. 98, no. 11, pp. 1961–1973, Nov. 2010.
[2] J. Wang, Q. Gao, M. Pan, and Y. Fang, “Device-free wireless sensing: Challenges, opportunities, and applications,” IEEE Netw., vol. 32, no. 2, pp. 132–137, Mar./Apr. 2018.
[3] E. J. Candes and M. B. Wakin, “An introduction to compressive sampling,” IEEE Signal Process. Mag., vol. 25, no. 2, pp. 21–30, Mar. 2008.
[4] J. Wang et al., “LCS: Compressive sensing based device-free localization for multiple targets in sensor networks,” in Proc. IEEE INFOCOM, 2013, pp. 145–149.
[5] D. Yu, Y. Guo, N. Li, and D. Fang, “Dictionary refinement for compressive sensing based device-free localization via the variational EM algorithm,” IEEE Access, vol. 4, pp. 9743–9757, 2017.
[6] J. Wilson and N. Patwari, “A fade-level skew-laplace signal strength model for device-free localization with wireless networks,” IEEE Trans. Mobile Comput., vol. 11, no. 6, pp. 947–958, Jun. 2012.
[7] J. He et al., “LiReT: An fine-grained self-adaption device-free localization with little human effort,” in Proc. IEEE Int. Conf. Smart Comput., Hong Kong, 2017, pp. 1–3.
[8] M. Ke, Y. Xu, A. Anpalagan, D. Liu, and Y. Zhang, “Distributed TOA-based positioning in wireless sensor networks: A potential game approach,” IEEE Commun. Lett., vol. 22, no. 2, pp. 316–319, Feb. 2018.
[9] D. Monderer and L. S. Shapley, “Potential games,” Games Econ. Behav., vol. 14, no. 1, pp. 124–143, May 1996.
[10] J. Wang et al., “LiFS: Low human-effort, device-free localization with finegrained subcarrier information,” in Proc. 22nd ACM Annu. Int. Conf. Mobile Comput. Netw., Oct. 2016, pp. 243–256.
[11] J. Wang et al., “Toward accurate device-free wireless localization with a saddle surface model,” IEEE Trans. Veh. Technol., vol. 65, no. 8, pp. 6665–6677, Aug. 2016.
[12] Q. D. Lä, H. Y. Chew, and B.-H. Soong, Potential Game Theory: Applications in Radio Resource Allocation. Cham, Switzerland: Springer, 2016.
Fundamentals on Base Stations in Urban Cellular Networks: From the Perspective of Algebraic Topology

Ying Chen, Rongpeng Li, Zhifeng Zhao, and Honggang Zhang

Abstract: In recent decades, deployments of cellular networks have been going through an unprecedented expansion. In this regard, it is beneficial to acquire profound knowledge of cellular networks from the view of topology so that prominent network performance can be achieved by means of appropriate placement of base stations (BSs). In our research, practical location data of BSs in eight representative cities are processed with classical algebraic geometric instruments, including α-shapes, Betti numbers, and Euler characteristics. At first, a fractal nature is validated in urban BS topology from the perspectives of both Betti numbers and Hurst exponents. Furthermore, the log-normal distribution is affirmed to provide the best fit to the Euler characteristics of real urban BS deployments.

Index Terms: Base stations, telecommunication network topology, algebra, data analysis.

I. INTRODUCTION

A. Motivations

Prompted by the significance of base station (BS) deployment issues, many researchers have been working on the relevant subjects over decades [1]–[3]. However, the majority of the related literature studied BS deployments only by means of analyzing BS density distributions.

In our earlier work [4], several principal concepts in the algebraic geometry field, i.e., α-Shapes, Betti numbers, and Euler characteristics [5], were merged into the analyses of the BS topology of 12 countries around the world, and meaningful topological features were discovered, including the fractal property and the log-normal distribution of the Euler characteristics [4]. As we know, rules that hold for a whole system are not necessarily correct for a part of the system, and vice versa, such as the AdaBoost algorithm in deep learning [6] and massive MIMO (multiple-input multiple-output) in wireless systems [7]. On the other hand, from the perspectives of practical applications and engineering, the BS topological features in urban cellular networks may be more important than those at the scale of countries because a large number of users tend to live in capital cities. All of those concerns trigger our serious reflections on such a problem: will the topological

B. Algebraic Geometric Tools

In the algebraic geometric field, firstly, α-Shapes [5] of a discrete point set are consistent with its intuitive shapes. In general, an α-Shape is a manifold which can be constructed based on the set of points and a given scale parameter α; thus a finite series of manifolds is obtained as α varies from 0 to +∞. Secondly, Betti numbers [8] describe the topological information of a manifold, i.e., an α-Shape, by holes. Specifically, in the case of 3-dimensional space, a 0-dimensional hole (expressed by $\beta_0$) is an independent component in the α-Shape. A 1-dimensional hole (expressed by $\beta_1$) means the tunnel induced after plotting an edge between two directly or indirectly connected points in the α-Shape. A 2-dimensional hole (expressed by $\beta_2$) is a cavity or void in the α-Shape enclosed by a 2-dimensional surface. Thirdly, Euler characteristics [5] capture the topological features of a manifold through global statistical properties. According to the Euler-Poincaré formula, the Euler characteristic is equivalent to the alternating sum of the Betti numbers. Due to the space limitation, interested readers may refer to [4] for the details of these three topological notions and their relationships.

These seemingly abstract algebraic geometric tools are in fact closely connected with various practical systems, including wireless networking scenarios, and they have been applied in a vast number of networks. For instance, α-Shapes and Betti numbers have been incorporated in the study of the cosmic Web [5]. Moreover, Betti numbers, or so-called persistent homology, have provided an alternative perspective for coverage problems in wireless sensor networks [9]. In addition, the Euler characteristic has been utilized to study the impacts of the dynamic topology of connections on the performance of recurrent artificial neural networks [10].

C. Contributions

To be specific, this letter searches for topological features in BS deployments which transcend geographical, historical, or cultural distinctions and provide valuable guidance for designers of cellular networks. For this purpose, this letter offers two novel contributions as listed below:
features in national cellular networks still hold in the scale of • Firstly, a fractal nature is uncovered in urban BS config-
dense cities? urations based on Betti numbers and Hurst exponents;
• Secondly, log-normal distribution is confirmed to match
Manuscript received November 11, 2018; accepted December 14, 2018. the Euler characteristics of real urban BS location data
Date of publication December 21, 2018; date of current version April 9, 2019. with the best fitting among all candidate distributions.
This work was supported by National Natural Science Foundation of China
under Grant 61701439 and Grant 61731002. The associate editor coordinat-
Actually, the results in this letter could be applied in a plenty
ing the review of this paper and approving it for publication was J. Coon. of works for the study of complicated cellular networks. On
(Corresponding author: Zhifeng Zhao.) one hand, fractal features could facilitate more sophisticated
The authors are with the College of Information Science and stochastic geometry models of cellular networks and enhance
Electronic Engineering, Zhejiang University, Hangzhou 310058,
China (e-mail: 21631088chen_ying@zju.edu.cn; lirongpeng@zju.edu.cn;
the performance evaluation of cellular networks [11], [12]. On
zhaozf@zju.edu.cn; honggangzhang@zju.edu.cn). the other hand, log-normal distribution of the Euler character-
Digital Object Identifier 10.1109/LWC.2018.2889041 istics is a completely valuable discovery, and we believe it
CHEN et al.: FUNDAMENTALS ON BSs IN URBAN CELLULAR NETWORKS: FROM PERSPECTIVE OF ALGEBRAIC TOPOLOGY 613
will provide fruitful guidance on the analysis and design of BS deployments in the future.

The rest of this letter is organized as follows. The actual BS location data are briefly introduced in Section II. The fractal phenomenon in BS deployments for both Asian and European cities is presented in Section III. The log-normal distribution is affirmed to offer the best fit for the Euler characteristics of all eight cities in Section IV. Lastly, conclusions are summarized in Section V.

II. DATASET DESCRIPTION AND PROCESSING PROCEDURES

Massive real data downloaded from the OpenCellID community (https://community.opencellid.org/) [13], a platform that provides BS location data from all over the world, are analyzed to guarantee the validity of our results. Substantial BS location data of eight cities, comprising four representative cities in Asia (Seoul, Tokyo, Beijing, and Mumbai) and four in Europe (Warsaw, London, Munich, and Paris), have been collected from this platform. The basic information of these eight cities is provided in Table I.

TABLE I
BASIC INFORMATION OF 8 SELECTED CITIES

The data processing procedure from the original BS location data to the results in this letter is as follows. Firstly, a collection of BS nodes is abstracted into a discrete point set in the 2-dimensional plane. Then an α-Shape [5] can be constructed given a scale parameter α. As α grows from 0 to +∞, a finite series of α-Shapes is obtained. Secondly, each of the α-Shapes has corresponding values of the Betti numbers [8] and the Euler characteristic [5]. Thus a series of Betti numbers and Euler characteristics is obtained. Thirdly, taking α as the horizontal axis and the Betti numbers as the vertical axes, the Betti curves in this letter can be depicted. Lastly, PDF (probability density function) fitting curves can also be obtained from the series of Euler characteristics.

III. FRACTAL NATURE IN CELLULAR NETWORKS TOPOLOGY

As a vital property of complex networks, the fractal nature has been revealed in plenty of wireless networking scenarios [14]–[18]. In terms of Betti numbers and Hurst exponents, the fractal feature in BS topology is verified in this section.

A. Fractal Features Based on Betti Numbers

Random and fractal point distributions can be distinguished by their Betti curves as depicted in Fig. 1 (for details see [8]).

Fig. 1. Comparisons between the Betti curves of random and fractal point distributions.

Fig. 1(a) displays the practical point diagrams for both cases. For the left part, the horizontal and vertical coordinates of each point are randomly designated according to a Poisson point process (PPP), whereas the fractal point deployment in the right part is realized by hierarchical partitions of the area, which brings the most principal aspect of fractal features, i.e., self-similarity [19], into the point distribution.

Fig. 1(b) illustrates the Betti curves for the random pattern, and Fig. 1(c) those for the fractal pattern. Instead of the clearly monotonic decrease in the β0 curve and the single peak in the β1 curve in the random situation, the fractal nature is manifested by the distinctive features of multiple ripples and peaks in the β0 and β1 curves, respectively [8], where a ripple is formed by a distinct slope change, as highlighted by the amplified blue subgraph in Fig. 1(c).

In summary, the fractal property can be characterized by the multiple ripples and peaks of the Betti curves [8]. In our work, the fractal property in BS topology is verified in Fig. 2 for European cities and Fig. 3 for Asian ones. Beyond geographical constraints, it is remarkable to find a consistent fractal nature in all eight aforementioned cities.

Moreover, the positions of the multiple ripples and peaks can be evaluated on the basis of the crossover point of two straight lines with quite different gradients, as shown by the β0 curves in Fig. 2 and Fig. 3. As listed in Table II, it is clear that the peaks always come after the corresponding ripples, which makes sense because loops are larger than components.

B. Fractal Features Based on Hurst Exponents

The Hurst exponent, which takes values in [0, 1], is widely used as an evaluation index of the fractal property of data series, and the indication of fractality increases gradually as the Hurst exponent approaches 1 [20].

To verify the fractal nature in BS topology, the Hurst exponents of all eight cities are listed in Table I.
Fig. 2. Betti curves of the practical BS deployments in European cities.

Fig. 3. Betti curves of the practical BS deployments in Asian cities.

TABLE II
POSITIONS OF THE RIPPLES AND PEAKS IN THE BETTI CURVES

TABLE III
CANDIDATE DISTRIBUTIONS AND THEIR PDF EXPRESSIONS

Fig. 4. Comparisons between the practical PDF and the fitted ones for the Euler characteristics of BS locations in European cities.

The definition of the data series and the computation of the Hurst exponents are explained as follows. Firstly, a center point is selected randomly among the BS location data of a city, and a circle is drawn with a random radius. Secondly, the distance between each of the points within the circle and the center point is computed, and these distance values form the data series for computing the Hurst exponent. Thirdly, a Hurst exponent is calculated according to the R/S method [21]. Lastly, the above steps are repeated a hundred times with different center points and radii, and the average value is the final Hurst exponent. The fractal nature
is evidently affirmed because all the Hurst exponents turn out to be very close to 1.

IV. LOG-NORMAL DISTRIBUTION OF THE EULER CHARACTERISTICS

Euler characteristics can be calculated from Betti numbers according to the Euler-Poincaré formula [5]. Since a clear heavy-tailed property is demonstrated in the PDFs of the Euler characteristics, three classical heavy-tailed statistical distributions and the widely used Poisson distribution are selected as the candidates to match the PDFs. The candidates and their PDF formulas are given in Table III.

The real PDF and the fitted ones are presented in Fig. 4 for European cities and in Fig. 5 for Asian ones. Moreover, the root mean square error (RMSE) between the real PDF and every candidate is listed in Table IV. The RMSE between the log-normal distribution and the real PDF is almost one order of magnitude smaller than the others in the same row. As a result, a striking but well-grounded conclusion can be drawn: regardless of geographical boundaries, cultural differences, and historical limitations, the Euler characteristics of both Asian and European cities conform to the log-normal distribution.

Fig. 5. Comparisons between the practical PDF and the fitted ones for the Euler characteristics of BS locations in Asian cities.

TABLE IV
RMSE BETWEEN EACH CANDIDATE DISTRIBUTION AND THE PRACTICAL ONE

V. CONCLUSION AND FUTURE WORKS

In this letter, several algebraic geometric tools, namely, α-Shapes, Betti numbers, and Euler characteristics, have been utilized to discover inherent topological properties of BS deployments in Asian and European cities. Firstly, a fractal nature has been revealed in BS topology according to both Betti numbers and Hurst exponents. Moreover, it has been shown that the log-normal distribution provides the best match for the PDFs of the Euler characteristics of the real BS location data in cellular networks among the classical candidate distributions.

However, regardless of the topological discoveries above, some thought-provoking problems still need to be solved. For example, what are the influential factors for the positions of the ripples and peaks in the Betti curves? What is the intrinsic meaning of the number of ripples or peaks? All of these issues will be investigated in our future works.

REFERENCES

[1] J. Kibiłda, B. Galkin, and L. A. DaSilva, “Modelling multi-operator base station deployment patterns in cellular networks,” IEEE Trans. Mobile Comput., vol. 15, no. 12, pp. 3087–3099, Dec. 2016.
[2] Y. Zhou et al., “Large-scale spatial distribution identification of base stations in cellular networks,” IEEE Access, vol. 3, pp. 2987–2999, 2016.
[3] Z. Zhao, M. Li, R. Li, and Y. Zhou, “Temporal-spatial distribution nature of traffic and base stations in cellular networks,” IET Commun., vol. 11, no. 16, pp. 2410–2416, 2017.
[4] Y. Chen, R. Li, Z. Zhao, and H. Zhang, “Study on base station topology in cellular networks: Take advantage of alpha shapes, Betti numbers, and Euler characteristics,” arXiv:1808.07356v1, Aug. 2018.
[5] R. V. D. Weygaert et al., Alpha, Betti and the Megaparsec Universe: On the Topology of the Cosmic Web. Heidelberg, Germany: Springer, 2011.
[6] J. Zhu, S. Rosset, H. Zou, and T. Hastie, “Multi-class AdaBoost,” Stat. Interface, vol. 2, no. 3, pp. 349–360, 2006.
[7] E. G. Larsson, O. Edfors, F. Tufvesson, and T. L. Marzetta, “Massive MIMO for next generation wireless systems,” IEEE Commun. Mag., vol. 52, no. 2, pp. 186–195, Feb. 2014.
[8] P. Pranav et al., “The topology of the cosmic Web in terms of persistent Betti numbers,” Monthly Notices Roy. Astronomical Soc., vol. 4, no. 465, pp. 4281–4310, Mar. 2017.
[9] V. D. Silva and R. Ghrist, “Coverage in sensor networks via persistent homology,” Algebraic Geometric Topol., vol. 7, no. 1, pp. 339–358, Apr. 2007.
[10] P. Masulli and A. E. P. Villa, “The topology of the directed clique complex as a network invariant,” Springerplus, vol. 5, no. 388, pp. 1–12, 2016.
[11] R. Li, Z. Zhao, Y. Zhong, C. Qi, and H. Zhang, “The stochastic geometry analyses of cellular networks with α-stable self-similarity,” IEEE Trans. Commun., to be published, doi: 10.1109/TCOMM.2018.2883099.
[12] X. Ge, X. Tian, Y. Qiu, G. Mao, and T. Han, “Small-cell networks with fractal coverage characteristics,” IEEE Trans. Commun., vol. 66, no. 11, pp. 5457–5469, Nov. 2018.
[13] M. Ulm, P. Widhalm, and N. Brändle, “Characterization of mobile phone localization errors with OpenCellID data,” in Proc. Int. Conf. Adv. Logistics Transp., Valenciennes, France, May 2015, pp. 100–104.
[14] C. Yuan, Z. Zhao, R. Li, M. Li, and H. Zhang, “The emergence of scaling law, fractal patterns and small-world in wireless networks,” IEEE Access, vol. 5, pp. 3121–3130, 2017.
[15] X. Ge et al., “Wireless fractal cellular networks,” IEEE Wireless Commun., vol. 23, no. 5, pp. 110–119, Oct. 2016.
[16] Y. Hao et al., “Wireless fractal ultra-dense cellular networks,” Sensors, vol. 17, no. 4, p. E841, 2017.
[17] S. H. Strogatz, “Complex systems: Romanesque networks,” Nature, vol. 433, no. 7024, pp. 365–366, 2005.
[18] C. Song, S. Havlin, and H. A. Makse, “Self-similarity of complex networks,” Nature, vol. 433, no. 7024, pp. 392–395, 2005.
[19] D. Benhaiem, M. Joyce, and B. Marcos, “Self-similarity and stable clustering in a family of scale-free cosmologies,” Monthly Notices Roy. Astronomical Soc., vol. 443, no. 3, pp. 2126–2153, 2014.
[20] T. Gneiting and M. Schlather, “Stochastic models which separate fractal dimension and Hurst effect,” Soc. Ind. Appl. Math. Rev., vol. 46, no. 2, pp. 269–282, 2001.
[21] M. Fernández-Martínez, M. A. Sánchez-Granero, J. E. T. Segovia, and I. M. Román-Sánchez, “An accurate algorithm to calculate the Hurst exponent of self-similar processes,” Phys. Lett. A, vol. 378, nos. 32–33, pp. 2355–2362, 2014.
Modified Conjugate Beamforming for Cell-Free Massive MIMO

Masoud Attarifar, Student Member, IEEE, Aliazam Abbasfar, Senior Member, IEEE, and Angel Lozano, Fellow, IEEE

Abstract—We present a modification of conjugate beamforming for the forward link of cell-free massive MIMO networks. This modification eliminates the self-interference and yields a performance that, without forward pilots, closely approaches what would be achieved with such pilots in place. The simplicity of conjugate beamforming is preserved, with no need for matrix inversions, at the expense of fading-rate coordination among the access points.

Index Terms—Cell-free networks, massive MIMO, conjugate beamforming, power allocation.

I. INTRODUCTION

Cell-free massive MIMO can be regarded as a deconstruction of cellular massive MIMO: the many antennas that would be collocated at the cell sites are scattered over the network and the associations between users and cells are released. The result is a dense infrastructure of access points (APs), each featuring one or a few antennas, with every user potentially served from every AP via conjugate beamforming [1]–[5]. Capitalizing on extensive backhaul, cell-free networks offer several advantages over their cellular counterparts, including increased large-scale diversity and user proximity.

As in cellular massive MIMO, the total number of antennas is substantially larger than the number of users per time-frequency resource; this renders conjugate beamforming effective and ensures low multiuser interference. In contrast, the channel hardening observed in cellular massive MIMO does not carry over to cell-free networks because, in such networks, the channel gains are not IID; they are independent, but have very disparate strengths. As a result, substantial self-interference arises in forward-link transmissions devoid of pilots. This problem can be remedied at the user receivers, at the expense of incorporating precoded forward pilots for each user [6]. Alternatively, a partial recovery is possible through blind methods operating on data observations [7].

This letter proposes a modified conjugate beamforming technique that prevents self-interference completely, with no action required at the receivers.

Manuscript received November 19, 2018; revised December 25, 2018; accepted December 26, 2018. Date of publication January 1, 2019; date of current version April 9, 2019. The work of A. Lozano was supported in part by MINECO/FEDER, UE under Project TEC2015-66228-P, and in part by the European Research Council through H2020 Framework Programme/ERC under Grant 694974. The associate editor coordinating the review of this paper and approving it for publication was Y. Huang. (Corresponding author: Aliazam Abbasfar.)
M. Attarifar and A. Abbasfar are with the School of Electrical and Computer Engineering, University of Tehran, Tehran, Iran (e-mail: m.attarifar@ut.ac.ir; abbasfar@ut.ac.ir).
A. Lozano is with the Department of Information and Communication Technologies, Universitat Pompeu Fabra, 08002 Barcelona, Spain (e-mail: angel.lozano@upf.edu).
Digital Object Identifier 10.1109/LWC.2018.2890470

II. NETWORK AND CHANNEL MODELS

The networks under consideration feature $N$ APs, each equipped with $M$ antennas (where $M$ is small), and $K$ single-antenna users. Every AP can communicate with every user on each time-frequency resource. With time-division duplexing and perfect calibration of the transmit-receive chains [8], the forward and reverse channels are reciprocal. A share of the resources are reserved for pilot transmissions from the users, based on which the channels are estimated by the APs. The remaining resources, apportioned between the forward and reverse directions as desired, are available for data transmission.

A. Large-Scale Modeling

Provided the AP locations are agnostic to the radio propagation, shadowing has been shown to make such locations seem Poisson-distributed from the vantage of any user [9]. This approximation sharpens as the shadowing strengthens, being highly precise for values of interest [9], [10]. Leveraging this result, we place the APs and users randomly over the network, such that their locations conform to respective (mutually independent) binomial point processes; as the network grows, these converge to Poisson point processes.

Each CDF in this letter corresponds to 1000 network snapshots in a wrapped-around universe with 200 APs, ensuring a 95% confidence interval of 0.3% in absolute terms.

Signals are subject to pathloss with exponent $\eta$, giving a large-scale channel gain $G_{n,k} = d_{n,k}^{-\eta}$ between the $n$th AP and the $k$th user, distanced by $d_{n,k}$. The forward- and reverse-link large-scale SNRs equal $\mathrm{SNR}_{n,k} = G_{n,k} P/\sigma^2$ and $\mathrm{SNR}^{r}_{n,k} = G_{n,k} P^{r}/\sigma^2$ with $P$ and $P^{r}$ the maximum transmit powers at APs and users, respectively, measured at 1 m from their source so that no scaling constants are needed. In turn, $\sigma^2$ is the noise power.

Defining $\rho = P/P^{r}$, we can relate the large-scale SNRs in both directions via

$$\mathrm{SNR}^{r}_{n,k} = \frac{\mathrm{SNR}_{n,k}}{\rho}. \tag{1}$$

B. Small-Scale Modeling

Besides $G_{n,k}$, the reverse-link channel between the $k$th user and the $n$th AP features the small-scale fading vector $\boldsymbol{h}_{n,k} \sim \mathcal{N}_{\mathbb{C}}(\boldsymbol{0}, \boldsymbol{I}_M)$, independent across users and APs. Owing to reciprocity, the forward-link fading between the $n$th AP and the $k$th user is $\boldsymbol{h}^{*}_{n,k}$.

III. HARDENING-BASED CONJUGATE BEAMFORMING

A. Channel Hardening

The effectiveness of conjugate beamforming descends from the law of large numbers. Let $\boldsymbol{h}^{*}_{k}$ and $\boldsymbol{h}^{*}_{\mathsf{k}}$ be the $N$-dimensional fading vectors from all antennas to users $k$ and $\mathsf{k}$. In cellular massive MIMO, where the antennas are collocated, these
ATTARIFAR et al.: MODIFIED CONJUGATE BEAMFORMING FOR CELL-FREE MASSIVE MIMO 617
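vectors have IID entries and, for $N \to \infty$,

$$\frac{1}{N}\,\boldsymbol{h}^{*}_{k}\boldsymbol{h}_{\mathsf{k}} \;\xrightarrow{\text{a.s.}}\; \mathbb{E}\big[h^{*}_{n,k}h_{n,\mathsf{k}}\big] = \begin{cases} 1 & \mathsf{k} = k \\ 0 & \mathsf{k} \neq k. \end{cases} \tag{3}$$

Thus, for $N \gg K$, a precoder $\boldsymbol{f}_k \propto \boldsymbol{h}_k$ leads to (i) a hardened (over the fading) precoded channel at user $k$, whereby forward pilots are not needed and the decoding can rely on large-scale quantities, and (ii) minimal interference onto users $\mathsf{k} \neq k$.

In cell-free networks, the channel gains connecting the $N$ APs with a user are no longer IID and hence (i) is not upheld.

B. Reverse-Link Channel Estimation

Let $\mathcal{P}_k$ be the set of users (including user $k$) that share the pilot of user $k$. The simultaneous transmission from the users in this set of a pilot of power $P^{r}$ is observed at the $n$th AP as

$$\boldsymbol{y}_n = \sum_{\mathsf{k}\in\mathcal{P}_k} \sqrt{G_{n,\mathsf{k}}}\;\boldsymbol{h}_{n,\mathsf{k}}\sqrt{P^{r}} + \boldsymbol{v}_n \tag{4}$$

where $\boldsymbol{v}_n \sim \mathcal{N}_{\mathbb{C}}(\boldsymbol{0}, \sigma^2 \boldsymbol{I}_M)$. From $\boldsymbol{y}_n$, the $n$th AP produces the LMMSE channel estimate $\hat{\boldsymbol{h}}_{n,k}$ satisfying $\boldsymbol{h}_{n,k} = \hat{\boldsymbol{h}}_{n,k} + \tilde{\boldsymbol{h}}_{n,k}$ where $\tilde{\boldsymbol{h}}_{n,k} \sim \mathcal{N}_{\mathbb{C}}(\boldsymbol{0}, \mathrm{MMSE}_{n,k}\,\boldsymbol{I})$ is uncorrelated error and

$$\mathrm{MMSE}_{n,k} = \frac{1 + \sum_{\mathsf{k}\in\mathcal{P}_k,\,\mathsf{k}\neq k} \mathrm{SNR}^{r}_{n,\mathsf{k}}}{1 + \sum_{\mathsf{k}\in\mathcal{P}_k} \mathrm{SNR}^{r}_{n,\mathsf{k}}}. \tag{5}$$

Since the problem we address is not directly related to pilot contamination, and there are ways of keeping such contamination at bay [1], [11], we disregard it to avoid distractions and the need to posit specific pilot assignments. This amounts to $\mathcal{P}_k$ containing only user $k$, from which

$$\mathbb{E}\big[\|\hat{\boldsymbol{h}}_{n,k}\|^{2}\big] = \frac{M\,\mathrm{SNR}^{r}_{n,k}}{1 + \mathrm{SNR}^{r}_{n,k}}, \qquad \mathbb{E}\big[\|\tilde{\boldsymbol{h}}_{n,k}\|^{2}\big] = \frac{M}{1 + \mathrm{SNR}^{r}_{n,k}}. \tag{6}$$

C. Forward-Link Data Transmission

The precoder applied by the $n$th AP to beamform to user $k$ is $\boldsymbol{f}^{\mathrm{cb}}_{n,k} \propto \hat{\boldsymbol{h}}_{n,k}$. Altogether, the $n$th AP generates the signal

$$\boldsymbol{x}_n = \sum_{k=1}^{K} \boldsymbol{f}^{\mathrm{cb}}_{n,k}\, s_k \tag{7}$$

with $s_k$ the unit-variance symbol intended for user $k$ and

$$\boldsymbol{f}^{\mathrm{cb}}_{n,k} = \sqrt{\frac{p_{n,k}\,P}{\mathbb{E}\big[\|\hat{\boldsymbol{h}}_{n,k}\|^{2}\big]}}\;\hat{\boldsymbol{h}}_{n,k} \tag{8}$$

where, by virtue of the normalization by $\mathbb{E}[\|\hat{\boldsymbol{h}}_{n,k}\|^{2}]$, the share of power that the $n$th AP devotes to user $k$ is $p_{n,k}$ with $\sum_{k=1}^{K} p_{n,k} \le 1$. Within the class of large-scale-based power allocations, the most natural are:
• Maximal-ratio, $p_{n,k} = G_{n,k} \big/ \sum_{\mathsf{k}=1}^{K} G_{n,\mathsf{k}}$.
• Max-min (a quasi-convex optimization solved iteratively), which equalizes the SINRs and maximizes fairness [1].

User $k$ observes

$$y_k = \sum_{n=1}^{N} \sqrt{G_{n,k}}\;\boldsymbol{h}^{*}_{n,k}\boldsymbol{x}_n + v_k \tag{9}$$

where $v_k \sim \mathcal{N}_{\mathbb{C}}(0, \sigma^2)$. With hardening-based reception (no forward pilots), user $k$ recovers the projection of $y_k$ onto [1]

$$\sum_{n=1}^{N} \sqrt{G_{n,k}}\;\mathbb{E}\big[\boldsymbol{h}^{*}_{n,k}\boldsymbol{f}^{\mathrm{cb}}_{n,k}\big] = \sum_{n=1}^{N} \sqrt{\frac{M\,\mathrm{SNR}^{r}_{n,k}}{1 + \mathrm{SNR}^{r}_{n,k}}}\;\sqrt{G_{n,k}\,p_{n,k}\,P} \tag{10}$$

where we have invoked (6). In turn, the projection of $y_k$ on $\sum_{n}\sqrt{G_{n,k}}\,\boldsymbol{h}^{*}_{n,k}\boldsymbol{f}^{\mathrm{cb}}_{n,k} - \sum_{n}\sqrt{G_{n,k}}\,\mathbb{E}[\boldsymbol{h}^{*}_{n,k}\boldsymbol{f}^{\mathrm{cb}}_{n,k}]$ is self-interference. Combining (6)–(10), the observation at user $k$ can be written as

$$
\begin{aligned}
y_k ={}& \underbrace{\sum_{n=1}^{N} \sqrt{\frac{M\,\mathrm{SNR}^{r}_{n,k}}{1 + \mathrm{SNR}^{r}_{n,k}}}\;\sqrt{G_{n,k}\,p_{n,k}\,P}\;s_k}_{\text{Desired Signal: } S_k} \\
&+ \underbrace{\sum_{n=1}^{N} \left( \sqrt{\frac{1 + \mathrm{SNR}^{r}_{n,k}}{M\,\mathrm{SNR}^{r}_{n,k}}}\;\boldsymbol{h}^{*}_{n,k}\hat{\boldsymbol{h}}_{n,k} - \sqrt{\frac{M\,\mathrm{SNR}^{r}_{n,k}}{1 + \mathrm{SNR}^{r}_{n,k}}} \right) \sqrt{G_{n,k}\,p_{n,k}\,P}\;s_k}_{\text{Self-Interference: } E_k} \\
&+ \underbrace{\sum_{n=1}^{N} \sqrt{G_{n,k}\,P} \sum_{\mathsf{k}\neq k} \sqrt{\frac{1 + \mathrm{SNR}^{r}_{n,\mathsf{k}}}{M\,\mathrm{SNR}^{r}_{n,\mathsf{k}}}}\;\boldsymbol{h}^{*}_{n,k}\hat{\boldsymbol{h}}_{n,\mathsf{k}}\,\sqrt{p_{n,\mathsf{k}}}\;s_{\mathsf{k}}}_{\text{Multiuser Interference: } I_k} + v_k
\end{aligned}
\tag{2}
$$

from which the SINR emerges as

\begin{align}
\mathrm{SINR}^{\mathrm{cb}}_k &= \frac{\mathbb{E}\big[|S_k|^2\big]}{\sigma^2 + \mathbb{E}\big[|E_k|^2\big] + \mathbb{E}\big[|I_k|^2\big]} \tag{11} \\
&= M\,\frac{\left( \sum_{n=1}^{N} \sqrt{\dfrac{\mathrm{SNR}^{2}_{n,k}\,p_{n,k}}{\rho + \mathrm{SNR}_{n,k}}} \right)^{2}}{1 + \sum_{n=1}^{N} \mathrm{SNR}_{n,k} \sum_{\mathsf{k}=1}^{K} p_{n,\mathsf{k}}}, \tag{12}
\end{align}

expressed as function of only the forward-link SNRs via (1). In interference-limited conditions, the above reduces to

$$\mathrm{SIR}^{\mathrm{cb}}_k = M\,\frac{\left( \sum_{n=1}^{N} \sqrt{G_{n,k}\,p_{n,k}} \right)^{2}}{\sum_{n=1}^{N} G_{n,k} \sum_{\mathsf{k}=1}^{K} p_{n,\mathsf{k}}}. \tag{13}$$

IV. HARDENING SHORTFALL IN CELL-FREE NETWORKS

The fluctuations of the precoded channels over their expected values have two detrimental effects, opposite sides of the same coin, on hardening-based receivers: they subtract signal power, turning it into self-interference. These effects are gauged in Fig. 1, which depicts, as a function of $M$,

$$\frac{\mathbb{E}\big[|E_k|^2\big]}{\mathbb{E}\big[|S_k|^2\big]} \qquad \text{and} \qquad \frac{\mathbb{E}\big[|E_k|^2\big]}{\mathbb{E}\big[|E_k|^2\big] + \mathbb{E}\big[|I_k|^2\big]} \tag{14}$$

further averaged over the user locations, for maximal-ratio power allocation, $\eta = 4$, and $N/K = 4$ and $10$. These ratios quantify the self-interference, respectively as a share of the desired signal and of the total interference. For $M = 1$, self-interference steals about a third of the desired signal and it represents about two thirds of the interference, and only for substantial $M$ is this largely corrected. For cell-free networks with small $M$, therefore, self-interference is a major issue.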
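Fig. 1. Average (over all locations) ratios of self-interference to desired signal and of self-interference to total interference. Both ratios are for maximal-ratio power allocation, $\eta = 4$, and $N/K = 4, 10$, as a function of $M$.

Fig. 2. CDF of SIR (averaged over the fading) for maximal-ratio power allocation, $\eta = 4$, $N/K = 10$ and $M = 1$.

V. GENIE-AIDED UPPER BOUND

To calibrate the loss in SIR caused by self-interference when $M = 1$, we can contrast (13) with a genie-aided counterpart where users have perfect knowledge of their precoded channels, i.e., user $k$ knows $\boldsymbol{h}^{*}_{n,k}\boldsymbol{f}_{n,k}$ or equivalently $\boldsymbol{h}^{*}_{n,k}\hat{\boldsymbol{h}}_{n,k}$. Then, the projection onto this quantity becomes the desired signal, allowing us to rewrite (2) as

$$
\begin{aligned}
y_k ={}& \underbrace{\sum_{n=1}^{N} \sqrt{\frac{1 + \mathrm{SNR}^{r}_{n,k}}{M\,\mathrm{SNR}^{r}_{n,k}}}\;\boldsymbol{h}^{*}_{n,k}\hat{\boldsymbol{h}}_{n,k}\,\sqrt{G_{n,k}\,p_{n,k}\,P}\;s_k}_{\text{Desired Signal: } S_k} \\
&+ \underbrace{\sum_{n=1}^{N} \sqrt{G_{n,k}\,P} \sum_{\mathsf{k}\neq k} \sqrt{\frac{1 + \mathrm{SNR}^{r}_{n,\mathsf{k}}}{M\,\mathrm{SNR}^{r}_{n,\mathsf{k}}}}\;\boldsymbol{h}^{*}_{n,k}\hat{\boldsymbol{h}}_{n,\mathsf{k}}\,\sqrt{p_{n,\mathsf{k}}}\;s_{\mathsf{k}}}_{\text{Multiuser Interference: } I_k} + v_k
\end{aligned}
\tag{15}
$$

free of self-interference. From (15), as a function of the realization of $\boldsymbol{h}^{*}_{n,k}\hat{\boldsymbol{h}}_{n,k}$,

\begin{align}
\mathrm{SINR}^{\mathrm{genie}}_k &= \frac{\mathbb{E}\big[|S_k|^2 \,\big|\, \boldsymbol{h}^{*}_{n,k}\hat{\boldsymbol{h}}_{n,k}\big]}{\sigma^2 + \mathbb{E}\big[|I_k|^2 \,\big|\, \boldsymbol{h}^{*}_{n,k}\hat{\boldsymbol{h}}_{n,k}\big]} \notag \\
&= \frac{\dfrac{1}{M} \left| \sum_{n=1}^{N} \sqrt{\rho + \mathrm{SNR}_{n,k}}\;\boldsymbol{h}^{*}_{n,k}\hat{\boldsymbol{h}}_{n,k}\,\sqrt{p_{n,k}} \right|^{2}}{1 + \dfrac{1}{M} \sum_{n=1}^{N} \mathrm{SNR}_{n,k}\;\mathbb{E}\big[\|\boldsymbol{h}_{n,k}\|^{2} \,\big|\, \boldsymbol{h}^{*}_{n,k}\boldsymbol{f}_{n,k}\big] \sum_{\mathsf{k}\neq k} p_{n,\mathsf{k}}}. \tag{16}
\end{align}

In interference-limited conditions, the above reduces to

$$\mathrm{SIR}^{\mathrm{genie}}_k = \frac{\left( \sum_{n=1}^{N} \sqrt{G_{n,k}}\;\|\boldsymbol{h}_{n,k}\|^{2}\,\sqrt{p_{n,k}} \right)^{2}}{\sum_{n=1}^{N} G_{n,k}\,\|\boldsymbol{h}_{n,k}\|^{2} \sum_{\mathsf{k}\neq k} p_{n,\mathsf{k}}}. \tag{17}$$

Figs. 2–4 present SIR distributions over all locations for $M = 1$, confirming the deficiency of (13) relative to (17) when $M$ is small.

VI. MODIFIED CONJUGATE BEAMFORMING

The technique we propose exploits that the APs have more information than the users, hence they can (with only the uncertainty of channel estimation errors) compensate for the precoded channel fluctuations, tightening the overall gain around a target value. To make room for upward–downward corrections, such target needs to be below the hardening-based level in (10); we set the target to a portion $r \in [0, 1]$ of (10).

Start with conjugate beamforming as per (8). If the overall gain from all APs to user $k$ exceeds the target, i.e., if

$$\sum_{n} \sqrt{G_{n,k}}\;\hat{\boldsymbol{h}}^{*}_{n,k}\boldsymbol{f}^{\mathrm{cb}}_{n,k} > r\,\mathbb{E}\!\left[ \sum_{n} \sqrt{G_{n,k}}\;\hat{\boldsymbol{h}}^{*}_{n,k}\boldsymbol{f}^{\mathrm{cb}}_{n,k} \right], \tag{18}$$

then we declare an upward fluctuation and all the precoders intended for user $k$ are scaled down to

$$\boldsymbol{f}^{\mathrm{mod}}_{n,k} = \frac{r\,\mathbb{E}\!\left[ \sum_{n} \sqrt{G_{n,k}}\;\hat{\boldsymbol{h}}^{*}_{n,k}\boldsymbol{f}^{\mathrm{cb}}_{n,k} \right]}{\sum_{n} \sqrt{G_{n,k}}\;\hat{\boldsymbol{h}}^{*}_{n,k}\boldsymbol{f}^{\mathrm{cb}}_{n,k}}\;\boldsymbol{f}^{\mathrm{cb}}_{n,k} \qquad \forall n \tag{19}$$

such that the overall gain is pushed back to the target.

If (18) is reversed, the fluctuation is downwards. To correct it with the minimal amount of interference to other users, we identify as $n_{\max}$ the AP having the strongest large-scale gain to user $k$ and adjust upwards only $\boldsymbol{f}_{n_{\max},k}$, setting it to

$$\boldsymbol{f}^{\mathrm{mod}}_{n_{\max},k} = \frac{r\,\mathbb{E}\!\left[ \sum_{n} \sqrt{G_{n,k}}\;\hat{\boldsymbol{h}}^{*}_{n,k}\boldsymbol{f}^{\mathrm{cb}}_{n,k} \right] - \sum_{n \neq n_{\max}} \sqrt{G_{n,k}}\;\hat{\boldsymbol{h}}^{*}_{n,k}\boldsymbol{f}^{\mathrm{cb}}_{n,k}}{\sqrt{G_{n_{\max},k}}\;\|\hat{\boldsymbol{h}}_{n_{\max},k}\|^{2}}\;\hat{\boldsymbol{h}}_{n_{\max},k} \tag{20}$$

whereby

$$\sqrt{G_{n_{\max},k}}\;\hat{\boldsymbol{h}}^{*}_{n_{\max},k}\boldsymbol{f}^{\mathrm{mod}}_{n_{\max},k} = r\,\mathbb{E}\!\left[ \sum_{n} \sqrt{G_{n,k}}\;\hat{\boldsymbol{h}}^{*}_{n,k}\boldsymbol{f}^{\mathrm{cb}}_{n,k} \right] - \sum_{n \neq n_{\max}} \sqrt{G_{n,k}}\;\hat{\boldsymbol{h}}^{*}_{n,k}\boldsymbol{f}^{\mathrm{cb}}_{n,k} \tag{21}$$

and the target is met again. Altogether, every user experiences a stable overall gain, disturbed only by channel estimation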
errors, and the multiuser interference is curbed. (Since $r < 1$, the per-AP transmit powers are lowered and a rescaling of all precoders is advisable if noise is significant; in interference-limited conditions, this is immaterial.) User $k$ observes

$$
\begin{aligned}
y_k ={}& \underbrace{r\,\mathbb{E}\!\left[ \sum_{n} \sqrt{G_{n,k}}\;\hat{\boldsymbol{h}}^{*}_{n,k}\boldsymbol{f}^{\mathrm{cb}}_{n,k} \right] s_k}_{\text{Desired Signal: } S_k} \\
&+ \underbrace{\left( \sum_{n} \sqrt{G_{n,k}}\;\boldsymbol{h}^{*}_{n,k}\boldsymbol{f}^{\mathrm{mod}}_{n,k} - r\,\mathbb{E}\!\left[ \sum_{n} \sqrt{G_{n,k}}\;\hat{\boldsymbol{h}}^{*}_{n,k}\boldsymbol{f}^{\mathrm{cb}}_{n,k} \right] \right) s_k}_{\text{Self-Interference: } E_k} \\
&+ \underbrace{\sum_{n} \sum_{\mathsf{k}\neq k} \sqrt{G_{n,k}}\;\boldsymbol{h}^{*}_{n,k}\boldsymbol{f}^{\mathrm{mod}}_{n,\mathsf{k}}\;s_{\mathsf{k}}}_{\text{Mutual Interference: } I_k} + v_k
\end{aligned}
\tag{22}
$$

where, if $\hat{\boldsymbol{h}}_{n,k} = \boldsymbol{h}_{n,k}$ (interference-limited conditions without pilot contamination), $E_k$ indeed vanishes giving

$$\mathrm{SIR}^{\mathrm{mod}}_k = \frac{r^{2} M \left( \sum_{n} \sqrt{G_{n,k}\,p_{n,k}} \right)^{2}}{\sum_{n} \sum_{\mathsf{k}\neq k} G_{n,k}\;\mathbb{E}\big[\|\boldsymbol{f}^{\mathrm{mod}}_{n,\mathsf{k}}\|^{2}\big]} \tag{23}$$

with the denominator following from the independence of $\boldsymbol{f}^{\mathrm{mod}}_{n,\mathsf{k}}$ and $\boldsymbol{h}_{n,k}$. The portion $r$, which must be known by the users, can be optimized over; note that, in (23), $r$ affects the numerator and, through $\boldsymbol{f}^{\mathrm{mod}}_{n,\mathsf{k}}$, also the denominator.

Fig. 3. CDF of SIR (averaged over the fading) for max-min power allocation, $\eta = 4$, $N/K = 10$ and $M = 1$.

Fig. 4. CDF of SIR (averaged over the fading) for max-min power allocation, $\eta = 4$, $N/K = 4$ and $M = 1$.

VII. EXAMPLES

Figs. 2–3 exemplify how, with $N/K = 10$ and $r = 1/\sqrt{2}$ for both maximal-ratio and max-min power allocations, the modifications push the SIR close to the (unachievable) genie-aided upper bound, erasing most of the deficit of conjugate beamforming. For shrinking $N/K$, self-interference is progressively overcome by multiuser interference, yet the modified beamformer continues to perform satisfactorily close to the genie-aided bound. This is illustrated in Fig. 4, which is the counterpart to Fig. 3 for $N/K = 4$.

In terms of $\eta$, its increase weakens the hardening and renders the modified beamforming even more effective. Conversely, a decrease in $\eta$ reduces the advantage.

VIII. DISCUSSION

The proposed modifications preserve the utter simplicity of conjugate beamforming, free of matrix inversions, at the expense of fading-rate coordination among the APs (needed anyway to combine and decode the reverse-link transmissions). By translating these modifications from $\boldsymbol{f}_{n,k}$ to $p_{n,k}$, they can be construed as a fading-based power allocation, which can be overlaid onto any existing large-scale-based allocation.

REFERENCES

[1] H. Q. Ngo, A. Ashikhmin, H. Yang, E. G. Larsson, and T. L. Marzetta, “Cell-free massive MIMO versus small cells,” IEEE Trans. Wireless Commun., vol. 16, no. 3, pp. 1834–1850, Mar. 2017.
[2] E. Nayebi, A. Ashikhmin, T. L. Marzetta, and B. D. Rao, “Performance of cell-free massive MIMO systems with MMSE and LSFD receivers,” in Proc. Asilomar Conf. Signals Syst. Comput., Pacific Grove, CA, USA, Nov. 2016, pp. 203–207.
[3] E. Nayebi, A. Ashikhmin, T. L. Marzetta, H. Yang, and B. D. Rao, “Precoding and power optimization in cell-free massive MIMO systems,” IEEE Trans. Wireless Commun., vol. 16, no. 7, pp. 4445–4459, Jul. 2017.
[4] T. C. Mai, H. Q. Ngo, M. Egan, and T. Q. Duong, “Pilot power control for cell-free massive MIMO,” IEEE Trans. Veh. Technol., vol. 67, no. 11, pp. 11264–11268, Nov. 2018.
[5] J. Zhang, Y. Wei, E. Björnson, Y. Han, and S. Jin, “Performance analysis and power control of cell-free massive MIMO systems with hardware impairments,” IEEE Access, vol. 6, pp. 55302–55314, 2018.
[6] G. Interdonato, H. Q. Ngo, E. G. Larsson, and P. Frenger, “How much do downlink pilots improve cell-free massive MIMO?” in Proc. IEEE Glob. Commun. Conf. (GLOBECOM), Washington, DC, USA, 2016, pp. 1–7.
[7] H. Q. Ngo and E. G. Larsson, “No downlink pilots are needed in TDD massive MIMO,” IEEE Trans. Wireless Commun., vol. 16, no. 5, pp. 2921–2935, May 2017.
[8] J. Vieira, F. Rusek, and F. Tufvesson, “Reciprocity calibration methods for massive MIMO based on antenna coupling,” in Proc. IEEE Glob. Commun. Conf. (GLOBECOM), Austin, TX, USA, 2014, pp. 3708–3712.
[9] B. Błaszczyszyn, M. K. Karray, and H. P. Keeler, “Wireless networks appear Poissonian due to strong shadowing,” IEEE Trans. Wireless Commun., vol. 14, no. 8, pp. 4379–4390, Aug. 2015.
[10] G. George, R. K. Mungara, A. Lozano, and M. Haenggi, “Ergodic spectral efficiency in MIMO cellular networks,” IEEE Trans. Wireless Commun., vol. 16, no. 5, pp. 2835–2849, May 2017.
[11] O. Y. Bursalioglu, C. Wang, H. Papadopoulos, and G. Caire, “RRH based massive MIMO with ‘on the fly’ pilot contamination control,” in Proc. IEEE Int. Conf. Commun. (ICC), Kuala Lumpur, Malaysia, 2016, pp. 1–7.
Divergence-Optimal Fixed-to-Fixed Length Distribution Matching With Shell Mapping
Patrick Schulte, Student Member, IEEE, and Fabian Steiner, Student Member, IEEE

Manuscript received September 21, 2018; revised December 11, 2018; accepted December 12, 2018. Date of publication January 1, 2019; date of current version April 9, 2019. This work was supported by the German Federal Ministry of Education and Research in the framework of an Alexander von Humboldt Professorship. The associate editor coordinating the review of this paper and approving it for publication was R. C. de Lamare. (Corresponding author: Patrick Schulte.)
The authors are with the Institute for Communications Engineering, Technical University of Munich, 80333 Munich, Germany (e-mail: patrick.schulte@tum.de; fabian.steiner@tum.de).
Digital Object Identifier 10.1109/LWC.2018.2890595

Abstract—Distribution matching (DM) transforms independent and Bernoulli(1/2) distributed bits into a sequence of output symbols with a desired distribution. A fixed-to-fixed length, invertible DM architecture based on shell mapping (SM) is presented. It is shown that SM for DM (SMDM) is the optimum DM for the informational divergence metric and that finding energy optimal sequences is a special case of divergence minimization. Additionally, it is shown how to find the required SM weight function to approximate arbitrary output distributions. SMDM is combined with probabilistic amplitude shaping to operate close to the Shannon limit. SMDM exhibits excellent performance for short blocklengths as required by ultra-reliable low-latency applications. SMDM outperforms constant composition DM by 0.7 dB when used with 64-QAM at a spectral efficiency of 3 bits/channel use and a 5G low-density parity-check code with a short blocklength of 192 bits.

Index Terms—Probabilistic amplitude shaping, distribution matching, shell mapping, coded modulation.

I. INTRODUCTION
HIGHER-ORDER modulation is a key enabler for high spectral efficiencies and various approaches have been considered in the past to close the shaping gap of discrete constellations with uniformly distributed points (e.g., with quadrature amplitude modulation (QAM)) [1]–[3]. Recently, probabilistic amplitude shaping (PAS) [4] was proposed that is based on a reverse concatenation architecture [5], placing the shaping operation before the forward error correction (FEC) encoding. Apart from achieving most of the shaping gain, it allows flexible rate adaptation with a single constellation and FEC code rate. To convert uniformly distributed input bits to non-uniformly distributed output symbols, PAS requires a distribution matcher (DM). Schulte and Böcherer [6] introduced constant composition distribution matching (CCDM) which is asymptotically optimal, in the sense of a vanishing normalized informational divergence [7, p. 7], for long output blocklengths.
For practical communication systems and new requirements such as ultra-reliable low-latency communication (URLLC), shorter output blocklengths in the range of 10 to 500 symbols are desirable to minimize the processing latency and limit error propagation. Research is therefore now dedicated to finding improved DM architectures for short blocklengths, e.g., [8] and [9]. Good performance for short blocklengths is also needed to operate several DMs in parallel to further reduce processing latencies.
In this letter, we introduce shell-mapping distribution matching (SMDM), a fixed-to-fixed (f2f) length DM architecture for short output blocklengths based on shell mapping. Shell mapping was developed in the early 1990s [3], [10], [11] and was used in the V.34 modem standard to realize shaping gains with trellis coded modulation (TCM). We show that SMDM minimizes the informational divergence of f2f length DM if the self-information of the target distribution is used as the weight function for the shell mapping algorithm. Further, we show that the dyadic and Maxwell-Boltzmann (MB) distributions [1] lead to integer weight functions, which significantly simplify the implementation of shell mapping (SM). Finally, we explain how to integrate SMDM with PAS to operate close to the Shannon limit at small blocklengths. Numerical simulations with 64-QAM and low-density parity-check (LDPC) codes from the 5G enhanced mobile broadband (eMBB) standard [12] show a gain of 0.6 dB of SMDM over CCDM for a spectral efficiency (SE) of 3.0 bits/channel use (bpcu).

II. PRELIMINARIES
A. Notation
We denote random variables with uppercase letters, and their realizations with lowercase letters. Let $A$ be a discrete random variable with probability mass function (pmf) $P_A$ defined on the set $\mathcal{A}$. If an event $A = a$ occurs with positive probability, then its self-information is
\[ \iota(P_A(a)) = -\log_2(P_A(a)) \text{ bits}. \tag{1} \]
The entropy of a random variable $A$ is the expectation of the self-information of $A$, i.e., we have
\[ H(P_A) = \mathrm{E}[\iota(P_A(A))] = \sum_{a \in \operatorname{supp}(P_A)} -P_A(a) \log_2(P_A(a)), \tag{2} \]
where $\operatorname{supp}(P_A) \subseteq \mathcal{A}$ is the support of $P_A$, i.e., the subset of $a$ in $\mathcal{A}$ with positive probability. The informational divergence of two distributions $P_{\tilde{A}}$ and $P_A$ on $\mathcal{A}$ is
\[ D(P_{\tilde{A}} \| P_A) = \sum_{a \in \operatorname{supp}(P_{\tilde{A}})} P_{\tilde{A}}(a) \log_2 \frac{P_{\tilde{A}}(a)}{P_A(a)}. \tag{3} \]
The mutual information of two random variables $A$ and $B$ with joint pmf $P_{AB}$ is
\[ I(A; B) = D(P_{AB} \| P_A \times P_B), \tag{4} \]
SCHULTE AND STEINER: DIVERGENCE-OPTIMAL f2f LENGTH DM WITH SM 621
with
\[ (P_A \times P_B)(ab) = P_A(a) \cdot P_B(b). \tag{5} \]
We denote a length $n$ vector of random variables as $A^n = [A_1 A_2 \ldots A_n]$ with realization $a^n = [a_1 a_2 \ldots a_n]$. For random vectors with independent and identically distributed (iid) entries, we write
\[ P_A^n(a^n) = \prod_{i=1}^{n} P_A(a_i). \tag{6} \]

B. Fixed Length Distribution Matching
DMs have applications to capacity achieving communication [4] and stealth communication [13]. In both areas, the informational divergence between the output distribution of the DM and the target distribution plays a fundamental role. For energy efficient communication, suppose that $P_A$ is the capacity-achieving input distribution of a channel with discrete inputs and capacity $C$. Let $\tilde{Y}^n$ be the channel output for an input $\tilde{A}^n$. Then we have [14, eq. (23)]
\[ C - \frac{D(P_{\tilde{A}^n} \| P_A^n)}{n} \le \frac{I(\tilde{A}^n; \tilde{Y}^n)}{n} \le C. \tag{7} \]
Hence, a small divergence guarantees a mutual information close to capacity.
A one-to-one f2f DM is an invertible function $f$ that realizes a desired distribution $P_A$ on the output symbols. It maps $m$ uniformly distributed bits $B^m$ to length $n$ sequences $\tilde{A}^n = f(B^m) \in \mathcal{A}^n$, where $\mathcal{A}$ is the output alphabet. The output distribution is defined on a block of $n$ symbols and we denote it by $P_{\tilde{A}^n}$. We call the ratio of input to output lengths the matcher rate
\[ R = \frac{m}{n}. \tag{8} \]
In this letter, we consider one-to-one f2f distribution matchers. For vanishing informational divergence, we have (see [15])
\[ R = \frac{m}{n} \le H(P_A). \tag{9} \]
We refer to the image of a DM as the codebook $\mathcal{C}$ and its elements as codewords. As a uniformly distributed bit sequence of length $m$ indexes the codewords in the codebook, every codeword has probability $1/|\mathcal{C}| = 2^{-m}$. Consequently, for the informational divergence the explicit mapping from input to output is not important, only the codebook matters. We define the letter distribution $P_{\bar{A}}$ of a codebook as
\[ P_{\bar{A}}(a) = \frac{1}{|\mathcal{C}|} \sum_{\alpha^n \in \mathcal{C}} \frac{n_a(\alpha^n)}{n}, \tag{10} \]
where $n_a(\alpha^n) = |\{i : \alpha_i = a\}|$ is the number of times symbol $a$ appears in $\alpha^n$. The letter distribution corresponds to the probability of drawing a letter $a$ from the whole codebook.
As each codeword $a^n$ of a codebook $\mathcal{C}$ is chosen with equal probability, we can write the unnormalized informational divergence as
\[ D(P_{\tilde{A}^n} \| P_A^n) = -\log_2 |\mathcal{C}| + \frac{1}{|\mathcal{C}|} \sum_{a^n \in \mathcal{C}} \iota(P_A^n(a^n)) \tag{11} \]
\[ = -\log_2 |\mathcal{C}| + n H(P_{\bar{A}}) + n D(P_{\bar{A}} \| P_A). \tag{12} \]

C. Divergence-Optimal Codebooks
To operate close to channel capacity, (7) suggests minimizing $D(P_{\tilde{A}^n} \| P_A^n)$, where $P_A$ is the optimal input distribution. The optimal codebook with fixed cardinality $M$ is
\[ \hat{\mathcal{C}}_M = \operatorname*{argmin}_{\mathcal{C} \subseteq \mathcal{A}^n,\ |\mathcal{C}| = M} D(P_{\tilde{A}^n} \| P_A^n) = \operatorname*{argmin}_{\mathcal{C} \subseteq \mathcal{A}^n,\ |\mathcal{C}| = M} \sum_{a^n \in \mathcal{C}} \sum_{i=1}^{n} \iota(P_A(a_i)), \tag{13} \]
where equality in (13) holds because we shifted the objective by $\log_2 |\mathcal{C}|$ and scaled it by $|\mathcal{C}|$. Problem (13) is solved in [16] by selecting those $M$ codewords with the least self-information. In order to find the optimal codebook
\[ \hat{\mathcal{C}} = \operatorname*{argmin}_{\mathcal{C} \subseteq \mathcal{A}^n} D(P_{\tilde{A}^n} \| P_A^n) = \operatorname*{argmin}_{M} \Bigl( \operatorname*{argmin}_{\mathcal{C} \subseteq \mathcal{A}^n,\ |\mathcal{C}| = M} D(P_{\mathcal{C}} \| P_A^n) \Bigr) \tag{14} \]
we need to search through the solutions $\hat{\mathcal{C}}_M$ of (13) for different codebook sizes around $M \approx 2^{n H(P_A)}$ [16], which is not difficult. We show next how to efficiently encode and decode to $\hat{\mathcal{C}}_M$.

III. SHELL MAPPING
SM maps unsigned integers to shell sequences $a^n \in \mathcal{A}^n$, i.e.,
\[ \hat{f}_{\mathrm{SM}} : \{0, 1, \ldots, |\mathcal{A}|^n - 1\} \to \mathcal{A}^n. \tag{15} \]
We restrict the input to the integers $\{0, 1, \ldots, M - 1\}$ and refer to the image of the shell mapper as the codebook
\[ \mathcal{C}_{\mathrm{SM},M} = \hat{f}_{\mathrm{SM}}(\{0, 1, \ldots, M - 1\}). \tag{16} \]
We assign a non-negative weight $W(a)$ to each letter $a$ in the alphabet $\mathcal{A}$ using $W : \mathcal{A} \to \mathbb{N}_0$. SM orders the sequences $a^n \in \mathcal{A}^n$ according to the sequence weight $\sum_{i=1}^{n} W(a_i)$. This ordering is in general not unique because two sequences may have the same weight, e.g., if they are permutations of each other. An SM creates one of these ordered lists. Hence $\mathcal{C}_{\mathrm{SM},M}$ solves the problem
\[ \min_{\mathcal{C} \subseteq \mathcal{A}^n,\ |\mathcal{C}| = M} \sum_{a^n \in \mathcal{C}} \sum_{i=1}^{n} W(a_i), \tag{17} \]
i.e., SM finds the set $\hat{\mathcal{C}}$ of $M$ sequences $a^n$ of smallest weight $\sum_{i=1}^{n} W(a_i)$. There are many ways to implement SM, e.g., using the divide and conquer principle [17] or sequential encoding [3]. In the next section we show how to use (17) for solving (13).

IV. SHELL MAPPING AS DISTRIBUTION MATCHER
A. SMDM Interface
SM algorithms require as inputs the codebook cardinality $M$, the output length $n$, and the weight function $W$. We consider a binary input DM, so we choose $M = 2^m$ where $m$ is the input blocklength in bits. The input bits are interpreted as an unsigned integer in the range of $\{0, \ldots, 2^m - 1\}$. The SM output corresponds to the output of a DM
\[ f_{\mathrm{SM},m} : \{0, 1\}^m \to \mathcal{A}^n. \tag{18} \]
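The interface in (15)-(18) can be illustrated with a brute-force sketch that enumerates and sorts all sequences. This is only feasible for toy sizes and is not the divide-and-conquer or sequential algorithm of [17] or [3]. The alphabet, the integer weights, the sizes n and m, the MSB-first bit convention, and all function names below are assumptions chosen for illustration.

```python
import itertools

def build_ordered_list(alphabet, weight, n):
    """All sequences in A^n sorted by total weight (one valid SM ordering).

    sorted() is stable, so equal-weight sequences keep lexicographic order;
    the text notes that the order among equal-weight sequences is not unique.
    """
    return sorted(itertools.product(alphabet, repeat=n),
                  key=lambda seq: sum(weight[a] for a in seq))

def sm_encode(bits, ordered):
    """Read the bits as an unsigned integer (MSB first, an assumption) and index the list."""
    index = int("".join(str(b) for b in bits), 2)
    return ordered[index]

def sm_decode(seq, ordered, m):
    """Invert sm_encode: recover the m input bits from an output sequence."""
    index = ordered.index(tuple(seq))
    return [int(c) for c in format(index, "0{}b".format(m))]

# Hypothetical toy parameters: 4-ary alphabet, n = 3 output symbols, m = 4 input bits.
alphabet = (1, 3, 5, 7)
weight = {1: 0, 3: 1, 5: 2, 7: 3}
n, m = 3, 4
ordered = build_ordered_list(alphabet, weight, n)
codebook = ordered[: 2 ** m]          # image of the shell mapper, cf. (16)
```

Because the list is sorted by weight, the first 2^m entries form a set of smallest total weight, which is what (17) asks for; a practical shell mapper reaches the same codebook without storing the list.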
B. Divergence Optimal Weight Functions

Proposition 1: A minimum divergence f2f length DM with a target output probability $P_A$ is a shell mapper with weight function
\[ \hat{W}(a) = \iota(P_A(a)). \tag{19} \]
Proof: The SM algorithm solves problem (17). When we use the self-information as a weight function, we solve problem (13). With a search over the input length, we can find the best distribution matcher independent of the codebook size.

Fig. 1. CCDM and SMDM comparison based on the normalized informational divergence and different output blocklengths $n$. The target distribution is 4-ary MB and the rate is $R = 1.25$ input bits per output symbol. The calculation of the exact output distribution $P_{\tilde{A}^n}$ is limited by an input length of $m = 64$ bits on a standard 64-bit computer architecture.

Example 1: Dyadic distributions have the form
\[ P_A(a) = 2^{-\ell_a} \tag{20} \]
where $\ell_a$ is a positive integer for all $a$. The weight function (19) is
\[ W(a) = \ell_a, \quad a \in \operatorname{supp}(P_A). \tag{21} \]
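Proposition 1 and Example 1 can be checked numerically on a toy case. The sketch below assumes the dyadic form $P_A(a) = 2^{-\ell_a}$ with integer $\ell_a$, a hypothetical 4-ary target, $n = 2$ and a codebook of 4 sequences. It evaluates the divergence with (11), cross-checks it against the letter-distribution form (12), and compares the lowest-weight codebook against an exhaustive search over all codebooks of the same size.

```python
import itertools
import math

def divergence(codebook, target):
    """Unnormalized informational divergence of a uniformly indexed codebook, eq. (11)."""
    total_self_info = sum(-math.log2(target[a]) for seq in codebook for a in seq)
    return -math.log2(len(codebook)) + total_self_info / len(codebook)

def divergence_via_letters(codebook, target):
    """Same quantity through the letter distribution, eq. (10) and eq. (12)."""
    n = len(codebook[0])
    total = n * len(codebook)
    p_bar = {a: sum(seq.count(a) for seq in codebook) / total for a in target}
    entropy = -sum(p * math.log2(p) for p in p_bar.values() if p > 0)
    kl = sum(p * math.log2(p / target[a]) for a, p in p_bar.items() if p > 0)
    return -math.log2(len(codebook)) + n * entropy + n * kl

def lowest_weight_codebook(alphabet, weight, n, size):
    """Brute-force stand-in for a shell mapper: keep the `size` lightest sequences."""
    ordered = sorted(itertools.product(alphabet, repeat=n),
                     key=lambda seq: sum(weight[a] for a in seq))
    return ordered[:size]

# Hypothetical dyadic target; its self-information is integer valued.
target = {1: 1 / 2, 3: 1 / 4, 5: 1 / 8, 7: 1 / 8}
weight = {a: round(-math.log2(p)) for a, p in target.items()}
n, size = 2, 4

sm_book = lowest_weight_codebook(tuple(target), weight, n, size)
best_sm = divergence(sm_book, target)

# Exhaustive search over every codebook of the same size (1820 candidates here).
all_seqs = list(itertools.product(target, repeat=n))
best_any = min(divergence(c, target) for c in itertools.combinations(all_seqs, size))
```

For these toy values `best_sm` and `best_any` coincide, which is the statement of Proposition 1 for a fixed codebook size; the exhaustive search is only there as a check and does not scale.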
Remark: A non-negative, integer weight function is desirable for implementation. Weight functions constructed with (19) do not generally have integer $\hat{W}(a)$. In the following, we show which practically relevant distributions also yield non-negative integer valued weight functions with (19).
Proposition 2: Consider a finite support discrete distribution that can be expressed as
\[ P_A(a) = \frac{e^{-v \Omega(a)}}{\sum_{\xi \in \operatorname{supp}(P_A)} e^{-v \Omega(\xi)}}, \quad \forall a \in \operatorname{supp}(P_A) \tag{22} \]
with $v$ being positive and $\Omega$ any function
\[ \Omega : \operatorname{supp}(P_A) \to \mathbb{N}_0. \tag{23} \]
Then $\Omega$ is a non-negative integer weight function.
Proof: Inserting (22) into (19) we obtain
\[ \hat{W}(a) = v \Omega(a) \log_2(e) + \log_2 \Bigl( \sum_{\xi \in \operatorname{supp}(P_A)} e^{-v \Omega(\xi)} \Bigr). \]
Any translation and positive scaling can be applied to the objective function without changing the codebook. We obtain the integer weight function
\[ W(a) = \Omega(a), \quad a \in \operatorname{supp}(P_A). \tag{24} \]

Example 2: The half MB distribution is defined as
\[ P_A(a) = \frac{e^{-v a^2}}{\sum_{\xi \in \operatorname{supp}(P_A)} e^{-v \xi^2}} \tag{25} \]
with $\operatorname{supp}(P_A) = \{1, 3, 5, \ldots, 2\eta - 1\} = \mathcal{A}$ and positive $v \in \mathbb{R}^+$ and $\eta \in \mathbb{N}$. Comparing with (22) we identify the weight function
\[ W(a) = a^2, \quad a \in \operatorname{supp}(P_A) \tag{26} \]
which corresponds to the energy of a constellation point.
Corollary 1: We obtain sequences of least power by minimizing divergence to MB distributions.
This result has a special beauty. MB distributions are close to optimal for maximizing the single letter mutual information on discrete signal points for the additive white Gaussian noise (AWGN) channel [4, Table 1]. If we minimize the informational divergence of our f2f length DM to a memoryless source with MB distribution, we find that sequences of least energy accomplish this goal. Kschischang and Pasupathy [1] show that minimizing the average energy subject to an entropy constraint induces MB distributed symbols.
The weight function (26) is independent of the parameter $v$. Consequently, according to (13) a shell mapper with this weight function implements a minimum divergence DM with fixed codebook size $2^m$ for all half MB distributions, and rate adaptation does not require changing the weight function. A similar property can be observed for distribution families defined in (22) for a fixed function $\Omega$ and varying $v$.

C. Determining Letter Frequency $P_{\bar{A}}$

A soft-input soft-output decoder requires the letter distribution (10) in order to calculate the priors on the constellation symbols [4, Sec. VI-B]. The letter distribution depends on both the weight function and how sequences of equal weight are ordered. Fischer suggests in [18] an algorithm to calculate the letter distribution. The algorithm uses the partial histogram, i.e., the letter distribution for codebooks that consist of all sequences up to a certain weight. We denote the partial histogram for sequences $a^n$ up to weight $w$ (i.e., $W(a^n) \le w$) by $P_{\bar{A}}(\cdot, w)$, where the first parameter is the symbol that we want to evaluate. We suggest using $P_{\bar{A}}(\cdot, w_{\max})$ and $P_{\bar{A}}(\cdot, w_{\max} - 1)$ as approximations of the true frequencies, where $w_{\max}$ is the maximum weight of sequences that the respective SMDM can generate, i.e.,
\[ w_{\max} = W(f_{\mathrm{SM},m}([1, 1, \ldots, 1])). \tag{27} \]
Consider that the all 1 sequence is mapped to a sequence of highest weight. For long blocks we may use the target distribution as approximation at the receiver.

V. CCDM AND SMDM COMPARISON
A. Divergence
To compare CCDM and SMDM we consider the output alphabet $\mathcal{A} = \{1, 3, 5, 7\}$ and rate $R = 1.25$. The distribution
of the CCDM is an n-type approximation [4, Sec. IV] of a half MB distribution. The SMDM has rate 1.25 and uses the weight function defined in (26). The results are shown in Fig. 1. The approximations (dotted lines) use the partial histograms $P_{\bar{A}}(\cdot, w_{\max} - 1)$ and $P_{\bar{A}}(\cdot, w_{\max})$ as approximations for the letter distribution $P_{\bar{A}}$ in (12). Using SMDM and a target divergence of 0.1 bit we save approximately a factor of 5.5 in blocklength as compared to CCDM, and at a target divergence of 0.01 we save a factor of 4.1.

B. Rate Adaptation
Rate adaptation for SMDM is straightforward for distributions of the form (22). The number of bits that are interpreted as the index of the ordered list can be easily adapted, and therefore the rate can be easily adapted. The granularity of rate adaptation is $1/n$, where $n$ is the output length. This granularity is the best possible.

C. Coded Results
We compare the performance of SMDM and CCDM for PAS in a coded scenario. We target an SE of 3 bpcu with a 64-QAM constellation. We employ LDPC codes from the recent 5G eMBB standard [12] with blocklength 192 bits, i.e., 32 complex channel uses. The uniform reference curve uses a rate $R_c = 1/2$ code, whereas the shaped scenarios use a rate $R_c = 3/4$ code. Both DM approaches have a 4-ary output alphabet to generate the shaped amplitude sequences for the real and imaginary part. Note that a 64-QAM constellation can be constructed as the Cartesian product of two (bipolar) 8-amplitude shift keying (ASK) constellations, where the latter has four different amplitude values. For 32 complex channel uses with QAM symbols, we therefore need 64 amplitudes. The target distribution in both cases is the MB family. Both DMs operate with an output blocklength of $n = 64$ output symbols. The CCDM performance (blue curve) is similar to the uniform reference (green curve) in Fig. 2. The constant composition constraint of CCDM thus causes a significant rate loss [4, Sec. V-B] for small output blocklengths. In contrast, SMDM gains 0.7 dB in power efficiency at a frame error rate of $10^{-3}$.

Fig. 2. Finite length performance for uniform and shaped signaling using CCDM and SMDM. We target an SE of 3 bpcu with 5G LDPC codes of blocklength 192.

VI. CONCLUSION
We introduced an informational divergence optimal f2f length DM approach based on SM, which shows superior performance compared to state of the art DMs for short blocklengths. We showed that the self-information of the target output distribution can be used as the weight function for the SM algorithm to synthesize arbitrary output distributions. Furthermore, we showed that energy efficient signaling is a special case of divergence minimization. We gave examples for distributions that result in non-negative, integer valued SM weight functions favorable for practical implementations.

ACKNOWLEDGMENT
The authors would like to thank Gerhard Kramer and Georg Böcherer for fruitful discussions.

REFERENCES
[1] F. R. Kschischang and S. Pasupathy, “Optimal nonuniform signaling for Gaussian channels,” IEEE Trans. Inf. Theory, vol. 39, no. 3, pp. 913–929, May 1993.
[2] G. D. Forney, “Trellis shaping,” IEEE Trans. Inf. Theory, vol. 38, no. 2, pp. 281–300, Mar. 1992.
[3] R. Laroia, N. Farvardin, and S. A. Tretter, “On optimal shaping of multidimensional constellations,” IEEE Trans. Inf. Theory, vol. 40, no. 4, pp. 1044–1056, Jul. 1994.
[4] G. Böcherer, F. Steiner, and P. Schulte, “Bandwidth efficient and rate-matched low-density parity-check coded modulation,” IEEE Trans. Commun., vol. 63, no. 12, pp. 4651–4665, Dec. 2015.
[5] W. G. Bliss, “Circuitry for performing error correction calculations on baseband encoded data to eliminate error propagation,” IBM Tech. Disclosure Bull., vol. 23, pp. 4633–4634, Mar. 1981.
[6] P. Schulte and G. Böcherer, “Constant composition distribution matching,” IEEE Trans. Inf. Theory, vol. 62, no. 1, pp. 430–434, Jan. 2016.
[7] G. Kramer, “Topics in multi-user information theory,” Found. Trends Commun. Inf. Theory, vol. 4, nos. 4–5, pp. 265–444, 2008.
[8] T. Fehenberger, D. S. Millar, T. Koike-Akino, K. Kojima, and K. Parsons, “Partition-based distribution matching,” arXiv preprint, 2018. [Online]. Available: http://arxiv.org/abs/1801.08445, doi: 10.1109/TCOMM.2018.2881091.
[9] T. Yoshida, M. Karlsson, and E. Agrell, “Hierarchical distribution matching for probabilistically shaped coded modulation,” arXiv:1809.01653, Sep. 2018.
[10] F. R. Kschischang, “Shaping and coding gain criteria in signal constellation design,” Ph.D. dissertation, Dept. Elect. Eng., University of Toronto, Toronto, ON, Canada, Jun. 1992.
[11] A. K. Khandani and P. Kabal, “Shaping multidimensional signal spaces. I. Optimum shaping, shell mapping,” IEEE Trans. Inf. Theory, vol. 39, no. 6, pp. 1799–1808, Nov. 1993.
[12] T. Richardson and S. Kudekar, “Design of low-density parity check codes for 5G new radio,” IEEE Commun. Mag., vol. 56, no. 3, pp. 28–34, Mar. 2018.
[13] J. Hou and G. Kramer, “Effective secrecy: Reliability, confusion and stealth,” in Proc. IEEE Int. Symp. Inf. Theory (ISIT), Sep. 2014, pp. 601–605.
[14] G. Böcherer and R. Mathar, “Matching dyadic distributions to channels,” in Proc. Data Compression Conf., 2011, pp. 23–32.
[15] G. Böcherer and R. A. Amjad, “Informational divergence and entropy rate on rooted trees with probabilities,” in Proc. IEEE Int. Symp. Inf. Theory (ISIT), Sep. 2014, pp. 176–180.
[16] R. A. Amjad, “Algorithms for simulation of discrete memoryless sources,” Master’s thesis, Inst. Commun. Eng., Tech. Univ. Munich, Munich, Germany, 2013.
[17] R. F. H. Fischer, Precoding and Signal Shaping for Digital Transmission. New York, NY, USA: Wiley, 2002.
[18] R. F. H. Fischer, “Calculation of shell frequency distributions obtained with shell-mapping schemes,” IEEE Trans. Inf. Theory, vol. 45, no. 5, pp. 1631–1639, Jul. 1999.
SCR-Based Tone Reservation Schemes With Fast Convergence for PAPR Reduction in OFDM System
Jingqi Wang, Member, IEEE, Xin Lv, and Wen Wu, Senior Member, IEEE

Manuscript received September 30, 2018; revised November 5, 2018; accepted December 3, 2018. Date of publication January 1, 2019; date of current version April 9, 2019. This work was supported in part by the National Natural Science Foundation of China under Grant 61301020, in part by the Natural Science Foundation of Jiangsu Province under Grant BK20130772, and in part by the Priority Academic Program Development of Jiangsu Higher Education Institutions. The associate editor coordinating the review of this paper and approving it for publication was J. Mietzner. (Corresponding author: Jingqi Wang.)
The authors are with the Ministerial Key Laboratory of JGMT, Nanjing University of Science and Technology, Nanjing 210094, China (e-mail: wangjingqi@njust.edu.cn).
Digital Object Identifier 10.1109/LWC.2018.2890596

Abstract—The signal to clipping noise ratio (SCR)-based tone reservation scheme is a promising candidate for peak-to-average ratio (PAPR) reduction of orthogonal frequency division multiplexing signals, but it suffers from a low convergence rate caused by the nonoptimal scalar scaling factor. In this letter, we first derive a scaling SCR (S-SCR) scheme, in which the scaling factor is an optimized vector with peak regeneration constraints. To further improve the convergence rate and eliminate multiple peaks in each iteration, we then propose the multiple scaling SCR (MS-SCR) scheme with an augmented scaling factor in matrix form. Numerical results demonstrate the superiority of the proposed schemes compared with the conventional ones, including better PAPR reduction performance with fewer iterations and comparable bit error ratio performance. Furthermore, the computational complexity reduction ratios of the proposed S-SCR and MS-SCR schemes are increased by 38.52% and 27.36%, respectively.

Index Terms—OFDM, PAPR reduction, SCR, tone reservation, scaling factor.

I. INTRODUCTION
ORTHOGONAL frequency division multiplexing (OFDM) signals have been extensively adopted in wireless communication systems. However, OFDM signals often suffer from high peak-to-average ratio (PAPR) which leads to significant in-band distortion and out-of-band (OOB) radiation. Consequently, various PAPR reduction techniques have been proposed, such as clipping [1], coding [2], companding [3], selective mapping (SLM) [4], and partial transmit sequence (PTS) [5].
The tone reservation (TR) schemes [6]–[10] have gained great attention because of their low complexity and good bit error ratio (BER) performance. The TR scheme reserves a subset of subcarriers as peak reduction tones (PRTs), on which no information data are loaded and which are used for generating the peak canceling signal. To raise the convergence speed, improved TR algorithms were subsequently proposed, such as the adaptive-scaling (AS)-TR [8] and the least squares approximation algorithm (LSA)-TR [9]. However, these schemes require additional fast Fourier transform (FFT) and inverse fast Fourier transform (IFFT) operations in each iteration and thereby inevitably result in an increased computation complexity. On the other hand, a complexity reduced TR scheme using a gradient algorithm was proposed by Tellado [7]. However, this signal to clipping noise ratio (SCR) scheme suffers from a low convergence speed caused by a non-optimal scaling factor. Yu and Jin [10] presented a time-domain kernel matrix (TKM) TR scheme with simultaneous multi-peak reduction and rapid convergence speed. However, because of the lack of peak regeneration suppression, its performance deteriorates sharply when applied to low clipping thresholds.
In this letter, we first propose a novel SCR-based PAPR reduction scheme, called the scaling SCR (S-SCR) scheme, in which the optimal scaling factor vector is calculated by the LSA algorithm with peak regeneration constraints. Further, an improved Multiple Scaling SCR (MS-SCR) method with an augmented scaling factor in matrix form is proposed to suppress peak regeneration and eliminate multiple peaks in one iteration. Simulated results show that the two schemes introduce great improvements on computational complexity, convergence speed, and peak regeneration suppression with comparable communication performance. This letter is organized as follows. Section II briefly introduces the OFDM system and the conventional SCR scheme. Section III presents the two proposed schemes. Simulated results are provided in Section IV and the conclusion is given in Section V.

II. PRELIMINARY BACKGROUNDS
A. OFDM System and PAPR
In an OFDM system, $N$ orthogonal subcarriers are used to transmit modulated data symbols $X = [X_0, X_1, \ldots, X_{N-1}]^T$. An OFDM signal sequence in the time domain can be generated by an IFFT and is written as
\[ x(n) = \frac{1}{\sqrt{N}} \sum_{k=0}^{N-1} X_k e^{j \frac{2 \pi n k}{LN}}, \quad n = 0, 1, \ldots, LN - 1, \tag{1} \]
where $L$ is the oversampling factor, which usually satisfies $L \ge 4$.
In the time domain, the discrete signal $x(n)$ is actually a sum of $N$ orthogonal subcarriers. Therefore, the OFDM signal occasionally exhibits envelope fluctuations which can be quantified by the PAPR. The PAPR is defined as the ratio of maximum signal power to average signal power.
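As a small illustration of (1) and the PAPR definition, the following sketch evaluates the oversampled sum directly rather than through an FFT, which keeps it dependency-free but limits it to short symbols. The function names and the two test inputs are assumptions for illustration only.

```python
import cmath
import math

def ofdm_time_signal(symbols, oversampling=4):
    """Evaluate eq. (1) term by term: LN time samples from N frequency-domain symbols."""
    n_sub = len(symbols)
    length = oversampling * n_sub
    scale = 1 / math.sqrt(n_sub)
    return [
        scale * sum(x_k * cmath.exp(2j * math.pi * n * k / length)
                    for k, x_k in enumerate(symbols))
        for n in range(length)
    ]

def papr_db(signal):
    """Ratio of maximum to average sample power, expressed in dB."""
    powers = [abs(s) ** 2 for s in signal]
    return 10 * math.log10(max(powers) / (sum(powers) / len(powers)))

# Worst case for N = 8: identical symbols add coherently at n = 0.
papr_all_ones = papr_db(ofdm_time_signal([1] * 8))
# A single active subcarrier has a constant envelope.
papr_single_tone = papr_db(ofdm_time_signal([1, 0, 0, 0]))
```

With identical symbols on all N subcarriers the peak power is N times the average power, so the first value equals 10 log10(8) dB, while the single-tone case gives 0 dB.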
WANG et al.: SCR-BASED TR SCHEMES WITH FAST CONVERGENCE FOR PAPR REDUCTION IN OFDM SYSTEM 625
B. SCR Scheme

In the SCR scheme, the peak reduced signal $\tilde{x}(n)$ is iteratively updated by using a simple gradient algorithm given by
\[ \tilde{x}^{m+1}(n) = \tilde{x}^m(n) - \mu \cdot c^m_{\max}(n), \tag{2a} \]
\[ c^m_{\max}(n) = \bigl( \tilde{x}^m(n^m_{\max}) - A e^{j \arg\{\tilde{x}^m(n^m_{\max})\}} \bigr) \times p[((n - n^m_{\max}))_{LN}], \tag{2b} \]
where $\mu \cdot c^m_{\max}(n)$ is the peak reduction signal, $\mu$ is a constant scaling factor, $A = \mathrm{CR} \cdot \sqrt{\frac{1}{LN} \sum_{n=0}^{LN-1} |\tilde{x}^m(n)|^2}$ is the required clipping threshold, and CR is the clipping ratio. $p = [p_0, p_1, \ldots, p_{LN-1}]^T$ represents the time domain kernel, and $p[((n - n^m_{\max}))_{LN}]$ denotes the right circularly shifted sequence of $p$ to $n^m_{\max}$, which is the position of the peak amplitude of $\tilde{x}^m(n)$ in the $m$-th iteration.
The time domain kernel is obtained by
\[ p = QP, \tag{3} \]
where $Q$ is the IFFT matrix and $P$ is the frequency domain kernel, in which the elements on the PRTs are set to 1, while the others are set to 0.
By avoiding FFT/IFFT operations in the iterations, the SCR has lower computational complexity than the traditional TR scheme. However, because of the lack of a predetermined appropriate scaling factor and of peak regeneration suppression, the convergence rate of SCR is quite slow.

III. PROPOSED SCHEMES

A. The S-SCR Scheme
In this section, to improve the convergence rate, we present an S-SCR scheme which employs the LSA algorithm with an additional peak regeneration restraint to obtain the optimized scaling factor.
First, given the clipping threshold $A$, the directly clipped signal can be expressed as
\[ \hat{x}^m(n) = \begin{cases} \tilde{x}^m(n), & |\tilde{x}^m(n)| \le A \\ A e^{j \arg\{\tilde{x}^m(n)\}}, & |\tilde{x}^m(n)| > A. \end{cases} \tag{4} \]
Then the clipping noise is defined as
\[ f^m(n) = \tilde{x}^m(n) - \hat{x}^m(n). \tag{5} \]
Then, we express the positions of the peaks which satisfy $|f^m(n)| > 0$ with $U$ entries as $S = \{s_0, s_1, \ldots, s_{U-1}\}$. To achieve better PAPR performance, the amplitude of the peak reduction signal $\mu \cdot c^m_{\max}$ in (2a) is utilized to approximate the clipping noise $f^m(n)$, which can be formulated as
\[ \min \sum_{n \in S} \bigl\| \mu \cdot c^m_{\max}(n) - f^m(n) \bigr\|_2^2. \tag{6} \]
By using the LSA algorithm, the optimal solution of $\mu$ can be calculated as
\[ \mu = \frac{\sum_{n \in S} |c^m_{\max}(n)| |f^m(n)|}{\sum_{n \in S} |c^m_{\max}(n)|^2}. \tag{7} \]
Furthermore, for the purpose of avoiding an undesirable peak regeneration, we design a $1 \times LN$ scaling vector $\psi$ that restricts the scaling adjustment to the peak value only in one iteration. Thus, we get
\[ \psi(i) = \begin{cases} \mu, & i \ne n_{\max} \\ 1, & i = n_{\max}. \end{cases} \tag{8} \]
Accordingly, we can rewrite the peak reduced signal $\tilde{x}(n)$ with the optimal solution of the scaling factor as
\[ \tilde{x}^{m+1}(n) = \tilde{x}^m(n) - \psi \cdot \bigl( \tilde{x}^m(n^m_{\max}) - A e^{j \arg\{\tilde{x}^m(n^m_{\max})\}} \bigr) \times p[((n - n^m_{\max}))_{LN}]. \tag{9} \]

Fig. 1. Example of amplitude of original OFDM signal (top) and the signal of MS-SCR scheme with one iteration (bottom).

Algorithm 1 Steps of the MS-SCR scheme
1. Initialization conditions: the number of subcarriers, the number of the reserved tones, the clipping ratio CR, the maximum number of iterations $K$ and the time-domain kernel $p$ using (3).
2. Convert the input bit stream into frequency-domain symbols, and put them on the data subcarriers. Then, generate the time domain signal $x$ by an IFFT.
3. Calculate the clipping threshold $A$; if $|\tilde{x}^m(n)| \le A$, go to step 7, otherwise, go to step 4.
4. Get the clipping noise $f$ and the set $S$, and then construct the peak canceling vector $c^m_u$ using (10).
5. Calculate the scaling factor vector $\Gamma_u$ by (12) and (13).
6. Update the peak reduced signal $x$ using (14).
7. If $|\tilde{x}^m(n)| \le A$ or the iteration number is larger than $K$, transmit $x$ and terminate the procedure. Otherwise, go to step 3.

B. The MS-SCR Scheme
The proposed S-SCR scheme can reduce peak regeneration and improve the convergence rate. However, it is still limited to eliminating only a single peak in one iteration and therefore is not efficient. In this part, we propose an MS-SCR scheme to eliminate multiple peaks simultaneously in one iteration and ensure fast convergence and better PAPR performance at the same time.
First of all, we define the signals which exceed the threshold $A$ as $C = \{c^m_0, c^m_1, \ldots, c^m_{U-1}\}$, where each element of the set $C$ is a $1 \times LN$ vector which can be formulated as
\[ c^m_u(n) = \bigl( \tilde{x}^m(s_u) - A e^{j \arg\{\tilde{x}^m(s_u)\}} \bigr) \cdot p[((n - s_u))_{LN}], \quad u = 0, 1, \ldots, U - 1, \tag{10} \]
where $s_u$ is the element of set $S$ and $p[((n - s_u))_{LN}]$ denotes the circularly shifted sequence of $p$ to $s_u$.
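The single-peak update built from (2b), (4), (5), (7), (8) and (9) can be sketched in plain Python as follows. Several choices are assumptions made for illustration: the kernel is normalized so that p[0] = 1, the scaling vector in (8) is read as 1 at the peak position and μ elsewhere, the clipping ratio is passed as a linear factor converted from 2 dB, and the 8-subcarrier test symbol with two reserved tones is hypothetical. The direct sum stands in for an IFFT only to keep the sketch dependency-free.

```python
import cmath
import math

def idft_oversampled(symbols, oversampling):
    """Direct evaluation of eq. (1), used here in place of an IFFT."""
    n_sub = len(symbols)
    length = oversampling * n_sub
    return [sum(x_k * cmath.exp(2j * math.pi * n * k / length)
                for k, x_k in enumerate(symbols)) / math.sqrt(n_sub)
            for n in range(length)]

def time_domain_kernel(prt_indices, n_sub, oversampling):
    """p = QP with ones on the reserved tones, scaled so that p[0] = 1 (assumption)."""
    mask = [1 if k in prt_indices else 0 for k in range(n_sub)]
    raw = idft_oversampled(mask, oversampling)
    return [v / raw[0] for v in raw]

def clipping_threshold(signal, clipping_ratio):
    """A = CR * rms amplitude, with CR given as a linear factor."""
    mean_power = sum(abs(s) ** 2 for s in signal) / len(signal)
    return clipping_ratio * math.sqrt(mean_power)

def s_scr_step(signal, kernel, threshold):
    """One S-SCR update; returns the new signal and the treated peak position."""
    length = len(signal)
    n_max = max(range(length), key=lambda n: abs(signal[n]))
    peak = signal[n_max]
    if abs(peak) <= threshold:
        return list(signal), n_max            # nothing exceeds the threshold
    excess = peak - threshold * cmath.exp(1j * cmath.phase(peak))
    # Peak canceling signal (2b): excess times the circularly shifted kernel.
    c_max = [excess * kernel[(n - n_max) % length] for n in range(length)]
    # Clipping noise (4)-(5): nonzero only where the threshold is exceeded.
    noise = [s - threshold * cmath.exp(1j * cmath.phase(s)) if abs(s) > threshold else 0
             for s in signal]
    peaks = [n for n in range(length) if abs(noise[n]) > 0]
    # Least-squares scaling factor (7).
    mu = (sum(abs(c_max[n]) * abs(noise[n]) for n in peaks)
          / sum(abs(c_max[n]) ** 2 for n in peaks))
    # Scaling vector (8), read as 1 at the peak and mu elsewhere (assumption).
    psi = [1.0 if n == n_max else mu for n in range(length)]
    return [signal[n] - psi[n] * c_max[n] for n in range(length)], n_max

# Hypothetical example: N = 8, L = 4, data on tones 0..5, tones 6 and 7 reserved.
symbols = [1, 1, 1, 1, 1, 1, 0, 0]
signal = idft_oversampled(symbols, 4)
kernel = time_domain_kernel({6, 7}, 8, 4)
threshold = clipping_threshold(signal, 10 ** (2 / 20))   # CR = 2 dB
updated, n_max = s_scr_step(signal, kernel, threshold)
```

Under the p[0] = 1 normalization, the sample at the treated peak lands exactly on the threshold A after one step, while all other samples are adjusted by the scaled kernel.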
Then the optimization problem can be modified as
\[ \min \sum_{n \in S} \Bigl\| \sum_{u=0}^{U-1} \gamma_u \cdot c^m_u(n) - f^m(n) \Bigr\|_2^2. \tag{11} \]
Accordingly, by applying the LSA algorithm, the approximate optimization value of the scaling factor $\gamma_u$ is obtained as
\[ \gamma_u = \frac{\sum_{n \in S} |c^m_u(n)| |f^m(n)|}{\sum_{n \in S} |c^m_u(n)|^2}. \tag{12} \]
The peak reduction signal in (10) is generated by mixing kernel signals with many non-ignorable values of amplitudes near the maximum peak, so the peak regrowth is serious. To avoid the undesirable peak regeneration, similar to the scaling vector of the S-SCR scheme, the LN-dimensional scaling factor matrix $\Gamma_u$ is designed as
\[ \Gamma_u(i) = \begin{cases} \gamma_u, & i \ne s_u \\ 1, & i = s_u. \end{cases} \tag{13} \]
Thus, the iteratively updated peak reduced signal $\tilde{x}(n)$ can be denoted as
\[ \tilde{x}^{m+1}(n) = \tilde{x}^m(n) - \sum_{u=0}^{U-1} \Gamma_u \cdot \bigl( \tilde{x}^m(s_u) - A e^{j \arg\{\tilde{x}^m(s_u)\}} \bigr) \times p[((n - s_u))_{LN}]. \tag{14} \]
Fig. 1 depicts the peak reduced signal of the MS-SCR scheme in one iteration compared with the original OFDM signal when CR is set to be 2 dB. The dashed red line is the clipping threshold $A$. It can be observed that the amplitudes of the clipped signals are almost all below the predefined threshold $A$ after only one iteration. Obviously, the proposed MS-SCR can efficiently eliminate numerous peaks.
The detailed steps of the MS-SCR scheme can be summarized as Algorithm 1.

Fig. 2. PAPR versus iteration number when CCDF = $10^{-3}$ (top) and the number of peaks above the threshold versus iteration number (bottom) for three schemes.

TABLE I
OVERALL COMPUTATIONAL COMPLEXITY OF THREE SCHEMES AND CCRRS

A. Complexity Analysis
To evaluate computational complexity, the iteration numbers of the proposed schemes that achieve the same PAPR reduction performance need to be discussed. Fig. 2 shows that the PAPR and the number of peaks above the threshold decrease with increasing iteration number for the SCR, S-SCR and MS-SCR schemes when CR = 2 dB. As shown in Fig. 2, for the particular OFDM symbol considered, the PAPR reaches 8.8 dB after 18 and 11 iterations in SCR and S-SCR, respectively. Meanwhile, by using the MS-SCR scheme, the PAPR can be reduced to 5 dB in only one iteration.
To represent the complexity reduction accurately, we use the computational complexity reduction ratio (CCRR),
\[ \mathrm{CCRR} = \Bigl( 1 - \frac{\text{complexity of proposed method}}{\text{complexity of previous method}} \Bigr) \times 100\%. \tag{15} \]
Table I lists the overall computation complexity in terms of real multiplications for the three schemes. In brief, the CCRRs of the proposed S-SCR and MS-SCR schemes are increased by 38.52% and 27.36% compared with the original SCR scheme, respectively.

B. PAPR Reduction Performance
It can be observed from the above simulations that the SCR and S-SCR converge after 18 iterations, and the MS-SCR with one iteration can offer better PAPR performance. Moreover, TKM converges after 3 iterations in [10]. Therefore, in the following simulations, the number of iterations for SCR, S-SCR, TKM, and MS-SCR is set to be 18, 18, 3 and 1, respectively. As shown in Fig. 3, for the same CR, the PAPR reduction performance of the proposed S-SCR and MS-SCR schemes is obviously better than that of the traditional SCR scheme. At the same time, compared with TKM after three iterations, MS-SCR obtains a 2.7 dB reduction gain with only one iteration. Furthermore, it is evident from this figure that the PAPR of OFDM with the MS-SCR technique is reduced to 6.1, 5.1, and 4.6 dB for CR = 3, 2, and 1 dB, respectively. It is obvious that
IV. C OMPLEXITY A NALYSIS AND S IMULATION the multi-peak elimination ability of MS-SCR enhances as the
In this section, a 16-QAM modulated OFDM signal with the clipping ratio reduces.
number of subcarriers N = 256 and the number of reserved Fig. 4 further compares the performance of the four schemes
tones T = 32 is applied to evaluate the proposed schemes. with different relative number of reserved tones T/N for a sub-
Moreover, an oversampling rate L = 4 is used and the PAPR carrier number N = 256 and CR = 2 dB. It is shown
performance is represented by the complementary cumulative that when SCR, TKM, and S-SCR all reach the conver-
distribution function (CCDF) of the PAPR. gence state, the S-SCR scheme achieves better PAPR reduction
WANG et al.: SCR-BASED TR SCHEMES WITH FAST CONVERGENCE FOR PAPR REDUCTION IN OFDM SYSTEM 627
performance. Moreover, MS-SCR is obviously superior to the other three schemes with only one iteration.

Fig. 3. PAPR reduction performance of the SCR, TKM, S-SCR and MS-SCR schemes with CR = 1 dB, 2 dB, and 3 dB.

Fig. 4. PAPR reduction performance of the SCR, S-SCR, TKM and MS-SCR schemes with relative number of reserved tones T/N = 1/16, 1/8, and 1/4.

C. BER Performance
In order to evaluate the overall BER performance for SCR, TKM, S-SCR and MS-SCR schemes, we consider passing the clipped OFDM signals through a solid-state power amplifier (SSPA) model. Fig. 5 compares the BER performance of the four schemes with 16-QAM mapping when the parameters of OFDM are N = 256 and T = 32. To evaluate the BER degradation with the same PAPR reduction performance, the iteration number in SCR, S-SCR, TKM and MS-SCR is set to 18, 11, 3, and 1, respectively, while CR is defined as 2, 2, 2.4, and 4 dB, respectively. For SCR, TKM, S-SCR and MS-SCR schemes, Eb/No is approximately 11, 11.2, 11.4, and 12.1 dB, respectively, when a BER of $10^{-4}$ is desired. To evaluate the BER performance of the MS-SCR scheme in a practical application with low clipping threshold, we further add a simulation parameter set for MS-SCR, in which CR = 2 dB. In this case, the PAPR is reduced from 13.6 dB to 4.8 dB and the Eb/No is about 13 dB at a BER of $10^{-4}$.

Fig. 5. BER performance comparison of SCR, S-SCR, TKM and MS-SCR with 16-QAM mapping over AWGN channel through an SSPA model, with parameter p = 2 and the input back-off (IBO) set to be 5 dB.

V. CONCLUSION
In this letter, we first proposed an improved SCR-based PAPR reduction scheme, which utilizes a least square algorithm to calculate the optimal scaling factor for clipping noise. This proposed S-SCR method greatly improves the convergence rate and reduces the computational complexity. Furthermore, we expanded the scaling factor vector into matrix form and reshaped it for optimal peak regeneration suppression. The derived MS-SCR scheme can eliminate multiple peaks in one iteration. Simulation results show that the proposed MS-SCR scheme achieves powerful multi-peak elimination with rapid convergence rate and comparable BER performance. It is attractive for practical implementations with low PAPR requirement. Moreover, the S-SCR scheme shows low computational complexity and great PAPR reduction performance, given a relatively high threshold, and therefore is a suitable candidate for highly cost-efficient applications.

REFERENCES
[1] H. Ochiai and H. Imai, “Performance analysis of deliberately clipped OFDM signals,” IEEE Trans. Commun., vol. 50, no. 1, pp. 89–101, Jan. 2002.
[2] S. H. Han and J. H. Lee, “An overview of peak-to-average power ratio reduction techniques for multicarrier transmission,” IEEE Wireless
[3] X. Huang, J. H. Lu, J. L. Zheng, K. B. Letaief, and J. Gu, “Companding transform for reduction in peak-to-average power ratio of OFDM signals,” IEEE Trans. Wireless Commun., vol. 3, no. 6, pp. 2030–2039, Nov. 2004.
[4] S.-J. Heo, H.-S. Noh, J.-S. No, and D.-J. Shin, “A modified SLM scheme with low complexity for PAPR reduction of OFDM systems,” IEEE Trans. Broadcast., vol. 53, no. 4, pp. 804–808, Dec. 2007.
[5] R. J. Baxley and G. T. Zhou, “Comparing selected mapping and partial transmit sequence for PAR reduction,” IEEE Trans. Broadcast., vol. 53, no. 4, pp. 797–803, Dec. 2007.
[6] C. Ni, Y. Ma, and T. Jiang, “A novel adaptive tone reservation scheme for PAPR reduction in large-scale multi-user MIMO-OFDM systems,” IEEE Wireless Commun. Lett., vol. 5, no. 5, pp. 480–483, Oct. 2016.
[7] J. Tellado, “Peak to average power reduction for multicarrier modulation,” Ph.D. dissertation, Dept. Elect. Eng., Stanford Univ., Stanford, CA, USA, 2000.
[8] L. Wang and C. Tellambura, “Analysis of clipping noise and tone-reservation algorithms for peak reduction in OFDM systems,” IEEE Trans. Veh. Technol., vol. 57, no. 3, pp. 1675–1694, May 2008.
[9] H. Li, T. Jiang, and Y. Zhou, “An improved tone reservation scheme with fast convergence for PAPR reduction in OFDM systems,” IEEE Trans. Broadcast., vol. 57, no. 4, pp. 902–906, Dec. 2011.
[10] P. Yu and S. Jin, “A low complexity tone reservation scheme based on time-domain kernel matrix for PAPR reduction in OFDM systems,” IEEE Trans. Broadcast., vol. 61, no. 4, pp. 710–716, Dec. 2015.
Average Age of Information in Wireless Powered Sensor Networks

Ioannis Krikidis, Fellow, IEEE

Abstract—In this letter, we deal with the age of information (AoI) for a sensor network with wireless power transfer (WPT) capabilities. Specifically, we study a simple network topology, where a sensor node harvests energy from radio frequency signals (transmitted by a dedicated energy source) to transmit real-time status updates. The sensor node generates an update when its capacitor/battery becomes fully charged and transmits by using all the available energy without further energy management. The average AoI performance of the considered greedy policy is derived in closed form and is a function of the capacitor’s size. The optimal value of the capacitor that maximizes the freshness of the information corresponds to a simple optimization problem requiring a 1-D search. The derived theoretical results provide useful performance bounds for practical WPT networks.

Index Terms—Age of information, wireless power transfer, energy harvesting, sensor networks.

Manuscript received November 8, 2018; revised December 14, 2018; accepted December 15, 2018. Date of publication January 9, 2019; date of current version April 9, 2019. This work was supported in part by the European Regional Development Fund and in part by the Republic of Cyprus through the Research Promotion Foundation under Project INFRASTRUCTURES/1216/0017. The associate editor coordinating the review of this paper and approving it for publication was K. W. Choi.
The author is with the Department of Electrical and Computer Engineering, Faculty of Engineering, University of Cyprus, 1678 Nicosia, Cyprus (e-mail: krikidis@ucy.ac.cy).
Digital Object Identifier 10.1109/LWC.2018.2890605

I. INTRODUCTION
Wireless power transfer (WPT) via dedicated radio-frequency (RF) radiation is a promising technology for wireless communication systems, which are characterized by a massive number of low-power devices such as in the Internet-of-Things systems. It can support mobility, energy multicasting, non-line-of-sight propagation environments and contributes to the development of smaller, lighter and more compact devices. From the pioneering work of Varshney [1], who has introduced this concept, WPT has been extensively studied in the literature for different network architectures, e.g., [2] and [3]. However, most of the current works focus on complex network structures with limited practical interest and/or use conventional performance metrics, e.g., throughput, coverage probability, diversity gain, information-energy capacity etc., which do not capture timeliness requirements that arise from sensing and actuation applications within machine-type communications.
A performance metric that captures the freshness of the received information and is appropriate for applications requiring timely information to accomplish specific tasks (e.g., sensor networks, cyberphysical systems, etc.), was proposed in [4], i.e., age of information (AoI). It is defined as the time elapsed since the generation of the freshest status update that has reached the destination. Initial works on AoI take into account traffic burstiness and minimize the AoI from a queueing-theoretic standpoint under various service policies, e.g., [5] and [6]. Recent works employ the notion of AoI in energy harvesting communication systems (from natural renewable sources), and investigate transmission policies that minimize AoI-based performance metrics [7]–[10]. On the other hand, the design of WPT-based communication systems with the objective to optimize AoI is a new research area with potential applications. Dong et al. [11] propose a two-way data exchanging system, where a master node transfers energy and information to a slave node, while the slave node uses the energy harvested to power the uplink channel; the average uplink AoI is derived in closed form. Although AoI seems to be a natural design metric for WPT networks, other relevant works cannot be found in the literature.
In this letter, we study a basic communication link where a sensor node with WPT capabilities communicates with a single destination. Specifically, the sensor node is equipped with a capacitor, which is charged via RF radiation by a dedicated energy source. Once the capacitor is charged, the sensor node transmits status updates containing the most recent information about parameters of interest by using all the energy stored. This online transmission policy does not require complicated energy management decisions (e.g., energy-dependent thresholds) and is appropriate for WPT low complexity/low power devices. We investigate the freshness of the received information and we provide simple closed form expressions for the average AoI, which depend on the size of the capacitor. The design of the system introduces an interesting tradeoff: a small capacitor is charged quickly and thus new updates are sent more frequently to minimize the AoI; on the other hand, a larger capacitor increases the transmit power and boosts the successful decoding. The optimal value of the capacitor is computed by solving a one-dimensional optimization problem. It is worth noting that the network topology and the transmission policy considered are inspired by commercial battery-free WPT products, e.g., Powercast [12]; these devices are equipped with supercapacitors that deliver high power bursts when charged. Although our analysis refers to a simplistic system model, the derived theoretical results can serve as guidelines (performance bounds) for practical implementations.

II. SYSTEM MODEL
We assume a simple WPT sensor network consisting of one energy transmitter (ET), one sensor node, S, and one information receiver (IR); all the nodes are equipped with single antennas. The ET is connected to the power grid and continuously broadcasts an energy signal with power P. The sensor node has WPT capabilities and harvests energy from the received RF signal; the harvested energy is stored in a capacitor of finite-size B. When the capacitor becomes fully
KRIKIDIS: AVERAGE AoI IN WIRELESS POWERED SENSOR NETWORKS 629
charged, the sensor node generates a status update and transmits it towards the IR by using all the stored energy (greedy online policy [12]). Energy transmission and communication links are performed in orthogonal channels (e.g., different frequency bands) to avoid interference; in addition, time is considered to be slotted with a slot size equal to one time unit (due to the normalized slot duration, the measures of energy and power become identical and therefore are used interchangeably throughout this letter). Fig. 1 schematically presents the system model.

Fig. 1. A three node sensor network topology; ET broadcasts energy, S communicates with IR by discharging its capacitor of size B.

The sensor node is able to harvest energy from the ET during the status transmission. This is feasible due to the orthogonality between the communication/harvesting links and the existence of an appropriate capacitor architecture that supports simultaneous transmission/harvesting (i.e., two antennas operating in different frequency bands for communication/harvesting; a secondary storage capacitor/device stores up harvested energy while the transmitter is active [2]).
All wireless links experience Rayleigh block fading (channel is constant for one time slot and changes independently across time slots). Let $h_k, g_k \sim \exp(\lambda)$ denote the power of the channel fading for the link ET-S and S-IR at the k-th time slot, respectively. In addition, all wireless links exhibit additive white Gaussian noise (AWGN) with variance $\sigma^2$. The energy stored (i.e., the amount of available energy in the capacitor) at time slot k, denoted as $E_k$, will evolve as follows¹
$$E_k = \min\{\mathbb{1}_{E_{k-1}<B}\, E_{k-1} + \eta P h_k,\; B\}, \tag{1}$$
where $0 \le \eta \le 1$ denotes the RF-to-DC conversion efficiency (harvesting from the AWGN is considered negligible) and $\mathbb{1}_X$ is the indicator function of X, with $\mathbb{1}_X = 1$ if X is true and $\mathbb{1}_X = 0$ otherwise. If the capacitor becomes fully charged at time slot k, i.e., $E_k = B$, the sensor node transmits a status update to the IR, containing information for the considered parameters of interest as well as the time of generation of the update, with a spectral efficiency r bits per channel use (BPCU) in the (k+1)-th time slot (a packet transmission is performed in one time slot). The signal-to-noise ratio at the IR for the k-th time slot is written as
$$\gamma_k = \frac{B g_k}{\sigma^2}. \tag{2}$$
Age of information: In time slot n, AoI is the difference between n and the generation time U(n) measured in time slots of the latest received update at the IR [4], i.e.,
$$\Delta(n) = n - U(n). \tag{3}$$

¹ A linear WPT model is sufficient for the purposes of this letter [3], [11] and provides useful lower bounds for the harvested energy achieved by non-linear models.

Fig. 2 presents an example of the age evolution for the sensor network considered. An update is generated at the sensor node when the capacitor becomes fully charged and transmitted in the next slot (one time slot of delay); in case of a successful decoding, i.e., $\log_2(1 + \gamma_k) \ge r$, the AoI at the IR is reset to one. If $n_k$, $n_{k+1}$ represent the time slots of two consecutive updates at the IR, $X_k = n_{k+1} - n_k$ denotes the k-th interarrival time (time between $n_{k+1}$ and $n_k$ in time slots). In addition, $T_k$ denotes the time (in time slots) between two consecutive capacitor recharges; we have $X_k = \sum_{i=1}^{M} T_i$, where M is a discrete random variable that denotes the number of the update transmissions until successful decoding.

Fig. 2. Example of AoI; $X_k$ denotes the interarrival time between two consecutive received updates, $T_k$ is the time between two consecutive capacitor’s recharges, $Q_k$ is the area under $\Delta(n)$ corresponding to the k-th received update.

III. ANALYSIS OF THE AVERAGE AGE OF INFORMATION
In this section, we analyze the performance of the sensor network considered in terms of the average AoI. Firstly, we state two propositions which are used to derive the AoI performance.
Proposition 1: The first-order and second-order moments of the time between two consecutive capacitor recharges, respectively, are given by
$$\mathbb{E}(T) = 1 + \beta, \tag{4}$$
$$\mathbb{E}(T^2) = 1 + 3\beta + \beta^2, \tag{5}$$
where $\beta = \lambda B/(\eta P)$.
Proof: See Appendix B.
Based on Proposition 1, we have the following proposition for the first-order and the second-order moments for the time between two consecutive successful delivered status updates.
Proposition 2: The first-order and the second-order moments for the interarrival time between two consecutive updates at the IR, respectively, are given by
$$\mathbb{E}(X) = \frac{1+\beta}{\pi}, \tag{6}$$
$$\mathbb{E}(X^2) = \frac{1 + 3\beta + \beta^2}{\pi} + \frac{2(1+\beta)^2(1-\pi)}{\pi^2}, \tag{7}$$
where $\pi = \mathbb{P}\{\log_2(1+\gamma_k) > r\} = \exp\!\left(-\lambda\,\frac{2^r - 1}{B/\sigma^2}\right)$ is the success probability for the link S-IR (Rayleigh fading).
Proof: See Appendix C.
For a time period of N time slots where K successful transmissions occur, the average AoI can be written as
$$\Delta_N = \frac{1}{N}\sum_{n=1}^{N}\Delta(n) = \frac{1}{N}\sum_{k=1}^{K} Q_k = \frac{K}{N}\,\frac{1}{K}\sum_{k=1}^{K} Q_k, \tag{8}$$
where $Q_k$ denotes the area under $\Delta(n)$ corresponding to the k-th status update. The time average $\Delta_N$ tends to the ensemble
average age for $N \to \infty$ [6], i.e.,
$$\Delta = \lim_{N\to\infty} \Delta_N = \frac{\mathbb{E}(Q)}{\mathbb{E}(X)}, \tag{9}$$
where $\lim_{N\to\infty} \frac{K(N)}{N} = \frac{1}{\mathbb{E}(X)}$ is the steady state rate of updates generation.
The area under $\Delta(n)$ for the k-th update corresponds to the sum of $X_k$ rectangles with one side equal to one and the other side equal to m, with $1 \le m \le X_k$. Therefore, $Q_k$ can be written as
$$Q_k = \sum_{m=1}^{X_k} m = \frac{X_k (X_k + 1)}{2}. \tag{10}$$
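The time average in (8) can be checked numerically by simulating the energy recursion (1), the decoding condition based on (2), and the age process (3) slot by slot. The following is a minimal sketch, not the author's simulation code: the function name `simulate_average_aoi`, the normalized (unit-free) parameter values, the seed, and the initial age of one are all illustrative assumptions.

```python
import math
import random


def simulate_average_aoi(B, P, lam, eta, sigma2, r, n_slots=200_000, seed=0):
    """Monte Carlo estimate of the time-average AoI under the greedy policy.

    Assumptions of this sketch: normalized parameters, h_k and g_k drawn
    i.i.d. exponential with rate `lam`, and an arbitrary initial age of 1.
    """
    rng = random.Random(seed)
    energy = 0.0    # E_k in (1)
    age = 1         # Delta(n) in (3)
    total_age = 0
    for _ in range(n_slots):
        # Was the capacitor fully charged at the end of the previous slot?
        was_full = energy >= B
        # Energy recursion (1): restart from zero after a full charge.
        start = 0.0 if was_full else energy
        energy = min(start + eta * P * rng.expovariate(lam), B)
        if was_full:
            # The update generated in the previous slot is sent now; SNR as in (2).
            snr = B * rng.expovariate(lam) / sigma2
            delivered = math.log2(1.0 + snr) >= r
        else:
            delivered = False
        # A successful decoding resets the age to one; otherwise it grows by one.
        age = 1 if delivered else age + 1
        total_age += age
    return total_age / n_slots


if __name__ == "__main__":
    # Hypothetical normalized setting (not the paper's numerical setup).
    print(simulate_average_aoi(B=1.0, P=2.0, lam=1.0, eta=0.5, sigma2=0.5, r=1.0))
```

Because each successful delivery resets the age to one and the age then grows by one per slot, every interarrival interval of length $X_k$ contributes exactly the area $Q_k$ in (10) to the running sum.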
By taking the expectation operator, the average area under $\Delta(n)$ can be expressed as
$$\mathbb{E}(Q) = \frac{\mathbb{E}(X^2) + \mathbb{E}(X)}{2}. \tag{11}$$
By using Propositions 1 and 2, and by substituting (11) in (9), we have the following theorem on the average AoI.
Theorem 1: The average AoI for the considered sensor network is given by
$$\Delta = \frac{1}{2}\left(\frac{\mathbb{E}(X^2)}{\mathbb{E}(X)} + 1\right) = \frac{1 + 3\beta + \beta^2}{2(1+\beta)} + \frac{(1+\beta)(1-\pi)}{\pi} + \frac{1}{2}. \tag{12}$$
From Theorem 1, we have the following two remarks.
Remark 1: For the case where $P \to \infty$ and B is a constant, the average AoI asymptotically converges to $\Delta \to 1/\pi$.
Remark 2: For the case where $P \to \infty$ and $B \to \infty$ with a ratio $B/P = \theta$, the average AoI asymptotically converges to $\Delta \to \frac{1 + 3(\lambda\theta/\eta) + (\lambda\theta/\eta)^2}{2(1 + \lambda\theta/\eta)} + \frac{1}{2}$.
If the objective of the system is to design the capacitor B such that the IR has information that is as fresh as possible, we introduce the following one-dimensional optimization problem; we assume that the transmit power is given and we minimize the AoI with respect to the size of the capacitor B. The optimal capacitor size is given by
$$B^* = \arg\min_{B>0} \Delta. \tag{13}$$
Given (12), unfortunately the optimization problem in (13) does not admit closed-form solutions; however, the optimal $B^*$ can be solved numerically (e.g., fminsearch in MATLAB).

IV. NUMERICAL RESULTS
The simulation setup follows the description of Section II with parameters $\sigma^2 = -50$ dBm, $\eta = 0.5$, $r = 0.05$ BPCU; the sensor node is located 20 meters away from both the ET and the IR; the channel power gains are modeled as $\lambda = 10^3 d^{\alpha}$, where d is the link distance and $\alpha = 2.2$ is the path-loss exponent [13].
Fig. 3 depicts the average AoI versus the capacitor’s size B for different values of P and d. As can be seen, the parameter B significantly affects the average AoI performance of the system. Specifically, a high B facilitates the transmission phase but requires more time slots to charge the capacitor, while a low B activates the sensor node faster but the available energy for transmission is low. The optimal value of B is different for each P value and matches the optimal solutions given by the optimization problem in (13). We can also observe that as P increases (and/or d decreases), the achieved average AoI decreases; a higher P (and/or shorter d) charges the capacitor faster and therefore decreases the time that the sensor is idle (in the energy harvesting mode). Finally, theoretical results perfectly match the simulation results and validate the analysis.

Fig. 3. Average AoI versus capacitor’s size B for P = {1, 3, 5, 10} Watt.

Fig. 4 plots the minimum average AoI (corresponding to $B^*$) versus P for different spectral efficiencies and distances. As expected, for a given P, a higher spectral efficiency (and/or distance) increases the achieved AoI, as it requires more transmission attempts before a successful transmission takes place; asymptotically ($P \to \infty$), the average AoI converges to an AoI floor that is equal to $1/\pi$ (see Remark 1).

Fig. 4. Minimum average AoI versus P for r = {0.01, 0.05, 0.08, 0.1} BPCU.

V. CONCLUSION
In this letter, we have studied the performance of a basic wireless powered sensor network in terms of the average AoI. The sensor node transmits updates to the destination by discharging its capacitor, which is charged by a dedicated energy source. We derived simple closed form expressions for the average AoI and we showed that it highly depends on the capacitor’s size. The optimal capacitor value has been computed numerically by formulating and solving a one-dimensional optimization problem.
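The closed-form expression of Theorem 1 and the one-dimensional search for the capacitor size in (13) can be sketched as follows. This is an illustrative sketch only: the letter mentions MATLAB's fminsearch, whereas a plain log-spaced grid search is swapped in here; the function names `average_aoi` and `optimal_capacitor`, the search interval, and the grid size are assumptions, and parameters are taken as normalized quantities.

```python
import math


def average_aoi(B, P, lam, eta, sigma2, r):
    """Closed-form average AoI of Theorem 1, eq. (12)."""
    beta = lam * B / (eta * P)
    # Success probability of the S-IR link, as defined in Proposition 2.
    pi = math.exp(-lam * (2.0 ** r - 1.0) * sigma2 / B)
    if pi == 0.0:
        # The success probability underflowed: the AoI is effectively unbounded.
        return math.inf
    return ((1 + 3 * beta + beta ** 2) / (2 * (1 + beta))
            + (1 + beta) * (1 - pi) / pi + 0.5)


def optimal_capacitor(P, lam, eta, sigma2, r, b_min=1e-3, b_max=1e3, n_grid=4001):
    """Grid search for B* in (13) over log-spaced candidates in [b_min, b_max]."""
    best_b, best_aoi = b_min, math.inf
    for i in range(n_grid):
        b = b_min * (b_max / b_min) ** (i / (n_grid - 1))
        aoi = average_aoi(b, P, lam, eta, sigma2, r)
        if aoi < best_aoi:
            best_b, best_aoi = b, aoi
    return best_b, best_aoi


if __name__ == "__main__":
    # Hypothetical normalized setting (not the paper's numerical setup).
    b_star, aoi_star = optimal_capacitor(P=2.0, lam=1.0, eta=0.5, sigma2=0.5, r=1.0)
    print(f"B* ~ {b_star:.3f}, minimum average AoI ~ {aoi_star:.3f}")
```

A grid search makes no unimodality assumption about $\Delta$ as a function of B, at the cost of a resolution limited by the grid; a derivative-free optimizer started from the best grid point could refine the result.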
APPENDIX A
PROBABILITY OF CAPACITOR CHARGING IN K CONSECUTIVE TIME SLOTS
Let $X_1, \ldots, X_K$ denote K independent and identically distributed exponential random variables with rate parameter $\lambda$. We calculate the following probability
$$\begin{aligned}
\Pi(y, K) &= \mathbb{P}\left\{\sum_{i=1}^{K-1} X_i < y,\; \sum_{i=1}^{K} X_i \ge y\right\} \\
&= \int_0^{y}\int_0^{y-x_1}\cdots\int_0^{y-\sum_{i=1}^{K-2}x_i}\int_{y-\sum_{i=1}^{K-1}x_i}^{\infty} f_{X_1,\ldots,X_K}(x_1,\ldots,x_K)\,dx_K\cdots dx_1 \\
&= \int_0^{y}\int_0^{y-x_1}\cdots\int_0^{y-\sum_{i=1}^{K-2}x_i}\int_{y-\sum_{i=1}^{K-1}x_i}^{\infty} \lambda^K \exp\left(-\lambda\sum_{i=1}^{K}x_i\right)dx_K\cdots dx_1 \\
&= \frac{1}{(K-1)!}(\lambda y)^{K-1}\exp(-\lambda y),
\end{aligned} \tag{14}$$
where $f_{X_1,\ldots,X_K}(x_1,\ldots,x_K) = \prod_{i=1}^{K} f_X(x_i)$ is the joint probability density function (PDF) of the random variables $X_1, \ldots, X_K$, and $f_X(x) = \lambda\exp(-\lambda x)$ is the PDF for an exponential random variable with parameter $\lambda$.
By using the above computation, the probability that the capacitor is charged in K consecutive time slots becomes equal to $\Pi(\beta/\lambda, K)$.

APPENDIX B
PROOF OF PROPOSITION 1
The average time between two consecutive capacitor’s recharges can be computed as follows
$$\mathbb{E}(T) = \sum_{k=1}^{\infty} k\,\mathbb{P}\{T = k\} = \sum_{k=1}^{\infty} k\,\Pi(\beta/\lambda, k) = \exp(-\beta)\sum_{k=1}^{\infty}\frac{k}{(k-1)!}\beta^{k-1} = \beta + 1, \tag{15}$$
where $\Pi(\beta/\lambda, k)$ is given in Appendix A, and the result in (15) is based on [14, eq. (1.212)]. For the second-order moment of the time between two consecutive recharges, we have
$$\mathbb{E}(T^2) = \sum_{k=1}^{\infty} k^2\,\mathbb{P}\{T = k\} = \sum_{k=1}^{\infty} k^2\,\Pi(\beta/\lambda, k) = \exp(-\beta)\sum_{k=1}^{\infty}\frac{k^2}{(k-1)!}\beta^{k-1} = 1 + 3\beta + \beta^2. \tag{16}$$

APPENDIX C
PROOF OF PROPOSITION 2
The interarrival time can be written as $X = \sum_{i=1}^{k} T_i$, where k denotes the number of the consecutive transmissions until successful decoding at the IR and it is a (positive integer) random variable. If k transmissions occur, this means that (k−1) consecutive transmissions were unsuccessful, while the k-th transmission was successful. Therefore, the average interarrival time becomes equal to
$$\mathbb{E}(X) = \sum_{k=1}^{\infty} k\,\mathbb{E}(T)(1-\pi)^{k-1}\pi = \frac{\beta+1}{\pi}, \tag{17}$$
where $\pi$ denotes the success probability for the link S-IR, and (17) is based on [14, eq. (1.113)].
For the second-order moment, we have
$$X^2 = \left(\sum_{i=1}^{k} T_i\right)^2 = \sum_{i=1}^{k} T_i^2 + 2\sum_{i=1}^{k}\sum_{j>i} T_i T_j. \tag{18}$$
By taking the conditional expectation operator and after some basic manipulations, we have
$$\mathbb{E}(X^2 \mid k) = k\,\mathbb{E}(T^2) + k(k-1)\,\mathbb{E}(T)^2. \tag{19}$$
By using similar arguments with the computation of the first-order moment, we average out the number of transmissions, i.e.,
$$\begin{aligned}
\mathbb{E}(X^2) &= \sum_{k=1}^{\infty}\mathbb{E}(X^2 \mid k)(1-\pi)^{k-1}\pi \\
&= \frac{\mathbb{E}(T^2)}{\pi} + \frac{2(1-\pi)}{\pi^2}\,\mathbb{E}(T)^2 \\
&= \frac{1 + 3\beta + \beta^2}{\pi} + \frac{2(1+\beta)^2(1-\pi)}{\pi^2},
\end{aligned} \tag{20}$$
where (20) is based on the expressions in (15), (16).

REFERENCES
[1] L. R. Varshney, “Transporting information and energy simultaneously,” in Proc. IEEE Int. Symp. Inf. Theory, Toronto, ON, Canada, Jul. 2008, pp. 1612–1616.
[2] S. Luo, R. Zhang, and T. J. Lim, “Optimal save-then-transmit protocol for energy harvesting wireless transmitters,” IEEE Trans. Wireless, vol. 12, no. 3, pp. 1196–1207, Mar. 2013.
[3] R. Zhang and C. K. Ho, “MIMO broadcasting for simultaneous wireless information and power transfer,” IEEE Trans. Wireless Commun., vol. 12, no. 5, pp. 1989–2001, May 2013.
[4] S. Kaul, R. Yates, and M. Gruteser, “Real-time status: How often should one update?” in Proc. IEEE Int. Conf. Comput. Commun., Orlando, FL, USA, Mar. 2012, pp. 2731–2735.
[5] C. Kam, S. Kompella, G. D. Nguyen, and A. Ephremides, “Effect of message transmission path diversity on status age,” IEEE Trans. Inf. Theory, vol. 62, no. 3, pp. 1360–1374, Mar. 2016.
[6] A. Kosta, N. Pappas, and V. Angelakis, “Age of information: A new concept, metric, and tool,” Found. Trends Netw., vol. 12, no. 3, pp. 162–259, 2017.
[7] A. Arafa and S. Ulukus, “Age-minimal transmission in energy harvesting two-hop networks,” in Proc. IEEE Glob. Commun. Conf., Singapore, Dec. 2017, pp. 1–6.
[8] X. Wu, J. Yang, and J. Wu, “Optimal status update for age of information minimization with an energy harvesting source,” IEEE Trans. Green Commun. Netw., vol. 2, no. 1, pp. 193–204, Mar. 2018.
[9] A. Arafa, J. Yang, and S. Ulukus, “Age-minimal online policies for energy harvesting sensors with random battery recharges,” in Proc. IEEE Int. Conf. Commun., Kansas City, MO, USA, May 2018, pp. 1–6.
[10] A. Arafa, J. Yang, S. Ulukus, and H. V. Poor, “Age-minimal transmission for energy harvesting sensors with finite batteries: Online policies,” IEEE Trans. Inf. Theory, to be published.
[11] Y. Dong, Z. Chen, and P. Fan, “Uplink age of information of unilaterally powered two-way data exchanging systems,” in Proc. IEEE Int. Conf. Comput. Commun., Honolulu, HI, USA, Apr. 2018, pp. 559–564.
[12] [Online]. Available: https://www.powercastco.com/
[13] Q. Wu, W. Chen, D. W. K. Ng, and R. Schober, “Spectral and energy efficient wireless powered IoT networks: NOMA or TDMA?” IEEE Trans. Veh. Tech., vol. 67, no. 7, pp. 6663–6667, Jul. 2018.
[14] I. S. Gradshteyn and I. M. Ryzhik, Table of Integrals, Series, and Products. San Diego, CA, USA: Elsevier, 2007.
User Cooperation in Wireless-Powered Backscatter Communication Networks

Bin Lyu, Dinh Thai Hoang, and Zhen Yang

Abstract—In this letter, we introduce new user-cooperation schemes for wireless devices in a wireless-powered backscatter communication network with the aim to improve communication and energy efficiency for the whole network. In particular, we consider two types of wireless devices which can support different communication modes, i.e., backscatter and harvest-then-transmit, and they can cooperate to deliver the information to the access point. To improve energy transmission efficiency for the devices, energy beamforming is deployed at the power beacon. We then formulate the weighted sum-rate maximization problem by jointly optimizing time schedule, power allocation, and energy beamforming. Due to the non-convex issue of the optimization problem, we employ the variable substitutions and semidefinite relaxation techniques to obtain the optimal solution. Simulation results show that the proposed cooperation framework can significantly improve the communication efficiency compared with non-cooperation approach.

Index Terms—Energy harvesting, backscatter communication, user cooperation, energy beamforming.

Manuscript received November 19, 2018; accepted December 26, 2018. Date of publication January 1, 2019; date of current version April 9, 2019. This work was supported by the National Natural Science Foundation of China under Grant 61671252 and Grant 61772287. The associate editor coordinating the review of this paper and approving it for publication was P. D. Diamantoulakis. (Corresponding author: Bin Lyu.)
B. Lyu and Z. Yang are with the National Engineering Research Center of Communication and Network Technology, Nanjing University of Posts and Telecommunications, Nanjing 210003, China (e-mail: blyu@njupt.edu.cn; yangz@njupt.edu.cn).
D. T. Hoang is with the Faculty of Engineering and Information Technology, University of Technology Sydney, Sydney, NSW 2007, Australia (e-mail: hoang.dinh@uts.edu.au).
Digital Object Identifier 10.1109/LWC.2018.2890642

I. INTRODUCTION
Wireless power transfer (WPT) has been considered to be a promising way to supply wireless devices with sustainable energy. In a wireless-powered communication network (WPCN), wireless devices can first harvest energy from a power beacon (PB), and then transmit their information to the dedicated access point (AP) following the harvest-then-transmit (HTT) protocol [1]. In [2] and [3], user cooperation was applied in WPCNs to enhance system performance by exploiting cooperative diversity. However, since devices in [2] and [3] are the HTT devices, the dedicated energy harvesting (EH) phase is required, which may reduce the duration of the information transmission (IT) phase.
Recently, backscatter communication (BackCom) has been introduced as a novel communication method for IoT networks [4]. The BackCom device transmits information to the AP by modulating and reflecting the incident signals, which requires less circuit power consumption and makes its instantaneous harvested energy be sufficient to power its circuit operation [5]. Hence, the dedicated EH phase is not necessary, which avoids the limitation of the HTT protocol. However, one of the limitations of BackCom is that if the incident signal is unavailable, information backscattering (IB) is impossible. To fully exploit the advantages of both the HTT and BackCom, BackCom has been introduced in WPCNs [6], [7], where each device can choose to operate in either the HTT or BackCom mode. However, IoT devices are hardware-constrained devices, the backscatter and energy harvesting circuits together with an adaptive switch required to support the HTT and BackCom modes may not be available in practice. Hence, the assumption that the devices can support both the two modes may not be practical. Furthermore, both user cooperation and energy beamforming are not considered in these works, hence the communication and energy efficiency can not be maximized.
In this letter, we introduce two user cooperation schemes for the WPCN with BackCom with the aim to optimize communication and energy efficiency for the network. In particular, we consider two wireless devices, denoted by HD and BD, supported to operate in two different modes, i.e., HTT and BackCom, respectively. We then consider two important scenarios, i.e., the HD (BD) is located nearer the AP and can be served as a relay node to assist the BD (HD) to transmit information due to the low channel quality of the BD (HD). For each scenario, we formulate the weighted sum-rate (WSR) optimization problem by jointly optimizing the time schedule, power allocation, and energy beamforming. To deal with the non-convex issue of the optimization problem, we first employ the variable substitutions and design the optimal energy beamforming vector only for IB or information forwarding (IF). After that, the energy beamforming matrix is derived based on semidefinite relaxation (SDR) [11] for the joint IB and EH which satisfies the rank-one constraint. Simulation results then show that our proposed cooperation framework can achieve larger communication efficiency than that of non-cooperation approach.

II. SYSTEM MODEL AND NOTATIONS
As illustrated in Fig. 1, we consider the WPCN with BackCom, including a PB, an AP, and two devices, denoted by HD and BD, supported to operate in two different modes, i.e., HTT and BackCom, respectively. The PB with stable power supply has N antennas, and the two devices are with single antenna. We consider two cases: (i) the HD is located nearer the AP than the BD, and it can operate as a relay node, and (ii) the BD is located nearer the AP than the HD, and it can work as a relay node. Note that the relay node also needs to deliver its own information to the AP. Moreover, we assume the relay node decodes the information transmitted by the other device more easily than the AP, which is useful for cooperative communication [2]. The channel vectors between the PB and the BD/HD/AP are denoted as $\mathbf{h}_{0,1}$, $\mathbf{h}_{0,2}$, and $\mathbf{h}_{0,3}$, respectively. The channel variables between the BD-HD, BD-AP, HD-AP, and HD-BD links are denoted as $h_{1,2}$, $h_{1,3}$, $h_{2,3}$, and
LYU et al.: USER COOPERATION IN WIRELESS-POWERED BACKSCATTER COMMUNICATION NETWORKS 633
$g_{1,2}$, respectively. Denote the received signal and signal-to-noise ratio (SNR) at the BD/HD/AP during the $(p+1)$-th phase for Case $q$ as $y_{m,p,q}$ and $\gamma_{m,p,q}$, where $m = bd, hd, ap$, $p = 0, 1, 2, 3$, and $q = i, ii$. The achievable rate of the BD/HD for Case $q$ is denoted by $R_{m,q}$. The system is considered within a normalized transmission time block, denoted by $T = 1$.

Fig. 1. System model.

A. Case i: The HD Is Located Nearer the AP

In this case, we divide the transmission block into four phases with durations denoted by $\tau_i$ ($i = 0, \ldots, 3$), where $\sum_{i=0}^{3} \tau_i \le 1$. During $\tau_0$, the BD backscatters information to the AP, while the HD harvests energy. Denote the transmitted signal of the PB during $\tau_0$ as $\mathbf{w}_{0,i}(\tau)$, which is expressed by $\mathbf{w}_{0,i}(\tau) = \sqrt{P}\,\hat{\mathbf{w}}_{0,i}\, s(\tau)$, where $P$ is the transmit power of the PB, $s(\tau)$ is a known sequence with unit power, and $\hat{\mathbf{w}}_{0,i}$ is the energy beamforming vector during $\tau_0$ and satisfies $\|\hat{\mathbf{w}}_{0,i}\|^2 \le 1$. The received signal at the BD during $\tau_0$, denoted by $u_{0,i}(\tau)$, is expressed as $u_{0,i}(\tau) = \mathbf{h}_{0,1}^H \mathbf{w}_{0,i}(\tau) + n_{an}(\tau)$, where $n_{an}(\tau)$ is the antenna noise. Denote the own signal of the BD for Case i as $c_i(\tau)$, which is modulated on $u_{0,i}(\tau)$ by controlling the reflection coefficient $\alpha_{0,i}$, where $E[|c_i(\tau)|^2] = 1$, $\alpha_{0,i}$ is a complex coefficient and $|\alpha_{0,i}|^2 \le 1$. The received signal at the AP during $\tau_0$ is then given by

$$y_{ap,0,i}(\tau) = \sqrt{P}\,\alpha_{0,i} h_{1,3}\, \mathbf{h}_{0,1}^H \hat{\mathbf{w}}_{0,i}\, s(\tau) c_i(\tau) + h_{1,3}\alpha_{0,i} c_i(\tau) n_{an}(\tau) + \sqrt{P}\,\mathbf{h}_{0,3}^H \hat{\mathbf{w}}_{0,i}\, s(\tau) + n_{ap}(\tau),$$

where $h_{1,3}\alpha_{0,i} c_i(\tau) n_{an}(\tau)$ is the noise backscattered to the AP, $\sqrt{P}\,\mathbf{h}_{0,3}^H \hat{\mathbf{w}}_{0,i}\, s(\tau)$ is the interference signal from the PB, and $n_{ap}(\tau)$ represents the additive white Gaussian noise (AWGN) with zero mean and variance $\sigma_{ap}^2$. The backscattered noise power is much smaller than that of $n_{ap}(\tau)$ due to channel attenuation and is typically negligible. Moreover, since the AP can also receive $\hat{\mathbf{w}}_{0,i}\, s(\tau)$, the perfect self-interference cancellation (SIC) technique¹ is employed to subtract the interference signal from $y_{ap,0,i}(\tau)$ [5], [8]. The SNR at the AP during $\tau_0$ is thus expressed as $\gamma_{ap,0,i} = P|\alpha_{0,i}|^2 |h_{1,3}|^2 |\mathbf{h}_{0,1}^H \hat{\mathbf{w}}_{0,i}|^2 / \sigma_{ap}^2$. Similarly, the received powers from the backscattered signal and noise at the HD are much smaller than those from the PB and are negligible. Hence, the harvested energy at the HD is given by $E_{h,i} = \eta P |\mathbf{h}_{0,2}^H \hat{\mathbf{w}}_{0,i}|^2 \tau_0$, where $\eta$ is the energy harvesting efficiency.

¹ Generally, the interference cannot be canceled completely. However, even if residual interference remains after cancellation, the structures and conclusions of the results presented below will not change.

During $\tau_0$, the direct IB rate from the BD to the AP may be limited due to the energy beamforming tradeoff between IB and EH and the long distance between the BD and the AP. Hence, during the second and third phases, the HD operates as a relay node for the IF of the BD, where the HD first receives the backscattered signal and then forwards it to the AP via decode-and-forward (DF) operation following [2].

Since the transmitted signal at the PB during $\tau_1$ only focuses on IB of the BD, we let the normalized energy beamforming vector be $\hat{\mathbf{w}}_{1,i}$. The transmitted signal is thus expressed as $\mathbf{w}_{1,i}(\tau) = \sqrt{P}\,\hat{\mathbf{w}}_{1,i}\, s(\tau)$. The backscattered signal is received by both the HD and the AP, and SIC is applied. The SNRs at the HD and the AP during $\tau_1$ are respectively given by $\gamma_{hd,1,i} = P|\alpha_{0,i}|^2 |h_{1,2}|^2 |\mathbf{h}_{0,1}^H \hat{\mathbf{w}}_{1,i}|^2 / \sigma_{hd}^2$ and $\gamma_{ap,1,i} = P|\alpha_{0,i}|^2 |h_{1,3}|^2 |\mathbf{h}_{0,1}^H \hat{\mathbf{w}}_{1,i}|^2 / \sigma_{ap}^2$, where $\sigma_{hd}^2$ is the noise power at the HD. During the third phase, the HD decodes the received signal from the BD [9] and forwards it to the AP. The forwarded signal received by the AP during $\tau_2$ is expressed as $y_{ap,2,i}(\tau) = \sqrt{P_{1,i}}\, h_{2,3}\, c_i(\tau) + n_{ap}(\tau)$, where $P_{1,i}$ is the transmit power of the HD for IF, and the SNR is expressed as $\gamma_{ap,2,i} = P_{1,i} |h_{2,3}|^2 / \sigma_{ap}^2$. During the fourth phase, the HD transmits its own information to the AP. Similarly, the SNR at the AP during $\tau_3$ is expressed as $\gamma_{ap,3,i} = P_{2,i} |h_{2,3}|^2 / \sigma_{ap}^2$, where $P_{2,i}$ denotes the HD's transmit power for its own IT.

Based on the above analysis, the achievable rates of the BD and the HD are expressed as

$$R_{bd,i} = \tau_0 \log_2(1 + \xi\gamma_{ap,0,i}) + \min\{\tau_1 \log_2(1 + \xi\gamma_{ap,1,i}) + \tau_2 \log_2(1 + \xi\gamma_{ap,2,i}),\; \tau_1 \log_2(1 + \xi\gamma_{hd,1,i})\}$$

[10] and $R_{hd,i} = \tau_3 \log_2(1 + \xi\gamma_{ap,3,i})$, respectively, where $\xi$ is the performance gap due to the practical modulation and coding scheme [1], [7]. Note that since backscattered noise exists at the AP, the above expressions approximate the real transmission rate of the BD [5].

B. Case ii: The BD Is Located Nearer the AP

In the second case, we divide the transmission block into three phases with durations denoted by $t_i$ ($i = 0, 1, 2$), where $\sum_{i=0}^{2} t_i \le 1$. During the first phase, the HD harvests energy and the BD backscatters information. Denote the normalized energy beamforming vector, the own signal of the BD, and the reflection coefficient during $t_0$ for Case ii as $\hat{\mathbf{w}}_{0,ii}$, $c_{ii}(t)$, and $\alpha_{0,ii}$, respectively, where $E[|c_{ii}(t)|^2] = 1$ and $|\alpha_{0,ii}|^2 \le 1$. The harvested energy at the HD is given by $E_{h,ii} = \eta P |\mathbf{h}_{0,2}^H \hat{\mathbf{w}}_{0,ii}|^2 t_0$. Similarly, the backscattered noise is not considered and SIC is adopted in this case. The SNR at the AP during $t_0$ is thus given by $\gamma_{ap,0,ii} = P|\alpha_{0,ii}|^2 |h_{1,3}|^2 |\mathbf{h}_{0,1}^H \hat{\mathbf{w}}_{0,ii}|^2 / \sigma_{ap}^2$. During the last two phases, the IT of the HD is assisted by the BD. During the second phase, the HD transmits its information to both the AP and the BD based on the harvested energy, and the PB stays idle. The SNRs at the AP and the BD are thus given by $\gamma_{ap,1,ii} = P_{1,ii} |h_{2,3}|^2 / \sigma_{ap}^2$ and $\gamma_{bd,1,ii} = P_{1,ii} |g_{1,2}|^2 / \sigma_{bd}^2$, where $P_{1,ii}$ is the transmit power of the HD and satisfies $P_{1,ii} t_1 \le E_{h,ii}$, and $\sigma_{bd}^2$ is the noise power at the BD. During the third phase, the PB is activated, and the BD can forward the information from the HD to the AP via DF. The subsequent SNR after SIC is given by $\gamma_{ap,2,ii} = P|\alpha_{2,ii}|^2 |h_{1,3}|^2 |\mathbf{h}_{0,1}^H \hat{\mathbf{w}}_{2,ii}|^2 / \sigma_{ap}^2$, where $\hat{\mathbf{w}}_{2,ii}$ is the normalized energy beamforming vector during $t_2$, and $\alpha_{2,ii}$ is the reflection coefficient during $t_2$ with $|\alpha_{2,ii}|^2 \le 1$.

Then, the achievable rates of the HD and the BD for Case ii are given by $R_{hd,ii} = \min\{t_1 \log_2(1 + \xi\gamma_{ap,1,ii}) + t_2 \log_2(1 + \xi\gamma_{ap,2,ii}),\; t_1 \log_2(1 + \xi\gamma_{bd,1,ii})\}$, and $R_{bd,ii} = t_0 \log_2(1 + \xi\gamma_{ap,0,ii})$.
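The rate expressions of Section II can be evaluated numerically once the per-phase SNRs are known. The following Python sketch is only an illustration: the function names (`rate`, `rates_case_i`, `rates_case_ii`) and the argument layout are assumptions of this example and do not come from the letter.

```python
import math


def rate(duration, snr, xi):
    """One rate term: duration * log2(1 + xi * snr), with performance gap xi."""
    return duration * math.log2(1.0 + xi * snr)


def rates_case_i(tau, snr_ap, snr_hd_1, xi):
    """Return (R_bd_i, R_hd_i) for Case i.

    tau      -- (tau0, tau1, tau2, tau3), phase durations
    snr_ap   -- SNRs at the AP during tau0..tau3
    snr_hd_1 -- SNR at the HD during tau1
    """
    t0, t1, t2, t3 = tau
    g0, g1, g2, g3 = snr_ap
    direct = rate(t0, g0, xi)                        # BD -> AP during tau0
    via_ap = rate(t1, g1, xi) + rate(t2, g2, xi)     # what the AP collects
    via_hd = rate(t1, snr_hd_1, xi)                  # the HD must decode first
    return direct + min(via_ap, via_hd), rate(t3, g3, xi)


def rates_case_ii(t, snr_ap, snr_bd_1, xi):
    """Return (R_bd_ii, R_hd_ii) for Case ii.

    t        -- (t0, t1, t2), phase durations
    snr_ap   -- SNRs at the AP during t0..t2
    snr_bd_1 -- SNR at the BD during t1
    """
    t0, t1, t2 = t
    g0, g1, g2 = snr_ap
    r_hd = min(rate(t1, g1, xi) + rate(t2, g2, xi), rate(t1, snr_bd_1, xi))
    return rate(t0, g0, xi), r_hd
```

In both cases the `min` reflects the decode-and-forward bottleneck: the relayed information is limited either by what the relay can decode or by what the AP can accumulate over the two relay phases.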
III. WEIGHTED SUM-RATE MAXIMIZATION

A. Case i

We first set the time and energy constraints for the network as follows: C1: $\sum_{i=0}^{3} \tau_i \le 1$, C2: $\tau_i \ge 0, \forall i$, C3: $P_{1,i}\tau_2 + P_{2,i}\tau_3 \le E_{h,i}$, C4: $\|\hat{\mathbf{w}}_{0,i}\|^2 \le 1$, and C5: $\|\hat{\mathbf{w}}_{1,i}\|^2 \le 1$. Then, the optimization problem can be formulated as

$$\max_{\hat{\mathbf{w}}_{0,i},\, \hat{\mathbf{w}}_{1,i},\, \mathbf{P}_i,\, \boldsymbol{\tau}} \; \omega_1 R_{bd,i} + \omega_2 R_{hd,i} \quad \text{s.t. C1, C2, C3, C4, C5,} \tag{P1}$$

where $\boldsymbol{\tau} = [\tau_0, \tau_1, \tau_2, \tau_3]$, $\mathbf{P}_i = [P_{1,i}, P_{2,i}]$, and $\omega_1$ and $\omega_2$ denote the nonnegative rate weights for the BD and the HD, respectively. Problem P1 is not a convex optimization problem because $\hat{\mathbf{w}}_{0,i}$, $\hat{\mathbf{w}}_{1,i}$, $\boldsymbol{\tau}$, and $\mathbf{P}_i$ are coupled with each other. To solve P1, we introduce some new variables and apply the SDR technique [11]. First, we introduce $\bar{R}_{bd,i}$, $\hat{R}_{bd,i}$, $e_{0,i}$, and let $e_{1,i} = P_{1,i}\tau_2$, $e_{2,i} = P_{2,i}\tau_3$, $\mathbf{W}_i = \tau_0 \hat{\mathbf{w}}_{0,i} \hat{\mathbf{w}}_{0,i}^H$. Hence, we have the following new constraints:

C6: $\bar{R}_{bd,i} \le \tau_1 \log_2(1 + \xi P|\alpha_{0,i}|^2 |h_{1,3}|^2 |\mathbf{h}_{0,1}^H \hat{\mathbf{w}}_{1,i}|^2 / \sigma_{ap}^2) + \tau_2 \log_2(1 + \xi e_{1,i} |h_{2,3}|^2 / (\sigma_{ap}^2 \tau_2))$,
C7: $\bar{R}_{bd,i} \le \tau_1 \log_2(1 + \xi P|\alpha_{0,i}|^2 |h_{1,2}|^2 |\mathbf{h}_{0,1}^H \hat{\mathbf{w}}_{1,i}|^2 / \sigma_{hd}^2)$,
C8: $\hat{R}_{bd,i} = \tau_0 \log_2(1 + \xi P|\alpha_{0,i}|^2 |h_{1,3}|^2 e_{0,i} / (\sigma_{ap}^2 \tau_0))$,
C9: $R_{hd,i} = \tau_3 \log_2(1 + \xi e_{2,i} |h_{2,3}|^2 / (\sigma_{ap}^2 \tau_3))$,
C10: $e_{0,i} \le \mathrm{Tr}(\mathbf{h}_{0,1}\mathbf{h}_{0,1}^H \mathbf{W}_i)$,
C11: $e_{1,i} + e_{2,i} \le \eta P\, \mathrm{Tr}(\mathbf{h}_{0,2}\mathbf{h}_{0,2}^H \mathbf{W}_i)$,
C12: $\mathrm{Tr}(\mathbf{W}_i) \le \tau_0$,
C13: $\mathbf{W}_i \succeq 0$, and
C14: $\mathrm{rank}(\mathbf{W}_i) = 1$.

Then, P1 is recast as

$$\max_{\mathbf{W}_i,\, \hat{\mathbf{w}}_{1,i},\, \mathbf{e}_i,\, \boldsymbol{\tau},\, \hat{R}_{bd,i},\, \bar{R}_{bd,i},\, R_{hd,i}} \; \omega_1 (\hat{R}_{bd,i} + \bar{R}_{bd,i}) + \omega_2 R_{hd,i} \quad \text{s.t. C1, C2, C5, C6, C7, C8, C9, C10, C11, C12, C13, C14,} \tag{P2}$$

where $\mathbf{e}_i = [e_{0,i}, e_{1,i}, e_{2,i}]$. However, Problem P2 is still not a convex optimization problem due to the rank-one constraint and the coupling of $\tau_1$ and $\hat{\mathbf{w}}_{1,i}$. To handle this, we first give the following proposition. Denote the optimal solution for P2 as $\{\mathbf{W}_i^*, \hat{\mathbf{w}}_{1,i}^*, \mathbf{e}_i^*, \boldsymbol{\tau}^*, \hat{R}_{bd,i}^*, \bar{R}_{bd,i}^*, R_{hd,i}^*\}$, where $\mathbf{e}_i^* = [e_{0,i}^*, e_{1,i}^*, e_{2,i}^*]$ and $\boldsymbol{\tau}^* = [\tau_0^*, \tau_1^*, \tau_2^*, \tau_3^*]$.

Proposition 1: The optimal energy beamforming vector during $\tau_1$ is given by $\hat{\mathbf{w}}_{1,i}^* = \mathbf{h}_{0,1} / \|\mathbf{h}_{0,1}\|$.

The proof of Proposition 1 can be done by contradiction and is omitted due to the limited space. Then, based on Proposition 1, C6 and C7 are recast as

C15: $\bar{R}_{bd,i} \le \tau_1 \log_2(1 + \xi P|\alpha_{0,i}|^2 |h_{1,3}|^2 \|\mathbf{h}_{0,1}\|^2 / \sigma_{ap}^2) + \tau_2 \log_2(1 + \xi e_{1,i} |h_{2,3}|^2 / (\sigma_{ap}^2 \tau_2))$ and
C16: $\bar{R}_{bd,i} \le \tau_1 \log_2(1 + \xi P|\alpha_{0,i}|^2 |h_{1,2}|^2 \|\mathbf{h}_{0,1}\|^2 / \sigma_{hd}^2)$,

respectively. With Proposition 1, P2 is still non-convex due to the rank-one constraint. The SDR technique is an efficient approximation technique to convert the non-convex problem to a convex problem [11]. By relaxing C14 following SDR, P2 is recast as follows:

$$\max_{\mathbf{W}_i,\, \mathbf{e}_i,\, \boldsymbol{\tau},\, \hat{R}_{bd,i},\, \bar{R}_{bd,i},\, R_{hd,i}} \; \omega_1 (\hat{R}_{bd,i} + \bar{R}_{bd,i}) + \omega_2 R_{hd,i} \quad \text{s.t. C1, C2, C8, C9, C10, C11, C12, C13, C15, C16.} \tag{P3}$$

Proposition 2: Problem P3 is a convex problem [12].

According to Proposition 2, the optimal solution for P3 can be obtained by standard optimization techniques. In this letter, we use CVX tools [13] to derive the optimal solution. The optimal power allocations are further given by $P_{1,i}^* = e_{1,i}^* / \tau_2^*$ and $P_{2,i}^* = e_{2,i}^* / \tau_3^*$. Then, we compute the optimal solution $\hat{\mathbf{w}}_{0,i}^*$ from $\mathbf{W}_i^*$. Note that if $\mathbf{W}_i^*$ satisfies the rank-one constraint, $\hat{\mathbf{w}}_{0,i}^*$ computed from $\mathbf{W}_i^* / \tau_0^*$ by eigen-decomposition is the optimal energy beamforming vector during $\tau_0$. Hence, we proceed to show that $\mathbf{W}_i^*$ always has the rank-one property in the following proposition.

Proposition 3: The optimal solution $\mathbf{W}_i^*$ derived from P3 is a rank-one matrix.

Proof: To show $\mathbf{W}_i^*$ is a rank-one matrix, we first give the following optimization problem.

$$\min_{\mathbf{W}_i} \; \mathrm{Tr}(\mathbf{W}_i) \quad \text{s.t.} \;\; e_{0,i}^* \le \mathrm{Tr}(\mathbf{h}_{0,1}\mathbf{h}_{0,1}^H \mathbf{W}_i), \;\; e_{1,i}^* + e_{2,i}^* \le \eta P\, \mathrm{Tr}(\mathbf{h}_{0,2}\mathbf{h}_{0,2}^H \mathbf{W}_i), \;\; \mathbf{W}_i \succeq 0. \tag{P4}$$

Denote the optimal solution for Problem P4 as $\mathbf{W}_i^\dagger$, which is also a feasible solution for P3. The reason is that there are more constraints in P3 than in P4, which guarantees that a feasible solution for P3 is also feasible for P4. Hence, we can derive that $\mathrm{Tr}(\mathbf{W}_i^\dagger) \le \mathrm{Tr}(\mathbf{W}_i^*) \le \tau_0^*$, which shows that $\mathbf{W}_i^\dagger$ is a feasible solution for P3. Furthermore, since the objective function of P3 is a function of $\mathbf{e}_i$, $\boldsymbol{\tau}$, $\hat{R}_{bd,i}$, $\bar{R}_{bd,i}$ and $R_{hd,i}$, we can derive that $\{\mathbf{W}_i^\dagger, \mathbf{e}_i^*, \boldsymbol{\tau}^*, \hat{R}_{bd,i}^*, \bar{R}_{bd,i}^*, R_{hd,i}^*\}$ is also the optimal solution for P3, i.e., $\mathbf{W}_i^\dagger = \mathbf{W}_i^*$. According to the theorem given in [14, Th. 3.2], we then show that $\mathbf{W}_i^\dagger$ is a rank-one matrix. Since there exists an optimal solution $\mathbf{W}_i^\dagger$ satisfying $(\mathrm{rank}(\mathbf{W}_i^\dagger))^2 \le 2$, we derive that $\mathbf{W}_i^\dagger \ne \mathbf{0}$ is rank-one. Hence, $\mathrm{rank}(\mathbf{W}_i^*) = 1$.

B. Case ii

Similar to the first case, we add the following constraints: C17: $\sum_{i=0}^{2} t_i \le 1$, C18: $t_i \ge 0, \forall i$, C19: $P_{1,ii} t_1 \le E_{h,ii}$, C20: $\|\hat{\mathbf{w}}_{0,ii}\|^2 \le 1$, and C21: $\|\hat{\mathbf{w}}_{2,ii}\|^2 \le 1$. Then, the optimization problem for Case ii is formulated as:

$$\max_{\hat{\mathbf{w}}_{ii},\, P_{1,ii},\, \mathbf{t}} \; \omega_1 R_{bd,ii} + \omega_2 R_{hd,ii} \quad \text{s.t. C17, C18, C19, C20, C21,} \tag{P5}$$

where $\hat{\mathbf{w}}_{ii} = [\hat{\mathbf{w}}_{0,ii}, \hat{\mathbf{w}}_{2,ii}]$ and $\mathbf{t} = [t_0, t_1, t_2]$. Following a similar approach to Case i, Problem P5 can be solved as follows. We introduce auxiliary variables $\bar{R}_{hd,ii}$, $e_{0,ii}$, $e_{1,ii}$, and let $e_{2,ii} = P_{1,ii} t_1$, $\mathbf{W}_{ii} = t_0 \hat{\mathbf{w}}_{0,ii} \hat{\mathbf{w}}_{0,ii}^H$. Then, we introduce the following new constraints:

C22: $t_1 \log_2(1 + \xi e_{2,ii} |h_{2,3}|^2 / (\sigma_{ap}^2 t_1)) + t_2 \log_2(1 + \xi P|\alpha_{2,ii}|^2 |h_{1,3}|^2 |\mathbf{h}_{0,1}^H \hat{\mathbf{w}}_{2,ii}|^2 / \sigma_{ap}^2) \ge \bar{R}_{hd,ii}$,
C23: $t_1 \log_2(1 + \xi e_{2,ii} |g_{1,2}|^2 / (\sigma_{bd}^2 t_1)) \ge \bar{R}_{hd,ii}$,
C24: $e_{0,ii} \le \mathrm{Tr}(\mathbf{h}_{0,2}\mathbf{h}_{0,2}^H \mathbf{W}_{ii})$,
C25: $e_{1,ii} \le \mathrm{Tr}(\mathbf{h}_{0,1}\mathbf{h}_{0,1}^H \mathbf{W}_{ii})$,
C26: $R_{bd,ii} = t_0 \log_2(1 + \xi P|\alpha_{0,ii}|^2 |h_{1,3}|^2 e_{1,ii} / (\sigma_{ap}^2 t_0))$,
C27: $e_{2,ii} \le \eta P e_{0,ii}$,
C28: $\mathrm{Tr}(\mathbf{W}_{ii}) \le t_0$,
C29: $\mathbf{W}_{ii} \succeq 0$, and
C30: $\mathrm{rank}(\mathbf{W}_{ii}) = 1$.

Then, P5 is recast as:

$$\max_{\mathbf{W}_{ii},\, \hat{\mathbf{w}}_{2,ii},\, \mathbf{e}_{ii},\, \mathbf{t},\, R_{bd,ii},\, \bar{R}_{hd,ii}} \; \omega_1 R_{bd,ii} + \omega_2 \bar{R}_{hd,ii} \quad \text{s.t. C17, C18, C21, C22, C23, C24, C25, C26, C27, C28, C29, C30,} \tag{P6}$$
where $\mathbf{e}_{ii} = [e_{0,ii}, e_{1,ii}, e_{2,ii}]$. Denote the optimal solution for Problem P6 as $\{\mathbf{W}_{ii}^*, \hat{\mathbf{w}}_{2,ii}^*, \mathbf{e}_{ii}^*, \mathbf{t}^*, R_{bd,ii}^*, \bar{R}_{hd,ii}^*\}$, where $\mathbf{e}_{ii}^* = [e_{0,ii}^*, e_{1,ii}^*, e_{2,ii}^*]$, and $\mathbf{t}^* = [t_0^*, t_1^*, t_2^*]$. Similar to Case i, we can derive the following Proposition 4.

Proposition 4: The optimal energy beamforming design during $t_2$ is given by $\hat{\mathbf{w}}_{2,ii}^* = \mathbf{h}_{0,1} / \|\mathbf{h}_{0,1}\|$.

Based on Proposition 4, C22 is rewritten as

C31: $t_1 \log_2(1 + \xi e_{2,ii} |h_{2,3}|^2 / (\sigma_{ap}^2 t_1)) + t_2 \log_2(1 + \xi P|\alpha_{2,ii}|^2 |h_{1,3}|^2 \|\mathbf{h}_{0,1}\|^2 / \sigma_{ap}^2) \ge \bar{R}_{hd,ii}$.

P6 is then reformulated as P7 without considering C30.

$$\max_{\mathbf{W}_{ii},\, \mathbf{e}_{ii},\, \mathbf{t},\, R_{bd,ii},\, \bar{R}_{hd,ii}} \; \omega_1 R_{bd,ii} + \omega_2 \bar{R}_{hd,ii} \quad \text{s.t. C17, C18, C23, C24, C25, C26, C27, C28, C29, C31.} \tag{P7}$$

It can be proved that P7 is a convex problem [12]; hence its optimal solution can be obtained by CVX tools [13]. Based on the derived solution, the optimal power allocation is given by $P_{1,ii}^* = e_{2,ii}^* / t_1^*$, and the optimal energy beamforming vector during $t_0$ is derived from $\mathbf{W}_{ii}^* / t_0^*$ by eigen-decomposition since $\mathbf{W}_{ii}^*$ is a rank-one matrix.

IV. SIMULATION RESULTS

In this section, simulation results are given to evaluate the performance of the proposed schemes. The simulated network topology is a 2-D plane, where the position of each node is described by its coordinate $(x, y)$. The coordinates of the PB, the AP and the two devices are given as (0,10), (10,0), (0,0), and (2,1), respectively. All channels are modeled following Rayleigh fading with distribution $\mathcal{CN}(0, d_{m,n}^{-\kappa})$, where $\kappa$ denotes the path-loss exponent and is set at 2, and $d_{m,n}$ is the distance between two nodes $m$, $n$. We assume $\sigma_{ap}^2 = \sigma_{hd}^2 = \sigma_{bd}^2 = -40$ dBm, $\eta = 0.7$, $|\alpha_{0,i}|^2 = |\alpha_{0,ii}|^2 = |\alpha_{2,ii}|^2 = 1$, $\xi = -5$ dB [7], and $N = 10$. The proposed schemes under Case i and Case ii are denoted as ‘proposed scheme i’ and ‘proposed scheme ii’, respectively. The scheme in which both devices are HD devices [3] and the non-cooperation schemes for Case i and Case ii are used as the benchmark schemes.

Fig. 2a shows the WSR versus $P$ with $\omega_1 = \omega_2 = 0.5$. It can be observed that the results obtained by the two proposed schemes are superior to those of the benchmark schemes. This is because the time of information delivery is extended since the dedicated EH phases for both proposed schemes are not required and user cooperation can enhance the system WSR. Moreover, the WSR of the proposed scheme ii is larger than that of the proposed scheme i. This is because for the proposed scheme i, the channel conditions of the BD for IB are worse and the harvested energy of the HD is used for transmitting its own information and forwarding the information of the BD, which limits the WSR. For the proposed scheme ii, in contrast, the channel conditions of the BD for IB are better, and the harvested energy of the HD is only used for its own IT. Hence, the proposed scheme ii can achieve a larger WSR. Fig. 2b shows the effect of $\omega_1$ on the system WSR with $P = 20$ dBm. From Fig. 2b, we observe that the WSRs of the proposed schemes are larger than those of the benchmark schemes, which shows the superiority of the proposed schemes. Since changing $\omega_1$ can guarantee user fairness, we conclude that guaranteeing user fairness may degrade the system WSR.

Fig. 2. Performance evaluation.

V. CONCLUSION

We have proposed two user cooperation schemes in the WPCN with BackCom, where one device is the BD and the other device is the HD. We have considered two cases in which either the HD or the BD is located nearer the AP and can serve as the relay node for the other node in forwarding information to the AP. Two WSR optimization problems have been formulated to jointly optimize the time schedule, power allocation, and energy beamforming vectors. Then, the variable substitutions and SDR technique have been developed to obtain the optimal solution. Finally, simulation results have been provided to evaluate the efficiency of the proposed schemes.

REFERENCES

[1] H. Ju and R. Zhang, “Throughput maximization in wireless powered communication networks,” IEEE Trans. Wireless Commun., vol. 13, no. 1, pp. 418–428, Jan. 2014.
[2] H. Ju and R. Zhang, “User cooperation in wireless powered communication networks,” in Proc. IEEE GLOBECOM, Austin, TX, USA, Dec. 2014, pp. 1430–1435.
[3] X. Di, K. Xiong, P. Fan, H.-C. Yang, and K. B. Letaief, “Optimal resource allocation in wireless powered communication networks with user cooperation,” IEEE Trans. Wireless Commun., vol. 16, no. 12, pp. 7936–7949, Dec. 2017.
[4] G. Zhu, S.-W. Ko, and K. Huang, “Inference from randomized transmissions by many backscatter sensors,” IEEE Trans. Wireless Commun., vol. 17, no. 5, pp. 3111–3127, May 2018.
[5] S. Gong et al., “Backscatter relay communications powered by wireless energy beamforming,” IEEE Trans. Commun., vol. 66, no. 7, pp. 3187–3200, Jul. 2018.
[6] D. T. Hoang, D. Niyato, P. Wang, D. I. Kim, and Z. Han, “Ambient backscatter: A new approach to improve network performance for RF-powered cognitive radio networks,” IEEE Trans. Commun., vol. 65, no. 9, pp. 3659–3674, Sep. 2017.
[7] S. H. Kim et al., “Hybrid backscatter communication for wireless-powered heterogeneous networks,” IEEE Trans. Wireless Commun., vol. 16, no. 10, pp. 6557–6570, Oct. 2017.
[8] D. Bharadia, K. R. Joshi, M. Kotaru, and S. Katti, “BackFi: High throughput WiFi backscatter,” in Proc. SIGCOMM, London, U.K., Aug. 2015, pp. 283–296.
[9] G. Wang, F. Gao, R. Fan, and C. Tellambura, “Ambient backscatter communication systems: Detection and performance analysis,” IEEE Trans. Commun., vol. 64, no. 11, pp. 4836–4846, Nov. 2016.
[10] Y. Liang and V. V. Veeravalli, “Gaussian orthogonal relay channels: Optimal resource allocation and capacity,” IEEE Trans. Inf. Theory, vol. 51, no. 9, pp. 3284–3289, Sep. 2005.
[11] Z.-Q. Luo, W.-K. Ma, A. M.-C. So, Y. Ye, and S. Zhang, “Semidefinite relaxation of quadratic optimization problems,” IEEE Signal Process. Mag., vol. 27, no. 3, pp. 20–34, May 2010.
[12] S. Boyd and L. Vandenberghe, Convex Optimization. Cambridge, U.K.: Cambridge Univ. Press, 2004.
[13] M. Grant and S. Boyd. (Sep. 2013). CVX: MATLAB Software for Disciplined Convex Programming, Version 2.0 Beta. [Online]. Available: http://cvxr.com/cvx
[14] Y. Huang and D. P. Palomar, “Rank-constrained separable semidefinite programming with applications to optimal beamforming,” IEEE Trans. Signal Process., vol. 58, no. 2, pp. 664–678, Feb. 2010.
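The simulation setup of Section IV (node coordinates, Rayleigh channels with distance-dependent variance, and the normalized channel direction used as beamformer in Propositions 1 and 4) could be reproduced along the following lines. This is a minimal sketch under stated assumptions: the names `NODES`, `dev_a`, `dev_b`, `rayleigh_coeff`, `pb_channel_vector`, `matched_beamformer`, and `dbm_to_watt` are hypothetical, and since the text does not state which coordinate belongs to the BD and which to the HD, the two devices are left unlabeled.

```python
import math
import random

# Coordinates from Section IV. Which device is the BD or the HD is not
# stated in the text, so the labels dev_a / dev_b are placeholders.
NODES = {"pb": (0.0, 10.0), "ap": (10.0, 0.0),
         "dev_a": (0.0, 0.0), "dev_b": (2.0, 1.0)}
KAPPA = 2.0   # path-loss exponent
N_ANT = 10    # number of PB antennas (N)


def distance(m, n):
    """Euclidean distance d_{m,n} between two named nodes."""
    (x1, y1), (x2, y2) = NODES[m], NODES[n]
    return math.hypot(x1 - x2, y1 - y2)


def rayleigh_coeff(m, n, rng):
    """One draw from CN(0, d_{m,n}^{-kappa})."""
    variance = distance(m, n) ** (-KAPPA)
    sigma = math.sqrt(variance / 2.0)   # per real/imaginary component
    return complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))


def pb_channel_vector(n, rng):
    """Channel vector from the N-antenna PB to the single-antenna node n."""
    return [rayleigh_coeff("pb", n, rng) for _ in range(N_ANT)]


def matched_beamformer(h):
    """Unit-norm beamformer h / ||h||, as in Propositions 1 and 4."""
    norm = math.sqrt(sum(abs(x) ** 2 for x in h))
    return [x / norm for x in h]


def dbm_to_watt(p_dbm):
    """Convert a power level in dBm to watts."""
    return 10.0 ** ((p_dbm - 30.0) / 10.0)
```

A typical use would draw `h = pb_channel_vector("dev_a", random.Random(1))` per Monte Carlo run and pass `matched_beamformer(h)` into the SNR expressions of Section II.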
Tag Cardinality Estimation Using Expectation-Maximization in ALOHA-Based RFID Systems With Capture Effect and Detection Error

Chuyen T. Nguyen, Van-Dinh Nguyen, and Anh T. Pham

Abstract: Tag cardinality estimation is one of the most crucial issues in radio frequency identification technology. The issue, however, usually faces challenges in wireless fading environments due to the presence of the so-called capture effect (CE) and detection error (DE). The aim of this letter is to provide an efficient and accurate estimation method to cope with the CE and DE using the expectation-maximization algorithm and the standard Aloha-based protocol. We show that the proposed method gives more accurate estimates than a conventional one. Thanks to this fact, the Aloha frame size used for the tag identification process can also be optimally selected so that the identification efficiency can be improved. Computer simulations are presented to confirm the merit of the proposed method.

Index Terms: RFID, Aloha, capture effect, detection error, EM algorithm, estimation.

Manuscript received October 9, 2018; revised November 23, 2018; accepted December 7, 2018. Date of publication January 1, 2019; date of current version April 9, 2019. This work was supported by JSPS Kakenhi under Project 18K11269. The associate editor coordinating the review of this paper and approving it for publication was K. Adachi. (Corresponding author: Chuyen T. Nguyen.)
C. T. Nguyen is with the Department of Telecommunication Systems, School of Electronics and Telecommunications, Hanoi University of Science and Technology, Hanoi 100000, Vietnam (e-mail: chuyen.nguyenthanh@hust.edu.vn).
V.-D. Nguyen is with the Institute of Research and Development, Duy Tan University, Da Nang 550000, Vietnam, and also with the Department of ICMC Convergence Technology, Soongsil University, Seoul 06978, South Korea (e-mail: nguyenvandinh@ssu.ac.kr).
A. T. Pham is with the Department of Computer Engineering, University of Aizu, Aizuwakamatsu 965-8580, Japan (e-mail: pham@u-aizu.ac.jp).
Digital Object Identifier 10.1109/LWC.2018.2890650

I. INTRODUCTION

TAG CARDINALITY estimation holds a crucial task in Radio Frequency Identification (RFID) technology with many practical applications such as intelligent transportation, indoor stadium, and warehouse systems. The task has been investigated in several previous works [1], [2] with a frame slotted Aloha (FSA) protocol, which is originally and standardly used to detect RFID tags’ Identity (ID) [3]. In those works, tags randomly transmit their IDs in a frame of time slots. Then, observations of the number of responses in each slot, i.e., no response, one response, and multiple responses [4], can be utilized for the tag cardinality estimation. The tags’ ID identification process can be significantly improved with an accurate estimate of the cardinality.

On the other hand, under the effects of wireless channel impairments, the observations of time slots may not accurately reflect the real number of responses. Indeed, due to the channel fading phenomenon, a tag might be detected with a probability in a multiple-response slot, which is well-known as the capture effect (CE) [5]. The observed state of the slot, in this case, is assumed to be singleton to differentiate from one-response. In addition, a tag might not be detected with a probability in the corresponding one-response slot, which is referred to as the detection error (DE) [6]. Similarly, the observed state of the slot is called empty, while in other cases, the state is observed as collision. These phenomena have been extensively studied in the literature of RFID in both theoretical and experimental aspects [6]. They are usually hidden from the reader, and therefore affect the estimation accuracy of conventional methods.

Several works have been proposed to deal with the cardinality estimation in the presence of the CE [7]–[9]. The method in [7], i.e., capture-aware backlog estimation (CMEBE), estimates the tag cardinality and the CE probability by minimizing the norm-2 distance between theoretical and observed vectors of empty, singleton and collision slots. In [8], they are found by using a Bayesian approach. Also, in [9], the capture probability is analyzed in a more accurate approach by considering the number of contention tags in a time slot and physical layer parameters. Thanks to the approach, a closed-form solution of the optimal frame length is found, which then improves the identification performance. The key limitation of those works, nevertheless, is that the DE is completely ignored. On the other hand, the cardinality estimation in the presence of both the CE and DE was recently studied in [10] based on the maximum likelihood (ML) approach. In the approach, an approximation of the likelihood function of the tag cardinality, CE and DE probabilities, for given observations of slots is determined. Nevertheless, to maximize the likelihood function, the method adopted an exhaustive search algorithm to check all possible values of the tag cardinality and the probabilities. This approach thus resulted in a very high computational complexity, and affected the overall performance of the identification process. Although the complexity could be reduced with a simple transmission model such as flat Rayleigh fading where a deterministic relation between CE and DE probabilities could be obtained [10], it would be much more challenging in practical ones.

In this letter, we propose a new method employing the Expectation-Maximization (EM) algorithm [11] and FSA protocol to efficiently and accurately estimate the tag cardinality in the presence of both CE and DE. The method includes iterative estimation rounds. In each round, the cardinality is first estimated by the ML approach given expected values of hidden data/observations caused by the CE and DE. The CE and DE probabilities can then be found in closed form for the given estimate, which significantly reduces the computational complexity in comparison with the method in [10]. Simulation
NGUYEN et al.: TAG CARDINALITY ESTIMATION USING EXPECTATION-MAXIMIZATION IN ALOHA-BASED RFID SYSTEMS WITH CE AND DE 637
results also confirm the effectiveness of our proposed method compared to the conventional methods.

Fig. 1. A reading round of Aloha-based identification operation.

II. PROTOCOL DESCRIPTION AND PROPOSED METHOD

A. Protocol Description

Our considered RFID system consists of a reader and $n$ tags in its communication range. The FSA protocol is implemented, in which the reader first broadcasts a request consisting of a time slotted frame size of $L$. Then, each tag responds to the reader by its identity (ID) randomly in one of the $L$ slots. The reader tries to estimate the tag cardinality based on the observed numbers of empty, singleton, and collision slots, denoted as $E$, $S$, and $C$, respectively [4].

Practically, the DE and CE may happen in any slot with tags’ responses. To focus on the tag cardinality estimation, we use a similar model as in [10], in which each one-response slot is assumed to be detected as an empty one with an average DE probability of $\beta$, while a multiple-response slot is recognized as singleton with an average CE probability of $\alpha$. The inaccurate cardinality estimation problem due to the CE and DE can be illustrated as in Fig. 1, which presents a reading round in an Aloha-based RFID system with a reader and 5 tags. In this example, a request $L = 4$ is initially used and the correct observation should be $E = 1$, $S = 1$, $C = 2$. Nevertheless, due to the impact of the CE and DE, tag 3 is detected in slot 3 (CE), while tag 1 is not detected in slot 1 (DE). The reader therefore has a wrong observation with $E = 2$, $S = 1$, $C = 1$; and consequently, it will inaccurately estimate the cardinality of tags in the system.

B. Proposed Method

In our method, we first denote $p_E$, $p_S$ and $p_C$ as the average probabilities of observing an empty, a singleton and a collision slot, respectively; and they can be expressed as

$$p_E \approx p_0 + \beta p_1, \quad p_S \approx (1 - \beta) p_1 + \alpha p_2, \quad p_C \approx (1 - \alpha) p_2, \tag{1}$$

where $p_0$, $p_1$ or $p_2$ is the probability that a slot is, respectively, no-response, one-response or multiple-response, i.e., $p_0 \approx (1 - \frac{1}{L})^n$, $p_1 \approx \frac{n}{L}(1 - \frac{1}{L})^{n-1}$, and $p_2 \approx 1 - p_0 - p_1$ [10]. It is noted in (1) that the DE is assumed to not happen in multiple-response slots due to signal diversity, which has also been validated in [10] under the assumption of a simple Rayleigh fading channel model. More practical models of the DE, CE, and status of each slot should be investigated in future works. Then, the estimates of $n$, $\alpha$, and $\beta$ denoted by $\hat{n}$, $\hat{\alpha}$ and $\hat{\beta}$, respectively, can be approximately found as

$$\hat{n}, \hat{\alpha}, \hat{\beta} = \arg\max_{n \in \mathbb{N},\; \alpha, \beta \in [0,1]} f(E, S, C \mid n, \alpha, \beta) \approx \arg\max_{n \in \mathbb{N},\; \alpha, \beta \in [0,1]} \frac{(E + S + C)!}{E!\,S!\,C!}\, p_E^{E}\, p_S^{S}\, p_C^{C}, \tag{2}$$

where $f(E, S, C \mid n, \alpha, \beta)$ is the likelihood function of $n$, $\alpha$ and $\beta$, given $E$, $S$ and $C$. It should be also noted in (2) that the likelihood function has been approximately modeled as a multinomial distribution with $L$ repeated independent trials, where each trial has one of three outcomes: empty, singleton, or collision. Although this approximation does not reflect the exact likelihood function [12], it results in estimates as accurate as ML ones, especially when $L$ is large, which has been numerically validated in [2] and [10]. Since there is no guarantee on the convergence of the (pseudo) likelihood function, it could be possible to solve (2) by an exhaustive search algorithm or by finding a deterministic relation between the two probabilities in an assumed fading channel model [10]. Nevertheless, while the former costs a very high computational complexity, the latter is difficult to obtain for practical fading models.

In what follows, we utilize the EM approach to find the estimates of the tag cardinality and the probabilities. EM is an iterative estimation algorithm, which is especially useful when necessary information/data is hidden/missing. In our model, the hidden data is the number of one-response and multiple-response slots observed as empty and singleton ones, denoted by $S_1$ and $C_1$, respectively. In particular, each EM iteration includes two steps, namely E-step and M-step. In the E-step, expected values of the hidden data $S_1$ and $C_1$, which are respectively denoted by $\bar{S}_1$ and $\bar{C}_1$, are estimated. In the M-step, the estimates of $n$, $\alpha$ and $\beta$ are found for given complete (observed and hidden) data, i.e., $E, S, C; \bar{S}_1, \bar{C}_1$. This estimation process is repeated until convergence, which is defined as

$$\epsilon = (\hat{n}^{\{r\}} - \hat{n}^{\{r-1\}})^2 + (\hat{\alpha}^{\{r\}} - \hat{\alpha}^{\{r-1\}})^2 + (\hat{\beta}^{\{r\}} - \hat{\beta}^{\{r-1\}})^2 \le \epsilon_p, \tag{3}$$

where $\epsilon$ is the norm-2 distance between two estimated vectors of $n$, $\alpha$ and $\beta$ at two consecutive iterations. $\hat{n}^{\{r\}}$, $\hat{\alpha}^{\{r\}}$ and $\hat{\beta}^{\{r\}}$ are, respectively, the estimates of $n$, $\alpha$ and $\beta$ at the $r$-th iteration. $\epsilon_p$ is a predefined constant. The two steps are described in detail as follows.

1) E-Step: From (1), $\bar{S}_1$ and $\bar{C}_1$ are easily found as follows

$$\bar{S}_1 = \beta n \left(1 - \frac{1}{L}\right)^{n-1}, \quad \bar{C}_1 = \alpha L \left[1 - \left(1 - \frac{1}{L}\right)^{n} - \frac{n}{L}\left(1 - \frac{1}{L}\right)^{n-1}\right]. \tag{4}$$

It is noted that the values of $n$, $\alpha$ and $\beta$ in (4) are taken from the following M-step, while they can be initially set as $S + 2C$, 0.5, and 0.5, respectively.

2) M-Step: Given the complete data $[E, S, C; \bar{S}_1, \bar{C}_1]$, the likelihood function of $n$, $\alpha$ and $\beta$ denoted by $f(E, S, C, \bar{S}_1, \bar{C}_1 \mid n, \alpha, \beta)$ is written as

$$f(E, S, C, \bar{S}_1, \bar{C}_1 \mid n, \alpha, \beta) = \frac{(E + S + C)!}{(E - \bar{S}_1)!\,\bar{S}_1!\,(S - \bar{C}_1)!\,\bar{C}_1!\,C!} \times p_0^{E - \bar{S}_1}\, p_{01}^{\bar{S}_1}\, p_{10}^{S - \bar{C}_1}\, p_{12}^{\bar{C}_1}\, p_C^{C}, \tag{5}$$
Algorithm 1 EM Estimation Algorithm

Initialization: Generate $L$ for the first request, and observe $E$, $S$ and $C$. Set $n = S + 2C$, $\alpha = 0.5$, $\beta = 0.5$
repeat
    E-step:
        $\bar{S}_1 = \beta n (1 - \frac{1}{L})^{n-1}$
        $\bar{C}_1 = \alpha L [1 - (1 - \frac{1}{L})^{n} - \frac{n}{L}(1 - \frac{1}{L})^{n-1}]$
    M-step:
        $\hat{n} = \arg\max_{n \in \mathbb{N}} g_1(n)$
        $\hat{\alpha} = \frac{\bar{C}_1}{\bar{C}_1 + C}$, $\hat{\beta} = \frac{\bar{S}_1}{S + \bar{S}_1 - \bar{C}_1}$
until Convergence
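Algorithm 1 can be sketched in Python as follows. This is a hedged illustration, not the authors' implementation. Several choices are assumptions of this sketch: the frame size is taken as $L = E + S + C$, the maximization of $g_1(n)$ uses a plain grid search up to a hypothetical bound `n_max` instead of Chen's method [2], the small constant `EPS` and the clipping in `_ratio` are numerical safeguards, and the $\hat{\beta}$ denominator $S + \bar{S}_1 - \bar{C}_1$ is the one obtained by maximizing $g_3(\beta)$.

```python
import math

EPS = 1e-9  # numerical safeguard, an assumption of this sketch


def slot_probs(n, L):
    """p0, p1, p2: probability of a no-, one-, multiple-response slot."""
    q = 1.0 - 1.0 / L
    p0 = q ** n
    p1 = (n / L) * q ** (n - 1)
    return p0, p1, 1.0 - p0 - p1


def e_step(n, alpha, beta, L):
    """Expected hidden counts (S1_bar, C1_bar) of the E-step."""
    _, p1, p2 = slot_probs(n, L)
    return beta * L * p1, alpha * L * p2


def g1(n, E, S, C, s1, c1, L):
    """Part of the complete-data log-likelihood that depends on n only."""
    p0, p1, p2 = slot_probs(n, L)
    return ((E - s1) * math.log(p0) + (s1 + S - c1) * math.log(p1)
            + (c1 + C) * math.log(p2))


def _ratio(num, den):
    """num / den clipped into (0, 1); guards a non-positive denominator."""
    if den <= 0.0:
        return 1.0 - EPS
    return min(max(num / den, EPS), 1.0 - EPS)


def m_step(E, S, C, s1, c1, L, n_max):
    """Grid search for n (n >= 2 keeps p2 > 0), closed forms for alpha, beta."""
    n_hat = max(range(2, n_max + 1), key=lambda n: g1(n, E, S, C, s1, c1, L))
    return n_hat, _ratio(c1, c1 + C), _ratio(s1, S + s1 - c1)


def em_estimate(E, S, C, tol=1e-4, max_iter=100, n_max=None):
    """Iterate E- and M-steps until the squared change drops below tol."""
    L = E + S + C
    n_max = n_max or 8 * L
    n, alpha, beta = S + 2 * C, 0.5, 0.5
    for _ in range(max_iter):
        s1, c1 = e_step(n, alpha, beta, L)
        n_new, a_new, b_new = m_step(E, S, C, s1, c1, L, n_max)
        change = (n_new - n) ** 2 + (a_new - alpha) ** 2 + (b_new - beta) ** 2
        n, alpha, beta = n_new, a_new, b_new
        if change <= tol:
            break
    return n, alpha, beta
```

Calling `em_estimate(E, S, C)` with the observed slot counts of one frame returns the triple $(\hat{n}, \hat{\alpha}, \hat{\beta})$. The grid search is simple but scans every candidate $n$ in each iteration, so a smarter one-dimensional search would be preferable for large frames.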
Fig. 2. Convergence behavior of the proposed algorithm with different number of tags (n), for L = 256, α = 0.3 and β = 0.3.

where $p_{01} = \beta p_1$, $p_{10} = (1 - \beta)p_1$, and $p_{12} = \alpha p_2$. Since $E$, $S$, $C$, $S_1$ and $C_1$ are constants, the estimates can be found by maximizing the function $g(n, \alpha, \beta) = \ln\left(p_0^{E-S_1}\, p_{01}^{S_1}\, p_{10}^{S-C_1}\, p_{12}^{C_1}\, p_C^{C}\right)$, which can be re-written as
$$g(n, \alpha, \beta) = g_1(n) + g_2(\alpha) + g_3(\beta), \qquad (6)$$
where $g_1(n) = (E - S_1)\ln(p_0) + (S_1 + S - C_1)\ln(p_1) + (C_1 + C)\ln(p_2)$, $g_2(\alpha) = C_1 \ln(\alpha) + C \ln(1 - \alpha)$, and $g_3(\beta) = S_1 \ln(\beta) + (S - C_1)\ln(1 - \beta)$. Since n, α and β are independent, the estimates are easily obtained by maximizing $g_1(n)$, $g_2(\alpha)$, and $g_3(\beta)$ with respect to n, α, and β, respectively, i.e.,
$$\hat{n} = \arg\max_{n \in \mathbb{N}} g_1(n), \qquad (7)$$
$$\hat{\alpha} = \frac{C_1}{C_1 + C}, \quad \hat{\beta} = \frac{S_1}{S + S_1 - C_1}. \qquad (8)$$

Fig. 3. RMSE of n, for L = 256, α = 0.3.
It is also noted that (7) can be efficiently solved by Chen's method [2], where $g_1(n)$ is numerically proven to converge. We summarize the EM estimation iterations in Algorithm 1. The initial value of n is selected as a lower bound after observing E, S and C. Also, the initial values of α and β can be arbitrary in (0,1). Nevertheless, since we have no knowledge of α and β a priori, they are both initially set to 0.5.

The Aloha frame size can be selected based on the above estimates to improve the performance of tag identification. In particular, the size is found by maximizing the system efficiency, which is defined as the average number of detected tags per time slot and denoted by η. Here, η is written as
$$\eta = (1 - \beta)\frac{n}{L}\left(1 - \frac{1}{L}\right)^{n-1} + \alpha\left[1 - \left(1 - \frac{1}{L}\right)^{n} - \frac{n}{L}\left(1 - \frac{1}{L}\right)^{n-1}\right]. \qquad (9)$$
By setting the derivative of η in (9) with respect to L to zero (assuming the continuous relaxation of L), we can find the optimal frame size, denoted by $L_{\mathrm{opt}}$, as
$$L_{\mathrm{opt}} = n - \frac{\alpha(n - 1)}{1 - \beta}. \qquad (10)$$
In other words, by substituting (7) and (8) into (10), the frame size could be optimally selected.

III. NUMERICAL RESULTS AND DISCUSSIONS

In this section, we evaluate the performance of the proposed estimation method via computer simulations. The frame size L and the predetermined constant p are set to 256 and $10^{-4}$, respectively. The simulation results are obtained by the Monte Carlo method with the number of simulation runs R = 1000, and are also compared with those of the conventional CMEBE and Bayesian methods.

First, we investigate the typical convergence behavior of the proposed algorithm by plotting the norm-2 distance with different numbers of tags {n = 300, 400} in Fig. 2, for α = 0.3 and β = 0.3. It is seen that the proposed method converges very fast, within only a few iterations in all cases.

Next, Figs. 3 and 4 show the root mean square errors (RMSEs) of n and α (denoted by $e_n$ and $e_\alpha$, respectively) of the CMEBE, the ML-based method in [10], the Bayesian estimate [8], and the proposed method for L = 256, β = 0 or 0.3. Here, the RMSEs are defined as
$$e_n = \sqrt{\frac{1}{R}\sum_{i=1}^{R}(\hat{n}_i - n)^2}, \quad e_\alpha = \sqrt{\frac{1}{R}\sum_{i=1}^{R}(\hat{\alpha}_i - \alpha)^2}, \qquad (11)$$
where $\hat{n}_i$ and $\hat{\alpha}_i$ are, respectively, the i-th estimates of n and α. It should be noted that the estimates in [10] are obtained by maximizing the approximated likelihood function in (2) with
NGUYEN et al.: TAG CARDINALITY ESTIMATION USING EXPECTATION-MAXIMIZATION IN ALOHA-BASED RFID SYSTEMS WITH CE AND DE 639
an exhaustive search algorithm over all possible values of n, α, and β. Therefore, [10] can be approximately considered as a lower bound of all considered methods. We can see that while our method is comparable with CMEBE and Bayesian when β = 0, it significantly outperforms them in terms of the estimation accuracy when β > 0. This is because only the CE has been taken into account in the CMEBE and Bayesian methods. Nevertheless, the performance of EM-based algorithms greatly depends on the initial values of the estimated parameters. Indeed, Fig. 4 shows the degraded performance of the proposed method for small values of α (α < 0.15). This fact is also observed for α = 0.3, 0.4, where the performance of the proposed method is even better than that of [10].

We now provide the worst-case per-iteration complexity analysis of Algorithm 1 and compare it to that of [10]. Recall that the per-iteration complexity of the method presented in [10] is $O(n^3)$. The complexity of Algorithm 1 is mostly due to solving (7) (i.e., step 7 of Algorithm 1), which requires the complexity of $O(n)$. That is to say, given the same convergence condition as in (3), the proposed algorithm requires significantly lower complexity compared to that of [10], especially when n is large.

Finally, we plot the total number of slots used to detect n = 400 tags with respect to different values of α (β is set to 0.3) or β (α is set to 0.3) in Fig. 5 to see the impact of estimation methods on identification performance. Here, the frame sizes of the methods are optimally determined by (10), in which β is set to 0 for CMEBE and Bayesian. We can see that the number of consumed time slots is proportional to α for given β, while inversely proportional to β for given α. The reason is that more tags are detected in multiple-response slots, but more tags are also hidden in one-response slots. Nevertheless, in both cases, the proposed method takes a smaller number of time slots than conventional ones, especially when the DE and (or) CE are more significant. This is because both the DE and CE have been considered in our estimation scheme thanks to the EM approach.

Fig. 4. RMSE of α, for L = 256.

Fig. 5. The total number of slots used to detect n = 400 tags with respect to the DE (dash lines) or CE (solid lines) probability.

IV. CONCLUSION

This letter investigated the issue of tag cardinality estimation with the FSA protocol in RFID systems considering the impacts of both CE and DE. The EM approach was utilized to iteratively estimate the tag cardinality and the CE and DE probabilities. Computer simulations confirmed that the proposed method converged after only a few iterations and provided more accurate estimates than those of the conventional methods. The proposed method was also shown to improve the efficiency of the identification process.

REFERENCES

[1] W. Gong, J. Liu, K. Liu, and Y. Liu, “Toward more rigorous and practical cardinality estimation for large-scale RFID systems,” IEEE/ACM Trans. Netw., vol. 25, no. 3, pp. 1347–1358, Jun. 2017.
[2] W.-T. Chen, “An accurate tag estimate method for improving the performance of an RFID anti-collision algorithm based on dynamic frame length ALOHA,” IEEE Trans. Autom. Sci. Eng., vol. 6, no. 1, pp. 9–15, Jan. 2009.
[3] EPC Radio-Frequency Identity Protocols Class-1 Generation-2 UHF RFID Protocol for Communications at 860 MHz–960 MHz Version 1.2.0. Accessed: May 8, 2016. [Online]. Available: http://www.gs1.org/sites/default/files/docs/epc/uhfc1g2_1_2_0-standard-20080511.pdf
[4] C. Qian, H. Ngan, Y. Liu, and L. M. Ni, “Cardinality estimation for large-scale RFID systems,” IEEE Trans. Parallel Distrib. Syst., vol. 22, no. 9, pp. 1441–1454, Sep. 2011.
[5] H. Salah, H. A. Ahmed, J. Robert, and A. Heuberger, “A time and capture probability aware closed form frame slotted ALOHA frame length optimization,” IEEE Commun. Lett., vol. 19, no. 11, pp. 2009–2012, Nov. 2015.
[6] C. T. Nguyen, A.-T. H. Bui, V.-D. Nguyen, and A. T. Pham, “Modified tree-based identification protocols for solving hidden-tag problem in RFID systems over fading channels,” IET Commun., vol. 11, no. 7, pp. 1132–1142, May 2017.
[7] B. Li and J. Wang, “Efficient anti-collision algorithm utilizing the capture effect for ISO 18000-6C RFID protocol,” IEEE Commun. Lett., vol. 15, no. 3, pp. 352–354, Mar. 2011.
[8] H. Wu, Y. Wang, and Y. Zeng, “Capture-aware Bayesian RFID tag estimate for large-scale identification,” IEEE/CAA J. Automatica Sinica, vol. 5, no. 1, pp. 119–127, Jan. 2018.
[9] H. A. Ahmed, H. Salah, J. Robert, and A. Heuberger, “A closed-form solution for ALOHA frame length optimizing multiple collision recovery coefficients’ reading efficiency,” IEEE Syst. J., vol. 12, no. 1, pp. 1047–1050, Mar. 2018.
[10] C. T. Nguyen, K. Hayashi, M. Kaneko, and H. Sakai, “Maximum likelihood approach for RFID tag cardinality estimation under capture effect and detection errors,” IEICE Trans. Commun., vol. E96-B, no. 5, pp. 1122–1129, May 2013.
[11] G. J. McLachlan and T. Krishnan, The EM Algorithm and Extensions, 2nd ed. Hoboken, NJ, USA: Wiley, 2008.
[12] E. Vahedi, V. W. S. Wong, I. F. Blake, and R. K. Ward, “Probabilistic analysis and correction of Chen’s tag estimate method,” IEEE Trans. Autom. Sci. Eng., vol. 8, no. 3, pp. 659–663, Jul. 2011.
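
As a numerical check of the frame-size rule in (9) and (10), the following Python sketch evaluates the system efficiency η over integer frame sizes and compares the best integer L with the closed-form $L_{\mathrm{opt}}$. The function names and the example values n = 400, α = 0.3, β = 0.3 are assumptions chosen for illustration (they mirror the setting of Fig. 5); the grid search is only a sanity check and is not part of the proposed method.

```python
def efficiency(n, L, alpha, beta):
    """System efficiency eta in (9): average number of detected tags per slot."""
    p0 = (1 - 1 / L) ** n
    p1 = (n / L) * (1 - 1 / L) ** (n - 1)
    p2 = 1 - p0 - p1
    return (1 - beta) * p1 + alpha * p2

def optimal_frame_size(n, alpha, beta):
    """Closed-form optimum (10) under the continuous relaxation of L."""
    return n - alpha * (n - 1) / (1 - beta)

n, alpha, beta = 400, 0.3, 0.3                 # assumed example values
L_opt = optimal_frame_size(n, alpha, beta)     # close to 229 for these values
# Brute-force search over integer frame sizes as an independent check.
L_grid = max(range(2, 2001), key=lambda L: efficiency(n, L, alpha, beta))
print(L_opt, L_grid)
```

In practice n, α and β would be replaced by the estimates from (7) and (8), and the result rounded to a frame size the reader supports.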
Pilot Allocation and Computationally Efficient Non-Iterative Estimation of Phase Noise in OFDM

Ville Syrjälä, Toni Levanen, Tero Ihalainen, and Mikko Valkama

Abstract—This letter proposes a pilot subcarrier allocation approach for orthogonal frequency division multiplexing systems, which enables construction of symbol-wise phase-noise (PN) estimates with high efficiency and low overhead in a non-iterative manner. The complexity and performance of the overall PN suppression algorithm together with the proposed pilot allocation approach are evaluated in 5G single-user-multiple-input multiple-output (MIMO) and multi-user-MIMO radio links at 28 GHz carrier frequency, showing clear complexity-performance benefits against a state-of-the-art reference algorithm.

Index Terms—OFDM, reference signals, MIMO, phase noise, ICI, mitigation, computational complexity, cmW, mmW, 5G.

Manuscript received October 29, 2018; revised December 28, 2018; accepted December 28, 2018. Date of publication January 1, 2019; date of current version April 9, 2019. This work was supported in part by the Finnish Cultural Foundation, in part by the Academy of Finland under Grant 276378, Grant 304147, and Grant 288670, in part by the Nvidia Corporation, and in part by the Nokia Corporation. The associate editor coordinating the review of this paper and approving it for publication was R. C. de Lamare. (Corresponding author: Ville Syrjälä.)
V. Syrjälä, T. Levanen, and M. Valkama are with the Laboratory of Electronics and Communications Engineering, Tampere University of Technology, 33101 Tampere, Finland (e-mail: ville.syrjala@tut.fi).
T. Ihalainen is with the Wireless Advanced Technologies Research Department, Nokia Bell Labs, 33101 Tampere, Finland.
Digital Object Identifier 10.1109/LWC.2018.2890665

I. INTRODUCTION

Phase noise (PN) has serious inband effects in orthogonal frequency division multiplexing (OFDM) systems [1], which can be divided into two parts. The first is called common phase error (CPE), which corresponds to a PN-dependent common rotation at every subcarrier within an OFDM symbol and is typically mitigated by using pilot subcarriers [2]. The second effect, stemming from the spread of the energy of each subcarrier on top of the other subcarriers, is called inter-carrier interference (ICI), which in turn is much more complex in nature [1], [2].

In the literature, a considerable amount of research has focused on ICI mitigation, e.g., [2]–[7]. Despite this, compact transceiver implementations with limited computational resources still rely on good quality oscillators instead of actually mitigating the ICI with signal processing. However, cheap oscillators working at high frequencies require that the ICI is handled with signal processing, because the PN problem gets substantially more challenging at centimeter wave (cmW) and millimeter wave (mmW) frequencies, where, e.g., the emerging new 5G systems will operate [8]. OFDM systems using high-order modulation and coding schemes (MCS) are very sensitive to PN and thus require advanced PN compensation solutions. In future 5G networks, the processing delay requirements will also become stricter, limiting, e.g., the use of iterative PN compensation approaches.

In this letter, we introduce a pilot allocation strategy that enables non-iterative, computationally effective, but accurate estimation and suppression of PN induced CPE and ICI, with low latency and low overhead. The accurate estimation and suppression of PN enables the use of higher-order MCSs, allowing significantly improved spectral efficiency, particularly at higher cmW or mmW bands. No prior symbol decisions or information about consecutive OFDM symbols are needed, resulting in low latency and reduced buffering requirements. The block-wise pilot signal design, with the defined estimation and compensation algorithms, can be applied to single-input single-output (SISO), single-user (SU) multiple-input multiple-output (MIMO), and multi-user (MU)-MIMO OFDM links.

II. OFDM RADIO LINK MODEL WITH PHASE NOISE

Assuming that the channel delay spread is shorter than the utilized cyclic prefix length, the demodulated received OFDM signal corrupted by receiver side PN can be written as [2]
$$R_k = X_k H_k J_0 + \sum_{l=0,\, l \neq k}^{N-1} X_l H_l J_{k-l} + Z_k, \quad k \in \{0, \ldots, N-1\}, \qquad (1)$$
where k is the subcarrier index, $X_k$ is the subcarrier transmit symbol, $H_k$ is the channel frequency response, $Z_k$ is the discrete Fourier transformed white Gaussian noise, and $J_l$ is the discrete Fourier transformed PN complex exponential expressed as
$$J_l = \frac{1}{N}\sum_{n=0}^{N-1} e^{j\phi_n} e^{-j2\pi nl/N}. \qquad (2)$$
Here $\phi_n$ is the sampled time-varying PN within the considered OFDM symbol. Notice that the model is written for an arbitrary OFDM symbol, and thus the OFDM symbol index has been omitted for simplicity. In the first term of (1), the effect of the CPE is visible as a multiplication by $J_0$, while the second additive term corresponds to the effect of the ICI [2]. As shown in [2], the model in (1)-(2) describes a PN-impaired OFDM radio link very accurately also in cases where both the transmitter and receiver side PN sources are significant.

III. PHASE NOISE MITIGATION

A. Reference Frequency-Domain Suppression Strategy

If we rewrite (1) into a form where only the 2u + 1 centermost frequency bins of the PN complex exponential are assumed significant, i.e., we exploit the strong lowpass nature of the PN process, we arrive at the form [2]
$$R_k = \sum_{l=-u}^{u} X_{k-l} H_{k-l} J_l + Q_k = \sum_{l=-u}^{u} Y_{k-l} J_l + Q_k. \qquad (3)$$
Here, $Q_k$ includes ICI from the non-significant frequency bins of the PN complex exponential and the additive noise. Then,
SYRJÄLÄ et al.: PILOT ALLOCATION AND COMPUTATIONALLY EFFICIENT NON-ITERATIVE ESTIMATION OF PN IN OFDM 641
for a set of subcarriers $k \in \{l_1, l_2, \ldots, l_p\}$, $p \geq 2u + 1$, we can write a set of linear equations as [2]
$$\begin{bmatrix} R_{l_1} \\ R_{l_2} \\ \vdots \\ R_{l_p} \end{bmatrix} = \begin{bmatrix} Y_{l_1+u} & Y_{l_1+u-1} & \cdots & Y_{l_1-u} \\ Y_{l_2+u} & \ddots & \ddots & Y_{l_2-u} \\ \vdots & \ddots & \ddots & \vdots \\ Y_{l_p+u} & \cdots & \cdots & Y_{l_p-u} \end{bmatrix} \begin{bmatrix} J_{-u} \\ J_{-u+1} \\ \vdots \\ J_{u} \end{bmatrix} + \begin{bmatrix} Q_{l_1} \\ Q_{l_2} \\ \vdots \\ Q_{l_p} \end{bmatrix} \Leftrightarrow \mathbf{R}_p = \mathbf{Y}_{u,p}\mathbf{J}_u + \mathbf{Q}_p. \qquad (4)$$
Now, if we assume that the values of $\mathbf{Y}_{u,p}$ are known, we can obtain the estimate of $\mathbf{J}_u$ with the least squares (LS) approach as [3]
$$\hat{\mathbf{J}}_u = [\hat{J}_{-u}, \ldots, \hat{J}_u]^T = (\mathbf{Y}_{u,p}^H \mathbf{Y}_{u,p})^{-1}\mathbf{Y}_{u,p}^H \mathbf{R}_p, \qquad (5)$$
where $(\cdot)^T$ and $(\cdot)^H$ denote the transpose and conjugate transpose, respectively. The LS solution is computationally relatively simple, while the corresponding minimum mean square error (MMSE) solution is also formulated in [2]. Algorithms, e.g., in [2]–[4] exploit (4) in a decision-feedback manner for PN estimation. The suppression of PN is then done by deconvolution [2], and the signal after PN suppression can be written as
$$\hat{Y}_k = \sum_{l=-u}^{u} R_{k-l}\hat{J}^{*}_{-l}. \qquad (6)$$
Here $x^*$ denotes the complex conjugate of x, and the deconvolution needs to be done only for the active subcarriers.

B. Proposed Method

Here pilot subcarrier allocation schemes are proposed, which allow us to formulate the non-iterative PN estimation and suppression algorithm with very low computational complexity. After the channel estimation and equalization, both the CPE and ICI are estimated simultaneously without data symbol detection or iterations, and without any information of the consecutive OFDM symbols. This results in very low processing latency, complexity and buffering requirements. After the estimation, the PN is suppressed with deconvolution, as formulated in (6). The flow diagram of the algorithm is depicted in Fig. 1.

Fig. 1. An example pilot subcarrier block and the groups of 2u + 1 pilots to solve (7), indicated by circles and arrows, for the case of u = 1. We have pilot subcarrier block of size b = 7, and therefore a system of b − 2u = 5 equations in (7). Also, the basic flow diagram of the algorithm is depicted.

The proposed pilot subcarrier structure is based on the set of linear equations in (4). To solve the LS estimate of the 2u + 1 frequency bins of PN in (5), we need to know at least 2u + 1 different values of $Y_k$, and a neighborhood of u values on both sides of each of these. The key observation is that the neighborhoods can be overlapping, so we can reuse most of the values in the adjacent neighborhoods in the estimation process (see Fig. 1). To carry out the PN estimation based on the above idea, the smallest required block size is only 4u + 1 adjacent values of $Y_k$. If the size of the subcarrier block is $b \geq 4u + 1$ and the index of the first subcarrier in the block is n, then the set of linear equations for the channel equalized received signal reads
$$\begin{bmatrix} R^{eq}_{n+u} \\ R^{eq}_{n+u+1} \\ \vdots \\ R^{eq}_{n+b-u-1} \end{bmatrix} = \begin{bmatrix} X_{n+2u} & X_{n+2u-1} & \cdots & X_{n} \\ X_{n+2u+1} & \ddots & \ddots & X_{n+1} \\ \vdots & \ddots & \ddots & \vdots \\ X_{n+b-1} & \cdots & \cdots & X_{n+b-2u-1} \end{bmatrix} \begin{bmatrix} J_{-u} \\ J_{-u+1} \\ \vdots \\ J_{u} \end{bmatrix} + \begin{bmatrix} Q^{eq}_{n+u} \\ Q^{eq}_{n+u+1} \\ \vdots \\ Q^{eq}_{n+b-u-1} \end{bmatrix} \Leftrightarrow \mathbf{R}^{eq}_{n,u,b} = \mathbf{X}_{n,u,b}\mathbf{J}_u + \mathbf{Q}^{eq}_{n,u,b}. \qquad (7)$$
Here, $Q^{eq}_n$ contains residual noise and ICI after the channel equalization. Then, the PN can be estimated with LS estimation from (7) as $\hat{\mathbf{J}}_u = (\mathbf{X}_{n,u,b}^H \mathbf{X}_{n,u,b})^{-1}\mathbf{X}_{n,u,b}^H \mathbf{R}^{eq}_{n,u,b}$. To obtain the required block of $X_k$ values in (7) without decision feedback iterations, we propose to use a block of contiguous pilot subcarriers, as illustrated in Fig. 1. Overall, the proposed pilot block structure has a reasonable overhead to facilitate highly efficient symbol-by-symbol CPE and ICI estimation in modern communications systems utilizing thousands of subcarriers. Since most of the PN energy is at the low frequencies, even very small values of u, e.g., u = 1 or u = 2, and thus a small block size, give significant performance gains [2], [3]. Notice, however, that if we increase the pilot subcarrier block size from the minimum of 4u + 1, the estimation accuracy can be improved at the cost of increased pilot overhead, as will be shown in Section V.

Utilization of a single block of pilot subcarriers is the most spectrally efficient solution. However, frequency diversity can be attained by using multiple pilot subcarrier blocks. The size of each block should always be at least 2u + 1 subcarriers. If we want to divide the needed subcarriers into a > 1 separate groups of adjacent pilot subcarriers, we need at least 2au + 2u + 1 pilot subcarriers to carry out the PN estimation. Every additional separate pilot block always requires 2u more pilots for the PN estimation.

The rest of this letter focuses on the case of a single block of pilot subcarriers for minimal overhead. The overhead and its impact on the throughput are further discussed in Section V. For clarity, we also state that separate channel estimation pilots are assumed.

C. Complexity Analysis of Proposed Algorithm

The complexity of the proposed algorithm is very low. No separate CPE compensation is needed, as it is compensated by the proposed algorithm simultaneously with the ICI. The complexity of the proposed algorithm is dominated by the complexity of the LS estimation and the complexity of PN mitigation by deconvolution. For the deconvolution, the complexity is 2u + 1 complex multiplications per data-carrying active subcarrier for which the PN compensation is applied. For the LS estimation, the complexity is
$$8u^2p + 10up + 3p + 8u^3 + 12u^2 + 6u + 1 \qquad (8)$$
complex multiplications. As an example, if u = 1 and we have p = 9 pilots, and therefore a set of 7 linear equations to solve, we have 216 complex multiplications per OFDM symbol. With the proposed algorithm, for OFDM symbols with a high number of active subcarriers, the deconvolution dominates from the complexity point of view, but for smaller amounts of active scheduled data subcarriers the complexity of the LS estimation becomes dominant. The state-of-the-art reference technique of [2] has the same amount of multiplications
in terms of the parameters in (8), but in practice comparable estimation performance is obtained with 3 iterations, p of 16 and u of 3. Already a single iteration results in 2023 complex multiplications per OFDM symbol, and if iterated further, the complexity gets multiplied. These values of p and u are used also in the performance comparisons provided in Section V. Notice that the reference algorithm from [2] also requires symbol decisions, which significantly increases the complexity and the latency, especially with higher-order modulations.

Fig. 2. The power spectral density (PSD) of the PN of the used oscillators.
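
To make the estimation step of Section III-B and the multiplication count in (8) concrete, the sketch below builds the (b − 2u) × (2u + 1) pilot matrix of (7) from a contiguous pilot block, solves the LS problem through the normal equations, and evaluates (8) for the two parameter sets quoted above. It is a noise-free toy example in plain Python, not the authors' simulator: the helper names, the random unit-modulus pilots, the chosen PN components, and the small Gaussian-elimination solver are assumptions made for illustration, and the deconvolution in (6) is not covered.

```python
import cmath
import random

def ls_complex_mults(u, p):
    """Complex multiplications of the LS estimation, as given in (8)."""
    return 8*u*u*p + 10*u*p + 3*p + 8*u**3 + 12*u*u + 6*u + 1

def pilot_matrix(x, u):
    """Rows [X_{k+u}, ..., X_{k-u}] for k = u, ..., b-u-1 (block-relative indices)."""
    b = len(x)
    return [[x[k + u - j] for j in range(2 * u + 1)] for k in range(u, b - u)]

def solve(a, y):
    """Gaussian elimination with partial pivoting for a small complex system."""
    m = len(y)
    aug = [row[:] + [y[i]] for i, row in enumerate(a)]
    for c in range(m):
        piv = max(range(c, m), key=lambda r: abs(aug[r][c]))
        aug[c], aug[piv] = aug[piv], aug[c]
        for r in range(c + 1, m):
            f = aug[r][c] / aug[c][c]
            for k in range(c, m + 1):
                aug[r][k] -= f * aug[c][k]
    sol = [0j] * m
    for r in reversed(range(m)):
        acc = aug[r][m] - sum(aug[r][k] * sol[k] for k in range(r + 1, m))
        sol[r] = acc / aug[r][r]
    return sol

def ls_estimate(x, r, u):
    """LS estimate (X^H X)^{-1} X^H r of the 2u+1 PN frequency bins."""
    a = pilot_matrix(x, u)
    cols = range(2 * u + 1)
    gram = [[sum(row[i].conjugate() * row[j] for row in a) for j in cols]
            for i in cols]
    rhs = [sum(row[i].conjugate() * ri for row, ri in zip(a, r)) for i in cols]
    return solve(gram, rhs)

# Toy example (assumed values): u = 1, block of b = 12 known pilots, no noise.
rng = random.Random(0)
u, b = 1, 12
pilots = [cmath.exp(2j * cmath.pi * rng.random()) for _ in range(b)]
j_true = [0.05 - 0.02j, 0.98 + 0.10j, -0.04 + 0.03j]   # J_{-1}, J_0, J_1
received = [sum(xv * jv for xv, jv in zip(row, j_true))
            for row in pilot_matrix(pilots, u)]
j_hat = ls_estimate(pilots, received, u)

print(ls_complex_mults(1, 9), ls_complex_mults(3, 16))  # values quoted in the text
print(j_hat)
```

Because the pilot block is known at the receiver, the matrix in (7) can be formed without any symbol decisions, which is what removes the need for iterations in this toy setting.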
IV. PHASE NOISE MITIGATION STRATEGY IN MIMO-OFDM

The proposed algorithm can be used also in MIMO links. It is reasonable to assume that in small mobile transceivers a common oscillator is shared between the antennas, so the PN realization is always the same for each antenna port. In SU-MIMO, when the same pilot subcarrier block for PN estimation is sent from all the antenna ports of the transmitter, the received pilot blocks can be averaged across the received streams after the channel equalization, and the PN estimation can then be carried out. This is beneficial because of the relatively low complexity compared to stream-by-stream estimation of the PN and the reduced noise variance. It is also possible to transmit the PN estimation pilots only from a single antenna port to reduce the pilot overhead. In this case, the averaging gain is lost and the PN estimation accuracy is defined by the inter-stream-interference level in the high SNR regime.

In multi-user uplink (UL), each user equipment (UE) requires its own PN pilot block, because each UE has an independent PN realization. Averaging the PN pilots over all streams from one UE is possible, assuming that one oscillator is shared by all the transmit antenna ports in the UE, and that each UE transmits the PN pilot block in all streams. Sharing the same time-frequency resources by each UE in MU-MIMO is beneficial, because the interfering signal from each UE is known. If different resources are used, the PN pilot block of the desired UE is interfered with by random data from other UEs, unless the corresponding subcarriers are muted by other UEs.

V. SIMULATION RESULTS AND ANALYSIS

A. Simulation Scenario and Simulator Description

The studied scenario follows closely the 3GPP 5G New Radio (NR) physical layer standardization in [9] and [10], and thus represents a very timely example. In the simulator, an OFDM signal is generated with fast Fourier transform (FFT) length of 2048 and with 1284 active subcarriers. For the active subcarriers, 64QAM is used as a representative example of fairly high modulation order. The center frequency is 28 GHz, and the subcarrier spacing is 60 kHz, while the total carrier bandwidth is 80 MHz. We use a cyclic prefix of 144 samples (at baseband sampling rate of 122.88 MHz). We incorporate both the transmitter and receiver PNs that are generated by a charge-pump PLL oscillator [11] tuned to give similar power spectral density as used in 5G standardization evaluations [12], with PSD shown in Fig. 2. The radio channel modeling is based on the clustered delay line (CDL) C and D multipath channel models [13]. The non-line-of-sight (NLOS) CDL-C and line-of-sight (LOS) CDL-D channels have root-mean-squared (RMS) delay spreads of 300 ns and 50 ns, respectively. In the CDL-D channel, the K-factor is 9 dB. These are selected since they correspond to evaluation assumptions for cmW radio link performance in 5G standardization [9]. CDL-D 50 ns and CDL-C 300 ns offer typical small cell LOS and NLOS evaluation cases, respectively [9]. Per polarization, an 8 × 8 antenna array is assumed at the BS and a 2 × 2 array at the UE [9]. Different polarizations are used for different spatial MIMO streams. A 5G NR like subframe structure consisting of 14 OFDM symbols is assumed, where the first two symbols are reserved for control. From the remaining 12 symbols, three equally spaced symbols dedicated for demodulation reference signals are used for channel estimation and the rest carry user data and PN estimation pilots. In each OFDM symbol carrying user data, a contiguous block of b pilot subcarriers (proposed method) or 24 scattered, equidistant pilot subcarriers (reference methods) are allocated for PN estimation. The actual channel estimation is based on the well-known MMSE solution. The channel estimates are interpolated in time and frequency by using a Wiener interpolator, after which a subcarrier-wise MMSE equalizer is applied on the data subcarriers, prior to PN mitigation. The radio link performance is evaluated by block-error ratio (BLER) and throughput. A BLER target of 10% is assumed, which is common in mobile radio networks using hybrid automatic repeat request (HARQ) on top of the channel coding.

The studied cases are: 1) ‘No PN’ case, where no PN nor pilots are added, 2) ‘CPE m.’ case, where only CPE is mitigated [2], 3) ‘Pet’ case, where CPE and ICI are mitigated in the way proposed in [2], and 4) ‘u;b’ cases, where the proposed algorithm is used with parameterization u and b. The ‘Pet’ algorithm estimates 3 frequency components of the PN from both sides of the direct current (DC) bin, with 3 iterations. 112 subcarrier symbol decisions are used per iteration to construct the PN estimate. These decisions are made so that the most reliable subcarriers are used based on the estimated channel amplitude response. These parameters enable reasonable performance, but the complexity is still significantly higher than in the proposed algorithm.

For both SU-MIMO and MU-MIMO cases, simulations are carried out with a MATLAB based 5G NR specifications compliant tool. We consider 2 × 2 SU-MIMO DL with spatial multiplexing of 2 streams. The SU-MIMO results are evaluated with code rates of 3/4 and 4/5 for 64QAM, and a turbo codec implementation following [14]. In the decoder, 8 max-log-MAP decoding iterations are used. In the DL SU-MIMO scenario, the transmitted pilot blocks are averaged across the spatial streams. In MU-MIMO, we consider the UL case with similar parameters as in the SU-MIMO case, but now there are 2 UEs each utilizing an individual spatial stream, so we cannot do any averaging as described in Section IV. We assume 3 km/h UE mobility for all the cases.

B. Results and Analysis

In Fig. 3, the 2 × 2 SU-MIMO DL case is considered for 64QAM with code rate 3/4 in CDL-C (right set of curves) and CDL-D (left set of curves) channels. With the proposed algorithm, only the DC bin and either one (u = 1) or two (u = 2) frequency components from both sides of the
DC bin are estimated, and a single pilot block with varying amounts of subcarriers is used. In the CDL-D case, some improvement over the CPE-only mitigation (‘CPE m.’) can be achieved with the ‘Pet’ reference algorithm of [2] in the higher SNR regime. With the proposed algorithm in the 2 × 2 SU-MIMO case with 1;12, 1;24, and 2;24 configurations, the difference to the ‘no PN’ case is only around 1.7 dB, 1.2 dB and 0.7 dB, respectively, for the BLER target of 10%. Clearly in the 2;12 case, the used 12 subcarrier block is too small for the u = 2 case, as it gives only performance similar to the reference algorithm. The CDL-C case is clearly more demanding for all the algorithms due to the more frequency selective channel. Compared to the ‘CPE m.’ case, the considered 1;12 configuration offers performance improvement at the higher SNR range. Furthermore, the 1;24 configuration provides performance already within 1 dB of the ideal ‘no PN’ case at the 10% BLER target. With u = 2, the performance is slightly degraded and a larger pilot block size should be applied to benefit from larger u.

In the left two columns of Table I, throughput results are given for 2 × 2 SU-MIMO and 2 × 2 MU-MIMO cases at 25.5-dB SNR in the CDL-D channel. Notice that MU-MIMO throughput is evaluated as the aggregated throughput of the two MU-MIMO users. The performance of the proposed algorithm is very good, in general, also in MU-MIMO, while the computational complexity stays very low. Furthermore, the overhead is very small, e.g., with a block of 12 pilots it is only 0.93%, and even with a larger block size of 24, the overhead is only 1.9%. Notice that in Table I, the pilot overhead is taken into account.

In the right two columns of Table I, throughputs for the SU-MIMO case at 26.5-dB SNR for 64QAM with code rates 3/4 and 4/5 are given in the CDL-D channel. In the PN free reference case, we can observe a clear improvement in the throughput with higher MCS. On the contrary, with CPE-only mitigation, or the 1;12 and 1;24 ICI compensation schemes, the throughput is not improved due to increased sensitivity to PN and insufficient mitigation performance. With the 2;24 ICI mitigation scheme, the throughput performance can be improved and it is very close to the PN free reference case, thus enabling the use of higher MCSs under PN. We thus notice that increasing the pilot overhead leads to improved throughput due to better PN estimation and compensation capabilities.

Fig. 3. Block-Error Rate as a function of signal-to-noise ratio in 2 × 2 SU-MIMO DL case for 64QAM (3/4) in CDL-C and CDL-D channels with 3 km/h mobility. In the legend entries for the proposed algorithm, u;b defines the number of estimated ICI components per side and the pilot block length.

TABLE I. Throughputs [Mbps] in CDL-D channel with 3 km/h mobility. Left-hand side columns at 25.5 dB and 64QAM (3/4) compare SU-MIMO and MU-MIMO cases, and right-hand side columns at 26.5 dB compare different code rates in SU-MIMO.

VI. CONCLUSION

When moving towards cmW and mmW frequencies in the emerging OFDM-based 5G and beyond systems, both the CPE and ICI induced by the oscillator PN need to be considered and potentially suppressed through digital signal processing. In this letter, to facilitate symbol-by-symbol PN estimation and CPE+ICI compensation, a block type pilot allocation was proposed. Based on the proposed pilot allocation, a low-latency and computationally very efficient, yet highly accurate PN mitigation algorithm was formulated. It was shown through radio link simulations at 28 GHz carrier frequency that despite its simplicity, the proposed algorithm outperforms the existing state-of-the-art and can suppress the PN effects near to the PN free reference case in base station and mobile transceivers. The proposed scheme enables the use of higher MCSs, and was shown to be able to significantly improve the overall radio link throughput despite the involved pilot overhead.

REFERENCES

[1] T. C. W. Schenk, R. W. Van Der Hofstad, E. R. Fledderus, and P. F. M. Smulders, “Distribution of the ICI term in phase noise impaired OFDM systems,” IEEE Trans. Wireless Commun., vol. 6, no. 4, pp. 1488–1500, Apr. 2007.
[2] D. Petrovic, W. Rave, and G. Fettweis, “Effects of phase noise on OFDM systems with and without PLL: Characterization and compensation,” IEEE Trans. Commun., vol. 55, no. 8, pp. 1607–1616, Aug. 2007.
[3] V. Syrjälä and M. Valkama, “Analysis and mitigation of phase noise and sampling jitter in OFDM radio receivers,” Int. J. Microw. Wireless Technol., vol. 2, no. 2, pp. 193–202, Apr. 2010.
[4] N. N. Tchamov, et al., “Enhanced algorithm for digital mitigation of ICI due to phase noise in OFDM receivers,” IEEE Wireless Commun. Lett., vol. 2, no. 1, pp. 6–9, Feb. 2013.
[5] V. Syrjälä and M. Valkama, “Iterative receiver signal processing for joint mitigation of transmitter and receiver phase noise in OFDM-based cognitive radio link,” in Proc. Int. Conf. Cogn. Radio Orient. Wireless Netw., Stockholm, Sweden, Jun. 2012, doi: 10.4108/icst.crowncom.2012.248513.
[6] P. Mathecken, T. Riihonen, S. Werner, and R. Wichman, “Phase noise estimation in OFDM: Utilizing its associated spectral geometry,” IEEE Trans. Signal Process., vol. 64, no. 8, pp. 1999–2012, Apr. 2016.
[7] P. Rabiei, W. Namgoong, and N. Al-Dhahir, “A non-iterative technique for phase noise ICI mitigation in packet-based OFDM systems,” IEEE Trans. Signal Process., vol. 58, no. 11, pp. 5945–5950, Nov. 2010.
[8] J. Gozalvez, “5G worldwide developments,” IEEE Veh. Technol. Mag., vol. 12, no. 1, pp. 4–11, Mar. 2017.
[9] “Study on new radio access technology; physical layer aspects, v14.2.0,” 3GPP, Sophia Antipolis, France, Rep. TR 38.802, Sep. 2017.
[10] NR; Physical Layer Procedures for Data, V15.1.0, 3GPP Standard TS 38.214, Apr. 2018.
[11] N. N. Tchamov, Circuit- and System-Level Design of OFDM Receivers in the Presence of Phase Noise, D.Sc. dissertation, Dept. Electron. Commun. Eng., Tampere University of Technology, Tampere, Finland, 2013.
[12] “Study on new radio access technology; RF and co-existence aspects, v14.2.0,” 3GPP, Sophia Antipolis, France, Rep. TR 38.803, Sep. 2017.
[13] “Study on channel model for frequencies from 0.5 to 100 GHz, v15.0.0,” 3GPP, Sophia Antipolis, France, Rep. TR 38.901, Jun. 2018.
[14] E-UTRA Multiplexing and Channel Coding, V14.5.1, 3GPP Standard TS 36.212, Jan. 2018.
Energy-Perceptive MAC for Wireless Power and Information Transfer

Youngil Cho, Yunmin Kim, and Tae-Jin Lee, Member, IEEE
Abstract—Energy harvesting technology using radio frequency signals in a wireless local area network is a promising way to provide energy to devices in a wireless network. In this letter, we propose a novel distributed medium access control (MAC) protocol for both wireless power transfer and wireless information transfer based on carrier sense multiple access with enhanced collision avoidance. In the proposed protocol, the power beacon adjusts the energy transmission opportunity according to the stations’ energy level. The proposed MAC protocol improves data throughput and effectively provides energy to stations with minimal energy loss.
Index Terms—Wireless power transfer, energy harvesting, self-scheduling, distributed MAC protocol.

Manuscript received October 30, 2018; revised December 16, 2018; accepted December 27, 2018. Date of publication January 9, 2019; date of current version April 9, 2019. This work was supported by the National Research Foundation of Korea grant funded by the Korean Government under Grant 2014R1A5A1011478. The associate editor coordinating the review of this paper and approving it for publication was S. De. (Corresponding author: Tae-Jin Lee.)
The authors are with the College of Information and Communication Engineering, Sungkyunkwan University, Suwon 16419, South Korea (e-mail: tjlee@skku.edu).
Digital Object Identifier 10.1109/LWC.2019.2891644

I. INTRODUCTION

IN THE era of Internet of Things (IoT), Wireless Power Transfer (WPT) technology is expected to provide wireless devices with long-lasting power without concern about batteries [1], [2]. Radio Frequency (RF) signals from ambient devices or Power Beacons (PBs) [3]–[5] can be used as energy sources for wireless devices to harvest energy. In a Wireless Local Area Network (WLAN), information and energy signals may share the same RF channel, and interference between them can deteriorate information and energy transfer performance, resulting in low throughput and long latency [6], [7].
To reduce the effect of collisions in Wireless Information Transfer (WIT), IEEE 802.11 adopts the Distributed Coordination Function (DCF) algorithm based on Carrier Sense Multiple Access with Collision Avoidance (CSMA/CA) for the Medium Access Control (MAC) protocol [8]. With the deterministic backoff strategy, Carrier Sense Multiple Access with Enhanced Collision Avoidance (CSMA/ECA) enables collision-free scheduling [9]. A station which succeeds in contention selects a deterministic backoff value instead of a random backoff value and uses it for the future backoff value. In the steady state, wireless stations can have their natural orders of transmissions and transmit data sequentially without collisions.
When the information signal and the energy signal operate in the same band, a traditional MAC designed solely for information signal transfer may not work in an operating environment with both information and power transfer. To control the interference of RF energy signals, the WiFi and Wireless Power Transfer-MAC (W2P-MAC) protocol has been proposed, in which PBs are equipped with WLAN modules and participate in DCF contention to sense and occupy the channel for energy transmission [10]. However, it may not control the collisions with the stations. In this letter, we propose a novel distributed MAC protocol in order to support both WPT and WIT based on CSMA/ECA. In the proposed protocol, the PB can adjust the energy transfer opportunities in consideration of the energy level of the stations, and the PB and stations maintain the collision-free contention. We develop a Markov chain model to analyze the wireless power transfer network and the proposed protocol.

II. PROPOSED MAC PROTOCOL FOR WPT AND WIT

A. Network Model

A wireless network is composed of one Access Point (AP), one PB and n WLAN Stations (STAs). Each station is equipped with an RF energy harvester and an energy storage device. Fig. 1 shows an example of a network for both information and power transfer. The PB is equipped with a WLAN module. We assume that the stations always have packets to transmit to the AP and that a packet can include the information of the current energy state. The amount of energy in a station changes by the harvested energy from the RF signal of the PB and by the consumed energy due to data transmissions.

Fig. 1. A wireless network with one AP, one PB, and n stations, each with its own energy storage.

The amount of energy consumption and harvesting can be modeled as follows. Each time a station transmits a data packet, it consumes a constant amount of energy $e_{con}$, and we assume that $e_{con}$ is one unit of energy. When the PB transmits the RF energy signal to the stations, the energy harvester of a station captures this signal and harvests the energy $E_{har}$. Since the harvested energy depends on the transmission power $P_{TX}$, the transmission time $t_e$, and the distance $r$ between the PB and a station, $E_{har}$ can be expressed as

$$E_{har} = P_{TX}\,[\max(r, 1)]^{-\alpha}\, t_e\, \gamma_c, \tag{1}$$
CHO et al.: ENERGY-PERCEPTIVE MAC FOR WIRELESS POWER AND INFORMATION TRANSFER 645
where $\alpha$ is the path loss exponent and $\gamma_c$ is the RF signal to direct current (DC) energy conversion rate.
If stations are uniformly deployed within an area with the radius of $R_{max}$ around the PB (i.e., the Probability Density Function (PDF) $f_R(r) = 2r/R_{max}^2$, $0 \le r \le R_{max}$), one can derive the PDF of $E_{har}$ when $\alpha = 2$ and $\gamma_c = 1$.

$$f_{E_{har}}(e_{har}) = \begin{cases} \dfrac{P_{TX}\, t_e}{R_{max}^2}\,\dfrac{1}{e_{har}^2}, & \dfrac{P_{TX}\, t_e}{R_{max}^2} \le e_{har} < P_{TX}\, t_e, \\[2mm] \dfrac{1}{R_{max}^2}, & e_{har} = P_{TX}\, t_e. \end{cases} \tag{2}$$

Since we consider a discrete random variable for $E_{har}$, it can be expressed as a discrete value by quantizing $E_{har}$.

$$E^d_{har} = i, \quad \text{if } i\,e_{con} \le E_{har} \le (i+1)\,e_{con}, \quad 1 \le i \le k_{max}, \tag{3}$$

where $k_{max}$ is the maximum number of quantization levels. Then, we can derive the Probability Mass Function (PMF) of the harvested energy as follows.

$$P[E^d_{har} = i] = \int_{i\,e_{con}}^{(i+1)\,e_{con}} f_{E_{har}}(e_{har})\, de_{har}, \quad 1 \le i \le k_{max}. \tag{4}$$

The amount of stored energy in a station can be modeled by a random variable $E_{sto}$. A discrete random variable for $E_{sto}$ can be derived as

$$E^d_{sto} = j, \quad \text{if } (j-1)\,e_{con} \le E_{sto} \le j\,e_{con}, \quad 1 \le j \le M. \tag{5}$$

We assume that $M > k_{max} + 1$.
A massive number of devices are expected to be deployed in a wide area. It may not be viable to cover such a wide area with a single PB since the amount of power transferred decreases with increasing distance. Thus, multiple PBs can be deployed to supply energy to the nodes in sub-areas, with each PB responsible for a small sub-area. If $P_{TX}$ and $R_{max}$ of the PBs are the same and the sub-areas have the same distribution of $E_{har}$, $f_{E_{har}}(e_{har})$, it is equivalent to a network with a single PB in a sub-area.

B. Carrier Sense Multiple Access With Enhanced Collision Avoidance (CSMA/ECA)

CSMA/CA protocol is a distributed MAC protocol for WLAN. Before transmitting data, a station senses the channel. If the channel is idle during the DCF Interframe Space (DIFS), it randomly selects a backoff counter value in $[0, 2^k(\omega_{min} + 1) - 1]$, where $\omega_{min}$ is the minimum contention window and $k \in [0, m]$ is the backoff stage. The backoff counter decreases by one if the channel is idle during a slot time. When the backoff counter value reaches zero, the station transmits data. For each unsuccessful transmission by collision, a station increases the backoff stage by one up to the maximum backoff stage $m$ and repeats the backoff procedure. This algorithm is known as a Binary Exponential Backoff (BEB) mechanism.
The difference between CSMA/ECA and CSMA/CA is that, in CSMA/ECA, a station which successfully transmitted data selects a deterministic backoff value and keeps that backoff value afterwards. When a station succeeds in data transmission and receives an Acknowledgment (ACK) frame, it sets the backoff stage to 0 and selects a deterministic backoff value $\omega_d$ instead of a random backoff value as in CSMA/CA,

$$\omega_d = (\omega_{min} + 1)/2. \tag{6}$$

In the steady state in which all stations have successfully transmitted at least once, each station can transmit data with the backoff values without collision. Even if there is no central coordinator, it is possible for a station to transmit data by its own transmission schedule. If the number of stations accessing the network exceeds $\omega_d$, collisions may occur in CSMA/ECA. To solve this problem, research has been conducted to allow the stations to change the contention window size by themselves with the assistance of the AP, i.e., CSMA/ECA adjust-CW. In the CSMA/ECA adjust-CW, the AP determines the contention window size $\omega_{min}$ based on the number of connected stations $n$, and announces it in beacon frames.

$$\omega_{min} = \begin{cases} qn, & qn \bmod 2 = 1, \\ qn + 1, & \text{otherwise}, \end{cases} \tag{7}$$

where $q$ is a control parameter ($q \ge 2$).

C. Proposed MAC Protocol

In the proposed protocol, a PB senses the channel status and participates in the backoff competition as the other stations do. After winning the competition, the PB can transmit the energy signal to the stations without collisions or interference. The average energy signal transmission opportunity of the PB is adjusted according to the energy states of the stations by multiplying the original $\omega_d$ by an integer factor $r$. Then, while the PB transmits the RF energy signal once, the other stations can transmit data signals $r$ times.

$$\omega_{d,PB} = r\,(\omega_{min} + 1)/2. \tag{8}$$

The parameter $r$ is determined according to the energy level of the stations. If the energy level of the stations is higher than a high threshold $\zeta_{high}$, $r = r_{high}$, i.e., $\omega_{d,PB} = r_{high}(\omega_{min} + 1)/2$. If the energy level is lower than a low threshold $\zeta_{low}$, $r = r_{low}$, i.e., $\omega_{d,PB} = r_{low}(\omega_{min} + 1)/2$. Otherwise $r = r_{mid}$, i.e., $\omega_{d,PB} = r_{mid}(\omega_{min} + 1)/2$.
The parameters $r_{low}$, $r_{mid}$, $r_{high}$, $\zeta_{high}$, and $\zeta_{low}$ are determined based on the amount of energy consumption $e_{con}$ and the average amount of harvested energy $E^d_{har}$. The CSMA/ECA-based stations transmit data signals $r$ times for each energy transmission of the PB and thus consume $r$ units of energy. The stations harvest $E^d_{har}$ energy from the PB. When the station's energy level is $\zeta$, the station's energy level becomes $\zeta' = \zeta - r + E^d_{har}$. If a station is in the low energy state, it requires quick energy harvesting. Then the average transmission opportunity of the PB is set to the same as that of the other stations, i.e., $r_{low} = 1$. Thus, $\zeta'$ will be greater than or equal to $\zeta$. In the medium energy state, in order to balance the amount of the harvested energy $E^d_{har}$ and the consumed energy $r_{mid}$,

$$r_{mid} = \left\lfloor \bar{E}^d_{har} \right\rceil, \tag{9}$$

where $\lfloor \cdot \rceil$ is a rounding function and $\bar{E}^d_{har}$ is the average of the random variable $E^d_{har}$.

$$\bar{E}^d_{har} = E[E^d_{har}] = \sum_{i=1}^{k_{max}} i\, P[E^d_{har} = i]. \tag{10}$$
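The parameter chain of eqs. (2)–(4) and (6)–(10) can be traced numerically. The following is a minimal sketch rather than the authors' implementation: it assumes α = 2 and γc = 1 as in eq. (2), treats each quantization bin as half-open, and uses purely illustrative values (P_TX·t_e = 4.5, R_max = 2, e_con = 1, k_max = 4, q = 2, n = 7) that do not come from the letter. All function names are hypothetical.

```python
import math


def harvest_pmf(p_tx_te, r_max, e_con, k_max):
    """PMF of the quantized harvested energy, eqs. (2)-(4), for alpha = 2, gamma_c = 1.

    ASSUMPTION: bins are treated as half-open [i*e_con, (i+1)*e_con).
    """
    e_min = p_tx_te / r_max ** 2  # energy harvested at the edge, r = R_max
    e_top = p_tx_te               # energy harvested for r <= 1 (point mass 1/R_max^2)
    pmf = {}
    for i in range(1, k_max + 1):
        lo, hi = i * e_con, (i + 1) * e_con
        a, b = max(lo, e_min), min(hi, e_top)
        # Integral of (P_TX t_e / R_max^2) / e^2 over the part of the bin in [e_min, e_top)
        mass = e_min * (1.0 / a - 1.0 / b) if a < b else 0.0
        if lo <= e_top < hi:
            mass += 1.0 / r_max ** 2  # point mass of eq. (2) at e = P_TX t_e
        pmf[i] = mass
    return pmf


def rounded_mean(pmf):
    """r_mid of eq. (9): the mean of eq. (10), rounded to the nearest integer."""
    mean = sum(i * p for i, p in pmf.items())
    return math.floor(mean + 0.5), mean


def omega_min(q, n):
    """Contention window of eq. (7): q*n, forced to be odd."""
    w = q * n
    return w if w % 2 == 1 else w + 1


def omega_d(w_min):
    """Deterministic backoff of the stations, eq. (6)."""
    return (w_min + 1) // 2


def omega_d_pb(r, w_min):
    """Deterministic backoff of the PB, eq. (8)."""
    return r * (w_min + 1) // 2


# Illustrative (hypothetical) parameter values, not taken from the letter.
PMF = harvest_pmf(p_tx_te=4.5, r_max=2.0, e_con=1.0, k_max=4)
R_MID, MEAN = rounded_mean(PMF)
W_MIN = omega_min(q=2, n=7)

if __name__ == "__main__":
    print("PMF:", PMF)
    print("mean harvested energy:", MEAN, "-> r_mid =", R_MID)
    print("omega_min:", W_MIN, "omega_d:", omega_d(W_MIN),
          "omega_d_PB:", omega_d_pb(R_MID, W_MIN))
```

With these assumed values the PMF sums to one only because the smallest harvestable energy P_TX·t_e/R_max² is at least e_con and the largest is below (k_max + 1)·e_con; other parameter choices would leave probability mass outside the bins defined in eq. (3).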
In the high energy state, $r_{high}$ is set to reduce the opportunity of the energy harvesting and the amount of wasted energy. If $E^d_{har}$ is maximally transferred ($k_{max}$),

$$r_{high} = k_{max}. \tag{11}$$

Therefore, $\zeta'$ becomes less than or equal to $\zeta$.
The parameter $\zeta_{high}$ is set so that a station in the medium energy state does not exceed $M$, and the parameter $\zeta_{low}$ is determined so that the energy level remains larger than 1 after transmission.

$$1 < \zeta - r_{mid} + E^d_{har} \le M. \tag{12}$$

From the upper bound, $\zeta_{high}$ can be determined as

$$\zeta_{high} = M - (k_{max} - r_{mid}), \tag{13}$$

when it harvests the maximum amount, i.e., $E^d_{har} = k_{max}$. Similarly, from the lower bound, $\zeta_{low}$ can be determined as

$$\zeta_{low} = 2 - (1 - r_{mid}), \tag{14}$$

when it harvests the minimum amount, i.e., $E^d_{har} = 1$.
An example of a PB monitoring the energy status of the stations and adjusting the average data transfer opportunity is shown in Fig. 2. The minimum energy level of the stations is higher than $\zeta_{low}$ and lower than $\zeta_{high}$, so $r_{mid} = 2$ and $\omega_{d,PB} = r_{mid}\,\omega_d = 16$. As a result, the backoff value of the PB is twice that of the stations, and the PB transmits RF energy after the other stations transmit data twice.

Fig. 2. An example of the proposed MAC protocol.

III. PERFORMANCE EVALUATION

A. Analysis

In this section, we analyze the saturation throughput of the stations in the steady state. To this end, we need to model the steady state probability distribution of the stations' energy levels. The behavior of a station's energy level can be modeled by a discrete-time Markov chain $\{e(t)\}$, $e(t) \in \{1, 2, \ldots, M\}$. Each state in the Markov chain denotes the energy level of a station after the PB transmits the energy signal.
The Markov chain is characterized by a transition matrix $P = [p_{i,j}]$, where $p_{i,j}$ is the transition probability from energy level $i$ to energy level $j$. Since the amount of a station's harvested energy is a random variable, the transition probability from the current state $i$ to any state $j$ is

$$p_{i,j} = \begin{cases} P[E^d_{har} = j - (i - r_{low})], & 1 \le i < \zeta_{low},\; 1 \le j < \zeta_{low} - r_{low} + k_{max}, \\ P[E^d_{har} = j - (i - r_{mid})], & \zeta_{low} \le i \le \zeta_{high},\; \zeta_{low} - r_{mid} + 1 \le j \le M, \\ P[E^d_{har} = j - (i - r_{high})], & \zeta_{high} < i \le M,\; \zeta_{high} - r_{high} + 1 < j \le M, \\ 0, & \text{otherwise}. \end{cases} \tag{15}$$

To simplify the notation, we have defined the probability $P[E^d_{har} = j - (i - r_g)]$ as $p_{d,g}$, where $d = j - i$ and $g \in \{low, mid, high\}$.
We group these energy states into 3 groups, i.e., low, mid, and high. First, in the low group, a station's energy level is lower than $\zeta_{low}$, and the PB's transmission opportunity is the same as those of the other stations ($r_{low} = 1$). The states between $\zeta_{low}$ and $\zeta_{high}$ belong to the mid group. After the stations transmit data $r_{mid}$ times, the PB transmits energy once. In the high group, the stations' energy level is higher than $\zeta_{high}$. The PB's transmission opportunity is $r_{high}$ times lower than those of the other stations. Then, a unique steady state probability distribution $\pi = [\pi_1\ \pi_2\ \ldots\ \pi_M]$, where $\pi_i = \lim_{t \to \infty} P[e(t) = i]$, $i \in [1, M]$, can be found from

$$\pi P = \pi, \qquad \sum_{i=1}^{M} \pi_i = 1. \tag{16}$$

We can calculate the probability of sojourning in each group,

$$p_{low} = \sum_{i=1}^{\zeta_{low}-1} \pi_i, \qquad p_{mid} = \sum_{i=\zeta_{low}}^{\zeta_{high}} \pi_i, \qquad p_{high} = \sum_{i=\zeta_{high}+1}^{M} \pi_i. \tag{17}$$

Let $S_i$ be the system throughput for group $i$, $i \in \{low, mid, high\}$, i.e., the total amount of successfully transmitted data divided by the total time. Let $r_i$ be the ratio between the stations' data transmission rate and the PB's energy signal transmission rate for group $i$, and let $n$ be the number of stations. Then, $S_i$ can be expressed in terms of the frame payload size $l_{payload}$ and the times $T_{Data}$ and $T_{Energy,i}$.

$$S_i = \frac{l_{payload} \times (r_i \times n)}{T_{Data} \times r_i \times n + T_{Energy,i}}. \tag{18}$$

Based on IEEE 802.11 DCF, we obtain

$$\begin{aligned} T_{Data} &= t_{DIFS} + t_{Slot} + t_{RTS} + t_{SIFS} + t_{CTS} + t_{SIFS} + t_{Data} + t_{SIFS} + t_{ACK}, \\ T_{Energy,i} &= t_{DIFS} + r_i \times t_{Slot} + t_{RTS} + t_{SIFS} + t_{CTS} + t_{SIFS} + t_e, \end{aligned} \tag{19}$$

where $t_\beta$ denotes the duration for $\beta$. Then, we can calculate the overall saturation throughput $S_T$ as follows:

$$S_T = \sum_{i \in \{low, mid, high\}} p_i S_i. \tag{20}$$
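The analysis in eqs. (13)–(20) can be walked through numerically. The sketch below is one possible reading, not the authors' code: it builds the transition matrix from the three cases of eq. (15), finds the steady state of eq. (16) by power iteration, and evaluates eqs. (17)–(20). Three things are assumptions made here for illustration: energy levels that would fall outside [1, M] are clamped to the boundary, the harvested-energy PMF is an example, and the payload size and all timing values are placeholders (Table I is not reproduced in this text). All names are hypothetical.

```python
def transition_matrix(pmf, M, zeta_low, zeta_high, r_low, r_mid, r_high):
    """Row-stochastic matrix following the three groups of eq. (15)."""
    P = [[0.0] * M for _ in range(M)]
    for i in range(1, M + 1):
        if i < zeta_low:
            r = r_low
        elif i <= zeta_high:
            r = r_mid
        else:
            r = r_high
        for k, prob in pmf.items():
            # ASSUMPTION: levels outside [1, M] are clamped; the letter does not say this.
            j = min(max(i - r + k, 1), M)
            P[i - 1][j - 1] += prob
    return P


def stationary(P, iters=5000):
    """Steady state of eq. (16) by repeated multiplication pi <- pi P."""
    M = len(P)
    pi = [1.0 / M] * M
    for _ in range(iters):
        pi = [sum(pi[i] * P[i][j] for i in range(M)) for j in range(M)]
    return pi


def group_throughput(r, n, l_payload, t):
    """S_i of eq. (18) with T_Data and T_Energy,i of eq. (19)."""
    t_data = (t["DIFS"] + t["Slot"] + t["RTS"] + t["SIFS"] + t["CTS"]
              + t["SIFS"] + t["Data"] + t["SIFS"] + t["ACK"])
    t_energy = (t["DIFS"] + r * t["Slot"] + t["RTS"] + t["SIFS"] + t["CTS"]
                + t["SIFS"] + t["e"])
    return l_payload * r * n / (t_data * r * n + t_energy)


# Illustrative (hypothetical) inputs, not taken from the letter.
PMF = {1: 0.4375, 2: 0.1875, 3: 0.09375, 4: 0.28125}
M, K_MAX, N, L_PAYLOAD = 10, 4, 7, 8000
R = {"low": 1, "mid": 2, "high": K_MAX}     # r_low = 1, eq. (9), eq. (11)
ZETA_HIGH = M - (K_MAX - R["mid"])          # eq. (13)
ZETA_LOW = 2 - (1 - R["mid"])               # eq. (14)
TIMES = {"DIFS": 34e-6, "Slot": 9e-6, "SIFS": 16e-6, "RTS": 50e-6,
         "CTS": 40e-6, "Data": 500e-6, "ACK": 40e-6, "e": 500e-6}

P = transition_matrix(PMF, M, ZETA_LOW, ZETA_HIGH, R["low"], R["mid"], R["high"])
PI = stationary(P)
P_GROUP = {"low": sum(PI[:ZETA_LOW - 1]),             # eq. (17)
           "mid": sum(PI[ZETA_LOW - 1:ZETA_HIGH]),
           "high": sum(PI[ZETA_HIGH:])}
S = {g: group_throughput(R[g], N, L_PAYLOAD, TIMES) for g in R}
S_T = sum(P_GROUP[g] * S[g] for g in R)               # eq. (20)

if __name__ == "__main__":
    print("group probabilities:", P_GROUP)
    print("per-group throughput (bit/s):", S)
    print("overall saturation throughput (bit/s):", S_T)
```

Power iteration is used only to keep the sketch dependency-free; solving the linear system of eq. (16) directly would serve equally well. Since eq. (18) grows with $r_i$, the overall throughput is highest when most of the steady-state mass sits in the high group.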
TABLE I
SYSTEM PARAMETERS

Fig. 3. Performances of the proposed MAC: (a) Throughput for varying numbers of stations, (b) Wasted energy due to energy transfer by the PB, and (c) Energy efficiency of the stations due to energy harvesting.

B. Simulations

In order to evaluate the performance of the proposed protocol, we compare the throughput, wasted energy, and energy efficiency of the proposed protocol with those of W2P-MAC and the proposed protocol without adjusting the PB's transmission opportunity. The wasted energy is defined as the ratio between the total amount of wasted energy and the total time, and the energy efficiency is defined as the total number of data bits transmitted divided by the total energy transmitted by the PB. The parameters used in the simulation and the analysis are shown in Table I [11].
Throughput performance in Fig. 3(a) increases as the number of stations increases, which stems from the fact that the time interval for the PB decreases as the number of stations increases. Since the stations retransmit data due to collisions in W2P-MAC, its throughput is the lowest. The proposed schemes show higher performance than W2P-MAC because there are no retransmissions due to collisions. When the PB adjusts its transmission opportunity, it can prevent unnecessary energy transmission so that the channel can be used about 7% more for data transmission.
Fig. 3(b) shows the wasted energy rate. As the number of stations increases, the wasted energy tends to decrease. However, the proposed protocol can significantly reduce the waste of energy even for a small number of stations. Since W2P-MAC transmits the energy signal without considering the energy states of stations, energy may be wasted and frequent collisions may occur. Thus the unused amount of energy in W2P-MAC is larger than that in the proposed mechanism.
The energy efficiency of the stations is shown in Fig. 3(c). With the introduction of the PB, WIT and the energy efficiency may be reduced. Thus, WPT and WIT have a trade-off. In the proposed protocol, to balance the WPT and WIT, the PB can control the WPT opportunity according to the energy levels of the stations. The proposed protocol has almost twice as high energy efficiency as those of the others due to the control of the PB's WPT opportunities.

IV. CONCLUSION

In this letter, we have proposed a distributed MAC protocol for both WPT and WIT based on CSMA/ECA in wireless powered communication networks. The proposed protocol attempts to coordinate data and energy transmissions. It not only reduces the amount of wasted energy but also improves throughput and energy efficiency by appropriately controlling the energy transmission opportunity of the PB according to the energy level of stations. The proposed protocol is shown to improve the power availability of devices and allows efficient data transfer and power transmission in a wireless network. As future research, WPT with beamforming could be considered. If the WPT with beamforming technique is applied, performance is expected to be further improved.

REFERENCES

[1] W. Ejaz, M. Naeem, A. Shahid, A. Anpalagan, and M. Jo, “Efficient energy management for the Internet of Things in smart cities,” IEEE Commun. Mag., vol. 55, no. 1, pp. 84–91, Jan. 2017.
[2] P. Kamalinejad et al., “Wireless energy harvesting for the Internet of Things,” IEEE Commun. Mag., vol. 53, no. 6, pp. 102–108, Jun. 2015.
[3] T. X. Doan, T. M. Hoang, T. Q. Duong, and H. Q. Ngo, “Energy harvesting-based D2D communications in the presence of interference and ambient RF sources,” IEEE Access, vol. 5, pp. 5224–5234, 2017.
[4] X. Lu, P. Wang, D. Niyato, D. I. Kim, and Z. Han, “Wireless networks with RF energy harvesting: A contemporary survey,” IEEE Commun. Surveys Tuts., vol. 17, no. 2, pp. 757–789, 2nd Quart., 2015.
[5] J. Ren et al., “RF energy harvesting and transfer in cognitive radio sensor networks: Opportunities and challenges,” IEEE Commun. Mag., vol. 56, no. 1, pp. 104–110, Jan. 2018.
[6] S. Lee, L. Liu, and R. Zhang, “Collaborative wireless energy and information transfer in interference channel,” IEEE Trans. Wireless Commun., vol. 14, no. 1, pp. 545–557, Jan. 2015.
[7] R. Gupta, A. K. Chaturvedi, and R. Budhiraja, “Improved rate-energy tradeoff for energy harvesting interference alignment networks,” IEEE Wireless Commun. Lett., vol. 6, no. 3, pp. 410–413, Jun. 2017.
[8] Wireless LAN Medium Access Control (MAC) and Physical Layer (PHY) Specifications, IEEE Standard 802.11, Dec. 2016.
[9] L. Sanabria-Russo, J. Barcelo, B. Bellalta, and F. Gringoli, “A high efficiency MAC protocol for WLANs: Providing fairness in dense scenarios,” IEEE/ACM Trans. Netw., vol. 25, no. 1, pp. 492–505, Feb. 2017.
[10] H. Lee, Y. Kim, J. H. Ahn, M. Y. Chung, and T.-J. Lee, “Wi-Fi and wireless power transfer live together,” IEEE Commun. Lett., vol. 22, no. 3, pp. 518–521, Mar. 2018.
[11] W. Cheng, X. Zhang, and H. Zhang, “Full-duplex spectrum-sensing and MAC-protocol for multichannel nontime-slotted cognitive radio networks,” IEEE J. Sel. Areas Commun., vol. 33, no. 5, pp. 820–831, May 2015.

Comments and Corrections

Corrections to “Outage Analysis for Decode-and-Forward Multirelay
Systems Allowing Intra-Link Errors”
Albrecht Wolf, Diana Cristina González, Meik Dörpinghaus, Luciano Leonel Mendes, José Cândido Silveira Santos Filho, and Gerhard Fettweis
Manuscript received December 24, 2018; accepted December 24, 2018. Date of current version April 9, 2019. (Corresponding Author: Albrecht Wolf.)
A. Wolf, M. Dörpinghaus, and G. Fettweis are with the Vodafone Chair Mobile Communications Systems, Technische Universität Dresden, 01062 Dresden, Germany (e-mail: albrecht.wolf@tu-dresden.de; meik.doerpinghaus@tu-dresden.de; gerhard.fettweis@tu-dresden.de).
D. C. González and J. C. S. Santos Filho are with the Department of Communications, School of Electrical and Computer Engineering, University of Campinas, Campinas 13083-852, Brazil (e-mail: dianigon@decom.fee.unicamp.br; candido@decom.fee.unicamp.br).
L. L. Mendes is with the Instituto Nacional de Telecomunicações, Santa Rita do Sapucaí 37540-000, Brazil (e-mail: luciano@inatel.br).
Digital Object Identifier 10.1109/LWC.2018.2889941

I. INTRODUCTION

In [1], we derived an analytical expression for the outage probability of decode-and-forward (DF) multirelay systems that allow for intra-link errors (IE). To this end, we relied upon the admissible rate region of a binary many-help-one problem with independently degraded helpers, which we believed to have obtained in [2]. Unfortunately, later on we realized that the admissible rate region in [2] is incorrect, and thus so is the resulting DF-IE outage probability in [1].
As far as we are aware, the admissible rate region of the binary many-help-one problem with independently degraded helpers remains unknown in closed form. On the other hand, we derived recently a simple bound of this admissible rate region when specialized to a primary source that is uniformly distributed and to helpers that are degraded through symmetric channels [3].
In this letter we correct the analysis in [1] by obtaining an upper bound of the DF-IE outage probability based on the admissible rate region’s bound in [3].

II. CORRECTIONS

The corrections that follow are threefold: the system model of DF-IE, the admissible rate region of the binary many-help-one problem with independently degraded helpers, and the outage probability of DF-IE. Please refer to the notation introduced in the last paragraph of [1, Sec. I], as we shall use it here.
System Model: Unlike reported in [1, Sec. II] and depicted in [1, Fig. 1], the relay sequences are not decoded independently at the destination. Instead, all received sequences, from source and relays, are jointly decoded at the destination to retrieve the source message. In [2], we claimed that independent decoding would be able to achieve optimum performance. But we refuted this claim in [3].
Admissible Rate Region: In [1, eqs. (2) and (3)], we reproduced from [2] what was claimed to be the admissible rate region of the binary many-help-one problem with independently degraded helpers. This admissible rate region turned out to be incorrect, and the problem remains open. On the other hand, we derived recently a bound on the admissible rate region when the primary source is uniformly distributed and the helpers are degraded through symmetric channels [3]. This bound is reproduced next, and used subsequently for outage analysis.
Reference [3, Th. 3]: If $(X_1, X_2, \ldots, X_N)$ is an N-tuple of binary random variables with joint pmf $p(x_1, x_{\mathcal{L}}) = p(x_1) \prod_{i=2}^{N} p(x_i | x_1)$, with $p_{X_1}(0) = p_{X_1}(1) = 1/2$, $p_{X_i|X_1}(0|1) = p_{X_i|X_1}(1|0) = p_i$ for some $0 \le p_i \le 1/2$, $i \in \mathcal{L}$, then $\mathcal{R}_{sub}$ is a subset of the admissible rate region $\mathcal{R}_{DF\text{-}IE}$, given by

$$\begin{aligned} \mathcal{R}_{sub} = \Big\{ (R_1, R_2, \ldots, R_N):\ & R_1 \ge \sum_{i \in \mathcal{L}} h(p_i * \kappa_i) - \eta(p_{\mathcal{L}}, \kappa_{\mathcal{L}}), \\ & \sum_{i \in \mathcal{S}} R_i \ge \eta(p_{\mathcal{L}}, \kappa_{\mathcal{L}}) - \eta(p_{\mathcal{S}^c}, \kappa_{\mathcal{S}^c}) - \sum_{i \in \mathcal{S}} h(\kappa_i), \\ & \sum_{i \in \mathcal{L}} R_i \ge 1 + \eta(p_{\mathcal{L}}, \kappa_{\mathcal{L}}) - \sum_{i \in \mathcal{L}} h(\kappa_i), \\ & \forall \mathcal{S} \subset \mathcal{L} \text{ and } \mathcal{S}^c = \mathcal{L} \backslash \mathcal{S},\ \kappa_{\mathcal{L}} \in [0, 0.5]^{(N-1)} \Big\}, \end{aligned} \tag{1}$$

where $\eta(\cdot)$ is defined in [3, eq. (27)]. Here, we use a compact notation of its argument, e.g., $\eta(p_{\mathcal{L}}, \kappa_{\mathcal{L}}) = \eta(\{p_i * \kappa_i\}_{i \in \mathcal{L}})$. As shown in [3], the subset $\mathcal{R}_{sub}$ is an increasingly tight approximation of the admissible rate region $\mathcal{R}_{DF\text{-}IE}$ as the helpers become more degraded.
Outage Probability: In [1, eqs. (6)–(9)], we claimed to have obtained the outage probability of DF-IE. But those expressions are incorrect, as they rely upon an incorrect admissible rate region [1, eqs. (2) and (3)]. Here, by using (1), we obtain a valid upper bound on the referred outage probability, as follows. On the one hand, we have the transmission rates $R_N$ depending on the received SNRs of the source- and relay-destination links, $\Gamma_{SD}$ and $\Gamma_{F_{\mathcal{L}} D}$ (see [1, eq. (4)]). On the other hand, we have the admissible rate region $\mathcal{R}_{DF\text{-}IE}$ depending on the received SNRs of the source-relay links $\Gamma_{S F_{\mathcal{L}}}$. An outage event occurs whenever the transmission rates
$R_N$ fall outside the admissible rate region $\mathcal{R}_{DF\text{-}IE}$. Thus, using [1, eq. (4)], we have

$$P^{out}_{DF\text{-}IE,N} = \Pr\left[ \left( \frac{1}{R_c}\phi(\Gamma_{SD}),\ \frac{1}{R_c}\phi(\Gamma_{F_{\mathcal{L}} D}) \right) \notin \mathcal{R}_{DF\text{-}IE}\big(\Gamma_{S F_{\mathcal{L}}}\big) \right] \tag{2}$$

$$\begin{aligned} < \Pr\Bigg[ & \bigg\{ \frac{1}{R_c}\psi(\Gamma_{SD}) \le \sum_{i \in \mathcal{L}} h\big(p_i(\Gamma_{S F_i}) * \kappa_i\big) - \eta\big(p_{\mathcal{L}}(\Gamma_{S F_{\mathcal{L}}}), \kappa_{\mathcal{L}}\big) \bigg\} \\ & \cup \bigcup_{\forall \mathcal{S} \subset \mathcal{L}} \bigg\{ \sum_{i \in \mathcal{S}} \frac{1}{R_c}\psi(\Gamma_{F_i D}) \le \eta\big(p_{\mathcal{L}}(\Gamma_{S F_{\mathcal{L}}}), \kappa_{\mathcal{L}}\big) - \eta\big(p_{\mathcal{S}^c}(\Gamma_{S F_{\mathcal{S}^c}}), \kappa_{\mathcal{S}^c}\big) - \sum_{i \in \mathcal{S}} h(\kappa_i) \bigg\} \\ & \cup \bigg\{ \sum_{i \in \mathcal{L}} \frac{1}{R_c}\psi(\Gamma_{F_i D}) \le 1 + \eta\big(p_{\mathcal{L}}(\Gamma_{S F_{\mathcal{L}}}), \kappa_{\mathcal{L}}\big) - \sum_{i \in \mathcal{L}} h(\kappa_i) \bigg\}, \\ & \mathcal{S}^c = \mathcal{L} \backslash \mathcal{S},\ \kappa_{\mathcal{L}} \in [0, 0.5]^{(N-1)} \Bigg]. \end{aligned} \tag{3}$$

In (3), we substituted the rate constraints from (1). Because (1) represents a subset of the admissible rate region, the resulting probability is an upper bound on the outage probability of DF-IE. This bound cannot be solved in closed form, requiring numerical evaluation.
We verified for the numerical examples presented in [1, Fig. 2a] that the upper bound of DF-IE in (3) proves marginally different from the (incorrect!) outage probability given in [1, eq. (9)]. The curves barely change. In particular, the conclusion drawn in [1] still holds true: DF-IE outperforms conventional DF in terms of outage probability, becoming more advantageous as more relays are used.

REFERENCES

[1] A. Wolf, D. C. González, M. Dörpinghaus, L. L. Mendes, J. C. S. Santos Filho, and G. Fettweis, “Outage analysis for decode-and-forward multirelay systems allowing intra-link errors,” IEEE Wireless Commun. Lett., vol. 6, no. 6, pp. 758–761, Dec. 2017.
[2] A. Wolf, D. C. González, M. Dörpinghaus, J. C. S. Santos Filho, and G. Fettweis, “On the many-help-one problem with independently degraded helpers,” 2017. [Online]. Available: http://arxiv.org/abs/1701.06416v1
[3] A. Wolf, D. C. González, M. Dörpinghaus, J. C. S. Santos Filho, and G. Fettweis, “On the binary lossless many-help-one problem with independently degraded helpers,” in Proc. IEEE 56th Annu. Allerton Conf. Commun. Control Comput. (Allerton), Monticello, IL, USA, Oct. 2018, pp. 1–5.

IEEE COMMUNICATIONS SOCIETY
2019 Board of Governors
O FFICERS
President President-Elect VP-Member & Global Activities VP-Publications
K. L ETAIEF V. C HAN N. K ATO X. S HEN
Hong Kong Univ. Science & Tech. MIT Graduate School of Info. Sci. Univ. of Waterloo
VP-Technical & Educational Activities VP-Conferences VP-Industry & Standards Activities

N. F ONSECA S. B REGNI S. G ALLI
State Univ. Campinas Politecnico di Milano Huawei Technologies
Executive Director/Secretary
S.M. B ROOKS, IEEE ComSoc
A PPOINTED O FFICERS
Chief Information Officer Treasurer Director—Educational Services Director—Magazines Director—Member Services Director—Sister & Related Societies
Z. D ING F. TAKAWIRA F. G RANELLI M. FANG S. G UO O. D OBRE
Univ. of California Univ. of Witwatersrand Univ of Trento Memorial Univ.
Director—Tech. Services Director—NA Region
Chief Marketing Officer Director—AP Region Director—EMEA Region T. TALEB W. A LMUHTADI Director—Standards Dev.
R. F ISH S. BAHK A. K SENTINI T. E L -BAWAB
NETovations, LLC Director—Industry Communities Director—On-Line Content Jackson State Univ.
Director—Conf. Dev. Director—Journals I. W ONG Z. N IU
Parliamentarian J. RODRIGUES R. S CHOBER Tsinghua Univ. Director—Standardization
R. D E M ARCA Univ. of Beira Interior Friedrich-Alexander Univ. Director—Industry Outreach Programs Development
Center for Telecommunication A. D UTTA, AT&T N. G OLMIE
Studies Director—Conf. Operations Director—LA Region National Inst. of Standards & Tech.
H. S ARI L. Z AMBENEDETTI
Nanjing Univ. of Posts and
Telecommunications
M EMBERS - AT-L ARGE

L. H ANZO (’19) G. F ETTWEIS (’20) O. D OBRE
Univ. of Southampton Technical Univ. Dresden Memorial Univ. (21)
W. L IAO (’19) E. H OSSAIN (’20) P. M ARTIN (21)

National Taiwan Univ. Univ. of Manitoba
P. P OPOVSKI (21)
D. M ICHELSON (’19) U. M ITRA (’20)
Univ. of British Columbia Univ. of Southern C. X IAO (21)
California
R. V EIGA (’19)
Univ. of Buenos Aires W. Z HANG (’20)
C OMMITTEE C HAIRS
Awards Dist. Lecturers Selection Finance GITC Strategic Planning
C. X IAO S. M AO F. TAKAWIRA M. VALENTI V. C HAN
Lehigh Univ. Univ. of Witwatersrand West Virginia Univ. MIT
Emerging Technologies
Communications History J. A NDREWS Governance Add Marketing Technical Committees Recertification
D. M ICHELSON Univ. of Texas at Austin M. H ARTMANN R. F ISH - NETovations, LLC N. F ONSECA
Univ. of British Columbia State Univ. Campinas
Fellow Evaluation GIMS Nominations & Elections
ComSoc Young Professional S. B ENEDETTO
N. R AMIREZ K. A SATANI TBD WICE
Politecnico de Torino Kogakuin Univ. A. G ARCIA A RMADA
Operations & Facilities
K. L ETAIEF
Hong Kong Univ. Science & Tech.
Conferences Transactions on Cognitive ComSoc e-News Communication Theory Optical Networking

VP — S. B REGNI Communications and Networking S. K. W ILSON, Acting EIC U. M ITRA X. C AO
Politecnico di Milano Y.-C. L IANG, EIC Santa Clara Univ. Univ. of Southern California Georgia State Univ.
Director — Conf. Dev. Transactions on Green Communications IEEE Press
J. RODRIGUES Computer Communications Power Line Communications
and Networking S. S HEN
Univ. of Beira Interior J. S TERBENZ A. T ONELLO
E. AYANOGLU, EIC Univ. of Waterloo Univ. Kansas/Lancaster U.K.
Univ. of California, Irvine
Director — Conf. Operations
Tech. & Educational Activities Radio Communications
H. S ARI Transactions on Molecular, Data Storage
VP — N. F ONSECA Y. S HEN
Nanjing Univ. of Posts and Biological, and Multi-Scale S. S. G ARANI
State Univ. Campinas Tsinghua Univ.
Telecommunications Communications Indian Inst. of Sci., Bangalore
Publications U. M ITRA, EIC Vice Chair Satellite & Space Commun.
VP — X. S HEN Univ. of Southern California S. M AO e-Health T. D E C OLA
Univ. of Waterloo Auburn Univ. TBD German Aerospace Ctr.
Transactions on Networking and Service
Management
Journals Board Ad Hoc & Sensor Commun. Green Comm. & Computing
F. DE TURCK, EIC Signal Processing &
Director of Journals C. LI J. WU
Ghent Univ. Communications Electronics
R. SCHOBER
Friedrich-Alexander Univ. X. WANG
Transactions on Wireless Big Data Information Infrastructure &
Communications J. LI Networking
Communications Letters J. ZHANG, EIC R. LANGAR SPCE
O. DOBRE, EIC Cognitive Networks X. WANG
Memorial Univ. Wireless Commun. Letters Y. GAO
W. ZHANG, EIC Queen Mary Univ. of London Innovation & Standards in
Info. & Comm. Tech. Smart Grid Communications
IEEE Communications Univ. of New South Wales A. ZHANG
Surveys & Tutorials Comm. & Info. Security S. M. HASAN
Magazines Board Chinese Univ. of Hong Kong
Y. D. LIN, EIC A. BENSLIMANE
Director of Magazines Internet
National Chiao Tung Univ. Univ. of Avignon
M. FANG D. HUANG Social Networks
IEEE/ACM Transactions on Communications Magazine Communications Quality & Arizona State Univ. N. PRASAD
Networking T. EL-BAWAB Reliability
R. SRIKANT, EIC Jackson State Univ. S. PORETSKY Molecular, Biological and Multi-Scale TC Tactile Internet
Univ. of Illinois at Communications M. SIMSEK
Network Magazine
Urbana-Champaign Communications Software T. NAKANO
M. GUIZANI
Univ. of Idaho A. KSENTINI Transmission Access &
Journal on Selected Areas Optical Systems
in Communications Multimedia Communications
Wireless Commun. Magazine Commun. Switching & Routing F. GRANELLI
R. BOUTABA, EIC S. MAO
H. GHARAVI A. MELLOUK Univ. of Trento
Univ. of Waterloo Auburn University
National Inst. of Standards & Tech. UPEC
Transactions on Commun. Global Commun. Newsletter Communications Systems Network Operations & Mgmt. Wireless Communications
N. AL-DHAHIR, EIC S. BREGNI Integration & Modeling L. GRANVILLE W. ZHANG
Univ. of Texas at Dallas Politecnico di Milano C. VERIKOUKIS Fed. Univ. Rio Grande do Sul Univ. of South Wales



IEEE WIRELESS COMMUNICATIONS LETTERS

IEEE EXECUTIVE STAFF

IEEE Publishing Operations

Digital Object Identifier 10.1109/LWC.2019.2905313


COMMENTS AND CORRECTIONS

Optimal User Pairing for Downlink Non-Orthogonal Multiple Access (NOMA)

Case 2: User-1 pairs User-3, i.e., {u1,3 = 1, u2,4 = 1}. The

D. Generalization Consideration

V. CONCLUSION

On the Capacity of Gaussian MIMO Channels Under the

$S_R = \{R : R \ge 0,\ \mathrm{tr}\, R \le P_T,\ r_{ii} \le P\}$ (5)

$P > m^{-1}(P_T + \alpha_1)$, (13)

VI. PROPERTIES OF OPTIMAL COVARIANCE

where Λ is determined from the PACs

Secure Transmission With Interleaver for Uplink Sparse

SER performance of the eavesdropper is still too high to detect

Hybrid Modulation Scheme Combining PPM With Differential

III. PERFORMANCE OF PPM-DCSK ANALYSIS

where Es = 2RE{x 2 } is symbol energy of PPM-DCSK,

proposed PPM-DCSK. Fig. 6 shows BER performance with

Double Shadowing the Rician Fading Model

IV. PERFORMANCE ANALYSIS

V. SPECIAL CASES AND NUMERICAL RESULTS

VI. CONCLUSION

APPENDIX C

Energy-Efficient Prefix Code Based Backscatter Communication

Fig. 1. Block diagram of the proposed CBBC.

We note that the codebook designed in [4], which is applied

Outage Constrained Robust Multigroup Multicast Beamforming for

per-beam power constraints. We then reformulated the out-

Low-Complexity Differential Spatial Modulation

(SM) [1]–[3] attracts much attention. SM is a multi-antenna

II. REVIEW OF DSM

rank, i.e., |(X − X )(X − X )† | = 0, no matter what the

orders have appeared before. This process is continued until all

with i > 1, all matrices in Ωi differ at least three columns

Time-Expanded Graph-Based Resource Allocation Over the Satellite Networks

Abstract—In this letter, we propose a transceiver resource

Then, we can define:

B. Transceiver Resource Allocation Condition

where A − B contains the elements in the set A but not in

where (9) means that the flow out of a satellite u during τp

B. Representation Rule of Transceiver Resources

A Novel Frequency Allocation Scheme for In Band

levels as well as interference from neighboring cells, only UEs

number of UEs, it is not necessary that all dij ’s ∈ D but only

III. FREQUENCY ALLOCATION SCHEME

be accommodated to identify new sets of UEs for frequency

$PL_{dB} = 46.3 + 33.9\log_{10}(f) - 13.82\log_{10}(h_B) - a(h_R, f)$

where $a(h_R, f) = (1.1\log_{10}(f) - 0.7)h_R - (1.56\log_{10}(f) -$

$P_{R_j} = P_{T_i} + G_{R_j} + G_{T_i} - PL_{ij}$ (A.2)
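
As a rough illustration of the link budget in (A.2), the following Python sketch evaluates the received power at receiver j from transmitter i when every quantity is expressed in the log domain (dBm for powers, dB for gains and path loss). The function name, the parameter names, and the numeric values are assumptions made for illustration and do not come from the letter; the path loss is passed in as a number rather than computed from the path-loss model.

```python
def received_power_dbm(pt_dbm, g_rx_db, g_tx_db, pl_db):
    """Link budget of (A.2): P_Rj = P_Ti + G_Rj + G_Ti - PL_ij (log domain)."""
    return pt_dbm + g_rx_db + g_tx_db - pl_db


# Hypothetical values chosen only to exercise the formula.
p_rx = received_power_dbm(pt_dbm=23.0, g_rx_db=0.0, g_tx_db=0.0, pl_db=100.0)
print(p_rx)
```

Because the terms are in decibels, the gains add and the path loss subtracts; the same function applies to any transmitter and receiver pair once a path-loss value is available.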

Optimal Transmission Scheduling in Small Multimodal Underwater Networks

III. SIMULATION RESULTS

mitted by node k to node i. For Lp bits in a packet, we also

consumption of 2.2 Wh by OMS and 4.0 Wh by Aloha. This

Placement Delivery Array Design via Attention-Based Sequence-to-Sequence

and N, and it is called a (K, M, N) caching system. According

parameters of the model, i.e.,

Algorithm 1 Seq2Seq Placement Delivery Network

Massive MIMO-OFDM Channel Estimation via

the proposed StFBP this is not as important as ABSP since

Analysis of Unslotted IEEE 802.15.4 Networks With

The collision probability for nodes of class l is the proba-

where γ = 2ttatb−tb is a corrective factor that introduces into

S = U(0, Ls − 1), C = U(0, L − 1) and B0 = U(0, W0 − 1) (j )
