
Advances in Information Technologies for Electromagnetics

Edited by

Luciano Tarricone
University of Lecce, Italy

and
Alessandra Esposito
University of Lecce, Italy
A C.I.P. Catalogue record for this book is available from the Library of Congress.

ISBN-10 1-4020-4748-7 (HB)
ISBN-13 978-1-4020-4748-0 (HB)
ISBN-10 1-4020-4749-5 (e-book)
ISBN-13 978-1-4020-4749-5 (e-book)

Published by Springer,
P.O. Box 17, 3300 AA Dordrecht, The Netherlands.

www.springer.com

Printed on acid-free paper

All Rights Reserved
© 2006 Springer
No part of this work may be reproduced, stored in a retrieval system, or transmitted
in any form or by any means, electronic, mechanical, photocopying, microfilming, recording
or otherwise, without written permission from the Publisher, with the exception
of any material supplied specifically for the purpose of being entered
and executed on a computer system, for exclusive use by the purchaser of the work.

Printed in the Netherlands


Dedication

This book is dedicated to


Edoardo and Silvia
Contents

Contributing Authors
Preface
Acknowledgments

1 Parallel and Distributed Environments
A. Esposito

1. INTRODUCTION
2. BASIC CONCEPTS
3. PARALLEL PROGRAMMING
3.1 Introduction
3.1.1 MPI
3.2 Performance Assessment
4. DISTRIBUTED SYSTEMS
4.1 Introduction
4.2 RPC
4.3 Mobile Agent Framework
5. THE WEB
5.1 XML
5.1.1 Introduction
5.1.2 XML Fundamentals
5.1.3 Namespaces
5.1.4 XML Schema
5.1.5 Applications

2 Object-Oriented Technologies
A. Esposito

1. INTRODUCTION
2. OO PROGRAMMING
2.1 Basic Concepts
2.2 Java
2.2.1 Introduction
2.2.2 The Language
3. OO DISTRIBUTED FRAMEWORKS
3.1 Introduction
3.1.1 Java RMI
3.2 Java Mobile Agents

3 The Semantic Web
A. Esposito

1. INTRODUCTION
2. DESCRIPTION LOGICS
2.1 Introduction
2.2 A Model for Reality: The TBox
2.2.1 Constructors
2.2.2 Axioms
2.3 The ABox
2.4 Reasoners
3. TOOLS FOR THE SEMANTIC WEB
3.1 Languages
3.2 Reasoners
3.3 Tools for Building Ontologies

4 Web Services
A. Esposito

1. INTRODUCTION
2. BASIC CONCEPTS
2.1 Web Services Architecture
3. WEB SERVICES DESCRIPTION: WSDL
4. AUTOMATIC DISCOVERY OF WEB SERVICES
4.1 UDDI
4.2 The Semantic Web Services

5 Grid Computing
A. Esposito

1. INTRODUCTION
2. GC BASIC CONCEPTS
3. THE GLOBUS TOOLKIT
3.1 GT and Web Services
4. GT COMPONENTS
5. JOB MANAGEMENT
5.1 GC for HPC
6. INFORMATION SERVICES
7. DATA MANAGEMENT

6 Complex Computational Electromagnetics using Hybridisation Techniques
R. A. Abd-Alhameed and P. S. Excell

1. INTRODUCTION
1.1 Integral Equation Methods
1.2 Differential Equation Methods
1.3 The Advantages and Disadvantages of the Methods
1.4 Hybrid Methods
1.5 Literature Review
2. OUTLINE OF THEORY AND IMPLEMENTATION OF HYBRID METHOD
2.1 Hybrid Treatment for Homogeneous Multiple Elements
2.1.1 Hybrid MoM/MoM Treatment for Two Elements (Sub-Matrices Iterative Technique)
2.1.2 Hybrid MoM/MoM Method for Two Elements (Field Transfer Iterative Technique)
2.1.3 Extension of Hybrid MoM/MoM Method from Two Elements to Multiple Elements (Field Transfer Iterative Technique)
2.1.4 Hybrid MoM in Multiple Regions Using the Equivalence Principle Surface
3. INCIDENT WAVE EXCITATIONS IN THE FDTD METHOD
3.1 Total/Scattered Field Formulation in Three Dimensions
4. MODIFIED TOTAL/SCATTERED FIELD FORMULATION FOR THE HYBRID TECHNIQUE
5. VALIDATION OF TOTAL/SCATTERED FIELD FORMULATION IMPLEMENTATION USING HOMOGENEOUS FDTD IN MULTIPLE REGIONS
6. HYBRID MOM/FDTD TECHNIQUE ALGORITHM
6.1 Theoretical Formulation
6.2 Multiple-Source Scattering Problems
7. NEC/FDTD HYBRID PROGRAM
8. FAR FIELD CALCULATIONS USING THE HYBRID CODE
9. NUMERICAL EXAMPLES USING THE HYBRID MoM/FDTD TECHNIQUE
10. SUMMARY

7 Enhanced EM software for Planar Circuits
D. Vande Ginste, F. Olyslager, D. De Zutter and E. Michielssen

1. INTRODUCTION
1.1 Setting and Definition of the Research Topic
1.1.1 High-Frequency Applications and Design
1.1.2 Planar Circuits and Planar Solvers
1.1.3 Some Advantages and Drawbacks of BIE-MoM Based Planar Solvers
1.2 Methodology
1.2.1 Perfectly Matched Layer (PML) Based Green's Functions
1.2.2 Iterative Solvers
1.2.3 Fast Multipole Method (FMM)
1.3 Outline
2. CLASSICAL SOLUTION TECHNIQUE FOR MICROSTRIP STRUCTURES
2.1 Geometry of the Problem
2.2 The EFIE Description
2.3 The Green's Dyadic $G_{ee}(r \mid r')$
2.3.1 Integral Representation
2.3.2 Sommerfeld-Integrals
2.4 The Method of Moments
3. PERFECTLY MATCHED LAYER BASED GREEN'S FUNCTIONS FOR LAYERED MEDIA
3.1 The Perfectly Matched Layer Concept
3.1.1 The Split Field Formalism
3.1.2 Complex Coordinate Stretching Formalism
3.2 Closure of Open Microstrip Substrates
3.2.1 Procedure and Influence on the Green's Functions
3.2.2 Complex Thickness
3.2.3 Dispersion Relations
3.3 Series Expansion for the Green's Dyadic $G_{ee}$
3.3.1 Integral Representation
3.3.2 $G_{ee,xx}$
3.3.3 $G_{ee,xy}$
3.3.4 Closed-Form Expression for $G_{ee}$
3.3.5 Important Remarks Concerning the Series Expansion
4. A PML-MLFMA FOR THE MODELING OF LARGE PLANAR MICROSTRIP STRUCTURES
4.1 Introduction and Outline
4.2 Formulation of the Technique
4.2.1 The Moment Matrix Written as Interactions Between Elementary Current Sources
4.2.2 Plane Wave Decomposition of the Hankel Function
4.2.3 Core Equation of the PML-MLFMA for Microstrip Structures
4.3 Implementation of the Technique
4.3.1 Construction of the MLFMA Tree
4.3.2 The Matrix-Vector Multiplication
4.4 Some Important Remarks about the Complexity of the PML-MLFMA
4.4.1 Memory and Computational Complexity
4.4.2 Mode Trimming
4.4.3 Determination of the Sampling Rates $2Q_{TX,l}^{n+1}$
4.5 Numerical Results
4.5.1 Validation of the Method
4.5.2 Computational and Memory Efficiency
4.5.3 Application Examples
5. EXTENSIONS AND CONCLUSIONS
5.1 Extensions
5.1.1 Development of a Low-Frequency Algorithm
5.1.2 Combination of the HF- and the LF-Technique
5.1.3 Extension to General Multilayered Structures
5.2 Conclusions

8 Parallel Grid-enabled FDTD for the Characterization of Metamaterials
L. Catarinucci, G. Monti, P. Palazzari and L. Tarricone

1. INTRODUCTION
2. INTRODUCTION TO METAMATERIALS
2.1 DNG Metamaterials
3. NEGATIVE REFRACTION
4. HOW TO SYNTHESIZE A DNG MEDIUM
5. DNG MEDIA APPLICATIONS
6. MODULATED SIGNALS IN A DNG MEDIUM
6.1 Dispersion
6.2 Gaussian Pulse in a DNG Slab
7. NUMERICAL METHODS FOR METAMATERIALS
7.1 Bases for the FDTD Method
7.2 Parallel Grid-Enabled FDTD using MPI
7.3 Efficient Subgridding Technique for Parallel FDTD Algorithms: Variable Mesh FDTD
7.4 FDTD Methods and DNG Materials
7.5 DNG Slabs: Reflection by and Propagation in a DNG Slab

9 A Software Tool for Quasi-Optical Systems
N. C. Albertsen, P. E. Frandsen and S. B. Sørensen

1. INTRODUCTION
2. REQUIREMENTS FOR QUASI-OPTICAL NETWORK DESIGN
3. OUTLINE OF THE SOFTWARE SYSTEM
4. ANALYSIS METHODS
5. USER INTERFACE - THE FRAME EDITOR
6. COMPONENTS AND OBJECTS: THE OBJECT WIZARD
7. COMPLEX COMMANDS: THE COMMAND WIZARD
8. FRAME CONNECTIONS AND 3D MODELLING
9. EVALUATION AND FUTURE EXTENSIONS

10 Cooperative Computer Aided Engineering of Antenna Arrays
A. Esposito, L. Tarricone, L. Vallone and M. Vallone

1. INTRODUCTION
2. CAE OF APERTURE ANTENNA ARRAYS
3. GRID SERVICES AND SEMANTIC GRID
4. SYSTEM ARCHITECTURE
5. THE FRAMEWORK
5.1 Introduction
5.2 Grid Infrastructure
5.3 Encapsulation into Services
5.4 Ontology
5.4.1 Introduction
5.4.2 Service Discovery
5.4.3 Service Orchestration
5.4.4 Service Binding
5.5 Client Application
5.5.1 Introduction
5.5.2 Service Discovery
5.5.3 Service Orchestration
5.5.4 Service Invocation
6. CONCLUSIONS

11 Distributed and Object-Oriented Computational Electromagnetics on the Grid
D. Caromel, F. Huet, S. Lanteri and N. Parlavantzas

1. INTRODUCTION
2. DISTRIBUTED OBJECTS: PROACTIVE
2.1 Basic Model
2.2 Mapping Active Objects to JVMs: Nodes
2.3 Deployment Descriptors
2.4 Group Communications
3. OO DISTRIBUTED FINITE VOLUME SOLVER
3.1 Basic Architecture of the OO Model
3.2 Distribution and Parallelization
4. BENCHMARKS
4.1 Comparison with a Fortran Implementation
4.2 Grid5000 Experiments
5. ON-GOING AND FUTURE WORK
5.1 Application Controlled Deployment
5.2 Enhancing Modifiability Through Components
6. CONCLUSIONS

12 Software Agents for Parametric Computational Electromagnetics Applications
D. G. Lymperopoulos, I. E. Foukarakis, A. I. Kostaridis, C. G. Biniaris and D. I. Kaklamani

1. INTRODUCTION
2. CLASSIFICATION OF PARAMETRIC PROBLEMS IN CEM
2.1 Method-level Parametric Analysis
2.2 Application-level Parametric Analysis
2.3 Population-Based Stochastic Optimisation
3. MOBILE SOFTWARE AGENTS
3.1 The Mobile Agent Paradigm
3.2 Mobile Agents in CEM: The Master-Worker Model
3.2.1 The Master Agent
3.2.2 The Worker Agent
3.3 A Brief Comparison Between MAT and MPI or PVM
4. A WEB-BASED MOBILE AGENT PLATFORM FOR PARAMETRIC CEM MODELING
4.1 Mobile Agent Platform Components
4.2 Communication Mechanisms
4.3 Web-Based Infrastructure
4.3.1 Interaction With the User
4.3.2 Servlets for Front/Back-End Communication
4.4 Conformal Array Modelling: A Modified Method of Auxiliary Sources (MMAS) Approach
4.4.1 Problem Formulation
4.4.2 Overview of the Model Geometry
4.4.3 Agent Deployment Mechanisms
4.4.4 Simulation Results
4.5 Electromagnetic Penetration Through Apertures: A Resonator Method of Moments (MoM) Model
4.5.1 Formulation of the Electromagnetic Problem
4.5.2 Parametric Simulations
4.5.3 Performance Results
5. INTRODUCING GENETIC SOFTWARE AGENTS
5.1 Distributed Genetic Algorithms with Agents
5.1.1 Entity Mappings
5.1.2 Parallel Processing Coordination
5.2 Proposed Architecture
5.2.1 Centralised Model
5.2.2 Decentralised Model
5.2.3 Hybrid Implementations
5.3 Conclusions

13 Web Services Enhanced Platform for Distributed Signal Processing in Electromagnetics
I. E. Foukarakis, D. B. Logothetis, A. I. Kostaridis, D. G. Lymperopoulos and D. I. Kaklamani

1. INTRODUCTION
2. WEB SERVICES IN DISTRIBUTED SAR MODELLING AND SIGNAL PROCESSING
2.1 Platform Architecture
2.2 Server Services
2.2.1 Node Management Service
2.2.2 Input Provider Service
2.2.3 Output Receiver Service
2.2.4 Scheduler
2.3 Node Services
2.3.1 Resource Manager Service
2.3.2 Task Executing Service
2.3.3 Remote Input Service
2.4 Other Issues
2.5 Imaging Radar Signal Processing
2.6 The Simulation Mechanism
2.7 Results and Conclusions

14 Grid-Enabled Transmission Line Matrix (TLM) Modelling of Electromagnetic Structures
P. Russer, B. Biscontini and P. Lorenz

1. INTRODUCTION
2. THE 3D-TLM METHOD
3. MODELLING OF DIELECTRIC MEDIA
4. PARALLELIZATION OF THE TLM METHOD
4.1 Domain Decomposition
4.2 Decomposition of the TLM Algorithm
5. TLM-G: GRID-ENABLED TIME DOMAIN TRANSMISSION LINE MATRIX SYSTEM
5.1 The Components of the TLM-G System
5.2 The Relation Between YATWAD, YATD and the Components of the Globus Toolkit in the TLM-G System
6. ANALYSIS OF THE PERFORMANCE OF THE TLM-G SYSTEM AND EXAMPLES
6.1 The Electromagnetic Performance of the TLM-G System
6.2 A Bowtie Antenna in a TLM-G System
7. THE CIRCULAR CYLINDRICAL CAVITY RESONATOR

Glossary
Index
Contributing Authors

Raed A. Abd-Alhameed
University of Bradford, UK

Niels Christian Albertsen
Informatics and Mathematical Modelling, Technical University of Denmark

Christos G. Biniaris
School of Electrical and Computer Engineering, National Technical University of Athens,
Greece

Bruno Biscontini
Technische Universität München, Munich, Germany

Denis Caromel
INRIA, France

Luca Catarinucci
University of Lecce, Italy

Daniel De Zutter
Ghent University, Belgium

Alessandra Esposito
University of Lecce, Italy

Peter Stuart Excell
University of Bradford, UK


Ioannis E. Foukarakis
School of Electrical and Computer Engineering, National Technical University of Athens,
Greece

Poul Eric Frandsen
TICRA Engineering Consultants, Copenhagen, Denmark

Fabrice Huet
INRIA, France

Dimitra I. Kaklamani
School of Electrical and Computer Engineering, National Technical University of Athens,
Greece

Antonis I. Kostaridis
School of Electrical and Computer Engineering, National Technical University of Athens,
Greece

Stefane Lanteri
INRIA, France

Dyonisios B. Logothetis
School of Electrical and Computer Engineering, National Technical University of Athens,
Greece

Petr Lorenz
Technische Universität München, Munich, Germany

Dimitrios G. Lymperopoulos
School of Electrical and Computer Engineering, National Technical University of Athens,
Greece

Eric Michielssen
University of Illinois at Urbana-Champaign, USA

Giuseppina Monti
University of Lecce, Italy

Frank Olyslager
Ghent University, Belgium

Paolo Palazzari
ENEA-HPCN, Italy

Nikos Parlavantzas
INRIA, France

Peter Russer
Technische Universität München, Munich, Germany

Stig Busk Sørensen
TICRA Engineering Consultants, Copenhagen, Denmark

Luciano Tarricone
University of Lecce, Italy

Laura Vallone
University of Lecce, Italy

Mariangela Vallone
University of Lecce, Italy

Dries Vande Ginste
Ghent University, Belgium
Preface

The way electromagnetic (EM) circuits and antennas are designed, tuned,
trimmed and realized has changed deeply in the last decades. The continuous
growth in the variety of applications and services and in the required
performance, together with the reduction of production times (in other words,
the increasing complexity of the tasks microwave (MW) engineers are typically
involved in), has led to a systematic use of computers and numerical
methods in the daily research and development workflow.
Meanwhile, Information and Communication Technologies (ICT) have been
progressing at an extraordinary and constant pace, making available ever more
powerful computing platforms, systems and software methodologies. These are
changing the way we conceive Computer Aided Design (CAD) and Computer Aided
Engineering (CAE) in every scientific area, and the EM context is no
exception.
As a consequence, the EM community is paying more and more attention, even
from the educational point of view, to the evolution of ICT, since its
potential impact on the efficiency, effectiveness and quality of EM devices
is evident. The very recent past has seen some new and important keywords
establish themselves in the ICT area. Apart from the most common (and
overused) ones, such as those related to the Internet, we want to recall
here the concept of system integration, intended as the capability of
generating added-value products by assembling pre-existing tools, and that
of distributed and cooperative computing, intended as the capability of
exploiting, in a concerted manner, computational or software resources that
are not physically concentrated in one single site or platform. Within these
very broad and multifaceted concepts, many other focal issues can be found,
in many cases representing cutting-edge topics, such as software
interoperability and reusability, high performance and grid computing,
resource and process sharing, etc.
The keywords listed above are more than mere technological or scientific
turning points, as they deeply affect the way people work and plan their
daily activities (consider, for instance, the increasingly widespread
adoption of outsourcing, or the huge number of cooperative projects on a
geographical scale).
On the other hand, the EM community is still far from being fully aware of
the impressive impact that recent progress in ICT could have on it. This is
partly due to a physiological delay in the assimilation of complex and
unfamiliar approaches, and partly to a persistent lack of bridging tools
between the EM world and the ICT one. This book is intended to bridge this
gap. On one side, EM researchers can identify new and promising ICT tools
(already available, or to be consolidated in the immediate future) that are
expected to significantly improve their daily EM investigations. On the
other side, ICT experts can find here appealing ideas and scientific areas,
perhaps not explored or considered until now, where ICT can play a major and
innovative role.
Consequently, EM researchers can find in this book interesting answers
from ICT to some recurring questions, for example how to get more computing
power, or how to improve the capability to manage complex projects with
special software solutions (Chapters 6 to 9). Furthermore, they can also see
sample applications where suitable EM theoretical formulations, or
algorithms, make EM numerical methods more amenable to taking advantage of
new ICT tools (Chapters 10 to 14). These application chapters are preceded
by an introductory part (Chapters 1 to 5) containing the foundations of
Information Technology, so that even beginners in the ICT area are
introduced to the basic concepts that play a key role in later parts of the
book.
In more detail, the book is structured as follows. Chapter 1 provides a
very simple taxonomy of terms and an introduction to the basic concepts of
parallel and distributed computing. As for parallel computing, special
attention is devoted to MPI, because of its recent establishment as a
standard. The importance of MPI is confirmed by its wide use in grid
computing environments, as demonstrated in Chapter 5 and in Chapter 8. As
for distributed computing, Chapter 1 deals with two technologies: the
low-level technique named RPC and the powerful mobile agent paradigm. The
former is a well-known technique widely adopted in the distributed computing
arena, as confirmed by the availability of several libraries for its
implementation. Among them, we recall RMI and XML-RPC, which are widely used
throughout the book (Chapters 10, 11, 12, 13). The latter (mobile agents) is
the core of Chapter 12 and is one of the most promising technologies of the
moment. Finally, Chapter 1 introduces the basic concepts of XML, the
meta-language for exchanging information through the Web, which provides
the foundations for Web Services and the latest generation of Grid
Computing frameworks (including Semantic Grids). An EM application adopting
such technologies is proposed in Chapter 10.
Other relevant technologies related to distributed computing are dealt
with in Chapter 2, because of their relationship with object oriented (OO)
concepts, and in Chapters 4 and 5, which address the two hot technologies of
Web Services and Grid Computing respectively. In more detail, Chapter 2
introduces the basic concepts and terms of object oriented programming and
of the OO software design model. The chapter focuses on the Web-oriented OO
language par excellence: Java. A description of the language is provided,
together with an overview of Java-based object oriented distributed
frameworks, i.e. Java RMI and Java mobile agents. The former is widely used
in Chapter 11, the latter is adopted in Chapters 12 and 13.
Chapter 3 gives an overview of the basic concepts behind the Semantic Web
(and consequently Semantic Grids). The Semantic Web encodes knowledge by
means of an appropriate language, so that electronic agents can search for
information on the basis of human-readable queries. In this way, the
discovery and choreography of resources are automated and made easier, even
in complex environments such as Internet-wide grid frameworks. As shown in
Chapter 10, which describes a practical EM application benefiting from
Semantic Web technologies, this can give huge momentum to the solution of
complex multidisciplinary problems.
Chapter 4 introduces the concepts behind Web Services (WS) and the main
standards supporting them. WS propose a new model for implementing
applications, one that promotes reusability and cooperation. An example of
the application of WS to EM is provided in Chapter 13, whilst Chapter 10
shows the potential of integrating WS with Grid technologies.
Finally, Chapter 5 shows how grid computing has recently embraced WS
concepts, driving a fusion of WS and grids. In this way, the sharing of
standard hardware and software computing resources is combined with the
potential, offered by WS technologies, of orchestrating autonomously
developed software components. Examples of these capabilities are provided
in Chapters 8, 10 and 14.
Chapter 6 opens the part of the book devoted to EM applications and
addresses the important issue of hybrid numerical methods. A wide review of
EM numerical methods is given, so that the importance of method
hybridization is made clear. Sample applications are proposed, and topics
that are crucial for effective hybridization, such as domain partitioning,
are highlighted.
Chapter 7 proposes an interesting approach in which suitable EM
theoretical formulations and algorithmic solutions are used to reduce
numerical complexity. The use of Perfectly Matched Layers (PML) leads to a
series representation for the Green's functions of planar circuits. The
terms in this series allow the application of a Multilevel Fast Multipole
Algorithm (MLFMA) to the analysis of very large planar structures, or of
small circuits with high geometric detail, with impressive computational
performance.
Chapter 8 is an example of how a standard numerical method (FDTD) can be
implemented so as to take advantage of parallel computing platforms, even
at very low cost (consider, for instance, computational grids).
Furthermore, suitable algorithmic solutions make it possible to combine this
advantage with the use of variable meshes. The resulting parallel
grid-enabled variable-mesh FDTD tool can play a major role in the
investigation of the hot topic of metamaterial properties, with a special
focus on finite-bandwidth signal propagation inside double-negative
materials.
Chapter 9 proposes a software tool for designing and analyzing complex
quasi-optical systems. Along with a general discussion of the available
analysis methods, particular emphasis is put on the user interface and other
relevant software components, demonstrating how an advanced use of
graphical facilities can generate substantial added value.
Chapter 10 is devoted to the use of Grid Computing, and more specifically
of the Semantic Grid, as a tool for cooperative computer-aided engineering
(CAE) of antenna arrays. The CAE of aperture antenna arrays is a good
example of a complex application that merges different skills and
scientific knowledge, with a consequent potential demand for cooperation
among several research groups. The adoption of Semantic Grids answers this
demand. It also allows the CAE environment to be implemented in a
service-oriented framework, where CAE components are encapsulated in grid
services and can be exploited remotely through the grid. The chapter also
outlines the attractiveness of ontologies as an immediate and future
turning point.
Chapter 11 is further evidence of the relevance that grid computing is
gaining in the EM community. More specifically, this chapter suggests an
interesting coupling between grid technologies and object oriented
methodologies, with the final goal of developing high performance numerical
methods for the solution of systems of PDEs (Partial Differential
Equations). An open source middleware for the grid, featuring distributed
objects and components, is used to design and implement an object-oriented
time domain finite volume solver on unstructured meshes for the 3D
modelling of EM propagation.
Chapter 12 proposes the application of novel networking software
technologies to distributed parallel computational electromagnetics (CEM).
Web services and mobile agents are used to solve demanding parametric CEM
problems, resulting in platform-independent concurrent computing solutions.
Conformal array modelling with the Method of Auxiliary Sources, and the
study of EM penetration through apertures with the Method of Moments, are
proposed as testbeds. In addition, Genetic Software Agents are introduced,
i.e. mobile agent entities able to carry out Genetic Search Optimisations in
a collaborative scheme. Their prospective use is outlined.
Building on the previous chapter, Chapter 13 extends the concepts of Web
services to develop an enhanced distributed platform, and also discusses
architectural issues. The resulting application is service-oriented and its
implementation is based on the Simple Object Access Protocol (SOAP)
specifications. The platform is tested on a microwave imaging problem using
a coherent Synthetic Aperture Radar (SAR) sensor.
Finally, Chapter 14 proposes the use of grid computing for the
high-performance implementation of the Transmission Line Matrix (TLM)
method. The TLM algorithm is parallelized by segmentation of the TLM state
vector, whilst system identification and spectral analysis approaches allow
a considerable reduction of the numerical effort.
In conclusion, the book presents some ICT concepts that are destined to
play a major role in the current and future EM context. These concepts are
introduced, at a beginner level, in the first part of the work. Their
actual impact on real EM problems is then outlined in the second part of
the book, through a wide variety of applications.
Acknowledgments

The editors warmly thank Barbara Pici for her valuable editing work and
Laura Vallone for her contribution to a global revision of the book.

Chapter 1
PARALLEL AND DISTRIBUTED
ENVIRONMENTS

A. Esposito
University of Lecce, Italy

Abstract: A very brief summary of the basic concepts related to parallel and
distributed computing is provided. Frameworks and enabling technologies
referred to later in the book are explained here in very simple and
introductory terms. Particular emphasis is given to XML because of the
important role it plays in the Web Computing and Service-oriented Grid
Computing domains.

Key words: Multiprocessing; Multicomputer; Cluster; Master-worker; Domain
decomposition; Thread; Distributed memory; Shared memory; MPI; RPC; Mobile
agents; XML; Namespace; Schema.

1. INTRODUCTION

Distributed and parallel computing is a very large discipline, including
several diverse technologies, models and languages. We have a distributed
computing system every time a coordinated use of physically autonomous
resources is made. Distributed systems range from intra-departmental usage
of print servers to advanced inter-organization cooperation via the
Internet. The technological issues involved are many, ranging from the
reliability and performance of the communication network to the optimal
balancing of the usage of shared resources. This chapter does not aim at
dealing with such an extended area in an exhaustive way, nor does it intend
to provide a complete survey of the available enabling technologies. Several
papers and books provide this, and some of them are listed in the
bibliography. Rather, this chapter is, similarly to the other tutorial
chapters of this book, intended as a very basic support for understanding
terms and concepts cited in the application-oriented chapters.

L. Tarricone and A. Esposito (eds.), Advances in Information Technologies for Electromagnetics, 1–17.
© 2006 Springer. Printed in the Netherlands.

First (Section 2), a very simple taxonomy of terms related to parallel and
distributed computing is provided. Then, Section 3 presents some basic
concepts and technologies related to parallel programming. The special
focus on MPI is due to its recent establishment as a standard for parallel
programming in grid computing frameworks. Finally, simple formulas for
the performance assessment of parallel programs are introduced.
Section 4 is devoted to two technologies supporting distributed computing:
the low-level technique named RPC and the powerful mobile agent
paradigm. Other relevant technologies related to distributed computing
are dealt with in Chapter 2, because of their relationship with object
oriented concepts, and in Chapters 4 and 5, each of which is devoted to one
of the two hot technologies of Web Services and Grid Computing.
Section 5 is dedicated to XML, a meta-language for the Web, which
provides the foundations for Web Services and the latest generation of Grid
Computing frameworks.

2. BASIC CONCEPTS

Distributed computing is the capability of making a coordinated use of
physically autonomous resources.
Distributed computing can take place with resources belonging to a
single machine or with resources belonging to autonomous machines.
In the former case, we talk about multiprocessing. Multiprocessing is the
capability of exploiting the computing power offered by a multiplicity of
CPUs connected by a very fast communication line and belonging to a single
machine (multiprocessor). Memory can be shared between the CPUs or
distributed among them.
In the latter case, we talk about multicomputer systems. A multicomputer
is made up of computers interfaced by a local or wide area network. The
simplest case of multicomputer is the cluster, made up of (generally)
homogeneous machines interconnected by a local network. Clusters are
centrally administered. A much more complex multicomputer distributed
system is the computational grid (see Chapter 5). Computational grids
generally consist of a multiplicity of heterogeneous machines belonging to
different organizations and being interconnected by a wide area network.
Grid administration is usually decentralized.
Parallel computing is a more focused discipline than distributed
computing. It consists in the simultaneous execution of the same task on
multiple processors and is aimed at improving application performance. A
simple example of parallel code is the parallel execution of the set of
instructions included in a loop. Performance improvement depends on a
variety of factors: the speed of the interconnection network, the
amenability of the program to parallelization (every program includes
pieces which are inherently sequential), the amount of data to be exchanged
among processors, etc. Like distributed computing, parallelism can be
exploited on multiprocessors or on more or less tightly coupled
multicomputer systems.
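
As an illustration of the loop example mentioned above, the following minimal Python sketch runs the body of a loop over a pool of workers. It is only an assumption-laden illustration, not code from the book: the function names (square, sequential_loop, parallel_loop) are hypothetical, and a thread pool from the standard library stands in for the multiple processors. In CPython, threads do not speed up CPU-bound work, so the sketch shows the structure of a parallel loop rather than its performance.

```python
from concurrent.futures import ThreadPoolExecutor


def square(x):
    """Loop body: each iteration is independent of all the others."""
    return x * x


def sequential_loop(values):
    # Ordinary loop: iterations run one after the other.
    return [square(v) for v in values]


def parallel_loop(values, workers=4):
    # Executor.map hands the iterations to the pool and returns the
    # results in the same order as the input values.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(square, values))


data = list(range(8))
loop_results_match = sequential_loop(data) == parallel_loop(data)
```

The loop can be parallelized this simply only because no iteration depends on the result of another; a loop with such dependencies would belong to the inherently sequential part of a program.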

3. PARALLEL PROGRAMMING

3.1 Introduction

The choice of the parallel model strongly depends on the problem
domain. For example, so-called parametric programming is suited to
the repeated solution of the same application with different modeling
parameters or different application data. This need often arises in
scientific research projects, where benchmarking, experimentation or the
generation of statistical data are common activities. Parametric programming
consists in solving the same problem in parallel on multiple processors with
different input data.
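
A parametric run of this kind can be sketched in a few lines of Python. The example below is a hedged illustration under stated assumptions: the "application" is a hypothetical stand-in (the resonant frequency of an ideal LC circuit, f = 1 / (2π√(LC))), the names resonant_frequency and parametric_sweep are invented for the sketch, and a standard-library thread pool plays the role of the multiple processors.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def resonant_frequency(params):
    """Hypothetical application: resonant frequency (Hz) of an ideal LC circuit."""
    inductance_h, capacitance_f = params
    return 1.0 / (2.0 * math.pi * math.sqrt(inductance_h * capacitance_f))


def parametric_sweep(parameter_sets, workers=4):
    # The same problem is solved once per parameter set; the runs are
    # independent, so they can be distributed over the pool freely.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(resonant_frequency, parameter_sets))


# Same inductance (1 uH), three different capacitances (1, 4 and 9 pF).
sweep = [(1e-6, c) for c in (1e-12, 4e-12, 9e-12)]
frequencies = parametric_sweep(sweep)
```

Since the runs never exchange data, this is the least demanding form of parallelism in terms of communication between processors.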
Several scientific problems are amenable to another simple parallel
approach, the so-called domain decomposition. Domain decomposition
partitions the data domain into several parts. Each part is processed in
parallel by a different processor. In this case, the master-worker model may
be a suitable programming paradigm. According to it, a program, called
master, is responsible for partitioning the domain and for the initializations.
The master communicates with a number of workers, which operate in
parallel and are responsible for performing all the computations needed on
the problem sub-domain assigned to them. At the end, the master
gathers the results from the workers, post-processes them if needed, and
outputs the problem solution (Fig. 1-1).
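
The master-worker scheme described above can be sketched as follows. This is
an illustrative example only: the functions partition, worker and master are
hypothetical names, the "problem" is a simple sum of squares, and threads
stand in for the processors of a real parallel machine.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(domain, parts):
    """Master, step 1: split the data domain into contiguous sub-domains."""
    size, extra = divmod(len(domain), parts)
    chunks, start = [], 0
    for i in range(parts):
        end = start + size + (1 if i < extra else 0)
        chunks.append(domain[start:end])
        start = end
    return chunks


def worker(subdomain):
    """Worker: all the computation on the assigned sub-domain."""
    return sum(x * x for x in subdomain)


def master(domain, n_workers=3):
    chunks = partition(domain, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partial = list(pool.map(worker, chunks))  # workers run concurrently
    return sum(partial)  # master gathers and combines the partial results


if __name__ == "__main__":
    print(master(list(range(10))))
```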
Parallelism can be exploited at a single level or at multiple levels. The
simplest case of single-level parallelism is parametric programming.
Multi-level parallelism nests parallel code on several layers. The simplest
case is a parametric application in which each run is itself parallel code.
Multi-level parallelism can be obtained by creating autonomous threads which
spawn parallel code.
A thread is a sort of process with limited autonomy. Both a thread and a
process are streams of instructions executed by the processor. The
process is a fully self-contained and autonomous entity, as it is an instance
of a program in execution. A process can spawn one or more threads, i.e.
streams of instructions programmed to run independently within the process
they belong to. Their limited autonomy is due to the fact that they share a
number of critical resources, such as files and memory data, with the parent
process. Threads are a primitive form of parallelism and are very common when
a program needs to perform tasks which are intrinsically independent, such as
computation and interaction with the user. For example, in graphical user
interfaces, committing computation and interaction to separate threads can
greatly reduce the waiting time for end-users.
Another classification of parallel programs is based on the way
processors communicate with one another to exchange data. Processors may
communicate implicitly by accessing a common storage area, or
explicitly by exchanging messages. The former case is called the shared
memory paradigm. The latter is called the message passing paradigm. The shared
memory paradigm is common in the so-called shared memory architectures,
where the processors physically share a storage medium, or can be obtained
with suitable software simulating a shared memory behavior on distributed
memory systems. The message passing paradigm, on the contrary, is
predominant in loosely coupled distributed architectures, such as grids, where
the machines are physically distant and often heterogeneous.
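
The difference between the two paradigms can be illustrated with a small
sketch. In this example (an assumption made for illustration, not a
description of any specific system) threads play the role of processors: in
the first function they communicate implicitly through a shared variable
protected by a lock, in the second they communicate explicitly by putting
messages into a queue.

```python
import queue
import threading


def _run(target, chunks):
    """Start one thread per chunk and wait for all of them to finish."""
    threads = [threading.Thread(target=target, args=(c,)) for c in chunks]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


def shared_memory_sum(values, n=4):
    """Implicit communication: all threads update one common variable."""
    total = [0]
    lock = threading.Lock()  # protects the shared variable

    def work(chunk):
        partial = sum(chunk)
        with lock:
            total[0] += partial

    _run(work, [values[i::n] for i in range(n)])
    return total[0]


def message_passing_sum(values, n=4):
    """Explicit communication: each worker sends its result as a message."""
    mailbox = queue.Queue()

    def work(chunk):
        mailbox.put(sum(chunk))  # send

    _run(work, [values[i::n] for i in range(n)])
    return sum(mailbox.get() for _ in range(n))  # receive n messages
```

Both functions return the same value; what changes is where the coordination
effort lies (locking shared data versus composing and collecting messages).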
The choice between the two paradigms depends both on the problem
characteristics and on the system architecture. In both cases, parallel
programming is done by means of appropriate languages, which allow the
programmer to indicate how and when the problem has to be partitioned.
OpenMP [OpenMP, 2005] is an open standard for the shared memory paradigm.
It lets the programmer specify which portions of the code have to be
parallelized and the runtime characteristics of a program. Message Passing
Interface (MPI) [MPI, 2005] is the reference standard for message passing and
is overviewed in the next subsection.
[Figure 1-1: diagram of a master holding sub-domains A, B and C, and three workers]

Figure 1-1. Master-worker model. The master is responsible for partitioning the problem
domain and distributing subdomains to workers. Workers elaborate in parallel the subdomain
they have been assigned and return results to the master, which elaborates the returned results
in a form acceptable for the end-user.

3.1.1 MPI

MPI is a set of specifications for implementing parallel programs in the
message passing paradigm.
Message passing views the parallel program as made of autonomous
processes interacting via messages. MPI defines interfaces for controlling
communication among processes. Both point-to-point and group
communication are included. Point-to-point communication takes place between
two processes. It consists of a send operation and a matching receive
operation between the two communicating partners. It can be blocking or
non-blocking, depending on whether the calling process waits for the
operation to complete (for a send, until the message buffer can safely be
reused) or continues immediately and checks for completion later. Group
communication allows a process to send messages to (or receive messages
from) all members of a set (group) of processes at once.
Simple examples are broadcasting, in which a process sends the same message
to all members of a group, and scattering, in which data are distributed among
the members of a group. When group communication takes place it is often
necessary to introduce some synchronization among group members. This is
done via barrier synchronization, which makes each process wait until all the
other members of the group have reached the same point.
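
The following sketch emulates the semantics of scattering, barrier
synchronization and gathering using Python threads. It is not MPI code and
makes no use of an MPI library: threads stand in for MPI processes, and the
name scatter_compute_gather is hypothetical. In a real MPI program the
corresponding operations are provided by library routines such as MPI_Scatter,
MPI_Barrier and MPI_Gather.

```python
import threading


def scatter_compute_gather(data, n_procs=4):
    barrier = threading.Barrier(n_procs)
    # "Scatter": rank r receives every n_procs-th item starting from r.
    chunks = [data[rank::n_procs] for rank in range(n_procs)]
    partial = [None] * n_procs
    all_done_seen = [False] * n_procs

    def process(rank):
        partial[rank] = sum(chunks[rank])  # local computation
        barrier.wait()  # nobody proceeds until every member has arrived
        # After the barrier, every partial result is guaranteed to exist.
        all_done_seen[rank] = all(p is not None for p in partial)

    threads = [threading.Thread(target=process, args=(r,))
               for r in range(n_procs)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # "Gather": the partial results are collected and combined.
    return sum(partial), all(all_done_seen)
```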
Several free and proprietary implementations of MPI are available.
An open source implementation of MPI is MPICH [MPICH, 2005], which
can be ported to several distributed systems, provided that it has been
properly configured. MPICH also offers a grid-enabled configuration, called
MPICH-G2. Implementing an application in MPI guarantees high
portability from multiprocessors to clusters and up to loosely coupled
systems, such as computational grids.

3.2 Performance Assessment

Once a parallel program has been implemented, it is fundamental to
assess how much its performance improves as the number of processors
increases (in other words, to assess its scalability). This is done by
measuring the speed-up and the efficiency. Speed-up compares the
performance of the program operating serially (i.e. on one processor) with
the performance obtained by operating N processors in parallel. It is defined
as the ratio:

S = T(1)/T(N)

where T(1) is the time taken by the serial version, and T(N) is the time
taken by the parallel version when N processors operate in parallel.
Efficiency is the ratio:

E = S/N

In an ideal world, efficiency is expected to be constant (equal to 1) as
speed-up is expected to increase linearly with the number of processors. In
concrete situations the speed-up saturates when the number of processors
exceeds a value depending on problem characteristics, the parallel algorithm,
system architecture, data, network conditions, etc. Consequently, efficiency
approaches zero when the number of processors tends to infinity.
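
As a worked illustration of the two definitions, the sketch below computes S
and E from execution times. The timings are not measurements: they are
generated with Amdahl's law, a model assumed here only to show the saturation
effect, with a hypothetical serial fraction of 5% and T(1) = 100 s.

```python
def speed_up(t_serial, t_parallel):
    """S = T(1) / T(N)."""
    return t_serial / t_parallel


def efficiency(t_serial, t_parallel, n):
    """E = S / N."""
    return speed_up(t_serial, t_parallel) / n


def amdahl_time(t_serial, serial_fraction, n):
    """Assumed model (Amdahl's law): only the parallel part scales with n."""
    return t_serial * (serial_fraction + (1.0 - serial_fraction) / n)


if __name__ == "__main__":
    t1 = 100.0  # hypothetical serial time in seconds
    for n in (1, 2, 4, 16, 64, 1024):
        tn = amdahl_time(t1, 0.05, n)
        print(n, round(speed_up(t1, tn), 2), round(efficiency(t1, tn, n), 3))
```

Under this model the speed-up can never exceed 1/0.05 = 20, however many
processors are added, so the efficiency decreases towards zero as N grows.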

4. DISTRIBUTED SYSTEMS

4.1 Introduction

Distributed computing deals with the development of applications that
exploit the power of autonomous resources interconnected by networks.
Distributed computing spans from cluster computing, available in centrally
administered local networks, to grid computing, mostly operating in
wide-area networks linking multiple organizations. In the World Wide Web, Web
Services and Grid Computing (which are progressively merging)
implement globally distributed applications.
Grid computing and Web Services are truly revolutionizing the way
applications are modeled, built and exploited. This is also witnessed by
the large number of EM applications described in this book that make use of
these technologies. For this reason, two tutorial chapters are entirely
devoted to them: Chapter 4 deals with Web Services whilst Chapter 5
concentrates on Grid Computing.
Moreover, a special focus is reserved for object-oriented technologies for
distributed computing, whose description requires an understanding of
concepts and terms related to object orientation, a paradigm which is giving
great impulse to interoperation and software reuse. For this reason, the
distributed computing technologies that embrace the OO philosophy are dealt
with in Chapter 2.
Several other technologies and key points related to distributed computing
could be treated in this section, but we have chosen to prioritize
simplicity over completeness. For this reason, only two illustrative
technologies are introduced here, i.e. RPC and mobile agents, as they are
used and cited later in the book. A whole section (Section 5) is devoted to
XML because of its strong impact on other technologies described later in the
book.

4.2 RPC

Remote Procedure Call (RPC) is a technique for constructing distributed,
client-server applications. It is based on extending the notion of conventional
procedure calling. With RPC the calling and the called process may be on the
same system (as happens with conventional procedure calling), or they may be
on different systems connected by a network. In both cases, an RPC is
very similar to a traditional function call. As in a function call, the calling
arguments are passed to the remote procedure and the caller waits for a
response from the remote procedure, i.e. RPC is a synchronous operation. In
order to achieve asynchronous operation, the client application may initiate
an RPC call in a separate thread and then proceed with other processing.
RPC programming is facilitated by the availability of libraries, such as
DCE, Java RMI (see Chapter 2) and XML-RPC (see Section 5.1.5), which
hide from the programmer the details of the interface with the network and
isolate the application from the physical and logical elements of the data
communication mechanism.
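
A minimal sketch of an RPC exchange is given below, using the XML-RPC modules
of the Python standard library. The remote procedure wavelength_m and the
function demo are hypothetical names introduced for this example; for
simplicity, client and server run in the same program and talk through the
local loopback interface.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer


def wavelength_m(frequency_hz):
    """Hypothetical remote procedure: free-space wavelength in metres."""
    return 299_792_458.0 / frequency_hz


def demo(frequency_hz=900e6):
    # Server side: expose the procedure on a free local port (port 0).
    server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
    server.register_function(wavelength_m)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    url = "http://127.0.0.1:%d/" % server.server_address[1]
    try:
        with ServerProxy(url) as proxy:
            # Synchronous call: it looks like a local call and blocks
            # until the reply arrives.
            sync_result = proxy.wavelength_m(frequency_hz)
            # Asynchronous use: issue the call from a separate thread.
            with ThreadPoolExecutor(max_workers=1) as pool:
                future = pool.submit(proxy.wavelength_m, frequency_hz)
                # ... the client could do other work here ...
                async_result = future.result()
        return sync_result, async_result
    finally:
        server.shutdown()      # stop serve_forever()
        server.server_close()  # release the socket


if __name__ == "__main__":
    print(demo())
```

The client code never deals with sockets or message formats: the library
hides them behind what looks like an ordinary function call.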

4.3 Mobile Agent Framework

The mobile agent framework emerged as an open and decentralized
model, suited to dynamic computation over the Internet. A mobile agent
application can be designed and implemented as a collection of mobile
agents. A mobile agent is typically defined as an autonomous program,
which can migrate during its execution from one host to another in a
heterogeneous network. A mobile agent is autonomous because it is
conceived as a separate computational unit, with its own thread of execution,
independent of other units. It can migrate during its execution by suspending
its execution at the hosting machine, transferring its current state (data) and
code to a different host and resuming its execution there. Mobile agents can
communicate with other agents on the local host or on remote machines.
The basic motivation behind agent migration is to access and process
the information held by remote resources: instead of remotely accessing
the resource, the agents move towards it and access it locally.
This model has numerous appealing features. First of all, it has the
potential to reduce network traffic. By performing time and traffic-consuming
operations locally, mobile agents can reduce bandwidth consumption. On the
other hand, the process of migration consumes computing and network
resources. The choice between migration and remote interaction (e.g. via
RPC) depends on the network and computational conditions. Another
interesting feature is the independence of the mobile agent from the
dispatching host. A mobile agent can continue functioning even if its home
is unavailable or unreachable, and send back results upon reconnection. This
feature is particularly useful for portable computing devices. Finally, mobile
agents allow the on-the-fly deployment of software components. Software
components can be implemented in the form of mobile agents or can be flexibly
and dynamically embedded with the support of mobile agents.
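
The migration cycle (suspend, transfer state, resume) can be sketched in a few
lines. The example is a simulation under simplifying assumptions: hosts are
represented by entries of a dictionary, only the agent state is serialized
(as JSON) while the transfer of code is omitted, and all names
(PeakFinderAgent, host_run, tour) are hypothetical.

```python
import json


class PeakFinderAgent:
    """Carries only a small state: best value so far and hosts visited."""

    def __init__(self, peak=None, visited=None):
        self.peak = peak
        self.visited = list(visited or [])

    def execute(self, host_name, samples):
        local_peak = max(samples)  # the data are accessed locally
        if self.peak is None or local_peak > self.peak:
            self.peak = local_peak
        self.visited.append(host_name)

    def suspend(self):
        """Serialize the current state before leaving a host."""
        return json.dumps({"peak": self.peak, "visited": self.visited})

    @classmethod
    def resume(cls, state):
        """Rebuild the agent from its transferred state on the new host."""
        return cls(**json.loads(state))


def host_run(host_name, samples, state):
    agent = PeakFinderAgent.resume(state)
    agent.execute(host_name, samples)
    return agent.suspend()


def tour(hosts):
    state = PeakFinderAgent().suspend()
    for name, samples in hosts.items():
        state = host_run(name, samples, state)  # "migration" to the next host
    return PeakFinderAgent.resume(state)
```

Only the compact agent state travels between hosts, whereas the sample data
never leave the host that owns them; this is the traffic reduction mentioned
above.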

5. THE WEB

The Web was originally conceived as a vehicle for exchanging
information, stored in documents written in a simple language, the so-called
Hypertext Mark-up Language (HTML).
The original Web follows the well known client-server model (Fig. 1-2).
According to this model, HTML documents are published on server
platforms, which are equipped to send requested documents through the
network. Information consumers access the information contained in remote
documents from a client machine. They contact the server
platforms by using the Web user interface called browser. Browsers interpret
HTML mark-ups and render the document in graphical form.
This is a static model, where end-users have the substantially passive role
of reading the information contained in HTML documents. Nowadays, the
Web follows an interactive model, where end-users can enter and
manipulate data through the Web.
In the interactive Web, documents are generated on the fly, based on user
input or on data stored in back-end databases. The Java language allows
even more: it makes it possible to implement programs that run on client or
server platforms at a simple click in the browser window. A Java applet is a
program which is stored on the server platform, is requested by a client
through the browser and is then executed on the client. A Java servlet is a
program which enriches a Web server with a computational capability.
Moreover, communication among remote software entities is facilitated by the
availability of a flexible and portable language, the so-called eXtensible
Markup Language (XML), which makes it possible to interface heterogeneous
components.
Thanks to the improved Web, end users can perform calculations, insert
data and modify the appearance of Web documents. Similarly, Web servers can
access and update data stored on other platforms, launch applications,
contact third-party nodes and interact with clients. The improved capabilities
of both clients and servers gradually reduce the difference between them: in
the revised Web each node can behave interchangeably as client or server,
depending on contingent needs. In other words, the Web is evolving from
the original, static client-server architecture to a more flexible peer-to-peer
architecture (Fig. 1-3), which envisions the Web as an assemblage of peer
nodes, which can exchange data, programs and messages to achieve a common
goal.
The Web is thus moving more and more from a human-centric
vision, where most activities are decided and started by humans, towards an
application-centric vision, where activities are more and more delegated to
software entities able to exchange information through the Net [Cerami,
2002].
Due to the great potential of such a revolutionized Web, several
chapters of this book deal with Web-compliant applications, able to
exploit in a coordinated way resources disseminated through wide area
networks. Chapter 4 and Chapter 5 are devoted to the most promising and
advanced technologies, i.e. Web Services and Grid Computing. The enabling
technologies which paved the way to such a transformation, i.e. XML and
Java, are dealt with respectively in the current chapter (XML being one of
the building blocks for Web-compliant distributed frameworks, including
Web Services and in some respects Grid Computing) and in the following
one (Java being an object-oriented language).


Figure 1-2. Traditional Web model. It is a client-server model, where clients request
documents stored on server platforms.

Figure 1-3. The Web is evolving towards an automated model, where peer nodes exchange
information to achieve a common goal. Humans use browsers to start and control distributed
processing and to gather results.

5.1 XML

5.1.1 Introduction

As said before, the traditional Web is based on the distribution through
the network of documents coded in the HTML language. HTML is a simple
language well suited to the visualization of hypertexts and multimedia
documents. It is a collection of predefined tags (also called mark-ups) used
by the browser to properly visualize the contents of the document. For
example, the following HTML document:

<p align="center">This is a <em>panel</em> antenna working in the
frequency range of 870-960 MHz</p>

uses the tags named p and em to express respectively that the text
must be centered and that the word "panel" must be emphasized.
No relationship exists between document contents and HTML tags:
HTML provides no indication of either the structure of the
document contents or their semantics. An immediate effect of this is the
way in which search engines operate. They perform a blind full text search,
with no capability of understanding the document structure or of
placing document contents in the appropriate domain. As a result, when the
end-user inputs a word for a search, the engine often returns a long list of
documents, most of which are unrelated to the user's domain of concern.
The XML language [Harold, 2002] has been defined to improve on the
HTML language, by considerably increasing its flexibility and providing the
capability to create documents with structured information, i.e. documents
including indications about the semantic role played by each part of the text.
This is done by allowing information providers to specify their own tags:
document owners are free to define their own set of mark-ups, so as
to represent both the structure and the meaning of document contents.
For example, the HTML document reported above may be coded in
XML in the following form:

<antenna>
<type>panel Antenna</type>
<frequency_range> 870-960 MHz </frequency_range>
</antenna>

HTML tags have been replaced by domain-related mark-ups which
describe in a machine-processable format the components of the document
contents (antennas, their type and their frequency range). In this way, an
XML-compliant application, i.e. an application aware of XML rules (see the
next subsection) and of the document's domain of concern, can extract the
meaningful parts of the document (for example the frequency range) and
process them.
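
As an illustration of how an application might extract and process a
meaningful part of the document, the sketch below parses the antenna document
with the ElementTree module of the Python standard library. The function name
frequency_range_mhz and the assumption that the range is always written as
"low-high unit" are introduced only for this example.

```python
import xml.etree.ElementTree as ET

DOCUMENT = """
<antenna>
<type>panel Antenna</type>
<frequency_range> 870-960 MHz </frequency_range>
</antenna>
"""


def frequency_range_mhz(xml_text):
    root = ET.fromstring(xml_text)
    # Select the element by its domain-related tag name.
    text = root.findtext("frequency_range").strip()  # "870-960 MHz"
    band, _unit = text.split()
    low, high = (float(v) for v in band.split("-"))
    return low, high


if __name__ == "__main__":
    print(frequency_range_mhz(DOCUMENT))
```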

5.1.2 XML Fundamentals

An XML document is a simple text file, and is therefore portable and easily
understood by humans and programs.
XML documents are made of elements. Elements are delimited by a start-
tag and an end-tag. For example, the following is an element:

<frequency_range> 870-960 MHz </frequency_range>

The content of the element is the string 870-960 MHz, the start-tag is
<frequency_range>, the end-tag is </frequency_range>.
Elements can have attributes. An attribute is a name-value pair attached
to the start-tag of the element; the value must be enclosed in quotes. For
example:

<antenna name="K730691">

associates the attribute named name, having the value K730691, with
the element named antenna.
Elements can be nested to form a tree. For example, the following
document has the tree structure shown in Fig. 1-4.

<antenna name="K730691">
<type>Panel Antenna</type>
<technical_data>
<frequency_range> 870-960 MHz </frequency_range>
<polarization> Vertical </polarization>
<gain> 17dB </gain>
<half_power_beam_width>
<H-plane> 65 </H-plane>
<E-plane> 8.5 </E-plane>
</half_power_beam_width>
</technical_data>
</antenna>
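[Figure 1-4: tree diagram. Root node antenna (attribute name) with children
type and technical_data; technical_data with children frequency_range,
polarization, gain and half_power_beam_width; half_power_beam_width with
children H-plane and E-plane]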

Figure 1-4. XML documents have a hierarchical structure. The figure shows the tree structure
of an XML document representing antennas. The antenna node (root node) has two children:
the nodes named type and technical data. Technical data has children too (the nodes
frequency range, polarization, gain and half-power beam width). Half power beam width has
the two children named H-plane and E-plane. Nodes can have attributes too. In the
example, the node named antenna has an attribute named name.
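
The hierarchical structure can also be made visible programmatically. The
following sketch, based on the ElementTree module of the Python standard
library, visits the tree recursively and returns one indented line per node;
the function name outline is hypothetical.

```python
import xml.etree.ElementTree as ET

DOCUMENT = """<antenna name="K730691">
<type>Panel Antenna</type>
<technical_data>
<frequency_range> 870-960 MHz </frequency_range>
<polarization> Vertical </polarization>
<gain> 17dB </gain>
<half_power_beam_width>
<H-plane> 65 </H-plane>
<E-plane> 8.5 </E-plane>
</half_power_beam_width>
</technical_data>
</antenna>"""


def outline(element, depth=0):
    """Return one indented line per node, parents before children."""
    attrs = "".join(f' {k}="{v}"' for k, v in element.attrib.items())
    lines = ["  " * depth + element.tag + attrs]
    for child in element:  # iterating an element yields its children
        lines.extend(outline(child, depth + 1))
    return lines


if __name__ == "__main__":
    print("\n".join(outline(ET.fromstring(DOCUMENT))))
```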

5.1.3 Namespaces

Some documents may mix markups coming from multiple XML
applications. For example, consider an XML application listing the radiation
patterns of several antennas for graphical rendering. Its documents may look
like:

<radiation_pattern>
<type>Horizontal</type>
<image file="RPO730691.jpg">
</image>
</radiation_pattern>

where the type tag expresses the kind of radiation pattern and the
image tag points to the file to be visualized via the file attribute.

The above application may be joined with the XML application
describing antenna features introduced in the previous section, reported here
for clarity:

<antenna name="K730691">
<type>Panel Antenna</type>
<technical_data>
<frequency_range> 870-960 MHz </frequency_range>
<polarization> Vertical </polarization>
<gain> 17dB </gain>
<half_power_beam_width>
<H-plane> 65 </H-plane>
<E-plane> 8.5 </E-plane>
</half_power_beam_width>
</technical_data>
</antenna>

As shown above, the two applications may use the same name to refer to
different things. In the previous example, the name type means radiation
pattern type for the former application, and antenna type for the latter. In
order to distinguish different concepts having the same name, XML makes it
possible to label the names belonging to the same domain. A prefix is placed
before the name to specify that it belongs to a certain domain. For example,
the radiation pattern type and the antenna type tags may become
respectively:

rp:type
ant:type

In this way, the type name is disambiguated by the prefix, which
distinguishes between its two possible meanings.
Each prefix is in turn associated with a URI, i.e. a unique identifier valid
on the Internet, similar to the well known URLs, by means of the reserved
xmlns attribute.
Namespaces are fundamental to establish a shared terminology among
partners. An agreement on terms and their usage, and the consequent
standardization via the definition of a namespace, increases interoperability
and improves reusability of applications.
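
The sketch below shows how an application can tell the two type elements
apart once they are qualified by a namespace. The two namespace URIs are
invented for this example (they do not identify any real vocabulary), as is
the function name types; parsing relies on the ElementTree module of the
Python standard library.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URIs, invented for this example.
NS = {
    "ant": "http://example.org/antenna",
    "rp": "http://example.org/radiation-pattern",
}

DOCUMENT = """
<ant:antenna xmlns:ant="http://example.org/antenna"
             xmlns:rp="http://example.org/radiation-pattern"
             name="K730691">
<ant:type>Panel Antenna</ant:type>
<rp:radiation_pattern>
<rp:type>Horizontal</rp:type>
</rp:radiation_pattern>
</ant:antenna>
"""


def types(xml_text):
    root = ET.fromstring(xml_text)
    # The prefix-to-URI mapping tells the parser which "type" is meant.
    antenna_type = root.findtext("ant:type", namespaces=NS)
    pattern_type = root.findtext(".//rp:type", namespaces=NS)
    return antenna_type, pattern_type


if __name__ == "__main__":
    print(types(DOCUMENT))
```

What identifies a name is the URI, not the prefix: a document binding a
different prefix to the same URI would be processed in exactly the same way.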

5.1.4 XML Schema

As the names of the tags used and the rules for nesting them are not known
a priori, information providers attach to the XML document the specification
of the document structure, the so-called schema. In this way, XML
documents are self-describing documents with a structured description of
their contents.
The W3C specified a language to define schemas, the W3C XML
Schema Language. An XML Schema is a schema written in this language,
following the W3C recommendations. For example, the following
XML document:

<type>panel Antenna</type>

can be associated to the following XML schema:

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="type" type="xs:string"/>
</xs:schema>

which says that the document contains the element named type whose
content is of the W3C data type string.
In order to associate the document to its schema, a pointer to the file
containing the schema must be added. The XML document becomes:

<type xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:noNamespaceSchemaLocation="antenna-schema.xsd">
panel Antenna
</type>

where the attribute named noNamespaceSchemaLocation contains the
name of the schema file, and the attribute named xmlns:xsi binds the xsi
prefix to the W3C XML Schema instance namespace URI.
The W3C Recommendations make it possible to use all common data types,
such as integer, boolean, dateTime, and so on, to define nested elements and
to introduce attributes.
A validating XML parser is a tool that, given an XML document and its
schema as input, checks the conformance of the document to the schema,
signaling each violation. Many free validating parsers are available on
the Net; among them we mention the parser from the Apache XML Project,
named Xerces [Xerces, 2005], written in Java.

5.1.5 Applications

XML is proving to be a powerful tool for exchanging information, thanks to
its flexibility, portability and simplicity. For example, it is the
basic enabling technology for Web Services (see Chapter 4), both for
communication among services and for allowing services to describe themselves.
Communication via XML files is a simple way to bridge heterogeneous
systems. Applications residing on remote heterogeneous platforms can
communicate by simply encoding information in an XML-compliant format
specified by standard, publicly available XML schemas [Cerami,
2002]. The most common of these formats are XML-RPC and SOAP, the latter
being the most widely used. SOAP is overviewed in Chapter 4.
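
To give an idea of what such an XML-encoded exchange looks like, the sketch
below uses the xmlrpc.client module of the Python standard library to encode
a procedure call as an XML-RPC message and to decode it again. The method name
antenna.setRange and the helper names encode_call and decode_call are
hypothetical.

```python
import xmlrpc.client


def encode_call(method, *args):
    """Encode a call as the XML text that would travel over the network."""
    return xmlrpc.client.dumps(args, methodname=method)


def decode_call(payload):
    """Recover the method name and the arguments from the XML text."""
    params, method = xmlrpc.client.loads(payload)
    return method, params


if __name__ == "__main__":
    message = encode_call("antenna.setRange", 870, 960)
    print(message)  # plain XML text, readable on any platform
    print(decode_call(message))
```

Because the message is plain text in a publicly specified format, sender and
receiver may be written in different languages and run on different systems.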
XML is valuable when the need to merge heterogeneous databases
arises: an XML file can easily be regarded as a truly portable database. It is,
in fact, a text file containing an ordered and queryable set of data.
Different application domains are developing their own XML schemas
and namespaces to exploit the potential of XML. For example, the Physics
Markup Language project, or PhysicsML [PhysicsML, 2005], aims at
defining standard data formats for physics data and at capturing the basic
concepts used in physics, in order to promote interoperability among distant
partners working in physics. The MathML [MathML, 2005] project defines an XML
schema and namespace to mark up equations. The Chemical Markup
Language [ChemicalML, 2005] is a similar effort for chemistry.

References

Cerami, E., 2002, Web Services, O'Reilly & Associates, Inc.
ChemicalML, 2005, http://xml.coverpages.org/chemicalML.html.
Harold, E. R., and Scott Means, W., 2002, XML in a nutshell, O'Reilly, pp. 7-612.
MathML, 2005, http://xml.coverpages.org/mathML.html.
MPI, 2005, http://www-unix.mcs.anl.gov/mpi/.
MPICH, 2005, http://www.mcs.anl.gov/mpi/mpich/download.html.
OpenMP, 2005, http://www.openmp.org/specs/.
PhysicsML, 2005, http://xml.coverpages.org/physicsML.html.
Xerces, 2005, http://xml.apache.org/xerces-j/.
1. Parallel and Distributed Environments 17

Bibliography
Butenhof, D. R., 1997, Programming with POSIX Threads, Addison-Wesley, pp. 1-12.
Chew, K. C., and Fusco, V., 1995, A Parallel Implementation of the FDTD Algorithm, Int.
Journ. Num. Modelling, Vol. 8.
Dongarra, J., et al., 2003, Integrated PVM Framework Supports Heterogeneous Network
Computing, Computers in Physics, (April, 1993).
Duncan, R., 1990, A Survey of Parallel Computer Architectures, IEEE Computer, Vol. 23,
No. 2, (February, 1990).
Flynn, M. J., 1966, Very High Speed Computing Systems, Proc. IEEE, Vol. 14.
Guiffaut, C., and Mahdjoubi, K., 2001, A Parallel FDTD Algorithm Using the MPI Library,
IEEE Antennas and Propagation Magazine, Vol. 43, No. 2, (April, 2001).
Hennessy, J., and Patterson, D., 1998, Computer Organization & Design, Morgan Kaufmann
Publishers, San Francisco.
Lewis, Ted G., and El-Rewini, H., 1992, Introduction to Parallel Computing, Prentice-Hall,
Inc.
MPI, The Message Passing Interface Standard, 2006, http://www-unix.mcs.anl.gov/mpi/.
MPICH, download page, 2006, http://www.mcs.anl.gov/mpi/mpich/download.html.
OpenMP, C and C++ Application Program Interface; http://www.openmp.org/specs/.
OpenMP, 1998, Architecture Review Board, (October, 1998).
Pacheco, P. S., 1997, Parallel Programming with MPI, Morgan Kaufman.
Schendel, U., 1984, Introduction to Numerical Methods for Parallel Computers, Ellis
Horwood Lim.Publishers.
Tarricone, L. et al., 2001, A Parallel Framework for the Analysis of Metal-Flanged
Rectangular-Aperture Arrays, IEEE Trans. on Ant. and Prop., (October, 2001).
Visual KAP for OpenMP, http://www.kai.com/vkomp/_index.html.
Chapter 2
OBJECT-ORIENTED TECHNOLOGIES

A. Esposito
University of Lecce, Italy

Abstract: This chapter provides an introduction to the basic concepts and terms of
object-oriented programming and software design. A description of Java is
provided together with an overview of object-oriented distributed frameworks
(i.e. Java RMI and Java mobile agents) cited later in the book.

Key words: Object orientation; Polymorphism; Encapsulation; Inheritance; Class.

1. INTRODUCTION

Object Orientation (OO) originated as a new programming model and a
new methodology for developing software applications [Booch, 1994].
During the 1970s and early 1980s, structured programming was the primary
software engineering methodology. It was initially based on the so-called
top-down approach. A complex problem is divided into smaller pieces so
that the code able to solve each piece can be easily implemented. This
approach has some drawbacks. First of all, it produces programs strictly
tailored to the specific problem. As a consequence, the implemented code is
seldom reusable for other problems. Secondly, it concentrates on the
instructions needed to solve the problem, and little care is devoted to the
design of data structures. This produces costly and hard-to-maintain code.
The methodology was improved by combining it with the so-called
bottom-up approach. The bottom-up approach first identifies solvable
problems and then works up to the solution of the whole problem. This
approach focuses on available reusable code (modules) rather than on the
features of specific large problems, thus promoting reusability and
improving maintainability.
OO can be considered an evolution of this approach. OO represents the
problem domain as made of self-contained interacting software entities,
called objects. Objects represent the concrete entities existing in the problem
to be solved (for example antennas, dipoles and arrays are candidate objects
of an OO EM application). The corresponding pieces of code appear to external
applications as black boxes, whose internals can be viewed and modified only
to a limited extent, as established by the OO designer. This feature, called
encapsulation, is one of the main reasons for the success of OO. Thanks to
encapsulation, the implementation details of objects are
hidden and can be altered at any moment without affecting the
way applications interact with them. This greatly simplifies software
maintenance. Moreover, OO design tools make it possible to express
dependencies and similarities among objects, in order to concentrate shared
behavior in a few pieces of code. This feature greatly improves software reuse.

L. Tarricone and A. Esposito (eds.), Advances in Information Technologies for Electromagnetics, 19-28.
© 2006 Springer. Printed in the Netherlands.

OO features are well suited to distributed software development too. In
an OO system, objects are autonomous entities able to interact with the rest
of the environment in a simple, well-defined, straightforward manner. The
details of what goes on inside them are not important to the system as a
whole, as long as each object plays its assigned role correctly. OO designers
concentrate on interfaces and external behavior, thus producing systems
open to interoperability and cooperation. For this reason, several distributed
frameworks are based on OO and envision a distributed system as composed
of autonomous network-enabled objects exposing a well-defined interface
and hiding their way of operating.
Both programming and distributed computing with OO are treated in this
chapter. Basic OO programming concepts are introduced in Section 2, with a
focus on the Java language, whilst Section 3 deals with OO enabling
technologies for distributed computing.

2. OO PROGRAMMING

2.1 Basic Concepts

The OO model represents the domain as made of objects. An object is a
collection of data and operations (methods). Objects have an internal state
(the data they contain) and can respond to messages (calls to their methods).
Data are accessed and/or changed by invoking the object methods (see
Fig. 2-1).
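
A minimal sketch of an object, written here in Python, is shown below. The
class Antenna and its attributes are hypothetical and serve only to show an
internal state that is read and changed through methods.

```python
class Antenna:
    """An object: internal state (data) plus the methods that operate on it."""

    def __init__(self, name, frequency_mhz):
        # Internal state; the leading underscore marks it as not meant
        # to be accessed directly from outside the object.
        self._name = name
        self._frequency_mhz = frequency_mhz

    def frequency_mhz(self):
        """Message asking the object for part of its state."""
        return self._frequency_mhz

    def retune(self, frequency_mhz):
        """Message changing the state; the object checks the request itself."""
        if frequency_mhz <= 0:
            raise ValueError("frequency must be positive")
        self._frequency_mhz = frequency_mhz


if __name__ == "__main__":
    antenna = Antenna("K730691", 870.0)
    antenna.retune(960.0)
    print(antenna.frequency_mhz())
```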
Sets of objects with similar properties are grouped into classes. Objects
sharing part of their structure and/or behaviour are related via the
so-called inheritance and polymorphism properties.
If two classes, called superclass and subclass, are related by the
inheritance relationship, methods and data of the superclass are inherited by
the subclass. In this case, we say that the subclass is derived from (or
extends) the superclass. Several classes can be declared as subclasses of the
same superclass. Inheritance can also extend over several generations of
classes, to form a tree. Subclasses have the same variables and methods as
their superclass, but may 1) extend them by adding other variables and/or
methods, and 2) provide their own implementation of inherited methods. Point 1)
expresses the fact that the superclass holds the common behavior shared
by its subclasses (see Fig. 2-2). Point 2) is an application of the feature
known as polymorphism.
According to polymorphism, different objects may respond differently to
messages with the same name. In other words, different classes may provide
a different implementation and behavior of a method with the same name.
Looking at the example represented in Fig. 2-2, Dipole and Aperture
Antenna are derived from the class named Antenna. The Antenna
superclass has the drawRadiationPattern method, which is inherited by its
subclasses. Even though the code needed to implement a routine for drawing
the radiation pattern of a dipole is different from the code needed to
implement a routine for drawing the radiation pattern of an aperture antenna,
it makes sense to join them conceptually in a single method. The
application is responsible for calling the appropriate code at run time,
depending on the current type of the invoked object.
Polymorphism adds flexibility to the code and simplifies its extension.
Suppose that a new type of antenna must be added to the system. The
programmer only has to implement the code specific to
that type: the design effort has already taken place in the starting phase of
the project, when the data and methods of the core classes of the problem were
established.
In some cases, software engineers just sketch the skeleton of the
fundamental classes of the problem (i.e. their external behavior or interface).
This is done thanks to the so-called abstract classes. Consider the Antenna
example. It makes no sense to implement the method drawRadiationPattern
in the superclass Antenna. It exists merely to specify the common
interface for all the actual, concrete versions of drawRadiationPattern in
the Antenna subclasses. Such a method is called an abstract method: it is
defined only to declare a method, not to be actually called (when dealing
with an Antenna object, of course). Classes having at least one abstract
method are called abstract.
Polymorphism is strictly related to dynamic binding. Dynamic binding is
the capability of determining at run time the actual type of the object a
variable refers to. Dynamic binding applies to methods too: it maps a method name to
an implementation according to the object's dynamic type. The expression
e.drawRadiationPattern() calls the correct version of drawRadiationPattern()
according to whether e is referring to an object belonging to the Dipole
or Aperture Antenna class at the moment. This feature is very useful.
Suppose we have a problem where different kinds of antennas are managed,
and a tool for visualizing the radiation pattern on request. The
program stores in an array the list of antennas chosen by the user for
visualization. In a traditional programming language, the type of the array
elements must be homogeneous and must be specified at compile time.
Thanks to the OO features previously described (inheritance, polymorphism,
dynamic binding), the actual type of each array element may be determined
at run time. Returning to our example, we can declare an array of type
Antenna: this allows us to populate it with objects belonging to any of the
Antenna subclasses. In order to visualize the radiation pattern, a loop
invokes the drawRadiationPattern method on each element of the array. At
run time the array is populated with objects of several types (any Antenna
subclass) and the correct implementation of drawRadiationPattern is called
at each loop iteration, according to the class of the current object.
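
The antenna-array scenario described above can be sketched in Java as follows. This is only a minimal illustration under stated assumptions: the class name ApertureAntenna is written without a space (as Java identifiers require), the class AntennaListDemo is hypothetical, and the println calls are placeholders standing in for real drawing routines.

```java
// Sketch of polymorphism and dynamic binding with the chapter's antenna classes.
class AntennaListDemo {
    public static void main(String[] args) {
        // The array is declared with the superclass type only.
        Antenna[] chosen = { new Dipole(), new ApertureAntenna(), new Dipole() };

        // The implementation to run is selected at run time, element by element.
        for (Antenna antenna : chosen) {
            antenna.drawRadiationPattern();
        }
    }
}

abstract class Antenna {
    int f; // frequency

    int getFrequencyRange() {
        return f;
    }

    // No implementation here: each subclass supplies its own.
    abstract void drawRadiationPattern();
}

class Dipole extends Antenna {
    int length;

    @Override
    void drawRadiationPattern() {
        // Placeholder for the real drawing routine (assumption).
        System.out.println("Drawing dipole pattern");
    }
}

class ApertureAntenna extends Antenna {
    @Override
    void drawRadiationPattern() {
        // Placeholder for the real drawing routine (assumption).
        System.out.println("Drawing aperture antenna pattern");
    }
}
```

Adding a new kind of antenna to this sketch would only require a new subclass of Antenna; the loop in main would remain unchanged.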

Antenna

    int f;
    getFrequencyRange();
    drawRadiationPattern();

Figure 2-1. Encapsulation is one of the key features of OO. Objects encapsulate code and
data. Data are accessible and modifiable by sending messages to the objects. Sending a message
consists of calling one of the object's methods.

Antenna
    int frequency;
    int getFrequencyRange();
    void drawRadiationPattern();

Dipole (subclass of Antenna)
    int length;
    void drawRadiationPattern(){ .. }

Aperture Antenna (subclass of Antenna)
    void drawRadiationPattern(){ .. }

Figure 2-2. Inheritance makes it possible to model a common behaviour shared between separate
classes. Classes named Dipole and Aperture Antenna have in common some data
(frequency) and methods (getFrequencyRange and drawRadiationPattern). This is
modeled by defining a common superclass, called Antenna. Subclasses may extend the data
and methods inherited from the superclass. In the example, the subclass named Dipole contains
the dipole-specific variable named length. Subclasses may provide their own
implementation of an inherited method as well. The method drawRadiationPattern has a
customized implementation in both the Dipole and the Aperture Antenna classes.

2.2 Java

2.2.1 Introduction

Java was the first language designed and modeled with the Web in mind.
The most critical requirement for a Web-enabled language is portability.
The Web joins heterogeneous machines, and a Web-enabled language should
be able to run on any machine of the Web. It is known that high-level
programming languages (i.e. languages for computer programming
understandable by humans and obscure for machines) are grouped into two
categories: compiled and interpreted languages. Both are converted into a
machine-understandable language to be executed. Compiled languages are
converted off-line, interpreted languages are converted at run time.
Interpreted languages are portable, as they can run everywhere provided that
a suitable interpreter is installed on the platform. Compiled languages are
faster, as the executable produced by the compiler is tailored to the platform
properties. Java joins the features of both approaches by using a combination
of compilation and interpretation via the creation of the so-called Java
Virtual Machine (Fig. 2-3). The programmer writes a program in the Java
language and compiles it into an intermediate language (the so-called
bytecode). Java bytecode is interpreted at run time by a suitable
interpreter installed on the target machine.
Java's portability has enabled many applications for the Web. Java applets
are programs which, installed on a Web server (see Chapter 1), can run on a
client machine. To invoke an applet and run it locally, the end-user
simply points and clicks in the browser window. Java servlets are Java
programs running on the server side. They enrich Web servers with
computing capabilities by interacting with remote platforms.
Java is an OO language with a rich library of reusable code (the so-called
Java API) and features all the previously described properties of OO
programming. In the following subsection, a brief overview of Java
syntax is provided, whilst Section 3 deals with the application of Java to
distributed computing.

[Figure 2-3 diagram: Java source code -> Java compiler -> Java bytecode -> platform-specific Java interpreters (UNIX, Windows).]

Figure 2-3. Java Virtual Machine. Java source code is compiled off-line for a virtual machine,
producing the so-called bytecode. Java bytecode is interpreted at run time by a local
interpreter. In this way, Java combines the benefits of the compiled and interpreted approaches.

2.2.2 The Language

A class is declared in Java as follows:

public class Antenna {
    public int f; // frequency

    public int getFrequencyRange() {
        return f;
    }
}

f is an instance variable (a field), getFrequencyRange is a method.

The abstract keyword is used to declare the abstract class named
Antenna:

public abstract class Antenna {
    public int f; // frequency

    public int getFrequencyRange() {
        return f;
    }

    public abstract void drawRadiationPattern();
}

The drawRadiationPattern method is not implemented in the
Antenna class because it has been declared as abstract (i.e. concrete Antenna
subclasses must provide their own implementation).
The reserved Java keyword extends is used to introduce inheritance, for
example:

public class Dipole extends Antenna {
    public int length;

    public void drawRadiationPattern() {
        ..
    }
}

In this way, the class named Dipole inherits Antenna data (i.e. f)
and methods (i.e. getFrequencyRange and drawRadiationPattern). The
method drawRadiationPattern is implemented in the Dipole class, so it is
no longer marked as abstract there.
Moreover, Dipole extends its superclass by adding a variable (i.e.
length).
Java also allows the definition of so-called interfaces. An interface is a
type whose methods are all abstract. It is
not used to create objects, but only as a common contract for the classes
that implement it: an
interface exists only to express the common properties of all its implementing classes.
The reserved keyword implements is used to say that a class implements the
methods of one or more interfaces.
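
As an illustration of the implements keyword, the following sketch declares two interfaces and one class implementing both. All the names used here (Radiator, Tunable, TunableDipole, InterfaceDemo) are hypothetical and introduced only for this example; the println call is a placeholder for a real drawing routine.

```java
// Sketch: one class implementing two (hypothetical) interfaces.
class InterfaceDemo {
    public static void main(String[] args) {
        TunableDipole dipole = new TunableDipole();
        dipole.setFrequency(900);

        // The same object can be handled through either interface type.
        Radiator radiator = dipole;
        System.out.println(radiator.getFrequencyRange());
        radiator.drawRadiationPattern();
    }
}

// Interface methods are implicitly public and abstract.
interface Radiator {
    int getFrequencyRange();
    void drawRadiationPattern();
}

interface Tunable {
    void setFrequency(int f);
}

class TunableDipole implements Radiator, Tunable {
    private int f; // frequency

    @Override
    public int getFrequencyRange() {
        return f;
    }

    @Override
    public void drawRadiationPattern() {
        // Placeholder for the real drawing routine (assumption).
        System.out.println("Drawing dipole pattern");
    }

    @Override
    public void setFrequency(int f) {
        this.f = f;
    }
}
```

Unlike the extends keyword, which names a single superclass, implements may list several interfaces, as TunableDipole does here.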
Objects are created at run time, when an instantiation instruction is
executed. For example, the instruction:

Dipole d;

declares the variable d as a reference to an object of the class named Dipole.
The instruction:

d = new Dipole();

instantiates the object d (i.e. allocates the needed memory for it). From
now on, the Dipole methods can be invoked on the instance d.
For example:

d.drawRadiationPattern();

draws the radiation pattern of dipole d.

3. OO DISTRIBUTED FRAMEWORKS

3.1 Introduction

3.1.1 Java RMI

Remote Method Invocation (RMI) is a Java-based distributed technology.
It allows Java programs to exchange data and trigger remote method calls
across networks. RMI is basically an object-oriented RPC mechanism (see
Chapter 1), according to which classes with methods that can be called
across virtual machines can be defined. The instances of such classes are
called remote objects. Java hides the details of network communication so that
calls to remote objects are similar to local calls. RMI also supports
so-called serializable objects. An object is serializable if it can
be migrated from one virtual machine to another across the network.
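
The two notions above can be sketched with the standard java.rmi and java.io packages. The names RemoteAntennaService, DipoleData and SerializationDemo are hypothetical. The remote interface is only declared, not deployed, and the "migration" is simulated by converting an object to bytes and back inside a single virtual machine, which is an assumption made to keep the example self-contained.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.rmi.Remote;
import java.rmi.RemoteException;

class SerializationDemo {
    // Turn an object into the byte stream that would travel over the network.
    static byte[] toBytes(Serializable obj) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(obj);
        }
        return buffer.toByteArray();
    }

    // Rebuild the object on the (here simulated) receiving side.
    static Object fromBytes(byte[] bytes) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        DipoleData original = new DipoleData(900, 16);
        DipoleData copy = (DipoleData) fromBytes(toBytes(original));
        System.out.println(copy.f + " " + copy.length);
    }
}

// Remote interface (declaration only): methods callable from another virtual machine
// must be declared in an interface extending Remote and must throw RemoteException.
interface RemoteAntennaService extends Remote {
    int getFrequencyRange(String antennaName) throws RemoteException;
}

// Serializable object: its state can be written to a stream and rebuilt elsewhere.
class DipoleData implements Serializable {
    private static final long serialVersionUID = 1L;
    final int f;      // frequency
    final int length;

    DipoleData(int f, int length) {
        this.f = f;
        this.length = length;
    }
}
```

In a real RMI application, an implementation of the remote interface would additionally be exported and registered so that clients on other machines can look it up; that part is omitted here.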
3.2 Java Mobile Agents

As seen in Chapter 1, a mobile agent is an autonomous program, which
can migrate during its execution from one host to another in a network. An
agent is executed as a process or a thread in the context of the agent runtime
environment. The agent runtime environment [Chess et al., 1995] must run at
each node willing to accept incoming agents. It provides the basic
functionalities needed to support agent migration and communication, such
as executing agents and applying security mechanisms to authenticate agents
and control access to local resources.
Java is the most widely used language in mobile-agent systems for a
number of reasons [Chess et al., 1998]. One of them is portability, guaranteed
by the virtual machine model of Java. As said in Section 2.2.1, compiling
a Java program into bytecode produces executables that are portable
across heterogeneous platforms, provided that a virtual machine is installed.
Another reason is Java's built-in security mechanisms, which
facilitate the implementation of mobile agent security procedures. Finally,
support for migration and communication is provided by the serialization,
dynamic class loading and RMI facilities offered by the Java API.

References
Booch, G., 1994, Object-oriented Analysis and Design (with applications), Benjamin-
Cummings Publishing Co. Inc.
Chess, D., et al., 1995, Itinerant agents for mobile computing, IEEE Personal Comm. Mag.,
2(5):34-59.
Chess, D., et al., 1998, Mobile agents: are they a good idea?, in: Mobile Agents and Security,
G. Vigna, ed., LNCS 1419, Springer-Verlag, pp. 25-47.
Bibliography
Booch, G., 1994, Object-oriented Analysis and Design (with applications), Benjamin-
Cummings Publishing Co. Inc.
CORBA, 2005, http://www.omg.org.
Cristoffersen, C. E., Mughal, U. A., Steer, M. B., 2000, Object-oriented microwave circuit
simulation, International Journal on Radiofrequency and MW CAE, Vol. 10.
DCE http://www.opengroup.org/dce.
Felsen, L. B., Mongiardo, M., Russer, P., 2002, Electromagnetic Field Representations and
Computations in Complex Structures III: Network Representations of the Connection and
Subdomain Circuits, International Journal on Numerical Modelling, Vol. 15.
Kafura, D., 2000, Object-Oriented Software Design and Construction with Java, Prentice-
Hall, Englewood Cliffs, NJ.
Khoshafian, S., and Abnous, R., 1995, Object-Orientation: Concepts, Languages, Databases,
User Interfaces, Wiley, New York.
Liotta, G., Mongiardo, M., Tarricone, L., 2002, Introductory Review on Object Oriented
Paradigm for Full-Wave Microwave CAD, International Journal on Radiofrequency and
MW CAE, Vol. 12.
Monson-Haefel, R., 2001, Enterprise JavaBeans, O'Reilly & Associates, (October, 2001).
Nicol, J. R., Thomas Wilkes, C., and Manola, F. A., 1993, Object Orientation in
Heterogeneous Distributed Systems, IEEE Computer, (June, 1993).
Oaks, S., and Wong, H., 2000, Jini in a Nutshell, O'Reilly.
Olyslager, F., Van Der Berghe, S., Rogier, H., De Zutter, D., 2002, An Academic FDTD
Simulator Using Object Orientation, AP2000 Int. Conference, 2A1.2, Davos, (April, 9-14
2000).
OMG, 2005, http://www.omg.org.
Siniaris, C. G., Kostaridis, A. I., Kaklamani, D. I., Venieris, I. S., 2002, Implementing
distributed FDTD codes with Java mobile agents, IEEE Antennas and Propagation
Magazine, Vol. 44, No. 6, (December, 2002).
Thai, T. L., and Oram, A., 1999, Learning DCOM, O'Reilly & Associates, (April, 1999).
Chapter 3
THE SEMANTIC WEB

A. Esposito
University of Lecce

Abstract: The new perspectives opened by the Semantic Web are overviewed. The basic
concepts behind the Semantic Web and the foundations of description logics are
described. A very brief taxonomy of the most used reasoners and tools for
the Semantic Web is provided as well.

Key words: Semantics; Ontology; Description Logics; TBox; ABox; Reasoner.

1. INTRODUCTION

The Web is nowadays an indispensable tool to access information.
However, searching the Web is often a frustrating experience: the
availability of enormous amounts of unstructured information makes the
search process cumbersome and tedious. Even the most powerful search
engines often return long lists of documents, most of which are irrelevant.
This leads to an iterative search process in which the user looks through the
returned documents to establish more refined keywords for the next
iteration. One of the main reasons is the mark-up language used for Web
pages, HTML. It marks the text with rendering information, to
enable visualization for human consumption: the meaning of the contents
is accessible only to humans. A revolutionary Web, where the search for
information is completely renewed, is envisioned in a famous paper [Berners
Lee, 2001], where the so-called Semantic Web is introduced. The Semantic
Web marks up resources with terms describing their contents in a way that is
understandable by both software programs (agents) and humans. This is
done via the definition of a so-called ontology. Defining an ontology
means encoding knowledge by means of an appropriate language, so that
electronic agents can search for information on the basis of human-readable queries.

L. Tarricone and A. Esposito (eds.), Advances in Information Technologies for Electromagnetics, 29-44.
2006 Springer. Printed in the Netherlands.
The use of ontologies for conceptually describing distributed resources is
gaining momentum, with a consequent widening of the range of related
applications. This is happening in all Web-like virtual communities, such as
those supported by grid computing facilities (see Chapter 5). A virtual
community is a collection of autonomous organizations which decide to join
their resources to increase their individual capabilities. The stimulus for
aggregating is generally given by the existence of common interests, such as
scientific research in a shared area, or the need to join diverse competencies
for accomplishing goals otherwise unaffordable. The definition of an ontology
related to the common domain in the former case, or the aggregation of
multiple ontologies each related to a specific domain in the latter case,
facilitates the access to shared resources. This is even more appealing when
the searched resources are software codes exploitable through the network.
In this case, ontologies make it possible to build complex applications by aggregating
distributed components developed by autonomous groups. Multidisciplinary
problems are then tackled by assembling autonomously developed codes.
Virtual communities develop virtual applications by specifying in some
language the properties of the problem to be solved: software agents do the
work by searching for the components able to solve the required subtasks. How
the technology is progressing to enable this is clearer after reading Chapter
4, which focuses on how to build the above-mentioned network-enabled
software programs, and Chapter 5, where grid computing and its relationships
with the Semantic Web are introduced. Herein we focus on the Semantic
Web basic concepts, and on the languages and tools needed to build a machine-
understandable vision of reality. As we will see in the other tutorial chapters,
these concepts remain valid in distributed environments alternative to the
Web.
The Semantic Web borrows concepts and technologies from Artificial
Intelligence (AI). The Description Logics (DL) [Baader, 2003] formalism
and the related reasoning techniques are nowadays considered the most
promising instruments of AI and represent the foundations of the Semantic
Web. The strength of DL lies in their capability to support reasoning, i.e. the
ability to manipulate the explicitly represented concepts to infer hidden
information. The current chapter gives first an introduction to DL and related
reasoning capabilities, then gives an overview of the most used languages
and tools for the Semantic Web.
2. DESCRIPTION LOGICS

2.1 Introduction

DL are a family of languages to represent knowledge, which is viewed as
a structured collection of concepts and relationships between concepts. To
build a DL representation of a certain domain, the fundamental entities
(concepts) must be identified together with their relationships (roles). This is
the so-called terminological box (TBox), i.e. the description of the general
knowledge of a problem domain. To complete the description of the domain,
concepts are populated by the elements (individuals) which are specific to a
particular problem of the domain. This is the so-called assertional box
(ABox). In other words, the TBox is assumed not to change over time, whilst
the ABox is related to contingent circumstances. For example, a TBox
describing the EM domain could include the generic concepts Antenna
and Vendor. These concepts can be supposed to be timeless and
independent of the specific EM problem. An ABox for the same domain
could populate the Antenna and Vendor concepts respectively with the
instances VT300 and Kathrein. They can be supposed to be contingent:
the production of that specific model of antenna may cease and the vendor
may change its business goals. Section 2.2 gives some more details about
how to define a TBox, whilst Section 2.3 focuses on the ABox.

2.2 A Model for Reality: The TBox

Fig. 3-1 describes a simple model, with four concepts: Antenna,
Aperture Antenna, Dipole and Vendor. The role named
isProducerOf links the two concepts Vendor and Antenna. Roles are
binary predicates connecting one individual of some concept (i.e. one
specific element belonging to it) to one individual of some other concept. In
our example, the role named isProducerOf expresses which vendor
produces which antenna.
The relationship named is-a links the two concepts Aperture Antenna
and Antenna and the two concepts Dipole and Antenna. It is a special
relationship. It means that a concept (subconcept) is more specific than some
other concept (superconcept). In set-theory terms, if an element belongs to
the subconcept, it belongs to the superconcept too (if an object is an
Aperture Antenna, it is an Antenna too). The converse is not
necessarily true (an Antenna is not necessarily an Aperture Antenna, as
it may be a Dipole).
When an is-a relationship links a subconcept with some parent
concepts, the subconcept inherits all the relationships of the more general
concepts. In the example, the concept Aperture Antenna inherits all the
relationships of the concept Antenna, i.e. a hidden relationship named
isProducerOf exists between Vendor and Aperture Antenna. This
feature resembles the inheritance feature of object-oriented models, with the
key difference that DL supports multiple inheritance, i.e. a concept can be a
subconcept of several concepts.
Inheritance is a very simple example of reasoning, i.e. of inferring
implicit knowledge from the explicitly represented information. Indeed, DL may
exhibit much more complex reasoning capabilities, by means of operators
that express concepts, constraints and dependencies which reflect the
human model of reality. Hidden knowledge is inferred by using DL
operators, and reasoning capabilities depend on which set of operators a
specific DL provides. The expressiveness is higher when more operators are
available, with a consequent increase in the computational
complexity of reasoning. For this reason, when choosing a DL, a tradeoff between
expressiveness and complexity must be found. As explained in Section 3, the
ontology languages adopted for the Web derive from such a compromise.
We now give an overview of the principal DL operators, with a special
focus on those provided by Web languages.
Two types of operators exist: constructors and axioms. Constructors
introduce new concepts and roles based on previously defined ones
(the atomic concepts and roles). Axioms express features of
concepts and/or roles.

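[Figure 3-1 diagram: the concepts Aperture Antenna and Dipole are each linked to Antenna by an is-a relationship; the role isProducerOf links Vendor and Antenna.]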

Figure 3-1. Concepts and roles in DL.


[Figure 3-2 diagram: Coaxial, Elliptic and Rectangular Aperture Antenna are linked by is-a to Aperture Antenna, which is linked by is-a to Antenna; the roles isProducerOf and hasProducer link Vendor and Antenna.]

Figure 3-2. Example of transitive and inverse relationships. Large arrows represent is-a
relationships. The role named isProducerOf is the inverse of the hasProducer role.
Transitivity applies to the represented is-a relationships.

2.2.1 Constructors

Table 3-1 lists some constructors. The union, intersection and
complement constructors recall the well-known set theory operators and are
to be interpreted in that way. Suppose that Coaxial Aperture Antenna,
Rectangular Aperture Antenna and Elliptic Aperture Antenna are atomic
concepts. We can define the concept Aperture Antenna (Fig. 3-2) as the
union of the three already defined concepts:

$$\mathrm{ApertureAntenna} \equiv \mathrm{RectangularApertureAntenna} \sqcup \mathrm{EllipticApertureAntenna} \sqcup \mathrm{CoaxialApertureAntenna}$$

In this way we say that an aperture antenna is a rectangular aperture antenna, an
elliptic aperture antenna or a coaxial aperture antenna.
Suppose now that we need to define the concept linear array of
antennas as the ordered aggregate of a number of antennas along a line. We
could first define the atomic concept Array as an ordered aggregate of
entities. Then we could introduce the linear array of antennas as (Fig. 3-3):

$$\mathrm{LinearAntennaArray} \equiv \mathrm{Antenna} \sqcap \mathrm{Array}$$

This means that a linear antenna array is both an antenna and an array.

More complex expressions can be built by using the so-called role
restrictions (also listed in Table 3-1).
For example, the value restriction constructor can be used to build new
concepts by collecting the individuals that are related, through a given
role, only to members of a specific concept.
Suppose we introduce the atomic role named hasPart linking the concepts
Linear Antenna Array and Antenna (see Fig. 3-3). It expresses the fact
that an array of antennas is made of antennas. Now we define the concept of
array of rectangular aperture antennas:
an array of rectangular aperture antennas is an array whose components
are all rectangular aperture antennas.
In DL notation:

$$\mathrm{RectangularApertureArray} \equiv \mathrm{LinearAntennaArray} \sqcap \forall \mathrm{hasPart}.\mathrm{RectangularApertureAntenna}$$

Note that this definition does not exclude the case that the array has zero
components. To impose the existence of at least one component, we use the
so-called existential restriction:

$$\mathrm{RectangularApertureArray} \equiv \mathrm{LinearAntennaArray} \sqcap \forall \mathrm{hasPart}.\mathrm{RectangularApertureAntenna} \sqcap \exists \mathrm{hasPart}.\mathrm{RectangularApertureAntenna}$$
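
The difference between the value restriction and the existential restriction can be illustrated with a small Java sketch. It is only an analogy under stated assumptions: an array is modelled as a plain list holding the kind of each component, and the class and method names (RestrictionDemo, allPartsAre, somePartIs) are hypothetical.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class RestrictionDemo {
    // Analogue of the value restriction: every part is of the required kind.
    // With no parts at all, the condition holds vacuously.
    static boolean allPartsAre(List<String> partKinds, String kind) {
        return partKinds.stream().allMatch(kind::equals);
    }

    // Analogue of the existential restriction: at least one part is of the required kind.
    static boolean somePartIs(List<String> partKinds, String kind) {
        return partKinds.stream().anyMatch(kind::equals);
    }

    public static void main(String[] args) {
        String kind = "RectangularApertureAntenna";
        List<String> noParts = Collections.emptyList();
        List<String> twoParts = Arrays.asList(kind, kind);

        // An array without components satisfies the value restriction only.
        System.out.println(allPartsAre(noParts, kind)); // prints true
        System.out.println(somePartIs(noParts, kind));  // prints false

        // An array with two rectangular components satisfies both.
        System.out.println(allPartsAre(twoParts, kind) && somePartIs(twoParts, kind)); // prints true
    }
}
```

Requiring both conditions at once corresponds to the intersection of the two restrictions used in the definition above.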

The number restriction constructor, instead, imposes a lower and/or an
upper bound on the number of individuals appearing as second argument of a role. For example, we could define
the concept of
array of rectangular aperture antennas having at least two
components
with the expression:

$$\geq 2\ \mathrm{hasPart}.\mathrm{RectangularApertureAntenna}$$
The last constructor shown in the table is the inverse constructor. It
defines a role as the inverse of an atomic one, i.e. exchanging the order of
the linked individuals we can pass from one role to the other. For example, if
we define the role isPartOf as the inverse of the hasPart atomic role:

$$\mathrm{isPartOf} \equiv \mathrm{hasPart}^{-}$$

we are saying that:

if VT300 hasPart VT30, then VT30 isPartOf VT300

2.2.2 Axioms

Table 3-2 lists some axioms.
The equivalence axiom introduces the identity between concepts and/or
roles. It can be used to introduce synonyms.
The inclusion axiom says that a concept (role) is a subconcept (subrole)
of another concept (role), for example:

$$\mathrm{CoaxialApertureAntenna} \sqsubseteq \mathrm{Antenna}$$
$$\mathrm{isProducerOfDipole} \sqsubseteq \mathrm{isProducerOf}$$

Disjointness says that no individual can belong to both the disjoint
concepts.
For example,

$$\mathrm{CoaxialApertureAntenna} \sqsubseteq \neg \mathrm{RectangularApertureAntenna}$$

means that an antenna cannot be a coaxial aperture antenna and a rectangular
aperture antenna at the same time.
The transitivity axiom can be explained by looking at the example in Fig.
3-2. The figure shows three specializations of the Aperture Antenna
concept: the Coaxial Aperture Antenna, the Elliptic Aperture Antenna
and the Rectangular Aperture Antenna. They are linked to the superconcept
Aperture Antenna by an is-a relationship. The concept Aperture
Antenna, in its turn, is linked to the superconcept Antenna.
This implies that Coaxial Aperture Antenna, Elliptic Aperture
Antenna and Rectangular Aperture Antenna are linked to the superconcept
Antenna by an implicit is-a relationship.
In generic terms:

if concept C specializes concept B and concept B specializes concept A
then
concept C specializes concept A

Transitivity always holds for the is-a relationship, but can be applied to
other roles as well. Suppose we define the concept of Planar Array of
Antennas (Fig. 3-3), and link the concept Linear Array of
Antennas to the new concept by the role named isPartOf.
It is useful to impose that isPartOf is a transitive role:

if a linear array of antennas L is part of a planar array of antennas P
and an antenna A is part of the linear array of antennas L
then
the antenna A is part of the planar array of antennas P
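
The inference enabled by a transitive role can be sketched as a reachability computation over the asserted pairs. The following Java code is a toy illustration, not how an actual DL reasoner is implemented; the class TransitiveRoleDemo and its methods are hypothetical names, and A, L and P are the individuals of the example above.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class TransitiveRoleDemo {
    // Explicitly asserted pairs of the role: subject -> directly related objects.
    private final Map<String, Set<String>> asserted = new HashMap<>();

    void assertPair(String subject, String object) {
        asserted.computeIfAbsent(subject, k -> new HashSet<>()).add(object);
    }

    // All objects reachable from the subject by following the role one or more times.
    Set<String> closure(String subject) {
        Set<String> reached = new HashSet<>();
        Deque<String> toVisit =
                new ArrayDeque<>(asserted.getOrDefault(subject, Collections.emptySet()));
        while (!toVisit.isEmpty()) {
            String current = toVisit.pop();
            if (reached.add(current)) {
                toVisit.addAll(asserted.getOrDefault(current, Collections.emptySet()));
            }
        }
        return reached;
    }

    public static void main(String[] args) {
        TransitiveRoleDemo isPartOf = new TransitiveRoleDemo();
        isPartOf.assertPair("A", "L"); // antenna A is part of linear array L
        isPartOf.assertPair("L", "P"); // linear array L is part of planar array P

        // The pair (A, P) was never asserted, but follows from transitivity.
        System.out.println(isPartOf.closure("A").contains("P")); // prints true
    }
}
```

The same computation applied to is-a links yields the implicit specializations discussed for Fig. 3-2.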

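Table 3-1. Some DL constructors. Classes are marked with the letter C, roles are marked with
the letter R.
Name                          Syntax                           Example
Union                         $C_1 \sqcup \dots \sqcup C_n$    $\mathrm{CoaxialApertureAntenna} \sqcup \mathrm{EllipticApertureAntenna}$
Intersection                  $C_1 \sqcap \dots \sqcap C_n$    $\mathrm{Array} \sqcap \mathrm{Antenna}$
Complement                    $\neg C$                         $\neg \mathrm{CoaxialApertureAntenna}$
Value restriction             $\forall R.C$                    $\forall \mathrm{hasPart}.\mathrm{RectangularApertureAntenna}$
Existential quantification    $\exists R.C$                    $\exists \mathrm{hasPart}.\mathrm{Antenna}$
At-least number restriction   $\geq n\, R.C$                   $\geq 1\ \mathrm{hasPart}.\mathrm{Antenna}$
At-most number restriction    $\leq n\, R.C$                   $\leq 4\ \mathrm{hasPart}.\mathrm{Antenna}$
Inverse                       $R^{-}$                          $\mathrm{isProducerOf}^{-}$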

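Table 3-2. List of axioms. Classes are marked with the letter C, roles are marked with the
letter P.
Name           Syntax                      Example
Inclusion      $C_1 \sqsubseteq C_2$       $\mathrm{CoaxialAntenna} \sqsubseteq \mathrm{Antenna}$
Equality       $C_1 \equiv C_2$            $\mathrm{Vendor} \equiv \mathrm{Provider}$
Inclusion      $P_1 \sqsubseteq P_2$       $\mathrm{isProducerOfDipole} \sqsubseteq \mathrm{isProducerOf}$
Equivalence    $P_1 \equiv P_2$            $\mathrm{isProducerOf} \equiv \mathrm{provides}$
Disjointness   $C_1 \sqsubseteq \neg C_2$  $\mathrm{CoaxialAntenna} \sqsubseteq \neg \mathrm{RectangularAntenna}$
Transitivity   $P \in R^{+}$               $\mathrm{hasPart} \in R^{+}$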
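
[Figure 3-3 diagram: the concepts Antenna, Array, Linear Array of Antennas and Planar Array of Antennas; the roles hasPart and isPartOf link Linear Array of Antennas with Antenna, and Planar Array of Antennas with Linear Array of Antennas.]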

Figure 3-3. Example of transitivity applied to roles.

2.3 The ABox

Once the model of reality has been specified, the DL must be populated
with individuals to form a knowledge base. Individuals are the concept
members, i.e. they are the specific, concrete instances a concept is made of.
The ABox contains the so-called membership assertions. For
example, the concept assertion:

RectangularApertureArrayAntenna(VT30)

says that the model VT30 is an array of rectangular aperture antennas.
The role assertion:

isProducerOf(Kathrein, VT300)

says that the vendor Kathrein produces the antenna VT300.


Constructors and axioms exist for the ABox too.
For example, the existential constructor collects all the individuals
related through a specific role to a specific individual:

$$\exists \mathrm{hasPart}.\mathrm{VT300}$$

collects all antennas having at least one VT300 element.
The equivalence axiom allows several names to be assigned to the same
individual, for example

$$\mathrm{Kathrein} \equiv \text{Kathrein-Werke KG}$$

2.4 Reasoners

Reasoners are software tools able to extract the implicit knowledge
included in a domain description. Reasoning is performed by using the set of
operators included in the DL.
The principal capability of reasoners is subsumption, i.e. the capability of
checking whether a concept is a specialization of some other concept. For
example, looking at the DL represented in Fig. 3-4, one could be interested
in knowing whether the concept Array of Elliptic Aperture Antennas is a
subconcept of the concept Array of Aperture Antennas. Note that no direct
is-a link connects the two concepts, nor does inheritance apply. Suppose that
the DL defines the two concepts with the following expressions:

$$\mathrm{EllipticApertureArray} \equiv \mathrm{Array} \sqcap \forall \mathrm{hasPart}.\mathrm{EllipticApertureAntenna} \sqcap \exists \mathrm{hasPart}.\mathrm{EllipticApertureAntenna}$$

$$\mathrm{ApertureArray} \equiv \mathrm{Array} \sqcap \forall \mathrm{hasPart}.\mathrm{ApertureAntenna} \sqcap \exists \mathrm{hasPart}.\mathrm{ApertureAntenna}$$

Joining these expressions with the fact that the concept
EllipticApertureAntenna specializes ApertureAntenna:

$$\mathrm{EllipticApertureAntenna} \sqsubseteq \mathrm{ApertureAntenna}$$

the reasoner is able to conclude that the answer is positive.

The above example regards the TBox, but reasoners may operate on ABoxes
too. The basic reasoning task over an ABox is the so-called instance
checking, i.e. the inference of whether a specific individual is an instance of a
specific concept. Other tasks are realization, i.e. the search for the most
specific concept an individual is an instance of, and retrieval, i.e. the discovery
of all the instances of a given concept.
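
Instance checking and retrieval can be sketched on a toy knowledge base in which every concept has at most one explicitly stated superconcept. This is a strong simplifying assumption: a real reasoner also derives subsumptions from concept definitions, as in the example of Fig. 3-4. The class ToyKnowledgeBase, its methods and the individuals a1 and d1 are hypothetical names introduced for this sketch.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class ToyKnowledgeBase {
    // TBox (simplified): each concept has at most one stated superconcept.
    private final Map<String, String> superconceptOf = new HashMap<>();
    // ABox (simplified): each individual has one asserted concept.
    private final Map<String, String> assertedConceptOf = new HashMap<>();

    void isA(String sub, String sup) {
        superconceptOf.put(sub, sup);
    }

    void assertInstance(String individual, String concept) {
        assertedConceptOf.put(individual, concept);
    }

    // Subsumption on the toy tree: walk up the (acyclic) is-a chain from sub.
    boolean isSubsumedBy(String sub, String sup) {
        for (String c = sub; c != null; c = superconceptOf.get(c)) {
            if (c.equals(sup)) {
                return true;
            }
        }
        return false;
    }

    // Instance checking: is the individual a member of the concept?
    boolean isInstanceOf(String individual, String concept) {
        String asserted = assertedConceptOf.get(individual);
        return asserted != null && isSubsumedBy(asserted, concept);
    }

    // Retrieval: all known individuals that are members of the concept.
    Set<String> retrieve(String concept) {
        Set<String> result = new TreeSet<>();
        for (String individual : assertedConceptOf.keySet()) {
            if (isInstanceOf(individual, concept)) {
                result.add(individual);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        ToyKnowledgeBase kb = new ToyKnowledgeBase();
        kb.isA("ApertureAntenna", "Antenna");
        kb.isA("EllipticApertureAntenna", "ApertureAntenna");
        kb.isA("Dipole", "Antenna");
        kb.assertInstance("a1", "EllipticApertureAntenna"); // hypothetical individual
        kb.assertInstance("d1", "Dipole");                  // hypothetical individual

        System.out.println(kb.isInstanceOf("a1", "Antenna"));
        System.out.println(kb.retrieve("ApertureAntenna"));
    }
}
```

In this simplified setting, realization is trivial, since the most specific concept of an individual is the one it was asserted with.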
Reasoners are useful both in the building phase and in the deployment phase.
In the former case, the reasoner is used to reveal DL inconsistencies,
redundancies, misclassifications, etc. An example of an error made when
building the model and discoverable by reasoners is the generation of so-called
empty concepts, i.e. of concepts which can never be populated by
individuals. In this case we talk about concept unsatisfiability.

For example, suppose that the concepts CoaxialApertureAntenna and
RectangularApertureAntenna are disjoint:

$$\mathrm{CoaxialApertureAntenna} \sqsubseteq \neg \mathrm{RectangularApertureAntenna}$$

Suppose also that we define the concept HornAntenna as a subconcept of
CoaxialApertureAntenna:

$$\mathrm{HornAntenna} \sqsubseteq \mathrm{CoaxialApertureAntenna}$$

Suppose now that we define the concept HornAntenna as a subconcept of
RectangularApertureAntenna too:

$$\mathrm{HornAntenna} \sqsubseteq \mathrm{RectangularApertureAntenna}$$

The reasoner will classify the HornAntenna concept as empty. The
reason is that the HornAntenna superconcepts are disjoint from each other, i.e.
individuals being members of the CoaxialApertureAntenna concept cannot
belong to the RectangularApertureAntenna concept as well, and vice versa.
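
The HornAntenna example can be reproduced with a toy check that flags a concept as empty when two of its (direct or indirect) superconcepts are declared disjoint. This covers only this particular source of unsatisfiability, and the class EmptyConceptDemo with its methods is a hypothetical sketch rather than the algorithm used by actual reasoners.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class EmptyConceptDemo {
    // Stated is-a links: a concept may have several superconcepts.
    private final Map<String, Set<String>> superconcepts = new HashMap<>();
    // Disjointness axioms, stored in both orders as "A|B" and "B|A".
    private final Set<String> disjointPairs = new HashSet<>();

    void isA(String sub, String sup) {
        superconcepts.computeIfAbsent(sub, k -> new HashSet<>()).add(sup);
    }

    void disjoint(String a, String b) {
        disjointPairs.add(a + "|" + b);
        disjointPairs.add(b + "|" + a);
    }

    // The concept itself plus everything reachable through is-a links.
    Set<String> selfAndAncestors(String concept) {
        Set<String> seen = new HashSet<>();
        Deque<String> toVisit = new ArrayDeque<>();
        toVisit.push(concept);
        while (!toVisit.isEmpty()) {
            String current = toVisit.pop();
            if (seen.add(current)) {
                toVisit.addAll(superconcepts.getOrDefault(current, Collections.emptySet()));
            }
        }
        return seen;
    }

    // A concept is flagged as empty if it falls under two disjoint concepts.
    boolean isEmptyConcept(String concept) {
        Set<String> all = selfAndAncestors(concept);
        for (String a : all) {
            for (String b : all) {
                if (disjointPairs.contains(a + "|" + b)) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        EmptyConceptDemo tbox = new EmptyConceptDemo();
        tbox.disjoint("CoaxialApertureAntenna", "RectangularApertureAntenna");
        tbox.isA("HornAntenna", "CoaxialApertureAntenna");
        tbox.isA("HornAntenna", "RectangularApertureAntenna");

        System.out.println(tbox.isEmptyConcept("HornAntenna"));            // prints true
        System.out.println(tbox.isEmptyConcept("CoaxialApertureAntenna")); // prints false
    }
}
```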
This capability of reasoners is particularly useful for the Semantic Web.
Indeed, given the great abundance of resources and the multiplicity of
domains of discourse, ontologies are often built by merging a number of
autonomously generated ontologies. In this way, each domain implements its
specific ontology and multidisciplinary applications refer to wider ontologies
obtained by the aggregation of specific ones. This process is risky, as
conflicting concepts may emerge after the merge. The reasoner's capability of
identifying inconsistencies is then used in a debugging phase, when the
merged ontology is tested.
In the deployment phase, reasoners are used for searching resources in
response to a given query. The capability to build taxonomies of
concepts (classification) is a good help in both the development and
deployment phases. Classification is the computation of all hidden
specialization relationships, in order to organize the concepts in a
hierarchical way. This facilitates both debugging and navigation of the
model. Fig. 3-5 shows the classification of the model depicted in Fig. 3-4 by
a hypothetical reasoner.

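[Figure 3-4: diagram of the concepts Antenna, Array, Aperture Antenna,
Coaxial Aperture Antenna, Rectangular Aperture Antenna, Elliptic Aperture
Antenna, Array of Aperture Antennas and Array of Elliptic Aperture Antennas,
with hasPart links between the array concepts and the antenna concepts.]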

Figure 3-4. A reasoner can infer the hidden is-a relationship between Array of Elliptic
Aperture Antennas and Array of Aperture Antennas classes.

[Figure 3-5: hierarchy with Antenna and Array at the top; Aperture Antenna
under Antenna and Array of Aperture Antennas under Array; Coaxial,
Rectangular and Elliptic Aperture Antenna under Aperture Antenna; Array of
Elliptic Aperture Antennas under Array of Aperture Antennas.]

Figure 3-5. Classification of the model represented in Fig. 3-4. Concepts are rearranged in a
hierarchical fashion, according to explicit and implicit specializations.
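
The hidden is-a relationship of Fig. 3-4 can be reproduced with a small Python sketch. It assumes, as an illustration only, that each array concept is defined as "an Array having some part of a given antenna concept", and it applies a single structural rule; a real reasoner uses far more general algorithms. The dictionaries and function names below are hypothetical.

```python
# Told (explicit) specializations, as assumed from Fig. 3-4.
told_parents = {
    "Antenna": [],
    "Array": [],
    "ApertureAntenna": ["Antenna"],
    "CoaxialApertureAntenna": ["ApertureAntenna"],
    "RectangularApertureAntenna": ["ApertureAntenna"],
    "EllipticApertureAntenna": ["ApertureAntenna"],
}

# Defined concepts: name -> (base concept, role, filler concept).
definitions = {
    "ArrayOfApertureAntennas": ("Array", "hasPart", "ApertureAntenna"),
    "ArrayOfEllipticApertureAntennas": ("Array", "hasPart", "EllipticApertureAntenna"),
}


def is_subconcept(child, parent):
    """True if child equals parent or reaches it through told specializations."""
    if child == parent:
        return True
    return any(is_subconcept(p, parent) for p in told_parents.get(child, ()))


def inferred_subsumptions():
    """Apply the rule: same role, more specific base and filler => subconcept."""
    found = []
    for sub, (sub_base, sub_role, sub_filler) in definitions.items():
        for sup, (sup_base, sup_role, sup_filler) in definitions.items():
            if (sub != sup and sub_role == sup_role
                    and is_subconcept(sub_base, sup_base)
                    and is_subconcept(sub_filler, sup_filler)):
                found.append((sub, sup))
    return found


print(inferred_subsumptions())
```

The only inferred pair places ArrayOfEllipticApertureAntennas under ArrayOfApertureAntennas, which is the implicit specialization made explicit in Fig. 3-5.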

3. TOOLS FOR THE SEMANTIC WEB

3.1 Languages

In the Semantic Web, resources are given a well-defined meaning, and
applications can automatically discover, use and integrate them. For this
purpose, several languages have been developed to encode knowledge.
The earliest is the so-called Resource Description Framework (RDF) [RDFS,
2005], a standard language defined by W3C. RDF represents reality as a
labeled directed graph. The graph is described by a set of triples, in the form
known as subject-verb-object:

(Subject, Predicate, Object)


where Predicate is the label of the link joining Subject with Object.
Subject is a resource (for example a Web page identified by its URL).
Object may be a resource or a literal (e.g. a string). RDF is a very simple
language, whose limits soon became apparent: it lacks fundamental
constructors needed to represent reality.
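
A minimal Python sketch of the triple model follows. It stores triples as plain tuples and does not use any RDF library or serialization syntax; the example.org URLs, the ex: predicates and the match function are hypothetical names chosen for illustration.

```python
# Each triple is (Subject, Predicate, Object).
# Subjects are resources; objects are either resources or literals (here the string "15").
triples = {
    ("http://example.org/horn1", "rdf:type", "http://example.org/HornAntenna"),
    ("http://example.org/horn1", "ex:gainInDb", "15"),
    ("http://example.org/array1", "ex:hasPart", "http://example.org/horn1"),
}


def match(subject=None, predicate=None, obj=None):
    """Return the triples matching the given pattern; None acts as a wildcard."""
    return sorted(
        t for t in triples
        if (subject is None or t[0] == subject)
        and (predicate is None or t[1] == predicate)
        and (obj is None or t[2] == obj)
    )


# All the links leaving the node horn1 in the labeled directed graph.
for triple in match(subject="http://example.org/horn1"):
    print(triple)
```

Each tuple corresponds to one labeled edge of the graph, so a query reduces to pattern matching over the set of edges.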
In response to the limits of RDF, the DARPA Agent Markup Language (DAML)
was developed through an effort of the U.S. government. Soon after, the
Ontology Inference Layer (OIL) was defined, yielding the DAML + OIL language.
DAML + OIL [DAML + OIL, 2005] is built on top of RDF but includes
a much richer set of operators. The already reviewed Tables 3-1 and 3-2
substantially summarize the operators included in DAML + OIL. A peculiarity
of DAML + OIL documents is their interlacing of A-Boxes and
T-Boxes, with no clear distinction between them.
DAML + OIL has been demonstrated to be an extension of a well-known
DL, which, for the sake of brevity, will not be treated herein. The
distinguishing features of DAML + OIL with respect to this DL are nominals
and concrete datatypes. Nominals are special classes made of a single
instance. Concrete datatypes are concepts representing RDF literals or XML
Schema types. This means that instances of datatypes are values of a simple
type, such as those defined by RDF literals or by the W3C XML Schema (see
Chapter 1): strings, numbers, dates, etc. This makes it possible to
associate a concept with attributes of a specific type (such as integer,
real, etc.), which is useful for describing properties of the entities
represented in the model (for example, an integer could hold the number of
elements of array individuals).
The most recent emerging standard proposed by W3C is the so-called
Web Ontology Language (OWL) [OWL, 2002], which supports three
sub-languages: OWL-Lite, OWL-DL and OWL-Full.
OWL-Lite is the simplest language and allows the definition of simple
taxonomies and little more. OWL-DL is an extension of DAML + OIL, thus
being more expressive than OWL-Lite while retaining computational
completeness (all conclusions are guaranteed to be computable) and
decidability (all computations finish in a finite time). OWL-Full is the most
expressive, but it is not completely supported by OWL reasoners. It is
recommended only when decidability and computational completeness are less
stringent requirements than expressiveness.

3.2 Reasoners

A number of freeware tools with reasoning capabilities exist (Racer, Fact
and Pellet are some of them). They differ in the algorithms used and
in the languages supported. Most of them are DIG-compliant. DIG is a
standard that provides a specification for a common way of connecting to
DL reasoners. A DIG-compliant reasoner is a Description Logic reasoner
that provides a standard access interface (known as the DIG interface), which
enables the reasoner to be accessed over HTTP, using the DIG language.
A common DIG-compliant reasoner is Racer [Racer, 2005]. It supports
reasoning on both T-Boxes and A-Boxes (the latter being known to require a
higher degree of computational complexity). Racer supports OWL-DL and is
available as an open source freeware application program providing a
Semantic Web Inference Server.

3.3 Tools for Building Ontologies

In order to make DL practical, graphical user interfaces are critical for
developing ontologies. OilEd and Protégé are the most widely used tools for
building ontologies. Both use reasoning to support ontology design. OilEd
[OilEd, 2005] was developed with the major objective of making DL technology
available to a large community. Its scalability and flexibility are
limited. Protégé [Protégé, 2005] has proved to be a more robust tool,
supporting large ontologies and providing an extensible architecture with a
plug-in philosophy. The recently developed Protégé OWL Plugin comes
from a joint effort of the Protégé developers and the OilEd team, in the
context of the CO-ODE project. The Protégé OWL Plugin offers a graphical
user interface to build and/or load ontologies in the OWL language. It
connects to reasoners via the DIG interface for consistency checking and
classification and offers an API for building and querying (via the DIG
interface) ontologies programmatically.

References
Baader, F. et al., 2003, The Description Logics Handbook: Theory, Implementation and
Applications, Cambridge University Press, 2003.
Berners-Lee, T., et al., 2001, The Semantic Web, Scientific American, 284(5), pp. 34-43.
DAML + OIL, 2001, http://www.daml.org/2001/03/daml+oil-index.html.
OilEd, 2005, http://oiled.man.ac.uk.
OWL, 2002, http://www.w3.org/TR/2002/WD-owl-ref-20020729/.
Protégé, 2005, http://protege.stanford.edu.
Racer, 2005, http://www.sts.tu-harburg.de/~r.f.moeller/racer/.
RDFS, 2005, http://www.w3.org/TR/2002/WD-rdf-schema-20020430/.

Bibliography

Baader, F. et al., 2003, The Description Logics Handbook: Theory, Implementation and
Applications, Cambridge University Press.
Berners-Lee, T., Hendler, J., and Lassila, O., 2001, The Semantic Web, Scientific American,
284(5), pp. 34-43.
DAML + OIL, 2001, http://www.daml.org/2001/03/daml+oil-index.html.
Enderton, H. B., 1972, A mathematical introduction to logic, Academic Press.
Fensel, D., M. R., Nilsson N., 2001, Ontologies: A Silver Bullet for Knowledge Management
and Electronic Commerce, Logical Foundation of Artificial, Berlin, Germany, (Springer,
2001).
Kaufmann, Morgan, 1986, Intelligence, Palo Alto, CA.
Levesque, H. J., 2002, A logic of implicit and explicit belief. In Proceedings of the Fourth
National Conference on Artificial Intelligence, Austin, TX, 198-202.
OWL; http://www.w3.org/TR/2002/WD-owl-ref-20020729/.
Ontology Portal, 2005; http://www.ontologyportal.org.
OPENCYC, 2005; http://www.opencyc.org.
Protégé, 2005; http://protege.stanford.edu.
RDFS, 2002; http://www.w3.org/TR/2002/WD-rdf-schema-20020430/.
Chapter 4
WEB SERVICES

A. Esposito
University of Lecce, Italy

Abstract: Web Services propose a new model for implementing applications, which
promotes reusability and cooperation. In this chapter, the basic concepts
behind Web Services are explained and the main standards supporting them
are described. A brief introduction to semantic-driven service-oriented
architectures is provided at the end of the chapter.

Key words: Service-Oriented Architecture; SOA; Web Services; WSDL; XML; UDDI;
SOAP; Semantic Web; OWL-S.

1. INTRODUCTION

Since the advent of the Web, Web-enabled applications have become part
of our daily work and life. The Web model works so well that its popularity
has grown beyond expectations. One of the side effects of this boom
is the increased complexity of Web applications. Nowadays it is possible to
teach through the Web, to make business transactions and to experiment with
complex scientific software. One of the key technologies for managing such
complexity is Web Services.
Web Services envisage applications as dynamic assemblages of
distributed software, thus promoting software reuse and interoperation. Web
Services make it possible to build new applications by integrating pieces of
software (services) available on disparate platforms, rather than implementing
complex monolithic applications from scratch. This idea is not new, as
demonstrated by the plethora of other distributed frameworks such as
CORBA and DCE. The success of Web Services over earlier forms of
distributed computing architectures is due to a number of reasons.
First of all, the success of Web Services is related to their suitability
for interfacing loosely coupled applications, where the client has little
a-priori knowledge of the service to be called. This is very useful in the
Web and in any distributed environment where the partners possess reduced
knowledge of each other. In these cases, the end-users may not know which
services are available and which tasks they perform. Web Services solve this
by equipping services with information that makes them self-describing
components which can be queried, located and invoked in the net. Indeed,
services provide a standardized description of their properties and publish
it. Once the publication is performed, the service can be searched
for by another application, which can automatically identify and invoke it.

L. Tarricone and A. Esposito (eds.), Advances in Information Technologies for Electromagnetics, 45-54.
© 2006 Springer. Printed in the Netherlands.

The boom of Web Services is also closely related to their focus on Internet
standards. Web Services propose a collection of Internet-wide standards
for implementing a service-oriented environment. The simplicity and
ubiquity of the adopted standards are key points for interoperability and
cooperation.
In conclusion, Web Services promise a revolutionized Web, where a
number of activities performed nowadays by human beings are carried out
by programs: whenever a chaining of tasks is needed, Web Services offer
the opportunity to do that programmatically. This is very appealing in
e-business, where a flow of transactions must be established at each new
client request, and in cooperative engineering, where the on-line definition
of a workflow of autonomously implemented tasks permits the solution
of complex and multidisciplinary problems.

2. BASIC CONCEPTS

2.1 Web Services Architecture

Web Services are a kind of Service Oriented Architecture (SOA). SOAs
include standards, paradigms and languages for the automatic interaction
between autonomous applications (services) [Snell et al., 2002]. In a SOA
each resource is a service, i.e. it can be invoked through the network via a
well-defined interface. The interface is the description of the way of
interacting with the service, abstracting from inner details, such as
implementation languages and platform. Existing pieces of software can be
enabled to behave as services by simply adding a shell which specifies their
interface.
This feature is critical for permitting interoperability among
heterogeneous codes and platforms: once the shell is available, the
programs are exploitable via the net like any other service, and
complex applications can be built from them once the service workflow is
specified.
Moreover, services are self-describing: they describe their interface in a
machine-readable language, so that when an application contacts one of
them, it may acquire information on how to interoperate with it. This allows
the so-called late binding, i.e. an application may establish at run time
which service to invoke, based on its current state.
When describing its interface, the service specifies the messages it
exchanges to carry out an operation. Services exchanging the same set of
messages expose the same interface. Indeed, they are not distinguishable by
the calling application, even though they may correspond to different
implementations and behaviors, which are completely hidden by the
interface. This is due to the fact that the interface describes the syntactical
properties of services, with no reference to their semantic behavior. In other
words, the interface describes how to invoke a service, without saying
anything about what it does.
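
The distinction between syntax and semantics can be illustrated with a short Python sketch. The two classes below are hypothetical services, not part of any Web Services toolkit: they expose the same operation signature, so a caller inspecting only the interface cannot tell them apart, yet they compute different things.

```python
import inspect
import math


class DecibelToLinear:
    """Hypothetical service: converts a power gain from dB to a linear ratio."""

    def convert(self, value: float) -> float:
        return 10 ** (value / 10)


class LinearToDecibel:
    """Hypothetical service: converts a linear power ratio to dB."""

    def convert(self, value: float) -> float:
        return 10 * math.log10(value)


def same_interface(first, second, operation):
    """Compare only the syntactic signature of one operation."""
    return (inspect.signature(getattr(first, operation))
            == inspect.signature(getattr(second, operation)))


interfaces_match = same_interface(DecibelToLinear, LinearToDecibel, "convert")
results_differ = DecibelToLinear().convert(20.0) != LinearToDecibel().convert(20.0)
print(interfaces_match, results_differ)
```

Both flags are true: the signatures coincide while the behaviors do not, which is why a purely syntactic interface needs to be complemented by discovery tools.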
For this reason, Web Services implementations generally include some
discovery tools to distinguish between different implementations of services.
Discovery is generally performed by means of a central repository, with the
specific role of maintaining a list of available services and their properties.
The client inputs the requirements and queries the repository to obtain the
list of matching services.
What has been explained so far is exemplified by the scheme reported in
Fig. 4-1, where the complete cycle for invoking a service is described. It is
composed of three procedures:

registration: the service informs the repository about its existence and
properties. This operation may be performed manually or programmatically.
In the first case, the service provider, i.e. the person who owns the service,
updates the repository through a user interface. In the second case, the
service itself, or some software acting on its behalf, autonomously contacts
the repository. A common model for automatic updating is the so-called
publish/subscribe model, where the service may be configured to send
asynchronous events to recipients that have asked to receive
notifications;
discovery: the client contacts the repository and queries for a list of
services matching input requirements. The repository returns the list of
matching services. This procedure may happen manually or
programmatically too;
invocation: the client contacts the matching service to obtain a
description of the supported interface. The service returns the supported
interfaces described in a standard format. The client transforms the
description of the service interface into interface definitions of a
programming language so that the service invocation may take place.

As said above, these steps may happen programmatically, without any
human intervention, and assume that no a-priori knowledge of the remote
services is available. The only information that must be known in advance
is the address of the repository.
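
The three procedures can be sketched in Python as follows. This is an in-memory illustration under simplifying assumptions (no network, keyword-based matching); the Repository and GainService classes and their methods are hypothetical and do not correspond to any standard API.

```python
import math


class Repository:
    """Hypothetical central repository keeping services and their properties."""

    def __init__(self):
        self._entries = []

    def register(self, name, keywords, service):
        self._entries.append((name, set(keywords), service))

    def discover(self, required_keywords):
        required = set(required_keywords)
        return [(name, service) for name, keywords, service in self._entries
                if required <= keywords]


class GainService:
    """Hypothetical service that describes its own interface."""

    def describe(self):
        return {"operation": "gain_db", "inputs": ["linear_gain"]}

    def gain_db(self, linear_gain):
        return 10 * math.log10(linear_gain)


repository = Repository()

# 1. Registration: the service (or its provider) informs the repository.
repository.register("gain", {"antenna", "gain"}, GainService())

# 2. Discovery: the client only knows the repository and its own requirements.
name, service = repository.discover({"gain"})[0]

# 3. Invocation: the client first obtains the interface, then calls the operation.
interface = service.describe()
result = getattr(service, interface["operation"])(100.0)
print(name, interface["operation"], result)
```

The client code never names GainService or gain_db directly: both are obtained at run time, which mirrors the late binding discussed earlier.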

[Figure 4-1: diagram with three nodes (services, client and repository)
linked by the register, discover and invoke operations.]

Figure 4-1. Complete cycle for invoking a service. Services inform a central repository about
their properties. Clients contact the repository to discover services matching some input
requirements. Once the address of the best-matching service is available, the client contacts it
to get info about the way to invoke it.

As said before, Web Services differ from other approaches in their focus
on open, simple, Internet-based standards [Cerami, 2002], some of which are
briefly overviewed in the following sections. Section 3 deals with the
standard for describing Web Services interfaces, the so-called Web Services
Description Language (WSDL). Section 4 introduces the standards related to
the discovery process: the standard mechanism for registering Web Services,
named Universal Description, Discovery and Integration (UDDI) is
overviewed in Section 4.1, whilst emerging standards for more sophisticated
semantic-driven, automatic discovery are treated in Section 4.2.

3. WEB SERVICES DESCRIPTION: WSDL

Web Services are self-describing applications, i.e. their interfaces are
defined using a standardized language, the Web Services Description
Language (WSDL). This feature permits the clients to call a remote service
by specifying at run time (rather than at compile time) the method to be
invoked and the parameters to be passed: all that is required is the address
of the remote service. Given that address, the application can retrieve the
WSDL document, parse its contents and automatically generate the classes or
programs that invoke the service.
A WSDL file is an XML document (see Chapter 1). It is basically
composed of two parts, the so-called port-type and the so-called binding.
The former concentrates on service operations and the messages needed to
carry them out. The latter focuses on the protocols supported by the service.
The port-type provides an abstract description of the operations
supported by the service (similarly to Java interfaces or C++ classes).
Operations are carried out by exchanging ordered sets of messages with
clients. Typical messages are the input, output and fault messages. Input
messages specify the set of parameters to be passed to operations. Output
messages specify the set of parameters returned by operations. Fault messages
specify the error conditions that may occur when calling the operation.
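
As an illustration of how a client might read the port-type, the following Python sketch parses a deliberately minimal WSDL 1.1 fragment with the standard xml.etree.ElementTree module and lists each operation with its input, output and fault messages. The GainService document, its example.org namespace and the list_operations helper are assumptions made for the example; a real WSDL file also contains types, bindings and service sections.

```python
import xml.etree.ElementTree as ET

WSDL_NS = {"wsdl": "http://schemas.xmlsoap.org/wsdl/"}

# Hypothetical, stripped-down WSDL document containing only messages and a port-type.
WSDL_TEXT = """<definitions name="GainService"
    targetNamespace="http://example.org/gain"
    xmlns="http://schemas.xmlsoap.org/wsdl/"
    xmlns:tns="http://example.org/gain">
  <message name="GainRequest"/>
  <message name="GainResponse"/>
  <message name="GainFault"/>
  <portType name="GainPortType">
    <operation name="computeGain">
      <input message="tns:GainRequest"/>
      <output message="tns:GainResponse"/>
      <fault name="invalidInput" message="tns:GainFault"/>
    </operation>
  </portType>
</definitions>"""


def list_operations(wsdl_text):
    """Map each port-type operation to its input, output and fault messages."""
    root = ET.fromstring(wsdl_text)
    operations = {}
    for port_type in root.findall("wsdl:portType", WSDL_NS):
        for operation in port_type.findall("wsdl:operation", WSDL_NS):
            operations[operation.get("name")] = {
                kind: [m.get("message")
                       for m in operation.findall(f"wsdl:{kind}", WSDL_NS)]
                for kind in ("input", "output", "fault")
            }
    return operations


print(list_operations(WSDL_TEXT))
```

From such a mapping a client-side tool could generate the stubs that invoke the service; that generation step is not shown here.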
The binding section ties the abstract definition of operations provided by
the port-type section to concrete packaging and transport protocols.
Packaging protocols specify the format of messages when traveling in the
Net so that the two end-points may understand them. A well-known
packaging format is HTML, which packages information on how to visualize
Web documents in a format that is easily understandable by any type of
browser and platform. The standard packaging protocol for Web Services is
the Simple Object Access Protocol (SOAP), which specifies how to format
Web Services invocations in a form independent of platforms and
languages. As seen in Chapter 1, XML messaging is such a form and SOAP
is an XML-based protocol. Transport protocols specify the underlying
communication mechanisms. Web Services support the most common
transport protocols used in the Web, such as HTTP, FTP and SMTP.
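
To give an idea of what packaging means in practice, the following Python sketch builds a SOAP 1.1 envelope for a hypothetical computeGain operation using only the standard library. The operation name, its example.org namespace and the build_request helper are illustrative assumptions; sending the message over a transport such as HTTP is left out.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"


def build_request(operation, namespace, parameters):
    """Wrap an operation call and its parameters into a SOAP 1.1 envelope."""
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    call = ET.SubElement(body, f"{{{namespace}}}{operation}")
    for name, value in parameters.items():
        # Parameters are serialized as text, independently of the caller's language.
        ET.SubElement(call, name).text = str(value)
    return ET.tostring(envelope, encoding="unicode")


request_xml = build_request("computeGain", "http://example.org/gain",
                            {"linearGain": 100.0})
print(request_xml)
```

The resulting string is plain XML, so any platform able to parse XML can unpack the request regardless of how the service is implemented.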
The WSDL description of port-type and binding information establishes a
sort of contract between the client and the service, thus enabling their
communication. To retrieve the WSDL document, a-priori knowledge of
the address of the remote service is needed, i.e. the client must know which
service does the work. In other words, the WSDL document informs the
client about how it can invoke the service, once the appropriate service has
been selected. In some circumstances the client may not know where the
most appropriate services for its needs reside. This requires a procedure to
assist the client in discovering the service, as described in the following
section.

4. AUTOMATIC DISCOVERY OF WEB SERVICES

4.1 UDDI

In loosely coupled environments, users may not know which services are
available or which tasks they are able to execute. Moreover, Web Services
are dynamic environments built over the Internet, where the published
services and their properties may change frequently over time. This creates
the need for a discovery service, where services are classified so that they
can be located.
The standard discovery service for Web Services is UDDI [UDDI, 2005].
UDDI is a technical specification for implementing searchable repositories
for services. It gives technical guidelines both for publishing services to the
repository and for searching them. The specification also includes an API for
publishing and for querying.
UDDI is basically composed of three parts:

white pages: include general information about registered companies,
such as business description, contact information, unique business
identifiers, and so on;
yellow pages: contain taxonomies of registered services and
companies;
green pages: focus on the technical information needed to invoke the
registered service. Generally this is a pointer to a technical document
(such as the WSDL).
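
The three parts can be pictured with a small Python sketch. It is only a data-structure illustration and does not reproduce the UDDI data model or its API; the RegistryEntry class, its fields, the sample entries and the search function are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class RegistryEntry:
    business_name: str   # white pages: who provides the service
    contact: str         # white pages: how to reach the provider
    categories: set      # yellow pages: taxonomy terms classifying the service
    wsdl_url: str        # green pages: pointer to the technical description


def search(entries, keywords):
    """Return the entries whose categories contain all the given keywords."""
    wanted = {keyword.lower() for keyword in keywords}
    return [entry for entry in entries
            if wanted <= {category.lower() for category in entry.categories}]


entries = [
    RegistryEntry("Lab A", "lab-a@example.org", {"antenna", "gain"},
                  "http://example.org/gain?wsdl"),
    RegistryEntry("Lab B", "lab-b@example.org", {"antenna", "pattern"},
                  "http://example.org/pattern?wsdl"),
]

# Keyword-based narrowing: each additional keyword shrinks the matching set.
print(len(search(entries, ["antenna"])))
print([entry.wsdl_url for entry in search(entries, ["antenna", "gain"])])
```

Once a single entry remains, its green-pages pointer gives the client the address from which the WSDL document can be retrieved.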

UDDI allows a substantially manual discovery procedure, where a
search similar to that offered by traditional Web search tools is performed:
the end-user narrows the matching set with an iterative process requiring the
insertion of more and more refined keywords. More flexibility is offered by
the Semantic Web. As explained in Chapter 3, the Semantic Web adds a
structured conceptual description (ontology) to the published resources, so
that the client may adopt meaningful terms and relations (similar to nouns
and verbs of human languages) to express the requirements. Results of
recent efforts for bridging the Semantic Web infrastructure with Web
Services are overviewed in the following section.

4.2 The Semantic Web Services

As described in Chapter 3, the Semantic Web gives resources a well-defined
meaning, so that applications can automatically discover, use and
integrate them. For this purpose, appropriate languages have been
developed to encode knowledge. As seen in Chapter 3, the most recent
emerging standard is OWL [OWL, 2002].
A specialized OWL-based language (OWL-S) is now being developed to
tailor OWL to Web Services needs. OWL-S [OWL-S, 2005] is an upper
ontology for services. In other words, it defines a number of concepts so
general as to be valid in every domain. When implementing a domain-specific
framework, the OWL-S ontology must be specialized, i.e. a domain-specific
ontology must be built by instantiating and specializing OWL-S high-level
concepts.
In Fig. 4-2, OWL-S basic concepts are depicted. As shown in the figure,
OWL-S is centered around the general concept named Service, which is
linked to three other fundamental concepts: Service Profile, Service Model
and Service Grounding.
Service Profile is a representation of what the service does. It provides a
human readable description of the service and of its provider, a description
of the service functionalities and a list of attributes, such as response time
and constraints.
Service Model describes how the service works, i.e. it tells what happens
when the service is carried out. The service model assumes that services can
be complex entities, possibly composed of a number of simpler services
(which may, in their turn, be decomposable too). This is done by
representing services as processes. A process is an abstract view of the
service, in terms of its inputs, outputs, preconditions and, when appropriate,
its components. A process can be atomic, composite or simple (Fig. 4-3).
An atomic process corresponds to an invocable Web Service. A composite
process is a process composed of other processes: it corresponds to several
execution steps. Instead, a simple process corresponds to a single execution
step: it may be an abstract view of an atomic process or a simplified
representation of a composite process.
This model makes it possible to define a sort of meta-service, obtained by
orchestrating a number of more elementary services to achieve a global task
that the single services cannot accomplish on their own. The model also
makes it possible to specify the way elementary processes are composed (in
sequence, in parallel, etc.) by means of a set of control constructs.
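
The idea of building a composite process from simpler ones can be sketched in Python as follows. The code is not OWL-S (which is an ontology, not a programming library): AtomicProcess and Sequence are hypothetical classes that mimic an atomic process and a sequential control construct, and the arithmetic steps stand in for real service invocations.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class AtomicProcess:
    """Stands for a directly invocable service (one execution step)."""
    name: str
    action: Callable[[float], float]

    def run(self, value):
        return self.action(value)


@dataclass
class Sequence:
    """Composite process: runs its components one after the other."""
    name: str
    steps: List[object]

    def run(self, value):
        for step in self.steps:
            value = step.run(value)  # the output of a step feeds the next one
        return value


double = AtomicProcess("double", lambda x: 2 * x)
increment = AtomicProcess("increment", lambda x: x + 1)

# A composite process can itself be a component of another composite process.
inner = Sequence("double_then_increment", [double, increment])
outer = Sequence("inner_then_double", [inner, double])

print(outer.run(3))  # ((3 * 2) + 1) * 2 = 14
```

Because atomic and composite processes share the same run operation, a client can treat the whole composition as a single step, which corresponds to the role of a simple process in the model.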
Service grounding gives info on how to invoke the service, by providing
concrete details on communication protocols and message exchange formats.

[Figure 4-2: diagram in which Resource provides Service; Service presents
Service Profile, isDescribedBy Service Model and supports Service Grounding;
Service Model hasProcess Process Model.]

Figure 4-2. Top level of the OWL-S service ontology.

[Figure 4-3: nested diagram of a composite process containing a composite
process, a simple process and an atomic process.]

Figure 4-3. Composition of services. A composite process is an abstract view of a service
made up of several services. The composite process is obtained by composing a number of
processes, which can be composite in their turn. Simple processes correspond to single
execution steps: they may be an abstract view of atomic processes (i.e. runnable services) or a
simplified representation of some composite processes.

References
Cerami, E., 2002, Web Services, O'Reilly & Associates, Inc.
OWL-S, 2005, http://www.daml.org/services.
Snell et al., 2002, Programming Web Services with SOAP, O'Reilly & Associates, Inc.
UDDI, 2005, http://www.uddi.org.

Bibliography
Basic Profile Version 1.1, 2005; http://www.ws-i.org/Profiles/BasicProfile-1.1-2004-08-24.html.
Resource Description Framework, 2005; http://www.w3.org/RDF/.
Web Services Interoperability Organization; http://www.ws-i.org.
WS Reliable Messaging 1.0, 2005;
http://www-128.ibm.com/developerworks/webservices/library/specification/ws-rm/.
Chapter 5
GRID COMPUTING

A. Esposito
University of Lecce, Italy

Abstract: This chapter provides an overview of Grid Computing and of the de-facto
standard for grid middleware (the Globus Toolkit). A special focus is devoted
to OGSA and integration of grid computing with Web Services. Basic
concepts about Semantic Grid Computing are provided at the end of the
chapter.

Key words: Computational Grid; Globus Toolkit; Middleware; OGSA; MPICH-G2;
GridFTP.

1. INTRODUCTION

The name Grid Computing (GC) is inspired by the electrical grid.
Indeed, GC originates from the idea of applying the concepts of the
electrical grid to supercomputing. The electrical grid allows users to
consume and pay for an amount of electrical power that always matches
their needs. Similarly, GC enables software applications to consume
and pay for exactly the amount of CPU power they need. This is achieved by
collecting the CPU power from a pool of networked computers, by virtue of
a distributed computational infrastructure (computational grid). In this way,
software consumers pick up CPU time from idle remote machines, thus
eliminating the need to equip their laboratories with expensive
multiprocessors.
GC is a very recent technology, whose progress is perceivable day after
day. The most impressive advance is the extension of the idea of controlled
sharing of available distributed CPU power to the more general concept of
resource. In grid terms, a resource is any kind of software (a piece of
application, a file, a database, etc.) or hardware entity (an electrical device, a
storage card, etc.) accessible through the network. GC enables users to
access and exploit dispersed resources as if they were local. On one hand, this
promotes a plug and play philosophy, where end-users build software
applications on the fly, by aggregating available software components and
by allocating resources dynamically and on an as-needed basis. On the other
hand, GC promotes cooperation between individuals, as it facilitates their
interaction by providing the tools to exchange data, codes and devices.
As seen in previous chapters, the existing frameworks for parallel and
distributed computing have goals similar to GC. Indeed, each of them
addresses a different facet of GC. Examples are the capability of integrating
dispersed pieces of code offered by Web Services, and the ability to
simultaneously exploit the power of multiple CPUs given by MPI-compliant
frameworks. Each of these technologies features its own
drawbacks and benefits, and no clear winner has emerged so far. GC,
instead, is emerging as a global unifying technology for distributed
computing and cooperative engineering. It does not replace existing
distributed frameworks, some of which exhibit clear advantages in certain
circumstances and environments. Instead, GC interoperates with
well-established technologies and integrates them in a coordinated global
framework which is valid both in wide-area and in local-area environments
and is based on universally accepted standards.

2. GC BASIC CONCEPTS

A computational grid can be viewed as a sort of metacomputer, whose
software and hardware resources are distributed over disparate networked
machines (nodes). Computational grids may span domains of different
dimensions, from local grids, where the nodes belong to a single
organization and are connected via a LAN, to global grids, where the nodes
are owned by different organizations and linked via the Internet. In both
cases a special software (the so-called grid middleware, GM) makes it
possible to access the dispersed resources as if they were local.
GM smooths out the heterogeneities among the involved entities (operating
systems, storage devices, programming languages, etc.) and hides the
complexity of the underlying environment (Fig. 5-1). Moreover, GM permits a
controlled and optimized management of the grid resources, both from the
owner side and from the consumer side. Resource owners are enabled to
establish security policies for controlling and monitoring the access to and
the exploitation of their resources. Resource consumers are enabled both to
allocate the resources they are granted and to get information about the
state of their applications at any moment.
There are many grid projects worldwide working on the
development of GM software [Baker et al., 2000]. The Globus Toolkit (GT)
is the most promising and is rapidly becoming the GM de facto standard. For
this reason, it is the reference tool discussed in the current chapter and used
throughout the book. A brief introduction to GT is provided in the following
section, while the remaining sections focus on specific services offered by
the tool, with the twofold objective of describing GT and of better clarifying
what a grid is and what it offers.

[Figure 5-1: diagram of a grid made of nodes connected through a network,
each node running the GM.]

Figure 5-1. Example of grid. A grid includes geographically dispersed hardware and software
resources. It can span domains of different dimensions, from local area to wide area domains.
In each case, a special software, the GM, must be installed on each node. The GM is
responsible for unifying the dispersed heterogeneous resources in a global framework, which
is perceived by end-users as a single entity.

3. THE GLOBUS TOOLKIT

GT [GT, 2005] is a joint initiative of the University of Southern
California, the Argonne National Lab and the University of Chicago. Unlike
the majority of alternative tools (for instance Legion [Legion, 2005]),
which tie the end-user to fixed programming paradigms, such as
object-orientation, GT provides an open-source set of autonomous components
and tools.
GT is a very recent technology, continuously evolving and improving.
Originally, GT was conceived as a kit of tools, utilities and libraries for
implementing, managing and using a grid. The openness and flexibility of
the GT design induced several worldwide scientific projects to adopt it, thus
making GT the natural reference tool for grids. In the meantime,
service-oriented environments and Web Services (WS) gained momentum,
demonstrating their validity for cooperation and reusability. This induced the
Globus Alliance to embrace the WS architecture and vision. As a result, the
latest versions of GT (i.e. GT3 and GT4) follow a service-oriented
philosophy. Although they retain a number of old-style components (the
so-called non-WS components), the new versions are conceived as a set of
built-in services (the so-called WS components) and grid-aware applications
are now envisioned as collections of services.
As we will see in the next section, GT4 adopts newer standards than GT3,
but the concepts, philosophy and architecture of the two releases
exhibit a strong continuity.

3.1 GT and Web Services

As a result of the decision to adopt the Web Services architecture and
vision, the Global Grid Forum (GGF) produced the Open Grid Services
Architecture (OGSA). OGSA [Foster et al., 2002] is the specification of the
requirements of GC environments in service-oriented frameworks.
OGSA views a grid as a collection of services, in complete agreement with
the Web Services vision, but expresses some requirements not optimally
matched by Web Services standards, the most critical being the
implementation of the so-called stateful services.
Stateful services are able to remember information from
one invocation to another. An example of a stateful service is the accumulator,
which adds a number provided as input to a stored number and keeps the
result for subsequent calls. Web Services are traditionally supposed to be
stateless. Nothing technically prevents the implementation of stateful Web
Services, but the implementation of this feature is not standardized.
OGSA is just a specification of requirements without any codification of
mechanisms and procedures. A standardization effort was needed to define
the guidelines for implementing an infrastructure matching the OGSA
indications. As a result, in 2003, the so-called Open Grid Services
Infrastructure (OGSI) was released.
OGSI is based on the concept of Grid Services, an adaptation of Web
Services designed to meet the OGSA specifications. The most distinguishing
features of Grid Services with respect to Web Services concern the already
mentioned statefulness, and other issues such as lifecycle management and
the capability of sending asynchronous notifications to other services [Foster
et al., 2003]. GT3 is an implementation of the OGSI specification.
The specification of OGSI introduced a divergence between GC and
Web Services: the differences in tools, standards and mechanisms between
OGSI implementations and WS caused some discouragement
in the IT community. For example, the introduction of Grid
Services required the replacement of WSDL with a revised language derived
from it, the so-called Grid WSDL (GWSDL).
Nonetheless, the experience with Grid Services was instrumental in giving
further impetus to Web Services, their mechanisms and their standards. Grid
Services helped to improve Web Services, in the same way that Web Services
had contributed to the evolution of GC. The earliest result of the cooperation
between the two technologies is the specification of a family of standards,
derived from a joint effort of the GGF and the standards body for Web
Services (OASIS) to find a single specification valid both in GC and in
WS frameworks.
This family of standards includes the so-called WS-Resource Framework
(WSRF), WS-Addressing and WS-Notification specifications.
WSRF removes the disparity between Web Services and Grid Services
with respect to statefulness. With the new standard, Web Services and Grid
Services merge into a stateless entity (the Web service). This choice depends
on the shared opinion that statelessness is a good engineering practice: a
stateless service can be restarted after a fault without concern for its history
and prior interactions and can be easily migrated to other platforms when
load balancing is needed. To match OGSA specification and its advocacy for
statefulness, WSRF assigns to external resources the responsibility of
storing the state (Fig. 5-2): according to WSRF the service contacts an
external resource whenever it needs to modify or access stored data. WSRF
codifies all the mechanisms and procedures needed for associating the
service to external resources and for permitting client applications to invoke
the service and to access stored data.
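
The separation between a stateless service and an external resource can be illustrated with the accumulator example mentioned earlier. The following is only a minimal sketch in plain Python: it does not use any GT or WSRF API, and the names ResourceStore and AccumulatorService are hypothetical, introduced here for illustration.

```python
import uuid


class ResourceStore:
    """Hypothetical external resource holder: it owns the state, not the service."""

    def __init__(self):
        self._values = {}

    def create(self, initial=0):
        # Each resource is identified by a key that the client keeps.
        key = str(uuid.uuid4())
        self._values[key] = initial
        return key

    def read(self, key):
        return self._values[key]

    def write(self, key, value):
        self._values[key] = value

    def destroy(self, key):
        del self._values[key]


class AccumulatorService:
    """Stateless service: it stores no total, only a reference to the store."""

    def __init__(self, store):
        self._store = store

    def add(self, resource_key, amount):
        # Read the state from the external resource, update it, write it back.
        total = self._store.read(resource_key) + amount
        self._store.write(resource_key, total)
        return total


store = ResourceStore()
key = store.create()

first_instance = AccumulatorService(store)
first_instance.add(key, 5)

# Simulate a restart or a migration: a fresh service instance is created,
# yet the running total survives because it lives in the external resource.
second_instance = AccumulatorService(store)
result = second_instance.add(key, 7)
```

In this sketch the service instance can be discarded and recreated at any time, which is the property the text attributes to stateless services, while the client decides when the resource is created and destroyed.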
WS-Addressing codifies the way applications can address the service and
the resources. WS-Notification specifies the mechanisms that allow
services to disseminate information to one another.
The most recent version of GT, namely GT4, is an implementation of the
new family of standards.

Figure 5-2. Comparison between OGSI and WSRF vision of statefulness. According to
OGSI, services store state in internal variables. WSRF, on the contrary, assigns the
responsibility of maintaining state to external resources, whose lifecycle is governed by client
applications.

4. GT COMPONENTS

One of the features common to all GT versions is the flexibility of the
architecture, conceived as an assemblage of autonomous components and
utilities, which can be grouped into the following categories:

job management: job management services make it possible to keep
complete control over the remote execution of applications (jobs), so that
their progress is monitored and jobs are paused or stopped when needed;
information services: information services make it possible to monitor the
grid resources and their status. They implement two mechanisms, namely
registration and discovery. Registration allows entities to declare
themselves as part of the resource pool and to communicate their
characteristics to the grid. Discovery locates and accesses the resources
and their attributes;
data management: data management includes utilities for the access,
transfer and management of large sets of data and for integrating
geographically dispersed heterogeneous storage sets;
common runtime: a set of libraries and tools to build grid-aware
applications.

Each of the above components works in tight cooperation with the so-called
security GT components. Security serves to guarantee that resource
sharing is controlled, with resource providers defining clearly what is shared,
who is allowed to share and the conditions under which sharing occurs.
Furthermore, GT security provides mutual authentication, which requires both
endpoints of a communication to prove the authenticity of their identity to
one another, and, if required, confidentiality.
The following sections give a brief overview of the GT components.

5. JOB MANAGEMENT

In a grid, users share hardware and software resources. They pick
applications from the grid, choose the most appropriate platforms to run
them and launch them. Once the applications are launched, users track their
behavior and, in case of faults, stop them and, if necessary, launch them
again. Job management deals with the capability of running applications
(jobs) on remote machines and of monitoring and controlling them.
The basic component for job management is the so-called Grid Resource
Allocation and Management (GRAM), which is provided both in WS and
non-WS form. GRAM makes it possible to run an application remotely and
provides an API for submitting, monitoring and terminating a job. It allows
the user to specify the resources to be used and to perform third-party
transfer of input and output data.
Job management is a core component of most distributed systems, which
often implement a job management utility called a scheduler. Schedulers are
in charge of allocating the required resources to jobs submitted by users in
an optimized manner. When a scheduler is available, the user submits a job
to the tool, together with a description of its requirements (such as an
indication of the required amount of memory or CPU power). Based on the
available resources, on the properties of the submitted jobs and on some
prioritization policy, the scheduler decides when and where to launch them.
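
The decision process described above can be sketched in a few lines. The following is a toy illustration in plain Python, not the algorithm of any real scheduler and not part of GT: the Node and Job records, the "larger priority value first" rule and the best-fit placement rule are all assumptions made here for the example.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    free_cpus: int
    free_memory_mb: int


@dataclass
class Job:
    name: str
    cpus: int
    memory_mb: int
    priority: int  # assumption: a larger value means a more urgent job


def schedule(jobs, nodes):
    """Return ({job name: node name}, [names of jobs that must wait])."""
    placement = {}
    waiting = []
    # Prioritization policy: serve the most urgent jobs first.
    for job in sorted(jobs, key=lambda j: -j.priority):
        candidates = [
            n for n in nodes
            if n.free_cpus >= job.cpus and n.free_memory_mb >= job.memory_mb
        ]
        if not candidates:
            waiting.append(job.name)
            continue
        # Placement rule (best fit): choose the node left with fewest idle CPUs.
        target = min(candidates, key=lambda n: n.free_cpus - job.cpus)
        target.free_cpus -= job.cpus
        target.free_memory_mb -= job.memory_mb
        placement[job.name] = target.name
    return placement, waiting


nodes = [Node("node-a", 4, 8192), Node("node-b", 16, 65536)]
jobs = [
    Job("mesh", cpus=2, memory_mb=4096, priority=1),
    Job("solver", cpus=8, memory_mb=32768, priority=5),
    Job("huge", cpus=32, memory_mb=1024, priority=3),
]
placement, waiting = schedule(jobs, nodes)
```

Here the job requesting 32 CPUs cannot be placed on any node and is left waiting, while the other two jobs are assigned according to their priority and to the resources still free.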
GT does not include a scheduler, but interoperates with the best-known
scheduler tools, by providing 1) a common layer for accessing grid
resources, 2) a unifying interface for end-users and 3) a tool for integrating
their functionalities. When a scheduling activity is needed, it must be
delegated to external tools. This can be done easily by properly configuring
and instructing the GRAM component.
An example of GRAM cooperation with other packages is given by
MPICH-G2. As explained in the following subsection, MPICH-G2 is used in GC
to run parallel applications written in MPI.

5.1 GC for HPC

As seen in Chapter 1, the performance of a program can be improved if
autonomous operations are carried out in parallel. Parallelization is achieved
by running parallel code on computers with several processors
interconnected by a fast local interconnection network (the so-called
massively parallel processors, MPP), or by assembling homogeneous machines
interconnected by local area networks, to form the so-called clusters. Grids
extend parallelism concepts to the case where machines belong to diverse
organizations and are interconnected by geographical networks. In this
scenario, the available computing power is potentially unlimited. The main
drawbacks are i) the complexity of harmonizing machines that must be
assumed to be heterogeneous and autonomously managed, and ii) the need for
adequately fast connectivity. As explained before, the former difficulty is
addressed by the GM software, which smooths out heterogeneity by providing a
uniform way of accessing machines.
GT allows parallel applications to be migrated from MPP or clusters to grids
with no modifications. This is possible by installing the MPICH-G2 package
on top of GT. MPICH-G2 is an open-source grid-enabled implementation of
MPI, which, as discussed in Chapter 1, is the international standard
specification for the message passing paradigm.
MPICH-G2 works on top of the GT job management utilities to schedule the
instances of an MPI application on multiple machines belonging to a grid.
This allows MPI applications to run in grid environments with no changes,
provided that MPICH-G2 and GT have been properly installed and configured
on each node.
As mentioned before, the major risk when running parallel applications
in grid environments is the reduced availability of bandwidth: grids often
work in wide-area environments, where a dedicated connection does not
exist. In these cases, a careful design of the parallel application is
particularly critical, as communication of data must be minimized. Another
issue to consider is the contribution to latency due to the presence of the GM
software. In [Tarricone and Esposito, 2004], results concerning the
comparison between experiments with native MPI implementations, public
domain MPI implementations and grids with GT and MPICH-G2 are
provided. They demonstrate that latency is not substantially increased by
GT.
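
The weight of communication in wide-area environments can be made concrete with a back-of-the-envelope model. The sketch below is an assumption introduced here, not a result from the cited experiments: it estimates run time as compute time divided by the number of processes, plus a cost of latency and transfer time for every message. All the figures (latencies, bandwidths, message counts) are purely illustrative.

```python
def estimated_run_time(serial_seconds, processes, messages,
                       bytes_per_message, latency_s, bandwidth_bytes_per_s):
    """Simple model: perfectly divisible compute plus per-message cost."""
    compute = serial_seconds / processes
    communicate = messages * (latency_s + bytes_per_message / bandwidth_bytes_per_s)
    return compute + communicate


# Illustrative local cluster: 50 microseconds latency, 1 GB/s links.
cluster = estimated_run_time(3600.0, 16, 10_000, 1_000_000, 50e-6, 1e9)

# Illustrative wide-area grid: 50 ms latency, 10 MB/s shared links.
grid = estimated_run_time(3600.0, 16, 10_000, 1_000_000, 50e-3, 10e6)

# The same application redesigned to exchange ten times fewer messages.
grid_lean = estimated_run_time(3600.0, 16, 1_000, 1_000_000, 50e-3, 10e6)
```

Under these assumed figures the communication term is small on the cluster but dominates on the grid, and reducing the number of messages recovers most of the lost time, which is the design concern raised in the text.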

6. INFORMATION SERVICES

Information services include tools for both the monitoring of grid
resources and their discovery. Monitoring deals with the capability of
observing resources and services for such purposes as tracking usage or
fixing problems. Discovery makes it possible to find suitable resources for
performing a task.
The GT component responsible for information services is the so-called
Monitoring and Discovery System (MDS). The most relevant component
of MDS (in the latest versions of GT, i.e. GT3 and GT4) is the so-called
index service.
The index service maintains a repository of the available resources and
their properties and is able to respond to remote queries.
It implements a registration mechanism, similar to that explained in
Chapter 4, when dealing with Web Services and UDDI. According to this
mechanism, information sources register with the index service to express
their wish to publish their properties. Once registration has taken place, the
index service repository can be updated with information pertinent to the
registered resource. Unlike static repositories such as UDDI, the index
service implements a soft-state registration, which must be refreshed
periodically, otherwise it expires. This is fundamental in highly dynamic
environments such as computational grids, where resources are frequently
added to and/or removed from the pool.
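
The soft-state idea can be sketched as a registry whose entries carry an expiry time. This is only an illustration in plain Python and does not reproduce the MDS interface: the class SoftStateIndex, its methods and the 60-second lifetime are hypothetical, and the clock is passed in explicitly so that the behavior is easy to follow.

```python
class SoftStateIndex:
    """Toy index whose registrations expire unless they are refreshed."""

    def __init__(self, lifetime_s):
        self._lifetime_s = lifetime_s
        self._entries = {}  # resource name -> (properties, expiry time)

    def register(self, name, properties, now):
        # Registering again acts as a refresh: the expiry time moves forward.
        self._entries[name] = (properties, now + self._lifetime_s)

    def query(self, now):
        # Expired registrations are dropped before answering.
        self._entries = {
            name: entry for name, entry in self._entries.items()
            if entry[1] > now
        }
        return {name: entry[0] for name, entry in self._entries.items()}


index = SoftStateIndex(lifetime_s=60)
index.register("node-a", {"cpus": 4}, now=0)
index.register("node-b", {"cpus": 16}, now=0)

# node-a refreshes its registration; node-b silently leaves the pool.
index.register("node-a", {"cpus": 4}, now=50)

live = index.query(now=90)
```

A resource that disappears without notice is removed from the answers simply because it stops refreshing, so no explicit deregistration is needed.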
The index service organizes its data in an XML data model and responds
to queries formulated in the XPath language. As seen in Chapter 1, XML
documents have a tree structure. XPath makes it possible to navigate the XML
tree and to address its individual nodes directly. Its syntax resembles the
path notation used in UNIX for locating files and directories in a file system.
Consider for example the simple XML document of Fig. 5-3, and its tree
structure represented in Fig. 5-4.
The following expression locates all the chapters of the book:

/book/*

The following expression, instead, locates the chapter titled "Grid
Computing":

/book/chapter[@title='Grid Computing']

XPath also provides a number of built-in functions for building more
complex expressions. For example:

count(//chapter)

returns the number of elements named chapter.
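
These queries can be tried out on the document of Fig. 5-3 with the ElementTree module of the Python standard library. This is only a sketch: ElementTree supports a limited subset of XPath, so the paths below are written relative to the root element (book) and the count() function is emulated with len(). The index service itself is not involved.

```python
import xml.etree.ElementTree as ET

# The document of Fig. 5-3, with attribute values quoted.
DOCUMENT = """
<book title="new trends">
  <chapter title="parallel systems">
    <text>Parallel systems are</text>
    <image file="image11.jpg"></image>
  </chapter>
  <chapter title="semantic Web">
    <text>The origins of Semantic Web</text>
    <image file="image21.jpg"></image>
  </chapter>
  <chapter title="Grid Computing">
    <text>The name Grid Computing</text>
  </chapter>
</book>
"""

book = ET.fromstring(DOCUMENT.strip())

# Equivalent of /book/* : all the children of the root element.
all_children = book.findall("*")

# Equivalent of selecting the chapter whose title is 'Grid Computing'.
grid_chapters = book.findall("chapter[@title='Grid Computing']")

# Equivalent of count(//chapter): number of chapter elements at any depth.
chapter_count = len(book.findall(".//chapter"))

titles = [child.get("title") for child in all_children]
```

A full XPath engine would accept the absolute expressions exactly as written in the text; the relative forms are used here only because of the subset supported by ElementTree.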

As said before, MDS is used both for monitoring purposes and for
discovery tasks. As seen in Chapter 3, the need for discovery tools is
also keenly felt by Web communities, which are working towards the so-called
Semantic Web. The Semantic Web aims at representing Web resources in a
way that is understandable by both human and software agents. This is done via
the definition of so-called ontologies. Some languages (e.g. OWL) have
been defined by standards bodies to support such a vision of resource
discovery in the Web. In Chapter 4, we have seen how these languages can
be used for the discovery of services in service-oriented frameworks. An
upper ontology (OWL-S) has been implemented for describing a service, its
functioning and its relationships with cooperating services.
The so-called Semantic Grid [De Roure et al., 2003] is a very recent
branch of GC. As the name suggests, it aims to unify Semantic Web
technologies with GC. The final goal is to build tools for the
automatic discovery of resources and their dynamic orchestration for
complex problem-solving tasks.

<book title="new trends">
  <chapter title="parallel systems">
    <text>
      Parallel systems are
    </text>
    <image file="image11.jpg">
    </image>
  </chapter>
  <chapter title="semantic Web">
    <text>
      The origins of Semantic Web
    </text>
    <image file="image21.jpg">
    </image>
  </chapter>
  <chapter title="Grid Computing">
    <text>
      The name Grid Computing
    </text>
  </chapter>
</book>

Figure 5-3. An example of an XML document. It represents a book as composed of a number of
chapters, each having its own title. Each chapter, in turn, is made of text and images.

book [title]
    chapter [title]
        text
        image [file]
    chapter [title]
        text
        image [file]
    chapter [title]
        text

Figure 5-4. XML documents have a hierarchical structure. The figure shows the tree structure
of an XML document representing books. The book (root node) has a number of children
named chapter. Both book and chapters have an attribute named title. Each chapter is
made of text and images. Images have an attribute, named file, which contains the name of
the file containing the image.

7. DATA MANAGEMENT

Data management services deal with distributed data storage, transfer and
management.
The OGSA-DAI project, from a consortium of big IT companies and
centers such as IBM and Oracle, is working on the so called Data Access
and Integration (DAI) component. DAI integrates distributed heterogeneous
collections of data, such as DBMS, XML documents and files whose
structure is adequately described.
As regards data transfer and management, GT includes a number of built-
in utilities addressing these tasks. An example of general service offered by
GT is the extended version of the File Transfer Protocol, named GridFTP.
GridFTP adds a series of features to the well known FTP protocol,
customizing it to grid environments. The main features are:

partial file access: allows selected portions of files to be transferred. This is
useful when dealing with huge files, as it helps to save bandwidth;
secure transfer: includes authentication, privacy and integrity checking;
parallel transfers: moving data over several TCP streams in parallel facilitates
high-speed transfers and makes better use of the available bandwidth;
third-party transfers: GridFTP includes an authenticated protocol to
permit third-party control of transfers between two remote dataset storage
systems;
reliable file transfer: GridFTP provides fault recovery methods to cope
with transient network failures, server outages, etc. and to restart failed
transfers.
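
The ideas of partial access and parallel transfer listed above can be illustrated on local files. The sketch below is not GridFTP and uses no GT API: it simply splits a file into byte ranges and reads them concurrently with threads from the Python standard library. The helper names split_ranges, read_range and parallel_copy are hypothetical.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def split_ranges(size, streams):
    """Split [0, size) into at most `streams` contiguous (offset, length) pairs."""
    if size == 0:
        return []
    chunk = -(-size // streams)  # ceiling division
    return [(start, min(chunk, size - start)) for start in range(0, size, chunk)]


def read_range(path, offset, length):
    """Partial access: read only `length` bytes starting at `offset`."""
    with open(path, "rb") as handle:
        handle.seek(offset)
        return handle.read(length)


def parallel_copy(source, target, streams=4):
    """Read the byte ranges concurrently, then write them back in order."""
    ranges = split_ranges(os.path.getsize(source), streams)
    with ThreadPoolExecutor(max_workers=streams) as pool:
        parts = list(pool.map(lambda r: read_range(source, *r), ranges))
    with open(target, "wb") as out:
        for part in parts:
            out.write(part)


payload = bytes(range(256)) * 40
with tempfile.TemporaryDirectory() as folder:
    source = os.path.join(folder, "source.bin")
    target = os.path.join(folder, "copy.bin")
    with open(source, "wb") as handle:
        handle.write(payload)
    parallel_copy(source, target, streams=4)
    with open(target, "rb") as handle:
        copied = handle.read()
    middle = read_range(source, 256, 4)  # fetch four bytes without reading the rest
```

In a real wide-area transfer each range would travel over its own TCP stream; here the threads only mimic that structure on the local disk.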

Other interesting functionalities offered by GT are those related to
dataset replicas. When optimization of data access times is the most critical
issue, it can be useful to create a number of dataset replicas, i.e. to generate
identical copies of data and store them at different sites. This can reduce data
access latency. The creation of data replicas can considerably improve the
performance of data access, but adds a number of complications that do not
arise when dealing with a single instance of files. For example, the replicas
must be associated with each other and their location must be tracked (replica
management), and users should be enabled to access replicas transparently,
optionally specifying a selection criterion (replica selection).
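
Replica management and replica selection can be sketched as a catalog that maps a logical dataset name to its physical copies, plus a selection function driven by a user-supplied criterion. The code below is a plain Python illustration, not the GT replica interface: the class ReplicaCatalog, the site URLs and the latency figures are all hypothetical.

```python
class ReplicaCatalog:
    """Toy catalog: logical dataset name -> locations of identical copies."""

    def __init__(self):
        self._locations = {}

    def register(self, logical_name, location):
        # Replica management: associate a new copy with its logical name.
        self._locations.setdefault(logical_name, []).append(location)

    def locations(self, logical_name):
        return list(self._locations.get(logical_name, []))


def select_replica(catalog, logical_name, cost):
    """Replica selection: return the copy that minimizes the given criterion."""
    candidates = catalog.locations(logical_name)
    if not candidates:
        raise LookupError(f"no replica registered for {logical_name!r}")
    return min(candidates, key=cost)


catalog = ReplicaCatalog()
catalog.register("field-data", "gsiftp://site-a.example/data/field.dat")
catalog.register("field-data", "gsiftp://site-b.example/data/field.dat")
catalog.register("field-data", "gsiftp://site-c.example/data/field.dat")

# Hypothetical measured latencies (in ms) from the user's site to each copy.
latency_ms = {
    "gsiftp://site-a.example/data/field.dat": 120,
    "gsiftp://site-b.example/data/field.dat": 35,
    "gsiftp://site-c.example/data/field.dat": 80,
}

best = select_replica(catalog, "field-data", cost=lambda url: latency_ms[url])
```

The user refers only to the logical name, and the selection criterion (latency here, but it could be load or cost) is supplied separately, which keeps access to the replicas transparent.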

References
Baker, M., et al., 2000, The Grid: International Efforts in Global Computing, International
Conference on Advances in Infrastructure for Electronic Business, Science, and Education
on the Internet, Italy.
De Roure, D., et al., 2003, The Semantic Grid: a future e-science infrastructure, in Grid
Computing: Making the Global Infrastructure a Reality, F. Berman, A. Hey, G. Fox
(Eds.), J. Wiley, Chichester.
Foster, I., et al., 2002, The Physiology of the Grid: An Open Grid Services Architecture for
Distributed Systems Integration, at Open Grid Service Infrastructure WG, Global Grid
Forum, (June 22, 2002); http://www.globus.org/research/papers/ogsa.pdf.
Foster, I., et al., 2003, The Physiology of the Grid, in Grid Computing: Making the Global
Infrastructure a Reality, F. Berman, A. Hey, G. Fox (Eds.), J. Wiley, Chichester,
pp. 217-249.
GT, 2005, http://www.globus.org.
Legion, 2005, http://legion.virginia.edu.
Tarricone, L., and Esposito, A., 2004, Grid Computing for Electromagnetics, Artech House,
Boston, MA, pp. 1-266.

Chapter 6
COMPLEX COMPUTATIONAL
ELECTROMAGNETICS USING
HYBRIDISATION TECHNIQUES
R. A. Abd-Alhameed and P. S. Excell
University of Bradford, UK

Abstract: A number of different computational electromagnetics methods are in
widespread use at the present time. No single method has come
to dominate because different methods have different advantages and
disadvantages. An outstanding case is the distinction between quasi-optical
methods and methods based on solutions of Maxwell's equations. The quasi-
optical methods rely on the classical approximations of geometrical optics and
diffraction theory, in which structures and their details are presumed to be either
electrically very large compared with a wavelength or else electrically very
small. The intervening region, where structures and their details have dimensions
approximately comparable with the wavelength, must be handled by a more exact
application of Maxwell's equations. The treatments that satisfy this requirement
can be further subdivided into integral equation methods and differential equation
methods, each with distinctive advantages and disadvantages. Integral equation
methods can give very good verisimilitude in representation of metallic
structures, but they run into severe problems of computer capacity requirements
when handling penetrable dielectric volumes or modelling relatively large
structures. Differential-equation based methods have no difficulty with
penetrable dielectric volumes, but the size of the structure they may model is also
limited and the representation of curved and other arbitrarily-oriented surfaces or
wires can suffer from significant problems of discretisation error.
The logical conclusion is that a range of methods should be available to
handle real-world problems, with different parts of the problem being handled by
the most appropriate method. This is known as a hybrid method. The boundary
between any two of the formulations has to be treated as a surface populated with
virtual sources whose excitations have to be determined, for example using the
Equivalence Principle. Implementation in a reliable computer algorithm and
segmentation of the task volume have been addressed by a number of research
groups and the method has now become accepted as a sound and reliable
approach.

Key words: Hybrid methods; Method of Moments (MoM); Finite-difference time-domain
method (FDTD); Integral-equation time-domain method (IETD); Equivalence
Principle.

L. Tarricone and A. Esposito (eds.), Advances in Information Technologies for Electromagnetics, 69-145.
© 2006 Springer. Printed in the Netherlands.

1. INTRODUCTION

As with most aspects of computational engineering, the computation of
electromagnetic field distributions has to proceed either by the calculation of
analytical formulae, or else by discretisation (digitisation) of a multi-
dimensional problem space and then the calculation of the parameters of
each digitised element as an approximation to the real continuous system.
The straightforward calculation of analytical formulae is only valid for a
very limited number of cases where the physical structure of the problem
corresponds to a standard analytical shape that can be described by closed-
form algebraic expressions. This situation rarely appears in the real world
and hence the discretised representation becomes the norm.
Discretised representations of electromagnetic problems fall into two
broad categories, which have their roots deeply in the history and philosophy
of physics. They may be conveniently named integral equation and
differential equation methods and their principal characteristics are as
follows.

1.1 Integral Equation Methods

These function by integrating the interaction of all of the (discretised)
portions of the physical structure of an electromagnetic system with each of
the same discretised portions, notionally considered in sequence. In its most
common form, this means that the currents in the discretised segments of a
conducting structure each contribute a component of magnetic field at a
specific segment that is being used to observe their effect. This
corresponds to the Newtonian concept of the representation
of force fields as action at a distance: the method does not
take any interest in the behaviour of fields in the intervening space, provided
that this space can be presumed to be homogeneous. The analogy with the
original Newtonian problem of the interaction of gravitational fields of
planets in a solar system will be obvious.
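
The "every portion with every other portion" structure can be made concrete with a small numerical sketch. The code below is not a Method of Moments solver: as a simplifying assumption it uses the scalar free-space Green's function exp(-jkR)/(4πR) as a stand-in for the interaction between two point-like segments, and it leaves the self terms at zero because they require separate treatment. The function name and the sample geometry are hypothetical.

```python
import cmath
import math


def interaction_matrix(points, wavelength):
    """Pairwise free-space interactions between point-like segments."""
    k = 2.0 * math.pi / wavelength  # wavenumber of the homogeneous medium
    n = len(points)
    matrix = [[0j] * n for _ in range(n)]
    for m, observer in enumerate(points):
        for s, source in enumerate(points):
            if m == s:
                continue  # self term omitted in this sketch
            r = math.dist(observer, source)
            # Only the distance between the two segments matters: nothing
            # is computed in the space lying between them.
            matrix[m][s] = cmath.exp(-1j * k * r) / (4.0 * math.pi * r)
    return matrix


# Four segments along a straight wire, spaced a tenth of a wavelength apart.
points = [(0.1 * i, 0.0, 0.0) for i in range(4)]
z = interaction_matrix(points, wavelength=1.0)
```

The matrix has one entry for every pair of segments, so its size grows with the square of the number of segments, while the empty space around the structure never needs to be discretised.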

1.2 Differential Equation Methods

These methods are based on a fundamentally different philosophy, of
which Michael Faraday is normally thought to be a key originator. This is
the field philosophy, in which fields of force are presumed to exist as real
entities, independently of the forces acting on charged and current-carrying
bodies (for the electromagnetic case: other sources obviously apply for other
types of force fields, such as gravity). In computational implementations,
this means that the space intervening between physical structures where
force may be manifest has to be divided up into discretisation elements. The
fields are then made to propagate through the space by a computational
interaction of the parameters of the discretisation elements, the interaction
being determined by discretised versions of the fundamental differential
equations describing the behaviour of the fields, such as Maxwell's
equations (i.e. in their most familiar differential form).
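
As an illustration of fields propagating through a discretised space, the following sketch advances a one-dimensional finite-difference time-domain grid in normalised units. It is a minimal example under stated assumptions, not the formulation used later in the chapter: the time step is chosen so that a wave advances exactly one cell per step, the magnetic field is scaled by the free-space impedance, the excitation is a Gaussian pulse added to one cell, and the grid ends are left as simple reflecting boundaries.

```python
import math


def fdtd_1d(cells=200, steps=150, source_cell=50):
    """Leapfrog update of E and H on a 1-D grid (normalised units)."""
    ez = [0.0] * cells  # electric field samples
    hy = [0.0] * cells  # magnetic field samples, offset by half a cell
    for step in range(steps):
        # Each H sample is updated from the difference of neighbouring E samples.
        for i in range(cells - 1):
            hy[i] += ez[i + 1] - ez[i]
        # Each E sample is updated from the difference of neighbouring H samples.
        for i in range(1, cells):
            ez[i] += hy[i] - hy[i - 1]
        # Gaussian pulse added to the existing field at the source cell.
        ez[source_cell] += math.exp(-((step - 30) / 10.0) ** 2)
    return ez


field = fdtd_1d()
peak_cell = max(range(len(field)), key=lambda i: field[i])
peak_value = field[peak_cell]
```

Each cell interacts only with its immediate neighbours, yet after enough steps the pulse has travelled far from the source cell: the propagation emerges from purely local updates, in contrast with the all-pairs interaction of the integral approach.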

1.3 The Advantages and Disadvantages of the Methods

From the point of view of the philosophy of physics, the gulf between
these methods is very large, but in reality there has been a long history of
shifting preference between the two approaches. Thus, the ancient
Greeks, Newton and the early electrical investigators all favoured the view
that a mystical power was causing action at a distance, but Faraday,
massively reinforced by Maxwell, took a pragmatic view that the complexity
of understanding and calculating the effects could be vastly simplified by
treating field distributions as having independent reality, such that the fields
could then be used to compute forces on physical entities lying within them
as a 2-step process (which was much simpler than a single-step procedure).
When Maxwell showed that radiated fields could be launched from an
antenna and could then travel through space, continuing even after the
antenna might have been disconnected or destroyed, the viewpoint that the
fields had independent reality seemed to be completely proven. However,
Einstein (who always acknowledged his deep indebtedness to Maxwell) was
able to provide an alternative interpretation by removing the concept of the
absoluteness of simultaneity and hence rescuing the action at a distance
view, allowing for time delay in the propagation of the action. This was
further supported by Feynman, whose Feynman diagrams gave a new
interpretation of action at a distance as a particle (or virtual particle)
exchange mechanism.
Going back to the practicalities of electromagnetic field computation, it is
convenient to link it to the history of the subject. The earliest work on CEM
used differential equation methods to compute static or power-frequency
(quasi-static) distributions in a very limited space. These techniques were then
applied to high-frequency fields in enclosed spaces such as waveguides
and cavity resonators, but any attempt to deal with radiating
structures ran into two substantial difficulties. Firstly, there was the problem
of terminating the discretisation grid at a relatively small distance from the
object under study, while making this termination appear to be an interface
to continuous empty space. Secondly, there was the related problem of the
data quantities involved, since even with the termination of the grid, the
models inherently had to be three-dimensional, thus having a large number
of discretisation elements. In contrast, many enclosed problems could be
reduced to two-dimensional studies, with reasonable approximations, and
these were much more within the capability of the computers of the day.
As a consequence of these considerations, the integral equation approach
became the method of choice, since it has the twin advantages that its
describing equations are fundamentally developed for the case of a system
located in free space and, secondly, it completely avoids the need to
discretise the empty space between active (solid matter) parts of the system.
Considering typical common systems consisting effectively only of
conducting structures (i.e. avoiding penetration into the solid material), this
means that discretisation is only needed on the surface of the structure. Thus
the problem is effectively two-dimensional at most, and often only one-
dimensional, if the structure is effectively composed only of wires and
electrically-thin rods.
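
The difference in problem size can be estimated with simple arithmetic. The sketch below counts discretisation elements for a cubic region, for the surface of that cube and for a single wire of the same length. The resolution of ten elements per wavelength is a common rule of thumb assumed here for illustration, not a figure taken from the text.

```python
def unknown_counts(size_in_wavelengths, cells_per_wavelength=10):
    """Number of elements for volume, surface and wire discretisations."""
    n = round(size_in_wavelengths * cells_per_wavelength)  # elements per edge
    return {
        "volume_cells": n ** 3,         # three-dimensional grid of the whole space
        "surface_patches": 6 * n ** 2,  # six faces of a conducting cube
        "wire_segments": n,             # a thin wire as long as one edge
    }


counts = unknown_counts(5)
```

For a structure five wavelengths across, this gives 125000 volume cells, 15000 surface patches and 50 wire segments under the assumed resolution, which shows why surface and wire formulations were attractive on the computers of the day.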
For electrically very large structures, even the integral equation method,
in its full form (usually known to practitioners as the Method of Moments)
became non-viable, requiring an excessive computational task size, and
hence ray methods were invoked, starting from classical geometrical optics
and then incorporating additional rules for ray behaviour derived from
diffraction theory, commonly known as GTD (Geometrical Theory of
Diffraction). In reality, these methods are also integral methods, but with a
much simplified interaction formulation. Another simplified integral
approach is the physical optics method, which is similar to the Method of
Moments but with simplified interaction equations.
(The expressions "Method of Moments" and "Geometrical Theory of
Diffraction" are somewhat contentious, purists arguing that they are not
strictly appropriate; however, they have become established as convenient
abbreviated names for the methods concerned and their meaning is generally
understood.)

1.4 Hybrid Methods

Credit for the first widely used, general purpose hybrid electromagnetic
modelling software package must go to Edgar "Buddy" Coffey and Diane
Kadlec, who developed the GEMACS (General Electromagnetic Model for
the Analysis of Complex Systems) program for the US Air Force in the
1980s [Coffey and Kadlec, 1990]. The first hybrid version of this linked the
Method of Moments with the Geometrical Theory of Diffraction and later
versions added a finite difference formulation. However, the formulations
and the structure of the program were fundamentally geared to military
aviation applications and hence other workers tended to develop new
programs when different classes of problems had to be addressed.
A principal motivation for this was the problem of computation of the
interaction of mobile phones with the human head, which became a major
issue in the 1990s. The large mass of penetrable dielectric matter
constituting the human head presented a problem substantially different from
the small air-filled penetration regions in an aircraft, and new approaches
were adopted to deal with this. In early work, the dominance of the bulk of
biological tissue determined the computational electromagnetics method to
be used and almost all researchers adopted the finite difference time domain
(FDTD) method, because it minimised the ratio of the computational task
size to the number of discretisation elements, thus allowing a relatively high-
resolution three-dimensional representation of the head to be processed in a
viable timescale.
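The low cost per cell comes from the explicit leapfrog update, which touches each field sample once per time step and needs no matrix solution. A minimal one-dimensional sketch in normalised units is given below. The grid size, source position, Gaussian pulse and unit Courant number are illustrative assumptions, and the code is not taken from any of the programs cited in this chapter.

```python
import numpy as np


def fdtd_1d(n_cells=200, n_steps=100, source_cell=50):
    """One-dimensional Yee leapfrog in normalised units (Courant number 1)."""
    ez = np.zeros(n_cells)       # electric field at integer grid points
    hy = np.zeros(n_cells - 1)   # magnetic field at half-integer grid points
    for step in range(n_steps):
        hy += ez[1:] - ez[:-1]           # update H from the curl of E
        ez[1:-1] += hy[1:] - hy[:-1]     # update E from the curl of H
        # Soft (additive) Gaussian source; end cells stay at zero (PEC walls).
        ez[source_cell] += np.exp(-((step - 30) / 10.0) ** 2)
    return ez


ez = fdtd_1d()
peak = int(np.argmax(ez))
print(f"right-going pulse peak near cell {peak}, amplitude {ez[peak]:.3f}")
```

Each time step costs a fixed number of additions per cell, so the total work grows only linearly with the number of cells, which is the property referred to above.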
While the representation of the head became increasingly realistic, it
became more and more apparent that the representation of the handset and
antenna was deficient in comparison. This was because the FDTD method
represents all structures as a regular matrix of rectangular parallelepipeds
and this gives a very coarse representation of fine wire structures. Not only
are such structures represented as a staircased approximation, but the
current path on the surface of the staircase is considerably longer than on
the real metallic structure (unless it happens to correspond with one of the
principal axes of the FDTD geometry) and hence the resonance of the
structure will be shifted to a lower frequency than its true value: this in
turn will modify the impedance characteristics of the structure. This became
a particular issue with the growth in popularity of compact helical antennas
on handsets: the FDTD representation of a fine wire helix was extremely
unsatisfactory even when the axis was parallel to a principal axis of the
FDTD geometry.
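The lengthening of the current path can be estimated with a few lines of arithmetic. The sketch below assumes a straight wire lying in a grid plane at a given angle to one axis, with a cell size much smaller than the wire, and it further assumes that the resonant frequency scales inversely with the path length. That scaling is a crude simplification introduced here for illustration, not a result from this chapter.

```python
import math


def staircase_ratio(angle_deg):
    """Staircased path length divided by true length for a straight wire
    at angle_deg to a grid axis, in the limit of small cells."""
    a = math.radians(angle_deg)
    return abs(math.cos(a)) + abs(math.sin(a))


for angle in (0, 15, 30, 45):
    ratio = staircase_ratio(angle)
    # Assumption: resonant frequency is inversely proportional to path length.
    print(f"{angle:2d} deg: path x{ratio:.3f}, resonance approx x{1 / ratio:.3f}")
```

The ratio is unity only when the wire follows a principal axis and is largest at 45 degrees, consistent with the remark above that the error vanishes only for axis-aligned structures.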
To overcome this problem, some groups decided to create hybrid models
in which the head was represented by FDTD, but the handset was
represented by the Method of Moments, which permitted essentially
arbitrary shapes of conducting structures, although with the penalty that
dielectric could not normally be represented. Some groups chose to
hybridise FDTD with integral-equation time-domain (IETD) methods, a
logical choice given the commonality of time domain in both parts of the
problem.
However, IETD methods have not yet achieved widespread operational
status and experience with them is still being built. The present authors
therefore took the view that it would be more pragmatic to use well-known
and widely used methods in the two halves of the problem, and this meant
combining FDTD with the frequency-domain Method of Moments. The
nominal incompatibility between time domain and frequency domain was
then overcome by application of a simple Fourier transform at the boundary
between the two domains: this is the method that is explained in detail
below.
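The role of the Fourier transform at the interface can be illustrated for a single frequency. In the sketch below, the frequency, time step and field phasor are hypothetical values chosen for illustration. A frequency-domain phasor is first converted into the real time samples that a time-domain grid would receive, and a single-frequency discrete Fourier transform then recovers the phasor from those samples.

```python
import numpy as np

freq = 900e6                  # assumed operating frequency (Hz)
samples_per_period = 40       # assumed time resolution
dt = 1.0 / (freq * samples_per_period)
steps = samples_per_period * 5            # five whole periods
t = np.arange(steps) * dt

# Frequency domain -> time domain: a phasor drives the time-domain side.
phasor_in = 2.0 * np.exp(1j * np.pi / 3)  # hypothetical field phasor
e_time = np.real(phasor_in * np.exp(2j * np.pi * freq * t))

# Time domain -> frequency domain: single-frequency DFT of the samples.
phasor_out = (2.0 / steps) * np.sum(e_time * np.exp(-2j * np.pi * freq * t))

print(f"recovered magnitude {abs(phasor_out):.6f}, "
      f"phase {np.angle(phasor_out):.6f} rad")
```

The recovery is exact here because the record spans a whole number of periods of a steady-state signal; a transient record would require the time-domain run to reach steady state first.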
Subsequently, means have been found to make the grid of the FDTD
method conformal to arbitrarily-shaped metallic structures, but the cell size
has to be quite small if fine detail is to be represented and this can make
computer run times relatively long. The hybrid method thus remains very
competitive if relatively short run times are required. The hybrid method has
itself evolved by use of the finite element method (FEM) as an alternative to
Method of Moments: this permits the routine handling of dielectrics.
It is evident that it will continue to be the case for some time that no
single method will be superior for all computational electromagnetics
problems, and hence the need to deploy two or more methods in solution of
particular problems will continue to be a useful tool: the hybridisation
techniques described in the following sections provide the linkage to permit
users to combine multiple methods in this way, facilitating the solution of
as-yet unaddressed problems.

1.5 Literature Review

The hybridisation approach takes advantage of features offered by
several methodologies to analyze complex electromagnetic problems that
cannot be resolved conveniently and/or accurately by using a single
technique. Examples of such problems include those comprising arbitrarily-
oriented, thin-wire antennas and inhomogeneous dielectric scatterers, as
encountered in areas such as microwave breast tumour detection [Fear et al.,
2002; Pantoja et al., 2002], ground-penetrating radar [Lopez et al., 2001;
Demarest et al., 1996], hyperthermia treatments [D'Ambrosio and Migliore,
1994] and electromagnetic compatibility [Bridges, 1995; Lail and Castillo,
2000].
Numerical methods such as the Method of Moments (MoM) [Harrington,
1968], Finite Difference Time Domain (FDTD) [Yee, 1966] and Finite
Element Method (FEM) [Jin, 2002] have been hybridised with different
numerical techniques in the literature, these hybrid approaches taking
advantage of the strength of each numerical technique in order to solve
problems that neither technique alone could model efficiently.
In the past, ray-based hybrid methods combining the Geometrical Theory
of Diffraction (GTD) and Uniform Geometrical Theory of Diffraction
(UTD) with MoM have been extensively employed [Kim et al., 1999;
Silvestro, 1992; Coffey and Kadlec, 1990]. This kind of hybrid method is
especially suitable when a small object such as a dipole antenna is located in
front of an electrically large object like a reflecting screen. Then the small
object is treated with the MoM, and the influence of the large body is
considered by means of the GTD/UTD. More recently, in [Jakobus and
Landstorfer, 1995], a current-based hybrid method combining the MoM with
the physical optics (PO) approximation, suitable for three-dimensional
perfectly conducting bodies, was proposed. This permitted a continuous
current flow, modelled using the two techniques on the whole surface of the
scattering body. Later, a combination of MoM with the Finite Element
method (FE) was presented, e.g. in [Ali et al., 1997], to solve EM
radiation problems from structures consisting of an inhomogeneous
dielectric body of arbitrary shape (e.g. printed circuit boards) attached to one
or more perfectly conducting bodies, e.g. wires, strips and cables. The FE
method is an efficient differential equation technique that is generally used
in the frequency domain. Moreover, a hybrid of three numerical methods,
FE, GTD and MoM was presented in [Reddy et al., 1996] to analyse the
radiation characteristics of cavity-backed aperture antennas in a finite
ground plane. However, as discussed in [Kuster et al., 1997], the difficulty
of realising open domain boundaries was a limitation due to the need for
efficient radiation boundary conditions. FE mesh generation in three
dimensions also remains a formidable task. While there are reasonably good
techniques for the discretisation of artificial structures, the difficulty of
generating finite element models for the typically very inhomogeneous
problems in (biological) dosimetry currently limits their use. Another
significant three-method hybrid was GEMACS 5.2 [Coffey, 1993], which
combined MoM, GTD and the Frequency-Domain Finite Difference method
(FDFD). This was intended for aircraft simulation and the FDFD domain
used a non-standard formulation, specific to penetrable cavities in an
aircraft.
It is a well-known fact that integral equation methods such as the Method
of Moments (MoM) [Harrington, 1968], and boundary element method
(BEM) [Burke and Poggio, 1981] treat unbounded problems very efficiently,
but they become computationally intensive when complex inhomogeneities
and nonlinear dielectrics are present. In contrast, inhomogeneities and
nonlinear dielectrics are easily handled by finite methods. The finite element
method [Jin, 2002] requires less computer time and storage than MoM (for
comparable dielectric problems) because it results in a sparse and banded
matrix. The finite difference time domain (FDTD) [Yee, 1966] method uses
an iterative approach and therefore requires less computational time,
provided its other limitations are acceptable. However, finite methods are
most suitable for bounded problems and special steps need to be taken if an
unbounded region is present. Obviously, any hybrid method that retains the
most efficient characteristics of both finite methods and integral methods is
computationally advantageous.
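The storage argument can be illustrated with simple arithmetic. The sketch below compares a full complex matrix, as produced by an integral equation method, with a banded matrix, as produced by a finite method. The number of unknowns and the half-bandwidth are hypothetical, and the same number of unknowns is used for both only to keep the comparison simple; in practice a volume mesh usually has more unknowns than a surface mesh of the same object.

```python
def dense_bytes(n_unknowns, bytes_per_entry=16):
    """Storage for a full N x N complex double-precision matrix."""
    return n_unknowns ** 2 * bytes_per_entry


def banded_bytes(n_unknowns, half_bandwidth, bytes_per_entry=16):
    """Storage for a banded matrix keeping 2 * half_bandwidth + 1 diagonals."""
    return n_unknowns * (2 * half_bandwidth + 1) * bytes_per_entry


n = 20_000   # assumed number of unknowns
hb = 50      # assumed half-bandwidth of the sparse matrix
print(f"dense : {dense_bytes(n) / 1e9:.2f} GB")
print(f"banded: {banded_bytes(n, hb) / 1e9:.3f} GB")
```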
Reviewing the literature, the hybrid MoM/FDTD method may be said to
have been first investigated in 1982, when Taflove and Umashankar
[Taflove and Umashankar, 1982a] used a hybrid FDTD/MoM approach to
investigate EM coupling problems and aperture penetration into complex
geometries and loaded cavities, for example a missile guidance section. This
hybrid method used MoM to solve the exterior problem and the FDTD
method to model complex interior problems. The two regions were linked
via an equivalent short-circuit electric current excitation in the aperture
regions of the structure using MoM for a given external illumination.
However, it did not employ computations of equivalent magnetic current on a
virtual equivalent surface and it was only suitable for field penetration
problems into a closed cavity region. In addition, hybrids of the finite
element method with integral equation methods, such as the extended
boundary condition integral method (FEBI) [Morgan et al., 1984; Morgan
and Welch, 1986; Boyse and Seidl, 1991; Sheng et al., 1998], the boundary
element method (FE/BE) [Lynch et al., 1985, 1986; Paulsen et al., 1988;
Stupfel et al., 1991; Soudais, 1995; Zielinski and Zienkiewicz, 1985; Salon
and Angelo, 1998; Nath et al., 1993], the Method of Moments (FE/MoM)
[Yuan, 1990; Yuan et al., 1990] and integral equation domain decomposition
method (FE/IEDD) [Stupfel and Despres, 1999; Bruno, 2001] have been
developed by implementing the same principles.
In 1987, Taflove, Umashankar et al. used an equivalent surface fully
enclosing a wire bundle, together with the concept of an equivalent radius,
to replace the bundle with a single wire in an FDTD model [Umashankar et al., 1987].
Later, the same concepts of [Taflove and Umashankar, 1982a] and
[Umashankar et al., 1987] were deployed in the computer software
GEMACS [Coffey, 1993] which was developed using MoM/UTD/FDFD
hybrids that let users model problems with more than one region, e.g. an
inside and an outside. FDFD (Finite Difference Frequency Domain) was
used to model the interior region(s), while MoM or MoM/UTD were used to
model the exterior region. The physics of each region was reduced to a
matrix formulation, and boundary conditions across regions were enforced
by the way the matrices were connected together.
In 1993 Aoyagi et al. [Aoyagi et al., 1993] used the Yee algorithm in
conjunction with the scalar wave equation to reduce the computations
needed to model a Vivaldi antenna, while Cangellaris et al. [Cangellaris
et al., 1993] used a hybrid spectral-FDTD method to analyze propagation in
anisotropic, inhomogeneous periodic structures. Lee and Wang [Lee and
Chia, 1993; Wang et al., 2002] introduced a hybrid ray-FDTD method and
used it to investigate scattering from a cavity with a complex termination
and wave penetration through inhomogeneous walls. In 1994, Mrozowski
[Mrozowski, 1994] introduced a hybrid FDTD-PEE (partial eigenfunction
expansion) method to speed up the FDTD method when solving shielded
structure problems. In addition, finite element and finite volume methods
have recently been combined with FDTD, [Monorchio and Mittra, 1998;
Yang et al., 2000], to accurately model curved geometries and those with
fine features.
More recently, in 1998, Bretones et al. presented in [Bretones et al.,
1998] a time-domain version of MoM in a hybrid approach for studying the
transient excitation of a thin wire antenna located in the proximity of an
inhomogeneous dielectric scatterer and above a perfectly electrically
conducting (PEC) ground plane. Also, Cerri et al. [Cerri et al., 1998] used a
time-domain version of MoM for developing a hybrid technique. The
method has the advantage of generating information over a wide frequency
band. It does not require an iterative procedure to couple with FDTD, but it
requires very large run-times when treating a junction with more than two
wires [Tinniswood, 1996], unlike the frequency-domain version in which the
complex metallic structures may be modelled accurately in less run-time and
with more flexibility for modelling different complex geometries. Huang
et al. [Huang et al., 1999] employed a hybrid technique for modelling the
interaction of ground-penetrating radar (GPR) with complex ground, using a
combination of frequency domain MoM, Fourier transformation and
iterations. This method has the same principles as the method proposed in
the present work, while it focuses on GPR applications. Recently, another
hybrid MoM/FDTD method [Chen et al., 1998] was applied for numerical
simulations of SAR and the magnetic field of shielded RF coils loaded with
a human head for a biomedical application. In [Forgy et al., 1998] the source
antenna is modelled as a stack of Hertzian dipoles: however, the authors
neglect the effect of the back-scattered field on the source [Chen et al., 1998;
Forgy et al., 1998]. The same approximation is used in [Lysiak et al., 1996]
which is oriented towards two-dimensional UHF/VHF propagation
problems: the FDTD is excited just by a vertical slice near the problem area.
Research is still going on and more groups have become interested in this
novel hybrid method. Rubio Bretones et al. have recently published a method
to combine the NEC [NEC, 2005] with FDTD in [Bretones et al., 1999]. The
entire algorithm entails running the FDTD code Ns times (where Ns is the
number of the basis functions on the wire antenna). This is considered a
drawback of the technique proposed in [Bretones et al., 1999], as it requires
extensive computational time, which will be prohibitive in many real-world
cases. Some interesting comparisons between the MoM and FDTD
numerical methods were published in [Monk et al., 1994] for modelling
electrically small antennas and in [Colburn et al., 1995] for radiation and
scattering involving dielectric objects.
Figure 6-1 shows the basic geometry of the hybrid combination of
frequency domain MoM (FDMoM) and FDTD, as proposed by the present
authors [Mangoud et al., 2000; Abd-Alhameed et al., 1999] in 2000 (note:
the symbols and notations presented will be fully explained in later parts of
this chapter). They successfully implemented hybrid FDMoM/FDTD
numerical methods to overcome the drawbacks of homogeneous FDTD and
MoM simulations and in turn to solve a wide variety of electromagnetic
interaction problems effectively. Meanwhile, the finite element method was
successfully hybridised with High-Frequency Asymptotic Techniques
(HFATs) such as PO (Physical Optics), GTD (Geometrical Theory of
Diffraction), PTD (Physical Theory of Diffraction) and UTD (Uniform
Theory of Diffraction) and applied to investigation of antennas mounted on a
large complex structure [Han et al., 2000a, 2000b, 2000c, 2002]. In this
analysis, since the size of the computational domain was too large to be
treated by a full-wave analysis, the use of HFATs was essential. In 2003, the
authors' hybrid method was implemented and validated to analyse a
complex bio-EMC problem [Mochizuki et al., 2003]. A procedure for
calculation of grounding systems by using hybrid FEM-BEM has proved to
be a very accurate and simple way of analyzing and designing such systems,
especially in cases of different local soil properties and arbitrary shapes and
combinations of system elements [Trlep et al., 2003].

Figure 6-1. Basic geometry of the hybrid combination of frequency domain MoM and FDTD.
[Diagram labels: MoM + interpolation method; J(t), M(t) (IDFT); Eb(jω), Hb(jω) (DFT); surfaces Scb and Sci; Scatterer; Source; PML; FDTD.]

In 2004, a new hybrid method combined FDTD, FETD and MoMTD to
analyze problems of thin-wire antennas radiating in the vicinity of
arbitrarily-shaped inhomogeneous bodies [Monorchio et al., 2004]. In
[Monorchio et al., 2004], staircase errors [Akyurtlu et al., 1999] from FDTD
were mitigated by using the Finite Element Method (FEM); however, this
method has the shortcoming of high demand for computational resources.
Meanwhile, a hybrid finite difference/finite-volume time-domain (FVTD)
method [Yang et al., 2000; Yee and Chen, 1997; Edelvik and Ledfelt, 2000],
was applied to solve an automotive electromagnetic compatibility (EMC)
problem [Fierriers et al., 2004]. Hybrid mode-matching (MM) [Orfanidis
et al., 2000] plus finite-element/method of moments/finite difference
techniques were applied for rigorous, fast computer-aided design and
optimization of waveguide components [Arndt et al., 2004].
In 2005, the method in [Mangoud et al., 2000] was extended to include
the analysis of wide band antenna response using an impedance interpolation
method to minimise the computational time on the MoM side [Abd-
Alhameed et al., 2005]. Hybrid FE/BE methods have been applied to many
fields [Zielinski and Zienkiewicz, 1985; Salon and Angelo, 1988; Nath
et al., 1993]. However, there are very few comparisons with analytical
solutions. In [Thiagarajan and Hsieh, 2005], 3D-hybrid FE/BE methods for
electromagnetic launch applications are investigated and verified with
semi-analytical solutions. In [Djordjevic and Notaros, 2005], an efficient and
accurate higher order, large-domain PC-oriented method is proposed, based
on a hybrid method of moments (MoM) and physical optics (PO) technique
for 3-D analysis of arbitrary perfectly conducting antennas and scatterers in
the frequency domain.

2. OUTLINE OF THEORY AND IMPLEMENTATION OF HYBRID METHOD

The following sections develop the theory of the hybrid treatment, using
well-known numerical methods in two or multiple regions, constituting a
general electromagnetic problem. Initially, a simple numerical technique is
presented and implemented using a sub-matrix iterative technique followed
by a field transfer iterative algorithm in multiple regions. The equivalence
principle surfaces are then applied to link separate regions wherein the
sources are all modelled by MoM. A new near-to-near field transformation is
invoked to calculate the back-scattered voltage in the source region using the
reaction theory. The total/scattered field formulation in three dimensions,
as used for incident wave excitations in the FDTD method, is detailed.
A modified version of the total/scattered field formulation is introduced, to be
applied if the size of the scatterer is large with respect to the size of the
source. Two different implementations concerning the locations of E and H
components on the equivalence-principle surface are presented for the
modified formulation. The chapter continues by introducing the theory of
hybrid MoM/FDTD techniques (heterogeneous hybrids). The theory of the
far field and radiated power calculations used in the hybrid code is
illustrated. Studies on the effect of varying the size of the intervening
Huygens surface on the accuracy of the results are performed. Current
distributions, near fields, far fields and radiated power calculations used in
the hybrid technique are presented. Finally, the validity of the method is
checked by comparing some sample geometries involved in EM scattering
problems with either theory or standard numerical techniques.

2.1 Hybrid Treatment for Homogeneous Multiple Elements

As a first step in developing the desired technique, hybrid MoM/MoM
iterative techniques were proposed for simple dipole examples, then the
same computational electromagnetics (CEM) formulation was applied in
separate regions, linked via equivalence-principle surfaces (Huygens
surfaces). After the techniques had been developed and proven,
heterogeneous sets of formulations were investigated.

2.1.1 Hybrid MoM/MoM Treatment for Two Elements (Sub-Matrices Iterative Technique)

A basic scenario of two straight wire dipoles, as shown in Fig. 6-2, was
chosen to demonstrate the hybrid homogeneous MoM/MoM algorithm in
two regions. Both were modelled by MoM, one divided into m wire
segments and the other divided into n segments.
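
Figure 6-2. Geometry of two dipole elements for two-region MoM/MoM treatment.
[Diagram labels: excitations V1 and V2; segment counts m and n.]

The input impedance matrix of dimension (n + m) × (n + m) obtained from the
solution of MoM, which represents the inner product of the total scattered
fields with the test current distribution, can be expanded into sub-matrices
that define the self and the mutual impedances of the radiating elements. For
the two-element problem of Fig. 6-2, the describing equation can then be
written as:

$$
\begin{aligned}
\begin{bmatrix}
\langle J_{m1}, E_{s1} \rangle + \langle J_{m1}, E_{s2} \rangle \\
\langle J_{m2}, E_{s1} \rangle + \langle J_{m2}, E_{s2} \rangle
\end{bmatrix}
&=
\begin{bmatrix}
\langle J_{m1}, E_{i} \rangle \\
\langle J_{m2}, E_{i} \rangle
\end{bmatrix}
\\
\begin{bmatrix} Z_{11} & Z_{12} \\ Z_{21} & Z_{22} \end{bmatrix}
\begin{bmatrix} I_{1} \\ I_{2} \end{bmatrix}
&=
\begin{bmatrix} V_{1} \\ V_{2} \end{bmatrix}
\end{aligned}
\tag{6.1}
$$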

where the impedance sub-matrix Z11 has dimensions m × m, Z12 has
dimensions m × n, Z21 has dimensions n × m and Z22 has dimensions n × n.
Element 1 of the current array (I1) has m elements I11, ..., Im1 and element 2
(I2) is an array of n elements I12, ..., In2.
The two sub-arrays on the right-hand side of the above equation, V1 and
V2, are excitation arrays of m and n elements respectively: all elements of
each are zero except the centre element, which is equal to v1 and v2
respectively. Es1 and Es2 are the fields due to elements 1 and 2, expressed as
functions of I1 and I2 respectively. Jm1 and Jm2 are the test current density
functions of elements 1 and 2. Ei is the impressed excitation source. Eq. (6.1)
can be expanded into the following two equations (in matrix form):

$$ Z_{11} I_{1} + Z_{12} I_{2} = V_{1} \tag{6.2} $$

$$ Z_{21} I_{1} + Z_{22} I_{2} = V_{2} \tag{6.3} $$

Separating I1 from Eq. (6.2):

$$ I_{1} = Z_{11}^{-1} \left[ V_{1} - Z_{12} I_{2} \right] \tag{6.4} $$

If I2 = 0, element 1 is considered to be in free space, without the
existence of element 2:

$$ I_{1} = I_{1}^{\mathrm{free}} = Z_{11}^{-1} V_{1} \tag{6.5} $$

Thus if I1 is known under free space conditions, then I2 can be obtained
easily from Eq. (6.3) as follows:

$$ I_{2} = Z_{22}^{-1} \left[ V_{2} - Z_{21} I_{1} \right] \tag{6.6} $$

Figure 6-3. Iterative algorithm for MoM/MoM Sub-Matrix method for two elements.

Eq. (6.6) can then be rewritten as:

$$ I_{2} = I_{2}^{\mathrm{free}} - Z_{22}^{-1} \left[ Z_{21} I_{1} \right] \tag{6.7} $$

where the term [Z21I1] can be defined as the back-scattered excitation.
Knowing I2, the same procedure can be repeated to find the updated I1 from
Eq. (6.4), again by considering the back-scattered field from element 2 to
element 1. The algorithm is repeated until the steady-state condition is
achieved. The results of this iterative procedure are compared with the exact
solution of the problem, obtained by applying a single MoM region to both
elements 1 and 2 at the same time, i.e. by solving Eq. (6.1) simultaneously.
A flow-chart of the algorithm is shown in Fig. 6-3.
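The iteration can be sketched numerically as follows. The impedance sub-matrices below are random, diagonally dominant stand-ins rather than the output of a MoM code, so the sketch only illustrates the algebra of Eqs. (6.4) to (6.6) and the comparison with a simultaneous solution of Eq. (6.1). The helper `random_block`, the segment counts and the tolerance are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 4, 3  # number of segments on elements 1 and 2


def random_block(rows, cols, scale):
    """Random complex block standing in for a computed impedance sub-matrix."""
    real = rng.standard_normal((rows, cols))
    imag = rng.standard_normal((rows, cols))
    return scale * (real + 1j * imag)


# Hypothetical sub-matrices: dominant self terms, weaker mutual coupling.
Z11 = 50 * np.eye(m) + random_block(m, m, 1.0)
Z22 = 50 * np.eye(n) + random_block(n, n, 1.0)
Z12 = random_block(m, n, 2.0)
Z21 = random_block(n, m, 2.0)

# Excitation arrays: zero except at the feed segment near the centre.
V1 = np.zeros(m, dtype=complex)
V1[m // 2] = 1.0
V2 = np.zeros(n, dtype=complex)
V2[n // 2] = 1.0

I1 = np.linalg.solve(Z11, V1)  # Eq. (6.5): element 1 in free space
for iteration in range(1, 51):
    I2 = np.linalg.solve(Z22, V2 - Z21 @ I1)       # Eq. (6.6)
    I1_next = np.linalg.solve(Z11, V1 - Z12 @ I2)  # Eq. (6.4)
    change = np.linalg.norm(I1_next - I1)
    I1 = I1_next
    if change < 1e-12:
        break

# Reference: solve Eq. (6.1) for both elements simultaneously.
Z = np.block([[Z11, Z12], [Z21, Z22]])
I_exact = np.linalg.solve(Z, np.concatenate([V1, V2]))
error = np.linalg.norm(np.concatenate([I1, I2]) - I_exact)
print(f"iterations: {iteration}, error vs direct solve: {error:.2e}")
```

The iteration converges quickly here because the mutual blocks are small compared with the self blocks; for strongly coupled elements more iterations would be expected, and convergence is not guaranteed in general.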

2.1.2 Hybrid MoM/MoM Method for Two Elements (Field Transfer Iterative Technique)

The next step can be called the "Field transfer hybrid MoM/MoM
technique". As shown before, the back-scattered excitation term [Z21I1] used
in calculating I2 is equal to the inner product term <Jm2, Es1>. Also, the
back-scattered excitation term [Z12I2] in the calculations of I1 is equal to the
inner product term <Jm1, Es2>, where Es1 is the back-scattered field due to
current I1 at the locations of the second element's current test functions and
Es2 is the back-scattered field due to current I2 at the locations of the first
element's current test functions. Another way to get the inner product term can be
implemented by applying the near field term from one element at the test
current locations of the other element. Esp observed at the centre of segment
s for p = 1 or 2 is assumed uniform on the segment length of ls, on the
approximation that the segment length is very small compared to the
operating wavelength. Then the excitation voltage corresponding to that field
for segment i on the locations of the second element (assuming the wire does
not exist) can be rewritten as:

$$ Z_{21}(m2, s1)\, I_{1} = \langle J_{m2}, E_{s1} \rangle = E_{s1} \int_{l_m} f_{m}(l)\, dl \; (\mathbf{a}_{s} \cdot \mathbf{a}_{m}) \tag{6.8} $$

where lm is the length of the test function. It is clear that the value of the
above integral depends on the type of test function applied on element 2.
For example, for equal segment lengths and the first order polynomial test
functions (triangle) used in [Abd-Alhameed et al., 1998, 1999;
Abd-Alhameed and Excell, 1996], Eq. (6.8) will be equal to:

$$ \langle J_{m2}, E_{s1} \rangle = E_{s1}\, l_{m}/2 = E_{s1}\, l_{s}/2 \tag{6.9a} $$

On the other hand, as will be shown later, for the standard NEC MoM
code interface, the test (weighting) functions are impulses as the solution is
of point-matching form. It is then found that the previous back-scattered
excitation voltage will equal:

$$ \langle J_{m2}, E_{s1} \rangle = E_{s1}\, l_{s} \tag{6.9b} $$

Here the idea of a heterogeneous hybrid method can be introduced, as this
procedure can be implemented with different frequency-domain numerical
solutions, for example the finite element method, the Geometrical Theory of
Diffraction or the physical optics method.
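The dependence of the back-scattered excitation voltage on the test function, as in Eq. (6.8), can be checked numerically. The sketch below assumes a unit-peak triangular test function spanning the length lm and a field that is uniform over that length; the numerical values of the length and the field are hypothetical.

```python
import numpy as np


def integrate(weights, dl):
    """Rectangle-rule integral of a sampled test function."""
    return float(np.sum(weights) * dl)


l_m = 0.1                                  # assumed test-function length (m)
l = np.linspace(-l_m / 2, l_m / 2, 100_001)
dl = l[1] - l[0]

triangle = 1.0 - np.abs(l) / (l_m / 2)     # unit-peak triangle over l_m
e_field = 3.0                              # assumed uniform field (V/m)

v_triangle = e_field * integrate(triangle, dl)
print(f"numerical: {v_triangle:.6f} V, E * l_m / 2 = {e_field * l_m / 2:.6f} V")
```

The numerical integral reproduces the factor lm/2 of the triangular case. For impulse (point-matching) weighting no integration is needed and the field is simply multiplied by the segment length, which is why the two forms differ by a constant factor.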

Figure 6-4. Field Transfer Hybrid MoM/MoM algorithm for two-dipole problem.

Figure 6-5. Two dipole MoM/MoM field transfer iterative algorithm.

Thus, the field transfer MoM/MoM [Mangoud et al., 2000] hybrid
technique algorithm, summarised in Fig. 6-4 and the flow chart of Fig. 6-5
for the same basic example of two dipoles, is started by calculating the free
space currents of both dipoles. Next, applying Eq. (6.8), Eq. (6.6) can be
written in the following form:

$$ I_{2} = I_{2}^{\mathrm{free}} - Z_{22}^{-1} V_{s1} = I_{2}^{\mathrm{free}} - Z_{22}^{-1} \left( E_{s1}\, l_{s}/2 \right) \tag{6.10} $$

Similarly for element one, the corresponding equation can be obtained as:

$$ I_{1} = I_{1}^{\mathrm{free}} - Z_{11}^{-1} V_{s2} = I_{1}^{\mathrm{free}} - Z_{11}^{-1} \left( E_{s2}\, l_{s}/2 \right) \tag{6.11} $$

where Es1 and Es2 are as defined before.
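A sketch of the field transfer iteration is given below. Only sampled fields pass between the two regions, so each region could in principle be handled by a different solver. The two callables that return the field of one element at the segment centres of the other are hypothetical stand-ins for a near-field computation, and the matrices are toy values, not MoM results.

```python
import numpy as np


def field_transfer(Z11, Z22, V1, V2, field_1_at_2, field_2_at_1, l_s,
                   tol=1e-12, max_iter=100):
    """Iterate Eqs. (6.10) and (6.11) using sampled fields only."""
    I1_free = np.linalg.solve(Z11, V1)
    I2_free = np.linalg.solve(Z22, V2)
    I1, I2 = I1_free, I2_free
    for count in range(1, max_iter + 1):
        V_s1 = field_1_at_2(I1) * l_s / 2           # back-scattered voltage
        I2 = I2_free - np.linalg.solve(Z22, V_s1)   # Eq. (6.10)
        V_s2 = field_2_at_1(I2) * l_s / 2
        I1_next = I1_free - np.linalg.solve(Z11, V_s2)  # Eq. (6.11)
        converged = np.linalg.norm(I1_next - I1) < tol
        I1 = I1_next
        if converged:
            break
    return I1, I2, count


# Toy data: hypothetical field operators standing in for a near-field solver.
rng = np.random.default_rng(1)
size, l_s = 3, 0.02
G21 = rng.standard_normal((size, size)) + 1j * rng.standard_normal((size, size))
G12 = G21.T
Z11 = Z22 = (40 + 10j) * np.eye(size)
V1 = np.array([0, 1, 0], dtype=complex)   # element 1 fed at its centre
V2 = np.zeros(size, dtype=complex)        # element 2 is a passive scatterer

I1, I2, count = field_transfer(Z11, Z22, V1, V2,
                               lambda I: G21 @ I, lambda I: G12 @ I, l_s)
print(f"converged after {count} iterations")
```

With linear field operators, as here, the result coincides with a single block solution in which the mutual sub-matrices equal the field operators scaled by ls/2.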

Figure 6-6(a). Geometry of the problem modelled in Example 6.1.

The following example is used to illustrate the above technique.
Example 6.1: Figure 6-6(a) shows a dipole of length 0.47λ and radius
0.0045λ, located adjacent to a slab of simulated human brain material, for
different distances between their centres. Here the dielectric material is
implemented using the volume polarisation current technique discussed in
[Abd-Alhameed et al., 1998; Abd-Alhameed and Excell, 1996]. The slab
has dimensions 0.25λ × 0.0625λ × 0.5λ, relative permittivity 41.3 and
conductivity 0.83 S m⁻¹. Fig. 6-6(b) shows the input impedance for the dipole
using the MoM/MoM method. The results of the hybrid method, using two
separate coupled MoM regions, are in excellent agreement with results using
a single MoM region [Abd-Alhameed and Excell, 1996].

2.1.3 Extension of Hybrid MoM/MoM Method from Two Elements to Multiple Elements (Field Transfer Iterative Technique)

The problem is now extended to n elements (sources or scatterers). The
multiple reaction iteration scheme between different elements can be stated
in mathematical form, taking as an example the MoM expression for the
total current in terms of the induced currents. The previous techniques can
be generalised, so that the induced currents in the element p at the i-th
iteration can be given by:

$$ I_{p}^{i} = -\langle J_{mp}, E_{mp} \rangle^{-1} \sum_{\substack{k=1 \\ k \neq p}}^{n} \langle J_{mp}, E_{mk} \rangle\, I_{k}^{i-1} + I_{p}^{\mathrm{free}} \tag{6.12} $$

where Ipfree is the free space current on element p. This current is zero if this
element is considered to be a scatterer element. Jmp is the current test
function on element p. Emp is the scattered field due to the test function Jmp.
Emk is the field due to the test function Jmk. Ik is the induced current on
element k. i is the iteration number and <A, B> is the inner product of A and
B. The second inner product of Eq. (6.12) is the excitation vector due to the
element k. The following example may be used to test the method:
Example 6.2: Three parallel dipoles of length 0.47λ and radius 0.0045λ
were considered in order to test the validity of the hybrid method (Fig. 6-7).
Each of them was treated as being within a separate domain and two of them
were excited by delta source generators of amplitude (1 + j0) V. The input
impedances versus the number of iterations are shown in Fig. 6-8 and Fig.
6-9. Rapid convergence is observed within a few iterations.
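The multi-element scheme of Eq. (6.12) can be sketched as a block iteration. The example below loosely mirrors the set-up of Example 6.2, with three elements of which two are excited by (1 + j0) V, but the impedance blocks are random stand-ins and do not represent the dipoles of Fig. 6-7. The function name and the data layout `Z[p][k]` are assumptions made for illustration.

```python
import numpy as np


def iterate_regions(Z, V, tol=1e-12, max_iter=200):
    """Block iteration following Eq. (6.12).

    Z[p][k] is the (hypothetical) impedance block coupling element k to
    element p; V[p] is the excitation array of element p.
    """
    n = len(V)
    I_free = [np.linalg.solve(Z[p][p], V[p]) for p in range(n)]
    I = [current.copy() for current in I_free]
    for count in range(1, max_iter + 1):
        I_next = []
        for p in range(n):
            # Excitation of element p by the previous currents of the others.
            back = sum(Z[p][k] @ I[k] for k in range(n) if k != p)
            I_next.append(I_free[p] - np.linalg.solve(Z[p][p], back))
        change = max(np.linalg.norm(a - b) for a, b in zip(I_next, I))
        I = I_next
        if change < tol:
            break
    return I, count


rng = np.random.default_rng(2)
segments = 3
Z = [[None] * 3 for _ in range(3)]
for p in range(3):
    for k in range(3):
        if p == k:
            Z[p][k] = 60.0 * np.eye(segments, dtype=complex)
        else:
            Z[p][k] = 2.0 * (rng.standard_normal((segments, segments))
                             + 1j * rng.standard_normal((segments, segments)))

feed = np.array([0, 1, 0], dtype=complex)   # (1 + j0) V at the centre segment
V = [feed, feed, np.zeros(segments, dtype=complex)]  # third element is passive

currents, count = iterate_regions(Z, V)
reference = np.linalg.solve(np.block(Z), np.concatenate(V))
error = np.linalg.norm(np.concatenate(currents) - reference)
print(f"iterations: {count}, error vs direct solve: {error:.2e}")
```

The free-space current of the passive element is zero, as stated above for scatterer elements, and its induced current arises entirely from the coupling terms.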

Figure 6-6(b). Input impedance of 0.47λ dipole adjacent to a slab of simulated human brain
material (MoM/MoM coupled regions and single MoM treatment [Abd-Alhameed and Excell,
1996]).
[Plot: resistance and reactance in ohms versus separation distance in wavelengths (0.2 to 0.5), for MoM and for the hybrid method.]

Figure 6-7. The antenna geometry of MoM/MoM Example 6.2.
[Diagram labels: axes X and Y; spacings 0.4λ and 0.3λ; Source 1 and Source 2 with excitations V1 and V2.]


Figure 6-8. Iteration convergence of the input resistance of the source dipoles in Fig. 6-7.
[Plot: resistance in ohms versus number of iterations (1 to 10) for Source 1 and Source 2; reference values Rin (source 1) = 127 ohms and Rin (source 2) = 80 ohms [15].]

In the above examples the surface of the scatterer and source are assumed
to be the Huygens surfaces. The next step is to use the Huygens surfaces
surrounding the object to be modelled. The effect of the back-scattered
excitation on that object (either the source or the scatterer) can then be
obtained by applying a suitable near-field to near-field transformation, as
will be shown in the next sections.

Figure 6-9. Iteration convergence of the input reactance of the source dipoles in Fig. 6-7.
[Plot: reactance in ohms versus number of iterations (1 to 10) for Source 1 and Source 2; reference reactance magnitudes of 22 ohms (source 1) and 35 ohms (source 2) [15].]

Figure 6-10. The basic structure of the problem.

2.1.4 Hybrid MoM in Multiple Regions Using the Equivalence Principle Surface

Consider the geometry of the problem given in Fig. 6-10. Here a closed
equivalent surface is used to enclose each element in the problem, instead of
dealing with the surface of the source or scatterer itself, as used in the
previous techniques. There should be a specific distance 0 between the
surface of each element and the virtual surface surrounding it, as shown in
region 1 of Fig. 6-10. This distance should be considered to be as small as
possible.
Figure 6-10 shows the problem geometry which is subdivided into n
regions provided there are no physical attachments between them. The
regions may be highly coupled. Each region can be represented as a source
or a scatterer. Since the problem space is divided into n regions, n sub-
domains can be created by introducing closed surfaces Si (for i = 1,..., n),
enclosing each region. Each sub-domain can then be treated separately as
follows.
Initially, each source region is solved (using a preferred CEM method)
for the induced current, assuming the region to be in free space. The
resulting induced currents in each region are used to evaluate the fields on
the enclosing surface. The fields due to all source regions are then used as
excitation sources for the scattering due to all of the regions. The induced
currents in each scatterer region are used to compute the back-scattered
fields on the closed surface surrounding that region.
The effect of the back-scattered fields on a region containing a source
is accounted for as impressed excitation fields in that source region. A
novel near-to-near field transformation is proposed here to obtain these
impressed excitations from the available field values on the virtual surface.
The new induced currents in this source region are used to obtain excitation
fields in all other source and scatterer regions. Scatterer regions are dealt
with in the same way. An iterative procedure is then used until the results
for the interaction between the regions converge.
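
The alternating solve-and-update loop described above can be sketched numerically. The following Python fragment is only an illustration under stated assumptions, not the Fortran implementation used by the authors: it considers one source region and one scatterer region, and it replaces the Huygens-surface field transformation with explicit coupling matrices. The names `Z11`, `Z22`, `Z12`, `Z21`, `iterate_two_regions` and `direct_solution` are hypothetical, and the matrices are arbitrary weakly coupled test data.

```python
import numpy as np

def iterate_two_regions(Z11, Z22, Z12, Z21, V1, n_iter=10):
    """Alternate between source and scatterer solves until the currents settle.

    Z11, Z22 : self-impedance matrices of the source and scatterer regions.
    Z12, Z21 : coupling matrices (stand-ins for the surface-field transfer).
    V1       : impressed excitation of the source region.
    """
    # Step 1: solve the source region as if it were alone in free space.
    I1 = np.linalg.solve(Z11, V1)
    I2 = np.zeros(Z22.shape[0], dtype=complex)
    for _ in range(n_iter):
        # Step 2: the source currents excite the scatterer region.
        I2 = np.linalg.solve(Z22, -Z21 @ I1)
        # Step 3: the back-scattered excitation is added to the impressed one.
        I1 = np.linalg.solve(Z11, V1 - Z12 @ I2)
    return I1, I2

def direct_solution(Z11, Z22, Z12, Z21, V1):
    """Reference: solve the full coupled system in a single step."""
    n1 = Z11.shape[0]
    Z = np.block([[Z11, Z12], [Z21, Z22]])
    V = np.concatenate([V1, np.zeros(Z22.shape[0], dtype=complex)])
    I = np.linalg.solve(Z, V)
    return I[:n1], I[n1:]

# Arbitrary, weakly coupled test data (not taken from the chapter).
Z11 = (10 + 1j) * np.eye(3)
Z22 = (8 - 2j) * np.eye(3)
Z12 = 0.5 * np.ones((3, 3), dtype=complex)
Z21 = Z12.T.copy()
V1 = np.array([0, 1, 0], dtype=complex)

I1_iter, I2_iter = iterate_two_regions(Z11, Z22, Z12, Z21, V1)
I1_ref, I2_ref = direct_solution(Z11, Z22, Z12, Z21, V1)
```

With weak coupling the iterated currents approach those of the single coupled solve; with strong coupling this simple alternation may need more iterations or may fail to converge.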

2.1.4.1 A New Near-to-Near Field Transformation Using Reaction Theory
The reaction integral equation (RIE) has been discussed in detail in
[Richmond, 1974], where it was used for internal source excitation of a thin
wire MoM model. A new application for the reaction theorem will be
proposed to perform a near-to-near field transformation for the proposed
hybrid technique.
The first question is how the source surface currents react with the
scatterer or, equivalently, what surface current density (Js2) is induced on
the scatterer, given that the internal impressed excitation exists only in
the source element. The answer involves applying the RIE to obtain the
scattered excitation voltage (Vb2) from the near-field values on the closed
Huygens surface, as follows. As shown in Fig. 6-11(a), let S denote a virtual
closed surface around the source structure, and let V denote the interior
volumetric region. Assume the source surface current densities to be (Js1free,
Ms1free). From the equivalence principle of Schelkunoff
[Schelkunoff, 1951], the interior field in V will vanish without disturbing the
exterior field (Ei1, Hi1) if we introduce the following surface-current
densities (Ji1, Mi1) on the closed surface S:

$$\mathbf{J}_{i1} = \mathbf{n} \times \mathbf{H}_{i1} \tag{6.13}$$

$$\mathbf{M}_{i1} = \mathbf{E}_{i1} \times \mathbf{n} \tag{6.14}$$
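
As a small numerical illustration of Eqs. (6.13)-(6.14), the sketch below forms the equivalent currents at one surface sample from the outward normal and the tangential fields. The function name `equivalent_currents` and the field values are hypothetical and chosen only so that the result is easy to check by hand.

```python
import numpy as np

def equivalent_currents(n_hat, E, H):
    """Return (J, M) with J = n x H and M = E x n, as in Eqs. (6.13)-(6.14)."""
    n_hat = np.asarray(n_hat, dtype=complex)
    J = np.cross(n_hat, H)
    M = np.cross(E, n_hat)
    return J, M

# One surface sample with outward normal along +z (illustrative values only).
n_hat = np.array([0.0, 0.0, 1.0])
E = np.array([0.0, 1.0, 0.0], dtype=complex)   # tangential E along y
H = np.array([1.0, 0.0, 0.0], dtype=complex)   # tangential H along x
J, M = equivalent_currents(n_hat, E, H)
```

The same function accepts arrays of shape (N, 3), so all cell centres of a gridded surface can be processed in one call.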

Figure 6-11. Near-to-Near field Transformation using Reaction Theorem.

In this situation, we may replace the source structure with a
homogeneous medium (having permeability μ and permittivity ε) without
disturbing the field (Ei1, Hi1) anywhere outside the closed surface S. Now let
us place a test source (or probe) on the scatterer surface and consider its
reaction with the equivalent sources (Ji1, Mi1) or the resultant external field
(Ei1, Hi1). If the test source has electric current density Jm2 and magnetic
current density Mm2, the RIE can be applied to the scatterer, resulting in:

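$$\iint_{s} \left( \mathbf{J}_{m2}\cdot\mathbf{E}_{s2} - \mathbf{M}_{m2}\cdot\mathbf{H}_{s2} \right) ds = \iint_{s} \left( \mathbf{J}_{m2}\cdot\mathbf{E}_{i1} - \mathbf{M}_{m2}\cdot\mathbf{H}_{i1} \right) ds \tag{6.15}$$

where (Es2, Hs2) denotes the field generated by (Js2, Ms2). From Eq. (6.15)
and the reciprocity theorem, we obtain the other form of the RIE:

$$\iint_{s} \left( \mathbf{J}_{s2}\cdot\mathbf{E}_{m2} - \mathbf{M}_{s2}\cdot\mathbf{H}_{m2} \right) ds = \iint_{s} \left( \mathbf{J}_{i1}\cdot\mathbf{E}_{m2} - \mathbf{M}_{i1}\cdot\mathbf{H}_{m2} \right) ds \tag{6.16}$$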

where (Em2, Hm2) are the fields radiating from the test current sources of the
scatterer. It should be noted that the right-hand side is a surface integral over
the equivalent closed surface S, since (Ji1, Mi1) are the equivalent surface
currents. This form of the integral will be used to implement the method. For
a scatterer with a perfectly conducting surface, the magnetic current density
Ms2 vanishes. In this case the left hand side will be simplified to the inner
product term < Js2 , Em2> which is equal to:

$$\iint_{s} \mathbf{J}_{s2}\cdot\mathbf{E}_{m2}\, ds = \langle \mathbf{J}_{s2}, \mathbf{E}_{m2} \rangle = Z_{22(s2,m2)}\, I_{s2} \tag{6.17}$$

where Z22(s2,m2) are the elements of the MoM impedance matrix for the
scatterer object and Is2 are the induced currents on the scatterer surface,
which are the unknowns required from the solution of Eq. (6.16). For the case
of wire antennas, Eq. (6.17) can be implemented as in [Abd-Alhameed and
Excell, 1996], where the MoM solution is:

$$\langle \mathbf{J}_{s2}, \mathbf{E}_{m2} \rangle = j\omega\mu \sum_{s} I_{s2} \int_{l'}\!\int_{l} \left[ f_{m2}(l)\, f_{s2}(l')\, (\mathbf{a}_{m2}\cdot\mathbf{a}_{s2})\, g(R) + \frac{1}{k^{2}}\, f_{m2}(l)\, \frac{\partial f_{s2}(l')}{\partial l'}\, \left(\mathbf{a}_{m2}\cdot\nabla g(R)\right) \right] dl\, dl' \tag{6.18}$$

where fs2(l') and fm2(l) are the basis and weighting current functions along the
scatterer wire respectively. The right-hand side of Eq. (6.16) is the
back-scattered excitation voltage from the source to the scatterer, that is,
the reaction of the equivalent surface currents (Ji1, Mi1) on the scatterer:

$$V_{b2} = \iint_{s} \left( \mathbf{J}_{i1}\cdot\mathbf{E}_{m2} - \mathbf{M}_{i1}\cdot\mathbf{H}_{m2} \right) ds \tag{6.19}$$

Equation (6.19) can be expanded in the following form:

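$$V_{b2} = \iint_{s} \left( \mathbf{J}_{i1}\cdot\left[ j\omega\mu \iint_{S_{m2}} \left( f_{m2}(\mathbf{r}')\, g(R)\, \mathbf{a}_{m2} + \frac{1}{k^{2}}\, \nabla'\cdot f_{m2}(\mathbf{r}')\, \nabla g(R) \right) ds_{m2} \right] - \mathbf{M}_{i1}\cdot\left[ \iint_{S_{m2}} \left( f_{m2}(\mathbf{r}')\, \mathbf{a}_{m2} \times \nabla g(R) \right) ds_{m2} \right] \right) ds \tag{6.20}$$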

This equation can be simplified and discretised by gridding the closed
surface into a suitable number of uniform cells (ns) (usually a rectangular
grid). The surface integral can then be approximated by summation over the
surfaces of the outer cells to evaluate the back-scattered voltage as:

$$V_{b2} = \sum_{k=1}^{n_s} \left( \mathbf{J}^{k}_{i1}\cdot\mathbf{E}_{m2}(\mathbf{r}_k,\mathbf{r}') - \mathbf{M}^{k}_{i1}\cdot\mathbf{H}_{m2}(\mathbf{r}_k,\mathbf{r}') \right) a_k \tag{6.21}$$

where rk is the position vector of the centre of the kth cell surface Sk; ns is
the total number of cell surfaces on the Huygens surface S and ak is the
surface area of a cell. Jki1 and Mki1 are thus the equivalent surface currents
at the centre of the surface of cell k on the closed Huygens surface S. After
the excitation voltage for the scatterer region has been found, the MoM can be
executed to compute the induced unknown currents (Js2).
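
The discretised sum of Eq. (6.21) maps directly onto array operations. The sketch below is a minimal illustration: the function name `reaction_voltage` and the two-cell test data are hypothetical, and the test-function fields are assumed to have been sampled at the cell centres beforehand.

```python
import numpy as np

def reaction_voltage(J, M, E_test, H_test, areas):
    """Discretised reaction sum of Eq. (6.21).

    J, M           : (ns, 3) equivalent currents at the cell centres.
    E_test, H_test : (ns, 3) fields of one test function at the same points.
    areas          : (ns,) cell areas a_k.
    """
    # Plain (unconjugated) dot products, evaluated cell by cell.
    je = np.einsum("kc,kc->k", J, E_test)
    mh = np.einsum("kc,kc->k", M, H_test)
    return np.sum((je - mh) * areas)

# Two-cell example with hand-checkable values (not taken from the chapter).
J = np.array([[1, 0, 0], [0, 1j, 0]], dtype=complex)
E_test = np.array([[2, 0, 0], [0, 1, 0]], dtype=complex)
M = np.array([[0, 0, 1], [0, 0, 0]], dtype=complex)
H_test = np.array([[0, 0, 0.5], [0, 0, 0]], dtype=complex)
areas = np.array([0.1, 0.2])
Vb = reaction_voltage(J, M, E_test, H_test, areas)
```

Calling the function once per test function gives the excitation vector for the MoM solve of the receiving region.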
The next step, considering Fig. 6-11(b), is to calculate the effect of the
induced currents of the scatterer (Js2). These produce the back scattered field
(Ei2, Hi2) which will in turn produce a back scattered induced current on the
source region, to be added to the free space currents (Js1free, Ms1free)
calculated before. The same procedures for the near-to-near field
transformation have to be repeated, but this time for the direction from the
scatterer to the source. A test source is again placed on the source surface
and its reactions with the equivalent sources (Ji2, Mi2) or its resulting field
(Ei2, Hi2) in the interior volume V of the closed surface are considered, as
shown before in Eqs. (6.13)-(6.14) with reversed signs (noting that n is the
unit vector directed outward on S). If the test source has electric current
density Jm1 and magnetic current density Mm1, the RIE and the reciprocity
theorem can be applied in this case, resulting in:

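$$\iint_{s} \left( \mathbf{J}_{s1}\cdot\mathbf{E}_{m1} - \mathbf{M}_{s1}\cdot\mathbf{H}_{m1} \right) ds = \iint_{s} \left( \mathbf{J}_{i2}\cdot\mathbf{E}_{m1} - \mathbf{M}_{i2}\cdot\mathbf{H}_{m1} \right) ds \tag{6.22}$$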

where (Em1, Hm1) are the fields radiating from the test currents of the source.
Again, the magnetic current density Ms1 vanishes. To implement Eq. (6.22)
the same Eqs. (6.17)-(6.21) should be applied for the source region, thus:

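$$\iint_{s} \mathbf{J}_{s1}\cdot\mathbf{E}_{m1}\, ds = \langle \mathbf{J}_{s1}, \mathbf{E}_{m1} \rangle = Z_{11(s1,m1)}\, I_{s1} \tag{6.23}$$

$$\langle \mathbf{J}_{s1}, \mathbf{E}_{m1} \rangle = j\omega\mu \sum_{s} I_{s1} \int_{l'}\!\int_{l} \left[ f_{m1}(l)\, f_{s1}(l')\, (\mathbf{a}_{m1}\cdot\mathbf{a}_{s1})\, g(R) + \frac{1}{k^{2}}\, f_{m1}(l)\, \frac{\partial f_{s1}(l')}{\partial l'}\, \left(\mathbf{a}_{m1}\cdot\nabla g(R)\right) \right] dl\, dl' \tag{6.24}$$

$$V_{b1} = \iint_{s} \left( \mathbf{J}_{i2}\cdot\mathbf{E}_{m1} - \mathbf{M}_{i2}\cdot\mathbf{H}_{m1} \right) ds \tag{6.25}$$

$$V_{b1} = \iint_{s} \left( \mathbf{J}_{i2}\cdot\left[ j\omega\mu \iint_{S_{m1}} \left( f_{m1}(\mathbf{r}')\, g(R)\, \mathbf{a}_{m1} + \frac{1}{k^{2}}\, \nabla'\cdot f_{m1}(\mathbf{r}')\, \nabla g(R) \right) ds_{m1} \right] - \mathbf{M}_{i2}\cdot\left[ \iint_{S_{m1}} \left( f_{m1}(\mathbf{r}')\, \mathbf{a}_{m1} \times \nabla g(R) \right) ds_{m1} \right] \right) ds \tag{6.26}$$

$$V_{b1} = \sum_{k=1}^{n_s} \left( \mathbf{J}^{k}_{i2}\cdot\mathbf{E}_{m1}(\mathbf{r}_k,\mathbf{r}') - \mathbf{M}^{k}_{i2}\cdot\mathbf{H}_{m1}(\mathbf{r}_k,\mathbf{r}') \right) a_k \tag{6.27}$$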

After the excitation voltage for the source region is known, the MoM can
be executed to compute the induced unknown currents (Js1) which should be
added to the current on the antenna, computed in the absence of the scatterer.
As shown before, this method is repeated to take account of multiple
reactions between the source and the scatterer until the current on both of
them converges.
In the following sections examples are given to demonstrate numerical
implementation and validation of the proposed method.

Figure 6-12. Flow chart for Hybrid MoM/MoM treatment of a dipole and scatterer example.

2.1.4.2 Numerical Implementation for Hybrid MoM/MoM Formulation, Using Equivalence-Principle Surface
Fortran subroutines were added to the core of the MoM program in [Abd-
Alhameed et al., 1998] to implement the hybrid MoM/MoM technique with
the Huygens surface proposed in the previous section. Fig. 6-12 shows the
flow chart for the hybrid MoM/MoM computer program written for the
interaction of two dipoles.
In general, the modifications made to the source code in [Abd-Alhameed
et al., 1998], or alternatively to any similar electromagnetic numerical source
code, are as follows:

• The full meshing grid is generated for the virtual closed surfaces S'c or Sc
  at any position surrounding the source and the scatterer regions,
  respectively. The density of meshing for the Huygens surface is defined in
  the new code geometry input file.
• The surface currents on the surfaces are computed separately for each
  region.
• The computation of the back-scattered voltage from the near-to-near field
  transformation is added. This implements Eq. (6.27), discussed in
  Section 2.1.4.1, using the calculated surface currents and the field
  produced from the test function.

To illustrate the above modifications, a problem consisting of a
half-wavelength dipole wire antenna driven as a source, facing another dipole
of the same length acting as a scatterer, was analysed. Both dipoles are
directed parallel to the z-axis. In the first example, two Huygens surfaces
S'c and Sc are used, one for the source and the other for the scatterer, as
shown in Fig. 6-13(a).
Another example implementing the technique with just one equivalent
Huygens surface (Sc same as S'c) around the source was also studied, as
shown in Fig. 6-13(b). The following geometry and parameters were used
for the hybrid example simulation: working frequency = 300 MHz, dipole
lengths = 0.5 m, radius of the wire = 0.001 m, number of basis and test
functions along each dipole = 11, distance between dipoles = 0.3 m. The
starting and ending points for dipole 1 were p1 = (0, 0, −0.25) m and
p2 = (0, 0, 0.25) m; those for dipole 2 were p3 = (0, 0.3, −0.25) m and
p4 = (0, 0.3, 0.25) m. The Huygens surface dimensions were
(0.3 × 0.3 × 0.7) m and the number of surface patches used in modelling this
surface was (30 × 30 × 70). Other values for the number of patches were
tested as well. Dipole 1 has an impressed voltage source at its centre while
dipole 2 is considered as a scatterer.
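
To make the quoted mesh parameters concrete, the following sketch generates cell centres, outward normals and areas on the six faces of a rectangular Huygens box. It is an illustration only: the function name `box_surface_cells` is hypothetical, and centring the box on the origin (the centre of dipole 1) is an assumption not stated in the text.

```python
import numpy as np

def box_surface_cells(dims, counts, centre=(0.0, 0.0, 0.0)):
    """Cell centres, outward normals and areas on the six faces of a box."""
    dims = np.asarray(dims, dtype=float)
    counts = np.asarray(counts, dtype=int)
    centre = np.asarray(centre, dtype=float)
    step = dims / counts
    # Cell-centre coordinates along each axis, relative to the box centre.
    axes = [(np.arange(n) + 0.5) * h - d / 2
            for n, h, d in zip(counts, step, dims)]
    centres, normals, areas = [], [], []
    for normal_axis in range(3):
        u, v = [a for a in range(3) if a != normal_axis]
        uu, vv = np.meshgrid(axes[u], axes[v], indexing="ij")
        for sign in (-1.0, 1.0):
            pts = np.zeros((uu.size, 3))
            pts[:, u] = uu.ravel()
            pts[:, v] = vv.ravel()
            pts[:, normal_axis] = sign * dims[normal_axis] / 2
            nrm = np.zeros((uu.size, 3))
            nrm[:, normal_axis] = sign
            centres.append(pts + centre)
            normals.append(nrm)
            areas.append(np.full(uu.size, step[u] * step[v]))
    return np.vstack(centres), np.vstack(normals), np.concatenate(areas)

c0 = 299_792_458.0                 # speed of light, m/s
wavelength = c0 / 300e6            # about 1 m at the working frequency
centres, normals, areas = box_surface_cells((0.3, 0.3, 0.7), (30, 30, 70))
```

With these parameters each patch is 0.01 m square, roughly one hundredth of a wavelength, and the six faces together carry 10200 patches.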

Figure 6-13(a). Two dipole elements modelled with two equivalent surfaces.

Figure 6-13(b). Two dipole elements modelled with one equivalent surface.

Three different values of separation distance, D, between the two dipoles
were tested and the input impedance calculated with the hybrid method was
compared with the exact MoM calculations, as given in Table 6-1. This
shows excellent agreement between the two techniques: the real parts agree
to the precision shown, while the imaginary parts differ by 2 to 4 ohms. This
difference would be reduced by increasing the number of iterations and the
number of gridding cells on the equivalent closed surface.

Table 6-1. Input impedance for hybrid and standard MoM calculations for two-dipole example
with three different values of the separation distance D, for four iterations.

Separation Distance (D)   Standard MoM               Hybrid MoM/MoM
                          Input Impedance (Ohms)     Input Impedance (Ohms)
0.1                       (25 + j69)                 (25 + j65)
0.3                       (104 + j64)                (104 + j60)
0.5                       (84 + j33)                 (84 + j31)

[Plot: real and imaginary parts of the current on the antenna (amp) versus segment number
(1 to 22); curves: Re(I) standard, Im(I) standard, Re(I) hybrid, Im(I) hybrid]

Figure 6-14. Real and imaginary parts of the current on antenna structures versus segment
number for both the source and scatterer dipoles for hybrid and standard MoM solutions as
shown in Figs. 6-11 and 6-12. (D = 0.5).

Figure 6-14 shows the real and imaginary parts of the current on the
antenna structures versus the number of segments on the wire antenna for
both the source and scatterer dipoles, using hybrid MoM and exact MoM
solutions, for a distance between dipoles D = 0.5. The result shown was
obtained after 4 iterations. It is clear that the current values along the two
dipoles for the proposed hybrid method compare well with the exact MoM
curves along both dipoles.

3. INCIDENT WAVE EXCITATIONS IN THE FDTD METHOD

The Total/Scattered Field Formulation [Mur, 1981; Taflove and
Umashankar, 1982b] is the most popular method for realising compact
sinusoidally-illuminated wave sources, especially plane waves of arbitrary
angle of incidence. Figure 6-15 illustrates the zoning of the numerical space
grid into two distinct regions, a total field region and a scattered field
region. They are separated by a non-physical virtual surface, implemented
numerically with a special treatment that both introduces the incident wave
excitation and splits the problem space into total and scattered field
regions. This leads to a very important feature: the scattered field vector
values may be computed in the scattered field region (region 2) with no
incident field included in that region. This is very useful for the hybrid
MoM/FDTD method, as will be seen later in this chapter. The second key
feature of this formulation is the efficient modelling of arbitrary incident
plane waves with different oblique incidence angles using an incident-field
array (IFA) excitation scheme proposed by Taflove [Taflove, 1995]. The
IFA is an FDTD-based look-up table from which incident-field values are
overlaid on the FDTD grid in the direction of propagation. This formulation
will be adapted later for use in the proposed hybrid MoM/FDTD technique.
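
The separation into total and scattered field regions is easiest to see in one dimension. The sketch below is a simplified illustration under several assumptions that are not part of the chapter: a one-dimensional grid in normalised units with Courant number 1, a Gaussian pulse instead of a sinusoid, an analytic incident field instead of an incident-field array, and no scatterer. The function name `run_tfsf_1d` and all parameter values are hypothetical.

```python
import numpy as np

def run_tfsf_1d(n_cells=400, boundary=100, n_steps=250, delay=80.0, width=10.0):
    """1-D FDTD with a total/scattered-field boundary and no scatterer.

    Nodes ez[m] with m >= boundary hold the total field; nodes with
    m < boundary hold the scattered field only.
    """
    imp0 = 377.0                     # free-space impedance (approximate)
    ez = np.zeros(n_cells)
    hy = np.zeros(n_cells)           # hy[m] sits at position m + 1/2

    def pulse(t):
        # Incident Ez at the boundary node as a function of the time step.
        return np.exp(-((t - delay) / width) ** 2)

    for q in range(n_steps):
        hy[:-1] += (ez[1:] - ez[:-1]) / imp0
        # hy[boundary-1] is a scattered-field node whose update used a
        # total-field Ez: remove the incident part.
        hy[boundary - 1] -= pulse(q) / imp0
        ez[1:] += (hy[1:] - hy[:-1]) * imp0
        # ez[boundary] is a total-field node whose update used a
        # scattered-field Hy: add the incident part (Hy_inc taken half a
        # cell before the boundary and half a time step later).
        ez[boundary] += pulse(q + 1.0)
    return ez

ez = run_tfsf_1d()
scattered_peak = np.max(np.abs(ez[:100]))    # region behind the boundary
total_peak = np.max(np.abs(ez[100:]))        # region carrying the pulse
```

Because no scatterer is present, the scattered-field region stays at rounding-error level while the pulse propagates undisturbed through the total-field region.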

[Figure: Region 1 (total field) containing the interacting object; connecting surface and
plane wave source; Region 2 (scattered field); radiation boundary condition]

Figure 6-15. Overview of the total and scattered field zoning for a generic scattering problem.

3.1 Total/Scattered Field Formulation in Three Dimensions

The implementation of the equivalence principle surface in an FDTD computer
code is complicated by the fact that H and E nodes are at different points, a
half-cell apart from each other. We can visualise two rectangular boxes half
a cell apart: the Huygens electric current sources (tangential magnetic
field) are defined on the inner box and the magnetic current sources
(tangential electric field) on the outer box. Both are computed in a special
format (different from that previously defined in Section 2.1.4.1) and
applied in the updating equations.
To simplify the large number of equations required, the abbreviated
notation used by Taflove in [Taflove, 1995] will be adopted. The basic
FDTD updating equation for a field component A given in [Taflove, 1995] is
denoted {A}FDTD, and the modified forms are given below by adding
appropriate additional terms. Referring to Fig. 6-16(a), the Ey components
at cells referenced (i = io; j = jo+1/2, ..., j1−1/2; k = ko, ..., k1) are
given by:

$$E_y|^{n+1}_{i_o,j,k} = \{E_y|^{n+1}_{i_o,j,k}\}_{\mathrm{FDTD}} + C_{b,E_y}|_{i_o,j,k}\, H_{z,\mathrm{inc}}|^{n+1/2}_{i_o-1/2,j,k} \tag{6.28}$$

Ez components at cells referenced (i = io; j = jo, ..., j1; k = ko+1/2, ..., k1−1/2):

$$E_z|^{n+1}_{i_o,j,k} = \{E_z|^{n+1}_{i_o,j,k}\}_{\mathrm{FDTD}} - C_{b,E_z}|_{i_o,j,k}\, H_{y,\mathrm{inc}}|^{n+1/2}_{i_o-1/2,j,k} \tag{6.29}$$

Hy components at cells referenced (i = io−1/2; j = jo, ..., j1; k = ko+1/2, ..., k1−1/2):

$$H_y|^{n+1/2}_{i_o-1/2,j,k} = \{H_y|^{n+1/2}_{i_o-1/2,j,k}\}_{\mathrm{FDTD}} - D_{b,H_y}|_{i_o-1/2,j,k}\, E_{z,\mathrm{inc}}|^{n}_{i_o,j,k} \tag{6.30}$$

Hz components at cells referenced (i = io−1/2; j = jo−1/2, ..., j1+1/2; k = ko, ..., k1):

$$H_z|^{n+1/2}_{i_o-1/2,j,k} = \{H_z|^{n+1/2}_{i_o-1/2,j,k}\}_{\mathrm{FDTD}} + D_{b,H_z}|_{i_o-1/2,j,k}\, E_{y,\mathrm{inc}}|^{n}_{i_o,j,k} \tag{6.31}$$

Ey components at cells referenced (i = i1; j = jo+1/2, ..., j1−1/2; k = ko, ..., k1):

$$E_y|^{n+1}_{i_1,j,k} = \{E_y|^{n+1}_{i_1,j,k}\}_{\mathrm{FDTD}} - C_{b,E_y}|_{i_1,j,k}\, H_{z,\mathrm{inc}}|^{n+1/2}_{i_1+1/2,j,k} \tag{6.32}$$

Figure 6-16(a). Location of Ey and Ez components in planes i = io and i = i1, and
of Hz and Hy components in planes i = io−1/2 and i = i1+1/2.

Figure 6-16(b). Location of Ex and Ez components in planes j = jo and j = j1, and
of Hz and Hx components in planes j = jo−1/2 and j = j1+1/2.

Ez components at cells referenced (i = i1; j = jo, ..., j1; k = ko+1/2, ..., k1−1/2):

$$E_z|^{n+1}_{i_1,j,k} = \{E_z|^{n+1}_{i_1,j,k}\}_{\mathrm{FDTD}} + C_{b,E_z}|_{i_1,j,k}\, H_{y,\mathrm{inc}}|^{n+1/2}_{i_1+1/2,j,k} \tag{6.33}$$

Hy components at cells referenced (i = i1+1/2; j = jo, ..., j1; k = ko+1/2, ..., k1−1/2):

$$H_y|^{n+1/2}_{i_1+1/2,j,k} = \{H_y|^{n+1/2}_{i_1+1/2,j,k}\}_{\mathrm{FDTD}} + D_{b,H_y}|_{i_1+1/2,j,k}\, E_{z,\mathrm{inc}}|^{n}_{i_1,j,k} \tag{6.34}$$

Hz components at cells referenced (i = i1+1/2; j = jo+1/2, ..., j1−1/2; k = ko, ..., k1):

$$H_z|^{n+1/2}_{i_1+1/2,j,k} = \{H_z|^{n+1/2}_{i_1+1/2,j,k}\}_{\mathrm{FDTD}} - D_{b,H_z}|_{i_1+1/2,j,k}\, E_{y,\mathrm{inc}}|^{n}_{i_1,j,k} \tag{6.35}$$

Figure 6-16(c). Location of Ex and Ey components in planes k = ko and k = k1, and
of Hx and Hy components in planes k = ko−1/2 and k = k1+1/2.

Referring now to Fig. 6-16(b), the Ex components at cells referenced
(i = io+1/2, ..., i1−1/2; j = jo; k = ko, ..., k1) are given by:

$$E_x|^{n+1}_{i,j_o,k} = \{E_x|^{n+1}_{i,j_o,k}\}_{\mathrm{FDTD}} - C_{b,E_x}|_{i,j_o,k}\, H_{z,\mathrm{inc}}|^{n+1/2}_{i,j_o-1/2,k} \tag{6.36}$$

Ez components at cells referenced (i = io, ..., i1; j = jo; k = ko+1/2, ..., k1−1/2):

$$E_z|^{n+1}_{i,j_o,k} = \{E_z|^{n+1}_{i,j_o,k}\}_{\mathrm{FDTD}} + C_{b,E_z}|_{i,j_o,k}\, H_{x,\mathrm{inc}}|^{n+1/2}_{i,j_o-1/2,k} \tag{6.37}$$

Hx components at cells referenced (i = io, ..., i1; j = jo−1/2; k = ko+1/2, ..., k1−1/2):

$$H_x|^{n+1/2}_{i,j_o-1/2,k} = \{H_x|^{n+1/2}_{i,j_o-1/2,k}\}_{\mathrm{FDTD}} + D_{b,H_x}|_{i,j_o-1/2,k}\, E_{z,\mathrm{inc}}|^{n}_{i,j_o,k} \tag{6.38}$$

Hz components at cells referenced (i = io+1/2, ..., i1−1/2; j = jo−1/2; k = ko, ..., k1):

$$H_z|^{n+1/2}_{i,j_o-1/2,k} = \{H_z|^{n+1/2}_{i,j_o-1/2,k}\}_{\mathrm{FDTD}} - D_{b,H_z}|_{i,j_o-1/2,k}\, E_{x,\mathrm{inc}}|^{n}_{i,j_o,k} \tag{6.39}$$

Ex components at cells referenced (i = io+1/2, ..., i1−1/2; j = j1; k = ko, ..., k1):

$$E_x|^{n+1}_{i,j_1,k} = \{E_x|^{n+1}_{i,j_1,k}\}_{\mathrm{FDTD}} + C_{b,E_x}|_{i,j_1,k}\, H_{z,\mathrm{inc}}|^{n+1/2}_{i,j_1+1/2,k} \tag{6.40}$$

Ez components at cells referenced (i = io, ..., i1; j = j1; k = ko+1/2, ..., k1−1/2):

$$E_z|^{n+1}_{i,j_1,k} = \{E_z|^{n+1}_{i,j_1,k}\}_{\mathrm{FDTD}} - C_{b,E_z}|_{i,j_1,k}\, H_{x,\mathrm{inc}}|^{n+1/2}_{i,j_1+1/2,k} \tag{6.41}$$

Hx components at cells referenced (i = io, ..., i1; j = j1+1/2; k = ko+1/2, ..., k1−1/2):

$$H_x|^{n+1/2}_{i,j_1+1/2,k} = \{H_x|^{n+1/2}_{i,j_1+1/2,k}\}_{\mathrm{FDTD}} - D_{b,H_x}|_{i,j_1+1/2,k}\, E_{z,\mathrm{inc}}|^{n}_{i,j_1,k} \tag{6.42}$$

Hz components at cells referenced (i = io+1/2, ..., i1−1/2; j = j1+1/2; k = ko, ..., k1):

$$H_z|^{n+1/2}_{i,j_1+1/2,k} = \{H_z|^{n+1/2}_{i,j_1+1/2,k}\}_{\mathrm{FDTD}} + D_{b,H_z}|_{i,j_1+1/2,k}\, E_{x,\mathrm{inc}}|^{n}_{i,j_1,k} \tag{6.43}$$

Considering Fig. 6-16(c), the Ex components at cells referenced
(i = io+1/2, ..., i1−1/2; j = jo, ..., j1; k = ko) are given by:

$$E_x|^{n+1}_{i,j,k_o} = \{E_x|^{n+1}_{i,j,k_o}\}_{\mathrm{FDTD}} + C_{b,E_x}|_{i,j,k_o}\, H_{y,\mathrm{inc}}|^{n+1/2}_{i,j,k_o-1/2} \tag{6.44}$$

Ey components at cells referenced (i = io, ..., i1; j = jo+1/2, ..., j1−1/2; k = ko):

$$E_y|^{n+1}_{i,j,k_o} = \{E_y|^{n+1}_{i,j,k_o}\}_{\mathrm{FDTD}} - C_{b,E_y}|_{i,j,k_o}\, H_{x,\mathrm{inc}}|^{n+1/2}_{i,j,k_o-1/2} \tag{6.45}$$

Hx components at cells referenced (i = io, ..., i1; j = jo+1/2, ..., j1−1/2; k = ko−1/2):

$$H_x|^{n+1/2}_{i,j,k_o-1/2} = \{H_x|^{n+1/2}_{i,j,k_o-1/2}\}_{\mathrm{FDTD}} - D_{b,H_x}|_{i,j,k_o-1/2}\, E_{y,\mathrm{inc}}|^{n}_{i,j,k_o} \tag{6.46}$$

Hy components at cells referenced (i = io+1/2, ..., i1−1/2; j = jo, ..., j1; k = ko−1/2):

$$H_y|^{n+1/2}_{i,j,k_o-1/2} = \{H_y|^{n+1/2}_{i,j,k_o-1/2}\}_{\mathrm{FDTD}} + D_{b,H_y}|_{i,j,k_o-1/2}\, E_{x,\mathrm{inc}}|^{n}_{i,j,k_o} \tag{6.47}$$

Ex components at cells referenced (i = io+1/2, ..., i1−1/2; j = jo, ..., j1; k = k1):

$$E_x|^{n+1}_{i,j,k_1} = \{E_x|^{n+1}_{i,j,k_1}\}_{\mathrm{FDTD}} - C_{b,E_x}|_{i,j,k_1}\, H_{y,\mathrm{inc}}|^{n+1/2}_{i,j,k_1+1/2} \tag{6.48}$$

Ey components at cells referenced (i = io, ..., i1; j = jo+1/2, ..., j1−1/2; k = k1):

$$E_y|^{n+1}_{i,j,k_1} = \{E_y|^{n+1}_{i,j,k_1}\}_{\mathrm{FDTD}} + C_{b,E_y}|_{i,j,k_1}\, H_{x,\mathrm{inc}}|^{n+1/2}_{i,j,k_1+1/2} \tag{6.49}$$

Hx components at cells referenced (i = io, ..., i1; j = jo+1/2, ..., j1−1/2; k = k1+1/2):

$$H_x|^{n+1/2}_{i,j,k_1+1/2} = \{H_x|^{n+1/2}_{i,j,k_1+1/2}\}_{\mathrm{FDTD}} + D_{b,H_x}|_{i,j,k_1+1/2}\, E_{y,\mathrm{inc}}|^{n}_{i,j,k_1} \tag{6.50}$$

Hy components at cells referenced (i = io+1/2, ..., i1−1/2; j = jo, ..., j1; k = k1+1/2):

$$H_y|^{n+1/2}_{i,j,k_1+1/2} = \{H_y|^{n+1/2}_{i,j,k_1+1/2}\}_{\mathrm{FDTD}} - D_{b,H_y}|_{i,j,k_1+1/2}\, E_{x,\mathrm{inc}}|^{n}_{i,j,k_1} \tag{6.51}$$

It should be noted that a linear interpolation using the closest two points
in the source grid is used on the E and H surfaces, given the delay distance
and source grid values as shown in ref. [Taflove, 1995].
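
The two-point linear interpolation mentioned above can be sketched as follows. This is a minimal illustration, not the scheme as printed in [Taflove, 1995]: the function name `incident_from_source_grid` and the arguments `m0` (index of the source-grid origin) and `delay` (delay distance in source-grid cells) are hypothetical, and the computation of the delay distance itself is not shown.

```python
import numpy as np

def incident_from_source_grid(source_values, m0, delay):
    """Linearly interpolate a 1-D source-grid array at position m0 + delay."""
    base = int(np.floor(delay))
    frac = delay - base
    # The two closest source-grid nodes bracketing the requested position.
    lower = source_values[m0 + base]
    upper = source_values[m0 + base + 1]
    return (1.0 - frac) * lower + frac * upper

source_values = 2.0 * np.arange(10)    # ramp: value 2*m at node m
value = incident_from_source_grid(source_values, m0=2, delay=3.25)
```

For the ramp used here the interpolation is exact, which makes the function easy to verify before applying it to a propagating incident field.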

4. MODIFIED TOTAL/SCATTERED FIELD FORMULATION FOR THE HYBRID TECHNIQUE

Plane wave injection by the total/scattered field formulation, as reviewed
in the previous section in Eqs. (6.28)-(6.51), is applied in the proposed
hybrid method to inject the MoM-modelled antenna near-field excitations
into the FDTD domain. Instead of the known calculated plane wave FDTD
functions Einc and Hinc (obtained from the look-up table and linear
interpolation) in those equations, the MoM-calculated near-field values Emom
and Hmom are substituted on the six faces of the rectangular equivalent
surface. This was discussed and implemented using the integral method in
[Abd-Alhameed et al., 1998; Abd-Alhameed and Excell, 1996]; equivalent
surface electric and magnetic currents tangential to the Huygens surface may
alternatively be used, as shown in Fig. 6-17(a).

Figure 6-17(a). Huygens equivalent source currents on FDTD grid with half cell difference
for J and M surface currents on the six faces.

A modified formula is used for the total/scattered field formulation to
swap the scattered and total field regions (see Fig. 6-17(b)), so that the
region inside the Huygens surface is the scattered field region while the
region outside is the total field region. This modified formula is suitable
if the scatterer is larger than the source. The update equations of the
previous section can thus be modified for two different boundary placements,
depending on which field components lie on the outer side of the equivalent
surface. For H outside the surface (the outer surface in the FDTD Huygens
surface grids), and considering as an example only the face i = io of
Fig. 6-16(a), the Ey components at cells referenced
(i = io; j = jo+1/2, ..., j1−1/2; k = ko, ..., k1) are given by:

$$E_y|^{n+1}_{i_o,j,k} = \{E_y|^{n+1}_{i_o,j,k}\}_{\mathrm{FDTD}} - C_{b,E_y}|_{i_o,j,k}\, H_{z,\mathrm{mom}}|^{n+1/2}_{i_o-1/2,j,k} \tag{6.52}$$

Ez components at cells referenced (i = io; j = jo, ..., j1; k = ko+1/2, ..., k1−1/2):

$$E_z|^{n+1}_{i_o,j,k} = \{E_z|^{n+1}_{i_o,j,k}\}_{\mathrm{FDTD}} + C_{b,E_z}|_{i_o,j,k}\, H_{y,\mathrm{mom}}|^{n+1/2}_{i_o-1/2,j,k} \tag{6.53}$$

Hy components at cells referenced (i = io−1/2; j = jo, ..., j1; k = ko+1/2, ..., k1−1/2):

$$H_y|^{n+1/2}_{i_o-1/2,j,k} = \{H_y|^{n+1/2}_{i_o-1/2,j,k}\}_{\mathrm{FDTD}} + D_{b,H_y}|_{i_o-1/2,j,k}\, E_{z,\mathrm{mom}}|^{n}_{i_o,j,k} \tag{6.54}$$

Hz components at cells referenced (i = io−1/2; j = jo−1/2, ..., j1+1/2; k = ko, ..., k1):

$$H_z|^{n+1/2}_{i_o-1/2,j,k} = \{H_z|^{n+1/2}_{i_o-1/2,j,k}\}_{\mathrm{FDTD}} - D_{b,H_z}|_{i_o-1/2,j,k}\, E_{y,\mathrm{mom}}|^{n}_{i_o,j,k} \tag{6.55}$$

[Figure: Region 1 (total field) containing the interacting object; connecting surface and
MoM near-field excitation source; Region 2 (scattered field, MoM); radiation boundary
condition]

Figure 6-17(b). Overview of the modified total and scattered zoning for hybrid method.

The same changes, with a sign reversal for the applied E and H
excitations, are applied to Eqs. (6.32)-(6.51).

If instead H is inside the Huygens equivalent surface, the equations are
written as follows:

Ey components at cells referenced (i = io; j = jo+1/2, ..., j1−1/2; k = ko, ..., k1):

$$E_y|^{n+1}_{i_o,j,k} = \{E_y|^{n+1}_{i_o,j,k}\}_{\mathrm{FDTD}} - C_{b,E_y}|_{i_o,j,k}\, H_{z,\mathrm{mom}}|^{n+1/2}_{i_o+1/2,j,k} \tag{6.56}$$

Ez components at cells referenced (i = io; j = jo, ..., j1; k = ko+1/2, ..., k1−1/2):

$$E_z|^{n+1}_{i_o,j,k} = \{E_z|^{n+1}_{i_o,j,k}\}_{\mathrm{FDTD}} + C_{b,E_z}|_{i_o,j,k}\, H_{y,\mathrm{mom}}|^{n+1/2}_{i_o+1/2,j,k} \tag{6.57}$$

Hy components at cells referenced (i = io+1/2; j = jo, ..., j1; k = ko+1/2, ..., k1−1/2):

$$H_y|^{n+1/2}_{i_o+1/2,j,k} = \{H_y|^{n+1/2}_{i_o+1/2,j,k}\}_{\mathrm{FDTD}} + D_{b,H_y}|_{i_o+1/2,j,k}\, E_{z,\mathrm{mom}}|^{n}_{i_o,j,k} \tag{6.58}$$

Hz components at cells referenced (i = io+1/2; j = jo−1/2, ..., j1+1/2; k = ko, ..., k1):

$$H_z|^{n+1/2}_{i_o+1/2,j,k} = \{H_z|^{n+1/2}_{i_o+1/2,j,k}\}_{\mathrm{FDTD}} - D_{b,H_z}|_{i_o+1/2,j,k}\, E_{y,\mathrm{mom}}|^{n}_{i_o,j,k} \tag{6.59}$$

Again, the same treatment for the equations of the other five faces can be
implemented. It should be noted that if it is required to find the total field
inside the Huygens surface and the scattered field outside (to react with the
boundary as in the configuration of Section 3.1 for the hybridisation method)
a change of the sign must be implemented in the right hand side of Eqs.
(6.52)-(6.55) along with the equations of the other five faces.

5. VALIDATION OF TOTAL/SCATTERED FIELD FORMULATION IMPLEMENTATION USING HOMOGENEOUS FDTD IN MULTIPLE REGIONS

Before developing the hybrid MoM/FDTD method, the total/scattered
field formulation was implemented and tested in a homogeneous FDTD code
having multiple regions. This can be called an FDTD/FDTD hybrid method.
In this method both the source and the scatterer are modelled by the FDTD
method, as shown in Fig. 6-18. In general, the cell meshing of the problem
space can be chosen to be different in each region [Okoniewski et al., 1997].
The incident field from the source to the closed surface containing the
scattering object is modelled by using a Huygens surface [Merewether et al.,
1980]. This excludes the source by replicating its fields incident on the
surface (the incident fields on the Huygens surface are represented by
the equivalent surface magnetic and electric currents). At each time step,
the scatterer fields are obtained and the equivalent surface currents on the
Huygens surface are updated. An interpolation algorithm is required if
the cell mesh sizes for the source and the object are different. The
equivalent surface currents are also used to work back to the source to
obtain the induced currents on it. The algorithm is repeated for each time
step until a few cycles have passed and the steady-state solution is obtained.

[Figure: source region with cell mesh size (M) and scatterer region with cell mesh size (N),
enclosed by a PML layer]

Figure 6-18. FDTD/FDTD technique.

One example is used to compare this method with a single-domain MoM
treatment. The test scenario chosen is effectively a division of a basic FDTD
simulation of a dipole in free space. The division is created by inserting
a Huygens surface part-way between the source and the PML boundary:
the outer region is thus nominally a scatterer, although it contains no
physical scattering object. The intention was to test the viability
of the Huygens surface treatment. The cell meshing of the problem space
was chosen to be the same in each region.
The dipole studied was 0.51λ long and of 1 mm radius, operating at 300
MHz. In the FDTD/FDTD method the dipole is represented by zeroing the
electric fields along its axis; its centre was located at the origin. The cell
size was 0.03λ for the two regions. The Huygens surface was located 2 cells
from the dipole axis. The Huygens surface used had a total length of 30 cells
and a width of 4 cells between the inner PML boundaries. For the MoM
model the dipole was divided into 17 segments, centred at the coordinate
origin, and it was fed by a voltage source of amplitude 1 V at its centre. The
results of the two methods are presented in Table 6-2, which shows the
field values at two locations inside the scatterer region. The values of
Ex, Ey and Ez are shown for the points P1 (0.09λ, 0.06λ, 0.06λ) and P2
(0.3λ, 0.06λ, 0.06λ), equivalent to (3, 2, 2) and (10, 2, 2) respectively
when expressed as numbers of cells from the origin. Good agreement is
observed for the dominant Ez component (within about 3%); the weaker Ex
and Ey components differ by up to about 25%.

6. HYBRID MoM/FDTD TECHNIQUE ALGORITHM

Figure 6-19 shows an outline diagram of the iteration flow chart for the
MoM/FDTD hybrid method. To maximise usefulness and exploit extensive
past experience, the frequency domain version of MoM was used and hence
simple Fourier transforms had to be applied between each iterative step.

Table 6-2. Comparison between computed fields with the FDTD/FDTD hybrid technique and
a single-domain MoM treatment [Abd-Alhameed and Excell, 1999].

Observation point       Technique used   Ex (V/m)   Ey (V/m)   Ez (V/m)
P1 (0.09,0.06,0.06)     FDTD/FDTD        1.321      0.931      2.235
                        Single MoM       1.119      0.746      2.180
P2 (0.3,0.06,0.06)      FDTD/FDTD        0.349      0.0655     1.487
                        Single MoM       0.303      0.0607     1.449

Figure 6-19. Flow chart of the proposed MoM/FDTD hybrid method.
The source and the scatterer are located in two separate regions. The
induced currents for the source region are obtained, excluding the effect of
the scatterer, using the frequency domain version of MoM. An appropriate
set of fictitious sources allows division of complex radiation problems into
two simpler problems, accounting for the coupling between actual sources
and scatterers.
The fields due to these currents are obtained on the closed surface
(Huygens surface) that separates the source from the scatterer. Oscillating
with respect to a reference phase of the source, these fields or their
equivalent surface currents are converted to time-domain excitation incident
fields or current sources using an inverse discrete Fourier transform.
The FDTD algorithm is now executed with these time-domain sources to
obtain the induced currents on the scatterer. The back-scattered fields on the
source side of the Huygens surface are considered to be the excitation
sources for the source region. These fields or their equivalent current sources
are transferred to the frequency domain using a discrete Fourier transform.
The phase difference relative to the reference phase of the source is taken
into account.
The MoM is then used in reverse to evaluate the induced currents on the
source region due to both the original source excitation and the induced
equivalent current sources from the FDTD method. The method is repeated
until a steady state solution is obtained.
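
The conversions between the frequency-domain MoM quantities and the time-domain FDTD quantities can be illustrated for a single frequency. The sketch below assumes a steady-state sinusoid sampled over exactly one period; the function names `phasor_to_time` and `time_to_phasor`, the sample count and the test phasor are hypothetical and not taken from the chapter.

```python
import numpy as np

def phasor_to_time(phasor, freq, t):
    """Time-domain sinusoid Re{A exp(j 2 pi f t)} for a complex phasor A."""
    return np.real(phasor * np.exp(2j * np.pi * freq * t))

def time_to_phasor(samples, freq, t):
    """Single-frequency DFT over an integer number of periods."""
    return 2.0 * np.mean(samples * np.exp(-2j * np.pi * freq * t))

freq = 300e6                                          # operating frequency, Hz
n_per_period = 40
t = np.arange(n_per_period) / (n_per_period * freq)   # one full period
A = 1.5 * np.exp(1j * 0.7)                            # arbitrary test phasor
recovered = time_to_phasor(phasor_to_time(A, freq, t), freq, t)
```

The recovered phasor carries both amplitude and phase relative to the chosen time origin, which is how the reference phase of the source can be preserved between iterations.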

6.1 Theoretical Formulation

Consider Fig. 6-20; this shows two regions, one representing the source
region A and the other the scatterer B. The source region is bounded by the
closed Huygens surface Sc. The method starts by computing the fields due to
the real currents of the source region A (previously evaluated using the
internal excitation in the source region) on the surface Sc, excluding the
scatterer region B. These fields are computed using the MoM wire current
calculation as discussed in [Colburn et al., 1995; Abd-Alhameed et al., 1998;
Abd-Alhameed and Excell, 1996], using Galerkin's solution with straight
and curved segments and triangular basis functions on the wire surface.
The equivalent surface currents on the surface Sc that represent the
outward travelling fields from the source to the scatterer, due to the fields of
the source region A, as shown before in Eqs. (6.13)-(6.14), can be rewritten
as:

$$\mathbf{J}_{if} = \mathbf{n} \times \mathbf{H}_{if} \tag{6.60}$$

$$\mathbf{M}_{if} = \mathbf{E}_{if} \times \mathbf{n} \tag{6.61}$$

where n is the unit vector normal to the surface and directed outwards
from the source region.

Figure 6-20. Hybrid MoM/FDTD configuration for the single source and scatterer
geometries.

Hif and Eif are the forward scattered fields from the source region A on
the equivalent surface Sc. Jif and Mif are the equivalent electric and magnetic
source currents on the surface Sc. Thus these currents are treated as the
source in the FDTD domain, propagating fields to the scatterer by using the
E and H time domain equations as follows:

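$$\nabla \times \mathbf{E} = -\frac{\partial \mathbf{B}}{\partial t} - \mathbf{M}_{if} \tag{6.62}$$

$$\nabla \times \mathbf{H} = \frac{\partial \mathbf{D}}{\partial t} + \mathbf{J}_{if} \tag{6.63}$$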

The FDTD updating equations for the field components of these
two Maxwell's equations are expanded as shown in Section 4, with a three-
dimensional modified total/scattered FDTD formulation for the special
components on the Huygens surface, while the field components in the rest
of the problem space follow the normal updating equations.

The back-scattered fields are computed by FDTD at Sc- (the closed
surface interior to the surface Sc and bounding the region A). This closed
surface is in the scattered field region, so that the calculated surface currents
are due to the scattered field only. The equivalent surface currents due to these
fields, representing an additional source to the MoM domain (region A), are
given by:

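$$\mathbf{J}_{ib} = \mathbf{H}_{ib} \times \hat{\mathbf{n}} \tag{6.64}$$

$$\mathbf{M}_{ib} = \hat{\mathbf{n}} \times \mathbf{E}_{ib} \tag{6.65}$$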

where Hib and Eib are the back-scattered fields computed at Sc-. Note that
n is here as defined before in Eqs. (6.60)-(6.61), directed outwards from the
source region.

Jib and Mib are the electric and magnetic equivalent surface currents at
Sc-. Now, the back-scattered voltage (the excitation for the MoM) on the
source region can be evaluated using either of the following equations,
defined by the Reciprocity Theorem in the same way as discussed in Section
2.1.4.1:

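$$V_b = \int_{S_a} \mathbf{J}_{ms} \cdot \mathbf{E}_{ib} \, ds_a \tag{6.66}$$

or

$$V_b = \int_{S_c^-} \left( \mathbf{J}_{ib} \cdot \mathbf{E}_{ms} - \mathbf{M}_{ib} \cdot \mathbf{H}_{ms} \right) ds_c^- \tag{6.67}$$

where

$$\mathbf{E}_{ib} = -j\omega \mathbf{A}(\mathbf{r}) - \nabla V(\mathbf{r}) - \frac{1}{\varepsilon} \nabla \times \mathbf{F}(\mathbf{r}), \qquad \mathbf{A}(\mathbf{r}) = \mu \int_{S_c^-} \mathbf{J}_{ib} \, g(\mathbf{r}, \mathbf{r}') \, ds_c^-$$

$$V(\mathbf{r}) = -\frac{1}{j\omega\varepsilon} \int_{S_c^-} \nabla_s \cdot \mathbf{J}_{ib} \, g(\mathbf{r}, \mathbf{r}') \, ds_c^-, \qquad \mathbf{F}(\mathbf{r}) = \varepsilon \int_{S_c^-} \mathbf{M}_{ib} \, g(\mathbf{r}, \mathbf{r}') \, ds_c^-$$

and

$$g(\mathbf{r}, \mathbf{r}') = \frac{e^{-jk|\mathbf{r} - \mathbf{r}'|}}{4\pi |\mathbf{r} - \mathbf{r}'|}$$

is the free space Green's function.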

The vectors r' and r apply to the source and observation points
respectively and Sa is the conducting surface area of the structure within
region A. Jms is the electric test function used on the wire. Ems and Hms are
the electric and magnetic fields respectively for the test function Jms.

Eq. (6.66) requires a double integral to evaluate Eib and integrate over the
surface on the antenna, while Eq. (6.67) will be simpler to implement,
assuming that the cell meshing used in FDTD is very small compared to the
operating wavelength. In this case Eq. (6.67) can be reduced by ignoring the
surface integral and evaluating the voltage back-scattered corresponding to
the centre of the cell surface, as discussed before for Eq. (6.27), by a
summation over grid cell surfaces, to get the following equation for the
hybrid MoM/FDTD case:

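$$V_b = \sum_{k=1}^{n_{sc}} \left( \mathbf{J}_{ibk} \cdot \mathbf{E}_{ms}(\mathbf{r}_k, \mathbf{r}') - \mathbf{M}_{ibk} \cdot \mathbf{H}_{ms}(\mathbf{r}_k, \mathbf{r}') \right) a_k \tag{6.68}$$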

where rk is the position vector of the centre of the kth cell surface, ak is the
surface area of that cell surface and nsc is the total number of cell surfaces
on Sc-. Therefore Jibk and Mibk are considered to
be the equivalent surface currents at the centre of the cell surface k. Since
the excitation voltages are known, the MoM can be executed to compute the
new currents and the procedure can be repeated until the steady state
solution is reached. The implementation of the procedure is illustrated in
detail in the flow chart of Fig. 6-21.
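
The discretised reciprocity sum of Eq. (6.68) can be sketched as below. This is only an illustration under stated assumptions: all quantities are complex phasors sampled at the cell-surface centres, the dot products are unconjugated, and the function name, array layout and two-cell test data are hypothetical.

```python
import numpy as np

def back_scattered_voltage(j_ib, m_ib, e_ms, h_ms, areas):
    """Sum (J_ibk . E_ms - M_ibk . H_ms) * a_k over the cell surfaces.

    j_ib, m_ib, e_ms, h_ms: arrays of shape (n_cells, 3), complex phasors.
    areas: array of shape (n_cells,), surface area of each cell face.
    """
    j_ib = np.asarray(j_ib, dtype=complex)
    m_ib = np.asarray(m_ib, dtype=complex)
    e_ms = np.asarray(e_ms, dtype=complex)
    h_ms = np.asarray(h_ms, dtype=complex)
    areas = np.asarray(areas, dtype=float)
    # Unconjugated dot products, as in the reciprocity (reaction) integral.
    integrand = np.sum(j_ib * e_ms, axis=1) - np.sum(m_ib * h_ms, axis=1)
    return np.sum(integrand * areas)

# Hypothetical two-cell example.
v_b = back_scattered_voltage(
    j_ib=[[1, 0, 0], [0, 2j, 0]],
    m_ib=[[0, 0, 1], [0, 0, 0]],
    e_ms=[[2, 0, 0], [0, 1, 0]],
    h_ms=[[0, 0, 0.5], [0, 0, 0]],
    areas=[0.1, 0.2],
)
print(v_b)
```

One such voltage would be evaluated for each test function on the wire, giving the excitation vector for the next MoM solution.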

6.2 Multiple-Source Scattering Problems

Consider Fig. 6-22 which shows two source regions (1 and 2) and one
scatterer region. The source regions are bounded by the Huygens closed
surfaces Scx (x = 1, 2).

Figure 6-21. Hybrid MoM/FDTD program flow chart.

The same procedure as in the hybrid method presented in Section 6.1
can be used here for two or more source regions instead of one;
hence Eqs. (6.60)-(6.68) can be extended as follows:

$$\mathbf{M}_{ifx} = -\hat{\mathbf{n}}_x \times \mathbf{E}_{ifx}(S_{cx}) \tag{6.69}$$

$$\mathbf{J}_{ifx} = \hat{\mathbf{n}}_x \times \mathbf{H}_{ifx}(S_{cx}) \tag{6.70}$$

where Eifx and Hifx (for x = 1, 2) are the forward scattered electric and
magnetic fields on the Huygens surfaces Scx (x = 1, 2) respectively. Jifx and
Mifx are the equivalent surface currents on these surfaces. nx is the unit
vector directed outwards from the xth closed surface Scx.

Figure 6-22. Hybrid MoM/FDTD configuration for two source regions and a scatterer region.

Thus these currents are treated as the multiple sources in the FDTD
domain, propagating fields to the scatterer by using the E and H time-
domain equations. The scattered regions are considered to be inside each of
the equivalent closed Huygens surfaces shown in Fig. 6-22, whereas the total
fields are considered to be exterior to these surfaces. Other configurations of
the surfaces are possible: those which minimise the size of the Huygens
surface are normally the most efficient. Thus, the time-domain equations on
the surfaces Scx (for x = 1, 2) can be stated as follows:
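
$$\nabla \times \mathbf{E} = -\frac{\partial \mathbf{B}}{\partial t} - \mathbf{M}_{ifx} \tag{6.71}$$

$$\nabla \times \mathbf{H} = \frac{\partial \mathbf{D}}{\partial t} + \mathbf{J}_{ifx} \tag{6.72}$$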

Again, the resulting difference equations for the electric and magnetic
field components, based on Eqs. (6.71)-(6.72), are given in Section 4 by
applying the modified total field/scattered field formulation for each source.
The back-scattered fields for each source region are computed by FDTD at
Scx- (the xth closed surface interior to the surface Scx and bounding the xth
source region). These fields for the xth source region include the effect of the
scatterer and source region y (where y ≠ x) and hence each source region
is treated separately to determine the new current distributions using
MoM, as follows. The xth equivalent surface currents due to these fields, which
represent an additional source to the MoM domain, are given by:


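$$\mathbf{M}_{ibx} = \hat{\mathbf{n}}_x \times \mathbf{E}_{ibx}(S_{cx}^-) \tag{6.73}$$

$$\mathbf{J}_{ibx} = -\hat{\mathbf{n}}_x \times \mathbf{H}_{ibx}(S_{cx}^-) \tag{6.74}$$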

where Eibx and Hibx (for x = 1, 2) are the back-scattered electric and
magnetic fields on the Huygens surfaces Scx- (x = 1, 2) respectively.
Jibx and Mibx are the equivalent surface currents on the Huygens surfaces
Scx- (x = 1, 2) respectively.
Now, the back-scattered voltage (the excitation for the MoM) on the xth
source region can be evaluated using the Reciprocity Theorem:


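$$V_{bx} = \int_{S_{cx}^-} \left( \mathbf{J}_{ibx} \cdot \mathbf{E}_{msx} - \mathbf{M}_{ibx} \cdot \mathbf{H}_{msx} \right) ds_{cx}^- \quad \text{for } x = 1, 2 \tag{6.75}$$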

where Emsx and Hmsx are the electric and magnetic test fields of the xth
source region. If the cell meshing used in the FDTD region is very small
compared to the operating wavelength, Eq. (6.75) can be simplified and
discretised by changing the surface integral to a summation over the cell outer
surfaces and evaluating the back-scattered voltage at the
centre of each cell surface, thus:

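$$V_{bx} = \sum_{k=1}^{n_{scx}} \left( \mathbf{J}_{ibkx} \cdot \mathbf{E}_{msx}(\mathbf{r}_{kx}, \mathbf{r}') - \mathbf{M}_{ibkx} \cdot \mathbf{H}_{msx}(\mathbf{r}_{kx}, \mathbf{r}') \right) a_{kx} \quad \text{for } x = 1, 2 \tag{6.76}$$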


where rkx is the position vector of the centre of the kth cell surface on Scx-,
nscx is the total number of cell surfaces on the xth equivalent surface Scx-
and akx is the surface area of the kth cell on Scx-. Therefore Jibkx and
Mibkx are considered to be the xth equivalent surface currents at the centre of
the cell surface k of the equivalent closed surface Scx-. Since the excitation
voltages of each source region are known, the MoM can be executed to
compute the new currents (for each source region) and the procedure can be
repeated until the steady state solution is reached.

7. NEC/FDTD HYBRID PROGRAM

Another version of the hybrid code, which gives more flexibility for
modelling the antenna geometry, links the FDTD routine with the industry-
standard NEC program [Burke and Poggio, 1981]. The source code of NEC
used in this work was obtained from a public-domain Web site [NEC 2005].
NEC uses different basis and test functions to solve the EFIE from those
used in [Mangoud et al., 2000]. The computation of the back-scattered field
from the scatterer on the source region had to be modified from the previous
version, due to the distinctive type of testing field used by NEC. Thus a new
form of the coupling field equations on the equivalence principle surface
will be clarified here for back-scattered calculations on the source region. It
should also be noted here that the forward field equations on the Huygens
surface, as used to propagate the source fields to the scatterer (using the
FDTD method), are the same as those used before in the MoM [Mangoud et al.,
2000]/FDTD hybrid formulation.
The forward-propagating near fields are computed subject to the antenna
current calculated using the following basis functions on wires (Eq. 6.77)
and patches (Eq. 6.78), respectively:

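$$I_j(s) = A_j + B_j \sin k(s - s_j) + C_j \cos k(s - s_j) \tag{6.77}$$

$$\mathbf{J}_{sj}(\mathbf{r}) = \left( J_{1j} \hat{\mathbf{t}}_{1j} + J_{2j} \hat{\mathbf{t}}_{2j} \right) v_j(\mathbf{r}) \tag{6.78}$$

where

$$|s - s_j| < \Delta_j / 2, \qquad \hat{\mathbf{t}}_{1j} = \hat{\mathbf{t}}_1(\mathbf{r}_j), \qquad \hat{\mathbf{t}}_{2j} = \hat{\mathbf{t}}_2(\mathbf{r}_j)$$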

sj is the value of s (the local axis of the wire segment) at the centre of
segment j and Δj is the length of segment j. Aj, Bj and Cj are unknown
complex constants; two of them are eliminated by the local conditions on the
current, leaving one constant when calculating the back-scattered effects.
The basis function can be extended over one or more segments. Thus, Aj, Bj
and Cj are determined for each segment.
rj is the position of the centre of the patch j. vj(r) = 1 for r on the patch j
and is zero otherwise. J1j and J2j represent the average surface-current
density over the surface patch. t1(rj) and t2(rj) are the unit vectors
representing the surface current distribution components on the patch j.
Firstly, the forward field calculation is undertaken by importing the
coordinates of the Huygens surface to NEC as input. The program then
produces the equivalent surface currents on the surface using the subroutine
NFPAT in the NEC program. Secondly, the back scattered voltage (the
excitation for the MoM) on the source region can be evaluated using Eq.
(6.68), defined before. Ems and Hms are the electric and magnetic fields
respectively for the test current function Jms that can be either an impulse or
a uniform pulse over the segment length. It should be noted that there is no
magnetic test function specified in NEC [NEC 2005].
Ems and Hms are obtained by modifying the NEC source code subroutines
NEFLD and NHFLD, which give the electric and magnetic fields after
finding the basis function solution of the current. The test fields are obtained
by eliminating the field due to the sinusoidal basis functions (with
coefficients Bj and Cj set to zero) and by setting the field due to the constant
part of the current basis function to unity. This modification was confirmed
by a test program which calculated the electric field and magnetic field due
to pulse functions and compared the results. After obtaining the test field,
Eqs. (6.75) and (6.76) are implemented inside the NEC Fortran source code
to get the back-scattered voltage.
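
The relation between the three-term wire basis function and the pulse-type test current can be illustrated with a short sketch. It assumes the constant-plus-sine-plus-cosine form of the NEC wire basis on a single segment; the function name and the numerical values are hypothetical and this is not the NEC Fortran code.

```python
import numpy as np

def wire_basis_current(s, s_j, delta_j, k, a_j, b_j, c_j):
    """Evaluate A + B sin k(s - s_j) + C cos k(s - s_j) on segment j, zero elsewhere."""
    s = np.asarray(s, dtype=float)
    inside = np.abs(s - s_j) < delta_j / 2
    current = a_j + b_j * np.sin(k * (s - s_j)) + c_j * np.cos(k * (s - s_j))
    return np.where(inside, current, 0.0)

k = 2 * np.pi / 0.333                      # wavenumber near 900 MHz (rad/m)
s = np.array([-0.02, 0.0, 0.004, 0.02])    # sample positions along the wire (m)

# Setting B = C = 0 and A = 1 leaves a unit pulse over the segment,
# which corresponds to the test current described in the text.
pulse = wire_basis_current(s, s_j=0.0, delta_j=0.01, k=k, a_j=1.0, b_j=0.0, c_j=0.0)
print(pulse)
```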
Hybridisation with a flexible MoM code like NEC makes the hybrid code
very powerful for modelling a wide range of complex antennas. This is aided
by auxiliary packages that perform the wire gridding process, import NEC
geometry data and write the input files of NEC. On the other hand, using an
alternative MoM program [Mangoud et al., 2000] with, for example, straight
and curved segment basis functions can give advantages over NEC,
especially if there are dielectric parts in the antenna.
In the rest of this chapter the hybrid code using NEC will be referred to as
NEC/FDTD while the hybrid code version that uses the program described
in [Mangoud et al., 2000] will be referred to as MoM/FDTD.

Figure 6-23. Hybrid MoM/FDTD region for far field calculations.


8. FAR FIELD CALCULATIONS USING THE HYBRID CODE

The far field can be computed by applying the frequency-domain near-to-far
field transformation to the FDTD results. An important issue is that there will be
an additional equivalent Huygens virtual surface Sa and this must be placed
within the total field region, not the scattered region, as shown in Fig. 6-23.
The equivalent surface currents on the surface Sa that represent the outward
travelling fields from both source and scatterer, as shown before in Eqs.
(6.60) and (6.61), can be rewritten as:

$$\mathbf{J}_f = \hat{\mathbf{n}} \times \mathbf{H}_f \tag{6.79}$$

$$\mathbf{M}_f = \mathbf{E}_f \times \hat{\mathbf{n}} \tag{6.80}$$

where n is the unit vector normal to the surface and directed outwards from
Sa. Hf and Ef are the total (incident + scattered) fields on the Huygens
surface Sa. Jf and Mf are the equivalent electric and magnetic source currents
on the surface Sa.

9. NUMERICAL EXAMPLES USING THE HYBRID MoM/FDTD TECHNIQUE

Several examples are investigated and discussed below. The results are
compared with reference solutions and recommendations for using the
MoM/FDTD hybrid method are given:

Example 6.3: Free Space Half-Wavelength Dipole Modelling Using Modified Total/Scattered Field Formulation

An initial study was made of a half-wavelength dipole in free space in
order to undertake basic validation of the method. Near field, input
impedance and far field results were computed using the hybrid MoM/FDTD
code and compared with the standard MoM package NEC. The modified
total/scattered field formulation was applied in this example and the far field
and radiated power subroutines were examined and compared with NEC. The
effect of the equivalent virtual surface size also needed to be studied, so
nine different cases were considered with various sizes and locations for the
Huygens surfaces Sc and Sc- located inside the FDTD model and surrounding
the dipole modelled by MoM.
Simulation parameters for this example are shown in Table 6.3 for both
MoM and FDTD parts of the code. The Huygens equivalent surface size will
be represented by the cell number in each direction with respect to the
reference cell point mentioned in Table 6.3. These numbers represent the
start and end of the surface and are given by xmin-xmax, ymin-ymax and
zmin-zmax. Moreover, for the back-scattered surface Sc- inside the virtual
box in the scattered region they are xminb-xmaxb, yminb-ymaxb and zminb-
zmaxb. In this example, the cell numbers for the back-scattered surface are
assigned the following parameters xmin-1, xmax+1, ymin-1, ymax+1, zmin-
1 and zmax+1.

Table 6-3. Input MoM and FDTD parameters for Example 6.3.

FDTD parameters
Formulation                        Modified total/scattered field
Operating frequency                900 MHz
FDTD problem space px, py, pz      29, 29, 91
Total number of FDTD cells         40 × 40 × 102 = 163200
Nlayer                             6
Cell size Δ                        0.0025 m
Time step Δt                       3 ps
Time cycles                        9
Reference cell point ax, ay, az    20, 20, 51
Centre of coordinates              Ez(20, 20, 51)

MoM parameters
Radius of the wire                 0.00125 m
Number of segments                 17
Source segment number              9
Coordinates of end 1 of antenna    (0, 0, -0.0833) m
Coordinates of end 2 of antenna    (0, 0, 0.0833) m
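
The parameters of Table 6-3 can be cross-checked with a few lines of arithmetic. The sketch below is not part of the original work; it assumes free space and uses the standard three-dimensional Courant stability limit for a cubic cell, Δt ≤ Δ/(c√3).

```python
import math

C0 = 299_792_458.0          # speed of light in vacuum (m/s)

freq = 900e6                # operating frequency (Hz)
cell = 0.0025               # FDTD cell size (m)
dt = 3e-12                  # FDTD time step (s)

wavelength = C0 / freq
cells_per_wavelength = wavelength / cell
courant_limit = cell / (C0 * math.sqrt(3.0))   # stability limit for a cubic cell
dipole_length = 2 * 0.0833                     # from the MoM end coordinates (m)
total_cells = 40 * 40 * 102

print(f"wavelength           = {wavelength:.4f} m")
print(f"cells per wavelength = {cells_per_wavelength:.1f}")
print(f"Courant limit        = {courant_limit * 1e12:.2f} ps (time step used: 3 ps)")
print(f"dipole length        = {dipole_length / wavelength:.3f} wavelengths")
print(f"total cells          = {total_cells}")
```

With these values the mesh has well over 100 cells per wavelength, which supports the small-cell assumption behind Eq. (6.68), and the 3 ps time step lies below the stability limit.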

A list of Huygens surface dimensions, starting and ending cell indices in
each direction and number of patches on the surface for each case is shown
in Table 6.4(a). The table is divided into two groups. In the first group (cases
1 to 5), the antenna is exactly in the centre of the box and it is separated
equally from the 4 boundaries of the box in the x and y directions. This
group is used for studying the effect of Huygens surface size change. In the
second group (cases 6 to 9), the box size is the same for cases 6 and 8 and
for cases 7 and 9. However, in this group the size of the box in the z
direction is reduced and the antenna is located one or two cells from one
boundary side to study how close the wire could be located with respect to
the Huygens surface. This group is used for studying the effect of Huygens
surface location change.
Figure 6-24 shows an x-z slice of the (40 × 40 × 102) FDTD problem
space without PML layers. Grid cells in the x direction are shown in detail,
thus horizontal cell numbers start from 7 and end at 34. Selected cells are
presented in the z direction, such as in the region of the upper and lower
boundaries and the vertical centre of the Huygens surface. The centre of the
dipole is illustrated at Ez(20, 20, 51) and the nine different cases for sizes
and locations of Huygens surfaces Sc and Sc- around the dipole are shown in
the figure. In addition, two near field test loci are shown as well.

Table 6-4(a). Dimensions and locations of the nine cases with different Huygens surface
sizes.

Case     Huygens surface size   xmin, xmax   ymin, ymax   zmin, zmax   Number of surface patches on Sc
Case 1   4 × 4 × 74             18, 22       18, 22       14, 88       584
Case 2   8 × 8 × 74             16, 24       16, 24       14, 88       1800
Case 3   12 × 12 × 74           14, 26       14, 26       14, 88       3080
Case 4   16 × 16 × 74           12, 28       12, 28       14, 88       4424
Case 5   20 × 20 × 74           10, 30       10, 30       14, 88       5832
Case 6   8 × 5 × 70             16, 24       16, 21       16, 86       1260
Case 7   8 × 6 × 70             16, 24       16, 22       16, 86       1408
Case 8   8 × 5 × 70             16, 24       19, 24       16, 86       1260
Case 9   8 × 6 × 70             16, 24       18, 24       16, 86       1408
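
The patch counts in Table 6-4(a) can be checked numerically. The sketch below rests on an observation made from the tabulated numbers, not on a statement in the text: each count equals the number of cell faces on a closed box that is two cells smaller than the listed Huygens surface size in every direction. The function name is hypothetical.

```python
def box_face_count(nx, ny, nz):
    """Number of cell faces on the closed surface of an nx x ny x nz box."""
    return 2 * (nx * ny + ny * nz + nz * nx)

# (Huygens surface size, tabulated number of patches) for the nine cases.
cases = {
    1: ((4, 4, 74), 584),
    2: ((8, 8, 74), 1800),
    3: ((12, 12, 74), 3080),
    4: ((16, 16, 74), 4424),
    5: ((20, 20, 74), 5832),
    6: ((8, 5, 70), 1260),
    7: ((8, 6, 70), 1408),
    8: ((8, 5, 70), 1260),
    9: ((8, 6, 70), 1408),
}

computed = {}
for case, ((nx, ny, nz), tabulated) in cases.items():
    computed[case] = box_face_count(nx - 2, ny - 2, nz - 2)
    print(case, computed[case], tabulated)
```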

Table 6-4(b). Results of the nine cases with different Huygens surface sizes.

Case     Input impedance   Vertical locus max near    Vertical far field   Power radiated
         (ohm)             field difference (V/m)     difference (dB)      (watts)
Case 1   85+j47            1.373                      1.33                 .00576
Case 2   89+j48            0.457                      0.71                 .00499
Case 3   90+j49            0.080                      0.46                 .00475
Case 4   91+j50            0.071                      0.35                 .00460
Case 5   92+j50            0.006                      0.34                 .00460
Case 6   82+j46            1.640                      1.16                 .00564
Case 7   89+j47            0.666                      0.77                 .00510
Case 8   82+j46            1.629                      1.15                 .00564
Case 9   89+j47            0.710                      0.77                 .00510
NEC      92+j50            -                          -                    .00415

Theoretically, the antenna can be placed right up against the side of the
Huygens box (within a few cells). However, there are some numerical
mismatches between the propagating wave and the exact currents on the
Huygens box. This effect will cause some of the incident field to leak into
the scattered field region. The cases considered here were chosen in order to
study this leakage and to keep it within a workable limit.

Figure 6-25 clearly illustrates the extinction of the scattered field inside
the Huygens box and serves as a verification that the modified total/scattered
field formulation is working properly. It can be seen that the inner region
(the dark one) is the scattered field region. Since there is not any scattering
object it represents a null field region.

Figure 6-24. Problem space details on xz plane with different configurations of Huygens
surface Sc (Note: the vertical dimensions are not to scale).

Figure 6-25 is divided into 9 subfigures, one each for the nine cases
mentioned before. Both vertical and horizontal slice cuts are taken, the
central planes of the problem space being chosen as these are also the planes
passing through the centre of the dipole Ez(20, 20, 51).

Figure 6-25. Total electric field (dB) distribution in vertical and horizontal slices for the hybrid
model dipole of Example 6.3. The dark areas show the different scattered regions for all
cases.
(Note: the colour palettes vary as they were automatically selected by MATLAB™.)

Sub-figures of Fig. 6-25 show the total electric field distribution in dB and
how the size of Sc can be changed around the wire antenna. The dipole
placed in the centre of the box, as in cases 1-5, gives a symmetrical pattern,
but when the dipole is shifted to one side of the box, as in cases 6-9,
asymmetric contours with respect to the centre cell Ez(20, 20, 51) are given.
In all cases the expected near field surrounding the dipole is observed to be
exactly the same in the total field region outside the box, taking into
consideration that the maximum value of the field is different from one case
to another, hence the amplitude colour contours differ.
Table 6.4(b) shows, for the nine cases, results for input impedance,
maximum near field differences on the vertical observation locus and
differences in vertical far field pattern and radiated power when using the
hybrid code. For cases 1-5, the size of the box is gradually increased: it is
clear that as the size increases the results compare better to those of NEC, as
shown at the end of the table. For cases 7 and 9 the results are more accurate
than those of cases 6 and 8, in which the location of the dipole moves closer
to the boundary.
The input impedance is calculated considering the back-scattered field
effect on the wire, so when the scattered region has null field (ideally) the
input impedance will equal the free-space impedance since there is no
scatterer. Thus this is a very good test for the scattered field region
implementation. Cases 1, 6 and 8 show larger differences than cases
2, 3, 4, 7 and 9, which differ from NEC by 1-3 ohms, while case 5 gives
exactly the same result as NEC.
Regarding near field calculations, in Table 6.4(b) the maximum value of
the field along the vertical test locus shown in Fig. 6-24 is compared with
that of NEC and the differences are shown in the table. The worst case is
a 13.4% difference in case 1, while close agreement is observed
for cases 3, 4 and 5, which have differences of less than 1%. For the far
field pattern, differences between 0.34 dB (case 5) and 1.33 dB (case 1) are
observed for the azimuth pattern. The radiated power calculated by
integrating the normal component of the Poynting vector differs slightly from that of
the feeding point source calculated by NEC; again, cases 3, 4 and 5 give the
best results.
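
The comparison with NEC can be reproduced directly from the entries of Table 6-4(b). The short sketch below is not from the original work; it simply tabulates the impedance difference and the relative radiated-power difference for each case, using the tabulated values as input.

```python
# Reference values from the NEC row of Table 6-4(b).
nec_impedance = complex(92, 50)     # ohm
nec_power = 0.00415                 # W

# (input impedance in ohm, radiated power in W) for the nine hybrid cases.
hybrid_results = {
    1: (complex(85, 47), 0.00576),
    2: (complex(89, 48), 0.00499),
    3: (complex(90, 49), 0.00475),
    4: (complex(91, 50), 0.00460),
    5: (complex(92, 50), 0.00460),
    6: (complex(82, 46), 0.00564),
    7: (complex(89, 47), 0.00510),
    8: (complex(82, 46), 0.00564),
    9: (complex(89, 47), 0.00510),
}

differences = {}
for case, (impedance, power) in hybrid_results.items():
    delta_z = impedance - nec_impedance
    power_pct = 100.0 * (power - nec_power) / nec_power
    differences[case] = (delta_z, power_pct)
    print(f"case {case}: dR = {delta_z.real:+.0f} ohm, dX = {delta_z.imag:+.0f} ohm, "
          f"power difference = {power_pct:+.1f} %")
```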

Example 6.4: Dipole Adjacent to Perfectly Conducting Plate

An antenna adjacent to a perfectly conducting plate was chosen as a
further verification example for the developed hybrid code. The antenna was
modelled using both the custom MoM program [Mangoud et al., 2000] and
NEC: it was a half-wavelength dipole directed along the z-axis. The
conducting plate used had height = 3λ/2 and width = λ, modelled by FDTD
exactly as for the verification example of the time-domain MoM/FDTD
technique used in [Cerri et al., 1998]. A conducting plate is a convenient
scattering object to model with FDTD since it can be represented relatively exactly
in the FDTD grid, without the staircase errors that would affect a curved
surface.
Table 6-5(a). Input MoM and FDTD parameters of Example 6.4 (run 1).

FDTD parameters
Formulation                     Modified total/scattered field
Operating frequency             961 MHz
px, py, pz                      71, 40, 96
mx, my, mz                      83, 52, 108
Total number of FDTD cells      82 × 51 × 107
Nlayer                          6
Cell size Δ                     0.00624 m = 0.02λ
Time step Δt                    5 ps
Time cycles                     15
(Dipole centre) ax, ay, az      42, 15, 54
xmin, xmax                      32, 52
ymin, ymax                      12, 18
zmin, zmax                      24, 84
Huygens surface size (Sc)       20 × 6 × 60 (fixed)
Plate centre                    42, 15+Dc, 54
pxmin, pxmax                    17, 67
pzmin, pzmax                    17, 92

MoM parameters
Radius of the wire              0.002 m
Number of segments              17
Source segment number           9
Coordinates of end 1 of antenna (0, 0, -0.078) m
Coordinates of end 2 of antenna (0, 0, 0.078) m

Notes: in this case, the Huygens surface was fixed while the plate location varied inside the
FDTD space.
Dc is the variable separation distance between the antenna and the plate (in number of
FDTD cells).
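
The geometry in Table 6-5(a) can be checked against the stated plate dimensions (height 3λ/2, width λ) with the following sketch. It is an illustrative consistency check only, assuming free-space propagation; the variable names are hypothetical.

```python
C0 = 299_792_458.0                 # speed of light in vacuum (m/s)

freq = 961e6                       # operating frequency (Hz)
cell = 0.00624                     # FDTD cell size (m)
wavelength = C0 / freq

cell_in_wavelengths = cell / wavelength
plate_width = (67 - 17) * cell / wavelength      # pxmin, pxmax
plate_height = (92 - 17) * cell / wavelength     # pzmin, pzmax
huygens_size = (52 - 32, 18 - 12, 84 - 24)       # from xmin..zmax
dipole_length = 2 * 0.078 / wavelength           # from the MoM end coordinates

print(f"cell size     = {cell_in_wavelengths:.4f} wavelengths")
print(f"plate width   = {plate_width:.3f} wavelengths")
print(f"plate height  = {plate_height:.3f} wavelengths")
print(f"Huygens size  = {huygens_size} cells")
print(f"dipole length = {dipole_length:.3f} wavelengths")
```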

Table 6-5(b). Input FDTD parameters of Example 6.4 (run 2).

FDTD parameters
Formulation                     Modified total/scattered field
ax, ay, az                      37, 22-37*, 49
xmin, xmax                      31, 43
ymin, ymax                      13-28*, 31-46*
zmin, zmax                      32, 68
Huygens surface size (Sc)       12 × 18 × 36
Plate centre                    37, 12, 49 (fixed)
pxmin, pxmax                    12, 62
pzmin, pzmax                    12, 87

Notes: in this case, the plate model was fixed while the Huygens surface location varied
inside the FDTD space.
(*) indicates variable parameters according to the variation of the antenna and the plate
separation distance.

Table 6-5(c). Input FDTD parameters of Example 6.4 (run 3).

FDTD parameters
Formulation                     Total/scattered field
ax, ay, az                      42, 17, 54
xmin, xmax                      12, 72
ymin, ymax                      12, 22
zmin, zmax                      12, 97
Huygens surface size (Sc)       50 × 10 × 85
Plate centre                    42, 17, 54
pxmin, pxmax                    17, 67
pzmin, pzmax                    17, 92

Note: in this case, the plate and Huygens surface were fixed inside the FDTD space and the
dipole location varied in the MoM model.

The separation distance Dc between the antenna and the plate was varied
and the input impedance of the dipole was the target of comparisons: it was
compared between both hybrid code versions, MoM [Mangoud et al.,
2000]/FDTD and NEC/FDTD, and with pure NEC. The details are
summarised in Table 6.5(a-c).
It may be noted from the tables that for run 1 the modified total/scattered
field was used with a Huygens surface of size (20 × 6 × 60) that replaces
the dipole in the FDTD grid. This was fixed at the location shown in the table
and the plate was made movable in the problem space. For run 2 the
modified total/scattered field was also used, but in this case the plate was
fixed and the (12 × 18 × 36) Huygens surface was made movable. For run 3
the total/scattered field was used with a Huygens surface of size
(50 × 10 × 85) surrounding the plate.

Figs. 6-26 and 6-27 illustrate contours of electric field distribution for the
central y-z and x-y planes for Run 1. Both the surface of the plate and the
Huygens surface can be distinguished from the figures. It can be seen that
the plate reflects the incident signal from the dipole and that the inner
(scattered) region of the Huygens surface contains non-zero values for the
back-scattered field.
Fig. 6-28 shows the input impedance of the dipole adjacent to the plate
versus separation distance in wavelengths for both versions of hybrid code,
compared with NEC, for the Run 2 case. Very good agreement is observed
between the two versions of the hybrid code and the NEC model for the
same problem. This indicates that the back-scattered voltage has been
implemented correctly according to Eq. 6.68.

Figure 6-26. Contours of electric field distribution for central y-z plane (in dB) with run 1.

Figure 6-27. Contours of electric field distribution for central x-y plane (in dB) with run 1.

[Plot: input impedance in ohm (0 to 160) versus separation distance in wavelengths (0.2 to 0.5); curves show resistance and reactance for MoM/FDTD, NEC/FDTD and NEC.]

Figure 6-28. Input impedance of the dipole adjacent to a plate versus separation distance in
wavelengths for both versions of hybrid code, compared with NEC.

The number of iterations required for this method to converge is
illustrated in Figs. 6-29 and 6-30, which give the input resistance and
reactance respectively, versus number of iterations, for 0.2λ separation
between dipole and plate for the three runs of Table 6.5. Here it should be
noted that run 3, which uses the total/scattered field technique, gives
results closer to those of NEC, with the disadvantage of greater Huygens
surface size and more computational time in the MoM than for the other two
runs. Runs 1 and 2 nevertheless give results with acceptably small
differences.
It is clear that all runs become stable after four iterations. At this point
there is 31% difference from the first iteration for the resistance and 2% for
the reactance. This is a remarkably low number of iterations: for a highly
resonant structure it could be higher, while for calculations of near field and
SAR inside a dielectric it could be lower than four, as it
depends on the percentage of back-scattered field with respect to the forward
field. Moreover, there is an inverse relationship between the separation
distance and the number of iterations, and the damping effect in lossy
dielectric will accelerate convergence of the results.
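
A stopping rule for the iteration can be sketched as below. The text does not state the criterion used by the authors, so the relative-change test, the 1% tolerance and the impedance history are all assumptions chosen for illustration.

```python
def iterations_to_converge(impedances, rel_tol=0.01):
    """Return the 1-based iteration at which the input impedance first changes
    by less than rel_tol (relative) from the previous iteration, else None."""
    for i in range(1, len(impedances)):
        previous, current = impedances[i - 1], impedances[i]
        if abs(current - previous) <= rel_tol * abs(current):
            return i + 1
    return None

# Hypothetical impedance history (ohm) over successive MoM/FDTD iterations.
history = [
    complex(73.0, 42.0),
    complex(98.0, 44.0),
    complex(95.5, 43.0),
    complex(95.8, 42.9),
    complex(95.8, 42.9),
]

n_iter = iterations_to_converge(history)
print(n_iter)
```

Monitoring the input impedance is a natural choice here because it is the quantity plotted against iteration number in Figs. 6-29 and 6-30.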

Figure 6-29. Input resistance versus number of iterations for 0.2λ separation between dipole
and plate for the three runs of Example 6.4.

Figure 6-30. Input reactance versus number of iterations for 0.2λ separation between dipole
and plate for the three runs of Example 6.4.

Example 6.5: Interaction of a Dipole and a Sphere for Comparisons with FDTD

Numerical simulations of canonical problems were undertaken to test the
hybrid technique and allow comparison of the results with pure FDTD
simulations. A 900 MHz half-wavelength dipole was considered at 5
different distances, with 1 cm steps, from a 20 cm diameter sphere of
biological material. Two separate simulations were undertaken for each case,
one with the hybrid MoM [Mangoud et al., 2000]/FDTD and one with pure
FDTD: the parameters used are summarised in Table 6.6. All parameters are
the same for both simulations except the dipole model, which is represented
by a thin wire subroutine in FDTD and replaced by the equivalent Huygens
surface in the hybrid model.

Table 6-6. Input MoM and FDTD parameters.

FDTD parameters
Formulation                                       Modified total/scattered field
Operating frequency (MHz)                         900
Total number of FDTD cells                        138 × 102 × 102
nlayer, Δ (mm), Δt (ps), number of time cycles    6, 2.5, 3, 25
Huygens surface size (Sc)                         10 × 10 × 80
ax, ay, az                                        18, 51, 51
Biological sphere material properties             σ = 0.7 S/m, ε' = 39.5, ε'' = 14.0, ρ = 1100 kg/m3
Sphere centre cell # = cxx (for 5 cases),         18 + 40 + (4 or 8 or 12 or 16 or 20), 52, 52
yxx, zxx
FDTD dipole number of cells                       67

MoM parameters
Radius of the wire                                .001
Number of segments                                17
Source segment number                             9
Coordinates of the antenna end 1                  (0, 0, -0.833)
Coordinates of the antenna end 2                  (0, 0, 0.833)

In Figs. 6-31 (a), (b), (c) and (d), SAR distributions in a two dimensional
horizontal cut (x-y plane) with sphere-dipole distances of 1, 2, 3 and 4 cm
are shown. These were computed with the hybrid
method: the input voltage was 1 volt for all cases.
It should be noted that the x-axis represents the number of cells in the x
direction of the problem space, ignoring the PML layers and one further non-
PML layer. Thus the cell number on the x-axis starts from cell number
(nlayer+2) and ends at cell number (mxm1-nlayer-2) where mxm1 and nlayer
are number of cells in x direction and number of PML layers respectively.
Thus, the x-axis scale in Fig. 6-31 starts with cell number 1. Because of the
PML, this represents cell number 8 in the complete simulation problem space.
The x-axis scale ends at cell number 122, which represents the last cell before
the PML layer on the right hand side of the problem space. As seen, in these
simulations the dipole's position is fixed and the sphere is moved for each
separation distance run. The values of ax (the centre of the dipole) and cxx
(the centre of the sphere) are indicated in the captions for each case.

Figure 6-31. SAR (W/kg) distribution in horizontal axial slice of a simulated biological
sphere for a dipole (with input voltage of 1 volt for all cases) at a distance of (a) 1cm (b) 2cm
(c) 3cm (d) 4cm (e) 5cm with hybrid method and f) 5cm with pure FDTD. (Frequency = 900
MHz; scale in dB).

It can be seen that the absorbed power distribution in the spherical head
models is strongly inhomogeneous with a range of variation of about 50 dB.
The region with high absorption values in all head models is small and close
to the feed point of the dipole. It is also observed that the distribution
changes as the dipole-to-sphere separation distance changes and a noticeable
standing wave is created for cases when the separation equals 3 cm or more,
due to strong reflection from the dielectric discontinuity at the far side of the
sphere.
Figures 6-31 (e) and (f) show a comparison between the hybrid and pure
FDTD techniques for a distance of 5 cm separating the dipole from the
homogeneous sphere. These distributions confirm that the hybrid
MoM/FDTD method is well implemented and it compares very well with
pure FDTD simulations for SAR calculations inside dielectric materials in
this canonical case.

Figure 6-32(a). Peak SAR and maximum averaged SAR over 10 g versus separation distance
between the dipole and sphere. (The antenna input power is 1W).

Figures 6-32 (a), (b) and (c) show other parameters from the simulation
results, comparing the hybrid and the pure FDTD techniques. Fig. 6-32 (a)
shows peak (unaveraged) SAR and maximum averaged SAR over 10 g (both
normalised for 1W total radiated power): the SAR values from the two
methods are very similar to each other. The normalisation was performed by
dividing the peak and averaged SAR values by the total radiated power.
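
The normalisation step can be sketched as follows. This is only an illustration of the division described above: the function name and the numerical values are hypothetical and are not taken from the simulations.

```python
def normalise_sar(sar_w_per_kg, radiated_power_w, reference_power_w=1.0):
    """Scale a SAR value to a reference radiated power (1 W by default).

    SAR is proportional to power, so the scaling is a simple ratio.
    """
    if radiated_power_w <= 0.0:
        raise ValueError("radiated power must be positive")
    return sar_w_per_kg * reference_power_w / radiated_power_w


# Hypothetical example: 0.02 W/kg computed while the dipole radiates 4 mW
peak_sar_1w = normalise_sar(0.02, 0.004)
print(f"Peak SAR normalised to 1 W radiated power: {peak_sar_1w:.2f} W/kg")
```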

Figure 6-32(b). The input impedance of the dipole/sphere interaction using both hybrid and
FDTD methods.

Figure 6-32(c). Absorbed power versus separation distance between the dipole and sphere
when the antenna input power is 1W.

In Fig. 6-32 (b) it can be seen that the input impedance shows similar
values for the two methods, with some differences arising from the different
ways in which the current is calculated. The current is likely to be more
accurate in the MoM than in the FDTD technique, although the variations could
also be due to different effective radii of the dipole geometry. Figure 6-32 (c)
shows the absorbed power: the maximum difference between the two methods is
about 6%.

10. SUMMARY

This chapter has discussed the details of the implementation and the basic
validation of the hybrid treatment of the electromagnetic behaviour of
coupled multiple regions using the heterogeneous MoM/FDTD computational
electromagnetics technique. Two different techniques, based on total-field and
scattered-field formulations, have been used and Huygens surface design
considerations studied. The method has been shown to give stable and accurate
results: the results of the test cases were in excellent agreement with
predictions from well-established programs, published results and physical
expectations. The number of iterations required to account for the multiple
reactions between regions was investigated: rapid convergence was found for
structures consisting of two or more regions. The method is particularly
useful for analysing complex problems involving coupling between antennas
and dielectric volumes, especially biological tissue, because it permits the
computationally efficient FDTD method to be used for the dielectric and the
Method of Moments, which represents conducting structures more accurately,
to be used for the antenna. It can thus be concluded that the
hybrid MoM/FDTD technique will be a very good basis for the intensive
simulations needed in many modern applications, particularly those involving
regions of biological tissue. The hybrid treatment of the electromagnetic
behaviour of coupled multiple regions using a heterogeneous MoM/MoM
computational electromagnetics technique also gave stable and accurate
results, and the theory of the same technique was investigated and presented
for the MoM/FDTD hybrid technique.
The hybrid method will have particular advantages over alternative
methods in the following electromagnetics applications: cellular telephone
dosimetry; investigations close to real biological tissues; complex satellite-
mobile antennas; SAR reduction using phased array antennas; base station
safety assessment; medical radiofrequency/microwave therapy equipment;
subsurface radar, and any other applications involving interaction between
complex source structures and inhomogeneous dielectric materials.

References
Abd-Alhameed, R. A. et al., 1998, Procedure for Analysis of Microstrip Patch Antennas
Using the Method of Moments, IEE Proceedings on Microwaves, Antennas and
Propagation, vol. 145, No. 6, pp. 455-459, 1998.
Abd-Alhameed, R. A. et al., 1999, Computation of radiated and scattered field using separate
frequency domain moment-method regions and frequency domain MOM-FDTD hybrid
methods, IEE National Conference on Antennas and Propagation, pp. 53-56, 1999.
Abd-Alhameed, R. A. et al., 2005, Broadband antenna response using hybrid technique
combining frequency domain MoM and FDTD, ACES Journal, vol. 20:1, pp. 70-77,
2005.
Abd-Alhameed, R. A. and Excell, P. S., 1996, Analysis of Dielectrically-Loaded Wire, Strip
and Patch Antennas Using the Method of Moments, IEE Conf. Pub. No. 420,
Computation in Electromagnetics, Bath, April 1996, pp. 306-311.
Abd-Alhameed, R. A. and Excell, P. S., 1999, Analysis of a normal-mode helical antenna
including non-uniform wire surface current effects, IEE Proc. Microwaves, Antennas and
Propagation, vol. 146, no. 1, pp. 1-5, 1999.
Akyurtlu, A. et al., 1999, Staircasing errors in FDTD at an air-dielectric interface, IEEE
Microwave and Guided Wave Letters, vol. 9, pp. 444-446, 1999.
Ali, M. W. et al., 1997, A hybrid FEM/MOM technique for electromagnetic scattering and
radiation from dielectric objects with attached wires, IEEE Transactions on
Electromagnetic Compatibility, vol. 39, pp. 1327-1333, 1997.
Aoyagi, P. H. et al., 1993, A hybrid Yee algorithm/scalar-wave equation approach, IEEE
Transactions on Microwave Theory and Techniques, vol. 41, pp. 1593-1600, 1993.
Arndt, F. et al., 2004, Fast CAD and optimization of waveguide components and aperture
antenna by hybrid MM/FE/MoM/FD methods: state-of-the-art and recent advances, IEEE
Transactions on Microwave Theory and Techniques, vol. 52, pp. 292-305, 2004.
Boyse, W. E. and Seidl, A. A., 1991, A hybrid finite element method for near bodies of
revolution, IEEE Transactions on Magnetics, vol. 27, pp. 3833-3836, 1991.
Bretones, A. R. et al., 1998, A new hybrid method combining the method of moments in the
time domain and FDTD, IEEE Microwave and Guided Wave Letters, vol. 8, pp. 281-283,
1998.
Bretones, A. R. et al., 1999, Hybrid NEC/FDTD approach for analysing electrically short
thin-wire antennas located in proximity of inhomogeneous scatterers, Electronics Letters,
vol. 35, pp. 1594-1596, 1999.
Bretones, A. R. et al., 2000, Hybrid technique combining finite element, finite difference and
integral equation methods in the time domain, Electronics Letters, vol. 36, pp. 506-508,
2000.
Bridges, G. E., 1995, Transient plane wave coupling to bare and insulated cables buried in a
lossy half-space, IEEE Transactions on Electromagnetic Compatibility, vol. 37,
pp. 62-70, 1995.
Bruno, S., 2001, A Hybrid Finite Element and Integral Equation Domain Decomposition
Method for the Solution of the 3D-Scattering Problem, Journal of Computational
Physics, vol. 172, pp. 451-471, 2001.
Burke, G. J. and Poggio, A. J., 1981, Numerical Electromagnetics Code (NEC): Method of
Moments, US Naval Ocean Systems Centre, Rep. no. TD116, 1981.
Cangellaris, A. C. et al., 1993, A hybrid spectral/FDTD method for the electromagnetic
analysis of guided waves in periodic structures, IEEE Microwave and Guided Wave Letters,
vol. 3, pp. 375-377, 1993.
Cerri, G. et al., 1998, MoM-FDTD hybrid technique for analysing scattering problems,
Electronics Letters, vol. 34, pp. 433-440, 1998.
Chen, J. et al., 1998, Numerical Simulation of SAR and B1-field inhomogeneity of shielded
RF coils loaded with human head, IEEE Transactions on Biomedical Engineering, vol.
45, pp. 650-659, 1998.
Coffey, E. L., 1993, Recent Enhancements to GEMACS 5.2, Ninth annual review of
progress in Applied Computational Electromagnetics, Monterey, pp. 894-900, 1993.
Coffey, E. L. and Kadlec, D. L., 1990, General Electromagnetic Model for the Analysis of
Complex Systems (GEMACS) version 5.0, Advanced Electromagnetics Corporation for
USAF Rome Air Development Center (USA), Report No. RADC-TR-90-360, vol. 1-3,
1990.
Colburn, J. S. et al., 1995, A Comparison of MoM and FDTD for Radiation and Scattering
involving Dielectric objects, IEEE Antennas and Propagation Society International
Symposium, vol. 1, pp. 644-647, 1995.
D'Ambrosio, G. and Migliore, M. D., 1994, The grounded dielectric layer fed by a current
line as a planar microwave applicator, IEEE Transactions on Antennas and
Propagation, vol. 42, pp. 1467-1475, 1994.
Demarest, K. et al., 1996, Hybrid numerical techniques for modeling ground penetrating
radar antennas, in Proc. USNC/URSI Meeting Baltimore, MD, pp. 260, 1996.
Djordjevic, M. and Notaros, B. M., 2005, Higher order hybrid method of moments-physical
optics modeling technique for radiation and scattering from large perfectly conducting
surfaces, IEEE Transactions on Antennas and Propagation, vol. 53, pp. 800-813, 2005.
Edelvik, F. and Ledfelt, G., 2000, Explicit hybrid time domain solver for the Maxwell
equation in 3D, J. Sci. Comput., vol. 15, pp. 61-78, 2000.
Fear, E. C. et al., 2002, Enhancing breast tumour detection with near-field imaging, IEEE
Microwave Magazine, pp. 48-56, 2002.
Fierriers, X. et al., 2004, Application of Hybrid Finite Difference/Finite Volume Method to
Solve an Automotive EMC Problem, IEEE Transactions on Electromagnetic
Compatibility, vol. 4, pp. 624-634, 2004.
Forgy, E. A. et al., 1998, A Hybrid MoM/FDTD Technique for studying Human
Head/Antenna Interactions, IEEE Antenna and Propagation Conference, Boston, pp. 81-
84, 1998.
Han, D. H. et al., 2000a, Finite-element based iterative hybrid techniques for the solution of
electrically large radiation problems, IEEE Antenna and Propagation Society
International Symposium, vol. 4, pp. 2320-2323, 2000.
Han, D. H. et al., 2000b, Analysis of reflector antennas including higher-order interactions,
Radio and Wireless Conference, RAWCON 2000, IEEE, pp. 131-133, 2000.
Han, D. H. et al., 2000c, FEM-based hybrid methods for the analysis of antennas on
electrically large structures, IEEE Radio and Wireless Conference, RAWCON 2000, pp.
59-61, 2000.
Han, D. H. et al., 2002, Hybrid analysis of reflector antennas including higher order
interactions and blockage effects, IEEE Transactions on Antennas and Propagation,
vol. 50, pp. 1514-1524, 2002.
Harrington, R. F., 1968, Field Computation by Moment Methods, New York: The Macmillan
Co., 1968.
Huang, Z. et al., 1999, An FDTD/MoM hybrid technique for modeling complex antenna in
the presence of heterogeneous grounds, IEEE Transactions on Geoscience and Remote
Sensing, vol. 37, pp. 2692-2698, 1999.
Jakobus, U. and Landstorfer, F. M., 1995, Improvement of the PO-MoM hybrid method by
accounting for effects of perfectly conducting wedges, IEEE Transactions on Antennas
and Propagation, vol. 43, pp. 1123-1129, 1995.
Jin, J., 2002, The Finite Element Method in Electromagnetics, 2nd ed. New York: John Wiley
& Sons, Inc., 2002.
Kim, J. P. et al., 1999, Analysis of corrugated surface wave antenna using hybrid
MOM/UTD technique, Electronics Letters, vol. 35, pp. 353-354, 1999.
Kuster, N. et al., 1997, Mobile Communication Safety, London: Chapman and Hall, first
edition, 1997.
Lail, B. A. and Castillo, S. P., 2000, Coupling through narrow slot apertures to thin-wire
structures, IEEE Transactions on Electromagnetic Compatibility, vol. 42, pp. 276-283,
2000.
Lautru, D. et al., 2000, A MoMTD/FDTD hybrid method to calculate the SAR induced by a
base station antenna, IEEE Antennas and Propagation Society International Symposium,
vol. 2, pp. 757-760, 2000.
Lee, R. and Chia, T., 1993, Analysis of electromagnetic scattering from a cavity with a
complex termination by means of a hybrid-ray FDTD method, IEEE Transactions on
Antennas and Propagation, vol. 41, pp. 1560-1569, 1993.
Lopez, M. A. H. et al., 2001, A resistively loaded thin-wire antenna for mine detection,
Subsurface Sensing Technologies and Applications, vol. 2, pp. 265-271, 2001.
Lynch, D. R. et al., 1985, Finite element solution of Maxwell's equation for hyperthermia
treatment planning, Journal of Computational Physics, vol. 58, pp. 246-269, 1985.
Lynch, D. R. et al., 1986, Hybrid element method for unbounded electromagnetic problems
in hyperthermia, International Journal of Numerical Methods in Engineering, vol. 23,
pp. 1915-1937, 1986.
Lysiak, K. A. et al., 1996, A Hybrid MoM/FDTD Approach to UHF/VHF Propagation
Problems, IEEE Antennas and Propagation Society International Symposium, Baltimore
MD, pp. 358-361, 1996.
Mangoud, M. A. et al., 2000, Simulation of human interaction with mobile telephones using
hybrid techniques over coupled domains, IEEE Transactions on Microwave Theory and
Techniques, vol. 48, pp. 2014-2021, 2000.
Merewether, D. E. et al., 1980, On implementing a numeric Huygens source scheme in a
finite difference program to illuminate scattering bodies, IEEE Trans. on Nuclear
Science, vol. NS-27, no. 6, pp. 1829-1833, 1980.
Mochizuki, S. et al., 2003, Novel iteration procedures of a hybrid method combining MoM
and scattered-field FDTD method for electromagnetic dosimetry, IEEE Topical
Conference on Communication Technology, pp. 200-201, 2003.
Monk, A. D. et al., 1994, A Comparison of FDTD and Method of Moments to Model
Electrically Small Antennas, IEEE Antennas and Propagation Society International
Symposium, vol. 1, pp. 565-568, 1994.
Monorchio, A. and Mittra, R., 1998, A hybrid finite-element/finite-difference time-domain
(FE/FDTD) technique for solving complex electromagnetic problems, IEEE Microwave
and Guided Wave Letters, vol. 8, pp. 93-95, 1998.
Monorchio, A. et al., 2004, A Hybrid Time-Domain Technique that Combines the Finite
Element, Finite Difference and Method of Moment Techniques to Solve Complex
Electromagnetic Problems, IEEE Transactions on Antennas and Propagation, vol. 52,
pp. 2666-2674, 2004.
Morgan, M. A. and Welch, B. E., 1986, The field feedback formulation for electromagnetic
scattering computations, IEEE Transactions on Antennas and Propagation, vol. 34, pp.
1377-1382, 1986.
Morgan, M. A. et al., 1984, Finite element-boundary integral formulation for electromag-
netic scattering, Wave Motion, vol. 6, pp. 91-103, 1984.
Mrozowski, M., 1994, A hybrid PEE-FDTD algorithm for accelerated time domain analysis
of electromagnetic waves in shielded structures, IEEE Microwave and Guided Wave Letters,
vol. 4, pp. 323-325, 1994.
Mur, G., 1981, Absorbing boundary conditions for the finite-difference approximation of the
time domain electromagnetic field equation, IEEE Transactions on Electromagnetic
Compatibility, vol. 23, pp. 377-382, 1981.
Nath, S. et al., 1993, Three dimensional hybrid finite boundary element model for eddy
current NDE, IEEE Transactions on Magnetics, vol. 29, pp. 1853-1856, 1993.
NEC 2005, Programs database: http://www.funet.fi/pub/ham/antenna/NEC.
Okoniewski, M. et al., 1997, Three-dimensional Subgridding Algorithm for FDTD, IEEE
Trans. Antennas and Propagation, vol. 45, no. 3, pp. 422-429, 1997.
Orfanidis, A. P. et al., 2000, A mode-matching technique for the study of circular and coaxial
waveguide discontinuities based on closed-form coupling integrals, IEEE Transactions
on Microwave Theory and Techniques, vol. 48, pp. 880-883, 2000.
Pantoja, M. F. et al., 2002, Design of an ultra-broadband V-antenna for microwave detection
of breast tumors, Microwave Optical Technology Letters, vol. 34, pp. 164-166, 2002.
Paulsen, K. D. et al., 1988, Three-dimensional finite, boundary, and hybrid elements
solutions of the Maxwell equations for lossy dielectric media, IEEE Transactions on
Microwave Theory and Techniques, vol. 36, pp. 682-693, 1988.
Reddy, C. J. et al., 1996, Radiation characteristics of cavity backed aperture antennas in
finite ground plane using the hybrid FEM/MoM technique and geometrical theory of
diffraction, IEEE Transactions on Antennas and Propagation, vol. 44, pp. 1327-1333,
1996.
Richmond, J. H., 1974, Radiation and scattering by thin-wire structures in the complex
frequency domain, NASA Rept. No. CR-2396, 1974.
Salon, S. J. and Angelo, J. D., 1988, Applications of the hybrid finite element-boundary
element method in electromagnetics, IEEE Transactions on Magnetics, vol. 24,
pp. 80-85, 1988.
Schelkunoff, S. A., 1951, Field Equivalence Theorems, Comm. Pure Appl. Math., vol. 4,
pp. 43-59, 1951.
Sheng, X. Q. et al., 1998, On the Formulation of Hybrid Finite-Element and Boundary-
Integrals method for 3-D Scattering, IEEE Transactions on Antennas and Propagation,
vol. 46, pp. 303-311, 1998.
Silvestro, J., 1992, Scattering from slot near conducting wedge using hybrid method of
moments/geometrical theory of diffraction: TE case, Electronics Letters, vol. 28, pp.
1055-1057, 1992.
Soudais, P., 1995, Computation of the electromagnetic scattering from complex 3D objects
by a hybrid FEM/BEM method, Journal of Electromagnetic Waves and Applications, vol.
9, pp. 871-886, 1995.
Stupfel, B. and Despres, B., 1999, A domain decomposition method for the solution of large
electromagnetic scattering problems, Journal of Electromagnetic Waves and
Applications, vol. 13, pp. 1553, 1999.
Stupfel, B. et al., 1991, Combined Boundary-element and finite-element method for the
scattering problem by axisymmetrical penetrable objects, in Proceedings of the
International Symposium on Mathematical and Numerical Aspects of Wave Propagation
Phenomena (SIAM, Philadelphia), pp. 332, 1991.
Taflove, A., 1995, Computational Electrodynamics: The Finite Difference Time Domain
Method, Dedham, MA: Artech House, 1995.
Taflove, A. and Umashankar, K. R., 1982a, A hybrid moment method/finite difference time-
domain approach to electromagnetic coupling and aperture penetration into complex
geometries, IEEE Transactions on Antennas and Propagation, vol. AP-30, pp. 617-627,
1982.
Taflove, A. and Umashankar, K. R., 1982b, A novel method to analyze electromagnetic
scattering of complex objects, IEEE Transactions on Electromagnetic Compatibility, vol.
24, pp. 397-405, 1982.
Thiagarajan, V. and Hsieh, K. T., 2005, Investigation of a 3-D hybrid finite-
element/boundary-element method for electromagnetic launch applications and validation
using semianalytical solutions, Symposium on Electromagnetic Launch Technology,
pp. 375-380, 2005.
Tinniswood, A. D., 1996, Time Domain Integral Equations, PhD Dissertation, University
of York, 1996.
Trlep, M. et al., 2003, The FEM-BEM Analysis of Complex Grounding Systems, IEEE
Transactions on Magnetics, vol. 39, pp. 1055-1058, 2003.
Umashankar, K. R. et al., 1987, Calculation and experimental validation of induced currents
on coupled wires in an arbitrary shape cavity, IEEE Transactions on Antennas and
Propagation, vol. AP-35, pp. 1248-1257, 1987.
Wang, Y. et al., 2002, An FDTD/ray-tracing analysis method for wave penetration through
inhomogeneous walls, IEEE Transactions on Antennas and Propagation, vol. 50, pp.
1598-1604, 2002.
Yang, M. et al., 2000, Hybrid finite-difference/finite-volume time-domain analysis for
microwave integrated circuits with curved PEC surfaces using a nonuniform rectangular
grid, IEEE Transactions on Microwave Theory and Techniques, vol. 48, pp. 969-975,
2000.
Yee, K. S., 1966, Numerical solution of initial boundary value problems involving Maxwell's
equations, IEEE Transactions on Antennas and Propagation, vol. 14, no. 3, pp. 302-307, 1966.
Yee, K. S. and Chen, J. S., 1997, The Finite-Difference Time-Domain (FDTD) and the
Finite-Volume Time-Domain (FVTD) Methods in Solving Maxwell's Equations, IEEE
Transactions on Antennas and Propagation, vol. 45, pp. 354-363, 1997.
Yuan, X., 1990, Three-dimensional electromagnetic scattering from inhomogeneous objects
by the hybrid moment and finite element method, IEEE Transactions on Microwave
Theory and Techniques, vol. 38, pp. 1053-1058, 1990.
Yuan, X. et al., 1990, Coupling of finite element and moment methods for electromagnetic
scattering from inhomogeneous object, IEEE Transactions on Microwave Theory and
Techniques, vol. 38, pp. 386-393, 1990.
Zielinski, A. P. and Zienkiewicz, O. C., 1985, Generalized finite element analysis with
T-complete boundary solution functions, International Journal of Numerical Methods
in Engineering, vol. 21, pp. 509-528, 1985.
Chapter 7
ENHANCED EM SOFTWARE FOR PLANAR
CIRCUITS
An efficient Multilevel Fast Multipole Algorithm based
on the use of Perfectly Matched Layers

D. Vande Ginste¹, F. Olyslager¹, D. De Zutter¹ and E. Michielssen²

¹Ghent University, Belgium; ²University of Illinois at Urbana-Champaign, USA

Abstract: The most successful simulation technique for planar circuits embedded in
layered media is the integral equation approach solved with the Method of
Moments (MoM). The kernel in the integral equation is a Green's function of
the layered medium. The MoM leads to the solution of a dense linear system
of equations. For large and complex circuits this soon leads to systems with a
huge number of unknowns N. Storing and solving the linear system requires
$O(N^2)$ memory and $O(N^3)$ CPU time respectively. Using iterative solution
techniques the cost for solving the linear system can be reduced to $O(PN^2)$,
with P the number of iterations. The calculation of the Green's functions for
layered media demands the numerical evaluation of Sommerfeld integrals. By
making use of the excellent absorbing properties of Perfectly Matched Layers
(PML) it is possible to obtain a series representation for these Green's
functions. The terms in this series allow for the application of a Multilevel
Fast Multipole Algorithm (MLFMA) which can reduce the memory and
computational complexity of the algorithm to $O(N)$ for dense geometries. In
this chapter the combined PML-MLFMA is outlined. It is numerically
demonstrated that this technique allows for the analysis of very large planar
structures. An extension to small circuits with much geometric detail is also
presented.

Key words: Microstrip Structure; Planar Antenna Array; Perfectly Matched Layer;
Multilevel Fast Multipole Algorithm.

L. Tarricone and A. Esposito (eds.), Advances in Information Technologies for Electromagnetics, 147-222.
© 2006 Springer. Printed in the Netherlands.

1. INTRODUCTION

1.1 Setting and Definition of the Research Topic

1.1.1 High-Frequency Applications and Design

Half a century ago, high-frequency electronic devices were manufactured
exclusively for military purposes and for use in the areas of satellite
communications, radar technology, and avionics. During the last decades,
however, the development of consumer applications has boomed, particularly
for the telecommunication market. Cordless telephony (DECT) and the
second-generation GSM mobile telephones (operating at 900 MHz and
1800 MHz) are already old news. The introduction of, e.g., UMTS (2.2 GHz),
i.e. third-generation mobile telephony, has cleared the way for mobile
multimedia applications requiring broadband access (2 Mbit/s) to the
internet. A whole range of technologies has also been, and is still being,
developed for indoor data communication. For short-range (10 to 30 m)
wireless communications, Bluetooth applications (2.45 GHz) have conquered
the market. Concerning wireless computer networks, the HiperLAN
(5.15-5.3 GHz and 17.1-17.3 GHz) concept has been fully implemented and is
being developed further to achieve even larger bandwidths (and thus higher
data rates). Furthermore, in the domain of Personal Computers (PCs) the
increase of the bit rates is also remarkable. According to Moore's law
[Schaller, 1997] the clock speeds double every 18 months. At present, PCs
with clock speeds of 3 GHz are affordable to private consumers. As a final
example, the development of the Global Positioning System (GPS) has to be
mentioned. This technology, operating at 1227.6 MHz and 1575.42 MHz,
necessitated the manufacturing of high-frequency devices in the navigation
sector.
The constant increase of data rates, clock speeds, etc. in state-of-the-art
applications has an important consequence: more and more high-frequency
phenomena appear in the electrical circuits of systems. Indeed, at
increasing operating frequencies the electrical size of the circuits increases
and the wave character of the currents flowing in the circuits becomes
noticeable (even for physically small circuits such as Integrated Circuits (ICs)).
Some examples: interconnections no longer behave as perfect short circuits,
signals are delayed and even distorted because of transmission line effects,
cross-talk occurs, ground-bounce effects take place, and the skin effect
causes nonuniform current distributions. Also, because of the exponential
growth of high-frequency equipment, the always undesired
electromagnetic interference (EMI) issues gain in significance. At high
frequencies the circuits start radiating and, because of reciprocity, also
become more susceptible to noise signals. This EMI can seriously disrupt the
functionality of high-frequency systems, and therefore international rules
have been enforced to ensure electromagnetic compatibility (EMC)
[Goedbloed, 1992].
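
The growth of the electrical size with frequency can be illustrated with a small calculation. The sketch below expresses a physical length in wavelengths; the 5 cm interconnect length, the chosen frequencies and the optional effective permittivity parameter are illustrative assumptions, not values taken from a specific design.

```python
SPEED_OF_LIGHT = 299_792_458.0  # m/s, free space


def electrical_length(length_m, frequency_hz, eps_eff=1.0):
    """Return a physical length expressed in wavelengths.

    eps_eff is an assumed effective relative permittivity of the medium;
    eps_eff = 1 corresponds to free space.
    """
    wavelength_m = SPEED_OF_LIGHT / (frequency_hz * eps_eff ** 0.5)
    return length_m / wavelength_m


# Hypothetical 5 cm interconnect at three operating frequencies
for frequency_hz in (100e6, 900e6, 3e9):
    size = electrical_length(0.05, frequency_hz)
    print(f"{frequency_hz / 1e9:4.1f} GHz: {size:.3f} wavelengths")
```

At 100 MHz such an interconnect is a small fraction of a wavelength, whereas at 3 GHz it is about half a wavelength long, so wave effects can no longer be neglected.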
To design the electrical circuits of novel systems efficiently, the engineer
wants to be able to predict all these effects in order to reduce the
time-to-market, i.e. the time needed to develop a new application, and to
avoid the manufacturing of expensive prototypes. For that purpose, the
engineer uses high-performance software to simulate the behavior of the
circuits. A multitude of Computer Aided Design (CAD) tools have been
developed throughout the years, making a complete survey almost impossible.
Here, a distinction is made between two important classes of simulators. On
the one hand, for low-frequency applications, CAD-tools relying on
Kirchhoff's laws have been implemented. These tools are called circuit
simulators and they allow for the fast modeling of complex and large
circuits. A well-known example is SPICE. On the other hand, it is
inconvenient, or even impossible, to predict high-frequency phenomena with
these tools since they do not incorporate the wave effects described by
Maxwell's equations [Maxwell, 1954]. Consequently, a wide range of CAD-tools
(not exclusively for circuit design) based on Maxwell's equations have
appeared on the market (see the survey presented by Mirotznik and Prather
(1997) and the Web site http://www.emclab.umr.edu/csoft.html). Since these
solvers have the ability to model the electric and magnetic fields, they are
called field simulators. In this chapter, we are interested in this last
group of high-frequency simulators, and more specifically in planar circuit
solvers.

1.1.2 Planar Circuits and Planar Solvers

High-frequency phenomena have a large impact on the design of planar
microwave circuits. These circuits consist of one or more layers of dielectric
and thin metallization layers. The metallization patterns comprise
microstrips, striplines, slotlines, and coplanar waveguides. Mass production of
these structures is made possible by photolithographic manufacturing
techniques. This is an important advantage over older (rectangular or
cylindrical) waveguide structures. Also, passive planar circuit elements and
active elements can easily be integrated. This allows the manufacturing of
PCBs (Printed Circuit Boards), MICs (Microwave Integrated Circuits),
MMICs (Monolithic Microwave Integrated Circuits), advanced packaging
structures such as MCMs (Multi-Chip Modules), and planar microwave antenna
structures.
As stated in the previous section, the design of complex planar structures
requires accurate modeling of their electromagnetic behavior. Basically,
field simulators solve Maxwell's equations for a certain configuration, which
can be planar or nonplanar. In most cases, solutions of the equations can only
be found by using numerical techniques. Many solvers rely on a finite element
method (FEM) [Silvester and Ferrari, 1990], a finite-difference time-domain
technique (FDTD) [Taflove, 1995], and/or a boundary integral equation
(BIE) technique combined with a Method of Moments (MoM) solution
[Harrington, 1993]. This last technique is especially popular for the modeling
of planar circuits. Two well-known examples of software packages that
comprise a BIE-MoM based planar solver are Agilent EEsof ADS and
Ansoft Designer.

1.1.3 Some Advantages and Drawbacks of BIE-MoM Based Planar Solvers

A first advantage is the reduction of the number of unknowns. In this
chapter planar metallizations residing in layered background media are
considered. When using a BIE-MoM technique to model these structures,
only the system's metallic conductors have to be discretized, because
the characteristics of the layered medium are contained in the so-called
Green's functions [Tai, 1993]. Hence, the number of unknowns will be much
smaller than that obtained with a FEM simulation technique.
Another advantage is the efficient and accurate treatment of open areas.
Microstrip substrates are open structures because they comprise a
semi-infinite layer of air. Since the radiation condition [Van Bladel, 1985]
is incorporated in the Green's functions, open structures can be modeled
rigorously. In FDTD the open simulation areas have to be terminated using
Absorbing Boundary Conditions (ABCs), which are not always ideal [Taflove,
1995; Givoli, 1991].
Of course there are also some drawbacks. First of all, the calculation of
the Green's functions for planar structures is computationally expensive.
Fortunately, these Green's functions are determined by the layered background
medium, and hence they can be stored separately, i.e. independently of the
metallization they support. Another disadvantage concerns the linear system
arising from the MoM itself. When the system's metal conductors are
meshed using N discretization elements, this set of equations is given by

\[
V = Z I \quad \text{or:} \quad V_i = \sum_{j=1}^{N} Z_{ij} I_j, \quad i = 1, \ldots, N \tag{7.1}
\]

where V and I are N-dimensional vectors. The first one contains the
known numbers $V_i$, $i = 1, \ldots, N$, caused by the excitation of the planar
circuit. The latter is the unknown of the problem and thus it contains the N
unknown current densities $I_j$, $j = 1, \ldots, N$, on the N segments. The
$(N \times N)$-dimensional matrix Z is called the moment matrix (or interaction
matrix or system matrix). Although the number of unknowns is reduced compared
with other methods, this moment matrix is dense, which has some cumbersome
consequences. Storing the linear system requires a lot of memory: more
specifically, the memory complexity of the method is of order $O(N^2)$.
Also, when solving the linear system with a direct solver (such as Gaussian
elimination or LU-decomposition), the solving time scales unfavorably with
the number of unknowns, i.e. the computational complexity is of order
$O(N^3)$. When using an iterative solver (see Section 1.2.2), this reduces to
$O(PN^2)$, with P the number of iterations. For electrically large structures
(such as antenna arrays), involving a large number of unknowns N, the memory
requirements easily exceed the computer's memory capacity and the modeling
takes a very long time. The same problems can also occur for the simulation
of small systems (such as MMICs), because the number of unknowns has to be
large as well in order to model the fine geometric details of the structure
accurately. In this chapter, we will focus on the memory and computational
complexity problems of BIE-MoM based solvers for planar microwave structures.
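
The practical impact of these scaling laws can be estimated with a few lines of arithmetic. The sketch below is only an order-of-magnitude illustration: it assumes that each entry of Z is stored as a double-precision complex number (16 bytes), it omits all constant factors in the operation counts, and the iteration count P = 100 is an arbitrary illustrative value.

```python
BYTES_PER_ENTRY = 16  # assumption: double-precision complex matrix entries


def moment_matrix_bytes(n_unknowns):
    """Memory needed to store the dense N x N moment matrix: O(N^2)."""
    return BYTES_PER_ENTRY * n_unknowns ** 2


def direct_solve_ops(n_unknowns):
    """Order-of-magnitude cost of a direct solve: O(N^3), constants omitted."""
    return n_unknowns ** 3


def iterative_solve_ops(n_unknowns, n_iterations):
    """Order-of-magnitude cost of an iterative solve: O(P N^2)."""
    return n_iterations * n_unknowns ** 2


P = 100  # assumed number of iterations, for illustration only
for n in (1_000, 10_000, 100_000):
    gib = moment_matrix_bytes(n) / 2 ** 30
    print(f"N = {n:>7}: storage {gib:10.2f} GiB, "
          f"direct ~{direct_solve_ops(n):.1e} ops, "
          f"iterative ~{iterative_solve_ops(n, P):.1e} ops")
```

Under these assumptions, increasing N by a factor of ten multiplies the storage by one hundred and the direct solution cost by one thousand, which is why even moderately large planar structures quickly become intractable.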

1.2 Methodology

In this section the basic tools used to tackle the above-described
complexity problems are briefly discussed. The purpose is to provide the
reader with some mathematical and/or historical insight into a few techniques,
without going into too much detail. Afterwards, these techniques will be
elaborated further.

1.2.1 Perfectly Matched Layer (PML) Based Green's Functions

As stated before, the BIE-MoM procedure, in which only the system's
metallic conductors are discretized, is made possible by the use of
Green's functions [Tai, 1993]. In this chapter, the Green's function is
defined as the electric field produced by an elementary current source. This
Green's function characterizes the behavior of the layered background medium
in which the metallization resides. Classically, the determination of the
Green's functions of layered background media is far from
straightforward. The modal spectrum of an open planar structure comprises
some discrete propagating surface waves and a continuous set of radiation
modes [Felsen and Marcuvitz, 1994; Olyslager, 1999]. The presence of this
continuous set unavoidably calls for the time-consuming numerical
evaluation of Sommerfeld-type integrals [Faché et al., 1989; 1992; 1993].
An innovative, very efficient, and elegant way to calculate Greens
functions, based on the use of PMLs [Brenger, 1994; Chew and Weedon,
1994], has first been proposed by Derudder et al. [1999a, 1999b] and also by
Olyslager and Derudder [2003]. For a layered structure that is closed by a
perfect electric conductor (PEC) plane at the top and bottom of the structure,
the Sommerfeld-integrals that arise in the Greens function calculations can
be expressed as a series of surface waves. By using a PML that is covered by
a PEC plane, open layered media can be closed while (approximately)
maintaining the open character of the structure. In Derudder et al. [1999a,
1999b] and Olyslager and Derudder [2003] this approach was used to obtain
an analytic and easy to determine series representation for the Greens
functions of open layered media (see also Section 3). These new analytical
expressions are the core of our new fast formalisms.

1.2.2 Iterative Solvers

Figure 7-1. Flowchart of an iterative solver.

Basically, there are two ways to solve the linear system (Eq. 7.1). First,
a direct solver can be used. Well-known methods are Gaussian
elimination and LU-decomposition. With these techniques an exact solution
can be found, provided that $\mathbf{Z}$ is not singular. Another advantage
is that, once the LU-decomposition of $\mathbf{Z}$ is determined and stored,
new solutions $\mathbf{I}$ for different excitations $\mathbf{V}$ can quickly
be generated. An important drawback is that the cost to solve the
$N$-dimensional problem scales as $O(N^3)$. In this work, we prefer to use
iterative solvers (also called indirect solvers). Several types are described
in the literature [Axelsson, 1994], but the specifics of these solvers are
not of major importance here. In general, the solvers seek an approximation
of the solution $\mathbf{I}$ in a number of successive iterations. In each
iteration $p$, an $N$-dimensional test solution $\mathbf{X}_p$ is proposed as
a solution of the linear system. This estimate is multiplied with the moment
matrix, yielding an $N$-dimensional vector
$\mathbf{Y}_p = \mathbf{Z}\,\mathbf{X}_p$. The indirect solver then compares
$\mathbf{Y}_p$ with the excitation vector $\mathbf{V}$ and decides whether to
accept $\mathbf{X}_p$ as a good approximation of the real solution
$\mathbf{I}$ or to propose a new (and hopefully better) estimate
$\mathbf{X}_{p+1}$ in the next iteration $p + 1$. The flowchart of a general
iterative solution technique is shown in Fig. 7-1.
The $N \times N$ moment matrix $\mathbf{Z}$ is dense. Therefore, the cost to
store this matrix (and thus the linear system) scales as $O(N^2)$. Also,
performing one matrix-vector multiplication $\mathbf{Z}\,\mathbf{X}_p$ costs
$O(N^2)$ operations, which is expensive. If an acceptable solution is found
in $P$ iteration steps, the computational complexity is of order $O(PN^2)$.
For well-conditioned problems, it is safe to assume that $P \ll N$. The goal
is to find a method that sparsifies the moment matrix with an acceptable and
controllable loss of accuracy. This reduces both the memory requirements and
the solving time. For this purpose, we opt to use Fast Multipole Methods
(FMMs).
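
The loop of Fig. 7-1 can be illustrated with a short sketch. It is not the solver used in this chapter: for brevity it uses the plain conjugate gradient method, which is only valid for real, symmetric, positive definite matrices, and the 3 x 3 matrix `Z_DEMO` and vector `V_DEMO` are made-up test data. The function names are hypothetical. What the sketch is meant to show is that the solver touches the matrix only through the callback `apply_Z`, i.e. through one matrix-vector product per iteration, which is the $O(N^2)$ step that a fast method replaces.

```python
def matvec(Z, x):
    """Dense matrix-vector product: the O(N^2) operation in each iteration."""
    return [sum(z_ij * x_j for z_ij, x_j in zip(row, x)) for row in Z]


def dot(a, b):
    return sum(a_i * b_i for a_i, b_i in zip(a, b))


def conjugate_gradient(apply_Z, V, tol=1e-10, max_iter=1000):
    """Solve Z X = V for a symmetric positive definite Z (assumption).

    apply_Z is any function returning Z times a vector; the solver never
    needs the matrix entries themselves.
    Returns the estimate X and the number of iterations used.
    """
    X = [0.0] * len(V)          # initial test solution
    R = list(V)                 # residual V - Z X for X = 0
    D = list(R)                 # search direction
    rr = dot(R, R)
    v_norm = dot(V, V) ** 0.5
    if v_norm == 0.0:
        return X, 0
    for p in range(1, max_iter + 1):
        Y = apply_Z(D)                          # the only use of Z
        step = rr / dot(D, Y)
        X = [x + step * d for x, d in zip(X, D)]
        R = [r - step * y for r, y in zip(R, Y)]
        rr_new = dot(R, R)
        if rr_new ** 0.5 <= tol * v_norm:       # accept the estimate
            return X, p
        D = [r + (rr_new / rr) * d for r, d in zip(R, D)]
        rr = rr_new
    return X, max_iter


# Made-up symmetric positive definite test system.
Z_DEMO = [[4.0, 1.0, 0.0],
          [1.0, 3.0, 1.0],
          [0.0, 1.0, 2.0]]
V_DEMO = [1.0, 2.0, 3.0]

X_DEMO, P_DEMO = conjugate_gradient(lambda x: matvec(Z_DEMO, x), V_DEMO)
```

Replacing the lambda by a routine that applies a sparsified or factorized form of the matrix leaves the solver itself unchanged.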

1.2.3 Fast Multipole Method (FMM)

The Fast Multipole Method (FMM) was first presented by Rokhlin and
Greengard (1987); its purpose was to solve static problems quickly.
These static problems are typically described by the Laplace equation. The
high-frequency problems considered in this chapter comprise wave
phenomena and hence constitute dynamic problems, characterized by
the Helmholtz equation. Rokhlin (1990) achieved a breakthrough in the search
for fast methods for dynamic problems. Other investigators joined him
shortly after [Coifman et al., 1993; Lu and Chew, 1993; Hamilton et al.,
1994]. The dynamic FMMs were later optimized by implementing them in a
multilevel framework. These methods became known as Multilevel Fast
Multipole Algorithms (MLFMAs) [Lu and Chew, 1994; Dembart and Yip,
1995; Epton and Dembart, 1995; Song and Chew, 1995] and are faster and more
memory efficient than the original (two-level) FMMs. MLFMAs were
originally developed in the frequency domain. Michielssen extended these
algorithms to the time domain and called them plane-wave time-domain
(PWTD) methods [Ergin et al., 1999]. Recently, Chew presented the
Fast Inhomogeneous Plane Wave Algorithm (FIPWA) [Hu and Chew, 2000;
Hu and Chew, 2001; Jiang and Chew, 2004]. For the sake of completeness,
it should be mentioned that other fast methods have also been proposed
[Canning, 1990a; Canning, 1990b; Michielssen and Boag, 1994; Michielssen and
Boag, 1996; Catedra and Gago, 1990; Zwamborn and van den Berg, 1991;
Ling et al., 1998; Ling et al., 2000; Chew et al., 2002].
Within the limited scope of this chapter, it is impossible to provide the
reader with a full mathematical introduction to the FMM and the MLFMA.
The reader is encouraged to consult the references mentioned above.
Nevertheless, the gist of the FMM is as follows. The metallization pattern
is discretized in $N$ elements. For every element an unknown current
expansion coefficient $I_j$ has to be calculated. Therefore, the dense
$N \times N$ moment matrix $\mathbf{Z}$ is constructed, which in fact
describes the $N^2$ interactions $Z_{ij}$ between every pair $(i, j)$ of
discretization elements $(i = 1, \ldots, N,\ j = 1, \ldots, N)$. These
interactions depend to a great extent on the Green's function, which in turn
always depends on the distance between the source discretization element $j$
and the observer discretization element $i$. Originally the source $j$ and
the observer $i$ are connected via this Green's function. By applying the
FMM, the Green's function is factorized such that source and observer
contributions are separated. This allows subdividing the discretization
elements into groups. Instead of calculating interactions between
source-observer pairs, the interactions between groups of sources (source
groups) and groups of observers (observer groups) are now considered. These
group-group interactions can be calculated in a fast way, and
mathematically, this corresponds to a decomposition of the original moment
matrix into a product of sparse matrices. This is the core of the two-level
FMM. By clustering the groups on multiple levels, the moment matrix is
further factorized, leading to the very efficient MLFMA.
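
The separation of source and observer contributions can be made concrete with a toy example. The sketch below does not use the Green's function of this chapter, which is factorized by a plane wave decomposition of a Hankel kernel. Instead it assumes the static one-dimensional kernel $1/(x - y)$, for which the expansion $1/(x - y) = \sum_{k \ge 0} (y - c)^k / (x - c)^{k+1}$ holds whenever $|y - c| < |x - c|$. All names and the test data are hypothetical. The sources of one group are first aggregated into a few moments about the group center $c$; these moments are then evaluated at every observer of a distant group, so the cost is proportional to (number of sources + number of observers) times the expansion order, instead of their product.

```python
def direct_sum(sources, charges, observers):
    """Reference: every source-observer pair is evaluated explicitly."""
    return [sum(q / (x - y) for y, q in zip(sources, charges))
            for x in observers]


def aggregate(sources, charges, center, order):
    """Source-only step: moments M_k = sum_j q_j (y_j - c)^k."""
    return [sum(q * (y - center) ** k for y, q in zip(sources, charges))
            for k in range(order)]


def evaluate(moments, center, observers):
    """Observer-only step: phi(x) ~ sum_k M_k / (x - c)^(k+1)."""
    return [sum(m / (x - center) ** (k + 1) for k, m in enumerate(moments))
            for x in observers]


# Made-up, well separated groups: sources around 0, observers around 10.
N_POINTS = 50
SOURCES = [-0.5 + j / (N_POINTS - 1) for j in range(N_POINTS)]
CHARGES = [1.0 + 0.01 * j for j in range(N_POINTS)]
OBSERVERS = [9.5 + i / (N_POINTS - 1) for i in range(N_POINTS)]

EXACT = direct_sum(SOURCES, CHARGES, OBSERVERS)
MOMENTS = aggregate(SOURCES, CHARGES, center=0.0, order=12)
FAST = evaluate(MOMENTS, 0.0, OBSERVERS)
MAX_ERROR = max(abs(a - b) for a, b in zip(EXACT, FAST))
```

The expansion order controls the accuracy: the truncation error decreases geometrically with the ratio of group radius to group separation, which is why only well separated groups are treated this way.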

1.3 Outline

The rest of this chapter is organized as follows. In Section 2 the classical
solution technique, based on BIE-MoMs, for planar microwave structures is
discussed. Since the Green's functions play a major role in this method,
some attention is devoted to the impact of Sommerfeld-integrals on their
determination.
A new formalism to determine these Green's functions is presented in
Section 3. First, the Perfectly Matched Layer (PML) is introduced, and its
historical background is mentioned. Next, the PML-concept is applied to
open microwave structures, in order to obtain alternative, closed-form
expressions for the Green's functions. These new expressions are in fact
series expansions of PML-surface waves.
Section 4 is the main portion of this chapter and describes a technique for
the modeling of large planar microstrip structures. The method is based on
the use of PML-series and a plane wave decomposition of the Hankel
function that appears in this series. This approach leads to the
implementation of an MLFMA. Many numerical examples validate and illustrate
the method.
Some extensions to the above technique are proposed in Section 5.
Finally, at the end of this chapter, the main conclusions are drawn.

2. CLASSICAL SOLUTION TECHNIQUE FOR MICROSTRIP STRUCTURES

2.1 Geometry of the Problem

Figure 7-2. Geometry of the problem.



The microstrip configurations considered herein consist of a substrate
with thickness $d$, relative permittivity $\varepsilon_r$, and relative
permeability $\mu_r$, that is backed by a perfectly electric conducting (PEC)
ground plane. On top of the substrate, microstrip elements, comprising
traces and patches, are printed. It is assumed that this metallization $S$ is
infinitely thin and perfectly conducting. Antenna arrays (Fig. 7-2) are a
typical example of such configurations. The structure can be excited by port
sources or it can be illuminated by a plane wave parameterized as

$$\mathbf{E}^{PW}(x, y, z) = \mathbf{E}_0\, e^{j(k_x x + k_y y + k_z z)} = \mathbf{E}_0\, e^{jk_0(\cos\varphi \sin\theta\, x + \sin\varphi \sin\theta\, y + \cos\theta\, z)} \tag{7.2}$$

where $k_0 = \omega/c$ is the free space wavenumber, with $\omega = 2\pi f$
the angular frequency¹ and with $c$ the speed of light. The goal of this
section is to analyze radiation from and scattering by this structure. These
kinds of problems have been studied for many years [Michalski and Butler,
1983; Mosig and Gardiol, 1985; Tsalamengas and Fikioris, 1993; Tsalamengas,
1993; Ling and Jin, 1997].

2.2 The EFIE Description

The plane wave (Eq. 7.2), or another source, excites the structure. This
causes an incident field $\mathbf{E}^{inc}(\mathbf{r})$ that represents the
field in the presence of the substrate on the ground plane without the actual
metallization $S$. This incident field induces unknown surface current
densities $\mathbf{J}_t(\mathbf{r})$, $\mathbf{r} = (x, y, d) \in S$, flowing
on the conductors $S$, which in turn produce a scattered field
$\mathbf{E}^{sc}(\mathbf{r})$².
This scattering problem can be described in terms of a BIE that here
takes the form of an Electric Field Integral Equation (EFIE). Since the
metallic strips placed at the substrate-air interface are perfect conductors,
this integral equation for the current density is obtained by enforcing the
component of the total electric field $\mathbf{E}^{tot}(\mathbf{r})$
transverse to $z$ to vanish on the metallization $S$:

$$\mathbf{E}_t^{tot}(\mathbf{r}) = \mathbf{E}_t^{inc}(\mathbf{r}) + \mathbf{E}_t^{sc}(\mathbf{r}) = 0, \qquad \mathbf{r} = (x, y, d) \in S \tag{7.3}$$

¹ All sources and fields are assumed time-harmonic with angular frequency
$\omega$ and the time dependencies $e^{j\omega t}$ are suppressed.
² Here, and in what follows, vectors $\mathbf{r}$ often indicate locations at
the substrate-air interface $z = d$. Hence, they are actually two-dimensional
vectors $\mathbf{r} \equiv (x, y, d)$. It will always be clear from the
context whether $\mathbf{r}$ depends on $z$ or not. Restrictions of a
vectorial quantity transverse to $z$ are denoted with an index $t$.

The fields produced by $\mathbf{J}_t(\mathbf{r})$ can be calculated by making
use of the pertinent Green's dyadic [Tai, 1993], which is detailed in the
next section. The boundary condition (Eq. 7.3) is cast as

$$\mathbf{E}_t^{inc}(\mathbf{r}) = -\mathbf{E}_t^{sc}(\mathbf{r}) = -\int_S \bar{\bar{G}}_{ee}(\mathbf{r}\,|\,\mathbf{r}') \cdot \mathbf{J}_t(\mathbf{r}')\, dS', \qquad \mathbf{r} = (x, y, d) \in S \tag{7.4}$$

The integration in the EFIE (Eq. 7.4) extends over the metallization
pattern $S$. The left-hand side is the known incident field, e.g. caused by
the plane wave impinging upon the substrate without any metallization $S$.
The current density $\mathbf{J}_t(\mathbf{r})$ is the unknown of the
equation. If the transverse two-by-two electric-electric Green's dyadic
$\bar{\bar{G}}_{ee}(\mathbf{r}\,|\,\mathbf{r}')$ is known, then the
scattering problem is uniquely defined.

2.3 The Green's Dyadic $\bar{\bar{G}}_{ee}(\mathbf{r}\,|\,\mathbf{r}')$

2.3.1 Integral Representation

Using the well-known spectral domain method [Faché et al., 1993], the
Green's dyadic can be determined. A full derivation is not repeated here;
the results suffice. For an elementary current source at
$\mathbf{r}' = (x', y', d)$, taken as the origin of the coordinate system,
and an observer at $\mathbf{r} = (x, y, d)$, the two-by-two Green's dyadic,
using the cylindrical coordinates $\rho = \sqrt{(x - x')^2 + (y - y')^2}$ and
$\tan\phi = \dfrac{y - y'}{x - x'}$, is given by:

$$\bar{\bar{G}}_{ee}(\rho, \phi) = \begin{pmatrix} G_{ee,xx}(\rho, \phi) & G_{ee,xy}(\rho, \phi) \\ G_{ee,yx}(\rho, \phi) & G_{ee,yy}(\rho, \phi) \end{pmatrix} = \frac{1}{4\pi} \begin{pmatrix} W_0(\rho) - W_2(\rho)\cos 2\phi & -W_2(\rho)\sin 2\phi \\ -W_2(\rho)\sin 2\phi & W_0(\rho) + W_2(\rho)\cos 2\phi \end{pmatrix} \tag{7.5}$$

with

$$W_0(\rho) = \int_0^{+\infty} \tilde{W}_0(\lambda)\, J_0(\lambda\rho)\, \lambda\, d\lambda \tag{7.6}$$

$$W_2(\rho) = \int_0^{+\infty} \tilde{W}_2(\lambda)\, J_2(\lambda\rho)\, \lambda\, d\lambda \tag{7.7}$$

$$\tilde{W}_0(\lambda) = g^{TM}(\lambda) - g^{TE}(\lambda) \tag{7.8}$$

$$\tilde{W}_2(\lambda) = g^{TM}(\lambda) + g^{TE}(\lambda) \tag{7.9}$$

$$g^{TM}(\lambda) = \frac{1}{Y_1^{TM}\cot(\gamma_1 d) + jY_2^{TM}} \tag{7.10}$$

$$g^{TE}(\lambda) = \frac{1}{Y_1^{TE}\cot(\gamma_1 d) + jY_2^{TE}} \tag{7.11}$$

and:

$$Y_i^{TM} = \frac{j\omega\varepsilon_i}{\gamma_i}, \qquad i = 1, 2 \tag{7.12}$$

$$Y_i^{TE} = \frac{\gamma_i}{j\omega\mu_i}, \qquad i = 1, 2 \tag{7.13}$$

$$\gamma_i = \sqrt{\omega^2\varepsilon_i\mu_i - \lambda^2}, \qquad i = 1, 2 \tag{7.14}$$

where the index $i = 1, 2$ indicates the layer. Layer $i = 1$ is the substrate
$(0 < z < d)$ with permittivity $\varepsilon_1 = \varepsilon_0\varepsilon_r$
and permeability $\mu_1 = \mu_0\mu_r$. Layer $i = 2$ is the semi-infinite
layer of air $(d < z < +\infty)$ with permittivity
$\varepsilon_2 = \varepsilon_0$ and permeability $\mu_2 = \mu_0$. Since
$\gamma_i$ is only defined up to a sign, we demand that
$\Im(\gamma_i) \le 0$. Here $J_0(\cdot)$ and $J_2(\cdot)$ are the zeroth
order and the second order Bessel functions of the first kind. The indices TM
and TE in the above equations indicate transverse magnetic and transverse
electric w.r.t. the $z$-axis.

2.3.2 Sommerfeld-Integrals

The integrals in Eq. (7.6) and Eq. (7.7) are called Sommerfeld-integrals
[Faché et al., 1989; Faché et al., 1992; Faché et al., 1993] and have no
analytical solution. This can be understood by observing the functions
$g^{TX}(\lambda)$, where here, and in what follows, TX stands for TM or TE.
The denominators of these functions correspond to the TX-dispersion
relations of the PEC-dielectric-air structure:

$$\text{for TX:}\quad N^{TX}(\lambda) = Y_1^{TX}\cot(\gamma_1 d) + jY_2^{TX} = 0 \tag{7.15}$$

Solving Eq. (7.15) yields the TX-modal spectrum of the PEC-dielectric-air
structure. The typical modal spectrum of these open microstrip structures
comprises some discrete propagating surface waves and a continuous set of
radiation modes [Felsen and Marcuvitz, 1994; Olyslager, 1999]. Provided
the substrate is lossless, the transverse modal propagation constants of
these surface waves can be found on the real axis between $k_0$ and
$k_r = k_0\sqrt{\varepsilon_r\mu_r}$.
The radiation modes are a consequence of the open nature of the structure
caused by the semi-infinite layer of air $z > d$ and are characterized by
propagation constants located along a branch cut in the complex plane. A
typical modal spectrum is shown in Fig. 7-3. Mathematically speaking, this
branch cut is a consequence of the fact that the dispersion relations (Eq.
7.15) are odd functions of the root $\gamma_2$. For lossy materials, the
propagation constants of the surface waves shift downwards in the complex
plane. Of course, all the modes come in pairs with propagation constants of
opposite sign. The branch cut necessitates the numerical evaluation of the
Sommerfeld-integrals, and hence, there is no closed-form expression
available for the Green's dyadic (Eq. 7.5). This complicates the application
of the MLFMA to the matrix-vector products $\mathbf{Z}\,\mathbf{X}$.
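
As an illustration of the discrete part of this spectrum, the sketch below locates a surface-wave propagation constant of Eq. (7.15) on the real axis between $k_0$ and $k_r$ by bisection. It is only a sketch under stated assumptions: the substrate is lossless and non-magnetic, it is thin enough that the TM dispersion function has a single sign change in that interval, and the frequency, permittivity and thickness in the usage note are made-up values. The function names are hypothetical. For real $\lambda$ in this interval the dispersion function of Eq. (7.15) is purely imaginary, so the bisection tracks the sign of its imaginary part.

```python
import cmath
import math

EPS0 = 8.8541878128e-12      # vacuum permittivity [F/m]
C0 = 299792458.0             # speed of light in vacuum [m/s]


def gamma(k, lam):
    """Wavenumber of Eq. (7.14), with the branch chosen so that Im <= 0."""
    g = cmath.sqrt(complex(k * k - lam * lam))
    return -g if g.imag > 0 else g


def n_tm(lam, omega, eps_r, d):
    """TM dispersion function of Eq. (7.15), non-magnetic substrate assumed."""
    k0 = omega / C0
    k1 = k0 * math.sqrt(eps_r)
    g1, g2 = gamma(k1, lam), gamma(k0, lam)
    y1 = 1j * omega * EPS0 * eps_r / g1      # Eq. (7.12), substrate
    y2 = 1j * omega * EPS0 / g2              # Eq. (7.12), air
    return y1 * cmath.cos(g1 * d) / cmath.sin(g1 * d) + 1j * y2


def tm_surface_wave(omega, eps_r, d, iterations=200):
    """Bisection on Im(N_TM) for k0 < lambda < k_r (single root assumed)."""
    k0 = omega / C0
    k1 = k0 * math.sqrt(eps_r)
    lo, hi = k0 * (1.0 + 1e-9), k1 * (1.0 - 1e-9)
    f_lo = n_tm(lo, omega, eps_r, d).imag
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        f_mid = n_tm(mid, omega, eps_r, d).imag
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid            # root lies in the upper half
        else:
            hi = mid                         # root lies in the lower half
    return 0.5 * (lo + hi)
```

For instance, `tm_surface_wave(2 * math.pi * 10e9, 4.0, 1e-3)` returns a value slightly above $k_0$. The radiation modes cannot be found this way, since they form a continuum along the branch cut rather than isolated roots.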

[Figure: complex $\lambda$-plane with axes $\Re(\lambda)$ and $\Im(\lambda)$; surface waves located between $k_0$ and $k_r$; radiation modes.]

Figure 7-3. Modal spectrum.

2.4 The Method of Moments

The EFIE (Eq. 7.4) is solved by application of the MoM [Harrington,
1993]. To this end, $S$ is approximated by a (potentially nonuniform)
rectilinear mesh with $N$ interior edges. Next, $\mathbf{J}_t(\mathbf{r})$ is
expanded in a set of vector rooftop basis functions
$\mathbf{w}_j(\mathbf{r}) = w_{x,j}(\mathbf{r})\,\hat{\mathbf{x}} + w_{y,j}(\mathbf{r})\,\hat{\mathbf{y}}$,
$j = 1, \ldots, N$, with support $S_j \subset S$ comprising two patches that
are joined by the mesh's $j$-th interior edge, as [Sercu et al., 1995]³

$$\mathbf{J}_t(\mathbf{r}) = \sum_{j=1}^{N} I_j\, \mathbf{w}_j(\mathbf{r}) \tag{7.16}$$

Inserting Eq. (7.16) into Eq. (7.4) and testing the result with a set of test
functions $\mathbf{w}_i(\mathbf{r})$, $i = 1, \ldots, N$, that are identical
to the basis functions (this is called Galerkin testing) yields the following
$N \times N$ linear system in the unknown expansion coefficients $I_j$,
$j = 1, \ldots, N$:

$$\mathbf{V} = \mathbf{Z}\,\mathbf{I} \tag{7.17}$$

The $N$-vector $\mathbf{I}$ contains the current expansion coefficients
$I_j$, $j = 1, \ldots, N$, and the elements of the $N$-vector $\mathbf{V}$
and the $N \times N$ matrix $\mathbf{Z}$ are given by

$$V_i = \int_{S_i} \mathbf{E}_t^{inc}(\mathbf{r}) \cdot \mathbf{w}_i(\mathbf{r})\, dS \tag{7.18}$$

$$Z_{ij} = -\int_{S_i}\int_{S_j} \mathbf{w}_i(\mathbf{r}) \cdot \bar{\bar{G}}_{ee}(\mathbf{r}\,|\,\mathbf{r}') \cdot \mathbf{w}_j(\mathbf{r}')\, dS'\, dS \tag{7.19}$$

Linear system (Eq. 7.17) can be solved using direct or iterative schemes,
yielding the unknown coefficients $I_j$, and hence, the desired current
density $\mathbf{J}_t(\mathbf{r})$. Iterative solution techniques, e.g. the
BiConjugate Gradients stabilized method (BiCGstab) [Axelsson, 1994], are
amenable to acceleration by FMMs (and their descendants) as they only require
the multiplication of the moment matrix $\mathbf{Z}$ by a test vector
$\mathbf{X}_p$; hence, they are adopted here. From this current density, the
scattering or radiation characteristics can be derived, e.g. by using a
stationary phase method [Wilcox, 1964].

³ When using a triangular mesh, the so-called RWG basis functions (Rao et
al., 1982) have to be implemented.

3. PERFECTLY MATCHED LAYER BASED GREEN'S FUNCTIONS FOR LAYERED MEDIA

3.1 The Perfectly Matched Layer Concept

3.1.1 The Split Field Formalism

Figure 7-4. Planar interface between a lossless and a lossy, homogeneous, isotropic medium.

The Perfectly Matched Layer (PML) was first presented by Bérenger
(1994). It was his intention to achieve a better Absorbing Boundary
Condition (ABC) to terminate the simulation area in FEM and FDTD methods.
By splitting the fields and consequently introducing an extra degree of
freedom in Maxwell's equations, he succeeded in describing a medium that
is perfectly matched to any homogeneous, isotropic medium. This concept
of perfect matching is of major importance for the new fast techniques
described in this chapter and it can best be understood by considering
Fig. 7-4. This figure represents a two-dimensional (2-D) planar interface
between a lossless, homogeneous, isotropic medium 1 (e.g. free space)
and a homogeneous, isotropic medium 2 (the PML) with electric ($\sigma_e$)
and magnetic ($\sigma_m$) losses. A TM-polarized plane wave impinges upon the
PML under an (oblique) angle of incidence $\theta$. By making use of the
split field formalism it can be shown that for the following proper choice of
the losses

$$\frac{\sigma_e}{\varepsilon} = \frac{\sigma_m}{\mu} \tag{7.20}$$

the PML has an impedance of

$$Z = \sqrt{\frac{\mu}{\varepsilon}} \tag{7.21}$$

for every angle $\theta$. Hence it can be stated that these two media are
perfectly matched. (Note that with a classical description of a medium with
losses, viz. without split fields, this matching only occurs for
perpendicular incidence.) Two major consequences are that no reflections
occur at the interface and that waves traveling in the PML are damped. Due to
his innovative approach, Bérenger succeeded in implementing an ABC which
completely outperformed all existing ABCs of the time.

3.1.2 Complex Coordinate Stretching Formalism

Another way to describe the PML, whilst maintaining its favorable
properties, has been presented by Chew and Weedon (1994). They also
added a degree of freedom in Maxwell's equations by stretching the
$z$-coordinate into the complex plane. The new, complex $z$-coordinate is
given by

$$\tilde{z} = \alpha z \tag{7.22}$$

and $\alpha$ is called the complex stretching factor. Often [Chen et al.,
1995; Fang and Wu, 1995], the following choice is made

$$\alpha = \kappa_0 - j\frac{\sigma_0}{\omega\varepsilon_0} \tag{7.23}$$

where $\kappa_0$ and $\sigma_0$ are real, positive numbers, since this leads
to perfect matching and good damping. Typical values used in microstrip
applications are approximately $\kappa_0 = 10$ and
$\dfrac{\sigma_0}{\omega\varepsilon_0} = 8$ [Vande Ginste et al., 2004].

3.2 Closure of Open Microstrip Substrates

3.2.1 Procedure and Influence on the Green's Functions

To determine the Green's functions, classically one has to deal with
the modal spectrum of structures with an air half-space above the microstrip
substrate, as stated in Section 2.3. This leads to branch cut contributions in
the modal spectrum and hence unavoidably to the numerical evaluation of
Sommerfeld-integrals. A cross section of these structures and the
corresponding modal spectrum are shown in Figs. 7-5(a) and 7-5(c). To obtain
new closed-form expressions, the semi-infinite layer of air is closed by
placing a PML backed by a PEC on top of it, see Fig. 7-5(b). The PML is
described using the stretched coordinate formalism with material parameters
$\varepsilon = \varepsilon_0$, $\mu = \mu_0$, and
$\alpha = \kappa_0 - j\sigma_0/(\omega\varepsilon_0)$. The layer of air with
thickness $d_{air}$ and the PML with thickness $d_{PML}$ are perfectly
matched. The specific choice of the stretching factor causes good absorption
in the PML and hence, the influence of the PEC on top of the PML is
negligible. Due to the bottom and top PEC in Fig. 7-5(b) the structure
becomes a closed waveguide. Of major importance is that the PML-closed
substrate mimics the behavior of the original, open structure and hence,
their Green's functions in the air and in the substrate are almost identical.
[Figure: (a) open substrate; (b) PML-closed substrate with a PEC at $z = d + d_{air} + d_{PML}$ and an air layer of complex thickness $\tilde{D}$; (c) continuous spectrum: surface waves between $k_0$ and $k_r$, radiation modes; (d) discrete spectrum: surface waves between $k_0$ and $k_r$, evanescent waves, Bérenger waves.]

Figure 7-5. Closure of the microstrip substrate and corresponding modal spectra.

3.2.2 Complex Thickness

In order to be able to deal with PMLs mathematically, the concept of
complex thickness is introduced. From $z = d + d_{air}$ the $z$-coordinate is
stretched with the factor $\alpha$, according to Eq. (7.22). Hence, the PML
can be considered as a homogeneous, isotropic layer, but with a complex
thickness. In the case of Fig. 7-5(b), the air-PML combination can be
replaced by a single layer of air with complex thickness

$$\tilde{D} = d_{air} + \tilde{d}_{PML} = d_{air} + d_{PML}\left(\kappa_0 - j\frac{\sigma_0}{\omega\varepsilon_0}\right) \tag{7.24}$$

This leads to a different interpretation. It is possible to close open
structures by placing PECs (or PMCs) in the complex plane. In this way, the
PML-closed waveguide can be treated as any plane-parallel plate waveguide,
keeping in mind that $z$ is complex. The waveguide is filled partially with a
dielectric layer $(0 < z < d)$ and partially with a layer of air
$(d < z < \tilde{D})$⁴. The complex thickness of the second layer can be seen
independently of its PML-parameters as any complex number

$$\tilde{D} = D_r + jD_i = |\tilde{D}|\, e^{j\arg\tilde{D}} \tag{7.25}$$

with $D_r, D_i \in \mathbb{R}$. In order to provide damping, $\tilde{D}$ must
be located in the fourth quadrant of the complex plane.
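
A small numerical sketch of Eq. (7.24) shows why the fourth quadrant provides damping. The function names and the geometric values in the usage note are assumptions made for illustration; only the PML parameters 10 and 8 are the typical values quoted in Section 3.1.2. The damping estimate assumes a wave with $z$-dependence $e^{-jk_0\tilde{z}}$, i.e. normal incidence in free space, so that the negative imaginary part of the stretched coordinate turns into an exponential decay.

```python
import cmath


def complex_thickness(d_air, d_pml, kappa0, sigma_ratio):
    """Eq. (7.24); sigma_ratio stands for sigma0 / (omega * eps0)."""
    return d_air + d_pml * complex(kappa0, -sigma_ratio)


def one_way_attenuation(k0, d_pml, kappa0, sigma_ratio):
    """|exp(-j k0 z~)| accumulated over the stretched PML thickness.

    Assumes a wave travelling along +z with wavenumber k0 (normal incidence).
    """
    stretched = d_pml * complex(kappa0, -sigma_ratio)
    return abs(cmath.exp(-1j * k0 * stretched))
```

With the assumed values `d_air = 5e-3`, `d_pml = 10e-3` and `k0 = 209.6` (roughly 10 GHz), `complex_thickness(5e-3, 10e-3, 10.0, 8.0)` lies in the fourth quadrant and `one_way_attenuation(209.6, 10e-3, 10.0, 8.0)` is far below one, so the field reflected by the top PEC is negligible when it re-enters the air layer.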

3.2.3 Dispersion Relations

Since the new structure is closed, the continuous set of radiation modes is
replaced by a discrete (countably infinite) set of surface waves (see Fig.
7-5(d)). By using the concept of complex thickness, this can again be seen
mathematically. The new modal spectrum is determined by solving the
following dispersion relations of the PEC-dielectric-air-PML-PEC waveguide
with complex thickness $d + \tilde{D}$:

$$\text{for TX:}\quad N^{TX,PML}(\lambda) = Y_1^{TX}\cot(\gamma_1 d) + Y_2^{TX}\cot(\gamma_2\tilde{D}) = 0 \tag{7.26}$$

The definitions of $Y_i^{TX}$ and $\gamma_i$ are still given by Eqs.
(7.12)-(7.14). From Eq. (7.26) it is clear that the branch cut vanishes
because the dispersion relations are even functions of $\gamma_1$ and
$\gamma_2$. Both the TM- and TE-modes come in three flavors. First, there are
the propagating surface waves which are (virtually) identical to those found
in the original open microstrip substrate. Next, there are the evanescent or
pseudo-leaky surface waves and finally there are the so-called Bérenger
surface waves [Rogier and De Zutter, 2001]. Below, no distinction between
these three types of modes is made and they are collectively indexed such
that $-\Im(\lambda_{TX,n}) < -\Im(\lambda_{TX,n+1})$ for all $n$.

⁴ Strictly speaking, the less than (<) and greater than (>) signs only make
sense for real numbers. By $d < z < \tilde{D}$ it is meant that $z$ lies
along a chosen line in the complex plane that connects $d$ and $\tilde{D}$.

3.3 Series Expansion for the Green's Dyadic $\bar{\bar{G}}_{ee}$

In FDTD and FEM, PMLs were introduced quite some time ago. There, however,
the PML is merely used as a numerical way to terminate simulation areas. The
analytical application of PMLs to obtain closed-form Green's functions has
been thoroughly investigated by the Electromagnetics Group of the Department
of Information Technology, Ghent University, Belgium [Derudder et al., 1998a;
Derudder et al., 1998b; Derudder et al., 1999a; Derudder et al., 1999b;
Knockaert and De Zutter, 2000; Derudder et al., 2000; Bienstman et al., 2001;
Derudder et al., 2001; Rogier and De Zutter, 2001; Knockaert and De Zutter,
2002; Rogier and De Zutter, 2002a; Rogier and De Zutter, 2002b; Olyslager and
Derudder, 2003; Olyslager, 2004]. These modal expansions of the Green's
functions are the key to the new fast simulation methods and lead to very
elegant modeling techniques. In this section, a closed-form expression for
the Green's dyadic $\bar{\bar{G}}_{ee}$ of the PML-closed substrate is
derived.

3.3.1 Integral Representation

Using the spectral domain method again, expressions similar to those in
Section 2.3 are found. Eqs. (7.5)-(7.9) are still valid, but new functions
$g^{TX}(\lambda)$ are obtained. These new functions are given by

$$\text{for TX:}\quad g^{TX,PML}(\lambda) = \frac{1}{Y_1^{TX}\cot(\gamma_1 d) + Y_2^{TX}\cot(\gamma_2\tilde{D})} \tag{7.27}$$

The denominator of Eq. (7.27) is the dispersion relation of the PML-closed
structure, which has a huge impact on the solution technique of the integrals
in Eq. (7.6) and Eq. (7.7).
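
The claim that the PML-closed substrate mimics the open one can be checked numerically by comparing Eq. (7.27) with Eq. (7.10). The sketch below is an illustration under stated assumptions, not part of the original formalism: the substrate is non-magnetic, the function names are hypothetical, and the numerical values in the usage note are made up, apart from the PML parameters 10 and 8 quoted in Section 3.1.2. When $\gamma_2\tilde{D}$ has a large negative imaginary part, $\cot(\gamma_2\tilde{D})$ tends to $j$, so the two denominators coincide.

```python
import cmath
import math

EPS0 = 8.8541878128e-12      # vacuum permittivity [F/m]
C0 = 299792458.0             # speed of light in vacuum [m/s]


def gamma(k, lam):
    """Wavenumber of Eq. (7.14), with the branch chosen so that Im <= 0."""
    g = cmath.sqrt(complex(k * k - lam * lam))
    return -g if g.imag > 0 else g


def cot(z):
    return cmath.cos(z) / cmath.sin(z)


def g_tm(lam, omega, eps_r, d, D=None):
    """g^TM of Eq. (7.10) if D is None, else g^TM,PML of Eq. (7.27)."""
    k0 = omega / C0
    k1 = k0 * math.sqrt(eps_r)           # non-magnetic substrate (assumption)
    g1, g2 = gamma(k1, lam), gamma(k0, lam)
    y1 = 1j * omega * EPS0 * eps_r / g1  # Eq. (7.12), substrate
    y2 = 1j * omega * EPS0 / g2          # Eq. (7.12), air
    closure = 1j * y2 if D is None else y2 * cot(g2 * D)
    return 1.0 / (y1 * cot(g1 * d) + closure)


def relative_difference(lam, omega, eps_r, d, D):
    g_open = g_tm(lam, omega, eps_r, d)
    g_closed = g_tm(lam, omega, eps_r, d, D)
    return abs(g_closed - g_open) / abs(g_open)
```

For example, with `omega = 2 * math.pi * 10e9`, `eps_r = 4.0`, `d = 1e-3` and `D = 5e-3 + 10e-3 * (10 - 8j)`, the relative difference is negligible at `lam = 0.5 * omega / C0` and at `lam = 1.5 * omega / C0`. The agreement degrades when $\lambda$ approaches $k_0$, where $\gamma_2$ tends to zero and the PML provides little damping.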

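3.3.2 $G_{ee,xx}$

The $xx$-component of the dyadic determines the $x$-component of the
electric field caused by a dipole source that is also $x$-oriented. Consider
a source that is placed at $(0, 0, d)$; then from Eq. (7.5) it is readily
seen that this leads to

$$G_{ee,xx}(x, y) = \frac{1}{4\pi}\left[ I_1 - I_2 - \frac{x^2 - y^2}{\rho^2} I_3 + \frac{y^2 - x^2}{\rho^2} I_4 \right] \tag{7.28}$$

where $\rho = \sqrt{x^2 + y^2}$ and with the integrals $I_1$ to $I_4$ defined
as:

$$\begin{aligned}
I_1 &= \int_0^{+\infty} g^{TM,PML}(\lambda)\, J_0(\lambda\rho)\, \lambda\, d\lambda \\
I_2 &= \int_0^{+\infty} g^{TE,PML}(\lambda)\, J_0(\lambda\rho)\, \lambda\, d\lambda \\
I_3 &= \int_0^{+\infty} g^{TM,PML}(\lambda)\, J_2(\lambda\rho)\, \lambda\, d\lambda \\
I_4 &= \int_0^{+\infty} g^{TE,PML}(\lambda)\, J_2(\lambda\rho)\, \lambda\, d\lambda
\end{aligned} \tag{7.29}$$

Consider integral $I_1$. Since $g^{TM,PML}(\lambda)$ is an even function of
$\lambda$ and using the zeroth order Hankel functions of the first and the
second kind, i.e. $H_0^{(1)}(z) = J_0(z) + jY_0(z)$ and
$H_0^{(2)}(z) = J_0(z) - jY_0(z)$ respectively, $I_1$ can be rewritten as:

$$I_1 = \frac{1}{2}\int_{\infty e^{-j\pi}}^{+\infty} g^{TM,PML}(\lambda)\, H_0^{(2)}(\lambda\rho)\, \lambda\, d\lambda \tag{7.30}$$

Figure 7-6. Contour integration path.

The branch cut of the Hankel function is situated on the negative real
axis. We write here $\infty e^{-j\pi}$ instead of $-\infty$ to make clear
that the integration path runs just under this branch cut. The integration
path is now closed with a semicircle at infinity, as in Fig. 7-6. The
contribution of this semicircle to the contour integral is zero. The zeros of
the TM-dispersion relation, corresponding to the propagation constants of the
TM-eigenmodes of the structure, are simple poles of $g^{TM,PML}(\lambda)$,
and hence of the integrand of Eq. (7.30). Using Cauchy's residue theorem
[Churchill and Brown, 1984] then leads to

$$I_1 = \frac{1}{2}\oint_C g^{TM,PML}(\lambda)\, H_0^{(2)}(\lambda\rho)\, \lambda\, d\lambda = -j\pi \sum_n \mathrm{Res}_{\lambda_{TM,n}}\!\left[ g^{TM,PML}(\lambda) \right] H_0^{(2)}(\lambda_{TM,n}\rho)\, \lambda_{TM,n} \tag{7.31}$$

The residues corresponding to the simple poles of $g^{TM,PML}(\lambda)$ can
easily be calculated:

$$\mathrm{Res}_{\lambda_{TM,n}}\!\left[ g^{TM,PML}(\lambda) \right] = \frac{1}{\left.\dfrac{d}{d\lambda} N^{TM,PML}(\lambda)\right|_{\lambda = \lambda_{TM,n}}} = \frac{1}{j\lambda_{TM,n}\, M^{TM}(\lambda_{TM,n})} \tag{7.32}$$

with

$$M^{TM}(\lambda) = \frac{\omega\varepsilon_1 d}{\gamma_1^2 \sin^2(\gamma_1 d)} + \frac{\omega\varepsilon_1 \cot(\gamma_1 d)}{\gamma_1^3} + \frac{\omega\varepsilon_2 \tilde{D}}{\gamma_2^2 \sin^2(\gamma_2\tilde{D})} + \frac{\omega\varepsilon_2 \cot(\gamma_2\tilde{D})}{\gamma_2^3} \tag{7.33}$$

Using a similar technique for $I_2$ to $I_4$ and also introducing the
following auxiliary function

$$M^{TE}(\lambda) = \frac{d}{\omega\mu_1 \sin^2(\gamma_1 d)} - \frac{\cot(\gamma_1 d)}{\omega\mu_1\gamma_1} + \frac{\tilde{D}}{\omega\mu_2 \sin^2(\gamma_2\tilde{D})} - \frac{\cot(\gamma_2\tilde{D})}{\omega\mu_2\gamma_2} \tag{7.34}$$

yields the final result for the $xx$-component of the dyadic:

$$\begin{aligned}
G_{ee,xx}(x, y) = &-\frac{1}{4}\sum_n \frac{H_0^{(2)}(\lambda_{TM,n}\rho)}{M^{TM}(\lambda_{TM,n})} + \frac{1}{4}\,\frac{x^2 - y^2}{\rho^2}\sum_n \frac{H_2^{(2)}(\lambda_{TM,n}\rho)}{M^{TM}(\lambda_{TM,n})} \\
&-\frac{1}{4}\sum_n \frac{H_0^{(2)}(\lambda_{TE,n}\rho)}{M^{TE}(\lambda_{TE,n})} + \frac{1}{4}\,\frac{y^2 - x^2}{\rho^2}\sum_n \frac{H_2^{(2)}(\lambda_{TE,n}\rho)}{M^{TE}(\lambda_{TE,n})}
\end{aligned} \tag{7.35}$$

The first and the second summation run over all TM-surface waves
$(n = 1, \ldots, \infty)$; the third and the fourth summation run over all
TE-surface waves $(n = 1, \ldots, \infty)$.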
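
The second equality in Eq. (7.32) can be verified by differentiating the dispersion relation of Eq. (7.26) term by term. The following worked step is a sketch that assumes the definitions of Eqs. (7.12)-(7.14), and uses the shorthand $t_1 = d$, $t_2 = \tilde{D}$, which is introduced here only for compactness:

```latex
\frac{d\gamma_i}{d\lambda} = -\frac{\lambda}{\gamma_i},
\qquad
\frac{d}{d\lambda}\left[\frac{j\omega\varepsilon_i}{\gamma_i}\cot(\gamma_i t_i)\right]
= j\omega\varepsilon_i
  \left[-\frac{\cot(\gamma_i t_i)}{\gamma_i^{2}}
        -\frac{t_i}{\gamma_i\sin^{2}(\gamma_i t_i)}\right]
  \left(-\frac{\lambda}{\gamma_i}\right)
= j\lambda\,\omega\varepsilon_i
  \left[\frac{t_i}{\gamma_i^{2}\sin^{2}(\gamma_i t_i)}
        +\frac{\cot(\gamma_i t_i)}{\gamma_i^{3}}\right]

\frac{d}{d\lambda}\left[\frac{\gamma_i}{j\omega\mu_i}\cot(\gamma_i t_i)\right]
= \frac{1}{j\omega\mu_i}
  \left[\cot(\gamma_i t_i)-\frac{\gamma_i t_i}{\sin^{2}(\gamma_i t_i)}\right]
  \left(-\frac{\lambda}{\gamma_i}\right)
= -j\lambda\,\frac{1}{\omega\mu_i}
  \left[\frac{t_i}{\sin^{2}(\gamma_i t_i)}
        -\frac{\cot(\gamma_i t_i)}{\gamma_i}\right]
```

Summing the first result over $i = 1, 2$ gives $dN^{TM,PML}/d\lambda = j\lambda M^{TM}(\lambda)$, in agreement with Eq. (7.33). Summing the second result gives $dN^{TE,PML}/d\lambda = -j\lambda M^{TE}(\lambda)$ with $M^{TE}$ as in Eq. (7.34); the opposite sign of the TE residues is what makes the TE sums enter Eq. (7.35) with signs opposite to those of the TM sums relative to Eq. (7.28).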
3.3.3 $G_{ee,xy}$

The $xy$-component of the dyadic determines the $x$-component of the
electric field caused by a dipole source that is $y$-oriented. Consider a
source that is placed at $(0, 0, d)$; then from Eq. (7.5) it is again easily
proved that this component can be written as:

$$G_{ee,xy}(x, y) = -\frac{xy}{2\pi\rho^2}\,(I_3 + I_4) \tag{7.36}$$

The integrals $I_3$ and $I_4$ are still given by Eqs. (7.29). The final
result is given by:

$$G_{ee,xy}(x, y) = \frac{xy}{2\rho^2}\sum_n \frac{H_2^{(2)}(\lambda_{TM,n}\rho)}{M^{TM}(\lambda_{TM,n})} - \frac{xy}{2\rho^2}\sum_n \frac{H_2^{(2)}(\lambda_{TE,n}\rho)}{M^{TE}(\lambda_{TE,n})} \tag{7.37}$$

The first summation runs over all TM-surface waves $(n = 1, \ldots, \infty)$;
the second summation runs over all TE-surface waves
$(n = 1, \ldots, \infty)$.

3.3.4 Closed-Form Expression for $\bar{\bar{G}}_{ee}$

First, before giving the new expression for the Green's dyadic, it should
be noted that:
• Eqs. (7.35) and (7.37) can easily be extended for a source placed at
  $(x', y', d)$. This point becomes the new origin of the coordinate system,
  and $\rho = \sqrt{(x - x')^2 + (y - y')^2}$ and
  $\tan\phi = \dfrac{y - y'}{x - x'}$ should be used again.
• By symmetry, or from the classical expression in Eq. (7.5) for the
  Green's dyadic, and with the above definition of $\rho$ and $\phi$, it is
  clear that:

$$\begin{aligned}
G_{ee,yx}(x - x', y - y') &= G_{ee,xy}(y - y', x - x') \\
G_{ee,yy}(x - x', y - y') &= G_{ee,xx}(y - y', x - x')
\end{aligned} \tag{7.38}$$

Next, taking these remarks into consideration, a compact analytical,
closed-form expression for the Green's dyadic can easily be derived:

$$\bar{\bar{G}}_{ee}(\rho, \phi) = \frac{1}{2}\sum_n \frac{1}{\lambda_{TM,n}^2\, M^{TM}(\lambda_{TM,n})}\, \nabla_t \nabla_t\, H_0^{(2)}(\lambda_{TM,n}\rho) + \frac{1}{2}\sum_n \frac{1}{\lambda_{TE,n}^2\, M^{TE}(\lambda_{TE,n})}\, (\hat{\mathbf{z}} \times \nabla_t)(\hat{\mathbf{z}} \times \nabla_t)\, H_0^{(2)}(\lambda_{TE,n}\rho) \tag{7.39}$$

where dyadic notation was used [Lindell, 1992] and with
$\nabla_t \equiv \left(\dfrac{\partial}{\partial x}, \dfrac{\partial}{\partial y}, 0\right)$.
For a numerical evaluation, the infinite series in Eq. (7.39) running over
all TX-modes $(n = 1, \ldots, \infty)$ have to be truncated (see further).
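
That the compact form of Eq. (7.39) reproduces the component expressions of Eqs. (7.35) and (7.37) follows from the second derivatives of the Hankel kernel. The identities below are a worked sketch based on the standard relations $H_0^{(2)\prime}(z) = -H_1^{(2)}(z)$, $H_1^{(2)\prime}(z) = H_0^{(2)}(z) - H_1^{(2)}(z)/z$ and $H_2^{(2)}(z) = 2H_1^{(2)}(z)/z - H_0^{(2)}(z)$, with $\cos 2\phi = (x^2 - y^2)/\rho^2$ and $\sin 2\phi = 2xy/\rho^2$:

```latex
\frac{\partial^{2}}{\partial x^{2}} H_0^{(2)}(\lambda\rho)
  = -\frac{\lambda^{2}}{2}\left[H_0^{(2)}(\lambda\rho) - \cos 2\phi\, H_2^{(2)}(\lambda\rho)\right]

\frac{\partial^{2}}{\partial y^{2}} H_0^{(2)}(\lambda\rho)
  = -\frac{\lambda^{2}}{2}\left[H_0^{(2)}(\lambda\rho) + \cos 2\phi\, H_2^{(2)}(\lambda\rho)\right]

\frac{\partial^{2}}{\partial x\,\partial y} H_0^{(2)}(\lambda\rho)
  = \frac{\lambda^{2}}{2}\,\sin 2\phi\, H_2^{(2)}(\lambda\rho)
```

For instance, the $xx$-entry of the TM term is $\frac{1}{2}\,\frac{1}{\lambda^2 M^{TM}}\,\partial^2 H_0^{(2)}/\partial x^2 = -\frac{1}{4}\,H_0^{(2)}/M^{TM} + \frac{1}{4}\cos 2\phi\, H_2^{(2)}/M^{TM}$, which is the TM part of Eq. (7.35). The $xx$-entry of $(\hat{\mathbf{z}} \times \nabla_t)(\hat{\mathbf{z}} \times \nabla_t)$ is $\partial^2/\partial y^2$ and its $xy$-entry is $-\partial^2/\partial x\,\partial y$, which gives the TE parts of Eqs. (7.35) and (7.37) in the same way.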

3.3.5 Important Remarks Concerning the Series Expansion

• From the final expression of Eq. (7.39) it is clear that the Green's
  function of this three-dimensional (3-D) layered structure can be written
  as a sum of Hankel functions. This Hankel kernel can be seen as the
  Green's function of a 2-D Helmholtz equation in a homogeneous space.
  Hence, by applying the PML-concept the Green's function of the
  original open microstrip problem in three dimensions is decomposed into
  a set of 2-D homogeneous space Green's functions. This decomposition
  will play a crucial role in the construction of the new fast method (see
  Section 4).
• Proofs of the completeness of the modes of a PML-closed structure are
  given by Knockaert and De Zutter (2002) and by Olyslager (2003).
  However, in practical applications only a limited number of modes can
  be used. Fortunately, the propagation constants of the modes are situated
  in the fourth quadrant of the complex plane, meaning that each mode can
  be seen as a damped surface wave. In particular, the propagation constants
  of higher order modes have large negative imaginary parts, so these modes
  decay very quickly. Hence, the influence of higher order modes is
  negligible at large distances $\rho$. Put differently, the number of
  modes that is needed to achieve a desired accuracy decreases with
  increasing distance $\rho$.
• Close to the source too many modes are needed and the series become
  impractical [Olyslager, 2004]. Even though it is possible to improve the
  method [Rogier and De Zutter, 2002b], in a MoM technique we still use a
  classical evaluation for the selfpatch and the very near interactions⁵.
  In the construction of an MLFMA a distinction is made between near and
  far interactions [Chew et al., 2001]; hence, this restriction causes no
  problems. Moreover, the fast method of Section 4 fully benefits from the
  fact that the number of modes needed to achieve a desired accuracy
  decreases with increasing distance.
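
The dependence of the truncation on distance can be sketched with a simple count. The example below is only an illustration: the mode constants in `MODES` are a made-up spectrum, not the solution of the dispersion relations of this chapter, the function name is hypothetical, and the modal weights $1/M^{TX}$ are ignored. It uses the large-argument behavior of the Hankel function of the second kind, whose magnitude is proportional to $e^{\Im(\lambda_n)\rho}/\sqrt{|\lambda_n|\rho}$, and keeps a mode as long as its exponential damping factor exceeds a tolerance.

```python
import math


def modes_needed(mode_constants, rho, tol):
    """Count the modes whose damping factor exp(Im(lambda_n) * rho) >= tol."""
    return sum(1 for lam in mode_constants
               if math.exp(lam.imag * rho) >= tol)


# Hypothetical spectrum: imaginary parts grow more negative with the index n.
K0 = 200.0
MODES = [complex(K0, -0.25 * K0 * n) for n in range(1, 41)]

NEAR_COUNT = modes_needed(MODES, rho=0.01, tol=1e-6)
FAR_COUNT = modes_needed(MODES, rho=0.1, tol=1e-6)
```

With this made-up spectrum, far fewer modes survive the tolerance at the larger distance than at the smaller one, which is the behavior exploited when near and far interactions are treated differently.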

4. A PML-MLFMA FOR THE MODELING OF LARGE PLANAR MICROSTRIP STRUCTURES

4.1 Introduction and Outline

The problem geometries under consideration in this section are
(electrically) large planar microwave structures. Typical examples are
microstrip antenna arrays [Wang et al., 1998; Parrón et al., 2002] as depicted
in Fig. 7-2, reflect arrays [Pozar et al., 1997; Pozar et al., 1999], frequency
selective surfaces [Mittra et al., 1988; Kipp et al., 1994], and diffraction
gratings [Bledowski and Zakowicz, 1997]. The structures can be excited by
an illuminating plane wave (Eq. 7.2) or by port sources. Classical solution
techniques for these kinds of problems are provided in Section 2, by Mosig
and Gardiol (1985), and by Sercu et al. (1995). Such a method is typically
based on an EFIE that is solved by a MoM [Harrington, 1993]. The linear
system arising from the MoM can be solved using an iterative method (see
Section 1.2.2). Some drawbacks of these methods have already been discussed
in Section 1.1.3. Here we focus on the memory and computational complexity
of the technique. As stated before, the memory needed to store a linear
system in $N$ unknowns scales as $N^2$, because the moment matrix $Z$ is
dense. Also, one iteration of an iterative solution of the linear system,
being a matrix-vector multiplication $Y = Z \cdot X$, has a cost of
$O(N^2)$. An efficient new technique for structures as in Fig. 7-2, i.e.
with only one substrate on a ground plane, is presented here. From the
analysis, it will be clear that the technique is trivially extended to
general multilayered structures with metallizations at several levels
(Section 5.1.3).

5 The selfpatch interactions are described by the numbers $Z_{ii}$, $i = 1, \ldots, N$, i.e. the elements
of the moment matrix that are located on the diagonal. The near interactions are the
interactions $Z_{ij}$ between a source basis function $j$ and an observer test function $i$ that
are placed too close to each other to use the PML-series expansion conveniently.
Application of the PML-paradigm leads to a modal expansion of the
Green's dyadic $\mathbf{G}_{ee}$. This is a first important step in the
development of a new technique, since a closed-form expression for the
Green's dyadic is now at our disposal. Nevertheless, to implement an
MLFMA, the source and the observer have to be separated. The kernels of
the series (Eq. 7.39) are Hankel functions that depend on the distance
$\rho = |\mathbf{r} - \mathbf{r}'|$ between a source placed at
$\mathbf{r}' = (x', y', d)$ and an observer situated at
$\mathbf{r} = (x, y, d)$. Thus, the second step is to develop a
factorization of this Hankel kernel. We will use a plane wave
decomposition as factorization.
In Section 4.2 the mathematical formulation of the technique is
presented. First it is shown how to decompose the Hankel function in plane
waves and second, the PML-paradigm and the plane wave decomposition
are combined to obtain a separation between source and observer
contributions in the expression of the Green's dyadic. The implementation of
the technique is described in Section 4.3 and the complexity of the new
algorithm is briefly discussed in Section 4.4. Finally, in Section 4.5, several
illustrative numerical examples validate the proposed technique.

4.2 Formulation of the Technique

4.2.1 The Moment Matrix Written as Interactions Between Elementary Current Sources

In Section 2.4, the moment matrix was formed, describing the
interactions $Z_{ij}$ between a basis function $j$, $j = 1, \ldots, N$, and a
test function $i$, $i = 1, \ldots, N$. Eq. (7.29) is repeated here:

$$Z_{ij} = \int_{S_i} \int_{S_j} \mathbf{w}_i(\mathbf{r}) \cdot \mathbf{G}_{ee}(\mathbf{r}|\mathbf{r}') \cdot \mathbf{w}_j(\mathbf{r}') \, dS' \, dS \tag{7.40}$$

The functions $\mathbf{w}_i$ and $\mathbf{w}_j$ are rooftop functions, oriented in the
$x$-direction or the $y$-direction. $\mathbf{G}_{ee}$ is still the two-by-two
electric-electric Green's dyadic. The domains $S_i$ and $S_j$ are the
pertinent cells of the mesh over which the integrations extend. Each
interaction $Z_{ij}$ is rewritten as a set of interactions between
elementary current sources. This is achieved by a numerical evaluation of
the integrals in Eq. (7.40):

$$Z_{ij} \approx \sum_{v=1}^{V_i} \sum_{u=1}^{U_j} \boldsymbol{\iota}_{i,v} \cdot \mathbf{G}_{ee}(\mathbf{r}_v|\mathbf{r}_u) \cdot \boldsymbol{\iota}_{j,u} \tag{7.41}$$

where the current sources are given by

$$\boldsymbol{\iota}_{j,u} = q_u \mathbf{w}_j(\mathbf{r}_u), \qquad \boldsymbol{\iota}_{i,v} = q_v \mathbf{w}_i(\mathbf{r}_v) \tag{7.42}$$

The positions ($\mathbf{r}_u$ and $\mathbf{r}_v$) and weights ($q_u$ and $q_v$) of the
dipole sources are determined by a Gaussian quadrature rule [Abramowitz
and Stegun, 1970]. The number of Gauss quadrature points for the $j$th
basis function is given by $U_j$. The test function $i$ is represented by
$V_i$ quadrature points. The accuracy of Eq. (7.41) is to a great extent
determined by the number $V_i U_j$. Each rooftop function now corresponds
to a set of dipoles. In fact, by using Eq. (7.41), the notion of basis and
test functions can be abstracted away. From now on, we no longer consider
interactions between basis and test functions, but interactions between
elementary dipole sources.
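As a minimal sketch of Eqs. (7.41) and (7.42), the snippet below converts one $x$-directed rooftop into a set of elementary dipoles $q_u \mathbf{w}_j(\mathbf{r}_u)$. The 2-point Gauss-Legendre rule, the rooftop normalization, the cell layout and all function names are illustrative assumptions and are not taken from the chapter.

```python
import math

# 2-point Gauss-Legendre rule on [-1, 1]; chosen only for brevity (assumption).
GL_NODES = (-1.0 / math.sqrt(3.0), 1.0 / math.sqrt(3.0))
GL_WEIGHTS = (1.0, 1.0)


def rooftop_x(x, x_edge, half_len):
    """x-directed rooftop: 1 on the shared edge, 0 at x_edge +/- half_len."""
    return max(0.0, 1.0 - abs(x - x_edge) / half_len)


def rooftop_to_dipoles(x_edge, half_len, y0, y1):
    """Return [(position, strength)] with strength = q_u * w_j(r_u).

    Each of the two cells next to the edge gets its own tensor-product rule,
    so the kink of the rooftop at x_edge is never integrated across.
    """
    dipoles = []
    for xa, xb in ((x_edge - half_len, x_edge), (x_edge, x_edge + half_len)):
        for tx, wx in zip(GL_NODES, GL_WEIGHTS):
            for ty, wy in zip(GL_NODES, GL_WEIGHTS):
                # Map the reference nodes in [-1, 1] to the physical cell.
                x = 0.5 * (xa + xb) + 0.5 * (xb - xa) * tx
                y = 0.5 * (y0 + y1) + 0.5 * (y1 - y0) * ty
                q = wx * wy * 0.25 * (xb - xa) * (y1 - y0)  # weight q_u
                w = rooftop_x(x, x_edge, half_len)
                dipoles.append(((x, y), (q * w, 0.0)))      # x-directed dipole
    return dipoles


dipoles = rooftop_to_dipoles(x_edge=0.0, half_len=1.0, y0=0.0, y1=0.5)
total = sum(strength[0] for _, strength in dipoles)
```

Because the rooftop is linear on each cell, the summed dipole strengths reproduce the surface integral of the rooftop; higher-order rules only become necessary once the dipoles are combined with the Green's dyadic.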

4.2.2 Plane Wave Decomposition of the Hankel Function

In the literature many FMMs have been described that solve scattering
problems in free space environments fast and efficiently [Rokhlin, 1990;
Engheta et al., 1992; Lu and Chew, 1993; Hamilton et al., 1994]. These
Helmholtz equation FMMs rely on so-called diagonal translation operators,
viz. expansions of the pertinent Green's functions in terms of plane waves,
to represent fields produced by distributed sources at sufficiently separated
observers. Here, the focus is on a plane wave decomposition of the kernel of
the Green's dyadic (Eq. 7.39), viz. the Hankel function
$H_0^{(2)}(\beta |\mathbf{r} - \mathbf{r}'|)$.

Figure 7-7. Constellation of a source and an observation group.

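Consider the configuration of Fig. 7-7, consisting of a line source and an
observer, placed in a 2-D homogeneous space at $\mathbf{r}'$ and $\mathbf{r}$
respectively (see footnote 6). This source and observer belong to source
and observer constellations contained in circles of radius $R$ centered
about $\mathbf{r}_{sc}$ and $\mathbf{r}_{oc}$. It was shown by Chew et al. (2001, 2003)
that the kernel can be expressed as:

$$H_0^{(2)}(\beta |\mathbf{r} - \mathbf{r}'|) \approx \sum_{q=-Q}^{Q} e^{j \beta \hat{\mathbf{k}}(\phi_q) \cdot (\mathbf{r}' - \mathbf{r}_{sc})} \, T_q(\beta, r_{so}^{cc}, \phi_{so}^{cc}) \, e^{-j \beta \hat{\mathbf{k}}(\phi_q) \cdot (\mathbf{r} - \mathbf{r}_{oc})} = \sum_{q=-Q}^{Q} PW_q \tag{7.43}$$

with:

$$T_q(\beta, r, \phi) = \frac{1}{2Q+1} \sum_{q'=-Q}^{Q} H_{q'}^{(2)}(\beta r) \, e^{j q' (\phi - \phi_q - \pi/2)} \tag{7.44}$$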

6 Strictly speaking, the vectors $\mathbf{r}'$ and $\mathbf{r}$ indicate locations in a 2-D space and thus they are
independent of the $z$-coordinate, i.e. $\mathbf{r}' = (x', y')$ and $\mathbf{r} = (x, y)$. Since the results that
are obtained here will later only be applied in the plane $z = d$, we use the same notation
for these vectors as before.

where $\mathbf{r}_{so}^{cc} = \mathbf{r}_{oc} - \mathbf{r}_{sc}$, $r_{so}^{cc} = |\mathbf{r}_{so}^{cc}|$,
$\phi_{so}^{cc} = \arctan\left( \frac{\hat{\mathbf{y}} \cdot \mathbf{r}_{so}^{cc}}{\hat{\mathbf{x}} \cdot \mathbf{r}_{so}^{cc}} \right)$,
$\hat{\mathbf{k}}(\phi) = \cos\phi \, \hat{\mathbf{x}} + \sin\phi \, \hat{\mathbf{y}}$,
and $\phi_q = \frac{2 \pi q}{2Q+1}$, $q = -Q, \ldots, Q$. Eq. (7.43) realizes a plane wave
decomposition of the Hankel function. Its physical interpretation is
illuminating. The radiation pattern of the source group is sampled into
$2Q + 1$ outgoing plane waves (OPWs) referenced w.r.t. the center of the
source group and traveling in the directions $\phi_q$. Upon
multiplication by the translation operator (Eq. 7.44) the OPWs are converted
into $2Q + 1$ incoming plane waves (IPWs) referenced w.r.t. the center of the
observation group. Then the contribution of each plane wave is measured at
the observer. Expansion Eq. (7.43) is only valid when the source group and
the observation group are well-separated, meaning e.g. $r_{so}^{cc} > \gamma R$, with
(typically) $\gamma = 5$. If this condition is satisfied and if $\beta$ is real, then
$2Q + 1$, viz. the number of plane waves participating in the expansion, can
be chosen as

$$2Q + 1 = 4 \beta R + C (\beta R)^{1/3} \tag{7.45}$$

where $C$ is a constant that sets the accuracy level of the decomposition
[Chew et al., 2001]. Note, however, that the wavenumbers in Eq. (7.39) are
complex; therefore, the above (classical) estimate (Eq. 7.45)
does not apply. Hence, in our case, $Q$ will be determined using a simple
search procedure, as will be outlined in Section 4.4.3. The important
parameter $2Q + 1$ will often be called the sampling rate, since it can be
shown that this number corresponds closely to the number of samples used
in signal processing applications, as prescribed by Nyquist's theorem [Vande
Ginste, 2004].
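The classical estimate of Eq. (7.45) and the associated sampling directions can be illustrated with a short sketch. It assumes the reading $2Q + 1 = 4\beta R + C(\beta R)^{1/3}$ of Eq. (7.45), rounds up to an odd integer so that $q = -Q, \ldots, Q$, and uses hypothetical function names and example values ($C = 5$ in particular is only a placeholder).

```python
import math


def classical_sampling_rate(beta, radius, accuracy_constant):
    """Number of plane waves 2Q + 1 from Eq. (7.45); real beta only."""
    if complex(beta).imag != 0.0:
        raise ValueError("Eq. (7.45) assumes a real wavenumber")
    beta_r = complex(beta).real * radius
    estimate = 4.0 * beta_r + accuracy_constant * beta_r ** (1.0 / 3.0)
    n = math.ceil(estimate)
    return n if n % 2 == 1 else n + 1   # force an odd number of samples


def sample_angles(n_samples):
    """Directions phi_q = 2*pi*q / (2Q + 1) for q = -Q, ..., Q."""
    q_max = (n_samples - 1) // 2
    return [2.0 * math.pi * q / n_samples for q in range(-q_max, q_max + 1)]


# Assumed example values: beta = 2*pi (unit wavelength), R = 0.5, C = 5.
n_pw = classical_sampling_rate(beta=2.0 * math.pi, radius=0.5,
                               accuracy_constant=5.0)
angles = sample_angles(n_pw)
```

For the complex propagation constants of the PML-modes this estimate is not applicable, which is why the function refuses complex input instead of silently using the real part.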

4.2.3 Core Equation of the PML-MLFMA for Microstrip Structures

The results obtained above are now combined. Consider a dipole with
strength $\boldsymbol{\iota} = \iota_x \hat{\mathbf{x}} + \iota_y \hat{\mathbf{y}}$ located at
$\mathbf{r}' = (x', y', d)$. By using Eq. (7.4), it is easy to see that the field
radiated by this source at an observer placed on the substrate at position
$\mathbf{r} = (x, y, d)$ is given by:

$$\mathbf{E}_t(\mathbf{r}) = \mathbf{G}_{ee}(\rho, \phi) \cdot \boldsymbol{\iota} \tag{7.46}$$

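First the unit vectors

$$\hat{\mathbf{r}}_q = \cos\phi_q \, \hat{\mathbf{x}} + \sin\phi_q \, \hat{\mathbf{y}}, \qquad \hat{\boldsymbol{\phi}}_q = -\sin\phi_q \, \hat{\mathbf{x}} + \cos\phi_q \, \hat{\mathbf{y}} \tag{7.47}$$

are introduced and the following identities are derived from Eq. (7.43):

$$\nabla_t \nabla_t H_0^{(2)}(\beta \rho) \approx -\beta^2 \sum_{q=-Q}^{Q} PW_q \, \hat{\mathbf{r}}_q \hat{\mathbf{r}}_q, \qquad (\hat{\mathbf{z}} \times \nabla_t)(\hat{\mathbf{z}} \times \nabla_t) H_0^{(2)}(\beta \rho) \approx -\beta^2 \sum_{q=-Q}^{Q} PW_q \, \hat{\boldsymbol{\phi}}_q \hat{\boldsymbol{\phi}}_q \tag{7.48}$$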

The Green's dyadic in Eq. (7.46) is now written as the PML-series (Eq.
7.39) and every Hankel kernel in this series is replaced by its plane wave
decomposition (Eq. 7.43). These substitutions and Eqs. (7.48) yield the
following expression for the field at the observer:

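$$\mathbf{E}_t(\mathbf{r}) \approx \left[ \frac{1}{2} \sum_n \frac{1}{M^{TM}(\beta_{TM,n})} \sum_{q=-Q_{TM,n}}^{Q_{TM,n}} PW_q \, \hat{\mathbf{r}}_q \hat{\mathbf{r}}_q + \frac{1}{2} \sum_n \frac{1}{M^{TE}(\beta_{TE,n})} \sum_{q=-Q_{TE,n}}^{Q_{TE,n}} PW_q \, \hat{\boldsymbol{\phi}}_q \hat{\boldsymbol{\phi}}_q \right] \cdot \boldsymbol{\iota} \tag{7.49}$$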

where the subscripts on $Q_{TX,n}$ indicate that the number $Q$ in Eq. (7.43)
depends on the propagation constant $\beta_{TX,n}$ (see also further). Similar
subscripts on $PW_q$, $\hat{\mathbf{r}}_q$, and $\hat{\boldsymbol{\phi}}_q$ have been omitted for
simplicity.
Eq. (7.49) reflects the fact that, when calculating the field radiated by an
arbitrarily oriented dipole in the $z = d$ metallization plane, only the
longitudinal and transverse components of the plane wave decomposed electric
field are to be retained when considering TM- and TE-polarized fields,
respectively. This expression, in which source and observer contributions are
clearly separated, is the core formula of the new PML-MLFMA for microstrip
structures.

4.3 Implementation of the Technique

4.3.1 Construction of the MLFMA Tree

4.3.1.1 Grouping of the Dipole Sources at Several Levels



Figure 7-8. Top view of the metallization divided into groups on four levels.

A simple example is used to explain how the MLFMA tree is built.
Consider the metallization $S$ as shown in Fig. 7-8(a). The circuit consists
of a single microstrip line with a 90° corner. The metallization $S$ is
approximated by a (potentially nonuniform) rectilinear mesh with $N$
interior edges (see Section 2.4). These $N$ edges correspond to $N$ rooftop
functions, which in turn are represented by elementary dipole sources using
a Gaussian quadrature rule. The rooftops and sources are not shown in the
figure.
To build an MLFMA tree, the sources are divided into groups. The
grouping process is shown in Fig. 7-8. Contrary to what was done before
(Section 4.2.2), the groups are no longer circles but square boxes. First
one needs to decide on the minimal box size. This is the size of the groups
at the lowest level $l = 1$ of the MLFMA and it is an input parameter of the
algorithm. The minimal wavelength in the substrate characterized by
$\epsilon_r$ and $\mu_r$ is $\lambda_{sub} = \lambda_0 / \sqrt{\epsilon_r \mu_r}$, with $\lambda_0$ the
free space wavelength. The minimal box size, described by the length
$\delta_{min}$ of the side of the smallest squares, is typically chosen between
$\lambda_{sub}/4$ and $\lambda_{sub}$ for high frequency applications, i.e. for
structures that are electrically large.
Now a square box is determined that is big enough to enclose all the
sources and whose side length $\delta_{max}$ satisfies:

$$\delta_{max} = \delta_{min} \, 2^{L-1} \tag{7.50}$$

Here, $L$ is an integer, indicating the number of levels used in the
algorithm. For the example of Fig. 7-8 the number of levels is $L = 4$. The
big box is named box 0 and it is subdivided into four equal squares. These
new boxes are the groups at the level $l = L - 1$ and they are numbered
according to the quadrant they represent in box 0 (see Fig. 7-8(b)). This
process of subdividing and numbering is continued recursively until the
lowest level is reached. Because of Eq. (7.50), the boxes at level $l = 1$
automatically have the desired size $\delta_{min}$.
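The grouping procedure can be sketched as follows. This is a minimal illustration under stated assumptions: the root box is anchored at the lower-left corner of the source cloud, the numbering rule child = 4 * parent + k with k = 1, ..., 4 is inferred from the box numbers quoted for Figs. 7-8 and 7-9 (the quadrant order is a guess), and the function names are hypothetical.

```python
def number_of_levels(extent, delta_min):
    """Smallest L with delta_min * 2**(L - 1) >= extent, cf. Eq. (7.50)."""
    levels = 1
    while delta_min * 2 ** (levels - 1) < extent:
        levels += 1
    return levels


def build_tree(points, delta_min):
    """Group points into square boxes on L levels; empty boxes never appear.

    Returns (L, tree) where tree[level] maps a box number to the indices of
    the points it contains. Root box = 0 at level L; child = 4*parent + k,
    with k = 1 + ix + 2*iy inside the parent (assumed quadrant order).
    """
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    x0, y0 = min(xs), min(ys)
    extent = max(max(xs) - x0, max(ys) - y0)
    levels = number_of_levels(extent, delta_min)
    delta_max = delta_min * 2 ** (levels - 1)

    tree = {level: {} for level in range(1, levels + 1)}
    for idx, (x, y) in enumerate(points):
        box, bx, by, size = 0, x0, y0, delta_max
        tree[levels].setdefault(box, []).append(idx)
        for level in range(levels - 1, 0, -1):
            size /= 2.0
            ix = 1 if x >= bx + size else 0      # right half of the parent?
            iy = 1 if y >= by + size else 0      # upper half of the parent?
            bx, by = bx + ix * size, by + iy * size
            box = 4 * box + 1 + ix + 2 * iy
            tree[level].setdefault(box, []).append(idx)
    return levels, tree


# L-shaped set of sources (a line with a 90-degree corner), delta_min = 1.
sources = [(0.5 + i, 0.5) for i in range(8)] + [(7.5, 1.5 + i) for i in range(7)]
L, tree = build_tree(sources, delta_min=1.0)
```

Only boxes that actually receive a source are entered in the dictionaries, which mirrors the remark that empty boxes are not retained in the tree.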

Figure 7-9. MLFMA tree corresponding to the structure of Fig. 7-8.



In the computer memory a tree structure with $L$ levels is stored. Note
that, for the example of the simple bent microstrip line, several boxes are
empty. Of course, these boxes are not retained in the tree. The MLFMA tree
structure corresponding to the example is shown in Fig. 7-9.
This paragraph is concluded by defining the terminology typically used for
the MLFMA tree. For the tree of Fig. 7-9, box 15 is called a child of box 3.
Vice versa, box 3 is the parent of boxes 15 and 16. With a slight abuse of
terminology, all boxes on one level (e.g. boxes 1, 3, and 4) are called
siblings.

4.3.1.2 Determination of the Interactions

The sizes of the groups (i.e. the boxes) in this chapter are determined by
$R_l$, $l = 1, \ldots, L$, i.e. the radius of the circumscribed circle of the
boxes at level $l$. Put differently, $2 R_l$ is the length of the diagonal of
the squares at level $l$. Each box $b$ has a center $\mathbf{r}_b^c$, which
coincides with the center of the circumscribing circle. In Section 4.2.2 it
is indicated that the accuracy of the plane wave decomposition, and hence
the necessary sampling rate, depends on the distance between the groups and
on the size of the groups. There the parameter $\gamma$ is introduced to
determine whether two groups are situated far enough from each other to be
able to use the plane wave decomposition. This leads to the following
definition:

Definition 7-1: Well-Separated

A source box $b'$ and an observation box $b$, both residing at level $l$ in
the MLFMA tree and both of size $R_l$, are well-separated when the distance
between their centers satisfies:

$$r_{b'b}^{cc} = |\mathbf{r}_b^c - \mathbf{r}_{b'}^c| > \gamma R_l \tag{7.51}$$

Typically, $\gamma$ is chosen between four and six, assuring sufficient
convergence and accuracy of the plane wave decomposition. In the
multiplication algorithm the interactions between elementary current sources
are calculated. Therefore, it has to be decided which pairs of sources are
placed far enough from each other to use the PML-MLFMA paradigm. In a
two-level FMM there are only two possibilities: the sources are members of
groups that are well-separated, and hence it is safe to use the FMM, or they
reside in groups that are too close to each other, in which case the interaction
remains to be calculated with a classical technique. In a multilevel
framework, this becomes somewhat more complicated. The following
definitions classify two boxes at a certain level as a near-field pair or as a
far-field pair:

Definition 7-2: Far-Field Pair


A pair of (nonempty) boxes is a far-field pair if they are well-separated
and if their parents are not well-separated.

Definition 7-3: Near-Field Pair


Two boxes at the lowest level that are not well-separated form a near-
field pair.

With these definitions, every pair of sources belongs to one and only one
far- or near-field pair. Near interactions only appear at level 1, as
expected. Far interactions, however, can appear at any level of the
multilevel algorithm, as long as the boxes the sources belong to form a
far-field pair according to Def. 7-2. In this way, for each pair of sources
residing in a corresponding pair of boxes, the interaction is always
evaluated at the highest possible level.
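Definitions 7-1 to 7-3 translate into a short classification routine. The sketch below is an illustration under assumptions: boxes are identified by integer grid indices rather than by the box numbers of Fig. 7-8, the strict inequality of Def. 7-1 is used, and each source pair is assigned to the highest level at which its ancestor boxes are well-separated, as described in the text.

```python
import math
from itertools import combinations_with_replacement


def well_separated(a, b, level, delta_min, gamma):
    """Def. 7-1 for two boxes given by integer grid indices at `level`."""
    side = delta_min * 2 ** (level - 1)
    radius = side / math.sqrt(2.0)          # R_l: half the box diagonal
    dist = side * math.hypot(a[0] - b[0], a[1] - b[1])
    return dist > gamma * radius


def classify_pairs(level1_boxes, n_levels, delta_min, gamma):
    """Return (far, near): far[level] and near are sets of box pairs.

    The box at level l that contains a level-1 box (ix, iy) is addressed as
    (ix >> (l - 1), iy >> (l - 1)); this indexing is an assumed convention.
    """
    far = {level: set() for level in range(1, n_levels + 1)}
    near = set()
    for a, b in combinations_with_replacement(sorted(level1_boxes), 2):
        # Walk down from the top: the first level at which the ancestors are
        # well-separated is the highest possible level for this interaction.
        for level in range(n_levels, 0, -1):
            shift = level - 1
            pa = (a[0] >> shift, a[1] >> shift)
            pb = (b[0] >> shift, b[1] >> shift)
            if well_separated(pa, pb, level, delta_min, gamma):
                far[level].add((pa, pb))
                break
        else:
            near.add((a, b))                 # never separated: near-field pair
    return far, near


# Straight row of six level-1 boxes with gamma = 2.5.
boxes = [(i, 0) for i in range(6)]
far, near = classify_pairs(boxes, n_levels=3, delta_min=1.0, gamma=2.5)
```

With $\gamma = 2.5$ two boxes are well-separated exactly when they do not touch, so the near set contains each box paired with itself and with its direct neighbours.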

Table 7-1. Classification of the box-box interaction pairs for the example of Fig. 7-8 with
$\gamma = 2.5$.

Far interactions at level 2 | Far interactions at level 1 | Near interactions
8-15     15-19              | 35-71    70-80              | 35-35    70-71
8-16     15-20              | 63-67    70-82              | 35-70    71-71
8-19     16-17              | 63-68    70-83              | 63-63    71-82
8-20     16-20              | 64-68    71-79              | 63-64    79-79
15-17                       | 67-79    71-80              | 64-64    79-80
                            | 67-80    71-83              | 64-67    80-80
                            | 68-80    79-82              | 67-67    80-82
                            | 70-79    79-83              | 67-68    80-83
                            |                             | 68-68    82-82
                            |                             | 68-79    82-83
                            |                             | 70-70    83-83

This theory is illustrated with the example of the previous section (Fig.
7-8). In Table 7-1 the interaction pairs are shown for $\gamma = 2.5$. In
practice, this number is too small, but it is used here for convenience:
with this $\gamma$, two boxes that do not touch each other are well-separated
according to Def. 7-1. Given the so-called interaction table (Table 7-1),
note that for every pair of rooftop functions that reside in the same box
at level 1, the interaction obviously has to be calculated with a classical
technique. This is also the case for rooftops that reside in adjacent boxes
at the lowest level (e.g. the pairs 67-68 and 80-82). However, at the first
level, there are also many interactions between dipole sources that can be
determined using the PML-MLFMA paradigm, e.g. for the sources of the pairs
63-67 and 71-80. Although the box pair 63-79 satisfies Def. 7-1, it is not a
far-field pair, because the parents, boxes 15 and 19 respectively, are
also well-separated (see Def. 7-2). This parent pair 15-19 does form a
far-field pair. Therefore, the interactions between all the sources residing
in box 15 and all the sources residing in box 19 are calculated using the
fast technique at level 2. This makes the determination of the interactions
between the children of box 15 and the children of box 19 redundant, because
it has already been done at a higher level. The multilevel approach adopted
in this section yields an important gain in memory requirements and CPU
time (see Section 4.4.1).

4.3.2 The Matrix-Vector Multiplication

In this section, it is clarified how a matrix-vector multiplication
$Y = Z \cdot X$ is implemented, using the tree structure and the interaction
table (see above). This part of the code is called repeatedly by the
iterative solver until an acceptable solution of the linear system is
reached. The tree structure and the interaction table are determined and
stored in the computer memory during the setup phase of the algorithm.

4.3.2.1 Near Interactions


Contributions to the right hand side of Eq. (7.40) due to spatial basis
functions contained in near-field pairs are evaluated classically, that is,
without using PML and MLFMA concepts. For every test function
$i$, $i = 1, \ldots, N$, this part is written as

$$Y_i^{near} = \sum_{j \in S_i^{near}} Z_{ij}^{near} X_j \tag{7.52}$$

The interaction table determines the sets $S_i^{near}$, $i = 1, \ldots, N$,
containing the rooftops $j$ for which the pair $i$-$j$ is placed near to each
other. In the setup phase the corresponding elements of the moment matrix
$Z_{ij}^{near}$ are calculated and stored. One definitely does not want to
determine these elements on the fly, i.e. in every iteration, since this
would be computationally very expensive because of the troublesome
Sommerfeld integrals. In every iteration step, viz. multiplication, and for
every test function $i$, the algorithm loads the appropriate matrix elements
$Z_{ij}^{near}$, $j \in S_i^{near}$, from memory and multiplies them with the
corresponding elements $X_j$ of the test vector, as in Eq. (7.52). For every
$i$, $i = 1, \ldots, N$, the result $Y_i^{near}$ is stored. This part of
the code corresponds to what is done classically.
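The near-field part of the product is an ordinary sparse matrix-vector multiplication. A minimal sketch, assuming a row-wise list storage of the precomputed elements (the storage layout, the function name and the numerical values are hypothetical):

```python
def near_matvec(near_rows, x):
    """Evaluate Y_i^near = sum over j in S_i^near of Z_ij^near * X_j (Eq. 7.52).

    near_rows[i] is a list of (j, Z_ij) tuples computed once in the setup
    phase; no other entries of the moment matrix are ever stored.
    """
    return [sum(z_ij * x[j] for j, z_ij in row) for row in near_rows]


# Hypothetical 4-unknown example: every test function couples to neighbours only.
near_rows = [
    [(0, 2.0 + 1.0j), (1, 0.5j)],
    [(0, 0.5j), (1, 2.0 + 1.0j), (2, 0.5j)],
    [(1, 0.5j), (2, 2.0 + 1.0j), (3, 0.5j)],
    [(2, 0.5j), (3, 2.0 + 1.0j)],
]
x = [1.0, 0.0, -1.0, 2.0]
y_near = near_matvec(near_rows, x)
```

The cost per iteration is proportional to the number of stored near-field elements, not to $N^2$.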

4.3.2.2 Far Interactions at the Lowest Level


First it is explained how to calculate the contributions from the far
interactions at the lowest level $l = 1$ to the total matrix-vector
multiplication. This contribution to $Y = Z \cdot X$ is, for each test
function $i$, $i = 1, \ldots, N$, denoted as:

$$Y_i^{far} = \sum_{j \in S_i^{far,\, l=1}} Z_{ij}^{far} X_j \tag{7.53}$$

For each test function $i$ we are only interested in interactions with basis
functions $j \in S_i^{far,\, l=1}$. Suppose that the test function $i$ resides in
an observation box $b$; then the set $S_i^{far,\, l=1}$ contains all $j$ that
reside in source boxes $b'$ for which the pairs $b'$-$b$ form far-field pairs
at level $l = 1$. The summation (Eq. 7.53) can be rewritten using the theory
of Section 4.2 and more specifically by applying the core formula (Eq. 7.49).
In the MLFMA these multiplications are calculated in several steps.

Step 1: Calculation of the OPWs


The radiation patterns of each (nonempty) source box at level $l = 1$ are
calculated and sampled. This has to be done for each TX-polarized PML-mode
$n$ used in the algorithm. Consider a source box $b'$, containing a set of
rooftops $j$ ($j \in b'$). Each $j$ corresponds to $U_j$ dipole sources
$\boldsymbol{\iota}_{j,u}$. The multiplication of the moment matrix with a test
vector $X$ can be seen as the interaction between dipole sources (see
above). Each dipole source $\boldsymbol{\iota}_{j,u}$, residing at $\mathbf{r}_u$, has a
strength $q_u X_j$, where $q_u$ is determined by a Gaussian quadrature rule
(see Eq. (7.42)) and $X_j$ is the appropriate element of the test vector.
For each TM-mode $n$, the outgoing plane waves of box $b'$ can be
calculated as:

$$OPW_{TM,n,q}^{b'} = \sum_{j \in b'} \sum_{u=1}^{U_j} e^{j \beta_{TM,n} \hat{\mathbf{k}}(\phi_q) \cdot (\mathbf{r}_u - \mathbf{r}_{b'}^c)} \left( \hat{\mathbf{r}}_q \cdot X_j \boldsymbol{\iota}_{j,u} \right) \tag{7.54}$$

and $q = -Q_{TM,n}, \ldots, Q_{TM,n}$. The OPWs of box $b'$ for each TE-mode $n$
are given by:

$$OPW_{TE,n,q}^{b'} = \sum_{j \in b'} \sum_{u=1}^{U_j} e^{j \beta_{TE,n} \hat{\mathbf{k}}(\phi_q) \cdot (\mathbf{r}_u - \mathbf{r}_{b'}^c)} \left( \hat{\boldsymbol{\phi}}_q \cdot X_j \boldsymbol{\iota}_{j,u} \right) \tag{7.55}$$

and $q = -Q_{TE,n}, \ldots, Q_{TE,n}$. The OPWs of Eqs. (7.54) and (7.55) have to be
calculated and stored for every PML-mode in every iteration.
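Step 1 amounts to a double loop over sampling directions and dipoles. The following sketch aggregates the dipoles of one box into OPWs for a single mode; the data layout, the function name, the example propagation constant and the sign of the exponent (taken here as $e^{+j\beta \hat{\mathbf{k}}(\phi_q) \cdot (\mathbf{r}_u - \mathbf{r}_{b'}^c)}$) are assumptions made for illustration.

```python
import cmath
import math


def outgoing_plane_waves(beta, q_max, center, dipoles, polarization):
    """Sample the radiation pattern of one box into 2*q_max + 1 OPWs.

    beta         -- (complex) propagation constant of the PML-mode
    center       -- (x, y) of the box center
    dipoles      -- list of ((x, y), (ix, iy)); strengths already include X_j
    polarization -- "TM" projects on r_q (Eq. 7.54), "TE" on phi_q (Eq. 7.55)
    """
    n = 2 * q_max + 1
    opws = []
    for q in range(-q_max, q_max + 1):
        phi = 2.0 * math.pi * q / n
        kx, ky = math.cos(phi), math.sin(phi)
        if polarization == "TM":
            ux, uy = kx, ky              # longitudinal unit vector r_q
        else:
            ux, uy = -ky, kx             # transverse unit vector phi_q
        total = 0j
        for (x, y), (ix, iy) in dipoles:
            arg = kx * (x - center[0]) + ky * (y - center[1])
            total += cmath.exp(1j * beta * arg) * (ux * ix + uy * iy)
        opws.append(total)
    return opws


# One x-directed unit dipole at the box center: every phase factor equals 1.
dip = [((0.0, 0.0), (1.0, 0.0))]
opw_tm = outgoing_plane_waves(1.2 - 0.3j, 3, (0.0, 0.0), dip, "TM")
opw_te = outgoing_plane_waves(1.2 - 0.3j, 3, (0.0, 0.0), dip, "TE")
```

For this single centered dipole the TM samples reduce to $\cos\phi_q$ and the TE samples to $-\sin\phi_q$, which is a convenient sanity check for an implementation.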

Step 2: Only Needed for Higher Levels ( l > 1) , see further.

Step 3: Translation to IPWs


For each TX-polarized PML-mode $n$ the OPWs are translated into sets
of IPWs arriving at box $b$:

$$IPW_{TX,n,q}^{b'b} = T_q(\beta_{TX,n}, r_{b'b}^{cc}, \phi_{b'b}^{cc}) \, OPW_{TX,n,q}^{b'} \tag{7.56}$$

for $q = -Q_{TX,n}, \ldots, Q_{TX,n}$ and for all boxes $b'$ that form a far-field
pair with box $b$. We add all these IPWs (Eq. 7.56) at box $b$, such that for
each PML-mode $n$ one set of IPWs is obtained:

$$IPW_{TX,n,q}^{b} = \sum_{b'} IPW_{TX,n,q}^{b'b} \tag{7.57}$$

for $q = -Q_{TX,n}, \ldots, Q_{TX,n}$. Of course, the summation in Eq. (7.57) runs
exclusively over boxes $b'$ that form a far-field pair with box $b$. The
translation elements $T_q(\beta_{TX,n}, r_{b'b}^{cc}, \phi_{b'b}^{cc})$ are preferably
stored during the setup phase because their calculation requires the
evaluation of many Hankel functions (see Eq. (7.44)). To this end, the AMOS
library [Amos, 1986; http://www.netlib.org] can be used.

Step 4: only needed for higher levels ( l > 1) , see further.

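Step 5: Measuring the IPWs at the observers and summation of PML-mode contributions

In the final step of the algorithm, the contribution of each IPW is
measured at the sources representing the test function $i$. All
$2Q_{TX,n} + 1$ samples are then added. For TM-mode $n$ one obtains:

$$Y_{TM,n,i}^{far} = \sum_{q=-Q_{TM,n}}^{Q_{TM,n}} \sum_{v=1}^{V_i} e^{-j \beta_{TM,n} \hat{\mathbf{k}}(\phi_q) \cdot (\mathbf{r}_v - \mathbf{r}_b^c)} \left( \hat{\mathbf{r}}_q \cdot \boldsymbol{\iota}_{i,v} \right) IPW_{TM,n,q}^{b} \tag{7.58}$$

And for TE-mode $n$:

$$Y_{TE,n,i}^{far} = \sum_{q=-Q_{TE,n}}^{Q_{TE,n}} \sum_{v=1}^{V_i} e^{-j \beta_{TE,n} \hat{\mathbf{k}}(\phi_q) \cdot (\mathbf{r}_v - \mathbf{r}_b^c)} \left( \hat{\boldsymbol{\phi}}_q \cdot \boldsymbol{\iota}_{i,v} \right) IPW_{TE,n,q}^{b} \tag{7.59}$$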

The mode contributions of Eqs. (7.58) and (7.59) are now weighted and
added. For a test function $i$, residing in box $b$, the result of the
multiplication involving the far interactions at level 1 is given by:

$$Y_i^{far} = \frac{1}{2} \sum_n \frac{1}{M^{TM}(\beta_{TM,n})} Y_{TM,n,i}^{far} + \frac{1}{2} \sum_n \frac{1}{M^{TE}(\beta_{TE,n})} Y_{TE,n,i}^{far} \tag{7.60}$$

4.3.2.3 Far Interactions at Higher Levels


One might consider using the same routine (Step 1, Step 3, Step 5) at
higher levels. Unfortunately, calculating the OPWs using Eqs. (7.54) and
(7.55) and measuring the IPWs using Eqs. (7.58) and (7.59) for boxes
residing at levels $l > 1$ becomes very expensive, since these boxes contain
many dipole sources. Also, at higher levels, the sampling rates
$2Q_{TX,n} + 1$ increase, since the size $R_l$ of the groups (i.e. the boxes)
increases. In fact, calculating the OPWs and measuring the IPWs in
this way destroys the computational and memory complexity of the
multilevel algorithm. Therefore, a different technique has to be adopted
[Gyure and Stalzer, 1998].

Figure 7-10. Schematic representation of going up and down the MLFMA tree.

Step 2: Calculation of the OPWs at levels l > 1


For the determination of the OPWs of a parent box $B'$ residing at level
$l + 1$, information that was previously calculated at level $l$ is used. The
procedure is shown schematically in Fig. 7-10. We start from the sampled
radiation patterns of the (nonempty) children $b'$ of box $B'$. These have
sampling rates $2Q_{TX,n}^{l} + 1$, where the superscript index $l$ indicates
the level. The sampling rate $2Q_{TX,n}^{l+1} + 1$ for the parent box $B'$ is
higher, since this box is bigger; thus $Q_{TX,n}^{l+1} > Q_{TX,n}^{l}$. Therefore,
the procedure is as follows. First the sampled radiation patterns of the
children are interpolated. Then these interpolated radiation patterns are
shifted to the center of the parent box and added. These two steps, as
indicated in the figure, are now discussed.

Figure 7-11. Schematic representation of the interpolation from seven samples to eleven
samples.

In order to achieve a fast interpolation step, Fast Fourier Transforms
(FFTs) [Press et al., 1992] are used. The technique can be explained by
means of Fig. 7-11. The radiation pattern of box $b'$ is sampled into
$2Q_{TX,n}^{l} + 1$ samples for every mode. The one-dimensional FFT of this
sampled radiation pattern $F(\phi_q)$, from the spatial $\phi_q$-domain to the
spectral $\omega_q$-domain, also yields $2Q_{TX,n}^{l} + 1$ samples. The result of
this operation is denoted as $\tilde{F}(\omega_q)$. Note that at high
frequencies $\omega_q$, the magnitude of these samples in the spectral domain
is very low, almost zero. This is due to the fact that the far-field is
(quasi) bandlimited. Hence, it is safe to add some additional samples with
magnitude zero in the spectral domain. These so-called padding zeroes do not
affect the spectral content of the far-field. The number of padding zeroes
that have to be added is
$(2Q_{TX,n}^{l+1} + 1) - (2Q_{TX,n}^{l} + 1) = 2(Q_{TX,n}^{l+1} - Q_{TX,n}^{l})$.
Now, the radiation pattern $\tilde{F}(\omega_q)$ in the spectral domain,
sampled with the number of samples needed at level $l + 1$, i.e.
$2Q_{TX,n}^{l+1} + 1$, is obtained. Calculating the inverse FFT of
$\tilde{F}(\omega_q)$, described with $2Q_{TX,n}^{l+1} + 1$ samples, yields the
desired $2Q_{TX,n}^{l+1} + 1$ samples of the radiation pattern $F(\phi_q)$ in the
spatial domain.
Performing the shifts is very easy: each OPW is simply multiplied by
an appropriate shift factor. Suppose that the child box $b'$ is centered at
$\mathbf{r}_{b'}^c$ and that the parent box $B'$ is centered at $\mathbf{r}_{B'}^c$; this
shift factor is then given by:

$$S_{TX,n,q}^{b'B'} = e^{j \beta_{TX,n} \hat{\mathbf{k}}(\phi_q) \cdot (\mathbf{r}_{b'}^c - \mathbf{r}_{B'}^c)} \tag{7.61}$$

for all samples $q = -Q_{TX,n}^{l+1}, \ldots, Q_{TX,n}^{l+1}$ and for each PML-mode $n$.

So, for the calculation of the OPWs of each box and for each PML-mode
at levels $l > 1$ the process of interpolating and shifting is used, going
upwards in the tree. It is important that the interpolation is done first
and the shifting thereafter. If one shifts first, one tries to describe the
radiation pattern of the parent box at level $l + 1$ with only $2Q_n^{l} + 1$
samples. This is not correct and interpolating is no longer possible since
information is lost during the shifting. The FFTs and inverse FFTs can be
performed with the FFTW library [http://www.fftw.org].
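The interpolate-then-shift step can be sketched in a few lines. To stay free of dependencies the sketch uses a plain $O(n^2)$ discrete Fourier transform where a production code would call an FFT; the normalization, the function names, the test pattern and the example propagation constant are assumptions, and the shift factor follows the sign convention adopted for Eq. (7.61) above.

```python
import cmath
import math


def angles(q_max):
    """Sampling directions phi_q = 2*pi*q/(2Q+1), q = -Q, ..., Q."""
    n = 2 * q_max + 1
    return [2.0 * math.pi * q / n for q in range(-q_max, q_max + 1)]


def interpolate(samples, q_new):
    """Resample a periodic pattern from 2Q+1 to 2Q'+1 angles (Q' >= Q).

    The 2Q+1 Fourier coefficients are computed, the 2(Q'-Q) higher harmonics
    are implicitly zero (the padding zeroes), and the series is re-evaluated.
    """
    n_old = len(samples)
    q_old = (n_old - 1) // 2
    phis = angles(q_old)
    coeffs = {
        m: sum(f * cmath.exp(-1j * m * p) for f, p in zip(samples, phis)) / n_old
        for m in range(-q_old, q_old + 1)
    }
    return [sum(c * cmath.exp(1j * m * p) for m, c in coeffs.items())
            for p in angles(q_new)]


def shift_to_parent(opws, beta, child_center, parent_center):
    """Multiply each OPW by the shift factor of Eq. (7.61)."""
    q_max = (len(opws) - 1) // 2
    dx = child_center[0] - parent_center[0]
    dy = child_center[1] - parent_center[1]
    return [w * cmath.exp(1j * beta * (math.cos(p) * dx + math.sin(p) * dy))
            for w, p in zip(opws, angles(q_max))]


def pattern(phi):
    """Band-limited test pattern (harmonics up to |m| = 3), purely illustrative."""
    return 1.0 + 2.0 * math.cos(2.0 * phi) + math.sin(3.0 * phi)


coarse = [pattern(p) for p in angles(3)]          # 7 samples (child level)
fine = interpolate(coarse, q_new=5)               # 11 samples (parent level)
parent = shift_to_parent(fine, 1.2 - 0.3j, (0.25, 0.25), (0.0, 0.0))
```

For a pattern that is exactly bandlimited, as in this example, the interpolated samples coincide with the pattern evaluated at the finer set of angles; for real radiation patterns the agreement is only as good as the (quasi) bandlimited assumption.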

Step 4: Calculation of the IPWs at level $l = 1$ due to interactions at levels $l > 1$

First, note that step 3 remains basically the same, but is now
extended to all levels that contain far-field pairs. After step 3, IPWs
are thus available for each nonempty box and for each mode at all levels.
Before going to step 5, the IPWs at the levels $l > 1$ have to be
transferred down to level 1. To this end, the inverse procedure of step 2
is applied. This is also shown in Fig. 7-10. The IPWs of a parent box $B$ at
level $l$ are first shifted to the four child boxes $b$ at level $l - 1$. Then
the sampling rate is reduced from $2Q_{TX,n}^{l} + 1$ to $2Q_{TX,n}^{l-1} + 1$ by
means of adjoint interpolation, also called anterpolation. The result is
added to the IPWs that were already present at the lower level $l - 1$.
The shifting step is again very easy. Suppose that the child box $b$ is
centered at $\mathbf{r}_b^c$ and that the parent box $B$ is centered at
$\mathbf{r}_B^c$; then the appropriate shift factor is given by:

$$S_{TX,n,q}^{Bb} = e^{-j \beta_{TX,n} \hat{\mathbf{k}}(\phi_q) \cdot (\mathbf{r}_b^c - \mathbf{r}_B^c)} \tag{7.62}$$

for all samples $q = -Q_{TX,n}^{l}, \ldots, Q_{TX,n}^{l}$ and for each PML-mode $n$.

Figure 7-12. Schematic representation of the anterpolation from eleven samples to seven
samples.

For the anterpolation, again FFTs are used. The procedure is shown in
Fig. 7-12. After the shifting, the radiation pattern of the child box is
represented with more samples than needed. Therefore, in the spectral
domain, the samples at high frequencies have a magnitude that is (nearly)
zero. These $2(Q_{TX,n}^{l} - Q_{TX,n}^{l-1})$ samples are redundant and they can
be omitted without loss of information. Going back to the spatial domain
then yields the sampled radiation pattern with the desired sampling rate at
level $l - 1$.

So, for the calculation of the IPWs of each box and for each PML-mode
at level $l = 1$ the process of shifting and anterpolating is used, going
downwards in the tree. It is important that the anterpolation is done after
the shifting. If one anterpolates first, one tries to describe the radiation
pattern of the parent box at level $l$ with only $2Q_n^{l-1} + 1$ samples. In
this way, information is lost that cannot be recovered.
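The downward pass can be sketched in the same spirit: shift first, then drop the highest harmonics. The code follows the spectral truncation described in the text, with a plain $O(n^2)$ discrete Fourier transform standing in for the FFT; the $1/(2Q+1)$ normalization of the forward transform, the function names, the test pattern and the sign of the shift factor are assumptions of this sketch.

```python
import cmath
import math


def angles(q_max):
    """Sampling directions phi_q = 2*pi*q/(2Q+1), q = -Q, ..., Q."""
    n = 2 * q_max + 1
    return [2.0 * math.pi * q / n for q in range(-q_max, q_max + 1)]


def shift_to_child(ipws, beta, child_center, parent_center):
    """Multiply each IPW by the shift factor of Eq. (7.62)."""
    q_max = (len(ipws) - 1) // 2
    dx = child_center[0] - parent_center[0]
    dy = child_center[1] - parent_center[1]
    return [w * cmath.exp(-1j * beta * (math.cos(p) * dx + math.sin(p) * dy))
            for w, p in zip(ipws, angles(q_max))]


def anterpolate(samples, q_new):
    """Reduce a periodic pattern from 2Q+1 to 2Q'+1 samples (Q' <= Q).

    Only the harmonics |m| <= Q' are computed, i.e. the 2(Q-Q') highest
    harmonics are dropped, and the truncated series is evaluated at the
    coarser set of angles.
    """
    n_old = len(samples)
    phis = angles((n_old - 1) // 2)
    kept = {
        m: sum(f * cmath.exp(-1j * m * p) for f, p in zip(samples, phis)) / n_old
        for m in range(-q_new, q_new + 1)
    }
    return [sum(c * cmath.exp(1j * m * p) for m, c in kept.items())
            for p in angles(q_new)]


def pattern(phi):
    """Illustrative pattern whose harmonics stop at |m| = 3."""
    return 0.5 + math.cos(phi) - 2.0 * math.sin(3.0 * phi)


fine = [pattern(p) for p in angles(5)]            # 11 samples (parent level)
coarse = anterpolate(fine, q_new=3)               # 7 samples (child level)
# Order used in the downward pass: shift to the child center, then anterpolate.
shifted = shift_to_child(fine, 1.2 - 0.3j, (0.25, 0.0), (0.0, 0.0))
child_ipws = anterpolate(shifted, q_new=3)
```

In the unshifted check the pattern contains no harmonics above $|m| = 3$, so reducing eleven samples to seven loses nothing; the shifted case shows the order of operations used in the tree.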

4.3.2.4 Summary
The iterative solver proposes a solution, viz. the test vector $X$. This
vector has to be multiplied with the moment matrix, yielding

$$Y = Z \cdot X \tag{7.63}$$

First, for each test function $i$ the near interactions as expressed in Eq.
(7.52) are calculated. The results $Y_i^{near}$, $i = 1, \ldots, N$, are stored.
Next, the procedure for the far interactions is followed:

Step 1: Calculation of all the OPWs of the nonempty boxes at the lowest
level ($l = 1$) and for each PML-mode $n$, using Eq. (7.54) and Eq. (7.55);
Step 2: Determination of the OPWs of the nonempty boxes at higher levels
($l > 1$) for each mode $n$, using interpolation and shifting, going upwards
in the tree;
Step 3: Multiplication of the OPWs at all levels with the translation
operators, yielding the IPWs for each box and for each mode $n$ at all levels
(see Eqs. (7.56) and (7.57));
Step 4: By shifting and anterpolating, going downwards in the tree, the
IPWs at level $l = 1$ are obtained for each mode $n$;
Step 5: The contributions of the IPWs at the lowest level of a box are
measured at the observers in that particular box for each mode $n$, using
Eqs. (7.58) and (7.59), yielding a TM- or a TE-contribution to the
far-field interactions. All mode contributions are then weighted and added,
using Eq. (7.60).

These five steps yield the contributions from the basis functions $j$ that
are placed far from the test function $i$. The result $Y_i^{far}$ is added to
$Y_i^{near}$ for each $i$, $i = 1, \ldots, N$, leading to the desired result $Y$.

4.4 Some Important Remarks about the Complexity of the PML-MLFMA

4.4.1 Memory and Computational Complexity

The application of the PML-paradigm converts the 3-D layered medium
problem into a set of 2-D homogeneous space problems. Each member of
this set corresponds to one PML-mode used in the modal expansion of the
Green's dyadic. For elaborate calculations of the complexity of 2-D free
space MLFMAs, we refer to Chew et al. (2001). The complexity computations
are not repeated here; the results suffice.
On the one hand, $S$ can represent dense metallizations, meaning that
every parent box contains four nonempty children, and hence, the MLFMA
tree will be a full quad tree. Then, both the memory and computational
complexity are of order $O(N)$. The structures of interest here, e.g. planar
antenna arrays, typically meet the requirements of this best case scenario.
On the other hand, when a binary tree is built, the memory and
computational complexity are $O(N \log N)$ and $O(N \log^2 N)$ respectively.
This is the worst (unrealistic) case, corresponding to very sparse
metallizations, such as small, very long microstrip traces.
Contrary to what the reader might expect, the above mentioned complexities
are not linearly dependent on the number of modes used in the
algorithm. This is clarified in the next section.

4.4.2 Mode Trimming

In the final remarks of Section 3.3.5 it is stated that, for a certain
accuracy of the PML-series for $\mathbf{G}_{ee}$, the number of modes needed
decreases rapidly with the distance $\rho$ between a source and an observer.
Therefore, for the calculation of interactions that take place at high
levels in the MLFMA tree (and hence for large distances $\rho$) one can use
fewer modes than at low levels without destroying the accuracy. Let
$M_{TX,l}$ denote the number of TX PML-modes used at a certain level $l$ in
the tree. For increasing level number $l$, $M_{TX,l}$ decreases substantially.
This will also be shown numerically in Section 4.5.1. An important
consequence is that the cost of the algorithm is not linearly dependent on
the number of modes. This is a major improvement with respect to an earlier
2-D implementation of the PML-MLFMA [Vande Ginste et al., 2004] and this new
feature is named mode trimming.

n +1
l
4.4.3 Determination of the Sampling Rates 2QTX,

So far, the influence of the propagation constants $\beta_{\mathrm{TX},n}$ of the
PML-waves on the sampling rates has been ignored. However, from Eq. (7.45) it
can be seen intuitively that the sampling rate should increase rapidly for
increasing $\beta_{\mathrm{TX},n}$ in order to maintain a good accuracy for the
plane wave decomposition (Eq. 7.43). Hence, by using higher order PML-modes,
the low complexity of the algorithm would be completely destroyed. To avoid
this, the minimum sampling rates $2Q^{l}_{\mathrm{TX},n}+1$ are determined by
using a simple brute-force search procedure to ensure a given level of
accuracy of the expansion (Eq. 7.49). We require that at all levels $l$,
$l = 1, \ldots, L$,

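\[
\frac{\left| H_0^{(2)}\!\left(\beta_{\mathrm{TX},n}\,\rho^{l}_{\min}\right) - \sum_{q=-Q^{l}_{\mathrm{TX},n}}^{Q^{l}_{\mathrm{TX},n}} \mathrm{PW}_n \right|}{\left| H_0^{(2)}\!\left(\beta_{\mathrm{TM},1}\,\rho^{l}_{\min}\right) \right|} < \epsilon_{\mathrm{PW}} \tag{7.64}
\]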

where $\epsilon_{\mathrm{PW}}$ is the desired accuracy for Eq. (7.49) and
$\rho^{l}_{\min}$ is the minimal distance between two boxes at level $l$ placed
far enough from each other to form a far-field pair. The propagation constant
$\beta_{\mathrm{TM},1}$ belongs to the fundamental TM-surface wave, which is
always supported by the microstrip substrates under consideration. Given the
loci of the modes retained in Eq. (7.49), this method typically restricts
$2Q^{l}_{\mathrm{TX},n}+1$ to a number equal to, or smaller than, that needed to
represent the fundamental surface wave by itself. Enforcement of a given level
of relative accuracy for each mode separately would increase the number of
plane waves for the highly evanescent modes to impractical levels and/or lead
to nonconvergence of the series (Eq. 7.49). Using Eq. (7.64) leads to very
accurate results for the total algorithm without an increase of the sampling
rates and hence without destroying the complexity (see also Sections 4.5.1
and 4.5.2).
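The brute-force search idea can be illustrated with a small, self-contained Python sketch. It does not use the Hankel-function expansion of Eq. (7.49), which is not reproduced here; as a stand-in it samples the plane-wave integral of the Bessel function $J_0(x)$ with $2Q+1$ equally spaced plane waves and increases $Q$ until an absolute error tolerance is met. The function names and the choice of kernel are assumptions made for illustration only.

```python
import cmath
import math


def j0_series(x, terms=60):
    """Reference value of J0(x) from its power series (moderate x only)."""
    total, term = 0.0, 1.0
    for k in range(terms):
        total += term
        term *= -((x / 2.0) ** 2) / ((k + 1) ** 2)
    return total


def j0_plane_waves(x, q):
    """Approximate J0(x) with 2q+1 equally spaced plane-wave samples."""
    n = 2 * q + 1
    total = sum(cmath.exp(1j * x * math.cos(2.0 * math.pi * s / n))
                for s in range(n))
    return total / n


def minimal_q(x, tol, q_max=200):
    """Smallest q whose sampled expansion meets the tolerance (brute force)."""
    reference = j0_series(x)
    for q in range(q_max + 1):
        if abs(j0_plane_waves(x, q) - reference) < tol:
            return q
    raise ValueError("tolerance not reached within q_max")


if __name__ == "__main__":
    for x in (5.0, 10.0):
        print(x, minimal_q(x, 1e-6))
```

In the actual algorithm the error measure would be the one of Eq. (7.64), i.e. normalized by the kernel of the fundamental surface wave, and the search would be repeated per mode and per level.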

4.5 Numerical Results

This section comprises three subsections. First, the correctness of the
new method is demonstrated numerically, and it is also shown that the
sampling rates in the plane wave decomposition (Eq. 7.43) do not increase
for higher order modes. Second, the high computational and memory
efficiency of the formalism is demonstrated in comparison with a classical
technique. Third, some illustrative examples are given. Emphasis is on
showing the reader that the proposed method is indeed suited for a variety of
large microstrip problems. All simulations are carried out on a Linux-based
2.4 GHz Pentium IV PC with 2 GB RAM.

4.5.1 Validation of the Method

The accuracy of the PML-MLFMA code is controlled by many parameters. Several
of them are described in the previous sections (e.g. the PML parameters and
the MLFMA parameters $Q^{l}_{\mathrm{TX},n}$, $M_{\mathrm{TX},l}$, and
$\epsilon_{\mathrm{PW}}$). One of the input parameters to the PML-MLFMA code
is the target accuracy $\epsilon$, defined as the average relative error of the
far-field matrix elements computed by using the PML-MLFMA paradigm. Upon
specification of this parameter, all code parameters adjust to a critical value
that guarantees this target accuracy without wasting computational
resources; parameter selection is achieved either by (approximate) analytic
means ($M_{\mathrm{TX},l}$, $\epsilon_{\mathrm{PW}}$, and the PML parameters) or
by brute-force local searches ($Q^{l}_{\mathrm{TX},n}$).

Figure 7-13. Layout of the metallization used to test the accuracy of the method.
[Figure labels: two 1 mm dimension marks; x-oriented basis function j; y- and
x-oriented test functions i.]

To verify the usefulness of the PML-MLFMA for the modeling of
microstrip geometries, consider the metallization depicted in Fig. 7-13. This
metallization is separated from a PEC ground plane by an air-substrate
with $\epsilon_r = \mu_r = 1$ of thickness $d = 1$ mm. The operating frequency is
$f = 10$ GHz. This air-substrate is chosen for two reasons. First, for this
configuration, the transverse Green's dyadic is known analytically [Van
Bladel, 1985]:

1 e j k0 e j k0 2 + d 2
G ee ( , ) = 2

t t + k0 I (7.65)
j 0 4 2 + d2


where $I$ is the two-dimensional unit dyadic. The moment matrix elements
$Z^{\mathrm{class}}_{ij}$, calculated by evaluating Eq. (7.40) using Eq. (7.65),
are used as a very precise reference in order to check the accuracy of the new
method. Second, for this configuration, the propagation constants of the modes
of the pertinent PML-waveguide, viz. an air-filled parallel plate waveguide of
complex thickness $d + D$, are also analytically known:

\[
\beta_{\mathrm{TM},n} = \beta_{\mathrm{TE},n} = \sqrt{k_0^2 - \left(\frac{n\pi}{d+D}\right)^2}, \qquad n = 1, \ldots, \infty \tag{7.66}
\]
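As a small numerical illustration of Eq. (7.66), the sketch below evaluates the propagation constants of a parallel plate waveguide with a (possibly complex) thickness. The function name, the value chosen for the complex extension `pml_thickness`, and the branch choice (non-positive imaginary part, which gives decaying waves under an $e^{j\omega t}$ time convention with $H_0^{(2)}$ kernels) are assumptions made here and are not taken from the chapter's code.

```python
import cmath
import math

C0 = 299_792_458.0  # speed of light in vacuum [m/s]


def pml_waveguide_betas(freq_hz, d, pml_thickness, n_modes):
    """Propagation constants sqrt(k0^2 - (n*pi/(d+D))^2) for n = 1..n_modes."""
    k0 = 2.0 * math.pi * freq_hz / C0
    h = d + pml_thickness  # total (possibly complex) thickness d + D
    betas = []
    for n in range(1, n_modes + 1):
        beta = cmath.sqrt(k0 ** 2 - (n * math.pi / h) ** 2)
        if beta.imag > 0.0:
            beta = -beta  # assumed branch: waves must decay with distance
        betas.append(beta)
    return betas


if __name__ == "__main__":
    # Hypothetical complex extension D, chosen only to show complex roots.
    for n, beta in enumerate(pml_waveguide_betas(10e9, 1e-3, 5e-3 - 5e-3j, 4), 1):
        print(n, beta)
```

With a purely real thickness of 1 mm at 10 GHz all modes with n >= 1 are below cutoff, so every returned value is purely imaginary.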

For other substrates the dispersion relations (Eq. 7.26) have to be solved
numerically. Note that while this PML-waveguide also supports a TEM-mode,
it is never excited, as the currents on S flow transverse to z. Let
$Z^{\mathrm{MLFMA}}_{ij}$ be an element of the moment matrix of the new method,
supposing that all interactions are far and hence using PMLs and the MLFMA. The
elements $Z^{\mathrm{MLFMA}}_{ij}$ are evaluated by consecutively multiplying the
moment matrix with test vectors X equal to the columns of the unit matrix. Now
$Z^{\mathrm{class}}_{ij}$ and $Z^{\mathrm{MLFMA}}_{ij}$ are compared. For an
increasing distance $\Delta$, as indicated in Fig. 7-13, between the basis
function and the test function, viz. for an increasing $|i - j|$, the relative
error $\delta(\Delta)$ is given by:

\[
\delta(\Delta) = \frac{\left| Z^{\mathrm{class}}_{ij} - Z^{\mathrm{MLFMA}}_{ij} \right|}{\left| Z^{\mathrm{class}}_{ij} \right|} \tag{7.67}
\]
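The procedure just described, i.e. probing a matrix that is only available through its matrix-vector product with the columns of the unit matrix and then comparing element by element, can be sketched in a few lines of Python. The small dense matrix below merely stands in for the fast operator; all names are illustrative.

```python
def matrix_columns(matvec, n):
    """Recover all columns of an n-by-n operator from its matrix-vector product."""
    columns = []
    for j in range(n):
        unit = [0.0] * n
        unit[j] = 1.0                 # j-th column of the unit matrix
        columns.append(matvec(unit))  # columns[j][i] is the element Z_ij
    return columns


def relative_error(z_reference, z_fast):
    """Relative error of one matrix element, as in Eq. (7.67)."""
    return abs(z_reference - z_fast) / abs(z_reference)


if __name__ == "__main__":
    # Stand-in operator: a 2-by-2 complex matrix hidden behind a matvec.
    z = [[1 + 2j, 3.0], [4.0, 5 - 1j]]

    def matvec(x):
        return [sum(z[i][j] * x[j] for j in range(2)) for i in range(2)]

    cols = matrix_columns(matvec, 2)
    print(cols[1][0], relative_error(z[0][1], cols[1][0]))
```

Each probe costs one matrix-vector product, so this is only practical for validation on small test structures such as the one in Fig. 7-13.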

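Figure 7-14. Relative error on the x-x interactions as a function of the distance
and for different values of the target accuracy.
[Plot: $\log_{10}$ of the relative error versus the distance normalized to
$\lambda_0$ (from 2 to 7); curves for target accuracies $10^{-3}$, $10^{-5}$,
and $10^{-7}$.]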
Figure 7-15. Relative error on the x-y interactions as a function of the distance
and for different values of the target accuracy.
[Plot: $\log_{10}$ of the relative error versus the distance normalized to
$\lambda_0$ (from 2 to 7); curves for target accuracies $10^{-3}$, $10^{-5}$,
and $10^{-7}$.]

Due to the staircase layout of the metallization, it is possible to evaluate
the interaction between a basis and a test function that have a parallel or an
orthogonal orientation, which allows all four elements of the Green's
dyadic to be checked. The accuracy of the method is shown in Figs. 7-14 and 7-15
for a varying target accuracy $\epsilon$. Fig. 7-14 gives the results for the
x-x interactions, i.e. for the matrix elements describing interactions
between x-oriented basis and test functions. In Fig. 7-15 the results for
x-y interactions, viz. for the matrix elements describing interactions
between x-oriented basis functions and y-oriented test functions or vice
versa, are shown. The target accuracies are easily reached. The radius of the
groups at the lowest level is $0.3\lambda_0$. This might seem quite small (often
one uses $0.5\lambda_0$ or even $\lambda_0$), but clearly it does not
cause accuracy problems here. Even better, the MLFMA is used starting
from small distances (here $\Delta = 51$ mm), and hence full advantage of the
technique is taken. Below this distance, a classical technique needs to be
adopted. In the example, the MLFMA tree comprises four levels.
In Section 4.4.2 the concept of mode trimming is introduced, meaning
that the number of modes $M_{\mathrm{TX},l}$ decreases with increasing level
number $l$. An increasing $l$ means of course that the distance between a basis
and a test function increases. In Table 7-2 it can be seen that it is perfectly
safe to trim the modes without loss of accuracy. The table considers target
accuracies of $\epsilon = 10^{-6}$ and $\epsilon = 10^{-4}$ for the x-x interactions.

Table 7-2. Total number of modes $M_{\mathrm{TX},l}$ needed to obtain a relative
error of $10^{-6}$ and $10^{-4}$ on the x-x interactions at a given distance.

Distance [mm]   Distance / $\lambda_0$   Modes for $10^{-6}$   Modes for $10^{-4}$
70.71           2.359                    32                    22
106.1           3.538                    24                    18
141.4           4.717                    20                    16
176.8           5.897                    16                    12
212.1           7.076                    12                    10
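The data of Table 7-2 can be turned into a simple, conservative lookup that illustrates mode trimming: since the required number of modes decreases with distance, using the entry of the largest tabulated distance that does not exceed the actual distance never underestimates the mode count. The helper below is a hypothetical illustration built only from the table values; it is not the selection rule used in the actual code, which works per MLFMA level.

```python
# (distance in mm, modes for 1e-6, modes for 1e-4), copied from Table 7-2
TABLE_7_2 = [
    (70.71, 32, 22),
    (106.1, 24, 18),
    (141.4, 20, 16),
    (176.8, 16, 12),
    (212.1, 12, 10),
]


def modes_needed(distance_mm, accuracy="1e-6"):
    """Conservative mode count for a given distance, read from Table 7-2."""
    column = {"1e-6": 1, "1e-4": 2}[accuracy]
    eligible = [row for row in TABLE_7_2 if row[0] <= distance_mm]
    if not eligible:
        raise ValueError("distance below the smallest tabulated value")
    return eligible[-1][column]


if __name__ == "__main__":
    for dist in (80.0, 150.0, 250.0):
        print(dist, modes_needed(dist), modes_needed(dist, "1e-4"))
```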

Figure 7-16. Number of samples at four levels needed for each TM-mode to obtain an
accuracy of $10^{-7}$.
[Plot: number of samples $2Q^{l}_{\mathrm{TM},n}+1$ (0 to 100) versus TM mode
number $n$ (0 to 35); one curve per level, levels 1 to 4.]
In Fig. 7-16 the sampling rates $2Q^{l}_{\mathrm{TM},n}+1$ are plotted for the
TM-modes at four levels, for a target accuracy $\epsilon = 10^{-7}$, as used in
the example of Figs. 7-14 and 7-15. When the number of samples drops to zero,
this means of course that the mode is not used at that particular level, as is
requested by the mode trimming feature. Also, the sampling rate does not
increase for higher order modes; on the contrary. This indicates that the
relative error for the core MLFMA Eq. (7.49) as defined in Eq. (7.64) can be
used. It also means that the plane wave decomposition for complex wavenumbers
$\beta_{\mathrm{TX},n}$ does not destroy the computational complexity of the
algorithm. Note that in the case of a microstrip substrate, with
$k_r \neq k_0$, the substrate does propagate surface waves (with wavenumbers
$\beta_{\mathrm{TX},n}$). In this case the mode trimming is even more efficient,
since these modes dominate the total accuracy more and more as the distance
increases. It can be concluded that with the new method a fully controllable
accuracy is achieved.

4.5.2 Computational and Memory Efficiency

Figure 7-17. Layout of the metallization used to test the computational and memory
complexity of the method.
[Figure labels: w, s, T, f, x.]

To test the computational and memory complexity of the new MLFMA,
a substrate with thickness $d = 3.17$ mm, relative permittivity
$\epsilon_r = 11.7$, and relative permeability $\mu_r = 1$ is used. The operating
frequency is $f = 10$ GHz. A realistic structure for measuring the
memory requirements and speed is placed on the substrate. The metallization is
shown in Fig. 7-17 and consists of a uniform antenna array. Each individual
patch is discretized using a non-uniform mesh. At the edges of each patch the
grid is refined in order to model the edge current behavior more accurately. The
number of unknowns $N$ is increased by adding more patches. The target
accuracy is set at $\epsilon = 10^{-5}$. In Fig. 7-18 the CPU time needed for one
iteration is plotted for a variable number of unknowns. Fig. 7-19 shows the
memory requirements of the code. As predicted, with the new PML-MLFMA both the
operation count and the memory requirements scale as $O(N)$, as opposed to a
classical method with an $O(N^2)$ complexity. It is important to stress that the
cross-over point for the speed is found just below $N = 1000$. Starting from
about 2000 unknowns, there is also already a gain in memory efficiency. This
cross-over point is slightly larger than the one obtained for the speed, which
can be explained by the fact that even for small structures the MLFMA has a
large fixed memory cost, just for building the tree. These results are in line
with those of free space FMMs and demonstrate that the PML-MLFMA allows
modeling very large planar structures.

Figure 7-18. CPU time for one matrix-vector multiplication.
[Log-log plot: time [sec] versus number of unknowns N ($10^3$ to $10^5$);
curves for the PML-MLFMA and the classical method.]


Figure 7-19. Memory requirements.
[Log-log plot: memory [kB] versus number of unknowns N ($10^3$ to $10^5$);
curves for the PML-MLFMA and the classical method.]

4.5.3 Application Examples

This subsection presents computational results illustrating the applicability
of the proposed scheme to the analysis of radiation and scattering
by/from electrically large arrays. The far-field
$\mathbf{E}^{\mathrm{sc,ff}}(R, \theta, \phi)$ scattered from
an antenna array is expressed as

\[
\mathbf{E}^{\mathrm{sc,ff}}(R, \theta, \phi) \approx \frac{e^{-jk_0 R}}{R} \left[ F_\theta(\theta, \phi)\,\hat{\boldsymbol{\theta}} + F_\phi(\theta, \phi)\,\hat{\boldsymbol{\phi}} \right] \tag{7.68}
\]

for large $k_0 R$, where $R = |\mathbf{r}| = \sqrt{x^2 + y^2 + z^2}$, $\theta$ is
the angle between $\mathbf{r}$ and the z axis, and $\phi$ is the angle between
the x axis and the projection of $\mathbf{r}$ onto the xy plane. The radiation
patterns $F_\theta(\theta, \phi)$ and $F_\phi(\theta, \phi)$ can
be calculated from the currents on the metallization using a stationary phase
method [Wilcox, 1964].
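The angle definitions used in Eq. (7.68) correspond to the usual spherical coordinates. A minimal Python sketch of the conversion from Cartesian coordinates, with an illustrative function name, is:

```python
import math


def spherical_angles(x, y, z):
    """Return (R, theta, phi): theta from the z axis, phi from the x axis."""
    r = math.sqrt(x * x + y * y + z * z)
    if r == 0.0:
        raise ValueError("angles are undefined at the origin")
    theta = math.acos(z / r)  # angle between r and the z axis
    phi = math.atan2(y, x)    # angle of the xy-plane projection from the x axis
    return r, theta, phi


if __name__ == "__main__":
    print(spherical_angles(1.0, 1.0, 0.0))
```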
Figure 7-20. Layout of the 8 × 4 microstrip array. The grayscale is an indication of
the current density on the metallization.

The first example involves the 8 × 4 microstrip array detailed in [Ling
et al., 2000] and shown in Fig. 7-20. The array is situated on a substrate with
thickness $d = 1.59$ mm, relative permittivity $\epsilon_r = 2.2$, and relative
permeability $\mu_r = 1$. The various dimensions detailed in Fig. 7-20 are
$l = 10.08$ mm, $w = 11.79$ mm, $d_1 = 1.3$ mm, $d_2 = 3.93$ mm, $l_1 = 12.32$ mm,
$l_2 = 18.48$ mm, $D_1 = 23.58$ mm, and $D_2 = 22.40$ mm. The array is fed
by forcing a current at its input at operating frequency $f = 9.42$ GHz.
The array's radiation patterns in the E-plane ($\phi = 0^\circ$) and the H-plane
are shown in Figs. 7-21 and 7-22 and compared with the results found in
[Ling et al., 2000]. Very good agreement between both data sets is observed.
Figure 7-21. Radiation pattern of the microstrip antenna array in the E-plane
($\phi = 0^\circ$).
[Plot: radiation pattern [dB] versus $\theta$ [°] from -90 to 90; curves for the
PML-MLFMA and Ling et al.]

Figure 7-22. Radiation patterns of the microstrip antenna array in the H-plane
($\phi = 90^\circ$).
[Plot: radiation pattern [dB] versus $\theta$ [°] from -90 to 90; two pattern
components, each with curves for the PML-MLFMA and Ling et al.]
The second example involves the array first introduced in Section 4.5.2
(Fig. 7-17); the patch width is $w = 7.5$ mm and the periodicity of the array,
detailed in Fig. 7-17, is $T = 3\lambda_0/4 = 22.5$ mm. The structure is
illuminated by a plane wave (Eq. 7.2)

\[
\mathbf{E}^{\mathrm{PW}}(x, y, z) = \mathbf{E}_0\, e^{j(k_x x + k_y y + k_z z)} = \mathbf{E}_0\, e^{jk_0(\cos\phi \sin\theta\, x + \sin\phi \sin\theta\, y + \cos\theta\, z)} \tag{7.69}
\]

as indicated in Fig. 7-17. For this kind of excitation it can easily be derived
that at $z = d$ the incident field transverse to z is of the following form:

\[
\mathbf{E}^{\mathrm{inc}}_t(\mathbf{r}) = \mathbf{E}_{0,t}\, e^{j(k_x x + k_y y + k_z d)} + \mathbf{R}_t\, e^{j(k_x x + k_y y - k_z d)} \tag{7.70}
\]

with $\mathbf{R}_t$ the transverse-to-z strength of the wave reflected at the
PEC-backed substrate. In the example an operating frequency of
$f = 10$ GHz is used again, and the plane wave has angles of incidence
$\theta = 30^\circ$ and $\phi = 0^\circ$. The plane wave is linearly polarized
along the y axis and has a strength of 1 V/m, hence
$\mathbf{E}_{0,t} = \mathbf{E}_0 = \hat{\mathbf{y}}$. For this illuminating plane
wave and for the given microstrip substrate, the strength of the reflected wave
is given by $\mathbf{R}_t = (0.195 - j0.981)\,\hat{\mathbf{y}}$. In the xz plane
the scattering cross section $|F(\theta)|^2 / |\mathbf{E}^{\mathrm{PW}}|^2$ is
studied, and a grating lobe at $\theta_{\mathrm{gr}} = 56.4^\circ$ (apart from the
specular reflection at $\theta_{\mathrm{spec}} = -30^\circ$) is expected. For an
infinite number of patches, the scattering cross section only consists of two
discrete Dirac-like lobes at $\theta_{\mathrm{gr}}$ and $\theta_{\mathrm{spec}}$.
Fig. 7-23 shows the scattering cross section in the xz plane for a varying
number of square patches. With an increasing number of patches, the result more
and more resembles a pattern that only comprises two discrete lobes, located at
the predicted angles $\theta_{\mathrm{gr}}$ and $\theta_{\mathrm{spec}}$.
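The two numbers quoted in this example can be cross-checked with a short Python sketch. It assumes an $e^{j\omega t}$ time convention, a y-polarized (TE) wave incident in the xz plane, the substrate of Section 4.5.2 ($\epsilon_r = 11.7$, $d = 3.17$ mm), and a reflected wave whose phase is referenced as in Eq. (7.70); the grating equation is the standard one for a periodic array. The function names are illustrative and the sketch is not part of the PML-MLFMA code.

```python
import cmath
import math

C0 = 299_792_458.0  # speed of light in vacuum [m/s]


def reflected_strength(freq_hz, eps_r, d, theta_deg):
    """TE reflection at a PEC-backed slab, phase-referenced as in Eq. (7.70)."""
    k0 = 2.0 * math.pi * freq_hz / C0
    theta = math.radians(theta_deg)
    kz0 = k0 * math.cos(theta)                          # z wavenumber in air
    kz1 = k0 * math.sqrt(eps_r - math.sin(theta) ** 2)  # z wavenumber in slab
    # TE wave impedances scale as 1/kz; the shorted slab presents
    # j * Z1 * tan(kz1 * d) at the interface z = d.
    z_in = 1j * math.tan(kz1 * d) / kz1
    z_air = 1.0 / kz0
    gamma = (z_in - z_air) / (z_in + z_air)
    # Incident wave ~ exp(+j kz0 z), reflected wave ~ R_t exp(-j kz0 z).
    return gamma * cmath.exp(2j * kz0 * d)


def lobe_angle_deg(theta_inc_deg, period_over_lambda, order):
    """Scattering angle of a diffraction order (order 0 = specular)."""
    s = -math.sin(math.radians(theta_inc_deg)) + order / period_over_lambda
    if abs(s) > 1.0:
        raise ValueError("this diffraction order is evanescent")
    return math.degrees(math.asin(s))


if __name__ == "__main__":
    print(reflected_strength(10e9, 11.7, 3.17e-3, 30.0))
    print(lobe_angle_deg(30.0, 0.75, 0), lobe_angle_deg(30.0, 0.75, 1))
```

Under these assumptions the sketch reproduces a reflected strength of unit magnitude close to $0.195 - j0.981$, a specular direction of $-30^\circ$, and a first-order grating lobe near $56.4^\circ$.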
Figure 7-23. Scattering cross section as a function of the number of patches.
[Plot: $|F(\theta)|^2 / |\mathbf{E}^{\mathrm{PW}}|^2$ [m$^2$] versus $\theta$ [°]
from -90 to 90; curves for arrays of 5 × 5, 10 × 10, and 15 × 15 patches.]

5. EXTENSIONS AND CONCLUSIONS

5.1 Extensions

In this section, some extensions of the previously described technique are
outlined. These topics constitute a part of the ongoing research of the
Electromagnetics Group in the Department of Information Technology.

5.1.1 Development of a Low-Frequency Algorithm

For the modeling of electrically large structures such as planar antenna
arrays, the number of unknowns is chosen proportional to the electrical size
of the object. As a general rule of thumb, the number of unknowns per
wavelength should be approximately ten. Consider now the
simulation of electrically small objects such as MMICs. This is still a
high-frequency (HF) problem, but these objects contain very fine geometric
details. Modeling them precisely requires an accurate discretization, i.e.
a severe increase of the number of unknowns, and hence a violation of the
$\lambda/10$-rule. To develop an MLFMA, the basis and test functions are
classified in a tree structure. However, the grouping of very densely packed
elementary current sources differs in two ways from the previously
explained technique:
• In contrast to what was done before, the groups are now much smaller than
the characteristic wavelength of the problem.
• Since the groups are placed near to each other, expressed in terms of the
wavelength $\lambda$, a plane wave decomposition which is based on far-field
approximations is no longer valid.
Since the structure is small compared to the wavelength, it seems that this
is a situation where the characteristic wavenumber $k = 2\pi/\lambda$ tends to
zero. For a decreasing size of the geometry, the electromagnetic fields produced
by the structure more and more satisfy the Laplace equation instead of the
Helmholtz equation. Schematically this can be written in its scalar form as:

\[
\nabla^2 \psi(\mathbf{r}) + k^2 \psi(\mathbf{r}) = 0 \quad \xrightarrow{\;k \to 0\;} \quad \nabla^2 \psi(\mathbf{r}) = 0
\]

where $\psi(\mathbf{r})$ is the pertinent scalar field. The situation on the
right-hand side corresponds to a (quasi-)static problem. Hence, to develop a
fast technique for the simulation of electrically small structures, a
low-frequency (LF) variant of the MLFMA has to be implemented.

Figure 7-24. Source and observer constellation.

As mentioned before, the plane wave decomposition fails. Consider again
a source-observer constellation as depicted in Fig. 7-24. Multipole
expansions for the 2-D homogeneous space Green's function that remain valid for
small structures have been described in the literature [Chew et al., 2001]:
P P
H 0(2) (k rij ) iIT IJ Jj = (r
n = P m = P
n iI ) nm(rIJ ) m (rJj ) (7.71)

where the elements of the $(2P+1) \times (2P+1)$-dimensional translation matrix
are expressed by

nm (rIJ ) = H n(2) m (k rIJ ) e j ( n m ) IJ


(7.72)
and the elements of the $(2P+1) \times 1$-dimensional vectors associated with
$r_{Jj}$ and $r_{iI}$ are given by

j m (Jj )
m (rJj ) = J m (k rJj ) e
(7.73)
n (riI ) = J n (k riI ) e j niI

In the above equations the length of a vector $\mathbf{r}_{XY}$, pointing from
$\mathbf{r}_Y$ to $\mathbf{r}_X$, is indicated by $r_{XY}$, and $\phi_{XY}$ stands
for the angle between that particular vector and the x axis. As for the plane
wave decomposition, the expansion (Eq. 7.71) is only valid when
$|\mathbf{r}_{Jj} + \mathbf{r}_{iI}| < r_{IJ}$, which is
automatically the case when the groups do not overlap. Also, the accuracy of
the factorization increases with increasing multipole order $2P+1$. The
physical interpretation is similar. The field of the source group is
decomposed into outgoing multipole waves (OMWs). Upon multiplication
with the translation matrix, these OMWs are transformed into incoming
multipole waves (IMWs) arriving at the center of the observation group.
There, the IMWs are converted into field contributions at the observers.
A major difference between the HF-MLFMA, based on the plane wave
decomposition, and the LF-MLFMA, based on the multipole expansion, is
that in the LF-case the translation matrix is not diagonal. Without going into
any further detail, we mention here that because of the LF-character of the
problem, an LF-PML-MLFMA can still be obtained that always has a
memory and computational complexity of $O(N)$. In [Vande Ginste, 2004]
it is shown that this can be achieved by implementing Eq. (7.71) in an
appropriate multilevel framework.

5.1.2 Combination of the HF- and the LF-Technique

The combination of the HF-PML-MLFMA and the LF-PML-MLFMA is
the next logical step. This is useful for electrically large structures that
contain a lot of geometric detail. The LF-PML-MLFMA is then responsible
for the fast calculation of interactions between small groups, typically
smaller than $\lambda/2$. The HF-PML-MLFMA takes care of the interactions
between larger groups.
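The division of labor between the two variants amounts to a size test per group. The Python sketch below is a hypothetical illustration: it assumes that the switch-over happens at half a wavelength and that a single scalar `group_size` characterizes a group; neither the function name nor the exact criterion is taken from the actual implementation.

```python
def choose_variant(group_size, wavelength, threshold=0.5):
    """Pick the LF or HF algorithm for a group (assumed threshold: lambda/2)."""
    if group_size <= 0.0 or wavelength <= 0.0:
        raise ValueError("sizes must be positive")
    return "LF" if group_size < threshold * wavelength else "HF"


if __name__ == "__main__":
    wavelength = 0.03  # 10 GHz in free space, roughly 30 mm
    for size in (0.002, 0.01, 0.02, 0.06):
        print(size, choose_variant(size, wavelength))
```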

5.1.3 Extension to General Multilayered Structures

Up to now, only single-layered structures with planar metallizations
consisting of microstrip traces and patches were considered. In practice,
more complex structures appear. Here, two typical cases are mentioned:

1. Multilayered structures with planar metallizations


In Fig. 7-25 a microstrip patch antenna with an aperture-coupled
feed is depicted. The planar metallization patterns are embedded in a
double-layered background medium. The modeling of planar metal-
lizations in multilayered background media has been discussed on
many occasions [Das and Pozar, 1987; Catedra and Gago, 1990; Tsai
et al., 1997].

Figure 7-25. Microstrip patch antenna with an aperture-coupled feed.

2. Multilayered structures with non-planar metallizations


The techniques described above only accept metallizations (and
thus currents) with transversal to z orientations. Of course, there are
some typical examples which require a technique that is able to solve
layered media with non-planar metallizations. In Fig. 7-26 a semi-buried
object is drawn. The lower substrate is a semi-infinite
substrate, which can e.g. model the earth. The semi-infinite top
substrate represents the air. A metal object is buried in the ground,
partly sticking out. This example represents some practical remote
sensing applications, such as the detection of mines buried in the
ground or the determination of the radar cross section of vehicles
(e.g. tanks, ships, etc.). The number of papers found in the literature
[Michalski and Zeng, 1990a; Michalski and Zeng, 1990b; Cui and
Chew, 1999a; Cui and Chew, 1999b; He et al., 2000; Geng et al.,
2001] related to this subject reflects its importance.

Figure 7-26. Semi-buried object.

In Vande Ginste (2004) it is schematically demonstrated that, provided some
small extensions are made to the PML-based series expansions of the Green's
function, new fast and efficient PML-MLFMAs can be developed for the modeling of
the above kind of structures.

5.2 Conclusions

This chapter is wound up by drawing the main conclusions. After
presenting a classical solution technique for planar microwave structures
(Section 2), in Section 3 a new formulation for the Green's functions of
layered media is proposed. When the distance between a source and an
observer is not too small, the PML-paradigm leads to series expansions of
the Green's dyadic of the layered medium, and this has two important
consequences. Firstly, the formalism allows an easier calculation of the
Green's functions, avoiding the numerical evaluation of Sommerfeld integrals.
Secondly, and more important in this context, the Green's functions are
known as analytical, closed-form expressions, and this property is further
exploited for the formulation of new simulation techniques for planar
microwave structures.
This technique itself is presented in Section 4. The formalism is based on
an EFIE-MoM approach, a PML-series expansion of the Green's dyadic of
the layered background medium, and a plane wave decomposition of the
Hankel kernels appearing in this series. Next, the details of the
implementation of an MLFMA are explained, and it is briefly shown that a memory
and computational complexity down to order $O(N)$ is obtained for the large
structures we are interested in. The good and controllable accuracy, as well
as the memory and computational complexity of the method, are empirically
verified by some numerical experiments. Finally, some illustrative examples
are given. Emphasis is on showing the reader that the proposed method is
indeed suited for a variety of large microstrip problems. The FIPWA
technique presented by Hu and Chew (2000, 2001) and by Jiang and Chew
(2004) is also based on a series expansion for the Green's function. In the
FIPWA technique each Sommerfeld integral is replaced by a properly
chosen steepest descent path integral, a constant phase branch cut integral,
and discrete surface wave pole contributions. The remaining integrals are
efficiently discretized using Gaussian quadrature rules. Although this
scheme, when applied to the structures considered in this chapter, also achieves
an $O(N)$ complexity, the PML-based series representation provides a valuable
alternative to this technique. It avoids the usual steepest descent path
complications when branch-points and/or surface wave poles come close to
each other and start to interfere, leading to a more robust scheme. The
PML-series comes in a natural way and can be calculated efficiently.
Finally, in Section 5 it is shown that the method will be and is being
extended to more general structures whilst maintaining the original BIE-MoM
formulation.

References
Abramowitz, M., and Stegun, I. A., 1970, Handbook of Mathematical Functions, Dover
Publications Inc., New York, USA.
Amos, D. E., 1986, A portable package for Bessel functions of a complex argument and
nonnegative order, ACM Trans. Math. Software 12(3): 265-273.
Axelsson, O., 1994, Iterative Solution Methods, Cambridge University Press, New York,
USA.
Bérenger, J. P., 1994, A perfectly matched layer for the absorption of electromagnetic
waves, J. Comp. Phys. 114(2): 185-200.
Bledowski, A., and Zakowicz, W., 1997, Radiation properties of a planar dielectric
waveguide loaded with conducting-strip diffraction grating, IEEE Trans. Microwave
Theory Tech. 45(9): 1637-1640.
Bienstman, P., Derudder, H., Baets, R., Olyslager, F. and De Zutter, D., 2001, Analysis of
cylindrical waveguide discontinuities using vectorial eigenmodes and perfectly matched
layers, IEEE Trans. Microwave Theory Tech. 49(2): 349-354.
Canning, F. X., 1990a, Transformations that produce a sparse moment matrix, Journ. of
Electromagnetic Waves and Appl. 4(9): 893-913.
Canning, F. X., 1990b, The impedance matrix localization (IML) method for moment-
method calculations, IEEE Antennas and Propagation Magazine 32(5): 18-30.
Catedra, M. F., and Gago, E., 1990, Spectral domain analysis of conducting patches of
arbitrary geometry in multilayer media using the CG-FFT method, IEEE Trans. Antennas
Propag. 38(10): 1530-1536.
Chen, B., Fang, D. G., and Zhou, B. H., 1995, Generalized Berenger PML absorbing
boundary conditions for FD-TD meshes, IEEE Microwave Guided Wave Lett. 5(11):
399-401.
Chew, W. C., and Weedon, W. H., 1994, A 3D perfectly matched medium from modified
Maxwell's equations with stretched coordinates, Microwave Opt. Technol. Lett. 7(13):
599-604.
Chew, W. C., Jin, J.-M., Michielssen, E., and Song, J., 2001, Fast and Efficient Algorithms in
Computational Electromagnetics, Artech House, Boston, USA.
Chew, W. C., Cui, T. J., and Song, J. M., 2002, A FAFFA-MLFMA algorithm for
electromagnetic scattering, IEEE Trans. Antennas Propag. 50(11): 1641-1649.
Chew, W. C., Chao, H. Y., Cui, T. J., Lu, C. C., Ohnuki, S., Pan, Y. C., Song, J. M.,
Velamparambil, S., and Zhao, J. S., 2003, Fast integral equation solvers in computational
electromagnetics of complex structures, Engineering Analysis with Boundary Elements
27(8): 803-823.
Churchill, R., and Brown, J., 1984, Complex Variables and Applications, McGraw-Hill, New
York, USA.
Coifman, R., Rokhlin, V., and Wandzura, S., 1993, The fast multipole method for the wave
equation: A pedestrian prescription, IEEE Antennas and Propagation Magazine 35(3):
7-12.
Cui, T. J., and Chew, W. C., 1999a, Fast evaluation of Sommerfeld integrals for EM
scattering and radiation by three-dimensional buried objects, IEEE Trans. Geosci.
Remote Sensing 37(2): 887-900.
Cui, T. J., and Chew, W. C., 1999b, Fast algorithm for electromagnetic scattering by buried
3-D dielectric objects of large size, IEEE Trans. Geosci. Remote Sensing 37(5):
2597-2608.
Das, N. K., and Pozar, D. M., 1987, A generalized spectral-domain Green's function for
multilayer dielectric substrates with application to multilayer transmission lines, IEEE
Trans. Microwave Theory Tech. 35(3): 326-335.
Dembart, B., and Yip, E., 1995, A 3-D fast multipole method for electromagnetics with
multiple levels, in 11th Annu. Rev. Progress Appl. Computat. Electromagn.: 621-628.
Derudder, H., De Zutter, D., and Olyslager, F. , 1998a, Determination of the TE- and TM-
mode reflectivity at a laser facet using perfectly matched layers, in Proc. of the Third
Annual Symposium of the IEEE/LEOS Benelux Chapter 1998: 113-116, Gent, Belgium.
Derudder, H., De Zutter, D., and Olyslager, F., 1998b, Analysis of planar stratified
waveguides in the presence of perfectly matched layers, in Digest USNC/URSI National
Radio Science Meeting 1998: 276, Atlanta, Georgia, USA.
Derudder, H., De Zutter, D., and Olyslager, F., 1999a, Reflection of surface modes at the
substrate-air interface using mode matching techniques and PML-media, in Proc. of the
International Conference on Electromagnetics in Advanced Applications (ICEAA 99):
745-748, Torino, Italy.
Derudder, H., Olyslager, F., and De Zutter, D., 1999b, An efficient series expansion for the
2D Green's function of a microstrip substrate using perfectly matched layers, IEEE
Microwave Guided Wave Lett. 9(12): 505-507.
Derudder, H., De Zutter, D., and Olyslager, F., 2000, Efficient calculation of the 2D Green's
function of a truncated grounded dielectric slab, in IEEE Ant. Prop. Int. Symp. Dig. 2:
618-621, Salt Lake City, UT, USA.
Derudder, H., Olyslager, F., De Zutter, D., and Van den Berghe, S., 2001, Efficient mode-
matching analysis of discontinuities in finite planar substrates using perfectly matched
layers, IEEE Trans. Antennas Propag. 49(2): 185-195.
Engheta, N., Murphy, W. D., Rokhlin, V., and Vassiliou, M. S., 1992, The fast multipole
method (FMM) for electromagnetic scattering problems, IEEE Trans. Antennas Propag.
40(6): 634-641.
Ergin, A. A., Shanker, B., and Michielssen, E., 1999, The plane-wave time-domain
algorithm for the fast analysis of transient wave phenomena, IEEE Antennas and
Propagation Magazine 41(4): 39-52.
Epton, M. A., and Dembart, B., 1995, Multipole translation theory for the three-dimensional
Laplace and Helmholtz equations, SIAM J. Sci. Comput. 16(4): 865-897.
Faché, N., Van Hese, J., and De Zutter, D., 1989, Space domain Green's dyadic non-
coplanar microstrip or striplines in multilayered media, in Proceedings of the 1989 URSI
Int. Symp. on Electromagnetic Theory: 378-380, Stockholm, Sweden.
Faché, N., Van Hese, J., and De Zutter, D., 1992, Generalized space domain Green's dyadic
for multilayered media with special application to microwave interconnections, Journ. of
Electromagnetic Waves and Appl. 3(7): 651-669.
Faché, N., Olyslager, F., and De Zutter, D., 1993, Electromagnetic and Circuit Modelling of
Multiconductor Transmission Lines, Oxford University Press Inc., New York, USA.
Fang, J., and Wu, Z., 1995, Generalized perfectly matched layer: an extension of
Berenger's perfectly matched layer boundary condition, IEEE Microwave Guided Wave
Lett. 5(12): 451-453.
Felsen, L. B., and Marcuvitz, N., 1994, Radiation and Scattering of Waves, IEEE Press,
Piscataway, NJ, USA.
Geng, N., Sullivan, A., and Carin, L., 2001, Fast multipole method for scattering from an
arbitrary PEC target above or buried in a lossy half space, IEEE Trans. Antennas Propag.
49(5): 740-748.
Givoli, D., 1991, Nonreflecting boundary-conditions, J. Comp. Phys. 94(1): 1-29.
Goedbloed, J. J., 1992, Electromagnetic Compatibility, 2nd ed., Kluwer, Deventer, The
Netherlands, 1991 (in Dutch), published in English by Prentice Hall.
Greengard, L., and Rokhlin, V., 1987, A fast algorithm for particle simulations, J. Comput.
Phys. 73(2): 325-348.
Gyure, M. F., and Stalzer, M. A., 1998, A prescription for the multilevel Helmholtz FMM,
IEEE Computational Science and Engineering 5(3): 39-47.
Hamilton, L. R., Macdonald, P. A., Stalzer, M. A., Turley, R. S., Visher, J. L., and Wandzura,
S. M., 1994, 3D method of moments scattering computations using the fast multipole
method, in IEEE Ant. Prop. Int. Symp. Dig. 1: 435-438, Seattle, WA, USA.
Harrington, R. F., 1993, Field Computation by Moment Methods, IEEE Press, Piscataway, NJ,
USA.
He, J. Q., Yu, T. J., Geng, N., and Carin, L., 2000, Methods of moments analysis of
electromagnetic scattering from a general three-dimensional dielectric target embedded in
a multilayered medium, Radio Science 35(2): 305-313.
Hu, B., and Chew, W. C., 2000, Fast inhomogeneous plane wave algorithm for
electromagnetic solutions in layered medium structures: Two-dimensional case, Radio
Science 35(1): 31-43.
Hu, B., and Chew, W. C., 2001, Fast inhomogeneous plane wave algorithm for scattering
from objects above the multilayered medium, IEEE Trans. Geoscience and Remote
Sensing 39(5): 1028-1038.
Jiang, L. J., and Chew, W. C., 2004, Low-frequency fast inhomogeneous plane-wave
algorithm (LF-FIPWA), Microwave Opt. Technol. Lett. 40(2): 117-122.
Kipp, R. A., and Chan, C. H., 1994, A numerically efficient technique for the method of
moments solution of planar periodic structures in layered media, IEEE Trans. Microwave
Theory Tech. 42(4): 635-643.
Knockaert, L., and De Zutter, D., 2000, On the stretching of Maxwell's equations in general
orthogonal coordinate systems and the perfectly matched layer, Microwave Opt. Technol.
Lett. 24(1): 31-34.
Knockaert, L. F., and De Zutter, D., 2002, On the completeness of eigenmodes in a parallel
plate waveguide with a perfectly matched layer termination, IEEE Trans. Antennas
Propag. 50(11): 1650-1653.
Lindell, I. V., 1992, Methods for Electromagnetic Field Analysis, Oxford University Press,
New York, USA.
Ling, F., and Jin, J.-M., 1997, Scattering and radiation analysis of microstrip antennas using
discrete complex image method and reciprocity theorem, Microwave Opt. Technol. Lett.
16(4): 212-216.
Ling, F., Wang, C.-F., and Jin, J.-M., 1998, Application of adaptive integral method to
scattering and radiation analysis of arbitrarily shaped planar structures, Journ. of
Electromagnetic Waves and Appl. 12(8): 1021-1037.
Ling, F., Wang, C.-F., and Jin, J.-M., 2000, An efficient algorithm for analyzing large-scale
microstrip structures using adaptive integral method combined with discrete complex-
image method, IEEE Trans. Microwave Theory Tech. 48(5): 832-839.
Lu, C. C., and Chew, W. C., 1993, A fast algorithm for solving hybrid integral equation,
IEE Proceedings-H 140(6): 455-460.
Lu, C. C., and Chew, W. C., 1994, A multilevel algorithm for solving a boundary integral
equation of wave scattering, Microwave Opt. Technol. Lett. 7(10): 456-470.
Maxwell, J. C., 1954, A Treatise on Electricity and Magnetism, 3rd ed., Dover Publications
Inc., New York, USA.
Michalski, K. A., and Butler, C. M., 1983, Determination of current induced on a conducting
strip embedded in a dielectric slab, Radio Science 18(6): 1195-1206.
Michalski, K. A., and Zheng, D., 1990a, Electromagnetic scattering and radiation by
surfaces of arbitrary shape in layered media, Part I: Theory, IEEE Trans. Antennas
Propag. 38(3): 335-344.
Michalski, K. A., and Zheng, D., 1990b, Electromagnetic scattering and radiation by
surfaces of arbitrary shape in layered media, Part II: Implementation and results for
contiguous half-spaces, IEEE Trans. Antennas Propag. 38(3): 345-352.
Michielssen, E., and Boag, A., 1994, Multilevel evaluation of electromagnetic fields for the
rapid solution of scattering problems, Microwave Opt. Technol. Lett. 7(17): 790-795.
Michielssen, E., and Boag, A., 1996, A multilevel matrix decomposition algorithm for
analyzing scattering from large structures, IEEE Trans. Antennas Propag. 44(8): 1086-
1093.
Mirotznik, M. S., and Prather, D., 1997, How to choose electromagnetic software, IEEE
Spectrum 34(12): 53-58.
Mittra, R., Chan, C. H., and Cwik, T., 1988, Techniques for analyzing frequency selective
surfaces: a review, Proceedings of the IEEE 76(12): 1593-1615.
Mosig, J. R., and Gardiol, F.E., 1985, General integral equation formulation for microstrip
antennas and scatterers, IEE Proc.-H Microwave Antennas Propag. 132(7): 424-432.
Olyslager, F., 1999, Electromagnetic Waveguides and Transmission Lines, Oxford University
Press Inc., New York, USA.
Olyslager, F., and Derudder, H., 2003, Series representation of Green's dyadics for layered
media using PMLs, IEEE Trans. Antennas Propag. 51(9): 2319-2326.
Olyslager, F., 2003, Mathematical Modelling of Wave Phenomena, eds. Nilsson, B., and
Fishman, L., Växjö University Press, Växjö, Sweden, ch. Series Approximation for
Green's functions.
Olyslager, F., 2004, Discretization of continuous spectra based on perfectly matched layers,
SIAM J. Appl. Math. 64(4): 1408-1433.
Parrón, J., Rius, J. M., and Mosig, J. R., 2002, Application of the multilevel matrix
decomposition algorithm to the frequency analysis of large microstrip antenna arrays,
IEEE Trans. on Magnetics 38(2): 721-724.
Pozar, D. M., Targonski, S. D., and Syrigos, H. D., 1997, Design of millimeter wave
microstrip reflectarrays, IEEE Trans. Antennas Propag. 45(2): 287-296.
Pozar, D. M., Targonski, S. D. and Pokuls, R., 1999, A shaped-beam microstrip patch
reflectarray, IEEE Trans. Antennas Propag. 47(7): 1167-1173.
Press, W. H., Teukolsky, S. A., Vetterling, W. T., and Flannery, B. P., 1992, Numerical
recipes in C, 2nd ed., Cambridge University Press, Cambridge, UK.
Rao, S. M., Wilton, D. R., and Glisson, A. W., 1982, Electromagnetic scattering by surfaces
of arbitrary shape, IEEE Trans. Antennas Propag. 30(3): 409-418.
Rokhlin, V., 1990, Rapid solution of integral equations of scattering theory in two
dimensions, J. Comput. Phys. 36(2): 414-439.
Rogier, H., and De Zutter, D., 2001, Bérenger and leaky modes in microstrip substrates
terminated by a perfectly matched layer, IEEE Trans. Microwave Theory Tech. 49(4):
712-715.
Rogier, H., and De Zutter, D., 2002a, Singular behavior of the Berenger and leaky-modes
series composing the 2D Green's function for the microstrip substrate, Microwave Opt.
Technol. Lett. 33(2): 87-93.
Rogier, H., and De Zutter, D., 2002b, Convergence behavior and acceleration of the
Berenger and leaky modes series composing the 2-D Green's function for the microstrip
substrate, IEEE Trans. Microwave Theory Tech. 50(7): 1696-1704.
Schaller, R. R., 1997, Moore's law: Past, present and future, IEEE Spectrum 34(6): 52-59.
Sercu, J., Fach, N., Libbrecht, F., and Lagasse, P., 1995, Mixed potential integral equation
technique for hybrid microstrip-slotline multilayered circuits using a mixed rectangular-
triangular mesh, IEEE Trans. Microwave Theory Tech. 43(5): 1162-1172.
Silvester, P. P., and Ferrari, R. L., 1990, Finite Elements for Electrical Engineers, 2nd ed.,
Cambridge University Press, Cambridge, UK.
Song, J. M. and Chew, W. C., 1995, Multilevel fast-multipole algorithm for solving
combined field integral equations of electromagnetic scattering, Microwave Opt. Technol.
Lett. 10(1): 14-19.
Taflove, A., 1995, Computational Electrodynamics: The Finite-Difference Time-Domain
Method, Artech House, Norwood, MA, USA.
Tai, C.-T., 1993, Dyadic Green's Functions in Electromagnetic Theory, 2nd ed., IEEE Press,
New York, USA.
Tsai, M.-J., De Flaviis, F., Fordham, O., and Alexopoulos, N. G., 1997, Modeling planar
arbitrarily shaped microstrip elements in multilayered media, IEEE Trans. Microwave
Theory Tech. 45(3): 330-337.
Tsalamengas, J. L., and Fikioris, J. G., 1993, TM scattering by conducting strips right on the
planar interface of a three-layered medium, IEEE Trans. Antennas Propag. 41(5): 542-
555.
Tsalamengas, J. L., 1993, TE-scattering by conducting strips right on the planar interface of
a three-layered medium, IEEE Trans. Antennas Propag. 41(12): 1650-1658.
Van Bladel, J., 1985, Electromagnetic Fields, revised printing, Hemisphere Publishing
Corporation, Washington.
Vande Ginste, D., Rogier, H., De Zutter, D., and Olyslager, F., 2004, A fast multipole
method for layered media based on the application of perfectly matched layers: the 2-D
case, IEEE Trans. Antennas Propag. 52(10): 2631-2640.
Vande Ginste, D., 2004, Perfectly Matched Layer Based Fast Multipole Methods for Planar
Microwave Structures, Doctoral thesis, Dept. of Information Technology, Ghent
University, Ghent, Belgium.
Wang, C.-F., Ling, F., and Jin, J.-M., 1998, A fast full-wave analysis of scattering and
radiation from large finite arrays of microstrip antennas, IEEE Trans. Antennas Propag.
46(10): 1467-1474.
Wilcox, C. H., 1964, Asymptotic Solutions of Differential Equations and their Applications,
John Wiley and Sons, Inc., New York, USA.
Zwamborn, A. P. M., and van den Berg, P. M., 1991, A weak form of the conjugate gradient
FFT method for plate problems, IEEE Trans. Antennas Propag. 39(2): 224-228.

Bibliography

Chew, W. C., Jin, J.-J., Michielssen, E., and Song, J., 2001, Fast and Efficient Algorithms in
Computational Electromagnetics, Artech House, Boston, USA.
Derudder, H., 2000, Nieuwe toepassing voor perfect aangepaste lagen in planaire circuits en
golfgeleiders, Doctoral thesis, Dept. of Information Technology, Ghent University, Ghent,
Belgium.
Fach, N., Olyslager, F., and De Zutter, D., 1993, Electromagnetic and Circuit Modelling of
Multiconductor Transmission Lines, Oxford University Press Inc., New York, USA.
Harrington, R. F., 1993, Field Computation by Moment Methods, IEEE Press, Piscataway, NJ,
USA.
Olyslager, F., 1999, Electromagnetic Waveguides and Transmission Lines, Oxford University
Press Inc., New York, USA.
Olyslager, F., 2003, Mathematical Modelling of Wave Phenomena, eds. Nilsson, B., and
Fishman, L., ch. Series Approximation for Green's functions, Växjö University Press,
Växjö, Sweden.
Sercu, J., 1994, Stroomdiscretisatie en interactiematrix berekening bij de momentenmethode
modellering van hoogfrequente planaire structuren, Doctoral thesis, Dept. of Information
Technology, Ghent University, Ghent, Belgium.
Tai, C.-T., 1993, Dyadic Green's Functions in Electromagnetic Theory, 2nd ed., IEEE Press,
New York, USA.
Vande Ginste, D., 2004, Perfectly Matched Layer Based Fast Multipole Methods for Planar
Microwave Structures, Doctoral thesis, Dept. of Information Technology, Ghent
University, Ghent, Belgium.
Chapter 8
PARALLEL GRID-ENABLED FDTD FOR THE
CHARACTERIZATION OF METAMATERIALS

L. Catarinucci¹, G. Monti¹, P. Palazzari² and L. Tarricone¹

¹ Univ. Lecce, Italy; ² ENEA-HPCN, Italy

Abstract: Metamaterials are an appealing new frontier of electromagnetic research.
Interesting applications have been proposed in the recent past, though many
theoretical problems are still open. Along with appropriate analytical methods,
suitable numerical approaches can play a leading role in the study of such
materials, the Finite Difference Time Domain (FDTD) method being one of the
candidate solutions. Introducing such materials into the FDTD scheme is
not straightforward: the frequency-dispersive behaviour of metamaterials, as
well as the numerical instability induced when negative permittivity and
permeability are directly imposed, require an alternative formulation. For this
purpose the Drude model has been implemented. This chapter describes the
implementation of a parallel Variable Mesh FDTD, suitable for simulating
electromagnetic (EM) propagation through metamaterials and able to take
full advantage of grid computing resources.

Key words: FDTD; Variable Mesh; Metamaterials; DNG; Drude Model.

1. INTRODUCTION

Metamaterials, and more specifically one class of metamaterials, namely
double negative (DNG) materials, are attractive for a wide range of
applications in the area of microwave, millimeter-wave and quasi-optical
circuits and antennas. Though several properties still need to be investigated
in depth, the time when these kinds of materials will routinely be
adopted in the design of EM circuits is not far away.
The diffusion of such technologies in daily computer-aided design
(CAD) of circuits and antennas basically depends on a deep
understanding of their properties. To this aim, the use of suitable numerical
methods is of paramount importance. In this chapter, the finite-difference
time-domain (FDTD) method is proposed as a viable and attractive
approach.

L. Tarricone and A. Esposito (eds.), Advances in Information Technologies for Electromagnetics, 223–264.
© 2006 Springer. Printed in the Netherlands.

On the other hand, the prospective use of FDTD tools in the CAD of
metamaterial-based devices raises the problem of time-demanding
tasks, one apparent example being the optimization of circuits, where
numerical simulations of complex apparatuses are typically repeated
many times. In this chapter, this demand is met by proposing
a parallel implementation of the FDTD tool, based on a message-passing
approach (adopting the MPI library), and combining it with a simple and
effective variable mesh, so that both memory and CPU time are optimally
used. Finally, the adoption of MPI paves the way to grid computing,
so that a high-performance, low-cost, portable
FDTD tool is available.
Accordingly, the chapter is structured as follows. First, Section 2
introduces metamaterials and DNG materials. Later on, the basic phenomena
related to DNG are described (Section 3), as well as how a DNG can be
synthesised (Section 4). Section 5 reviews possible applications, and Section
6 addresses the specific theme of finite bandwidth signals in DNG. Section 7
finally addresses the area of FDTD simulation of DNG.

2. INTRODUCTION TO METAMATERIALS

Metamaterials are artificially synthesised materials with unusual
dielectric and magnetic properties, generally attained by including metals or
ordinary dielectrics inside a host material, or by periodically loading a transmission
line with R, L or C lumped or distributed elements.
The metamaterial idea arises from the observation that the concept of
homogeneity is relative. Indeed, for frequencies corresponding to
a wavelength comparable with the atomic distances, every material is
inhomogeneous, consisting of molecules made up of atoms. Now, let us
consider a volume V of dielectric material and include in it a conductor or a
dielectric of a different nature in such a way that, when an external electric or
magnetic field is applied in the region of space V, its response emulates
a desired value of electric permittivity and magnetic permeability:

\[ \varepsilon_{eff} = \frac{D_{av}}{E_{av}}, \qquad \mu_{eff} = \frac{B_{av}}{H_{av}} \tag{8.1} \]

where $D_{av}$ and $B_{av}$ are, respectively, the averages of the electric displacement and
the magnetic induction vector over the region V, while $E_{av}$ and $H_{av}$ are the
averages of the applied electric and magnetic fields. The material obtained as a
composition of regions similar to V (such regions can be considered the
"molecules" of the material to be synthesised) is homogeneous for frequencies
whose wavelength is much greater than the linear dimension of the
inclusions. Such a metamaterial is characterised by $\varepsilon_{eff}$ and $\mu_{eff}$, calculated
through homogenisation techniques [Pendry, 1999].
So, an accurate analysis of the geometry, position and material
constituting the inclusions allows us to synthesise a metamaterial with
unusual values of $\varepsilon_{eff}$ and $\mu_{eff}$.
Interesting applications of metamaterials have been proposed in the
recent past. Consequently, a detailed analysis of their electromagnetic
properties can be quite useful for a wide variety of metamaterial devices, and
suitable numerical methods are needed.
For such purposes, the FDTD approach is well suited,
Conformal FDTD or Lumped-element FDTD being some possible
examples.
In this chapter, a state-of-the-art variable mesh (VM) FDTD tool is
proposed, embedding several attractive features, suitable for the analysis of
dispersive metamaterial slabs, or more generally for metamaterial-based
devices.

2.1 DNG Metamaterials

Introduced in 1968 by Veselago [Veselago, 1968], DNG materials are a
relevant, new class of metamaterials, with negative values of both the
dielectric permittivity $\varepsilon$ and the magnetic permeability $\mu$, while a material
having only one of these parameters smaller than zero is referred to as Single
Negative (SNG; MNG if $\mu < 0$, ENG if $\varepsilon < 0$).
Now, let us write the dispersion equation for an isotropic lossless
material:

\[ k^2 = \frac{\omega^2}{c^2}\, n^2, \qquad n^2 = \varepsilon\mu \tag{8.2} \]

At first glance, a simultaneous change of the sign of $\varepsilon$ and $\mu$ should
have no effect on these relations. On the contrary, Veselago showed that
materials with negative values of both $\varepsilon$ and $\mu$ have some properties different
from those of conventional substances (Double Positive, DPS, with $\varepsilon$ and $\mu$ greater than
zero).
To demonstrate this, we must consider the relations where $\varepsilon$ and $\mu$ appear
separately, such as Maxwell's equations and the constitutive relations:

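\[
\begin{aligned}
\nabla \times \mathbf{E} &= -\frac{1}{c}\,\frac{\partial \mathbf{B}}{\partial t}, &\qquad \mathbf{B} &= \mu \mathbf{H} \\
\nabla \times \mathbf{H} &= \frac{1}{c}\,\frac{\partial \mathbf{D}}{\partial t}, &\qquad \mathbf{D} &= \varepsilon \mathbf{E}
\end{aligned}
\tag{8.3}
\]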

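For a monochromatic plane wave, in the frequency domain, they reduce
to:

\[ \mathbf{k} \times \mathbf{E} = \frac{\omega}{c}\,\mu\,\mathbf{H}, \qquad \mathbf{k} \times \mathbf{H} = -\frac{\omega}{c}\,\varepsilon\,\mathbf{E} \tag{8.4} \]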

So, one can see that if $\varepsilon$ and $\mu$ are greater than zero then $(\mathbf{E}, \mathbf{H}, \mathbf{k})$ form a
right-handed triplet, while, if $\varepsilon$ and $\mu$ are less than zero, then $(\mathbf{E}, \mathbf{H}, \mathbf{k})$ form a
left-handed set (Fig. 8-1). Consequently, Veselago named these materials
"Left-Handed" (LH).
For the Poynting vector, giving the wave power density, we have:

\[ \mathbf{S} = \frac{c}{4\pi}\, \mathbf{E} \times \mathbf{H} \tag{8.5} \]

As in a DPS medium, $\mathbf{S}$ forms a right-handed system with $\mathbf{E}$ and $\mathbf{H}$, so
$\mathbf{S}$ and $\mathbf{k}$ point in opposite directions in a DNG material.
This implies that, in a DNG medium, the phase velocity, which is
directed along the wave vector $\mathbf{k}$, is antiparallel to the group velocity, which is
directed along the Poynting vector $\mathbf{S}$ (backward wave [Ramo et al., 1995]).
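
The handedness argument can be checked numerically. The following is a minimal sketch, not part of the original formulation: the helper names (`cross`, `dot`, `wave_vector_direction`) are illustrative assumptions, and only the signs appearing in the plane-wave relations (8.4) are used, so the factor relating frequency to the speed of light and the field magnitudes play no role.

```python
def cross(a, b):
    """Cross product of two 3-component tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    """Dot product of two 3-component tuples."""
    return sum(x * y for x, y in zip(a, b))


def wave_vector_direction(E, H, eps, mu):
    """Return the direction of k (as +/- E x H) compatible with Eq. (8.4).

    Hypothetical helper: it only tests signs, i.e. k x E must be parallel
    to mu*H and k x H must be parallel to -eps*E. Returns None when no
    direction satisfies both conditions (single-negative medium).
    """
    S = cross(E, H)  # direction of the Poynting vector, Eq. (8.5)
    for sign in (1, -1):
        k = tuple(sign * s for s in S)
        first = dot(cross(k, E), H) * mu > 0
        second = dot(cross(k, H), E) * eps < 0
        if first and second:
            return k
    return None


E, H = (1, 0, 0), (0, 1, 0)
S = cross(E, H)
k_dps = wave_vector_direction(E, H, eps=1, mu=1)
k_dng = wave_vector_direction(E, H, eps=-1, mu=-1)
print("DPS: k.S =", dot(k_dps, S))   # positive: k parallel to S
print("DNG: k.S =", dot(k_dng, S))   # negative: k antiparallel to S
```

For a single-negative medium the helper returns `None`, which mirrors the absence of a propagating plane-wave solution in that case.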
It is worth observing that from Eq. (8.4) it can also be derived that the
wave impedance of a DNG medium is positive, as for a DPS medium:

\[ \eta = \sqrt{\frac{\mu}{\varepsilon}} \tag{8.6} \]

Figure 8-1. Relative position of the electric and magnetic field with the propagation vector, in
a RHM (on the left) and a LHM (on the right) medium.

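So a DNG medium with $(\varepsilon = -|\varepsilon|,\ \mu = -|\mu|)$ can be matched to a DPS
medium with $(\varepsilon = |\varepsilon|,\ \mu = |\mu|)$: since the two media have the same wave
impedance, no reflected wave exists at their interface.
Indeed, considering a monochromatic wave, with an angular frequency
$\omega_0$, impinging normally from a DPS medium (for example from the free
space side) on a DNG interface, the reflection and transmission coefficients
are:

\[ r_{01} = \frac{\eta - \eta_0}{\eta + \eta_0}, \qquad t_{01} = \frac{2\eta}{\eta + \eta_0} \tag{8.7} \]

Assuming that the DNG slab is matched to free space at $\omega_0$
$(\varepsilon = -\varepsilon_0,\ \mu = -\mu_0)$, we have:

\[ \eta = \sqrt{\frac{\mu_0}{\varepsilon_0}} = \eta_0 \;\Rightarrow\; r_{01} = 0, \quad t_{01} = 1 \tag{8.8} \]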
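
The matching condition can be illustrated with a short numerical sketch. It is only an illustration under stated assumptions: the function names are hypothetical, and SI values of the vacuum constants are used for convenience, which does not affect the result because only the impedance ratio enters the reflection and transmission coefficients.

```python
import cmath

EPS0 = 8.8541878128e-12      # vacuum permittivity, F/m (SI value, assumption)
MU0 = 4e-7 * cmath.pi        # vacuum permeability, H/m (classical SI value)


def wave_impedance(eps, mu):
    """Wave impedance sqrt(mu/eps); the principal root is taken."""
    return cmath.sqrt(mu / eps)


def normal_incidence_coefficients(eta, eta0):
    """Reflection and transmission coefficients at normal incidence."""
    r01 = (eta - eta0) / (eta + eta0)
    t01 = 2 * eta / (eta + eta0)
    return r01, t01


eta0 = wave_impedance(EPS0, MU0)
eta_dng = wave_impedance(-EPS0, -MU0)   # matched DNG slab
r01, t01 = normal_incidence_coefficients(eta_dng, eta0)
print(abs(r01), abs(t01))
```

Since both constitutive parameters change sign together, their ratio, and hence the impedance, is unchanged, which is why the reflection coefficient vanishes.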

3. NEGATIVE REFRACTION

Another attractive property of a DNG medium is negative refraction.
It is well known that, at the interface between two different materials, the
fields $\mathbf{E}$ and $\mathbf{H}$ must satisfy the following boundary conditions:

\[
\begin{aligned}
\mathbf{E}_{t1} &= \mathbf{E}_{t2}, &\qquad \mathbf{H}_{t1} &= \mathbf{H}_{t2} \\
\varepsilon_1 E_{n1} &= \varepsilon_2 E_{n2}, &\qquad \mu_1 H_{n1} &= \mu_2 H_{n2}
\end{aligned}
\tag{8.9}
\]

Consequently, if the two materials have the same handedness, the field
components normal to the interface only change in magnitude, maintaining
the same direction in the incident and in the refracted ray. If the materials
have opposite handedness, the normal components, like the vector $\mathbf{k}$, change sign
passing from one medium to the other, giving a negatively refracted ray (Fig. 8-2a).
In this way, Veselago showed that, if one admits that a DNG medium can
exist, then Snell's law must be rewritten as follows:

\[ \frac{\sin(\theta_r)}{\sin(\theta_i)} = \frac{n_1}{n_2} = p\,\sqrt{\frac{\varepsilon_1 \mu_1}{\varepsilon_2 \mu_2}} \tag{8.10} \]

where p is equal to +1 if the two media have the same handedness, otherwise it
is equal to -1; so the index of refraction of a DNG medium is negative.
Negative refraction implies one more interesting characteristic,
namely a double focusing effect revealed by a simple ray diagram; indeed,
assuming a source located in a DPS medium at distance d1 from the front face
of a matched DNG slab of thickness d2, there are two distinct focal areas: the
former inside the slab and the latter at distance d = d2-d1 from the slab's
back face (Fig. 8-2b). This property allows the design of unusual refracting
systems, such as the "perfect lens" proposed by Pendry [Pendry, 2000] and
analyzed later in the chapter.
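
As a hedged illustration of the modified Snell's law and of the external focus position, the sketch below evaluates the refraction angle for a DPS/DPS and a DPS/DNG interface. The function names and the numerical values are assumptions introduced here; each medium is assumed to be either DPS or DNG, so that the product of permittivity and permeability is positive, and the requirement d1 < d2 is an assumption of this sketch.

```python
import math


def refraction_angle(theta_i, eps1, mu1, eps2, mu2):
    """Refraction angle (radians) from the modified Snell's law, Eq. (8.10).

    Assumes each medium is either DPS or DNG, so that eps*mu > 0.
    """
    p = 1 if (eps1 > 0) == (eps2 > 0) else -1   # same or opposite handedness
    s = p * math.sqrt((eps1 * mu1) / (eps2 * mu2)) * math.sin(theta_i)
    if abs(s) > 1:
        raise ValueError("no refracted ray (total reflection)")
    return math.asin(s)


def second_focus_distance(d1, d2):
    """Distance d = d2 - d1 of the external focus from the slab's back face."""
    if d1 >= d2:
        raise ValueError("the source must be closer than the slab thickness")
    return d2 - d1


theta_i = math.radians(30.0)
print(math.degrees(refraction_angle(theta_i, 1.0, 1.0, 2.0, 1.0)))    # DPS to DPS
print(math.degrees(refraction_angle(theta_i, 1.0, 1.0, -1.0, -1.0)))  # DPS to matched DNG
print(second_focus_distance(d1=1.0, d2=3.0))
```

For the matched DNG case the refracted ray emerges at the mirror angle with respect to the normal, which is the geometric origin of the double focusing effect.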

4. HOW TO SYNTHESIZE A DNG MEDIUM

Smith's medium. Despite the interesting observations by Veselago, the
scientific community did not pay substantial attention to DNG
metamaterials, because materials with $\varepsilon < 0$ and $\mu < 0$ do not exist
in nature. This lasted until 2000, when DNG media were brought to the
attention of the scientific community by Smith [Smith et al., 2000], who
synthesized an LH material as a medium composed of two structures
separately having $\varepsilon < 0$ and $\mu < 0$ in the microwave regime.

Figure 8-2. a) Refracted ray in a Positive Refractive Index (PRI) medium and in a Negative
Refractive Index (NRI) medium. In a DNG material the ray forms a negative angle with the
normal (this is consistent with a negative refractive index). b) Double focusing effect given by
a DNG slab in a matched DPS medium.

It has been shown by Pendry [Pendry et al., 1999; Pendry, 2004] that a
medium constructed from thin metallic wires periodically embedded into a host
dielectric behaves as a homogeneous material, with a corresponding plasma
frequency ($f_p$), when the lattice constant of the structure (the mutual distance
of the thin wires) and the diameter of the wire are small in comparison with the
wavelength; so an array of parallel conducting thin wires on a dielectric
substrate shows an ENG behaviour at frequencies below $f_p$ (Fig. 8-3c).
The inclusion proposed to obtain negative magnetic permeability is a
resonant particle, the Split Ring Resonator (SRR) [Pendry et al., 1999], a
highly conductive structure in which the small gap between the two rings
gives a large capacitance balancing the rings' inductance (Fig. 8-3a).
Furthermore, the split of the rings ensures a resonant frequency corresponding
to a wavelength several times larger than the diameter of the
rings. The resonant behaviour is achieved by applying an external time-varying
magnetic field perpendicular to the ring surface, inducing currents
that produce a magnetic field that may either oppose or enhance the incident
field, thus resulting in positive or negative effective permeability.
Consequently an array of SRRs on a dielectric substrate gives an MNG
behaviour near the SRR resonant frequency. The medium achieved in this
way is strongly dispersive (the magnetic permeability varies quickly with
frequency [Pendry, 2004]) and lossy.
Combining the MNG and ENG structures (Fig. 8-3d), the resulting
electric permittivity and magnetic permeability take the Lossy-Drude model
form (Fig. 8-3e):

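\[
\begin{aligned}
\varepsilon(\omega) &= \varepsilon_0 \left[ 1 - \frac{\omega_{pe}^2}{\omega(\omega + j\Gamma_e)} \right] \\
\mu(\omega) &= \mu_0 \left[ 1 - \frac{\omega_{pm}^2}{\omega(\omega + j\Gamma_m)} \right]
\end{aligned}
\tag{8.11}
\]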

where $\omega_{pe}$, $\omega_{pm}$ are respectively the electric and magnetic plasma
frequencies, while $\Gamma_e$, $\Gamma_m$ represent the losses of the system; these relations
indicate that the medium acts as DNG below the plasma frequency.
Furthermore, due to the SRR anisotropy, the medium presented in [Smith
et al., 2000] is one-dimensional. Indeed, the structure exhibits a DNG
behaviour only for a wave with the electric field polarized along the rings'
gap, and the magnetic field perpendicular to the ring surface (Fig. 8-3a).
This happens for only one direction of propagation. Assuming in Eq. (8.11)
$\omega_{pe} = \omega_{pm} = \omega_p$, and neglecting the system losses, the phase constant is:

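\[ \beta(\omega) = \omega \sqrt{\varepsilon(\omega)\,\mu(\omega)} = \frac{\omega}{c} \left( 1 - \frac{\omega_p^2}{\omega^2} \right) \tag{8.12} \]

where c is the speed of light, while the refractive index (n) and the phase and
group velocities ($v_p$, $v_g$) are:

\[ n = \sqrt{\frac{\varepsilon\mu}{\varepsilon_0\mu_0}} = 1 - \frac{\omega_p^2}{\omega^2} \tag{8.13} \]

\[ v_p = \left( \frac{\beta}{\omega} \right)^{-1} = c \left( 1 - \frac{\omega_p^2}{\omega^2} \right)^{-1}, \qquad v_g = \left( \frac{\partial \beta}{\partial \omega} \right)^{-1} = c \left( 1 + \frac{\omega_p^2}{\omega^2} \right)^{-1} \tag{8.14} \]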
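
A small numerical sketch may help to visualise the lossless Drude relations for the refractive index and for the phase and group velocities. The function names, the 10 GHz plasma frequency and the choice of expressing velocities in units of c are assumptions made only for illustration.

```python
import math


def drude_relative(omega, omega_p, gamma=0.0):
    """Relative permittivity (or permeability) of the Lossy-Drude model."""
    return 1.0 - omega_p**2 / (omega * (omega + 1j * gamma))


def lossless_drude(omega, omega_p, c=1.0):
    """Refractive index, phase velocity and group velocity without losses."""
    ratio = (omega_p / omega) ** 2
    n = 1.0 - ratio
    v_p = c / (1.0 - ratio)      # undefined at omega == omega_p
    v_g = c / (1.0 + ratio)
    return n, v_p, v_g


omega_p = 2.0 * math.pi * 10e9            # assumed plasma frequency (10 GHz)
omega = omega_p / math.sqrt(2.0)          # frequency at which n is close to -1
n, v_p, v_g = lossless_drude(omega, omega_p)
print(n, v_p, v_g)   # n and v_p negative, v_g positive (velocities in units of c)
```

Below the plasma frequency the phase velocity is negative while the group velocity stays positive, which is the backward-wave behaviour discussed for DNG media.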

Composite Right-Handed Left-Handed (CRLH) medium. In 2002
several authors ([Caloz and Itoh, 2002], [Iyer, 2002], [Eleftheriades et al.,
2002]) proposed an alternative to Smith's medium, based on an L-C
distributed network representation of homogeneous dielectrics.
It is well known that a DPS medium $(\varepsilon = |\varepsilon|,\ \mu = |\mu|)$ can be modeled with
a distributed L-C network in low-pass topology. Relating the per-unit-length
capacitance and inductance to the electric permittivity and magnetic
permeability of the medium as follows:

\[ \mu = \frac{Z}{j\omega} = \frac{j\omega L_0}{j\omega} = L_0, \qquad \varepsilon = \frac{Y}{j\omega} = \frac{j\omega C_0}{j\omega} = C_0 \tag{8.15} \]

the Transmission Line (TL) propagation constant and its characteristic
impedance become:

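\[ \beta = \sqrt{-ZY} = \omega\sqrt{\mu\varepsilon}, \qquad Z_0 = \sqrt{\frac{L_0}{C_0}} = \sqrt{\frac{\mu}{\varepsilon}} \tag{8.16} \]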

so they are respectively equal to the propagation constant and the wave
impedance of the DPS medium.
This concept is not limited to a DPS medium, but it is also applicable to
obtain a DNG behaviour $(\varepsilon = -|\varepsilon|,\ \mu = -|\mu|)$, by imposing negative values for L
and C. This implies considering the dual high-pass topology, depicted in Fig.
8-4a, made out of series capacitors and shunt inductors.
For this configuration Eqs. (8.15) and (8.16) turn into:

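\[
\begin{aligned}
-|\mu| &= \frac{Z}{j\omega} = \frac{1}{j\omega\,(j\omega C)} = -\frac{1}{\omega^2 C} \\
-|\varepsilon| &= \frac{Y}{j\omega} = \frac{1}{j\omega\,(j\omega L)} = -\frac{1}{\omega^2 L}
\end{aligned}
\tag{8.17}
\]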

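\[
\begin{aligned}
\beta &= -\sqrt{-ZY} = -\sqrt{\frac{-1}{(j\omega C)(j\omega L)}} = -\frac{1}{\omega\sqrt{CL}} \\
Z_0 &= \sqrt{\frac{\mu}{\varepsilon}} = \sqrt{\frac{L}{C}}
\end{aligned}
\tag{8.18}
\]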

Figure 8-3. a) Split Ring Resonator particle and incident wave to obtain a negative effective
permeability; b) MNG medium as an array of SRRs; c) ENG medium as an array of thin wires; d)
DNG medium as composition of an ENG and an MNG medium; e) Lossy-Drude model
relative electric permittivity and magnetic permeability ($\omega_{pe} = \omega_{pm} = \omega_p$, $\Gamma_e = \Gamma_m = 0$).

Figure 8-4. a) Dual transmission line model of a DNG medium; b) Composite Right-Handed-
Left-Handed medium (CRLH) made of two TL loaded with lumped elements.

Consequently, a host TL planar network medium, periodically loaded
with lumped series capacitors and shunt inductors (Composite Right-Handed
Left-Handed (CRLH) medium), acts as a homogeneous DNG medium at the
frequencies corresponding to a wavelength much larger than the unit cell
dimension (d).
Referring to Fig. 8-4b, Eq. (8.15) becomes:

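\[ \mu = L_0 - \frac{1}{\omega^2 C d}, \qquad \varepsilon = C_0 - \frac{1}{\omega^2 L d} \tag{8.19} \]

Choosing $L_0 C = C_0 L$ (balanced case), the unit cell propagation constant is:

\[ \beta = \beta_{TL} + \beta_{LH} = \omega\sqrt{L_0 C_0} - \frac{1}{\omega d \sqrt{CL}} \tag{8.20} \]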

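Eq. (8.20) shows the dual nature of the CRLH medium, which
behaves as DNG for frequencies below $\omega = 1/\sqrt{L_0 C d}$, and as DPS elsewhere.
The refractive index and the phase and group velocities are:

\[ n = \sqrt{\frac{\mu\varepsilon}{\mu_0\varepsilon_0}} = c\,\sqrt{\left( L_0 - \frac{1}{\omega^2 C d} \right)\left( C_0 - \frac{1}{\omega^2 L d} \right)} \tag{8.21} \]

\[
\begin{aligned}
v_p &= \left( \frac{\beta}{\omega} \right)^{-1} = \left( \sqrt{L_0 C_0} - \frac{1}{\omega^2 d \sqrt{CL}} \right)^{-1} \\
v_g &= \left( \frac{\partial \beta_{TL}}{\partial \omega} + \frac{\partial \beta_{LH}}{\partial \omega} \right)^{-1} = \left( \sqrt{L_0 C_0} + \frac{1}{\omega^2 d \sqrt{CL}} \right)^{-1}
\end{aligned}
\tag{8.22}
\]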

It is worth observing that this dual transmission line approach gives a
broadband DNG behaviour, since it does not depend on resonant particles.
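
The sign change of the balanced CRLH propagation constant can be sketched numerically as follows. The element values are hypothetical and were chosen only so that the balanced condition holds; the function names are likewise assumptions of this sketch, and the transition frequency is obtained simply by setting the balanced-case propagation constant to zero.

```python
import math


def crlh_beta(omega, L0, C0, L, C, d):
    """Balanced-case propagation constant: beta_TL + beta_LH."""
    return omega * math.sqrt(L0 * C0) - 1.0 / (omega * d * math.sqrt(C * L))


def crlh_transition(L0, C0, L, C, d):
    """Angular frequency at which the balanced-case beta vanishes."""
    return 1.0 / math.sqrt(d * math.sqrt(L0 * C0) * math.sqrt(C * L))


# Hypothetical element values, chosen so that L0*C == C0*L (balanced case).
L0, C0 = 250e-9, 100e-12      # host line: H/m and F/m
L, C = 2.5e-9, 1e-12          # loading elements: H and F
d = 5e-3                      # unit cell length, m

w0 = crlh_transition(L0, C0, L, C, d)
for w in (0.5 * w0, 2.0 * w0):
    beta = crlh_beta(w, L0, C0, L, C, d)
    print(f"f = {w / (2 * math.pi) / 1e9:.2f} GHz, beta = {beta:.1f} rad/m")
```

Below the transition frequency the propagation constant is negative (left-handed behaviour), above it the line behaves as a conventional right-handed one.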
A simple one-dimensional realization of a CRLH medium was achieved in 2002
by Caloz, who realized such a line in microstrip technology, using
interdigital capacitors as series capacitances, and shorted stubs as distributed
shunt inductances.
In the same year, Eleftheriades [Eleftheriades et al., 2002] extended this
approach to two dimensions: a planar "perfect lens" was constructed
using standard printed circuit board (PCB) techniques and its double
focusing properties were demonstrated experimentally.

5. DNG MEDIA APPLICATIONS

Perfect lens. One of the most attractive applications was proposed in 2000
by Pendry [Pendry, 2000]. He found that lenses made out of a DNG slab in a
matched DPS medium allow a complete reconstruction of a point source into
a perfect image, without the conventional optical limitations.
It is well known that, in a lens made out of a DPS material, the maximum
resolution is equal to the medium's wavelength ($\lambda$), because the information
relative to sub-wavelength distances is carried by the evanescent waves,
whose amplitude decays exponentially in a DPS medium.
Pendry's analysis shows that evanescent waves would experience an
exponential growth in a DNG medium. This counterintuitive phenomenon is
confirmed by Grbic [Grbic and Eleftheriades, 2003], based on the two-dimensional
version of the dual transmission line approach; so Pendry's lens
is able to focus the entire spectrum, both the propagating and the evanescent
part, thanks to the amplification experienced by the evanescent waves
and the negative phase delay experienced by the propagating waves, giving a
sub-wavelength resolution.
Later on, Ziolkowski [Ziolkowski and Heyman, 2001] pointed out that
this is possible only in a lossless DNG slab matched to free space. This is
confirmed by Smith et al. [2003], who found that the amount of losses
allowing the perfect lens effect is very critical. Consequently, given the
available technology, the perfect image is possible only for source-to-image
distances much smaller than one wavelength, unless losses in the DNG
medium are exceedingly small [Smith et al., 2003].

Compact Cavity Resonator. Engheta [Engheta, 2002] proposes a one-dimensional
Compact Cavity Resonator made out of a lossless slab of a DPS
material arranged into a sandwich structure with another slab of DNG material. The
whole structure is backed by metallic plates.
A monochromatic solution, with a time dependence $\exp(-j\omega t)$ and with
the electric and magnetic field vectors oriented along the x and y directions
(Fig. 8-5), experiences a phase delay ($\varphi_{DPS}$) during the propagation in the
forward medium, compensated by the propagation in the backward medium,
which gives a phase advance $\varphi_{DNG}$:

\[
\begin{aligned}
\varphi_{DPS} &= n_{DPS}\,\beta_0(f_0)\,d_1 \\
\varphi_{DNG} &= n_{DNG}\,\beta_0(f_0)\,d_2 = -n_{DPS}\,\beta_0(f_0)\,d_2
\end{aligned}
\tag{8.23}
\]

So, choosing d1/d2 = n2/n1, the phase difference between the two metallic
plates is equal to zero, and the structure acts as a phase compensator,
independently of the sum of the two layers' thicknesses.
A non-trivial solution of this 1-D Compact Cavity Resonator, satisfying
the boundary conditions imposed by the structure's geometry, is obtained for
values of d1, d2 given by the following relations:

tan( nDPS k0d1 ) nDPS 2


=
tan( nDNG k0d 2 ) nDNG 1
(8.24)
d1 nDPS 2
nDPS 0 ( f0 )d1 0
d2 nDNG 1

Consequently the structure can have a 1-D solution if the ratio of the two
layers' thicknesses satisfies Eq. (8.24), while their sum can assume any value.

Backward-wave Antennas. Several applications have been proposed for
DNG materials in the field of antennas, based on the CRLH model.
Grbic [Grbic and Eleftheriades, 2002] proposes an NRI antenna in
Coplanar Waveguide technology, supporting a radiating backward-wave
fundamental spatial harmonic, while Caloz realizes a leaky-wave
antenna with a conical beam that exhibits both backward and forward
leakage [Allen et al., 2004].
Furthermore, in the field of antennas, Ziolkowski [Ziolkowski and
Kipple, 2003] finds that a shell of DNG material surrounding a short dipole
antenna acts as a matching network, increasing the antenna radiation
efficiency.

Planar circuit applications. In this area, many realizations have already
been obtained, demonstrating how, using a DNG material, it is possible
to improve the performance of several microwave components, such as:
Waveguides: Hrabar obtains a miniaturised waveguide by filling it with a
uniaxial negative-permeability material based on SRRs [Hrabar, 2003].
Phase-shifters: Based on a CRLH model, Antoniades and Eleftheriades
propose a Compact Phase-Shifter that exhibits a linear phase response
around the design frequency and a group delay shorter than that of
conventional delay lines [Antoniades and Eleftheriades, 2003].
Coupled-lines: Caloz proposes several types of miniaturized coupled-line
couplers based on the CRLH model synthesized in microstrip technology
[Caloz et al., 2003, 2004]. Furthermore, he realizes a Branch-Line Coupler
with two arbitrary operating frequencies [Lin et al., 2003].

Figure 8-5. Mono-dimensional Compact Cavity Resonator proposed by Engheta.

6. MODULATED SIGNALS IN A DNG MEDIUM

6.1 Dispersion

In the previous section we have described some cases where the use of
DNG slabs is appealing.
Some recent papers [Al et al., 2004] investigate the introduction of
DNG slabs in real applications, though these studies are limited to the case
of input signals represented by plane waves at one single frequency. Of
course, since a DNG medium is a dispersive material, this is a severe
limitation to the understanding of the behaviour of DNG slabs, as well as to
their use in MW circuits.
Indeed, with reference to the DNG medium proposed by Smith, the
constitutive parameters can be approximated with the Lossy-Drude model,
so, rewriting Eq. (8.14), the phase and group velocities ($v_p$, $v_g$) are:

\[ v_p = \left( \frac{\beta}{\omega} \right)^{-1} = c \left( 1 - \frac{\omega_p^2}{\omega^2} \right)^{-1}, \qquad v_g = \left( \frac{\partial \beta}{\partial \omega} \right)^{-1} = c \left( 1 + \frac{\omega_p^2}{\omega^2} \right)^{-1} \]

which manifests the DNG medium's dispersive behaviour.


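In a transmission context, dispersion means that different spectral
components travel at different speeds; in particular, in a DNG medium, the
higher-frequency components experience a lower time delay (anomalous
dispersion [Someda, 1998]). Indeed, we have:

$$\frac{\tau_p}{d} = \frac{1}{\tilde{v}_p(\omega)} = \frac{\omega^2-\omega_p^2}{\omega^2\,\tilde{c}} \tag{8.25}$$

$$\frac{\tau_g}{d} = \frac{1}{\tilde{v}_g(\omega)} = \frac{\omega^2+\omega_p^2}{\omega^2\,\tilde{c}} \qquad \left(\frac{\partial\tau_g}{\partial\omega}<0\right) \tag{8.26}$$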

where $\tau_p$ and $\tau_g$ are, respectively, the phase delay and the group
delay. Consequently, a broadband propagating signal experiences a distortion
of its time envelope.
So, a monochromatic analysis, useful for a first understanding of the
propagation behaviour, is of limited use for application purposes. It is
therefore important to extend the investigation to the case of input
signals with finite bandwidth, such as modulated signals [Valanju et al.,
2002; Smith et al., 2002].
To this end, in the next section we rigorously study the propagation of an
amplitude-modulated signal in a DNG material, assuming that the envelope of
the time-domain signal is a Gaussian function.

6.2 Gaussian Pulse in a DNG Slab

Let us consider an x-directed plane wave with angular frequency $\omega_0$
and a z-polarized electric field, amplitude-modulated by a Gaussian pulse and
propagating in a slab of a dispersive medium (Fig. 8-6).
In the time and frequency domains, the input signal can be written as
[Someda, 1998]:

$$\mathbf{E}(t, f_0) = E_z(t, f_0)\,\hat{z} = A_i(t)\cos(2\pi f_0 t)\,\hat{z}, \qquad i = 1,2 \tag{8.27}$$



$$A_1(t,\tau) = S\exp\left\{-\frac{(t-\tau)^2}{\sigma^2}\right\} \tag{8.28}$$

$$\Psi_1(f,\tau) = S\,\sigma\,\pi^{1/2}\exp\left\{-\frac{(2\pi f)^2\sigma^2}{4}\right\}\exp\{j2\pi f\tau\} \tag{8.29}$$

$$\Psi_z(f,f_0) = \frac{1}{2}\left\{\Psi_i(f+f_0) + \Psi_i(f-f_0)\right\}, \qquad i = 1,2 \tag{8.30}$$

where $\tau$ is the instant when the pulse reaches its maximum amplitude, and
$\sigma$ determines the pulse width (Fig. 8-7). The propagation in the slab is
characterized by an attenuation factor $\alpha(\omega)$ and a phase factor
$\beta(\omega)$; assuming that the bandwidth of the propagating signal is small
with respect to $\omega_0$, and that the attenuation factor is small and
constant in this frequency range, the attenuation can be neglected in our
calculation.
The time behaviour of the propagating signal at the point R of Fig. 8-6, at
a distance d from the front face of the slab, can be calculated as an inverse
Fourier transform:

$$E_z(t,d) = \int_{-\infty}^{+\infty}\Psi'_z(f,f_0,d)\,df = \int_{-\infty}^{+\infty}\Psi_z(f,f_0)\exp\{j[\beta(f)d - 2\pi f t]\}\,df \tag{8.31}$$

Assuming that the phase constant $\beta(\omega)$ can be expanded as a Taylor
series around $\omega_0$, we can write:

$$\begin{aligned}\beta(f) &\cong \beta(f_0) + \left.\frac{d\beta(f)}{df}\right|_{f_0}(f-f_0) + \frac{1}{2}\left.\frac{d^2\beta(f)}{df^2}\right|_{f_0}(f-f_0)^2 + \dots\\ &= \beta_0 + \beta'_0\,(f-f_0) + \frac{1}{2}\,\beta''_0\,(f-f_0)^2\end{aligned} \tag{8.32}$$

Figure 8-6. The geometry of the slab and of the FDTD-TF/SF domain. The total field region
is composed of 182000 cells (1000 [#cell]- 183000 [#cell]); the DNG slab is located between:
1500 [#cell]- 182500 [#cell].

Figure 8-7. Gaussian pulse time envelope and Pulse width definition.

Substituting Eq. (8.32) into (8.31), for a non-dispersive medium
($\beta''_0 = 0$), (8.31) yields:

$$\mathbf{E}(t,d) = \mathbf{E}(t-\beta'_0 d,\ d=0) \tag{8.33}$$


The comparison of Eqs. (8.33) and (8.27) indicates that the time duration
of the propagating signal does not depend on d, whilst the propagating pulse
has its maximum amplitude when $t = \tau + \beta'_0 d$. Consequently, the
envelope travels undistorted, with a group velocity $v_g = 1/\beta'_0$.
For a dispersive medium, such as a DNG material, Eq. (8.31) gives:

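$$\begin{aligned}
E_z(t,d) = {} & \frac{1}{2}\,S\,\sigma\left[\sigma^4+(2\beta''_0 d)^2\right]^{-1/2}\exp\left\{-\frac{\sigma^2\,(t'-\beta'_0 d)^2}{\sigma^4+4\beta''^{\,2}_0 d^2}\right\}\\
& \times\Bigg\{\left[\sigma^2+j2\beta''_0 d\right]^{1/2}\exp\left\{j\left[\beta_0 d-\omega_0 t'-\frac{2\beta''_0 d\,(t'-\beta'_0 d)^2}{\sigma^4+(2\beta''_0 d)^2}\right]\right\}\\
& \quad+\left[\sigma^2-j2\beta''_0 d\right]^{1/2}\exp\left\{j\left[-\beta_0 d+\omega_0 t'+\frac{2\beta''_0 d\,(t'-\beta'_0 d)^2}{\sigma^4+(2\beta''_0 d)^2}\right]\right\}\Bigg\}
\end{aligned} \tag{8.34}$$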

where $t' = t - \tau$. Eq. (8.34) differs from (8.33): the exponential with
real argument now depends on $\beta''_0 d$, which introduces distortion. From
Eq. (8.34) it is evident that the instant when the amplitude of the
propagating signal reaches its maximum is ($\tau + \beta'_0 d$), as in the
non-dispersive case. The dispersive nature of the DNG slab alters neither the
propagation velocity of the pulse centre, which is still $v_g$, nor the pulse
shape, which remains Gaussian.
It does, however, affect the pulse width. In order to discuss this,
referring to Fig. 8-7, we define the pulse width of the propagating signal as
$\Delta t = (t_2 - t_1)$, where $t_1$, $t_2$ are the instants when the
amplitude of the electric field is equal to 5% of its maximum value:

$$\Delta t = (t_2 - t_1), \qquad E(t_1) = E(t_2) = 0.05\,E(t)_{MAX} \tag{8.35}$$

Now, according to Eqs. (8.34) and (8.27), it can be easily derived
that:

$$\Delta t_{(non\text{-}dispersive)} = 2\,\sigma\,[\ln(20)]^{1/2} \tag{8.36}$$

$$\Delta t_{(dispersive)} = 2\left\{\ln(20)\left[\sigma^2 + \left(\frac{2\beta''_0 d}{\sigma}\right)^2\right]\right\}^{1/2} \tag{8.37}$$


In other words, in the dispersive case the pulse width is a function of d and
tends asymptotically to be proportional to the distance covered in the slab;
indeed, for $(2\beta''_0 d) \gg \sigma^2$, $\Delta t \approx [\ln(20)]^{1/2}\,(4\beta''_0 d/\sigma)$.
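
The dependence of the pulse width on the distance can be explored numerically. The sketch below is only an illustration of Eqs. (8.36) and (8.37): the function names and the value of the dispersion coefficient `beta2` are hypothetical, and `beta2` is meant as the second derivative of the phase constant expressed in s²/m.

```python
import math

def pulse_width(sigma, beta2, d):
    """5%-amplitude width of the Gaussian envelope after a distance d.

    sigma : pulse-width parameter of the input envelope [s]
    beta2 : second derivative of the phase constant at the carrier [s^2/m]
    d     : distance covered in the slab [m]
    """
    spread = 2.0 * beta2 * d / sigma
    return 2.0 * math.sqrt(math.log(20.0) * (sigma ** 2 + spread ** 2))

def pulse_width_asymptotic(sigma, beta2, d):
    """Large-distance limit, meaningful when |2 * beta2 * d| >> sigma**2."""
    return math.sqrt(math.log(20.0)) * 4.0 * abs(beta2) * d / sigma

sigma = 1e-9       # assumed input parameter: 1 ns
beta2 = -2.0e-19   # hypothetical dispersion coefficient [s^2/m]
for d in (0.0, 1.0, 10.0, 100.0):
    print(d, pulse_width(sigma, beta2, d), pulse_width_asymptotic(sigma, beta2, d))
```

At d = 0 the first function returns the non-dispersive width of Eq. (8.36); for large d the two functions converge.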
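Now, let us suppose that the slab is made of a DNG medium. Eq. (8.32)
turns into:

$$\begin{aligned}\beta(f) &= \frac{2\pi f}{c}\left(1-\frac{f_p^2}{f^2}\right)\\ &= \frac{2\pi f_0}{c}\left(1-\frac{f_p^2}{f_0^2}\right) + \frac{2\pi}{c}\left(1+\frac{f_p^2}{f_0^2}\right)(f-f_0) - \frac{2\pi}{c}\,\frac{f_p^2}{f_0^3}\,(f-f_0)^2 + \dots\end{aligned} \tag{8.38}$$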

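So, for a DNG slab, Eq. (8.37) becomes:

$$\Delta t_{(DNG)} = 2\left\{\ln(20)\left[\sigma^2 + \frac{(4\,d\,f_p^2)^2}{(2\pi c f_0^3\,\sigma)^2}\right]\right\}^{1/2}$$

$$\lim_{\frac{c\,\sigma^2 f_0^3}{4\,d\,f_p^2}\to 0}\left\{\Delta t_{(DNG)}\right\} = (\ln(20))^{1/2}\,\frac{8\,d\,f_p^2}{2\pi c f_0^3\,\sigma} \tag{8.39}$$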

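It is useful to observe that, when the DNG medium is matched with
free space at $\omega_0$, Eq. (8.39) becomes:

$$\Delta t_{(DNG,\ f_p = 2^{1/2} f_0)} = 2\left\{\ln(20)\left[\sigma^2 + \frac{4\,(4d)^2}{(2\pi c f_0\,\sigma)^2}\right]\right\}^{1/2}$$

$$\lim_{\frac{c\,\sigma^2 f_0}{8d}\to 0}\left\{\Delta t_{(DNG,\ f_p = 2^{1/2} f_0)}\right\} = (\ln(20))^{1/2}\,\frac{16\,d}{2\pi c f_0\,\sigma} \tag{8.40}$$
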

7. NUMERICAL METHODS FOR METAMATERIALS

The interesting properties of devices based on metamaterials can be
rigorously investigated through an appropriate numerical tool.
Among the many numerical methods for electromagnetic problems, FDTD is
definitely one of the possible candidates.
This method, though notoriously demanding in CPU time and
memory, is extremely flexible and well suited
to complex structures. The use of parallel computing in
conjunction with efficient meshing strategies makes it possible to handle
large simulation domains in reasonable times, whilst the implementation of an
appropriate dispersive model makes the algorithm suitable for the analysis of
DNG slabs or, more generally, for the study of metamaterial-based devices.
In the following, the basic structure of the FDTD method is illustrated
both for standard computers and for parallel and grid computing, with
highlights on its use for metamaterial simulations.

7.1 Bases for the FDTD Method

The Finite Difference Time Domain algorithm, first proposed by Yee
in 1966 [Yee, 1966], is frequently adopted for the solution of a wide class of
electromagnetic problems, and a large bibliography on this topic has been
produced in recent years. Therefore, the main purpose of this section
is to give the reader only the basic concepts of the method, without
treating every aspect in exhaustive detail.
For that purpose we refer the reader to more specific literature [Yee 1966,
Taflove 2000, Sullivan 2000].
In order to describe how the FDTD algorithm works, we consider a
classical and well-known electromagnetic problem and describe step by step
how we solve it through finite-difference simulations. The test case is a
half-wavelength dipole in a given volume. The problem is
illustrated in Fig. 8-8, where an 890 MHz half-wavelength dipole, whose
length is $L_d = 0.168$ m, is placed at the center of a cubical volume
whose dimensions are $L_x \times L_y \times L_z = 0.28 \times 0.28 \times 0.28$ m³; a sinusoidal
excitation is applied to the dipole feed, whilst the time behavior of the EM
field at a generic point of the volume is the desired solution.
In such a context, we can discretize the volume into
elementary cells, identified by integer indexes (i, j, k), and associate to
them the corresponding values of permittivity, conductivity and permeability,
namely $\langle \varepsilon_{i,j,k}, \sigma_{i,j,k}, \mu_{i,j,k} \rangle$. If $\Delta x$, $\Delta y$ and $\Delta z$ are the cell-edge
dimensions and if we consider cubical cells with, for example,
$\Delta x = \Delta y = \Delta z = 0.008$ m, then $N_x \times N_y \times N_z = 35 \times 35 \times 35 = 42875$ cells
will be needed in order to discretize the entire volume, and
$N_d = L_d / \Delta z = 21$ cells should be used for the dipole modeling. Every cell is
a point where the sought EM field will be computed at discretized time
steps $\Delta t$, as sketched in Fig. 8-9.
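
The cell counts quoted above follow from simple arithmetic, which the short Python sketch below reproduces. The helper name `cells_along` is an assumption, and rounding is used only to absorb floating-point error, since the step is taken to divide each length exactly.

```python
def cells_along(length, step):
    """Number of cells along one axis, assuming `step` divides `length`."""
    return round(length / step)

L_side = 0.28     # edge of the cubical volume [m]
L_dipole = 0.168  # dipole length [m]
delta = 0.008     # cell edge [m]

n_axis = cells_along(L_side, delta)      # cells per axis
n_total = n_axis ** 3                    # cells in the whole volume
n_dipole = cells_along(L_dipole, delta)  # cells used to model the dipole
print(n_axis, n_total, n_dipole)
```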

Figure 8-8. Test-case: field emitted by a half-wavelength dipole in every point in a chosen
volume.

Now, two basic questions can be posed:
Could we discretize Maxwell's equations over this simulation domain?
If so, could we solve a generic electromagnetic problem by discretizing it
both in space and time and by characterizing the material dielectric
properties of each computational point?
In order to answer these questions, we can start by
considering the well-known time-dependent Maxwell's curl equations, reported
here:

$$\nabla\times\tilde{\mathbf{E}} = -\mu\,\frac{\partial\tilde{\mathbf{H}}}{\partial t} \tag{8.41}$$

$$\nabla\times\tilde{\mathbf{H}} = \sigma\tilde{\mathbf{E}} + \varepsilon\,\frac{\partial\tilde{\mathbf{E}}}{\partial t} \tag{8.42}$$
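which give the following six scalar differential equations:

$$\frac{\partial H_x}{\partial t} = \frac{1}{\mu}\left(\frac{\partial E_y}{\partial z} - \frac{\partial E_z}{\partial y}\right) \tag{8.43}$$

$$\frac{\partial H_y}{\partial t} = \frac{1}{\mu}\left(\frac{\partial E_z}{\partial x} - \frac{\partial E_x}{\partial z}\right) \tag{8.44}$$

$$\frac{\partial H_z}{\partial t} = \frac{1}{\mu}\left(\frac{\partial E_x}{\partial y} - \frac{\partial E_y}{\partial x}\right) \tag{8.45}$$

$$\frac{\partial E_x}{\partial t} = \frac{1}{\varepsilon}\left(\frac{\partial H_z}{\partial y} - \frac{\partial H_y}{\partial z} - \sigma E_x\right) \tag{8.46}$$

$$\frac{\partial E_y}{\partial t} = \frac{1}{\varepsilon}\left(\frac{\partial H_x}{\partial z} - \frac{\partial H_z}{\partial x} - \sigma E_y\right) \tag{8.47}$$

$$\frac{\partial E_z}{\partial t} = \frac{1}{\varepsilon}\left(\frac{\partial H_y}{\partial x} - \frac{\partial H_x}{\partial y} - \sigma E_z\right) \tag{8.48}$$

Figure 8-9. Discretized simulation domain for the test-case of a half-wavelength dipole.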

Figure 8-10. Position of the field components in the generic Yee's cell.

Now, naming $F|^n_{i,j,k}$ the generic E or H component at the time $n\Delta t$ at
the grid point of coordinates $(i\Delta x, j\Delta y, k\Delta z)$, the space and time
derivatives can be expressed, respectively, with the following centered finite
difference equations:

$$\frac{\partial F|^n_{i,j,k}}{\partial x} = \frac{F|^n_{i,j,k} - F|^n_{i-1,j,k}}{\Delta x} + O(\Delta x^2) \tag{8.49}$$

$$\frac{\partial F|^n_{i,j,k}}{\partial t} = \frac{F|^{n+1/2}_{i,j,k} - F|^{n-1/2}_{i,j,k}}{\Delta t} + O(\Delta t^2) \tag{8.50}$$

where the term "centered", and consequently the second-order
accuracy of the approximation, becomes evident only by observing the actual
placement of the field components in the elementary cell, named Yee's cell
(see Fig. 8-10). We refer the reader to [Yee, 1966] for more details.
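
The second-order accuracy of a centered difference can be checked numerically. The following Python sketch is an illustration only (the test function, the evaluation point and the step values are arbitrary assumptions): the two samples sit half a step on either side of the evaluation point, which is the situation created by the staggered placement of the components in Yee's cell.

```python
import math

def centered_derivative(f, x, h):
    """Centered difference: samples sit half a step on either side of x."""
    return (f(x + h / 2.0) - f(x - h / 2.0)) / h

def error(h, x=0.3):
    """Absolute error when differentiating sin(x), whose derivative is cos(x)."""
    return abs(centered_derivative(math.sin, x, h) - math.cos(x))

# Halving the step should divide the error by about four (second order).
for h in (0.1, 0.05, 0.025):
    print(f"h = {h:<6} error = {error(h):.3e}")
```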
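By using Eqs. (8.49) and (8.50) in Eqs. (8.43) and (8.46), the following
explicit equations are obtained:

$$E_x\big|^{n+1}_{i,j,k} = \frac{1-\dfrac{\sigma_{i,j,k}\,\Delta t}{2\,\varepsilon_{i,j,k}}}{1+\dfrac{\sigma_{i,j,k}\,\Delta t}{2\,\varepsilon_{i,j,k}}}\;E_x\big|^{n}_{i,j,k} + \frac{\dfrac{\Delta t}{\varepsilon_{i,j,k}}}{1+\dfrac{\sigma_{i,j,k}\,\Delta t}{2\,\varepsilon_{i,j,k}}}\left[\frac{H_z\big|^{n+1/2}_{i,j,k}-H_z\big|^{n+1/2}_{i,j-1,k}}{\Delta y} - \frac{H_y\big|^{n+1/2}_{i,j,k}-H_y\big|^{n+1/2}_{i,j,k-1}}{\Delta z}\right] \tag{8.51}$$

$$H_x\big|^{n+1/2}_{i,j,k} = H_x\big|^{n-1/2}_{i,j,k} + \frac{\Delta t}{\mu_{i,j,k}}\left[\frac{E_y\big|^{n}_{i,j,k+1}-E_y\big|^{n}_{i,j,k}}{\Delta z} - \frac{E_z\big|^{n}_{i,j+1,k}-E_z\big|^{n}_{i,j,k}}{\Delta y}\right] \tag{8.52}$$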

Two main interesting aspects can be highlighted by observing Eqs. (8.51)
and (8.52):
1) The electric field evaluated at a certain time step at a generic
computational point (i, j, k) depends only on electric and magnetic fields
at the previous time step. Also, the dependence on the electric field is
confined to the value in the same cell. The dependence on magnetic
fields, instead, is limited to values in adjacent cells.
2) Dually, the magnetic field evaluated at a certain time step at a generic
computational point (i, j, k) depends only on electric and magnetic fields
at the previous time step, with a dependence on the magnetic field confined
to the value in the same cell and a dependence on electric fields
confined to values in adjacent cells.
What does this mean from a computational point of view?
It means that, once the simulation domain has been characterized at
every point by setting the appropriate values of $\langle \varepsilon_{i,j,k}, \sigma_{i,j,k}, \mu_{i,j,k} \rangle$
everywhere, and knowing the EM fields at a certain time (starting
condition), the new excitation value can be imposed on the dipole feed,
the magnetic field can be updated by using Eq. (8.52) and the obtained
values can be used in Eq. (8.51) to compute the new E field values. The time
step is then increased and, after the E field at the feed point has been
updated, such values are used in Eq. (8.52) to evaluate the H fields, thus
resulting in the classical and well-known leap-frog scheme of the FDTD
algorithm.
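
The leap-frog scheme can be illustrated with a deliberately reduced example. The sketch below is not the three-dimensional code discussed in this chapter: it assumes a one-dimensional, lossless, free-space grid in normalized units (so that the update coefficient is the Courant number), a soft Gaussian source and perfectly conducting ends; all names and parameter values are hypothetical.

```python
import math

def run_fdtd_1d(n_cells=201, n_steps=100, source_cell=100, courant=0.5):
    """1D free-space leap-frog update in normalized units.

    ez[i] lives on integer grid points, hy[i] half a cell to the right.
    The two grid ends are left at ez = 0 (perfect electric conductors).
    """
    ez = [0.0] * n_cells
    hy = [0.0] * n_cells
    for n in range(n_steps):
        # H at time n + 1/2 from the E values at time n
        for i in range(n_cells - 1):
            hy[i] += courant * (ez[i + 1] - ez[i])
        # E at time n + 1 from the H values at time n + 1/2
        for i in range(1, n_cells - 1):
            ez[i] += courant * (hy[i] - hy[i - 1])
        # soft source: a Gaussian pulse added at the feed cell
        ez[source_cell] += math.exp(-(((n - 30) / 10.0) ** 2))
    return ez

ez = run_fdtd_1d()
print("max |Ez| =", max(abs(v) for v in ez))
```

Each field is updated only from its own previous value and from the neighbouring values of the other field, which is what makes the scheme explicit.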
Actually, the complete flow chart of an FDTD algorithm should include
a set of operations not yet mentioned: the space and time steps, for instance,
have to be carefully chosen in order to obtain, respectively, a certain
degree of accuracy and numerical stability; also, when unbounded
domains have to be modelled, the numerical illusion of infinitely extended
media must be created, and the implementation of adequate absorbing
boundary conditions (ABC) on the edge of the simulation domain is
mandatory. Finally, it is sometimes useful to deal with plane-wave sources
instead of hard sources; this is feasible by slightly changing the algorithm
structure and implementing the so-called Total Field/Scattered Field scheme
(TFSF); in such a formulation, the simulation domain is divided into two
different regions and a plane wave is generated at their interface. In the inner
region the total field is computed, whilst in the outer region only scattered
field contributions are considered. We refer the reader to the already
suggested bibliography for details.
In Fig. 8-11 the radiation pattern obtained by simulating the
half-wavelength test case through the discussed FDTD scheme is reported and
compared with the ideal analytical solution. Even though the examined problem
requires the modelling of a very simple source, it is evident that
domains containing far more complex geometries can be studied. As an
example of the power and flexibility of FDTD algorithms, Fig. 8-11 also
shows the E field behaviour obtained by simulating the difficult problem
of human exposure to the field emitted by an EM source: in such a case,
a heterogeneous numerical human-body model has been placed in the
near-field region of a real GSM radio base station antenna, showing the
suitability of finite-difference codes for complex
environments.
Unfortunately, a price in terms of computational resources must be paid.
Large simulation domains, which are mandatory in many realistic problems,
and small space discretization steps, useful when complex geometries are
modeled, increase the number of computational points and,
consequently, sharply raise the memory requirements. Also, at each time step,
the field components in every cell have to be updated; furthermore, since the
simulations are performed in the time domain, the steady state has to be
reached, which results in a large number of time steps and long
simulation times; hence, high CPU power is required.
To cope with such a huge computational effort, two different approaches are
often chosen. The first consists in the use of parallel computing: the
FDTD algorithm is naturally parallelizable through a domain
decomposition scheme, and high performance can be achieved. The second
consists in the reduction of the number of grid points: the use of a
uniform mesh can result in an unaffordable waste of
computational resources, and subgridding techniques are consequently often
implemented.
In the following we show in detail how the FDTD scheme can be
easily parallelized and enabled to run on a grid computing environment, and
then how the two approaches, parallel and non-uniform meshing, can be
adopted simultaneously, giving rise to a parallel variable-mesh FDTD,
which retains many of the advantages of the classical subgridding schemes
without affecting the performance of the parallel algorithm.

7.2 Parallel Grid-Enabled FDTD using MPI

The need to reduce computational times, improve accuracy, and
limit computing costs, even with the adoption of low-cost
high-performance platforms, is satisfied by adopting parallel algorithms,
implementing them with MPI libraries, and making the whole tool amenable
to migration towards Grid Computing (GC). The result is a Parallel-Grid-
Enabled (PGE) FDTD algorithm. For the sake of brevity, we do not give
details in this chapter about the extension of
the proposed FDTD to a grid-computing environment. Suffice it to say here
that it is straightforward, and is described in detail in
[Tarricone and Esposito, 2004].
In this section a sketch of the parallel implementation of the FDTD
algorithm is given. On a machine with n processors, the whole computation
domain is divided into n sub-domains of equal volume and shape. More
specifically, the domain is divided along the x dimension, mapping onto each
processor a sub-region of size $(N_x/n) \times N_y \times N_z$; for the sake of simplicity we
suppose $N_x$ to be an integer multiple of n. The EM field components are
updated in each processor at the same instant through Eqs. (8.51) and (8.52)
and their companions (for the remaining scalar $E_{y,z}$ and $H_{y,z}$ components).
When the computation updates a field component on the border of a
sub-domain, some values belonging to the border of the adjacent sub-domain are
required: in order to avoid communications during computation, each
sub-domain is surrounded by the border cells of the neighbouring sub-domains, as
depicted in Fig. 8-12. These border values are communicated after the
updating phase.
The scheme of the parallel algorithm, implemented on n processors for a
simulation domain of size $[L_x \times L_y \times L_z]$ m³, is the following:

FDTD parallel algorithm

begin FDTD algorithm
  Choose a spatial discretization of the domain (Δx, Δy, Δz);
  /* if the domain has dimensions (Lx x Ly x Lz), the grid has Nx x Ny x Nz
     points, with Nx,y,z = Lx,y,z / Δx,y,z. */
  Determine the time step Δt by using stability conditions;
  Partition the whole domain D = [Nx x Ny x Nz] into n sub-domains
  Di = [N'x x Ny x Nz] (i = 1, 2, ..., n), where N'x = Nx / n is the number of
  grid points along the x direction in each local domain Di.
  for (t = 0; t < Tend; t = t + Δt)
    do in all the processors
      compute the new values of H
      communicate the H values on the boundary of each
      subdomain to its neighbor
      compute the new values of E
    enddo in all processors
    put the correct value in the feed point in the processors
    containing the source;
    do in all the processors
      compute the absorbing boundary conditions for the portion of
      the border surfaces mapped within each processor;
      communicate the E values on the boundary of each subdomain
      to its neighbor;
    enddo in all processors
  endfor
end FDTD algorithm.
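
The border-sharing idea of the scheme above can be tried out without any parallel hardware. The Python sketch below is an assumption-laden illustration, not the MPI code of this chapter: it uses a one-dimensional grid, runs the "processors" one after the other in ordinary loops, and replaces each MPI message with a plain copy into a ghost cell. It then checks that the decomposed update gives the same field as the serial one.

```python
import math

def serial_steps(ez, hy, steps, c=0.5):
    """Reference leap-frog update on the whole 1D grid (ends held at zero)."""
    ez, hy = ez[:], hy[:]
    n = len(ez)
    for _ in range(steps):
        for i in range(n - 1):
            hy[i] += c * (ez[i + 1] - ez[i])
        for i in range(1, n - 1):
            ez[i] += c * (hy[i] - hy[i - 1])
    return ez

def exchange_e(subs):
    """Right ghost of each slab <- first owned E value of its right neighbour."""
    for p in range(len(subs) - 1):
        subs[p][0][-1] = subs[p + 1][0][1]

def exchange_h(subs):
    """Left ghost of each slab <- last owned H value of its left neighbour."""
    for p in range(1, len(subs)):
        subs[p][1][0] = subs[p - 1][1][-2]

def decomposed_steps(ez, hy, steps, n_proc, c=0.5):
    """Same update with the grid split into n_proc slabs plus ghost cells."""
    size = len(ez) // n_proc  # assumes len(ez) is a multiple of n_proc
    subs = [([0.0] + ez[p * size:(p + 1) * size] + [0.0],
             [0.0] + hy[p * size:(p + 1) * size] + [0.0])
            for p in range(n_proc)]
    exchange_e(subs)
    last = n_proc - 1
    for _ in range(steps):
        for p, (e, h) in enumerate(subs):            # "do in all the processors"
            stop = size if p == last else size + 1   # last global H is never updated
            for i in range(1, stop):
                h[i] += c * (e[i + 1] - e[i])
        exchange_h(subs)
        for p, (e, h) in enumerate(subs):
            start = 2 if p == 0 else 1               # first global E stays at zero
            stop = size if p == last else size + 1   # last global E stays at zero
            for i in range(start, stop):
                e[i] += c * (h[i] - h[i - 1])
        exchange_e(subs)
    return [v for e, _ in subs for v in e[1:-1]]

n = 120
ez0 = [math.exp(-((i - 60) / 6.0) ** 2) for i in range(n)]
ez0[0] = ez0[-1] = 0.0
hy0 = [0.0] * n
ref = serial_steps(ez0, hy0, 80)
par = decomposed_steps(ez0, hy0, 80, n_proc=4)
print("max difference:", max(abs(a - b) for a, b in zip(ref, par)))
```

Only one layer of border values crosses each interface per half step, so the amount of exchanged data does not depend on the slab thickness.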

The proposed parallel algorithm, implemented with MPI, is easily
migrated towards GC environments, offering the possibility of
achieving very high computing performance at very low cost, and without
additional effort with respect to that already required to implement the
parallel algorithm.

7.3 Efficient Subgridding Technique for Parallel FDTD Algorithms: Variable Mesh FDTD

When uniform FDTD meshing is used, characterizing the simulated
objects with adequate spatial accuracy forces the adoption of a very fine
resolution over the entire simulation domain. Nevertheless, a larger spatial
discretization step could be applied wherever spatial accuracy is not needed.
This observation is the basis of variable-mesh FDTD algorithms
which, by allowing different discretization steps in different
regions, guarantee the same level of accuracy for the solution while using
less memory and computational power.

Figure 8-11. Vertical-plane radiation pattern for the half-wavelength test-case compared with
its analytical solution (left) and FDTD solution of the complex human-antenna interaction
problem (right).

Unfortunately, when traditional subgridding techniques are implemented,
which consist in the use of a fine mesh only in selected areas, the structure
of the standard FDTD algorithm is strongly modified, due to the need for an
interpolation/extrapolation process at the fine-mesh/coarse-mesh interfaces.
Consequently, the parallelization of the algorithm becomes non-trivial and the
expected performance is not encouraging.
By renouncing some of the advantages of the traditional subgridding
schemes, though, a good compromise among memory saving,
implementation simplicity (also on parallel platforms) and performance can
be reached by the proposed variable-mesh (VM) technique. In the Parallel
VM-FDTD the discretization is performed so that each cell has only one
adjacent cell for each of its six faces, as in the standard FDTD
method; furthermore, along each direction the space step can be arbitrarily
varied, within the limits imposed by the stability criterion, allowing very
smooth transitions between a fine and a coarse mesh region. In exchange for
these advantages, a price in terms of memory has to be paid: in
order to guarantee such properties, in fact, the fine-mesh regions cannot be
strictly limited to the region of interest, but must be extended up to the
limit of the simulation domain. This is evident in Fig. 8-13, where a
schematic representation of a possible mesh for the half-wavelength problem
previously discussed is reported and where the 8 × 8 × 8 mm³ discretization
step is confined to the source region, whilst a coarser step is chosen
elsewhere.

Figure 8-12. Domain border sharing among different processors.

In order to quantify the cost of the simulation, in terms of used domain
cells, let us refer to Fig. 8-14 and consider the simple case of a cubic
domain of edge L, with an object inscribed in a cube of edge l.
The object should be discretized with a step $\delta$, while the whole domain can
be discretized with a step $\Delta$. The simulation, if carried out at the smallest
discretization step, would require a number of grid cells

$$N_{cells}(\delta) = \left(\frac{L}{\delta}\right)^3 \tag{8.53}$$

Neglecting the possible transition layer between regions with different
discretization steps, conventional subgridding techniques would require a
number of grid cells

$$N_{cells}(\Delta,\delta) = \left(\frac{L}{\Delta}\right)^3 - \left(\frac{l}{\Delta}\right)^3 + \left(\frac{l}{\delta}\right)^3 \tag{8.54}$$

Referring to Fig. 8-14, it is easy to verify that the VM-FDTD method
would require a number of grid cells

$$N_{cells}(\text{VM-FDTD}) = \left(\frac{L-l}{\Delta}\right)^3 + \left(\frac{l}{\delta}\right)^3 + 3\,\frac{(L-l)\,l^2}{\Delta\,\delta^2} + 3\,\frac{(L-l)^2\,l}{\Delta^2\,\delta} \tag{8.55}$$


For the sake of simplicity, in the previous expressions we assumed $\Delta$ to be
an exact divisor of L, l and (L-l), and $\delta$ to be an exact divisor of L and l.
As we have seen, both the computational and the memory requirements of the
FDTD method increase, to a first-order approximation, linearly
with the number of grid cells. Therefore we define the Performance Gain
(PG) of the variable-mesh scheme VM with respect to the standard uniform
fine meshing FM as

$$PG(VM) = \frac{N_{cells}(FM) - N_{cells}(VM)}{N_{cells}(FM)} \tag{8.56}$$

In a similar way, the Performance Gain between two different variable
meshing schemes, $VM_1$ and $VM_2$, is given by

$$PG(VM_1, VM_2) = \frac{PG(VM_1) - PG(VM_2)}{PG(VM_1)} \tag{8.57}$$

Figure 8-13. A possible mesh, in a simplified 2D view, for the test-case of the half-
wavelength dipole. Each cell is surrounded by only six other cells, as in traditional uniform-
meshing cases.

For the case described above of a small cube (linear size l, discretized
with step $\delta$) inscribed in a larger cube (linear size L, discretized with
step $\Delta$), the performance gain of conventional subgridding techniques (CS)
with respect to the uniform meshing with step $\delta$ is

$$PG(CS) = \frac{(\Delta^3-\delta^3)\,(L^3-l^3)}{\Delta^3\,L^3} \tag{8.58}$$

For the same case, the performance gain of the VM-FDTD method with
respect to the uniform meshing with step $\delta$ is

$$PG(\text{VM-FDTD}) = \frac{(L/\delta)^3 - N_{cells}(\text{VM-FDTD})}{(L/\delta)^3} \tag{8.59}$$

For instance, if we fix L = 2 m, l = 0.1 m, $\Delta$ = 0.01 m and $\delta$ = 0.002 m, the
performance gain of conventional subgridding is PG(CS) = 0.992, while
PG(VM-FDTD) = 0.986; as a consequence, the conventional subgridding
has a performance gain with respect to the VM-FDTD method of
PG(CS, VM-FDTD) = 0.006. As we see, at the cost of a small decrease in
performance (~0.6%), the VM-FDTD method allows an easier, and more efficient,
parallel implementation and shows a better numerical behavior, avoiding
interpolation/extrapolation procedures and allowing smooth transitions
between regions with different discretization steps.
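
The figures quoted in this example can be reproduced with a few lines of Python. The sketch below is illustrative: the function names are assumptions, the steps are taken to divide the lengths exactly (rounding only absorbs floating-point error), and the VM-FDTD count uses the fact that the four terms of Eq. (8.55) are the expansion of the cube of the number of cells per axis.

```python
def cells_uniform(L, step):
    """Eq. (8.53): uniform mesh over a cube of edge L."""
    return round(L / step) ** 3

def cells_conventional(L, l, coarse, fine):
    """Eq. (8.54): coarse mesh everywhere except a fine-meshed cube of edge l."""
    return round(L / coarse) ** 3 - round(l / coarse) ** 3 + round(l / fine) ** 3

def cells_variable_mesh(L, l, coarse, fine):
    """Eq. (8.55): the fine steps extend to the domain boundary along each axis."""
    per_axis = round((L - l) / coarse) + round(l / fine)
    return per_axis ** 3

def performance_gain(n_ref, n_new):
    """Eq. (8.56): fraction of cells saved with respect to the reference mesh."""
    return (n_ref - n_new) / n_ref

L_dom, l_obj, coarse, fine = 2.0, 0.1, 0.01, 0.002
n_fm = cells_uniform(L_dom, fine)
n_cs = cells_conventional(L_dom, l_obj, coarse, fine)
n_vm = cells_variable_mesh(L_dom, l_obj, coarse, fine)
pg_cs = performance_gain(n_fm, n_cs)
pg_vm = performance_gain(n_fm, n_vm)
pg_rel = (pg_cs - pg_vm) / pg_cs  # Eq. (8.57)
print(n_fm, n_cs, n_vm)
print(round(pg_cs, 3), round(pg_vm, 3), round(pg_rel, 3))
```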
These properties allow an efficient parallelization scheme, algorithmically
equivalent to the already described code, without increasing data
communication. Consequently, the parallel implementation is based on the
same principles as the parallel uniform FDTD, i.e. it adopts the same
partition of the simulation domain, based on a balancing of the computation
among the different processors, and implements the same communication
pattern between adjacent processors.
In order to take into account the different cell sizes, three auxiliary vectors
can be used, namely $D_x[N_x]$, $D_y[N_y]$ and $D_z[N_z]$, the generic element
$D_\xi[p]$ ($\xi = x, y, z$) being the dimension of the p-th cell along the $\xi$-axis, which does not
depend on the values assumed by the other coordinates.
Now, because of the component location in Yee's cell (Fig. 8-10), such
values can be directly used wherever E-field space derivatives in Eq. (8.52)
are evaluated; instead, when H-field space derivatives in Eq. (8.51) are
considered, the average dimension of two adjacent cells must be used.
Relations (8.51) and (8.52) are consequently changed into:
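
$$E_x\big|^{n+1}_{i,j,k} = \frac{1-\dfrac{\sigma_{i,j,k}\,\Delta t}{2\,\varepsilon_{i,j,k}}}{1+\dfrac{\sigma_{i,j,k}\,\Delta t}{2\,\varepsilon_{i,j,k}}}\;E_x\big|^{n}_{i,j,k} + \frac{\dfrac{\Delta t}{\varepsilon_{i,j,k}}}{1+\dfrac{\sigma_{i,j,k}\,\Delta t}{2\,\varepsilon_{i,j,k}}}\left[\frac{H_z\big|^{n+1/2}_{i,j,k}-H_z\big|^{n+1/2}_{i,j-1,k}}{\bar{D}_y[j]} - \frac{H_y\big|^{n+1/2}_{i,j,k}-H_y\big|^{n+1/2}_{i,j,k-1}}{\bar{D}_z[k]}\right] \tag{8.60}$$

$$H_x\big|^{n+1/2}_{i,j,k} = H_x\big|^{n-1/2}_{i,j,k} + \frac{\Delta t}{\mu_{i,j,k}}\left[\frac{E_y\big|^{n}_{i,j,k+1}-E_y\big|^{n}_{i,j,k}}{D_z[k]} - \frac{E_z\big|^{n}_{i,j+1,k}-E_z\big|^{n}_{i,j,k}}{D_y[j]}\right] \tag{8.61}$$

where

$$\bar{D}_\xi[p]\Big|_{\xi = x,y,z} = \left\{\frac{D_\xi[p] + D_\xi[p+1]}{2}\right\}_{\xi = x,y,z} \tag{8.62}$$

represents the average size of two adjacent cells along the $\xi$-axis. The same kind
of arrangement must be applied to the ABC computation too.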
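
The averaging of Eq. (8.62) is straightforward to implement. In the following Python sketch the function name and the graded list of cell sizes are hypothetical and serve only to show the operation along one axis.

```python
def averaged_steps(d):
    """Mean size of each pair of adjacent cells along one axis, as in Eq. (8.62)."""
    return [(d[p] + d[p + 1]) / 2.0 for p in range(len(d) - 1)]

# Hypothetical graded mesh along one axis: coarse -> fine -> coarse [m]
d_x = [0.016, 0.016, 0.012, 0.008, 0.008, 0.008, 0.012, 0.016]
d_x_avg = averaged_steps(d_x)
print(d_x_avg)
```

The averaged vector has one element fewer than the original one, because the last cell has no following neighbour to be averaged with.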

Figure 8-14. Mesh example for VM-FDTD simulations.

7.4 FDTD Methods and DNG Materials

The intrinsic suitability of FDTD for complex structures, together
with its ability to solve large EM problems through the use of
parallel architectures and non-uniform meshing strategies, becomes
particularly interesting when metamaterials are modelled. Nonetheless, in
such a context, two main aspects must be considered:
1. the trivial approach, which consists in the straightforward imposition of
negative permittivity and permeability values in the FDTD leap-frog
scheme, can cause numerical instability;
2. the dielectric properties of metamaterials are strongly dependent on
frequency.
A viable approach, which makes it possible to exploit the appealing FDTD
features as well as to deal with such complex and dispersive materials, consists
1. in a slight reformulation of the FDTD scheme by including electric and
magnetic current densities, so that frequency-dependent materials can be
taken into account, and
2. in the use of the previously reported lossy Drude model for DNG
modelling.
No more details will be given about this topic. The results reported in the
following section, though, show how the incidence of the EM field on a
metamaterial slab can be rigorously studied through an FDTD tool with all
of the features discussed so far.

7.5 DNG Slabs: Reflection by and Propagation in a DNG Slab

To better understand the behaviour of the DNG medium, results are now
presented for the simple problem of the reflection by and the
propagation in a DNG slab.
The case of a plane wave with angular frequency $\omega_0$ (x-directed,
z-polarized electric field) (Fig. 8-15), impinging normally on the interface of
a low-loss, matched DNG slab with $\omega_{pe} = \omega_{pm} = \omega_p$ and $\Gamma_e = \Gamma_m = \Gamma = 10^8$, has
been studied using the TFSF formulation of the PGE-VM FDTD.
The total field region of the FDTD simulation has been taken equal to
182000 cells, with the dimension of a single cell equal to $\lambda/333 = 0.9 \times 10^{-3}$ m.
The time step, according to Courant's condition, is $2.1 \times 10^{-12}$ s.
In the following, three different kinds of incident plane wave are
considered, namely a single cycle pulse, an m-n-m pulse and a Gaussian pulse.

Single Cycle pulse reflection - As in [Ziolkowski and Heyman, 2001],
we analyze the reflection of a single cycle, broad bandwidth pulse:

$$f(t) = \begin{cases}\sqrt{7}\left(\dfrac{7}{6}\right)^3\left(\dfrac{t-T_p/2}{T_p/2}\right)\left[1-\left(\dfrac{t-T_p/2}{T_p/2}\right)^2\right]^3, & 0 \le t \le T_p\\[2ex] 0, & t > T_p\end{cases} \tag{8.63}$$

The cases of an ENG ($\varepsilon = -\varepsilon_0$, $\mu = \mu_0$), an MNG ($\varepsilon = \varepsilon_0$, $\mu = -\mu_0$) and a DNG
($\varepsilon = -\varepsilon_0$, $\mu = -\mu_0$) slab have been considered; Fig. 8-16 shows the corresponding
results, in agreement with those reported in [Ziolkowski and Heyman, 2001].
The left side of Fig. 8-16 shows the incident electric field
pulse, whereas the right side shows the reflected field; as expected, it is
equal to zero for the DNG slab, whilst it has opposite polarities for incidence
on the two SNG media. In the SNG cases, the slab wave impedance is purely
imaginary, so the transmitted field is equal to zero.

Figure 8-15. Single cycle pulse impinging on a DNG/ENG/MNG slab.

Multiple Cycle m-n-m pulse propagation - To verify the propagation
properties of the DNG medium, we assume as input signal a multiple cycle m-n-m
pulse:

$$f(t) = \begin{cases} g_{on}(t)\sin(\omega_0 t), & 0 \le t < mT_p\\ \sin(\omega_0 t), & mT_p \le t < (m+n)T_p\\ g_{off}(t)\sin(\omega_0 t), & (m+n)T_p \le t < (m+n+m)T_p\\ 0, & t > (m+n+m)T_p\end{cases} \tag{8.64}$$

where

$$g_{on}(t) = 10x_{on}^3 - 15x_{on}^4 + 6x_{on}^5, \qquad g_{off}(t) = 1 - \left[10x_{off}^3 - 15x_{off}^4 + 6x_{off}^5\right]$$

$$x_{on} = 1 - \frac{(mT_p - t)}{mT_p}, \qquad x_{off} = \frac{[t-(m+n)T_p]}{mT_p}, \qquad T_p = \frac{2\pi}{\omega_0} = \frac{1}{f_0}$$
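
The m-n-m excitation of Eq. (8.64) is easy to generate in code. The Python sketch below is an illustration; the function name and its default arguments (a 4-8-4 pulse) are assumptions, and the carrier frequency used in the example is arbitrary.

```python
import math

def mnm_pulse(t, f0, m=4, n=8):
    """m-n-m pulse: smooth turn-on, n steady cycles, smooth turn-off."""
    tp = 1.0 / f0                               # carrier period
    carrier = math.sin(2.0 * math.pi * f0 * t)
    if 0.0 <= t < m * tp:
        x = 1.0 - (m * tp - t) / (m * tp)       # x_on grows from 0 to 1
        return (10 * x**3 - 15 * x**4 + 6 * x**5) * carrier
    if m * tp <= t < (m + n) * tp:
        return carrier
    if (m + n) * tp <= t < (2 * m + n) * tp:
        x = (t - (m + n) * tp) / (m * tp)       # x_off grows from 0 to 1
        return (1.0 - (10 * x**3 - 15 * x**4 + 6 * x**5)) * carrier
    return 0.0

f0 = 1e9  # assumed carrier frequency: 1 GHz
samples = [mnm_pulse(k * 0.05 / f0, f0) for k in range(340)]
print(max(samples), min(samples))
```

The polynomial envelopes equal 0 and 1 at the two ends of the turn-on and turn-off intervals, so the pulse joins the steady cycles without jumps in amplitude.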

The input signal bandwidth, centered at the frequency $f_0$, depends on the
number of cycles (m + n + m); the case considered here is that of a 4-8-4
pulse. The reflected wave, as expected, is equal to zero. Fig. 8-17 shows the
results achieved for the transmitted field. The time behaviour of the $E_z$
component is plotted for two different points inside the slab. A phase delay
is observed at the point spatially preceding the other one, in accordance with
the negative phase velocity inside the slab.

Figure 8-16. The time histories of the electric field Ez measured in front of the total
field/scattered field plane for a negative permeability medium, a negative permittivity
medium, and a DNG medium. As shown in [Ziolkowski and Heyman, 2001], Ez has opposite
polarities for the SNG cases.

Gaussian-pulse propagation - The propagation of a plane wave,
amplitude-modulated by a Gaussian pulse (Eq. 8.27) ($\omega_0 = 2\pi \cdot 10$ GHz,
S = 100 V/m, $\tau$ = 6.65 ns, $\sigma$ = 1 ns), in a low-loss, matched DNG slab
($\omega_{pe} = \omega_{pm} = \omega_p = \sqrt{2}\,\omega_0 = 2\pi\sqrt{2}\,f_0$ and $\Gamma_e = \Gamma_m = \Gamma = 0$) has been studied
using the TFSF formulation of the FDTD tool (Fig. 8-6).

Figure 8-17. FDTD predicted time history of the electric field in the DNG medium
($\omega_p = 2\pi(1/T_p)$) for the 4-8-4 pulse (x-directed, z-polarized electric field) with $T_p = 10^{-9}$ s. The
electric field is plotted at two points inside the slab: Point 1 and Point 2, located 20 cells after
Point 1, along the x direction.

The total field and scattered field regions have been taken equal to
182000 cells and 1000 cells, respectively, with the dimension of a single cell
equal to $\lambda/100 = 3 \times 10^{-4}$ m; the time step, according to Courant's
condition, is equal to 0.95 ps.
Considering Eq. (8.40), with the above reported values for $\omega_0$, $\omega_p$ and $\sigma$,
for {d >> 0.375 m} we obtain:

$$\frac{\Delta t_{(DNG,\ f_p = 2^{1/2} f_0)}}{d} = (\ln(20))^{1/2}\,\frac{16}{2\pi c f_0\,\sigma} = 1.47\ [\text{ns/m}]$$

Fig. 8-18 shows the perfect agreement between the theoretical results and
the numerical data obtained from the FDTD tool. It is evident that for d = 10 m
the pulse width assumes values comparable with its asymptotic value. The
slope of the curve (asymptotically tending to a straight line) representing the
pulse width as a function of d depends on $\omega_p$ (see Eq. (8.39)).
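
The asymptotic slope and the distance threshold quoted above can be checked with a short calculation. The Python sketch below evaluates Eq. (8.40) for the stated carrier frequency (10 GHz) and pulse parameter (1 ns); the function names are assumptions, and the speed of light is rounded to 3 × 10⁸ m/s.

```python
import math

C0 = 3.0e8  # rounded speed of light [m/s]

def width_matched_dng(d, sigma, f0, c=C0):
    """5% pulse width after a distance d in the matched DNG slab, Eq. (8.40)."""
    spread = 8.0 * d / (2.0 * math.pi * c * f0 * sigma)
    return 2.0 * math.sqrt(math.log(20.0) * (sigma ** 2 + spread ** 2))

def asymptotic_slope(sigma, f0, c=C0):
    """Large-distance limit of width / d, in seconds per metre."""
    return math.sqrt(math.log(20.0)) * 16.0 / (2.0 * math.pi * c * f0 * sigma)

sigma, f0 = 1e-9, 10e9
slope = asymptotic_slope(sigma, f0)
d_ref = C0 * sigma ** 2 * f0 / 8.0   # distance scale of the limit condition
print(f"slope = {slope * 1e9:.2f} ns/m, d_ref = {d_ref:.3f} m")
for d in (1.0, 10.0, 50.0):
    print(d, width_matched_dng(d, sigma, f0) / d * 1e9, "ns/m")
```

At d = 10 m the ratio between the pulse width and the distance is already within a few percent of the asymptotic slope.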
We have also simulated the propagation of a broadband signal ($\sigma$ = 0.1 ns,
$\Delta\omega \approx \omega_0$). In this case, the theoretical solution given in Section 6 is not
accurate enough, since higher-order terms in Eq. (8.32) should be taken into
account.
Fig. 8-19 shows the corresponding results. The input signal propagating through
the slab experiences a strong distortion, which also affects the pulse shape. The
output signal cannot be considered Gaussian. Fortunately, signals generally
used in common communication systems satisfy the condition $\Delta\omega \ll \omega_0$
assumed in the previous section.

Figure 8-18. Theoretical and FDTD results for 1) the pulse width and 2) the ratio between the
pulse width and the distance covered in the slab.

Figure 8-19. Simulated results for a pulse-width parameter $\tau = 0.1$ ns.



References

Allen, C. A., et al., 2004, Leaky wave in a metamaterial-based two-dimensional structure
for a conical beam antenna application, IEEE MTT Int. Symp., 305-308.
Alù, A., et al., 2004, Metamaterial bilayers for enhancement of wave transmission
through a small hole in a flat perfectly conducting screen, IEEE Antennas and
Prop. Soc. Symp., 3: 3163-3166.
Antoniades, M., and Eleftheriades, G. V., 2003, Compact, linear, lead/lag metamaterial
phase shifters for broadband applications, IEEE Anten. and Wireless Prop. Lett., 2:
103-106.
Caloz, C., and Itoh, T., 2002, Application of the Transmission Line Theory of Left-
Handed Materials to the realisation of microstrip LH Line, Proc. IEEE-AP-S, 2:
412-415.
Caloz, C., 2002, LH Transmission Lines and equivalent metamaterials for microwave
and millimeter-wave, Proc. EuMc 2002: 323-326.
Caloz, C., et al., 2003, A broadband left-handed (LH) coupled-line backward coupler
with arbitrary coupling level, IEEE MTT Int. Symp.,1: 317-320.
Caloz, C., et al., 2004, A novel composite right-/left-handed coupled-line directional
coupler with arbitrary coupling level and broad bandwidth, IEEE Trans. on Microw.
Theory and Techn., 52(3): 980-992.
Eleftheriades, G. V., et al., 2002, Planar Negative Index media using periodically L-C
loaded Transmission Lines, IEEE Trans. On Microwave Theory and Tech., 50(12):
2702-2712.
Engheta, N., 2002, An idea for thin subwavelength cavity resonators using metamaterials
with negative permittivity and permeability, IEEE AP-S Lett., 1(1): 10-13.
Grbic, A. and Eleftheriades, G. V., 2002, Experimental verification of backward-wave
radiation from a negative refractive index metamaterial, Journal of Appl. Physics,
92(10): 5930-5935.
Grbic, A., and Eleftheriades, G. V., 2003, Growing evanescent waves in negative-
refractive-index transmission-line media, Appl. Physical Lett., 82(12): 1815-1817.
Hrabar, S., et al., 2005, Waveguide miniaturization using uniaxial negative permeability
metamaterial, IEEE Trans. on Anten. and Prop., 53(1): 110-119.
Iyer, A. K., 2002, Negative Refractive Index metamaterials supporting 2D waves, Proc.
IEEE MTT-S: 1067-1070.
Lin, I., et al., 2003, A Branch-Line Coupler with two arbitrary operating frequencies
using Left-Handed Transmission Lines, IEEE MTT-S Digest: 325-328.
Pendry, J. B., et al., 1999, Magnetism from conductors, and enhanced non-linear
phenomena, IEEE Trans. on Microw. Theory and Techn., 47(11): 2075-84.
Pendry, J. B., 2000, Negative refraction makes a perfect lens, Physical Rev. Lett., 85(18):
3966-3969.
Pendry, J. B., 2004, Negative refraction, Contemporary Physics, 45(3): 191-202.
Ramo, S., et al., 1995, Fields and Waves in Communication Electronics, John Wiley.
Smith, D. R., et al., 2000, Composite medium with simultaneously negative permeability
and Permittivity, Physical Rev. Lett., 84(18): 4184-4187.
Smith, D. R., et al., 2002, Negative refraction of modulated electromagnetic waves,
Appl. Physics Lett., 81(15): 2713-2715.
Smith, D. R., et al., 2003, Limitation on sub-diffraction imaging with a negative
Refractive Index slab, Appl. Physics Lett., 82(10): 1506-1508.
Someda, C. G., 1998, Electromagnetic Waves, Chapman & Hall, London, UK.
Sullivan, D. M., 2000, Electromagnetic Simulations Using the FDTD Method, IEEE
Press.
Taflove, A., and Hagness, S. C., 2000, Computational Electrodynamics, Artech House,
Inc.
Tarricone, L., and Esposito, A., 2004, Grid Computing for Electromagnetics, Artech
House, Boston, MA, pp. 1-266.
Valanju, P. M., et al., 2002, Wave refraction in Negative-Index media: always positive
and very inhomogeneous, Physical Rev. Lett., 88(18): 187-401.
Veselago, V. G., 1968, The electrodynamics of substances with simultaneously negative
values of permittivity and permeability, Soviet Physics USPEKHI, 10(4): 509-514.
Yee, K. S., 1966, Numerical solution of initial boundary value problems involving
Maxwell's equations in isotropic media, IEEE Transactions on Antennas and
Propagation, AP-14, 4: 302-307.
Ziolkowski, R. W., and Heyman, E., 2001, Wave propagation in media having negative
permittivity and permeability, Physical Rev. E, 64(056625): 1-15.
Ziolkowski, R. W., and Kipple, A. D., 2003, Application of Double Negative materials
to increase the power radiated by electrically small antennas, IEEE Trans. on Ant.
and Prop., 51(10): 2626-2640.

Bibliography

Hrabar, S., 2003, Backward-wave meta-materials - a brief review, Intern. Conf. Electrom.
And Communications: 245-250.
Chapter 9
A SOFTWARE TOOL FOR QUASI-OPTICAL
SYSTEMS

N. C. Albertsen (1,2), P. E. Frandsen (1) and S. B. Sørensen (1)

(1) TICRA Engineering Consultants, Læderstræde 34, DK-1201 Copenhagen, Denmark
(2) Informatics and Mathematical Modelling, Technical University of Denmark

Abstract: The chapter describes a software tool, QUAST, for designing and analyzing
quasi-optical systems, beam-waveguides and other complex antenna
configurations. It contains a general discussion of the requirements met in
such a design, the types of components needed and a review of the analysis
methods available. It proceeds to describe the QUAST implementation, giving
an overview of the software implementation with particular emphasis on the
user interface. This includes a frame editor where all components can be laid
out graphically and connected with beams, a wizard to convert the frame data
to a form suitable for analysis, and a wizard to create complex sequences of
calculations with a single click of the mouse. Finally, it discusses how
several frames can be connected to form a 3-dimensional structure.

Key words: Quasi-optical networks; Antennas; User interface; Modeling; Design software.

1. INTRODUCTION

The QUAST program is developed as a sister program to TICRA's well-known
GRASP9, which is the latest version of the software for general
reflector antenna and antenna farm analysis. GRASP9 is used to design
antennas for space applications, ground-based antennas and for radio
telescopes used in astronomy. A number of research groups working with
quasi-optical systems have also found that GRASP9 is a reliable and
accurate tool for the analysis of such systems, though several additional
features are needed for this application. These features have been developed
and they are included in QUAST.
Both programs can be operated from a pre-processor denoted GPAD. An
important new feature in the program is a graphical user interface module in

L. Tarricone and A. Esposito (eds.), Advances in Information Technologies for Electromagnetics, 265-293.
© 2006 Springer. Printed in the Netherlands.

GPAD that we have denoted a frame editor. The frame editor provides an
easy and intuitive approach to the design and analysis of complex multi-
component quasi-optical systems by select-and-put of components from
palettes into 2D design frames. The frame can be regarded as a rectangular
planar region. The user selects components such as feeds, reflectors, lenses,
filters and field sampling grids from the palette and places the component
symbols in the frame. To avoid complex and confusing drawings, symbolic
graphical representations are used to represent the antenna components
rather than a detailed graphical rendering of the actual components. The
beam-propagation from one component to the next can be specified using
simple mouse operations. A rapid analysis of the currently specified system
by Gaussian beam analysis can be performed by the GPAD pre-processor. It
is possible to create several frames and connect them when the network is
not confined to a single plane. The high accuracy modeling and analysis is
performed with the object oriented QUAST program. The frame editor
includes wizards, which create all necessary objects for QUAST. Another
wizard enables the user to specify and activate an accurate physical optics
analysis along the specified beam path.

Figure 9-1. Quasi Optical Network (courtesy of ESA).

The new design system was presented in Albertsen et al. (2003),
where it is argued that the tool has been requested both by physicists
working with quasi-optical networks and by antenna engineers for beam
waveguide design. This reference also includes a thorough description of the
motivation and requirements for the Frame Editor. The present chapter gives a
detailed description of the advanced features of the Frame Editor. These
include the object wizard's generation of objects, the command wizard for
specifying an analysis sequence, and the techniques of frame connection.

2. REQUIREMENTS FOR QUASI-OPTICAL
NETWORK DESIGN

TICRA has developed the new design interface to be used both with the
GRASP9 program for reflector antenna analysis and with the new quasi-
optical analysis tool denoted QUAST. In this section the different options for
the development are discussed, and the requirements that were derived are
summarized. Based on the needs of both the quasi-optical community and
the antenna engineering one, it was decided to develop a new and advanced
user interface that addresses the requirements for significantly improved
design capabilities for both communities.
Current advanced computer-aided design software visualizes the system
geometry during the model input phase. This is well known from CAD systems
in various fields of engineering, science, architecture etc. One way of
handling complex geometries is to use symbolic representations. Here
components with a complex geometry are represented by symbolic figures,
which are the graphical entities manipulated by the user. The software
developer has pre-selected a number of entities for the components so that
they fit the required limitations. The software then performs an automatic
transformation of the symbolic representation into a complete data
representation at the user's request. The symbolic representations are often
easier for the user to handle within the limitations determined by the
developers.

Figure 9-2. The user designs with symbolic representations of components (left) and specifies
beam propagation with mouse clicks. The design software generates the full
data representation from the symbolic representation (right). The double ring illustrates the
Gaussian beam feed.

The parabolic reflector with elliptical rim can be used to illustrate the
handling of a complex component using a symbolic representation. QUAST
is based on parametric modeling. The data needed to specify a parabolic
reflector and to position it in 3D space is quite limited. It is obtained with
one Cartesian coordinate system (9 reals) for positioning relative to a
common coordinate system, focal length and vertex of the parabola (4 reals)
and the center, half axes and rotation of the rim (5 reals). It is, however, in
general far from obvious how the reflector should be positioned when it
is illuminated by a feed and the reflected beam must continue towards
another component such as a planar mirror. This is difficult to handle
interactively in a 3D CAD environment. Assuming, however, that the
components share a common plane of symmetry, it is simple in a 2D
environment using symbolic representation. The user can select and position
component symbols in the planar design area and specify the beam
propagation. Then the design tool can perform the complex calculations
related to the determination of the reflector data (see Fig. 9-2).
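The parameter count for the parabolic reflector (9 + 4 + 5 reals) can be illustrated with a small data model. The sketch below is hypothetical and is not the QUAST object format: the class and field names are invented for illustration, and the split of the coordinate system into an origin plus two axis vectors is an assumption that merely reproduces the count of 9.

```python
from dataclasses import dataclass, fields
from typing import Tuple

Vec3 = Tuple[float, float, float]

@dataclass
class CoordinateSystem:          # 9 reals (assumed: origin + two axis vectors)
    origin: Vec3
    x_axis: Vec3
    y_axis: Vec3

@dataclass
class Paraboloid:                # 4 reals: focal length + vertex
    focal_length: float
    vertex: Vec3

@dataclass
class EllipticalRim:             # 5 reals: center, half axes, rotation
    center: Tuple[float, float]
    half_axes: Tuple[float, float]
    rotation_deg: float

@dataclass
class ParabolicReflector:
    coor_sys: CoordinateSystem
    surface: Paraboloid
    rim: EllipticalRim

def count_reals(obj) -> int:
    """Count the scalar parameters stored in a (nested) dataclass."""
    if isinstance(obj, (int, float)):
        return 1
    if isinstance(obj, tuple):
        return sum(count_reals(v) for v in obj)
    return sum(count_reals(getattr(obj, f.name)) for f in fields(obj))

reflector = ParabolicReflector(
    coor_sys=CoordinateSystem((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)),
    surface=Paraboloid(focal_length=0.5, vertex=(0.0, 0.0, 0.0)),
    rim=EllipticalRim(center=(0.25, 0.0), half_axes=(0.2, 0.15), rotation_deg=0.0),
)
print(count_reals(reflector))  # 9 + 4 + 5
```

The point of the sketch is that the data itself is compact; the hard part, which the design tool takes over, is choosing values that make the beam land where intended.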
As a first observation it is noted that most quasi optical networks and
complex antenna systems consist of a collection of sub-systems. Each of
them has a plane of symmetry shared by the components in the sub-system.
The antenna beam waveguide is a good example where the components in
each sub-system remain within a common 2D plane whereas the sub-systems
themselves can be rotated, translated and in other ways re-positioned relative
to each other thus enabling various ways of propagating the beam through
the system. The planes are even more explicit in quasi-optical networks
where many network components are placed in a rectangular planar frame as
shown in Fig. 9-1. The physicist thus mounts components such as reflectors,
mirrors, filters, interferometers, feeds etc. in the frame along the beam paths
that enable the desired transformations of the beams. The concept of a
design frame has been derived from this observation. The frame represents
a planar bounded rectangular subsection of the real plane and it is intended
for modeling and designing systems or subsystems that have a common
plane of symmetry. The next observation is that if the user creates several
subsystems on separate frames, and if the design tool enables the user to
position these frames adequately in 3D space, then the system will enable 3D
modeling. This must of course be done such that the beams propagate in a
controlled manner from frame to frame.
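The idea of building a 3D model from planar frames can be sketched as a mapping from frame-local 2D coordinates to 3D space. The following is a minimal illustration, not the QUAST implementation: `FramePlacement`, its fields and the example coordinates are all hypothetical. Two frames are considered correctly connected here when the exit point on one frame and the entry point on the other map to the same 3D position.

```python
from dataclasses import dataclass
from typing import Tuple

Vec3 = Tuple[float, float, float]

@dataclass(frozen=True)
class FramePlacement:
    origin: Vec3   # 3D position of the frame's local origin
    u_axis: Vec3   # unit vector along the frame's horizontal direction
    v_axis: Vec3   # unit vector along the frame's vertical direction

    def to_3d(self, u: float, v: float) -> Vec3:
        """Map a point (u, v) in the frame plane to 3D coordinates."""
        return tuple(
            o + u * a + v * b
            for o, a, b in zip(self.origin, self.u_axis, self.v_axis)
        )

# Frame A lies in the global xy-plane; frame B is rotated into the xz-plane
# and placed so that its origin coincides with a point on frame A.
frame_a = FramePlacement((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0))
frame_b = FramePlacement((2.0, 0.5, 0.0), (1.0, 0.0, 0.0), (0.0, 0.0, 1.0))

exit_point = frame_a.to_3d(2.0, 0.5)    # where the beam leaves frame A
entry_point = frame_b.to_3d(0.0, 0.0)   # where the beam enters frame B
print(exit_point == entry_point)
```

A complete treatment would also have to match the beam direction across the two frames, which this sketch leaves out.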
The most important feature of a modern interactive graphical design
interface is the user's ability to work directly with the design, i.e. to build the
system interactively with graphical representations of the components and
the beam propagation. As discussed above, symbolic representations are
ideal for the present application. It has thus been a requirement that the user
can select components from a palette of component symbols and place them
directly in a drawing of the system. Clearly, a number of features for
assisting the user must be available such as a supporting design grid to
enable an easy exact positioning for the components. Also in the case of the
beam propagation it is important that the user can enter data in a simple way
where the path of the center beams is specified using simple mouse
operations.
The basic physical blocks needed in the design process are the physical
objects known to the potential users. The new interface must thus allow the
user to work with symbolic representations of real world objects. For the
Quasi-Optical design tool the following types of components were required:

1. Reflectors
2. Feeds representing Gaussian beams
3. Plane waves
4. Loads
5. Mirrors
6. Filters also denoted Beam splitters
7. Lenses
8. Screens with apertures
9. Martin-Puplett interferometers
10. Michelson interferometers.

Reflectors require several input data, from which the program determines
the type of reflector (parabolic, hyperbolic or elliptic) automatically
depending on the situation in which the reflector appears. The sources are
feeds and plane waves. The analysis frequencies shall be defined together
with the sources. Loads are important in quasi-optical networks whereas
they seldom appear in spaceborne antenna systems and beam waveguides.
Mirrors and filters/beam splitters are planar scatterers with different
reflection and transmission characteristics such as polarization and
frequency selective filters. Lenses are key components in quasi-optical
networks. They are also of increasing importance in antenna design. Screens
with apertures and interferometers used for signal multiplexing are
components that mainly appear in quasi-optical networks. Besides these
tangible components a small set of abstract components is required:

1. Near field sampling grid
2. Far field sampling grid
3. Connector.

Evidently, the design software must also enable the user to choose where
and how beam patterns from the analysis shall be inspected. For this purpose
two sampling components are available for near field grids and far field
grids. Finally, if more than one frame is required the user must have a
feature that enables a simple mutual positioning of the frames in such a way
that the beams progress correctly from one frame to the next. The connector
is a component that shall be used to specify where a beam leaves/enters the
subsection of a network in a frame.
When the user has mounted the desired components in the design grid,
i.e. the frame, the software must enable the user to translate these symbolic
components into data for the underlying program, which in this case will be
the new QUAST. The software must thus include wizards for simple
transformation of the components into all the necessary objects for the
subsequent analysis. Such wizards should require a minimum amount of
component data to be input, such as the analysis frequency for a feed, and
sensible default values should be assigned to all other object attributes
whenever possible.
The required radiation analysis is expressed through the specification of
the beam propagation to be analyzed. Initially it was decided that this
specification should be performed in a simple way by mouse click
operations from component to component in the design area. The seemingly
simple procedure can be quite difficult to manage because several
components have several ports that may enter in different ways in the
selection. As an example the beam splitter has two input and two output
ports. It has been a requirement that the design tool should offer the user a
high degree of freedom in selecting how the beam shall propagate among the
different components and that the tool shall offer a simple way of specifying
this propagation whereas the tool shall handle any complex management that
may appear. Further, the tool should warn the user whenever a design has
not been completed or contains outright errors, and should assist the user
in handling such problems.
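The port bookkeeping described above can be sketched as a small consistency check. Everything in the example is hypothetical except the beam splitter's two input and two output ports, which come from the text: the port table, the component names and the warning wording are assumptions, and whether an unused port really deserves a warning is a design decision the real tool may take differently.

```python
from typing import Dict, List, Tuple

# Hypothetical port table: component kind -> (number of inputs, number of outputs).
PORTS: Dict[str, Tuple[int, int]] = {
    "feed": (0, 1),
    "beam_splitter": (2, 2),
    "load": (1, 0),
}

# A beam is (source name, output port, target name, input port).
Beam = Tuple[str, int, str, int]

def check_design(components: Dict[str, str], beams: List[Beam]) -> List[str]:
    """Return readable warnings for invalid or incomplete beam wiring."""
    warnings = []
    used_out, used_in = set(), set()
    for src, out_port, dst, in_port in beams:
        if src not in components or dst not in components:
            warnings.append(f"beam {src}->{dst}: unknown component")
            continue
        if out_port >= PORTS[components[src]][1]:
            warnings.append(f"{src}: no output port {out_port}")
        elif (src, out_port) in used_out:
            warnings.append(f"{src}: output port {out_port} used twice")
        used_out.add((src, out_port))
        if in_port >= PORTS[components[dst]][0]:
            warnings.append(f"{dst}: no input port {in_port}")
        elif (dst, in_port) in used_in:
            warnings.append(f"{dst}: input port {in_port} used twice")
        used_in.add((dst, in_port))
    # Report ports that were never connected (possibly incomplete design).
    for name, kind in components.items():
        n_in, n_out = PORTS[kind]
        for p in range(n_out):
            if (name, p) not in used_out:
                warnings.append(f"{name}: output port {p} is unconnected")
        for p in range(n_in):
            if (name, p) not in used_in:
                warnings.append(f"{name}: input port {p} is unconnected")
    return warnings

components = {"feed1": "feed", "bs1": "beam_splitter", "load1": "load"}
beams = [("feed1", 0, "bs1", 0), ("bs1", 0, "load1", 0)]
for message in check_design(components, beams):
    print(message)
```

In this example the second output and the second input of the beam splitter are reported, since nothing has been connected to them.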
The underlying analysis program QUAST shall perform the analysis.
GRASP9 includes a variety of methods for EM analysis including PO, GTD,
MoM, PTD and A-PO (see below). The analysis methods can be combined
in different ways to obtain the best possible accuracy. Due to the specific
field of application, QUAST will only include the options for PO and A-PO.
Since these highly accurate methods often require considerable CPU times,
an easy to use and fast Gaussian Beam analysis should be included directly
in the design interface for rapid but less rigorous field calculations.
Subsequently a command wizard must help the user to set up the sequence
of field calculations to be performed starting from a source and through the
network, thus obtaining the required fields. Since the network often has
several sources, the wizard must enable the user to manage different
sequences of calculations along the available paths. The field calculations
should utilize PO, equipped with automatic convergence handling to avoid
complex PO control input that would confuse the novice user. The advanced
user may modify the commands using the command editor, which also
enables the user to activate the commands.
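The task of the command wizard, turning a beam network into sequences of field calculations, can be pictured as path enumeration in a directed graph. The sketch below is an assumption about how such a sequence could be derived, not a description of the actual wizard; the network, the component names and the command wording are hypothetical.

```python
from typing import Dict, List

def beam_paths(links: Dict[str, List[str]], source: str) -> List[List[str]]:
    """Enumerate every beam path from `source` to a terminating component."""
    paths: List[List[str]] = []

    def walk(node: str, path: List[str]) -> None:
        successors = [n for n in links.get(node, []) if n not in path]
        if not successors:          # end of the beam path (or a closed loop)
            paths.append(path)
            return
        for nxt in successors:
            walk(nxt, path + [nxt])

    walk(source, [source])
    return paths

# Hypothetical network: two sources share a beam splitter.
links = {
    "feed1": ["reflector1"],
    "feed2": ["splitter"],
    "reflector1": ["splitter"],
    "splitter": ["grid_a", "mirror"],
    "mirror": ["grid_b"],
}

for path in beam_paths(links, "feed1"):
    # One field calculation per hop, in the order the beam travels.
    commands = [f"get field on {dst} from {src}" for src, dst in zip(path, path[1:])]
    print(commands)
```

Each path yields one ordered list of hops, so a network with several sources simply produces several such lists, one set per source.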
In the following sections it is described how these rather comprehensive
requirements have been realized in the new software tool for modeling the
networks.

3. OUTLINE OF THE SOFTWARE SYSTEM

QUAST shares the structure of GRASP8 and GRASP9 [Albertsen et al.,
2000, 2001]. Major extensions have been added to the software but the
backbone structure has been retained. This section contains a short outline of
the software system to be used as background for the further discussions.
The software system consists of two co-operating main modules, which are
the GPAD pre-processor (client) and the analysis program (server, here
QUAST). These two programs execute simultaneously in two separate
processes on the same processor and they communicate via an inter process
communication channel (IPC).
The structure is shown in Fig. 9-3. The program is object oriented. Each
class is implemented in a separate logical module with a set of characteristic
functions plus class specific functions. Furthermore, a number of general
modules are implemented in the kernel and interface modules to enable the
handling and monitoring of objects. New program systems can be created by
removing some classes and adding new ones. Actually, QUAST has been
developed in this way. Further, a combined system i.e. GRASP9+QUAST is
obtained if all classes used in the two programs are included.

Figure 9-3. Outline of program system architecture.

In the kernel the Object Manager (OM) acts as a supervisor and a clerk to
the classes. Object construction, destruction and updates are reported to the
manager, which keeps record, and communicates this information to classes
when required. This includes the handling of all mutual object references
[Albertsen et al., 2001]. The Object Input/Output Module (OIO), which is
called by the class modules, offers services for object storage allocation and
object input and output facilities. The object data parser checks the syntax of
data for object construction and updating, which are specified in a C-like
syntax. The object semantics check module (OSC) is used to check the
validity of the input data. The Navigator module collects information about
objects, commands and files and serves this information to the pre-processor
on request.
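The record-keeping role of the Object Manager can be sketched as a registry that tracks which objects refer to which. This is a minimal illustration under assumptions: the class below, its method names and the policy of refusing to destroy a referenced object are hypothetical and are not taken from the QUAST source.

```python
class ObjectManager:
    """Toy registry that records objects and their mutual references."""

    def __init__(self):
        self._objects = {}     # name -> (class name, attributes, referenced names)
        self._referrers = {}   # name -> set of object names that refer to it

    def construct(self, name, class_name, refs=(), **attrs):
        for target in refs:
            if target not in self._objects:
                raise KeyError(f"{name} refers to unknown object {target}")
        self._objects[name] = (class_name, dict(attrs), tuple(refs))
        self._referrers.setdefault(name, set())
        for target in refs:
            self._referrers[target].add(name)

    def destroy(self, name):
        if self._referrers[name]:
            users = sorted(self._referrers[name])
            raise ValueError(f"{name} is still referenced by {users}")
        _, _, refs = self._objects.pop(name)
        for target in refs:
            self._referrers[target].discard(name)
        del self._referrers[name]

    def referrers(self, name):
        """Names of all objects that hold a reference to `name`."""
        return sorted(self._referrers[name])

om = ObjectManager()
om.construct("wl", "wavelength", value_mm=3.0)
om.construct("cs", "coor_sys")
om.construct("feed", "gaussian_beam", refs=("wl", "cs"))
print(om.referrers("wl"))
```

Keeping the reverse references in one place is what allows a tool to answer "who uses this object?" without asking every class in turn.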
Another key component is the Interface Definition handling. A complete
description of all data and functions of the different classes used in the
object based analysis program is stored in a class definition file denoted the
IDL-file which is input to the IDL Module. The module stores the interface
definition in a complex data structure. The class definition can be
communicated in various forms to the other modules. This enables a
comprehensive semantics check by the OSC module that compares the input
object constructors with the formal class definition. Likewise, the OIO uses
the information, e.g. for storage allocation. The interface definition is also
input to the pre-processor. Hence all class definitions used in the antenna
design program are visible to the pre-processor, which uses the class data
definitions to set up the object editor windows in the GUI.
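A semantics check of the kind attributed to the OSC module can be sketched as a comparison between the attributes supplied to an object constructor and a formal class definition. The class definitions, attribute names and the `ref:` notation below are hypothetical stand-ins for the contents of the IDL-file, whose real syntax is not given in the text.

```python
# Hypothetical class definitions standing in for the IDL-file contents.
CLASS_DEFS = {
    "frequency": {"value_ghz": float},
    "gaussian_beam": {"frequency": "ref:frequency", "taper_db": float},
}

def check_semantics(class_name, attrs, objects):
    """Compare constructor attributes with the formal class definition.

    `objects` maps the names of existing objects to their class names.
    """
    definition = CLASS_DEFS.get(class_name)
    if definition is None:
        return [f"unknown class {class_name}"]
    errors = [f"unexpected attribute {key}" for key in attrs if key not in definition]
    for key, expected in definition.items():
        if key not in attrs:
            errors.append(f"missing attribute {key}")
        elif isinstance(expected, str) and expected.startswith("ref:"):
            # Object references must point to an object of the right class.
            wanted = expected[len("ref:"):]
            if objects.get(attrs[key]) != wanted:
                errors.append(f"{key} must refer to an object of class {wanted}")
        elif not isinstance(attrs[key], expected):
            errors.append(f"{key} must be of type {expected.__name__}")
    return errors

objects = {"f1": "frequency", "cs1": "coor_sys"}
good = check_semantics("gaussian_beam", {"frequency": "f1", "taper_db": -12.0}, objects)
bad = check_semantics("gaussian_beam", {"frequency": "cs1", "taper_db": "x"}, objects)
print(good)
print(bad)
```

Because the same definition table can drive both the check and the editor windows, a single source of class information keeps the two consistent.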
The interface modules are the Command User Interface (CUI), the Inter
Process Communication (IPC) Channel and the Log System Interface (LSI).
The IPC enables a fast communication between QUAST and GPAD. The
analysis program is operated with commands, entered either from a
command file, from the terminal or via the IPC. Commands and matching
data are transformed into actions and data for the OM by the main program.
The LSI module transmits various messages from all other modules to log-
files, the pre-processor and the screen whenever required. The Frame-related
modules are of main interest in the present context. The Frame Editor is
implemented in GPAD whereas each Frame appears as an instance of the
Frame class in the analysis program.
The GPAD includes design menus, which enable the user to design
standard single reflector and double reflector antennas from scratch using a
minimum number of intuitive parameters. The user can load data files with
objects for an arbitrary number of reflectors and other antenna components,
and perform an intelligent editing (e.g. only objects of the correct class can
be inserted into object references) to create additional objects. Since a
substantial number of objects can appear in large complex designs it will
often be very difficult for the user to comprehend the mutual relations
among the objects. For this purpose a module has been developed that
enables the user to inspect and if possible manipulate all references to and
from an object. The pre-processor can display 3D graphics of the objects
currently available, or a selection of these, at any time. The analysis can be
controlled from GPAD. The user can create and edit a sequence of commands
via the command editor. When the user requests an analysis, GPAD
manages the creation of input files and the execution of the selected
commands and shows the calculated radiation patterns
graphically. The Navigator Interface can present a structured overview of the
project components including instantiated objects, commands and the files
currently associated with the project.
The highly configurable GPAD pre-processor has access to the IDL-file
and uses the information about classes and functions stored here to tailor the
graphic interface used to manipulate the objects and the command hierarchy
that governs the calculations. Also, a set of configuration files enables the
developer to manipulate the menu structure, which is derived from the IDL-
file and to specify configuration data depending on the specific platform and
version.
As mentioned above, the pre-processor and the analysis program are
required to execute in two separate processes on the same processor. Since
all object data are stored in QUAST, it is essential to be able to transfer large
amounts of data rapidly between the two processes, which therefore share an
Inter Process Communication (IPC) channel. Since the program is not
intended to run over a network, it was found that for large amounts of data,
exchange via shared memory files was the most efficient solution in this
case.
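Data exchange through a memory-mapped file can be illustrated with Python's standard `mmap` module. This is only a sketch of the general technique, not the GPAD/QUAST channel: the file layout (a 4-byte count followed by doubles), the function names and the file name are assumptions, and both ends run in a single process here for brevity, whereas the real programs run in two processes.

```python
import mmap
import os
import struct
import tempfile

def write_block(path, values):
    """Write a count followed by doubles through a memory-mapped file."""
    payload = struct.pack(f"<I{len(values)}d", len(values), *values)
    with open(path, "wb") as fh:
        fh.write(b"\0" * len(payload))          # reserve the space first
    with open(path, "r+b") as fh:
        with mmap.mmap(fh.fileno(), len(payload)) as view:
            view[:] = payload
            view.flush()

def read_block(path):
    """Read the block back by mapping the same file read-only."""
    with open(path, "rb") as fh:
        with mmap.mmap(fh.fileno(), 0, access=mmap.ACCESS_READ) as view:
            (count,) = struct.unpack_from("<I", view, 0)
            return list(struct.unpack_from(f"<{count}d", view, 4))

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "field.bin")
    write_block(path, [1.0, 2.5, -3.0])
    result = read_block(path)

print(result)
```

The attraction of this approach for large data sets is that the receiving side maps the file and reads values in place, instead of streaming them through a pipe or socket.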

4. ANALYSIS METHODS

The basic analysis method in QUAST is based on first-order Gaussian
beams [Goldsmith, 1998]. This beam analysis is extremely useful in the
design of a quasi-optical network because basic characteristics of the beam
can directly be computed with very little computational effort. This includes
the width of the beam along the beam path, the curvature radius of the phase
front and the cross-polarization level. Simple relations exist for computing
the reflection of such beams in a curved mirror or the refraction through a
dielectric lens. The only input required from the user is the initial beam
parameters, and the focal length of the reflectors and lenses. This analysis is
directly available in the design editor such that the effect of any change in
the design is displayed immediately.
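The "simple relations" of first-order Gaussian beam analysis can be sketched with the standard complex beam parameter q. The code below uses the textbook relations 1/q = 1/R - j lambda/(pi w^2), free-space propagation q -> q + d, and the thin-lens (or focusing mirror) rule 1/q' = 1/q - 1/f; it is an illustration of the method, not QUAST code, and the function names and numerical values are assumptions.

```python
import math

def q_at_waist(w0, wavelength):
    """Complex beam parameter at the waist: q = j * z_R."""
    return complex(0.0, math.pi * w0**2 / wavelength)

def propagate(q, distance):
    """Free-space propagation over `distance`."""
    return q + distance

def thin_lens(q, focal_length):
    """Thin lens or focusing mirror: 1/q' = 1/q - 1/f."""
    return q / (1.0 - q / focal_length)

def beam_radius(q, wavelength):
    """Beam radius w from Im(1/q) = -lambda / (pi * w**2)."""
    return math.sqrt(-wavelength / (math.pi * (1.0 / q).imag))

def curvature_radius(q):
    """Phase-front radius of curvature R from Re(1/q) = 1/R."""
    inv = (1.0 / q).real
    return math.inf if inv == 0.0 else 1.0 / inv

wavelength = 3.0   # mm (about 100 GHz)
w0 = 10.0          # mm, input waist radius
f = 200.0          # mm, focal length

# Waist in the front focal plane, observed in the back focal plane.
q_out = propagate(thin_lens(propagate(q_at_waist(w0, wavelength), f), f), f)
print(beam_radius(q_out, wavelength), wavelength * f / (math.pi * w0))
```

For a waist placed in the front focal plane, the output waist appears in the back focal plane with radius lambda f / (pi w0), which the last line compares against the value obtained by chaining the three steps.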
Although first-order Gaussian beam analysis is sufficient in the design phase
of a quasi-optical system, a more rigorous analysis is normally also required.
The Gaussian beam analysis gives no information about beam asymmetry,
aberration effects and the consequences of beam truncation through the
system. To accomplish this, an analysis by Physical Optics (PO) supplemented
with the Physical Theory of Diffraction (PTD) [Pontoppidan, 2005]
is available in QUAST. In the PO analysis an approximation to the induced
surface currents on the reflectors is computed and the reflected field is found
by a numerical evaluation of the radiation integrals. PO is extremely
accurate for curved reflectors and plane mirrors that are large measured in
wavelengths, and good results can be obtained down to reflectors with a
diameter of 10 wavelengths. In most quasi-optical systems the PO analysis
will thus be sufficient. It gives accurate results for the main beam as well as
far-out sidelobes and even for the radiation behind the reflector. PTD is a
correction to the PO currents close to the edges of the reflector and it is
normally only important for low level sidelobes and cross polarization. The
numerical integration procedure of the PO currents is adaptive so that it is
automatically ensured that the density of current elements is sufficient for
the desired output points. If the computer has more than one processor the
QUAST program will make use of all of them in the current integration. A
modification of PO that makes use of the Fresnel refraction coefficients is
used for dielectric lenses.
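For reference, the PO approximation described above is usually written as follows for a perfectly conducting reflector. This is the standard textbook form, given here as an assumption about the formulation rather than as the exact expressions used in QUAST; an $e^{j\omega t}$ time convention is assumed, $\hat{\mathbf{n}}$ is the outward surface normal, $\mathbf{H}^{i}$ the incident magnetic field, and the second expression is the far-field form of the radiation integral.

```latex
\mathbf{J}_{\mathrm{PO}} =
\begin{cases}
2\,\hat{\mathbf{n}} \times \mathbf{H}^{i} & \text{on the illuminated part of the surface} \\
0 & \text{on the shadowed part}
\end{cases}

\mathbf{E}^{s}(\mathbf{r}) \approx -j\omega\mu\,\frac{e^{-jkr}}{4\pi r}
\int_{S} \left[ \mathbf{J}_{\mathrm{PO}} - (\mathbf{J}_{\mathrm{PO}} \cdot \hat{\mathbf{r}})\,\hat{\mathbf{r}} \right]
e^{jk\,\hat{\mathbf{r}} \cdot \mathbf{r}'}\, dS'
```

The adaptive integration mentioned in the text concerns the numerical evaluation of the surface integral: the more rapidly the phase factor varies over the reflector for a given output point, the denser the sampling of the current must be.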
In many cases the combination of first-order Gaussian beam analysis and
PO will be adequate for design and analysis of the system. If, however, some
of the components are very large or closely spaced the PO integration may
become time consuming. A method that is more accurate than first-order
Gaussian beams and faster than PO is thus needed. In such cases the higher
order Gaussian beam modes are useful. In QUAST the Gauss-Laguerre
modes have been implemented with the modifications suggested by Friberg
et al. (1992). With this modification the beams are more accurate solutions
to Maxwell's equations than the standard beams and also the vector
nature of the field is more accurately modeled.
In the multi-mode Gauss-Laguerre analysis the reflection and refraction
of such beams need to be considered. We have investigated various methods
and implemented the procedure suggested in Imbriale (2000), which seems
to be the most accurate and versatile. The Gauss-Laguerre analysis is able
to accurately compute the beam shape, truncation effects and cross
polarizations, but the field outside the main lobe cannot be computed. The
saving of computation time compared to PO is substantial and nearly
independent of the frequency. In a large complicated network where the PO
analysis can take an hour, the Gauss-Laguerre analysis will typically take
only a few seconds.
Yet another alternative to PO analysis has been implemented in QUAST.
This is the so-called A-PO [Bondo and Sørensen, 2005] in which a number
of auxiliary planes are inserted in the beam path, typically close to the beam
waists. The PO field from the preceding reflector or lens in the beam path
converges rapidly on this auxiliary plane and hereafter the field is converted
to equivalent currents. With a suitable position of the auxiliary plane the
integration of the equivalent currents also converges rapidly on the next
component in the beam path. The A-PO method is slower, but more accurate
than the Gauss-Laguerre analysis, especially regarding beam truncation and
sidelobes.
In the combined GRASP9+QUAST two other analysis methods are
available. These are not integrated in the design frame, but must be accessed
outside the frame in the general GPAD interface. The first is Geometrical
Optics (GO) combined with the Geometrical Theory of Diffraction (GTD)
[Pontoppidan, 2005], and the second is the Method of Moments (MOM)
[Jørgensen, 2003]. In practical designs it often happens that small obstacles
of irregular shape are present close to the beam path, e.g. support structures.
Such structures can be difficult or inaccurate to analyze with PO, but here
MOM can be very useful.

5. USER INTERFACE - THE FRAME EDITOR

The principal objective of the user interface is of course to facilitate the
design of a quasi-optical network as much as possible for the user and to
make the interface as intuitive as possible. The first
consideration was whether the interface should aim at a 2-dimensional
design process or whether a 3-dimensional process should be emulated. It
was decided that an implementation of the latter would not achieve the
simplicity in design aimed at, and a compromise was implemented instead,
where the network is divided into frames, which are designed
individually in 2 dimensions. The frames can subsequently be joined to form
3-dimensional networks using the dedicated components denoted connectors.
The details will be discussed in a subsequent section.
A frame can now, due to its 2-dimensional nature, be designed almost
entirely with clicks of a mouse. This is done in the newly developed Frame
Editor. The frame itself constitutes an object, but it was decided that the
elements placed in the frame should, hierarchically, be placed above objects,
to be intuitively understood. These elements are termed components, and
typically consist of a large number of objects each. A component could be a
feed, lens, reflector, etc. To exemplify, a feed will consist of 3 objects, viz. a
feed radiation pattern, a wavelength and a coordinate system. To assist the
user in the conversion from components to objects, an object wizard must be
included. Furthermore the user must be assisted in the construction of proper
sequences of commands necessary to generate the physical properties of
interest. To achieve this, a command wizard must be included. The wizards
are discussed in the following sections.
A frame must, like all other objects, have a name. Much consideration
was given to the question, whether the frame should have a maximum size
within which components could be placed, or whether components would be
allowed to be placed anywhere in an infinite plane. The latter option would,
however, require that a separate component index be kept, since components
might erroneously be placed in positions where they would be very difficult
to find by visual inspection. As a consequence the former option was
selected, hence a frame size must be specified in units chosen by the user.
The Frame Editor is shown in Fig. 9-4. The top menu buttons are for
mode control and zooming. The buttons are from left to right: Edit
Component, Add Component, Delete Component, Add Beam, Delete Beam,
Check Beam, Follow Beam (Gaussian Beam Analysis), Zoom in, Zoom out,
Redraw to fit design window and Undo. The buttons in the component
selection menu on the left only appear in Add-Component mode. The
buttons are, from top down and from left to right: Feed, Load, Lens,
Mirror, Reflector, Aperture in screen, Michelson interferometer,
Martin-Puplett interferometer, Inter-frame connector, Plane wave, Near field,
Beam Splitter, and Far field.
The components available for the frame editor can be divided into those
that modify the curvature of the phase front of a beam and those that do not.
Lenses and reflectors belong to the former group, while flat mirrors, beam
splitters, apertures in screens and interferometers belong to the latter one. In
general the data necessary to define a component is a unique component
name, a position and an orientation angle.

Figure 9-4. The Frame Editor Interface. A feed and a Beam Splitter have been added. A
reflector is being added at (x,y) = (0.35,0.65).

For the former group, an f-value is also required. In the thin-lens equation
this value is usually referred to as the focal length, but to avoid confusion
with the focal lengths of reflectors, the name f-value will be used here.
While the position of a component can be determined with a mouse-click,
the name, orientation angle and possibly other data (e.g. the f-value) are
more easily entered through the keyboard. There must also be a snap-to-grid
option, which the user can select.
Much consideration was given to the question whether a design should be
required to follow a particular beam path, such that a component, when
added, would automatically be connected to the previous component through
a beam, or whether the layout of components should be completely free. It
was decided that the latter solution would give the user far more freedom in
the design. The price is that beams must be inserted manually to connect the
components. This of course would be done entirely through mouse-clicks.
After the user has placed a number of the required components in the
frame, the relevant components (not necessarily all components) must be
joined by beams, dragged with the mouse from one component to the next.
Each component has a number of ports. Depending on its nature there may
be one, two, or four ports associated with a component, and the user is only
allowed to join ports that are not already in use.
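The component data and the port rule described above can be sketched as a small data model. This is only an illustration, not the QUAST implementation: the class names, the `add_beam` helper and the port count assigned to each component type are assumptions (the chapter only states that a component has one, two, or four ports).

```python
from dataclasses import dataclass, field
from typing import Optional, Tuple

# Assumed port counts per component type (hypothetical mapping).
PORT_COUNT = {"feed": 1, "load": 1, "mirror": 2, "lens": 2,
              "reflector": 2, "beam_splitter": 4}

@dataclass
class Component:
    name: str                          # unique component name
    kind: str                          # key into PORT_COUNT
    position: Tuple[float, float]      # (x, y) in frame units
    orientation_deg: float
    f_value: Optional[float] = None    # only for lenses and reflectors
    used_ports: set = field(default_factory=set)

    def free_ports(self):
        """Ports that no beam is attached to yet."""
        return [p for p in range(PORT_COUNT[self.kind]) if p not in self.used_ports]

@dataclass(frozen=True)
class Beam:
    src: str
    src_port: int
    dst: str
    dst_port: int

def add_beam(a, port_a, b, port_b):
    """Join two components by a beam; only ports not already in use may be joined."""
    for comp, port in ((a, port_a), (b, port_b)):
        if port not in comp.free_ports():
            raise ValueError(f"port {port} of {comp.name} is not available")
    a.used_ports.add(port_a)
    b.used_ports.add(port_b)
    return Beam(a.name, port_a, b.name, port_b)

feed = Component("feed1", "feed", (0.10, 0.50), 0.0)
splitter = Component("bs1", "beam_splitter", (0.50, 0.50), 45.0)
beam = add_beam(feed, 0, splitter, 0)
print(beam, splitter.free_ports())
```

Both ports are checked before either is marked as used, so a rejected beam leaves the components unchanged.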
The mode of operation chosen allows the user to design first a rough
sketch of the network, and to edit the sketch later, taking full advantage of
the interactive assistance provided by the software. The basic editing
facilities are the ability to move a component with the mouse, or to move it
by right-clicking on a beam and resetting the beam's direction angle and
length in a popup window. A more comprehensive edit facility is available in
a popup window reached by right-clicking on a component. Here the basic
data like name, coordinates, orientation, and possibly f-value or other special
data can be typed. Since all the beams attachable to the ports of the
component are known at this stage, the software is now able to offer an extra
help in choosing the correct orientation angle, a problem that can otherwise
be difficult if the beams do not align with Cartesian coordinates. A
dropdown menu will show a list of angular values that all will satisfy the
requirements of the component, and the user has only to choose.

Figure 9-5. Nine components inserted in beam waveguide design: two feeds, a beam splitter,
a load, two mirrors, two reflectors and a far field-sampling grid.

Ideally there will be only one choice, but occasionally there may be several,
e.g. two, for a flat mirror, since no distinction is made between front and
back at this stage. To give a quick visual impression of the status of the
network, components correctly aligned will be shown in green, while
misaligned components are shown in red. Fig. 9-5 shows a beam waveguide
design where 9 components have been inserted and beams between the
components specified. The components are 2 feeds, 1 beam splitter, 1 load, 2
mirrors, 2 conic reflectors and one far-field sampling grid.
Before the network can be analyzed it must be syntactically correct. This
means that relevant ports are all connected correctly among the components
to be analyzed, and orientation angles of all components are correct relative
to the directions of the incoming beam(s) and the transmitted beam(s). An
analysis is a preliminary Gaussian beam calculation starting at either a feed
or a connector. This is not to be confused with a full-wave analysis, which
requires that all relevant objects be generated for all components. The
Gaussian beam analysis requires only a wavelength, a beam radius and a
beam phase front curvature to be entered at the start component.
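The three start values named above (wavelength, beam radius, phase front curvature) are exactly what the standard complex beam parameter needs. The following is a minimal sketch of such a Gaussian beam calculation using the textbook q-parameter rules, not the QUAST code; the function names are hypothetical, focusing components are treated as ideal thin elements, and the sign convention (positive radius of curvature for a diverging beam) is an assumption.

```python
import math

def q_from_beam(wavelength, beam_radius, phase_radius=math.inf):
    """Complex beam parameter from 1/q = 1/R - j*lambda/(pi*w^2)."""
    inv_q = complex(1.0 / phase_radius, -wavelength / (math.pi * beam_radius ** 2))
    return 1.0 / inv_q

def propagate(q, distance):
    """Free-space propagation along the center beam."""
    return q + distance

def thin_element(q, f_value):
    """Ideal focusing component with the given f-value."""
    return 1.0 / (1.0 / q - 1.0 / f_value)

def beam_radius(q, wavelength):
    """Beam radius w at the current position."""
    return math.sqrt(-wavelength / (math.pi * (1.0 / q).imag))

def distance_to_waist(q):
    """Signed distance from the current position to the beam waist (positive = ahead)."""
    return -q.real

def waist_radius(q, wavelength):
    """Radius of the beam waist belonging to q."""
    return math.sqrt(wavelength * q.imag / math.pi)

# Illustrative numbers (not from the chapter): 3 mm wavelength, 10 mm waist
# at the feed, focusing component with f-value 100 mm placed 100 mm away.
wavelength, w0, f_value = 3.0, 10.0, 100.0
q = q_from_beam(wavelength, w0)          # flat phase front at the feed
q = propagate(q, f_value)
q = thin_element(q, f_value)
print(distance_to_waist(q), waist_radius(q, wavelength))
```

With the input waist one f-value in front of the component, the output waist lies one f-value behind it, which gives a convenient sanity check for the sketch.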

Figure 9-6. -20 dB curves of the beam radiated from the leftmost feed in the beam waveguide
design. Beam waists are marked along the beam.

A graphical representation of the beam through the entire network will be
shown as hyperbolas indicating a user preset power level (e.g. -20 dB
relative to the center of the beam) together with the positions of all beam
waists. A number of relevant beam data are furthermore displayed
alphanumerically when the cursor is placed on the beam center, viz. distance to
the beam waist, beam width, cross polarization, radius of curvature of the
phase front, and phase slippage. If part of the beam path is shared between
two feeds operating at the same wavelength, it is also possible to find the
coupling between the beams. A marker is placed on the beam with the mouse,
having one feed activated, and subsequently, having activated the second
feed instead, the coupling in dB will be shown when the cursor is placed on
the marker. In Fig. 9-6 the Gaussian beam analysis has been used for a rapid
analysis of a beam waveguide starting in the leftmost feed with one path
ending in the far field sampling grid at the right and one path ending in the
vertically positioned feed at the top.
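The hyperbolic power-level curves mentioned above follow directly from the Gaussian beam profile. As a hedged sketch (the function name and the numbers are illustrative, and a fundamental-mode beam with power density proportional to exp(-2r²/w²) is assumed), the contour radius at a given level relative to the beam center can be computed as:

```python
import math

def contour_radius(z, waist_radius, wavelength, level_db=-20.0):
    """Radius at which the power density is level_db relative to the beam center.

    z is the distance from the beam waist along the center beam.
    """
    z_r = math.pi * waist_radius ** 2 / wavelength           # confocal distance
    w = waist_radius * math.sqrt(1.0 + (z / z_r) ** 2)       # beam radius at z
    # 10*log10(exp(-2 r^2 / w^2)) = level_db  =>  r = w * sqrt(-level_db*ln(10)/20)
    return w * math.sqrt(-level_db * math.log(10.0) / 20.0)

# Sample the -20 dB curve of a beam with a 10 mm waist at 3 mm wavelength.
for z in (0.0, 50.0, 100.0, 200.0):
    print(z, contour_radius(z, 10.0, 3.0))
```

Because w(z) is a hyperbola in z, every fixed-level contour is a hyperbola as well, scaled by a constant factor that depends only on the chosen level.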
An important consideration in the design of quasi-optical networks is the
position of beam waists. A simple methodology for moving them is therefore
provided. This is achieved by placing the cursor on a particular component,
and then changing the f-value of that component using the keyboard. This
allows the user to change the position of all the waists following that
component along the beam, and check their movements in real time.
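How a change of f-value moves the following waist can be illustrated with the standard thin-element relations for a Gaussian beam. This is a sketch under the assumption of an ideal thin focusing component; `output_waist` and the example numbers are not taken from the chapter.

```python
import math

def output_waist(d_in, w0_in, wavelength, f_value):
    """Waist position and radius behind an ideal thin focusing component.

    d_in  : distance from the input waist to the component
    w0_in : input waist radius
    Returns (d_out, w0_out), with d_out measured from the component.
    """
    z_r = math.pi * w0_in ** 2 / wavelength
    denom = (d_in - f_value) ** 2 + z_r ** 2
    d_out = f_value + (d_in - f_value) * f_value ** 2 / denom
    w0_out = w0_in * abs(f_value) / math.sqrt(denom)
    return d_out, w0_out

# Step the f-value, as a user would from the keyboard, and watch the waist move.
for f_value in (80.0, 100.0, 120.0):
    d_out, w0_out = output_waist(d_in=150.0, w0_in=10.0, wavelength=3.0, f_value=f_value)
    print(f"f = {f_value:6.1f}  waist at {d_out:8.2f}  radius {w0_out:6.2f}")
```

In a chain of components, the output waist of one component becomes the input waist of the next, which is why a single f-value change shifts all subsequent waists.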

6. COMPONENTS AND OBJECTS: THE OBJECT WIZARD

To perform a detailed electromagnetic analysis of the network, it is
necessary to convert the components into objects, which in detail describe
the components. Typically there will be one main object that in turn refers to
other objects, e.g. a reflector component will generate a reflector object that
refers to a coordinate system object, a surface object and a rim object. To
facilitate the design process in the frame, only parameters strictly necessary
for a Gaussian beam analysis are defined when a component is inserted. To
create the objects, a process that may require considerable geometrical
insight for some components, an object wizard is included in the software.
The wizard is intended to allow the user to define the possibly complicated
data on the basis of a few intuitively simple requirements, retaining
compatibility with the basic parameters already defined in the frame. The
only exception is the connector component. It is not a physical device, but a
logical device intended to link frames in a chain, and therefore has no user
accessible objects associated with it.
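The conversion from components to objects can be pictured as a template expansion. The sketch below is purely illustrative: the class name, the naming scheme and the choice of the feed pattern as the main object of a feed are assumptions; only the object lists for the feed and the reflector, and the absence of objects for a connector, come from the text.

```python
from dataclasses import dataclass, field

# Component type -> (main object kind, kinds of the objects it refers to).
TEMPLATES = {
    "feed": ("feed_pattern", ["wavelength", "coordinate_system"]),
    "reflector": ("reflector", ["coordinate_system", "surface", "rim"]),
}

@dataclass
class WizardObject:
    name: str
    kind: str
    refs: dict = field(default_factory=dict)   # kind of referenced object -> its name

def expand(component_name, component_type):
    """Return the objects for one component, main object first."""
    if component_type == "connector":
        return []          # logical device: no user-accessible objects
    main_kind, ref_kinds = TEMPLATES[component_type]
    parts = [WizardObject(f"{component_name}_{kind}", kind) for kind in ref_kinds]
    main = WizardObject(f"{component_name}_{main_kind}", main_kind,
                        {part.kind: part.name for part in parts})
    return [main] + parts

for obj in expand("m1", "reflector"):
    print(obj)
```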
Some components require only a few simple data to generate the
associated objects.

Figure 9-7. The Object Wizard's reflector menu.

A load, a plane mirror, an aperture in a screen, or an interferometer, for
example, requires only one or more dimensions to be complete, whereas other
components like a beam splitter, a lens, or a reflector require more subtle data.
A beam splitter can be either of the polarizing type or of the dichroic
type. The former can be modeled as an ideal, frequency independent grid, or
as a physical grid consisting of strips or wires. In all cases the orientation of
the grid must be specified, in the two latter cases also the spacing and size of
the strips or wires. For a dichroic beam splitter the frequency response must
be defined as two sets of band limits with associated reflection and
transmission levels.
The most complicated type of component is the reflector. Even though
the reflector surface is restricted to be a quadratic surface, a particular
f-value can be achieved with a wide variety of surfaces. It is stipulated that the
focal points of the surface, or focal point in the case of a paraboloid, must lie
on the center beam. The incoming and reflected beams are assigned a
number, 1 or 2, such that the angle from beam 1 to beam 2 is a positive,
acute angle. The distances from the reflection point to the focal points on
beams 1 and 2 are denoted r1 and r2, respectively. Only one of these can be
chosen freely, since they must satisfy a lens formula together with the
f-value. In this way one number can define the entire surface. To define the
reflector rim, use is made of the fact that a circular cone with its apex at a
focal point intersects a quadratic surface along a plane ellipse. Selecting
the orientation angle and half apex angle of the cone will therefore define
the rim.
Alternatively the user can specify a beam radius and demand that the rim
must allow, as a minimum, all power within that radius to be reflected. The
wizard will then choose two cone angles that satisfy the requirement for both
beams 1 and 2.
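The statement that one of the two distances fixes the whole surface can be made concrete with a small calculation. The chapter does not spell out the lens formula, so the sketch below assumes the common form 1/r1 + 1/r2 = 1/f, with a negative distance standing for a virtual focal point; the function names and the surface classification by sign are likewise assumptions.

```python
import math

def second_focal_distance(f_value, r1):
    """Solve the assumed lens formula 1/r1 + 1/r2 = 1/f for r2."""
    if math.isinf(r1):
        return f_value
    inv_r2 = 1.0 / f_value - 1.0 / r1
    return math.inf if inv_r2 == 0.0 else 1.0 / inv_r2

def surface_type(r1, r2):
    """Classify the quadratic surface from the signs of r1 and r2 (assumed convention)."""
    if math.isinf(r1) or math.isinf(r2):
        return "paraboloid"          # one focal point at infinity
    return "ellipsoid" if r1 * r2 > 0 else "hyperboloid"

# The same f-value realized with three different choices of r1.
f_value = 100.0
for r1 in (200.0, math.inf, 50.0):
    r2 = second_focal_distance(f_value, r1)
    print(r1, r2, surface_type(r1, r2))
```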
The Object Wizard in reflector mode is shown in Fig. 9-7. The wizard
has one menu mode for each type of component. Whenever possible,
sensible default values are supplied for the input data. The Reflector menu
enables specification of surface and rim. Note the default and update
buttons, the latter being enabled if the user modifies the associated
r-distance. Generic sections are at the top where the component can be
selected from a drop down menu and at the bottom where the user can
maneuver in the component list and activate the generation of the objects
associated with the current component (generate) or the generation of all
objects stemming from all components (generate all objects in frame).

7. COMPLEX COMMANDS: THE COMMAND WIZARD

The Gaussian Beam Analysis as described in section 4 enables a fast
prediction of the field along the specified beam propagation through the
network designed by the user. Subsequently a more rigorous and accurate
analysis using the PO and PTD methods will always be needed including a
precise prediction of the main beam, the sidelobes and other effects that are
not covered in the fast analysis. QUAST has inherited the command
management from GRASP9 in which a sequence of commands can be set up
for a specific calculation. The Command Editor is used to manage this
sequence. The most common command activates the calculation of currents
on a scatterer given some source, which may be an incident field or a
radiating element like a feed. It can also be a set of currents calculated in
a preceding command. Another important command yields the calculation of
a near or far field for system evaluation. Initially, this command management
was considered adequate. However, as the Frame Editor was being
developed and extended with more features, it became evident that a long
sequence of separate commands handling the evaluations through a network
with many components, possibly specified over several frames, would be
very difficult to cope with, especially for the novice user.
For this reason, a new command wizard has been developed to help the
user with the field calculations for sequences of network components in
frames. The basic idea is that a sequence of calculations always starts in
some radiating source denoted the starting component. This may be a feed or
an external incoming plane wave or it may be a set of currents on a scatterer
type of component that has been evaluated earlier. The user will need to
specify the path of calculation stepwise from the initial component and
forward from scatterer to scatterer and ending with the calculation of a far
field. This is handled with a so-called chain command. The problem is,
however, that several valid paths may exist through the network due to the
beam splitters. Further, the wizard must also enable the user to specify a
chain command over several frames.

Figure 9-8. The command wizard presents two chains. The user can toggle between possible
sub-chains using < and >. The user selects a chain by clicking Add to command.
Clicking Generate command creates the command.

The Command Wizard interface is shown in Fig. 9-8. The user selects the
starting component in the Initial component drop down menu, and the
wizard will then locate all possible beam propagation chains through
the network from that component on the current frame. The user can then
select the chain of interest and the wizard can generate the corresponding
calculation command, which is then input to QUAST for the analysis. If the
user selects a chain that ends in a connector, the command wizard will
continue on the connected frame and present all possible chains starting in
the adjoining connector. In Fig. 9-8, the wizard has located a chain on frame
bwg_design and located a continuation via a connector on another frame
denoted bwg_design2. The components are identified via the concatenation
of the frame name and the component name. Once again, the user can select
one and the (sub-) chain is concatenated to the chain starting at the first
frame. This procedure can be repeated over several frames. The frame class
has been equipped with a module that can manage this complex command
and activate the stepwise calculations. The module uses the PO objects
associated with each scatterer that are created by the Object Wizard. These
objects hold the calculated currents for each scatterer. The default analysis
method is PO+PTD. The Object Wizard also creates the required
field_storage objects for each near and far-field component. Currents and
fields are saved in files. The names of the files are generated automatically.
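The search for all possible chains from a starting component, and the continuation over a connector, amounts to a path enumeration in a directed graph in which beam splitters create branches. The following sketch shows the idea only; the data layout, the function names, the component names and the dot used to concatenate frame and component names are assumptions.

```python
def find_chains(beams, start):
    """List every chain from start to a component with no further outgoing beam.

    beams maps a component to the components reached by its outgoing beams.
    """
    chains = []

    def walk(path):
        successors = [c for c in beams.get(path[-1], []) if c not in path]
        if not successors:
            chains.append(path)
            return
        for component in successors:       # a beam splitter yields several branches
            walk(path + [component])

    walk([start])
    return chains

def extend_over_connector(chain, links, beams):
    """If the chain ends in a connector, append every sub-chain on the connected frame."""
    entry = links.get(chain[-1])
    if entry is None:
        return [chain]
    return [chain + sub for sub in find_chains(beams, entry)]

beams = {
    "bwg_design.feed1": ["bwg_design.splitter1"],
    "bwg_design.splitter1": ["bwg_design.mirror1", "bwg_design.load1"],
    "bwg_design.mirror1": ["bwg_design.conn1"],
    "bwg_design2.conn1": ["bwg_design2.reflector1"],
    "bwg_design2.reflector1": ["bwg_design2.far_field1"],
}
links = {"bwg_design.conn1": "bwg_design2.conn1"}   # neighboring connectors

chains = find_chains(beams, "bwg_design.feed1")
extended = extend_over_connector(chains[0], links, beams)
for chain in chains + extended:
    print(" -> ".join(chain))
```

Each resulting chain corresponds to one candidate for the chain command; in the wizard the user picks among them rather than having all of them executed.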
The PO calculation is managed with a number of attributes in the PO
objects, e.g. the density of the PO integration grid. The novice user should,
however, not be concerned with data for the PO integration. Therefore, a
command for calculation with automatic convergence is created. Further, to
obtain maximum accuracy, the command also prepares for PTD currents on
the edge of all scatterers. The advanced user can modify the command set up
using the existing command and object editors in QUAST [Albertsen et al.,
2000] before the command is actually executed.
The currents on a scatterer in the chain are created not only by the nearest
preceding source but also by other sources that radiate with high power in
the direction of the component. In the frame set-up this includes
contributions from all sources that appear before the nearest preceding
source in the chain, and that lie on the straight line between this and the
scatterer itself. The best accuracy is obtained if all relevant contributions are
included and this is done in the calculations in each step unless the user
modifies the command set up created by the wizard with the object and
command editors.

8. FRAME CONNECTIONS AND 3D MODELLING

A single frame will allow the user to model a system where all
components share a common plane of symmetry and where all beams
propagate in this plane. Modeling in 3D can be achieved if more than one
plane is used. First it is observed that if a beam propagates through a
network in a frame, the center beam will always lie in the plane of symmetry
of that frame. The idea is then as follows: if a beam exits a frame and enters
another frame such that its center beam lies in the plane of symmetry of both
frames, then the two frames can be rotated independently around the linear
center beam. The center beam will remain in both frames no matter the
rotation. The principle is illustrated in Fig. 9-9.
Two sub-systems designed in separate frames may thus be aligned such
that all components are positioned correctly relative to all beams by
matching two beams, one leaving each of the frames. Furthermore, if the beam is
linearly polarized at the emitting source on one frame the polarization angle
will be well defined and known on the other frame. Hence all beam data at
the exit connector obtained in a Gaussian analysis are copied in a rotated
version to the entry connector and the analysis can continue from one frame
to another.
Each frame is equipped with a frame coordinate system (FCS) and
initially all components on a frame including connectors are specified
relative to this coordinate system. The right-handed FCS has its x- and y-axes
in the frame plane. Furthermore, initially all frame coordinate systems share
a common basis, which is an external coordinate system that is not related to
any frame (e.g. a global system). Clearly a beam that leaves a frame must be
handled such that its direction is known relative to the last physical
component (the neighbor), passed before exiting from the frame. The
abstract component connector is introduced for this purpose. The connector
is represented by a right-handed coordinate system in the frame. The point of
connection is the origin. The center beam is the straight line from the
neighbor component to the connector.

Figure 9-9. Principle of connection and rotation of frames (labels: Frame 1, Center Beam, Frame 2).



Figure 9-10. Frame coordinate system (F) and connector coordinate system (c1).

The connector coordinate system is aligned such that the z-axis points along
this line and in the opposite direction of the neighbor, i.e. out of the frame.
The x-axis is positioned to lie in the frame plane and the y-axis points in the
same direction as the FCS z-axis, i.e. orthogonal to this plane (see Fig. 9-10).
Now, assume that two frames 1 and 2 are to be connected at a pair of
connectors, one on each frame. Let C1 be the connector coordinate system
on frame 1 and C2 the corresponding connector coordinate system on frame
2. When Frame 2 is connected to Frame 1, C1 is set to be the basis of C2.
Further, the origin of C2 is set equal to (0, 0, 0) and the z-axis of C2 is set
opposite to the z-axis of C1, i.e. equal to (0, 0, -1), remembering that C1 is
the basis of C2. The two z-axes now lie on the center beam, i.e. on the same
line in 3D space, and they point in opposite directions.

Figure 9-11. Manipulation of the coordinate systems when connecting frames. c1 refers to
connector on Frame 1, c2 to connector on Frame 2.

Next, C2 is rotated as specified by the user around the z-axis of C1, the
connector on Frame 1. A rotation may be selected for both connectors in the
object wizard. The actual rotation performed is the sum of the two. The
rotation of C2 is counter-clockwise around the center beam. The procedure
would be pointless unless the components on Frame 2 were repositioned when
the connector coordinate system on Frame 2 is rotated. The trick is now to let
C2 be the basis of the frame coordinate system on Frame 2 without changing
the relative position of these two systems. In this way all components (and
the frame coordinate system itself) on Frame 2 rotate together with the
connector coordinate system. After the connection, the FCS on frame 1 is the
basis of C1, C1 is the basis of C2 and C2 is the basis of the FCS on frame 2.
Hence, all components on both frames will be positioned directly or
indirectly relative to the frame coordinate system on Frame 1 when the
connection has been completed. The manipulation of the coordinate systems
is shown in Fig. 9-11.
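The chain of bases described above (FCS of Frame 1, then C1, then C2, then FCS of Frame 2) can be checked numerically. The sketch below uses plain 3x3 rotation matrices and is not the GRASP/QUAST implementation. Several points are assumptions: the helper names, the choice that the y-axes of C1 and C2 coincide before the user rotation (the text only fixes the z-axis), and the sense of the rotation. `rotation_deg` stands for the sum of the two connector rotations.

```python
import math

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

def apply(rot, v):
    return [sum(rot[i][k] * v[k] for k in range(3)) for i in range(3)]

def rot_z(deg):
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

# A coordinate system is (R, t) relative to its basis: p_basis = R p_local + t.
def compose(outer, inner):
    (r_o, t_o), (r_i, t_i) = outer, inner
    return matmul(r_o, r_i), [a + b for a, b in zip(apply(r_o, t_i), t_o)]

def invert(system):
    rot, origin = system
    rot_t = [list(row) for row in zip(*rot)]
    return rot_t, [-x for x in apply(rot_t, origin)]

def connector_system(x, y, out_angle_deg):
    """Connector in its frame: z out of the frame in the frame plane, y along FCS z."""
    c, s = math.cos(math.radians(out_angle_deg)), math.sin(math.radians(out_angle_deg))
    return [[-s, 0.0, c], [c, 0.0, s], [0.0, 1.0, 0.0]], [x, y, 0.0]

# C2 expressed in C1 before the user rotation: common origin, z2 = -z1.
FLIP = [[-1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, -1.0]]

def connect(c1_in_f1, c2_in_f2, rotation_deg=0.0):
    """Return the FCS of Frame 2 expressed in the FCS of Frame 1."""
    c2_in_c1 = (matmul(rot_z(rotation_deg), FLIP), [0.0, 0.0, 0.0])
    f2_in_c2 = invert(c2_in_f2)        # C2 becomes the basis of FCS 2
    return compose(c1_in_f1, compose(c2_in_c1, f2_in_c2))

def to_frame1(f2_in_f1, point):
    rot, origin = f2_in_f1
    return [a + b for a, b in zip(apply(rot, point), origin)]

c1 = connector_system(1.0, 0.5, 0.0)      # beam leaves Frame 1 along +x
c2 = connector_system(0.0, 0.2, 180.0)    # beam enters Frame 2 travelling along +x
for angle in (0.0, 90.0):
    print(angle, [round(v, 6) for v in to_frame1(connect(c1, c2, angle), [0.0, 1.2, 0.0])])
```

Points on the center beam keep their position for every rotation angle, while points off the beam swing around it, which is the behavior illustrated in Fig. 9-9.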
A simple example of the connection is shown in Fig. 9-12 and Fig. 9-13.
In Fig. 9-12 two simple frames have been designed. Frame 1 contains a
Gaussian feed, a rectangular mirror and a connector. Frame 2 contains a
connector, an elliptic mirror and a load. The rotations 0° and 90° are selected
for the connector on Frame 1 and Frame 2 respectively.
The resulting system as generated by the object wizard is shown in Fig.
9-13. All components are shown together with their own coordinate
system. Note the two connector coordinate systems with common origin in
the center. Further, notice how the load has been rotated down below the
elliptic mirror and the FCS of Frame_2 lies below the connection.

Figure 9-12. Simple frames each with a connector.



Figure 9-13. Perspective view of the system designed using two frames as shown. The
double ring represents the Gaussian feed.

The above procedure enables the system to connect two frames, but often
4 or 5 frames will be needed when large beam waveguide systems are
modeled. We also foresee that 3D modeling will become relevant with
regard to quasi-optical networks. The QUAST program thus enables the
user to create any number of frames and to connect these. This, on the other
hand, leads to new problems that must be taken care of. First, the user can
create a number of frames and attempt to connect these in a ring. This will,
however, lead to a situation where all the coordinate systems involved will
be defined relative to each other in a ring. This is not logical: an external
global system must either directly or indirectly be the basis of all
coordinate systems. For this reason, a frame must be selected to be the first
for the connections. This is assigned a fixed status, i.e. it is fixed relative to a
coordinate system that is not in any frame. The other frames are called free,
i.e. they may be connected in one or several sequences or groups that start on
the fixed frame. The user can use more than one fixed frame, but no fixed
frame may be connected directly or indirectly with another fixed frame.
Each fixed frame will thus be the point of origin of a group (or a sequence)
of frames.
The connection algorithm loops over the fixed frames. For each of these,
all connections are performed recursively. Assume that the user has
specified one or several connections from the fixed frame. The first
connection is established to the target frame. Then the procedure jumps to
the target and the procedure is repeated with this frame as origin, i.e. a first
connection is made and the procedure jumps again to the next target. This
continues until a frame is met, which has no remaining connections to be set.
Then the procedure jumps back to the previous frame and continues
connecting to the next target if any. The overall procedure is completed
when it returns to the fixed frame and when all connections from this frame
have been established. The implementation uses recursion and all data
management is thus handled automatically. Whenever a frame has been
successfully connected a flag is set for that frame. If a connection is
attempted to a target frame for which the flag has already been set, then the
user has specified a ring of frames. In this case an error message is issued,
the connection procedure is aborted and all connections are cancelled. This
also happens if a target frame is fixed.
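The recursive connection procedure with its flag-based ring detection can be sketched in a few lines. This is an illustration under stated assumptions, not the QUAST code: the names are hypothetical, and each connection is assumed to be listed once, from the frame nearer the fixed frame to its target.

```python
class FrameConnectionError(Exception):
    """Raised when the specified connections form a ring or reach a fixed frame."""

def connect_all(fixed_frames, connections):
    """Establish all connections depth-first, starting from each fixed frame.

    connections maps a frame name to the list of its target frames.
    Returns the (origin, target) pairs in the order they were established.
    """
    connected = set()        # frames whose 'connected' flag has been set
    established = []

    def visit(origin):
        for target in connections.get(origin, []):
            if target in fixed_frames:
                raise FrameConnectionError(f"target frame {target} is fixed")
            if target in connected:
                raise FrameConnectionError(f"ring of frames detected at {target}")
            connected.add(target)
            established.append((origin, target))
            visit(target)    # jump to the target and repeat with it as origin

    for fixed in fixed_frames:
        visit(fixed)
    return established

print(connect_all(["F1"], {"F1": ["F2", "F3"], "F2": ["F4"]}))
```

Since the list of established connections is only returned when every connection succeeds, an error leaves the caller with no partial result, mirroring the rule that all connections are cancelled.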
From the user's point of view the connections are made in a seemingly
transparent manner. A pair of connectors, which are denoted neighbors, on
two separate frames represents a connection. When data is entered for a
connector in the object wizard the user should supply the name of the
neighbor frame and the neighbor connector and these names are stored in the
frame object. The wizard actually presents a selection list with the names of
all known frames and if one is chosen, the wizard will present a list of all
available connectors on that frame. The neighbor connector is also equipped
with the same two data fields. The data entered for the two neighbors must
in principle agree i.e. if it is specified in frame F1 that connector C1 is meant
for a connection to connector C2 on frame F2 then the opposite must be
specified for C2 on F2. If this is the case the connection will be established.
The frame connection algorithm is, however, made such that the user needs
only to specify data for one of the connectors in a neighboring pair. The
software will then automatically generate relevant data for the other.
Finally, a number of detailed warnings and error messages will help the
user in case inconsistent connection data are specified.
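The rule that neighbor data must agree, but need only be entered on one side, can be expressed as a small reconciliation step. The sketch is hypothetical: the dictionary layout, the function name and the error messages are assumptions.

```python
def reconcile_neighbors(connectors):
    """Complete and check neighbor data for all connectors.

    connectors maps (frame, connector) to its neighbor (frame, connector),
    or to None when the user has left the neighbor unspecified.
    """
    result = dict(connectors)
    for key, neighbor in connectors.items():
        if neighbor is None:
            continue
        if neighbor not in result:
            raise ValueError(f"{key} refers to unknown connector {neighbor}")
        back_reference = result[neighbor]
        if back_reference is None:
            result[neighbor] = key             # generate the data for the other side
        elif back_reference != key:
            raise ValueError(f"inconsistent connection data for {key} and {neighbor}")
    return result

data = {("F1", "C1"): ("F2", "C2"), ("F2", "C2"): None}
print(reconcile_neighbors(data))
```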

9. EVALUATION AND FUTURE EXTENSIONS

A new interface has been developed for designing complex quasi-optical
networks and antenna systems including beam waveguides. The final
conclusion with respect to the user friendliness and modeling capabilities
must be postponed until proper evaluations have been made by engineers and
physicists. Initial experience has, however, shown that reliable design and
analysis of such systems can be performed in a fraction of the time needed
with the original GPAD GUI.

Currently, the classes for simulating lenses and interferometers are being
developed. The beam-splitter surface models include strip grids, wire grids
and ideal surfaces. In the future the modeling of more refined beam-splitter
surfaces will be required including advanced FSS surfaces. Further, more
modern graphical rendering features will be included for better and easier
visualization of the system being modeled.

References

Albertsen, N. C., Frandsen, P. E., Jørgensen, R., Nielsen, P. H. and Sørensen, S. C., 2000,
GRASP8 - an object oriented Fortran90 program for reflector and antenna farm analysis,
ESA SP-444 Proceedings of Millennium Conference on Antennas and Propagation
AP2000, Davos, Switzerland, paper no. 0052.
Albertsen, N. C., Frandsen, P. E., Sørensen, S. C. and Lumholt, M., 2001, Reference
handling in the GRASP8 program for reflector antenna analysis, Fifth International
Conference on Software for Electrical Engineering Analysis and Design
(ELECTROSOFT), Lemnos, Greece, WITpress, pp. 87-96.
Albertsen, N. C., Frandsen, P. E. and Sørensen, S. C., 2003, A new advanced interface for
GRASP - the frame concept, 26th ESA Antenna Techn. Workshop on Satellite Antenna
Modelling and Design Tools, Noordwijk, The Netherlands, pp. 29-36.
Bondo, T. and Sørensen, S. B., 2005, Physical Optics Analysis of Beam Waveguides Using
Auxiliary Planes, IEEE Trans. Antennas Propagat., vol. 53, 3, pp. 1062-1068.
Friberg, A. T., Jaakkola, T. and Tuovinen, J., 1992, Electromagnetic Gaussian Beam Beyond
the Paraxial Regime, IEEE Trans. Antennas Propagat., vol. 40, 8, pp. 984-991.
Goldsmith, P. F., 1998, Quasioptical Systems, IEEE Press.
Imbriale, W. A. and Hoppe, D. J., 2000, Recent Trends in the analysis of Quasioptical
Systems, Proceedings from AP2000, Davos, Switzerland.
Jørgensen, E., 2003, Higher-Order Integral Equation Methods in Computational
Electromagnetics, PhD Thesis, Ørsted DTU, Technical University of Denmark, Lyngby,
Denmark.
Jørgensen, R., Padovan, G., de Maagt, P., Lamarre, D. and Costes, L., 2001, A 5-Frequency
Millimeter Wave Antenna for a Spaceborne Limb Sounding Instrument, IEEE Trans.
Antennas Propagat., vol. 49, 5, pp. 703-714.
Pontoppidan, K. (ed.), 2005, GRASP9 Technical Description, TICRA Engineering
Consultants (can be downloaded from www.ticra.com), Copenhagen, Denmark.

Bibliography

Booch, G., 1994, Object-Oriented Analysis and Design with Applications, 2nd Edition, The
Benjamin/Cummings Publishing Company Inc., Redwood City, USA.
Goldsmith, P. F., 1998, Quasioptical Systems, IEEE Press, New York, USA.
TICRA Engineering Consultants, 2005, Frame Design Tool, Copenhagen.
Pontoppidan, K. (ed.), 2005, GRASP9 Technical Description, TICRA Engineering Consultants,
Copenhagen.
Chapter 10
COOPERATIVE COMPUTER AIDED
ENGINEERING OF ANTENNA ARRAYS

A. Esposito, L. Tarricone, L. Vallone and M. Vallone


University of Lecce, Italy

Abstract: In this chapter, the problem of Computer Aided Engineering (CAE) of
rectangular aperture arrays is attacked taking advantage of grid computing
technologies. This allows an implementation of the CAE environment in a
service oriented framework, where CAE components are encapsulated into
services and exploited remotely through the grid. We also propose an
ontology for CAE of aperture antennas. The ontology allows the
identification and localization of remote services and their orchestration for
CAE problem solving. It is demonstrated that the strong need of cooperation
tools in CAE of microwave circuits and antennas is perfectly fulfilled by
these emerging information technologies.

Key words: CAE; Cooperative Engineering; Grid Computing; Aperture Antenna Array;
Semantic Grid; Ontology.

1. INTRODUCTION

Microwave (MW) engineering and research must more and more often
face the problems implied by the growing complexity of applications and
services. One of the most frequent consequences is the increase in the peak
demand of computing power for the computer-aided-engineering (CAE) of
large circuits and antennas. One typical example is represented by the CAE
of aperture arrays. Indeed, the accurate modeling of the radiating behavior is
not trivial even for simple apertures and becomes nearly unaffordable when
medium-sized aperture arrays must be dealt with, as periodic approximations
cannot be adopted.

L. Tarricone and A. Esposito (eds.), Advances in Information Technologies for Electromagnetics, 295-326.
© 2006 Springer. Printed in the Netherlands.

Another issue is the need for integration of heterogeneous knowledge and
design skills: a research project in the CAE of aperture arrays often requires
a joint effort of several groups, each with its own favorite numerical
approaches, programming methodologies, etc. In other words, a strong need
for cooperative engineering strategies arises. This goes along with another
key point: the importance of avoiding reinventing the wheel, when possible.
Indeed, in many cases the MW researcher could take substantial advantage
from existing solutions; unfortunately, it often happens that existing
solutions are not easily accessed or integrated in the framework of a more
complex application.
In conclusion, interoperability, portability and compatibility are crucial
issues, along with the development of very efficient numerical software.
In this chapter we show that grid computing technologies, and semantic
grids more specifically, are suitable to fulfill the above-mentioned
requirements.
In Section 2, the basic properties of a CAE framework are summarized.
Section 3 is an overview of semantic grids, whilst Section 4 focuses on the
related architectures. Section 5 describes how the CAE environment can be
set up thanks to grids and ontologies, and finally conclusions are drawn.

2. CAE OF APERTURE ANTENNA ARRAYS

The analysis of an antenna array starts from the study of the physical
elementary structure (i.e. the geometry of the single radiator) and gives at the
end a full insight into the properties of the whole device (the antenna array).
As usually happens when dealing with complex problems, a sort of
schematization must be introduced in order to simplify the problem and
divide it into various sub-problems. From this perspective, the CAE of an
array of rectangular flange-mounted apertures (we concentrate on such
devices from now on) can be considered as composed of four main tasks,
namely: 1) analysis and design of the feeding waveguide section (ASF);
2) analysis and design of the aperture over the flange (AMC); 3) analysis of
the overall behavior of the system (ESM); 4) analysis of the consequent
radiating properties of the system (ERP).
The study of the basic radiating structure, namely the elementary radiator
opening over the flange, can be faced by employing different numerical
techniques, basically mentioned in some pioneering works enumerated in
[Mongiardo et al., 2000], whilst the study of the mutual coupling arising in
small arrays was pioneered by Mailloux [Mailloux, 1969a, 1969b], and,
subsequently, by Hockam [Hockam and Walker, 1973].

A key point, both at single aperture, and at array analysis level, is the
adoption of suitable functions for field expansion over the aperture.
Although waveguide modes have been a common choice of basis functions,
considerable attention has been devoted to different field expansions on the
aperture and, as an example, in [Mongiardo and Rozzi, 1993] Gegenbauer
polynomials have been considered. The latter choice provides a basis
function set which incorporates the singular field behavior in the proximity
of metallic edges. It is understood that the possibility of dynamically
choosing among different expanding functions can be a relevant and
interesting feature.
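
The idea of choosing the expansion at run time can be sketched in Java as follows. This is only an illustration under stated assumptions, not the code of the cited works: the `BasisFunction` interface, the class names, the keys "WG" and "GE", the sine profile used as a stand-in for waveguide modes and the order parameter `alpha = 0.5` are all hypothetical, and the edge-singularity weight that normally multiplies the Gegenbauer polynomials is omitted. Only the three-term recurrence used to evaluate the polynomials is standard.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical abstraction: n-th basis function on the normalized aperture coordinate x in [-1, 1]. */
interface BasisFunction {
    double value(int n, double x);
}

/** Gegenbauer polynomial C_n^(alpha)(x), evaluated with the standard three-term recurrence. */
class GegenbauerBasis implements BasisFunction {
    private final double alpha;

    GegenbauerBasis(double alpha) { this.alpha = alpha; }

    @Override
    public double value(int n, double x) {
        if (n < 0) throw new IllegalArgumentException("n must be >= 0");
        if (n == 0) return 1.0;
        double prev = 1.0;              // C_0(x)
        double curr = 2.0 * alpha * x;  // C_1(x)
        for (int k = 2; k <= n; k++) {
            // k C_k = 2x (k + alpha - 1) C_{k-1} - (k + 2 alpha - 2) C_{k-2}
            double next = (2.0 * x * (k + alpha - 1.0) * curr
                    - (k + 2.0 * alpha - 2.0) * prev) / k;
            prev = curr;
            curr = next;
        }
        return curr;
    }
}

/** Sine profiles vanishing at both edges: an assumed stand-in for waveguide-mode profiles. */
class SineModeBasis implements BasisFunction {
    @Override
    public double value(int n, double x) {
        return Math.sin((n + 1) * Math.PI * (x + 1.0) / 2.0);
    }
}

class BasisSelector {
    private static final Map<String, BasisFunction> BASES = new HashMap<>();
    static {
        BASES.put("WG", new SineModeBasis());
        BASES.put("GE", new GegenbauerBasis(0.5)); // alpha is an arbitrary illustrative value
    }

    /** Picks the expansion by name at run time. */
    static BasisFunction select(String name) {
        BasisFunction basis = BASES.get(name);
        if (basis == null) throw new IllegalArgumentException("Unknown basis: " + name);
        return basis;
    }

    public static void main(String[] args) {
        String choice = args.length > 0 ? args[0] : "GE";
        System.out.println(choice + ", n=2, x=1: " + select(choice).value(2, 1.0));
    }
}
```

With this layout, the solver code depends only on `BasisFunction`, so a further expansion can be added without touching the rest of the analysis.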

As mentioned before, another sub-problem involved in the CAE of
aperture-antenna arrays concerns the analysis of the feeding waveguide
sections. An appropriate method giving accurate description of the field
inside the feeding section is represented by the Mode Matching method
(MM). The MM technique is usually adopted for the analysis of waveguide
discontinuities: segmenting the feeding waveguide section of an aperture
antenna into a cascade of steps in the E or H plane allows applying the basic
MM assumption to the analysis of the feeder.
In conclusion, the previously mentioned tasks (ASF, AMC, ESM, and
ERP), can quite often be attacked by adopting different numerical
techniques: the applications can in principle be developed by assembling
separate tasks, each being implemented autonomously, with different
methodologies and techniques, throughout the MW community. In other
words, this is an example of a complex application which can be built by
aggregating existing codes, provided that the integration effort required is
not heavier than the one needed when starting from scratch.

3. GRID SERVICES AND SEMANTIC GRID

As seen in Chapter 5, a computational grid is a distributed computational
infrastructure that allows remote organizations to join and share resources,
thus pooling efforts coming from diverse organizations and providing a
potentially very large pool of computational power and storage. This makes
it possible to face problems that could not be attacked before: complex
and multidisciplinary problems can nowadays be tackled by adapting them
to grid computing (GC). One way of doing this is to apply the well-known
divide et impera (divide and conquer) methodology, which solves complex
problems by decomposing them into simpler units. It may happen that some
of the simple units have already been solved by other groups and that these
groups have made the corresponding code available in the form of a grid
service (see Chapters 4 and 5), i.e. in the form of an application component
callable through the network via a well-defined interface. In this way, the
complex application can be built by assembling available pieces of code just
by specifying the workflow.
The attractiveness of this technology is much more apparent when
considering that code developed with traditional programming languages
(such as Fortran and C), and therefore not suited for network computing, can
behave as a grid service with little additional work. The developer must just
add to it an envelope, written in a standard language, to specify the interface.
This makes it possible to preserve past work and to reuse native code.
Another appealing feature of grid services is their suitability for loosely
coupled environments, where the client has little a priori knowledge of the
service to be called. This is very useful in wide-area cooperative
environments (such as a hypothetical EM grid) where the partners possess
limited knowledge of each other. In these contexts, the end-users may not
know a priori which services are available and which tasks they perform.
Besides, in some cases it may be necessary to develop complex applications
which establish at run time the services to invoke, based on the current
state. As explained in Chapters 4 and 5, grid services allow both the
so-called dynamic invocation, i.e. the possibility to define at run time
which service and which methods to invoke, and the so-called automatic
discovery, i.e. the automatic location of the remote service best matching
some input requirements. As explained in Chapter 3, automatic discovery can
be done by means of the so-called semantic grids, which add a structured
conceptual description (ontology) to the published resources and services,
so that the client may adopt meaningful terms and relations (similar to nouns
and verbs of human languages) to express the requirements.
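
Dynamic invocation can be illustrated, by analogy, with plain Java reflection: the operation to call is known only as a string at run time. The sketch below is local and does not use any grid or Web-service API; the class `LocalCaeService`, its two operations and the helper `call` are hypothetical names introduced only for this example.

```java
import java.lang.reflect.Method;

/** Hypothetical local stand-in for a remote service exposing two operations. */
class LocalCaeService {
    public String invokeAMC() { return "AMC done"; }
    public String invokeESM() { return "ESM done"; }
}

class DynamicInvocationDemo {
    /** Looks up a public no-argument operation by name at run time and calls it. */
    static Object call(Object service, String operation) {
        try {
            Method method = service.getClass().getMethod(operation);
            return method.invoke(service);
        } catch (ReflectiveOperationException e) {
            // Unknown operation, or the operation itself failed.
            throw new IllegalStateException("Cannot invoke " + operation, e);
        }
    }

    public static void main(String[] args) {
        // The operation name could come from a registry answer instead of the command line.
        String operation = args.length > 0 ? args[0] : "invokeAMC";
        System.out.println(call(new LocalCaeService(), operation));
    }
}
```

In a grid setting the same decision (which service, which method) is taken at run time, but the call travels through the service interface instead of a local object.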
In this chapter, we show how existing software, implemented in
Fortran, is migrated to a service-oriented framework, encapsulated into
grid services and made exploitable through the grid. We also propose an
ontology for CAE of aperture antennas. The ontology allows the
identification and localization of remote CAE services and their
orchestration for CAE problem solving.

4. SYSTEM ARCHITECTURE

Let us assume that the four steps enumerated in Section 2 have each
been solved by separate and independent working groups, adopting different
software and hardware technologies. We assume that the groups are
interested in sharing their applications so that the global task (CAE of the
whole array) can be performed, but require that their own modules remain
proprietary applications, resident on their own platforms, with all
the guarantees of security of data and applications.
In conclusion, we have several modules, distributed geographically
through the Web, and must guarantee an efficient cooperation among them,
with high reliability and security requirements. Each module solves one of
the enumerated steps, and several modules may solve the same step with
different methodologies. The global task is performed by solving the four
steps in the correct order, i.e. by calling sequentially the appropriate
modules.
In such a complex environment, the end-user must be enabled to:

- locate the modules to launch;

- run sequentially the CAE modules found in the previous step to
perform the overall CAE task.

These steps may be performed only if suitable software exists that is able to
cope with the following issues:

- distribution: the modules are geographically dispersed and need to talk
to one another;
- heterogeneity: differences among modules and platforms must be
hidden;
- security: restricted access and mutual authentication are critical, as the
subjects involved may not trust one another;
- discovery: the underlying environment is loosely coupled, i.e. little a
priori knowledge is available about the shared resources. This creates the
need for advanced search tools;
- orchestration: the four modules must operate in synergy, according to a
well-defined workflow, which establishes the dependencies between the
modules, the constraints to be matched and the data-flow.

As seen in the previous section and in Chapter 5, semantic grids (SGs)
address all the requirements enumerated above. First of all, a CAE
application requires the chaining of isolated tasks, i.e. to solve a CAE
problem autonomous components must talk with each other in a distributed
environment. As explained before, service-oriented architectures (SOAs) are
specifically devoted to managing the interaction among separate tasks (grid
services) in loosely coupled environments. Distribution and heterogeneity
issues are also addressed by SOA standards (such as XML and Java), which make
SOA frameworks independent of both the underlying networking protocols and
the platform. Security is guaranteed by the GC implementation of mutual
authentication and encryption. As for discovery and orchestration, a semantic
model of CAE has been defined and integrated with the framework.
The system architecture is best explained by commenting on the scheme
depicted in Fig. 10-1. The figure shows a distributed system where the CAE
modules are located on networked machines belonging to separate
administrative domains. In such a scenario, it may happen that different
groups publish modules capable of solving the same CAE task. Such
modules probably differ in the implemented methodology or in other
properties, such as performance or memory requirements. It is the
responsibility of the module owner to disseminate the properties of the
published module to the partners of the grid. This is done by means of a
special service, called grid registry, with the specific role of maintaining
a structured, semantic-driven collection of available services and their
properties. A potential end-user discovers available services and their
properties by contacting the grid registry. She/he inputs her/his
requirements in a structured form (such as "I am looking for a service
solving a specific CAE task by using a specific methodology and requiring
less than a certain time"), so that ambiguity is reduced. Once all the
services needed for solving a CAE problem are found, the end-user aggregates
them to form a sort of meta-service, i.e. a service made up of simpler
services. During this phase, the user specifies the way services must be
launched, their order (such as sequential or cyclic) and the constraints to
be satisfied. Once the CAE meta-service has been built, it can be launched,
provided that the appropriate input data are located on the appropriate
platform.
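
The structured query quoted above (task, methodology, upper bound on run time) can be sketched with a small in-memory registry. This is a minimal illustration, not the registry of the framework, which relies on an ontology and a reasoner: the classes `ServiceEntry`, `GridRegistry` and `RegistryDemo`, the method labels and all the time figures are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical registry entry: task solved, methodology used, estimated run time. */
class ServiceEntry {
    final String name;
    final String task;
    final String method;
    final double estimatedSeconds;

    ServiceEntry(String name, String task, String method, double estimatedSeconds) {
        this.name = name;
        this.task = task;
        this.method = method;
        this.estimatedSeconds = estimatedSeconds;
    }
}

class GridRegistry {
    private final List<ServiceEntry> entries = new ArrayList<>();

    /** Called by the module owner to advertise a service and its properties. */
    void publish(ServiceEntry entry) { entries.add(entry); }

    /** Structured query: task, methodology (null means any) and maximum run time. */
    List<ServiceEntry> find(String task, String method, double maxSeconds) {
        List<ServiceEntry> hits = new ArrayList<>();
        for (ServiceEntry e : entries) {
            boolean methodOk = (method == null) || e.method.equals(method);
            if (e.task.equals(task) && methodOk && e.estimatedSeconds < maxSeconds) {
                hits.add(e);
            }
        }
        return hits;
    }
}

class RegistryDemo {
    /** Sample content; names follow Fig. 10-1, properties are invented for the example. */
    static GridRegistry sample() {
        GridRegistry registry = new GridRegistry();
        registry.publish(new ServiceEntry("ASF1", "ASF", "ModeMatching", 120.0));
        registry.publish(new ServiceEntry("ASF2", "ASF", "ModeMatching", 30.0));
        registry.publish(new ServiceEntry("ERP1", "ERP", "MethodA", 60.0));
        registry.publish(new ServiceEntry("ERP2", "ERP", "MethodB", 45.0));
        return registry;
    }

    public static void main(String[] args) {
        for (ServiceEntry e : sample().find("ASF", null, 60.0)) {
            System.out.println(e.name);
        }
    }
}
```

A semantic registry differs from this sketch mainly in that the query terms are concepts and relations of an ontology, so that a reasoner can also return services whose description matches only by inference.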
It is worthwhile to notice that the steps described above may happen
programmatically, without any human intervention and assuming that no a
priori knowledge of the remote services is available. The only information
that must be known is the address of the grid registry. Moreover, once the
ontology is updated with the new instance of the CAE meta-service, it can be
launched at any moment, without the need to repeat the above-mentioned steps,
just by invoking the CAE meta-service whose description is stored in the
ontology.

[Diagram: the services ERP1, ERP2, ESM, ASF1, ASF2 and AMC, the registry and the client application are connected through the Internet; the CAE metaservice is the workflow start, ASF1, AMC, ESM, ERP2, end.]

Figure 10-1. Grid-enabled CAE system overview. CAE services are implemented by
autonomous groups, each having the know-how for solving a specific task with a certain
methodology. The dispersed services have different features and requirements. End-users
discover available services by querying a semantically structured description provided by a
central registry. They first select the most suited services for their needs based on the
description contained in the central registry. Then end-users orchestrate the chosen services in
a workflow so that a CAE meta-service is built up.

5. THE FRAMEWORK

5.1 Introduction

The CAE system described in Section 4 has been implemented in a
departmental grid, where the cooperative environment has been simulated by
deploying CAE services on different nodes of the grid. The CAE services
have been built starting from available application components which have
been encapsulated into services to make them callable in the grid in a
standard fashion. A node of the grid has been given the role of registry. The
registry has been implemented by borrowing technologies and concepts from
semantic grids. It includes a CAE ontology and a reasoner so that, from any
node of the grid, client applications can contact the reasoner to perform
queries about the available services. Based on results provided by the
reasoner, the application can build a new CAE meta-service or invoke an
already existing one to solve its CAE problem.

In summary, the framework consists of the following components:

- grid infrastructure;
- grid services performing CAE tasks;
- ontology and reasoner performing the registry role;
- client applications built to test the environment.

The following sections focus on each component of the implemented
CAE framework.

5.2 Grid Infrastructure

The migration of the CAE environment towards GC has been performed
by adopting the Globus Toolkit (GT) [GT, 2005] as the reference tool, and by
taking advantage of the set of services available in version 3.2. GT3 is
conceived as a set of basic utilities supporting grid computing, the most
relevant for the CAE problem being:

- job management: the possibility to launch applications on remote
platforms. Once the end-user has located the remote application, she/he
selects the most suitable platform for running it and launches the
application on that platform. This is very useful in the CAE domain, where
single tasks are often implemented by computationally demanding codes,
requiring huge amounts of CPU power. If the platform where the CAE program
is deployed is inadequate to host it, job management utilities can be used to
migrate the job to more suitable platforms and to monitor the functioning of
the application;

- parallelism: as said before, most CAE tasks are computationally
demanding. For this reason, some of them may have a parallel
implementation. As demonstrated in [Tarricone and Esposito, 2004], the
migration of parallel code towards computational grids is straightforward, as
it requires no modifications to the original code. The only requirement is
the prior installation of the MPICH-G2 library [MPICH-G2, 2005].
MPICH-G2 is a free, open-source MPI implementation compatible with
GT. A detailed description of parallel CAE implementations working on a grid
is given in [Esposito and Tarricone, 2003], where encouraging results
about performance in grid environments are provided, based on a
comparison with speed-ups achieved on different multiprocessors;

- data management: CAE tasks need to exchange data. The simplest way
to do this is to store input and output data in files. The file generated as
output by a CAE module can provide the input to the subsequent module.
This model becomes slightly less simple in a distributed environment.
In this case, adjacent CAE tasks may run on different platforms, possibly
chosen at run time. Therefore, the files produced as output may be located
on platforms different from the expected ones (see Fig. 10-2). GT third-party
transfer utilities solve this problem, as they allow the movement of files
between two remote platforms to be controlled remotely.
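
The data-management problem can be made concrete with a small planning routine: given the platform chosen for each step, it lists which output files must be moved before the next step can start. This is a sketch under stated assumptions; the classes `Step` and `TransferPlanner` and the host names are hypothetical, and the routine only computes the list of transfers, leaving the actual movement of files to the grid utilities.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** One step of the workflow: the service, the platform it runs on, the file it writes. */
class Step {
    final String service;
    final String host;
    final String outputFile;

    Step(String service, String host, String outputFile) {
        this.service = service;
        this.host = host;
        this.outputFile = outputFile;
    }
}

class TransferPlanner {
    /** Returns one "file: source -> destination" entry per pair of adjacent steps on different hosts. */
    static List<String> plan(List<Step> workflow) {
        List<String> transfers = new ArrayList<>();
        for (int i = 0; i + 1 < workflow.size(); i++) {
            Step producer = workflow.get(i);
            Step consumer = workflow.get(i + 1);
            if (!producer.host.equals(consumer.host)) {
                transfers.add(producer.outputFile + ": " + producer.host + " -> " + consumer.host);
            }
        }
        return transfers;
    }

    /** Example workflow; file names follow Fig. 10-2, host names are invented. */
    static List<Step> sampleWorkflow() {
        return Arrays.asList(
                new Step("ASF1", "hostA", "ASF.out"),
                new Step("AMC", "hostB", "AMC.out"),
                new Step("ESM", "hostB", "ESM.out"),
                new Step("ERP2", "hostC", "ERP.out"));
    }

    public static void main(String[] args) {
        for (String transfer : plan(sampleWorkflow())) {
            System.out.println(transfer);
        }
    }
}
```

Because AMC and ESM share a platform in this example, only two of the three hand-overs require a transfer.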

The use of the above-mentioned utilities in the context of CAE
applications is described in detail in [Tarricone and Esposito, 2004]. In
the context of this book, we focus on issues related to service orientation and
to the new course embraced by recent versions of GT, which move towards the
integration of grid computing concepts with service-oriented architectures.
Therefore, the system components described below must be interpreted
as complementary to the above-mentioned grid functionalities, as they
operate in synergy in an integrated system.

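[Diagram: the services ERP1, ERP2, ESM, ASF1, ASF2 and AMC, the registry and the client application are connected through the Internet; the CAE metaservice is the workflow start, ASF1, AMC, ESM, ERP, end, annotated with the files ASF.in, ASF.out, AMC.out, ESM.out and ERP.out.]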

Figure 10-2. For simplicity, we suppose that CAE tasks exchange data by files. The platform
where files are produced is not known a priori. Therefore, it is the responsibility of the client
application to transfer the output produced by the preceding task to the right destination.

5.3 Encapsulation into Services

The adopted version of GT assumes that grid services are implemented in
the Java language, which guarantees platform independence and smooths
the heterogeneity of the underlying environment. The CAE modules, instead,
are supposed to be implemented in native code, such as standard C or
Fortran. This assumption derives from two considerations:

- these languages are widely used in scientific contexts and our objective
is to promote cooperation by minimizing the impact on original code;

- even though Java has recently been improved in terms of performance,
traditional compiled languages are still faster than Java bytecode.

To bridge native code with the Java-oriented GT framework, the Java Native
Interface (JNI) [JNI, 2005], the standard native programming interface of the
Java platform, has been used. JNI allows native routines to be encapsulated
into loadable libraries so that they can be dynamically bound to Java
methods. This solution combines the platform independence of the Java
service layer with the re-use of native code.
Thanks to JNI, each CAE module has been embedded into a Java
package. This is shown in the following fragment of code, which refers to a
hypothetical AMC native program:

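public class AMCServiceImpl
    implements OperationProvider, GridServiceCallback
{
    ...
    // Native method: its body lives in the shared library loaded below.
    public native void invokeAMC() throws RemoteException;

    static
    {
        // The short name "AMC" is passed as a string; the JVM maps it to
        // the platform file name (libAMC.so on Linux).
        System.loadLibrary("AMC");
    }
    ...
}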

As shown above, the AMC native program is loaded, in the form of a
dynamically loadable library, into a Java class. This process is carried out in a
multi-step fashion as follows:

1. writing the Java program. A Java class must be created. It declares
the native method, loads the shared library encapsulating the native
routine and calls the native method;

2. compiling the Java class;

3. generating a header file from the Java class. This is done
automatically by using the JNI utility named javah. The header
file contains the formal signature for the native routine;

4. modifying the native function by including the previously generated
header file and a JNI header file, and by giving it the same signature
as the one generated by the javah command;

5. compiling the native program as a shared library.
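
Two naming conventions underlie the steps above: the file name the JVM looks for when a library is loaded by its short name, and the C symbol that the generated header declares for a native method. The following sketch computes both. The class `JniNaming` and its methods are hypothetical helpers written for this illustration; the mangling covers only ASCII names of non-overloaded methods, and the package name `cae` in the example is invented.

```java
class JniNaming {
    /** File name searched by System.loadLibrary(shortName); the result is platform dependent. */
    static String libraryFile(String shortName) {
        return System.mapLibraryName(shortName);
    }

    /** C symbol expected for a non-overloaded native method (ASCII names only). */
    static String nativeSymbol(String qualifiedClassName, String methodName) {
        return "Java_" + mangle(qualifiedClassName) + "_" + mangle(methodName);
    }

    private static String mangle(String name) {
        StringBuilder out = new StringBuilder();
        for (char c : name.toCharArray()) {
            if (c == '.') {
                out.append('_');      // package separator becomes an underscore
            } else if (c == '_') {
                out.append("_1");     // a literal underscore is escaped
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(libraryFile("AMC"));
        System.out.println(nativeSymbol("AMCServiceImpl", "invokeAMC"));
        System.out.println(nativeSymbol("cae.AMC_Service", "run"));
    }
}
```

The second line printed is the function name that the native AMC routine must carry in step 4, so that the JVM can bind it to the `invokeAMC` method declared in the Java class.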


Once the Java class is available, it must be embedded into a grid
service. This is done by writing the WSDL file (see Chapter 4). Under the
hypothesis that the service neither requires input parameters nor
provides output data (in our framework data are exchanged via files,
which are managed by GT utilities), the WSDL file is the following:

<types>
  <element name="invokeAMC">
    <complexType/>
  </element>
  <element name="invokeAMCResponse">
    <complexType/>
  </element>
</types>

<message name="invokeAMCInputMessage">
  <part name="parameters" element="invokeAMC"/>
</message>
<message name="invokeAMCOutputMessage">
  <part name="parameters" element="invokeAMCResponse"/>
</message>

The above lines are extracted from the WSDL file of the AMC service,
omitting for simplicity every reference to namespaces. They substantially
constitute the port-type section of the WSDL file, i.e. the section
which describes the operations performed by the service. In the case of the
AMC service cited above, it includes a single operation, named
invokeAMC, published by the AMCServiceImpl Java class. As
seen in Chapter 4, standard WSDL files also comprise the so-called
binding section, which provides details about protocols and formatting
issues. This part is automatically generated by GT during the so-called
deployment phase, thus further simplifying service creation. Deployment
consists in formatting service files and putting them in the right location, so
that the GT container, i.e. the GT daemon responsible for managing the
interaction of services with the remaining part of the grid, is informed about
the existence of the new service. It is a very simple phase, to be performed
with the help of the Ant tool [Ant, 2005], which automates these steps,
provided that a suitable configuration file is given as input. For this purpose,
GT includes some sample build files which are valid in the
majority of cases.
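
A quick way to check that such a fragment is well-formed, and to see which messages it declares, is to parse it with the XML parser shipped with Java. The sketch below is only a sanity check, not part of the GT tooling: the class `WsdlFragmentCheck` is a hypothetical helper, the fragment is wrapped in a `definitions` root element so that it forms a single document, and namespaces are omitted as in the text.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

class WsdlFragmentCheck {
    /** The fragment of the text, wrapped in a root element (namespaces omitted). */
    static final String FRAGMENT =
            "<definitions>"
          + "<types>"
          + "<element name=\"invokeAMC\"><complexType/></element>"
          + "<element name=\"invokeAMCResponse\"><complexType/></element>"
          + "</types>"
          + "<message name=\"invokeAMCInputMessage\">"
          + "<part name=\"parameters\" element=\"invokeAMC\"/>"
          + "</message>"
          + "<message name=\"invokeAMCOutputMessage\">"
          + "<part name=\"parameters\" element=\"invokeAMCResponse\"/>"
          + "</message>"
          + "</definitions>";

    /** Parses the XML and returns the names of the declared messages, in document order. */
    static List<String> messageNames(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList messages = doc.getElementsByTagName("message");
            List<String> names = new ArrayList<>();
            for (int i = 0; i < messages.getLength(); i++) {
                names.add(((Element) messages.item(i)).getAttribute("name"));
            }
            return names;
        } catch (Exception e) {
            // Parsing fails if a tag is left open or an attribute value is not quoted.
            throw new IllegalArgumentException("Fragment is not well-formed XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(messageNames(FRAGMENT));
    }
}
```

An unclosed `element` tag or an unquoted attribute value makes the parser reject the document, which is the kind of error worth catching before deployment.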

5.4 Ontology

5.4.1 Introduction

Once CAE services are available, we must specify an ontology to
enable their automatic discovery and composition.
As said in Section 3, ontologies provide a structured, conceptual
representation of resources and of their relationships which is both machine
understandable and user-friendly. Discovery is facilitated by the use of
ontologies, as they allow electronic agents (software programs) to search for
information on the basis of human-readable queries. In other words, we can
see an ontology as the way to encode knowledge in a specific area of
interest, so that it is possible for electronic agents to search, discover and use
semantically demarcated grid resources.
As seen in Chapter 4, different languages can be used to encode
knowledge. The emerging standard is the so-called OWL (Web Ontology
Language) [OWL, 2005]. To adapt OWL to service needs, an upper
ontology for services has been proposed, the so-called OWL-S [OWL-S,
2005]. OWL-S defines a number of concepts general enough to be valid in every
domain. When implementing a domain-specific ontology, the OWL-S upper
ontology must be specialized by adding specific sub-domain ontologies.
In other words, our work consisted in specializing the available OWL-S
ontology for the CAE domain and in integrating it with other available
ontologies.
To facilitate the ontology building process we have used the Protégé tool
[Protégé, 2005] with the OWL plugin. The Racer reasoner [Racer, 2005] has
been used to check ontology consistency, build taxonomies and verify
class decidability.
In the following, we briefly introduce the most relevant components of the
implemented ontology. To simplify the description and make it easier to
follow, we conceptually divide our ontology into three parts.
The first (subsection named "Service Discovery") overviews the components
which most specifically support the discovery process. It defines the entities
which specify the features that different CAE modules may possess (for
example the implemented methodology).
The second (named "Service Orchestration") describes the components
which support the building of the CAE meta-service. It presents the
entities which model the CAE workflow and the constraints to be satisfied in
order to have a deployable service.
The third (named "service binding") focuses on the entities supporting
the automatic retrieval of the information necessary to invoke the
discovered service.

5.4.2 Service Discovery

For the service discovery task, we used the upper ontology OWL-S,
which contains all the classes needed to publish the properties of deployed
services and to define meta-services as organized aggregation of simpler
and runnable services.
As seen in Chapter 4, OWL-S is centered around the general concept of
Service, which is linked to other fundamental concepts, whose role is that
of providing detailed and structured information about the service itself. For
the purpose of service discovery, the class named ServiceProfile is
provided. It describes what the service does from a client point of view, i.e.
it provides a general description of a service intended to be published and
shared in order to facilitate service discovery.
Fig. 10-3 shows a fragment of the CAE ontology. It represents concepts
and instances related to the AMC subtask in case of flange-mounted
rectangular apertures. As described in [Tarricone et al., 2001], two
approaches can be used for the study of mutual coupling among the
apertures: the former uses a Fourier transformation, the latter is based on the
Lewin transformation. Both cases are, in turn, amenable to two types of
formulations. The former adopts waveguide modes to expand fields over the
apertures, the latter expands the field by means of different basis functions
(e.g. Gegenbauer polynomials).
Our ontology represents this classification of methodologies with four
concepts, all specializing a generic class named AnalysisMethod. Two
disjoint classes represent the two possible transformations: namely the
SpectralTransformation and LewinTransformation classes. The other
two disjoint classes represent the different possible field expansions: namely
the WGModeExpansion and GEPolynomialExpansion class.
A class named Service represents the software solving a particular task,
encoded in the Task class. The class named ServiceProfile describes
what the service does by means of the relationship named hasAim (and
its inverse named isSolvedBy), which links the ServiceProfile class to
the Task class. Inside the Task class two subclasses can be identified:
GlobalTask, standing for a complex EM problem (for example the CAE
problem), and BasicTask, representing the simpler sub-problems into which
a complex problem can be divided (for example the AMC sub-problem). The
relationship named implements (and its inverse named
isImplementedBy) relates the ServiceProfile class to the
AnalysisMethod class, thus expressing the methodology implemented by
grid services.
The knowledge base is obtained by populating the ontology with
instances. Four instances identify the four methodologies of our
classification. Each of them belongs to two classes, respectively expressing
the transformation and the field expansion. For example, the instance named
SP-WG identifies the solution based on a Fourier domain representation
and a waveguide mode field expansion.
The instance named AMCService identifies a specific grid service. The
relationship between the instance named AMCService and the instance
named AMCProfile says that the grid service presents a specific profile;
the relationships between the AMCProfile and the instances AMC and
SP-WG say that the grid service solves the AMC subtask by adopting a
Fourier transformation and a waveguide mode expansion.
This simple example shows how the ontology can encode the concept
"four methodologies for solving the CAE AMC subtask are available; they
are obtained by combining the two possible approaches for representing the
problem with the two possible field expansion models". Note also that the
possibility of introducing inverse relationships (such as implements and
isImplementedBy) gives great flexibility to the model, as the same query
can be formulated in different ways, as happens in natural language. Finally,
it is worth pointing out that inheritance between classes related by
IS-A relationships greatly simplifies the phase of ontology definition
and instantiation. In our example, the relationship linking the class
AnalysisMethod with the class named ServiceProfile is inherited by the
classes representing the four possible methods.
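
The role of inverse relationships can be illustrated with a very small fact store in which asserting a fact also records its inverse, so that the same knowledge can be queried from either side. This is a toy sketch, not an OWL reasoner: the classes `TinyKnowledgeBase` and `CaeKnowledgeBase` are hypothetical, while the instance and relationship names are those used in the text.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

class TinyKnowledgeBase {
    private final Map<String, String> inverseOf = new HashMap<>();
    private final Map<String, Set<String>> objectsByKey = new HashMap<>();

    /** Declares that the two properties are inverses of each other. */
    void declareInverse(String property, String inverse) {
        inverseOf.put(property, inverse);
        inverseOf.put(inverse, property);
    }

    /** Stores the fact and, when an inverse is declared, the mirrored fact as well. */
    void assertFact(String subject, String property, String object) {
        store(subject, property, object);
        String inverse = inverseOf.get(property);
        if (inverse != null) {
            store(object, inverse, subject);
        }
    }

    /** Returns every object linked to the subject through the property. */
    Set<String> query(String subject, String property) {
        Set<String> result = objectsByKey.get(subject + "|" + property);
        return result == null ? Collections.<String>emptySet() : result;
    }

    private void store(String subject, String property, String object) {
        objectsByKey.computeIfAbsent(subject + "|" + property, k -> new LinkedHashSet<>())
                .add(object);
    }
}

class CaeKnowledgeBase {
    /** Facts about the AMC service, as described in the text. */
    static TinyKnowledgeBase build() {
        TinyKnowledgeBase kb = new TinyKnowledgeBase();
        kb.declareInverse("presents", "isPresentedBy");
        kb.declareInverse("hasAim", "isSolvedBy");
        kb.declareInverse("implements", "isImplementedBy");
        kb.assertFact("AMCService", "presents", "AMCProfile");
        kb.assertFact("AMCProfile", "hasAim", "AMC");
        kb.assertFact("AMCProfile", "implements", "SP-WG");
        return kb;
    }

    public static void main(String[] args) {
        TinyKnowledgeBase kb = build();
        System.out.println(kb.query("AMCProfile", "implements"));   // what does the profile implement?
        System.out.println(kb.query("SP-WG", "isImplementedBy"));   // who implements SP-WG?
    }
}
```

Only three facts are asserted, yet six can be queried: this is the flexibility that the inverse relationships give to the formulation of queries.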

[Diagram: concepts Service, ServiceProfile, Task (with subclasses BasicTask and GlobalTask) and AnalysisMethod (with subclasses SpectralTransformation, LewinTransformation, WGModeExpansion and GEPolynomialExpansion); relationships presents/isPresentedBy, hasAim/isSolvedBy and implements/isImplementedBy; instances AMCService, AMCProfile, AMC, SP-WG, SP-GE, LW-WG and LW-GE.]

Figure 10-3. Knowledge base for the CAE tool. Grey circles represent instances, white
circles identify concepts. Large arrows represent IS-A relationships. Dashed arrows connect
instances to their class. Thin arrows identify relationships.

5.4.3 Service Orchestration

Automatic service orchestration means that end-users are enabled to
define new services, starting from available ones, by specifying their
workflow. As for the service discovery task, we used the upper ontology
OWL-S for this task too.
For the purpose of services orchestration, the OWL-S upper ontology
provides the class named ServiceModel. It describes how the service
works, i.e. it tells what happens when the service is invoked.
The service model allows the definition of a sort of meta-service, obtained by
harmonizing a number of more elementary services to achieve a complex
global task. This is done by representing services as atomic processes and
meta-services as composite processes. In other words, a composite process is
a process composed of other processes, whilst an atomic process is a
runnable service. As shown in Fig. 10-4, this model has been used to
represent meta-services capable of solving a whole CAE problem. This has
been done by defining a CAE composite process, made up of four atomic
processes, each representing a CAE task.
OWL-S also allows one to specify the way elementary processes are
combined (in parallel, concurrently, etc.) by means of a set of control
constructs. As shown in Fig. 10-5, a composite process has a composedOf
property which links the process to its control construct. Each control
construct, in turn, is associated through the property named components to
a class describing the control constructs it is composed of. For example, the
control construct named Sequence is linked to a ControlConstructList
class. This class represents a list of control constructs and allows the
desired number of elements to be nested, to build up sequences of any length.
This is done through the definition of a tree of control constructs, whose
terminal nodes are the processes, represented by the special construct class
named Perform.
Fig. 10-6 shows how this mechanism has been applied to CAE. The CAE
composite process has been linked to the Sequence class, to indicate that
CAE is composed of a sequence of atomic processes. Fig. 10-7 shows the
expanded CAE sequence. It assumes the form of a tree whose nonterminal
nodes are labeled with control constructs, and whose leaves are invocations
of atomic processes, represented by instances of the class Perform.
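
The first/rest structure of the control-construct list can be mirrored by a small recursive data structure. The sketch below uses the names of the constructs described in the text (Sequence, ControlConstructList, Perform) for plain Java classes, which are illustrative and not part of any OWL-S library; a null `rest` plays the role of nil, and `executionOrder` is a hypothetical helper that walks the tree and lists the processes in the order in which they would be invoked.

```java
import java.util.ArrayList;
import java.util.List;

/** Node of the control-construct tree. */
interface ControlConstruct {
    void collectPerforms(List<String> out);
}

/** Leaf of the tree: invocation of one atomic process. */
class Perform implements ControlConstruct {
    final String process;

    Perform(String process) { this.process = process; }

    @Override
    public void collectPerforms(List<String> out) { out.add(process); }
}

/** Linked list of constructs: "first" is the head, "rest" the tail (null stands for nil). */
class ControlConstructList {
    final ControlConstruct first;
    final ControlConstructList rest;

    ControlConstructList(ControlConstruct first, ControlConstructList rest) {
        this.first = first;
        this.rest = rest;
    }
}

/** Sequence: its components are executed in list order. */
class Sequence implements ControlConstruct {
    final ControlConstructList components;

    Sequence(ControlConstructList components) { this.components = components; }

    @Override
    public void collectPerforms(List<String> out) {
        for (ControlConstructList cell = components; cell != null; cell = cell.rest) {
            cell.first.collectPerforms(out); // recursion handles nested constructs
        }
    }
}

class CaeSequenceDemo {
    /** The CAE meta-service as a sequence of four atomic processes. */
    static Sequence caeSequence() {
        return new Sequence(
                new ControlConstructList(new Perform("ASF"),
                new ControlConstructList(new Perform("AMC"),
                new ControlConstructList(new Perform("ESM"),
                new ControlConstructList(new Perform("ERP"), null)))));
    }

    static List<String> executionOrder(ControlConstruct construct) {
        List<String> order = new ArrayList<>();
        construct.collectPerforms(order);
        return order;
    }

    public static void main(String[] args) {
        System.out.println(executionOrder(caeSequence()));
    }
}
```

Since a `Sequence` is itself a `ControlConstruct`, it can appear as the `first` element of a list, which is how sequences of any length and depth are built.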
In order to be run, services use resources. This has been represented by
integrating the CAE ontology with a publicly available one [Resource
Ontology, 2005], which models resources. In this ontology (Fig. 10-8),
resources are divided into atomic and aggregate, aggregate resources being
those composed of sets of atomic resources. The model distinguishes also
between unit capacity atomic resources and batch capacity atomic
resources. Batch capacity resources differ from unit capacity resources in
their capability of supporting multiple activities in a synchronized fashion.
Once the resource ontology had been downloaded, we integrated it with
the CAE ontology and specialized some of its concepts. Specifically, we
defined two subclasses of the batch capacity atomic resource class: the
Platform class and the File class. The former is used to represent the
platform where a service is deployed. The latter describes service input and
output files.
Resources are often used to express process preconditions. Preconditions
on processes are the codification of the requirements to be satisfied in order
to run a process. OWL-S gives the possibility to choose the language used to
codify the precondition among a set of publicly available ones. As an
example, the following Semantic Web Rule Language (SWRL) [SWRL,
2005] construct expresses the need to have two instances (ASF.dat and
miro.unile.it) linked by a specific relationship (named isInstalled):

isInstalled(?ASF.dat, ?miro.unile.it)

This precondition has been used to codify the need to have the input
file located on the platform where the service is deployed. Figure 10-9
shows the corresponding XML lines and an illustrative portion of
the CAE ontology which includes preconditions.
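As a rough illustration of how a client might test such a precondition before running a process, the sketch below keeps a set of asserted facts and checks whether isInstalled holds for a given file and platform. The FactBase class is hypothetical and written for this example; it is not part of OWL-S, SWRL or any reasoner API, and a real client would delegate the check to the reasoner.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical fact base; only the property and individual names come from the text.
final class FactBase {
    private final Set<String> facts = new HashSet<>();

    // Encode a binary relationship as a single lookup key.
    private static String key(String property, String subject, String object) {
        return property + "(" + subject + "," + object + ")";
    }

    // Record that subject is linked to object by the given property.
    void assertFact(String property, String subject, String object) {
        facts.add(key(property, subject, object));
    }

    // A precondition holds when the corresponding fact has been asserted.
    boolean holds(String property, String subject, String object) {
        return facts.contains(key(property, subject, object));
    }
}
```

In this sketch, the ASF process would be considered runnable on miro.unile.it only after assertFact("isInstalled", "ASF.dat", "miro.unile.it") has been called, for instance once the input file has been transferred.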

[Figure 10-4 (diagram): the classes Process, AtomicProcess, SimpleProcess and CompositeProcess, connected by the properties realizes/realizedBy and collapsesTo/expandsTo, together with the instances ASF, AMC, ESM, ERP and CAE.]

Figure 10-4. Representation of the CAE meta-service. CAE is modeled as a composite
process, made up of four atomic processes.

[Figure 10-5 (diagram): CompositeProcess is linked by composedOf to ControlConstruct, whose subclasses include Perform, Sequence, Split and Split-Join. Sequence is linked by components to a ControlConstructList, whose first property points to a ControlConstruct and whose rest property points to a further ControlConstructList, ending in nil.]

Figure 10-5. Each CompositeProcess is linked with its control construct via a
composedOf property. Each control construct, in turn, is associated with a class describing the
nested control constructs it is composed of. In the case of the Sequence control construct a
link to a ControlConstructList class is provided. This class represents a list of control
constructs, given in the form of a tree, whose terminal nodes are instances of the class named
Perform, which stands for an invokable process.

[Figure 10-6 (diagram): the instance CAE of CompositeProcess is linked by composedOf to CAE_Sequence, an instance of Sequence (a ControlConstruct, like Split, Split-Join and Perform). CAE_Sequence is linked by components to a ControlConstructList holding the Perform instances ASFPerform, AMCPerform, ESMPerform and ERPPerform.]

Figure 10-6. The instance of the CAE process is linked with an instance of the Sequence
class, to indicate that CAE is composed of a sequence of atomic processes, each represented
by an instance of the class Perform.

[Figure 10-7 (diagram): a chain of ControlConstructList nodes. In each node the first property points to a Perform instance (ASFPerform, AMCPerform, ESMPerform, ERPPerform) and the rest property points to the next list; the last rest is nil.]

Figure 10-7. The expanded CAE sequence. It assumes the form of a tree whose nonterminal
nodes are labeled with control construct lists, and whose leaves are invocations of CAE
atomic processes.

[Figure 10-8 (diagram): Resource is divided into AggregateResource and AtomicResource; AtomicResource into UnitCapacityResource and BatchCapacityResource; File and Platform are subclasses of BatchCapacityResource. The File instance ASF.dat is linked by isInstalled to the Platform instance miro.unile.it. The Service instance ASFService presents the ServiceProfile instance ASFProfile and is linked by isDeployedOn to miro.unile.it.]

Figure 10-8. The CAE ontology has been integrated with a publicly available resource ontology.
Resources are divided into atomic and aggregate. Atomic resources can be unit capacity
resources (i.e. exploitable by a single user) and batch capacity resources (i.e. exploitable by
multiple activities in a synchronized fashion). Examples of resources introduced in the CAE
ontology are the Platform where a service is deployed and the File class, representing
input and output files of a service.

[Figure 10-9 (diagram): the AtomicProcess instance ASF is linked by hasPrecondition to the condition ASF.dat_Existence. Condition has the subclasses DRS-Condition, SWRL-Condition and KIF-Condition. The condition has expressionLanguage SWRL and expressionBody isInstalled(?ASF.dat,?miro.unile.it), serialized in XML as follows.]

<swrl:AtomList xmlns:swrl="http://www.w3.org/2003/11/swrl#">
  <rdf:first xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
    <swrl:IndividualPropertyAtom>
      <swrl:propertyPredicate rdf:resource="#isInstalled"></swrl:propertyPredicate>
      <swrl:argument1 rdf:resource="#ASF.dat"></swrl:argument1>
      <swrl:argument2 rdf:resource="#miro.unile.it"></swrl:argument2>
    </swrl:IndividualPropertyAtom>
  </rdf:first>
  <rdf:rest xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
            rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#nil"></rdf:rest>
</swrl:AtomList>

Figure 10-9. Process preconditions codify the requirements to be matched in order to execute
a process. OWL-S includes several types of preconditions and is integrated with other
languages suited for specifying them. In the example, SWRL is used to express the needed
availability of a specific file on a given platform.

5.4.4 Service Binding

What we have seen so far is the representation of the information needed to
discover available services and compose them into a single meta-service (or,
vice versa, to discover the existence of a meta-service and acquire knowledge
about its components and their relationships). For the complete automation
of service calling, we also need a description of the concrete properties of
services, i.e. the details of how to access the service, such as protocol,
message formats, transport, and addressing. This is done in OWL-S through
the class named ServiceGrounding (see Fig. 10-10).
OWL-S grounding makes use of the Web Services Description Language
(WSDL). As seen in Chapter 4, WSDL is an established, standard XML
format for describing concrete network protocols and message formats
supported by Web Services. Each grid service is associated with a WSDL file
which describes how it works in a structured form. OWL-S and
WSDL partially overlap and complement one another: OWL-S is a
semantically structured representation of grounding information, whilst
WSDL is a detailed description of formatting instructions. For this reason,
OWL-S explicitly points to the WSDL document it refers to, and reports
some of the data included in it in a more structured way.
As shown in Fig. 10-10, OWL-S grounding makes use of WSDL through
the WSDLGrounding class, a subclass of ServiceGrounding.
WSDLGrounding, in turn, is linked to the class named
WSDLAtomicProcessGrounding, which refers to specific elements within the WSDL
specification. Among others, it includes the following properties (Fig.
10-11):

wsdlDocument: the address of the WSDL document this grounding refers to;
wsdlOperation: the WSDL operation corresponding to the given atomic process;
wsdlInputMessage: the WSDL message that carries the inputs of the given atomic process;
wsdlOutputMessage: the WSDL message that carries the outputs of the given atomic process.
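The four properties listed above can be pictured as a small value object. The following is a minimal sketch, assuming that each WSDL element is referenced as the document address followed by a fragment, as in Fig. 10-11; the class AtomicProcessGroundingSketch and its constructor are hypothetical and do not belong to the OWL-S API.

```java
import java.net.URI;

// Hypothetical holder for the grounding properties of one atomic process.
final class AtomicProcessGroundingSketch {
    final URI wsdlDocument;       // address of the WSDL document
    final URI wsdlOperation;      // WSDL operation matching the atomic process
    final URI wsdlInputMessage;   // WSDL message carrying the process inputs
    final URI wsdlOutputMessage;  // WSDL message carrying the process outputs

    AtomicProcessGroundingSketch(String document, String operation,
                                 String inputMessage, String outputMessage) {
        this.wsdlDocument = URI.create(document);
        // Each element is addressed as <document>#<local name>.
        this.wsdlOperation = URI.create(document + "#" + operation);
        this.wsdlInputMessage = URI.create(document + "#" + inputMessage);
        this.wsdlOutputMessage = URI.create(document + "#" + outputMessage);
    }

    // The local name of the operation is the fragment after '#'.
    String operationLocalName() {
        return wsdlOperation.getFragment();
    }
}
```

With the values of Fig. 10-11, the grounding of ASF would be built from the document address http://matisse.unile.it:8080/AFS_service?WSDL and the names invokeASF, invokeASFInputMessage and invokeASFOutputMessage.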

By combining the structured information contained in the OWL-S
document and the detailed description reported in the WSDL file, a
completely automated invocation of the service is viable. This is shown
in more detail in the following section, which overviews the client
application developed to test the environment.

[Figure 10-10 (diagram): Service is linked by supportedBy to ServiceGrounding, whose subclass WsdlGrounding is linked by hasAtomicProcessGrounding to WsdlAtomicProcessGrounding. The instance CAEService is associated with CAEGrounding, which aggregates ASFGrounding, AMCGrounding, ESMGrounding and ERPGrounding.]

Figure 10-10. The description of concrete properties of services, i.e. of the details of how to
access the service, such as protocol, message formats, transport, and addressing, is done in
OWL-S through the class named ServiceGrounding. OWL-S grounding makes use
of WSDL, through the WSDLGrounding class, a subclass of ServiceGrounding.
WSDLGrounding, in turn, is linked with the class named
WSDLAtomicProcessGrounding, which refers to specific elements within the WSDL specification. As the CAE
application is defined as a composite process, its grounding is obtained by aggregating the
groundings of the atomic processes which compose the CAE meta-service.

[Figure 10-11 (diagram): the WsdlAtomicProcessGrounding of the atomic process ASF, with the following property values.
owlsProcess: ASF (an AtomicProcess)
wsdlDocument: http://matisse.unile.it:8080/AFS_service?WSDL
wsdlInputMessage: http://matisse.unile.it:8080/AFS_service?WSDL#invokeASFInputMessage
wsdlOutputMessage: http://matisse.unile.it:8080/AFS_service?WSDL#invokeASFOutputMessage
wsdlOperation: invokeASF (a WsdlOperationRef), with portType http://matisse.unile.it:8080/AFS_service?WSDL#ASFPortType and operation http://matisse.unile.it:8080/AFS_service?WSDL#invokeASF]

Figure 10-11. OWL-S makes use of the established standard WSDL for representing the
information needed for automatic invocation. The two standards partially overlap and
complement one another: OWL-S is a semantically structured representation of grounding
information, whilst WSDL is a detailed description of formatting instructions. For this reason,
OWL-S explicitly points to the WSDL document it refers to and reports some of the
data included in it in a more structured way.

5.5 Client Application

5.5.1 Introduction

A set of client applications has been developed to test the environment.
They simulate the situation where a potential end-user accesses the grid and
automatically configures his or her CAE application by retrieving the needed pieces
from the distributed environment. In the following, the three phases of
discovery, orchestration and binding are briefly overviewed, each in its own
sub-section. This division is made only for the sake of simplicity:
the three phases should be thought of as belonging to a single composite
framework, where each step is automated and fully integrated with the
others.

5.5.2 Service Discovery

In the discovery phase, the application interacts with the Racer reasoner
and the ontology to retrieve services having the required properties. This has
been done through the use of the Protégé OWL API [OWL API, 2005],
which provides constructs to open the ontology, contact the reasoner and
perform queries.
First of all, the ontology is loaded and the reasoner is contacted via
HTTP:

// Load the ontology from the specified URL
OWLModel model = ProtegeOWL.createJenaOWLModelFromURI(ONTOLOGY_URL);

// Connect to the reasoner
ReasonerManager reasonerManager = ReasonerManager.getInstance();
ProtegeOWLReasoner reasoner = reasonerManager.getReasoner(model);
reasoner.setURL(REASONER_URL);

if (reasoner.isConnected())
{
    // Get the reasoner identity, such as its name and version
    DIGReasonerIdentity reasonerIdentity = reasoner.getIdentity();
    ..

Then the code is able to navigate through the ontology and to retrieve the
desired entities. For example, the following lines of code show how to
discover the available services implementing the SP-WG methodology (Fig.
10-3).

// get the property named implements
RDFProperty implementsProperty = model.getRDFProperty("implements");
// get the class named AnalysisMethod
OWLNamedClass analysisMethod = model.getOWLNamedClass("AnalysisMethod");
// get all the instances and subinstances of the class
Collection amInstances = analysisMethod.getInstances(true);

// iterate on each instance
for (Iterator j = amInstances.iterator(); j.hasNext();) {
    OWLIndividual individual = (OWLIndividual) j.next();
    String individualName = (String) individual.getBrowserText();

    // get SP-WG instance
    if (individualName.equals("SP-WG")) {
        // get list of instances linked with SP-WG by implements property
        Collection CAEclasses =
            model.getRDFResourcesWithPropertyValue(implementsProperty, individual);
    }
}

Once the list of matching services is available, further details about
their implementation can be retrieved by reading their profile or looking at
their properties.

5.5.3 Service Orchestration

In this phase, the OWL-S API [OWL-S API, 2005] has been adopted to
allow the automatic orchestration of the elementary services. The OWL-S
API has been specifically designed to manage OWL-S ontology elements: it
provides a Java API for programmatic access to read, execute and write
OWL-S service descriptions. For example, it allows service
descriptions, profiles and processes to be created programmatically, so that composite
processes can be built on the fly once the component services have been
discovered. In the following we show the fundamental constructs needed to
do this.
First of all, a new service must be created, together with the
corresponding composite process:

OWLOntology CAEontology;
// create a new service
Service service = CAEontology.createService(serviceAddress);
// create the composite process for our service
CompositeProcess cprocess = CAEontology.createCompositeProcess(processName);

Then, a sequence is created and associated with the composite process:

// create a new sequence
Sequence sequence = CAEontology.createSequence();
cprocess.setComposedOf(sequence);

Finally, the following code snippet shows how to combine a list of services
in a sequence and put it into the composite process.

Perform[] performs = new Perform[services.size()];

for (int i = 0; i < services.size(); i++) {
    // get the service from the list
    Service s = (Service) services.get(i);
    // get the process from the service
    Process p = s.getProcess();
    // create a new perform
    performs[i] = CAEontology.createPerform();
    performs[i].setProcess(p);
    // add it to the composition
    sequence.addComponent(performs[i]);
}

Further constructs make it possible to specify other aspects of the compound service,
such as the data flow and profile.

5.5.4 Service Invocation

Invoking a CAE meta-service means invoking the services making up the
composite process according to the flow specified in the ontology. The
OWL-S API provides all the constructs needed to navigate the ontology
and identify the processes to call. Each of them is accompanied by
information about the requirements to be met for it to run,
some of which are expressed in the form of preconditions. Most of the time,
the GT API is needed to satisfy them (for moving the input files to
the appropriate platform or allocating the needed resources, for example).
For the sake of brevity, in this book we concentrate on
operations directly related to the use of the ontology, as the adoption of
grid computing facilities has already been treated elsewhere, with direct
reference to the CAE application [Tarricone and Esposito, 2004].
Specifically, we provide some detail on the binding step, which relies on
both the information provided by the grounding section of the ontology
and the standard WSDL description of service protocol and interfaces.
The grounding information is obtained by the following code fragment,
which retrieves the name of the service operation and the address of the
WSDL file:

AtomicGrounding grounding = ((AtomicProcess) process).getGrounding();
WSDLAtomicGrounding wsdlgrounding = (WSDLAtomicGrounding) grounding;
URI operationURI = wsdlgrounding.getOperation();
String operation = wsdlgrounding.getOperationRef().getLocalName();
URI wsdlDocument = wsdlgrounding.getWSDL();

Once the address of the WSDL file is known, the client parses its
contents by using the JWSDL API [JWSDL, 2005], which allows reading
and manipulating every element of that WSDL document.
Once the service's endpoint address and the name of the service
operation to invoke have been found, the client can set up a remote call. These last steps are
carried out by using the Dynamic Invocation Interface (DII) technique
implemented with the Java API for XML-based Remote Procedure Call
(JAX-RPC) [JAX-RPC, 2005], which allows the operations to be
invoked to be specified at run time.
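The idea of choosing the operation only at run time can be illustrated, by loose analogy, with Java reflection on a local object. The sketch below is not JAX-RPC and performs no remote call: LocalAsfStub is a hypothetical local stand-in for the remote service, and the operation name is passed as a string, as it would be after being read from the grounding.

```java
import java.lang.reflect.Method;

// Hypothetical local stand-in for a remote service; not part of JAX-RPC.
final class LocalAsfStub {
    public String invokeASF(String inputFile) {
        return "processed " + inputFile;
    }
}

final class DynamicCallSketch {
    // Look up a one-argument operation by name at run time and call it.
    static Object call(Object target, String operation, String argument) {
        try {
            Method m = target.getClass().getMethod(operation, String.class);
            return m.invoke(target, argument);
        } catch (ReflectiveOperationException e) {
            // Unknown operation name or failed invocation.
            throw new IllegalStateException("cannot invoke " + operation, e);
        }
    }
}
```

Here DynamicCallSketch.call(new LocalAsfStub(), "invokeASF", "ASF.dat") resolves the method from its name alone, which mirrors how a DII client needs no stub compiled in advance for the operation it calls.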

6. CONCLUSIONS

Referring to the CAE of aperture arrays, it has been shown that GC and its
companion technologies allow 1) a straightforward integration of heterogeneous codes
in a single framework and 2) the automatic discovery of application components
based on the input requirements of the MW researcher. Thanks to grid
services, once the MW researcher has identified all the needed applications
(for instance, the four tasks previously enumerated in the case of aperture
array design), he/she can generate a meta-application consisting of a
dynamic aggregation of distributed components. At the same time, the
infrastructure adopted to generate the meta-application (the computational grid)
provides a very low-cost environment for high-performance computing,
where computationally intensive applications can be run. Both
the cooperative engineering needs and the demand for intensive computing
are thus satisfied at once.
Even more interesting perspectives are opened by the so-called Semantic
Grids. Once the MW researcher has identified how to partition the complete
CAE problem into subproblems, a human-readable language will allow the
description of the several components, so that they are automatically
discovered, invoked and aggregated into a workflow. This means that the
capability of rigorously defining the design problem is itself the solution of
the problem. This suggests focusing, in the near future, on the creation of
suitable tools for defining problems: adequate languages and standardized
atomic building blocks are some of the many key points.

References
Ant, 2005, http://ant.apache.org/.
De Roure, D., et al., 2003, The Semantic grid: a future e-science infrastructure, in Grid
Computing: Making the global infrastructure a reality, F. Berman, A. Hey, G. Fox
(Eds.), J. Wiley, Chichester, 2003, pp. 437-470.
Esposito, A., and Tarricone, L., 2003, Grid Technology for Computational Electromagnetics:
a Beginner's Guide with Applications, IEEE Antennas and Propagation Magazine, 45, 2,
2003, pp. 91-99.
GT, 2005, http://www.globus.org.
Hockham, G. A. and Walker, G. H., 1973, Study of a finite phased array antenna, in Proc.
European Microwave Conf. (Brussels), Sept. 1973, paper C.1.3.
JAX-RPC, 2005, http://java.sun.com/xml/downloads/jaxrpc.html.
Jena 2 API, 2005, http://jena.sourceforge.net/ontology/.
JNI, 2005, http://java.sun.com/j2re/1.4.2/docs/guide/jni.
JWSDL, 2005, http://www.javaworld.com/javaworld/jw-09-2002/jw-0920-webservices-p4.html.
Mailloux, R. J., 1969a, Radiation and near field coupling between two collinear open ended
waveguides, IEEE Trans. Antennas Propagat., vol. 17, pp. 49-55, Jan. 1969.
Mailloux, R. J., 1969b, First-order solution for mutual coupling between two collinear open
ended waveguides, IEEE Trans. Antennas Propagat., vol. 17, pp. 740-746, Nov. 1969.
Mongiardo, M. and Rozzi, T., 1993, Singular integral equation analysis of flange-mounted
rectangular waveguide radiators, IEEE Trans. Antennas Propagat., vol. 41, pp. 556-565,
May 1993.
Mongiardo, M., et al., 2000, A comparison of numerical methods for the full-wave analysis
of flange mounted rectangular apertures, Int. Journal Numerical Modelling, 13, 1, 2000,
pp. 21-35.
MPICH, 2005, http://www.mcs.anl.gov/mpi/mpich/download.html.
OWL, 2005, Web Ontology Language 1.0 Reference, http://www.w3.org/TR/2002/WD-owl-ref-20020729/.
OWL API, 2005, http://protege.stanford.edu/plugins/owl/api/guide.html#Jena.
OWL-S, 2005, http://www.daml.org/services.
OWL-S API, 2005, http://www.mindswap.org/2004/owl-s/api/.
Protégé, 2005, http://protege.stanford.edu.
Racer, 2005, http://www.sts.tu-harburg.de/~r.f.moeller/racer/.
Resource Ontology, 2005, http://www.daml.org/services/owl-s/1.0/Resource.owl.
SWRL, 2005, http://www.w3.org/Submission/2004/SUBM-SWRL-20040521/.
Tarricone, L., et al., 2001, A Parallel Framework for the Analysis of Metal-flanged
Rectangular-aperture Arrays, IEEE Trans. Ant. Prop., pp. 1479-1484, Oct. 2001.
Tarricone, L., and Esposito, A., 2004, Grid Computing for Electromagnetics, Artech House,
Boston, MA, 2004, pp. 1-266.
Chapter 11
DISTRIBUTED AND OBJECT-ORIENTED
COMPUTATIONAL ELECTROMAGNETICS
ON THE GRID

D. Caromel¹, F. Huet¹, S. Lanteri² and N. Parlavantzas¹,*

¹ INRIA-I3S-CNRS, project-team Oasis, 2004 route des Lucioles, BP 93, 06902 Sophia
Antipolis Cedex, France; ² INRIA, project-team CAIMAN, 2004 route des Lucioles, BP 93,
06902 Sophia Antipolis Cedex, France

Abstract: Grids raise new challenges for programming, composing, and deploying
numerical applications. Heterogeneity, medium to high latency, various
underlying systems and protocols call for new paradigms and techniques.
Within this framework, the development of high performance numerical
methods for the solution of systems of PDEs (Partial Differential Equations),
must integrate these factors, representing both new difficulties and new
opportunities. In this chapter, we describe an open source middleware for the
Grid, ProActive, featuring distributed objects and components. Using
ProActive, we demonstrate how to design and implement an object-oriented
(OO) time domain finite volume solver on unstructured meshes for the 3D
Maxwell's equations modelling the propagation of electromagnetic waves.
We also present some experimental results obtained on an experimental Grid,
Grid5000, running on more than 400 processors.

Key words: Grid Computing; PDEs; Finite Volume Solver.

1. INTRODUCTION

The availability of powerful computers and high-speed network
technologies as low-cost commodity components is changing the way we
use computers today. These technological opportunities have led to the
possibility of using distributed computing platforms as a single, unified
resource, leading to what is popularly known as Grid computing. However,
this emerging grid computing concept also brings additional constraints on
the development of large-scale distributed applications, such as heterogeneity
(both in terms of CPUs and interconnection networks) and
multi-localization. We present here the results of a collaborative effort between
computer scientists and applied mathematicians towards the use of modern
programming languages and libraries (Java and ProActive⁷) for the design
and implementation of an object-oriented time domain finite volume solver
on unstructured tetrahedral meshes for the 3D Maxwell's equations
modelling the propagation of electromagnetic waves.
Section 2 presents the main characteristics of the ProActive library and,
in particular, the object-oriented group communication paradigm. Section 3
is dedicated to the description of the object-oriented distributed solver at the
heart of this study. Section 4 presents performance results for various
experimental test beds, including Grid5000⁸. Finally, section 5 discusses
ongoing and future work, and section 6 concludes the chapter.

* This work was carried out for the COREGRID IST project no 004265 funded by the
European Commission.

L. Tarricone and A. Esposito (eds.), Advances in Information Technologies for Electromagnetics, 327-343.
© 2006 Springer. Printed in the Netherlands.

2. DISTRIBUTED OBJECTS: PROACTIVE

ProActive is a Java middleware for parallel, concurrent and distributed
programming which features high-level services such as weak migration, group
communication, security, deployment and components. As ProActive is built
on top of the Java standard API, it does not require any modification to the
standard Java execution environment, nor does it make use of a special
compiler, pre-processor or modified virtual machine.

2.1 Basic Model

A distributed or concurrent application built using ProActive is
composed of a number of medium-grained entities called active objects.
Each active object has one distinguished element, the root, which is the only
entry point to the active object. Each active object has its own thread of
control and is granted the ability to decide in which order to serve the
incoming method calls, which are automatically stored in a queue of pending
requests. Method calls sent to active objects are asynchronous with
transparent future objects, and synchronization is handled by a mechanism
known as wait-by-necessity [Caromel, 1993]. There is a short rendezvous at
the beginning of each asynchronous remote call, which blocks the caller
until the call has reached the context of the callee. The ProActive library
provides a way to migrate any active object from one JVM to another
through the migrateTo primitive. This primitive can be called either from
the object itself or from another active object through a public method call.

⁷ ProActive is available under the LGPL at http://ProActive.ObjectWeb.org.
⁸ The Grid'5000 initiative is described at http://www.grid5000.org.

2.2 Mapping Active Objects to JVMs: Nodes

Another extra service provided by ProActive (compared to RMI, for
instance) is the capability to remotely create remotely accessible objects. For
that reason, there is a need to identify JVMs, and to add a few services.
Nodes provide those extra capabilities: a Node is an object defined in
ProActive whose aim is to gather several active objects in a logical entity. It
provides an abstraction for the physical location of a set of active objects. At
any time, a JVM hosts one or several active objects. The traditional way to
name and handle nodes in a simple manner is to associate them with a
symbolic name, that is, a URL giving their location, for instance
rmi://lo.inria.fr/Node1. Let us take a standard Java class named A. The
instruction:
A a = (A) ProActive.newActive("A", params,
                              "rmi://lo.inria.fr/Node1");

creates a new active object of type A on the JVM identified with Node1.
Further, all calls to that remote object will be asynchronous, and subject to
the wait-by-necessity:
a.foo (...); // asynchronous call
v = a.bar (...); // asynchronous call
...
v.f (...); // wait-by-necessity: wait until v gets its value
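The effect of wait-by-necessity can be approximated in standard Java with an explicit future. The sketch below is only an analogy and does not use ProActive: in ProActive the future is transparent, so no explicit join() appears in user code, whereas here the blocking point is written out. The method bar and the values used are hypothetical.

```java
import java.util.concurrent.CompletableFuture;

final class WaitByNecessitySketch {
    // Stand-in for a method of a remote object: returns at once with a future result.
    static CompletableFuture<Integer> bar(int x) {
        return CompletableFuture.supplyAsync(() -> x * 2);
    }

    static int demo() {
        // Asynchronous call: the caller continues immediately.
        CompletableFuture<Integer> v = bar(21);
        // Unrelated work can proceed while the result is being computed.
        int other = 0;
        // The caller blocks only here, when the value is actually needed.
        return v.join() + other;
    }
}
```

The design point is the same in both cases: the caller pays the waiting cost only at the first real use of the result, not at the call site.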

Note that an active object can also be bound dynamically to a node as the
result of a migration. In order to help in the deployment phase of ProActive
components, the concept of Virtual Nodes has been introduced. Virtual Nodes
are entities for mapping active objects, and they can be created using deployment
descriptors.

2.3 Deployment Descriptors

The deployment descriptors [Baude et al., 2002] provide a means of
removing from the source code of the application any reference to software
or hardware configuration. They also provide an integrated mechanism to
specify the external processes that must be launched and the procedure for launching them.
The goal is to deploy an application anywhere without having to change
the source code, all the necessary information being stored in an XML
descriptor file. An application using deployment descriptors has access to an
API enabling it to query its runtime environment for information such as the
number of nodes available.
A deployment file is made of three parts. The first one, VirtualNode, is
used to declare the node names that will be used in the source code of the
application. The second part, Mapping, describes how the virtual nodes are
to be mapped to virtual machines. Finally, the Infrastructure part describes
how these virtual machines will be created.

VirtualNode:
jem3DNode

Mapping:
jem3DNode--> VM1, VM2

Infrastructure:
VM1 --> Local Virtual Machine
VM2 --> SSH host1 then RemoteVM
RemoteVM --> Local Virtual Machine

Figure 11-1. Example of a deployment file.

An example of a deployment file is given in Fig. 11-1. For the sake of
clarity, we have used a pseudo-code syntax instead of the less readable XML
one. The symbolic names used as references in the file are shown in italics.
These names are used to structure the descriptor and can take arbitrary
values. The references in bold are the actual classes provided by ProActive.
The application that uses this file can use the symbolic name jem3DNode in
its source code to access these resources. When used, this virtual node will
be mapped onto two virtual machines, VM1 and VM2, specified in the
infrastructure part. These virtual machines are created as follows. The first
one is created locally. The second one triggers an ssh connection to host1
and then creates a new local virtual machine there. In this part it is also
possible to specify various environment variables, such as CLASSPATH, to be
used for the creation of the virtual machine.
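To make the three-part structure concrete, the descriptor of Fig. 11-1 can be pictured as two lookup tables that an application queries at run time. The following is a minimal sketch in plain Java, not the ProActive descriptor API: the class name, the map layout and the nodeCount helper are all hypothetical.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class DescriptorSketch {
    // Mapping part: virtual node name -> symbolic virtual machine names.
    static final Map<String, List<String>> MAPPING = new LinkedHashMap<>();
    // Infrastructure part: virtual machine name -> how it is created.
    static final Map<String, String> INFRASTRUCTURE = new LinkedHashMap<>();

    static {
        MAPPING.put("jem3DNode", Arrays.asList("VM1", "VM2"));
        INFRASTRUCTURE.put("VM1", "Local Virtual Machine");
        INFRASTRUCTURE.put("VM2", "SSH host1 then RemoteVM");
        INFRASTRUCTURE.put("RemoteVM", "Local Virtual Machine");
    }

    // Hypothetical query: how many virtual machines back a virtual node?
    static int nodeCount(String virtualNode) {
        return MAPPING.getOrDefault(virtualNode, Collections.<String>emptyList()).size();
    }

    public static void main(String[] args) {
        System.out.println(nodeCount("jem3DNode"));
    }
}
```

The application code only ever mentions the symbolic name jem3DNode; changing the tables changes the deployment without touching that code.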

2.4 Group Communications

Within the application code, the group communication mechanism of
ProActive achieves asynchronous remote method invocation for a group of
remote objects, with automatic gathering of replies. Given a Java class, one
can initiate group communications using the standard public methods of
the class together with the classical dot notation; in that way, group
communications remain typed. Furthermore, groups are automatically
constructed to handle the result of collective operations, providing an elegant
and effective way to program gather operations.
Let A be a standard Java class. The following lines show typical code for
creating a group:
// A group of type A and its 2 members are created
// at once on the nodes directly specified; constructor
// parameters are specified in params
Object[][] params = {{...}, {...}};
A ag = (A) ProActiveGroup.newGroup("A",
params, new Node[] {node1, node2});

Elements can be included in a typed group only if their class equals
or extends the class specified in the group creation. Note that we do allow
and handle polymorphic groups. For example, an object of class B (B
extending A) can be included in a group of type A. However, based on Java
typing, only the methods defined in the class A can be invoked on the group.
A method invocation on a group has a syntax similar to a standard
method invocation:
ag.foo(...); // A group communication

Such a call is asynchronously propagated to all members of the group
using multithreading. As in the basic ProActive model, a method call on a
group is non-blocking and provides a transparent future object to collect the
results. A method call on a group yields a method call on each of the group
members. If a member is a ProActive active object, the method call will be a
ProActive call and if the member is a standard Java object, the method call
will be a standard Java method call (within the same JVM). The parameters
of the invoked method are broadcast to all the members of the group.

An important feature of the group mechanism is that the result of a
typed group communication is also a group. The resulting group is
transparently built at invocation time, with a future for each elementary
reply. It is dynamically updated with the incoming results, thus
gathering them. The wait-by-necessity mechanism is also valid on groups:
if all replies are still awaited the caller blocks, but as soon as one reply
arrives in the result group, the method call on this result is executed. For
instance in:

V vg = ag.bar(); // A method call on a group with result
// vg is a typed group of V
vg.f(); // This is also a collective operation

a new f() method call is automatically triggered as soon as a reply from
the call ag.bar() comes back in the group vg (dynamically formed). The
instruction vg.f() completes when f() has been called on all members.
Other features are available regarding group communications:
parameter dispatching using groups (through the definition of scatter
groups), hierarchical groups, dynamic group manipulation (add, remove
members), group synchronization and barriers (waitOne, waitAll,
waitAndGet); see [Baduel et al., 2002] for further details and
implementation techniques. ProActive also features a distributed and
hierarchical component model [Baude et al., 2003], which is introduced
in section 5.
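The way a result group is filled reply by reply, with the follow-up call triggered on each reply as it arrives, can be imitated with standard Java futures. This is a sketch under stated assumptions rather than ProActive code: the Member class, its bar method and the summing step that plays the role of f() are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicInteger;

class GroupGatherSketch {
    // Hypothetical group member whose bar() produces one reply.
    static class Member {
        final int id;
        Member(int id) { this.id = id; }
        int bar() { return 10 * id; }
    }

    // Calls bar() on every member asynchronously and applies the follow-up
    // step to each reply as soon as it arrives; returns once all are done.
    static int callOnGroup(List<Member> group) {
        AtomicInteger sum = new AtomicInteger();
        List<CompletableFuture<Void>> pending = new ArrayList<>();
        for (Member m : group) {
            pending.add(CompletableFuture
                    .supplyAsync(m::bar)                    // plays the role of ag.bar()
                    .thenAccept(v -> sum.addAndGet(v)));    // plays the role of vg.f()
        }
        // Completes when the follow-up step has run on all members.
        CompletableFuture.allOf(pending.toArray(new CompletableFuture[0])).join();
        return sum.get();
    }

    public static void main(String[] args) {
        List<Member> group = new ArrayList<>();
        for (int i = 1; i <= 3; i++) group.add(new Member(i));
        System.out.println(callOnGroup(group));
    }
}
```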

3. OO DISTRIBUTED FINITE VOLUME SOLVER

Electromagnetic wave propagation is modelled by Maxwell's equations.
Recently, [Piperno et al., 2003] have proposed a new finite volume scheme for
solving the three-dimensional time domain Maxwell's equations on irregular
meshes. A cell-centered formulation is adopted, meaning that the discrete
unknowns are the averages over a tetrahedron of the components of the
electric and magnetic fields. This scheme combines a centered numerical flux
function, for the calculation of the flux balance at a control volume
boundary, with an explicit leap-frog scheme for time integration of the
semi-discrete equations. It is proven to conserve a discrete electromagnetic
energy, which is a quadratic form of the unknowns (the electric and magnetic
vectors), subject to a CFL-like condition, thus yielding a stability
criterion for the overall scheme. In practice, this scheme has been
implemented using unstructured tetrahedral meshes; however, it is
potentially applicable to general, hybrid meshes, combining hexahedral,
prismatic and tetrahedral elements. Higher-order formulations can be
designed as well [Piperno et al., 2002]. This variety of situations naturally
motivates the specification of a general object-oriented framework that
would facilitate the development of various numerical simulation tools. In
the first part of this section, we describe the main characteristics of this
object-oriented model. The second part of this section is concerned with the
design of a distributed version of Jem3D using the concepts of the ProActive
library.
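The link between leap-frog time stepping, a conserved quadratic energy and a CFL-like condition can be seen on a toy problem. The sketch below is not the Jem3D scheme: it integrates the two-unknown system dE/dt = wH, dH/dt = -wE, with E stored at integer time levels and H at half levels. For this toy system the quantity E_n^2 + H_(n-1/2) H_(n+1/2) is conserved exactly, and it is a positive quadratic form only while w dt < 2. All names are hypothetical.

```java
class LeapFrogEnergySketch {
    // Runs `steps` leap-frog steps and returns the relative drift of the
    // discrete energy between the first and the last time level.
    static double energyDrift(double omega, double dt, int steps) {
        double a = omega * dt;   // must stay below 2 (the CFL-like condition)
        double e = 1.0;          // E at time level n
        double hPrev = 0.1;      // H at time level n - 1/2
        double first = Double.NaN;
        double last = Double.NaN;
        for (int n = 0; n < steps; n++) {
            double hNext = hPrev - a * e;           // H at level n + 1/2
            double energy = e * e + hPrev * hNext;  // discrete energy at level n
            if (n == 0) first = energy;
            last = energy;
            e = e + a * hNext;                      // E at level n + 1
            hPrev = hNext;
        }
        return Math.abs(last - first) / Math.abs(first);
    }

    public static void main(String[] args) {
        System.out.println(energyDrift(1.0, 0.5, 1000));
    }
}
```

With w dt = 0.5 the drift stays at round-off level; the energy is conserved by construction rather than by taking small time steps.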

3.1 Basic Architecture of the OO Model

The proposed OO model essentially consists of two types of classes:
classes that are concerned with the definition of the basic geometrical
entities and classes that are related to the application, i.e. classes for the
basic numerical kernels and classes linked to the physical context under
consideration (electromagnetic wave propagation in the present case).
The numerical methods at the heart of this study assume that the
computational domain is triangulated. The underlying, finite element mesh
can be totally unstructured, allowing local refinements in regions where the
geometry and/or the physical problem under consideration exhibit
complicated features. The common situation is such that only one type of
element is considered for the definition of a given mesh. However, in the
general case, the computational domain can combine several types of
elements (tetrahedron, prism, hexahedron, etc.). Thus, a first series of classes
is concerned with the definition of the basic geometrical entities that are
encountered when manipulating such a mesh. To do so, one essentially
needs two basic geometric entities: the vertex and the element. The element
is used to connect a number of vertices and a mesh is defined by filling the
computational domain with elements. These two geometric entities are
included in our object-oriented model through the definition of several
classes: Vertex2D and Vertex3D (which extends Vertex2D) are simple
concrete classes for the definition of a vertex in 2D and 3D; Element,
Element2D and Element3D are abstract classes for the definition of an
element in 2D and 3D (see Fig. 11-2).
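The geometric part of this hierarchy might be sketched in Java as follows. The class names come from the text and Fig. 11-2; the fields, constructors and the vertexCount and volume methods are assumptions made for illustration, not the actual Jem3D signatures.

```java
class Vertex2D {
    final double x, y;
    Vertex2D(double x, double y) { this.x = x; this.y = y; }
}

class Vertex3D extends Vertex2D {
    final double z;
    Vertex3D(double x, double y, double z) { super(x, y); this.z = z; }
}

abstract class Element {
    abstract int vertexCount();   // hypothetical method
}

abstract class Element3D extends Element {
    abstract double volume();     // hypothetical method
}

class TetrahedronElt extends Element3D {
    private final Vertex3D[] v;

    TetrahedronElt(Vertex3D a, Vertex3D b, Vertex3D c, Vertex3D d) {
        v = new Vertex3D[] {a, b, c, d};
    }

    int vertexCount() { return 4; }

    // Volume = |det(v1 - v0, v2 - v0, v3 - v0)| / 6.
    double volume() {
        double ax = v[1].x - v[0].x, ay = v[1].y - v[0].y, az = v[1].z - v[0].z;
        double bx = v[2].x - v[0].x, by = v[2].y - v[0].y, bz = v[2].z - v[0].z;
        double cx = v[3].x - v[0].x, cy = v[3].y - v[0].y, cz = v[3].z - v[0].z;
        double det = ax * (by * cz - bz * cy)
                   - ay * (bx * cz - bz * cx)
                   + az * (bx * cy - by * cx);
        return Math.abs(det) / 6.0;
    }
}
```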
Finite volume solvers such as the ones described in [Piperno et al.,
2002] rely on the definition of a control volume. Depending on the adopted
formulation (i.e. element-centred or vertex-centred) this control volume can
be an element of the primal mesh (e.g. a tetrahedron) or a geometrical entity
built from the set of elements attached to a vertex (see [Lanteri, 1996] for a
concrete example in the context of a compressible flow solver). In the latter
case, the resulting control volume (also referred to as a cell) implicitly
defines an alternative discretization of the computational domain referred to
as the dual mesh. Whatever the form of the control volume, the resulting finite
volume solvers involve the evaluation of a flux balance at a control volume
boundary. In practice a flux balance results from the assembly of elementary
numerical fluxes computed between neighbouring control volumes sharing a
facet. Different types of facets can be manipulated depending on their
location (internal or boundary facet) and the type of control volume (vertex-
centred or element-centred). This calls for the definition of a hierarchy of
classes represented in Fig. 11-3.

Element

Element2D Element3D

TriangleElt QuadrangleElt TetrahedronElt HexahedronElt

ControlVolume

ControlVolume2D ControlVolume3D

TriangleCv QuadrangleCv TetrahedronCv HexahedronCv

Figure 11-2. Definition of an element and a control volume in 2D and 3D.

Facet

VtxCenteredFacet EltCenteredFacet

VtxCenteredFacet2D VtxCenteredFacet3D EltCenteredFacet2D EltCenteredFacet3D

TriangleFacet QuadrangleFacet

BorderFacet InternalFacet

VirtualBorderFacet MetalBorderFacet InfBorderFacet

Figure 11-3. Definition of a facet.

This model has been used to develop a sequential, object-oriented
version of an existing Fortran 77 code implementing the finite volume
method introduced in [Piperno et al., 2002]. This finite volume solver relies
on an element-centred formulation where the control volume is taken to be a
tetrahedron. The resulting object-oriented, time domain, finite volume solver
has been named JEM3D. The programming of JEM3D fully relies on Java.
The overall skeleton of the JEM3D solver is shown in Fig. 11-4.

Tetrahedral mesh (vertices and element connectivity)
Setting of simulation parameters

Geometry:
    Construction of auxiliary connectivity tables
    Construction of the lists of faces (internal and boundary faces)
    Calculation of auxiliary quantities (volumes of tetrahedra,
    components of the normal vectors to faces, ...)

Problem initialization

Time stepping loop:
    Calculation of the flux balance for the magnetic field and
    update of the electric field
    Calculation of the flux balance for the electric field and
    update of the magnetic field
    Calculation of the discrete electromagnetic energy

Stopping test:
    t < tmax --> back to the time stepping loop
    t = tmax --> Solution saving and statistics
Figure 11-4. Overall skeleton of JEM3D.

3.2 Distribution and Parallelization

This section explains how, using active objects, asynchronous
point-to-point and group communications, the sequential version of JEM3D can
be distributed on a set of machines. Fig. 11-5 shows elements of the
architecture of the sequential version of JEM3D. All facets, whatever their
real type (internal or boundary), are grouped in an ArrayList of facets; all
control volumes (CVs) are grouped into an ArrayList. As each internal
facet belongs to two CVs, one can for instance see in Fig. 11-5 the
corresponding two references (from a facet to two CVs). After the
initialization phase, the main loop repeatedly executes the three phases
presented in Fig. 11-4, by going over the ArrayList of facets. The three
phases read or update some values (i.e. the components of the electric and
magnetic fields) of the corresponding CVs.
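The facet-driven loop can be sketched as follows. This is a deliberately simplified illustration, not Jem3D code: each control volume carries a single scalar instead of the field components, the centered flux and the boundary treatment are placeholders, and the class and method names other than ArrayList are assumptions.

```java
import java.util.ArrayList;
import java.util.List;

class FacetLoopSketch {
    static class ControlVolume {
        final double field;     // stand-in for the field components
        double fluxBalance;     // accumulated by the facets
        ControlVolume(double field) { this.field = field; }
    }

    static abstract class Facet {
        abstract void addFlux();
    }

    static class InternalFacet extends Facet {
        final ControlVolume left, right;   // an internal facet references two CVs
        InternalFacet(ControlVolume left, ControlVolume right) {
            this.left = left;
            this.right = right;
        }
        void addFlux() {
            double flux = 0.5 * (left.field + right.field); // placeholder centered flux
            left.fluxBalance -= flux;
            right.fluxBalance += flux;
        }
    }

    static class BorderFacet extends Facet {
        final ControlVolume inside;        // a border facet references one CV
        BorderFacet(ControlVolume inside) { this.inside = inside; }
        void addFlux() {
            inside.fluxBalance -= inside.field;             // placeholder boundary condition
        }
    }

    static double[] run() {
        ControlVolume[] cvs = {
            new ControlVolume(1), new ControlVolume(2), new ControlVolume(3)
        };
        List<Facet> facets = new ArrayList<>();
        facets.add(new BorderFacet(cvs[0]));
        facets.add(new InternalFacet(cvs[0], cvs[1]));
        facets.add(new InternalFacet(cvs[1], cvs[2]));
        for (Facet f : facets) {
            f.addFlux();   // dynamic binding selects the facet type
        }
        return new double[] {cvs[0].fluxBalance, cvs[1].fluxBalance, cvs[2].fluxBalance};
    }
}
```

The loop never inspects the concrete type of a facet, so a further facet subclass can be added without changing it.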
Fig. 11-6 shows the architecture of the distributed version of JEM3D.
The underlying idea for the parallelization is to apply a standard and natural
geometric decomposition of the 3D computational domain into sub-domains.
As such, some facets will contribute to CVs that may be located in
neighbouring sub-domains. We introduce the VirtualBorderFacet (VBF) to
represent those facets that belong to two sub-domains. In a pair of
neighbouring sub-domains, both have a reference to a VBF designating the
shared facet. Each VBF contributes to the computation. Two neighbouring
VBFs which are copies of the same facet must exchange and combine their
physical values (i.e. the components of the electric and magnetic fields) to
compute the associated numerical flux. For the update access, it is the
sub-domain's responsibility to trigger a remote method call onto the
corresponding sub-domain, implemented as an active object, which itself
sets values in the twin VBF.

Domain
    List of facets:          Border facet, Internal facet, Border facet
    List of Control Volumes: Control Volume, Control Volume

Figure 11-5. Elements of the architecture of the sequential version of JEM3D.

The object-oriented approach brings a specific advantage: sequential
references to some data-structures (e.g. facets and CVs) can be turned into
remote references in a transparent manner for the code using them. The
partitioning first occurs on facets: each one is assigned to a unique sub-
domain. As a consequence, some CVs will be shared by two sub-domains. A
shared CV is referenced by facets belonging to different sub-domains. Of
course, specific programming techniques have to be used in order to read
and update shared CVs.
Thanks to polymorphism and dynamic binding, there is no need to
explicitly deal with the effective real types of facets: internal or virtual. As a
result, the CVs that reference virtual border facets, as well as the loop that
uses them, can execute unchanged.
The architecture of the distributed version of JEM3D features a totally
decentralized approach. The application is fully peer-to-peer: each sub-
domain communicates with the others without any centralized supervisor. As
centralized points are usually bottlenecks due to overload problems, we
achieve better scalability.

Subdomain 1 (Active Object)
    List of facets:          Border facet, Internal facet, Virtual border facet
    List of Control Volumes: Control Volume, Control Volume

Subdomain 2 (Active Object)
    List of facets:          Virtual border facet, Internal facet, Border facet
    List of Control Volumes: Control Volume, Control Volume

The same facet is duplicated as a virtual border facet in both sub-domains.
Each sub-domain has a list of addresses of the remote copies (twins) of its
virtual border facets.

Figure 11-6. Architecture of the distributed version of JEM3D.

4. BENCHMARKS

A standard test case for which an exact solution of Maxwell's
equations exists is the simulation of the propagation of an
eigenmode in a cubic metallic cavity. For this test case, the underlying
tetrahedral mesh is automatically built by first defining a Cartesian grid
discretization of the cube and then dividing each element of this grid into
six tetrahedra. In this section we measure the time taken by our
application to perform 100 iterations of the main loop (Fig. 11-4) using
various mesh sizes.
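The text does not say which six-tetrahedra split is used. One common choice, assumed here purely for illustration, is the Kuhn decomposition, in which each ordering of the three axes gives one tetrahedron along the main diagonal of the cube. The sketch also counts tetrahedra under the further assumption that a mesh size such as 21×21×21 denotes grid points per direction. All names are hypothetical.

```java
class CubeSplitSketch {
    // All orderings of the three axes; each one yields one tetrahedron.
    static final int[][] PERMS = {
        {0, 1, 2}, {0, 2, 1}, {1, 0, 2}, {1, 2, 0}, {2, 0, 1}, {2, 1, 0}
    };

    // Returns the six tetrahedra of the unit cube, each as 4 integer vertices.
    static int[][][] splitUnitCube() {
        int[][][] tets = new int[6][4][];
        for (int t = 0; t < 6; t++) {
            int[] p = new int[3];            // start at corner (0,0,0)
            tets[t][0] = p.clone();
            for (int k = 0; k < 3; k++) {
                p[PERMS[t][k]] = 1;          // step along the next axis
                tets[t][k + 1] = p.clone();  // last vertex is always (1,1,1)
            }
        }
        return tets;
    }

    // Volume = |det(v1 - v0, v2 - v0, v3 - v0)| / 6.
    static double volume(int[][] t) {
        int ax = t[1][0] - t[0][0], ay = t[1][1] - t[0][1], az = t[1][2] - t[0][2];
        int bx = t[2][0] - t[0][0], by = t[2][1] - t[0][1], bz = t[2][2] - t[0][2];
        int cx = t[3][0] - t[0][0], cy = t[3][1] - t[0][1], cz = t[3][2] - t[0][2];
        int det = ax * (by * cz - bz * cy) - ay * (bx * cz - bz * cx) + az * (bx * cy - by * cx);
        return Math.abs(det) / 6.0;
    }

    // Number of tetrahedra if the grid has the given number of points per direction.
    static long tetrahedraCount(int pointsPerDirection) {
        long cells = pointsPerDirection - 1;
        return 6L * cells * cells * cells;
    }
}
```

Under these assumptions the six tetrahedra have equal volume and fill the cube exactly.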

4.1 Comparison with a Fortran Implementation

The algorithm used in Jem3D was also implemented in a Fortran/MPI
program. The aim of the comparison is not to compare the relative speed of
Java and Fortran but to give some insight into the cost of the Java features.
Although the algorithms are the same in both programs, the implementations
are very different. Among other things, the Fortran implementation uses
fixed-size data structures set at compile time whereas the Java version uses
fully dynamic ones. Also, Jem3D has been designed to be remotely
accessible in order to visualize the computation in real time. The
experimental test bed was a cluster of dual Pentium III machines running at
933 MHz with 512MB of memory, linked through a 100 Mb/s switched
network, located in Sophia Antipolis, France.

Table 11-1. Comparison of the Java and Fortran/MPI implementations.

Mesh size   Java time  Fortran time  Java memory  Fortran memory  Time ratio  Memory ratio
21×21×21    45s        18.9s         78MB         59MB            2.38        1.32
31×31×31    150s       65s           224MB        164MB           2.30        1.36
41×41×41    387s       156s          483MB        366MB           2.48        1.31

Table 11-1 displays the execution time and memory usage for several
mesh sizes on a single node, i.e. without network communications. As we can
see, the ratio between the Java and Fortran/MPI versions is below 2.5 for the
execution time and below 1.4 for the memory usage. Experiments reported in
[Bull et al., 2001] show that it is possible to achieve a time ratio between
1.4 and 2.7 for a broad range of applications.

4.2 Grid5000 Experiments

One of the benefits of Java is that it is platform-agnostic, which makes it
easier to run on a Grid. We have thus conducted experiments on Grid5000, a
French grid currently under development. It now has around 1000 processors
available (PowerPC, Xeon and Opteron), distributed over 8 geographical
sites and linked through 1Gb/s connections. Each site is responsible for the
administration of its local nodes and the only guarantee given is that they
should be running under Linux. The versions of the kernel, the glibc or gcc
are not specified and could be different from one node to another.

Table 11-2. Jem3D on Grid5000.

Mesh size      Processors  Execution time
101×101×101    64          114s
               128         71s
               256         47s
               404         106s
201×201×201    64          639s
               128         372s
               256         206s
               404         464s

The aim of this experiment was to run Jem3D on as many
processors as possible to verify its scalability. As shown in Table 11-2, we
managed to obtain 404 processors, although most of the time more were
requested, as we will see in section 5.1. For the 101×101×101 (resp.
201×201×201) mesh we achieve an 80% (resp. 85%) efficiency when increasing
the number of processors from 64 to 128. However, the mesh sizes chosen in our
experiments proved to be too small for the maximum number of processors,
as the execution time actually increases when using more than 256 nodes.
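The quoted efficiencies can be checked from Table 11-2 if one assumes the usual relative efficiency p1 T(p1) / (p2 T(p2)) between two processor counts p1 < p2; the text does not state the definition, so this is an assumption. The class and method names below are hypothetical.

```java
class EfficiencySketch {
    // Relative parallel efficiency when going from p1 to p2 processors.
    static double relativeEfficiency(int p1, double t1, int p2, double t2) {
        return (p1 * t1) / (p2 * t2);
    }

    public static void main(String[] args) {
        // Execution times in seconds, taken from Table 11-2.
        System.out.printf("101^3, 64 -> 128:  %.3f%n", relativeEfficiency(64, 114, 128, 71));
        System.out.printf("201^3, 64 -> 128:  %.3f%n", relativeEfficiency(64, 639, 128, 372));
        System.out.printf("101^3, 256 -> 404: %.3f%n", relativeEfficiency(256, 47, 404, 106));
    }
}
```

With this definition the first two values come out at roughly 0.80 and 0.86, in line with the figures quoted above, while the step from 256 to 404 processors falls below 0.3 for the smaller mesh.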

5. ON-GOING AND FUTURE WORK

5.1 Application Controlled Deployment

When running on a Grid the main issues which an application faces are
the heterogeneity of resources and their availability. While the former is
handled smoothly by Java, the latter is more difficult to handle. In most Grid
schedulers, nodes are tested periodically using simple commands like ping
and removed from the pool of available resources if they fail the test. This
should prevent an application from getting unresponsive nodes. However,
this is not foolproof as many possible scenarios can happen. As an example,
a node can fail after being allocated to a user, or a seemingly working one
can have services down like ssh or nfs, preventing any application from
running. Also, it is not uncommon to have rogue processes, remains of
previous jobs which did not terminate properly, hogging resources like
sockets or CPUs. All this boils down to a simple remark: it is not possible to
predict the number and quality of the resources actually available from the
request issued to the grid scheduler. We believe that, until some quality of
service is guaranteed on the Grid, a simple solution is to let the
application itself take the necessary steps to ensure that it can run. Using
ProActive, this can be done through Virtual Nodes. Using the provided API, the
application can obtain details such as the number of physical JVMs started,
or their addresses. One guarantee of the deployment process, as implemented,
is that the virtual nodes contain only functional JVMs, i.e. JVMs able to
start a ProActive application and open a connection to the caller site. This
means that one has high confidence that the resources are working,
although there is no guarantee that they will not fail. Using this information,
the application can decide how to use these resources. A simple
strategy could be to adapt the size of the computation to the available nodes
or to request new resources before starting.
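Such a decision step might look like the following sketch. It assumes the number of functional JVMs has already been obtained from the virtual node; no ProActive call is shown, and the class, the method and the convention of returning -1 to ask for more resources are all hypothetical.

```java
class DeploymentCheckSketch {
    // Returns the number of sub-domains to use, or -1 when too few functional
    // JVMs are available and new resources should be requested first.
    static int chooseSubdomainCount(int requested, int functional, int minimum) {
        if (functional < minimum) {
            return -1;                           // request new resources before starting
        }
        return Math.min(requested, functional);  // adapt the computation to what is there
    }

    public static void main(String[] args) {
        System.out.println(chooseSubdomainCount(512, 404, 64));
    }
}
```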

5.2 Enhancing Modifiability Through Components

Recent work has begun to investigate how the architecture of Jem3D can
be evolved to support enhanced modifiability. Enhanced modifiability
would, for example, enable selecting and deploying different solver variants
corresponding to different instances of the general model described in
section 3. It would also enable the flexible combination of solvers with
various forms of steering and visualization functionality both statically and
dynamically. Dynamic modification, in particular, would also be useful for
accommodating dynamic variations in the underlying Grid resources (e.g.
network bandwidth and machines). For instance, if a participating machine
crashes during execution, the application could be dynamically reconfigured
to restore its previous configuration. As another example, consider the data
collector object used to periodically receive computed solutions from all
sub-domains. If the load imposed on the collector machine becomes
excessive, one could dynamically reconfigure the application to employ a
hierarchical structure of collectors exhibiting better scalability.
We are addressing the modifiability challenge by means of a component-
based development approach, which has emerged as a principled and
effective way to build flexible systems. Following this approach, we are
restructuring Jem3D towards a component-based implementation. The
adopted component model is a parallel and distributed model that
specifically targets Grid applications [Baude et al., 2003]. This Grid
component model conforms to the generic Fractal model [Bruneton et al., 2002]
and extends it with a number of features based on ProActive. Fractal and the
ProActive-based extensions are briefly examined next.
Fractal components are runtime entities that communicate exclusively
through interfaces of two types: client interfaces that emit operation
invocations and server interfaces that accept them. Interfaces are connected
through communication paths, called bindings. An important feature of
Fractal is its support for hierarchical composition; that is, for recursively
assembling components into more complex, composite components. Another
key feature is its support for extensible reflective facilities; each component
is associated with an extensible set of controllers that enable inspecting and
reconfiguring internal features of the component (e.g. its sub-components).
Fractal also includes an architecture description language (ADL) for
specifying configurations in terms of components, their composition
relationships, and their bindings.
The Grid component model extends Fractal in the following ways. First,
components contain one or more active objects and can be distributed over
different machines. Second, the model provides a specialization of interfaces
which enable multicast communication based on the ProActive group
mechanism. Finally, the model explicitly supports distributed deployment of
components based on the ProActive virtual node abstraction.
A first version of the component-based Jem3D has been produced based
on a coarse-grained partitioning into the following components: steering and
visualisation agents, the data collector, the sub-domains, and a composite
that encapsulates the sub-domains (see Fig. 11-7). This initial version has no
explicit support for dynamic modification. Nevertheless, the
componentization has already proven to be beneficial for the following two
reasons. First, componentization has made explicit the main functional units
and communication paths, thus making the system easier to understand, and
revealing several opportunities for design improvement. Second, the use of
Fractal client interfaces and bindings has removed implementation
dependencies from the original Jem3D classes, making them reusable in
different contexts and facilitating their replacement. For example, replacing
the sub-domain or collector implementations can be performed declaratively,
using the ADL, without code modifications. Support for dynamic
modification will be added in a future version, taking advantage of the
flexibility engendered by the component structure. Specifically, adding such
support will involve building on Fractal's reflective facilities, without
requiring any changes to existing components.
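The benefit of client interfaces and bindings can be illustrated in plain Java. The sketch below does not use the Fractal API, and in Fractal the binding would be declared in the ADL rather than made by a method call; every name in it is hypothetical.

```java
// Server interface: what a collector accepts.
interface SolutionCollector {
    void collect(int subdomainId, double value);
}

// A sub-domain only knows the interface; the implementation is bound from outside.
class SubdomainSketch {
    private final int id;
    private SolutionCollector collector;   // client interface, initially unbound

    SubdomainSketch(int id) { this.id = id; }

    void bind(SolutionCollector c) { this.collector = c; }

    void publish(double value) { collector.collect(id, value); }
}

// Two interchangeable collector implementations.
class SummingCollector implements SolutionCollector {
    double total;
    public void collect(int subdomainId, double value) { total += value; }
}

class CountingCollector implements SolutionCollector {
    int calls;
    public void collect(int subdomainId, double value) { calls++; }
}
```

Swapping SummingCollector for CountingCollector requires a different binding only; the SubdomainSketch class is left untouched, which is the kind of replacement the componentized design aims to make declarative.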

Steering/Visualisation Agent 1    Steering/Visualisation Agent 2

Data Collector

Composite: Sub-domain 1, Sub-domain 2, Sub-domain 3, Sub-domain 4

Figure 11-7. Component Architecture of Jem3D.



6. CONCLUSIONS

In this work we have presented Jem3D, an object-oriented time domain
finite volume solver, on unstructured meshes, for the 3D Maxwell's
equations modelling the propagation of electromagnetic waves. Using
ProActive for distribution and communication, it can run on heterogeneous
and dynamic environments like Grids. Our experiments show that the Java
implementation is indeed slower than a Fortran one, but that it can easily
and rapidly be deployed on a large number of processors. The experiments
conducted on Grid5000 have shown that having resources allocated by a
scheduler does not guarantee that they will be fully functional. A solution to
this problem could be to let the application delay its configuration until
after the deployment. Finally, we briefly described on-going work involving
reengineering Jem3D towards a component-based implementation. Our
initial experience has shown that componentisation is a highly promising
basis for building programmable and modifiable Grid applications, such as
Jem3D. However, further work is required to add support for dynamic
reconfiguration and to investigate the impact of componentisation on system
performance.

References

Baduel, L., et al., 2002, Efficient, flexible, and typed group communications in Java, in:
Proceedings of the Joint ACM Java Grande ISCOPE Conference, ACM Press, 28-36.
Baude, F., et al., 2002, Interactive and descriptor-based deployment of object-oriented grid
applications, in: Proceedings of the 11th IEEE International Symposium on High
Performance Distributed Computing (HPDC-11), 93-120.
Baude, F., et al., 2003, From distributed objects to hierarchical grid components, in:
Proceedings of the International Symposium on Distributed Objects and Applications
(DOA03), Lecture Notes in Computer Science 2888, 1226-1242.
Bruneton, E., et al., 2002, Recursive and dynamic software composition with sharing, in:
Proceedings of the 7th ECOOP International Workshop on Component-Oriented
Programming (WCOP 02).
Bull, J. M., et al., 2001, Benchmarking Java against C and Fortran for scientific applications,
Java Grande.
Caromel, D., 1993, Towards a method of object-oriented concurrent programming,
Communications of the ACM, 36(9), 90-102.
Lanteri, S., 1996, Parallel solutions of compressible flows using overlapping and non-
overlapping mesh partitioning strategies, Parallel Comput., 22, 943-968.
Piperno, S., et al., 2002, A nondiffusive finite volume scheme for the three-dimensional
Maxwell's equations on unstructured meshes, SIAM J. Num. Anal., 39(6), 2089-2108.
Piperno, S. and Fezoui, L., 2003, A centered discontinuous Galerkin finite volume scheme for
the 3D heterogeneous Maxwell equations on unstructured meshes, INRIA Research
Report No. 4733.

Bibliography

Szyperski, C., Component Software: Beyond Object-Oriented Programming, Second
Edition (Addison-Wesley and ACM Press, 2002), ISBN 0-201-74572-0.
Chapter 12
SOFTWARE AGENTS FOR PARAMETRIC
COMPUTATIONAL ELECTROMAGNETICS
APPLICATIONS

D. G. Lymperopoulos, I. E. Foukarakis, A. I. Kostaridis, C. G. Biniaris
and D. I. Kaklamani
School of Electrical and Computer Engineering, National Technical University of Athens

Abstract: This chapter elaborates on the application of novel networking software
technologies in distributed parallel CEM computing. The presented platforms
focus on solving demanding, parametric CEM problems by employing modern
programming techniques and network interface software libraries. The Web
Services programming model based on SOAP/XML and the Mobile Agent
Technology (MAT) have been utilised in the design of platform-independent
CEM modelling for concurrent processing of multiple application inputs. In
addition, as an extension to distributed parametric simulations, this chapter
introduces Genetic Software Agents, which are mobile agent entities with the
ability to carry out Genetic Search Optimisations in a collaborative scheme.
The performance of the platforms is evaluated on tests performed on a local
network of workstations and conclusions about the employment of such novel
network technologies in CEM are presented.

Key words: Computational Electromagnetics; Parametric Problems; Web Services; Mobile
Agent Technology.

L. Tarricone and A. Esposito (eds.), Advances in Information Technologies for Electromagnetics, 345-379.
© 2006 Springer. Printed in the Netherlands.

1. INTRODUCTION

The notorious computational demands of Computational Electromagnetics
(CEM) applications have formed a strong drive towards
parallelisation schemes. In many cases the parallelisation scenarios raise
significant communication needs during algorithm execution. Distributed
computing libraries such as PVM and MPI have been successfully employed
for carrying out this kind of task, at the cost of design and programming
complexity. However, CEM developers often confront the problem of
distributing much simpler parallel applications, known as parametric or
embarrassingly parallel, which do not require data transfer during
execution. This kind of program involves execution of the same
code with different input parameters. Some typical examples include
frequency sweeps for antenna characterisation, Method-of-Moments
modelling with various basis functions or design of optimal antenna
geometries.
Since parametric problems are the simplest kind of parallel distributed
applications, there is no need to face the complexity and restrictions of
traditional network computing libraries for developing an execution
framework. Moreover, novel network technologies greatly simplify
the deployment of highly scalable distributed systems at relatively small
bandwidth and computing cost. This chapter discusses various ways of
implementing advanced network applications, based on high-level
infrastructures for the execution of parametric CEM applications.
More specifically, the core technology introduced in the implemented
environments is the object-oriented Multi-Agent System. Parametric
simulations are viewed as the task of a mobile Agent, capable of moving
to the location where problem input data is situated, calling the CEM
simulation routines and collecting the results for visualisation or further
processing.
This paradigm can be integrated with the service-oriented,
platform-independent architecture of Web Services and the SOAP/XML protocol. The
latter has served as a core component of the Globus Grid toolkit and
introduces a standard method for accessing network resources and services,
via well-defined interfaces and messages.
In general, the most important advantages of the proposed architectures
are platform and network independence. The infrastructure provided by
Agent Management Systems and Web Services enables the interconnection
of heterogeneous nodes in a transparent way, supports dynamic node
insertion and deletion, and provides simple methods for implementing fault-
tolerant systems, in contrast with traditional function-oriented libraries.
The following sections elaborate on implementations of Web Services
and Multi-Agent Systems for the parallel distributed execution of parametric
CEM applications. Section 2 presents different classes of CEM parametric
problems. Section 3 analyses the architecture of Mobile Agent Technology
and its uses in CEM. Section 4 consists of a collection of agent platforms
that have been used in CEM modelling and their performance in various,
usually heterogeneous environments. Finally, Section 5 introduces Genetic
Software Agents for CEM applications as an extension to parametric
processing, along with a proposed architecture for implementing genetic
search operators with agents in such optimisation problems.

2. CLASSIFICATION OF PARAMETRIC
PROBLEMS IN CEM

There are numerous ways to parallelise complex CEM problems.
However, the method to attack them always depends on the characteristics of
the executed algorithm. The simplest case involves execution of the same
algorithm for different input parameters. Problems that can be solved in such
a way are called parametric and appear often in CEM applications.
Parametric problems exhibit attractive properties concerning performance
metrics: since the individual runs do not communicate with one another, they
scale up to the number of parameter values and achieve near-optimal speed-up.
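The independence of the runs can be illustrated with a minimal sketch. It assumes a hypothetical `simulate` function standing in for one CEM run and uses local threads in place of remote nodes; neither is part of the platforms described in this chapter.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def simulate(frequency_hz):
    """Hypothetical stand-in for one CEM run: returns a toy response value."""
    return 1.0 / math.sqrt(1.0 + (frequency_hz / 1.0e9) ** 2)


def parametric_sweep(run, parameters, max_workers=4):
    """Run the same routine once per parameter value.

    No run needs data from any other run, so they can execute concurrently.
    pool.map returns the results in the order of the input parameters.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run, parameters))


frequencies = [0.5e9, 1.0e9, 2.0e9, 4.0e9]
responses = parametric_sweep(simulate, frequencies)
```

Because the only shared step is the final collection of results, replacing the thread pool with remote workers would not change the structure of the sweep.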
There are many categories of numerical modelling problems in CEM, the
solution of which can benefit from parametric analysis. Conceptually,
parametric analysis in these categories of problems can be applied both at
method-level (e.g. an optimisation procedure, when applying different
discretisation schemes to test the convergence of a given numerical
technique) and application-level (e.g. frequency scanning or altering the
geometrical parameters of the modelled structure). A special case, with
properties similar to application-level problems, concerns stochastic
optimisation schemes and is examined in Section 2.3.

2.1 Method-level Parametric Analysis

The term method-level denotes a simulation configuration that is
specific to the employed numerical technique. A method-level
parametric analysis can therefore be used for accuracy and stability tests of
the employed numerical technique, by changing the basic parameters of the
method itself (e.g. the number or kind of basis functions in the Method of
Moments (MoM), the grid density in the Finite Element Method (FEM), or the
number and/or location of the fictitious unit sources in the Method of
Auxiliary Sources (MAS)).
Method-level analysis thus usually constitutes a tool for evaluating
CEM methods. The focus is on finding the best way to implement specified
CEM models and achieve satisfactory convergence to the solution of the
corresponding physical problems. Because the method parameters change from
run to run, the computational cost and, consequently, the execution time
often vary significantly from one model to the next.

2.2 Application-level Parametric Analysis

An application-level parametric analysis involves altering the
parameters that characterise the modelling problem itself. Some typical
examples follow:
• In antenna modelling, designers often search for the optimal relative
  amplitudes and phases of array element feeds, which achieve a desired
  radiation pattern. Furthermore, using a discrete frequency scanning setup,
  the antenna characteristics can be determined over a wide range of the
  frequency spectrum. Possible applications of array beam-forming range
  from the area of wireless and mobile telecommunications [Kostaridis
  et al., 2004], to biomedical engineering [Atlamazoglou et al., 2002].
• In microwave resonator analysis, multiple executions of the same
  numerical code with a different frequency parameter determine the
  resonance properties of the modelled structure.
• In scattering problems, the Radar Cross Section (RCS) of a complex
  structure can be determined for various angles of incidence of a plane
  wave.
The above-mentioned examples are typical of application-level
parametric analysis. It is easy to deduce that, in most cases, the parametric
runs have similar needs in computational resources, in contrast with
method-level parametric problems. This fact facilitates accurate estimation
of the computing needs of each parametric simulation and, as a result,
simple, efficient task scheduling in the parallel distributed parametric
simulations, described later in this chapter.
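When all runs cost roughly the same, even a static assignment balances the load. The sketch below is only an illustration of that point: the function name, the node names and the round-robin policy are assumptions, not the scheduler used by the platforms of this chapter.

```python
def assign_round_robin(tasks, nodes):
    """Distribute equally expensive tasks over the nodes in turn."""
    schedule = {node: [] for node in nodes}
    for index, task in enumerate(tasks):
        schedule[nodes[index % len(nodes)]].append(task)
    return schedule


# Nine angles of incidence for an RCS computation, shared by three nodes.
angles_deg = list(range(0, 90, 10))
schedule = assign_round_robin(angles_deg, ["node-a", "node-b", "node-c"])
```

For method-level problems, where run times differ widely, such a static split would leave some nodes idle, which is why cost estimates matter there.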

2.3 Population-Based Stochastic Optimisation

An interesting aspect of parametric problems involves considering
population-based stochastic optimisation schemes as a form of parametric
application. In fact, such optimisers (e.g. Genetic Algorithms, Particle Swarm
Optimisation) are often used in CEM and are notorious for their demands in
CPU cycles and memory. However, their execution generally boils down to
loading different input data, derived from the optimiser properties, into the
same CEM code. The example of a Genetic Algorithm (GA) illustrates this:
1. The GA programmer defines the chromosome-coded parameters and
their mapping during application design. In general, the CEM code can
be regarded as a black-box for optimisation purposes.
2. When the GA starts, chromosomes are translated back into input
parameters for the code, which starts execution. Within a generation,
there is no need for network data exchange, since each code run is
self-contained. The optimiser alters only the input data, so the cost
function of each chromosome can be estimated on a different node of a
distributed system without modifying the electromagnetic simulation
code itself.
3. After the CEM code execution the GA framework can collect the results
and evaluate the cost function for all chromosomes. Information
exchange is required purely for GA operations, such as mating and
population decimation.
It is also possible to extend this logic in order to include other
population-based algorithms, such as the Particle Swarm Optimisation
(PSO) method or Evolutionary Strategies.
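The three steps above can be sketched as follows. This is a toy GA written for illustration only: `cem_cost` is a hypothetical black-box stand-in for a CEM run plus cost evaluation, and the encoding, selection and mutation settings are arbitrary assumptions.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def decode(chromosome, lower=-5.0, upper=5.0):
    """Step 1 mapping: a list of bits becomes one real input parameter."""
    value = int("".join(map(str, chromosome)), 2)
    return lower + (upper - lower) * value / (2 ** len(chromosome) - 1)


def cem_cost(x):
    """Hypothetical black box: a CEM run followed by a cost evaluation."""
    return (x - 1.5) ** 2


def run_ga(cost, bits=16, population_size=20, generations=30, seed=0):
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(bits)]
                  for _ in range(population_size)]
    best = None
    with ThreadPoolExecutor(max_workers=4) as pool:
        for _ in range(generations):
            # Step 2: decode and evaluate; the runs are mutually independent.
            inputs = [decode(c) for c in population]
            costs = list(pool.map(cost, inputs))
            # Step 3: collect the results and rank the chromosomes.
            ranked = sorted(zip(costs, population), key=lambda pair: pair[0])
            if best is None or ranked[0][0] < best[0]:
                best = (ranked[0][0], decode(ranked[0][1]))
            # Population decimation: keep the better half.
            survivors = [c for _, c in ranked[: population_size // 2]]
            children = []
            while len(survivors) + len(children) < population_size:
                a, b = rng.sample(survivors, 2)      # mating
                cut = rng.randrange(1, bits)         # one-point crossover
                child = a[:cut] + b[cut:]
                if rng.random() < 0.1:               # occasional mutation
                    child[rng.randrange(bits)] ^= 1
                children.append(child)
            population = survivors + children
    return best


best_cost, best_x = run_ga(cem_cost)
```

Only the ranking, mating and decimation steps need the results of all chromosomes; the cost evaluations themselves could run on separate nodes.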

3. MOBILE SOFTWARE AGENTS

Agent-based computing is a promising paradigm for the implementation
of distributed applications in an open and dynamically changing
environment. Since the most common environment for building agents is
provided by the Java Virtual Machine (JVM), the following paragraphs refer
to Java Agents, without noteworthy loss of generality.
Mobile agents elaborate the concept of mobile code [Fuggetta et al.,
1998], constituting a flexible and dynamic structure able to roam remote
hosts and interact with them locally. Thus, in a Web-integrated agent
environment, mobile agents can be launched from one Web location to
another, performing transactions based on the application logic. This
scenario is especially attractive nowadays with the proliferation of wireless
mobile devices; a prospective user could, for example, launch a special
mobile agent to gather information from the Web or to perform an
e-commerce task, shut down the device and reconnect after some period of
time to collect the results. It is the nature of the mobile agent to migrate
in order to carry out tasks specified by its initiating user.

3.1 The Mobile Agent Paradigm

The Mobile Agent Paradigm extends Remote Evaluation [REV,
White, 1997], which is a well known mobile computation paradigm, mainly
represented by Java Servlets. According to REV, a software component A
sends instructions to a software component B, describing how to perform a
service. The Mobile Agent Paradigm extends REV beyond the transfer of the
code (instructions), since it additionally specifies the mobility of an entire
computational entity, along with its code, its state, and potentially the
resources required to solve the task.

According to this paradigm, a component A has the know-how
capabilities and a processor, but it lacks the resources. Therefore, it delegates
the know-how to a software component B, where the know-how gains access
to the resources and the service is performed. An entity encompassing the
know-how is a mobile agent. It has the ability to migrate autonomously
to a different computing node, where the required resources are available.
Furthermore, it is capable of resuming its execution seamlessly, because it
preserves its execution state.
Consequently, a mobile agent is not bound to the system where it begins
execution. Instead, it has the unique ability to transport itself from one
system to another in a network. This ability permits a mobile agent to move
to a destination system and utilise its services or interact with other objects
that reside in this system. When the agent travels, its state and code are
transported with it, and the agent can determine its tasks exactly, according
to attribute values that it maintains.
The framework and environment, in which an agent resides, lives and
takes action, is provided by an Agent Management System (AMS). The
AMS is a set of tools and programming interfaces that act as an underlying
infrastructure for mobility mechanisms, agent lookup queries and other
execution characteristics.
The mobility of an entire computational entity, on which the Mobile Agent
Paradigm relies, has great advantages over traditional client-server
paradigms and is very important for the development of applications
over network-centric systems. The main benefits are [Chess, 1995; Chess,
1998]:
• High-bandwidth communication can be achieved, due to the proximity of
  the agent to the server.
• Since it is the agent that moves to access data (and not vice versa), the
  transfer of large amounts of data can be avoided, resulting in a drastic
  reduction of network traffic.
• After injecting an agent into the network environment, the user can
  perform other tasks (asynchronous task execution).
• Task processing is less dependent on the availability of the network,
  since the agents can retain their states and postpone their migration until
  the target node is available, or until another node can replace it.
• Increased robustness is achieved due to the autonomous nature of the
  agents. Task processing can recover from client-server failures, since
  the agent lifecycle does not depend on them.
• Agents possess application logic and itineraries that determine which
  tasks they have to perform and where, without requiring any user
  interaction. This automation of distributed task processing is a very
  attractive feature in distributed parallel CEM computing.
• Intelligence can be incorporated in the agents, in the sense of enabling
  the collaboration of peer entities inside the network environment for
  fulfilling a common purpose.
Current Agent Management Systems offer standardised API
functionality, according to the protocols of the Object Management Group
(OMG) Mobile Agent System Interoperability Facility (MASIF) [OMG,
1997] or the Foundation for Intelligent Physical Agents (FIPA). While
OMG-MASIF focuses on code mobility, FIPA proposes a large set of
specifications, oriented towards intelligent agent collaboration and
communication. The full collection of FIPA standards can be found at
http://www.fipa.org/specs/, while a brief comparison of FIPA and MASIF
specifications is given in [Manola, 1998].
Two MASIF-compliant frameworks have been employed for the case
studies of Section 4: the IKV++ Grasshopper [IKV++, 2001], and
WebMages, which is an experimental, lightweight, Web-service oriented
agent platform developed in NTUA labs. The most commonly used
FIPA-compliant system is JADE [CSELT, 2005], also mentioned in this chapter.
The Grasshopper platform is no longer maintained. All API methods
mentioned in the next subsections are part of both specifications, unless
noted otherwise.

3.2 Mobile Agents in CEM: The Master-Worker Model

As discussed in Section 2, parametric problems often occur in CEM,
consuming significant amounts of processing cycles and memory.
Furthermore, since a more accurate solution is also more computationally
demanding, executing the same code sequentially on one computer may take
prohibitively long.
In the numerous scenarios examined in section 5, mobile agents are used
in order to transfer the executable code along with the corresponding input
files to a remote system, execute the code locally and return the results. The
functionality of the mobile agents can be exploited under a general Master-
Worker scheme, which allows distributed processing without application-
specific installations in every participating node.

3.2.1 The Master Agent

According to the Master-Worker model, a Master agent is responsible for
tasks that are associated with global characteristics of the application. It is
initiated at the host where all application resources are present: the agent
classes and the CEM codes, input files or other necessary information. The
tasks a Master agent carries out usually involve:
• Calculation of the CEM method initial settings
• Execution of any special pre-processing stages (these first two sets of
  operations depend greatly upon the CEM method itself)
• Application of the decomposition scheme
• Initialisation of Worker agents with the correct properties/arguments
• Maintenance of platform synchronisation, when required
• System monitoring and management of network resources
Since such an entity does not need to perform migrations, it
can be implemented as an extension of a stationary agent class, a lightweight
object which does not have mobility mechanisms. All the above-mentioned
operations are integrated in the agent live() method, which is automatically
called by the AMS right after initialisation.

3.2.2 The Worker Agent

When initially created, a Worker agent resides in the network location of
its Master. Since a Worker agent must have the ability to migrate to another
system in the network, it has to extend a mobility-enabled agent class,
provided by the AMS API library. The most important method of this class
is the method move(destination location), which allows the agent to migrate
to a remote system defined by the destination location.
The first method called by default for each Worker agent is the
initialisation method (Grasshopper: init(), JADE: setup()). This method
effectively takes the role of the agent's constructor and accepts an input
array of Java Objects, created by the Master agent. This object collection
includes all necessary information for the parametric execution of the CEM
code in the specified location.
It is possible to control the actions of the Worker agent according to its
location via its properties/state. This can be implemented in the
mobility methods beforeMove() and afterMove(), in which the agent state may
be modified to reflect another selection of operations for the new location.
Both methods are automatically invoked by the AMS when the move()
method is called anywhere in the block of the live() method. While the
beforeMove() method is invoked on the current host, the afterMove() method
is called upon arriving at the remote host. This mechanism, in which the
agent decides its task at the new host according to changes in its state, is
known as a weak form of mobility.
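The interplay between Master, Worker and AMS can be mimicked in a few lines. The sketch below is not the Grasshopper or JADE API: the hook names only echo those in the text, the "AMS" is a plain loop, and migration is modelled by returning the destination from live() instead of calling move().

```python
class WorkerAgent:
    """Toy Worker whose state, not a saved program counter, selects the task."""

    def __init__(self, arguments):
        # Role of the initialisation method: unpack the Master's object array.
        self.run, self.parameter, self.home, self.target = arguments
        self.location = self.home
        self.state = "outbound"
        self.result = None
        self.log = []

    def before_move(self):
        self.log.append("leaving " + self.location)

    def after_move(self):
        self.log.append("arrived at " + self.location)

    def live(self):
        """Perform one step and return the next destination (None to stop)."""
        if self.state == "outbound":
            self.state = "compute"
            return self.target
        if self.state == "compute":
            self.result = self.run(self.parameter)
            self.state = "done"
            return self.home
        return None


def ams_run(agent):
    """Toy AMS: restart live() after every migration until the agent stops."""
    destination = agent.live()
    while destination is not None:
        agent.before_move()
        agent.location = destination
        agent.after_move()
        destination = agent.live()


def master_live(run, parameters, nodes, home="master-host"):
    """Toy Master: decompose the sweep, initialise Workers, gather results."""
    workers = []
    for index, parameter in enumerate(parameters):
        worker = WorkerAgent([run, parameter, home, nodes[index % len(nodes)]])
        ams_run(worker)
        workers.append(worker)
    return {worker.parameter: worker.result for worker in workers}


results = master_live(lambda frequency: 2 * frequency, [1, 2, 3],
                      ["node-a", "node-b"])
```

Restarting live() from the top after each move, with the state attribute choosing the branch, is what distinguishes weak mobility from resuming at the exact instruction where the agent left off.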
Regardless of the specific properties of the CEM application, the main
functionality of the Worker agent remains the execution of the code with the
input data provided by the Master. Since the platform runs on top of the
JVM, this can be performed in the following ways:
• Execution of CEM code as a set of Java methods: despite the efforts to
  increase the performance of the JVM in comparison with native
  Fortran, C or even C++ codes, this option remains very costly in
  computational resources, especially when dealing with complex
  arithmetic. An implementation of CEM functions in Java may easily be
  more than two times slower, but it must be noted that platform
  independence is maintained this way.
• Call of native CEM executables using the Java Native Interface (JNI): in
  order to exploit the performance of native applications, it is possible to
  perform native calls via the JNI [Lymperopoulos et al., 2005]. This
  interface allows native program execution from within Java bytecode, at
  the cost, however, of platform independence. Each Worker can serialise
  and transfer an executable file within its state, calling it after arriving at
  the desired destination. Naturally, many versions of this code need to be
  maintained, in order to achieve multi-platform support. A more advanced
  feature of JNI allows Java programs to call native methods from system
  libraries with automatic name resolution (e.g. a compute library name
  automatically resolves to compute.dll for win32 and libcompute.so
  for Unix/Linux). Once more, these libraries need to be pre-installed in
  the operating node.
Although dealing with the problems arising from JNI use may seem cumbersome,
the practical advantages are numerous. Little or no effort is needed to
adopt already existing CEM applications (which are commonly
developed in Fortran or C), only a very small cost in computational
resources is introduced, and there are still ways of dealing with platform
independence, by examining the host environment/OS via Java system
properties.
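The host check can be sketched as below. The helper `native_library_name` is hypothetical and written in Python for brevity; it merely reproduces the naming convention quoted above, with the macOS branch added as an assumption. In Java, the corresponding information is available through `System.getProperty("os.name")` and `System.mapLibraryName`.

```python
import platform


def native_library_name(base_name, os_name=None):
    """Return the conventional shared-library file name for base_name.

    os_name defaults to the host reported by platform.system(), which
    yields strings such as "Windows", "Linux" or "Darwin".
    """
    os_name = os_name or platform.system()
    if os_name == "Windows":
        return base_name + ".dll"
    if os_name == "Darwin":
        return "lib" + base_name + ".dylib"
    return "lib" + base_name + ".so"


# A Worker could pick the matching native build after arriving at a node.
library_for_this_host = native_library_name("compute")
```

A Worker carrying several native builds in its state could use such a lookup to decide which one to invoke on the current node.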
Due to the nature of MAT, it is possible for an agent to migrate in
emergency cases and resume execution at a new location. This fault-
tolerance mechanism allows seamless computing even when experiencing
node failures. Finally, depending on the API provided by AMS libraries, this
core functionality can be implemented by either extending the live() agent
method (Grasshopper) or by creating corresponding behaviours during agent
setup (JADE).

3.3 A Brief Comparison Between MAT and MPI or PVM

Unlike MPI (or PVM), where the support for the development of
distributed applications is offered via calls to libraries that are external to the
processes, in Multi-Agent System implementations the distributed code is
incorporated in the mobile-agent functionality. Thus, the code, along with its
state, is transferred at runtime, eliminating the need for the existence of
pre-compiled code at the remote systems. This characteristic adds flexibility to
the entire distributed environment, since new computing nodes can be added
dynamically during the execution of the distributed application. In addition,
the framework architecture enables the deployment of network applications
with little or no knowledge of CEM coding.
To illustrate how the two approaches differ in the development and
deployment of the same distributed parallel CEM application, the steps
required for the overall setup of a distributed run from scratch are
listed below:

1. MPI setup (MPICH2 paradigm of Gropp et al., 2005a; Gropp et al., 2005b).
• Identification of the interconnected network nodes, which will participate
  in the distributed computation. A common platform (e.g. UNIX-compliant
  or Win32) is required for all nodes since native codes are
  compiled against the corresponding libraries.
• Compilation/Installation of the MPI core libraries in a location accessible
  by every computer (either locally or by a network file system).
• Compilation of the native code against the MPI libraries via its native
  API.
• Deployment of the distributed application for a predefined number of
  nodes, using the communication channel formed by the MPI daemons.

2. Mobile agent setup
• Installation of the Java Runtime Environment (JRE) and the AMS base
  JAR files on every potential node.
• Copying of all CEM application resources to a server node.
• Creation of the Master agent in the server node AMS.
• The deployment of the distributed application is taken care of
  automatically by the Master agent, since it is capable of creating the
  required Worker agents for carrying out the specified tasks.

The PVM library behaves much like the MPICH paradigm. Comparing these
distributed platforms leads to the following conclusions:
• The MPI runs are more efficient and seem ideal for dedicated clusters
  of homogeneous nodes, particularly in UNIX-like environments. The
  efficiency of MPI becomes more evident as the network communication
  increases. The agent platform introduces overheads in data transfers and
  therefore does not appear as an attractive solution for special HPC
  computer farms.
• It is necessary to edit the sources and perform one or more compilations
  of native CEM codes for even a simple parametric MPI application.
  Editing CEM sources is usually a very difficult task, when the
  programmer is not also the original author. On the other hand, no
  compilations are required with agent computing.
• The number of parallel MPI processes is defined right before execution
  and is not dynamically updated.
• The AMS approach introduces an abstraction layer between the
  executable code and its parallelisation in the underlying network
  infrastructure. The programmer does not need to know what is executed
  in each node.
• In both cases, most setup operations can be performed from a single
  machine. However, the initialisation steps required by MPICH are
  significantly more complicated.
In conclusion, MPI implementations naturally
dominate distributed homogeneous environments that are exclusively
dedicated to parallel simulations. However, when hardware and software are
diverse and/or the CEM source code cannot be accessed, the AMS approach
offers a straightforward and simple solution, which comes at very small cost.
In practice, experience gained during the study of numerous parametric
CEM problems has shown that agents allow an easy migration from a
LAN environment to a distributed computation system, while MPI solutions
appear too complex for embarrassingly parallel problems.

4. A WEB-BASED MOBILE AGENT PLATFORM
FOR PARAMETRIC CEM MODELING

This section describes in detail the WebMages agent platform,
developed with the aid of Web services communication methods, following
the MASIF architecture [Kostaridis et al., 2004]. The architectural design is
similar to the Grasshopper AMS [IKV++, 2001], but uses lightweight
components aimed at reducing overheads.
Several simulations have been performed with this system, as well as the
Grasshopper platform. The next subsections elaborate on the test-bed
frameworks implemented for various case studies.

4.1 Mobile Agent Platform Components

The main components of the Web-based mobile agent platform are the
region, agencies, and agents (see Fig. 12-1). The region acts as a directory
service for the other two components. It keeps track of every change that
occurs, and provides the other components with useful information
concerning the available agencies, the location of specific agents, and so on.
For every distributed application, only one region is needed, and every
agency has to register itself with the region in order to be able to exploit its
functionality. This enables agencies to be organised in a domain and
cooperate with each other. In addition, the region exposes some commonly
used methods as region services.

Figure 12-1. The Web-based mobile agent platform components.

The agency is the hosting environment for the agents. It uses a
hierarchical model in order to organise its internal structure. According to
MASIF, every agency has a set of places, where agents can be executed. The
agency can interact with the agents it contains and vice versa. For example,
an agent can contact the agency in order to locate its peers that reside in it.
Furthermore, the agency is able to create a new agent or order a resident
agent to migrate to another location. The agency also exposes some services,
in the same way the region exposes the region services.
The agency should be able to identify the agents. Therefore, each agent is
assigned a unique ID that enables other platform components (e.g. the agency
and the region) and users to identify it and perform operations on it. These
include agent termination, retrieval of the agent's log, requests for
movement to a remote location, as well as collaboration mechanisms
implemented for every agent. The user can access the aforementioned
operations using the management tools.
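The directory role of the region can be illustrated with a small in-memory model. The class and method names below are hypothetical and do not reproduce the WebMages interfaces; places inside an agency are omitted for brevity.

```python
import uuid


class Region:
    """Directory service: knows every agency and where each agent lives."""

    def __init__(self):
        self.agencies = {}
        self.agent_locations = {}

    def register_agency(self, agency):
        self.agencies[agency.name] = agency

    def notify(self, agent_id, agency_name):
        # Called by an agency whenever an agent is created or arrives.
        self.agent_locations[agent_id] = agency_name

    def locate(self, agent_id):
        return self.agent_locations.get(agent_id)


class Agency:
    """Hosting environment for agents; registers itself with the region."""

    def __init__(self, name, region):
        self.name = name
        self.region = region
        self.agents = {}
        region.register_agency(self)

    def create_agent(self, payload):
        agent_id = uuid.uuid4().hex  # unique ID used by all components
        self.agents[agent_id] = payload
        self.region.notify(agent_id, self.name)
        return agent_id

    def migrate(self, agent_id, destination_name):
        destination = self.region.agencies[destination_name]
        destination.agents[agent_id] = self.agents.pop(agent_id)
        self.region.notify(agent_id, destination_name)


region = Region()
first, second = Agency("agency-1", region), Agency("agency-2", region)
worker_id = first.create_agent({"task": "frequency sweep"})
first.migrate(worker_id, "agency-2")
```

Because every creation and migration is reported to the region, any component holding an agent ID can ask the region where that agent currently resides.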

4.2 Communication Mechanisms

Communication among the different parts of the platform is achieved
through a common communication channel. In the current implementation of
the platform this channel is SOAP, where XML messages over HTTP are
exchanged. Whenever a service of one component is to be invoked, a request
containing the necessary information is sent. Then the appropriate method is
called and a response message is sent back.
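A request of this kind might look as follows. The sketch uses the standard SOAP 1.1 envelope namespace, but the service namespace `urn:example:agency`, the operation name `createAgent` and its arguments are invented for illustration, and the HTTP transport is left out.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope
SERVICE_NS = "urn:example:agency"  # hypothetical service namespace


def build_request(method, arguments):
    """Wrap a method name and its arguments in a SOAP envelope."""
    envelope = ET.Element("{%s}Envelope" % SOAP_ENV)
    body = ET.SubElement(envelope, "{%s}Body" % SOAP_ENV)
    call = ET.SubElement(body, "{%s}%s" % (SERVICE_NS, method))
    for name, value in arguments.items():
        ET.SubElement(call, name).text = str(value)
    return ET.tostring(envelope, encoding="unicode")


def parse_request(xml_text):
    """Recover the method name and arguments on the receiving side."""
    envelope = ET.fromstring(xml_text)
    body = envelope.find("{%s}Body" % SOAP_ENV)
    call = body[0]
    method = call.tag.split("}", 1)[1]
    return method, {child.tag: child.text for child in call}


message = build_request("createAgent",
                        {"className": "WorkerAgent", "place": "default"})
method, arguments = parse_request(message)
```

The receiver dispatches on the recovered method name and returns its result in a response envelope built the same way.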
In order to make the communication more efficient and extensible, a
layer of abstraction is added by inserting a set of classes responsible for
handling data transfers. These classes are called mediators, and their main
responsibility is to hide the complexity of the communication layer from the
other components. The mediators are separated into two major categories:
those dedicated to agency-to-agency communication and those dedicated to
agency-to-region communication. The first category involves creation of new
agents, communication between them, and migration mechanisms, while the
second includes agency registration and region notification upon certain
events (e.g. agent creation).
One of the major advantages of mediators is that they can be replaced
easily, enabling the use of a different communication channel (e.g. RMI or
plain sockets can be used instead of SOAP/HTTP). Therefore, future
implementations will support the usage of many different mediators so that
the platform will be able to use multiple communication channels. Another
important advantage is that they are independent of the implementation
of the other components. They are reusable and can be embedded in
applications, in order to provide them with agent interaction capabilities.
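The replaceability of mediators can be sketched with an abstract interface and two interchangeable channels. All names here are hypothetical; a JSON round trip stands in for a real wire format such as SOAP/HTTP, RMI or sockets.

```python
import json
from abc import ABC, abstractmethod


class Mediator(ABC):
    """Hides the transport from the platform components."""

    def __init__(self, handlers):
        # Maps a destination name to a callable(operation, payload).
        self.handlers = handlers

    @abstractmethod
    def send(self, destination, operation, payload):
        """Deliver one request and return the reply."""


class DirectMediator(Mediator):
    """In-process channel: calls the handler directly."""

    def send(self, destination, operation, payload):
        return self.handlers[destination](operation, payload)


class JsonMediator(Mediator):
    """Channel that serialises each message, emulating a wire format."""

    def send(self, destination, operation, payload):
        wire = json.dumps({"operation": operation, "payload": payload})
        request = json.loads(wire)
        reply = self.handlers[destination](request["operation"],
                                           request["payload"])
        return json.loads(json.dumps(reply))


def region_handler(operation, payload):
    if operation == "registerAgency":
        return {"registered": payload["name"]}
    return {"error": "unknown operation"}


def register(mediator, agency_name):
    # The caller never sees which channel carries the request.
    return mediator.send("region", "registerAgency", {"name": agency_name})


replies = [register(channel({"region": region_handler}), "agency-1")
           for channel in (DirectMediator, JsonMediator)]
```

The calling code in `register` is identical for both channels, which is the property that makes swapping the communication layer cheap.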

4.3 Web-Based Infrastructure

The platform features several interfaces that allow user interaction and
communication with the back-end of the system (see Fig. 12-2). The
following paragraphs provide detailed information on the implementations
that serve user requests and provide necessary inputs to the main CEM
application.

Figure 12-2. The Web-based distributed computing framework.

4.3.1 Interaction With the User

In order to access the system's services, the user makes an HTTP request
to the password-protected front-end of the application. On the server side,
the server logic checks the validity of the password and returns a Web page,
prompting the user to select one simulation program from the available
repository. After the user selects a specific code, the server returns a page
containing a fill-in form for the input parameters. Furthermore, since the
user is not expected to be familiar with the specific features of the code, this
page contains useful hyperlinks that point to explanatory pages for assistance
in the correct definition of input parameters.
Once the set of input parameters is sent to the server, the server logic is
responsible for verifying its validity, according to the constraints imposed
by the specified numerical simulation program. If errors are detected,
the server sends the input form back to the user, along with comments that
describe the errors, as well as the way to correct them. When the set of input
parameters is valid, the server informs the user by returning a Web page
prompting the user to enter an email address. After completion of the
simulation, the server sends an email to this address containing
hyperlinks to the results of the simulation.
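The server-side check of the form can be pictured as below. The parameter names and ranges are invented for illustration; the actual constraints depend on the selected simulation program.

```python
def validate_inputs(values, constraints):
    """Return a dict of error comments keyed by parameter name.

    An empty dict means that the submitted form is valid.
    """
    errors = {}
    for name, (lower, upper) in constraints.items():
        if name not in values:
            errors[name] = "missing value"
            continue
        try:
            number = float(values[name])
        except (TypeError, ValueError):
            errors[name] = "not a number"
            continue
        if not lower <= number <= upper:
            errors[name] = "must lie between %g and %g" % (lower, upper)
    return errors


# Hypothetical constraints for a patch-array code.
constraints = {"frequency_ghz": (0.1, 40.0), "patches": (1, 64)}
accepted = validate_inputs({"frequency_ghz": "2.4", "patches": "8"},
                           constraints)
rejected = validate_inputs({"frequency_ghz": "75"}, constraints)
```

The error comments returned for a rejected form are what the server would attach to the re-sent input page.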
A major functionality of the front-end is the ability it gives to the
potential user to visualise the geometry and the results produced by the
simulation. The server processes the input and output data, and creates
the appropriate visualisation files. The geometry and results are presented in
the user's browser using the Virtual Reality Modelling Language (VRML).
VRML has been selected for this purpose, because it specifies a
platform-independent file format for describing three-dimensional interactive
worlds and objects. Interpreters (browsers) for VRML are widely available
for many different platforms, as are authoring tools for the production
of VRML files. These characteristics make VRML the ideal technology for
interactive visualisation of complex geometries over the Web and justify its
selection as the visualisation tool of the developed framework.

4.3.2 Servlets for Front/Back-End Communication

The Java servlets constitute the server side logic and act as an
intermediate level between the front- and back-ends of the framework.
Depending on their functionality, they can be divided into two groups: those
responsible for handling user interaction (connection with the front-end) and
those that interact with the Web-based mobile agent platform (connection
with the back-end of the framework). More specifically, servlets are used in
this platform for:
• Collecting a correct set of user input parameters, by the creation of
  dynamic Web pages
• Setting up the selected simulation process from the server repository,
  according to the user input
• Guiding the user through the simulation setup process
• Creating the input files for the final code execution
• Visualising the code results by generating VRML files
• Interfacing with the back-end mobile agent platform
The last set of servlets acts as a gateway between the user and the agents.
Their main responsibility is to set up the distributed application by accessing
the application programming interface (API) of the platform. In fact, these
servlets act as a representative of the user to the mobile agent platform by
automatically performing the tasks the user would have done manually in
order to exploit its functionality.

4.4 Conformal Array Modelling: A Modified Method of Auxiliary Sources (MMAS) Approach

The first case study presented in this section involves a set of distributed,
mobile agent based parametric simulations for the estimation of the radiation
characteristics of a conformal antenna array. The Web-based AMS used has
been described in the previous subsections. The focus of this section is on
the formulation of the CEM equivalent model and on the simulation setup
and results.

4.4.1 Problem Formulation

A native numerical simulation code has been developed to model
microstrip arrays that are conformally mounted onto a cylindrical surface
using the Modified Method of Auxiliary Sources (MMAS) [Shubitidze et al.,
1999].
The MMAS improves the performance of pure MAS in three-dimensional
problems with thin layers. While MAS approximates scattered
waves with a finite number of fictitious, discrete current sources located on
auxiliary surfaces, the MMAS employs discrete current densities and point
charges instead. These modified auxiliary sources form a canonical mesh on
an auxiliary surface, where all derivatives are calculated approximately by
finite-difference methods. More details about this approach can be found in
[Shubitidze et al., 1999]. Experience has shown that the MMAS algorithm
improves both the accuracy and the computational efficiency of pure MAS,
when applied to thin three-dimensional geometries.
In the studied case the generic geometry consists of a cylindrical surface
of Perfect Electric Conductor (PEC) material covered with a thin dielectric
layer. Several rectangular PEC patches are placed conformally on top of the
dielectric substrate, thus forming a patch array. The feeds have the form of
microstrip transmission lines.
For simulation purposes, a dedicated native Fortran code implements the
MMAS algorithm for this specific geometry. The code is capable of solving
the electromagnetic problem for two-dimensional linear conformal arrays of
identical patches for various curvatures and array sizes.
Assuming an $e^{-j\omega t}$ time dependence and the Lorenz condition, the
electric field $\mathbf{E}(\mathbf{r})$ anywhere in space, produced by the
current density $\mathbf{J}(\mathbf{r}')$ and the charge density
$\rho(\mathbf{r}')$ on the auxiliary surface $S_{aux}$, is given as a
function of the magnetic vector potential $\mathbf{A}(\mathbf{r})$ and the
electric scalar potential $\Phi(\mathbf{r})$ by the well known equation:
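$$\mathbf{E}(\mathbf{r}) = j\omega\,\mathbf{A}(\mathbf{r}) - \nabla\Phi(\mathbf{r})$$

The dependence of these potentials on the auxiliary sources is given by
the following equations:

$$\mathbf{A}(\mathbf{r}) = \frac{\mu}{4\pi} \iint_{S_{aux}} \mathbf{J}(\mathbf{r}')\, \frac{e^{jk|\mathbf{r}-\mathbf{r}'|}}{|\mathbf{r}-\mathbf{r}'|}\, d^2S'$$

$$\Phi(\mathbf{r}) = \frac{1}{4\pi\varepsilon} \iint_{S_{aux}} \rho(\mathbf{r}')\, \frac{e^{jk|\mathbf{r}-\mathbf{r}'|}}{|\mathbf{r}-\mathbf{r}'|}\, d^2S'$$

The electromagnetic sources need to satisfy the continuity equation:

$$j\omega\,\rho(\mathbf{r}') = \nabla'_S \cdot \mathbf{J}(\mathbf{r}')$$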

The above equations contain the MMAS modelling unknowns, namely the
$\mathbf{J}(\mathbf{r}')$ and $\rho(\mathbf{r}')$ distributions on the
auxiliary surface. The discrete computation of the divergence is assisted
by the mesh setup of Fig. 12-3.

ρ(m−1,n+1)     Ju(m−1/2,n+1)     ρ(m,n+1)     Ju(m+1/2,n+1)     ρ(m+1,n+1)

Jv(m−1,n+1/2)                    Jv(m,n+1/2)                    Jv(m+1,n+1/2)

ρ(m−1,n)       Ju(m−1/2,n)       ρ(m,n)       Ju(m+1/2,n)       ρ(m+1,n)

Jv(m−1,n−1/2)                    Jv(m,n−1/2)                    Jv(m+1,n−1/2)

ρ(m−1,n−1)     Ju(m−1/2,n−1)     ρ(m,n−1)     Ju(m+1/2,n−1)     ρ(m+1,n−1)

Figure 12-3. The Modified MAS mesh of discrete fictitious sources on the auxiliary surface.

Following this discretisation scheme, the charge density is sampled at
the mesh nodes, while the surface current density is sampled midway
between two adjacent nodes. The finite-difference approximation of the
divergence operator for a specific mesh point $\{m\Delta u, n\Delta v\}$ is:
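$$
j\omega\,\rho(m,n) \approx
\frac{h_{v'}(m+\tfrac{1}{2},n)\,J_{u'}(m+\tfrac{1}{2},n) - h_{v'}(m-\tfrac{1}{2},n)\,J_{u'}(m-\tfrac{1}{2},n)}{h_{u'}(m,n)\,h_{v'}(m,n)\,\Delta u'}
+ \frac{h_{u'}(m,n+\tfrac{1}{2})\,J_{v'}(m,n+\tfrac{1}{2}) - h_{u'}(m,n-\tfrac{1}{2})\,J_{v'}(m,n-\tfrac{1}{2})}{h_{u'}(m,n)\,h_{v'}(m,n)\,\Delta v'}
$$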

With the aid of the last equation, the charge density can be expressed
through the currents, so it is simpler to keep the surface current
densities as the sole problem unknowns.
The discretisation of the electromagnetic sources transforms both
potential integrals into finite sums. The next step of MMAS involves the
expansion of the unknown currents into a finite set of basis functions with
unknown weights. A linear system is then formed by imposing the boundary
conditions on the actual interfaces of the geometry. Its solution defines
the weights, and the wave components can finally be determined by
back-substitution.
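
The finite-difference form of the continuity equation can be checked
numerically. The following Python sketch is only an illustration and not the
Fortran code mentioned in the text: the function name `charge_density`, the
storage layout of the staggered samples and the use of callables `hu`, `hv`
for the metric coefficients are all assumptions made here.

```python
def charge_density(Ju, Jv, hu, hv, du, dv, omega):
    """Charge density at the mesh nodes from staggered surface currents.

    Assumed layout (hypothetical, chosen for this sketch):
      Ju[i][n] is J_u sampled at (i - 1/2, n), i = 0..M
      Jv[m][j] is J_v sampled at (m, j - 1/2), j = 0..N
    hu(m, n) and hv(m, n) return the metric coefficients at (possibly
    half-integer) mesh indices; du and dv are the mesh steps.
    """
    M, N = len(Jv), len(Ju[0])
    rho = [[0j] * N for _ in range(M)]
    for m in range(M):
        for n in range(N):
            cell = hu(m, n) * hv(m, n)
            # u-directed flux difference across the node
            d_u = (hv(m + 0.5, n) * Ju[m + 1][n]
                   - hv(m - 0.5, n) * Ju[m][n]) / (cell * du)
            # v-directed flux difference across the node
            d_v = (hu(m, n + 0.5) * Jv[m][n + 1]
                   - hu(m, n - 0.5) * Jv[m][n]) / (cell * dv)
            # continuity equation: j*omega*rho = div_S J
            rho[m][n] = (d_u + d_v) / (1j * omega)
    return rho


# Flat test mesh (h_u = h_v = 1) with J_u = u and J_v = 0, so div J = 1.
M, N, du, dv, omega = 3, 3, 0.1, 0.1, 2.0
Ju = [[(i - 0.5) * du for _ in range(N)] for i in range(M + 1)]
Jv = [[0.0] * (N + 1) for _ in range(M)]
flat = lambda m, n: 1.0
rho = charge_density(Ju, Jv, flat, flat, du, dv, omega)
```

On this flat test mesh every node yields a charge density of 1/(jω). For a
cylindrical auxiliary surface one would presumably pass a unit coefficient
along the axis and the cylinder radius along the azimuth.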

4.4.2 Overview of the Model Geometry

The case study antenna involves a patch array of two identical
cylindrically conformal elements fed by microstrip lines. The corresponding
geometry is shown in Fig. 12-4. The dimensions of each patch are
$a \times a$, where $a = 2\lambda/15$, and the dimensions of the substrate
footprint on the cylinder are $d \times d$, where $d = 4\lambda/15$, for
both directions (z and y), $\lambda$ being the wavelength. The microstrip
line width is $c = \lambda/15$, the substrate thickness is
$h = 2\lambda/15$, the radius of the cylinder is $R = 4\lambda/15$, and the
relative permittivity of the dielectric substrate is $\varepsilon_r = 2.32$.
The two elements are separated by a distance of $\lambda/30$.
Figure 12-4. Antenna array geometry.

4.4.3 Agent Deployment Mechanisms

A set of mobile agents is developed using the platform described earlier.
The most important problems to be faced during their development are the
creation and dispatch of agents from the servlets, as well as the
interaction of the agents with different types of environments (e.g.
agencies running on Linux, UNIX, Windows).
The solution to the first problem was to employ the classic Master-
Worker model. Upon successful submission of data to the servlets, a servlet
creates a new agent, called the Master agent, by using the mediator's API. This
agent is a stationary agent (which means it will never migrate) with
responsibility to create, dispatch, and coordinate the Workers. It has
knowledge of the scheduling policy and is responsible for sending the
Workers to the remote hosts.
In order to cope with the different hosting environments, the Master
agent creates a set of probe agents. These agents are sent to all available
agencies, and return with useful information concerning the underlying
operating system, the amount of free memory, the available storage space
etc. They also perform some benchmarks in order to have a better view of
the system capabilities. The Master agent uses this information to classify
the remote hosts.
The next step is the creation of the Workers, the agents actually
responsible for migrating to a remote host and executing the simulation
code. The Master agent creates the set of Workers and assigns each of them
to a specific host. These agents are just wrappers of the code to be executed.
According to the information acquired from the probe agents, each agent
loads, from a pool of available codes, the one that is appropriate for the
underlying operating system. The agent also reads any data it needs (e.g.
files) and stores them in byte arrays. It then migrates to its remote host,
where it executes the code it carries. During execution, the agent
continuously monitors the status of the program executed and reports to the
Master agent important events, such as program failure or progress status.
When the code execution is finished, the agent returns to its home agency
and is assigned another task, until all tasks are completed.
While the Workers keep track of changes, the Master agent is notified
of any important events. The Master agent can maintain information
concerning the task progress on each remote host, the total execution time,
and so on. A servlet can be used to acquire this information and present it to
the user, along with results calculated so far.
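
The deployment logic described above (probing, code selection per operating
system, and Workers that keep taking tasks until none are left) can be
sketched in a few lines. The Python sketch below is a local simulation with
threads, not the mobile agent platform itself: `HostInfo`, `CODE_POOL`,
`probe`, `worker`, `master` and the host descriptions are hypothetical names
introduced only for illustration.

```python
import queue
import threading
from dataclasses import dataclass

# Hypothetical pool of native executables, one per operating system.
CODE_POOL = {"linux": "sim_linux", "unix": "sim_unix", "windows": "sim_win.exe"}


@dataclass(frozen=True)
class HostInfo:
    name: str
    os_name: str
    free_memory_mb: int


def probe(description):
    """Stand-in for a probe agent reporting on one remote agency."""
    return HostInfo(**description)


def worker(host, tasks, events):
    """Stand-in for a Worker agent assigned to one host."""
    binary = CODE_POOL[host.os_name]      # pick the code matching the OS
    while True:
        try:
            task = tasks.get_nowait()     # ask for the next pending task
        except queue.Empty:
            return                        # all tasks completed
        # A real Worker would migrate here and run the native code.
        events.put((task, host.name, binary, "finished"))


def master(descriptions, task_ids):
    """Stand-in for the stationary Master agent."""
    hosts = [probe(d) for d in descriptions]
    tasks, events = queue.Queue(), queue.Queue()
    for task in task_ids:
        tasks.put(task)
    threads = [threading.Thread(target=worker, args=(h, tasks, events))
               for h in hosts]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    report = []
    while not events.empty():
        report.append(events.get())
    return sorted(report)


HOSTS = [
    {"name": "hostA", "os_name": "linux", "free_memory_mb": 512},
    {"name": "hostB", "os_name": "windows", "free_memory_mb": 256},
]
report = master(HOSTS, range(19))
```

Each entry of `report` records which host ran which task and with which
executable, which is the kind of progress information the Master agent is
described as maintaining.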

4.4.4 Simulation Results

The total execution times for the specific parametric study are depicted in
Fig. 12-5. Since the major computational effort is related to determination of
the radiating properties of the antenna array, once the optimum number and
position of the auxiliary sources is defined, these execution times refer only
to this part of the parametric simulation. The total number of simulations
related to the parametric study is 19. Given that the simulations are
computationally identical and that participating hosts are characterised by
almost equal computing power, it is expected that the tasks will be equally
delegated to the remote hosts. Furthermore, given that every task is
computationally heavy and the parametric study implies an embarrassingly
parallel nature of the distributed problem (i.e. there is no need for
communication among the remote hosts), the time required for transfer of
the native code and input files to the remote hosts, as well as transfer of the
results, is not expected to add significant overhead to the total execution
time.
Figure 12-5. Simulation times for different types of hosts.

As can be seen in the simulation results of Fig. 12-5, the execution times
align with these expectations. The execution time for two hosts is reduced to
about half that for one host. For three hosts, the time is about 2/3 of the
time for two hosts, and for four hosts it is about 4/5 of the time for three
hosts. Thus, given the availability of several workstations, we can speed up
the parametric solution of different and/or identical antenna configurations.
The results of the parametric study include parameters of the antenna
array for all the selected configurations, such as input impedance and near-
and far-field patterns. Each can be handled independently in the user's
browser.
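
Because 19 identical simulations cannot be split evenly over two, three or
four hosts, the execution time does not scale exactly as 1/n. A minimal
sketch of this effect follows, under the idealised assumption (made here, not
stated in the text) that every simulation takes the same time and that
transfer overheads are zero; the function name `rounds_needed` is
hypothetical.

```python
import math


def rounds_needed(num_tasks, num_hosts):
    """Number of sequential task slots on the busiest host."""
    return math.ceil(num_tasks / num_hosts)


NUM_SIMULATIONS = 19                      # from the parametric study
rounds = {n: rounds_needed(NUM_SIMULATIONS, n) for n in (1, 2, 3, 4)}
# Ratio of the idealised execution time with n hosts to that with n - 1 hosts.
ratios = {n: rounds[n] / rounds[n - 1] for n in (2, 3, 4)}
```

Under this model the busiest host runs 19, 10, 7 and 5 simulations for one to
four hosts, i.e. successive ratios of 10/19, 7/10 and 5/7. The measured ratios
quoted above additionally include agent migration and data transfer times, so
they need not coincide with these idealised values.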

4.5 Electromagnetic Penetration Through Apertures: A Resonator Method of Moments (MoM) Model

The purpose of the present case study is the development of a generic
framework for multi-parameter analysis of EMC modelling problems in a
network of personal computers, utilising a Mobile Agent Platform. The
Grasshopper Agent Management System is used in this framework; however, the
system shares many common principles with the paradigms presented in
the previous subsections.

4.5.1 Formulation of the Electromagnetic Problem

The developed infrastructure was tested for the parametric simulations of
an Electromagnetic Compatibility (EMC) problem, namely the penetration
of microwaves through apertures in conducting screens. The apertures, with
$A^{(1)} = a^{(1)} \times b^{(1)}$ and $A^{(2)} = a^{(2)} \times b^{(2)}$
denoting their respective surfaces, are eccentrically cut with arbitrary
orientation on two horizontal, infinite, perfectly conducting plates of
infinitesimal thickness at a vertical distance $H$ (see Fig. 12-6).
Without loss of generality, the origin of the Cartesian system of
coordinates $\{x, y, z\}$ is taken at the centre of gravity of aperture 1.
The centre of gravity of aperture 2 is then placed at the point
$\{R_x, R_y, H\}$ and the relative orientation of the apertures is defined
by an orientation angle, the z-axis being perpendicular to the planes of the
conducting screens. The entire space is characterised by the free-space
dielectric permittivity $\varepsilon_0$ and magnetic permeability $\mu_0$.

Figure 12-6. Double screen with two displaced rectangular apertures.

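Assuming an $e^{+j\omega t}$ time dependence for all the field quantities,
the electric field in each of the regions of the geometry of Fig. 12-6 is
expressed in terms of its Fourier transform, as

$$\mathbf{E}_{I}(x,y,z) = \mathbf{E}_0(x,y,z) + \frac{1}{4\pi^2} \int_0^{\infty} d\rho_k\, \rho_k\, e^{-\gamma z} \int_0^{2\pi} d\varphi_k\, e^{j\rho_k (x\cos\varphi_k + y\sin\varphi_k)}\, \mathbf{A}(\rho_k,\varphi_k)$$

$$\mathbf{E}_{II}(x,y,z) = \frac{1}{4\pi^2} \int_0^{\infty} d\rho_k\, \rho_k\, e^{-\gamma z} \int_0^{2\pi} d\varphi_k\, e^{j\rho_k (x\cos\varphi_k + y\sin\varphi_k)}\, \mathbf{B}(\rho_k,\varphi_k) + \frac{1}{4\pi^2} \int_0^{\infty} d\rho_k\, \rho_k\, e^{+\gamma z} \int_0^{2\pi} d\varphi_k\, e^{j\rho_k (x\cos\varphi_k + y\sin\varphi_k)}\, \mathbf{C}(\rho_k,\varphi_k)$$

$$\mathbf{E}_{III}(x,y,z) = \frac{1}{4\pi^2} \int_0^{\infty} d\rho_k\, \rho_k\, e^{+\gamma z} \int_0^{2\pi} d\varphi_k\, e^{j\rho_k (x\cos\varphi_k + y\sin\varphi_k)}\, \mathbf{D}(\rho_k,\varphi_k)$$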

where $\mathbf{E}_0$ is the primary excitation incident wave, $\mathbf{A}$,
$\mathbf{B}$, $\mathbf{C}$ and $\mathbf{D}$ are the vector unknown
coefficients to be determined and $\gamma = \sqrt{\rho_k^2 - k_0^2}$ is the
propagation constant, $k_0 = \omega\sqrt{\varepsilon_0 \mu_0}$ being the
propagation constant in free space. Note that, in order to satisfy the
radiation conditions, it is required that $\mathrm{Re}\{\gamma\} > 0$ and,
due to the $e^{+j\omega t}$ time dependence, it is also required that
$\mathrm{Im}\{\gamma\} > 0$. Applying Gauss' theorem
$\nabla \cdot \mathbf{E} = 0$ and imposing the continuity of the tangential
EM field components on the aperture planes, two coupled two-dimensional
integral equations are derived, as

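$$\iint_{A^{(1)}} dx'\,dy'\, \mathbf{G}_{11}(x,y \mid x',y') \cdot \mathbf{E}_t^{(1)}(x',y') + \iint_{A^{(2)}} dx'\,dy'\, \mathbf{G}_{12}(x,y \mid x',y') \cdot \mathbf{E}_t^{(2)}(x',y') = \mathbf{R}_1(x,y)$$

$$\iint_{A^{(1)}} dx'\,dy'\, \mathbf{G}_{21}(x,y \mid x',y') \cdot \mathbf{E}_t^{(1)}(x',y') + \iint_{A^{(2)}} dx'\,dy'\, \mathbf{G}_{22}(x,y \mid x',y') \cdot \mathbf{E}_t^{(2)}(x',y') = \mathbf{R}_2(x,y)$$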

where the field superscript (k) denotes the k-th aperture (k = 1, 2),
$\mathbf{E}_t^{(k)} = \hat{x} E_x^{(k)} + \hat{y} E_y^{(k)}$ are the
transverse electric field components on the k-th aperture with surface
$A^{(k)}$, $\mathbf{G}_{ij}(x,y \mid x',y')$ (i, j = 1, 2) are kernel matrix
functions and the right-hand vectors $\mathbf{R}_{1,2}(x,y)$ describe the
incident wave impact. The formed system is solved by employing the Method of
Moments (MoM) and more specifically an entire domain Galerkin technique.
Namely, with respect to the local Cartesian coordinates system
$(x^{(k)}, y^{(k)}, z)$, attached to the k-th aperture centre of gravity
(k = 1, 2), the transverse electric fields are expressed as

$$E_x^{(k)} = s\!\left(x^{(k)}, y^{(k)}\right) \sum_{n=0}^{N^{(k)}} \sum_{m=0}^{M^{(k)}} c_{nm}^{(k)}\, U_n\!\left(\frac{2x^{(k)}}{a^{(k)}}\right) T_m\!\left(\frac{2y^{(k)}}{b^{(k)}}\right)$$

$$E_y^{(k)} = \frac{1}{s\!\left(x^{(k)}, y^{(k)}\right)} \sum_{n=0}^{N^{(k)}} \sum_{m=0}^{M^{(k)}} d_{nm}^{(k)}\, U_m\!\left(\frac{2y^{(k)}}{b^{(k)}}\right) T_n\!\left(\frac{2x^{(k)}}{a^{(k)}}\right)$$

where $\{x^{(1)}, y^{(1)}, z\} \equiv \{x, y, z\}$ since the origin of the
Cartesian system of coordinates $\{x, y, z\}$ is taken at the aperture 1
centre of gravity, $T_n(\cdot)$ and $U_n(\cdot)$ are the n-th order
Chebyshev polynomials of the first and second kind respectively, whose
arguments are chosen in a way that the appropriate stationary waves are
developed on the rectangular aperture surfaces and

$$s\!\left(x^{(k)}, y^{(k)}\right) = \sqrt{\frac{1 - \left(2x^{(k)}/a^{(k)}\right)^2}{1 - \left(2y^{(k)}/b^{(k)}\right)^2}}, \quad k = 1, 2$$

is a square root term, which directly imposes the satisfaction of the edge
conditions at $x^{(k)} = \pm a^{(k)}/2$, $y^{(k)} = \pm b^{(k)}/2$,
accelerating the convergence of the proposed Galerkin technique. A system of
linear equations of order
$2(N^{(1)}+1)(M^{(1)}+1) + 2(N^{(2)}+1)(M^{(2)}+1)$ is derived in terms of
the unknown coefficients $c_{nm}^{(k)}$ and $d_{nm}^{(k)}$. Note that, when
expressing the local Cartesian coordinates system
$\{x^{(2)}, y^{(2)}, z\}$, attached to the aperture 2 centre of gravity,
in terms of the global Cartesian coordinates (x, y, z), both the eccentricity
and the orientation of aperture 2 with respect to aperture 1 are taken into
consideration by a convenient exponential term, which appears in all the
elements of the linear system kernel. The integrals appearing in the system
kernel with respect to the $x$, $y$, $x'$, $y'$ and $\varphi_k$ variables are
performed analytically, while the integrals with respect to the $\rho_k$
variable are computed numerically. Multiple poles then appear at
$\rho_k = \rho_{k\nu} = \sqrt{k_0^2 - (\nu\pi/H)^2}$, due to a
$(1 - e^{-2\gamma H})$ denominator term. The corresponding residues represent
the guided waves between the two parallel screens. Once the transverse
electric fields $\mathbf{E}_t^{(k)}$ are determined, the EM field at any
observation point can be computed. Due to the integral procedure used in the
above-mentioned algorithm, the obtained solution is stationary. Thus, if the
error in computing the aperture fields is of order $|\delta E^{(k)}|$, only
an error of order $|\delta E^{(k)}|^2$ is introduced in computing the
electric field intensity at an arbitrary point.
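
The entire-domain basis functions and the size of the resulting linear system
can be illustrated with a short Python sketch. It is not the native MoM code
referred to in this chapter; the function names are hypothetical, and the
pairing of the square root term with the Chebyshev polynomials follows the
expansion as read from the equations above.

```python
import math


def _chebyshev(n, x, first):
    """Three-term recurrence P_{k+1} = 2x P_k - P_{k-1}, with P_0 = 1."""
    previous, current = 1.0, first
    if n == 0:
        return previous
    for _ in range(n - 1):
        previous, current = current, 2.0 * x * current - previous
    return current


def chebyshev_t(n, x):
    """Chebyshev polynomial of the first kind, T_n(x)."""
    return _chebyshev(n, x, x)


def chebyshev_u(n, x):
    """Chebyshev polynomial of the second kind, U_n(x)."""
    return _chebyshev(n, x, 2.0 * x)


def edge_factor(x, y, a, b):
    """Square root term s(x, y); valid strictly inside the aperture."""
    return math.sqrt((1.0 - (2.0 * x / a) ** 2) / (1.0 - (2.0 * y / b) ** 2))


def ex_basis(n, m, x, y, a, b):
    """Basis function multiplying c_nm in the E_x expansion."""
    return (edge_factor(x, y, a, b)
            * chebyshev_u(n, 2.0 * x / a) * chebyshev_t(m, 2.0 * y / b))


def ey_basis(n, m, x, y, a, b):
    """Basis function multiplying d_nm in the E_y expansion."""
    return (chebyshev_u(m, 2.0 * y / b) * chebyshev_t(n, 2.0 * x / a)
            / edge_factor(x, y, a, b))


def system_order(n1, m1, n2, m2):
    """Order of the linear system for truncation limits N(k), M(k)."""
    return 2 * (n1 + 1) * (m1 + 1) + 2 * (n2 + 1) * (m2 + 1)
```

For instance, `system_order(4, 4, 4, 4)` gives 100 unknowns for
N = M = 4 on both apertures, which shows how quickly the cost of one
simulation grows with the truncation limit.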
4.5.2 Parametric Simulations

A native MoM code has been developed to model the EMC problem
described in the previous subsection. Both method-level and application
level simulations of a specific real example are presented. The geometry
consists of two square co-centric apertures of equal dimensions
A(1) = a(1) b(1) = A(2) = a(2) b(2) = 4 2 while the primary source is taken to be
a Hertzian dipole parallel to the screens, located co-centrally to the
apertures, halfway between them. The electric field distributions developed
on the aperture surfaces, which, due to the geometrical symmetry of the
examined structure, are identical on the two apertures, are plotted in
Fig. 12-7. Within each column, the convergence of the proposed method is
demonstrated with respect to the truncation of the series upper limit
(method-level simulations), where
$N^{(1)} = N^{(2)} = N = M^{(1)} = M^{(2)} = M$, due to the equal square
apertures geometry. Within each row, the electric field distribution is
plotted with increasing free-space wavenumber $k_0$, i.e. with increasing
operation frequency $f = \omega/(2\pi)$ (application-level simulations).
Further application-level simulations could involve, for example, altering
the relative position, orientation and size of the two apertures, in order to
achieve the desired EM penetration.
The simulation deployment involves performing the typical steps for
setting up an AMS for parametric problems based on the Master-Worker
scheme. The user initiates a Grasshopper Region at the main host and one
Agency in each processing node. All Agencies report their existence to the
directory service at boot time. The Master agent processes input data and
creates Worker agents accordingly. The Workers serialise the files needed
for their simulation and migrate to the available Agencies, where they begin
execution. Results are collected back at the main host.

4.5.3 Performance Results

In order to test the behaviour of the infrastructure, we executed the
12 computationally heterogeneous simulations on different numbers of
computers. The utilised hosts were nodes of an almost homogeneous LAN,
the additional network traffic was low at the time of execution and the
simulations were autonomous. Therefore, it is mainly due to the
heterogeneity of the simulations that their delegation to the remote hosts
is not absolutely equal. For example, if a light simulation is delegated to
a remote host while the other hosts are performing heavy simulations, this
host will continue with another task after having completed its current one.
[Figure 12-7 panel titles, one column per k0; each panel is plotted over the x-axis and y-axis]
k0=1, N=M=4, A=4.49e4     k0=2, N=M=4, A=2.37e4     k0=3, N=M=5, A=4.71e4
k0=1, N=M=5, A=4.51e4     k0=2, N=M=5, A=2.44e4     k0=3, N=M=6, A=5.54e4
k0=1, N=M=6, A=4.77e4     k0=2, N=M=6, A=2.40e4     k0=3, N=M=7, A=5.65e4
k0=1, N=M=7, A=4.77e4     k0=2, N=M=7, A=2.38e4     k0=3, N=M=8, A=5.58e4

Figure 12-7. Electric fields on two co-centric rectangular apertures cut on parallel screens.

According to this scenario, given the homogeneity of the utilised
computers and the heterogeneity of the simulations, each computer ends up
having performed a different set of simulations (Fig. 12-7). Note that, in
problems such as the one analysed in the current work, the computational
requirements of each simulation are dictated by the number of basis
functions N = M. Thus, the set of simulations can be grouped, based on
their complexity, in order to determine the scheduling policy. Furthermore,
given the high bandwidth of the LAN, the time required for the transfer of
the native code and the input files to the remote hosts, as well as the
transfer of the results, does not add a significant overhead to the total
execution time. The simulation results are compared with an ideal speed-up
in Fig. 12-8.
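
The delegation of heterogeneous simulations, where a host that finishes early
simply takes the next pending task, can be simulated as follows. This Python
sketch is an illustration only. The list of (k0, N) pairs is read from the
panel titles of Fig. 12-7, while the cost model (cube of the linear system
order, as for a dense direct solver) and the names `cost` and `schedule` are
assumptions made here and are not taken from the text.

```python
import heapq

# (k0, N = M) pairs as listed in the panel titles of Fig. 12-7.
SIMULATIONS = [(k0, n)
               for k0, limits in ((1, (4, 5, 6, 7)),
                                  (2, (4, 5, 6, 7)),
                                  (3, (5, 6, 7, 8)))
               for n in limits]


def cost(n):
    """Assumed relative cost: cube of the system order 4 (N + 1)^2."""
    return (4 * (n + 1) ** 2) ** 3


def schedule(tasks, num_hosts):
    """Give each task to whichever host becomes free first."""
    free_at = [(0, host) for host in range(num_hosts)]
    heapq.heapify(free_at)
    assignment = {host: [] for host in range(num_hosts)}
    for task in tasks:
        time_free, host = heapq.heappop(free_at)
        assignment[host].append(task)
        heapq.heappush(free_at, (time_free + cost(task[1]), host))
    makespan = max(time_free for time_free, _ in free_at)
    return assignment, makespan


# Grouping by complexity: submit the heaviest simulations first.
by_complexity = sorted(SIMULATIONS, key=lambda t: cost(t[1]), reverse=True)
assignment, makespan = schedule(by_complexity, 4)
```

With such a policy each host ends up with a different set of simulations,
as noted above, and the resulting makespan can be compared with the ideal
value obtained by dividing the total cost by the number of hosts.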
[Plot "Distributed simulation results": Time (sec) versus Nodes, with series "Time" and "Optimal Time". Data labels: 230, 115, 90, 70, 76.67 and 57.5.]

Figure 12-8. Simulation results with 1, 2, 3 and 4 nodes.
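
The "Optimal Time" values follow directly from dividing the single-node time
by the number of nodes. The Python sketch below reproduces them and shows how
speed-up and efficiency would be computed. The single-node time of 230 s is
taken from the data labels of the plot; attributing the labels 90 and 70 to
the measured times on three and four nodes is an assumed reading of the
figure, and the function names are hypothetical.

```python
def ideal_time(single_node_time, nodes):
    """Execution time under perfect (linear) speed-up."""
    return single_node_time / nodes


def speedup(single_node_time, measured_time):
    return single_node_time / measured_time


def efficiency(single_node_time, measured_time, nodes):
    """Fraction of the ideal speed-up actually achieved."""
    return speedup(single_node_time, measured_time) / nodes


T1 = 230.0                                   # seconds on one node
ideal = {n: round(ideal_time(T1, n), 2) for n in (1, 2, 3, 4)}

# Assumed reading of the plot labels (not stated explicitly in the text).
assumed_measured = {3: 90.0, 4: 70.0}
eff = {n: efficiency(T1, t, n) for n, t in assumed_measured.items()}
```

The computed ideal values match the labels 115, 76.67 and 57.5 of the plot.
Under the assumed reading, the efficiency stays above 80% for both three and
four nodes.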

5. INTRODUCING GENETIC SOFTWARE AGENTS

The previous sections analyze the idea of developing software agent
platforms for embarrassingly parallel CEM problems. This section studies
the case of population-based stochastic optimisation discussed in Section
2.3, for the case of distributed Genetic Search Optimisation. The theory of
genetic algorithms is based upon evaluations of the quality (fitness) of
multiple potential solutions and mixing of input parameters for the
production of new solutions, following the laws of natural evolution
(survival of the fittest).
A Genetic Algorithm (GA) performs a guided, intelligent random search
in the multi-dimensional space of optimization parameters, for the global
optimum. The GA finds tentative solutions (called chromosomes or
individuals) and classifies their quality according to a fitness function,
defined according to the desired characteristics. The optimization parameter
values are produced by combining the simulation of the survival-
of-the-fittest natural mechanisms (crossover, mating, mutation, population
decimation) with random processes. It must be noted that the estimation of
the fitness function is usually a complex and resource-demanding task, since
it typically involves large-scale electromagnetic (E/M) field calculations.
Serial GA execution consists of the following, consecutive steps:
1. Initialize the algorithm parameters (reset the generation counter, define
   fitness function, population size, gene coding, crossover and mutation
   probabilities etc.)
2. Evaluate the fitness of each individual
3. Sort the population according to their quality
4. Apply a selection method for the mating process
5. Apply the mutation operator and advance generation
6. Re-evaluate the fitness for all chromosomes and sort the population
7. Apply a population decimation method to eliminate the worst chromosomes,
   assuming they do not contribute positively to the search for the global
   optimum
8. Repeat steps 3 to 7 until at least one of the stop criteria is met. These
   often include a maximum number of generations, a desired fitness value
   or both. The best individual contains the optimal solution encoded in its
   chromosome.
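
The steps above can be sketched as a compact serial GA. The following Python
code is a minimal illustration, not the implementation used in this chapter:
real-valued genes in [0, 1], truncation selection from the better half,
single-point crossover and uniform mutation are choices made here for brevity,
and the function name `run_ga` and its parameters are hypothetical.

```python
import random


def run_ga(fitness, n_genes, pop_size=20, max_generations=30,
           target=None, p_cross=0.8, p_mut=0.1, seed=0):
    rng = random.Random(seed)
    # Step 1: initialise parameters and a random population.
    population = [[rng.random() for _ in range(n_genes)]
                  for _ in range(pop_size)]
    # Step 2: evaluate the fitness of each individual.
    scored = [(fitness(ind), ind) for ind in population]
    history = []
    for generation in range(max_generations):
        # Step 3: sort by quality, best first.
        scored.sort(key=lambda pair: pair[0], reverse=True)
        # Step 4: selection; here parents come from the better half.
        parents = [ind for _, ind in scored[:pop_size // 2]]
        offspring = []
        for _ in range(pop_size):
            mother, father = rng.sample(parents, 2)
            cut = rng.randrange(1, n_genes) if n_genes > 1 else 0
            child = (mother[:cut] + father[cut:]
                     if rng.random() < p_cross else list(mother))
            # Step 5: mutation.
            child = [rng.random() if rng.random() < p_mut else g
                     for g in child]
            offspring.append(child)
        # Step 6: evaluate the new chromosomes and re-sort (survivors keep
        # their known fitness, so only the offspring are evaluated).
        scored += [(fitness(ind), ind) for ind in offspring]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        # Step 7: decimation back to the nominal population size.
        scored = scored[:pop_size]
        history.append(scored[0][0])
        # Step 8: stop criteria (generation limit or desired fitness).
        if target is not None and scored[0][0] >= target:
            break
    return scored[0][1], history


# Toy fitness standing in for a costly E/M field calculation.
sphere = lambda genes: -sum((g - 0.5) ** 2 for g in genes)
best, history = run_ga(sphere, n_genes=3)
```

In a CEM setting the call to `fitness` would launch a full field simulation,
which is why the evaluations in steps 2 and 6 dominate the execution time.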

The application of such stochastic global optimization procedures raises
significant demands for computational resources. Since hundreds of fitness
evaluations are required before a satisfactory result is found, GA execution
time may prove prohibitive for a single processing node. Fortunately, the
nature of GAs itself assists in designing and implementing parallel versions
of the algorithm in a straightforward and efficient way. These modalities are
referred to as Parallel (PGA) or Distributed Genetic Algorithms (DGA),
depending on the platform architecture.

5.1 Distributed Genetic Algorithms with Agents

Implementing Distributed Genetic Algorithms (DGAs) for CEM
applications is a very active research area, especially in Computer-Aided
Engineering (CAE) for antenna design and simulations. In this context,
advanced network computing techniques can be used in building the
framework for the execution of such applications.
The DGA practically provides scalability to the classic GA by distributing
the costly fitness evaluations among several interconnected processing nodes.
There are numerous methods of achieving parallel execution, and many new
parameters are introduced for configuring the distributed execution [Alba,
2001]. The DGA run may be synchronous or asynchronous, coarse- or fine-
grain, etc. There are also many different communication schemes, depending
on the properties of each individual of the GA. The proposed approach
consists of decomposing the GA population to its individuals and assigning
an Intelligent Software Agent for each chromosome, to carry out the genetic
operators and simulate its life-cycle. Agent groups can run in a separate
node, since the required communication is performed with the aid of the
sophisticated agent messaging system. These collaborative entities are called
Genetic Search Agents (GSAs).
GSAs appear as individuals in the population, carrying the genetic
material in the form of properties and using AMS communication channels
for exchanging information during evolution processes. The implementation
details of this straightforward approach are given in the following
subsections.
The Mobile Agent Technology (MAT) has been applied in the
development of Distributed Genetic Algorithms, in software systems such as
the Genetica environment described in Kryl, 2002 or the platform of
Slootmaekers et al., 1998. However, this approach has not been introduced in
the area of CEM research. The following paragraphs describe the basic
principles and concepts of intelligent agents for genetic algorithms and the
key entity of the GSA. The described infrastructure appears very attractive
for Distributed GA implementations, due to the simple, straightforward
conceptual design and the flexibility in communication mechanisms.
The implementation issues discussed in this section are generally
oriented towards compliance with the FIPA MAT standards and the
CSELT/TILAB Java Agent Development Environment (JADE), concerning
the Agent Management System [CSELT, 2005].

5.1.1 Entity Mappings

The Genetic Search Agent (GSA) is the core functional entity of a DGA
framework based on MAT. It is an autonomous object that communicates in
a synchronised or asynchronous (non-blocking) manner with similar agents
for the collaborative solution of a given optimisation problem. The following
entity mappings connect GA entities with GSA platform components and
operations:
• Genetic Search Agent ↔ Chromosome (Individual): The GSA is an
  individual, carrying a full chromosome, and therefore a possible solution
  to the problem. It is the main component of the platform, with the ability
  to carry out genetic operations, matching its life cycle with the GA
  specifications. The way GSAs implement genetic operators is described
  as a part of the entity mappings.
• Genetic Search Agent Properties ↔ Genes: Each GSA holds properties
  that represent the actual genes. It is this set of genes that forms the
  chromosome and allows the evaluation of the GSA quality with fitness
  criteria.
• Population ↔ Set of Genetic Search Agents: The population actually
  consists of the set of GSAs in the AMS, which have the capability to
  collaborate for solving the optimisation problem.
• Mating ↔ GSA communication: The ACL messaging system is used in
  GSA platforms for information exchange during mating procedures
  (selection, crossover). According to the coordination scheme followed in
  a specific implementation, the required information can be sent by a
  central controlling entity, or can be deduced by each GSA according to
  its intelligence.
• Offspring generation ↔ GSA cloning: The copying/cloning agent
  operations can be mapped to the offspring generation mechanisms of the
  GA.
• Fitness evaluation ↔ CEM application execution: The assessment of the
  genetic material of each GSA corresponds to launching the CEM
  application and collecting the results.
• Population decimation ↔ GSA terminations: Decimating the population at
  the end of each generation is equivalent to calling removal functions for
  each agent, once more according to the coordination scheme of each
  platform.
Based on the above-mentioned mapping model, the development of a
GSA system can be carried out as an extension of existing mobile, intelligent
agent implementations in widely used platforms such as JADE.

5.1.2 Parallel Processing Coordination

The coordination of the GSA optimising framework can be performed in
two ways. The simpler and more widely used way is to introduce a
coordination entity (master) that holds fitness tables, takes the decisions
for the GA operations based on GSA input, and manages a population of
fitness evaluators (workers) by giving mating and decimation directives.
This method exhibits blocking behaviour, since generation advancement is
possible only when all evaluations are finished.
This way, worker-GSAs remain simpler, but there is increased
probability of bottleneck occurrences, since all entities need connections to
the same network address. In addition, the resulting need for synchronisation
at each generation cannot make use of possibly faster processing nodes, not
to mention the total collapse of the system in case the coordinator crashes.
An alternative approach focuses on increased GSA intelligence and
autonomy. This decentralised model proposes the removal of a management
and coordination entity, by making GSAs capable of performing all genetic
operations on their own. More specifically:
• GSAs should be able to communicate directly with a population subset
  (neighbourhood).
• GSAs should take their own decisions on mating procedures and
  offspring generation, based on the qualities of the individuals of their
  neighbourhood.
• Dynamic insertion/deletion of a GSA should be handled automatically
  by the community. Each individual looks for a neighbourhood
  autonomously, based on its own criteria (e.g. chromosome diversity,
  fitness).
• No synchronisation is required at the end of every generation. A
  better-equipped node can perform faster by mating more often in comparison
  with a busier or otherwise worse node. Generation mixing does not
  constitute a problem in this environment.
An architecture that can be used as a compass for implementing a GSA
system is discussed in the following paragraph.

5.2 Proposed Architecture

As previously mentioned, the GSAs can operate in both a centralised
and a decentralised way, following the theoretical background of Distributed
Genetic Algorithms [Alba, 2001].

5.2.1 Centralised Model

This is the method adopted in Genetica [Kryl, 2002], which has numerous
characteristics in common with the standard Master-Worker model. A typical
distributed GA execution with this model involves the following steps:
1. Initialize all network resources, and assign chromosome values to each
individual.
2. The master sends requests for fitness estimation to all nodes for the
unclassified genetic material.
3. Each worker node starts fitness estimation and sends the outcome to the
master. It remains blocked afterwards, waiting for the next master
request.
4. When all fitness evaluations are finished, the master performs the genetic
operators of selection, mating and mutation locally, using the
chromosome values of its workers. The generation counter is increased.
5. Steps 2 to 4 are repeated until the termination criteria (known to the
   master) are met. It should be noted that in all subsequent generations,
   the decimation operator is also applied. All monitoring and statistical
   information is available to the user from the master node.
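
The blocking character of the centralised model can be seen in a small
simulation. The Python sketch below uses a local thread pool in place of
remote worker nodes and is not the Genetica or JADE implementation; the
function name `centralised_ga`, its parameters and the specific genetic
operators are assumptions made for illustration.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def centralised_ga(fitness, n_genes, pop_size=8, generations=10,
                   n_workers=4, p_mut=0.1, seed=1):
    rng = random.Random(seed)
    # Step 1: assign chromosome values to each individual.
    population = [tuple(rng.random() for _ in range(n_genes))
                  for _ in range(pop_size)]
    known = {}                      # fitness table held by the master
    history = []
    with ThreadPoolExecutor(max_workers=n_workers) as workers:
        for _ in range(generations):
            # Step 2: request fitness only for unclassified material.
            pending = [ind for ind in set(population) if ind not in known]
            # Step 3: the loop over map() ends only when every worker has
            # answered, which is the synchronisation barrier of the model.
            for ind, value in zip(pending, workers.map(fitness, pending)):
                known[ind] = value
            # Step 4: the master applies the genetic operators locally.
            ranked = sorted(population, key=known.get, reverse=True)
            history.append(known[ranked[0]])
            survivors = ranked[:pop_size // 2]          # decimation
            children = []
            while len(survivors) + len(children) < pop_size:
                a, b = rng.sample(survivors, 2)
                cut = rng.randrange(1, n_genes) if n_genes > 1 else 0
                child = tuple(rng.random() if rng.random() < p_mut else g
                              for g in a[:cut] + b[cut:])
                children.append(child)
            population = survivors + children           # next generation
    return ranked[0], history


sphere = lambda genes: -sum((g - 0.5) ** 2 for g in genes)
best, history = centralised_ga(sphere, n_genes=3)
```

Because the fittest individual always survives decimation in this sketch,
the best fitness recorded by the master never decreases from one generation
to the next.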
Fault tolerance and dynamic resource allocation mechanisms can be
applied in this model, thanks to the AMS technology. The Master can
dynamically create new Workers, or replace a malfunctioning one, as long as
the Worker state is saved periodically. The centralised model is the simplest
to implement and offers good control over the genetic search, since all
operations are carried out by the same entity for the whole population.
However, the employed communication scheme may easily suffer from
bottlenecks, especially when input or output data size is large.

5.2.2 Decentralised Model

In order to avoid network traffic bottlenecks and total dependence of the
costly optimisation process on the life of a specific controlling entity, the
GA may be implemented in a decentralised way with intelligent mobile
genetic agents. This model is perhaps the most similar to real-life genetic
evolution, since each individual has the capability to make its own decisions
without following external commands. The genetic operators are implemented
with the use of agent communication skills: mating selection, crossover and
offspring generation are carried out after two GSAs pass messages
to each other, containing their fitness values, their availability or their
genetic material.
Since requiring every individual GSA to exchange information with all its
peers would consume enormous network resources, each GSA holds information
about only a population subset. The selection of this subset depends on
random or fitness criteria. This results in a sub-optimal application of
GA operations, but the system handles dynamic resource re-allocation and
exhibits high robustness, since no entity is too important to be lost.
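
A decentralised search of this kind can be mimicked in a single process. In
the Python sketch below, each agent knows only a random neighbourhood, picks
its mate from it and replaces the weakest member of that neighbourhood when
the offspring is fitter. It is a simulation under stated assumptions, not a
JADE-based implementation: the class `GeneticSearchAgent`, the function
`decentralised_ga`, the random choice of the acting agent and the replacement
rule are all hypothetical.

```python
import random
from dataclasses import dataclass, field


@dataclass
class GeneticSearchAgent:
    genes: list                                     # agent properties = genes
    fitness: float
    neighbours: list = field(default_factory=list)  # indices of known peers


def decentralised_ga(fitness, n_genes, n_agents=12, k=3, events=300,
                     p_mut=0.1, seed=2):
    rng = random.Random(seed)
    agents = []
    for _ in range(n_agents):
        genes = [rng.random() for _ in range(n_genes)]
        agents.append(GeneticSearchAgent(genes, fitness(genes)))
    for i, agent in enumerate(agents):
        peers = [j for j in range(n_agents) if j != i]
        agent.neighbours = rng.sample(peers, k)     # random neighbourhood
    best = [max(a.fitness for a in agents)]
    for _ in range(events):
        # Any agent may act at any time: there is no generation barrier.
        i = rng.randrange(n_agents)
        me = agents[i]
        # "Message exchange": ask the neighbours for their fitness values
        # and mate with the fittest of them.
        mate = max((agents[j] for j in me.neighbours),
                   key=lambda a: a.fitness)
        cut = rng.randrange(1, n_genes) if n_genes > 1 else 0
        child = me.genes[:cut] + mate.genes[cut:]
        child = [rng.random() if rng.random() < p_mut else g for g in child]
        child_fitness = fitness(child)
        # Local decimation: the offspring takes the place of the weakest
        # member of the neighbourhood, if it is fitter.
        weakest = min(me.neighbours + [i], key=lambda j: agents[j].fitness)
        if child_fitness > agents[weakest].fitness:
            agents[weakest].genes = child
            agents[weakest].fitness = child_fitness
        best.append(max(a.fitness for a in agents))
    return max(agents, key=lambda a: a.fitness), best


sphere = lambda genes: -sum((g - 0.5) ** 2 for g in genes)
champion, best = decentralised_ga(sphere, n_genes=3)
```

No agent holds the whole fitness table, so losing one agent removes only one
candidate solution and one neighbourhood link, which illustrates the
robustness argument made above.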

5.2.3 Hybrid Implementations

In order to combine the best of both models, it is possible to use both
centralised and decentralised principles in a hybrid platform. In this case we
introduce multiple Master GSAs, each responsible for the coordination
of a population subset (neighbourhood) of Workers in the same way as in the
centralised model. However, this group of Master GSAs exchanges
information about genetic evolution in a collaborative manner, so that the
search is guided by global and local optimal individuals. This way, the
genetic operators are implemented in a distributed fashion among these
intelligent entities. Naturally, several variations may exist in this hybrid
model, depending on the delegation of tasks between Masters and Workers.
The hybrid modalities combine the advantages of the aforementioned
models at the cost of implementation complexity. A cost-effective approach
is the creation of a single Master agent in every network node, which would
control the set of local Workers and communicate with its peers in order to
learn the global optimum, implement elitism or decimate its population.

5.3 Conclusions

There are numerous methods of distributing the genetic optimization
procedures with the aid of Intelligent Agent technology. The fundamental
parameter that guides the design of such a platform is the level of
sophistication incorporated in the core computing entity, the Genetic Search
Agent. The JADE Agent Management System supports advanced
communication protocols, languages and ontology descriptions, which
significantly assist in achieving rational GSA interaction. A wide variety of
DGA modalities can be programmed with the aid of the request-response
mechanism provided by language- and ontology-coded ACL messages.
Finally, the GSA platform architecture provides the capability for a
flexible interface with already developed, native CEM applications, which
can be regarded as black boxes from the GSA perspective, with the aid of
libraries such as the Java Native Interface (JNI). The distributed system can
then operate as a service for general-purpose CEM optimization, providing an
abstraction layer between the CEM application and the DGA optimizer.

References
Alba, E. and Troya, J. M., 2001, Analyzing synchronous and asynchronous parallel
distributed genetic algorithms, Elsevier FGCS 17:451-465.
Atlamazoglou, P. E., Biniaris, C. G., Kostaridis, A. I., Kaklamani, D. I. and Venieris, I. S.,
2002, Mobile agent based distributed computation of absorbed power inside interstitial
antenna arrays for the hyperthermic treatment of cancer, Proc. 4th GRACM Cong. on
Comp. Mech.
Biniaris, C. G., Kostaridis, A. I., Kaklamani, D. I. and Venieris, I. S., 2002, Implementing
distributed FDTD codes with Java mobile agents, IEEE Ant. Prop. Mag, 44(6):115-119.
Biniaris, C. G., Kostaridis, A. I., Kaklamani, D. I. and Venieris, I. S., 2003a, An agent-based
framework for parametric studies of numerical modelling problems in computational
electromagnetics, Int. J. Numer. Model. 16:67-79.
Biniaris, C. G., Kostaridis, A. I., Kaklamani, D. I. and Venieris, I. S., 2003b, Mobile agent
based distributed computations of numerical modeling problems in EMC applications,
Proc. IEEE Intl. Symposium on Electromag. Compat., 2:794-797.
Biniaris, C. G., Kostaridis, A. I., Kaklamani, D. I. and Venieris, I. S., 2004, A three-
dimensional object-oriented distributed finite element solver based on mobile agent
technology, Taylor & Francis Electromagnetics, 24:25-37.
Chess, D. et al., 1995, Itinerant agents for mobile computing, IEEE Personal Comm. Mag.,
2(5):34-59.
Chess, D., Harrison, C. G. and Kerschenbaum, A., 1998, Mobile agents: are they a good
idea?, in: Mobile Agents and Security, G. Vigna, ed., LNCS 1419, Springer-Verlag,
pp. 25-47.
CSELT S.p.A., TILab S.p.A., 2005, JADE Programmer's Guide, (March 2005), [online]
http://jade.cselt.it/doc/programmersguide.pdf.
FIPA (Foundation for Intelligent Physical Agents) 2005, Repository of FIPA Specifications,
[online], http://fipa.org/repository/index.html.
Fuggetta, A., Picco, G., Vigna, G., 1998, Understanding code mobility, IEEE Trans. on Soft.
Eng., 24(5):342-361.
Gropp, W. et al., 2005a, MPICH2 Installer's Guide, (June 10, 2005) Math. and Comp. Sc.
Div., Argonne Natl Lab., (June 10, 2005); http://www-unix.mcs.anl.gov/mpi/mpich2/
downloads/mpich2-doc-install.pdf.
Gropp, W. et al., 2005b, MPICH2 User's Guide, (June 10, 2005) Math. and Comp. Sc. Div.,
Argonne Natl Lab., (June 10, 2005); http://www-unix.mcs.anl.gov/mpi/mpich2/
downloads/mpich2-doc-user.pdf.
IKV++, 2001, Grasshopper Programmer's Guide, IKV++ GmbH Informations- und
Kommunikationssysteme, Berlin;
Kostaridis, A. I., Biniaris, C. G., Foukarakis, I. E., Kaklamani, D. I. and Venieris, I. S., 2004,
A Web-based distributed computing framework for antenna array modelling, Special Issue
IEEE Comm. Mag. on Adaptive Antennas and MIMO systems for wireless comm.,
42(10):81-87.
Kryl, P., 2002, Distributed genetic algorithms guide to geNETiCA, (22 May 2002);
http://genetica.sourceforge.net/.
Lymperopoulos, D., Logothetis, D., Kostaridis, A. and Kaklamani, D., 2005, Grid
Computing Techniques for Distributed Processing in Computational Electromagnetics
based on the Web Services Architecture, 17th IMACS World Congress Scient. Comp. Appl.
Math. and Simul., Paris.
Manola, F., 1998, Agent Standards Overview, Object Services and Consulting, Inc. Technical
Note, (July 1998); http://www.objs.com/agility/tech-reports/9807-agent-standards.html.
OMG, 1997, Mobile agent system interoperability facility (MASIF) specification, (November
1997); ftp://ftp.omg.org/pub/docs/orbos/97-10-05.pdf.
Shubitidze, P., Kaklamani, D. I., Anastassiu, H. T., 1999, Modified method of auxiliary
sources applied to the analysis of planar and cylindrically shaped microstrip antennas,
Proc. Intl. Conf. Electromag. in Adv. Apps., Torino, 375-378.
Slootmaekers, R., van Wulpen, H. and Joosen, W., 1998, Modeling Genetic Search Agents
with a Concurrent Object-Oriented Language, Proc. Intl Conf. and Exhib. on HPC and
Networking 98, London, UK, 843-853.
White, J. E., 1997, Mobile agents, in: Software Agents, J. M. Bradshaw, ed., MIT Press, New
York, pp. 437-472.
Bibliography
Goldberg, D. E., 1989, Genetic Algorithms in Search, Optimization and Machine Learning,
2nd ed., Addison-Wesley.
Chapter 13
WEB SERVICES ENHANCED PLATFORM
FOR DISTRIBUTED SIGNAL PROCESSING
IN ELECTROMAGNETICS

I. E. Foukarakis, D. B. Logothetis, A. I. Kostaridis, D. G. Lymperopoulos
and D. I. Kaklamani
School of Electrical and Computer Engineering, National Technical University of Athens

Abstract: The Web Services programming model has been utilised as middleware for
many distributed platforms. In this chapter a distributed Web Services
enhanced platform is presented. Several architectural paths and decisions are
presented in order to provide information about possible utilization patterns of
Web Services in distributed computing. A parametric CEM application
developed on top of the discussed platform is also presented. Finally, the
performance of the platform is evaluated based on tests performed on a local
network of workstations.

Key words: Computational Electromagnetics; Parametric Problems; Web Services.

1. INTRODUCTION

This chapter presents the development of a network service for the
distributed computation of parametric CEM problems, which is built upon
the Web Services computing paradigm [Karre, 2003; Lymperopoulos et al.,
2005]. The design is service-oriented and the implementation is based on the
Simple Object Access Protocol (SOAP) specifications [Box et al., 2000].
The platform is tested with the problem of microwave imaging using a
coherent Synthetic Aperture Radar (SAR) sensor. The radar signal
processing is simulated in a distributed way using the means provided by the
Web Services mechanism. The implementation details are presented in the
next sections, along with some initial numerical results collected by several
distributed computations.

L. Tarricone and A. Esposito (eds.), Advances in Information Technologies for Electromagnetics, 381-397.
© 2006 Springer. Printed in the Netherlands.
2. WEB SERVICES IN DISTRIBUTED SAR
MODELLING AND SIGNAL PROCESSING

2.1 Platform Architecture

The described platform is based on the client-server mechanism. It
integrates the following components:
• A Web interface for managing nodes in the distributed system: each user
  can introduce a new processing unit or remove an existing one from the
  distributed computational environment via simple HTML pages, which
  are produced by Java Server Page (JSP) compilations.
• The core scheduling, coordination and task allocation system, which is
  implemented as a Java servlet accessible by the Web-container.
• Client nodes, capable of running native codes with the use of Java Native
  Interface calls.
• An information transfer subsystem for exchanging input/output data in
  SOAP/XML format, based on the Apache AXIS library.
• The standard, underlying TCP/IP network infrastructure, used by all
  concurrent networking technologies.
Due to the service-oriented nature of the platform, these components are
implemented as Web Services, resident in the two conceptual entities: the
clients, which run the simulations, and the server, which monitors the
platform and collects the results. However, the role of each entity during
communication is interchangeable, meaning that client nodes provide Web
Services too. The developed services are as follows:

1. Server node
   • Client node registration
   • Provision of input files to the clients
   • Reception and/or processing of output files
   • Scheduling

2. Client nodes
   • Resource management
   • Transportation of output files
   • Execution of native codes

Since portability and reusability are of vital importance in this platform,
the development of all system modules focuses on component independence.
Each module has a black-box perception of other interconnected parts. Every
piece of information that is necessary for the correct operation of a service
is provided as a parameter at runtime. This development method is greatly
assisted by the extended use of Java interface objects. In addition, the
services of the administration node communicate with each other as if they
were running on remote machines, thus offering decentralisation and load
balancing capabilities.
The method of assigning a specific task to a client node plays an
important role in the overall system performance. The most common
mechanisms are the push and the pull models. The described platform
supports both models in a hybrid implementation.
According to the push model (Fig. 13-1), a server node has access to
the resources located on remote nodes. The server node can send input files
and start job execution at the remote nodes. In order to achieve this, the
server needs to be able to track information concerning remote nodes. This is
done by using a directory service that stores information for every remote
node. The directory service exposes its functionality as a Web Service that
enables remote nodes to register and deregister themselves to the system,
update their information and query useful information for the application.

[Figure 13-1: block diagram. Component labels: Input Provider Service, Output Receiver Service, File Manager, Input Repository, Output Repository, Scheduler, Node Manager, Node Register Service, Remote Input Service, Task Executing Service, Task Collector, Class Loader, Resource Manager Service, Main Thread, Server Node.]

Figure 13-1. Push Model Architecture.

In addition, the server itself needs to implement a scheduling mechanism
that can take decisions regarding the best available node in the distributed
environment. In the described platform, this algorithm is a plain search for the
next node that does not run a task and has available memory for the
scheduled one. Naturally, more sophisticated algorithms may be used,
although not always with better results.
The main difference between the push and the pull model is that the
latter does not include scheduling and resource control (Fig. 13-2). It is the
responsibility of the remote node's administrator to decide when resources
are available for job execution.
When the central server provides no resource control mechanisms,
the application needs to provide the necessary mechanisms for resource
monitoring and management. This way, the applications require built-in
intelligence to identify when to pause, resume or stop their execution. In the
platform described in this section, each task runs as a separate thread at the
client nodes, thus making it easy to perform this operation. The drawback of
this solution is that the platform's hardware requirements are increased.

[Figure 13-2: block diagram. Component labels: Input Provider Service, Output Receiver Service, Source Provider Service, File Manager, Class Loader, Input Repository, Output Repository, Source Repository, Node Manager, Node Register Service, Main Thread, Server Node.]

Figure 13-2. Pull Model Architecture.

The developed system is capable of supporting both models, by featuring
a multi-functional server mechanism (Fig. 13-3). More specifically, the
controller can offer tasks upon client request, following the pull model,
while the featured scheduling module assigns jobs to push-oriented nodes.
In other words, the diversity is handled by the server. In other
implementations, this multi-functional mechanism is integrated in the
clients.
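
The hybrid behaviour described above can be illustrated with a minimal sketch, assuming that subtasks are identified by plain strings and that both modes draw from one shared queue. The class and method names (`HybridDispatcher`, `requestTask`, `pushTo`) are hypothetical and are not taken from the platform's code.

```java
import java.util.ArrayDeque;
import java.util.Collection;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only: names are hypothetical, not the platform's API. */
final class HybridDispatcher {
    // Subtasks that have not been handed to any node yet.
    private final Deque<String> pendingTasks = new ArrayDeque<>();

    HybridDispatcher(Collection<String> tasks) {
        pendingTasks.addAll(tasks);
    }

    /** Pull model: a client asks for work; returns null when nothing is left. */
    synchronized String requestTask() {
        return pendingTasks.poll();
    }

    /** Push model: the server hands one task to each idle node it controls. */
    synchronized Map<String, String> pushTo(List<String> idleNodes) {
        Map<String, String> assignments = new LinkedHashMap<>();
        for (String node : idleNodes) {
            String task = pendingTasks.poll();
            if (task == null) {
                break; // no work left to assign
            }
            assignments.put(node, task);
        }
        return assignments;
    }
}
```

Because both entry points are synchronized on the same object, a subtask is handed out at most once regardless of which model a given node follows.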

[Figure 13-3: block diagram. Component labels: Input Provider Service, Output Receiver Service, File Manager, Input Repository, Output Repository, Scheduler, Node Manager, Node Register Service, Remote Input Service, Task Executing Service, Task Collector, Class Loader, Resource Manager Service, Main Thread, Server Node.]

Figure 13-3. Overview of the platform's architecture; each service can be implemented
independently of the others.

2.2 Server Services

2.2.1 Node Management Service

One of the most important advantages of the platform is its dynamic
environment. New nodes can be added or removed on demand so that more
resources can be added to the environment or parts of the grid that cause
problems or delay the process can be excluded. The node management
service is responsible for this work. The users can perform management
operations using a simple Web interface (Fig. 13-4).

Figure 13-4. Node Management Web Interface.

The node manager acts as a directory service, providing information
about the registered nodes. In order to register a new node, the user must
provide information concerning the Uniform Resource Locator (URL) of
the node's services. This data is used to identify each node and to help
applications locate remote resources. The platform's scheduler relies
heavily on the node management service because it requires data concerning the
available resources in order to distribute work submitted to the system
efficiently to the available nodes.
One important feature of the node management service is that it
continuously checks if a registered node is accessible. Although a node
might be registered, there is always a chance that it will fail to unregister because
of a fault. This fault could be either a network problem or node failure. The
node manager polls the registered nodes to see whether they are alive or not.
If a node does not respond, it is removed from the list of the nodes, and the
work submitted to it is sent to another node.
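
The registration and polling behaviour can be sketched as follows. This is a minimal illustration under stated assumptions: nodes are identified by their service URL, and the liveness check is passed in as a predicate so that the example does not depend on any network call. `NodeDirectory` and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;

/** Illustrative node directory; class and method names are hypothetical. */
final class NodeDirectory {
    // Registered nodes, identified by service URL, in registration order.
    private final Set<String> nodes = new LinkedHashSet<>();

    synchronized void register(String serviceUrl) {
        nodes.add(serviceUrl);
    }

    synchronized void deregister(String serviceUrl) {
        nodes.remove(serviceUrl);
    }

    synchronized List<String> registered() {
        return new ArrayList<>(nodes);
    }

    /**
     * Polls every registered node once and removes those that do not respond.
     * Returns the removed URLs so the caller can resubmit their work elsewhere.
     */
    synchronized List<String> pollOnce(Predicate<String> isAlive) {
        List<String> unresponsive = new ArrayList<>();
        for (String url : nodes) {
            if (!isAlive.test(url)) {
                unresponsive.add(url);
            }
        }
        nodes.removeAll(unresponsive);
        return unresponsive;
    }
}
```

In a real deployment the predicate would wrap a remote call with a timeout; the list returned by `pollOnce` is what a scheduler would use to reassign the work of lost nodes.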

2.2.2 Input Provider Service

The input provider service is responsible for managing input data that
will be provided for processing to the remote nodes. This data is stored in a
file system as separate files. Input data for each application is stored as a set
of files, each one containing data that can be submitted independently. A file
manager is responsible for managing the files stored in the file system.
The input provider is designed based on the singleton design pattern.
According to this pattern, only one instance of the provider exists any time,
ensuring that only a unique file manager is available. This approach prevents
the case of multiple input providers accessing the same files, thus avoiding
submission of the same input data more than one time.
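
A minimal sketch of the singleton pattern applied to an input provider is given below. It assumes that input files are identified by name and handed out in order; the class and method names are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

/** Illustrative singleton; names are hypothetical, not the platform's API. */
final class InputProvider {
    // The single instance, created once when the class is loaded.
    private static final InputProvider INSTANCE = new InputProvider();

    // Input files that have not been submitted to any node yet.
    private final Deque<String> pendingFiles = new ArrayDeque<>();

    private InputProvider() {
        // Private constructor: no second instance can be created.
    }

    static InputProvider getInstance() {
        return INSTANCE;
    }

    synchronized void addInputFiles(List<String> fileNames) {
        pendingFiles.addAll(fileNames);
    }

    /** Hands out each input file at most once; returns null when none are left. */
    synchronized String nextInputFile() {
        return pendingFiles.poll();
    }
}
```

Since every caller obtains the same object and `nextInputFile` is synchronized, two concurrent requests cannot receive the same file.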

2.2.3 Output Receiver Service

Results from data processing are stored in output files in each remote
node. In order to assemble the final results it is necessary to collect these
partial results and to compile them into a final file containing all of the data. In
order to achieve this, a service responsible for accepting result data is
deployed on the server. Whenever a task finishes on one of the nodes, the
file that contains the results is sent back to the server by using the Output
Receiver Service. This service stores the output data and makes them
available to the user.

2.2.4 Scheduler

The Scheduler is the component that is responsible for task submission to
the remote nodes. The load balancing scheme that it uses relies on a common
technique that simulates the benefits of automated dynamic load balancing in
a heterogeneous distributed computing environment with a relatively small
overhead. More specifically, the required task is divided into many small
subtasks (this is often feasible in parametric simulations), so that each
processing node actually performs many subtasks in succession. This way, a
slower processor is automatically given a smaller number of subtasks
compared to a faster one, and the idle time cannot exceed the worst subtask
execution time. The algorithm that assigns a task to a specific node is
described in Fig. 13-5. Information about the available nodes is provided by
the node management service.
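
The self-balancing effect of small subtasks can be checked with a short, idealised simulation. The sketch assumes a fixed execution time per subtask on each node and ignores communication and scheduling overhead; `SubtaskBalancing` and its methods are hypothetical names introduced for illustration.

```java
import java.util.PriorityQueue;

/** Idealised model: fixed time per subtask, no communication cost. */
final class SubtaskBalancing {
    /** Returns how many subtasks each node executes when idle nodes take the next one. */
    static int[] assign(double[] secondsPerSubtask, int subtasks) {
        int[] counts = new int[secondsPerSubtask.length];
        double[] freeAt = new double[secondsPerSubtask.length];
        // Nodes ordered by the time at which they become idle (ties: lower index first).
        PriorityQueue<Integer> idle = new PriorityQueue<>((a, b) ->
                freeAt[a] != freeAt[b] ? Double.compare(freeAt[a], freeAt[b])
                                       : Integer.compare(a, b));
        for (int i = 0; i < secondsPerSubtask.length; i++) {
            idle.add(i);
        }
        for (int t = 0; t < subtasks; t++) {
            int node = idle.poll();          // next node to become idle
            counts[node]++;
            freeAt[node] += secondsPerSubtask[node];
            idle.add(node);                  // re-queue with its new idle time
        }
        return counts;
    }

    /** Total duration: the time at which the last node finishes. */
    static double makespan(double[] secondsPerSubtask, int[] counts) {
        double latest = 0.0;
        for (int i = 0; i < counts.length; i++) {
            latest = Math.max(latest, counts[i] * secondsPerSubtask[i]);
        }
        return latest;
    }
}
```

With 30 subtasks and two nodes needing 1 s and 2 s per subtask, the faster node ends up with 20 subtasks and the slower one with 10, without the scheduler knowing the node speeds in advance.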
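[Figure 13-5: flowchart. Initialisation: bestNode := -1; N := the number of registered nodes; I := 1. While I ≤ N: find the next registered node, connect to its resource control service, and check "Is another task executed?" and "Is there any more free memory?"; a node that executes no other task and has free memory is recorded as bestNode := I. When I > N: if bestNode = -1, there is no available node; otherwise the best node is bestNode.]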

Figure 13-5. The task allocation flowchart presents the algorithm used to assign a task to the
best available node.
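
The allocation procedure of Fig. 13-5 can be sketched in Java as follows. Two points are assumptions rather than facts from the source: the search is taken to stop at the first suitable node, and node indices are 0-based here whereas the flowchart counts from 1. The `NodeStatus` interface stands in for whatever the resource control service of a node reports; all names are hypothetical.

```java
import java.util.List;

/** Illustrative first-fit node selection; names are hypothetical. */
final class TaskAllocator {
    /** What a node's resource control service is assumed to report. */
    interface NodeStatus {
        boolean isRunningTask();
        long freeMemoryBytes();
    }

    /** Convenience factory for a fixed status snapshot. */
    static NodeStatus status(final boolean running, final long freeMemoryBytes) {
        return new NodeStatus() {
            public boolean isRunningTask() { return running; }
            public long freeMemoryBytes() { return freeMemoryBytes; }
        };
    }

    /** Returns the index of the first idle node with enough memory, or -1 if none. */
    static int findBestNode(List<? extends NodeStatus> nodes, long requiredMemoryBytes) {
        int bestNode = -1;
        for (int i = 0; i < nodes.size() && bestNode == -1; i++) {
            NodeStatus s = nodes.get(i);
            boolean idle = !s.isRunningTask();
            boolean fits = s.freeMemoryBytes() >= requiredMemoryBytes;
            if (idle && fits) {
                bestNode = i;
            }
        }
        return bestNode;
    }
}
```

A return value of -1 corresponds to the "There is no available node" branch of the flowchart.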
2.3 Node Services

2.3.1 Resource Manager Service

In order to provide better load balancing mechanisms, it is necessary to
observe the status of the available resources on each node. Each node has a
resource manager that keeps track of information about the number of tasks
currently running on the node, available memory etc. The node management
service on the server polls each registered node for this information so that it
can provide the scheduler with information about the available resources on
each node.

2.3.2 Task Executing Service

The primary goal of the platform is to enable execution of tasks at remote
nodes. Applications developed for this platform can be implemented in two
ways. The first one is as pure Java classes. This approach enables execution
of code in different environments, taking advantage of Java's platform
independence. The drawbacks of this choice are that the application must be
coded or re-written in Java and that Java is slower than native code. In
order to bypass these problems, we use the second approach, which relies on the
Java Native Interface (JNI).
JNI is a set of functions that enables Java programs to use code libraries
or applications developed in other programming languages. The task code is
developed in a different programming language (e.g. C or Fortran) as a
library, usually a dynamically linked one (e.g. a DLL on Windows). This code is
executed from Java by using the JNI functions. The Java classes implement
a specific interface that defines tasks, and are packed with the native code into
a Java Archive (JAR) file.
The JAR file containing the task's code is submitted to the remote node
by using the Task Executing Service. This service accepts Java classes used
to define tasks. The Task Executing Service uses a custom class loader in
order to dynamically load and execute the task code. Upon receipt of a new
task, a new thread spawns, loads the Java class and starts the execution of
the task. By using this approach we can take advantage of the speed of
native code and the dynamic class loading and execution mechanisms that
Java provides.
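
The load-and-run mechanism can be sketched with standard Java classes. This is only an illustration: the `Task` interface, its `execute` signature and the `TaskRunner` helper are hypothetical, and the standard `URLClassLoader` is used here in place of the platform's own custom class loader.

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.util.function.Consumer;

/** Illustrative only: the Task contract and helper names are hypothetical. */
final class TaskRunner {
    /** Assumed contract that every submitted task class implements. */
    interface Task {
        String execute(String input) throws Exception;
    }

    /** Loads a task class by name from a received JAR and instantiates it. */
    static Task loadTask(URL jarLocation, String className) throws Exception {
        // The parent loader must be the one that defines Task, so the cast below succeeds.
        URLClassLoader loader =
                new URLClassLoader(new URL[] {jarLocation}, Task.class.getClassLoader());
        return (Task) loader.loadClass(className).getDeclaredConstructor().newInstance();
    }

    /** Runs the task in its own thread and reports the outcome to the callback. */
    static Thread spawn(Task task, String input, Consumer<String> onResult) {
        Thread worker = new Thread(() -> {
            try {
                onResult.accept(task.execute(input));
            } catch (Exception e) {
                onResult.accept("FAILED: " + e);
            }
        });
        worker.start();
        return worker;
    }
}
```

A task backed by native code would implement `execute` by calling a method declared with the `native` keyword; that part is omitted because it depends on the specific library.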
2.3.3 Remote Input Service

The last client service is the Remote Input Service. It is responsible for
receiving task code and input data from the server. The received data is
stored locally. If the data is a JAR file, it is used by the Task Executing
Service to load the code. In the case of files containing input data, the tasks
are responsible for loading this data.

2.4 Other Issues

The distributed, Web-Services-oriented framework is able to execute
massively parallel parametric simulations by transporting files containing
parameters and results. These files must be encoded in XML format, the
standard format for Web Service messages. An important issue arises
when considering the binary nature of the input and output files. The XML
specification does not support binary content. This crucial characteristic
depends only on the running CEM code, which is considered a black-box
entity in this platform.
In order to resolve this issue, it is necessary to convert any kind of binary
data into XML format, so that it can be inserted in a SOAP message. The
transparent conversion of such binary information is performed with the aid
of the Castor library. The Castor tool transforms any kind of Java Bean into
a pure ASCII XML representation and vice versa, according to user-
specified rules when needed. This additional feature extends the cross-
platform capabilities of the service, since the manipulation of XML data can
be performed by any kind of executable code. Naturally, such processing
affects the overall speed of the application in a way that is studied during the
simulation tests.
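
To make the binary-to-text issue concrete, the following sketch embeds raw bytes in an XML element using Base64 from the Java standard library. This is a stand-in chosen for illustration and does not reproduce Castor's bean mapping; the element name `outputFile` and its `encoding` attribute are hypothetical.

```java
import java.util.Base64;

/** Illustrative only: Base64 is used here instead of the Castor mapping. */
final class BinaryInXml {
    /** Wraps arbitrary bytes in a text-only XML element. */
    static String wrap(byte[] payload) {
        String text = Base64.getEncoder().encodeToString(payload);
        return "<outputFile encoding=\"base64\">" + text + "</outputFile>";
    }

    /** Recovers the original bytes from an element produced by wrap(). */
    static byte[] unwrap(String xml) {
        int start = xml.indexOf('>') + 1;      // first character after the opening tag
        int end = xml.lastIndexOf("</");       // start of the closing tag
        return Base64.getDecoder().decode(xml.substring(start, end));
    }
}
```

Base64 text is about one third larger than the binary data it represents, so any text encoding of this kind adds to both the message size and the processing time.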
Each class of user requests can be managed by a different server (service
node), distributing efficiently the network and processing load. If a fatal
error renders a node unusable, the architecture may recover by assigning the
task to a different node. These qualities are greatly appreciated in a multi-
purpose, sophisticated platform, where uptime, fault tolerance and service
availability are of vital importance.

2.5 Imaging Radar Signal Processing

The testing of the framework includes the distributed processing of
simulated Synthetic Aperture Radar (SAR) raw data. The simulated SAR
sensor features pseudo-random BPSK modulation. The 1023-bit long
transmitted sequence is generated by a linear feedback shift register with 10
flip-flops [Skolnik, 1981]. The synthetic aperture includes several thousand
radar pulses. The SAR homodyne receiver performs matched filtering (and
therefore range resolution) via a bank of despreading processors, which take
advantage of the excellent auto-correlation properties of the pseudorandom
sequence in order to focus on the specified range/round-trip delay. In fact,
this system exhibits the feature of parallelisation intrinsically, since the
despreading processor bank distributes the range resolution task.
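
The sequence generation and its correlation property can be reproduced with a short sketch. The chapter does not state the feedback taps or the initial register contents; the code below assumes feedback from stages 3 and 10 and an all-ones start, a standard maximal-length configuration for a 10-stage register. `PnSequence` and its methods are hypothetical names.

```java
import java.util.Arrays;

/** Illustrative 10-stage LFSR; tap positions and seed are assumptions. */
final class PnSequence {
    /** Generates one full period (1023 chips, values 0 or 1). */
    static int[] generate() {
        int[] reg = new int[10];
        Arrays.fill(reg, 1);                       // any non-zero seed works
        int[] chips = new int[1023];
        for (int k = 0; k < chips.length; k++) {
            chips[k] = reg[9];                     // output taken from stage 10
            int feedback = reg[2] ^ reg[9];        // XOR of stages 3 and 10
            System.arraycopy(reg, 0, reg, 1, 9);   // shift by one stage
            reg[0] = feedback;
        }
        return chips;
    }

    /** Periodic correlation of the BPSK (+1/-1) sequence with a shifted copy. */
    static int correlate(int[] chips, int lag) {
        int sum = 0;
        for (int k = 0; k < chips.length; k++) {
            int a = 1 - 2 * chips[k];
            int b = 1 - 2 * chips[(k + lag) % chips.length];
            sum += a * b;
        }
        return sum;
    }
}
```

For a maximal-length sequence the correlation is 1023 at zero lag and -1 at every other lag, which is the property that lets each despreading processor isolate a single round-trip delay.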
Target resolution along track is achieved by classic SAR methods, which
take advantage of the phase history, in order to produce the final
reconstructed image. The reconstruction algorithm includes FFT
transformations of the filtered signal. This simple initial design does not account
for range migration and the FFT calculations are performed by a single
node. The illuminated area is also simulated for testing purposes. It consists
of several discrete targets and the SAR system is asked to identify
their location, size and Radar Cross Section (RCS).
There are several scenarios of decomposing this task into subtasks that
can be carried out in a parallel way (Fig. 13-6):
• Each node processes all synthetic aperture pulses, but is responsible for
  one of all range buckets, in other words a strip along track. In a way this
  method simulates the bank of despreading processors. The nodes do have
  to perform the range resolution task once. No data exchange is needed
  for along-track resolution (range migration is ignored), but the matched-
  filter outputs must be appropriately superimposed, before stepping into
  FFT calculations.
• A rectangular grid is defined on the illuminated area. Each node
  processes all synthetic aperture pulses, but is responsible for one grid
  element. This way, each node performs all SAR processing steps, but for
  a much smaller area compared to the original problem. In this context,
  the superposition of matched filter outputs at the FFT processing node is
  a rather time-consuming task, due to the size of the data.
• Each node processes the illuminated area but is responsible for a portion
  of all synthetic aperture pulses. Although this approach does not fully
  carry out any SAR processing step, it is very simple to prepare for
  along-track calculation, simply by concatenating the data from each
  processing node.
Due to the fact that along-track resolution is performed by a single
processor (the simulated target area is very small), the most convenient
parallelisation method for this application involves division of synthetic
aperture pulses. A single node may be responsible for assembling the
computed data after the simulation and performing the final reconstruction
step.
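
The division of synthetic aperture pulses and the later assembly step can be sketched as follows, assuming that pulses are numbered from 0 and that each subtask returns its outputs in pulse order. `PulsePartition` and its methods are hypothetical names.

```java
/** Illustrative only: contiguous pulse ranges and in-order reassembly. */
final class PulsePartition {
    /** Splits pulse indices 0..pulses-1 into contiguous [start, end) ranges. */
    static int[][] split(int pulses, int parts) {
        int[][] ranges = new int[parts][2];
        int base = pulses / parts;
        int extra = pulses % parts;            // the first 'extra' ranges get one more pulse
        int start = 0;
        for (int p = 0; p < parts; p++) {
            int size = base + (p < extra ? 1 : 0);
            ranges[p][0] = start;
            ranges[p][1] = start + size;
            start += size;
        }
        return ranges;
    }

    /** Reassembles per-range outputs in pulse order by simple concatenation. */
    static double[] concatenate(double[][] partialOutputs) {
        int total = 0;
        for (double[] part : partialOutputs) {
            total += part.length;
        }
        double[] all = new double[total];
        int offset = 0;
        for (double[] part : partialOutputs) {
            System.arraycopy(part, 0, all, offset, part.length);
            offset += part.length;
        }
        return all;
    }
}
```

Choosing many more ranges than nodes keeps the subtasks small, which is what allows the scheduler to balance the load across heterogeneous machines.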
Figure 13-6. Distribution of the synthetic aperture imaging radar reconstruction problem in
the Web-Services Platform.

2.6 The Simulation Mechanism

The Web Services infrastructure was tested in the distributed parametric
simulation of Synthetic Aperture Radar (SAR) signal processing and image
reconstruction. The actual code that performed the SAR procedures was
implemented in C as a dynamically loaded native
library (*.DLL on Win32 platforms, lib*.so on Linux/UNIX systems) (Fig.
13-7) and the Java Native Interface provided access to its functions.
Naturally, for each operating system (OS), it is necessary for a corresponding
version of the native library to be present. However, the platform
code itself can call native library functions transparently, without need to
recognise the local OS, since JNI resolves library names automatically.
Figure 13-7. Use of the Java Native Interface for accessing C/C++ functions in existing
native libraries.
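
The platform-independent library naming can be seen directly from the standard Java API. In the sketch below, the library name `sar` and the native method `despread` are hypothetical; only `System.loadLibrary` and `System.mapLibraryName` are standard calls.

```java
/** Illustrative JNI wrapper; the library and native method are hypothetical. */
final class SarNative {
    /** Declared in Java, implemented in the native library. */
    static native double[] despread(double[] rawPulse);

    /** Loads the library by its platform-neutral name (must be on java.library.path). */
    static void load() {
        System.loadLibrary("sar");
    }

    /** Shows the file name the JVM derives from the neutral name on this OS. */
    static String platformFileName() {
        return System.mapLibraryName("sar");
    }
}
```

On Windows `platformFileName()` yields `sar.dll` and on Linux `libsar.so`, so the calling Java code stays the same while only the native binary differs per operating system.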

The algorithm for the distribution of processing is divided in the
following steps:
1. The user begins interaction with the Web Service via simple JSP Web
pages and can either register his machine as a grid node, or initiate a
distributed application on the available nodes. The next steps assume that
a user starts a new application.
2. The application input is parsed from a file by the User Interface (UI)
subsystem.
3. The main thread contacts the scheduler, which checks the registered
nodes for availability.
4. The main thread contacts the remote input service of each node, which
automatically creates the remote input files and assigns the
corresponding task.
5. Each node that is assigned the task starts computation by calling JNI
functions. The native processes are monitored via stream capturing.
6. The scheduler polls for new/free nodes and assigns remaining tasks
accordingly.
7. After each node ends computation, results are returned to the scheduler,
which is responsible for gathering and re-formatting the data.
Several simulations have been performed with the above-mentioned
demo application in heterogeneous environments. The outcome depends on
several parameters that may be classified in three main categories, according
to their origin: node grid structure, platform implementation and the
application itself. The parameters that are defined by the node grid structure
are:
• number of nodes
• networking infrastructure
• node processing load
• node processing power (CPU type, memory size)
• main server processing power
The platform implementation specifies two additional parameters:
• Time interval between checks for the existence of new files
• Task allocation frequency
Finally, several parameters depend on the distributed application itself
and cannot be accounted for a-priori in the infrastructure:
• Computational complexity of a single task
• Input file size
• Output file size
• Deviations in the above-mentioned values

The combinations of all parameter values are endless. Therefore most
parameters are kept constant during simulations, especially
application-specific properties. The SAR signal processing simulator requires a small
input file with application variables (160 bytes, but may vary) and a file
containing the simulated target area RCS distribution (exactly 242 bytes for
this application). Each output file contains decimal values of simulated
despreading processor outputs, which reached 48500 bytes (this too may
vary by no more than 1% due to the variable number of decimal digits). The
resulting measurements for the computation duration appear in the following
tables.
It is evident that the division in small subtasks handles the significant
differences in underlying hardware configurations successfully. In addition,
the registration of new nodes during processing has proved the ease and
flexibility of the foundation technologies (SOAP/XML). The capability of
handling native code has prevented the service from experiencing the
significant cost of programming CEM applications in Java.

2.7 Results and Conclusions

Although the platform is capable of handling large-scale distributed
simulations, the initial test results were obtained with a small set of
heterogeneous nodes. Their hardware specifications are described in Table
13-1, along with their individual performance for the demonstration problem
(divided in 30 subtasks). It is evident from these results that the Symmetric
Multi-Processors (SMP) feature of node N2 is used efficiently and that AMD
AthlonXP processors perform significantly better for this particular
simulation. The diversity of hardware results in great differences in
performance, a fact that does not allow reliable speed-up measurement.
However, it is interesting to study whether the platform does take advantage
of the best node, as expected in theory.

Table 13-1. Hardware used for measurement and performance (execution of 30 tasks).
Node Id   CPU                           O/S       RAM (MB)   Time (sec)
N1        AMD Athlon 1GHz               Windows   256        T1 = 169
N2        2xIntel PII 400MHz SMP        Linux     256        T2 = 315
N3        Intel PIII 1GHz               Windows   512        T3 = 290
N4        Intel PIII 933MHz             Windows   768        T4 = 294
N5        AMD AthlonXP 1700+ 1.16GHz    Linux     768        T5 = 139

Table 13-2. Speed-up tests with 2, 3 and 4 nodes (heterogeneous grid).
Nodes               Duration (sec)   Optimal (sec)   Speed-up
{N1, N2}            128              110             1.719
{N2, N3, N4}        108              89.31           2.48
{N2, N3, N4, N5}    59               58.08           3.94

Table 13-2 shows results from distributed runs of the same problem. The
optimal value is the time needed if all nodes work continuously and
concurrently without stop, according to their individual metrics. For
example, for nodes N1+N2 the optimal processing ratio x1 (the fraction of
the work assigned to N1) would be such that:

\[
T_1 x_1 = T_2 (1 - x_1) = T_{\text{optimal}}^{\{N_1, N_2\}} = \frac{T_1 T_2}{T_1 + T_2}
\]

The speed-up metric can now be re-defined, in order to incorporate the
node inhomogeneity. For this simple, two-node case, it is:

\[
\text{speedup} = \frac{2\, T_{\text{optimal}}^{\{N_1, N_2\}}}{T_{\text{actual}}^{\{N_1, N_2\}}}
\]

All theoretical values have been calculated this way, in order to take the
platform's lack of homogeneity into account and produce useful metrics. The
general equations for more than two nodes form a simple linear system with
respect to the ratios xi. In compact form, these are:

\[
T_1 x_1 = T_2 x_2 = \dots = T_M \Bigl(1 - \sum_{j=1}^{M-1} x_j\Bigr) = T_{\text{optimal}}^{\{N_1, N_2, \dots, N_M\}}
\]

The result for the accurate speed-up metric in general form is:

\[
\text{speedup} = \frac{M\, T_{\text{optimal}}^{\{N_1, N_2, \dots, N_M\}}}{T_{\text{actual}}^{\{N_1, N_2, \dots, N_M\}}}
\]

The comparative results of Table 13-2 show that the platform handles
heterogeneity successfully, by assigning more tasks to better nodes, while
there were no problems when dealing with totally different operating
systems.
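
The optimal time and the speed-up metric defined above can be evaluated numerically. Since every node is busy for the same time T_optimal = T_i x_i and the ratios x_i sum to one, the linear system reduces to T_optimal = 1 / (1/T_1 + ... + 1/T_M). The sketch below applies this to the standalone times of Table 13-1; the class and method names are hypothetical.

```java
/** Illustrative evaluation of the heterogeneous speed-up metric. */
final class HeterogeneousSpeedup {
    /** Optimal time when all nodes work concurrently without idling: 1 / sum(1/T_i). */
    static double optimalTime(double... standaloneTimes) {
        double combinedRate = 0.0;
        for (double t : standaloneTimes) {
            combinedRate += 1.0 / t;   // each node contributes work at rate 1/T_i
        }
        return 1.0 / combinedRate;
    }

    /** Speed-up as defined in the text: M * T_optimal / T_actual. */
    static double speedup(double actualTime, double... standaloneTimes) {
        return standaloneTimes.length * optimalTime(standaloneTimes) / actualTime;
    }
}
```

For {N1, N2}, `optimalTime(169, 315)` is about 110 s and `speedup(128, 169, 315)` about 1.719; for {N2, N3, N4, N5} the values are about 58.08 s and 3.94, in agreement with Table 13-2. For {N2, N3, N4} the same formula gives roughly 99.8 s, which differs from the 89.31 s listed in the table; the sketch makes no attempt to account for that entry.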
The Web-Services based framework described in this chapter exhibits
multiple attractive features in modern CEM development: platform-
independence, ease of task implementation and running, good performance
in heterogeneous environments and great extensibility [Chiu et al., 2002].
Indeed, future enhancements of this platform may include: support for
generic distributed applications (extension to more complex, non-parametric
problems), transparent native implementation of demanding functionalities
(XML data exchange permits this in a straightforward way) or even
improvements in the JavaBeans-to-XML conversion code that has proved
perhaps the most significant overhead in the framework.

References
Box, D., et al., 2000, Simple Object Access Protocol (SOAP) 1.1, W3C, (08 May 2000);
www.w3.org/TR/2000/NOTE-SOAP-20000508/.
Chiu, K., Govindaraju, M. and Bramley, R., 2002, Investigating the limits of SOAP
performance for scientific computing, Proc. 11th IEEE Intl Symp. on High Perf. Distr.
Comp. HPDC-11, Edinburgh: 246-254.
Karre, A., 2003, A do-it-yourself framework for grid computing, JavaWorld (April 2003);
http://www.javaworld.com/javaworld/jw-04-2003/jw-0425-grid.html.
Lymperopoulos, D., Logothetis, D., Kostaridis, A. and Kaklamani, D., 2005, Grid
Computing Techniques for Distributed Processing in Computational Electromagnetics
based on the Web Services Architecture, 17th IMACS World Congress Scient. Comp. Appl.
Math. and Simul., Paris.
Skolnik, M. I., 1981, Introduction to Radar Systems, 2nd ed., McGraw-Hill, pp. 428-430.

Bibliography
Skolnik, M. I., 1981, Introduction to Radar Systems, 2nd ed., McGraw-Hill.
Chapter 14
GRID-ENABLED TRANSMISSION LINE
MATRIX (TLM) MODELLING OF
ELECTROMAGNETIC STRUCTURES

P. Russer, B. Biscontini and P. Lorenz


Technische Universität München, Munich, Germany

Abstract: The Transmission Line Matrix (TLM) method is a key numerical method in
computational electromagnetics. As a network model of Maxwell's equations
formulated in terms of the scattering of impulses, it possesses exceptional
versatility, numerical stability, robustness and isotropic wave properties.
An introduction into the three-dimensional TLM method and its algebraic
formulation is given. The modelling of complex electromagnetic structures
consisting of dielectric and conducting media is treated. The parallelization of
the TLM algorithm is performed by segmentation of the TLM state vector.
System identification and spectral analysis approaches allow a considerable
reduction of numerical effort. Numerical examples are presented.

Key words: Transmission Line Matrix; Grid Computing.

1. INTRODUCTION

The Transmission Line Matrix (TLM) method, developed and first published in
1971 by Johns and Beurle [Johns and Beurle, 1971], has emerged as a key
numerical method in computational electromagnetics for the modelling of
complex electromagnetic structures [Christopoulos, 1995; Christopoulos and
Russer, 2000a; 2000b; Hoefer, 1985; 1989; Russer, 2000]. The TLM method is
based on the analogy between the electromagnetic field and a mesh of
transmission lines [Kron, 1944]. As a network model of Maxwell's equations
formulated in terms of the scattering of impulses, it possesses exceptional
versatility, numerical stability, robustness and isotropic wave properties.

L. Tarricone and A. Esposito (eds.), Advances in Information Technologies for Electromagnetics, 399–431.
© 2006 Springer. Printed in the Netherlands.

In TLM, space and time are discretized. Space is subdivided into cells. At
the faces of every cell the tangential electric and magnetic field components
are sampled. This yields a total of 12 electric and 12 magnetic field
components per cell. The electromagnetic field is modeled by wave pulses
propagating between adjacent cells and scattered within the cells. Every TLM
cell is represented by a twelve-port. The discretized field state is
represented by a state vector summarizing the states of all TLM cells. The
field evolution is governed by linear mapping rules. The TLM algorithm
consists of the propagation of the wave amplitudes from the mesh nodes to the
neighboring nodes and the scattering of the wave amplitudes in the mesh
nodes. The propagation and the scattering of the wave amplitudes may be
expressed by operator equations.
A single computation of a pulse response yields a large amount of
information. The versatility of the TLM method allows straightforward
calculation of complex structures.

2. THE 3D-TLM METHOD

The TLM scheme has been derived from Maxwell's equations using the
finite difference approximation [Hein, 1993; Jin and Vahldieck, 1994], the
Method of Moments [Krumpholz and Russer, 1994] and the finite
integration approximation [Aidam and Russer, 1997; Peña and Ney, 1996].
In the following the TLM scheme will be introduced via the finite
integration concept.

Figure 14-1. The TLM cell.


We subdivide the space into cubic TLM cells as shown in Fig. 14-1. On every
surface of the TLM cell, samples of the tangential electric and magnetic
fields are taken. We obtain twelve electric field samples and twelve magnetic
field samples per TLM cell. The orientation of the electric and magnetic
field samples is chosen in such a way that the power flow is directed into
the TLM cell if the electric and magnetic field components have the same
sign. The electric and magnetic field components are summarized in
twelve-dimensional vectors

$$ {}_k\mathbf{E}_{l,m,n} = {}_k[E_1, E_2, \dots, E_{11}, E_{12}]^T_{l,m,n}, \tag{14.1a} $$

$$ {}_k\mathbf{H}_{l,m,n} = {}_k[H_1, H_2, \dots, H_{11}, H_{12}]^T_{l,m,n}. \tag{14.1b} $$

For a spatial discretization interval $\Delta l$ and a time discretization
interval $\Delta t$, and introducing the discrete space coordinates l, m, n
and the discrete time coordinate k, the relation between the continuous
coordinates x, y, z, t and the discrete coordinates is

$$ x = l\,\Delta l, \quad y = m\,\Delta l, \quad z = n\,\Delta l, \quad t = k\,\Delta t. \tag{14.2} $$

Figure 14-2. The wave amplitudes.

We now introduce the wave amplitude vectors

$$ {}_k\mathbf{a}_{l,m,n} = {}_k[a_1, a_2, a_3, \dots, a_{11}, a_{12}]^T_{l,m,n}, \tag{14.3a} $$

$$ {}_k\mathbf{b}_{l,m,n} = {}_k[b_1, b_2, b_3, \dots, b_{11}, b_{12}]^T_{l,m,n}, \tag{14.3b} $$

where ${}_k\mathbf{a}_{l,m,n}$ summarizes the waves incident on the TLM cell
and ${}_k\mathbf{b}_{l,m,n}$ contains the amplitudes of the waves scattered by
the TLM cell. The incident and the scattered waves propagate normal to the
tangential planes, as illustrated in Fig. 14-2. The wave amplitudes and the
field components are related via

$$ {}_k\mathbf{a}_{l,m,n} = \frac{1}{2\sqrt{Z_F}}\, {}_k\mathbf{E}_{l,m,n} + \frac{\sqrt{Z_F}}{2}\, {}_k\mathbf{H}_{l,m,n}, \tag{14.4a} $$

$$ {}_k\mathbf{b}_{l,m,n} = \frac{1}{2\sqrt{Z_F}}\, {}_k\mathbf{E}_{l,m,n} - \frac{\sqrt{Z_F}}{2}\, {}_k\mathbf{H}_{l,m,n}. \tag{14.4b} $$

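The tangential electric and magnetic field components at the cell boundaries
are summarized in

$$ {}_k\mathbf{E}_{l,m,n} = \sqrt{Z_F}\,\left( {}_k\mathbf{a}_{l,m,n} + {}_k\mathbf{b}_{l,m,n} \right), \tag{14.5a} $$

$$ {}_k\mathbf{H}_{l,m,n} = \frac{1}{\sqrt{Z_F}}\,\left( {}_k\mathbf{a}_{l,m,n} - {}_k\mathbf{b}_{l,m,n} \right), \tag{14.5b} $$

where the field impedance $Z_F$ is given by

$$ Z_F = \sqrt{\frac{\mu}{\varepsilon}}. \tag{14.6} $$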

Now we can replace the geometric model by a network model, represented by the
TLM node depicted in Fig. 14-3. We use the term TLM cell for the geometrical
object we have defined in the continuous space, whereas the term TLM node is
used for the abstract network model.

Figure 14-3. Condensed symmetric TLM node.

Figure 14-4. The TLM cell.

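We apply finite integration to the TLM cell shown in Fig. 14-4. In this way
we obtain relations between the incident waves ${}_k\mathbf{a}_{l,m,n}$ and
the scattered waves ${}_k\mathbf{b}_{l,m,n}$ of the cell (l, m, n). Ampère's
law and Faraday's law yield

$$ \oint_{\partial A_{xy}} \mathcal{H}(\mathbf{x}, t) = \frac{d}{dt} \int_{A_{xy}} \mathcal{D}(\mathbf{x}, t), \tag{14.7} $$

$$ \oint_{\partial A_{xy}} \mathcal{E}(\mathbf{x}, t) = -\frac{d}{dt} \int_{A_{xy}} \mathcal{B}(\mathbf{x}, t), \tag{14.8} $$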

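where $\mathcal{E}(\mathbf{x}, t)$ and $\mathcal{H}(\mathbf{x}, t)$ are the
electric and magnetic field differential forms [Russer, 2003] given by

$$ \mathcal{E} = E_x\,dx + E_y\,dy + E_z\,dz, \tag{14.9a} $$

$$ \mathcal{H} = H_x\,dx + H_y\,dy + H_z\,dz, \tag{14.9b} $$

and $\mathcal{D}(\mathbf{x}, t)$ and $\mathcal{B}(\mathbf{x}, t)$ are the
electric and magnetic flux density differential forms given by

$$ \mathcal{D} = D_x\,dy \wedge dz + D_y\,dz \wedge dx + D_z\,dx \wedge dy, \tag{14.10a} $$

$$ \mathcal{B} = B_x\,dy \wedge dz + B_y\,dz \wedge dx + B_z\,dx \wedge dy. \tag{14.10b} $$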

Figure 14-5. Integration path $\partial A_{1xy}$.

Figure 14-6. Integration path $\partial A_{2xy}$.

To obtain a system of twelve linear equations describing the dependence of
the amplitudes b of the scattered waves on the amplitudes a of the incident
waves, we apply both laws to six surfaces of integration $A_{1xy}$,
$A_{1yz}$, $A_{1zx}$, $A_{2xy}$, $A_{2yz}$ and $A_{2zx}$. The surface
$A_{1xy}$, shown in Fig. 14-5, is parallel to the xy-plane and goes through
the center of the TLM cell. The surfaces $A_{1yz}$ and $A_{1zx}$ are parallel
to the yz-plane and the zx-plane, respectively. The contour
$\partial A_{2xy}$ shown in Fig. 14-6 encloses four triangular leaves. In
$A_{2xy}$ neighboring leaves exhibit opposite orientation. For the time
discretization we apply a Crank-Nicolson scheme [Thomas, 1995]. That is, we
replace the time derivative by forward differences and time-dependent
quantities by the arithmetic mean of the two time steps involved. From
Eq. (14.7) and Eq. (14.8) we obtain in this way

$$ \frac{1}{2} \oint_{\partial A_{iuv}} \left( {}_k\mathcal{H} + {}_{k-1}\mathcal{H} \right) = \frac{1}{\Delta t} \int_{A_{iuv}} \left( {}_k\mathcal{D} - {}_{k-1}\mathcal{D} \right), \tag{14.11a} $$

$$ \frac{1}{2} \oint_{\partial A_{iuv}} \left( {}_k\mathcal{E} + {}_{k-1}\mathcal{E} \right) = -\frac{1}{\Delta t} \int_{A_{iuv}} \left( {}_k\mathcal{B} - {}_{k-1}\mathcal{B} \right). \tag{14.11b} $$

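These integrals are computed for the surface $A_{1xy}$ shown in Fig. 14-5.
Sampling the electric and magnetic fields at the center points of the TLM
cell surfaces yields:

$$ \oint_{\partial A_{1xy}} \mathcal{H} = \Delta l\,( H_5 + H_4 + H_6 + H_3 ), \tag{14.12a} $$

$$ \int_{A_{1xy}} \mathcal{D} = \frac{\varepsilon\,\Delta l^2}{4}\,( E_5 + E_4 + E_6 + E_3 ), \tag{14.12b} $$

$$ \oint_{\partial A_{1xy}} \mathcal{E} = \Delta l\,( E_7 + E_2 - E_8 - E_1 ), \tag{14.12c} $$

$$ \int_{A_{1xy}} \mathcal{B} = \frac{\mu\,\Delta l^2}{4}\,( -H_7 - H_2 + H_8 + H_1 ). \tag{14.12d} $$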

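Inserting Eq. (14.12a) and Eq. (14.12b) into Eq. (14.11a) and considering
$\varepsilon = 1/(Z_F c)$ we obtain

$$ \begin{aligned}
& {}_k H_5 + {}_{k-1} H_5 + {}_k H_4 + {}_{k-1} H_4 + {}_k H_6 + {}_{k-1} H_6 + {}_k H_3 + {}_{k-1} H_3 \\
& \quad = \frac{\Delta l}{2 Z_F c\,\Delta t} \left( {}_k E_5 - {}_{k-1} E_5 + {}_k E_4 - {}_{k-1} E_4 + {}_k E_6 - {}_{k-1} E_6 + {}_k E_3 - {}_{k-1} E_3 \right).
\end{aligned} \tag{14.13} $$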

The ratio of the space discretization interval $\Delta l$ to the time
discretization interval $\Delta t$ is selected as

$$ \frac{\Delta l}{\Delta t} = 2c. \tag{14.14} $$

With Eq. (14.5a) and Eq. (14.5b) this yields

$$ {}_k b_5 + {}_k b_4 + {}_k b_6 + {}_k b_3 = {}_{k-1} a_5 + {}_{k-1} a_4 + {}_{k-1} a_6 + {}_{k-1} a_3. \tag{14.15} $$

Inserting Eq. (14.12c), Eq. (14.12d) and Eq. (14.14) into Eq. (14.11b)
and considering $\mu = Z_F / c$ we obtain

$$ \begin{aligned}
& {}_k E_7 + {}_{k-1} E_7 + {}_k E_2 + {}_{k-1} E_2 - {}_k E_8 - {}_{k-1} E_8 - {}_k E_1 - {}_{k-1} E_1 \\
& \quad = \frac{Z_F\,\Delta l}{2 c\,\Delta t} \left( {}_k H_7 - {}_{k-1} H_7 + {}_k H_2 - {}_{k-1} H_2 - {}_k H_8 + {}_{k-1} H_8 - {}_k H_1 + {}_{k-1} H_1 \right).
\end{aligned} \tag{14.16} $$

We insert Eq. (14.5a) and Eq. (14.5b) and obtain

$$ {}_k( b_7 + b_2 - b_8 - b_1 ) = -\,{}_{k-1}( a_7 + a_2 - a_8 - a_1 ). \tag{14.17} $$

We now perform similar integrations over the surfaces $A_{1yz}$ and
$A_{1zx}$ and obtain together with the above equations

$$ {}_k( b_5 + b_4 + b_6 + b_3 ) = {}_{k-1}( a_5 + a_4 + a_6 + a_3 ), \tag{14.18a} $$

$$ {}_k( b_9 + b_8 + b_{10} + b_7 ) = {}_{k-1}( a_9 + a_8 + a_{10} + a_7 ), \tag{14.18b} $$

$$ {}_k( b_1 + b_{12} + b_2 + b_{11} ) = {}_{k-1}( a_1 + a_{12} + a_2 + a_{11} ), \tag{14.18c} $$

$$ {}_k( b_7 + b_2 - b_8 - b_1 ) = -\,{}_{k-1}( a_7 + a_2 - a_8 - a_1 ), \tag{14.18d} $$

$$ {}_k( b_{11} + b_6 - b_{12} - b_5 ) = -\,{}_{k-1}( a_{11} + a_6 - a_{12} - a_5 ), \tag{14.18e} $$

$$ {}_k( b_3 + b_{10} - b_4 - b_9 ) = -\,{}_{k-1}( a_3 + a_{10} - a_4 - a_9 ). \tag{14.18f} $$

The first-order finite difference scheme obtained in this way from Ampère's
law and Faraday's law comprises only six equations. To obtain six additional
equations independent of the above ones, we integrate Eq. (14.11a) and
Eq. (14.11b) over the area $A_{2xy}$ drawn in Fig. 14-6 and over the areas
$A_{2yz}$ and $A_{2zx}$. The parts of the path $\partial A_{2uv}$ crossing
the cell diagonally contribute only in third order to the integral. Therefore
we need to consider only the contribution of the path in the boundary
surface. In this way we obtain a further set of six equations

$$ {}_k( b_7 - b_2 - b_8 + b_1 ) = {}_{k-1}( a_7 - a_2 - a_8 + a_1 ), \tag{14.19a} $$

$$ {}_k( b_{11} - b_6 - b_{12} + b_5 ) = {}_{k-1}( a_{11} - a_6 - a_{12} + a_5 ), \tag{14.19b} $$

$$ {}_k( b_3 - b_{10} - b_4 + b_9 ) = {}_{k-1}( a_3 - a_{10} - a_4 + a_9 ), \tag{14.19c} $$

$$ {}_k( b_5 - b_4 + b_6 - b_3 ) = -\,{}_{k-1}( a_5 - a_4 + a_6 - a_3 ), \tag{14.19d} $$

$$ {}_k( b_9 - b_8 + b_{10} - b_7 ) = -\,{}_{k-1}( a_9 - a_8 + a_{10} - a_7 ), \tag{14.19e} $$

$$ {}_k( b_1 - b_{12} + b_2 - b_{11} ) = -\,{}_{k-1}( a_1 - a_{12} + a_2 - a_{11} ). \tag{14.19f} $$

We now bring Eq. (14.18a) to Eq. (14.18f) and Eq. (14.19a) to Eq. (14.19f)
into the form

$$ M\,{}_k\mathbf{b} = L\,M\,{}_{k-1}\mathbf{a}. \tag{14.20} $$

The matrix L is a diagonal matrix with the diagonal elements $\pm 1$. It is
the scattering matrix of the symmetric condensed TLM node in its eigensystem.
The scattering matrix S of the symmetric condensed TLM node is given by

$$ S = M^{-1} L M. \tag{14.21} $$

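It can be rewritten as

$$ S = \begin{bmatrix} 0 & S_0 & S_0^T \\ S_0^T & 0 & S_0 \\ S_0 & S_0^T & 0 \end{bmatrix} \tag{14.22} $$

with the submatrices

$$ S_0 = \begin{bmatrix} 0 & 0 & \tfrac{1}{2} & -\tfrac{1}{2} \\ 0 & 0 & -\tfrac{1}{2} & \tfrac{1}{2} \\ \tfrac{1}{2} & \tfrac{1}{2} & 0 & 0 \\ \tfrac{1}{2} & \tfrac{1}{2} & 0 & 0 \end{bmatrix}. \tag{14.23} $$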

The scattering matrix S has the property $S = S^T = S^\dagger = S^{-1}$,
i.e., it is real, symmetric, Hermitian and unitary. Consequently the TLM
scheme fulfills energy conservation, reciprocity and invariance with respect
to time reversal exactly. We note that the scattering matrix S may also be
determined completely by considering only symmetry and energy conservation.
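
The stated properties of S can be checked numerically. The following is a minimal sketch in plain Python with exact rational arithmetic. It assumes the block structure of Eq. (14.22) and takes $S_0$ with entries $-1/2$ at positions (1,4) and (2,3); the helper names (`block_matrix`, `port_vector`, and so on) are illustrative and not part of the chapter.

```python
from fractions import Fraction

H = Fraction(1, 2)

# Submatrix S0 of Eq. (14.23); the two negative entries are an assumption
# about the signs of the printed matrix.
S0 = [
    [0, 0, H, -H],
    [0, 0, -H, H],
    [H, H, 0, 0],
    [H, H, 0, 0],
]


def transpose(m):
    return [list(row) for row in zip(*m)]


def matmul(a, b):
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]


def apply(m, v):
    return [sum(x * y for x, y in zip(row, v)) for row in m]


def block_matrix(blocks):
    """Assemble a matrix from a 2-D list of equally sized square blocks."""
    rows = []
    for block_row in blocks:
        for i in range(len(block_row[0])):
            rows.append([x for blk in block_row for x in blk[i]])
    return rows


def port_vector(signed_ports):
    """12-vector with +1 or -1 at the given 1-based ports (sign = sign of the entry)."""
    v = [0] * 12
    for p in signed_ports:
        v[abs(p) - 1] = 1 if p > 0 else -1
    return v


Z4 = [[0] * 4 for _ in range(4)]
S0T = transpose(S0)

# Block structure of Eq. (14.22).
S = block_matrix([
    [Z4, S0, S0T],
    [S0T, Z4, S0],
    [S0, S0T, Z4],
])

identity = [[1 if i == j else 0 for j in range(12)] for i in range(12)]

is_symmetric = S == transpose(S)            # S = S^T
is_involutory = matmul(S, S) == identity    # S = S^-1 (unitary, since S is real symmetric)
trace = sum(S[i][i] for i in range(12))     # equals (#eigenvalues +1) - (#eigenvalues -1)

v_a = port_vector([5, 4, 6, 3])      # port combination of Eq. (14.18a)
v_d = port_vector([7, 2, -8, -1])    # port combination of Eq. (14.18d)

print(is_symmetric, is_involutory, trace)
print(apply(S, v_a) == v_a, apply(S, v_d) == [-x for x in v_d])
```

Under these assumptions the port combination of Eq. (14.18a) is an eigenvector of S with eigenvalue +1 and that of Eq. (14.18d) has eigenvalue -1, in line with a diagonal matrix L whose entries are $\pm 1$.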
We consider the TLM mesh to be composed of condensed symmetric TLM nodes as
shown in Fig. 14-3, where each of the six arms is of length $\Delta l / 2$.
We assume a homogeneous lossless space with no sources. All incident and
scattered wave amplitudes at the node (l, m, n) can be summarized in the
vectors ${}_k\mathbf{a}_{l,m,n}$ and ${}_k\mathbf{b}_{l,m,n}$.
In order to describe the complete discretized mesh state, we introduce the
field state space. To the node with the discrete space coordinate (l, m, n)
at the discrete time coordinate k a base vector $|k; l, m, n\rangle$ is
assigned. The set of basis vectors $|k; l, m, n\rangle$ is orthonormal. The
orthogonality relations are given by

$$ \langle k_1; l_1, m_1, n_1 | k_2; l_2, m_2, n_2 \rangle = \delta_{k_1,k_2}\,\delta_{l_1,l_2}\,\delta_{m_1,m_2}\,\delta_{n_1,n_2}. \tag{14.24} $$

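The electric field vector $|F_E\rangle$ and the magnetic field vector
$|F_M\rangle$ combine all tangential field samples of the TLM mesh

$$ |F_E\rangle = \frac{1}{\sqrt{Z_F}} \sum_{k,l,m,n} {}_k\mathbf{E}_{l,m,n}\,|k; l, m, n\rangle, \tag{14.25a} $$

$$ |F_M\rangle = \frac{1}{\sqrt{Z_F}} \sum_{k,l,m,n} {}_k\mathbf{H}_{l,m,n}\,|k; l, m, n\rangle. \tag{14.25b} $$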

All incident and scattered wave amplitudes of the TLM mesh can be combined
in two vectors $|a\rangle$ and $|b\rangle$, respectively

$$ |a\rangle = \sum_{k,l,m,n} {}_k\mathbf{a}_{l,m,n}\,|k; l, m, n\rangle, \tag{14.26a} $$

$$ |b\rangle = \sum_{k,l,m,n} {}_k\mathbf{b}_{l,m,n}\,|k; l, m, n\rangle. \tag{14.26b} $$

Since all tangential electric and magnetic field components in each cell
boundary surface are also specified in the neighboring cell boundary
surfaces, only twelve field components per TLM cell are linearly
independent. Specifying, e.g., all twelve incident wave amplitudes per TLM
cell yields a complete description of the field state. The time shift
operator $T_S$ and its Hermitian conjugate $T_S^\dagger$ increment or
decrement k by 1, i.e., they shift the field state by $\Delta t$,

$$ T_S\,|k; l, m, n\rangle = |k + 1; l, m, n\rangle, \tag{14.27a} $$

$$ T_S^\dagger\,|k; l, m, n\rangle = |k - 1; l, m, n\rangle. \tag{14.27b} $$

Since a time delay $\Delta t$ occurs in connection with every scattering
process, the simultaneous scattering at all TLM mesh nodes is described by
the operator equation

$$ |b\rangle = T_S\,S\,|a\rangle. \tag{14.28} $$

To describe the passing of the wave pulses from one cell to a neighbouring
one, we define the spatial shift operators $X_S$, $Y_S$, $Z_S$ and their
Hermitian conjugates $X_S^\dagger$, $Y_S^\dagger$ and $Z_S^\dagger$. These
spatial shift operators increment and decrement the three discrete spatial
coordinates l, m and n in the same way as the operators $T_S$ and
$T_S^\dagger$ do with the discrete time coordinate k. We introduce the
connection operator

$$ \begin{aligned}
\Gamma = {} & X_S\,(\Delta_{1,2} + \Delta_{3,4}) + X_S^\dagger\,(\Delta_{2,1} + \Delta_{4,3}) \\
& + Y_S\,(\Delta_{5,6} + \Delta_{7,8}) + Y_S^\dagger\,(\Delta_{6,5} + \Delta_{8,7}) \\
& + Z_S\,(\Delta_{9,10} + \Delta_{11,12}) + Z_S^\dagger\,(\Delta_{10,9} + \Delta_{12,11}),
\end{aligned} \tag{14.29} $$

where $(\Delta_{i,j})_{m,n} = \delta_{i,m}\,\delta_{j,n}$ is a 12 × 12
matrix. The scattered wave amplitudes are incident into the neighboring TLM
cells. Assuming instantaneous propagation between adjacent cell surfaces, we
may describe the propagation of all wave amplitudes in the TLM mesh by

$$ |a\rangle = \Gamma\,|b\rangle. \tag{14.30} $$

The connection operator has the properties
$\Gamma = \Gamma^\dagger = \Gamma^{-1}$, i.e., it is Hermitian and unitary.
The two equations Eq. (14.28) and Eq. (14.30) describe the complete TLM
scheme. The formal solution of difference schemes may be simplified by using
the z-transform [Gentili et al., 1998], [Smith, 1987]. To apply the
z-transform to the TLM scheme [Russer and Cangellaris, 2001], we consider the
time evolution of the field in an interval from $k_1$ to $k_2$. We introduce
the z-transforms $|a(z)\rangle$ and $|b(z)\rangle$ via

$$ |a(z)\rangle = \sum_{k=k_1}^{k_2} \frac{1}{z^k}\,\langle k | a \rangle, \tag{14.31a} $$

$$ |b(z)\rangle = \sum_{k=k_1}^{k_2} \frac{1}{z^k}\,\langle k | b \rangle. \tag{14.31b} $$

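From Eq. (14.28) and Eq. (14.30) we obtain

$$ |b(z)\rangle = z^{-1} S\,|a(z)\rangle, \tag{14.32a} $$

$$ |a(z)\rangle = \Gamma\,|b(z)\rangle. \tag{14.32b} $$

We can summarize Eq. (14.32a) and Eq. (14.32b) in

$$ (z - \Gamma S)\,|a(z)\rangle = 0. \tag{14.33} $$

This is the state equation of the TLM system in the z domain.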
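
The scatter-and-connect cycle of Eq. (14.28) and Eq. (14.30) can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the authors' solver: the mesh is wrapped periodically so that the connection step is a pure permutation, the port pairs are read off Eq. (14.29) with the middle line taken to act along y, and the signs of $S_0$ are assumed as in the standard symmetric condensed node. All function and variable names are hypothetical.

```python
def scn_matrix():
    """12x12 scattering matrix built from the blocks of Eqs. (14.22)-(14.23)."""
    h = 0.5
    s0 = [[0, 0, h, -h], [0, 0, -h, h], [h, h, 0, 0], [h, h, 0, 0]]
    s0t = [list(r) for r in zip(*s0)]
    z = [[0.0] * 4 for _ in range(4)]
    blocks = [[z, s0, s0t], [s0t, z, s0], [s0, s0t, z]]
    return [[x for blk in brow for x in blk[i]] for brow in blocks for i in range(4)]


# (axis, shift, destination port, source port), ports numbered 1..12.
# A term "shift * Delta_{i,j}" in Eq. (14.29) moves b_j of a cell to a_i of
# the shifted cell.
CONNECTIONS = [
    (0, +1, 1, 2), (0, +1, 3, 4), (0, -1, 2, 1), (0, -1, 4, 3),
    (1, +1, 5, 6), (1, +1, 7, 8), (1, -1, 6, 5), (1, -1, 8, 7),
    (2, +1, 9, 10), (2, +1, 11, 12), (2, -1, 10, 9), (2, -1, 12, 11),
]


def step(a, shape, S):
    """One TLM time step: scatter in every cell (b = S a), then connect (a = Gamma b)."""
    b = {cell: [sum(S[i][j] * amps[j] for j in range(12)) for i in range(12)]
         for cell, amps in a.items()}
    new_a = {cell: [0.0] * 12 for cell in a}
    for cell, scattered in b.items():
        for axis, shift, dst, src in CONNECTIONS:
            target = list(cell)
            # Periodic wrap-around is an assumption made to keep the sketch closed.
            target[axis] = (target[axis] + shift) % shape[axis]
            new_a[tuple(target)][dst - 1] = scattered[src - 1]
    return new_a


def energy(a):
    """Sum of squared incident amplitudes over the whole mesh."""
    return sum(x * x for amps in a.values() for x in amps)


shape = (3, 3, 3)
state = {(l, m, n): [0.0] * 12
         for l in range(3) for m in range(3) for n in range(3)}
state[(1, 1, 1)][0] = 1.0   # unit pulse incident on port 1 of the centre cell

S = scn_matrix()
energies = [energy(state)]
for _ in range(5):
    state = step(state, shape, S)
    energies.append(energy(state))

print(energies)
```

Because S is orthogonal and the periodic connection only permutes amplitudes, the summed squared amplitude stays constant from step to step, which mirrors the energy conservation property stated for the scheme.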

3. MODELLING OF DIELECTRIC MEDIA

Regions with variable permittivities and permeabilities and with dielectric
and magnetic losses can be modeled using a TLM mesh with constant
$\Delta l$ if stubs are introduced [Christopoulos, 1995]. For homogeneous
media the relative permittivity is given by $\varepsilon_r$ and the relative
permeability by $\mu_r$; electric and magnetic losses are described by the
electric and magnetic conductivities $\sigma_e$ and $\sigma_m$,
respectively. The three-dimensional condensed node scheme may be extended in
the following way:
The scattering matrix S in the symmetrical notation is given by

S K
S = 0T (14.34)
M L

with

A B BT

S 0 = BT A B , (14.35)
B BT A

where we have introduced

0 0 0 0
0 0 0 0
A= B= (14.36)
0 0 0 0

0 0 0 0

K11 K12 M11 M12


L 0
K = K 21 K 22 L = 11 M = M 21 M 22
K 31 K 32 0 L 22 M 31 M 32
(14.37)

with

0 1 0 0 0 2
0 0 0 0 2
K11 = 1
, K12 = , (14.38)
0 0 1 0 2 0

0 0 1 0 2 0

0 0 1 2 0 0
0 0 1 0 0
K 21 = , K 22 = 2 , (14.39)
1 0 0 0 0 2

1 0 0 0 0 2

1 0 0 0 2 0
0 0 0 2 0
K 31 = 1 , K 32 = , (14.40)
0 1 0 2 0 0

0 1 0 2 0 0

5 0 0 6 0 0
L11 = 0 5 0 , L 22 = 0 6 0 , (14.41)
0 0 5 0 0 6

0 3 0 0 0 4
0 0 0 0 4
M11 = 3
, M12 = , (14.42)
0 0 3 0 4 0

0 0 3 0 4 0

0 0 3 4 0 0
0 0 3 0 0
M 21 = , M 22 = 4 , (14.43)
3 0 0 0 0 4

3 0 0 0 0 4

3 0 0 0 4 0
0 0 0 4 0
M 31 = 3 , M 32 = . (14.44)
0 3 0 4 0 0

0 3 0 4 0 0

The parameters , , , and i are given by

y0 + g 0 z0 + r0
= + , (14.45)
8 + 2 y0 + 2 g 0 8 + 2 z0 + 2r0

y0 + g 0 z0 + r0
= , (14.46)
8 + 2 y0 + 2 g 0 8 + 2 z0 + 2r0

2
= , (14.47)
4 + y0 + g 0

2
= , (14.48)
4 + z0 + r0

1 = y0 , 2 = , 3 = , 4 = z0 , (14.49)

y0 g 0 4 4 z0 + r0
5 = , 6 = , (14.50)
y0 + g 0 + 4 4 + z0 + r0

$$ y_0 = 2 \left( \frac{\Delta l}{\Delta t}\,\frac{\varepsilon_r}{c} - 2 \right), \tag{14.51} $$

$$ z_0 = 2 \left( \frac{\Delta l}{\Delta t}\,\frac{\mu_r}{c} - 2 \right). \tag{14.52} $$

Electric losses are considered by the parameter

$$ g_0 = \sigma_e\,\Delta l\,Z_0, \tag{14.53} $$

while magnetic losses are introduced via the parameter

$$ r_0 = \sigma_m\,\Delta l\,Y_0. \tag{14.54} $$
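
As a small worked example, the stub and loss parameters can be evaluated for a given material. The sketch below is an assumption-laden illustration: it reads Eqs. (14.51) and (14.52) as $y_0 = 2(\Delta l\,\varepsilon_r/(\Delta t\,c) - 2)$ and $z_0 = 2(\Delta l\,\mu_r/(\Delta t\,c) - 2)$, takes $Z_0$ as the free-space wave impedance with $Y_0 = 1/Z_0$, and uses a hypothetical function name and hypothetical material values.

```python
C0 = 299_792_458.0      # speed of light in vacuum, m/s
Z0 = 376.730313668      # free-space wave impedance in ohm (assumed meaning of Z_0)


def stub_parameters(eps_r, mu_r, sigma_e, sigma_m, dl, dt):
    """Normalized stub parameters y0, z0 and loss parameters g0, r0.

    Assumes y0 = 2*(dl*eps_r/(dt*c) - 2) and z0 = 2*(dl*mu_r/(dt*c) - 2),
    with g0 = sigma_e*dl*Z0 and r0 = sigma_m*dl/Z0.
    """
    ratio = dl / (dt * C0)
    y0 = 2.0 * (ratio * eps_r - 2.0)
    z0 = 2.0 * (ratio * mu_r - 2.0)
    g0 = sigma_e * dl * Z0
    r0 = sigma_m * dl / Z0
    return y0, z0, g0, r0


dl = 1.0e-3                 # 1 mm cells (hypothetical)
dt = dl / (2.0 * C0)        # time step chosen according to Eq. (14.14)

# Hypothetical lossy dielectric: eps_r = 4, mu_r = 1, sigma_e = 0.1 S/m.
y0, z0, g0, r0 = stub_parameters(4.0, 1.0, 0.1, 0.0, dl, dt)
print(y0, z0, g0, r0)
```

With the time step of Eq. (14.14), free space ($\varepsilon_r = \mu_r = 1$) gives $y_0 = z_0 = 0$ under this reading, so the stubs vanish and the stub-free node of Section 2 is recovered.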

4. PARALLELIZATION OF THE TLM METHOD

To solve complex electromagnetic problems, where memory requirements and
computational time become very large for a single computer, we may
parallelize the TLM method to speed up the computation and to distribute the
memory requirements.
In the following we will show that it is possible to split the TLM
computational region into subregions, to perform the TLM algorithm inside
each subregion independently of other subregions and to exchange the
values on the boundaries common to the subregions. This is possible due to
the local nature of the explicit time-domain TLM scattering scheme and the
properties of the connection operator. This technique is called domain
decomposition. The distributed TLM algorithm is fully equivalent with the
TLM algorithm described in Section 2.
The parallelization can be implemented in terms of distributed computing
(distributed memory), vector computing (shared memory) or Grid computing
(distributed memory). In Section 5 we will present the implementation of the
TLM parallelization in Grid environment.

4.1 Domain Decomposition

Let us assume that N independent computational resources are available.
These resources are represented by a set $C = \{c_1, c_2, \dots, c_N\}$,
where $c_i$ is the i-th computational resource, with
$i \in \{1, \dots, N\}$.
We decompose the complete TLM region R into N subregions $R_i$, which are
always bounded in a practical implementation but may also be considered
unbounded for theoretical considerations (see Fig. 14-7). We assign to each
subregion exactly one computational resource. The subregions do not overlap
and cover all of the original space R, i.e.,

$$ R_1 \cap R_2 \cap \dots \cap R_N = \bigcap_{i \in \{1, \dots, N\}} R_i = \emptyset, \tag{14.55a} $$

$$ R_1 \cup R_2 \cup \dots \cup R_N = \bigcup_{i \in \{1, \dots, N\}} R_i = R. \tag{14.55b} $$

Please note the notation we are using for an iteration over all elements of
a set.
We define the set $N_i$, which contains the indices of all neighboring
subregions of subregion i. With reference to Fig. 14-7 we can write

$$ N_1 = \{2, 6, 5, 4\}, \quad N_2 = \{1, 6, 3\}, \tag{14.56a} $$

$$ N_3 = \{2, 6, 5, 4\}, \quad N_4 = \{1, 5, 3\}, \tag{14.56b} $$

$$ N_5 = \{1, 6, 3, 4\}, \quad N_6 = \{1, 2, 3, 5\}. \tag{14.56c} $$

It is obvious that

$$ \bigcup_{i \in \{1, \dots, N\}} N_i = \{1, \dots, N\}, \qquad \bigcap_{i \in \{1, \dots, N\}} N_i = \emptyset. \tag{14.57} $$
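
The neighbor sets of Eq. (14.56) lend themselves to a quick consistency check. The sketch below, with hypothetical variable names, stores them as Python sets and tests the two relations of Eq. (14.57) together with the symmetry of the neighbor relation, which is what allows one shared boundary per pair of adjacent subregions.

```python
# Neighbor sets N_i of Eq. (14.56), keyed by subregion index.
NEIGHBORS = {
    1: {2, 6, 5, 4},
    2: {1, 6, 3},
    3: {2, 6, 5, 4},
    4: {1, 5, 3},
    5: {1, 6, 3, 4},
    6: {1, 2, 3, 5},
}

all_indices = set(NEIGHBORS)

# Eq. (14.57): the union of all N_i covers every index, the intersection is empty.
union = set().union(*NEIGHBORS.values())
intersection = set.intersection(*NEIGHBORS.values())

# If j is a neighbor of i, then i must be a neighbor of j.
symmetric = all(i in NEIGHBORS[j] for i, nbrs in NEIGHBORS.items() for j in nbrs)

# One boundary per unordered pair {i, j} of adjacent subregions.
boundaries = {frozenset((i, j)) for i, nbrs in NEIGHBORS.items() for j in nbrs}

print(union == all_indices, intersection == set(), symmetric, len(boundaries))
```

For the six subregions listed here the check yields eleven distinct pairwise boundaries.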

The boundary $B_i$ of subregion $R_i$ is given by

$$ B_i = \bigcup_{j \in N_i} B_{ij}, \tag{14.58} $$

where $B_{ij}$ denotes the boundary between subregions $R_i$ and $R_j$
($i \neq j$). We see that $B_{ij} = B_{ji}$, i.e., the subregions $R_i$ and
$R_j$ share the same boundary. The total boundary B can now be written as
the union of the boundaries between all subregions

$$ B = \bigcup_{i \in \{1, \dots, N\}} B_i, \tag{14.59} $$

and the interior region I as

$$ I = R \setminus B, \tag{14.60} $$

where $\setminus$ is the set difference operator.
The boundary and the interior region are sets of points of the Euclidean
space $E^3$. After discretization the boundary and the interior region will
be sets of discrete points. It will be clear from the context which kind of
points the boundary and the interior contain and we will not use different
symbols to denote the two different cases.

Figure 14-7. Domain decomposition.

4.2 Decomposition of the TLM Algorithm

To keep the description simple with respect to practical implementation, we
will consider the TLM in a 3D Cartesian mesh (not necessarily uniform) using
the SCN node.
First we need to be able to describe the TLM state in terms of quantities
associated with faces of the TLM cell. To do that we define the face
operator $F_f$ with $f \in \{1, \dots, 6\}$, since there are six faces
belonging to each 3D SCN TLM cell, as

$$ F_1 = \sum_{(l,m,n) \in R} f_1\,|l, m, n + \tfrac{1}{2}\rangle^f_2\;\langle l, m, n|_3, \tag{14.61a} $$

$$ F_2 = \sum_{(l,m,n) \in R} f_2\,|l, m + \tfrac{1}{2}, n\rangle^f_2\;\langle l, m, n|_3, \tag{14.61b} $$

$$ F_3 = \sum_{(l,m,n) \in R} f_3\,|l + \tfrac{1}{2}, m, n\rangle^f_2\;\langle l, m, n|_3, \tag{14.61c} $$

$$ F_4 = \sum_{(l,m,n) \in R} f_4\,|l, m, n - \tfrac{1}{2}\rangle^f_2\;\langle l, m, n|_3, \tag{14.61d} $$

$$ F_5 = \sum_{(l,m,n) \in R} f_5\,|l, m - \tfrac{1}{2}, n\rangle^f_2\;\langle l, m, n|_3, \tag{14.61e} $$

$$ F_6 = \sum_{(l,m,n) \in R} f_6\,|l - \tfrac{1}{2}, m, n\rangle^f_2\;\langle l, m, n|_3. \tag{14.61f} $$

The subscripts 2 and 3 are used to emphasize the physical dimension of the
geometrical objects, i.e., the face of the TLM cell (2D object) and the TLM
cell (3D object), respectively. The (4 × 12) matrix $f_i$ is defined with
reference to the TLM port numbering (see Fig. 14-3 in Section 2) as

$$ f_1 = \Delta_{1,10} + \Delta_{2,12}, \qquad f_4 = \Delta_{3,9} + \Delta_{4,11}, \tag{14.62a} $$

$$ f_2 = \Delta_{1,6} + \Delta_{2,8}, \qquad f_5 = \Delta_{3,5} + \Delta_{4,7}, \tag{14.62b} $$

$$ f_3 = \Delta_{1,2} + \Delta_{2,4}, \qquad f_6 = \Delta_{3,1} + \Delta_{4,3}, \tag{14.62c} $$

where $\Delta_{i,j}$ is a (4 × 12) matrix with
$(\Delta_{i,j})_{m,n} = \delta_{i,m}\,\delta_{j,n}$.

By applying the face operator we obtain the TLM state in terms of quantities
associated with faces of the TLM mesh. The complete face operator F is given
by

$$ F = \sum_{f=1}^{6} F_f. \tag{14.63} $$

The inverse face operator for face one, $F_1^{-1}$, is

$$ F_1^{-1} = \sum_{(l,m,n) \in R} f_1^{-1}\,|l, m, n\rangle\;\langle l, m, n + \tfrac{1}{2}|^f \tag{14.64} $$

with $f_i^{-1}$ being the appropriate (12 × 4) pseudo inverse for face i
satisfying

$$ f_i\,f_i^{-1}\,f_i = f_i. \tag{14.65} $$

The inverse face operators for the remaining faces are defined in an
analogous way. The complete inverse face operator is then given by

$$ F^{-1} = \sum_{f=1}^{6} F_f^{-1}. \tag{14.66} $$

A particular face can be specified either through the triple (l, m, n) or
through a triple of the form $(l, m, n \pm \tfrac{1}{2})$,
$(l, m \pm \tfrac{1}{2}, n)$ or $(l \pm \tfrac{1}{2}, m, n)$. The
discretized boundary B and the discretized interior faces $I_F$, i.e., the
faces of the TLM interior region I, are sets of such tuples.

Figure 14-8. Transformation of the TLM state $|l, m, n\rangle$ using the face operator F.

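The modified connection operator $\tilde{\Gamma}$ is defined as

$$ \begin{aligned}
\tilde{\Gamma} = \sum_{(l,m,n) \in R} \gamma\,\Big( & |l, m, n - \tfrac{1}{2}\rangle^f\,\langle l, m, n - \tfrac{1}{2}|^f \\
& + |l, m - \tfrac{1}{2}, n\rangle^f\,\langle l, m - \tfrac{1}{2}, n|^f \\
& + |l - \tfrac{1}{2}, m, n\rangle^f\,\langle l - \tfrac{1}{2}, m, n|^f \Big) = \gamma\,\mathbf{1},
\end{aligned} \tag{14.67} $$

with the (4 × 4) connection matrix

$$ \gamma = \begin{bmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \end{bmatrix} \tag{14.68} $$

and the identity operator $\mathbf{1}$. The modified connection operator
$\tilde{\Gamma}$ is related to $\Gamma$ through the similarity
transformation

$$ F^{-1}\,\tilde{\Gamma}\,F = \Gamma. \tag{14.69} $$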
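
How the face operators and the 4 × 4 connection matrix reproduce the port-to-port exchange can be traced on a single face. The sketch below considers the face shared by a cell (l, m, n) and its upper neighbor (l, m, n + 1), and uses the selection matrices $f_1$ and $f_4$ of Eq. (14.62a) and the matrix of Eq. (14.68). Using the transpose as pseudo inverse is an assumption that merely satisfies Eq. (14.65); the amplitudes and helper names are made up for illustration.

```python
def selection(pairs):
    """4x12 matrix with ones at the 1-based (row, port) pairs, as in Eq. (14.62)."""
    m = [[0] * 12 for _ in range(4)]
    for row, port in pairs:
        m[row - 1][port - 1] = 1
    return m


def transpose(m):
    return [list(r) for r in zip(*m)]


def matmul(a, b):
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]


def matvec(m, v):
    return [sum(x * y for x, y in zip(row, v)) for row in m]


f1 = selection([(1, 10), (2, 12)])   # face (l, m, n + 1/2), seen from the lower cell
f4 = selection([(3, 9), (4, 11)])    # the same face, seen from the upper cell

# Connection matrix of Eq. (14.68): swaps the two halves of the face state.
GAMMA = [[0, 0, 1, 0], [0, 0, 0, 1], [1, 0, 0, 0], [0, 1, 0, 0]]

# The transpose acts as a pseudo inverse here: f f^T f = f, cf. Eq. (14.65).
pinv_ok = matmul(matmul(f1, transpose(f1)), f1) == f1

# Made-up scattered amplitudes; the value encodes the port number.
b_lower = [float(p) for p in range(1, 13)]       # cell (l, m, n)
b_upper = [100.0 + p for p in range(1, 13)]      # cell (l, m, n + 1)

# Cell state -> face state, connect on the face, face state -> cell state.
face = [x + y for x, y in zip(matvec(f1, b_lower), matvec(f4, b_upper))]
connected = matvec(GAMMA, face)
a_lower = matvec(transpose(f1), connected)
a_upper = matvec(transpose(f4), connected)

print(face, connected)
print(a_lower[9], a_lower[11], a_upper[8], a_upper[10])
```

In this sketch the lower cell receives on ports 10 and 12 what the upper cell scattered from ports 9 and 11, and vice versa, which is the exchange expressed by the z-direction terms of the connection operator.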

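Furthermore, we introduce the projection operator $\mathbf{R}_i$ projecting
the state vector $|a\rangle$ on the state vector of subregion $R_i$

$$ \mathbf{R}_i = \sum_{(l,m,n) \in R_i} |l, m, n\rangle\,\langle l, m, n|, \tag{14.70} $$

with the properties

$$ \sum_{i=1}^{N} \mathbf{R}_i = \mathbf{1}, \qquad \mathbf{R}_i \mathbf{R}_j = \delta_{ij}\,\mathbf{R}_i. \tag{14.71} $$

The local TLM state corresponding to computational resource $c_i$ is
obtained as

$$ |a\rangle_i = \mathbf{R}_i\,|a\rangle, \quad \text{with} \quad \sum_{i=1}^{N} \mathbf{R}_i\,|a\rangle = \sum_{i=1}^{N} |a\rangle_i = |a\rangle. \tag{14.72} $$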

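The modified connection operator $\tilde{\Gamma}$ is split into the
interior and boundary connection operators

$$ \tilde{\Gamma} = \Gamma_B + \Gamma_I, \tag{14.73a} $$

with

$$ \Gamma_I = \sum_{(l,m,n) \in I_F} \gamma\,|l, m, n\rangle^f\,\langle l, m, n|^f = \sum_{i=1}^{N} \Gamma_{Ii}, \tag{14.73b} $$

$$ \Gamma_B = \sum_{(l,m,n) \in B} \gamma\,|l, m, n\rangle^f\,\langle l, m, n|^f = \sum_{i=1}^{N} \Gamma_{Bi}. \tag{14.73c} $$

The interior connection operator $\Gamma_{Ii}$ operating on the interior
faces $I_{Fi}$ of region $R_i$ is given by

$$ \Gamma_{Ii} = \sum_{(l,m,n) \in I_{Fi}} \gamma\,|l, m, n\rangle^f\,\langle l, m, n|^f. \tag{14.73d} $$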

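Finally, the complete TLM algorithm can be written as follows

$$ \begin{aligned}
|{}_{k+1}a\rangle &= \Gamma S\,|{}_k a\rangle = F^{-1}\,\tilde{\Gamma}\,F S \sum_{i=1}^{N} \mathbf{R}_i\,|{}_k a\rangle \\
&= F^{-1}\,(\Gamma_B + \Gamma_I)\,F S \sum_{i=1}^{N} \mathbf{R}_i\,|{}_k a\rangle \\
&= F^{-1} \left( \Gamma_B\,F S \sum_{i=1}^{N} |{}_k a\rangle_i + \sum_{i=1}^{N} \Gamma_{Ii}\,F S\,|{}_k a\rangle_i \right).
\end{aligned} \tag{14.74} $$

We can see that the distributed TLM algorithm applies the scattering
operator and the connection operator on the interior faces independently in
each subregion. The connection operator on the boundaries between subregions
needs to be applied globally.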
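
The split into local work and a global boundary exchange can be imitated on a toy mesh. The following sketch is a simplified illustration, not the TLM-G implementation: it assumes a small periodic 4 × 2 × 2 mesh cut into slabs along x, the standard signs of $S_0$, and port pairs read off Eq. (14.29) with the middle line acting along y. Each subregion scatters and connects its interior faces on its own, pulses that cross a subregion boundary are collected and delivered in a separate exchange phase, and the result is compared with an undivided run. All names are hypothetical.

```python
def scn_matrix():
    """12x12 scattering matrix built from the blocks of Eqs. (14.22)-(14.23)."""
    h = 0.5
    s0 = [[0, 0, h, -h], [0, 0, -h, h], [h, h, 0, 0], [h, h, 0, 0]]
    s0t = [list(r) for r in zip(*s0)]
    z = [[0.0] * 4 for _ in range(4)]
    blocks = [[z, s0, s0t], [s0t, z, s0], [s0, s0t, z]]
    return [[x for blk in brow for x in blk[i]] for brow in blocks for i in range(4)]


# (axis, shift, destination port, source port), ports numbered 1..12.
CONNECTIONS = [
    (0, +1, 1, 2), (0, +1, 3, 4), (0, -1, 2, 1), (0, -1, 4, 3),
    (1, +1, 5, 6), (1, +1, 7, 8), (1, -1, 6, 5), (1, -1, 8, 7),
    (2, +1, 9, 10), (2, +1, 11, 12), (2, -1, 10, 9), (2, -1, 12, 11),
]

SHAPE = (4, 2, 2)
CELLS = [(l, m, n) for l in range(4) for m in range(2) for n in range(2)]


def decomposed_step(states, owner, S):
    """states: {region: {cell: 12 incident amplitudes}}; returns the next states."""
    new_states = {r: {c: [0.0] * 12 for c in cells} for r, cells in states.items()}
    outbox = []   # pulses that cross a boundary between subregions
    for region, cells in states.items():      # each pass could run on its own resource
        for cell, a in cells.items():
            b = [sum(S[i][j] * a[j] for j in range(12)) for i in range(12)]
            for axis, shift, dst, src in CONNECTIONS:
                t = list(cell)
                t[axis] = (t[axis] + shift) % SHAPE[axis]   # periodic mesh (assumption)
                t = tuple(t)
                if owner[t] == region:
                    new_states[region][t][dst - 1] = b[src - 1]       # interior face
                else:
                    outbox.append((owner[t], t, dst - 1, b[src - 1]))  # boundary face
    for region, t, port, value in outbox:      # global boundary exchange
        new_states[region][t][port] = value
    return new_states


def run(num_regions, steps=6):
    """Run the toy mesh split into num_regions slabs along x; return the global state."""
    owner = {c: c[0] * num_regions // SHAPE[0] for c in CELLS}
    states = {r: {} for r in range(num_regions)}
    for c in CELLS:
        states[owner[c]][c] = [0.0] * 12
    states[owner[(0, 0, 0)]][(0, 0, 0)][0] = 1.0    # unit pulse on port 1
    S = scn_matrix()
    for _ in range(steps):
        states = decomposed_step(states, owner, S)
    return {c: amps for cells in states.values() for c, amps in cells.items()}


single = run(1)
split = run(2)
print(single == split, run(4) == single)
```

With a single region the outbox stays empty and the function reduces to the undivided algorithm, so the comparison illustrates, for this toy case, the equivalence of the distributed and the original scheme.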

5. TLM-G: GRID-ENABLED TIME DOMAIN TRANSMISSION LINE MATRIX SYSTEM

The needs of innovative technologies for broadband RF systems result in the
modeling of complex electromagnetic structures. The availability of
dedicated supercomputers and/or semianalytical techniques is usually
restricted to highly specialized research centers. A modern view, offered by
Grid computing, defines a more sophisticated simulation environment that
establishes a virtual organization (VO) [Foster et al., 2001; OGSA, 2005].
VOs are formed by a dynamic collection of individuals, institutions and
resources. In a VO, the resources are shared between the participants in a
secure, flexible and coordinated way. The resources include computational
resources (e.g. computers, computer clusters, supercomputers), storage
systems, software resources, data sources and special classes of devices
such as laboratory instruments. These resources are in general
geographically distributed and act as a virtual computation environment. The
computation of complex electromagnetic structures with the possibility of
collaborative work between different organizations belongs to the class of
problems that have motivated the development of Grid technologies. The TLM-G
system supports users in performing full-wave electromagnetic simulations
using the TLM method in the Grid environment. The TLM solver is implemented
in a legacy code of the application called YATSIM (Yet Another TLM
Simulator) [Yatsim, 2006]. The integration of this code in the Grid
environment has been done without modifications to the already tested code
of YATSIM.
For the implementation of the Grid testbed we have used the Globus
Toolkit version 4 (GT4) [Globus Alliance, 2006] that includes many of the
OGSA standards and implements the WSRF.

5.1 The Components of the TLM-G System

To integrate the TLM-G system in a Grid testbed we use the GT4 to build the
Grid infrastructure. Fig. 14-9 presents the Globus components used in the
deployment of the TLM-G system.

Figure 14-9. Globus Toolkit components in the TLM-G system.

The TLM solver is integrated in the Grid environment by means of the
following modules:
Yet Another TLM Simulator-Grid enabled client (YATSIM-G): a user
client employed to access the system;
Yet Another TLM Work Allocation Daemon (YATWAD): a coordination
server;
Yet Another TLM Daemon (YATD): a task allocation server.

To obtain a portable code the modules are implemented in the Python
programming language [Python Language, 2006].

5.2 The Relation Between YATWAD, YATD and the Components of the Globus Toolkit in the TLM-G System

Figure 14-10. The relation between YATWAD, YATD and Globus Toolkit in the TLM-G
system.

In Fig. 14-10, it is shown how the TLM-G modules are connected with
the components of the GT4. The policy rules of the communication between
the entities presented in Fig. 14-10 are implemented by means of the Public
Key Infrastructure (PKI) provided by GT4. This means that the usage of
certificates signed by a Certification Authority (CA) is necessary during the
communication. The Grid Security Infrastructure (GSI) of GT4 implements
user authentication, authorization, proxy-certificates and delegation.
The YATWAD is located in the Collective layer of the Grid (see Sec.).
The YATD, and the Grid Resource Allocation Manager (GRAM) [Globus
Toolkit Primer, 2006], are located in the Resource layer and the aggregation
service MDS-Index (Monitoring and Discovery Service) is located in the
Collective layer.
The YATWAD and the YATD have the same functions as described in the previous
subsection, but now with an additional MDS-Index. The MDS-Index is an
aggregation service provided by GT4, which collects information from
registered Grid resources, the so-called aggregation sources, and enables
their discovery and monitoring. MDS-Index supports XPath queries against its
resource property document. Fig. 14-10 shows that YATWAD can also make
requests and obtain status information directly from YATD and GRAM, without
contacting the MDS-Index. The difference in this case is that the
information is not consolidated by MDS-Index, so YATWAD must itself manage
the requests to and the status replies from each GRAM and YATD.

Figure 14-11. Electromagnetic performance of the TLM-G system in the Grid testbed.

The computational resources, the application software for simulations (TLM,
FDTD, etc.), the Parallel Virtual Machine (PVM), the connective equipment
and other hardware are available in the Resource and Fabric layers.

6. ANALYSIS OF THE PERFORMANCE OF THE TLM-G SYSTEM AND EXAMPLES

6.1 The Electromagnetic Performance of the TLM-G System

The electromagnetic performance of the TLM-G system in the Grid testbed is
measured by analyzing the number of TLM iterations performed during a time
interval of 6 hours. The computation is performed in a simulation region
with a fixed size of 1 million TLM cells.
In the Grid, the maximum number of available computers during the
measurement period was 6. The results of the measurements are shown in
Fig. 14-11.
To evaluate the performance we define the unit Million Cells Million
Iterations Per Day (MCMIPD). This unit gives the number of TLM iterations,
in millions, performed per day on a one-million-cell simulation region, e.g.
100 × 100 × 100 cells.
As we can see from the plot, the actual performance of the TLM-G system is
strongly dependent on the instantaneous availability of resources. We
achieved a peak performance of 3.8 MCMIPD and an average performance of
1.7 MCMIPD, resulting in 0.422 Million Cells Million Iterations (MCMI)
performed during the testing period.
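
The relation between the quoted figures can be retraced with a few lines of arithmetic. The sketch below only uses numbers given in the text (average rate, test duration, total work); the function name is hypothetical, and the small gap between the estimate and the reported total is attributed here to rounding of the average, which is an assumption.

```python
def mcmi_from_average(avg_mcmipd, hours):
    """Work in MCMI done at an average rate (in MCMIPD) over a period in hours."""
    return avg_mcmipd * hours / 24.0


# Average of 1.7 MCMIPD sustained over the 6-hour test period.
estimate = mcmi_from_average(1.7, 6.0)

# Average rate implied by the reported total of 0.422 MCMI in 6 hours.
implied_average = 0.422 * 24.0 / 6.0

# On a one-million-cell region, 0.422 MCMI corresponds to this many TLM iterations.
iterations = 0.422 * 1_000_000

print(estimate, implied_average, round(iterations))
```

The estimate from the rounded average comes out at about 0.425 MCMI, close to the reported 0.422 MCMI, which in turn corresponds to roughly 422 000 iterations of the one-million-cell mesh.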

6.2 A Bowtie Antenna in a TLM-G System

The performance in terms of scalability of the TLM-G system with respect to
the TLM algorithm is evaluated by computing the input impedance of a bowtie
antenna (see Fig. 14-12(a)).
First, to obtain an accurate result for the input impedance of the bowtie
antenna in the range of 0–7 GHz, the antenna is discretized with a
resolution of 252 × 252 × 252 ≈ 16 million cells. The memory requirement for
the simulation is 1.15 GB and the number of time steps is 30 000. The
computation was performed in the HT-TLM system by decomposing the problem
over 7 computers (Pentium4 3 GHz, 1GB 400 DDR, Linux cluster). The total
simulation time is 3 hours. Fig. 14-12(a) shows good agreement between the
real and imaginary parts of the input impedance computed with TLM-G and the
results of the MoM analysis given by Makarov [Makarov, 2002].
Second, to evaluate the scalability, the input impedance of the bowtie
antenna was computed in the range 0–4 GHz with the computation distributed
over several computers. The results of the analysis are summarized in
Fig. 14-12(b). For the computation the same Linux cluster was used as in the
problem discussed earlier.

Figure 14-12(a). Real and imaginary parts of the input impedance of a bowtie antenna
computed with TLM-G and the Method of Moments [Makarov, 2002].

Figure 14-12(b). Results of the scalability analysis on the computation of the input impedance
of a bowtie antenna.

In Fig. 14-12(b) we see a speed-up of 1.72 (with respect to the
time needed for the computation on one computer only) when using two
computers. Using three, four and five computers to solve the problem gives
speed-ups of 2.17, 2.95 and 3.62, respectively. As can be observed, increasing
the number of computers above five is no longer efficient for the given
problem size.
This saturation occurs because the computation time per computer decreases
as 1/N while the communication time increases with N, where N is
the number of computational resources. We may conclude that, in order to
obtain maximum performance from the TLM-G system, an optimum number of
computers has to be used for a given problem size.
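To make the saturation more tangible, the following sketch computes the parallel efficiency (speed-up divided by the number of computers) from the speed-up values quoted above. It also evaluates a deliberately simple cost model in which the run time is T(N) = T1/N + r·T1·(N − 1). The linear growth of the communication term and the ratio r are assumptions chosen for illustration only; they are not fitted to the measurements.

```python
# Speed-up values quoted in the text for Fig. 14-12(b): {N: speed-up}.
reported_speedup = {2: 1.72, 3: 2.17, 4: 2.95, 5: 3.62}


def efficiency(speedup, n):
    """Parallel efficiency: speed-up divided by the number of computers."""
    return speedup / n


def model_speedup(n, r):
    """Speed-up under the assumed model T(N) = T1/N + r*T1*(N - 1).

    r is a hypothetical ratio of the communication cost added per extra
    computer to the single-computer run time T1 (not a measured value).
    """
    return 1.0 / (1.0 / n + r * (n - 1))


def best_n(r, n_max=16):
    """Number of computers that maximizes model_speedup for a given r."""
    return max(range(1, n_max + 1), key=lambda n: model_speedup(n, r))


for n, s in reported_speedup.items():
    print(f"N={n}: speed-up {s:.2f}, efficiency {efficiency(s, n):.2f}")
print("model optimum for r = 0.04:", best_n(0.04))
```

In this model the optimum lies near N = 1/sqrt(r), so a larger problem (smaller r) can make use of more computers before the communication term dominates.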

7. THE CIRCULAR CYLINDRICAL CAVITY RESONATOR

In this section, the scalability of the system on different platforms
and during heterogeneous simulations is evaluated. To this end, we have
performed several computations of the resonance frequencies of
the TMnpq modes of a circular cylindrical cavity resonator for different
configurations of the HT-TLM system [Lorenz et al., 2005].
The circular cylindrical cavity resonator with radius a = 38 mm and
height d = a is discretized in 102 × 102 × 40 = 416 160 cells. The memory
requirement for the simulation is 30 MB and the number of time steps is
10 000. The results of the performance analysis in terms of
computational time are summarized in Fig. 14-13(b).
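For reference, the analytical TM resonance frequencies of an ideal circular cylindrical cavity follow from the textbook formula f = (c / 2π) · sqrt((x_np / a)² + (qπ / d)²), where x_np is the p-th zero of the Bessel function J_n. The sketch below evaluates it for a = d = 38 mm. It assumes an air-filled cavity with perfectly conducting walls and the index convention n (azimuthal), p (radial), q (axial); neither assumption is spelled out in the text, and the function name is hypothetical.

```python
import math

C0 = 299_792_458.0  # speed of light in vacuum, m/s

# Tabulated zeros x_np of the Bessel function J_n, keyed by (n, p).
BESSEL_ZEROS = {
    (0, 1): 2.404826,
    (1, 1): 3.831706,
    (2, 1): 5.135622,
    (0, 2): 5.520078,
}


def tm_resonance_hz(n, p, q, a, d):
    """Resonance frequency in Hz of the TM_npq mode.

    Assumes an ideal, air-filled cavity of radius a and height d (metres).
    """
    x_np = BESSEL_ZEROS[(n, p)]
    k = math.sqrt((x_np / a) ** 2 + (q * math.pi / d) ** 2)
    return C0 * k / (2.0 * math.pi)


a = d = 0.038  # radius and height in metres, as given in the text
for mode in [(0, 1, 0), (0, 1, 1), (1, 1, 0), (1, 1, 1)]:
    f_ghz = tm_resonance_hz(*mode, a, d) / 1e9
    print(f"TM{mode[0]}{mode[1]}{mode[2]}: {f_ghz:.3f} GHz")
```

Values obtained this way can be compared against the peaks of the simulated spectrum to judge the accuracy of a given discretization.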
We have used three different computer architectures (PowerPC, PC and
HPPA) and four different operating systems (MacOS X, Linux, Windows
XP and HP-UX). The computers are connected by standard IP-based
networks. The memory requirements for the computation are the same for
each platform and, in the case of distributed parallel computing, the
simulation region is split uniformly among the computers.
We first compare two different homogeneous clusters (Linux and HP-
UX). We observe a good scalability in both cases. In the case of the HP
cluster we have achieved a speed-up of 3.72 by using four computers, and a
speed-up of 3.71 by using five computers on the Linux cluster.
Second, a comparison for individual architectures was done. We see that
the computational time of individual resources in the Grid may vary
significantly.
Third, a heterogeneous computation using two different operating
systems and two different CPU speeds was done (P4 1.6 GHz, Linux and P4
1.8 GHz, Win XP). We observe a speed-up of 1.78 with respect to the slower
computer and a speed-up of 1.2 with respect to the faster one. In this case the
calibration of the TLM-G system is essential in order to obtain optimal
performance. Using the calibration data, each computational resource may
be assigned an appropriate size of the simulation region, resulting in the
fastest computation.
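One simple way to use calibration data is to split the cells in proportion to the measured speed of each resource. The sketch below illustrates the idea; it is not the calibration procedure of the TLM-G system. The function name is hypothetical, and the relative speed of the two machines is inferred from the two quoted speed-ups (1.78 / 1.2 ≈ 1.48), which is our own reading of those numbers.

```python
def split_cells(total_cells, throughput):
    """Split total_cells among resources in proportion to their throughput.

    throughput holds relative speeds taken from a calibration run.
    Returns integer cell counts that sum exactly to total_cells.
    """
    total_speed = sum(throughput)
    shares = [int(total_cells * t / total_speed) for t in throughput]
    # Cells lost by rounding down are handed to the fastest resource.
    fastest = max(range(len(throughput)), key=lambda i: throughput[i])
    shares[fastest] += total_cells - sum(shares)
    return shares


# Relative speeds: slower machine = 1.0, faster machine inferred as 1.78 / 1.2.
relative_speed = [1.0, 1.78 / 1.2]
shares = split_cells(416_160, relative_speed)
print(shares)
```

A practical decomposition would normally cut the mesh into whole slices along one axis, so the cell counts would be rounded to multiples of the slice size.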

Figure 14-13(a). The computed resonance frequencies of the circular cylindrical cavity
resonator. The analytically obtained frequencies of the TM modes are indicated above the
curves explicitly.

Figure 14-13(b). The performance analysis for different configurations of the Grid.

References

Aidam, M. and Russer, P., 1997, Derivation of the TLM method by finite integration, AEU
Int. J. Electron. Commun., 51:35-39, 1997.
Akhtarzad, S., 1975, Analysis of Lossy Microstrip Structures and Microstrip Resonators by
the TLM Method, Ph.D. dissertation, University of Nottingham, England, July 1975.
Christopoulos, C., 1995, The Transmission-Line Modeling Method: TLM, IEEE Press, New
York, 1995.
Christopoulos, C. and Russer, P., 2000a, Application of TLM to microwave circuits, In
Applied Computational Electromagnetics, NATO ASI Series, pp. 300-323. Springer,
Cambridge, Massachusetts, London, England, 2000.
Christopoulos, C. and Russer, P., 2000b, Application of TLM to EMC problems, In Applied
Computational Electromagnetics, NATO ASI Series, pp. 324-350. Springer, Cambridge,
Massachusetts, London, England, 2000.
Foster, I., Kesselman, C. and Tuecke, S., 2001, The anatomy of the Grid: Enabling scalable
virtual organizations, Lecture Notes in Computer Science, 2150, 2001.
Gentili, F., Menini, L., Tornambe, A. and Zaccarian, L., 1998, Mathematical Methods for
System Theory, World Scientific Publishing, Singapore, 1998.
Globus Alliance Website, 2006 [Online 2006], http://www.globus.org.
Globus Toolkit Primer, 2006 (Describing Globus Toolkit Version 4) [Online 2006],
http://www.globus.org/toolkit/docs/4.0/key/GT4_Primer_0.6.pdf.
Hein, S., 1993, Consistent finite difference modelling of Maxwell's equations with lossy
symmetrical condensed TLM node, Int. J. Numer. Modeling, 6:207-220, 1993.
Hoefer, W. J. R., 1985, The transmission line matrix method-theory and applications, IEEE
Trans. Microwave Theory Techn., 33(10):882-893, October 1985.
Hoefer, W. J. R., 1989, The transmission line matrix (TLM) method, In T. Itoh, editor,
Numerical Techniques for Microwave and Millimeter Wave Passive Structures,
pp. 496-591. J. Wiley, New York, 1989.
Jin, H. and Vahldieck, H., 1994, Direct derivations of TLM symmetrical condensed node
and hybrid symmetrical condensed node from Maxwell's equations using centered
differencing and averaging, IEEE Trans. Microwave Theory Techn., 42(12):2554-2561,
December 1994.
Johns, P. B. and Beurle, R. L., 1971, Numerical solution of 2-dimensional scattering
problems using a transmission-line matrix, Proc. IEE, 118(9):1203-1208, September
1971.
Kron, G., 1944, Equivalent circuit of the field equations of Maxwell I, Proc. IRE, 32:289-
299, May 1944.
Krumpholz, M. and Russer, P., 1994, A field theoretical derivation of TLM, IEEE Trans.
Microwave Theory Techn., 42(9):1660-1668, September 1994.
Lorenz, P., Vital, J. V., Biscontini, B. and Russer, P., 2005, High-throughput transmission line
matrix (TLM) system in Grid environment for microwave design, analysis and
optimizations, In 2005 IEEE MTT-S Int. Microwave Symp. Dig., 12-17 June 2005, Long
Beach, USA, pp. 1115-1118, June 2005.
Makarov, S. N., 2002, Antenna and EM Modeling with MATLAB, John Wiley & Sons, Inc.,
2002.
OGSA, 2005; http://www.globus.org/ogsa.
Peña, N. and Ney, M., 1996, A general formulation of a three-dimensional TLM condensed
node with the modeling of electric and magnetic losses and current sources, In Proc. 12th
Annual Review of Progress in Applied Computational Electromagnetics, pp. 262-269,
Monterey, CA, March 1996.
Python Language, 2006, http://www.python.org.
Russer, P., 2000, The transmission line matrix method, In Applied Computational
Electromagnetics, NATO ASI Series, pp. 243-269. Springer, Cambridge, Massachusetts,
London, England, 2000.
Russer, P., 2003, Electromagnetics, Microwave Circuit and Antenna Design for
Communications Engineering, Artech House, Boston, 2003.
Russer, P. and Cangellaris, A. C., 2001, Network-oriented modeling, complexity reduction
and system identification techniques for electromagnetic systems, Proc. 4th Int.
Workshop on Computational Electromagnetics in the Time-Domain: TLM/FDTD and
Related Techniques, 17-19 September 2001, Nottingham, pp. 105-122, September 2001.
Smith, J. M., 1987, Mathematical Modeling and Digital Simulation for Engineers and
Scientists, J. Wiley, New York, 1987.
Thomas, J., 1995, Numerical partial differential equations, Springer, New York, 1995.
Yatsim, 2006, http://www.yatpac.org, http://www.hft.ei.tum.de/yatsim.
Glossary

ABC - Absorbing Boundary Condition - Material that absorbs all incident
(electromagnetic) waves, typically used to terminate infinite simulation
volumes.

Abstraction - Feature of the object-oriented programming model,
according to which entities (objects) having common properties can be
grouped.

Access rights - A description of the type of authorized interactions a
subject can have with a resource. Examples include read, write, execute, add,
modify, and delete.

Agent - Program acting on behalf of a person or organization.

Allocation - The process of assigning a set of resources to a job.

AMS - Agent Management System - Set of tools and programming
interfaces that act as an underlying infrastructure for mobility mechanisms,
agent lookup queries and other execution characteristics.

API - Application Programming Interface - APIs facilitate the
development of programs using functionalities embedded in software or
hardware tools. A tool contains an API when it defines a number of function
calls (interfaces) to access its own facilities.

ASCII - American Standard Code for Information Interchange - A
character set and a character encoding based on the Roman alphabet as used
in modern English. ASCII codes represent text in computers, in other
communications equipment, and in control devices that work with text. Most
contemporary character encodings have an ASCII-like base.

Asynchronous - Asynchronous tasks execute independently from each
other and their timing is not synchronized. For example, when you start up
three asynchronous tasks, even when they do nearly the same amount of
work, you cannot predict in which order they will finish.

Authentication - Verification of authenticity of communicating parties.

Authorization - The procedure for granting access to resources.

Backward medium - Material in which the energy flow and the wave
vector are anti-parallel.

Bandwidth - The total available bit rate of a digital network channel.
Bandwidth is determined by the speed of the network, which is determined
by its technology, but it is also affected by the overhead of the control data
added by the communication protocol.

Barrier - Process synchronization mechanism used in collective
communications. Processes calling a barrier function block until all the
members of the same group have called it.

Beowulf cluster - A cluster of Linux-based PCs, using commodity
hardware and open source software.

BIE - Boundary Integral Equation - Integral equation restricted to
unknown fields on a boundary.

Binding - An association between an interface, a concrete protocol and a
data format. A binding specifies the protocol and data format to be used in
transmitting messages defined by the associated interface.

BPSK - Binary Phase-shift Keying - The simplest form of Phase-shift
Keying, a digital modulation scheme that conveys data by changing, or
modulating, the phase of a reference signal (the carrier wave).

Browser - An application program that provides a way to look at and
interact with all the information on the World Wide Web.

CAD - Computer Aided Design - Designing by making use of computer
software.

CDMA - Code Division Multiple Access - A spread-spectrum
communication technique. Spread-spectrum techniques are based on
spreading the signal over a wider band, so as to reduce eavesdropping and
interference. CDMA is a multiplexing technique: several users can share the
same bandwidth with minimum mutual interference. Users are associated with
orthogonal codes (Baum-Walsh codes).

CEM - Computational Electromagnetism - Numerical methods for solving
electromagnetics problems with the aid of computer software.

Circuit simulator - CAD-tool based on Kirchhoff's laws.

Class - Group of objects having common properties and behavior.

Client - The requesting program or user in a client/server relationship.

Cluster - Group of machines that are networked together and used as a
single system.

Complex thickness - Mathematical concept used to describe PMLs.

Complexity - Measures the time (computational complexity) and memory
resources (memory complexity) that a computer requires to solve a problem.

Computational complexity - see complexity.

Component - Unit of composition with contractually specified interfaces
and explicit context dependencies only.

Computational speed-up - see speed-up ratio.

Condor - Tool supporting high throughput computing on collections of
distributed computing resources.

Confidentiality - The service which assures that exchanged messages are
known only to the communicating parties.

CORBA - Common Object Request Broker Architecture - Object oriented
distributed computing specification developed by OMG, based on the
development of object request brokers able to mediate between clients and
server components.

CRLH - Composite Right-handed Left-handed materials - Periodic
artificial dielectrics made out of a planar Transmission Line network loaded
with lumped series capacitors and shunt inductors.

Cryptography - The science of creating messages which can be read only
by a designated receiver.

Daemon - A program that runs continuously and exists for the purpose of
handling periodic service requests that a computer system expects to receive.
The daemon program forwards the requests to other programs (or processes)
as appropriate.

Data decomposition - The division of a global data set into smaller
subdomains, typically for distribution over some form of parallel
computer.

Discovery - The act of locating a machine-processable description of a
Web service-related resource that may have been previously unknown and
that meets certain functional criteria. It involves matching a set of functional
and other criteria with a set of resource descriptions. The goal is to find an
appropriate Web service-related resource.

Discovery service - A discovery service is a service that enables agents to
retrieve Web service-related resource descriptions.

Dispersive medium - Material whose constitutive parameters and
consequently the electromagnetic medium properties are frequency
dependent.

Distributed memory system - Distributed system where each processor
node has immediate access only to its local memory, so if a processor needs
data from another node's memory, it must issue special instructions to fetch
these items from that node over the interconnecting network.

DLL - Dynamically Linked Library - Computer library that implements the
concept of dynamic linking, meaning that the data in the library is not copied
into a new executable or library at compile time, but remains in a separate
file on disk.

DM - Data Management - Component dealing with access and
management of data in a grid.

Double Negative materials - Metamaterials characterized by negative
values of the constitutive parameters.

Drude model - Model developed in the 1900s by Paul Drude to explain
the transport properties of electrons in materials. The Drude model is the
application of kinetic theory to electrons in a solid. It assumes that the
material contains immobile positive ions and an electron gas of classical,
non-interacting electrons of density n, each of whose motion is damped by a
frictional force, due to collisions of the electrons with the ions, characterized
by a relaxation time τ.

EFIE - Electric Field Integral Equation - BIE imposing boundary
conditions for the electric field.

Efficiency - Ratio between the computational speed-up and the number of
processors operating in parallel.

Enabling technology - Basic technology which enables the specification
of a higher level technology.

Encapsulation - Feature of the object oriented programming paradigm
according to which objects hide data and expose a well defined interface
that allows operations on the hidden data.

Encryption - A method of scrambling information to render it unreadable
to anyone except the intended recipient.

ENG - Epsilon Negative medium - Metamaterial characterized by negative
values of the electric permittivity.

FDTD - Finite-Difference Time-Domain method - Technique that
discretizes time and space derivatives with finite differences. This numerical
method was originally developed by Kane S. Yee in 1966. He proposed a
three dimensional central difference approximation for Maxwell's curl
equations, both in space and time. FDTD is a time-domain method in that
transient fields are computed as a function of time, enabling the accurate
characterisation of complex inhomogeneous structures for which analytical
methods are ill-suited.

FEM - Finite Element Method - Numerical discretization technique used
to solve integral or differential equations and based on a variational
principle.

FFT - Fast Fourier Transform - Technique to calculate discrete Fourier
transforms of data sets in a fast way.

Field simulator - CAD-tool based on Maxwell's equations.

FIPA - Foundation for Intelligent Physical Agents - Produces computer
software standards for heterogeneous and interacting agents and agent-based
systems. It was founded as a non-profit organization with the
goal of defining a full set of standards for both implementing systems
within which agents could execute (agent platforms) and specifying how
agents themselves should communicate and interact.

FMM - Fast Multipole Method - Method first presented by L. Greengard
and V. Rokhlin in 1987 that reduces the complexity of particle interaction
problems. Nowadays the concepts of FMM are applied to many numerical
engineering problems.

GGF - Global Grid Forum - A community forum that promotes and
supports the development, deployment, and implementation of Grid
technologies.

Globus Toolkit - Open source middleware set of grid services addressing
fundamental issues such as security, information discovery, resource
management, data management and communication.

GM - Grid Middleware - Layer of software mediating between resource
and high level application to enable grid computing.

GRAM - Globus Resource Allocation Manager - Component of the
Globus Toolkit responsible for sets of resources operating under the same
allocation policy.

Green function - Describes the fields radiated by an infinitesimal Dirac
source.

Grid - System concerned with the integration, virtualization, and
management of services and resources in a distributed, heterogeneous
environment that supports collections of users and resources (virtual
organizations) across traditional administrative and organizational domains
(real organizations).

GridFTP - Extended version of the File Transfer Protocol, included in the
Globus Toolkit, which adds a series of features to FTP, customizing it to
grid environments.

Grid service - A Grid service is a Web service that is designed to operate
in a Grid environment, and meets the requirements of the Grid(s) in which it
participates.

GSA - Genetic Search Agent - An intelligent software agent that
participates in a distributed genetic algorithm with other peer entities. Its
task is to perform the genetic operators and solve a specified optimisation
problem in a collaborative manner.

GSM - Global System for Mobile Communications - A second-generation
cellular system introduced to standardize wireless communications
throughout Europe in 1990. The mobile entity communicates with the base
station of the cell it belongs to. The base station connects the mobile with the
other mobiles or with the wired network, thanks to a switching center. Each
mobile is equipped with a SIM card, with the user id number and all the
information about the user.

GT - see Globus Toolkit.

Homogenization Technique - Technique to extract the effective
constitutive parameters of an inhomogeneous material.

HTML - HyperText Markup Language - Markup language designed for the
creation of Web pages and other information viewable in a browser.

HTTP - HyperText Transfer Protocol - Text-based protocol that is
commonly used for transferring information across the Internet.

IC - Integrated Circuit - Small complex of electronic components and
their connections, typically residing in or on a layered background medium.

Inheritance - Form of polymorphism which allows the definition of groups
of classes that specialize operations and attributes owned by other classes.

Interface - Function call with a very rigorous and permanent specification,
defined to hide implementation details of objects, devices, tools, etc.

IPC - Inter-Process Communication - Communication between processes
via message passing, shared memory (including shared files), or TCP.

IS - Information Services - Component of the Globus Toolkit, responsible
for collecting and returning information about the structure and state of
resources, for example their current load and usage policy.

JADE - Java Agent Development Environment - Open-source platform
for peer-to-peer, agent-based applications, fully implemented in Java.

JAR - Java ARchive - File containing compressed files, used to distribute
a set of Java classes.

Java - Object oriented language widely deployed on the Internet.

JavaBeans - Object oriented distributed computing framework based on
the Java language.

Jini - Object oriented distributed computing framework based on the Java
language and services.

JNI - Java Native Interface - Programming framework that allows Java
code running in the Java virtual machine (VM) to call and be called by
native applications.

Job - User-defined task that is scheduled to be carried out by an execution
subsystem.

Job manager - Process able to handle jobs running on server machines.

JSP - JavaServer Pages - A Java technology that allows developers to
dynamically generate HTML, XML or some other type of Web page.

LAN - Local Area Network - Network connecting computers that belong to
a single organization and are located close to each other.

Latency - The time taken to start up an operation. Typically message
latency is the time delay incurred between one processor starting a message
send operation, and the recipient processor completing the receive operation.
Startup latency is the constant communication overhead incurred in sending
a zero length message.

Layered medium - Medium consisting of a finite number of dielectric
and/or magnetic layers. Each layer is linear, isotropic, and homogeneous and
extends to infinity in the transversal plane, which is parallel to the interfaces
between the layers. The medium can also comprise perfectly conducting
planes between the interfaces and/or at the top and/or bottom of the medium.

Left-Handed medium - Term introduced in 1968 by V. G. Veselago to
describe a medium in which the Electric and Magnetic fields form a
left-handed system with the wave vector.

Legion - Object oriented middleware framework for grids.

Library - A collection of precompiled routines that can be linked to a
program.

Load balance - A measure of how evenly work is distributed among a set
of parallel processors. Parallel programs are most efficient when the load is
perfectly balanced, i.e. each processor has exactly the same amount of work
to do.

MAS - Method of Auxiliary Sources - A CEM method that attempts to
estimate the field components with the aid of discrete fictitious sources, the
currents of which are obtained by imposing the boundary conditions of the
problem.

MASIF - Mobile Agent System Interoperability Facility - A standard for
mobile agent systems which has been adopted as an Object Management
Group (OMG) technology.

Master-worker - Programming paradigm where a root process (master) is
responsible for distributing problem data amongst the remaining processes
(workers) and for collecting the results at the end of the executions.

Memory complexity - see complexity.

Message passing - Programming paradigm for developing parallel
applications. It adopts explicit exchange of messages between processes.

Metamaterials - Artificial media which exhibit unusual electromagnetic
properties.

MIC - Microwave Integrated Circuit - IC that operates at microwave
frequencies.

Microstrip structure - Layered medium comprising a substrate placed on
a PEC ground plane, with metallic traces and strips printed on top of the
substrate.

Middleware - Software acting as intermediate between higher and lower
layers in a hierarchical architecture.

MIMD - Multiple Instruction Multiple Data - Parallel architecture
containing a number of CPUs interconnected by a high-speed network. The
different CPUs execute in parallel different instructions and operate on
different data.

MLFMA - Multilevel Fast Multipole Algorithm - Extension of the
(two-level) FMM in a multilevel framework.

MMAS - Modified Method of Auxiliary Sources - A modified MAS
algorithm, which utilizes fictitious discrete current densities and point
charges instead of current sources. It offers improved performance in 3D
problems with thin structures.

MMIC - Monolithic Microwave Integrated Circuit - Monolithic MIC.

Mu Negative (MNG) medium - Metamaterial characterized by negative
values of the magnetic permeability.

Mobile Agent - Program with the ability to transfer itself from host to host
within a network, and to interact with other agents in order to perform its
task.

MoM - Method of Moments - Numerical discretization technique that is
typically used to solve integral equations.

MPI - Message Passing Interface - A standard API (Application
Programming Interface) that can be used to create parallel applications.

MPICH - Public domain implementation of MPI.

MPICH-G2 - Grid-enabled implementation of MPICH.

MPP - Massively Parallel Processor - Computer containing hundreds or
thousands of processors interconnected by a fast local interconnection
network.

Multithreaded application - Application in which a number of tasks are
carried out in parallel by simultaneously running threads.

Mutual authentication - The process of authenticating both parties
involved in a communication.

Nimrod-G - Resource management and scheduling system built on Globus
services and freely available on the Internet.

Negative Refractive Index materials - Media with negative values of the
index of refraction.

Non-repudiation - Method by which the sender of data is provided with
proof of delivery and the recipient is assured of the sender's identity, so that
neither can later deny having processed the data.

OpenMP - A standard API for distributing work across threads of a
shared memory computer.

ORB - Object Request Broker - Entity responsible for locating
components in a distributed object oriented environment.

PCB - Printed Circuit Board - Layered medium (board) that contains
layers of circuitry, used to connect both integrated and lumped components.

PEC - Perfect Electric Conductor - Material that conducts electric current
on its surface in a perfect way, i.e. without any resistance and without
allowing electric fields in its interior.

Peer-to-peer - Network of computers where each machine can act both as
client and as server.

Planar microwave antenna - Antenna system operating at microwave
frequencies and residing in a layered background medium.

Plasma frequency - The plasma frequency corresponds to the frequency
of an electromagnetic wave able to accelerate free electrons colliding with a
lattice of ions, defects and other electrons. In the Drude model the plasma
frequency is employed to express the frequency dependence of the electric
permittivity.

PMC - Perfect Magnetic Conductor - Material that conducts magnetic
current in a perfect way, i.e. without any resistance.

PML - Perfectly Matched Layer - An ABC first presented by J. P.
Bérenger in 1994 that has very good absorbing properties.

Polymorphism - Feature of the object oriented programming model,
according to which classes can overlap and intersect, i.e. they can include a
common set of operations, possibly assuming different meanings
depending on the class they are applied to.

POSIX - Portable Operating System Interface - Standard containing the
guidelines which govern new generation operating systems.

Protocol - Set of rules that end points in a telecommunication connection
use when they communicate.

Proxy - Entity acting on behalf of someone else.

Pthreads - Threads programming interfaces compliant with the standard
specifications included in the Portable Operating System Interface
(POSIX) family of standards.

Public-key - Key used to encrypt or decrypt text, associated with a private
key.

PVM - Parallel Virtual Machine - A subroutine library from Oak Ridge
National Laboratory. PVM includes libraries of subroutines callable from C
and Fortran programs, plus system support processes, for distributed
memory parallelism. PVM's goal is to allow the user to create a parallel
virtual machine from any heterogeneous collection of machines and
networks.

RCS - Radar cross section - A description of how an object reflects an
incident electromagnetic wave.

Registry - Authoritative, centrally controlled store of information.

REV - Remote EValuation - REV is a general term for any technology that
involves the transmission of executable software programs from a client to a
server computer for subsequent execution. After the program has terminated,
the results are sent back to the client. REV belongs to the family of mobile
code technologies.

Right Handed medium - Conventional medium in which the Electric and
Magnetic fields form a right-handed system with the wave vector.

RM - Resource Management - Component of the Globus Toolkit,
responsible for scheduling and allocating resources specifying, for example,
resource requirements and the operations to be performed, such as process
creation or data access.

SAR - Synthetic aperture radar - A form of radar in which sophisticated
post-processing of radar data is used to produce a very narrow effective
beam.

Scheduler - A program that controls which batch job runs next, when
adequate resources are available.

Shared memory architecture - Distributed architecture where the
processors can address a global, shared memory.

Service interface - The abstract boundary that a service exposes. It defines
the types of messages and the message exchange patterns that are involved
in interacting with the service, together with any conditions implied by those
messages.

Service-oriented architecture - Set of components which can be invoked,
and whose interface descriptions can be published and discovered.

Servlet - Pieces of Java code which cooperate with Java-compliant Web
servers to provide services to Web clients. A Java servlet can interface Web
servers with databases and other back-end services and process data to
return the results to the Web.

SIMD - Single Instruction Multiple Data - Parallel architecture where a
single control unit (CU) controls a number of arithmetic logic units (ALUs).
The ALUs execute in parallel the same instruction on different local data.

Single Negative medium - Medium in which the electric permittivity


(Epsilon Negative) or the magnetic permeability (Mu negative) assumes
negative values.

Single sign-on - The procedure of authentication via a single insertion of a


secret password.

SOAP - SOAP provides a standard, extensible, composable framework for


packaging and exchanging XML messages between a service provider and a
service requester. SOAP is independent of the underlying transport protocol,
but is most commonly carried on HTTP.

Sommerfeld-integral - Integral typically appearing in electromagnetism at


the end of a spectral domain method, e.g. under the form of an inverse
Hankel transformation.

Spectral domain method - Technique typically used in electromagnetism


to obtain the Greens function in the spatial domain via an inverse Fourier
transformation of the spectral domain Greens function.

Speed-up ratio - Speed gain obtained from the operation of N processors


in parallel.

SRR - Split Ring Resonator - Anisotropic resonant particle made out of


two concentric split-rings. It was proposed in 1999 by Pendry as inclusion
on a dielectric substrate to obtain a metamaterial with negative values of the
magnetic permeability.

SPMD - Single Program Multiple Data - Parallel programming paradigm


where all tasks execute the same program but on different sets of data. All
the nodes receive identical copies of the program to be executed.

SSL - Secure Sockets Layer - Protocol developed by Netscape to provide


security over Internet. Supports client and server authentication.

Stochastic Optimisation - Process of searching for a globally optimal problem
solution via an algorithm that utilises random processes. Typical cases
include Random Walk, Genetic Algorithms or Evolutionary Strategies. The term
is used in contrast with gradient descent methods.

Subgridding - Technique employed in the FDTD method to achieve a higher
spatial resolution in the region of interest, which is gridded more finely
than the rest of the problem space.

Substrate - Slab of linear, isotropic, and homogeneous material with a
dielectric and/or magnetic contrast, on which metallizations are usually
printed.

Symmetric key algorithm - Algorithm where the encryption and decryption keys
are the same.

TCP/IP - Transmission Control Protocol/Internet Protocol - The basic
communication language or protocol of the Internet.

TDMA - Time Division Multiple Access - A multiplexing technique based on time
division. Each user is assigned a certain time-slot. The information about the
time-slot assigned to a single user is crucial both when transmitting and when
receiving.

TE - Transverse Electric - A TE-polarized field only contains electric
components that are transverse with respect to a certain axis, which is often
chosen as the direction of propagation.

TEM - Transverse Electric and Magnetic - A TEM-polarized field only contains
electric and magnetic components that are transverse with respect to a certain
axis, which is often chosen as the direction of propagation.

Thread - Stream of instructions that can be scheduled to run as if it were a
process with an autonomous identity with respect to the program it is part of.

Thread-safe - Functions are said to be thread-safe when they can be safely
called by multiple threads: data are not corrupted when these functions are
invoked concurrently.
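
The idea can be sketched as follows, assuming Python's standard `threading`
module: a lock serializes access to shared data so that concurrent calls do
not corrupt it. The class `SafeCounter` and the helper `run_workers` are
hypothetical names introduced only for this illustration.

```python
import threading


class SafeCounter:
    """Counter whose increment() may be called safely from several threads."""

    def __init__(self) -> None:
        self._value = 0
        self._lock = threading.Lock()

    def increment(self) -> None:
        # The lock makes the read-modify-write sequence atomic.
        with self._lock:
            self._value += 1

    @property
    def value(self) -> int:
        with self._lock:
            return self._value


def run_workers(counter: SafeCounter, n_threads: int, n_increments: int) -> int:
    """Increment the counter from n_threads threads and return the final value."""

    def work() -> None:
        for _ in range(n_increments):
            counter.increment()

    threads = [threading.Thread(target=work) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value


print(run_workers(SafeCounter(), n_threads=4, n_increments=1000))
```

Without the lock, two threads could read the same old value and one update
would be lost; with it, the final count equals the total number of calls.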

TLM - Transmission Line Matrix method - Numerical method in computational
electromagnetics that solves Maxwell's equations in the time domain; both
space and time are discretized and the solution is obtained iteratively.

TLM-G - Grid-enabled system for full-wave electromagnetic simulations in a
Grid environment, based on the TLM method.

TM - Transverse Magnetic - A TM-polarized field only contains magnetic
components that are transverse with respect to a certain axis, which is often
chosen as the direction of propagation.

Topology - Describes the way nodes are interconnected in a parallel
architecture.

UDDI - Universal Description, Discovery and Integration - Specification that
defines a way to publish and discover information about Web services.

UMTS - Universal Mobile Telecommunications System - A standard for the third
generation of mobile systems. It is based on two standards. The former is the
so-called W-CDMA (wide-band CDMA, see CDMA). The latter is called TD-CDMA, and
is a combination of W-CDMA and TDMA (see TDMA). The basic goal of the third
generation of mobile systems (a digital technology) is the full support of
multimedia Web services in a mobile context.

UNIX - Multiprogramming and multiuser operating system written in the C
language.

URL - Uniform Resource Locator - An Internet protocol element consisting of a
short string of characters that conforms to a certain syntax. The string
comprises a name or address that can be used to refer to a resource. It is a
fundamental component of the World Wide Web.
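
To make the syntax mentioned in this entry concrete, the sketch below splits a
URL into its components with `urlsplit` from Python's standard `urllib.parse`
module. The example address and the helper name `describe_url` are
hypothetical and serve only as an illustration.

```python
from urllib.parse import urlsplit


def describe_url(url: str) -> dict:
    """Return the main syntactic components of a URL as a dictionary."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,      # protocol used to reach the resource
        "host": parts.hostname,      # name or address of the server
        "port": parts.port,          # None when the URL gives no explicit port
        "path": parts.path,          # location of the resource on the server
        "query": parts.query,        # optional parameters after '?'
        "fragment": parts.fragment,  # optional reference after '#'
    }


# Hypothetical address, used only to show the decomposition.
for name, value in describe_url(
    "http://www.example.org:8080/services/tlm?mesh=fine#results"
).items():
    print(f"{name}: {value}")
```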

Vector processor - Powerful computer processor designed to perform arithmetic
on long vectors rather than single numbers.

Virtual organization - A virtual organization (VO) comprises a set of
individuals and/or institutions having direct access to computers, software,
data, and other resources for collaborative problem-solving or other purposes.

VRML - Virtual Reality Modeling Language - A standard text file format for
representing three-dimensional (3D) interactive vector graphics, designed
particularly with the World Wide Web in mind.

WAN - Wide Area Network - Network connecting geographically dispersed
computers.

Web server - Program that, using the World Wide Web's Hypertext Transfer
Protocol (HTTP), serves the files that form Web pages to Web users.

Web service - A Web service is a software system designed to support
interoperable machine-to-machine interaction over a network. It has an
interface described in a machine-processable format (specifically WSDL).
Other systems interact with the Web service in a manner prescribed by its
description using SOAP messages, typically conveyed using HTTP with an XML
serialization in conjunction with other Web-related standards.

WSDL - Web Services Description Language - XML document format for describing
Web services.

WSRF - WS Resource Framework - Specifications defining a generic and open
framework for modeling and accessing stateful resources using Web services.

XML - Extensible Markup Language - Standard, flexible, and extensible data
format.

XPath - XPath is a language for finding information in an XML document. XPath
is used to navigate through elements and attributes in an XML document.
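
A small sketch of such navigation, assuming Python's standard
`xml.etree.ElementTree` module, which implements only a limited subset of
XPath. The XML document and its element and attribute names are invented for
illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML document describing two simulation runs.
DOC = """<simulations>
  <simulation method="FDTD"><cells>1000</cells></simulation>
  <simulation method="TLM"><cells>2500</cells></simulation>
</simulations>"""

root = ET.fromstring(DOC)

# Path expression selecting every <simulation> child of the root element.
methods = [sim.get("method") for sim in root.findall("./simulation")]

# Predicate on an attribute, followed by a step down to a child element.
tlm_cells = root.find("./simulation[@method='TLM']/cells").text

print(methods)
print(tlm_cells)
```

The first expression walks through elements, the second filters on an
attribute value, which are the two kinds of navigation named in the entry.
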
Index

ABC; 156; 170; 171; 256; 265
ABox; 31; 37; 39; See also Assertional box
Absorbing Boundary Condition. See also ABC
Abstract method; 21
Active object; 343; 344; 350; 351; 356
Adjoint interpolation; 199
Agent; 29; 66; 318; 356; 362; 365; 367; 368; 370; 372; 373
  genetic; 390; 393
  master; 368; 371; 380; 394
  mobile; 8; 9; 27; 365; 366; 368; 370; 372; 376; 380
  runtime environment; 27
  worker; 368; 369
Agent Management System; 362; 366; 382; 390; 394; See also AMS
AMS; 366; 368; 371; 372; 377; 386; 390; 391; 393
  API; 368; 370
Anomalous dispersion; 246
Ant; 318
Anterpolation; 199
API; 44; 344; 367; 370; 376; 380
A-PO; 281; 286
Applet; 9; 24
ASCII; 408
Assertional box; 31
Asynchronous notification; 61
Automatic convergence; 281; 298
Automatic discovery; 310
Auxiliary planes; 286
Axiom; 32; 35; 36; 38
AXIS; 400

Backward-wave; 234; 244
  antennas; 244
Barrier synchronization; 5
Beam
  splitter; 281; 288; 294; 304
  waist; 286; 292; 293
  waveguide; 277; 279; 280; 291; 302; 304
  width; 292
BIE-MoM; 156; 157; 158; 162; 223
Binding; 51; 355
Boundary conditions; 236
Boundary element method (BEM); 80
Boundary element method (FE/BE); 81
Bowtie antenna; 444
BPSK; 408
Branch-Line Coupler; 244
Broad bandwidth pulse; 266
Broadband signal; 270
Broadcasting; 5
Bytecode; 27

CAD; 155; 231; 278; 279
CAE; 307; 308; 309; 310; 311; 312; 313; 314; 315; 316; 318; 319; 322; 332;
    334; 336; 389
Centered finite difference equations; 255
Chain command; 296
Circular cylindrical cavity resonator; 446
Client-server; 7; 9; 366; 367; 400
Cluster; 2; 6; 7; 64; 352; 371
Coarse-mesh; 260
Command wizard; 277; 296
Common runtime; 62
Compact Cavity Resonator; 243
Complex
  stretching factor; 171
  thickness; 173; 174; 206
Complexity; 160; 196; 223
  computational; 32; 44; 157; 202; 210; 211; 222; 412
  memory; 157; 202; 211; 222
Component; 8; 30; 48; 53; 58; 60; 62; 65; 107; 276; 286; 342; 356
  distributed; 30
  heterogeneous; 9
Component ports; 281; 290
Composite Right-Handed Left-Handed medium; 241; See also CRLH
Computational grid; 2; 6; 57; 58; 65; 309; 314; 335
Computing performance; 259
Connection operator; 428
Connector; 280; 287; 288; 291; 293; 297; 299; 301; 303; 304
  coordinate system; 300
Constructor; 32; 38
Container; 318; 400
Conventional subgridding techniques (CS); 263
Cooperative engineering; 48; 58; 308; 335
CORBA; 47
Coupled-lines; 244
CRLH; 242

DAI; 68
DAML; 43
DAML+OIL; 43
Data management; 62; 315
DBMS; 68
DCE; 8; 47
Deployment Descriptors; 344
Description Logics; 30; See also DL
Design grid; 279; 281
DGA; 389; 390; 393; 394
Differential equation
  method; 74
DIG; 44
DII; 335
Direct solver; 157
Dispersion equation; 233
Dispersion relation; 167; 175; 177; 206
Dispersive medium; 246; 249
Distortion; 246; 271
Distributed computing; 1; 2; 3; 7; 20; 24; 47; 58; 342; 361; 405; 433
Distributed Genetic Algorithm; 389; 390; 392; See also DGA
DL; 30; 31; 32; 34; 37; 38; 39; 43; 44
DLL; 407; 410
DNG; 231; 234; 236; 238; 241; 242; 244; 245; 249; 250; 251; 266; 267; 269
Domain decomposition; 3; 257; 433
Double focusing effect; 236
Double negative materials; 231; See also DNG
DPS; 234; 235; 236; 239; 241; 242
Dual transmission line approach; 242
Dynamic binding; 22; 351
Dynamic invocation; 310

Efficiency; 6; 212; 354; 377
EFIE; 126; 165; 168; 181
Electric Field Integral Equation; 164; See also EFIE
Electric losses; 433
Electromagnetic Compatibility; 383; See also EMC
Embarrassingly parallel problems; 362; 372; 381; 388
EMC; 83; 84; 155; 382; 383; 386
Encapsulation; 20
ENG; 233; 237; 238; 267
Evanescent wave; 242
Expansion
  Fourier; 320
  waveguide mode; 320
Extended boundary condition integral method (FEBI); 81
Extrapolation; 264

Face operator; 436
Fast Fourier Transform; 197; See also FFT
Fast Inhomogeneous Plane Wave Algorithm; 161; See also FIPWA
Fast Multipole Method; 160; See also FMM
FDMoM; 83
FDMoM/FDTD; 83
FDTD; 77; 78; 79; 80; 81; 82; 83; 84; 85; 106; 107; 112; 113; 115; 118; 119;
    121; 124; 125; 126; 127; 129; 130; 131; 136; 137; 141; 144; 146; 156;
    170; 175; 232; 233; 251; 256; 257; 258; 259; 260; 261; 262; 264; 265;
    266; 270
FDTD/FDTD; 115; 116
FE/BE; 81; 84
FE/IEDD; 81
FEM; 78; 79; 84; 156; 170; 175; 363
FEM-BEM; 83
FETD; 84
FFT; 197; 198; 200; 409
Field simulators; 155
Field state space; 427
Filters; 280
Fine-mesh; 260
Finite element method; 78; See also FEM
Finite integration; 421
FIPA; 367; 390
FIPWA; 161; 223
FMM; 160; 161; 162; 169; 183; 191; 212
Fortran; 103; 128; 310; 316; 349; 352; 353; 357; 369; 377; 407
Fourier transform; 78; 82; 117; 118; 247; 319; 383
Frame; 276; 279
  connection of; 298
  coordinate system; 299; 301
  editor; 276; 277; 284; 287; 288; 296
  fixed; 303
  free; 303
Frequency domain MoM; 83; See also FDMoM
Fresnel refraction; 286
FSS; 304
F-value; 289; 290; 293; 294; 295
FVTD; 84

GA; 365; 388; 392
Galerkin
  solution; 118
  technique; 385
  testing; 168
Gaussian
  beam; 276; 280; 285; 286; 291
  beam analysis; 281; 285; 286; 288; 291; 292; 293; 295
  elimination; 157; 159
  feed; 301
  function; 246
  pulse; 246; 249; 266; 269
  quadrature rule; 183; 188; 194; 223
  signal; 271
Gauss-Laguerre; 286
GC; 57; 58; 60; 61; 64; 259; 312; 335
Gegenbauer's polynomials; 319
GEMACS; 77; 80; 81
General Electromagnetic Model for the Analysis of Complex Systems; 77; See
    also GEMACS
Genetic Algorithm; 365; See also GA
Genetic Search Agent; 390; See also GSA
Genetic Search Optimisation; 388
Genetic Software Agents; 363
Genetica; 390; 392
Geometrical Optics; 286; See also GO
Geometrical Theory of Diffraction; 76; 77; 79; 90; 286; See also GTD
GGF; 60
Globus Toolkit; 59; 314; See also GT
GM; 58; 59; 64
GO; 286
GRAM; 63
Graphical user interface; 4; 44; 276
GRASP9; 275; 277; 281; 282; 286; 295
Grasshopper; 367; 368; 370; 372; 382; 386
Green's
  dyadic; 164; 165; 167; 175; 179; 180; 182; 183; 187; 202; 205; 222
  function; 156; 158; 161; 162; 172; 175; 180; 184; 219; 222
Green's
  function; 121
Grid computing; 7; 10; 57; 232; 251; 257; 258; 308; 314; 315; 334; 342; 433;
    See also GC
grid middleware. See also GM
Grid middleware; 58
Grid Resource Allocation Manager; 442
GridFTP; 68; 69
Grounding; 54; 329; 334
Group communication; 5; 342; 345; 346; 350
GSA; 390; 391; 392; 393; 394
GT; 59; 61; 62; 63; 64; 68; 69; 314; 315; 316; 318
  API; 334
GTD; 76; 79; 80; 83; 281; 286
GWSDL; 61

Helmholtz equation; 160; 180; 184; 218
HF-MLFMA; 220
Homogenisation techniques; 233
HTML; 9; 11; 12; 29; 51; 400
HTTP; 332; 373
Huygens surface; 85; 95; 97; 100; 103; 113; 115; 116; 118; 120; 124; 125;
    127; 130; 131; 137; 141; 146

IDL; 283; 284
IETD; 78
IFA; 106
IKV++; 367
IMW; 220
Incident waves; 421
Incident-field array excitation; 106; See also IFA
Incoming multipole wave; 220; See also IMW
Incoming plane wave; 185; See also IPW
Index service; 65
Indirect solver; 160
Information services; 62
Inheritance; 21; 22; 25
Integral equation; 384
  boundary (BIE); 156; 164
  domain decomposition; 81
  method; 74; 76; 80; 81
  reaction (RIE); 97
Integral-equation time-domain; 78; See also IETD
Inter Process Communication; 283; See also IPC
Interaction; 190
  far; 181; 193; 201
  near; 181; 192; 201
Interface; 21; 26; 48; 51
Interpolation; 84; 112; 116; 197; 198; 201; 260; 264
IPC; 283; 284
IPW; 185; 194; 195; 196; 199; 201
Iterative solver; 157; 159; 160; 192; 201

JADE; 367; 370; 390; 394
JAR; 371; 407; 408
Java; 10; 16; 23; 24; 51; 312; 316; 317; 342; 346; 352; 353; 357; 369; 407;
    412
  agent; 365
  API; 24; 27; 333; 335; 342
  bytecode; 24
  servlet; 9; 24; 366; 376; 380; 400
Java Native Interface; 316; See also JNI
Java Server Page; 400; See also JSP
Java Virtual Machine; 24; See also JVM
JavaBeans; 414
JAX-RPC; 335
JNI; 316; 317; 369; 394; 407; 410
Job management; 62; 63; 314
JSP; 400; 411
JVM; 343; 346; 354; 365; 369
JWSDL; 335
  API; 335

Knowledge base; 37; 320

Laplace equation; 160; 218
Late binding; 49
Leaky-wave antenna; 244
Leap-frog; 256; 266; 347
Lewin transformation; 319
LF-MLFMA; 220
LF-PML-MLFMA; 220
Lifecycle management; 61
Log System Interface; 283; See also LSI
Lossy-Drude model; 238; 245; 266
Low-frequency; 155
LSI; 283; 284
LU-decomposition; 157; 159; 160

Magnetic losses; 433
Martin-Puplett; 280
MAS; 364; 377
MASIF; 367; 372; 373
Massively parallel processors; 64
Master-Worker; 3; 368; 380; 386; 392
MAT; 390
Maxwell's equations; 75; 120; 155; 156; 170; 171; 252; 286; 342; 352; 347;
    357
MCMI; 444
MCMIPD; 444
MDS; 65; 66
MDS-Index; 442
Mesh-state; 427
Message passing; 4; 5; 64; 232
Meta-application; 335
Metamaterial-based devices; 232
Metamaterials; 232; 233; 251; 266
Meta-service; 334
Method of Auxiliary Sources; 364; See also MAS
Method of Moments; 76; 78; 80; 286; See also MoM
Michelson; 280
Microstrip; 155; 205; 242; 244
  antenna array; 181; 213
  structure; 163; 187
  substrate; 156; 203; 210; 216
  transmission line; 377
Microwave; 307
Middleware; 58
Million Cells Million Iterations; 444; See also MCMI
Million Cells Million Iterations Per Day; 444; See also MCMIPD
Mirror; 280
MLFMA; 161; 162; 167; 181; 182; 192; 193; 202; 204; 206; 208; 210; 211;
    212; 218; 222
  High frequency; 217; See also HF-MLFMA
  Low frequency; 218; See also LF-MLFMA
  tree; 188; 190; 202
MM; 84; 309
MMAS; 377; 378; 379
MNG; 233; 237; 238; 267
M-n-m pulse; 266
Mobile Agent Technology. See MAT
Mode trimming; 202; 203; 208; 210
Mode-matching; 84; See also MM
Modified Method of Auxiliary Sources; 377; See also MMAS
Modulated signals; 246
MoM; 80; 83; 85; 86; 89; 90; 93; 97; 99; 100; 101; 104; 105; 112; 116; 117;
    118; 120; 121; 125; 126; 127; 128; 130; 140; 156; 168; 181; 281; 363;
    385; 386
MoM/FDTD; 80; 82; 85; 106; 115; 117; 121; 128; 130; 136; 144; 146
MoM/MoM; 86; 89; 90; 92; 103; 146
MoM/UTD; 81
Moment matrix; 157; 160; 161; 169; 182; 193; 201; 206
MoMTD; 84
Monitoring and Discovering System; 65; See also MDS
MPI; 4; 5; 6; 58; 64; 232; 258; 259; 314; 352; 353; 361; 370; 371
MPICH; 6; 371
MPICH-G2; 6; 63; 64; 314
MPP; 64
Multicomputer; 2; 3
Multilayered
  structure; 182; 221
Multilevel fast multipole algorithm; 161
Multilevel Fast Multipole Algorithm. See also MLFMA
Multi-level parallelism; 3
Multiple cycle m-n-m pulse; 267
Multipole expansion; 219
Multiprocessing; 2
Multiprocessor; 2; 3; 6; 57; 315

Namespace; 15; 16; 317
Navigator; 283
NEC; 82; 90; 126; 127; 128; 130; 135; 136; 138; 139
Negative phase velocity; 268
Negative refraction; 236
NRI antenna; 244
Numerical instability; 266

OASIS; 61
Object Orientation; 19; See also OO
Object oriented; 276; 282; 342
Object wizard; 277; 293; 303
OGSA; 60; 61
OGSI; 61
OIL; 43
OilEd; 44
OMG; 367
OMW; 220
Ontology; 29; 30; 32; 40; 44; 52; 53; 66; 310; 312; 313; 314; 318; 319; 320;
    321; 322; 323; 332; 333; 334; 394
OO; 7; 19; 20; 22; 24; 347; 348
OpenMP; 4
OPW; 185; 194; 196; 198; 201
Outgoing multipole wave; 220; See also OMW
Outgoing plane wave; 185; See also OPW
OWL; 43; 66; 318
OWL-DL; 43; 44
OWL-Full; 43
OWL-Lite; 43
OWL-S; 53; 66; 318; 319; 321; 322; 329; 333
  API; 334

Palette; 276
Parallel
  algorithm; 259
  computing; 1; 3; 257
  programming; 3; 4
  transfer; 69
Parallel Genetic Algorithm. See also PGA
Parametric
  analysis; 363; 364
  application; 362
  modeling; 278
  problem; 362; 363; 367; 371; 386; 399
  processing; 363
  programming; 3
  simulation; 377; 381; 383; 406; 410
Particle Swarm Optimization; 364; 365
PEC; 82; 158; 163; 167; 172; 174; 205; 377
Peer-to-peer; 9; 351
Pendry's lens; 242
Perfect Electric Conductor. See also PEC
Perfect lens; 242
Performance Gain; 262
PGA; 389
Phase
  delay; 242; 243; 268
  shifter; 244
  slippage; 292
Physical optics; 83; 276; 285; See also PO
  approximation; 79
  method; 76; 85; 90
Planar
  antenna; 155
  antenna array; 163
  circuit; 155; 157
  microstrip structure; 162
  solver; 156
Plane wave; 106; 112; 163; 164; 165; 170; 181; 184; 216; 246; 266; 280; 296;
    364
  decomposition; 162; 182; 184; 185; 187; 190; 203; 204; 210; 218; 219; 220
  monochromatic; 234
  source; 256
Plane-wave time-domain; 161; See also PWTD
PML; 158; 162; 170; 172; 173; 175; 192; 204; 206; 222
PML-MLFMA; 187; 190; 192; 203; 204; 205; 211; 212
PO; 79; 83; 85; 281; 285; 286; 295; 298
Point-to-point communication; 5
Polymorphism; 21; 22; 351
Port-type; 51; 317
Precondition; 323
Printed circuit board; 79; 242
ProActive; 342; 343; 344; 345; 346; 347; 354; 355; 356; 357
Protégé; 44; 318
  OWL Plugin; 44
PTD; 83; 281; 285; 295; 298
PVM; 361; 370; 371
PWTD; 161

Quasi-optical
  network; 277; 280; 285; 287; 303; 304
  system; 275; 285
QUAST; 275; 276; 278; 281; 282; 284; 285; 286; 297; 298; 303

Racer; 44; 318; 332
Radiation modes; 158; 167; 174
RCS; 364; 409; 412
RDF; 42; 43
Reasoner; 38; 39; 40; 43; 44; 313; 318; 332
Reflector; 280; 288; 289; 293; 294
  antenna; 21; 275; 277
  conic; 291
  curved; 285
  double; 284
  elliptic; 280
  hyperbolic; 280
  parabolic; 278; 280
  rim; 295
  single; 284
Resource
  ontology; 322
RMI; 8; 26; 343
RPC; 2; 7; 8; 26

Sampling rate; 186; 190; 196; 199; 201; 203; 204; 210
SAR; 82; 140; 142; 144; 146; 399; 408; 409; 410; 412
Scalability; 6; 44; 352; 354; 355; 390; 445; 446
Scattered waves; 421
Scattering; 5; 79; 81; 83; 85; 96; 115; 116; 164; 165; 169; 183; 213; 216;
    364
SCN node; 436
Security; 27; 63; 311; 312; 342
Semantic Grid; 66; 310; 311; 313; 336
Semantic Web; 29; 30; 40; 42; 52; 53; 66
Serializable; 26
Service
  binding; 319
  discovery; 319
  orchestration; 319; 321
Shared memory; 4; 285; 433
Single cycle pulse; 266
Single Negative; 233
Single-level parallelism; 3
Slab; 92; 246; 247; 249; 250; 251; 266
Smith's medium; 236
Snap-to-grid; 289
SNG; 233
SOA; 48; 311
SOAP; 16; 51; 362; 373; 399; 400; 408; 412
Software engineering; 19
Sommerfeld-type integral; 158; 167
Spatial shift operators; 428
Speed-up; 6; 315; 363; 388; 413; 414
Split Ring Resonator; 237; See also SRR
SRR; 237; 238; 244
Stateful service; 60
Structured programming; 19
Stub; 430
Subconcept; 31; 38
Sub-domain; 3; 96; 258; 318; 350; 351; 356
Subgridding; 257; 260; 261; 263
Subsumption; 38
Sub-wavelength
  distance; 242
  resolution; 243
Superclass; 21
Superconcept; 31
Surface wave
  Bérenger; 175
  evanescent; 175
  propagating; 158; 167; 174
SWRL; 323
Symmetric condensed TLM node; 426
Synthetic Aperture Radar; 410; See also SAR

TBox. See also Terminologic box
TCP/IP; 400
Terminologic box; 31
Third party transfer; 63; 69; 315
Thread; 4
Time envelope; 246
TL; 239; 241
TLM; 417
  algorithm; 440
  cell; 418
  node; 420
  parallelization; 433
  scheme; 427
TLM-G system; 441
Total Field Scattered Field scheme (TFSF); 257
Total/Scattered Field Formulation method; 106
Translation
  element; 195
  matrix; 220
  operator; 184; 185
Transmission Line; 239; See also TL
Transmission Line Matrix; 417; See also TLM

UDDI; 50; 52; 65
Uniform Geometrical Theory of Diffraction; 79; See also UTD
Unsatisfiability; 39
URL; 15; 43; 343; 404
UTD; 79; 83

Variable mesh; 233; 259; 260; 262
Vector computing; 433
Virtual computation environment; 440
Virtual organization; 440
VRML; 376

Waveguide; 84; 155; 172; 174; 206; 244
  modes; 309; 319
Web Services; 7; 10; 16; 47; 48; 49; 50; 51; 52; 53; 58; 60; 65; 362; 399;
    400; 414
WebMages; 367; 372
Wizard; 276; 281; 287
WS-Addressing; 61
WSDL; 50; 51; 52; 61; 317; 329; 335
WS-Notification; 61
WSRF; 61

Xerces; 16
XML; 9; 10; 12; 14; 15; 16; 43; 51; 65; 68; 311; 323; 329; 335; 344; 345;
    362; 373; 400; 408; 412; 414
  parser; 16
  Schema; 15; 16; 43
XML-RPC; 8; 16
XPath; 65; 66

YATSIM; 441
