
ABSTRACT

High Performance Computing (HPC) allows scientists and engineers to deal with very complex problems using fast computer hardware and specialized software. Since these problems often require hundreds or even thousands of processor hours to complete, an approach based on the use of supercomputers has traditionally been adopted. The recent tremendous increase in the speed of PC-type computers opens a relatively cheap and scalable solution for HPC using cluster technologies. Conventional MPP (Massively Parallel Processing) supercomputers are oriented toward the very high end of performance. As a result, they are relatively expensive and require special, and also expensive, maintenance support. Better understanding of applications and algorithms, as well as significant improvements in communication network technologies and processor speeds, led to the emergence of a new class of systems, called clusters of SMPs or networks of workstations (NOW), which are able to compete in performance with MPPs and have excellent price/performance ratios for particular application types.

A cluster is a group of independent computers working together as a single system to ensure that mission-critical applications and resources are as highly available as possible. The group is managed as a single system, shares a common namespace, and is specifically designed to tolerate component failures and to support the addition or removal of components in a way that is transparent to users. In this paper we introduce the basics of cluster technology.

1. INTRODUCTION

The development of new materials and production processes based on high technology requires the solution of increasingly complex computational problems. However, even as computer power, data storage, and communication speed continue to improve exponentially, available computational resources often fail to keep up with what users demand of them. Therefore a high-performance computing (HPC) infrastructure becomes a critical resource for research and development as well as for many business applications. Traditionally, HPC applications were oriented toward the use of high-end computer systems, the so-called "supercomputers". Before considering the remarkable progress in this field, some attention should be paid to the classification of existing computer architectures.

SISD (Single Instruction stream, Single Data stream) type computers. These are the conventional systems that contain one central processing unit (CPU) and hence can accommodate one instruction stream that is executed serially. Nowadays many large mainframes have more than one CPU, but each of these executes an instruction stream that is unrelated to the others. Therefore, such systems should still be regarded as a set of SISD machines acting on different data spaces. Examples of SISD machines are most workstations, such as those of DEC, IBM, Hewlett-Packard, and Sun Microsystems, as well as most personal computers.

SIMD (Single Instruction stream, Multiple Data stream) type computers. Such systems often have a large number of processing units that all execute the same instruction on different data in lock-step. Thus, a single instruction manipulates many data items in parallel. Examples of SIMD machines are the CPP DAP Gamma II and the Alenia Quadrics.

Vector processors, a subclass of the SIMD systems. Vector processors act on arrays of similar data rather than on single data items, using specially structured CPUs. When data can be manipulated by these vector units, results can be delivered at a rate of one, two and, in special cases, three per clock cycle (a clock cycle being defined as the basic internal unit of time for the system). Vector processors thus operate on their data in an almost parallel way, but only when executing in vector mode, in which case they are several times faster than when executing in conventional scalar mode. For practical purposes vector processors are therefore mostly regarded as SIMD machines. Examples of such systems are the Cray 1 and the Hitachi S3600.

MIMD (Multiple Instruction stream, Multiple Data stream) type computers. These machines execute several instruction streams in parallel on different data. The difference from the multi-processor SISD machines mentioned above lies in the fact that the instructions and data are related, because they represent different parts of the same task to be executed. MIMD systems may therefore run many sub-tasks in parallel in order to shorten the time-to-solution for the main task. There is a large variety of MIMD systems, ranging from a four-processor NEC SX-5 to a thousand-processor SGI/Cray T3E supercomputer. Besides the above classification, another important distinction between classes of computing systems can be made according to the type of memory access (Figure 1).

Shared memory (SM) systems have multiple CPUs, all of which share the same address space. This means that knowledge of where data is stored is of no concern to the user, as there is only one memory, accessed by all CPUs on an equal basis. Shared memory systems can be either SIMD or MIMD. Single-CPU vector processors can be regarded as an example of the former, while the multi-CPU models of these machines are examples of the latter.

Distributed memory (DM) systems. In this case each CPU has its own associated
memory. The CPUs are connected by some network and may exchange data between
their respective memories when required. In contrast to shared memory machines the
user must be aware of the location of the data in the local memories and will have to
move or distribute these data explicitly when needed. Again, distributed memory
systems may be either SIMD or MIMD.

Figure 1. Shared (left) and distributed (right) memory computer architectures

To better understand the current situation in the field of HPC systems and the place of cluster-type computers among them, a brief overview of supercomputer history is given below.

1.1 SUPERCOMPUTERS

An important breakthrough in the field of HPC systems came in the late 1970s, when the Cray-1 and soon the CDC Cyber 203/205 systems, both based on vector technology, were built. These supercomputers were able to achieve unprecedented performance for certain applications, being more than one order of magnitude faster than other available computing systems. In particular, the Cray-1 system boasted at that time a world-record speed of 160 million floating-point operations per second (MFlops). It was equipped with an 8 megabyte main memory and priced at $8.8 million. The range of early supercomputer applications was typically limited to those having regular, easily vectorisable data structures and being very demanding in terms of floating-point performance. Some examples include mechanical engineering, fluid dynamics and cryptography tasks. The use of vector computers by a broader community was initially limited by the lack of programming tools and vectorising compilers, so that applications had to be hand coded and optimized for a specific computer system. However, commercial software packages became available in the 1980s for vector computers, pushing up their industrial use. At this time, the first multiprocessor supercomputer, the Cray X-MP, was developed and achieved a performance of 500 MFlops.

Supercomputers are defined as the fastest, most powerful computers in terms of CPU power and I/O capabilities. Since computer technology is continually evolving, this is always a moving target. This year's supercomputer may well be next year's entry-level personal computer. In fact, today's commonly available personal computers deliver performance that easily bests the supercomputers that were available on the market in the 1980s.

A strong limitation on the further scalability of vector computers was their shared-memory architecture. Therefore, massively parallel processing (MPP) systems using distributed memory were introduced by the end of the 1980s. The main advantage of such systems is the possibility of dividing a complex job into several parts, which are executed in parallel by several processors, each having dedicated memory (Figure 1). The communication between the parts of the main job occurs within the framework of the so-called message-passing paradigm, which was standardized in the Message Passing Interface (MPI). The message-passing paradigm is flexible enough to support a variety of applications and is also well adapted to the MPP architecture. In recent years, a tremendous improvement in the performance of standard workstation processors has led to their use in MPP supercomputers, resulting in significantly lower price/performance ratios.

Traditionally, conventional MPP supercomputers are oriented toward the very high end of performance. As a result, they are relatively expensive and require special, and also expensive, maintenance support. To meet the requirements of the lower and medium market segments, symmetric multiprocessing (SMP) systems were introduced in the early 1990s to address commercial users with applications such as databases, scheduling tasks in the telecommunications industry, data mining and manufacturing.

Better understanding of applications and algorithms, as well as significant improvements in communication network technologies and processor speeds, led to the emergence of a new class of systems, called clusters of SMPs or networks of workstations (NOW), which are able to compete in performance with MPPs and have excellent price/performance ratios for particular application types. In practice, clustering technology can be applied to any arbitrary group of computers, allowing the construction of homogeneous or heterogeneous systems. Even greater performance can be achieved by combining groups of clusters into a HyperCluster or even a Grid-type system.

It is worth noting that, by the end of 2002, the most powerful HPC systems had performance in the range of 3 to 36 TFlops. The top five supercomputer systems included the Earth Simulator (35.86 TFlops, 5120 processors), installed by NEC in 2002; two ASCI Q systems (7.72 TFlops, 4096 processors), built by Hewlett-Packard in 2002 and based on the AlphaServer SC computer systems; ASCI White (7.23 TFlops, 8192 processors), installed by IBM in 2000 [4]; and, a pleasant surprise, the MCR Linux Cluster (5.69 TFlops, 2304 Xeon 2.4 GHz processors), built by Linux NetworX in 2002 for Lawrence Livermore National Laboratory (USA). According to the TOP500 Supercomputers List from November 2002, cluster-based systems represented 18.6% of all supercomputers, and most of them (about 60%) used Intel processors. Finally, one should note that the application range of modern supercomputers is very wide and covers mainly industrial, research and academic fields. The areas covered are related to telecommunications, weather and climate research/forecasting, financial risk analysis, car crash analysis, databases and information services, manufacturing, geophysics, computational chemistry and biology, pharmaceutics, the aerospace industry, electronics and much more.

2. CLUSTERS

Extraordinary technological improvements over the past few years in areas such as microprocessors, memory, buses, networks, and software have made it possible to assemble groups of inexpensive personal computers and/or workstations into a cost-effective system that functions in concert and possesses tremendous processing power. Cluster computing is not new, but in company with other technical capabilities, particularly in the area of networking, this class of machines is becoming a high-performance platform for parallel and distributed applications. Scalable computing clusters, ranging from a cluster of (homogeneous or heterogeneous) PCs or workstations to SMPs (Symmetric Multi Processors), are rapidly becoming the standard platforms for high-performance and large-scale computing.

A cluster is a group of independent computer systems and thus forms a loosely coupled multiprocessor system, as shown in Figure 2.

Figure 2. A cluster system built by connecting four SMPs

A network is used to provide inter-processor communication. Applications that are distributed across the processors of the cluster use either message passing or network shared memory for communication. A cluster computing system is a compromise between a massively parallel processing system and a distributed system. An MPP (Massively Parallel Processors) system node typically cannot serve as a standalone computer; a cluster node usually contains its own disk and is equipped with a complete operating system, and therefore it can also handle interactive jobs. In a distributed system, each node can function only as an individual resource, while a cluster system presents itself as a single system to the user.

The concept of Beowulf clusters originated at the Center of Excellence in Space Data and Information Sciences (CESDIS), located at the NASA Goddard Space Flight Center in Maryland. The goal of building a Beowulf cluster is to create a cost-effective parallel computing system from commodity components to satisfy specific computational requirements for the earth and space sciences community. The first Beowulf cluster was built from 16 Intel DX4 processors connected by a channel-bonded 10 Mbps Ethernet, and it ran the Linux operating system. It was an instant success, demonstrating the concept of using a commodity cluster as an alternative choice for high-performance computing (HPC). After the success of the first Beowulf cluster, several more were built by CESDIS using several generations and families of processors and networks.

Beowulf is a concept of clustering commodity computers to form a parallel, virtual supercomputer. It is easy to build a unique Beowulf cluster from the components that you consider most appropriate for your applications. Such a system can provide a cost-effective way to gain features and benefits (fast and reliable services) that have historically been found only on more expensive proprietary shared memory systems. The typical architecture of a cluster is shown in Figure 3. As the figure illustrates, numerous design choices exist for building a Beowulf cluster. For example, the bold line indicates our cluster configuration from bottom to top. No Beowulf cluster is general enough to satisfy the needs of everyone.

Figure 3. Architecture of cluster systems

2.1 LOGICAL VIEW OF CLUSTER

A Beowulf cluster uses a multi-computer architecture, as depicted in Figure 4. It features a parallel computing system that usually consists of one or more master nodes and one or more compute nodes, or cluster nodes, interconnected via widely available network interconnects. All of the nodes in a typical Beowulf cluster are commodity systems (PCs, workstations, or servers) running commodity software such as Linux.

Figure 4. Logical view of a cluster

The master node acts as a server for Network File System (NFS) and as a gateway
to the outside world. As an NFS server, the master node provides user file space and
other common system software to the compute nodes via NFS. As a gateway, the master
node allows users to gain access through it to the compute nodes. Usually, the master
node is the only machine that is also connected to the outside world using a second
network interface card (NIC). The sole task of the compute nodes is to execute parallel
jobs. In most cases, therefore, the compute nodes do not have keyboards, mice, video
cards, or monitors. All access to the compute nodes is provided via remote connections from the master node. Because compute nodes do not need to access machines outside the cluster, nor do machines outside the cluster need to access compute nodes directly, compute nodes commonly use private IP addresses, such as the 10.0.0.0/8 or 192.168.0.0/16 address ranges.
From a user’s perspective, a Beowulf cluster appears as a Massively Parallel
Processor (MPP) system. The most common methods of using the system are to access
the master node either directly or through Telnet or remote login from personal
workstations. Once on the master node, users can prepare and compile their parallel
applications, and also spawn jobs on a desired number of compute nodes in the cluster.
Applications must be written in parallel style and use the message-passing programming
model. Jobs of a parallel application are spawned on compute nodes, which work
collaboratively until finishing the application. During the execution, compute nodes use
standard message-passing middleware, such as Message Passing Interface (MPI) and
Parallel Virtual Machine (PVM), to exchange information.
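
As an illustration of this workflow, below is a minimal sketch of an MPI program in C of the kind a user might compile on the master node and launch across the compute nodes. The mpicc and mpirun commands shown in the comments are the usual front ends provided by MPI implementations such as LAM/MPI or MPICH, but the exact command names and options depend on the installation.

    /* hello_mpi.c
     * Typical usage (details vary by MPI installation):
     *   mpicc hello_mpi.c -o hello_mpi
     *   mpirun -np 8 ./hello_mpi
     */
    #include <stdio.h>
    #include <mpi.h>

    int main(int argc, char *argv[])
    {
        int rank, size, len;
        char host[MPI_MAX_PROCESSOR_NAME];

        MPI_Init(&argc, &argv);                  /* start the MPI runtime          */
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);    /* this process's id (0..size-1)  */
        MPI_Comm_size(MPI_COMM_WORLD, &size);    /* total number of processes      */
        MPI_Get_processor_name(host, &len);      /* name of the node we run on     */

        printf("Process %d of %d running on %s\n", rank, size, host);

        MPI_Finalize();                          /* shut the MPI runtime down      */
        return 0;
    }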

3. WHY CLUSTERS

The question may arise why clusters are designed and built when perfectly good commercial supercomputers are available on the market. The answer is that the latter are expensive, while clusters are surprisingly powerful.

The supercomputer has come to play a larger role in business applications. In areas ranging from data mining to fault-tolerant performance, clustering technology has become increasingly important.

Commercial products have their place, and there are perfectly good reasons to buy a commercially produced supercomputer, provided it is within our budget and our applications can keep the machine busy all the time. We will also need a data center to keep it in, and then there is the budget to keep up with the maintenance and upgrades that will be required to keep our investment up to par. However, many who need to harness supercomputing power do not buy supercomputers because they cannot afford them, and such machines are also very difficult to upgrade.

Clusters, on the other hand, are a cheap and easy way to take off-the-shelf components and combine them into a single supercomputer. In some areas of research clusters are actually faster than commercial supercomputers. Clusters also have the distinct advantage that they are simple to build using components available from hundreds of sources. We do not even have to use new equipment to build a cluster.

3.1. COMPARING THE OLD AND THE NEW

Today, open standards-based HPC systems are being used to solve problems ranging from high-end, floating-point-intensive scientific and engineering computations to data-intensive tasks in industry. Some of the reasons why HPC clusters outperform RISC-based systems include:

Collaboration

Scientists can collaborate in real time across dispersed locations, bridging isolated islands of scientific research and discovery, when HPC clusters are based on open source and building-block technology.

Scalability

HPC clusters can grow in overall capacity because processors and nodes can be added as
demand increases.

Availability

Because single points of failure can be eliminated, if any one system component goes down, the system as a whole or the solution (multiple systems) stays highly available.

Ease of technology refresh

Processors, memory, disk or operating system (OS) technology can be easily updated,
and new processors and nodes can be added or upgraded as needed.

Affordable service and support

Compared to proprietary systems, the total cost of ownership can be much lower. This
includes service, support and training.

Vendor lock-in

The age-old problem of being locked into proprietary systems is eliminated by the use of open systems built on industry-accepted standards.

System manageability

The installation, configuration and monitoring of key elements of proprietary systems is usually accomplished with proprietary technologies, complicating system management. The servers of an HPC cluster can be easily managed from a single point using readily available network infrastructure and enterprise management software.

Reusability of components

Commercial components can be reused, preserving the investment. For example, older
nodes can be deployed as file/print servers, web servers or other infrastructure servers.

Disaster recovery

Large SMPs are monolithic entities located in one facility. HPC systems can be co-
located or geographically dispersed to make them less susceptible to disaster.

4. CLUSTERING CONCEPTS

Clusters are in fact quite simple. They are a bunch of computers tied together with a network, working on a large problem that has been broken down into smaller pieces. There are a number of different strategies we can use to tie them together, and a number of different software packages that can be used to make the software side of things work.

4.1. PARALLELISM

The name of the game in high performance computing is parallelism. It is the quality that allows something to be done in parts that work independently, rather than as a task with so many interlocking dependencies that it cannot be broken down any further. Parallelism operates at two levels: hardware parallelism and software parallelism.

4.1.1. HARDWARE PARALLELISM

On one level, hardware parallelism deals with the CPU of an individual system and how we can squeeze performance out of sub-components of the CPU that can speed up our code. At another level there is the parallelism that is gained by having multiple systems working on a computational problem in a distributed fashion. Parallelism is known as 'fine grained' when it occurs inside the CPU or among multiple CPUs in the same system, and 'coarse grained' when it involves a collection of separate systems acting in concert.

CPU-LEVEL PARALLELISM

A computer's CPU is commonly pictured as a device that operates on one instruction after another in a straight line, always completing one step or instruction before a new one is started. But new CPU architectures have an inherent ability to do more than one thing at once. The logic of the CPU chip divides the CPU into multiple execution units, which allow the CPU to attempt to process more than one instruction at a time. Two hardware features of modern CPUs support multiple execution units: the cache, a small, fast memory inside the CPU that holds recently used data and instructions, and the pipeline, a small area of memory inside the CPU where the instructions that are next in line to be executed are stored. Both the cache and the pipeline allow impressive increases in CPU performance, but they also require a lot of intelligence on the part of the compiler to arrange the executable code in such a way that the CPU has a good chance of being able to execute multiple instructions simultaneously.
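
As a small illustration of this fine-grained level (a sketch only), the program below sums the same matrix twice. The row-wise loop walks memory sequentially and therefore uses the cache well; the column-wise loop strides through memory and is usually noticeably slower. The exact ratio depends on the CPU, the compiler and the matrix size.

    /* cache_demo.c - compare cache-friendly and cache-unfriendly loop orders. */
    #include <stdio.h>
    #include <time.h>

    #define N 2000
    static double a[N][N];          /* about 32 MB, zero-initialised            */

    int main(void)
    {
        long i, j;
        double sum = 0.0;
        clock_t t;

        t = clock();
        for (i = 0; i < N; i++)     /* row-wise: consecutive memory accesses    */
            for (j = 0; j < N; j++)
                sum += a[i][j];
        printf("row-wise:    %.3f s\n", (double)(clock() - t) / CLOCKS_PER_SEC);

        t = clock();
        for (j = 0; j < N; j++)     /* column-wise: large strides, cache misses */
            for (i = 0; i < N; i++)
                sum += a[i][j];
        printf("column-wise: %.3f s (sum = %g)\n",
               (double)(clock() - t) / CLOCKS_PER_SEC, sum);
        return 0;
    }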

SYSTEM LEVEL PARALLELISM

It is the parallelism of multiple nodes coordinating to work on a problem in parallel that gives the cluster its power. There are other levels at which even more parallelism can be introduced into the system. For example, if we decide that each node in our cluster will be a multi-CPU system, we introduce a fundamental degree of parallel processing at the node level. Having more than one network interface on each node introduces communication channels that may be used in parallel to communicate with other nodes in the cluster. Finally, if we use multiple disk drive controllers in each node, we create parallel data paths that can be used to increase the performance of the I/O subsystem.

4.2. SOFTWARE PARALLELISM

Software parallelism is the ability to find well-defined areas in a problem we want to solve that can be broken down into self-contained parts. These parts are the program elements that can be distributed and give us the speedup that we want to get out of a high-performance computing system. Before we can run a program on a parallel cluster, we have to ensure that the problem we are trying to solve is amenable to being done in a parallel fashion. Almost any problem that is composed of smaller sub-problems that can be quantified can be broken down into smaller pieces and run on the nodes of a cluster.
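
A minimal sketch of such a decomposition is shown below; the function name and layout are illustrative only. It computes which contiguous slice of an n-element data set each worker should handle, which is typically the first step in distributing a data-parallel job across the nodes of a cluster.

    /* partition.c - split n elements into nearly equal contiguous slices. */
    #include <stdio.h>

    /* Give worker 'id' (0..nworkers-1) its slice of n elements; any remainder
     * is spread over the lowest-numbered workers, one extra element each.   */
    void slice(long n, int nworkers, int id, long *first, long *count)
    {
        long base  = n / nworkers;
        long extra = n % nworkers;

        *count = base + (id < extra ? 1 : 0);
        *first = id * base + (id < extra ? id : extra);
    }

    int main(void)
    {
        long first, count;
        int id;

        for (id = 0; id < 4; id++) {            /* example: 10 elements, 4 workers */
            slice(10, 4, id, &first, &count);
            printf("worker %d handles elements %ld..%ld\n",
                   id, first, first + count - 1);
        }
        return 0;
    }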

5. NETWORKING CONCEPTS

Networking technologies consist of four basic components and concepts:

NETWORK PROTOCOLS

Protocols are sets of standards that describe a common information format that computers use when communicating across a network. A network protocol is analogous to the way information is sent from one place to another via the postal system: it specifies how information must be packaged and how it is labeled in order to be delivered from one computer to another.

NETWORK INTERFACES

The network interface is a hardware device that takes the information packaged by a network protocol and puts it into a format that can be transmitted over some physical medium such as Ethernet, a fiber-optic cable, or even the air using radio waves.

TRANSMISSION MEDIUM

The transmission medium is the mechanism through which the information bundled together by the networking protocol and transmitted by the network interface is delivered from one computer to another.

BANDWIDTH

Bandwidth is the amount of information that can be transmitted over a given
transmission medium over a given amount of time. It is usually expressed in some form
of bits per second.
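
As a rough worked example, a 100 Mbit/s Ethernet link can carry at most about 12.5 megabytes per second, so moving a 100 megabyte data set between two nodes takes on the order of eight seconds even before protocol overhead is taken into account.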

5.1. TCP/IP NETWORKING

The networking protocol used in the design of Linux clusters is TCP/IP (Transmission Control Protocol/Internet Protocol). This is the same suite of network protocols used to operate the Internet; TCP/IP is the common language of the Internet. It is a communication protocol specification designed by a committee of people.

ROUTING

Routing is the process by which a specialized computer with multiple network interfaces takes information that arrives on one interface and delivers it to another interface, given a set of rules about how information should be delivered between networks. For use with clusters, we can use the network interface cards on our machines and some code in the Linux kernel to perform the routing function.
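
On a typical Linux master node, for example, this usually amounts to enabling IP forwarding in the kernel (the net.ipv4.ip_forward setting), often combined with network address translation, so that traffic from the compute nodes on the private network can reach the outside world through the master.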

6. OPERATING SYSTEM

Linux is a robust, free and reliable POSIX compliant operating system. Several
companies have built businesses from packaging Linux software into organized
distributions; RedHat is an example of such a company. Linux provides the features
typically found in standard UNIX such as multi-user access, pre-emptive multi-tasking,
demand-paged virtual memory and SMP support. In addition to the Linux kernel, a large
amount of application and system software and tools are also freely available. This makes
Linux the preferred operating system for clusters. The idea of the Linux cluster is to maximize the performance-to-cost ratio of computing by using low-cost commodity components and freely available Linux and GNU software to assemble a parallel and distributed computing system. Software support includes the standard Linux/GNU environment, including compilers, debuggers, editors, and standard numerical libraries. Coordination and communication among the processing nodes is a key requirement of parallel-processing clusters. In order to accommodate this coordination, developers have created software to carry out the coordination and hardware to send and receive the coordinating messages. Messaging architectures such as MPI (Message Passing Interface) and PVM (Parallel Virtual Machine) allow the programmer to ensure that control and data messages are exchanged as needed during operation.

6.1. MESSAGE PASSING LIBRARIES

To use clusters of Intel-architecture PC machines for High Performance Computing applications, you must run the applications in parallel across multiple machines. Parallel processing requires that the code running on two or more processor nodes communicate and cooperate with each other. The message-passing model of communication is typically used by programs running on a set of discrete computing systems (each with its own memory) which are linked together by means of a communication network. A cluster is such a loosely coupled distributed-memory system.

6.1.1 PARALLEL VIRTUAL MACHINE (PVM)

PVM, or Parallel Virtual Machine, started out as a project at the Oak Ridge National Laboratory and was developed further at the University of Tennessee. PVM is a complete distributed computing system, allowing programs to span several machines across a network. PVM utilizes a message-passing model that allows developers to distribute programs across a variety of machine architectures and across several data formats. PVM essentially collects the network's workstations into a single virtual machine: it allows a network of heterogeneous computers to be used as a single computational resource called the parallel virtual machine. PVM is a very flexible parallel processing environment and therefore supports almost all models of parallel programming, including the commonly used all-peers and master-slave paradigms.

A typical PVM consists of a (possibly heterogeneous) mix of machines on the network, one being the master host and the rest being worker or slave hosts. These various hosts communicate by message passing. The PVM is started at the command line of the master, which in turn can spawn workers to achieve the desired configuration of hosts for the PVM. This configuration can be established initially via a configuration file. Alternatively, the virtual machine can be configured from the PVM command line (the master's console) or during run time from within the application program. The solution to a large task, suitable for parallelization, is divided into modules to be spawned by the master and distributed as appropriate among the workers. PVM consists of two software components, a resident daemon (pvmd) and the PVM library (libpvm). These must be available on each machine that is a part of the virtual machine. The first component, pvmd, is the message-passing interface between the application program on each local machine and the network connecting it to the rest of the PVM. The second component, libpvm, provides the local application program with the necessary message-passing functionality, so that it can communicate with the other hosts. These library calls trigger corresponding activity by the local pvmd, which deals with the details of transmitting the message. The message is intercepted by the local pvmd of the target node and made available to that machine's application module via the related library call from within that program.
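
To make the master/worker structure concrete, here is a minimal sketch of the master side in C. It assumes a separate worker executable, here called "worker", has been installed on the hosts of the virtual machine; each worker would call pvm_parent(), receive its message, do its share of the work, send a result back and call pvm_exit().

    /* master.c - PVM master: spawn workers, send them data, collect replies. */
    #include <stdio.h>
    #include <pvm3.h>

    #define NWORKERS 4
    #define MSGTAG   1

    int main(void)
    {
        int tids[NWORKERS];
        int i, started, data = 42, result;

        pvm_mytid();                                    /* enroll this task in PVM   */
        started = pvm_spawn("worker", NULL, PvmTaskDefault, "", NWORKERS, tids);

        for (i = 0; i < started; i++) {                 /* send one integer to each  */
            pvm_initsend(PvmDataDefault);
            pvm_pkint(&data, 1, 1);
            pvm_send(tids[i], MSGTAG);
        }

        for (i = 0; i < started; i++) {                 /* collect one reply each    */
            pvm_recv(-1, MSGTAG);
            pvm_upkint(&result, 1, 1);
            printf("master received %d\n", result);
        }

        pvm_exit();                                     /* leave the virtual machine */
        return 0;
    }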

6.1.2 MESSAGE PASSING INTERFACE (MPI)

MPI is a message-passing library standard that was published in May 1994. The standard is based on the consensus of the participants in the MPI Forum, organized by over 40 organizations. Participants included vendors, researchers, academics, software library developers and users. MPI offers portability, standardization, performance, and functionality. The advantage for the user is that MPI is standardized on many levels. For example, since the syntax is standardized, you can rely on your MPI code to execute under any MPI implementation running on your architecture. Since the functional behavior of MPI calls is also standardized, your MPI calls should behave the same regardless of the implementation. This guarantees the portability of your parallel programs. Performance, however, may vary between different implementations. MPI includes point-to-point message passing and collective (global) operations, all scoped to a user-specified group of processes. MPI provides a substantial set of libraries for the writing, debugging, and performance testing of distributed programs. Our system currently uses LAM/MPI, a portable implementation of the MPI standard developed cooperatively at the University of Notre Dame. LAM (Local Area Multicomputer) is an MPI programming environment and development system that includes a visualization tool allowing a user to examine the state of the machines allocated to their job as well as study message flows between nodes.
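
As a small example of the point-to-point and collective operations mentioned above, the sketch below sums the integers 1..N by giving each MPI process its own block of the range and then combining the partial sums on rank 0 with a collective reduction; the value of N and the block layout are chosen purely for illustration.

    /* sum_mpi.c - data decomposition plus a collective reduction. */
    #include <stdio.h>
    #include <mpi.h>

    #define N 1000000L

    int main(int argc, char *argv[])
    {
        int rank, size;
        long i, lo, hi;
        long long local = 0, total = 0;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        /* Each process sums its own contiguous block of 1..N. */
        lo = (long)rank * N / size + 1;
        hi = (long)(rank + 1) * N / size;
        for (i = lo; i <= hi; i++)
            local += i;

        /* Collective (global) operation: combine the partial sums on rank 0. */
        MPI_Reduce(&local, &total, 1, MPI_LONG_LONG, MPI_SUM, 0, MPI_COMM_WORLD);

        if (rank == 0)
            printf("sum of 1..%ld = %lld\n", (long)N, total);

        MPI_Finalize();
        return 0;
    }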

7. DESIGN CONSIDERATIONS

Before attempting to build a cluster of any kind, think about the type of problems
you are trying to solve. Different kinds of applications will actually run at different levels
of performance on different kinds of clusters. Beyond the brute force characteristics of
memory speed, I/O bandwidth, disk seek/latency time and bus speed on the individual
nodes of your cluster, the way you connect your cluster together can have a great impact
on its efficiency.

7.1. CLUSTER STYLES

There are many kinds of clusters that may be used for different applications.

HOMOGENEOUS CLUSTERS

If we have a lot of identical systems, or a lot of money at our disposal, we will be building a homogeneous cluster. This means that we will be putting together a cluster in which every single node is exactly the same. Homogeneous clusters are very easy to work with because, no matter what way we decide to tie them together, all of our nodes are interchangeable and we can be sure that all of our software will work the same way on all of them.

HETEROGENEOUS CLUSTERS

Heterogeneous clusters come in two general forms. The first and most common are heterogeneous clusters made from different kinds of computers. It does not matter what the actual hardware is, except that there are different makes and models. A cluster made from such machines raises several very important issues, discussed below.

7.2. ISSUES TO BE CONSIDERED

7.2.1 CLUSTER NETWORKING

If you are mixing hardware that has different networking technologies, there will be large differences in the speed with which data is accessed and in how individual nodes can communicate. If your budget allows, make sure that all of the machines you want to include in your cluster have similar networking capabilities and, if at all possible, have network adapters from the same manufacturer.

7.2.2 CLUSTERING SOFTWARE

You will have to build versions of clustering software for each kind of system you
include in your cluster.

7.2.3 PROGRAMMING

Our code will have to be written to support the lowest common denominator for data types supported by the least powerful node in our cluster. With mixed machines, the more powerful machines will have capabilities that cannot be matched by the less powerful ones.

7.2.4 TIMING

This is the most problematic aspect of a heterogeneous cluster. Since these machines have different performance profiles, our code will execute at different rates on the different kinds of nodes. This can cause serious bottlenecks if a process on one node is waiting for the results of a calculation on a slower node.

The second kind of heterogeneous cluster is made from different machines in the same architectural family, e.g. a collection of Intel boxes where the machines are of different generations, or machines of the same generation from different manufacturers. This can present issues with regard to driver versions, or little quirks that are exhibited by different pieces of hardware.

8. HARDWARE FOR CLUSTERS

When considering what hardware to use for your cluster, there are three fundamental questions that you will want to answer before you start:

1. Do I want to use existing hardware?
2. Do I want to build all my hardware from components?
3. Do I want to use commercial off-the-shelf components?

You can build a cluster out of almost any kind of hardware. The factors that will probably
play into your decision are a trade-off between time and money.

One of the ways to build a high-performance cluster is to have control over every piece of equipment that goes into the individual nodes. There are five major components of a cluster:

- system board
- CPUs
- disk storage
- network adapters
- enclosures (cases)

9. SYSTEM PERFORMANCE ANALYSIS

Part of any good design is a thorough understanding of the problem we are trying to solve. Of course, if our goal is to build a general-purpose cluster to solve a variety of problems, there are a number of factors to consider in every case. The most critical are:

CODE SIZE

For large codes, a large portion of the run time is spent on the program logic rather than on the processing of the data. So it is best to avoid letting a program become an end unto itself, and to try to minimize the program logic so that the code is efficient and as small as possible. This way more of the data can fit into the CPU cache, and the program spends the majority of its time doing productive work.

DATA SIZE

With high-performance computers, the focus is on arranging data so that it can be processed in the most efficient manner possible. In looking at the problems we wish to solve with our parallel Linux cluster, we should look carefully at the data.

I/O

Lastly, we need to consider I/O: how much data do we have, where is it coming from, can it all fit into a flat file on a hard disk, and so on. These are the questions that you should explore before spending a lot of money and time on hardware.

10. NETWORK SELECTION

There are a number of different kinds of network topologies, including buses, cubes of various degrees, and grids/meshes. These network topologies are implemented by means of one or more network interface cards, or NICs, installed in the head node and compute nodes of our cluster.

SPEED SELECTION

No matter what topology you choose for your cluster, you will want to get the fastest network that your budget allows. Fortunately, the availability of high-speed computers has also forced the development of high-speed networking systems. Examples are 10 Mbit Ethernet, 100 Mbit Ethernet, gigabit networking, channel bonding, etc.

11. CLUSTER COMPUTING MODELS

Workload Consolidation/Common Management Domain Cluster

This model shows a simple arrangement of heterogeneous server tasks, all running on a single physical system (in different partitions, with different granularities of system resources allocated to them). One of the major benefits offered by this model is that of convenient and simple systems management: a single point of control. Additionally, this consolidation model offers the benefit of delivering a high quality of service (resources) in a cost-effective manner.

High Availability Cluster Model

This cluster model expands on the simple load-balancing model. Not only does it provide for load balancing, it also delivers high availability through redundancy of applications and data. This, of course, requires at least two nodes: a primary and a backup. In this model, the nodes can be active/passive or active/active. In the active/passive scenario, one server does most of the work while the second server spends most of its time on replication work. In the active/active scenario, both servers do primary work and both carry out replication tasks, so that each server always "looks" just like the other. In both instances, instant failover is achievable should the primary node (or the primary node for a particular application) experience a system or application outage. As with the previous model, this model easily scales up (through application replication) as the overall volume of users and transactions goes up. The scale-up happens through simple application replication, requiring little or no application modification or alteration.

High-performance Parallel Application Cluster - Technical Model

In this clustering model, extreme vertical scalability is achievable for a single large computing task. The logic shown here is essentially based on the Message Passing Interface (MPI) standard. This model is best applied to scientific and technical tasks, such as computing artificial intelligence data. In this high-performance model, the application is actually "decomposed" so that segments of its tasks can safely be run in parallel.

Load-Balancing Cluster Model

With this clustering model, the number of users (or the number of transactions) can be allocated, via a load-balancing algorithm, across a number of application instances (here, Web application server (WAS) application instances) so as to increase transaction throughput. This model easily scales up as the overall volume of users and transactions goes up. The scale-up happens through simple application replication only, requiring little or no application modification or alteration.

High-performance Parallel Application Cluster - Commercial Model

This clustering model demonstrates the capacity to deliver extreme database scalability within the commercial application arena. In this environment, "shared nothing" or "shared disk" might be the requirement "of the day," and either can be accommodated. You would implement this model in commercial parallel database situations, such as DB2 UDB EEE, Informix XPS or Oracle Parallel Server. As with the technical high-performance model, this commercial high-performance clustering model requires that the application be "decomposed" so that segments of its tasks can safely be run in parallel.

12. FUTURE TRENDS - GRID COMPUTING

As computer networks become cheaper and faster, a new computing paradigm, called the Grid, has evolved. The Grid is a large system of computing resources that performs tasks and provides users with a single point of access, commonly based on a World Wide Web interface, to these distributed resources. Users consider the Grid as a single computational resource. Resource management software, frequently referred to as middleware, accepts jobs submitted by users and schedules them for execution on appropriate systems in the Grid, based upon resource management policies. Users can submit thousands of jobs at a time without being concerned about where they run. The Grid may scale from single systems to supercomputer-class compute farms that utilise thousands of processors. Depending on the type of application, the interconnection between the Grid parts can be performed using dedicated high-speed networks or the Internet.

By providing scalable, secure, high-performance mechanisms for discovering and negotiating access to remote resources, the Grid promises to make it possible for scientific collaborations to share resources on an unprecedented scale, and for geographically distributed groups to work together in ways that were previously impossible. Several examples of new applications that benefit from Grid technology are: the coupling of advanced scientific instrumentation or desktop computers with remote supercomputers; collaborative design of complex systems via high-bandwidth access to shared resources; ultra-large virtual supercomputers constructed to solve problems too large to fit on any single computer; and rapid, large-scale parametric studies, e.g. Monte-Carlo simulations, in which a single program is run many times in order to explore a multidimensional parameter space. Grid technology is currently under intensive development. Major Grid projects include NASA's Information Power Grid, two NSF Grid projects (the NCSA Alliance's Virtual Machine Room and NPACI), the European DataGrid Project and the ASCI Distributed Resource Management project. The first Grid tools are already available to developers; the Globus Toolkit [20] is one such example and includes a set of services and software libraries to support Grids and Grid applications.

13. CONCLUSION
Scalable computing clusters, ranging from clusters of (homogeneous or heterogeneous) PCs or workstations to SMPs, are rapidly becoming the standard platforms for high-performance and large-scale computing. It is believed that message-passing programming is the most obvious approach to help programmers take advantage of the parallelism of clusters of symmetric multiprocessors (SMPs).

REFERENCES

1. "Using a PC cluster for high performance computing and applications" by Chao Tung Yang. http://www2.thu.edu.tw/~sci/journal/v4/000407.pdf

2. "Cluster approach to high performance computing" by A. Kuzmin. http://www.cfi.lu.lv/teor/pdf/LASC_short.pdf

3. "Changing the face of high performance computing through great performance, collaboration and affordability". A technical white paper by Intel Corporation and Dell Computer Corporation.

4. Building Linux Clusters by David HM Spector.

