
Cluster Architecture

Distributed Networks
TI Bayu
Cluster Architecture
Computing speed isn't just a convenience.
Faster computers allow us to solve larger
problems, and to find solutions more quickly,
with greater accuracy, and at a lower cost.
Traditional high-performance clusters have
proved their worth in a variety of uses, from
predicting the weather to industrial design, from
molecular dynamics to astronomical modeling.
High-performance computing (HPC) has
created a new approach to science.
Cluster Architecture
Clusters are also playing a greater role in
business. High performance is a key issue in
data mining or in image rendering. Advances in
clustering technology have led to high-
availability and load-balancing clusters.
Clustering is now used for mission-critical
applications such as web and FTP servers. For
example, Google uses an ever-growing cluster
composed of tens of thousands of computers.
Modern Computing and the Role of
Clusters
When computing, there are three basic approaches
to improving performance: use a better algorithm,
use a faster computer, or divide the calculation
among multiple computers. A very common analogy
is that of a horse-drawn cart.
First, consider what you are trying to calculate.
The second approach is to buy a faster computer.
While there are no hard and fast rules, it is not
unusual to see a quadratic increase in cost with a
linear increase in performance, particularly as you
move away from commodity technology.
The third approach is parallelism, i.e., executing
instructions simultaneously.
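The third approach can be sketched in code. The example below is illustrative only: it divides a summation among several threads on one machine, where a real cluster would send each chunk to a separate computer. The `parallel_sum` function and its chunking scheme are inventions for this sketch, not part of the original text.

```python
from concurrent.futures import ThreadPoolExecutor

def partial_sum(bounds):
    """Sum the integers in [lo, hi) -- one worker's share of the job."""
    lo, hi = bounds
    return sum(range(lo, hi))

def parallel_sum(n, workers=4):
    """Divide the calculation of 0 + 1 + ... + (n-1) among workers."""
    step = n // workers
    # Split the range into one chunk per worker; the last chunk
    # absorbs any remainder.
    chunks = [(i * step, (i + 1) * step if i < workers - 1 else n)
              for i in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(partial_sum, chunks))
```

On a cluster, the chunks would travel over the network to separate nodes; here, threads on one machine stand in for those nodes.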
Uniprocessor Computers
The traditional classification of computers based
on size and performance, i.e., classifying
computers as microcomputers, workstations,
minicomputers, mainframes, and supercomputers,
has become obsolete.
Regardless of where we place them in the
traditional classification, most computers today are
based on an architecture often attributed to the
Hungarian mathematician John von Neumann.
The basic structure of a von Neumann computer is
a CPU connected to memory by a communications
channel or bus.
Uniprocessor Computers
The development of reduced instruction set
computer (RISC) architectures and post-RISC
architectures has led to more uniform
instruction sets. This eliminates cycles from
some instructions and allows a higher clock
rate.
Superscalar architectures and pipelining have
also increased processor speeds. Superscalar
architectures execute two or more instructions
simultaneously.
Multiple Processors
In recent years we have come to augment that
definition to include parallel computers with
hundreds or thousands of CPUs, otherwise
known as multiprocessor computers.
Multiprocessor computers fall into two basic
categories: centralized multiprocessors (or
single-enclosure multiprocessors) and
multicomputers.
Centralized multiprocessors
With centralized multiprocessors, there are two
architectural approaches based on how memory is
managed: uniform memory access (UMA) and
nonuniform memory access (NUMA) machines.
With UMA machines, also called symmetric
multiprocessors (SMP), there is a common shared
memory. Identical memory addresses map,
regardless of the CPU, to the same location in
physical memory. Main memory is equally
accessible to all CPUs. To improve memory
performance, each processor has its own cache.
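The shared-memory model described above can be mimicked in miniature with threads: every thread sees the same variable at the same "address", and a lock stands in for the coordination that SMP hardware and the operating system provide. The variable names are illustrative, not part of the original text.

```python
import threading

counter = 0                 # one location, visible to every thread
lock = threading.Lock()     # serializes updates to the shared memory

def worker(increments):
    """Each 'CPU' (thread) updates the same shared memory location."""
    global counter
    for _ in range(increments):
        with lock:          # without this, concurrent updates could be lost
            counter += 1

threads = [threading.Thread(target=worker, args=(1000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Note how no data is copied between workers: all four threads address the very same `counter`, which is the defining property of a UMA machine.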
Centralized multiprocessors
A closely related architecture is used with
NUMA machines. Roughly, with this
architecture, each CPU maintains its own piece
of memory. Effectively, memory is divided
among the processors, but each process has
access to all the memory.
Operating system support is required with either
multiprocessor scheme. Fortunately, most
modern operating systems, including Linux,
provide support for SMP systems, and support
is improving for NUMA architectures.
Centralized multiprocessors
A third architecture worth mentioning in passing
is processor array, which, at one time,
generated a lot of interest. A processor array is
a type of vector computer built with a collection
of identical, synchronized processing elements.
Each processor executes the same instruction
on a different element in a data array.
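The lockstep behavior described above, one instruction applied to every element of a data array, can be imitated (without any real hardware parallelism) by mapping a single operation across arrays. The `lockstep` helper is a made-up name for this sketch.

```python
import operator

def lockstep(op, *arrays):
    """Apply one operation element-wise, mimicking synchronized
    processing elements that each handle one array position."""
    return [op(*elems) for elems in zip(*arrays)]

# The "same instruction" (addition) applied to every element pair,
# as each processing element of a processor array would do in lockstep.
vec_sum = lockstep(operator.add, [1, 2, 3], [10, 20, 30])  # [11, 22, 33]
```

A real processor array or vector unit performs all positions simultaneously in hardware; this loop only captures the programming model.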
Multicomputers
A multicomputer configuration, or cluster, is a
group of computers that work together. A
cluster has three basic elements: a collection
of individual computers, a network connecting
those computers, and software that enables a
computer to share work among the other
computers via the network.
For most people, the most likely thing to come
to mind when speaking of multicomputers is a
Beowulf cluster. While this is perhaps the best
known type of multicomputer, a number of
variants now exist.
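The three elements above can be sketched in miniature: a connected socket pair on a single machine stands in for two networked nodes, and a tiny hand-rolled message format stands in for the work-sharing software. Everything here is illustrative, not a real cluster protocol.

```python
import socket

# Two connected endpoints on one machine stand in for a head node
# and a worker node joined by the cluster network.
node_a, node_b = socket.socketpair()

node_a.sendall(b"sum 1 2 3")           # "head" node sends a task
task = node_b.recv(1024).decode()      # "worker" node receives it
op, *args = task.split()
result = sum(map(int, args)) if op == "sum" else None
node_b.sendall(str(result).encode())   # worker returns the answer
answer = int(node_a.recv(1024))

node_a.close()
node_b.close()
```

Real cluster middleware (such as an MPI library) handles many nodes, binary data, and failures, but the division of labor is the same: computers, a network, and software that moves work across it.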
Multicomputers
First, both commercial multicomputers and
commodity clusters are available. Commodity
clusters, including Beowulf clusters, are
constructed using commodity, off-the-shelf
(COTS) computers and hardware.
When constructing a commodity cluster, the
norm is to use freely available, open source
software. This translates into an extremely low
cost that allows people to build a cluster when
the alternatives are just too expensive.
Multicomputers
In commodity clusters, the software is often
mix-and-match.
Commercial clusters often use proprietary
computers and software.
A network of workstations (NOW), sometimes
called a cluster of workstations (COW), is a
cluster composed of computers usable as
individual workstations.
Cluster structure
It's tempting to think of a cluster as just a bunch
of interconnected machines, but when you
begin constructing a cluster, you'll need to give
some thought to the internal structure of the
cluster.
Cluster structure
The simplest approach is a symmetric cluster.
With a symmetric cluster each node can
function as an individual computer. This is
extremely straightforward to set up.
There are several disadvantages to a
symmetric cluster. Cluster management and
security can be more difficult. Workload
distribution can become a problem, making it
more difficult to achieve optimal performance.
Cluster structure
For dedicated clusters, an asymmetric
architecture is more common. With asymmetric
clusters one computer is the head node or
frontend. It serves as a gateway between the
remaining nodes and the users.
The primary disadvantage of this architecture
comes from the performance limitations
imposed by the cluster head. For this reason, a
more powerful computer may be used for the
head.
Cluster structure
I/O represents a particular challenge. It is often
desirable to distribute a shared filesystem
across a number of machines within the cluster
to allow parallel access.
Network design is another key issue. With small
clusters, a simple switched network may be
adequate. With larger clusters, a fully
connected network may be prohibitively
expensive.
Types of Clusters
Originally, "clusters" and "high-performance
computing" were synonymous. Today, the
meaning of the word "cluster" has expanded
beyond high-performance to include high-
availability (HA) clusters and load-balancing
(LB) clusters.
Types of Clusters
High-availability clusters, also called failover
clusters, are often used in mission-critical
applications. If you can't afford the lost business
that will result from having your web server go
down, you may want to implement it using an HA
cluster. The key to high availability is
redundancy.
Types of Clusters
The idea behind a load-balancing cluster is to
provide better performance by dividing the work
among multiple computers. For example, when
a web server is implemented using LB
clustering, the different queries to the server are
distributed among the computers in the cluster.
This might be accomplished using a simple
round-robin algorithm.
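A round-robin dispatcher like the one mentioned above can be sketched in a few lines; the server and query names are hypothetical, chosen only for the example.

```python
from itertools import cycle

servers = ["node1", "node2", "node3"]    # hypothetical backend names
next_server = cycle(servers).__next__    # endless round-robin rotation

def dispatch(request):
    """Assign each incoming query to the next server in rotation."""
    return (request, next_server())

assignments = [dispatch(q) for q in ["q1", "q2", "q3", "q4"]]
# q4 wraps around to node1 again.
```

Real load balancers typically refine this with health checks and weighting, but plain rotation is often the starting point.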
Types of Clusters
Keep in mind, the term "load-balancing" means
different things to different people. A high-
performance cluster used for scientific
calculation and a cluster used as a web server
would likely approach load-balancing in entirely
different ways. Each application has different
critical requirements.
Distributed Computing and Clusters
While the term parallel is often used to describe
clusters, they are more correctly described as a
type of distributed computing.
Typically, the term parallel computing refers to
tightly coupled sets of computation. Distributed
computing is usually used to describe
computing that spans multiple machines or
multiple locations.
Distributed Computing and Clusters
Clusters are generally restricted to computers
on the same subnetwork or LAN. The term grid
computing is frequently used to describe
computers working together across a WAN or
the Internet.
Peer-to-peer computing provides yet another
approach to distributed computing. Again, this is
an ambiguous term. Peer-to-peer may refer to
sharing cycles, to the communications
infrastructure, or to the actual data distributed
across a WAN or the Internet.
Limitations
While clusters have a lot to offer, they are not
panaceas. There is a limit to how much adding
another computer to a problem will speed up a
calculation. In the ideal situation, you might
expect a calculation to go twice as fast on two
computers as it would on one. Unfortunately,
this is the limiting case and you can only
approach it.
Amdahl's Law
In a nutshell, Amdahl's Law states that the
serial portion of a program will be the limiting
factor in how much you can speed up the
execution of the program using multiple
processors.
Having said this, it is important to remember
that Amdahl's Law does clearly state a limitation
of parallel computing. But this limitation varies
not only from problem to problem, but with the
size of the problem as well.
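In symbols, if a fraction p of a program can run in parallel on n processors, Amdahl's Law gives a speedup of 1 / ((1 - p) + p / n), so the serial fraction (1 - p) caps the achievable speedup no matter how many processors are added. A small sketch:

```python
def amdahl_speedup(p, n):
    """Speedup on n processors when fraction p of the work is parallel.

    The serial fraction (1 - p) always runs at single-processor speed,
    while the parallel fraction p is divided across n processors.
    """
    return 1.0 / ((1.0 - p) + p / n)

# With 10% of the code serial (p = 0.9), even an enormous processor
# count cannot push the speedup past 1 / (1 - p) = 10x.
```

This is why the slide stresses the serial portion: doubling n helps less and less, while shrinking (1 - p) raises the ceiling itself.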
Amdahl's Law
One last word about the limitations of
clusters: the limitations are often tied to a
particular approach. It is often possible to mix
approaches and avoid limitations. For example,
in constructing your clusters, you'll want to use
the best computers you can afford. This will
lessen the impact of inherently serial code. And
don't forget to look at your algorithms!
