LinuxDevCenter.com: HA-OSCAR: Five Steps to a High-Availability Linux Cluster


Published on Linux DevCenter (http://www.linuxdevcenter.com/)


http://www.linuxdevcenter.com/pub/a/linux/2005/02/03/haoscar.html

HA-OSCAR: Five Steps to a High-Availability Linux Cluster


by Ibrahim Haddad, Chokchai Leangsuksun, and Stephen L. Scott
02/03/2005
The HA-OSCAR project's primary goal is to improve the existing Beowulf architecture and cluster
management systems (including OSCAR, ROCKS, and Scyld) while providing high-availability and
scalability capabilities for Linux clusters. The OCG recognized the project as an official
working group, along with the current OSCAR and Thin-OSCAR working groups. HA-OSCAR
introduces several enhancements and new features to OSCAR, mainly in the areas of availability,
scalability, and security. The new features in the initial release are head node redundancy and
self-recovery for hardware, service, and application outages.
This document provides a systematic installation guide for system administrators, as well as a detailed
explanation of what happens during the installation. This guide assumes familiarity with basic Linux
administration commands. Prior knowledge of OSCAR installation and administration will be useful.
Supported Distributions and System Requirements
The HA-OSCAR team has tested HA-OSCAR to work with OSCAR 2.3, 2.3.1, and 3.0 based on Red
Hat 9.0. The test environment for the installation discussed in this article is as follows:
- Head node: Two dual Xeon 2.4GHz machines, each with 1GB of RAM, a 40GB HD, and two
  network interface cards (NICs)
- Client node: Four dual Xeon 2.4GHz machines, each with 1GB of RAM, a 40GB HD, and two
  NICs
- Switch: D-Link 10/100Mbps switch

This article assumes that you have built the cluster with OSCAR beforehand. If this is not the case,
please refer to the OSCAR project page for the OSCAR installation procedure.
The primary and standby servers should have homogeneous hardware, and each server should have at
least two network interface cards. The network interfaces must support PXE boot, and they must all
connect to the local switch (two for redundancy purposes).
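The hardware requirements above can be checked before starting. The following is a minimal sketch in Python and is not part of HA-OSCAR; the HeadNode class, its fields, and check_head_pair are hypothetical names introduced for illustration. It checks only the NIC count and whether the two heads report the same CPU and memory.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HeadNode:
    """Hypothetical description of a head node; not an HA-OSCAR data structure."""
    name: str
    cpu_model: str
    ram_mb: int
    nics: tuple  # interface names, for example ("eth0", "eth1")


def check_head_pair(primary, standby):
    """Return a list of problems; an empty list means the pair passes these checks."""
    problems = []
    for node in (primary, standby):
        if len(node.nics) < 2:
            problems.append(
                f"{node.name}: needs at least two NICs, found {len(node.nics)}"
            )
    if (primary.cpu_model, primary.ram_mb) != (standby.cpu_model, standby.ram_mb):
        problems.append("primary and standby hardware differ")
    return problems
```

PXE support cannot be detected this way; verify it in each machine's BIOS or NIC setup.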

HA-OSCAR Architecture
Figure 1 illustrates the HA-OSCAR architecture.

http://www.linuxdevcenter.com/lpt/a/5603

5/23/2006


Figure 1. HA-OSCAR architecture


HA-OSCAR consists of the following major system components:
1. A primary server, which receives and distributes requests to specified clients. Each server has
three network interface cards: one connected to the Internet by a public network address, and the
other two connected to a private LAN, which consists of a primary Ethernet LAN and a standby
LAN. However, the beta release only supports one private NIC.
2. A standby primary server, which activates its services, monitors the primary server, and
anticipates taking over for the primary server when it detects a failure in the primary server.
3. Multiple clients, dedicated to computation.
4. Local LAN switches, to provide local connectivity among head and client/compute nodes.
Each head node must have at least two NICs: eth0 and eth1. One of the NICs is a public interface to
the outside network and the other is a private interface to its local LAN and towards computing nodes.
The exact configuration depends on how a user wants to connect eth0 and eth1 to either the public or
private network. Our example assumes that eth0 is a private interface and eth1 is the public interface.
Figure 2 shows the sample network configuration of a HA-OSCAR head node.


Figure 2. Sample HA-OSCAR network configuration for head nodes
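To make the eth0/eth1 split concrete, here is a small Python sketch that labels each head-node interface as private or public by testing whether its address falls inside the cluster LAN. The addresses, the PRIVATE_LAN network, and the function name are assumptions chosen for illustration; they are not taken from Figure 2.

```python
import ipaddress

# Hypothetical values for illustration only; substitute your own addresses.
PRIVATE_LAN = ipaddress.ip_network("10.0.0.0/24")
HEAD_NODE_INTERFACES = {"eth0": "10.0.0.1", "eth1": "192.0.2.10"}


def interface_roles(interfaces, private_lan):
    """Label each interface 'private' if its address is in the cluster LAN, else 'public'."""
    roles = {}
    for name, address in interfaces.items():
        inside = ipaddress.ip_address(address) in private_lan
        roles[name] = "private" if inside else "public"
    return roles
```

With the sample values, eth0 is reported as private and eth1 as public, matching the convention this article assumes.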

Cluster Installation Procedure


The HA-OSCAR team has developed an easy-to-install package with a graphical interface. Once the
system has OSCAR installed, download HA-OSCAR.
You must be root to be able to install HA-OSCAR. Once you uncompress the package, start the
installation by typing the following command:
% ./haoscar_install <interface>

The <interface> argument is the private network interface for the primary head, normally eth0.
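If you prefer to script the launch, the following Python sketch builds the same command line and refuses to run unless it is executed as root. Only the ./haoscar_install command and its interface argument come from this article; the function names and the interface-name check are assumptions.

```python
import os
import re
import subprocess


def build_install_command(interface, installer="./haoscar_install"):
    """Return the argument list for the installer command shown above."""
    # Assumption: interface names are plain alphanumeric, such as eth0.
    if not re.fullmatch(r"[A-Za-z][A-Za-z0-9]*", interface):
        raise ValueError(f"unexpected interface name: {interface!r}")
    return [installer, interface]


def run_installer(interface):
    """Run the installer from the unpacked package directory; requires root."""
    if os.geteuid() != 0:
        raise PermissionError("HA-OSCAR installation must be run as root")
    return subprocess.run(build_install_command(interface), check=True)
```

Calling run_installer("eth0") from the uncompressed package directory would be equivalent to typing the command by hand.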
The installation wizard should pop up (as shown in Figure 3). The HA-OSCAR installation wizard will
walk the user through a complete installation process consisting of the following steps:
1. Installing the HA-OSCAR package (SW staging on the primary head).
2. Building a standby server image (cloning the primary server).
3. Configuring the initial standby server.
4. Setting up the network and creating a boot image on the standby server.
5. Completing the installation.

The following sections describe the HA-OSCAR wizard installation process and provide visuals for the
associated screens.


HA-OSCAR Package Installation


In step 1, this wizard will install all of the required packages to the OSCAR cluster server and prepare
the environment.

Figure 3. HA-OSCAR installation wizard

Fetching (Cloning) Image for Standby Server


The first step will take less than one minute to complete. Step 2 is for building a standby server image
from the primary node. When you click the button Building Image for Standby server, the wizard will
pop up another window requesting a server image name. Normally, you can leave the default value and
just press the Fetch image button (shown in Figure 4) to fetch an image for the standby server. This step
will take several minutes.
This is an important step in cloning a standby server image from a primary one. For a stringent
downtime requirement, we recommend a separate image server for image repository and recovery
purposes.


Figure 4. Fetching or cloning a server image


This step will take 10 to 15 minutes. Once it succeeds (a success status window will pop up), click
the Close button.

Configuring the Standby Server


The third step requires you to enter an alias public IP address. Proceed by clicking on step 3.
HA-OSCAR will pop up the Standby server initial network Configuration screen shown in Figure 5. Users
normally use this public IP address as a virtual entry point to access the head node. When a failover
occurs, the standby server will take over this address so users can continue accessing the cluster as if
nothing had happened. The normal procedures within this step are:
1. Enter the alias public IP for the eth1 interface. When the failover occurs, the standby server will
automatically clone the cluster public IP on the designated network interface, probably eth1.
2. Determine whether the standby local IP address HA-OSCAR has selected is occupied. If this IP
address is in use, please select a new one. Otherwise, keep this default.
3. Leave the last two items unchanged.
4. Click Add Standby Server.
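The check in item 2 (is the proposed standby address already occupied?) can be sketched as follows. This is an illustrative Python helper, not HA-OSCAR code: the function name is hypothetical, and it assumes you already have a list of addresses in use on the private LAN, however you gathered it.

```python
import ipaddress


def pick_standby_address(proposed, addresses_in_use, private_lan):
    """Keep the proposed address if it is free; otherwise return the first free host address."""
    used = {ipaddress.ip_address(a) for a in addresses_in_use}
    candidate = ipaddress.ip_address(proposed)
    network = ipaddress.ip_network(private_lan)
    if candidate in network and candidate not in used:
        return str(candidate)
    # Fall back to the lowest unused host address in the private LAN.
    for host in network.hosts():
        if host not in used:
            return str(host)
    raise RuntimeError("no free address left in " + str(network))
```

For example, if 10.0.0.1 and 10.0.0.2 are both taken on 10.0.0.0/24, a proposal of 10.0.0.2 is replaced by 10.0.0.3.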


Figure 5. Standby server initial network configuration


This step will take less than a minute. When the successful status window pops up, click on the Close
button.

Retrieve Standby Server MAC Address (For PXE Boot) and Build the Image on its Local Drive
Pay close attention to the following procedure, which retrieves the standby server's MAC address for
PXE booting before building its image on the local drive. One of the standby server's network
interfaces, typically eth0, connects to the private LAN and broadcasts its MAC address during network
boot. When the primary server is ready to build the standby server image, it starts cloning its image
using the collected address. The standby server then network boots via PXE (or floppy) and fetches the
image from the primary server, or from an optional image server, onto its local file system. When the
cloning succeeds, the server reboots from its hard disk. This marks the completion of the standby
server installation.
To assign the standby server's MAC address and build a local image on the standby server, proceed to
step 4 in "Network Setup & Make boot server." HA-OSCAR will display the standby server MAC
address configuration screen as shown in Figure 6.


Figure 6. Standby server MAC address configuration


Step 4 contains the following procedures:
1. Click on the Setup Network Boot button.
2. Click on the Start Collect MAC address button to instruct the primary server to collect the standby
server's MAC address.
3. Switch to the standby server terminal, configure its boot sequence to start with the network boot,
and reboot the standby system. Make sure the standby server eth0 is connected to the local switch
where the primary server PXE daemon will listen to the broadcast boot request. Otherwise, the
primary server will not be able to collect the standby MAC address in the next step.
4. Toggle back to the primary server screen; it should now show the standby server's MAC address
(Figure 7).
5. Assign the MAC address to the standby server network interface (e.g. eth0).
6. Click on Configure DHCP Server.
7. Toggle back to the standby server terminal and reboot it; you're now ready to build a local image
on the standby server. This step takes 30 minutes to one hour, so please be patient.
8. When the standby server image completes, make sure to set the server boot device as its local hard
disk. Reboot the system.
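The article does not show what the Configure DHCP Server button writes. As a rough illustration of the idea of binding a collected MAC address to a fixed IP, the Python sketch below normalizes a MAC address and formats a host declaration in ISC dhcpd.conf syntax. That HA-OSCAR produces exactly this entry is an assumption, and the hostname and addresses are hypothetical.

```python
import re


def normalize_mac(mac):
    """Return the MAC address as lowercase, colon-separated hex pairs."""
    digits = re.sub(r"[^0-9A-Fa-f]", "", mac)
    if len(digits) != 12:
        raise ValueError(f"not a 48-bit MAC address: {mac!r}")
    digits = digits.lower()
    return ":".join(digits[i:i + 2] for i in range(0, 12, 2))


def dhcp_host_entry(hostname, mac, ip_address):
    """Format a host declaration in ISC dhcpd.conf syntax."""
    return (
        f"host {hostname} {{\n"
        f"    hardware ethernet {normalize_mac(mac)};\n"
        f"    fixed-address {ip_address};\n"
        f"}}\n"
    )
```

A call such as dhcp_host_entry("standby", "00-0C-29-AB-CD-EF", "10.0.0.2") yields a block that ties the standby server's eth0 to its private address.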


Figure 7. Standby server MAC address configuration after MAC address collection

Completing the HA-OSCAR Installation


Having completed all four steps, the cluster should have all of its packages installed. The cluster should
be ready to use or test. HA-OSCAR also provides a web-based management tool to customize the
HA-OSCAR configuration, including the capability to enable new outage monitor/detection modules and
failover capabilities. However, this is a feature for advanced users only, as it may cause invalid cluster
configurations if you incorrectly configure HA-OSCAR parameters. The next section elaborates on this
topic.

HA-OSCAR Monitoring and Configuration Webmin (Optional)


HA-OSCAR provides a default self-healing mechanism that monitors system resources and outages and
handles recovery. It also provides a web-based service monitoring and configuration program based on
Webmin and Mon. You can use HA-OSCAR Webmin to customize resource management, configuration,
and service monitoring.
The following sections describe step by step how to manually configure the virtual network interface,
(heartbeat) detection channel, and optional service monitoring configurations. Again, we intend the
following procedures and features only for advanced users. The normal initial head node
configuration steps are:
- Set up detection channels and configurations on the primary server.
- Set up detection channels and configurations on the standby server.
- Enable the optional HA-OSCAR service monitoring (only for advanced users).


Primary Server Setup


Access HA-OSCAR Webmin by opening http://localhost:10000 (this works only if Webmin is running
locally) and selecting the HA-OSCAR category to configure the system (Figure 8). A manual
configuration (Figure 9) involves the following steps:
1. Add a virtual network interface to eth0 and eth1.
2. Define channel detection: a public network and its virtual interface (the virtual public IP, which is
the same for both the primary and standby servers) and a private network and its virtual
interface. This is for IP cloning and channel detection.
Other users can also log in later and manage your system with the web-based tool.
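The article does not describe how the detection channel decides that the primary has failed. One common approach, shown here purely as an assumption and not as HA-OSCAR's actual logic, is to count consecutive missed heartbeats and trigger the takeover once a threshold is reached. The class name, the threshold of three, and the polling model are all hypothetical.

```python
class HeartbeatMonitor:
    """Declare the primary failed after max_missed consecutive missed heartbeats."""

    def __init__(self, max_missed=3):
        self.max_missed = max_missed
        self.missed = 0
        self.failed_over = False

    def record(self, heartbeat_received):
        """Feed one polling result; return True at the moment failover should start."""
        if self.failed_over:
            return False  # takeover already happened; nothing more to trigger
        if heartbeat_received:
            self.missed = 0  # any successful heartbeat resets the count
            return False
        self.missed += 1
        if self.missed >= self.max_missed:
            self.failed_over = True
            return True
        return False
```

In this model, the standby server would bring up the virtual public IP at the point where record returns True. Requiring several consecutive misses avoids failing over on a single dropped packet.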

Figure 8. Step-by-step instructions to set up a virtual network interface and detection channel


Figure 9. The main Webmin screen

Creating Virtual Network Interface for HA-OSCAR Primary Server


First select Detection channel configuration shown in Figure 10, and navigate into the corresponding
screen shown in Figure 11. Initially, there should be three network interfaces: eth0, eth1, and lo, all
created during OSCAR and HA-OSCAR installation. Add virtual network interfaces for eth0 and eth1
by clicking on the Add a new interface button in Figure 12. Sample screens (Figures 13 and 14) show
how to add virtual network interfaces for eth0 and eth1.


Figure 10. HA-OSCAR monitoring configuration screen

Figure 11. Detection channel configuration


Adding Virtual Network Interface

Figure 12. Network interface screen


Figure 13. Sample eth0 virtual network interface creation

Figure 14. Sample eth1 virtual network interface creation


Building Detection Channel Information
After you create the virtual network interfaces, the next step is to assign them as HA-OSCAR (health)
detection channels. Access the detection channel configuration from the HA-OSCAR Webmin screen
shown in Figure 15. When you've made your selection, enter the network interface information in the
form shown in Figure 16.


Figure 15. Channel configuration selection

Figure 16. Primary server channel configuration screen


When you complete the channel setup and primary server configuration, make sure to click the Save
button in Figure 17 and then click the Apply configuration button in Figure 18.


Figure 17. Network and monitoring service configuration screen

Figure 18. Main monitor configuration screen


Optional Steps for Managing Services Configuration
HA-OSCAR provides a useful default set of monitoring policies. However, users can add new services
and change monitoring parameters. We recommend this option only for advanced users.


Figure 19. Monitor list task


Return to the index page, and apply all of the configurations to HA-OSCAR.

Figure 20. Details of the "Process Server" monitoring policy


Make sure to apply the change to the new configuration.

Standby Server Setup


After you complete the primary server network and detection channel configuration, create a virtual
network interface and enable channel detection on the standby server. First, switch to the standby
server terminal and access HA-OSCAR Webmin by opening http://localhost:10000 (the link will only
work if you've enabled this locally). The setup steps are similar to those shown earlier in the primary
server setup section. When you finish configuring the network interface, select only the Channel
configuration on standby server button. Selecting any other option may cause unpredictable behavior
and an invalid configuration.
Virtual Network Interface Creation on the Standby Server
You can create a virtual network interface for the standby server in much the same way as on the
primary server. Figure 21 shows how to set up the standby server's virtual network interface. When you
create the virtual public interface (for example, eth1:1), be sure not to activate it at boot time. It should
come up only at failover.
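For readers who want to see what "not activated at boot time" might look like outside the Webmin form, the Python sketch below renders a Red Hat style ifcfg file body for an alias interface. This is an illustration under assumptions: the article does not say which file Webmin writes, the address and netmask are hypothetical, and which directive actually governs boot-time activation of an alias depends on the distribution's network scripts, so verify that ONBOOT=no has the intended effect on your system.

```python
def render_alias_ifcfg(device, ip_address, netmask, onboot=False):
    """Render a Red Hat style ifcfg file body for an alias interface such as eth1:1."""
    lines = [
        f"DEVICE={device}",
        f"IPADDR={ip_address}",
        f"NETMASK={netmask}",
        # Assumption: ONBOOT=no keeps the alias down until failover brings it up.
        f"ONBOOT={'yes' if onboot else 'no'}",
    ]
    return "\n".join(lines) + "\n"
```

Calling render_alias_ifcfg("eth1:1", "192.0.2.10", "255.255.255.0") produces a four-line body ending in ONBOOT=no.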

Figure 21. The standby server network interface screen


Figure 22. Adding a new virtual network interface to eth1

Figure 23. A new network interface created on the standby server


Figure 24. Configuring a detection channel on the standby server

Figure 25. More configuration for the detection channel


When you complete configuring both the network interface and the detection channel, return to the
index page and apply the configuration (Figure 26).


Figure 26. Applying the changes after channel configuration

Conclusion
This article is meant as a guide to help you get on your feet with the installation and configuration of a
highly available Linux cluster using HA-OSCAR.
Open source projects have a special dynamic, especially popular projects,
and they tend to advance and change as their users request. Therefore, if
any of the steps/functionality/screen captures above are not valid by the
time you read this article, this will be due to changes in the HA-OSCAR
package; please forgive us and post an update on the discussion forum.

We hope you find this article useful. Have fun.


Happy Hacking!

Acknowledgment
- Vishal Rampure, Anand Tikotekar, and Ryan Bourgeois from Louisiana Tech University
- HA-OSCAR Team (Louisiana Tech University, Oak Ridge National Lab, Ericsson Research, Intel)
- Intel, for an equipment loan

Future Work
- Support network private interface failover for all Ethernet interfaces.
- Remove the OSCAR dependency so any Linux Beowulf can be retrofitted with HA-OSCAR.
- Provide automatic head-node image synchronization via resync.
- Provide automatic virtual network and detection channel configuration (instead of manual).
- Integrate the Distributed Security Infrastructure with HA-OSCAR. This activity has started and we
  have a lab version of a combined package of HA-OSCAR and DSI.
- Much more

Glossary

ASP: Application Service Providers
DHCP: Dynamic Host Configuration Protocol
FCAPS: Fault management, Configuration, Accounting, Performance, and Security
HA: High Availability
HA-OSCAR: High Availability OSCAR
HPC: High Performance Computing
ISP: Internet Service Provider
ITU: International Telecommunication Union
LAN: Local Area Network
MAC: Media Access Control
OCG: Open Cluster Group
OSCAR: Open Source Clustering and Application Resources
LUI: Resource-Based Cluster Installation Tool
NFS: Network File System
NIC: Network Interface Card
NTP: Network Time Protocol
SIS SystemImager: Image-Based Installation and Maintenance Tool
SNMP: Simple Network Management Protocol
TFTP: Trivial File Transfer Protocol
Thin-OSCAR: A diskless OSCAR version
TMN: Telecommunication Management Network

Ibrahim Haddad is a researcher at the Open System Lab, Ericsson Research.
Chokchai Leangsuksun is an Associate Professor of Computer Science, Louisiana Tech University.
Stephen L. Scott is a founding member of OCG and OSCAR, and has served in the capacity of both
release manager and working group chair.


Copyright 2006 O'Reilly Media, Inc.
