
Creating a Red Hat Cluster: Part 4

In this article we will create a GFS filesystem that will allow us to share data between the nodes. In the next and last article we will finalize the cluster by completing our FTP and web services so that they really work. We will also show you how to manually move a service from one server to another. We still have some work to do, so let's start right away.

Adding a SAN disk to our servers


The Linux operating system is installed on the internal disks of each of our servers. We will now add a SAN disk that will be visible to each of our servers. I assume here that your SAN and your Brocade switch are configured accordingly; explaining how to set up the SAN and the Brocade switch is not in the scope of this article. The idea is simply that the new disk must be visible to every node in our cluster. In the example below we already have a SAN disk (sda) with one partition (sda1) on it. Adding a disk to the servers can be done live, without any interruption of service, if you follow the steps below. I would suggest you practice on a test server to become familiar with the procedure.

Before we add a disk, let's see which disks are visible on the system by looking at the /proc/partitions file. We can see that we already have a disk (sda) with one partition on it, so the new disk that we are going to add should be seen as sdb.

root@gollum~# grep sd /proc/partitions
   8     0  104857920 sda
   8     1  104856223 sda1

Let's rescan the SCSI bus by typing the command below. This command must be run on each of the servers within the cluster. Here we have only one HBA (Host Bus Adapter) card connected to the SAN on each server. If you have a second HBA, you need to run the same command for the second HBA, replacing host0 by host1.

root@gollum~#  echo "- - -" > /sys/class/scsi_host/host0/scan
root@gandalf~# echo "- - -" > /sys/class/scsi_host/host0/scan
root@bilbo~#   echo "- - -" > /sys/class/scsi_host/host0/scan

Let's check whether a new disk (sdb) was detected (check on each server).

root@gollum~# grep sd /proc/partitions
   8     0  104857920 sda
   8     1  104856223 sda1
   8    16   15728640 sdb

Let's create an LVM partition on our new disk (sdb) by running the fdisk command.

# fdisk /dev/sdb

Command (m for help): p

Disk /dev/sdb: 107.3 GB, 107374510080 bytes
255 heads, 63 sectors/track, 13054 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System

Command (m for help): n                          (n = new partition)
Command action
   e   extended
   p   primary partition (1-4)
p                                                (p = primary partition)

Partition number (1-4): 1                        (first partition = 1)
First cylinder (1-13054, default 1): 1           (start at the beginning of the disk)
Last cylinder or +size or +sizeM or +sizeK (1-13054, default 13054): 13054   (end of the disk)

Command (m for help): p                          (p = print partition information)

Disk /dev/sdb: 107.3 GB, 107374510080 bytes
255 heads, 63 sectors/track, 13054 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1               1       13054   104856223+  83  Linux

Command (m for help): t                          (t = type of partition)
Selected partition 1                             (change type of partition 1)
Hex code (type L to list codes): 8e              (8e = LVM partition - type L to list the partition codes)
Changed system type of partition 1 to 8e (Linux LVM)

Command (m for help): p                          (p = print partition information)

Disk /dev/sdb: 107.3 GB, 107374510080 bytes
255 heads, 63 sectors/track, 13054 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1               1       13054   104856223+  8e  Linux LVM

Command (m for help): w                          (w = write partition table to disk)
The partition table has been altered!
Calling ioctl() to re-read partition table.
Syncing disks.

If we look again at /proc/partitions, we should see that the new disk and partition are now seen by this server.

root@gollum~# grep sd /proc/partitions
   8     0  104857920 sda
   8     1  104856223 sda1
   8    16   15728640 sdb
   8    17   15727658 sdb1

Now we need to make sure that the new disk and partition are seen by every server within the cluster, so we go on every server in the cluster and run the partprobe (partition probe) command. After running the command, check as we did on gollum that all the disks and partitions are seen by each server.

root@gollum~#  partprobe
root@gandalf~# partprobe
root@bilbo~#   partprobe

Creating our physical volume

Now that we know the disk is seen by every node, let's create the physical volume on one of the servers and then check on all the other servers in the cluster; the physical volume should be seen on all of them. The command to create a physical volume is pvcreate, so what we are really doing here is creating a physical volume on the partition (/dev/sdb1) we created earlier.

# pvcreate /dev/sdb1
  Physical volume "/dev/sdb1" successfully created

Let's run a pvscan on every node, to validate that every node can actually see the new physical volume.

# pvscan
  PV /dev/sda1   VG datavg   lvm2 [100.00 GB / 22.02 GB free]
  PV /dev/sdb1               lvm2 [100.00 GB]

Create our clustered volume group

We will now create a new volume group named sharevg and assign the physical volume /dev/sdb1 to that group. If we ever ran out of disk space in sharevg, we could add another physical volume to the volume group and continue to work without any service disruption (a sketch of how this would be done is shown below). This is a real advantage when working in a production environment.

# vgcreate sharevg /dev/sdb1
  Clustered volume group "sharevg" successfully created
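For example, if sharevg ever filled up, extending it would look roughly like the following. This is only a sketch: /dev/sdc1 is a hypothetical second SAN partition, prepared exactly like /dev/sdb1 above.

# Prepare the new partition as a physical volume (hypothetical /dev/sdc1)
pvcreate /dev/sdc1

# Add it to the existing clustered volume group; existing logical volumes stay online
vgextend sharevg /dev/sdc1

# Verify the new size of the volume group
vgdisplay sharevg | grep "VG Size"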

Display the sharevg volume group properties

We can display the volume group properties by issuing the vgdisplay command. We can see that our volume group is "Clustered", so it is cluster aware. This will allow us, later on, to create an LV (logical volume/filesystem) on one server and have the cluster software automatically advise the cluster members that a new logical volume (filesystem) is available.

root@gollum~# vgdisplay sharevg
  --- Volume group ---
  VG Name               sharevg
  System ID
  Format                lvm2
  Metadata Areas        1
  Metadata Sequence No  25
  VG Access             read/write
  VG Status             resizable
  Clustered             yes
  Shared                no
  MAX LV                0
  Cur LV                0
  Open LV               0
  Max PV                0
  Cur PV                1
  Act PV                1
  VG Size               100.00 GB
  PE Size               4.00 MB
  Total PE              24999
  Alloc PE / Size       1 / 2.10 MB
  Free PE / Size        24980 / 98.00 GB
  VG UUID               V8Ag76-vdW2-NAk4-JjOo-or3l-GuPz-x5LEKP

Create a logical volume of 1024 MB named cadminlv in the sharevg volume group

We will create a logical volume named cadminlv (Cluster Admin) in our sharevg volume group. The command below asks for a logical volume of 1024 MB, named cadminlv, in the volume group sharevg. This command can be run on one server and the logical volume will be seen by every member of the cluster.

# /usr/sbin/lvcreate -L1024M -n cadminlv sharevg
  Logical volume "cadminlv" created

The lvs command displays a list of all your logical volumes. Since cadminlv is currently the only one in the volume group sharevg, we filter the list (with the grep command) to display only the logical volumes of the sharevg volume group. Let's check that it is seen by all nodes.

root@gandalf~# lvs | grep sharevg
  cadminlv sharevg -wi-a- 1024.00M
root@gollum~# lvs | grep sharevg
  cadminlv sharevg -wi-a- 1024.00M
root@bilbo~# lvs | grep sharevg
  cadminlv sharevg -wi-a- 1024.00M

Creating the /cadmin GFS Filesystem


Finally, we are going to create a GFS filesystem within the logical volume cadminlv we have just created. But first we need to create the GFS filesystem mount point on every node. We need to do that because we want this filesystem to be mounted on every node and be available to all 3 nodes.

root@gandalf~# mkdir /cadmin
root@bilbo~# mkdir /cadmin

root@gollum~# mkdir /cadmin

We have chosen to have the GFS filesystem /cadmin mounted on all servers at all times. We could have included it as part of our service, so that it would be mounted only when the service is started. But we found out that the process of unmounting and mounting a couple of GFS filesystems takes time, and this time adds up to the time it takes to move a service from one server to another. In production we have had a 5-server cluster for more than 2 years now, with around 30 GFS filesystems mounted at all times on all five servers, and we have had very few problems. The only thing you have to be careful about is the number of journals that you assign to each GFS filesystem. One journal is needed for each concurrent mount in the cluster; in our case we have at least 5 journals for each of our GFS filesystems (more on that below).

Let's create the GFS filesystem on the logical volume cadminlv created previously. This needs to be done on only one node; the creation is done once and all the nodes are made aware of the new GFS filesystem by the cluster daemon. The command we use to create a GFS filesystem is gfs_mkfs, and we need to use a couple of options that I will explain. First, the -O prevents gfs_mkfs from asking for confirmation before creating the filesystem. The option -p lock_dlm indicates the name of the locking protocol to use; the locking protocol should be lock_dlm for a clustered file system.

The -t our_cluster:cadminlv is the cluster name, followed by ":" and the filesystem name. The cluster name must match the one you have defined in your cluster configuration file (in our case our_cluster); only members of this cluster are permitted to use this file system. The filesystem name (cadminlv) is a unique name (1 to 16 characters) used to distinguish this GFS file system from others. The -j 4 is the number of journals for gfs_mkfs to create; you need at least one journal per machine that will mount the filesystem. This number could have been 3, but I always add one more in case I add a member to the cluster. This number is important: if I had put 3, then added a node to the cluster and wanted the 4 nodes to mount this filesystem simultaneously, I would need to make sure the filesystem has 4 journals, because otherwise the GFS filesystem would not mount on the fourth node. You can always add a journal to an existing GFS filesystem with the gfs_jadd command (a sketch is shown after the example below). Each journal reserves 128 MB in the filesystem, so you need to take that into consideration. In our example, we want all our nodes to mount the /cadmin GFS filesystem; we created a logical volume of 1024 MB and created a GFS filesystem on it with 4 journals (4 * 128 = 512 MB), so we will have only around 500 MB available for data out of the 1024 MB we allocated to the logical volume. The last parameter, /dev/sharevg/cadminlv, is the name of the logical volume we created previously.

# /sbin/gfs_mkfs -O -p lock_dlm -t our_cluster:cadminlv -j 4 /dev/sharevg/cadminlv
Device:                    /dev/sharevg/cadminlv
Blocksize:                 4096
Filesystem Size:           98260
Journals:                  4
Resource Groups:           8
Locking Protocol:          lock_dlm
Lock Table:                our_cluster:cadminlv
Syncing...
All Done
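If you later add a node and run short of journals, you do not have to recreate the filesystem. A minimal sketch, assuming the extra 128 MB has to come from free space in the sharevg volume group, could look like this (run on one node while /cadmin is mounted):

# Grow the underlying logical volume by the size of one journal
lvextend -L +128M /dev/sharevg/cadminlv

# Add one journal to the mounted GFS filesystem
gfs_jadd -j 1 /cadmin

# (If you wanted the extra space for data instead of a journal, gfs_grow /cadmin would extend the filesystem itself.)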

We are now able to mount our GFS filesystem on all the servers, by using the command below on every server.

# mount -t gfs /dev/sharevg/cadminlv /cadmin

We want that filesystem to be mounted every time a server boots, so don't forget to add it to /etc/fstab so that it is mounted after the next reboot, and don't forget to change the owner and protection of the filesystem (see the sketch below).

# echo "/dev/sharevg/cadminlv /cadmin gfs defaults 0 0" >> /etc/fstab

The filesystem should now be available on all our nodes.

# df -h /cadmin
Filesystem                    Size  Used  Avail  Use%  Mounted on
/dev/mapper/sharevg-cadminlv  510M  1.5M  508M     1%  /cadmin
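Because GFS is a shared filesystem, the ownership and permissions only have to be set once, from any node. The values below are just an illustration (the group "users" is an assumption); choose whatever matches your environment. The mount -a run on each server simply confirms that the new /etc/fstab entry is valid.

# Example only: give the filesystem sensible ownership and permissions
chown root:users /cadmin
chmod 775 /cadmin

# On each server, verify that the fstab entry mounts cleanly
mount -a
df -h /cadmin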

So we have just created our first GFS filesystem and it is mounted on all the nodes of the cluster. In our last article we will finalize the cluster by creating the scripts needed for our FTP/web services to start and to move from server to server. We will add these scripts to our cluster configuration and we will show you how to move a service from one server to another using the command line and the GUI. So stay tuned for this last article on how to build a Red Hat cluster.

Creating a Red Hat Cluster: Part 5


Welcome back to LINternUX for the last article of this series on how to build a working Red Hat cluster. So far we have a working cluster, but it only moves an IP from server to server. In this article, we will put in place everything needed to have an FTP and a web service that are fully redundant within our cluster. In the previous article, we created a GFS filesystem under the mount point /cadmin; this is where we will put the scripts, configuration files and logs used by our cluster. The content of the /cadmin filesystem can be downloaded here; it includes all the directory structure and scripts used in our cluster articles. After this article, you will have a fully configured cluster, running an FTP and a web service. We have a lot to do, so let's begin.

FTP prerequisite
We need to make sure that the FTP server vsftpd is installed on every server in our cluster. You can check whether it is installed by typing the following command:

root@gandalf:~# rpm -q vsftpd
vsftpd-2.0.5-16.el5_5.1
root@gandalf:~#

If it is not installed, run the following command on the servers where it is missing:

root@bilbo:~# yum install vsftpd

We must make sure that vsftpd is not started and does not start upon reboot. To do so, use the following commands on all servers (a short loop to do this on every node at once is sketched below):

root@bilbo:~# service vsftpd stop
Shutting down vsftpd:                                      [FAILED]
root@bilbo:~# chkconfig vsftpd off
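Rather than logging in to each server, you can run the same checks from one node in a small loop. This is only a convenience sketch; it assumes root SSH access between the nodes and that the short hostnames gandalf, bilbo and gollum resolve on your network.

for h in gandalf bilbo gollum
do
   echo "===== $h ====="
   ssh root@$h 'rpm -q vsftpd; service vsftpd stop; chkconfig vsftpd off'
done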

Script to stop/start/status our FTP service


Now we need to create a script for each of our services (FTP and web) that the cluster software will use to stop and start the appropriate service, and add it to our cluster configuration. We'll put these scripts in our /cadmin GFS filesystem, so they are accessible from our 3 servers. We will start by creating the script for the FTP service. A script used by the Red Hat Cluster Suite receives one parameter when called by the cluster software; this parameter can be start, stop or status. You can download a copy of the script and the vsftpd configuration file if you want, but remember that if you want to use them as is, you must put them in the /cadmin filesystem: the srv_ftp.sh script goes in the subdirectory /cadmin/srv and the configuration file srv_ftp.conf goes in the /cadmin/cfg directory. But nothing beats an example; let's build the one for our FTP service.

#! /bin/bash
# --------------------------------------------------------------------------
# Script to stop/start and give a status of the ftp service in the cluster.
# This script is built to receive 1 parameter :
#   - start  : Executed by cluster to start the application(s) or service(s)
#   - stop   : Executed by cluster to stop the application(s) or service(s)
#   - status : Executed by cluster every 30 seconds to check service status.
# Author : Jacques Duplessis - April 2011
# --------------------------------------------------------------------------
#set -x
CDIR="/cadmin"                 ; export CDIR      # Root directory for Services
CSVC="$CDIR/srv"               ; export CSVC      # Service Scripts Directory
CCFG="$CDIR/cfg"               ; export CCFG      # Service Config. Directory
INST="srv_ftp"                 ; export INST      # Service Instance Name
HOSTNAME=`hostname -a`         ; export HOSTNAME  # HostName
VSFTPD="/usr/sbin/vsftpd"      ; export VSFTPD    # Service Program name
RC=0                           ; export RC        # Service Return Code
LOG="$CDIR/log/${INST}.log"    ; export LOG       # Service Log file name
FCFG="${CCFG}/${INST}.conf"    ; export FCFG      # Service Config. file name
DASH="---------------------"   ; export DASH      # Dash Line

# --------------------------------------------------------------------------
# Where the Action Starts
# --------------------------------------------------------------------------
case "$1" in
  start)  echo -e "\n${DASH}" >> $LOG 2>&1
          echo -e "Starting service $INST on $HOSTNAME at `date`" >> $LOG 2>&1
          echo -e "${VSFTPD} ${FCFG}" >> $LOG 2>&1
          ${VSFTPD} ${FCFG} >> $LOG 2>&1
          RC=$?
          FPID=`ps -ef | grep -v grep | grep ${FCFG} | awk '{ print $2 }' | head -1`
          echo "Service $INST started on $HOSTNAME - PID=${FPID} RC=$RC" >> $LOG
          echo "${DASH}" >> $LOG 2>&1
          ;;
  stop )  echo -e "\n${DASH}" >> $LOG 2>&1
          echo -e "Stopping Service $INST on $HOSTNAME at `date` " >> $LOG
          ps -ef | grep ${FCFG} | grep -v grep >> $LOG 2>&1
          FPID=`ps -ef | grep -v grep | grep ${FCFG} | awk '{ print $2 }' | head -1`
          echo -e "Killing PID ${FPID}" >> $LOG 2>&1
          kill $FPID
          RC=0
          echo -e "Service $INST is stopped ..." >> $LOG 2>&1
          echo "${DASH}" >> $LOG 2>&1
          ;;
  status) COUNT=`ps -ef | grep ${FCFG} | grep -v grep | wc -l`
          FPID=`ps -ef | grep -v grep | grep ${FCFG} | awk '{ print $2 }' | head -1`
          echo -n "`date` Service $INST ($COUNT) on $HOSTNAME" >> $LOG 2>&1
          if [ $COUNT -gt 0 ]
             then echo " - PID=${FPID} - OK" >> $LOG 2>&1
                  RC=0
             else echo " - NOT RUNNING" >> $LOG 2>&1
                  ps -ef | grep -i ${FCFG} | grep -v grep >> $LOG 2>&1
                  RC=1
          fi
          ;;
esac
exit $RC

This script is placed in the directory /cadmin/srv and named srv_ftp.sh.
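The script above launches vsftpd with the configuration file /cadmin/cfg/srv_ftp.conf. The downloadable archive contains the real file; if you prefer to write your own, a minimal sketch could look like the lines below. These values are only assumptions for illustration: the key points are that background=YES lets the script return after launching the daemon, and listen_address binds the daemon to the service's virtual IP (ftp.maison.ca, 192.168.1.204) rather than to the node's own address.

# /cadmin/cfg/srv_ftp.conf - minimal example, adjust to your needs
listen=YES
background=YES
listen_address=192.168.1.204
anonymous_enable=NO
local_enable=YES
write_enable=YES
xferlog_enable=YES
pam_service_name=vsftpd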

Add ftp script to our ftp cluster service


Now, let's add this script to our ftp service. Run the system-config-cluster command:

root@gandalf:~# system-config-cluster &

Click on Resources on the left side and then on the Create a Resource button at the bottom right of the screen. This will allow us to insert our ftp service script into the cluster configuration.

Select Script from the Resource Type list, enter the name of our resource (srv_ftp) and then specify the name of the script the service will use, with its full path. As mentioned earlier, we decided to place it in our /cadmin GFS filesystem so it is seen by every node in the cluster.

Now we need to edit our srv_ftp service to add the resource we just created. Select the srv_ftp service at the bottom left of the screen and then press the Edit Service Properties button.

Click on the Add a Shared Resource to this service button. This will bring up the screen below, where we select the srv_ftp script that we want to add to our service.

After adding our script to the service, press the Close button. We are now ready to push the new configuration to the members of our cluster; press the Send to Cluster button to do so.
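For reference, after these GUI steps the /etc/cluster/cluster.conf on each node should contain a resource-manager fragment along these lines. This is only a sketch of the relevant part: the virtual IP 192.168.1.204 comes from the earlier parts of this series, the failover domain attribute is omitted here, and attribute values may differ slightly in your own file.

<rm>
   <resources>
      <script file="/cadmin/srv/srv_ftp.sh" name="srv_ftp"/>
   </resources>
   <service autostart="1" name="srv_ftp">
      <ip address="192.168.1.204" monitor_link="1"/>
      <script ref="srv_ftp"/>
   </service>
</rm>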

Web site prerequisite


Make sure that the httpd and php packages are installed on every server in our cluster. You can check whether they are installed by typing the following command:

root@gandalf # rpm -q httpd php
httpd-2.2.3-45.el5
php-5.1.6-27.el5_5.3
root@gandalf #

If they are not installed, run the following command on the servers where they are missing:

root@bilbo:~# yum install httpd php

We must make sure that httpd is not started and does not start upon reboot. To do so, use the following commands on all servers:

root@bilbo:~# service httpd stop
Shutting down httpd:                                       [FAILED]
root@bilbo:~# chkconfig httpd off

Script to stop/start/status our Web service


We have simplified the configuration of our web site to the minimum. This was done intentionally; we wanted to demonstrate the cluster functionality, not the httpd possibilities. But our web site will be functional and redundant. The web server script works much like the ftp script. You can download this script and the httpd configuration file if you want, but remember that if you want to use them as is, you must put them in the /cadmin filesystem: the srv_www.sh script goes in the subdirectory /cadmin/srv and the configuration file srv_www.conf goes in the /cadmin/cfg directory.

#! /bin/bash
# --------------------------------------------------------------------------
# Script to stop/start and give a status of our web service in the cluster.
# This script is built to receive 1 parameter :
#   - start  : Executed by cluster to start the application(s) or service(s)
#   - stop   : Executed by cluster to stop the application(s) or service(s)
#   - status : Executed by cluster every 30 seconds to check service status.
# Author : Jacques Duplessis - April 2011
# --------------------------------------------------------------------------
#set -x
CDIR="/cadmin"                 ; export CDIR      # Root directory for Services
CSVC="$CDIR/srv"               ; export CSVC      # Service Scripts Directory
CCFG="$CDIR/cfg"               ; export CCFG      # Service Config. Directory
INST="srv_www"                 ; export INST      # Service Instance Name
HOSTNAME=`hostname -a`         ; export HOSTNAME  # HostName
HTTPD="/usr/sbin/httpd"        ; export HTTPD     # Service Program name
LOG="$CDIR/log/${INST}.log"    ; export LOG       # Service Log file name
HCFG="${CCFG}/${INST}.conf"    ; export HCFG      # Service Config. file name
RC=0                           ; export RC        # Service Return Code
DASH="---------------------"   ; export DASH      # Dash Line

# --------------------------------------------------------------------------
# Where the Action Starts
# --------------------------------------------------------------------------
case "$1" in
  start)  echo -e "\n${DASH}" >> $LOG 2>&1
          echo -e "Starting service $INST on $HOSTNAME at `date`" >> $LOG 2>&1
          echo -e "${HTTPD} -f ${HCFG}" >> $LOG 2>&1
          ${HTTPD} -f ${HCFG} >> $LOG 2>&1
          RC=$?
          HPID=`cat ${CCFG}/${INST}.pid`
          echo "Service $INST started on $HOSTNAME - PID=${HPID} RC=$RC" >> $LOG
          echo "${DASH}" >> $LOG 2>&1
          ;;
  stop )  echo -e "\n${DASH}" >> $LOG 2>&1
          echo -e "Stopping Service $INST on $HOSTNAME at `date` " >> $LOG
          HPID=`cat ${CCFG}/${INST}.pid`
          echo -e "Killing PID ${HPID}" >> $LOG 2>&1
          kill $HPID
          RC=0
          echo -e "Service $INST is stopped ..." >> $LOG 2>&1
          echo "${DASH}" >> $LOG 2>&1
          ;;
  status) COUNT=`ps -ef | grep ${HCFG} | grep -v grep | wc -l`
          HPID=`cat ${CCFG}/${INST}.pid`
          echo -n "`date` Service $INST ($COUNT) on $HOSTNAME" >> $LOG 2>&1
          if [ $COUNT -gt 0 ]
             then echo " - PID=${HPID} - OK" >> $LOG 2>&1
                  RC=0
             else echo " - NOT RUNNING" >> $LOG 2>&1
                  ps -ef | grep -i ${HCFG} | grep -v grep >> $LOG 2>&1
                  RC=1
          fi
          ;;
esac
exit $RC
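The script reads the daemon's PID from /cadmin/cfg/srv_www.pid, so the srv_www.conf passed to httpd -f must define that PidFile. The downloadable archive contains the actual configuration; if you build your own, the simplest approach on RHEL 5 is to copy the stock /etc/httpd/conf/httpd.conf to /cadmin/cfg/srv_www.conf and change only a few directives, roughly as sketched below. The ServerName, DocumentRoot and PidFile values come from this article; the rest is an assumption to adapt to your setup.

# Directives to adjust in /cadmin/cfg/srv_www.conf (copied from the stock httpd.conf)
Listen 80
ServerName www.maison.ca
PidFile /cadmin/cfg/srv_www.pid
DocumentRoot "/cadmin/www/html"
ErrorLog /cadmin/log/srv_www_error.log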

Updating our cluster Configuration

To add our web service, follow the same sequence as when we inserted our ftp service into the cluster configuration. You only need to replace srv_ftp.sh by srv_www.sh; the script path stays the same, since we decided to place our scripts in the directory /cadmin/srv. Once we have pushed the new configuration to all servers in the cluster, we should have a working cluster. The web site defined in the configuration has its root directory set to /cadmin/www/html; it contains only one file, which displays the name of the server it is running on. This will help us test our cluster configuration. If you wish to use the cluster configuration, scripts and configuration files we have used in this series of articles, I would encourage you to download the cadmin.tar file. The file is the actual content of the /cadmin directory used throughout this article. To use it, download the cadmin.tar file, copy it to your /cadmin directory and enter the command tar -xvf ./cadmin.tar. This will extract the tar file and you will have the working environment I used in this article.

Testing our ftp service


So here we are (finally, you would say; me too): we now have a fully working cluster. If we issue the clustat command, this is what we should see.

root@gollum:/# clustat
Cluster Status for our_cluster @ Sat Apr 16 11:37:25 2011
Member Status: Quorate

 Member Name                             ID   Status
 ------ ----                             ---- ------
 hbbilbo.maison.ca                          1 Online, rgmanager
 hbgandalf.maison.ca                        2 Online, rgmanager
 hbgollum.maison.ca                         3 Online, Local, rgmanager

 Service Name              Owner (Last)                   State
 ------- ----              ----- ------                   -----
 service:srv_ftp           hbgollum.maison.ca             started
 service:srv_www           hbbilbo.maison.ca              started
root@gollum:/#

From the information above, we can see that all our cluster members are online and that the resource manager (rgmanager) is running on all of them. The resource manager is important: it is responsible for moving services around when needed. Our service srv_ftp is started (running) on the hbgollum server and srv_www is running on hbbilbo, as we decided at the beginning (remember?).

root@gollum:/# ip addr show | grep 192
    inet 192.168.1.104/24 brd 192.168.1.255 scope global eth0
    inet 192.168.1.204/24 scope global secondary eth0
root@gollum:/#
root@gollum:/# ps -ef | grep vsftpd | grep -v grep
root      7858     1  0 10:05 ?        00:00:00 /usr/sbin/vsftpd /cadmin/cfg/srv_ftp.conf
root@gollum:/#
The ip addr show | grep 192 command confirms that the virtual IP is defined on the hbgollum server, and if we check whether the ftp process is running, we can see that it is. So let's try an FTP to our virtual IP, which we have named ftp.maison.ca (192.168.1.204). We will try it from the gandalf server, and we see that it is working.

root@gandalf:/# ftp ftp.maison.ca
Connected to ftp.maison.ca.
220 ftp.maison.ca
530 Please login with USER and PASS.
530 Please login with USER and PASS.
KERBEROS_V4 rejected as an authentication type
Name (ftp.maison.ca:root):

Now let's move the ftp service from hbgollum to hbbilbo, to see whether the ftp service continues to work. To move the service we use the clusvcadm command: we specify the service we want to relocate (-r) and the member (-m) we wish to move it to. You can issue the clusvcadm command on any of the servers within the cluster. So enter the following command to move our service to hbbilbo:

root@gandalf:/# clusvcadm -r srv_ftp -m hbbilbo
'hbbilbo' not in membership list
Closest match: 'hbbilbo.maison.ca'
Trying to relocate service:srv_ftp to hbbilbo.maison.ca...Success
service:srv_ftp is now running on hbbilbo.maison.ca
root@gandalf:/#
Notice that just after pressing [Enter], we got "'hbbilbo' not in membership list"; this is because we did not mention the domain name maison.ca, but the command managed to assume that we were referring to hbbilbo.maison.ca. Our command succeeded, so let's see if everything went as it should have. First, let's execute the clustat command to see whether the srv_ftp service is now running on hbbilbo.

root@gandalf:/# clustat
Cluster Status for our_cluster @ Sat Apr 16 12:10:01 2011
Member Status: Quorate

 Member Name                             ID   Status
 ------ ----                             ---- ------
 hbbilbo.maison.ca                          1 Online, rgmanager
 hbgandalf.maison.ca                        2 Online, Local, rgmanager
 hbgollum.maison.ca                         3 Online, rgmanager

 Service Name              Owner (Last)                   State
 ------- ----              ----- ------                   -----
 service:srv_ftp           hbbilbo.maison.ca              started
 service:srv_www           hbbilbo.maison.ca              started
root@gandalf:/#

We can see that our ftp service is now reported as running on hbbilbo; let's check that it really is. If we check whether 192.168.1.204 (ftp.maison.ca) is now defined on hbbilbo, we can see that it is. The FTP process is also now running on that server.

root@bilbo:~# ip addr show | grep 192
    inet 192.168.1.111/24 brd 192.168.1.255 scope global eth0
    inet 192.168.1.211/24 scope global secondary eth0
    inet 192.168.1.204/24 scope global secondary eth0
root@bilbo:~# ps -ef | grep vsftpd | grep -v grep
root      8616     1  0 12:05 ?        00:00:00 /usr/sbin/vsftpd /cadmin/cfg/srv_ftp.conf
root@bilbo:~#

But what happened on hbgollum? The IP 192.168.1.204 should not be there anymore and the FTP process should no longer be running. And that is exactly what happened: the IP is gone and the ftp process is no longer running. So far so good, the move to the hbbilbo server has worked.

root@gollum:/etc/profile.d# ip addr show | grep 192
    inet 192.168.1.104/24 brd 192.168.1.255 scope global eth0
root@gollum:/etc/profile.d# ps -ef | grep vsftpd | grep -v grep
root@gollum:/etc/profile.d#
The last test is to try an ftp to ftp.maison.ca and see if it responds.

root@gandalf:/# ftp ftp.maison.ca
Connected to ftp.maison.ca.
220 ftp.maison.ca
530 Please login with USER and PASS.
530 Please login with USER and PASS.
KERBEROS_V4 rejected as an authentication type
Name (ftp.maison.ca:root):
Great, everything worked! Let's move the ftp process back to hbgollum before testing our web site. Open another terminal window, enter the clustat -i 2 command, and watch the status change from started to stopping, starting and back to started while the move is going on (see the sketch below). Also check your /var/log/messages and familiarize yourself with the lines recorded when the move happens.
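For example, you could keep these two commands running in separate terminals while you issue the relocation. This is only a suggestion, and the exact tags written to /var/log/messages may differ on your system.

# Terminal 1 : refresh the cluster status every 2 seconds
clustat -i 2

# Terminal 2 : follow the cluster-related messages during the relocation
tail -f /var/log/messages | grep -i -e rgmanager -e clurgmgrd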

root@gandalf:/# clusvcadm -r srv_ftp -m hbgollum
'hbgollum' not in membership list
Closest match: 'hbgollum.maison.ca'
Trying to relocate service:srv_ftp to hbgollum.maison.ca...Success
service:srv_ftp is now running on hbgollum.maison.ca
root@gandalf:/#
One of the tests we should make is to unplug the network cable or power off hbgollum and see whether the service moves to the next server in the failover domain (it will). We have now completed and tested our ftp service. It has been a long road, but it was worth it, no?

Testing our web service


You now know how to move a service from one server to another. Let's do the same test with our web service. The web site is actually just one simple page; it just displays the name of the server it is running on, which simplifies our testing. If we issue the clustat command we should have the following picture:

# clustat
Cluster Status for our_cluster @ Sat Apr 16 14:15:37 2011
Member Status: Quorate

 Member Name                             ID   Status
 ------ ----                             ---- ------
 hbbilbo.maison.ca                          1 Online, rgmanager
 hbgandalf.maison.ca                        2 Online, Local, rgmanager
 hbgollum.maison.ca                         3 Online, rgmanager

 Service Name              Owner (Last)                   State
 ------- ----              ----- ------                   -----
 service:srv_ftp           hbgollum.maison.ca             started
 service:srv_www           hbbilbo.maison.ca              started

Let's see if it is working: open your browser and type the URL http://www.maison.ca. You should see the following:

Now, let's move the web site to the gandalf server by typing the following command:

root@gollum:/cadmin/cfg# clusvcadm -r srv_www -m hbgandalf
'hbgandalf' not in membership list
Closest match: 'hbgandalf.maison.ca'
Trying to relocate service:srv_www to hbgandalf.maison.ca...Success
service:srv_www is now running on hbgandalf.maison.ca
root@gollum:/cadmin/cfg#
root@gollum:/cadmin/cfg# clustat
Cluster Status for our_cluster @ Sat Apr 16 14:27:14 2011
Member Status: Quorate

 Member Name                             ID   Status
 ------ ----                             ---- ------
 hbbilbo.maison.ca                          1 Online, Local, rgmanager
 hbgandalf.maison.ca                        2 Online, rgmanager
 hbgollum.maison.ca                         3 Online, rgmanager

 Service Name              Owner (Last)                   State
 ------- ----              ----- ------                   -----
 service:srv_ftp           hbgollum.maison.ca             started
 service:srv_www           hbgandalf.maison.ca            started

We can see that the web site is now running on the gandalf server.

Disabling and Enabling Services


There may come a time when you need to stop a service completely. We will demonstrate how to achieve that; first, let's display the status of our cluster.

root@bilbo:~# clustat
Cluster Status for our_cluster @ Sat Apr 16 14:39:48 2011
Member Status: Quorate

 Member Name                             ID   Status
 ------ ----                             ---- ------
 hbbilbo.maison.ca                          1 Online, Local, rgmanager
 hbgandalf.maison.ca                        2 Online, rgmanager
 hbgollum.maison.ca                         3 Online, rgmanager

 Service Name              Owner (Last)                   State
 ------- ----              ----- ------                   -----
 service:srv_ftp           hbgollum.maison.ca             started
 service:srv_www           hbgandalf.maison.ca            started

We are going to disable the srv_www service; to do so, enter the following command:

root@bilbo:~# clusvcadm -d srv_www
Local machine disabling service:srv_www...Success


The clustat command shows us that the service is now disabled.

root@bilbo:~# clustat
Cluster Status for our_cluster @ Sat Apr 16 14:40:04 2011
Member Status: Quorate

 Member Name                             ID   Status
 ------ ----                             ---- ------
 hbbilbo.maison.ca                          1 Online, Local, rgmanager
 hbgandalf.maison.ca                        2 Online, rgmanager
 hbgollum.maison.ca                         3 Online, rgmanager

 Service Name              Owner (Last)                   State
 ------- ----              ----- ------                   -----
 service:srv_ftp           hbgollum.maison.ca             started
 service:srv_www           (hbgandalf.maison.ca)          disabled

We will now enable the service, but this time we will enable it on a server other than hbgandalf. The following command enables the srv_www service on the server hbbilbo.

root@bilbo:~# clusvcadm -e srv_www -m hbbilbo
'hbbilbo' not in membership list
Closest match: 'hbbilbo.maison.ca'
Member hbbilbo.maison.ca trying to enable service:srv_www...Success
service:srv_www is now running on hbbilbo.maison.ca
root@bilbo:~#
We can see that it is now running on hbbilbo.

root@bilbo:~# clustat
Cluster Status for our_cluster @ Sat Apr 16 14:47:36 2011
Member Status: Quorate

 Member Name                             ID   Status
 ------ ----                             ---- ------
 hbbilbo.maison.ca                          1 Online, Local, rgmanager
 hbgandalf.maison.ca                        2 Online, rgmanager
 hbgollum.maison.ca                         3 Online, rgmanager

 Service Name              Owner (Last)                   State
 ------- ----              ----- ------                   -----
 service:srv_ftp           hbgollum.maison.ca             started
 service:srv_www           hbbilbo.maison.ca              started
root@bilbo:~#

This concludes our implementation of a small cluster. It was intended to show everyone how the Red Hat Cluster Suite actually works and to give a brief overview of how to use it. We will now move on to other interesting topics. I don't know yet what the next one will be, but I can assure you that it should fit into one article. I hope you appreciated this series and hope to see you soon.
