
The ultimate Veritas Cluster Server (VCS)

Basics
What are the different service group types ?
Service groups can be one of the following 3 types :
1. Failover : The service group runs on one system at a time.
2. Parallel : The service group runs on multiple systems simultaneously.
3. Hybrid : Used in replicated data clusters (disaster recovery setups). The SG behaves as Failover
within the local cluster and as Parallel for the remote cluster.
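As an illustration, you can check whether an existing service group is configured as failover or parallel by querying its Parallel attribute (0 = failover, 1 = parallel). The group name SG01 below is only a placeholder :
# hagrp -value SG01 Parallel
0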
Where is the VCS main configuration file located ?
The main.cf file contains the configuration of the entire cluster and is located in the directory
/etc/VRTSvcs/conf/config.
How to set VCS configuration file (main.cf) ro/rw ?
To set the configuration file in read-only/read-write :
# haconf -dump -makero (Dumps in memory configuration to main.cf and
makes it read-only)
# haconf -makerw (Makes configuration writable)
Where is the VCS engine log file located ?
The VCS cluster engine log is located at /var/VRTSvcs/log/engine_A.log. We can either
view this file directly or use the command line to view it :
# hamsg engine_A
How to check the complete status of the cluster
To check the status of the entire cluster :
# hastatus -sum
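A trimmed, illustrative output on a healthy 2 node cluster (node and group names are placeholders) looks roughly like this :
-- SYSTEM STATE
-- System          State          Frozen
A  node01          RUNNING        0
A  node02          RUNNING        0
-- GROUP STATE
-- Group      System     Probed   AutoDisabled   State
B  SG01       node01     Y        N              ONLINE
B  SG01       node02     Y        N              OFFLINE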
How to verify the syntax of the main.cf file
To verify the syntax of the main.cf file, just mention the absolute directory path of the main.cf
file :
# hacf -verify /etc/VRTSvcs/conf/config
What are the different resource types ?
1. Persistent : VCS can only monitor these resources but can not offline or online them.
2. On-Off : VCS can start and stop On-Off resource type. Most resources fall in this category.
3. On-Only : VCS starts On-Only resources but does not stop them. An example would be NFS
daemon. VCS can start the NFS daemon if required, but can not take it offline if the associated
service group is taken offline.
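To see which category a particular resource type belongs to, you can check the Operations attribute of its type definition (None = persistent, OnOnly, OnOff). The NIC type is used here only as an example :
# hatype -display NIC | grep Operations
NIC Operations None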
Explain the steps involved in Offline VCS configuration
1. Save and close the configuration :
# haconf -dump -makero
2. Stop VCS on all nodes in the cluster :
# hastop -all
3. Edit the configuration file after taking a backup and make the changes :
# cp -p /etc/VRTSvcs/conf/config/main.cf
/etc/VRTSvcs/conf/config/main.cf_17march
# vi /etc/VRTSvcs/conf/config/main.cf
4. Verify the configuration file syntax :
# hacf -verify /etc/VRTSvcs/conf/config/
5. Start VCS on the system with the modified main.cf file :
# hastart
6. Start VCS on the other nodes in the cluster.
Note : This can also be done by just stopping VCS and leaving the services running, to
minimize the downtime (hastop -all -force).
GAB, LLT and HAD
What are GAB, LLT and HAD and what are their functions ?
GAB, LLT and HAD form the basic building blocks of VCS functionality.
LLT (Low Latency Transport protocol) : LLT transmits the heartbeats over the interconnects. It
is also used to distribute the inter-system communication traffic equally among all the
interconnects.
GAB (Group membership services and Atomic Broadcast) : The group membership service
part of GAB maintains the overall cluster membership information by tracking the heartbeats
sent over the LLT interconnects. The atomic broadcast of cluster membership ensures that every
node in the cluster has the same information about every resource and service group in the cluster.
HAD (High Availability Daemon) : the main VCS engine, which manages the agents and
service groups. It is in turn monitored by a daemon named hashadow.
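A quick way to confirm that all three layers are up on a node is shown below (illustrative commands; the grep pattern is just one way to spot the daemons) :
# lltconfig (reports whether LLT is running)
# gabconfig -a (port a = GAB membership, port h = HAD membership)
# ps -ef | grep -e had -e hashadow (both had and hashadow should be running)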
What are the various GAB ports and their functionalities ?
a --> gab driver
b --> I/O fencing (to ensure data integrity)
d --> ODM (Oracle Disk Manager)
f --> CFS (Cluster File System)
h --> VCS (VERITAS Cluster Server: high availability daemon, HAD)
o --> VCSMM driver (kernel module needed for Oracle and VCS
interface)
q --> QuickLog daemon
v --> CVM (Cluster Volume Manager)
w --> vxconfigd (module for cvm)
How to check the status of various GAB ports on the cluster nodes
To check the status of GAB ports on various nodes :
# gabconfig -a
What is the maximum number of LLT links (including high and low priority) a cluster can
have ?
A cluster can have a maximum of 8 LLT links including high and low priority LLT links.
How to check the detailed status of LLT links ?
The command to check detailed LLT status is :
# lltstat -nvv
What are the various LLT configuration files and their function ?
LLT uses /etc/llttab to set the configuration of the LLT interconnects.
# cat /etc/llttab
set-node node01
set-cluster 02
link nxge1 /dev/nxge1 - ether - -
link nxge2 /dev/nxge2 - ether - -
link-lowpri /dev/nxge0 ether - -
Here, set-cluster -> unique cluster number assigned to the entire cluster [ can have a value
ranging from 0 to (64k - 1) ]. It should be unique across the organization.
set-node -> a unique number assigned to each node in the cluster. Here the name node01 has a
corresponding unique node number in the file /etc/llthosts. It can range from 0 to 31.
Another configuration file used by LLT is /etc/llthosts. It has the cluster-wide unique node
number and nodename as follows:
# cat /etc/llthosts
0 node01
1 node02
LLT has another optional configuration file : /etc/VRTSvcs/conf/sysname. It contains short
names for VCS to refer to. It can be used by VCS to remove the dependency on OS hostnames.
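For example, the sysname file typically contains just the short name of the local node (shown here for node01) :
# cat /etc/VRTSvcs/conf/sysname
node01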
What are various GAB configuration files and their function ?
The file /etc/gabtab contains the command to start the GAB.
# cat /etc/gabtab
/sbin/gabconfig -c -n 4
here -n 4 -> minimum number of nodes that must be communicating in order to start VCS.
How to start/stop GAB
The commands to start and stop GAB are :
# gabconfig -c (start GAB)
# gabconfig -U (stop GAB)
How to start/stop LLT
The commands to stop and start LLT are :
# lltconfig -c -> start LLT
# lltconfig -U -> stop LLT (GAB needs to be stopped first)
What is GAB seeding and why is manual GAB seeding required ?
The GAB configuration file /etc/gabtab defines the minimum number of nodes that must be
communicating for the cluster to start. This is called GAB seeding.
In case we don't have a sufficient number of nodes to start VCS [ maybe due to a maintenance
activity ], but have to start it anyway, then we have to do what is called manual seeding by firing
the below command on each of the available nodes.
# gabconfig -c -x
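After seeding manually on the available nodes, verify that GAB has formed a membership (port a) even though fewer nodes than the -n value in /etc/gabtab are up :
# gabconfig -a (port a membership should now list the seeded nodes)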
How to start HAD or VCS ?
To start HAD or VCS on all nodes in the cluster, the hastart command needs to be run on all
nodes individually.
# hastart
What are the various ways to stop HAD or VCS cluster ?
The command hastop gives various ways to stop the cluster.
# hastop -local
# hastop -local -evacuate
# hastop -local -force
# hastop -all -force
# hastop -all
-local -> Stops service groups and VCS engine [HAD] on the node where it is fired
-local -evacuate -> migrates the service groups from the node where it is fired to other nodes and
then stops HAD on that node only
-local -force -> Stops HAD leaving services running on the node where it is fired
-all -force -> Stops HAD on all the nodes of cluster leaving the services running
-all -> Stops HAD on all nodes in cluster and takes service groups offline
Resource Operations
How to list all the resource dependencies
To list the resource dependencies :
# hares -dep
How to enable/disable a resource ?
# hares -modify [resource_name] Enabled 1 (To enable a resource)
# hares -modify [resource_name] Enabled 0 (To disable a resource)
How to list the parameters of a resource
To list all the parameters of a resource :
# hares -display [resource]
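As a rough sketch, adding a new resource to a service group and linking it to another resource uses the same hares syntax. The resource names (webip, webnic), type (IP), group (SG01) and attribute values below are placeholders and will differ in your environment :
# haconf -makerw
# hares -add webip IP SG01 (add resource webip of type IP to group SG01)
# hares -modify webip Device nxge0
# hares -modify webip Address "192.168.1.50"
# hares -modify webip Enabled 1
# hares -link webip webnic (webip now depends on the webnic resource)
# haconf -dump -makero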

Service group operations
How to add a service group(a general method) ?
In general, to add a service group named SG with 2 nodes (node01 and node02) :
# haconf -makerw
# hagrp -add SG
# hagrp -modify SG SystemList node01 0 node02 1
# hagrp -modify SG AutoStartList node02
# haconf -dump -makero
How to check the configuration of a service group SG ?
To see the service group configuration :
# hagrp -display SG
How to bring service group online/offline ?
To online/offline the service group on a particular node :
# hagrp -online [service-group] -sys [node] (Online the SG on a
particular node)
# hagrp -offline [service-group] -sys [node] (Offline the SG on
particular node)
The -any option, when used instead of the node name, brings the SG online/offline based on the
SG's failover policy.
# hagrp -online [service-group] -any
# hagrp -offline [service-group] -any
How to switch service groups ?
The command to switch the service group to target node :
# hagrp -switch [service-group] -to [target-node]
How to freeze/unfreeze a service group and what happens when you do so ?
When you freeze a service group, VCS continues to monitor the service group, but does not
allow it or the resources under it to be taken offline or brought online. Failover is also disabled
even when a resource faults. When you unfreeze the SG, it starts behaving in the normal way.
To freeze/unfreeze a Service Group temporarily :
# hagrp -freeze [service-group]
# hagrp -unfreeze [service-group]
To freeze/unfreeze a Service Group persistently (across reboots) :
# hagrp -freeze [service-group] -persistent
# hagrp -unfreeze [service-group] -persistent
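To verify whether a service group is currently frozen, query its Frozen (persistent) and TFrozen (temporary) attributes; a value of 1 means frozen, 0 means not frozen :
# hagrp -value [service-group] Frozen
# hagrp -value [service-group] TFrozen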

Communication failures : Jeopardy, split brain
What is a Jeopardy membership in VCS clusters ?
When a node in the cluster has only the last LLT link intact, the node forms a regular
membership with other nodes with which it has more than one LLT link active and a Jeopardy
membership with the node with which it has only one LLT link active.

Effects of jeopardy : (considering a 3 node cluster node01, node02 and node03, where node03 has only one LLT link left)
1. Jeopardy membership formed only for node03
2. Regular membership between node01, node02, node03
3. Service groups SG01, SG02, SG03 continue to run and other cluster functions remain
unaffected.
4. If node03 faults or last link breaks, SG03 is not started on node01 or node02. This is done to
avoid data corruption, as in case the last link is broken the nodes node02 and node01 may think
that node03 is down and try to start SG03 on them. This may lead to data corruption as the same
service group may be online on 2 systems.
5. Failover due to resource fault or operator request would still work.
How to recover from a jeopardy membership ?
To recover from jeopardy, just fix the failed link(s) and GAB automatically detects the new
link(s) and the jeopardy membership is removed from the node.
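Once the failed link is back, a quick check confirms that all LLT links are UP again and that no jeopardy entry remains in the GAB membership :
# lltstat -nvv | more
# gabconfig -a (no "jeopardy" line should be listed against any port)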
What is a split brain condition ?
Split brain occurs when all the LLT links fail simultaneously. Here the systems in the cluster fail to
identify whether it is a system failure or an interconnect failure. Each mini-cluster thus formed
thinks that it is the only cluster that is active at the moment and tries to start the service groups of
the other mini-cluster, which it thinks is down. A similar thing happens on the other mini-cluster
and this may lead to simultaneous access to the storage, which can cause data corruption.
What is I/O fencing and how it prevents split brain ?
VCS implements an I/O fencing mechanism to avoid a possible split brain condition. It ensures data
integrity and data protection. The I/O fencing driver uses SCSI-3 PGR (Persistent Group Reservations)
to fence off the data in case of a possible split brain scenario.

In case of a possible split brain
Assume, as an example, that node01 has key A and node02 has key B.
1. Both nodes think that the other node has failed and start racing to write their keys to the
coordinator disks.
2. node01 manages to write its key to the majority of the coordinator disks i.e. 2 disks
3. node02 panics
4. node01 now has a perfect membership and hence Service groups from node02 can be started
on node01
What is the difference between MultiNICA and MultiNICB resource types ?
MultiNICA and IPMultiNIC
- supports active/passive configuration.
- Requires only 1 base IP (test IP).
- Does not require all IPs to be in the same subnet.
MultiNICB and IPMultiNICB
- supports active/active configuration.
- Faster failover than the MultiNICA.
- Requires IP address for each interface.
Troubleshooting
How to flush a service group and when its required ?
Flushing of a service group is required when agents for the resources in the service group seem
suspended, waiting for resources to be taken online/offline. Flushing a service group clears any
internal wait states and stops VCS from attempting to bring resources online.
To flush the service group SG on the cluster node, node01 :
# hagrp -flush [SG] -sys node01
How to clear resource faults ?
To clear a resource fault, we first have to fix the underlying problem.
1. For persistent resources :
Do not do anything and wait for the next OfflineMonitorInterval (default 300 seconds) for the
resource to become online.
2. For non-persistent resources :
Clear the fault and probe the resource on node01 :
# hares -clear [resource_name] -sys node01
# hares -probe [resource_name] -sys node01
How to clear resources with ADMIN_WAIT state ?
If the ManageFaults attribute of a service group is set to NONE, VCS does not take any
automatic action when it detects a resource fault. VCS places the resource into the
ADMIN_WAIT state and waits for administrative intervention.
1. To clear the resource in ADMIN_WAIT state without faulting service group :
# hares -probe [resource] -sys node01
2. To clear the resource in ADMIN_WAIT state by changing the status to
OFFLINE|FAULTED :
# hagrp -clearadminwait -fault [SG] -sys node01

Advanced Operations
How to add/remove a node from an active VCS cluster online
To add a node to the cluster refer to the article :
How to add a node to an active VCS cluster
To remove a node from the cluster refer to the article :
How to remove a node from an active VCS cluster
How to configure I/O fencing
For detailed steps on configuring I/O fencing refer to the post :
How to configure VCS I/O fencing : Command line and installer utility
How to add/remove LLT links to/from VCS cluster
To add/remove LLT links refer to the detailed steps in the post :
How to add or remove LLT links in VCS cluster
Hope this post was informative. Do comment below if you have some more questions to add to
this list.

How to remove a node from an active VCS
cluster
There might be situations when you want to remove a node from the cluster. We can do this online
without interrupting the cluster services. For this, the remaining nodes should have sufficient
resources to handle the load of the node which you are going to remove from the cluster.

We will be removing node03 with service group SG03 running on it.
Freeze node and halt VCS
Freeze the node03 and stop VCS on that node by evacuating it, so that service group SG03 will
switch over to either node01 or node02.
# haconf -makerw
# hasys -freeze -persistent node03
# haconf -dump
# hastop -sys node03 -evacuate
Stop I/O fencing and stop LLT/GAB modules
Stop the VCS I/O fencing and LLT/GAB modules on node03.
# /sbin/vxfen-shutdown
# vxfenadm -d (check status of fencing from node01 or node02)
# gabconfig -U
# lltconfig -U
Remove VCS software
Remove VCS software from node03 by running the installer utility and rename the main.cf file.
Modify AutoStartList and SystemList
Run the below commands on node01 and node02 (I'm assuming both nodes have node03 in their
SystemList and AutoStartList attributes)
On node01
# hagrp -modify SG01 AutoStartList delete node03
# hagrp -modify SG01 SystemList delete node03
On node02
# hagrp -modify SG02 AutoStartList delete node03
# hagrp -modify SG02 SystemList delete node03
Remove node from ClusterService SG
Remove node03 from the ClusterService SG on both node01 and node02
# hagrp -modify ClusterService AutoStartList delete node03
Remove node from cluster
Now remove node03 from the cluster and save the configuration (run this on node01 or node02)
# hasys -delete node03
# haconf -dump -makero
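Finally, a few illustrative checks on the remaining nodes (node01 or node02) to confirm that node03 is gone from the cluster configuration :
# hastatus -sum (node03 should no longer be listed)
# hasys -list (only node01 and node02 should appear)
# grep node03 /etc/VRTSvcs/conf/config/main.cf (should return nothing)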




VCS cluster 101 : Cluster Communications,
GAB, LLT, HAD
GAB, LLT and HAD form the basic building blocks of VCS functionality. The I/O fencing driver on
top of them provides the required data integrity. Before diving deep into the topic, let's see some basic
components of VCS which contribute to the communication stack of VCS.

LLT
- LLT stands for low latency transport protocol. The main purpose of LLT is to transmit
heartbeats.
- GAB determines the state of a node with the heartbeats sent over the LLTs.
- LLT links are also used to distribute the inter-system communication traffic equally among all the
interconnects.
- We can configure up to 8 LLT links including the low and high priority links
1. High Priority links
- Heartbeat is sent every 0.5 seconds
- Cluster status information is passed to other nodes
- Configured using a private/dedicated network.
2. Low priority links
- Heartbeat is sent every 1 second
- No cluster status is sent over these links
- Automatically becomes a high priority link if there are no more high priority links left
- Usually configured over a public interface (not dedicated)
To view the LLT link status verbosely, use the command :
# lltstat -nvv | more
Node State Link Status Address
*0 node01 OPEN
ce0 UP 00:11:98:FF:FC:A4
ce1 UP 00:11:98:F3:9C:8C

1 node02 OPEN
ce0 UP 00:11:98:87:F3:F4
ce1 UP 00:11:98:23:9A:C8
Other lltstat commands
# lltstat -> outputs link statistics
# lltstat -c -> displays LLT configuration directives
# lltstat -l -> lists information about each configured LLT link
Commands to start/stop LLT
# lltconfig -c -> start LLT
# lltconfig -U -> stop LLT (GAB needs to be stopped first)
LLT configuration files
LLT uses /etc/llttab to set the configuration of the LLT interconnects.
# cat /etc/llttab
set-node node01
set-cluster 02
link nxge1 /dev/nxge1 - ether - -
link nxge2 /dev/nxge2 - ether - -
link-lowpri /dev/nxge0 ether - -
set-cluster -> unique cluster number assigned to the entire cluster [ can have a value ranging
from 0 to (64k - 1) ]. It should be unique across the organization.
set-node -> a unique number assigned to each node in the cluster. Here the name node01 has a
corresponding unique node number in the file /etc/llthosts. It can range from 0 to 31.
- Another configuration file used by LLT is /etc/llthosts.
- It has the cluster-wide unique node number and nodename as follows:
# cat /etc/llthosts
0 node01
1 node02
- LLT has another optional configuration file : /etc/VRTSvcs/conf/sysname.
- It contains short names for VCS to refer to. It can be used by VCS to remove the dependency on
OS hostnames.
GAB
- GAB stands for Group membership services and atomic broadcast.
- Group membership services : It maintains the overall cluster membership information by
tracking the heartbeats sent over the LLT interconnects. If any node fails to send the heartbeat over
LLT, the GAB module sends the information to the I/O fencing module to take further action to avoid
any split brain condition if required. It also talks to HAD, which manages the agents and
service groups.
- Atomic Broadcast : the atomic broadcast of cluster membership ensures that every node in the
cluster has the same information about every resource and service group in the cluster.
GAB configuration files
The file /etc/gabtab contains the command to start the GAB.
# cat /etc/gabtab
/sbin/gabconfig -c -n 4
here -n 4 -> number of nodes that must be communicating in order to start VCS.
Note : It's not always the total number of nodes in the cluster; it's the minimum number of nodes
that must be communicating with each other in order to start VCS.
Seeding During startup
- The option -n 4 in the GAB configuration file shown above ensures that the minimum number of
nodes are communicating before VCS can start. This is called seeding.
- In case we don't have a sufficient number of nodes to start VCS [ maybe due to a maintenance
activity ], but have to start it anyway, then we have to do what is called manual seeding by firing
the below command on each of the available nodes.
# gabconfig -c -x
Note : Make sure that no node is already seeded before doing this, as it can create a potential split
brain scenario in clusters not using I/O fencing.
Start/Stop GAB
# gabconfig -c (start GAB)
# gabconfig -U (stop GAB)
To check the status of GAB
# gabconfig -a
GAB Port Memberships
===============================================
Port a gen a36e001 membership 01
Port b gen a36e004 membership 01
Port h gen a36e002 membership 01
Common GAB ports
a --> gab driver
b --> I/O fencing (to ensure data integrity)
d --> ODM (Oracle Disk Manager)
f --> CFS (Cluster File System)
h --> VCS (VERITAS Cluster Server: high availability daemon, HAD)
o --> VCSMM driver (kernel module needed for Oracle and VCS
interface)
q --> QuickLog daemon
v --> CVM (Cluster Volume Manager)
w --> vxconfigd (module for cvm)
HAD
- HAD, the High Availability Daemon, is the main VCS engine, which manages the agents and service
groups.
- It is in turn monitored by hashadow daemon.
- HAD maintains the resource configuration and state information.
Start/Stop HAD
- hastart command needs to be run on every node in the cluster where you want to start the HAD.
- hastop, however, can be run from any one node in the cluster to stop the entire cluster.
- hastop gives us various options to control the behavior of service groups upon stopping the node.
# hastart
# hastop -local
# hastop -local -evacuate
# hastop -local -force
# hastop -all -force
# hastop -all
Meanings of various parameters of hastop are:
-local -> Stops service groups and VCS engine [HAD] on the node where it is fired
-local -evacuate -> migrates the service groups from the node where it is fired to other nodes and
then stops HAD on that node only
-local -force -> Stops HAD leaving services running on the node where it is fired
-all -force -> Stops HAD on all the nodes of cluster leaving the services running
-all -> Stops HAD on all nodes in cluster and takes service groups offline

VCS cluster 101 : Communication faults,
jeopardy, split brain, I/O fencing
Let us now see the various communication faults that can occur in a VCS cluster and how the VCS
engine reacts to these faults. There are basically 2 types of communication failures :
1. Single LLT link failure (jeopardy)
When a node in the cluster has only the last LLT link intact, the node forms a regular
membership with other nodes with which it has more than one LLT link active and a Jeopardy
membership with the node with which it has only one LLT link active. So, considering a 3 node
cluster where node03 has only one LLT link left, node03 forms a jeopardy membership and there is
also a regular membership between node01, node02 and node03.
The output of gabconfig -a in case of a jeopardy membership :
GAB Port Memberships
===================================
Port a gen a36e0003 membership 012
Port a gen a36e0003 jeopardy ;2
Port h gen fd570002 membership 012
Port h gen fd570002 jeopardy ;2
Effects of Jeopardy
1. Jeopardy membership formed only for node03
2. Regular membership between node01, node02, node03
3. Service groups SG01, SG02, SG03 continue to run and other cluster functions remain
unaffected.
4. If node03 faults or last link breaks, SG03 is not started on node01 or node02. This is done to
avoid data corruption, as in case the last link is broken the nodes node02 and node01 may think
that node03 is down and try to start SG03 on them. This may lead to data corruption as the same
service group may be online on 2 systems.
5. Failover due to resource fault or operator request would still work.
Recovery
To recover from jeopardy, just fix the link and GAB automatically detects the new link and the
jeopardy membership is removed from node03.
2. Network partition
Now consider a case where the last link also fails (note that the last link fails when node03
was already in jeopardy membership). In that case 2 mini-clusters are formed.
Effects of network partition
1. New regular membership formed between node01 and node02 and a separate membership for
node03
2. Two mini-clusters are formed, one with node01 and node02 and one containing only node03
3. As node03 was already in jeopardy membership :
a) SG01 and SG02 are auto-disabled on node03
b) SG03 is auto-disabled on node01 and node02
This implies that SG01 and SG02 can not be started on node03 and SG03 can not be started on
node01 or node02. But SG01 and SG02 can failover to node02 and node01 respectively.
Recovery
Now if you directly fix the LLT links, there will be a mismatch in the cluster configurations of
the 2 mini-clusters. To avoid this, shut down the mini-cluster with the fewest nodes, in our case node03.
Fix the LLT links and start up node03. In case you fix the links without shutting down any one
mini-cluster, GAB prevents a split brain scenario by panicking the mini-cluster with the lowest number
of nodes. In a 2 node cluster, the node with the higher LLT node number panics. Similarly, in case the
mini-clusters have the same number of nodes, the mini-cluster containing the lowest LLT node
number continues to run, while the nodes of the other mini-cluster panic.
Split Brain
Split brain occurs when all the LLT links fail simultaneously. Here the systems in the cluster fail to
identify whether it is a system failure or an interconnect failure. Each mini-cluster thus formed
thinks that it is the only cluster that is active at the moment and tries to start the service groups of
the other mini-cluster, which it thinks is down. A similar thing happens on the other mini-cluster
and this may lead to simultaneous access to the storage, which can cause data corruption.

I/O fencing
VCS implements an I/O fencing mechanism to avoid a possible split brain condition. It ensures data
integrity and data protection. The I/O fencing driver uses SCSI-3 PGR (Persistent Group Reservations)
to fence off the data in case of a possible split brain scenario. Persistent group reservations are
persistent across SCSI bus resets and support multi-pathing from host to disk.

Coordinator disks
Coordinator disks are used to store the key of each host, which can be used to determine which
node stays in the cluster in case of a possible split brain scenario. In case of a split brain scenario,
the fencing driver races for the coordinator disks to ensure that only one mini-cluster survives.
Data disks
The disks used in shared storage for VCS are automatically fenced off as they are discovered and
configured under VxVM.
Now consider various scenarios and how fencing works to avoid any data corruption.
In case of a possible split brain
Assume, as an example, that node01 has key A and node02 has key B.
1. Both nodes think that the other node has failed and start racing to write their keys to the
coordinator disks.
2. node01 manages to write its key to the majority of the coordinator disks i.e. 2 disks
3. node02 panics
4. node01 now has a perfect membership and hence Service groups from node02 can be started
on node01
In case of a node failure
Assume that node02 fails.
1. node01 detects no heartbeats from node02 and starts racing to register its keys on the
coordinator disks, ejecting the keys of node02.
2. node01 wins the race, forming a perfect cluster membership.
3. VCS can thus fail over any service group that was on node02 to node01
In case of manual seeding after reboot in a network partition
Consider a case where there is already a network partition and a node [node02] reboots. At this
point the rebooted node can not join the cluster, because the gabtab file specifies a minimum of 2
nodes to be communicating in order to start VCS, and it can't communicate with the other node
due to the network partition.
1. node02 reboots and a user manually forces GAB on node02 to seed the node.
2. node02 detects keys of node01 pre-existing on the coordinator disks and comes to know about
the existing network partition. I/O fencing driver thus prevents HAD from starting and outputs
an error on the console about the pre-existing network partition.
Summary
VCS ensures data integrity by using all of the below mechanisms.
1. I/O fencing : the recommended method; requires SCSI-3 PGR compatible disks to implement.
2. GAB seeding : prevents service groups from starting if nodes are not communicating; ensures
a cluster membership is formed.
3. Jeopardy cluster membership
4. Low priority links : ensure redundancy in case high priority links fail. In case of a network
partition where all high priority links fail, a low priority link can be promoted to a high priority
link to form a jeopardy membership.
How to add a node to an active VCS cluster
There might be a requirement to add a new node to an existing VCS cluster to increase the
cluster capacity. Another situation in which there is a need to add a new node is hardware
upgrade of the nodes. In this post, we will be adding a 3rd node (node03) to the existing 2 node
cluster.

Install and configure VCS on the new node
Install the VCS cluster software on node03. When asked to configure it, select no. After you have
finished installing the VCS software, run installsf from the software media :
# cd /cdrom/VRTSvcs/
# installsf -addnode
Enter the name of one of the existing nodes of the cluster
Enter a node of SF cluster to which you want to add a node: node01
Checking communication on node01 .................. Done
Checking release compatibility on node01 .......... Done
Following cluster information detected: Cluster Name: geekdiary
Cluster ID: 3
Systems: node01 node02
Is this information correct? [y,n,q] (y)? y
Checking communication on node03 ................. Done
Checking VCS running state on node01 ............. Done
Checking VCS running state on node02 ............. Done
Checking VCS running state on node03 ............. Done
Enter the system names separated by spaces to add to the cluster:? node03
Checking communication on node03.................. Done
Checking release compatibility on node03.......... Done
Do you want to add the system(s) node03 to the cluster geekdiary? [y,n,q]
(y)? y
Checking installed product on cluster geekdiary ....... SF5.1
Checking installed product on node03 ............ SF5.1
Checking installed packages on node03 ........... Done
Discovering NICs on node03 ... Discovered nxge0 nxge1 nxge2 nxge3
To use aggregated interfaces for private heartbeat, enter the name of an
aggregated interface.
To use a NIC for private heartbeat, enter a NIC which is not part of an
aggregated interface.
Enter the NIC for the first private heartbeat link on node03: [b,q,?]? nxge2
Would you like to configure a second private heartbeat link? [y,n,q,b,?] (y)?
y
Enter the NIC for the second private heartbeat link on node03: [b,q,?]? nxge3
Would you like to configure a third private heartbeat link? [y,n,q,b,?] (n)?
n
Do you want to configure an additional low priority heartbeat link?
[y,n,q,b,?] (n)? n
Checking Media Speed for nxge2 on node03 ......... 1000
Checking Media Speed for nxge3 on node03 ......... 1000
Private Heartbeat NICs for node03:
link1=nxge2
link2=nxge3
Is this information correct? [y,n,q] (y)? y
In the case where you have configured a virtual cluster IP
Cluster Virtual IP is configured on the cluster geekdiary
A public NIC device is required by following services on each of the newly
added nodes:
Cluster Virtual IP
Active NIC devices discovered on node03: nxge0 nxge1
Enter the NIC for the VCS to use on node03: (nxge0)
nxge0

Verify
Confirm the configuration: run these commands on node03
# lltstat -nvv
# cat /etc/llttab
# cat /etc/llthosts
# gabconfig -a
# cat /etc/gabtab
# vxfenadm -d
# cat /etc/vxfenmode
# cat /etc/VRTSvcs/conf/config/main.cf (the ClusterService SG should have
node03)
If all of the above settings are right, then dump the in-memory VCS configuration to main.cf :
# haconf -dump -makero

How to configure VCS I/O fencing :
Command line and installer utility
I/O fencing is one of the very important features of VCS and provides the user the data integrity
required. Let us now see how we can configure fencing in a VCS setup. This assumes that
you already have a VCS cluster up and running without fencing. Fencing can be configured in 2
ways : by using the installer script and by using the command line.
Before configuring the disks for fencing you can run a test to confirm whether they are SCSI3-
PGR compatible disks. The vxfentsthdw script guides you through testing all the disks to be
configured in the fencing DG. You can also directly give the fencing DG name to test the entire
DG.
# vxfentsthdw
# vxfentsthdw -c vxfencoorddg

Steps to configure I/O fencing
##### Using the installer script ######
1. Initialize disks for I/O fencing
The minimum number of disks required to configure I/O fencing is 3. Also, the number of fencing
disks should always be an odd number. We'll be using 3 disks of around 500 MB each as we have a
2 node cluster. Initialize the disks to be used for the fencing disk group. We can also test whether
the disks are SCSI-3 PGR compatible by using the vxfentsthdw command on the fencing DG.
# vxdisk -eo alldgs list
# vxdisksetup -i disk01
# vxdisksetup -i disk02
# vxdisksetup -i disk03
2. Run the installvcs script from the install media with fencing option
# cd /cdrom/VRTS/install
# ./installvcs -fencing
Cluster information verification: Cluster Name: geekdiary
Cluster ID Number: 3
Systems: node01 node02
Would you like to configure I/O fencing on the cluster? [y,n,q] y
3. Select disk based fencing
We will be doing disk based fencing rather than server based fencing, also called CP
(coordination point) client based fencing.
Fencing configuration
1) Configure CP client based fencing 2) Configure disk based fencing
3) Configure fencing in disabled mode
Select the fencing mechanism to be configured in this Application Cluster:[1-
3,q] 2
4. Create new disk group
You can create a new disk group or use an existing disk group for fencing. We will be using a
new fencing DG, which is the preferred way of doing it.
Since you have selected to configure Disk based fencing, you would be asked
to either give the Disk group to be used as co-ordinator or be asked to create
a disk group, and the mechanism to be used.
Select one of the options below for fencing disk group: 1) Create a new disk
group
2) Using an existing disk group
3) Back to previous menu
Press the choice for a disk group: [1-2,b,q] 1
5. Select disks to be used for the fencing DG
Select the disks which we initialized in step 1 to create our new disk group.
List of available disks to create a new disk group 1)
2) disk01
3) disk02
4) disk03
...
b) Back to previous menu
Select an odd number of disks and at least three disks to form a disk group.
Enter the disk options, separated by spaces:
[1-4,b,q] 1 2 3
6. Enter the fencing disk group name, fendg
enter the new disk group name: [b] fendg
7. Select the fencing mechanism : raw/dmp(dynamic multipathing)
Enter fencing mechanism name (raw/dmp): [b,q,?] dmp
8. Confirm configuration and warnings
I/O fencing configuration verification Disk Group: fendg
Fencing mechanism: dmp
Is this information correct? [y,n,q] (y) y
Installer will stop VCS before applying fencing configuration. To make sure
VCS shuts down successfully, unfreeze any frozen service groups in the
cluster.
Are you ready to stop VCS on all nodes at this time? [y,n,q] (n) y
##### Using Command line ######
1. Initialize disks for I/O fencing
The first step is the same as in the above method. We'll initialize 3 disks of 500 MB each. On one node :
# vxdisk -eo alldgs list
# vxdisksetup -i disk01
# vxdisksetup -i disk02
# vxdisksetup -i disk03
2. Create the fencing disk group fendg
# vxdg -o coordinator=on init fendg disk01
# vxdg -g fendg adddisk disk02
# vxdg -g fendg adddisk disk03
3. Create the /etc/vxfendg file
# vxdg deport fendg
# vxdg -t import fendg
# vxdg deport fendg
# echo "fendg" > /etc/vxfendg (on both nodes)
4. Enabling fencing
# haconf -dump -makero
# hastop -all
# /etc/init.d/vxfen stop
# vi /etc/VRTSvcs/conf/config/main.cf ( add SCSI3 entry )
cluster geekdiary (
UserNames = { admin = "ass76asishmHajsh9S." }
Administrators = { admin }
HacliUserLevel = COMMANDROOT
CounterInterval = 5
UseFence = SCSI3
)
# hacf -verify /etc/VRTSvcs/conf/config
# cp /etc/vxfen.d/vxfenmode_scsi3_dmp /etc/vxfenmode (if you are using dmp)
5. Start fencing
# /etc/init.d/vxfen start
# /opt/VRTS/bin/hastart
Testing the fencing configuration
1. Check status of fencing
# vxfenadm -d
Fencing Protocol Version: 201
Fencing Mode: SCSI3
Fencing Mechanism: dmp
Cluster Members:
* 0 (node01)
1 (node02)
RSM State Information
node 0 in state 8 (running)
node 1 in state 8 (running)
2. Check GAB port b status
# gabconfig -a
GAB Port Memberships
==================================
Port a gen 24ec03 membership 01
Port b gen 24ec06 membership 01
Port h gen 24ec09 membership 01
3. Check for configuration files
# grep SCSI3 /etc/VRTSvcs/conf/config/main.cf
UseFence = SCSI3
# cat /etc/vxfenmode
...
vxfen_mode=scsi3
...
scsi3_disk_policy=dmp
# cat /etc/vxfendg
fendg
cat /etc/vxfentab
...
/dev/vx/rdmp/emc_dsc01
/dev/vx/rdmp/emc_dsc02
/dev/vx/rdmp/emc_dsc03
4. Check for SCSI reservation keys on all the coordinator disks
In my case I have 2 nodes and 2 paths per disk, so I should be able to see 4 keys per disk (1 key
for each path from each node) in the output of the below command.
# vxfenadm -s all -f /etc/vxfentab
Reading SCSI Registration Keys...
Device Name: /dev/vx/rdmp/emc_dsc01 Total Number Of Keys: 4
key[0]:
[Numeric Format]: 32,74,92,78,21,28,12,65
[Character Format]: VF000701
* [Node Format]: Cluster ID: 5 Node ID: 1 Node Name: node02
key[1]:
[Numeric Format]: 32,74,92,78,21,28,12,65
[Character Format]: VF000701
* [Node Format]: Cluster ID: 5 Node ID: 1 Node Name: node02
key[2]:
[Numeric Format]: 32,74,92,78,21,28,12,66
[Character Format]: VF000700
* [Node Format]: Cluster ID: 5 Node ID: 0 Node Name: node01
key[3]:
[Numeric Format]: 32,74,92,78,21,28,12,66
[Character Format]: VF000700
* [Node Format]: Cluster ID: 5 Node ID: 0 Node Name: node01

How to add or remove LLT links in VCS
cluster
It's sometimes a requirement to add a new high or low priority link to the existing LLT links.
This can be done online without affecting any of the cluster services. I will be adding a new low
priority link to the existing high priority links in this post. The steps to remove an LLT link are
exactly the same.
1. Backup the llttab file and edit it to have the entry for low priority LLT link.
# cp /etc/llttab /etc/llttab.bak
# vi /etc/llttab
set-node node02
set-cluster 3
link nxge0 /dev/qfe:0 - ether - -
link nxge1 /dev/qfe:1 - ether - -
link-lowpri e1000g0 /dev/e1000g:0 - ether - -
2. Ensure the cluster configuration is read-only and stop VCS, leaving applications running on
the nodes.
# haconf -dump -makero
# hastop -all -force (on any one node)
# gabconfig -a
3. Stop fencing on each node
# /sbin/vxfen-shutdown
# vxfenadm -d
# gabconfig -a
4. Unconfigure GAB and LLT on each system
# gabconfig -U
# gabconfig -a
# lltconfig -U
# lltconfig
5. Now start LLT and GAB on each node
# lltconfig -c
# lltconfig
# sh /etc/gabtab
# gabconfig -a
6. Start fencing on each node
# /sbin/vxfen-startup
# vxfenadm -d
# gabconfig -a
7. Now start VCS on each node and verify if everything is running fine
# hastart
# hastatus -sum
Now you can check lltstat to see the newly added low priority link. Note that GAB has to start
and some link traffic has to be generated in order to see the new low priority link in the lltstat output.
# lltstat -nvv
