Important Files
/etc/cluster/ccr (directory)
/etc/cluster/ccr/infrastructure
Global Services
One node is the primary for a specific global service. All other nodes communicate with the global services (devices, filesystems)
via the cluster interconnect.
Global Devices
provide global access to devices irrespective of their physical location.
Most commonly SDS/SVM/VxVM devices are used as global devices. The volume manager (LVM) software itself is
unaware of the global nature implemented on top of these devices.
/global/.devices/node@nodeID
nodeID is an integer representing the node in the cluster
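Global devices are addressed through cluster-wide device IDs (DIDs); the mappings between DIDs and local disk paths can be listed with scdidadm. A sketch (the commands require a running Sun Cluster node; verify options against scdidadm(1M) on your release):

```shell
# List every DID instance and its local device path on each node
/usr/cluster/bin/scdidadm -L

# List only the mappings known to the local node
/usr/cluster/bin/scdidadm -l
```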
Global Filesystems
# mount -o global,logging /dev/vx/dsk/nfsdg/vol01 /global/nfs
or edit the /etc/vfstab file to contain the following:
/dev/vx/dsk/nfsdg/vol01 /dev/vx/rdsk/nfsdg/vol01 /global/nfs ufs 2 yes global,logging
Global Filesystem is also known as (aka) Cluster Filesystem (CFS) or PxFS (Proxy File system)
Note
Local failover filesystems (i.e. directly attached to a storage device) cannot be used for scalable services;
global filesystems must be used instead.
Console Software
SUNWccon
There are three variants of the cluster console software:
cconsole (access the node consoles through the TC or other remote console access method)
crlogin (uses rlogin as underlying transport)
ctelnet (uses telnet as underlying transport)
/opt/SUNWcluster/bin/ &
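Each of the three tools takes the cluster name as an argument. A usage sketch (the cluster name sc-cluster is hypothetical):

```shell
# Open one console window per node, plus a common input window
/opt/SUNWcluster/bin/cconsole sc-cluster &
```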
Cluster status
Reporting the cluster membership and quorum vote information
# /usr/cluster/bin/scstat -q
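scstat accepts other selectors besides -q; a few commonly used ones (verify against scstat(1M) on your release):

```shell
/usr/cluster/bin/scstat -q    # quorum votes and membership
/usr/cluster/bin/scstat -D    # device group status
/usr/cluster/bin/scstat -g    # resource group status
/usr/cluster/bin/scstat -W    # cluster transport (interconnect) paths
/usr/cluster/bin/scstat -n    # node status
```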
Cluster Daemons
lahirdx@aescib1:/home/../lahirdx > ps -ef|grep cluster|grep -v grep
root 4 0 0 May 07 ? 352:39 cluster
root 111 1 0 May 07 ? 0:00 /usr/cluster/lib/sc/qd_userd
root 120 1 0 May 07 ? 0:00 /usr/cluster/lib/sc/failfastd
root 123 1 0 May 07 ? 0:00 /usr/cluster/lib/sc/clexecd
root 124 123 0 May 07 ? 0:00 /usr/cluster/lib/sc/clexecd
root 1183 1 0 May 07 ? 46:45 /usr/cluster/lib/sc/rgmd
root 1154 1 0 May 07 ? 0:07 /usr/cluster/lib/sc/rpc.fed
root 1125 1 0 May 07 ? 23:49 /usr/cluster/lib/sc/sparcv9/rpc.pmfd
root 1153 1 0 May 07 ? 0:03 /usr/cluster/lib/sc/cl_eventd
root 1152 1 0 May 07 ? 0:04 /usr/cluster/lib/sc/cl_eventlogd
root 1336 1 0 May 07 ? 2:17 /var/cluster/spm/bin/scguieventd -d
root 1174 1 0 May 07 ? 0:03 /usr/cluster/bin/pnmd
root 1330 1 0 May 07 ? 0:01 /usr/cluster/lib/sc/scdpmd
root 1339 1 0 May 07 ? 0:00 /usr/cluster/lib/sc/cl_ccrad
FF Panic rule — failfast will shut down the node (panic the kernel) if the specified daemon is not restarted within
30 seconds.
cluster — System process created by the kernel to encapsulate the kernel threads that make up the core kernel
operations. It directly panics the kernel if it is sent a KILL signal (SIGKILL); other signals have no effect.
clexecd — This is used by cluster kernel threads to execute userland commands (such as the run_reserve and dofsck
commands). It is also used to run cluster commands remotely (e.g. scshutdown). A failfast driver panics the kernel if this
daemon is killed and not restarted within 30 seconds.
cl_eventd — This daemon registers and forwards cluster events (e.g. nodes entering and leaving the cluster). As of
SC 3.1 10/03, user applications can register themselves to receive cluster events. The daemon is automatically
respawned by rpc.pmfd if it is killed.
rgmd — This is the resource group manager, which manages the state of all cluster-unaware applications. A failfast driver
panics the kernel if this daemon is killed and not restarted within 30 seconds.
rpc.fed — This is the "fork-and-exec" daemon, which handles requests from rgmd to spawn methods for specific data
services. Failfast panics the node if this daemon is killed and not restarted within 30 seconds.
scguieventd — This daemon processes cluster events for the SunPlex or Sun Cluster Manager GUI, so that the display
can be updated in real time. It is not automatically restarted if it stops. If you are having trouble with SunPlex or Sun
Cluster Manager, you might have to restart the daemon or reboot the specific node.
rpc.pmfd — This is the process monitoring facility. It is used as a general mechanism to initiate restarts and failure
action scripts for some cluster framework daemons, and for most application daemons and application fault monitors.
The FF panic rule applies.
pnmd — This is the public network management daemon, which manages network status information received from the local IPMP
daemon (in.mpathd) running on each node in the cluster. It is automatically restarted by rpc.pmfd if it dies.
scdpmd — The multi-threaded disk path monitoring (DPM) daemon runs on each node and is started by an rc script when a node
boots. It monitors the availability of logical paths visible through the various multipath drivers (MPxIO, HDLM,
PowerPath, etc.). It is automatically restarted by rpc.pmfd if it dies.
Sun Cluster does not work with Veritas DMP. DMP can be disabled before installing the software by putting
dummy symlinks in place.
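The dummy-symlink trick is roughly the following (a sketch from pre-scvxinstall procedures; verify against the release notes for your VxVM and Sun Cluster versions before using):

```shell
# Replace the DMP device trees with symlinks to the plain disk
# device directories so vxdmp is effectively bypassed (sketch only)
rm -rf /dev/vx/dmp /dev/vx/rdmp
ln -s /dev/dsk /dev/vx/dmp
ln -s /dev/rdsk /dev/vx/rdmp
```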
scvxinstall is a shell script that automates VxVM installation in a Sun Clustered environment
scvxinstall automates the following things:
tries to disable DMP (vxdmp)
installs correct cluster package
automatically negotiates a vxio major number and properly edits /etc/name_to_major
automates rootdg initialization process and encapsulates boot disk
gives different device names for the /global/.devices/node@# volumes on each side
edits the vfstab entry for this same volume properly. The problem is that the original line contains a DID device,
and VxVM does not understand DID devices.
installs a script to "reminor" the rootdg on reboot
reboots the node so that VxVM operates properly
Maintenance mode
scswitch -m -D <devicegroup>
All volumes in the device group must be unopened and unmounted (not in use) before it can be placed in maintenance mode.
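For example, with the device group nfsdg from the earlier vfstab example, maintenance mode can be entered and left like this (the node name is illustrative):

```shell
# Put the device group into maintenance mode
scswitch -m -D nfsdg

# Bring it back online, mastered by node1
scswitch -z -D nfsdg -h node1
```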
Replacing a failed disk in a A5200 Array (similar concept with other FC disk arrays)
vxdisk list #get the failed disk name
vxprint -g dgname #determine state of the volume(s) that might be affected
On the hosting node, replace the failed disk:
luxadm remove enclosure,position
luxadm insert enclosure,position
On either node of the cluster (that hosts the device group):
scdidadm -l c#t#d#
scdidadm -R d#
On the hosting node:
vxdctl enable
vxdiskadm #replace failed disk in vxvm
vxprint -g dgname #verify volume states
vxtask list #ensure that resyncing is completed
Remove any relocated submirrors/plexes (if hot-relocation had to move something out of the way):
vxunreloc repaired-diskname
Replica management
Add local replicas manually.
Put local state db replicas on slice 7 of disks (as a convention) in order to maintain uniformity. Shared disksets
must have their replicas on slice 7.
Spread local replicas evenly across disks and controllers.
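A minimal sketch of creating local replicas with metadb, following the slice-7 convention above (disk names are illustrative):

```shell
# The first replicas require -f; put two copies (-c 2) on slice 7
# of disks on different controllers
metadb -a -f -c 2 c0t0d0s7
metadb -a -c 2 c1t0d0s7

# Verify replica placement and status
metadb -i
```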
Support for shared disksets is provided by the SUNWmdm package.
Modifying /kernel/drv/md.conf
nmd == max number of volumes (default 128)
md_nsets == max number of disksets (default 4, maximum 32)
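For example, a raised-limits entry in /kernel/drv/md.conf looks like the following (values are illustrative; a reconfiguration reboot, boot -r, is needed for the change to take effect):

```
# /kernel/drv/md.conf (fragment)
name="md" parent="pseudo" nmd=256 md_nsets=8;
```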