Sun™ Cluster 3.1 8/05 Software Differences
Introduction
This document introduces the new features of Sun™ Cluster 3.1 8/05 (Sun
Cluster 3.1 Update 4) software and describes the differences between this
version and Sun Cluster 3.1 9/04 (Sun Cluster 3.1 Update 3) software.
Contents
This document contains:
Product Description
Differences in Detail
    Solaris™ 10 Operating System Support
    Solaris 10 OS Zones and the Solaris OS Container Agent
    Support for Network-Attached Storage – NetApp Filer
    Cluster Installation: Sun Java™ Enterprise System (Java ES) Installer Standardization
    New scinstall Features: “Easy Install”
    Shared Physical LAN Interconnects
    Infiniband as Sun Cluster Transport (Solaris 10 OS)
    New Data Service Agents
    VERITAS Foundation Suite 4.1 (VxVM 4.1 and VxFS 4.1)
Product Documentation
Further Training
Product Description
Sun Cluster 3.1 Update 4 software is the first Sun Cluster version to
support the Solaris™ 10 Operating System (Solaris OS). Therefore, the
most important feature of Sun Cluster 3.1 Update 4 software is Solaris 10
OS co-existence. Sun Cluster 3.1 Update 4 software makes use of the new
Solaris Service Management Facility (SMF) in Solaris 10 OS to manage the
cluster framework daemons and cluster services.
Sun Cluster 3.1 Update 4, while not yet providing full integration of the
Solaris 10 OS zones feature, does support an agent which makes it possible
to automate booting zones and failing over of zones between cluster
nodes. The zones themselves are ordinary Solaris 10 OS zones and do not
have any knowledge of the cluster.
Sun Cluster 3.1 Update 4 has a number of other noteworthy features. The
software includes a new general framework for support of network-
attached storage as the only cluster data storage. The current
implementation provides specific support only for the Network
Appliance (NetApp) Filer product. However, the fact that this feature is
designed using an extensible plug-in architecture will simplify support for
other network-attached storage in the future.
A feature that allows both private transport network traffic and public
network traffic to travel over the same physical network adapter, using a
technology called tagged VLANs, will enable future support for clustering
blade or blade-like hardware that can support only two physical network
adapters.
Note – Official support for Infiniband may post-date the first customer
ship of Sun Cluster 3.1 8/05. This is just a support and testing issue. The
Infiniband feature is already integrated into the cluster product.
Differences in Detail
The following sections provide details on the new features of Sun Cluster
3.1 Update 4 software.
Solaris™ 10 Operating System Support

On Solaris 8 and 9 OS, the Sun Cluster 3.1 Update 4 framework daemons
are launched from boot scripts. On Solaris 10 OS, they are registered as
SMF services. While SMF continues to support legacy boot scripts (SMF
essentially emulates the earlier /sbin/rc2 and /sbin/rc3 functionality if
such boot scripts still exist), the proper way to run services in Solaris 10
OS is to register them as SMF objects. Therefore, the decision was made to
run the cluster framework daemons in the so-called ‘correct’ Solaris 10 OS
fashion. The exception is the Network Time Protocol (NTP) configuration
that uses the cluster private node names. The reason for this is that the
standard NTP is already registered as an SMF service. If you manually
configure and enable the standard Solaris 10 OS NTP SMF service (to use
clock synchronization off an external source, for example) before you run
scinstall, then the cluster legacy NTP script will not be installed.
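For example, a minimal way to configure and enable the standard Solaris 10 OS NTP service before running scinstall might look like the following sketch (the ntp.conf contents are site-specific and purely illustrative):

# cp /etc/inet/ntp.client /etc/inet/ntp.conf
# vi /etc/inet/ntp.conf      (point the server entries at your external time source)
# svcadm enable svc:/network/ntp:default
# svcs network/ntp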
Note that the service is automatically enabled and that the listed
dependencies are merely startup dependencies. This ensures that the
dependees (pnm, rpc-fed, and rpc-pmf) are started before rgm is started.
You do not need to specify actual restart dependencies, because the
daemons are protected by failfasts as they have been in previous updates
of the Sun Cluster 3.1 software.
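For reference, a startup dependency in such a manifest might be expressed along the following lines (the FMRI shown is an illustrative assumption, not necessarily the exact name shipped with the product):

<dependency name='rpc-fed' grouping='require_all' restart_on='none' type='service'>
    <service_fmri value='svc:/system/cluster/rpc-fed:default'/>
</dependency>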
Finally, the service definition points to the actual scripts used to start and
stop the cluster daemon, as follows:
<exec_method name='start' type='method'
exec='/usr/cluster/lib/svc/method/svc_rgm start'
timeout_seconds='18446744073709551615'>
<method_context/>
</exec_method>
<exec_method name='stop' type='method'
exec='/usr/cluster/lib/svc/method/svc_rgm stop'
timeout_seconds='18446744073709551615'>
<method_context/>
</exec_method>
Actual cluster applications being started by data service agents are not
likely to be placed under control of SMF because this would make it
difficult for SMF to make failover applications behave properly. While
SMF can support dependencies, SMF does not have any particular
knowledge about which node in a cluster is likely to be among the correct
primaries for a cluster application.
Even when a data service agent references software whose native Solaris
OS version is already registered in SMF, it is likely to ignore the SMF
versions of the services. The Network File System (NFS) agent, for
example, ignores the SMF-registered service by keeping it disabled even
when NFS is registered and enabled as a cluster service.
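For example, on a Solaris 10 OS node where NFS is under cluster control, you might observe something similar to the following (output abbreviated; the state shown is illustrative):

# svcs network/nfs/server
STATE          STIME    FMRI
disabled       16:02:11 svc:/network/nfs/server:default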
Solaris 10 OS Zones and the Solaris OS Container Agent

Sun Cluster 3.1 Update 4 does not provide full zone integration. A full
zone integration feature would allow you to essentially have clustered
Solaris 10 OS zones, with cluster framework software installed in the
zones, and cluster configuration and agents working together inside the
zones to provide resource failover and scalability in zones. There could be
clustered relationships between zones running on different physical
machines or even between different local zones running on the same
physical machine.
What Sun Cluster 3.1 Update 4 does provide is an agent that allows you to
boot and control cluster-unaware local zones that live inside physical
machines (global zones) which are, of course, running the cluster
framework.
The official name of the agent is Sun Cluster Data Service for Solaris
Containers. A Solaris Container is just a zone that is managed with the
Solaris Resource Manager (SRM). The agent actually does not provide any
automated management of SRM.
If you choose to configure only whole root zones, which are initialized
with an entire copy of the global zone’s OS, then you could use the
/etc/system entry unmodified.
The Sun Cluster 3.1 zones agent for Solaris 10 OS provides two different
models for booting and controlling zones.
The sczsh and sczsmf resource flavors are not required, and it is
perfectly legal to just configure your zone manually to run all the boot
scripts and SMF services that you like. However, by using the zone agent
resources you gain the following benefits:
● You can have a fault-monitoring component. You can provide a
custom fault probe for your resource, and its exit code can indicate a
desire to have the entire resource group fail over (exit code 201) or
restart (other non-zero exit code).
● You can have dependencies, even on resources that live in other
resource groups that can be online on a different node.
Setting up the resource group first is an easy way to control the failover
storage. Assume you have a file system created on the shared storage,
ready to serve as the zonepath for the new zone. If you are using a volume
manager, the storage needs to be in a device group dedicated to this zone:
pecan:/# df -k
.
.
/dev/md/zoneds/dsk/d100 5160782 98161 5011014 2% /fro-zone
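One possible way to prepare such a resource group and its storage resource before configuring the zone is sketched below (the resource name fro-zone-hasp-rs is an assumption; the group and mount point match the example above):

pecan:/# scrgadm -a -g fro-zone-rg -h pecan,grape
pecan:/# scrgadm -a -t SUNW.HAStoragePlus
pecan:/# scrgadm -a -j fro-zone-hasp-rs -g fro-zone-rg -t SUNW.HAStoragePlus \
-x FilesystemMountPoints=/fro-zone
pecan:/# scswitch -Z -g fro-zone-rg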
The zone is configured and installed from one node only. The only
configuration options given are the zonepath and an IP address controlled
by the zone (the IP address is not required; it is included just to show an
example).
You must not set the autoboot parameter for the zone to true. The default
is false, which is what you want, and which is why it is not mentioned at
all in the following example:
pecan:/# zonecfg -z fro-zone
fro-zone: No such zone configured
Use 'create' to begin configuring a new zone.
zonecfg:fro-zone> create
zonecfg:fro-zone> set zonepath=/fro-zone
zonecfg:fro-zone> add net
zonecfg:fro-zone:net> set address=192.168.1.189
zonecfg:fro-zone:net> set physical=qfe2
zonecfg:fro-zone:net> end
zonecfg:fro-zone> commit
zonecfg:fro-zone> exit
Boot the zone and connect to its console. You will be prompted to
configure the zone (it looks just like a standard Solaris OS that is booting
after a sys-unconfig):
pecan:/etc/zones# zoneadm -z fro-zone boot
pecan:/etc/zones# zlogin -C fro-zone
[Connected to zone 'fro-zone' console]
Log in. You might want to do any other configuration of the zone at this
time, such as commenting out the CONSOLE=/dev/console line in
/etc/default/login.
Note that in the example the zone already has the IP address that you
configured with zonecfg:
fro-zone console login: root
Password:
May 12 16:15:16 fro-zone login: ROOT LOGIN /dev/console
Last login: Thu May 12 10:34:13 from 192.168.1.39
Sun Microsystems Inc. SunOS 5.10 Generic January 2005
# ifconfig -a
lo0:1: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu
8232 index 1 inet 127.0.0.1 netmask ff000000
qfe2:1: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index
3 inet 192.168.1.189 netmask ffffff00 broadcast 192.168.1.255
Make sure you can boot the zone on each intended failover node:
pecan# zoneadm -z fro-zone halt
pecan# scswitch -z -g fro-zone-rg -h grape
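Then repeat the verification on the second node, for example:

grape:/# zoneadm -z fro-zone boot
grape:/# zlogin -C fro-zone      (verify that the zone comes up, then disconnect)
grape:/# zoneadm -z fro-zone halt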
There is a parameter file that needs to persist and is read every time the
resource is stopped, started, or validated. Note that in the sczbt_config
file you specify a directory name for the parameter file. This directory could
be part of a global file system, or it could be a local directory available on
each node. If you happen to have a global file system available, it is most
convenient to place the parameter file directory there. If not, you will have
to manually copy the parameter file that is created by sczbt_register to
other nodes.
#
# The following variable will be placed in the parameter file
#
# Parameters for sczbt (Zone Boot)
#
Zonename=fro-zone
Zonebootopt=
Milestone=multi-user-server
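Registration of the zone boot resource itself is performed with the sczbt_register script provided by the agent. If the parameter directory is not on a global file system, copy the resulting parameter file to the other nodes, as in the following sketch (the parameter file name is an assumption):

pecan:/# ./sczbt_register
pecan:/# cd /etc/zoneagentparams
pecan:/# rcp sczbt_fro-zone-zboot-rs grape:/etc/zoneagentparams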
Now enabling the zone boot resource instance will automatically boot the
zone on the node where the failover group is primary. It will also
automatically place the IP address controlled by the
SUNW.LogicalHostname resource into the zone. This is demonstrated by
accessing the zone through this logical IP address.
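Assuming the zone boot resource was registered as fro-zone-zboot-rs (an assumed name), enabling it and checking where the zone is running might look like this:

pecan:/# scswitch -e -j fro-zone-zboot-rs
pecan:/# zoneadm list -v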
Instances of each of these resource flavors must live in the same resource
group as a zone boot resource (sczbt). The configuration and registration
scripts provided by the zone agent specifically for these resources will
automatically place a restart dependency on the zone boot resource. Any
other dependencies, as mentioned above, are fully customizable.
pecan# ./sczsh_register
pecan# cd /etc/zoneagentparams
pecan# rcp sczsh_frozone-myd-rs grape:/etc/zoneagentparams
This is a fairly simplistic example. The new resource causes
/marc/mydaemon to be launched in the zone after it is booted. Since the
probe command used by the fault monitor is just /usr/bin/pgrep
mydaemon >/dev/null, which will never return the value 201, failure of
the probe will always just cause a restart of the daemon rather than
failover of the entire zone.
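The parameters behind this example might look roughly like the following excerpt from the sczsh parameter file (the variable names follow the agent's configuration template; the exact names and values shown here are assumptions):

ServiceStartCommand="/marc/mydaemon"
ServiceStopCommand="/usr/bin/pkill -x mydaemon"
ServiceProbeCommand="/usr/bin/pgrep mydaemon > /dev/null"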
In order to provide a script or SMF resource whose failure could cause the
whole resource group, and hence the whole zone, to fail over, you would
have to do the following:
1. Provide a fault probe that could return the result 201 in order to
suggest an entire zone failover.
2. Modify properties of the resource to allow it to fail over as follows:
pecan:/# scrgadm -pvv -j frozone-myd-rs |grep -i 'failover.*value'
(frozone-rg:frozone-myd-rs:Failover_mode) Res property value: NONE
(frozone-rg:frozone-myd-rs:Failover_enabled) Res property value:
FALSE
pecan:/# scrgadm -c -j frozone-myd-rs -y Failover_mode=SOFT
pecan:/# scrgadm -c -j frozone-myd-rs -x Failover_enabled=TRUE
The new value for Failover_mode will cause resource group failover if
the script resource fails to start. The Failover_enabled property is an
extension property of the SUNW.gds resource type and will enable the
SUNW.gds fault monitor to request an entire resource group failover when
the fault probe returns value 201.
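A minimal fault probe implementing step 1 might look like the following sketch (the daemon name is carried over from the earlier example; the escalation policy is entirely up to you):

#!/bin/sh
# Hypothetical probe: exit 201 to request failover of the whole resource
# group (and therefore the zone); any other non-zero exit requests a
# local restart of the resource.
/usr/bin/pgrep mydaemon > /dev/null
if [ $? -ne 0 ]; then
    exit 201
fi
exit 0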
Support for Network-Attached Storage – NetApp Filer

A NAS device provides file services to all of the cluster nodes through the
Network File System (NFS) or any other network file-sharing protocol.
File services can run between the cluster nodes and the network storage
server on a dedicated subnet or, less likely, on the same public subnet
providing access to the clients of the cluster services. In other words, file
traffic is supported on any network except those that make up the cluster
interconnect.
There are some specific requirements for using the NetApp Filer as a Sun
Cluster storage device. On the file server device itself:
● The filer must be a NetApp clustered filer. There is nothing that the
Sun Cluster software can do to enforce this, but it makes sense to
require high availability for a device which is going to provide file
services to your high-availability cluster.
You must use the scnas command to register the NAS device into the Sun
Cluster Configuration Repository (CCR). This is true whether you want to
use the NAS storage device as a quorum device or not. The important
information being recorded is the NAS identity (IP address), login name,
and password which are used for failure fencing.
You must also use the scnasdir command to register on the NAS device
the specific directories that are being used to serve cluster data. The Sun
Cluster client implementation is then able to perform data fencing on
these specific directories. In the NetApp Filer implementation, data
fencing is accomplished by removing the name of a node from the
NetApp Filer exports list as it is being fenced out of the cluster.
Registering a NAS device with the scnas command looks like the
following:
# scnas -a -h netapps25 -t netapp -o userid=root
Please enter password:
Registering the specific NAS directories for failure fencing looks like the
following:
# scnasdir -r -h netapps25 -d /vol/vol_01_03
# scnasdir -r -h netapps25 -d /vol/vol_01_04
You can verify the configuration of the NAS device into the CCR using the
-p option to the commands as in the following example:
# scnas -p
# scnasdir -p
The stored password is not shown. It is stored in the CCR with a very
basic encryption scheme known as ROT-13 (alphabetic characters shifted
13 places).
The NAS quorum architecture, like the rest of the NAS architecture, is a
general architecture with specific support in Sun Cluster Update 4
software only for the NetApp Filer. On the NetApp Filer side, the
requirements for operation as a Sun Cluster quorum device are as follows:
● You must install the iSCSI license from your NAS device vendor.
● You must configure an iSCSI Logical Unit (LUN) for use as the
quorum device.
● When booting the cluster, you must always boot the NAS device
before you boot the cluster nodes.
If you want to use NAS storage as the quorum device, it cannot be
configured by the new scinstall feature that automatically selects a
quorum device (see ‘‘Auto-Configuration of the Quorum Device’’). Instead,
you use new options to the scconf command, which are also available
through the scsetup utility.
The NAS quorum device must be setup before configuring it with Sun
Cluster. For more information on setting up Netapp NAS filer,
creating the device, and installing the license and the Netapp
binaries, see the Sun Cluster documentation.
What name do you want to use for this quorum device? netapps
scconf -a -q name=netapps,type=netapp_nas,filer=netapps25,lun_id=0
# scstat -q
-- Quorum Summary --
For this specific application, you should use the following mount options
in the /etc/vfstab file of the cluster nodes:
● forcedirectio
● noac
● proto=tcp
There are no other NAS-specific instructions for using NAS devices with
RAC. You simply choose filesystem storage as you build your RAC
database, and you point it to the directories where your NAS directories
are mounted on the cluster nodes.
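For example, a hypothetical /etc/vfstab entry for a NetApp volume mounted with these options might look like the following (filer name, volume, and mount point are illustrative):

netapps25:/vol/vol_rac_01 - /oradata nfs - yes forcedirectio,noac,proto=tcp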
Cluster Installation: Sun Java™ Enterprise System (Java ES) Installer Standardization

While internally the scinstall command still has the ability to perform
the Sun Cluster framework pkgadd operations, this will not be a supported
mechanism for installing the cluster. The only supported mechanism is to
use the Java ES installer.
The Java ES installer does not have the ability to actually configure the
cluster. This will still be done through scinstall. You will get an error if
you try to use the Configure Now option of the Java ES installer as you
install the Sun Cluster framework.
New scinstall Features: “Easy Install”

The interactive procedure that allows you to configure the Sun Cluster
software one node at a time now supports a Typical configuration. This
is very similar to the Typical configuration that has been supported with
the procedure that allows you to configure the entire cluster from one
node since Sun Cluster 3.1 Update 1.
For all four modes of configuring the cluster (one-at-a-time and all-at-
once, with custom and typical options for each of these), the scinstall
utility asks you if you want to disable auto-configuration of the quorum
device. The default is to use the auto-configuration mode. You must
disable the quorum auto-configuration if you want to use a NAS device as
the quorum device (see ‘‘Using a NAS Device as a Quorum Device’’) or if
your shared device (the device that will be assigned the lowest DID
number) cannot be supported as a quorum device. The dialog looks like
the following:
The only time that you must disable this feature is when ANY of the
shared storage in your cluster is not qualified for use as a Sun
Cluster quorum device. If your storage was purchased with your
cluster, it is qualified. Otherwise, check with your storage vendor
to determine whether your storage device is supported as Sun Cluster
quorum device.
In Solaris 10 OS, as the last node boots into the cluster, you get the login
prompt on that node before the quorum auto-configuration runs. This is
because the boot environment is controlled by the SMF of Solaris 10 OS,
which runs boot services in parallel and gives you the login prompt before
many of the services are complete. The auto-configuration of the quorum
device does not complete until a minute or so later. You should have time
to log in on the last node, run the scstat -q command, and notice that
you still have no quorum device and that the installmode flag is still set.
Do not attempt to configure the quorum device by
hand, as the auto-configuration will eventually run to completion.
Updating "/etc/hostname.qfe1".
Note that this is the same singleton IPMP group that would be
automatically created if you added an instance of
SUNW.LogicalHostname or SUNW.SharedAddress onto an adapter that
was not part of an IPMP group. The functionality has now been moved
up into scinstall, although it remains in the validation methods of the
virtual IP resources as well, just in case you have adapters (perhaps these
are new adapters) that are not configured with IPMP when you add the
resources.
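A hypothetical /etc/hostname.qfe1 written this way might contain a single line similar to the following (the host name and IPMP group name are assumptions; scinstall picks its own defaults):

pecan netmask + broadcast + group sc_ipmp0 up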
Shared Physical LAN Interconnects

Using the shared physical LAN interconnect feature, servers with only two
physical network adapters (such as blade or blade-like hardware) could use
each physical adapter both as a single transport adapter and a single
public network adapter. For the public network part, this adapter could be
in the same IPMP group as a separate physical adapter, so that it would be
possible for public network IP addresses to fail over between the two.
The only adapters that support the tagged VLAN device driver and that
are also supported in the Sun Cluster environment are the Cassini
Ethernet (ce) adapters and the Broadcom Gigabit Ethernet (bge) adapters.
Thus the shared physical interconnects feature is only available with those
adapters.
The network adapters that are capable of tagged VLANs in the Solaris OS
use a VLAN-related instance number. For example, if physical adapter
ce1 is configured with tagged VLAN-ID 22, it will use instance
ce22001, that is, 1000 times the VLAN-ID plus the normal instance
number.
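As a quick sanity check of the numbering, the instance can be computed as follows (a trivial illustrative sketch, not a Sun-provided tool):

# Instance for a tagged VLAN adapter: VLAN-ID * 1000 + physical instance.
DRIVER=ce; INSTANCE=1; VLAN_ID=22
echo "${DRIVER}`expr ${VLAN_ID} \* 1000 + ${INSTANCE}`"      # prints ce22001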
The network adapters supported with tagged VLANs in the Sun Cluster
environment also support a related standard which allows an adapter to
specify prioritization of network traffic, from a lowest priority of 0 to a
highest priority of 7.
gabi:/# ifconfig -a
lo0: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu
8232 index 1
inet 127.0.0.1 netmask ff000000
bge2000: flags=201000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,CoS> mtu
1500 index 2
inet 192.168.1.21 netmask ffffff00 broadcast 192.168.1.255
groupname therapy
ether 0:9:3d:0:a7:12
bge2000:1:flags=209040843<UP,BROADCAST,RUNNING,MULTICAST,DEPRECATED,IPv4,
NOFAILOVER,CoS> mtu 1500 index 2
inet 192.168.1.121 netmask ffffff00 broadcast 192.168.1.255
ce2000:flags=209040843<UP,BROADCAST,RUNNING,MULTICAST,DEPRECATED,IPv4,NOF
AILOVER,CoS> mtu 1500 index 3
inet 192.168.1.221 netmask ffffff00 broadcast 192.168.1.255
groupname therapy
ether 0:3:ba:e:bd:e0
The interactive install (all four variations, including custom, typical, the
one-at-a-time, and the all-at-once methods) detects if you choose a VLAN-
capable private network adapter from the transport adapters menu. If
VLAN is already enabled on this physical adapter for the public network,
you are required to enter a VLAN-ID for the transport; otherwise, you are
asked whether you want to use one.
The following example shows that since the public network is already
VLAN-capable, the adapters menu shows the same physical adapters
already used by the public network:
>>> Cluster Transport Adapters and Cables <<<
You must configure at least two cluster transport adapters for each
node in the cluster. These are the adapters which attach to the
private cluster interconnect.
1) bge0
2) bge1
3) ce0
4) ce1
5) Other
Option: 1
This adapter is used on the public network also, you will need to
configure it as a tagged VLAN adapter for cluster transport.
The autodiscovery method for transport adapter entry (on all but the first
node or all but the node you are driving from in the all-at-once method)
will be able to auto-detect the correct VLAN-ID to use on VLAN-capable
adapters. This is shown in the following example:
Probing .......
The scconf options that allow you to add private network adapters can
be used either with a VLAN-enabled instance number, or with a new
property named vlan_id. In other words, the following are equivalent,
valid commands:
scconf -a -A trtype=dlpi,node=phys-node-1,name=ce1001
or
scconf -a -A trtype=dlpi,node=phys-node-1,name=ce1,vlan_id=1
scconf -a -A trtype=dlpi,name=bge1,node=gabi,vlan_id=5
scconf -a -m endpoint=gabi:bge5001,endpoint=switch1
The scstat command shows the virtual adapter names, for example,
ce5000, for a private network adapter using the VLAN feature.
dani:/# scstat -W
Infiniband as Sun Cluster Transport (Solaris 10 OS)

The Sun Cluster framework will not use uDAPL for any framework
transport activity, such as heartbeats. In this respect, Infiniband will be a
purely TCP/IP transport—at this time. However, uDAPL is available for
application usage, and Sun Cluster will support an ORACLE RAC
implementation using this API, when it exists.
VERITAS Foundation Suite 4.1 (VxVM 4.1 and VxFS 4.1)

On Solaris 8 and 9 OS, Sun Cluster 3.1 Update 4 continues to support both
the 3.5 and 4.0 versions of VxVM and VxFS. However, the 4.1 version is
recommended for new installations since VERITAS has already declared
end-of-life for 3.5 and may do so soon for 4.0.
Product Documentation
Table 1 provides a list of related documentation. Refer to the
documentation for detailed product information:
Title    Part Number    Downloadable Versions of Documentation
Further Training
There will not be any additional support readiness training (SRT) for the
Sun Cluster 3.1 Update 4 software framework. Because this is an
incremental release, not enough has changed to warrant a dedicated SRT.
The purpose of this document is to provide support personnel with
enough information to support the product.