FAQ/GeneralQuestions - Cluster Wiki
http://sources.redhat.com/cluster/wiki/FAQ/GeneralQuestions?action=print (printed 3/18/2011)
General Questions
- What is Cluster Project?
- What is Red Hat Cluster Suite (RHCS)?
- What's the history of the cluster project?
- What does the Cluster Project encompass?
- How do you configure a cluster?
- Why do my changes to cluster.conf file keep disappearing?
- What's the "right way" to propagate the cluster.conf file to a running cluster?
- What are all the possible options that can go into cluster.conf file?
- Are there any examples of cluster.conf I can look at?
- Which kernel does this code run on?
- Where is the source code?
- Is the Cluster Project code in CVS production ready?
- Does the cluster project run on OS foo?
- Where is the project page?
- What hardware do I need to run one of these "clusters"?
- What is the largest cluster of this type in existence?
- Do I really need shared storage?
- Are there any manuals or other documentation I can reference?
- What ports do I have to enable for the iptables firewall?
- Are there any public source tarballs that compile against specific kernels?
- What are the differences between the RHEL4 and RHEL5 versions?
- Can I have a mixed cluster with some RHEL4 and some RHEL5 nodes?
- Can I use cluster suite from xen or vmware?
- When I reboot a xen dom, I get cluster errors and it gets fenced. What's going on and how do I fix it?
- My cluster.conf failed to validate but I don't understand the error. What should I do?
What is Cluster Project?

The Cluster Project is a set of components designed to enable clustering: a group of computers sharing resources, such as shared storage devices and services. Clustering ensures data integrity when people are working on shared devices from multiple machines (or virtual machines) at the same time.
What is Red Hat Cluster Suite (RHCS)?

Red Hat Cluster Suite is a marketing term under which some of this software is promoted. Red Hat has bundled components from the cluster project together and made them available for its various platforms.
What's the history of the cluster project?

Somewhere around 1996, Red Hat developed its first Cluster Suite, which primarily managed cluster-cooperative services. That's roughly the equivalent of today's rgmanager.

In 1997, Sistina Software was spun off from a project at the University of Minnesota. Between then and 2003, it developed a clustering file system that became the Global File System (GFS), which it sold to customers.

In 2004, Red Hat, Inc. bought Sistina, merged GFS into its Cluster Suite, and open-sourced the whole thing.

Today, the cluster project belongs to the people and is available for free to the public through Red Hat's CVS repository. The open-source community continues to improve and develop the cluster project with new clustering technology and infrastructure, such as OpenAIS.
What does the Cluster Project encompass?

That depends on what version you are using. Like all active technology, it is constantly evolving, and the Cluster Project involves development in many different areas.
How do you configure a cluster?

Assuming you have all the necessary pieces and/or RPMs in place, there are four ways to configure a cluster:

- Manually edit /etc/cluster/cluster.conf and propagate it to all nodes.
- Use the system-config-cluster GUI tool (RHEL4) for cluster configuration maintenance.
- Use the Cluster Suite Deployment GUI tool (cs-deploy-tool) (RHEL4) for initial cluster setup.
- Use the Conga web-based cluster configuration tool (RHEL5 and FC6).
Why do my changes to cluster.conf file keep disappearing?

The cluster configuration system (ccs) manages the cluster.conf file and keeps all the nodes in sync. If you make changes to the cluster.conf file, you have to tell ccs and cman that you did, so they can update the other nodes. If you don't, your changes are likely to be overwritten with an older version of the cluster.conf file from a different node. See the next question.
What's the "right way" to propagate the cluster.conf file to a running cluster?

The cluster configuration GUIs take care of propagating changes to cluster.conf for you; the system-config-cluster GUI has a big button that says "Send to Cluster". If you're maintaining your cluster.conf file by hand and want to propagate it to the rest of the cluster, do this:

1. Edit /etc/cluster/cluster.conf using the editor of your choice, raising the config_version value near the top of the file.
2. Tell ccs about the change: ccs_tool update /etc/cluster/cluster.conf
3. Find out what version your cluster.conf file currently is from cman's perspective: cman_tool status | grep "Config version". It should come back with something like this: Config version: 37
4. Tell cman the new version number, for example: cman_tool version -r 38
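If you script the manual method, the bookkeeping step that trips people up is forgetting to raise config_version before running ccs_tool update. Purely as an illustration (bump_config_version is our own helper name; real propagation still goes through ccs_tool and cman_tool), the version bump might be sketched like this:

```shell
# Sketch only: read the config_version attribute from a cluster.conf copy,
# increment it in place, and print the new value. Assumes the attribute
# appears as config_version="N" on the <cluster> tag, as in the examples.
bump_config_version() {
    conf=$1
    cur=$(sed -n 's/.*config_version="\([0-9]*\)".*/\1/p' "$conf" | head -n1)
    new=$((cur + 1))
    sed -i "s/config_version=\"$cur\"/config_version=\"$new\"/" "$conf"
    echo "$new"
}
# Typical use, before ccs_tool update /etc/cluster/cluster.conf:
# bump_config_version /etc/cluster/cluster.conf
```

This only edits the local file; you still run ccs_tool update and cman_tool version afterwards as described above.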
What are all the possible options that can go into cluster.conf file?
A list of options can be found at the following link. I won't guarantee it's complete or comprehensive,
but it's pretty close:
http://sources.redhat.com/cluster/doc/cluster_schema.html
Are there any examples of cluster.conf I can look at?

Take a look at the man page for cluster.conf(5). There's also a small example in the usage.txt file: cluster/doc/usage.txt
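As a further illustration only (the node names, fence device, addresses and credentials here are invented, not taken from the documents above), a minimal two-node cluster.conf has roughly this shape:

```xml
<?xml version="1.0"?>
<cluster name="example" config_version="1">
  <clusternodes>
    <clusternode name="node1.example.com" nodeid="1">
      <fence>
        <method name="single">
          <device name="apc1" port="1"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="node2.example.com" nodeid="2">
      <fence>
        <method name="single">
          <device name="apc1" port="2"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <fencedevices>
    <fencedevice name="apc1" agent="fence_apc" ipaddr="10.0.0.1"
                 login="admin" passwd="secret"/>
  </fencedevices>
  <!-- two_node/expected_votes let a two-node cluster reach quorum -->
  <cman two_node="1" expected_votes="1"/>
</cluster>
```

Treat it as a skeleton rather than a known-good configuration; validate anything you write against the schema linked above.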
Which kernel does this code run on?

The GFS 6.0 cluster code runs on the 2.4.xx series kernels (for Red Hat Enterprise Linux 3). The GFS 6.1 code runs on the 2.6.xx series kernels for Red Hat Enterprise Linux 4, Fedora Core and other distributions.
Where is the source code?

The source code for the current development tree is kept in a git repository on sources.redhat.com. You can view it through a web browser by following this link:

http://sourceware.org/git/?p=cluster.git;a=tree
You can check the entire source code tree out from git with this command:
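The checkout command itself did not survive in this copy of the page. Assuming the anonymous git URL that sources.redhat.com used at the time (our reconstruction; check the ClusterGit page mentioned below for the authoritative address), it might be sketched as:

```shell
# Sketch only: clone_cluster is our own helper name, and the git:// URL is an
# assumption reconstructed from the hosting, not copied from this page.
clone_cluster() {
    repo=${1:-git://sources.redhat.com/git/cluster.git}
    # Clones into ./cluster; pass a different URL (or a local path) as $1.
    git clone "$repo" cluster
}
# Typical use (network access to the repository required):
# clone_cluster
```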
You may first need to install git or git-core (e.g. yum -y install git). If you plan to make changes to the source, you may want to configure your git environment first, for example setting user.name and user.email with "git config --global".
For more information about using the cluster git repository, see ClusterGit.
The git repository contains all the source code for all the branches. Initially, the branch shown is the
"master" branch, which is development (not stable) code. To get a list of branches, you can use this
command:
git branch -a
If you want to switch to a different branch, use the "git checkout" command. For example, "git checkout STABLE2" switches to the STABLE2 branch (which should work with the current upstream kernel from kernel.org).
The source code for the GFS2 file system is in a different git repository because it's now part of the upstream kernel; you can pull the most recent source for it from the kernel git repositories at kernel.org.
The source code for openais is currently kept in a subversion repository; see the openais project's pages for the repository address and checkout instructions.
Does the cluster project run on OS foo?

The cluster project was primarily designed to run on Linux.
What hardware do I need to run one of these "clusters"?

It depends on which components you need to use. For a basic cluster, all you need is two or more computers and a network between them. If you want to use GFS, you'll need shared storage.
What is the largest cluster of this type in existence?

The current architecture for general-purpose high-availability clusters is known to scale to 32 nodes. At present we recommend a maximum of 16 nodes, as configurations of 16 nodes or fewer are the most stable.

For GFS and GFS2 filesystems, the current limit is also 16 nodes.
Do I really need shared storage?

It depends on what you're planning to do. The point of using GFS and CLVM is that you have storage you want to share between machines concurrently. Without shared storage, you have a local filesystem and lvm2, neither of which needs the cluster infrastructure. If you only want to use the cluster infrastructure for High Availability services, you don't need shared storage.
Are there any public source tarballs that compile against specific kernels?
Yes. From time to time, we build the STABLE branch against different kernels and release the
tarballs. You'll find them here: ftp://sources.redhat.com/pub/cluster/releases/
What are the differences between the RHEL4 and RHEL5 versions?
The cluster software isn't specific to any Linux distribution or release. However, many users run the software on Red Hat Enterprise Linux (RHEL). Most customers are currently on RHEL4 (or the equivalent CentOS release, or at least the RHEL4 branch of the source tree in CVS), so they may want to know the differences between the way things work now in RHEL4 and how they'll work in RHEL5.
This list is by no means complete, but these are the differences I know about offhand:
- There's a new web-based cluster configuration tool for RHEL4.5 and RHEL5 called "Conga".
- cluster.conf in RHEL5 has a new "nodeid=X" requirement for each node (X is the node number).
- The ccsd, cman and fenced init scripts in RHEL4 were combined into a single init script for RHEL5: service cman start. However, since users can use a cluster without clvmd, gfs or rgmanager, those remain separate init scripts.
- RHEL5 has a new "locking_type = 3" in /etc/lvm/lvm.conf, which lets the logical volume manager figure out the appropriate locking for clustered and non-clustered volumes.
- RHEL5 no longer has the lock_gulm locking protocol; users are encouraged to use dlm locking.
- The new GFS2 file system will be available as a "tech preview" (not production ready) in RHEL5.
- See the question "What improvements will GFS2 have over GFS(1)?" on the GFS FAQ page.
A lot of code has been moved from the kernel to userland for better system integrity and easier debugging, and new group daemons run: groupd, gfs_controld and dlm_controld. The new cluster infrastructure is built on top of openais clustering technology, which uses multicast rather than broadcast packets for better efficiency. DLM and GFS2 are now part of the base kernel; they were accepted into the 2.6.18 upstream kernel by kernel.org. There are also lots of little improvements.
Can I have a mixed cluster with some RHEL4 and some RHEL5 nodes?
It's definitely not a good idea to mix the two within a single cluster. With the introduction of RHEL5,
there are now two distinct and separate cluster infrastructures. The older (RHEL4 or STABLE branch
in CVS) infrastructure passes cluster messages using a kernel module (cman or the one internal to
gulm). The newer infrastructure (RHEL5 or HEAD branch in CVS) passes cluster messages using
openais and userland daemons. If you try to mix and match the two, it will not work.
That said, you could probably still fetch the STABLE branch of the cluster code from CVS, compile it
on a RHEL5 system, and have it interact properly in a RHEL4 cluster through the old infrastructure.
Since the STABLE branch tracks the upstream kernel, you may also need to build a newer kernel
from source code as well on the RHEL5 system.
It would be extremely difficult, if not impossible, to go the other way around (i.e. to get the new
infrastructure and openais running on a RHEL4 system so it could interact with a RHEL5 cluster).
Can I use cluster suite from xen or vmware?

Yes you can. For example, you could have a single computer, running Xen virtualization, act as a complete cluster consisting of several xen guests. There are special fencing issues to consider. For example, if you use power fencing, one guest could cause the whole machine to be powered off and never come back (because it wouldn't be alive to tell the power switch to power back on). There is a special fencing agent designed to reboot xen guests as needed.

You can also create clusters made of several computers, each of which hosts several virtual xen guest nodes. This has other fencing complications. For example, a xen guest can't use a simple xen fencing agent to reboot a xen guest that's running on a different physical computer.
When I reboot a xen dom, I get cluster errors and it gets fenced. What's going on and how do I fix it?

As I understand it, the problem is that xen nodes tear down and rebuild the ethernet NIC after cluster suite has started. We're working on a more permanent solution. In the meantime, here is a workaround:

Edit the file /etc/xen/xend-config.sxp and locate the line that reads:

(network-script network-bridge)

Change it so that xend no longer manages the bridge itself:

(network-script /bin/true)

Then set up the bridge manually through the normal network scripts. In /etc/sysconfig/network-scripts/ifcfg-eth0:

DEVICE=eth0
ONBOOT=yes
BRIDGE=xenbr0
HWADDR=XX:XX:XX:XX:XX:XX

And in /etc/sysconfig/network-scripts/ifcfg-xenbr0 (the addresses shown are examples; use your own):

DEVICE=xenbr0
ONBOOT=yes
BOOTPROTO=static
IPADDR=10.0.0.116
NETMASK=255.255.255.0
GATEWAY=10.0.0.254
TYPE=Bridge
DELAY=0
My cluster.conf failed to validate but I don't understand the error. What should I do?
As it turns out, the RelaxNG validator in libxml2, while efficient, has a side effect of sometimes producing rather odd errors. We are looking into ways to make this output more user-friendly. Until this is resolved, the easiest thing to do is use Jing, a stand-alone RelaxNG validator written in Java. For a concrete example and error comparison, see bugzilla #531489.
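Purely as a sketch (validate_conf is our own wrapper name, and the schema path varies by release; the cluster.rng schema ships with the cluster packages), a stand-alone validation run might look like:

```shell
# Sketch only: validate a cluster.conf against its RelaxNG schema outside
# of ccsd, preferring Jing's clearer messages when it is installed.
validate_conf() {
    schema=$1   # e.g. the cluster.rng shipped with your cluster packages
    conf=$2     # e.g. /etc/cluster/cluster.conf
    if command -v jing >/dev/null 2>&1; then
        jing "$schema" "$conf"
    else
        xmllint --noout --relaxng "$schema" "$conf"
    fi
}
# Typical use:
# validate_conf /path/to/cluster.rng /etc/cluster/cluster.conf
```

Either tool exits nonzero on an invalid file, so the function also works in scripts.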