Voting disks are used in a RAC configuration to maintain node membership. They are critical pieces of a cluster configuration. Starting with Oracle 10gR2, it is possible to mirror the OCR and the voting disks. Using the default mirroring template, the minimum number of voting disks necessary for normal functioning is two.
Scenario Setup
This scenario simulates the crash of one voting disk using the following steps:
Identify the voting disks:
crsctl query css votedisk
0. 0 /dev/raw/raw1
1. 0 /dev/raw/raw2
2. 0 /dev/raw/raw3
Corrupt one of the voting disks (as root):
dd if=/dev/zero of=/dev/raw/raw3 bs=1M
Recoverability Steps
Check the $CRS_HOME/log/[hostname]/alert[hostname].log file. A message like the following should be written there, which allows us to determine which voting disk became corrupted:
[cssd(9120)]CRS-1604:CSSD voting file is offline: /opt/oracle/product/10.2.0/crs_1/Voting1. Details in /opt/oracle/product/10.2.0/crs_1/log/aut-arz-ractest1/cssd/ocssd.log.
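The lookup in the alert log can be scripted. The following is only a minimal sketch: the function name offline_voting_files is hypothetical, and it assumes the CRS-1604 message has exactly the layout shown in the sample above (path followed by ". Details in").

```shell
# Hypothetical helper: read alert log text on stdin and print the path of
# every voting file reported offline by a CRS-1604 message.
# Assumes the message layout shown in the sample log line above.
offline_voting_files() {
    sed -n 's/.*CRS-1604:CSSD voting file is offline: \([^ ]*\)\. Details in.*/\1/p' |
        sort -u
}
```

It could be called as offline_voting_files < "$CRS_HOME/log/$(hostname)/alert$(hostname).log", assuming the directory name matches the output of hostname on that node.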
According to the CRS-1604 message above, Voting1 is the corrupted disk. Shut down the CRS stack:
srvctl stop database -d fitstest -o immediate
srvctl stop asm -n aut-vie-ractest1
srvctl stop asm -n aut-arz-ractest1
srvctl stop nodeapps -n aut-vie-ractest1
srvctl stop nodeapps -n aut-arz-ractest1
crs_stat -t
On every node as root:
crsctl stop crs
Pick a good voting disk from the remaining ones and copy it over the corrupted one:
dd if=/dev/raw/raw1 of=/dev/raw/raw3 bs=1M
Start CRS (on every node as root):
crsctl start crs
Note: It is also possible to recover a lost voting disk from an old voting disk backup, and to perform the dd command without shutting down the CRS stack.
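Since the note relies on having an old backup, here is a hedged sketch of how such backups might be prepared with dd. The function name, the backup directory and the .bak naming are all assumptions for illustration; the function only prints the commands so they can be reviewed before being run as root.

```shell
# Sketch: print (do not run) one dd backup command per voting disk.
# $1 is a backup directory (assumption); the remaining arguments are
# voting disk paths as listed by "crsctl query css votedisk".
print_votedisk_backup_cmds() {
    backup_dir=$1
    shift
    for disk in "$@"; do
        # e.g. /dev/raw/raw1 is saved as <backup_dir>/raw1.bak
        printf 'dd if=%s of=%s/%s.bak bs=1M\n' \
            "$disk" "$backup_dir" "$(basename "$disk")"
    done
}
```

For example, print_votedisk_backup_cmds /backup /dev/raw/raw1 /dev/raw/raw2 prints two dd lines; restoring is the same command with if= and of= swapped.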
[...] heartbeat the membership nodes as a second way to check. This bypasses an interconnect failure and relies on the shared nature of the voting disk. Basically, every node operates using a so-called "membership bitmap". Every node provides membership information as that node presumes it to be correct. This information is written every 3 seconds by the CKPT process of every instance into the database control file. The mastering instance of the cluster gathers all votes and decides whether there is a "split brain" issue or not. For example, in a 3-node RAC the "membership votes" may be: n1 => 101; n2 => 010; n3 => 101. Counting the votes reveals a score of 2 - 1. The second node has a different image of the cluster and will simply be evicted from the RAC configuration.
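The vote counting in that example can be illustrated with a toy script. This is only a sketch of the majority idea described above, not Clusterware's actual algorithm; the function name evict_candidates and the node=bitmap argument format are made up for the illustration, and ties are not handled.

```shell
# Toy model: find the membership bitmap reported by the most nodes and
# print the nodes whose bitmap differs (the eviction candidates).
# Arguments are hypothetical node=bitmap pairs, e.g. n1=101.
evict_candidates() {
    # Count how many nodes report each bitmap and keep the most common one.
    majority=$(for pair in "$@"; do printf '%s\n' "${pair#*=}"; done |
        sort | uniq -c | sort -rn | awk 'NR==1 {print $2}')
    # Print every node whose view disagrees with the majority view.
    for pair in "$@"; do
        [ "${pair#*=}" = "$majority" ] || printf '%s\n' "${pair%%=*}"
    done
}
```

With the votes from the example, evict_candidates n1=101 n2=010 n3=101 prints n2, matching the 2 - 1 score.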
1:50 AM
Gas said...
Hi, I have one question:
I was installing a 3-node RAC (10.2.0) and didn't configure redundancy for the voting disk (it's just a test environment), but RAC never comes up. At the moment I'm experiencing the issue in this article, but I don't have backups or other voting disks. What can I do?
Regards!
3:42 PM
Alexandru Tica said...
Hi Gas,
Have a look at Metalink Note 399482.1.
I haven't tried this, but according to the above note: "If there are multiple voting disks and one was accidentally deleted, then check if there are any backups of this voting disk. If there are no backups then we can add one using the crsctl add votedisk command. The complete steps are in the Oracle Database Oracle Clusterware and Oracle Real Application Clusters Administration and Deployment Guide".
11:52 PM
Gas said...
I already followed that note, but when I run root.sh on the first node it fails saying "vote disk is offline". I checked the log and was faced with a large number of things that I don't understand...
Is it an option to "erase" the voting disk and re-run root.sh?
What do you think?
8:36 AM
Anonymous said...
Thank you so much for the blog.
I had a question. I have three cluster file systems which host the voting and OCR disks. One of them went offline due to a storage issue, and the FS got unmounted.
The cluster continued to run because we had the other two file systems, which had voting and OCR disks.
The storage admins brought the FS back, and when I run crsctl query css votedisk it shows online, and ocrcheck completes successfully; even the logical corruption check comes out fine.
Can I leave this as it is, or do I have to delete and re-add the voting and OCR disks?
No changes happened while that file system was offline.
I'd appreciate your comments.
12:02 AM