
If you remove a disk from the system using rmdev -dl hdiskX without having first reduced
the volume group to take the disk out of LVM, and thus without properly updating the on-disk
configuration information (the VGDA), you end up with a discrepancy between the ODM and LVM
configurations. Here is how to solve the issue (without any warranty though!).
First, let's look at the volume group information:
# lsvg -p rootvg
rootvg:
PV_NAME           PV STATE   TOTAL PPs  FREE PPs  FREE DISTRIBUTION
hdisk0            active     2157       1019      174..00..00..413..432
0516-304 : Unable to find device id 00ce4b6a01292201 in the Device Configuration Database.
00ce4b6a01292201  missing    2157       1019      174..71..00..342..432
# lspv
hdisk0          00ce4b6ade6da849    rootvg     active
hdisk2          00ce4b6a01b09b83    drakevg    active
hdisk3          00ce4b6afd175206    drakevg    active
# lsdev -Cc disk
hdisk0 Available Virtual SCSI Disk Drive
hdisk2 Available Virtual SCSI Disk Drive
hdisk3 Available Virtual SCSI Disk Drive
As we can see, the disk is still in the LVM configuration but no longer shows up among the
devices. To solve this, we need to cheat the ODM so that we can use LVM commands to change
the LVM configuration stored on the volume group's disks. The idea is to reinsert a disk into the
ODM configuration, remove the disk from LVM, and then remove it from the ODM. Here is how we
do it. First, let's make a copy of the ODM files that we will change:
# cd /etc/objrepos/
# cp CuAt CuAt.before_cheat
# cp CuDv CuDv.before_cheat
# cp CuPath CuPath.before_cheat
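If you want something slightly more systematic than three cp commands, a small helper can snapshot and restore the files in one go. This is a sketch; on a real AIX box ODMDIR is /etc/objrepos, and the scratch-directory dry run at the bottom exists only so the logic can be exercised safely off-AIX:

```shell
#!/bin/sh
# Snapshot and restore the customized ODM class files before editing them.
# On AIX, ODMDIR is /etc/objrepos; here it can be overridden for a dry run.
ODMDIR="${ODMDIR:-/etc/objrepos}"
SUFFIX="before_cheat"

odm_backup() {
    for class in CuAt CuDv CuPath; do
        cp "$ODMDIR/$class" "$ODMDIR/$class.$SUFFIX" || return 1
    done
}

odm_restore() {
    for class in CuAt CuDv CuPath; do
        cp "$ODMDIR/$class.$SUFFIX" "$ODMDIR/$class" || return 1
    done
}

# Dry run against a scratch directory instead of the live ODM:
ODMDIR=$(mktemp -d)
for class in CuAt CuDv CuPath; do echo "data-$class" > "$ODMDIR/$class"; done
odm_backup
echo "garbage" > "$ODMDIR/CuAt"   # simulate a botched edit
odm_restore
cat "$ODMDIR/CuAt"                # original content is back
```

The point of keeping backup and restore symmetrical is that, whatever goes wrong with the odmadd/odmdelete steps later, the pre-cheat state is one function call away.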
Now, we will extract the hdisk0's definition from ODM and add it back as hdisk1's definition:
# odmget -q "name=hdisk0" CuAt
CuAt:
        name = "hdisk0"
        attribute = "unique_id"
        value = "3520200946033223609SYMMETRIX03EMCfcp05VDASD03AIXvscsi"
        type = "R"
        generic = ""
        rep = "n"
        nls_index = 0

CuAt:
        name = "hdisk0"
        attribute = "pvid"
        value = "00ce4b6ade6da8490000000000000000"
        type = "R"
        generic = "D"
        rep = "s"
        nls_index = 11

# odmget -q "name=hdisk0" CuDv
CuDv:
        name = "hdisk0"
        status = 1
        chgstatus = 2
        ddins = "scsidisk"
        location = ""
        parent = "vscsi0"
        connwhere = "810000000000"
        PdDvLn = "disk/vscsi/vdisk"

# odmget -q "name=hdisk0" CuPath
CuPath:
        name = "hdisk0"
        parent = "vscsi0"
        connection = "810000000000"
        alias = ""
        path_status = 1
        path_id = 0

Basically, we need to insert new entries in the three classes CuAt, CuDv and CuPath with hdisk0
changed to hdisk1. A few other attributes need to be changed as well. The most important one is the
PVID, located in CuAt: we will use the value reported as missing by lsvg -p rootvg. The unique_id
attribute also needs to change; you can simply alter a few characters in the existing string, as it
just needs to be unique on the system. The other attributes to change are connwhere in CuDv and
connection in CuPath. Their values represent the LUN ID of the disk. Again, the exact value is not
relevant; it just has to be unique. We can check the LUN IDs currently in use by running lscfg on
all the defined disks:
# lscfg -vl hdisk*
  hdisk0           U9117.570.65E4B6A-V6-C2-T1-L810000000000  Virtual SCSI Disk Drive
  hdisk2           U9117.570.65E4B6A-V6-C3-T1-L810000000000  Virtual SCSI Disk Drive
  hdisk3           U9117.570.65E4B6A-V6-C3-T1-L820000000000  Virtual SCSI Disk Drive

LUN 81 is used on controller C2, and LUNs 81 and 82 on C3. Let's choose 85, which is sure not to
collide with any other device. The following commands generate the text files that will be used to
cheat the ODM, according to what was just explained:
# mkdir /tmp/cheat
# cd /tmp/cheat
# odmget -q "name=hdisk0" CuAt | sed -e 's/hdisk0/hdisk1/g' \
    -e 's/00ce4b6ade6da849/00ce4b6a01292201/' \
    -e 's/609SYMMETRIX/719SYMMETRIX/' > hdisk1.CuAt
# odmget -q "name=hdisk0" CuDv | sed -e 's/hdisk0/hdisk1/' \
    -e 's/810000000000/850000000000/' > hdisk1.CuDv
# odmget -q "name=hdisk0" CuPath | sed -e 's/hdisk0/hdisk1/' \
    -e 's/810000000000/850000000000/' > hdisk1.CuPath

Let's look at the generated files:


# cat hdisk1.CuAt
CuAt:
        name = "hdisk1"
        attribute = "unique_id"
        value = "3520200946033223719SYMMETRIX03EMCfcp05VDASD03AIXvscsi"
        type = "R"
        generic = ""
        rep = "n"
        nls_index = 0

CuAt:
        name = "hdisk1"
        attribute = "pvid"
        value = "00ce4b6a012922010000000000000000"
        type = "R"
        generic = "D"
        rep = "s"
        nls_index = 11

# cat hdisk1.CuDv
CuDv:
        name = "hdisk1"
        status = 1
        chgstatus = 2
        ddins = "scsidisk"
        location = ""
        parent = "vscsi0"
        connwhere = "850000000000"
        PdDvLn = "disk/vscsi/vdisk"

# cat hdisk1.CuPath
CuPath:
        name = "hdisk1"
        parent = "vscsi0"
        connection = "850000000000"
        alias = ""
        path_status = 1
        path_id = 0
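Before feeding anything to odmadd, the substitutions themselves can be sanity-checked on a plain copy of the stanza. The sketch below replays the sed edits from this example on a here-document instead of live odmget output (only the pvid stanza is reproduced, so only the hdisk and PVID substitutions are exercised):

```shell
#!/bin/sh
# Replay the CuAt rewrite on a captured stanza rather than live odmget output.
tmp=$(mktemp -d)
cat > "$tmp/hdisk0.CuAt" <<'EOF'
CuAt:
        name = "hdisk0"
        attribute = "pvid"
        value = "00ce4b6ade6da8490000000000000000"
EOF

# Same expressions as in the article; the unique_id edit is a no-op here
# since the sample stanza only covers the pvid entry.
sed -e 's/hdisk0/hdisk1/g' \
    -e 's/00ce4b6ade6da849/00ce4b6a01292201/' \
    -e 's/609SYMMETRIX/719SYMMETRIX/' \
    "$tmp/hdisk0.CuAt" > "$tmp/hdisk1.CuAt"

cat "$tmp/hdisk1.CuAt"
```

If any trace of hdisk0 or of the old PVID survives in the generated file, odmadd would happily insert a broken entry, so it is worth the thirty seconds of checking.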

So, we are ready to insert the data in the ODM:


# odmadd hdisk1.CuAt
# odmadd hdisk1.CuDv
# odmadd hdisk1.CuPath
# lsvg -p rootvg
rootvg:
PV_NAME           PV STATE   TOTAL PPs  FREE PPs  FREE DISTRIBUTION
hdisk0            active     2157       1019      174..00..00..413..432
hdisk1            missing    2157       1019      174..71..00..342..432
The disk is now back in ODM! Now, to remove the disk from the VGDA, we use the reducevg
command:
# reducevg rootvg hdisk1
0516-016 ldeletepv: Cannot delete physical volume with allocated
partitions. Use either migratepv to move the partitions or
reducevg with the -d option to delete the partitions.
0516-884 reducevg: Unable to remove physical volume hdisk1.
We will use the -d flag to remove the physical partitions associated with each logical volume
located on hdisk1. A few lines have been removed to simplify the listing...
# reducevg -d rootvg hdisk1
0516-914 rmlv: Warning, all data belonging to logical volume
lv01 on physical volume hdisk1 will be destroyed.
rmlv: Do you wish to continue? y(es) n(o)? y
0516-304 putlvodm: Unable to find device id 00ce4b6a012922010000000000000000 in the
Device Configuration Database.
0516-896 reducevg: Warning, cannot remove physical volume hdisk1 from
Device Configuration Database.
# lsvg -l rootvg
LV NAME   TYPE     LPs  PPs  PVs  LV STATE       MOUNT POINT
hd5       boot     2    2    1    closed/syncd   N/A
hd6       paging   256  256  1    open/syncd     N/A
hd8       jfs2log  1    1    1    open/syncd     N/A
hd4       jfs2     7    7    1    open/syncd     /
hd2       jfs2     384  384  1    open/syncd     /usr
hd9var    jfs2     64   64   1    open/syncd     /var
hd3       jfs2     128  128  1    open/syncd     /tmp
hd1       jfs2     2    2    1    open/syncd     /home
hd10opt   jfs2     32   32   1    open/syncd     /opt
fslv04    jfs2     256  256  1    open/syncd     /usr/sys/inst.images
loglv01   jfslog   1    1    1    closed/syncd   N/A
lv01      jfs      5    5    1    closed/syncd   /mkcd/cd_images
# lsvg -p rootvg
rootvg:
PV_NAME           PV STATE   TOTAL PPs  FREE PPs  FREE DISTRIBUTION
hdisk0            active     2157       1019      174..00..00..413..432
The disk has been deleted from the VGDA. What about ODM?
# lsdev -Cc disk
hdisk0 Available Virtual SCSI Disk Drive
hdisk1 Available Virtual SCSI Disk Drive
hdisk2 Available Virtual SCSI Disk Drive
hdisk3 Available Virtual SCSI Disk Drive
# rmdev -dl hdisk1
Method error (/etc/methods/ucfgdevice):
0514-043 Error getting or assigning a minor number.
We probably forgot to cheat one ODM class... Never mind: let's remove the cheat we added to the
ODM and see what happens:
# odmdelete -o CuAt -q "name=hdisk1"
2 objects deleted
# lspv
hdisk0          00ce4b6ade6da849    rootvg     active
hdisk2          00ce4b6a01b09b83    drakevg    active
hdisk1          none                None
hdisk3          00ce4b6afd175206    drakevg    active
# rmdev -dl hdisk1
Method error (/etc/methods/ucfgdevice):
0514-043 Error getting or assigning a minor number.
# odmdelete -o CuDv -q "name=hdisk1"
1 objects deleted
# lspv
hdisk0          00ce4b6ade6da849    rootvg     active
hdisk2          00ce4b6a01b09b83    drakevg    active
hdisk3          00ce4b6afd175206    drakevg    active
# lspath
Enabled hdisk0 vscsi0
Enabled hdisk2 vscsi0
Enabled hdisk2 vscsi1
Enabled hdisk3 vscsi1
Enabled hdisk3 vscsi0
Unknown hdisk1 vscsi0
# odmdelete -o CuPath -q "name=hdisk1"
1 objects deleted
# lspath
Enabled hdisk0 vscsi0
Enabled hdisk2 vscsi0
Enabled hdisk2 vscsi1
Enabled hdisk3 vscsi1
Enabled hdisk3 vscsi0

That's it! Use with care.


Side note: this entry was originally contributed by Patrice Lachance, who first wrote about this
subject.

Comments
1.
There is a much simpler way to do this. (Under normal circumstances you would not be able to
delete a disk using rmdev: the command rmdev -dl hdiskX will be refused for an active volume
group. More likely, the physical disk has been physically removed or destroyed.)
How you end up with a PVMISSING disk is irrelevant; getting the system repaired is relevant!
So, here is the simpler way to correct the volume group VGDA and the AIX ODM.
A situation like this is more common:
CASE: While the volume group is offline, maintenance is performed on the disks. One disk
is/was damaged beyond repair, or was replaced during the process. Now, back in AIX, the volumes
are to be reactivated.
root@aix530:[/]lsvg -p vgExport
0516-010 : Volume group must be varied on; use varyonvg command.
root@aix530:[/]varyonvg vgExport
PV Status: hdisk1 00c39b8d69c45344 PVACTIVE
hdisk2 00c39b8d043427b6 PVMISSING
Here there is a PVMISSING disk. In this case, the old hdisk2 was physically destroyed. All the
data on it is lost, but the AIX ODM and the VGDAs on all other disks in the volume group do not
know this yet. First, document what is lost:
root@aix530:[/]lsvg -l vgExport
vgExport:
LV NAME TYPE LPs PPs PVs LV STATE MOUNT POINT
lvExport jfs2 416 416 1 closed/syncd /export
lvTest jfs 32 32 1 closed/syncd /scratch
loglv00 jfslog 1 1 1 closed/syncd N/A
Examine each logical volume for partitions that may be lost from hdisk2. Even a single physical
partition on hdisk2 implies that the filesystem is corrupt!
root@aix530:[/]lslv -m lvExport | grep hdisk2 | tail -1
(no output: lvExport has nothing on hdisk2)
root@aix530:[/]lslv -m lvTest | grep hdisk2 | tail -1
0032 0083 hdisk2
root@aix530:[/]lslv -m loglv00 | grep hdisk2 | tail -1
0001 0084 hdisk2
So, with this information I know that any data in /scratch is suspect and should be restored from
a backup.
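The lslv -m checks above generalize to a one-liner: each map line is "LP PP PV", so counting the lines that mention the missing disk tells you how exposed a logical volume is. The sketch below runs that grep/count logic against sample map lines modeled on the output above (the map values are illustrative, not from a real system):

```shell
#!/bin/sh
# Count physical partitions a logical volume has on a given disk,
# reading lslv -m style map lines ("LP PP PV") from stdin.
count_pps_on_disk() {
    disk="$1"
    grep -c "$disk"
}

# Sample map modeled on the lvTest listing above (illustrative values).
map='0001 0052 hdisk1
0030 0081 hdisk2
0031 0082 hdisk2
0032 0083 hdisk2'

echo "$map" | count_pps_on_disk hdisk2   # prints 3
```

A count of zero means the logical volume is untouched by the disk loss; anything above zero means the filesystem on it should be treated as corrupt, exactly as the comment argues for /scratch.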
To prepare for removing the MISSING disk, I first collect the VGID and PVID with lqueryvg:
root@aix530:[/]lqueryvg -p hdisk1 -vPt
Physical: 00c39b8d69c45344 2 0
00c39b8d043427b6 1 0
VGid: 00c39b8d00004c000000011169c45a4b
root@aix530:[/]umount /scratch
umount: 0506-347 Cannot find anything to unmount.
root@aix530:[/]rmfs /scratch
rmfs: 0506-936 Cannot read superblock on /dev/lvTest.
rmfs: 0506-936 Cannot read superblock on /scratch.
rmfs: Unable to clear superblock on /scratch
rmlv: Logical volume lvTest is removed.
root@aix530:[/]rmlv loglv00
Warning, all data contained on logical volume loglv00 will be destroyed.
rmlv: Do you wish to continue? y(es) n(o)? y
rmlv: Logical volume loglv00 is removed.
root@aix530:[/]lsvg -p vgExport
vgExport:
PV_NAME PV STATE TOTAL PPs FREE PPs FREE DISTRIBUTION
hdisk1 active 511 95 00..00..00..00..95
hdisk2 missing 255 222 51..18..51..51..51
root@aix530:[/]lsvg -l vgExport
vgExport:
LV NAME TYPE LPs PPs PVs LV STATE MOUNT POINT
lvExport jfs2 416 416 1 open/syncd /export
root@aix530:[/]ldeletepv -g 00c39b8d00004c000000011169c45a4b -p 00c39b8d043427b6
Note: there is no output for the above command when all proceeds accordingly.
Now the regular AIX commands to verify VGDA and ODM are in order.
root@aix530:[/]lsvg -p vgExport
vgExport:
PV_NAME PV STATE TOTAL PPs FREE PPs FREE DISTRIBUTION
hdisk1 active 511 95 00..00..00..00..95
root@aix530:[/]lsvg -l vgExport

vgExport:
LV NAME TYPE LPs PPs PVs LV STATE MOUNT POINT
lvExport jfs2 416 416 1 open/syncd /export

Summary:
This is much less error prone than using ODM commands and has been available in AIX for disk
management since at least 1995 (when AIX 4 first came out. It may have been in AIX 3 as well,
taking it back to 1991-1992, but I never did any system administration on AIX 3 to know for sure.
Important commands to review:
lslv (-m)
lqueryvg
ldeletepv
2.
First, you are right in your assumption: I have never followed an AIX administration course
(advanced or not), nor any other UNIX or operating system courses.
Secondly, thank you for your great input. This is pretty interesting, and the proposed follow-up
fits its purpose very well. So, assuming I am sure there is nothing stored on the faulted disk
(since the missing state is just wrong information in the LVM), I understand I can forcibly clean
things up using only the `ldeletepv' command.
3.
I have cleaned up the entry a bit, and found an easier way to locate logical partitions that might
still be in the VGDA and/or ODM.
http://rootvg.net/content/view/174/...

4.
My apologies for not answering your direct question. If there was anything on the MISSING disk,
the ldeletepv command will not remove the disk. You will need to use rmfs (or rmlv if it is not a
file system) to remove the information about the logical volumes from the rest of the VGDAs. Once
there are no other references to data on the missing disk, the disk's entry in the VGDA can be
removed as well.
In your example, assuming that it was a mirror of rootvg that was lost (so only a mirror is
missing), you could first execute:
unmirrorvg rootvg hdisk1
ldeletepv -g VGID -p PVID
and your system would be satisfied.