Steps for Disk Replacement in IBM ESS

In this post, we will see how to replace a failed disk in an IBM ESS server.

Suppose a disk has failed and IBM is sending an engineer to replace it.

Let us look at the steps involved.

========== Replacing a failed disk on IBM ESS Server ==========

 

1. Check which disks are not OK.

ngeess1--> mmlspdisk all --not-ok
pdisk:
replacementPriority = 3.42
name = "e2d2s32"
device = ""
recoveryGroup = "rg_ngeess2-da"
declusteredArray = "DA1"
state = "failing/replace"
internalState = 00009.1c0
capacity = 8001524072448
freeSpace = 7997229105152
fru = "00LY450"
location = "78R039C-2-32"
WWN = "naa.5000C50094E9E8A7"
server = "ngeess2-da.india.ngelinux.com"
reads = 75139318
writes = 53230540
bytesReadInGiB = 66080.249
bytesWrittenInGiB = 43387.205
IOErrors = 2
IOTimeouts = 5
mediaErrors = 0
checksumErrors = 0
pathErrors = 0
relativePerformance = 0.836
dataBadness = 0.000
rgIndex = 37
userLocation = "NGE LAB, E1 N00-15, Enclosure 00XX-0XX-OXOXOXX Drawer 2 Slot 32"
hardware = "IBM-ESXS STX000NM00XX E5 ECE4 XX19KXXX0000RXXXNWHA"
hardwareType = Rotating 7200
nPaths = 0 active 0 total
nsdFormatVersion = Unknown
paxosAreaOffset = Unknown
paxosAreaSize = Unknown
logicalBlockSize = 4096
ssdEndurancePercentage =
ngeess1-->
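
The stanza output above is easy to read by eye for a single disk, but it can also be checked by a script. The following is a minimal Python sketch and not part of the original procedure: the names `parse_pdisk_stanzas`, `needs_replacement`, and `SAMPLE` are illustrative assumptions, and the parser relies only on the `pdisk:` / `key = value` layout shown in the output above.

```python
# Sketch: parse "pdisk:" stanzas (as printed by mmlspdisk) into dictionaries.
# SAMPLE is a shortened copy of the output above, used for illustration only.
SAMPLE = """\
pdisk:
   replacementPriority = 3.42
   name = "e2d2s32"
   recoveryGroup = "rg_ngeess2-da"
   state = "failing/replace"
   fru = "00LY450"
   location = "78R039C-2-32"
   ssdEndurancePercentage =
"""


def parse_pdisk_stanzas(text):
    """Return one dict per 'pdisk:' stanza; values are kept as strings."""
    pdisks = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line == "pdisk:":
            current = {}
            pdisks.append(current)
            continue
        key, sep, value = line.partition("=")
        # Ignore anything before the first stanza and lines without "=".
        if current is not None and sep:
            current[key.strip()] = value.strip().strip('"')
    return pdisks


def needs_replacement(pdisk):
    """True if the slash-separated state flags include 'replace'."""
    return "replace" in pdisk.get("state", "").split("/")


if __name__ == "__main__":
    for disk in parse_pdisk_stanzas(SAMPLE):
        if needs_replacement(disk):
            print(disk["name"], disk["location"], disk["fru"])
```

In practice the text would come from the saved output of the command in this step rather than from an embedded sample.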

 

2. List the pdisks marked for replacement in rg_ngeess2-da.

ngeess1--> mmlspdisk rg_ngeess2-da --replace
pdisk:
replacementPriority = 3.42
name = "e2d2s32"
device = ""
recoveryGroup = "rg_ngeess2-da"
declusteredArray = "DA1"
state = "failing/replace"
internalState = 00009.1c0
capacity = 8001524072448
freeSpace = 7997229105152
fru = "00LY450"
location = "78R039C-2-32"
WWN = "naa.5000C50094E9E8A7"
server = "ngeess2-da.india.ngelinux.com"
reads = 75139318
writes = 53230540
bytesReadInGiB = 66080.249
bytesWrittenInGiB = 43387.205
IOErrors = 2
IOTimeouts = 5
mediaErrors = 0
checksumErrors = 0
pathErrors = 0
relativePerformance = 0.836
dataBadness = 0.000
rgIndex = 37
userLocation = "NGE LAB, E1 N00-15, Enclosure 00XX-0XX-OXOXOXX Drawer 2 Slot 32"
hardware = "IBM-ESXS STX000NM00XX E5 ECE4 XX19KXXX0000RXXXNWHA"
hardwareType = Rotating 7200
nPaths = 0 active 0 total
nsdFormatVersion = Unknown
paxosAreaOffset = Unknown
paxosAreaSize = Unknown
logicalBlockSize = 4096
ssdEndurancePercentage =

 

3. Check the recovery group details and its pdisks.

ngeess1--> mmlsrecoverygroup rg_ngeess2-da -L --pdisk
declustered current allowable
recovery group arrays vdisks pdisks format version format version
----------------- ----------- ------ ------ -------------- --------------
rg_ngeess2-da 3 5 86 4.2.2.0 5.0.5.1
declustered needs replace scrub background activity
array service vdisks pdisks spares threshold trim free space duration task progress priority
----------- ------- ------ ------ ------ --------- ---- ---------- -------- -------------------------
NVR no 1 2 0,0 1 no 3632 MiB 14 days scrub 62% low
DA1 yes 3 83 2,44 2 no 10 TiB 14 days scrub 45% low
SSD no 1 1 0,0 1 no 744 GiB 14 days scrub 33% low
n. active, declustered state,
pdisk total paths array free space remarks
----------------- ----------- ----------- ---------- -------
e1d1s01ssd 2, 4 SSD 744 GiB ok
e1d1s02 2, 4 DA1 220 GiB ok
e1d1s03 2, 4 DA1 216 GiB ok
e1d1s04 2, 4 DA1 220 GiB ok
e1d1s20 2, 4 DA1 216 GiB ok
e1d1s21 2, 4 DA1 220 GiB ok
e1d1s29 2, 4 DA1 220 GiB ok
e1d1s30 2, 4 DA1 220 GiB ok
e1d1s31 2, 4 DA1 220 GiB ok
e1d1s32 2, 4 DA1 220 GiB ok
e2d2s20 2, 4 DA1 216 GiB ok
e2d2s21 2, 4 DA1 216 GiB ok
e2d2s29 2, 4 DA1 216 GiB ok
e2d2s30 2, 4 DA1 216 GiB ok
e2d2s31 2, 4 DA1 216 GiB ok
e2d2s32 0, 0 DA1 7448 GiB failing/replace
e2d2s33 2, 4 DA1 216 GiB ok
e2d2s34 2, 4 DA1 216 GiB ok
e2d2s35 2, 4 DA1 216 GiB ok
n001v001 1, 1 NVR 1816 MiB ok
n002v001 1, 1 NVR 1816 MiB ok
declustered checksum
vdisk RAID code array vdisk size block size granularity state remarks
------------------ ------------------ ----------- ---------- ---------- ----------- ----- -------
rg_ngeess2_da_logtip 2WayReplication NVR 48 MiB 2 MiB 4096 ok logTip
rg_ngeess2_da_logtipbackup Unreplicated SSD 48 MiB 2 MiB 4096 ok logTipBackup
rg_ngeess2_da_loghome 4WayReplication DA1 72 GiB 2 MiB 4096 ok log
rg_ngeess2_da_Meta_512K_1 3WayReplication DA1 17 TiB 512 KiB 32 KiB ok
rg_ngeess2_da_Data_8M_1 8+2p DA1 420 TiB 8 MiB 32 KiB ok
config data declustered array spare space remarks
------------------ ------------------ ------------- -------
rebuild space DA1 47 pdisk
config data disk group fault tolerance remarks
------------------ --------------------------------- -------
rg descriptor 1 drawer + 1 pdisk limiting fault tolerance
system index 1 drawer + 1 pdisk limited by rg descriptor
vdisk disk group fault tolerance remarks
------------------ --------------------------------- -------
rg_ngeess2_da_logtip 1 pdisk
rg_ngeess2_da_logtipbackup 0 pdisk
rg_ngeess2_da_loghome 1 drawer + 1 pdisk limited by rg descriptor
rg_ngeess2_da_Meta_512K_1 1 drawer + 1 pdisk limited by rg descriptor
rg_ngeess2_da_Data_8M_1 2 pdisk
active recovery group server servers
----------------------------------------------- -------
ngeess2-da.india.ngelinux.com ngeess2-da.india.ngelinux.com,ngeess3-da.india.ngelinux.com

 

4. Verify the pdisks inside the recovery group.

ngeess1--> mmvdisk pdisk list --recovery-group rg_ngeess2-da
declustered
recovery group pdisk array paths capacity free space FRU (type) state
-------------- ------------ ----------- ----- -------- ---------- --------------- -----
rg_ngeess2-da e1d1s02 DA1 2 7452 GiB 220 GiB 00LY450 ok
rg_ngeess2-da e1d1s03 DA1 2 7452 GiB 216 GiB 00LY450 ok
rg_ngeess2-da e1d1s04 DA1 2 7452 GiB 220 GiB 00LY450 ok
rg_ngeess2-da e1d1s05 DA1 2 7452 GiB 220 GiB 00LY450 ok
rg_ngeess2-da e1d1s06 DA1 2 7452 GiB 220 GiB 00LY450 ok
rg_ngeess2-da e1d1s07 DA1 2 7452 GiB 216 GiB 00LY450 ok
rg_ngeess2-da e1d1s15 DA1 2 7452 GiB 220 GiB 00LY450 ok
rg_ngeess2-da e1d1s16 DA1 2 7452 GiB 220 GiB 00LY450 ok
rg_ngeess2-da e1d1s17 DA1 2 7452 GiB 220 GiB 00LY450 ok
rg_ngeess2-da e1d1s18 DA1 2 7452 GiB 216 GiB 00LY450 ok
rg_ngeess2-da e1d1s19 DA1 2 7452 GiB 220 GiB 00LY450 ok
rg_ngeess2-da e1d1s20 DA1 2 7452 GiB 216 GiB 00LY450 ok
rg_ngeess2-da e1d1s21 DA1 2 7452 GiB 220 GiB 00LY450 ok
rg_ngeess2-da e1d1s29 DA1 2 7452 GiB 220 GiB 00LY450 ok
rg_ngeess2-da e1d1s30 DA1 2 7452 GiB 220 GiB 00LY450 ok
rg_ngeess2-da e1d1s31 DA1 2 7452 GiB 220 GiB 00LY450 ok
rg_ngeess2-da e1d1s32 DA1 2 7452 GiB 220 GiB 00LY450 ok
rg_ngeess2-da e1d1s33 DA1 2 7452 GiB 220 GiB 00LY450 ok
rg_ngeess2-da e1d1s34 DA1 2 7452 GiB 220 GiB 00LY450 ok
rg_ngeess2-da e1d1s35 DA1 2 7452 GiB 220 GiB 00LY450 ok
rg_ngeess2-da e1d2s01 DA1 2 7452 GiB 220 GiB 00LY450 ok
rg_ngeess2-da e1d2s02 DA1 2 7452 GiB 220 GiB 00LY450 ok
rg_ngeess2-da e1d2s03 DA1 2 7452 GiB 220 GiB 00LY450 ok
rg_ngeess2-da e1d2s04 DA1 2 7452 GiB 224 GiB 00LY450 ok
rg_ngeess2-da e1d2s05 DA1 2 7452 GiB 220 GiB 00LY450 ok
rg_ngeess2-da e1d2s06 DA1 2 7452 GiB 224 GiB 00LY450 ok
rg_ngeess2-da e1d2s07 DA1 2 7452 GiB 220 GiB 00LY450 ok
rg_ngeess2-da e1d2s15 DA1 2 7452 GiB 220 GiB 00LY450 ok
rg_ngeess2-da e1d2s16 DA1 2 7452 GiB 220 GiB 00LY450 ok
rg_ngeess2-da e1d2s17 DA1 2 7452 GiB 228 GiB 00LY450 ok
rg_ngeess2-da e1d2s18 DA1 2 7452 GiB 224 GiB 00LY450 ok
rg_ngeess2-da e1d2s19 DA1 2 7452 GiB 224 GiB 00LY450 ok
rg_ngeess2-da e1d2s20 DA1 2 7452 GiB 220 GiB 00LY450 ok
rg_ngeess2-da e1d2s21 DA1 2 7452 GiB 224 GiB 00LY450 ok
rg_ngeess2-da e1d2s29 DA1 2 7452 GiB 220 GiB 00LY450 ok
rg_ngeess2-da e1d2s30 DA1 2 7452 GiB 224 GiB 00LY450 ok
rg_ngeess2-da e1d2s31 DA1 2 7452 GiB 224 GiB 00LY450 ok
rg_ngeess2-da e1d2s32 DA1 2 7452 GiB 224 GiB 00LY450 ok
rg_ngeess2-da e1d2s33 DA1 2 7452 GiB 224 GiB 00LY450 ok
rg_ngeess2-da e1d2s34 DA1 2 7452 GiB 220 GiB 00LY450 ok
rg_ngeess2-da e1d2s35 DA1 2 7452 GiB 224 GiB 00LY450 ok
rg_ngeess2-da e2d1s01 DA1 2 7452 GiB 220 GiB 00LY450 ok
rg_ngeess2-da e2d1s02 DA1 2 7452 GiB 220 GiB 00LY450 ok
rg_ngeess2-da e2d1s03 DA1 2 7452 GiB 220 GiB 00LY450 ok
rg_ngeess2-da e2d1s04 DA1 2 7452 GiB 224 GiB 00LY450 ok
rg_ngeess2-da e2d1s05 DA1 2 7452 GiB 224 GiB 00LY450 ok
rg_ngeess2-da e2d1s06 DA1 2 7452 GiB 224 GiB 00LY450 ok
rg_ngeess2-da e2d1s07 DA1 2 7452 GiB 224 GiB 00LY450 ok
rg_ngeess2-da e2d2s19 DA1 2 7452 GiB 216 GiB 00LY450 ok
rg_ngeess2-da e2d2s20 DA1 2 7452 GiB 216 GiB 00LY450 ok
rg_ngeess2-da e2d2s21 DA1 2 7452 GiB 216 GiB 00LY450 ok
rg_ngeess2-da e2d2s29 DA1 2 7452 GiB 216 GiB 00LY450 ok
rg_ngeess2-da e2d2s30 DA1 2 7452 GiB 216 GiB 00LY450 ok
rg_ngeess2-da e2d2s31 DA1 2 7452 GiB 216 GiB 00LY450 ok
rg_ngeess2-da e2d2s32 DA1 0 7452 GiB 7448 GiB 00LY450 failing/replace
rg_ngeess2-da e2d2s33 DA1 2 7452 GiB 216 GiB 00LY450 ok
rg_ngeess2-da e2d2s34 DA1 2 7452 GiB 216 GiB 00LY450 ok
rg_ngeess2-da e2d2s35 DA1 2 7452 GiB 216 GiB 00LY450 ok
rg_ngeess2-da n001v001 NVR 1 1992 MiB 1816 MiB IPR-10 68C8730 ok
rg_ngeess2-da n002v001 NVR 1 1992 MiB 1816 MiB IPR-10 68C8C10 ok
rg_ngeess2-da e1d1s01ssd SSD 2 745 GiB 744 GiB 00LY451 ok
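
With this many rows, the one pdisk that is not ok is easy to miss. As a rough illustration, the listing can be filtered with a few lines of Python. This is a sketch under assumptions: the column order is taken from the listing above, the FRU column may hold more than one token (as in the NVR rows), and the names `parse_pdisk_list`, `not_ok_pdisks`, and `LISTING` are hypothetical.

```python
# Sketch: find pdisks whose state is not "ok" in 'mmvdisk pdisk list' output.
# LISTING is a shortened copy of the output above, used for illustration only.
LISTING = """\
recovery group pdisk array paths capacity free space FRU (type) state
-------------- ------------ ----------- ----- -------- ---------- --------------- -----
rg_ngeess2-da e2d2s31 DA1 2 7452 GiB 216 GiB 00LY450 ok
rg_ngeess2-da e2d2s32 DA1 0 7452 GiB 7448 GiB 00LY450 failing/replace
rg_ngeess2-da n001v001 NVR 1 1992 MiB 1816 MiB IPR-10 68C8730 ok
"""


def parse_pdisk_list(text):
    """Parse data rows into dicts.

    Assumed columns: recovery group, pdisk, array, paths,
    capacity (2 tokens), free space (2 tokens), FRU (1+ tokens), state.
    """
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Data rows have at least 10 tokens and a numeric path count;
        # this skips prompt, header, and separator lines.
        if len(fields) < 10 or not fields[3].isdigit():
            continue
        rows.append({
            "recovery_group": fields[0],
            "pdisk": fields[1],
            "array": fields[2],
            "paths": int(fields[3]),
            "capacity": " ".join(fields[4:6]),
            "free_space": " ".join(fields[6:8]),
            "fru": " ".join(fields[8:-1]),
            "state": fields[-1],
        })
    return rows


def not_ok_pdisks(text):
    """Return (pdisk, state) pairs for every row whose state is not 'ok'."""
    return [(r["pdisk"], r["state"]) for r in parse_pdisk_list(text)
            if r["state"] != "ok"]


if __name__ == "__main__":
    for name, state in not_ok_pdisks(LISTING):
        print(name, state)
```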

 

5. Prepare the pdisk for replacement.

ngeess1--> mmvdisk pdisk replace --prepare --recovery-group rg_ngeess2-da --pdisk e2d2s32
mmvdisk: Suspending pdisk e2d2s32 of RG rg_ngeess2-da in location 78R039C-2-32.
mmvdisk: Location 78R039C-2-32 is Rack Pyramid Park, E5 U11-15, Enclosure 5147-084-78R039C Drawer 2 Slot 32.
mmvdisk: Carrier released.
mmvdisk:
mmvdisk: - Remove carrier.
mmvdisk: - Replace disk in location 78R039C-2-32 with type '00LY450'.
mmvdisk: - Reinsert carrier.
mmvdisk: - Issue the following command:
mmvdisk:
mmvdisk: mmvdisk pdisk replace --recovery-group rg_ngeess2-da --pdisk 'e2d2s32'
ngeess1-->
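
The replacement is a two-phase operation: a prepare command before the physical swap, and a second command after the new disk is inserted. A small helper can build both command lines from the recovery group and pdisk names so they are not retyped by hand. This is only a sketch: `build_replace_commands` is a hypothetical name, the flags are exactly the ones that appear in this transcript, and the code builds the commands without running them.

```python
# Sketch: build the two mmvdisk command lines used in this procedure.
# Nothing is executed here; the lists could be passed to subprocess.run
# on a node where mmvdisk is installed.
import shlex


def build_replace_commands(recovery_group, pdisk):
    """Return (prepare_cmd, complete_cmd) as argument lists."""
    base = ["mmvdisk", "pdisk", "replace"]
    target = ["--recovery-group", recovery_group, "--pdisk", pdisk]
    prepare_cmd = base + ["--prepare"] + target   # run before the swap
    complete_cmd = base + target                  # run after the swap
    return prepare_cmd, complete_cmd


if __name__ == "__main__":
    prepare, complete = build_replace_commands("rg_ngeess2-da", "e2d2s32")
    print(shlex.join(prepare))
    print(shlex.join(complete))
```

Keeping the commands as argument lists rather than one string avoids shell quoting problems if a name ever contains unusual characters.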

 

6. Now ask the IBM engineer to replace the failed disk.

 

7. After the engineer has replaced the disk, complete the replacement so that the new disk is prepared for use.

ngeess1--> mmvdisk pdisk replace --recovery-group rg_ngeess2-da --pdisk 'e2d2s32'
mmvdisk:
mmvdisk: mmchcarrier : [I] Preparing a new pdisk for use may take many minutes.
mmvdisk:
mmvdisk: 2021-08-04_12:34:01.261+0100: [I] Callback: /usr/lpp/mmfs/bin/tspreparenewpdiskforuse /dev/sdhu.
mmvdisk: Attempting to update firmware if necessary. Failure will not prevent drive replacement.
mmvdisk: Command: mmchfirmware --type drive --serial-number XX1XXXXR0000C020LXXX --new-pdisk
mmvdisk: Command: err 0: mmchfirmware --type drive --serial-number XX1XXXXR0000C020LXXX --new-pdisk
mmvdisk:
mmvdisk: The following pdisks will be formatted on node ngeess2:
mmvdisk: //ngeess2-da/dev/sdbz,//ngeess2-da/dev/sdhu,//ngeess3-da/dev/sdy,//ngeess3-da/dev/sdhu
mmvdisk: Pdisk e2d2s32 of RG rg_ngeess2-da successfully replaced.
mmvdisk: Resuming pdisk e2d2s32#0037 of RG rg_ngeess2-da.
mmvdisk: Carrier resumed.

 

8. Check the disk status.

ngeess1--> mmvdisk pdisk list --recovery-group rg_ngeess2-da
declustered
recovery group pdisk array paths capacity free space FRU (type) state
-------------- ------------ ----------- ----- -------- ---------- --------------- -----
rg_ngeess2-da e1d1s02 DA1 2 7452 GiB 220 GiB 00LY450 ok
rg_ngeess2-da e1d1s03 DA1 2 7452 GiB 216 GiB 00LY450 ok
rg_ngeess2-da e1d1s04 DA1 2 7452 GiB 220 GiB 00LY450 ok
rg_ngeess2-da e1d1s05 DA1 2 7452 GiB 220 GiB 00LY450 ok
rg_ngeess2-da e1d1s06 DA1 2 7452 GiB 220 GiB 00LY450 ok
rg_ngeess2-da e1d1s07 DA1 2 7452 GiB 216 GiB 00LY450 ok
rg_ngeess2-da e1d1s15 DA1 2 7452 GiB 220 GiB 00LY450 ok
rg_ngeess2-da e1d1s16 DA1 2 7452 GiB 220 GiB 00LY450 ok
rg_ngeess2-da e1d1s17 DA1 2 7452 GiB 220 GiB 00LY450 ok
rg_ngeess2-da e1d1s18 DA1 2 7452 GiB 216 GiB 00LY450 ok
rg_ngeess2-da e1d1s19 DA1 2 7452 GiB 220 GiB 00LY450 ok
rg_ngeess2-da e1d1s20 DA1 2 7452 GiB 216 GiB 00LY450 ok
rg_ngeess2-da e1d1s21 DA1 2 7452 GiB 220 GiB 00LY450 ok
rg_ngeess2-da e1d1s29 DA1 2 7452 GiB 220 GiB 00LY450 ok
rg_ngeess2-da e1d1s30 DA1 2 7452 GiB 220 GiB 00LY450 ok
rg_ngeess2-da e1d1s31 DA1 2 7452 GiB 220 GiB 00LY450 ok
rg_ngeess2-da e1d1s32 DA1 2 7452 GiB 220 GiB 00LY450 ok
rg_ngeess2-da e1d1s33 DA1 2 7452 GiB 220 GiB 00LY450 ok
rg_ngeess2-da e1d1s34 DA1 2 7452 GiB 220 GiB 00LY450 ok
rg_ngeess2-da e2d1s33 DA1 2 7452 GiB 224 GiB 00LY450 ok
rg_ngeess2-da e2d1s34 DA1 2 7452 GiB 224 GiB 00LY450 ok
rg_ngeess2-da e2d1s35 DA1 2 7452 GiB 224 GiB 00LY450 ok
rg_ngeess2-da e2d2s01 DA1 2 7452 GiB 216 GiB 00LY450 ok
rg_ngeess2-da e2d2s02 DA1 2 7452 GiB 216 GiB 00LY450 ok
rg_ngeess2-da e2d2s03 DA1 2 7452 GiB 216 GiB 00LY450 ok
rg_ngeess2-da e2d2s04 DA1 2 7452 GiB 216 GiB 00LY450 ok
rg_ngeess2-da e2d2s05 DA1 2 7452 GiB 220 GiB 00LY450 ok
rg_ngeess2-da e2d2s06 DA1 2 7452 GiB 216 GiB 00LY450 ok
rg_ngeess2-da e2d2s07 DA1 2 7452 GiB 216 GiB 00LY450 ok
rg_ngeess2-da e2d2s15 DA1 2 7452 GiB 216 GiB 00LY450 ok
rg_ngeess2-da e2d2s16 DA1 2 7452 GiB 216 GiB 00LY450 ok
rg_ngeess2-da e2d2s17 DA1 2 7452 GiB 216 GiB 00LY450 ok
rg_ngeess2-da e2d2s18 DA1 2 7452 GiB 220 GiB 00LY450 ok
rg_ngeess2-da e2d2s19 DA1 2 7452 GiB 216 GiB 00LY450 ok
rg_ngeess2-da e2d2s20 DA1 2 7452 GiB 216 GiB 00LY450 ok
rg_ngeess2-da e2d2s21 DA1 2 7452 GiB 216 GiB 00LY450 ok
rg_ngeess2-da e2d2s29 DA1 2 7452 GiB 216 GiB 00LY450 ok
rg_ngeess2-da e2d2s30 DA1 2 7452 GiB 216 GiB 00LY450 ok
rg_ngeess2-da e2d2s31 DA1 2 7452 GiB 216 GiB 00LY450 ok
rg_ngeess2-da e2d2s32 DA1 2 7452 GiB 7448 GiB 00LY450 ok
rg_ngeess2-da e2d2s33 DA1 2 7452 GiB 216 GiB 00LY450 ok
rg_ngeess2-da e2d2s34 DA1 2 7452 GiB 216 GiB 00LY450 ok
rg_ngeess2-da e2d2s35 DA1 2 7452 GiB 216 GiB 00LY450 ok
rg_ngeess2-da n001v001 NVR 1 1992 MiB 1816 MiB IPR-10 68C8730 ok
rg_ngeess2-da n002v001 NVR 1 1992 MiB 1816 MiB IPR-10 68C8C10 ok
rg_ngeess2-da e1d1s01ssd SSD 2 745 GiB 744 GiB 00LY451 ok
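
Comparing the row for e2d2s32 in steps 4 and 8 shows what changes after a successful replacement: the path count goes from 0 to 2 and the state goes from failing/replace to ok. The check below expresses that as code. It is a sketch under assumptions: `pdisk_status` and `replacement_complete` are hypothetical names, and the rule "at least one path and state ok" is an illustrative criterion derived from this transcript, not an official IBM rule.

```python
# Sketch: confirm that one pdisk is healthy in 'mmvdisk pdisk list' output.
import re

# recovery group, pdisk, array, paths, (capacity / free space / FRU), state
ROW = re.compile(r"^(\S+)\s+(\S+)\s+(\S+)\s+(\d+)\s+.*\s(\S+)$")

BEFORE = "rg_ngeess2-da e2d2s32 DA1 0 7452 GiB 7448 GiB 00LY450 failing/replace"
AFTER = "rg_ngeess2-da e2d2s32 DA1 2 7452 GiB 7448 GiB 00LY450 ok"


def pdisk_status(listing, pdisk):
    """Return (paths, state) for the named pdisk, or None if not listed."""
    for line in listing.splitlines():
        match = ROW.match(line.strip())
        if match and match.group(2) == pdisk:
            return int(match.group(4)), match.group(5)
    return None


def replacement_complete(listing, pdisk):
    """True if the pdisk is listed with at least one path and state 'ok'."""
    status = pdisk_status(listing, pdisk)
    return status is not None and status[0] > 0 and status[1] == "ok"


if __name__ == "__main__":
    print("before:", replacement_complete(BEFORE, "e2d2s32"))
    print("after: ", replacement_complete(AFTER, "e2d2s32"))
```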