zpool import failure – invalid vdev configuration

If you happened to be on #lopsa a few nights ago, you might have seen this:

 

Thu 00:54:15< lkchen> uh, oh....zpool import - "The pool cannot be imported due to damaged devices or data"
Thu 00:54:27< lkchen> was in the process of replacing a disk in the root zpool, when the system got killed
Thu 00:54:41< lkchen> anybody know of a way to rescue this situation from remote?

 

Well, I was able to eventually get the system out of this mess….

 

First, some background on my co-workers, who are the product of an experiment. It followed years of wanting to find a Unix System Administration certification program that we would need to pass to keep our jobs, and that could be used to find new talent, but failing to find a suitable one.  Years ago a faculty unit had hired a Sun Certified Solaris Systems Administrator, who asked to have a Sun Workstation on his desk.  So, he was given the old workstation that had been used by the person in central IT who had been laid off after that faculty unit pulled funding for the position.

 

Per IT Security protocol, the machine had been wiped before leaving the datacenter, which he soon discovered when he couldn't figure out how to get the system to boot.  After a few days he finally mentions that there doesn't appear to be an OS on it, so we hand him the OS install disk set.  His immediate response is that he doesn't know what to do with these.

 

Not wanting to burst out laughing, or say something colorful, we weren't able to immediately respond to that….

 

But, he became the reason we don't hire on certifications, or require ourselves to have them. (We may require training before working on something, but not to the extent of getting certified.)

 

The new experiment has been to hire people that don't know Unix System Administration…give them the same (or better) money than the existing staff, and have them figure out the job on their own, without requiring that they get any training. (While the previous interim director didn't offer training, the new one has…but they probably haven't taken the class on when to ask for training…or how to use Google or read documentation….before calling our own helpdesk, which just sends them straight back to us for tier 3 support.)

 

So, when I had gone into work on my birthday, and so was on-site for the weekly staff meeting (a co-worker was sad that it had to be the one time he was remoting in….the camera in the conference room also happened to be out of service.  While, I never got a webcam to work in the VM that I run conferencing software in on my FreeBSD workstation at home…or directly in FreeBSD….) one of the 'new' admins…

 

[the one I mistook for a lost student on his first day, not the one that retired from IBM and is a graduate student…and floods our helpdesk with "how do I set up printing" (from an iMac to a local network printer), or "I'm having trouble moving host from dhcp to static in same vlan, please move host to correct vlan." or how to access the console of a VM, as it doesn't show up on the digi boxes.  Though the local network printer problem was partly because on his first day on the job, he decided he wanted to reconfigure the network printer to be usable as a network scanner (we all have department issued flash drives…though mine always seems to have the F5 EOD or boot installer on it.)]

 

…said he was going to finally replace the failed root zpool mirror drive in the E2900 that is one of two production student information Oracle RAC servers.  I had brought up that we'd been getting weekly emails about its failure for a few months.

 

[The emails started while I was at Mayo Clinic…on my return a 2.5lb object gets knocked off of a table behind me and lands on my foot….so I had been working from home a lot since then…..otherwise, I'd probably have replaced it within a week (longer if it's on support) of the first weekly email from CFEngine on zpool health.]

 

These E2900s are on support, so that may have delayed getting a replacement disk sent for it (if they weren't, we have piles of pulled drives to try from.)

 

Well, later that night…I get a call that something might have gone wrong in the replacement of the failed disk, because the server is down, and they can't even tell if it's on or off.

 

Well, I got onto the console, lom, and see that it's powered off. Checking through other things (showlogs, showenvironment, showcomponents, etc.), I poweron….while it's doing its long POST, the other admin starts a 'sniff session'.  Just as the system is about to boot, somebody hits the rocker switch and sends it back to the poweroff state.

 

I later found a note that the other admin, seeing that I was working on things, had decided to go home then.  nice.

 

So, watching it POST again….and it boots, but reports the zpool is in a faulted state so it can't continue.  I'm wondering what state it's in….there's no boot block on disk0, so it's using disk1's.  (I couldn't remember if it had been disk1 or disk0 that was in need of replacement….going back to old emails, it was disk0.)

 

Stumbling around a bit in OBP (somehow the boot failure leaves the controller in an invalid state, requiring a full reset to recover), I get it to come up from the failsafe archive.

 

Running "zpool import" from failsafe gives the report that "The pool cannot be imported due to damaged devices or data".

 

I search around a bit, including typing into a couple of IRC channels…

 

Only crickets….

 

I also notice the config is like this:

 

  NAME                STATE
  rpool               ONLINE
    mirror            ONLINE
      replacing       ONLINE
        c1t0d0s0      ONLINE
        c1t0d0s0/old  ONLINE
      c1t1d0s0        ONLINE

 

Normally, I'd expect to see c1t0d0s0/old as UNAVAIL, FAULTED and as a string of numbers with '(was c1t0d0s0/old)' on the end of the line.

 

I then manage to get it to import with '-F'.  'zpool status' says the 'vdev configuration is invalid'.

 

So, I'm wondering if there's some way from remote to get it to import the zpool seeing only 'c1t1d0s0'.

 

Or what messing around with zdb I'd be willing to attempt….zdb -l shows the labels to be in sync, but I don't know how to get zdb to tell me where it had found the labels. (which might have sped things up.)

 

After a couple of hours go by….I decide I'm willing to lose it, and put to the test that "RAID is backups" to see how they restore the rpool from backups. (Though I was fairly certain that the backup administrator is still running the old backup system, it might not be licensed for bare-metal restores anymore….I think we had moved to "we're doing bare-metal backups, but we'll get the license to do restores when we need it"….or it wasn't an option under the perpetual license for legacy backups.)

 

So, I try to figure out how to 'dd' over the ZFS labels that zdb shows.  The front two labels should be no problem to wipe, except they wouldn't go away….though I later found I had slipped on the command and created a couple of 1MB files instead.  But where's the end of the disk? (I had re-exported the inaccessible import to avoid any issues that might cause.)
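For reference, ZFS keeps four 256 KiB labels per vdev: two back to back at the front of the device and two at the very end, with the device size first rounded down to a multiple of 256 KiB. Here is a minimal sketch of that arithmetic, assuming you already know the slice size in bytes. The function name is mine, purely for illustration, and it needs a shell with 64-bit arithmetic (bash, say, not the legacy Solaris /bin/sh). It only prints offsets and writes nothing.

```shell
# Sketch only: print the byte offset of ZFS label N (0-3) on a device
# of SIZE bytes. The function name is illustrative, not a real tool.
LABEL_SIZE=262144   # each ZFS vdev label is 256 KiB

zfs_label_offset() {
    n=$1
    size=$2
    # ZFS rounds the device size down to a multiple of the label size
    aligned=$(( size / LABEL_SIZE * LABEL_SIZE ))
    if [ "$n" -lt 2 ]; then
        # labels 0 and 1 sit back to back at the start of the device
        echo $(( n * LABEL_SIZE ))
    else
        # labels 2 and 3 occupy the last two label-sized slots
        echo $(( aligned - (4 - n) * LABEL_SIZE ))
    fi
}
```

Dividing an offset by 512 gives the block number to hand to dd's seek= when using bs=512, which is where the "end of the disk" question comes from.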

 

# format

 

When I print out the partition table, I see something like:

s0 root 127MB
s1 swap 127MB
s2 the whole disk
s6 usr the rest of the disk

 

Hmmm, that's not right.

 

# prtvtoc /dev/rdsk/c1t1d0s2 | fmthard -s - /dev/rdsk/c1t0d0s2
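One way to confirm the copy took is to compare the two partition tables afterwards. prtvtoc prefixes its header and comment lines with '*', so a sketch like this strips them and leaves only the slice entries for diffing. The function name and the /tmp file names are mine, not something I actually ran that night.

```shell
# Sketch only: keep just the slice entries from prtvtoc output on stdin.
vtoc_entries() {
    # drop '*' header/comment lines, then drop empty lines
    grep -v '^\*' | grep -v '^$'
}

# Intended use (hypothetical file names):
#   prtvtoc /dev/rdsk/c1t1d0s2 | vtoc_entries > /tmp/disk1.vtoc
#   prtvtoc /dev/rdsk/c1t0d0s2 | vtoc_entries > /tmp/disk0.vtoc
#   diff /tmp/disk1.vtoc /tmp/disk0.vtoc && echo "partition tables match"
```

This only works when both disks carry the same kind of label; an SMI table and an EFI table won't line up.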

 

and look to see what "zpool import" thinks now.

 

c1t0d0s0/old's state is now UNAVAIL.  That might work now.

 

Now I can 'zpool import rpool'….and I detach both c1t0d0s0 and the string of digits that (was c1t0d0s0/old) from it.  Then I re-attach c1t0d0s0….and it starts resilvering….
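Spelled out, that sequence looks roughly like the following. This is a sketch, not a transcript: OLD_GUID is a placeholder for the string of digits that zpool status showed (I'm not reproducing the real one), and the run wrapper and function name are made up for illustration. By default it only echoes each command; set DRY_RUN=0 to actually execute them.

```shell
# Sketch only: the detach / re-attach sequence, printed rather than run.
DRY_RUN=${DRY_RUN:-1}
OLD_GUID=${OLD_GUID:-"<guid-of-old-disk>"}   # placeholder, not a real GUID

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"      # dry run: show the command only
    else
        "$@"           # really execute it
    fi
}

repair_rpool_mirror() {
    run zpool import rpool
    run zpool detach rpool c1t0d0s0            # the half-attached replacement
    run zpool detach rpool "$OLD_GUID"         # the entry that (was c1t0d0s0/old)
    run zpool attach rpool c1t1d0s0 c1t0d0s0   # re-mirror; resilver starts
    run zpool status rpool
}
```

Calling repair_rpool_mirror with the defaults prints the five commands in order, which makes it easy to double-check the device names before doing anything for real.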

 

About an hour later 77GB has been resilvered.

 

The other admin takes over. He doesn't seem to know how to stop the "watch" of "zpool status", so he reboots the box, does "installboot" and reboots again (leaving auto-boot? as false and boot-device as disk1…).

 

And, everything was good again….

 

Oh, was the problem due to the 'new' admin not being able to follow instructions, or inexperience….it was the latter.  In the continuation of the SR, the new support engineer asks if the procedure outlined in their documentation on replacing a zpool disk had been followed.

 

I infer that the answer is 'yes'….it makes no mention of formatting or partitioning the replacement disk to match the existing one, etc.  It was just lucky/unlucky that the replacement disk they had sent came with an SMI label…instead of being blank or having an EFI label.  I normally expect to get a blank replacement disk, but have on occasion received ones with an EFI label….prtvtoc|fmthard doesn't quite work when they aren't the same 😉  I never remember to see whose data is on these 'replacement' disks that we receive.  Wonder if anybody is looking at the data on the disks we return?

 

Until the next day…..but that's another story.