08-18-2010 09:07 AM
Single-user boot does the same thing.
How can I run fsck or otherwise resolve the ADVFS filesystem discrepancy and allow the boot process to advance?
ADVFS : Domain generic not activated - inconsistency detected
cdfs_mount: Unsupported disk format
panic (cpu 0): vfs_mountroot: cannot mount root
DUMP: No primary swap, no explicit dumpdev.
Nowhere to put header, giving up.
CP - SAVE_TERM routine to be called
CP - SAVE_TERM exited with hlt_req = 1, r0 = 00000005.00000000
halted CPU 0
halt code = 5
HALT instruction executed
PC = fffffc0000538480
08-19-2010 04:41 AM
You don't run fsck on AdvFS domains. You'd usually use verify to check it out.
In your case, as you can't get to single user, you're going to have to dig out an OS CD and try booting off that to get to a Unix shell.
You should then be able to use the verify utility from there to try and fix things...
However, given the "Unsupported disk format" message, I'm going to guess that your hard disk is dead...
Hope this helps,
08-19-2010 06:36 AM
Can you supply part numbers of specific media kits that would be sufficient? I doubt that my client's client's vendor (who owns the server) has the license key. A media kit with no license may still permit a 2-user system, I am told. Do you know whether I can at least boot the media CD to a point where I can repair the filesystem without having to enter a license key? If I do need a key, is there any way to retrieve the license key of the existing OS installation on the machine?
Another repair route, which requires a helpful colleague, is to dd the partition off to an image on another machine, then move the image over the net to a colleague who can dd it onto a drive, repair it, and ftp the image back. Alternatively, I could make it easier by putting the image onto a drive that could be repaired directly, and shipping the drive.
Thank you for your reply, and any other insights that you wish to offer.
08-19-2010 09:25 AM
You shouldn't need any key to boot from a
CD-ROM. Probably best if you can find a
Tru64 installation CD-ROM of the same version
as the installed OS. I'd tend to avoid any
V5.x kits if it was running V4.x.
I know nothing, but to me the scariest part
of your transcript was:
cdfs_mount: Unsupported disk format
Why is it doing anything related to cdfs?
> [...] they are three 9G drives in RAID5 on
> a Mylex controller, [...]
More stuff about which I know nothing, but
you might try to explore the Mylex
configuration to see which disks it's trying
to do what with. Also potentially
08-20-2010 02:31 AM
I *think* the part number for a 4.0x media kit is QA-MT4AA-H8, however I very much doubt that you'll be able to get hold of that directly from HP.
You won't need any licenses in order to boot up from CD.
In addition to seeing the output Steven mentioned, the output of
>>> show boot*
would also be useful.
If you've got a Mylex RAID controller, otherwise known as a KZPAC or SWXCR, then it's very likely that you'll need a copy of the RAID Configuration Utility to check the status of any RAID volumes.
You should be able to find that from here:
08-20-2010 09:55 AM
> I *think* the part number for a 4.0x media kit is QA-MT4AA-H8, however I very much doubt that you'll be able to get hold of that directly from HP.
You can still order old media kits from HP, and that is the correct part number. However, be sure to specify the version number you're asking for. Also, the turnaround time for older kits may be significant (e.g. several days).
08-25-2010 02:39 PM
And again, the system DID boot fine, but after two "rude" power-downs (due to not having any passwords to do a proper shutdown), the root partition is damaged.
But, I now have images of the three partitions copied from the RAID logical drive.
If anyone is receptive to a fee-for-service arrangement to run verify on the damaged partition to make it bootable again, please contact me at james at ump qua net dot com or telephone DIII CCCXIII LXXIII LXXXII. I can either make the partition image available via ftp, or I can attempt to put the partition on a drive and ship the drive.
FreeBSD couldn't read the disk label, but NetBSD did. Were I to ship a whole drive, I would probably use NetBSD to create the disk label and dd the partition image onto the drive. Please advise if you have a better idea!
Thank you all very much, and I'll keep you posted on my progress.
08-26-2010 04:26 AM
It would still be useful to see the output of the various "show" commands.
The original messages do appear to suggest that the system wasn't trying to boot off the RAID disk.
I've just emailed you as well...
08-31-2010 08:42 PM
I've been able to boot the CD and get a UNIX shell, and then:
disklabel -r re0
Due to my failed attempt to use NetBSD to 'dd' a repaired partition image onto the physical disk partition, it appears that the disklabel is damaged. I have a saved ASCII copy of the disklabel as NetBSD saw it originally, and I trust it because:
a) the offset and size values step correctly with no math errors;
b) the SWXCR drive type reported by NetBSD corresponds with the Digital UNIX /etc/disktab file's 're' device and SWXCR disk type, which is described as a DEC RAID Controller with dynamic geometry.
c) the size of the c partition from offset zero is the same as the total sectors, so the c partition size correctly fills the entire logical disk.
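Checks (a) and (c) above can be sketched in plain sh. This is only an illustration: the function name check_layout and the partition values fed to it are hypothetical, not the real label from this system; only the 35565568 total is taken from the thread.

```shell
#!/bin/sh
# Check that partitions tile the disk with no gaps or overlaps.
# stdin: one "name offset size" line per partition, sorted by offset,
# leaving out the whole-disk 'c' partition. All values are in sectors.
check_layout() {
    total=$1      # total sectors from the label header
    c_size=$2     # size of the 'c' partition (which starts at offset 0)
    expected=0    # offset at which the next partition should start
    while read name offset size; do
        if [ "$offset" -ne "$expected" ]; then
            echo "partition $name: offset $offset, expected $expected"
            return 1
        fi
        expected=$((offset + size))
    done
    if [ "$expected" -ne "$total" ] || [ "$c_size" -ne "$total" ]; then
        echo "partitions end at $expected, c is $c_size, disk is $total"
        return 1
    fi
    echo "layout consistent"
}

# Hypothetical partition values for illustration only.
check_layout 35565568 35565568 <<'EOF'
a 0 262144
b 262144 1048576
g 1310720 34254848
EOF
```

A label whose offsets do not step correctly, or whose c partition does not match the total, makes the function print the mismatch and return non-zero.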
I will comment idly that I needed to switch to 'set console graphics' mode to boot the installation CD, and also plug in a keyboard and mouse, but I did find the shell prompt. I rather prefer the serial console because it's so much easier to use 'script' to keep an audit trail of what I did and what response I got, and to paste console output from the serial 'cu' utility than to re-type everything from the graphics console. But I couldn't seem to get the CD to boot up in serial console mode. Now that I see it is an X-based install, that makes sense, I guess. But when booting with my laptop acting as a serial console (and no keyboard or mouse, I should confess), I never got any console or VGA output that hinted that anything was going on beyond the "jumping to bootstrap code" message. After I switched to console graphics mode, I started getting results.
Once at a shell, I did:
# cd /dev
# ./MAKEDEV re0
# disklabel -r re0
and saw the damaged label, which reinforced my suspicion that the disklabel really was gone. The type, disk, and almost all the geometry values were not correct, compared to the SWXCR label that NetBSD saw several days ago.
I tried writing a new label to the disk:
# disklabel -wr -t advfs re0
That command gave me:
disklabel: ioctl DIOCSDINFO: Open partition would move or shrink
Use alternate partition
I took a feeble stab at editing the new label to my liking:
# TERM=vt100;export TERM
# EDITOR=vi;export EDITOR
# disklabel -e re0
I changed only the size and offset values (deleting the partitions I didn't want), and left all the header disk, type, and geometry values at their (incorrect) defaults. I changed the 'a' partition fstype to 'unused' and left the other fstype values alone, since they already said 'unused'.
Since I didn't change the incorrect geometry values (which were generally smaller than they should be), I'm not surprised that when I tried to write the edited disklabel, I was told in essence that the partitions overflowed the limits of the disk.
For kicks, here are the "then-and-now" values of the disklabel header fields:
type was ccd, now says SCSI
disk was SWXCR, now says RZ28
sectors/track was 32, now 99
tracks/cyl was 128, now 16
sectors/cyl was 4096, now 1584
cyls was 8683, now 2595
total sectors was 35565568, now sectors/unit is 4110480
rpm was 3600, now 5376
interleave is 1 on both labels
trackskew, cylinderskew, headswitch, trk-to-trk seek, drivedata are all 0 on both labels
As you can see, the geometry on the RZ28 label is much smaller. Not surprisingly, when I edit and try to write the label to disk, I am told that the partitions exceed the size of the disk.
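The header figures above can be cross-checked with simple arithmetic, since total sectors should equal sectors/track times tracks/cyl times cyls. A minimal sketch in plain sh; the function name total_sectors is just illustrative:

```shell
#!/bin/sh
# total sectors = sectors/track * tracks/cyl * cyls
total_sectors() {
    echo $(( $1 * $2 * $3 ))
}

# Geometry from the label NetBSD originally saw (SWXCR)
echo "SWXCR: $(total_sectors 32 128 8683)"

# Geometry from the label currently on the disk (RZ28)
echo "RZ28:  $(total_sectors 99 16 2595)"
```

Both products match the totals quoted in the two labels (35565568 and 4110480), so each header is internally consistent; the RZ28 label simply describes a disk roughly one-ninth the size of the RAID logical drive.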
I've rarely read /etc/disktab, but reading Tru64's disklabel man page gave me an interest. The type SWXCR is indeed defined in that file, in conjunction with the re device name, described as a DEC RAID Controller (multi-spindle), and references dynamic_geometry in the disktab 'ty' field.
Attempting to create a label with:
disklabel -wr -t advfs re0 SWXCR
yields the same DIOCSDINFO error message.
Here's something I just found:
'print a default label'
disklabel -p re0
displays a disklabel with largely correct geometry: type SWXCR, disk SWXCR, flags: dynamic_geometry, 512 bytes/sec, 32 sec/trk (check), 128 trk/cyl (check), 4096 sec/cyl (check), 8683 cyls (check), 35565568 total sectors (check). 3600 rpm, interleave 1, all else 0. The partitions are wrong, but I can edit those.
Shall I write the output of 'disklabel -p re0' to an ASCII file, edit the ASCII file to have the right partitions, and then tell disklabel to write the label to disk based on that?
Would that be:
disklabel -p re0 > /tmp/re0.label
disklabel -Rr -t advfs re0 /tmp/re0.label