tru64 goes down to halt
Occasional Advisor
Bulent Kolay
Posts: 9
Registered: ‎05-12-2010
Message 1 of 14 (355 Views)

I have an Alpha ES40 server running Tru64.
For the last 3 days, the server has dropped to the P00>>> console prompt every day.
It had been working without any problems for 4 years, and we haven't installed any additional software recently.
When it crashes, I boot it and it works for a while, then it goes down again.
I checked the messages on it:

An AdvFS domain panic has occurred due to either a metadata write error or an internal inconsistency. This domain is being rendered inaccessible.
Vmunix: syncing disks AdvFS I/O error
Vmunix: Volume: /dev/rz8g
Vmunix: Tag: 0xfffffff7.0000
Vmunix: Page: 435
Vmunix: Block: 8064
Vmunix: Block Count: 16
Vmunix: Type of operation: write
Vmunix: Error: 5
bs_osf_complete: metadata write failed
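
For anyone triaging a similar crash, the fields of the error record above can be pulled out of /var/adm/messages mechanically. What follows is only a sketch in Python, not a Tru64 tool. The function name `parse_advfs_error` is made up for illustration, and the code assumes each field sits on its own "Vmunix: Key: value" line, which is based solely on the excerpt quoted above.

```python
import re

# Matches lines such as "Vmunix: Volume: /dev/rz8g".
# Assumption: every field is on its own "Vmunix: Key: value" line.
FIELD_RE = re.compile(r"^\s*vmunix:\s*([A-Za-z ]+?):\s*(.+?)\s*$", re.IGNORECASE)


def parse_advfs_error(lines):
    """Collect the Key: value fields of one AdvFS I/O error record into a dict."""
    record = {}
    for line in lines:
        match = FIELD_RE.match(line)
        if match:
            key, value = match.groups()
            record[key.strip().lower()] = value
    return record


# The record exactly as quoted in the post above.
sample = """\
Vmunix: syncing disks AdvFS I/O error
Vmunix: Volume: /dev/rz8g
Vmunix: Tag: 0xfffffff7.0000
Vmunix: Page: 435
Vmunix: Block: 8064
Vmunix: Block Count: 16
Vmunix: Type of operation: write
Vmunix: Error: 5
bs_osf_complete: metadata write failed
""".splitlines()

record = parse_advfs_error(sample)
print(record["volume"], record["type of operation"], record["error"])
```

The fields that matter for the rest of the thread are the volume (/dev/rz8g) and the error number. Error 5 is the standard UNIX errno EIO, meaning the write failed at the device level, which is consistent with a failing disk.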
Honored Contributor
Steven Schweda
Posts: 9,089
Registered: ‎02-23-2005
Message 2 of 14 (355 Views)

Re: tru64 goes down to halt

sizer -v

> [...] AdvFS I/O error

Hmmm. Do you think it may be a bad disk?

> [...] Volume: /dev/rz8g

Possibly that one?
Occasional Advisor
Bulent Kolay
Posts: 9
Registered: ‎05-12-2010
Message 3 of 14 (355 Views)

Re: tru64 goes down to halt

At another time, /var/adm/messages also showed an AdvFS I/O error on /dev/rz8a.

What should I do to fix it?
Thanks
Honored Contributor
Rob Leadbeater
Posts: 3,582
Registered: ‎08-14-2002
Message 4 of 14 (355 Views)

Re: tru64 goes down to halt

Hi,

It sounds very much like you have a failed or failing disk.

Repair/replace the disk.
Restore from your backups as appropriate.

Cheers,

Rob
Occasional Advisor
Bulent Kolay
Posts: 9
Registered: ‎05-12-2010
Message 5 of 14 (355 Views)

Re: tru64 goes down to halt

If I want to save the disk by repairing it instead of replacing it,
how can I repair it?
Thanks
Honored Contributor
Rob Leadbeater
Posts: 3,582
Registered: ‎08-14-2002
Message 6 of 14 (355 Views)

Re: tru64 goes down to halt

You could try reassigning bad blocks.

Check out the man page and the help for scu.

Cheers,
Rob
Occasional Advisor
Bulent Kolay
Posts: 9
Registered: ‎05-12-2010
Message 7 of 14 (355 Views)

Re: tru64 goes down to halt

Is there a tool for reassigning bad blocks?

How can I get it?

thanks
Honored Contributor
Rob Leadbeater
Posts: 3,582
Registered: ‎08-14-2002
Message 8 of 14 (355 Views)

Re: tru64 goes down to halt

Please refer to my previous post. For example:

# man scu

# scu

scu> help

Cheers,
Rob
Honored Contributor
Steven Schweda
Posts: 9,089
Registered: ‎02-23-2005
Message 9 of 14 (355 Views)

Re: tru64 goes down to halt

> How can I repair it ?

How old is this disk drive? What would a
replacement cost? How much time would you
like to spend playing with a cheap piece of
obsolete (and failing) hardware? What is
your time worth?
Occasional Advisor
Bulent Kolay
Posts: 9
Registered: ‎05-12-2010
Message 10 of 14 (355 Views)

Re: tru64 goes down to halt

I have 3 AdvFS partitions on the disk:

disklabel -r rz8

rrz8a
rrz8g
rrz8h

I booted the server in single-user mode.
When I run "/sbin/scu -f /dev/rrz8a"
I don't get any error. scu just shows its prompt:
scu>
When I run "/sbin/scu -f /dev/rrz8g"
scu>
When I run "/sbin/scu -f /dev/rrz8h"
scu>
Is this normal?
I still see this in /var/adm/messages:
Vmunix: syncing disks AdvFS I/O error
Vmunix: Volume: /dev/rz8g

Why wasn't the utility able to find any bad blocks?

Thanks

Honored Contributor
Rob Leadbeater
Posts: 3,582
Registered: ‎08-14-2002
Message 11 of 14 (353 Views)

Re: tru64 goes down to halt

> Why wasn't the utility able to find any bad blocks?

Because you didn't tell it to check anything...

scu>
scu> sbtl 1 0 0
scu> verify media

If this flags a block as being faulty, then

scu> reassign lba
scu> verify media start

Repeat as necessary. If you get more than a few bad blocks, replace the drive and restore from your backups.

Cheers,

Rob
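
The decision rule in the post above (reassign a few bad blocks, replace the drive if there are many) can be written down as a small sketch in Python. This is only an illustration of the logic, not something scu does itself. The cutoff `MAX_REASSIGNABLE`, the function name `next_action`, and the LBA values are all hypothetical, since the thread never puts a number on "more than a few".

```python
# Hypothetical cutoff for "more than a few"; the thread gives no number.
MAX_REASSIGNABLE = 3


def next_action(bad_lbas, max_reassignable=MAX_REASSIGNABLE):
    """Return 'ok', 'reassign', or 'replace' for the bad blocks seen so far."""
    distinct = set(bad_lbas)  # the same block may be flagged on repeated passes
    if not distinct:
        return "ok"
    if len(distinct) > max_reassignable:
        return "replace"
    return "reassign"


# Made-up block addresses, purely for illustration.
print(next_action([]))
print(next_action([1000]))
print(next_action([1000, 1016, 2048, 4096]))
```

The list passed in would be the running total of blocks flagged by successive "verify media" passes, so a drive that keeps producing new bad blocks crosses the cutoff and is treated as a replacement case.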
Occasional Advisor
Bulent Kolay
Posts: 9
Registered: ‎05-12-2010
Message 12 of 14 (353 Views)

Re: tru64 goes down to halt

Hello

I assume the values 1 0 0 in the sbtl command vary from system to system.
How can I find the right values for my server?

Thanks


Honored Contributor
Kapil Jha
Posts: 1,478
Registered: ‎01-23-2006
Message 13 of 14 (353 Views)

Re: tru64 goes down to halt

scu> show edt

and from there you can get the device number.
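
As a cross-check on what "show edt" reports, the older rz device names appear to encode the SCSI address in the name itself. The Python sketch below assumes the legacy convention that the unit number equals 8 x bus + target, with an optional letter before the number selecting a non-zero LUN (rzb8 would be LUN 1). Under that assumption rz8 corresponds to the "1 0 0" used earlier in the thread. The helper name `rz_to_btl` is hypothetical, and the result should be confirmed against "show edt" rather than trusted on its own.

```python
import re

# Assumed legacy naming: [/dev/][r]rz[<lun letter b-h>]<unit>[<partition a-h>]
RZ_RE = re.compile(r"^(?:/dev/)?r?rz([b-h]?)(\d+)([a-h]?)$")


def rz_to_btl(device):
    """Map a name like /dev/rz8g to (bus, target, lun) under the assumed scheme."""
    match = RZ_RE.match(device)
    if match is None:
        raise ValueError("not an rz-style device name: %r" % device)
    lun_letter, unit, _partition = match.groups()
    # Assumption: unit number = 8 * bus + target.
    bus, target = divmod(int(unit), 8)
    # Assumption: no letter means LUN 0, 'b' means LUN 1, and so on.
    lun = ord(lun_letter) - ord("a") if lun_letter else 0
    return bus, target, lun


print(rz_to_btl("/dev/rz8g"))
print(rz_to_btl("rrz8a"))
```

Both names refer to the same physical disk, so they should yield the same bus, target, and LUN. The partition letter (a, g, h) and the raw-device prefix do not change the address.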

After scu, you may also consider running fsck on rootdg, as the error clearly indicates that you have AdvFS corruption.

Do you have an alternate mirror disk?

BR,
Kapil+
I am in this small bowl, I wane see the real world......
Honored Contributor
Martin Moore
Posts: 214
Registered: ‎03-19-2003
Message 14 of 14 (353 Views)

Re: tru64 goes down to halt

> After scu, you may also consider running fsck on rootdg, as the error clearly indicates that you have AdvFS corruption.

Um, no. fsck works on UFS, not AdvFS. To check for and fix AdvFS corruption, you need to use fixfdmn or verify (both found in /sbin/advfs).

Martin
Every complex problem has a solution that is simple, elegant--and utterly wrong.