zx2000: Machine check when booting
Advisor
valeryz2001
Posts: 25
Registered: ‎10-13-2012
Message 1 of 17 (1,717 Views)

Dear Itanium experts,

I'd be grateful for any hints on how to interpret the BMC logs for my zx2000. This is a continuation of the "Firmware Error" on rx/zx topic, now with new data that I retrieved over a null-modem connection. In short: 2 months ago my zx2000 machine (deerfield, b.2005, always healthy until then) refused to boot. At first, about 1 in 5-6 boot attempts was successful (during one of them I replaced the BIOS battery and ran the IPF Offline Utilities), but now the machine seems to be defunct: every boot fails. The IPF Offline Utilities report that almost everything is OK, except for the Basic I/O test, which hangs the machine.

Here is an extract from the SEL log with a record of a successful boot (judging by the incorrect time settings, that was the day I changed the battery):

cli>sel
#  Sev Generator/Sensor Description  Event ID    Data, Timestamp
---- - ---------------- ------------ ----------- --------------------------
0990 - BMC              LPC reset    00-12:70:02        (Rel)   00 00:00:05
09A0 - SEL Time Set     Set          FD-C0:03:01        1998-01-01 00:04:52
09B0 2 CPU0                             000DC    DT 00  0000000000000000
09C0 2 CPU0                             000DC    Time   1998-01-01 00:04:54
09D0 - SFW              EFI boot mgr 00-12:6F:41 8F:--  1998-01-01 00:05:20
09E0 2 CPU0             EFI boot mgr    0020B    DT 04  0000000000000006
09F0 2 CPU0             EFI boot mgr    0020B    Time   1998-01-01 00:05:20
0A00 - BMC              LPC reset    00-12:70:02        1998-01-01 00:34:23

 

 

The following is the FPL record for a recent attempt to boot:

 

cli>fpl
Rec#   Sev Generator/Sensor Description  Event ID    Data, Timestamp
-------- - ---------------- ------------ ----------- --------------------------
00001DEF - BMC              LPC reset    00-12:70:02        2012-12-17 00:58:59
00001DF0 - SFW              Boot start   00-1D:0A:00        2012-12-17 00:58:59
00001DF1 2 CPU0             Boot start      00063    DT 06  0000000000000000
00001DF2 2 CPU0             Boot start      00063    Time   2012-12-17 00:58:59
00001DF3 0 CPU0                             00020    DT 00  0000000000000000
00001DF4 0 CPU0                             0000E    DT 06  0000000000010000
00001DF5 1 CPU0             CPU monarch     0000C    DT 06  0000000000000000
00001DF6 1 CPU0             CPU present     00261    DT 06  0000000000000000
00001DF7 0 CPU0                             00008    DT 00  0000000000000000
00001DF8 0 CPU0                             0024B    DT 00  0000000000000000
00001DF9 0 CPU0                             00006    DT 03  02000000002A0400
00001DFA 0 CPU0                             00056    DT 00  0000000000000000
00001DFB 0 CPU0                             0024C    DT 00  0000000000000000
00001DFC 0 CPU0                             0001D    DT 06  0000000000000000
00001DFD - SEL Time Set     Set          FD-C0:03:01        2012-12-17 00:59:05
00001DFE 0 CPU0                             002AF    DT 06  000000000000001F
00001DFF 0 CPU0                             0010B    DT 00  0000000000000000
00001E00 1 CPU0                             000A4    DT 00  0000000000000000
00001E01 0 CPU0                             000B1    DT 00  0000000000000000
00001E02 0 CPU0                             000DF    DT 00  0000000000000000
00001E03 0 CPU0                             000C6    DT 00  0000000000000000
00001E04 1 CPU0                             000FE    DT 00  0000000000000000
00001E05 0 CPU0                             000EC    DT 00  0000000000000000
00001E06 0 CPU0                             000A6    DT 00  0000000000000000
00001E07 0 CPU0                             000E7    DT 04  FFFFFFFF000AFF74
00001E08 0 CPU0                             000E7    DT 04  FFFFFFFF000BFF74
00001E09 0 CPU0                             000E5    DT 04  FFFFFFFF001AFF74
00001E0A 0 CPU0                             000E5    DT 04  FFFFFFFF001BFF74
00001E0B 0 CPU0                             00205    DT 00  0000000000000000
00001E0C 0 CPU0                             000B2    DT 00  0000000000000000
00001E0D 0 CPU0                             000C9    DT 00  0000000000000000
00001E0E 0 CPU0                             000C2    DT 00  0000000000000000
00001E0F 0 CPU0                             000A8    DT 00  0000000000000000
00001E10 0 CPU0                             000CE    DT 00  0000000000000000
00001E11 0 CPU0                             000B8    DT 00  0000000000000000
00001E12 0 CPU0                             000F6    DT 00  0000000000000000
00001E13 0 CPU0                             000F1    DT 00  0000000000000000
00001E14 0 CPU0                             000EF    DT 00  0000000000000000
00001E15 0 CPU0                             000A5    DT 00  0000000000000000
00001E16 1 CPU0             I/O discovry    00081    DT 00  0000000000000000
00001E17 0 CPU0                             00086    DT 04  000000FFFF00FF83
00001E18 0 CPU0                             00086    DT 04  000000FFFF04FF83
00001E19 0 CPU0                             00086    DT 04  000000FFFF05FF83
00001E1A 0 CPU0                             00086    DT 04  000000FFFF06FF83
00001E1B 0 CPU0                             00087    DT 04  000000FFFF00FF83
00001E1C 0 CPU0                             00087    DT 04  000000FFFF04FF83
00001E1D 0 CPU0                             00087    DT 04  000000FFFF05FF83
00001E1E 0 CPU0                             00087    DT 04  000000FFFF06FF83
00001E1F 2 CPU0                             00285    DT 06  0000000000000000
00001E20 2 CPU0                             00285    Time   2012-12-17 00:59:09
00001E21 - SFW              Machine chk  00-13:70:A1 3F:00  2012-12-17 00:59:09
00001E22 7 CPU0             Machine chk     00098    DT 06  000000000000000B
00001E23 7 CPU0             Machine chk     00098    Time   2012-12-17 00:59:09
00001E24 2 CPU0                             002A1    DT 06  28000000FFF21130
00001E25 2 CPU0                             002A1    Time   2012-12-17 00:59:09
00001E26 2 CPU0                             00115    DT 06  0000000000000000
00001E27 2 CPU0                             00115    Time   2012-12-17 00:59:09
00001E28 3 CPU0                             00107    DT 06  0000000000000000
00001E29 3 CPU0                             00107    Time   2012-12-17 00:59:09
00001E2A - BMC              LPC reset    00-12:70:02        2012-12-17 00:59:09
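
To pick out the interesting rows from a long FPL or SEL listing like the one above, a small filter can help. The following is only a minimal sketch: it assumes that a larger number in the Sev column means a more serious event (the listing suggests this, with the machine check at 7 and routine boot progress at 0, but it has not been checked against HP documentation), and the names `parse_rows`, `severe` and `SAMPLE` are made up for illustration.

```python
import re

# One FPL/SEL row: record number (hex), severity ("-" or one digit), then the rest.
ROW = re.compile(r"^(?P<rec>[0-9A-Fa-f]{4,8})\s+(?P<sev>[-0-9])\s+(?P<rest>.+)$")


def parse_rows(text):
    """Return (record, severity, remainder) tuples for every log row in text."""
    rows = []
    for line in text.splitlines():
        m = ROW.match(line.strip())
        if not m:
            continue  # header, separator or prompt lines
        sev = None if m["sev"] == "-" else int(m["sev"])
        # Collapse the column padding so the remainder is easy to read.
        rows.append((m["rec"].upper(), sev, " ".join(m["rest"].split())))
    return rows


def severe(rows, threshold=3):
    """Keep rows whose numeric severity is at least threshold (assumed scale)."""
    return [r for r in rows if r[1] is not None and r[1] >= threshold]


# A few rows copied from the listing above.
SAMPLE = """\
00001E1F 2 CPU0                             00285    DT 06  0000000000000000
00001E21 - SFW              Machine chk  00-13:70:A1 3F:00  2012-12-17 00:59:09
00001E22 7 CPU0             Machine chk     00098    DT 06  000000000000000B
00001E28 3 CPU0                             00107    DT 06  0000000000000000
00001E2A - BMC              LPC reset    00-12:70:02        2012-12-17 00:59:09
"""

if __name__ == "__main__":
    for rec, sev, rest in severe(parse_rows(SAMPLE)):
        print(rec, sev, rest)
```

Run against the full capture, this would narrow the listing down to the rows around record 00001E22, which is where the machine check is reported.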

 

Which code caused this machine check? If it is actually the EFI code, how can I replace the firmware when the only documented way to do so is through the EFI shell?

Of course, it could be purely a hardware fault. If that is the case, what should I replace: the processor or the motherboard?

 

Thank you in advance,

Regards,

Valery

Honored Contributor
Robert_Jewell
Posts: 1,238
Registered: ‎06-26-2001
Message 2 of 17 (1,650 Views)

Re: zx2000: Machine check when booting

It's not firmware code that is causing the machine check. What you see is the firmware reporting the machine check. Machine checks are caused by faulty hardware in almost all cases.

If you can get the system to the EFI shell, run the 'errdump -mca' command to dump the hardware registers to your screen. Capture that data to a file and perhaps it can be analyzed.

If not, then the CPU or system board is likely at fault and would need replacement.

 

-Bob

Advisor
valeryz2001
Posts: 25
Registered: ‎10-13-2012
Message 3 of 17 (1,640 Views)

Re: zx2000: Machine check when booting

Thank you, Bob, for your post. The machine is working now: replacing the CMOS battery again did the job. Before X started I got a Firmware Error again, but the next boot was successful, and the system has been up ever since, about 2 weeks now. I'm really confused :( Maybe, rather than replacing the motherboard with a used one (and an equally ancient one; didn't HP stop manufacturing them?), I should look for a used rx2660 system?

Regards,
Valery

Honored Contributor
Robert_Jewell
Posts: 1,238
Registered: ‎06-26-2001
Message 4 of 17 (1,628 Views)

Re: zx2000: Machine check when booting

Well, if the system has been running for over two weeks now, maybe it's best to see how things go. If you want to know more about the reasons for the machine checks, you can still post those logs.

I would avoid replacing any hardware for the sake of replacing hardware at this point.

Since the OS is running, the machine check logs can be obtained from the /var/tombstones directory. If you have anything of interest there, post it here.

 

-Bob

Advisor
valeryz2001
Posts: 25
Registered: ‎10-13-2012
Message 5 of 17 (1,624 Views)

Re: zx2000: Machine check when booting

Bob, thank you again for the help. Though I know nothing about the severity of this problem, I really consider hardware replacement to be the worst solution :( I bought my zx about 7 years ago, when I needed a machine compatible with the Integrity servers at the office, and I now find myself really attached to it, not to mention that it's the only non-laptop PC at home.

As for the logs you mentioned, /var/tombstones seems to be specific to HP-UX and is not present in Linux. At least 2 Firmware Errors occurred while the system was up, so the kernel logs MAY contain something related, but I don't know how detailed the collected information is, and I'm not sure how much Linux cares about machine check internals. If the Linux logs would really be helpful, I could try rebooting repeatedly, but the problem is that failures are an order of magnitude more frequent before ELILO starts than after it.

Regards,
Valery

Advisor
valeryz2001
Posts: 25
Registered: ‎10-13-2012
Message 6 of 17 (1,617 Views)

Re: zx2000: Machine check when booting

Sorry for my ignorance, but I really don't understand how the kernel can save anything about the problem if a machine check isn't an ordinary interruption with a handler in the IVA table, but purely a hardware event causing an abort. My knowledge of hardware is very limited: when I learned ia64 assembler 10 years ago with the Ski simulator, I skipped everything related to hardware interruptions (those not handled through the IVA table) as something requiring considerable training in hardware design even to be understood. However, even after 10 years I can recall that there are a few rare conditions in which machine checks are caused by the code itself, for instance after inconsistent manipulation of the TLB. But the chance that I am facing one of those now is about zero, I think.

Honored Contributor
Robert_Jewell
Posts: 1,238
Registered: ‎06-26-2001
Message 7 of 17 (1,613 Views)

Re: zx2000: Machine check when booting

It's not the kernel; rather, there is space in NVRAM to which the CPU and system ASIC registers can be dumped in the event of a machine check. The OS reads this area of NVRAM and copies the data to disk.

But alas, that is HP-UX. I hadn't realized you were running Linux. To review the MCA (machine check abort) logs, you will need to reboot the system and access the EFI shell. From there, run the errdump -mca command and capture the data to a file. The data will have to be decoded to be interpreted.

-Bob

Advisor
valeryz2001
Posts: 25
Registered: ‎10-13-2012
Message 8 of 17 (1,611 Views)

Re: zx2000: Machine check when booting

OK, thank you, Bob, both for the explanation of how it works (now it all fits together in my mind) and for the advice. I'll try errdump -mca.

Regards,
Valery

Advisor
valeryz2001
Posts: 25
Registered: ‎10-13-2012
Message 9 of 17 (1,597 Views)

Re: zx2000: Machine check when booting

Dear Bob, my gratitude for the exceptional help from you. Here is a result of errdump mca:

**** MCA Error Log Dump ****
Firmware Revision: 02.31
Architected SAL Record ID 0x0000000000000001
Time this log was recorded: 01/01/1998 at 00:17:25
MCA Monarch Lid: 0x0000000000000000

**** CPU Error Information, CPU00 ****
Time Stamp: 01/01/1998 at 00:17:25
PROCESSOR_SPECIFIC_DEVICE_INFO
VALIDATION_BITS           0x000000000110102f
PROC_ERROR_MAP            0x0000000201006000
Processor State Parameter 0xa8000000fff21330
PROC_CR_LID               0x0000000000000000

CPU ID Registers:
CPU ID[00] 0x49656e69756e6547
CPU ID[01] 0x000000006c65746e
CPU ID[02] 0x0000000000000000
CPU ID[03] 0x000000001f010504
CPU ID[04] 0x0000000000000001

Mod Error Section

Cache Error [00]
Valid Field Bits 0x0000000000000001
Check Info       0x0000000000000000
Requestor ID     0x0000000000000000
Responder ID     0x0000000000000000
Target ID        0x0000000000000000
Precise IP       0x0000000000000000

Cache Error [01]
Valid Field Bits 0x0000000000000001
Check Info       0x0000000000000000
Requestor ID     0x0000000000000000
Responder ID     0x0000000000000000
Target ID        0x0000000000000000
Precise IP       0x0000000000000000

Bus Error [00]
Valid Field Bits 0x0000000000000001
Check Info       0x0080000000000060
Requestor ID     0x0000000000000000
Responder ID     0x0000000000000000
Target ID        0x0000000000000000
Precise IP       0x0000000000000000

Micro Arch Error [00]
Valid Field Bits 0x0000000000000001
Check Info       0x0080000000003001
Requestor ID     0x0000000000000000
Responder ID     0x0000000000000000
Target ID        0x0000000000000000
Precise IP       0x0000000000000000

GRs NaT bits    0x0000000000000000
PR (Predicates) 0xff000000000108c1
BR0             0xa00000010031b790
RSC             0x0000000000000003
IIP             0xa0000001002f09c0
IPSR            0x0000101008526030
IFS             0x80000028b040cc18
XIP             0xa0000001002f09c0
XPSR            0x0000101008526030
XFS             0x0000000000000000
BR1             0x0000000000000000

Processor Info Valid Field Bits 0x000000000000003f
Valid Bits Decoding:
Min State Save Area Valid: 00
Min State == GRs NATs, GR0-15, Bank 0, Bank 1, PR, BR0, RSC IIP, IPSR, IFS, XIP, XPSR, XFS
BRs Valid: 01
CRs Valid: 02
ARs Valid: 03
RRs Valid: 04
FRs Valid: 05

Static General Registers (GR 0-15)
000-003 0x0000000000000000 a000000100a6ced0 c0000000000b9408 c0000000000b9418
004-007 0x0000000000000000 0000000000000000 0000000000000000 0000000000000000
008-011 0xe000000101252580 000000000000000e 0000000000000000 00000000000009c1
012-015 0xe000004080d1fe00 e000004080d18000 0000000000000000 0000000000000fa0

Bank 0 Static General Registers (GR 16-31, bank 0)
016-019 0xc0000000000b9400 0010000000000671 00000000000000f8 00100000000b9671
020-023 0x0000080400000000 0000181008526030 0000000000000000 0000000000000000
024-027 0xffffffffffff0000 0000000000000040 0000000000010000 000000000000000f
028-031 0x40000000000d83f0 0000101308526030 8000000000000006 00000000000108c1

Bank 1 Static General Registers (GR 16-31, bank 1)
016-019 0x0000000000000fa0 e000000101252588 e000000101252598 c0000000000b9800
020-023 0xe000000101252580 c0000000000b9440 e0000001012525c0 e0000001012523f0
024-027 0xa0000001002f0260 a0000001006b3b48 a0000001008611e0 a000000100861140
028-031 0x0000000000000000 0000000000000000 0000000000000020 0000000000000207

Branch Registers (BR 0-7)
000-003 0xa00000010031b790 0000000000000000 0000000000000000 0000000000000000
004-007 0x0000000000000000 0000000000000000 a00000010031b6a0 a0000001002f0320

Control Registers (CR 0-127)
000-003 0x0000000000007e04 00000009e93df11e a000000100000000 0000000000000000
004-007 0x0000000000000000 0000000000000000 0000000000000000 0000000000000000
008-011 0x1ffc0000000000c9 000000000000003c 0000000000000000 0000000000000000
012-015 0x0000000000000000 0000000000000000 0000000000000000 0000000000000000
016-019 0x0000101008526030 0000080400000000 0000000000000000 a0000001002f09c0
020-023 0xc0000000000b9400 0000000000000660 a0000001002f09c0 80000028b040cc18
024-027 0x000000000004c020 bffc000ffff90058 0000000000000000 0000000000000000
028-031 0x0000000000000000 0000000000000000 0000000000000000 0000000000000000
032-035 0x0000000000000000 0000000000000000 0000000000000000 0000000000000000
036-039 0x0000000000000000 0000000000000000 0000000000000000 0000000000000000
040-043 0x0000000000000000 0000000000000000 0000000000000000 0000000000000000
044-047 0x0000000000000000 0000000000000000 0000000000000000 0000000000000000
048-051 0x0000000000000000 0000000000000000 0000000000000000 0000000000000000
052-055 0x0000000000000000 0000000000000000 0000000000000000 0000000000000000
056-059 0x0000000000000000 0000000000000000 0000000000000000 0000000000000000
060-063 0x0000000000000000 0000000000000000 0000000000000000 0000000000000000
064-067 0x0000000000000000 0000000000000000 0000000000000000 0000000000000000
068-071 0x0000000000000000 0000000000000000 0000000000000000 0000800000000000
072-075 0x00000000000000ef 00000000000000ee 000000000000001f 0000000000000000
076-079 0x0000000000000000 0000000000000000 0000000000000000 0000000000000000
080-083 0x0000000000010000 0000000000010000 0000000000000000 0000000000000000
084-087 0x0000000000000000 0000000000000000 0000000000000000 0000000000000000
088-091 0x0000000000000000 0000000000000000 0000000000000000 0000000000000000
092-095 0x0000000000000000 0000000000000000 0000000000000000 0000000000000000
096-099 0x0000000000000000 0000000000000000 0000000000000000 0000000000000000
100-103 0x0000000000000000 0000000000000000 0000000000000000 0000000000000000
104-107 0x0000000000000000 0000000000000000 0000000000000000 0000000000000000
108-111 0x0000000000000000 0000000000000000 0000000000000000 0000000000000000
112-115 0x0000000000000000 0000000000000000 0000000000000000 0000000000000000
116-119 0x0000000000000000 0000000000000000 0000000000000000 0000000000000000
120-123 0x0000000000000000 0000000000000000 0000000000000000 0000000000000000
124-127 0x0000000000000000 0000000000000000 0000000000000000 0000000000000000

Application Registers (AR 0-127)
000-003 0x0003fffffc000000 0000000000000000 0000000000000000 00000000047e0000
004-007 0x0000000000004080 e000004081bb0000 e000004080d18000 00000040847d0000
008-011 0x0000000000000000 0000000000000000 0000000000000000 0000000000000000
012-015 0x0000000000000000 0000000000000000 0000000000000000 0000000000000000
016-019 0xa0000001002f09c0 e000004080d18fa8 e000004080d18db8 0000000000000000
020-023 0x0000000000000000 0000000000000040 0000000000000000 0000000000000000
024-027 0x0000000000000002 0000000000000000 0000000000000000 0000060080000011
028-031 0x0000000000000000 0000000000000000 0000000000000000 0000000000000000
032-035 0x0000000000000000 0000000000000000 0000000000000000 0000000000000000
036-039 0x0000000000000000 0000000000000000 0000000000000000 0000000000000000
040-043 0x0009804c8a74433f 0000000000000000 0000000000000000 0000000000000000
044-047 0x00000009e971cccd 0000000000000000 0000000000000000 0000000000000000
048-051 0x0000000000000000 0000000000000000 0000000000000000 0000000000000000
052-055 0x0000000000000000 0000000000000000 0000000000000000 0000000000000000
056-059 0x0000000000000000 0000000000000000 0000000000000000 0000000000000000
060-063 0x0000000000000000 0000000000000000 0000000000000000 0000000000000000
064-067 0x0000000000000207 0000000000000007 0000000000000001 0000000000000000
068-071 0x0000000000000000 0000000000000000 0000000000000000 0000000000000000
072-075 0x0000000000000000 0000000000000000 0000000000000000 0000000000000000
076-079 0x0000000000000000 0000000000000000 0000000000000000 0000000000000000
080-083 0x0000000000000000 0000000000000000 0000000000000000 0000000000000000
084-087 0x0000000000000000 0000000000000000 0000000000000000 0000000000000000
088-091 0x0000000000000000 0000000000000000 0000000000000000 0000000000000000
092-095 0x0000000000000000 0000000000000000 0000000000000000 0000000000000000
096-099 0x0000000000000000 0000000000000000 0000000000000000 0000000000000000
100-103 0x0000000000000000 0000000000000000 0000000000000000 0000000000000000
104-107 0x0000000000000000 0000000000000000 0000000000000000 0000000000000000
108-111 0x0000000000000000 0000000000000000 0000000000000000 0000000000000000
112-115 0x0000000000000000 0000000000000000 0000000000000000 0000000000000000
116-119 0x0000000000000000 0000000000000000 0000000000000000 0000000000000000
120-123 0x0000000000000000 0000000000000000 0000000000000000 0000000000000000
124-127 0x0000000000000000 0000000000000000 0000000000000000 0000000000000000

Region Registers (RR 0-7)
000-003 0x0000000000a63039 0000000000a63039 0000000000a63039 0000000000a63039
004-007 0x0000000000a63039 0000000000a63039 0000000000a63039 0000000000a63039

Floating Point Registers (FR 0-127)
000-001 0x000000000000000000000 0x0ffff8000000000000000
002-003 0x000000000000000000000 0x000000000000000000000
004-005 0x000000000000000000000 0x000000000000000000000
006-007 0x000000000000000000000 0x1003e000000000014ff97
008-009 0x1003e0000000000000002 0x10008fa00000000000000
010-011 0x000000000000000000000 0x0fff683126e978d4fdf3b
012-013 0x0ffbfc400000000000000 0x0ffe9a200000000000000
014-015 0x000000000000000000000 0x000000000000000000000
016-017 0x1003e80000000ff401600 0x1003e0000000000001600
018-019 0x1003e80000000ff401600 0x000000000000000000000
020-021 0x000000000000000000000 0x000000000000000000000
022-023 0x000000000000000000000 0x000000000000000000000
024-025 0x000000000000000000000 0x000000000000000000000
026-027 0x000000000000000000000 0x000000000000000000000
028-029 0x000000000000000000000 0x000000000000000000000
030-031 0x000000000000000000000 0x000000000000000000000
032-033 0x1003e200000000004d3c8 0x000000000000000000000
034-035 0x000000000000000000000 0x1003e0000000000000000
036-037 0x1003e600000000017c350 0x1003e600000000017c3b0
038-039 0x1003e200000000012aad0 0x1003e20000000001297e0
040-041 0x1003e20000000001284d0 0x1003e200000000004f1a8
042-043 0x000000000000000000000 0x000000000000000000000
044-045 0x1003e0000000100000001 0x1003effffffff00000000
046-047 0x1003e0000000100000001 0x1003e200000000004c5f0
048-049 0x1003e2000000000129c98 0x1003e20000000001289a0
050-051 0x1003e200000000004f660 0x000000000000000000000
052-053 0x000000000000000000000 0x1003e600000000017c2b0
054-055 0x1003e0000000000000000 0x1003e600000000017c330
056-057 0x1003e200000000012afd0 0x1003e200000000012a150
058-059 0x1003e2000000000128e60 0x1003e200000000004fb20
060-061 0x000000000000000000000 0x000000000000000000000
062-063 0x000000000000000000000 0x000000000000000000000
064-065 0x000000000000000000000 0x000000000000000000000
066-067 0x000000000000000000000 0x000000000000000000000
068-069 0x000000000000000000000 0x000000000000000000000
070-071 0x000000000000000000000 0x000000000000000000000
072-073 0x000000000000000000000 0x000000000000000000000
074-075 0x000000000000000000000 0x000000000000000000000
076-077 0x000000000000000000000 0x000000000000000000000
078-079 0x000000000000000000000 0x000000000000000000000
080-081 0x000000000000000000000 0x000000000000000000000
082-083 0x000000000000000000000 0x000000000000000000000
084-085 0x000000000000000000000 0x000000000000000000000
086-087 0x000000000000000000000 0x000000000000000000000
088-089 0x000000000000000000000 0x000000000000000000000
090-091 0x000000000000000000000 0x000000000000000000000
092-093 0x000000000000000000000 0x000000000000000000000
094-095 0x000000000000000000000 0x000000000000000000000
096-097 0x000000000000000000000 0x000000000000000000000
098-099 0x000000000000000000000 0x000000000000000000000
100-101 0x000000000000000000000 0x000000000000000000000
102-103 0x000000000000000000000 0x000000000000000000000
104-105 0x000000000000000000000 0x000000000000000000000
106-107 0x000000000000000000000 0x000000000000000000000
108-109 0x000000000000000000000 0x000000000000000000000
110-111 0x000000000000000000000 0x000000000000000000000
112-113 0x000000000000000000000 0x000000000000000000000
114-115 0x000000000000000000000 0x000000000000000000000
116-117 0x000000000000000000000 0x000000000000000000000
118-119 0x000000000000000000000 0x000000000000000000000
120-121 0x000000000000000000000 0x000000000000000000000
122-123 0x1003e0000000000000000 0x1003e0000000100000001
124-125 0x1003e0000000100000001 0x1003e200000000012a610
126-127 0x1003e2000000000129320 0x1003e2000000000128000

**** zx1 IOC Registers ****
iocErrorValid 0x0000000000000000

**** PCI Component Registers ****
pciCompErrorValid 0x0000000000000000

**** PCI Bus Registers ****
pciBusErrorValid 0x0000000000000001

---- PCI Bus ----
validation_bits  0x000000000000074f
error_status     0x0000000000161600
error_type       0x 0005
bus_id           0x 0000
bus_addr         0x00000000000b9400
bus_data         0x0000000000000000
bus_cmd          0x0000000000000000
bus_requestor_id 0x00000000fed20000
bus_responder_id 0x0000000000000000
bus_target_id    0x00000000000b9400
bus_oem_id[0]    0x000000000000122e
bus_oem_id[1]    0x0000000000000000
cellNum          0x 00000000
sbaNum           0x 0000
ropeNum          0x 0000

.... Mercury LBA ....
error_status         0x688  0x0000000100000216
master_id_log        0x0690 0x0000000000000000
inbound_err_add      0x0290 0x0000000000000000
inbound_err_attrib   0x0298 0x0000000000000000
completion_msg_log   0x02A0 0x0000000000000000
outbound_err_address 0x0070 0x00000000000b9400
error_config         0x0680 0x0000000000001d50
status_info_cntrl    0x0108 0x0000000000000048
function_id          0x0000 0x83b00146122e103c
capabilities_list    0x0060 0x0f00023700200002
agp_command          0x0068 0x0000000000000000
pcix_capabilities    0x00A0 0x0013ff0000010007
olr_control          0x0600 0x0002361200032403
clock_control        0x0618 0x0000000000000038
bus_mode             0x0620 0x95c8646fa5850001

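
A few of the raw values in this dump can be decoded mechanically. The sketch below is illustrative only; the function names are invented here. It assumes the architected Itanium layout for the CPUID registers (vendor string as little-endian ASCII in CPUID[0] and CPUID[1], one-byte version fields in CPUID[3]) and the ia64 convention that the top three bits of a virtual address select the region. It also checks whether the offset of the address in GR16 of bank 0 (the same value appears at index 20 of the control register dump) matches the bus_addr reported in the PCI Bus section.

```python
def cpuid_vendor(cpuid0, cpuid1):
    """CPUID[0] and CPUID[1] hold the vendor string as little-endian ASCII."""
    raw = cpuid0.to_bytes(8, "little") + cpuid1.to_bytes(8, "little")
    return raw.rstrip(b"\x00").decode("ascii")


def cpuid_version(cpuid3):
    """Split CPUID[3] into its one-byte version fields (assumed layout)."""
    return {
        "number": cpuid3 & 0xFF,            # highest implemented CPUID index
        "revision": (cpuid3 >> 8) & 0xFF,
        "model": (cpuid3 >> 16) & 0xFF,
        "family": (cpuid3 >> 24) & 0xFF,
        "archrev": (cpuid3 >> 32) & 0xFF,
    }


def split_region(vaddr):
    """Return (region, offset): top 3 bits and low 61 bits of a 64-bit address."""
    return vaddr >> 61, vaddr & ((1 << 61) - 1)


if __name__ == "__main__":
    # Values copied from the dump above.
    print(cpuid_vendor(0x49656E69756E6547, 0x000000006C65746E))
    print(cpuid_version(0x000000001F010504))
    region, offset = split_region(0xC0000000000B9400)
    print(region, hex(offset), offset == 0x00000000000B9400)
```

If the offset and bus_addr agree, that only shows that the CPU-side and bus-side records refer to the same address; by itself it does not identify which part is failing.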
Advisor
valeryz2001
Posts: 25
Registered: ‎10-13-2012
Message 10 of 17 (1,591 Views)

Re: zx2000: Machine check when booting

Sorry for posting unformatted text, Edit option isn't working for me this time :(
0x000000000000000000000 120-121 0x000000000000000000000 0x000000000000000000000 122-123 0x1003e0000000000000000 0x1003e0000000100000001 124-125 0x1003e0000000100000001 0x1003e200000000012a610 126-127 0x1003e2000000000129320 0x1003e2000000000128000 **** zx1 IOC Registers **** iocErrorValid 0x0000000000000000 **** PCI Component Registers **** pciCompErrorValid 0x0000000000000000 **** PCI Bus Registers **** pciBusErrorValid 0x0000000000000001 ---- PCI Bus ---- validation_bits 0x000000000000074f error_status 0x0000000000161600 error_type 0x 0005 bus_id 0x 0000 bus_addr 0x00000000000b9400 bus_data 0x0000000000000000 bus_cmd 0x0000000000000000 bus_requestor_id 0x00000000fed20000 bus_responder_id 0x0000000000000000 bus_target_id 0x00000000000b9400 bus_oem_id[0] 0x000000000000122e bus_oem_id[1] 0x0000000000000000 cellNum 0x 00000000 sbaNum 0x 0000 ropeNum 0x 0000 .... Mercury LBA .... error_status 0x688 0x0000000100000216 master_id_log 0x0690 0x0000000000000000 inbound_err_add 0x0290 0x0000000000000000 inbound_err_attrib 0x0298 0x0000000000000000 completion_msg_log 0x02A0 0x0000000000000000 outbound_err_address 0x0070 0x00000000000b9400 error_config 0x0680 0x0000000000001d50 status_info_cntrl 0x0108 0x0000000000000048 function_id 0x0000 0x83b00146122e103c capabilities_list 0x0060 0x0f00023700200002 agp_command 0x0068 0x0000000000000000 pcix_capabilities 0x00A0 0x0013ff0000010007 olr_control 0x0600 0x0002361200032403 clock_control 0x0618 0x0000000000000038 bus_mode 0x0620 0x95c8646fa5850001
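
Editor's note: the function_id value in the Mercury LBA dump above (register offset 0x0000) looks like the first 8 bytes of a standard PCI configuration header read as one 64-bit word. That reading is an assumption, not something the log states. Under it, a rough Python sketch can split the value into vendor ID, device ID, command and status, and check the two parity-related bits of the standard PCI status register. The function names are made up for illustration.

```python
# Assumption: function_id is PCI config space offset 0x00..0x07 read as one
# little-endian 64-bit word (vendor, device, command, status).
def split_pci_header(function_id: int) -> dict:
    """Split the 64-bit word into its four 16-bit fields."""
    return {
        "vendor_id": function_id & 0xFFFF,
        "device_id": (function_id >> 16) & 0xFFFF,
        "command": (function_id >> 32) & 0xFFFF,
        "status": (function_id >> 48) & 0xFFFF,
    }

# Parity-related bits of the standard PCI status register.
STATUS_PARITY_BITS = {
    8: "master data parity error",
    15: "detected parity error",
}

def parity_flags(status: int) -> list:
    """Return the names of the parity bits that are set in a status word."""
    return [name for bit, name in sorted(STATUS_PARITY_BITS.items())
            if status & (1 << bit)]

fields = split_pci_header(0x83B00146122E103C)
print({name: hex(value) for name, value in fields.items()})
print(parity_flags(fields["status"]))
```

If the assumption holds, the low 16 bits (0x103c) are the HP vendor ID, the device ID (0x122e) matches bus_oem_id[0] in the same dump, and both parity bits of the status word are set.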
Advisor
valeryz2001
Posts: 25
Registered: ‎10-13-2012
Message 11 of 17 (1,590 Views)

Re: zx2000: Machine check when booting

Terribly sorry :) Why was it possible to format the first post in this thread, but not this one?
Advisor
valeryz2001
Posts: 25
Registered: ‎10-13-2012
Message 12 of 17 (1,589 Views)

Re: zx2000: Machine check when booting

Here is an attachment with the log and the lspci -vvv results for busid=0.
Advisor
valeryz2001
Posts: 25
Registered: ‎10-13-2012
Message 13 of 17 (1,584 Views)

Re: zx2000: Machine check when booting

Advisor
valeryz2001
Posts: 25
Registered: ‎10-13-2012
Message 14 of 17 (1,581 Views)

Re: zx2000: Machine check when booting

As I learned from the SAL docs, a PCI bus error_type of 5 means a master data parity error. The latest error registered seems to be the one from after replacing the battery, because the date is 1998. However, 2 days ago I rebooted the machine twice (after 2 weeks of uptime); the first boot failed before EFI started, and this time there was no battery manipulation. If both machine checks (one before EFI/ELILO and one during Linux startup) have the same cause, it doesn't matter which failure is recorded here. Otherwise I need logs for both types of error. BTW, the LEDs signaled the same way ("firmware error") in both cases.
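
Editor's note: a small lookup table makes the error_type field easier to read in later logs. Only the value 5 (master data parity error) is confirmed in this thread. The other entries are recalled from the PCI bus error section of the SAL specification and should be checked against the spec before relying on them. The function name is hypothetical.

```python
# PCI bus error_type codes. Only code 5 is confirmed by the post above;
# the rest are an assumption recalled from the SAL specification.
PCI_BUS_ERROR_TYPES = {
    0: "unknown or OEM-specific error",
    1: "data parity error",
    2: "system error",
    3: "master abort",
    4: "bus timeout or no device present",
    5: "master data parity error",
    6: "address parity error",
    7: "command parity error",
}

def describe_pci_bus_error(error_type: int) -> str:
    """Return a readable name for an error_type code, or 'reserved'."""
    return PCI_BUS_ERROR_TYPES.get(error_type, "reserved")

# The record in this thread reports error_type 0x0005.
print(describe_pci_bus_error(0x0005))
```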
Honored Contributor
Robert_Jewell
Posts: 1,238
Registered: ‎06-26-2001
Message 15 of 17 (1,574 Views)

Re: zx2000: Machine check when booting

From the data, it appears that the log contents are valid.  I am really not able to make anything out of this though.

 

Perhaps someone else can interpret the data?  Otherwise, start a new post with a request for an MCA/tombstone analysis.

 

Best of luck,

 

-Bob

Advisor
valeryz2001
Posts: 25
Registered: ‎10-13-2012
Message 16 of 17 (1,559 Views)

Re: zx2000: Machine check when booting

Hello Bob,

 

Although I haven't yet found a consultant to look at the logs, I'm now very close to the answer :) Even gently touching the video card connector now triggers this error; given that busid=0 seems to be the AGP bus, I'm almost sure of the cause. Long live the zx2000 itself :)

 

P.S. I lost the "AGP retainer" a year ago, and now the CPU is reminding me that this part is an essential one :)

 

 

Thank you again, Bob; your help was very valuable.

 

Regards

Valery

Advisor
valeryz2001
Posts: 25
Registered: ‎10-13-2012
Message 17 of 17 (1,358 Views)

Re: zx2000: Machine check when booting

Hello Robert,

 

You seem to be absolutely right: the CPU is damaged. Here is an extract from the salinfo record:

 

 

BEGIN HARDWARE ERROR STATE from mca on cpu 0
Err Record ID: 11    SAL Rev:  0.02
Time: 2013-05-25 07:23:30    Severity 0 Validation bits 0x00
Processor Device Error Info Section
  UNCORRECTED PROCESSOR ERROR: Cache Check
    processor lid            : 0x0000000000000000
    processor state parameter: 0x28000000fff21130
            rz  [2]=0     rendezvous request unsuccessful
            ra  [3]=0     rendezvous was not attempted
            me  [4]=1     multiple errors have occurred
            mn  [5]=1     min state registered with PAL
            sy  [6]=0     storage integrity not synchronized
            co  [7]=0     not continuable
            ci  [8]=1     machine check is isolated
            mi  [12]=1    more info available
            pi  [13]=0    ip logged is not precise
            pm  [14]=0    min state is not precise
            dy  [15]=0    processor dynamic state is not valid
            rs  [17]=1    rse is valid
            cm  [18]=0    fault has not been corrected
            cr  [20]=1    control registers are valid
            pc  [21]=1    performance counters are valid
            dr  [22]=1    debug registers are valid
            tr  [23]=1    translation registers are valid
            rr  [24]=1    region registers are valid
            ar  [25]=1    application registers are valid
            br  [26]=1    branch registers are valid
            pr  [27]=1    predicate registers are valid
            fp  [28]=1    floating point registers are valid
            b1  [29]=1    bank one general registers are valid
            b0  [30]=1    bank zero general registers are valid
            gr  [31]=1    general registers are valid
            cc  [59]=1    cache check
            bc  [61]=1    bus check
    PAL recovery status:
      error was isolated and contained, continuable if sw can recover
    processor error map      : 0x0000000001002000
      processor code id: 0
      logical thread id: 0
      data cache level 2 error
      processor bus level 1 error
  Cache check info[0]
    Operation: 0 (Unknown/unclassified), Level: L0
  BUS Check Info [0]
    Transaction size: 4, External Bus Error:, Type: 1 (Partial read), Severity: 0, Hierarchy: 0, Status information: 3 (Hard fail)
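
Editor's note: the flag list in the record above can be cross-checked against the raw processor state parameter. The sketch below uses only the bit positions printed in that listing (for example me [4], cc [59], bc [61]); it is a minimal illustration, not the decoder salinfo itself uses, and the function names are made up.

```python
# Bit positions copied from the salinfo listing above.
PSP_BITS = {
    "rz": 2, "ra": 3, "me": 4, "mn": 5, "sy": 6, "co": 7, "ci": 8,
    "mi": 12, "pi": 13, "pm": 14, "dy": 15, "rs": 17, "cm": 18,
    "cr": 20, "pc": 21, "dr": 22, "tr": 23, "rr": 24, "ar": 25,
    "br": 26, "pr": 27, "fp": 28, "b1": 29, "b0": 30, "gr": 31,
    "cc": 59, "bc": 61,
}

def decode_psp(psp: int) -> dict:
    """Map each flag name to the value (0 or 1) of its bit in psp."""
    return {name: (psp >> bit) & 1 for name, bit in PSP_BITS.items()}

def set_flags(psp: int) -> list:
    """Return the names of the flags that are set, ordered by bit position."""
    flags = decode_psp(psp)
    return [name for name in sorted(PSP_BITS, key=PSP_BITS.get) if flags[name]]

psp = 0x28000000FFF21130  # value from the record above
print(set_flags(psp))
```

For this record both cc (cache check) and bc (bus check) come out set, which agrees with the decoded listing.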

 

Regards,

Valery
