I ran into a strange problem on a ProLiant server (BL460c) in a C7000 enclosure. One of eight servers rebooted unexpectedly once or twice every 1-2 days.
The OS showed no errors or warnings apart from Kernel-Power with Event ID 41, and no memory dumps were created.
I went to the blade administration and noticed this in the IML log:
Uncorrectable Machine Check Exception (Board 0, Processor 1, APIC ID 0x00000035, Bank 0x00000005, Status 0xBE000000’00800400,
Address 0x00003800’06D29323, Misc 0x00000000’00007FFF)
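The Status value in that entry can be decoded by hand. Below is a minimal Python sketch, assuming the IML "Status" field is the raw IA32_MCi_STATUS register and follows the architectural layout in the Intel Software Developer's Manual (flag bits 63 to 57, model-specific code in bits 31:16, MCA error code in bits 15:0). The function names are made up for illustration and are not part of any HP tool.

```python
# Architectural flag bits of IA32_MCi_STATUS (assumed layout, per the Intel SDM).
FLAGS = {
    63: "VAL",    # register contains a valid error
    62: "OVER",   # an earlier error was overwritten
    61: "UC",     # error was not corrected by hardware
    60: "EN",     # error reporting was enabled
    59: "MISCV",  # Misc field is valid
    58: "ADDRV",  # Address field is valid
    57: "PCC",    # processor context may be corrupt
}


def parse_iml_hex(text):
    """Convert an IML-style value such as 0xBE000000'00800400 to an int."""
    # The IML separates the two 32-bit halves with an apostrophe,
    # which may show up as a straight or a typographic quote.
    cleaned = text.replace("'", "").replace("\u2019", "")
    return int(cleaned, 16)


def decode_mci_status(status):
    """Split a 64-bit status value into flags and error-code fields."""
    flags = [name for bit, name in sorted(FLAGS.items(), reverse=True)
             if (status >> bit) & 1]
    return {
        "flags": flags,
        "model_specific_code": (status >> 16) & 0xFFFF,
        "mca_error_code": status & 0xFFFF,
    }


status = parse_iml_hex("0xBE000000'00800400")
decoded = decode_mci_status(status)
print(decoded["flags"])
print(hex(decoded["model_specific_code"]), hex(decoded["mca_error_code"]))
```

With UC and PCC both set, the processor reports an uncorrected error with possibly corrupt context, which fits a hard reset that leaves no memory dump. The meaning of the MCA error code itself has to be looked up in the Intel manual for the specific CPU.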
I searched the HP support site and found this article. The server's ROM is dated 07/02/2013, so does this article even apply to a server with a ROM newer than 2011?
Anyway, I applied the suggested actions (go to the System BIOS and change the options below), and that resolved the issue.
- Minimum Processor Idle Power State to “No C-states”
- Intel QPI Link Power Management to “Disabled”
- HP Power Profile option to “Maximum Performance”
TIP: If you have a warranty or support contract with HP, you do not need to make any of these changes. Contact your local HP partner to replace the CPU/board, or open an HP support case! The BIOS changes above are not an official resolution for this server. According to the support article, only HP ProLiant servers with ROMs older than May 2011 are affected!