HP ProLiant Servers: Unexpected system reboot and Uncorrectable Machine Check Exception

I ran into a strange problem on a ProLiant server (BL460c) in a C7000 enclosure. One of the eight servers rebooted unexpectedly once or twice every 1-2 days.

All servers had been updated with HP Service Pack 09.2013 through HP SUM 6.20 and configured as a Hyper-V 2008 R2 cluster (also fully up to date).

From the OS side I didn’t see any errors or warnings besides Kernel-Power with event ID 41, and there were no memory dumps either.

I then went to the blade administration interface and noticed this entry in the IML (Integrated Management Log):

Uncorrectable Machine Check Exception (Board 0, Processor 1, APIC ID 0x00000035, Bank 0x00000005, Status 0xBE000000'00800400,
Address 0x00003800'06D29323, Misc 0x00000000'00007FFF)
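To make the Status value in that entry less opaque, here is a minimal sketch that splits it into its fields. It assumes the value follows the architectural IA32_MCi_STATUS layout from the Intel SDM (flag bits 63 to 57, model-specific code in bits 31:16, MCA error code in bits 15:0). The function names `parse_iml_hex` and `decode_mc_status` are my own, not part of any HP tool.

```python
# Flag bits of IA32_MCi_STATUS (bit position -> name), assuming the
# architectural layout described in the Intel SDM.
STATUS_FLAGS = {
    63: "VAL",    # register contents are valid
    62: "OVER",   # an earlier error was overwritten
    61: "UC",     # error was not corrected
    60: "EN",     # error reporting was enabled
    59: "MISCV",  # Misc register holds valid data
    58: "ADDRV",  # Address register holds valid data
    57: "PCC",    # processor context may be corrupted
}


def parse_iml_hex(text):
    """Turn an IML value such as 0xBE000000'00800400 into an int.

    The IML splits 64-bit values with an apostrophe; blog software may
    turn it into a typographic one, so both are stripped.
    """
    cleaned = text.replace("'", "").replace("\u2019", "")
    return int(cleaned, 16)


def decode_mc_status(status):
    """Split a 64-bit machine check status value into flags and codes."""
    flags = [
        name
        for bit, name in sorted(STATUS_FLAGS.items(), reverse=True)
        if (status >> bit) & 1
    ]
    return {
        "flags": flags,
        "model_specific_code": (status >> 16) & 0xFFFF,
        "mca_error_code": status & 0xFFFF,
    }


decoded = decode_mc_status(parse_iml_hex("0xBE000000'00800400"))
print(decoded["flags"])                # ['VAL', 'UC', 'EN', 'MISCV', 'ADDRV', 'PCC']
print(hex(decoded["mca_error_code"]))  # 0x400
```

For this entry the UC and PCC flags are set, which is consistent with a hard reset and no memory dump. As far as I can tell, the Intel SDM lists MCA error code 0x0400 as an internal timer error, but treat that reading as my interpretation rather than an HP diagnosis.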

I searched the HP support site and found this article. The server’s ROM is dated 07/02/2013, though. So is the article applicable to a server with a ROM newer than 2011?

Anyway, I applied the suggested actions (go to the System BIOS and change the options below), and that resolved the issue.

  • Minimum Processor Idle Power State to “No C-states”
  • Intel QPI Link Power Management to “Disabled”
  • HP Power Profile option to “Maximum Performance”

TIP: If you have a warranty or support contract with HP, you do not need to make any of these changes yourself. Contact your local HP partner to replace the CPU or system board, or open an HP support case. The workaround above is not an official resolution: according to the support article, only HP ProLiant servers with ROMs older than May 2011 are affected.
