HP ProLiant Servers: Unexpected system reboot and Uncorrectable Machine Check Exception

I ran into a strange problem on a ProLiant server (BL460c) in a C7000 enclosure. One of the eight servers rebooted unexpectedly once or twice every 1-2 days.

All servers had been updated with HP Service Pack 09.2013 through HP SUM 6.20 and configured as a Hyper-V 2008 R2 cluster (also fully up to date).

In the OS I didn’t see any errors or warnings besides Kernel-Power with Event ID 41, and there were no memory dumps either.

I went to the blade’s administration interface and noticed this entry in the IML (Integrated Management Log):

Uncorrectable Machine Check Exception (Board 0, Processor 1, APIC ID 0x00000035, Bank 0x00000005, Status 0xBE000000'00800400,
Address 0x00003800'06D29323, Misc 0x00000000'00007FFF)
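The Status value in that entry can be unpacked a little further. Below is a minimal sketch in Python, assuming the value follows the architectural IA32_MCi_STATUS layout described in the Intel Software Developer's Manual (flag bits 63 to 57, model-specific code in bits 31:16, MCA error code in bits 15:0). The helper names `parse_iml_hex` and `decode_status` are made up for illustration and are not part of any HP tool.

```python
# Flag bits of IA32_MCi_STATUS, assuming the architectural layout
# from the Intel SDM machine-check chapter.
FLAGS = {
    63: "VAL",    # register holds valid error information
    62: "OVER",   # an earlier error was overwritten
    61: "UC",     # error was not corrected by hardware
    60: "EN",     # error reporting was enabled
    59: "MISCV",  # Misc value is valid
    58: "ADDRV",  # Address value is valid
    57: "PCC",    # processor context may be corrupted
}


def parse_iml_hex(text):
    """Convert an IML-style value such as 0xBE000000'00800400 to an int.

    Hypothetical helper: strips the apostrophe (straight or curly)
    that separates the upper and lower 32 bits in the log entry.
    """
    return int(text.replace("'", "").replace("\u2019", ""), 16)


def decode_status(status):
    """Split a 64-bit status value into flags and error codes."""
    return {
        "flags": [name for bit, name in FLAGS.items() if (status >> bit) & 1],
        "model_specific_code": (status >> 16) & 0xFFFF,
        "mca_error_code": status & 0xFFFF,
    }


if __name__ == "__main__":
    decoded = decode_status(parse_iml_hex("0xBE000000'00800400"))
    print(decoded["flags"])
    print(hex(decoded["model_specific_code"]))
    print(hex(decoded["mca_error_code"]))
```

For the entry above, this reports the UC and PCC flags as set, which fits an uncorrected error followed by a reset, and an MCA error code of 0x0400. If I read the SDM's table of simple error codes correctly, that value corresponds to an internal timer error, but treat this as a hint for the support case rather than a diagnosis.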

I searched the HP support site and found this article. The server’s ROM is dated 07/02/2013, so does the article even apply to a server with a ROM newer than 2011?

Anyway, I applied the suggested actions (go to the system BIOS and change the options below), and that resolved the issue.

  • Minimum Processor Idle Power State to “No C-states”
  • Intel QPI Link Power Management to “Disabled”
  • HP Power Profile to “Maximum Performance”

TIP: If you have a warranty or support contract with HP, you do not need to make any of these changes. Contact your local HP partner to replace the CPU or system board, or open an HP support case! The BIOS changes are not an official resolution: according to the support article, only HP ProLiant servers with ROMs older than May 2011 are affected.
