What is new in Failover Clustering in Windows Server 2016

Finally, I’d like to review what’s new in failover clustering in Windows Server 2016. I originally wrote this article a couple of months ago for the official Russian Microsoft blog, so if you read Russian you can find it there in your native language.

Also, I described some of the new features before the RTM version (when only Technical Previews were available), and almost all of that still applies to Windows Server 2016: there were no significant changes in RTM for those features. I’ll provide a short description of each and link to my previous posts with detailed information.

And yes, of course, completely new functionality (Load Balancing, for instance) is also described here.

* I have all of this in PDF format. Ping me in the comments or by email and I’ll send you a copy.

Cluster OS Rolling upgrade

Cluster migration is usually a headache for administrators. It can cause significant downtime: we have to evict some nodes from the old cluster, build a new one based on those nodes or on new hardware, and then migrate roles from the source cluster. In the case of overcommitment we won’t have enough resources to run the migrated VMs. This is critical for CSPs and other customers with SLA policies in place.

Windows Server 2016 addresses this by making it possible to place Windows Server 2012 R2 and Windows Server 2016 nodes in the same cluster during the upgrade/migration phase.

The new feature, named Cluster Rolling Upgrade (CRU), significantly simplifies the overall process and allows us to upgrade existing nodes one by one without destroying the cluster. This reduces downtime and the associated costs (hardware, staff time, etc.).

Cluster Rolling Upgrade Windows Server 2016

The full list of CRU benefits:

  • ONLY Hyper-V virtual machine and Scale-Out File Server workloads can be upgraded from Windows Server 2012 R2 to Windows Server 2016 without downtime. Other cluster workloads are unavailable for the time it takes them to fail over (for example, ~5 minutes of downtime for a SQL Server AlwaysOn FCI)
  • It does not require any additional hardware (for example, if you evict 1 node of 4, the remaining 3 nodes stay online, but they must have enough resources for the workloads live migrated from the evicted node; in that case zero downtime can be expected)
  • The cluster does not need to be stopped or restarted
  • An in-place OS upgrade is supported, BUT a clean OS install is highly recommended. Use in-place upgrades carefully and always check logs/services before adding a node back to the cluster
  • A new cluster is not required; the existing cluster objects stored in Active Directory are reused
  • The upgrade process is reversible until you cross the “point of no return”: when all cluster nodes are running Windows Server 2016 and the Update-ClusterFunctionalLevel PowerShell cmdlet has been run
  • The cluster supports patching and maintenance operations while running in mixed-OS mode
  • CRU is supported by VMM 2016 and can be automated through PowerShell/WMI

To get more details, read my previous post that shows CRU in action (it was written for the Technical Preview but still applies to RTM).
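For reference, the per-node flow can be sketched with standard cmdlets (node and cluster names here are hypothetical; this is a minimal outline, not the full procedure):

```powershell
# Drain roles from the node being upgraded (live migrates its VMs away)
Suspend-ClusterNode -Name "HV01" -Drain -Wait

# Evict the node, perform a clean install of Windows Server 2016 on it,
# then add it back to the same (mixed-mode) cluster
Remove-ClusterNode -Name "HV01"
Add-ClusterNode -Name "HV01" -Cluster "HVCluster"

# Repeat for every node. Only after ALL nodes run Windows Server 2016,
# cross the point of no return:
Update-ClusterFunctionalLevel -Cluster "HVCluster"
```

Until Update-ClusterFunctionalLevel is run, the cluster stays at the 2012 R2 functional level and the process remains reversible.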

Hint: get the list of VM configuration versions supported by a host with Get-VMHostSupportedVersion.

Supported VMs Version by Hyper-V Host
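After the cluster functional level is raised, the VM configuration versions can be upgraded too; a quick sketch (the VM name is hypothetical, cmdlets are from the in-box Hyper-V module):

```powershell
# List the VM configuration versions this host can run
Get-VMHostSupportedVersion

# Upgrade a VM to the host's default configuration version.
# This is one-way, so do it only after Update-ClusterFunctionalLevel.
Update-VMVersion -Name "Clu-VM01"
```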

Cloud Witness

A failover cluster in Windows Server 2012 R2 can be deployed with an external disk or file share witness, which must be reachable by every cluster node and serves as a source of an extra vote. As you may know, a witness is highly recommended (I’d say required!) for a Windows Server 2012 R2 cluster regardless of the number of nodes in it (dynamic quorum automatically decides when to use the witness).

In Windows Server 2016 a new witness type has been introduced – the Cloud Witness. Yes, it’s Azure-based, and it was created specifically for DR scenarios, workgroup/multi-domain clusters (described later), guest clusters and clusters without shared storage between nodes.

Cloud Witness uses Azure Storage resources (Azure Blob Storage over HTTPS; the HTTPS port must be open on all cluster nodes) for read/write operations. The same storage account can be used for different clusters, because Azure generates a blob file with a unique ID for each cluster. These blob files are kept in the msft-cloud-witness container and require just a few KBs of storage, so the costs are minimal, and a Cloud Witness can easily be used as a third site (“arbitration”) in stretched clusters and DR solutions.

Cloud Witness in Windows Server 2016

Cloud Witness scenarios:

  • Multi-site clusters
  • Clusters without shared storage (Exchange DAG, SQL Server AlwaysOn, etc.)
  • Guest clusters running in Azure and on-premises
  • Storage clusters with or without shared storage (SOFS)
  • Workgroup and multi-domain clusters (new in WS2016; described later)

How to create and add a Cloud Witness to a cluster

1) Create a new Azure storage account (locally-redundant storage) and copy one of its access keys

Azure Cloud Witness

2) Run the Configure Cluster Quorum wizard on your cluster and choose “Select the Quorum Witness – Configure a Cloud Witness”

Azure Cloud Witness

3) Enter your storage account name and paste the access key copied in step 1

Azure Cloud Witness

4) Wait a moment, and the witness becomes available in Core Resources

Azure Cloud Witness

5) Here is the blob file created in Azure for the added cluster

Azure Cloud Witness

PowerShell one-liner:

Azure Cloud Witness
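The one-liner in the screenshot above should look roughly like this (the storage account name and key are placeholders):

```powershell
# Switch the cluster quorum to a Cloud Witness backed by an Azure storage account
Set-ClusterQuorum -CloudWitness -AccountName "mystorageaccount" -AccessKey "<storage-account-access-key>"
```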

Workgroup and Multi-Domain Clusters

In Windows Server 2012/2012 R2 and earlier versions there was one global requirement for a cluster: all nodes had to be joined to a single domain. The Active Directory-detached cluster, introduced in 2012 R2, has the same requirement and does not provide much extra flexibility either. Beginning with Windows Server 2016 you have additional options: create a cluster whose nodes are in a workgroup, or create a cluster in a multi-domain environment.

More details are in my previous post.
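As a quick reminder, such a cluster is created with a DNS administrative access point instead of an Active Directory one; a minimal sketch with hypothetical node names and address (each node also needs a matching local administrator account and the policy/registry preparation covered in my previous post):

```powershell
# Create a workgroup/multi-domain cluster: the access point lives in DNS,
# so no Active Directory computer objects are created
New-Cluster -Name "WGCluster" -Node "Node1","Node2" `
    -AdministrativeAccessPoint DNS -StaticAddress 192.168.1.50
```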

Virtual Machine Load Balancing / Node Fairness

VMM has historically provided advanced options to efficiently manage cluster resources. Dynamic Optimization, available in VMM, automatically load balances resources between nodes. A trimmed-down version of that has been introduced in Windows Server 2016. VM Load Balancing (also called Node Fairness) moves resources (by live migration) to other nodes in the cluster every 30 minutes, based on the configured heuristics:

  • Current % of memory usage
  • Average CPU load (last 5 mins)

WSFC uses AutoBalancerLevel and AutoBalancerMode to decide when to move resources:

Get-Cluster | fl *autobalancer*

AutoBalancerMode  : 2
AutoBalancerLevel : 1

AutoBalancerLevel | Aggressiveness | When to move?
1 (default)       | Low            | When host load is more than 80%
2                 | Medium         | When host load is more than 70%
3                 | High           | When host load is more than 60%

You can also change these settings in the GUI (cluadmin.msc). By default, WSFC uses the Low aggressiveness level and always tries to load balance (AutoBalancerMode = 2).

VM Load Balancing Windows Server 2016

I’m using the following values for demo:

(Get-Cluster).AutoBalancerLevel = 2
(Get-Cluster).AutoBalancerMode = 2

And verifying two scenarios:

  • High CPU load on my host (about 88%)
  • High RAM usage (about 77%).

Since we use Medium aggressiveness (70% threshold), virtual machines will be moved from the busy host to another node. My script waits until live migration starts and then outputs the elapsed time for the entire load-balancing process.

When my host was under CPU load (~88%), the VM balancer moved more than one VM. In the case of RAM usage (~77%), one VM was moved.

All live migrations started within the 30-minute interval. The load-balancing feature works as expected.

VM Load Balancing Windows Server 2016
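The demo script itself isn’t shown here, but the idea can be sketched as a simple polling loop that snapshots the current VM owners and waits for any of them to change (the busy node name is hypothetical):

```powershell
# Remember where every VM group currently lives
$before = @{}
Get-ClusterGroup | Where-Object GroupType -eq "VirtualMachine" |
    ForEach-Object { $before[$_.Name] = $_.OwnerNode.Name }

# Poll until at least one VM group has a new owner node, then report elapsed time
$timer = [System.Diagnostics.Stopwatch]::StartNew()
do {
    Start-Sleep -Seconds 5
    $moved = Get-ClusterGroup | Where-Object {
        $_.GroupType -eq "VirtualMachine" -and $before[$_.Name] -ne $_.OwnerNode.Name
    }
} until ($moved)
$timer.Stop()

"VM(s) moved: $($moved.Name -join ', ') in $([int]$timer.Elapsed.TotalSeconds) seconds"
```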

If you manage your nodes with VMM, Dynamic Optimization is the preferred load-balancing method (it has more advanced settings and is more powerful). When Dynamic Optimization is enabled, VMM automatically disables the VM Load Balancing feature in Windows Server.

Virtual machine start ordering

In previous versions, start ordering was addressed by configuring VM priority. There are Low, Medium and High priority levels, which help to define which resources should start before their dependents (for example, Active Directory before SQL Server). Unfortunately, there is one big limitation: there is no cross-node orchestration, and a VM is considered running as soon as it reaches the online state.

Windows Server 2016 changes this behavior by adding VM Start Ordering, which allows you to define dependencies between VMs and group VMs into Sets (see the picture below). It was originally added to orchestrate VM start ordering, but it can be useful for any application represented as a cluster group.

VM Start Ordering

Let’s review an example:

VM Clu-VM02 runs an application that depends on Active Directory, which runs on the VM called Clu-VM01. In turn, VM Clu-VM03 depends on the application running on Clu-VM02.

To solve this, I’ll create new Sets using PowerShell:

VM Start Ordering

For VM with Active Directory:

PS C:\> New-ClusterGroupSet -Name AD -Group Clu-VM01

Name                : AD
GroupNames          : {Clu-VM01}
ProviderNames       : {}
StartupDelayTrigger : Delay
StartupCount        : 4294967295
IsGlobal            : False
StartupDelay        : 20

For the VM with the application:

New-ClusterGroupSet -Name Application -Group Clu-VM02

For the service that depends on the application:

New-ClusterGroupSet -Name SubApp -Group Clu-VM03

Dependencies between groups:

Add-ClusterGroupSetDependency -Name Application -Provider AD

Add-ClusterGroupSetDependency -Name SubApp -Provider Application

To change an already created Set, use the Set-ClusterGroupSet cmdlet:

Set-ClusterGroupSet -Name Application -StartupDelayTrigger Delay -StartupDelay 30

StartupDelayTrigger defines which event triggers the start of dependent sets and can have one of two values:

  • Delay – wait a fixed number of seconds (20 by default, taken from the StartupDelay value)
  • Online – wait until the group has reached an online state

StartupDelay – the delay time in seconds (20 by default).

IsGlobal – defines whether the set should start before all other sets (for example, a set with Active Directory VMs must be globally available and therefore has to start before all other sets).
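For instance, to mark the AD set created above as global (a sketch; I’m assuming the -IsGlobal parameter as exposed by the WS2016 cmdlet):

```powershell
# Make the AD set start before all other (non-global) sets
Set-ClusterGroupSet -Name AD -IsGlobal $true
```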

Let’s start VM Clu-VM03 (the service that depends on the application):

It waits until Active Directory on Clu-VM01 becomes available (StartupDelayTrigger = Delay, StartupDelay = 20 seconds).

VM Start Ordering

When Active Directory is online, the Clu-VM02 VM starts (StartupDelay is applied here as well).

VM Start Ordering

Once Clu-VM02 is available and running, Clu-VM03 receives the signal to start.

VM Start Ordering

VM Compute/Storage Resiliency

There are now new node and VM states that improve resiliency in scenarios with network or storage issues. Storage and compute resiliency have been added to act proactively: react to “small” problems before they become critical ones. Let’s review some examples.

Isolated Mode

The cluster service on node HV01 becomes unavailable, i.e. there is an issue with intra-cluster communication. In this case the HV01 node changes its state to Isolated (controlled by the ResiliencyLevel parameter) and is removed from active cluster membership.

Isolated mode Windows Server 2016

HV01 continues to host all its VMs*, but the state of those VMs becomes “Unmonitored” (meaning the cluster service no longer manages them).

Unmonitored state Windows Server 2016

*The VMs continue to run if they sit on SMB storage. VMs are placed into the “Paused-Critical” state if they are on block storage (FC/iSCSI, etc.), because the isolated node no longer has access to the CSV.

HV01 is allowed to stay in the Isolated state for the ResiliencyDefaultPeriod (240 seconds by default). If the cluster service (in my case) is still not online after 240 seconds, the HV01 host goes into a Down state and its VMs are migrated to a healthy node.
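Both knobs are cluster common properties and can be tuned via PowerShell (the values below are illustrative, not recommendations):

```powershell
# How long a node may stay Isolated before going Down (seconds)
(Get-Cluster).ResiliencyDefaultPeriod = 240

# ResiliencyLevel: 1 = IsolateOnSpecialHeartbeat, 2 = AlwaysIsolate (default)
(Get-Cluster).ResiliencyLevel = 2
```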


Let’s say HV01 recovered from the Isolated state, the cluster service came back online, and all nodes seem to be in good condition. Then, unexpectedly, the cluster service on HV01 becomes unavailable again, and the issue repeats several times within the last hour.

In this case the QuarantineThreshold (the number of failures before a node is quarantined; 3 by default) is reached, and the node goes into the Quarantine state for 2 hours (the QuarantineDuration parameter). All VMs are moved from HV01 to the healthy HV02 node.

Quarantine Windows Server 2016

Quarantine Windows Server 2016

We have fixed all the issues on HV01 and want to bring it out of quarantine. To do that, we need to run the following command:

Quarantine Windows Server 2016
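The command shown in the screenshot is Start-ClusterNode with the -ClearQuarantine switch (the node name is from my lab):

```powershell
# Bring a quarantined node back into active cluster membership
Start-ClusterNode -Name "HV01" -ClearQuarantine
```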

Please note that no more than 25% of nodes can be quarantined at any given time.

To customize settings:

(Get-Cluster).QuarantineDuration = 1800

Storage Resiliency

Do you know what happens when shared storage becomes unavailable? Right: the VMs go into the Offline state and then require a cold boot on the next start. That was the old behavior; Windows Server 2016 now puts such VMs into the Paused-Critical state (the AutomaticCriticalErrorAction parameter) and freezes them (read/write operations are stopped; the VM is unavailable but not turned off).

Storage Resiliency Windows Server 2016

If storage comes back within 30 minutes (AutomaticCriticalErrorActionTimeout; 30 minutes is the default), the VM leaves the Paused-Critical state and becomes available again (think pause/play in an audio player). If storage is still unavailable after the configured timeout, the VMs are turned off.
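Both settings are per-VM and exposed through the Hyper-V Set-VM cmdlet (the VM name is hypothetical; the timeout is specified in minutes):

```powershell
# Pause the VM on a critical storage error instead of powering it off,
# and keep it paused for up to 30 minutes while waiting for storage to return
Set-VM -Name "Clu-VM01" -AutomaticCriticalErrorAction Pause -AutomaticCriticalErrorActionTimeout 30
```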

Site-Aware/Stretched Clusters and Storage Replica

Previously, we had to look for third-party solutions for SAN-to-SAN replication and the like, and building stretched clusters required a huge amount of money. Windows Server 2016 helps to significantly reduce costs and unify such scenarios.

Storage Replica is the main component of multi-site clusters or a DR solution, and it supports both asynchronous and synchronous (!) replication between any storage devices (including Storage Spaces Direct). Storage Replica is only available in Datacenter Edition and can be used in the following configurations:

Storage Replica Windows Server 2016
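For a stretch cluster, the usual first steps are validating the topology with Test-SRTopology and then creating the partnership with New-SRPartnership; a sketch with hypothetical node, volume and replication-group names (each side needs both a data volume and a log volume):

```powershell
# Validate the proposed source/destination pair and produce a report
Test-SRTopology -SourceComputerName "HV01" -SourceVolumeName "C:\ClusterStorage\Volume1" `
    -SourceLogVolumeName "L:" -DestinationComputerName "HV03" `
    -DestinationVolumeName "C:\ClusterStorage\Volume1" -DestinationLogVolumeName "L:" `
    -DurationInMinutes 30 -ResultPath "C:\Temp"

# Create the replication partnership between the two sites
New-SRPartnership -SourceComputerName "HV01" -SourceRGName "RG01" `
    -SourceVolumeName "C:\ClusterStorage\Volume1" -SourceLogVolumeName "L:" `
    -DestinationComputerName "HV03" -DestinationRGName "RG02" `
    -DestinationVolumeName "C:\ClusterStorage\Volume1" -DestinationLogVolumeName "L:"
```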

Storage Replica supports automatic failover in stretched clusters and works side by side with another new feature in Windows Server 2016 – site-awareness. Site-awareness allows you to define groups of cluster nodes and link them to physical locations (site fault domains) in order to build custom failover policies and control the placement of VMs and S2D data. In addition, we can define fault domains at lower levels (node, rack or chassis). See the examples below.

Site Awareness Windows Server 2016

New-ClusterFaultDomain -Name Voronezh -Type Site -Description "Primary" -Location "Voronezh DC"
New-ClusterFaultDomain -Name Voronezh2 -Type Site -Description "Secondary" -Location "Voronezh DC2"
New-ClusterFaultDomain -Name Rack1 -Type Rack
New-ClusterFaultDomain -Name Rack2 -Type Rack
New-ClusterFaultDomain -Name HPc7000 -Type Chassis
New-ClusterFaultDomain -Name HPc3000 -Type Chassis
Set-ClusterFaultDomain -Name HV01 -Parent Rack1
Set-ClusterFaultDomain -Name HV02 -Parent Rack2
Set-ClusterFaultDomain -Name Rack1,HPc7000 -Parent Voronezh
Set-ClusterFaultDomain -Name Rack2,HPc3000 -Parent Voronezh2

Site Awareness Windows Server 2016

Final result:

Site Awareness Windows Server 2016

Site-Awareness benefits:

  • Groups fail over to a node within the same site before failing over to a node in a different site
  • During node drain, VMs are moved first to a node within the same site before being moved cross-site
  • The CSV load balancer distributes within the same site
  • Virtual machines follow their storage and are placed in the same site where their associated storage resides; VMs begin live migrating to the same site as their associated CSV one minute after the storage is moved

Using site-awareness, we can define the preferred site for all newly created VMs:

(Get-Cluster).PreferredSite = <site name>

Or set it for the specific cluster group:

(Get-ClusterGroup -Name GroupName).PreferredSite = <preferred site name>


Other notable improvements:

  • Storage Spaces Direct and Storage QoS support
  • Online shared VHDX resizing for guest clusters, plus Hyper-V Replica and host-level backup support
  • Enhanced scalability and performance of the CSV Cache, with added support for tiered spaces, Storage Spaces Direct and deduplication (it has become common to give tens of GBs to the CSV Cache)
  • Cluster log changes (time zone information, active memory dumps) to simplify overall diagnostics
  • WSFC automatically recognizes and configures multiple NICs on the same subnet for SMB Multichannel; no configuration is necessary

Thank you very much for reading!
