Windows Server Summit 2026 | Part 19: Failover clustering: heart of the private cloud and datacenter

As a rule, servers are not deployed individually but in clusters to ensure high availability and to be prepared for the failure of individual systems. In addition, such clusters allow the load to be efficiently distributed across multiple systems.
The core of such clusters is typically the “failover clustering” server feature. This allows multiple servers with similar hardware to be combined into a logical cluster with relatively little effort. This cluster can then provide redundancy for roles and products such as virtualization (Hyper-V), file services, and SQL Server.
This article discusses features that failover clustering already supports today and provides an outlook on features that will be available in the future.
Live Migration
The following sections cover several capabilities related to live migration. Live migration makes it possible to move virtual machines between nodes while they are running, without having to shut the VM down first.
Workgroup clusters
Since Windows Server 2016, failover clusters can also be operated outside of an Active Directory domain. Besides saving costs, this can be attractive for security reasons, for example to use a separate authentication platform for the virtualization layer.
Until now, however, it has not been possible to move running virtual machines between the nodes of such a cluster. This is now changing, which removes a significant limitation of workgroup clusters and makes them considerably easier to use.
GPU partitioning
Another new feature concerns virtual machines that have access to a physical graphics card. The “GPU partitioning” feature makes it possible to allocate the graphics card's resources across multiple virtual machines. These virtual machines can now also be moved between nodes while they are running and do not need to be shut down first.
AccelNet for highly available virtual machines
AccelNet enables a virtual machine to access the node's physical network adapter directly. This reduces latency and CPU load and, among other things, allows for the operation of additional VMs on the node compared to a configuration without AccelNet.
This feature is already in use in Azure for virtual machines and is now also being made available in Windows Server. A prerequisite for this is the use of Network ATC (Advanced Traffic Control).
New failover clustering scenarios
Windows Server 2025 supports new scenarios for setting up and operating failover clusters. These are described below.
S2D and SAN coexistence
S2D now supports the integration of existing SAN systems, allowing the two technologies to be combined. This means it is no longer necessary to purchase new server hardware to use S2D.
S2D Campus Cluster
With S2D Campus Cluster, storage devices from servers in two data centers can be combined into a single virtual pool. Previously, this was only possible with servers located at the same site.
Rack-local optimized reads
Until now, all copies of a data block have been treated as equivalent during a read operation. As a result, a read operation may be served from any copy, not necessarily the nearest one. This would be disadvantageous in an S2D Campus Cluster, as read operations could inadvertently be routed to the other site, potentially slowing them down.
To mitigate this behavior, read operation prioritization is now being introduced. The system evaluates where the nearest copy is located and routes the read request there. This is done according to the following hierarchy:
- Same node
- Same chassis
- Same rack
- Same site
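The hierarchy above can be illustrated with a minimal Python sketch. This is not the S2D implementation; the `Location` type and the functions `proximity_rank` and `pick_copy` are hypothetical names chosen for illustration, and the sketch assumes each copy's placement is known as a site/rack/chassis/node tuple.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Location:
    """Hypothetical placement record for a reader or a data copy."""
    site: str
    rack: str
    chassis: str
    node: str


def proximity_rank(reader: Location, copy: Location) -> int:
    """Lower is closer: 0 same node, 1 same chassis, 2 same rack,
    3 same site, 4 different site."""
    if copy.site != reader.site:
        return 4
    if copy.rack != reader.rack:
        return 3
    if copy.chassis != reader.chassis:
        return 2
    if copy.node != reader.node:
        return 1
    return 0


def pick_copy(reader: Location, copies: list[Location]) -> Location:
    """Choose the copy with the best (lowest) proximity rank."""
    return min(copies, key=lambda c: proximity_rank(reader, c))
```

With a reader on node `n1`, a copy on a neighboring node in the same chassis would be preferred over copies in another rack or at the other site.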
Cloud witness with managed identity
A failover cluster is typically assigned a small storage area, the witness, that contains control information for the cluster. It enables a smooth restart after a complete failure and acts as a tie-breaker when the cluster has an even number of nodes.
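The tie-breaking role of a witness can be illustrated with a simple majority-vote calculation. The following Python sketch is a simplified model, not the actual cluster quorum logic: it ignores features such as dynamic quorum, and the function names and the `witness_owner` parameter are assumptions made for illustration.

```python
from typing import Optional


def has_quorum(votes_reachable: int, total_votes: int) -> bool:
    """A partition keeps quorum only with a strict majority of all votes."""
    return votes_reachable * 2 > total_votes


def surviving_partitions(partition_sizes: list[int],
                         witness_owner: Optional[int] = None) -> list[bool]:
    """For each network partition, report whether it keeps quorum.

    partition_sizes: number of nodes in each partition.
    witness_owner: index of the partition that can reach the witness,
                   or None if no witness is configured.
    """
    witness_vote = 1 if witness_owner is not None else 0
    total = sum(partition_sizes) + witness_vote
    result = []
    for index, nodes in enumerate(partition_sizes):
        votes = nodes + (1 if index == witness_owner else 0)
        result.append(has_quorum(votes, total))
    return result
```

In this model, a four-node cluster split evenly into two halves leaves neither half with a majority. Once one half can reach the witness, it holds three of five votes and stays online.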
Azure Blob Storage can also be used for this purpose. However, access to this storage has so far only been possible with an access key (similar to a password), which is stored in the failover cluster database.
To enhance security here, it is now possible to use a managed identity for access.
Stretched S2D cluster with storage replica
Windows Server has supported distributed clusters since Windows Server 2008 R2. Until now, however, storage systems had to be replicated using separate mechanisms within the storage systems themselves.
Soon, it will be possible to replicate S2D-based storage between sites using the “Storage Replica” feature built into Windows Server.
Future improvements
Microsoft also provides a preview of features that are planned for the future. These will be discussed in more detail below.
Cluster native update
For failover clusters, there is a special feature called “Cluster-Aware Updating” (CAU). It automates the installation of updates in a failover cluster and ensures that multiple nodes are not updated or restarted at the same time. It uses various extensions (e.g., for hotfixes, updates, and custom requirements).
This functionality is now natively integrated into the cluster role and, as part of this, has been renamed “Cluster-Native Update” (CNU). Several improvements are also being made to the feature:
- Improved control of nodes during an update with built-in remote management capabilities
- Plugin-based architecture for better extensibility and easy development of custom plugins (including for CAU!)
- Improved configuration interface and diagnostic functions
- Creation and use of templates for update cycles
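The core idea behind both CAU and CNU, updating nodes strictly one after another, can be sketched in a few lines of Python. This is only a conceptual illustration and does not reflect the actual CAU or CNU interfaces; `rolling_update` and the `update_node` callback are hypothetical names, with the callback standing in for draining, patching, and restarting a node.

```python
from typing import Callable, Iterable


def rolling_update(nodes: Iterable[str],
                   update_node: Callable[[str], bool]) -> list[str]:
    """Update nodes sequentially and stop at the first failure.

    update_node is a hypothetical callback that drains, patches, and
    restarts one node, returning True on success.
    """
    updated = []
    for node in nodes:
        # Only one node is out of service at any point in time.
        if not update_node(node):
            # Stop the run so the remaining nodes keep serving workloads.
            break
        updated.append(node)
    return updated
```

Stopping at the first failure is a design choice of this sketch: it keeps the remaining nodes untouched so that the cluster retains as much capacity as possible while the problem is investigated.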
Admission Control
The term is somewhat misleading, but VMware also uses it to describe the underlying functionality. The purpose of admission control is to ensure sufficient cluster capacity in the event of a node failure or maintenance. To this end, minimum reserves for CPU, RAM, and GPU can be configured. Cluster utilization can also be monitored.
Admission control can be operated in two modes:
- Soft enforcement (a warning is sent, but the current action is not interrupted)
- Hard enforcement (the current action is aborted as soon as the thresholds are exceeded)
Admission control applies, for example, in the following cases:
- Adding/removing a virtual machine
- Powering on/off a virtual machine
- Resizing a virtual machine
- Adding/removing a cluster node
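How the two enforcement modes differ can be shown with a small Python sketch. It is a simplified model under stated assumptions, not the Windows Server implementation: the names `Capacity`, `Mode`, `AdmissionError`, and `admit` are hypothetical, and the reserve is modeled as a single fraction applied equally to CPU, RAM, and GPU.

```python
from dataclasses import dataclass, fields
from enum import Enum


class Mode(Enum):
    SOFT = "soft"   # warn only
    HARD = "hard"   # abort the action


@dataclass
class Capacity:
    cpu_cores: int
    ram_gb: int
    gpus: int


class AdmissionError(Exception):
    """Raised in hard mode when an action would consume the reserve."""


def admit(total: Capacity, used: Capacity, requested: Capacity,
          reserve_fraction: float, mode: Mode) -> list[str]:
    """Check a request against the configured reserve.

    Returns a list of warnings (empty if the request fits). In hard mode,
    any violation raises AdmissionError instead.
    """
    violations = []
    for field in fields(total):
        name = field.name
        # Capacity that may be used without touching the reserve.
        limit = getattr(total, name) * (1 - reserve_fraction)
        after = getattr(used, name) + getattr(requested, name)
        if after > limit:
            violations.append(f"{name}: {after} exceeds limit {limit:g}")
    if violations and mode is Mode.HARD:
        raise AdmissionError("; ".join(violations))
    return violations
```

In soft mode the caller receives the warnings and proceeds, for example when powering on a virtual machine. In hard mode the same request is rejected before the action takes effect.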