Implementing Microsoft Windows Server Failover Clustering (WSFC) For SQL Server AlwaysOn in the AWS Cloud

 

Step By Step Guide To Implementing Microsoft Windows Server Failover Clustering (WSFC) For SQL Server AlwaysOn in the AWS Cloud




Windows Server Failover Clustering (WSFC) is used to increase application availability. WSFC provides infrastructure features that complement the high availability and disaster recovery scenarios supported in the AWS Cloud.



The architecture described here is resilient to both server node and storage node failures, which increases application uptime. It consists of the following components:



➢ A highly available architecture that spans two Availability Zones.

➢ A virtual private cloud (VPC) configured with public and private subnets, according to AWS best practices, to provide you with your own virtual network on AWS.

➢ Two Amazon EC2 instances running Microsoft Windows with SQL Server. These instances are installed as nodes in a WSFC cluster.

➢ AWS Directory Service with a managed directory. The Amazon EC2 Windows instances that host this architecture’s nodes are joined to the same Active Directory domain.









Best Practices And Recommendations To Set Up Windows Failover Clustering On EC2


Assign IP addresses ---- Each cluster node should have one elastic network interface with two private IP addresses on its subnet: a primary IP address and a cluster IP address. The operating system (OS) should have the NIC configured for DHCP. Do not configure the cluster IP address statically on the NIC, because the cluster IP is handled virtually in Failover Cluster Manager. The NIC can be configured with a static IP only if it uses just the primary IP of eth0.
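
As a rough sketch of the IP assignment described above, the cluster IP can be added to a node's network interface as a secondary private IP from the AWS CLI, and the OS-side DHCP setting can then be checked with the built-in networking cmdlets. The interface ID and the address are placeholders, and the sketch assumes the AWS CLI is installed and configured with credentials that are allowed to modify the interface.

```powershell
# Assumption: the AWS CLI is installed and configured.
# The ENI ID and the IP address below are placeholders, not values from this guide.
aws ec2 assign-private-ip-addresses `
    --network-interface-id eni-0123456789abcdef0 `
    --private-ip-addresses 10.0.1.101

# On the node itself: confirm that the NIC is still using DHCP.
Get-NetIPInterface -AddressFamily IPv4 |
    Select-Object InterfaceAlias, Dhcp
```

Repeat the assignment for the second node, using an address from that node's own subnet.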



Cluster Quorum ---- A cluster generally requires more than half of its nodes to be running, which is known as having a quorum. Quorum is designed to prevent split-brain scenarios, which can happen when the network is partitioned and subsets of nodes cannot communicate with each other.
Without quorum, multiple servers in different subsets could try to host the same resource group and write to the same disk at the same time. To prevent this, the cluster forces the cluster service to stop in one of the subsets of nodes, which ensures that there is only one true owner of a particular resource group. Once the stopped nodes can communicate with the main group of nodes again, they automatically rejoin the cluster and start their cluster service.


➢ If you have two nodes, a witness is required.
➢ If you have three or four nodes, a witness is strongly recommended.
➢ If you have Internet access, use a cloud witness.
➢ If you're in an IT environment with other machines and file shares, use a file share witness.
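
The "more than half" rule can be worked through with a small calculation. The helper below is hypothetical (it is not a cluster cmdlet) and assumes simple static vote counting, where each node and the witness carry one vote; it ignores the dynamic quorum adjustments that newer Windows Server versions apply.

```powershell
# Hypothetical helper for illustration only; not part of the FailoverClusters module.
# Assumes one vote per node plus one optional witness vote (static counting).
function Get-RequiredVotes {
    param(
        [int]$Nodes,
        [bool]$Witness
    )
    $totalVotes = $Nodes + [int]$Witness
    # A majority is strictly more than half of all votes.
    $required = [int][math]::Floor([double]$totalVotes / 2) + 1
    [pscustomobject]@{
        TotalVotes        = $totalVotes
        VotesRequired     = $required
        FailuresTolerated = $totalVotes - $required
    }
}

Get-RequiredVotes -Nodes 2 -Witness $false
Get-RequiredVotes -Nodes 2 -Witness $true
```

Under these assumptions, two nodes without a witness cannot lose any vote and keep a majority, while two nodes plus a witness can lose one. This is why a witness is required for a two-node cluster.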




DNS registration ---- All nodes should be in the same domain, and the logged-on user should have rights to create objects in Active Directory. Alternatively, the computer objects of both nodes should be delegated permission to do so.




OS Software Level ---- All cluster nodes should be at the same update level.




Allowed Communication ---- The following communication must be allowed between the nodes that will be part of the cluster before the cluster is created.





Note: These requirements may vary by OS version; the above applies to Windows Server 2019.







Step By Step How To Set Up a Cluster




There are two cases when creating the cluster, depending on whether the Windows NIC uses DHCP or a static IP address.

Before creating a failover cluster, go to Server Manager and install the Failover Clustering feature on each node.
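
The same installation can also be done from an elevated PowerShell session instead of Server Manager. This is a minimal sketch, to be run on each node that will join the cluster:

```powershell
# Install the Failover Clustering feature along with its management tools
# (Failover Cluster Manager and the FailoverClusters PowerShell module).
Install-WindowsFeature -Name Failover-Clustering -IncludeManagementTools

# Confirm that the feature now shows as installed.
Get-WindowsFeature -Name Failover-Clustering
```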




Case 1: Windows OS NIC is in DHCP mode





Open Run, type cluadmin.msc, and press Enter.

Once Failover Cluster Manager opens, click Create Cluster in the pane on the right-hand side.







Select the nodes







Run all the tests to validate the cluster











Once cluster validation has passed, we can move forward. It is mandatory to fix all reported issues before configuring the cluster; the cluster validation report contains the complete details.
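
Validation can also be run from PowerShell, which is handy when repeating it after each fix. The node names below are placeholders for your own servers:

```powershell
# Placeholder node names; replace them with the two EC2 instances.
$nodes = 'SQLNODE1', 'SQLNODE2'

# Runs the validation tests and writes a validation report file,
# the same kind of report the wizard produces.
Test-Cluster -Node $nodes
```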



Once all issues are fixed, move to the next screen and provide the Windows failover cluster name.









Clear the "Add all eligible storage to the cluster" check box and click Next.
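
For reference, a PowerShell sketch roughly equivalent to the wizard steps so far might look like the following. The cluster and node names are placeholders; `-NoStorage` corresponds to clearing the storage check box.

```powershell
# Placeholder names; no static address is given, matching the DHCP case.
New-Cluster -Name 'SQLCLUSTER' -Node 'SQLNODE1', 'SQLNODE2' -NoStorage
```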










Once the cluster is created, click the cluster; you will see that it is currently offline.


Change the cluster IP addresses to the secondary private IP addresses that were allocated to each node in the AWS console. Make sure that each IP address belongs to the subnet of its respective node.
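
The same change can be sketched in PowerShell. The resource name, address, and subnet mask below are assumptions for illustration; list the resources first to find the actual names of the IP Address resources in your cluster.

```powershell
# List all cluster resources and note the ones of type "IP Address".
Get-ClusterResource

# Assumed resource name and placeholder address; repeat for the second
# IP Address resource with the secondary IP from the other node's subnet.
Get-ClusterResource -Name 'Cluster IP Address' |
    Set-ClusterParameter -Multiple @{
        Address    = '10.0.1.101'
        SubnetMask = '255.255.255.0'
        EnableDhcp = 0
    }
```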


















Once both IP addresses are configured, right-click the cluster and click Bring Online.
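
From PowerShell, bringing the cluster online might look like the sketch below. It assumes the core resources sit in the default group, which is named "Cluster Group".

```powershell
# Bring the core cluster group (cluster name and IP addresses) online.
Start-ClusterGroup -Name 'Cluster Group'

# Check the state of the resources in that group.
Get-ClusterGroup -Name 'Cluster Group' | Get-ClusterResource
```

In a cluster that spans two subnets, it is normal for only the IP address in the current owner node's subnet to be online.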









Case 2: Static IP configured on the Windows NIC




All steps remain the same, except that the screen that asks for the cluster name also shows an IP address field for each node's subnet. Enter the IP addresses that the Windows failover cluster will use for the respective nodes there, and skip the final steps of Case 1 in which we set the cluster IP addresses manually.
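
In PowerShell terms, this case corresponds to supplying the cluster addresses at creation time. The names and addresses below are placeholders, with one address per node subnet:

```powershell
# Placeholder names and addresses; one static address per subnet.
New-Cluster -Name 'SQLCLUSTER' -Node 'SQLNODE1', 'SQLNODE2' `
    -StaticAddress '10.0.1.101', '10.0.2.101' -NoStorage
```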






Quorum configuration:

The quorum model in Windows Server is flexible.
The following table lists the three quorum configuration options that are available in the Configure Cluster Quorum Wizard.




Depending on the quorum configuration option that you choose and your specific settings, the cluster will be configured in one of the following quorum modes:








The following table provides additional information and considerations about the quorum witness types.





Here we configure a file share witness:


In Failover Cluster Manager, right-click the cluster’s root node, go to More Actions, and click Configure Cluster Quorum Settings.










Click Next on the introductory screen.

If you choose Advanced quorum configuration, you can change which nodes have quorum votes. That is not part of this article, but you’ll eventually get to the same screen that I’m taking you to. For my directions, choose Select the quorum witness and click Next.





Choose Configure a file share witness and click Next.







You can enter the path of the file share manually or browse to it.









The next screen summarizes your proposed changes. Review them and click Next when ready. The cluster will attempt to apply your settings.
The final screen shows the results of your action.


Note: Create the shared folder in a highly available location, not on the cluster nodes themselves.
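
The witness can also be configured from PowerShell. The share path below is a placeholder; the sketch assumes that the share already exists and that the cluster has permission to write to it.

```powershell
# Placeholder UNC path; the share should live outside the cluster nodes.
Set-ClusterQuorum -FileShareWitness '\\FILESERVER\ClusterWitness'

# Show the resulting quorum configuration.
Get-ClusterQuorum
```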




WSFC Cluster Testing



Before putting the installed and configured cluster into production, you should test your deployment and familiarize yourself with the cluster’s behavior during a high-availability automatic failover or a disaster recovery event.

(1) Open PowerShell and run the following command:

Test-Cluster

Or revalidate the cluster again from the Failover Cluster console

(2) Restart the active node or stop its cluster service, and check that the cluster roles move to the passive node.

(3) Check the cluster events for any errors.

(4) Check the Windows Event Viewer.
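
The checks above can be sketched in PowerShell as follows. The node name and log folder are placeholders. Note that `Move-ClusterGroup` performs a planned move rather than simulating a failure, so it complements step (2) but does not replace it.

```powershell
# Current node and group state before the test.
Get-ClusterNode
Get-ClusterGroup

# Planned move of the core group to the other node (placeholder node name).
Move-ClusterGroup -Name 'Cluster Group' -Node 'SQLNODE2'

# Confirm the new owner node.
Get-ClusterGroup

# Recent failover clustering events from the Windows event log.
Get-WinEvent -LogName 'Microsoft-Windows-FailoverClustering/Operational' -MaxEvents 20

# Collect cluster logs for the last 15 minutes; the folder must already exist.
Get-ClusterLog -Destination 'C:\Temp' -TimeSpan 15
```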









Posted By : Kamlesh Gaur




