Change NC2 bare-metal node type while the cluster is running

During long car races, drivers normally enter the pit to change to fresh tires, fill up on fuel and so on. Metaphorically speaking, Nutanix Cloud Clusters (NC2) on AWS can change the entire car without the need for a pit stop. The driver just keeps driving while the bare metal is replaced underneath, as if nothing had happened, except that the car now has more power. All the benefits, none of the downtime. And it can be done with a single operation through the Nutanix MCM portal.

Introduction

In this example we swap out i3.metal nodes for more powerful i4i.metal nodes while the cluster is running. The starting point is a cluster with three i3.metal nodes and the end state is the same cluster, but now with three i4i.metal nodes. The change is seamless for the workloads running on top of NC2. Apart from a few packets dropped during the network change, they experience no disruption.

Starting point

We start out with a plain NC2 on AWS cluster with three i3.metal nodes. In addition to the basic cluster components we have also opted to deploy Prism Central and Flow overlay networking.

Multiple VMs are running on the NC2 cluster. To monitor their health we start a continuous ping whose statistics can be evaluated after the cluster nodes have been replaced.
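The ping monitoring can also be scripted. Below is a minimal Python sketch, assuming a Linux or macOS ping binary and its usual summary line. The helper names are our own for illustration and are not part of any Nutanix tooling.

```python
import re
import subprocess

# Matches the summary line printed by ping, for example:
#   "10 packets transmitted, 9 received, 10% packet loss, time 9012ms"
# (macOS prints "9 packets received", which the optional group covers.)
SUMMARY_RE = re.compile(r"(\d+) packets transmitted, (\d+) (?:packets )?received")


def parse_ping_summary(output: str) -> tuple[int, int]:
    """Return (transmitted, received) from ping's summary output."""
    match = SUMMARY_RE.search(output)
    if match is None:
        raise ValueError("no ping summary line found")
    return int(match.group(1)), int(match.group(2))


def ping_host(host: str, count: int) -> tuple[int, int]:
    """Send `count` echo requests, one per second, and return the counts."""
    # -c: number of echo requests, -i: interval in seconds (1 is the default)
    result = subprocess.run(
        ["ping", "-c", str(count), "-i", "1", host],
        capture_output=True,
        text=True,
        check=False,  # ping exits non-zero when packets are lost
    )
    return parse_ping_summary(result.stdout)

# Hypothetical usage, with a placeholder address for the test VM:
#   sent, received = ping_host("10.0.0.5", 3600)
```

Running `ping_host` for roughly the expected length of the change gives the same sent and received counts that an interactive ping prints when it is interrupted.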

On the networking side we have set up No-NAT networking with Flow, so the subnet the test VM is attached to is also reachable from the native AWS VPC. In this case we are pinging the NC2 test VM from an EC2 instance in a separate AWS VPC.

Updating the Cluster Capacity settings in the Nutanix MCM portal

The management portal for NC2 allows for easy updates to the cluster capacity and configuration. We highlight our cluster and navigate to Cluster Capacity where the node types and the number of nodes can be changed.

A few clicks later we have added three new i4i.metal nodes to our original configuration of three i3.metal nodes and set the number of i3.metal nodes to zero. This adds three nodes of a more powerful type to the cluster. Once all data has been transferred over, the old cluster nodes are removed and billing for them stops.

The task has now been accepted by the MCM portal and is being executed in the background. VMs running on NC2 continue working as usual, unaware of the big changes to the system which are under way.

EC2 bare-metal changes as seen from the AWS console

In the AWS console it is possible to follow the whole process: first the i4i.metal nodes are added, then the i3.metal and i4i.metal nodes run side by side while the cluster shifts over to the new nodes, and finally the i3.metal nodes are decommissioned.
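The same view can be obtained programmatically. The sketch below counts running bare-metal instances per type from the response of the EC2 `describe_instances` call in boto3. It assumes that boto3 is installed and credentials are configured. The function names and the region argument are ours, chosen for illustration.

```python
from collections import Counter


def count_running_by_type(response: dict) -> Counter:
    """Count running instances per instance type in a describe_instances response."""
    counts = Counter()
    for reservation in response.get("Reservations", []):
        for instance in reservation.get("Instances", []):
            if instance["State"]["Name"] == "running":
                counts[instance["InstanceType"]] += 1
    return counts


def fetch_metal_instances(region: str) -> dict:
    """Fetch i3.metal and i4i.metal instances in one region (single page only)."""
    import boto3  # imported lazily so the counting helper works without boto3

    ec2 = boto3.client("ec2", region_name=region)
    return ec2.describe_instances(
        Filters=[{"Name": "instance-type", "Values": ["i3.metal", "i4i.metal"]}]
    )

# Hypothetical usage:
#   print(count_running_by_type(fetch_metal_instances("us-east-1")))
```

Polling this during the change should show three i3.metal nodes at the start, both types in the middle, and three i4i.metal nodes at the end. For a handful of nodes a single call is enough, so the sketch does not paginate.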

From a networking perspective, the ENI that acts as the active point of North-South communication for the cluster, and that is therefore the target in the AWS VPC route table, sat on an i3.metal host before the change. After the migration it has shifted to an ENI on one of the new i4i.metal hosts.
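One way to confirm the shift is to snapshot the route table before and after the change and compare the ENI targets. The sketch below works on the response of the EC2 `describe_route_tables` call in boto3. The helper names are our own, and the VPC ID in the usage comment is a placeholder.

```python
def eni_routes(response: dict) -> dict:
    """Map (route table ID, destination CIDR) to ENI ID for routes that target an ENI."""
    routes = {}
    for table in response.get("RouteTables", []):
        for route in table.get("Routes", []):
            eni = route.get("NetworkInterfaceId")
            cidr = route.get("DestinationCidrBlock")
            if eni and cidr:
                routes[(table["RouteTableId"], cidr)] = eni
    return routes


def moved_routes(before: dict, after: dict) -> dict:
    """Return routes present in both snapshots whose target ENI changed."""
    return {
        key: (before[key], after[key])
        for key in before.keys() & after.keys()
        if before[key] != after[key]
    }

# Hypothetical usage with boto3 (the VPC ID is a placeholder):
#   import boto3
#   ec2 = boto3.client("ec2")
#   vpc_filter = [{"Name": "vpc-id", "Values": ["vpc-0123456789abcdef0"]}]
#   before = eni_routes(ec2.describe_route_tables(Filters=vpc_filter))
#   ... run the node swap ...
#   after = eni_routes(ec2.describe_route_tables(Filters=vpc_filter))
#   print(moved_routes(before, after))
```

Any route listed by `moved_routes` still points at the same destination subnet but now uses a different ENI, which is what we would expect to see for the Flow subnets once the i3.metal hosts are gone.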

Result

The node swap has completed without a hitch and without any need for input from the IT administrator managing the NC2 cluster (apart from initiating the change at the start, that is). The entire process took just under one hour to complete.

More importantly, the workloads have experienced just a blip in network connectivity and no downtime or reboots.

The Linux VM which we started pinging at the beginning of the blog post is still up and the pings are still getting through. Throughout the hour-long change a total of 3381 pings were sent. Of these, 26 were lost (about 0.8%, shown as 0% loss in the ping statistics).
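As a quick check of these numbers, the arithmetic can be worked through in a few lines of Python. The one-second interval is an assumption (it is the default for ping); the counts are the ones reported above.

```python
sent = 3381        # pings sent during the change (from this post)
lost = 26          # pings lost (from this post)
interval_s = 1.0   # ASSUMPTION: ping's default one-second interval

received = sent - lost
loss_pct = 100 * lost / sent
duration_min, duration_s = divmod(int(sent * interval_s), 60)
outage_s = lost * interval_s  # cumulative, not necessarily one continuous gap

print(f"received: {received}")
print(f"loss: {loss_pct:.2f}%")
print(f"test duration: {duration_min} min {duration_s} s")
print(f"cumulative unreachable time: about {outage_s:.0f} s")
```

This prints 3355 received pings, a loss of 0.77%, a test duration of 56 min 21 s (consistent with the change taking just under an hour) and, under the one-second assumption, about 26 seconds of cumulative lost connectivity.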

The uptime command on the Linux host also shows that no VM reboots were involved.
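The uptime check can be automated as well. A minimal sketch, assuming a Linux guest where `/proc/uptime` is available; the function names are ours for illustration.

```python
def read_uptime_seconds(path: str = "/proc/uptime") -> float:
    """Return seconds since boot, the first field of /proc/uptime on Linux."""
    with open(path) as handle:
        return float(handle.read().split()[0])


def rebooted_since(uptime_s: float, seconds_since_start: float) -> bool:
    """True if the VM booted after the change was started."""
    return uptime_s < seconds_since_start

# Hypothetical usage on the test VM, one hour after starting the change:
#   print(rebooted_since(read_uptime_seconds(), 3600))
```

If the VM's uptime is longer than the time elapsed since the node swap was started, the VM cannot have rebooted during the change.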

Conclusion

This was an example showing how quick and easy it is to migrate from one EC2 bare-metal instance type to another when using Nutanix Cloud Clusters on AWS. For more information, please visit the Nutanix Cloud Clusters page below:

https://www.nutanix.com/products/nutanix-cloud-clusters