Disaster Recovery with NC2 on AWS Pilot Light cluster + Amazon S3

Configuring Disaster Recovery (DR) from on-premises to Nutanix Cloud Clusters (NC2) on AWS is straightforward and can provide significant benefits for those responsible for ensuring Business Continuity. With a Pilot Light cluster this can even be done for workloads with two different levels of Recovery Time Objective (RTO), while saving on costs by minimizing the NC2 cluster size during normal operations.

This type of DR configuration leverages Nutanix MST, or Multicloud Snapshot Technology.

Why use a Pilot Light cluster?

Workloads can all be important and vital to recover in case disaster strikes, but may have different requirements when it comes to how quickly they must be back online again. For applications and services with short recovery time windows, simply configure replication from on-premises to a small NC2 on AWS cluster using the Nutanix built-in DR tools.

For workloads which are fine with a slightly longer RTO, save on running costs by replicating them to Amazon S3. In case of disaster those workloads can be recovered from S3 to the NC2 on AWS cluster. This brings benefits in the ability to keep the NC2 on AWS cluster at a small and cost-efficient size during normal conditions, with the ability to scale out the cluster if there is need to recover workloads from S3.

Zero-compute option

It is additionally possible to configure a Zero-compute DR strategy with NC2 on AWS, in which there is no NC2 cluster and the on-premises environment replicates data directly into Amazon S3. That approach will be covered in a separate blog post.

Zero-compute DR offers even lower costs than Pilot Light since there is no need to deploy an NC2 cluster unless there is a disaster. However, it increases RTO because the NC2 infrastructure needs to be provisioned first. Therefore, Zero-compute is cheaper from a running-costs perspective, but may incur higher costs to the business during a DR event due to the longer time required to deploy and configure the recovery cluster.

Solution architecture

The diagram below shows how it is possible to replicate both to Amazon S3 and to the NC2 on AWS cluster, each with a different Recovery Point Objective (RPO).

Versions used

For this deployment we use the following versions:

NC2 on AWS Prism Central: pc.2024.3
NC2 on AWS AOS: 7.0
On-premises Prism Central: pc.2024.2
On-premises AOS: 6.8.1 (CE 2.1)

Deployment steps

Configuring a Pilot Light cluster can be done in just a few hours. For the purpose of this blog post we have gone through the following steps:

Step 1: Deploy an NC2 on AWS cluster and enable DR

This can be done with a few clicks in the NC2 portal. Either deploy into an existing AWS VPC or create new resources during cluster deployment. If an existing VPC with connectivity to on-prem is used it will make setting things up even faster since routing is already configured.

Navigate to the Prism Central settings menu and enable Disaster Recovery as shown:

Step 2: Add connectivity between on-prem and AWS (if not already configured)

Configure Direct Connect or an S2S VPN to link the on-premises Nutanix cluster environment with the AWS VPC holding NC2 on AWS.

Step 3: Create an S3 bucket to store replicated data

Create an S3 bucket which can be accessed from the NC2 on AWS environment. Make sure to give it a name starting with “nutanix-clusters”. You can append a suffix to keep bucket names unique. In this example we use “nutanix-clusters-pilotlight-jwr” as the bucket name.
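
If you prefer to script the bucket creation, the following is a minimal boto3 sketch using the example bucket name and region from this post (adjust to your own environment):

import boto3

s3 = boto3.client("s3", region_name="ap-northeast-1")

# Bucket names for MST must start with "nutanix-clusters"
s3.create_bucket(
    Bucket="nutanix-clusters-pilotlight-jwr",
    CreateBucketConfiguration={"LocationConstraint": "ap-northeast-1"},
)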

If not already created, create a user with full access to this bucket and make note of the Access and Secret Access keys as they are used in the next step. An example of a policy which gives full access to our bucket can be found below:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "FullAccessForMstToSpecificS3Bucket",
            "Effect": "Allow",
            "Action": [
                "s3:*"
            ],
            "Resource": [
                "arn:aws:s3:::nutanix-clusters-pilotlight-jwr",
                "arn:aws:s3:::nutanix-clusters-pilotlight-jwr/*"
            ]
        }
    ]
}

Step 4: Configure MST on Prism Central in NC2 on AWS

MST will deploy a number of Virtual Machines and this deployment requires us to provide three IP addresses on the AWS native VPC CIDR range.

If NC2 was deployed with AWS native networking, simply provide IP addresses outside of the DHCP range on one of the existing networks, or alternatively add a new AWS native subnet through the Prism Central console.

If NC2 was deployed with Prism Central and Flow Virtual Networking (FVN), the easiest way to do this is to shrink the automatically created “PC-Net” DHCP range to make some space for MST. We shrink ours so that it ends at “10.110.33.200” instead of “10.110.33.253”. Make sure that no VMs are using addresses in the space you are freeing up.

SSH or open a console window to Prism Central using the “nutanix” user and enter the below command to start the deployment:

clustermgmt-cli mst deploy \
-b <BUCKET NAME> \
-r <AWS REGION> \
-i <IP#1>,<IP#2>,<IP#3> \
-s <NC2 NETWORK NAME> \
-t <BUCKET TYPE (aws or ntx_oss)>

For our example we use:

clustermgmt-cli mst deploy \
-b nutanix-clusters-pilotlight-jwr \
-r ap-northeast-1 \
-i 10.110.33.201,10.110.33.202,10.110.33.203 \
-s PC-Net \
-t aws

The MST deployment begins as shown. Enter the AWS access key and secret access key for the user with rights to the S3 bucket when prompted.

Step 5: Link the on-premises Prism Central with the NC2 Prism Central

In Prism Central (on-premises or NC2), navigate to “Administration” and “Availability Zones” and add the other Prism Central instance as a remote physical site.

Step 6: Create a Protection Policy and DR Plan

Through the Prism Central console, navigate to Data Protection and Protection Policies. From here, create a new Protection Policy with the on-premises cluster as source and the Amazon S3 bucket as target.

A replication schedule can be set, allowing either Linear snapshots (a maximum of 36 snapshots) or Roll-up snapshots, where data can be retained for up to one month.

Next we create a Recovery Plan. Optionally, to facilitate grouping of those VMs which are destined for Amazon S3 and those destined to be replicated directly to the NC2 cluster, we create two categories to place them in. Below is the category for the VM group which is to be replicated to S3.
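
If you prefer to create such categories programmatically instead of through the Prism Central UI, a rough sketch against the Prism Central v3 categories API could look like the following. The category name "DR-Target" and its values are illustrative placeholders, not the exact names used in this post:

import requests
from requests.auth import HTTPBasicAuth

# Placeholder Prism Central address and credentials
PC = "https://<prism-central-ip>:9440/api/nutanix/v3"
AUTH = HTTPBasicAuth("admin", "<password>")

# Create a category key and two values used to group VMs by replication target
requests.put(f"{PC}/categories/DR-Target", json={"name": "DR-Target"}, auth=AUTH, verify=False)
for value in ("ReplicateToS3", "ReplicateToNC2"):
    requests.put(f"{PC}/categories/DR-Target/{value}", json={"value": value}, auth=AUTH, verify=False)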

When we then go to Data Protection and Recovery Plans we reference our category to get the correct VMs replicated to the correct location, as follows:

Verifying that data is replicated to S3 + Failing over

Now the replication schedule is set and the target VMs are selected through categories to ensure they are replicated. By checking our S3 bucket we can verify that snapshots have been sent to S3 from the on-premises cluster.
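
A quick way to double-check from the command line is to list the bucket contents with boto3; a minimal sketch using the example bucket name from this post:

import boto3

s3 = boto3.client("s3", region_name="ap-northeast-1")

# List up to 20 objects to confirm that recovery point data is arriving in the bucket
resp = s3.list_objects_v2(Bucket="nutanix-clusters-pilotlight-jwr", MaxKeys=20)
for obj in resp.get("Contents", []):
    print(obj["Key"], obj["Size"], obj["LastModified"])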

We can now perform a failover, or just a test failover, to verify that it is possible to recover from the replicated data. Since we are failing over from data in S3, make sure to select the target cluster to which the VMs should be recovered. This is done by clicking “+ Add Target Clusters” and selecting the NC2 on AWS cluster as per the below:

You will get a warning highlighting that there may be a need to expand the NC2 on AWS cluster with extra nodes to handle the influx of VMs being restored from S3. If required, simply expand the cluster by adding nodes through the NC2 management console.

After failing over we can verify that our VMs are up and running in NC2 on AWS without issues.

Wrap-up

This has been a guide and demonstration of how easy it is to configure Disaster Recovery using a Pilot Light cluster with NC2 on AWS. Please refer to the links below for more detail in the documentation. Hopefully this has been easy to follow. Please feel free to reach out to your nearest Nutanix representative for more information and guidance for your specific use case and environment.

Links and references

Migrate from EC2 to NC2 and perform DR failover between two AWS regions without changing IP addresses

Some customers require cross-region disaster recovery (DR) in AWS but often face the challenge of changing IP addresses during a failover to another region. This change can disrupt external access to services running on instances covered by the DR policy.

Nutanix Cloud Clusters (NC2) address this challenge with built-in DR functionality that ensures IP addresses remain consistent during failovers between regions. Bonus: it is possible to over-provision CPU on NC2, so it may actually be possible to save on compute costs after the migration. However, NC2 can only retain IP addresses during DR for workloads which are already residing on a Nutanix cluster, so we have to migrate the EC2 instances first.

The free Nutanix Move migration tool can migrate workloads from Amazon EC2 to Nutanix Cloud Clusters, though it currently lacks support for IP retention. In this blog we use some creative workarounds to maintain consistent IPs throughout the migration, although note that MAC addresses will change. NC2 then retains the IPs during regional failovers as part of the Nutanix DR solution. Let’s dive in!

Software versions used during testing

AOS: 6.10
Prism Central: pc.2024.2

Solution architecture

In this case we have an on-premises datacenter (DC), an AWS VPC with EC2 instances which we want to have covered by a DR policy (so they can fail over to another region without changing IP addresses) and finally two NC2 clusters – one in the primary region and one in a separate region for DR purposes. We use Tokyo (ap-northeast-1) as the primary AWS region and Osaka (ap-northeast-3) as the disaster recovery location in this example.

Overview of the solution architecture.

We illustrate connectivity to the on-premises environment using Direct Connect. Note that all testing of this solution has been done with an S2S VPN attached to the TGWs in each region. Peering between the two DR locations can be done using cross-region VPC peering or by peering two Transit Gateways (TGWs).

Networking

To retain the IP addresses of the migrated EC2 instances we use Flow Virtual Networking (FVN) to create no-NAT overlay networks on NC2 with a CIDR range which matches that of the original subnet the EC2 instances are connected to. We create this overlay network on both the Tokyo and Osaka NC2 clusters so that we can later fail over the VMs and have them attach to a network with the same CIDR range.

To ensure the on-premises DC is able to access the VMs we modify the route tables throughout the process. That way we maintain routes which point to the migrated EC2 instances, regardless of where they are located.

Automating the VPC and TGW creation with Terraform / OpenTofu

In case you’d like to try this out yourself, the Terraform / OpenTofu templates for deploying the VPCs, TGWs and the routing between them can be found on GitHub here:

https://github.com/jonas-werner/aws-dual-region-peered-tgw-with-vpcs-for-nc2-dr

When to use which peering type for inter-region connectivity

Generally, VPC peering is better for lower-traffic scenarios or when the simplicity of direct peering is desirable, despite the slightly higher data transfer costs it incurs.

TGW peering is more cost-efficient for high-traffic environments or complex architectures, where the centralized management and lower data transfer rates outweigh the additional costs per TGW attachment. Note that although traffic passes through two TGWs, the peering attachment doesn’t incur data transfer charges, so the data is only charged once (roughly USD 0.02 per GiB in the Tokyo region).

The workaround for IP retention

As mentioned in the introduction, while the Nutanix Move virtual appliance is very capable at migrating from EC2 to NC2, it is at time of writing unable to retain the IP addresses of the workloads it migrates. To work around this we do the following:

  1. Prior to the migration we use AWS Systems Manager (SSM) to run a PowerShell or Bash script on the instances to be migrated. The script captures the EC2 instance ID, hostname and local IP address and stores that information in a DynamoDB table for later use (a Python sketch of this capture step follows the list)
  2. We perform the migration from EC2 to NC2 using Nutanix Move. The IP address will change, even though we migrate the instance to a Flow Virtual Networking (FVN) overlay network with the same CIDR range as the original subnet the EC2 instances are connected to.
  3. We run a Python script which uses the DynamoDB information as a reference and then connects to the Nutanix Prism Central API. It removes the existing network interface and adds a new one with the correct (original) IP address to each of the migrated instances.
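
For illustration, below is a rough Python equivalent of what such a capture script might do when run on each instance. The actual scripts in the GitHub repository are PowerShell/Bash, and the table name "ec2-ip-inventory" is a placeholder:

import socket
import urllib.request

import boto3

# Get an IMDSv2 token so we can read instance metadata
token_req = urllib.request.Request(
    "http://169.254.169.254/latest/api/token",
    method="PUT",
    headers={"X-aws-ec2-metadata-token-ttl-seconds": "300"},
)
token = urllib.request.urlopen(token_req).read().decode()

def imds(path):
    req = urllib.request.Request(
        f"http://169.254.169.254/latest/meta-data/{path}",
        headers={"X-aws-ec2-metadata-token": token},
    )
    return urllib.request.urlopen(req).read().decode()

item = {
    "Name": {"S": imds("tags/instance/Name")},  # requires "Allow tags in instance metadata"
    "InstanceId": {"S": imds("instance-id")},
    "Hostname": {"S": socket.gethostname()},
    "PrivateIp": {"S": imds("local-ipv4")},
}

# The instance role needs dynamodb:PutItem on this table (see the policy in Step 1)
boto3.client("dynamodb", region_name="ap-northeast-1").put_item(
    TableName="ec2-ip-inventory", Item=item
)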

Once the instances are migrated to NC2 the process of configuring DR between Tokyo and Osaka regions is trivial.

You can download the PowerShell and Python scripts used in this blog on GitHub:

https://github.com/jonas-werner/EC2-to-NC2-with-IP-preservation/tree/main

Step 1: Capture IP addresses of the EC2 instances

In this step we prepare for the migration from EC2 to NC2. The initial state of the network, the workloads and the route to the 172.30.1.0/24 network is as illustrated by the red line in the below diagram.

To start with we gather information about the EC2 instances and store that info in DynamoDB. In the name of efficiency we use the SSM Run command to execute the PowerShell script. This makes it easy to get this done in a single go (or two “goes” if we do both Windows and Linux workloads). We test with a single Windows 2019 Server EC2 instance in this example.

First create a DynamoDB table to hold this information. Nothing special is required for this table as long as it is accessible to SSM as it runs the script. We need to give the IAM role used when running SSM commands access to DynamoDB of course, so we add the following permissions to the standard SSM role:

        {
            "Sid": "AllowDynamoDBAccess",
            "Effect": "Allow",
            "Action": [
                "dynamodb:PutItem"
            ],
            "Resource": "arn:aws:dynamodb:<your-aws-region>:<your-aws-account>:table/<dynamodb-table-name>"
        },

In order to collect the instance name from metadata we enable the “Allow tags in instance metadata” setting in the EC2 console. This is important as we will use the “Name” tag in EC2 as the key to look up the instance in NC2 post-migration. Of course other methods could be used – most obviously the name of the instance itself. However in this case we use the EC2 name tag, as this is also how the VM will show up in NC2 post-migration.

Then we execute the script on our instances through the SSM Run Command as follows:

After execution we can see an entry for our Windows EC2 instance showing its instance ID, hostname and IP address: 172.30.1.34. This is the IP we want to retain.

That’s all for this section. Next we perform the migration from EC2 to NC2.

Step 2: Migrating from EC2 to NC2

For the migration we have deployed Nutanix Move on the NC2 cluster. We have also created an FVN overlay no-NAT network with the same CIDR as the subnet the EC2 instance is connected to, although the DHCP range is set to avoid any of the IPs currently used by instances on that subnet.

Move has the NC2 cluster and the AWS environment added in as migration sources / targets.

It complains about “missing permissions” but this is because we have only given it permission to migrate FROM EC2, not TO EC2. Since that is all we want to do, this is fine. Please refer to the Move manual for details on the AWS IAM policy required depending on your use case.

Post migration the VM will have a different IP address (taken from the DHCP range on the FVN subnet it is connected to).

We use a Python script in the next section to revert the IP address to what it was while running as an EC2 instance.

Step 3: Revert the IP address to match what the EC2 instance had originally

Now we execute a Python script which will look up the instance name in DynamoDB, match it with the VM name in NC2 and then remove and re-create the network interface using the Prism Central API. The new interface will have the original IP configured as a static address.

The script can be downloaded from GitHub here. Please export the Prism Central username and password as environment variables to run the script. Also update the Prism Central IP and the subnet name to match those used in your environment, as well as the AWS region and the DynamoDB table name.
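
To give an idea of the approach (this is a rough sketch, not the script from the repository), the NIC replacement against the Prism Central v3 API could look roughly as follows. The VM name, DynamoDB table name and key schema, and subnet UUID are placeholders:

import os

import boto3
import requests
from requests.auth import HTTPBasicAuth

PC = "https://<prism-central-ip>:9440/api/nutanix/v3"
AUTH = HTTPBasicAuth(os.environ["PC_USER"], os.environ["PC_PASS"])
SUBNET_UUID = "<uuid-of-the-fvn-overlay-subnet>"
VM_NAME = "win2019-demo"  # the EC2 Name tag / NC2 VM name

# Look up the original IP captured before migration (partition key assumed to be the Name tag)
table = boto3.resource("dynamodb", region_name="ap-northeast-1").Table("ec2-ip-inventory")
original_ip = table.get_item(Key={"Name": VM_NAME})["Item"]["PrivateIp"]

# Find the migrated VM in Prism Central by name
resp = requests.post(
    f"{PC}/vms/list",
    json={"kind": "vm", "filter": f"vm_name=={VM_NAME}"},
    auth=AUTH,
    verify=False,
)
vm = resp.json()["entities"][0]
uuid = vm["metadata"]["uuid"]

# Replace the NIC list with a single NIC carrying the original IP as a static assignment
spec = vm["spec"]
spec["resources"]["nic_list"] = [{
    "subnet_reference": {"kind": "subnet", "uuid": SUBNET_UUID},
    "ip_endpoint_list": [{"ip": original_ip}],
}]
requests.put(
    f"{PC}/vms/{uuid}",
    json={"metadata": vm["metadata"], "spec": spec},
    auth=AUTH,
    verify=False,
)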

After running the script we can now verify that the VM has received its original IP address. Note that since we have replaced the NIC in this process, the IP is the same as before, but the MAC address will have changed.

Routing after migration from EC2 to NC2 in Tokyo

Now that the VM exists on NC2 we need to update our routing to ensure that traffic is directed to this VM and not the original EC2 instance (which has now been shut down by Move after the migration).

To do this we disconnect the EC2 VPC from the TGW and add the subnet as a static route in the TGW, this time pointing to the NC2 VPC rather than the EC2 VPC. The subnet should already exist as an “Allowed prefix” on the DXGW, so that part can be left as-is.
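
For reference, the same routing change could be scripted with boto3; a minimal sketch with placeholder attachment and route table IDs:

import boto3

ec2 = boto3.client("ec2", region_name="ap-northeast-1")

# Detach the now-empty EC2 VPC from the Tokyo TGW (placeholder attachment ID)
ec2.delete_transit_gateway_vpc_attachment(
    TransitGatewayAttachmentId="tgw-attach-EC2VPC"
)

# Add a static route for the migrated subnet pointing at the NC2 VPC attachment instead
ec2.create_transit_gateway_route(
    DestinationCidrBlock="172.30.1.0/24",
    TransitGatewayRouteTableId="tgw-rtb-TOKYO",
    TransitGatewayAttachmentId="tgw-attach-NC2VPC",
)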

The attachments highlighted in red show the active route to the 172.30.1.0/24 subnet, which has now been changed to point to the NC2 VPC. Since the subnet is an FVN no-NAT subnet it will show up in the NC2 VPC route table.

Wrapping up the migration part

Now our EC2 instance has been migrated to NC2. Its IP address is intact and since we have updated the routing between AWS and the on-premises DC, the on-prem users can access the migrated instances just like they normally would. In fact, apart from the maintenance window for the migration, VM power-up on NC2 and the routing switch, they are unlikely to notice that their former EC2 instance is now running on another platform.

Configuring DR between the Tokyo and Osaka NC2 clusters

At this point all we have left to do is set up the DR configuration between the two Nutanix clusters in Tokyo and Osaka. Since DR is a built-in component, this is very straightforward. We link the two Prism Central instances and of course create the FVN overlay network on the Osaka side as well, to ensure we can keep the same CIDR range after failover.

After enabling Disaster Recovery we can easily create the DR plan through Prism Central.

When we create the DR plan we set the VMs on the Tokyo network to fail over to their equivalent network on the Osaka DR site.

Finally we proceed to fail over our VM from NC2 in Tokyo to NC2 in Osaka.

After failing over we can confirm that the VM is not only powered up in Osaka, but that it has also retained the IP address, as expected.

Updating the routing to point to Osaka rather than Tokyo

After failing over from Tokyo to Osaka we need to also update the routes pointing to the 172.30.1.0/24 network by modifying the TGW in Osaka. From a diagram perspective it looks as follows.

On the Osaka TGW we create a static route to the 172.30.0.0/16 network pointing to the Osaka NC2 VPC.

We also update the static route on the Tokyo TGW, which currently points to the local NC2 VPC, and instead set it to point to the peering connection to Osaka.
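
These two changes could also be made with boto3; a sketch with placeholder route table and attachment IDs:

import boto3

# Osaka: static route for the workload range via the Osaka NC2 VPC attachment
ec2_osaka = boto3.client("ec2", region_name="ap-northeast-3")
ec2_osaka.create_transit_gateway_route(
    DestinationCidrBlock="172.30.0.0/16",
    TransitGatewayRouteTableId="tgw-rtb-OSAKA",
    TransitGatewayAttachmentId="tgw-attach-OSAKA-NC2VPC",
)

# Tokyo: repoint the existing static route at the TGW peering attachment towards Osaka
ec2_tokyo = boto3.client("ec2", region_name="ap-northeast-1")
ec2_tokyo.replace_transit_gateway_route(
    DestinationCidrBlock="172.30.1.0/24",
    TransitGatewayRouteTableId="tgw-rtb-TOKYO",
    TransitGatewayAttachmentId="tgw-attach-PEERING-OSAKA",
)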

Results and wrap-up

With these routing changes implemented it is now possible for users in the on-premises DC to continue to access the very same VMs with the very same IP addresses. This is possible even after those workloads have been migrated from EC2 to NC2 in Tokyo and then failed over with a DR plan from the Tokyo region to the Osaka region.

Hope that was helpful! Please reach out to your local Nutanix representative for discussions if this type of solution is of interest. Thank you for reading!

Links

L2 extension from on-prem VMware cluster to Nutanix Cloud Clusters (NC2) on AWS using Cisco CSR1000V as on-prem VTEP

This guide has been written in cooperation with Steve Loh, Advisory Solutions Architect, Network & Security at Nutanix in Singapore, gentleman extraordinaire and master of the Cisco Dark Arts.

Introduction

Organizations frequently choose to extend on-premises networks to the cloud as part of retaining connectivity between virtual machines (VMs) during migrations. This is called L2 or Layer 2 extension. With an extended network, VMs connected to the same subnet can communicate as usual even when some of them have been migrated to the cloud and others are still on-premises awaiting migration.

L2 extension is sometimes used on a more permanent basis when the same network segment contains VMs to be migrated as well as appliances which must remain on-prem. If the organization doesn’t want to change the IP addresses of any of these entities and still needs part of them migrated, they might choose to maintain L2 extension of some subnets indefinitely. However, this is generally considered a risk and is not recommended.

Architecture diagram

This is a graphical representation of the network stretch from on-prem VMware to Nutanix Cloud Clusters in AWS which is covered in this blog post. Other configurations are also possible, especially when Nutanix is also deployed on-prem.

Video of process

For those who prefer watching a demo video of this process rather than reading the blog, please refer to the video below.

Limitations and considerations

While L2 extension is a useful tool for migrations, please keep the following points in mind when deciding whether or not to utilize this feature:

  • L2 extension will complicate routing and thereby also complicate troubleshooting in case there are issues
  • L2 extension may introduce additional network latency. This takes the shape of trombone routing, where traffic needs to go from the cloud via a gateway on the on-premises side and then back to the cloud again. Nutanix Flow Policy Based Routing (PBR) may be used to alleviate this.
  • If routing goes via a single default gateway (either on-premises or in the cloud) and the network connecting the on-premises DC with the cloud environment has downtime, the VMs on the side without the default gateway will no longer be able to communicate with networks other than their own
  • The Nutanix L2 extension gateway appliance does not support redundant configurations at time of writing
  • A Nutanix L2 extension gateway can support up to five network extensions
  • A single Prism Central instance can support up to five Nutanix L2 extension gateways
  • Always keep MTU sizes in mind when configuring L2 extension to avoid unnecessary packet fragmentation. MTU settings can be configured when extending a network.
  • Even though VMs are connected to an extended network, if the current version of Move is used for migration, VM IP addresses will not be retained. A Move release in the near future will enable IP retention when migrating from VMware ESXi to NC2 on AWS.

Types of L2 extension

Various methods of extending a network exist. This blog will cover one of these cases – on-premises VMware with Cisco CSR1000V as VTEP to Nutanix Cloud Clusters (NC2) on AWS with a Nutanix gateway appliance.

On-premises VLANs or Flow overlay networks can be extended using Nutanix GW appliances to Flow overlay networks in NC2. It is also possible to extend using Nutanix VPN appliances in case the network underlay is not secure (directly over the internet). Finally, when the on-premises environment does not run Nutanix, using a virtual or physical router with VXLAN and VTEP capabilities is possible. This blog focuses on the last use case as it is a commonly discussed topic among customers considering NC2 and L2 extension.

Routing

When extending networks, the default gateway location and routing to and from VMs on an extended network become important to understand. Customers used to extending networks with VMware HCX or NSX Autonomous Edge are familiar with the concept of trombone routing over a default gateway located on-premises. With Nutanix it is possible to use Policy Based Routing (PBR) to control how routing should be performed for different networks. In many ways, Nutanix PBR offers more detailed tuning of routes than can be done with VMware MON in HCX.

A key difference between extending networks with HCX vs. with Nutanix is that with HCX the extended network appears as a single entity, although it exists on both sides (on-prem and cloud). The default gateway would generally be on-prem and both DHCP and DNS traffic would be handled by on-prem network entities, regardless of whether a VM was on-prem or in the cloud.

For L2 extension with Nutanix, things work a bit differently. The on-prem network will be manually recreated as an overlay network on NC2, with the same default gateway as the on-prem network but with a different DHCP range. The on-prem and cloud networks are then connected through a Nutanix GW appliance deployed as a VM in Prism Central.

Prerequisites

This guide assumes that an on-premises VMware vSphere 7.x environment and an NC2 cluster running AOS 6.8.1 are already present. It also assumes that the on-prem and NC2 environments are connected over an L3 routed network, like a site-to-site (S2S) VPN or Direct Connect (DX). The two environments have full IP reachability and can ping each other.

In this case we are extending a VLAN which has been configured as a port group with VLAN tagging on standard vSwitches on the ESXi hosts.

Overview of steps

  1. Recreate the network to be extended using Flow overlay networking on NC2
  2. Deploy the Nutanix gateway appliance on NC2
  3. Deploy the Cisco CSR1000V in the on-premises VMware cluster
  4. Enable Promiscuous mode and Forged transmits on the vSwitch portgroup of the VLAN to be extended
  5. Register the CSR1000V routable IP address as a Remote Gateway in NC2
  6. Configure the CSR1000V IP leg on the VLAN to be extended and set up VNI and other settings required to extend the network
  7. Extend the network from NC2 with the CSR1000V as the on-prem VTEP

In the demo video included in this post we also perform a migration of a VM from on-prem to NC2 and verify connectivity with ICMP.

Step 1: Recreate the network to be extended using Flow overlay networking on NC2

Access Prism Central, navigate to Network and Security. Select “Create VPC” to add a new Nutanix Flow VPC to hold the subnet we want to extend.

After the VPC has been created, go to Subnets and create a subnet in the newly created VPC with settings matching the network which will be extended. In this case we create “VPC-C” and a subnet with a CIDR of “10.42.3.0/24”. The default gateway is configured to be the same as on-prem but the DHCP range is set to not overlap.

Step 2: Deploy the Nutanix gateway appliance on NC2

In Prism Central, navigate to “Network and Security” and select “Connectivity”. From here click “Create Gateway” and select “Local” to create the gateway on the NC2 side.

Add a name, set the Gateway Attachment to VPC and select the VPC which was just created in the previous steps.

For Gateway Service, select VTEP and allow NC2 to automatically assign a Floating IP from the AWS VPC CIDR. This IP will be accessible from the on-prem environment and will be used as the anchor point for the L2 extension when configuring the CSR1000V in a later step.

Note that a new VM (the gateway appliance) will automatically be deployed on the NC2 cluster by Prism Central.

Step 3: Deploy the Cisco CSR1000V in the on-premises VMware cluster

Deploy the Cisco appliance on the VMware cluster and select the first network interface to connect to the routable underlay network (IP connectivity to NC2) and the second and third interfaces to connect into the port group of the VLAN to be extended.

In this case VL420 is the routable underlay network and VL423 is the VLAN to be extended.

Configure an IP address on the management network and make note of it as we will use it in a subsequent step. In this case we use “10.42.0.106” as the management IP address on VL420.

Step 4: Enable Promiscuous mode and Forged transmits on the vSwitch portgroup of the VLAN to be extended

In order to pass traffic from the on-premises network to the NC2 network it is necessary to enable Promiscuous mode and Forged transmits on the vSwitch port group on the VMware cluster. In this case we are using standard vSwitches.

Step 5: Register the CSR1000V routable IP address as a Remote Gateway in NC2

We need to create a representation of the on-premises CSR1000V appliance in NC2 so that we can refer to it when extending the network in a later step. This is essentially just a matter of adding in the IP address as a “Remote Gateway”.

In Prism Central, navigate to “Network and Security”, select “Connectivity” and “Create Gateway”. Select “Remote” and add the details for the on-prem Cisco appliance. Give it a name, select “VTEP” as the “Gateway Service” and add the IP address. Let the VxLAN port remain as “4789”.

Step 6: Configure the CSR1000V IP leg on the VLAN to be extended and set up VNI and other settings required to extend the network

In this step we do the configuration of the CSR1000V over SSH. To enable SSH you may need to do the following through the console of the virtual appliance first.

Enable SSH

en
conf t
username cisco password Password1!
line vty 0 4
login local
transport input ssh
end

Now that SSH is available, connect to the appliance over SSH as the user “cisco” with password “Password1!” and complete the remaining configuration.

Configure interface in VLAN to be extended

Configure the second interface to be a leg into the VLAN to be extended by giving it an IP address and enabling the interface.

CSR1000V3#
CSR1000V3#conf t
Enter configuration commands, one per line.  End with CNTL/Z.
CSR1000V3(config)#
CSR1000V3(config)#int gi2
CSR1000V3(config-if)#
CSR1000V3(config-if)#ip address 10.42.3.2 255.255.255.0
CSR1000V3(config-if)#no shut

Configure NVE 1 and the VNI to be used + link with the NC2 gateway IP

For ingress-replication, use the Floating IP from the AWS VPC CIDR range which was assigned to the gateway appliance after deploying on NC2.

CSR1000V3#
CSR1000V3#conf term
Enter configuration commands, one per line.  End with CNTL/Z.
CSR1000V3(config)#
CSR1000V3(config)#
CSR1000V3(config)#int NVE 1
CSR1000V3(config-if)#no shutdown
CSR1000V3(config-if)#source-interface gigabitEthernet 1
CSR1000V3(config-if)#member vni 4300
CSR1000V3(config-if-nve-vni)#ingress-replication 10.70.177.192
CSR1000V3(config-if-nve-vni)#
CSR1000V3(config-if-nve-vni)#end
CSR1000V3(config-if)#end

Configure bridge domain and L2E via the 3rd interface (Gi3)

CSR1000V3#
CSR1000V3#conf t
Enter configuration commands, one per line. End with CNTL/Z.
CSR1000V3(config)#bridge-domain 12
CSR1000V3(config-bdomain)#member VNI 4300
CSR1000V3(config-bdomain)#member gigabitEthernet 3 service-instance 1
CSR1000V3(config-bdomain-efp)#end

CSR1000V3#conf t
Enter configuration commands, one per line. End with CNTL/Z.
CSR1000V3(config)#interface gigabitEthernet 3
CSR1000V3(config-if)#no shut

CSR1000V3(config-if)#
CSR1000V3(config-if)#service instance 1 ethernet
CSR1000V3(config-if-srv)# encapsulation untagged
CSR1000V3(config-if-srv)#no shut
CSR1000V3(config-if-srv)#end
CSR1000V3#
CSR1000V3#

Configure a default route

We set the default route to go via the default gateway of the underlay network we use to connect to AWS and NC2 on AWS.

CSR1000V3#
CSR1000V3#conf t
Enter configuration commands, one per line.  End with CNTL/Z.
CSR1000V3(config)#
CSR1000V3(config)#
CSR1000V3(config)#ip route 0.0.0.0 0.0.0.0 10.42.0.1
CSR1000V3(config)#
CSR1000V3(config)#
CSR1000V3(config)#end

Step 7: Extend the network from NC2 with the CSR1000V as the on-prem VTEP

In Prism Central, navigate to “Network and Security” and select “Subnets”. From here we will create the network extension.

Click the subnet to be extended and then select “Extend”

If we were extending a network between two Nutanix clusters we would select “Across Availability Zones” but in this case we extend from a pure VMware environment and a 3rd party (Cisco) appliance, so we select “To A Third-Party Data Center”.

Select the CSR1000V as the remote VTEP gateway and the VPC which contains the subnet we want to extend.

For “Local IP Address”, enter a free IP in the subnet to be extended. This will be the leg the Nutanix gateway appliance extends into that subnet.

Also set the VxLAN VNI we used when configuring the CSR1000V earlier. Adjust the MTU to ensure there is no packet fragmentation. The default is 1392 but this will vary depending on the connectivity provider between on-prem and the AWS cloud.

Configuration of the Layer 2 extension is now complete. In the following section we verify connectivity from on-prem to the cloud using the newly extended network.

Verifying connectivity

As a first step it’s good to ping the IP address in the extended network which is assigned to the Nutnix Gateway appliance. We can verify that the “Local IP Address” is configured for the gateway VM by navigating to “VM” in Prism Central and checking that “10.42.3.3” shows up as an IP for the gateway

Pinging this IP address from a VM in the on-premises VMware environment shows that it can reach the gateway appliance across the extended network without problems. The local VM has an IP of 10.42.3.99, in the same VLAN which has been extended. Latency is about 5ms across the S2S VPN + L2 extension.

As a next step I have migrated a VM using Move from the on-prem VMware environment to NC2. After migration it was assigned an IP of “10.42.3.106” as per the screenshot below

Pinging this VM from on-prem also works just fine

Conclusion

That concludes the walkthrough of configuring L2 extension from on-premises VMware with Cisco CSR1000V to Nutanix Cloud Clusters (NC2). Hopefully this was helpful.

For reference, please refer to the below links to Nutanix and Cisco pages about L2 extension

Adding an AWS Elastic Load Balancer (ELB) to direct traffic between web servers on Nutanix Cloud Clusters (NC2)

One of the great things about running a virtualized infrastructure on NC2 on AWS is the close proximity to all the cloud native services. One of those highly useful services is the AWS ELB or Elastic Load Balancer.

In this post we show how to get floating IP addresses from the VPC in which NC2 is located and to assign them to a number of web servers running as VMs on NC2. Then we create a Load Balancer target group and finally we create an Application Load Balancer (ALB) and attach it to the target group.

Architecture

In this blog post we only cover the deployment of the web servers and the load balancer; however, Route 53 can also be leveraged for DNS and AWS WAF for security and DDoS protection, as illustrated below.

Preparing some web servers

We first deploy a few test web servers. In this case we use the wonderfully named Jammy Jellyfish edition of Ubuntu Server as a cloud image. Feel free to download the image from here:

https://cloud-images.ubuntu.com/releases/22.04/release-20240319/ubuntu-22.04-server-cloudimg-amd64.img

Prism Central makes it very easy to deploy multiple VMs in one go.

When deploying, make sure to use the cloud-init script to set the password and any other settings needed to make the VM usable after first boot:

#cloud-config
password: Password1
chpasswd: { expire: False }
ssh_pwauth: True

Now we have our VMs ready. I’ve installed the Apache web server to serve pages (apt install apache2) but feel free to use whatever works best in your setup.

I used the following index.html code to show the server ID

ubuntu@ubuntu:~$ cat /var/www/html/index.html
<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>Welcome</title>
    <style>
        body {
            display: flex;
            justify-content: center;
            align-items: center;
            height: 100vh;
            margin: 0;
            font-family: Arial, sans-serif;
            background-color: #f0f0f0;
        }
        .message-container {
            max-width: 300px;
            padding: 20px;
            text-align: center;
            background-color: #ffffff;
            border: 1px solid #cccccc;
            border-radius: 10px;
            box-shadow: 0 0 10px rgba(0, 0, 0, 0.1);
        }
    </style>
</head>
<body>
    <div class="message-container">
        <script>
            // Write a static identifier for this server (changed per VM)
            document.write("Welcome to Server 1");
        </script>
    </div>
</body>
</html>

Configure floating IP addresses for the web servers

Next we request a few floating IP addresses from the VPC which NC2 is deployed into and then assign one IP each to our web servers. Luckily, Prism Central also makes this very easy to do – in a single step! From “Compute & Storage”, select “Floating IPs” under “Network & Security”:

After assigning the IPs we can see that each VM has both an internal and an external IP address, where the “external” IP comes from the AWS VPC CIDR range.

Creating an AWS LB target group

Next we create a target group for the AWS ALB which we deploy in the next step. The LB target group simply contains the Floating IP addresses we just assigned, along with a health check for the web root of these web servers.

We create an “IP address” target group and set the health check to HTTP on port 80 with the path “/” (the web root).

We then add the Floating IP addresses we created previously
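
For those who prefer the AWS SDK over the console, a minimal boto3 sketch of the target group and target registration, with placeholder VPC ID and floating IPs:

import boto3

elbv2 = boto3.client("elbv2", region_name="ap-northeast-1")

# Target group of type "ip" so the NC2 floating IPs can be registered directly
tg = elbv2.create_target_group(
    Name="nc2-web-servers",
    Protocol="HTTP",
    Port=80,
    VpcId="vpc-0123456789abcdef0",  # the VPC in which NC2 is deployed
    TargetType="ip",
    HealthCheckProtocol="HTTP",
    HealthCheckPath="/",
)
tg_arn = tg["TargetGroups"][0]["TargetGroupArn"]

# Register the floating IPs assigned to the web server VMs
elbv2.register_targets(
    TargetGroupArn=tg_arn,
    Targets=[{"Id": ip, "Port": 80} for ip in ("10.70.1.11", "10.70.1.12", "10.70.1.13")],
)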

Create the Application Load Balancer (ALB)

Finally we create the ALB and assign it to our target group
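
A corresponding boto3 sketch for the ALB and its listener, with placeholder subnet, security group and target group ARN values:

import boto3

elbv2 = boto3.client("elbv2", region_name="ap-northeast-1")

# Internet-facing ALB in two public subnets of the VPC hosting NC2
alb = elbv2.create_load_balancer(
    Name="nc2-web-alb",
    Subnets=["subnet-aaaa1111", "subnet-bbbb2222"],
    SecurityGroups=["sg-0123456789abcdef0"],
    Scheme="internet-facing",
    Type="application",
)
alb_arn = alb["LoadBalancers"][0]["LoadBalancerArn"]

# Forward HTTP traffic on port 80 to the target group created in the previous step
elbv2.create_listener(
    LoadBalancerArn=alb_arn,
    Protocol="HTTP",
    Port=80,
    DefaultActions=[{"Type": "forward", "TargetGroupArn": "<target-group-arn>"}],
)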

Test of the ALB

Now we’re all done and can access our ALB to see if it shows balances between the NC2 VMs as expected.

We’re getting a different web server each time we refresh the page – all good!

That’s all for now. Hope that was helpful and thank you for reading!