Migrating VMs from VMware Cloud on AWS (VMC) to Nutanix Cloud Clusters on AWS (NC2)

Summary

In this blog post we explore two ways to use Nutanix Move 5.3 to migrate Virtual Machines from an existing VMware Cloud on AWS (VMC) environment to Nutanix Cloud Clusters on AWS (NC2). This is done while preserving both IP and MAC addresses of the VMs being migrated.

The most straight forward method is to deploy NC2 into the Connected VPC. This is a VPC which is attached at time of deployment of the VMware Cloud on AWS environment and is owned by the customer. Alternatively, we can deploy NC2 into a completely separate VPC and connect to the VMware Cloud on AWS cluster through a VMware Transit Connect (VTGW).

Architecture overview

The two methods are illustrated below. Method 1 is recommended due to the ease of setup, simple networking and no data transfer charges. However, care need to be taken to ensure there is no overlap with any existing resources deployed into the Connected VPC. For example by creating new private subnets in the Connected VPC specifically for the NC2 deployment.

Method 2 covers migrating via a VMware Transit Connect (VTGW). Although it has additional routing considerations, this is also a fully viable option. In this example we peer the VTGW with a normal customer-controlled AWS Transit Gateway (TGW). Note that with Method 2 the VTGW can also connect directly to a VPC without the need for a TGW, but this will limit the routing options for the customer.

It’s important to keep in mind that both options can migrate VMs from VMC without changing IP or MAC addresses. Neither option require L2 Extension of user VM networks. This underlines the ease of which a migration like this can be done. There are of course some caveats. Refer to the VM networking section below for more detail.

Method 1: Migrate from VMC to NC2 deployed into the VMC Connected VPC
Method 2: Migrate VMs via VTGW into a separate or new VPC

VM networking

The whole migration can be done without L2 extension of VM networks. On the VMware Cloud on AWS side Virtual Machines in VMC are connected to overlay networks created with NSX-T using the VMC management console. These are represented by the “10.3.0.0/24” network in this example. The same CIDR ranges can be created as overlay networks by using Nutanix Flow on the NC2 cluster. Thereby, when VMs are migrated from VMC to NC2, they don’t need to change their IP or MAC addresses.

Note that if L2 Extension is not used, there is no communication between the overlay networks in NC2 and the overlay networks in VMC. Therefore, plan the migration so that VMs which need to communicate are moved together.

Also note that Flow does not advertise the overlay networks into the VPC route table. As long as VMware Cloud on AWS is attached to the connected VPC, the routes for the VMs will point to the active ENI created during the VMC cluster deployment. Destroying the VMC cluster will remove these ENIs and the corresponding routes from the Connected VPC route table.

Migration tool: Move

The Nutanix Move migration tool has with the recent 5.3 release added support for migrations from VMware Cloud on AWS. In this example, Move is deployed into the NC2 cluster. Both the VMC and the NC2 environments have been registered with Move and the inventories of both show up and are available for migration. More details in the Move deployment section below.

Method 1: Deploy NC2 into the Connected VPC

If there is enough space to deploy NC2 into the already existing Connected VPC in the customer account, this is the easiest and most straight-forward option. Connectivity and routing between the Connected VPC and the VMware Cloud on AWS environment is already configured as part of the VMC deployment. Do make sure that the CIDR ranges of any existing subnets are sufficient for deployment of NC2 and that there aren’t already resources deployed into those subnets which could interfere with the NC2 components. If the VPC CIDR range has space for new subnets, consider creating new private subnets to hold the NC2 deployment.

  • Benefits
    • No need to create new VPC and subnets
    • VPC is already connected to VMC and routing is configured
    • Data transfer is free of charge
    • High link / data transfer speed

  • Drawbacks
    • VPC may already be fully populated with resources
    • VPC may not have the correct CIDR ranges for NC2

Method 1: Steps to deploy

Simply sign up for NC2 or start a trial and deploy through the NC2 deployment wizard. Select the latest version – 6.8, to get Flow overlay networking and the centralized management through Prism Central included.

Note that NC2 can only be deployed into private AWS subnets and that internet connectivity need to be present. If no direct internet connectivity is available, proxy support is also available through the deployment wizard.

VMware Cloud on AWS automatically updates the default route table in the connected VPC with the routes to vCenter, ESXi and the user networks. However, if NC2 is deployed into a subnet which doesn’t use the default route table those routes won’t be present. Ensure the subnet NC2 is deployed into is updated with the routes to the VMware Cloud on AWS environment. Particularly the management subnet which holds vCenter and the ESXi hosts. Also, if necessary, update the security group on the active VMC ENI to allow access from the NC2 subnet.

After the NC2 cluster is deployed, follow the steps further down in this article to open the VMC firewall for vCenter and ESXi, deploy Move 5.3 on top of NC2 and register both the NC2 cluster and the VMC vCenter instance.

Method 2: Deploy NC2 into a separate VPC and migrate through a VTGW

If deploying NC2 into the Connected VPC is not possible or desirable, there is another option available. VMware Cloud on AWS supports creating a VMware Transit Connect (VTGW). The VTGW is a VMware-controlled Transit Gateway (TGW) – basically a regional cloud router. The VTGW can in turn be attached either directly to another VPC or peered with a customer controlled TGW. The TGW can then be attached to one or several VPCs of the customers choosing. Do keep cross-AZ and cross-region charges in mind when planning the architecture so that they can be minimized or avoided.

  • Benefits
    • Once set up, migration is straight forward
    • The customer can use any VPC for the NC2 deployment, including a new one
    • High link / data transfer speed

  • Drawbacks
    • Routing requires additional steps and knowledge
    • Data transfer is charged (roughly 2 cents/Gb)
    • Although this example use a TGW and a VTGW, data transfer charges do not end up being doubled. The peering attachment does not incur data transfer charges unless they go across AZs or regions.
    • (V)TGW attachments are charged (roughly 7 cents/h in ap-northeast-1)

Method 2: Steps to deploy: Create the VTGW

Unless already present, go to the AWS console and deploy a Transit Gateway (TGW) in the same region as VMC. Then, in the VMware Cloud on AWS management console, go to “SDDC groups” and deploy a new VTGW. Once deployed, navigate to the “External TGW” tab, click “Add TGW” and enter the account number and the ID of the customer TGW to connect to as well as the regions to use.

In the “Routes” box, enter the CIDR range of the VPC which NC2 is to be deployed into. In this example, “10.90.0.0/16”.

This will advertise the NC2 VPC CIDR to VMware Cloud on AWS and also create a peering attachment invitation from the VMware AWS account to the customer AWS account.

Method 2: Steps to deploy: Configure the TGW

The invitation to add the peering attachment can be accepted through “Transit Gateway Attachments” in the AWS console in the customer account.

While here, take the opportunity to add an attachment to the VPC into which NC2 will be deployed.

Accepting the peering invitation in the AWS console

In the customer AWS account, navigate to the Transit Gateway route table section, select the route table for the TGW peered with VMC and add the routes for the VMC networks. In this case “10.2.0.0/16”, “10.3.0.0/16” and “10.4.0.0/16”. Note that these are added as Static routes.

In addition we have the “10.90.0.0/16” network added via the VPC attachment. There is no need to add static routes for this network as it is propagated automatically.

Method 2: Steps to deploy: Configure the routes to the VMC cluster in the NC2 VPC

The final step for the routing is to add the routes for VMware Cloud on AWS into the route table(s) of the VPC which NC2 is to be deployed into. In our example, “10.2.0.0/16”, “10.3.0.0/16” and “10.4.0.0/16” are added as routes via the TGW attachment. “10.90.0.0/16” is our local network and there is a quad-zero route to the internet via a NAT GW.

This concludes the setup steps specific to Method 2. Please continue with the firewall settings, NC2 cluster deployment and Move installation below.

Firewall settings: Allow Move to access the VMC vCenter and ESXi hosts

Move requires access to the VMC vCenter instance and the ESXi hosts in order to migrate virtual machines. Through the VMware Cloud on AWS console, add a Management Gateway firewall rule to allow the NC2 VPC to access these resources.

  • Add the NC2 VPC CIDR range to an MGW inventory group
    • Navigate to “Networking & Security”
    • Click “Groups” under “Inventory”
    • Click “Management Groups” to edit the groups pertaining to the MGW
    • Add or modify a group and add the CIDR range of the VPC which NC2 is deployed into
      • To modify an existing group, click the 3-dot menu on the left of the group and select “Edit”
Adding the NC2 VPC CIDR range to an MGW inventory group
  • Add the MGW inventory group to the MGW firewall rules
    • Navigate to “Networking & Security”
    • Click “Gateway Firewall” under “Security”
    • Click “Management Gateway” to edit rules for the MGW
    • Update the rules for ESXi and vCenter by adding the MGW inventory group containing the NC2 VPC CIDR range with the action “Allow”
Adding the MGW inventory group to a MGW firewall rule allowing access to ESXi and vCenter from the NC2 VPC

Deploy the NC2 cluster

Sign up for NC2 or start a trial and deploy through the NC2 deployment wizard. Select the latest version – 6.8, to get Flow overlay networking and the centralized management through Prism Central included. When asked, select the VPC and the private subnets desired. In this case subnets in the “10.90.0.0/16” VPC.

After the NC2 cluster is deployed, deploy Move 5.3 on top of NC2 and register both the NC2 cluster and vCenter from the VMC environment.

Install Move and register NC2 and VMC

Download Move 5.3 from the download page: https://portal.nutanix.com/page/downloads?product=move

Follow the Move manual to deploy Move 5.3 on the NC2 cluster in AWS: https://portal.nutanix.com/page/documents/list?type=software&filterKey=software&filterVal=Move

VDDK upload: After Move is deployed, add the NC2 and VMC environments. After adding the VMC environment it will prompt for a VDDK file. This file can be downloaded from the VMware support site. The version used in this example is: “VMware-vix-disklib-7.0.3-19513565.x86_64”. Please use the Linux version.

Migrate the VMs to NC2

If IP retention is desired, use Flow in Prism Central to create an overlay VPC and subnet which matches the CIDR range of the NSX-T subnet in VMC from which the VMs will be migrated. In this example “10.3.0.0/24” is used.

Now the only thing remaining is to create a Migration Plan in Move where VMC is the source and NC2 is the target. Ensure to select the correct target network to ensure IP retention works as expected.

Wrap up

This has been an example of the steps required for migrating Virtual Machines from VMware Cloud on AWS (VMC) to Nutanix Cloud Clusters (NC2) without changing the IP or MAC addresses of the migrated VMs. For more information or for a demo, please reach out to your Nutanix representative or partner.

Additional resources

Quickly create VMs with vSphere and Terraform

This is a beginner-friendly intro to creating (and destroying!) VMs on VMware vSphere using Terraform. It includes code with plenty of comments to show what option does what.

Bonus: Want to run a script or execute PowerShell commands after VM creation? Perhaps download and install some software? No problems – the code includes a section on this as well (for Windows that is).

Download the code

The annotated TF files used in this example can be found on GitHub here:

Repository: https://github.com/jonas-werner/vsphere-terraform/

Direct download link: https://codeload.github.com/jonas-werner/vsphere-terraform/zip/refs/heads/main

Initialize Terraform

To start with we want to initialize Terraform and download any providers required to run our VM deployment. In this case we’re using the vSphere provider since we’re interacting with a VMware vCenter server.

Enter the directory with the downloaded TF files. Pick either the Windows or Linux example. After that, initialize Terraform with:

terraform init

Creating a plan

Start by updating the vSphere login information and other details to match your local vSphere environment. For example what DC, Resource Pool and template VM to use.

When that’s done, proceed to create plan based on the TF file(s) in the current directory by using:

terraform plan -out win.plan

In this case we use the name “win.plan” but any name is OK. The output will show what Terraform will do when the plan is applied. Among other things it will list the outputs requested for this particular .tf file. In this example we have asked for the VM names and the IP addresses as per the below:

Plan: 3 to add, 0 to change, 0 to destroy.

Changes to Outputs:
  + vmnames    = [
      + "windows-vm-001",
      + "windows-vm-002",
      + "windows-vm-003",
    ]
  + vmnameswip = [
      + (known after apply),
      + (known after apply),
      + (known after apply),
    ]

Apply the plan

Now when we have a plan created, apply it with:

terraform apply win.plan

After the deployment is complete it’ll show the updated information for the VM:s as follows:

Apply complete! Resources: 3 added, 0 changed, 0 destroyed.
Outputs:

vmnames = [
  "windows-vm-001",
  "windows-vm-002",
  "windows-vm-003",
]
vmnameswip = [
  "10.70.2.11",
  "10.70.2.10",
  "10.70.2.12",
]

In vCenter we can now see we’ve got three new VMs

Destroying the VMs

If these are test VMs you may want to remove them after testing is done. To clean up, simply issue:

terraform destroy

or, if you want to skip the confirmation prompt:

teraform destroy --auto-approve

Troubleshooting

“Failed to verify certificate”: Most lab environments wouldn’t use a proper TLS certificate for vCenter. In those cases the below error will show when running “terraform plan”:

│ Error: error setting up new vSphere SOAP client: Post "https://vcenter.lab.jonamiki.com/sdk": tls: failed to verify certificate: x509: certificate signed by unknown authority

│ with provider["registry.terraform.io/hashicorp/vsphere"],
│ on main_win.tf line 3, in provider "vsphere":
│ 3: provider "vsphere" {

The solution is simple – Just uncomment the line below in the vsphere provider section in the terraform .tf file:

allow_unverified_ssl = true

Customization of guest OS fails: One of the following errors may show up if the source VM don’t have VMware Tools installed. The fix, of course, is to install VMware tools in the template VM and then run Terraform again:

Customization of the guest operating system is not supported due to the given reason: Tools is not installed in the GuestOS. Please install the latest version of open-vm-tools or VMware Tools to enable GuestCustomization.
Error: error sending customization spec: Customization of the guest operating system is not supported

VMs boot to a black screen without starting the OS: This is usually down to using EFI vs. BIOS in the VM settings in the .tf file. Try switching to BIOS from EFI or vice versa if the newly created VMs won’t start up.

  firmware         = "efi" 
or
  firmware         = "bios"

Security considerations for VMware Cloud on AWS

VMware Cloud on AWS is configured to be highly secure from the get-go. However, there are additional add-on services both on the VMware Cloud on AWS side and the AWS native service side which can be of benefit for those looking to add additional layers of security to their cloud environments. This guide aims to provide an overview of those services as well as list links for further reading

Overview

Maintaining tight access control for workloads and ensuring they are on the latest patch revisions is central to running an efficient and secure IT environment. In this post we cover how to heighten security by leveraging the tools which comes with AWS Systems Manager and complementing AWS native services. This way IT administrators will be empowered with tools central to their quest of reaching higher security levels while simultaneously being able to reduce management overhead.

Defining good security practices

While security in both on-premises and cloud environments is a broad topic, there are a few security practices which generally makes good sense to implement regardless of where workloads reside. In this section we will cover a few points to define good security practices and in the latter half we will look at what tools can be leveraged to implement this in practice.

Restricting access

Adhering to the principle of least privilege by assigning access to workloads only to those who actually need to administer those workloads. It is easy to assign a single or a few administrative accounts with blanket access across all virtual machines in a VMware environment but it’s certainly not best practice. We will look at tools to help divide and limit access as required in this document.

Tracking access

After limiting access to those who require it for their respective roles it also makes sense to track who has accessed what virtual machine at what time and to centralize the logging of workload access. This makes it easier to back track in case of either a security breach or purely for troubleshooting purposes.

Limiting the attack surface

Virtual machines are generally accessed by administrators over RDP for Windows or SSH for Linux. These protocols are frequently the first ones that hackers will try to exploit. Normally closing these access ports would make managing the workloads remotely a challenge, if not impossible but when using AWS native services in conjunction with VMware Cloud on AWS the methods and options for secure remote systems management increase. For this we leverage AWS Systems Manager in combination with VMware Cloud on AWS.

Having insight into what software is installed where

While keeping operating systems patched and up to date help safeguard against known Common Vulnerabilities and Exposures (CVEs), old or non-approved software can also provide attackers with additional vectors through which to gain access to otherwise protected systems. Gaining clear insight into what is installed and what versions are running in an environment is usually a priority for the security minded IT administrator. Here AWS Systems Manager can be a powerful ally as it integrates well with the virtual machines running on top of VMware Cloud on AWS.

Addressing CVEs quickly

New bugs are found frequently and as they are made public they usually end up in the CVE database. Patching these security holes is of course vital in maintaining a secure environment. Likewise is scanning for attempts to exploit these vulnerabilities with an IDS / IPS system. VMware Cloud on AWS add-ons can be leveraged to assist with this.

Alerting when something happens

Hackers will often spend a significant time in a compromised environment while they investigate ways to broaden their attack by mapping out the network and additional systems to target. These days, assuming that an attack will succeed at some point is common. Instead of just hardening the external-facing systems the internal systems should get a similar treatment. Equally important is to get alerted at an early stage to limit any potential blast radius after a system has been compromised.

Centralized logging

The first indication that intrusion attempts are being made can often be found in the log files of the systems being accessed. Having single source of truth into which all logs and access attempts are sent is vital when it comes to maintaining a birds-eye-view of an environment. It also makes it a lot easier if data doesn’t have to be correlated across multiple locations and tools.

Ensuring firewalls are configured correctly and notify / revert if this changes

A properly configured firewall can make the difference between a secure system and one which can easily be hacked. The old “hard shell – soft core” approach is no longer relevant in today’s environments when everything is connected and therefore potentially an access vector for an aggressor. Keeping standards for firewall configurations and ensuring they are properly applied on an ongoing basis is a great help in making sure that internal as well as external systems are properly protected against attack.

Get notified and react quickly if a hackers gain a foothold in your environment (IDS / IPS)

It’s not a question of “if” but “when” an attack succeeds. Once an assailant has gained a foothold they are inside your environment. The next step is to scan and map out other vulnerable systems in the vicinity. Quick reaction times can limit or even prevent any further intrusion.

Guard against ransomware by having immutable backups of important data and virtual machines

Hoping for the best and planning for the worst case scenarios involve having backups and a Disaster Recovery (DR) plan for when data has become corrupted and / or encrypted by ransomware. Read more about VMware Cloud on AWS add-ons for DR and how AWS native tools can provide immutable backups of all virtual machines and their data later in this blog post.

Using VMware Cloud on AWS add-ons and AWS native services to enhance security

The above may be good points for enhancing security, but how can they be implemented in a VMware Cloud on AWS environment? The following three sections show how to do so in more practical detail:

  1. Leveraging AWS Systems Manager (SSM)
  2. Leveraging VMware Cloud on AWS add-ons
  3. Leveraging additional AWS native services

1. Leveraging AWS Systems Manager

AWS Systems Manager involves several services which in combination provide powerful management capabilities.

For Restricting access and Limiting the attack surface it is possible to use AWS SSM Session Manager for remote access rather than connecting to virtual machines over RDP or SSH. Thereby commonly used ports such as 3389 (RDP) and 22 (SSH) can be closed and will therefore significantly reduce the attack surface of the virtual machines.

Additionally, in many cases it is possible to avoid accessing virtual machines directly. AWS SSM Fleet Manager provides a convenient way of connecting to virtual machines by offering access to Windows registry, Windows logs, Windows file system as well as command line access to Windows PowerShell – all through the AWS Console. Graphical desktop access is of course also possible and can be done by forwarding the RDP port of the virtual machine to the local IT admins computer. For Linux machines the access can be had in the same way and a connection to the Linux shell prompt is readily available through the AWS Console.

For Addressing CVEs quickly AWS Systems Manager offers Patch Manager – a centralized way to ensure VMware Cloud on AWS virtual machines and EC2 instances have their latest security patches applied. Create patch baselines, set maintenance windows and tag virtual machines into groups to ensure continuous uptime while patching takes place.

When it comes to gaining Insight into what software is installed AWS SSM has us covered through AWS Systems Manager Inventory. The AWS SSM agent automatically pulls software inventory from managed instances, like VMware Cloud on AWS virtual machines, and displays it through the AWS Console. For even more detailed reports and prettier graphics SSM Inventory can be expanded by putting the inventory data into Amazon S3, query it with Amazon Athena and finally access the query result data from Amazon QuickSight. This way it is possible to generate comprehensive and sometimes even beautiful reports.

2. Leveraging VMware Cloud on AWS add-ons

While AWS native services can cover many angles of security and systems management, when it comes to keeping guard inside the VMware Cloud on AWS environment the VMware Cloud on AWS add-on service NSX Advanced Firewall is a powerful solution. NSX Advanced Firewall can be easily be deployed in a VMware Cloud on AWS environment and helps with the security practice of Getting notified and react quickly if a hackers find a foothold in your environment (IDS / IPS). It comes pre-loaded with CVEs to protect against, like SQL injection attacks. However, it can also alert on, and optionally block, both attacks and network scans in real-time.

Disaster recovery is another topic which goes hand-in-hand with good security best practices. Primarily from a standpoint of safeguarding against ransomware. This aligns directly with Guard against ransomware by having immutable backups of important data and virtual machines. For the purposes of DR with VMware Cloud on AWS there are two standout solutions available in VMware Cloud Disaster Recovery (VCDR) and VMware Site Recovery. Both are powerful in their own right and offer recovery from corrupted or encrypted data as well as actual DC disasters of course. Please also refer to AWS Backup in the section below for an AWS alternative to backing up virtual machine data to protect from ransomware and other data loss.

3. Leveraging other AWS native services

A wide variety of AWS native services can be brought to bear to boost security in both a VMware Cloud on AWS or on-premises VMware environment. One aspect is increased insight by centralizing logging and alerting for events and another aspect is backing up data as well as running commands. All these topics are covered in the section below.

CloudWatch and CloudWatch logs

The AWS CloudWatch agent can be installed both in VMware Cloud on AWS and on-premises VMware environments. Instructions for installation are listed here. Once installed, telemetry from virtual machines will be logged in CloudWatch to be used for diagnostics and alerting. Filters can be created where log files are searched for key words and events which can subsequently be used to trigger alerts. Read more about CloudWatch log filtering here: https://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/MonitoringLogData.html

CloudTrail and EventBridge

As covered in the Restricting Access section above, once remote access has been configured to be allowed only through SSM Session Manager, API calls done for virtual machine and instance access can be logged in CloudTrail for later review. This gives a central location for the security team to track who accessed what system at what time. Note that tracking only show when a system was accessed by whom and doesn’t extend to what was done at the OS level. Furthermore, EventBridge can be paired with AWS SNS to generate alerts whenever sessions are initiated, resumed or terminated.

SSM Run Command and AWS Config

When it comes to setting appropriate firewall rules and then ensuring they don’t change unless intended there are two solutions which come in handy. For the AWS native side there is AWS Config with which rules for approved configurations can be set and maintained. For the VMware Cloud on AWS side it is possible to use SSM Run Command to regularly check for and apply firewall settings with PowerShell and Linux shell scripts.

AWS Backup

While the VMware Cloud on AWS add-ons like VCDR and Site Recovery can help with Guard against ransomware by having immutable backups of important data and virtual machines, AWS Backup is a cloud-native option for safeguarding virtual machine data. AWS Backup supports both VMware on-premises and in VMware Cloud on AWS.

When registering a vSphere system with AWS Backup a virtual appliance is downloaded and deployed in the VMware environment. After linking with vCenter it is possible to create backup plans for the VMware environment through the AWS Console. AWS Backup take backups by creating a snapshot of the virtual machine and then backing up the snapshot to AWS. The snapshot is discarded after it has been copied to AWS. Initial backups are full and subsequent backups are incremental.

Conclusion

VMware Cloud on AWS is not a standalone solution but integrates well with a variety of AWS native services. In this blog post AWS Systems Manager and related services were leveraged to enhance security of virtual machines running on VMware Cloud on AWS by centralizing access controls and automating OS patch management.

Further reading

Configuring AWS Systems Manager for use with VMware Cloud on AWS

Maintaining tight access control for workloads and ensuring they are on the latest patch revisions is central to running an efficient and secure IT environment. In this post we cover how to heighten security by leveraging the tools which comes with AWS Systems Manager and complementing AWS native services. This way IT administrators will be empowered with tools central to their quest of reaching higher security levels while simultaneously being able to reduce management overhead

This post is also available on AWS Re-post here: https://repost.aws/articles/ARRpq-n–pQSKwWteAIdpktw

Integrating AWS Systems Manager (SSM) with VMware Cloud on AWS

Deploying SSM can help enhance security for both cloud native and VMware Cloud on AWS environments. This post will show how to integrate SSM with VMware Cloud on AWS for secure remote access as well as security patching of VMs.

Detailed configuration steps are covered below, but the main points are:

  • Enabling and performing initial SSM configuration
  • Creating a VPC endpoint for SSM in the VMware Cloud on AWS connected VPC
  • Creating an inbound endpoint for Route 53
  • Install the SSM agent and registering the VMs with SSM

The SSM endpoint gives the VMs a private IP to communicate with SSM and the Route 53 inbound endpoint ensures that the VMs receive that private IP when they resolve the SSM FQDN. Without these two endpoints the VMs would communicate with SSM over the internet.

Architecture

Integrating AWS Systems Manager with VMware Cloud on AWS is relatively straightforward. While AWS SSM comes with public endpoints for VM-to-SSM communication, the preferred way is to use private IP addressing over the connected VPC Elastic Network Interface (ENI). To accomplish this, create an SSM endpoint in the connected VPC and then set the networks on the VMware Cloud on AWS SDDC CGW to use a Route 53 inbound resolver. That way any communication the SSM agent does to the regional SSM endpoint will go over the ENI rather than the internet. No changes on the VMs are required apart from ensuring they use the correct Route 53 DNS resolver.

Overall network architecture for AWS Systems Manager integration with VMware Cloud on AWS

Integrating AWS Systems Manager for the VMware Cloud on AWS environment involves creating an SSM endpoint in the customer connected VPC and adding a DNS inbound endpoint

Full diagram: https://d1.awsstatic.com/architecture-diagrams/ArchitectureDiagrams/vmware-hybrid-cloud-mgmt-aws-systems-manager.pdf?did=wp_card&trk=wp_card

Configuration steps

In this section the practical setup of AWS Systems Manager with VMware Cloud on AWS is covered. General guidelines for getting started with AWS Systems Manager is covered in the SSM User Guide. The steps below are VMware and VMware Cloud on AWS specific and aim to get an environment up and running swiftly.

Overview of steps involved in setting up Systems Manager for use with VMware Cloud on AWS

  1. Setting the systems manager region
  2. Creating an S3 bucket for SSM logs and inventory output
  3. Creating the S3 bucket policy in IAM
  4. Creating the systems manager IAM users
  5. Creating the systems manager VPC endpoints
  6. Creating the R53 DNS inbound endpoint
  7. Pointing CGW DNS to R53
  8. Creating the SDDC firewall rules
  9. Creating a hybrid activation
  10. VM configuration: Installing the SSM agent and registering with SSM

1. Setting the systems manager region and enable Advanced-Tier

AWS Systems Manager is bound to a region of choice when activated the first time. Please pick a region which makes sense to your organization and note that it cannot be changed afterwards. For a VMware Cloud on AWS environment this would usually be the same region as the VMware Cloud on AWS environment is deployed in.

The easiest way to set the region is to go to AWS Systems Manager Quick Start and select your preferred region during the setup. Access AWS Systems Manager Console here: https://console.aws.amazon.com/systems-manager/ Also refer to the AWS Systems Manager getting started guide here: https://docs.aws.amazon.com/systems-manager/latest/userguide/quick-setup-getting-started.html

To use advanced features of AWS SSM, like patch management of VMware VMs, Advanced-Tier has to be enabled. This will incur an extra charge for instance management. At the time of writing the daily cost for one instance would be $0.1668 USD. Please refer to the SSM pricing structure for more detail and up-to-date pricing.

Once AWS SSM has been set to your preferred region, access “Fleet Manager” and under “Account management” select to modify the settings for Instance tiers.

Under the “Instance tier settings” confirm change from Standard-Tier to Advanced-Tier

2. Creating an S3 bucket for SSM logs and inventory output

Having a designated S3 bucket for SSM logs and other output, like inventory data, makes administration easier. The output can later be accessed for tracking, troubleshooting or even for access via Athena where inventory data can be formatted using SQL queries and used in QuickSight reports.

3. Creating the S3 bucket policy in IAM

The S3 bucket created in step 2 need a policy to allow SSM to access it. This policy will be created in IAM and then attached to the IAM roles used in both cloud native and VMware Cloud on AWS environments. Note that while this is a policy for S3 access, it is created in IAM and not in the S3 console.

The policy is created based on the JSON document below. Please update “YOUR-REGION” and “YOUR-BUCKET-NAME” with the relevant values for your environment.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "s3:GetObject",
            "Resource": [
                "arn:aws:s3:::aws-ssm-YOUR-REGION/",
                 "arn:aws:s3:::aws-windows-downloads-YOUR-REGION/",
                "arn:aws:s3:::amazon-ssm-YOUR-REGION/",
                "arn:aws:s3:::amazon-ssm-packages-YOUR-REGION/",
                "arn:aws:s3:::YOUR-REGION-birdwatcher-prod/",
                 "arn:aws:s3:::aws-ssm-distributor-file-YOUR-REGION/",
                "arn:aws:s3:::aws-ssm-document-attachments-YOUR-REGION/",
                 "arn:aws:s3:::patch-baseline-snapshot-YOUR-REGION/"
            ]
        },
        {
            "Effect": "Allow",
            "Action": [
                "s3:GetObject",
                "s3:PutObject",
                "s3:PutObjectAcl",
                "s3:GetEncryptionConfiguration"
            ],
            "Resource": [
                "arn:aws:s3:::YOUR-BUCKET-NAME/*",
                "arn:aws:s3:::YOUR-BUCKET-NAME"
            ]
        }
    ]
}

4. Creating the systems manager IAM roles

AWS Systems Manager can cover the management of both EC2 instances as well as VMware Cloud on AWS VMs. Each of these instance / VM types require a role created for them in IAM. This is to give the instance / VM the ability to interact with SSM and vice-versa. Detailed steps for this is covered in the AWS SSM user guide. In this section we will cover how to quickly get the VMware Cloud on AWS VM role created for use with hybrid activations in SSM.

Navigate to IAM in the AWS Console and create a new role. Select “Trusted entity type” to be “AWS Service” and the “Use Case” to be “Systems Manager”.

Under Permissions, search for and select the following:

  • CloudWatchAgentServerPolicy
  • AmazonSSMDirectoryServiceAccess
  • AmazonSSMManagedInstanceCore
  • The policy previously created for the SSM S3 bucket

5. Creating the systems manager VPC endpoints

Endpoints for AWS SSM can be created in the VMware Cloud on AWS Connected VPC. This is to provide the VMs in VMware Cloud on AWS a path to communicate with AWS SSM via the connected VPC ENI rather than accessing AWS SSM over the internet. This is both secure and as long as the SSM endpoint in the connected VPC is in the same availability zone as the VMware Cloud on AWS SDDC, communication is free of charge.

Create endpoints for the SSM service and, optionally, also KMS, Logs and SSM messages services as per the below:

6. Creating the R53 DNS inbound endpoint

Creating the SSM endpoint is the first step but the VMs in VMware Cloud on AWS must also be directed to that endpoint rather than the public SSM IP addresses. Rather than modifying each VM, a more efficient way is to add a Route 53 inbound endpoint to the Connected VPC. If the CGW DHCP server for the VMware Cloud on AWS VM segments is then updated to point DNS to that Route 53 endpoint, the VMs will automatically resolve the SSM FQDN to the endpoint private IP address.

Make sure that the security group for the inbound R53 endpoints allow DNS on both TCP and UDP from the VMware Cloud on AWS CGW network segments.

7. Pointing CGW DNS to R53

In the VMware Cloud on AWS console, navigate to “Networking & Security” and select “Segments” under Network. Update the network segments for the VMs to use the R53 inbound endpoint IP addresses.

The VMs connected to the CGW network segment may need to be disconnected and then re-connected to the network in vCenter or restart the networking from the OS side for the changes to take effect.

Before changing the DNS to point to the R53 inbound endpoint the VM in VMware Cloud on AWS resolves the regional SSM endpoint to the public IP address. Communication is done via the IGW in the VMware Cloud on AWS environment.

After updating the DNS the VM resolves the SSM endpoint to the private endpoint created earlier. Communication is over the Connected VPC ENI. Note: The ping doesn’t succeed since the security group for SSM blocks ICMP. However, the command shows that the VM correctly resolves the regional SSM endpoint to the private IP address in the Connected VPC.

8. Creating the SDDC firewall rules

The network segments connected to the VMware Cloud on AWS CGW router need access to both SSM and R53 services in the Connected VPC. AWS SSM also need to be able to access the VMware Cloud on AWS VMs over the Connected VPC ENI. Create firewall rules on the CGW to allow this traffic. In the screenshot below the VMware Cloud on AWS VM network “cgw-53” is allowed to communicate anywhere over any uplink while inbound traffic to the same network is allowed over the VPC interface only.

9. Creating a hybrid activation

Virtual machines external to EC2 are added to AWS SSM through a “Hybrid activation”. This is created via the SSM section of the AWS Console and generates an ID and a code for a preset number of VMs. These can be used as credentials in the next step where the SSM agent is installed and the VMs are registered with SSM

10. VM configuration: Installing the SSM agent and registering with SSM

Instances on EC2 normally come with the SSM agent pre-installed. For VMs in VMware Cloud on AWS or on-premises VMware environments the agent needs to be installed. This is both quick and easy to do. The details for agent installation for a multitude of OS types are described in the SSM user guide. In this example the agent will be installed on a Linux VM. Instructions for Windows VMs can be found here.

After downloading and installing the agent, the Hybrid Activation code and ID from step 9 above are used to register the VM with AWS SSM

Verifying that the VM is available in Systems Manager. Note that in contrast from EC2 instances, the VMware Cloud on AWS VM is prefixed with an “mi-” rather than just an “i-”. This is an indication it is a Managed Instance external to EC2. Of course labels and groups can also be used to keep the managed nodes apart, but the Node ID is a quick indicator of where it comes from.

Managing the VMs in Systems Manager

Viewing the VM inventory

Once the AWS SSM agent is installed it will start collecting the software inventory of the VM. This data is available through the SSM Inventory console but can also be exported to S3 for further processing by Athena. Detailed reports can be created by pointing QuickSight to the Athena data.

Accessing VM registry, logs and filesystem

The VM filesystem, performance metrics, system logs and even registry for Windows VMs is accessible directly through the AWS Console for each VM registered with SSM. Administering the VMs though the AWS Console reduces the need for direct desktop or shell access. For some features, like performance metrics, please be sure to enable KMS encryption in the SSM general settings.

Accessing the VM PowerShell console

While PowerShell or a shell prompt on a VM can be accessed over RDP or SSH, having the ability to quickly connect to a VM console through SSM helps streamline system administration.

Accessing the VM desktop

For Windows VMs it is possible to quickly open an RDP session even though the firewall on the VM is enabled and blocking RDP. The SSM agent creates a loopback connection which enables access from the AWS console. Up to four sessions can be opened in a single SSM Session Manager window. The session window can be maximized to view the VM in full screen

Patching VMs

Through SSM Patch manager, patch baselines and maintenance windows can easily be created and applied to the managed VMware VMs as well as EC2 instances to ensure they are up to date and have the highest level of protection possible

Conclusion

VMware Cloud on AWS is not a standalone solution but integrates well with a variety of AWS native services. In this blog post AWS Systems Manager and related services were leveraged to enhance security of VMs running on VMware Cloud on AWS by centralizing access controls and automating OS patch management.

Further reading

Capture VMware performance data using Live Optics

Gaining insights into the details of what makes up a large and complex VMware environment can be challenging. This is especially true if gathering individual VM performance data is part of the goal.

Live Optics, a tool created by Dell Technologies, is a great way to gather performance data from a vSphere environment over time. It can run on a laptop or VM and has the ability to either take a snapshot of the environment without performance data or to run over a period from 10min to 7 days and report also on performance.

The data can be streamed continuously to a Live Optics endpoint. In this case the data gathering process can be viewed live through the Live Optics web portal. Alternatively the data can be saved locally as an encrypted SIOKIT file during the capture process and then be uploaded to the Live Optics portal once data collection is complete.

Below is a 5 min video showing the entire process of account creation, collection download and report export in Excel format for reference