Leveraging OpenStack for Deep Learning & Machine Learning with GPU pass-through

As part of preparing for OpenStack Days Tokyo 2017, I built an environment to show how GPU pass-through can be used on OpenStack to provide instances ready for Machine Learning and Deep Learning. This is a rundown of the process.

Introduction

Deep Learning and Machine Learning have in recent years become increasingly vital to progress in key areas such as life sciences, medicine and artificial intelligence. Traditionally it has been difficult and costly to create scalable, self-service environments that let developers and end users alike leverage these technological advancements. In this post we’ll look at the practical steps for enabling GPU-powered virtual instances on Red Hat OpenStack. These can in turn be used by research staff to run in-house or commercial software for Deep Learning and Machine Learning.

Benefits

Virtual instances for Deep Learning and Machine Learning become quick and easy to create and consume. GPU-powered Nova compute nodes can be added smoothly, with no impact to the existing cloud infrastructure. Users can choose from multiple GPU types and virtual machine types, and the Nova scheduler will know where the required GPU resources reside when creating instances.

Prerequisites

This post describes how to modify key OpenStack services on an already deployed cloud to allow for GPU pass-through and subsequent assignment to virtual instances. As such it assumes an already functional Red Hat OpenStack overcloud is available. The environment used for the example in this document is running Red Hat OSP10 (Newton) on Dell EMC PowerEdge servers. The GPU-enabled servers used for this example are PowerEdge C4130s with NVIDIA M60 GPUs.

Process outline

After a Nova compute node with GPUs has been added to the cluster using Ironic bare-metal provisioning the following steps are taken:

  • Disabling the Nouveau driver on the GPU compute node
  • Enabling IOMMU in the kernel boot options
  • Modifying the Nova compute service to allow PCIe pass-through
  • Modifying the Nova scheduler service to filter on the GPU ID
  • Creating a flavor utilizing the GPU ID

Each step is described in more detail below.

Disabling the Nouveau driver on the GPU compute node

On the Undercloud, list the current Overcloud server nodes:

[stack@toksc-osp10b-dir-01 ~]$ nova list

+--------------------------------------+-------------------------+--------+------------+-------------+---------------------+
| ID                                   | Name                    | Status | Task State | Power State | Networks            |
+--------------------------------------+-------------------------+--------+------------+-------------+---------------------+
| 8449f79f-fc17-4927-a2f3-5aefc7692154 | overcloud-cephstorage-0 | ACTIVE | -          | Running     | ctlplane=192.0.2.14 |
| ac063e8d-9762-4f2a-bf19-bd90de726be4 | overcloud-cephstorage-1 | ACTIVE | -          | Running     | ctlplane=192.0.2.9  |
| b7410a12-b752-455c-8146-d856f9e6c5ab | overcloud-cephstorage-2 | ACTIVE | -          | Running     | ctlplane=192.0.2.12 |
| 4853962d-4fd8-466d-bcdb-c62df41bd953 | overcloud-cephstorage-3 | ACTIVE | -          | Running     | ctlplane=192.0.2.17 |
| 6ceb66b4-3b70-4171-ba4a-e0eff1f677a9 | overcloud-compute-0     | ACTIVE | -          | Running     | ctlplane=192.0.2.16 |
| 00c7d048-d9dd-4279-9919-7d1c86974c46 | overcloud-compute-1     | ACTIVE | -          | Running     | ctlplane=192.0.2.19 |
| 2700095a-319c-4b5d-8b17-96ddadca96f9 | overcloud-compute-2     | ACTIVE | -          | Running     | ctlplane=192.0.2.21 |
| 0d210259-44a7-4804-b084-f2af1506305b | overcloud-compute-3     | ACTIVE | -          | Running     | ctlplane=192.0.2.15 |
| e469714f-ce40-4b55-921e-bcadcb2ae231 | overcloud-compute-4     | ACTIVE | -          | Running     | ctlplane=192.0.2.10 |
| fefd2dcd-5bf7-4ac5-a7a4-ed9f70c63155 | overcloud-compute-5     | ACTIVE | -          | Running     | ctlplane=192.0.2.13 |
| 085cce69-216b-4090-b825-bdcc4f5d6efa | overcloud-compute-6     | ACTIVE | -          | Running     | ctlplane=192.0.2.20 |
| 64065ea7-9e69-47fe-ad87-ed787f671621 | overcloud-compute-7     | ACTIVE | -          | Running     | ctlplane=192.0.2.18 |
| cff03230-4751-462f-a6b4-6578bd5b9602 | overcloud-controller-0  | ACTIVE | -          | Running     | ctlplane=192.0.2.22 |
| 333b84fc-142c-40cb-9b8d-1566f7a6a384 | overcloud-controller-1  | ACTIVE | -          | Running     | ctlplane=192.0.2.24 |
| 20ffdd99-330f-4164-831b-394eaa540133 | overcloud-controller-2  | ACTIVE | -          | Running     | ctlplane=192.0.2.11 |
+--------------------------------------+-------------------------+--------+------------+-------------+---------------------+

Compute nodes 6 and 7 are equipped with NVIDIA M60 GPU cards. Node 6 will be used for this example.

From the Undercloud, SSH to the GPU compute node:

[stack@toksc-osp10b-dir-01 ~]$ ssh heat-admin@192.0.2.20
Last login: Tue May 30 06:36:38 2017 from gateway
[heat-admin@overcloud-compute-6 ~]$
[heat-admin@overcloud-compute-6 ~]$

Verify that the NVIDIA GPU cards are present and recognized:
[heat-admin@overcloud-compute-6 ~]$ lspci -nn | grep NVIDIA
04:00.0 VGA compatible controller [0300]: NVIDIA Corporation GM204GL [Tesla M60] [10de:13f2] (rev a1)
05:00.0 VGA compatible controller [0300]: NVIDIA Corporation GM204GL [Tesla M60] [10de:13f2] (rev a1)
84:00.0 VGA compatible controller [0300]: NVIDIA Corporation GM204GL [Tesla M60] [10de:13f2] (rev a1)
85:00.0 VGA compatible controller [0300]: NVIDIA Corporation GM204GL [Tesla M60] [10de:13f2] (rev a1)

Use the device ID obtained in the previous command to check if the Nouveau driver is currently in use for the GPUs:
[heat-admin@overcloud-compute-6 ~]$ lspci -nnk -d 10de:13f2
04:00.0 VGA compatible controller [0300]: NVIDIA Corporation GM204GL [Tesla M60] [10de:13f2] (rev a1)
                Subsystem: NVIDIA Corporation Device [10de:115e]
                Kernel driver in use: nouveau
                Kernel modules: nouveau

 

Disable the Nouveau driver and enable IOMMU in the kernel boot options:

[heat-admin@overcloud-compute-6 ~]$ sudo su -
Last login: 火  5月 30 06:37:02 UTC 2017 on pts/0
[root@overcloud-compute-6 ~]#
[root@overcloud-compute-6 ~]# cd /boot/grub2/

Make a backup of the grub.cfg file before modifying it:
[root@overcloud-compute-6 grub2]# cp -p grub.cfg grub.cfg.orig.`date +%Y-%m-%d_%H-%M`
[root@overcloud-compute-6 grub2]# vi grub.cfg

Modify the following line and append the Nouveau blacklist and Intel IOMMU options:
linux16 /boot/vmlinuz-3.10.0-514.2.2.el7.x86_64 root=UUID=a69bf0c7-8d41-42c5-b1f0-e64719aa7ffb ro console=tty0 console=ttyS0,115200n8 crashkernel=auto rhgb quiet

After modification:
linux16 /boot/vmlinuz-3.10.0-514.2.2.el7.x86_64 root=UUID=a69bf0c7-8d41-42c5-b1f0-e64719aa7ffb ro console=tty0 console=ttyS0,115200n8 crashkernel=auto rhgb quiet modprobe.blacklist=nouveau intel_iommu=on iommu=pt

Also modify the rescue boot option:
linux16 /boot/vmlinuz-0-rescue-e1622fe8eb7d44d0a2d57ce6991b2120 root=UUID=a69bf0c7-8d41-42c5-b1f0-e64719aa7ffb ro console=tty0 console=ttyS0,115200n8 crashkernel=auto rhgb quiet

After modification:
linux16 /boot/vmlinuz-0-rescue-e1622fe8eb7d44d0a2d57ce6991b2120 root=UUID=a69bf0c7-8d41-42c5-b1f0-e64719aa7ffb ro console=tty0 console=ttyS0,115200n8 crashkernel=auto rhgb quiet modprobe.blacklist=nouveau intel_iommu=on iommu=pt

Make the same modifications to “/etc/default/grub”:
[root@overcloud-compute-6 grub2]# vi /etc/default/grub
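
In /etc/default/grub the options are appended to the GRUB_CMDLINE_LINUX variable. A sketch of what that line might look like afterwards (the pre-existing options will differ per deployment):

GRUB_CMDLINE_LINUX="console=tty0 console=ttyS0,115200n8 crashkernel=auto rhgb quiet modprobe.blacklist=nouveau intel_iommu=on iommu=pt"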

Re-generate the GRUB configuration files with grub2-mkconfig:
[root@overcloud-compute-6 grub2]# grub2-mkconfig -o /boot/grub2/grub.cfg
Generating grub configuration file ...
Found linux image: /boot/vmlinuz-3.10.0-514.2.2.el7.x86_64
Found initrd image: /boot/initramfs-3.10.0-514.2.2.el7.x86_64.img
Found linux image: /boot/vmlinuz-0-rescue-e1622fe8eb7d44d0a2d57ce6991b2120
Found initrd image: /boot/initramfs-0-rescue-e1622fe8eb7d44d0a2d57ce6991b2120.img
done

Reboot the Nova compute node:
[root@overcloud-compute-6 grub2]# reboot
PolicyKit daemon disconnected from the bus.
We are no longer a registered authentication agent.
Connection to 192.0.2.20 closed by remote host.
Connection to 192.0.2.20 closed.

After the reboot is complete, SSH to the node to verify that the Nouveau module is no longer active for the GPUs:

[stack@toksc-osp10b-dir-01 ~]$ ssh heat-admin@192.0.2.20
Last login: Tue May 30 07:39:42 2017 from 192.0.2.1
[heat-admin@overcloud-compute-6 ~]$
[heat-admin@overcloud-compute-6 ~]$
[heat-admin@overcloud-compute-6 ~]$
[heat-admin@overcloud-compute-6 ~]$ lspci -nnk -d 10de:13f2
04:00.0 VGA compatible controller [0300]: NVIDIA Corporation GM204GL [Tesla M60] [10de:13f2] (rev a1)
                Subsystem: NVIDIA Corporation Device [10de:115e]
                Kernel modules: nouveau

The Kernel module is present but not listed as being active. PCIe pass-through is now possible.
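
Optionally, confirm that the new kernel options took effect after the reboot. Suggested checks, not part of the original procedure (exact dmesg messages vary by kernel and hardware):

[heat-admin@overcloud-compute-6 ~]$ cat /proc/cmdline
[heat-admin@overcloud-compute-6 ~]$ sudo dmesg | grep -i -e DMAR -e IOMMU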

Modifying the Nova compute service to allow PCIe pass-through

From the Undercloud, SSH to the compute node and become root with sudo:

[stack@toksc-osp10b-dir-01 ~]$ ssh heat-admin@192.0.2.20
[heat-admin@overcloud-compute-6 ~]$ sudo su -
Last login: 火  5月 30 07:40:13 UTC 2017 on pts/0

Backup the nova.conf file and edit the configuration file:
[root@overcloud-compute-6 ~]# cd /etc/nova
[root@overcloud-compute-6 nova]# cp -p nova.conf nova.conf.orig.`date +%Y-%m-%d_%H-%M`
[root@overcloud-compute-6 nova]# vi nova.conf

Add the following two lines at the beginning of the “[DEFAULT]” section:
pci_alias = { "vendor_id":"10de", "product_id":"13f2", "device_type":"type-PCI", "name":"m60" }
pci_passthrough_whitelist = { "vendor_id": "10de", "product_id": "13f2" }

Note:
The values for “vendor_id” and “product_id” can be found in the output of “lspci -nn | grep NVIDIA” shown earlier. Note that the PCIe alias and whitelist are defined on a vendor/product basis. This means no data specific to each individual PCIe device is required, and new cards of the same type can be added and used without modifying the configuration file.

The value for “name” is arbitrary and can be anything. However, it will be used to filter on the GPU type later, so a brief, descriptive name is recommended as best practice. A value of “m60” is used in this example.
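
As an alternative to editing the file by hand, the same two settings could be applied with crudini if it is installed on the node (a sketch, equivalent to the lines above):

[root@overcloud-compute-6 nova]# crudini --set /etc/nova/nova.conf DEFAULT pci_alias '{ "vendor_id":"10de", "product_id":"13f2", "device_type":"type-PCI", "name":"m60" }'
[root@overcloud-compute-6 nova]# crudini --set /etc/nova/nova.conf DEFAULT pci_passthrough_whitelist '{ "vendor_id": "10de", "product_id": "13f2" }'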

Restart the Nova compute service:

[root@overcloud-compute-6 nova]# systemctl restart openstack-nova-compute.service
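
To confirm the service came back up cleanly, an optional check:

[root@overcloud-compute-6 nova]# systemctl is-active openstack-nova-compute.service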

Modifying the Nova scheduler service to filter on the GPU ID

On each of the Nova Controller nodes, perform the following steps:
From the Undercloud, SSH to the controller node and become root with sudo:

[stack@toksc-osp10b-dir-01 ~]$ ssh heat-admin@192.0.2.22
[heat-admin@overcloud-controller-0 ~]$ sudo su -
Last login: 火  5月 30 07:40:13 UTC 2017 on pts/0

Create a backup and then modify the nova.conf configuration file:
[root@overcloud-controller-0 ~]# cd /etc/nova
[root@overcloud-controller-0 nova]# cp -p nova.conf nova.conf.orig.`date +%Y-%m-%d_%H-%M`
[root@overcloud-controller-0 nova]# vi nova.conf

Add the following three lines at the beginning of the “[DEFAULT]” section:
pci_alias = { "vendor_id":"10de", "product_id":"13f2", "device_type":"type-PCI", "name":"m60" }
pci_passthrough_whitelist = { "vendor_id": "10de", "product_id": "13f2" }
scheduler_default_filters = RetryFilter, AvailabilityZoneFilter, RamFilter, DiskFilter, ComputeFilter, ComputeCapabilitiesFilter, ImagePropertiesFilter, ServerGroupAntiAffinityFilter, ServerGroupAffinityFilter, PciPassthroughFilter

Note: Ensure to match the values for “vendor_id”, “product_id” and “name” with those used while modifying the nova.conf file on the Nova compute node.

Note: In the same file, also change “scheduler_use_baremetal_filters” from “False” to “True”.

Restart the nova-scheduler service:

[root@overcloud-controller-0 nova]# systemctl restart openstack-nova-scheduler.service
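
A quick sanity check that the settings are in place and the scheduler restarted (optional, not part of the original procedure):

[root@overcloud-controller-0 nova]# grep -E 'pci_alias|pci_passthrough_whitelist|scheduler_default_filters|scheduler_use_baremetal_filters' /etc/nova/nova.conf
[root@overcloud-controller-0 nova]# systemctl is-active openstack-nova-scheduler.service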

Creating a flavor utilizing the GPU ID

The only remaining step is to create a flavor that utilizes the GPU. For this, a flavor will be created with a PCI alias property matching the “name” value set in the nova.conf files.
Create the base flavor without the PCIe pass-through alias:

[stack@toksc-osp10b-dir-01 ~]$ openstack flavor create gpu-mid-01 --ram 4096 --disk 15 --vcpus 4
+----------------------------+--------------------------------------+
| Field                      | Value                                |
+----------------------------+--------------------------------------+
| OS-FLV-DISABLED:disabled   | False                                |
| OS-FLV-EXT-DATA:ephemeral  | 0                                    |
| disk                       | 15                                   |
| id                         | 04447428-3944-4909-99d5-d5eaf6e83191 |
| name                       | gpu-mid-01                           |
| os-flavor-access:is_public | True                                 |
| properties                 |                                      |
| ram                        | 4096                                 |
| rxtx_factor                | 1.0                                  |
| swap                       |                                      |
| vcpus                      | 4                                    |
+----------------------------+--------------------------------------+

Check that the flavor has been created correctly:
[stack@toksc-osp10b-dir-01 ~]$ openstack flavor list
+--------------------------------------+------------+------+------+-----------+-------+-----------+
| ID                                   | Name       |  RAM | Disk | Ephemeral | VCPUs | Is Public |
+--------------------------------------+------------+------+------+-----------+-------+-----------+
| 04447428-3944-4909-99d5-d5eaf6e83191 | gpu-mid-01 | 4096 |   15 |         0 |     4 | True      |
+--------------------------------------+------------+------+------+-----------+-------+-----------+

Add the PCIe passthrough alias information to the flavor:
[stack@toksc-osp10b-dir-01 ~]$ openstack flavor set gpu-mid-01 --property "pci_passthrough:alias"="m60:1"

Note: The “m60:1” indicates that one (1) device of the specified type, in this case a GPU, is requested. If more than one GPU is required for a particular flavor, simply change the value; for example, “m60:2” for a dual-GPU flavor.
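
As an illustration, a hypothetical dual-GPU flavor created the same way (the flavor name and sizing below are arbitrary examples):

[stack@toksc-osp10b-dir-01 ~]$ openstack flavor create gpu-high-01 --ram 8192 --disk 30 --vcpus 8
[stack@toksc-osp10b-dir-01 ~]$ openstack flavor set gpu-high-01 --property "pci_passthrough:alias"="m60:2"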

Verify that the flavor has been modified correctly:

[stack@toksc-osp10b-dir-01 ~]$ nova flavor-show gpu-mid-01
+----------------------------+--------------------------------------+
| Property                   | Value                                |
+----------------------------+--------------------------------------+
| OS-FLV-DISABLED:disabled   | False                                |
| OS-FLV-EXT-DATA:ephemeral  | 0                                    |
| disk                       | 15                                   |
| extra_specs                | {"pci_passthrough:alias": "m60:1"}   |
| id                         | 04447428-3944-4909-99d5-d5eaf6e83191 |
| name                       | gpu-mid-01                           |
| os-flavor-access:is_public | True                                 |
| ram                        | 4096                                 |
| rxtx_factor                | 1.0                                  |
| swap                       |                                      |
| vcpus                      | 4                                    |
+----------------------------+--------------------------------------+

That is all. Instances with the GPU flavor can now be created via the command line or the Horizon web interface.
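
For example, a sketch of booting an instance with the GPU flavor from the command line (the image, network and keypair names are placeholders for whatever exists in your cloud):

[stack@toksc-osp10b-dir-01 ~]$ openstack server create --flavor gpu-mid-01 --image <image-name> --nic net-id=<network-id> --key-name <keypair-name> gpu-instance-01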

OpenStack Neutron – Expand and / or update floating IP range

Sometimes you run out of public IP addresses and need to expand the floating IP range. If a contiguous range is available to expand into from the current range, simply use:

neutron subnet-update <subnet-id> --allocation-pool start=<original-start-ip>,end=<new-end-ip>

This will overwrite the existing range and expand it to the new end-IP.

To add an extra, separate IP range while still keeping the original range, use:

neutron subnet-update <subnet-id> --allocation-pool start=<original-start-ip>,end=<original-end-ip> --allocation-pool start=<additional-start-ip>,end=<additional-end-ip>

Example of extending a contiguous IP range:

[root@c6320-n1 ~(keystone_admin)]# neutron subnet-list
+--------------------------------------+--------------+----------------+----------------------------------------------------+
| id                                   | name         | cidr           | allocation_pools                                   |
+--------------------------------------+--------------+----------------+----------------------------------------------------+
| 1b66dad8-2f2c-4667-9460-7729e2a68d1c | sub-pub      | 172.17.4.0/24  | {"start": "172.17.4.130", "end": "172.17.4.199"}   |
| 74c90d00-af79-4f7c-92ef-4e38231e850c | sub_priv2    | 192.168.0.0/24 | {"start": "192.168.0.40", "end": "192.168.0.50"}   |
| e6cb6f7e-5efd-42df-93e6-67ad4b056035 | sub_internal | 192.168.0.0/24 | {"start": "192.168.0.100", "end": "192.168.0.200"} |
| e47c7f4b-85ec-41e4-ad1a-cf9290a97d87 | sub_priv     | 172.16.0.0/24  | {"start": "172.16.0.100", "end": "172.16.0.200"}   |
+--------------------------------------+--------------+----------------+----------------------------------------------------+


[root@c6320-n1 ~(keystone_admin)]# neutron subnet-show 1b66dad8-2f2c-4667-9460-7729e2a68d1c
+-------------------+--------------------------------------------------+
| Field             | Value                                            |
+-------------------+--------------------------------------------------+
| allocation_pools  | {"start": "172.17.4.130", "end": "172.17.4.199"} |
| cidr              | 172.17.4.0/24                                    |
| dns_nameservers   |                                                  |
| enable_dhcp       | False                                            |
| gateway_ip        | 172.17.4.1                                       |
| host_routes       |                                                  |
| id                | 1b66dad8-2f2c-4667-9460-7729e2a68d1c             |
| ip_version        | 4                                                |
| ipv6_address_mode |                                                  |
| ipv6_ra_mode      |                                                  |
| name              | sub-pub                                          |
| network_id        | fa9fb87f-70d9-4e18-83cb-c04695cbed5a             |
| subnetpool_id     |                                                  |
| tenant_id         | 8d93e4b0f8454ad7b539d14633d72136                 |
+-------------------+--------------------------------------------------+


[root@c6320-n1 ~(keystone_admin)]# neutron subnet-update 1b66dad8-2f2c-4667-9460-7729e2a68d1c --allocation-pool start=172.17.4.130,end=172.17.4.240
Updated subnet: 1b66dad8-2f2c-4667-9460-7729e2a68d1c
[root@c6320-n1 ~(keystone_admin)]# neutron subnet-show 1b66dad8-2f2c-4667-9460-7729e2a68d1c
+-------------------+--------------------------------------------------+
| Field             | Value                                            |
+-------------------+--------------------------------------------------+
| allocation_pools  | {"start": "172.17.4.130", "end": "172.17.4.240"} |
| cidr              | 172.17.4.0/24                                    |
| dns_nameservers   |                                                  |
| enable_dhcp       | False                                            |
| gateway_ip        | 172.17.4.1                                       |
| host_routes       |                                                  |
| id                | 1b66dad8-2f2c-4667-9460-7729e2a68d1c             |
| ip_version        | 4                                                |
| ipv6_address_mode |                                                  |
| ipv6_ra_mode      |                                                  |
| name              | sub-pub                                          |
| network_id        | fa9fb87f-70d9-4e18-83cb-c04695cbed5a             |
| subnetpool_id     |                                                  |
| tenant_id         | 8d93e4b0f8454ad7b539d14633d72136                 |
+-------------------+--------------------------------------------------+
[root@c6320-n1 ~(keystone_admin)]#
[root@c6320-n1 ~(keystone_admin)]#

Example of adding an additional range to an already existing range:

[root@c6320-n1 ~(keystone_admin)]# neutron subnet-update 1b66dad8-2f2c-4667-9460-7729e2a68d1c --allocation-pool start=172.17.4.130,end=172.17.4.199 --allocation-pool start=172.17.4.209,end=172.17.4.240
Updated subnet: 1b66dad8-2f2c-4667-9460-7729e2a68d1c
[root@c6320-n1 ~(keystone_admin)]#
[root@c6320-n1 ~(keystone_admin)]# neutron subnet-show 1b66dad8-2f2c-4667-9460-7729e2a68d1c
+-------------------+--------------------------------------------------+
| Field             | Value                                            |
+-------------------+--------------------------------------------------+
| allocation_pools  | {"start": "172.17.4.130", "end": "172.17.4.199"} |
|                   | {"start": "172.17.4.209", "end": "172.17.4.240"} |
| cidr              | 172.17.4.0/24                                    |
| dns_nameservers   |                                                  |
| enable_dhcp       | False                                            |
| gateway_ip        | 172.17.4.1                                       |
| host_routes       |                                                  |
| id                | 1b66dad8-2f2c-4667-9460-7729e2a68d1c             |
| ip_version        | 4                                                |
| ipv6_address_mode |                                                  |
| ipv6_ra_mode      |                                                  |
| name              | sub-pub                                          |
| network_id        | fa9fb87f-70d9-4e18-83cb-c04695cbed5a             |
| subnetpool_id     |                                                  |
| tenant_id         | 8d93e4b0f8454ad7b539d14633d72136                 |
+-------------------+--------------------------------------------------+
[root@c6320-n1 ~(keystone_admin)]#
[root@c6320-n1 ~(keystone_admin)]# 

Nova live migration fails with “Migration pre-check error: CPU doesn’t have compatibility.”

This week I’m hosting a hands-on OpenStack training for some clients. The ability to perform live migrations of running instances between hosts is one of the things they want to see, and I had set up the environment to support this.

Live migrations had been working fine for over a week before they started throwing errors this morning.

The error on the command line when trying to do a live migration:

ERROR (BadRequest): Migration pre-check error: CPU doesn't have compatibility.
internal error: Unknown CPU model Haswell-noTSX
Refer to http://libvirt.org/html/libvirt-libvirt.html#virCPUCompareResult (HTTP 400) (Request-ID: req-227fd8fb-eba4-4f40-b707-bb31569ed14f)

Normally this would happen if the hosts running nova-compute had different CPU types, but in this case they are all identical (Dell C6320 nodes).

I checked the CPU map in /usr/share/libvirt/cpu_map.xml, and the CPU model is listed:


    <model name='Haswell'>
      <model name='Haswell-noTSX'/>
      <feature name='hle'/>
      <feature name='rtm'/>
    </model>

Since the CPUs are the same on all nodes, it’s clearly the lookup of that CPU type that fails. So I tried to disable the check by editing /usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py. This made the error disappear, but my instances stayed put on whatever host they were originally running on. Not much better.

Finally I started modifying the /etc/nova/nova.conf files on both the controller node and the nova-compute nodes. The changes that fixed it were as follows:

Old setting:
#cpu_mode=<None>

New setting:
cpu_mode=custom

Old setting:
#cpu_model=<None>

New setting:
cpu_model=kvm64

I also have the following settings which may or may not matter in this case:

virt_type=kvm
limit_cpu_features=false
live_migration_flag=VIR_MIGRATE_UNDEFINE_SOURCE, VIR_MIGRATE_PEER2PEER, VIR_MIGRATE_LIVE, VIR_MIGRATE_TUNNELLED
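
Note that these are libvirt driver options; depending on the OpenStack release they typically live under the “[libvirt]” section of nova.conf rather than “[DEFAULT]”. A sketch of the relevant block:

[libvirt]
virt_type=kvm
cpu_mode=custom
cpu_model=kvm64
live_migration_flag=VIR_MIGRATE_UNDEFINE_SOURCE, VIR_MIGRATE_PEER2PEER, VIR_MIGRATE_LIVE, VIR_MIGRATE_TUNNELLED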

After restarting nova on both the controller and all three compute nodes, live migrations are working fine again. Not sure why they stopped in the first place, but at least this seems to have done the job.

Checking instances for each node:

[root@c6320-n1 ~(keystone_admin)]# 
[root@c6320-n1 ~(keystone_admin)]# for i in {2..4}; do nova hypervisor-servers c6320-n$i; done
+--------------------------------------+-------------------+---------------+---------------------+
| ID                                   | Name              | Hypervisor ID | Hypervisor Hostname |
+--------------------------------------+-------------------+---------------+---------------------+
| aaac652f-65d9-49e4-aea2-603fc2db26c3 | instance-0000009c | 1             | c6320-n2            |
+--------------------------------------+-------------------+---------------+---------------------+
+----+------+---------------+---------------------+
| ID | Name | Hypervisor ID | Hypervisor Hostname |
+----+------+---------------+---------------------+
+----+------+---------------+---------------------+
+----+------+---------------+---------------------+
| ID | Name | Hypervisor ID | Hypervisor Hostname |
+----+------+---------------+---------------------+
+----+------+---------------+---------------------+

Performing the migration:
[root@c6320-n1 ~(keystone_admin)]# nova live-migration aaac652f-65d9-49e4-aea2-603fc2db26c3 c6320-n4

Verifying that the instance has moved from node2 to node4:


[root@c6320-n1 ~(keystone_admin)]# for i in {2..4}; do nova hypervisor-servers c6320-n$i; done
+----+------+---------------+---------------------+
| ID | Name | Hypervisor ID | Hypervisor Hostname |
+----+------+---------------+---------------------+
+----+------+---------------+---------------------+
+----+------+---------------+---------------------+
| ID | Name | Hypervisor ID | Hypervisor Hostname |
+----+------+---------------+---------------------+
+----+------+---------------+---------------------+
+--------------------------------------+-------------------+---------------+---------------------+
| ID                                   | Name              | Hypervisor ID | Hypervisor Hostname |
+--------------------------------------+-------------------+---------------+---------------------+
| aaac652f-65d9-49e4-aea2-603fc2db26c3 | instance-0000009c | 3             | c6320-n4            |
+--------------------------------------+-------------------+---------------+---------------------+

The keystone CLI is deprecated in favor of python-openstackclient.

UPDATE: It turns out that installing the new client can cause issues with Keystone. I found this out the hard way yesterday when it failed during a demo, preventing authentication from the command line. After a few hours of troubleshooting it turned out Apache (httpd.service) and Keystone (openstack-keystone.service) were clashing. I was unable to fix this even after updating each service’s config file to separate them out. Finally I guessed it might be the last package I installed that was the cause. After removing python-openstackclient and rebooting the controller node the issue was fixed.

Original post
===============
In OpenStack Kilo, a deprecation message for the Keystone CLI is displayed whenever the keystone command is invoked: “DeprecationWarning: The keystone CLI is deprecated in favor of python-openstackclient. For a Python library, continue using python-keystoneclient.”

To move to the new python-openstackclient, simply install it. On RHEL7.1:
yum install -y python-openstackclient.noarch

After that it will be available as the command “openstack”. It can be invoked in interactive mode just by typing “openstack”, or directly from the command line to get information. For example, to list users:
Old Keystone CLI: “keystone user-list”
New OpenStack CLI: “openstack user list”

To get output more similar to the old command, issue “openstack user list --long” to include the extra fields.

You may also want to update the “openstack-status” script so it uses the new client. To do so:
1. Edit /usr/bin/openstack-status with your favorite editor
2. Replace the old command with the new one (around line 227) like so:

#keystone user-list
openstack user list --long
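
If you prefer a one-liner, something along these lines should make the same substitution (back up the script first; the exact contents of the line may differ between versions):

cp -p /usr/bin/openstack-status /usr/bin/openstack-status.orig
sed -i 's/keystone user-list/openstack user list --long/' /usr/bin/openstack-status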

The new CLI can of course do a lot more. For a full list of commands, see below (each is invoked as “openstack” followed by the command):


aggregate add host      ip fixed remove               server rescue
aggregate create        ip floating add               server resize
aggregate delete        ip floating create            server resume
aggregate list          ip floating delete            server set
aggregate remove host   ip floating list              server show
aggregate set           ip floating pool list         server ssh
aggregate show          ip floating remove            server suspend
availability zone list  keypair create                server unlock
backup create           keypair delete                server unpause
backup delete           keypair list                  server unrescue
backup list             keypair show                  server unset
backup restore          limits show                   service create
backup show             module list                   service delete
catalog list            network create                service list
catalog show            network delete                service show
command list            network list                  snapshot create
complete                network set                   snapshot delete
compute agent create    network show                  snapshot list
compute agent delete    object create                 snapshot set
compute agent list      object delete                 snapshot show
compute agent set       object list                   snapshot unset
compute service list    object save                   token issue
compute service set     object show                   token revoke
console log show        project create                usage list
console url show        project delete                usage show
container create        project list                  user create
container delete        project set                   user delete
container list          project show                  user list
container save          project usage list            user role list
container show          quota set                     user set
ec2 credentials create  quota show                    user show
ec2 credentials delete  role add                      volume create
ec2 credentials list    role create                   volume delete
ec2 credentials show    role delete                   volume list
endpoint create         role list                     volume set
endpoint delete         role remove                   volume show
endpoint list           role show                     volume type create
endpoint show           security group create         volume type delete
extension list          security group delete         volume type list
flavor create           security group list           volume type set
flavor delete           security group rule create    volume type unset
flavor list             security group rule delete    volume unset
flavor set              security group rule list
flavor show             security group set
flavor unset            security group show
help                    server add security group
host list               server add volume
host show               server create
hypervisor list         server delete
hypervisor show         server image create
hypervisor stats show   server list
image create            server lock
image delete            server migrate
image list              server pause
image save              server reboot
image set               server rebuild
image show              server remove security group
ip fixed add            server remove volume

RHEL / Red Hat – Package does not match intended download.

I’m currently installing a few C6320 servers with RHEL 7.1 to create an OpenStack demo cluster. Since all servers need almost identical setups I wrote some Expect scripts, but unfortunately didn’t set the script runtime timeout high enough. This resulted in the connection to one of the servers being interrupted in the middle of a “yum update -y”.

When trying to run the update again it failed with: “[Errno -1] Package does not match intended download. Suggestion: run yum --enablerepo=rhel-7-server-rpms clean metadata” “Trying other mirror.”

Unfortunately, running the suggested “clean metadata” didn’t fix the problem. Instead, the fix turned out to be a simple “yum clean all” 🙂
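
In other words, the recovery in this case boiled down to:

yum clean all
yum update -y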