Nova live migration fails with “Migration pre-check error: CPU doesn’t have compatibility.”

Resolving OpenStack Nova Live Migration CPU Compatibility Issues

Background

This week I’m hosting a hands-on OpenStack training for some clients. The ability to perform live migrations of running instances between hosts is one of the key features they want to see, and I had set up the environment to support this.

Live migrations had been working fine for over a week when they suddenly started failing this morning.

The Error

When attempting to perform a live migration, I received this error:

ERROR (BadRequest): Migration pre-check error: CPU doesn't have compatibility.
internal error: Unknown CPU model Haswell-noTSX
Refer to https://libvirt.org/html/libvirt-libvirt.html#virCPUCompareResult (HTTP 400) 
(Request-ID: req-227fd8fb-eba4-4f40-b707-bb31569ed14f)

Normally this would happen if the hosts running nova-compute had different CPU types, but in this case they are all identical (Dell C6320 nodes).
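To rule out a genuine hardware difference, the contents of /proc/cpuinfo from two hosts can be compared. The following is a minimal Python 3 sketch, not part of the original troubleshooting: the function names are made up for illustration, and it assumes you have already copied each host's /proc/cpuinfo text to wherever the script runs. It only compares what the kernel reports; it says nothing about which CPU models libvirt knows about.

```python
def cpu_summary(cpuinfo_text):
    """Return (model name, set of flags) for the first CPU in /proc/cpuinfo text."""
    model, flags = None, None
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank separator lines between CPU entries
        key = key.strip()
        if key == "model name" and model is None:
            model = value.strip()
        elif key == "flags" and flags is None:
            flags = set(value.split())
    return model, flags or set()


def compare_hosts(first_text, second_text):
    """List the differences between two hosts; an empty list means none were found."""
    first_model, first_flags = cpu_summary(first_text)
    second_model, second_flags = cpu_summary(second_text)
    diffs = []
    if first_model != second_model:
        diffs.append("model name: %s vs %s" % (first_model, second_model))
    only_first = sorted(first_flags - second_flags)
    only_second = sorted(second_flags - first_flags)
    if only_first:
        diffs.append("flags only on first host: " + " ".join(only_first))
    if only_second:
        diffs.append("flags only on second host: " + " ".join(only_second))
    return diffs


if __name__ == "__main__":
    # Made-up sample data, not taken from the C6320 nodes.
    host_a = "model name\t: Example CPU\nflags\t\t: fpu sse2 avx2\n"
    host_b = "model name\t: Example CPU\nflags\t\t: fpu sse2\n"
    print(compare_hosts(host_a, host_b))
```

An empty result for every pair of hosts supports the conclusion that the hardware is not the cause.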

Initial Investigation

I checked the CPU map in /usr/share/libvirt/cpu_map.xml and confirmed the CPU is listed:

<model name="Haswell">
  <model name="Haswell-noTSX"></model>
  <feature name="hle"></feature>
  <feature name="rtm"></feature>
</model>
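The same lookup can be scripted instead of done by eye. This is a small Python 3 sketch under a few assumptions: the function name is hypothetical, and it simply collects every model name that appears in the XML, whether as a full definition or as a nested reference inside another model. It does not reproduce libvirt's own lookup logic.

```python
import xml.etree.ElementTree as ET


def model_names(cpu_map_xml):
    """Collect the name attribute of every <model> element in the given XML text."""
    root = ET.fromstring(cpu_map_xml)
    # iter() also visits the root element itself, so a bare <model> excerpt works.
    return {m.get("name") for m in root.iter("model") if m.get("name")}


if __name__ == "__main__":
    # The excerpt shown above, used as sample input.
    excerpt = """
    <model name="Haswell">
      <model name="Haswell-noTSX"></model>
      <feature name="hle"></feature>
      <feature name="rtm"></feature>
    </model>
    """
    print("Haswell-noTSX" in model_names(excerpt))
```

To check a real host, pass the contents of /usr/share/libvirt/cpu_map.xml on that host to model_names() and run it on every node, since the file ships with libvirt and can differ between nodes with different package versions.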

Since the CPUs are the same on all nodes, the problem had to be in how that CPU model was being looked up, not in the hardware. My first attempt was to disable the check by editing /usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py. That made the error disappear, but the instances stayed on their original hosts, which was not much of an improvement.

The Solution

Finally, I modified the /etc/nova/nova.conf files on both the controller node and the nova-compute nodes. The changes that fixed the issue were:

CPU Mode Setting

Old setting:

#cpu_mode=<none>

New setting:

cpu_mode=custom

CPU Model Setting

Old setting:

#cpu_model=<none>

New setting:

cpu_model=kvm64

I also had the following settings in place, which may or may not have been relevant:

virt_type=kvm
limit_cpu_features=false
live_migration_flag=VIR_MIGRATE_UNDEFINE_SOURCE, VIR_MIGRATE_PEER2PEER, VIR_MIGRATE_LIVE, VIR_MIGRATE_TUNNELLED

After restarting nova on both the controller and all three compute nodes, live migrations started working properly again.
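Because the same change has to be made on every node, it can help to check each nova.conf for the two settings above. The sketch below is Python 3 and makes two assumptions that are not stated in the steps above: that the options live in the [libvirt] section of nova.conf, and that you feed it the file contents as text. The function names are hypothetical.

```python
import configparser


def cpu_settings(conf_text, section="libvirt"):
    """Return (cpu_mode, cpu_model) from nova.conf text; None for anything unset."""
    # strict=False tolerates repeated options; commented-out lines are ignored.
    parser = configparser.RawConfigParser(strict=False)
    parser.read_string(conf_text)
    mode = parser.get(section, "cpu_mode", fallback=None)
    model = parser.get(section, "cpu_model", fallback=None)
    return mode, model


def problems(conf_text):
    """Describe anything that differs from the cpu_mode=custom / cpu_model=kvm64 setup."""
    mode, model = cpu_settings(conf_text)
    found = []
    if mode != "custom":
        found.append("cpu_mode is %r, expected 'custom'" % mode)
    if model != "kvm64":
        found.append("cpu_model is %r, expected 'kvm64'" % model)
    return found


if __name__ == "__main__":
    sample = "[libvirt]\nvirt_type=kvm\ncpu_mode=custom\ncpu_model=kvm64\n"
    print(problems(sample))  # an empty list means both settings are in place
```

Running it against the nova.conf from each node before restarting the services makes it easy to spot a node that was missed.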

Verification

Checking Instances for Each Node Before Migration

[root@c6320-n1 ~(keystone_admin)]# for i in {2..4}; do nova hypervisor-servers c6320-n$i; done
+--------------------------------------+-------------------+---------------+---------------------+
| ID                                   | Name              | Hypervisor ID | Hypervisor Hostname |
+--------------------------------------+-------------------+---------------+---------------------+
| aaac652f-65d9-49e4-aea2-603fc2db26c3 | instance-0000009c | 1             | c6320-n2            |
+--------------------------------------+-------------------+---------------+---------------------+
+----+------+---------------+---------------------+
| ID | Name | Hypervisor ID | Hypervisor Hostname |
+----+------+---------------+---------------------+
+----+------+---------------+---------------------+
+----+------+---------------+---------------------+
| ID | Name | Hypervisor ID | Hypervisor Hostname |
+----+------+---------------+---------------------+
+----+------+---------------+---------------------+

Performing the Migration

[root@c6320-n1 ~(keystone_admin)]# nova live-migration aaac652f-65d9-49e4-aea2-603fc2db26c3 c6320-n4

Verifying Instance Movement from Node 2 to Node 4

[root@c6320-n1 ~(keystone_admin)]# for i in {2..4}; do nova hypervisor-servers c6320-n$i; done
+----+------+---------------+---------------------+
| ID | Name | Hypervisor ID | Hypervisor Hostname |
+----+------+---------------+---------------------+
+----+------+---------------+---------------------+
+----+------+---------------+---------------------+
| ID | Name | Hypervisor ID | Hypervisor Hostname |
+----+------+---------------+---------------------+
+----+------+---------------+---------------------+
+--------------------------------------+-------------------+---------------+---------------------+
| ID                                   | Name              | Hypervisor ID | Hypervisor Hostname |
+--------------------------------------+-------------------+---------------+---------------------+
| aaac652f-65d9-49e4-aea2-603fc2db26c3 | instance-0000009c | 3             | c6320-n4            |
+--------------------------------------+-------------------+---------------+---------------------+
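For a demo with more than a handful of instances, reading these tables by eye gets tedious. The before/after check can be sketched in Python 3 as below. It assumes the captured text is the table output of nova hypervisor-servers as shown above, with the same column headings; the function names are made up for illustration.

```python
def parse_tables(text):
    """Turn one or more concatenated CLI tables into a list of row dicts."""
    rows, header = [], None
    for line in text.splitlines():
        line = line.strip()
        if not (line.startswith("|") and line.endswith("|")):
            continue  # skip +---+ borders, prompts and blank lines
        cells = [cell.strip() for cell in line[1:-1].split("|")]
        if header is None or cells == header:
            header = cells  # first header, or the same header repeated by the next table
        else:
            rows.append(dict(zip(header, cells)))
    return rows


def host_of(instance_id, text):
    """Return the hypervisor hostname listed for an instance ID, or None if absent."""
    for row in parse_tables(text):
        if row.get("ID") == instance_id:
            return row.get("Hypervisor Hostname")
    return None


if __name__ == "__main__":
    after = """
    +----+------+---------------+---------------------+
    | ID | Name | Hypervisor ID | Hypervisor Hostname |
    +----+------+---------------+---------------------+
    +----+------+---------------+---------------------+
    +--------------------------------------+-------------------+---------------+---------------------+
    | ID                                   | Name              | Hypervisor ID | Hypervisor Hostname |
    +--------------------------------------+-------------------+---------------+---------------------+
    | aaac652f-65d9-49e4-aea2-603fc2db26c3 | instance-0000009c | 3             | c6320-n4            |
    +--------------------------------------+-------------------+---------------+---------------------+
    """
    print(host_of("aaac652f-65d9-49e4-aea2-603fc2db26c3", after))
```

Comparing the result of host_of() on the output captured before and after the migration confirms the move without scanning the tables manually.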

Conclusion

Setting cpu_mode=custom and cpu_model=kvm64 in nova.conf resolved the CPU compatibility error, and live migrations between the compute nodes worked again. This gives every guest the same generic CPU model on every host, which sidesteps the model lookup that was failing even though the physical CPUs are identical.

This post is licensed under CC BY 4.0 by the author.