# Troubleshoot Maintenance Mode Migration Failures

## Overview

VM migration failures during maintenance mode are most commonly caused by resource constraints on the destination host, affinity rule conflicts, or CPU model mismatches between the source and destination hosts. This guide explains how to identify the cause, safely abort or retry a failed migration, and recover any VMs left in an error state.

For how maintenance mode works and how to enable or disable it, see [Maintenance Mode](/private-cloud-director/virtualized-clusters/add-hosts-virtualized-cluster/maintenance-mode.md).

In this guide, you will diagnose why migrations failed, resolve the blocking condition, and restore stranded VMs to a running state.

## Identify Which VMs Failed to Migrate

Open the migration progress panel for the host in maintenance mode:

1. Navigate to **Infrastructure > Cluster Hosts** in the Private Cloud Director UI.
2. Select the host that is in maintenance mode.
3. Click **See Details** on the maintenance mode banner to open the **View Migration Progress** panel.

The panel lists each VM and its migration status. VMs with a `Failed` status are the ones to investigate.

For each failed VM, note the VM name and check the Compute Service log on the source host for the migration error:

```bash
grep -A 5 "<vm-name-or-uuid>" /var/log/pf9/ostackhost.log | tail -40
```
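To narrow the output to likely failure messages, the same log can be filtered by severity. The following is a minimal sketch: `vm_errors` is a hypothetical helper name, and the `ERROR` and `Traceback` markers are assumptions about the log format, so adjust them if your log lines look different.

```bash
# Sketch: print the last 20 lines that mention the VM and look like errors.
# $1 = VM name or UUID
# $2 = log file (defaults to the Compute Service log path used above)
vm_errors() {
  grep -F -- "$1" "${2:-/var/log/pf9/ostackhost.log}" \
    | grep -E 'ERROR|Traceback' \
    | tail -n 20
}
```

For example, `vm_errors <vm-name-or-uuid>` on the source host prints only the matching error lines, which are usually the quickest pointer to one of the causes described below.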

## Common Causes of Migration Failure

### Insufficient Resources on Destination Hosts

If no destination host has enough free CPU or memory to accept the VM, the migration fails with a "No valid host was found" error.

**What to check:**

```bash
pcdctl host list
```

Review the `vCPUs used` and `RAM used` columns for each host in the cluster. If every destination host is near capacity, either free up resources (for example, by shutting down idle VMs) or add another host to the cluster before retrying maintenance mode.
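As a rough illustration of the capacity check, the sketch below takes per-host headroom that you have worked out by hand and lists the hosts that could hold a VM of a given size. The helper name `hosts_with_room` and the three-column input format (`name free_vcpus free_ram_mb`) are assumptions made for this example, not output of `pcdctl`. The real scheduler also applies its own filters and overcommit settings, so treat the result as an approximation.

```bash
# Sketch: read "name free_vcpus free_ram_mb" lines on stdin and print the
# names of hosts with at least the requested headroom.
# $1 = vCPUs the VM needs
# $2 = RAM the VM needs, in MB
hosts_with_room() {
  awk -v cpus="$1" -v ram="$2" '$2 >= cpus && $3 >= ram { print $1 }'
}
```

For example, `printf 'host-a 2 4096\nhost-b 8 16384\n' | hosts_with_room 4 8192` prints only `host-b`. If the command prints nothing, no host in the list has room for the VM.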

### CPU Model Mismatch

Live migration requires that the source and destination hosts expose compatible CPU models to the VM. If a host in the cluster was recently upgraded and its effective CPU model now differs from that of the source host, live migration fails.

**What to check:**

```bash
# Run on both the source and destination host
virsh domcapabilities | grep "model usable='yes'" | sort
```

Compare the lists. The selected CPU model for the cluster (visible in `nova_override.conf` under `[libvirt] cpu_models`) must appear as usable on both hosts. See [Resolve CPU Baseline Mismatch After Host Upgrade](/private-cloud-director/virtualized-clusters/troubleshooting-and-log-files/cpu-baseline-mismatch.md) for steps to correct a mismatch.
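Comparing two long lists by eye is error-prone, so the comparison can be scripted. This is a sketch only: `usable_models` and `models_missing_on_dest` are hypothetical helper names, and the `sed` pattern assumes each usable model appears on a single line in the form `<model usable='yes' ...>NAME</model>`.

```bash
# Sketch: extract usable CPU model names from `virsh domcapabilities` output
# read on stdin, one name per line, sorted and de-duplicated.
usable_models() {
  sed -n "s|.*<model usable='yes'[^>]*>\([^<]*\)</model>.*|\1|p" | sort -u
}

# Sketch: print the models usable on the source but not on the destination.
# $1 = file with the source host's model list
# $2 = file with the destination host's model list
# Both files must be sorted, as produced by usable_models.
models_missing_on_dest() {
  comm -23 "$1" "$2"
}
```

On each host, run `virsh domcapabilities | usable_models > models.txt`, copy both files to one machine, and pass them to `models_missing_on_dest`. If the cluster's configured CPU model appears in the output, the destination host cannot offer that model.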

### Affinity or Anti-Affinity Rule Conflicts

Maintenance mode honors hard affinity and anti-affinity rules. If a VM has a hard affinity rule requiring it to be co-located with another VM that is also being migrated, and no destination host can satisfy both, the migration fails.

Review the VM's affinity group in the UI: navigate to **Compute > VM Affinity Anti-Affinity Rules** and confirm which group the VM belongs to. If the conflict cannot be resolved automatically, you may need to temporarily remove the hard rule, migrate the VM manually, and re-apply the rule.

### VMs in Error or Unmigratable States

As described in [VM States](/private-cloud-director/virtualized-clusters/add-hosts-virtualized-cluster/maintenance-mode.md#vm-states) in the Maintenance Mode guide, VMs in `Error`, `Suspended`, `Shutdown`, `Rescued`, or `Pending resize confirmation` states are skipped by maintenance mode. VMs in `Error` state must be recovered first.

To recover a VM in `Error` state, attempt a hard reboot:

```bash
pcdctl server reboot --hard <vm-id>
```

If the hard reboot does not resolve the error, see [Recover VMs in ERROR State After Host Reboot or Patching](/private-cloud-director/virtualized-clusters/troubleshooting-and-log-files/recover-vms-in-error-state.md).
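When reviewing many VMs before retrying, a small check can flag which ones maintenance mode would skip. The sketch below assumes the states listed above correspond to the standard OpenStack Compute status values `ERROR`, `SUSPENDED`, `SHUTOFF`, `RESCUE`, and `VERIFY_RESIZE`. That mapping is an assumption and not confirmed by this guide, and `is_skipped_status` is a hypothetical helper name.

```bash
# Sketch: succeed (exit 0) if a VM with this status would be skipped by
# maintenance mode, under the assumed status mapping described above.
# $1 = VM status string
is_skipped_status() {
  case "$1" in
    ERROR|SUSPENDED|SHUTOFF|RESCUE|VERIFY_RESIZE) return 0 ;;
    *) return 1 ;;
  esac
}
```

For example, `is_skipped_status ERROR && echo "recover this VM first"` prints the reminder, while a status of `ACTIVE` produces no output.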

## Abort Maintenance Mode and Retry

If maintenance mode is partially complete and you want to stop it, disable maintenance mode from the UI:

1. Navigate to **Infrastructure > Cluster Hosts**.
2. Select the host in maintenance mode.
3. Click **Other > Disable Maintenance Mode** or use the **Disable Maintenance Mode** button on the Host Details page.

Disabling maintenance mode marks the host as schedulable again. VMs that were already successfully migrated remain on their destination hosts; they are not migrated back automatically.

After resolving the blocking condition (freeing resources, fixing the CPU model, or recovering error-state VMs), re-enable maintenance mode to migrate the remaining VMs.

## Manually Migrate a Stranded VM

If a specific VM cannot be migrated by maintenance mode (for example, it has a Virtual TPM or an unresolvable affinity constraint), migrate it manually before enabling maintenance mode:

```bash
pcdctl server migrate <vm-id>
```

To target a specific destination host:

```bash
pcdctl server migrate --host <destination-host-name> <vm-id>
```

After the manual migration completes, the VM is no longer on the source host and maintenance mode can proceed without encountering it.
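If several VMs need manual migration, the command can be wrapped in a loop. The following is a sketch under stated assumptions: `migrate_all` is a hypothetical helper, and the `PCDCTL` variable is introduced here only so that the CLI can be swapped for `echo` in a dry run. The `migrate` command may return before a migration has finished, so confirm each VM's host afterward.

```bash
# Sketch: read VM IDs from stdin (one per line) and request a migration for
# each. Set PCDCTL=echo to print the commands instead of running them.
migrate_all() {
  while read -r id; do
    [ -n "$id" ] || continue   # skip blank lines
    # Redirect stdin so the CLI cannot consume the remaining IDs.
    ${PCDCTL:-pcdctl} server migrate "$id" </dev/null \
      || echo "migrate request failed: $id" >&2
  done
}
```

For example, `PCDCTL=echo; migrate_all < vm-ids.txt` shows what would be run. Unset `PCDCTL` to issue the real requests.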

## Recover a VM Left in Error State After Migration

If a VM ended up in `Error` state during maintenance mode migration, recover it with a hard reboot:

```bash
pcdctl server reboot --hard <vm-id>
```

If the VM still reports the host in maintenance mode (the source) as its hypervisor host, the VM's record in the management plane may need to be reset. Contact [Platform9 Support](https://support.platform9.com/) with the VM UUID and the Compute Service log from the source host.

For a full procedure covering VMs that remain in `Error` after host maintenance events, see [Recover VMs in ERROR State After Host Reboot or Patching](/private-cloud-director/virtualized-clusters/troubleshooting-and-log-files/recover-vms-in-error-state.md).

