Volume Attach / Detach Troubleshooting
Overview
Volume attach and detach operations involve coordinated work between the Compute Service and the Persistent Storage Service. When either operation stalls or fails, a volume can become stuck in a state such as detaching, attaching, or reserved, blocking further operations on the volume and sometimes on the virtual machine.
This guide covers the most common root causes, how to detect each one, and the safe steps to recover a volume stuck in an attach or detach state without data loss.
Prerequisites
pcdctl configured and authenticated against your region.
Access to the host running the affected VM (for systemctl and virsh commands).
For Self-Hosted deployments: kubectl access to the management-plane namespace.
Common Causes
Volume stuck in detaching: Compute Service failed to complete the disconnect; volume record not cleaned up.
Volume stuck in attaching: Transport-level failure (iSCSI/NFS) during the LUN-mapping step.
Volume stuck in reserved: Compute Service crashed between reserving and completing the attachment.
Stale attachment record: VM was deleted or migrated without detaching volumes first.
Nova BDM inconsistency: Block device mapping record left over after a failed live migration.
Detect a Stuck Volume
Check Volume Status
Look at the status field. A volume in a healthy workflow should pass through intermediate states (attaching, detaching, reserved) in under two minutes. A volume that stays in one of those states for longer than five minutes is stuck.
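The five-minute rule above can be expressed as a small check. This is a minimal sketch, not the guide's own tooling: the function name is_volume_stuck and the STUCK_AFTER_SECONDS variable are hypothetical, and the caller is assumed to supply the volume status and the number of seconds the volume has spent in that status.

```shell
# Threshold from the text: more than five minutes in an intermediate state.
STUCK_AFTER_SECONDS=300

# is_volume_stuck STATUS AGE_SECONDS  (hypothetical helper)
# Returns 0 (true) only for an intermediate status held longer than the threshold.
is_volume_stuck() {
  local status="$1" age="$2"
  case "$status" in
    attaching|detaching|reserved)
      [ "$age" -gt "$STUCK_AFTER_SECONDS" ]
      ;;
    *)
      # available, in-use, error, and so on are not "stuck" by this rule
      return 1
      ;;
  esac
}
```

For example, is_volume_stuck detaching 420 succeeds, while is_volume_stuck detaching 90 does not.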
To list all volumes in a potentially stuck state at once:
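The original command for this step is not reproduced here. As a hedged sketch, assuming the standard OpenStack CLI (openstack) is installed and authenticated against the same region, one query per intermediate state could look like this. The function name list_stuck_volumes is hypothetical.

```shell
# list_stuck_volumes  (hypothetical helper; assumes the openstack CLI is available)
# Prints ID, name, and status for every volume in an intermediate state.
list_stuck_volumes() {
  local status
  for status in attaching detaching reserved; do
    # --status filters by volume status; -f value -c ... prints bare columns
    openstack volume list --status "$status" -f value -c ID -c Name -c Status
  done
}
```

A volume that appears here is only a candidate: it is stuck if it is still listed several minutes later.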
Identify the Attached VM and Host
The output includes the server_id (VM UUID) and the host_name of the compute node. Record both — you need them for subsequent steps.
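One way to retrieve those fields, assuming the standard OpenStack CLI is available (an assumption, since the guide's own command is not shown here), is to print only the attachments field of the volume. The function name show_volume_attachments is hypothetical.

```shell
# show_volume_attachments VOLUME_UUID  (hypothetical helper)
# Prints the volume's attachments as JSON, including server_id and host_name
# where the API exposes them to the calling user.
show_volume_attachments() {
  local volume_id="$1"
  openstack volume show "$volume_id" -f json -c attachments
}
```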
Check the Attachment Record on the Compute Host
SSH to the compute host identified above and inspect the block device mapping:
If the volume device appears in virsh output, the hypervisor still considers it attached. If virsh does not list it but the Persistent Storage Service shows in-use, the records are inconsistent.
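The hypervisor-side check can be scripted as a sketch like the following. It assumes that the disk source path reported by virsh domblklist embeds the volume UUID, which holds for many but not all backends, and falls back to the disk serial in the domain XML, which Nova typically sets to the volume ID. The function name volume_attached_in_hypervisor is hypothetical. Run it on the compute host with sufficient privileges to query libvirt.

```shell
# volume_attached_in_hypervisor VM_UUID VOLUME_UUID  (hypothetical helper)
# Exit 0: hypervisor still lists the volume. Exit 1: not listed.
# Exit 2: virsh could not be queried, so no conclusion can be drawn.
volume_attached_in_hypervisor() {
  local vm_uuid="$1" volume_uuid="$2" blk xml
  blk="$(virsh domblklist "$vm_uuid" --details)" || return 2
  case "$blk" in *"$volume_uuid"*) return 0 ;; esac
  # Fallback: look for <serial>VOLUME_UUID</serial> in the domain XML.
  xml="$(virsh dumpxml "$vm_uuid")" || return 2
  case "$xml" in *"<serial>$volume_uuid</serial>"*) return 0 ;; esac
  return 1
}
```

A virsh failure is reported separately (exit 2) so that an unreachable hypervisor is never mistaken for a clean detach.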
Recover a Volume Stuck in detaching
A volume stuck in detaching most often means the Compute Service sent a detach request but did not receive confirmation from the storage host. Follow these steps in order.
Step 1 — Verify the Compute Service Is Healthy
On the affected compute host:
If the service is not running, restart it and wait up to two minutes for in-flight operations to complete:
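A minimal sketch of the check-then-restart logic, using standard systemctl subcommands. The systemd unit name of the Compute Service is deployment-specific and is not given here, so it is passed in as a parameter; ensure_service_running is a hypothetical helper name. Run it as root on the host.

```shell
# ensure_service_running UNIT  (hypothetical helper; UNIT is deployment-specific)
# Restarts the unit only if systemd does not report it as active.
ensure_service_running() {
  local unit="$1"
  if systemctl is-active --quiet "$unit"; then
    echo "$unit is active"
    return 0
  fi
  echo "$unit is not active; restarting"
  systemctl restart "$unit"
}
```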
Check whether the volume transitions to available on its own within two minutes:
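The two-minute wait can be automated with a simple polling loop. This sketch assumes the standard OpenStack CLI is available; the function name wait_for_volume_available and its default timeout and interval are illustrative choices, not values mandated by the product.

```shell
# wait_for_volume_available VOLUME_UUID [TIMEOUT_SECONDS] [INTERVAL_SECONDS]
# (hypothetical helper) Polls the volume status until it is "available"
# or the timeout elapses. Defaults: 120 s timeout, 10 s between polls.
wait_for_volume_available() {
  local volume_id="$1" timeout="${2:-120}" interval="${3:-10}"
  local waited=0 status=""
  while [ "$waited" -lt "$timeout" ]; do
    status="$(openstack volume show "$volume_id" -f value -c status)"
    if [ "$status" = "available" ]; then
      echo "available after ${waited}s"
      return 0
    fi
    sleep "$interval"
    waited=$((waited + interval))
  done
  echo "still '$status' after ${timeout}s" >&2
  return 1
}
```

If the function returns non-zero, continue with the next step rather than waiting longer.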
Step 2 — Verify the Persistent Storage Service Is Healthy
On the block storage host:
Review logs for errors related to the volume UUID:
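If the service logs to the systemd journal (an assumption; some deployments write to log files instead), the log review could be sketched as follows. The unit name is deployment-specific and passed as a parameter, and volume_log_errors is a hypothetical helper name.

```shell
# volume_log_errors UNIT VOLUME_UUID [SINCE]  (hypothetical helper)
# Prints journal lines that mention the volume UUID and look like errors.
# Multi-line tracebacks that do not repeat the UUID are not captured.
volume_log_errors() {
  local unit="$1" volume_uuid="$2" since="${3:-1 hour ago}"
  journalctl -u "$unit" --since "$since" --no-pager \
    | grep -F "$volume_uuid" \
    | grep -iE 'error|exception|traceback'
}
```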
Self-Hosted deployments only
If the cinder-volume service runs as a pod rather than a systemd unit, check its status with:
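A hedged sketch of the pod check using standard kubectl options. The management-plane namespace name is not given in this guide, so it is a required parameter; cinder_volume_pods is a hypothetical helper name, and matching on the string cinder-volume assumes the pod names contain it.

```shell
# cinder_volume_pods NAMESPACE  (hypothetical helper)
# Lists pods in the given namespace whose name contains "cinder-volume",
# with their READY, STATUS, and RESTARTS columns.
cinder_volume_pods() {
  local namespace="$1"
  kubectl get pods -n "$namespace" --no-headers | grep cinder-volume
}
```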
Step 3 — Force-Reset the Volume State
If the Persistent Storage Service is healthy but the volume remains stuck, force-reset its state to available. This is safe only after you have confirmed that the volume is not genuinely connected to a running VM (see the virsh domblklist check above).
Before resetting state
Confirm that virsh domblklist <VM_UUID> no longer lists the volume device. Resetting state on a volume that is still presented to a running VM can cause data corruption.
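The precondition and the reset can be combined so that the reset never runs while the hypervisor still lists the volume. This is a sketch under two assumptions: the standard OpenStack CLI is available with admin rights (openstack volume set --state), and the disk source path in virsh domblklist output embeds the volume UUID. If your backend uses opaque device paths, verify the detach manually instead. The function name reset_volume_if_detached is hypothetical.

```shell
# reset_volume_if_detached VM_UUID VOLUME_UUID  (hypothetical helper)
# Exit 2: hypervisor could not be queried. Exit 1: volume still presented.
# Otherwise resets the volume state to "available".
reset_volume_if_detached() {
  local vm_uuid="$1" volume_uuid="$2" blk
  if ! blk="$(virsh domblklist "$vm_uuid" --details)"; then
    echo "cannot query hypervisor; not resetting" >&2
    return 2
  fi
  case "$blk" in
    *"$volume_uuid"*)
      echo "volume still presented to $vm_uuid; not resetting" >&2
      return 1
      ;;
  esac
  openstack volume set --state available "$volume_uuid"
}
```

The function needs both virsh and the openstack CLI on the same machine. If they live on different hosts, run the virsh check on the compute host first and the reset from your admin workstation.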
Step 4 — Clean Up Stale Attachment Records
If a stale attachment record remains after the state reset, delete it:
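As an illustration, assuming a recent OpenStack CLI that provides the volume attachment commands (these require Block Storage API microversion 3.27 or later), the cleanup might look like this. The function name delete_volume_attachments is hypothetical. It deletes every attachment record for the volume, so use it only after the checks above.

```shell
# delete_volume_attachments VOLUME_UUID  (hypothetical helper)
# Lists attachment records for the volume and deletes each one.
delete_volume_attachments() {
  local volume_uuid="$1" attachment_id
  openstack --os-volume-api-version 3.27 volume attachment list \
      --volume-id "$volume_uuid" -f value -c ID |
  while read -r attachment_id; do
    [ -n "$attachment_id" ] || continue
    echo "deleting attachment $attachment_id"
    # </dev/null keeps the CLI from consuming the loop's input
    openstack --os-volume-api-version 3.27 volume attachment delete \
        "$attachment_id" </dev/null
  done
}
```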
Step 5 — Clean Up LUN Mappings on the Storage Backend
For SAN-backed volumes (iSCSI or Fibre Channel), the storage array may retain a LUN mapping even after the software-level detach completes. Check this from the storage management interface of your backend:
NetApp ONTAP: verify that no igroup mapping exists for the host WWN / IQN that was using the volume.
Pure Storage: verify that the host connection is removed in the Pure array management interface.
Hitachi VSP: verify that the host group mapping no longer includes the LUN.
If a stale LUN mapping exists, remove it using your storage array's management tools. A stale mapping does not prevent the volume from being reused by a different VM, but it consumes resources on the storage array and can cause confusion during future attach operations.
Recover a Volume with Nova BDM Inconsistency
A Nova Block Device Mapping (BDM) inconsistency can occur when a live migration fails partway through and the migration rollback does not fully clean up. The volume appears in-use but is not attached to any running VM.
Detect the Inconsistency
If the VM does not exist, or is in ERROR state, the BDM record is stale.
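The rule above can be sketched as a small check, assuming the standard OpenStack CLI is available. The function name bdm_record_is_stale is hypothetical. Note that a failed lookup is treated as "VM does not exist", so confirm that your credentials and region are correct before trusting that result.

```shell
# bdm_record_is_stale VM_UUID  (hypothetical helper)
# Exit 0 when the VM is missing or in ERROR, meaning the BDM record is stale.
bdm_record_is_stale() {
  local vm_uuid="$1" status
  if ! status="$(openstack server show "$vm_uuid" -f value -c status 2>/dev/null)"; then
    echo "VM $vm_uuid not found: BDM record is stale"
    return 0
  fi
  if [ "$status" = "ERROR" ]; then
    echo "VM $vm_uuid is in ERROR: BDM record is stale"
    return 0
  fi
  echo "VM $vm_uuid is $status: BDM record may be valid"
  return 1
}
```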
Clean Up the Inconsistency
Reset the volume state to available:
Delete stale attachment records:
If the VM is in ERROR state and you need to recover it, rebuild or delete it through the Compute Service. The volume will remain available and can be reattached to a new or recovered VM.
Orphaned Attachments After Tenant or VM Deletion
When a tenant is deleted while volumes are still attached to VMs, or when VMs are force-deleted without detaching volumes first, orphaned attachment records can accumulate. These records cause subsequent volume operations to fail with "volume already in use" errors even though no active VM is using the volume.
To find orphaned attachments:
If the VM UUID from the attachment does not appear in the server list, the attachment is orphaned. Delete it:
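The comparison between attachment records and the server list can be sketched as follows. It assumes admin credentials and a recent OpenStack CLI with the volume attachment commands (Block Storage API microversion 3.27 or later); the function name find_orphaned_attachments is hypothetical. The function only reports candidates and deletes nothing.

```shell
# find_orphaned_attachments  (hypothetical helper)
# Prints "ATTACHMENT_ID VOLUME_ID SERVER_ID" for each attachment whose
# server ID is absent from the server list.
find_orphaned_attachments() {
  local servers attachment_id volume_id server_id
  servers="$(openstack server list --all-projects -f value -c ID)" || return 2
  openstack --os-volume-api-version 3.27 volume attachment list --all-projects \
      -f value -c ID -c "Volume ID" -c "Server ID" |
  while read -r attachment_id volume_id server_id; do
    # Skip blank rows and attachments that have no server recorded
    [ -n "$attachment_id" ] || continue
    [ -n "$server_id" ] && [ "$server_id" != "None" ] || continue
    # -x: whole-line match, -F: fixed string, so UUIDs match exactly
    if ! printf '%s\n' "$servers" | grep -qxF "$server_id"; then
      echo "$attachment_id $volume_id $server_id"
    fi
  done
}
```

Review each reported line before deleting the attachment, for example with openstack volume attachment delete followed by the attachment ID.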
Next Steps
Review Volume State for a full list of volume status values and their meanings.
If the volume was stuck because of a failed live migration, see Storage Live Migration for live-migration prerequisites and known limitations.
For persistent or recurring attach/detach failures, see Troubleshooting Cinder Issues for service-level diagnostics.
