How To Configure Multi-node Highly Available Glance Image Catalog In A Region
With Platform9 Managed OpenStack, administrators can authorize multiple hosts within a region to act as OpenStack Glance image library servers. Adding multiple image libraries has the following benefits:
- It lets you create a highly available Glance image catalog deployment that balances load for data-heavy block storage volume (cinder) and virtual machine instance (nova) provisioning operations.
- It can eliminate downtime when performing maintenance on or migrating an image library role to a new host.
This tutorial describes how to configure a multi-node image catalog and provides tips that will help simplify OpenStack Glance image management.
For some general information on the Platform9 Image Library infrastructure, see Handling OpenStack Cloud Images in Platform9.
What is the Platform9 Image Library Role?
In an OpenStack cloud, the Glance collection of services is in charge of managing virtual machine images and making them available for provisioning operations. In Platform9 Managed OpenStack, Glance services are installed and managed on an on-premises customer host when the administrator applies the Image Library role to the host. This role installs and configures the glance-api service as well as a supporting pf9-imagelibrary service. The pf9-imagelibrary service implements image discovery and performs maintenance on the glance image repository.
The Glance service is a simple HTTP web service that stores image metadata and manages an image file repository on the local filesystem. When provisioning an instance in nova, nova-compute reads image metadata from glance and downloads the virtual machine image from glance's image store. Conversely, when nova is asked to create an image from a VM disk, nova writes image metadata to glance and uploads an image file to the glance image store. Cinder volume provisioning and image creation follow similar workflows.
Deploying Multiple Image Libraries
All nova VM and cinder volume provisioning operations involve glance, so it's important that glance be reliable and able to scale. Deploying multiple glance services helps with both these requirements:
- When one of the image library hosts is down, nova and cinder can choose a working image library host to complete an operation. You now have a highly available glance image catalog.
- When servicing a host or migrating an image library to a new host, having an additional image library host in service eliminates provisioning downtime.
- When many provisioning operations are happening at the same time, the load on the glance-api service can be distributed among all the image library hosts, reducing the load on each.
How Does it Work?
When authorizing a new glance/image library host, Platform9 adds the address of the host to the keystone service catalog. When asked to perform a deployment operation involving images, nova and cinder query the service catalog for a list of glance-api service addresses. The list is randomized, and the operation is attempted against each glance service in turn until one succeeds. Connection errors and missing image file errors are caught in the deployment logic in each service, and the next service address is attempted. The deployment operation fails only when all the available services have failed to do their job.
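The retry behaviour described above can be sketched in a few lines of Python. This is only an illustration of the pattern (shuffle the endpoint list, try each in turn, fail when all have failed), not the actual nova or cinder code. The names fetch_with_failover and AllImageServicesFailed, the fetch callback, and the choice of exception types are all assumptions made for this example.

```python
import random


class AllImageServicesFailed(Exception):
    """Raised when every glance-api endpoint has been tried without success."""


def fetch_with_failover(endpoints, fetch, rng=random):
    """Try each endpoint in random order and return the first successful result.

    `fetch` is a hypothetical callable that takes one endpoint address and
    either returns the image data or raises ConnectionError (host down) or
    FileNotFoundError (image file missing on that host).
    """
    candidates = list(endpoints)
    rng.shuffle(candidates)  # randomize the order to spread load across hosts
    errors = []
    for endpoint in candidates:
        try:
            return fetch(endpoint)
        except (ConnectionError, FileNotFoundError) as exc:
            # Remember the failure and move on to the next image library host.
            errors.append((endpoint, exc))
    # Only reached when every endpoint has failed.
    raise AllImageServicesFailed(errors)
```

Because the order is shuffled on every call, concurrent provisioning requests tend to land on different image library hosts, which is where the load-spreading benefit comes from.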
Image File Storage
When authorizing the image library/glance role on a host, the administrator must choose a location on disk to act as an image file repository. The default is /var/opt/pf9/imagelibrary/data, but this can be any location available on the host.
There are three ways to add an image file to the Glance image repository in Platform9 Managed OpenStack:
- Copy an image file into the repository directory. The supporting pf9-imagelibrary service will detect the new file, calculate Glance metadata for it, and add it to Glance.
- Upload an image using the Glance command line client (or through some other REST client). See Tutorial: Manage Images with the OpenStack Glance Client.
- Create an image as a snapshot of a virtual machine disk or cinder volume.
In each case, when a new image is created, it will only be uploaded to one of the authorized image library hosts, and the image file will be stored in the chosen image repository location on that host. Requests to download the file from another image library host will fail unless the image file is made available on that host. To maximize image availability and fault tolerance, each image library host must have access to all the image files catalogued by the Glance service. There are two ways to ensure this:
- If all of the authorized image library hosts can be attached to a shared storage backend (for example an NFS share), any files that are uploaded to one of the image library hosts are immediately available for provisioning from any of them. See Configure NFS Shared Storage for OpenStack Glance Image Catalog or VM Storage.
- If shared storage is not available, and each image library uses independent storage for image files, then the image file repositories must be periodically synchronized. There are numerous tools available to do this, for example rsync and unison.
In either case the image repository directory must be the same on each image library host. This means that if one image library host mounts an NFS image repository filesystem at /images, all others must use the same mount point.
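When the repositories use independent storage, it can help to check which files each host still lacks before or after a synchronization run. The following Python sketch is one possible way to do that and is not part of Platform9's tooling: list_image_files and missing_per_host are hypothetical helper names, and the example assumes you have already gathered a file listing from each host (for instance by running the first function on each of them).

```python
import os


def list_image_files(repo_dir):
    """Return the relative paths of all regular files under repo_dir."""
    found = set()
    for root, _dirs, files in os.walk(repo_dir):
        for name in files:
            full = os.path.join(root, name)
            found.add(os.path.relpath(full, repo_dir))
    return found


def missing_per_host(listings):
    """Given {host: set_of_files}, return {host: sorted files it lacks}.

    A file counts as lacking on a host when at least one other host has it.
    """
    all_files = set().union(*listings.values())
    return {host: sorted(all_files - files) for host, files in listings.items()}
```

An empty list for every host means the repositories hold the same set of file names. The comparison is by name only, so it will not notice files whose contents differ.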
The Platform9 user interface's Images view can provide clues about the availability of your images. The Host Status description for each image shows whether or not the image is available on each image library host.
For example, consider a scenario in which bob200-10-4-253-162.platform9.sys and platform9-support-host are two different authorized image library hosts. These hosts do not share storage, and the image repositories have not been properly synchronized. As a result, bob200-10-4-253-162.platform9.sys can only provide snap2 and cirros5.img, while platform9-support-host can only provide snap1. All three images are currently available for provisioning, but a failure of either host will result in the unavailability of some images.
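To make the risk in this scenario concrete, the short Python sketch below works out which images would become unavailable if a given host failed. It is purely illustrative: the function name images_lost_if_host_fails and the dictionary layout are assumptions, and the status recorded for each image on the host that cannot provide it is assumed here to be "missing".

```python
def images_lost_if_host_fails(host_status):
    """Map each host to the images that only that host can currently provide.

    host_status has the shape {image_name: {host_name: status_string}}.
    """
    hosts = {h for statuses in host_status.values() for h in statuses}
    lost = {h: [] for h in hosts}
    for image, statuses in host_status.items():
        ok_hosts = [h for h, s in statuses.items() if s == "ok"]
        if len(ok_hosts) == 1:
            # Exactly one host can serve this image: a single point of failure.
            lost[ok_hosts[0]].append(image)
    return {h: sorted(images) for h, images in lost.items()}


BOB = "bob200-10-4-253-162.platform9.sys"
SUPPORT = "platform9-support-host"

# Status values for the scenario in the text ("missing" is an assumption).
scenario = {
    "snap1": {BOB: "missing", SUPPORT: "ok"},
    "snap2": {BOB: "ok", SUPPORT: "missing"},
    "cirros5.img": {BOB: "ok", SUPPORT: "missing"},
}

at_risk = images_lost_if_host_fails(scenario)
```

For this data, at_risk lists cirros5.img and snap2 under the first host and snap1 under the second. Once the repositories are synchronized and every status is "ok", both lists are empty.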
Note that other Host Status values may be displayed. Here is the complete list:
- ok: the image is available for provisioning from that host.
- missing: the image file doesn't exist on the host.
- no-access: the file exists, but its permissions prevent the glance-api service from reading it. Both the glance-api and pf9-imagelibrary services run as the pf9 user, so the pf9 user must be able to read an image file before it can be used for provisioning.
- cannot-delete: the image is available, but the file can't be deleted from the file system by the image library services. If the image is deleted in glance, it must be removed from the image repository filesystem manually. To avoid this, ensure that the pf9 user has write permission on the image file's immediate parent directory, and execute permission on each directory in the image path.
- offline: the host itself is not responding, or there was a problem with one of the services associated with the image library role.
To maximize fault tolerance in your image library cluster, you would ideally have a Host Status of 'ok' for every image, on every image library host.
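If you want to check permissions on an image library host yourself, the file-level statuses above map onto a few simple filesystem tests. The Python sketch below is an assumption about how such a check could look, not the pf9-imagelibrary implementation, and local_image_status is a hypothetical name. It evaluates permissions for whichever user runs it, so it is only meaningful when run as the pf9 user, and it cannot detect the offline status.

```python
import os


def local_image_status(image_path):
    """Classify one image file as 'missing', 'no-access', 'cannot-delete' or 'ok'.

    Permissions are checked for the user running this function.
    """
    # A file also looks missing if a directory in its path cannot be searched.
    if not os.path.isfile(image_path):
        return "missing"
    if not os.access(image_path, os.R_OK):
        return "no-access"
    # Deleting a file needs write and execute permission on its parent directory.
    parent = os.path.dirname(os.path.abspath(image_path))
    if not os.access(parent, os.W_OK | os.X_OK):
        return "cannot-delete"
    return "ok"
```

Calling local_image_status on each file in the repository directory gives a quick picture of which images need their ownership or permissions fixed.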
With the new features available for image management in Platform9's 2.2 release, it has become easier than ever to manage OpenStack Glance services for your cloud. The ability to install multiple image library roles improves reliability and reduces downtime, and improved health monitoring provides confidence that users have access to the images they need.
July 26, 2016