Best Practices: OpenStack Resource Overcommitment

This article explains the CPU and memory overcommitment features of OpenStack.

The KVM hypervisor can overcommit CPU and memory resources for virtual machines. This results in higher consolidation and cost savings. OpenStack Nova takes advantage of this ability.

Why overcommit in the first place?

On the KVM hypervisor, the Linux scheduler controls the resources allocated to virtual machines. Even though a virtual machine is configured with a certain resource capacity, most workloads do not need all of it all the time. For example, developer workstations are idle most of the time and need resources only occasionally, for tasks such as code compilation and testing. To achieve higher efficiency, KVM transparently shares physical capacity between virtual machines.

Resource sharing can lead to better utilization. However, in times of contention VM performance is impacted.

Resource overcommitment in OpenStack:

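By default, OpenStack Nova overcommits resources as follows:

CPU: 16 virtual CPUs per physical core (cpu_allocation_ratio = 16.0)
Memory: 1.5 times physical RAM (ram_allocation_ratio = 1.5)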

The Platform9 Infrastructure View provides a snapshot of the resource allocations in your private cloud.

Figure: Platform9 Infrastructure View

CPU is a much more fluid resource than memory, which holds state. A much higher overcommitment ratio is therefore possible for CPU without significant overhead.
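To make the effect of these ratios concrete, here is a minimal Python sketch of how much capacity becomes allocatable on a single host. It is an illustration, not Nova's scheduler code. The function name and the 8-core host are hypothetical. The 1.5 memory ratio is the one used in the swap example in this article, and the 16.0 CPU ratio is assumed from Nova's long-standing default for cpu_allocation_ratio; check the values configured in your own deployment.

```python
# Illustrative only: ratios are passed in by the caller (assumed defaults),
# not read from a live Nova deployment.

def schedulable_capacity(physical_cores, physical_ram_gb,
                         cpu_ratio=16.0, ram_ratio=1.5):
    """Return (vCPUs, RAM in GB) that may be allocated on one host."""
    return physical_cores * cpu_ratio, physical_ram_gb * ram_ratio


# Hypothetical host: 8 physical cores, 48 GB of RAM.
vcpus, ram_gb = schedulable_capacity(physical_cores=8, physical_ram_gb=48)
print(f"Allocatable: {vcpus:.0f} vCPUs, {ram_gb:.0f} GB RAM")
```

Lowering either ratio trades consolidation for more predictable performance under contention.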


Best Practices:

The default overcommitment values used by OpenStack fit the average use case and should provide good consolidation in most deployments. However, if you are concerned about the performance of your workloads, a few best practices can help.

1. Swap configuration on the hypervisor: Swap is important when running instances with KVM. Because of resource overcommitment, the combined memory demand from workloads and the underlying Linux system can exceed the physical memory available on the host. Configuring swap ensures that the system keeps operating correctly in that situation. For example, consider a server with 48GB of RAM. With its default overcommitment policy, OpenStack can provision virtual machines up to 1.5 times the memory size: 72GB total. In addition, let's assume 4GB of memory is needed for the Linux OS to run properly. In this case the swap space needed is (72 - 48) + 4 = 28GB.

2. Multi-CPU virtual machines: The Linux scheduler is very good at scheduling processes on available physical cores. With virtual machines, however, the relationship between the number of vCPUs configured for a single virtual machine and the number of physical CPUs available affects performance. For example, if a virtual machine with 4 vCPUs runs on a host with just 2 physical cores, performance is severely impacted. When deploying OpenStack, consider the base CPU model used across all hypervisor machines and restrict the flavors in the OpenStack environment accordingly. For example, if the base CPU model in the datacenter has 4 cores, consider disallowing OpenStack flavors with more than 4 virtual CPUs. Another good measure is to use host aggregates. Tagging flavors and host aggregates with matching tags ensures that OpenStack virtual machines run on the right hypervisors.

3. Utilization monitoring: The Platform9 Infrastructure View provides information on actual physical resource utilization across the datacenter. Possible overuse can be identified by monitoring the Platform9 resource utilization statistics.
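The sizing rules in items 1 and 2 can be sketched in a few lines of Python. This is a rough planning aid under stated assumptions, not part of OpenStack or Platform9: the function names and the flavor names are hypothetical, and the 4GB OS reserve is the assumption used in the example above.

```python
def swap_needed_gb(physical_ram_gb, ram_ratio, os_reserve_gb):
    """Swap = memory committed beyond physical RAM, plus an OS reserve."""
    committed_gb = physical_ram_gb * ram_ratio
    overcommitted_gb = max(0.0, committed_gb - physical_ram_gb)
    return overcommitted_gb + os_reserve_gb


def allowed_flavors(flavors, min_host_cores):
    """Keep flavors whose vCPU count fits the smallest host's core count."""
    return [name for name, vcpus in flavors.items() if vcpus <= min_host_cores]


# Item 1: 48GB host, 1.5 memory ratio, 4GB assumed for the Linux OS.
print(swap_needed_gb(48, 1.5, 4))

# Item 2: hypothetical flavor catalog (name -> vCPUs), base CPU has 4 cores.
catalog = {"small": 1, "medium": 2, "large": 4, "xlarge": 8}
print(allowed_flavors(catalog, min_host_cores=4))
```

The first call reproduces the (72 - 48) + 4 calculation from item 1. The second drops the 8-vCPU flavor, which would exceed a 4-core host.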

Overcommitment of storage is a longer conversation. We plan to write a follow-up article on storage overcommitment and best practices.


October 14, 2015