| =============================== |
| Huge Pages Support for Ganeti |
| =============================== |
| This is a design document about implementing support for huge pages in |
| Ganeti. Note that Ganeti works with Transparent Huge Pages (THP); any |
| reference in this document to huge pages refers to explicit huge |
| pages. |
| |
| Current State and Shortcomings: |
| ------------------------------- |
| The Linux kernel allows using pages of larger size by setting aside a |
| portion of the memory. Using a larger page size may improve the |
| performance of applications that require a lot of memory, because |
| fewer pages are needed to map the same amount of memory, which |
| improves the TLB hit rate. To use huge pages, memory has to be |
| reserved beforehand. This portion of memory is subtracted from free |
| memory and is counted as in use. Currently, Ganeti cannot take proper |
| advantage of huge pages. If huge pages are reserved on a node and are |
| available to fulfill a VM request, Ganeti fails to recognize them and |
| considers the memory reserved for huge pages as used memory. As a |
| result, VMs fail to launch on a node where memory is available in the |
| form of huge pages rather than normal pages. |
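
The accounting problem described above can be illustrated with the
counters that Linux exposes in ``/proc/meminfo`` (``HugePages_Total``,
``HugePages_Free`` and ``Hugepagesize``). The following is only a minimal
sketch: the function name ``parse_hugepages`` and the sample values are
made up for illustration and are not part of Ganeti.

```python
def parse_hugepages(meminfo_text):
    """Extract huge page counters from /proc/meminfo-style text.

    Returns a dict with the total and free huge page memory in KiB and
    the size of a single huge page in KiB.
    """
    fields = {}
    for line in meminfo_text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue
        parts = rest.split()
        if parts:
            # The first token is the numeric value; a unit may follow.
            fields[name.strip()] = int(parts[0])
    page_kib = fields.get("Hugepagesize", 0)
    return {
        "total_kib": fields.get("HugePages_Total", 0) * page_kib,
        "free_kib": fields.get("HugePages_Free", 0) * page_kib,
        "page_kib": page_kib,
    }


# Made-up sample: 1024 huge pages of 2 MiB are reserved, 1000 are unused.
SAMPLE = """\
MemTotal:        8388608 kB
MemFree:          524288 kB
HugePages_Total:    1024
HugePages_Free:     1000
Hugepagesize:       2048 kB
"""

hp = parse_hugepages(SAMPLE)
```

In this sample, ``MemFree`` alone (512 MiB) is too small for a 1 GiB VM,
although the unused huge pages could hold it. This is the situation in
which Ganeti currently refuses to start the instance.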
| |
| Proposed Changes: |
| ----------------- |
| The following components will be changed in order for Ganeti to take |
| advantage of Huge Pages. |
| |
| Hypervisor Parameters: |
| ---------------------- |
| Currently, it is possible to set or modify the huge pages mount point |
| at the cluster level via the hypervisor parameter ``mem_path``:: |
| |
|   $ gnt-cluster init \ |
|   >   --enabled-hypervisors=kvm --nic-parameters link=br100 \ |
|   >   -H kvm:mem_path=/mount/point/for/hugepages |
| |
| This hypervisor parameter is inherited by all instances by default, |
| although it can be overridden at the instance level. |
| |
| The following changes will be made to the inheritance behaviour. |
| |
| - The hypervisor parameter ``mem_path`` and all other hypervisor |
|   parameters will be made available at the node group level (in |
|   addition to the cluster level), so that users can set defaults for |
|   the node group:: |
| |
|     $ gnt-group add/modify \ |
|     >   -H hv:parameter=value |
| |
|   This changes the hypervisor parameter inheritance chain to:: |
| |
|     cluster -> group -> OS -> instance |
| |
| - Furthermore, the hypervisor parameter ``mem_path`` will be changeable |
|   only at the cluster or node group level, and users must not be able |
|   to override it at the OS or instance level. The following command |
|   must produce an error message stating that ``mem_path`` may only be |
|   set at either the cluster or the node group level:: |
| |
|     $ gnt-instance add -H kvm:mem_path=/mount/point/for/hugepages |
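
The proposed inheritance chain and the restriction on ``mem_path`` could
be sketched as follows. This is an illustration only, not Ganeti's
implementation: the names ``LEVELS``, ``RESTRICTED``, ``check_level`` and
``effective_hvparams`` are hypothetical, as are the paths in the example.

```python
# Levels from least to most specific: cluster -> group -> OS -> instance.
LEVELS = ("cluster", "group", "os", "instance")

# Parameters that may only be set at the listed levels (assumption: only
# mem_path is restricted, as proposed in this document).
RESTRICTED = {"mem_path": ("cluster", "group")}


def check_level(level, params):
    """Raise ValueError if a parameter is not allowed at this level."""
    for name in params:
        allowed = RESTRICTED.get(name)
        if allowed is not None and level not in allowed:
            raise ValueError(
                "%s may only be set at the %s level"
                % (name, " or ".join(allowed))
            )


def effective_hvparams(settings):
    """Merge per-level dicts; more specific levels override earlier ones."""
    result = {}
    for level in LEVELS:
        params = settings.get(level, {})
        check_level(level, params)
        result.update(params)
    return result


# The group overrides the cluster's mem_path; the instance may still
# override unrestricted parameters such as kernel_path.
merged = effective_hvparams({
    "cluster": {"mem_path": "/mnt/cluster-hp", "kernel_path": "/boot/vmlinuz"},
    "group": {"mem_path": "/mnt/group-hp"},
    "instance": {"kernel_path": "/boot/vmlinuz-custom"},
})
```

Passing ``{"instance": {"mem_path": "/some/path"}}`` to
``effective_hvparams`` raises ``ValueError`` in this sketch, mirroring the
error that ``gnt-instance add`` is expected to report.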
| |
| Memory Pools: |
| ------------- |
| Memory management in Ganeti will be improved by creating separate |
| pools for the memory used by the node itself, the memory used by the |
| hypervisor and the memory reserved for huge pages: |
| |
| - mtotal/xen (Xen memory) |
| - mfree/xen (Xen unused memory) |
| - mtotal/hp (Memory reserved for Huge Pages) |
| - mfree/hp (Memory available from unused huge pages) |
| - mpgsize/hp (Size of a huge page) |
| |
| ``mtotal`` and ``mfree`` will be changed to mean "the total and free |
| memory for the default method in this cluster/nodegroup". Note that the |
| default method depends on both the default hypervisor and its |
| parameters. |
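
One possible reading of "default method" is sketched below: the pool is
chosen from the default hypervisor and its parameters, and the total and
free values are then taken from that pool. The selection rule, the
function names and the ``mtotal/node`` and ``mfree/node`` keys are
assumptions made for illustration; only the ``xen`` and ``hp`` pool names
come from the list above. All values are in MiB and made up.

```python
def default_pool(hypervisor, hvparams):
    """Pick the memory pool for the default method (assumed rule)."""
    if hypervisor == "kvm" and hvparams.get("mem_path"):
        return "hp"    # KVM backed by a huge pages mount point
    if hypervisor.startswith("xen"):
        return "xen"   # memory managed by the Xen hypervisor
    return "node"      # hypothetical pool name for normal node memory


def default_memory(node, hypervisor, hvparams):
    """Return (mtotal, mfree) for the default method on this node."""
    pool = default_pool(hypervisor, hvparams)
    return node["mtotal/" + pool], node["mfree/" + pool]


def free_huge_pages(node):
    """Number of unused huge pages, derived from the hp pool values."""
    return node["mfree/hp"] // node["mpgsize/hp"]


# Made-up KVM node: little normal memory left, plenty of huge pages.
node = {
    "mtotal/node": 16384, "mfree/node": 512,
    "mtotal/hp": 8192, "mfree/hp": 6144, "mpgsize/hp": 2,
}

with_hp = default_memory(node, "kvm", {"mem_path": "/mount/point/for/hugepages"})
without_hp = default_memory(node, "kvm", {})
```

With ``mem_path`` set, the node reports 6144 MiB free. Without it, the
same node reports only 512 MiB free.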
| |
| iAllocator Changes: |
| ------------------- |
| If huge pages are set as the default for a cluster or node group, then |
| the iAllocator must consider the huge pages memory on the nodes as a |
| parameter when trying to find the best node for the VM. Note that the |
| iAllocator will also be changed to use the correct parameter depending |
| on the cluster/group. |
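
As a rough illustration of the node selection described above, the sketch
below filters nodes by the free memory of the relevant pool. A real
allocator weighs many more criteria, and this is not the actual iAllocator
interface: the function ``candidate_nodes``, the ``mfree/node`` key and
the node data are hypothetical.

```python
def candidate_nodes(nodes, required_mib, use_huge_pages):
    """Return names of nodes that can hold the instance.

    Nodes are ordered by free memory in the relevant pool, largest
    first; ties are broken by node name.
    """
    key = "mfree/hp" if use_huge_pages else "mfree/node"
    fitting = [
        (info.get(key, 0), name)
        for name, info in nodes.items()
        if info.get(key, 0) >= required_mib
    ]
    fitting.sort(key=lambda item: (-item[0], item[1]))
    return [name for _, name in fitting]


# Made-up free memory per node, in MiB.
nodes = {
    "node1": {"mfree/node": 512, "mfree/hp": 6144},
    "node2": {"mfree/node": 4096, "mfree/hp": 0},
    "node3": {"mfree/node": 256, "mfree/hp": 2048},
}

with_hp = candidate_nodes(nodes, 2048, use_huge_pages=True)
without_hp = candidate_nodes(nodes, 2048, use_huge_pages=False)
```

When huge pages are the default, ``node1`` and ``node3`` qualify for a
2048 MiB instance. When they are not, only ``node2`` does.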
| |
| hbal Changes: |
| ------------- |
| The cluster balancer (hbal) will be changed to use the default memory |
| pool and recognize memory reserved for huge pages when trying to |
| rebalance the cluster. |
| |
| .. vim: set textwidth=72 : |
| .. Local Variables: |
| .. mode: rst |
| .. fill-column: 72 |
| .. End: |