
 Ganeti CPU Pinning
 ==================

 Objective
 ---------

 This document defines Ganeti's support for CPU pinning (aka CPU
 affinity).

 CPU pinning enables mapping and unmapping entire virtual machines, or
 specific virtual CPUs (vCPUs), to a physical CPU or a range of CPUs.

 At this stage, pinning will be implemented for Xen and KVM.

 Command Line
 ------------

 Suggested command line parameters for controlling CPU pinning are as
 follows::

   gnt-instance modify -H cpu_mask=<cpu-pinning-info> <instance>

 cpu-pinning-info can be any of the following:

 * One vCPU mapping, which can be the word "all" or a combination
   of CPU numbers and ranges separated by comma. In this case, all
   vCPUs will be mapped to the indicated list.
 * A list of vCPU mappings, separated by a colon ':'. In this case
   each vCPU is mapped to an entry in the list, and the size of the
   list must match the number of vCPUs defined for the instance. This
   is enforced when setting CPU pinning or when setting the number of
   vCPUs using ``-B vcpus=#``.

   The mapping list is matched to consecutive virtual CPUs, so the first entry
   would be the CPU pinning information for vCPU 0, the second entry
   for vCPU 1, etc.

 The default setting for new instances is "all", which maps the entire
 instance to all CPUs, thus effectively turning off CPU pinning.
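
 The syntax above can be sketched as a small parser. This is only an
 illustration under stated assumptions, not Ganeti's implementation:
 the function name ``parse_cpu_mask`` is hypothetical, "all" is stored
 as ``[-1]`` (the marker proposed in the Data section), and the input
 is the mask as Ganeti sees it, after the shell and the option parser
 have removed the backslash escaping of commas::

   def parse_cpu_mask(cpu_mask):
       """Parse a cpu_mask string into one list of CPU numbers per entry.

       Entries are separated by ':'; the word "all" becomes [-1].
       Raises ValueError on malformed numbers or ranges.
       """
       result = []
       for entry in cpu_mask.split(":"):
           if entry == "all":
               result.append([-1])
               continue
           cpus = []
           for part in entry.split(","):
               if "-" in part:
                   # A range such as "3-5" covers both end points.
                   low, high = (int(n) for n in part.split("-"))
                   if low > high:
                       raise ValueError("invalid CPU range: %r" % part)
                   cpus.extend(range(low, high + 1))
               else:
                   cpus.append(int(part))
           result.append(cpus)
       return result

 With this sketch, ``parse_cpu_mask("all:1,3-5:0")`` yields one list
 per vCPU, and a mask with a single entry yields a list of length one.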

 Here are some usage examples::

   # Map vCPU 0 to physical CPU 1 and vCPU 1 to CPU 3 (assuming 2 vCPUs)
   gnt-instance modify -H cpu_mask=1:3 my-inst

   # Pin vCPU 0 to CPUs 1 or 2, and vCPU 1 to any CPU
   gnt-instance modify -H cpu_mask=1-2:all my-inst

   # Pin vCPU 0 to any CPU, vCPU 1 to CPUs 1, 3, 4 or 5, and vCPU 2 to
   # CPU 0
   gnt-instance modify -H cpu_mask=all:1\\,3-5:0 my-inst

   # Pin entire VM to CPU 0
   gnt-instance modify -H cpu_mask=0 my-inst

   # Turn off CPU pinning (default setting)
   gnt-instance modify -H cpu_mask=all my-inst

 Assuming an instance has 3 vCPUs, the following commands will fail::

   # not enough mappings
   gnt-instance modify -H cpu_mask=0:1 my-inst

   # too many
   gnt-instance modify -H cpu_mask=2:1:1:all my-inst

 Validation
 ----------

 CPU pinning information is validated by making sure it matches the
 number of vCPUs. This validation happens when changing either the
 ``cpu_mask`` or the ``vcpus`` parameter.
 Changing either parameter in a way that conflicts with the other will
 fail with an appropriate error message.
 To make such a change, both parameters must be modified at the same
 time. For example:
 ``gnt-instance modify -B vcpus=4 -H cpu_mask=1:1:2-3:4\\,6 my-inst``

 Besides checking that the number of vCPUs matches the requested CPU
 pinning, Ganeti will also verify that the node has enough physical
 CPUs to support the requested configuration. For example, trying to
 run a configuration of ``vcpus=2,cpu_mask=0:4`` on a node with 4 cores
 will fail, because CPU numbers are 0-based and such a node only has
 CPUs 0 to 3.
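
 Both checks can be sketched together as follows. The function name
 ``validate_cpu_mask`` and its arguments are hypothetical, and the mask
 is assumed to be already parsed into one list of CPU numbers per
 entry, with ``[-1]`` standing for "all"::

   def validate_cpu_mask(entries, vcpus, physical_cpus):
       """Return a list of error messages; empty if the mask is usable.

       entries: one list of CPU numbers per mask entry ([-1] = "all").
       """
       errors = []
       # A single entry applies to every vCPU, so only longer lists
       # have to match the vCPU count exactly.
       if len(entries) > 1 and len(entries) != vcpus:
           errors.append("mask has %d entries but instance has %d vCPUs"
                         % (len(entries), vcpus))
       for index, cpus in enumerate(entries):
           for cpu in cpus:
               # CPU numbers are 0-based: valid values are 0..n-1.
               if cpu != -1 and not 0 <= cpu < physical_cpus:
                   errors.append("entry %d: CPU %d does not exist "
                                 "(node has %d CPUs)"
                                 % (index, cpu, physical_cpus))
       return errors

 For the example above, ``validate_cpu_mask([[0], [4]], 2, 4)`` reports
 one error, for CPU 4.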

 This validation should repeat every time an instance is started or
 migrated live. See more details under Migration below.

 Cluster verification should also check that the other nodes in the
 cluster are compatible with the required configuration, and alert if a
 minimum requirement is not met.

 Failover
 --------

 CPU pinning configuration can be transferred from node to node, unless
 the target node has fewer physical CPUs than the configuration calls
 for. Unless this is the case, all transfers and migrations are
 expected to succeed.

 If the target node has fewer physical CPUs than the CPU pinning
 information refers to, instance failover will fail.

 In case of emergency, to force failover to ignore mismatching CPU
 information, the following switch can be used:
 ``gnt-instance failover --fix-cpu-mismatch my-inst``.
 This command will try to fail over the instance with the current CPU
 mask, but if that fails, it will change the mask to "all".
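
 The fallback could look roughly like the sketch below. It is
 hypothetical: the function and parameter names are invented for
 illustration, and it checks the mask up front instead of attempting
 the failover first and reacting to a failure::

   def mask_for_target_node(cpu_mask, highest_cpu, target_cpus,
                            fix_cpu_mismatch=False):
       """Pick the CPU mask to use on the failover target node.

       highest_cpu is the largest physical CPU number the mask refers
       to, or -1 if the mask only contains "all".
       """
       if highest_cpu < target_cpus:
           return cpu_mask      # the current mask fits the target node
       if fix_cpu_mismatch:
           return "all"         # emergency fallback: turn pinning off
       raise ValueError("CPU mask %r needs CPU %d, but the target node "
                        "only has %d CPUs"
                        % (cpu_mask, highest_cpu, target_cpus))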

 Migration
 ---------

 In case of live migration, in addition to the failover considerations,
 CPU pinning must be remapped after migration. This can be done in
 real time for both Xen and KVM instances, and only depends on the
 number of physical CPUs being sufficient to support the migrated
 instance.

 Data
 ----

 Pinning information will be kept as a list of integers per vCPU.
 To mark a mapping of any CPU, we will use (-1).
 A single entry, no matter what the number of vCPUs is, will always mean
 that all vCPUs have the same mapping.
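
 A minimal sketch of how this representation might be consumed is
 shown below. The names ``ANY_CPU`` and ``per_vcpu_mapping`` are
 hypothetical; only the ``-1`` marker and the single-entry rule come
 from the text above::

   ANY_CPU = -1  # marker for "any CPU", as proposed above

   def per_vcpu_mapping(entries, vcpus):
       """Expand stored pinning data into one CPU list per vCPU."""
       if len(entries) == 1:
           # A single entry means all vCPUs share the same mapping.
           return [list(entries[0]) for _ in range(vcpus)]
       if len(entries) != vcpus:
           raise ValueError("expected %d entries, got %d"
                            % (vcpus, len(entries)))
       return [list(entry) for entry in entries]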

 Configuration file
 ------------------

 The pinning information is kept in the hypervisor params section of
 each instance in the configuration file, as the original string.

 Xen
 ---

 There are 2 ways to control pinning in Xen, either via the command line
 or through the configuration file.

 The commands to make direct pinning changes are the following::

   # To pin a vCPU to a specific CPU
   xm vcpu-pin <domain> <vcpu> <cpu>

   # To unpin a vCPU
   xm vcpu-pin <domain> <vcpu> all

   # To get the current pinning status
   xm vcpu-list <domain>

 Since Ganeti currently controls Xen through the configuration file,
 it is straightforward to use the same method for CPU pinning.
 There are 2 different parameters that control Xen's CPU pinning and
 configuration:

 vcpus
   controls the number of vCPUs
 cpus
   maps vCPUs to physical CPUs

 When no pinning is required (pinning information is "all"), the
 "cpus" entry is removed from the configuration file.

 For all other cases, the configuration is "translated" to Xen, which
 expects either ``cpus = "a"`` or ``cpus = [ "a", "b", "c", ...]``,
 where each of a, b and c is a physical CPU number, a CPU range, or a
 combination of both. If a list is used, the number of entries must
 match the number of vCPUs, and the entries are mapped in order.

 For example, CPU pinning information of ``1:2,4-7:0-1`` is translated
 to this entry in Xen's configuration:
 ``cpus = [ "1", "2,4-7", "0-1" ]``.
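
 The translation can be sketched as a short function. The name
 ``xen_cpus_line`` is hypothetical. How an "all" entry inside a
 multi-entry mask should be written for Xen is not specified in this
 document, so the sketch simply passes such an entry through
 unchanged::

   def xen_cpus_line(cpu_mask):
       """Return the 'cpus' line for a Xen config file, or None.

       None means that no 'cpus' entry should be written at all.
       """
       if cpu_mask == "all":
           return None
       entries = cpu_mask.split(":")
       if len(entries) == 1:
           # One mapping shared by all vCPUs.
           return 'cpus = "%s"' % entries[0]
       quoted = ", ".join('"%s"' % entry for entry in entries)
       return "cpus = [ %s ]" % quoted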

 KVM
 ---

 Controlling pinning in KVM is a little more complicated as there is no
 configuration to control pinning before instances are started.

 The way to change or assign CPU pinning under KVM is to use ``taskset`` or
 its underlying system call ``sched_setaffinity``. Setting the affinity for
 the VM process will change CPU pinning for the entire VM, and setting it
 for specific vCPU threads will control specific vCPUs.

 The sequence of commands to control pinning is this: start the instance
 with the ``-S`` switch, so it halts before starting execution, get the
 process ID or identify thread IDs of each vCPU by sending ``info cpus``
 to the monitor, map vCPUs as required by the cpu-pinning information,
 and issue a ``cont`` command on the KVM monitor to allow the instance
 to start execution.

 For example, a sequence of commands to control CPU affinity under KVM
 may be:

 * Start KVM: ``/usr/bin/kvm … <kvm-command-line-options> … -S``
 * Use socat to connect to the monitor
 * Send ``info cpus`` to the monitor to get thread/vCPU information
 * Call ``sched_setaffinity`` for each thread with the CPU mask
 * Send ``cont`` to KVM's monitor
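
 The two middle steps might be sketched as follows. Several things
 here are assumptions rather than part of this design: the sketch uses
 ``os.sched_setaffinity`` from the Python standard library (Python 3.3
 or later, Linux only) instead of a separate affinity package, the
 function names are hypothetical, and the shape of the ``info cpus``
 reply (a ``CPU #n`` label followed later by ``thread_id=...``) is
 assumed and may differ between QEMU versions. Connecting to the
 monitor and sending ``cont`` are left out::

   import os
   import re

   # Assumed line shape: "* CPU #0: pc=0x... thread_id=12345"
   _THREAD_RE = re.compile(r"CPU\s+#(\d+).*?thread_id\s*=\s*(\d+)")

   def parse_info_cpus(output):
       """Map vCPU number to thread ID from an ``info cpus`` reply."""
       threads = {}
       for line in output.splitlines():
           match = _THREAD_RE.search(line)
           if match:
               threads[int(match.group(1))] = int(match.group(2))
       return threads

   def pin_vcpus(threads, mapping):
       """Pin each vCPU thread to its list of physical CPUs.

       mapping[i] is the CPU list for vCPU i; [-1] means "any CPU".
       """
       for vcpu, cpus in enumerate(mapping):
           if cpus == [-1]:
               continue  # leave this thread's affinity untouched
           # On Linux the first argument may be a thread ID.
           os.sched_setaffinity(threads[vcpu], set(cpus))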

 A CPU mask is a bit mask, usually written in hexadecimal, where each
 bit represents one physical CPU. See the
 :manpage:`sched_setaffinity(2)` man page for more details.

 For example, to run a specific thread-id on CPUs 1 or 3 the mask is
 0x0000000A.
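
 The arithmetic behind this value is simple: bit *n* of the mask is set
 when CPU *n* is allowed. A small sketch, using the hypothetical helper
 name ``cpus_to_bitmask``::

   def cpus_to_bitmask(cpus):
       """Return the affinity bit mask for a list of CPU numbers."""
       mask = 0
       for cpu in cpus:
           mask |= 1 << cpu  # set the bit for this physical CPU
       return mask

   # CPUs 1 and 3 set bits 1 and 3: 2 + 8 = 10 = 0xA
   print("0x%08X" % cpus_to_bitmask([1, 3]))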

 We will control process and thread affinity using the Python affinity
 package (http://pypi.python.org/pypi/affinity). This package is a
 Python wrapper around the two affinity system calls, and has no other
 requirements.

 Alternative Design Options
 --------------------------

 1. It would be possible to ignore the limitations of the underlying
    hypervisor and, instead of requiring explicit pinning information
    for *all* vCPUs, assume a mapping of "all" for vCPUs not mentioned.
    This could lead to information being left out inadvertently and,
    since CPU pinning options are unlikely to be used frequently, it
    offers no real advantage.

 .. vim: set textwidth=72 :
 .. Local Variables:
 .. mode: rst
 .. fill-column: 72
 .. End: