| ================================= |
| Conversion between disk templates |
| ================================= |
| |
| .. contents:: :depth: 4 |
| |
| This design document describes the support for generic disk template |
| conversion in Ganeti. The logic used is disk template agnostic and |
| aims to cover the majority of conversions among the supported disk |
| templates. |
| |
| |
| Current state and shortcomings |
| ============================== |
| |
| Currently, Ganeti supports choosing among different disk templates when |
| creating an instance. However, converting the disk template of an |
| existing instance is possible only between the ``plain`` and ``drbd`` |
| templates. This feature was added in early versions of Ganeti, when the |
| number of supported disk templates was limited. Now that Ganeti supports |
| many more templates, the feature should be extended to give the user |
| more flexibility. |
| |
| The procedure for converting from the ``plain`` to the ``drbd`` disk |
| template works as follows. First, a completely new disk template is |
| generated, matching the size, mode, and count of the current instance's |
| disks. The missing volumes are then created manually on both the |
| primary node (meta disk) and the secondary node. The original LVs on |
| the primary node are renamed to match the new names. The last step is |
| to manually associate the DRBD devices with their mirror block device |
| pairs. |
| |
| The conversion from the ``drbd`` to the ``plain`` disk template is much |
| simpler. First, the DRBD mirroring is manually disabled. Then the |
| volumes that are no longer needed are removed: the meta disk(s) on the |
| primary node, and the meta and data disk(s) on the former secondary |
| node. |
| |
| |
| Proposed changes |
| ================ |
| |
| This design proposes the creation of a unified interface for handling |
| the disk template conversions in Ganeti. Currently, there is no such |
| interface and each one of the supported conversions uses a separate code |
| path. |
| |
| This proposal introduces a single, disk-agnostic interface for handling |
| the disk template conversions in Ganeti, keeping in mind that we want it |
| to be as generic as possible. The one exception is the currently |
| supported conversions between the LVM-based disk templates. Their basic |
| functionality will not be affected, and they will keep a code path that |
| differs from the rest of the disk template conversions. The goal is to |
| support conversions among the majority of the available disk templates, |
| and to create a mechanism that can easily support any new templates |
| that may be added to Ganeti in the future. |
| |
| |
| Design decisions |
| ================ |
| |
| Currently, the supported conversions for the LVM-based templates are |
| handled by the ``LUInstanceSetParams`` LU. Our implementation will |
| follow the same approach. From a high-level point of view, this design |
| can be split into two parts: |
| |
| * The extension of the LU's checks to cover all the supported template |
| conversions |
| |
| * The new functionality which will be introduced to provide the new |
| feature |
| |
| The instance must be stopped before starting the disk template |
| conversion, as is currently the case; otherwise the operation will |
| fail. The new mechanism needs to copy the disks' data for the |
| conversion to be possible. We propose using the Unix ``dd`` command to |
| copy the instance's data. It copies data from source to destination |
| block by block, regardless of the filesystem types involved, which |
| makes it a convenient tool for this case. Because the conversion is |
| done via a data copy, the time the instance needs to switch to the new |
| template grows with the size of its disks and can be long for large |
| disks. |
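
To illustrate what a block-by-block copy means, here is a minimal
Python sketch of the same idea. It is not part of the proposed
implementation, which relies on ``dd``; the function name
``copy_blocks`` and the default block size are assumptions made for the
example.

```python
def copy_blocks(src_path, dst_path, block_size=1024 * 1024):
    """Copy src_path to dst_path in fixed-size blocks.

    The data is treated as an opaque byte stream, so the filesystem
    stored inside the source is irrelevant. Returns the bytes copied.
    """
    copied = 0
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            block = src.read(block_size)
            if not block:
                # End of the source: nothing left to copy.
                break
            dst.write(block)
            copied += len(block)
    return copied
```

The running time is proportional to the amount of data, which is why
larger disks take longer to convert.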
| |
| Some template conversions can be done faster, without explicitly |
| copying the disks' data. One such case is the conversions between the |
| LVM-based templates, i.e., ``drbd`` and ``plain``, which will be done |
| as they are now and will not use the ``dd`` command. Also, this |
| implementation will provide partial support for the ``blockdev`` disk |
| template, which will act only as a source template. Since those volumes |
| are adopted, pre-existing block devices, we will not support conversions |
| targeting this template. Another exception is the ``diskless`` |
| template. Since it is a testing template that creates instances with no |
| disks, we will not support conversions that involve this template |
| type. |
| |
| |
| We divide the design into the following parts: |
| |
| * Block device changes, which include the new methods that will be |
| responsible for building the commands for the data copy from/to the |
| requested devices |
| |
| * Backend changes, which include a new RPC call that will combine the |
| output of the two new block device methods and execute the resulting |
| data copy command |
| |
| * Core changes, which include the modifications in the Logical Unit |
| |
| * User interface changes, i.e., command line changes |
| |
| |
| Block device changes |
| -------------------- |
| |
| The block device abstract class will be extended with two new methods, |
| named ``Import`` and ``Export``. Those methods will be responsible for |
| building the commands that will be used for the data copy between the |
| corresponding devices. The ``Export`` method will build the command |
| that exports the data from the source device, while the ``Import`` |
| method will build the command that imports the data into the newly |
| created target device. Those two methods will not perform the actual |
| data copy; they will simply return the requested commands for |
| transferring the data from/to the individual devices. The output of the |
| two methods will be combined using a pipe ("|") by the calling method |
| at the backend level. |
| |
| By default the data import and export will be done using the ``dd`` |
| command. All the derived classes will use the base functionality |
| unless a faster way to copy their data exists. In that case the |
| underlying block device will override those methods with its specific |
| functionality. One example is the Ceph/RADOS block devices, which |
| will use the ``rbd import`` and ``rbd export`` commands to copy |
| their data instead of the default ``dd`` command. |
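
A minimal sketch of how the two methods could look, assuming the
commands are returned as argument lists. The class layout, constructor
arguments, and ``dd`` block size are hypothetical and only illustrate
the override pattern; they are not taken from the Ganeti code base.

```python
class BlockDev:
    """Hypothetical stand-in for the abstract block device class."""

    def __init__(self, dev_path):
        self.dev_path = dev_path

    def Export(self):
        """Return the command that writes the device's data to stdout."""
        return ["dd", "if=%s" % self.dev_path, "bs=1048576"]

    def Import(self):
        """Return the command that reads data from stdin into the device."""
        return ["dd", "of=%s" % self.dev_path, "bs=1048576"]


class RADOSBlockDevice(BlockDev):
    """Hypothetical RBD device that overrides the dd-based defaults."""

    def __init__(self, pool, name):
        super().__init__("/dev/rbd/%s/%s" % (pool, name))
        self.image = "%s/%s" % (pool, name)

    def Export(self):
        # "-" makes rbd write the image to stdout.
        return ["rbd", "export", self.image, "-"]

    def Import(self):
        # "-" makes rbd read the image data from stdin.
        return ["rbd", "import", "-", self.image]
```

Note that ``rbd import`` creates the target image, so a real
implementation would have to deal with a volume that already exists.
Neither method runs anything; the caller decides how to execute the
returned commands.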
| |
| Keeping the data copy functionality in the block device layer provides |
| us with a generic mechanism that works for almost all conversions and |
| can easily be extended for new disk templates. It also covers the |
| devices that support the ``access=userspace`` parameter in a generic |
| way, by implementing the logic at the level where we know what is best |
| to do for each device. |
| |
| |
| Backend changes |
| --------------- |
| |
| Introduce a new RPC call: |
| |
| * ``blockdev_convert(src_disk, dest_disk)`` |
| |
| where ``src_disk`` and ``dest_disk`` are the original and the new disk |
| objects respectively. First, the actual device instances will be |
| computed and then they will be used to build the export and import |
| commands for the data copy. The output of those methods will be |
| joined with a pipe, similar to the approach of the impexp daemon. |
| Finally, the combined data copy command will be executed, at this |
| level, by the ``nodeD``. |
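
One way to run such a pipeline from Python is sketched below, assuming
the export and import commands are available as argument lists. The
function name ``run_piped`` and the error handling are assumptions for
the example; the design itself only states that the two commands are
joined with a pipe and executed.

```python
import subprocess


def run_piped(export_cmd, import_cmd):
    """Run "export_cmd | import_cmd" and raise if either side fails."""
    exporter = subprocess.Popen(export_cmd, stdout=subprocess.PIPE)
    importer = subprocess.Popen(import_cmd, stdin=exporter.stdout)
    # Close our copy of the pipe so that the exporter gets SIGPIPE if
    # the importer exits early.
    exporter.stdout.close()
    import_rc = importer.wait()
    export_rc = exporter.wait()
    if export_rc != 0 or import_rc != 0:
        raise RuntimeError("data copy failed (export=%s, import=%s)"
                           % (export_rc, import_rc))
```

Checking both exit codes matters: a plain shell pipeline reports only
the status of its last command, so a failed export could otherwise go
unnoticed.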
| |
| |
| Core changes |
| ------------ |
| |
| The main modifications will be made in the ``LUInstanceSetParams`` LU. |
| The implementation of the conversion mechanism will be split into the |
| following parts: |
| |
| * The generation of the new disk template for the instance. The new |
| disks will match the size, mode, and name of the original volumes. |
| Those parameters and any others needed, e.g., the provider's name for |
| the ExtStorage conversions, will be computed by a new method which we |
| will introduce, named ``ComputeDisksInfo``. The output of that |
| method will be used as the ``disk_info`` argument of the |
| ``GenerateDiskTemplate`` method. |
| |
| * The creation of the new block devices. We will make use of the |
| ``CreateDisks`` method which creates and attaches the new block |
| devices. |
| |
| * The data copy for each disk of the instance from the original to the |
| newly created volume. The data copy will be made by the ``nodeD`` with |
| the RPC call introduced earlier in this design. If any disk fails to |
| copy its data, the operation will fail and the newly created disks |
| will be removed. The instance will remain intact. |
| |
| * The detachment of the original disks of the instance when the data |
| copy operation successfully completes, by calling the |
| ``RemoveInstanceDisk`` method for each of the instance's disks. |
| |
| * The attachment of the new disks to the instance by calling the |
| ``AddInstanceDisk`` method for each disk we have created. |
| |
| * The update of the configuration file with the new values. |
| |
| * The removal of the original block devices from the node using the |
| ``BlockdevRemove`` method for each one of the old disks. |
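
The ordering of the steps above, and in particular the rollback on a
failed copy, can be sketched as follows. This is a simplified model
and not the LU code: the instance is a plain dictionary, and
``RecordingBackend`` with its ``create``, ``copy``, and ``remove``
methods is a hypothetical stand-in for the real disk creation, RPC, and
removal calls.

```python
class RecordingBackend:
    """Hypothetical backend that only records what it is asked to do."""

    def __init__(self, fail_copy=False):
        self.fail_copy = fail_copy
        self.log = []

    def create(self, disk):
        self.log.append(("create", disk["template"], disk["name"]))

    def copy(self, old, new):
        if self.fail_copy:
            raise RuntimeError("copy failed for %s" % old["name"])
        self.log.append(("copy", old["name"]))

    def remove(self, disk):
        self.log.append(("remove", disk["template"], disk["name"]))


def convert_instance_disks(instance, new_template, backend):
    """Convert all disks of instance to new_template, or change nothing."""
    old_disks = list(instance["disks"])
    # New disks keep the size, mode, and name of the originals.
    new_disks = [dict(d, template=new_template) for d in old_disks]
    created = []
    try:
        for disk in new_disks:
            backend.create(disk)
            created.append(disk)
        for old, new in zip(old_disks, new_disks):
            backend.copy(old, new)
    except Exception:
        # Roll back: drop the new devices, leave the instance intact.
        for disk in created:
            backend.remove(disk)
        raise
    # All copies succeeded: swap the disks in the configuration first,
    # then delete the old block devices.
    instance["disks"] = new_disks
    instance["disk_template"] = new_template
    for disk in old_disks:
        backend.remove(disk)
    return instance
```

The old devices are removed only after the configuration points at the
new ones, so a failure at any earlier point leaves the original disks
usable.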
| |
| |
| User interface changes |
| ---------------------- |
| |
| The ``-t`` (``--disk-template``) option of the ``gnt-instance modify`` |
| command will specify the disk template to convert *to*, as it does |
| now. The remaining disk options, such as size, mode, and name, will be |
| computed from the original volumes by the conversion mechanism, and |
| the user will not explicitly provide them. |
| |
| |
| ExtStorage conversions |
| ~~~~~~~~~~~~~~~~~~~~~~ |
| |
| When converting to an ExtStorage disk template the |
| ``provider=*PROVIDER*`` option, which specifies the ExtStorage |
| provider, will be mandatory. Arbitrary parameters can also be passed to |
| the ExtStorage provider. Those parameters are optional and can be |
| passed as additional comma-separated options. Since it is not allowed |
| to convert the disk template of an instance and use the ``--disk`` |
| option at the same time, we propose to introduce a new option named |
| ``--ext-params`` to handle the ``ext`` template conversions. |
| |
| :: |
| |
|   gnt-instance modify -t ext --ext-params provider=pvdr1 test_vm |
|   gnt-instance modify -t ext --ext-params provider=pvdr1,param1=val1,param2=val2 test_vm |
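
The value passed to ``--ext-params`` could be parsed roughly as
follows. This is a sketch under the assumption that the value is a
plain comma-separated list of ``key=value`` pairs; the function name
``parse_ext_params`` and the error messages are hypothetical and do not
reflect Ganeti's actual option parser.

```python
def parse_ext_params(value):
    """Parse "provider=NAME[,key=val,...]" into (provider, extra_params)."""
    params = {}
    for item in value.split(","):
        key, sep, val = item.partition("=")
        if not sep or not key:
            raise ValueError("malformed ext parameter: %r" % item)
        params[key] = val
    # The provider is mandatory; everything else is passed through.
    provider = params.pop("provider", None)
    if not provider:
        raise ValueError("the 'provider' parameter is mandatory")
    return provider, params
```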
| |
| |
| File-based conversions |
| ~~~~~~~~~~~~~~~~~~~~~~ |
| |
| For conversions *to* a file-based template the ``--file-storage-dir`` |
| and the ``--file-driver`` options can be used, similarly to the |
| **add** command, to manually configure the storage directory and the |
| preferred driver for the file-based disks. |
| |
| :: |
| |
|   gnt-instance modify -t file --file-storage-dir=mysubdir test_vm |
| |
| |
| Supported template conversions |
| ============================== |
| |
| This is a summary of the disk template conversions that the conversion |
| mechanism will support: |
| |
| +--------------+-----------------------------------------------------------------------------------+ |
| | Source       | Target Disk Template                                                              | |
| | Disk         +---------+-------+------+------------+---------+------+------+----------+----------+ |
| | Template     | Plain   | DRBD  | File | Sharedfile | Gluster | RBD  | Ext  | BlockDev | Diskless | |
| +==============+=========+=======+======+============+=========+======+======+==========+==========+ |
| | Plain        | -       | Yes.  | Yes. | Yes.       | Yes.    | Yes. | Yes. | No.      | No.      | |
| +--------------+---------+-------+------+------------+---------+------+------+----------+----------+ |
| | DRBD         | Yes.    | -     | Yes. | Yes.       | Yes.    | Yes. | Yes. | No.      | No.      | |
| +--------------+---------+-------+------+------------+---------+------+------+----------+----------+ |
| | File         | Yes.    | Yes.  | -    | Yes.       | Yes.    | Yes. | Yes. | No.      | No.      | |
| +--------------+---------+-------+------+------------+---------+------+------+----------+----------+ |
| | Sharedfile   | Yes.    | Yes.  | Yes. | -          | Yes.    | Yes. | Yes. | No.      | No.      | |
| +--------------+---------+-------+------+------------+---------+------+------+----------+----------+ |
| | Gluster      | Yes.    | Yes.  | Yes. | Yes.       | -       | Yes. | Yes. | No.      | No.      | |
| +--------------+---------+-------+------+------------+---------+------+------+----------+----------+ |
| | RBD          | Yes.    | Yes.  | Yes. | Yes.       | Yes.    | -    | Yes. | No.      | No.      | |
| +--------------+---------+-------+------+------------+---------+------+------+----------+----------+ |
| | Ext          | Yes.    | Yes.  | Yes. | Yes.       | Yes.    | Yes. | -    | No.      | No.      | |
| +--------------+---------+-------+------+------------+---------+------+------+----------+----------+ |
| | BlockDev     | Yes.    | Yes.  | Yes. | Yes.       | Yes.    | Yes. | Yes. | -        | No.      | |
| +--------------+---------+-------+------+------------+---------+------+------+----------+----------+ |
| | Diskless     | No.     | No.   | No.  | No.        | No.     | No.  | No.  | No.      | -        | |
| +--------------+---------+-------+------+------------+---------+------+------+----------+----------+ |
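
The table reduces to three rules: ``diskless`` is never involved,
``blockdev`` is never a target, and a template is not converted to
itself. A short sketch of that check follows; the function name and the
lowercase template identifiers are assumptions made for the example.

```python
TEMPLATES = ("plain", "drbd", "file", "sharedfile", "gluster",
             "rbd", "ext", "blockdev", "diskless")


def is_conversion_supported(source, target):
    """Return True if the table above marks source -> target as "Yes."."""
    for name in (source, target):
        if name not in TEMPLATES:
            raise ValueError("unknown disk template: %r" % name)
    if source == target:
        # The "-" cells on the diagonal.
        return False
    if "diskless" in (source, target):
        return False
    # blockdev volumes are adopted devices, so they are source-only.
    return target != "blockdev"
```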
| |
| |
| Future Work |
| =========== |
| |
| Expand the conversion mechanism to provide a visual indication of the |
| data copy operation. We could monitor the progress of the data sent |
| through the pipe and give the user information such as the time |
| elapsed, percentage completed (probably with a progress bar), and total |
| data transferred, similar to the progress tracking that is currently |
| done by the impexp daemon. |
| |
| |
| .. vim: set textwidth=72 : |
| .. Local Variables: |
| .. mode: rst |
| .. fill-column: 72 |
| .. End: |