
 ===================
 Multi-version tests
 ===================

 .. contents:: :depth: 4

 This is a design document describing how tests which use multiple
 versions of Ganeti can be introduced into the current build
 infrastructure.

 Desired improvements
 ====================

 Ganeti is currently tested using two different approaches: unit tests
 and QA. While the former are useful for ensuring
 that the individual parts of the system work as expected, most errors
 are discovered only when all the components of Ganeti interact during
 QA.

 Useful as it is, the QA has so far provided no support for testing
 upgrades and version compatibility, as it was limited to using a single
 version of Ganeti. While these can be tested manually for every
 release, a systematic approach is preferred, and none can exist with
 this restriction in place. To lift it, the buildbot scripts and QA
 utilities must be extended so that diverse multi-version checks can be
 specified and run.

 Required use cases
 ==================

 There are two classes of multi-version tests that are interesting in
 Ganeti, and this chapter provides an example from each to highlight what
 should be accounted for in the design.

 Compatibility tests
 -------------------

 One interface Ganeti exposes to clients interested in interacting with
 it is the RAPI. Its stability has always been a design principle
 followed during implementation, but whether it held true in practice was
 not asserted through tests.

 An automatic test of RAPI compatibility would have to take a diverse set
 of RAPI requests and perform them on two clusters of different versions,
 one of which would be the reference version. If the clusters had been
 identically configured, all of the commands successfully executed on the
 reference version should succeed on the newer version as well.

 To achieve this, two versions of Ganeti can be run separately on a
 cleanly set up cluster. With no guarantee that the versions can coexist,
 the deployment of these has to be separate. A proxy placed between the
 client and Ganeti records all the requests and responses. Using this
 data, a testing utility can decide if the newer version is compatible or
 not, and provide additional information to assist with debugging.
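
 The comparison performed by such a testing utility could look roughly
 like the sketch below. It is only an illustration: the ``Exchange``
 record, the ``find_regressions`` helper, the rule that any 2xx status
 counts as success, and the sample data (including the deliberately
 invalid ``/2/no-such-resource`` path) are assumptions made for this
 example and not part of the design.

```python
from typing import NamedTuple


class Exchange(NamedTuple):
    """One request/response pair as recorded by the proxy (hypothetical shape)."""
    method: str
    path: str
    status: int


def is_success(status):
    # Assumption: any 2xx HTTP status counts as a successful request.
    return 200 <= status < 300


def find_regressions(reference, candidate):
    """Return requests that succeeded on the reference run but not on the candidate run."""
    candidate_status = {(e.method, e.path): e.status for e in candidate}
    regressions = []
    for e in reference:
        if not is_success(e.status):
            continue  # failures on the reference version impose no requirement
        new_status = candidate_status.get((e.method, e.path))
        if new_status is None or not is_success(new_status):
            regressions.append((e.method, e.path, e.status, new_status))
    return regressions


# Made-up recordings from the reference version and from the newer version.
reference_run = [
    Exchange("GET", "/2/info", 200),
    Exchange("GET", "/2/instances", 200),
    Exchange("GET", "/2/no-such-resource", 404),
]
candidate_run = [
    Exchange("GET", "/2/info", 200),
    Exchange("GET", "/2/instances", 500),
    Exchange("GET", "/2/no-such-resource", 404),
]

for method, path, old, new in find_regressions(reference_run, candidate_run):
    print(f"{method} {path}: reference returned {old}, newer version returned {new}")
```

 Requests that already fail on the reference version are ignored, since
 the compatibility requirement only covers commands that succeeded
 there.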

 Upgrade / downgrade tests
 -------------------------

 An upgrade / downgrade test serves to examine whether the state of the
 cluster is unchanged after its configuration has been upgraded or
 downgraded to another version of Ganeti.

 The test works with two consecutive versions of Ganeti, both installed
 on the same machine. It examines whether the configuration data and
 instances survive the downgrade and upgrade procedures. This is done by
 creating a cluster with the newer version, downgrading it to the older
 one, and upgrading it to the newer one again. After every step, the
 integrity of the cluster is checked by running various operations and
 ensuring everything still works.
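
 The order of operations described above can be sketched as follows.
 This is a minimal illustration, not the QA implementation: the function
 name and the ``create``, ``downgrade``, ``upgrade`` and ``check``
 callables are hypothetical and are replaced here by stand-ins that only
 record what would happen.

```python
def run_upgrade_downgrade_test(create, downgrade, upgrade, check, newer, older):
    """Run create -> downgrade -> upgrade, checking integrity after every step.

    All callables are supplied by the caller; this function only fixes the
    order. Returns the (step, version) pairs that were executed and verified.
    """
    completed = []
    steps = [
        ("create", create, newer),
        ("downgrade", downgrade, older),
        ("upgrade", upgrade, newer),
    ]
    for name, action, version in steps:
        action(version)
        if not check(version):
            raise RuntimeError(
                f"integrity check failed after step {name!r} (version {version})")
        completed.append((name, version))
    return completed


# Stand-in callables that only log the action they represent.
log = []
result = run_upgrade_downgrade_test(
    create=lambda v: log.append(f"create cluster with {v}"),
    downgrade=lambda v: log.append(f"downgrade to {v}"),
    upgrade=lambda v: log.append(f"upgrade to {v}"),
    check=lambda v: True,
    newer="2.11",
    older="2.10",
)
print(result)
```

 Stopping at the first failed check keeps the report tied to the step
 that broke the cluster, which is the information needed for debugging.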

 Design and implementation
 =========================

 Although the previous examples have not been selected to show use cases
 as diverse as possible, they still show a number of dissimilarities:

 - Parallel installation vs sequential deployments
 - Comparing with reference version vs comparing consecutive versions
 - Examining result dumps vs trying a sequence of operations

 With the first two real use cases demonstrating such diversity, it does
 not make sense to design multi-version test classes. Instead, the
 programmability of buildbot's configuration files can be leveraged to
 implement each test as a separate builder with a custom sequence of
 steps. The individual steps such as checking out a given or previous
 version, or installing and removing Ganeti, will be provided as utility
 functions for any test writer to use.
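
 The idea of composing builders from shared utility functions can be
 illustrated without any buildbot code. In the sketch below, plain
 tuples stand in for real build steps; the ``Step`` type, the helper
 names and the test names such as ``rapi-record`` are all hypothetical
 and chosen only for this example.

```python
from typing import NamedTuple


class Step(NamedTuple):
    """Stand-in for a build step: an action and the value it applies to."""
    action: str
    argument: str


# Shared utilities that any test writer could combine (hypothetical names).
def checkout(version):
    return Step("checkout", version)


def install(version):
    return Step("install", version)


def remove(version):
    return Step("remove", version)


def run_test(name):
    return Step("run-test", name)


def upgrade_downgrade_builder(current, previous):
    # Both versions are installed side by side on the same machine.
    return [
        checkout(previous), install(previous),
        checkout(current), install(current),
        run_test("upgrade-downgrade"),
    ]


def rapi_compatibility_builder(current, reference):
    # The versions are deployed one after the other, never together.
    return [
        checkout(reference), install(reference),
        run_test("rapi-record"), remove(reference),
        checkout(current), install(current),
        run_test("rapi-record"), remove(current),
        run_test("rapi-compare"),
    ]


for step in rapi_compatibility_builder("2.11", "2.6"):
    print(step.action, step.argument)
```

 Each builder is an ordinary function returning its own sequence, so the
 two tests share helpers without being forced into a common test class.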

 Current state
 -------------

 An upgrade / downgrade test is a part of the QA suite as of commit
 aa104b5e. The test and the corresponding buildbot changes are a very
 good first step, both by showing that multi-version tests can be done,
 and by providing utilities needed for builds of multiple branches.
 Previously, the same folder was used as the base directory of any build,
 and now a directory structure more accommodating to multiple builds is
 in place.

 The builder running the test has one flaw - regardless of the branch
 submitted, it compares versions 2.10 and 2.11 (current master). This
 behaviour is different from any of the other builders, which may
 restrict the branches a test can be performed on, but do not
 differentiate between them otherwise. While additional builders for
 different version pairs may be added, this is not a good long-term
 solution.

 The test can be improved by making it compare the current and the
 previous version. As the buildbot has no notion of what a previous
 version is, additional utilities to handle this logic will have to be
 introduced.

 Planned changes
 ---------------

 The upgrade / downgrade test should be generalized to work for any
 version which can be downgraded from and upgraded to automatically,
 meaning versions from 2.11 onwards. This will be made challenging by
 the fact that the previous version has to be checked out by reading the
 version of the currently checked out code, identifying the previous
 version, and then making yet another checkout.

 The major and minor version can be read from a Ganeti repository in
 multiple ways. The two are present as constants defined in source files,
 but due to refactorings shifting constants from the Python to the
 Haskell side, their position varies across versions. A more reliable way
 of fetching them is by examining the ``NEWS`` file, as it follows strict
 formatting rules.
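
 Reading the version from the release notes could be done along these
 lines. The sketch assumes that release headings have the form
 ``Version X.Y.Z`` and that the newest release is listed first; the
 function name and the sample text are made up for illustration.

```python
import re

# Assumption: release headings in the news file look like "Version 2.11.0".
VERSION_HEADING = re.compile(r"^Version (\d+)\.(\d+)", re.MULTILINE)


def read_major_minor(news_text):
    """Return (major, minor) from the first version heading in the text."""
    match = VERSION_HEADING.search(news_text)
    if match is None:
        raise ValueError("no version heading found")
    return int(match.group(1)), int(match.group(2))


# Made-up sample in the assumed format, newest entry first.
sample_news = """\
News
====

Version 2.11.0
--------------

Version 2.10.3
--------------
"""

print(read_major_minor(sample_news))  # prints (2, 11)
```

 Raising an error when no heading is found makes a formatting change in
 the file fail the build step visibly instead of selecting a wrong
 version.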

 With the version found, a script that acts as a previous-version
 lookup table can be invoked. This script can be constructed dynamically
 upon buildbot startup, and specified as a build step. The checkout
 following it proceeds as expected.
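
 A lookup table of this kind could be derived from an ordered list of
 known versions, as in the sketch below. The list contents and the
 function names are illustrative assumptions; the real table would be
 generated from whatever the buildbot configuration knows at startup.

```python
# Assumption: known (major, minor) versions, ordered from oldest to newest.
KNOWN_VERSIONS = [(2, 9), (2, 10), (2, 11)]


def build_previous_version_table(versions):
    """Map every version except the oldest to the one listed just before it."""
    return {cur: prev for prev, cur in zip(versions, versions[1:])}


PREVIOUS = build_previous_version_table(KNOWN_VERSIONS)


def previous_version(version):
    """Return the version preceding the given one, or raise LookupError."""
    try:
        return PREVIOUS[version]
    except KeyError:
        raise LookupError(f"no previous version known for {version}") from None


print(previous_version((2, 11)))  # prints (2, 10)
```

 Comparing versions as integer tuples avoids the string-ordering trap in
 which "2.9" would sort after "2.10".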

 The RAPI compatibility test should be added as a separate builder
 afterwards. As the test requires additional comparison and proxy logic
 to be used, it will be enabled only on 2.11 onwards, comparing the
 versions to 2.6 - the reference version for the RAPI. Details on the
 design of this test will be added in a separate document.

 Potential issues
 ================

 While there are many advantages to having a single builder representing
 a multi-version test, working on every branch, there is at least one
 disadvantage: the need to define a base or reference version, which is
 the only version that can be used to trigger the test, and the only one
 on which code changes can be tried.

 If an error is detected while running a test, and the issue lies with
 a version other than the one used to invoke the test, the fix would
 have to make it into the repository before the test could be tried
 again.

 For simple tests, the issue might be mitigated by running them locally.
 However, the multi-version tests are more likely to be complicated than
 not, and it could be difficult to reproduce a test by hand.

 The situation can be made simpler by requiring that any multi-version
 test can use only versions lower than the reference version. As errors
 are more likely to be found in new rather than old code, this would at
 least reduce the number of troublesome cases.