Version number and development tree organization

Owner: ???

Effort: ???

Dependencies: ???

Abstract: The purpose of this proposal is to present a coherent plan for how development branches in XEmacs are managed. This will cover such issues as stable versus experimental branches, creating new branches, synchronizing patches between branches, and how version numbers are assigned to branches.

A development branch is defined to be a linear series of releases of the XEmacs code base, each of which is derived from the previous one. When the XEmacs development tree is forked and two branches are created where there used to be one, the branch that is intended to be more stable and have fewer changes made to it is considered the one that inherits the parent branch, and the other branch is considered to have begun at the branching point. The less stable of the two branches will eventually be forked again; the more stable branch usually will not be, and its development will eventually come to an end. This means that every branch has a definite ending point. For example, the 20.x branch began at the point when the released 19.13 code tree was split into a 19.x and a 20.x branch, and the 20.x branch will end when the last 20.x release (probably numbered 20.5 or 20.6) is released.

I think that there should always be three active development branches at any time. These branches can be designated the stable, the semi-stable, and the experimental branches. This situation has existed in the current code tree ever since the 21.0 development branch was split. In this situation, the stable branch is the 20.x series. The semi-stable branch is the 21.0 release and the stability releases that follow. The experimental branch is the branch that was created as the result of the 21.0 development branch split. Typically, the stable branch has been released for a long period of time; the semi-stable branch has been released for a short period of time, or is about to be released; and the experimental branch has not yet been released, and will probably not be released for a while. The conditions that should hold in all circumstances are:

There should be three active branches.
The experimental branch should never be in feature freeze.

The reason for the second condition is to ensure that active development can always proceed and is never throttled, as is happening currently at the end of the 21.0 release cycle. What this means is that as soon as the experimental branch is deemed to be stable enough to go into feature freeze:

The current stable branch is made inactive and all further development on it ceases.
The semi-stable branch, which by now should have been released for a fair amount of time, and should be fairly stable, gets renamed to the stable branch.
The experimental branch is forked into two branches, one of which becomes the semi-stable branch, and the other, the experimental branch.
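
The three steps above can be sketched as a small role rotation. This is only an illustration in Python, not part of any existing XEmacs tooling: the names `ActiveBranches` and `rotate_at_feature_freeze` are hypothetical, and identifying each branch by a major version number (with the new experimental branch taking the next number) is an assumption made to keep the example concrete.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class ActiveBranches:
    """The three active branches, identified here by a major version number."""
    stable: int
    semi_stable: int
    experimental: int


def rotate_at_feature_freeze(active: ActiveBranches) -> Tuple[ActiveBranches, int]:
    """Return the new branch roles and the branch that becomes inactive."""
    retired = active.stable                  # step 1: old stable branch goes inactive
    new_stable = active.semi_stable          # step 2: semi-stable is renamed to stable
    new_semi_stable = active.experimental    # step 3: the more stable half of the fork
    # Assumption: the less stable half of the fork gets the next major number.
    new_experimental = active.experimental + 1
    return ActiveBranches(new_stable, new_semi_stable, new_experimental), retired


# Example: stable 20.x, semi-stable 21.x, and an experimental branch numbered 22.
new_roles, retired = rotate_at_feature_freeze(ActiveBranches(20, 21, 22))
```

After the rotation there are again exactly three active branches, and the new experimental branch starts out with no feature freeze, which is what the two conditions require.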

The stable branch is always in high resistance, which is to say that the only changes that can be made to the code are important bug fixes involving a small amount of code, where it should be clear just by reading the code that nothing destabilizing has been introduced. The semi-stable branch is in low resistance, which means that no major features can be added, but fairly major code changes are allowed except right before a release. Features can be added to the semi-stable branch in three cases: if they are sufficiently small; if they are deemed sufficiently critical due to severe problems that would exist if the features were not added; or if the primary purpose of the new feature is to remedy an incompleteness in a recent architectural change that was not finished in a prior release due to lack of time. As an example of the second case, replacement of the unexec mechanism with a portable solution would be a feature that could be added to the semi-stable branch, provided that it did not involve an overly radical code re-architecture, because otherwise it might be impossible to build XEmacs on some architectures or with some compilers. An example of the third case is abstracting the mouse pointer and list-of-colors interfaces, which were left out of 21.0. There is no feature resistance in place in the experimental branch, which allows full development to proceed at all times.
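
The resistance levels can be read as a rough decision rule. The sketch below, in Python, is one possible reading and deliberately simplifies: every field of the hypothetical `Change` record stands for a judgment that a maintainer would actually make by hand, and the function name `is_acceptable` is likewise an assumption.

```python
from dataclasses import dataclass
from enum import Enum


class Branch(Enum):
    STABLE = "stable"              # high resistance
    SEMI_STABLE = "semi-stable"    # low resistance
    EXPERIMENTAL = "experimental"  # no feature resistance


@dataclass(frozen=True)
class Change:
    # Hypothetical flags; each one is a human judgment, not something computed.
    is_feature: bool = False
    is_important_bug_fix: bool = False
    is_small: bool = False
    is_critical: bool = False                    # severe problems without it
    completes_recent_architecture: bool = False  # finishes an unfinished change
    is_major_code_change: bool = False


def is_acceptable(branch: Branch, change: Change, release_imminent: bool = False) -> bool:
    """Simplified reading of the resistance rules for one proposed change."""
    if branch is Branch.EXPERIMENTAL:
        return True
    if branch is Branch.STABLE:
        return (change.is_important_bug_fix and change.is_small
                and not change.is_feature)
    # Semi-stable: features need one of the three justifications.
    if change.is_feature:
        return (change.is_small or change.is_critical
                or change.completes_recent_architecture)
    # Fairly major code changes are fine except right before a release.
    if change.is_major_code_change and release_imminent:
        return False
    return True
```

The sketch leaves out the caveat that even a critical feature must not involve an overly radical re-architecture; that remains a matter of judgment.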

In general, both the stable and semi-stable branches will contain previous net releases. In addition, there will be beta releases in all three branches, and possibly development snapshots between the beta releases. It's obviously necessary to have a good version numbering scheme in order to keep everything straight.

First of all, it needs to be immediately clear from the version number whether the release is a beta release or a net release. Steve has proposed getting rid of the beta version numbering system, which I think would be a big mistake. Furthermore, the net release version number and beta release version number should be kept separate, just as they are now, to make it completely clear where any particular release stands. There may be alternate ways of phrasing a beta release other than something like 21.0 beta 34, but in all such systems, the beta number needs to be zero for any release version. Three possible alternative systems, none of which I like very much, are:

The beta number is simply an extra number in the regular version number. Then, for example, 21.0 beta 34 becomes 21.0.34. The problem is that the release version, which would simply be called 21.0, appears to be earlier than 21.0 beta 34.
The beta releases appear as later revisions of earlier releases. Then, for example, 21.1 beta 34 becomes 21.0.34, and 21.0 beta 34 would have to become 21.-1.34. This has both the obvious ugliness of negative version numbers and the problem that it makes beta releases appear to be associated with their previous releases, when in fact they are more closely associated with the following release.
Simply make the beta version number be negative. In this scheme, you'd start with something like -1000 as the first beta, and then 21.0 beta 34 would get renumbered to 21.0.-967. Obviously, this is a crazy and convoluted scheme as well, and we would do best to avoid it.
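
The ordering behavior of these three systems can be checked with a short Python sketch. It assumes that version numbers are compared component by component as integers, which the proposal does not state; the helper name `dotted` is hypothetical.

```python
def dotted(version: str) -> tuple:
    """Split a dotted version into integers, e.g. "21.-1.34" -> (21, -1, 34)."""
    return tuple(int(part) for part in version.split("."))


# System 1: the release 21.0 wrongly sorts before its own beta 34.
release_sorts_first = dotted("21.0") < dotted("21.0.34")

# System 2: 21.0 beta 34 written as 21.-1.34 does sort before the release 21.0.
negative_minor_ok = dotted("21.-1.34") < dotted("21.0")

# System 3: a negative beta number sorts before the release, provided the
# release is written with an explicit beta number of zero.
negative_beta_ok = dotted("21.0.-1000") < dotted("21.0.0")
```

All three comparisons evaluate to true under this assumption, so only the first system misorders the release; the objections to the other two are about readability rather than ordering.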

Currently, the between-beta snapshots are not numbered, but I think that they probably should be. If appropriate scripts are written to automate beta releases, it should be very easy to have a version number automatically updated whenever a snapshot is made. The number could be added as a separate snapshot number, so that you'd have 21.0 beta 34 pre 1, which comes before 21.0 beta 34; or we could make the beta number be floating point, and then the same snapshot would have to be called 21.0 beta 33.1. The latter solution seems quite kludgey to me.
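
To make the intended ordering concrete, here is a minimal Python sketch of a sort key for the existing "21.0 beta 34" notation extended with the separate snapshot number. The function name `sort_key`, the regular expression, and the exact string layout are assumptions for illustration; no such script is known to exist.

```python
import re

# Matches "21.0", "21.0 beta 34", and "21.0 beta 34 pre 1".
_VERSION_RE = re.compile(r"^(\d+)\.(\d+)(?: beta (\d+)(?: pre (\d+))?)?$")


def sort_key(version: str) -> tuple:
    """Return a tuple that sorts snapshots < their beta < the net release."""
    m = _VERSION_RE.match(version)
    if m is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    major, minor = int(m.group(1)), int(m.group(2))
    is_release = 1 if m.group(3) is None else 0   # net releases sort last
    beta = int(m.group(3)) if m.group(3) is not None else 0
    is_full_beta = 1 if m.group(4) is None else 0  # "pre" snapshots sort first
    pre = int(m.group(4)) if m.group(4) is not None else 0
    return (major, minor, is_release, beta, is_full_beta, pre)


versions = ["21.0", "21.0 beta 34", "21.0 beta 34 pre 1", "21.0 beta 33", "21.1 beta 1"]
ordered = sorted(versions, key=sort_key)
```

With this key the net release keeps a beta number of zero yet still sorts after all of its betas, and each "pre" snapshot sorts just before the beta it leads up to.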

There also needs to be a clear way to distinguish, when a net release is made, which branch the release is a part of. Again, several solutions come to mind; four are listed here:

The major version number reflects which development branch the release is in and the minor version number indicates how many releases have been made along this branch. In this scheme, 21.0 is always the first release of the 21 series development branch, and when this branch is split, the child branch that becomes the experimental branch gets version numbers starting with 22. This scheme is the simplest, and it's the one I like best.
We move to a three-part version number. In this scheme, the first two numbers indicate the branch, and the third number indicates the release along the branch. In this scheme, we have numbers like 21.0.1, which would be the second release in the 21.0 series branch, and 21.1.2, which would be the third release in the 21.1 series branch. The major version number then gets increased only very occasionally, and only when a sufficiently major architectural change has been made, particularly one that causes compatibility problems with code written for previous branches. I think schemes like this are unnecessary in most circumstances, because usually either the major version number ends up changing so often that the second number is always either zero or one, or the major version number never changes, and as such becomes useless. By the time the major version number would change, the product itself has changed so much that it often gets renamed. Furthermore, it is clear that the two version number scheme has been used throughout most of the history of Emacs, and recently we have been following the two number scheme also. If we introduced a third revision number, at this point it would both confuse existing code that assumed there were two numbers, and would look rather silly given that the major version number is so high and would probably remain at the same place for quite a long time.
A third scheme that would attempt to cross the two schemes would keep the same concept of major version number as for the three number scheme, and would compress the second and third numbers of the three number scheme into one number by using increments of ten. For example, the current 21.x branch would have releases numbered 21.0, 21.1, etc. The next branch would have releases numbered 21.10, 21.11, etc. I don't like this scheme very much because it seems rather kludgey, and also because it is not used in any other product as far as I know.
Another scheme that would combine the second and third numbers in the three number scheme would be to have the releases in the current 21.x series be numbered 21.0, then 21.01, then 21.02, etc. The next series is 21.1, then 21.11, then 21.12, etc. This is similar to the way that version numbers are done for DOS and Windows. I also think that this scheme is fairly silly because, like the previous scheme, its only purpose is to avoid increasing the major version number very much. But given that we already have a fairly large major version number, there doesn't seem to be any particular problem with increasing this number by one every year or two. Some people will object that by doing this, it becomes impossible to tell when a change is so major that it causes a lot of code breakage, but past releases have not been accurate indicators of this. For example, 19.12 caused a lot of code breakage, but 20.0 caused less, and 21.0 caused less still. In the GNU Emacs world, there were byte code changes made between 19.28 and 19.29, but as far as I know, not between 19.29 and 20.0.
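
The schemes above can be compared side by side by generating the version string for a given branch and release under each one. The Python sketch below is only an illustration: the function names are hypothetical, branches and releases are counted from zero, and the starting major number 21 is taken from the examples in the text.

```python
BASE_MAJOR = 21  # taken from the examples above


def major_per_branch(branch: int, release: int) -> str:
    """First scheme: the major number names the branch."""
    return f"{BASE_MAJOR + branch}.{release}"


def three_part(branch: int, release: int) -> str:
    """Second scheme: major.branch.release."""
    return f"{BASE_MAJOR}.{branch}.{release}"


def increments_of_ten(branch: int, release: int) -> str:
    """Third scheme: each branch owns a block of ten minor numbers."""
    return f"{BASE_MAJOR}.{branch * 10 + release}"


def dos_style(branch: int, release: int) -> str:
    """Fourth scheme: 21.0, 21.01, 21.02, then 21.1, 21.11, 21.12."""
    if release == 0:
        return f"{BASE_MAJOR}.{branch}"
    return f"{BASE_MAJOR}.{branch}{release}"


# The third release (index 2) of the second branch (index 1) under each scheme.
examples = [f(1, 2) for f in (major_per_branch, three_part, increments_of_ten, dos_style)]
```

Written this way, one practical limit of the third scheme becomes visible: a branch can hold at most ten releases before its numbers run into the next branch's block.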

With three active development branches, synchronizing code changes between the branches is obviously somewhat of a problem. To make things easier, I propose a few general guidelines:

Merging between different branches need not happen that often. It should not happen more often than necessary to avoid undue burden on the maintainer, but needs to be done at all defined checkpoints. These checkpoints need to be noted in all of the places that track changes along the branch, for example, in all of the change logs and in all of the CVS tags.
Every code change that can be considered a self-contained unit, no matter how large or small, needs to have a change log entry, preferably a single change log entry associated with it. This is an absolute requirement. There should be no code changes without an associated change log entry. Otherwise, it is highly likely that patches will not be correctly synchronized across all versions, and will get lost. There is no need for change log entries to contain unnecessary detail though, and it is important that there be no more change log entries than necessary, which means that two or more change log entries associated with a single patch need to be grouped together if possible. This might imply that there should be one global change log instead of change logs in each directory, or at the very least, the number of separate change logs should be kept to a minimum.
The patch that is associated with each change log entry needs to be kept around somewhere. The reason for this is that when synchronizing code from some branch to some earlier branch, it is necessary to go through each change log entry and decide whether a change is worthy to make it into a more stable branch. If so, the patch associated with this change needs to be individually applied to the earlier branch.
All changes made in more stable branches get merged into less stable branches unless the change really is completely unnecessary in the less stable branch because it is superseded by some other change. This will probably mean more developers making changes to the semi-stable branch than to the experimental branch. This means that developers should strive to do their development in the most stable branch that they expect their code to go into. An alternative to this which is perhaps more workable is simply to insist that all developers make all patches based off of the experimental branch, and then later merge these patches down to the more stable branches as necessary. This means, however, that submitted patches should never be combinations of two or more unrelated changes. Whenever such patches are submitted, they should either be rejected (which should apply to anybody who should know better, which probably means everybody on the beta list and anybody else who is a regular contributor), or the maintainer or some other designated party needs to filter the combined patch into separate patches, one per logical change.
The maintainer should keep all the patches around in some database, and each patch should be given an identifier consisting of the author of the patch, the date the patch was submitted, and some other identifying characteristic, such as a number, in case there is more than one patch on the same date by the same author. The database should be kept correctly marked at all times with something indicating which branches each patch has been applied to, and it should ideally be publicly visible, so that patch authors can determine whether their patches have been received and whether they have been applied, and patches do not get needlessly resubmitted.
Global automatable changes such as textual renaming, reordering, and additions or deletions of parameters in function calls should still be allowed, even with multiple development branches. (Sometimes these are necessary for code cleanliness, and in the long run, they save a lot of time, even though they may cause some headaches in the short term.) In general, when such changes are made, they should occur in a separate beta version that contains only such changes and no other patches, and the changes should be made in both the semi-stable and experimental branches at the same time. The description of the beta version should make it very clear that the beta consists solely of such changes. The reason for doing these things is to make it easier for people to diff between beta versions in order to figure out the changes that were made, without the diff getting cluttered up by code cleanliness changes that don't change any actual behavior.
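
The patch database described in these guidelines could be modeled roughly as follows. This is a hedged sketch in Python, not a description of any existing system: the `Patch` record, its field names, the identifier layout, and the helper `pending_forward_merges` are all assumptions chosen to mirror the text (one change log entry per patch, an identifier built from author, date, and a number, and a record of which branches the patch has reached).

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Set, Tuple

# Branches ordered from most stable to least stable.
BRANCH_ORDER = ["stable", "semi-stable", "experimental"]


@dataclass
class Patch:
    author: str
    submitted: date
    sequence: int    # distinguishes same-author, same-day patches
    changelog: str   # the single change log entry for this patch
    applied_to: Set[str] = field(default_factory=set)
    superseded_in: Set[str] = field(default_factory=set)  # merge not needed there

    @property
    def identifier(self) -> str:
        return f"{self.author}-{self.submitted.isoformat()}-{self.sequence}"


def pending_forward_merges(patches: List[Patch]) -> List[Tuple[str, str]]:
    """List (patch id, branch) pairs where a less stable branch lacks the patch."""
    pending = []
    for patch in patches:
        applied = [BRANCH_ORDER.index(name) for name in patch.applied_to]
        if not applied:
            continue  # received but not yet applied anywhere
        for branch in BRANCH_ORDER[min(applied) + 1:]:
            if branch not in patch.applied_to and branch not in patch.superseded_in:
                pending.append((patch.identifier, branch))
    return pending


# Hypothetical example data.
fix = Patch("jdoe", date(1999, 1, 2), 1, "Fix crash in redisplay", applied_to={"stable"})
todo = pending_forward_merges([fix])
```

Merging in the other direction, from the experimental branch down to more stable ones, stays a per-patch decision by the maintainer, so the sketch does not try to automate it.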

Ben Wing
