Version number and development tree organization
Owner: ???
Effort: ???
Dependencies: ???
Abstract: The purpose of this proposal is to present a
coherent plan for how development branches in XEmacs are managed.
This will cover such issues as stable versus experimental branches,
creating new branches, synchronizing patches between branches, and how
version numbers are assigned to branches.
A development branch is defined to be a linear series of releases
of the XEmacs code base, each of which is derived from the previous
one. When the XEmacs development tree is forked and two branches are
created where there used to be one, the branch that is intended to be
more stable and have fewer changes made to it is considered the one
that inherits the parent branch, and the other branch is considered to
have begun at the branching point. The less stable of the two
branches will eventually be forked again, whereas the more stable
branch usually will not be, and its development will eventually come
to an end. This means that every branch has a definite ending point.
For example, the 20.x branch began at the point when the released
19.13 code tree was split into a 19.x and a 20.x branch, and the 20.x
branch will end when the last 20.x release (probably numbered 20.5 or
20.6) is released.
I think that there should always be three active development
branches at any time. These branches can be designated the stable,
the semi-stable, and the experimental branches. This situation has
existed in the current code tree ever since the 21.0 development
branch was split. In this situation, the stable branch is the 20.x
series. The semi-stable branch is the 21.0 release and the stability
releases that follow. The experimental branch is the branch that was
created as the result of the 21.0 development branch split.
Typically, the stable branch has been released for a long period of
time. The semi-stable branch has been released for a short period of
time, or is about to be released, and the experimental branch has not
yet been released, and will probably not be released for a while. The
conditions that should hold in all circumstances are:
There should be three active branches.
The experimental branch should never be in feature freeze.
The reason for the second condition is to ensure that active
development can always proceed and is never throttled, as is happening
currently at the end of the 21.0 release cycle. What this means is
that as soon as the experimental branch is deemed to be stable enough
to go into feature freeze:
The current stable branch is made inactive and all further
development on it ceases.
The semi-stable branch, which by now should have been released for
a fair amount of time, and should be fairly stable, gets renamed to
the stable branch.
The experimental branch is forked into two branches, one of which
becomes the semi-stable branch, and the other, the experimental
branch.
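The three rotation steps above can be sketched as a small state change. This is only an illustration: the `Tree` class, the `feature_freeze` function, and the branch labels such as "22.x" and "23.x" are hypothetical names chosen for the example, and the labels assume the numbering scheme in which each branch gets its own major number.

```python
from dataclasses import dataclass, field


@dataclass
class Tree:
    """Hypothetical model of the three active branches plus retired ones."""
    stable: str
    semi_stable: str
    experimental: str
    inactive: list = field(default_factory=list)


def feature_freeze(tree: Tree, new_experimental: str) -> Tree:
    """Apply the three rotation steps in order and return the new state."""
    # Step 1: the current stable branch is made inactive.
    inactive = tree.inactive + [tree.stable]
    # Step 2: the semi-stable branch is renamed to the stable branch.
    stable = tree.semi_stable
    # Step 3: the experimental branch forks. The side entering feature
    # freeze inherits the parent branch and becomes semi-stable; the
    # other side begins a new experimental branch at the branching point.
    semi_stable = tree.experimental
    return Tree(stable, semi_stable, new_experimental, inactive)


# Illustrative labels only; the text does not fix these names.
before = Tree(stable="20.x", semi_stable="21.x", experimental="22.x")
after = feature_freeze(before, new_experimental="23.x")
print(after)
```

Both conditions hold after the rotation: there are still exactly three active branches, and the new experimental branch starts out with no feature freeze on it.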
The stable branch is always in high resistance, which is to say
that the only changes that can be made to the code are important bug
fixes involving a small amount of code where it should be clear just
by reading the code that no destabilizing code has been introduced.
The semi-stable branch is in low resistance, which means that no major
features can be added but, except right before a release, fairly major
code changes are allowed. Features can be added if they are
sufficiently small, if they are deemed sufficiently critical due to
severe problems that would exist if the features were not added (for
example, replacement of the unexec mechanism with a portable solution
would be a feature that could be added to the semi-stable branch
provided that it did not involve an overly radical code
re-architecture, because otherwise it might be impossible to build
XEmacs on some architectures or with some compilers), or if the
primary purpose of the new feature is to remedy an incompleteness in a
recent architectural change that was not finished in a prior release
due to lack of time (for example, abstracting the mouse pointer and
list-of-colors interfaces, which were left out of 21.0). There is no
feature resistance in place in the experimental branch, which allows
full development to proceed at all times.
In general, both the stable and semi-stable branches will contain
previous net releases. In addition, there will be beta releases in
all three branches, and possibly development snapshots between the
beta releases. It's obviously necessary to have a good version
numbering scheme in order to keep everything straight.
First of all, it needs to be immediately clear from the version
number whether the release is a beta release or a net release. Steve
has proposed getting rid of the beta version numbering system, which I
think would be a big mistake. Furthermore, the net release version
number and beta release version number should be kept separate, just
as they are now, to make it completely clear where any particular
release stands. There may be alternate ways of phrasing a beta
release other than something like 21.0 beta 34, but in all such
systems, the beta number needs to be zero for any release version.
Three possible alternative systems, none of which I like very much,
are:
The beta number is simply an extra number in the regular version
number. Then, for example, 21.0 beta 34 becomes 21.0.34. The problem
is that the release version, which would simply be called 21.0,
appears to be earlier than 21.0 beta 34.
The beta releases appear as later revisions of earlier releases.
Then, for example, 21.1 beta 34 becomes 21.0.34, and 21.0 beta 34
would have to become 21.-1.34. This has both the obvious ugliness of
negative version numbers and the problem that it makes beta releases
appear to be associated with their previous releases, when in fact
they are more closely associated with the following release.
Simply make the beta version number be negative. In this scheme,
you'd start with something like -1000 as the first beta, and then 21.0
beta 34 would get renumbered to 21.0.-968. Obviously, this is a crazy
and convoluted scheme as well, and we would do best to avoid it.
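The ordering problem with the first alternative can be seen in a few lines of code. The sketch below is not taken from any XEmacs tooling; `parse`, `naive_key`, and `release_order_key` are hypothetical helpers, and the version strings are assumed to look like "21.0" or "21.0 beta 34". A net release is given beta number zero, as required above.

```python
def parse(version: str) -> tuple:
    """Parse '21.0' or '21.0 beta 34' into (major, minor, beta).

    A net release gets beta number 0.
    """
    parts = version.split()
    major, minor = (int(n) for n in parts[0].split("."))
    is_beta = len(parts) == 3 and parts[1] == "beta"
    beta = int(parts[2]) if is_beta else 0
    return (major, minor, beta)


def naive_key(version: str) -> tuple:
    # First alternative: the beta number is just a third version component,
    # so the net release (beta 0) sorts before all of its own betas.
    return parse(version)


def release_order_key(version: str) -> tuple:
    major, minor, beta = parse(version)
    # Keep the beta number separate: betas (flag 0) sort before the
    # net release (flag 1) that shares their major.minor number.
    is_release = 1 if beta == 0 else 0
    return (major, minor, is_release, beta)


versions = ["21.0", "21.0 beta 34", "21.0 beta 2", "20.4"]
naive = sorted(versions, key=naive_key)
ordered = sorted(versions, key=release_order_key)
print(naive)
print(ordered)
```

With the naive key, "21.0" lands ahead of "21.0 beta 2"; with the separate beta flag it lands after "21.0 beta 34", which matches the actual order of release.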
Currently, the between-beta snapshots are not numbered, but I think
that they probably should be. If appropriate scripts are written to
automate beta releases, it should be very easy to have a version number
automatically updated whenever a snapshot is made. The number could
be added either as a separate snapshot number, and you'd have 21.0
beta 34 pre 1, which comes before 21.0 beta 34; or we could make the
beta number be floating point, and then the same snapshot would have
to be called 21.0 beta 33.1. The latter solution seems quite kludgey
to me.
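A release script could maintain the separate snapshot number along the following lines. This is a minimal sketch under stated assumptions: the "pre N" string format is the one suggested above, while `PATTERN`, `snapshot_key`, and `next_snapshot` are hypothetical names, not part of any existing script.

```python
import re

# Matches "21.0 beta 34" and "21.0 beta 34 pre 1" (assumed format).
PATTERN = re.compile(r"^(\d+)\.(\d+) beta (\d+)(?: pre (\d+))?$")


def snapshot_key(name: str) -> tuple:
    """Sort key that places 'pre' snapshots before their beta."""
    m = PATTERN.match(name)
    if m is None:
        raise ValueError(f"unrecognized version string: {name!r}")
    major, minor, beta = int(m.group(1)), int(m.group(2)), int(m.group(3))
    pre = m.group(4)
    # (0, n) for "pre n"; (1, 0) for the beta itself, so it sorts last.
    stage = (0, int(pre)) if pre is not None else (1, 0)
    return (major, minor, beta, stage)


def next_snapshot(name: str) -> str:
    """Name of the snapshot that follows the given beta or snapshot."""
    major, minor, beta, stage = snapshot_key(name)
    if stage[0] == 1:
        # The last thing made was a beta: start counting toward the next.
        return f"{major}.{minor} beta {beta + 1} pre 1"
    return f"{major}.{minor} beta {beta} pre {stage[1] + 1}"


names = ["21.0 beta 34", "21.0 beta 33",
         "21.0 beta 34 pre 2", "21.0 beta 34 pre 1"]
in_order = sorted(names, key=snapshot_key)
print(in_order)
print(next_snapshot("21.0 beta 33"))
```

Keeping the snapshot counter as its own integer avoids the floating-point alternative, where the snapshot before beta 34 would have to be written 33.1.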
There also needs to be a clear way to distinguish, when a net
release is made, which branch the release is a part of. Again, several
solutions come to mind:
The major version number reflects which development branch the
release is in and the minor version number indicates how many releases
have been made along this branch. In this scheme, 21.0 is always the
first release of the 21 series development branch, and when this
branch is split, the child branch that becomes the experimental branch
gets version numbers starting with 22. This scheme is the simplest,
and it's the one I like best.
We move to a three-part version number. In this scheme, the first
two numbers indicate the branch, and the third number indicates the
release along the branch. In this scheme, we have numbers like
21.0.1, which would be the second release in the 21.0 series branch,
and 21.1.2, which would be the third release in the
21.1 series branch. The major version number then gets increased
only very occasionally, and only when a sufficiently major
architectural change has been made, particularly one that causes
compatibility problems with code written for previous branches. I
think schemes like this are unnecessary in most circumstances, because
usually either the major version number ends up changing so often that
the second number is always either zero or one, or the major version
number never changes, and as such becomes useless. By the time the
major version number would change, the product itself has changed so
much that it often gets renamed. Furthermore, it is clear that the
two version number scheme has been used throughout most of the history
of Emacs, and recently we have been following the two number scheme
also. If we introduced a third revision number, at this point it
would both confuse existing code that assumed there were two numbers,
and would look rather silly given that the major version number is so
high and would probably remain at the same place for quite a long
time.
A third scheme that would attempt to cross the two schemes
would keep the same concept of major version number as for the three
number scheme, and would compress the second and third numbers of the
three number scheme into one number by using increments of ten. For
example, the current 21.x branch would have releases numbered 21.0, 21.1,
etc. The next branch would be numbered 21.10, 21.11, etc. I don't like
this scheme very much because it seems rather kludgey, and also
because it is not used in any other product as far as I know.
Another scheme that would combine the second and third numbers
in the three number scheme would be to have the releases in the
current 21.x series be numbered 21.0, then 21.01, then 21.02, etc.
The next series is 21.1, then 21.11, then 21.12, etc. This is similar
to the way that version numbers are done for DOS and Windows. I also
think that this scheme is fairly silly because, like the previous
scheme, its only purpose is to avoid increasing the major version
number very much. But given that we already have a fairly large
major version number, there doesn't seem to be any particular problem
with increasing this number by one every year or two. Some people
will object that by doing this, it becomes impossible to tell when a
change is so major that it causes a lot of code breakage, but past
releases have not been accurate indicators of this. For example,
19.12 caused a lot of code breakage, but 20.0 caused less, and 21.0
caused less still. In the GNU Emacs world, there were byte code
changes made between 19.28 and 19.29, but as far as I know, not
between 19.29 and 20.0.
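To compare the schemes above side by side, each one can be written as a function from a (branch, release) pair to a version string. The sketch below is illustrative only: the function names are made up for this example, `branch` counts branches starting from the current 21 series (0 for the first), and `release` counts releases along that branch starting from 0.

```python
BASE_MAJOR = 21  # the examples in the text all start from the 21 series


def major_is_branch(branch: int, release: int) -> str:
    # Major number names the branch; minor number counts releases on it.
    return f"{BASE_MAJOR + branch}.{release}"


def three_part(branch: int, release: int) -> str:
    # First two numbers name the branch; third counts releases on it.
    return f"{BASE_MAJOR}.{branch}.{release}"


def increments_of_ten(branch: int, release: int) -> str:
    # Second and third numbers compressed into one, ten slots per branch.
    if not 0 <= release < 10:
        raise ValueError("this sketch leaves room for ten releases per branch")
    return f"{BASE_MAJOR}.{10 * branch + release}"


def appended_digit(branch: int, release: int) -> str:
    # 21.0, 21.01, 21.02, ... then 21.1, 21.11, 21.12, ...
    if release == 0:
        return f"{BASE_MAJOR}.{branch}"
    return f"{BASE_MAJOR}.{branch}{release}"


schemes = [major_is_branch, three_part, increments_of_ten, appended_digit]
for branch, release in [(0, 0), (0, 1), (1, 0), (1, 2)]:
    row = [scheme(branch, release) for scheme in schemes]
    print((branch, release), row)
```

Only the first scheme changes the major number when a branch is created, so the third release of the next branch is 22.2 there, against 21.1.2, 21.12, and 21.12 in the other three.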
With three active development branches, synchronizing code changes
between the branches is obviously somewhat of a problem. To make
things easier, I propose a few general guidelines:
Merging between different branches need not happen that often.
It should not happen more often than necessary to avoid undue burden
on the maintainer, but needs to be done at all defined checkpoints.
These checkpoints need to be noted in all of the places that track
changes along the branch, for example, in all of the change logs and
in all of the CVS tags.
Every code change that can be considered a self-contained unit,
no matter how large or small, needs to have a change log entry,
preferably a single change log entry associated with it. This is an
absolute requirement. There should be no code changes without an
associated change log entry. Otherwise, it is highly likely that
patches will not be correctly synchronized across all versions, and
will get lost. There is no need for change log entries to contain
unnecessary detail though, and it is important that there be no more
change log entries than necessary, which means that two or more change
log entries associated with a single patch need to be grouped together
if possible. This might imply that there should be one global change
log instead of change logs in each directory, or at the very least,
the number of separate change logs should be kept to a minimum.
The patch that is associated with each change log entry needs to
be kept around somewhere. The reason for this is that when
synchronizing code from some branch to some earlier branch, it is
necessary to go through each change log entry and decide whether a
change is worthy to make it into a more stable branch. If so, the
patch associated with this change needs to be individually applied to
the earlier branch.
All changes made in more stable branches get merged into less
stable branches unless the change really is completely unnecessary in
the less stable branch because it is superseded by some other change.
This will probably mean more developers making changes to the
semi-stable branch than to the experimental branch. This means that
developers should strive to do their development in the most stable
branch that they expect their code to go into. An alternative to this
which is perhaps more workable is simply to insist that all developers
make all patches based off of the experimental branch, and then later
merge these patches down to the more stable branches as necessary.
This means, however, that submitted patches should never be
combinations of two or more unrelated changes. Whenever such patches
are submitted, they should either be rejected (which should apply to
anybody who should know better, which probably means everybody on the
beta list and anybody else who is a regular contributor), or the
maintainer or some other designated party needs to filter the combined
patch into separate patches, one per logical change.
The maintainer should keep all the patches around in some
database, and the patches should be given an identifier consisting of the
author of the patch, the date the patch was submitted, and some other
identifying characteristic, such as a number, in case there is more
than one patch on the same date by the same author. The database
should hopefully be correctly marked at all times with something
indicating which branches the patch has been applied to, and this
database should hopefully be publicly visible so that patch authors
can determine whether their patches have been applied, and whether
their patches have been received, so that patches do not get
needlessly resubmitted.
Global automatable changes such as textual renaming,
reordering, and additions or deletions of parameters in function calls
should still be allowed, even with multiple development branches.
(Sometimes these are necessary for code cleanliness, and in the long
run, they save a lot of time, even though they may cause some
headaches in the short-term.) In general, when such changes are made,
they should occur in a separate beta version that contains only such
changes and no other patches, and the changes should be made in both
the semi-stable and experimental branches at the same time. The
description of the beta version should make it very clear that the
beta consists only of such changes. The reason for doing these things
is to make it easier for people to diff between beta versions in order
to figure out the changes that were made without the diff getting
cluttered up by these code cleanliness changes that don't change any
actual behavior.
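The patch database guideline, together with the rule that changes in more stable branches get merged into less stable ones, could be modeled roughly as follows. Everything here is an assumption made for illustration: the `Patch` record, its field names, the identifier format, and the sample entry are all hypothetical, and the sketch ignores the case where a change is superseded in a less stable branch.

```python
from dataclasses import dataclass, field
from datetime import date

# Branch roles ordered from most to least stable.
BRANCHES = ["stable", "semi-stable", "experimental"]


@dataclass
class Patch:
    author: str
    submitted: date
    serial: int    # distinguishes same-day patches by the same author
    summary: str   # text of the single associated change log entry
    applied_to: set = field(default_factory=set)

    @property
    def patch_id(self) -> str:
        # Author, submission date, and a number, as the guideline suggests.
        return f"{self.author}-{self.submitted.isoformat()}-{self.serial}"


def pending_merges(patch: Patch) -> list:
    """Less stable branches still missing a patch that a more stable one has."""
    if not patch.applied_to:
        return []
    most_stable = min(BRANCHES.index(b) for b in patch.applied_to)
    less_stable = BRANCHES[most_stable + 1:]
    return [b for b in less_stable if b not in patch.applied_to]


# Made-up sample entry, not a real patch.
example = Patch("jdoe", date(1999, 1, 15), 1, "Fix crash in redisplay",
                applied_to={"semi-stable"})
print(example.patch_id)
print(pending_merges(example))
```

A publicly visible listing of such records would let a patch author look up the identifier and see at once which branches have received the change.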
Ben Wing