[ < ] | [ > ] | [ << ] | [ Up ] | [ >> ] | [Top] | [Contents] | [Index] | [ ? ] |
42.1 Dumping Justification
42.2 Overview
42.3 Data descriptions
42.4 Dumping phase
42.5 Reloading phase
42.6 Remaining issues
42.1 Dumping Justification
The C code of XEmacs is just a Lisp engine with a lot of built-in primitives useful for writing an editor. The editor itself is written mostly in Lisp, and represents around 100K lines of code. Loading and executing the initialization of all this code takes a bit of time (five to ten times the usual startup time of current xemacs) and requires having all the Lisp source files around. Having to reload them each time the editor is started would not be acceptable.
The traditional solution to this problem is called dumping: the build process first creates the Lisp engine under the name ‘temacs’, then runs it until it has finished loading and initializing all the Lisp code, and finally creates a new executable called ‘xemacs’ that includes both the object code in ‘temacs’ and the entire contents of memory after the initialization.
This solution works, but it has a huge problem: creating the new executable from the actual contents of memory is an extremely system-specific process that is quite error-prone and interferes with a lot of system libraries (like malloc). It is getting even worse nowadays, with libraries using constructors that are automatically called when the program is started (even before main()) and that tend to crash when they are called multiple times, once before dumping and once after (IRIX 6.x ‘libz.so’ pulls in some C++ image libraries through dependencies which have this problem). Writing the dumper is also one of the most difficult parts of porting XEmacs to a new operating system. Basically, ‘dumping’ is an operation that is just not officially supported on many operating systems.
The aim of the portable dumper is to solve the same problem as the system-specific dumper, namely to reload the fully initialized Lisp part of the editor quickly, using only a small number of files and without any system-specific hacks.
42.2 Overview
The portable dumping system has to:
Note: As of 21.5.18, the dump file has been moved inside of the executable, although there are still problems with this on some systems.
42.3 Data descriptions
The most complex task of the dumper is to write memory blocks on the heap (Lisp objects, i.e. lrecords, and C-allocated memory, such as structs and arrays) to disk and reload them at a different address, updating all the pointers they include in the process. This is done by using external data descriptions that give information about the layout of the blocks in memory.
The specification of these descriptions is in lrecord.h. A description of an lrecord is an array of struct memory_description. Each of these structs includes a type, an offset in the block and some optional parameters depending on the type. For instance, here is the string description:
    static const struct memory_description string_description[] = {
      { XD_BYTECOUNT, offsetof (Lisp_String, size) },
      { XD_OPAQUE_DATA_PTR, offsetof (Lisp_String, data), XD_INDIRECT(0, 1) },
      { XD_LISP_OBJECT, offsetof (Lisp_String, plist) },
      { XD_END }
    };
The first line indicates a member of type Bytecount, which is used by the next, indirect directive. The second means "there is a pointer to some opaque data in the field data". The length of said data is given by the expression XD_INDIRECT(0, 1), which means "the value in the 0th line of the description (welcome to C) plus one". The third line means "there is a Lisp_Object member plist in the Lisp_String structure". XD_END then ends the description.
This gives us all the information we need to move around what is pointed to by a memory block (C or lrecord) and, by transitivity, everything that it points to. The only missing information for dumping is the size of the block. For lrecords, this is part of the lrecord_implementation, so we don’t need to duplicate it. For C blocks we use a struct sized_memory_description, which includes a size field and a pointer to an associated array of memory_description.
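As an illustration of how such a description can be read, here is a minimal sketch that walks a description array and resolves an indirect count. Everything in it is a simplified stand-in: the enum values, the three-field struct memory_description, the toy_string struct and, in particular, the bit encoding behind XD_INDIRECT are assumptions made for this example and are not the definitions found in lrecord.h.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins; the real definitions live in lrecord.h. */
enum desc_type { XD_END, XD_BYTECOUNT, XD_OPAQUE_DATA_PTR, XD_LISP_OBJECT };

/* Hypothetical encoding: a negative code means "value of line VAL plus DELTA". */
#define XD_INDIRECT(val, delta) (-1L - ((long) (val) | ((long) (delta) << 8)))
#define INDIRECT_LINE(code)  ((int) ((-1L - (code)) & 0xff))
#define INDIRECT_DELTA(code) ((-1L - (code)) >> 8)

struct memory_description {
  enum desc_type type;
  size_t offset;
  long data1;              /* literal count, or an XD_INDIRECT code */
};

struct toy_string {        /* stand-in for Lisp_String */
  long size;
  char *data;
  void *plist;
};

static const struct memory_description toy_string_description[] = {
  { XD_BYTECOUNT,       offsetof (struct toy_string, size),  0 },
  { XD_OPAQUE_DATA_PTR, offsetof (struct toy_string, data),  XD_INDIRECT (0, 1) },
  { XD_LISP_OBJECT,     offsetof (struct toy_string, plist), 0 },
  { XD_END, 0, 0 }
};

/* Resolve a count: a non-negative code is literal, a negative one
   refers to a Bytecount member described on another line. */
static long
resolve_count (long code, const void *obj,
               const struct memory_description *desc)
{
  if (code >= 0)
    return code;
  {
    int line = INDIRECT_LINE (code);
    long delta = INDIRECT_DELTA (code);
    assert (desc[line].type == XD_BYTECOUNT);
    return *(const long *) ((const char *) obj + desc[line].offset) + delta;
  }
}

/* Length in bytes of the first opaque data block OBJ points to, or -1. */
static long
opaque_data_length (const void *obj, const struct memory_description *desc)
{
  int i;
  for (i = 0; desc[i].type != XD_END; i++)
    if (desc[i].type == XD_OPAQUE_DATA_PTR)
      return resolve_count (desc[i].data1, obj, desc);
  return -1;
}
```

For a toy_string whose size member is 5, opaque_data_length() yields 6: the value found through line 0 of the description, plus one.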
42.4 Dumping phase
Dumping is done by calling the function pdump() (in ‘dumper.c’), which is invoked from Fdump_emacs (in ‘emacs.c’). This function performs a number of tasks.
42.4.1 Object inventory
42.4.2 Address allocation
42.4.3 The header
42.4.4 Data dumping
42.4.5 Pointers dumping
42.4.1 Object inventory
The first task is to build the list of the objects to dump. This includes:
We end up with one pdump_block_list_elt per object group (arrays of C structs are kept together) which includes a pointer to the first object of the group, the per-object size and the count of objects in the group, along with some other information which is initialized later.
These entries are linked together in pdump_block_list structures and can be enumerated through any of the following:
* pdump_object_table, an array of pdump_block_list, one per lrecord type, indexed by type number.
* pdump_opaque_data_list, used for the opaque data which does not include pointers, and hence does not need descriptions.
* pdump_desc_table, which is a vector of memory_description/pdump_block_list pairs, used for non-opaque C memory blocks.
This uses a marking strategy similar to the garbage collector. Some differences though:
This is done by pdump_register_object(), which handles Lisp_Object variables, and pdump_register_block(), which handles generic memory blocks (C structures, arrays, etc.); both delegate the description management to pdump_register_sub().
The hash table doubles as a map from objects to pdump_block_list_elmt entries (i.e. it allows us to look up a pdump_block_list_elmt given the object it points to). Entries are added with pdump_add_block() and looked up with pdump_get_block(). There is no need for entry removal. The hash value is computed quite simply from the object pointer by pdump_make_hash().
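The following is a minimal sketch of such a pointer-keyed table with insertion and lookup but no removal. The table size, the use of linear probing, the shift applied to the address and all toy_ names are assumptions made for the illustration; they are not taken from ‘dumper.c’.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_HASH_SIZE 1024   /* assumed size; must stay larger than the entry count */

struct toy_block_elt {       /* stand-in for a block list element */
  const void *obj;           /* first object of the group */
  size_t size;               /* per-object size */
  int count;                 /* number of objects in the group */
};

static struct toy_block_elt *toy_table[TOY_HASH_SIZE];

/* Hash computed from the address alone; the low bits are dropped
   because aligned pointers all share them. */
static size_t
toy_make_hash (const void *obj)
{
  return (size_t) (((uintptr_t) obj >> 3) % TOY_HASH_SIZE);
}

/* Return the entry registered for OBJ, or NULL if there is none. */
static struct toy_block_elt *
toy_get_block (const void *obj)
{
  size_t pos = toy_make_hash (obj);
  while (toy_table[pos] != NULL)
    {
      if (toy_table[pos]->obj == obj)
        return toy_table[pos];
      pos = (pos + 1) % TOY_HASH_SIZE;   /* linear probing */
    }
  return NULL;
}

/* Register ELT.  Since entries are never removed, probing needs no
   tombstones: the first empty slot is always a valid place to stop. */
static void
toy_add_block (struct toy_block_elt *elt)
{
  size_t pos;
  assert (elt != NULL && elt->obj != NULL);
  pos = toy_make_hash (elt->obj);
  while (toy_table[pos] != NULL)
    pos = (pos + 1) % TOY_HASH_SIZE;
  toy_table[pos] = elt;
}
```

The same lookup answers both questions the inventory needs: "has this object already been seen?" and "which entry describes it?".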
The roots for the marking are:
* The staticpro’ed variables (there is a special staticpro_nodump() call for protected variables we do not want to dump).
* Variables registered with dump_add_root_lisp_object (staticpro() is equivalent to staticpro_nodump() + dump_add_root_lisp_object()).
* Memory blocks registered with dump_add_root_block (for blocks with relocatable pointers) or dump_add_opaque (for "opaque" blocks with no relocatable pointers; this is just a shortcut for calling dump_add_root_block with a NULL description).
* Pointer variables registered with dump_add_root_block_ptr, each of which points to a block of heap memory (generally a C structure or array). Note that dump_add_root_block_ptr is not technically necessary, as a pointer variable can be seen as a special case of a data-segment memory block and registered using dump_add_root_block. Doing it this way, however, would require declaring another level of static structures. Since pointer variables are quite common, dump_add_root_block_ptr is provided for convenience. Note also that internally we have to treat it separately from dump_add_root_block rather than writing the former as a call to the latter, since we don’t have support for creating and using memory descriptions on the fly; they must all be statically declared in the data segment.
This does not include the GCPRO’ed variables, the specbinds, the catchtags, the backlist, the redisplay or the profiling info, since we do not want to rebuild the actual chain of Lisp calls that led up to the dump-emacs call, only the global variables.
Weak lists and weak hash tables are dumped as if they were their non-weak equivalents (without changing their type, of course). This has not yet been a problem.
42.4.2 Address allocation
The next step is to allocate the offsets of each of the objects in the final dump file. This is done by pdump_allocate_offset(), which is called indirectly by pdump_scan_by_alignment().
The strategy to deal with alignment problems uses these facts:
Hence, for each lrecord type, C struct type or opaque data block the alignment requirement is computed as a power of two, with a minimum of 2^2 for lrecords. pdump_scan_by_alignment() then scans all the pdump_block_list_elmt’s, the ones with the highest requirements first. This ensures the best packing.
The maximum alignment requirement we take into account is 2^8.
pdump_allocate_offset() only has to do a linear allocation, starting at offset 256 (this leaves room for the header and keeps the alignments happy).
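The allocation strategy can be sketched as follows. This is a simplified model, not the code of pdump_allocate_offset(): the toy_group structure and function names are hypothetical, and each group is explicitly rounded up to its alignment, which is one way (assumed here) of keeping the offsets valid when sizes are not multiples of the alignment.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_ALIGN_LOG2 8     /* 2^8, the largest alignment considered */
#define TOY_FIRST_OFFSET 256     /* leaves room for the header */

struct toy_group {
  size_t size;        /* per-object size in bytes */
  int count;          /* number of objects in the group */
  int align_log2;     /* alignment requirement, as a power of two */
  size_t offset;      /* file offset, filled in by the allocator */
};

/* Round OFFSET up to the next multiple of 2^ALIGN_LOG2. */
static size_t
toy_align_up (size_t offset, int align_log2)
{
  size_t mask = ((size_t) 1 << align_log2) - 1;
  return (offset + mask) & ~mask;
}

/* Visit the groups from the highest alignment requirement to the
   lowest and hand out consecutive offsets.  Returns the end offset. */
static size_t
toy_allocate_offsets (struct toy_group *groups, int ngroups)
{
  size_t cur = TOY_FIRST_OFFSET;
  int align, i;

  for (i = 0; i < ngroups; i++)
    assert (groups[i].align_log2 >= 0
            && groups[i].align_log2 <= TOY_MAX_ALIGN_LOG2);

  for (align = TOY_MAX_ALIGN_LOG2; align >= 0; align--)
    for (i = 0; i < ngroups; i++)
      if (groups[i].align_log2 == align)
        {
          cur = toy_align_up (cur, align);
          groups[i].offset = cur;
          cur += groups[i].size * (size_t) groups[i].count;
        }
  return cur;
}
```

Because the strictest alignments are served first and the base offset is itself a multiple of 2^8, little or no padding is needed between groups.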
42.4.3 The header
The next step creates the file and writes a header containing a signature and some miscellaneous information. The reloc_address field, which indicates at which address the file should be loaded if we want to avoid post-reload relocation, is set to 0. It then seeks to offset 256 (the base offset for the objects).
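A minimal sketch of this step, using standard C I/O, might look like the following. The header layout, the signature string and the toy_ names are invented for the example; the actual header in ‘dumper.c’ contains more fields.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define TOY_SIGNATURE "ToyDump1"   /* hypothetical 8-byte signature */
#define TOY_SIGNATURE_LEN 8
#define TOY_HEADER_ROOM 256L       /* base offset for the objects */

struct toy_header {
  char signature[TOY_SIGNATURE_LEN];
  unsigned long reloc_address;     /* preferred load address; 0 at dump time */
};

/* Write the header at the current position of OUT (assumed to be the
   start of the file) and leave the file position at TOY_HEADER_ROOM.
   Returns 0 on success, -1 on failure. */
static int
toy_write_header (FILE *out)
{
  struct toy_header header;

  assert (sizeof header <= (size_t) TOY_HEADER_ROOM);
  memset (&header, 0, sizeof header);
  memcpy (header.signature, TOY_SIGNATURE, TOY_SIGNATURE_LEN);
  header.reloc_address = 0;

  if (fwrite (&header, sizeof header, 1, out) != 1)
    return -1;
  return fseek (out, TOY_HEADER_ROOM, SEEK_SET) == 0 ? 0 : -1;
}
```

After a successful call, the next write lands at offset 256, which is where the address allocation step placed the first object.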
42.4.4 Data dumping
The data is dumped by pdump_dump_data(), called from pdump_scan_by_alignment(), in the same order as the addresses were allocated. This function copies the data to a temporary buffer, relocates all pointers in the object to the addresses allocated in the address allocation step, and writes it to the file. Using the same order means that, if we are careful with lrecords whose size is not a multiple of 4, we are guaranteed that the object is always written at the offset in the file allocated in the address allocation step.
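The copy-then-relocate idea can be illustrated with a small sketch. It assumes a single hypothetical object type (toy_cell) with one pointer field and a plain array that maps objects to their allocated offsets; the real code instead follows the memory descriptions and looks targets up in the hash table.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct toy_cell {              /* hypothetical object with one pointer field */
  int value;
  struct toy_cell *next;
};

struct toy_placement {         /* result of the address allocation step */
  const void *obj;
  uintptr_t offset;
};

/* File offset allocated for OBJ; a null pointer maps to 0. */
static uintptr_t
toy_offset_of (const void *obj, const struct toy_placement *map, int n)
{
  int i;
  if (obj == NULL)
    return 0;
  for (i = 0; i < n; i++)
    if (map[i].obj == obj)
      return map[i].offset;
  assert (!"pointer to an object that was never registered");
  return 0;
}

/* Copy CELL into BUF and overwrite the pointer field of the copy with
   the allocated offset of its target.  The live object is untouched. */
static void
toy_dump_cell (const struct toy_cell *cell,
               const struct toy_placement *map, int n,
               unsigned char *buf)
{
  uintptr_t off = toy_offset_of (cell->next, map, n);
  assert (sizeof off == sizeof cell->next);
  memcpy (buf, cell, sizeof *cell);
  memcpy (buf + offsetof (struct toy_cell, next), &off, sizeof off);
}
```

Working on a temporary copy matters: the running editor keeps its valid pointers, and only the bytes that go to the file hold the relocated values.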
42.4.5 Pointers dumping
A number of tables needed to properly reassign the global pointers are then written. They are:
For each of the dynarrs we write both the pointer to the variables and the relocated offset of the object they point to. Since these variables are global, the pointers are still valid when restarting the program and are used to regenerate the global pointers.
The pdump_weak_object_chains dynarr is a special case. The variables it points to are the heads of weak linked lists of Lisp objects of the same type. Not all objects of such a list are dumped, so the relocated pointer we associate with them points to the first dumped object of the list, or Qnil if none is available. This is also the reason why they are not used as roots for the purpose of object enumeration.
Some very important information, like the staticpros and lrecord_implementations_table, is handled indirectly using dump_add_opaque or dump_add_root_block_ptr.
This is the end of the dumping part.
42.5 Reloading phase
The file is mmap’ed into memory (which ensures a PAGESIZE alignment, at least 4096) or, if mmap is unavailable or fails, a 256-byte-aligned block is malloc’ed and the file is loaded into it.
Some variables are reinitialized from the values found in the header.
The difference between the actual loading address and the reloc_address is computed and will be used for all the relocations.
The memory contents are restored in the obvious and trivial way.
The variables pointed to by pdump_root_block_ptrs in the dump phase are reset to the right relocated object addresses.
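As a sketch of what "putting back" means here, the table written at dump time can be modeled as pairs of (address of the global variable, offset of its target in the dump). The toy_root_ptr structure and function name below are hypothetical; only the idea of the pairs comes from the text above.

```c
#include <assert.h>
#include <stddef.h>

struct toy_root_ptr {
  void **variable;   /* address of a global pointer variable; it lives in the
                        data segment, so it is the same in every run */
  size_t offset;     /* offset of the pointed-to object inside the dump */
};

/* Point every registered global variable at its object in the image
   loaded at IMAGE. */
static void
toy_restore_root_ptrs (const struct toy_root_ptr *roots, int n,
                       unsigned char *image)
{
  int i;
  for (i = 0; i < n; i++)
    {
      assert (roots[i].variable != NULL);
      *roots[i].variable = image + roots[i].offset;
    }
}
```

Only the target addresses depend on where the file was loaded, which is why the offsets are stored rather than the addresses seen at dump time.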
All the objects are relocated using their description and their offset by pdump_reloc_one. This step is unnecessary if the reloc_address is equal to the file loading address.
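The relocation itself amounts to adding one delta, the difference between the actual loading address and the reloc_address, to every pointer stored in the image. The sketch below shows this for a single hypothetical object type with one pointer field; the toy_ names are invented, and treating a stored value of 0 as a null pointer is an assumption of the example (it is safe here because no object is placed at offset 0).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct toy_node {              /* hypothetical dumped object */
  int value;
  struct toy_node *next;
};

/* Add DELTA to the pointer stored at FIELD; a stored 0 stays null. */
static void
toy_reloc_pointer (unsigned char *field, uintptr_t delta)
{
  uintptr_t stored;
  memcpy (&stored, field, sizeof stored);
  if (stored != 0)
    {
      stored += delta;
      memcpy (field, &stored, sizeof stored);
    }
}

/* Relocate the toy_node found at OFFSET inside the image loaded at
   IMAGE, given the RELOC_ADDRESS recorded in the header. */
static void
toy_reloc_node (unsigned char *image, size_t offset, uintptr_t reloc_address)
{
  uintptr_t delta = (uintptr_t) image - reloc_address;
  assert (sizeof (uintptr_t) == sizeof (struct toy_node *));
  toy_reloc_pointer (image + offset + offsetof (struct toy_node, next), delta);
}
```

When the image happens to be loaded exactly at reloc_address, the delta is zero and no byte of the image needs to be written.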
Same as Putting back the pdump_root_block_ptrs.
Since some of the hash values in the Lisp hash tables are address-dependent, their layout is now wrong. So we go through each of them and have them re-sorted by calling pdump_reorganize_hash_table.
42.6 Remaining issues
The build process will have to start a post-dump xemacs, ask it for the loading address (which will, hopefully, always be the same between different xemacs invocations; unfortunately, this is not true on Linux with the ExecShield feature) and relocate the file to the new address. This way the object relocation phase will not have to be done, which means no writes to the objects and, because of the use of mmap, the dumped data will be shared between all the xemacs processes running on the computer.
Some executable signature will be necessary to ensure that a given dump file is really associated with a given executable, or random crashes will occur. This could be a random number set at compile or configure time through a define. It would also allow for having differently-compiled xemacsen on the same system (mule and no-mule come to mind).
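One possible shape for such a check is sketched below, purely as an illustration of the idea. The macro TOY_BUILD_ID, its default value, the header layout and the function names are all hypothetical; in a real build the number would be generated by configure or the compiler command line rather than hard-coded.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical: normally supplied at build time, e.g. -DTOY_BUILD_ID=... */
#ifndef TOY_BUILD_ID
#define TOY_BUILD_ID 0x5EED1234UL
#endif

#define TOY_ID_SIGNATURE "ToyDump1"   /* hypothetical 8-byte file signature */

struct toy_dump_header {
  char signature[8];
  unsigned long build_id;   /* copied from the executable at dump time */
};

/* Fill in the identification fields when the dump file is written. */
static void
toy_stamp_header (struct toy_dump_header *h)
{
  assert (h != NULL);
  memcpy (h->signature, TOY_ID_SIGNATURE, sizeof h->signature);
  h->build_id = TOY_BUILD_ID;
}

/* Nonzero if the dump file described by H was written by this
   executable (same signature and same build number). */
static int
toy_header_matches (const struct toy_dump_header *h)
{
  return memcmp (h->signature, TOY_ID_SIGNATURE, sizeof h->signature) == 0
         && h->build_id == TOY_BUILD_ID;
}
```

An executable that finds a mismatching header could then refuse the dump file up front instead of crashing later.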
The DOC file contents should probably end up in the dump file.
This document was generated by Aidan Kehoe on December 27, 2016 using texi2html 1.82.