Lisp-level encoding stream interface

Ben Wing <ben@xemacs.org>

An lstream interface for use in creating arbitrary Lisp coding systems (not just international encodings but gzip, base64, md5, etc.).

Status

Not for inclusion. Specification is mostly complete.

Open bugs

None.

Other open issues

The following specification needs to be implemented.

- Lisp Stream API

<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<

Expose XEmacs internal lstreams to Lisp as stream objects.

(In addition to the functions given below, each stream object has properties that can be associated with it using the standard put, get, etc. API. For GNU Emacs, where put and get have not been extended to be general property functions but work only on symbols, we would have to create the functions set-stream-property, stream-property, remove-stream-property, and stream-properties. These provide the same functionality as the generic get, put, remprop, and object-plist functions under XEmacs.)

(Implement properties using a hash table, and *generalize* this so that it is extremely easy to add a property interface onto any kind of object.)

(write-stream STREAM STRING)
Write the STRING to the STREAM. This will signal an error if all the bytes cannot be written.

(read-stream STREAM &optional N SEQUENCE)
Reads data from STREAM. N specifies the number of bytes or characters, depending on the stream. SEQUENCE specifies where to write the data. If N is not specified, data is read until end of file. If SEQUENCE is not specified, the data is returned as a string. If SEQUENCE is specified, it must be large enough to hold the data.

(push-stream-marker STREAM)
returns ID, probably a stream marker object

(pop-stream-marker STREAM)
backs up stream to last marker

(unread-stream STREAM STRING)
STREAM must be an input stream; the data in STRING is pushed back and will be read ahead of all other data.
In general, there is no limit to the amount of data that can be unread or the number of times that unread-stream can be called before another read.

(stream-available-chars STREAM)
This returns the number of characters (or bytes) that can definitely be read from the stream without an error. This can be useful, for example, when dealing with non-blocking streams, where an attempt to read too much data will result in a blocking error.

(stream-seekable-p STREAM)
Returns true if the stream is seekable. If false, operations such as seek-stream and stream-position will signal an error. However, the functions set-stream-marker and seek-stream-marker will still succeed for an input stream.

(stream-position STREAM)
If STREAM is a seekable stream, returns a position which can be passed to seek-stream.

(seek-stream STREAM N)
If STREAM is a seekable stream, move to the position indicated by N, otherwise signal an error.

(set-stream-marker STREAM)
If STREAM is an input stream, create a marker at the current position, which can later be moved back to. The stream does not need to be a seekable stream; in that case, all successive data will be buffered to simulate the effect of a seekable stream. Therefore use this function with care.

(seek-stream-marker STREAM MARKER)
Move the stream back to the position that was stored in the marker object. (This is generally an opaque object of type stream-marker.)

(delete-stream-marker MARKER)
Destroy the stream marker and, if the stream is a non-seekable stream and there are no other stream markers pointing to an earlier position, free up some buffering information.

(delete-stream STREAM N)
(delete-stream-marker STREAM ID)

(close-stream STREAM)
Writes any remaining data to the stream and closes it and the object to which it's attached. This also happens automatically when the stream is garbage collected.

(getchar-stream STREAM)
Return a single character from the stream. (This may be a single byte, depending on the nature of the stream.)
This is actually a macro with an extremely efficient implementation (as efficient as you can get in Emacs Lisp), so it can be used without fear in a loop. The implementation works by reading a large amount of data into a vector and then simply using the function aref to read characters one by one from the vector. Because aref is one of the primitives handled specially by the byte interpreter, this will be very efficient. The actual implementation may in fact use the function call-with-condition-handler to avoid the necessity of checking for overflow.

Its typical implementation is to fetch the vector containing the characters as a stream property, as well as the index into that vector. Then it retrieves the character, increments the index, and stores it back in the stream. As a first implementation, we check when reading the character whether the index would be out of range. If so, we read another 4096 characters, storing them into the same vector, set the index back to the beginning, and then proceed with the rest of the getchar algorithm.

(putchar-stream STREAM CHAR)
This is similar to getchar-stream but it writes data instead of reading data.

Function make-stream

There are actually two stream-creation functions, which are:

(make-input-stream TYPE PROPERTIES)
(make-output-stream TYPE PROPERTIES)

These can be used to create a stream that reads data or writes data, respectively. PROPERTIES is a property list, and the allowable properties in it are defined by the type.
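The buffering scheme described for getchar-stream above can be sketched in plain Emacs Lisp. This is only an illustration of the "first implementation" (explicit range check, refill in chunks of 4096); it is written as ordinary functions rather than a macro, and it does not use call-with-condition-handler. Every `sketch-' name is hypothetical, and the "stream" is a stand-in hash table reading from a string, since the stream API in this proposal is not implemented.

```elisp
;; Sketch only: all `sketch-' names are hypothetical, not an XEmacs API.

(defconst sketch-chunk-size 4096
  "Number of characters fetched per refill, as in the description above.")

(defun sketch-make-stream (data)
  "Return a stand-in input stream that reads from the string DATA.
The hash table plays the role of the stream's property list."
  (let ((stream (make-hash-table :test 'eq)))
    (puthash 'source data stream)      ; stands in for the underlying lstream
    (puthash 'source-pos 0 stream)     ; how much of the source was consumed
    (puthash 'vector (make-vector sketch-chunk-size 0) stream)
    (puthash 'fill 0 stream)           ; number of valid slots in the vector
    (puthash 'index 0 stream)          ; next vector slot to hand out
    stream))

(defun sketch-refill (stream)
  "Copy the next chunk of the source into STREAM's vector.
Reset the index and return the number of characters copied (0 at end)."
  (let* ((source (gethash 'source stream))
         (pos (gethash 'source-pos stream))
         (vec (gethash 'vector stream))
         (n (min sketch-chunk-size (- (length source) pos)))
         (i 0))
    (while (< i n)
      (aset vec i (aref source (+ pos i)))
      (setq i (1+ i)))
    (puthash 'source-pos (+ pos n) stream)
    (puthash 'fill n stream)
    (puthash 'index 0 stream)
    n))

(defun sketch-getchar (stream)
  "Return the next character from STREAM, or nil at end of data."
  ;; Range check first; refill the same vector when it is exhausted.
  (when (>= (gethash 'index stream) (gethash 'fill stream))
    (sketch-refill stream))
  (let ((index (gethash 'index stream)))
    (when (< index (gethash 'fill stream))
      (puthash 'index (1+ index) stream)
      (aref (gethash 'vector stream) index))))

;; Example: drain a stand-in stream one character at a time.
(let ((stream (sketch-make-stream "hello")) ch chars)
  (while (setq ch (sketch-getchar stream))
    (push ch chars))
  (apply #'string (nreverse chars)))
```

The common path is two property lookups and one aref; the refill cost is paid once per 4096 characters, which is the point of the design described above.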
Possible types are:

(1) `file' (this reads data from a file or writes to a file)
Allowable properties are:
:file-name (the name of the file)
:create (for output streams only, creates the file if it doesn't already exist)
:exclusive (for output streams only, fails if the file already exists)
:append (for output streams only; starts appending to the end of the file rather than overwriting the file)
:offset (position in bytes in the file where reading or writing should begin. If unspecified, defaults to the beginning of the file, or to the end of the file when :append is specified)
:count (for input streams only, the number of bytes to read from the file before signaling "end of file". If nil or omitted, the number of bytes is unlimited)
:non-blocking (if true, reads or writes will fail if the operation would block. This only makes sense for non-regular files.)

(2) `process' (For output streams only, send data to a process.)
Allowable properties are:
:process (the process object)

(3) `buffer' (Read from or write to a buffer.)
Allowable properties are:
:buffer (the name of the buffer or the buffer object.)
:start (the position to start reading from or writing to. If nil, use the buffer point. If true, use the buffer's point and move point beyond the end of the data read or written.)
:end (only for input streams, the position to stop reading at. If nil, continue to the end of the buffer.)
:ignore-accessible (if true, the defaults for :start and :end ignore any narrowing of the buffer.)

(4) `stream' (read from or write to a lisp stream)
Allowable properties are:
:stream (the stream object)
:offset (the position to begin reading from or writing to)
:length (for input streams only, the amount of data to read, defaulting to the rest of the data in the string)
Revise string (for output streams only; if true, the stream is resized as necessary to accommodate data written off the end, otherwise the writes will fail)
(5) `memory' (For output only, writes data to an internal memory buffer. This is more lightweight than using a Lisp buffer. The function memory-stream-string can be used to convert the memory into a string.)

(6) `debugging' (For output streams only, write data to the debugging output.)

(7) `stream-device' (During non-interactive invocations only, read from or write to the initial stream terminal device.)

(8) `function' (For output streams only, send data by calling a function, exactly as with the STREAM argument to the print primitive.)
Allowable properties are:
:function (the function to call. The function is called with one argument, the stream.)

(9) `marker' (Write data to the location pointed to by a marker and move the marker past the data.)
Allowable properties are:
:marker (the marker object.)

(10) `decoding' (As an input stream, reads data from another stream and decodes it according to a coding system. As an output stream, decodes the data written to it according to a coding system and then writes the results to another stream.)
Properties are:
:coding-system (the symbol or coding system object that defines the decoding.)
:stream (the stream on the other end.)

(11) `encoding' (As an input stream, reads data from another stream and encodes it according to a coding system. As an output stream, encodes the data written to it according to a coding system and then writes the results to another stream.)
Properties are:
:coding-system (the symbol or coding system object that defines the encoding.)
:stream (the stream on the other end.)

Consider

(define-stream-type 'type
:read-function
:write-function
:rewind-
:seek-
:tell-
(?:buffer)

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

- Generalized Coding Systems

- Lisp API for Defining Coding Systems

<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<

User-defined coding systems.
(define-coding-system-type 'type
:encode-function FUN
:decode-function FUN
:detect-function FUN
:buffering (number = at least this many chars
line = buffer up to end of line
regexp = buffer until this regexp is found in match source data. match data will be appropriate when fun is called

encode fun is called as

(encode INSTREAM OUTSTREAM)

It should read data from instream and write the converted result onto outstream. It can leave some data in the stream; that data will reappear next time. Generally, there is a finite amount of data in instream, and further attempts to read lead to would-block errors or retvals. It can use instream properties to record state. It may use read-stream functionality to read everything into a vector or string.

->Need vectors + string exposed to resizing of Lisp implementation where necessary.

Discussion

Ben sez:

From: Ben Wing <ben@666.com>
Sender: owner-xemacs-beta@xemacs.org
To: "Alastair J. Houghton" <ajhoughton@lineone.net>
CC: xemacs-beta@xemacs.org
Message-ID: <3973766F.197819DE@666.com>
Subject: Re: Lstreams and Lisp
Date: Mon, 17 Jul 2000 14:11:11 -0700
X-Mailer: Mozilla 4.73 [en] (Windows NT 5.0; U)

alastair, this looks great! please continue.

as for lstreams, originally i wanted them not to escape because there were no lisp accessors, and there may be [or might have been, conceivably] primitives that could take lstreams as arguments and might [conceivably ...?] not work if some strange lstream were passed in.

but i've actually been thinking of creating an lstream interface myself, for use in creating arbitrary lisp coding systems [i want to extend the coding-system interface to work not just with international encodings but to be able to handle gzip, base64, md5, etc.].
[unfortunately, what you're working on now doesn't fit into this system because the latter only deals with strings/streams of text or binary data and not arbitrary lisp objects; although i can certainly see the usefulness of an arbitrary lisp object converter, and it looks like that's exactly what you're working on here. the coding system stuff would still be useful because it includes various optimizations for working specifically with streams; but eventually i would really like to see the interfaces merged. e.g. why couldn't you `find-file' using a coding system that generated sound and image objects mixed in with the text? that's exactly what modern html browsers do, in essence. i suppose i should extend the coding system interface to allow text marked up with extents; still ...]

i'm appending a rather raw writeup of my proposed lstream interface, with some bits on extending the coding system mechanism. [this comes out of a massive document of such proposals that martin and i sent to japan a few months ago as part of the contract that he and i are getting from them. most of this is stuff he transcribed from scribbled notes i faxed to him, since i can't type too well any more but can still write more or less; and the rest of it i dictated to a professional transcriptionist [with no technical knowledge, of course!], and was cleaned up by martin. that's why it's so messy.]

if you're interested in implementing this lstream interface or something like it, please go ahead! i've got my hands busy with mule work and merging of existing workspaces into the code base for quite some time now.

btw when you have time you might want to extend your 'string' encoding to allow encoding/decoding using a coding system, which would almost certainly be required when the string contains non-ascii characters. [e.g. when sending text to an x selection, `ctext' is required, and for windows, `mswindows-tstr'.]
also, you might consider adding funs that allow creating a user-defined "encoding", instead of specifying the conversion functions directly.

This is in response to a post by Alastair J. Houghton <ajhoughton@lineone.net>, some part of which appears below:

Why does it say

  /* #define CHECK_LSTREAM(x) CHECK_RECORD (x, lstream)
     Lstream pointers should never escape to the Lisp level, so
     functions should not be doing this. */

in lstream.h? The reason I'm wondering is that I'd like to make my encode-binary and decode-binary functions work with arbitrary output sinks/input sources, so the obvious implementation is to create a suitable Lstream within the Lisp-visible functions. The trouble is that I want the interface to the functions to include the facility to add user-defined conversions, which means that a Lisp function may have to accept an Lstream parameter... so I'm wondering whether there was any reason for this comment ;-)

Just in case you're interested, here's the interface I'm proposing (there'll be an additional encode-binary-string function that works in an efficient way). The STREAM parameter could accept any Lisp object for which an Lstream can be created.

DEFUN ("encode-binary-stream", Fencode_binary_stream, 3, 3, 0, /*
Encode the sequence DATA into a binary STREAM using the specified binary FORMAT vector.

Each element of the FORMAT vector should either be a symbol, or a list of the form

  (SYMBOL PARAMETER...)

SYMBOL may be one of

  binary string bit-vector integer float space vector

or alternatively the name of a Lisp function that will be called with the remaining data, the output stream and a list of PARAMETER values as its arguments. i.e. its declaration should look something like the following

  (defun my-conversion (data stream parameter-list) ... )

and it will be called using

  (my-conversion data stream parameter-list)

Such a function should return the remaining data after it has consumed whatever it required.
The built-in encodings support the following parameters:

  Encoding     Parameters

  binary       :length
  string       :length :pad :terminator
  bit-vector   :length :direction
  integer      :length :signed :direction
  float        :length :format :direction
  space        :length
  vector       FORMAT-ELT :length :pack

where FORMAT-ELT is anything that could be an element of the FORMAT parameter.

:length is followed by a length in bytes (or in bits for bit-vector).
:pad is followed by a character used to pad the string to the specified length.
:terminator is followed by a character used to terminate the string.
:direction is followed by one of `big-endian', `little-endian', `host' or `network'.
:signed is followed by t or nil.
:format is followed by `native'. Conversions to other floating point formats are currently not supported.
:pack is followed by an integer specifying the vector stride (e.g. the format [(vector (integer :length 2) :pack 4)] represents an array of 16-bit integers, but with a gap of 2 bytes between successive elements).

The function returns a string containing the raw binary data.
*/
       (format, data, stream))

DEFUN ("decode-binary-stream", Fdecode_binary_stream, 2, 2, 0, /*
Decode the specified STREAM using the binary FORMAT vector.

See `encode-binary-stream' for more information on the FORMAT vector; note however that user-defined conversion functions should be declared as

  (defun my-conversion (stream parameter-list) ...)

and should return the data they have converted.
*/
       (format, stream))

Closed bugs

None.
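To make the :length, :direction and :pack parameters from the `encode-binary-stream' docstring above more concrete, here is a small Emacs Lisp sketch of what an integer element and a packed vector of integers might produce. It is not the proposed C implementation: the `sketch-' functions are hypothetical, they return a list of byte values rather than writing to a stream, `host' and :signed are not modeled, and filling the :pack gap with zero bytes is an assumption (the post does not say what the gap contains).

```elisp
;; Sketch only: all `sketch-' names are hypothetical illustrations.

(defun sketch-encode-integer (value length direction)
  "Return a list of LENGTH byte values encoding the integer VALUE.
DIRECTION is `big-endian', `network' (same as big-endian) or
`little-endian'.  `host' is deliberately not handled here."
  (let ((bytes nil)
        (i 0))
    ;; Peel off the low byte each time; pushing onto the front leaves
    ;; the most significant byte first, i.e. big-endian order.
    (while (< i length)
      (push (logand value 255) bytes)
      (setq value (ash value -8))
      (setq i (1+ i)))
    (cond ((memq direction '(big-endian network)) bytes)
          ((eq direction 'little-endian) (nreverse bytes))
          (t (error "Unsupported direction: %s" direction)))))

(defun sketch-encode-packed (values length direction stride)
  "Encode each integer in the list VALUES using LENGTH bytes.
Every element is followed by enough zero bytes (an assumption) to
occupy STRIDE bytes in total, mirroring the :pack parameter.
STRIDE is assumed to be at least LENGTH."
  (let ((out nil))
    (dolist (v values)
      (setq out (append out
                        (sketch-encode-integer v length direction)
                        (make-list (- stride length) 0))))
    out))

;; Roughly what [(vector (integer :length 2) :pack 4)] describes:
;; 16-bit integers with a 2-byte gap after each element.
(sketch-encode-packed '(1 258) 2 'big-endian 4)
;; => (0 1 0 0 1 2 0 0)
```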