LZMA compression

Matt Campbell mattcampbell "at" pobox.com
Mon Feb 12 21:26:01 2007


Alex Pelts wrote:
> I did not know that single stream was maintained. How do new clients 
> join the session? Is dictionary replicated when new client joins the 
> session?

In the ZRLE encoding, a zlib stream is maintained for each connected 
client.  The same would be done for LZMA or any other compression method.

Having given the matter some more thought, I'd like to propose a new 
encoding, tentatively called LZMA Cells.

The LZMA Cells encoding:

This encoding divides the rectangle into variable-height rows, each of 
which consists of variable-width sub-rectangles called cells.  For each 
row, the sum of the widths of all cells equals the width of the whole 
rectangle.  Likewise, the sum of the heights of all rows equals the 
height of the whole rectangle.  The number of rows and the number of 
cells in each row is therefore not explicitly specified in the encoding.

Each cell is represented by raw pixels, using the CPIXEL type as defined 
by the ZRLE encoding.

This encoding uses LZMA for compression.  A single LZMA stream is 
maintained for the duration of the RFB session, so all rectangles must 
be encoded and decoded strictly in order.

On the wire, the encoding consists of a four-byte length field followed 
by the specified number of bytes of LZMA-compressed data:

	U32 length;
	U8 lzmaData[length];

The uncompressed representation of the rectangle consists of an 
arbitrary number of rows.  Each row begins with a field specifying its 
height in pixels:

	U16 rowHeight;

This field is followed by an arbitrary number of cells.  Each cell 
consists of a field specifying its width in pixels, followed by the raw 
pixel data:

	U16 cellWidth;
	CPIXEL data[cellWidth * rowHeight];

Rationale:

This encoding emphasizes simplicity and flexibility, leaving most 
optimization to the compressor.  It does not support palettes or 
run-length encoding; a good compression algorithm with a reasonably 
large dictionary renders these complications unnecessary.  However, it 
does use the CPIXEL (compressed pixel) type, because in the common 
32-bit true-color pixel format, 8 bits for each pixel are never used.

This encoding attempts to facilitate the division of a large rectangle 
into smaller sub-rectangles which contain patterns that the compressor 
can recognize and efficiently compress.  Because the optimal 
sub-rectangle size is subject to change, possibly even within a given 
rectangle, and is not necessarily known yet, the encoding allows a 
rectangle to be dviided into arbitrarily many sub-rectangles of varying 
sizes.  Current servers may simply use tiles of fixed size, as in the 
Hextile and ZRLE encodings, but requiring this would be short-sighted. 
For example, a sophsticated server may be able to optimize the encoding 
of rendered text when each glyph has a distinct bounding box (e.g. most 
common character sets when italics are not used).  If each cell 
corresponds to a glyph, the compressor may be able to compress  the 
rendered text more efficiently than otherwise, especially when 
characters or even sequences of characters recur frequently.  In the 
simpler case of fixed-size tiles, the compressor should be able to 
minimize the overhead of repeating the tile width and height.

Thoughts?

Matt