The Squash compression/decompression module


This module provides general compression/decompression facilities of a lossless nature through an SWI interface. The intended algorithm is 12-bit LZW, however, no guarantee is made that this will be so in the future.

The interface is designed to be restartable, so that compression or decompression can occur from a variety of locations. Operations involving file IO can easily be constructed from the operations provided. SWI: Squash_Compress &42700
Input: r0 - flags
bit 0 0 -> start new operation
1 -> continue existing operation
(using existing workspace contents)
bit 1 0 -> this is the end of the input
1 -> there is more input after this
bit 2 has no effect
bit 3 0 -> no effect
1 -> return the work space size required in
bytes (all other bits must be 0).
bits 4..31 must be zero
If bit 3 of r0 set then
r1 - input size or -1 (don't want max output size)
otherwise
r1 - workspace pointer
r2 - input pointer
r3 - number of bytes of input available
r4 - output pointer
r5 - number of bytes of output available
Output: If bit 3 of r0 set on input then
r0 - required work space size
r1 - max output size or -1 (don't know or wasn't asked)
otherwise
r0 - Output status:
0 - operation completed
1 - operation ran out of input data (r3 = 0)
2 - operation ran out of output space (r5 < 12)
r1 - preserved
r2 - updated to show first unused byte
r3 - updated to show number of input bytes not used
r4 - updated to show first unused output byte
r5 - updated to show number of output bytes not used
Errors: Squash_bad_address
Squash_corrupt_workspace

        Squash_bad_parameters

(Aside: The asymmetry in handling of running out of space is due to the handling of the compressed stream in chunks of several bytes at a time, in order to extract bit fields efficiently.)

If bits 0 and 1 of r0 are clear, and the output is definitely big enough, a fast algorithm will be used. SWI: Squash_Decompress &42701
Input: r0 - flags
bit 0 0 -> start new operation
1 -> continue existing operation
(using existing workspace contents)
bit 1 0 -> this is the end of the input
1 -> there is more input after this
bit 2 0 -> normal
1 -> You may assume that the output
will all fit in this buffer (allows a faster algorithm
to be used, if bits 0 and 1 are both 0)
bit 3 0 -> no effect
1 -> return the work space size required in
bytes (all other bits must be 0).
bits 4..31 must be zero
If bit 3 of r0 set then
r1 - input size or -1 (don't want max output size)
otherwise
r1 - workspace pointer
r2 - input pointer
r3 - number of bytes of input available
r4 - output pointer
r5 - number of bytes of output available
Output: if bit 3 of r0 set on input then
r0 - required work space size
r1 - max output size or -1 (don't know or wasn't asked)
otherwise
r0 - Output status:
0 - operation completed
1 - operation ran out of input data (r3 < 12)
2 - operation ran out of output space (r5 = 0)
r1 - preserved
r2 - updated to show first unused byte
r3 - updated to show number of input bytes not used
r4 - updated to show first unused output byte
r5 - updated to show number of output bytes not used
Errors: Squash_bad_address
Squash_corrupt_workspace
Squash_corrupt_input

        Squash_bad_parameters

(Aside: in the case above where r3 < 12, the unused input must be resupplied. See the note on Squash_Compress.)

The algorithm used by this module is 12-bit LZW, as used by the Unix 'compress' command (with -b 12 specified). If future versions of the module use different algorithms, they will still be able to decompress existing compressed data.

When the restartable algorithm is used the maximum output size for compressed data is 12 + 3*input size/2 in the worst case. When the fast algorithm is used the maximum output size for compressed data is 3 + 3*input size/2 in the worst case. When the module is queried about maximum output size it can not determine whether it will be able to use the fast algorithm so the maximum output size returned is always calculated using the first of these formulae.

There is no formula for the maximum output size on decompression so the module currently returns -1 if queried (this may change if the algorithm is changed so the query facility should still always be used). The usual strategy is to store the original data length with the compressed data so that the exact space required for decompression is known.

The compress operation acts as a filter on a stream of data. The SWI returns if either the input or the output is exhausted.

The performance for an 8Mhz A420 with ARM2 is approximately as follows:-

Store to store:
Compression 24 kbytes per second
Decompression 48 kbytes per second

Fast case: Store to store, with all input present AND output buffer assumed big enough:
Compression 68 kbytes per second
Decompression 280 kbytes per second