Wednesday, May 02, 2007

Internals: Supporting generic LZARI output compression

One objective I have for the next release is to improve compression performance. This will lead to fewer SD card write cycles, thus improving power performance. There are two strategies involved. The first is to introduce a compression mechanism for the binary CSV output format. The second is to introduce a general-purpose compression algorithm for use with any output format.

Overview

For the general compression algorithm, I've chosen LZARI (using code written by Haruhiko Okumura on 4/7/1989 under a free license) from several candidates. It is reasonably fast, has low resource requirements, provides good compression performance, and is adaptable for embedded use.

LZARI reduces redundancy by compressing repeated character strings (the LZ part) and passing the resulting 'symbols' through an arithmetic encoder (the ARI part). Character strings are stored in a binary tree for fast matching.
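To make the symbol stream concrete, here is a minimal sketch of the LZ stage. This is not the GPS Logger MG code: it does a brute-force window search where LZARI uses the binary tree, and the emit functions merely stand in for the arithmetic encoder. The buffer and match sizes mirror the parameters discussed below.

    #include <stdio.h>
    #include <stdint.h>
    #include <string.h>

    #define BUF_SIZE   256  /* pool of previous data to match against */
    #define MAX_MATCH   96  /* longest string one symbol can cover */
    #define THRESHOLD    2  /* shorter matches are cheaper as literals */

    /* Stand-ins for the arithmetic encoder: each call is one symbol,
     * either a literal byte or a (position, length) reference. */
    static void emit_literal(uint8_t c) { printf("LIT %02X\n", c); }
    static void emit_match(size_t pos, size_t len)
    {
        printf("REF pos=%zu len=%zu\n", pos, len);
    }

    static void lz_tokenise(const uint8_t *data, size_t size)
    {
        size_t i = 0;
        while (i < size) {
            size_t best_len = 0, best_pos = 0;
            size_t start = (i > BUF_SIZE) ? i - BUF_SIZE : 0;

            /* Brute-force longest-match search over the window;
             * LZARI keeps the window strings in a binary tree
             * so this search is fast. */
            for (size_t j = start; j < i; j++) {
                size_t len = 0;
                while (len < MAX_MATCH && i + len < size &&
                       data[j + len] == data[i + len])
                    len++;
                if (len > best_len) { best_len = len; best_pos = j; }
            }

            if (best_len > THRESHOLD) {
                emit_match(best_pos, best_len);
                i += best_len;
            } else {
                emit_literal(data[i]);
                i++;
            }
        }
    }

    int main(void)
    {
        const char *s = "$GPGGA,123519,4807.038,N,$GPGGA,123520,4807.038,N,";
        lz_tokenise((const uint8_t *)s, strlen(s));
        return 0;
    }

Run against repetitive NMEA text like the sample above, the second sentence collapses almost entirely into one or two references back into the window, which is exactly the redundancy the logger's output exhibits.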

Implementation

There was a substantial amount of work involved in adapting the code for the GPS Logger MG:

  • all non-local variables were extracted out to a single context structure, allocated from a single chunk of SRAM, which also preserves state across function calls (see the sketch after this list);
  • all variables were analysed and converted to a quantifiable size (8, 16 or 32 bits, rather than simply "int") to optimise variable memory use;
  • the API was changed from being stream based (file I/O operations) to being data based (read/write data blocks) [which turned out to be the most difficult challenge, requiring the introduction of state logic];
  • the end of the encoded data is now signified by a specific compression symbol, rather than by knowing and storing the uncompressed data size ahead of encoding;
  • selective compilation of either or both of the decode and encode routines was enabled, as the release build requires encode only, whereas the debug build requires both (the unit tests decode to verify that encoding works);
  • fine-grained assert and lint checking was introduced, to improve quality.
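To illustrate the shape of these changes, here is a sketch of what the adapted interface might look like. The names are hypothetical (I'm not reproducing the actual GPS Logger MG source), but the structure follows the list above: one context struct of explicitly sized fields, block-based encode/decode entry points, and build-time selection of each routine.

    #include <stdint.h>
    #include <stddef.h>

    /* Build-time selection: the release build defines only
     * LZARI_ENCODE; the debug build defines both so the unit tests
     * can decode and verify what was encoded. */
    #define LZARI_ENCODE
    #define LZARI_DECODE

    /* All former globals gathered into one context, so the whole
     * working set can be allocated from a single chunk of SRAM and
     * state is preserved across calls. Every field has an explicit
     * width rather than a bare "int". */
    typedef struct {
        uint16_t match_position;
        uint8_t  match_length;
        uint32_t low, high;      /* arithmetic coder state */
        uint8_t  phase;          /* state logic for block-based I/O */
        /* ... window buffer, tree links, frequency tables ... */
    } lzari_ctx_t;

    #ifdef LZARI_ENCODE
    /* Block based rather than stream based: the caller hands in a
     * data block and drains the output; no file I/O inside the
     * codec. A final call with in_len == 0 flushes and writes the
     * explicit end-of-data symbol, so the uncompressed size need
     * not be known or stored in advance. Returns bytes written. */
    size_t lzari_encode_block(lzari_ctx_t *ctx,
                              const uint8_t *in, size_t in_len,
                              uint8_t *out, size_t out_cap);
    #endif

    #ifdef LZARI_DECODE
    size_t lzari_decode_block(lzari_ctx_t *ctx,
                              const uint8_t *in, size_t in_len,
                              uint8_t *out, size_t out_cap);
    #endif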
Evaluation

To fine-tune the implementation, I needed to evaluate variations of the tunable compression parameters against real sample data. I wanted a good trade-off between compression performance and variable memory requirements (currently 24K of SRAM is available as a file cache, and I don't want the variable memory requirements to reduce this by too much).

There are two tunable compression parameters. The buffer-size specifies the size of the pool of recent data against which strings can be matched. The max-match-size specifies the maximum length of a matched string. In theory, the greater these are, the greater the potential compression ratio, but the trade-off is that the variable memory requirements grow.

Using buffer-sizes of 256 or 512 with max-match-sizes of 48, 64, 96 or 128 resulted in variable memory requirements ranging from 5328 to 8352 bytes.
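For context, these two parameters correspond to N and F in Okumura's reference code, and the main arrays scale directly with them, which is where the memory growth comes from. A rough sketch of the dependency, with 16-bit indices substituted for the original code's plain ints (the adapted code's exact layout may differ):

    #include <stdint.h>

    #define N          256                 /* buffer-size              */
    #define F           96                 /* max-match-size           */
    #define THRESHOLD    2                 /* minimum useful match     */
    #define N_CHAR  (256 - THRESHOLD + F)  /* literals + match lengths */

    static uint8_t  text_buf[N + F - 1];                    /* ring buffer */
    static uint16_t lson[N + 1], rson[N + 257], dad[N + 1]; /* match tree  */
    static uint16_t char_to_sym[N_CHAR], sym_to_char[N_CHAR + 1];
    static uint16_t sym_freq[N_CHAR + 1], sym_cum[N_CHAR + 1];
    static uint16_t position_cum[N + 1];   /* arithmetic coder tables */

At N = 256 and F = 96 these arrays alone total roughly 5.7K, consistent with the measured figures, with the remainder being scalar state.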

The sample data was collected from the internet: various NMEA logs captured for scientific, educational, personal or other purposes. From some 20 sources, a total of about 700MB was gathered. The data was tested by compressing it (a) as originally found (with all manner of standard or proprietary NMEA sentences), (b) after selecting valid NMEA sentences only (i.e. ^$GP with valid checksums), or (c) after selecting valid GGA NMEA sentences only.
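For (b), the checksum test is the standard NMEA one: XOR every character between the '$' and the '*', and compare the result against the two hex digits that follow. A small sketch of such a filter (a hypothetical helper, not the actual test harness):

    #include <stdbool.h>
    #include <stdio.h>
    #include <string.h>

    /* Return true for a sentence of the form $GP...*hh where hh is
     * the XOR of all characters between the '$' and the '*'. */
    static bool nmea_valid(const char *s)
    {
        if (strncmp(s, "$GP", 3) != 0)
            return false;

        unsigned char sum = 0;
        const char *p = s + 1;
        while (*p && *p != '*')
            sum ^= (unsigned char)*p++;

        unsigned int stored;
        if (*p != '*' || sscanf(p + 1, "%2x", &stored) != 1)
            return false;
        return sum == stored;
    }

    int main(void)
    {
        /* The classic GGA example sentence; its checksum is 0x47. */
        puts(nmea_valid("$GPGGA,123519,4807.038,N,01131.000,E,"
                        "1,08,0.9,545.4,M,46.9,M,,*47")
             ? "valid" : "invalid");
        return 0;
    }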

Conclusions

The results show that the average compression ratios achieved were (a) 20%-25%, (b) 16%, and (c) 25%-27%. A buffer-size of 512 required 50% more processing time than a buffer-size of 256, even though compression improved by less than 5%. Varying the max-match-size had little impact on processing time. Across the 6 million NMEA sentences in (b), the maximum sentence length was 103 bytes; a max-match-size of 96 already spans almost an entire sentence, so 128 adds little, making 96 the better choice.

For these reasons, I elected to use a buffer-size of 256 and a max-match-size of 96. This combination provides good compression, requires only 5760 bytes of variable memory, and carries lower processing overhead.

If LZARI is enabled (via a new configuration option 'compress', which defaults to 'no'), the maximum file cache size becomes 20K rather than 24K, but the data it holds is now compressed by up to 4.4x. Practically, this means that 20K of compressed data in the cache corresponds to up to 88K of uncompressed data, which would previously have filled the 24K cache about 3.7 times. Using LZARI can therefore reduce SD card writes by roughly 3.7x (call it 4x), which should have a significant impact on power performance.
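The arithmetic, spelled out with the numbers above (the 4.4x figure is the best case, so this is an upper bound):

    #include <stdio.h>

    int main(void)
    {
        const double cache_k     = 20.0;  /* compressed cache size  */
        const double ratio       = 4.4;   /* best-case compression  */
        const double old_cache_k = 24.0;  /* old uncompressed cache */

        double raw_k   = cache_k * ratio;     /* 88K of raw data per flush */
        double flushes = raw_k / old_cache_k; /* ~3.7 old flushes' worth   */

        printf("One 20K flush now carries %.0fK of data, "
               "which previously took %.1f flushes of 24K.\n",
               raw_k, flushes);
        return 0;
    }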
