ImageEncodingAx206

From ST2205u wiki

Compressed image format

This page describes what is currently known about the compression of image files used in picture frames based on the ax206 chip.

Introduction

The ax206 chip stores images in a JPEG-like format, but with several modifications and/or parts missing. Images also contain some kind of thumbnail image.

To understand the information on this page, it is recommended to have a basic understanding of JPEG compression.

Image format

The image file seems to contain the following parts:

  • a 16-byte header
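  • a small, low-res version of the image in a kind of raw format (color space is unknown, but likely YCbCr); this appears to be the MCU info block described further down, whose per-MCU DC values amount to a coarse version of the image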
  • a JPEG DQT table (but without the DQT marker) containing two JPEG quantisation tables, one for luma and one for chroma
  • a JPEG DHT table (but without the DHT marker) containing four huffman tables (combinations of luma/chroma, DC/AC)
  • the entropy coded JPEG data

Although the encoding is very much like JPEG, there are no JPEG chunk markers at all.

Header

The image header is contained in the first 16 bytes.

Some of the fields in the header can be related to values from the JPEG specification.

The table below shows all known fields:

Byte Related JPEG field Meaning
0x00/0x01 SOF: X Image width, encoded as big-endian
0x02/0x03 SOF: Y Image height, encoded as big-endian
0x04 SOF: Hi/Vi "Resolution" byte, 0 for images without sub-sampling of the color components, 3 for images that use sub-sampling of the color components
0x05/0x06/0x07 DQT: Tq Three numbers, each indicating which quantisation table (0 or 1) should be used for which component (Y, Cb, Cr)
0x08/0x09/0x0A SOS: Tdj Three numbers, each indicating which DC huffman table (0 or 1) should be used for which component (Y, Cb, Cr)
0x0B/0x0C/0x0D SOS: Taj Three numbers, each indicating which AC huffman table (0 or 1) should be used for which component (Y, Cb, Cr)
0x0E/0x0F Unknown Unknown
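
The header layout above can be read with a few lines of Python. This is only a sketch: the names `Ax206Header` and `parse_header` are made up for illustration, and it assumes the AC huffman table selectors sit in bytes 0x0B-0x0D, the only bytes of the 16-byte header not otherwise accounted for.

```python
import struct
from typing import NamedTuple, Tuple


class Ax206Header(NamedTuple):
    """Hypothetical container for the 16-byte header fields."""
    width: int
    height: int
    resolution: int                     # 0 = no sub-sampling, 3 = sub-sampled chroma
    quant_tables: Tuple[int, int, int]  # table index for Y, Cb, Cr
    dc_tables: Tuple[int, int, int]     # DC huffman table index for Y, Cb, Cr
    ac_tables: Tuple[int, int, int]     # AC huffman table index (assumed 0x0B-0x0D)
    unknown: bytes                      # bytes 0x0E/0x0F, meaning unknown


def parse_header(data: bytes) -> Ax206Header:
    if len(data) < 16:
        raise ValueError("an ax206 image header needs 16 bytes")
    # Width and height are big-endian 16-bit values, followed by one byte.
    width, height, resolution = struct.unpack_from(">HHB", data, 0)
    return Ax206Header(
        width=width,
        height=height,
        resolution=resolution,
        quant_tables=tuple(data[0x05:0x08]),
        dc_tables=tuple(data[0x08:0x0B]),
        ac_tables=tuple(data[0x0B:0x0E]),
        unknown=bytes(data[0x0E:0x10]),
    )
```

The sketch does no validation of the resolution byte or of the table indices; a real decoder would probably want to reject values other than those listed in the table.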

MCU info block

Immediately after the header there is an MCU info block. It contains 8 bytes per MCU, which makes it possible to start decoding at an arbitrary MCU rather than having to decode linearly. This is how the picture frame can do transition effects. The size of this block depends on the number of MCUs and thus on the "resolution" byte 0x04 from the header. For a resolution value of 0x00 (8x8 pixel MCUs), the size in bytes is width * height / 8. For a resolution value of 0x03 (16x16 pixel MCUs), the size is width * height / 32.
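
The size rule can be written out as a small calculation. This is a sketch with a made-up function name (`mcu_info_block_size`). It assumes the image dimensions are multiples of the MCU size, because nothing is known here about how partial MCUs at the edges would be handled.

```python
MCU_INFO_ENTRY_SIZE = 8  # bytes per MCU in the info block


def mcu_info_block_size(width: int, height: int, resolution: int) -> int:
    """Return the size in bytes of the MCU info block."""
    if resolution == 0x00:
        side = 8      # one 8x8 block per component
    elif resolution == 0x03:
        side = 16     # sub-sampled chroma, MCU covers 16x16 pixels
    else:
        raise ValueError("unknown resolution byte: %#x" % resolution)
    if width % side or height % side:
        # Assumption: only whole MCUs; edge handling is not documented.
        raise ValueError("dimensions are not a multiple of the MCU size")
    mcu_count = (width // side) * (height // side)
    return mcu_count * MCU_INFO_ENTRY_SIZE
```

For a 320x240 image this gives 9600 bytes without sub-sampling and 2400 bytes with sub-sampling, matching width * height / 8 and width * height / 32.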

Each entry in the MCU info block holds the last DC values for the 3 components and the location in the huffman bitstream where the data for that MCU starts. The way this location is coded is, erm, rather interesting: the field holds an offset that must be added to the start of that MCU's own info entry, not to the start of the file or of the bitstream. An example to make things clearer. Let's say the huffman compressed data for the 4th MCU starts at location 3066 (decimal). The info entry for the 4th MCU starts at offset 16 + 3 * 8 = 40 (decimal), so the field that codes where the huffman compressed data for the 4th MCU starts contains 3066 - 40 = 3026. Now say that the 5th MCU starts 5 bytes later, at 3071 (the 4th one compressed well). Its info entry starts at offset 48, so its field contains 3071 - 48 = 3023. Yes, this is a bit weird, but it is how it is.

Offset Meaning
0x00-0x01 Little-endian 16-bit word storing the last DC value for the Y channel
0x02-0x03 Little-endian 16-bit word storing the last DC value for the Cb channel
0x04-0x05 Little-endian 16-bit word storing the last DC value for the Cr channel
0x06-0x07 Little-endian 16-bit word storing the offset, counted from the start of this info entry (!), at which the huffman data for this MCU starts
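
To make the offset rule concrete, the sketch below reads one info entry and converts the stored offset into an absolute position in the file. The function name `read_mcu_info` is made up. Reading the DC values as signed is an assumption (JPEG DC predictors are signed, but the signedness of these fields is not documented here).

```python
import struct

HEADER_SIZE = 16  # the info block starts right after the header
ENTRY_SIZE = 8    # bytes per MCU info entry


def read_mcu_info(image: bytes, index: int):
    """Return (dc_y, dc_cb, dc_cr, absolute_data_offset) for MCU `index` (0-based)."""
    entry_pos = HEADER_SIZE + index * ENTRY_SIZE
    # Three little-endian 16-bit DC values (assumed signed), then an
    # unsigned little-endian 16-bit offset relative to this entry.
    dc_y, dc_cb, dc_cr, rel = struct.unpack_from("<hhhH", image, entry_pos)
    return dc_y, dc_cb, dc_cr, entry_pos + rel
```

With the numbers from the example, the 4th MCU (index 3) has its entry at offset 40 and a stored value of 3026, so the function returns 3066 as the start of its huffman data.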

DQT table

This table is basically a standard JPEG DQT table, but it seems to be missing its 0xFFDB marker at the start.
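
A sketch of reading this table, using the standard JPEG DQT segment layout: a 2-byte big-endian length (which counts itself), then for each table a Pq/Tq byte followed by 64 values in zigzag order. That the length field is still present once the marker is dropped is an assumption, and `parse_dqt` is a made-up name.

```python
import struct


def parse_dqt(segment: bytes):
    """Parse a marker-less DQT segment.

    Returns (tables, length): a dict mapping table id to 64 values in
    zigzag order, and the segment length in bytes.
    """
    (length,) = struct.unpack_from(">H", segment, 0)
    pos = 2
    tables = {}
    while pos < length:
        pq = segment[pos] >> 4     # precision: 0 = 8-bit, 1 = 16-bit entries
        tq = segment[pos] & 0x0F   # table id, referenced by header bytes 0x05-0x07
        pos += 1
        if pq == 0:
            values = list(segment[pos:pos + 64])
            pos += 64
        else:
            values = list(struct.unpack_from(">64H", segment, pos))
            pos += 128
        tables[tq] = values
    return tables, length
```

If the assumption about the length field holds, the returned length tells the caller where the next part of the file begins.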

DHT table

This table is basically a standard JPEG DHT table, but it seems to be missing its 0xFFC4 marker at the start.
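
The same approach works for the huffman tables. The sketch below follows the standard JPEG DHT segment layout: a 2-byte big-endian length, then for each table a Tc/Th byte, 16 code-length counts and the symbol values. The codes are assigned in the canonical JPEG way. As with the DQT table, the presence of the length field is an assumption, and `parse_dht` is a made-up name.

```python
import struct


def parse_dht(segment: bytes):
    """Parse a marker-less DHT segment.

    Returns (tables, length). Each table is keyed by ("DC" or "AC", id)
    and maps (code_length, code) to the decoded symbol.
    """
    (length,) = struct.unpack_from(">H", segment, 0)
    pos = 2
    tables = {}
    while pos < length:
        tc = segment[pos] >> 4     # table class: 0 = DC, 1 = AC
        th = segment[pos] & 0x0F   # table id
        counts = list(segment[pos + 1:pos + 17])  # number of codes of length 1..16
        pos += 17
        symbols = list(segment[pos:pos + sum(counts)])
        pos += sum(counts)

        # Canonical code assignment: codes of equal length are consecutive,
        # and the running code is shifted left when moving to the next length.
        codes = {}
        code = 0
        k = 0
        for bit_len in range(1, 17):
            for _ in range(counts[bit_len - 1]):
                codes[(bit_len, code)] = symbols[k]
                code += 1
                k += 1
            code <<= 1
        tables[("DC" if tc == 0 else "AC", th)] = codes
    return tables, length
```

A decoder would read bits one at a time, extending the current code until `(length, code)` is found in the table selected by header bytes 0x08-0x0D.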

Entropy coded data

This seems to be standard huffman encoded coefficient data, with the following exceptions:

  • there are no stuffing bytes after 0xFF bytes
  • after each JPEG MCU (minimum coded unit), the next MCU seems to start at the next byte-aligned position in the stream
  • there is no end-of-image JPEG marker
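
The first two exceptions mostly affect the bit reader. Below is a minimal sketch of one (the class name `BitReader` is made up). It reads bits most-significant first, as in standard JPEG, but does not skip a 0x00 after a 0xFF byte, and it offers a method to jump to the next byte boundary once an MCU has been decoded.

```python
class BitReader:
    """MSB-first bit reader without JPEG byte stuffing."""

    def __init__(self, data: bytes, pos: int = 0):
        self.data = data
        self.pos = pos  # index of the byte currently being read
        self.bit = 0    # number of bits already consumed from that byte (0..7)

    def read_bit(self) -> int:
        value = (self.data[self.pos] >> (7 - self.bit)) & 1
        self.bit += 1
        if self.bit == 8:
            self.bit = 0
            self.pos += 1
        return value

    def read_bits(self, count: int) -> int:
        value = 0
        for _ in range(count):
            value = (value << 1) | self.read_bit()
        return value

    def align_to_byte(self) -> None:
        """Discard the rest of a partly read byte; call after each MCU."""
        if self.bit:
            self.bit = 0
            self.pos += 1
```

Because every MCU starts on a byte boundary, a reader can also be created directly at the absolute offset taken from the MCU info block, which is what makes decoding from an arbitrary MCU possible.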

The order of components (Y, Cb, Cr) coded in the stream depends on the resolution byte from the header.

  • For resolution value 0x00, they seem to be interleaved in the following order: chroma1, chroma2, luma.
  • For resolution value 0x03, they seem to be interleaved in the following order: chroma1, chroma2, luma, luma, luma, luma. The two chroma blocks are sub-sampled by a factor of 2 in this case, so each covers a 16x16 pixel area. The luma blocks are not sub-sampled, which is why four blocks of 8x8 pixels are needed to cover the same 16x16 pixel area.
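
The two cases can be captured in a small lookup, sketched below with a made-up function name (`mcu_layout`). It keeps the labels chroma1 and chroma2, since which of them is Cb and which is Cr is not stated here, and it says nothing about how the four luma blocks are arranged inside the 16x16 area.

```python
def mcu_layout(resolution: int):
    """Return (mcu_side_in_pixels, block_order) for a resolution byte."""
    if resolution == 0x00:
        # No sub-sampling: one 8x8 block per component.
        return 8, ["chroma1", "chroma2", "luma"]
    if resolution == 0x03:
        # Chroma sub-sampled by 2: two chroma blocks plus four luma blocks.
        return 16, ["chroma1", "chroma2"] + ["luma"] * 4
    raise ValueError("unknown resolution byte: %#x" % resolution)
```

A decoder would loop over the returned block order for every MCU, decoding one 8x8 block of coefficients per entry with the tables selected in the header.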