Hi
I have a Kidz Cam (Sakar 88379) and it isn't fully supported.
So I have been looking at the raw data files to try and figure out the
image encoding.
I've read kilgota's notes and would like to contribute to this list what
I have discovered so far about the raw image format.
First note that the camera is able to take photos with a variety of
settings:
Quality: HI or LO
Resolution: HI (CIF) or LO (QCIF)
The resulting 16-byte header appears to have the following structure:
byte 0,1: always 0x00,0x22, except I saw 00 02 once in a particularly
highly-compressed file.
byte 2: bit 0: resolution (0=CIF 1=QCIF)
byte 2: bit 4: quality (0=HI 1=LO)
byte 3: 0x52=QCIF 0x5e=CIF
byte 4: height in MCUs (an MCU is an 8x8 cell or "minimum coded unit")
byte 5: width in MCUs
byte 6,7: number of 128-byte chunks in the data (?)
byte 10,11: always 0x32,0x00
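
A minimal Python sketch of how the header fields listed above could be unpacked. The names (KidzCamHeader, parse_header and the field names) are hypothetical and not taken from any driver. The byte order of bytes 6,7 is not known, so they are kept as raw bytes, and the last 8 bytes are returned unparsed.

```python
from typing import NamedTuple

HEADER_SIZE = 16  # header length observed in the raw files


class KidzCamHeader(NamedTuple):
    magic: bytes        # bytes 0,1: usually 00 22
    qcif: bool          # byte 2, bit 0: 0=CIF 1=QCIF
    low_quality: bool   # byte 2, bit 4: 0=HI 1=LO
    byte3: int          # 0x52=QCIF 0x5e=CIF
    height_mcus: int    # byte 4
    width_mcus: int     # byte 5
    chunk_field: bytes  # bytes 6,7: 128-byte chunk count (?), byte order unknown
    tail: bytes         # bytes 8..15, left raw (bytes 10,11 were always 32 00)


def parse_header(raw: bytes) -> KidzCamHeader:
    """Split the first 16 bytes of a raw file into the fields guessed so far."""
    if len(raw) < HEADER_SIZE:
        raise ValueError("need at least 16 bytes")
    h = raw[:HEADER_SIZE]
    return KidzCamHeader(
        magic=h[0:2],
        qcif=bool(h[2] & 0x01),
        low_quality=bool(h[2] & 0x10),
        byte3=h[3],
        height_mcus=h[4],
        width_mcus=h[5],
        chunk_field=h[6:8],
        tail=h[8:16],
    )
```

If the layout above is right, the number of segments to expect in the image data would be width_mcus // 2.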
The following data (the image data) appears to be made up of width/2
'segments', each one terminated with FF D9. It seems to be very JPEG-like.
The number of segments in the data is always width/2 (where width is the
width in MCUs) suggesting that each segment contains two columns of MCUs.
An analysis of segment length compared with the uniformity of MCUs
from some test photos supports this. The first segment of a file appears
to encode the first (left-most) 2 columns in the image data.
Each segment in the file is terminated with FF D9 then padded with nuls
to the next 16-byte boundary.
The segment content (up to the terminating FF D9) is FF-escaped; that
is, any FF is followed by a 00. (The FF 00 is 'unescaped' to a single FF
for processing as is done in JPEG files).
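
The segment layout described above could be walked with something like the following Python sketch. The function names are made up for illustration. It assumes the input is the image data with the 16-byte header already removed (so 16-byte boundaries in the data line up with those in the file), and that a segment whose FF D9 ends exactly on a boundary gets no padding.

```python
from typing import List


def split_segments(data: bytes, count: int, align: int = 16) -> List[bytes]:
    """Return the first `count` segments, without their FF D9 terminators.

    `count` would normally be width/2 (width in MCUs); anything after
    that, such as a 'garbage segment', is ignored.
    """
    segments = []
    pos = 0
    for _ in range(count):
        # Inside a segment every FF is followed by 00, so the first
        # FF D9 found from `pos` should be the terminator.
        end = data.find(b"\xff\xd9", pos)
        if end < 0:
            raise ValueError("missing FF D9 terminator")
        segments.append(data[pos:end])
        # Skip the terminator and the nul padding up to the next boundary.
        pos = -(-(end + 2) // align) * align
    return segments


def unescape(segment: bytes) -> bytes:
    """Turn each escaped FF 00 pair back into a single FF."""
    return segment.replace(b"\xff\x00", b"\xff")
```

For example, split_segments(raw[16:], width_mcus // 2) followed by unescape() on each result would give the bit streams discussed in the rest of this message.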
The last valid segment in the file is often followed by a 'garbage
segment' that extends up to the next 128 byte boundary. In garbage
segments, instances of FF followed by bytes other than 00 or D9 are
found. Some files do not have 'garbage segments', as they just fit to a
128 byte boundary.
In one test file named 'white' (generated by shining a bright pen-LED
light directly into the camera) that yielded a near-white image, a 6-bit
sequence was found to indicate the end of each MCU. This was easy to
spot since the shortest segments in the 'white' image consisted almost
completely of repetitions of that code, and each corresponded to a completely
white column in the Windows-decoded image. There were always the right number of
6-bit codes in each segment: namely twice the height of the image in
MCUs. (Occasionally there were 1, 2 or 3 extra occurrences, which can be
explained because the code could feasibly appear as a substring of other
huffman codes.) However, the
6-bit code used in the white file didn't appear the right number of
times in other test files indicating that the huffman codes are
different for each image.
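
The counting check described above might look like this in Python. This is only a sketch with made-up names: it counts the candidate code at any bit offset, without decoding the stream, so it can overcount when the pattern happens to appear inside or across other codes.

```python
def to_bits(data: bytes) -> str:
    """Render bytes as a string of '0'/'1' characters, MSB first."""
    return "".join(f"{b:08b}" for b in data)


def count_code(segment: bytes, code: str) -> int:
    """Count non-overlapping occurrences of `code` in an unescaped segment."""
    return to_bits(segment).count(code)


def has_enough_terminators(segment: bytes, code: str, height_mcus: int) -> bool:
    """A segment holds two columns of MCUs, so a real terminator code
    should appear at least 2 * height times."""
    return count_code(segment, code) >= 2 * height_mcus
```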
The huffman code for MCU termination in other files could be successfully
guessed by examining the final bits in unescaped segments, and choosing
the shortest common affix followed by zero or more 0 bits. (Segment data
are padded to the 8-bit boundary with zeros, but there are never
extraneous bytes.) Ambiguous guesses were reduced by counting the number
of occurrences of affixes in each segment (it had to occur at least
height*2 times). The termination codes were generally 4, 5 or 6 bits
long. From a sample of 17 photos, the only terminator codes seen were:
1010, 01010, 11010, 001010
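
One possible reading of the guessing procedure above, as a Python sketch. The names and the 4 to 6 bit length limits are assumptions (the limits come from the code lengths observed so far). Candidates are taken from the end of the first segment, then kept only if every segment ends with the candidate plus 0 to 7 zero padding bits and contains it at least height*2 times.

```python
from typing import List


def to_bits(data: bytes) -> str:
    """Render bytes as a string of '0'/'1' characters, MSB first."""
    return "".join(f"{b:08b}" for b in data)


def ends_with_code(bits: str, code: str) -> bool:
    """True if `bits` ends with `code` followed by 0..7 zero padding bits."""
    return any(bits.endswith(code + "0" * pad) for pad in range(8))


def guess_terminators(segments: List[bytes], height_mcus: int,
                      min_len: int = 4, max_len: int = 6) -> List[str]:
    """Return plausible MCU terminator codes, shortest first.

    `segments` are unescaped segments without their FF D9 terminators.
    """
    bit_strings = [to_bits(s) for s in segments]
    first = bit_strings[0]
    candidates = set()
    for pad in range(8):
        if not first.endswith("0" * pad):
            break  # no more trailing zeros that could be padding
        end = len(first) - pad
        for n in range(min_len, max_len + 1):
            if end - n >= 0:
                candidates.add(first[end - n:end])
    survivors = [
        c for c in candidates
        if all(ends_with_code(b, c) and b.count(c) >= 2 * height_mcus
               for b in bit_strings)
    ]
    return sorted(survivors, key=lambda c: (len(c), c))
```

More than one candidate can survive (for instance 1010 and 01010 when every terminator happens to be preceded by a 0 bit), so the result is a short list to inspect rather than a definite answer.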
When the segments are split up on the MCU terminator, sometimes a code
of all 1s is seen. This is unusual because in JPEG files, huffman codes
are never all-1s.
Also, because the image data starts immediately after the header, the
codes used are presumably different for each image, and some files have
no 'garbage' segment, the only place left to store a description of
which huffman table is used is the header (unless that information is
simply not downloaded from the camera?)
I suspect that the last 8 bytes of the header contain the representation
of the huffman code used, but I haven't been able to show that yet.
Perhaps I am wrong and the terminator code is always 1010. That would
suggest that trivial AC and DC coefficients are encoded as 00!
I'd be very interested to hear if anyone else has made progress with
deciphering Kidz Cam raw image files.
I'm also interested in building a collection of test images. The
collection would have to contain raw images and equivalents decoded by
the Windows driver. The Sakar driver that I found was able to save files
in either BMP or JPG format. I'm not a Windows expert so I was fumbling
around quite badly. I would also prefer not to reverse engineer the
driver, and instead derive the encoding by examining the resulting
output files. However, neither BMP nor JPG outputs appear to represent
the data very well. I suspect the driver decodes the camera data to an
8-bit (greyscale?) buffer, and then re-encodes that to either BMP or
JPG. I say this because the JPG files written that I have examined
contain no recognisable sections from the raw image data, and the BMP
files are always 8-bit RGB but appear to be greyscale?
David
_______________________________________________
Gphoto-devel mailing list
Gphoto-devel@...
https://lists.sourceforge.net/lists/listinfo/gphoto-devel