Kidz Cam Huffman tables?

by David Leonard-4

Hi

I have a Kidz Cam (Sakar 88379) and it isn't fully supported.
So I have been looking at the raw data files to try and figure out the
image encoding.
I've read kilgota's notes and would like to contribute to this list what
I have discovered so far about the raw image format.

First note that the camera is able to take photos with a variety of
settings:

   Quality: HI or LO
   Resolution: HI (CIF) or LO (QCIF)

The resulting 16-byte header appears to have the following structure:

byte 0,1: always 0x00,0x22, except I saw 00 02 once in a particularly
highly-compressed file.
byte 2: bit 0: resolution (0=CIF 1=QCIF)
byte 2: bit 4: quality (0=HI 1=LO)
byte 3: 0x52=QCIF 0x5e=CIF
byte 4: height in MCUs (an MCU is an 8x8 cell or "minimum coded unit")
byte 5: width in MCUs
byte 6,7: number of 128-byte chunks in the data (?)
byte 10,11: always 0x32,0x00
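As an illustration of the layout listed above, here is a minimal Python sketch that pulls the described fields out of the 16-byte header. The names `RawHeader` and `parse_header` are made up for this example, and bytes 6-7 and 8-15 are kept as raw bytes because their byte order and meaning are not established.

```python
from dataclasses import dataclass


@dataclass
class RawHeader:
    model_byte: int       # byte 1 (0x22 in the samples described above)
    low_resolution: bool  # byte 2, bit 0 (1 = QCIF)
    low_quality: bool     # byte 2, bit 4 (1 = LO)
    size_code: int        # byte 3 (0x52 = QCIF, 0x5e = CIF)
    height_mcus: int      # byte 4
    width_mcus: int       # byte 5
    chunk_bytes: bytes    # bytes 6-7, kept raw: byte order is an open question
    tail: bytes           # bytes 8-15, meaning not yet established


def parse_header(raw: bytes) -> RawHeader:
    """Split the first 16 bytes of a raw file into the fields listed above."""
    if len(raw) < 16:
        raise ValueError("header must be at least 16 bytes")
    h = raw[:16]
    return RawHeader(
        model_byte=h[1],
        low_resolution=bool(h[2] & 0x01),
        low_quality=bool(h[2] & 0x10),
        size_code=h[3],
        height_mcus=h[4],
        width_mcus=h[5],
        chunk_bytes=h[6:8],
        tail=h[8:16],
    )
```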

The following data (the image data) appears to be made up of width/2
'segments' each one terminated with FF D9. It seems to be very JPEG-like.
The number of segments in the data is always width/2 (where width is the
width in MCUs) suggesting that each segment contains two columns of MCUs.
An analysis of segment length compared with the uniformity of MCUs
from some test photos supports this. The first segment of a file appears
to encode the first (left-most) 2 columns in the image data.
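The segment arithmetic can be written down as a small Python sketch. It assumes 8x8-pixel MCUs as stated for the header and that segment 0 is the left-most pair of columns; the function names are illustrative only.

```python
MCU_SIZE = 8  # pixels per MCU side, as stated for the header fields


def segment_count(width_mcus: int) -> int:
    """Number of segments expected for an image width_mcus MCUs wide."""
    return width_mcus // 2


def segment_pixel_columns(segment_index: int) -> range:
    """Pixel x-range covered by one segment (two MCU columns = 16 pixels)."""
    first = segment_index * 2 * MCU_SIZE
    return range(first, first + 2 * MCU_SIZE)
```

If the camera's CIF mode is the standard 352x288 (an assumption, not something measured here), the width is 44 MCUs and `segment_count(44)` gives 22 segments.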

Each segment in the file is terminated with FF D9 then padded with nuls
to the next 16-byte boundary.
The segment content (up to the terminating FF D9) is FF-escaped; that
is, any FF is followed by a 00. (The FF 00 is 'unescaped' to a single FF
for processing as is done in JPEG files).
The last valid segment in the file is often followed by a 'garbage
segment' that extends up to the next 128 byte boundary. In garbage
segments, instances of FF  followed by bytes other than 00 or D9 are
found. Some files do not have 'garbage segments', as they just fit to a
128 byte boundary.
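The segment layout just described can be sketched in Python as follows. This is only a sketch under the stated observations: `data` is taken to be the image data after the 16-byte header, `expected` is width/2 from the header, and anything after the last expected segment (the 'garbage segment') is ignored. The name `split_segments` is made up for this example.

```python
def split_segments(data: bytes, expected: int) -> list[bytes]:
    """Return the unescaped content of the first `expected` segments."""
    segments = []
    pos = 0
    while len(segments) < expected:
        # Inside a segment every FF is followed by 00, so the first FF D9
        # found from the segment start is its terminator.
        end = data.find(b"\xff\xd9", pos)
        if end < 0:
            raise ValueError("ran out of data before finding all segments")
        # Undo the JPEG-style escaping: FF 00 stands for a single FF.
        segments.append(data[pos:end].replace(b"\xff\x00", b"\xff"))
        # Skip the marker, then the nul padding up to the next 16-byte boundary.
        pos = (end + 2 + 15) // 16 * 16
    return segments
```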

In one test file named 'white' (generated by shining a bright pen-LED
light directly into the camera) that yielded a near-white image, a 6-bit
sequence was found to indicate the end of each MCU. This was easy to
spot since the shortest segments in the 'white' image consisted almost
completely of repetitions of that code and corresponded to a completely
white column in the Windows-decoded image. There were always the right
number of 6-bit codes in each segment: namely twice the height of the
image in MCUs. (Occasionally there were 1, 2 or 3 extra occurrences,
which is explained by the code feasibly being a substring of other
Huffman codes.) However, the 6-bit code used in the white file didn't
appear the right number of times in other test files, indicating that
the Huffman codes are different for each image.
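The occurrence count described above can be reproduced with a short Python sketch. It assumes the bits of each unescaped segment are read most-significant-bit first, as in JPEG (this has not been confirmed for the camera), and it counts matches at every bit offset, which is why a few extra hits are to be expected. `to_bits` and `count_code` are illustrative names.

```python
def to_bits(segment: bytes) -> str:
    """Render a segment as a string of '0'/'1', most significant bit first."""
    return "".join(f"{byte:08b}" for byte in segment)


def count_code(bits: str, code: str) -> int:
    """Count occurrences of code at every bit offset, overlaps included."""
    count = 0
    start = bits.find(code)
    while start >= 0:
        count += 1
        start = bits.find(code, start + 1)
    return count
```

For a segment covering two MCU columns, the count for the true terminator should be at least twice the image height in MCUs.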

The Huffman code for MCU termination in a given file could be successfully
guessed by examining the final bits in unescaped segments, and choosing
the shortest common affix followed by zero or more 0 bits. (Segment data
are padded to the 8-bit boundary with zeros, but there are never
extraneous bytes.) Ambiguous guesses were reduced by counting the number
of occurrences of affixes in each segment (it had to occur at least
height*2 times). The termination codes were generally 4, 5 or 6 bits
long. From a sample of 17 photos, the only terminator codes seen were:
1010, 01010, 11010, 001010
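A rough Python sketch of this guessing procedure follows. It is not the exact script used for the figures above: the candidate length limits (`min_len`, `max_len`), the MSB-first bit order and the function names are all assumptions made for illustration. Candidates are taken from the tail of the first segment and kept only if every segment ends with the candidate plus 0 to 7 zero bits and contains it at least height*2 times.

```python
import re


def ends_with_code(bits: str, code: str) -> bool:
    # Zero padding to the byte boundary can add at most 7 bits.
    return any(bits.endswith(code + "0" * pad) for pad in range(8))


def guess_terminators(segments, height_mcus, min_len=3, max_len=8):
    """Return candidate MCU terminator codes, shortest first."""
    all_bits = ["".join(f"{b:08b}" for b in seg) for seg in segments]
    first = all_bits[0]
    candidates = set()
    for pad in range(8):
        # Only strip trailing bits that really are zero padding candidates.
        if pad and not first.endswith("0" * pad):
            break
        body = first[: len(first) - pad]
        for n in range(min_len, max_len + 1):
            if len(body) >= n:
                candidates.add(body[-n:])
    need = 2 * height_mcus
    good = []
    for code in candidates:
        # The lookahead counts overlapping matches at every bit offset.
        pattern = re.compile(f"(?={code})")
        if all(
            ends_with_code(bits, code) and len(pattern.findall(bits)) >= need
            for bits in all_bits
        ):
            good.append(code)
    return sorted(good, key=lambda c: (len(c), c))
```

Several candidates can survive the filter, since a short code is often a substring of a longer one; the shortest survivor is then the first guess to try.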

When the segments are split up on the MCU terminator, sometimes a code
of all 1s is seen. This is unusual because in JPEG files, huffman codes
are never all-1s.

Also, because the image data starts immediately after the header, the
codes used are presumably different for each image, and some files have
no 'garbage' segment, the only place left to store a description of
which Huffman table is used is the header (or possibly the description
is not downloaded from the camera at all?)
I suspect that the last 8 bytes of the header contain the representation
of the huffman code used, but I haven't been able to show that yet.

Perhaps I am wrong and the terminator code is always 1010. That would
suggest that trivial AC and DC coefficients are encoded as 00!

I'd be very interested to hear if anyone else has made progress with
deciphering Kidz Cam raw image files.

I'm also interested in building a collection of test images. The
collection would have to contain raw images and equivalents decoded by
the Windows driver. The Sakar driver that I found was able to save files
in either BMP or JPG format. I'm not a Windows expert so I was fumbling
around quite badly. I would also prefer not to reverse engineer the
driver, and instead derive the encoding by examining the resulting
output files. However neither BMP nor JPG outputs appear to represent
the data very well. I suspect the driver decodes the camera data to an
8-bit (greyscale?) buffer, and then re-encodes that to either BMP or
JPG. I say this because the JPG files written that I have examined
contain no recognisable sections from the raw image data, and the BMP
files are always 8-bit RGB but appear to be greyscale?

David

-------------------------------------------------------------------------
Check out the new SourceForge.net Marketplace.
It's the best place to buy or sell services for
just about anything Open Source.
http://sourceforge.net/services/buy/index.php
_______________________________________________
Gphoto-devel mailing list
Gphoto-devel@...
https://lists.sourceforge.net/lists/listinfo/gphoto-devel

Re: Kidz Cam Huffman tables?

by kilgota


Hi, David.

This is very welcome information. I have not quit working on these
cameras. The compression algorithm is just particularly nasty.

> Message: 3
> Date: Tue, 24 Jun 2008 22:43:45 +1000
> From: David Leonard <leonard@...>
> Subject: [gphoto-devel] Kidz Cam Huffman tables?
> To: gphoto-devel@...
> Message-ID: <4860EC01.4040703@...>
> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
>
> Hi
>
> I have a Kidz Cam (Sakar 88379) and it isn't fully supported.

Very true. Also there are a number of similar cameras out there, at this
point. So for me this decompression problem is of high priority.

> So I have been looking at the raw data files to try and figure out the
> image encoding.
> I've read kilgota's notes and would like to contribute to this list what
> I have discovered so far about the raw image format.
>
> First note that the camera is able to take photos with a variety of
> settings:
>
>   Quality: HI or LO
>   Resolution: HI (CIF) or LO (QCIF)
>

It seems to me that the "resulting 16-byte header" is simply a copy of the
16 bytes in the previously downloaded Allocation Table which specifically
relate to the given image. Is this also your impression?


> The resulting 16-byte header appears to have the following structure:
>
> byte 0,1: always 0x00,0x22, except I saw 00 02 once in a particularly
> highly-compressed file.
> byte 2: bit 0: resolution (0=CIF 1=QCIF)
> byte 2: bit 4: quality (0=HI 1=LO)
> byte 3: 0x52=QCIF 0x5e=CIF
> byte 4: height in MCUs (an MCU is an 8x8 cell or "minimum coded unit")
> byte 5: width in MCUs
> byte 6,7: number of 128-byte chunks in the data (?)
> byte 10,11: always 0x32,0x00
>

Yes, pretty much right. Although bytes 10 and 11 are not always the same
on all of the cameras which seem to be using the same general scheme. I
mean, there are other models besides the KidzCam and these bytes can
differ. So they do mean something.

> The following data (the image data) appears to made up of  width/2

width/2? Not compressed, or something?

> 'segments' each one terminated with FF D9. It seems to be very JPEG-like.

Yes, this much I know. Someone else has remarked to me that it is like the
results of a JPEG implementation project which was done by a very bad
student. Well, I would not say things like that, but it does seem that it
might be doing things by blocks, not just by rows.

> The number of segments in the data is always width/2 (where width is the
> width in MCUs) suggesting that each segment contains two columns of MCUs.
> An analysis of segment length compared with with the uniformity of MCUs
> from some test photos supports this. The first segment of a file appears
> to encode the first (left-most) 2 columns in the image data.
>
> Each segment in the file is terminated with FF D9 then padded with nuls
> to the next 16-byte boundary.
> The segment content (up to the terminating FF D9) is FF-escaped; that
> is, any FF is followed by a 00. (The FF 00 is 'unescaped' to a single FF
> for processing as is done in JPEG files).
> The last valid segment in the file is often followed by a 'garbage
> segment' that extends up to the next 128 byte boundary. In garbage
> segments, instances of FF  followed by bytes other than 00 or D9 are
> found. Some files do not have 'garbage segments', as they just fit to a
> 128 byte boundary.
>
> In one test file named 'white' (generated by shining a bright pen-LED
> light directly into the camera) that yielded a near-white image, a 6-bit
> sequence was found to indicated the end of each MCU. This was easy to
> spot since the shortest segments in the 'white' image consisted almost
> completely of repetitions of that code and was a completely white column
> in the Windows-decoded image. There were always the right number of
> 6-bit codes in each segment: namely twice the height of the image in
> MCUs. (Occasionally 1,2 or 3 extra occurrences, explained because the
> code is feasibly a substring of other huffman codes).  However, the
> 6-bit code used in the white file didn't appear the right number of
> times in other test files indicating that the huffman codes are
> different for each image.
>
> The huffman code for MCU termination in files by could be successfully
> guessed by examining the final bits in unescaped segments, and choosing
> the shortest common affix followed by zero or more 0 bits. (Segment data
> are padded to the 8-bit boundary with zeros, but there are never
> extraneous bytes.) Ambiguous guesses were reduced by counting the number
> of occurrences of affixes in each segment (it had to occur at least
> height*2 times). The termination codes were generally 4, 5 or 6 bits
> long. From a sample of 17 photos, the only terminator codes seen were:
> 1010, 01010, 11010, 001010
>
> When the segments are split up on the MCU terminator, sometimes a code
> of all 1s is seen. This is unusual because in JPEG files, huffman codes
> are never all-1s.
>
> Also, because the image data starts immediately after the header, and
> the codes used are presumably different and because in some files there
> is no 'garbage' segment present, the only place left to store a
> description of which huffman table used is in the header (or possibly
> not downloaded from the camera?)
> I suspect that the last 8 bytes of the header contain the representation
> of the huffman code used, but I haven't been able to show that yet.

Our suspicions agree. I think it might be computing a Huffman table
based upon some of those numbers, too, or perhaps it is using a fixed
Huffman table which is computed during the image processing. But I have
not made much progress on these things, either.

>
> Perhaps I am wrong and the terminator code is always 1010. That would
> suggest that trivial AC and DC coefficients are encoded as 00!
>
> I'd be very interested to hear if anyone else has made progress with
> deciphering Kidz Cam raw image files.
>
> I'm also interested in building a collection of test images. The
> collection would have to contain raw images and equivalents decoded by
> the Windows driver. The Sakar driver that I found was able to save files
> in either BMP or JPG format. I'm not a Windows expert so I was fumbling
> around quite badly. I would also prefer not to reverse engineer the
> driver, and instead derive the encoding by examining the resulting
> output files. However neither BMP nor JPG outputs appear to represent
> the data very well.. I suspect the driver decodes the camera data to an
> 8-bit (greyscale?) buffer, and then re-encodes that to either BMP or
> JPG. I say this because the JPG files written that I have examined
> contain no recognisable sections from the raw image data, and the BMP
> files are always 8-bit RGB but appear to be greyscale?
>
> David


Yes, very interesting indeed. You have done a lot more detailed analysis
of the images than I have. I have been trying other stuff, also without
tangible success yet.

I have to give a final exam for a short summer-session course, this
afternoon. Then I will have to grade it over the weekend. I am leaving for
a conference July 13, followed by vacation, and will probably be out of
contact for a while. The conference paper is written. I do have to make my
slides for the presentation. Other than this, I am glad to spend some of
the time between now and July 13 trying to make some progress, here.

There are several people who have similar cameras. They could help us with
getting a collection of raw photos. At least one of them is working
on this problem, too. I do not know how far he has gotten.

Theodore Kilgore

Re: Kidz Cam Huffman tables?

by kilgota

David,

I got my test written and have a free half-hour or so. Thus, I am reading
through this more carefully and will add some other comments.



On Fri, 27 Jun 2008, gphoto-devel-request@... wrote:

>
> Message: 3
> Date: Tue, 24 Jun 2008 22:43:45 +1000
> From: David Leonard <leonard@...>
> Subject: [gphoto-devel] Kidz Cam Huffman tables?
> To: gphoto-devel@...
> Message-ID: <4860EC01.4040703@...>
> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
>
> Hi
>
> I have a Kidz Cam (Sakar 88379) and it isn't fully supported.
> So I have been looking at the raw data files to try and figure out the
> image encoding.
> I've read kilgota's notes and would like to contribute to this list what
> I have discovered so far about the raw image format.
>
> First note that the camera is able to take photos with a variety of
> settings:
>
>   Quality: HI or LO
>   Resolution: HI (CIF) or LO (QCIF)


Note that in fact there are several related cameras, which all seem to
have a very similar construction, and there are minor variants in what
they are doing. They all have pretty much identical command sets. The
first thing they do, after an init sequence, is to download an Allocation
Table which contains one 0x10-byte line per photo, starting at line 3
(counting from 0). You have the meanings pretty much correct, except for
some differences between the cameras (KidzCam is JL2005B; there are also
JL2005C and D):

>
> The resulting 16-byte header appears to have the following structure:
>
> byte 0,1: always 0x00,0x22, except I saw 00 02 once in a particularly
> highly-compressed file.

byte 0 is 00, byte 1 may be different depending on the model. My JL2005C
gives 0x32 for byte 1, and someone has sent me a JL2005D sample where it
is 0x20. The 00 in byte 0 of the line in the allocation table is the
marker that a photo is there.

> byte 2: bit 0: resolution (0=CIF 1=QCIF)

0=HI, and 1=LO (whatever these mean relative to the camera)

> byte 2: bit 4: quality (0=HI 1=LO)

yes

> byte 3: 0x52=QCIF 0x5e=CIF

Other possibilities, depending on the model.


> byte 4: height in MCUs (an MCU is an 8x8 cell or "minimum coded unit")
> byte 5: width in MCUs

yes and yes.

> byte 6,7: number of 128-byte chunks in the data (?)

Depends. It can be 512-byte chunks for some of them. And it signifies the
number of bytes read in order to complete the photo, not the number of
bytes in the photo.


> byte 10,11: always 0x32,0x00

Byte 10 again depends on the model. I have also seen 0x43 and 0x10, at
least, as well as 0x32

Byte 12-13 Begin block of a photo. Note that photo 0 does not begin at
block 0, but usually at some higher block number. Bytes 12-13 of photo 0's
allocation table line must copy bytes 8-9 of line 0 of the allocation
table.

That leaves as a possibility that bytes 8, 10-11 and/or bytes 14-15 could
have something to do with codes for constructing a Huffman table.
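To make the allocation table layout concrete, here is a minimal Python sketch of walking it. It relies only on what is stated above: 0x10-byte lines, photo entries starting at line 3 (counting from 0), and 00 in byte 0 marking a photo. Stopping at the first line without that marker is an assumption, as is the name `photo_lines`; bytes 12-13 are left as raw bytes because their byte order is not given here.

```python
LINE_SIZE = 0x10
FIRST_PHOTO_LINE = 3  # photo entries start at line 3, counting from 0


def photo_lines(table: bytes) -> list[bytes]:
    """Return the 16-byte allocation table lines that describe photos."""
    lines = []
    start = FIRST_PHOTO_LINE * LINE_SIZE
    for offset in range(start, len(table) - LINE_SIZE + 1, LINE_SIZE):
        line = table[offset:offset + LINE_SIZE]
        if line[0] != 0x00:
            # Assumption: the first line without the 00 marker ends the list.
            break
        lines.append(line)
    return lines


def start_block_bytes(line: bytes) -> bytes:
    """Bytes 12-13 of a line: the photo's begin block, byte order unknown."""
    return line[12:14]
```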


>
> The following data (the image data) appears to made up of  width/2
> 'segments' each one terminated with FF D9. It seems to be very JPEG-like.
> The number of segments in the data is always width/2 (where width is the
> width in MCUs) suggesting that each segment contains two columns of MCUs.
> An analysis of segment length compared with with the uniformity of MCUs
> from some test photos supports this. The first segment of a file appears
> to encode the first (left-most) 2 columns in the image data.
>

So you think it is doing things in columns of width 0x10...

> Each segment in the file is terminated with FF D9 then padded with nuls
> to the next 16-byte boundary.
> The segment content (up to the terminating FF D9) is FF-escaped; that
> is, any FF is followed by a 00. (The FF 00 is 'unescaped' to a single FF
> for processing as is done in JPEG files).
> The last valid segment in the file is often followed by a 'garbage
> segment' that extends up to the next 128 byte boundary. In garbage
> segments, instances of FF  followed by bytes other than 00 or D9 are
> found. Some files do not have 'garbage segments', as they just fit to a
> 128 byte boundary.

One possible reason for this is that the camera _must_ download by blocks.
One cannot do a partial read of a block without upsetting the hardware.
That fact would seem to indicate that the "garbage segments" indeed
contain nothing but garbage.


>
> In one test file named 'white' (generated by shining a bright pen-LED
> light directly into the camera) that yielded a near-white image, a 6-bit
> sequence was found to indicated the end of each MCU. This was easy to
> spot since the shortest segments in the 'white' image consisted almost
> completely of repetitions of that code and was a completely white column
> in the Windows-decoded image. There were always the right number of
> 6-bit codes in each segment: namely twice the height of the image in
> MCUs. (Occasionally 1,2 or 3 extra occurrences, explained because the
> code is feasibly a substring of other huffman codes).  However, the
> 6-bit code used in the white file didn't appear the right number of
> times in other test files indicating that the huffman codes are
> different for each image.
>
> The huffman code for MCU termination in files by could be successfully
> guessed by examining the final bits in unescaped segments, and choosing
> the shortest common affix followed by zero or more 0 bits. (Segment data
> are padded to the 8-bit boundary with zeros, but there are never
> extraneous bytes.) Ambiguous guesses were reduced by counting the number
> of occurrences of affixes in each segment (it had to occur at least
> height*2 times). The termination codes were generally 4, 5 or 6 bits
> long. From a sample of 17 photos, the only terminator codes seen were:
> 1010, 01010, 11010, 001010
>
> When the segments are split up on the MCU terminator, sometimes a code
> of all 1s is seen. This is unusual because in JPEG files, huffman codes
> are never all-1s.
>
> Also, because the image data starts immediately after the header, and
> the codes used are presumably different and because in some files there
> is no 'garbage' segment present, the only place left to store a
> description of which huffman table used is in the header (or possibly
> not downloaded from the camera?)
> I suspect that the last 8 bytes of the header contain the representation
> of the huffman code used, but I haven't been able to show that yet.

As I said, some of them are used to indicate the start block of the
photo's data, so this is not completely true. The "header" of the photo is
simply copied from the allocation table, insofar as I am able to tell.
Some of that data has to indicate where the photo is, because we are
dealing with another one of those primitive controller chips which seems
to expect that all the data in the camera has simply been dumped out, and
we then need to be able to _find_ the photos inside of that god-awful gob
of data. But the "header" is pasted on after the fact, when the raw files
are bitten off the big gob and are separately written.

>
> Perhaps I am wrong and the terminator code is always 1010. That would
> suggest that trivial AC and DC coefficients are encoded as 00!
>
> I'd be very interested to hear if anyone else has made progress with
> deciphering Kidz Cam raw image files.

Someone else is working on it. I am not sure what he is doing but I
suspect that some debugger is in use. I have not heard from that
individual recently. There was someone else who was doing similar
things, a few months ago, but he seemed to decide that the matter was
too difficult to hold his interest. Other than that, I know of no other
tangible progress. The second individual was the one who made the remark
about a bad student's implementation of JPEG. He also, before he quit,
made the comment that the decompression algorithm can be done in two ways.
One of them requires MMX support in the CPU, and the second avoids the use
of MMX. He said that if the camera is run in webcam mode, then the use of
MMX is obligatory but it is not if the camera has been used to take still
photos.

>
> I'm also interested in building a collection of test images. The
> collection would have to contain raw images and equivalents decoded by
> the Windows driver.

As I said, I can ask around about this. There are several others who have
similar cameras.

> The Sakar driver that I found was able to save files
> in either BMP or JPG format.

Yes. I would tend toward figuring out the steps to the BMP format because it
is necessary to do that in order to do the decompression. I suppose that I
can only speak for myself, but at this point I am strongly not interested
in how the driver software is creating JPEG files. For, once it is clear
that the camera is not producing the JPEG files but is doing some other
kind of compression, the matter of JPEG seems to me to be only a red
herring.

> I'm not a Windows expert so I was fumbling
> around quite badly. I would also prefer not to reverse engineer the
> driver,

Yes, that is what we all would prefer.

> and instead derive the encoding by examining the resulting
> output files.

Your approach is very interesting. I do not know if it will be ultimately
successful without also doing what you want to avoid, but it does look
very interesting nonetheless.

> However neither BMP nor JPG outputs appear to represent
> the data very well.. I suspect the driver decodes the camera data to an
> 8-bit (greyscale?) buffer,

That would be the standard thing to do, it seems to me. But the 8-bit data
is not exactly "greyscale" since color information has in fact been
sampled, and the location in the image determines what color has been
sampled there.

> and then re-encodes that to either BMP or

To write a BMP from that is not exactly the same as to do a "re-encode."
Of course, some other things might be going on, such as gamma correction
and smoothing, and such, thus making it quite impossible to work backwards
to get the raw image, which has only been decompressed and no more.

> JPG. I say this because the JPG files written that I have examined
> contain no recognisable sections from the raw image data, and the BMP
> files are always 8-bit RGB but appear to be greyscale?

I do not follow this last statement.

>
> David
>


David,  feel free to contact me off-list.

Theodore Kilgore
