Let us take a step back for a moment and reconsider the 64 outputs of the DCT process for an 8 x 8 pixel block of image data. The first output is a steady (DC) level, the average of all 64 image pixels taken together, and the other 63 outputs (AC) represent the strengths of the different frequency components found in the 8 x 8 pixel block.
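
If you want to see what that looks like in practice, here is a rough Python sketch of the forward 8 x 8 DCT. It is my own toy illustration (the block values are made up and it is nothing like the optimised code inside a real JPEG encoder), but it shows that the first output is just a scaled average of the block and that the other 63 are the AC terms.

[CODE]
# Toy illustration of the 8x8 forward DCT used in JPEG (not real encoder code).
import numpy as np

def dct_2d_8x8(block):
    """Type-II 2D DCT of an 8x8 block, with the scaling used in the JPEG standard."""
    N = 8
    out = np.zeros((N, N))
    for u in range(N):
        for v in range(N):
            cu = 1 / np.sqrt(2) if u == 0 else 1.0
            cv = 1 / np.sqrt(2) if v == 0 else 1.0
            s = 0.0
            for x in range(N):
                for y in range(N):
                    s += block[x, y] * \
                         np.cos((2 * x + 1) * u * np.pi / 16) * \
                         np.cos((2 * y + 1) * v * np.pi / 16)
            out[u, v] = 0.25 * cu * cv * s
    return out

# An arbitrary 8x8 block of pixel values (level-shifted by -128, as JPEG does).
block = np.arange(64).reshape(8, 8).astype(float) - 128

coeffs = dct_2d_8x8(block)
print("DC coefficient:     ", coeffs[0, 0])       # proportional to the block average
print("Block average x 8:  ", block.mean() * 8)   # the same number with this scaling
print("Number of AC outputs:", coeffs.size - 1)   # the other 63 outputs
[/CODE]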

If, via the user's compression level (quality factor) slider, the quantization stage discarded all 63 of the AC outputs, the resultant image would show 8 x 8 pixel areas of a single flat tone. The image would get maximum compression, typically something in excess of 120:1, but you may have dumped a lot of image information to get it. This is only important if the image information was there in the first place. In practice an application's maximum compression setting is not that brutal, but it does introduce a big reduction in the strength of the high frequency components.
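
To show just how brutal that extreme case is, here is another little sketch (again my own illustration, assuming SciPy is installed, with a made-up image array). It keeps only the DC coefficient of each 8 x 8 block, throws the 63 AC coefficients away, and every block comes back as one flat tone equal to its average.

[CODE]
# Toy demonstration: discard every AC coefficient and keep only the DC term.
import numpy as np
from scipy.fft import dctn, idctn

rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(16, 16)).astype(float)   # two 8x8 blocks each way

flattened = np.empty_like(image)
for by in range(0, image.shape[0], 8):
    for bx in range(0, image.shape[1], 8):
        block = image[by:by+8, bx:bx+8]
        coeffs = dctn(block, norm='ortho')   # 1 DC + 63 AC coefficients
        coeffs[1:, :] = 0                    # throw away every AC coefficient
        coeffs[0, 1:] = 0
        flattened[by:by+8, bx:bx+8] = idctn(coeffs, norm='ortho')

# Each 8x8 area of the result is a single tone equal to that block's average.
print(np.round(flattened[0:8, 0:8], 2))
print("Block average:", image[0:8, 0:8].mean())
[/CODE]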

Attached is a screen grab of a before and after, scaled by 300%. The upper part is the original TIFF image and the lower is the same file saved as JPEG in Photoshop with a Quality setting of zero (maximum compression). Compare the two halves.

In some areas of the lower half you can clearly see the 8 x 8 block structure where the weak high frequency detail has been completely averaged out (smoothed). You should also notice a slight colour shift.

Just to complete the JPEG compression story, we will briefly look at the remaining processes. In a previous post I had already mentioned that the 63 AC components from the DCT process are run length encoded (a lossless compression method). The DC component, however, is treated differently. It is assumed that neighbouring 8 x 8 blocks will have a similar average value, so instead of storing the full value each block stores only the difference from the previous block's DC level, which requires far less code to store the information. The last stage is Huffman encoding (another lossless compression method), which compresses all of the above data once more.
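
Those two ideas are easier to see in code than in words. The sketch below is a much simplified illustration of my own, nowhere near the real JPEG bitstream format (which Huffman-codes run/size pairs and is left out here): DC values are stored as differences from the previous block, and runs of zero AC coefficients are run length coded.

[CODE]
# Simplified sketch of DC differential coding and AC run-length coding.
# Real JPEG then Huffman-codes (run, size) pairs; that step is omitted here.

def dc_differences(dc_values):
    """DPCM: replace each DC value with its difference from the previous block's DC."""
    diffs, previous = [], 0
    for dc in dc_values:
        diffs.append(dc - previous)
        previous = dc
    return diffs

def run_length_encode(ac_values):
    """Encode the 63 AC values as (zero_run, value) pairs, ending with (0, 0) = end of block."""
    pairs, run = [], 0
    for v in ac_values:
        if v == 0:
            run += 1
        else:
            pairs.append((run, v))
            run = 0
    pairs.append((0, 0))     # end-of-block marker: everything after this is zero
    return pairs

# Example: DC values of four neighbouring blocks, and one block's quantised AC values.
print(dc_differences([1210, 1198, 1205, 1207]))          # [1210, -12, 7, 2] - small numbers
print(run_length_encode([5, 0, 0, -3, 0, 1] + [0] * 57)) # [(0, 5), (2, -3), (1, 1), (0, 0)]
[/CODE]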

There is a lot of work involved in producing the humble JPEG image file.

Another thread touched on digital radio, which makes use of MPEG. Many of the concepts covered above apply there too, exploiting the human auditory system: the designers know which parts of the sound they have to preserve, which to emphasise and which to de-emphasise. A few people tried to compare it with CD music quality and vinyl. CD technology is not in the same ball park.

A CD-ROM holds about 2 Gbytes of data, and two thirds of it is required to make the technology work reliably. A special 8-to-14 bit encoding (EFM) limits the bandwidth of the laser pickup circuits. Checksums let the electronics auto-correct corrupted data. Interleaving of the data means small scratches on the disc don't destroy huge chunks of data, but instead produce smaller chunks of lost data, which gives the auto-correction half a chance of working. The data is blocked up into frames, which requires control codes to keep the frames in sequence and to mark where user data starts and finishes. If I recall correctly, over 100 frames have to be read in before decoding can start. So only one third of the disc contains music or computer files.
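
Interleaving is easier to show than to describe. The toy sketch below is my own illustration and nothing like the real CIRC scheme used on a CD, but it shows the principle: a burst error that would have wiped out one whole codeword gets spread thinly across several codewords, so each one loses only a single symbol and the error correction has a fighting chance.

[CODE]
# Toy illustration of interleaving (not the real CIRC scheme used on CDs).

def interleave(codewords):
    """Write codewords as rows, read them out column by column."""
    return [row[i] for i in range(len(codewords[0])) for row in codewords]

def deinterleave(stream, n_codewords):
    """Inverse of interleave(): rebuild the original rows."""
    length = len(stream) // n_codewords
    return [[stream[i * n_codewords + r] for i in range(length)]
            for r in range(n_codewords)]

codewords = [[f"w{r}b{i}" for i in range(6)] for r in range(4)]   # 4 codewords of 6 symbols
stream = interleave(codewords)

# A scratch destroys 4 consecutive symbols on the disc...
for i in range(8, 12):
    stream[i] = "XXX"

# ...but after de-interleaving each codeword has lost only one symbol,
# which a modest error-correcting code could then restore.
for row in deinterleave(stream, 4):
    print(row)
[/CODE]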

Oops, got carried away there - my mind slipped back to a previous lifetime in electronics.

OK I think we have done the JPEG theory to death now. For my sins I had to research the subject for problems at work and ended up knowing more than I ever wanted to know.

In future posts I want to focus on the practical aspects of using JPEG.