
Data Compression

#21 · 25-07-07, 23:24 · Mick

Good stuff Rob. Got to be honest though, I get a bit lost at Fourier's concept. I used to be a radio ham, so I understand the frequency spectrum bit, but I'm having trouble with what the 8 x 8 grids actually mean in relation to the image being processed. I'm determined to get it sussed though.

Mick
#22 · 25-07-07, 23:29 · robski

Let us take a step back for a moment and reconsider the 64 outputs of the DCT process for an 8 x 8 pixel block of image data. The first output is a steady (DC) level, the average of all 64 image pixels together, and the other 63 outputs represent the different frequency (AC) levels found in the 8 x 8 pixel block.

If, via the user's compression level (quality factor) slider, the quantization stage discarded all 63 AC outputs, the resultant image would show 8 x 8 pixel areas of a single tone. The image would get maximum compression, typically something in excess of 120:1, but you may have dumped a lot of image information to get it. That only matters if the image information was there in the first place. In practice an application's maximum compression setting is not that brutal, but it does introduce a big reduction in the strength of the high-frequency components.
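
To make the quantization step concrete, here is a minimal C sketch of the idea (my own illustration, not Photoshop's exact code): each DCT term is divided by a step size from a quantization table, scaled by the quality setting, and rounded. It is the rounding that drives the weak high-frequency terms to zero. The table is the widely published example luminance table from the JPEG spec; the scale factor of 2.0 is an assumed stand-in for a low quality setting.

[code]
#include <stdio.h>
#include <math.h>

/* A widely published example luminance quantization table from the JPEG
   spec (Annex K). Encoders scale a table like this from the quality slider. */
static const int quant[8][8] = {
    {16, 11, 10, 16,  24,  40,  51,  61},
    {12, 12, 14, 19,  26,  58,  60,  55},
    {14, 13, 16, 24,  40,  57,  69,  56},
    {14, 17, 22, 29,  51,  87,  80,  62},
    {18, 22, 37, 56,  68, 109, 103,  77},
    {24, 35, 55, 64,  81, 104, 113,  92},
    {49, 64, 78, 87, 103, 121, 120, 101},
    {72, 92, 95, 98, 112, 100, 103,  99},
};

/* Divide each DCT term by its step size and round. A larger scale factor
   (lower quality) forces more of the small terms to zero. */
void quantize(double dct[8][8], int out[8][8], double scale)
{
    for (int u = 0; u < 8; u++)
        for (int v = 0; v < 8; v++)
            out[u][v] = (int)lround(dct[u][v] / (quant[u][v] * scale));
}

int main(void)
{
    double dct[8][8] = {{0}};
    dct[0][0] = 560.0;    /* strong DC term */
    dct[0][1] = -32.5;    /* modest low-frequency AC term */
    dct[7][7] = 3.2;      /* weak high-frequency AC term */

    int q[8][8];
    quantize(dct, q, 2.0);    /* 2.0: an assumed low-quality scale factor */
    printf("DC=%d low=%d high=%d\n", q[0][0], q[0][1], q[7][7]);
    /* the weak high-frequency term rounds to zero and costs nothing to keep */
    return 0;
}
[/code]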

Attached is a screen grab of a before and after, scaled by 300%. The upper part is the original TIFF image and the lower is the file saved as JPEG in Photoshop with a Quality setting of zero (maximum compression). Compare the two halves.

In some areas of the lower half you can clearly see the 8 x 8 block structure where the weak high-frequency detail has been completely averaged (smoothed) away. You should also notice a slight colour shift.

Just to complete the JPEG compression story, we will briefly look at the remaining processes. In a previous post I mentioned that the 63 AC components from the DCT process are run-length encoded (a lossless compression method). The DC component, however, is treated differently. It is assumed that neighbouring 8 x 8 blocks will have a similar average value, so instead of storing a large number for each block it stores a small number defining the difference in level from the previous block, thereby requiring less code to store the information. The last stage is to use Huffman encoding (another lossless compression method) to compress all of the above compressed data again.
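
To make the DC difference idea concrete, here is a minimal C sketch (my own illustration, not the bit-exact JPEG entropy coding): each block stores only the change from the previous block's DC value, so a row of similar blocks becomes a string of small numbers that need few bits.

[code]
#include <stdio.h>

/* Differential (DPCM) coding of DC terms: store each block's DC value as
   the difference from the previous block. Similar neighbouring blocks give
   small differences, which take fewer bits than the raw values. */
void dc_differences(const int dc[], int diff[], int nblocks)
{
    int prev = 0;                      /* the first block is coded from 0 */
    for (int i = 0; i < nblocks; i++) {
        diff[i] = dc[i] - prev;
        prev = dc[i];
    }
}

int main(void)
{
    /* DC averages of five neighbouring blocks in a smooth area of sky */
    int dc[] = {812, 815, 816, 814, 810};
    int diff[5];
    dc_differences(dc, diff, 5);
    for (int i = 0; i < 5; i++)
        printf("%d ", diff[i]);        /* prints: 812 3 1 -2 -4 */
    printf("\n");
    return 0;
}
[/code]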

There is a lot of work involved to produce the humble JPEG image file.

Another thread touched on digital radio, which makes use of MPEG. Many of the concepts covered above apply to the human audio system: the designers know which bits they have to preserve, emphasise and de-emphasise. A few people tried to compare it with CD music quality and vinyl. CD technology is not in the same ball park. A CD holds about 2 Gbytes of raw data, and two thirds of it is required just to make the technology work reliably: a special 8-to-14 bit encoding (EFM) to limit the bandwidth of the laser pickup circuits; checksums so that the electronics can auto-correct corrupted data; interleaving of data so that small scratches on the disc don't destroy huge chunks of data, but instead give smaller chunks of lost data, which gives the auto-correction half a chance of working. The data is blocked up into frames, which requires control codes to keep the frames in sequence and say where user data starts and finishes. If I recall correctly, over 100 frames have to be read in before decoding can start. So only one third of the disc contains music or computer files.

Oops, got carried away there. My mind slipped back to a previous lifetime in electronics.

OK, I think we have done the JPEG theory to death now. For my sins I had to research the subject for problems at work, and ended up knowing more than I ever wanted to know.

In future posts I want to focus on the practical aspects of using JPEG.
__________________
Rob

-----------------------------------------------------
Solar powered Box Brownie Mk2

Captain Sunshine, to be such a man as he, and walk so pure between the earth and the sea.

WPF Gallery
Birdforum Gallery
http://www.robertstocker.co.uk updated

#23 · 26-07-07, 00:12 · robski

Mick, does the photo of the 64 frequency elements in post 16 help?

The 8 x 8 is just a manageable block size. It could have been 16 x 16, but then it would have contained a higher number of frequency components and would have made the DCT design very complex. For the same reason the JPEG spec imposes a 12-bit limit on the input channels (baseline JPEG is 8-bit). Hence you can't save a 16-bit JPEG.

You can get too hung up on the DCT processing. Most of us with an electronics background can understand filters (bandpass, highpass and lowpass) made from coils, resistors and capacitors. Digital filters are something else, extremely mathematical in their design. In practice their circuits involve shift registers and addition and subtraction units. For example, if you shift a set of data 3 places to the right and then add it to the same set of data shifted one place to the left, you have performed a mathematical function on the data. The same principle applies when you sharpen or blur an image in Photoshop: you apply a filter with a mathematical function.

Going off on a similar tangent, think about noise reduction. This often uses a process called auto-correlation, which entails phase-shifting the data to smooth out the high-frequency pits and bumps. A thing which helped a few things click into place for me in this area was to think about a calculation used in accounting: the one used to average the last 3 years' expenditure as time rolls forward, to see the general trends rather than focus on a single bad month. If you think about it, this is exactly what a lowpass filter is doing. We can change the design by averaging 5 or 7 years instead to get a smoother output, as the sketch below shows.
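
Here is that accounting analogy as a minimal C sketch of my own: a simple moving average, which is exactly the smoothing a lowpass filter performs. Widening the window from 3 to 5 or 7 taps smooths harder, just like averaging more years of expenditure.

[code]
#include <stdio.h>

/* Moving-average lowpass filter: each output sample is the mean of a
   sample and its neighbours. More taps = stronger smoothing. */
void moving_average(const double in[], double out[], int n, int taps)
{
    int half = taps / 2;
    for (int i = 0; i < n; i++) {
        double sum = 0.0;
        int count = 0;
        for (int j = i - half; j <= i + half; j++)
            if (j >= 0 && j < n) { sum += in[j]; count++; }
        out[i] = sum / count;
    }
}

int main(void)
{
    /* a flat signal with one noisy spike */
    double in[] = {10, 10, 10, 25, 10, 10, 10};
    double out[7];
    moving_average(in, out, 7, 3);
    for (int i = 0; i < 7; i++)
        printf("%.1f ", out[i]);   /* the 25 spike is smoothed to 15.0 */
    printf("\n");
    return 0;
}
[/code]
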
__________________
Rob
#24 · 26-07-07, 19:30 · Mick

Rob, they have all helped. I think I have the gist of it but am lacking a lot of the detail; that will come when I have read it through a few more times and had time to think about it.

Mick
#25 · 27-07-07, 00:15 · robski

Noise, the Compression Killer.

In an earlier post we talked about the effect noise has on compression ratios. For this post I have conducted some simple experiments in Photoshop to provide test results, so you can compare and see what effect noise has on the resultant file size.

For all the tests I have used an image size of 800 x 640 RGB pixels, which works out to be 1.46 Mbytes of uncompressed data. Each image was saved using Photoshop CS2 JPEG compression level 12 (maximum image quality, minimum compression).

The first test was to create a 50% grey image using the colour picker and fill commands. The image saved down to 53,114 bytes, a compression ratio of 29:1. It would have been better if the image area was bigger, but we are suffering the overheads of the JPEG format. This image contains virtually no information other than that there are 512,000 RGB pixels of value 127 in an 800 x 640 matrix. A more intelligent compressor would describe all that is needed to recreate the image in a dozen bytes or so.

I repeated the exercise but this time created a bright red image, which saved down to 53,386 bytes. This was simply to compare an RGB image with no colour to one with a bright colour.

I next created another 50% grey image and added 0.5% Gaussian noise, which saved down to 185,054 bytes.
The noise level was increased to 1% for the next test, which saved down to 524,146 bytes.
The grain effect (noise) was barely noticeable in these two images, and typical of a low-noise image straight from the camera.

The last noise-introduction test was with a noise level of 5%, which saved down to 933,222 bytes. In this case the grain was really noticeable, and typical of low-light photography using a high ISO setting.


The last test was to see how effective a noise reduction program such as Neat Image is at reducing the file size. A crop of 800 x 640 pixels was taken from a straight-out-of-camera file shot at ISO 400. This saved down to 374,052 bytes. The same crop was passed through Neat Image for some very basic processing. This time it saved down to 293,126 bytes, a reduction of 20% in file size. My general experience has been that you typically get a 20% to 30% reduction in file size using a noise reduction program.

From the results above you can see it is very important to keep noise under control when trying to keep file sizes small.
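
For anyone who wants to repeat the grey-plus-noise tests outside Photoshop, here is a hedged C sketch: it fills an 800 x 640 greyscale buffer with 50% grey, adds Gaussian noise via the Box-Muller transform, and writes a PGM file you can re-save as JPEG at quality 12 and compare sizes. Mapping the noise percentage to a sigma of that fraction of full scale is my assumption, not necessarily what Photoshop does.

[code]
#include <stdio.h>
#include <stdlib.h>
#include <math.h>

#define W  800
#define H  640
#define PI 3.14159265358979323846

/* one Gaussian sample via the Box-Muller transform */
static double gauss(void)
{
    double u1 = (rand() + 1.0) / (RAND_MAX + 2.0);   /* avoid log(0) */
    double u2 = (rand() + 1.0) / (RAND_MAX + 2.0);
    return sqrt(-2.0 * log(u1)) * cos(2.0 * PI * u2);
}

int main(void)
{
    static unsigned char img[H][W];
    double sigma = 0.01 * 255.0;    /* "1% noise" as 1% of full scale */

    for (int y = 0; y < H; y++)
        for (int x = 0; x < W; x++) {
            double v = 127.0 + sigma * gauss();   /* 50% grey plus noise */
            img[y][x] = v < 0 ? 0 : v > 255 ? 255 : (unsigned char)v;
        }

    /* write a binary greyscale PGM */
    FILE *f = fopen("grey_noise.pgm", "wb");
    if (!f) return 1;
    fprintf(f, "P5\n%d %d\n255\n", W, H);
    fwrite(img, 1, sizeof img, f);
    fclose(f);
    return 0;
}
[/code]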

Tips to avoid unwanted noise (grain effect)

- Don't shoot at a higher ISO (film sensitivity) setting than necessary.
- When choosing a new digital camera, avoid cameras that have a very small sensor.
- Take care not to vastly underexpose shots. The camera sensor, like all electronic devices, will produce a certain amount of electrical noise, so it is important that the signal (light in this case) is much stronger than the noise.
- Take care in post-processing not to unduly increase noise levels when increasing contrast or sharpening.

Noise levels can be reduced by using a noise reduction program or plug-in. However, care is needed, as over-processing can produce unrealistic results by removing image detail along with the noise.
__________________
Rob
#26 · 01-08-07, 00:46 · robski

In the last post we saw how unwanted detail (the noise/grain effect) reduces the compression ratio. Unfortunately, wanted fine detail has the same effect.

In 1948 Claude Shannon introduced a concept called entropy, which gives a fundamental mathematical limit on the best possible lossless compression of any communication. This is all part of information theory: working out the shortest code to send a message, and measuring the information content in a piece of data. The key is to keep the essential information and discard the redundant data (code). Using a lossless compression scheme on finely detailed image data, you'll be lucky if you manage to halve the file size (a 2:1 compression ratio). So according to Shannon's information theory, some information has to be discarded if you want to reduce the file size further (or each symbol has to carry more information, but that's another story). It was the JPEG committee's job to find methods of discarding information that the human eye and brain would not notice, and such that if further information was discarded the brain could still recognise the image. The lossy compression scheme will continue to thrive for as long as we have limited data storage and limits imposed on network bandwidth. The day we all have fibre optic cable to the doorstep, terabyte memory cards and petabyte hard disks, the nasty JPEG baseline compressed images will be as useless as wax cylinder records. Remember, JPEG was developed back in the days when the dial-up modem was king.
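
Entropy is easy to measure for yourself. Here is a minimal C sketch of my own that computes the Shannon entropy of a byte buffer in bits per byte, a rough lower bound on what any lossless coder could manage on that data (under the usual assumption of independent symbols): a flat grey "image" scores 0, pure noise scores close to 8.

[code]
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <math.h>

/* Shannon entropy in bits per byte: H = -sum p(i) * log2 p(i).
   0.0 = perfectly predictable, 8.0 = incompressible. */
double entropy(const unsigned char *buf, size_t n)
{
    size_t count[256] = {0};
    for (size_t i = 0; i < n; i++)
        count[buf[i]]++;

    double h = 0.0;
    for (int i = 0; i < 256; i++) {
        if (count[i] == 0) continue;
        double p = (double)count[i] / n;
        h -= p * log2(p);
    }
    return h;
}

int main(void)
{
    unsigned char flat[1000], noisy[1000];
    memset(flat, 127, sizeof flat);                    /* flat 50% grey */
    for (int i = 0; i < 1000; i++)
        noisy[i] = (unsigned char)(rand() & 255);      /* random noise */

    printf("flat : %.2f bits/byte\n", entropy(flat, 1000));   /* 0.00 */
    printf("noisy: %.2f bits/byte\n", entropy(noisy, 1000));  /* ~8.00 */
    return 0;
}
[/code]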

Apart from general interest in the topic, the main aim is to help people limbo under the various file size limits imposed by forums.

Here are some more experiments in Photoshop to illustrate that increased image detail (information) reduces the compression ratio.
I have used the same image size of 800 x 640 pixels, which is typical of images posted in the Gallery. For each test I created a coloured checkerboard pattern with an increasing number of squares. The JPEG Quality setting was 12 (maximum quality, minimum compression). Note that none of these images have camera noise.

53,290 bytes 1 square
53,965 bytes 4 squares
62,548 bytes 16 squares
109,098 bytes 64 squares
209,375 bytes 256 squares
334,432 bytes 1024 squares
654,468 bytes 4096 squares (a Quality setting of 6 just managed to get the file size, 275,236 bytes, below the forum's 300K limit)

Somewhere between 256 and 1024 squares we blew through this forum's 300K limit. To meet the limit, extra compression is required, with some loss in image quality.
The case of 4096 squares (each square roughly 12 x 10 pixels), which is not especially detailed, required a heavy compression setting of 6.
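
If you want to re-run these checker tests yourself, here is a minimal sketch of my own (the file name and red/blue colours are arbitrary): it writes an 800 x 640 checkerboard as a PPM file with SQUARES cells across and down, so SQUARES = 16 gives the 256-square test image.

[code]
#include <stdio.h>

#define W 800
#define H 640
#define SQUARES 16   /* cells across and down: 16 x 16 = 256 squares */

int main(void)
{
    FILE *f = fopen("checker.ppm", "wb");
    if (!f) return 1;
    fprintf(f, "P6\n%d %d\n255\n", W, H);

    for (int y = 0; y < H; y++)
        for (int x = 0; x < W; x++) {
            int cx = x * SQUARES / W;          /* which cell we are in */
            int cy = y * SQUARES / H;
            int red = (cx + cy) & 1;           /* alternate the colours */
            putc(red ? 255 : 0, f);            /* R */
            putc(0, f);                        /* G */
            putc(red ? 0 : 255, f);            /* B */
        }
    fclose(f);
    return 0;
}
[/code]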

Assuming you have already taken some action to reduce the noise levels, what is the best course of action for images that have large, highly detailed areas? For example, shots showing rough tree bark texture or fine grains of sand; even shots showing lots of grass blades present a problem. Well, to be honest, you are going to struggle to get these under the 300K barrier. In a few cases you will fail to compress the image with sufficient quality to warrant posting.

Using a quality setting of less than 5 will degrade the image so much that it will no longer show the fine detail you were trying to express. The next option is to reduce the image pixel count, either by cropping or re-sampling. If you crop tighter, try to exclude some of the detailed areas that are less important to the composition. If you downsize by re-sampling, use the bilinear method, which tries to preserve image detail. You may end up with a postage-stamp-sized image that is still not worth posting. As far as baseline JPEG is concerned it's a no-win situation. It might be worth exploring fractal-based compression.

In general, for the image to hit the optimum 200K with good quality, it must have sizeable areas with little or no detail, for example large expanses of blue sky, or out-of-focus or plain backgrounds. Even if you're borderline with these types of shot, you can often gain a bit of extra compression by lightly running the blur tool over the low-detail areas of the image.

Some readers are gluttons for punishment and keen for more detail on the Baseline JPEG compression method. I'll see if I can come up with a few more illustrations. In the meantime I am experimenting with a computer program written in C to further illustrate the Discrete Cosine Transform process.
__________________
Rob
#27 · 02-08-07, 00:44 · robski

For those who are interested, here is more information on the RGB to YCbCr colour space conversion performed during encoding.
The conversion separates the RGB image into monochrome (luminance, Y) and colour (chrominance, Cb and Cr) channels so that different levels of compression can be applied to the monochrome and colour components.

RGB to YCbCr conversion is done by taking different proportions of the RGB channels, as shown below.

YCbCr (256 levels) can be computed directly from 8-bit RGB as follows:
Y = 0.299 R + 0.587 G + 0.114 B
Cb = - 0.1687 R - 0.3313 G + 0.5 B + 128
Cr = 0.5 R - 0.4187 G - 0.0813 B + 128

Down-sampling is often applied to the Cb and Cr chroma channels after conversion.

The attached image shows the original image and its conversion to luminance and chrominance channels.

When the JPEG file is decoded, the YCbCr colour space needs to be converted back to RGB, as outlined below.
Before this happens, the chrominance channels have to be up-sampled to match the original image size.

RGB can be computed directly from YCbCr (256 levels) as follows:
R = Y + 1.402 (Cr-128)
G = Y - 0.34414 (Cb-128) - 0.71414 (Cr-128)
B = Y + 1.772 (Cb-128)
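
These formulas drop straight into code. Here is a minimal round-trip sketch in C using the coefficients above, clamping the results to the 0-255 range of 8-bit samples; the test colour is arbitrary.

[code]
#include <stdio.h>
#include <math.h>

static unsigned char clamp(double v)
{
    return v < 0 ? 0 : v > 255 ? 255 : (unsigned char)lround(v);
}

/* forward conversion: RGB -> YCbCr, using the coefficients above */
void rgb_to_ycbcr(unsigned char r, unsigned char g, unsigned char b,
                  unsigned char *y, unsigned char *cb, unsigned char *cr)
{
    *y  = clamp( 0.299  * r + 0.587  * g + 0.114  * b);
    *cb = clamp(-0.1687 * r - 0.3313 * g + 0.5    * b + 128);
    *cr = clamp( 0.5    * r - 0.4187 * g - 0.0813 * b + 128);
}

/* reverse conversion: YCbCr -> RGB */
void ycbcr_to_rgb(unsigned char y, unsigned char cb, unsigned char cr,
                  unsigned char *r, unsigned char *g, unsigned char *b)
{
    *r = clamp(y + 1.402   * (cr - 128));
    *g = clamp(y - 0.34414 * (cb - 128) - 0.71414 * (cr - 128));
    *b = clamp(y + 1.772   * (cb - 128));
}

int main(void)
{
    unsigned char y, cb, cr, r, g, b;
    rgb_to_ycbcr(200, 120, 60, &y, &cb, &cr);
    ycbcr_to_rgb(y, cb, cr, &r, &g, &b);
    printf("Y=%u Cb=%u Cr=%u -> R=%u G=%u B=%u\n", y, cb, cr, r, g, b);
    return 0;
}
[/code]
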
__________________
Rob
#28 · 02-08-07, 22:43 · robski

Here is a further illustration of the theory behind the Discrete Cosine Transform (DCT) process. Basically, the DCT process measures the strength of predefined sine wave patterns in the 8 x 8 pixel block, looking for a number of patterns at different frequencies.

Through Fourier's work came the concept that any shape can be broken down into a series of sine wave components of different frequencies and amplitudes. The mathematics is very complex, but a visual demonstration of the Fourier principle is easier to understand, so I have found a couple of links to Java applets which demonstrate the construction of complex shapes from a series of sine waves of different frequency and strength.

You may need to download a Java plug-in to get them to work if your web browser is not already Java-enabled.

http://www.earlevel.com/Digital%20Au...rmonigraf.html

The first one is very basic. It has 8 sliders which control the amplitude of each frequency, so you can see the effect of adjusting the amplitude of each component.
Two buttons are also provided, programmed to preset the sliders to produce a sawtooth and a square waveform.

http://www.chem.uoa.gr/applets/Apple..._Fourier2.html

The second link provides some dry theory on the left; on the right is the applet. Press the Clear button to start, then select one of the eight wave shapes to be demonstrated. The first time you click Add, the first harmonic frequency is displayed. Each further click of the Add button adds the next harmonic component (shown in the lower window), and the composite slowly builds up to the required waveform.
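
If those applet links ever die (old Java pages often do), the same demonstration takes only a few lines of C. This sketch of mine sums odd sine harmonics at amplitudes 1/n, the classic square-wave series: the more terms you add, the squarer the wave, which is Fourier's principle in miniature.

[code]
#include <stdio.h>
#include <math.h>

#define PI 3.14159265358979323846

/* partial sum of the square-wave Fourier series:
   f(t) = sin(t) + sin(3t)/3 + sin(5t)/5 + ... */
double square_approx(double t, int nharm)
{
    double sum = 0.0;
    for (int k = 0; k < nharm; k++) {
        int n = 2 * k + 1;             /* odd harmonics 1, 3, 5, ... */
        sum += sin(n * t) / n;
    }
    return sum;
}

int main(void)
{
    /* tabulate one cycle with 1, 3 and 20 harmonics: watch the shape
       sharpen towards a square wave as terms are added */
    for (int i = 0; i < 32; i++) {
        double t = 2.0 * PI * i / 32;
        printf("%5.2f %7.3f %7.3f %7.3f\n", t,
               square_approx(t, 1), square_approx(t, 3),
               square_approx(t, 20));
    }
    return 0;
}
[/code]
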
__________________
Rob
#29 · 04-08-07, 00:31 · robski

Let's set off by saying that the Discrete Cosine Transform is a tedious sequence of additions, subtractions and multiplications on all the rows and columns. It is not my intention to drill down that deep; besides, it would take an age to sit there with a calculator, pressing all the buttons to step through the sequence of processing an 8 x 8 array of image data. Best left to your PC's CPU or a dedicated chip in your camera.

The following information I pulled from a DCT microchip data sheet; my intention is just to give you an overview of the complex processing involved. Part of the process is to use the values of cosine waves of different frequencies (a cosine is the same as a sine; it just starts from a different point). It is senseless to keep calculating these values every time you want to use them, so they are stored in a table, an 8 x 8 matrix. The table stores a steady-value (DC) term and 7 frequency (AC) terms. The first frequency (the fundamental) is chosen so that the first half cycle covers the 8 image pixels; the other frequencies are harmonics of the fundamental. I have attached the table used by this microchip and plotted its values to show the DC term (0th) and the 7 other frequencies (1st to 7th terms).

The JPEG processing of the 8 x 8 image data uses a two-dimensional DCT. In practice this is executed as a one-dimensional DCT on the rows (left to right) of the 8 x 8 matrix, followed by another one-dimensional DCT on the columns (top to bottom).

The other attachment is a block diagram showing the processing of a one-dimensional DCT. The inputs are XK0 through to XK7, taken from the 8 image pixels in a row. The unit adds image pixels together for even-numbered rows, or subtracts one from the other on odd-numbered rows. These results are then multiplied by one of the cosine values from the table; the value used also depends on which part of the 8 x 8 image is being processed. Is your head hurting yet?

The results are stored in memory. Another one-dimensional DCT takes the values from the memory but works on them from top to bottom to give the final two-dimensional DCT terms.
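
Ahead of that, here is a hedged sketch of my own: the textbook direct form of the 8 x 8 two-dimensional DCT (mathematically equivalent to the rows-then-columns factorisation above, though far slower than the shift-and-add hardware the data sheet describes). Fed a vertical grey bar, it produces a DC term plus horizontal AC terms only.

[code]
#include <stdio.h>
#include <math.h>

#define N  8
#define PI 3.14159265358979323846

/* direct-form 2-D DCT of an 8 x 8 block: out[u][v] holds the strength of
   the pattern with vertical frequency u and horizontal frequency v */
void dct2d(double in[N][N], double out[N][N])
{
    for (int u = 0; u < N; u++)
        for (int v = 0; v < N; v++) {
            double cu = (u == 0) ? sqrt(0.5) : 1.0;
            double cv = (v == 0) ? sqrt(0.5) : 1.0;
            double sum = 0.0;
            for (int y = 0; y < N; y++)
                for (int x = 0; x < N; x++)
                    sum += in[y][x]
                         * cos((2 * x + 1) * v * PI / (2.0 * N))
                         * cos((2 * y + 1) * u * PI / (2.0 * N));
            out[u][v] = 0.25 * cu * cv * sum;
        }
}

int main(void)
{
    double img[N][N], dct[N][N];

    /* an 8 x 8 block containing a vertical grey bar */
    for (int y = 0; y < N; y++)
        for (int x = 0; x < N; x++)
            img[y][x] = (x == 3 || x == 4) ? 192 : 64;

    dct2d(img, dct);
    printf("DC term      : %.1f\n", dct[0][0]);  /* 8 x the block mean */
    printf("horizontal AC: %.1f\n", dct[0][1]);  /* non-zero */
    printf("vertical AC  : %.1f\n", dct[1][0]);  /* ~0: no vertical change */
    return 0;
}
[/code]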

In the next post I will show you the results from a C program I've been dabbling with that converts the 8 x 8 image values to DCT values and back to image values.
__________________
Rob
#30 · 08-08-07, 00:52 · robski

The first 3 attachments show the effect of the Discrete Cosine Transform (DCT) on 3 different samples of 8 x 8 pixel image data. Each attachment shows 3 tables. The upper table is the original 8 x 8 image data. The middle table shows the DCT terms after the transform; note that quantization has not been applied. The lower table shows the results of the reverse DCT, which is used to restore the image data. For the purposes of producing a compact table I have only displayed the results to one decimal place. Very small errors of around 0.001 between the original and restored values are seen if displayed to 3 decimal places, which goes to show that the forward and reverse DCT produce faithful results if quantization is not applied. The positive terms in the DCT table are the addition of a frequency component and the negative terms the subtraction of one. The first term is the DC component.

The fourth attachment shows the zig-zag collection of the AC terms after quantization. Because magnitudes are small at the higher frequencies, zig-zag scanning gives a greater chance of long runs of zeros for the run-length compression.
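
The zig-zag order itself is just a fixed walk over the 64 positions. Here is a sketch of my own that builds the order by walking the anti-diagonals, then counts the trailing run of zeros in a toy quantized block; the surviving terms in the example are made up.

[code]
#include <stdio.h>

/* Build the zig-zag scan order for an 8 x 8 block by walking the
   anti-diagonals in alternating directions: low frequencies come first,
   the weak high-frequency corner comes last. */
void zigzag_order(int order[64])
{
    int i = 0;
    for (int s = 0; s < 15; s++) {               /* anti-diagonal: x + y */
        if (s % 2 == 0)                          /* walk up and right */
            for (int y = (s < 8 ? s : 7); y >= 0 && s - y < 8; y--)
                order[i++] = y * 8 + (s - y);
        else                                     /* walk down and left */
            for (int x = (s < 8 ? s : 7); x >= 0 && s - x < 8; x--)
                order[i++] = (s - x) * 8 + x;
    }
}

int main(void)
{
    int order[64];
    zigzag_order(order);

    /* a toy quantized block: only the DC term and two low-frequency
       AC terms survived quantization */
    int block[64] = {0};
    block[0] = 18;  block[1] = -4;  block[8] = 3;

    int trailing = 0;
    for (int i = 63; i > 0 && block[order[i]] == 0; i--)
        trailing++;
    printf("trailing zeros after zig-zag: %d of 63 AC terms\n", trailing);
    return 0;
}
[/code]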

The first attachment uses an image with a vertical grey bar. Looking at the DCT terms, we can see that only horizontal frequency terms (components) are produced; no terms are seen down the table. This follows, as the vertical brightness levels in the image are constant.

The second attachment uses an image with a horizontal grey bar. Looking at the DCT terms, we can see that only vertical frequency terms (components) are produced; no terms across the table. This follows, as the horizontal brightness levels in the image are constant.

It is worth noting that after the zig-zag collection of terms there is a long sequence of zeros, which is ideal for good run-length compression.

The third attachment uses an image with a diagonal grey bar. The DCT table shows both vertical and horizontal frequency terms. Many of the terms are small and will reduce to zero after quantization. It is also worth noting that after the zig-zag collection there are fewer runs of zeros.

So when producing figures for compression, the DCT process favours images with strong vertical and horizontal elements over diagonal ones. Note that the DCT process does not perform any compression in itself.
__________________
Rob