An Open Design for Data Compression

Using Random Dot Stereograms

by Allan Spale, Class of 2000, Major in Computer Science

Abstract

A new method of compressing data is introduced based on the concept of a random dot stereogram. This article focuses on the design of an algorithm that achieves data compression via random dot stereograms and examines the process of encoding and decoding a data file. Random dot stereograms can be used for the two basic categories of compression: lossy compression (where an acceptable amount of data is lost) and lossless compression (where no data is lost).

1.  Introduction

Many people have seen pictures of what appear to be randomly colored dots in which, by staring straight ahead with the eyes “focused” at infinity, a three-dimensional image of a hidden object can be seen.  Through these two-dimensional pictures one sees objects with depth and dimension.  These brightly colored pictures are called random dot stereograms. This article describes how the concepts of building random dot stereograms can be used to compress data.

One of the back pages of an issue of Byte magazine from October 1997 [1] had an interesting group of three pictures. By looking at these three pictures in a certain way, the viewer could perceive “four-dimensional” images, or images in three dimensions that could be viewed from all sides and with object motion. This seemed like a good means for compressing some type of data; however, with limited understanding of 4-D objects, “three-dimensional” random dot stereograms (RDS) would be more practical. There was a problem, though, with typical RDS pictures. Books like Magic Eye had images with color, and additional color information present in the image would reduce the efficiency of this design [10]. An RDS that would use a minimal number of colors was needed.

The book Random Dot Stereograms [5] gave a design to create a binary (or two-colored) RDS. The book explained how to specify the depth information for every pixel in the RDS in order to create a depth map for the image hidden in the RDS. There were dozens of examples in the book of computerized images that were placed into this form.  They would reappear if the RDS (with the computerized images "hidden" in it) was viewed properly. One book referenced in [5], Foundations of Cyclopean Perception [4], provided the psychology of how the brain perceives depth.  It stated that attempting to view an RDS was an objective test of one's ability to view stereoscopic information without additional monocular depth cues such as relative brightness, linear perspective, or relative size.  Monocular cues are guides used by each eye to help establish depth perception.  This differs from binocular cues, like retinal disparity and convergence, that are necessary in viewing an RDS properly [8].  The use of only binocular cues and not monocular cues would simplify an algorithm that could decipher the hidden image in the RDS. The book also referenced a computer program that could interpret the depth map of an RDS, albeit with error. This would provide a way to view RDS images to regain the hidden depth map.

A paper from IEEE Transactions on Pattern Analysis and Machine Intelligence entitled "A Constraint Learning Feedback Dynamic Model for Stereopsis" [2] provided an algorithm, but no program code, that would enable one to recover the depth image in a double image RDS. The tests reported in the paper seemed to imply that this data compression algorithm was feasible and produced minimal error under certain assumptions. Thus, high-quality lossy data compression (where an acceptable amount of data is lost) seemed likely, and lossless data compression (where no data is lost) still seemed feasible.

The goal of any data compression algorithm is to remove as much redundancy as possible from a file and replace the redundancy with a "shorthand" notation of symbols that uses less space than the replaced, redundant symbols.  Formally, for a file F, F' is a compression of F if the size of F' is smaller than the size of F and there exist a transform (the compression algorithm) from F to F' and a transform (the decompression algorithm) from F' to F.  In the case of lossless compression, the decompression of F' should result in a file of the same size as F, with the arrangement and values of its data exactly as in the original file.  Conversely, with lossy compression, the decompression of F' results in a file that is somewhat different from the original, depending upon the amount of detail discarded during the compression process.
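The lossless case can be checked mechanically. The sketch below is not part of the RDS design; it simply uses Python's standard zlib module to illustrate the two transforms and the size requirement:

```python
import zlib

# A redundant file F compresses well; the lossless property requires
# that decompressing F' recovers F exactly, bit for bit.
F = b"ABABABAB" * 128           # 1024 bytes of highly redundant data
F_prime = zlib.compress(F)      # compression transform: F -> F'

assert len(F_prime) < len(F)            # F' is smaller than F
assert zlib.decompress(F_prime) == F    # decompression transform recovers F
```

A lossy compressor (for example, one used for images or audio) would fail the second assertion by design, trading exact recovery for a smaller F'.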

2. An RDS Algorithm

Since this article focuses on the design of an algorithm that achieves data compression via RDS, the process of encoding and decoding the file will be explained in generic terms.  Every compression/decompression algorithm has two main components: an encoding or compression phase and a decoding or decompression phase. A third component, error correction, may or may not be present in the algorithm.  In our example, the file contents must be encoded into an RDS.  The contents of the file would later be decoded using a machine vision algorithm to determine the depth values of the RDS.  The system is not perfect, meaning that the decompression algorithm used to interpret the depth values of the RDS will rarely be 100% correct.  The RDS decompression algorithm typically finds some amount of ambiguity in the RDS, so the depth values recovered from the RDS do not necessarily match those in the original uncompressed file.  Hence, error correction must be applied to the compressed file in order to ensure perfect data integrity, or at least an acceptable loss of data integrity.

RDS Data Compression

The algorithm uses a double image random dot stereogram, meaning that there is a left table and an equal-sized right table. The left table consists of a random selection of symbols generated by a known pseudorandom number generator, so only the seed number needs to be stored to regenerate the left table.  The right table contains the depth data, which is generated by an RDS algorithm [5].  Working with one row at a time, from left to right, the algorithm uses a “sliding depth window” to determine the correct symbol for each right table element. A summary of the algorithm is the following:

1. Determine the depth for the current table element.
2. To determine the symbol for this element, count the number of symbols in the depth window starting from the left, corresponding to the depth value.
3. Assign that symbol to the current table element.
4. Move the depth window to the right by one table element.
5. Go to step 1 until the last symbol of the right table is determined.

If the data in the original data file is based on n-bit “symbols,” then there are 2^n possible symbol values. For example, if the data is based on 3-bit symbols, then there are 2^3 = 8 possible values. A “depth” is associated with each of the possible symbol values in the data file.  In the RDS compression algorithm, the elements in the table are restricted to two possible symbols (0 and 1).  One could think of a file of n bits as containing n one-bit symbols; these symbols are grouped according to how the program that uses the data interprets them.  In the case of compression, the file to be compressed is grouped according to the maximum permitted depth level for the RDS.  Each symbol in the data file is then represented as one bit (a 0 or a 1) in the table.  With eight depth values, the amount of information represented by each symbol is three bits (2^3 = 8), while its representation in the table is one bit.  This indicates that (without error correction) there is a compression of two bits per symbol.
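The two-bits-per-symbol saving can be verified with a quick calculation. This is a minimal sketch of the accounting only, ignoring the stored seed and any error-correction overhead:

```python
# Per-symbol accounting for 3-bit symbols encoded as one-bit RDS table
# entries (idealized: seed and error-correction overhead are ignored).
SYMBOL_BITS = 3       # 2**3 = 8 possible depth values per symbol
RDS_BITS    = 1       # each Right Table entry is a single binary symbol

n_symbols = 8         # e.g. the example row of symbols 10725154
original  = n_symbols * SYMBOL_BITS    # 24 bits in the original file
encoded   = n_symbols * RDS_BITS       # 8 bits in the right table

assert original - encoded == 2 * n_symbols   # two bits saved per symbol
```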

Example: To compress a row of eight symbols 10725154 in octal notation, a pseudo random number generator is used to generate symbols in the Left Table.  Assume that the seed s generated the Left Table entries in positions L1, … , L8:

 Left Table                  Right Table
 L1 L2 L3 L4 L5 L6 L7 L8    R1 R2 R3 R4 R5 R6 R7 R8
  0  1  0  1  0  0  1  1

There are eight symbols in the row, and there will be eight “depths.”  The symbol assigned to the Right Table in each of the entry positions R1, ... R8 is dependent on the depth (0 through 7) of the corresponding symbol position from the original 10725154. Thus the respective depths of the positions are  1, 0, 7, 2, 5, 1, 5, and 4.

For R1, the algorithm would use the depth table:

 Depth Table for R1
 L1 L2 L3 L4 L5 L6 L7 L8
  0  1  0  1  0  0  1  1

The symbol assigned to position R1 has to have depth 1.  The algorithm for R1 would choose the symbol in L1 if the depth was 0, the symbol in L2 if the depth was 1, the symbol in L3 if the depth was 2, etc.  Since the depth is 1, the symbol in position R1 is the symbol in L2, which is 1.

 Left Table                  Right Table
 L1 L2 L3 L4 L5 L6 L7 L8    R1 R2 R3 R4 R5 R6 R7 R8
  0  1  0  1  0  0  1  1     1

For R2, the algorithm would use a shifted depth table:

 Depth Table for R2
 L2 L3 L4 L5 L6 L7 L8 R1
  1  0  1  0  0  1  1  1

Thus for R2, choose the symbol in L2 if the depth was 0, the symbol in L3 if the depth was 1, the symbol in L4 if the depth was 2, ... , and the symbol in R1 if the depth was 7.  Since the depth is 0, the symbol in position R2 becomes 1.

 Left Table                  Right Table
 L1 L2 L3 L4 L5 L6 L7 L8    R1 R2 R3 R4 R5 R6 R7 R8
  0  1  0  1  0  0  1  1     1  1

This process continues until the full table is generated.

 Left Table                  Right Table
 L1 L2 L3 L4 L5 L6 L7 L8    R1 R2 R3 R4 R5 R6 R7 R8
  0  1  0  1  0  0  1  1     1  1  1  0  1  1  0  0

Thus, a seed s (to generate the Left Table) and the binary representation 11101100 is the compression of the symbols 10725154. (Note that 10725154 in base 8 is 001 000 111 010 101 001 101 100 in base 2.)
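The five steps of the algorithm, applied to this worked example, can be sketched in a few lines of code. This is only an illustration of the sliding depth window, not the full RDS generator of [5]:

```python
def rds_encode(left, depths):
    """Sliding-depth-window encoder: the symbol for right-table entry i
    is found at index depths[i] in a window that starts at left[i] and
    wraps into the already-generated right-table entries."""
    right = []
    for i, d in enumerate(depths):
        window = left[i:] + right[:i]   # window slides one element per step
        right.append(window[d])         # pick the symbol at the depth offset
    return right

left   = [0, 1, 0, 1, 0, 0, 1, 1]   # Left Table from the worked example
depths = [1, 0, 7, 2, 5, 1, 5, 4]   # depths of the octal symbols 10725154

print("".join(map(str, rds_encode(left, depths))))   # -> 11101100
```

Running the sketch reproduces the Right Table generated step by step above.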

Using this type of RDS generation, the depth window moves in relation to the current symbol being created in the RDS. Once a row is completed, the depth window begins at the leftmost symbol in the next row.  Once all of the rows are completed, the depth map will have been completely generated. For code to generate this particular RDS, known as the dual image random dot stereogram (DIRDS), refer to [5]. There are other versions of RDS generation that have improved upon this basic design  (See [9]).

Error Correction

Error correction is a general method for ensuring data integrity and is prevalent in data transmission applications.  For example, an error detection technique called a checksum is used in data transmission.  It checks a special number in the transmission (used to determine if error correction is necessary) derived from the bit pattern of the data.  Since a formula is used to derive this number on the sending end, the receiving end must use the same formula to derive it; if the derived number does not match, an error is assumed to have occurred [3].  Another example is the use of odd parity, which ensures the integrity of data stored in a computer's Random Access Memory (RAM).  Every byte of memory written has an extra bit, called a parity bit, set according to whether there is an even or odd number of 1's in the byte.  If there is an even number of 1's, the parity bit is 1; otherwise, it is 0.  When reading the byte in memory, if there is an even number of 1's and the parity bit is not set to 1, it will be known that an error has occurred.  This is because, including the parity bit, odd parity requires that there always be an odd number of 1's for each byte [6].
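The odd-parity scheme described above can be sketched directly; this is only an illustration of the error-detection idea, not part of the RDS design:

```python
def odd_parity_bit(byte):
    """Return the parity bit that makes the total count of 1s odd:
    1 if the byte has an even number of 1s, else 0."""
    ones = bin(byte).count("1")
    return 1 if ones % 2 == 0 else 0

def check_odd_parity(byte, parity_bit):
    # Including the parity bit, the total number of 1s must be odd.
    return (bin(byte).count("1") + parity_bit) % 2 == 1

b = 0b10110100                  # four 1s (even), so the parity bit is 1
p = odd_parity_bit(b)
assert p == 1 and check_odd_parity(b, p)

flipped = b ^ 0b00000100        # a single-bit memory error...
assert not check_odd_parity(flipped, p)   # ...is detected by the check
```

Note that parity detects any single-bit error but cannot locate or correct it, which is why the RDS design needs the richer error records described below.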

The RDS method does not preserve a one-to-one relationship between the symbol being encoded and the symbols in the depth window. Therefore, to preserve the data encoded into the RDS, additional information must be provided. In order to determine what degree of error correction is necessary, the RDS must be sent to the decoding process while the correct values of all symbols in the RDS can still be verified.

The method of storing the error correction values can be described in a general way. For each incorrectly decoded symbol, its position should be recorded along with the correct depth value. This error correction method can be quite burdensome and reduce the level of compression in the resulting file, so consideration should be given to representing the redundancy of values in the error correction area more concisely. For example, if the range of depth values is a subset of all the values expressible in the binary representation used to store each depth value, then a more compact binary representation may be used, so long as the complete range of depth values can be accurately maintained. Also, instead of storing the exact location of each incorrect symbol in the RDS, a relative location may be stored indicating how many correct symbols occurred before the next symbol with an incorrect depth value. In this case, the first symbol having an incorrect depth value must have its absolute location stored so that the relative locations that follow can be resolved to the correct positions of the remaining incorrect symbols.
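One way to realize the relative-location scheme just described is sketched below. The record layout (first absolute position, then gaps of correct symbols) is an assumption made for illustration; the article does not fix a concrete format:

```python
def encode_error_positions(positions):
    """Store the first wrong symbol's absolute position, then, for each
    later wrong symbol, the count of correct symbols before it."""
    records = [positions[0]]
    for prev, cur in zip(positions, positions[1:]):
        records.append(cur - prev - 1)   # correct symbols in the gap
    return records

def decode_error_positions(records):
    """Invert the encoding: accumulate gaps back into absolute positions."""
    positions = [records[0]]
    for gap in records[1:]:
        positions.append(positions[-1] + gap + 1)
    return positions

errs = [12, 15, 40, 41]                  # hypothetical wrong-symbol positions
packed = encode_error_positions(errs)    # -> [12, 2, 24, 0]
assert decode_error_positions(packed) == errs
```

Small gap values can then be stored in fewer bits than full absolute positions, which is the point of the relative representation.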

The decompression part of this design involves the use of a neural network.  A neural network is modeled on the network of neurons in a living thing, in which neurons transmit information or cause an action.  In living things that have a brain, information gathered from the senses helps the brain learn facts or actions, or perform some other cognitive process.  Since this organization assists living things, particularly humans, with various cognitive processes, it is replicated in computer algorithms designed, for example, to learn from data or recognize patterns [7].

In order to more successfully decode an RDS, the algorithm [2] uses a type of neural network that allows the algorithm to approximate the depth map reconstruction based upon a set of random dot stereograms it has already decoded.  This process is known as training the neural network.  During the training session, the depth map for each decoded RDS is available so that the network may work toward the best approximation of the depth map by assigning a connection strength to each neuron.  The state of the neurons in the network from one RDS is then applied to try to decode the next RDS, again with the depth map available so that additional adjustments can be made to the neural network.

Care must be taken in training the neural network.  An appropriate training set should be chosen that reflects the types of files the system will handle.  Since I have only designed the general structure of this compression-decompression system, I am uncertain whether it would be more advantageous to apply different networks to different general filetypes or to use one network for all filetypes.  Additionally, one should monitor the training session, although this is not strictly necessary if a target accuracy for the neural network is specified before proceeding to the next item in the training set.  The reason is that the process of refining the connection strengths can be repeated too many times or too few; in either case, the network may not be properly trained to decode the general patterns that appear frequently within random dot stereograms.

Another important aspect of this compression-decompression design that I am unsure of is the size of the file containing the connection strengths for neurons in a trained neural network.  If it is large, it may not be feasible to have distinct trained networks for distinct general filetypes, although this would probably improve overall performance.  Also, it may not be feasible for the file to take up so much space on the storage area of the user's computer.  These and other questions will be answered once the algorithm is coded.

RDS Data Decompression

The decoding process is simple, since the configured neuron values are used by the algorithm to determine the depth values.  No interpretation occurs during decompression using the algorithm from [2].  However, a person who has the capacity to see with both eyes goes through an elaborate process to determine depth information from the images received in each eye. This process must be replicated as efficiently as possible (both in time and in the degree of correct interpretation) in an algorithm that is executable by the computer. The algorithm provided in [2] is a possible starting point for developing computer code to recover the depth values for many of the symbols of the RDS. The reader is encouraged to use [2] as a basis for the decompression design, although by now there are probably more efficient machine vision algorithms that improve upon the original. However, it is important to realize that by using the error correction for this RDS, all of the depth values for all symbols of the RDS may be correctly recovered.
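The ambiguity that makes error correction necessary is easy to see with a naive decoder. The sketch below is not the neural-network algorithm of [2]; it merely enumerates, for one right-table entry from the worked example, every depth whose window symbol matches, showing that the true depth is only one of several candidates:

```python
def candidate_depths(left, right, i):
    """All depths d whose window symbol matches right[i].  With binary
    symbols, roughly half the window matches, hence the ambiguity."""
    window = left[i:] + right[:i]
    return [d for d, s in enumerate(window) if s == right[i]]

left  = [0, 1, 0, 1, 0, 0, 1, 1]   # Left Table from the worked example
right = [1, 1, 1, 0, 1, 1, 0, 0]   # Right Table from the worked example

# The true depth for R1 is 1, but a naive match cannot single it out.
print(candidate_depths(left, right, 0))   # -> [1, 3, 6, 7]
```

Resolving this ambiguity across neighboring entries is exactly what the trained network (plus the stored error records) is for.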

3. Future Improvements and Conjectures

This article would not be complete without my ideas for improving upon this data compression design. This final section will be divided into subsections so as to help the reader find an area of interest to research relevant to this data compression design. These particular ideas and improvements are only hypotheses. I have not tested any of these concepts as to their feasibility.

4th Dimensional Data Representation Using RDS

Just as I formulated the idea for using the RDS from reading the Byte article (see [1]) on 4-D imaging, I think that it might be possible for an RDS to be applied to the methodology for generating 4-D images. Since a property of the objects in a 4-D image is movement, some care must be taken by the machine vision algorithm that views the image to correctly interpret the 4-D RDS.

Overcoming Limitations of RDS Data Compression

There are two limitations in using RDS data compression.  One is obstruction, where higher depth values tend to obstruct lower depth values in the same region of an image.  The second is ambiguity, where the decompression algorithm cannot determine the initial depth value. There are two possible ways to compress files with a high degree of obstruction, both of which change the way the file is encoded into the RDS image. The first is to enlarge the pixels of the image: instead of encoding the image pixel by pixel, let one pixel from the original file be enlarged into a small square or a small horizontal or vertical line. This method creates a larger encoded file, adding some redundancy, with the hope that a larger percentage of data compression will offset it. The second is to globally scale the image and reduce the range of depth values; that is, keep the range of possible depth values large while reducing the allowed depth values to a reasonably small number.  Of course, a limit exists as to how extreme this can become.

The problem of ambiguity may be attacked by increasing the number of symbols.  With this approach, the decompression algorithm has a greater probability of matching symbols that give the correct depth value.  Of course, adding additional symbols will likely increase the size of the compressed files.

In all of these approaches, the goal is to minimize the size of the compressed file.  While additional information in the form of extra symbols, increased pixel size, or a reduced depth range lessens obstruction and ambiguity, it also increases the final compressed file size.  Thus a balance must be found between adding additional information and maintaining an acceptable amount of compression.

4.  Conclusion

This article is preliminary work on RDS data compression and it is designed to change the way methodologies for data compression are viewed. RDS data compression offers an alternative method for data compression, with the possibility of  providing greater space savings than traditional data compressors.

5.  Acknowledgements

The author would like to thank his family, friends, and the following faculty from Elmhurst College who patiently listened and advised him: Mr. James Dauer, Dr. Phyllis Kowalke, Dr. John Jeffrey, and Dr. William Muellner.  The author would especially like to thank Dr. Jon Johnson, who not only patiently helped the author fix up and rewrite portions of this paper for publication but also helped make the information provided in the paper more understandable.

6.  References

1.   Abrahams, Marc. Something to Stare At. Byte. 22,11 (Nov. 1997), 164.

2.   Bokil, Amol and Khotanzad, Alireza. A Constraint Learning Feedback Dynamic Model for Stereopsis. IEEE Transactions on Pattern Analysis and Machine Intelligence. 17, 11 (Nov. 1995), 1095-1100.

3.   internet.com Corp. "Checksum." 1 Sept. 1997: n.pag. On-line. Internet. 7 Jun. 1997.  Available WWW: http://webopedia.internet.com/Communications/Error_Correction/checksum.html.

4.   Julesz, Bela. Foundations of Cyclopean Perception. University of Chicago Press, Chicago, Illinois, 1971.

5.   Kinsman, Andrew A. Random Dot Stereograms. Kinsman Physics, Rochester, New York, 1992.

6.   Kozierok, Charles M. "Parity Checking." 17 Apr. 2000: n-pag. On-line. Internet. 7 Jun. 2000. Available WWW: http://www.pcguide.com/ref/ram/err.htm.

7.   Lawrence, Jeannette. Introduction to Neural Networks 6th ed.  California Scientific Software Press, Nevada City, California, 1994.

8.   Myers, David. Exploring Psychology 2nd ed. Worth Publishers, New York, 1993.

9.   Thimbleby, Harold W., Inglis, Stuart, and Witten, Ian H. Displaying 3D Images: Algorithms for Single-Image Random-Dot Stereograms. Computer. 27,10 (Oct. 1994), 38-48.

10.   The Walt Disney Company and N.E. Thing Enterprises, Inc. 1995. Disney's Magic Eye: a Book of Postcards. Hyperion, New York, New York, 1995.

Allan Spale received his Bachelor of Science degree in February, 2000 from Elmhurst College.  He majored in computer science with a minor in mathematics.  He is enrolled in a masters degree program in computer science at the University of Illinois at Chicago.