
The Concept of REDUNDANCY


Copyright © by V. Miszalok and V. Smolej, last update: 17-05-02

 Introduction
 Definition of INFORMATION
 Examples of INFORMATION
 Definition of REDUNDANCY
 Examples of REDUNDANCY
 Comparing examples of REDUNDANCY
 Summary
 Categorization of information science
 Understanding INFORMATION and REDUNDANCY
 Genetic Code
 Appendix


Introduction

The word REDUNDANCY stems from the Latin verb "redundare", which means to overflow or to be available in ample supply. REDUNDANCY can mean something positive, like overflow in the sense of wealth, or something negative, like ballast. This double meaning makes the term interesting from the point of view of information theory. We will see that REDUNDANCY carries both connotations at the same time: it can mean wealth or ballast, depending on who receives the information.

 

Definition of INFORMATION

The following definitions are taken (in a very much simplified form) from the contribution of Warren Weaver to the classic book by C.E. Shannon: "The Mathematical Theory of Communication", Univ. of Illinois Press, 1949.

 

Examples of INFORMATION

Example 1.1: A weather report limited to the alternatives sun / no sun carries 1 bit of INFORMATION:

Sunny? | Code
yes | 1
no | 0

Example 1.2: A slightly more detailed weather report that answers two simple yes/no questions: will it be sunny or cloudy, and will it be warm or cold?

Sunny? | Warm? | Code
yes | yes | 11
no | yes | 01
yes | no | 10
no | no | 00

The INFORMATION content of the answer is 2 bits.

From the point of view of information theory, the word "INFORMATION" means something rather different from its everyday use. It has nothing to do with the content of the message and everything to do with its length, or more precisely with the minimal length necessary to convey the answer to a complex question, which, as we have seen, can be reduced to a series of simple questions (Q1: will it be sunny?, Q2: ...). The length of the message, in other words the INFORMATION, depends on how complex the question is. The larger the number of possible answers, the greater the INFORMATION.
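Examples 1.1 and 1.2 are consistent with measuring INFORMATION as the base-2 logarithm of the number of possible answers. The following Python sketch works under that assumption and additionally assumes that all answers are equally likely; the function name `information_bits` is chosen here for illustration and does not come from the original lecture.

```python
import math


def information_bits(num_possible_answers: int) -> float:
    """INFORMATION in bits, assuming all answers are equally likely."""
    if num_possible_answers < 1:
        raise ValueError("there must be at least one possible answer")
    return math.log2(num_possible_answers)


# Example 1.1: sunny / not sunny -> 2 possible answers
print(information_bits(2))  # 1.0
# Example 1.2: two yes/no questions -> 4 possible answers
print(information_bits(4))  # 2.0
```

Each additional yes/no question doubles the number of possible answers and therefore adds exactly one bit.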

Example 1.3: There are more telephones in the world than credit cards, so a telephone number must carry more INFORMATION than a credit card number. How come credit card numbers are substantially longer than telephone numbers?
Hint: why would anybody want a long telephone number? What about credit cards? The extra length beyond the minimum length (i.e. beyond the INFORMATION) is the REDUNDANCY, and it may be either wealth or ballast, depending on what the information is used for.

 

Definition of REDUNDANCY

If the message is longer than strictly necessary, i.e. it is longer than the INFORMATION of the message, then the code contains REDUNDANCY.

Definition 2: REDUNDANCY is the base-2 logarithm of the ratio of the code length to the INFORMATION of the message:

REDUNDANCY = log2 ( Code length / INFORMATION )

The REDUNDANCY can never be negative. It is zero when the code is as short as it can be (the code length is identical to the INFORMATION, which is seldom the case). It is 1.0 when the code length is 2 x INFORMATION. The REDUNDANCY present in different messages is ultimately a compromise between efficiency concerns (one may want to keep the length, and thus the transmission time, down) and security concerns (if we know that things may happen to the message on the way, for instance that somebody may want to intercept and possibly change it, we will try to counteract this ahead of time).
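Definition 2 and its two special cases can be checked with a few lines of Python. This is a minimal sketch; the function name `redundancy` and the argument checks are assumptions added here, not part of the original definition.

```python
import math


def redundancy(code_length_bits: float, information_bits: float) -> float:
    """REDUNDANCY = log2(code length / INFORMATION), as in Definition 2."""
    if information_bits <= 0:
        raise ValueError("INFORMATION must be positive")
    if code_length_bits < information_bits:
        # A code shorter than the INFORMATION cannot carry the message,
        # which is why the REDUNDANCY never becomes negative.
        raise ValueError("code length must not be below the INFORMATION")
    return math.log2(code_length_bits / information_bits)


print(redundancy(2, 2))  # 0.0: the code is as short as it can be
print(redundancy(4, 2))  # 1.0: the code length is 2 x INFORMATION
```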

 

Examples of REDUNDANCY

Example 2.1: We decide to send our weather report via teletype, using two 8-bit characters:

Weather forecast | Code
Sunny & Warm | SW
Sunny & Cold | SC
Cloudy & Warm | CW
Cloudy & Cold | CC

The INFORMATION did not change and still amounts to 2 bits.
The code length however increased from 2 to 16 bits.
The REDUNDANCY of this code is log2(16/2) = log2(8) = 3.

Example 2.2: SW above means Sunny and Warm if you read weather forecasts on a regular basis. It could just as well mean South West if you are backpacking while reading the weather report. Likewise, SC (Sunny and Cold) could be taken to mean South Carolina. So why not write out the message in plain English, as in the left column of the table in Example 2.1?
The INFORMATION is still 2 bits, but with 16 characters per message the code length increases to 16*8 bits. The REDUNDANCY of this code is log2(16*8/2) = log2(64) = 6. Note that in practice more than 4 types of weather are usually to be expected, so strictly speaking it is incorrect to count all of the extra length in this example as redundancy.

Example 2.3: Instead of writing out the message we decide to use icons. They are stored as 32x32 images with 8 bits per pixel. The code length thus increases to 8192 bits. The REDUNDANCY of this code is log2(8192/2) = log2(4096) = 12.
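The arithmetic of Examples 2.1 to 2.3 can be reproduced as follows. This is only a sketch: the code lengths are taken from the examples above, while the variable names and dictionary labels are chosen here for illustration.

```python
import math

INFORMATION_BITS = 2  # four possible forecasts, as in Example 1.2

# Code length in bits for each encoding of the same forecast.
code_length_bits = {
    "2.1 (two 8-bit characters)": 2 * 8,
    "2.2 (sixteen 8-bit characters)": 16 * 8,
    "2.3 (32x32 icon, 8 bits per pixel)": 32 * 32 * 8,
}

# REDUNDANCY = log2(code length / INFORMATION) for each encoding.
redundancy = {
    name: math.log2(length / INFORMATION_BITS)
    for name, length in code_length_bits.items()
}

for name, length in code_length_bits.items():
    print(f"Example {name}: {length} bits, REDUNDANCY {redundancy[name]:g}")
```

Because the INFORMATION stays fixed at 2 bits, every doubling of the code length adds exactly 1 to the REDUNDANCY.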


 

Comparing examples of REDUNDANCY

Let's compare Example 1.2 with Example 2.3:

Property | Example 1.2 with 2-bit code | Example 2.3 with icon code
INFORMATION | 2 bits | 2 bits
Code length | 2 bits | 8*32*32 = 8192 bits
REDUNDANCY | log2(1) = 0 | log2(4096) = 12
Transmission time, telephone costs | minimal | excessive
A small transmission error ... | destroys everything | makes no difference
When used between people | no good: you need a handbook to be able to communicate | optimal: self-evident, no reading ability (illiterate users!) or language experience required
When used between computers | optimal: just two IF statements needed | very difficult: there are so many possible ways to design an icon
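The "two IF statements" needed to read the 2-bit code of Example 1.2 could look like the following Python sketch. The function name `decode_forecast` and the input check are assumptions made here for illustration; the bit assignment (first bit: sunny?, second bit: warm?) follows the table of Example 1.2.

```python
def decode_forecast(code: str) -> str:
    """Decode the 2-bit code: first bit = sunny?, second bit = warm?."""
    if len(code) != 2 or any(bit not in "01" for bit in code):
        raise ValueError("expected exactly two characters, each '0' or '1'")

    # First IF: sunny or cloudy?
    if code[0] == "1":
        sky = "Sunny"
    else:
        sky = "Cloudy"

    # Second IF: warm or cold?
    if code[1] == "1":
        temperature = "Warm"
    else:
        temperature = "Cold"

    return f"{sky} & {temperature}"


print(decode_forecast("11"))  # Sunny & Warm
print(decode_forecast("01"))  # Cloudy & Warm
```

Decoding the icon code of Example 2.3 would instead require some form of image recognition, which is far harder to write down.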
 

Summary

The lower the number of possible answers to a given question, the less INFORMATION is needed. Conversely, the higher the uncertainty, the more INFORMATION is needed.
The amount of REDUNDANCY required is one of the prime factors in deciding which code to use for communication. Zero REDUNDANCY is nearly always unfavorable, because such a code is extremely error-prone and unreadable. With increasing REDUNDANCY the code becomes more and more fault-tolerant, and it takes less effort for us to use the code and understand the message. We pay for the increased REDUNDANCY with an increase in the required bandwidth and with the additional effort needed to make computers handle the code.
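The link between REDUNDANCY and fault tolerance can be illustrated by flipping single bits. The sketch below compares the 2-bit code of Example 1.2 with the two-character code of Example 2.1 and counts how many single-bit errors turn one valid code word into another valid one, so that the error goes unnoticed. It assumes that the teletype characters are transmitted as 8-bit ASCII, which the lecture does not state; the function name is likewise chosen here for illustration.

```python
def undetected_single_bit_errors(codewords):
    """Return (undetected, total) over all single-bit flips of all code words.

    A flip is undetected if the damaged word is itself a valid code word.
    """
    valid = set(codewords)
    undetected = 0
    total = 0
    for word in codewords:
        for i, bit in enumerate(word):
            flipped = word[:i] + ("1" if bit == "0" else "0") + word[i + 1:]
            total += 1
            if flipped in valid:
                undetected += 1
    return undetected, total


# Example 1.2: zero REDUNDANCY, every 2-bit pattern is a valid message.
two_bit_code = ["11", "01", "10", "00"]

# Example 2.1: two characters per message, assumed here to be 8-bit ASCII.
teletype_code = [
    "".join(format(ord(ch), "08b") for ch in word)
    for word in ["SW", "SC", "CW", "CC"]
]

print(undetected_single_bit_errors(two_bit_code))   # (8, 8)
print(undetected_single_bit_errors(teletype_code))  # (4, 64)
```

With zero REDUNDANCY every single-bit error silently produces a different forecast. With the two-character code most single-bit errors yield an invalid word and can at least be noticed; under the ASCII assumption, the only exceptions are the flips that turn S into C or C into S, because those two characters differ in a single bit.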

Humans love REDUNDANCY, computers hate it.

People do not like codes with low REDUNDANCY (telephone numbers, car license plates, account numbers). Computers do not like codes with high REDUNDANCY (spoken language, pictures, music). They do not know how to strip away the REDUNDANCY and retain the INFORMATION, something that comes very naturally to a human. This is also the point where the difficulties in the man-machine dialogue arise: in the transformation between codes with radically different REDUNDANCIES. When talking to a machine, for instance via a keyboard, the human operator must discard a sizeable amount of his or her natural REDUNDANCY. When giving its answers, the machine must try to overcome its natural poverty in REDUNDANCY and use redundancy-rich codes (a monitor instead of a teletype, a color monitor instead of a black and white one, etc.).

Categorization of information science

Using REDUNDANCY as a guideline we can see the following fields of informatics: