
Coding Theory
When we think of codes, we often think of secret codes. Some of us may remember having secret decoder rings when we were young. Certainly the transmission of codes, which theoretically can be understood only by the person who is supposed to receive them, is important for military purposes and to ensure privacy on networks and other places where information may be available to unauthorized personnel. Along with the creation of these codes, there is also the breaking of codes: the intercepting and decoding of messages by the above-mentioned unauthorized personnel. This is serious work for the military and others who are interested in securing the integrity of their own codes and breaking the codes of others. It is also an increasingly popular hobby. Thus there is a contest between those who break codes and those who try to create codes that cannot be broken. Perhaps the most important case of codebreaking was the breaking of the German Enigma code during World War II. For details see Kahn [55] and [54]. The branch of code theory concerned with all this is called cryptology.
In this chapter, however, we will be concerned with a different type of code. As in Section 15.3, we define a code to be a representation of a set of symbols as strings of 1's and 0's. This set of symbols usually includes the letters of the alphabet, the numbers on the keyboard, and, often, control symbols. These codes are representations of symbols as binary strings for use by computers in the storage and transmission of data. There are several properties which we would like our codes to have. Unfortunately, we shall find that some of these properties are not mutually compatible.
The most important property of a code is that, when a message is expressed as a binary string consisting of the concatenation of elements of the code, this concatenation is unique. When a message is decoded, there must be no problem in deciding which letters the code represents. We call such a code a uniquely decipherable code (see Definition 17.8).
There are several ways to achieve this goal. One way is to encode all symbols with binary strings of the same length. Such a code is called a block code. For example, if each symbol is encoded using 8 bits, then we would know that at the end of each eight bits we would have a code string representing a symbol of the transmitted message. The block code is particularly useful if it is necessary to limit the length of the code for each symbol or letter sent.
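As a small illustration (our own sketch, not part of the text), decoding a block code amounts to cutting the received stream into pieces of the fixed length and looking each piece up; the 8-bit codes in the table are the standard ASCII patterns for A, B, and C.

# Decoding a block code: every codeword has the same width, so the
# stream splits unambiguously into symbols.
BLOCK = {"01000001": "A", "01000010": "B", "01000011": "C"}  # 8-bit ASCII codes

def decode_block(bits, table, width=8):
    # Cut the stream into fixed-width blocks and look each block up.
    assert len(bits) % width == 0, "stream must be a whole number of blocks"
    return "".join(table[bits[i:i + width]] for i in range(0, len(bits), width))

print(decode_block("010000010100001101000010", BLOCK))  # prints ACB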
Another method for constructing a uniquely decipherable code is to use a prefix code. This was defined in Definition 17.10 and discussed in the section on weighted trees. We defined a code C to be a prefix code if it has the property that an element of the code cannot be the beginning string of another element of the code. Thus when we have read a string of 1's and 0's which represents a symbol, we know this the moment the string for that symbol is completed. A prefix code is also called an instantaneous code.
One type of prefix code is the comma code. Each symbol is encoded into a string consisting of a string of 1's followed by a 0 at the end. Thus the set of strings of the code has the form {0, 10, 110, 1110, 11110, ...}. This code has the obvious disadvantage that the elements of the code are going to be very long and take up a lot of storage space.
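The following sketch (ours; the symbol assignments are hypothetical) shows why a comma code is instantaneous: each codeword ends with its first 0, so a symbol can be emitted the moment that 0 arrives, with no lookahead.

# Instantaneous decoding of a comma code: a run of 1's terminated by a 0.
CODE = {"0": "e", "10": "t", "110": "a", "1110": "o"}  # hypothetical assignment

def decode_comma(bits, table):
    out, current = [], ""
    for b in bits:
        current += b
        if b == "0":              # a 0 ends every codeword in a comma code
            out.append(table[current])
            current = ""
    return "".join(out)

print(decode_comma("110100", CODE))  # 110|10|0 decodes to "ate"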
Often it is desirable to compress data to minimize storage space or transmission time. The most efficient code with regard to minimizing space is the Huffman code. Encoding with the Huffman code is described in Section 15.3. The Huffman code also has the advantage of being an instantaneous code.
A well-known example of a code that minimizes transmission time is the Morse code. In both the Huffman and Morse codes, letters or symbols that occur most often are given the shortest strings. In Morse code, letters are separated by "spaces" and words by three "spaces." In this case, spaces are units of time.


When transmitting data, errors may occur during transmission. We refer to anything which may produce errors with the vague term "noise." For example, data received from a distant spacecraft such as Voyager or Galileo could certainly have problems with various kinds of noise. In some cases, we may only be interested in detecting an error. This could be the case if the data could be retransmitted. Codes which have the property that they can detect errors are called error-detecting codes.
In other cases, where data cannot be retransmitted, such as data from a distant spacecraft, we want enough information about the data so we can not only detect an error but also correct it. Codes which have the property that they can correct errors are called error-correcting codes. It might seem reasonable to always use error-correcting codes. The problem with error-correcting codes, or even error-detecting codes, is that more information must be included in the code, so they are less efficient with regard to minimizing space. Unfortunately, with both error-correcting and error-detecting codes, we can never be sure that errors are corrected or detected. The problem is with multiple errors. It is certainly possible to correct or detect an error if there is only one. The best that we can do in general is reduce the probability that errors will occur undetected or uncorrected. Again we have the problem that the more we reduce the probability, the more information we have to send, and the less efficient our code will be.
Before beginning our discussion of error-correcting and error-detecting codes, however, we want to include one more type of code. There is a classic example which demonstrates the use for this code. Suppose we have a rotating disk, which is divided into sectors, and a series of brushes or laser beams sending back digital information about how far the disk has rotated. If the binary strings recording the numbering of adjacent sectors are substantially different, in the sense that there are a lot of changes of the individual digits in going from one sector to the next, then a reading taken just as the sector was changing could produce a number totally different from the number of either of the sectors. In this case, it is desirable to number the sectors so that the binary string determining the sector has only one digit change between adjacent sectors. A code which has this property is the Gray code. Its construction is discussed in Section 6.7.
The first error-detecting method we consider is the parity bit. We demonstrate this method using the ASCII code. This code is a block code which uses 7 bits, so that every symbol is a string of seven 1's and 0's. An eighth bit is then added which makes the number of 1's even. Therefore if the code for a transmitted string is received with a single error, the number of 1's will be odd, and the receiver will know that an error has occurred. Unfortunately, if two errors occur, they will not be detected, since the number of 1's will again be even. Assume the probability of an error in transmission is 0.01, both for a 1 changing to a 0 and for a 0 changing to a 1. Further assume that the probability of an error is the same regardless of the location of the error and of whether the error changes a 1 to a 0 or conversely. We also assume that the occurrence of one error does not affect the probability of another occurring.
We know from the binomial theorem for probability (Theorem 8.101) that the probability of exactly one error is C(8, 1)(0.01)(0.99)^7, which is approximately 0.07. However, the probability of exactly two errors is C(8, 2)(0.01)^2(0.99)^6, which is approximately 0.002 and hence substantially smaller.
Since three errors would be detected, the probability of more than two errors going undetected is less than the probability of four or more errors occurring, since any odd number of errors would be detected. This probability is almost negligible in comparison to that of one or fewer errors occurring.
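The arithmetic above is easy to check directly; the short sketch below (ours) evaluates the binomial probabilities for an 8-bit block with the assumed bit-error probability of 0.01.

from math import comb

p = 0.01  # assumed probability that any single bit is received incorrectly

def p_errors(m, n=8):
    # Binomial probability of exactly m bit errors in an n-bit block.
    return comb(n, m) * p**m * (1 - p)**(n - m)

print(round(p_errors(1), 4))  # about 0.0746, the 0.07 quoted above
print(round(p_errors(2), 4))  # about 0.0026, the 0.002 quoted above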
Consider the code which is produced by simply repeating each string to be encoded a given number of times. For example, if each string of code to be encoded is repeated once, then 10110 would be encoded as 1011010110. If each string to be encoded is repeated twice, then 10110 would be encoded as 101101011010110. If each string to be encoded is repeated once, then we have an error-detecting code. If an error occurs, then the corresponding positions will not be the same. For example, if the encoded string is 111111010110111011, then errors have occurred in the third and last bits. We cannot correct the errors, since we don't know which error occurred in which copy of the string. If the strings to be encoded are repeated only once, the best we can do is error detection. If we have three copies of the string, then we can correct

[Table: the ASCII character set with an even parity bit added as the leading (eighth) bit, listing each 8-bit code together with the character or control symbol it represents.]

the code for a single error. If there is a difference in bits at corresponding positions in the strings, then we accept the value which occurs twice. For example, if our string has length 4 and we receive 110110011101, then in the second position we receive two 1's and a 0. Thus we assume that the correct value is 1, and that the correct string which was encoded was 1101. Obviously a problem occurs if an error occurs more than once in the same position in the string. If a string is repeated so that we have r copies of the string, then the error correction gives the right result as long as an error occurs in the same position in the repetitions fewer than (r + 1)/2 times.
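A sketch of this majority-vote correction (our own; the names are not from the text) for three copies of a string of length 4:

# Correct a repetition code by taking, in each position, the bit value
# that occurs in a majority of the copies.
def correct_repetition(received, length, copies=3):
    blocks = [received[i * length:(i + 1) * length] for i in range(copies)]
    corrected = ""
    for pos in range(length):
        ones = sum(block[pos] == "1" for block in blocks)
        corrected += "1" if ones > copies // 2 else "0"
    return corrected

# Three copies of 1101 with one error (second bit of the first copy flipped):
print(correct_repetition("100111011101", 4))  # prints 1101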

Group Codes
Beginning at this point, we shall assume that all strings in the code have a fixed length n, and we shall treat these strings as vectors or 1 × n matrices. Therefore, we shall have addition of vectors. However, we shall define addition mod 2, so that 1 + 1 = 0. Thus
11110001 + 10100111 = 01010110
The codes which we encounter in this section are called group codes. These codes are also known as matrix codes. As previously mentioned, we shall consider the strings of a code C to be binary strings of length n, which we shall also consider as vectors or 1 × n matrices. If B^n is the set of all binary strings of length n, then C is a subset of B^n.
The reader is asked to prove that B^n, with the above addition, forms a group under addition. Each string in B^n is its own inverse, so in a sense, adding and subtracting are the same thing. If in some cases it may seem that we should be subtracting instead of adding, it doesn't matter. A code C is a group code if C is a subgroup of B^n. If you have taken linear algebra, you may observe that C is really a linear vector space, and many of the properties of linear algebra would be helpful and illuminating here. However, in this development, the only properties we shall use regarding C are that C is a group, and that an element of C is a vector and hence can be multiplied by a matrix of the proper size. We shall also use the distributive law of matrices, which states that
A(B + C) = AB + AC
for all matrices A, B, and C where multiplication is defined. The reader may recall that if u = (u1, u2, u3, ..., un) and v = (v1, v2, v3, ..., vn), then the dot product of u and v, denoted by u · v, is equal to
u1v1 + u2v2 + u3v3 + ··· + unvn
The weight of a string c of a code, denoted by W(c), is the number of 1's in the string. For example, if c = 1011010, then W(c) = 4.
Suppose we have a k × n matrix G such that its first k columns form the k × k identity matrix I_k and all of its columns are distinct. Thus G has the form [I_k | A]. For example,

G = [ 1 0 0 1 0 1
      0 1 0 1 1 0
      0 0 1 0 1 1 ]

is such a matrix. G is called a generator matrix. Consider the rows of the generator matrix as vectors or strings of a code. Call this set of strings S. For example, for the matrix above,
S = {100101, 010110, 001011}
Let C be the code created by taking all vectors that are finite sums of strings in S. The reader is asked to prove that C is a subgroup of B^n. In our example, we get 110011 by adding the first two strings in S, so 110011 would be in C. The group C is generated by the set S. It is also a minimal set generating C, since no elements of S are sums of other elements of S. We denote this by saying that C = S*. The code C with form [I_k | A] (i.e., generated by the rows of [I_k | A]) is called an [n, k]-code.
THEOREM 18.1  The [n, k]-code C contains 2^k strings.
Proof: The first k bits of a string in C determine the elements of C. The positions where 1's occur in the first k bits of a string in C indicate which strings in S were added together. For example, if 1's occurred in the first and third bits of a string in C, then this string was created by adding the first and third rows of G. Since there are 2^k distinct ways of forming the first k bits, there are 2^k strings in C.
If we wish to transmit message strings of length k, we encode them by multiplying them on the right by G. Thus if w = w1w2w3···wk, or (w1, w2, w3, ..., wk), then we encode it as the string wG. In our example, we would encode 110 or (1, 1, 0) as

(1, 1, 0) [ 1 0 0 1 0 1
            0 1 0 1 1 0
            0 0 1 0 1 1 ] = (1, 1, 0, 0, 1, 1)

or 110011. We note that the message string is the first three bits of the encoded string. In general the message string of length k will be the first k bits of the encoded string, since the identity matrix I_k forming the first k columns of G simply repeats the original string. Thus we can decode by simply taking the first k bits of the encoded string. Also note that when we multiplied (1, 1, 0) by G above, we formed
(1,1,0,0,1,1) = 1 · (1,0,0,1,0,1) + 1 · (0,1,0,1,1,0) + 0 · (0,0,1,0,1,1)
where for any vector v, 1 · v = v and 0 · v = (0, 0, 0, 0, 0, 0). Thus the encoded string is a sum of the vectors in S and hence is in C, since C is a group. In general, if S = {s1, s2, s3, ..., sk} is the set of rows of the generator matrix and w = (w1, w2, ..., wk) is a message string, then the encoded string is
w1s1 + w2s2 + ··· + wksk
This is a sum of strings in S, since each wi is either 1 or 0, and hence is in C, since C is the group generated by S.
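The encoding step is just a matrix product mod 2. Here is a brief sketch (ours) using the generator matrix of the running example:

import numpy as np

G = np.array([[1, 0, 0, 1, 0, 1],
              [0, 1, 0, 1, 1, 0],
              [0, 0, 1, 0, 1, 1]])

def encode(message_bits):
    # Multiply the message (a 1 x k vector) on the right by G, mod 2.
    w = np.array([int(b) for b in message_bits])
    return "".join(str(b) for b in (w @ G) % 2)

print(encode("110"))  # prints 110011, as in the example above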
We have noted that G has the form [I_k | A] and that I_k relays the message. What does A do? Let us again look at our example. If the message string is (w1, w2, w3), then the encoded string is

(w1, w2, w3) G = (w1, w2, w3, w1 + w2, w2 + w3, w1 + w3)

Thus the fourth bit in the encoded string must be w1 + w2, the fifth bit must be w2 + w3, and the sixth bit must be w1 + w3. Therefore if the encoded string is
(w1, w2, w3, w4, w5, w6)
then w4 = w1 + w2, w5 = w2 + w3, and w6 = w1 + w3. If an encoded string, after it is transmitted, does not satisfy these equations, we know there is an error in transmission. For example, if we received the encoded string 101100 then, since w1 = 1, w2 = 0, and w3 = 1, we must have w4 = 1 + 0 = 1, w5 = 0 + 1 = 1, and w6 = 1 + 1 = 0. Since the fifth bit received is 0 rather than 1, the equation for w5 is not satisfied, and we know that an error has occurred. Thus the matrix A serves as a check on the accuracy of the transmission, much as the parity check did earlier. In general, if we have the encoded string w1w2w3···wk···wn and G = [I_k | A] with A = (A_ij) a k × (n − k) matrix, then for each i with k + 1 ≤ i ≤ n,
w_i = w1 A_(1, i−k) + w2 A_(2, i−k) + ··· + wk A_(k, i−k)
and the encoded string must satisfy these n − k equations.
The next problem is to correct an error, given that a single error has occurred. The method below is known as using coset leaders. We illustrate this method with our example, where

G = [ 1 0 0 1 0 1
      0 1 0 1 1 0
      0 0 1 0 1 1 ]

We know
S = {100101, 010110, 001011}
so that
C = {000000, 100101, 010110, 001011, 110011, 011101, 101110, 111000}

We now form cosets of C in B^6, as described in Section 9.4. The first coset is C itself. To form the next coset, we select an element of B^6 which has minimum weight and is not in C. For example, we could choose b = 100000. The coset is
b + C = {100000, 000101, 110110, 101011, 010011, 111101, 001110, 011000}
We again select an element of B^6 which has minimum weight and is not in either of the previous cosets. For example, we could select b = 010000. The coset is
b + C = {010000, 110101, 000110, 011011, 100011, 001101, 111110, 101000}
Continuing this process, we have the following table of B^6 divided into cosets, where the element of least weight in each coset is listed first. The elements in the first column are called coset leaders.

000000  100101  010110  001011  110011  011101  101110  111000
100000  000101  110110  101011  010011  111101  001110  011000
010000  110101  000110  011011  100011  001101  111110  101000
001000  101101  011110  000011  111011  010101  100110  110000
000100  100001  010010  001111  110111  011001  101010  111100
000010  100111  010100  001001  110001  011111  101100  111010
000001  100100  010111  001010  110010  011100  101111  111001
100010  000111  110100  101001  010001  111111  001100  011010

In the last coset, we had to use a string of weight 2. We could have selected either 100010, 010001, or 001100; frankly, we were happy just to find one. The process works as follows: When an encoded string is received, we find it in the table. For example, assume that we receive 110110. We then look at the first column of the row containing 110110 to find 100000, so we assume that the error occurred in the first bit, since 100000 was added to each element of C to get this coset. We look at the first row of the column containing 110110 to find 010110, which we assume is the correct string of C. Again, suppose we receive the string 001010. We then look at the first column of the row containing 001010 to find 000001, so we assume that the error occurred in the sixth bit. We look at the first row of the column containing 001010 to find 001011, which we assume is the correct string of C.
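The table lookup just described can be automated. The sketch below (ours) rebuilds the cosets of C in B^6 for the running example, records a minimum-weight leader for each word, and corrects a received word by adding its coset leader.

import itertools
import numpy as np

G = np.array([[1, 0, 0, 1, 0, 1],
              [0, 1, 0, 1, 1, 0],
              [0, 0, 1, 0, 1, 1]])

def to_str(v):
    return "".join(map(str, v))

# The code C: all sums of rows of G, i.e. all products wG mod 2.
C = {to_str((np.array(w) @ G) % 2) for w in itertools.product([0, 1], repeat=3)}

# Assign each word of B^6 the first (lightest) word whose coset contains it.
leader_of = {}
for v in sorted(itertools.product([0, 1], repeat=6), key=sum):
    coset = {to_str((np.array(v) + np.array(list(map(int, c)))) % 2) for c in C}
    for word in coset:
        leader_of.setdefault(word, to_str(v))

def correct(received):
    leader = leader_of[received]
    return to_str((np.array(list(map(int, received))) +
                   np.array(list(map(int, leader)))) % 2)

print(correct("110110"))  # leader 100000, corrected codeword 010110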
We now look for an easier way to determine errors. Let u and w be vectors (or strings) of length n. The vector u is orthogonal to the vector w if their dot product u · w = 0. Given a code C, the dual code of C, denoted by C⊥, is the set of all strings of B^n that are orthogonal to every string in C. It is left to the reader to show that C⊥ is a subgroup of B^n. If the code C is an [n, k]-code, so that it is generated by k strings, then C⊥ is generated by n − k strings. We shall not prove this, since it requires a knowledge of linear algebra, but simply accept it as true. We shall, however, prove the following theorem.
THEOREM 18.2  Let C be a group code and C⊥ be its dual code. A string is in C⊥ if and only if it is orthogonal to each string in S, the set of generators of C.
Proof: Obviously if b is in C⊥, then it is orthogonal to each string in S, since it is orthogonal to every string in C and S ⊆ C. Conversely, assume that S = {s1, s2, ..., sk} and that b is orthogonal to si for all si ∈ S, so that b · si = 0 for all si ∈ S. Every element of C is of the form
w1s1 + w2s2 + ··· + wksk
where each wi is either 1 or 0. From the linear property of matrices we know that
b · (w1s1 + w2s2 + ··· + wksk) = w1(b · s1) + w2(b · s2) + ··· + wk(b · sk)
= 0 + 0 + ··· + 0
= 0
Recall that in our example

G = [I_3 | A] = [ 1 0 0 1 0 1
                  0 1 0 1 1 0
                  0 0 1 0 1 1 ]

We now define

H = [A^t | I_3] = [ 1 1 0 1 0 0
                    0 1 1 0 1 0
                    1 0 1 0 0 1 ]

where A^t is the transpose of A, obtained by changing the rows of A into columns. The matrix H is called the parity check matrix.
By checking each possibility in our example, we can show that the inner product of any row of G with any row of H is equal to 0. From this we know that H g^t = 0, where g^t is the transpose of any row of G. By definition of transpose, g^t is that row of G changed to a column. We place it in column form so we can multiply it on the right by the matrix H:
H(w1 g1^t + w2 g2^t + w3 g3^t) = w1 H g1^t + w2 H g2^t + w3 H g3^t = 0 + 0 + 0 = 0
Thus if we multiply H by the transpose of any element of C, we get 0. In general, if
G = [I_k | A]
then
H = [A^t | I_(n−k)]
The inner product of the i-th row of G with the j-th row of H is equal to
0 + 0 + ··· + 0 + A_ij + 0 + ··· + 0 + A_ij + 0 + ··· + 0 = A_ij + A_ij = 0
so that in the general case H g_i^t = 0, where g_i^t is the transpose of the i-th row of G. Using the same argument as above, if we multiply H by the transpose of any element of C, we get 0.
We also get another rather remarkable result. If two elements b1 and b2 of B^n are in the same coset, where we form cosets of B^n using the group C as we did above, then
H b1^t = H b2^t
To show this, we use the fact that if b1 and b2 are in the same coset, then b1 = b2 + c for some c ∈ C. Therefore b1^t = b2^t + c^t and
H b1^t = H(b2^t + c^t) = H b2^t + H c^t = H b2^t
since H c^t = 0 for all c ∈ C.
Since H b^t is the same for all b in a coset, we can select any b in the coset and determine this value. Thus in the table above we can add this common value of the image under H to each row, since the elements in each row form a coset. We choose the coset leader, since it is the simplest, and place the value of its image in the second column. These values are called syndromes. We already know that the first syndrome, that of C itself, is (0, 0, 0)^t. We find that
H (1 0 0 0 0 0)^t = (1, 0, 1)^t
so (1, 0, 1)^t is the second syndrome, and
H (0 1 0 0 0 0)^t = (1, 1, 0)^t
so (1, 1, 0)^t is the third syndrome. Continuing, we have the following table, in which the syndrome of each coset appears as the second entry of its row.

Coset leader  Syndrome
000000        000       100101  010110  001011  110011  011101  101110  111000
100000        101       000101  110110  101011  010011  111101  001110  011000
010000        110       110101  000110  011011  100011  001101  111110  101000
001000        011       101101  011110  000011  111011  010101  100110  110000
000100        100       100001  010010  001111  110111  011001  101010  111100
000010        010       100111  010100  001001  110001  011111  101100  111010
000001        001       100100  010111  001010  110010  011100  101111  111001
100010        111       000111  110100  101001  010001  111111  001100  011010
Having completed the table, suppose that we receive the transmitted string 101100. We then multiply its transpose by H, getting
H (1 0 1 1 0 0)^t = (0, 1, 0)^t
so we know that 101100 is in row 6. The coset leader is 000010, the element in the leftmost column of the row containing 101100. The element of C in the top row of the column containing 101100 is 101110. By the way in which the table was constructed, we know that 101100 = 101110 + 000010, so we assume that the transmitted message 101100 should have been 101110 and that there was an error in the fifth bit.
This method is much quicker, since we only have to multiply the transpose of the transmitted message by H, find the row containing the syndrome, and find the transmitted message in that row. The coset leader for that message is the error, and the element of C in the first row of the column containing the transmitted message is the corrected message.
We can, however, make the process even quicker, and we need only the first two columns of the above table. Suppose we receive the transmitted message 110000. We multiply its transpose by H, getting
H (1 1 0 0 0 0)^t = (0, 1, 1)^t
so the syndrome is (0, 1, 1)^t and the coset leader is 001000. Since the coset leader tells us the error occurred in the third bit, if we add 001000 to 110000, we get 111000, the corrected code. Thus our method is simple: Multiply the transpose of the transmitted message by H to find the syndrome. Find the coset leader for that syndrome and add it to the transmitted message to get the corrected code. Notice that we use only the first two columns of the table. At present, however, we do have one problem. If we should receive the transmitted message 101001, then the syndrome is
H (1 0 1 0 0 1)^t = (1, 1, 1)^t
and there are three strings of weight 2 in the corresponding row. Remember that 100010 was chosen arbitrarily as the coset leader. Any of these strings is equally likely to be the error, so our correction using syndromes here is hopeless. We are also trying to correct a string with two errors instead of one.
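Syndrome decoding needs only H and the leader-syndrome table. A sketch (ours) for the running example:

import numpy as np

H = np.array([[1, 1, 0, 1, 0, 0],
              [0, 1, 1, 0, 1, 0],
              [1, 0, 1, 0, 0, 1]])

LEADERS = ["000000", "100000", "010000", "001000",
           "000100", "000010", "000001", "100010"]

def syndrome(word):
    w = np.array(list(map(int, word)))
    return tuple((H @ w) % 2)          # H times the transpose of the word, mod 2

TABLE = {syndrome(leader): leader for leader in LEADERS}

def correct(received):
    leader = TABLE[syndrome(received)]
    return "".join(str((int(a) + int(b)) % 2) for a, b in zip(received, leader))

print(correct("110000"))  # syndrome (0, 1, 1), leader 001000, corrected 111000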

EXERCISES

1. Which of the following are proper generating matrices? If not, why not?

2. Given the generating matrix

a. Find H.
b. Encode 111, 011, 101, 110.
c. Decode 111011, 110100, 010010, 011110.
d. Use H to determine which strings in (c) are proper encodings.

3. Given the generating matrix

a. Find H.
b. Encode 1111, 0101, 1001, 1010.
c. Decode 1111110, 0111010, 0111101, 1011110.
d. Use H to determine which strings in (c) are proper encodings.

4. Given the generating matrix

a. Find H.
b. Construct the table of cosets with coset leaders.
c. Use this table to correct errors (if any) in the transmissions of 1111100, 1111000, 0110101, and 1011000.

5. Given the generating matrix

a. Find H.
b. Construct the table of cosets with coset leaders.
c. Use this table to correct errors (if any) in the transmissions of 011001, 111000, 110111, and 101100.

6. Given the generator matrix

a. Find H.
b. Find the syndromes for the cosets.
c. Use these syndromes to correct the errors (if any) for the transmitted strings 111101, 1111001, 110101, and 101001.

7. Given the generator matrix

a. Find H.
b. Find the syndromes for the cosets.
c. Use these syndromes to correct the errors (if any) for the transmitted strings 111101, 111001, 110010, and 101001.

8. Explain what happens, using the generator matrix in Exercise 7, if there are two or more errors.

9. Using the generator matrix in Exercise 5, find strings other than the rows of H which are in C⊥, and construct a parity check matrix using these strings.

10. Using the generator matrix in Exercise 4, find strings other than the rows of H which are in C⊥, and construct a parity check matrix using these strings.

11. Prove that B^n, the set of all binary strings of length n, is a group under the addition defined in this chapter.

12. Let C be the code created by taking all vectors that are finite sums of strings in S. Prove that C is a subgroup of B^n under addition.

13. Prove that the dual code of C, denoted by C⊥, is a subgroup of B^n under addition.

Hamming Codes
At the end of the last section, we saw that there was difficulty trying to correct the code for certain strings, since not all coset leaders had weight 1. This is remedied by using a matrix called a Hamming matrix as the generating matrix. Before looking at the Hamming matrix G, we first look at its parity check matrix H. Let H be a matrix with r rows such that the columns consist of all possible strings of length r except the string consisting of all 0's. We shall assume that r ≥ 3. There are 2^r − 1 such strings, so H is an r × n matrix, where n = 2^r − 1. We use the columns of weight 1 as the final r columns, forming the identity matrix, so that H has the form [A^t | I_r], where A^t is an r × (n − r) matrix. The Hamming matrix G is the (n − r) × n matrix of the form [I_(n−r) | A], where A is an (n − r) × r matrix. The code generated by the rows of the Hamming matrix is called the Hamming code.
For example, let r = 3 and let H be the matrix

[the 3 × 7 parity check matrix of the example, not reproduced here]

Then G is the matrix

[the corresponding 4 × 7 Hamming generator matrix, not reproduced here]

To study Hamming matrices, we need the concept of distance between two strings and its relation to the weight of the strings. We begin with a theorem about weights of strings.

THEOREM 18.3  For strings c and c', W(c + c') ≤ W(c) + W(c').
Proof: Let c = c1c2c3···cn and c' = c'1c'2c'3···c'n. If ci + c'i = 1, then either ci = 1 or c'i = 1. Therefore, for every 1 occurring in c + c', there is a 1 occurring in either c or c'.
The Hamming distance, or simply distance, between two strings of code c and c' having the same length is the number of corresponding bits in the strings where one string has the digit 1 and the other has the digit 0. We shall denote the distance function by d. For example, if c = 101011 and c' = 110010, then d(c, c') = 3, since the two strings differ in the second, third, and sixth positions. Obviously, the greater the distance between two strings, the more errors that can be made in transmitting one string without accidentally producing another string of the code. Since we have called d a distance function, we should show that it has the basic properties of a distance function.
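A one-line computation suffices for the distance; the sketch below (ours) counts the positions in which two equal-length strings differ.

def hamming_distance(c, c_prime):
    assert len(c) == len(c_prime)
    return sum(a != b for a, b in zip(c, c_prime))

print(hamming_distance("101011", "110010"))  # prints 3, as in the example above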

THEOREM 18.4  The Hamming distance function has the following properties:
(a) For strings c and c', d(c, c') = 0 if and only if c = c'.
(b) For strings c and c', d(c, c') = d(c', c).
(c) For strings c, c', and c'', d(c, c'') ≤ d(c, c') + d(c', c'').
Proof: Parts (a) and (b) follow directly and are left to the reader. For part (c), we note that for strings c and c'', W(c + c'') = d(c, c''). To see this, we note that if c = c1c2c3···cn and c'' = c''1c''2c''3···c''n, then ci + c''i contributes 1 to W(c + c'') if and only if ci = 0 and c''i = 1, or ci = 1 and c''i = 0. But this is true if and only if ci and c''i are different, which contributes 1 to d(c, c''). We also note that for any string c, the string c + c consists only of 0's. We shall call this string 0. By the definition of addition, c + 0 = c for every string c. Therefore,
d(c, c'') = W(c + c'') = W((c + c') + (c' + c'')) ≤ W(c + c') + W(c' + c'') = d(c, c') + d(c', c'')

It is important to know the minimal distance between any two strings in a code. If C is a code, then the minimum distance of C, denoted by D(C), is the smallest distance between any two strings in C. The following theorem gives us an important measure of the number of errors that can be corrected or detected using the code.

THEOREM 18.5  For a code C:
(a) If D(C) = k + 1, then up to k errors can be detected using the code.
(b) If D(C) = 2k + 1, then up to k errors can be corrected using the code.

Proof:
(a) If D(C) = k + 1 and c ∈ C, then c differs from any other string in the code in at least k + 1 places. Therefore, if c is transmitted and has k or fewer errors, it cannot possibly be changed into another string in the code, and an error is detected.
(b) Suppose the string c ∈ C is transmitted as c' with k or fewer errors, so that d(c, c') ≤ k. If d(c'', c') ≤ k for some other string c'' in C, then d(c, c'') ≤ d(c, c') + d(c', c'') ≤ 2k. But D(C) = 2k + 1, giving a contradiction. Therefore c' can be corrected to c, the only string in C whose distance from c' is less than k + 1.
We want to determine D(C), the smallest distance between any two strings in C. To do this, however, we first need the following theorem.

THEOREM 18.6  D(C) is equal to W(C) = min{W(c) : c ∈ C and c ≠ 0}.
Proof: By definition of D(C), there exist c1, c2 ∈ C such that d(c1, c2) = D(C). But d(c1, c2) = W(c1 + c2) and, since c1 + c2 ∈ C, W(C) ≤ W(c1 + c2). Therefore W(C) ≤ D(C). Conversely, for any c ∈ C with c ≠ 0, W(c) = d(c, 0) ≥ D(C), since 0 is also in C. Therefore, W(C) ≥ D(C).
We now show that for a Hamming code C, W(C) ≥ 3. There is no c ∈ C of weight 1. If there were, and c were a string with all 0's except for a 1 in the j-th place, then since c is orthogonal to every row of H, the j-th column of H would consist of all 0's, which contradicts the construction of H. Also, there is no c ∈ C of weight 2. If there were, then c would be a string with all 0's except for two 1's, say in the i-th and j-th positions. Again, since c is orthogonal to every row of H, the i-th and j-th entries of every row of H would have to be either both 1's or both 0's. But then the i-th and j-th columns of H would have to be the same, which again contradicts the construction of H. Therefore, W(C) ≥ 3 and C can be used for error correction for a single error.
We now want to show that there is one element of weight 1 in each coset except for C itself. If we do this, it will greatly simplify the problem of decoding. By Theorem 18.1, an [n, k]-code C contains 2^k strings. Since G has the form [I_(n−r) | A], the Hamming code C is an [n, n − r]-code, and so C contains 2^(n−r) elements. B^n contains 2^n elements. Therefore, there are
2^n / 2^(n−r) = 2^r
cosets, including C itself. The strings in C have length n = 2^r − 1. Therefore, there are 2^r − 1 strings of weight 1. We now have to show that no coset contains two strings of weight 1. Assume that c1 and c2 are both strings of weight 1 in the same coset. Then by definition of a coset, c1 = c2 + c for some c ∈ C. Therefore c = c1 + c2, so by Theorem 18.3,
W(c) = W(c1 + c2) ≤ W(c1) + W(c2) ≤ 1 + 1 = 2
But W(C) ≥ 3, so this is obviously impossible. Therefore each coset contains exactly one string of weight 1, except for C, which contains the string of weight 0.
We return to our example, where G is the Hamming matrix given above and H is its parity check matrix. Since every coset contains a coset leader of weight 1, consider the string 0010000. If we multiply its transpose by H, the syndrome we obtain is the third column of H. In fact, if a 1 occurs in the j-th digit of a string of weight 1, then the syndrome, obtained when the transpose of the string is multiplied by H, is the j-th column of H. Therefore, whenever we receive a transmitted message string and multiply its transpose by H, if the transmission is correct we get a syndrome consisting of all 0's. If there is a single error, we get one of the columns of H, since the transmitted message string has to be in one of the cosets, and each coset has a leader of weight 1. Therefore if the j-th column of H is the syndrome, we know the coset leader has a 1 in the j-th position, so the error is in the j-th bit of the string. For example, suppose we receive the transmitted message string 1110110. We multiply its transpose by H and obtain the second column of H as the syndrome. Therefore the error is in the second bit, and the message should have been 1010110.
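Since the example's H is not reproduced here, the sketch below (ours) uses an assumed parity check matrix of the required form [A^t | I_3], whose columns are the seven nonzero strings of length 3; the decoding rule is exactly the one just described: a zero syndrome means no detectable error, and otherwise the syndrome equals the column of H at the position of the single error.

import numpy as np

# An assumed 3 x 7 parity check matrix of the form [A^t | I_3]; the text's own
# matrix is not reproduced, so this particular H is only an example.
H = np.array([[0, 1, 1, 1, 1, 0, 0],
              [1, 0, 1, 1, 0, 1, 0],
              [1, 1, 0, 1, 0, 0, 1]])

def correct(received):
    w = np.array(list(map(int, received)))
    s = tuple((H @ w) % 2)
    if s == (0, 0, 0):
        return received                          # no detectable error
    # The syndrome is the column of H marking the erroneous bit.
    pos = [tuple(H[:, j]) for j in range(H.shape[1])].index(s)
    w[pos] ^= 1
    return "".join(map(str, w))

# 1000011 is a codeword for this H; flip its second bit and recover it:
print(correct("1100011"))  # prints 1000011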
In the remainder of this section, we will quickly explore other codes. The first of these is the Golay code. Hamming codes were discovered independently by Hamming in 1950 and Golay in 1949. We will not try to explain why they are called Hamming codes; it is a long and involved story. For details see Thompson [112]. However, there is also a Golay code, which appeared in Golay's 1949 paper. This is the (23, 12, 7) code published by Golay, meaning that it has a (23, 12) generator matrix with a minimum distance of 7 between the strings of C. This code has generator matrix G = [I_12 | B], where B is the 12 × 11 matrix

[matrix B, not reproduced here]

This matrix has a geometrical interpretation using five lines in the plane (see Thompson). It is more easily studied by using the extended (24, 12, 8) code with generator matrix G = [I_12 | B̂], where B̂ is the 12 × 12 matrix

[matrix B̂, not reproduced here]

(see Hill [44]). This is produced by rearranging the Golay generating matrix and adding a parity bit. The symmetry of this matrix makes it much easier to study. It is easily seen that the parity check matrix is [B̂^t | I_12], since B̂^t = B̂. Golay introduced several other codes, including a (4096, 24, 8) code which was used by the Voyager spacecraft to transmit images of Jupiter, Uranus, and Neptune.
For a given string c of length n, let c̄ = c + 1, where 1 is the string of length n consisting of all 1's; that is, c̄ is c with every bit complemented. Given a set of strings S, form the new set consisting of the strings cc and cc̄ for every c in S, where cc denotes the concatenation of c with itself and cc̄ the concatenation of c with c̄. Given the set
S = {0000, 0011, 1100, 1111}
let S1 be the set obtained from S by this construction, S2 the set obtained from S1, S3 the set obtained from S2, and S4 the set obtained from S3. This is a construction created by M. Plotkin [88]. The codes generated by the sets S1, S2, S3, ... are called Reed-Muller codes. Among them is a (64, 32, 16) code which was used for error correction on images transmitted by the Mariner 9 spacecraft. More specifically, the Reed-Muller matrix used is called a Hadamard matrix, which is defined next.
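Before turning to Hadamard matrices, here is a sketch (our reading of the doubling construction just described) of one step of the construction: each string c of length n yields the two strings cc and cc̄ of length 2n.

def complement(c):
    return "".join("1" if b == "0" else "0" for b in c)

def double(strings):
    # One Plotkin doubling step: c goes to cc and to c followed by its complement.
    return {c + c for c in strings} | {c + complement(c) for c in strings}

S = {"0000", "0011", "1100", "1111"}
S1 = double(S)                # 8 strings of length 8
S2 = double(S1)               # 16 strings of length 16
print(len(S1), len(next(iter(S1))))  # prints 8 8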
Consider binary matrices B_n defined recursively as follows:

[recursive definition, not reproduced here]

where the matrix B̄_n is obtained from B_n by complementing each entry, for 1 ≤ i, j ≤ n. Therefore,

[the first matrices produced by the recursion, not reproduced here]

Let H_n denote the matrix that results from B_n by replacing each 0 with 1 and each 1 with −1, for 1 ≤ i, j ≤ n. Thus,

[the corresponding ±1 matrix, not reproduced here]

The matrix H_n has the property H_n H_n^t = n I_n, where I_n is the identity matrix. Matrices having this property are called Hadamard matrices. The code generated is called the Hadamard code.
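Since the recursive definition above is only partially reproduced, the sketch below (ours) uses the standard Sylvester construction, which produces ±1 matrices with the stated property H_n H_n^t = n I_n.

import numpy as np

def hadamard(n):
    # Sylvester construction: H_1 = [1], H_2n = [[H_n, H_n], [H_n, -H_n]].
    # n must be a power of 2.
    if n == 1:
        return np.array([[1]])
    h = hadamard(n // 2)
    return np.block([[h, h], [h, -h]])

H4 = hadamard(4)
print(H4)
print(np.array_equal(H4 @ H4.T, 4 * np.eye(4, dtype=int)))  # prints True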

EXERCISES
1. Given that H is the matrix
find G.

2. Given that H is the matrix
find G.

3. Find the distance between the strings 110010101 and 010101111.
4. Find the distance between the strings 110011001 and 111100001.
5. Find three strings orthogonal to 110011001.
6. Find three strings orthogonal to 010101111.
7. Given the generator matrix

(a) Find H.
(b) Using only H, correct errors (if any) in the transmitted messages 1110110, 1011001, 1101010, and 1111110.

8. Given the generator matrix

(a) Find H.
(b) Using only H, correct errors (if any) in the transmitted messages 1111110, 1010010, 1101100, and 1111101.
9. For each of the Hamming matrices in Exercises 7 and 8, if 1001010 is transmitted as 1001001, what is the corrected code? How does this affect the original word encoded?
10. Which of the following are Hamming matrices?
11. Prove Theorem 18.4: The Hamming distance function has the following properties:
(a) For strings c and c', d(c, c') = 0 if and only if c = c'.
(b) For strings c and c', d(c, c') = d(c', c).
12. Construct the Reed-Muller code S1.
13. Construct the Hadamard matrices H_2, H_4, and H_8.
