Appendix F - Classical Error Correcting Codes

Introduction

Classical error correcting codes are in use in a wide variety of digital electronics and other classical information systems. It is a good idea to learn some of the basic definitions, ideas, methods, and simple examples of classical error correcting codes in order to understand the (slightly) more complicated quantum error correcting codes. There are many good introductions to classical error correction. Here we follow a few sources which also discuss quantum error correcting codes: the book by Loepp and Wootters, an article in Lo, Popescu, and Spiller by Steane, Gottesman's Thesis, and Gaitan's Book on quantum error correction, which also discusses classical error correction.

Binary Operations

The set $\{0,1\}\,\!$ is a group under addition. (See Section D.2.8 of Appendix D.) The way this is achieved is by deciding that we will only use these two numbers in our language and using addition modulo 2, meaning $0+0=0,1+0=0+1=1,\,\!$ and $1+1=0\,\!$ . If we also include the operation of multiplication, the set (with the two operations) becomes a field (a Galois Field), which is denoted GF $(2)\,\!$ . Since one often works with strings of bits, it is very useful to consider the string of bits to be a vector and to use vector addition (which is component-wise addition) and vector multiplication (which is the inner product). For example, the addition of the vector $(0,0,1)\,\!$ and $(0,1,1)\,\!$ is $(0,0,1)+(0,1,1)=(0,1,0)\,\!$ . The inner product between these two vectors is $(0,0,1)\cdot (0,1,1)=0\cdot 0+0\cdot 1+1\cdot 1=0+0+1=1\,\!$ .

Definitions and Basics

Definition 1

The inner product is also called a checksum or parity check since it shows whether or not the first and second vectors agree, or have an even number of 1's at the positions specified by the ones in the other vector. We may say that the first vector satisfies the parity check of the other vector, or vice versa.

Definition 2

The weight or Hamming weight is the number of non-zero components of a vector or string. The weight of a vector $v\,\!$ is denoted wt( $v\,\!$ ).

Definition 3

The Hamming distance is the number of places where two vectors differ. Let the two vectors be $v\,\!$ and $w\,\!$ . Then the Hamming distance is also equal to wt( $v+w\,\!$ ). The Hamming distance between $v\,\!$ and $w\,\!$ will be denoted Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle d_H(v,w)\,\!} .

Definition 4

We use Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \{0,1\}^n\,\!} to denote the set of all binary vectors of length Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle n\,\!} . A code $C\,\!$ of length Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle n\,\!} is any subset of that set. The set of all elements of Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle C\,\!} is called the set of codewords. We also say there are Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle 2^n\,\!} Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle n\,\!} -bit words in the space.

Suppose $n\,\!$ bits are used to encode Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle k\,\!} logical bits. We use the notation Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle [n,k] \,\!} do denote such a code.

Definition 5

The minimum distance of a code is the smallest Hamming distance between any two non-equal vectors in a code. This can be written


	$d_{Hmin}(C)={\underset {v,w\in C,v\neq w}{\mbox{min}}}d_{H}(v,w).\,\!$	(F.1)

For shorthand, we also use Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle d(C)\,\!} or Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle d\,\!} if $C\,\!$ is understood.

When that code has a distance Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle d\,\!} , the notation Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle [n,k,d] \,\!} is used.

Example 1

It is interesting to note that if we encode redundantly using $0_{L}=00\,\!$ and Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle 1_L=11\,\!} as our logical zero and logical one respectively, then we could detect single bit errors but not correct them. For example, if we receive $01\,\!$ , we know this cannot be one of our encoded states. So an error must have occurred. However, we don't know whether the sender sent Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle 0_L=00 \,\!} or Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle 1_L=11\,\!} . We do know that an error has occurred though, as long as we know only one error has occurred. Such an encoding can be used as an error detecting code. In this case there are two code words, Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle 0_L=00 \,\!} and $1_{L}=11\,\!$ , but four words in the space. The minimum distance is 2, which is the distance between the two code words.

Example 2

The three-bit redundant encoding was already given in Chapter 7. One takes logical zero and logical one states to be


	$0_{L}=000\;\;\;{\mbox{ and }}\;\;\;1_{L}=111,\,\!$	(F.2)

where the subscript Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle L \,\!} is used to denote a "logical" state; that is, one that is encoded. Recall that this code is able to detect and correct one error. In this case there are two code words out of eight possible words, and the minimal distance is 3.

Definition 6

The rate of a code is given by the ration of the number of logical bits to the number of bits, Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle k/n\,\!} .

Definition 7

A linear code Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle C_l\,\!} is a code that is closed under addition.

Linear Codes

Linear codes are particularly useful because they are able to efficiently identify errors and the associated correct codewords. This ability is due to the added structure these codes have. These will be discussed in the following sections.

Generator Matrix

For linear codes, any linear combination of codewords is a codeword. One key feature of a linear code is that it can be specified by a ''generator matrix,'' $G\,\!$ ^[1]. For an Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle [n,k]\,\!} code, the generator matrix is an Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle n\times k\,\!} matrix with columns that form a basis for the Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle k\,\!} -dimensional coding sub-space of the $n\,\!$ -dimensional binary vector space. In other words, the vectors comprising the rows form a basis that will span the code space. (Note that one may also use the transpose of this matrix as the definition for Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle G\,\!} .) Any code word Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle w\,\!} described by a vector $v\,\!$ can be written in terms of the generator matrix as $w=Gv\,\!$ . Note that $G\,\!$ is independent of the input and output vectors. In addition, $G\,\!$ is not unique. If columns are switched or even added to produce a new vector that replaces a column, then the generator matrix is still valid for the code. This is due to the requirement that the columns be linearly independent, which is still satisfied if these operations are performed.

Parity Check Matrix

Once $G\,\!$ is obtained, one can calculate another useful matrix, $P\,\!$ . $P\,\!$ is an $(n\times k)\times n\,\!$ matrix which has the property that


	$PG=0.\,\!$	(F.3)

The matrix $P\,\!$ is called the parity check matrix or dual matrix. The rank of $P\,\!$ is at most $n-k\,\!$ and has the property that it annihilates any code word. To see this, recall any code word is written as $Gv\,\!$ : $PGv=0\,\!$ since $PG=0\,\!$ . Also, due to the rank of $P\,\!$ , it can be shown that $Pw=0\,\!$ only if $w\,\!$ is a code word. That is to say, $Pw=0\,\!$ if and only if $w\,\!$ is a code word. This means that $P\,\!$ can be used to test whether or not a word is in the code.

Suppose an error occurs on a code word $w\,\!$ to produce $w^{\prime }=w+e\,\!$ . It follows that


	$Pw^{\prime }=P(w+e)=Pe,\,\!$	(F.4)

since $Pw=0\,\!$ . This result, $Pe\,\!$ is called the error syndrome and the measurement to identify Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle Pe\,\!} is the syndrome measurement. Therefore, the result depends only on the error and not on the original code word. If the error can be determined from this result, then it can be corrected independent of the code word. However, in order to have $Pe\,\!$ be unique, two different results, Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle Pe_1\,\!} and Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle Pe_2\,\!} , must not be equal. This is possible if a distance $d\,\!$ code is constructed such that the parity check matrix has Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle d-1=2t\,\!} linearly independent columns. This enables the errors to be identified and corrected.

Errors

For any classical error correcting code, there are general conditions which must be satisfied in order for the code to be able to detect and correct errors. The two examples above show how the error can be detected, but here the objective is to give some general conditions.

Note that any state containing an error may be written as the sum of the original state (logical or encoded state) Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle w \,\!} and another vector Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle e \,\!} . The error vector $e\,\!$ has ones are in the places where errors are present and zeroes everywhere else. To ensure that the error may be corrected, the following condition must be satisfied for two states with errors occurring


	Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle w_1 + e_1 \neq w_2 + e_2. \,\!}	(F.5)

This condition is called the disjointness condition. This condition means that an error on one state cannot be confused with an error on another state. If it could, then the state including the error could not be uniquely identified with an encoded state and the state could not be corrected to its original state before the error occurred. More specifically, for a code to correct Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle t\,\!} single-bit errors, it must have distance at least Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle 2t + 1 \,\!} between any two codewords, i.e., it must be a distance $2t+1\,\!$ code, i.e., it must be true that Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle d(C) \geq 2t + 1 \,\!} . An Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle [n,k]\,\!} code with minimal distance Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle d \,\!} is denoted $[n,k,d]\,\!$ .

Example 3

An important example of an error correcting code is called the Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle [7,4,3]} Hamming code. This code, as the notation indicates, encodes Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle k=4} bits of information into Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle n=7} bits. It also does it in such a way that one error can be detected and corrected since it has a distance of $3$ . The generator matrix for this code can be taken to be (see for example Loepp and Wootters)


	Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle G^T = \left(\begin{array}{ccccccc} 1 & 0 & 0 & 0 & 1 & 1 & 0 \\ 0 & 1 & 0 & 0 & 1 & 1 & 1 \\ 0 & 0 & 1 & 0 & 1 & 0 & 1 \\ 0 & 0 & 0 & 1 & 0 & 1 & 1 \end{array}\right). \,\!}	(F.6)

From this the parity check matrix, $P\,\!$ can be calculated by finding a set of Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle n-k\,\!} mutually orthogonal vectors which are also orthogonal to the code space defined by the generator matrix. Alternatively, one could find the generator matrix from the parity check matrix. A method for doing this can be found in Steane's article in Lo, Popescu, and Spiller. One first puts $G^{T}\,\!$ in the form $(I_{k},A),\,\!$ where $I_{k}\,\!$ is the $k\times k\,\!$ identity matrix. Then the parity check matrix $P=(A^{T},I_{n-k}).\,\!$ In either case, one can arrive at the following parity check matrix for this code:


	$P=\left({\begin{array}{ccccccc}1&1&1&0&1&0&0\\1&1&0&1&0&1&0\\0&1&1&1&0&0&1\end{array}}\right).\,\!$	(F.7)

It is useful to note that the code can also be defined by the parity check matrix. The codewords are annihilated by the parity check matrix and only the codewords are annihilated by that matrix.

The Disjointness Condition and Correcting Errors

The motivation for the disjointness condition, Eq.(F.5), is to associate each vector in the space with a particular code word. That is, assuming that only certain errors occur, each error vector should be associated to a particular vector in the code space when the error is added to the original code word. This partitions the set into disjoint subsets each of which contains only one code vector. A message is decoded correctly if the vector (the one containing the error) is in the subset which is associated with the original vector (the one with no error). For example, if one vector is sent, say $v_{1}\,\!$ and an error occurs during transmission to produce $v_{2}=v_{1}+e\,\!$ , this vector must be in the subset containing $v_{1}\,\!$ .

A way to decode is to record an array of possible code words, possible errors, and the combinations of those errors and code words. The array can be set up as a top row of the code word vectors and a leftmost column of errors with the element of the first row and the first column being the zero vector and all subsequent entries in the column being errors. Then the element at the top of a column (say the jth column) is added to the error in the corresponding row (say the kth row) to get the j,k entry of the array. With this array one can associate a column with a subset that is disjoint with the other sets. Then identifying the erred code word in a column associates it with a code word and thus corrects the error.

The Hamming Bound

The Hamming bound is a bound that restricts the rate of the code. Due to the disjointness condition, a certain number of bits are required to ensure our ability to detect and correct errors. Suppose there is a set of $n\,\!$ bit vectors for encoding $k\,\!$ bits of information. There is a set of error vectors of weight $t\,\!$ that has $C(n,t)\,\!$ elements^[2]. So the number of error vectors, including errors of weight up to $t\,\!$ , is $\sum _{i=0}^{t}C(n,i).\,\!$ (Note that no error is also part of the set of error vectors. The objective is to be able to design a code which can correct all errors up to those of weight $t\,\!$ and this includes no error at all.) Since there are $2^{n}\,\!$ vectors in the whole space of $n\,\!$ bits, and assuming $m\,\!$ vectors are used for the encoding, the Hamming bound is


	$m\sum _{i=0}^{t}C(n,i)\leq 2^{n}.\,\!$	(F.8)

For linear codes, $m=2^{k},\,\!$ so


	$2^{k}\sum _{i=0}^{t}C(n,i)\leq 2^{n}.\,\!$	(F.9)

Taking the logarithm,


	$k\leq n-\log _{2}\left(\sum _{i=0}^{t}C(n,i)\right).\,\!$	(F.10)

For large $n,k\,\!$ and $t\,\!$ , we can use Stirling's formula to show that


	${\frac {k}{n}}\leq 1-H\left({\frac {t}{n}}\right),\,\!$	(F.11)

where $H(x)=-x\log x-(1-x)\log(1-x)\,\!$ and we have neglected an overall multiplicative constant which goes to 1 as $n\rightarrow \infty .\,\!$ (Again, see the article in Lo, Popescu, and Spiller by Steane.)

More Definitions

Definition 11: Dual Code

Let ${\mathcal {C}}\,\!$ be a code and let $v\,\!$ be a vector in the code space. The dual code, denoted ${\mathcal {C}}^{\perp }\,\!$ is the set of all vectors which have zero inner product with all $v\in {\mathcal {C}}\,\!$ . In other words, it is the set of all vectors $u\,\!$ such that $u\cdot v=0,\,\!$ for all $v\in {\mathcal {C}}\,\!$ .

For binary vectors a vector can be orthogonal to itself. This is different from ordinary vectors (in 3-d space) so this is one concept which is quite different in that regard.

The dual code is a useful entity in classical error correction and will be used in the construction of the quantum error correcting codes known as CSS codes.

Final Comments

As can be seen from the Hamming bound, there is a limit to rate of an error correcting code. This does not indicate whether or not codes satisfying these bounds exists, but it does tell us that no codes exist which do not satisfy these bounds. Encoding decoding and error detection and correction are all problems which can be difficult to solve in general. One of the advantages of the linear codes is that they provide a systematic method for identifying errors on a code through the use of the parity check operation. More generally, checking to see whether or not a bit string (vector) is in the code space would require a lookup table. This would be much more time-consuming that using the parity check matrix where matrix multiplication is quite efficient relative to the lookup table.

Many of these ideas and definitions will be utilized in Chapter 7 on quantum error correction. Some linear codes, including the Hamming code above, will have quantum analogues as do many quantum error correcting codes. In quantum computers, as will be discussed, error correction is necessary since the quantum information is delicate. Such discussions will be taken up in Chapter 7.

Footnotes

↑ Recall we are working with binary codes. So the entries of the matrix will also be binary numbers, i.e., 0's and 1's.
↑ That is $n\,\!$ choose $t\,\!$ vectors. The notation is $C(n,t)={n \choose t}={\frac {n!}{(n-t)!t!}}.\,\!$

[1] Recall we are working with binary codes. So the entries of the matrix will also be binary numbers, i.e., 0's and 1's.

[2] That is $n\,\!$ choose $t\,\!$ vectors. The notation is $C(n,t)={n \choose t}={\frac {n!}{(n-t)!t!}}.\,\!$

[1]

[2]