# Remarks on the Hamming [7,4,3] code and Sage

This post simply collects some very well-known facts and observations in one place, since I was having a hard time locating a convenient reference.

Let $C$ be the binary Hamming [7,4,3] code defined by the generator matrix $G = \left(\begin{array}{rrrrrrr} 1 & 0 & 0 & 0 & 1 & 1 & 1 \\ 0 & 1 & 0 & 0 & 1 & 1 & 0 \\ 0 & 0 & 1 & 0 & 1 & 0 & 1 \\ 0 & 0 & 0 & 1 & 0 & 1 & 1 \end{array}\right)$ and check matrix $H = \left(\begin{array}{rrrrrrr} 1 & 1 & 1 & 0 & 1 & 0 & 0 \\ 1 & 1 & 0 & 1 & 0 & 1 & 0 \\ 1 & 0 & 1 & 1 & 0 & 0 & 1 \end{array}\right)$. In other words, this code is the row space of G and the kernel of H. We can enter these into Sage as follows:

sage: G = matrix(GF(2), [[1,0,0,0,1,1,1],[0,1,0,0,1,1,0],[0,0,1,0,1,0,1],[0,0,0,1,0,1,1]])
sage: G
[1 0 0 0 1 1 1]
[0 1 0 0 1 1 0]
[0 0 1 0 1 0 1]
[0 0 0 1 0 1 1]
sage: H = matrix(GF(2), [[1,1,1,0,1,0,0],[1,1,0,1,0,1,0],[1,0,1,1,0,0,1]])
sage: H
[1 1 1 0 1 0 0]
[1 1 0 1 0 1 0]
[1 0 1 1 0 0 1]
sage: C = LinearCode(G)
sage: C
Linear code of length 7, dimension 4 over Finite Field of size 2
sage: C = LinearCodeFromCheckMatrix(H)
sage: LinearCode(G) == LinearCodeFromCheckMatrix(H)
True


The generator matrix gives us a one-to-one onto map $G:GF(2)^4\to C$ defined by $m \longmapsto m\cdot G$. Using this map, the codewords are easy to describe and enumerate: $\begin{tabular}{ccc} decimal & binary & codeword \\ 0 & 0000 & 0000000 \\ 1 & 0001 & 0001011 \\ 2 & 0010 & 0010101 \\ 3 & 0011 & 0011110 \\ 4 & 0100 & 0100110 \\ 5 & 0101 & 0101101 \\ 6 & 0110 & 0110011 \\ 7 & 0111 & 0111000 \\ 8 & 1000 & 1000111 \\ 9 & 1001 & 1001100 \\ 10 & 1010 & 1010010 \\ 11 & 1011 & 1011001 \\ 12 & 1100 & 1100001 \\ 13 & 1101 & 1101010 \\ 14 & 1110 & 1110100 \\ 15 & 1111 & 1111111 \end{tabular}$.
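The table can also be reproduced without Sage's coding theory classes. The following is a minimal sketch in plain Python that multiplies each 4-bit message by $G$ over GF(2); the helper name `encode` is introduced here for illustration and is not a Sage function.

```python
# Rows of the generator matrix G from above, as lists of bits.
G = [
    [1, 0, 0, 0, 1, 1, 1],
    [0, 1, 0, 0, 1, 1, 0],
    [0, 0, 1, 0, 1, 0, 1],
    [0, 0, 0, 1, 0, 1, 1],
]

def encode(n):
    """Return the codeword m*G (mod 2), where m is the 4-bit binary form of n."""
    m = [(n >> (3 - i)) & 1 for i in range(4)]  # most significant bit first
    return [sum(m[i] * G[i][j] for i in range(4)) % 2 for j in range(7)]

codewords = [encode(n) for n in range(16)]

# Print the three columns of the table: decimal, binary, codeword.
for n, c in enumerate(codewords):
    print(n, format(n, "04b"), "".join(str(b) for b in c))
# e.g. the row for n = 5 reads: 5 0101 0101101
```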

Using this code, we first describe a guessing game you can play with even small children.

Number Guessing game: Pick an integer from 0 to 15. I will ask you 7 yes/no questions. You may lie once.
I will tell you when you lied and what the correct number is.

Question 1: Is n in {0,1,2,3,4,5,6,7}?
(Translated: Is 1st bit of Hamming_code(n) a 0?)
Question 2: Is n in {0,1,2,3,8,9,10,11}?
(Is 2nd bit of Hamming_code(n) a 0?)
Question 3: Is n in {0,1,4,5,8,9,12,13}?
(Is 3rd bit of Hamming_code(n) a 0?)
Question 4: Is n in {0,2,4,6,8,10,12,14} (i.e., is n even)?
(Is 4th bit of Hamming_code(n) a 0?)
Question 5: Is n in {0,1,6,7,10,11,12,13}?
(Is 5th bit of Hamming_code(n) a 0?)
Question 6: Is n in {0,2,5,7,9,11,12,14}?
(Is 6th bit of Hamming_code(n) a 0?)
Question 7: Is n in {0,3,4,7,9,10,13,14}?
(Is 7th bit of Hamming_code(n) a 0?)

Record the answers in a vector (0 for yes, 1 for no): $v = (v_1,v_2,...,v_7)$. This must be a codeword (no lies) or differ from a codeword by exactly one bit (1 lie). In either case, you can find n by decoding this vector.
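The seven question sets are just the sets of n whose codeword has a 0 in the given position, and recovering n means finding the codeword within one bit of $v$. Here is a minimal sketch in plain Python, assuming a brute-force search over the 16 codewords; the names `encode`, `question_sets` and `guess` are illustrative only.

```python
G = [
    [1, 0, 0, 0, 1, 1, 1],
    [0, 1, 0, 0, 1, 1, 0],
    [0, 0, 1, 0, 1, 0, 1],
    [0, 0, 0, 1, 0, 1, 1],
]

def encode(n):
    """Codeword of n: the 4-bit binary form of n times G, mod 2."""
    m = [(n >> (3 - i)) & 1 for i in range(4)]
    return [sum(m[i] * G[i][j] for i in range(4)) % 2 for j in range(7)]

codewords = [encode(n) for n in range(16)]

# Question i+1 asks: is n in the set of numbers whose codeword has bit i+1 equal to 0?
question_sets = [[n for n in range(16) if codewords[n][i] == 0] for i in range(7)]

def guess(v):
    """Given the answers v (0 = yes, 1 = no), return (n, lie), where lie is the
    1-based number of the question answered falsely, or None if there was no lie."""
    for n, c in enumerate(codewords):
        diff = [i + 1 for i in range(7) if c[i] != v[i]]
        if len(diff) <= 1:
            return n, (diff[0] if diff else None)
    # Not reached for 7-bit input: every vector is within one bit of a codeword.
    raise ValueError("no codeword within distance 1")

# The player picks n = 11 and lies on question 3.
v = list(codewords[11])
v[2] = 1 - v[2]
print(question_sets[4])  # [0, 1, 6, 7, 10, 11, 12, 13]
print(guess(v))          # (11, 3)
```

The search always succeeds because the 16 codewords together with their 7 one-bit neighbours account for all $16 \cdot 8 = 128$ binary vectors of length 7.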

We discuss a few decoding algorithms next.

Venn diagram decoding:

We use a simple Venn diagram to describe a decoding method.

sage: t = var('t')
sage: circle1 = parametric_plot([10*cos(t)-5,10*sin(t)+5], (t,0,2*pi))
sage: circle2 = parametric_plot([10*cos(t)+5,10*sin(t)+5], (t,0,2*pi))
sage: circle3 = parametric_plot([10*cos(t),10*sin(t)-5], (t,0,2*pi))
sage: text1 = text("$1$", (0,0))
sage: text2 = text("$2$", (0,7))
sage: text3 = text("$3$", (-6,-2))
sage: text4 = text("$4$", (6,-2))
sage: text5 = text("$5$", (-9,9))
sage: text6 = text("$6$", (9,9))
sage: text7 = text("$7$", (0,-9))
sage: textA = text("$A$", (-13,13))
sage: textB = text("$B$", (13,13))
sage: textC = text("$C$", (0,-17))
sage: text_all = text1+text2+text3+text4+text5+text6+text7+textA+textB+textC
sage: show(circle1+circle2+circle3+text_all,axes=False)


This gives us a diagram of three overlapping circles in which circle A contains regions 1, 2, 3, 5, circle B contains regions 1, 2, 4, 6, and circle C contains regions 1, 3, 4, 7, matching the three rows of $H$.

Decoding algorithm:
Suppose you receive $v = ( v_1, v_2, v_3, v_4, v_5, v_6, v_7)$.
Assume at most one error is made.
Decoding process:

1. Place $v_i$ in region i of the Venn diagram.
2. For each of the circles A, B, C, determine whether the bits in its four regions sum to 0 or to 1 (mod 2). If they sum to 1, say that the circle has a “parity failure”.
3. The error position is determined from the following table. $\begin{tabular}{cc} circle(s) with parity failure & error position \\ none & none \\ A, B, and C & 1 \\ A, B & 2 \\ A, C & 3 \\ B, C & 4 \\ A & 5 \\ B & 6 \\ C & 7 \end{tabular}$

For example, suppose $v = (1,1,1,1,1,0,1)$. Filling in the diagram, the parity check fails only in circle B, so the table says (correctly) that the error is in the 6th bit. The decoded codeword is $c = v+e_6 = (1,1,1,1,1,1,1).$
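The table lookup is a syndrome computation in disguise: the pattern of failing circles equals the column of $H$ at the error position. Below is a minimal sketch in plain Python, assuming that circles A, B, C are the three rows of $H$ in that order; `venn_decode` is a hypothetical helper name, not part of Sage.

```python
# Rows of the check matrix H; row k marks which bits lie inside circle "ABC"[k].
H = [
    [1, 1, 1, 0, 1, 0, 0],  # circle A
    [1, 1, 0, 1, 0, 1, 0],  # circle B
    [1, 0, 1, 1, 0, 0, 1],  # circle C
]

def venn_decode(v):
    """Correct at most one error in v.
    Returns (codeword, failing circles, 1-based error position or None)."""
    # A circle has a parity failure when the bits inside it sum to 1 mod 2.
    syndrome = [sum(h[j] * v[j] for j in range(7)) % 2 for h in H]
    failures = [name for name, s in zip("ABC", syndrome) if s == 1]
    c = list(v)
    position = None
    if failures:
        # The error sits at the unique column of H equal to the syndrome.
        columns = [[h[j] for h in H] for j in range(7)]
        position = columns.index(syndrome) + 1
        c[position - 1] = 1 - c[position - 1]
    return c, failures, position

print(venn_decode([1, 1, 1, 1, 1, 0, 1]))
# ([1, 1, 1, 1, 1, 1, 1], ['B'], 6)
```

The lookup `columns.index(syndrome)` always finds a match because the seven columns of $H$ are exactly the seven nonzero vectors of length 3.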

Next, we discuss a decoding method based on the Tanner graph.

Tanner graph for the Hamming [7,4,3] code

The above Venn diagram corresponds to a bipartite graph in which the left “bit vertices” (1,2,3,4,5,6,7) correspond to the coordinates of the codeword and the right “check vertices” (8,9,10) correspond to the parity check equations defined by the check matrix. The check vertices 8, 9, 10 play the role of the circles A, B, C:

sage: Gamma = Graph({8:[1,2,3,5], 9:[1,2,4,6], 10:[1,3,4,7]})
sage: B = BipartiteGraph(Gamma)
sage: B.left
set([1, 2, 3, 4, 5, 6, 7])
sage: B.right
set([8, 9, 10])
sage: B.show()


This gives us the bipartite graph described above.

Decoding algorithm:
Suppose you receive $v = ( v_1, v_2, v_3, v_4, v_5, v_6, v_7)$.
Assume at most one error is made.
Decoding process:

1. Place $v_i$ at the vertex i on the left side of the bipartite graph.
2. For each of the check vertices 8, 9, 10 on the right side of the graph, determine whether the bits at the four left-hand vertices connected to it sum to 0 or to 1 (mod 2). If they sum to 1, we say that the check vertex has a “parity failure”.
3. The check vertices which do not fail are connected to bit vertices which we assume are correct. The remaining bit vertices, all of which are connected to failing check vertices, are to be determined (if possible) by solving the corresponding check equations.

check vertex 8: $v_1+v_2+v_3+v_5 = 0$

check vertex 9: $v_1+v_2+v_4+v_6 = 0$

check vertex 10: $v_1+v_3+v_4+v_7 = 0$

Warning: This method is not guaranteed to succeed in general. However, it does work very efficiently when the check matrix $H$ is “sparse” and the number of 1’s in each row and column is “small.”

For example, suppose $v = (1,1,1,1,1,0,1)$. Check vertex 9 fails its parity check, but vertices 8 and 10 do not. Therefore, only bit vertex 6 is unknown, since it is the only bit vertex connected to neither 8 nor 10. This tells us that the decoded codeword is $c = (1,1,1,1,1,v_6,1)$, for some unknown $v_6$. We solve for this unknown using the equation for check vertex 9, $v_1+v_2+v_4+v_6 = 0$, giving us $v_6 = 1$. The decoded codeword is $c = (1,1,1,1,1,1,1).$

This last example was pretty simple, so let’s try $v=(1,0,1,1,1,1,1)$. In this case, check vertices 8 and 9 fail while vertex 10 does not, so bits 1, 3, 4, 7 are assumed correct and $c = (1,v_2,1,1,v_5,v_6,1)$. We solve using

$1+v_2+1+v_5 = 0$

$1+v_2+1+v_6 = 0$

This simply tells us $v_2=v_5=v_6$. The received values of these three bits are 0, 1, 1, so by majority vote we get $c = (1,1,1,1,1,1,1)$.
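The three steps of the Tanner graph method can be sketched in plain Python. This is one possible reading of the procedure rather than a general-purpose decoder: the adjacency lists are copied from `Gamma` above, the unknown bits are found by brute force over all assignments that satisfy every check, and when several assignments work the one closest to the received vector is chosen (an assumption that reproduces the majority vote). The name `tanner_decode` is hypothetical.

```python
from itertools import product

# Check vertex -> adjacent bit vertices, copied from the Tanner graph Gamma.
checks = {8: [1, 2, 3, 5], 9: [1, 2, 4, 6], 10: [1, 3, 4, 7]}

def tanner_decode(v):
    """Return (codeword, failing check vertices, unknown bit vertices)."""
    # Step 2: parity of the bits attached to each check vertex.
    parity = {k: sum(v[i - 1] for i in bits) % 2 for k, bits in checks.items()}
    failing = sorted(k for k in checks if parity[k] == 1)
    if not failing:
        return list(v), [], []
    # Step 3: bits attached to a passing check are assumed correct.
    trusted = set()
    for k in checks:
        if parity[k] == 0:
            trusted.update(checks[k])
    unknown = [i for i in range(1, 8) if i not in trusted]
    # Try every assignment of the unknown bits; keep those satisfying all
    # checks and pick the one nearest to v (assumed tie-breaking rule).
    best = None
    for values in product([0, 1], repeat=len(unknown)):
        c = list(v)
        for i, b in zip(unknown, values):
            c[i - 1] = b
        if all(sum(c[i - 1] for i in bits) % 2 == 0 for bits in checks.values()):
            dist = sum(a != b for a, b in zip(c, v))
            if best is None or dist < best[0]:
                best = (dist, c)
    return best[1], failing, unknown

print(tanner_decode([1, 1, 1, 1, 1, 0, 1]))
# ([1, 1, 1, 1, 1, 1, 1], [9], [6])
print(tanner_decode([1, 0, 1, 1, 1, 1, 1]))
# ([1, 1, 1, 1, 1, 1, 1], [8, 9], [2, 5, 6])
```

Brute force is affordable here because at most 7 bits are unknown; for the large sparse check matrices mentioned in the warning, one would solve the check equations iteratively instead.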