Ring theory, via coding theory and cryptography

In these notes on ring theory, I tried to cover enough material to get a feeling for basic ring theory, via cyclic codes and ring-based cryptosystems such as NTRU. Here’s a list of the topics.

1 Introduction to rings
1.1 Definition of a ring
1.2 Integral domains and fields
1.3 Ring homomorphisms and ideals
1.4 Quotient rings
1.5 UFDs
1.6 Polynomial rings
1.6.1 Application: Shamir’s Secret Sharing Scheme
1.6.2 Application: NTRU
1.6.3 Application: Modified NTRU
1.6.4 Application to LFSRs

2 Structure of finite fields
2.1 Cyclic multiplicative group
2.2 Extension fields
2.3 Back to the LFSR

3 Error-correcting codes
3.1 The communication model
3.2 Basic definitions
3.3 Binary Hamming codes
3.4 Coset leaders and the covering radius
3.5 Reed-Solomon codes as polynomial codes
3.6 Cyclic codes as polynomial codes
3.6.1 Reed-Solomon codes as cyclic codes
3.6.2 Quadratic residue codes
3.6.3 BCH bound for cyclic codes
3.6.4 Decoding cyclic codes
3.6.5 Cyclic codes and LFSRs

4 Lattices
4.1 Basic definitions
4.2 The shortest vector problem
4.2.1 Application to a congruential PKC
4.3 LLL and a reduced lattice basis
4.4 Hermite normal form
4.5 NTRU as a lattice cryptosystem

Calculus on graphs

In these notes, I tried to cover enough material to get a feeling for “calculus on graphs”, with applications to sports rankings and the Friendship Theorem. Here’s a list of the topics.

1 . Introduction
2. Examples
3. Basic definitions
3.1 Diameter, radius, and all that
3.2 Treks, trails, paths
3.3 Maps between graphs
3.4 Colorings
3.5 Transitivity
4. Adjacency matrix
4.1 Definition
4.2 Basic results
4.3 The spectrum
4.4 Application to the Friendship Theorem
4.5 Eigenvector centrality and the Keener ranking
4.6 Strongly regular graphs
4.7  Orientation on a graph
5. Incidence matrix
5.1 The unsigned incidence matrix
5.2 The oriented case
5.3 Cycle space and cut space
6. Laplacian matrix
6.1 The Laplacian spectrum
7  Hodge decomposition for graphs
7.1 Abstract simplicial complexes
7.2 The Bjorner complex and the Riemann hypothesis
7.3 Homology groups
8. Comparison graphs
8.1 Comparison matrices
8.2 HodgeRank
8.3 HodgeRank example

Gray codes

This is based on work done about 20 years ago with a former student Jim McShea.

Gray codes were introduced by Bell Labs physicist Frank Gray in the 1950s. As introduced, a Gray code is an ordering of the n-tuples in GF(2)^n = \{0,1\}^n such that two successive terms differ in only one position. A Gray code can be regarded as a Hamiltonian path in the cube graph. For example:

[[0, 0, 0], [1, 0, 0], [1, 1, 0], [0, 1, 0], [0, 1, 1], [1, 1, 1], [1, 0, 1], [0, 0, 1]]

These can be generalized to n-tuples of integers (mod m) in the obvious way.

Gray codes have several applications:

  • solving puzzles such as the Tower of Hanoi and the Brain [G],
  • analog-digital-converters (goniometers) [S],
  • Hamiltonian circuits in hypercubes [Gil] and Cayley graphs of Coxeter groups [CSW],
  • capanology (the study of bell-ringing) [W],
  • continuous space-filling curves [Gi],
  • classification of Venn diagrams [R],
  • design of communication codes,
  • and more (see Wikipedia).
brain-puzzle

The Brain puzzle

Here's a SageMath/Python function for computing Gray codes.
def graycode(length,modulus):
    """
    Returns the n-tuple reverse Gray code mod m.


    EXAMPLES:
        sage: graycode(2,4)
        
        [[0, 0],
         [1, 0],
         [2, 0],
         [3, 0],
         [3, 1],
         [2, 1],
         [1, 1],
         [0, 1],
         [0, 2],
         [1, 2],
         [2, 2],
         [3, 2],
         [3, 3],
         [2, 3],
         [1, 3],
         [0, 3]]

    """
    n,m = length,modulus
    F = range(m)
    if n == 1:
        return [[i] for i in F]
    L = graycode(n-1, m)
    M = []
    for j in F:
        M = M+[ll+[j] for ll in L]
    k = len(M)
    Mr = [0]*m
    for i in range(m-1):
        i1 = i*int(k/m)
        i2 = (i+1)*int(k/m)
        Mr[i] = M[i1:i2]
    Mr[m-1] = M[(m-1)*int(k/m):]
    for i in range(m):
        if is_odd(i):
            Mr[i].reverse()
    M0 = []
    for i in range(m):
        M0 = M0+Mr[i]
    return M0


REFERENCES

[CSW] J. Conway, N. Sloane, and A. Wilks, “Gray codes and reflection groups”, Graphs and combinatorics 5(1989)315-325

[E] M. C. Er, “On generating the N-ary reflected Gray codes”, IEEE transactions on computers, 33(1984)739-741

[G] M. Gardner, “The binary Gray code”, in Knotted donuts and other mathematical entertainments, F. H. Freeman and Co., NY, 1986

[Gi] W. Gilbert, “A cube-filling Hilbert curve”, Math Intell 6 (1984)78

[Gil] E. Gilbert, “Gray codes and paths on the n-cube”, Bell System Technical Journal 37 (1958)815-826

[R] F. Ruskey, “A Survey of Venn Diagrams“, Elec. J. of Comb.(1997), and updated versions.

[S] Web page of T. Sillke

[W] A. White, “Ringing the cosets”, Amer. Math. Monthly 94(1987)721-746

Sports ranking methods, 4

This is the fourth of a series of expository posts on matrix-theoretic sports ranking methods. This post discusses the Elo rating.

This system was originally developed by Arpad Elo (Elo (1903-1992) was a physics professor at Marquette University in Milwaukee and a chess master, eight-time winner of the Wisconsin State Chess Championships.) Originally, it was developed for rating chess players in the 1950s and 1960s. Now it is used for table tennis, basketball, and other sports.

We use the following version of his rating system.

As above, assume all the $n$ teams play each other (ties allowed)
and let r_i denote the rating of Team i, i=1,2,\dots,n.

Let A=(A_{ij}) denote an $n\times n$ matrix of score results:

A_{ij}= \left\{ \begin{array}{rr} -1,& {\rm if\ team\ } i {\rm \ lost\ to\ team\ } j,\\ +1,& {\rm if\ team\ } i {\rm\ beat\ team\ } j,\\ 0, & {\rm if}\ i=j. \end{array} \right.

Let S_{ij}=(A_{ij}+1)/2.

As in the previous post, the matrix A associated to the example of the Patriot league is the adjacency matrix of a diagraph.

  1. Initialize all the ratings to be 100: {\bf r}=(r_1,\dots,r_n) = (100,\dots,100).
  2. After Team i plays Team j, update their rating using the formula

    r_i = r_i+K(S_{ij}-mu_{ij}),

    where K=10 and

    \mu_{ij} = (1+e^{-(r_i-r_j)/400})^{-1}.

In the example of the Patriot league, the ratings vector is

{\bf r}=(85.124, 104.79, 104.88, 85.032, 94.876, 124.53).

This gives the ranking

Lafayette < Army < Lehigh < Bucknell < Holy Cross < Navy.

This gives a prediction failure rate of 13.3\%.

Some SageMath code for this:

def elo_rating(A):
    """
    A is a signed adjacency matrix for a directed graph.

    Returns elo ratings of the vertices of Gamma = Graph(A) 
        
    EXAMPLES:
        sage: A = matrix(QQ,[
        [0 , -1 , 1  , -1 , -1 , -1 ],
        [1,   0 ,  -1,  1,  1,   -1  ],
        [-1 , 1 ,  0 ,  1 , 1  , -1  ],
        [1 , -1 , -1,  0 ,  -1 , -1  ],
        [1 , - 1 , - 1 , 1 , 0 , - 1  ],
        [1 ,  1  ,  1  , 1  , 1  , 0 ]
        ])
        sage: elo_rating(A)
        (85.124, 104.79, 104.88, 85.032, 94.876, 124.53)

    """
    n = len(A.rows())
    RR = RealField(prec=20)
    V = RR^n
    K = 10
    r0 = 100 # initial rating
    r = n*[r0]
    for i in range(n):
        for j in range(n):
            if ij and A[i][j]==1:
                S = 1
            elif ij and A[i][j]==-1:
                S = 0
            else:
                S = 1/2
            mu = 1/(1+e^(-(r[i]-r[j])/400))
            r[i] = r[i]+K*(S-mu)
    return V(r)

Sports ranking methods, 1

This is the first of a series of expository posts on matrix-theoretic sports ranking methods. This post, which owes much to discussions with TS Michael, discusses Massey’s method.

Massey’s method, currently in use by the NCAA (for football, where teams typically play each other once), was developed by Kenneth P. Massey
while an undergraduate math major in the late 1990s. We present a possible variation of Massey’s method adapted to baseball, where teams typically play each other multiple times.

There are exactly 15 pairing between these teams. These pairs are sorted lexicographically, as follows:

(1,2),(1,3),(1,4), …, (5,6).

In other words, sorted as

Army vs Bucknell, Army vs Holy Cross, Army vs Lafayette, …, Lehigh vs Navy.

The cumulative results of the 2016 regular season are given in the table below. We count only the games played in the Patriot league, but not including the Patriot league post-season tournament (see eg, the Patriot League site for details). In the table, the total score (since the teams play multiple games against each other) of the team in the vertical column on the left is listed first. In other words, ”a – b” in row $i$ and column $j$ means the total runs scored by team i against team j is a, and the total runs allowed by team i against team j is b. Here, we order the six teams as above (team 1 is Army (USMI at Westpoint), team 2 is Bucknell, and so on). For instance if X played Y and the scores were 10-0, 0-1, 0-1, 0-1, 0-1, 0-1, then the table would read 10-5 in the position of row X and column Y.

X\Y Army Bucknell Holy Cross Lafayette Lehigh Navy
Army x 14-16 14-13 14-24 10-12 8-19
Bucknell 16-14 x 27-30 18-16 23-20 10-22
Holy Cross 13-14 30-27 x 19-15 17-13 9-16
Lafayette 24-14 16-18 15-19 x 12-23 17-39
Lehigh 12-10 20-23 13-17 23-12 x 12-18
Navy 19-8 22-10 16-9 39-17 18-12 x
sm261_baseball-ranking-application_teams-digraph

Win-loss digraph of the Patriot league mens baseball from 2015

In this ordering, we record their (sum total) win-loss record (a 1 for a win, -1 for a loss) in a 15\times 6 matrix:

M = \left(\begin{array}{cccccc} -1 & 1 & 0 & 0 & 0 & 0 \\ 1 & 0 & -1 & 0 & 0 & 0 \\ -1 & 0 & 0 & 1 & 0 & 0 \\ -1 & 0 & 0 & 0 & 1 & 0 \\ -1 & 0 & 0 & 0 & 0 & 1 \\ 0 & -1 & 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & -1 & 0 & 0 \\ 0 & 1 & 0 & 0 & -1 & 0 \\ 0 & -1 & 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & -1 & 0 & 0 \\ 0 & 0 & 1 & 0 & -1 & 0 \\ 0 & 0 & -1 & 0 & 0 & 1 \\ 0 & 0 & 0 & -1 & 1 & 0 \\ 0 & 0 & 0 & -1 & 0 & 1 \\ 0 & 0 & 0 & 0 & -1 & 1 \end{array}\right).

We also record their total losses in a column vector:

{\bf b}= \left(\begin{array}{c} 2 \\ 1 \\ 10 \\ 2 \\ 11 \\ 3 \\ 2 \\ 3 \\ 14 \\ 4 \\ 14 \\ 10 \\ 11 \\ 22 \\ 6 \\ \end{array}\right).

The Massey ranking of these teams is a vector {\bf r} which best fits the equation

M{\bf r}={\bf b}.

While the corresponding linear system is over-determined, we can look for a best (in the least squares sense) approximate solution using the orthogonal projection formula

P_V = B(B^tB)^{-1}B^t,

valid for matrices B with linearly independent columns. Unfortunately, in this case B=M does not have linearly independent columns, so the formula doesn’t apply.

Massey’s clever idea is to solve

M^tM{\bf r}=M^t{\bf b}

by row-reduction and determine the rankings from the parameterized form of the solution. To this end, we compute

M^tM= \left(\begin{array}{rrrrrr} 5 & -1 & -1 & -1 & -1 & -1 \\ -1 & 5 & -1 & -1 & -1 & -1 \\ -1 & -1 & 5 & -1 & -1 & -1 \\ -1 & -1 & -1 & 5 & -1 & -1 \\ -1 & -1 & -1 & -1 & 5 & -1 \\ -1 & -1 & -1 & -1 & -1 & 5 \end{array}\right)

and

M^t{\bf b}= \left(\begin{array}{r} -24 \\ -10 \\ 10 \\ -29 \\ -10 \\ 63 \\ \end{array}\right).

Then we compute the rref of

A= (M^tM,M^t{\bf b}) = \left(\begin{array}{rrrrrr|r} 5 & -1 & -1 & -1 & -1 & -1 & -24 \\ -1 & 5 & -1 & -1 & -1 & -1 & -10 \\ -1 & -1 & 5 & -1 & -1 & -1 & 10 \\ -1 & -1 & -1 & 5 & -1 & -1 & -29 \\ -1 & -1 & -1 & -1 & 5 & -1 & -10 \\ -1 & -1 & -1 & -1 & -1 & 5 & 63 \end{array}\right),

which is

rref(M^tM,M^t{\bf b})= \left(\begin{array}{rrrrrr|r} 1 & 0 & 0 & 0 & 0 & -1 & -\frac{87}{6} \\ 0 & 1 & 0 & 0 & 0 & -1 & -\frac{73}{6} \\ 0 & 0 & 1 & 0 & 0 & -1 & -\frac{53}{6} \\ 0 & 0 & 0 & 1 & 0 & -1 & -\frac{92}{3} \\ 0 & 0 & 0 & 0 & 1 & -1 & -\frac{73}{6} \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 \end{array}\right).

If {\bf r}=(r_1,r_2,r_3,r_4,r_5,r_6) denotes the rankings of Army, Bucknell, Holy Cross, Lafayette, Lehigh, Navy, in that order, then

r_1=r_6-\frac{87}{6},\ \ r_2=r_6-\frac{73}{6},\ \ r_3=r_6-\frac{53}{6},\ \ r_4=r_6-\frac{92}{6},\ \ r_5=r_6-\frac{73}{6}.

Therefore

Lafayette < Army = Bucknell = Lehigh < Holy Cross < Navy.

If we use this ranking to predict win/losses over the season, it would fail to correctly predict Army vs Holy Cross (Army won), Bucknell vs Lehigh, and Lafayette vs Army. This gives a prediction failure rate of 20\%.

The minimum upset ranking problem

Suppose n teams play each other, and let Team r_1 < Team r_2 < \dots < Team r_n denote some fixed ranking (where r_1,\dots,r_n is some permutation of 1,\dots,n). An upset occurs when a lower ranked team beats an upper ranked team. For each ranking, {\bf r}, let U({\bf r}) denote the total number of upsets. The minimum upset problem is to find an “efficient” construction of a ranking for which U({\bf r}) is as small as possible.

In general, let A_{ij} denote the number of times Team i beat team $j$ minus the number of times Team j beat Team i. We regard this matrix as the signed adjacency matrix of a digraph \Gamma. Our goal is to find a Hamiltonian (undirected) path through the vertices of \Gamma which goes the “wrong way” on as few edges as possible.

  1. Construct the list of spanning trees of \Gamma (regarded as an undirected graph).
  2. Construct the sublist of Hamiltonian paths (from the spanning trees of maximum degree 2).
  3. For each Hamiltonian path, compute the associated upset number: the total number of edges transversal in \Gamma going the “right way” minus the total number going the “wrong way.”
  4. Locate a Hamiltonian for which this upset number is as large as possible.

Use this sagemath/python code to compute such a Hamiltonian path.

def hamiltonian_paths(Gamma, signed_adjacency_matrix = []):
    """
    Returns a list of hamiltonian paths (spanning trees of 
    max degree <=2).

    EXAMPLES:
        sage: Gamma = graphs.GridGraph([3,3])
        sage: HP = hamiltonian_paths(Gamma)
        sage: len(HP)
        20
        sage: A = matrix(QQ,[
        [0 , -1 , 1  , -1 , -1 , -1 ],
        [1,   0 ,  -1,  1,  1,   -1  ],
        [-1 , 1 ,  0 ,  1 , 1  , -1  ],
        [1 , -1 , -1,  0 ,  -1 , -1  ],
        [1 , - 1 , - 1 , 1 , 0 , - 1  ],
        [1 ,  1  ,  1  , 1  , 1  , 0 ]
        ])
        sage: Gamma = Graph(A, format='weighted_adjacency_matrix')
        sage: HP = hamiltonian_paths(Gamma, signed_adjacency_matrix = A)
        sage: L = [sum(x[2]) for x in HP]; max(L)
        5
        sage: L.index(5)
        21
        sage: HP[21]                                 
        [Graph on 6 vertices,
         [0, 5, 2, 1, 3, 4],
         [-1, 1, -1, -1, -1, 1, 1, 1, 1, 1, 1, 1, 1, 1, -1]]
        sage: L.count(5)
        1

    """
    ST = Gamma.spanning_trees()
    if signed_adjacency_matrix == []:
        HP = []
        for X in ST:
            L = X.degree_sequence()
            if max(L)<=2:
                #print L,ST.index(X), max(L)
                HP.append(X)
        return HP
    if signed_adjacency_matrix != []:
        A = signed_adjacency_matrix
        HP = []
        for X in ST:
            L = X.degree_sequence()
            if max(L)<=2:
                #VX = X.vertices()
                EX = X.edges()
		if EX[0][1] != EX[-1][1]:
                    ranking = X.shortest_path(EX[0][0],EX[-1][1])
		else:
		    ranking = X.shortest_path(EX[0][0],EX[-1][0])
		signature = [A[ranking[i]][ranking[j]] for i in range(len(ranking)-1) for j in range(i+1,len(ranking))]
                HP.append([X,ranking,signature])
        return HP

Wessell describes this method in a different way.

  1. Construct a matrix, M=(M_{ij}), with rows and columns indexed by the teams in some fixed order. The entry in the i-th row and the j-th column is defined bym_{ij}= \left\{ \begin{array}{rr} 0,& {\rm if\ team\ } i {\rm \ lost\ to\ team\ } j,\\ 1,& {\rm if\ team\ } i {\rm\ beat\ team\ } j,\\ 0, & {\rm if}\ i=j. \end{array} \right.
  2. Reorder the rows (and corresponding columns) to in a basic win-loss order: the teams that won the most games go at the
    top of M, and those that lost the most at the bottom.
  3. Randomly swap rows and their associated columns, each time checking if the
    number of upsets has gone down or not from the previous time. If it has gone down, we keep
    the swap that just happened, if not we switch the two rows and columns back and try again.

An implementaiton of this in Sagemath/python code is:

def minimum_upset_random(M,N=10):
    """
    EXAMPLES:
        sage: M = matrix(QQ,[
        [0 , 0 , 1  , 0 , 0 , 0 ],
        [1,   0 ,  0,  1,  1,   0  ],
        [0 , 1 ,  0 ,  1 , 1  , 0  ],
        [1 , 0 , 0,  0 ,  0 , 0  ],
        [1 , 0 , 0 , 1 , 0 , 0  ],
        [1 ,  1  ,  1  , 1  , 1  , 0 ]
        ])
        sage: minimum_upset_random(M)
        (
        [0 0 1 1 0 1]                    
        [1 0 0 1 0 1]                    
        [0 1 0 0 0 0]                    
        [0 0 1 0 0 0]                    
        [1 1 1 1 0 1]                    
        [0 0 1 1 0 0], [1, 2, 0, 3, 5, 4]
        )

    """
    n = len(M.rows())
    Sn = SymmetricGroup(n)
    M1 = M
    wins = sum([sum([M1[j][i] for i in range(j,6)]) for j in range(6)])
    g0 = Sn(1)
    for k in range(N):
        g = Sn.random_element()
        P = g.matrix()
        M0 = P*M1*P^(-1)
        if sum([sum([M0[j][i] for i in range(j,6)]) for j in range(6)])>wins:
            M1 = M0
            g0 = g*g0
    return M1,g0(range(n))

Mathematics of zombies

What do you do if there is a Zombie attack? Can mathematics help? This page is (humorously) dedicated to collecting links to papers or blog posted related to the mathematical models of Zombies.

 

George Romero’s 1968 Night of the Living Dead, now in the public domain, introduced reanimated ghouls, otherwise known as zombies, which craved live human flesh. Romero’s script was insired on Richard Matheson’s I Am Legend. In Romero’s version, the zombies could be killed by destroying the zombie’s brain. A dead human could, in some cases be “reanimated,” turning into a zombie. These conditions are modeled mathematically in several papers, given below.

  1. When Zombies Attack! Mathematical Modelling of a Zombie Outbreak!, paper by Mathematicians at the University of Ottawa, Philip Munz, Ioan Hudea, Joe Imad and Robert J. Smith? (yes, his last name is spelled “Smith?”).
  2. youtube video 28 Minutes Later – The Maths of Zombies , made by Dr James Grime (aka, “siningbanana”), which references the above paper.
  3. Epidemics in the presence of social attraction and repulsion, Oct 2010 Zombie paper by Evelyn Sander and Chad M. Topaz.
  4. Statistical Inference in a Zombie Outbreak Model, slides for a talk given by Des Higman, May 2010.
  5. Mathematics kills zombies dead!, 08/17/2009 blog post by “Zombie Research Society Staff”.
  6. The Mathematics of Zombies, August 18, 2009 blog post by Mike Elliot.
  7. Love, War and Zombies – Systems of Differential Equations using Sage, April 2011 slides by David Joyner. Sage commands for Love, War and Zombies talk. This was given as a Project Mosaic/M-cast broadcast.
  8. Public domain 1968 film Night of the Living Dead by George Romero.