The Vigenère cipher and Sage

The Vigenère cipher is named after Blaise de Vigenère, a sixteenth century diplomat and cryptographer, by a historical accident. Vigene`re actually invented a different and more complicated cipher. The so-called “Vigenère cipher” cipher was actually invented by Giovan Batista Belaso in 1553. In any case, it is this cipher which we shall discuss here first.

This cipher has been re-invented by several authors, such as author and mathematician Charles Lutwidge Dodgson (Lewis Carroll) who claimed his 1868 “The Alphabet Cipher” was unbreakable. Several others claimed the so-called Vigenère cipher was unbreakable (e.g., the Scientific American magazine in 1917). However, Friedrich Kasiski and Charles Babbage broke the cipher in the 1800’s [1]. This cipher was used in the 1700’s, for example, during the American Civil War. The Confederacy used a brass cipher disk to implement the Vigenère cipher (now on display in the NSA Museum in Fort Meade) [1].

The so-called Vigenère cipher is a generalization of the Cesaer shift cipher. Whereas the shift cipher shifts each letter by the same amount (that amount being the key of the shift cipher) the so-called Vigenère cipher shifts a letter by an amount determined by the key, which is a word or phrase known only to the sender and receiver).

For example, if the key was a single letter, such as “C”, then the so-called Vigenère cipher is actually a shift cipher with a shift of 2 (since “C” is the 2-nd letter of the alphabet, if you start counting at 0). If the key was a word with two letters, such as “CA”, then the so-called Vigenère cipher will shift letters in even positions by 2 and letters in odd positions are left alone (or shifted by 0, since “A” is the 0-th letter, if you start counting at 0).

[1] Wikipedia article on the Vigenere cipher.

Using Sage, let’s look at a message (a chant at football games between rivals USNA and West Point):

sage: AS = AlphabeticStrings()           
sage: A = AS.alphabet()
sage: VC = VigenereCryptosystem(AS, 1) # sets the key length to be = 1
sage: m = VC.encoding("Beat Army!"); m  # trivial example

Now, we choose for illustration a simple key of length 1, and encipher this message:

sage: key = AS("D")
sage: c = VC.enciphering(key, m)
sage: c  # a trivial example

You see here that in this case the cipher boils down to the Caesar/shift cipher (shifting by 3).

Deciphering is easy:

sage: VC.deciphering(key, c)

Next, we choose for illustration a simple key of length 2, and encipher the same message:

sage: VC = VigenereCryptosystem(AS, 2)
sage: key = AS("CA")
sage: m = VC.encoding("Beat Army!"); m
sage: c = VC.enciphering(key, m); c

Since one of the key letters is “A” (which shifts by 0), half the plaintext is unchanged in going to the ciphertext.

Here is the algorithmic description of the above (so-called) Vigenère cipher:

      key - a string of upper-case letters (the secret "key")
      m - string of upper-case letters (the "plaintext" message)
      c - string of upper-case letters (the "ciphertext" message)

  Identify the alphabet A, ..., Z with the integers 0, ..., 25. 
    Step 1: Compute from the string key a list L1 of corresponding
            integers. Let n1 = len(L1).
    Step 2: Compute from the string m a list L2 of corresponding
            integers. Let n2 = len(L2).
    Step 3: Break L2 up sequencially into sublists of size n1, and one sublist
            at the end of size <=n1. 
    Step 4: For each of these sublists L of L2, compute a new list C given by 
            C[i] = L[i]+L1[i] (mod 26) to the i-th element in the sublist, 
            for each i.
    Step 5: Assemble these lists C by concatenation into a new list of length n2.
    Step 6: Compute from the new list a string c of corresponding letters.

Once it is known that the key is, say, n characters long, frequency analysis can be applied to every n-th letter of the ciphertext to determine the plaintext. This method is called “Kasiski examination“, or the “Kasiski attack” (although it was first discovered by Charles Babbage).

The cipher Vigenère actually discovered is an “auto-key cipher” described as follows.

      key - a string of upper-case letters (the secret "key")
      m - string of upper-case letters (the "plaintext" message)
      c - string of upper-case letters (the "ciphertext" message)

  Identify the alphabet A, ..., Z with the integers 0, ..., 25. 
    Step 1: Compute from the string m a list L2 of corresponding
            integers. Let n2 = len(L2). 
    Step 2: Let n1 be the length of the key. Concatenate the string 
            key with the first n2-n1 characters of the plaintext message.
            Compute from this string of length n2 a list L1 of corresponding
            integers. Note n2 = len(L1).
    Step 3: Compute a new list C given by C[i] = L1[i]+L2[i] (mod 26), for each i.
            Note n2 = len(C).
    Step 5: Compute from the new list a string c of corresponding letters.

Note how the key is mixed with the plaintext to create the enciphering of the plaintext to ciphertext in Steps 2 and 3.

A screencast describing this has been posted to vimeo.

Bifid cipher and Sage

The Bifid cipher was invented around 1901 by Felix Delastelle. It is a “fractional substitution” cipher, where letters are replaced by pairs of symbols from a smaller alphabet. The cipher uses a 5×5 square filled with some ordering of the alphabet, except that “i”‘s and “j”‘s are identified (this is a so-called Polybius square; there is a 6×6 analog if you add back in “j” and also append onto the usual 26 letter alphabet, the digits 0, 1, …, 9). According to Helen Gaines’ book “Cryptanalysis”, this type of cipher was used in the field by the German army during World War I.

The Bifid cipher was discusses in Alasdair McAndrew’s book on Cryptography and Sage. We shall follow his discussion. As an example of a Polybius square for the Bifid cipher, pick the key to be “encrypt” (as Alasdair does). In that case, the Polybius square is \left(\begin{array}{rrrrr}  E & N & C & R & Y \\  P & T & A & B & C \\  D & E & F & G & H \\  I & K & L & M & N \\  O & P & Q & R & S  \end{array}\right). BTW, the 6\times 6 analog is: \left(\begin{array}{rrrrrr}  E & N & C & R & Y & P \\  T & A & B & C & D & E \\  F & G & H & I & J & K \\  L & M & N & O & P & Q \\  R & S & T & U & V & W \\  X & Y & Z & 0 & 1 & 2  \end{array}\right).

Here is Sage code to produce the 6\times 6 case (the 5\times 5 case is in Alasdair’s book):

def bifid(pt, key):
        pt - plaintext string      (digits okay)
        key - short string for key (no repetitions, digits okay)
        ciphertext from Bifid cipher (all caps, no spaces)

    This is the version of the Bifid cipher that uses the 6x6
    Polybius square.

        Alasdair McAndrew

        sage: key = "encrypt"
        sage: pt = "meet me on monday at 8am"
        sage: bifid(pt, key)
        [[2, 5], [0, 0], [0, 0], [1, 0], [2, 5], [0, 0], [3, 0], 
         [0, 1], [2, 5], [3, 0], [0, 1], [1, 3], [1, 1], [0, 4], 
         [1, 1], [1, 0], [5, 4], [1, 1], [2, 5]]

    AS = AlphabeticStrings()
    A = list(AS.alphabet())+[str(x) for x in range(10)]
    # first make sure the letters are capitalized
    # and text has no spaces
    key0 = [x.capitalize() for x in key if not(x.isspace())]
    pt0 = [x.capitalize() for x in pt if not(x.isspace())]
    # create long key
    long_key = key0+[x for x in A if not(x in key0)]
    n = len(pt0)
    # the fractionalization
    pairs = [[long_key.index(x)//6, long_key.index(x)%6] for x in pt0]
    print pairs
    tmp_cipher = flatten([x[0] for x in pairs]+[x[1] for x in pairs])
    ct = join([long_key[6*tmp_cipher[2*i]+tmp_cipher[2*i+1]] for i in range(n)], sep="")
    return ct

def bifid_square(key):
    Produced the Polybius square for the 6x6 Bifid cipher.

        sage: key = "encrypt"
        sage: bifid_square(key)
        [E N C R Y P]
        [T A B C D E]
        [F G H I J K]
        [L M N O P Q]
        [R S T U V W]
        [X Y Z 0 1 2]

    AS = AlphabeticStrings()
    A = list(AS.alphabet())+[str(x) for x in range(10)]
    # first make sure the letters are capitalized
    # and text has no spaces
    key0 = [SR(x.capitalize()) for x in key if not(x.isspace())]
    # create long key
    long_key = key0+[SR(x) for x in A if not(x in key0)]
    # the fractionalization
    pairs = [[long_key.index(SR(x))//6, long_key.index(SR(x))%6] for x in A]
    f = lambda i,j: long_key[6*i+j] 
    M = matrix(SR, 6, 6, f)
    return M

Have fun!

Zombies and Mathematics

What do you do if there is a Zombie attack? Can mathematics help? This post is (humorously) dedicated to collecting links to papers or blog posted related to the mathematical models of Zombies.

George Romero‘s 1968 Night of the Living Dead, now in the public domain, introduced reanimated ghouls, otherwise known as zombies, which craved live human flesh. Romero’s script was insired on Richard Matheson’s I Am Legend. In Romero’s version, the zombies could be killed by destroying the zombie’s brain. A dead human could, in some cases be “reanimated,” turning into a zombie. These conditions are modeled mathematically in several papers, given below.

Public domain 1968 film Night of the Living Dead by George Romero.


The Floyd-Warshall-Roy algorithm is an algorithm for finding shortest paths in a weighted, directed graph. It allows for negative edge weights and detects a negative weight cycle if one exists. Assuming that there are no negative weight cycles, a single execution of the FWR algorithm will find the shortest paths between all pairs of vertices.

This algorithm is an example of dynamic programming, which allows one to break the computation down to simpler steps using some sort of recursive procedure. If n=|V| then this is a O(n^3)-time algorithm. For comparison, the Bellman-Ford algorithm has complexity O(|V|\cdot |E|), which is O(n^3)-time for “dense” graphs. However, Bellman-Ford only yields the shortest paths eminating from a single vertex. To achieve comparable output, we would need to iterate Bellman-Ford over all vertices, which would be an O(n^4)-time algorithm for “dense” graphs. Except for “sparse” graphs, Floyd-Warshall-Roy is better than an interated implementation of Bellman-Ford.

Here is a Sage implementation

def floyd_warshall_roy(A):
    Shortest paths

        A - weighted adjacency matrix 

        dist  - a matrix of distances of shortest paths
        paths - a matrix of shortest paths

        sage: A = matrix([[0,1,2,3],[0,0,2,1],[20,10,0,3],[11,12,13,0]]); A
        sage: floyd_warshall_roy(A)
        ([[0, 1, 2, 2], [12, 0, 2, 1], [14, 10, 0, 3], [11, 12, 13, 0]], 
          [-1  1  2  1]
          [ 3 -1  2  3]
          [ 3 -1 -1  3]
          [-1 -1 -1 -1])
        sage: A = matrix([[0,1,2,4],[0,0,2,1],[0,0,0,5],[0,0,0,0]])
        sage: floyd_warshall_roy(A)
        ([[0, 1, 2, 2], [+Infinity, 0, 2, 1], [+Infinity, +Infinity, 0, 5],
          [+Infinity, +Infinity, +Infinity, 0]], 
          [-1  1  2  1]
          [-1 -1  2  3]
          [-1 -1 -1  3]
          [-1 -1 -1 -1])
        sage: A = matrix([[0,1,2,3],[0,0,2,1],[-5,0,0,3],[1,0,1,0]]); A
        sage: floyd_warshall_roy(A)
        Traceback (click to the left of this block for traceback)
        ValueError: A negative edge weight cycle exists.
        sage: A = matrix([[0,1,2,3],[0,0,2,1],[-1/2,0,0,3],[1,0,1,0]]); A
        sage: floyd_warshall_roy(A)
        ([[0, 1, 2, 2], [3/2, 0, 2, 1], [-1/2, 1/2, 0, 3/2], [1/2, 3/2, 1, 0]],
          [-1  1  2  1]
          [ 2 -1  2  3]
          [-1  0 -1  1]
          [ 2  2 -1 -1])
    G = Graph(A, format="weighted_adjacency_matrix")
    V = G.vertices()
    E = [(e[0],e[1]) for e in G.edges()]
    n = len(V)
    dist = [[0]*n for i in range(n)]
    paths = [[-1]*n for i in range(n)]
    # initialization step
    for i in range(n):
        for j in range(n):
            if (i,j) in E:
                paths[i][j] = j
            if i == j:
                dist[i][j] = 0
            elif A[i][j] != 0:
                dist[i][j] = A[i][j]
                dist[i][j] = infinity
    # iteratively finding the shortest path
    for j in range(n):
        for i in range(n):
            if i != j:
                for k in range(n):
                    if k != j:
                        if dist[i][k]>dist[i][j]+dist[j][k]:
                            paths[i][k] = V[j]
                        dist[i][k] = min(dist[i][k], dist[i][j] +dist[j][k])
    for i in range(n):
        if dist[i][i] < 0:
            raise ValueError, "A negative edge weight cycle exists."
    return dist, matrix(paths)   

Sage and the future of mathematics

I am not a biologist nor a chemist, and maybe neither are you, but suppose we were. Now, if I described a procedure, in “standard” detail, to produce a result XYZ, you would (based on your reasonably expected expertise in the field), follow the steps you believe were described and either reproduce XYZ or, if I was mistaken, not be able to reproduce XYZ. This is called scientific reproducibility. It is cructial to what I believe is one of the fundamental principles of science, namely Popper’s Falsifiability Criterion.

More and more people are arguing, correctly in my opinion, that in the computational realm, in particular in mathematical research which uses computational experiments, that much higher standards are needed. The Ars Technica article linked above suggests that “it’s time for a major revision of the scientific method.” Also, Victoria Stodden argues one must “facilitate reproducibility. Without this data may be open, but will remain de facto in the ivory tower.” The argument basically is that to reproduce computational mathematical experiments is unreasonably difficult, requiring more that a reasonable expertise. These days, it may in fact (unfortunately) require purchasing very expensive software, or possessing of very sophisticated programming skills, accessibility to special hardware, or (worse) guessing parameters and programming procedures only hinted at by the researcher.

Hopefully, Sage can play the role of a standard bearer for such computational reproducibility. Sage is free, open source and there is a publically available server it runs on (

What government agencies should require such reproducibility? In my opinion, all scientific funding agencies (NSF, etc) should follow these higher standards of computational accountability.

Sage at the NSF-CDI+ECCAD conferences

This is a very quick view of some highlights of the conference I think further details of the talks will appear on the conference webpage later. This is very incomplete – just a few thought I was able to
jot down correctly.

Panel discussion:
Q1: What are the grand challenges of symbolic computing?
Is the term “symbolic computation” to broad? (Hybrid symbolic/numerical, algebraic geometric computation, algebraic combinatorial/group-theoretical, computer proofs, tensor calculations, differential, mathematical knowledge/database research, user interfaces, …)

General ans: No. Hoon Hong points out that user interfaces are lower level but below to the same group.

Q2: How can the latest algorithmic advances be made readily available: google analog of problem formulation? (Idea: suppose someone has a clever idea for a good algorithm but not enough discipline to implement it …)

One answer: Sage can put software together – is this the right way? Analog of

Q3: What is the next killer app for symbolic computation? (Oil app of Groebner bases, cel phone app, robotics, …)

Q4: Can academically produced software such as LAPACK, LINBOX, SAGE compete with commercial software?

Hoon Hong answer: Yes but why? Why not cooperate. Support Sage very much but more research on interfaces and integration of different systems could lead to cooperation of the commercial systems with Sage.

Another panel:
Q: What are the spectacular successes and failures of computer algebra?

(a) Small number of researchers.
(b) Sage could fail from lack of lies with the symbolic/numerical community (as Maxima/Axiom did). Matlab may fail due to uphill battle to integrate Mupad into good symbolic toolbox. (Many voiced view that Matlab is strong because of its many toolboxes, on the panel and privately.)
(c) Education at the High School level using CA.
(d) Presenting output of CA intelligently and in a standard format.
(e) Failure to teach people how to properly use a CA system.

(a) Sage – interesting new effort (with caveat above)
(b) Groebner bases, LLL.

My talk on Sage raised a lot of questions. My There is both strong support for Sage and some questions on its design philisophy. My page 6 at
was a source of lots of questions.

At ECCAD, Sage was mentioned a few times in talks as well as in some posters. The “main” Sage event was a laptop software demo Which Karl-Dieter Crisman set up for Sage.

Overall, a good experience for Sage.

Sage at AMS-MAA 2009 Washington DC

I think Sage gained a lot of publicity this year both by being at
the booth but also having an MAA panel discussion and an AMS session.
The panel discussion was good to be able to meet others in the
teaching community. I think this is related to Sage development because
projects like the educational open source software webworks has a
funding model which seems to be successful. I think Karl Crisman said he
would try to follow up on that.

A few people I met at the booth said they were interested in Sage development
but more stopped by saying that either they or their students could not afford
Maple or Mma and was looking into a cheaper quality alternative. The collaroration
possibilities of the Sage server was a strong “selling point” for smaller schools
which could load sage on a webserver.

There were some really good talks at the Sage session. For example,
Marshall’s talk had amazing graphics and Robert Miller’s talk was very well attended
(with maybe twice as many people in the audience as some of the others).
I thought the quality overall was great, but I’m very partial to such topics of course.

The general message I got from many was that more written material
on Sage in use would be welcomed, especially books. I was touched by one guy
who explained to me that his students were very poor (waitresses, for example)
who cannot afford calculus texts and commercial math programs. The point
he was implicitly making was that by offering software and documentation for
free we are actually improving the quality of such peoples’ lives in a real way.

Future coding theory in Sage projects

Here are a few ideas for future Sage projects in error-correcting codes.

Needed in Sage is

  • a rewrite in Cython, for speed,
  • special decoding algorithms, also in Cython,
  • for special decoders, it seems to me that one either needs
  1. more classes (eg, a CyclicCodes class), each of which will have a decode method, or
  2. another attribute, such as a name (string) which can be used to determine which decoder method to use.
  • codes over Rings (Cesar Agustin Garcia Vazquez is working on this)
  • codes defined from finite groups fings – for example split group codes,
  • a fast minimum distance function (followed by a fast weight distribution function), valid for all characteristics.

It seems more “Pythonic” to add more classes for decoders, but I am not sure.

Steiner systems and codes

A t-(v,k,λ)-design D=(P,B) is a pair consisting of a set P of points and a collection B of k-element subsets of P, called blocks, such that the number r of blocks that contain any point p in P is independent of p, and the number λ of blocks that contain any given t-element subset T of P is independent of the choice of T. The numbers v (the number of elements of P), b (the number of blocks), k, r, λ, and t are the parameters of the design. The parameters must satisfy several combinatorial identities, for example:

\lambda _i = \lambda \left(\begin{array}{c} v-i\\ t-i\end{array}\right)/\left(\begin{array}{c} k-i\\ t-i\end{array}\right)

where \lambda _i is the number of blocks that contain any i-element set of points.

A Steiner system S(t,k,v) is a t-(v,k,λ) design with λ=1. There are no Steiner systems known with t>5. The ones known (to me anyway) for t=5 are as follows:

S(5,6,12), S(5,6,24), S(5,8,24), S(5,7,28), S(5,6,48), S(5,6,72), S(5,6,84),
S(5,6,108), S(5,6,132), S(5,6,168), and S(5,6,244).

Question: Are there others with t=5? ANy with $t>5$?

A couple of these are well-known to arise as the support of codewords of a constant weight in a linear code C (as in the Assmus-Mattson theorem, discussed in another post) in the case when C is a Golay code (S(5,6,12) and S(5,8,24)). See also the wikipedia entry for Steiner system.

Question: Do any of these others arise “naturally from coding theory” like these two do? I.e., do they all arise as the support of codewords of a constant weight in a linear code C via Assmus-Mattson?

Here is a Sage example to illustrate the case of S(5,8,24):

sage: C = ExtendedBinaryGolayCode()
sage: C.assmus_mattson_designs(5)
[‘weights from C: ‘,
[8, 12, 16, 24],
‘designs from C: ‘,
[[5, (24, 8, 1)], [5, (24, 12, 48)], [5, (24, 16, 78)], [5, (24, 24, 1)]],
‘weights from C*: ‘,
[8, 12, 16],
‘designs from C*: ‘,
[[5, (24, 8, 1)], [5, (24, 12, 48)], [5, (24, 16, 78)]]]
sage: C.assmus_mattson_designs(6)
sage: blocks = [ for c in C if hamming_weight(c)==8]; len(blocks)

The Assmus-Mattson Theorem, Golay codes, and Mathieu groups

A block design is a pair (X,B), where X is a non-empty finite set of v>0 elements called points, and B is a non-empty finite multiset of size b whose elements are called blocks, such that each block is a non-empty finite multiset of k points. A design without repeated blocks is called a simple block design. If every subset of points of size t is contained in exactly \lambda blocks the the block design is called a t(v,k,\lambda) design (or simply a t-design when the parameters are not specfied). When \lambda = 1 then the block design is called a S(t,k,v) Steiner system.

Let C be an [n,k,d] code and let C_i = \{ c \in C\ |\ wt(c) = i\} denote the weight i subset of codewords of weight i. For each codeword c\in C, let supp(c)=\{i\ |\ c_i\not= 0\} denote the support of the codeword.

The first example below means that the binary [24,12,8]-code C has the property that the (support of the) codewords of weight 8 (resp, 12, 16) form a 5-design.

Example: Let $C$ denote the extended binary Golay code of length 24. This is a self-dual [24,12,8]-code. The set X_8 = \{supp(c)\ |\ c \in C_8\} is a 5-(24, 8, 1) design; X_{12} = \{supp(c)\ |\ c \in C_{12}\} is a 5-(24, 12, 48) design;and, X_{16} = \{supp(c)\ |\ c \in C_{16}\} is a 5-(24, 16, 78) design.

This is a consequence of the following theorem of Assmus and Mattson.

Assmus and Mattson Theorem (section 8.4, page 303 of [HP]):

Let A_0, A_1, ..., A_n be the weight distribution of the codewords in a binary linear [n , k, d] code C, and let A_0^\perp, A_1^\perp, ..., A_n^\perp be the weight distribution of the codewords in its dual [n,n-k, d^\perp] code C^\perp. Fix a t, 0<t<d, and let s = |\{ i\ |\ A_i^\perp \not= 0, 0<i\leq n-t\, \}|.
Assume s\leq d-t.

  • If A_i\not= 0 and d\leq i\leq n then C_i = \{ c \in C\ |\ wt(c) = i\} holds a simple t-design.
  • If A_i^\perp\not= 0 and d^\perp\leq i\leq n-t then C_i^\perp = \{ c \in C^\perp \ |\ wt(c) = i\} holds a simple t–design.
  • If A_i^\perp\not= 0 and d^\perp\leq i\leq n-t then C_i^\perp = \{ c \in C^\perp \ |\ wt(c) = i\} holds a simple t–design.

In the Assmus and Mattson Theorem, X is the set \{1,2,...,n\} of coordinate locations and B = \{supp(c)\ |\ c \in C_i\} is the set of supports of the codewords of C of weight i. Therefore, the parameters of the t-design for C_i are

  • t = given,
  • v = n,
  • k = i, (this k is not to be confused with dim(C)!),
  • b = A_i,
  • \lambda = b*binomial(k,t)/binomial(v,t)

(by Theorem 8.1.6, p. 294, in \cite{HP}). Here is a SAGE example.

sage: C = ExtendedBinaryGolayCode()
sage: C.assmus_mattson_designs(5)
['weights from C: ',
[8, 12, 16, 24],
'designs from C: ',
[[5, (24, 8, 1)], [5, (24, 12, 48)], [5, (24, 16, 78)], [5, (24, 24, 1)]],
'weights from C*: ',
[8, 12, 16],
'designs from C*: ',
[[5, (24, 8, 1)], [5, (24, 12, 48)], [5, (24, 16, 78)]]]
sage: C.assmus_mattson_designs(6)

The automorphism group of the extended binary Golay code is the Mathieu group M_{24}. Moreover, the code is spanned by the codewords of weight 8.

[HP] W. C. Huffman, V. Pless, Fundamentals of error-correcting codes, Cambridge Univ. Press, 2003.
[CvL] P. Cameron, J. van Lint, Graphs, codes and designs, Cambridge Univ. Press, 1980.