Bifid cipher and Sage

The Bifid cipher was invented around 1901 by Felix Delastelle. It is a “fractional substitution” cipher, where letters are replaced by pairs of symbols from a smaller alphabet. The cipher uses a 5×5 square filled with some ordering of the alphabet, except that “i”‘s and “j”‘s are identified (this is a so-called Polybius square; there is a 6×6 analog if you add back in “j” and also append onto the usual 26 letter alphabet, the digits 0, 1, …, 9). According to Helen Gaines’ book “Cryptanalysis”, this type of cipher was used in the field by the German army during World War I.

The Bifid cipher was discusses in Alasdair McAndrew’s book on Cryptography and Sage. We shall follow his discussion. As an example of a Polybius square for the Bifid cipher, pick the key to be “encrypt” (as Alasdair does). In that case, the Polybius square is \left(\begin{array}{rrrrr}  E & N & C & R & Y \\  P & T & A & B & C \\  D & E & F & G & H \\  I & K & L & M & N \\  O & P & Q & R & S  \end{array}\right). BTW, the 6\times 6 analog is: \left(\begin{array}{rrrrrr}  E & N & C & R & Y & P \\  T & A & B & C & D & E \\  F & G & H & I & J & K \\  L & M & N & O & P & Q \\  R & S & T & U & V & W \\  X & Y & Z & 0 & 1 & 2  \end{array}\right).

Here is Sage code to produce the 6\times 6 case (the 5\times 5 case is in Alasdair’s book):

def bifid(pt, key):
    """
    INPUT:
        pt - plaintext string      (digits okay)
        key - short string for key (no repetitions, digits okay)
    
    OUTPUT:
        ciphertext from Bifid cipher (all caps, no spaces)

    This is the version of the Bifid cipher that uses the 6x6
    Polybius square.

    AUTHOR:
        Alasdair McAndrew

    EXAMPLES:
        sage: key = "encrypt"
        sage: pt = "meet me on monday at 8am"
        sage: bifid(pt, key)
        [[2, 5], [0, 0], [0, 0], [1, 0], [2, 5], [0, 0], [3, 0], 
         [0, 1], [2, 5], [3, 0], [0, 1], [1, 3], [1, 1], [0, 4], 
         [1, 1], [1, 0], [5, 4], [1, 1], [2, 5]]
        'HNHOKNTA5MEPEGNQZYG'

    """
    AS = AlphabeticStrings()
    A = list(AS.alphabet())+[str(x) for x in range(10)]
    # first make sure the letters are capitalized
    # and text has no spaces
    key0 = [x.capitalize() for x in key if not(x.isspace())]
    pt0 = [x.capitalize() for x in pt if not(x.isspace())]
    # create long key
    long_key = key0+[x for x in A if not(x in key0)]
    n = len(pt0)
    # the fractionalization
    pairs = [[long_key.index(x)//6, long_key.index(x)%6] for x in pt0]
    print pairs
    tmp_cipher = flatten([x[0] for x in pairs]+[x[1] for x in pairs])
    ct = join([long_key[6*tmp_cipher[2*i]+tmp_cipher[2*i+1]] for i in range(n)], sep="")
    return ct

def bifid_square(key):
    """
    Produced the Polybius square for the 6x6 Bifid cipher.

    EXAMPLE:
        sage: key = "encrypt"
        sage: bifid_square(key)
        [E N C R Y P]
        [T A B C D E]
        [F G H I J K]
        [L M N O P Q]
        [R S T U V W]
        [X Y Z 0 1 2]

    """
    AS = AlphabeticStrings()
    A = list(AS.alphabet())+[str(x) for x in range(10)]
    # first make sure the letters are capitalized
    # and text has no spaces
    key0 = [SR(x.capitalize()) for x in key if not(x.isspace())]
    # create long key
    long_key = key0+[SR(x) for x in A if not(x in key0)]
    # the fractionalization
    pairs = [[long_key.index(SR(x))//6, long_key.index(SR(x))%6] for x in A]
    f = lambda i,j: long_key[6*i+j] 
    M = matrix(SR, 6, 6, f)
    return M

Have fun!

Zombies and Mathematics

What do you do if there is a Zombie attack? Can mathematics help? This post is (humorously) dedicated to collecting links to papers or blog posted related to the mathematical models of Zombies.

George Romero‘s 1968 Night of the Living Dead, now in the public domain, introduced reanimated ghouls, otherwise known as zombies, which craved live human flesh. Romero’s script was insired on Richard Matheson’s I Am Legend. In Romero’s version, the zombies could be killed by destroying the zombie’s brain. A dead human could, in some cases be “reanimated,” turning into a zombie. These conditions are modeled mathematically in several papers, given below.

Public domain 1968 film Night of the Living Dead by George Romero.

uniform matroids and MDS codes

It is known that uniform (resp. paving) matroids correspond to MDS (resp. “almost MDS” codes). This post explains this connection.

An MDS code is an [n,k,d] linear error correcting block code C which meets the Singleton bound, d+k=n+1. A uniform matroid is a matroid for which all circuits are of size \geq r(M)+1, where r(M) is the rank of M. Recall, a circuit in a matroid M=(E,J) is a minimal dependent subset of E — that is, a dependent set whose proper subsets are all independent (i.e., all in J).

Consider a linear code C whose check matrix is an (n-k)\times n matrix H=(\vec{h}_1,\dots , \vec{h}_n). The vector matroid M=M[H] is a matroid for which the smallest sized dependency relation among the columns of H is determined by the check relations c_1\vec{h}_1 + \dots + c_n \vec{h}_n = H\vec{c}=\vec{0}, where \vec{c}=(c_1,\dots,c_n) is a codeword (in C which has minimum dimension d). Such a minimum dependency relation of H corresponds to a circuit of M=M[H].

Floyd-Warshall-Roy, 3

As in the previous post, let G=(V,E) be a weighted digraph having n vertices and let A=A_G denote itsn\times n adjacency matrix. We identify the vertices with the set \{1,2,\dots, n\}.

The previous post discussed the following result, due to Sturmfels et al.

Theorem: The entry of the matrix A\odot n-1 in row i and column j equals the length of a shortest path from vertex i to vertex j in G. (Here A\odot n-1 denotes the n-1-st tropical power of A.)

This post discusses an implementation in Python/Sage.

Consider the following class definition.

class TropicalNumbers:
    """
    Implements the tropical semiring.

    EXAMPLES:
        sage: T = TropicalNumbers()
        sage: print T
        Tropical Semiring

    """
    def __init__(self):
        self.identity = Infinity

    def __repr__(self):
        """
        Called to compute the "official" string representation of an object.
        If at all possible, this should look like a valid Python expression
        that could be used to recreate an object with the same value.

        EXAMPLES:
            sage: TropicalNumbers()
            TropicalNumbers()

        """
        return "TropicalNumbers()"

    def __str__(self):
        """
        Called to compute the "informal" string description of an object. 

        EXAMPLES:
            sage: T = TropicalNumbers()
            sage: print T
            Tropical Semiring

        """
        return "Tropical Semiring"

    def __call__(self, a):
        """
        Coerces a into the tropical semiring.

        EXAMPLES:
            sage: T(10)
            TropicalNumber(10)
            sage: print T(10)
            Tropical element 10 in Tropical Semiring
        """
        return TropicalNumber(a)

    def __contains__(self, a):
        """
        Implements "in".

        EXAMPLES:
            sage: T = TropicalNumbers()
            sage: a = T(10)
            sage: a in T
            True

        """
        if a in RR or a == Infinity:
            return a==Infinity or (RR(a) in RR)
        else:
            return a==Infinity or (RR(a.element) in RR)

class TropicalNumber:
    def __init__(self, a):
        self.element = a
        self.base_ring = TropicalNumbers()

    def __repr__(self):
        """
        Called to compute the "official" string representation of an object.
        If at all possible, this should look like a valid Python expression
        that could be used to recreate an object with the same value.

        EXAMPLES:

        """
        return "TropicalNumber(%s)"%self.element

    def __str__(self):
        """
        Called to compute the "informal" string description of an object. 

        EXAMPLES:
            sage: T = TropicalNumbers()
            sage: print T(10)
            Tropical element 10 in Tropical Semiring
        """
        return "%s"%(self.number())

    def number(self):
        return self.element

    def __add__(self, other):
        """
        Implements +. Assumes both self and other are instances of
        TropicalNumber class.

        EXAMPLES:
            sage: T = TropicalNumbers()
            sage: a = T(10)
            sage: a in T
            True
            sage: b = T(15)
            sage: a+b
            10

        """
        T = TropicalNumbers()
        return T(min(self.element,other.element))

    def __mul__(self, other):
        """
        Implements multiplication *.

        EXAMPLES:
            sage: T = TropicalNumbers()
            sage: a = T(10)
            sage: a in T
            True
            sage: b = T(15)
            sage: a*b
            25
        """
        T = TropicalNumbers()
        return T(self.element+other.element)

class TropicalMatrix:
    def __init__(self, A):
        T = TropicalNumbers()
        self.base_ring = T
        self.row_dimen = len(A)
        self.column_dimen = len(A[0])
        # now we coerce the entries into T
        A0 = A
        m = self.row_dimen
        n = self.column_dimen
        for i in range(m):
            for j in range(n):
                A0[i][j] = T(A[i][j])
        self.array = A0

    def matrix(self):
        """
        Returns the entries (as ordinary numbers).

        EXAMPLES:
            sage: A = [[0,1,3,7],[2,0,1,3],[4,5,0,1],[6,3,1,0]]
            sage: AT = TropicalMatrix(A)
            sage: AT.matrix()
            [[0, 1, 3, 7], [2, 0, 1, 3], [4, 5, 0, 1], [6, 3, 1, 0]]
        """
        m = self.row_dim()
        n = self.column_dim()
        A0 = [[0 for i in range(n)] for j in range(m)]
        for i in range(m):
            for j in range(n):
                A0[i][j] = (self.array[i][j]).number()
        return A0

    def row_dim(self):
        return self.row_dimen

    def column_dim(self):
        return self.column_dimen

    def __repr__(self):
        """
        Called to compute the "official" string representation of an object.
        If at all possible, this should look like a valid Python expression
        that could be used to recreate an object with the same value.

        EXAMPLES:

        """
        return "TropicalMatrix(%s)"%self.array

    def __str__(self):
        """
        Called to compute the "informal" string description of an object. 

        EXAMPLES:

        """
        return "Tropical matrix %s"%(self.matrix())

    def __add__(self, other):
        """
        Implements +. Assumes both self and other are instances of
        TropicalMatrix class.

        EXAMPLES:
            sage: A = [[1,2,Infinity],[3,Infinity,0]]
            sage: B = [[2,Infinity,1],[3,-1,1]]
            sage: AT = TropicalMatrix(A)
            sage: BT = TropicalMatrix(B)
            sage: AT
            TropicalMatrix([[TropicalNumber(1), TropicalNumber(2), TropicalNumber(+Infinity)],
              [TropicalNumber(3), TropicalNumber(+Infinity), TropicalNumber(0)]])
            sage: AT+BT
            [[TropicalNumber(1), TropicalNumber(2), TropicalNumber(1)],
             [TropicalNumber(3), TropicalNumber(-1), TropicalNumber(0)]]
        """
        A = self.array
        B = other.array
        C = []
        m = self.row_dim()
        n = self.column_dim()
        if m != other.row_dim:
            raise ValueError, "Row dimensions must be equal."
        if n != other.column_dim:
            raise ValueError, "Column dimensions must be equal."
        for i in range(m):
            row = [A[i][j]+B[i][j] for j in range(n)] # + as tropical numbers
            C.append(row)
        return C

    def __mul__(self, other):
        """
        Implements multiplication *.

        EXAMPLES:
            sage: A = [[1,2,Infinity],[3,Infinity,0]]
            sage: AT = TropicalMatrix(A)
            sage: B = [[2,Infinity],[-1,1],[Infinity,0]]
            sage: BT = TropicalMatrix(B)
            sage: AT*BT
             [[TropicalNumber(1), TropicalNumber(3)],
              [TropicalNumber(5), TropicalNumber(0)]]
            sage: A = [[0,1,3,7],[2,0,1,3],[4,5,0,1],[6,3,1,0]]
            sage: AT = TropicalMatrix(A)
            sage: A = [[0,1,3,7],[2,0,1,3],[4,5,0,1],[6,3,1,0]]
            sage: AT = TropicalMatrix(A)
            sage: print AT*AT*AT
            Tropical matrix [[0, 1, 2, 3], [2, 0, 1, 2], [4, 4, 0, 1], [5, 3, 1, 0]]
        """
        T = TropicalNumbers()
        A = self.matrix()
        B = other.matrix()
        C = []
        mA = self.row_dim()
        nA = self.column_dim()
        mB = other.row_dim()
        nB = other.column_dim()
        if nA != mB:
            raise ValueError, "Column dimension of A and row dimension of B must be equal."
        for i in range(mA):
            row = []
            for j in range(nB):
                c = T(Infinity)
                for k in range(nA):
                    c = c+T(A[i][k])*T(B[k][j])
                row.append(c.number())
            C.append(row)
        return TropicalMatrix(C)

This shows that the shortest distances of digraph with adjacency matrix \begin{pmatrix} 0&1&3&7\\ 2&0&1&3\\ 4&5&0&1\\ 6&3&1&0\end{pmatrix}, is equal to A\odot 3, which is equal to \begin{pmatrix} 0&1&2&3\\ 2&0&1&2\\ 4&4&0&1\\ 5&3&1&0\end{pmatrix}. This verifies an example given in chapter 1 of the book by Maclagan and Sturmfels, Introduction to Tropical Geometry .

Floyd-Warshall-Roy

The Floyd-Warshall-Roy algorithm is an algorithm for finding shortest paths in a weighted, directed graph. It allows for negative edge weights and detects a negative weight cycle if one exists. Assuming that there are no negative weight cycles, a single execution of the FWR algorithm will find the shortest paths between all pairs of vertices.

This algorithm is an example of dynamic programming, which allows one to break the computation down to simpler steps using some sort of recursive procedure. If n=|V| then this is a O(n^3)-time algorithm. For comparison, the Bellman-Ford algorithm has complexity O(|V|\cdot |E|), which is O(n^3)-time for “dense” graphs. However, Bellman-Ford only yields the shortest paths eminating from a single vertex. To achieve comparable output, we would need to iterate Bellman-Ford over all vertices, which would be an O(n^4)-time algorithm for “dense” graphs. Except for “sparse” graphs, Floyd-Warshall-Roy is better than an interated implementation of Bellman-Ford.

Here is a Sage implementation

def floyd_warshall_roy(A):
    """
    Shortest paths

    INPUT: 
        A - weighted adjacency matrix 

    OUTPUT
        dist  - a matrix of distances of shortest paths
        paths - a matrix of shortest paths

    EXAMPLES:
        sage: A = matrix([[0,1,2,3],[0,0,2,1],[20,10,0,3],[11,12,13,0]]); A
        sage: floyd_warshall_roy(A)
        ([[0, 1, 2, 2], [12, 0, 2, 1], [14, 10, 0, 3], [11, 12, 13, 0]], 
          [-1  1  2  1]
          [ 3 -1  2  3]
          [ 3 -1 -1  3]
          [-1 -1 -1 -1])
        sage: A = matrix([[0,1,2,4],[0,0,2,1],[0,0,0,5],[0,0,0,0]])
        sage: floyd_warshall_roy(A)
        ([[0, 1, 2, 2], [+Infinity, 0, 2, 1], [+Infinity, +Infinity, 0, 5],
          [+Infinity, +Infinity, +Infinity, 0]], 
          [-1  1  2  1]
          [-1 -1  2  3]
          [-1 -1 -1  3]
          [-1 -1 -1 -1])
        sage: A = matrix([[0,1,2,3],[0,0,2,1],[-5,0,0,3],[1,0,1,0]]); A
        sage: floyd_warshall_roy(A)
        Traceback (click to the left of this block for traceback)
        ...
        ValueError: A negative edge weight cycle exists.
        sage: A = matrix([[0,1,2,3],[0,0,2,1],[-1/2,0,0,3],[1,0,1,0]]); A
        sage: floyd_warshall_roy(A)
        ([[0, 1, 2, 2], [3/2, 0, 2, 1], [-1/2, 1/2, 0, 3/2], [1/2, 3/2, 1, 0]],
          [-1  1  2  1]
          [ 2 -1  2  3]
          [-1  0 -1  1]
          [ 2  2 -1 -1])
    """
    G = Graph(A, format="weighted_adjacency_matrix")
    V = G.vertices()
    E = [(e[0],e[1]) for e in G.edges()]
    n = len(V)
    dist = [[0]*n for i in range(n)]
    paths = [[-1]*n for i in range(n)]
    # initialization step
    for i in range(n):
        for j in range(n):
            if (i,j) in E:
                paths[i][j] = j
            if i == j:
                dist[i][j] = 0
            elif A[i][j] != 0:
                dist[i][j] = A[i][j]
            else:
                dist[i][j] = infinity
    # iteratively finding the shortest path
    for j in range(n):
        for i in range(n):
            if i != j:
                for k in range(n):
                    if k != j:
                        if dist[i][k]>dist[i][j]+dist[j][k]:
                            paths[i][k] = V[j]
                        dist[i][k] = min(dist[i][k], dist[i][j] +dist[j][k])
    for i in range(n):
        if dist[i][i] < 0:
            raise ValueError, "A negative edge weight cycle exists."
    return dist, matrix(paths)   

Sage and the future of mathematics

I am not a biologist nor a chemist, and maybe neither are you, but suppose we were. Now, if I described a procedure, in “standard” detail, to produce a result XYZ, you would (based on your reasonably expected expertise in the field), follow the steps you believe were described and either reproduce XYZ or, if I was mistaken, not be able to reproduce XYZ. This is called scientific reproducibility. It is cructial to what I believe is one of the fundamental principles of science, namely Popper’s Falsifiability Criterion.

More and more people are arguing, correctly in my opinion, that in the computational realm, in particular in mathematical research which uses computational experiments, that much higher standards are needed. The Ars Technica article linked above suggests that “it’s time for a major revision of the scientific method.” Also, Victoria Stodden argues one must “facilitate reproducibility. Without this data may be open, but will remain de facto in the ivory tower.” The argument basically is that to reproduce computational mathematical experiments is unreasonably difficult, requiring more that a reasonable expertise. These days, it may in fact (unfortunately) require purchasing very expensive software, or possessing of very sophisticated programming skills, accessibility to special hardware, or (worse) guessing parameters and programming procedures only hinted at by the researcher.

Hopefully, Sage can play the role of a standard bearer for such computational reproducibility. Sage is free, open source and there is a publically available server it runs on (sagenb.org).

What government agencies should require such reproducibility? In my opinion, all scientific funding agencies (NSF, etc) should follow these higher standards of computational accountability.

Evil spkgs?

In Matlab, a plugin is called a “toolkit”. In Photoshop, a plugin is called a plugin:-) As the success of Photoshop and Matlab suggest, plugins are very important. They enable “third party” developers to write specialized applications for your software package.

In Sage, (www.sagemath.org) a plugin is called an “spkg” (short for Sage PacKaGe).

Here is an easy-to-follow recipe for making an evil spkg.

  1. Make sure that your spkg modifies and existing Python or Cython file in the Sage devel tree. Extra evil bonus points for modifying heavily used files, such as in the plot subdirectory, and the more files touched the more evil points you earn. AFAIK, this will insure that, once your “innocent” spkg is applied to a developers tree, he cannot write a trac patch which can be correctly applied to a fresh clone.
  2. Make sure you ruin another spkg (surreptitiously, for extra bonus points). The more important the spkg you hose, the more evil points you get.

I wonder how Matlab and Photoshop avoid evil plugins?

Sage at the NSF-CDI+ECCAD conferences

This is a very quick view of some highlights of the conference http://www4.ncsu.edu/~kaltofen/NSF_WS_ECCAD09_Itinerary.html. I think further details of the talks will appear on the conference webpage later. This is very incomplete – just a few thought I was able to
jot down correctly.

Panel discussion:
Q1: What are the grand challenges of symbolic computing?
Is the term “symbolic computation” to broad? (Hybrid symbolic/numerical, algebraic geometric computation, algebraic combinatorial/group-theoretical, computer proofs, tensor calculations, differential, mathematical knowledge/database research, user interfaces, …)

General ans: No. Hoon Hong points out that user interfaces are lower level but below to the same group.

Q2: How can the latest algorithmic advances be made readily available: google analog of problem formulation? (Idea: suppose someone has a clever idea for a good algorithm but not enough discipline to implement it …)

One answer: Sage can put software together – is this the right way? Analog of stackoverflow.com?

Q3: What is the next killer app for symbolic computation? (Oil app of Groebner bases, cel phone app, robotics, …)

Q4: Can academically produced software such as LAPACK, LINBOX, SAGE compete with commercial software?

Hoon Hong answer: Yes but why? Why not cooperate. Support Sage very much but more research on interfaces and integration of different systems could lead to cooperation of the commercial systems with Sage.

Another panel:
Q: What are the spectacular successes and failures of computer algebra?

Failures:
(a) Small number of researchers.
(b) Sage could fail from lack of lies with the symbolic/numerical community (as Maxima/Axiom did). Matlab may fail due to uphill battle to integrate Mupad into good symbolic toolbox. (Many voiced view that Matlab is strong because of its many toolboxes, on the panel and privately.)
(c) Education at the High School level using CA.
(d) Presenting output of CA intelligently and in a standard format.
(e) Failure to teach people how to properly use a CA system.

Successes:
(a) Sage – interesting new effort (with caveat above)
(b) Groebner bases, LLL.

My talk on Sage raised a lot of questions. My There is both strong support for Sage and some questions on its design philisophy. My page 6 at http://sage.math.washington.edu/home/wdj/expository/nsf-eccad2009/
was a source of lots of questions.

At ECCAD http://www.cs.uri.edu/eccad2009/Welcome.html, Sage was mentioned a few times in talks as well as in some posters. The “main” Sage event was a laptop software demo Which Karl-Dieter Crisman set up for Sage.

Overall, a good experience for Sage.

Sage at AMS-MAA 2009 Washington DC

I think Sage gained a lot of publicity this year both by being at
the booth but also having an MAA panel discussion and an AMS session.
The panel discussion was good to be able to meet others in the
teaching community. I think this is related to Sage development because
projects like the educational open source software webworks has a
funding model which seems to be successful. I think Karl Crisman said he
would try to follow up on that.

A few people I met at the booth said they were interested in Sage development
but more stopped by saying that either they or their students could not afford
Maple or Mma and was looking into a cheaper quality alternative. The collaroration
possibilities of the Sage server was a strong “selling point” for smaller schools
which could load sage on a webserver.

There were some really good talks at the Sage session. For example,
Marshall’s talk had amazing graphics and Robert Miller’s talk was very well attended
(with maybe twice as many people in the audience as some of the others).
I thought the quality overall was great, but I’m very partial to such topics of course.

The general message I got from many was that more written material
on Sage in use would be welcomed, especially books. I was touched by one guy
who explained to me that his students were very poor (waitresses, for example)
who cannot afford calculus texts and commercial math programs. The point
he was implicitly making was that by offering software and documentation for
free we are actually improving the quality of such peoples’ lives in a real way.

Future coding theory in Sage projects

Here are a few ideas for future Sage projects in error-correcting codes.

Needed in Sage is

  • a rewrite in Cython, for speed,
  • special decoding algorithms, also in Cython,
  • for special decoders, it seems to me that one either needs
  1. more classes (eg, a CyclicCodes class), each of which will have a decode method, or
  2. another attribute, such as a name (string) which can be used to determine which decoder method to use.
  • codes over Rings (Cesar Agustin Garcia Vazquez is working on this)
  • codes defined from finite groups fings – for example split group codes,
  • a fast minimum distance function (followed by a fast weight distribution function), valid for all characteristics.

It seems more “Pythonic” to add more classes for decoders, but I am not sure.