In Stanislaw Ulam’s 1976 autobiography (Adventures of a mathematician, Scribner and Sons, New York, 1976), one finds the following problem concerning two players playing a generalization of the childhood game of “20 questions”.
(As a footnote: the history of who invented this game is unclear. It occurred in writings of Renyi, for example “On a problem in information theory” (Hungarian), Magyar Tud. Akad. Mat. Kutató Int. Közl. 6 (1961)505–516, but may have been known earlier under the name “Bar-kochba”.)
Problem: Player 1 secretly chooses a number from to ( is large but fixed, say ). Player 2 asks a series of “yes/no questions” in an attempt to determine that number. Player 1 can lie at most times ( is fixed). Assuming best possible play on both sides, what is the minimum number of “yes/no questions” Player 2 must ask to (in general) be able to correctly determine the number Player 1 selected?
There is a very large literature on this problem. Just google “searching with lies” (with the quotes) to see a selection of the papers (eg, this paper). It turns out this children’s game, when suitably generalized, has numerous applications to computer science and engineering.
We start with a simple example. Pick an integer from 0 to 1000000. For example, suppose you pick .
Here is how I can ask you 20 True/False questions and guess your number.
Is your number less than 524288? False
Is your number less than 655360? True
Is your number less than 589824? True
Is your number less than 557056? True
Is your number less than 540672? True
Is your number less than 532480? True
Is your number less than 528384? True
Is your number less than 526336? True
Is your number less than 525312? True
Is your number less than 524800? True
Is your number less than 524544? True
Is your number less than 524416? True
Is your number less than 524352? True
Is your number less than 524320? True
Is your number less than 524304? True
Is your number less than 524296? True
Is your number less than 524292? True
Is your number less than 524290? True
Is your number less than 524289? False
This is the general description of the binary search strategy:
First, divide the whole interval of size into half and find which half the number lies in. Discard the one it does not lie in.
Second, divide the subinterval of size into half and find which half the number lies in. Discard the one it does not lie in.
Third, divide the subinterval of size into half and find which half the number lies in. Discard the one it does not lie in.
And so on.
This process must stop after 20 steps, as you run out of non-trivial subintervals.
This is an instance of “searching with lies” where no lies are told and “feedback” is allowed.
I don’t know what your number is, but whatever it is, represent it in binary as a vector
where , where . The 20 questions are:
and so on
You may determine the value of the ‘s from this (since each is either 0 or 1).
This is an instance of “searching with lies” where no lies are told and “feedback” is not allowed.
Let’s go back to our problem where Player 1 secretly chooses a number from to ( is large but fixed, say ). Use the following notation: If the answers between successive questions is possible and then questions may be adapted to these answers (“feedback allowed”), call this minimum number of “yes/no questions” Player 2 must ask, . If feedback is not allowed, call this minimum number .
In practice, we not only want to know (resp., ) but an efficient algorithm for determining the “yes/no questions” Player 2 should ask.
In his 1964 PhD thesis (E. Berlekamp, Block coding with noiseless feedback, PhD Thesis, Dept EE, MIT, 1964), Berlekamp was concerned with a discrete analog of a problem which arises when trying to design a analog-to-digital converter for sound recording which compensates for possible feedback.
To formulate this problem in the case when lies are possible, we introduce some new terms.
A binary code is a set of -tuples of ‘s and ‘s, i.e., a subset , where is the field of integers . If we assume, in addition, is a vector space over then we say is a linear binary code. A codeword is an element of a binary code.
The Hamming distance between -tuples is the number of non-zero coordinates of .
The minimum distance of is the smallest integer such that for all distinct .
For example, let
This is a 1-error correcting code of minimum distance .
We say that is -error correcting if the following property always holds: given any and any , if then every , , must satisfy (in other words, there is at most one codeword within distance from any ).
Let denote the largest possible size of a binary code of minimum distance .
Fact: If a binary code is -error correcting then .
Let denote the largest possible size of a binary linear code of minimum distance . It is known that , for some (called the dimension of ).
Fact: Fix any and . is the smallest such that .
Record the answers to all the “yes/no questions” as an -tuple of ‘s (for “no”) and ‘s (for “yes”). The binary code is the set of all codewords corresponding to correct answers. Intuitively, if you are Player 2 and you ask Player 1 what the coordinates of his codeword associated to his secret number is, and if he lies times then you will receive from him a binary vector of Hamming distance from a codeword. Since is -error correcting, you can determine from .
We want the smallest possible satisfying the condition . The values of (or , where we take , to be more precise) have been tabulated for “small” values of and online at the MinT website (U. of Salzburg, Wolfgang Ch. Schmid and Rudolf Schürer), codetables.de (Markus Grassl), and Boston U. (Ari Trachtenberg). This gives us an upper bound for , and hence also for for “small” values of and .
Here’s a recipe for using online tables to find a good upper bound the smallest number Q of questions Player 2 must ask Player 1, if Player 1 gets to lie times:
- Find the smallest k such that ,
- go to the table minT and go to the row ; find the smallest column number n which contains an entry greater than or equal to k.
- that column number n is an upper bound for Q.