Medieval monks to modern metrics

The Philosophers’ Game

Drawing loosely on the school curriculum of the time, the quadrivium, plus a good deal of ingenuity, two medieval monks from Germany invented a peculiar board game: rithmomachia, the so-called “Battle of Numbers”. (The monks were Asilo and Hermanus Contractus; see [RN] for more background.) It wasn’t just a student’s idle pastime. It was infused with sacred interpretations that made it a training ground for spiritual and mathematical thinking. Indeed, to be a competitive player, you needed to know arithmetic, geometric, and harmonic progressions, as combinations of these could be the difference between victory and defeat.

The game’s most elegant victory condition is the proper victory, achieved by advancing three pieces into enemy territory whose face values form a mathematical progression. The medieval monks who codified these rules saw something profound in this condition. Think of such a trio as a “well-coordinated fireteam”: three soldiers working in concert, their numerical harmony representing a kind of strategic perfection.

Somewhere in the process of thinking about both chess and rithmomachia, a question began to form: What would a “territorial” victory condition look like in chess?

Chess has no progressions to form from its pieces, no face values to arrange in patterns. But it does have something any rithmomachia player would recognize immediately: the concept of coordinated pieces controlling key terrain. What if there were a measurable threshold—a moment when one side’s pieces become so well-coordinated in enemy territory that victory becomes, in some sense, inevitable?

The Pursuit of a Chess Metric

This question led me down an unexpected path. I began exploring what quantitative chess analysis might reveal about positional dominance—not just material advantage or tactical threats, but something closer to what rithmomachia players meant by “coordination.”

Classical chess evaluation computes positional metrics such as space control, piece mobility, king safety, and tactical threats. Modern Stockfish (version 16 and later) has abandoned these classical metrics when forming its final position evaluation; instead, the evaluation is based entirely on NNUE neural networks, which are stronger but operate as black boxes. Those classical positional metrics, however, remain central to the CGA-based production of QARC LaTeX reports that quantitatively analyze chess games.

One of the first books that discussed chess metrics was [Be99].

What if some combination of these interpretable factors could serve, in a significant majority of cases, as a chess analogue to a proper victory in rithmomachia? A numerical condition based on “territorial” metrics that, when achieved, strongly predicts the eventual winner in chess?

Using ChatGPT and data collected with the CGA module, I explored linear regression approaches to finding useful weights for these positional metrics. The conversations were genuinely valuable; the back-and-forth helped me think through which metrics mattered, how to handle the different scales of measurement, and whether simplicity or precision should take priority. In the end, I chose simplicity. The result is what I call the Fireteam Index.
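
To make the regression idea concrete, here is a minimal sketch of fitting linear weights to per-game metric deltas with ordinary least squares. Everything in it is an illustrative assumption: the function name, the array layout, and the toy numbers are mine, not the CGA module’s actual code or data.

    # Minimal sketch, not the CGA module's code. Each row of X is assumed to hold
    # a game's average metric deltas (space, mobility, king safety, threats) from
    # White's point of view; y holds the result from the same point of view
    # (+1 White win, 0 draw, -1 Black win). All numbers below are made up.
    import numpy as np

    def fit_fti_weights(X, y):
        """Least-squares fit of linear weights for the positional deltas."""
        X = np.asarray(X, dtype=float)
        y = np.asarray(y, dtype=float)
        weights, _residuals, _rank, _sv = np.linalg.lstsq(X, y, rcond=None)
        return weights

    X = [[12.0,  5.0,  1.5,  30.0],
         [-8.0, -3.0, -0.5, -20.0],
         [15.0,  6.0,  2.0,  45.0],
         [ 2.0,  0.5,  0.0,   5.0]]
    y = [1, -1, 1, 0]
    print(fit_fti_weights(X, y))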

The Fireteam Index

The analysis led to two formulas, reflecting different hypotheses about what drives chess outcomes.

Standard FTI uses all four positional/territorial factors:

\text{FTI} = \Delta\text{Space} + \Delta\text{Mobility} + \Delta\text{King Safety} + \frac{\Delta\text{Threats}}{10}

Each Δ represents the difference between your value and your opponent’s. Threats are divided by ten because they’re measured on a different scale and tend to be more volatile.

Truncated FTI (FTI-T) emerged from exploring whether a simpler model might work as well or better:

\text{FTI-T} = 1.2 \cdot \Delta\text{Space} + 0.6 \cdot \Delta\text{Threats}

This two-term formula drops Mobility and King Safety entirely, focusing only on territorial control and tactical pressure. The weights 1.2 and 0.6 were derived through linear regression on Berliner’s games from the 5th World Correspondence Chess Championship. The hypothesis: for decisive games, these two factors may capture most of what matters, while the additional complexity of the four-term model adds noise rather than signal.
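
For concreteness, both formulas translate directly into code. The sketch below mirrors the equations above; the function names and argument order are mine, not the CGA module’s API, and the example numbers are invented.

    # Minimal sketch of the two Fireteam Index formulas defined above.
    # Each delta is (your value) minus (your opponent's value) for the
    # corresponding CGA positional metric; names are illustrative only.

    def fti_standard(d_space, d_mobility, d_king_safety, d_threats):
        """Standard FTI: all four positional/territorial factors."""
        return d_space + d_mobility + d_king_safety + d_threats / 10.0

    def fti_truncated(d_space, d_threats):
        """Truncated FTI-T: space and threats only, with regression-derived weights."""
        return 1.2 * d_space + 0.6 * d_threats

    # Example for a single (hypothetical) position:
    print(fti_standard(d_space=6.0, d_mobility=4.0, d_king_safety=1.0, d_threats=20.0))  # 13.0
    print(fti_truncated(d_space=6.0, d_threats=20.0))                                    # 19.2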

Both formulas are applied using two prediction algorithms, sketched in code after the list:

  • Per-ply: Examines the raw index at each half-move. If one side maintains a positive FTI for ten consecutive plies after the opening, we predict that side will win.

  • Windowed (smoothed): Applies a ten-ply rolling average before checking for the sustained advantage. This filters out tactical noise and captures deeper positional trends.
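
A minimal Python sketch of the two prediction rules, as described above, follows. The ten-ply streak and the ten-ply window come from the description; the opening cutoff of twenty plies, the sign convention (positive FTI favors White), and the function names are assumptions for illustration.

    # Minimal sketch of the two prediction rules; not the CGA module's code.
    # `fti_series` is a list of per-ply FTI values from White's point of view
    # (positive = White advantage). `opening_plies` is an assumed opening cutoff.

    def predict_per_ply(fti_series, opening_plies=20, streak=10):
        """Predict a winner once one side's FTI keeps the same nonzero sign
        for `streak` consecutive plies after the opening."""
        run_sign, run_len = 0, 0
        for value in fti_series[opening_plies:]:
            sign = (value > 0) - (value < 0)
            if sign != 0 and sign == run_sign:
                run_len += 1
            else:
                run_sign, run_len = sign, (1 if sign != 0 else 0)
            if run_len >= streak:
                return "white" if run_sign > 0 else "black"
        return None  # no prediction for this game

    def predict_windowed(fti_series, opening_plies=20, window=10, streak=10):
        """Same rule, applied to a `window`-ply rolling average of the FTI.
        (Index alignment with the opening cutoff is approximate in this sketch.)"""
        smoothed = [sum(fti_series[i - window + 1:i + 1]) / window
                    for i in range(window - 1, len(fti_series))]
        return predict_per_ply(smoothed, opening_plies, streak)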

The CGA module now tracks both FTI variants in parallel, allowing direct comparison of their predictive accuracy across game collections.

Enter Hans Berliner

To test these ideas, I needed games with meticulous play and clear outcomes—games where positional factors would have time to matter. This led me to correspondence chess, and specifically to the games of Hans Berliner.

Berliner’s story is remarkable. Born in Berlin in 1929 to a Jewish family, he fled Nazi Germany at age eight, eventually settling in Washington, D.C. He learned chess at thirteen during a rainy day at summer camp, and the game became his “main preoccupation.” By twenty, he was a master; by thirty-six, he was World Correspondence Chess Champion.

His 1965–68 championship campaign was historically dominant. He scored 14 out of 16 points—twelve wins and four draws—a margin of victory three times greater than any previous world champion. His game against Yakov Estrin in the Two Knights Defense was later ranked by grandmaster Andy Soltis as the greatest chess game of the 20th century. The opening novelty Berliner introduced in that game still appears in theoretical discussions today.

But what makes Berliner particularly fitting for this project is what came next. After his chess triumph, he enrolled at Carnegie Mellon University at age forty to study computer science under Allen Newell. He completed his Ph.D. thesis in 1974 [Be74]. He went on to create BKG, a backgammon program that defeated the world champion in 1979—the first computer program to beat a human world champion in any game. He developed the B* search algorithm for game trees. And he led the team that built HiTech, the first chess computer to reach senior master strength and, in 1988, the first to defeat a grandmaster.

The CGA Module and QARC

This brings us to the Chess Game Analyzer (CGA) module and the QARC reports—Quantitative-Analytic Reports on Chess.

The CGA module is a Python tool that processes PGN files through Stockfish analysis and generates comprehensive LaTeX reports. For position evaluation, it uses Stockfish’s centipawn scores. For the positional metrics that feed the Fireteam Index—space, mobility, king safety, and threats—it implements custom calculations using python-chess, inspired by classical evaluation concepts but computed independently of any engine. This approach works with any version of Stockfish and produces interpretable, human-understandable metrics.
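
To give a flavor of what “computed independently of any engine” means in practice, here is a minimal illustration using python-chess. The particular definitions of mobility and space below are simplified placeholders, not the CGA module’s actual formulas.

    # Simplified, engine-independent metrics with python-chess; placeholders
    # for the CGA module's own (more elaborate) calculations.
    import chess

    def mobility(board: chess.Board, color: chess.Color) -> int:
        """Count legal moves available to `color`, regardless of whose turn it is.
        (Flipping the side to move ignores whether the resulting position is legal.)"""
        b = board.copy(stack=False)
        b.turn = color
        return b.legal_moves.count()

    def space(board: chess.Board, color: chess.Color) -> int:
        """Count squares in the opponent's half of the board attacked by `color`."""
        opponent_half = range(32, 64) if color == chess.WHITE else range(0, 32)
        return sum(1 for sq in opponent_half if board.attackers(color, sq))

    board = chess.Board()  # starting position
    d_space = space(board, chess.WHITE) - space(board, chess.BLACK)
    d_mobility = mobility(board, chess.WHITE) - mobility(board, chess.BLACK)
    print(d_space, d_mobility)  # both 0 by symmetry in the starting position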

Version 6t introduces parallel tracking of both the standard FTI and truncated FTI-T, with book-level summaries comparing their predictive accuracy. This allows empirical testing of whether the simpler two-term model outperforms the more comprehensive four-term version.
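
The book-level comparison amounts to simple bookkeeping over a game collection. A sketch, reusing predict_windowed from the earlier prediction-rule sketch, might look like this; the tuple layout for games is an assumption, not the CGA module’s data model.

    # Sketch of a book-level accuracy tally for the two FTI variants.
    # `games` is assumed to be a list of (fti_series, fti_t_series, result) tuples,
    # with result in {"white", "black", "draw"}.

    def summarize(games):
        """Fraction of decisive games whose winner each variant predicted correctly."""
        tallies = {"FTI": 0, "FTI-T": 0}
        decisive = 0
        for fti_series, fti_t_series, result in games:
            if result == "draw":
                continue
            decisive += 1
            if predict_windowed(fti_series) == result:
                tallies["FTI"] += 1
            if predict_windowed(fti_t_series) == result:
                tallies["FTI-T"] += 1
        return {k: v / decisive for k, v in tallies.items()} if decisive else {}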

QARC reports are produced purely by computer from PGN files—no human annotation required. This makes it feasible to generate detailed analytical reports for entire tournaments, specific matches, or a player’s path through an event. I plan to post QARC files for various matches and player-paths on GitHub, along with the CGA Python code itself.

The rithmomachia-inspired Fireteam Index is one component of these reports, but the “Q” in QARC stands for the broader goal: making quantitative chess analysis accessible and reproducible.

In testing against Berliner’s championship games, the FTI algorithms correctly predicted the winner in a striking percentage of decisive games. More interesting than the raw accuracy, though, is what the metrics reveal about how Berliner won. His games show characteristic patterns: controlled aggression in the opening, relentless accumulation of small positional advantages, and—crucially—the ability to maintain FTI dominance once established.

The draws are equally instructive. When Berliner drew, the FTI analysis shows neither side achieving sustained dominance. He understood which positions to fight in and which to hold.

What’s in This Package

The accompanying materials include:

  • chess_game_analyzer.py: The CGA module itself, with both FTI variants and prediction tracking
  • QARC report examples: Sample Quantitative-Analytic Reports on Chess generated from the Berliner championship games
  • Documentation: The theoretical framework connecting rithmomachia concepts to the Fireteam Index

The code will be released on GitHub under a permissive license (MIT or modified BSD, your choice). The methodology is documented well enough that you can generate QARC reports for your own game collections.

Looking Forward

I’m continuing to explore several directions:

  • Comparing standard FTI versus truncated FTI-T across larger game collections to determine which is more predictive
  • Applying the indices to different playing styles and time controls
  • Investigating whether similar “territorial threshold” concepts appear in other abstract strategy games
  • Exploring phase-specific variations (opening, middlegame, endgame)
  • Examining how the weights (1.2, 0, 0, 0.6), derived from Berliner’s games, might differ for players at other skill levels; it would be interesting to learn how these weights vary across categories of players (e.g., speed vs. classical time controls, GM vs. master, and so on)

The medieval monks who invented rithmomachia understood something that modern chess analysts are still working to quantify: there is a moment when coordination becomes dominance, and control of enemy territory leads to victory. The Fireteam Index is one attempt to capture that moment numerically.

It’s fitting, somehow, that the path from a medieval mathematical game to modern chess metrics led through Hans Berliner—a man who bridged the worlds of human chess mastery and computer chess pioneering. His games provided the testing ground; his research provided the conceptual tools. And his story reminds us that the deepest questions in game analysis have always been the same: What does it mean to coordinate? What does it mean to dominate? And how do we know when victory has become inevitable?

References

[Be74] Hans Berliner, “Chess as Problem Solving: The Development of a Tactics Analyzer,” Ph.D. thesis, Carnegie Mellon University, 1974.

[Be99] Hans Berliner, The System: A World Champion’s Approach to Chess, Gambit Publications, 1999.

[RN] https://www.rithmo.net/

David Joyner, January 2026

This work draws on conversations with Claude (Anthropic) and ChatGPT (OpenAI), Stockfish 17 for position evaluation, and the correspondence chess archives that preserve Berliner’s remarkable games.