Figuring out why AIs get flummoxed by some games



In Nim, there is a limited number of optimal moves for a given board configuration. If you don’t play one of them, then you essentially cede control to your opponent, who can go on to win if they play nothing but optimal moves. And again, the optimal moves can be identified by evaluating a mathematical parity function.
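The parity function described above is, in standard combinatorial game theory, the nim-sum: the bitwise XOR of all row sizes. A position is lost for the player to move exactly when the nim-sum is zero, and every optimal move restores a zero nim-sum for the opponent. A minimal sketch (the example rows below are illustrative, not the board configuration from the paper):

```python
from functools import reduce
from operator import xor

def nim_sum(rows):
    """Bitwise XOR of all row sizes (the 'parity function')."""
    return reduce(xor, rows, 0)

def optimal_moves(rows):
    """All moves (row index, new row size) leaving the opponent a zero nim-sum."""
    s = nim_sum(rows)
    if s == 0:
        return []  # every move loses against perfect play
    # Reducing row n to n ^ s is legal (strictly smaller) only when n ^ s < n.
    return [(i, n ^ s) for i, n in enumerate(rows) if n ^ s < n]
```

For rows of sizes 3, 4, and 5, the nim-sum is 2, and the only optimal move is to reduce the first row from 3 to 1, after which the nim-sum is again zero. This illustrates the point in the text: the set of winning moves can be tiny, and missing it hands control to the opponent.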

So, there are reasons to think that the training process that worked for chess might not be effective for Nim. The surprise is just how bad it actually was. Zhou and Riis found that for a Nim board with five rows, the AI got good fairly quickly and was still improving after 500 training iterations. Adding just one more row, however, caused the rate of improvement to slow dramatically. And, for a seven-row board, gains in performance had essentially stopped by the time the AI had played itself 500 times.

To better illustrate the problem, the researchers replaced the subsystem that suggested potential moves with one that operated randomly. On a seven-row Nim board, the performance of the trained and randomized versions was indistinguishable over 500 training iterations. Essentially, once the board got large enough, the system was incapable of learning from observing game outcomes. The initial state of the seven-row configuration has three potential moves that are all consistent with an ultimate win. Yet when the trained move evaluator of their system was asked to check all potential moves, it evaluated every single one as roughly equivalent.

The researchers conclude that Nim requires players to learn the parity function to play effectively. And the training procedure that works so well for chess and Go is incapable of doing so.

Not just Nim

One way to view the conclusion is that Nim (and by extension, all impartial games) is just weird. But Zhou and Riis also found signs that similar problems could crop up in chess-playing AIs trained in this manner. They identified several "wrong" chess moves—ones that missed a mating attack or threw away an endgame—that were initially rated highly by the AI's board evaluator. It was only because the software explored a number of additional branches several moves into the future that it was able to avoid these gaffes.
