
GTO Wizard AI Beat All Major LLMs in Poker: Is The Game Over?

Poker is not solved yet, but GTO Wizard appears to have brought that moment one step closer with its poker-specialized AI model. Its latest test, a series of matches against large language models (LLMs), produced results that raise questions about the future of the game. Let’s take a look.

What Powers GTO Wizard AI

In 2023, GTO Wizard acquired the groundbreaking Ruse AI, an advanced poker solver created a year earlier by two Canadians, Philippe Beardsell and Marc-Antoine Provost.

Ruse AI caught GTO Wizard’s attention after beating Slumbot, an Annual Computer Poker Competition (ACPC) champion and the top heads-up no-limit (HUNL) poker bot of that time.

They played by the ACPC rules:

  • 150,000 hands
  • Average acting time is restricted to 7 seconds per hand
  • Stack size resets to 200 BB after each hand

Ruse AI finished this heads-up match with a record-breaking win rate of 19.4 BB/100.

The key difference between the two models was their approach. Both tried to play close to a Nash equilibrium, but Slumbot neither adapted its strategy nor exploited its opponents’ errors, while Ruse AI analyzed each specific situation during the game and re-solved it in real time.

GTO Wizard AI vs. 20+ LLMs

Each heads-up no-limit Texas Hold’em match between GTO Wizard AI and an LLM contestant lasted 5,000 hands.

All major LLMs participated, including different versions of GPT, Gemini, Claude Opus, Kimi and Grok.

The results were striking: every single model lost badly in heads-up poker against GTO Wizard AI.

GPT-5.3 (XHigh Reasoning) had the best luck-adjusted win rate of minus 16 bb/100.

GPT-5.4 Nano (No Reasoning) had the worst luck-adjusted win rate of minus 189.7 bb/100.

For context, GTO Wizard used 4 bb/100 as the benchmark win rate of an elite professional. The experiment showed that even the best general-purpose LLM loses to GTO Wizard AI at four times that rate.
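To make the bb/100 figures concrete, here is a minimal Python sketch of the arithmetic behind them. The function and constant names are illustrative assumptions, not part of GTO Wizard's tooling. The article reports luck-adjusted rates, so the big-blind totals below only show the scale of the losses and are not actual chip counts from the matches.

```python
def win_rate_bb_per_100(net_big_blinds: float, hands: int) -> float:
    """Convert a net result in big blinds into a win rate per 100 hands."""
    if hands <= 0:
        raise ValueError("hands must be positive")
    return net_big_blinds / hands * 100


def implied_net_big_blinds(rate_bb_per_100: float, hands: int) -> float:
    """Net big blinds implied by a bb/100 win rate over a number of hands."""
    return rate_bb_per_100 * hands / 100


# Figures quoted in the article.
ELITE_PRO_BENCHMARK = 4.0   # bb/100, GTO Wizard's elite-pro benchmark
BEST_LLM_RATE = -16.0       # bb/100, best luck-adjusted LLM result
WORST_LLM_RATE = -189.7     # bb/100, worst luck-adjusted LLM result
MATCH_HANDS = 5_000         # hands per match

# How many times the elite benchmark the best LLM loses per 100 hands.
loss_multiple = abs(BEST_LLM_RATE) / ELITE_PRO_BENCHMARK

# Illustrative totals over one 5,000-hand match.
best_total = implied_net_big_blinds(BEST_LLM_RATE, MATCH_HANDS)
worst_total = implied_net_big_blinds(WORST_LLM_RATE, MATCH_HANDS)

print(f"Best LLM loses at {loss_multiple:.1f}x the elite benchmark")
print(f"Best LLM: about {best_total:.0f} bb over {MATCH_HANDS} hands")
print(f"Worst LLM: about {worst_total:.0f} bb over {MATCH_HANDS} hands")
```

Under these assumptions, a rate of minus 16 bb/100 corresponds to roughly 800 big blinds lost over a single 5,000-hand match.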

All the results of this experiment are compiled into one leaderboard. Here is the top 10:

AI poker leaderboard

You can check full results on the GTO Wizard Benchmark website.

GTO Wizard has also opened the project to a wider contest: anyone can submit their own agent to test it against GTO Wizard AI.

Why LLMs Play Poker Badly

Following this experiment, GTO Wizard identified four aspects of every poker hand that explain why LLMs play poker so badly, even against each other:

  1. Hidden information — you never see your opponent’s cards
  2. Balancing ranges across thousands of decision points
  3. Long-term planning where each action shapes future streets
  4. Opponent modeling under deep uncertainty

While LLMs can solve differential equations, they struggle with nuanced tasks such as constructing a balanced river strategy.

Somewhat amusingly, they also sometimes misread their own hands and make wrong decisions as a result. The experiment showed that they confuse suited and offsuit holdings approximately 2% of the time.

Finally, Cesar Enrique Aponte Rivas, who has been auditing Gemini and Grok, shared a pointed conclusion about LLMs and poker on X:

“They consistently fail in logical integrity and are nowhere near the level for professional GTO. Poker is the ultimate reality check for LLMs.”

Has GTO Wizard AI ‘Solved’ Poker?

The developers state that “GTO Wizard AI plays a near-perfect Nash Equilibrium strategy” and that “no human can beat that over a meaningful sample”.

The implication is that even elite pros who beat other humans at 4 bb/100 would lose to GTO Wizard AI.

That sounds alarming until you consider a few things:

  • GTO Wizard AI has proven effective against LLMs that are quite weak at poker. It has not yet been tested extensively and publicly against real pros.
  • GTO Wizard AI has excelled only in heads-up no-limit Texas Hold’em. Poker has many other variants that are not so easy to solve.
  • GTO Wizard AI is not a publicly available tool. It powers the GTO Wizard solver but cannot be used directly by anyone outside the development team.
  • Using solvers during play remains prohibited in all online rooms. Security teams continue to monitor for them, even as background processes, and ban players who violate the restrictions.

Moreover, the results of these heads-up matches can be read as good news for professional players: they confirm that publicly available LLMs are still bad at poker, so anyone relying on them for advice will be an easy opponent.

Written by: Vasilisa Zyryanova, Blog Content Editor