Trained transformer-based chess models to play like humans (including thinking time) [P]

I trained a set of deep learning (transformer-based) chess models to play like humans (inspired by MAIA and Grandmaster Chess Without Search).

There's a separate model for each 100-point rating bucket from ~800 to 2500+. I started by training a mid-strength model from scratch on an 8xH100 cluster, then fine-tuned models for the other rating ranges on my local 5090 GPU. The training set was nearly a year of Lichess data, about 1B games in total.
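To make the bucketing concrete, here is a minimal sketch of how a rating could be mapped to its 100-point bucket. The function name, the clamping at the edges, and the exact 800/2500 bounds are my assumptions for illustration, not necessarily what the repo does.

```python
def rating_bucket(rating: int, low: int = 800, high: int = 2500, width: int = 100) -> int:
    """Return the lower bound of the rating bucket for a player (hypothetical helper)."""
    # Clamp so ratings below `low` or above `high` fall into the edge buckets.
    clamped = max(low, min(rating, high))
    # Round down to the nearest multiple of `width` above `low`.
    return low + ((clamped - low) // width) * width


if __name__ == "__main__":
    for elo in (700, 1234, 2499, 2750):
        print(elo, "->", rating_bucket(elo))
```

The returned value could then be used as a key to pick which fine-tuned checkpoint to load.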

Each rating range actually has 3 models: a move model, a thinking time model, and a white win / draw / black win model. Despite being quite small (only 9M parameters!), the move models achieve better accuracy than MAIA-2 and are approximately on par with MAIA-3 (see here for MAIA-2 comparison).
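For anyone unfamiliar with three-way outcome heads, this is roughly how raw outputs for white win / draw / black win can be turned into probabilities and an expected score. It is a generic softmax sketch; the function names and the output ordering are assumptions and not taken from the repo.

```python
import math


def wdl_probabilities(logits):
    """Softmax over [white_win, draw, black_win] logits (ordering is an assumption)."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def expected_score_white(probs):
    """Expected score for white: a win counts 1, a draw counts 0.5."""
    p_white, p_draw, _p_black = probs
    return p_white + 0.5 * p_draw


if __name__ == "__main__":
    probs = wdl_probabilities([1.2, 0.3, -0.5])
    print(probs, expected_score_white(probs))
```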

AFAIK this is the only attempt to train on thinking times in chess, so I don't have a benchmark to compare against for that.

Likely because of the network size, at high ratings the models aren't quite as good as they could be. They see short tactical motifs but can't do deep calculation - probably a bigger model would help here.

The move and win models take into account player ratings and clock times. For instance, under extreme time pressure a much stronger player has a lower win prob even if their opponent is weaker. The models blunder more under time pressure as well.
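As a rough illustration of what conditioning on ratings and clocks could look like at the input side, here is a small feature-encoding sketch. Everything in it (the function name, centering ratings at 1500, log-scaling the clocks) is an assumption chosen for illustration, not the encoding the models actually use.

```python
import math


def conditioning_features(white_elo, black_elo, white_clock_s, black_clock_s):
    """Build a 4-element conditioning vector (hypothetical encoding)."""

    def scale_elo(elo):
        # Center around 1500 and shrink to roughly [-1, 1].
        return (elo - 1500) / 1000

    def scale_clock(seconds):
        # Log scale: the gap between 2s and 10s matters more than 302s vs 310s.
        return math.log1p(max(seconds, 0.0))

    return [
        scale_elo(white_elo),
        scale_elo(black_elo),
        scale_clock(white_clock_s),
        scale_clock(black_clock_s),
    ]


if __name__ == "__main__":
    # A stronger player with 3 seconds left against a weaker one with 2 minutes.
    print(conditioning_features(2200, 1600, 3.0, 120.0))
```

A vector like this would be fed to the network alongside the board, so the same position can yield different predictions depending on who is playing and how much time is left.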

The data pipeline is C++ via nanobind, then training with PyTorch. Getting the pipeline right was actually the thing I spent the most time on. Pre-shuffling the dataset once, and then reading the shuffled dataset sequentially at training time, kept GPU utilization high. Without this, training spent a huge percentage of its time on I/O while the GPU sat idle.

Happy to answer questions about the rating-conditioning, the clock model, or the data pipeline.
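The pre-shuffle idea can be sketched in a few lines of plain Python. This is a toy version only: the real pipeline is C++, and the record layout, function names, and in-memory shuffle here are assumptions (an in-memory shuffle would not scale to ~1B games, where some form of external shuffle would be needed).

```python
import random
import struct

# Hypothetical fixed-size record: one little-endian uint32 per sample.
RECORD = struct.Struct("<I")


def write_shuffled(samples, path, seed=0):
    """Shuffle once, offline, and write records in shuffled order."""
    order = list(samples)
    random.Random(seed).shuffle(order)
    with open(path, "wb") as f:
        for s in order:
            f.write(RECORD.pack(s))


def read_batches(path, batch_size):
    """Read the pre-shuffled file front to back: no seeks, no random access."""
    chunk = RECORD.size * batch_size
    with open(path, "rb") as f:
        while True:
            data = f.read(chunk)
            if not data:
                break
            yield [value for (value,) in RECORD.iter_unpack(data)]
```

Because the randomness is paid for once at write time, the training loop only ever does large sequential reads, which is the access pattern disks and page caches handle best.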

Code (including training code and model weights) is at https://github.com/thomasj02/1e4_ai/. A demo is at https://1e4.ai/ but all the frontend code is also in the repo if you want to self-host.

submitted by /u/hazard02