Releases: lightvector/KataGo
Experimental Eval Cache, Bugfixes
If you're a new user, this section has tips for getting started and basic usage! If you don't know which version to choose (OpenCL, CUDA, TensorRT, Eigen, Eigen AVX2), see here.
Download the latest neural nets to use with this engine release at https://katagotraining.org/.
Also, for 9x9 boards or for boards larger than 19x19, see https://katagotraining.org/extra_networks/ for networks specially trained for those sizes!
KataGo is continuing to improve at https://katagotraining.org/ and if you'd like to donate your spare GPU cycles and support it, it could use your help there!
Notes about Precompiled Exes in this Release
For CUDA and TensorRT, the executables attached below are labeled with the versions of the libraries they are built for. E.g. trt10.2.0 for TensorRT 10.2.0.x, or cuda12.5 for CUDA 12.5.x, etc. It's recommended that you install and run these with the matching versions of CUDA and TensorRT rather than trying to run with different versions.
The OpenCL version will more often work as long as you have any semi-modern GPU hardware accelerator and appropriate drivers installed, whether for Nvidia or non-Nvidia GPUs, without needing any specific versions, although it may be a bit less performant.
Available also below are both the standard and +bs50 versions of KataGo. The +bs50 versions are just for fun, and don't support distributed training but DO support board sizes up to 50x50. They may also be slightly slower and will use much more memory, even when only playing on 19x19, so use them only when you really want to try large boards.
The Linux executables were compiled on a 22.04 Ubuntu machine using AppImage. You will still need to install e.g. correct versions of Cuda/TensorRT or have drivers for OpenCL, etc. on your own. Compiling from source is also not so hard on Linux, see the "TLDR" instructions for Linux here.
Changes this Release
- Added experimental eval-caching feature, NOT enabled by default yet. Enable it by setting `useEvalCache=true` in the gtp.cfg or analysis.cfg (see the example snippet after this list). This makes it so that while analyzing with KataGo interactively, if you walk deeper into a variation and KataGo realizes a good move that was a blind spot, then when you walk backward to an earlier point, the search will be far more likely to solve that tactic as well, and will be able to analyze the earlier position in light of the newly solved tactic.
- Subtree value bias now no longer applies to nodes following passing, to avoid conflating evals that don't have a discriminating local pattern.
- Fixed issue in contributing to distributed training where unnecessary locking might block starting new games while old games were uploaded.
- Fixed some typos that could cause the python testing script `python/play.py`, or getting input features in python, to crash.
- Various internal refactors and cleanups.
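For concreteness, here is a minimal sketch of what enabling the eval cache might look like in a gtp.cfg. Only the `useEvalCache` line comes from this release's notes; the surrounding entries are just typical config lines with placeholder values:

```ini
# Typical gtp.cfg excerpt (surrounding values are illustrative placeholders)
logDir = gtp_logs
maxVisits = 1000

# Experimental in this release: cache evals so that tactics solved deeper in
# a variation carry back when you step to earlier positions while analyzing.
useEvalCache = true
```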
Various bugfixes, mingw support
This is not the latest release - see a more recent release at v1.16.4!
Changes this Release
Added support
- Added support for MinGW compilation for OpenCL/Eigen on Windows (TODO)
- Allowed human policy to be used for book generation
Fixes for contribution to distributed training
- Fixed bug regarding already-finished games that occasionally caused contribution to katagotraining.org to crash. (thanks to luotianyi for reporting and testing!)
- Linux executables for CUDA and TensorRT have now been built with TCMalloc enabled, in the hope that this might reduce long-term memory fragmentation during contribute.
Other fixes
- Fixed issue that would result in faulty search behavior with some passing hack options
- Fixed incorrect turn number when converting books to startposes for training.
- Fixed debugging output for illegal game history checking
- Fixed some issues with compilation and scripts on Metal and/or MacOS
Metal Backend and TensorRT Compile Bugfix
This is not the latest release - see a more recent release at v1.16.4!
Changes this Release
This is a quick bugfix release for two issues:
- Fix major bug in v1.16.1 neural net weight calculations that completely broke the Metal backend. (Thanks to @dfannius and @ChinChangYang for quick reporting, fixing, and testing!)
- Fix issue in detecting versions from TensorRT header files with certain recent TensorRT versions when compiling from source.
The changes in this release compared to v1.16.1 mainly affect users using the Metal backend or building TensorRT from source, so users on v1.16.1 with other backends who are not building from source don't need to upgrade. (Users on v1.16.0, though, should still upgrade to fix issues with potential KataGo crashes.)
Reduced Numeric Issues and other Bugfixes
This is not the latest release - see a more recent release at v1.16.4!
Notes about Precompiled Exes in this Release
For CUDA and TensorRT, the executables attached below are labeled with the versions of the libraries they are built for.
- E.g. "trt10.2.0" for TensorRT 10.2.0.*
- E.g. "cuda12.5" for CUDA 12.5.*
- etc.
It's recommended that you install and run these with the matching versions of CUDA and TensorRT rather than trying to run with different versions.
The OpenCL version will more often work as long as you have any semi-modern GPU hardware accelerator and appropriate drivers installed, whether for Nvidia or non-Nvidia GPUs, without needing any specific versions, although it may be a bit less performant.
Available also below are both the standard and +bs50 versions of KataGo. The +bs50 versions are just for fun, and don't support distributed training but DO support board sizes up to 50x50. They may also be slower and will use much more memory, even when only playing on 19x19, so use them only when you really want to try large boards.
The Linux executables were compiled on a 22.04 Ubuntu machine using AppImage. You will still need to install e.g. correct versions of Cuda/TensorRT or have drivers for OpenCL, etc. on your own. Compiling from source is also not so hard on Linux, see the "TLDR" instructions for Linux here.
Changes this Release
Mitigation for crashes due to infinites/nans:
With v1.16.0, some users observed that KataGo would sometimes crash while contributing to katagotraining.org on TensorRT in 19x19 positions with extreme komis or results, and that KataGo would crash often when running on larger board sizes, in both cases due to a nonfinite (i.e. nan or infinite) policy output. The latter was especially common with nets that were not trained for large boards, but occasional crashes also occurred with some nets that were trained for large boards. Our best guess is that the cause is occasional extremely large activations in the net.
KataGo now internally scales the weights of the neural net in a few ways that should reduce the typical magnitude of the internal activations. When running with FP16, this should hopefully make better use of the available FP16 range, so that somewhat more extreme values than before are required to cause a crash, and should hopefully be enough of a buffer to stop the 19x19 contribute crashes entirely. The TensorRT backend was also changed to convert the output heads to FP32 one layer earlier, which might help, as some of the larger activations tend to be in the heads.
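As a rough illustration of the failure mode (not KataGo's actual code): FP16 tops out around 65504, so one oversized activation overflows to inf, and a softmax over logits containing inf then produces nans, i.e. a nonfinite policy:

```python
import numpy as np

# FP16 can represent values only up to ~65504; larger ones overflow to inf.
x = np.array([1.0, 2.0, 70000.0], dtype=np.float16)
print(x)  # [ 1.  2. inf]

# A softmax over logits containing inf yields nan (from inf - inf),
# poisoning the whole policy output, as in the crashes described above.
probs = np.exp(x - x.max()) / np.exp(x - x.max()).sum()
print(probs)  # [nan nan nan]
```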
Other various notable changes (C++/Engine):
- Fixed a bug in the "-fixed-batch-size" and "-half-batch-size" arguments to benchmark in how they set the benchmark batch size limits (example invocation after this list).
- KataGo now tolerates simple ko violations in SGFs or GTP play commands or a few other locations, just like it tolerates superko violations.
- KataGo now compiles with C++17 standard rather than C++14.
- Fixed some issues where some board-size-related config arguments didn't accept values up to 50 for the +bs50 executables.
- Updated CMake logic to handle a change to the header define format in newer versions of TensorRT.
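For reference, a hypothetical way to use these benchmark flags; the model and config filenames below are placeholders, and the exact argument form should be verified against the `-help` output:

```bash
# List benchmark options, including -fixed-batch-size and -half-batch-size.
./katago benchmark -help

# Then e.g. (argument form assumed; check against -help):
./katago benchmark -model kata-net.bin.gz -config gtp_example.cfg -fixed-batch-size 16
```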
Other various notable changes (Python/Training):
- Significantly rearranged the python directory - all .py files that aren't intended to be directly run have been moved to a `./python/katago/...` package. The imports in all the scripts and the various self-play training scripts should be updated appropriately.
- Some shuffling/training scripts no longer rely upon symlinking the data location, which doesn't work on Windows.
New Training Data Gen, Metal Support, Many Bugfixes
This is not the latest release - see a more recent release at v1.16.2!
KataGo is continuing to improve at https://katagotraining.org/ and if you'd like to donate your spare GPU cycles and support it, it could use your help there!
As a reminder, for 9x9 boards, see here for a special neural net better than any other net on 9x9, which was used to generate the 9x9 opening books at katagobooks.org.
Precompiled Exe Notes and Versions
For CUDA and TensorRT, the executables attached below are labeled with the versions they are built for.
- E.g. "trt10.2.0" for TensorRT 10.2.0.*
- E.g. "cuda12.5" for CUDA 12.5.*
- And so on
- Cuda 12.8.* and/or TensorRT 10.9.0.* should hopefully be suitable for the most recently released NVIDIA GPUs as of April 2025, the RTX 5000 family.
It's recommended that you install and run these with the matching versions of CUDA and TensorRT rather than trying to run with different versions.
The OpenCL version will more often work as long as you have any semi-modern GPU hardware accelerator and appropriate drivers installed, whether for Nvidia or non-Nvidia GPUs, without needing any specific versions, although it may be a bit less performant.
Available also below are both the standard and +bs50 versions of KataGo. The +bs50 versions are just for fun, and don't support distributed training but DO support board sizes up to 50x50. They may also be slower and will use much more memory, even when only playing on 19x19, so use them only when you really want to try large boards. Also, KataGo's default neural nets will behave extremely poorly and nonsensically for board sizes well above 25x25, but see https://katagotraining.org/extra_networks/ under "Large Board Networks" for a net specifically finetuned to play very strongly on huge board sizes - this is the net recommended to use for large boards with this release!
The Linux executables were compiled on a 22.04 Ubuntu machine. Experimentally, the attached exes for Linux in this release were built using AppImage, in an attempt to mitigate libzip.5.so/libzip.4.so incompatibility issues, hopefully making them a bit more portable than before. You will still need to install e.g. correct versions of Cuda/TensorRT or have drivers for OpenCL, etc. on your own. If you are unable to run due to system library incompatibilities, you should be able to work around it by compiling from source, which is usually not so hard on Linux, see the "TLDR" instructions for Linux here.
Notable Changes this Release
New Training Data for Selfplay/Contribute
This release will record an entirely new set of data for training that prior releases did not record: "action-value" winloss and score targets (i.e. q-values) indicating the predicted winrate and score after each move searched in a position. The new tensor is named qValueTargetsNCMove in the .npz data files.
These targets are not used yet by nets, however they are likely valuable for future experimentation and research. Once enough people switch and https://katagotraining.org/ accumulates enough of the new data, it should become possible to do some new experimentation and research on this as a neural net training target, possibly improving the nets or possibly enabling certain new algorithm improvements in the search in future KataGo versions.
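A small sketch of peeking at the new tensor in a selfplay .npz file with numpy; the file path is a placeholder, and only the tensor name `qValueTargetsNCMove` comes from the notes above:

```python
import numpy as np

# Path to one selfplay training data file (placeholder).
data = np.load("selfplay/tdata/example.npz")

# v1.16+ data should include the new action-value target tensor.
print("qValueTargetsNCMove" in data.files)

q = data["qValueTargetsNCMove"]
print(q.shape, q.dtype)  # predicted winloss/score q-values per searched move
```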
Metal Backend for MacOS
Merged Metal backend implementation from @ChinChangYang into this main KataGo repo, for running KataGo neural nets on MacOS. (yay!). This release does not have prebuilt exes for Metal available, since @lightvector is unable to build and test them easily due to not having a Mac. However, CCY may be able to supply some before long.
Other user-facing feature additions/changes:
- Added search parameter `enableMorePassingHacks`, enabled for GTP/Analysis by default, that forces both passing and non-passing moves to get searched properly if a pass would end the game and enough visits are used (see the config snippet after this list).
- Analysis engine now reports `playSelectionValue` indicating KataGo's propensity to choose a move, see documentation.
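A sketch of overriding the new parameter in a gtp.cfg, in case you want it off; the parameter name is from the note above, and since it defaults to enabled you would normally not need this line at all:

```ini
# Disable the extra passing-related search behavior (enabled by default
# for GTP/Analysis in this release).
enableMorePassingHacks = false
```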
User-facing bugfixes:
- Fixed issue where KataGo GTP extension `kata-get-param` didn't work properly for `numSearchThreads` (illustrated after this list).
- Fixed issue where a move by the same-colored player a second time in a row would be rejected as invalid even when tolerating illegal moves; improved internal checks on move legality.
- Fixed bug where `autoAvoidRepeat`-related parameters could be ignored or fail to parse or take effect properly.
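For illustration, a hypothetical GTP session exercising the fixed extension; the command names come from the notes above, while the responses shown are made up:

```text
kata-get-param numSearchThreads
= 16

kata-set-param numSearchThreads 32
=

kata-get-param numSearchThreads
= 32
```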
C++ tools/utils changes:
- Minor fix to book generation and handling of passes in graph hashing.
- Minor additions and adjustments to book policy metrics.
- Minor updates to various sgf/hint processing tools.
- Minor updates to improve handling of GoGoD and Fox sgfs for sgf training data generation tools.
- Better testing scripts for GPU or other backend numerical error distributions.
- KataGo now supports a new model version 16, which has two more policy head channels corresponding to predictions of the two new recorded data targets. These are not yet used for anything or reported, but they are computed, ready for further research and use in future versions.
- Better GPU numerical error checking.
Python scripts feature additions/changes:
- Updated scripts to read training data with the new `qValueTargetsNCMove` and to train and export version 16 nets.
- The scripts should continue to be able to train older (version 15) nets and read data produced by older versions.
- If `-exclude-qvalues` is provided to the v1.16 shuffle.py script, it should also output shuffled data that omits the new tensor and therefore is readable by v1.15 scripts (example invocation after this list).
- Support rectangular boards and named rules in the python GTP `play.py` script.
- Added logging of the training window (i.e. data range) in training.
- Added experimental support for nested residual bottleneck blocks in training that consist entirely of dilated convolutions via a single pre/post transpose.
- Implemented concurrent loading of npz files to slightly speed up training.
- Added support for Pytorch 2.6+ secure checkpoint serialization.
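A hypothetical invocation of the shuffle script with the compatibility flag; only `-exclude-qvalues` comes from the note above, and the path and any other arguments are placeholders to adapt to your setup:

```bash
# Shuffle selfplay data while omitting qValueTargetsNCMove so that the
# output remains readable by v1.15 training scripts.
python python/shuffle.py path/to/selfplay/data -exclude-qvalues
```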
Python scripts bugfixes:
- Minor fixes to python GTP `play.py` script.
- Fixed error reporting in the script that summarizes old selfplay files to cache their sizes and avoid filesystem queries.
- Fixed bug where training code incorrectly computed the randomization for loading data files, needed to train on the right number of rows in expectation with chunky data files.
- Fixed bug where training code was overaggressive in recording which files were used to avoid double-sampling with chunky data files.
- Fixed bug where shuffle script incorrectly recorded data file ranges for logging.
New Human-like Play and Analysis + more bugfixes
This is not the latest release - see a more recent release at v1.16.0!
This is a bugfix release for v1.15.0 / v1.15.1 that fixes a few oddities and unexpected behaviors with human SL settings. It fixes one further bug on top of v1.15.2. Thanks again to everyone reporting issues!
Compared to v1.15.0 / v1.15.1 it includes a new example config gtp_human9d_search_example.cfg that demonstrates how to get very strong (possibly mildly-superhuman) play while still getting a nontrivial amount of human style bias from the human SL model.
This release also includes executables compiled for CUDA 12.5, CUDNN 8.9.7, and TensorRT 10.2.0, to give an updated and more recent possible alternative for users who may have had issues with the earlier releases compiled for older CUDA versions.
See the v1.15.0 release page for help and documentation and pretty pictures about the latest-released features, notably the new Human SL model released with v1.15.0.
Changes in v1.15.3
- Fixed a bug where GTP genmove might fail to return a legal move when using a high proportion of humanSL "weightless" visits combined with a low maxVisits.
- Slight adjustment to default values in gtp_human9d_search_example.cfg.
Changes in v1.15.2
- Fixed an issue where weightless visits (e.g. from `humanSLRootExploreProbWeightless`) would not be counted towards the visits or playouts limit for a search.
- Fixed an issue where KataGo would ignore human SL exploration parameters on the first visit or two to any node.
- Fixed a compile error when compiling KataGo for testing without any backend. (i.e. with dummy backend)
- Fixed an issue with cuda backend compilation types on some systems.
- Clarified various docs about human SL model usage, including some improved comments in gtp_human5k_example.cfg.
- Added new example config gtp_human9d_search_example.cfg.
New Human-like Play and Analysis + more bugfixes
This is not the latest release - see the release v1.15.3 for various bugfixes and the versions that you should actually use.
Still see the v1.15.0 release page for help and documentation and pretty pictures about the major features released with v1.15.x, although the executables there are outdated and/or more buggy compared to v1.15.3.
This is a bugfix release for v1.15.0 / v1.15.1 that fixes a few oddities and unexpected behaviors with human SL settings.
It also includes a new example config gtp_human9d_search_example.cfg that demonstrates how to get very strong (possibly mildly-superhuman) play while still getting a nontrivial amount of human style bias from the human SL model.
This release also includes executables compiled for CUDA 12.5, CUDNN 8.9.7, and TensorRT 10.2.0, to give an updated and more recent possible alternative for users who may have had issues with the earlier releases compiled for older CUDA versions.
See the v1.15.0 release page for help and documentation and pretty pictures about the latest-released features, notably the new Human SL model released with v1.15.0.
Changes
- Fixed an issue where weightless visits (e.g. from `humanSLRootExploreProbWeightless`) would not be counted towards the visits or playouts limit for a search.
- Fixed an issue where KataGo would ignore human SL exploration parameters on the first visit or two to any node.
- Fixed a compile error when compiling KataGo for testing without any backend. (i.e. with dummy backend)
- Fixed an issue with cuda backend compilation types on some systems.
- Clarified various docs about human SL model usage, including some improved comments in gtp_human5k_example.cfg.
- Added new example config gtp_human9d_search_example.cfg.
New Human-like Play and Analysis + quick bugfix
This is not the latest release - see the release v1.15.3 for various bugfixes and the versions that you should actually use.
Still see the v1.15.0 release page for help and documentation and pretty pictures about the major features released with v1.15.x, although the executables there are outdated and/or more buggy compared to v1.15.3.
This is a quick bugfix for v1.15.0 that fixes a minor issue with the Analysis Engine where it would report an error when querying the version. This release also slightly clarifies the documentation in gtp_human5k_example.cfg on how to use the new Human SL model released with v1.15.0. Please continue to report any issues and we will fix them. :)
New Human-like Play and Analysis
This is not the latest release - see v1.15.3 for various bugfixes and use the code and/or executables there rather than here.
But stay on this page and read on below for info about human-like play and analysis introduced in v1.15.x!
If you're a new user, this section has tips for getting started and basic usage! If you don't know which version to choose (OpenCL, CUDA, TensorRT, Eigen, Eigen AVX2), see here. Also, download the latest neural nets to use with this engine release at https://katagotraining.org/.
KataGo is continuing to improve at https://katagotraining.org/ and if you'd like to donate your spare GPU cycles and support it, it could use your help there!
As a reminder, for 9x9 boards, see here for a special neural net better than any other net on 9x9, which was used to generate the 9x9 opening books at katagobooks.org.
Available below are both the standard and "bs29" versions of KataGo. The "bs29" versions are just for fun, and don't support distributed training but DO support board sizes up to 29x29. They may also be slower and will use much more memory, even when only playing on 19x19, so use them only when you really want to try large boards.
The Linux executables were compiled on a 20.04 Ubuntu machine. Some users have encountered issues with libzip or other library compatibility issues in the past. If you have this issue, you may be able to work around it by compiling from source, which is usually not so hard on Linux, see the "TLDR" instructions for Linux here.
Known issues (fixed in v1.15.1)
- Analysis engine erroneously reports an error when sending a `query_version` action.
New Human-trained Model
This release adds a new human supervised learning ("Human SL") model trained on a large number of human games to predict human moves across players of different ranks and time periods! Not much experimentation with it has been done yet and there is probably low-hanging fruit on ways to use and visualize it, open for interested devs and enthusiasts to try.
Download the model linked here or listed in the downloads below, b18c384nbt-humanv0.bin.gz. Casual users should NOT download b18c384nbt-humanv0.ckpt - this is an alternate format for devs interested in the raw pytorch checkpoint for experimentation or for finetuning using the python scripts.
Basic usage:
```
./katago.exe gtp -model <your favorite usual model for KataGo>.bin.gz -human-model b18c384nbt-humanv0.bin.gz -config gtp_human5k_example.cfg
```
The human model is passed in as an extra model via -human-model. It is NOT a replacement for the default model (actually it can be if you know what you are doing! See the config and Human SL analysis guide for more details.).
Additionally, you need a config specifically designed to use it. The gtp_human5k_example.cfg configures KataGo to imitate 5-kyu-level players. You can change it to imitate other ranks too, as well as to do many more things, including making KataGo play in a human style but still at a strong level or analyze in interesting ways. Read the config file itself for documentation on some of these possibilities!
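As an unofficial sketch of what the rank setting in that config looks like (the parameter name and profile values here are assumptions from memory; treat the comments in your copy of gtp_human5k_example.cfg as authoritative):

```ini
# Imitate roughly 5-kyu human players. Other profiles (e.g. rank_1d,
# rank_9d) imitate other strengths; see the config's own comments.
humanSLProfile = rank_5k
```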
And for advanced users or devs see also this guide to using the human SL model, which is written from the perspective of the JSON-based Analysis Engine, but is also applicable to gtp as well.
Pretty Pictures
Just to show off how the model has learned how differently ranked players might play, here are example screenshots from a less-trained version of the Human SL model from a debug visualization during development. When guessing what 20 kyu players are likely to play, Black's move is to simply follow White, attaching at J17:
At 1 dan, the model guesses that players are likely to play the tiger mouth spoil or wedge at H17/H16, showing an awareness of local good shape, as well as some likelihood of various pokes at white's loose shape:
At 9 dan, the model guesses that the most likely move is to strike the very specific weak point at G14, which analysis confirms is one of the best moves.
As usual, since this is a raw neural net without any search, its predictions are most analogous to a top player's "first instinct with no reading" and at high dan levels won't be as accurate in guessing what such players, with the ability to read sharply, would likely play.
Another user/dev in the Computer Go discord shared this interesting visualization, where the size of the square is based on the total probability mass of the move summed across all player ranks, and the color and label are the average rank of player that the model predicts playing that move:
Hopefully some of these inspire possibilities for game review and analysis in GUIs or tools downstream of the baseline functionality added by KataGo. If you have a cool idea for experimenting with these kinds of predictions and stats, or think of useful ways to visualize them, feel free to try it!
Other Changes This Release
GTP and Analysis Engine changes
(Updated GTP doc, Updated Analysis Engine Doc)
- Various changes to both GTP and Analysis Engine to support the human SL model, see docs.
- GTP `version` command now reports information about the neural net(s) used, not just the KataGo executable version.
- GTP `kata-set-param` now supports changing the large majority of search parameters dynamically instead of only a few.
- GTP `kata-analyze` command now supports a new `rootInfo` property for reporting root node stats.
- GTP added `resignMinMovesPerBoardArea` as a way to prevent early resignation.
- GTP added `delayMoveScale` and `delayMoveMax` as a way to add a randomized delay to moves, so as to prevent the bot from responding instantly to players. Delay will on average be shorter on "obvious" moves, hopefully giving a more natural-feeling pacing.
- Analysis Engine now by default will report a warning in response to queries that contain unused fields, to help alert about typos.
- Analysis Engine now reports various raw neural net outputs in rootInfo.
- GTP and Analysis Engine both have changed "visits" to mean the child node visit count (i.e. the number of playouts that the child node after a move received) instead of the edge visit count (i.e. the number of playouts that the root MCTS formula "wanted" to invest in the move). The child visit count is more indicative of evaluation depth and quality. A new key "edgeVisits" has been added to report the original edge visit count, which is partly indicative of how much the search "likes" the move.
- These two values used to be almost identical in practical cases, although graph search could make them differ sometimes. With some humanSL config settings in this new version, they can now differ greatly.
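For illustration, a hypothetical Analysis Engine exchange showing where these fields appear; the query/response structure follows the Analysis Engine docs, but all values and the exact set of fields here are made up:

```json
{"id": "q1", "moves": [["B", "Q16"]], "rules": "tromp-taylor", "komi": 7.5,
 "boardXSize": 19, "boardYSize": 19, "analyzeTurns": [1]}
```

to which a response might contain, among other fields:

```json
{"id": "q1", "turnNumber": 1,
 "rootInfo": {"visits": 1000, "winrate": 0.46, "scoreLead": -0.3},
 "moveInfos": [{"move": "D4", "visits": 312, "edgeVisits": 320,
                "winrate": 0.47}]}
```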
Misc improvements
- Better error handling in TensorRT, should catch more cases where there are issues querying the GPU hardware and avoid buggy or broken play.
Training Scripts Changes
- Many changes and updates to training scripts to support human SL model training and architecture. Upgrade with caution, if you are actively training things.
- Added experimental sgf->training data command (`./katago writetrainingdata`) to KataGo's C++ side that was used to produce data for human SL net training. There is no particular documentation offered for this; run it with `-help` and/or be prepared to read and understand the source code (see the example after this list).
- Configs for new models now default to model version 15 with a slightly different pass output head architecture.
- Many minor bugfixes and slight tweaks to training scripts.
- Added option to gatekeeper to configure the required winning proportion.
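As the note above says, `-help` is the only real documentation for the new command; a safe first step is simply:

```bash
# Print usage for the experimental sgf -> training data tool.
./katago writetrainingdata -help
```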
Minor fixes, restore support for TensorRT 8.5
This release is outdated, see https://github.com/lightvector/KataGo/releases/tag/v1.15.0 for a newer release!
Summary and Notes
This is primarily a bugfix release. If you're contributing to distributed training for KataGo, this release also includes a minor adjustment to the bonuses that incentivize KataGo to finish the game cleanly, which might slightly improve robustness of training.
Both this and the prior release support an upcoming larger and stronger "b28" neural net that is currently being trained and will likely be ready soon!
As a reminder, for 9x9 boards, see here for a special neural net better than any other net on 9x9, which was used to generate the 9x9 opening books at katagobooks.org.
Available below are both the standard and "bs29" versions of KataGo. The "bs29" versions are just for fun, and don't support distributed training but DO support board sizes up to 29x29. They may also be slower and will use much more memory, even when only playing on 19x19, so use them only when you really want to try large boards.
The Linux executables were compiled on a 20.04 Ubuntu machine. Some users have encountered issues with libzip or other library compatibility issues in the past. If you have this issue, you may be able to work around it by compiling from source, which is usually not so hard on Linux, see the "TLDR" instructions for Linux here.
Changes in v1.14.1
- Restores support for TensorRT 8.5. Although the precompiled executables are still for TensorRT 8.6 and CUDA 12.1, if you are building from source TensorRT 8.5 along with a suitable CUDA version such as 11.8 should work as well. Thanks to @hyln9 - #879
- Changes ending score bonus to not discourage capture moves, encouraging selfplay to more frequently sample mild resistances and refute bad endgame cleanup.
- Python neural net training code now randomizes history masking, instead of using a static mask that is generated at data generation time. This should very slightly improve data diversity when reusing data rows.
- Python neural net training code now will clear out nans from running training statistics, so that the stats can remain useful if a neural net during training experiences an exploded gradient but still manages to recover from it.
- Various minor cleanups to code and documentation, including a new document about graph search.