Merge branch 'next'
glinscott committed Mar 29, 2018
2 parents 60f1ead + 0456753 commit 0cb27d5
Showing 23 changed files with 589 additions and 247 deletions.
99 changes: 54 additions & 45 deletions README.md
@@ -9,24 +9,74 @@ The goal is to build a strong UCT chess AI following the same type of techniques

We will need to do this with a distributed project, as it requires a huge amount of compute.

-Please visit the LCZero forum to discuss: https://groups.google.com/forum/#!forum/lczero
+Please visit the LCZero forum to discuss: https://groups.google.com/forum/#!forum/lczero, or the GitHub issues.

# Contributing

-The server is live at http://lczero.org/. Please download the client and give it a try: https://github.com/glinscott/leela-chess/releases. More information on getting started here: https://github.com/glinscott/leela-chess/wiki.
For precompiled binaries, see:
* [wiki](https://github.com/glinscott/leela-chess/wiki)
* [wiki/Getting-Started](https://github.com/glinscott/leela-chess/wiki/Getting-Started)

For live status: http://lczero.org

The rest of this page is for users who want to compile the code themselves.
Of course, we also appreciate code reviews, pull requests and Windows testers!

NOTE: The steps below are not required -- only for those who want to experiment with generating their own data.
# Compiling

## Requirements

* GCC, Clang or MSVC, any C++14 compiler
* boost 1.58.x or later (libboost-all-dev on Debian/Ubuntu)
* BLAS Library: OpenBLAS (libopenblas-dev) or (optionally) Intel MKL
* zlib library (zlib1g & zlib1g-dev on Debian/Ubuntu)
* Standard OpenCL C headers (opencl-headers on Debian/Ubuntu, or at
https://github.com/KhronosGroup/OpenCL-Headers/tree/master/opencl22/)
* OpenCL ICD loader (ocl-icd-libopencl1 on Debian/Ubuntu, or reference implementation at https://github.com/KhronosGroup/OpenCL-ICD-Loader)
* An OpenCL capable device, preferably a very, very fast GPU, with recent
drivers is strongly recommended (OpenCL 1.2 support should be enough, even
OpenCL 1.1 might work). If you do not have a GPU, modify config.h in the
source and remove the line that says `#define USE_OPENCL`.
* Tensorflow 1.4 or higher (for training)
* The program has been tested on Linux.

## Example of compiling - Ubuntu 16.04

```
# Install dependencies
sudo apt install libboost-all-dev libopenblas-dev opencl-headers ocl-icd-libopencl1 ocl-icd-opencl-dev zlib1g-dev

# Test for OpenCL support & compatibility
sudo apt install clinfo && clinfo

# Clone github repo
git clone [email protected]:glinscott/leela-chess.git
cd leela-chess
git submodule update --init --recursive
mkdir build && cd build

# Configure, build and run tests
cmake ..
make
./tests
```

# Compiling Client

See https://github.com/glinscott/leela-chess/tree/master/go/src/client/README.md.
This client will produce self-play games and upload them to http://lczero.org.
A central server uses these self-play games as input to the training process.
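A rough sketch of that loop follows. The `/next_game` endpoint and JSON shape here are assumptions for illustration only; the real client (`go/src/client/main.go`, shown below) talks to the server through its own `client` helper package and a multipart upload.

```go
package main

import (
	"encoding/json"
	"log"
	"net/http"
	"time"
)

// Hypothetical task descriptor; the real response type is
// client.NextGameResponse in the repo's client package.
type nextGame struct {
	Type string `json:"type"` // "train" or "match"
	Sha  string `json:"sha"`  // network weights to fetch
}

func main() {
	httpClient := &http.Client{}
	for {
		resp, err := httpClient.Get("http://lczero.org/next_game")
		if err != nil {
			log.Print(err)
			time.Sleep(30 * time.Second) // main.go pauses like this on failure
			continue
		}
		var ng nextGame
		err = json.NewDecoder(resp.Body).Decode(&ng)
		resp.Body.Close()
		if err != nil {
			log.Print(err)
			continue
		}
		// A real client would now run lczero to produce a self-play
		// game and upload the result before asking for more work.
		log.Printf("next task: %s with network %s", ng.Type, ng.Sha)
		return
	}
}
```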

## Weights

-The weights from the distributed training are downloadable from http://lczero.org/networks, the best one is at the top.
+The weights from the distributed training are downloadable from http://lczero.org/networks; the best one is the top network that has games played on it.

Weights that we trained to prove the engine was solid are available at https://github.com/glinscott/lczero-weights. Currently, the best weights were obtained through supervised learning on a human dataset with Elo ratings > 2000.

# Training a new net using self-play

Running the training is not required to help the project; only the central server needs to do this.
The distributed part is running the client to create self-play games. Those games are uploaded
to http://lczero.org and used as the input to the training process.

After compiling lczero (see above), try the following:
```
cd build
```
@@ -94,47 +144,6 @@ automatically resume using the tensorflow checkpoint.

You can use this to adjust learning rates, etc.

-# Compiling
-
-## Requirements
-
-* GCC, Clang or MSVC, any C++14 compiler
-* boost 1.58.x or later (libboost-all-dev on Debian/Ubuntu)
-* BLAS Library: OpenBLAS (libopenblas-dev) or (optionally) Intel MKL
-* zlib library (zlib1g & zlib1g-dev on Debian/Ubuntu)
-* Standard OpenCL C headers (opencl-headers on Debian/Ubuntu, or at
-https://github.com/KhronosGroup/OpenCL-Headers/tree/master/opencl22/)
-* OpenCL ICD loader (ocl-icd-libopencl1 on Debian/Ubuntu, or reference implementation at https://github.com/KhronosGroup/OpenCL-ICD-Loader)
-* An OpenCL capable device, preferably a very, very fast GPU, with recent
-drivers is strongly recommended (OpenCL 1.2 support should be enough, even
-OpenCL 1.1 might work). If you do not have a GPU, modify config.h in the
-source and remove the line that says `#define USE_OPENCL`.
-* Tensorflow 1.4 or higher (for training only)
-* The program has been tested on Linux.
-
-## Example of compiling - Ubuntu 16.04
-
-# Install dependencies
-sudo apt install libboost-all-dev libopenblas-dev opencl-headers ocl-icd-libopencl1 ocl-icd-opencl-dev zlib1g-dev
-
-# Test for OpenCL support & compatibility
-sudo apt install clinfo && clinfo
-
-# Clone github repo
-git clone [email protected]:glinscott/leela-chess.git
-cd leela-chess
-git submodule update --init --recursive
-mkdir build && cd build
-
-# Configure, build and run tests
-cmake ..
-make
-./tests
-
-# Compiling Client
-
-See https://github.com/glinscott/leela-chess/tree/master/go/src/client/README.md.

# Other projects

* [mokemokechicken/reversi-alpha-zero](https://github.com/mokemokechicken/reversi-alpha-zero)
68 changes: 44 additions & 24 deletions go/src/client/main.go
@@ -28,7 +28,7 @@ var HOSTNAME = flag.String("hostname", "http://162.217.248.187", "Address of the
var USER = flag.String("user", "", "Username")
var PASSWORD = flag.String("password", "", "Password")
var GPU = flag.Int("gpu", -1, "ID of the OpenCL device to use (-1 for default, or no GPU)")
-var DEBUG = flag.Bool("debug", false, "Enable debug mode to see verbose output")
+var DEBUG = flag.Bool("debug", false, "Enable debug mode to see verbose output and save logs")

type Settings struct {
User string
@@ -76,11 +76,11 @@ func getExtraParams() map[string]string {
return map[string]string{
"user": *USER,
"password": *PASSWORD,
"version": "3",
"version": "4",
}
}

-func uploadGame(httpClient *http.Client, path string, pgn string, nextGame client.NextGameResponse) error {
+func uploadGame(httpClient *http.Client, path string, pgn string, nextGame client.NextGameResponse, retryCount uint) error {
extraParams := getExtraParams()
extraParams["training_id"] = strconv.Itoa(int(nextGame.TrainingId))
extraParams["network_id"] = strconv.Itoa(int(nextGame.NetworkId))
@@ -96,6 +96,8 @@ func uploadGame(httpClient *http.Client, path string, pgn string, nextGame clien
body := &bytes.Buffer{}
_, err = body.ReadFrom(resp.Body)
if err != nil {
+time.Sleep(time.Second * (2 << retryCount))
+err = uploadGame(httpClient, path, pgn, nextGame, retryCount+1)
return err
}
resp.Body.Close()
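The retry added here sleeps `2 << retryCount` seconds between attempts, i.e. 2s, 4s, 8s, and so on. A standalone sketch of the same backoff shape — `doWithBackoff` and its retry cap are illustrative additions, not code from the repo (the client's recursive version has no cap):

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// doWithBackoff retries op, sleeping 2, 4, 8, ... seconds between
// attempts -- the same 2<<n schedule uploadGame uses above -- but adds
// a retry cap, which the client's recursive version leaves unbounded.
func doWithBackoff(op func() error, maxRetries uint) error {
	var err error
	for attempt := uint(0); ; attempt++ {
		if err = op(); err == nil {
			return nil
		}
		if attempt >= maxRetries {
			return err
		}
		time.Sleep(time.Second * time.Duration(2<<attempt))
	}
}

func main() {
	calls := 0
	err := doWithBackoff(func() error {
		calls++
		if calls < 3 {
			return errors.New("transient upload failure")
		}
		return nil
	}, 5)
	fmt.Println(calls, err) // 3 <nil>
}
```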
@@ -180,7 +182,7 @@ func (c *CmdWrapper) launch(networkPath string, args []string, input bool) {
}
}

-func playMatch(baselinePath string, candidatePath string, params []string, flip bool) (int, string) {
+func playMatch(baselinePath string, candidatePath string, params []string, flip bool) (int, string, error) {
baseline := CmdWrapper{}
baseline.launch(baselinePath, params, true)
defer baseline.Input.Close()
@@ -230,24 +232,29 @@ func playMatch(baselinePath string, candidatePath string, params []string, flip
io.WriteString(p.Input, "position startpos"+move_history+"\n")
io.WriteString(p.Input, "go\n")

-		best_move := <-p.BestMove
-		err := game.MoveStr(best_move)
-		if err != nil {
-			log.Println("Error decoding: " + best_move + " for game:\n" + game.String())
-			log.Fatal(err)
-		}
-		if len(move_history) == 0 {
-			move_history = " moves"
-		}
-		move_history += " " + best_move
-		turn += 1
+		select {
+		case best_move := <-p.BestMove:
+			err := game.MoveStr(best_move)
+			if err != nil {
+				log.Println("Error decoding: " + best_move + " for game:\n" + game.String())
+				return 0, "", err
+			}
+			if len(move_history) == 0 {
+				move_history = " moves"
+			}
+			move_history += " " + best_move
+			turn += 1
+		case <-time.After(60 * time.Second):
+			log.Println("Bestmove has timed out, aborting match")
+			return 0, "", errors.New("timeout")
+		}
 	}

chess.UseNotation(chess.AlgebraicNotation{})(game)
-	return result, game.String()
+	return result, game.String(), nil
}
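The `select` added to `playMatch` above is the standard Go pattern for bounding a channel read: whichever case is ready first — the engine's move or the timer — wins. A minimal self-contained version of the same race, with illustrative names:

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// waitForMove returns the engine's next move, or an error if none
// arrives within timeout -- the same race playMatch now runs against
// a 60-second timer.
func waitForMove(moves <-chan string, timeout time.Duration) (string, error) {
	select {
	case m := <-moves:
		return m, nil
	case <-time.After(timeout):
		return "", errors.New("timeout")
	}
}

func main() {
	moves := make(chan string, 1)
	moves <- "e2e4"
	m, err := waitForMove(moves, time.Second)
	fmt.Println(m, err) // e2e4 <nil>
}
```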

-func train(networkPath string, params []string) (string, string) {
+func train(networkPath string, count int, params []string) (string, string) {
// pid is intended for use in multi-threaded training
pid := os.Getpid()

@@ -268,6 +275,13 @@ func train(networkPath string, params []string) (string, string) {
}
}

+	if *DEBUG {
+		logs_dir := path.Join(dir, fmt.Sprintf("logs-%v", pid))
+		os.MkdirAll(logs_dir, os.ModePerm)
+		logfile := path.Join(logs_dir, fmt.Sprintf("%s.log", time.Now().Format("20060102150405")))
+		params = append(params, "-l"+logfile)
+	}

num_games := 1
train_cmd := fmt.Sprintf("--start=train %v %v", pid, num_games)
params = append(params, train_cmd)
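A note on the logfile name built in the `*DEBUG` block above: Go layouts are written against the fixed reference time `Mon Jan 2 15:04:05 MST 2006`, so `"20060102150405"` reads as yyyyMMddHHmmss. A small sketch (the directory name is illustrative; main.go uses `logs-<pid>`):

```go
package main

import (
	"fmt"
	"path"
	"time"
)

func main() {
	logsDir := "logs-1234" // main.go uses logs-<pid>
	name := time.Now().Format("20060102150405") + ".log"
	fmt.Println(path.Join(logsDir, name)) // e.g. logs-1234/20180329150405.log
}
```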
@@ -280,7 +294,7 @@ func train(networkPath string, params []string) (string, string) {
log.Fatal(err)
}

-	return path.Join(train_dir, "training.0.gz"), c.Pgn
+	return path.Join(train_dir, "training."+fmt.Sprintf("%d", count)+".gz"), c.Pgn
}

func getNetwork(httpClient *http.Client, sha string, clearOld bool) (string, error) {
Expand All @@ -307,7 +321,7 @@ func getNetwork(httpClient *http.Client, sha string, clearOld bool) (string, err
return path, nil
}

-func nextGame(httpClient *http.Client) error {
+func nextGame(httpClient *http.Client, count int) error {
nextGame, err := client.NextGame(httpClient, *HOSTNAME, getExtraParams())
if err != nil {
return err
@@ -327,16 +341,19 @@ func nextGame(httpClient *http.Client) error {
if err != nil {
return err
}
-	result, pgn := playMatch(networkPath, candidatePath, params, nextGame.Flip)
-	client.UploadMatchResult(httpClient, *HOSTNAME, nextGame.MatchGameId, result, pgn, getExtraParams())
+	result, pgn, err := playMatch(networkPath, candidatePath, params, nextGame.Flip)
+	if err != nil {
+		return err
+	}
+	go client.UploadMatchResult(httpClient, *HOSTNAME, nextGame.MatchGameId, result, pgn, getExtraParams())
return nil
} else if nextGame.Type == "train" {
networkPath, err := getNetwork(httpClient, nextGame.Sha, true)
if err != nil {
return err
}
-	trainFile, pgn := train(networkPath, params)
-	uploadGame(httpClient, trainFile, pgn, nextGame)
+	trainFile, pgn := train(networkPath, count, params)
+	go uploadGame(httpClient, trainFile, pgn, nextGame, 0)
return nil
}
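Both uploads in `nextGame` are now launched with `go`, so the client can start its next game immediately instead of blocking on the server; the trade-off is that the goroutine's error can no longer be returned to the caller. A sketch of that shape — the `WaitGroup` is only so the demo exits cleanly, the client itself fires and forgets:

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

func upload(id int) {
	time.Sleep(100 * time.Millisecond) // stand-in for the HTTP upload
	fmt.Println("uploaded game", id)
}

func main() {
	var wg sync.WaitGroup
	for i := 0; i < 3; i++ {
		wg.Add(1)
		go func(id int) { // fire-and-forget, like `go uploadGame(...)`
			defer wg.Done()
			upload(id)
		}(i)
		fmt.Println("playing game", i+1, "while upload", i, "runs")
	}
	wg.Wait()
}
```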

@@ -358,13 +375,16 @@ func main() {
}

httpClient := &http.Client{}
-	for {
-		err := nextGame(httpClient)
+	start := time.Now()
+	for i := 0; ; i++ {
+		err := nextGame(httpClient, i)
if err != nil {
log.Print(err)
log.Print("Sleeping for 30 seconds...")
time.Sleep(30 * time.Second)
continue
}
+		elapsed := time.Since(start)
+		log.Printf("Completed %d games in %s time", i, elapsed)
}
}
44 changes: 34 additions & 10 deletions src/Network.cpp
@@ -48,7 +48,6 @@
#include "UCTNode.h"
#endif

-#include "Utils.h"
#include "Random.h"
#include "Network.h"
#include "NNCache.h"
@@ -874,8 +873,11 @@ T relative_difference(T a, T b) {
return std::max(fabs((fa - fb) / fa), fabs((fa - fb) / fb));
}

-void compare_net_outputs(std::vector<float>& data,
-                         std::vector<float>& ref) {
+bool compare_net_outputs(std::vector<float>& data,
+                         std::vector<float>& ref,
+                         bool display_only = false,
+                         std::string info = "") {
+    auto almost_equal = true;
// The idea is to allow an OpenCL error > 5% every SELFCHECK_MIN_EXPANSIONS
// correct expansions. As the num_expansions increases between errors > 5%,
// we'll allow more errors to occur (max 3) before crashing. As if it
@@ -885,16 +887,20 @@ void compare_net_outputs(std::vector<float>& data,
static std::atomic<int64> num_expansions{min_correct_expansions};
num_expansions = std::min(num_expansions + 1, 3 * min_correct_expansions);

-    // We accept an error up to 5%, but output values
+    // We accept an error up to 10%, but output values
     // smaller than 1/1000th are "rounded up" for the comparison.
-    constexpr float relative_error = 5e-2f;
+    constexpr float relative_error = 10e-2f;
for (auto idx = size_t{0}; idx < data.size(); ++idx) {
auto err = relative_difference(data[idx], ref[idx]);
-        if (err > relative_error) {
-            printf("Error in OpenCL calculation: expected %f got %f (%lli"
+        if (display_only) {
+            myprintf("compare_net_outputs %s idx %d data %f ref %f err=%f\n",
+                info.c_str(), idx, data[idx], ref[idx], err);
+        } else if (err > relative_error) {
+            almost_equal = false;
+            myprintf("Error in OpenCL calculation: expected %f got %f (%lli"
                 "(error=%f%%)\n", ref[idx], data[idx], num_expansions.load(), err * 100.0);
             if (num_expansions < min_correct_expansions) {
-                printf("Update your GPU drivers or reduce the amount of games "
+                myprintf_so("Update your GPU drivers or reduce the amount of games "
"played simultaneously.\n");
throw std::runtime_error("OpenCL self-check mismatch.");
}
@@ -903,6 +909,7 @@ void compare_net_outputs(std::vector<float>& data,
}
}
}
+    return almost_equal;
}
#endif
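The comment in `compare_net_outputs` describes an error budget: each correct expansion earns a credit (capped at three windows' worth), a mismatch spends one window's worth, and the engine aborts only when a mismatch finds less than a window banked. A toy model of that policy, written in Go for consistency with the client sketches above and inferred from the comment rather than copied from the C++ (the elided bookkeeping may differ in detail):

```go
package main

import "fmt"

// selfCheck models the budget: credits accrue one per correct check
// (capped at 3x the window), a mismatch costs one full window, and a
// mismatch with less than one window banked is fatal. Names are
// illustrative; the C++ uses a single atomic counter.
type selfCheck struct {
	credits int
	window  int
}

// record returns false when a mismatch exhausts the budget.
func (s *selfCheck) record(mismatch bool) bool {
	if s.credits < 3*s.window {
		s.credits++
	}
	if mismatch {
		if s.credits < s.window {
			return false // errors arriving too often: abort
		}
		s.credits -= s.window
	}
	return true
}

func main() {
	sc := &selfCheck{credits: 1000, window: 1000}
	fmt.Println(sc.record(true)) // true: one error is tolerated
	fmt.Println(sc.record(true)) // false: second error too soon
}
```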

@@ -989,8 +996,25 @@ Network::Netresult Network::get_scored_moves_internal(const BoardHistory& pos, N
auto cpu_policy_data = std::vector<float>(policy_data.size());
auto cpu_value_data = std::vector<float>(value_data.size());
forward_cpu(input_data, cpu_policy_data, cpu_value_data);
-        compare_net_outputs(policy_data, cpu_policy_data);
-        compare_net_outputs(value_data, cpu_value_data);
+        auto almost_equal = compare_net_outputs(policy_data, cpu_policy_data);
+        almost_equal &= compare_net_outputs(value_data, cpu_value_data);
+        if (!almost_equal) {
+            myprintf("PGN\n%s\nEND\n", pos.pgn().c_str());
+            // Compare again but with debug info
+            compare_net_outputs(policy_data, cpu_policy_data, true, "orig policy");
+            compare_net_outputs(value_data, cpu_value_data, true, "orig value");
+            // Call opencl.forward again to see if the error is reproducible.
+            std::vector<float> value_data_retry(Network::NUM_VALUE_INPUT_PLANES * width * height);
+            std::vector<float> policy_data_retry(Network::NUM_OUTPUT_POLICY);
+            opencl.forward(input_data, policy_data_retry, value_data_retry);
+            auto almost_equal_retry = compare_net_outputs(policy_data_retry, policy_data, true, "retry policy");
+            almost_equal_retry &= compare_net_outputs(value_data_retry, value_data, true, "retry value");
+            if (!almost_equal_retry) {
+                throw std::runtime_error("OpenCL retry self-check mismatch.");
+            } else {
+                myprintf("compare_net_outputs retry was ok\n");
+            }
+        }
}
#endif

7 changes: 5 additions & 2 deletions src/Parameters.cpp
@@ -41,8 +41,10 @@ using namespace Utils;

// Configuration flags
bool cfg_allow_pondering;
+int cfg_max_threads;
int cfg_num_threads;
int cfg_max_playouts;
+int cfg_max_visits;
int cfg_lagbuffer_cs;
int cfg_resignpct;
int cfg_noise;
@@ -65,10 +67,11 @@ bool cfg_quiet;
void Parameters::setup_default_parameters() {
cfg_allow_pondering = true;
int num_cpus = std::thread::hardware_concurrency();
//cfg_num_threads = std::max(1, std::min(num_cpus, MAX_CPUS));
+	cfg_max_threads = std::max(1, std::min(num_cpus, MAX_CPUS));
+	cfg_num_threads = 2;

-	cfg_max_playouts = 800;
+	cfg_max_playouts = MAXINT_DIV2;
+	cfg_max_visits = 800;
cfg_lagbuffer_cs = 100;
#ifdef USE_OPENCL
cfg_gpus = { };
[Diffs for the remaining 19 changed files are not shown.]
