My first attempt at extending the inputs to use primes (Player 2.2) did not work very well - the improvement in performance over Player 2.1 (the same but without the primes input) was quite small. My next attempt - Player 2.3 - extends the prime input.

In particular it adds four new inputs for each player, instead of one. They are:

- n/3 for n<3, or 1 for n>=3, where n = max # of primes on the player's side
- 1 for n>=4, 0 otherwise
- 1 for n>=5, 0 otherwise
- n-5 for n>=6, 0 otherwise

The idea was similar to how checker counts are distributed among several inputs: it lets the network find a more complex nonlinear dependence on count.
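As a minimal sketch, the four inputs listed above could be computed like this in Python. The function name `prime_inputs` and the use of a plain list are assumptions for illustration, not the actual code behind Player 2.3; `n` is the max count for one player, so the function would be called once per side.

```python
def prime_inputs(n: int) -> list[float]:
    """Spread the max prime count n across four network inputs.

    Hypothetical helper: the name and return type are illustrative only.
    """
    if n < 0:
        raise ValueError("n must be non-negative")
    first = min(n, 3) / 3.0                     # n/3 for n<3, 1 for n>=3
    second = 1.0 if n >= 4 else 0.0             # 1 for n>=4, 0 otherwise
    third = 1.0 if n >= 5 else 0.0              # 1 for n>=5, 0 otherwise
    fourth = float(n - 5) if n >= 6 else 0.0    # n-5 for n>=6, 0 otherwise
    return [first, second, third, fourth]


# Example: inputs for one player's side with a max count of 4
inputs_for_side = prime_inputs(4)
```

The first input saturates at 1 once n reaches 3, and the remaining three only switch on for larger counts, so the network can weight each range of n separately.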

I started with Player 2.1 and trained it for 1.5M iterations. Alpha was 0.02 to start, dropped to 0.004 at 200k runs, and dropped to 0.0008 at 1M runs. Learning chart below:

It did not perform particularly well. I took the player at the end of training and played 100k money games against Player 2.1. It scored +0.0107ppg, with a standard error of 0.0045ppg. That is about 2.4 standard errors from zero, so it is significant at better than two standard errors, but just barely. I'm unsure whether the improvement is really due to the new inputs, or just because I effectively trained Player 2.1 for longer and with a smaller alpha.
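The significance check can be reproduced with a few lines of Python. This is only a sketch: it assumes the average score over 100k games is approximately normally distributed, and the function names are made up for illustration.

```python
import math


def z_score(mean_ppg: float, std_err: float) -> float:
    """Number of standard errors the mean score is away from zero."""
    return mean_ppg / std_err


def two_sided_p_value(z: float) -> float:
    """Two-sided p-value under a normal approximation (an assumption)."""
    return math.erfc(abs(z) / math.sqrt(2.0))


# Figures quoted above: +0.0107ppg with a standard error of 0.0045ppg
z = z_score(0.0107, 0.0045)   # roughly 2.4 standard errors
p = two_sided_p_value(z)      # chance of a gap this large if the players were equal

print(f"z = {z:.2f}, two-sided p = {p:.3f}")
```

Note that this only measures how unlikely the result is by chance. It says nothing about whether the gain comes from the new inputs or from the extra training.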

This was using 80 hidden nodes for both Player 2.3 and Player 2.1.

Perhaps the inputs are still not set up optimally. Maybe I need to note the position of the max prime somewhere, rather than just the max count.

Note that Player 2.3q - the one I use for filtering moves in multiple-ply calculations - is an example of Player 2.3, but with only five hidden nodes.
