How it started.
The first two versions of Pwned were multiplayer only. To add single player, the first thing I did was look for a chess engine tutorial in C#. I was fortunate to find a great resource created by Adam Berent that includes easy to read C# examples that explain how to build a strong chess engine. His website is a great way to get started in computer chess for a C# developer. Using examples from his own chess engine called ChessBin, he walks you through all the moving parts. You have to write a bit of code yourself but these parts are explained very clearly. Dave Moore had already written a lot of the code that that I needed when he was helping with the first version. After a short while I had my own engine working inside the Pwned GUI and I thought I was almost done...
The engine was strong enough to beat me, but I don't really play chess that much (never). Also, I couldn't find any chess engines that it could beat. Keep in mind at this point I did not have a console interface, only a GUI. So I was making each move by hand, taking about 10 minutes to play-test 1 game.
The Rybka Controversy.
In the summer of 2011 there were a lot of stories in the news about the Rybka engine created by Vasik Rajlich. The international computer games association disqualified him from the World Computer Chess Championship and striped him off his past achievements in 2006, 2007, 2008, 2009 and 2010. Reading about the story lead me to the TalkChess.Com forum where some of the evidence was being discussed.
The engines he was accused of plagiarising are open source engines. They were created by academics to help advance the study of computer chess, and many people use them to learn how to write their own chess engines. If Rybka played stonger (+200 Elo), should it not be considered different enough that you should call it original? The rules state your work as to be original. There was ample evidence that showed identical logic to the point where many believed parts of other engines were copy/pasted so Vasik was not allowed to compete.
On the plus side, thanks to all the coverage I stumbled upon the best resource imaginable. Posting on this forum are some of the most famous people in computer chess today. And they are helping total beginners like me get started! If was in a band, this would be like stumbling upon a website where B.B. King, Jimmy Page, Neil Young and Slash were personally giving lessons to beginners, while at the same arguing with each other in the General Chat section about how Tony Iommi stole a solo from B.B. King. It was surreal!
How I tested it.
After spending some time on the TalkChess forum it became clear the first thing I had to do was write a console interface that would talk either Xboard or UCI. There are many 3rd party tools you can use for testing once you have a working console app. Writing a basic implementation of these protocols did not take very much times, and the 3rd party tools quickly told me that I had a bug somewhere. I couldn't beat any engines, not even close. If I can't win any games, I can't even get an approximate rating. Time to hunt some bugs.
There are a lot of moving parts in a chess engine, and even if it appears to play a good game of chess, it can still have bugs hiding in any of those moving parts. The first place I was recommended to look was at the move generation code. I had to test if I was generating the right moves or not. The best way to do this is to implement at function called perft. perft will search for all valid moves and count them, up to a given ply. You then load in various board layouts and check the perft count after say 5 moves. This value should match the perft count from other engines. If your perft count for many different layouts matches other trusted engines, you can say this code is bug free and move on.
The other benefit of perft is that it can be used to measure the speed of your move generation code. Pwned was close to 10,000x slower than Gaviota. Gaviota is not written in C# but from what I understood the overhead should not be that bad. The search for faster move generation lead me to a paper by Robert Hyatt on the subject of rotated bit boards. What I was doing with a series of nested for loops, rotated bit boards did with a single bitwise calculation. Using rotated bit board increased my perft speed beyond my wildest expectations. I also thought it was exceptionally cool to get answers to my questions from a legendary computer chess world champion.
The next place I was directed to look was at my end game logic. If you have a bug in your end game logic, your engine could opt for a draw when it could have won, or it might miss obvious chances to win. These blunders can be reduced by implementing a command called rgstat. This command will generate a series of random games and report the aggregate total of how each game ended. You then compare these percentages with another engine that you trust. When you play around a million games, the numbers should match up to a few decimal places, otherwise you have a bug somewhere.
How the search works.
The opening book is consulted in the GUI, before the search is called. If there have been more than 12 moves played, the book will not be consulted at all. If a move is not found in the book, the search is invoked.
End game table bases are consulted in the first ply if there are 5 pieces or less on the board. If a match is found in the EGTB's, the move with the shallowest depth to mate is returned and the search is terminated early. The engine uses iterative deepening to improve move ordering on the following ply. The search also makes use of killer move heuristics and null move to speed things up even more. Other features include quiescent search and a 80Meg Transposition Table on the Xbox version (64 Megs for main search, 16 Megs for QSearch). The transposition table uses 4 buckets with an age replacement scheme. The transposition table is indexed using a bitwise AND operation as opposed to modulus.
There is only one search extension in Pwned. The extension will be triggered on the last ply if the player is in check. An extra ply is added to the search to see what will happen after check because this can often be checkmate. Table bases are consulted in the board eval. If a mate is found this should end the search quickly.
How the board evaluation works.
Even after all the bug fixes and optimizations were done, I still could not win as many games as I wanted, and the more time I spent on my eval function, the worse things seemed to get. I was starting to lose a lot of search depth because the eval function was slowing down. Lucas Braesch claims that a basic bug-free search should be able to get around 2000 Elo with a very small efficient evaluation function.
The eval function he described uses these 3 core components : Piece mobility, material evaluation and piece square tables. After implementing the basic eval that he described, my search depth was better than ever and I was actually winning a respectable amount of games. I added king safety and a penalty for doubled pawns and that is the extent of the eval function in Pwned.
How it was tuned.
To get an accurate rating, the engine needs to play against other engines that are close to the same skill level. I found a site that compiles a list of older engines and engines that are not tracked on the CCRL website. This list is called the Also-Rans Rating List. Also-ran is a term from horse racing for a horse that finishes outside the top three. From this list, I was able to find good opponents for my engine.
I was also able to monitor progress by playing games against old versions of my engine. The neat thing about this was if I played the same version of engine against itself, after about 1000 games the score would be exactly even. So when I played against an older version of my engine, after a few hours I could see subtle changes in strength. The tool I used the most for tournament play was a Windows command line tool called cutechess-cli.
I also had a set of test positions in EPD format that I would feed into Arena GUI from time to time. I was using this set of test positions to help write my original eval function. After I got the simplified eval function working I stopped worrying about individual test cases. The game state space for chess is just so enormous that the only way to find bugs or trigger blunders is to play 1000's of games over and over. A static set of test positions didn't do me many favours when I was tuning.
This was an exciting project for many reasons.
- This was something I always thought would be too difficult because I didn't play a lot of Chess. I have a much better understanding of the rules now.
- Realizing the engine could beat me was very fun!
- Chess has a very large game state space. There are more possible layouts for a chess board than there are atoms in the universe. Trying to search through something that large is a very fun concept.
- You never really know if your engine is bug free, because of the large game state space. So you have to run tests until your you are satisfied with the results. This is a weird feeling since you can't ever have confidence in the code, but it still performs quite well.
- The computer chess community is exceptional. I could never have finished this without their help.