Description of the rating system
Why this system?
Ratings are usually based on Elo-like systems, in which every win or loss adds or subtracts some rating points from a player's current rating. The rules by which points are added after each game or each tournament may differ, but the common feature is the existence of an initial rating. I wanted to avoid assigning initial rating values to the players.
Players are also often rated by the number of wins or by the percentage of wins. This did not seem correct, as it matters against whom the wins or losses were scored.
The system I use does not have an entry point. As long as a player has enough games in the database, his rating becomes reliable anyway.
The purpose of the rating was to see how professionals from different countries compare. In other words, to make a 1D array of the real world, which is often what people do. Is not the game of go a 1D array of moves on a 2D surface?
Step 1. Minimization.
The rating table is built by the minimization of the function
$$F = -\sum_k a_k \left( \exp\left(-p\,(x_i - x_j - r_k)^2\right) - 1 \right)$$
where the summation is performed over all games $k$ in the database, $x_i$ and $x_j$ are the unknown ratings of the two players, $r_k$ is the result of the game (1 or -1 depending on whether the first player won or lost), and $a_k$ is a coefficient that describes the importance of the game. Each term is zero when $x_i - x_j = r_k$ and approaches $a_k$ as the rating difference moves away from the result. In general one can assign different values of $a_k$ to different tournaments; in this attempt the only contribution comes from the aging of the results. The coefficient $p$ is chosen so as to ensure that the function has a single global minimum; $p = 0.25$ in these calculations.
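To make the roles of the symbols concrete, here is a minimal Python sketch that evaluates the function for a given set of ratings. The tuple layout `(first, second, r, a)`, the function name `evaluation` and the sample game are illustrative assumptions, not part of the rating program. The sign is taken so that a smaller value means a better fit, i.e. each game contributes a(1 - exp(-p d^2)) with d = x_i - x_j - r.

```python
import math

def evaluation(ratings, games, p=0.25):
    """Sum a * (1 - exp(-p * (x_i - x_j - r)**2)) over all games.

    ratings: dict mapping player name -> rating (in raw units)
    games:   list of (first, second, r, a), r = +1 if `first` won, -1 if lost
    """
    total = 0.0
    for first, second, r, a in games:
        d = ratings[first] - ratings[second] - r
        total += a * (1.0 - math.exp(-p * d * d))
    return total

# Hypothetical single game: A beat B, weight 1.
games = [("A", "B", 1, 1.0)]
good = evaluation({"A": 1.0, "B": 0.0}, games)  # difference matches the result
bad = evaluation({"A": 0.0, "B": 1.0}, games)   # difference contradicts it
```

Ratings that agree with the result give the smaller value, which is what the minimization exploits.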
The role played by the coefficient $p$ can be seen from an example with 2 players. Let the first player have $m$ wins and $n$ losses, all games having the same coefficient $a_k$, and let $x$ be the difference of their ratings. Taking the derivative of the evaluation function with respect to $x$ and making some straightforward transformations gives the following equation for $x$:
$$\frac{1+x}{1-x} = \frac{m}{n}\,e^{4px}$$
It is easy to see that $p$ should be less than 0.5 for the solution to be unique. For $p > 0.5$ the equation can have 3 roots, and the middle one then corresponds to a maximum.
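The effect of $p$ on the number of roots can be checked numerically. The sketch below assumes the two-player stationarity condition in logarithmic form, ln((1+x)/(1-x)) = ln(m/n) + 4px, which follows from differentiating the function for two players with equal game weights. The function name `count_roots`, the grid size and the sample record of 5 wins and 4 losses are illustrative choices.

```python
import math

def count_roots(m, n, p, steps=20000):
    """Count sign changes of g(x) = ln((1+x)/(1-x)) - ln(m/n) - 4*p*x on (-1, 1).

    m, n: wins and losses of the first player (both assumed > 0)
    p:    the coefficient from the evaluation function
    """
    def g(x):
        return math.log((1 + x) / (1 - x)) - math.log(m / n) - 4 * p * x

    # Uniform grid strictly inside (-1, 1); g diverges at the end points.
    xs = [-0.9999 + 1.9998 * k / steps for k in range(steps + 1)]
    values = [g(x) for x in xs]
    return sum(1 for u, v in zip(values, values[1:]) if u * v < 0)

roots_small_p = count_roots(5, 4, 0.25)
roots_large_p = count_roots(5, 4, 0.75)
```

For this record the sketch finds one root at p = 0.25 and three at p = 0.75. For a very lopsided record the equation has a single root even when p > 0.5, so the three-root case concerns players with comparable numbers of wins and losses.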
Since the function depends only on the differences of the ratings, the rating of one of the players (it does not matter which) is anchored to zero (or any other number). To find the minimum one has to solve a system of $N$ nonlinear equations, where $N$ is the number of players. These equations are obtained naturally by taking derivatives with respect to the ratings of the players. One of the equations is replaced with $x_1 = 0$.
Minimization itself is performed by iterations starting from all $x_i = 0$. Note that the first iteration gives (if all games have the same value)
The following iterations account for the ratings of the opponents.
After the minimization the lowest rating is found and the whole set is shifted to make it zero, again using the fact that the function depends only on differences. Thus all ratings become non-negative. The result is called the raw rating.
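Step 1 as a whole can be sketched in Python as follows. The exact iteration used for the rating table is not given here, so this sketch assumes one plausible scheme: each rating in turn is replaced by a weighted average of (opponent's rating + result), with weights a exp(-p d^2) taken from the current ratings. This is the stationarity condition of the function solved for one rating at a time. The name `raw_ratings`, the tuple layout and the number of sweeps are illustrative assumptions, and all weights are assumed to be positive.

```python
import math

def raw_ratings(games, p=0.25, sweeps=200):
    """Iteratively minimize the evaluation function and shift the minimum to zero.

    games: list of (first, second, r, a), r = +1 if `first` won, -1 if lost,
           a > 0 the weight of the game.
    """
    # Index games by player; from the opponent's side the result flips sign.
    per_player = {}
    for first, second, r, a in games:
        per_player.setdefault(first, []).append((second, r, a))
        per_player.setdefault(second, []).append((first, -r, a))

    players = sorted(per_player)
    x = dict.fromkeys(players, 0.0)   # start from all ratings equal to zero
    anchor = players[0]               # one rating stays fixed at zero

    for _ in range(sweeps):
        for name in players:
            if name == anchor:
                continue
            num = den = 0.0
            for opp, r, a in per_player[name]:
                d = x[name] - x[opp] - r
                w = a * math.exp(-p * d * d)
                num += w * (x[opp] + r)
                den += w
            x[name] = num / den       # weighted average of (opponent + result)

    low = min(x.values())
    return {name: value - low for name, value in x.items()}

# Hypothetical data: A beat B three times and lost twice, all weights 1.
games = [("A", "B", 1, 1.0)] * 3 + [("A", "B", -1, 1.0)] * 2
ratings = raw_ratings(games)
```

For two players the result can be checked against the two-player condition (1+x)/(1-x) = (m/n) exp(4px), with x the difference of the two raw ratings.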
Step 2. Dealing with unreliable results.
When a player has only a few games in the database, the raw rating depends strongly on who that player's opponents were. For example, a player with only one game, a win against the player rated 10th, is almost certain to have the highest rating. To avoid this unreliable situation, the ratings of players with few games are devalued. The devaluation consists in multiplying the raw rating by a factor
where $m_i$ is the number of games of the $i$-th player in the database. This factor matters only for players with few games and has almost no effect on the ratings of players with more than 20 games.
Aging of the results consists in decreasing the coefficients $a_k$ in the function and is performed on a tournament basis. The formula is
$$a_k = 1 - \exp(-y)$$
where $y$ is the number of years since the tournament.
In practice, tournament-based means that if the quarterfinals of the current year's LG Cup have been played, then the results of the quarterfinals of last year's LG Cup are considered a year old, and so on, while last year's semifinals are still considered recent.
The database includes all results found at
These are international tournaments and the major domestic tournaments in Japan, Korea and China.
Before the final presentation the ratings are multiplied by 1000.
The most serious danger is the incompleteness of the data. The data should at least be representative and should not include biased results for any particular player. The author and the compiler of the database had no preferences for any of the players, but fluctuations may occur, and thus certain players can be under- or overrated.
Note also that the results of step 2 of the rating depend on the depth of the distribution. This is important for the lower part of the table. Luckily, that part of the table is not important.
Unfortunately, errors are unavoidable. They could occur when recording the results in the database or even in the source. If you see any, please inform the author. The e-mail address is email@example.com.
One constant source of errors is the varying spelling of Korean names. With more than 400 players in the database, the task of correcting the names is difficult.
While a rating is always an ambiguous thing, the ratings of players with a small number of games should not be taken too seriously. They depend too much on who the random opponent was.
The absolute value of the rating reflects the depth of the representation. One can argue that 1000 roughly corresponds to one stone of handicap, with the usual allowance for the fact that White starts a handicap game. One may also argue that if the rating of the leader is about 4000 points then he is supposed to be 4 stones stronger than the weakest player in the rating table. But that is not true, because the player with the lowest rating has too few games in the database to judge. So the absolute values of the numbers are not important. Only the order is of interest, and only in the upper part of the table.
The difference between ratings in the upper part of the table is reliable. To interpret this value, one should imagine that the difference is produced by games between those two players only. Thus a difference of 200 points would roughly correspond to a 3-2 result in 5 games, i.e. 60 percent wins by the stronger player.
It is useful and interesting to look at the raw rating of a player to see the potential for growth if the number of games increases while the results remain the same.
The table may be found at baduk.htm.
By clicking on the name of any player you can see which results contributed to the rating. Warning: the database of results in HTML format is very large (more than 0.6 MB), so it may take time to load.
Discussions and the report on the changes can be found at discus.htm