Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

The intuition for the values of heur_score_table #73

Open
morningM00N opened this issue Jan 19, 2023 · 1 comment
Open

The intuition for the values of heur_score_table #73

morningM00N opened this issue Jan 19, 2023 · 1 comment

Comments

@morningM00N
Copy link

Thank you for sharing a good project.

When I study your code, I cannot understand why heur_score_table[row] is calculated in a such way in init_tables().

heur_score_table[row] = SCORE_LOST_PENALTY +
SCORE_EMPTY_WEIGHT * empty +
SCORE_MERGES_WEIGHT * merges -
SCORE_MONOTONICITY_WEIGHT * std::min(monotonicity_left, monotonicity_right) -
SCORE_SUM_WEIGHT * sum;

Why does every entry have SCORE_LOST_PENALTY value?

It seems to have no effect on determining the priorities between moves.

Moreover, I wonder why you give a penalty for large value of SCORE_SUM_WEIGHT * sum.

I think a larger sum value takes advantage to get the final score highly.

@nneonneo
Copy link
Owner

The heuristic score will be zero for lost positions (positions where no further moves are possible), so the SCORE_LOST_PENALTY is really a bonus for not losing yet. The large value of LOST_PENALTY helps to ensure that the model always prioritizes not losing, and also helps to buffer against large penalties, since we want to ensure that the score for any non-lost position is positive.

Final score (as reported by the game) doesn't really matter; survival does. The longer you survive, the larger your score, because the score increments at essentially a fixed rate. The SCORE_SUM_WEIGHT heuristic was added in commit a64d749. If you look at the calculation of sum, you will see that it is the sum of pow(rank, 3.5) for each rank on the board (where rank ranges from 0 to 15). This is not the same as the sum of the face values of the tiles (which would be pow(2, rank)). It is carefully calculated such that 2 * pow(N, 3.5) > pow(N, 3.5) only for N >= 6; that is, two 32 tiles sum to less than one 64 tile, but two 64 tiles sum to more than one 128 tile. By negating this component, the heuristic will prefer to leave smaller tiles unmerged, but will prefer to merge larger tiles earlier.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants