

On March 9th, 2014, developer Gabriele Cirulli published the game 2048 in his GitHub repository. The game has been compared to the mobile game Flappy Bird due to its addictive nature and difficult gameplay.
#2048 fibonacci how to
2048 is a single-player puzzle game in which the player combines tiles with matching numbers on a grid in order to create a single tile with the value 2048. The original 2048 was created by Gabriele Cirulli; it is based on 1024 by Veewo Studio and is conceptually similar to Threes by Asher Vollmer.

Fib2584 is forked from the well-known 2048, with Fibonacci tiles instead of powers of two. In Fib2584, I implemented an AI that was trained on about 4.3 million games with TD Learning (Temporal Difference Learning). The main concept is to train the AI to evaluate and predict the value of any board so that it can make a good, though not necessarily optimal, move.

According to a screenshot of the C++ version, which was designed to train this AI, the training result (out of 10000 games) is:

Win Rate: 85.55% (a win means reaching the 610 tile, which is the 14th Fibonacci tile).

Evaluation

Evaluation is the process of assigning a value to a given board. To do this, I defined some features for the AI to estimate.

In the following pictures, the left one marks the outside 4-tuples and the right one marks the inside 4-tuples. There is a vast difference between outside and inside: for example, a 610 tile at a corner is worth something very different from a 610 tile in the center. That's why they can't be treated the same.

Take the next picture as an example, and assume the AI is evaluating this board. First, we evaluate the outside 4-tuples (red). There are 2 non-zero 4-tuples outside (0 stands for an empty tile). Then there are 2 non-zero 4-tuples inside (blue). The rest of the 4-tuples, both outside and inside, are all-zero. Now we can look up the values of these 4-tuples, including the all-zero ones, in the trained database (explained later), separately for outside and inside. Finally, the evaluation is the sum of these values.

To pick the best move, the AI generates the next board for each possible action, before any new random tile appears, and evaluates these boards (4 or fewer) to compare their values.

Please note: for convenience of explanation, I have ignored the equivalence from symmetry for now. Two 4-tuples that are equivalent by symmetry should both be evaluated, and in theory their values would be the same.

#2048 fibonacci update

To evaluate, we first have to train the database with Temporal Difference Learning. At the beginning, all 4-tuple values are initialized to zero. Assuming that there is an optimal value for each 4-tuple, the goal is to let the AI adjust itself toward these optimal values in the end.

Here we use the TD-Afterstate update function. In it, Value() indicates the value of a certain state (that is, a board or position). state is the current board, state' is the board after performing the best move, and state' next is the board after performing the best move on the next board. Reward is the score earned by a move that combines some tiles, while Reward next is the score earned by the move on the next board. α is the learning rate, less than 1, which sets how fast the values are adjusted toward the optimal values. Only the 4-tuples used are updated with this function.

The Reward acts like a critic in this machine learning system: it provides the scoring standard of Fib2584, so the AI can adjust the values according to it. In addition, Value(state' next) can be considered a result of the current move; therefore, Reward next + Value(state' next) - Value(state') is the error relative to the optimal value.

By playing many games, the AI eventually learns to play this game well by itself.
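
The TD-Afterstate update function is named in this text but its formula is not shown. As a reference point, the standard form of that update, written with the symbols defined in this document (state', state' next, Reward next, α), would be the following. That the author's version matches it term for term is an assumption.

$$
\mathrm{Value}(state') \leftarrow \mathrm{Value}(state') + \alpha \left( \mathrm{Reward}_{next} + \mathrm{Value}(state'_{next}) - \mathrm{Value}(state') \right)
$$

The term in parentheses is the error value: it is zero when Value(state') already equals the reward of the next move plus the value of the board that move leads to.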
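
The 4-tuple evaluation, the move choice, and the TD-Afterstate update described in this document can be sketched in a few lines of Python. This is a minimal illustration, not the author's C++ trainer. The names `four_tuples`, `TupleValues`, and `best_action` are hypothetical. The sketch assumes that the outside 4-tuples are the two edge rows and two edge columns and that the inside 4-tuples are the two middle rows and columns. It also assumes that every used 4-tuple receives the full α × error step, and it ignores symmetry.

```python
from collections import defaultdict


def four_tuples(board):
    """Split a 4x4 board (0 = empty tile) into outside and inside 4-tuples.

    Assumption: outside = edge rows/columns, inside = middle rows/columns.
    """
    rows = [tuple(r) for r in board]
    cols = [tuple(c) for c in zip(*board)]
    outside = [rows[0], rows[3], cols[0], cols[3]]
    inside = [rows[1], rows[2], cols[1], cols[2]]
    return outside, inside


class TupleValues:
    """Hypothetical stand-in for the trained database of 4-tuple values."""

    def __init__(self, alpha=0.01):
        self.alpha = alpha                  # learning rate, less than 1
        self.outside = defaultdict(float)   # every value starts at zero
        self.inside = defaultdict(float)

    def evaluate(self, board):
        """Board value = sum of the looked-up values of its eight 4-tuples."""
        outside, inside = four_tuples(board)
        return (sum(self.outside[t] for t in outside)
                + sum(self.inside[t] for t in inside))

    def update(self, afterstate, reward_next, afterstate_next):
        """TD-Afterstate step; only the 4-tuples of `afterstate` change."""
        error = (reward_next + self.evaluate(afterstate_next)
                 - self.evaluate(afterstate))
        outside, inside = four_tuples(afterstate)
        for t in outside:
            self.outside[t] += self.alpha * error
        for t in inside:
            self.inside[t] += self.alpha * error
        return error


def best_action(values, afterstates):
    """Pick the action whose resulting board (before the random tile) scores highest.

    `afterstates` maps an action name to the board produced by that move.
    """
    return max(afterstates, key=lambda action: values.evaluate(afterstates[action]))
```

Generating the afterstates themselves (sliding and merging Fibonacci tiles) is left out on purpose, since the merge rules are not spelled out in this text. Keeping separate tables for outside and inside reflects the point that the same tile pattern is worth different amounts at the edge and in the middle.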
