The final project of my university Machine Learning course was very open-ended, which let me take a different stab at Bejeweled than the one I took with Swapples.
After a semester covering decision trees, neural networks, naive Bayes, and a few other supervised-learning algorithms, I opted to explore a new topic for my final project: reinforcement learning. This approach interested me because it doesn't need labeled data, just an interactive environment for the agent to explore. My implementation used PyBrain, and the agent knew only that it could swap pieces; it didn't know that matching three of the same color would yield a positive reward.
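To give a sense of what "learning from rewards alone" means here, below is a minimal sketch of tabular Q-learning, the kind of value-based method PyBrain's `Q` learner implements. This is not the actual project code; the toy environment (a one-dimensional "board" where the agent must walk right to a goal) and all names are illustrative stand-ins for the real Bejeweled board and swap actions:

```python
import random

def q_update(q, state, action, reward, next_state, actions,
             alpha=0.5, gamma=0.9):
    """One tabular Q-learning update:
    Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    best_next = max(q.get((next_state, a), 0.0) for a in actions)
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (reward + gamma * best_next - old)

# Toy stand-in environment: states 0..4, reward only for reaching state 4.
# The agent is never told this; it only sees (next_state, reward) pairs.
ACTIONS = (-1, +1)

def step(state, action):
    next_state = min(4, max(0, state + action))
    reward = 1.0 if next_state == 4 else 0.0
    return next_state, reward

def train(episodes=200, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    q = {}  # maps (state, action) -> estimated value
    for _ in range(episodes):
        state = 0
        while state != 4:
            # Occasionally act randomly so untried actions get sampled.
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                action = max(ACTIONS, key=lambda a: q.get((state, a), 0.0))
            next_state, reward = step(state, action)
            q_update(q, state, action, reward, next_state, ACTIONS)
            state = next_state
    return q

q = train()
# After training, acting greedily moves toward the goal from every state.
policy = {s: max(ACTIONS, key=lambda a: q.get((s, a), 0.0)) for s in range(4)}
```

The key property, and the reason this appealed to me for Bejeweled, is that nothing in `q_update` encodes the rules of the environment: value flows backward from rewarding outcomes to the state-action pairs that led there.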
The video below shows how the reinforcement learner played Bejeweled when it was just starting out, before any state-action pairs had accumulated value:
Sometimes it makes a match by pure chance; that yields a positive reward and helps the agent learn to value that configuration of pieces in the future. After playing at top speed for a few hours, the bot plays much better, as shown in the video below:
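Those lucky matches aren't just noise; they're what an exploration strategy is designed to produce. A common formalization is epsilon-greedy action selection, sketched below with illustrative names (this isn't the project's PyBrain code):

```python
import random

def epsilon_greedy(q, state, actions, epsilon, rng=random):
    """With probability epsilon try a random swap (exploration);
    otherwise take the highest-valued known swap (exploitation)."""
    if rng.random() < epsilon:
        return rng.choice(actions)
    return max(actions, key=lambda a: q.get((state, a), 0.0))

# Hypothetical value table: one swap is already known to pay off.
rng = random.Random(42)
q = {("board", "swap_a"): 0.0, ("board", "swap_b"): 1.0}
picks = [epsilon_greedy(q, "board", ["swap_a", "swap_b"], 0.1, rng)
         for _ in range(1000)]
# Mostly exploits swap_b, but still occasionally tries swap_a.
```

Early on the value table is empty, so nearly every move is effectively random; as rewarding configurations accumulate value, the greedy branch takes over and play improves.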
It was really cool to watch a piece of code go from oblivious to expert-level at a game without any guidance. For Bejeweled specifically, though, it would have been much easier to just write a program that knew how to play – I'll have to make my next application of reinforcement learning tackle something I don't know how to solve myself. =)