How to Predict a Good Board Game?
Updated: May 28
Board gaming can be an expensive hobby. I own over 130 games, and when you consider that each cost somewhere between $30 and $60, that's a lot of money spent! Therefore I try to put a lot of time and effort into researching specific games before I buy them. I read reviews and watch YouTube videos about a game, and I ask friends whether they have heard of or played it. So far this strategy has reduced the number of forgotten board games in my collection: ones which I excitedly bought, unwrapped, and read the rule book for, then set on my bookshelf to languish, unplayed, forever.
But for one category of board games it is impossible to collect this kind of information before making a purchase. These are the games funded through the website Kickstarter. With Kickstarter the idea is that you're pre-ordering a game before it has even been made. So you have to put your money down before there are any reviews or word-of-mouth to go on! I've only backed a couple of games on Kickstarter—expansions to games I already own—but I wanted to see if I could use machine learning to predict how good a new Kickstarter game would likely be.
To start with, I would need lots of data on existing board games. I decided to collect this by web-scraping the website BoardGameGeek using Python. BoardGameGeek contains information for pretty much every game ever made and allows users to rate each game on a scale from 1 to 10. This rating would be my target attribute (or dependent variable) on which I would test the predictive ability of various other features of a given game. Below are some screenshots showing the data I collected from BoardGameGeek's database of games (the extracted information is highlighted in red).
BoardGameGeek has close to 105,000 games in its database, but I decided to stop after collecting data on only 2,500, for a few reasons. First, a few thousand games deep (starting from the most highly rated), the entries began to lack key pieces of information. These are games very few people have ever played or even heard of, so I wanted my sample to contain only relatively popular games. Second, web-scraping takes a ton of time. In order not to overload BoardGameGeek's servers, I set a timer to wait five seconds between each page request while scraping. This meant it took over seven hours to gather data on just 2,500 games!
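The throttled loop described above doesn't need to be fancy. Here's a minimal sketch using only the standard library; the function names and the exact page-URL pattern are my own assumptions for illustration, not code from the original analysis.

```python
import time
import urllib.request

BASE_URL = "https://boardgamegeek.com/boardgame/{game_id}"
DELAY_SECONDS = 5  # wait between requests so we don't overload the server


def game_page_url(game_id: int) -> str:
    """Build the URL for a single game's page (URL pattern is an assumption)."""
    return BASE_URL.format(game_id=game_id)


def fetch_game_pages(game_ids):
    """Download each game's page HTML, pausing five seconds between requests."""
    pages = {}
    for game_id in game_ids:
        with urllib.request.urlopen(game_page_url(game_id)) as response:
            pages[game_id] = response.read()
        time.sleep(DELAY_SECONDS)  # the polite pause that made the scrape slow
    return pages
```

At five seconds per request, even a modest scrape adds up fast, which is why 2,500 games took hours.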
It was now time to start analyzing. After converting all my categorical features into dummies (e.g. creating a new feature called "Economic" for which each game had a value of 0 or 1 depending on whether it fit into the Economic category of games), I was left with a data set that had 1,261 columns. Evidently there are lots and lots of ways to categorize games! The statistics gods will smite you if you try to plug 1,260 features into a linear regression equation, so I needed a way to select only those which were most important for predicting the average rating of a particular game.
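In pandas, that dummy conversion is essentially a one-liner. A tiny sketch of the idea (the games, ratings, and the pipe-separated category column are all made up for illustration; a game belonging to several categories gets a 1 in each dummy column):

```python
import pandas as pd

# Toy stand-in for the scraped data set; the real one has many more columns.
games = pd.DataFrame({
    "name": ["Brass: Birmingham", "Codenames", "Scythe"],
    "categories": ["Economic", "Party|Word Game", "Economic|Fighting"],
    "avg_rating": [8.6, 7.6, 8.2],  # illustrative ratings, not scraped values
})

# One 0/1 column per category; this is the step that blows the data set up
# to over a thousand columns on the real data.
dummies = games["categories"].str.get_dummies(sep="|")
games_wide = pd.concat([games.drop(columns="categories"), dummies], axis=1)
```

Every distinct category becomes its own column, which is how a modest set of games ends up with over a thousand features.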
The solution was to use a random forest regression to rank each of my dummies by importance (for predicting a game's average rating). The graph below shows the top ten features my random forest model found:
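The ranking step looks roughly like this in scikit-learn. The data here is synthetic (one deliberately predictive column plus noise), so it only illustrates the mechanism, not my actual features or results:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
X = rng.random((500, 5))                 # five stand-in feature columns
y = 3 * X[:, 0] + 0.1 * rng.random(500)  # feature 0 drives the "rating"

forest = RandomForestRegressor(n_estimators=100, random_state=0)
forest.fit(X, y)

# Importances sum to 1; sorting descending gives the feature ranking.
ranking = np.argsort(forest.feature_importances_)[::-1]
```

On the real data, the top of `ranking` is what produced the graph of the ten most important features.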
I then took the top seven features from the list above and plugged them into a linear regression model to see how they performed. The result was a paltry R-squared of 0.215 (a medium or strong relationship would be upwards of 0.6 or 0.7). So although I found the features which "best" predicted whether a game would be highly rated, there remains a lot of unexplained "noise" in the model.
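Scoring the selected features follows the same pattern: fit a linear regression on just those columns and read off R-squared. Again with synthetic data (a weak signal buried in noise, to mimic the kind of low R-squared described above):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(1)
X = rng.random((300, 7))                     # stand-ins for the top seven features
y = 0.5 * X[:, 0] + rng.normal(0, 0.5, 300)  # mostly noise, like the real data

linreg = LinearRegression().fit(X, y)
r_squared = linreg.score(X, y)  # in-sample R-squared
```

With so much noise relative to signal, the score stays low no matter how carefully the features were chosen.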
Sadly I was left with a model that produced no useful predictive power for deciding whether to back a Kickstarter campaign. You may already be thinking of a few things that were wrong in my assumptions. First, does the average rating, as defined by BoardGameGeek users, really tell us anything about whether any one individual will enjoy a game? Certainly not for me! Sure, I enjoy many of the games in the top 100, but there are some I can't stand! Pandemic Legacy: Season 2, perhaps the worst board game experience I have ever had to endure, is rated #33. And my beloved 878 Vikings (from my top 10 list) is way down at #741!
This reveals the real problem with my attempted analysis: taste in board games is far too variable to be predicted by any collection of features a game might have. For example, I have a friend whose name starts with a "C" and rhymes with "Blonnor" who hates word games, yet to me games like Codenames, Decrypto, or Crosstalk (rated #2274?!) are some of my all-time favorites! There are so many different board games and so many different types of people who enjoy them.
Just for fun I also decided to see whether a game's weight (i.e. complexity) on a scale of 1 to 5 could predict its average rating. The results show just how much variance there is in people's preferences. Sad to say that, although still inadequate, the weight of a game does a better job of predicting its rating than the features my fancy random forest picked out earlier.