Monday, August 03, 2009
Three eminent physicists from the 18th and 19th centuries meet for the first time in heaven, sitting on a cloud and chatting about their work. The first says: "I found out in 1724 that water freezes at 32 degrees". The second disagrees, and says: "I found out in 1742 that water freezes at 0 degrees". The third says: "No way, I found out in 1848 that water freezes at 273 degrees". A loud argument ensues, until St. Peter arrives and says: "Mr. Fahrenheit, Mr. Celsius, Baron Kelvin, you're never going to agree on the freezing point of water if you're all using a different scale to measure it!"
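Of course all three physicists are describing the exact same physical point, just on different scales. As a minimal illustration (my own sketch, not part of the joke), the standard conversion formulas show the three numbers collapsing into one temperature:

```python
# Standard temperature conversions:
#   Celsius = (Fahrenheit - 32) * 5/9
#   Kelvin  = Celsius + 273.15

def fahrenheit_to_celsius(f):
    """Convert degrees Fahrenheit to degrees Celsius."""
    return (f - 32) * 5 / 9

def celsius_to_kelvin(c):
    """Convert degrees Celsius to kelvins."""
    return c + 273.15

# The "32 degrees" of Fahrenheit, the "0 degrees" of Celsius, and the
# (roughly) "273 degrees" of Kelvin are all the freezing point of water.
freezing_c = fahrenheit_to_celsius(32.0)   # 0.0
freezing_k = celsius_to_kelvin(freezing_c) # 273.15

print(freezing_c, freezing_k)  # prints: 0.0 273.15
```

Review scores, unfortunately, come with no such conversion formula.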
Silly? But the equivalent is happening every day in the gaming blogosphere, when people discuss review scores of games. While I was on holiday, Eurogamer published a very readable second review of Darkfall, now giving the game a score of 4/10. And of course the Darkfall community still thinks that score is unfair, and too low. Since the original 2/10 review, lots of bloggers have chimed in and given the game various scores, and none of them ever bothered to say what scale they were using. How can you rate a game X out of 10 if you can't agree what kind of a game 10 out of 10 is?
For example the original Eurogamer review of World of Warcraft in 2005, from the same guy who did the 4/10 Darkfall re-review, gave the game an 8/10 score, less than the 9/10 it had given City of Heroes. A later re-re-re-review of World of Warcraft with Wrath of the Lich King got a 10/10 score. On Metacritic there are scores from 60/100 to 100/100 for World of Warcraft, with an average of 93/100. Warhammer Online has a slightly lower average, 86/100, with a range from 70/100 to 100/100. Metacritic also has a "user score" in parallel, and in that one WAR beats WoW by 8.1 to 7.1 out of 10. So from the point of view of the reviewers WAR is nearly as good as WoW, from the point of view of the users giving scores on Metacritic it is actually better, and from the point of view of subscribers WoW beats WAR in the US and Europe by about 5 million to 0.3 million. If you consider subscription numbers of games with similar cost to be some kind of a vote-with-your-feet general user review score, we end up with three extremely different scores, from three extremely different scales.
So why is there this myth that review scores make any sense at all, that there is some sort of universal scale on which all games can be ranked? Even if you don't consider the possibility of some commercial publication giving a game a higher score because of the advertising revenue from that game company, it should be obvious that for example an average gamer, a veteran gamer, and a professional game reviewer will have very different priorities and scoring criteria. Somebody who actually plays Darkfall and has decided to stick with it obviously has a rating scale on which Darkfall scores higher than any other existing game, even World of Warcraft, because otherwise we would need to assume that he consciously went for a less good game. At least Fahrenheit, Celsius, and Kelvin agreed that the boiling point of water is higher than the freezing point. With game review scores you can't even find everyone agreeing whether Darkfall is a better or worse game than World of Warcraft. So how on earth can you even start discussing whether "4 out of 10" is a "fair" score?
Personally I only use two scales to rank games. One is the hyper-subjective Tobold scale, on which there are only two levels, "recommended" yes or no. The other scale is a financial one, how much money a game is making, which for games with very similar pricing models is equivalent to the number of subscribers. Out of 5.3 million people in the US and Europe who all had the free choice, 5 million preferred WoW over WAR, and 0.3 million preferred WAR over WoW. Of course the 0.3 million all think that the 5 million are misguided idiots, and vice versa, but economic theory says that decisions involving money usually reveal true preferences better than any opinion poll. But that financial scale has the big disadvantage that in most cases we don't have the numbers. For games like Darkfall or Lord of the Rings Online no subscription numbers have been published, and for Free2Play games like Second Life or Free Realms the published numbers are usually "people who made free accounts", of whom many stopped playing, or just play the game because it is free.
But whatever scale we use to rank games, there is no guarantee that this scale will correspond to YOUR personal one. At best they can provide some sort of probability. If I take a random PC gamer who hasn't tried MMORPGs before, and let him play WoW, WAR, and Darkfall each for a month, there is a high *probability* that he'll like WoW best, WAR medium, and Darkfall least. But if I take somebody who stopped playing MMORPGs in disgust when EA "ruined" Ultima Online by introducing PvP-free Trammel, the order of preference of these three games might well be reversed.
So the best advice before looking at a review score of a game you don't know, is to look at the review scores of games you *do* know, from exactly the same source and reviewer. If you agree with the review scores of the games you know, then there is a chance that the review score of the unknown game from the same source is relevant to you. If you don't agree with the previous reviews, then you and the reviewer are using very different scales, and the scores are simply irrelevant for you.