Finally, after at least a couple of years, I have returned to rewriting the ranking code I wrote.
I got a new copy of the data from the bgstats app that I use to record my game plays. The app also holds the master list of all the games that make up my game collection, and it syncs all of that info back onto the bgg website for me.
The point I’m making is that this new data file is different to the one I originally got from the app some six or seven years ago. Back then it was just a csv file of the game plays.
Now it's a json file that contains all of the app's data: not only game results, but also the list of games, locations, tags, etc.
Luckily, using Python as the language of choice for this project means there is a nice json library I can import to do all the heavy lifting of reading in and processing the json file I have.
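As a rough illustration of that heavy lifting, here is a minimal sketch using Python's built-in json module. The sample data and every key name in it (games, plays, playerScores and so on) are made up for this example and are an assumption, not the real bgstats export format.

```python
import json

# Hypothetical, heavily simplified export. The real bgstats file has many
# more fields, and its key names may well differ from these.
sample_export = """
{
  "games": [{"id": 1, "name": "Carcassonne"}, {"id": 2, "name": "Azul"}],
  "plays": [
    {"gameRefId": 1, "playDate": "2023-01-05",
     "playerScores": [{"playerRefId": 10, "score": 85, "winner": true},
                      {"playerRefId": 11, "score": 70, "winner": false}]}
  ]
}
"""

# json.loads parses a string; for a file on disk, json.load(file_object)
# does the same job.
data = json.loads(sample_export)

# The top-level keys are the different kinds of data held in the export.
top_level_keys = sorted(data.keys())
print(top_level_keys)

# Reaching the nested scores by hand means looping over arrays within arrays.
scores = [ps["score"] for play in data["plays"] for ps in play["playerScores"]]
print(scores)
```

Once loaded, the whole file is just ordinary Python dicts and lists, so the hard part is not the parsing but finding your way through the nesting.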
Sadly for me, the json used by bgstats is not a simple format. It contains arrays within arrays! That makes it a nightmare to write code that gets to the information I need for this project, even with a json library.
Now, if you had told me before I started this second attempt at a rewrite that the bits of data analysis I’ve been picking up (on different projects) would be of use here, I would have been a bit sceptical.
However, I found out I could use the pandas library to extract these arrays within arrays and get the data I want into a dataframe.
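One way pandas can flatten arrays within arrays is `pd.json_normalize`, sketched below. This is only an illustration of the general technique, not necessarily the approach used in this project, and the play records and key names (gameRefId, playDate, playerScores) are hypothetical stand-ins for whatever the real file contains.

```python
import pandas as pd

# Hypothetical play records: each play holds a nested array of player scores.
plays = [
    {"gameRefId": 1, "playDate": "2023-01-05",
     "playerScores": [{"playerRefId": 10, "score": 85, "winner": True},
                      {"playerRefId": 11, "score": 70, "winner": False}]},
    {"gameRefId": 2, "playDate": "2023-01-12",
     "playerScores": [{"playerRefId": 10, "score": 40, "winner": False},
                      {"playerRefId": 11, "score": 55, "winner": True}]},
]

# record_path names the nested array to unpack into rows;
# meta lists the parent fields to copy onto every unpacked row.
scores_df = pd.json_normalize(
    plays,
    record_path="playerScores",
    meta=["gameRefId", "playDate"],
)
print(scores_df)
```

The result has one row per player per play, with the game and date repeated on each row, which is a much friendlier shape than the original nesting.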
Having the data in a dataframe opens up all sorts of cool data analysis stuff I can do.
So not only can I do the paired comparisons and top 10 lists, but I can also produce stats that not even the bgstats app currently offers.
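To give a flavour of the kind of stats a dataframe makes easy, here is a small sketch. The games, players, scores and column names are all invented for the example, and the two stats shown (most-played games and win rate per player) are just plausible illustrations, not the actual stats from this project.

```python
import pandas as pd

# Invented data: one row per player per play.
scores_df = pd.DataFrame({
    "play_id": [1, 1, 2, 2, 3, 3],
    "game": ["Azul", "Azul", "Carcassonne", "Carcassonne", "Azul", "Azul"],
    "player": ["Ann", "Bob", "Ann", "Bob", "Ann", "Bob"],
    "score": [55, 40, 85, 70, 48, 61],
    "winner": [True, False, True, False, False, True],
})

# Top 10 most-played games: count distinct plays per game, keep the largest.
plays_per_game = scores_df.groupby("game")["play_id"].nunique().nlargest(10)
print(plays_per_game)

# Win rate per player: the mean of a True/False column is the share of wins.
win_rate = scores_df.groupby("player")["winner"].mean()
print(win_rate)
```

Both stats are a single groupby each, which is a lot less code than walking the nested json by hand.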