In our previous post we mentioned a graph gist containing a data set of World Cup matches but didn't go into any detail about what the World Cup Graph looks like.
The following diagram, created using Alistair Jones' excellent arrows, shows the domain model at a high level:
The World Cup sits at the centre of our graph and has relationships to lots of metadata. At the moment not much is going on around matches, but the intention is to introduce player data into the graph, which will change that.
At first glance it may seem that we've pulled out a lot of nodes where properties could have been used. I only pulled out a node if I thought it would be connected to multiple times and that doing so might lead to cool insights.
For example, Round is represented as a node so that we can write queries to see how countries fared across World Cups.
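As a rough sketch of the kind of query this enables, the following lists the rounds one country appeared in at each World Cup. The labels, relationship types and property names used here (Country, Match, Round, WorldCup, PLAYED_IN, IN_ROUND, CONTAINS_MATCH, name, year) are assumptions made for illustration and may not match the names in the gist, so check them against the diagram before running it.

```cypher
// Hypothetical schema: every label, relationship type and property
// below is an assumption, not taken from the actual World Cup graph.
MATCH (wc:WorldCup)-[:CONTAINS_MATCH]->(m:Match)-[:IN_ROUND]->(r:Round),
      (c:Country)-[:PLAYED_IN]->(m)
WHERE c.name = "Brazil"
// One row per World Cup, with the distinct rounds the country played in
RETURN wc.year AS year, collect(DISTINCT r.name) AS rounds
ORDER BY year
```

Because Round is a node rather than a property on each match, the same pattern can start from a round and fan out to every country that reached it.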
Time has been made a first-class citizen in case there are interesting insights to be found related to when teams played their matches.
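With time modelled as its own node, a query along these lines could show which kick-off times are most common. This is only a sketch: the Time label, the PLAYED_AT_TIME relationship type and the time property are hypothetical names, so substitute whatever the model actually uses.

```cypher
// Hypothetical schema: Time, PLAYED_AT_TIME and t.time are assumed names.
MATCH (m:Match)-[:PLAYED_AT_TIME]->(t:Time)
// Count how many matches share each kick-off time
RETURN t.time AS kickOff, count(m) AS matches
ORDER BY matches DESC
LIMIT 10
```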
You can also explore the contents of the CSV file by executing the following Cypher:
LOAD CSV WITH HEADERS FROM "https://dl.dropboxusercontent.com/u/7619809/matches.csv" AS csvLine RETURN csvLine LIMIT 1
Tweak the limit to have a look at more rows and get a feel for the data.
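If you'd rather see just the column names than a whole row, a small variation on the same query should do it. It uses only the file URL from above and Cypher's keys() function, which returns the keys of the map that LOAD CSV WITH HEADERS produces for each row; no column names are assumed.

```cypher
// Same CSV file as above; return only the header names from the first row.
LOAD CSV WITH HEADERS FROM "https://dl.dropboxusercontent.com/u/7619809/matches.csv" AS csvLine
RETURN keys(csvLine) AS columns
LIMIT 1
```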
Feel free to share any thoughts on the model in the comments or just fork the gist and make your own changes.
We've got a read-only version of the database available for you to play with at worldcup-neo4j-db.herokuapp.com/browser/ and Rik has recorded an excellent video to help you get started with exploring it.