Sure, granted, data gets sparser the further back in time you go. Advanced statistics in sports have only really gotten popular in the last twenty years. But given large amounts of time to waste on this problem, it would be interesting to run these models on prior data and against prior events to see how effective they are over time. College basketball has recently been trending toward more randomness, given the higher percentage of three-point shots being taken nowadays versus twenty years ago. It would be interesting to see if this increased randomness makes games harder to predict now than in earlier periods.
See, that's the sort of thing that I find fascinating.
When the dynamics of the meta-game change, how does that affect our ability to predict the outcome of games going forward based on all the data we have versus smaller and more specific slices of the data for training? There's probably an entire PhD thesis waiting for somebody with a pile of sports knowledge, computing power, and the patience for sorting through a lot of tedious, scattered information – but I would definitely read that thesis.
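For what it's worth, here is a toy sketch in Python of the comparison I mean. Everything in it is made up for illustration: the `favorite_win_rate` numbers are hypothetical, not real results, and the "model" is just the average of past seasons. It walks forward one season at a time and scores a forecast built from all prior seasons against one built from only the last few, using the expected Brier score (squared error of a probability forecast).

```python
# Hypothetical numbers only: the share of games won by the favorite in each
# season, drifting downward to mimic a game that is getting more random.
favorite_win_rate = [0.76, 0.75, 0.75, 0.73, 0.72, 0.70, 0.69, 0.67, 0.66, 0.65]


def expected_brier(forecast, true_rate):
    """Expected squared error of forecasting `forecast` when the favorite
    actually wins with probability `true_rate` (lower is better)."""
    return true_rate * (1 - forecast) ** 2 + (1 - true_rate) * forecast ** 2


def backtest(rates, window=None):
    """Walk forward one season at a time. Forecast each season from the mean
    of the earlier seasons (all of them, or only the last `window`)."""
    scores = []
    for season in range(1, len(rates)):
        start = 0 if window is None else max(0, season - window)
        history = rates[start:season]
        forecast = sum(history) / len(history)
        scores.append(expected_brier(forecast, rates[season]))
    return sum(scores) / len(scores)


all_history = backtest(favorite_win_rate)
recent_only = backtest(favorite_win_rate, window=3)
print(f"all history: {all_history:.4f}  last 3 seasons: {recent_only:.4f}")
```

Because the made-up rates drift steadily in one direction, the recent-window forecast comes out ahead here. With real, noisy game results a short window would also be a noisier estimate, so which slice wins is exactly the open question.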
It would also be interesting to compare the predictive power of the various models run against one sport versus another. Is the bracket density of college football more amenable to statistical analysis than college basketball? I have no idea; I barely know enough to even ask the question. But it's interesting!
I am all for more analysis of algorithms and looking at their applicability across novel regimes. That's cool stuff. Which is why I'm following this series of posts even though I really have no grounding or interest in the sports side of things. It's fascinating in and of itself.