Wednesday, August 08, 2007

Data Mining: The Referees

This is the first post in the Data Mining series. The topic is NHL Referees.

First of all, there are some basic assumptions that I'm operating under, so please remember these as you read the posts:
  1. There are two referees at each game, and it's impossible to tell which referee made a given penalty call. An average can't be tied directly to one individual referee, so treat the numbers as indicative of a trend rather than a specific indictment.
  2. The box score data, while I believe it to be accurate, could have errors.
  3. All data in this series is from the 2006-2007 season, unless I specifically state otherwise.
With that out of the way, let's take a look at what I'll cover in the series:
  • Who has the highest PIM average per game? Any differences for home vs road teams?
  • Who has the lowest PIM average per game? Any differences for home vs road teams?
  • Who is involved in the highest / lowest scoring games?
  • Who is involved in the games with the most / fewest shots?
That's the extent of part 1. If it goes well (and I enjoy writing these), then I'll go more in depth with part 2.
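
For readers who want to follow along, here is a minimal sketch of how per-referee PIM averages like these could be computed. It is only an illustration under stated assumptions, not the actual method or data behind this series: the record layout (`referees`, `home_pim`, `road_pim`), the function name `pim_averages`, the referee names, and every number are made up. Because a box score can't say which referee made a call, each game is credited to both referees who worked it.

```python
from collections import defaultdict

# Hypothetical box score rows: names and numbers are invented for illustration.
games = [
    {"referees": ("Ref A", "Ref B"), "home_pim": 10, "road_pim": 14},
    {"referees": ("Ref A", "Ref C"), "home_pim": 6, "road_pim": 8},
    {"referees": ("Ref B", "Ref C"), "home_pim": 12, "road_pim": 20},
]


def pim_averages(games):
    """Average PIM per game for each referee, split by home and road team.

    Every game counts toward BOTH referees who worked it, since the
    box score does not attribute a call to an individual referee.
    """
    totals = defaultdict(lambda: {"games": 0, "home": 0, "road": 0})
    for game in games:
        for ref in game["referees"]:
            t = totals[ref]
            t["games"] += 1
            t["home"] += game["home_pim"]
            t["road"] += game["road_pim"]

    return {
        ref: {
            "games": t["games"],
            "home_avg": t["home"] / t["games"],
            "road_avg": t["road"] / t["games"],
            "total_avg": (t["home"] + t["road"]) / t["games"],
        }
        for ref, t in totals.items()
    }


averages = pim_averages(games)

# List referees from highest to lowest total PIM per game.
ranked = sorted(averages.items(), key=lambda kv: kv[1]["total_avg"], reverse=True)
for ref, row in ranked:
    print(f"{ref}: {row['total_avg']:.1f} PIM/game "
          f"(home {row['home_avg']:.1f}, road {row['road_avg']:.1f}, "
          f"{row['games']} games)")
```

Since every game is counted twice (once per referee), these averages describe the games a referee worked, not the calls that referee made, which is the same caveat as assumption 1 above.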

Expect one post per day until the series is complete, unless I get ambitious...
