KRACH-ey!
Posted by: ron
Just a reminder to everyone: the KRACH wouldn’t really have solved the problems of the Pairwise this year either. In fact, the KRACH had TWO sub-.500 teams in the tournament, Minnesota-Duluth and Wisconsin.
In all fairness, I don’t believe the Pairwise or the KRACH is the problem per se. I believe what has happened this year is a SYMPTOM of a much bigger problem: the excess of conference games and the dramatically unbalanced non-conference schedules that have become a mainstay of college hockey (and sports in general).
Because there are so few non-conference games and so few common-opponent comparisons, outliers such as the Nebraska-Omaha / Princeton series can dramatically swing comparisons. The excess of league games has been overlooked mainly because, with the exception of the ECAC, which plays only 22 games (probably due to the Ivies), this is the way it has always been. The issue is that these statistical systems are working from impractically small samples when trying to compare teams against ALL of NCAA Hockey.
The BCS has the same problem because of the limited number of games played overall in college football. Somehow 11 or 12 games determine a champion among 100-plus teams, and guess who’s usually at the top? Teams from the power conferences.
In the short term, the likely fix to our Pairwise issues will be a band-aid: no teams below .500 in the tournament, road bonuses returning, who knows. But that doesn’t fix the problem. The data set is the problem. The question is, how do we fix it, and can we? Or is this the best it will ever get?
March 25th, 2008 at 3:23 pm
True ‘dat … not sure what we can do about it though.
March 26th, 2008 at 9:25 am
A required number of road non-conference games for every team would be a good place to start. Another thought is a cap on the number of conference games, scaled to the number of teams in the conference.
March 28th, 2008 at 4:11 am
Actually, the problem seems mostly to be that a bunch of good teams happen to be in the same conference, playing each other, and driving down each other’s winning percentages. There is one way in which KRACH can take the results of a few games too literally, which is illustrated in the most extreme case by giving an undefeated team an infinite rating, so that losing to them doesn’t change your rating at all. From another way of looking at things, this is because KRACH starts with implicit ignorance about what each team’s rating is: it’s just as likely to be between 100 and 300 as between 10 and 30, or between 3000 and 9000.
You can instead put in some sort of initial guess that each team’s rating has some probability distribution that drops off as you get away from 100, and then let the game results pull it away from that. A nice one is to assume that the predicted chance to beat a team with a rating of 100 is equally likely to fall anywhere between 0% and 100%. So a team starts off with the range from 100 to 150 (50% to 60%) being as likely as 400 to 900 (80% to 90%). It turns out that the best guesses given the results are then just what they would be if you took ordinary KRACH but added two games–one win and one loss–for each team against a “fictitious team” with a rating of 100. So it turns out what Ken Butler used to do (adding a tie against a fictitious team to make everything finite) was halfway to “the right thing to do”.
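For anyone who wants to play with the idea, here is a rough Python sketch of it. To be clear about what is assumed: this is not the DIY ratings script, the names `win_prob` and `krach` are made up for illustration, ties are ignored, and the update is the standard iterative fit for this kind of model (each team's rating is reset to its wins divided by the sum of games/(own rating + opponent rating)). The `prior_games` argument adds that many fictitious wins and the same number of fictitious losses against a team pinned at 100.

```python
def win_prob(rating, opponent=100.0):
    """Predicted chance that a team rated `rating` beats one rated `opponent`."""
    return rating / (rating + opponent)


def krach(games, prior_games=0.0, anchor=100.0, iterations=1000):
    """Fit KRACH-style ratings from a list of (winner, loser) pairs.

    prior_games: number of fictitious wins AND fictitious losses each team
    gets against a team whose rating is fixed at `anchor` (hypothetical
    parameter; 0 gives plain KRACH, 1 gives the one-win-one-loss version).
    """
    teams = sorted({team for game in games for team in game})
    wins = {team: 0.0 for team in teams}
    played = {}  # played[(a, b)] = number of games between a and b
    for winner, loser in games:
        wins[winner] += 1
        played[(winner, loser)] = played.get((winner, loser), 0) + 1
        played[(loser, winner)] = played.get((loser, winner), 0) + 1

    ratings = {team: anchor for team in teams}
    for _ in range(iterations):
        updated = {}
        for a in teams:
            # Real games: each one contributes 1 / (own rating + opponent rating).
            denom = sum(
                played.get((a, b), 0) / (ratings[a] + ratings[b])
                for b in teams
                if b != a
            )
            # Fictitious games: 2 * prior_games of them against the anchor team.
            denom += 2 * prior_games / (ratings[a] + anchor)
            updated[a] = (wins[a] + prior_games) / denom
        ratings = updated
    return ratings


# Toy schedule: A beats B once, while B and C split two games.
toy_games = [("A", "B"), ("B", "C"), ("C", "B")]
plain = krach(toy_games)                    # undefeated A never settles down
regular = krach(toy_games, prior_games=1)   # every rating stays finite
```

With `prior_games=0` the undefeated team's rating just keeps climbing the longer you iterate, which is the "infinite rating" problem; with `prior_games=1` it settles at a finite value and the scale is pinned by the fictitious team at 100.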
This matters a lot for football or lacrosse, where there are not a lot of games, but in hockey, it doesn’t make a big difference. You can actually calculate this with my DIY ratings script: it drops UMD to 15th, and out of the tournament, but Wisconsin still makes it at #12. (There’s also a way to estimate the errors on the rankings, but I haven’t implemented it for hockey, and we’re also not quite handling ties the way we should.)