Hope you had fun reading through an abbreviated version of my thesis research on baseball attendance last week. I could talk about projected Cleveland Indians attendance numbers all day, especially with an Akron Aeros tie-in. Now, this week, we’re heading over to college basketball.
March means madness. It’s one of the most fun sports times of the year. I’m personally a huge basketball fan, but when it’s do-or-die in the NCAA tournament, it’s perfect for any fan. Yet I have an issue with the way bracket predictions are portrayed in the media, so I’ll cover that today. And closest to home here in Northeast Ohio, we have the Akron Zips. To follow up on my Sunday article about their success, I’ll take a more thorough look at their NCAA odds today. Hope you enjoy.
So first, let’s start with the statistical revolution that’s going on in all of sports. It relates here, trust me. And it couldn’t be more topical than right now, right on the heels of the 7th annual Sloan Sports Analytics Conference at MIT, which our very own Craig attended.
In a way, this “revolution” began with folks like Bill James, specific individuals who changed the way the game was portrayed with new statistics and new ways of looking at things. James was the guy whose work inspired the Moneyball approach popularized in the Oakland A’s organization and then baseball as a whole. Everyone started using those stats and analytics in a more thorough fashion.
Obviously, when I was younger, I tried to make up my own all-powerful baseball statistic too. It was some sort of strange combination of OBP and SLUG and re-adjusting for fielder’s choices and how the runners on base advanced. I thought it was the best thing since sliced bread. Until I realized how impractical it was … eventually. But this creation seems to be how the development process works for a young sports stats lover; Nate Silver has admitted he did the same thing.
In the world of college basketball, the idea is kind of similar as it relates to “bracketology,” the now over-popularized method of projecting the look of the NCAA tournament bracket before it even takes place. Joe Lunardi, not a stats guy by trade, was actually just an extremely curious and well-connected communications administrator at St. Joe’s 1 . In 2002, his first post ever at ESPN.com reportedly had 250,000 hits within two hours. And thus began the craze.
Yet there’s something missing in this analysis fad. And it’s actually an approach you’ve already seen several times in my writing here at WFNY 2 : a Nate Silver approach. Silver, who first got famous as one of the key early contributors to Baseball Prospectus and the creator of the PECOTA player development comparison tool, then struck gold with the 2008 election.
Starting as an anonymous writer on Daily Kos and eventually going public on his own site, www.fivethirtyeight.com, Silver became the star of aggregate analysis. He didn’t have anything uniquely new to share in terms of an all-powerful statistic like James. But he made up for it by averaging together all the available data in a proprietary fashion. He didn’t just average, though: he rated political polls based on their historical accuracy, historical partisan lean, recency and more. It was genius.
So I’m not saying that such an approach is the end-all, be-all to all important topics that include numbers. I don’t know the perfect answer – nor will I ever. But such an analysis can never hurt, especially as it compares to just one individual’s opinion, as is the case with Lunardi’s brackets at ESPN. They’re excellent and a great starting point; I won’t ever doubt that and I read them regularly. But is one post the answer? Is there a better way to analyze the “noise”? That’s why I’m here.
Now let’s move on to my method for today. I went out and aggregated bracket predictions in a number of ways. The end goal: determining a true estimate of the actual bubble pool via this method and, most directly, seeing what it means for those Akron Zips. This means I won’t be telling you explicitly whether the Ohio State Buckeyes will be a No. 3 or a No. 6 3 . Instead, I will be explaining how many teams we’re dealing with in this at-large/bubble world, who the supposed “locks” are for the tournament and then finally how the bubble pool looks in 2013. And all by using the Nate Silver-esque method of aggregating opinions as opposed to featuring only one.
The easiest way to begin? Working backwards. With the creation of the Dayton-hosted First Four in 2011, there are now 68 NCAA tournament teams. With 31 conferences, that means there are 31 automatic qualifiers and 37 at-large teams. So, based on our desired aggregate model, how many of those 31 conferences are likely going to be one-bid conferences no matter what?
Looking at 5 different bracket projections 4 , all 21 of the following conferences had exactly one team in the tournament. There was no disagreement. Each of the 10 other conferences had at least 2 teams in every bracket. So, here are those 21 one-team conferences:
1. Big South; 2. Colonial; 3. Northeast; 4. MEAC; 5. Atlantic Sun; 6. SWAC; 7. Big West;
8. MAAC; 9. Big Sky; 10. Ivy; 11. Southern; 12. Summit; 13. America East; 14. Southland;
15. Horizon; 16. Patriot; 17. MAC; 18. WAC; 19. Ohio Valley; 20. Sun Belt; 21. C-USA.
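For the curious, the "no disagreement" check behind that list can be sketched in a few lines of Python. This is only a rough illustration, not the actual spreadsheet work: the function names and the team-to-conference dictionary layout are my own assumptions, and the teams and conferences in the usage lines are placeholders rather than real bracket data.

```python
from collections import Counter

def bids_per_conference(bracket):
    """Count tournament teams per conference.

    `bracket` is assumed to be a dict mapping team name -> conference name.
    """
    return Counter(bracket.values())

def unanimous_one_bid(brackets):
    """Return the conferences that have exactly one team in every bracket."""
    counts = [bids_per_conference(b) for b in brackets]
    conferences = set().union(*counts)  # iterating a Counter yields its keys
    # A conference missing from a bracket counts as 0 there, so it fails the test.
    return sorted(c for c in conferences if all(cnt[c] == 1 for cnt in counts))

# Placeholder data only, not real projections.
bracket_1 = {"Team A": "Conf X", "Team B": "Conf Y", "Team C": "Conf Y"}
bracket_2 = {"Team A": "Conf X", "Team B": "Conf Y", "Team D": "Conf Z"}
one_bid = unanimous_one_bid([bracket_1, bracket_2])
print(one_bid)
```

In the placeholder example, only "Conf X" qualifies: "Conf Y" gets two teams in one bracket, and "Conf Z" is missing from the other.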
To begin this process of estimating the at-large pool, I first need to figure out how many teams we’re dealing with. If, in fact, there are exactly 21 one-team conferences, that leaves 47 spots for the final 10 conferences with all those potential teams in the general pool. Relatively simple math, as we’re starting again with 68 total tournament teams.
Next, before you even ask or start to be curious, yes, there is a reason for the ordering of conferences as you see above. Well, not perfectly for the first 14 conferences, but moderately so for the next 7. Thanks to the help of one other useful site, TeamRankings, there is actually a computer-generated average estimate of the number of tournament teams for each conference.
So the 1-14 conferences have no shot at all of an at-large team. That is pretty clear: TeamRankings identified them with 1.0 team on average. But then in the 15-21 conferences, there is at least a slight chance, per TeamRankings. And this potential chance of a two-team tournament for these conferences is then later backed up when actually starting to look at the at-large/bubble contenders, which we’ll get to soon enough.
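If you want to check that arithmetic yourself, here is a minimal Python sketch. The constant and function names are my own, chosen just for illustration; the numbers come straight from the paragraphs above.

```python
TOTAL_TEAMS = 68       # field size since the First Four began in 2011
NUM_CONFERENCES = 31   # each conference receives one automatic bid

def at_large_bids(total_teams=TOTAL_TEAMS, conferences=NUM_CONFERENCES):
    """At-large bids are whatever is left after the automatic qualifiers."""
    return total_teams - conferences

def spots_for_multi_bid_conferences(one_bid_conferences, total_teams=TOTAL_TEAMS):
    """Spots left for the conferences expected to send more than one team."""
    return total_teams - one_bid_conferences

print(at_large_bids())                      # 37
print(spots_for_multi_bid_conferences(21))  # 47
```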
Thus, we have 14 confirmed one-bid conferences. Then, there are 7 likely one-bid conferences, of which there are varying degrees of likelihood — led by C-USA’s Memphis, who is a likely lock no matter what occurs.
For now, I’ll go with 16-20 one-team conferences. Upsets will likely happen, and not just in these conferences (hello, Northwesterns of the world). Also, all of the possible at-large contenders from these conferences are likely leading their conferences right now, so if they all win out, these 21 conferences would send exactly 21 teams. But it’s March, so crazy things will occur eventually. That’s why I’m going with the extra wiggle room.
The bracket math adventure continues with the remaining 48-52 spots in the NCAA tournament. Let’s cover again why this is the case: With 16-20 one-team conferences, our analysis is covering all the conferences with multiple teams in. So for example, locking in the Sun Belt or Ohio Valley with one team crosses out the possibility of one of their conference’s other teams being in our pool. Again, shocking upsets can happen in other conferences. And this is accounted for here too.
The next step: Finding the locks for the NCAA tournament. Logically, everyone should agree that there are at least 25-30 secured teams that will make the tournament no matter what craziness could possibly occur in the next 11 days before Selection Sunday. There just aren’t enough opportunities to lose that much ground when so much success has occurred over the past 4 months of the season. Again, identifying the lock teams will lead to the final destination of the possible bubble teams.
In total, looking again at those 5 brackets from before, there were 40 most likely at-large teams that made it safely into each one. None were in the “Last 4 In” category in those brackets. Comparing these 40 to a previous list I made based on S-Curve rankings from three different sites 5 , I found a top 32 overall. Then, of course, there will be the remaining 8 teams that are near-locks, and from this information, there will be the remaining bubble teams likely fighting for 8-12 additional spots.
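The range math here is easy to lose track of, so here is a quick Python sketch of how the numbers connect. The function names are hypothetical ones I made up for illustration; the inputs (68 teams, 16-20 one-team conferences, 40 safe teams) are the ones used in this article.

```python
TOTAL_TEAMS = 68

def spots_range(one_bid_low, one_bid_high, total=TOTAL_TEAMS):
    """Spots left for multi-bid conferences, returned as (fewest, most).

    More one-bid conferences means fewer spots left over, so the high
    one-bid count produces the low end of the range.
    """
    return total - one_bid_high, total - one_bid_low

def bubble_spots(spots, safe_teams):
    """Spots still open once the safe teams are subtracted from the range."""
    low, high = spots
    return low - safe_teams, high - safe_teams

spots = spots_range(16, 20)
print(spots)                    # (48, 52)
print(bubble_spots(spots, 40))  # (8, 12)
```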
Here is first the list of 32 locks, as sorted by those S-Curve averages, which were updated a few days ago:
1. Indiana; 2. Kansas; 3. Gonzaga; 4. Miami; 5. Duke; 6. Georgetown; 7. Louisville; 8. Michigan State
9. Fla; 10. New Mexico; 11. Mich; 12. Kansas St; 13. Syracuse; 14. Marquette; 15. Arizona; 16. Wisc
17. Okl St; 18. Ohio St; 19. St Louis; 20. UNLV; 21. Pitt; 22. UCLA; 23. Notre Dame; 24. Oregon
25. VCU; 26. Colorado St; 27. Memphis; 28. NC St; 29. Butler; 30. UNC; 31. Minnesota; 32. Illinois
Now, here are those 8 near-locks, also sorted by the same mechanism:
33. SD St; 34. Mizzou; 35. Creighton; 36. Colorado; 37. Cal; 38. Oklahoma; 39. Cincy; 40. Wichita St
Just for clarity: These are the top 40 teams in terms of likelihood of receiving an at-large bid of some kind. They are the closest to locks for the tournament. So this has nothing to do with my personal opinion of these teams nor their actual likelihood to succeed in the tournament. And as I said above, those S-Curves have not necessarily been updated in the past few days, so some changes may have occurred.
All 40 of these teams made it safely into the 5 brackets I chose for aggregate analysis. To be more specific, looking back at the 21 conferences I shared earlier, the only team from one of those conferences in this list is C-USA’s Memphis. Again, as I said before, if the Tigers don’t win their conference tournament, that is one of the 1-5 major upsets I’m expecting by continuously saying 16-20 and 48-52.
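The "made it safely into all 5 brackets" test is really just set logic. Here is a rough Python sketch of the idea, assuming each bracket is stored as a dict with a "field" list and a "last_four_in" list. That layout and the function name are my own invention for illustration, and the teams below are placeholders, not real projections.

```python
def safe_teams(brackets):
    """Teams in every bracket's field that are in no bracket's 'Last 4 In'."""
    if not brackets:
        return set()
    in_all = set.intersection(*(set(b["field"]) for b in brackets))
    on_the_edge = set().union(*(b["last_four_in"] for b in brackets))
    return in_all - on_the_edge

# Placeholder data only.
example_brackets = [
    {"field": ["Team A", "Team B", "Team C"], "last_four_in": ["Team C"]},
    {"field": ["Team A", "Team B", "Team C"], "last_four_in": []},
    {"field": ["Team A", "Team C", "Team D"], "last_four_in": ["Team D"]},
]
print(safe_teams(example_brackets))
```

In the placeholder data, only "Team A" survives: "Team B" and "Team D" are each missing from at least one field, and "Team C" shows up in a "Last 4 In" group.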
Now, the fun part: Looking at the pool of possible bubble teams. Overall, I’ve seen 33 other teams that have been in some type of bubble watch list on the Internet. My goal continues to be looking at the last-entrant field to see where the Akron Zips rank.
So first, let’s eliminate the bottom feeders. These 12 teams made it onto some bubble lists, but unless something absolutely insane happens, they won’t get an at-large bid:
Providence; Iowa; St. John’s; Air Force; Xavier; Charlotte;
Bucknell; Stanford; BYU; Valparaiso; Indiana St; Stephen F. Austin.
Three of these teams (Bucknell, Valparaiso and Stephen F. Austin) are currently leading their mid-major conferences 6 , but still would need significant help to qualify for the tournament as an at-large team.
The elimination of those 12 teams leads us to this: the 21 most likely at-large contenders for the final 8-12 spots in the tournament. Again, some of these teams are mid-majors currently leading their conferences (like the Akron Zips), and their conference tournament success will impact the number of these final at-large teams that make the tournament. Independent of that, we have to analyze how they rank against each other.
I’ve placed the mid-major conference leaders in red italics again here for emphasis. The columns are the following: Tiers (my ranking of their slot); T Odds% (percentage of times they made the tournament in the 5 aggregate brackets I used); RPI (Rating Percentage Index, found at Warren Nolan); BPI (ESPN’s new Basketball Power Index); Pom (Ken Pomeroy’s advanced efficiency rankings); SOS (an average of SOS rankings from RPI, BPI, Ken Pomeroy and Jeff Sagarin’s rankings).
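For anyone who wants to reproduce the T Odds% and SOS columns, here is a minimal Python sketch under my own assumptions: each bracket is represented as a set of team names, and the SOS figure is a plain unweighted average of the four rankings. The function names and the placeholder teams are hypothetical.

```python
from statistics import mean

def tournament_odds_pct(team, bracket_fields):
    """Percentage of the aggregated brackets that include the team."""
    appearances = sum(team in field for field in bracket_fields)
    return 100 * appearances / len(bracket_fields)

def average_sos_rank(ranks):
    """Plain average of a team's strength-of-schedule ranks across systems."""
    return mean(ranks)

# Placeholder data only: five brackets, each a set of team names.
fields = [
    {"Team A", "Team B"},
    {"Team A"},
    {"Team A", "Team B"},
    {"Team A"},
    {"Team A", "Team B"},
]
print(tournament_odds_pct("Team A", fields))  # in all five brackets
print(tournament_odds_pct("Team B", fields))  # in three of five brackets
```

With only 5 brackets in the sample, T Odds% can only land on multiples of 20, which is worth keeping in mind when comparing teams.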
All five teams in Tier 1 actually also made the tournament in all five brackets. So that brings the total to 45 teams, leaving only 3-7 remaining spots. But those five teams were at least moderately tenuous in their standing: some were in the “Last 4 In” and most were usually 11 or 12 seeds.
The next seven teams in Tier 2 are then the likely favorites for what, per the math above, could be up to seven remaining spots. Middle Tennessee and Belmont are current conference leaders in the Sun Belt (20 on conference list above) and Ohio Valley (19). As you can see, their standings compare favorably to the other top bubble contenders, even if they lose their conference tournaments in the next week. The other five teams in this tier are all relatively major-conference contenders that have been going back-and-forth in recent brackets.
Then, finally, with teams 13-21, we have Tier 3, which is the home of those Akron Zips. According to this analysis, the Zips of the MAC (17) and Louisiana Tech of the WAC (18) are the next most likely mid-majors to compete for an at-large spot, but it just might be out of reach. Something would have to go miraculously well, such as multiple losses for teams in Tier 2, for them to jump up and contend for one of the 8-12 top spots despite also losing in the next week. These two teams’ strengths of schedule are just too low right now for them to be serious contenders, especially when you add in that extra loss.
There you have it folks, a detailed look at the March Madness chaos, through the lens of aggregate statistical analysis. Upsets are guaranteed to happen over the next 11 days until the bracket is finally decided, but this guide should help you cut through the “noise” (to again borrow a Nate Silver term) as you try to figure out who the final at-large contenders for NCAA tournament spots will be.
As of right now, things don’t look too hot for the Akron Zips. As a rough personal estimate of their at-large odds should they win out and then lose to, say, Ohio in the MAC Tournament Championship, I’d guess about 10-15%. The only way for Keith Dambrot’s upstart team to really secure their hopes for anything more than an NIT 7 is just to win in Cleveland. Their resume just doesn’t compare favorably with the Tennessees and Kentuckys of the world.
- And please, before you think I’m hating on Joe Lunardi a la Adrian Wojnarowski’s takedown of John Hollinger, relax. I’ve actually got an insane amount of respect for Lunardi. Nobody works harder in the sports media from mid-February to April than him. And he was amazing to me in a 30-minute phone call back in October 2010 when I was the sports editor for Flyer News. [back]
- Notably only in the last few weeks, NFL Draft, NBA Draft and Indians prospects. [back]
- For quick analysis: I’d lean toward the better number after Tuesday’s second half dominance of Indiana, but most brackets had them in the 5-7 range previously. So they’ll likely get situated as a 4-5, depending on any changes in the Big Ten tournament. [back]
- Key: Warren Nolan, Bracketville, Jerry Palm, Joe Lunardi and USA Today. [back]
- Key: TeamRankings, Joe Lunardi and Bracketville. [back]
- Matching up the teams with the one-bid conferences mentioned above: Bucknell — 16 Patriot; Valparaiso — 15 Horizon; Stephen F. Austin — 14 Southland. [back]
- With Tuesday night’s victory over Miami at the JAR, the Zips secured the regular-season MAC title. That guarantees at least a trip to the NIT. [back]