Odd Man Out – How did South Carolina Reach the Final Four?

It’s been an interesting tournament this year. For the first time since 2007, no 13 seed or higher won a game in the opening round. That was bad for my wallet, as there were several prime candidates to pull off the upset (see my upset picks). However, upsets are bound to happen; this tournament they just came later. Particularly of note are Wisconsin dropping Villanova (not too much of a surprise given Wisconsin’s tournament prowess) and South Carolina’s second-round defeat of Duke, a perennial favorite to win it all. What is captivating here is not that a 7 seed upset a 2 seed, it’s how a team with below-average shooting (the most important of the four factors) can pull off a string of victories against opponents that are superior, at least on paper.

A few defensive specialist teams found their way into this year’s tournament: Northwestern, South Carolina, and, probably most notable, “Press” Virginia. West Virginia has gotten the most attention, and rightfully so, for leading the nation in turnovers forced by a good margin. However, following close behind is South Carolina, with an almost equally stifling defense. It has won them games all year, especially considering their offense is doing them few favors. Their shooting percentages are toward the bottom nationally. They make up for this by collecting offensive rebounds and getting to the foul line quite a bit, only to be average free-throw shooters once they get there.

So how did a team that lost 6 of 9 games leading up to the tournament, with an average-at-best offense, pull out 4 straight wins over superior opponents? Let’s take a look.

South Carolina was consistently on the boards, getting rebounds after their misses. They have done this all year, but in the tournament their offensive rebounding is slightly above their season average. Defensively they have not let up on forcing turnovers. In each of the last four games, they have forced turnovers on more than 23.3% of their opponents’ possessions; that’s almost 1 in every 4 possessions where the opposition does not even get a shot. Again, this is just slightly higher than their season average. So what really changed for South Carolina in the tournament? They found their shot. Whether it’s a streak of good luck or stepping up at the right time, a team that made an average of 46% of its 2 pointers on the season has stepped up to making just below 55% of them, on par with the elite shooting offenses in college basketball. Four games is a very small sample statistically; it could fall into the realm of random luck, or it could be the start of an uptick in shooting efficiency. The question is how long their shooting touch will last. With two games to go, can they hold on to it, or will they fade away into Final Four history?

It all comes down to their next opponent. Coincidentally, Gonzaga has already seen and beaten the other notable defense-minded teams, Northwestern and West Virginia, earlier this tourney. The Zags excel in shooting defense and have a potent offense of their own. Odds are South Carolina regresses to their mean, especially against a team with excellent man coverage that hardly ever fouls the shooter, something the Gamecocks have come to rely on. If they don’t find their shot, expect the Zags to have a performance similar to the one they had against Xavier, and turn this game into a blowout.

Anatomy of the March Madness Upset Part 3 – 2017 Picks

For those who have been following along, we have been analyzing the past 15 years of March Madness upsets to try to figure out who the best candidates to bust brackets are. Catch up on part 1 here and part 2 here. As a quick recap, we saw some common patterns across history’s college basketball underdogs, particularly that most of these teams excelled in at least one of the following areas: 2 point percentage, 3 point percentage, defensive turnover percentage, offensive rebounding percentage, or adjusted defensive efficiency (points per possession allowed, adjusted for competition). Why did we go through all of that work? To better select candidates for 2017’s tournament, which is quickly approaching.

Let’s get right to it, with my March Madness analysis.

Lower Seed | Higher Seed | Seeds | AdjEM Diff | 2Pt% | 3Pt% | Def TO% | Off Reb% | AdjDefEff | Synopsis | Money Line
Troy | Duke | 15-2 | 22.42 | 50.8 | 35.5 | 17.7 | 30.6 | 107.3 | | 3000
North Dakota | Arizona | 15-2 | 23.06 | 51.6 | 28.4 | 20 | 26.2 | 103.2 | | 1427
Jacksonville St. | Louisville | 15-2 | 26.14 | 50.9 | 36.7 | 16.8 | 31.5 | 104.8 | | 3600
Northern Kentucky | Kentucky | 15-2 | 25.93 | 51.5 | 34.2 | 20.1 | 36.3 | 109.5 | | 3310
New Mexico St. | Baylor | 14-3 | 17.79 | 54.5 | 33.4 | 19.2 | 36 | 102.7 | candidate | 616
Florida Gulf Coast | Florida St. | 14-3 | 17.73 | 55.5 | 34 | 17.3 | 34.5 | 104.5 | candidate | 559
Iona | Oregon | 14-3 | 19.83 | 49.3 | 39.8 | 18.1 | 27 | 106.2 | candidate | 1050
Kent St. | UCLA | 14-3 | 21.22 | 48.6 | 31.4 | 19.3 | 38.1 | 102.7 | | 1750
East Tennessee State | Florida | 13-4 | 14.9 | 55.4 | 38.2 | 22 | 30 | 96.5 | candidate | 490
Bucknell | West Virginia | 13-4 | 18.08 | 54.6 | 37.7 | 19.8 | 26.8 | 100.3 | candidate | 1000
Vermont | Purdue | 13-4 | 12.34 | 55.6 | 36.7 | 19.7 | 30.1 | 98.6 | candidate | 358
Winthrop | Butler | 13-4 | 16.28 | 51.2 | 38 | 18.7 | 27 | 101.8 | candidate | 500
UNC Wilmington | Virginia | 12-5 | 14.28 | 56.1 | 36 | 20.4 | 32.5 | 105.4 | candidate | 353
Princeton | Notre Dame | 12-5 | 7.51 | 50.4 | 38.1 | 20.6 | 24.2 | 96.9 | candidate | 253
Nevada | Iowa St. | 12-5 | 9.77 | 49.3 | 38.5 | 15.9 | 30.6 | 101.2 | candidate | 236
Middle Tennessee | Minnesota | 12-5 | 1.94 | 53.8 | 37 | 20.2 | 30.2 | 97 | | -115
Providence/USC | SMU | 11-6 | | | | | | | |
Xavier | Maryland | 11-6 | -0.5 | 52 | 34 | 17.8 | 35.2 | 99.5 | | 113
Rhode Island | Creighton | 11-6 | 3.97 | 50.5 | 34 | 19.5 | 33.1 | 95 | | -102
Kansas St./Wake Forest | Cincinnati | 11-6 | | | | | | | |

This year’s crop of 11-15 seeds is particularly interesting to me. Let’s break it down. First off, the 15 seeds: I am not in love with any of these potential upsets this year. All have below-average defenses and don’t stand out in any offensive categories. I will be skipping betting these match-ups this year.

Let’s talk 11-6 match-ups. We won’t know two of these until the play-in games. I like both USC and Wake Forest to take care of business and secure spots in the tourney, but I skipped my analysis for these games as they really aren’t upsets; these are bigger-name schools that see a similar quality of opponent all year long.

The classic 12-5 is the first place people typically look for “upsets”. My usual quarrel with the 11-6 games not being upsets has crept into the 12-5 match-ups this year. In particular, Middle Tennessee State’s game is considered a pick’em by Vegas. The other 3 match-ups are your more traditional 12-5’s, where we typically see an upset 25% of the time. That being said, I like all of the 12’s this year as potential upset candidates. If Princeton can string together a run of threes they have a shot, and the Ivy League is well represented in our upset list. UNC Wilmington shoots the ball extremely well, which they will need to do to upset Virginia, arguably the best defensive team around. Similarly to Princeton, if Nevada gets hot they become a real upset threat.
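To put the Money Line column next to that 25% historical rate, here is a minimal sketch that converts an American-style money line into the win probability it implies. It assumes the column lists the lower seed’s line, and the function name `implied_probability` is my own. Keep in mind the implied number includes the bookmaker’s margin, so it slightly overstates the true probability.

```python
def implied_probability(money_line):
    """Convert an American money line to the win probability it implies."""
    if money_line > 0:
        # Underdog: risk 100 to win `money_line`
        return 100.0 / (money_line + 100.0)
    # Favorite: risk `-money_line` to win 100
    return -money_line / (-money_line + 100.0)


# Lines for the four 12 seeds, copied from the table above
# (assumed to be the lower seed's money line).
lines = {
    "UNC Wilmington": 353,
    "Princeton": 253,
    "Nevada": 236,
    "Middle Tennessee": -115,
}

for team, line in lines.items():
    print(f"{team}: {implied_probability(line):.1%}")
```

A line near +300 corresponds to the 25% mark, so by this rough measure the market is pricing Princeton and Nevada a bit above the typical 12 seed.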

The real value bets this year lie in the 13-4 and 14-3 games. These teams get to avoid the Arizonas, Dukes, and Louisvilles, who could legitimately vie for a one seed. They are also, as a whole, really good shooters, and shooting is probably the most important factor in determining the outcome of a college basketball game, especially an upset. New Mexico State, Florida Gulf Coast, East Tennessee State, Bucknell and Vermont all shoot above 54% on their 2 point shots. Iona and Winthrop are equally deadly from 3. The gems of this class are Bucknell and my two favorites, Vermont and East Tennessee State. All three shoot above 54% from 2 and 36% from 3; if they get hot, look out. They also play a little better defense than some of their peers with similar seeds. Rounding out the 14 seeds, New Mexico State, FGCU, and Iona are all efficient shooters as well. Kent State rebounds well but doesn’t shoot at an elite level, and is likely going to be outmatched by UCLA, the nation’s top offense.
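The screen I applied to the 13 and 14 seeds can be sketched in a few lines. The shooting numbers are copied from the table above; the cutoffs of 54% and 36% are the ones mentioned in the paragraph, and the function name `dual_threats` is just for illustration.

```python
# (team, seed, 2Pt%, 3Pt%) for the 13 and 14 seeds, from the 2017 table
teams = [
    ("New Mexico St.", 14, 54.5, 33.4),
    ("Florida Gulf Coast", 14, 55.5, 34.0),
    ("Iona", 14, 49.3, 39.8),
    ("Kent St.", 14, 48.6, 31.4),
    ("East Tennessee State", 13, 55.4, 38.2),
    ("Bucknell", 13, 54.6, 37.7),
    ("Vermont", 13, 55.6, 36.7),
    ("Winthrop", 13, 51.2, 38.0),
]

TWO_PT_CUTOFF = 54.0    # percent, from the text
THREE_PT_CUTOFF = 36.0  # percent, from the text


def dual_threats(rows, two_cut=TWO_PT_CUTOFF, three_cut=THREE_PT_CUTOFF):
    """Return teams that clear both the 2 point and 3 point cutoffs."""
    return [
        name
        for name, _seed, two_pct, three_pct in rows
        if two_pct > two_cut and three_pct > three_cut
    ]


gems = dual_threats(teams)
print(gems)  # ['East Tennessee State', 'Bucknell', 'Vermont']
```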

What we didn’t see in this year’s class of potential upsets is any team that excels at creating turnovers or is elite on defense. However, we were graced with excellent shooters and some decent rebounding teams. I would take field goal efficiency every day. Anyway, there you have it, my 2017 March Madness upset candidates. Note I say candidates: many of these teams will go on to be blown out, but most will compete, and a couple are going to prevail. Hopefully we have narrowed our upset contenders correctly. Focus on the 13 and 14 seeds this year, particularly Vermont, East Tennessee State, and Bucknell, if they can handle the pressure (see what I did there?).

Anatomy of the March Madness Upset Part 2

Following up on my last post, I wanted to evaluate what common trends we could find in the March Madness upsets of the last 15 years. Last time we looked at the difference between efficiencies, but this time I want to break it down further and look at some individual statistics behind these upsets. Some interesting patterns arise, but it remains clear there are different play styles involved and multiple ways in which these upsets happen. Let’s get right to the data behind these college basketball upsets.

Year Lower Seed Higher Seed Score Seed AdjEMDiff 2Pt% 3Pt% Def TO% Off Reb% AdjDefEff
2016 Middle Tennessee Michigan State 90–81 15-2 -25.68 47.80 39.2 19.4 28.70 98.80
2016 Stephen F. Austin West Virginia 70–56 14-3 -11.38 54.50 36.9 25.9 33.50 95.50
2016 Hawaii California 77–66 13-4 -7.90 54.10 32.2 19.8 30.40 93.10
2016 Arkansas-Little Rock Purdue 85–83 (2 OT) 12-5 -12.18 48.90 38.2 21 26.70 95.00
2016 Yale Baylor 79–75 12-5 -6.07 51.40 36 17.8 39.30 94.90
2016 Northern Iowa Texas 75–72 11-6 -6.97 51.20 37.2 18.5 17.70 97.30
2016 Gonzaga Seton Hall 68–52 11-6 1.61 54.30 37.8 15.1 32.10 94.40
2016 Wichita State Arizona 65–55 11-6 -0.68 49.20 32.3 23.2 31.60 87.60
2015 Georgia State Baylor 57–56 14-3 -13.56 53.40 32.2 23.20 30.00 99.10
2015 UAB Iowa State 60–59 14-3 -20.17 46.60 33.2 19.70 34.20 99.80
2015 Dayton Providence 66–53 11-6 -3.43 52.60 35.60 21.10 23.30 93.00
2015 UCLA SMU 60–59 11-6 -3.37 47.40 36.8 17.90 28.60 96.10
2014 Mercer Duke 78–71 14-3 -18.34 51.60 38.80 19.00 31.90 101.80
2014 Stephen F. Austin VCU 77–75 (OT) 12-5 -9.91 52.30 34.90 23.60 38.12 100.90
2014 North Dakota State Oklahoma 80–75 (OT) 12-5 -8.89 55.40 34.70 17.10 30.60 103.10
2014 Harvard Cincinnati 61–57 12-5 -1.77 49.70 38.70 21.20 32.30 94.70
2014 Tennessee Massachusetts 86–67 11-6 7.57 50.60 31.90 16.90 39.70 93.60
2014 Dayton Ohio State 60–59 11-6 -8.76 50.40 37.70 18.80 34.00 99.00
2013 Florida Gulf Coast Georgetown 78–68 15-2 -18.78 52.30 33.40 22.10 32.50 96.80
2013 Harvard New Mexico 68–62 14-3 -13.76 51.50 39.80 20.90 25.60 99.30
2013 LaSalle Kansas State 63–61 13-4 -4.97 49.30 37.70 21.30 29.00 96.20
2013 Ole Miss Wisconsin 57–46 12-5 -6.42 49.40 32.40 21.50 34.10 93.60
2013 California UNLV 64–61 12-5 -4.53 48.80 30.20 16.80 32.50 92.30
2013 Oregon Oklahoma State 68–55 12-5 -4.56 49.10 33.30 22.00 35.70 88.10
2013 Minnesota UCLA 83–63 11-6 3.98 48.90 33.70 20.20 43.80 93.10
2012 Lehigh Duke 75–70 15-2 -11.88 49.40 34.70 21.30 31.00 96.60
2012 Norfolk State Missouri 86–84 15-2 -29.10 50.50 31.50 19.60 33.80 99.90
2012 Ohio Michigan 65–60 13-4 -7.30 47.70 34.00 26.40 33.90 92.00
2012 South Florida Temple 58–44 12-5 -4.05 49.00 31.20 18.70 34.00 88.10
2012 VCU Wichita State 62–59 12-5 -10.57 45.90 33.40 27.30 33.40 90.80
2012 Colorado UNLV 68–64 11-6 -7.79 48.30 34.60 18.40 29.20 93.30
2012 North Carolina State San Diego State 79–65 11-6 0.65 49.40 35.50 18.60 35.80 95.20
2011 Morehead State Louisville 62–61 13-4 -15.80 50.00 34.20 22.70 41.20 94.70
2011 Richmond Vanderbilt 69–66 12-5 -2.16 49.90 39.00 19.60 28.80 93.40
2011 Marquette Xavier 66–55 11-6 1.43 50.50 34.90 20.60 35.80 93.80
2011 VCU Georgetown 74–56 11-6 -9.96 48.00 37.00 22.10 30.70 95.40
2011 Gonzaga St. John’s 86–71 11-6 -0.29 51.90 36.10 20.80 36.30 91.60
2010 Ohio Georgetown 97–83 14-3 -16.50 46.40 36.50 21.60 31.20 95.60
2010 Murray State Vanderbilt 66–65 13-4 -3.08 54.20 38.10 24.00 39.60 90.40
2010 Cornell Temple 78–65 12-5 -7.23 51.10 43.30 20.90 31.50 97.70
2010 Washington Marquette 80–78 11-6 -2.65 49.50 33.60 22.20 36.60 90.10
2010 Old Dominion Notre Dame 51–50 11-6 -0.71 49.20 31.70 22.60 42.10 87.20
2009 Cleveland State Wake Forest 84–69 13-4 -8.54 47.40 30.40 24.10 33.50 90.20
2009 Wisconsin Florida State 61–59 (OT) 12-5 2.04 47.70 36.00 19.30 31.60 92.70
2009 Arizona Utah 84–71 12-5 -3.01 50.90 38.90 18.00 35.60 98.40
2009 Western Kentucky Illinois 76–72 12-5 -11.79 49.90 37.70 19.70 37.50 99.70
2009 Dayton West Virginia 68–62 11-6 -15.17 46.00 32.80 21.90 37.70 89.60
2008 Siena Vanderbilt 83–62 13-4 -6.49 48.10 38.20 24.00 31.30 96.90
2008 San Diego Connecticut 70–69 (OT) 13-4 -14.22 48.70 33.70 22.90 32.80 93.00
2008 Villanova Clemson 75–69 12-5 -9.31 47.80 34.40 23.40 36.00 91.80
2008 Western Kentucky Drake 101–99 (OT) 12-5 -7.81 51.20 38.90 24.50 36.80 94.00
2008 Kansas State Southern California 80–67 11-6 -0.97 50.20 32.00 22.40 44.30 91.40
2007 Winthrop Notre Dame 76–64 11-6 -6.04 55.10 35.50 20.60 35.40 93.10
2007 VCU Duke 79–77 11-6 -9.20 48.20 40.10 23.80 36.00 97.60
2006 Northwestern State Iowa 64–63 14-3 -11.87 50.50 36.20 24.10 38.20 95.70
2006 Bradley Kansas 77–73 13-4 -7.56 48.00 33.60 23.10 35.50 88.20
2006 Montana Nevada 87–79 12-5 -7.13 54.90 37.00 20.40 33.40 99.50
2006 Texas A&M Syracuse 66–58 12-5 2.44 49.10 36.10 27.30 43.20 87.10
2006 Milwaukee Oklahoma 82–74 11-6 -2.68 48.30 33.70 21.70 38.40 93.00
2006 George Mason Michigan State 75–65 11-6 -0.44 53.80 35.60 20.40 32.20 88.70
2005 Bucknell Kansas 64–63 14-3 -17.76 48.90 36.90 23.70 31.40 92.00
2005 Vermont Syracuse 60–57 (OT) 13-4 -7.43 48.70 35.80 19.40 35.50 94.30
2005 Milwaukee Alabama 83–73 12-5 -8.29 49.90 35.30 24.30 36.70 91.50
2005 UAB Louisiana State 82–68 11-6 -2.15 49.80 34.70 27.40 32.10 93.40
2004 Manhattan Florida 75–60 12-5 -8.59 47.10 36.80 24.00 35.40 91.00
2004 Pacific Providence 66–58 12-5 -9.71 52.50 35.50 19.40 31.10 95.10
2003 Tulsa Dayton 84–71 13-4 -4.42 51.00 36.80 20.30 33.60 91.80
2003 Butler Mississippi State 47–46 12-5 -7.01 53.40 39.10 20.00 28.70 96.40
2003 Central Michigan Creighton 79–73 11-6 -7.84 56.10 38.40 22.40 35.60 96.60
2002 UNC-Wilmington Southern California 93–89 (OT) 13-4 -10.22 46.50 37.30 24.00 33.20 94.00
2002 Creighton Florida 83–82 (OT) 12-5 -15.22 51.20 37.20 22.90 35.70 97.50
2002 Tulsa Marquette 71–69 12-5 -6.67 51.10 40.20 21.10 32.50 98.40
2002 Missouri Miami (Florida) 93–80 12-5 -1.44 49.70 39.10 20.00 39.70 96.70
2002 Wyoming Gonzaga 73–68 11-6 -8.41 50.60 30.90 18.70 35.80 94.60
2002 Southern Illinois Texas Tech 76–68 11-6 -4.81 49.90 36.60 21.60 36.20 93.60

Well, that is a lot of numbers. First off, as I mentioned before, 11-6’s aren’t great upsets, so I will focus on the 12 and higher seeds for my analysis. We see a lot of upsets coming from teams with specialized skill sets. For instance, just last year Middle Tennessee was an elite 3 point shooting team, finishing in the top 5 percent in the NCAA in that category. Sure enough, when they knocked off Michigan St. they finished 11/19 from behind the arc. This performance may be somewhat of an outlier, but the ability to score 3 points on a possession is the kind of edge needed to cause this caliber of upset. They were not alone in this: when Arkansas-Little Rock upset Purdue, also last year, they too were a top 3 point shooting team. Similarly, Mercer (over Duke in 2014), Harvard (over Cincinnati in 2014), and Harvard again (over New Mexico in 2013) were all excellent long-range shooters. Not all of these games had great 3 point performances, but relying on the 3 leads to a more variable outcome, meaning if a team gets hot it may beat teams it otherwise shouldn’t, which is exactly where an upset in March stems from.

3 pointers are just one of the skill sets that can lead to an upset. A couple of other patterns arose too. Let’s talk about offensive rebounding. As the game has evolved, we have seen a lot of teams move away from even attempting these rebounds, preferring to get back on defense and prevent the fast break opportunity. However, teams that own the boards get high-efficiency put-back shots, keep the possession, and can reap the rewards. Morehead State in 2011 rebounded over 40% of its own misses that year, which is one of the reasons they were able to upset Louisville. Old Dominion upset Notre Dame in 2010 after owning the boards all season, as did Kansas State in 2008.

Forcing turnovers is another skill that, when performed at an elite level, can cause a March Madness upset. Some of this may be more match-up dependent: some teams see more opponents that play a full court press (e.g. West Virginia) and can adjust accordingly, but if it’s an opponent’s first time encountering this style of play in a while it can be a game-breaker. We saw this last year when Stephen F. Austin knocked off West Virginia early, forcing 22 turnovers and giving them a taste of their own medicine. Georgia State forced 21 turnovers upsetting Baylor in 2015. Ohio was also a turnover specialist when it knocked off Michigan in 2012.

Some other stats to look for include elite 2 point shooting and overall adjusted defensive efficiency (typically allowing less than 0.9 points per possession adjusted for competition, which is under 90 in the AdjDefEff column since that is expressed per 100 possessions). What seems to be the common pattern, though, is that a team goes beyond being well-balanced and really excels in at least one of these areas. I don’t mean excel as in the top 25% of teams; the data points to the top 5-10% among Division 1 teams. Which category it is seems to matter a little less, but teams that are excellent 2 or 3 point shooters seem to produce the most upsets.

I looked at a couple of other stats where I didn’t see the same patterns. In particular, I looked at whether or not a team won its conference tournament (which is a bit of a moot point, since most of the 12-15 seeds punched their ticket by winning their respective tournaments). However, even with the bigger schools, this didn’t seem to be of any relevance. Another area I looked at was how a team performed in its last 10 games leading up to the tournament, but again this didn’t seem to show much. A lot of the smaller conference schools may have inflated win percentages due to playing lesser competition, so this really needs to be considered on a team-by-team basis and does not generalize well across all teams.

Since HTML isn’t the best format to work with, I have uploaded my Excel data here, with the elite categories highlighted for easier viewing: NCAA Upsets Data

Enjoy, and check back later this week after Selection Sunday for part 3, where I will explore the likely upset candidates for 2017.

Anatomy of the March Madness Upset

I wanted to take a quick look today at all the NCAA tournament “upsets” dating back to 2002. Note the air quotes around upset, because what we will see from the data is that a lot of the 11-6 seed match-ups are not upsets at all. It has become apparent the tournament selection committee has not been staying up to date with modern stats to help aid its decisions, relying instead on its old faithful RPI. Although I know they are looking to change that this year, it will be a while before that process takes hold.

For my classification of an upset I am loosely using any 11 seed or higher winning in the opening round. The data below is limited to the first round of the tournament (ignore whatever the NCAA is calling its play-in round these days; that does not count as the first round) and goes back to 2002. Without further ado, I have compiled all the March Madness upsets below, along with each team’s adjusted efficiency margin (Adj EM: the difference between points scored and points allowed per 100 possessions, adjusted for the competition a team played throughout the season), taken from Ken Pomeroy’s pre-tourney statistics.

Year Lower Seed Higher Seed Score Seed Lower Seed AdjEM Higher Seed AdjEM AdjEM Diff
2016 Middle Tennessee Michigan State 90–81 15-2 3.90 29.58 -25.68
2016 Stephen F. Austin West Virginia 70–56 14-3 14.43 25.81 -11.38
2016 Hawaii California 77–66 13-4 11.73 19.63 -7.90
2016 Arkansas-Little Rock Purdue 85–83 (2 OT) 12-5 12.48 24.66 -12.18
2016 Yale Baylor 79–75 12-5 13.83 19.90 -6.07
2016 Northern Iowa Texas 75–72 11-6 10.11 17.08 -6.97
2016 Gonzaga Seton Hall 68–52 11-6 19.28 17.67 1.61
2016 Wichita State Arizona 65–55 11-6 21.17 21.85 -0.68
2015 Georgia State Baylor 57–56 14-3 9.89 23.44 -13.56
2015 UAB Iowa State 60–59 14-3 2.83 23.00 -20.17
2015 Dayton Providence 66–53 11-6 14.14 17.56 -3.43
2015 UCLA SMU 60–59 11-6 14.17 17.54 -3.37
2014 Mercer Duke 78–71 14-3 7.50 25.84 -18.34
2014 Stephen F. Austin VCU 77–75 (OT) 12-5 10.80 20.71 -9.91
2014 North Dakota State Oklahoma 80–75 (OT) 12-5 12.57 21.47 -8.89
2014 Harvard Cincinnati 61–57 12-5 17.30 19.07 -1.77
2014 Tennessee Massachusetts 86–67 11-6 21.71 14.14 7.57
2014 Dayton Ohio State 60–59 11-6 12.94 21.70 -8.76
2013 Florida Gulf Coast Georgetown 78–68 15-2 3.34 22.12 -18.78
2013 Harvard New Mexico 68–62 14-3 6.93 20.68 -13.76
2013 LaSalle Kansas State 63–61 13-4 13.24 18.22 -4.97
2013 Ole Miss Wisconsin 57–46 12-5 16.53 22.95 -6.42
2013 California UNLV 64–61 12-5 12.68 17.20 -4.53
2013 Oregon Oklahoma State 68–55 12-5 14.82 19.39 -4.56
2013 Minnesota UCLA 83–63 11-6 19.11 15.13 3.98
2012 Lehigh Duke 75–70 15-2 8.95 20.83 -11.88
2012 Norfolk State Missouri 86–84 15-2 -2.43 26.67 -29.10
2012 Ohio Michigan 65–60 13-4 10.77 18.07 -7.30
2012 South Florida Temple 58–44 12-5 11.47 15.52 -4.05
2012 VCU Wichita State 62–59 12-5 12.67 23.24 -10.57
2012 Colorado UNLV 68–64 11-6 8.26 16.05 -7.79
2012 North Carolina State San Diego State 79–65 11-6 13.11 12.45 0.65
2011 Morehead State Louisville 62–61 13-4 6.64 22.44 -15.80
2011 Richmond Vanderbilt 69–66 12-5 14.89 17.05 -2.16
2011 Marquette Xavier 66–55 11-6 17.66 16.23 1.43
2011 VCU Georgetown 74–56 11-6 8.63 18.59 -9.96
2011 Gonzaga St. John’s 86–71 11-6 16.15 16.44 -0.29
2010 Ohio Georgetown 97–83 14-3 7.17 23.67 -16.50
2010 Murray State Vanderbilt 66–65 13-4 14.11 17.19 -3.08
2010 Cornell Temple 78–65 12-5 13.27 20.50 -7.23
2010 Washington Marquette 80–78 11-6 17.46 20.11 -2.65
2010 Old Dominion Notre Dame 51–50 11-6 17.24 17.95 -0.71
2009 Cleveland State Wake Forest 84–69 13-4 11.40 19.94 -8.54
2009 Wisconsin Florida State 61–59 (OT) 12-5 17.58 15.54 2.04
2009 Arizona Utah 84–71 12-5 15.63 18.64 -3.01
2009 Western Kentucky Illinois 76–72 12-5 6.91 18.70 -11.79
2009 Dayton West Virginia 68–62 11-6 9.28 24.45 -15.17
2008 Siena Vanderbilt 83–62 13-4 6.67 13.17 -6.49
2008 San Diego Connecticut 70–69 (OT) 13-4 4.41 18.62 -14.22
2008 Villanova Clemson 75–69 12-5 12.62 21.93 -9.31
2008 Western Kentucky Drake 101–99 (OT) 12-5 13.88 21.69 -7.81
2008 Kansas State Southern California 80–67 11-6 18.51 19.48 -0.97
2007 Winthrop Notre Dame 76–64 11-6 14.26 20.30 -6.04
2007 VCU Duke 79–77 11-6 13.91 23.11 -9.20
2006 Northwestern State Iowa 64–63 14-3 7.04 18.91 -11.87
2006 Bradley Kansas 77–73 13-4 16.01 23.57 -7.56
2006 Montana Nevada 87–79 12-5 8.70 15.84 -7.13
2006 Texas A&M Syracuse 66–58 12-5 15.44 12.99 2.44
2006 Milwaukee Oklahoma 82–74 11-6 11.86 14.54 -2.68
2006 George Mason Michigan State 75–65 11-6 16.24 16.68 -0.44
2005 Bucknell Kansas 64–63 14-3 5.73 23.48 -17.76
2005 Vermont Syracuse 60–57 (OT) 13-4 13.39 20.83 -7.43
2005 Milwaukee Alabama 83–73 12-5 12.70 20.99 -8.29
2005 UAB Louisiana State 82–68 11-6 12.10 14.25 -2.15
2004 Manhattan Florida 75–60 12-5 11.42 20.01 -8.59
2004 Pacific Providence 66–58 12-5 8.61 18.32 -9.71
2003 Tulsa Dayton 84–71 13-4 11.17 15.59 -4.42
2003 Butler Mississippi State 47–46 12-5 15.04 22.06 -7.01
2003 Central Michigan Creighton 79–73 11-6 10.18 18.01 -7.84
2002 UNC-Wilmington Southern California 93–89 (OT) 13-4 10.76 20.98 -10.22
2002 Creighton Florida 83–82 (OT) 12-5 11.31 26.53 -15.22
2002 Tulsa Marquette 71–69 12-5 15.78 22.45 -6.67
2002 Missouri Miami (Florida) 93–80 12-5 14.51 15.95 -1.44
2002 Wyoming Gonzaga 73–68 11-6 11.43 19.83 -8.41
2002 Southern Illinois Texas Tech 76–68 11-6 12.69 17.50 -4.81

Some observations on the March Madness Upset:

As I mentioned before, there are a number of 11-6 upsets where the Adj EM Diff is either really small or in favor of the lower seed.

Upsets can happen even across large gaps in talent. We all remember Middle Tennessee destroying a lot of brackets last year by defeating Michigan St. in the opening round. While many argue they should have been seeded better than a 15, they still overcame a huge efficiency discrepancy.

What was the biggest upset in March Madness history? Some contenders from recent memory include the aforementioned Middle Tennessee over Michigan St., Lehigh upsetting Duke in 2012, Mercer upsetting Duke two years later, and Dunk City, Florida Gulf Coast, upsetting Georgetown in 2013. However, based strictly on adjusted efficiency differential, the biggest upset was Norfolk State knocking off Missouri in 2012.

The average adjusted efficiency differential across these upsets is -7.51. Note this means nothing statistically; I was curious, so I calculated it.
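For anyone who wants to reproduce these observations, here is a minimal sketch using only the 2012 rows from the table above. AdjEM Diff is the lower seed’s AdjEM minus the higher seed’s; the variable and function names are my own, and because the published inputs are rounded, a recomputed value can differ from the table by 0.01.

```python
# (lower seed, higher seed, seeds, lower AdjEM, higher AdjEM), 2012 rows only
upsets_2012 = [
    ("Lehigh", "Duke", "15-2", 8.95, 20.83),
    ("Norfolk State", "Missouri", "15-2", -2.43, 26.67),
    ("Ohio", "Michigan", "13-4", 10.77, 18.07),
    ("South Florida", "Temple", "12-5", 11.47, 15.52),
    ("VCU", "Wichita State", "12-5", 12.67, 23.24),
    ("Colorado", "UNLV", "11-6", 8.26, 16.05),
    ("North Carolina State", "San Diego State", "11-6", 13.11, 12.45),
]


def adj_em_diff(lower_adj_em, higher_adj_em):
    """Lower seed's margin minus higher seed's; negative means a true underdog."""
    return round(lower_adj_em - higher_adj_em, 2)


diffs = {row[0]: adj_em_diff(row[3], row[4]) for row in upsets_2012}

# The most negative differential is the biggest upset by this measure.
biggest_upset = min(diffs, key=diffs.get)

# A differential at or above zero means the "underdog" was actually rated better.
not_really_upsets = [team for team, d in diffs.items() if d >= 0]

# Simple mean over this subset only (not the full-table figure quoted above).
mean_diff_2012 = sum(diffs.values()) / len(diffs)

print(biggest_upset, not_really_upsets)
```

Running the same steps over every row of the table is how the full-sample average was produced.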

Anyway, chew on that for a bit. We will try to dive into some other factors that caused these upsets in upcoming posts.

Calculating Home Court Advantage in College Basketball

Home court advantage is a term often thrown around. It is a common theme across all sports, with varying degrees of influence. From a logical standpoint, some aspects make sense: not having to travel long distances (or cross time zones, for that matter), having engaged fans rooting you on, and the comfort of being in a place where you have often played before. Books like Scorecasting suggest the possibility of other influences, particularly a bias among referees to make calls that favor the home team. Whatever the cause, I want to explore the reach of home court advantage in college basketball. While this has been done before, many times, I want to take this opportunity to use a powerful tool to more accurately analyze true home court advantage.

One of the hurdles in calculating home court advantage in college basketball is the way teams schedule opponents. Typically schools in the power conferences schedule a large percentage of exhibition-type games against lesser opponents. These games are almost never a home-and-away setup. If we were to calculate home court advantage using these games, we would get a lopsided result because the home team almost always wins, and by a large margin. For my calculation I want to restrict my data set to teams who play a home and away with each other in the same season. Fortunately, conference play provides just this data. The challenge lies in separating the games where two opponents play a home and away from all the others. We will need some tools to assist us.

One recurring question when evaluating data is which tool is best for the job. Excel or its OpenOffice equivalent is often a good choice for tabular data, but so is MySQL, or insert your favorite programming language. I often find myself wanting to write queries against data in a delimited text file (CSV), but I don’t want to lay out a database schema, connect to a database, and perform inserts in order to do so. It’s time prohibitive and tedious. One tool I have found to be particularly helpful is a package called “Q – Text as Data”. It is a simple command line utility that runs on Windows and Linux and lets you query CSVs as if they were SQL tables, using the column headers as the field names.

Calculating home court advantage in college basketball using Q

Back to our experiment of identifying games with a home and away within a single season. Let’s see how Q can help us. I am starting with the following data set, which includes all games from the 2005 to the 2015 seasons. You can download it here: 2005-2015-scores. Let’s use some SQL via Q to filter this file to the games we care about. I will show the commands and then offer some explanation below.

q -H -d, \
"SELECT AVG(a.teamscore - a.oppscore)
FROM 2005-2015.csv a
INNER JOIN (SELECT teamname, opponent, datestr, seasonyear, site, teamscore, oppscore FROM 2005-2015.csv WHERE (site = 'H' OR site = 'A') GROUP BY teamname, opponent, seasonyear HAVING COUNT(*) = 2) b
ON a.teamname = b.teamname AND a.opponent = b.opponent AND a.seasonyear = b.seasonyear AND a.site = 'H'"

From the output of the above, we see teams win by 3.53 points per game in the home leg of the home and away. Conversely, the visiting disadvantage can be calculated as follows:

q -H -d, \
"SELECT AVG(a.teamscore - a.oppscore)
FROM 2005-2015.csv a
INNER JOIN (SELECT teamname, opponent, datestr, seasonyear, site, teamscore, oppscore FROM 2005-2015.csv WHERE (site = 'H' OR site = 'A') GROUP BY teamname, opponent, seasonyear HAVING COUNT(*) = 2) b
ON a.teamname = b.teamname AND a.opponent = b.opponent AND a.seasonyear = b.seasonyear AND a.site = 'A'"

The output this time is -3.518, which represents the points per game the visitors lost by. Home teams win by a margin of 3.53 ppg, while visiting teams lose by 3.518 ppg, so the value of playing at home rather than away is a swing of about 7 points (rounding to the nearest whole number). That is our calculated home court advantage in college basketball.

OK, so what did we just do? You will notice “q” is the name of the program being run, and we are passing a few parameters to it.
-H tells it to use the first row in the CSV as the header, which supplies the SQL column names.
-d, tells it that the text file we are passing in is comma delimited (the default delimiter is a space).
The third argument is the SQL to run, explained below.

Let’s take a look at what this SQL is doing, from the inside out. You can see in our INNER JOIN we are grouping by teamname, opponent, and seasonyear, which isolates results for each combination of teams within a season. We want only games where the site is ‘H’ or ‘A’; neutral games and semi-home or semi-away games are identified differently, so this rules out sites where there is not a true home court advantage. We keep only groups having exactly two games, which for a conference pairing means one home and one away. We do this to leave out any additional times the opponents may have played, likely in a tournament. Next we join these group results with the original rows to return the original data set filtered down to the games we care about. From there we simply take the average of the score differential, separately for the home games and the away games, to come up with our calculated home court advantage.
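If you don’t have Q installed, the same idea can be tried with Python’s built-in sqlite3 module. This is only a sketch on a handful of made-up rows (teams A, B, and C are hypothetical, not from the real data set), and it adds one check the query above does not have, COUNT(DISTINCT site) = 2, to make sure the two games really are one home and one away.

```python
import sqlite3

# Hypothetical rows: (teamname, opponent, seasonyear, site, teamscore, oppscore).
# Each game appears twice, once from each team's point of view.
GAMES = [
    ("A", "B", 2015, "H", 70, 60), ("B", "A", 2015, "A", 60, 70),  # A hosts B
    ("B", "A", 2015, "H", 68, 65), ("A", "B", 2015, "A", 65, 68),  # B hosts A
    ("A", "B", 2015, "N", 80, 75), ("B", "A", 2015, "N", 75, 80),  # neutral site
    ("A", "C", 2015, "H", 90, 50), ("C", "A", 2015, "A", 50, 90),  # no return game
]

QUERY = """
SELECT AVG(a.teamscore - a.oppscore)
FROM games a
INNER JOIN (
    SELECT teamname, opponent, seasonyear
    FROM games
    WHERE site IN ('H', 'A')
    GROUP BY teamname, opponent, seasonyear
    -- second condition is an added safeguard, not part of the original query
    HAVING COUNT(*) = 2 AND COUNT(DISTINCT site) = 2
) b
  ON a.teamname = b.teamname
 AND a.opponent = b.opponent
 AND a.seasonyear = b.seasonyear
WHERE a.site = ?
"""


def average_margin(conn, site):
    """Average score differential for 'H' or 'A' legs of home-and-away pairs."""
    return conn.execute(QUERY, (site,)).fetchone()[0]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE games (teamname TEXT, opponent TEXT, seasonyear INTEGER, "
    "site TEXT, teamscore INTEGER, oppscore INTEGER)"
)
conn.executemany("INSERT INTO games VALUES (?, ?, ?, ?, ?, ?)", GAMES)

home_margin = average_margin(conn, "H")
away_margin = average_margin(conn, "A")
print(home_margin, away_margin, home_margin - away_margin)
```

In this toy data the lopsided A over C game and the neutral-site game are both dropped, which is exactly the filtering the real query is meant to do.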

So there it is: we calculated a home court advantage of about 3.5 points per game and a visitors’ disadvantage of about -3.5 points per game, giving roughly a 7 point swing for non-neutral sites. That is your calculated home court advantage in college basketball, courtesy of Q, which can give you an advantage in your analytics arsenal.

Easiest Fantasy League I Ever Won

I am going to take a quick aside from college basketball to tell you about the easiest fantasy league I ever won, and how you could have too. Most people are familiar with the standard formats of fantasy football, baseball, or basketball. Those who follow hockey or soccer also have regular fantasy leagues. However, one format in particular is less popular but provided perhaps the easiest opportunity to win I have ever had. Enter NFL Playoff Challenge.

The brief synopsis is you play for four weeks, following the NFL playoffs. Each week you choose a lineup consisting of a QB, 2 WRs, 2 RBs, a TE, a K, and a DEF. There is no salary cap and no draft, and you can choose a new lineup each week. Scoring follows standard fantasy football formats (non-PPR) with one exception: the multiplier. Each consecutive week you start the same player at the same position, you gain a multiplier on that player’s score. For example, if you started Aaron Rodgers in the wildcard round, then once the Packers won he would have scored 2X points in the divisional round, 3X points in the NFC Championship, and, had they made it all the way, 4X points in the Super Bowl. One thing to note is that you can select players who are not playing in the wildcard round, and they will automatically advance to a 2X multiplier in the next round (although they will not net any points in the wildcard round). If a team is eliminated, you can choose a new player the next week, but the multiplier will be reset.

NFL Fantasy Playoff Challenge Strategy

Let’s look at some simple strategy, assuming we know nothing about the teams playing.  If we assign every team a 50% chance to win each game, a team playing in the wildcard round has to win 3 games to reach the Super Bowl, which gives it a 12.5% probability of getting there.  Note that we don’t care whether they win the last game, just that they get to it; there are no games beyond that, so advancing is no longer a concern, other than that the winning team will likely yield more points.  A team with an opening round bye only needs to win 2 games to reach the Super Bowl, a 25% probability, or twice the odds of a team playing in the wildcard round.  So we now know that choosing a team with a bye gives us twice the chance of reaching the 4X multiplier we want.  The goal is to maximize our points, so what would you rather have?

Wildcard player:
1X + 2X + 3X + 4X = 10X, the max points possible, if the player reaches the Super Bowl, but with half the odds of doing so.

First round bye player:
2X + 3X + 4X = 9X, or 90% of the max points possible, but with twice the odds of the wildcard player.
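
Here is the same arithmetic as a small Python sketch, assuming a coin flip for every game and the same raw score X each week.  The function names are made up for illustration, and the expected-total calculation is an extra step beyond the comparison above: it weights each week’s multiplier by the chance the player is still alive and ignores any points from a replacement player picked after an elimination.

```python
def reach_final_probability(wins_needed, win_prob=0.5):
    """Chance of winning every game needed to reach the final round."""
    return win_prob ** wins_needed


def multiplier_total(start_multiplier, final_multiplier=4):
    """Sum of weekly multipliers if the player makes it to the final round."""
    return sum(range(start_multiplier, final_multiplier + 1))


def expected_multiplier_total(start_multiplier, win_prob=0.5, final_multiplier=4):
    """Expected sum of multipliers, stopping once the player's team loses.

    Assumes the player scores the same raw points every week and that an
    eliminated player is not replaced.
    """
    total = 0.0
    alive = 1.0  # probability the team is still playing this week
    for multiplier in range(start_multiplier, final_multiplier + 1):
        total += alive * multiplier
        alive *= win_prob
    return total


# Wildcard player: needs 3 wins, multipliers run 1X through 4X.
wildcard_odds = reach_final_probability(3)
wildcard_max = multiplier_total(1)
wildcard_expected = expected_multiplier_total(1)

# Bye player: needs 2 wins, multipliers run 2X through 4X.
bye_odds = reach_final_probability(2)
bye_max = multiplier_total(2)
bye_expected = expected_multiplier_total(2)

print(wildcard_odds, wildcard_max, wildcard_expected)
print(bye_odds, bye_max, bye_expected)
```

Under these simplified assumptions the bye player gives up one tenth of the ceiling in exchange for double the odds of reaching it, and comes out ahead on the expected total.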

Under these assumptions, it seems obvious that it is in our best interest to pick players from teams with a first round bye.  Yet few people will do so.  With names like Antonio Brown and Le’Veon Bell available in the wildcard round, it’s hard to pass them up, even though they are less likely to advance to the final week, where the coveted 4X multiplier comes into play.

Thus far we have assumed all teams are created equal, which we know is not the case.  This year in particular all the favorites won the wildcard games; if we had assumed that going in, we could have justified taking the big-name players on those teams.  The problem, though, is picking which of the teams that advanced to the divisional round would make it to the Super Bowl.  On the NFC side, we had Atlanta, Dallas, Seattle, and Green Bay, any one of which had a legitimate shot at winning.  How do we know which one we want to pick players from?  Do we guess, or pick a sampling from various teams and hope for a more balanced approach?

No, absolutely not!  While the NFC was a crapshoot, the AFC this year was a different story.  Enter the Houston Texans and Oakland Raiders.  One team had no business being in the playoffs, with a QB who threw more picks than touchdowns.  The other was an offensive powerhouse that lost its starting and backup QBs.  Who does the winner get to play?  None other than Tom Brady and the Patriots in the divisional round.  Chalk this up as a free win for New England, boosting their odds of reaching the Super Bowl to at least 50%, assuming they are the favorites to win the AFC Championship since they will host it at home.  See where I am going with this?  While the NFC is going to be a toss-up, I can fill my NFL Playoff Challenge team with Patriots, who have a better than 50% chance of advancing to the final game, and thus receiving the 4X multiplier.
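
To put rough numbers on that reasoning, here is a tiny Python sketch that multiplies per-game win probabilities along a team’s path.  The probabilities are assumptions chosen to match the argument above (a near-certain divisional win and a coin-flip conference championship), not measured odds, and the function name is made up for illustration.

```python
from math import prod


def path_probability(game_win_probs):
    """Chance of winning every game in the list, treating games as independent."""
    return prod(game_win_probs)


# Assumed, illustrative numbers only.
# Patriots: treat the divisional game as a free win, then a coin flip
# (a conservative floor, since they would be favored at home).
patriots = path_probability([1.0, 0.5])

# Any other team in the divisional round under the coin-flip assumption.
typical_divisional_team = path_probability([0.5, 0.5])

print(patriots, typical_divisional_team)
```

Even with the conference championship priced as a pure coin flip, the Patriots start at double the odds of any team that has to win two toss-ups.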

Therein lies the secret of my fantasy playoff strategy this year: load up on Patriots.  Had New England lost to Pittsburgh, I would have had no chance of winning.  However, I weighed that risk against the riskier scenario of having to pick which NFC team would advance, and the choice became simple.  I filled 6 out of the 8 positions with Patriots.  I could have chosen all 8, but it is hard to predict whether a #2 WR or RB is worth it, even with the multiplier, given that they get fewer opportunities than their #1 counterparts.  Lucking out with Julio Jones as my other WR was an added bonus (I originally had Jordy Nelson, but after he got hurt, I switched to Julio).  My other RB slot was reserved for Ezekiel Elliott, who didn’t advance, but having a player from each of the 1 and 2 seeded teams in the NFC gave me a good shot at getting one into the Super Bowl for the 4X bonus.  While I was trailing for the first 3 weeks of the playoffs, I can assure you my fantasy league opponents were horrified to see 6 out of 8 players with a 4X point multiplier next to their names, as well as Julio with a 3X.  Although they did talk some smack after a dud of a first half by New England, Tom Brady answered and turned it around.  Check cashed, easiest fantasy league I ever won!


March Madness Prop Bets

After a thrilling Super Bowl weekend, the thought crossed my mind: why don’t other sports offer more prop bets, particularly college basketball’s March Madness?  There are a lot of similarities.  Oftentimes your team is not the one competing, so another reason to root for something can be refreshing.  Sure, there is the argument that it is an amateur sport and we shouldn’t be betting on kids, and the fear that some player may exploit it for profit.  Still, how fun would it be to have readily available prop bets for March Madness?  You are throwing money down in your office pool, but once the opening weekend is done and you are all but eliminated, wouldn’t it be fun to double down and have a little more action?  March Madness prop bets could be just the ticket.

Let’s take a look at some possible examples:

What will be higher throughout March Madness?
+150 Grayson Allen Trips
-200 12 seed vs 5 seed upsets

Number of school names mispronounced in the opening round?
-300 Over 2.5
+200 Under 2.5

Will Donald Trump fill out a bracket?
-700 No
+500 Yes

Number of games decided by one point throughout March Madness?
-110 Over 3.5
-110 Under 3.5

Which conference will have the most wins?
Pac 12  2/1
Big 10  3/2
WCC    2/1

Will the National Champion be a repeat winner?
-200 Yes
+150 No

Will a 16 seed knock off a 1 seed?
Yes  20/1
No   1/20

What will be higher?
The number of games Gonzaga wins
The number of Kentucky players who declare for the draft

Other common types of March Madness prop bets include which seed will end up winning the tournament, or how many 1 seeds will reach the Final Four.  However, I particularly enjoy the offbeat comparison-style bets: a “would you rather” of atypical scenarios, such as the Gonzaga example above.

So sure, you can find some prop bets in your favorite offshore casino.  However, I would like to see a little more variety of prop bets come March in Las Vegas.  While to most, 68 teams competing for one title is probably more than enough action to bet on, sometimes you want to put a few bucks down on something stupid and have a little fun.  What’s the harm in that?  Any interesting prop bets you would like to see?