A Solution to the Uneven Scheduling Problem in Baseball: Institute the Ratings Percentage Index (RPI)
Baseball, Numbers June 3rd, 2008, 3:00pm
Cornell numbers wizard Max Wasserman has thrilled you with his firm grasp on math as it relates to sports. This might be his best effort yet: attempting to solve the unbalanced scheduling problem in MLB by instituting the RPI, which college basketball uses.
My biggest pet peeve is unfairness. Any sort of inequality in a competition really grinds my gears. This accounts for one of the 400 reasons I hate Last Comic Standing. As for sports, which are generally less rigged than Last Comic Standing, there are a few cases that reach a level of unfairness I can’t really ignore. No, I’m not talking about the lack of a cap in baseball or the overtime in the NFL. I’m talking about unbalanced scheduling. And the league that is the guiltiest of scheduling imbalance is Major League Baseball.
Baseball’s uneven scheduling should come as no surprise. It is the only major pro sports organization today with divisions of unequal size and one league that has more teams than the other. And with interleague play being such a limited part of the schedule, ridiculous matchups frequently manifest, like Baltimore and Arizona playing two home-and-home series last year. This results in some teams playing tougher schedules than others. But while schedule strength is accounted for in the NFL (albeit several spots down the list of tiebreakers), it is not in Major League Baseball. And it really should be, especially when playoff races come down to one-game differences, like last season. If only there were some sort of formula that could evaluate a team’s performance based on its record and schedule strength. Well, my favorite sport, college basketball, has such a formula: the Ratings Percentage Index, or RPI.
The NCAA uses the RPI formula in pretty much every sport they run. It gives them a standard to compare teams that almost always play completely dissimilar schedules. Using the RPI allows the NCAA to decide which teams to put in the tournament, which teams to leave out, and which 26-2 teams warrant a No. 8 seed because Karl Hobbs’ paranoia renders him unable to schedule any quality teams in non-conference play [remembers GW's 2006 season, punches wall].
Sure, using the RPI for Major League Baseball may sound silly, but it can be very accurate. Remember, it was George Mason’s high RPI (23 in late February) that helped them sneak into the NCAA Tournament in 2006. And had Davidson fallen in the SoCon tournament last season, many experts had suggested that their RPI (44 in early March), boosted by their tough non-conference schedule, would have carried them through.
Anyway, using the RPI formula (which is 25% win percentage, 50% opponents’ win percentage, and 25% opponents’ opponents’ win percentage) on the results of the 2007 Major League Baseball regular season (which took a loooong time to organize), the teams are ranked as such:

[SOS = Strength of Schedule = 67% opponents' win percentage + 33% opponents' opponents' win percentage. Division winners are highlighted in dark colors (blue for AL, orange for NL). Wild card winners are highlighted in light colors (blue for AL, orange for NL).]
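For readers who want to see how the two formulas fit together, here is a minimal Python sketch. It is not the spreadsheet behind the table; it assumes each game is recorded as a (winner, loser) pair, treats the 67%/33% SOS split as 2/3 and 1/3, and averages opponents' winning percentage once per game played without removing head-to-head results (the post does not say how that detail was handled). The teams "A", "B", and "C" are made up for illustration.

```python
from collections import defaultdict


def win_pct(games):
    """Return {team: wins / games played} from (winner, loser) pairs."""
    wins = defaultdict(int)
    played = defaultdict(int)
    for winner, loser in games:
        wins[winner] += 1
        played[winner] += 1
        played[loser] += 1
    return {team: wins[team] / played[team] for team in played}


def opponents(games):
    """Return {team: list of opponents, one entry per game played}."""
    opp = defaultdict(list)
    for winner, loser in games:
        opp[winner].append(loser)
        opp[loser].append(winner)
    return opp


def rpi_table(games):
    """Return {team: {"wp", "sos", "rpi"}} using the weights quoted in the post."""
    wp = win_pct(games)
    opp = opponents(games)
    # Opponents' win percentage: one entry per game (assumption, see lead-in).
    owp = {t: sum(wp[o] for o in opp[t]) / len(opp[t]) for t in opp}
    # Opponents' opponents' win percentage: the same average, one level down.
    oowp = {t: sum(owp[o] for o in opp[t]) / len(opp[t]) for t in opp}
    table = {}
    for t in wp:
        sos = (2 * owp[t] + oowp[t]) / 3  # 67% OWP + 33% OOWP
        rpi = 0.25 * wp[t] + 0.50 * owp[t] + 0.25 * oowp[t]
        table[t] = {"wp": wp[t], "sos": sos, "rpi": rpi}
    return table


# Hypothetical three-team round robin: A beats B and C, B beats C.
games = [("A", "B"), ("A", "C"), ("B", "C")]
table = rpi_table(games)
ranking = sorted(table, key=lambda t: table[t]["rpi"], reverse=True)
```

One thing the sketch makes visible: with SOS defined this way, RPI is just 25% win percentage plus 75% SOS, so two teams with the same record are separated purely by schedule strength.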
Once you pass the glut of American League teams that dominate the top of this list, you’ll notice that the National League team with the greatest RPI is the Padres. And as you know, the Padres failed to make the playoffs after tying the Rockies for second in the NL West behind the D-backs and losing the wild card tiebreaker when Matt Holliday’s foot came close enough to the plate for a “gimme.” Not only is the Padres’ RPI greater than Colorado’s, but both teams beat out Arizona. And when you adjust the formula to add weight for road wins and home losses, the Padres only increase their lead in RPI. Why is that?
Well, you’ll notice that San Diego has the greatest strength of schedule of the three NL West rivals. And when you look at the out-of-division schedules of those teams, you’ll see why. The number of games each team played against out-of-division opponents in 2007 is shown in the table below.

The Padres played more games than the others against the NL Central champion Cubs, who had a high RPI for an NL team. In exchange, the Diamondbacks played more games against the Pirates, dead last in RPI. But the big difference came in interleague play. While San Diego didn’t have to play a series with the AL Wild Card Yankees, they did play two home-and-home series with the tough Mariners, fifth in the RPI. Meanwhile, the Rockies had their difficult Yankees series counterbalanced by three games against the not-at-all-good Royals. And as I mentioned earlier, Arizona got an extra series with the Orioles, or as I call them, Angelos and Demons.
Now, I understand that the interleague schedule imbalances will always occur due to interleague rivalries. The Cardinals will usually have a slight advantage over the Cubs because they get six games against the Royals instead of six against the White Sox. Those imbalances could be solved if interleague play were either expanded or eliminated altogether. After all, baseball got along for most of its history without it. But even if you think that the interleague system is fine the way it is, you can’t ignore the fact that San Diego and Seattle are not rivals. So why do they have to be scheduled as such? Unlike in college sports, pro teams have no control over their own schedules. It should be up to Major League Baseball to make everything fair.
Granted, the schedule imbalances shown may have resulted in only a slight difference in RPI. But remember that all San Diego needed was one more win to make the playoffs. Had the schedules been more balanced, they could have been the ones to get swept by the Red Sox. What makes me say that? Of all the NL playoff teams, the one with the greatest RPI made the World Series. And of all the teams in Major League Baseball, the one with the top RPI won the World Series.
Oh, and in case you were wondering, here are the current RPI standings in the 2008 season, through games of Sunday, June 1st, with division and wild card leaders highlighted.

59 Responses to “A Solution to the Uneven Scheduling Problem in Baseball: Institute the Ratings Percentage Index (RPI)”

June 3rd, 2008 at 3:04 PM
Max Wasserman went to Cornell. Ever heard of it?
June 3rd, 2008 at 3:06 PM
Yes, but Ed Helms did not go to Cornell
June 3rd, 2008 at 3:06 PM
I could’ve written this soooo much better
/probable response from TBL commenters attempting to mask their jealousy
June 3rd, 2008 at 3:08 PM
We need to expand the field.
/March Madnessed
June 3rd, 2008 at 3:09 PM
3 of the top 6 teams in the al east. and none are the yanks. if they turn it on…wow- that’ll be a great pennant race
June 3rd, 2008 at 3:10 PM
@Cousins: I don’t think anyone would complain if two more wild cards per league were added.
June 3rd, 2008 at 3:10 PM
i didnt see anywhere that mentions why MLB schedules the bulk of Indians’ home games during times when they fully know that the weather is going to suck. or giving the tribe 2 home games against the red sox and 4 on the road.
how is your RPI going to fix that? the only option is me visiting fucking Bud Selig and putting my size 13 shoe up his old man ass.
June 3rd, 2008 at 3:10 PM
Doesn’t the final table prove that an RPI system would basically be inconsequential?
The schedule is still weighted heavily towards playing your division foes. If you can’t beat the teams in your division head-to-head, you’re not likely to win the division.
Also, if given the choice before the season, I’d much rather have faced the Rays than say the Tigers or Indians and probably even the Mariners.
June 3rd, 2008 at 3:11 PM
go Rays…and I agree with atlantasportsfan..
June 3rd, 2008 at 3:12 PM
The Padres need to blow up that team keep Adrian and Peavy and trade just about everybody else
June 3rd, 2008 at 3:12 PM
I guess at Cornell they don’t teach you how to write a formal conclusion though. I went to ASU and even I can write a conclusion
June 3rd, 2008 at 3:12 PM
MLB is run by old, unimaginative assholes who have no use for this type of information.
June 3rd, 2008 at 3:13 PM
is Kazmir back in the rotation?
June 3rd, 2008 at 3:14 PM
the 2007 RPI worked well in that all the division winners, with the exception of the NL West Rockies-Padres-Diamondbacks race, were the top RPI team in their respective divisions (Red Sox, Indians, Angels, Phillies, Cubs).
Yikes, the NL Central blew ass, right down to the last three teams…Cubs-15, Brewers-19, Cardinals-22, Reds-28, Astros-29, Pirates-30.
June 3rd, 2008 at 3:15 PM
Yes Kazmir is back..he was 5-1 with an era of 1.22 or 1.12.something like that..
June 3rd, 2008 at 3:15 PM
+1 ATL
June 3rd, 2008 at 3:15 PM
So, the entire NL Central is in the Top 14? Seems kind of odd.
Then again, I’m just glad the Pirates aren’t last .
June 3rd, 2008 at 3:15 PM
@spencer: There is a way to fix that, though I didn’t talk much about it here to keep the post from being overly long. The RPI can take into account home and road games by making road wins and home losses weigh more and road losses and home wins weigh less.
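A rough sketch of that home/road adjustment, for the curious. The 1.4 and 0.6 weights are borrowed from the ones the NCAA adopted for men's basketball and are an assumption here, since no numbers are given in the post or the comment; the function name and argument order are made up for illustration.

```python
# Assumed weights (not given in the post): a road win or a home loss counts
# as 1.4 games, a home win or a road loss counts as 0.6 games.
HEAVY = 1.4
LIGHT = 0.6


def weighted_win_pct(home_wins, home_losses, road_wins, road_losses):
    """Win percentage with road wins and home losses weighted more heavily."""
    wins = LIGHT * home_wins + HEAVY * road_wins
    losses = HEAVY * home_losses + LIGHT * road_losses
    return wins / (wins + losses)


# A hypothetical 90-72 team with identical home and road records:
# the weights cancel and the result matches the plain 90/162.
balanced = weighted_win_pct(45, 36, 45, 36)
```

This only adjusts a team's own winning percentage; whether the opponents' terms should be weighted the same way is left open.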
June 3rd, 2008 at 3:17 PM
@Everyone: I’m an engineering student. I can’t write.
June 3rd, 2008 at 3:17 PM
Yikes, this was a lot to take in at work. I don’t follow baseball all that much, but really, how many ties are there, and wouldn’t you rather solve the tie with an actual game than with crazy formulated stats. They kinda have to with the NCAA because there are so many schools and so short a time to get things done. But, baseball has few teams, and lots of time between the end of the regular season and the start of postseason for one or two play in games.
June 3rd, 2008 at 3:19 PM
Apparently we have another non-hockey fan in the house.
But nice work breaking down the scheduling differences in MLB.
June 3rd, 2008 at 3:20 PM
iggy…yea, there’s a way to fix it. it’s called holding a gun to bud selig’s forehead and making sure he understand’s the point that there is baseball outside of Boston, NY and Milwaukee. fuck him. sorry for yelling.
June 3rd, 2008 at 3:20 PM
[tossed out idea to click on ads, while reading and interpreting information. guarantee everyone will enjoy the roundup pic. No peeking, Couz]
June 3rd, 2008 at 3:20 PM
@jibble: Well, the NHL is changing its scheduling ways for next season so I let them off the hook.
June 3rd, 2008 at 3:20 PM
Saw Kendra from Girls Next Door had a GMU Final Four t-shirt.
I couldn’t of written this better, but I could have wrote it better.
June 3rd, 2008 at 3:23 PM
@iggy: I would actually think the NFL is the worst too, but I think they do this by design of the schedule, so you probably let them off the hook, too.
June 3rd, 2008 at 3:24 PM
This statistical analysis does not take Jay Bruce into account. Therefor, it is faulty.
June 3rd, 2008 at 3:24 PM
I will miss the Packers not being on TV all the time, So John Madden can D-ride Favre with the best of them. He doesnt touch Peter King though, I think Peter had Favre pee in a cup for him
June 3rd, 2008 at 3:24 PM
-1 to you Atlanta if this in reference to that Orioles post from Friday.
It’s not jealousy to point out something is crappy when it is in fact crappy.
June 3rd, 2008 at 3:24 PM
[I probably won't even be able to see the picture because I suck]
June 3rd, 2008 at 3:25 PM
Everyone, get ready for a shirtless Patrick Dempsey
June 3rd, 2008 at 3:25 PM
Maybe I’m interpreting this wrong, but I thought he was bringing up RPI in regards to scheduling, not as a means to break ties?
Baseball is the most stat-reported game, I don’t see why throwing RPI into the mix would be so horrible. I wouldn’t mind seeing the Cubs play someone other than the Pirates nine times in two months.
June 3rd, 2008 at 3:26 PM
My only problem with this is that there is a pitching rotation and the fact that some teams get lucky in the sense that they don’t always face the same pitcher every time. What if (and this is a big what if) the Padres beat a certain team more often just because Peavy or Young happened to be the SP for 2 of the 3 games? And on the opposite end of this, what if every time they played a team, they got caught at the bottom of their rotation? Wouldn’t the numbers be different? I know that this is just being picky, but baseball is definitely a game of numbers and Peavy’s and Young’s numbers are drastically different than the rest of the rotation.
June 3rd, 2008 at 3:28 PM
Um, I will just refrain from commenting on this idea. It looks like a lot of words, though.
Well, one comment: instead of making a few easy decisions regarding the makeup of the divisions and abandoning interleague play, which is the reason the schedule is imbalanced anyway, you propose to construct an RPI for baseball, which doesn’t need one.
I just don’t see the need for an RPI. MLB is the top echelon of baseball – the teams have the most talented players available. Colleges need RPI because of the wide disparity of talent across Division 1 and the need to compare and rank teams for the post-season, er, bowls and tournaments. MLB has a very nice system – win your division or one of the wild cards and you get to advance to the post-season. The fact that game-playing schedule is a bit cock-eyed doesn’t really enter into the win distribution for a team.
Maybe I am just dislexic today and I didn’t get this post. What am I missing?
June 3rd, 2008 at 3:29 PM
@CBH – I was out of commission on Friday, so no, it wasn’t in reference to that.
June 3rd, 2008 at 3:30 PM
I thought it was Holliday’s hand, not his foot.
June 3rd, 2008 at 3:34 PM
Go back to 2 divisons in each league, then have the division winners only make the playoffs. 2 and 2. Problem solved.
Or just combine all 30 teams into one league to fight for 1 playoff spot, then have that team play Jay Bruce in a winner-take-all death match-slash-home run derby.
Either would seem to work fine.
June 3rd, 2008 at 3:35 PM
A couple of items: An RPI (or something equivalent in focus) is needed for college basketball because there are both automatic entry bids and at-large bids and, as such, there needs to be a way to separate good teams from bad. In baseball, all bids are automatic. Win your division or have the best remaining record. There’s no need for an RPI because there’s no selection process for determining playoff teams. The unbalanced schedule does need some work though.
June 3rd, 2008 at 3:37 PM
In truth doesnt the Best team always win in a 7game series?
June 3rd, 2008 at 3:39 PM
@Ben: Whoops, you were right. My bad. I guess I can’t write or remember things.
June 3rd, 2008 at 3:39 PM
In baseball? Not a chance.
June 3rd, 2008 at 3:40 PM
Who Remembers Dave Parker? no reason just thought I’de mention
June 3rd, 2008 at 3:42 PM
My head hurts..just play game..hit ball..catch ball..no computer…Computer bad…Roman no like..Roman smash…ROMAN SMASH!
June 3rd, 2008 at 3:42 PM
Farouq, you were badass when you were in the Nation of Domination. You helped The Rock become what he is today, and for that I will forever be grateful
June 3rd, 2008 at 3:43 PM
Should we expect a John Schnaars post tomorrow on this?
June 3rd, 2008 at 3:45 PM
Jay Bruce > The Rock
June 3rd, 2008 at 3:49 PM
After a minute of number crunching I’ve come to the realization that the numbers work out perfectly for balanced play, assuming someone moves one of the NL Central teams (Astros?) over to the AL west and all divisions have five teams. You have to ignore the potential quality difference in interleague opponent, but it’s a far sight better than the current set-up.
Two 3-game (home and home) series with every non-divisional team in your league: 2 series * 3 games per series * 10 teams = 60 games.
Two 3-game (home and home) interleague series with ‘rival’ team: 2 series * 3 games per series * 1 team = 6 games
Eight 3-game series (4 home and 4 home) with four divisional teams: 8 series * 3 games per series * 4 teams = 96 games
60 + 6 + 96 = 162.
It’s a little inter-divisional heavy, but it makes too much sense to ever be implemented. The math works as well as my 3-division, 12-team fantasy football format. That is to say, perfectly.
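The arithmetic above is easy to check in a few lines of Python. This sketch only restates the proposal's own numbers; the dictionary layout and function names are made up for illustration, and the home-game count assumes half of each opponent's series are played at home, as the "home and home" wording implies.

```python
GAMES_PER_SERIES = 3

# category: (series against each opponent, number of opponents)
plan = {
    "same league, other divisions": (2, 10),
    "interleague rival": (2, 1),
    "own division": (8, 4),
}


def total_games(plan):
    """Total games in the season for one team."""
    return sum(series * GAMES_PER_SERIES * teams for series, teams in plan.values())


def home_games(plan):
    """Home games, assuming half of each opponent's series are at home."""
    return sum((series // 2) * GAMES_PER_SERIES * teams for series, teams in plan.values())


season = total_games(plan)
at_home = home_games(plan)
```

Under those assumptions the plan also splits evenly between home and road dates, which is what a 162-game schedule needs.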
June 3rd, 2008 at 3:51 PM
Faarooq I liked you better when you wore the gladiator headgear.
June 3rd, 2008 at 3:54 PM
believe it or not My actual name is Farouq
June 3rd, 2008 at 3:56 PM
@Ben, et al: Just like in college sports, teams that win their division should always make the playoffs no matter their relative record or RPI (see Georgia). There’s no reason it couldn’t be used to pick the wild cards though, especially when you consider the unbalanced inter-division scheduling. But the main purpose of the RPI formula in this post is to make a point about the uneven schedules. The RPI wouldn’t and shouldn’t have to be used if the schedules were even slightly more balanced.
June 3rd, 2008 at 4:00 PM
Good post, and I would rather shit my pants in front of Cheryl Cole than watch baseball.
June 3rd, 2008 at 4:00 PM
@Ben: Or the divisions were better aligned, like you said. Personally, I’d figure it would be the D-backs to move to the AL West (set up a San Diego-Phoenix interleague rivalry) and the Astros to move to the NL West (and keep the Houston-Texas interleague rivalry).
June 3rd, 2008 at 4:04 PM
@iggy – I’m in total agreement that the scheduling needs a lot of work. I mean, it literally took me five minutes to generate post #47, and the numbers work perfectly. There’s one big adjustment needed (balanced divisions), but making it work is too easy.
It’s a one-year sample so take it for what it’s worth, but outside of the NL West, the RPI and actual playoff entrants coincided. That is to say that the top-4 RPI teams in each league made the playoffs. Using the RPI at this point in the season for 2008 doesn’t tell us much because on top of the unbalanced total league schedule, it’s even more unbalanced in the middle of a season.
June 3rd, 2008 at 4:04 PM
@iggy: I agree again, I was just looking for a quick 6-4 to 5-5 balance.
June 3rd, 2008 at 4:21 PM
@Ben – In your “even up the leagues” scenario, there is an odd number of teams in each league. Therefore, either (a)one team in each league will be idle for 3 days at a time, or (b)there is interleague play every day of the season. (a) is unrealistic because it would lengthen an already long season due to the fact that each team would have 3 days off about 4 or 5 times. (b) I have no problem with, but I still don’t see the powers that be (Selig) going along with this.
June 3rd, 2008 at 4:33 PM
I haven’t gone through the minutiae of scheduling, but if you stagger the ‘off’ times for each team, my guess is you would be able to fit in the same schedule currently run (~180 days).
June 3rd, 2008 at 4:43 PM
Joe Mather > Jay Bruce > The Rock
Fixed it for ya…
June 3rd, 2008 at 4:46 PM
Oh, and there are two ways to resolve the unbalanced schedule problem. One is obvious; get rid of Selig. And the second is to do away with interleague play. Its time has come and gone.
June 3rd, 2008 at 7:28 PM
very interesting read, but i don’t entirely agree with it. my beef is with calling the nfl’s schedule fair.
i think rpi has more of a place in football where there are only 16 games. the sample size for how good a football team is is so small because there are only those 16 games. it’s easily thrown off by a game or two. 6-10 becomes 8-8 with 2 close games going another way. the difference between a team’s winning percentage and actual quality can really be skewed by them winning or losing some close games.
baseball, on the other hand, has a huge sample size for finding out how good a team is. a bad team can get hot for a week, but that’s only 6 or 7 games and they are likely to show their true colors the rest of the season.