Study Using Wonderlic to Predict Future Arrests Is Dangerous and Not Practically Useful

A study has come out that links Wonderlic test results to future off-field arrest incidents, as described by Kevin Seifert at ESPN. That study, “Off-Duty Deviance: Organizational Policies and Evidence for Two Prevention Strategies,” appears in the April edition of the Journal of Applied Psychology.

The study itself isn’t specifically focused on the NFL. It does use the NFL, with its off-field policies and readily available data (leaked Wonderlic scores, publicly reported arrests, and publicly available employee tenure info), to examine whether certain factors correlate with future criminality.

Those factors are prior arrests or drug-related suspensions while in college, and what is termed GMA (General Mental Ability) as measured by the controversial Wonderlic test given to NFL prospects.

There were two NFL draft-related results. First, that between 2001 and 2012, players with publicly-documented pre-draft arrests were nearly twice as likely to be arrested after reaching the NFL than those who had not been arrested. The second, which is perhaps less obvious and more valuable, was that there was a small but clear correlation between arrests and Wonderlic test scores. Players who scored below the mean in the researchers’ sample were also about twice as likely to be arrested in the NFL as those who scored above it.

“The effects are relatively small,” said author Brian Hoffman, an associate professor and chair of the industrial-organizational program at the University of Georgia. “But it’s important here because when making multimillion-dollar decisions, a small effect can be very meaningful. A player’s getting a four-game suspension can be a big deal, competitively and financially.”

I have problems with actually applying these results to the NFL, for a variety of reasons. The first is disparate impact. We know all of the criticisms of the Wonderlic.

Three years ago, we noted that the NFL was looking into a different option than the Wonderlic to “level the socio-economic playing field.” The test is known for racial disparities in its results.

So we’ve basically found that players from lower socio-economic backgrounds, players from difficult backgrounds, and those who may not have had the same advantages may have a slightly higher chance of arrest. Great (not great, sarcasm font), but what are the real-world implications?

The chance of overvaluing this is far greater than the chance of undervaluing it. Even according to Seifert’s summary, the observed rate of being arrested during an NFL career was 18% for those with a score below the Wonderlic average (21.7), versus 9.5% for those above it. According to the study, 96 players were recorded as having engaged in an ODD (Off-Duty Deviance, i.e., they were arrested). Of those, 67 had exactly one incident.

So we are talking about a very small percentage that actually faced severe discipline, significant suspension risk, etc.
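To put those numbers side by side, here is a rough back-of-the-envelope sketch. It uses only the figures quoted above (the 18% and 9.5% rates, the 96 arrested players, and the 67 with exactly one incident). The variable names are my own, and since the group sizes aren't given here, it doesn't try to say how the 96 players split between the two Wonderlic groups.

```python
# Figures quoted above from Seifert's summary and the study.
rate_below_mean = 0.18    # arrest rate, Wonderlic below the sample mean (21.7)
rate_above_mean = 0.095   # arrest rate, Wonderlic above the sample mean
arrested_players = 96     # players with at least one recorded ODD
single_incident = 67      # of those, players with exactly one incident

# Relative risk: how many times more likely the below-mean group was to be arrested.
relative_risk = rate_below_mean / rate_above_mean

# Absolute gap: the difference in arrest rates between the two groups.
absolute_gap = rate_below_mean - rate_above_mean

# Players with more than one recorded incident.
repeat_players = arrested_players - single_incident
repeat_share = repeat_players / arrested_players

print(f"Relative risk: {relative_risk:.2f}x")
print(f"Absolute gap: {absolute_gap * 100:.1f} percentage points")
print(f"Below-mean players never arrested: {(1 - rate_below_mean) * 100:.0f}%")
print(f"More than one incident: {repeat_players} of {arrested_players} ({repeat_share:.0%})")
```

The "twice as likely" headline is the relative risk, about 1.9. The absolute gap is 8.5 percentage points, 82% of the below-average group was never arrested, and only 29 of the 96 arrested players had more than one incident.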

The other thing is, how would you really apply this? If you look at the list of the highest Wonderlics (never complete, because scores are usually only reported in extreme cases or for prominent names), QBs are over-represented at the top. You aren’t fielding a 53-man roster of quarterbacks and Ivy League offensive linemen, just like you aren’t fielding an entire roster of choir boys. Applying this to the real world is dangerous, because the NFL isn’t the real world. Aptitude on standardized tests may be a correlative factor in whether you work in management or at a ground level in a Fortune 500 company. Positions in the NFL aren’t determined by standardized testing aptitude, and there can be differences across positions.

Four years ago, there was a study suggesting that teams were undervaluing players who were arrested. I had some issues with that one as well, but they had to do with the study set-up and how the regressions were run. There was some evidence that players who were charged and arrested in college were drafted lower but provided similar production despite the lower draft status.

I think teams already try to balance character. I think they try to assess how a player will likely develop. I’m just not sure using a result from a Wonderlic to try to predict character concerns is actually going to be useful.

You would have passed on a lot of great players. Upside is way more valuable than downside. Teams that take average players aren’t that much better off than those that draft busts at the top of the draft, but those that get stars show a huge difference.

If you had applied this mentality, you would have had unfounded concerns about a lot of top players who, in my opinion, were unfairly ridiculed for their scores: A.J. Green and Patrick Peterson are two that come to mind. You would have written off Tyrann Mathieu. I can think of too many counter-examples to think that a very small chance of getting arrested (remember, only a handful of that roughly 9-point difference would have faced serious suspensions) should change the overall evaluation.

