Earlier today, I wrote about a study that claimed to have discovered racial bias among Major League umpires: specifically, that umpires reward pitchers of their own ethnicity with extra strikes roughly 1% of the time. I questioned the results of the study, but while my questions were gleaned from a brief article on the subject, the blog Man on a Rant did some poking around and found the study itself (PDF). After reviewing the study first-hand, the blogger raises some concerns about the validity of the results, namely the sample size of the umpires studied:
This looks like a data problem. A quick review of the study's first table (which I should have spent some more time on initially) reveals that our umpire sample size is 93, and a whopping 85 of the 93 umps tracked are white. 5 are black, and 3 are Hispanic. And right here, at least in my mind, you can throw out the study's results, regardless of how elegant the rest of the paper may seem.
You cannot make sweeping statements of race and racial bias with subject groups this small. There is nothing to infer. Sure, you might have enough total pitches viewed by umps of all ethnicities to generate a statistical comparison that looks legitimate... but, at the end of the day, you're making key assumptions about racial attitudes based on the work of 3-5 people. An experiment/study's conditions need to be comparable in number. You wouldn't compare the averaged IQ tests of 85 students to the average of 5 other students and expect to gain any kind of brilliant insight. It's the same deal here.
I have absolutely no formal training in statistics, but this sounds like a legitimate concern. Obviously, the sample size of umpires included in the study is constrained by the number of major league umpires employed at any particular point in time, so I'm not sure how the researchers can accurately compensate for this obstacle. Does this mean we should invalidate all of the results of the study? I don't think so, but it seems like any discussion about it should include the caveat that the work of only eight non-white umpires was analyzed.
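To put rough numbers on the IQ comparison in the quoted passage, here is a minimal sketch of how the standard error of a group average shrinks as the group grows. It assumes a standard deviation of 15 points (the conventional IQ scale) and independent test-takers; the function name is made up for illustration, and none of this comes from the study itself.

```python
import math


def standard_error(std_dev: float, n: int) -> float:
    """Standard error of a sample mean: std_dev / sqrt(n)."""
    if n <= 0:
        raise ValueError("n must be positive")
    return std_dev / math.sqrt(n)


# Assumption: conventional IQ scaling, not a figure taken from the study.
IQ_STD_DEV = 15.0

se_large = standard_error(IQ_STD_DEV, 85)  # the group of 85 students
se_small = standard_error(IQ_STD_DEV, 5)   # the group of 5 students

print(f"n=85: standard error = {se_large:.2f} points")
print(f"n=5:  standard error = {se_small:.2f} points")
print(f"ratio = {se_small / se_large:.2f}")
```

Under these assumptions the five-student average is roughly four times noisier than the 85-student average, so a gap of a few points between the two could easily be chance.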

Reader Comments (Page 1 of 1)
8-14-2007 @ 9:27PM
John Brooks said...
Thanks for refuting this argument. I have been an "amateur" umpire for many years, doing baseball, slow pitch and fast pitch. I can assure you that race has NEVER been a factor in any decision I have made on the field.
Incidentally, my former vocation was in news (managing editor of a daily newspaper). I left the profession because of the overabundance of shoddy reporting and writing such as that foisted on us by the TIME reporter. How she - and her editors - can go with a story on an unfounded study is beyond the imagination.
8-14-2007 @ 9:39PM
Ed Slawinski said...
More incomprehensible bullshit from the media. These stories are indicative of the dishonesty that dominates the American press. I do not regard journalists as Americans. They are all, and I mean all, liars trying to write a key story that will make them famous and of course more money. I have a friend who works for a national newspaper and I can assure you that he pays his own bill when we meet for lunch.
8-15-2007 @ 12:19AM
Dick Sohn said...
There's nothing wrong with the sample size. If a college had 5% Chinese students but 80% of those students were on the dean's list, wouldn't that say something significant about Chinese students? If a spelling bee had only five Vietnamese entrants but all five made it to the finals, wouldn't that tell you something about Vietnamese students? The problem is not with the sample size. It's the 1% difference that is statistically insignificant.
8-15-2007 @ 1:25AM
eknizhnik said...
Actually, that's incorrect (directed at the comment above).
If you have a 1% difference in calls between two decently sized groups (even if one group has 20 umps and another has 80), that difference COULD be statistically significant. If you see that difference between a large group and a small group, the small group's averages often have too much variance to read into. This is precisely the point of the critique.
8-16-2007 @ 11:43AM
Doug Meckley said...
I am a statistician. The problem with the numbers is this: the sample size is critical to determining the confidence interval around any result. The author is on the right track when he suggests that with such a small sample size it's hard to say whether those results show true significance or pure chance. As a rule of thumb, a difference is statistically significant when the confidence intervals do not overlap. Small samples can still yield significance, provided the difference in the results is large enough to spread the averages so far apart that the confidence intervals do not overlap. With sample sizes as small as these, the differences would likely have to be much greater than what was demonstrated.
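The overlap check described above can be sketched in a few lines. This is only an illustration under loud assumptions: the two called-strike rates are hypothetical values one percentage point apart (not figures from the study), the interval uses the simple normal approximation, and every observation is treated as independent. That last assumption is exactly what the critique disputes, since the pitches come from a handful of umpires. Non-overlap is also a stricter bar than a formal test of the difference, so it errs on the cautious side.

```python
import math


def proportion_ci(p: float, n: int, z: float = 1.96) -> tuple[float, float]:
    """Approximate 95% confidence interval for a proportion (normal approximation)."""
    margin = z * math.sqrt(p * (1.0 - p) / n)
    return (p - margin, p + margin)


def intervals_overlap(a: tuple[float, float], b: tuple[float, float]) -> bool:
    """True when two (low, high) intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


# Hypothetical rates for illustration only; neither value is taken from the study.
RATE_SAME, RATE_DIFF = 0.32, 0.31

for n in (1_000, 100_000):
    ci_same = proportion_ci(RATE_SAME, n)
    ci_diff = proportion_ci(RATE_DIFF, n)
    print(f"n={n}: overlap={intervals_overlap(ci_same, ci_diff)}")
```

With 1,000 observations per group the intervals overlap, and with 100,000 per group they do not: the same one-point gap only separates cleanly once the sample is large.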