Maybe Illinois isn't so corrupt after all

If Wikipedia says it’s true, then it has to be.

Slate’s Ben Blatt did some digging on how each state is best described on its Wikipedia page. To do this, he looked at which word is used most disproportionally on each state’s page. Theoretically, that’s the one that best describes it.

Here’s what Blatt did:

“I calculated which word was used most disproportionately by looking at each word’s frequency within the state’s page compared with its frequency in the pages for all 50 states. This means that words that show up in a whole bunch of articles would be unlikely to make any state’s list. I also needed to use a cutoff to exclude words that appeared very rarely—if a word showed up once on the Texas page and zero times elsewhere, it would be hard to call its use disproportionate.”
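For the curious, here is a rough sketch of that calculation in Python. It is not Blatt's actual code, which the article doesn't show. The function name `most_disproportionate_word`, the `toy_pages` data and the reading of the cutoff as a minimum count on the state's own page are all assumptions made for illustration, and the word lists are invented rather than taken from Wikipedia.

```python
from collections import Counter


def most_disproportionate_word(pages, state, cutoff=3):
    """Return the word whose share of `state`'s page is largest
    relative to its share of all pages combined.

    pages  -- dict mapping a state name to a list of words (hypothetical format)
    cutoff -- assumed meaning: minimum times a word must appear on the
              state's own page to be considered
    """
    state_counts = Counter(pages[state])

    # Word counts across every page, including the state's own.
    all_counts = Counter()
    for words in pages.values():
        all_counts.update(words)

    state_total = sum(state_counts.values())
    all_total = sum(all_counts.values())

    best_word, best_ratio = None, 0.0
    for word, count in state_counts.items():
        if count < cutoff:
            continue  # too rare on this page to call disproportionate
        state_freq = count / state_total
        overall_freq = all_counts[word] / all_total
        ratio = state_freq / overall_freq
        if ratio > best_ratio:
            best_word, best_ratio = word, ratio
    return best_word


# Invented toy data, not real Wikipedia text.
toy_pages = {
    "State A": ["river"] * 3 + ["farm"] * 10 + ["the"] * 20,
    "State B": ["farm"] * 2 + ["coast"] * 5 + ["the"] * 20,
}

print(most_disproportionate_word(toy_pages, "State A", cutoff=3))
print(most_disproportionate_word(toy_pages, "State A", cutoff=10))
```

On this toy data, a cutoff of 3 picks "river" for State A, a word that shows up only three times but nowhere else. Raise the cutoff to 10 and "river" no longer qualifies, so the answer becomes "farm".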

So when he used a frequency cutoff of three, here are the results:

It’s hard to argue with “corruption” for Illinois, based on our track record of incarcerating politicians. And it also meshes with the fact that a poll shows 89 percent of people say corruption is common in Illinois.

But what happens when he raises the cutoff to 10? You get totally different results:

Notice that the only state that is unchanged is Hawaii, known for its islands.

Blatt’s point here is that it can be easy to manipulate data to get different results.

Maybe the word “nuclear” is just a reference to blowing up Illinois’ pension system as we know it.