Statistics

15 posts
A game to test your ability to pick random numbers

Compared to a computer’s pseudo-random number generator, we are not good at picking random numbers. Ilya Perederiy made a quick game to show how bad you are: Your fingers tend to repeat certain patterns even if you don’t notice it. The program keeps a database of each possible combination of 5 presses, and two counters are stored under each entry — one is for every zero that follows the combination,...
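The counting scheme in that description can be sketched in a few lines of Python. This is only an illustration, not Perederiy's actual code: the excerpt is cut off before the second counter is described, so the sketch assumes it counts the ones that follow each combination, and the names `PressPredictor` and `score` are made up for this example.

```python
from collections import defaultdict

WINDOW = 5  # length of the press history used as a lookup key (from the description)


class PressPredictor:
    """Hypothetical predictor: guesses the next 0/1 press from past patterns."""

    def __init__(self, window=WINDOW):
        self.window = window
        # counts[history] = [times a 0 followed, times a 1 followed]
        # ASSUMPTION: the second counter tracks ones; the excerpt is truncated here.
        self.counts = defaultdict(lambda: [0, 0])
        self.history = ""

    def predict(self):
        # Look up the counters for the current history; ties default to 0.
        zeros, ones = self.counts.get(self.history, (0, 0))
        return 1 if ones > zeros else 0

    def update(self, press):
        # Only record once a full window of presses is available.
        if len(self.history) == self.window:
            self.counts[self.history][press] += 1
        self.history = (self.history + str(press))[-self.window:]


def score(presses):
    """Fraction of presses the predictor guessed before seeing them."""
    predictor = PressPredictor()
    correct = 0
    for press in presses:
        if predictor.predict() == press:
            correct += 1
        predictor.update(press)
    return correct / len(presses)
```

A strictly alternating sequence such as `[0, 1] * 100` is learned almost immediately, so `score` comes out well above the 0.5 that a truly random sequence would average.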

Book Preview: How Charts Lie, by Alberto Cairo

If you’re like me, your first exposure to data visualization was as a consumer. You may have run across a pie chart, or a bar chart, perhaps in a newspaper or a textbook. Thanks to the power of the visual language, you got the message quickly, and moved on. Few of us learned how to create charts from first principles. No one taught us about axes, tick marks, gridlines, or...

Case of the 500-mile email

Trey Harris, a former tech administrator for a university, tells the story of a statistics department that couldn’t send email farther than 500 miles away. The story is more about the peculiarities of server admin in 2002, but I’m more interested in those statisticians: “We could send email. Just not more than–“ “–500 miles, yes,” I finished for him, “I got that. But why didn’t you call earlier?” “Well, we...

Probability you will break up with your partner

Rosenfeld et al. from Stanford University ran a survey in 2009 for a study on How Couples Meet and Stay Together. Dan Kopf and Youyou Zhou for Quartz used this dataset to estimate the probability that you will break up with your partner, given a few bits of information about your current relationship. The Stanford data page says a 2017 release is on the way. I’m curious how, if anything,...

This chart advises webpages to add more words

A reader sent me the following chart. In addition to the graphical glitch, I was asked about the study's methodology. I was able to trace the study back to this page. The study uses a line chart instead of the bar chart with an axis not starting at zero. The line shows that web pages ranked higher by Google on the first page tend to have more words, i.e., longer content...

Data-driven porn

Gustavo Turner for Logic on his experiences covering the porn industry and the introduction of data into the stream: Over the course of these experiences, I learned about a major new force reshaping the industry: data. That day on set, the director’s instructions came directly from the production company, which decided on the topic and vetted the script. And the company based its creative direction on specific fantasies proposed by...

Labels, scales, controls, aggregation all in play

JB @barclaysdevries sent me the following BBC production over Twitter. He was not amused. This chart pushes a number of my hot buttons. First, I like to assume that readers don't need to be taught that 2007 and 2018 are examples of "Year". Second, starting an area chart away from zero is just as bad as starting a bar chart not at zero! The area is distorted and does not...

Men and women faced different experiences in the labor market

Last week, I showed how an aggregate statistic, the unemployment rate, masked some unusual trends in the labor market in the U.S. Despite the unemployment rate in 2018 being equal to, or even a little below, that of 2000, the peak of the last tech boom, there are now significantly more people "not in the labor force," and these people are not counted in the unemployment rate statistic. The analysis focuses on...
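To see why people "not in the labor force" matter, here is a small worked example. It assumes the standard headline definition (unemployed divided by the labor force, where the labor force is employed plus unemployed); the population figures are invented purely for illustration and do not come from the post.

```python
def unemployment_rate(employed, unemployed):
    """Headline-style rate: unemployed as a share of the labor force."""
    labor_force = employed + unemployed  # excludes people not looking for work
    return unemployed / labor_force


# Hypothetical town of 1,000 working-age people (made-up numbers).
before = unemployment_rate(employed=900, unemployed=60)  # 40 not in labor force

# 20 unemployed people stop looking for work: nobody gained a job,
# but they now count as "not in the labor force" (60 people).
after = unemployment_rate(employed=900, unemployed=40)

print(f"before: {before:.1%}, after: {after:.1%}")
```

The rate falls even though employment is unchanged, which is the kind of masking the excerpt describes.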

What to make of the historically low unemployment rate

One of the amazing economic stories of the moment is the unemployment rate, which at around 4% has returned to the level last reached during the peak of the tech boom in 2000. The story is much more complex than it seems. I devoted a chapter of Numbersense (link) to explain how the government computes unemployment rates. The most important thing to realize is that an unemployment rate of 4...

No such thing as raw data

Nick Barrowman on the myth of raw data: Assumptions inevitably find their way into the data and color the conclusions drawn from it. Moreover, they reflect the beliefs of those who collect the data. As economist Ronald Coase famously remarked, “If you torture the data enough, nature will always confess.” And journalist Lena Groeger, in a 2017 ProPublica story on the biases that visual designers inscribe into their work, soundly...
