Outlier detection in R

Speaking of outliers, it’s not always obvious when and why a data point is an outlier. The OutliersO3 ("Overview of Outliers") package in R by Antony Unwin lets you compare outlier detection methods. Articles on outlier methods use a mixture of theory and practice. Theory is all very well, but outliers are outliers because they don’t follow theory. Practice involves testing methods on data, sometimes with data simulated based on theory, better with...

Visualizing Outliers

Step 1: Figure out why the outlier exists in the first place. Step 2: Choose from these visualization options to show the outlier.

Getting into the heads of the chart designer

When I look at this chart (from Business Insider), I try to understand the decisions made by its designer: which things are important to her or him, and which are less important. The chart shows average salaries in the top 2 percent of income earners. The data are split by gender and by state. First, I notice that the designer chooses to use the map form. This decision suggests that...

Political winds and hair styling

Washington Post (link) and New York Times (link) published dueling charts last week, showing the swing-swang of the political winds in the U.S. Of course, you know that the pendulum has shifted riotously rightward towards Republican red in this election. The Post focused its graphic on the urban / not urban division within the country: Over Twitter, Lazaro Gamio told me they are calling these troll-hair charts. You certainly can...

Denver outspends everyone on this

Someone at the Wall Street Journal noticed that Denver's transit agency has outspent other top transit agencies, after accounting for number of rides -- and by a huge margin. But the accompanying graphic conspires against the journalist. For one thing, Denver is at the bottom of the page. Denver's two bars do not stand out in any way. New York's transit system dwarfs everyone else in both number of rides...

What if the RNC assigned seating randomly

The punditry has spoken: the most important data question at the Republican Convention is where different states are located. Here is the FiveThirtyEight take on the matter: They crunched some numbers and argue that Trump's margin of victory in the state primaries is the best indicator of how close to the front that state's delegation is situated. Others have put this type of information on a map: The scatter plot...

