A Twitter user asked how I feel about this latest effort (from NASA) to illustrate global warming. To see the entire video, go to their website. This video buries the lede, so be patient or jump ahead to 0:56 and watch till the end. Let's first describe what we are seeing. The dataset consists of monthly average global temperature "anomalies" from 1880 to 2021 - an "anomaly" is the deviation...

Daniel Z. tweeted about my post from last week. In particular, he took a deeper look at the chart of energy demand that put all hourly data onto the same plot, originally published at the StackOverflow blog: I noted that this is not a great chart, particularly since the key features of the underlying data are not what catches our eyes. Daniel made a clearly better chart: This is a...

This is the second post in response to a blog post at StackOverflow (link) in which the author discusses the "harm" of "aggregating away the signal" in your dataset. The first post appeared on my book blog earlier this week (link). One stop in their exploratory data analysis journey was the following chart: This chart plots all the raw data, all 8,760 values of electricity consumption in California in...

A Twitter follower sent the following chart: It's odd to place the focus on China when the U.S. line is much higher, and spending in the last few years has grown much faster in the U.S. than in China. In the Trifecta Checkup, this chart is Type D (link): the data are at odds with the message of the chart. The intended message likely is...

Andrew's post about start-at-zero helps me refine my own thinking on this evergreen topic. The specific example he gave is this one: The dataset is a numeric variable (y) with values over time (x). The minimum numeric value is around 3 and the range of values is from around 3 to just above 20. His advice is "If zero is in the neighborhood, invite it in". (Link) The rule, as...

Through my Twitter feed, I found my way to this chart, made by jamie_bio. This is produced using R code even though it looks like a slide. The underlying dataset concerns votes at the United Nations on various topics. Someone has already classified these topics. Jamie looked at voting blocs, specifically, countries whose votes agree most often or least often with the U.K. If you look at his GitHub, this...

A reader finds this chart hard to parse: The chart shows the trend in gas prices in New York in the past two years. This is a case in which the simple line chart works very well. I added annotations because the reasons behind the decline and rise in prices are reasonably clear. One should be careful when formatting dates. The legend of the original chart looks like this: In...

An author in Significance claims that a single season of Premier League football without live spectators is enough to prove that the so-called home field advantage is really a live-spectator advantage. The following chart depicts the data going back many seasons: I find this bar chart challenging. It plots the ratio of home wins to away wins using an odds scale, which is not intuitive. The odds scale (probability of...

In the prior post about Canadian elections, I suggested that designers expand beyond plots of one variable at a time. Today, I look at a project by DataWrapper on the German elections which happened this week. Thanks to long-time blog supporter Antonio for submitting the chart. The following is the centerpiece of Lisa's work: CDU/CSU is Angela Merkel's party, represented by the black color. The chart answers one question only:...

Andrew jumped on the Benford bandwagon to do a tongue-in-cheek analysis of numbers in Hollywood movies (link). The key graphic is this: Benford's Law is frequently invoked to prove (or disprove) fraud with numbers by examining the distribution of first digits. Andrew extracted movies that contain numbers in their names - mostly but not always sequences of movies with sequels. The above histogram (gray columns) shows the number of movies...
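
For readers who want to see what "examining the distribution of first digits" looks like in practice, here is a minimal Python sketch. It is not Andrew's code. It uses the standard statement of Benford's Law, under which the expected share of first digit d is log10(1 + 1/d). The function names and the sample numbers are made up for illustration.

```python
import math
from collections import Counter


def benford_expected():
    """Expected share of each first digit 1-9 under Benford's Law."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def first_digit_shares(numbers):
    """Observed share of each first digit among positive integers."""
    counts = Counter(int(str(n)[0]) for n in numbers if n > 0)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("need at least one positive integer")
    return {d: counts.get(d, 0) / total for d in range(1, 10)}


if __name__ == "__main__":
    # Hypothetical numbers, invented for this sketch (not real movie data).
    sample = [2, 3, 12, 101, 13, 21, 300, 8]
    expected = benford_expected()
    observed = first_digit_shares(sample)
    for d in range(1, 10):
        print(f"{d}: expected {expected[d]:.3f}, observed {observed[d]:.3f}")
```

Comparing the two columns digit by digit is the informal version of the check. With a sample this small, large gaps between expected and observed shares are unremarkable.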