Statistics

23 posts
Crazy rich Asians inspire some rich graphics

On the occasion of the hit movie Crazy Rich Asians, the New York Times did a very nice report on Asian immigration in the U.S. The first two graphics will be of great interest to those who have attended my free dataviz seminar (coming to Lyon, France, in October, by the way), as they deal with a related issue. The first chart shows an income gap widening between...

Changing size analogies and the trends of everyday things

When you try to describe the size of something but don’t have an exact measurement, you probably compare it to an everyday object that others can relate to. Using the Google Books Ngram dataset, Colin Morris looked at how such comparisons changed over the past few centuries. I especially like the bits of history that explain why some words fell into and out of fashion. Tags: language, n-gram, size

Waffle House index as a storm indicator

Waffle House activated their storm center in preparation for Hurricane Florence. Their restaurants are open 24/7, so they need to keep track of which ones need to close or limit their menus. This might also have to do with an informal Waffle House Index that FEMA described last year: If a Waffle House can serve a full menu, they’ve likely got power (or are running on a generator). A limited...

Algorithms to fix underrepresentation on Wikipedia

Wikipedia is human-edited, so naturally there are biases towards certain groups of people. Primer, an artificial intelligence startup, is working on a system that looks for people who should have an article. It’s called Quicksilver. We trained Quicksilver’s models on 30,000 English Wikipedia articles about scientists, their Wikidata entries, and over 3 million sentences from news documents describing them and their work. Then we fed in the names and affiliations...

Live polling results for transparency and a way to learn about the process

In a collaboration with Siena College, The Upshot is showing live polling results. The ticker moves in real-time for every phone call. For the first time, we’ll publish our poll results and display them in real time, from start to finish, respondent by respondent. No media organization has ever tried something like this, and we hope to set a new standard of transparency. You’ll see the poll results at the...

Counting baseball cliches

Post-game sports interviews tend to sound similar. And when an athlete does say something out of pattern, talk shows and social media examine every word to find hidden meaning. It’s no wonder athletes talk in cliches. The Washington Post, using natural language processing, counted the phrases and idioms that baseball players use. We grouped phrases that were variations of each other together (within a one- or two-word difference) into...

Weighing the risk of moderate alcohol consumption

A research study on mortality and alcohol consumption is making the rounds. Its main conclusion is that all alcohol consumption is bad for you, because of increased risk. David Spiegelhalter, the chair of the Winton Centre for Risk and Evidence Communication, offers a different interpretation of the data: Let’s consider one drink a day (10g, 1.25 UK units) compared to none, for which the authors estimated an extra 4 (918–914)...

Education deserts: places without schools still serve pies and story time

I very much enjoyed reading The Chronicle's article on "education deserts" in the U.S., defined as places where there are no public colleges within reach of potential students. In particular, the data visualization deployed to illustrate the story is superb. For example, this map shows 1,500 colleges and their "catchment areas" defined as places within 60 minutes' drive. It does a great job walking through the logic of the analysis...

What data scientists really do

Statistics. I kid, I kid. Hugo Bowne-Anderson, host of the DataFramed podcast, pulled together what he’s learned from interviewing data scientists. This is what data scientists really do. One result of this rapid change is that the vast majority of my guests tell us that the key skills for data scientists are not the abilities to build and use deep-learning infrastructures. Instead they are the abilities to learn...

2018 House forecast from FiveThirtyEight

Ever since the huge forecasting upset in 2016, I’ve tended to stay away from that stuff. I mean, it was painful to watch the Golden State Warriors, a huge favorite to win the championship basically the whole series, lose to the Cleveland Cavaliers. Yeah. The Warriors. What were you thinking of? Alas, it is 2018, and FiveThirtyEight has their forecast for who will control the House. Mainly, I post for...
