Statistics

5 posts
Squirrel census count in Central Park

In 2018, there was a squirrel census count at Central Park in New York. New York Times graphics editor Denise Lu participated in the citizen science project “to collect the kind of data that underlies the work I do every day.” Lu did a short but interesting piece on her experience counting squirrels. You can download the data via NYC Open Data. Now I’m wondering if I should apply to...

Analysis of online sermons

Pew Research Center analyzed online sermons in U.S. churches, taking a closer look at what people typically hear across religions: For instance, the distinctive words (or sequences of words) that often appear in sermons delivered at historically black Protestant congregations include “powerful hand” and “hallelujah … come.” The latter phrase (which appears online in actual sentences such as “Hallelujah! Come on … let your praises loose!”) appeared in some form...

AI-generated pies

Janelle Shane applied her know-how with artificial intelligence to generate new types of pies that the world has never seen: People wonder about what it would be like if a super-intelligent AI decided to place all of humanity in a realistic simulation. I wonder what it would be like if the simulation were built by today’s AI instead – whose computing power is somewhere around the level of an earthworm’s....

Looking for similar NBA games, based on win probability time series

Inpredictable, a sports analytics site by Michael Beuoy, tracks win probabilities of NBA games going back to the 1996-97 season. When a team is up by a lot, their probability of winning is high, and then flip that for the losing team. So for each game, you have a minute-by-minute time series of win probability. Beuoy added a new feature that looks for games with similar patterns a.k.a. “Dopplegamers”. Tags:...
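One way to picture a search for games with similar patterns is a nearest-neighbor lookup over the win probability series. The sketch below is only an illustration under stated assumptions: it uses plain Euclidean distance, the function names are hypothetical, and the sample series are made up. It is not a description of how Inpredictable actually matches games.

```python
import math

def distance(a, b):
    """Euclidean distance between two equal-length win probability series."""
    if len(a) != len(b):
        raise ValueError("series must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def most_similar(target_id, games, k=1):
    """Return the ids of the k games whose series are closest to the target's."""
    target = games[target_id]
    others = [
        (distance(target, series), game_id)
        for game_id, series in games.items()
        if game_id != target_id
    ]
    return [game_id for _, game_id in sorted(others)[:k]]

# Made-up data: one team's win probability sampled at five points in a game.
games = {
    "blowout": [0.50, 0.70, 0.85, 0.95, 1.00],
    "wire_to_wire": [0.50, 0.65, 0.80, 0.90, 1.00],
    "comeback": [0.50, 0.30, 0.15, 0.40, 1.00],
}

print(most_similar("blowout", games))
```

Real minute-by-minute series would be longer, and a production feature might prefer a different distance measure, but the lookup would have the same shape.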

Data shelf life

Stephen M. Stigler argues that data have a limited shelf life. The abstract: Data, unlike some wines, do not improve with age. The contrary view, that data are immortal, a view that may underlie the often-observed tendency to recycle old examples in texts and presentations, is illustrated with three classical examples and rebutted by further examination. Some general lessons for data science are noted, as well as some history of...

The Myth of ‘Dumbing Down’

For The Atlantic, Ian Bogost on communicating complex ideas to an audience: One thing you learn when writing for an audience outside your expertise is that, contrary to the assumption that people might prefer the easiest answers, they are all thoughtful and curious about topics of every kind. After all, people have areas in their own lives in which they are the experts. Everyone is capable of deep understanding. Up...

The rule governing which variable to put on which axis, served a la mode

When making a scatter plot, the two variables should not be placed arbitrarily. There is a rule governing this: the outcome variable should be shown on the vertical axis (also called y-axis), and the explanatory variable on the horizontal (or x-) axis. This chart from the archives of the Economist has this reversed: The title of the accompanying article is "Ice Cream and IQ"... In a Trifecta Checkup (link), it's...

FiveThirtyEight launches new NBA metric for predictions

FiveThirtyEight has been predicting NBA games for a few years now, based on a variant of Elo ratings, which in turn have roots in ranking chess players. But for this season, they have a new metric to predict with called RAPTOR, or Robust Algorithm (using) Player Tracking (and) On/Off Ratings: NBA teams highly value floor spacing, defense and shot creation, and they place relatively little value on traditional big-man skills....
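For readers unfamiliar with Elo, here is a minimal sketch of the textbook rating update used in chess. It is not FiveThirtyEight's NBA variant and says nothing about RAPTOR; the function names and the K-factor of 20 are assumptions chosen for illustration.

```python
def expected_score(rating_a, rating_b):
    """Probability that A beats B under the standard Elo logistic curve."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))

def update(rating_a, rating_b, score_a, k=20.0):
    """Return new (rating_a, rating_b) after one game.

    score_a is 1.0 if A won and 0.0 if A lost.
    k is an assumed K-factor that controls how fast ratings move.
    """
    delta = k * (score_a - expected_score(rating_a, rating_b))
    return rating_a + delta, rating_b - delta

# Two evenly rated teams: the winner gains exactly what the loser gives up.
print(update(1500, 1500, 1.0))
```

The update is zero-sum: an upset moves ratings more than an expected result because the gap between the actual and expected score is larger.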

Statistical significance explainer, and Instagram’s experiment to hide Likes

There are some statistical concepts that all data visualization practitioners should know about, and the concept of statistical significance is one of them. It's a hard concept to grasp because it requires one to think beyond the data that are collected. The abstract thinking is necessary since we typically want to make general statements - while using the collected data as evidence. My new video in the Data Science: The...
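One way to make "thinking beyond the data that are collected" concrete is a permutation test, sketched below. This technique is swapped in for illustration and is not taken from the video; the function name, the default of 2,000 shuffles, and the sample numbers are all assumptions.

```python
import random
from statistics import mean

def permutation_p_value(group_a, group_b, n_permutations=2000, seed=0):
    """Estimate how often random relabeling gives a gap in means at least
    as large as the one observed between the two groups."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        gap = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if gap >= observed:
            hits += 1
    return hits / n_permutations

# Made-up measurements for two groups, e.g. an experiment and a control.
control = [1, 2, 3, 4, 5]
treated = [101, 102, 103, 104, 105]
print(permutation_p_value(control, treated))
```

A small value suggests the observed gap would rarely arise from relabeling alone; a value near 1 suggests the gap is indistinguishable from noise.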

Statistical fallacies in the news

For UnHerd, Tom Chivers talks about David Spiegelhalter’s new book and why every statistical headline deserves a grain of salt. One way to make sure things check out: As a non-mathematician, I have a few shortcuts for working out whether a statistic is worth believing, which seem to have done all right for me so far. One, which Spiegelhalter stresses, is that often the best statistical analysis you can do...
