Statistics

Vector paths of meaning between words and phrases

Benjamin Schmidt, an assistant professor of history at Northeastern University, explored the space between words and drew the paths to get from one word to another. The above, for example, is the path between Seinfeld and Breaking Bad. Using Google News as the corpus, the steps: Take any two words. I used “duck” and “soup” for my testing. Find a word that is, in cosine distance, between the two words:...
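The "find a word between the two words" step can be sketched roughly as follows. This is a minimal illustration, not Schmidt's code: the excerpt cuts off before the details, so picking the word closest to the midpoint of the two endpoint vectors is an assumption, and the two-dimensional `TOY_VECTORS` stand in for real Google News embeddings. The names `cosine_distance` and `word_between` are hypothetical.

```python
import math


def cosine_distance(u, v):
    """Return 1 minus the cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def word_between(start, end, vectors):
    """Pick the vocabulary word nearest the midpoint of start and end.

    Assumption: "between" is read as "smallest cosine distance to the
    average of the two endpoint vectors". The endpoints are excluded.
    """
    midpoint = [(a + b) / 2 for a, b in zip(vectors[start], vectors[end])]
    candidates = (w for w in vectors if w not in (start, end))
    return min(candidates, key=lambda w: cosine_distance(vectors[w], midpoint))


# Made-up 2-D vectors for illustration only; real embeddings have
# hundreds of dimensions and come from a trained model.
TOY_VECTORS = {
    "duck": [1.0, 0.0],
    "soup": [0.0, 1.0],
    "goose": [0.9, 0.1],
    "broth": [0.1, 0.9],
    "stew": [0.6, 0.7],
}

print(word_between("duck", "soup", TOY_VECTORS))
```

With these toy vectors, "stew" sits closest to the midpoint of "duck" and "soup". Repeating the same step on each half (start to middle, middle to end) would presumably yield a longer path, though the excerpt does not show how the original handles that.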

Counting the times Kevin Durant’s shoe came off during games

This is what happens when there is a lull during the basketball playoff season. Chris Herring, for FiveThirtyEight, goes into full detail of the relatively high number of times Kevin Durant’s shoe falls off during games: All told, an extensive video analysis of Durant’s games from the past three regular seasons and postseasons reveals that the four-time scoring champ has come out of his shoe at least 31 times since...

A guide for applying to data science jobs

Emily Robinson gives advice on applying for a data science job (that you can likely generalize for most tech jobs). For example: If you have a GitHub, pin the repos you want people to see and add READMEs that explain what the project is. I also strongly recommend creating a blog to write about data science, whether it’s projects you’ve worked on, an explanation of a machine learning method, or...

Increasing similarity of Billboard songs

Popular songs on the Billboard charts always tended to sound similar, but these days they’re sounding even more similar. Andrew Thompson and Matt Daniels for The Pudding make the case: From 2010-2014, the top ten producers (by number of hits) wrote about 40% of songs that achieved #1 – #5 ranking on the Billboard Hot 100. In the late-80s, the top ten producers were credited with half as many hits,...

Amazon Rekognition for government surveillance

Amazon’s Rekognition is a video analysis system that promises to identify individuals in real time. Amazon wants to sell the system to governments for surveillance. From the ACLU: Amazon is marketing Rekognition for government surveillance. According to its marketing materials, it views deployment by law enforcement agencies as a “common use case” for this technology. Among other features, the company’s materials describe “person tracking” as an “easy and accurate” way to...

Data scientists as the new Mad Men

Ken Auletta for The New Yorker looks at “math men” replacing the Mad Men: Engineers and data scientists vacuum data. They see data as virtuous, yielding clues to the mysteries of human behavior, suggesting efficiencies (including eliminating costly middlemen, like agency Mad Men), offering answers that they believe will better serve consumers, because the marketing message is individualized. The more cool things offered, the more clicks, the more page views,...

Data is, sometimes

The Financial Times recently updated its style guide: data — the rule for always using data as plural has been relaxed. If you read data as singular then write it as such. For example, we already allow singular for ‘big data’. And we should for personal data too. An easy rule would be that if it can be used as a synonym for information then it should probably be singular —...

Every document copy stored on used digital photocopiers

CBS News picked up four used photocopiers and looked at the hard drives. There was a lot of private information stored in them: Nearly every digital copier built since 2002 contains a hard drive – like the one on your personal computer – storing an image of every document copied, scanned, or emailed by the machine. In the process, it’s turned an office staple into a digital time-bomb packed with...

Common charting issues related to connecting lines, labels, sequencing

The following chart about "ranges and trends for digital marketing salaries" has some problems that appear in a great number of charts. The head tilt required to read the job titles. The order of the job titles is baffling. It's neither alphabetical nor by salary. The visual form suggests that we could see trends in salaries reading left-right, but the only information about trends is the year on year salary...
