Data Sources

56 posts
Money-in-politics nonprofits merge their datasets

The Center for Responsive Politics and the National Institute on Money in Politics are merging their datasets to make them more accessible: The nation’s two leading money-in-politics data organizations have joined forces to help Americans hold their leaders accountable at the federal and state levels, they said today. The combined organization, OpenSecrets, merges the Center for Responsive Politics (CRP) and the National Institute on Money in Politics (NIMP), each leading entities for...

Mining Parler data

Just before the social network Parler went down, a researcher who goes by the Twitter username @donk_enby scraped 56.7 terabytes of data from the site via a less-than-secure API. Motherboard reports on what some researchers are doing with the data: One technologist took the scraped Parler data, took every file that had GPS coordinates included within it, formatted that information into JSON, and plotted those onto a map. The technologist...
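For a rough idea of what the "coordinates to JSON to map" step might look like, here is a minimal sketch. The excerpt only says the information was formatted into JSON; GeoJSON is my own choice here because most mapping tools read it directly. The records and the field names (`file`, `lat`, `lon`) are made up for illustration and are not from the actual scrape.

```python
import json


def to_geojson(records):
    """Build a GeoJSON FeatureCollection from records that carry coordinates.

    Records without both a latitude and a longitude are skipped.
    The field names "file", "lat", and "lon" are hypothetical.
    """
    features = []
    for rec in records:
        lat, lon = rec.get("lat"), rec.get("lon")
        if lat is None or lon is None:
            continue  # no GPS metadata on this file
        features.append({
            "type": "Feature",
            # GeoJSON orders coordinates as [longitude, latitude]
            "geometry": {"type": "Point", "coordinates": [lon, lat]},
            "properties": {"file": rec.get("file")},
        })
    return {"type": "FeatureCollection", "features": features}


# Made-up example records, not real data
records = [
    {"file": "a.mp4", "lat": 38.8899, "lon": -77.0091},
    {"file": "b.mp4"},  # no coordinates, so it is dropped
]

geojson_text = json.dumps(to_geojson(records))
```

The resulting string can be saved to a file and loaded into most web mapping libraries or desktop GIS tools as a point layer.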

Data for all of the referee calls in NBA games

Owen Phillips compiled per-game and cumulative foul calls for all NBA referees between the 2016-17 and 2019-20 seasons. On its own, I’m not sure it’s that exciting, but if you’re into basketball analytics, it might be fun to tie in with other data. Tags: basketball, Owen Phillips, referee

Google search trends dataset for Covid-19 symptoms

Google released a search trends dataset earlier this month. Using this dataset, Adam Pearce made an explorer to compare search volume over time: The COVID-19 Search Trends symptoms dataset shows aggregated, anonymized trends in Google searches for more than 400 health symptoms, signs, and conditions, such as cough, fever and difficulty breathing. The dataset provides a time series for each region showing the relative volume of searches for each symptom....

Friends sitcom transcript dataset

For your analytical perusal, Emil Hvitfeldt provides ten seasons’ worth of scripts from the Friends sitcom in an easy-to-use R package: The goal of friends to provide the complete script transcription of the Friends sitcom. The data originates from the Character Mining repository which includes references to scientific explorations using this data. This package simply provides the data in tibble format instead of json files. The ten seasons ran from...

Data on loans issued through the Paycheck Protection Program

The Paycheck Protection Program was established to provide aid to small businesses. It’s a $669-billion loan program. The data for 4.8 million loans, amounting to $521 billion so far, is now available from the Small Business Administration. For loans less than $150,000, you can download data for all states individually. Data for loans that were more than $150,000 can be downloaded as a single file. Look up business name, type,...

What the federal government has been buying and where from

The Federal Procurement Data System tracks federal contracts of $10,000 or more. For ProPublica, Moiz Syed and Derek Willis made the data for coronavirus-related contracts more accessible with a searchable database. Browse the items, the companies, and the amounts. Somehow it seems like so much, and yet so not enough. See also the accompanying article highlighting some of the more questionable contracts. Tags: coronavirus, procurement, ProPublica

Coronavirus data at the state and county level, from The New York Times

Comprehensive national data on Covid-19 has been hard to come by through government agencies. The New York Times released their own dataset and will be updating it regularly: The tracking effort grew from a handful of Times correspondents to a large team of journalists that includes experts in data and graphics, staff news assistants and freelance reporters, as well as journalism students from Northwestern University, the University of Missouri and the...

Restaurant struggles

The restaurant industry is taking a big hit right now, as most people are staying put at home. OpenTable provides a downloadable dataset to show how much restaurant dining is down: This data shows year-over-year seated diners at restaurants on the OpenTable network across all channels: online reservations, phone reservations, and walk-ins. For year-over-year comparisons by day, we compare to the same day of the week from the same week...
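To make the day-of-week alignment concrete, here is a small sketch of one way to compute a year-over-year change against the same weekday. It assumes the baseline is the date exactly 52 weeks (364 days) earlier, which always lands on the same day of the week; the excerpt above is cut off, so OpenTable's exact rule may differ. The function name `yoy_change` and the diner counts are made up for illustration.

```python
from datetime import date, timedelta


def yoy_change(seated, day):
    """Percent change in seated diners versus the same weekday 52 weeks earlier.

    `seated` maps a date to a seated-diner count. Using a 52-week offset
    is an assumption, not necessarily OpenTable's method.
    """
    baseline_day = day - timedelta(weeks=52)  # same day of the week
    baseline = seated[baseline_day]
    return (seated[day] - baseline) / baseline * 100.0


# Hypothetical counts, not real OpenTable figures
seated = {
    date(2019, 3, 19): 200,  # a Tuesday
    date(2020, 3, 17): 50,   # the Tuesday 52 weeks later
}

change = yoy_change(seated, date(2020, 3, 17))  # -75.0, i.e. down 75%
```

Comparing Tuesday to Tuesday instead of March 17 to March 17 keeps the weekly dining cycle from distorting the comparison, since the same calendar date falls on a different weekday each year.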

Nationwide database of credibly accused Catholic clergy

For ProPublica, Ellis Simani and Ken Schwencke compiled an interactive database that you can search: ProPublica reporters spent months collecting the lists as they were originally released by each diocese. They then made them searchable via a public database in order to provide victims of clerical abuse and members of the public a way to search across all of the released lists. More than 6,700 names are included in the...
