Software

Cleaning your data with Excel and Google Spreadsheets

For Datawrapper, Lisa Charlotte Rost outlines the steps to prepare and clean your data in Excel or Google Spreadsheets. From the beginning: When you download an Excel file, it often has multiple sheets. Our data set has three of them, as seen on the bottom: “Data”, “Metadata – Countries” and “Metadata – Indicators”. Look through all of your sheets and make sure you understand what you’re seeing there. Do the...

Microsoft’s visual data explorer SandDance open sourced

Microsoft just open sourced their data exploration tool known as SandDance: For those unfamiliar with SandDance, it was introduced nearly four years ago as a system for exploring and presenting data using “unit visualizations.” Instead of aggregating data and showing the resulting sums as bar charts, SandDance shows every single row of a dataset (for datasets up to ~500K rows). It represents each of these rows as a mark that...
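To make the difference between aggregating and showing every row concrete, here is a minimal Python sketch. It is not SandDance's code or API; the toy dataset and the function names `aggregate_view` and `unit_view` are made up for illustration. The aggregate view collapses rows into one count per category, while the unit view keeps one mark per row and assigns it a position in its category's stack.

```python
from collections import Counter

# Hypothetical toy dataset: one dict per row (not taken from SandDance).
rows = [
    {"id": 1, "category": "A"},
    {"id": 2, "category": "B"},
    {"id": 3, "category": "A"},
    {"id": 4, "category": "C"},
    {"id": 5, "category": "A"},
]

def aggregate_view(rows):
    """Bar-chart style: collapse the rows into one count per category."""
    return Counter(row["category"] for row in rows)

def unit_view(rows):
    """Unit-visualization style: keep one mark per row.

    Each mark records its column (the category) and its stack position
    within that column, so every row stays individually addressable.
    """
    next_slot = Counter()  # next free stack position per category
    marks = []
    for row in rows:
        cat = row["category"]
        marks.append({"id": row["id"], "column": cat, "stack": next_slot[cat]})
        next_slot[cat] += 1
    return marks

bars = aggregate_view(rows)
marks = unit_view(rows)

print(dict(bars))   # one number per category
for mark in marks:  # one mark per original row
    print(mark)
```

The stacked marks have the same heights as the bars, but each one can still be traced back to the row it came from, which is the property the excerpt describes.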

Book review: Visualizing Baseball

I requested a copy of Jim Albert’s Visualizing Baseball book, which is part of the ASA-CRC series on Statistical Reasoning in Science and Society that has the explicit goal of reaching a mass audience. The best feature of Albert’s new volume is its brevity. For someone with a decent background in statistics (and grasp of basic baseball jargon), it’s a book that can be consumed within one week, after which...

Runway ML makes machine learning easier to use for creators

Machine learning can feel like a foreign concept only useful to those with access to big machines. Runway ML aims to make machine learning easier to use for a wider audience, specifically for creators. It provides a click-and-drag interface that lets you link algorithms, import datasets, and most importantly, experiment. Looks like fun. Give it a go. Tags: machine learning

Webinar Wednesday

I'm delivering a quick-fire webinar this Wednesday on how to make impactful data graphics for communication and persuasion. Registration is free, at this link. *** In the meantime, I'm preparing a guest lecture for the Data Visualization class at Yeshiva University Sims School of Management. The goal of the lecture is to emphasize the importance of incorporating analytics into the data visualization process. Here is the lesson plan: Introduce the...

Kepler.gl, an open source tool for mapping large-scale data

Kepler.gl, a collaboration between Uber and Mapbox, allows for easier mapping of large-scale data. From Shan He for Uber: Showing geospatial data in a single web interface, kepler.gl helps users quickly validate ideas and glean insights from these visualizations. Using kepler.gl, a user can drag and drop a CSV or GeoJSON file into the browser, visualize it with different map layers, explore it by filtering and aggregating it, and eventually...

Altair for visualization in Python

Vega-Lite is a grammar for interactive graphics primarily used for analysis. Altair is a visualization library in Python that is based on this grammar. With Altair, you can spend more time understanding your data and its meaning. Altair’s API is simple, friendly and consistent and built on top of the powerful Vega-Lite visualization grammar. This elegant simplicity produces beautiful and effective visualizations with a minimal amount of code. Jim Vallandingham...

Outlier detection in R

Speaking of outliers, it’s not always obvious when and why a data point is an outlier. The Overview of Outliers package in R by Antony Unwin lets you compare methods. Articles on outlier methods use a mixture of theory and practice. Theory is all very well, but outliers are outliers because they don’t follow theory. Practice involves testing methods on data, sometimes with data simulated based on theory, better with...
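As a rough illustration of why comparing methods matters, here is a minimal Python sketch that runs two common rules on the same data. It uses only the standard library and is not the R package's approach or API; the sample values, thresholds, and function names are assumptions chosen for the example.

```python
import statistics

def zscore_outliers(values, threshold=3.0):
    """Flag values more than `threshold` sample standard deviations from the mean."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [v for v in values if abs(v - mean) / sd > threshold]

def iqr_outliers(values, k=1.5):
    """Flag values beyond k * IQR outside the quartiles (boxplot-style fences)."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]

# Hypothetical sample with one visibly extreme value.
data = [10, 11, 12, 12, 13, 13, 14, 15, 22, 40]

by_zscore = zscore_outliers(data)
by_iqr = iqr_outliers(data)

print("z-score rule:", by_zscore)
print("IQR rule:    ", by_iqr)
```

On this sample the two rules disagree: the IQR fences flag 40, while the z-score rule flags nothing, because the extreme value inflates the standard deviation it is measured against. With only ten points, no value can reach a sample z-score of 3 at all.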

Microsoft Excel painter

Remember the artist Tatsuo Horiuchi who uses Microsoft Excel to paint scenery? Four years later, he’s still at it. Watch below. Horiuchi is my favorite example of someone who shows that the tool is secondary to what you want to make. Spend less time debating about what software you should use to visualize your data, and spend more time deciding what you want to show. Tags: Excel, paintings, tools

Choosing the right metric reveals the story behind the subway mess in NYC

I forgot who sent this chart to me; it may have been a Twitter follower. The person complained that the following chart exaggerated how much trouble the New York mass transit system (MTA) has been facing in 2017, because of the choice of the vertical axis limits. This chart is vintage Excel, using Excel defaults. I find this style ugly and uninviting. But the chart does contain some good...
