A resource for news innovators powered by American Press Institute
Complexity: Advanced

Making sense of dirty data

Cleaning data can be time-consuming and frustrating, but it's also a critical step in any data analysis. Learn how to use free tools and smart approaches to get your data ready for analysis.

This course is free but requires registration and login. Once you are logged in, click on Module 4: Dealing with Messy Data.

Any data journalist will tell you that the hardest part of any project is cleaning the data. Data cleaning is the process of checking and correcting the information in your dataset: standardizing units, correcting misspellings, and identifying and fixing errors. In this module, learn how to fix formatting problems, use OpenRefine, and apply creative approaches to problems in massive databases.
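To make the cleaning steps named above concrete, here is a minimal Python sketch of two of them: correcting misspellings and standardizing units. It is an illustration only, not the course's method (the module itself works with OpenRefine). The lookup table `CITY_FIXES`, the function names, and the sample values are all hypothetical.

```python
import re

# Hypothetical lookup table: known misspellings or variants -> one canonical spelling.
CITY_FIXES = {
    "philadelphia": "Philadelphia",
    "philadelpia": "Philadelphia",
    "phila.": "Philadelphia",
}

KM_TO_MILES = 0.621371


def clean_city(raw):
    """Collapse stray whitespace, ignore case, and map known variants to one spelling."""
    key = re.sub(r"\s+", " ", raw).strip().lower()
    # Fall back to title case when the value is not in the lookup table.
    return CITY_FIXES.get(key, key.title())


def to_miles(value, unit):
    """Standardize a distance to miles; refuse units we do not recognize."""
    unit = unit.strip().lower()
    if unit in ("mi", "mile", "miles"):
        return float(value)
    if unit in ("km", "kilometer", "kilometers"):
        return float(value) * KM_TO_MILES
    raise ValueError(f"Unrecognized unit: {unit!r}")


# Hypothetical messy rows: (city, distance, unit)
rows = [("  philadelpia ", "10", "km"), ("new   york", "3", " Miles ")]
cleaned = [(clean_city(city), to_miles(dist, unit)) for city, dist, unit in rows]
```

Raising an error on an unknown unit, rather than guessing, is a deliberate choice: in a large dataset it surfaces the rows that need a human decision instead of silently passing bad values through.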