This course is free but requires registration and login. Click on Module 4: Dealing with Messy Data.
Any data journalist will tell you that the hardest part of a project is cleaning the data. Cleaning means checking and correcting the information in your dataset: standardizing units, correcting misspellings, and identifying and fixing errors. Learn how to fix formatting problems, use OpenRefine, and take creative approaches to problems in massive databases.
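To make the idea concrete, here is a minimal Python sketch of two of the cleaning steps mentioned above: correcting misspellings and standardizing units. It is an illustration only, not the course's own method. The sample rows, the `CITY_FIXES` lookup, and the helper names `clean_city` and `to_kilometers` are all hypothetical.

```python
import re

# Hypothetical lookup of known misspellings and variants -> canonical names.
CITY_FIXES = {
    "new yrok": "New York",
    "nyc": "New York",
    "los angles": "Los Angeles",
}

def clean_city(raw: str) -> str:
    """Collapse stray whitespace, then fix known misspellings."""
    collapsed = re.sub(r"\s+", " ", raw).strip()
    # Fall back to title case when the value is not in the lookup.
    return CITY_FIXES.get(collapsed.lower(), collapsed.title())

def to_kilometers(raw: str) -> float:
    """Convert a distance such as '5 mi' or '8km' to kilometers."""
    match = re.fullmatch(r"\s*([\d.]+)\s*(km|mi)\s*", raw.lower())
    if match is None:
        # Surface unparseable values instead of silently guessing.
        raise ValueError(f"unrecognized distance: {raw!r}")
    value, unit = float(match.group(1)), match.group(2)
    return value * 1.609344 if unit == "mi" else value

# Made-up messy rows for illustration.
rows = [
    {"city": "  new  yrok ", "distance": "5 mi"},
    {"city": "NYC", "distance": "8km"},
    {"city": "los angles", "distance": "10 MI"},
]

cleaned = [
    {
        "city": clean_city(r["city"]),
        "distance_km": round(to_kilometers(r["distance"]), 2),
    }
    for r in rows
]

for row in cleaned:
    print(row)
```

Raising an error on values the pattern does not recognize is a deliberate choice in this sketch: in a real dataset, the rows that fail to parse are often the ones that most need a human look.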