Member Training: Data Cleaning

Data cleaning is a critical part of any data analysis. Without properly prepared data, the analysis will yield inaccurate results. Correcting errors later in the analysis adds to the time, effort, and cost of the project.

In this training, you’ll get a software-agnostic overview of the major steps required to clean data. You’ll see examples of the issues that you will encounter in cleaning data.

Key components covered in the presentation are:

  • Understanding the workflow
  • How to measure data quality
  • Outliers and missing data
  • Data dependencies
  • Eliminating duplicates
  • Altering data

Note: This training is an exclusive benefit to members of the Statistically Speaking Membership Program and part of the Stat’s Amore Trainings Series. Each Stat’s Amore Training is approximately 90 minutes long.

About the Instructor

John Williams is an R expert and biostatistician at the Columbia University Vagelos College of Physicians and Surgeons. He followed a long career in software development with an M.S. in Applied Statistics from Columbia University’s Teachers College. A lifelong musician, he led a variety of bands in the 1980s and continues to perform in New York City.

Not a Member Yet?
It’s never too early to set yourself up for successful analysis with support and training from expert statisticians. Just head over and sign up for Statistically Speaking.

You’ll get access to this training webinar, 130+ other stats trainings, and a pathway through the trainings that you need, plus expert guidance to build your statistical skill through live Q&A sessions and an ask-a-mentor forum.