- Real-world data is often messy
- Example: you come up with a thesis topic and plan out all the theory and statistical analysis you want to perform
- You find a relevant dataset, but should you trust it?
- With large datasets, it is unrealistic to go through and check every value by hand
- Plotting a few key relationships can help us determine how usable our data is
- ggplot is your best friend when it comes to EDA!
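As a minimal sketch of such a sanity check, the snippet below plots one relationship whose direction we already expect. It assumes the ggplot2 package is installed and uses the built-in `mtcars` dataset purely as a stand-in for the dataset you found; the variable names (`data`, `n_missing`, `p`) are illustrative.

```r
library(ggplot2)

# Assumption: mtcars (ships with R) stands in for the dataset you found
data <- mtcars

# Quick structural checks before plotting
print(summary(data$mpg))            # range and quartiles of one key variable
n_missing <- colSums(is.na(data))   # missing values per column
print(n_missing)

# Key relationship: heavier cars are expected to have lower fuel efficiency.
# If the plot showed the opposite, or obvious impossible values,
# that would be a reason to distrust the data.
p <- ggplot(data, aes(x = wt, y = mpg)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +
  labs(x = "Weight (1000 lbs)", y = "Miles per gallon")

print(p)
```

If the points follow the expected trend and no values look impossible, that is some evidence (not proof) that the data is usable; surprises are a cue to dig deeper before running the planned analysis.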