In statistical practice, there are many situations where best practices are clear. There are many, though, where they aren’t. The granddaddy of these practices is adjusting p-values when you make multiple comparisons. There are good reasons to do it and good reasons not to. It depends on the situation.
At the heart of the issue is a concept called the family-wise error rate (FWER): the probability of making at least one Type I error (a false positive) across an entire family of statistical tests.
A research study rarely involves just one statistical test. And running multiple tests can produce more statistically significant findings purely by chance.
After all, with the typical Type I error rate of 5% used in most tests, we allow ourselves to “get lucky” 1 in 20 times on each test. When you work out the probability of at least one Type I error across all the tests, that probability skyrockets.
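To see how quickly this compounds, here is a quick sketch. Assuming the tests are independent and each is run at α = 0.05, the family-wise error rate for m tests is 1 − (1 − α)^m:

```python
# FWER for m independent tests, each at significance level alpha:
#   FWER = 1 - (1 - alpha)^m
alpha = 0.05

for m in (1, 5, 10, 20):
    fwer = 1 - (1 - alpha) ** m
    print(f"{m:2d} tests -> FWER = {fwer:.3f}")
# ->  1 tests -> FWER = 0.050
#     5 tests -> FWER = 0.226
#    10 tests -> FWER = 0.401
#    20 tests -> FWER = 0.642
```

With just 10 tests, the chance of at least one false positive is already about 40%; the independence assumption is a simplification, but the qualitative point holds for correlated tests too.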