Spoiler alert: real data are seldom normally distributed. How does the population distribution influence the estimate of the population mean and its confidence interval?
To figure this out, we randomly draw 100 observations 100 times from three distinct populations and plot the mean and corresponding 95% confidence interval of each sample.
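The experiment described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the three populations (normal, uniform, exponential), their parameters, the helper names `mean_ci` and `coverage`, and the normal-approximation multiplier 1.96 are all choices made here, not taken from the post. Instead of plotting, the sketch counts how often the interval contains the true mean.

```python
import random
import statistics

def mean_ci(sample, z=1.96):
    """Return (mean, lower, upper) for a normal-approximation 95% interval."""
    n = len(sample)
    m = statistics.fmean(sample)
    se = statistics.stdev(sample) / n ** 0.5  # standard error of the mean
    return m, m - z * se, m + z * se

def coverage(draw, true_mean, n_obs=100, n_samples=100, seed=0):
    """Fraction of n_samples intervals that contain the true mean."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_samples):
        sample = [draw(rng) for _ in range(n_obs)]
        _, lo, hi = mean_ci(sample)
        hits += lo <= true_mean <= hi
    return hits / n_samples

# Hypothetical populations, each with true mean 10 (assumed for illustration).
populations = {
    "normal": (lambda rng: rng.gauss(10, 2), 10.0),
    "uniform": (lambda rng: rng.uniform(0, 20), 10.0),
    "exponential": (lambda rng: rng.expovariate(0.1), 10.0),
}

for name, (draw, mu) in populations.items():
    print(name, coverage(draw, mu))
```

With 100 observations per sample, the printed coverage should land near 0.95 for all three shapes; rerunning with a smaller `n_obs` is one way to see where skewed populations start to matter.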
Bootstrapping is a methodology introduced by Bradley Efron in 1979 that provides a reasonable approximation to the sampling distribution of various “difficult” statistics. Difficult statistics are those for which no mathematical theory establishes a sampling distribution.
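As a rough sketch of the idea, the snippet below builds a percentile bootstrap interval for the median, a statistic whose sampling distribution is awkward to derive analytically. The function name `bootstrap_ci`, the choice of the percentile method, the 2,000 resamples, and the simulated exponential data are all assumptions made for illustration.

```python
import random
import statistics

def bootstrap_ci(data, stat=statistics.median, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for stat(data)."""
    rng = random.Random(seed)
    n = len(data)
    # Resample with replacement, recompute the statistic, and sort the results.
    estimates = sorted(stat(rng.choices(data, k=n)) for _ in range(n_boot))
    lo = estimates[int((alpha / 2) * n_boot)]
    hi = estimates[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi

# Simulated skewed data (hypothetical; stands in for a real sample).
data_rng = random.Random(1)
data = [data_rng.expovariate(1.0) for _ in range(50)]

print(statistics.median(data), bootstrap_ci(data))
```

Passing a different function as `stat` (for example a trimmed mean) reuses the same resampling loop, which is much of the method's appeal.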
Most of the time when we plan a sample size for a data set, it’s based on obtaining reasonable statistical power for a key analysis of that data set. These power calculations figure out how big a sample you need so that a scientifically meaningful effect size will produce a sufficiently narrow confidence interval or a sufficiently small p-value.
But that’s not the only issue in sample size, and not every statistical analysis uses p-values.
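For the power-based planning mentioned above, a minimal sketch looks like this. It assumes a two-sided comparison of two group means with equal group sizes and uses the normal approximation n = 2 * ((z_alpha + z_power) / d)^2, where d is the standardized effect size; the function name `n_per_group` and the default values are illustrative choices, not from the post.

```python
from math import ceil
from statistics import NormalDist

def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate sample size per group for a two-sided, two-sample comparison."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the test
    z_power = z.inv_cdf(power)          # quantile matching the desired power
    return ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)

print(n_per_group(0.5))
```

Exact calculations based on the t distribution give slightly larger numbers, so a figure from this approximation is best treated as a lower bound.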
Ever hear this rule of thumb: “The Chi-Square test is invalid if we have fewer than 5 observations in a cell”?
I frequently hear this misunderstood and incorrect “rule.”
We all want rules of thumb even though we know they can be wrong, misleading, or misinterpreted.
Rules of Thumb are like Urban Myths or like a bad game of ‘Telephone’. The actual message gets totally distorted over time.
Any time you report estimates of parameters in a statistical analysis, it’s important to include their confidence intervals.
How confident are you that you can explain what they mean? Even those of us who have a solid understanding of confidence intervals get tripped up by the wording.
The Wording for Describing Confidence Intervals
Let’s look at an example.
One issue with using tests of significance is that black and white cut-off points such as 5 percent or 1 percent may be difficult to justify.
Significance tests on their own reveal little about the nature or magnitude of any effect to which they apply.
One way of shedding more light on those issues is to use confidence intervals. Confidence intervals can be used in univariate, bivariate and multivariate analyses and meta-analytic studies.