Spoiler alert: real data are seldom normally distributed. How does the population distribution influence the estimate of the population mean and its confidence interval?
To figure this out, we randomly draw 100 observations 100 times from three distinct populations and plot the mean and corresponding 95% confidence interval of each sample.
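Here is a minimal sketch of that kind of simulation in Python, using only the standard library. The three populations (normal, uniform, and exponential, each with a true mean of 10) are assumptions chosen for illustration and may not match the ones used in the post. The names `mean_ci` and `coverage` are hypothetical helpers. Instead of plotting, the sketch counts how many of the 100 intervals contain the true mean.

```python
import random
import statistics

random.seed(1)

N_SAMPLES, N_OBS = 100, 100
T_CRIT = 1.984  # approximate two-sided 95% t critical value for 99 df

# Assumed populations for illustration; each has a true mean of 10.
populations = {
    "normal": (lambda: random.gauss(10, 2), 10.0),
    "uniform": (lambda: random.uniform(0, 20), 10.0),
    "exponential": (lambda: random.expovariate(1 / 10), 10.0),
}

def mean_ci(sample):
    """Return the sample mean and its t-based 95% confidence limits."""
    m = statistics.mean(sample)
    sem = statistics.stdev(sample) / len(sample) ** 0.5
    return m, m - T_CRIT * sem, m + T_CRIT * sem

def coverage(draw, true_mean):
    """Share of N_SAMPLES intervals that contain the true mean."""
    hits = 0
    for _ in range(N_SAMPLES):
        sample = [draw() for _ in range(N_OBS)]
        _, lo, hi = mean_ci(sample)
        hits += lo <= true_mean <= hi
    return hits / N_SAMPLES

results = {name: coverage(draw, mu) for name, (draw, mu) in populations.items()}
for name, cov in results.items():
    print(f"{name}: {cov:.0%} of intervals contain the true mean")
```

With 100 observations per sample, you would expect coverage near 95% for each population, though the exact share varies from run to run.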
(more…)
Lest you believe that odds ratios are merely the domain of logistic regression, I’m here to tell you it’s not true.
One of the simplest ways to calculate an odds ratio is from a cross tabulation table.
We usually analyze these tables with a categorical statistical test. There are a few options, depending on the sample size and the design, but common ones are the chi-square test of independence or homogeneity and Fisher’s exact test.
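As a rough illustration, here is a Python sketch that computes an odds ratio from a 2x2 cross tabulation, along with a Wald 95% confidence interval and a Pearson chi-square test without continuity correction. The counts are made up for the example, and `odds_ratio_2x2` and `chi_square_2x2` are hypothetical helper names. Fisher’s exact test is not shown; SciPy provides it as `scipy.stats.fisher_exact` if you have that library installed.

```python
import math

def odds_ratio_2x2(a, b, c, d):
    """Odds ratio and Wald 95% CI for the table [[a, b], [c, d]]."""
    or_ = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lo = math.exp(math.log(or_) - 1.96 * se_log)
    hi = math.exp(math.log(or_) + 1.96 * se_log)
    return or_, lo, hi

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic (1 df, no correction) and p-value."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 df, the upper-tail probability equals erfc(sqrt(chi2 / 2)).
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p

# Hypothetical counts: rows are exposed / unexposed,
# columns are outcome yes / outcome no.
a, b, c, d = 30, 70, 15, 85

or_, lo, hi = odds_ratio_2x2(a, b, c, d)
chi2, p = chi_square_2x2(a, b, c, d)
print(f"Odds ratio: {or_:.2f} (95% CI {lo:.2f} to {hi:.2f})")
print(f"Chi-square: {chi2:.2f}, p = {p:.4f}")
```

The Wald interval assumes no cell is zero and that the counts are reasonably large; with sparse tables, an exact method is the safer choice.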
(more…)
What does it mean for two variables to be correlated?
Is that the same as, or different from, saying they’re associated or related?
This is the kind of question that can feel silly, but shouldn’t. It’s just a reflection of the confusing terminology used in statistics. In this case, the technical statistical term looks like, but is not exactly the same as, the way we mean it in everyday English. (more…)
Any time you report estimates of parameters in a statistical analysis, it’s important to include their confidence intervals.
How confident are you that you can explain what they mean? Even those of us who have a solid understanding of confidence intervals get tripped up by the wording.
The Wording for Describing Confidence Intervals
Let’s look at an example. (more…)
One issue with using tests of significance is that black-and-white cut-off points such as 5 percent or 1 percent may be difficult to justify.
Significance tests on their own do not shed much light on the nature or magnitude of any effect to which they apply.
One way of shedding more light on those issues is to use confidence intervals. Confidence intervals can be used in univariate, bivariate and multivariate analyses and meta-analytic studies.
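To make the contrast concrete, here is a small Python sketch comparing two hypothetical studies. Both are significant at the 5 percent level, but their confidence intervals tell very different stories about the size of the effect. The summary statistics are invented for illustration, `diff_ci` is a hypothetical helper, and the interval uses a large-sample normal approximation rather than a t distribution.

```python
from statistics import NormalDist

def diff_ci(m1, sd1, n1, m2, sd2, n2, level=0.95):
    """Difference in means, its normal-approximation CI, and a two-sided p-value."""
    diff = m1 - m2
    se = (sd1 ** 2 / n1 + sd2 ** 2 / n2) ** 0.5
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)
    p = 2 * (1 - NormalDist().cdf(abs(diff) / se))
    return diff, diff - z_crit * se, diff + z_crit * se, p

# Hypothetical summary statistics: (mean, SD, n) for each group.
big_study = diff_ci(100.5, 10, 5000, 100.0, 10, 5000)
small_study = diff_ci(108.0, 10, 40, 100.0, 10, 40)

for label, (diff, lo, hi, p) in [("Large n", big_study), ("Small n", small_study)]:
    print(f"{label}: difference = {diff:.1f}, "
          f"95% CI {lo:.2f} to {hi:.2f}, p = {p:.4f}")
```

A significance test alone would flag both results the same way. The intervals show that the first effect is precisely estimated but tiny, while the second is larger but much less certain.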
(more…)
Here’s a common situation.
Your grant application or committee requires sample size estimates. It’s not the calculations that are hard (though they can be); it’s getting the information to plug into the calculations.
Every article you read on the topic says you need to use either pilot data or a similar study as the basis for the values you enter into the software.
You have neither.
No similar studies have ever used the scale you’re using for the dependent variable.
And while you’d love to run a pilot study, it’s just not possible. There are too many practical constraints — time, money, distance, ethics.
What do you do?
(more…)