I was recently asked about when to use one- and two-tailed tests.

The long answer is: Use one-tailed tests when you have a specific hypothesis about the direction of your relationship. For example: you hypothesize that one group mean is larger than the other; you hypothesize that the correlation is positive; you hypothesize that the proportion is below .5.

The short answer is: Never use one-tailed tests.

Why?

1. Only a few statistical tests can even have one tail: z tests and t tests. So you’re severely limited. F tests, chi-square tests, etc. can’t accommodate one-tailed tests because their distributions are not symmetric. Most statistical methods, such as regression and ANOVA, are based on these tests, so you will rarely have the chance to use a one-tailed test.

2. Probably because they are rare, reviewers balk at one-tailed tests. They tend to assume that you are trying to artificially boost the power of your test. Theoretically, however, there is nothing wrong with them when the hypothesis and the statistical test are right for them.
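Point 1 rests on the fact that, for two groups, the squared t statistic equals the ANOVA F statistic. Squaring throws away the sign, so the F test has already folded both directions into its single upper tail. Here is a minimal Python sketch of that identity. The data and the function names `pooled_t` and `anova_f` are made up for illustration and are not from any particular package.

```python
from statistics import mean

def pooled_t(a, b):
    """Student's two-sample t statistic using the pooled variance."""
    na, nb = len(a), len(b)
    ma, mb = mean(a), mean(b)
    ss_a = sum((x - ma) ** 2 for x in a)
    ss_b = sum((x - mb) ** 2 for x in b)
    pooled_var = (ss_a + ss_b) / (na + nb - 2)
    return (ma - mb) / (pooled_var * (1 / na + 1 / nb)) ** 0.5

def anova_f(a, b):
    """One-way ANOVA F statistic for two groups (1 numerator df)."""
    na, nb = len(a), len(b)
    ma, mb = mean(a), mean(b)
    grand = mean(list(a) + list(b))
    ss_between = na * (ma - grand) ** 2 + nb * (mb - grand) ** 2
    ss_within = sum((x - ma) ** 2 for x in a) + sum((x - mb) ** 2 for x in b)
    return ss_between / (ss_within / (na + nb - 2))

# Hypothetical data, for illustration only.
group_a = [5.1, 4.8, 6.0, 5.5, 5.9]
group_b = [4.2, 4.9, 4.4, 5.0, 4.1]

t = pooled_t(group_a, group_b)
f = anova_f(group_a, group_b)
print("t =", t, " t squared =", t ** 2, " F =", f)

# Swapping the groups flips the sign of t but leaves F unchanged,
# which is why F cannot tell you the direction of the difference.
print("t swapped =", pooled_t(group_b, group_a),
      " F swapped =", anova_f(group_b, group_a))
```

The t statistic keeps its sign, so you can ask about one tail of its distribution. F cannot distinguish "A is larger" from "B is larger".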


I am currently working on my dissertation and one of my committee members suggested that I should have used a one-tailed test as I have a directional hypothesis, but I think that a two-tailed test is just as appropriate based on several of the reasons listed on the blog.

I was particularly intrigued by the statement that “F tests, chi-square tests, etc. can’t accommodate one-tailed tests because their distributions are not symmetric.” This would make a fine argument for not rerunning my data, and I was wondering if there is a reference or citation for that point. I have not been able to find that point in any of the stats texts that I own. Any help would be greatly appreciated!

Hi Sue,

Hmmm. I would think that texts that talk about the F-test would mention that it’s not symmetric. You could certainly use any text that states that t-squared=F.

But I’ll see if I can find something that says it directly.

But in any case, you don’t have to rerun anything, even if you weren’t using F tests. To get a one-sided p-value, just halve the two-sided p-value you have (as long as the effect is in the direction you hypothesized).

Karen
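For a symmetric test statistic such as t or z, the conversion from a two-sided to a one-sided p-value can be sketched in a few lines of Python. The function name `one_sided_p` and its arguments are hypothetical, chosen only for this illustration. Note that halving applies only when the observed statistic falls in the direction stated in H1. Otherwise the one-sided p-value is 1 minus that half.

```python
def one_sided_p(two_sided_p, statistic, alternative="greater"):
    """Convert a two-sided p-value from a symmetric test (t or z) to one-sided.

    two_sided_p : the two-sided p-value reported by the software
    statistic   : the observed t or z value (only its sign is used)
    alternative : direction stated in H1, either "greater" or "less"
    """
    if alternative not in ("greater", "less"):
        raise ValueError('alternative must be "greater" or "less"')
    if alternative == "greater":
        in_predicted_direction = statistic > 0
    else:
        in_predicted_direction = statistic < 0
    half = two_sided_p / 2
    # If the effect went the "wrong" way, the one-sided p-value is large.
    return half if in_predicted_direction else 1 - half

# Effect in the hypothesized direction: the p-value is halved.
print(one_sided_p(0.08, 1.9, alternative="greater"))
# Effect in the opposite direction: the p-value is 1 minus the half.
print(one_sided_p(0.08, -1.9, alternative="greater"))
```

This is also why the rule of thumb "divide by 2" depends on the sign of your t and on the direction of H1.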

Hi Karen,

Thank you so much for your reply and offer to try to find a source regarding the limited utility of one-tailed tests when doing ANOVAs and regressions, as well as the advice for converting two-tailed tests to one-tailed tests.

The only problem I have is that the Pearson Correlation Coefficient output from my stats consultant does not contain the p value. In order to calculate the p value for a two-tailed test, I thought it might be possible to take the df (n-2) and look up the significance level in the table of critical values of the correlation coefficient. Once I get those values, I would simply divide by 2 to get the one-tailed level of significance. Do you think that is a statistically sound procedure?

Thank you again for your assistance,

Sue
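Instead of a table lookup, the p-value for a Pearson correlation can be computed from r and n, using the standard result that t = r * sqrt((n - 2) / (1 - r^2)) follows a t distribution with n - 2 degrees of freedom when the true correlation is zero. The sketch below uses only the Python standard library and gets the tail area by numerically integrating the t density with Simpson's rule. All function names are illustrative assumptions, and it is meant for moderate values of t rather than as production code.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_two_sided_p(t, df, steps=2000):
    """Two-sided p-value: 1 - 2 * (area under the density from 0 to |t|).

    The area is approximated with Simpson's rule; steps must be even.
    """
    upper = abs(t)
    if upper == 0:
        return 1.0
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return max(0.0, 1 - 2 * area)

def pearson_p_values(r, n):
    """Return (t, two-sided p, one-sided p) for a Pearson r from n pairs.

    Assumes -1 < r < 1 and n > 2. The one-sided value is valid only when
    r has the sign predicted by the directional hypothesis.
    """
    df = n - 2
    t = r * math.sqrt(df / (1 - r * r))
    two_sided = t_two_sided_p(t, df)
    return t, two_sided, two_sided / 2

# Hypothetical example: r = 0.50 from n = 20 pairs.
t_stat, p_two, p_one = pearson_p_values(0.50, 20)
print("t =", t_stat, " two-sided p =", p_two, " one-sided p =", p_one)
```

Most statistics packages report this p-value directly, so a calculation like this is mainly useful for checking a table lookup.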

I found this issue very important. Please continue in this way.

kind regards,

nurilign.

hi dear,

How can we calculate a one-tailed p-value from a two-tailed test in SPSS?

tq,

kevin

Hi Kevin,

All you have to do is divide it by 2.

Karen

Hi,

Just wanna thank you for your post; it will save me on my exam tomorrow (y)

& in my course the p-value is not always divided by 2. It depends on the value of your t (<, >, or = 0) & on whether your H1: “value” is > or < 0. Hasard ?

Even if i’m wrong, I want to thank you again !

Very interesting. I am reading a much celebrated book in political science at the moment (The Weakness of Civil Society in Post-Communist Europe, by Marc Howard, 2003) containing regression analysis with one-tailed coefficients. This, together with the fact that they only have an N of 23 (with 5 independent variables), raises my eyebrows. The results are also quite controversial...

What would you off-hand say about that?

Hi David,

Without knowing anything else, the one-tailed tests of coefficients wouldn’t worry me too much, except for the fact that you said it’s controversial, which means perhaps the opposite result is reasonable.

The N of 23 is more of a concern. That’s pretty small. I find results like this are not bad per se. It’s fine to consider them as one possible piece of information; they are great for spurring more research. But you can’t draw any firm conclusions from them.
