Cohort and Case-Control Studies: Pros and Cons

by Karen Grace-Martin

by Annette Gerritsen, Ph.D.

Two designs commonly used in epidemiology are the cohort study and the case-control study. Both examine causal relationships between a risk factor and a disease. What is the difference between these two designs? And when should you opt for one or the other?

Cohort studies

Cohort studies begin with a group of people (a cohort) free of disease. The people in the cohort are grouped by whether or not they are exposed to a potential cause of disease. The whole cohort is followed over time to see if the development of new cases of the disease (or other outcome) differs between the groups with and without exposure.

For example, you could do a cohort study if you suspect there might be a causal relationship between the use of a certain water source and the incidence of diarrhea among children under five in a village with different water sources.

You select a group of children under five: either all children of that age in the village, a random sample taken from the population register, or, for example, children living in the same area or attending the same clinic. Then you classify them as using either the suspected water source or other water sources. After a set period, for example two weeks, you check whether the children have had diarrhea.

You can then calculate the proportion of children who developed diarrhea among those using the suspected water source and among those using other water sources (the cumulative incidence of diarrhea in each group). How to compare the cumulative incidences of the two groups, in order to conclude whether the suspected water source is a risk factor for the disease, will be discussed in a future blog post.
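The cumulative incidence calculation described above can be sketched in a few lines of Python. The counts are hypothetical and chosen only for illustration, and the function name `cumulative_incidence` is an assumption, not something from the article.

```python
# Hypothetical counts for illustration only; they are not data from the article.
def cumulative_incidence(new_cases: int, at_risk: int) -> float:
    """Proportion of an initially disease-free group that develops
    the disease during the follow-up period."""
    if at_risk <= 0:
        raise ValueError("at_risk must be a positive number of people")
    if not 0 <= new_cases <= at_risk:
        raise ValueError("new_cases must lie between 0 and at_risk")
    return new_cases / at_risk


# Children using the suspected water source: 30 of 120 had diarrhea.
exposed_ci = cumulative_incidence(new_cases=30, at_risk=120)
# Children using other water sources: 18 of 180 had diarrhea.
unexposed_ci = cumulative_incidence(new_cases=18, at_risk=180)

print(f"Suspected source: {exposed_ci:.1%}")
print(f"Other sources:    {unexposed_ci:.1%}")
```

With these made-up counts, the cumulative incidence is 25% in the exposed group and 10% in the unexposed group. Whether such a difference is meaningful is a separate question that this sketch does not address.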

Case-control studies

The same problem could also be studied in a case-control study. A case-control study begins with the selection of cases (people with a disease) and controls (people without the disease).  The controls should represent people who would have been study cases if they had developed the disease (population at risk).

Exposure to a potential cause of disease is determined for both cases and controls. The frequency of that exposure can then be calculated for each group. To return to the example, you could compare children who present at a health center with diarrhea (cases) with children who present with other complaints, for example acute respiratory infections (controls). You determine which source of drinking water they used, then calculate the proportion of cases and the proportion of controls that were exposed to the suspected water source.
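As a rough sketch of the calculation just described, the proportion exposed can be computed separately for cases and controls. The counts and the function name `proportion_exposed` are hypothetical.

```python
# Hypothetical counts for illustration only; they are not data from the article.
def proportion_exposed(exposed: int, total: int) -> float:
    """Share of a group (cases or controls) that used the suspected source."""
    if total <= 0:
        raise ValueError("total must be a positive number of people")
    if not 0 <= exposed <= total:
        raise ValueError("exposed must lie between 0 and total")
    return exposed / total


# Cases: 50 children with diarrhea, 40 of whom used the suspected source.
cases_exposed = proportion_exposed(exposed=40, total=50)
# Controls: 100 children with other complaints, 45 of whom used it.
controls_exposed = proportion_exposed(exposed=45, total=100)

print(f"Cases exposed:    {cases_exposed:.0%}")
print(f"Controls exposed: {controls_exposed:.0%}")
```

Note that the quantity computed here is exposure among cases and controls, not disease among exposed and unexposed people. Because the investigator decides how many cases and controls to include, these counts cannot be used to calculate a cumulative incidence.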

Pros and cons

On what basis do you decide to choose a cohort design or a case-control design?

Cohort studies provide the best information about the causation of disease, because you follow persons from exposure to the occurrence of the disease. With data from cohort studies you can calculate cumulative incidences, which are the most direct measurement of the risk of developing disease.

An added advantage is that you can examine a range of outcomes/diseases caused by one exposure (e.g. heart disease, lung disease, renal disease caused by smoking).

However, cohort studies are major undertakings. They may require long periods of follow-up, since disease may occur long after exposure, which makes this a very expensive study design.

Cohort studies work well for rare exposures, because you can specifically select people exposed to a certain factor. But this design does not work well for rare diseases, because you would need a very large study group to find a sufficient number of cases.
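To make the point about rare diseases concrete, here is a back-of-the-envelope sketch of how large a cohort would have to be to expect a given number of cases. The incidence figures, the target of 50 cases, and the function name `cohort_size_needed` are all assumptions for illustration. The calculation also ignores loss to follow-up and chance variation.

```python
# Illustrative arithmetic only; the incidence figures are assumed, not sourced.
def cohort_size_needed(target_cases: int, cases_per_100k: int) -> int:
    """Smallest cohort in which the expected number of new cases
    reaches target_cases, given incidence per 100,000 people over follow-up."""
    if target_cases <= 0 or cases_per_100k <= 0:
        raise ValueError("both arguments must be positive")
    numerator = target_cases * 100_000
    # Integer ceiling division avoids floating-point rounding surprises.
    return (numerator + cases_per_100k - 1) // cases_per_100k


# A common outcome: 10,000 per 100,000 (10%) during follow-up.
common_outcome = cohort_size_needed(target_cases=50, cases_per_100k=10_000)
# A rare disease: 10 per 100,000 (0.01%) during follow-up.
rare_disease = cohort_size_needed(target_cases=50, cases_per_100k=10)

print(f"Common outcome: follow about {common_outcome:,} people")
print(f"Rare disease:   follow about {rare_disease:,} people")
```

Under these assumptions, expecting 50 cases takes a cohort of 500 people for the common outcome but 500,000 for the rare disease.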

Case-control studies are relatively simple to conduct. They do not require a long follow-up period (as the disease has already developed), and are hence much cheaper. This design is especially useful for rare diseases (as you select the cases yourself), but not for rare exposures (as too few of the people in your study are likely to have been exposed). It is also very suitable for diseases with a long latent period, such as cancer.

However, case-control studies are less adept at showing a causal relationship than cohort studies. They are more prone to bias.

One example is recall bias: cases might recall certain exposures more clearly than controls, simply because they have thought about what could have caused their disease.
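A small numeric sketch can show how recall bias distorts the comparison. All numbers are assumed: the true exposure is set to be identical in cases and controls, and only the probability of remembering the exposure differs. The sketch also assumes that unexposed people never report an exposure by mistake.

```python
# All numbers are assumed for illustration; none come from the article.
def reported_exposure(true_proportion: float, recall_probability: float) -> float:
    """Proportion of a group reporting exposure when only a fraction
    of the truly exposed people remember it."""
    if not 0 <= true_proportion <= 1 or not 0 <= recall_probability <= 1:
        raise ValueError("both arguments must lie between 0 and 1")
    return true_proportion * recall_probability


TRUE_EXPOSURE = 0.30  # same in cases and controls, so no real association

# Cases have thought about possible causes and recall 90% of exposures.
cases_reported = reported_exposure(TRUE_EXPOSURE, recall_probability=0.90)
# Controls recall only 60% of the same exposures.
controls_reported = reported_exposure(TRUE_EXPOSURE, recall_probability=0.60)

print(f"Cases reporting exposure:    {cases_reported:.0%}")
print(f"Controls reporting exposure: {controls_reported:.0%}")
```

In this sketch the reported exposure is 27% among cases and 18% among controls, even though the true exposure is 30% in both groups. The apparent difference comes entirely from differential recall.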

Next time, an article will show how cross-tabulations are calculated, used, and interpreted in cohort and case-control studies.

About the Author: With expertise in epidemiology, biostatistics and quantitative research projects, Annette Gerritsen, Ph.D. provides services to her clients, focussing on the methodological soundness of each phase of an epidemiological study to ensure valid answers to the proposed research questions. She is the founder of Epi Result.


{ 15 comments… read them below or add one }

Namusubo sarah

So clear


Navdeep kaur dosanjh

what would be best study design to find out causal relationship between lung cancer and smoking?


Redwood middle school student

Super clear love it I’m only in eighth grade. I’m doing science Olympiad and this really helped me for disease detectives!!


a student

The article is very easy to understand. Thank you very much!!!!



well understood thanks



This was so helpful! Clear and concise.



the article is well simplified for easy understanding. Thanks.



Well written, simple and clear explanation. I m no longer confused now!



Dear Annette Gerritsen, could you please explain the difference of interpretation between RR in cohort study and OR in case control study? I have read some books but it’s still not clear.

Best regards,

Sarumpaet Sori Muda



The article is clear,well written and easy to understand


Paul Jiya

Wonderful work!!! Very clear than wat we were taught n class…….


Rose zawadi

This article helped me much,it is self explanatory
Thank you.



The article is clear. However, I would like you to help me on the methodology of a study that I would like to do.

I would like to do an intervention whereby I was thinking that data in the registers before the intervention would be used as a baseline and after the intervention would compare the treatment outcomes for baseline and end-line.

May you help me with how best I can put this idea on paper?



sadia habib

This article is really well written. It helped me a lot to solve my confusion. Thanks


Gideon Quarshie

This article is well written out. It helped me answer questions on mid term take home exams. Thank you very much.

