Cohort and Case-Control Studies: Pros and Cons

Two designs commonly used in epidemiology are the cohort study and the case-control study. Both examine the causal relationship between a risk factor and a disease. What is the difference between these two designs? And when should you choose one over the other?

Cohort studies

Cohort studies begin with a group of people (a cohort) free of disease. The people in the cohort are grouped by whether or not they are exposed to a potential cause of disease. The whole cohort is followed over time to see if the development of new cases of the disease (or other outcome) differs between the groups with and without exposure.

For example, you could do a cohort study if you suspect there might be a causal relationship between the use of a certain water source and the incidence of diarrhea among children under five in a village with different water sources.

You select a group of children under five: all children of that age in the village, a random sample taken from the population register, or, for example, children living in the same area or attending the same clinic. You then classify them as using either the suspected water source or another water source. After a follow-up period of, say, two weeks, you check whether the children have had diarrhea.

You can then calculate the proportion of children who developed diarrhea among those using the suspected water source and among those using other water sources (the cumulative incidence of diarrhea in each group).
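As a minimal sketch of that calculation, the Python snippet below computes the cumulative incidence in each group and compares the two. The counts are invented for illustration, and the function name `cumulative_incidence` is a hypothetical helper, not part of any library. The risk ratio and risk difference are standard ways to compare two cumulative incidences, although the example above does not name them.

```python
def cumulative_incidence(new_cases: int, at_risk: int) -> float:
    """Proportion of an initially disease-free group that develops the disease during follow-up."""
    if at_risk <= 0:
        raise ValueError("at_risk must be a positive count")
    if not 0 <= new_cases <= at_risk:
        raise ValueError("new_cases must be between 0 and at_risk")
    return new_cases / at_risk


# Hypothetical counts after two weeks of follow-up (not real data).
ci_suspected = cumulative_incidence(new_cases=30, at_risk=100)  # suspected water source
ci_other = cumulative_incidence(new_cases=10, at_risk=100)      # other water sources

# Two common ways to compare the groups.
risk_ratio = ci_suspected / ci_other
risk_difference = ci_suspected - ci_other

print(f"Cumulative incidence, suspected source: {ci_suspected:.2f}")
print(f"Cumulative incidence, other sources:    {ci_other:.2f}")
print(f"Risk ratio:      {risk_ratio:.2f}")
print(f"Risk difference: {risk_difference:.2f}")
```

With these made-up numbers, children using the suspected source would be about three times as likely to develop diarrhea during follow-up as children using other sources.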

Case-control studies

The same problem could also be studied in a case-control study. A case-control study begins with the selection of cases (people with a disease) and controls (people without the disease). The controls should represent people who would have been included as cases if they had developed the disease (the population at risk).

Exposure to the potential cause of disease is then determined for both cases and controls, so that you can calculate how common the exposure is in each group. To come back to the example, you may compare children who present at a health center with diarrhea (cases) with children who present with other complaints, for example acute respiratory infections (controls). You determine which source of drinking water each child had used, and then calculate the proportion of cases and the proportion of controls that were exposed to the suspected water source.
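The comparison can be sketched in a few lines of Python. The counts below are invented, and `odds_ratio` is a hypothetical helper written for this illustration. The odds ratio itself is not discussed in the text above; it is shown here as an assumption about how the two exposure proportions are usually summarized in a case-control study, where cumulative incidence cannot be calculated directly.

```python
def odds_ratio(cases_exposed: int, cases_unexposed: int,
               controls_exposed: int, controls_unexposed: int) -> float:
    """Odds of exposure among cases divided by odds of exposure among controls."""
    if cases_unexposed == 0 or controls_exposed == 0:
        raise ValueError("odds ratio is undefined when a denominator cell is zero")
    return (cases_exposed * controls_unexposed) / (cases_unexposed * controls_exposed)


# Hypothetical health-center data (not real data).
cases_exposed, cases_unexposed = 40, 10        # children with diarrhea
controls_exposed, controls_unexposed = 20, 30  # children with other complaints

# Proportion exposed to the suspected water source in each group.
prop_cases = cases_exposed / (cases_exposed + cases_unexposed)
prop_controls = controls_exposed / (controls_exposed + controls_unexposed)

or_estimate = odds_ratio(cases_exposed, cases_unexposed,
                         controls_exposed, controls_unexposed)

print(f"Exposed among cases:    {prop_cases:.2f}")
print(f"Exposed among controls: {prop_controls:.2f}")
print(f"Odds ratio:             {or_estimate:.2f}")
```

In this made-up example, 80% of cases but only 40% of controls used the suspected source, which corresponds to an odds ratio of 6.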

Pros and cons

On what basis do you decide to choose a cohort design or a case-control design?

Cohort studies provide the best information about the causation of disease because you follow persons from exposure to the occurrence of the disease. With data from cohort studies you can calculate cumulative incidences. Cumulative incidences are the most direct measurement of the risk of developing disease.

An added advantage is that you can examine a range of outcomes caused by one exposure, for example heart disease, lung disease, and renal disease caused by smoking.

However, cohort studies are major undertakings. They may require long periods of follow-up, since disease may occur a long time after exposure. This makes them a very expensive study design.

Cohort studies work well for rare exposures, because you can specifically select people exposed to a certain factor. But this design does not work well for rare diseases: you would need a very large study group to find enough cases.
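To get a feel for why rare diseases are a problem, here is a rough back-of-the-envelope sketch in Python. It only asks how many people you would need to follow in order to expect a given number of cases. The function name and the figures are assumptions made for illustration, and the sketch ignores statistical power and loss to follow-up, so it is not a sample size calculation.

```python
def cohort_size_for_expected_cases(target_cases: int, cases_per_100k: int) -> int:
    """Smallest cohort in which the expected number of new cases reaches target_cases.

    cases_per_100k is the assumed cumulative incidence over the follow-up
    period, expressed as new cases per 100,000 people.
    """
    if target_cases <= 0:
        raise ValueError("target_cases must be positive")
    if not 0 < cases_per_100k <= 100_000:
        raise ValueError("cases_per_100k must be between 1 and 100,000")
    # Integer ceiling division avoids floating-point rounding surprises.
    return (target_cases * 100_000 + cases_per_100k - 1) // cases_per_100k


# Hypothetical incidences (not real data).
common_disease = cohort_size_for_expected_cases(target_cases=50, cases_per_100k=10_000)
rare_disease = cohort_size_for_expected_cases(target_cases=50, cases_per_100k=10)

print(f"Common disease (10% incidence): follow about {common_disease:,} people")
print(f"Rare disease (0.01% incidence): follow about {rare_disease:,} people")
```

Under these assumptions, expecting 50 cases takes a cohort of 500 people for the common disease but 500,000 for the rare one.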

Case-control studies are relatively simple to conduct. They do not require a long follow-up period (as the disease has already developed) and are hence much cheaper. This design is especially useful for rare diseases (as you select the cases yourself), but not for rare exposures (as you will probably not find enough exposed people among your cases and controls). It is also well suited to diseases with a long latent period, such as cancer.

However, case-control studies are less adept at showing a causal relationship than cohort studies. They are more prone to bias.

One example is recall bias: cases might recall certain exposures more clearly than controls, simply because they have thought about what could have caused their disease.

by Annette Gerritsen, Ph.D.

About the Author: With expertise in epidemiology, biostatistics and quantitative research projects, Annette Gerritsen, Ph.D. provides services to her clients focussing on the methodological soundness of each phase of an epidemiological study to ensure getting valid answers to the proposed research questions. She is the founder of Epi Result.


Reader Interactions


  1. Navdeep kaur dosanjh says

    what would be best study design to find out causal relationship between lung cancer and smoking?

  2. Redwood middle school student says

    Super clear love it I’m only in eighth grade. I’m doing science Olympiad and this really helped me for disease detectives!!

  3. Sarumpaet says

    Dear Annette Gerritsen, could you please explain the difference of interpretation between RR in cohort study and OR in case control study? I have read some books but it’s still not clear.

    Best regards,

    Sarumpaet Sori Muda

  4. Chetami says

    The article is clear. However, I would like you to help me on the methodology of a study that I would like to do.

    I would like to do an intervention whereby I was thinking that data in the registers before the intervention would be used as a baseline and after the intervention would compare the treatment outcomes for baseline and end-line.

    May you help me with how best can I can put this idea on paper?


  5. Gideon Quarshie says

    This article is well written out. It helped me answer questions on mid term take home exams. Thank you very much.
