Introduction
Coffee is among the most commonly consumed beverages globally(1). Roasted coffee has several biologically active compounds including caffeine, flavonoids, lignans, cafestol and other polyphenols(Reference Bøhn, Blomhoff and Paur2). In particular, caffeine acts as a central nervous system stimulant and has short-term effects on cognitive functioning, heart rate, alertness, sleep regulation and emotional processing(Reference de Mejia and Ramirez-Mares3). However, the potential long-term effects of its habitual consumption are not fully understood. In observational phenotypic studies, low-to-moderate levels of regular coffee consumption have been associated with a lower risk of dementia(Reference Santos, Costa and Santos4), cardiovascular disease(Reference Ding, Bhupathiraju and Satija5,Reference Stevens, Linstead and Hall6), type 2 diabetes mellitus(Reference Ding, Bhupathiraju and Chen7), Parkinson’s disease(Reference Hong, Chan and Bai8) and all-cause and cancer mortality(Reference Torres-Collado, Compañ-Gabucio and González-Palacios9). Conversely, high intakes have been associated with harmful long-term effects: high coffee consumption was found to be associated with increased risk of dementia(Reference Pham, Mulugeta and Zhou10) and cardiovascular disease(Reference Zhou and Hyppönen11).
Mendelian randomisation (MR) studies lie at the interface between observational and interventional research methods, allowing the estimation of causal effects using observational data(Reference Davies, Holmes and Davey Smith12). This statistical approach relies on the use of genetic variants associated with the exposure of interest (coffee) to act as proxy markers or instruments, and overall, must comply with three core assumptions (Fig. 1). Since genetic variants are randomly assigned at conception, MR is less susceptible to unmeasured confounding and reverse causality than conventional observational analyses. The variants can be selected on the basis of candidate genes known to affect the exposure or using results from genome-wide association studies (GWAS)(Reference Swerdlow, Kuchenbaecker and Shah13). In recent years, the use of the MR method has increased in popularity, with many papers utilising the availability of large-scale cohort data and GWAS(Reference Grover, Del Greco and König14). There have been several recent MR studies on coffee, spanning a broad range of health outcomes.

Fig. 1. Diagram explaining the three core assumptions of Mendelian randomisation studies. (1) Relevance assumption: the genetic variant(s) are associated with the exposure of interest. (2) Independence assumption: the genetic variant(s) are not associated with confounding factors associated with the exposure and outcome. (3) Exclusion restriction assumption: the genetic variant(s) are only associated with the outcome through the exposure of interest. Created with BioRender.com.
In this systematic review, we aimed to map the available MR studies examining the role of coffee consumption on health outcomes, and to evaluate the certainty and robustness of the evidence. Consolidating these data allows us to summarise the potential benefits and harms of habitual coffee consumption on health, and will help to guide and inform future research, policy-makers and the public.
Materials and methods
Protocol and registration
This systematic review was conducted in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analysis (PRISMA) 2020 guidelines, which update the original 2009 statement(Reference Moher, Liberati and Tetzlaff15,Reference Page, McKenzie and Bossuyt16). The protocol was registered at the International Prospective Register of Systematic Reviews (PROSPERO) under ID CRD42021295323 on 9 December 2021.
This study is a review of previously published studies and does not involve the collection of original data from human or animal subjects. All data were sourced from publicly available studies and hence, no ethical approval was required.
Search strategy and data sources
We searched the PubMed, Embase, Cochrane Database of Systematic Reviews and Cumulative Index to Nursing and Allied Health Literature (CINAHL) databases and two preprint repositories (bioRxiv and medRxiv) from inception to 30 September 2022. We included the search terms ‘Mendelian’ OR ‘Mendelian randomization’, ‘Genetic instrument’ OR ‘instrumental variable’ and ‘Coffee’ OR ‘caffeine’, as both MeSH terms and keywords. We applied truncation and wildcard symbols to account for different variations, spelling and plurals of each term. Preprint repositories were searched using the medrxivr R package(Reference McGuinness and Schmidt17). A summary of the search queries used for each database is provided in Supplementary Table 1.
Eligibility criteria
The criteria for inclusion and exclusion of studies were based on the Population, Exposure/Intervention, Comparison, Outcomes and Study (PECOS) design framework, as described in Table 1. Two reviewers (K.P. and N.A.K.) independently screened the articles using Covidence(18) and any conflicts were resolved by a third reviewer (E.H.). The study selection process was documented using a PRISMA flow diagram template.
Table 1. PECOS criteria for inclusion of studies

Data extraction
In the data extraction stage, two reviewers (K.P. and N.A.K.) independently extracted key data using a custom template on Covidence. When any inconsistencies arose, a consensus was reached through discussion. For studies that included other analysis methods (for example, phenotypic analyses), only data relating to the MR analysis were extracted. The minimum data extracted included the title of the study, authors, year of publication, MR design, description of the exposure and outcome populations, description of the genetic instrument and effect estimates for at least one MR method. For most studies, inverse variance weighted MR was considered the main analysis. We also collected information on statistical power, replication cohorts, multiple testing corrections, statistical heterogeneity and sensitivity/subgroup analyses.
Where multiple outcomes were investigated in a single study, each outcome association was assessed independently to determine whether it met the inclusion criteria before extraction. In any studies that included results from multiple cohorts of the same ethnic group, we presented the pooled results or selected the analysis with the highest number of single nucleotide polymorphisms (SNPs), largest outcome sample size or the main analysis as specified by the author. After data extraction, we further excluded studies that had overlapping outcome study samples. We chose to include the study with the largest sample size, or if sample sizes were similar, we chose the study with the most robust method of sensitivity analysis.
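For readers unfamiliar with the inverse variance weighted (IVW) method referred to above, the following Python sketch shows one common fixed-effect form of the estimator: per-SNP Wald ratios combined with inverse-variance weights. It is an illustration only and is not taken from any included study; the function names and the summary statistics in the usage lines are hypothetical, and the standard errors use a first-order approximation that ignores uncertainty in the SNP-exposure estimates.

```python
import math


def wald_ratios(beta_exp, beta_out, se_out):
    """Per-SNP Wald ratio (SNP-outcome / SNP-exposure) and its first-order SE."""
    ratios = [bo / be for be, bo in zip(beta_exp, beta_out)]
    # First-order approximation: ignores uncertainty in the SNP-exposure estimate.
    ses = [so / abs(be) for be, so in zip(beta_exp, se_out)]
    return ratios, ses


def ivw_estimate(beta_exp, beta_out, se_out):
    """Fixed-effect inverse variance weighted estimate and its standard error."""
    ratios, ses = wald_ratios(beta_exp, beta_out, se_out)
    weights = [1.0 / s ** 2 for s in ses]
    estimate = sum(w * r for w, r in zip(weights, ratios)) / sum(weights)
    se = math.sqrt(1.0 / sum(weights))
    return estimate, se


# Hypothetical summary statistics for three SNPs (not from any included study).
beta_exposure = [0.10, 0.20, 0.05]   # SNP effect on coffee intake
beta_outcome = [0.02, 0.04, 0.01]    # SNP effect on the outcome
se_outcome = [0.01, 0.01, 0.01]      # SE of the SNP-outcome effect

est, se = ivw_estimate(beta_exposure, beta_outcome, se_outcome)
print(f"IVW estimate {est:.3f} (SE {se:.3f})")
```

Pleiotropy-robust methods such as MR-Egger or the weighted median relax different assumptions and are not covered by this sketch.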
Meta-analysis
For any outcomes that had reported estimates in more than one non-overlapping sample, we undertook a meta-analysis of the results using the Stata ‘metan’ command to provide a pooled estimate and presented them using forest plots. We did not meta-analyse outcomes for which all studies reported null findings. Studies were also considered ineligible for meta-analysis if the SNP-exposure estimates were expressed in different units (for example, cups/d and % increase in coffee) and conversion of the estimates was not possible given the available source information. In these cases, pooled estimates were shown separately for different units of coffee.
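The pooling step can be illustrated with a short Python sketch of inverse-variance meta-analysis, including Cochran's Q, the I² statistic and a DerSimonian-Laird random-effects option. This is a generic sketch and not a reproduction of the Stata 'metan' analysis: it assumes estimates are pooled on the log odds ratio scale with standard errors recovered from 95% confidence intervals, which may differ from the options actually used, and all function names are hypothetical.

```python
import math


def se_from_ci(lower, upper, z=1.96):
    """Recover the SE of a log odds ratio from the bounds of its 95% CI."""
    return (math.log(upper) - math.log(lower)) / (2 * z)


def pool(estimates, ses, random_effects=False):
    """Inverse-variance pooled estimate, its SE and I² (as a percentage)."""
    weights = [1.0 / s ** 2 for s in ses]
    fixed = sum(w * e for w, e in zip(weights, estimates)) / sum(weights)
    # Cochran's Q measures dispersion of study estimates around the fixed-effect mean.
    q = sum(w * (e - fixed) ** 2 for w, e in zip(weights, estimates))
    df = len(estimates) - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    if random_effects:
        # DerSimonian-Laird estimate of the between-study variance tau².
        c = sum(weights) - sum(w ** 2 for w in weights) / sum(weights)
        tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
        weights = [1.0 / (s ** 2 + tau2) for s in ses]
    pooled = sum(w * e for w, e in zip(weights, estimates)) / sum(weights)
    se = math.sqrt(1.0 / sum(weights))
    return pooled, se, i2


# Hypothetical log odds ratios from two non-overlapping samples.
log_or, se, i2 = pool([0.1, 0.3], [0.1, 0.1], random_effects=True)
print(f"pooled OR {math.exp(log_or):.2f}, I2 {i2:.0f}%")
```

With only two or three studies per outcome, as in most of the meta-analyses reported here, I² is imprecise and should be read cautiously.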
Evaluating certainty of evidence and robustness of the associations
To assess the certainty of evidence, we applied a modified version of the Grading of Recommendations, Assessment, Development and Evaluations (GRADE) rating system(Reference Guyatt, Oxman and Vist19). Studies were ranked as high, moderate, low or very low certainty to describe how likely it was that the reported estimate was similar to the true effect. MR studies start as high certainty and can be rated down on the basis of risk of bias, imprecision, inconsistency, indirectness and publication bias. Certainty can be rated up for a large magnitude of effect, when a dose–response gradient is present and when the effect of any residual confounding would increase the magnitude of the effect (suggesting an underestimate of the effect estimate). We adapted the domains to be relevant for MR studies and created a checklist to improve ease and consistency of use(Reference Meader, King and Llewellyn20). A full description of the domains assessed in this study is given in Supplementary Table 2. Each included outcome was assessed using the GRADE rating system and reported individually. An overall study rating was also given, by taking the lowest quality of evidence rating from all outcomes. To aid with assessing whether pleiotropy was adequately addressed in each study, we summarised the potential pleiotropic associations using PhenoScanner V2 for coffee SNPs reported in the Coffee and Caffeine Genetics Consortium and UK Biobank GWAS and their proxies (r² > 0·8) (Supplementary Table 3)(Reference Cornelis, Byrne and Esko21–Reference Kamat, Blackshaw and Young23). We first checked associations significant at the genome-wide significance level (p < 5 × 10⁻⁸), then checked for any additional associations significant at p < 1 × 10⁻⁵.
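The rating logic described above can be sketched in a few lines of Python. The sketch assumes that each domain of concern moves the rating by exactly one level, which is a simplification introduced here for illustration; the actual modified criteria are those in Supplementary Table 2, and the function names are hypothetical.

```python
# Certainty levels ordered from lowest to highest.
LEVELS = ["very low", "low", "moderate", "high"]


def rate_outcome(n_rated_down, n_rated_up=0):
    """Start at 'high', move one level per domain (an assumption), clamp to the scale."""
    index = len(LEVELS) - 1 - n_rated_down + n_rated_up
    return LEVELS[max(0, min(index, len(LEVELS) - 1))]


def overall_study_rating(outcome_ratings):
    """Overall study rating is the lowest rating among the study's outcomes."""
    return min(outcome_ratings, key=LEVELS.index)


# Hypothetical study with three outcomes, one rated down in a single domain.
ratings = [rate_outcome(0), rate_outcome(1), rate_outcome(0)]
print(overall_study_rating(ratings))
```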
Robustness of the associations was assessed according to a ranking system previously established by Markozannes and colleagues(Reference Markozannes, Kanellopoulou and Dimopoulou24). The system ranks MR associations as robust, probable, suggestive or insufficient evidence for causality on the basis of the evidence provided by the main MR analysis and at least one sensitivity method (MR-Egger, weighted median, weighted mode, MR-PRESSO or multivariable MR). When statistical heterogeneity was detected, we considered the random effects model as the main analysis and did not include the fixed effects model in the assessment of robustness. A ‘robust’ classification requires that all methods are statistically significant, and the direction of effects must be consistent. Both ‘probable’ and ‘suggestive’ evidence must have at least one method that is statistically significant – when the direction of effects was consistent, the association was categorised as probable, and when the direction of effects was inconsistent, it was categorised as suggestive. In studies that applied multiple testing correction methods, the corrected p-value was used. We ranked the association as ‘insufficient’ if all methods had statistically non-significant p-values, low statistical power or wide confidence intervals. Studies that did not present any sensitivity analyses were assigned a ‘non-evaluable’ ranking.
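The ranking rules in the preceding paragraph can be expressed as a small decision function. The Python sketch below is our reading of those rules rather than code from Markozannes and colleagues: the 0·05 significance threshold, the class and function names and the example values are assumptions, and the low-power and wide-confidence-interval criteria for an 'insufficient' ranking are not modelled.

```python
from dataclasses import dataclass


@dataclass
class MethodResult:
    name: str        # e.g. "IVW", "MR-Egger", "weighted median"
    beta: float      # effect estimate (log OR or beta)
    p_value: float   # corrected p-value where multiple testing correction was applied


def classify(main, sensitivity, alpha=0.05):
    """Rank an MR association from its main and sensitivity analyses."""
    if not sensitivity:
        return "non-evaluable"
    results = [main, *sensitivity]
    significant = [r.p_value < alpha for r in results]
    # An estimate of exactly zero is treated as inconsistent in direction.
    consistent = all(r.beta > 0 for r in results) or all(r.beta < 0 for r in results)
    if not any(significant):
        return "insufficient"
    if not consistent:
        return "suggestive"
    return "robust" if all(significant) else "probable"


# Hypothetical association: significant main analysis, one non-significant
# sensitivity analysis, all estimates in the same direction.
main = MethodResult("IVW", -0.25, 0.003)
checks = [MethodResult("MR-Egger", -0.10, 0.40),
          MethodResult("weighted median", -0.22, 0.01)]
print(classify(main, checks))
```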
Results
Study selection
The search yielded a total of 462 studies, 163 of which were excluded owing to duplication (Fig. 2). We screened 299 articles in the title and abstract screening phase and excluded 201 that did not meet the inclusion criteria. A further thirty articles were excluded in the full-text screening phase. We extracted data from sixty-seven studies, which contained analyses of 241 outcome associations. After data extraction, we excluded forty-four outcome associations owing to overlapping outcome sample populations from fourteen studies. However, because some of these studies had other outcomes contributing to the review, the process resulted in the exclusion of only eight out of the fourteen studies. Details on excluded duplicate outcomes are described in Supplementary Table 4. Overall, we have presented results for fifty-nine studies, covering 197 outcomes (of those, there are 160 unique outcomes).

Fig. 2. PRISMA flow diagram summarising the identification, screening and eligibility assessment for studies included in this review.
Description of the study design and data sources
Most of the included studies used a two-sample MR design (84·7%, fifty studies), while only nine studies (15·3%) used a one-sample design (Table 2). The earliest study included in the review was published in 2015; however, nearly two-thirds were published in 2021 or 2022 (66·1%, thirty-nine studies). The UK Biobank (UKB) and the Coffee and Caffeine Genetics Consortium (CCGC) were the most common data sources for the exposure population, featuring in thirty-seven (62·7%) and fifteen (25·4%) studies, respectively. The outcome population data sources were more varied; however, population ancestry was mostly European. The studies similarly utilised large cohort databases such as the UK Biobank, FinnGen, PRACTICAL consortium, DIAGRAM consortium and GIANT consortium. The outcomes spanned a broad range of health outcomes, including cardiovascular traits, neurodegenerative diseases, metabolic disease, cancer and mortality.
Table 2. Summary of the characteristics of fifty-nine Mendelian randomisation studies on coffee consumption included in this review

OSMR, one-sample Mendelian randomisation study; TSMR, two-sample Mendelian randomisation study.
*At least 1 method of formal pleiotropy assessment was performed (for example, MR-Egger intercept test, MR-PRESSO outlier test and leave-one-out analysis).
Description of the instrument selection
Although the genetic instruments were selected from similar GWAS or consortia, each study applied its own set of inclusion criteria for the SNPs. The median number of SNPs used was eleven (Table 2). In a majority of studies, all SNPs were associated with coffee consumption at a genome-wide significance level (p < 5 × 10⁻⁸) and the clumping threshold was set to r² < 0·001 or r² < 0·01. Instrumental variable (IV) exposure estimates, where reported, were adjusted for at least age and sex, with most studies also adjusting for BMI, typical food intake, SNP array and 10–20 principal components (data not shown).
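To make the two selection criteria concrete, the Python sketch below applies a genome-wide significance filter followed by a greedy linkage disequilibrium (LD) clumping step. It is a simplified illustration and not the pipeline of any included study: the SNP identifiers and r² values are hypothetical, SNP pairs without a supplied r² are assumed to be independent, and the physical distance window that clumping tools normally apply is omitted.

```python
def select_instruments(p_values, r2, p_threshold=5e-8, r2_threshold=0.001):
    """Greedy clumping: keep the most significant SNP, drop SNPs correlated with it.

    p_values: dict mapping SNP id -> p-value for association with the exposure.
    r2: dict mapping frozenset({snp_a, snp_b}) -> pairwise LD r².
    """
    # Keep genome-wide significant SNPs, ordered from most to least significant.
    candidates = sorted((p, snp) for snp, p in p_values.items() if p < p_threshold)
    kept = []
    for _, snp in candidates:
        # Missing pairs are assumed independent (r² = 0), an assumption of this sketch.
        if all(r2.get(frozenset((snp, k)), 0.0) < r2_threshold for k in kept):
            kept.append(snp)
    return kept


# Hypothetical SNPs: snpB is in strong LD with snpA, snpD is not significant.
p_values = {"snpA": 1e-20, "snpB": 1e-10, "snpC": 1e-9, "snpD": 1e-5}
ld = {frozenset(("snpA", "snpB")): 0.8}
print(select_instruments(p_values, ld))
```

A stricter r² threshold yields fewer, more independent instruments, which is one reason the number of SNPs differed between studies drawing on the same GWAS.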
Assessment of potential pleiotropy
From the total 197 outcome associations, 134 (68·0%) included more than one MR analytical approach, with 130 (66·0%) of those analyses including two or more pleiotropy robust methods (Tables 2–9). In addition, fifty-one of fifty-nine included studies (86·4%) conducted at least one method of formal pleiotropy assessment (MR-Egger test, MR-PRESSO outlier tests or leave-one-out analyses) and only eight studies reported no formal pleiotropy assessment (Table 2).
Table 3. Summary of MR studies related to cardiovascular traits

↑ Positive association (main analysis); ↓ negative association (main analysis); − null association (main analysis).
MR-E, MR-Egger; WM, weighted median; WMode, weighted mode; MR-P, MR-PRESSO; MVMR, multivariable MR; O, other method; UKB, UK Biobank; CARDIoGRAMplusC4, Coronary Artery Disease Genome-wide Replication and Meta-analysis + Coronary Artery Disease (C4D) Genetics consortia; MVP, Million Veteran Program; HERMES, Heart failure Molecular Epidemiology for Therapeutic targetS; AFGen, Atrial Fibrillation Genetics; ISGC, International Stroke Genetics Consortium; CGPS, Copenhagen General Population Study; CCHS, Copenhagen City Heart Study; DIAGRAM, DIAbetes Genetics Replication And Meta-analysis.
Table 4. Summary of MR studies related to serum lipids

↑ Positive association (main analysis); ↓ negative association (main analysis); − null association (main analysis).
MR-E, MR-Egger; WM, weighted median; WMode, weighted mode; MR-P, MR-PRESSO; MVMR, multivariable MR; O, other method; UKB, UK Biobank; DIAGRAM, DIAbetes Genetics Replication And Meta-analysis; GLGC, Global Lipids Genetics Consortium.
Table 5. Summary of MR studies related to neurological diseases and brain morphology

↑ Positive association (main analysis); ↓ negative association (main analysis); − null association (main analysis).
MR-E, MR-Egger; WM, weighted median; WMode, weighted mode; MR-P, MR-PRESSO; MVMR, multivariable MR; O, other method; CGPS, Copenhagen General Population Study; CCHS, Copenhagen City Heart Study; IGAP, International Genomics of Alzheimer’s Project; SSGAC, Social Science Genetic Association Consortium; GeM-HD, Genetic Modifiers of Huntington’s Disease; Courage-PD, Comprehensive Unbiased Risk Factor Assessment for Genetics and Environment in Parkinson’s Disease; IPDGC, International Parkinson Disease Genomics Consortium; ILAE, International League Against Epilepsy; iPSYCH, Integrative Psychiatric Research; PGC, Psychiatric Genomics Consortium; UKB, UK Biobank; IHGC, International Headache Genetics Consortium; CHARGE, Cohorts for Heart and Aging Research in Genomic Epidemiology.
Table 6. Summary of MR studies related to cancer and neoplasms

↑ Positive association (main analysis); ↓ negative association (main analysis); − null association (main analysis).
MR-E, MR-Egger; WM, weighted median; WMode, weighted mode; MR-P, MR-PRESSO; MVMR, multivariable MR; O, other method; UKB, UK Biobank; BCAC, Breast Cancer Association Consortium; IARC, International Academic and Research Consortium; PRACTICAL, Prostate Cancer Association Group to Investigate Cancer Associated Alterations in the Genome.
Table 7. Summary of MR studies related to metabolic diseases

↑ Positive association (main analysis); ↓ negative association (main analysis); − null association (main analysis).
MR-E, MR-Egger; WM, weighted median; WMode, weighted mode; MR-P, MR-PRESSO; MVMR, multivariable MR; O, other method; MAGIC, Meta-Analyses of Glucose and Insulin-related traits Consortium; UKB, UK Biobank; CGPS, Copenhagen General Population Study; CCHS, Copenhagen City Heart Study; DIAGRAM, DIAbetes Genetics Replication And Meta-analysis; GIANT, Genetic Investigation of ANthropometric Traits.
Table 8. Summary of MR studies related to autoimmune and inflammatory diseases

↑ Positive association (main analysis); ↓ negative association (main analysis); − null association (main analysis).
MR-E, MR-Egger; WM, weighted median; WMode, weighted mode; MR-P, MR-PRESSO; MVMR, multivariable MR; O, other method; eBMD, estimated bone mineral density; IMSGC, International Multiple Sclerosis Genetics Consortium; UKB, UK Biobank; arcOGEN, Arthritis Research UK Osteoarthritis Genetics; GEFOS, GEnetic Factors for OSteoporosis; GUGC, Global Urate Genetics Consortium.
Table 9. Summary of MR studies related to the digestive system and renal system

↑ Positive association (main analysis); ↓ negative association (main analysis); − null association (main analysis).
MR-E, MR-Egger; WM, weighted median; WMode, weighted mode; MR-P, MR-PRESSO; MVMR, multivariable MR; O, other method; UKB, UK Biobank; QSkin, QSkin Sun and Health Study; UKIBDGC, UK Inflammatory Bowel Disease Genetics Consortium; CKDGen, Chronic Kidney Disease Genetics.
For most outcomes, the associations were similar across different pleiotropy robust methods; however, screening of the commonly used coffee SNPs and their proxies on PhenoScanner highlighted several potentially pleiotropic SNPs that should be considered when assessing the MR associations (Supplementary Table 3). SNP rs1260326 (GCKR) was the most pleiotropic and was reported to be associated (p < 5 × 10⁻⁸) with serum lipid measures, cardiovascular disease risk factors, pulse rate, resting heart rate, gout, type 2 diabetes, markers of metabolic diseases, kidney disease, liver disease and alcohol intake. Serum lipid markers (rs1481012, rs7800944 and rs34060476), coronary artery disease (rs66723169), gout (rs1481012, rs7800944 and rs34060476), obesity and metabolic disease (rs1481012, rs4410790, rs7800944, rs6265, rs2470893, rs2472297, rs574367, rs10865548 and rs66723169) and addictive behaviours such as smoking and alcohol consumption (rs4410790, rs6265, rs2470893, rs34060476 and rs66723169) were all commonly flagged as potential pleiotropic associations. At p < 1 × 10⁻⁵, we identified further associations with diastolic blood pressure (rs2472297 and rs10865548), systolic blood pressure (rs10865548) and heart rate (rs597045 and rs1956218), among others.
GRADE rating – certainty of evidence
When looking at the individual disease outcome associations, 136 of 197 (69·0%) had a high certainty of evidence and did not need to be downgraded in any domains, 37 (18·8%) had a moderate rating and 24 had a low or very low rating (Supplementary Table 5). Overall GRADE ratings for each study were also determined, with most studies (57·6%, thirty-four studies) ranked as high, nearly a third were ranked as moderate (30·5%, eighteen studies) and only a small proportion of studies were downgraded to a low or very low rating (11·9%, seven studies). We found that studies were most commonly downgraded in the risk of bias and imprecision domains, primarily owing to issues regarding sample overlap between the exposure and outcome populations, violations of the core MR assumptions or insufficient statistical power (Supplementary Table 5).
Cardiovascular traits
MR studies of cardiovascular outcomes largely reported null findings (Table 3). There was no evidence for an association between coffee consumption and coronary artery disease, peripheral artery disease, heart failure, atrial fibrillation, aortic valve stenosis, hypertension, aortic aneurysm (thoracic and abdominal), transient ischaemic attack or pulmonary embolism(Reference Zhou, Lin and Zheng25–Reference Kwok, Leung and Schooling35). There was also insufficient evidence to support an association with stroke, ischaemic stroke (large vessel, small vessel and cardioembolic), intracranial aneurysm or subarachnoid haemorrhage(Reference Yuan, Carter and Mason28,Reference Karhunen, Bakker and Ruigrok29,Reference Qian, Ye and Huang32,Reference Nordestgaard and Nordestgaard34). However, the findings on intracerebral haemorrhage were conflicting(Reference Zhang, Wang and Yuan27,Reference Yuan, Carter and Mason28,Reference Qian, Ye and Huang32). Meta-analysis of results from three non-overlapping studies was also inconclusive (pooled odds ratio (OR) per 50% increase in coffee 1·09, 95% CI 0·71–1·48; pooled OR per 1 cup/d increase in coffee 1·60, 95% CI 1·07–2·13) (Fig. 3).

Fig. 3. Forest plot showing the meta-analysis of studies reporting on the effect of coffee consumption on intracerebral haemorrhage.(Reference Woo, Falcone, Devan, Brown, Biffi, Howard and Anderson86)
There was a suggestive association with increased risk of venous thromboembolism and deep vein thrombosis, and a robust association with decreased risk of varicose veins (OR per 50% increase in coffee 0·78, 95% CI 0·67–0·92) (Table 3)(Reference Yuan, Carter and Mason28,Reference Yuan, Bruzelius and Damrauer36). There was a potential association with lower diastolic blood pressure(Reference Nordestgaard, Thomsen and Nordestgaard37); however, out of the five variants used in the coffee instrument, one variant (rs2472297) is directly associated with diastolic blood pressure (p < 1 × 10⁻⁵), as identified in the GWAS by the International Consortium for Blood Pressure Genome-Wide Association Studies(Reference Ehret, Munroe and Rice38). The same study did not report an association with systolic blood pressure.
Serum lipids
Our review identified four MR studies on serum lipids(Reference Kwok, Leung and Schooling35,Reference Nordestgaard, Thomsen and Nordestgaard37,Reference Zhou and Hyppönen39), including one still in the pre-print stage(Reference Li, Choudhury and Zhang40). Genetically determined coffee consumption was consistently associated with higher total cholesterol, LDL-cholesterol and apolipoprotein B (Table 4). There was no association between coffee and apolipoprotein A-1. As formal MR analyses were not conducted in Nordestgaard et al.(Reference Nordestgaard, Thomsen and Nordestgaard37) and the unit was not clearly described in Li et al.(Reference Li, Choudhury and Zhang40), we could only conduct the meta-analysis between estimates from Zhou and Hyppönen(Reference Zhou and Hyppönen39) and Kwok et al.(Reference Kwok, Leung and Schooling35). The pooled estimate supports an association with higher LDL-cholesterol (pooled beta per 1 cup/d increase in coffee 0·07, 95% CI 0·03–0·11) (Fig. 4). MR analyses in Zhou and Hyppönen(Reference Zhou and Hyppönen39) and Kwok et al.(Reference Kwok, Leung and Schooling35) both considered the impact of pleiotropy by excluding known pleiotropic SNPs.

Fig. 4. Forest plot showing the meta-analysis of studies reporting on the effect of coffee consumption on LDL-cholesterol.
¹ Original estimate was described per SD change in LDL-cholesterol; converted to per 1 mM change in LDL-cholesterol based on 1 SD = 38·67 mg/dl = 1 mM.
Neurological diseases and brain morphology
A study on Alzheimer’s disease reporting pooled estimates from the International Genomics of Alzheimer’s Project (IGAP) and FinnGen cohorts found a positive association between coffee and Alzheimer’s disease, while a later study in a smaller cohort found no association (Table 5)(Reference Zhang, Wang and Yuan27,Reference Nordestgaard, Nordestgaard and Frikke-Schmidt41). Meta-analysis of these three estimates suggests that coffee consumption may be associated with an increased risk of Alzheimer’s disease (pooled OR per 1 cup/d increase in coffee 1·18, 95% CI 1·02–1·33) (Fig. 5). We also found probable evidence to support an association between coffee and a younger age of onset of Huntington’s disease(Reference Wang, Cornelis and Zhang42). Studies on cognition, amyotrophic lateral sclerosis (ALS), Parkinson’s disease, epilepsy, attention deficit hyperactivity disorder (ADHD) and cerebral microbleeds all reported null findings(Reference Zhou, Taylor and Karhunen43–Reference Domenighetti, Sugier and Sreelatha49). While analysis using data from the International Headache Genetics Consortium (IHGC) did not provide evidence for a relationship, meta-analysis incorporating data from the UK Biobank and FinnGen cohorts supported an association with decreased risk of migraines (pooled OR per 50% increase in coffee 0·73, 95% CI 0·63–0·83, I² 87·5%) (Fig. 5)(Reference Yuan, Daghlas and Larsson50,Reference Chen, Zhang and Zheng51). Heterogeneity in this analysis may reflect differences in how the migraine phenotype is defined and collected across the different studies; however, heterogeneity measures may be biased when there are a small number of studies in the meta-analysis(Reference von Hippel52).

Fig. 5. Forest plot showing the meta-analysis of studies reporting on the effect of coffee consumption on Alzheimer’s disease and migraines.
One study reported a robust association between coffee and lower grey matter volume (beta in standard deviation (SD) per 1 coffee cup/d increase −0·371, 95% CI −0·596 to −0·147)(Reference Zheng and Niu44). No associations were observed for other brain volume measures (total brain, white matter and hippocampus), white matter hyperintensity volume or MRI markers of small vessel disease (fractional anisotropy and mean diffusivity).
Cancer and neoplasms
Coffee consumption was not found to be associated with cancers of the brain, head and neck, breast, thyroid, lung, colon/rectum, stomach, liver, biliary tract, pancreas, kidney, bladder, cervix, endometrium, uterus, prostate or testicles(Reference Carter, Yuan and Kar53–Reference Li, Yan and Li56) (Table 6). There was also no association with overall cancer, lymphoma, non-Hodgkin’s lymphoma, leukaemia or melanoma. Carter et al.(Reference Carter, Yuan and Kar53) identified a robust association between coffee consumption and increased risk of oesophageal cancer in the UK Biobank cohort (OR per 50% increase in coffee 2·79, 95% CI 1·73–4·5); however, the results were not replicated in the FinnGen cohort. Similarly, this study found probable associations with an increased risk of multiple myeloma and a decreased risk of ovarian cancer, which were also not replicated in the FinnGen cohort. Meta-analysis of estimates from the UK Biobank and FinnGen suggests that coffee consumption is associated with an increased risk of oesophageal cancer (pooled OR per 50% increase in coffee 2·67, 95% CI 1·40–3·94). Given that the epithelial ovarian cancer subtype accounts for most ovarian cancer cases(Reference Torre, Trabert and DeSantis57), we conducted meta-analysis of ovarian cancer estimates, including an estimate for epithelial ovarian cancer, in the Ovarian Cancer Association Consortium(Reference Ong, Hwang and Cuellar-Partida58) (pooled OR per 50% increase in coffee 0·86, 95% CI 0·74–0·98) (Fig. 6).

Fig. 6. Forest plot showing the meta-analysis of studies reporting on the effect of coffee consumption on oesophageal cancer, multiple myeloma and ovarian cancer.
Metabolic traits
In the largest available study, coffee drinking had a suggestive association with an increased risk of type 2 diabetes mellitus(Reference Yuan and Larsson59) (Table 7). Coffee was also associated with markers of an increased risk of diabetes, including higher fasting glucose, higher insulin resistance, increased risk of obesity and higher BMI; however, robustness could not be assessed for most outcomes(Reference Kwok, Leung and Schooling35,Reference Nordestgaard, Thomsen and Nordestgaard37,Reference Narayan and Yoon60,Reference Nicolopoulos, Mulugeta and Zhou61) . There was insufficient evidence to support an association with glycated haemoglobin, fasting insulin, adiponectin, height or plasma glucose. A meta-analysis could not be conducted for waist circumference as Nordestgaard et al.(Reference Nordestgaard, Thomsen and Nordestgaard37) did not include formal MR analysis, only regression of the coffee genetic risk score against the outcomes (common in early MR studies).
Autoimmune and inflammatory diseases
There was insufficient evidence to support an association between genetically determined coffee consumption and multiple sclerosis or systemic lupus erythematosus(Reference Lu, Wu and Zhang62,Reference Bae and Lee63) (Table 8). Bae and Lee(Reference Bae and Lee63) suggested that there may be an association between coffee and an increased risk of rheumatoid arthritis; however, the findings were not replicated in a later study(Reference Pu, Gu and Zheng64). Results from these two studies could not be pooled as the SNP-exposure estimates were expressed in different units.
A probable association between coffee consumption and an increased risk of osteoarthritis (OA) was identified in the UK Biobank cohort(Reference Nicolopoulos, Mulugeta and Zhou61), while only suggestive evidence was identified within the Arthritis Research UK Osteoarthritis Genetics (arcOGEN) consortium(Reference Lee65). The association remained when data were restricted to knee OA cases, but not for hip OA(Reference Zhang, Fan and Chen66). Coffee was not associated with fracture risk or estimated bone mineral density measures(Reference Yuan, Michaëlsson and Wan67). The findings on gout were conflicting: studies in the Global Urate Genetics Consortium (GUGC) and the Biobank Japan cohort reported a decreased risk of gout(Reference Shirai, Nakayama and Kawamura68), while a study in the UK Biobank reported no association(Reference Nicolopoulos, Mulugeta and Zhou61). Although meta-analysis of the three cohorts suggested a negative association (pooled OR per 1 cup/d increase in coffee 0·71, 95% CI 0·53–0·88) (Fig. 7), MR-PRESSO distortion tests, conducted in the UK Biobank study, showed that the association was likely to be due to three potentially pleiotropic outlying variants (rs1260326, rs1481012 and rs7800944)(Reference Nicolopoulos, Mulugeta and Zhou61). No association was found between coffee and serum uric acid(Reference Shirai, Nakayama and Kawamura68).

Fig. 7. Forest plot showing the meta-analysis of studies reporting on the effect of coffee consumption on gout.
Diseases of the digestive system and renal system
Null findings were reported for diverticular disease, gastroesophageal reflux disease, Crohn’s disease and ulcerative colitis (Table 9)(Reference Yuan69–Reference Yuan and Larsson71). There was a potential association between coffee and decreased risk of non-alcoholic fatty liver disease(Reference Yuan, Chen and Li72). Coffee consumption had a protective effect on gallstone disease, but only after adjusting for BMI and smoking in a multivariable MR (MVMR) model, or in another study looking at only cases of symptomatic gallstone disease(Reference Yuan, Gill and Giovannucci73,Reference Nordestgaard, Stender and Nordestgaard74). We also found probable evidence for a protective effect of coffee on markers of kidney disease. Coffee consumption was associated with a decreased risk of chronic kidney disease, higher estimated glomerular filtration rate, lower levels of albuminuria and a decreased risk of kidney stones(Reference Kennedy, Pirastu and Poole75,Reference Yuan and Larsson76). Analyses on glomerular filtration rate excluded potentially pleiotropic variants (rs1260326, rs9275576 and rs476828)(Reference Kennedy, Pirastu and Poole75,Reference Köttgen, Pattaro and Böger77).
Mortality and other outcomes
Coffee consumption had no effect on all-cause mortality or cancer-specific mortality(Reference Nordestgaard and Nordestgaard34,Reference Ong, Law and An55,Reference van Oort, Beulens and van Ballegooijen78,Reference Taylor, Martin and Geybels79) (Table 10). There was no association with pregnancy loss(Reference Yuan, Liu and Larsson80); however, coffee consumption had a probable association with decreased postmenopausal bleeding and menopausal disorders(Reference Nicolopoulos, Mulugeta and Zhou61). There was insufficient evidence to support an association with lower back pain(Reference Lv, Cui and Zhang81), while a study on hearing showed a potential association with decreased risk of tinnitus(Reference Cresswell, Casanova and Beaumont82). For eye disorders, we found no association with intraocular pressure(Reference Kim, Aschard and Kang83); however, coffee had a potentially adverse association with senile cataracts and glaucoma(Reference Yuan, Wolk and Larsson84,Reference Li, Cheng and Cheng85).
Table 10. Summary of MR studies related to mortality and other outcomes

↑ Positive association (main analysis); ↓ negative association (main analysis); − null association (main analysis).
MR-E, MR-Egger; WM, weighted median; WMode, weighted mode; MR-P, MR-PRESSO; MVMR, multivariable MR; O, other method; PRACTICAL, Prostate Cancer Association Group to Investigate Cancer Associated Alterations in the Genome; UKB, UK Biobank.
Discussion
Our review, including fifty-nine MR studies and 160 unique disease outcome associations, supports some possible benefits and harms of habitual coffee intake. Previous observational evidence (for umbrella reviews please see refs.(Reference Poole, Kennedy and Roderick87,Reference Grosso, Godos and Galvano88)) has identified almost no harmful effects, and deemed coffee drinking in moderation as safe, except during pregnancy and for women at increased risk of fractures. These reviews also highlighted many potential benefits of coffee consumption, including lowered risk of all-cause and cardiovascular mortality, cancers, metabolic conditions, liver conditions, Parkinson’s disease, depression and Alzheimer’s disease. However, most of these benefits from observational associations were not supported by genetic studies identified in our review(Reference Kwok, Leung and Schooling35,Reference Domenighetti, Sugier and Sreelatha49,Reference Carter, Yuan and Kar53,Reference Taylor, Martin and Geybels79,Reference Zhang, Liu and Choudhury89), and for Alzheimer’s disease/dementia, two studies(Reference Zhang, Wang and Yuan27,Reference Nordestgaard, Nordestgaard and Frikke-Schmidt41) suggested potential increases in risk that warrant further research. This suggests that the phenotypic associations reported for coffee are likely to be due to residual confounding or reverse causality, and not through a causal pathway(Reference Davies, Holmes and Davey Smith12). However, our review did suggest potential benefits for some conditions that align with observational findings; notably, the potentially lower risks of ovarian cancer, hepatocellular carcinoma, kidney disease, gallstone disease and migraines are interesting and warrant confirmation in independent studies.
Our systematic review provides an important update to the existing body of knowledge on the health effects of coffee consumption. There is one previous narrative review that summarised the MR evidence on coffee and caffeine consumption(Reference Cornelis and Munafo90). However, this review included only fifteen MR studies and found that coffee had no consistent effects on the included health outcomes. Over two-thirds of the studies included in our review were published after this previous review. We used two methods of quality assessment, and we adapted the processes for use with MR studies. The authors of the previous review provided valuable insights into the methodological issues of MR, including insufficient power, pleiotropy and collider bias. We found that these methodological issues were still present but were often less pronounced in more recent studies, given the increased availability of larger-scale individual-level and summary-level data. Overall, we noticed a marked increase in the quality and standardisation of reporting MR studies, which coincides with the release of the STROBE-MR guidelines (pre-print 2019, published 2021)(Reference Skrivankova, Richmond and Woolf91).
Our review found only a handful of studies reporting associations that could be assessed as ‘robust’, and even these were not independently replicated. The association between coffee consumption and smaller grey matter volumes is well supported by prior observational studies and randomised controlled trial evidence, providing strong evidence that the association may be causal(Reference Pham, Mulugeta and Zhou10,Reference Lin, Weibel and Landolt92). However, the mechanisms of effect are yet to be fully understood. Considering that higher habitual coffee intakes are typically linked to higher circulating levels of caffeine(Reference van Dam, Hu and Willett93), the competitive antagonist binding of caffeine to the adenosine receptors may be a potential pathway underlying these associations(Reference Haller, Montandon and Rodriguez94,Reference Rivera-Oliver and Díaz-Ríos95). Caffeine molecules are structurally similar to adenosine molecules, which allows them to competitively bind to adenosine receptors and pass through the blood–brain barrier. It is possible that this disrupts adenosine homeostasis or alters the expression of adenosine receptors, which has been implicated in Alzheimer’s disease(Reference Chang, Wu and Lin96). Another theory to explain the association between coffee and brain diseases is that caffeine intake impacts blood–brain barrier permeability and, hence, allows entry of toxins and pathogens into the brain. However, a recent MRI study found that caffeine ingestion had no effect on blood–brain barrier permeability(Reference Lin, Jiang and Liu97). Interestingly, a recently published MR study found an association between coffee and delayed age-of-onset of Parkinson’s disease(Reference Kuzovenkova, Liu and Gan-Or98), supporting a protective effect of coffee for neurodegeneration. No association was found with Parkinson’s disease risk, suggesting that coffee may influence the onset of Parkinson’s symptoms rather than the main disease pathway. Coffee may therefore affect Alzheimer’s disease and Parkinson’s disease differently, despite their similar neurodegenerative symptoms and overlapping affected brain regions.
The observed effects of coffee on oesophageal cancer risk may reflect the association between hot beverage consumption and oesophageal cancer. A meta-analysis of studies on tea drinking found that participants who drank tea at higher temperatures had a higher risk of oesophageal squamous cell carcinomas(Reference Luo and Ge99). It is possible that the consumption of hot beverages causes damage to the oesophageal cell mucosa, which may increase cell turnover rates and the risk of cancerous mutations(Reference Kamangar, Chow and Abnet100). This explanation is supported by a recent MR study, which found that the association between coffee and oesophageal cancer was attenuated in multivariable models additionally adjusting for hot beverage consumption(Reference Xue, Xue and Zhao101).
Our review did not find strong evidence to support associations between coffee consumption and other types of cancer, except for potential protective associations with hepatocellular carcinoma and ovarian cancer, and an increased risk for multiple myeloma. More recent evidence provides further support for the association with multiple myeloma, including replication in an independent outcome cohort(Reference Lin, Zhou and Zhu102). Mediation analyses from the same study suggested that three plasma metabolites acted as mediators in the association, possibly via the glutathione metabolism pathway. Dysregulation of this pathway impacts antioxidant defence and immune response modulation and has been implicated in the pathogenesis of several diseases(Reference Wu, Fang and Yang103). Meanwhile, the protective association with hepatocellular carcinoma may only be present in Europeans, as later studies in East Asian populations found no association between coffee and hepatocellular carcinoma or other digestive system cancers(Reference Tan, Wei and Liu104,Reference Cai, Li and Liang105). Similarly, recent literature suggests that coffee may be associated with increased risks of endometrioid ovarian cancer, in contrast to previous studies that supported protective associations(Reference Liu, Feng and Du106). Epidemiological evidence on coffee and ovarian cancer remains conflicting, so further investigation is required to disentangle these associations.
MR studies do not support the cardiovascular benefits suggested by observational studies. While excessive intake of caffeine (toxicity) is known to lead to adverse cardiovascular symptoms such as tachycardia and increased blood pressure(Reference Willson107), MR studies in this review found no evidence of harm. It is important to note that MR studies examine the effects of habitual (rather than excessive) coffee intakes, and there is evidence to suggest that the patterns of coffee consumption are in part driven by individual differences in the function of the cardiovascular system, as reflected by blood pressure and heart rate(Reference Hyppönen and Zhou108). Indeed, this type of natural self-moderation in consumption levels may help to protect those individuals who are susceptible to possible caffeine-related cardiovascular symptoms from any serious harm. More recent MR studies including a broader set of instrumental variables (37 SNPs v. 9–14 SNPs) have reported probable associations between coffee and increased risk of coronary artery calcification, myocardial infarction, atrial fibrillation and heart failure(Reference Yang, Yuan and Lyu109–Reference Wang, Song and Fan111), which could in part relate to the observed increases in serum LDL-cholesterol with higher habitual intakes(Reference Zhou and Hyppönen39). Mediation analyses suggested that the association with heart failure may involve segmental/global circumferential strain and left ventricular volume(Reference Wang, Song and Fan111). Circumferential strain contributes to arterial wall thickening(Reference Smiseth, Rider and Cvijic112), which aligns with the theory that competitive adenosine receptor binding stimulates acute increases in blood pressure and arterial thickness, which may induce ventricular remodelling and cardiac strain over time(Reference Mehta, Khoury and Madsen113).
Many of the instruments used to reflect habitual coffee intake may be pleiotropic, and this was reflected in the varied conclusions on the association between coffee and gout. As noted in the analyses using MR-PRESSO by Nicolopoulos and colleagues(Reference Nicolopoulos, Mulugeta and Zhou61), estimates were influenced by the effect of pleiotropic outlying SNPs, and when these were removed from the coffee instrument, no association was observed in the UK Biobank or the Global Urate Genetics Consortium cohorts. Estimates in the Biobank Japan cohort remained significant after the removal of pleiotropic SNPs (rs671, rs1260326 and rs13234378); however, we observed a large drop in the precision of estimation, suggesting that the pleiotropic SNPs contributed substantially to the instrument strength(Reference Shirai, Nakayama and Kawamura68). It is also possible that the varied findings are due to ethnic differences between Asian and European populations.
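The effect of removing outlying variants on both the point estimate and its precision can be sketched with a simple inverse-variance weighted (IVW) estimator applied to summary statistics. This is an illustrative sketch only: it is not MR-PRESSO, which identifies outliers through simulation-based global and distortion tests, and the SNP identifiers and effect sizes below are invented for demonstration and do not correspond to any included study. The names SnpEffect, ivw_estimate and drop_flagged are hypothetical.

```python
import math
from typing import NamedTuple


class SnpEffect(NamedTuple):
    snp_id: str
    beta_exposure: float  # SNP effect on coffee intake (e.g. cups/d per allele)
    beta_outcome: float   # SNP effect on the outcome (e.g. log odds per allele)
    se_outcome: float     # standard error of beta_outcome


def ivw_estimate(snps):
    """Fixed-effect IVW estimate using first-order weights.

    Returns (causal_estimate, standard_error).
    """
    numerator = sum(s.beta_exposure * s.beta_outcome / s.se_outcome ** 2 for s in snps)
    denominator = sum(s.beta_exposure ** 2 / s.se_outcome ** 2 for s in snps)
    return numerator / denominator, math.sqrt(1.0 / denominator)


def drop_flagged(snps, flagged_ids):
    """Return the instrument without the SNPs flagged as potential outliers."""
    return [s for s in snps if s.snp_id not in flagged_ids]


# Invented summary statistics: snp_a is a strong variant with a large outcome effect.
instrument = [
    SnpEffect("snp_a", 0.20, -0.120, 0.020),
    SnpEffect("snp_b", 0.06, -0.004, 0.020),
    SnpEffect("snp_c", 0.05, 0.003, 0.020),
    SnpEffect("snp_d", 0.04, -0.002, 0.020),
]

full_beta, full_se = ivw_estimate(instrument)
reduced_beta, reduced_se = ivw_estimate(drop_flagged(instrument, {"snp_a"}))
print(f"All SNPs:        beta={full_beta:.3f}, SE={full_se:.3f}")
print(f"Outlier removed: beta={reduced_beta:.3f}, SE={reduced_se:.3f}")
```

Because each SNP's weight grows with the square of its effect on the exposure, removing a strong variant shrinks the denominator and inflates the standard error, which is consistent with the loss of precision described above when influential SNPs are excluded.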
It is important to acknowledge potential limitations of our review. Although we aimed to cover all health outcomes associated with coffee, our search may have missed relevant studies, particularly when the MR analyses were not described in the title or abstract or were conducted only as a supplementary analysis. At the time of this review there were no formal data extraction or quality assessment tools established for MR studies, so our templates and tools had to be adapted from general tools for observational studies or previous publications. In addition, the GRADE system for assessing certainty of evidence is known to be a very subjective process(Reference Guyatt, Oxman and Vist19). We aimed to standardise the process between reviewers using a checklist format(Reference Meader, King and Llewellyn20); however, there is naturally a level of subjectivity to each decision, which should be taken into account. We found that most studies identified in this review were in European populations, and their findings are therefore not directly generalisable to other ethnic populations or lower-to-middle income countries. In particular, many studies utilise the UK Biobank as the exposure or outcome data source, which is known to be a non-representative sample and subject to a healthy volunteer bias(Reference Fry, Littlejohns and Sudlow114). There is evidence to suggest that the association between CYP1A2 and coffee intake may differ between Caucasian and Asian populations, implying that one of the best genetic instruments for coffee intake may be influenced by ethnicity(Reference Denden, Bouden and Haj Khelil115). All included studies implemented linear MR analyses, and uncertainties exist in the ability to use MR in evaluating nonlinear effects(Reference Hamilton, Hughes and Lu116). Our review focused on MR studies that approximate differences in habitual coffee intake using genetic variants. Although some variants included in the instruments of these MR studies are directly involved in caffeine metabolism, associations may not reflect circulating caffeine concentrations or be applicable to the effects of other caffeinated drinks(Reference Woolf, Cronjé and Zagkos117). We observed evidence for pleiotropy for many of the instruments used in the MR analyses. However, some of the earlier studies were published before sensitivity analysis methods for MR were developed, preventing assessment of robustness of the evidence(Reference Burgess, Bowden and Fall118). Similarly, a reporting standard for MR studies has only been recently established, so earlier studies lacked standardisation of methodology(Reference Skrivankova, Richmond and Woolf91). Lastly, several studies identified in the review were underpowered, so caution should be exercised with null associations, as small effects may have been missed.
Our systematic review of MR studies did not support observational evidence for broad benefits of coffee intake, aside from potential associations with a decreased risk of migraines, hepatocellular carcinoma, kidney disease, gallstone disease and ovarian cancer. We also did not observe any strong evidence of harm, although more research is needed to assess possible effects on oesophageal cancer and dementia/Alzheimer’s disease. However, the genetic variants used to instrument coffee intake approximate modest differences in average coffee intakes, and as they may not directly reflect caffeine concentrations in the blood, these studies may not have captured effects seen with excessive intakes. Overall, evidence from MR studies published to date suggests that moderate consumption of approximately 1–3 cups/d is generally safe. There is a need for creation and validation of data extraction protocols and quality assessment tools for systematic reviews of MR studies. Future studies should also aim to understand the underlying mechanisms of any causal associations and expand upon knowledge in non-European cohorts and cross-ethnic studies.
Supplementary material
The supplementary material for this article can be found at https://doi.org/10.1017/S0954422425100206
Data availability statement
No new data were created or analysed in this study. All the work was developed using published data. Data sharing is not applicable to this article.
Acknowledgements
None.
Authorship
K.P. and N.A.K. conducted the literature search, study screening, data extraction and quality assessment. K.P. and A.Z. prepared the final data tables. K.P. conducted data analysis and wrote the paper. N.A.K. drafted the review protocol and the data extraction and quality assessment tool, with comments from K.P., A.M., A.Z. and E.H. K.P., A.M., A.Z. and E.H. interpreted results and revised the paper. All authors read and approved the final version of the manuscript for submission.
Financial support
This research is supported by an Australian Government Research Training Program (RTP) Scholarship (K.P. and N.A.K.), National Health and Medical Research Council Australia Leadership Investigator Award, GNT2025349 (E.H.), National Health and Medical Research Council Australia Project Grant, GNT1123603 (E.H.) and the Medical Research Future Fund, grant MRF2007431 (E.H.).
Declaration of Interests
The authors do not have any conflicts of interest to declare.