Powerful three-sample genome-wide design and robust statistical inference in summary-data Mendelian randomization

Q Zhao, Y Chen, J Wang… - International journal of …, 2019 - academic.oup.com
International Journal of Epidemiology, 2019 - academic.oup.com
Background
Summary-data Mendelian randomization (MR) has become a popular research design to estimate the causal effect of risk exposures. With the sample size of GWAS continuing to increase, it is now possible to use genetic instruments that are only weakly associated with the exposure.
Development
We propose a three-sample genome-wide design where typically 1000 independent genetic instruments across the whole genome are used. We develop an empirical partially Bayes statistical analysis approach where instruments are weighted according to their strength; thus weak instruments bring less variation to the estimator. The estimator is highly efficient with many weak genetic instruments and is robust to balanced and/or sparse pleiotropy.
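To illustrate how weighting by instrument strength works with summary data, here is a minimal sketch of a standard inverse-variance-weighted (IVW) estimator. This is not the empirical partially Bayes estimator proposed in the paper; it is a simpler, well-known technique substituted purely for illustration, and it is known to be biased when instruments are weak. The function name `ivw_estimate` and its argument names are hypothetical. Each instrument contributes in proportion to its squared estimated exposure effect divided by the outcome variance, so weak instruments have little influence on the result.

```python
import math

def ivw_estimate(exposure_effects, outcome_effects, outcome_ses):
    """Illustrative IVW estimate of a causal effect from summary data.

    exposure_effects: per-SNP estimated effects on the exposure
    outcome_effects:  per-SNP estimated effects on the outcome
    outcome_ses:      per-SNP standard errors of the outcome effects

    Assumption: measurement error in the exposure effects is ignored,
    which is exactly what breaks down with many weak instruments.
    """
    if not (len(exposure_effects) == len(outcome_effects) == len(outcome_ses)):
        raise ValueError("all inputs must have the same length")

    numerator = 0.0
    denominator = 0.0
    for g_exp, g_out, se in zip(exposure_effects, outcome_effects, outcome_ses):
        precision = 1.0 / (se * se)          # inverse variance of the outcome effect
        numerator += precision * g_exp * g_out
        denominator += precision * g_exp * g_exp  # grows with instrument strength

    beta = numerator / denominator
    se_beta = math.sqrt(1.0 / denominator)
    return beta, se_beta


# Toy, made-up inputs: outcome effects are exactly 0.3 times the exposure effects.
beta, se_beta = ivw_estimate([0.1, 0.2, 0.05], [0.03, 0.06, 0.015], [0.01, 0.01, 0.02])
print(beta, se_beta)
```

In this toy input the third instrument has the smallest exposure effect and the largest standard error, so it adds the least to the denominator and barely moves the estimate.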
Application
We apply our method to estimate the causal effect of body mass index (BMI) and major blood lipids on cardiovascular disease outcomes, and obtain substantially shorter confidence intervals (CIs). In particular, the estimated causal odds ratio of BMI on ischaemic stroke is 1.19 (95% CI: 1.07–1.32, P-value <0.001); the estimated causal odds ratio of high-density lipoprotein cholesterol (HDL-C) on coronary artery disease (CAD) is 0.78 (95% CI: 0.73–0.84, P-value <0.001). However, the estimated effect of HDL-C attenuates and becomes statistically non-significant when we only use strong instruments.
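As a reading aid for the reported odds ratios, the sketch below converts an odds ratio and its 95% CI back to the log-odds scale and recovers an approximate standard error and z-statistic. It assumes the intervals are symmetric Wald-type intervals on the log scale built with the normal quantile 1.96, which the abstract does not state. Because the published figures are rounded to two decimals, the recovered values are only approximate. The function name `log_scale_summary` is hypothetical.

```python
import math

def log_scale_summary(odds_ratio, ci_lower, ci_upper, z_crit=1.96):
    """Approximate log odds ratio, standard error, and z-statistic from an OR and its CI.

    Assumption: the CI is exp(log_or +/- z_crit * se), i.e. a Wald interval
    on the log-odds scale.
    """
    log_or = math.log(odds_ratio)
    se = (math.log(ci_upper) - math.log(ci_lower)) / (2.0 * z_crit)
    z = log_or / se
    return log_or, se, z


# Values as reported in the abstract (rounded to two decimals there).
for label, odds_ratio, lo, hi in [
    ("BMI -> ischaemic stroke", 1.19, 1.07, 1.32),
    ("HDL-C -> CAD", 0.78, 0.73, 0.84),
]:
    log_or, se, z = log_scale_summary(odds_ratio, lo, hi)
    print(f"{label}: log OR = {log_or:.3f}, SE = {se:.3f}, z = {z:.2f}")
```

A negative log odds ratio, as for HDL-C, corresponds to an odds ratio below 1, that is, an estimated protective effect.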
Conclusions
A genome-wide design can greatly improve the statistical power of MR studies. Robust statistical methods may alleviate but not solve the problem of horizontal pleiotropy. Our empirical results suggest that the relationship between HDL-C and CAD is heterogeneous, and it may be too soon to completely dismiss the HDL hypothesis.
Oxford University Press