Chengde Medical College, Chengde, Hebei Province, China.
Xu Qian
Tel: 18932896530;
Email: qianxu@cdmc.edu.cn
Received: Apr 27, 2024; Accepted: May 23, 2024; Published: May 30, 2024; Archived: www.meddiscoveries.org
Objective: This study aims to investigate the causal relationships between specific blood metabolites and the risk of coronary atherosclerosis using a two-sample Mendelian Randomization (MR) approach.
Methods: We utilized genome-wide association summary statistics from 8,299 participants in the Canadian Longitudinal Study on Aging (CLSA) and 456,348 participants from the UK Biobank to examine 1,091 blood metabolites and 309 metabolite ratios. Instrumental variables (IVs) were identified based on genetic variants strongly associated with these metabolites and ratios, and a robust set of IVs was employed to ensure the validity of the MR analyses. Various MR methods, including Inverse Variance Weighted (IVW), MR-Egger, and Weighted Median approaches, were used to address different assumptions and potential biases.
Results and conclusion: The MR analysis revealed 111 significant causal associations between metabolites/metabolite ratios and the risk of coronary atherosclerosis. Notably, specific metabolites such as the glycine to alanine ratio demonstrated a strong causal link with coronary heart disease. Validation with the FinnGen database confirmed the consistency and robustness of these associations. Our findings provide compelling evidence of causal relationships between several blood metabolites and coronary atherosclerosis, highlighting potential targets for metabolic interventions. These results demonstrate the utility of MR in uncovering causal factors in complex diseases and suggest avenues for further research into preventive and therapeutic strategies against coronary heart disease. Limitations related to ethnic diversity and biological variability across tissues underscore the need for broader validation and experimental verification of these findings.
Keywords: Coronary atherosclerosis; Mendelian randomization; Blood metabolites; Genetic association.
With the improvement in living standards and the acceleration of population aging, the incidence of coronary atherosclerotic heart disease has risen sharply [1]. Studies have shown that treatments that halt the progression of atherosclerosis, such as lipid-lowering therapies, can significantly reduce the risk of myocardial infarction or death in patients with coronary heart disease [2]. Coronary atherosclerosis, the primary cause of coronary heart disease, begins with endothelial dysfunction, which is induced by prolonged exposure to a series of pathogenic factors such as diabetes, hypertension, smoking, and stress [3]. Research by William Durante has shown that glutamine can be metabolized into a large number of metabolic products, and its metabolic disturbances may lead to endothelial dysfunction [4]. Current studies indicate that metabolites are closely associated with the occurrence and development of coronary atherosclerosis [5].
Metabolites, small molecules that are intermediate or final products of metabolic reactions, typically reflect an individual’s genomic composition. The composition of the human plasma metabolome is influenced by the ecology of the gut microbiome, diet, and lifestyle, yet the levels of individual metabolites in the blood are strictly regulated by host genetics, providing reliable evidence for the early diagnosis, progression, and treatment of diseases. Levels of certain specific blood metabolites can also serve as references for the etiology and treatment of complex diseases [6,7]. Studies have shown that variations in small molecule metabolites may reflect underlying coronary heart disease and serve as biomarkers for the progression of coronary artery atherosclerosis [8]. However, the reported associations come primarily from observational studies with limited sample sizes and are subject to confounding [9]. In this study, we apply two-sample Mendelian Randomization (MR) for the first time to estimate the causal effects of 1,091 blood metabolites and 309 metabolite ratios on the risk of coronary atherosclerosis.
In recent years, Mendelian Randomization (MR) has been widely used to assess the potential causal relationships between exposure factors and outcomes [10]. MR studies use genetic variants associated with the exposure of interest as instrumental variables, thereby introducing a randomized scheme into observational studies. Because genetic variants are allocated at gamete formation and conception, associations between the genetically predicted exposure and disease are less likely to be affected by confounding factors or reverse causation [11]. This study has a solid theoretical basis, and its results may provide references for the treatment and prevention of coronary artery atherosclerosis.
Research design
To estimate the causal relationships between 1,091 blood metabolites and 309 metabolite ratios and coronary atherosclerosis, we conducted a two-sample MR analysis using GWAS summary statistics, along with reverse analyses to bolster the credibility of our results. For a two-sample MR study to be valid, the instrumental variables must satisfy three fundamental assumptions: (1) Relevance assumption: the instrumental variables must be strongly associated with the exposure factors (1,091 blood metabolites and 309 metabolite ratios) with a P<1e-5, and for reverse validation, P<5e-8; (2) Independence assumption: the instrumental variables must not be associated with any potential confounders; (3) Exclusion restriction: the instrumental variables can affect the outcome (i.e., coronary atherosclerosis) only through the exposure and must not be directly related to the outcome (P>5e-5).
GWAS summary data for 1,091 blood metabolites and 309 metabolite ratios
The genome-wide association summary dataset for 1,091 blood metabolites and 309 metabolite ratios was obtained from Chen et al. [12], who analyzed the Canadian Longitudinal Study on Aging (CLSA) cohort, encompassing 8,299 individuals. Among the 1,091 blood metabolites, 850 are known and span eight super pathways: lipids, amino acids, xenobiotics, nucleotides, cofactors and vitamins, carbohydrates, peptides, and energy. The remaining 241 metabolites are classified as unknown or ‘partially’ characterized molecules. Many metabolites serve as substrates and products in enzymatic reactions; thus, identifying the genetic determinants of substrate-to-product ratios can provide insights into biological processes that cannot be obtained by studying individual metabolites alone. Likewise, the genetic controls of enzymes and transporters can be precisely located.
Summary statistics for coronary atherosclerosis
The summary statistics for coronary atherosclerosis were sourced from the UK Biobank (UKB) (https://www.ukbiobank.ac.uk/) and FinnGen (https://www.finngen.fi/en). Coronary atherosclerosis GWAS summary data extracted from the UKB summary database served as the test dataset, involving 456,348 participants, including 16,041 cases and 440,307 controls of European ancestry from the United Kingdom [13]. Validation data were extracted from the FinnGen R8 GWAS summary database, comprising 42,421 cases and 285,621 controls from Finland. All subjects were of European descent, and the exposure and outcome data came from separate cohorts, avoiding sample overlap between exposures and outcomes and minimizing the bias it can cause.
Selection of instrumental variables
We selected genetic variants closely associated with the 1,091 blood metabolites and 309 metabolite ratios based on the relevance assumption (p<5e-8). We then set an r² threshold of 0.001 and a kb range of 10,000 to eliminate any Single Nucleotide Polymorphisms (SNPs) in Linkage Disequilibrium (LD) to ensure the independence of the selected genetic variants. Subsequently, we filtered SNPs in the outcome data based on the exclusion restriction assumption to select SNPs not strongly associated with the outcome directly and exclude heterogeneous data. Weak Instrumental Variables (IVs) can impact the accuracy of causal relationships between exposure and outcome. The strength of the IVs was assessed by calculating the F-statistic [14], with the formula F = R²(N-2)/(1-R²) [15], where N is the sample size of the exposure data and R² = 2 × MAF × (1-MAF) × Beta² [16], with MAF being the minor allele frequency and Beta the effect size of the SNP as an exposure IV. Only SNPs with an MAF>0.01 and an F-value>10 were selected, ensuring the selected SNPs were strong IVs. Following these steps, these carefully chosen SNPs were used as the final IVs for subsequent two-sample MR analysis.
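The instrument-strength filter described above can be sketched in a few lines of Python. This is a minimal illustration of the two formulas in the text, not the authors' pipeline: the function names and the example MAF and Beta values are hypothetical, N is set to the 8,299 CLSA participants, and Beta is assumed to be expressed on a standardized scale.

```python
def snp_r2(maf: float, beta: float) -> float:
    """Variance explained by one SNP: R^2 = 2 * MAF * (1 - MAF) * Beta^2."""
    return 2.0 * maf * (1.0 - maf) * beta ** 2


def f_statistic(r2: float, n: int) -> float:
    """Instrument strength: F = R^2 * (N - 2) / (1 - R^2)."""
    return r2 * (n - 2) / (1.0 - r2)


def is_strong_instrument(maf: float, beta: float, n: int,
                         min_maf: float = 0.01, min_f: float = 10.0) -> bool:
    """Apply the MAF > 0.01 and F > 10 thresholds stated in the text."""
    return maf > min_maf and f_statistic(snp_r2(maf, beta), n) > min_f


N_EXPOSURE = 8299  # CLSA sample size reported in the text

# Hypothetical SNPs, for illustration only.
f_strong = f_statistic(snp_r2(maf=0.25, beta=0.10), N_EXPOSURE)  # about 31.23
keep_strong = is_strong_instrument(0.25, 0.10, N_EXPOSURE)       # True
keep_weak = is_strong_instrument(0.25, 0.02, N_EXPOSURE)         # False (F is about 1.24)
```

With these hypothetical inputs, the first SNP passes the filter and the second is discarded as a weak instrument.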
MR analysis and sensitivity analysis
In this study, we primarily used the standard Inverse Variance Weighted (IVW) method to estimate the causal relationship between 1,091 blood metabolites and 309 metabolite ratios and coronary atherosclerosis. When Instrumental Variables (IVs) satisfy the three key assumptions, the IVW method can provide the highest statistical power. However, if some IVs do not meet these assumptions, IVW may yield incorrect results. Therefore, we used MR-Egger, Weighted Median, and Weighted Mode as complementary methods to verify the reliability of the MR analysis. IVW combines the SNP-specific effect estimates when multiple SNPs are analyzed, obtaining the causal estimate as the slope of a weighted linear regression. The slope of MR-Egger regression provides an estimate of the causal relationship, offering reliable results even if all SNPs are invalid instruments, though it has lower statistical power than the IVW method [17,18]. The Weighted Median represents the median of the distribution function of individual SNP effect values sorted by weight, providing a robust estimate when at least 50% of the information comes from valid IVs.
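To make the contrast between IVW and the Weighted Median concrete, the following Python sketch implements both from summary statistics. It is an illustration under stated assumptions rather than the study's code: the function names are hypothetical, the per-SNP weights use the common first-order approximation (Beta_exposure² / SE_outcome²), the IVW standard error is the fixed-effects form, and the three SNPs are invented so that one of them is an obvious outlier.

```python
import math


def wald_ratios(beta_x, beta_y):
    """Per-SNP causal estimates: outcome effect divided by exposure effect."""
    return [by / bx for bx, by in zip(beta_x, beta_y)]


def ivw(beta_x, beta_y, se_y):
    """Fixed-effects IVW: slope of a weighted regression through the origin."""
    num = sum(bx * by / s ** 2 for bx, by, s in zip(beta_x, beta_y, se_y))
    den = sum(bx ** 2 / s ** 2 for bx, s in zip(beta_x, se_y))
    return num / den, math.sqrt(1.0 / den)


def weighted_median(beta_x, beta_y, se_y):
    """Median of the weighted empirical distribution of the Wald ratios."""
    ratios = wald_ratios(beta_x, beta_y)
    weights = [bx ** 2 / s ** 2 for bx, s in zip(beta_x, se_y)]
    pairs = sorted(zip(ratios, weights))
    total = sum(w for _, w in pairs)
    positions, cum = [], 0.0
    for _, w in pairs:
        cum += w
        positions.append((cum - 0.5 * w) / total)  # mid-point percentile of each SNP
    below = [i for i, p in enumerate(positions) if p < 0.5]
    if not below:
        return pairs[0][0]
    i = below[-1]
    if i == len(pairs) - 1:
        return pairs[i][0]
    r0, r1 = pairs[i][0], pairs[i + 1][0]
    p0, p1 = positions[i], positions[i + 1]
    # Linear interpolation to the 50th percentile.
    return r0 + (r1 - r0) * (0.5 - p0) / (p1 - p0)


# Hypothetical data: two SNPs agree on a ratio of 0.5, one outlier has a ratio of 2.0.
bx = [0.1, 0.1, 0.1]
by = [0.05, 0.05, 0.20]
se = [0.01, 0.01, 0.01]

ivw_est, ivw_se = ivw(bx, by, se)    # estimate is pulled to about 1.0 by the outlier
wm_est = weighted_median(bx, by, se)  # stays at about 0.5
```

In this toy example the single outlying SNP doubles the IVW estimate, while the Weighted Median stays with the majority of the weight, which is the robustness property the text refers to.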
Multiple sensitivity analysis methods were applied in this study. In the IVW method, the heterogeneity of IVs was assessed using Cochran’s Q test [19]. When heterogeneity was present (P<0.05), an IVW random effects model was used; if no heterogeneity was present, an IVW fixed effects model was applied [20]. Where results indicated no heterogeneity, the fixed effects IVW method was preferred over the random effects IVW method. Additionally, we used the MR-Egger intercept test [17]. In the MR-Egger test, an intercept p-value below 0.05 indicates horizontal pleiotropy, meaning that the genetic variants may influence the outcome through pathways other than the exposure. It is noteworthy that all MR analyses were conducted using version 4.3.1 of “TwoSampleMR” [21].
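The two sensitivity statistics named above can be sketched as follows. This is a simplified illustration with hypothetical function names and invented data: it computes Cochran's Q around the IVW estimate and the MR-Egger intercept and slope, but it does not compute p-values, which would additionally require a chi-square distribution (for Q) and the intercept's standard error (for the Egger test).

```python
def cochran_q(beta_x, beta_y, se_y):
    """Cochran's Q: weighted squared deviations of the Wald ratios from the IVW estimate."""
    weights = [bx ** 2 / s ** 2 for bx, s in zip(beta_x, se_y)]
    ratios = [by / bx for bx, by in zip(beta_x, beta_y)]
    ivw_est = sum(w * r for w, r in zip(weights, ratios)) / sum(weights)
    q = sum(w * (r - ivw_est) ** 2 for w, r in zip(weights, ratios))
    return q, len(ratios) - 1  # Q statistic and its degrees of freedom


def egger_regression(beta_x, beta_y, se_y):
    """MR-Egger: weighted regression of outcome effects on exposure effects with an intercept."""
    # Orient every SNP so that its exposure effect is positive.
    xs, ys = [], []
    for bx, by in zip(beta_x, beta_y):
        sign = 1.0 if bx >= 0 else -1.0
        xs.append(sign * bx)
        ys.append(sign * by)
    w = [1.0 / s ** 2 for s in se_y]
    total = sum(w)
    x_bar = sum(wi * x for wi, x in zip(w, xs)) / total
    y_bar = sum(wi * y for wi, y in zip(w, ys)) / total
    sxy = sum(wi * (x - x_bar) * (y - y_bar) for wi, x, y in zip(w, xs, ys))
    sxx = sum(wi * (x - x_bar) ** 2 for wi, x in zip(w, xs))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Hypothetical data built as beta_y = 0.02 + 0.5 * beta_x, i.e. a constant
# pleiotropic offset of 0.02 on top of a causal effect of 0.5.
bx = [0.1, 0.2, 0.3]
by = [0.07, 0.12, 0.17]
se = [0.01, 0.01, 0.01]

q_stat, q_df = cochran_q(bx, by, se)               # Q is about 1.714 with 2 degrees of freedom
egger_intercept, egger_slope = egger_regression(bx, by, se)  # about 0.02 and 0.5
```

A non-zero intercept, as in this constructed example, is what the MR-Egger intercept test treats as a signal of directional horizontal pleiotropy.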
Selection of instrumental variables
Based on the UK Biobank, after a series of stringent screening steps, eligible SNPs from the 1,091 blood metabolites and 309 metabolite ratios were finally included in this analysis. These SNPs were carefully selected according to strict quality control standards to ensure their reliability. Importantly, these SNPs showed F-statistics above the threshold of 10, indicating that they are strong instruments in our MR analysis for coronary atherosclerosis [22]. To support further analyses, we systematically collected key information about these SNPs, including the effect allele, other allele, beta, standard error (se), and p-value. Detailed characteristics of these IVs are presented in Supplement Table 1.
Causal impact of 1,091 blood metabolites and 309 metabolite ratios on coronary atherosclerosis
The MR analysis results for 1,091 blood metabolites and 309 metabolite ratios related to coronary atherosclerosis are detailed in Supplement Table 2. Through the IVW method, 111 significant causal associations were observed (Figure 1). Specifically, among the analyzed 1,091 blood metabolites and 309 metabolite ratios, we identified 51 blood metabolites and 10 metabolite ratios that increase the risk of coronary atherosclerosis, whereas 35 blood metabolites and 15 metabolite ratios decrease the risk.
Sensitivity analysis
The results of Cochran's Q test showed heterogeneity among these causal associations, particularly with levels of Isobutyrylcarnitine (c4), 1-stearoyl-2-oleoyl-GPE (18:0/18:1), Oleoyl-linoleoyl-glycerol (18:1/18:2), 1-stearoyl-2-linoleoyl-GPE (18:0/18:2), 1-stearoyl-2-arachidonoyl-GPE (18:0/20:4), N-palmitoyl-sphinganine (d18:0/16:0), 1-oleoyl-2-arachidonoyl-GPE (18:1/20:4), Linoleoyl-arachidonoyl-glycerol (18:2/20:4), 4-acetamidobutanoate, and 1-palmitoyl-2-oleoyl-GPE (16:0/18:1) and ratios of Phosphate to linoleoyl-arachidonoyl-glycerol (18:2 to 20:4), Retinol (Vitamin A) to linoleoyl-arachidonoyl-glycerol (18:2 to 20:4), Cholesterol to oleoyl-linoleoyl-glycerol (18:1 to 18:2) (Supplement Table 3). Thus, the causal relationship between these 14 exposures and coronary atherosclerosis outcomes was assessed using the IVW random effects model. Additionally, the intercept test of the MR-Egger analysis observed horizontal pleiotropy with levels of N-acetyltaurine (Supplement Table 4). Moreover, Steiger test results indicated that the observed associations are unlikely due to reverse causality.
External validation using FinnGen database for coronary atherosclerosis GWAS data
We conducted additional validation based on GWAS data from the FinnGen database to establish the causal relationships between blood metabolites, metabolite ratios, and atherosclerosis. Of the positive results obtained from the test dataset, 14 were validated (Supplement Table 5). Figure 2 displays the positive outcomes of the validation analysis. Specifically, levels of Gamma-glutamylglycine (OR=0.935, 95% CI: 0.895-0.976, p-value=0.002), Cysteine-glutathione disulfide (OR=0.931, 95% CI: 0.893-0.970, p-value=0.001), 2-o-methylascorbic acid (OR=0.956, 95% CI: 0.918-0.996, p-value=0.031), 3-indoleglyoxylic acid (OR=0.956, 95% CI: 0.921-0.992, p-value=0.017), and Glycine to alanine ratio (OR=0.955, 95% CI: 0.926-0.985, p-value=0.003) were found to reduce the risk of coronary atherosclerosis. Conversely, levels of Galactonate (OR=1.057, 95% CI: 1.005-1.112, p-value=0.030), 1-stearoyl-GPG (18:0) (OR=1.068, 95% CI: 1.021-1.116, p-value=0.004), 1-palmitoyl-GPI (16:0) (OR=1.051, 95% CI: 1.002-1.101, p-value=0.039), N-acetyl-3-methylhistidine (OR=1.039, 95% CI: 1.004-1.076, p-value=0.029), 1-palmitoyl-2-arachidonoyl-GPE (16:0/20:4) (OR=1.030, 95% CI: 1.002-1.058, p-value=0.037), X-23654 (OR=1.030, 95% CI: 1.002-1.058, p-value=0.037), 1-stearoyl-2-arachidonoyl-GPE (18:0/20:4) (OR=1.035, 95% CI: 1.007-1.064, p-value=0.014), 4-acetamidobutanoate (OR=1.080, 95% CI: 1.030-1.133, p-value=0.002), and 1-stearoyl-2-arachidonoyl-GPE (18:0/20:4) (OR=1.046, 95% CI: 1.011-1.082, p-value=0.009) were associated with increased risk of coronary atherosclerosis. The effects of metabolites and metabolite ratios on coronary atherosclerosis were consistent with the test dataset results, demonstrating the robustness of these 14 associations. Cochran's Q test analysis indicated heterogeneity in genetic variations related to levels of Gamma-glutamylglycine and 2-o-methylascorbic acid (Supplement Table 6).
The intercept test of the MR-Egger analysis confirmed the absence of horizontal pleiotropy in all associations of the validation results (Supplement Table 7).
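Odds ratios with 95% confidence intervals, as reported in the validation results above, are conventionally obtained by exponentiating the MR estimate on the log-odds scale. The sketch below shows that conversion under a normal approximation (z = 1.96); the function name and the input values are hypothetical and are not taken from the study's supplementary tables.

```python
import math


def odds_ratio_ci(beta: float, se: float, z: float = 1.96):
    """Convert a log-odds estimate and its standard error to (OR, lower, upper)."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


# Hypothetical MR estimate: log-odds of -0.05 per unit of exposure, SE 0.02.
or_est, ci_low, ci_high = odds_ratio_ci(beta=-0.05, se=0.02)
# or_est is about 0.951, with an interval of roughly 0.915 to 0.989
```

An interval that lies entirely below 1, as here, corresponds to a protective association at the 5% level; an interval entirely above 1 corresponds to increased risk.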
Reverse Mendelian randomization analysis
To further investigate the causality between coronary atherosclerosis and blood metabolites and metabolite ratios, we conducted a reverse MR analysis using instrumental variables representing coronary atherosclerosis, detailed in Supplement Table 8. The reverse analysis found no causal relationship between coronary atherosclerosis and the 14 screened metabolites and metabolite ratios. Sensitivity analysis did not detect heterogeneity or horizontal pleiotropy (Supplement Table 9 and Supplement Table 10).
This study is the first to apply Mendelian Randomization (MR) to explore the causal relationships between 1,091 blood metabolites and 309 metabolite ratios and Coronary Heart Disease (CHD), prioritizing the detection of causal evidence. By utilizing comprehensive genome-wide association study data of blood metabolites and assessing causal links between blood metabolites and CHD via MR, we circumvent potential reverse causality and environmental confounders [23]. By employing genetic variants as probes, we initially identified 86 metabolites and 25 metabolite ratios associated with CHD. Subsequent validation confirmed that 13 blood metabolites and one metabolite ratio are causally linked to CHD, providing evidence for future research into the pathogenic mechanisms of CHD through blood metabolites.
Our findings underscore a strong causal relationship between the glycine to alanine ratio and CHD, the only metabolite ratio linked to CHD in our conclusions. Glycine, a non-essential amino acid synthesized endogenously, is involved in metabolic synthesis between the liver and kidneys and is a key component of the antioxidant glutathione, participating in the body's antioxidative reactions. Supplementation with glycine has been shown to improve symptoms of diabetes, hyperlipidemia, and hypertension in patients with metabolic syndrome, which are risk factors for CHD [24-26]. Alanine, a natural and non-toxic amino acid, has been identified in a nine-year longitudinal study as a new amino acid closely associated with the risk of CHD [27]. Both alanine and glycine have been linked to CHD, thereby supporting our conclusion that the glycine to alanine ratio is causally related to CHD. Beyond CHD, the glycine to alanine ratio has also been shown to affect bipolar disorder [28]. In a study involving European populations, alanine was negatively correlated with the incidence of angina, while in East Asian populations, each unit increase in glycine reduced the risk of myocardial infarction and coronary artery disease by 9.0% and 4.1%, respectively [29], consistent with our findings. Galactonate is a precursor in the biotechnological production of ascorbic acid. Claire Yager et al. showed that the liver is the main organ capable of oxidizing galactose, that the heart and muscles also participate in galactose metabolism, and that cardiac function changes in patients with galactosemia [30,31]. This evidence suggests that galactonate levels may be closely linked to heart function.
Levels of 2-o-methylascorbic acid, a potent antioxidant, have been examined in elderly patients before and after cardiovascular surgery, finding that post-surgery plasma levels of ascorbic acid are low and unrelated to pre-surgery delirium; moreover, L-ascorbic acid can protect against vascular dysfunction caused by cadmium, closely related to CHD [32,33]. Elevated blood levels of the amino acid homocysteine are a likely risk factor for CHD, with its metabolism closely linked to B vitamins, and elevated levels can promote endothelial dysfunction and the occurrence of atherosclerosis and thrombosis through various pathways; a meta-analysis found that reducing homocysteine levels by 25% could reduce the risk of CHD by about 11% [34,35]. γ-glutamyl transferase, a ubiquitous enzyme on cell surfaces that catalyzes the reaction between glutathione and amino acids to form glutamyl amino acids and cysteinylglycine, has activities linked to various cardiac metabolic risk factors, particularly associated with the risk of death from CHD, though its direct role in the development of atherosclerosis and CHD is unclear [36]. This provides evidence that levels of glutamylglycine, gamma-glutamylthreonine, and cysteine-glutathione disulfide could contribute to the development of coronary atherosclerosis.
Our results also show that levels of 1-palmitoyl-GPI (16:0), 1-palmitoyl-2-arachidonoyl-GPE (16:0/20:4), and 1-palmitoyl-2-oleoyl-GPE (16:0/18:1) can lead to the occurrence of coronary atherosclerosis [26]. In conjunction with published studies, a longitudinal study in Daqing identified 16 blood metabolites associated with increased CVD risk in diabetic patients, among which palmitoyl sphingomyelin, a metabolic product of sphingomyelin containing palmitate at variable acyl positions, was seldom mentioned in previous literature related to CVD. However, the longitudinal study results demonstrated that palmitoyl sphingomyelin is the strongest independent factor related to CVD in diabetic patients [37]. This aligns with our findings. Beyond amino acids, lipid metabolism is also crucial for maintaining cardiac function. Stearoyl-CoA desaturase-1 (SCD; human isoform SCD1), an enzyme critical in de novo fatty acid synthesis, plays a role in cardiac metabolic remodeling and maintaining metabolic homeostasis in the cardiovascular system. A deficiency in SCD1 could accelerate vascular calcification and lead to atherosclerosis under conditions of hyperlipidemia [38,39]. Thus, there is theoretical support that levels of 1-stearoyl-GPG (18:0) and 1-stearoyl-2-arachidonoyl-GPE (18:0/20:4) can lead to the occurrence of coronary heart disease.
Additionally, our results indicate that levels of N-acetyl-3-methylhistidine, 3-indoleglyoxylic acid, and X-23654 are closely related to the occurrence of coronary heart disease. However, how these metabolites influence coronary heart disease through metabolic pathways is not fully understood, necessitating further research.
Our study has several strengths. Firstly, it covers 1,091 blood metabolites and 309 metabolite ratios, providing a comprehensive range of genetic variables covering a broad spectrum of known metabolites and disease relationships, with 81 metabolites not previously tested in representative large-scale metabolomic GWAS. Secondly, the test dataset comprised 456,348 individuals of European ancestry, including 16,041 cases, and the validation dataset came from FinnGen R9. Combining these GWAS data with MR research allows for effective causal inference with high statistical power, giving the study's conclusions a degree of credibility and persuasiveness. Our genetic-level analysis identifies blood metabolites and metabolite ratios related to coronary atherosclerosis, offering a novel analysis method that uses genetic variations as proxies to assess causal relationships between exposure and outcome. The genes regulating the 13 blood metabolites and one metabolite ratio identified may aid in the precise prevention of coronary heart disease or serve as potential drug targets for modulating metabolite levels [40].
However, the study also has some limitations. Firstly, the cohort mainly consists of individuals of European ancestry, so the results should be applied cautiously to other ethnicities or populations. Secondly, metabolite levels can vary significantly between different tissues and cells [41]. Lastly, although the study identified 13 blood metabolites and one metabolite ratio that can cause coronary heart disease, the mechanisms behind their pathogenesis remain unclear, requiring further research to explore their pathogenic mechanisms in coronary heart disease.
This study utilizes a Mendelian Randomization approach to robustly identify blood metabolites causally linked to coronary atherosclerosis, highlighting the potential for metabolic interventions in disease prevention and treatment. Despite its innovative methodology and strong statistical backing, the findings' generalizability across different ethnicities remains a limitation. Future research should focus on validating and expanding these results through clinical trials, aiming to integrate these insights into practical therapeutic strategies.