RESEARCH
Peer reviewed scientific video journal
Video encyclopedia of advanced research methods
Visualizing science through experiment videos
EDUCATION
Video textbooks for undergraduate courses
Visual demonstrations of key scientific experiments
BUSINESS
Video textbooks for business education
OTHERS
Interactive video based quizzes for formative assessments
Products
RESEARCH
JoVE Journal
Peer reviewed scientific video journal
JoVE Encyclopedia of Experiments
Video encyclopedia of advanced research methods
EDUCATION
JoVE Core
Video textbooks for undergraduates
JoVE Science Education
Visual demonstrations of key scientific experiments
JoVE Lab Manual
Videos of experiments for undergraduate lab courses
BUSINESS
JoVE Business
Video textbooks for business education
Solutions
Language
English
Menu
Menu
Menu
Menu
A subscription to JoVE is required to view this content. Sign in or start your free trial.
Research Article
Erratum Notice
Important: There has been an erratum issued for this article. View Erratum Notice
Retraction Notice
The article Assisted Selection of Biomarkers by Linear Discriminant Analysis Effect Size (LEfSe) in Microbiome Data (10.3791/61715) has been retracted by the journal upon the authors' request due to a conflict regarding the data and methodology. View Retraction Notice
We present a protocol to integrate multiomics Mendelian randomization, single-cell transcriptomics, and structure-based in silico screening to systematically delineate the Drebrin Like (DBNL)-macrophage axis in heart failure. We suggest DBNL as a druggable immunometabolic target.
Heart failure (HF) remains a major global health challenge, with limited effective treatments targeting its core pathophysiological mechanisms. In this study, an integrated multiomics approach combining Mendelian randomization (MR) and single-cell RNA sequencing (scRNA-seq) was used to identify potential biomarkers and therapeutic targets for HF. We utilized data from genome-wide association studies (GWASs), expression quantitative trait loci (eQTLs), methylation quantitative trait loci (mQTLs), and protein quantitative trait loci (pQTLs) to investigate the genetic mechanisms of HF. Single-cell RNA sequencing was performed to analyse gene expression in cardiac macrophages, and pseudotime analysis was used to study the dynamic regulation of DBNL during macrophage differentiation. Molecular docking and dynamics simulations identified pirinixic acid (WY-14643; PubChem CID: 4594; synonyms: 50892-23-4; WY-14643) as a potential regulator of DBNL. Multiomics analysis revealed that DBNL (Drebrin-like) is a key gene associated with HF risk. Single-cell RNA sequencing revealed that DBNL is expressed mainly in cardiac macrophages and is upregulated under pathological conditions. The expression of DBNL by macrophages is associated with immune metabolism and profibrotic pathways, particularly the IL-6/JAK/STAT3, PI3K/AKT/mTOR, and TGF-β signalling pathways. Pseudotime analysis indicated that DBNL has a dynamic regulatory effect on macrophage differentiation, especially in chronic inflammation. Molecular docking and dynamic simulations have shown that pyridine acid has a potential role in regulating DBNL. This study elucidates the role of DBNL in the progression of heart failure and suggests its potential as a therapeutic target. Importantly, the therapeutic relevance is inferred from computational predictions. These findings offer novel insights into immune metabolism and macrophage-mediated intercellular communication, establishing a theoretical basis for future applications in the optimization of DBNL-targeted drugs and functional research. Nonetheless, additional experimental validation and preclinical studies are needed to substantiate the clinical efficacy of DBNL as a therapeutic target.
Heart failure (HF) represents a significant and growing global health challenge that affects millions of individuals worldwide and places considerable economic strain on healthcare systems1. Consequently, there is a pressing need for efficacious interventions in heart failure (HF) management.
The complex pathophysiology of heart failure originates from impaired myocardial contractility of the left ventricle, leading to a cascade of systemic complications, such as pulmonary and peripheral congestion2. However, directly addressing the fundamental pathophysiological mechanisms that underlie the progression of heart failure, particularly adverse myocardial remodelling, remains a critical and unmet challenge3. In recent decades, the domain of biomedical research has undergone a substantial paradigm shift, primarily facilitated by the incorporation of multiomics technologies and advanced bioinformatics methodologies. These tools offer exceptional opportunities for identifying novel disease biomarkers and therapeutic targets. Mendelian randomization (MR), a rigorous genetic epidemiological approach, uses single-nucleotide polymorphisms (SNPs) as instrumental variables to infer causal relationships between genetic exposures and clinical outcomes4. By adopting the methodological framework of randomized controlled trials, Mendelian randomization (MR) effectively addresses the problems of confounding and reverse causality to provide robust evidence for the identification and repurposing of drug targets. In parallel, single-cell RNA sequencing (scRNA-seq) reveals tissue heterogeneity with unparalleled resolution,offering profound insights into cell type-specific gene expression and intercellular communication networks within intricate organs such as the heart5.The multiomics approach, which synthesizes genetic, transcriptomic, and proteomic data, offers substantial potential for elucidating disease mechanisms and identifying therapeutic pathways.
Despite these technological advances, significant knowledge gaps persist regarding the specific cellular and molecular mechanisms that drive cardiac inflammation and remodelling in heart failure. Accumulating evidence suggests that immune cells, particularly macrophages, are crucial for orchestrating inflammatory responses and subsequent tissue remodelling during myocardial injury and chronic heart failure processes, which are in turn crucial to disease progression6. Previous research has concentrated predominantly on broad inflammatory mediators or cell surface markers, resulting in a significant gap in the exploration of intracellular regulatory targets within these crucial immune cell populations. Additionally, although traditional genome-wide association studies (GWASs) have identified numerous genetic associations related to heart failure, they frequently encounter challenges in establishing definitive causal relationships, thereby limiting their applicability for direct therapeutic translation7.Consequently, the implementation of a comprehensive multiomics Mendelian randomization (MR) framework, which incorporates genetic variants that influence DNA methylation (mQTLs), gene expression (eQTLs), and protein abundance (pQTLs), is essential for achieving more robust causal inferences across multiple biological regulatory layers. This approach addresses the inherent limitations associated with single-layer QTL studies. Notably, the Drebrin-like (DBNL) protein has been the subject of extensive research in the context of neuronal development and cytoskeletal dynamics8; however, its functional significance and therapeutic potential in cardiovascular diseases, particularly heart failure, remain largely unexplored. This significant gap in our understanding impedes the consideration of DBNL as a novel and potentially viable target for drug development in heart failure treatment.
To address these critical knowledge gaps, this study aims to robustly identify key genes causally associated with heart failure risk through an integrated multiomics MR framework that combines eQTL, mQTL, and pQTL data. This study presents a novel approach involving the integration of multiomics Mendelian randomization analyses -- including expression quantitative trait loci (eQTLs), methylation quantitative trait loci (mQTLs), and protein quantitative trait loci (pQTLs) -- with high-resolution single-cell transcriptomic data (GSE161470) and bulk transcriptomic data (GSE161472) to systematically identify the DBNL gene as a critical causal risk factor for heart failure. These findings demonstrate that DBNL is predominantly and specifically expressed in cardiac macrophages and that its expression changes dynamically throughout their differentiation process. Moreover, the results of this study elucidate the pivotal role of DBNL in modulating the cardiac immune microenvironment and orchestrating heart failure-associated immunometabolic pathways, such as the IL-6/JAK/STAT3 signalling cascade, as well as profibrotic signalling pathways. Additionally, by employing computational biology techniques, this research predicts that pyridyl acid (WY-14643) is a high-affinity ligand for DBNL, thereby providing a comprehensive, multidimensional theoretical framework that supports the designation of DBNL as a novel therapeutic target for heart failure and facilitates the development of targeted pharmacological interventions. Using single-cell RNA sequencing, we characterized the cell-specific expression and functional pathways of the key gene DBNL identified within the cardiac microenvironment, and we investigated the role of DBNL in immunometabolic and profibrotic signalling pathways, as well as in cell-cell communication among cardiac cell types. Potential small-molecule ligands for DBNL were identified through computational molecular docking and molecular dynamics simulations. We hypothesize that DBNL, which is expressed primarily in cardiac macrophages, exerts a causal effect on heart failure progression by regulating immunometabolic pathways and intercellular crosstalk. In summary, by integrating advanced multiomics Mendelian randomization with high-resolution single-cell analysis, we have systematically identified and characterized novel therapeutic targets for heart failure and explored potential small-molecule modulator targets.
Data Download
Gene Expression Data
The single-cell RNA sequencing (scRNA-seq) data utilized in the present study were sourced from the Gene Expression Omnibus (GEO) repository maintained by the National Center for Biotechnology Information (NCBI) (https://www.ncbi.nlm.nih.gov/geo/), specifically from the dataset with the accession number9 GSE161470 (human cardiac tissue comprising four control specimens and one pathological specimen). This dataset was originally published by Zhang et al. in 202210. The primary objective of the original investigation was to examine the cellular heterogeneity and molecular regulatory mechanisms in human cardiac tissue under conditions of heart failure. For the current analysis, five samples from this dataset, each comprising comprehensive single-cell expression profiles derived from human heart tissue, were selected. The other dataset utilized in this study was likewise obtained from the NCBI GEO public repository, specifically the Series Matrix File corresponding to accession number GSE161472, accompanied by the annotation file GPL11154. The expression profile comprises a total of 84 samples, including 37 control samples and 47 disease samples. This research encompasses an integrative multiomics analysis, with all investigations conducted using publicly accessible data.
eQTL data
The eQTL data, obtained from the eQTLGen consortium, focus on elucidating the genetic framework of gene expression in blood and the genetic factors influencing complex traits3. The consortium is presently engaged in the second phase of its extensive project, undertaking meta-analyses of genome-wide data pertaining to blood gene expression.
Exposure data - mQTLs
The mQTL data were obtained from a published meta-analysis of the European (EUR) cohort, which examines whole-blood DNA methylation within the genetic framework of 3,701 samples from populations of European ancestry11. The dataset included information on 426,636 mQTL traits.
Exposure data-pQTL
Plasma pQTL data were acquired from the deCODE database (https://www.decode.com/summarydata/)4. This study employed the 2021 data release of the deCODE pQTL dataset, encompassing a genome-wide association study (GWAS) of plasma protein levels measured using 4,907 aptamers in a cohort of 35,559 individuals of European descent.
Outcome data
Summary statistics for heart failure were obtained from a large-scale genome-wide association study (GWAS) primarily involving participants of European ancestry, accessed through the European Bioinformatics Institute (EBI) database (GCST90162626). The heart failure dataset included 115,150 cases and 1,550,331 controls. The GWAS Catalogue, which encompasses publications, leading associations, and detailed summary statistics, currently offers data mapped to the Genome Assembly and dbSNP Build.
Mendelian randomization analysis of mQTLs, eQTLs, and pQTLs
To systematically investigate the potential causal relationships between gene expression, protein abundance, DNA methylation levels, and the risk of heart failure, Mendelian randomization (MR) analyses were performed utilizing expression quantitative trait loci (eQTLs), protein quantitative trait loci (pQTLs), and methylation quantitative trait loci (mQTLs) datasets. During the exposure data preprocessing phase, single-nucleotide polymorphisms (SNPs) associated with each exposure variable (gene, protein, or methylation site) were extracted from the respective databases at a genome-wide significance threshold of P < 1 × 10⁻⁵ to serve as initial candidate instrumental variables (IVs). Subsequently, linkage disequilibrium (LD) clumping was conducted for each exposure factor's IVs using a window size of 10,000 kilobases (kb) and an LD R² threshold of 0.001 to ensure independence among instruments. These selected IVs were then harmonized with summary statistics from a heart failure genome-wide association study (GWAS; ID: GCST90162626) by employing the read_outcome_data function, retaining only those SNPs exhibiting an association P value below 5×10⁻⁵ in the outcome dataset. To mitigate weak instrument bias, the F statistic for each IV was calculated as F = (β_exposure/SE_exposure)², and only instruments with F > 10 were included in subsequent analyses. For causal effect estimation, allele alignment between the exposure and outcome datasets was performed using the harmonize_data function from the TwoSampleMR package. MR analyses were then conducted employing four complementary statistical approaches: (1) the inverse variance weighted (IVW) method, which provides meta-analyses of Wald ratio estimates across SNPs; (2) MR-Egger regression, which accounts for directional pleiotropy by incorporating an intercept term under the Instrument Strength Independent of Direct Effect (InSIDE) assumption; (3) the weighted median method, which yields consistent causal estimates even if up to 50% of the instruments are invalid; and (4) the weighted mode method, which identifies the most frequent causal effect estimate cluster, offering enhanced statistical power and reduced type I error relative to MR-Egger. In instances where only a single instrumental variable was available, the Wald ratio method was applied exclusively. To assess the robustness of the findings, comprehensive sensitivity analyses were performed, including heterogeneity testing via the mr_heterogeneity function, pleiotropy assessment using mr_pleiotropy_test, and leave-one-out analyses implemented through the mr_leaveoneout function, which iteratively excludes each SNP to determine the influence of individual variants on the overall results. Significant associations were visualized using graphical tools such as mr_scatter_plot and mr_forest_plot. This analytical pipeline was uniformly applied across the eQTL, pQTL, and mQTL datasets to maintain methodological consistency throughout the study.
Colocalization analysis
A colocalization analysis was conducted through the Coloc method, eQTL summary data, and a GWAS of heart failure5. The index single-nucleotide polymorphism (SNP) was utilized to compute the posterior probability within a 100 kb clumping window. In the colocalization (coloc) analysis, Hypothesis H3 denotes the posterior probability that the two traits, namely, gene expression and heart failure, are correlated but possess distinct causal variants. Conversely, Hypothesis H4 indicates the posterior probability that the association between the two traits is attributable to a single, shared causal variant. A threshold of SNP.PP.H4 greater than 0.90 was used to determine colocalization.
Immune infiltration
The CIBERSORT method is a widely adopted technique for assessing immune cell types within the microenvironment12. By utilizing support vector regression principles, deconvolution analysis of the expression matrix of immune cell subtypes can be performed. By incorporating 547 biomarkers, CIBERSORT can differentiate 22 human immune cell phenotypes, including T cells, B cells, plasma cells, and various myeloid cell subpopulations. Utilizing the GSE161472 dataset, an analysis was conducted employing the CIBERSORT algorithm in conjunction with its integrated LM22 signature matrix, which characterizes the gene expression profiles of 22 distinct human immune cell types. The infiltration levels of these 22 immune cell populations were quantified for each individual sample. Subsequently, the cor.test function was applied to assess the correlations between the expression of key genes and the corresponding immune cell infiltration levels.
Single-cell RNA sequencing data processing and quality control
The single-cell expression profile data were processed using the Seurat (V4.3.0) package in the R (V4.3.0) environment6.This study utilized a conventional workflow for the analysis of single-cell RNA sequencing data. Initially, the expression profiles were imported utilizing the Seurat package. Cells were filtered on the basis of several quality metrics, including each cell's total UMI count, number of expressed genes, proportion of mitochondrial reads, and proportion of ribosomal reads. Outliers were identified as values deviating from the median by more than three median absolute deviations (MADs). The specific filtering thresholds applied were as follows: nFeature_RNA ≥ 200, percent.mt ≤ 2.15718, nFeature_RNA ≤ 2840.941, and nCount_RNA ≤ 5194.27. Typically, cells exhibiting excessively high total UMI counts and numbers of expressed genes were classified as doublets, whereas cells with elevated percentages of mitochondrial or ribosomal reads were considered to be of low quality, potentially undergoing apoptosis or fragmentation. Following these filtering steps, DoubletFinder (version 2.0.4) was employed to identify and remove doublets from each sample individually, completing the cell quality control process. Initially, data normalization was conducted using the normalizeData function. The cell cycle status was subsequently evaluated via the CellCycleScoring function, and highly variable genes were identified through the FindVariableFeatures method. The dataset was then scaled using ScaleData to standardize the data and to mitigate the influence of mitochondrial genes, ribosomal genes, and cell cycle effects on downstream analyses. Linear dimensionality reduction was performed using principal component analysis (PCA) through the RunPCA function, with significant principal components selected for further analysis. To address batch effects across different samples, the Harmony algorithm (version 1.1.0) was employed. This approach iteratively clusters similar cells from distinct batches within PCA space while maintaining batch diversity within clusters. Given the relatively mild batch effects observed in the dataset, default parameters (θ = 2) were applied. Nonlinear dimensionality reduction was subsequently carried out using RunUMAP, followed by the construction of a cell neighbourhood graph via FindNeighbors and cell clustering through FindClusters. For cell type annotation, a hierarchical annotation framework was implemented: primary manual annotation was based on characteristic gene expression patterns informed by the CellMarker database and pertinent literature; this was complemented by automated annotation results obtained from the SingleR software as a reference. To further increase the accuracy and comprehensiveness of cell type identification, multiple authoritative databases, including the Human Primary Cell Atlas (HPCA), BlueprintEncode, MonacoImmune, DatabaseImmuneCell, and NovershternHaematopoietic, were consulted. Cell annotation was conducted by querying the CellMarker (http://117.50.127.228/CellMarker/CellMarkerBrowse.jsp) database and reviewing the literature, aided by automated annotation support provided by SingleR (V2.4.0) software13. It aims to identify the cell types present in the corresponding tissue and their associated marker genes14.
Analysis of ligand - receptor interactions
In this study, CellCall (version 1.0.7) was used to perform a comprehensive analysis of intercellular communication networks15. By utilizing cell type annotations derived from Seurat alongside the raw count matrix, a normalized analysis object was constructed with parameters configured for the human genome. The TransCommuProfile function was applied to quantify the strength of cell-cell interactions through a weighted algorithm, implementing a significance threshold of a p value < 0.05 to identify reliable ligand-receptor pairs. The significant interaction pairs were subsequently subjected to KEGG pathway enrichment analysis via the getHyperPathway function, and the relationships between cell types and pathways were illustrated using bubble plots. The overall communication network was ultimately visualized through a circular plot, wherein eight distinct colours denoted different cell types. Interaction strength and directionality were represented by arrow features, providing a detailed characterization of intercellular signalling dynamics.
Pseudotime analysis
To investigate the dynamic transcriptional regulation of macrophages throughout the progression of heart failure, this study utilized the Monocle algorithm to conduct pseudotime analysis on macrophage subpopulations. The gene expression matrix corresponding to the target cell subpopulations was extracted to construct single-cell trajectory analysis objects, with highly variable genes selected as ordering features. By employing the DDRTree dimensionality reduction technique, cells were mapped onto a two-dimensional space to reconstruct the differentiation trajectory. Visualization analyses were performed to determine the distribution of cells along the pseudotime axis and to identify genes whose expression changed significantly over pseudotime. Subsequent analyses concentrated on the key gene DBNL and characterized its expression dynamics along the cell trajectory, elucidating the transcriptional reprogramming mechanisms of macrophages during heart failure progression16.
Gene set enrichment analysis (GSEA)
In this study, a GSEA approach was used to elucidate the regulatory mechanisms associated with key genes involved in heart failure. Utilizing previously identified key genes, samples were stratified into high and low-expression cohorts on the basis of the median expression value. Differential expression analysis was conducted employing the limma package, generating a ranked gene list according to log fold change (logFC). Subsequent KEGG pathway enrichment analysis was performed using the clusterProfiler tool, with gene sets sourced from the MsigDB database serving as the reference background. The GSEA algorithm was then applied to identify signalling pathways that were significantly enriched between the two expression groups, and an adjusted p-value threshold of less than 0.05 was used to determine statistical significance. To illustrate the regulatory functions of the core genes within critical pathways, various visualization techniques, including multipathway GSEA plots and circular network diagrams, were used.
Gene set variation analysis (GSVA)
GSVA is a nonparametric, unsupervised approach utilized for assessing gene set enrichment within transcriptomic data. This method transforms gene-level variations into pathway-level variations by calculating composite scores for specific gene sets, facilitating the evaluation of biological functional changes across various samples. In the present study, gene sets were sourced from the Molecular Signatures Database. The GSVA algorithm was employed to compute composite scores for each gene set, enabling the evaluation of potential biological functional alterations across different samples. The results of the GSVA enrichment analysis are provided in the supplementary material (Supplementary Table 1).
CTD drug prediction
The target gene (DBNL) was input into the search field of the Comparative Toxicogenomics Database (CTD), the disease category "cardiovascular disease" was selected, and the query was executed to retrieve drug prediction data associated with the condition "heart failure". The obtained prediction results were subsequently imported into Cytoscape software to facilitate data visualization and enable the construction of a gene-chemical interaction network map.
Molecular docking methods
Owing to the unresolved three-dimensional crystal structure of the human DBNL protein (UniProt ID: Q9UJU6), this study predicted the three-dimensional structure of DBNL on the basis of AlphaFold317. Pirinixic acid (WY-14643) is available for download from the PubChem database (PubChem CID: 5694). Afterwards, the protein structure was preprocessed using MGLTools software (version 1.5.7)18, including steps such as the addition of hydrogen atoms. At the same time, proteins and small molecules were converted into the PDBQT format required for docking. AutoDock Vina software (version 1.1.2)19 was used for global molecular docking (exogenicity=16, num_modes=30) to explore potential binding modes. Upon completion of the docking calculations, the complex conformation with the highest affinity, indicated by the lowest binding free energy, should be selected as the initial structure for subsequent molecular dynamics simulations.
Molecular dynamics simulation method
To systematically investigate the binding stability and interaction mechanisms between candidate compounds and proteins, conventional molecular dynamics (MD) simulations were conducted utilizing the GROMACS software package (version 2024.03)20. The protein parameters were generated using the Amber14SB force field21, the water molecule model was generated using the TIP3P model22, and the ligand topology parameters were generated utilizing the Antechamber Python Parser Interface (ACPYPE) tool, which is based on the General Amber Force Field (GAFF). The ligand-protein complex system was subsequently situated within a periodic boundary octahedral box filled with TIP3P water molecules. Sodium (Na⁺) and chloride (Cl⁻) ions were introduced to achieve a concentration of 0.15 mol/L and to neutralize the overall charge of the system. Upon completion of the system construction, the initial step involved minimizing the energy using the steepest descent method for 50,000 steps, with the aim of eliminating any potentially unreasonable conformations within the structure. Two stages of system equilibration were subsequently performed: a 100 ps NVT (constant particle number, volume, and temperature) simulation followed by a 100 ps NPT (constant particle number, pressure, and temperature) simulation. During these simulations, positional constraints were applied to the heavy atoms of the protein backbone to preserve the structural integrity of the protein. The temperature was maintained at 300 K using the V-rescale thermostat, and the pressure was regulated at 1 bar by employing the Parrinello-Rahman barostat for pressure coupling.Upon completion of the equilibration phase, a production phase simulation spanning 100 nanoseconds was conducted, during which all positional constraints are eliminated. The trajectory was integrated using a 2-femtosecond time step, and the particle-mesh Ewald (PME) method was employed to precisely manage long-range electrostatic interactions. The trajectory was saved every 10 ps, and a total of 10000 frames were output for subsequent analysis. In addition, stable trajectories in the 90-100 ns interval were extracted from the simulation, and the binding free energy of ligand protein complexes was calculated using the gmx MMPBSA tool23.
Statistical analysis
The validity of this Mendelian randomization (MR) analysis is contingent upon three fundamental assumptions. (1) Relevance: The instrumental variables (IVs) must exhibit a strong association with the exposure. (2) Independence: The IVs must be independent of any confounders that affect both the exposure and the outcome. (3) Exclusion Restriction: The IVs should influence the outcome exclusively through their impact on the exposure. A breach of this assumption, where an IV affects the outcome through pathways not involving the exposure, is referred to as horizontal pleiotropy. All the statistical analyses were performed using R version 4.3.0, with two-sided tests, and a p value of less than 0.05 was generally considered to indicate statistical significance.
Identification of DBNL as a gene associated with heart failure risk through Mendelian randomization and colocalization analysis
The outcome identifier for heart failure-related samples, GCST90162626, was acquired from the summary statistics dataset. Causal relationships among 1,247 pairs of expression quantitative trait loci (eQTLs) and outcomes were determined utilizing the extract_instruments and extract_outcome_data functions (Supplementary Table 2). Subsequent Mendelian randomization analysis revealed 567 gene pairs associated with eQTL-positive outcomes (Figure 1A).The findings revealed that 280 genes, including MRPL11, CAMK2G, HSP90AB1, and ATP2B1, were associated with a reduced risk of heart failure (odds ratio [OR] < 1). Conversely, 287 genes, such as ZNF394, DIPK1B, FCRL1, and COPDA1, were linked to an increased risk of heart failure (OR > 1). To evaluate the reliability of the potential causal relationships among these 567 genes, sensitivity analyses were performed. The leave-one-out analysis demonstrated that the overall causal estimate for each gene remained consistent upon the exclusion of any single-nucleotide polymorphism (SNP), underscoring the robustness of the findings. pQTL protein data were subsequently obtained from the deCODE database. The causal relationships between 1,046 pairs of pQTLs and outcomes were determined using the extract_instruments and extract_outcome_data functions (Supplementary Table 1). Further MR analysis revealed 370 pQTL-related gene pairs with positive outcomes (Figure 1B). Among these genes, genes such as MSRA, EDN3, ERO1A, and CSN2 were correlated with a decreased risk of heart failure (OR < 1), whereas genes such as ETV5, CCDC149, CFD, and SPACA1 were associated with an increased risk of heart failure (OR > 1). The leave-one-out method was applied to conduct further sensitivity analysis of the causal relationships for the 370 genes to determine their reliability. The results demonstrated that removing any single SNP did not significantly alter the overall causal estimate, thereby indicating the robustness of the 370 identified potential causal relationships. We classified 3,333 causal relationships between mQTLs and outcomes (Supplementary Table 1) by using extract_instruments and extract_outcome_data sequentially.Subsequent Mendelian randomization analysis revealed 1,472 causal relationships between mQTL-associated genes and favourable outcomes (Figure 1C).The analysis revealed the identification of 716 genes, including LMTK3, GGT5, OR4C16, and CCDC91, which have been previously associated with a reduced risk of heart failure (odds ratio < 1). Conversely, 756 genes, such as SNORD114-12, OR2D3, KLHDC7B, and TAS2R19, were found to be associated with an increased risk of heart failure (odds ratio > 1).A sensitivity analysis was performed to assess the robustness of causal relationships among 1,472 mQTL-associated genes. The findings demonstrated that the exclusion of any single SNP does not significantly alter the overall estimation of causal relationships, indicating that the identified potential causal relationships are robust. To identify key genes influencing heart failure, Venn diagrams were generated to analyse the colocalization of mQTLs, eQTLs, and pQTLs through Mendelian randomization, with a focus on genes associated with both low and high risk for heart failure. Intersecting the results of this analysis with the high-risk and low-risk gene sets revealed a common high-risk gene, DBNL (Figure 1D,E). Subsequently, a colocalization analysis of DBNL at the eQTL-GWAS level was conducted. The results of this analysis provided robust evidence for a shared causal variant, as indicated by a SNP posterior probability (SNP.PP. H4) exceeded 0.90 (Figure 1F). A reverse Mendelian randomization analysis was subsequently conducted, utilizing the GCST90162626 data as the exposure variable and DBNL as the outcome variable. The analysis revealed no evidence of a causal relationship between GCST90162626 and DBNL at the expression quantitative trait locus (eQTL) level (Figure 1G). Consequently, DBNL was identified as the principal gene for further analysis in our study.
Transcriptomic analysis of macrophage subpopulations elucidating the roles of distinct subgroups in the pathogenesis of heart failure and the regulation of immune responses
Initially, expression profiles were imported utilizing the Seurat package. To ensure data integrity across multiple samples, cells of low quality-characterized by fewer than 200 detected genes-and identified outliers were excluded. Subsequently, the DoubletFinder package was employed to eliminate doublets, yielding a final dataset comprising 14,413 high-quality cells for further analysis. Visualizations of the data post-filtering, including violin and scatter plots, are presented in (Supplementary Figure 1A,B). Within this dataset, 2,000 highly variable genes were identified, as illustrated in (Supplementary Figure 1C). The data then underwent a series of preprocessing steps, including normalization, scaling, principal component analysis (PCA), and Harmony integration, as detailed in (Supplementary Figure 1D-F). Utilizing additional UMAP visualization, nine transcriptionally distinct cellular clusters that demonstrated clear separation within the low-dimensional embedding space were identified (Figure 2A). These clusters were subsequently classified into eight principal cell types: fibroblasts, cardiomyocytes, macrophages, pericytes, endothelial cells, T cells, mast cells, and neurons (Figure 2B). To elucidate the potential involvement of macrophages in the risk of heart failure, a targeted subpopulation analysis was performed. This analysis commenced with a volcano plot depicting differentially expressed genes, which emphasized transcripts exhibiting significant changes across conditions, with a particular focus on genes previously associated with the pathogenesis of heart failure (Figure 2C). Subsequently, principal component analysis (PCA) was employed to reduce dimensionality and further delineate the substructure within the macrophage population (Figures 2D,E). Notably, the macrophage subpopulations, comprising M1, M2, VCAN⁺, and EZH2⁺ macrophages, demonstrated considerable heterogeneity at the transcriptomic level, underscoring their critical involvement in immune responses. The cell annotation data are available in the supplementary table (Supplementary Table 3). Subsequent UMAP clustering analysis elucidated the cellular heterogeneity within the transcriptome space. Cells were distinctly classified into various clusters on the basis of RNA expression similarities, with each point representing an individual cell and different colours indicating distinct cell populations. This analysis revealed a pronounced spatial separation of macrophages from other cell types, such as fibroblasts and endothelial cells, suggesting that macrophages possess unique transcriptional characteristics under various biological conditions. These findings underscore the pivotal roles of macrophages in immune responses and tissue remodelling processes (Figure 2F). To further substantiate the clustering results, UMAP plots were generated to illustrate the transcriptional heterogeneity of macrophages, categorizing them into four distinct subgroups: M1, M2, VCAN⁺, and EZH2⁺ macrophages. In these plots, each point corresponds to an individual cell, with colours denoting the various subtypes. The pronounced spatial segregation of these subgroups underscores their unique transcriptional profiles, suggesting specialized roles in immune regulation and tissue remodelling (Figure 2G). These findings suggest that macrophages exhibit pronounced spatial specificity and distinct separation from other cell types. Utilizing bubble plot analysis, the expression profiles of marker genes across various cell types and macrophage subpopulations were investigated. The expression patterns of key marker genes were examined in diverse cell types, including macrophages, fibroblasts, and cardiomyocytes. The results revealed substantial differences in gene expression levels among the different cell types, with certain genes being highly expressed in specific cell types, potentially contributing significantly to cellular functions and biological processes (Figure 2H). A comprehensive analysis of the expression of marker genes, including M1, M2, VCAN⁺, and EZH2⁺, was performed across various macrophage subgroups. This study revealed significant disparities in gene expression profiles among macrophage subpopulations, particularly during the immune response and tissue repair processes. Notably, pronounced differences in gene expression were detected between M1 macrophages and VCAN⁺ macrophages. Specifically, genes that were differentially expressed in M1 macrophages were enriched in adrenergic signalling, ferroptosis, and cytoskeletal pathways in myocardial cells. In contrast, differentially expressed genes in VCAN⁺ macrophages were enriched in pathways related to haematopoietic cell lineages and the synthesis and activity of parathyroid hormone and cholinergic synapses (Figure 2I). These findings provide critical molecular insights into the functional specificity of macrophage subpopulations in pathological conditions such as heart failure and offer a theoretical foundation for the development of future targeted therapeutic strategies. In conclusion, we employed a comprehensive analytical approach, incorporating principal component analysis (PCA), uniform manifold approximation and projection (UMAP), cell type annotation, and gene enrichment analysis, to elucidate the significant roles of diverse macrophage subpopulations in the molecular mechanisms underlying heart failure. These results not only highlight the transcriptomic heterogeneity of macrophages but also provide a theoretical basis for future targeted interventions aimed at specific macrophage subpopulations in the treatment of heart failure.
The expression of DBNL in various macrophage subpopulations elucidating specific ligand-receptor interactions and pathway enrichment associated with disease progression
We performed a comprehensive multidimensional analysis of DBNL expression and its associated pathways, focusing on transcriptional characteristics across various cell types and disease states. UMAP diagrams (Figure 3A,B) illustrate the distribution of cells within both the control and disease groups. In the control group, cell clustering appeared more uniform, whereas in the disease group, cells exhibiting elevated DBNL expression demonstrated notable aggregation. These observations suggest a potential critical role for DBNL in the disease state. We subsequently quantified DBNL expression across different cell types (Figure 3C). Dot plot analysis revealed that DBNL expression was most pronounced in macrophages, demonstrating significant cell type specificity. The dot size represents the proportion of cells expressing DBNL, whereas the colour variations indicate the average expression level, corroborating that macrophages are the primary cell type expressing DBNL. To further investigate the potential role of DBNL expression in intercellular interactions, we generated a Circos diagram (Figure 3D) to illustrate the ligand-receptor interactions among DBNL-expressing macrophages, cardiomyocytes, and fibroblasts. The diagram indicates substantial ligand-receptor interactions between DBNL-expressing macrophages and other cell types, with these interactions being more prominent under disease conditions. Gene set enrichment analysis (GSEA) (Figure 3E) revealed that macrophages expressing DBNL are significantly enriched in several critical biological pathways, including those related to prostate cancer, the PI3K/Akt signalling pathway, and EGFR tyrosine kinase inhibitor resistance. These pathways are intricately linked to the immune response, tumour progression, and drug tolerance, indicating that DBNL may play a pivotal role in these biological processes. Additionally, the heatmap (Figure 3F) illustrates the variations in gene expression across different macrophage subsets (M2, Vcan+, EZH2+, and M1). Through hierarchical clustering, we categorized genes into four principal modules (C1-C4) and determined that these modules are significantly associated with specific biological pathways. Notably, the C1 module is linked to the enrichment of immune signalling pathways, whereas the C2 and C4 modules are involved in tumour-related transcriptional programs and metabolic processes. These findings suggest substantial heterogeneity in the transcriptional regulation of DBNL among macrophage subsets, which may make differential contributions to disease progression.
Pseudotime analysis revealing early activation of VCAN⁺ programs, late induction of M2 signatures, and transient DBNL expression during macrophage differentiation
Research at the single-cell level facilitates the elucidation of intricate physiological processes and transcriptional regulation within highly heterogeneous cell populations. Such studies contribute to the identification of genes that indicate specific cell subtypes, genes that characterize the intermediate states of biological processes, and genes involved in the transition between distinct cell fates.In numerous single-cell studies, individual cells exhibit gene expression asynchronously, with each cell representing a snapshot of the transcriptional process under investigation. Monocle employs a strategy that involves ordering individual cells along a pseudotemporal trajectory. This approach leverages the asynchronous nature of gene expression in single cells to position them along a continuum that corresponds to biological processes, such as cell differentiation.Macrophages were selected for pseudotime series analysis. First, the similarity between cells was calculated, and the cell differentiation trajectory was constructed. By visualizing this trajectory, we can generate a representation of the cell differentiation process in pseudotime, illustrating the developmental progression of cells. This visualization can be utilized to study cell differentiation processes and gene expression patterns at various time points. The visual outputs, depicting cells coloured by pseudotime (a probability calculated by Monocle on the basis of cell gene expression data, indicating the temporal sequence), state (distinct segments differentiated by pathway branches), and various cell subtypes, were produced separately.The findings indicated that vcan+ macrophages were predominantly localized during the initial phase of cell differentiation, whereas M2 macrophages were observed primarily during the later phase of cell differentiation (Figure 4A-C). Utilizing computational analysis, we identified a subset of genes that exhibited the greatest variation along the pseudotime axis for visualization purposes. The x-axis represents the pseudotime values, whereas the y-axis denotes the selected genes. By default, these genes are categorized into three clusters on the basis of their expression dynamics. Our findings indicated that genes such as SLC22A15, AREG, and CXCR4 are predominantly expressed during the early stages of trajectory differentiation. In contrast, genes such as LGMN, FGF13, and NPAS3 are expressed primarily during the later stages of trajectory differentiation (Figure 4D). We subsequently demonstrated alterations in the expression of pivotal genes throughout the trajectory of cellular differentiation. The findings revealed a progressive increase in the expression of the DBNL gene during cell differentiation, followed by a subsequent decrease (Figure 4E).
Involvement of DBNL in immune, metabolic, and signalling pathways and its potential as a therapeutic target
We comprehensively investigated the specific signalling pathways associated with key genes and investigated their potential molecular mechanisms in the context of disease progression. Gene set enrichment analysis (GSEA) revealed that DBNL is significantly enriched in pathways such as the proteasome, circadian entrainment, and ECM-receptor interaction (Figure 5A,B).
Variations in key genes and immunometabolic pathways: An analysis of DBNL expression levels across biological pathways and their significance
The X-axis delineates various functional pathways, including immune, metabolic, signalling, and proliferation pathways, with distinct colours representing the respective categories. Pathways associated with the DBNL gene are shown on the Y-axis. The colour of the points signifies the log₂FC fold change value, ranging from blue (indicating low expression) to red (indicating high expression), while the size of the points corresponds to the false discovery rate (FDR), with larger points denoting higher significance. The data in the figure illustrate that DBNL is significantly associated with immune-related pathways (such as the IL-3 and IL-5 signalling pathways and interferon signalling) and metabolic pathways (such as cholesterol metabolism and glycolysis), underscoring its crucial role in immune regulation and metabolic processes (Figure 5C). Gene set variation analysis (GSVA) revealed that DBNL was significantly enriched in pathways such as coagulation, mitotic spindle, and apical junction (Figure 5D). The results of this study indicate that this key gene may play a role in disease progression via these signalling pathways. Utilizing the Comparative Toxicogenomics Database (CTD), an analysis was conducted to identify drugs that might interact with the key gene DBNL. This analysis revealed seven drugs with potential interactions with DBNL, highlighting the possibility for the development of novel therapeutic targets (Figure 5E). The protein and compound chosen for the key gene were DBNL (Q9UJU6) and pirinixic acid (WY-14643), respectively.
Structural analysis of the binding mode of DBNL and pirinixic acid (WY-14643)
A comprehensive analysis of the optimal binding conformation between DBNL and pirinixic acid (WY-14643) revealed that the small molecule pirinixic acid (WY-14643) is involved in various noncovalent interactions, allowing it to bind to a cleft in DBNL. The terminal ortho-xylene moiety of pirinixic acid (WY-14643) penetrates deeply into a hydrophobic cavity formed by VAL62, TYR64, and VAL101. Additionally, the other terminal carboxyl group interacts with ARG118 through electrostatic interactions. Concurrently, the small molecule forms a hydrogen bond with LYS94. These interactions collectively increase the binding affinity with the protein, and the synergistic effect of these forces may be significant. Pirinixic acid (WY-14643) has a remarkable binding affinity because of its structural framework (Figure 6A,B). This is corroborated by the two-dimensional interaction diagram, which reveals a cation-π interaction between the LYS94 side chain and the aromatic ring of pirinixic acid (WY-14643) (Figure 6C). Collectively, this binding mode suggests favourable stability and specificity. Nonetheless, the dynamic stability of this conformation needed further validation through molecular dynamics simulations.
Dynamic stability and interaction mechanisms of the DBNL-pirinixic acid (WY-14643) complex
Molecular dynamics simulations provided additional confirmation of the dynamic stability of the binding mode between pirinixic acid (WY-14643) and the DBNL protein, as initially suggested by molecular docking studies. The root mean square deviation (RMSD) of the complex increased rapidly during the initial phase before stabilizing, with an average RMSD value of 1.98 Å. The protein backbone exhibited a root mean square fluctuation (RMSF) of 1.38 Å, the binding pocket demonstrated an RMSF of 2.00 Å, and the ligand displayed an RMSF of 2.51 Å, all of which indicated stability with no evidence of dissociation (Figure 7A-C). RMSF analysis revealed that the average fluctuation of protein residues was 0.87 Å, with the majority of residue fluctuations smaller than 1.4 Å, except for the terminal region, which showed increased flexibility. The average RMSF for the ligands was 4.16 Å, with principal fluctuations occurring in the solvent-exposed regions (Figure 7D-F). Analysis of the radius of gyration (Rg) indicated that the protein maintained an overall Rg of 14.69 Å, suggesting a compact and stable structure. The ligand exhibited an average Rg of 3.99 Å, characterized by an extension along the Z-axis and a contraction along the X/Y axis, indicating orientation constraints imposed by the binding site (Supplementary Figure 2A-D). The Ramachandran plot results revealed that α-helices constitute 34.7%, β-sheets account for 16.0%, and PPII helices constitute 11.3% of the structure, with the overall secondary structure remaining stable throughout the simulation process (Figure 7G-I). Principal component analysis (PCA) and free energy landscape analysis suggest that the system progressively converges into multiple low-energy valleys, ultimately reaching a stable energy state during the latter stages of the simulation. Throughout this process, the ligand remains securely bound within the pocket (Figure 7J-N). Hydrogen bond analysis indicated that the average number of hydrogen bonds in the complex simulation is 0.8, initially dominated by ARG118, with ASN116 subsequently forming a new interaction hotspot (Supplementary Figure 2E-G). Molecular mechanics/generalized Born surface area (MM/GBSA) binding energy analysis revealed that the total binding free energy of the complex was -19.46 kcal/mol. The primary favourable contributions to this binding energy are van der Waals interaction energy (-23.36 kcal/mol) and solvation free energy (-7.36 kcal/mol), whereas electrostatic interaction (+11.25 kcal/mol) presented a slightly unfavourable contribution. Further decomposition of the binding energy highlighted ASN116 (-3.54 kcal/mol), VAL101 (-2.13 kcal/mol), and ALA98 (-1.14 kcal/mol) as key binding hotspot residues (Figure 7O-Q).
DATA AVAILABILITY:
All the data used in this study are publicly available or derived from previously published sources. Obtain the single-cell RNA sequencing (scRNA-seq) dataset associated with accession number GSE161470, as well as the bulk RNA sequencing (bulk RNA-seq) dataset corresponding to accession number GSE161472, from the GEO repository. The expression quantitative trait locus (eQTL) data were obtained from the eQTLGen consortium. Methylation quantitative trait locus (mQTL) data were acquired from a published meta-analysis of the European cohort. Plasma protein quantitative trait locus (pQTL) data were downloaded from the deCODE database (https://www.decode.com/summarydata/) (2021 release). Summary statistics for heart failure genome-wide association studies (GWASs) were accessed via the European Bioinformatics Institute (EBI) database under accession number GCST90162626. The 3D structure of the human DBNL protein was predicted using AlphaFold3, and pirinixic acid was retrieved from the PubChem database. Gene sets for enrichment analyses were sourced from the Molecular Signatures Database (MsigDB). All other relevant data supporting the findings of this study can be accessed via the following DOI: 10.5281/zenodo.18030352.

Figure 1: An analysis involving eQTL, pQTL, and mQTL Mendelian randomization, as well as colocalization analysis and reverse Mendelian randomization. (A-C) In this study, Mendelian randomization analyses was conducted for expression quantitative trait loci (eQTLs), protein quantitative trait loci (pQTLs), and methylation quantitative trait loci (mQTLs). For the eQTL analysis, this study examined the distribution of logarithmic hazard ratios and p values to assess the causal relationships for each gene. Similarly, in the pQTL analysis, the logarithmic hazard ratios and p value distributions were evaluated to determine the causal relationships for each protein. mQTL analysis involved assessing the distribution of logarithmic hazard ratios and p values for causal relationships associated with DNA methylation levels. Using a Venn diagram to analyse the intersection of genes identified from eQTL, pQTL, and mQTL analyses, this study found that the number of intersecting genes associated with high disease risk was one, while no intersections were observed for genes associated with low disease risk. (D-E) The Venn diagram illustrates the intersections among mQTL (depicted in blue), eQTL (depicted in yellow), and pQTL (depicted in green) datasets. The overlapping areas denote genes that are regulated by multiple types of QTLs. The numerical values represent the number of genes specific to each region, along with their respective proportions. Notably, the DBNL gene lies within the intersection of these QTLs, indicating that it may be subject to regulation by multiple genetic factors and suggesting its significant role in various biological processes. (F) This study investigated the single-nucleotide polymorphisms (SNPs) of the critical DBNL gene and their association with disease. Each data point indicates a statistically significant relationship between the SNP and the disease. The X-axis represents the P value obtained from the genome-wide association study (GWAS), whereas the Y-axis denotes the P value derived from the expression quantitative trait loci (eQTL) analysis. (G) Reverse Mendelian randomization analysis suggests that DBNL has a causal influence on the development of heart failure. Please click here to view a larger version of this figure.

Figure 2: Analysis of the transcriptome and annotation of cell types within macrophage subsets. (A)UMAP visualization delineates cells into nine distinct clusters, derived from the results of principal component analysis (PCA). (B) Cell annotation was performed for the nine clusters, resulting in the identification of eight distinct cell types: fibroblasts, cardiomyocytes, macrophages, pericytes, endothelial cells, T cells, mast cells, and neurons. (C) The volcano plot illustrates the differentially expressed genes: The x-axis denotes the log fold change in gene expression levels between samples, and the y-axis represents the significance of differential expression as measured by -log10(p value). Red dots indicate genes that are significantly differentially expressed, with a false discovery rate (FDR) of less than 0.05. The highlighted genes are considered biologically meaningful candidates. (D) The figure presents a scatterplot illustrating the results of principal component analysis (PCA). The x-axis represents the principal component indices, whereas the y-axis indicates the percentage of variance explained by each component. This visualization aids in determining the number of principal components that are statistically significant. (E) The principal component analysis (PCA) results are presented as a two-dimensional scatter plot constructed using the first two principal components (PC1 and PC2). In this plot, each point corresponds to an individual cell, with varying colours employed to distinguish between different samples. (F) A UMAP plot illustrates the clustering results based on RNA expression data at a resolution of 0.2. Each point on the plot represents an individual cell, with different colours used to distinguish between cluster populations. The x-axis and y-axis denote the coordinates from UMAP dimensionality reduction, reflecting cellular heterogeneity. (G) Another UMAP plot presents the cell type annotation outcomes. Following the clustering analysis, the cells were further categorized into specific cell types, with distinct colours and labels indicating the distribution of these cell types. (H) A scatter plot depicting the expression profiles of marker genes corresponding to the eight identified cell types. (I) Bubble plot visualization of the expression patterns of marker genes across the cell types. The x-axis represents the genes, while the y-axis denotes the cell types. The size of each bubble indicates the proportion of cells expressing the gene within the respective cell type (% expressed), and the colour gradient reflects the average expression level of the gene, with red indicating high expression and blue indicating low expression. Please click here to view a larger version of this figure.

Figure 3: A comprehensive multidimensional analysis to examine DBNL expression and its associated pathways across various cell types and disease states. (A,B) UMAP diagram illustrating the distribution of cells within both the control and disease groups. Cells are colour-coded according to DBNL expression levels, thereby emphasizing the distinct clustering patterns observed between the two conditions. (C) Dot plot depicting DBNL expression across different cell types, where the dot size indicates the proportion of cells expressing DBNL, and the colour gradient signifies the average expression level. Notably, the DBNL expression was highest in macrophages. (D) Circos diagram detailing ligand-receptor interactions between DBNL-expressing macrophages and other cell types, including cardiomyocytes and fibroblasts, with colour intensity denoting the interaction strength. (E) Results of a gene set enrichment analysis (GSEA), highlighting the principal pathways enriched in DBNL-expressing macrophages, such as prostate cancer, the PI3K-Akt signalling pathway, and resistance to EGFR tyrosine kinase inhibitors. Pathways are colour coded on the basis of their standardized enrichment score (NES) and statistical significance. (F) The heatmap illustrates the gene expression profiles of various macrophage subsets, including M2, Vcan+, EZH2+, and M1, along with their associated enrichment pathways. Clustering analysis revealed distinct transcriptional modules (C1-C4), each characterized by enrichment of a specific pathway. The accompanying colour bar denotes the Z score, indicating the standardized levels of gene expression. Please click here to view a larger version of this figure.

Figure 4: Pseudotime analysis of cellular developmental trajectories. (A) Pseudotime trajectories illustrating the progression of macrophage subtypes along developmental or differentiation pathways. Each point represents an individual cell, with the colour gradient indicating the pseudotime value and the darker shading denoting the initial stage of the trajectory. This analysis is instrumental in inferring the temporal sequence of cell states and transitions. (B-C) These panels depict the distribution of various states and macrophage subtypes within the cellular trajectory. (D) The cluster heatmap reveals that genes within each cluster exhibit similar expression patterns, potentially corresponding to specific stages of cell development or differentiation. The x-axis represents the pseudotime value, while the y-axis denotes gene expression. (E) The expression levels of key genes vary in accordance with cellular development. Please click here to view a larger version of this figure.

Figure 5: Integrative pathway and interaction analyses identified DBNL as a key regulator of immunometabolic signalling, suggesting its potential as a therapeutic target. (A) The gene set enrichment analysis (GSEA) results, which are based on relevant signalling pathways, are depicted with distinct curves in each subgraph representing the enrichment scores (ES) for the associated pathways. The x-axis represents gene rankings in the expression data, whereas the y-axis indicates cumulative enrichment scores. The levels of pathway enrichment provide insights into the biological processes potentially regulated by target genes. (B) The gene set variation analysis (GSVA) significant pathway bar chart illustrates the ranked results of functional pathways significantly associated with the target gene, encompassing both upregulated and downregulated pathways. The blue bars indicate pathways significantly enriched for upregulation, the green bars indicate downregulation, and the grey bars represent nonsignificant pathways with low normalized enrichment scores (NESs). (C) The hallmark pathways associated with DBNL are immune, metabolic, signalling, and proliferative pathways. The dot colour represents the log₂ fold change, ranging from blue to red, while the size of the dots indicates statistical significance, as measured by the false discovery rate (FDR). DBNL is strongly associated with immune pathways, such as IL-3/IL-5 and interferon signalling, as well as metabolic pathways, including cholesterol metabolism and glycolysis, highlighting its significant role in immune regulation and metabolism. (D) The circular diagram illustrates the interconnections between the proteasome, circadian rhythm, and extracellular matrix (ECM)-receptor interaction pathways. (E) The network diagram delineates the relationships between DBNL and various environmental and chemical factors. The central node, representing DBNL, is connected to several key factors, including pirinixic acid (WY-14643), ozone, copper, cisplatin, air pollutants, tetrachlorodibenzodioxin, and smoke. This diagram portrays the potential effects or interactions of these factors on DBNL, as indicated by the directed edges between nodes. Please click here to view a larger version of this figure.

Figure 6: Analysis of the binding mode of the DBNL and pirinixic acid complex. (A) The three-dimensional binding diagram illustrates the interactions between proteins and ligands, with proteins depicted in cartoon representation and ligands in stick representation. Yellow dashed boxes highlight the binding pocket area. (B) An enlarged view of the binding site reveals the interactions between pirinixic acid and key amino acid residues within the DBNL protein. (C) A two-dimensional interaction diagram of the proteins and ligands is presented, detailing specific types of interactions. Please click here to view a larger version of this figure.

Figure 7: Molecular dynamics simulation of pirinixic acid (WY-14643) binding to the DBNL protein: investigation of stability, structural evolution, and energy dynamics. (A) RMSD analysis of protein-ligand complexes within the context of molecular dynamics simulations. (B) Root mean square deviation (RMSD) analysis of the protein backbone, ligands, and binding sites. (C) The extraction and stacking of protein-small-molecule complexes during the final 10 ns of the trajectory, comprising a total of 100 frames. (D-E) The results of the RMSF analysis for proteins and ligands in molecular dynamics simulations. (F) E factor diagram for protein-small-molecule complexes. (G) The Ramachandran plot of the protein. (H) Temporal variation within the secondary structure region. (I) Heatmap depicting the temporal variation in the secondary structure of key residues. The Y-axis denotes the residue number, while the colour scheme indicates distinct secondary structure regions. This visualization effectively captures the dynamic changes in the secondary structure of specific residues throughout the simulation process. (J) Spatial distribution of the molecular system within the framework of the first two principal components (PC1 and PC2). (K) The diagram illustrates a free energy landscape, wherein the blue region denotes a stable conformational state characterized by lower energy. The figure is annotated with a blue circle indicating the starting point, a red circle indicating the ending point, and a star symbol representing the point of minimum free energy. (L-N)Conformations of the respective Start, End, and Energy Minimum points within the JK diagram. (O) Summary of the changes in binding energy of protein-small-molecule complexes observed during molecular dynamics simulations. (P) The delta energy component is represented by the histogram, while (Q) denotes the graph illustrating the residue energy contribution. Please click here to view a larger version of this figure.
Supplementary Figure 1: Preprocessing of single-cell data. (A) Visualization of quality control metrics for single-cell RNA sequencing data. Each panel depicts the distribution of the number of genes detected, the number of transcripts per cell (UMI counts), the proportion of mitochondrial gene expression (%MT), and the proportion of ribosomal gene expression (%Ribo), respectively. The x-axis corresponds to individual cells, while the y-axis represents the respective metric values. (B) Scatter plots illustrating the relationships between sequencing depth and mitochondrial gene content per cell. The left panel demonstrates a strong positive correlation between sequencing depth (total UMI counts) and gene number (correlation coefficient R = 0.96). The central panel reveals a moderate negative correlation between sequencing depth and mitochondrial gene proportion (R = -0.24). The right panel presents an example of a weak negative correlation between sequencing depth and ribosomal gene proportion (R = -0.11). The x-axis denotes sequencing depth per cell (total UMI counts), and the y-axis indicates the mitochondrial gene proportion (%MT). Cells within the normal range are represented by cyan dots, whereas red dots may signify abnormal or low-quality cells. Correlation coefficients displayed above quantify the strength and direction of these associations. (C) Volcano plot depicting differentially expressed genes. The x-axis represents the log fold change in gene expression between sample groups, and the y-axis corresponds to the statistical significance of differential expression (-log10(p-value)). Genes exhibiting significant differential expression (FDR < 0.05) are highlighted in red, with selected genes labeled to indicate biologically relevant candidates. (D) Scree plot summarizing principal component analysis (PCA) results. The x-axis indicates the principal component number, and the y-axis shows the proportion of variance explained by each component. This plot facilitates the determination of the number of principal components to retain for downstream analysis. (E) Visualization of PCA outcomes via a two-dimensional scatter plot of the first two principal components (PC1 and PC2). Each point corresponds to an individual cell, with colors differentiating distinct cell clusters. (F) Re-visualization of PCA results following adjustment procedures. Updated color schemes and point distributions are presented to assess the effectiveness of correcting potential batch effects or other technical confounders. Please click here to download this File.
Supplementary Figure 2: Structural compactness and hydrogen-bond interactions of the DBNL-Pirinixic acid complex during molecular dynamics simulation. (A) Time evolution and probability distribution of the radius of gyration (Rg) of DBNL over a 100 ns molecular dynamics simulation. (B) Component-wise analysis of the DBNL radius of gyration (Rg_x, Rg_y, and Rg_z) over the simulation time. (C) Time evolution and probability distribution of the radius of gyration (Rg) of Pirinixic acid during the simulation. (D) Component-wise analysis of the Pirinixic acid radius of gyration (Rg_x, Rg_y, and Rg_z) over the simulation time. (E) Hydrogen-bond geometry between DBNL and Pirinixic acid shown as a two-dimensional distribution of donor-acceptor distances and donor-hydrogen-acceptor angles. (F) Number of hydrogen bonds formed between DBNL and Pirinixic acid as a function of simulation time, including the corresponding distribution. (G) Time-resolved hydrogen-bond interaction network between Pirinixic acid and individual DBNL residues, with the interaction frequency summarized on the right. Please click here to download this File.
Supplementary Table 1: Results of the GSVA enrichment analysis. Please click here to download this File.
Supplementary Table 2: Extract_instruments and extract_outcome_data functions Please click here to download this File.
Supplementary Table 3: Cell annotation data. Please click here to download this File.
Although inflammation constitutes a fundamental aspect of the pathophysiology of heart failure (HF), previous investigations have predominantly concentrated on generalized inflammatory mediators and cell surface markers17,18,24. This focus has constrained the elucidation of the specific cellular mechanisms involved. To overcome these limitations, we utilized an integrative approach combining multiomics Mendelian randomization (MR) with single-cell transcriptomic analysis, through which we identified the cytoskeletal protein DBNL as a potential novel therapeutic target in HF.
Through Mendelian randomization (MR), colocalization analysis, and single-cell RNA sequencing, we determined that the DBNL gene is significantly associated with the risk of heart failure (HF). Our results indicate that DBNL is predominantly expressed in cardiac macrophages and participates in immunometabolic signalling pathways. Subsequent molecular docking and molecular dynamics simulations demonstrated that pirinixic acid (WY-14643) has a high affinity for DBNL. The binding complex exhibited stable and specific structural characteristics, with a binding free energy of -19.46 kcal/mol, suggesting a notably favourable interaction between the two. These findings offer theoretical support for elucidating the molecular mechanism of action of pirinixic acid (WY-14643) and establish a foundation for future DBNL-based drug optimization and functional studies. Our research addresses a critical knowledge gap in the field of heart failure by revealing, for the first time, the connection between DBNL and immunometabolic processes, as well as cytoskeletal functions associated with heart failure.
Historically, DBNL has been investigated predominantly in the context of neuronal development and cytoskeletal dynamics8; specifically,DBNL modulates immune polarization by orchestrating the reorganization of the macrophage actin cytoskeleton and participates in immune-related metabolic dysfunctions associated with heart failure. Furthermore, DBNL influences metabolic pathways through the regulation of vesicular transport within macrophages, thereby indirectly contributing to the myocardial pro-fibrotic response25. Our findings indicate that DBNL enrichment in cardiac macrophages is consistent with recent evidence suggesting that macrophage-fibroblast interactions contribute to fibrosis and diastolic dysfunction. This observation implies that DBNL may play a pivotal role in regulating macrophage function, thereby impacting immunometabolic processes in heart failure.
Moreover, by integrating multiomics data encompassing gene expression, methylation, and protein levels, we conducted a comprehensive analysis of the role of DBNL in heart failure (HF). Our findings elucidate its significant involvement in immunometabolic and fibrotic signalling pathways, notably within the IL-6/JAK/STAT3, PI3K/AKT/mTOR, and TGF-β pathways. The enrichment of DBNL is consistent with the established roles of these pathways in cardiac remodelling associated with HF, reinforcing its potential as a therapeutic target26. In contrast to conventional genome-wide association studies (GWASs), this research utilized a comprehensive Mendelian randomization (MR) framework incorporating methylation quantitative trait loci (mQTLs), gene expression quantitative trait loci (eQTLs), and protein abundance quantitative trait loci (pQTLs), thereby surpassing mere association analysis27. Our MR analysis revealed significant colocalization signals across multiple omics layers (SNP.PP. H4 > 0.90) without evidence of reverse causality, thereby reinforcing the validity of DBNL as a translational target.
By molecular docking and molecular dynamics simulations, we identified pirinixic acid (WY-14643) as a high-affinity ligand for DBNL,suggesting this compound as a prospective candidate drug anticipated to interact with DBNL. While contemporary drug discovery relies heavily on computational simulations, subsequent validation through biochemical assays and in vivo pharmacological studies is essential for determining drug efficacy and evaluating potential off-target effects. Notably, future research should include relevant experimental investigations, such as combined analyses and studies on macrophage function. Furthermore, our study provides the first functional validation of DBNL in the context of HF specificity, and we propose a DBNL-based drug optimization strategy to lay the groundwork for future clinical therapies.
Despite these advances, several limitations remain. First, our Mendelian randomization (MR) analysis was based on genome-wide association study (GWAS) data derived from European populations, which may restrict the generalizability of our findings to other ethnic groups.The single-cell RNA sequencing dataset included a solitary heart failure specimen and four control specimens. The limited sample size potentially undermines the statistical power necessary for accurate cell subtype annotation and analysis of DBNL expression patterns, complicating the exclusion of individual variability effects on the outcomes. To increase the robustness and reliability of future findings, it is imperative to expand the cohort of heart failure samples, particularly by incorporating stratified samples that represent diverse aetiologies, including ischaemic cardiomyopathy and dilated cardiomyopathy. Second, the sample size in our single-cell RNA sequencing was insufficient to achieve high resolution for rare subpopulations, even with rigorous quality control measures. While blood-derived QTL data offer initial insights into the associations between gene expression and heart failure, they do not comprehensively capture gene regulatory mechanisms within distinct cell types of cardiac tissue, such as cardiac macrophages. Consequently, future investigations should aim to validate these hypotheses through more detailed analyses of cardiac tissue, with particular emphasis on studies of specific cardiac cell populations, to reconcile discrepancies between blood-based and cardiac tissue-derived data. Additionally, while our study offers macrophage transcriptome data for DBNL, it does not include protein-level validation or functional data. Future research should aim to validate these findings across more diverse populations and employ gene editing validation, macrophage-specific knockout models, and preclinical assessments of potential small-molecule candidates.
In conclusion, this study has identified DBNL as a novel target for heart failure through the integration of multiomics data, elucidating the involvement of immunometabolic pathways in the pathogenesis of heart failure. The identification of a potential interaction between DBNL and pirenzepine provides new insights for the development of future therapeutic strategies. Further research is warranted to validate these findings and to investigate the application of DBNL-targeted therapies in clinical models.
The authors declare that they have no competing interests.
This research program was supported by the Zhejiang Provincial Natural Science Foundation of China (LTGY24H020001), the Project of Science and Technology on Traditional Chinese Medicine in Zhejiang Province (2023ZR131, 2023ZL158), the Ningbo Public Welfare Key Project (2024S030), the Key Technology R&D Program of Ningbo (2022Z149), the Key Laboratory of Precision Medicine for Atherosclerotic Diseases of Zhejiang Province (2022E10026), and the Special Fund Project for Clinical Medical Research of Zhejiang Medical Association (2021ZYC-A11).
| CellMarker Database | Harbin Medical University (CellMarker 2.0) | http://bio-bigdata.hrbmu.edu.cn/CellMarker/ | CellMarker Database; For querying tissue/cell type marker genes during single-cell annotation |
| Comparative Toxicogenomics Database (CTD) | CTD (Duke University et al.) | http://ctdbase.org/ | CTD Database; For retrieving interactions between chemicals, genes, phenotypes, and diseases, used for drug prediction |
| deCODE Plasma pQTL Dataset (2021, 4,907 aptamers, 35,559 European samples) | deCODE genetics | https://www.decode.com/summarydata/ | 2021 pQTL data release; Used as pQTL exposure data for MR analysis |
| eQTLGen Whole Blood eQTL Data | eQTLGen Consortium | https://www.eqtlgen.org/ | eQTLGen phase I/II (specific version as per usage); Used as eQTL exposure data for MR and colocalization analysis |
| GEO Single-cell RNA-seq Dataset GSE161470 (Human Cardiac Tissue: 4 control + 1 pathological sample) | NCBI Gene Expression Omnibus (GEO) | https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE161470 | GSE161470; To obtain single-cell RNA sequencing data of human heart tissue for downstream analysis |
| GWAS Catalog | NHGRI–EBI | https://www.ebi.ac.uk/gwas/ | GWAS Catalog; To retrieve HF-related GWAS information, annotations, and auxiliary analysis |
| Heart Failure GWAS Summary Statistics (GCST90162626) | NHGRI–EBI GWAS Catalog | https://www.ebi.ac.uk/gwas/studies/GCST90162626 | Study accession: GCST90162626; Used as outcome data for MR and colocalization analysis |
| Molecular Signatures Database (MSigDB) v7.0 | Broad Institute | https://www.gsea-msigdb.org/gsea/msigdb | MSigDB v7.0; For GSEA / GSVA background gene sets and pathway annotations |
| Pirinixic acid (WY-14643) Structure Data (PubChem) | NCBI PubChem | https://pubchem.ncbi.nlm.nih.gov/compound/5694 | PubChem CID: 5694; Used to obtain ligand 3D structure for molecular docking and dynamics simulation |
| UniProt Protein Database (DBNL, UniProt ID: Q9UJU6) | UniProt Consortium | https://www.uniprot.org/uniprot/Q9UJU6 | UniProt: Q9UJU6; Used to obtain the amino acid sequence of DBNL for structure prediction and docking |
| Whole Blood mQTL Meta-Analysis (Ref 11, 3,701 European samples, 426,636 mQTL traits) | Authors / Consortium (according to your reference 11) | Pending author to add original database URL / DOI | Reference DOI or Dataset ID; Used as mQTL exposure data for MR analysis |