Transcriptomic Identification of Renin-Angiotensin System-Related Candidate Biomarkers and External Testing of a Hypertension Diagnostic Model

Jing Bai; Yan Wang; Xiangxiang Yang; Li Du; Jianhua Shi; Xiaxia Li; Ming Bai

doi:10.3791/71252

Research Article

June 22nd, 2026

Jing Bai*¹,², Yan Wang*³, Xiangxiang Yang², Li Du², Jianhua Shi², Xiaxia Li², Ming Bai⁴

¹Lanzhou University, ²Department of Cardiovascular Medicine, Affiliated Hospital of Gansu Medical College, ³Department of Nephrology, Affiliated Hospital of Gansu Medical College, ⁴Department of Cardiology, First Hospital of Lanzhou University

Summary

Using public blood transcriptome datasets, this study identified candidate renin-angiotensin system (RAS)-related biomarkers for hypertension. Bioinformatics and machine learning yielded an eight-gene signature tested in an independent blood-based cohort, linking RAS transcriptomic alterations to immune and inflammatory signatures requiring further validation.

Abstract

This work aimed to identify candidate renin-angiotensin system (RAS)-related blood transcriptomic biomarkers associated with hypertension, construct an externally tested candidate diagnostic model, and validate selected genes in an Ang II-induced endothelial cell model. Two public microarray datasets, GSE75360 and GSE74144, were analyzed. Differential expression analysis was performed using limma, followed by GO/KEGG enrichment, preranked GSEA, CIBERSORT immune infiltration analysis, protein-protein interaction and regulatory network construction, and machine learning-based feature selection using logistic regression and random forest. A logistic regression model based on the selected genes was developed in GSE75360 and externally tested in GSE74144. Experimental validation was performed in Ang II-induced HUVECs using qRT-PCR, western blotting, ELISA, and CST3/FURIN loss- and gain-of-function assays, followed by CCK-8, Transwell, inflammatory, oxidative stress, and endothelial function analyses. In GSE75360, 173 differentially expressed genes were identified, including 18 RAS-related differentially expressed genes. Eight candidate genes, LRP1, CTSD, MTHFR, AUTS2, FURIN, CST3, FCER1G, and TBXAS1, were selected by combined machine learning analyses. Enrichment and immune infiltration analyses indicated that these genes were mainly associated with immune and inflammatory signatures. The eight-gene model showed high discrimination in GSE75360 and retained moderate performance in GSE74144. In Ang II-treated HUVECs, most candidate genes were upregulated at the mRNA level, and CST3, FURIN, and TBXAS1 were further validated at the protein level. CST3 and FURIN modulation altered Ang II-induced endothelial viability, migration, expression of inflammatory/adhesion markers, ROS accumulation, and eNOS/NO-related functional readouts. 
This study identified candidate RAS-related biomarkers and immune-associated signatures in hypertension and developed an externally tested candidate diagnostic model. The in vitro findings support the functional relevance of CST3 and FURIN in Ang II-induced endothelial responses, warranting further clinical and mechanistic validation.

Introduction

This study focuses on hypertension, a chronic cardiovascular disease that is prevalent globally¹. Hypertension significantly increases the risk of serious health complications, including cardiovascular and cerebrovascular events as well as renal damage. This poses a grave threat to patients' quality of life and longevity, while creating a significant social and economic burden²,³. Currently, the diagnosis and treatment of hypertension mainly rely on blood pressure monitoring and antihypertensive medications. However, because the causes of hypertension are complex and heterogeneous, many patients do not respond adequately to current treatments, and molecularly informed stratification remains limited⁴,⁵. Accordingly, there is a need to identify candidate biomarkers and transcriptomic signatures associated with hypertension, while avoiding overinterpretation of blood-based findings as definitive tissue-level mechanisms.

In recent years, the renin-angiotensin system (RAS) has garnered significant attention in the pathogenesis and progression of hypertension⁶. Existing studies have shown that abnormal expression of RAS-related genes is closely associated with blood pressure regulation disorders and target organ damage, but the exact molecular networks and key regulatory factors are not fully understood⁷. Because hypertension is increasingly recognized as a disorder involving vascular, immune, and inflammatory dysregulation, peripheral blood mononuclear cells and white blood cells provide accessible surrogate tissues that may capture systemic RAS-associated transcriptomic alterations⁸. In this study, the RAS-related gene set was curated from GeneCards and PubMed searches and therefore included both canonical RAS pathway genes and genes reported in the literature to be functionally associated with RAS signaling.

This study employs two publicly available microarray datasets (GSE75360 and GSE74144), using bioinformatics and machine learning methodologies for transcriptomic analysis. Specific methods include limma differential analysis, GO/KEGG/GSEA enrichment analyses, CIBERSORT immune infiltration analysis, PPI network construction, and machine learning-based key gene selection and logistic regression diagnostic model construction and evaluation. The advantages of these methods lie in their ability to analyze large-scale transcriptomic data, externally test findings across datasets, and integrate complementary analytic approaches for candidate model construction. We hypothesized that blood-cell transcriptomic profiles could identify candidate RAS-related biomarkers linked to hypertension and associated immune signatures. Therefore, our aim was to identify candidate RAS-related genes, characterize their functional context, and construct an externally tested candidate diagnostic model, rather than to establish definitive molecular mechanisms or a clinically ready precision diagnostic tool.

Protocol

Overview of analysis workflow

The overall design of this study’s transcriptomic and machine learning-based analysis is illustrated in Figure 1, encompassing key steps: collection of Renin-Angiotensin System-related genes (RASRGs); screening of RAS-related differentially expressed genes (RASRDEGs) from hypertension datasets; functional enrichment analysis (GO/KEGG/GSEA); immune infiltration analysis (CIBERSORT); construction of protein-protein interaction (PPI) and regulatory networks; machine learning-based key gene selection (logistic regression, random forest [RF]); and evaluation of the hypertension diagnostic model. A complete list of software, databases, and online tools used in this study is provided in the Table of Materials.

Data download

Hypertension datasets GSE75360⁸ and GSE74144 (Homo sapiens) were obtained via the R package GEOquery⁹ from the GEO database¹⁰. GSE75360, derived from peripheral blood mononuclear cells (platform: GPL10558), included 10 hypertension and 11 control samples; GSE74144, derived from white blood cells (platform: GPL13497), included 14 hypertension and 8 control samples (Table 1). Protein-coding RASRGs (1,264) were initially identified via GeneCards¹¹ (keyword: "Renin-Angiotensin System") and PubMed (keyword: "Renin-Angiotensin System")¹²,¹³. The intersection of these RASRGs with genes in GSE75360/GSE74144 yielded 1,159 final RASRGs¹⁴. The two datasets were processed separately because they were generated on different microarray platforms. Probe annotation was performed according to the corresponding GPL platform annotation files, and normalized gene expression matrices were used for downstream analyses. Box plots were used to compare expression distributions before and after normalization.

Hypertension-related renin-angiotensin-related differentially expressed genes

Samples in the GSE75360 dataset were categorized into the hypertension group and control group. The limma software was employed to conduct differential gene expression analysis between the two groups¹⁴, with differentially expressed genes (DEGs) identified by the threshold of |logFC| > 0.45 and p-value < 0.05. The results of this differential analysis were visualized via volcano plots (generated using the R package ggplot2).

To obtain RASRDEGs, DEGs meeting the above threshold (|logFC| > 0.45, p-value < 0.05) were cross-referenced with RAS-related genes (RASRGs), and the intersection result was presented via a Venn diagram. Subsequently, the expression patterns of the identified RASRDEGs were visualized as a heatmap using the R package pheatmap, and the chromosomal localization of RASRDEGs was displayed via chromosome maps generated using the R package RCircos¹⁵.
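The thresholding and intersection steps described above can be illustrated with a minimal Python sketch. This is not the study's code (the analysis itself was run with limma in R); the gene names, logFC values, and p-values below are invented placeholders, and only the two cutoffs (|logFC| > 0.45, p < 0.05) come from the protocol.

```python
# Toy differential-expression results: gene -> (logFC, p-value).
# All names and numbers are invented for illustration.
results = {
    "GENE_A": (0.62, 0.010),
    "GENE_B": (-0.51, 0.030),
    "GENE_C": (0.20, 0.001),   # fails the |logFC| cutoff
    "GENE_D": (-0.90, 0.200),  # fails the p-value cutoff
}
ras_genes = {"GENE_A", "GENE_D", "GENE_E"}  # hypothetical RASRG list

LOGFC_CUTOFF = 0.45  # from the protocol
P_CUTOFF = 0.05      # from the protocol


def select_degs(results, logfc_cutoff=LOGFC_CUTOFF, p_cutoff=P_CUTOFF):
    """Return (upregulated, downregulated) gene sets passing both cutoffs."""
    up = {g for g, (lfc, p) in results.items()
          if lfc > logfc_cutoff and p < p_cutoff}
    down = {g for g, (lfc, p) in results.items()
            if lfc < -logfc_cutoff and p < p_cutoff}
    return up, down


up, down = select_degs(results)
# RASRDEGs are the DEGs that also appear in the RAS-related gene list.
ras_degs = (up | down) & ras_genes
print(sorted(up), sorted(down), sorted(ras_degs))
```

In this toy input, only GENE_A passes both cutoffs and belongs to the hypothetical RAS list, which mirrors how the Venn-diagram intersection is formed.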

Differentially expressed gene validation and ROC curve analysis

An intergroup comparison plot was constructed to analyze RASRDEG expression differences between the hypertension and control groups in GSE75360. The R package pROC¹⁶ was used to plot ROC curves and calculate the AUC as a measure of the diagnostic efficacy of each RASRDEG (0.5–0.7: low accuracy; 0.7–0.9: moderate; >0.9: high).
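To make the AUC bands concrete, the following Python sketch computes an AUC as the proportion of case-control pairs in which the case has the higher value, and then maps it to the bands listed above. It is an illustration under stated assumptions, not the pROC implementation: the expression values are invented, the "none" label for AUC ≤ 0.5 is added here for completeness, and boundary values (exactly 0.7 or 0.9) are assigned to the lower band because the protocol does not specify them.

```python
def auc(cases, controls):
    """AUC as P(case value > control value); ties count as 0.5."""
    wins = 0.0
    for x in cases:
        for y in controls:
            if x > y:
                wins += 1.0
            elif x == y:
                wins += 0.5
    return wins / (len(cases) * len(controls))


def accuracy_band(value):
    """Map an AUC to the protocol's bands (boundary handling is an assumption)."""
    if value > 0.9:
        return "high"
    if value > 0.7:
        return "moderate"
    if value > 0.5:
        return "low"
    return "none"  # label added for this sketch; not defined in the protocol


cases = [2.1, 2.5, 3.0, 1.9]     # invented values, hypertension group
controls = [1.0, 1.5, 2.0, 2.2]  # invented values, control group
value = auc(cases, controls)
print(value, accuracy_band(value))
```

For a gene that is lower in hypertension, the same function returns a value below 0.5; reversing the comparison direction (or using 1 − AUC) would be needed before applying the bands.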

Correlation analysis

Spearman’s correlation analysis was performed on RASRDEG expression in GSE75360; results were visualized via heatmap (R package ggplot2) (|r| < 0.3: no/weak correlation; 0.3–0.5: weak; 0.5–0.8: moderate; >0.8: strong).
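As a hedged illustration of the correlation step, the sketch below computes Spearman's coefficient from average ranks in plain Python and labels its strength with the cutoffs given above. The input vectors are invented, and the assignment of boundary values (exactly 0.3, 0.5, or 0.8) is an assumption, since the protocol does not state which band they belong to.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n; tied values receive the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2.0 + 1.0
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the average ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    sxy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    return sxy / sqrt(sxx * syy)


def strength(r):
    """Label |r| with the protocol's cutoffs (boundary handling is an assumption)."""
    a = abs(r)
    if a > 0.8:
        return "strong"
    if a >= 0.5:
        return "moderate"
    if a >= 0.3:
        return "weak"
    return "no/weak"


gene_x = [1.0, 2.0, 3.0, 4.0, 5.0]  # invented expression values
gene_y = [2.0, 1.0, 4.0, 3.0, 5.0]  # invented expression values
rho = spearman(gene_x, gene_y)
print(round(rho, 3))
```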

Enrichment analysis of GO and KEGG

GO (Gene Ontology, 2024 release, http://geneontology.org/) is a widely used resource for large-scale functional enrichment, covering three domains: biological processes (BP), cellular components (CC), and molecular functions (MF)¹⁷. KEGG (Kyoto Encyclopedia of Genes and Genomes, Release 109.0, 2024, https://www.genome.jp/kegg/) stores data on genomes, biopathways, diseases, and drugs¹⁸.

RASRDEGs were subjected to GO annotation and KEGG pathway enrichment analysis using the R package clusterProfiler¹⁹. Enrichment test method: hypergeometric test; multiple test correction method: Benjamini-Hochberg (BH) method. Screening criterion: adjusted p-value < 0.05.
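The two statistical ingredients named above, the hypergeometric over-representation test and Benjamini-Hochberg correction, can be sketched in plain Python as follows. This is an illustrative re-implementation under assumed inputs, not the clusterProfiler code; the parameter names (k, n, K, N) and the example p-values are chosen here for the sketch.

```python
from math import comb


def hypergeom_pvalue(k, n, K, N):
    """P(X >= k): chance of seeing at least k annotated genes among n query genes,
    when K of the N background genes carry the annotation."""
    upper = min(n, K)
    total = comb(N, n)
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)) / total


def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so monotonicity can be enforced.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


raw = [0.01, 0.04, 0.03, 0.20]  # invented raw p-values for four terms
adj = bh_adjust(raw)
significant = [p < 0.05 for p in adj]  # screening criterion from the protocol
print(adj, significant)
```

In this toy case only the first term survives the adjusted p < 0.05 criterion, even though three raw p-values fall below 0.05, which is why the correction matters for enrichment screening.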

Gene set enrichment analysis (GSEA)

For cohort-level GSEA, all genes tested in the differential expression analysis of GSE75360 were ranked in descending order by logFC and used as the input gene list for clusterProfiler¹⁹. No DEG prefiltering was applied before GSEA. The c2 gene set collection from MSigDB²⁰ was used as the reference. Parameters: seed = 2022, 10–500 genes per set; screening criteria: adjusted p < 0.05 (Benjamini-Hochberg, BH method) and FDR < 0.25²¹.
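How the preranked input and the gene-set size filter might be prepared is sketched below in Python. Only the descending logFC ordering and the 10–500 size limits come from the protocol; the gene names, logFC values, and gene sets are invented, and restricting each set to genes present in the ranked list before applying the size limits is an assumption of this sketch.

```python
def rank_genes(logfc):
    """Return gene names ordered by logFC from highest to lowest."""
    return sorted(logfc, key=logfc.get, reverse=True)


def filter_gene_sets(gene_sets, universe, min_size=10, max_size=500):
    """Restrict each set to ranked genes and keep sets within the size limits."""
    kept = {}
    for name, genes in gene_sets.items():
        present = set(genes) & universe
        if min_size <= len(present) <= max_size:
            kept[name] = present
    return kept


# Invented logFC values for 20 placeholder genes (no DEG prefiltering).
logfc = {f"G{i}": (i - 10) / 10 for i in range(20)}
ranked = rank_genes(logfc)

gene_sets = {
    "SET_BIG": [f"G{i}" for i in range(12)],            # 12 ranked genes: kept
    "SET_SMALL": ["G0", "G1", "G2", "NOT_ON_ARRAY"],   # 3 ranked genes: dropped
}
kept = filter_gene_sets(gene_sets, set(ranked))
print(ranked[0], ranked[-1], sorted(kept))
```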

Construction of hypertension diagnostic model

To identify key genes associated with hypertension, two machine learning approaches were applied to the RASRDEGs: logistic regression and random forest (RF). In the logistic regression analysis (binary dependent variable: hypertension/control), RASRDEGs with p < 0.05 were retained.

The RF (Random Forest) algorithm, an ensemble learning method under the Bagging category (integrating multiple decision trees), was applied via the R package randomForest²² (parameters: set.seed(520), ntree = 1000). MeanDecreaseGini (reflecting variable importance by average purity decrease during node splitting) of feature genes was extracted, and the top 15 RASRDEGs were selected. Finally, a Venn diagram of genes screened by logistic regression and RF was plotted to identify hypertension-related key genes.
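The final selection rule, genes that are significant in logistic regression and also rank among the top random forest features, reduces to a set intersection. The Python sketch below shows that rule only; it does not fit either model, and the p-values and importance scores are invented stand-ins for the logistic regression output and the MeanDecreaseGini values.

```python
def top_n_by_importance(importance, n=15):
    """Return the n genes with the largest importance scores."""
    return set(sorted(importance, key=importance.get, reverse=True)[:n])


def key_genes(logistic_p, importance, p_cutoff=0.05, n_top=15):
    """Genes with logistic p < cutoff that are also in the RF top-n list."""
    significant = {g for g, p in logistic_p.items() if p < p_cutoff}
    return significant & top_n_by_importance(importance, n_top)


# Invented stand-ins for model outputs.
logistic_p = {"G1": 0.01, "G2": 0.20, "G3": 0.03, "G4": 0.04}
importance = {"G1": 1.9, "G2": 1.5, "G3": 0.4, "G4": 0.9}

# n_top=2 keeps the toy example small; the protocol uses the top 15.
selected = key_genes(logistic_p, importance, n_top=2)
print(sorted(selected))
```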

Validation of hypertension diagnostic model

A logistic regression model was built based on key genes; linear predicted value (η) was calculated as:

$$\eta = \sum_{i} \text{Coefficient}(\text{gene}_i) \times \text{mRNA expression}(\text{gene}_i)$$

The R package pROC¹⁶ was used to plot ROC curves and evaluate the model’s efficacy in predicting hypertension risk. A nomogram was constructed via the R package rms²³ to visualize the contribution of each key gene to the logistic regression model (reflecting the association between key genes and hypertension risk). Calibration curves were generated to assess the consistency between predicted and actual hypertension probabilities; decision curve analysis (DCA, R package ggDCA²⁴) was performed to evaluate the model’s clinical utility (net benefit) in GSE75360 and GSE74144.
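The quantities evaluated in this section can be illustrated with a short Python sketch: the linear predictor η as a coefficient-weighted sum of expression values, its conversion to a predicted probability through the logistic function, and the net benefit used in decision curve analysis. The coefficients, expression values, probabilities, and labels are invented; the optional intercept argument is an assumption of this sketch, and the net-benefit formula is the standard definition rather than code taken from ggDCA.

```python
from math import exp


def linear_predictor(expression, coefficients, intercept=0.0):
    """eta = intercept + sum of coefficient * expression over model genes."""
    return intercept + sum(coefficients[g] * expression[g] for g in coefficients)


def predicted_probability(eta):
    """Logistic transform of the linear predictor."""
    return 1.0 / (1.0 + exp(-eta))


def net_benefit(probabilities, labels, threshold):
    """Net benefit at a threshold probability: TP/n - FP/n * pt / (1 - pt)."""
    n = len(labels)
    tp = sum(1 for p, y in zip(probabilities, labels) if p >= threshold and y == 1)
    fp = sum(1 for p, y in zip(probabilities, labels) if p >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1.0 - threshold)


# Hypothetical two-gene model; not the study's fitted coefficients.
coefficients = {"GENE_A": 1.0, "GENE_B": -2.0}
sample = {"GENE_A": 2.0, "GENE_B": 1.0}
eta = linear_predictor(sample, coefficients)
prob = predicted_probability(eta)

# Invented predictions for four samples (1 = hypertension, 0 = control).
probs = [0.9, 0.8, 0.3, 0.6]
labels = [1, 1, 0, 0]
nb = net_benefit(probs, labels, threshold=0.5)
print(eta, prob, nb)
```

A decision curve repeats the net-benefit calculation over a range of thresholds and compares the model against the treat-all and treat-none strategies.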

Single-gene GSEA

Single-gene GSEA explores the biological processes, pathways, and disease signatures associated with the expression level of a specific gene, which helps place that gene in its functional context. For each focal gene in GSE75360, samples were split at the median into high- and low-expression groups. Differential expression analysis was then performed across all tested genes, and genome-wide logFC values were ranked from highest to lowest before GSEA with clusterProfiler¹⁹. No DEG prefiltering was applied before GSEA. Parameters: seed = 2020, 10–500 genes per set (c2 gene set collection from MSigDB²¹). Screening criterion: adjusted p < 0.05 (BH method).
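The median split that defines the high- and low-expression groups can be sketched in Python as follows. The sample names and expression values are invented, and placing samples equal to the median in the low group is an assumption of this sketch, since the protocol does not say how ties at the median were handled.

```python
from statistics import median


def split_by_median(expression):
    """Return (high, low) sample lists for one focal gene.

    Samples strictly above the median go to 'high'; the rest go to 'low'
    (tie handling is an assumption of this sketch).
    """
    cut = median(expression.values())
    high = sorted(s for s, v in expression.items() if v > cut)
    low = sorted(s for s, v in expression.items() if v <= cut)
    return high, low


# Invented expression values of one focal gene across four samples.
focal_gene = {"S1": 5.0, "S2": 7.0, "S3": 6.0, "S4": 9.0}
high, low = split_by_median(focal_gene)
print(high, low)
```

The resulting group labels then replace the hypertension/control labels in the differential expression step that produces the ranked logFC list.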

Immune infiltration analysis (CIBERSORT)

The CIBERSORT algorithm²⁵ (based on linear support vector regression) deconvoluted the transcriptome matrix to estimate immune cell composition in mixed samples (data with immune cell enrichment score > 0 were selected). The final immune cell infiltration matrix of GSE75360 was visualized via a proportion bar chart. Spearman’s correlation was used to analyze immune cell-immune cell and key gene-immune cell associations, with results presented as a correlation heatmap (R package pheatmap) and correlation bubble plot (R package ggplot2), respectively.

Protein-protein interaction (PPI) network

PPI networks are systems of interconnected proteins regulating biological processes via interactions. Using the STRING database²⁶, a PPI network for key genes was constructed (minimum interaction score: 0.150, low confidence). Renin-angiotensin-related hub genes were selected by screening interacting genes. The GeneMANIA database²⁷, which identifies functionally similar genes using genomic and proteomic datasets, was used to predict functionally similar genes of key RAS genes and to construct a protein interaction network.

Construction of regulatory network

mRNA-TF network: Transcription factors (TFs) regulate gene expression at the transcriptional level by binding to regulatory regions of their target genes. TFs targeting hub genes and their regulatory relationships were retrieved from the ChIPBase database²⁸, and the mRNA-TF network was visualized using Cytoscape²⁹.

mRNA-miRNA network: miRNAs modulate multiple target genes (single targets may be co-regulated by multiple miRNAs). StarBase v3.0³⁰ was used to identify miRNAs associated with RASRDEGs, and the mRNA-miRNA network was visualized via Cytoscape.

mRNA-drug network: The Comparative Toxicogenomics Database (CTD)³¹ was used to predict drugs or compounds that directly or indirectly target hub genes. The mRNA-drug network (showing gene-drug interactions) was visualized with Cytoscape to complete network construction.

Ang II-induced HUVEC model

Human umbilical vein endothelial cells (HUVECs) were maintained at 37°C in a humidified incubator with 5% CO₂. Cells were maintained in complete endothelial cell culture medium supplemented with fetal bovine serum and antibiotics according to the supplier’s instructions. To establish an in vitro hypertension-related endothelial injury model, HUVECs were treated with angiotensin II (Ang II; 100 nM) for 48 h. Vehicle-treated cells were used as the control group.

For gene intervention experiments, small interfering RNAs targeting CST3 or FURIN (si-CST3 and si-FURIN), corresponding negative control siRNA (si-NC), CST3 or FURIN overexpression plasmids (oe-CST3 and oe-FURIN), and the corresponding empty-vector control (oe-NC) were transfected into HUVECs using a commercial transfection reagent according to the manufacturer’s protocol. After transfection, cells were exposed to Ang II and then harvested for expression validation and functional assays. Knockdown and overexpression efficiencies were confirmed by qRT-PCR and western blotting.

qRT-PCR

Total RNA was isolated from HUVECs with a standard RNA extraction reagent, and complementary DNA was generated using a reverse transcription kit. SYBR Green chemistry was used for qRT-PCR. Expression levels of LRP1, CTSD, MTHFR, AUTS2, FURIN, CST3, FCER1G, TBXAS1, IL-6, TNF-α, VCAM1, ICAM1, and eNOS were normalized to GAPDH and calculated by the 2^−ΔΔCt method.
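As a worked illustration of the 2^−ΔΔCt calculation, the Python sketch below normalizes a target gene to the reference gene (GAPDH in this protocol) in both the treated and control conditions. The Ct values are invented, and the function and argument names are chosen for this sketch.

```python
def relative_expression(ct_target_treated, ct_ref_treated,
                        ct_target_control, ct_ref_control):
    """Fold change by the 2^-ddCt method.

    dCt  = Ct(target) - Ct(reference) within each condition
    ddCt = dCt(treated) - dCt(control)
    """
    delta_treated = ct_target_treated - ct_ref_treated
    delta_control = ct_target_control - ct_ref_control
    return 2.0 ** (-(delta_treated - delta_control))


# Invented Ct values: target vs GAPDH in Ang II-treated and vehicle-treated cells.
fold_change = relative_expression(
    ct_target_treated=24.0, ct_ref_treated=18.0,
    ct_target_control=26.0, ct_ref_control=18.0,
)
print(fold_change)
```

Here ΔCt is 6 in treated cells and 8 in controls, so ΔΔCt is −2 and the target is estimated to be 4-fold higher after treatment.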

Western blotting

For western blot analysis, proteins were extracted with RIPA lysis buffer and quantified using a BCA assay. Equal protein amounts were resolved by SDS-PAGE and transferred to PVDF membranes. After blocking, membranes were incubated with primary antibodies against CST3, FURIN, TBXAS1, or GAPDH and then with suitable secondary antibodies. Bands were detected by chemiluminescence, and densitometry was normalized to GAPDH. Secreted CST3 in culture supernatants was quantified with an ELISA kit following the manufacturer’s protocol.

Cell viability

Cell viability was assessed using the Cell Counting Kit-8 (CCK-8) assay. Briefly, transfected and Ang II-treated HUVECs were seeded into 96-well plates, and absorbance at 450 nm was measured at 0, 24, 48, and 72 h after addition of the CCK-8 reagent. Cell migration was evaluated using Transwell chambers. After the indicated interventions, cells were seeded into the upper chambers, and migrated cells on the lower membrane surface were fixed, stained, and counted under a microscope in randomly selected fields.

Inflammatory test

To evaluate inflammatory activation, oxidative stress, and endothelial function, IL-6, TNF-α, VCAM1, ICAM1, and eNOS mRNA levels were detected by qRT-PCR. Nitric oxide (NO) levels in the culture supernatant were measured using a commercial NO assay kit, and intracellular reactive oxygen species (ROS) levels were detected using DCF fluorescence according to the manufacturer’s instructions.

Statistical analysis

Transcriptomic processing and modeling were performed in R. Continuous variables were assessed for normality with the Shapiro-Wilk test. For two-group comparisons, independent-samples t-tests were used for normally distributed variables, whereas Wilcoxon rank-sum tests were used for non-normal variables. For three or more groups, one-way analysis of variance with appropriate post hoc testing was used when normality and homogeneity of variance assumptions were met; otherwise, the Kruskal-Wallis test was applied. CCK-8 time-course data were analyzed using two-way analysis of variance. Spearman correlation coefficients were calculated for association analyses. Unless otherwise stated, experimental results are shown as mean ± SD, and two-tailed p < 0.05 was considered significant.
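The test-selection rules in this section amount to a small decision procedure, sketched here in Python. The function only returns the name of the test implied by the stated rules and performs no statistics; the argument names are hypothetical, and the normality and variance flags are assumed to come from checks run elsewhere (for example, the Shapiro-Wilk test for normality).

```python
def choose_test(n_groups, all_normal, equal_variance=True):
    """Return the name of the test implied by the rules in this section.

    n_groups:       number of groups being compared
    all_normal:     True if every group passed the normality check
    equal_variance: True if homogeneity of variance holds (used for 3+ groups)
    """
    if n_groups < 2:
        raise ValueError("at least two groups are required")
    if n_groups == 2:
        if all_normal:
            return "independent-samples t-test"
        return "Wilcoxon rank-sum test"
    if all_normal and equal_variance:
        return "one-way ANOVA with post hoc testing"
    return "Kruskal-Wallis test"


print(choose_test(2, all_normal=True))
print(choose_test(3, all_normal=True, equal_variance=False))
```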

Results

Cleaning of hypertension datasets

To ensure the reliability of subsequent analyses, datasets GSE75360 and GSE74144 were first subjected to probe annotation and data normalization using the R package limma. The distributions of gene expression values before and after normalization are shown in Supplemental File 1—Supplemental Figure S1A-D, with orange representing hypertension samples and blue representing control samples.

For GSE75360: Before normalization (Supplemental File 1—Supplemental Figure S1A), the expression value ranges of hypertension and control samples showed obvious discrepancies, with inconsistent median and quartile distributions among samples, indicating significant batch effects or technical variation. After normalization (Supplemental File 1—Supplemental Figure S1B), the expression value distributions of all samples were highly aligned—medians and quartile ranges of hypertension and control groups overlapped substantially, and the overall data dispersion was reduced, confirming effective elimination of non-biological variation.

For GSE74144: Pre-normalization (Supplemental File 1—Supplemental Figure S1C), samples exhibited heterogeneous expression patterns, with partial hypertension samples showing abnormally high or low expression values relative to controls, and poor data consistency between groups. Post normalization (Supplemental File 1—Supplemental Figure S1D), the expression profiles of hypertension and control samples were homogenized—inter-sample differences were minimized, and the data conformed to the assumptions of subsequent differential expression analysis and machine learning modeling.

Hypertension-related renin-angiotensin differentially expressed genes

GSE75360 samples were separated into hypertension and control groups, and limma was used to test for expression differences between groups. A total of 173 genes met the |logFC| > 0.45 and p < 0.05 thresholds, including 26 upregulated and 147 downregulated genes. These differential expression results were visualized in a volcano plot (Figure 2A), where RAS-related upregulated and downregulated DEGs were annotated.

To obtain RASRDEGs, the intersection of all DEGs meeting |logFC| > 0.45 and p < 0.05 with the RASRGs was identified, and a Venn diagram was plotted (Figure 2B). A total of 18 RASRDEGs were identified, namely: AUTS2, IFNG, MTHFR, LRP1, FCER1G, CYBB, FURIN, CTSD, CST3, TBXAS1, HMOX1, EHBP1L1, ITGA2B, PPBP, VAMP2, DLG4, DNM2, and TNFRSF1A. Based on the intersection results, we analyzed expression differences in RASRDEGs across sample groups in the GSE75360 dataset and presented the findings as a heatmap using the R package pheatmap (Figure 2C). Finally, the R package RCircos was employed to analyze the localization of the 18 RASRDEGs on human chromosomes, generating a chromosomal localization map (Figure 2D). The map shows that many RASRDEGs are located on chromosome 12 and chromosome 17, with IFNG, LRP1, and TNFRSF1A located on chromosome 12, and ITGA2B, VAMP2, and DLG4 located on chromosome 17.

Validation of differentially expressed genes and ROC curve profiling

To explore the differential expression of RASRDEGs between hypertensive and control samples in the GSE75360 dataset, expression levels of the 18 RASRDEGs were compared in a group comparison plot (Supplemental File 1—Supplemental Figure S2A). Fourteen RASRDEGs differed significantly between the hypertension and control groups: AUTS2, CST3, CTSD, CYBB, EHBP1L1, FCER1G, FURIN, HMOX1, ITGA2B, LRP1, MTHFR, PPBP, TBXAS1, and VAMP2. Next, using the R package pROC, ROC curves were plotted based on the expression levels of these statistically significant RASRDEGs within the GSE75360 dataset. The ROC curves (Supplemental File 1—Supplemental Figure S2B-E) show that the expression levels of the 14 significantly different RASRDEGs are moderately accurate (0.7 < AUC < 0.9) in classifying hypertension and control samples. Finally, we calculated the correlations among the 18 RASRDEGs and plotted a correlation heatmap to display the results (Supplemental File 1—Supplemental Figure S2F). Among the 18 RASRDEGs, MTHFR, LRP1, FCER1G, CYBB, FURIN, CTSD, CST3, TBXAS1, HMOX1, EHBP1L1, ITGA2B, PPBP, VAMP2, DLG4, DNM2, and TNFRSF1A mainly show positive correlations with other genes, while IFNG and AUTS2 mainly show negative correlations.

Enrichment analysis of GO and KEGG

Using GO/KEGG enrichment analyses, we characterized the functional profiles of the 18 RASRDEGs in hypertension, examining their involvement in BP, CC, MF, and key pathways. The detailed results of these analyses are shown in Supplemental Table S1. The 18 RASRDEGs are primarily involved in BP such as antigen processing and presentation of exogenous peptide antigens via MHC class II, receptor-mediated endocytosis, and other related processes in hypertension; CC such as tertiary granules, endosomal vesicle membranes, endosomal vesicles, and secretory granule membranes; and MF such as heme binding, porphyrin binding, amyloid beta binding, flavin adenine dinucleotide binding, and peptide binding. KEGG pathway analysis revealed their significant involvement in several key biological pathways, including fluid shear stress and atherosclerosis, tuberculosis, HIF-1 signaling pathway, sphingolipid signaling pathway, and platelet activation. The GO/KEGG enrichment results were visualized using bubble plots (Supplemental File 1—Supplemental Figure S3A). Network diagrams of BP, CC, MF, and biological pathways were also drawn based on the GO/KEGG enrichment analysis results (Supplemental File 1—Supplemental Figure S3B-E). The lines link molecules to the entries that annotate them, with larger nodes indicating more molecules included in the entry.

GSEA

To investigate transcriptome-wide pathway patterns associated with hypertension in GSE75360, preranked GSEA was performed using the ranked gene list, and the significant results are summarized in Supplemental Table S2. The enrichment plots are shown in Supplemental File 1—Supplemental Figure S4A-E. The results highlighted predominantly immune- and inflammation-related signatures, including an overview of pro-inflammatory and pro-fibrotic mediators (Supplemental File 1—Supplemental Figure S4B), immune infiltration in pancreatic cancer (Supplemental File 1—Supplemental Figure S4C), IL26 signaling pathway (Supplemental File 1—Supplemental Figure S4D), and development and heterogeneity of the ILC family (Supplemental File 1—Supplemental Figure S4E). These findings support that the hypertension-associated transcriptomic changes identified in this study are linked mainly to immune and inflammatory pathway activity rather than to a single isolated signaling process.

Construction of diagnostic models

Based on the 18 RASRDEGs and two machine learning algorithms, key genes for hypertension were further selected. In the logistic model, eight genes had p < 0.05, namely: LRP1, CTSD, MTHFR, AUTS2, FURIN, CST3, FCER1G, and TBXAS1, which were visualized using a forest plot (Figure 3A). In the RF algorithm, the top 15 important genes were selected, namely: LRP1, FCER1G, FURIN, MTHFR, EHBP1L1, CYBB, ITGA2B, VAMP2, PPBP, AUTS2, CST3, CTSD, IFNG, TBXAS1, and TNFRSF1A, visualized using a MeanDecreaseGini scatter plot (Figure 3B). Finally, the intersection of the genes selected by the two machine learning algorithms yielded 8 key genes: LRP1, CTSD, MTHFR, AUTS2, FURIN, CST3, FCER1G, and TBXAS1 (Figure 3C).

Validation of hypertension diagnostic models

First, the R package pROC was used to plot ROC curves based on the predicted probabilities from the logistic regression model in the discovery dataset GSE75360. The ROC curve indicates (Supplemental File 1—Supplemental Figure S5A) that the logistic regression model has high discrimination (AUC > 0.9) in classifying hypertension and control samples. The linear predicted value (η) in the logistic regression model is calculated using the following formula:

[Equation image: linear predictor η written as a coefficient-weighted sum of key gene expression levels, including terms for LRP1, CTSD, FURIN, CST3, FCER1G, and TBXAS1.]

Next, to evaluate the calibration and discriminatory capability of the hypertension diagnostic model, a calibration curve was plotted. By fitting actual probabilities to model-predicted probabilities under varying conditions, the concordance between model predictions and actual outcomes was assessed (Supplemental File 1—Supplemental Figure S5B). The model's calibration curve demonstrated a high degree of alignment between the dotted calibration line and the diagonal line of an ideal model. The hypertension diagnostic model constructed using key genes from the GSE75360 dataset underwent decision curve analysis to evaluate its potential clinical utility, with results presented in Supplemental File 1—Supplemental Figure S5C. Analysis indicates that within a specific range, the model curve consistently and stably outperforms both the all-positive and all-negative curves. Furthermore, the model demonstrates favorable net benefit within the analyzed threshold range.

To further illustrate the value of the hypertension diagnostic model, a nomogram was plotted based on the key genes to display their interrelationships in dataset GSE75360 (Supplemental File 1—Supplemental Figure S5D). The results indicate that the expression level of the key gene TBXAS1 significantly contributes to the utility of the hypertension diagnostic model, whereas the expression level of MTHFR contributes less.

External evaluation of the hypertension diagnostic model

First, the ROC curve was plotted based on the predicted probabilities of logistic regression in the GSE74144 dataset using the R package pROC. The ROC curve indicates (Figure 4A) that the logistic regression model has moderate discriminatory ability (0.7 < AUC < 0.9) in classifying hypertension samples and control samples. The linear predictor (η) in a logistic regression model is calculated as follows:

[Equation image: linear predictor η written as a coefficient-weighted sum of key gene expression levels, including terms for CST3, FCER1G, and TBXAS1.]

Next, to evaluate the calibration of the hypertension diagnostic model, a calibration curve was plotted. The agreement between model-predicted probabilities and actual outcomes was assessed (Figure 4B). The calibration curve for this hypertension diagnostic model indicates that the dashed calibration line deviates slightly from the ideal diagonal. The potential clinical utility of the hypertension diagnostic model based on key genes in the GSE74144 dataset was assessed through DCA (Figure 4C). The model curve remains stable within a specific range and consistently exceeds the all-positive and all-negative reference lines, yielding net benefit within the analyzed threshold range.

Therefore, the model should be interpreted as a candidate transcriptomic classifier evaluated within the analyzed public cohorts, rather than as a clinically validated diagnostic tool.

To further evaluate the diagnostic model for hypertension, a nomogram was plotted on the basis of key genes to illustrate the interrelationships among these genes within the GSE74144 dataset (Figure 4D). The results indicate that the expression level of the key gene MTHFR contributes more substantially to the performance of the hypertension diagnostic model than other variables, whereas the expression level of LRP1 offers comparatively less utility.

Single-Gene GSEA

To investigate the pathways associated with the 8 key genes in the GSE75360 dataset, single-gene GSEA was performed after stratifying samples into high- and low-expression groups for each key gene. The enrichment plots are shown in Supplemental File 1—Supplemental Figure S6A-D and Supplemental Figure S7A-D. Overall, multiple key genes were associated with recurrent immune- and inflammation-related signatures. AUTS2, CTSD, FCER1G, and FURIN were linked to signatures such as Wilcox Response to Progesterone Up, Mili Pseudopodia Haptotaxis Up, Li Wilms Tumor Vs Fetal Kidney 1 Up, and Blanco Melo Bronchial Epithelial Cells Influenza A Infection Dn (Supplemental File 1—Supplemental Figure S6A,C,D and Supplemental Figure S7A). CST3, LRP1, and TBXAS1 were enriched mainly in inflammatory and immune-regulatory pathways, including Interleukin 10 Signaling, Lian Neutrophil Granule Constituents, Medicus Reference CXCR GNB G PI3K AKT Signaling Pathway, WP IL26 Signaling Pathway, and Overview of Proinflammatory and Profibrotic Mediators (Supplemental File 1—Supplemental Figure S6B and Supplemental Figure S7B,D). MTHFR showed downregulation of several response-related signatures together with enrichment of Selenoamino Acid Metabolism (Supplemental File 1—Supplemental Figure S7C).

CIBERSORT

CIBERSORT estimated the relative abundance of 22 immune cell types in GSE75360. The immune cell composition of each sample is shown in Figure 5A. Correlations among immune cell types in hypertensive samples are displayed in Figure 5B. The strongest positive correlation was observed between macrophage M2 cells and activated dendritic cells (r = 0.450, p < 0.05), whereas the strongest negative correlation was observed between CD8+ T cells and monocytes (r = -0.582, p < 0.05). Gene-immune cell correlations with p < 0.05 are shown in Figure 5C. CST3 had the strongest positive correlation with monocytes (r = 0.609, p < 0.05), while FURIN had the strongest negative correlation with macrophage M2 cells (r = -0.445, p < 0.05).
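
A gene-immune cell correlation of the kind shown in Figure 5C can be sketched as below. The sketch assumes a Spearman rank correlation. The correlation method is not specified in this section, so this is an assumption, and the expression values and cell fractions are hypothetical.

```python
from math import sqrt
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1..n, assigning tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0.0 or sd_y == 0.0:
        raise ValueError("correlation is undefined for a constant input")
    return cov / (sd_x * sd_y)


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical gene expression and estimated monocyte fractions for five samples.
gene_expression = [5.1, 6.3, 7.0, 8.2, 9.4]
monocyte_fraction = [0.10, 0.12, 0.11, 0.20, 0.25]
rho = spearman(gene_expression, monocyte_fraction)
```

With cohorts of about 20 samples, coefficients of this size carry wide uncertainty, so such correlations are best treated as exploratory.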

PPI network

First, the STRING database was used to construct a PPI network for the eight key genes (Supplemental File 1, Supplemental Figure S8A). The PPI network indicated that seven of the key genes were connected: CST3, CTSD, LRP1, FCER1G, MTHFR, FURIN, and TBXAS1. Subsequently, a network comprising these seven interacting genes and their functionally similar counterparts was predicted using the GeneMANIA website (Supplemental File 1, Supplemental Figure S8B). Lines of different colors represent co-expression, shared protein domains, and other types of association. This network encompasses the seven key genes and twenty functionally similar proteins.

Construction of regulatory networks

Predicted regulatory networks were constructed for the hub genes. StarBase identified miRNAs associated with the hub genes, and these interactions were visualized as an mRNA-miRNA network in Cytoscape (Figure 6A). The network included three hub genes and 37 miRNAs, with details in Supplemental File 2. ChIPBase was used to identify transcription factors binding to hub genes, and the mRNA-TF network was visualized in Cytoscape (Figure 6B). This network contained seven hub genes and 47 TFs, with details in Supplemental File 2. CTD was then used to identify potential hub gene-associated drugs or compounds, and Cytoscape was used to display the resulting mRNA-drug network (Figure 6C), which included three hub genes and 18 drugs or compounds.

Experimental validation of candidate genes in Ang II-induced HUVECs

To extend the transcriptomic findings to an endothelial experimental model, HUVECs were stimulated with Ang II (100 nM) for 48 h to establish a hypertension-related endothelial injury model (Figure 7A). qRT-PCR was first performed to validate the expression of the eight candidate genes selected by machine learning. Compared with control cells, Ang II-treated HUVECs showed increased mRNA expression of LRP1, CTSD, MTHFR, FURIN, CST3, FCER1G, and TBXAS1, whereas AUTS2 showed a non-significant upward trend (Figure 7B). Among these genes, CST3 exhibited a marked increase after Ang II stimulation.

To further validate candidate genes at the protein level, western blotting was performed for CST3, FURIN, and TBXAS1. Consistent with the qRT-PCR results, Ang II treatment increased the protein expression levels of CST3, FURIN, and TBXAS1 in HUVECs (Figure 7C). Because CST3 is a secreted protein, ELISA analysis was additionally performed, and it was confirmed that Ang II increased CST3 secretion in the culture supernatant (Figure 7C). These results support the reliability of the transcriptomic screening results and indicate that CST3, FURIN, and TBXAS1 are responsive to Ang II-induced endothelial stimulation.

Because CST3 and FURIN were selected for subsequent functional experiments, their knockdown and overexpression efficiencies were verified before phenotype assays. qRT-PCR and western blotting showed that si-CST3 and si-FURIN markedly reduced CST3 and FURIN expression compared with si-NC, whereas oe-CST3 and oe-FURIN significantly increased CST3 and FURIN expression compared with oe-NC (Figure 7D). These results confirmed successful construction of CST3 and FURIN loss- and gain-of-function models in HUVECs.

Effects of CST3 and FURIN modulation on Ang II-induced endothelial phenotypes

Functional experiments were then performed to evaluate whether CST3 and FURIN participate in Ang II-induced endothelial changes. CCK-8 analysis showed that Ang II-treated si-NC cells displayed a progressive increase in OD₄₅₀ over time. CST3 knockdown or FURIN knockdown reduced the Ang II-enhanced viability/proliferation signal at later time points, indicating that both genes participate in the proliferative/viability response of HUVECs under Ang II stimulation. FURIN overexpression also altered the CCK-8 time-course response, suggesting that FURIN may regulate Ang II-induced endothelial viability in a context-dependent manner (Figure 8A).

Transwell migration assays showed that Ang II markedly increased the number of migrated HUVECs compared with control cells. Knockdown of CST3 or FURIN significantly decreased Ang II-induced migration, whereas overexpression of CST3 or FURIN increased migration relative to the corresponding knockdown conditions (Figure 8B). These data indicate that CST3 and FURIN are involved in Ang II-induced endothelial migratory responses.

We next assessed inflammation, oxidative stress, and endothelial function markers. Ang II stimulation increased the expression of inflammatory and adhesion-related markers, including IL-6, TNF-α, VCAM1, and ICAM1, and enhanced ROS accumulation, while reducing eNOS expression and NO levels. Knockdown of CST3 or FURIN attenuated Ang II-induced increases in IL-6, TNF-α, VCAM1, ICAM1, and ROS and partially restored eNOS/NO-related endothelial functional readouts. In contrast, CST3 or FURIN overexpression generally enhanced migratory and inflammatory/oxidative phenotypes compared with the corresponding knockdown groups, although eNOS and NO readouts also increased in the overexpression conditions (Figure 8C). Together, these experiments suggest that CST3 and FURIN are functionally associated with Ang II-induced endothelial activation, inflammation, oxidative stress, and endothelial functional changes, while the directionality of some endothelial function readouts requires further mechanistic clarification.

DATA AVAILABILITY:

All data generated or analyzed during this study are included in the article and supplemental files.

Figure 1. Flowchart of comprehensive analysis. This flowchart summarizes the overall study design, including RAS-related gene curation, identification of RAS-related differentially expressed genes, functional enrichment analysis, immune infiltration analysis, protein-protein interaction and regulatory network construction, machine learning-based feature selection, and construction and external testing of the candidate diagnostic model for hypertension. Abbreviations: RASRGs = Renin-Angiotensin System-related genes; RASRDEGs = Renin-Angiotensin System-related differentially expressed genes; RF = random forest. Please click here to view a larger version of this figure.

Figure 2. Differential gene expression analysis. (A) Volcano plot of DEGs between hypertensive and control samples in the GSE75360 dataset. (B) Venn diagram of DEGs and RASRGs in dataset GSE75360. (C) Heatmap of RASRDEGs in dataset GSE75360. (D) Chromosome localization map of RASRDEGs. These results show that a subset of hypertension-associated DEGs overlaps with RAS-related genes and that these genes display distinct expression patterns and chromosomal distribution. Orange represents hypertension samples, and blue represents control samples. Within the heatmap, red indicates high expression and blue indicates low expression. Abbreviations: DEGs = differentially expressed genes; RASRGs = Renin-Angiotensin System-related genes; RASRDEGs = Renin-Angiotensin System-related differentially expressed genes. Please click here to view a larger version of this figure.

Figure 3. Selection of key genes for hypertension through machine learning. (A) Forest plot of the 18 RASRDEGs included in the logistic regression model for hypertension diagnosis. (B) MeanDecreaseGini scatter plot in the RF algorithm. (C) Venn diagram of the intersection of genes selected by the two machine learning algorithms. These analyses support the selection of eight candidate genes for subsequent model construction. Abbreviations: RF = random forest. Please click here to view a larger version of this figure.

Figure 4. External validation of the candidate hypertension diagnostic model in GSE74144. (A) Receiver operating characteristic curve of the logistic regression model in the independent dataset GSE74144. (B) Calibration curve of the candidate diagnostic model in GSE74144. (C) Decision curve analysis of the candidate diagnostic model in GSE74144. (D) Nomogram of the candidate diagnostic model in GSE74144. These results show that the eight-gene candidate diagnostic model retained moderate discriminatory ability and acceptable calibration/net benefit in an independent external cohort. Abbreviations: DCA = decision curve analysis; ROC = receiver operating characteristic. Please click here to view a larger version of this figure.

Figure 5. Immune cell infiltration profiling of the GSE75360 dataset via the CIBERSORT algorithm. (A) Bar chart showing the proportion of immune cells in the GSE75360 dataset. (B) Correlation heatmap of immune cells in the GSE75360 dataset. (C) Correlation bubble chart of immune cell infiltration abundance and key genes in the GSE75360 dataset. These analyses show that the hypertension-associated transcriptomic signature is accompanied by variation in immune cell composition and gene–immune cell correlations. Please click here to view a larger version of this figure.

Figure 6. Hub gene regulatory network. (A) mRNA-miRNA regulatory network. (B) mRNA-TF regulatory network. (C) mRNA-drug regulatory network. These predicted networks are hypothesis-generating resources for possible upstream regulators and downstream drug associations of the hub genes. Orange indicates mRNA, purple indicates miRNA, blue indicates TF, and pink indicates drugs. Abbreviation: TF = transcription factor. Please click here to view a larger version of this figure.

Figure 7. Experimental validation of candidate genes and CST3/FURIN intervention efficiency in Ang II-induced HUVECs. (A) Schematic diagram of the Ang II-induced hypertension-related endothelial cell model and experimental intervention design. HUVECs were treated with Ang II (100 nM) and harvested after 48 h. si-NC, si-CST3, oe-NC, oe-CST3, si-FURIN, and oe-FURIN were used for gene intervention experiments. (B) qRT-PCR validation of eight candidate genes in control and Ang II-treated HUVECs. (C) Western blot validation of CST3, FURIN, and TBXAS1 protein expression and ELISA detection of secreted CST3. (D) qRT-PCR and western blot confirmation of CST3 and FURIN knockdown/overexpression efficiency. Data are presented as mean ± SD. ns, P > 0.05; *P < 0.05, **P < 0.01, ***P < 0.001 vs control or si-NC; &&&P < 0.001 vs oe-NC, as indicated. Please click here to view a larger version of this figure.

Figure 8. Functional effects of CST3 and FURIN modulation in Ang II-induced HUVECs. (A) CCK-8 assay showing cell viability/proliferation at 0, 24, 48, and 72 h after CST3 or FURIN knockdown/overexpression under Ang II stimulation. (B) Quantification of migrated cells per field using the Transwell assay. (C) Detection of inflammation-, oxidative stress-, and endothelial function-related markers, including IL-6, TNF-α, VCAM1, ICAM1, eNOS, NO, and ROS. Data are presented as mean ± SD. *P < 0.05, **P < 0.01, ***P < 0.001 vs control; #P < 0.05, ##P < 0.01, ###P < 0.001 vs Ang II + si-NC; &P < 0.05, &&P < 0.01, &&&P < 0.001 vs Ang II + si-CST3 or Ang II + si-FURIN, as indicated. Please click here to view a larger version of this figure.

	GSE75360	GSE74144
Platform	GPL10588	GPL13497
Species	Homo sapiens	Homo sapiens
Tissue	Peripheral Blood Mononuclear Cell	white blood cells
Samples in Hypertension group	10	14
Samples in Control group	11	8
Reference	PMID: 27779208

Table 1: GEO microarray chip information.

Supplemental File 1. Supplementary figures supporting transcriptomic analysis and candidate model evaluation. This file presents supplementary analyses for dataset normalization, RAS-related differentially expressed genes (RASRDEGs), enrichment results, gene set enrichment analysis, model evaluation, and hub-gene network assessment. Figure S1 shows expression distributions in GSE75360 and GSE74144 before and after normalization. Figure S2 summarizes group-level expression differences, ROC curves, and correlation patterns for the 18 RASRDEGs in GSE75360. Figure S3 presents GO and KEGG enrichment analyses of the 18 RASRDEGs. Figure S4 shows preranked GSEA results from the ranked GSE75360 transcriptome, highlighting immune- and inflammation-related signatures. Figure S5 presents internal evaluation of the eight-gene candidate diagnostic model in GSE75360, including ROC, calibration, decision curve analysis, and nomogram results. Figures S6 and S7 show single-gene GSEA results for the eight key genes, and Figure S8 presents STRING and GeneMANIA analyses of hub-gene protein-protein interaction and functional association networks. Please click here to download this file.

Supplemental File 2. Supplemental datasets supporting RAS gene curation and regulatory network construction. This file provides the curated renin-angiotensin system-related gene list used for intersection analysis and the predicted regulatory interaction datasets used to construct the mRNA-miRNA, mRNA-transcription factor, and mRNA-drug networks for the hub genes. Please click here to download this file.

Supplemental Table S1. GO and KEGG enrichment results for the 18 RASRDEGs. This table lists the significantly enriched GO biological process, cellular component, molecular function, and KEGG pathway terms identified from the 18 RASRDEGs. Enrichment significance was assessed using the hypergeometric test with Benjamini-Hochberg adjustment. Abbreviations: GO = Gene Ontology; BP = Biological Process; CC = Cellular Component; MF = Molecular Function; KEGG = Kyoto Encyclopedia of Genes and Genomes. Please click here to download this file.

Supplemental Table S2. Results of GSEA for dataset GSE75360. This table summarizes the significant gene sets identified by preranked GSEA of the ranked transcriptome in GSE75360, including set size, enrichment score, normalized enrichment score (NES), nominal P value, adjusted P value, and q value. Please click here to download this file.

Discussion


Hypertension is a complex cardiovascular disorder involving RAS dysregulation, immune-inflammatory activation, vascular injury, and endothelial dysfunction³². In the present study, we first used public blood transcriptomic datasets and machine learning approaches to identify RAS-related candidate biomarkers and to build an externally tested candidate diagnostic model. We then extended these bioinformatics findings through experimental validation in an Ang II-induced HUVEC model. This combined strategy strengthens the biological plausibility of the transcriptomic candidates by linking blood-derived RAS-associated signatures to endothelial responses under Ang II stimulation. In particular, the added cellular experiments enabled us to examine whether the key genes identified from transcriptomic and machine-learning analyses were responsive to a hypertensive endothelial microenvironment and whether selected genes could influence endothelial proliferation, migration, inflammation, oxidative stress, and endothelial functional markers.

Our analysis identified 18 RASRDEGs between hypertension and control samples. Subsequent application of two distinct machine learning algorithms (logistic regression and random forest) screened 8 key genes (LRP1, CTSD, MTHFR, AUTS2, FURIN, CST3, FCER1G, and TBXAS1) that may be relevant to hypertension-associated transcriptomic variation. Several of the identified key genes have previously been implicated in cardiovascular or immune regulation. For example, MTHFR gene polymorphism is a significant risk factor for hypertension, particularly H-type hypertension³³^,³⁴. LRP1 is involved in vascular homeostasis and atherosclerosis³⁵. FURIN has also been linked to blood pressure regulation and hypertension susceptibility³⁶. TBXAS1 encodes thromboxane synthase, which catalyzes the production of thromboxane A2, a potent vasoconstrictor and platelet activator with plausible relevance to hypertension biology³⁷. AUTS2 was also identified as one of the eight candidate genes in our machine-learning workflow. Although AUTS2 is best known for its role in neurodevelopment, its appearance in our candidate signature may reflect broader neuroimmune or systemic regulatory processes rather than an established direct causal mechanism in hypertension. In this context, Bishop et al. reported a high prevalence of cardiovascular risk factors, including high blood pressure, in autistic adults, suggesting that autism-related biology may intersect with cardiovascular phenotypes. However, because that study did not directly evaluate AUTS2, the relationship between AUTS2 expression and hypertension in our study should be considered hypothesis-generating and requires dedicated mechanistic validation³⁸. Our study supports the potential relevance of these genes in hypertension and places them within a broader RAS and immune-related context. 
The association of CST3 (cystatin C) and FCER1G (a subunit of the IgE receptor) with hypertension and immune infiltration warrants further investigation, as it may reveal additional candidate links between immunity, inflammation, and blood pressure regulation. Consistent with the transcriptomic screening results, qRT-PCR validation in Ang II-treated HUVECs showed increased expression of most of the eight key genes, including LRP1, CTSD, MTHFR, FURIN, CST3, FCER1G, and TBXAS1, whereas AUTS2 showed a non-significant increasing trend. At the protein level, CST3, FURIN, and TBXAS1 were further confirmed to be elevated after Ang II stimulation, and ELISA showed increased CST3 secretion. These findings provide experimental support that several blood-derived candidate genes are also responsive to Ang II-induced endothelial stress. The logistic regression diagnostic model constructed from these key genes demonstrated high discrimination in the discovery dataset (GSE75360) and moderate performance in the external validation dataset (GSE74144). These findings support the possibility that the identified genes may contribute to future biomarker development, but they do not establish immediate clinical applicability. The nomogram and decision curve analysis (DCA) further suggested potential value for risk stratification within the analyzed datasets.
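
The notion of discrimination used when comparing the discovery and external datasets can be made concrete with a minimal Python sketch of the area under the ROC curve. It uses the rank-based definition: the probability that a randomly chosen hypertensive sample receives a higher predicted score than a randomly chosen control, with ties counted as one half. This is an illustration with hypothetical labels and scores, not the pROC computation used in the study.

```python
from typing import Sequence


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (case, control) pairs ranked correctly; ties count as 0.5."""
    cases = [s for y, s in zip(y_true, scores) if y == 1]
    controls = [s for y, s in zip(y_true, scores) if y == 0]
    if not cases or not controls:
        raise ValueError("both classes must be present")
    wins = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                wins += 1.0
            elif case_score == control_score:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Hypothetical labels (1 = hypertension, 0 = control) and model scores.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.6, 0.2, 0.1]
toy_auc = roc_auc(toy_labels, toy_scores)
```

Because the number of case-control pairs is small in cohorts of this size, a single misranked sample shifts the AUC noticeably, which is one reason performance in the discovery dataset can be optimistic.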

Functional enrichment analysis (GO and KEGG) of the 18 RASRDEGs revealed their significant involvement in immune-related biological processes, including antigen processing, exogenous peptide antigen presentation via MHC class II molecules, and receptor-mediated endocytosis. They were also enriched in pathways including fluid shear stress and atherosclerosis, HIF-1 signaling, and platelet activation.

Specifically, the fluid shear stress pathway, known to regulate endothelial function, may influence hypertension development via endothelial dysfunction and atherosclerotic changes³⁹. The HIF-1 signaling pathway may contribute to hypertension by promoting vascular remodeling and inflammation through hypoxia-induced activation of pro-fibrotic and pro-inflammatory genes⁴⁰. Additionally, platelet activation pathways may exacerbate thrombo-inflammatory processes, further contributing to vascular injury⁴¹. Taken together, these pathway-level findings are consistent with the interpretation that dysregulation of RAS-related genes in hypertension is associated not only with classical vascular and renal processes but also with immune and inflammatory signaling. GSEA on the entire dataset further underscored the significant enrichment of pro-inflammatory and immune infiltration pathways (e.g., Overview of proinflammatory and profibrotic mediators, Immune infiltration in pancreatic cancer, IL26 signaling, and Development and heterogeneity of the ILC family). Among these, IL26 signaling, a key mediator of inflammatory responses, may facilitate hypertension by enhancing the production of pro-inflammatory cytokines and promoting vascular inflammation and endothelial dysfunction⁴². Similarly, the enrichment of pathways related to the development and heterogeneity of innate lymphoid cells (ILCs) suggests a potential role for these immune cells in hypertension pathogenesis, possibly through regulating local inflammation, tissue fibrosis, and vascular dysfunction⁴³. These observations support an immune-associated interpretation of the identified transcriptomic signatures, while still requiring direct experimental validation, and the HUVEC experiments further connected these pathway-level predictions with endothelial inflammatory and oxidative phenotypes. 
In Ang II-treated HUVECs, inflammatory markers including IL-6, TNF-α, VCAM1, and ICAM1 were increased, whereas eNOS expression and NO production were reduced and ROS levels were elevated. These results are consistent with Ang II-induced endothelial dysfunction characterized by inflammatory activation, oxidative stress, and impaired nitric oxide bioavailability.

The findings from immune infiltration analysis (using CIBERSORT, a tool for deconvolving immune cell composition from transcriptomic data) further supported immune involvement. We observed significant correlations among specific immune cell subtypes (e.g., monocytes, macrophage M2 cells, and CD8⁺ T cells) and between these immune cell subsets and several key genes. Monocytes, which are known to contribute to vascular inflammation and endothelial dysfunction, have been implicated in hypertension pathophysiology⁴⁴. Similarly, CD8⁺ T cells, which can promote vascular damage and hypertension through pro-inflammatory cytokine secretion and endothelial activation, have also been associated with hypertensive immune injury⁴⁴. In contrast, macrophage M2 cells, typically associated with anti-inflammatory and tissue-reparative functions, may reflect protective or compensatory immune responses⁴⁵. Notably, several key genes (e.g., CST3, FURIN) showed strong correlations with the level of infiltration by these immune cells. For example, CST3 exhibited a marked positive association with monocyte infiltration, suggesting a role in monocyte-associated inflammatory signatures. Conversely, FURIN showed a notable negative correlation with macrophage M2 infiltration, indicating its potential involvement in immune-regulatory pathway variation. These correlations should be interpreted cautiously, but they are consistent with the growing view that inflammation and immunity contribute importantly to hypertension pathogenesis. In the functional validation experiments, modulation of CST3 and FURIN further affected Ang II-induced endothelial phenotypes. Knockdown of CST3 or FURIN attenuated Ang II-enhanced cell viability/proliferation and migration, reduced the expression of inflammatory adhesion and cytokine markers, increased eNOS and NO levels, and decreased ROS accumulation. In contrast, overexpression of CST3 or FURIN generally aggravated these Ang II-induced changes. 
These findings suggest that CST3 and FURIN may not only serve as candidate biomarkers but may also participate in endothelial inflammatory activation, oxidative stress, and dysfunction under Ang II stimulation.

Furthermore, the PPI network and subsequent analysis using GeneMANIA revealed that these key genes are not isolated players but interact closely with each other and with other functionally similar proteins, forming a complex molecular network. Similarly, the predicted miRNA, TF, and drug interaction networks should be regarded as hypothesis-generating resources that may help prioritize future mechanistic and translational studies, rather than as direct evidence of regulatory causality or therapeutic efficacy. The experimental focus on CST3 and FURIN provides an initial functional bridge between the predicted network-level associations and endothelial cell behavior, although the upstream regulatory mechanisms and downstream molecular targets of CST3 and FURIN still require further clarification.

The limitations of this study primarily stem from data sources and validation constraints. Although we performed a systematic analysis using publicly available transcriptomic datasets and supported our findings with an independent validation cohort, the experimental validation was limited to an Ang II-induced HUVEC model and therefore cannot fully reproduce the complexity of hypertension in vivo. Consequently, the in vivo biological functions and therapeutic potential of key candidate biomarkers (such as CST3 and FCER1G) remain unconfirmed, and their mechanistic and clinical relevance require further investigation. In addition, the sample size of the discovery cohort was relatively small, which may increase the risk of unstable feature selection and optimistic model performance and may limit the robustness, generalizability, and reproducibility of the identified molecular signatures. The analyzed samples were peripheral blood mononuclear cells (GSE75360) and white blood cells (GSE74144) rather than primary vascular or renal tissues, so the observed expression changes should be interpreted as surrogate transcriptomic readouts rather than direct tissue-level mechanisms. Potential differences in cohort characteristics, platform effects, and unmeasured confounding related to metadata may also have influenced the observed signals.

In addition, integrating datasets from multiple platforms carries the risk of batch effects, which may compromise the reliability of genes identified as differentially expressed and undermine the consistency of results. Although the added in vitro experiments provide preliminary biological validation for CST3 and FURIN, they do not establish causality at the organismal level, and additional animal experiments, larger clinical cohorts, and mechanistic studies are needed to confirm the role of these genes in hypertension pathogenesis. Future research priorities should include large-scale, multicenter prospective cohort studies, complemented by additional functional experiments to verify these findings and facilitate their translational application in clinical practice.

In conclusion, this study identified candidate RAS-related biomarkers and immune-associated blood transcriptomic features in hypertension and developed an externally tested candidate diagnostic model. The Ang II-induced HUVEC experiments further validated the expression of key candidate genes and demonstrated that CST3 and FURIN are functionally associated with endothelial viability, migration, inflammatory activation, oxidative stress, and eNOS/NO-related endothelial functional changes. These findings provide transcriptomic and experimental support for the potential involvement of RAS-related candidate genes in hypertension-associated endothelial dysfunction, while further clinical, in vivo, and mechanistic validation remains necessary.

Disclosures


The authors have no conflicts of interest to declare.

Authors' contributions:

Conception and design of the work: Bai J, Wang Y;

Data collection: Yang X X, Du L, Shi J H, Li X X, Bai M;

Supervision: Bai J, Wang Y;

Analysis and interpretation of the data: Yang X X, Du L, Shi J H, Li X X, Bai M;

Statistical analysis: Bai J, Wang Y, Bai M;

Drafting the manuscript: Bai J, Wang Y;

Critical revision of the manuscript: all authors;

Approval of the final manuscript: all authors.

Acknowledgements


Funding: Gansu Provincial Department of Education Higher Education Faculty Innovation Fund Project (No. 2026B-268)

Materials

List of materials used in this article
Name	Company	Catalog Number	Comments
96-well cell culture plates			Used for CCK-8 cell viability assays
Angiotensin II (Ang II)			Used to induce endothelial injury in HUVECs at 100 nM for 48 h
Antibiotic solution			Added to endothelial cell culture medium according to the supplier’s instructions
BCA protein assay kit			Used to determine protein concentration for western blotting
Cell Counting Kit-8 (CCK-8)			Used to assess HUVEC viability/proliferation
Chemiluminescence detection reagent			Used to visualize western blot protein bands
ChIPBase v2.0	Sun Lab / Research Resource	N/A	Database used to retrieve transcription factor-target relationships for hub genes
CIBERSORT	Stanford University / Newman Lab	N/A	Immune cell deconvolution algorithm used to estimate the abundance of 22 immune cell types
clusterProfiler (R package)	Bioconductor	N/A	R package used for GO, KEGG, and GSEA analyses
CO2 cell culture incubator			Used to culture HUVECs at 37 °C with 5% CO2
Comparative Toxicogenomics Database (CTD)	NC State University / Mount Desert Island Biological Laboratory	N/A	Database used to predict gene-drug interactions
Complete endothelial cell culture medium			Used for HUVEC culture
Computer workstation with internet access	Any standard research computing platform	N/A	Used for data download, preprocessing, statistical analysis, and visualization
CST3 ELISA kit			Used to detect secreted CST3 in cell culture supernatant
CST3 overexpression plasmid (oe-CST3)			Used for CST3 gain-of-function experiments in HUVECs
Cytoscape	Cytoscape Consortium	N/A	Software used to visualize mRNA-miRNA, mRNA-TF, and mRNA-drug regulatory networks
DCF fluorescent probe/ROS assay reagent			Used to detect intracellular ROS levels
Empty-vector control plasmid (oe-NC)			Used as the overexpression control
Fetal bovine serum			Supplement added to endothelial cell culture medium
FURIN overexpression plasmid (oe-FURIN)			Used for FURIN gain-of-function experiments in HUVECs
GeneCards	Weizmann Institute of Science	N/A	Database used to curate RAS-related genes
GeneMANIA	University of Toronto	N/A	Web-based tool used to identify functionally related genes
GEO database	NCBI	N/A	Public database used to access GSE75360 and GSE74144 datasets
GEOquery (R package)	Bioconductor	N/A	R package used to download GEO datasets
ggDCA (R package)	CRAN / GitHub source used by authors	N/A	R package used for decision curve analysis
ggplot2 (R package)	CRAN	N/A	R package used for data visualization
Human umbilical vein endothelial cells (HUVECs)			Cell model used for Ang II-induced endothelial injury experiments
limma (R package)	Bioconductor	N/A	R package used for normalization and differential expression analysis
MSigDB	Broad Institute	N/A	Gene set database used for GSEA
Negative control siRNA (si-NC)			Used as the knockdown control
Nitric oxide assay kit			Used to measure NO levels in culture supernatant
pheatmap (R package)	CRAN	N/A	R package used to generate heatmaps
Primary antibody against CST3			Used for western blotting
Primary antibody against FURIN			Used for western blotting
Primary antibody against GAPDH			Used as the western blot loading control
Primary antibody against TBXAS1			Used for western blotting
pROC (R package)	CRAN	N/A	R package used to generate ROC curves and calculate AUC
PubMed	U.S. National Library of Medicine	N/A	Literature database used to supplement RAS-related gene curation
PVDF membrane			Used for western blot protein transfer
qRT-PCR primers			Used to detect candidate genes and inflammatory/endothelial markers
R software	R Foundation for Statistical Computing	N/A	Statistical computing environment used for all analyses
randomForest (R package)	CRAN	N/A	R package used for feature selection by random forest
RCircos (R package)	CRAN	N/A	R package used to visualize chromosomal localization of RASRDEGs
Reverse transcription kit			Used to synthesize complementary DNA
RIPA lysis buffer			Used for total protein extraction
rms (R package)	CRAN	N/A	R package used to construct the nomogram
RNA extraction reagent			Used to extract total RNA from HUVECs
Secondary antibodies			Used for western blotting
siRNA targeting CST3 (si-CST3)			Used for CST3 knockdown experiments
siRNA targeting FURIN (si-FURIN)			Used for FURIN knockdown experiments
starBase v3.0	Sun Yat-sen University	N/A	Database used to identify miRNA-target interactions
STRING	STRING Consortium	N/A	Database used to construct the protein-protein interaction network
SYBR Green qPCR reagent			Used for qRT-PCR detection
Transfection reagent			Used to transfect siRNAs and overexpression plasmids into HUVECs
Transwell chambers			Used for HUVEC migration assays
