$$\rightleftharpoonup{xx}$$
$$\longleftharp{xx}$$,
$$\longrightharp{xx}$$,
Differential gene expression analysis
Differential expression profiling was performed to investigate transcriptional alterations associated with HER2+ breast cancer progression and therapy resistance using two independent RNA-seq datasets, GSE231524 and GSE231525. In the first dataset (GSE231524), a total of 19,727 genes were initially quantified. Following normalization, filtering, and application of the adjusted regression model (ARM) for differential expression, 6,604 genes were retained as significantly expressed (Supplemental Table S1). After gene symbol standardization, duplicate removal, and annotation refinement, 6,300 unique genes were finalized for downstream analyses. The volcano plots, MA plots, and dispersion estimates (Figure 3A-C) clearly illustrate the distribution of significantly upregulated and downregulated genes (adjusted p < 0.05, |log₂FC| ≥ 1). The dispersion estimates show well-fitted variance patterns across mean normalized counts, indicating robust normalization and appropriate model fitting in the DESeq2 workflow45.
Similarly, for the second dataset (GSE231525), an initial set of 18,372 genes was obtained. After applying the same ARM-based differential expression analysis pipeline, 9,508 genes were identified as significantly expressed (Supplemental Table S2). Upon renaming, duplicate removal, and quality control filtering, 8,510 unique genes were retained for further comparative analysis. The visualizations (Figure 3D-F) demonstrate consistent patterns of gene expression variation across samples, with clearly defined clusters of upregulated and downregulated genes under experimental conditions. The MA plots and dispersion estimates show that the data exhibit well-stabilized variance and no major outlier influence, validating the reliability of downstream integrative analyses46.
Together, these analyses highlight the reproducibility and integrity of gene expression patterns between the two datasets. Both GSE231524 and GSE231525 exhibited distinct transcriptional profiles consistent with HER2-associated signaling perturbations and therapy-induced metabolic stress. The processed datasets, now harmonized and quality-controlled, form the foundation for identifying mitochondrial oxidative stress-related genes and subsequent network-based enrichment analyses47.

Figure 3. Differential gene expression analysis of HER2+ breast cancer datasets. Results for (A-C) GSE231524 (control vs. breast cancer), and (D-F) for GSE231525 (control vs. DUSP6 knockdown). (A,D) Volcano plots display significantly upregulated (red) and downregulated (blue) genes (padj. < 0.05, |log₂FC| ≥ 1). (B,E) MA plots illustrate log₂ fold change versus mean normalized counts, confirming stable variance across samples. (C,F) Dispersion plots show fitted variance trends, validating proper normalization in DESeq2. Together, the datasets demonstrate consistent transcriptional alterations and robust model fitting suitable for downstream analysis. Please click here to view a larger version of this figure.
Identification of mitochondrial oxidative stress-related differentially expressed genes (MOS-DEGs)
Following differential expression analysis, the two datasets (GSE231524 and GSE231525) were cross-compared to identify common transcriptional alterations associated with HER2+ breast cancer therapy resistance. To specifically focus on metabolic and oxidative stress-related pathways, both DEG lists were overlapped with a curated mitochondrial gene (MG) dataset comprising 1,137 genes obtained from MitoCarta3.0, KEGG Oxidative Phosphorylation, and GO:0006979 (response to oxidative stress) databases.
The Venn diagram (Figure 4A) illustrates the overlap among DEGs from both datasets and the mitochondrial gene set. From this intersection, a total of 262 overlapping genes designated as Mitochondrial Oxidative Stress-Related Differentially Expressed Genes (MOS-DEGs)-were identified. These overlapping genes represent a convergent mitochondrial response to HER2-targeted therapy stress, integrating oxidative stress adaptation, metabolic reprogramming, and redox homeostasis. Specifically, 2,706 unique DEGs were exclusive to GSE231524, 4,870 were unique to GSE231525, while 3,167 genes were shared between the two datasets. The overlapping set of 262 genes with mitochondrial origin underscores the biological convergence between redox signaling, energy metabolism, and mitochondrial function in resistant breast cancer phenotypes (Supplemental Table S3). To further characterize these MOS-DEGs, an expression heatmap was generated to visualize transcriptional variation across experimental conditions (Figure 4B). Using RStudio (v4.3.2), hierarchical clustering was performed via the ComplexHeatmap (v2.18.0) and pheatmap (v1.0.12) packages, while data normalization and scaling were achieved through DESeq2 (v1.42.0) and ggplot2 (v3.5.0). The heatmap displays the z-score-scaled expression profiles of the 262 MOS-DEGs across all samples, revealing distinct clustering patterns that differentiate control, drug-tolerant, and resistant phenotypes.
Notably, several genes associated with mitochondrial respiration and oxidative stress defense including PRDX3, SOD2, NDUFA1, COX6C, ATP5ME, and GPX1 exhibited substantial upregulation in resistant and DUSP6-knockdown samples, whereas metabolic regulators such as IDH2, ACLY, and SDHA. showed variable modulation, indicating dynamic shifts in mitochondrial energy metabolism. This transcriptional reprogramming supports the hypothesis that HER2+ resistant cells rely on mitochondrial oxidative stress adaptation and metabolic plasticity for survival under therapeutic pressure. Overall, the integration of mitochondrial gene sets with differential expression results provided a targeted subset of functionally relevant genes MOS-DEGs that are central to redox balance, oxidative phosphorylation, and metabolic resilience in resistant breast cancer cells. The expression heatmap confirms that these genes form a coherent transcriptional signature associated with therapy adaptation and mitochondrial functional remodeling.

Figure 4. Identification and expression profiling of MOS-DEGs. (A) Venn diagram showing the overlap between DEGs from GSE231524, GSE231525, and 1,137 mitochondrial genes. The intersection identifies 262 shared MOS-DEGs involved in oxidative stress and mitochondrial metabolism. (B) Heatmap displaying z-score-scaled expression profiles of the 262 MOS-DEGs across control, drug-tolerant, and resistant samples. The heatmap was generated in RStudio using the ComplexHeatmap, pheatmap, ggplot2, and DESeq2 packages, demonstrating clear clustering between experimental conditions and highlighting upregulation of key mitochondrial oxidative stress regulators in resistant phenotypes. Please click here to view a larger version of this figure.
Protein-protein interaction (PPI) network and pathway enrichment analysis
Subsequent protein-protein interaction (PPI) analysis using the STRING database revealed a densely interconnected mitochondrial oxidative stress-related DEG (MOS-DEG) network consisting of 210 nodes and 951 edges, with an average node degree of 9.06 and a local clustering coefficient of 0.457 (Figure 5A). The PPI enrichment p-value (< 1.0 × 10⁻16) indicated that the observed connectivity was significantly greater than expected by chance, reflecting a functionally coherent mitochondrial interaction landscape. Core hub proteins such as COX6C, NDUFA10, ATP5F1, PRDX3, and SOD2 emerged as central mediators of redox homeostasis and mitochondrial metabolic regulation, underscoring their importance in maintaining oxidative balance and energy stability during therapy-induced stress.
To refine the network and identify the most influential regulatory elements, the MOS-DEG interactome was imported into Cytoscape and analyzed using the CytoHubba MCC (Maximal Clique Centrality) algorithm (Figure 5B). The ranking revealed the top 50 hub MOS-DEGs, which were reanalyzed using STRING for functional enrichment and network clustering (Figure 5C,D). The resulting subnetwork demonstrated exceptionally high functional cohesion, comprising 369 observed interactions (expected = 39), an average node degree of 14.76, a local clustering coefficient of 0.698, and an enrichment p-value of 0.0. These metrics confirmed that the network's organization was not random but represented a tightly regulated molecular architecture integrating mitochondrial metabolism and oxidative stress signaling. The MCL clustering analysis revealed three highly connected functional modules: one corresponding to central energy metabolism, including key TCA cycle and oxidative phosphorylation enzymes (SDHA, IDH2, ACO2, DLAT, UQCRC1, COX10); a second representing the mitochondrial translation machinery (MRPL16, MRPL45, DAP3, LYRM7), which maintains mitochondrial protein synthesis and respiratory complex integrity; and a third involving amino acid catabolism and redox regulation (GLUD1, GLS2, PRODH, L2HGDH, ALDH18A1), responsible for replenishing TCA intermediates and sustaining the NADH/FADH₂ balance required for cellular redox control. Functional enrichment indicated a strong overrepresentation of mitochondrial localization (FDR = 3.39 × 10⁻50), carboxylic acid metabolism (FDR = 6.33 × 10⁻23), and cellular respiration (FDR = 9.18 × 10⁻14), indicating that these genes form an integrated bioenergetic-oxidative stress axis linking oxidative phosphorylation, antioxidant defense, and amino acid metabolism.
Based on network topology and functional enrichment, ten master regulatory MOS genes were prioritized for further characterization: PRDX3, MTHFD2, SDHA, NDUFV1, PDK1, ACLY, GLS2, GSR, DAP3, and LYRM7 (Figure 5E). STRING-based analysis of this curated subset revealed a highly cohesive interaction core comprising 13 direct associations, with an average node degree of 2.6, a clustering coefficient of 0.843, and a network enrichment p-value of 3.3 × 10⁻8. Nearly all proteins were localized to the mitochondrion (9 of 10; FDR = 2.27 × 10⁻7) or mitochondrial matrix (7 of 10; FDR = 2.27 × 10⁻7). The principal pathways represented by these proteins included dicarboxylic acid metabolism (SDHA, GLS2, MTHFD2, ACLY; FDR = 1.6 × 10⁻3), the pyruvate dehydrogenase complex (PDK1, ACLY; FDR = 3.9 × 10⁻3), and the TCA cycle (SDHA, ACLY; FDR = 1.68 × 10⁻2). In parallel, redox-related functions such as oxidoreductase activity (GSR, SDHA, PRDX3, MTHFD2, NDUFV1; FDR = 1 × 10⁻3) and flavoprotein motif enrichment (GSR, SDHA, NDUFV1; FDR = 3.9 × 10⁻3) were significantly enriched, confirming the network's role in ROS detoxification and mitochondrial redox regulation. The strongest interactions, including SDHA-NDUFV1 (score = 0.999), SDHA-ACLY (0.964), DAP3-LYRM7 (0.678), and GSR-PRDX3 (0.661), established two interlinked subnetworks: an energy-redox module (SDHA-NDUFV1-ACLY-GSR-PRDX3) that bridges oxidative phosphorylation with antioxidant capacity, and a translation-assembly module (DAP3-LYRM7) that integrates mitochondrial ribosome function with respiratory complex III biogenesis.
Together, these ten genes form a cohesive mitochondrial core that integrates energy metabolism, redox buffering, and biosynthetic adaptation. PRDX3 and GSR constitute the antioxidant arm that safeguards the mitochondrial environment, while MTHFD2 supports NADPH regeneration and one-carbon metabolism essential for proliferative survival. SDHA and NDUFV1 anchor the electron transport system, linking TCA flux to ATP production, whereas PDK1 and ACLY orchestrate metabolic rewiring that sustains anabolic growth under stress. DAP3 and LYRM7 represent structural and translational stabilizers that preserve mitochondrial integrity. This multi-tiered molecular framework defines a bioenergetic-redox regulatory core fundamental to HER2-targeted therapy resistance. The integrative analysis underscores how oxidative metabolism and redox adaptation are co-opted to preserve mitochondrial functionality and cellular viability in resistant breast cancer phenotypes, revealing potential targets for metabolic intervention and precision therapy (Table 1).

Figure 5. Network analysis and prioritization of mitochondrial oxidative stress-related genes. (A) STRING-based PPI network of the top 210 MOS-DEGs illustrating overall interaction density. (B) Import of the PPI network into Cytoscape for visualization and topological analysis. (C) Extraction of the top 50 hub MOS-DEGs using the CytoHubba MCC algorithm. (D) STRING reanalysis of the top 50 hub genes reveals three major functional modules: central energy metabolism, mitochondrial translation, and amino acid catabolism/redox regulation. (E) STRING-based visualization of the top 10 prioritized MOS-DEGs depicting a highly connected oxidative-metabolic network (average node degree = 2.6; clustering coefficient = 0.843; PPI enrichment p = 3.3 × 10⁻8). Please click here to view a larger version of this figure.
Table 1: Final prioritized gene panel (Top 10 MOS master regulators). Please click here to download this Table.
Functional enrichment analysis of top 10 MOS-DEGs
To elucidate the biological relevance and molecular mechanisms associated with the top 10 prioritized MOS-DEGs, Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analyses were performed. The GO enrichment was categorized into three major ontologies-Biological Process (BP), Cellular Component (CC), and Molecular Function (MF) to uncover their biological roles in oxidative metabolism and mitochondrial regulation.
GO Biological Process (BP) analysis revealed significant enrichment in metabolic and redox-related pathways, including dicarboxylic acid metabolic process (GO:0043648; enrichment score = 6.87; p = 1.33 × 10⁻7), acetyl-CoA biosynthetic process (GO:0006085; enrichment = 4.40), generation of precursor metabolites and energy (GO:0006091; enrichment = 4.06), and electron transport chain (GO:0022900; enrichment = 4.03). Other enriched biological processes included cell redox homeostasis and cellular response to oxidative stress, underscoring the tight link between oxidative phosphorylation, redox equilibrium, and metabolic reprogramming in tumor progression. GO Cellular Component (CC) enrichment indicated a predominant localization of these MOS-DEGs within mitochondrial structures, particularly the mitochondrial matrix (GO:0005759; p = 5.79 × 10⁻10), mitochondrial inner membrane, oxidoreductase complex, and respiratory chain complex.
The presence of genes such as MTHFD2, NDUFV1, PRDX3, GSR, and SDHA reflects their direct role in electron transport and oxidative phosphorylation systems, supporting their importance in mitochondrial energy conversion and redox signaling. GO Molecular Function (MF) analysis highlighted key catalytic and redox functions, including oxidoreductase activity, acting on sulfur group of donors, NAD(P) as acceptor (GO:0016668; enrichment = 4.68; p = 2.07 × 10⁻5), electron transfer activity (GO:0009055; enrichment = 4.43), antioxidant activity (GO:0016209; enrichment = 3.02), and flavin adenine dinucleotide (FAD) binding (GO:0050660; enrichment = 3.03). The consistent enrichment of PRDX3, GSR, MTHFD2, and SDHA. across multiple redox-related functions further supports their involvement in maintaining oxidative balance in mitochondrial compartments. Collectively, these GO findings highlight the central involvement of MOS-DEGs in mitochondrial oxidative phosphorylation, NAD(P)H-mediated redox cycling, and antioxidant defense mechanisms-processes essential for sustaining tumor metabolism and adaptation to oxidative stress (Figure 6A-D).

Figure 6. Gene Ontology enrichment analysis of top 10 MOS-DEGs. (A) Biological Process terms showing significant enrichment in oxidative metabolism and energy generation. (B) Cellular Component enrichment indicating mitochondrial localization and respiratory chain association. (C) Molecular Function enrichment highlighting oxidoreductase, electron transfer, and antioxidant activity. (D) Bar plot summarizing top GO terms across BP, CC, and MF categories based on enrichment scores and -log₁₀(p-values). Please click here to view a larger version of this figure.
KEGG pathway enrichment analysis
The KEGG pathway analysis provided a complementary systems-level perspective, revealing that the top 10 MOS-DEGs were significantly enriched in several oxidative metabolism and disease-associated pathways (Figure 7A-E). The most significantly enriched pathways included the Citrate cycle (TCA cycle) (hsa00020; p = 0.0002; enrichment score = 3.69), Diabetic cardiomyopathy (hsa05415; p = 0.0003), Central carbon metabolism in cancer (hsa05230; p = 0.0011), and Oxidative phosphorylation (hsa00190; p = 0.0042). Additionally, enrichment in pathways such as Non-alcoholic fatty liver disease (NAFLD), Chemical carcinogenesis-reactive oxygen species, Parkinson's disease, and Prion disease reflects the broad association of these genes with oxidative damage and mitochondrial dysfunction, common hallmarks of cancer and neurodegeneration. Among the genes driving these enrichments, SDHA and NDUFV1 were consistently represented across oxidative phosphorylation, ROS detoxification, and mitochondrial complex I/II-linked pathways. MTHFD2 and GLS2 contributed to one-carbon metabolism and glutamine-derived anaplerotic flux in the TCA cycle, while GSR and PRDX3. participated in glutathione redox cycling. These findings demonstrate that metabolic-oxidative crosstalk, mediated by these MOS-DEGs, underpins both oncogenic energy reprogramming and mitochondrial resilience to stress.

Figure 7. KEGG pathway enrichment analysis of MOS-DEGs. (A) Sankey diagram linking MOS-DEGs to their enriched KEGG pathways, showing overlaps in oxidative phosphorylation, TCA cycle, and metabolic reprogramming. (B-E) Network and dot plots illustrating gene-pathway relationships and enrichment levels across metabolic and oxidative stress-related signaling pathways. Highlighted pathways include the TCA cycle, oxidative phosphorylation, and neurodegeneration-related oxidative damage, confirming mitochondrial oxidative stress regulation as the core functional axis of these genes. Please click here to view a larger version of this figure.
ROC-based clinical validation of MOS in breast cancer
Receiver operating characteristic (ROC) analysis was performed to evaluate the ability of the prioritized mitochondrial oxidative stress-related genes to discriminate between chemotherapy responders and non-responders. As shown in Figure 8A-J, all evaluated genes demonstrated limited discriminative performance, with AUC values ranging approximately from 0.50 to 0.66.
Specifically, PRDX3, MTHFD2, SDHA, and ACLY showed relatively higher but still modest discriminatory ability (AUC ~0.57-0.66), whereas genes such as GSR, GLS2, and several electron transport-related genes exhibited AUC values close to 0.50, indicating near-random classification performance. Among the panel, ACLY demonstrated the highest AUC (~0.65), followed by SDHA (~0.59) and MTHFD2 (~0.58), suggesting only weak separation between responder and non-responder groups.
While significant differences were observed for several genes, the effect sizes were small, and the ROC performance indicates limited predictive accuracy when considered individually. These findings suggest that mitochondrial oxidative stress-related genes alone are insufficient as standalone predictive biomarkers for chemotherapy response. Instead, they may contribute incremental information within a broader multi-gene or multi-omics predictive framework. Consequently, these results should be interpreted as exploratory and hypothesis-generating rather than indicative of immediate clinical utility (Figure 8).

Figure 8. ROCplotter-based validation of mitochondrial oxidative stress-related master regulators in breast cancer. (A-H) Receiver operating characteristic analyses of the 10 evaluated MOS master regulators: (A) PRDX3, (B) MTHFD2, (C) SDHA, (D) NDUFV1, (E) PDK1, (F) ACLY, (G) GLS2, (H) GSR, (I) LYRM7, and (J) DAP3. Each panel shows the AUC value and corresponding statistical metrics (ROC p-value and Mann-Whitney p.-value) indicating the discriminative ability of each gene to differentiate between chemotherapy responders and non-responders. Elevated AUC values (>0.55) indicate weak to modest predictive associations between gene expression and therapeutic response. The results highlight ACLY and PDK1 as the moderate predictive contributions, followed by PRDX3, MTHFD2, and SDHA, while GSR and GLS2 exhibit moderate predictive contributions to mitochondrial redox homeostasis during chemotherapy adaptation. Please click here to view a larger version of this figure.
Differential expression profiling of top MOS-DEGs
To investigate the transcriptional behavior of MOS-DEGs across breast cancer progression, comparative expression analysis was conducted, which integrates both RNA-Seq and gene chip data derived from The Cancer Genome Atlas (TCGA), Genotype-Tissue Expression (GTEx), and MET500 datasets. Eight key genes were analyzed across normal, tumor, and metastatic breast tissues to evaluate consistent transcriptional trends and identify those most relevant to tumorigenesis and metastasis.
A stringent selection criterion was applied, defining significant upregulation as fold-change (FC) ≥ 1.3 for Tumor vs Normal (TvsN) and FC ≥ 1.5 for Metastatic vs Normal (MvsN) across both platforms. Genes meeting these thresholds in a consistent direction were considered biologically relevant and prioritized for functional interpretation. As shown in Figure 9A,B, the RNA-Seq-based expression analysis revealed that GSR and MTHFD2 were markedly upregulated in both tumor and metastatic tissues compared to normal controls. GSR demonstrated the most prominent increase (TvsN = 2.73; MvsN = 4.83), while MTHFD2 exhibited substantial and consistent induction (TvsN = 1.88; MvsN = 1.96). These results were validated in the gene chip dataset (Figure 9C,D), where GSR again showed robust expression elevation (TvsN = 1.76; MvsN = 2.19) and MTHFD2 maintained high transcriptional activation (TvsN = 2.03; MvsN = 2.55). The strong and reproducible upregulation of these two genes across independent platforms underscores their central roles in mitochondrial redox regulation-GSR as the primary enzyme maintaining glutathione equilibrium and MTHFD2 as a metabolic catalyst for NADPH production and one-carbon metabolism.
Moderate yet consistent elevation was observed for PRDX3, which fulfilled the RNA-Seq inclusion threshold (TvsN = 1.39; MvsN = 1.58) and showed stable expression in the gene chip dataset (TvsN = 1.02; MvsN = 1.12). Functionally, PRDX3 supports mitochondrial ROS detoxification and complements GSR in sustaining oxidative balance. ACLY demonstrated near-threshold upregulation (RNA-Seq: TvsN = 1.24; MvsN = 1.28; Gene Chip: TvsN = 1.11; MvsN = 0.89), reflecting its metabolic coupling role between oxidative phosphorylation and anabolic lipid biosynthesis. In contrast, PDK1, NDUFV1, and SDHA displayed minimal or inconsistent fold-change variations, suggesting stable mitochondrial baseline activity with limited transcriptional responsiveness to oncogenic stress. GLS2, on the other hand, exhibited a consistent and strong downregulation pattern across both platforms (RNA-Seq: TvsN = 0.32; MvsN = 0.45; Gene Chip: TvsN = 0.66; MvsN = 0.38), confirming its potential role as a metabolic suppressor whose loss may facilitate tumor progression through impaired glutamine-mediated redox homeostasis (Table 2).
Applying the established thresholds allowed the prioritization of four genes GSR, MTHFD2, PRDX3, and ACLY as the most consistent and biologically relevant regulators of mitochondrial oxidative stress adaptation in breast cancer. Among these, GSR and MTHFD2 emerged as the top-tier master regulators, exhibiting high expression reproducibility and functional importance in redox defense and metabolic reprogramming. PRDX3 and ACLY. formed the secondary regulatory layer, bridging antioxidant protection with biosynthetic adaptation. Together, these genes delineate a hierarchical mitochondrial redox-metabolic regulatory axis, essential for sustaining cancer cell survival, oxidative resilience, and metabolic flexibility during tumor progression and metastasis (Figure 9A-D).
Table 2: Cross-platform expression summary and threshold-based prioritization of MOS master regulators in breast cancer. Please click here to download this Table.

Figure 9. Comparative expression profiling of MOS-DEGs in breast cancer. (A,B) RNA-Seq-based expression analysis of eight key MOS genes SDHA, PRDX3, PDK1, NDUFV1, MTHFD2, GSR, GLS2, and ACLY across normal (green), tumor (red), and metastatic (gray) breast tissues. Boxplots (A) display the log₂-transformed gene expression levels, while density plots (B) illustrate the probability distributions across sample categories. (C,D) The corresponding gene chip-based expression data, confirming consistent expression trends observed in RNA-Seq analysis. MTHFD2 and GSR exhibit strong tumor-specific upregulation, PRDX3 shows moderate elevation, PDK1 and ACLY reflect metabolic activation, and GLS2. demonstrates consistent suppression, highlighting the coordinated redox-metabolic reprogramming of mitochondrial pathways during breast cancer progression. Please click here to view a larger version of this figure.
Prognostic evaluation of prioritized MOS-DEGs
Because survival analysis was performed in a broader breast cancer cohort without restriction to HER2 status, these prognostic findings should be interpreted as supportive rather than HER2-specific. To evaluate the prognostic significance of the identified MOS-DEGs Kaplan-Meier survival curves were generated to assess the association between MOS gene expression and recurrence-free survival (RFS) in breast cancer patients (Figure 10A-D). The analysis included 4,929 breast cancer patients, assessing recurrence-free survival (RFS) based on the median expression levels of four selected MOS-DEGs: MTHFD2 (201761_at), PRDX3 (209766_at), GSR (225609_at), and ACLY (201128_s_at).
As illustrated in Figure 10A-D, distinct prognostic patterns were observed among these MOS-DEGs. MTHFD2 exhibited a strong negative prognostic association (HR = 1.53; 95% CI: 1.39-1.70; log-rank p = 1.1×10⁻16), where elevated expression levels were significantly correlated with shorter RFS, underscoring its role as an oncogenic metabolic regulator involved in one-carbon metabolism and redox adaptation (Figure 10B). Conversely, PRDX3 (209766_at) demonstrated a significant protective effect (HR = 0.73; 95% CI: 0.66-0.81; log-rank p = 7.7×10⁻10), where higher expression was linked to prolonged RFS, emphasizing its function as a mitochondrial antioxidant enzyme mitigating oxidative stress (Figure 10C). In contrast, GSR (HR = 1.05; p = 0.50) and ACLY (HR = 1.02; p = 0.77.) showed no significant association with recurrence-free survival (Figure 10A,D), suggesting a secondary, supportive role in maintaining redox equilibrium and metabolic homeostasis rather than directly influencing clinical outcomes. Overall, these findings identify MTHFD2 as a key poor-prognostic MOS-DEG driving metabolic proliferation and redox imbalance, while PRDX3 emerges as a protective MOS-DEG contributing to redox stability and a favorable prognosis. Together, these two genes represent opposing molecular forces within the mitochondrial oxidative stress framework-MTHFD2 promoting metabolic aggressiveness and PRDX3 reinforcing antioxidant defense mechanisms in breast cancer (Figure 10A-D).

Figure 10. Kaplan-Meier survival analysis of MOS-DEGs in breast cancer. (A) GSR (225609_at) shows no significant correlation with recurrence-free survival (HR = 1.05; p = 0.50). (B) MTHFD2 (201761_at) demonstrates a strong association with poor prognosis, with high expression predicting reduced RFS (HR = 1.53; p = 1.1×10⁻16). (C) PRDX3 (209766_at) indicates a protective association, where higher expression corresponds to longer RFS (HR = 0.73; p = 7.7×10⁻10). (D) ACLY (201128_s_at) exhibits no significant prognostic effect (HR = 1.02; p = 0.77.). High-expression groups are shown in red, while low-expression groups are shown in black. Cross marks represent censored samples. Please click here to view a larger version of this figure.
nsSNP characterization and prioritization of MTHFD2 and PRDX3
The Ensembl Variant Effect Predictor (VEP) analysis of the canonical transcript MTHFD2-201 (ENST00000394053.7) revealed a total of 40 non-synonymous variants. Among these, five variants-rs1471336772, rs2103841302, rs1428567735, rs1694270227, and rs1694375712 displayed strong evidence of deleterious functional impact across multiple predictive tools. These substitutions were consistently predicted to affect protein structure and function across multiple computational tools, including SIFT ≤ 0.05, PolyPhen-2 ≥ 0.85, CADD values ranging from 29 to 36, MetaLR ranging from 0.73 to 0.86, Mutation Assessor > 3.5, and REVEL scores approaching 0.99. The most deleterious variant, rs1471336772 (T>G), exhibited a CADD score of 33, PolyPhen-2 score of 0.99, and REVEL score of 0.997, suggesting potential structural destabilization and possible impairment of the enzyme's NADPH-dependent catalytic activity (Supplemental Table S4). Most high-scoring variants were located within the NADP⁺-binding and methylenetetrahydrofolate dehydrogenase catalytic domains, which are critical for redox regulation and one-carbon metabolism. These findings suggest that such substitutions could disrupt metabolic flux and redox balance, and may be associated with oncogenic reprogramming in breast cancer.
For PRDX3-201 (ENST00000298510.4), 28 non-synonymous SNPs were identified, of which five variants-rs747786383, rs2133647474, rs761292643, rs377652956, and rs1847844616-demonstrated consistently damaging profiles. These variants met multiple deleterious thresholds: SIFT ≤ 0.05, PolyPhen-2 ≥ 0.85, CADD scores between 28 and 32, MetaLR ranging from 0.43 to 0.99, and REVEL ≥ 0.9. The highest-impact variant, rs747786383 (A>G), presented with CADD = 28, MetaLR = 0.993, and REVEL = 0.97, indicating a strong pathogenic potential (Supplemental Table S5). This mutation occurs within the thioredoxin-dependent peroxidase domain, a region essential for hydrogen peroxide reduction and maintenance of mitochondrial oxidative stress homeostasis. Alterations in this domain may hinder peroxidase catalytic efficiency, compromise antioxidant defenses and increasing susceptibility to oxidative damage. A comparative summary of both genes indicated that MTHFD2 and PRDX3 harbor distinct but complementary functional vulnerabilities in mitochondrial oxidative metabolism. MTHFD2 variants primarily affect enzymatic redox cycling and NADPH regeneration, promoting metabolic aggressiveness, whereas PRDX3 variants compromise peroxidase-mediated detoxification, reducing antioxidant resilience. Among all mutations, rs1471336772 in MTHFD2 and rs747786383 in PRDX3 were identified as the most pathogenic nsSNPs based on their cumulative high scores across CADD, REVEL, and Mutation Assessor predictions. Figure 11 provides a graphical overview of the comparative pathogenicity profiles for the two proteins, highlighting the consistently high scores across SIFT, PolyPhen-2, CADD, MetaLR, REVEL, and Mutation Assessor. Both genes exhibited strong deleterious potential, with MTHFD2 (CADD = 33, REVEL = 0.997) indicating maximal structural disruption and PRDX3 (MetaLR = 0.99, REVEL = 0.97) displaying comparable functional impact on antioxidant domain integrity (Figure 11). Together, these findings reveal that MTHFD2 and PRDX3 harbor highly damaging nsSNPs that could critically impair mitochondrial redox homeostasis and energy metabolism, supporting their potential relevance as oxidative stress-related genes and candidate therapeutic targets in breast cancer pathogenesis.
A systematic re-evaluation of the nsSNP annotation framework was performed to ensure consistency between quantitative pathogenicity scores and categorical classifications across all reported variants. In particular, discrepancies between CADD scores and their qualitative labels were corrected by applying standardized interpretation thresholds (CADD ≥20 as deleterious and ≥30 as highly deleterious), ensuring that all prioritized variants reflect true high-impact substitutions. Furthermore, variant identifiers were harmonized with those reported in the primary results to maintain consistency between the tabulated data and the narrative interpretation. This refinement strengthens the study's analytical rigor and ensures that the reported nsSNPs constitute a coherent, high-confidence set of functionally relevant variants suitable for downstream structural and translational investigations (Table 3).

Figure 11. Comparative nsSNP pathogenicity scores for MTHFD2 and PRDX3. Bar plot showing the aggregated predictive scores of deleteriousness for MTHFD2 (ENST00000394053.7) and PRDX3. (ENST00000298510.4) across six computational tools: SIFT, PolyPhen-2, CADD, MetaLR, REVEL, and Mutation Assessor. Higher scores correspond to increased probability of functional disruption, indicating that both genes possess highly damaging nsSNPs that may compromise mitochondrial redox and metabolic integrity. Please click here to view a larger version of this figure.
Table 3: Top predicted deleterious nsSNPs in MTHFD2 and PRDX3. PolyPhen Class was probably damaging for all nsSNPs. Please click here to download this Table.
Secondary structure and 2D modeling analysis of MTHFD2 and PRDX3 proteins
To further interpret the structural and positional context of deleterious nsSNPs, 2D secondary structure modeling of MTHFD2 (MTDC_HUMAN) and PRDX3 (PRDX3_HUMAN) proteins was performed using PSIPRED and NetSurfP 2.0. These analyses provided detailed insights into the secondary structure elements, residue disorder propensity, and relative solvent accessibility (RSA) of each protein, highlighting the effect of pathogenic mutations on local folding and solvent exposure.
In the MTHFD2 protein (Figure 12A), the 2D model revealed an alternating pattern of α-helices (orange) and β-strands (purple) forming the compact core of the enzyme's catalytic and NADP⁺-binding regions. The deleterious residues identified from nsSNP screening, P208R, A247E, D248G, G291V, and G313D, were predominantly located in or near structured regions (α-helices and β-sheets) with moderate-to-high solvent accessibility. These residues were mapped within well-ordered, low-disorder segments, indicating structural rigidity essential for maintaining enzymatic stability and cofactor binding. Notably, A247E and D248G, positioned within a short α-helical turn adjacent to the active site, displayed high relative surface accessibility (RSA ≥ 0.25), suggesting that these substitutions could directly affect catalytic surface interactions. Similarly, G291V and G313D, located within the β-sheet region near the NADP⁺ pocket, exhibited local distortion potential due to loss of flexibility and increased side-chain volume, as reflected in the thicker disorder lines in these regions. The P208R mutation occurred within a helical-coil transition zone, a region that typically requires precise backbone rigidity, implying that replacement of proline with arginine may introduce local instability and alter folding dynamics. Collectively, these data indicate that the deleterious residues in MTHFD2 reside in structurally constrained, solvent-exposed sites, where even minor substitutions can compromise catalytic geometry and redox coupling.
In the PRDX3 protein (Figure 12B), the secondary structure map exhibited a conserved thioredoxin fold composed of alternating α-helices and β-strands connected by flexible coil loops. The critical mutations P70R, P101R, A218V, and P230S were found in solvent-accessible loop regions or adjacent to secondary structural elements, reflecting their potential influence on redox-active cysteine accessibility and dimer stabilization. The P70R and P101R variants occurred near disordered and solvent-exposed coils, which may disrupt the dynamic flexibility required for peroxidase catalysis and thioredoxin interaction. On the other hand, A218V and P230S mapped within helical loop transitions, regions vital for conformational switching during the catalytic cycle. These residues demonstrated increased exposure and reduced disorder, suggesting that mutations at these positions could rigidify local structural elements, limiting redox conformational adaptability. Overall, the 2D topology indicated that PRDX3 deleterious variants primarily alter loop-to-helix stability, solvent exposure, and local disorder, consistent with the predicted impairment of catalytic and antioxidant functions. The combined analysis of MTHFD2 and PRDX3 2D models revealed that pathogenic nsSNPs cluster in regions balancing rigidity and flexibility, where perturbations in hydrogen bonding or polarity can have amplified effects on protein function. The affected residues occupy structurally conserved, functionally exposed sites, confirming their potential to alter enzymatic conformation and mitochondrial redox regulation.

Figure 12. Secondary structure and relative surface accessibility of deleterious nsSNP sites. (A) The 2D topology of MTHFD2 (MTDC_HUMAN) showing predicted α-helices (orange), β-strands (purple), and disordered regions (black line thickness). Red vertical boxes mark deleterious residues (P208R, A247E, D248G, G291V, and G313D), which localize within structured and moderately solvent-exposed regions near the catalytic site. (B) The PRDX3 (PRDX3_HUMAN) 2D secondary structure representation highlighting α-helices, β-sheets, and coil regions. Deleterious residues (P70R, P101R, A218V, and P230S) are boxed in red, primarily occupying solvent-accessible loops and helical interfaces critical for redox activity. Relative surface accessibility is indicated in red (exposed) and black (buried), while disorder propensity is depicted by line thickness. Together, the models reveal that disease-associated nsSNPs are positioned in functionally sensitive and structurally constrained domains essential for enzymatic activity and mitochondrial redox balance. Please click here to view a larger version of this figure.
Structural and functional impact of deleterious nsSNPs predicted
To further validate the structural and mechanistic implications of the prioritized nsSNPs identified in MTHFD2 and PRDX3, molecular dynamics-based computational analysis was performed using DynaMut and MutPred2 servers. These tools provided complementary insights into mutation-induced alterations in protein flexibility, stability, solvent accessibility, and catalytic site conformation.
For MTHFD2, five high-confidence substitutions (A247E, D248G, P208R, G291V, and G313D) exhibited MutPred2 scores ≥ 0.90, signifying strong pathogenic potential. Functional motif mapping revealed that these variants localized within or adjacent to PROSITE motifs (PS00006, PS00008, PS00767) and ELM functional regions (ELME000233, ELME000335) associated with NADP⁺ binding and catalytic function. The mutations A247E and D248G, situated near the enzyme's catalytic pocket, demonstrated profound disruption of metal-binding sites (p = 3.0×10⁻4-4.4×10⁻5) and altered ordered interface interactions, which are essential for enzyme-substrate recognition and cofactor stability. These substitutions were predicted to gain new allosteric and catalytic sites (at D248 and I251) while simultaneously losing native catalytic functionality, implying a destabilizing conformational shift. The G291V and G313D variants displayed gain of allosteric sites (F295, G313) and altered transmembrane propensity, indicating possible misfolding and subcellular localization defects. Importantly, both variants were associated with reduced methylation and increased loop flexibility, potentially affecting the enzyme's structural integrity. Similarly, P208R (MutPred2 = 0.916) resulted in altered DNA-binding affinity and gain of a helical structure, suggesting disruption of inter-domain stability and electrostatic coordination within the NADP⁺-binding groove. Collectively, the MTHFD2 nsSNPs converge on mechanisms involving metal-binding perturbation, allosteric regulation gain, and catalytic site disruption, suggesting that they may impair enzymatic activity and redox regulation and metabolic roles critical for cancer cell proliferation and resistance.
For PRDX3, four major substitutions P230S, A218V, P101R, and P70R were predicted to be functionally damaging (MutPred2 ≥ 0.80). These variants mapped within conserved motifs (ELME000053, ELME000085, ELME000146, ELME000137) corresponding to the thioredoxin-dependent peroxidase domain. The P230S mutation (MutPred2 = 0.836) was associated with altered metal binding (p = 1.2 × 10⁻3), gain of disulfide linkage at C229, and loss of an allosteric site at W233, suggesting interference with redox-active cysteine residues crucial for peroxidase catalysis. The A218V variant led to loss of solvent accessibility and gain of an allosteric site at R214, indicative of reduced catalytic exposure and potential redox inactivation. Moreover, P101R and P70R exhibited high MutPred2 scores (0.924-0.945) and introduced structural perturbations in the N-terminal domain. These mutations caused altered transmembrane potential, gain of β-strand conformations, and loss of post-translational modification sites (sulfation and pyrrolidone formation), collectively suggesting impaired folding and redox stability. Overall, PRDX3 variants displayed a consistent pattern of altered metal coordination, increased disorder, and gain/loss of catalytic sites, which could compromise peroxidase efficiency and weaken mitochondrial antioxidant defense mechanisms. These structural perturbations, combined with the deleterious predictions from VEP, CADD, and REVEL analyses, indicate that both MTHFD2 and PRDX3 harbor high-impact nsSNPs that can destabilize protein conformation, disrupt catalytic architecture, and alter molecular interactions essential for maintaining redox equilibrium. The convergence of these damaging mutations on metal-binding and catalytic residues underscores their potential pathogenic role in compromising mitochondrial oxidative stress regulation in breast cancer. It is important to note that these findings are based on computational predictions and require experimental validation to confirm their biological and functional relevance (Table 4).
Table 4: Predicted functional and structural consequences of deleterious nsSNPs in MTHFD2 and PRDX3 based on DynaMut and MutPred2 analyses. Please click here to download this Table.
Structural visualization and atomic-level interpretation of deleterious nsSNPs in MTHFD2 and PRDX3
To elucidate the atomic-scale structural consequences of the most deleterious non-synonymous variants, molecular visualization and 3D structural modeling were performed for MTHFD2 and PRDX3 using PyMOL and DynaMut simulations. These analyses aimed to identify conformational perturbations, residue interaction shifts, and catalytic site rearrangements resulting from high-impact substitutions. For PRDX3, the predicted deleterious mutations P230S, A218V, P101R, and P70R were modeled within the thioredoxin-dependent peroxidase domain. The wild-type protein exhibited a compact α/β-fold characterized by an extensive hydrogen bonding network stabilizing the catalytic cysteine residues. However, the introduction of mutations induced noticeable structural deviations localized near the catalytic and substrate recognition pockets.
Specifically, the P230S substitution (Figure 13A,B) replaced a rigid, nonpolar proline with a polar serine residue. This substitution increased side-chain flexibility, introducing a new hydrogen bonding potential while disrupting the native loop rigidity required for peroxidase domain stability. The result was an increased surface solvent accessibility and partial unfolding near the redox-active site. Similarly, A218V (Figure 13B) substituted a small alanine residue with a bulkier valine, creating steric hindrance in the hydrophobic core and potentially disturbing cofactor accommodation. The P101R and P70R mutations (Figure 13C,D) replaced proline with arginine residues, introducing positively charged side chains in regions originally dominated by nonpolar contacts. These alterations are predicted to introduce electrostatic changes and potential loop distortions near the N-terminal domain, which likely interfere with thioredoxin docking and dimerization stability. Collectively, these structural predictions suggest that PRDX3 variants may exhibit altered redox interface geometry, increased disorder, and reduced peroxidase efficiency, compromising the enzyme's ability to neutralize hydrogen peroxide and maintain mitochondrial oxidative balance.
For MTHFD2, the modeled variants A247E, D248G, P208R, G291V, and G313D were distributed across the NADP⁺-binding and catalytic domains, which are critical for one-carbon metabolism and NADPH regeneration. The wild-type structure revealed a tightly packed catalytic pocket, coordinated by divalent metal ions and hydrogen-bonding interactions, essential for folate substrate binding. In contrast, all modeled substitutions introduced charge, polarity, or size imbalances that destabilized these regions. The A247E mutation (Figure 14B) introduced a negatively charged glutamic acid residue, disrupting metal-binding coordination and electrostatic balance within the catalytic cavity. Similarly, D248G (Figure 14F) resulted in the loss of an aspartic acid carboxyl group, abolishing a key catalytic contact and destabilizing the local secondary structure. The G291V substitution (Figure 14A) introduced a bulkier valine residue, causing steric clashes and partial distortion of the loop adjacent to the NADP⁺ pocket.
The G313D variant (Figure 14C) introduced a negatively charged residue in a region typically occupied by a small glycine, resulting in altered hydrogen bonding patterns and reduced flexibility at the catalytic interface. Finally, P208R (Figure 14E) replaced a rigid proline with a positively charged arginine, potentially disturbing interdomain interactions and increasing disorder in the protein's central helical structure. These combined alterations are predicted to significantly reduce enzyme stability, impair metal ion coordination, and decrease catalytic efficiency. Collectively, both MTHFD2 and PRDX3 mutants show localized destabilization within functionally conserved catalytic and redox-active sites. MTHFD2 variants predominantly affect NADPH-binding and catalytic residues, impairing metabolic redox cycling, while PRDX3 mutations interfere with thioredoxin-binding and disulfide formation, weakening mitochondrial antioxidant capacity. These findings underscore that deleterious nsSNPs contribute to the molecular destabilization of mitochondrial redox networks, potentially driving tumor metabolic reprogramming and therapy resistance.

Figure 13. Structural modeling of deleterious nsSNPs in PRDX3 (ENST00000298510.4). (A) Ribbon diagram of the PRDX3 tertiary structure highlighting mutation sites (green) within the thioredoxin-dependent peroxidase domain. (B) Substitution of proline with serine at position 230 and alanine with valine at position 218 induces conformational relaxation and local unfolding in the catalytic loop. (C,D) Replacement of proline with arginine at positions 101 and 70 introduces positive charges and disrupts N-terminal helical packing. Each mutation schematic shows the chemical transition of the affected residues and the altered side-chain orientation within the 3D structure. The predicted effects include increased solvent exposure, hydrogen-bonding rearrangement, and reduced peroxidase stability.Please click here to view a larger version of this figure.

Figure 14. Structural visualization of deleterious nsSNPs in MTHFD2 (ENST00000394053.7). (A) The MTHFD2 monomer structure illustrating high-impact nsSNP positions
Supplemental Table S1. List of genes along with their corresponding normalized expression values and associated sample information derived from the GSE231524 dataset, used for downstream analysis. Please click here to download this File.
Supplemental Table S2. List of genes along with their corresponding normalized expression values and associated sample information derived from the GSE231525 dataset, used for downstream analysis. Please click here to download this File.
Supplemental Table S3. List of differentially expressed genes (DEGs) identified from the 3-MOS-DEGs, including gene names, fold change values, and statistical significance used for downstream interpretation. Please click here to download this File.
Supplemental Table S4. Variants associated with the MTHFD2-201 transcript, including variant type, genomic position, and related annotation information used for downstream analysis. Please click here to download this File.
Supplemental Table S5. Variants associated with the PRDX3-201 transcript, including variant type, genomic position, and related annotation information used for downstream analysis. Please click here to download this File.