Method Article

Lymphocyte Trajectory Modeling for Mortality Risk Prediction in Pneumonia-Associated ARDS

DOI: 10.3791/69338 · November 21st, 2025

Summary

This study addresses a significant gap in prognostic models for pneumonia-associated acute respiratory distress syndrome by introducing a novel method that leverages dynamic lymphocyte trajectories, rather than single-point measurements, for mortality risk stratification.

Abstract

Pneumonia-associated acute respiratory distress syndrome (ARDS) is characterized by high mortality, yet current prognostic models often rely on static biomarkers that fail to capture dynamic immune responses. This protocol introduces a reproducible computational framework that employs Group-Based Trajectory Modeling (GBTM) to identify distinct lymphocyte count trajectories and assess their prognostic value for mortality risk in ARDS patients with pneumonia. Using data extracted from the MIMIC-IV v2.2 database, the protocol details each step from data curation and preprocessing to trajectory construction and model validation. The approach includes subgroup identification through GBTM, followed by multivariable logistic and Cox regression analyses to quantify associations between trajectory patterns and 28-day mortality, adjusting for key clinical covariates such as APS III score, ICU stay, and heart rate. Model performance is comprehensively evaluated using ROC curves, calibration plots, and decision curve analysis, ensuring both statistical robustness and clinical interpretability. By leveraging longitudinal immune data rather than single-timepoint measurements, this workflow provides clinicians with a methodologically transparent, data-driven strategy to improve risk stratification and explore immune heterogeneity in critical illness. The protocol is fully reproducible, adaptable to other longitudinal biomarkers, and designed for visualization and instructional demonstration, making it an accessible tool for researchers seeking to integrate temporal biomarker modeling into critical care prognostics.

Introduction

Acute respiratory distress syndrome (ARDS) is a common and clinically complex acute pulmonary inflammatory syndrome with a mortality rate of up to 40%1. When ARDS occurs together with pneumonia, it is associated with high mortality in critically ill patients2. Recent clinical studies and meta-analyses of ARDS have reported inconsistent conclusions in many respects, which may have a variety of causes3. Lymphocytes are essential to the immune response during pneumonia, and lymphopenia has been linked to increased vulnerability to secondary infections, greater sepsis severity, and higher mortality4. For instance, a lymphocyte count below 0.5 × 10^9 cells/L on admission has been associated with poor prognosis5. However, previous research has not provided convincing evidence on how dynamic changes in lymphocyte counts relate to prognosis.

The MIMIC-IV database provides real-world clinical data with broad patient coverage, facilitating detailed subgroup analyses where longitudinal lymphocyte counts are accessible. To model these temporal dynamics, we employed Group-Based Trajectory Modeling (GBTM), a semi-parametric finite-mixture approach that identifies latent subgroups following distinct longitudinal trends, rather than imposing a single average trajectory6. Compared to other longitudinal or time-series clustering methods, GBTM offers specific advantages for clinical biomarker analysis: it accommodates unbalanced or missing observations common in electronic health records, directly models the probability of subgroup membership, and produces readily interpretable trajectory profiles ideal for risk stratification.

Therefore, this study presents a protocol for applying GBTM to identify lymphocyte trajectories and evaluate their association with 28-day mortality in ARDS patients with pneumonia. The methodology is applicable to longitudinal laboratory data where at least two measurements per patient are available within the first 7 days after ARDS diagnosis. This approach provides a reproducible framework for leveraging dynamic, routinely collected biomarkers to improve prognostic assessment and explore immune heterogeneity in critical illness. 

The workflow has two stages. First, clinical and survival data for patients with ARDS combined with pneumonia were obtained from the MIMIC-IV database. Then, R software (version 4.4.1) was used to model the trajectories of lymphocyte change and to analyze the relationship between lymphocyte trajectories and 28-day mortality.

Protocol

This study did not require ethical approval or consent to participate because it used only the de-identified MIMIC-IV database, whose use has been approved by the Institutional Review Boards (IRBs) of Beth Israel Deaconess Medical Center and the Massachusetts Institute of Technology. The requirement for individual patient consent was waived. Researchers must complete a recognized course in human research ethics (CITI number: 64579441, Langqing Xu) and request formal data access through the PhysioNet platform.

1. Study population and data extraction

  1. Obtain data for the retrospective study from the Medical Information Mart for Intensive Care (MIMIC)-IV database, maintained by the Computational Physiology Laboratory at Massachusetts Institute of Technology. The data encompass clinical information on 58,000 patients who were admitted to the ICU of Beth Israel Deaconess Medical Center (Boston, Massachusetts, USA) between 2008 and 2019.
  2. Select data of adult patients (≥ 18 years) with a diagnosis of both acute respiratory distress syndrome (ARDS) and pneumonia. Base the diagnosis on the International Classification of Diseases, 9th and 10th editions (ICD-9, ICD-10)7, with ARDS ranked among the top three diagnoses.
  3. Exclude patients who meet any of the following criteria: age < 18 years; previous use of immunosuppressants; tumors, hematological diseases, or rheumatic diseases; a history of transplantation.
  4. Use Structured Query Language (SQL) within the database management tool Navicat Premium to extract data for the final cohort, including demographics, clinical scores at admission (APS III, SOFA, Charlson Comorbidity Index), vital signs at admission, laboratory indicators, 28-day mortality status, and hospital and ICU length of stay. Include all recorded absolute lymphocyte counts from the first 7 days following ICU admission. The SQL code is provided in Supplementary File 1.
  5. Include a patient only if at least two lymphocyte measurements are available within this 7-day period. Click on Output, name the data, and click on Execute to output and save the data.
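The eligibility rule in step 1.5 can be sketched outside the database as follows. This is a minimal illustration in Python, not the SQL from Supplementary File 1; the function name, the tuple layout, and the toy patient IDs are hypothetical, and it assumes the 7-day window starts at ICU admission.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def eligible_patients(rows, icu_admit, window_days=7, min_measurements=2):
    """Return sorted IDs of patients with enough lymphocyte counts in the window.

    rows: iterable of (patient_id, charttime, lymphocyte_count) tuples.
    icu_admit: dict mapping patient_id to ICU admission datetime.
    """
    counts = defaultdict(int)
    for patient_id, charttime, value in rows:
        # Skip missing values and patients without a known admission time.
        if value is None or patient_id not in icu_admit:
            continue
        start = icu_admit[patient_id]
        # Count only measurements inside [admission, admission + window).
        if start <= charttime < start + timedelta(days=window_days):
            counts[patient_id] += 1
    return sorted(pid for pid, n in counts.items() if n >= min_measurements)


# Toy data (hypothetical, for illustration only).
admit = {
    "A": datetime(2019, 1, 1, 8, 0),
    "B": datetime(2019, 3, 5, 12, 0),
    "C": datetime(2019, 6, 1, 0, 0),
}
rows = [
    ("A", datetime(2019, 1, 1, 10, 0), 0.6),
    ("A", datetime(2019, 1, 4, 6, 0), 0.9),
    ("B", datetime(2019, 3, 6, 6, 0), 1.2),
    ("B", datetime(2019, 3, 14, 6, 0), 1.4),  # day 9: outside the window
    ("C", datetime(2019, 6, 2, 0, 0), None),  # missing value
    ("C", datetime(2019, 6, 3, 0, 0), 0.8),
]
eligible = eligible_patients(rows, admit)
print(eligible)
```

In this toy example only patient "A" has two valid measurements inside the window, so only "A" would enter trajectory modeling.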

2. Group-based trajectory modeling (GBTM) 

  1. To identify distinct immune trajectories, perform group-based trajectory modeling (GBTM) on longitudinal lymphocyte counts using R software (v4.4.1+) with the gbmt package. 
  2. Structure data in long format, and fit models by varying group numbers (1-5) and polynomial orders. Select the optimal model based on statistical fit (AIC/BIC), clinical interpretability of patterns, and group stability (each group >5% of the cohort, average posterior probability >0.7).
  3. Visualize the final trajectories to illustrate subgroup dynamics over the 7-day period. See Supplementary File 1 for the R code for GBTM modeling.
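The selection criteria in step 2.2 can be made concrete with a short sketch. The Python below is illustrative only and is not the R code from Supplementary File 1: the function names are hypothetical, the posterior probabilities are toy values, and using the number of observations as the sample size in BIC is an assumption (some GBTM software uses the number of subjects instead).

```python
import math


def information_criteria(log_likelihood, n_params, n_obs):
    """Standard AIC and BIC from a fitted model's log-likelihood."""
    aic = 2 * n_params - 2 * log_likelihood
    bic = n_params * math.log(n_obs) - 2 * log_likelihood
    return aic, bic


def group_diagnostics(posteriors):
    """Share of cohort and average posterior probability (AvePP) per group.

    posteriors: one list per patient with the posterior probability of
    membership in each group; patients are assigned to their most likely group.
    """
    n_groups = len(posteriors[0])
    assigned = [max(range(n_groups), key=lambda g: p[g]) for p in posteriors]
    result = {}
    for g in range(n_groups):
        members = [p[g] for p, a in zip(posteriors, assigned) if a == g]
        share = len(members) / len(posteriors)
        avepp = sum(members) / len(members) if members else float("nan")
        result[g] = {"share": share, "avepp": avepp}
    return result


def is_stable(diagnostics, min_share=0.05, min_avepp=0.7):
    """Apply the >5% group size and AvePP >0.7 thresholds from step 2.2."""
    return all(
        d["share"] > min_share and d["avepp"] > min_avepp
        for d in diagnostics.values()
    )


# Toy values (hypothetical): four patients, two candidate groups.
toy_posteriors = [[0.9, 0.1], [0.8, 0.2], [0.3, 0.7], [0.1, 0.9]]
diag = group_diagnostics(toy_posteriors)
aic, bic = information_criteria(log_likelihood=-100.0, n_params=5, n_obs=50)
print(diag, is_stable(diag), aic, round(bic, 2))
```

Lower AIC and BIC indicate better fit, but as the protocol notes, a model with slightly better criteria can still be rejected if any group fails the size or AvePP threshold or if the patterns are not clinically interpretable.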

3. Statistical analysis for prognostic modeling

  1. Before prognostic modeling, summarize continuous variables with the mean, median, standard deviation, range, and quartiles, and summarize categorical variables with frequency tables. Use the median and interquartile range (IQR) to describe continuous variables with a non-normal distribution.
  2. To examine differences in variable characteristics between groups, use Fisher's exact test or Pearson's chi-squared test for categorical variables and the Wilcoxon rank-sum test for continuous variables.
  3. Fit multivariable logistic and Cox proportional hazards regression models to evaluate the association between trajectory groups and 28-day mortality, adjusting for key covariates including age, APS III score, and ICU days. See code details in Supplementary File 1.
  4. Assess model performance through discrimination (area under the receiver operating characteristic curve), calibration (bootstrap-validated calibration plots), and clinical utility (decision curve analysis across risk thresholds). Consider a two-sided p < 0.05 as indicative of statistical significance.
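To make the performance measures in step 3.4 concrete, the sketch below computes the area under the ROC curve as the probability that a non-survivor receives a higher predicted risk than a survivor, and the net benefit used in decision curve analysis, TP/n - (FP/n) × pt/(1 - pt), at a threshold probability pt. It is a plain Python illustration with hypothetical function names and toy data; the article's own analysis uses R packages (pROC, rms, rmda).

```python
def auc(y_true, y_score):
    """AUC as the fraction of (event, non-event) pairs ranked correctly.

    Ties in predicted risk count as half a correct pair.
    """
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def net_benefit(y_true, y_score, threshold):
    """Net benefit of treating patients whose predicted risk >= threshold."""
    n = len(y_true)
    tp = sum(1 for y, s in zip(y_true, y_score) if s >= threshold and y == 1)
    fp = sum(1 for y, s in zip(y_true, y_score) if s >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Net benefit of the reference strategy that treats every patient."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Toy data (hypothetical): 1 = died within 28 days, scores = predicted risk.
y = [0, 0, 0, 0, 1, 1]
scores = [0.05, 0.10, 0.20, 0.40, 0.35, 0.80]

toy_auc = auc(y, scores)
nb_model = net_benefit(y, scores, threshold=0.3)
nb_all = net_benefit_treat_all(y, threshold=0.3)
print(toy_auc, round(nb_model, 4), round(nb_all, 4))
```

A decision curve repeats the net benefit calculation over a range of thresholds and compares the model against the treat-all and treat-none (net benefit 0) strategies.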

Results

Patient characteristics
A total of 161 patients were included in this study. Execution of the GBTM protocol (step 2) identified three distinct lymphocyte trajectories (Figure 1), confirming model convergence and valid subgroup separation, a key checkpoint before proceeding to prognostic modeling. The three-class solution was chosen by balancing statistical fit (Supplementary Table 1) against clinical interpretability: although models with four or five classes showed marginally better AIC/BIC values, the three-class model provided clearly distinct, clinically actionable trajectories (persistent immunosuppression, moderate recovery, and rapid immune reconstitution) with sufficient group size and stability for robust prognostic analysis.

Trajectory 1, the continuous-rising class, included 62 patients (38.5%) and was characterized by a steady, almost linear increase in lymphocyte counts from day 1 through day 7, although from a very low starting point. Trajectory 2, the U-shaped class, included 36 patients (22.4%) and exhibited an early decline in lymphocyte counts over the first 3 days, followed by gradual recovery thereafter. Trajectory 3, the early-dip then rapid-rise class, included 63 patients (39.1%) who showed a modest decrease in lymphocyte counts during days 1–3 and a pronounced upward trend from day 4 onward, with a starting point close to the normal value.

Baseline characteristics varied significantly across these trajectories, supporting their clinical relevance (Table 1). Patients in Trajectory 1 were older and demonstrated greater illness severity (higher APS III and SOFA scores), the lowest baseline lymphocyte counts, and the highest 28-day mortality rate (24.2%). As shown in Table 2, several baseline characteristics differed significantly between survivors and non-survivors. Non-survivors were older and more frequently belonged to the high-mortality lymphocyte trajectory (Trajectory 1). They also experienced longer ICU stays, a higher prevalence of hypertension, and more severe illness on admission (higher APS III scores and greater comorbidity burden). Finally, non-survivors presented with elevated heart rates and increased blood urea nitrogen.

Multivariate logistic regression
The logistic regression protocol (step 3.3) was executed to identify independent predictors of 28-day mortality in patients with ARDS complicated by pneumonia. In the unadjusted model (Model 1), with Trajectory 1 as the reference, Trajectory 2 was not significantly associated with mortality (OR 0.51; 95% CI 0.15–1.45; p = 0.228), whereas Trajectory 3 was associated with a marked reduction in the odds of 28-day mortality (OR 0.10; 95% CI 0.02–0.39; p = 0.003). After adjusting for confounders (Model 3), Trajectory 3 remained an independent predictor of lower mortality (OR 0.06; 95% CI 0.01–0.36; p = 0.006; Table 3). Hospital length of stay remained associated with lower odds of death (OR 0.59 per day; 95% CI 0.40–0.78; p = 0.002), whereas each additional ICU day increased mortality risk (OR 1.75; 95% CI 1.32–2.57; p < 0.001). Higher APS III scores (OR 1.04 per point; 95% CI 1.00–1.08; p = 0.047) and elevated admission heart rate (OR 1.05 per bpm; 95% CI 1.01–1.10; p = 0.019) were also significant, while sex, Charlson index, and BUN did not reach significance. These results confirm the protocol's ability to derive a strong, adjusted association between immune trajectories and outcomes. The results are also visualized as a forest plot (Figure 2).

Model performance metrics (step 3.4) demonstrated the robustness of the approach. The model showed excellent discriminative ability (AUC = 0.932; Figure 3). This value, well above the 0.5 expected by chance, indicates that the model derived from the protocol effectively distinguishes between survivors and non-survivors. The calibration curve (Figure 4) indicated good agreement between predicted and observed probabilities, with minimal miscalibration at the extremes, supporting the model's reliability. Decision curve analysis (Figure 5) showed a consistent net benefit across threshold probabilities of 5%–45%, with peak benefit near 10%, indicating the model's potential for clinical decision-making.
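As a consistency check on the unadjusted estimates, the Model 1 odds ratios can be recomputed from the 28-day mortality counts in Table 1 (15 of 62 deaths in Trajectory 1, 5 of 36 in Trajectory 2, 2 of 63 in Trajectory 3). The sketch below is a minimal Python illustration with a hypothetical helper function; it relies on the fact that, for a logistic model with a single categorical predictor, the fitted odds ratio equals the ratio of observed odds.

```python
def odds_ratio(deaths_group, n_group, deaths_reference, n_reference):
    """Unadjusted odds ratio of death for a group versus the reference group."""
    odds_group = deaths_group / (n_group - deaths_group)
    odds_reference = deaths_reference / (n_reference - deaths_reference)
    return odds_group / odds_reference


# Counts taken from the 28-day mortality row of Table 1.
or_t2 = odds_ratio(5, 36, 15, 62)  # Trajectory 2 vs Trajectory 1
or_t3 = odds_ratio(2, 63, 15, 62)  # Trajectory 3 vs Trajectory 1
print(round(or_t2, 2), round(or_t3, 2))
```

The rounded values, 0.51 and 0.10, agree with the Model 1 odds ratios reported in Table 3. The adjusted estimates in Models 2 and 3 cannot be reproduced this way because they depend on the covariates.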

Multivariate Cox regression
The Cox regression protocol (step 3.3) yielded results consistent with the logistic model. Compared with Trajectory 1, patients in Trajectory 2 showed a non-significant trend toward lower 28-day mortality (HR 0.54; 95% CI 0.20–1.48; p = 0.232), whereas those in Trajectory 3 had a markedly reduced hazard of death (HR 0.12; 95% CI 0.03–0.52; p = 0.005; Model 1). After adjusting for confounders (Model 3), Trajectory 3 remained independently associated with lower mortality (HR 0.13; 95% CI 0.03–0.64; p = 0.012; Table 4). Each additional hospital day was associated with a lower hazard (HR 0.68 per day; 95% CI 0.54–0.87; p = 0.002), whereas each additional ICU day increased the hazard by 51% (HR 1.51; 95% CI 1.19–1.91; p < 0.001). Higher APS III scores (HR 1.02; 95% CI 1.00–1.05; p = 0.026) and elevated admission heart rate (HR 1.03; 95% CI 1.00–1.07; p = 0.041) were modest but significant predictors of mortality, while Charlson index and BUN did not reach statistical significance. The stability of this association across both regression frameworks underscores the robustness of the lymphocyte trajectory as a prognostic marker. The proportional hazards assumption was verified using Schoenfeld residuals, with no significant violations detected. The results are also visualized as a forest plot (Figure 6).

Figure 1: Lymphocyte trajectories identified by 3-class GBTM. Generated using the R gbmt package. Successful replication should show three distinct trajectories representing persistent immunosuppression, moderate recovery, and rapid immune reconstitution.

Figure 2: Forest plot for multivariable logistic regression. Created with R ggplot2. Statistically significant predictors should show confidence intervals not crossing OR = 1.

Figure 3: ROC curve for logistic model discrimination. Generated using the R pROC package. AUC = 0.932 indicates excellent performance; the replicated curve should rise sharply to the top-left.

Figure 4: Calibration curve (bootstrap 200). Created with the R rms package. Good calibration is confirmed by a close fit of the bias-corrected curve to the ideal line.

Figure 5: Decision curve analysis for clinical utility. Generated using the R rmda package. The model should show net benefit over treat-all/none strategies across a 5%-45% threshold range.

Figure 6: Forest plot for Cox proportional hazards model. Created with the R survminer package. Successful replication should clearly distinguish protective (HR < 1) and risk (HR > 1) factors.

| Variable | Trajectory 1 (N=62) | Trajectory 2 (N=36) | Trajectory 3 (N=63) | p |
| --- | --- | --- | --- | --- |
| Age, years | 71 (62.2–81) | 72 (58–78.2) | 62 (45.5–81) | 0.0204 |
| SOFA | 5 (4–6) | 4 (3–6) | 4 (3–6) | 0.0333 |
| GCS | 15 (15–15) | 15 (15–15) | 15 (15–15) | 0.803 |
| APSIII | 52 (45–63) | 40 (28.8–57.5) | 49 (33.5–57) | 0.0107 |
| Charlson index | 5 (3–6) | 4.5 (3–5) | 3 (2–6) | 0.213 |
| COPD | 14 (22.6%) | 11 (30.6%) | 14 (22.2%) | 0.602 |
| Hypertension | 42 (67.7%) | 22 (61.1%) | 31 (49.2%) | 0.104 |
| Diabetes | 18 (29%) | 13 (36.1%) | 16 (25.4%) | 0.529 |
| Heart disease | 32 (51.6%) | 19 (52.8%) | 28 (44.4%) | 0.638 |
| Cerebrovascular | 12 (19.4%) | 5 (13.9%) | 7 (11.1%) | 0.425 |
| CKD | 9 (14.5%) | 7 (19.4%) | 8 (12.7%) | 0.659 |
| CLD | 0 (0%) | 0 (0%) | 4 (6.3%) | 0.0412 |
| Ventilation | 11 (17.7%) | 5 (13.9%) | 13 (20.6%) | 0.701 |
| Respiratory rate | 22 (19–25) | 19.5 (16–24) | 22 (18–24) | 0.0753 |
| Heart rate | 93 (82.2–101) | 84 (76.8–98.2) | 93 (88.5–106) | 0.0371 |
| SBP | 126 (114.2–138.5) | 126 (116–134) | 126 (115–130) | 0.518 |
| DBP | 75 (66–84.8) | 75 (63–79.8) | 75 (66–81) | 0.921 |
| WBC | 10.2 (6.8–16.3) | 11.8 (9.2–13.9) | 12.5 (8.8–15) | 0.208 |
| Hemoglobin | 10.8 (10–12) | 11.4 (10.5–13.1) | 11.7 (10.7–12.9) | 0.117 |
| Platelets | 223.5 (169.8–279) | 223.5 (175.8–264.2) | 226 (192–282) | 0.775 |
| Lymphocyte | 0.6 (0.4–0.7) | 1.4 (1.3–1.6) | 1.1 (0.6–1.9) | <0.001 |
| Creatinine | 0.9 (0.8–1.4) | 0.9 (0.8–1.2) | 0.8 (0.6–1.2) | 0.0652 |
| BUN | 23 (17–38) | 20 (14.5–33.2) | 18 (12.5–24.5) | 0.0167 |
| Potassium | 4 (3.6–4.7) | 4 (3.8–4.5) | 4.1 (3.5–4.4) | 0.488 |
| Sodium | 140 (136–143) | 139 (136–143.2) | 139 (137–141) | 0.444 |
| Length of hospital, days | 11.5 (7.2–20.8) | 11 (6.8–14.2) | 12 (8–16) | 0.557 |
| Length of ICU, days | 7 (4.2–12) | 6 (4–10.2) | 6 (3–11) | 0.603 |
| 28-day mortality | 15 (24.2%) | 5 (13.9%) | 2 (3.2%) | 0.0029 |

Table 1: Baseline characteristics by lymphocyte trajectories. The significance tests used were Fisher's exact test, Pearson's Chi-squared test, and Wilcoxon rank sum test.

| Characteristic | Overall (N = 161) | Survivors (N = 139) | Non-survivors (N = 22) | p-value |
| --- | --- | --- | --- | --- |
| Trajectory, n (%) | | | | 0.001 |
| 1 | 62.0 (38.5%) | 47.0 (33.8%) | 15.0 (68.2%) | |
| 2 | 36.0 (22.4%) | 31.0 (22.3%) | 5.0 (22.7%) | |
| 3 | 63.0 (39.1%) | 61.0 (43.9%) | 2.0 (9.1%) | |
| Sex, n (%) | | | | 0.446 |
| Female | 83.0 (51.6%) | 70.0 (50.4%) | 13.0 (59.1%) | |
| Male | 78.0 (48.4%) | 69.0 (49.6%) | 9.0 (40.9%) | |
| Age, Median (Q1, Q3) | 69.00 (55.00, 81.00) | 66.00 (52.00, 78.00) | 79.00 (70.00, 87.00) | 0.001 |
| Hospital days, Median (Q1, Q3) | 12.00 (8.00, 17.00) | 12.00 (8.00, 17.00) | 11.00 (5.00, 18.00) | 0.303 |
| ICU days, Median (Q1, Q3) | 6.00 (4.00, 12.00) | 6.00 (3.00, 11.00) | 11.00 (5.00, 14.00) | 0.033 |
| Ventilation admission, n (%) | | | | >0.999 |
| No | 132.0 (82.0%) | 114.0 (82.0%) | 18.0 (81.8%) | |
| Yes | 29.0 (18.0%) | 25.0 (18.0%) | 4.0 (18.2%) | |
| COPD, n (%) | | | | 0.719 |
| No | 122.0 (75.8%) | 106.0 (76.3%) | 16.0 (72.7%) | |
| Yes | 39.0 (24.2%) | 33.0 (23.7%) | 6.0 (27.3%) | |
| Hypertension, n (%) | | | | 0.019 |
| No | 66.0 (41.0%) | 62.0 (44.6%) | 4.0 (18.2%) | |
| Yes | 95.0 (59.0%) | 77.0 (55.4%) | 18.0 (81.8%) | |
| Diabetes, n (%) | | | | 0.771 |
| No | 114.0 (70.8%) | 99.0 (71.2%) | 15.0 (68.2%) | |
| Yes | 47.0 (29.2%) | 40.0 (28.8%) | 7.0 (31.8%) | |
| Heart disease, n (%) | | | | 0.312 |
| No | 82.0 (50.9%) | 73.0 (52.5%) | 9.0 (40.9%) | |
| Yes | 79.0 (49.1%) | 66.0 (47.5%) | 13.0 (59.1%) | |
| Cerebrovascular, n (%) | | | | 0.104 |
| No | 137.0 (85.1%) | 121.0 (87.1%) | 16.0 (72.7%) | |
| Yes | 24.0 (14.9%) | 18.0 (12.9%) | 6.0 (27.3%) | |
| CKD, n (%) | | | | 0.534 |
| No | 137.0 (85.1%) | 117.0 (84.2%) | 20.0 (90.9%) | |
| Yes | 24.0 (14.9%) | 22.0 (15.8%) | 2.0 (9.1%) | |
| CLD, n (%) | | | | >0.999 |
| No | 157.0 (97.5%) | 135.0 (97.1%) | 22.0 (100.0%) | |
| Yes | 4.0 (2.5%) | 4.0 (2.9%) | 0.0 (0.0%) | |
| SOFA admission, Median (Q1, Q3) | 5.00 (3.00, 6.00) | 5.00 (3.00, 6.00) | 5.50 (3.00, 7.00) | 0.241 |
| APSIII admission, Median (Q1, Q3) | 49.00 (35.00, 60.00) | 48.00 (32.00, 57.00) | 61.50 (50.00, 69.00) | <0.001 |
| GCS, Median (Q1, Q3) | 15.00 (15.00, 15.00) | 15.00 (15.00, 15.00) | 15.00 (14.00, 15.00) | 0.367 |
| Charlson Comorbidity Index, Median (Q1, Q3) | 5.00 (3.00, 6.00) | 4.00 (2.00, 6.00) | 5.00 (4.00, 6.00) | 0.047 |
| Respiratory rate, Median (Q1, Q3) | 22.00 (18.00, 24.00) | 22.00 (18.00, 24.00) | 23.00 (19.00, 26.00) | 0.091 |
| Heart rate, Median (Q1, Q3) | 93.00 (82.00, 103.00) | 93.00 (81.00, 101.00) | 101.00 (93.00, 110.00) | 0.01 |
| SBP, Median (Q1, Q3) | 126.00 (115.00, 135.00) | 126.00 (115.00, 135.00) | 126.00 (117.00, 136.00) | 0.919 |
| DBP, Median (Q1, Q3) | 75.00 (65.00, 83.00) | 75.00 (65.00, 83.00) | 75.00 (63.00, 84.00) | 0.963 |
| Cr, Median (Q1, Q3) | 0.90 (0.70, 1.30) | 0.90 (0.70, 1.30) | 0.95 (0.70, 1.40) | 0.582 |
| BUN, Median (Q1, Q3) | 20.00 (15.00, 30.00) | 19.00 (14.00, 27.00) | 33.50 (20.00, 54.00) | 0.002 |
| Potassium, Median (Q1, Q3) | 4.00 (3.70, 4.50) | 4.00 (3.60, 4.50) | 4.25 (3.80, 5.10) | 0.133 |
| Sodium, Median (Q1, Q3) | 139.00 (136.00, 142.00) | 139.00 (136.00, 142.00) | 140.50 (136.00, 143.00) | 0.483 |
| WBC, Median (Q1, Q3) | 11.50 (7.80, 15.20) | 11.40 (7.70, 15.20) | 12.10 (9.10, 18.20) | 0.321 |
| HBG, Median (Q1, Q3) | 11.30 (10.30, 12.70) | 11.50 (10.30, 12.80) | 10.60 (9.80, 11.70) | 0.072 |
| PLT, Median (Q1, Q3) | 226.00 (175.00, 279.00) | 225.00 (174.00, 278.00) | 236.00 (200.00, 358.00) | 0.208 |
| Lymphocyte, Median (Q1, Q3) | 0.83 (0.52, 1.44) | 0.88 (0.53, 1.45) | 0.67 (0.45, 0.93) | 0.111 |

Table 2: Characteristics of survivors and non-survivors. The significance tests used were Fisher's exact test, Pearson's Chi-squared test, and Wilcoxon rank sum test.

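| Group | Characteristic | OR | 95% CI | p-value |
| --- | --- | --- | --- | --- |
| model 1 | trajectory | | | |
| | 1 | Ref. | | |
| | 2 | 0.51 | 0.15, 1.45 | 0.228 |
| | 3 | 0.1 | 0.02, 0.39 | 0.003 |
| model 2 | trajectory | | | |
| | 1 | Ref. | | |
| | 2 | 0.46 | 0.12, 1.59 | 0.236 |
| | 3 | 0.1 | 0.01, 0.44 | 0.007 |
| | age | 1.08 | 1.03, 1.13 | 0.003 |
| | sex | | | |
| | F | Ref. | | |
| | M | 0.5 | 0.15, 1.50 | 0.222 |
| | hosp_days | 0.69 | 0.52, 0.86 | 0.005 |
| | ICU_days | 1.51 | 1.22, 2.01 | 0.001 |
| model 3 | trajectory | | | |
| | 1 | Ref. | | |
| | 2 | 0.56 | 0.12, 2.32 | 0.435 |
| | 3 | 0.06 | 0.01, 0.36 | 0.006 |
| | age | 1.06 | 1.00, 1.14 | 0.049 |
| | sex | | | |
| | F | Ref. | | |
| | M | 0.42 | 0.11, 1.47 | 0.187 |
| | hosp_days | 0.59 | 0.40, 0.78 | 0.002 |
| | ICU_days | 1.75 | 1.32, 2.57 | <0.001 |
| | APS III_admission | 1.04 | 1.00, 1.08 | 0.047 |
| | charlson_comorbidity_index | 0.89 | 0.63, 1.20 | 0.49 |
| | heart_rate_admission | 1.05 | 1.01, 1.10 | 0.019 |
| | BUN_adm | 1 | 1.00, 1.01 | 0.209 |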

Table 3: Multivariate logistic regression. Abbreviations: CI = Confidence Interval, OR = Odds Ratio.

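| Group | Characteristic | N | Event N | HR | 95% CI | p-value |
| --- | --- | --- | --- | --- | --- | --- |
| model 1 | trajectory | 161 | 22 | | | |
| | 1 | 62 | | Ref. | | |
| | 2 | 36 | | 0.54 | 0.20, 1.48 | 0.232 |
| | 3 | 63 | | 0.12 | 0.03, 0.52 | 0.005 |
| model 2 | trajectory | 161 | 22 | | | |
| | 1 | 62 | | Ref. | | |
| | 2 | 36 | | 0.51 | 0.18, 1.42 | 0.197 |
| | 3 | 63 | | 0.16 | 0.03, 0.70 | 0.016 |
| | age | 161 | 22 | 1.06 | 1.02, 1.10 | 0.004 |
| | sex | 161 | 22 | | | |
| | F | 83 | | Ref. | | |
| | M | 78 | | 0.62 | 0.26, 1.46 | 0.274 |
| | hosp_days | 161 | 22 | 0.73 | 0.58, 0.91 | 0.005 |
| | ICU_days | 161 | 22 | 1.41 | 1.14, 1.73 | 0.001 |
| model 3 | trajectory | 161 | 22 | | | |
| | 1 | 62 | | Ref. | | |
| | 2 | 36 | | 0.65 | 0.22, 1.91 | 0.437 |
| | 3 | 63 | | 0.13 | 0.03, 0.64 | 0.012 |
| | age | 161 | 22 | 1.05 | 1.00, 1.10 | 0.063 |
| | sex | 161 | 22 | | | |
| | F | 83 | | Ref. | | |
| | M | 78 | | 0.67 | 0.26, 1.74 | 0.411 |
| | hosp_days | 161 | 22 | 0.68 | 0.54, 0.87 | 0.002 |
| | ICU_days | 161 | 22 | 1.51 | 1.19, 1.91 | <0.001 |
| | APS III_admission | 161 | 22 | 1.02 | 1.00, 1.05 | 0.026 |
| | charlson_comorbidity_index | 161 | 22 | 0.92 | 0.71, 1.18 | 0.493 |
| | heart_rate_admission | 161 | 22 | 1.03 | 1.00, 1.07 | 0.041 |
| | BUN_adm | 161 | 22 | 1 | 1.00, 1.00 | 0.293 |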

Table 4: Multivariate Cox regression. Abbreviations: CI = Confidence Interval, HR = Hazard Ratio.

Supplementary Table 1: Summary statistics for each lymphocyte trajectory group. Abbreviations: AIC = Akaike information criterion; BIC = Bayesian information criterion; CAIC = Consistent AIC; AvePP = Average posterior probability.

Supplementary File 1: Codes and script for steps 1-3.

Discussion

Using the MIMIC-IV database, we developed a longitudinal lymphocyte subgroup analysis for ARDS combined with pneumonia that relies only on widely available lymphocyte counts. Longitudinal lymphocyte subgroups were better than the admission lymphocyte count at evaluating prognosis: trajectory membership differed significantly between survivors and non-survivors (p = 0.001), whereas the admission lymphocyte count did not (p = 0.111; Table 2).

Critical procedural steps significantly influenced the results and should be noted for reproducibility. The data filtering criterion, which requires at least two lymphocyte measurements within 7 days, ensures adequate longitudinal data. The GBTM parameter selection, specifically the choice of three trajectories based on clinical interpretability rather than marginal statistical improvements, was crucial for generating clinically actionable phenotypes.

The distinct lymphocyte trajectories might imply different immune profiles and outcomes in ARDS combined with pneumonia, especially during the treatment process8. The APS III score measures the degree of an individual's physiological derangement9: the higher the score, the more pronounced the organ dysfunction and the worse the prognosis. Compared with Trajectory 3, patients in Trajectory 1 had a lower initial lymphocyte count and a higher APS III score, which indicates that they were more likely to be in a state of immune dysfunction. Trajectory 1 (continuous rise) likely represents patients with limited early lymphocyte mobilization and persistent lymphopenia. Trajectory 3 (early dip then rapid rise) was independently associated with the lowest 28-day mortality (HR 0.13; p = 0.012), suggesting that a transient lymphocyte decrease followed by prompt recovery confers a better prognosis. This pattern aligns with prior research indicating that early death in ARDS is driven by intense inflammation, while late death is more commonly associated with immunosuppression10. These findings highlight the potential role of tailored therapies for different subtypes of ARDS combined with pneumonia. Specifically, patients with a pro-inflammatory profile (Trajectory 3) may benefit from anti-inflammatory agents such as corticosteroids or ulinastatin11. For patients with an immunosuppressive profile (Trajectory 1), immune-stimulating therapies such as thymosin α1, which restores lymphocyte counts, or IL-7, which promotes lymphocyte proliferation and prevents apoptosis, might be advantageous12.

The identification of these trajectories has direct clinical implications for risk stratification. By monitoring lymphocyte counts during the first 3-4 days of ICU stay, clinicians could potentially identify patients at persistently high risk (Trajectory 1) who might be candidates for enrollment in trials of immunostimulatory therapies. Conversely, patients following Trajectory 3, despite an initial dip, have a favorable prognosis and might be spared more aggressive and potentially harmful immunomodulatory interventions. However, these proposals are hypothetical and must be rigorously tested in prospective, interventional studies before clinical adoption. Targeted treatment for ARDS is therefore particularly important13.

Compared with existing modeling approaches, our GBTM-based protocol offers distinct methodological advantages. While machine learning models typically require complex feature engineering and extensive computational resources, this approach leverages routinely available longitudinal data with a straightforward implementation. The protocol's ability to handle irregularly spaced measurements and missing data makes it particularly suitable for real-world clinical datasets. Furthermore, the resulting trajectory groups provide immediately interpretable clinical phenotypes, unlike the black box nature of some complex machine learning algorithms.

Heart rate was significantly higher in non-survivors than in survivors in our study, a finding consistent with a previous study14. An increased HR may reflect low blood oxygen levels and therefore more severe ARDS15. Furthermore, a prolonged elevated HR in critically ill patients at high cardiac risk could result in major cardiac events, which may lead to an adverse prognosis16. Therefore, when a patient is admitted to the hospital with an abnormally fast heart rate, it is essential to closely monitor the patient's blood oxygen levels and to be vigilant for signs of clinical deterioration of ARDS combined with pneumonia. Hospitalization duration exhibits a prognostic duality: each additional day on general wards may signal recovery momentum, whereas prolonged ICU confinement often encapsulates a self-perpetuating cycle of critical illness17.

To contextualize the performance of our lymphocyte trajectory model, we compared it with recent prognostic models for ARDS. For instance, a recent study focusing specifically on pneumonia-ARDS (p-ARDS) developed six ML models, among which the Support Vector Machine (SVM) model performed best (AUC 0.77)18. More advanced models incorporating specific biomarkers (such as the lactate-to-albumin ratio) have improved performance, with a reported AUC of up to 0.811 for 28-day mortality19. In this landscape, the present model, which integrates dynamic lymphocyte trajectories with basic clinical parameters, achieved a higher discriminative ability, with an AUC of 0.932. This suggests that the longitudinal immune profile captured by lymphocyte trends may provide more potent prognostic information than a single time-point measurement or even complex ML models based on static admission variables.

The efficiency and adaptability of this workflow represent additional advantages. The modular design of our analytical pipeline enables rapid application to new datasets, with complete analysis from data extraction to visualization achievable within hours. This protocol can be readily adapted to study other dynamic biomarkers or different critical care populations, enhancing its utility beyond the current application.

Several limitations merit consideration. First, the retrospective analysis of a single-center database (MIMIC-IV) is susceptible to unmeasured confounding, and the reliance on ICD-9/10 codes for ARDS identification, rather than prospective application of the Berlin definition, may introduce misclassification bias and limit generalizability due to regional diagnostic variations. Second, the cohort size (n = 161), while adequate for initial modeling, requires external validation in larger, multi-center studies. Third, the absence of dynamic cytokine profiles and immunotherapy data restricts a fuller immunological assessment. Finally, the clinical utility of our model requires prospective evaluation.

In conclusion, three distinct lymphocyte trajectories were identified in ARDS patients with pneumonia using GBTM. Lymphocyte trajectory, elevated admission heart rate, and longer ICU stay were strong predictors of 28-day mortality. These findings might support the development of more personalized management strategies for ARDS combined with pneumonia. Future prospective studies could investigate the efficacy of targeted immune therapy in different trajectories to better understand potential interactions between immune therapy and ARDS subgroups.

Disclosures

The authors have no conflicts of interest to disclose.

Acknowledgements

This study is based on data from the MIMIC-IV database. We extend our sincere gratitude to the dedicated MIMIC-IV team for their contributions. We also thank Mr. Hou for his companionship and encouragement.

Materials

List of materials used in this article
| Name | Company | Catalog Number | Comments |
| --- | --- | --- | --- |
| MIMIC | MIT Lab for Computational Physiology | IV 2.2 | |
| Navicat Premium | PremiumSoft CyberTech Ltd | 16 | |
| R | R Core Team | 4.4.1 | |
| RStudio | Posit Software, PBC | | |

References

  1. Bellani, G., et al. Epidemiology, Patterns of Care, and Mortality for Patients With Acute Respiratory Distress Syndrome in Intensive Care Units in 50 Countries. JAMA. 315 (8), 788-800 (2016).
  2. Thompson, B. T., et al. Acute Respiratory Distress Syndrome. New Engl J Med. 377 (6), 562-572 (2017).
  3. Xu, H., et al. Acute Respiratory Distress Syndrome Heterogeneity and the Septic ARDS Subgroup. Front Immunol. 14, 1277161 (2023).
  4. Drewry, A. M., et al. Persistent Lymphopenia after Diagnosis of Sepsis Predicts Mortality. Shock. 5 (5), 391-391 (2014).
  5. Drewry, A. M., et al. The Presence of Hypothermia within 24 Hours of Sepsis Diagnosis Predicts Persistent Lymphopenia. Crit Care Med. 43 (6), 1165-1169 (2015).
  6. Nagin, D. S., et al. Group-Based Multi-Trajectory Modeling. Statist Methods Med Res. 27 (7), 2015-2023 (2018).
  7. ICD-10. https://www.cdc.gov/nchs/icd/icd-10/index.html (2024).
  8. Calfee, C. S., et al. Subphenotypes in Acute Respiratory Distress Syndrome: Latent Class Analysis of Data from Two Randomised Controlled Trials. Lancet Resp Med. 2 (8), 611-620 (2014).
  9. Zhang, L., et al. Prediction of Prognosis in Elderly Patients with Sepsis Based on Machine Learning (Random Survival Forest). BMC Emerg Med. 22 (1), 26 (2022).
  10. Delano, M. J., Ward, P. A. Sepsis-Induced Immune Dysfunction: Can Immune Therapies Reduce Mortality. J Clin Invest. 126 (1), 23-31 (2016).
  11. Liu, D., et al. Sepsis-Induced Immunosuppression: Mechanisms, Diagnosis and Current Treatment Options. Military Med Res. 9 (1), 56 (2022).
  12. Patil, N. K., et al. Immunotherapy: A Promising Approach to Reverse Sepsis-Induced Immunosuppression. Pharmacol Res. 111, 688-702 (2016).
  13. Su, Y., et al. Mechanisms of pulmonary endothelial barrier dysfunction in acute lung injury and acute respiratory distress syndrome. Chinese Med J Pulmon Crit Care Med. 2 (2), 80-87 (2024).
  14. Predictors of Survival in Older Adults Hospitalized with COVID-19. Neurol Sci Off J Italian Neurol Soc Italian Soc Clin Neurophysiol. 42 (10), 3953-3958 (2021).">Tyson, B., et al. Predictors of Survival in Older Adults Hospitalized with COVID-19. Neurol Sci Off J Italian Neurol Soc Italian Soc Clin Neurophysiol. 42 (10), 3953-3958 (2021).
  15. Dynamic Oxygenation Subgroup Bringing New Insights in ARDS: More Predictive of Outcomes and Response to PEEP than Static PaO2/FiO2. Thorax. , (2025).">Bai, Y., et al. Dynamic Oxygenation Subgroup Bringing New Insights in ARDS: More Predictive of Outcomes and Response to PEEP than Static PaO2/FiO2. Thorax. , (2025).
  16. Impact of Prolonged Elevated Heart Rate on Incidence of Major Cardiac Events in Critically Ill Patients with a High Risk of Cardiac Complications. Crit Care Med. 33 (1), discussion 241-242 81-88 (2005).">Sander, O., et al. Impact of Prolonged Elevated Heart Rate on Incidence of Major Cardiac Events in Critically Ill Patients with a High Risk of Cardiac Complications. Crit Care Med. 33 (1), discussion 241-242 81-88 (2005).
  17. Clinical Subtypes of Sepsis Survivors Predict Readmission and Mortality after Hospital Discharge. Ann Am Thorac Soc. 19 (8), 1355-1363 (2022).">Taylor, S. P., et al. Clinical Subtypes of Sepsis Survivors Predict Readmission and Mortality after Hospital Discharge. Ann Am Thorac Soc. 19 (8), 1355-1363 (2022).
  18. Machine Learning-based prognostic prediction model of pneumonia-associated acute respiratory distress Syndrome. Front Med. 12, 1582426(2025).">Lv, J., Chen, J., Liu, M. Machine Learning-based prognostic prediction model of pneumonia-associated acute respiratory distress Syndrome. Front Med. 12, 1582426(2025).
  19. Association between Platelet-albumin-bilirubin grade and the 30-day mortality in patients with acute respiratory distress syndrome: Evidence from the MIMIC-IV Database. Balkan Med J. 42 (1), 66-74 (2025).">Ye, D., Jiang, W., Gu, D. Association between Platelet-albumin-bilirubin grade and the 30-day mortality in patients with acute respiratory distress syndrome: Evidence from the MIMIC-IV Database. Balkan Med J. 42 (1), 66-74 (2025).

Tags

Lymphocyte Trajectory, ARDS Mortality, Group Based Trajectory Modeling, Pneumonia ARDS, Longitudinal Biomarkers, Mortality Risk Prediction, MIMIC IV Database, Logistic Regression, Cox Regression, Immune Heterogeneity

Related Articles