Forced expiratory volume in one second (FEV1) remains the standard parameter for assessing airflow limitation, although the most appropriate expression of its decline for prognostic evaluation remains unclear. This study aims to assess the prognostic significance of different annual FEV1 decline indices for medium- and long-term mortality in COPD.
Methods
One thousand two hundred forty-seven patients with clinically diagnosed COPD were included, each undergoing at least three annual spirometric evaluations. Six FEV1 indices were analysed: absolute value, percentage of predicted, z-score, FEV1 normalized by height squared (FEV1·Ht−2) and cubed (FEV1·Ht−3), and FEV1Q. Longitudinal changes were estimated using random-coefficient models. The primary outcome was all-cause mortality over a 15-year follow-up.
Results
A total of 12,863 person-years were analysed. During the follow-up, 577 patients (46.3%) died. All FEV1 indices were significantly associated with mortality risk. However, in multivariate analysis, only the annual decline in FEV1 z-score remained an independent predictor (adjusted hazard ratio 0.104, 95% confidence interval 0.080–0.135, p<0.001). ROC analyses demonstrated that FEV1 z-score decline provided superior predictive accuracy compared to other indices. An annual z-score decline of at least 0.1969/year was associated with a 4.6-fold increased mortality risk. Additionally, the baseline value of FEV1·Ht−3 showed greater prognostic value than the other baseline FEV1 indices.
Conclusion
The annual decline in FEV1 z-score is the FEV1 expression that most robustly predicts long-term mortality in COPD. These findings suggest that incorporating longitudinal z-score assessments into clinical practice may improve risk stratification and patient management.
Chronic obstructive pulmonary disease (COPD) is a highly prevalent condition associated with substantial socioeconomic burden and a growing cause of mortality worldwide [1–3]. A hallmark of COPD is persistent airflow limitation, the severity of which is commonly assessed using the forced expiratory volume at one second (FEV1) [1,4]. Beyond its role in defining the severity of airflow limitation, FEV1 remains the most widely used variable to characterize the natural history of COPD through its annual rate of decline [5].
Despite its centrality in COPD assessment, discrepancies persist regarding the optimal classification of airflow limitation severity, particularly whether FEV1 should be expressed as a percentage of the predicted value or as a z-score [1,6–8]. These discrepancies are further accentuated when considering the most suitable expression of FEV1 to evaluate disease progression over time. Several alternative indices have been introduced to address these limitations, including FEV1·Ht−2 and FEV1·Ht−3, which normalize FEV1 by height squared or cubed, respectively [9], and FEV1Q, defined as the ratio of absolute FEV1 to its sex-specific first percentile [10]. Each of these indices offers potential advantages in reflecting physiological changes and enhancing predictive accuracy in COPD progression.
Although recent ERS/ATS guidelines have endorsed FEV1Q as a preferred metric for longitudinal evaluations [11], the evidence remains insufficient, particularly regarding its prognostic implications. Most studies have focused on baseline FEV1 indices rather than their annual rates of change [9,10,12,13], which may provide additional prognostic insights. Moreover, many investigations include mixed populations of individuals with and without COPD [10,14], limiting their generalizability to clinical COPD cohorts.
This study aims to evaluate the prognostic value of the annual rate of change in various FEV1 indices, including the absolute value, percentage of predicted, z-score, FEV1·Ht−2, FEV1·Ht−3, and FEV1Q, for predicting medium- and long-term mortality risk in a cohort of patients with clinically diagnosed COPD.
Methods
Study patients and design
Patients with COPD from the Fuenlabrada cohort who had at least three annual spirometric evaluations over three consecutive years were selected. Details of the cohort have been published previously [7,15]. In brief, individuals were recruited between April 2004 and December 2008 if they were aged 40 years or older, showed airflow limitation, and had a clinical diagnosis of COPD recorded by general practitioners or pulmonologists in the ninth district of the Madrid Metropolitan Area, Spain.
COPD was diagnosed on the basis of a post-bronchodilator FEV1/FVC ratio below the lower limit of normal, corroborated by a clinical record indicating COPD (codes 491.xx or 492.xx in the International Classification of Diseases, Ninth Revision, Clinical Modification). Patients with concurrent diagnoses of asthma, cystic fibrosis, interstitial lung disease, pulmonary thromboembolic disease, active tuberculosis, chest wall disease, neuromuscular disorders, malignant tumours, or a history of thoracotomy with pulmonary resection were excluded. At enrolment, all participants were clinically stable and had not experienced a respiratory infection in the preceding six weeks. The study protocol was approved by the local ethics committee (Comité Ético de Investigación Clínica del Área 9 de Madrid. PI11062010).
All participants underwent pulmonary function testing at baseline and at least once annually for the following three years. Additional information, including smoking history, exacerbations, current treatment, and comorbidities, was collected to complement spirometric data.
Baseline and longitudinal assessment of airflow limitation
All post-bronchodilator spirometries were performed by the same technician using a MasterScreen Body device (Jaeger-Viasys, Würzburg, Germany), in accordance with current guidelines [16,17]. The best values for FVC and FEV1 were automatically selected from three acceptable, reproducible manoeuvres [4]. Only spirometries meeting quality grades A or B were retained. Predicted values and z-scores were calculated for each participant using the Global Lung Function Initiative equations [18].
For each participant, the following baseline FEV1 indices were derived: absolute value, percentage of the predicted value, z-score, FEV1/height², FEV1/height³, and FEV1Q. FEV1Q was calculated by dividing an individual's FEV1 by the sex-specific first-percentile value for adults with lung disease (0.4 L for women and 0.5 L for men) [10]. Airflow limitation severity was classified according to both the GOLD criteria (mild [≥80% predicted], moderate [50–79%], severe [30–49%], and very severe [<30%]) [1] and the 2021 ERS/ATS recommendations [6], which define mild, moderate, and severe obstruction as z-scores of −1.65 to −2.5, −2.51 to −4.0, and <−4.1, respectively.
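The reference-free indices defined above reduce to simple arithmetic. The following Python sketch is illustrative only and is not the study's code: the function and field names are hypothetical, and it assumes FEV1 in litres and standing height in metres.

```python
from dataclasses import dataclass

# Sex-specific first-percentile FEV1 values (litres) quoted in the text [10].
FEV1Q_DENOMINATOR_L = {"female": 0.4, "male": 0.5}


@dataclass
class Fev1Indices:
    fev1_ht2: float  # FEV1 / height^2, in L/m^2
    fev1_ht3: float  # FEV1 / height^3, in L/m^3
    fev1q: float     # FEV1 / first-percentile FEV1, dimensionless


def derive_indices(fev1_l: float, height_m: float, sex: str) -> Fev1Indices:
    """Derive the height-normalized indices and FEV1Q (hypothetical helper)."""
    if fev1_l <= 0 or height_m <= 0:
        raise ValueError("FEV1 and height must be positive")
    if sex not in FEV1Q_DENOMINATOR_L:
        raise ValueError("sex must be 'female' or 'male'")
    return Fev1Indices(
        fev1_ht2=fev1_l / height_m ** 2,
        fev1_ht3=fev1_l / height_m ** 3,
        fev1q=fev1_l / FEV1Q_DENOMINATOR_L[sex],
    )


# Illustrative input: a man with FEV1 1.57 L and height 1.62 m
# gives roughly 0.60 L/m^2, 0.37 L/m^3 and FEV1Q 3.14.
example = derive_indices(1.57, 1.62, "male")
```

Percentage of predicted and z-score are not included because they depend on the Global Lung Function Initiative reference equations, which are not reproduced here.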
To assess changes in the different expressions of FEV1 over the 3-year interval, we used random-coefficient models that allowed each participant to have an individual random slope. We also explored quadratic and piecewise approaches, with both fixed and random join points (i.e., points where segments with distinct slopes meet), but found that neither provided a materially better fit than the linear specification. The random slope was defined according to the timing of each FEV1 assessment. Individual empirical Bayes estimates of the FEV1 rate of change were then derived.
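As a rough illustration of what an individual rate of change represents, the sketch below fits an ordinary least-squares line to one participant's repeated measurements. This is a deliberately simplified stand-in, not the method used in the study: a random-coefficient model additionally shrinks each participant's slope toward the cohort mean (the empirical Bayes estimate), which this sketch does not do. The function name and the example values are hypothetical.

```python
def ols_slope(times_yr, values):
    """Least-squares slope of `values` against time in years, for one participant."""
    n = len(times_yr)
    if n != len(values) or n < 2:
        raise ValueError("need at least two paired observations")
    mean_t = sum(times_yr) / n
    mean_v = sum(values) / n
    # Sum of squares of time and cross-product of time with the measurement.
    sxx = sum((t - mean_t) ** 2 for t in times_yr)
    if sxx == 0:
        raise ValueError("measurement times must not all be identical")
    sxy = sum((t - mean_t) * (v - mean_v) for t, v in zip(times_yr, values))
    return sxy / sxx


# Hypothetical participant: FEV1 z-score at baseline and at three annual visits.
visit_years = [0.0, 1.0, 2.0, 3.0]
z_scores = [-2.0, -2.2, -2.4, -2.6]
annual_change = ols_slope(visit_years, z_scores)  # z-score units per year
```

With only four visits per participant, unshrunk slopes of this kind are noisy, which is one reason a mixed model with empirical Bayes estimates is generally preferred.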
Follow-up and outcome measurements
During follow-up through December 31, 2019, patients received treatment from their general practitioner or pulmonologist in accordance with current guidelines [1,19], with evaluations every 3–6 months. The primary outcome was all-cause mortality, measured by time to death. The secondary outcome was the prediction of mid-term (up to 5 years) and long-term mortality. Vital status was ascertained via direct follow-up with patients or relatives, national registries, or hospital records, ensuring no missing mortality data.
Statistical analysis
Continuous variables were expressed as mean±standard deviation (SD), whereas categorical variables were reported as absolute numbers and percentages. Comparisons between groups were conducted using analysis of variance (ANOVA) with Bonferroni post hoc analysis, Student's t-test, or the chi-square test. Time-to-death data were analysed using standard semi-parametric Cox proportional hazards models. For each FEV1-derived index, hazard ratios were calculated both as unadjusted estimates and as multivariable-adjusted estimates. The adjusted models included the following covariates: age, sex assigned at birth, smoking status, body mass index, Charlson comorbidity index, history of severe acute exacerbations in the preceding year, and baseline post-bronchodilator FEV1.
To evaluate the association between annual changes in FEV1 and all-cause mortality without assuming a predefined functional form, a penalized spline method was employed. This non-parametric regression model was selected based on the Akaike information criterion, with all models using the top 5% of values (≥95th percentile) as the reference category (hazard ratio=1) [20]. Additionally, optimal thresholds for annual changes in FEV1 indices were determined to maximize the area under the curve (AUC) over a 15-year follow-up period, applying Youden's index.
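The idea behind a Youden-based cut-off can be sketched as follows. This is a simplified illustration under stated assumptions rather than the analysis performed in the study: it treats death as a binary outcome and ignores censoring and time dependence, and the function name and toy data are hypothetical. A participant is flagged as high risk when the annual change is at or below the candidate cut-off (that is, when the decline is steeper).

```python
def youden_cutoff(annual_changes, died):
    """Return (cutoff, J) maximising Youden's J = sensitivity + specificity - 1.

    High risk is defined as annual change <= cutoff.
    """
    n_pos = sum(1 for d in died if d)
    n_neg = len(died) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both outcome classes are required")
    best_cutoff, best_j = None, -1.0
    for c in sorted(set(annual_changes)):
        # True positives: deaths flagged as high risk at this cut-off.
        tp = sum(1 for v, d in zip(annual_changes, died) if d and v <= c)
        # True negatives: survivors not flagged at this cut-off.
        tn = sum(1 for v, d in zip(annual_changes, died) if not d and v > c)
        j = tp / n_pos + tn / n_neg - 1
        if j > best_j:
            best_cutoff, best_j = c, j
    return best_cutoff, best_j


# Toy data (not from the cohort): annual z-score change and vital status.
changes = [-0.5, -0.4, -0.3, -0.1, 0.0, 0.1]
deaths = [True, True, True, False, False, False]
cutoff, j = youden_cutoff(changes, deaths)
```

In this perfectly separated toy example the selected cut-off is −0.3 with J = 1; real data yield a compromise between sensitivity and specificity.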
The performance of the time-to-event models was evaluated by calculating the cumulative incident AUC for annual changes in FEV1 indices considered as discrete variables. The DeLong method was used to compute the standard error of the AUC and the differences between AUCs [21].
A p-value of less than 0.05 was deemed statistically significant. Data analyses were performed using SPSS (Armonk, NY), MedCalc (Ostend, Belgium), R statistical software version 3.3.1 (“Bug in Your Hair”; Boston, MA), and the R CRAN survival, risksetROC, and pspline packages (R Foundation for Statistical Computing, Vienna, Austria).
Results
A total of 1247 patients with COPD were included in the analysis (Fig. S1 and Table S1), contributing a combined follow-up of 12,863 person-years. The cohort was predominantly male (71.8%), with an age (mean±SD) of 64±10 years. At baseline, 32.5% of the patients were current smokers, while 48.7% were former smokers. Based on the GOLD classification, 131 patients (10.5%) had mild airflow limitation, 661 (53.0%) had moderate airflow limitation, 383 (30.7%) had severe airflow limitation, and 72 (5.8%) had very severe airflow limitation. In contrast, according to the 2021 ERS/ATS recommendations, airflow limitation was classified as mild in 325 patients (30.8%), moderate in 593 (56.3%) and severe in 136 (12.9%). The key demographic, clinical and functional characteristics of COPD patients, stratified by severity of airflow limitation, are summarized in Table 1.
Table 1. General characteristics of study subjects.*
| | Overall (n=1247) | Mild (n=131) | Moderate (n=661) | Severe (n=383) | Very severe (n=72) | p-Value |
|---|---|---|---|---|---|---|
| Males, n (%) | 895 (71.8) | 91 (69.5) | 458 (69.3) | 289 (75.5) | 57 (79.2) | 0.076 |
| Age, yrs | 64±10 | 60±11† | 62±11† | 66±11 | 64±10 | <0.001 |
| Height, m | 1.62±0.09 | 1.63±0.08 | 1.63±0.09 | 1.61±0.09 | 1.63±0.08 | 0.054 |
| Weight, kg | 76±17 | 72±13!! | 78±17‡ | 75±17!! | 69±14§ | <0.001 |
| BMI, kg/m2 | 28.8±5.6 | 27.2±4.6§ | 29.5±5.5 | 28.6±6.0 | 26.2±5.2§‡ | <0.001 |
| Smoking habit, n (%) | | | | | | 0.037 |
| Current smoker | 387 (32.5) | 39 (30.2) | 210 (33.5) | 114 (31.2) | 24 (34.8) | |
| Former smoker | 579 (48.7) | 56 (43.4) | 290 (46.3) | 198 (54.2) | 35 (50.7) | |
| Never smoker | 224 (18.8) | 34 (26.4) | 127 (20.3) | 53 (14.5) | 10 (14.5) | |
| Pack-years | 50±28 | 40±27†#$ | 48±27† | 56±28§ | 54±24 | <0.001 |
| Postbronchodilator spirometry | ||||||
| FEV1/FVC | 0.58±0.10 | 0.67±0.04\|†§ | 0.62±0.07\|† | 0.52±0.09\| | 0.43±0.12 | <0.001 |
| FVC, L | 2.68±0.93 | 3.80±0.94\|†§ | 2.87±0.79\|† | 2.15±0.67\| | 1.81±0.61 | <0.001 |
| FVC, % pred. | 76±18 | 104±12\|†§ | 80±12\|† | 63±12\| | 50±13 | <0.001 |
| FEV1, L | 1.57±0.63 | 2.53±0.56\|†§ | 1.76±0.46\|† | 1.08±2.86\| | 0.72±0.16 | <0.001 |
| FEV1, % pred. | 57±18 | 88±8\|†§ | 63±8\|† | 41±6\| | 26±3 | <0.001 |
| FEV1, z-score | −2.68±1.05 | −0.78±0.53\|†§ | −2.38±0.55\|† | −3.54±0.52\| | −4.40±0.50 | <0.001 |
| FEV1·Ht−2, L/m2 | 0.59±0.21 | 0.94±0.15\|†§ | 0.66±0.13\|† | 0.41±0.08\| | 0.27±0.05 | <0.001 |
| FEV1·Ht−3, L/m3 | 0.36±0.13 | 0.58±0.08\|†§ | 0.40±0.07\|† | 0.25±0.05\| | 0.16±0.03 | <0.001 |
| FEV1Q | 3.93±1.59 | 6.31±1.41\|†§ | 4.40±1.16\|† | 2.70±0.71\| | 1.80±0.40 | <0.001 |
| Charlson comorbidity index | 3.8±2.0 | 3.2±2.0† | 3.6±2.0\|† | 4.2±2.0 | 3.7±1.6 | <0.001 |
| Comorbidities, n (%) | ||||||
| Diabetes | 172 (13.8) | 4 (4.6) | 92 (13.9) | 68 (17.8) | 6 (8.3) | 0.001 |
| Ischemic heart disease | 79 (6.3) | 7 (5.3) | 40 (6.1) | 25 (6.5) | 7 (9.7) | 0.632 |
| Congestive heart failure | 76 (6.1) | 2 (1.5) | 31 (4.7) | 38 (9.9) | 5 (6.9) | 0.001 |
| Peripheral vascular disease | 235 (18.8) | 26 (19.8) | 115 (17.4) | 84 (21.9) | 10 (13.9) | 0.209 |
| Cerebrovascular disease | 34 (2.7) | 2 (1.5) | 19 (2.9) | 12 (3.1) | 1 (1.4) | 0.685 |
| Current treatment, n (%) | ||||||
| SABA | 630 (50.5) | 62 (47.3) | 324 (49.0) | 203 (53.0) | 41 (56.9) | 0.353 |
| SAMA | 108 (8.7) | 8 (6.1) | 47 (7.1) | 49 (12.8) | 4 (5.6) | 0.007 |
| LABA | 959 (76.9) | 80 (61.1) | 487 (73.7) | 330 (86.2) | 62 (86.1) | <0.001 |
| LAMA | 786 (63.0) | 52 (39.7) | 396 (59.9) | 279 (72.8) | 59 (81.9) | <0.001 |
| Inhaled corticosteroids | 957 (76.7) | 84 (64.1) | 480 (72.6) | 330 (86.2) | 63 (87.5) | <0.001 |
| Theophylline | 82 (6.6) | 0 | 22 (3.3) | 39 (10.2) | 21 (29.2) | <0.001 |
| N-acetylcysteine | 95 (7.6) | 10 (7.6) | 49 (7.4) | 33 (8.6) | 3 (4.2) | 0.615 |
Abbreviations: BMI, body mass index; FVC, forced vital capacity; FEV1, forced expiratory volume at one second; SABA, short acting beta agonists; SAMA, short acting muscarinic antagonists; LABA, long acting beta agonists; LAMA, long acting muscarinic antagonists.
After completing the three-year longitudinal spirometric evaluation, 577 (46.3%) patients died during the 15-year follow-up period. Table 2 summarizes mortality rates according to the GOLD and ERS/ATS classifications of airflow limitation severity at the end of the three-year evaluation period. Kaplan–Meier curves for the GOLD and ERS/ATS classifications of airflow limitation and mortality are shown in Fig. 1A and B, and hazard ratios for the different cut-offs of the two classifications are presented in Table 3. Overall, the severity classification based on the percentage of predicted FEV1 (GOLD) demonstrated greater discriminative capacity than the z-score-based classification (ERS/ATS).
Table 2. Mortality rates according to severity classification of airflow limitation.
| | Mortality rates per 1000 person-years |
|---|---|
| Overall (n=1247) | 44.9 (42.1–47.6) |
| GOLD classification | |
| Mild (n=131) | 3.5 (3.1–3.9) |
| Moderate (n=661) | 21.6 (15.1–28.2) |
| Severe (n=383) | 35.2 (31.7–38.6) |
| Very severe (n=72) | 92.3 (85.6–99.1) |
| ATS/ERS 2021 | |
| Mild (n=325) | 40.6 (35.2–45.6) |
| Moderate (n=593) | 51.2 (47.2–55.3) |
| Severe (n=136) | 47.1 (39.4–54.8) |
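
The overall rate in the table can be checked from figures reported in the text, assuming it is simply the total number of deaths divided by the total person-years of follow-up and scaled to 1000. The helper name below is hypothetical; the confidence interval is not reproduced because its method is not stated.

```python
def rate_per_1000_person_years(events: int, person_years: float) -> float:
    """Crude event rate per 1000 person-years."""
    if person_years <= 0:
        raise ValueError("person-years must be positive")
    return 1000.0 * events / person_years


# 577 deaths over 12,863 person-years, as reported in the Results.
overall_rate = rate_per_1000_person_years(577, 12_863)
print(round(overall_rate, 1))  # 44.9
```
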
Fig. 1. Kaplan–Meier survival curves depicting 15-year outcomes stratified by different classifications of airflow limitation: (A) the Global Initiative for Chronic Obstructive Lung Disease (GOLD) criteria; (B) the 2021 recommendations of the European Respiratory Society/American Thoracic Society (ERS/ATS); (C) quartiles of height-normalized FEV1 (FEV1·Ht−3).
Table 3. Unadjusted fifteen-year hazard ratios of death at several levels of airflow limitation according to different classification systems.
| | GOLD classification | ATS/ERS 2021 classification |
|---|---|---|
| Mild | ||
| HR (95% CI) | 1 (ref.) | 1 (ref.) |
| p-Value | – | – |
| Moderate | ||
| HR (95% CI) | 1.64 (1.14–2.36) | 1.28 (1.04–1.56) |
| p-Value | 0.007 | 0.018 |
| Severe | ||
| HR (95% CI) | 3.23 (2.24–4.65) | 1.45 (1.10–1.92) |
| p-Value | <0.001 | 0.009 |
| Very severe | ||
| HR (95% CI) | 4.57 (2.96–7.04) | – |
| p-Value | <0.001 | – |
Abbreviations: ATS, American Thoracic Society; CI, confidence interval; ERS, European Respiratory Society; GOLD, Global Initiative for Chronic Obstructive Lung Disease; HR, hazard ratio.
The analysis of the predictive capacity of all FEV1 indices over the 15-year follow-up revealed that, although all of them were statistically significant, the area under the curve (AUC) values were significantly higher for FEV1·Ht−2, FEV1·Ht−3, and FEV1Q than for the percentage of predicted or z-score indices, in both the medium and the long term (Fig. S2). Similarly, the hazard ratios for mortality over the 15-year follow-up period were significant for all FEV1 indices in both crude and adjusted analyses (Table S2). However, in the multivariate analysis, only FEV1·Ht−3 remained an independent predictor of mortality (adjusted hazard ratio 0.067, 95% confidence interval [CI] 0.028–0.158, p<0.001). Table S3 presents the distribution of 15-year hazard ratios of death across severity levels of airflow limitation stratified by quartiles of FEV1·Ht−3, while Fig. 1C shows the corresponding Kaplan–Meier curves.
The annual rates of change during the three-year functional assessment period were as follows: −21±19 mL/year for absolute FEV1, −0.21±7.09%/year for the percentage of predicted FEV1, −0.07±0.42/year for the z-score, −0.004±0.072 L/m2/year for FEV1·Ht−2, −0.003±0.044 L/m3/year for FEV1·Ht−3, and −0.03±0.49/year for FEV1Q. Their distributions are summarized in the histograms shown in Fig. S3.
Regarding the prognostic capacity of different indices of annual FEV1 decline, the Cox regression analysis indicated that all indices were significant predictors of 15-year mortality risk in patients with COPD, both in crude and adjusted analysis (Table 4). However, in the multivariate model including all indices of FEV1 decline, only the annual change in FEV1 expressed as a z-score was retained as an independent predictor (adjusted hazard ratio 0.104, 95% CI 0.080–0.135, p<0.001).
Table 4. Fifteen-year unadjusted and adjusted hazard ratios of death for several expressions of the annual change in forced expiratory volume at one second.
| | Crude hazard ratio (95% CI) | p-Value | Adjusted hazard ratio (95% CI)a | p-Value |
|---|---|---|---|---|
| Annual change in FEV1, mL/yr | 0.044 (0.029–0.067) | <0.001 | 0.051 (0.031–0.083) | <0.001 |
| Annual change in FEV1, % pred./yr | 0.913 (0.902–0.924) | <0.001 | 0.924 (0.911–0.937) | <0.001 |
| Annual change in z-score FEV1, 1/yr | 0.110 (0.089–0.137) | <0.001 | 0.106 (0.081–0.137) | <0.001 |
| Annual change in FEV1·Ht−2, L/m2/yr | 0.0002 (0.00005–0.0005) | <0.001 | 0.0002 (0.00004–0.00067) | <0.001 |
| Annual change in FEV1·Ht−3, L/m3/yr | 0.000001 (0.000001–0.000004) | <0.001 | 0.000001 (0.000002–0.000004) | <0.001 |
| Annual change in FEV1Q, 1/yr | 0.237 (0.196–0.288) | <0.001 | 0.303 (0.249–0.369) | <0.001 |
Abbreviations: CI, confidence interval; Ht, height.
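
The hazard ratios in the table are expressed per one full unit of annual change, which is why they look extreme for indices whose observed yearly changes are small fractions of a unit. Under the log-linear assumption of a Cox model, a hazard ratio per unit can be rescaled to a more realistic increment. The sketch below is an illustrative rescaling of a published estimate, not a reanalysis, and the helper name is hypothetical.

```python
import math


def rescale_hazard_ratio(hr_per_unit: float, delta: float) -> float:
    """Hazard ratio for a change of `delta` units, given the HR per one unit.

    Assumes the log-hazard is linear in the covariate: HR(delta) = HR(1) ** delta.
    """
    if hr_per_unit <= 0:
        raise ValueError("hazard ratio must be positive")
    return math.exp(delta * math.log(hr_per_unit))


# Adjusted HR of 0.106 per +1 z-score unit/year; rescaled to a change of
# -0.1 z-score units/year (a 0.1 steeper annual decline), giving about 1.25.
hr_steeper_decline = rescale_hazard_ratio(0.106, -0.1)
```

On this scale, each additional 0.1/year of z-score decline corresponds to roughly a 25% higher hazard, provided the log-linear assumption holds over the observed range.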
Fig. 2 illustrates the relationship between mortality and the percentile distribution of annual changes in the different FEV1 indices. Although all indices exhibit an inverse exponential relationship with mortality risk, the hazard ratio for the annual change in the z-score is higher than that observed for the other FEV1 expressions.
Fig. 2. Penalized spline models depicting the hazard ratio for all-cause mortality based on annual changes in FEV1, expressed as absolute value (A), percentage of predicted (B), z-score (C), FEV1·Ht−2 (D), FEV1·Ht−3 (E), or FEV1Q (F). The top 5% of values (≥95th percentile) for each index were used as the reference category (hazard ratio=1). Dashed curves represent 95% confidence intervals.
Similarly, analysis of the area under the ROC curves for survival prediction demonstrated that the annual change in the z-score was significantly superior to the other indices across short-, medium-, and long-term follow-up periods (Fig. 3). The Youden index of the ROC curve for the annual change in the z-score identified −0.1969/year as the optimal cut-off point. Fig. 4 presents Kaplan–Meier curves for patients whose annual z-score decline was smaller than this cut-off and for those whose decline was at least as large. These curves illustrate that patients with the greater annual decline have an approximately 4.6-fold increased risk of death over the subsequent 15 years.
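For readers less familiar with Kaplan–Meier curves, the product-limit estimator behind them can be sketched in a few lines. This is a minimal illustration with hypothetical names and toy data, not the software used in the study; in practice each group defined by the cut-off would be passed separately.

```python
def kaplan_meier(times, events):
    """Return [(time, survival)] at each distinct time with at least one death.

    `events[i]` is True for a death and False for a censored observation.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all participants whose follow-up ends at the same time t.
        while i < len(data) and data[i][0] == t:
            deaths += 1 if data[i][1] else 0
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Toy data: follow-up in years; the second participant at year 4 and the
# participant at year 10 are censored. Survival steps are 0.8, 0.6, 0.3.
curve = kaplan_meier([2, 4, 4, 7, 10], [True, True, False, True, False])
```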
Fig. 3. Change over time of area under the incidence/cumulative receiver operating characteristic curve to predict death within 15 years of different indices of annual change in post-bronchodilator FEV1. AUC, area under the curve; ROC, receiver operating characteristic. Comparisons versus z-score/year: †p<0.001, ‡p<0.01, ¶p<0.05; comparisons versus % pred./year: §p<0.001, !!p<0.01, #p<0.05.
This study offers important insights into the prognostic relevance of different ways of expressing annual FEV1 decline in patients with clinically diagnosed COPD, emphasizing the value of longitudinal trends over static baseline measures. All evaluated expressions (absolute value, percentage of predicted, z-score, FEV1·Ht−2, FEV1·Ht−3 and FEV1Q) were significantly associated with 15-year all-cause mortality, although their prognostic roles differed in multivariate analyses. Notably, the annual decline in FEV1 z-score emerged as the strongest independent predictor. The Youden-derived threshold (an annual z-score decline of at least 0.1969/year) identified patients at clearly increased risk, underscoring its potential utility as a clinical risk-stratification tool.
Until now, most assessments of the prognostic value of FEV1 expressions have relied on single baseline measurements. Our results are consistent with emerging evidence showing that height-normalized and percentile-based indices better account for physiological variability and may improve prognostic performance. Miller et al. [10] found slightly higher AUCs for FEV1Q and FEV1·Ht−3 compared with percentage-predicted FEV1 or z-scores, although their findings were influenced by a cohort in which only 4% had diagnosed COPD and by the limitations of ECSC reference equations, particularly in women and older adults. Other studies have similarly reported better discrimination with single-determination FEV1Q or FEV1·Ht−3 than with traditional reference-based expressions [9,12,14].
Height-normalized indices and FEV1Q are considered physiologically sound measures that reduce the impact of demographic and anthropometric variability. In our cohort, AUCs for FEV1Q and FEV1·Ht−3 were comparable, but Cox regression retained only FEV1·Ht−3 as an independent predictor of long-term mortality. This aligns with earlier observations, including the Framingham study, which highlighted the value of FEV1·Ht−3 in survival prediction [22]. In contrast, FEV1Q may be affected by sample-specific first percentile differences and by the fixed sex-based thresholds (0.5L for men, 0.4L for women). Although first-percentile values in our COPD cohort differed slightly (0.46L in women, 0.59L in men), recalculating FEV1Q did not improve its performance. FEV1·Ht−3 may therefore offer an advantage by better capturing differences related to sex and body size and by showing the least asymmetry relative to a normal distribution in previous analyses [10].
Our findings also reinforce the limited prognostic value of isolated z-scores, which tend to underestimate mortality risk in older COPD patients [7]. Because predicted FEV1 declines with age and observed decline approaches a physiological floor, the potential for z-score deterioration is smaller in older adults [10].
The main contribution of this study is the identification of the FEV1 expression with the highest prognostic value in longitudinal assessments. To our knowledge, no prior studies have evaluated the prognostic significance of annual declines across multiple FEV1 expressions. We show that the annual decline in FEV1 z-score more accurately reflects deviation from expected age-related trajectories, thus capturing disease-specific deterioration. The threshold of 0.1969/year supports the concept that departures from normal aging-related decline are more informative of prognosis than absolute loss. This threshold also has direct clinical implications: an annual z-score decline of at least 0.1969/year indicates a deterioration that clearly exceeds expected age-related decline and marks individuals at particularly high risk. In practice, it could serve as an actionable indicator for clinicians, prompting earlier reassessment of disease control, closer monitoring, or consideration of therapeutic adjustment or intensification. Integrating this parameter into routine longitudinal follow-up could therefore contribute to more timely and personalized management strategies for COPD patients.
In terms of feasibility, the implementation of height-normalized indices, FEV1Q, or z-score-based longitudinal measures in routine practice may appear complex; however, modern spirometry platforms and electronic health record systems increasingly incorporate automated computation of reference values and derived indices. This trend enhances their applicability not only in pulmonology but also in primary care settings. Moreover, these novel functional metrics could potentially be integrated into multidimensional tools such as the BODE index, particularly in light of evidence showing that longitudinal changes in BODE better reflect disease progression than isolated spirometric measurements [23].
The discrepancy between the optimal FEV1 expression for baseline mortality assessment and for longitudinal follow-up underscores the distinct roles of these indices. Percentage-predicted FEV1 remains widely used for baseline severity classification because of its simplicity [1], but its dependence on population averages may mask individual variability, especially in heterogeneous cohorts. In contrast, FEV1·Ht−3 normalizes lung function to body size and showed superior baseline prognostic performance in our study, likely due to its reduced susceptibility to anthropometric confounding. However, its usefulness decreases in longitudinal analyses, as changes in this index may reflect global physiological aging rather than disease-specific decline. FEV1Q, which expresses FEV1 as a fraction of its first percentile, is also valuable for identifying patients at extreme risk but lacks the sensitivity to capture subtle, clinically relevant changes over time. Conversely, z-scores – through demographic adjustment and the ability to detect deviations from expected trajectories – offer a clear advantage for monitoring disease progression and provide a more individualized assessment of functional decline in COPD.
These findings have relevant clinical implications. Firstly, the robust association between annual z-score decline and mortality underscores the potential of this metric as a routine clinical tool for risk stratification. The identification of an optimal z-score threshold for predicting mortality risk provides clinicians with a practical parameter for monitoring disease progression and tailoring interventions. Secondly, our findings reinforce the distinction between baseline and longitudinal assessment of COPD severity. At baseline, height-normalized and percentile-based indices such as FEV1·Ht−3 and FEV1Q showed better prognostic discrimination than percentage-predicted values or z-scores, suggesting that traditional reference-based expressions may not optimally capture cross-sectional airflow limitation severity. In contrast, for the longitudinal evaluation of disease progression, the annual decline in FEV1 z-score was the strongest independent predictor of fifteen-year mortality. This indicates that baseline and longitudinal assessments serve different purposes and may require different functional metrics. In COPD patients, no evidence supports maintaining an airflow limitation severity classification based on the z-score, as proposed by the recent ERS/ATS recommendation [6]. Regarding FEV1 expressed as a percentage of its predicted value, it remains to be elucidated whether its demonstrated inferiority compared with FEV1·Ht−3 or FEV1Q justifies a change in severity stratification, especially considering its widespread use and the fact that the survival curves for the GOLD classification and for FEV1·Ht−3 quartiles do not show markedly different patterns.
This study has several notable strengths, including a large and well-characterized cohort of patients with clinically diagnosed COPD, comprehensive spirometric assessments over three years, and a long follow-up of 15 years. The rigorous analytical approach – using random-coefficient models and penalized splines – allowed a robust evaluation of longitudinal changes in FEV1 indices, and the complete ascertainment of mortality outcomes further reinforces the reliability of the results. Nevertheless, several limitations must be acknowledged. First, the cohort was predominantly male, which may limit the generalizability of the findings to women. As sex-related differences in COPD progression and mortality have been described [24,25], additional studies in more diverse populations are needed to confirm these observations. Second, a potential survivor bias should be acknowledged, as the analytical cohort required at least three annual high-quality spirometries. However, comparison of included and excluded individuals showed that those not entering the longitudinal analysis had fewer respiratory treatments, lower event rates (hospitalizations and pneumonia), better baseline lung function, and lower mortality (Table S1). Thus, rather than selecting a healthier survivor subgroup, our cohort reflects a clinically more severe population that typically requires closer specialist follow-up. Third, although adjustments were made for key confounders, the observational nature of the study precludes definitive causal inferences, and some degree of residual confounding cannot be excluded. Socioeconomic status and environmental exposures were not collected systematically for all participants and therefore could not be incorporated into the multivariable models. Similarly, although baseline treatment was recorded, information on treatment intensity, duration, and changes over time was not available in a standardized manner. 
Consequently, these factors might have influenced the observed associations, although their expected impact is likely modest. Fourth, although our analyses were adjusted for the history of previous severe exacerbations, detailed information on exacerbations during follow-up and on emphysema burden was not available in a standardized manner; these factors may therefore have contributed to residual confounding in the assessment of lung function decline. Fifth, age- and disease-related reductions in height may influence the prognostic performance of height-normalized indices. Vertebral fractures affect an estimated 25% of patients with COPD [26], and in women, height loss has been independently associated with increased mortality [27]. This phenomenon could lead to an underestimation of the prognostic value of FEV1·Ht−3, which is more sensitive to such changes than FEV1% predicted, whereas FEV1Q is unaffected because it does not incorporate height. Sixth, despite rigorous quality control, variability in repeated spirometric assessments may have introduced measurement error, potentially affecting the precision of the estimated FEV1 decline rates. Finally, this study used all-cause mortality as the primary endpoint, a robust and clinically relevant outcome; however, cause-specific mortality was not systematically adjudicated. As standardized information distinguishing respiratory, cardiovascular, or other causes of death was not available for the entire cohort, cause-specific or competing-risk analyses could not be performed.
In conclusion, this study highlights the prognostic value of longitudinal changes in FEV1 indices in COPD. The annual decline in FEV1 z-score emerged as a robust predictor of 15-year mortality, offering a clinically relevant tool for risk stratification. These findings support the incorporation of longitudinal spirometric assessments into clinical practice to better characterize disease progression and optimize management strategies for patients with COPD.
CRediT authorship contribution statement
F.G.-R. had full access to all of the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis. R.G., R.C., A.G., S.R., and F.G.-R. were responsible for developing the research question. R.G., R.C., A.G., S.R., E.T., E.M.-C., E.A., J.M.P., P.P., E.D.-G., P.P.-M., E.P., C.C.-Z., and F.G.-R. were responsible for the study design and collection of data. R.G., R.C., A.G., S.R., E.P., C.C.-Z., and F.G.-R. were responsible for study management and coordination. R.G., R.C., M.G.-G., and F.G.-R. were responsible for the analysis. F.G.-R. drafted the paper. All authors have read and approved the final manuscript. All authors had final responsibility for the decision to submit for publication.
Declaration of generative AI and AI-assisted technologies in the writing process
Artificial intelligence tools were not used in the preparation of this manuscript.
Data sharing
All of the individual patient data collected during the trial will be shared. In addition, the study protocol and statistical analysis plan will be available. The data will be made available within 12 months of publication. All available data can be obtained by contacting the corresponding author (fgr01m@gmail.com). Applicants will need to provide a detailed protocol for the proposed study, the approval of an ethics committee, and information about the funding and resources available to carry out the study, and to consider inviting the original authors to participate in the re-analysis.
Funding
This study was supported by a grant from Instituto de Salud Carlos III, Spain, and co-funded by the European Union-Fondos FEDER (PI19/01612) to F. García-Río.
Conflict of interest
We declare no competing interests.