Dear Editor,
Introduction: The COVID-19 pandemic highlighted the critical need for robust crisis standards of care (CSC) protocols to handle extreme strain, when scarce resources must be rationed. Evaluating how such policies might perform in the real world remains paramount; to date, however, study of their performance has been limited to retrospective cohort designs using virtual simulations [1,2]. The Sequential Organ Failure Assessment (SOFA) score, a composite 0-24 score of organ dysfunction incorporating neurologic, pulmonary, cardiovascular, hematologic, hepatobiliary, and renal subscores, is used in nearly all CSC protocols [3-5] despite concerns that its use may exacerbate existing racial inequities [5]. Existing simulation studies have handled missing SOFA values either by imputing zero or by assuming data are missing at random and then applying complex computational statistical modeling [6,7]. These approaches may introduce significant bias, with larger consequences than missing data in other forms of medical research, because these values directly affect decisions on who receives life-sustaining therapies. Our study aims to better understand the frequency, structure, and consequences of missing data in CSC simulation studies.
Methods: We conducted a retrospective simulation study of all mechanically ventilated patients across a single New York City academic hospital system between March 1, 2020, and June 30, 2020 (surge period). SOFA scores were collected for each day of mechanical ventilation as close to 10 a.m. as possible, with a 24-hour window permitted for data inclusion. We defined the “crisis period” as beginning once 95% of the health system’s pre-pandemic ventilator supply was in use and lasting 2 weeks (crisis cohort), consistent with previously published CSC simulations under the New York State Ventilator Allocation Guidelines (NY) [8]. The “surge cohort” included all patients ventilated during the surge period, excluding those whose ventilation occurred only during the crisis period. The primary outcome was the daily frequency of missing SOFA subscores. The secondary outcome was the cumulative number of missing subscores over time throughout the surge.
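As an illustration of the primary outcome, the following minimal Python sketch tallies, for each calendar day, the fraction of patient-days on which each SOFA subscore is missing. It is not the study's analysis code: the record layout, the function name `daily_missing_frequency`, and the choice to report a fraction rather than a raw count are all assumptions made for this example.

```python
from collections import defaultdict
from typing import Dict, Iterable, Optional, Tuple

# The six SOFA subscores named in the Introduction.
SUBSCORES = ("neurologic", "pulmonary", "cardiovascular",
             "hematologic", "hepatobiliary", "renal")

# Hypothetical record layout: (day label, {subscore name: value or None}).
Record = Tuple[str, Dict[str, Optional[int]]]


def daily_missing_frequency(records: Iterable[Record]) -> Dict[str, Dict[str, float]]:
    """Return, per day, the fraction of patient-days missing each subscore."""
    missing = defaultdict(lambda: {name: 0 for name in SUBSCORES})
    totals = defaultdict(int)
    for day, subscores in records:
        totals[day] += 1
        for name in SUBSCORES:
            # An absent key and an explicit None both count as missing.
            if subscores.get(name) is None:
                missing[day][name] += 1
    return {
        day: {name: missing[day][name] / totals[day] for name in SUBSCORES}
        for day in totals
    }
```

Comparing these per-day fractions inside and outside the crisis window is one way the crisis-versus-surge contrast could be examined.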
Results: In total, 1671 patients were ventilated during the surge period: 1091 (65.3%) male, mean age 64.2 years, 887 (53.1%) COVID-19 positive, mean duration of intubation 13.5 days (Q25 1, Q50 6, Q75 17.5 days), and hospital mortality 887 (50.0%). In the crisis period, 674 patients were ventilated: 465 (69.9%; P = .09) male, mean age 63.7 years (P = .46), 571 (84.7%; P < .0001) COVID-19 positive, mean duration of intubation 19.8 days (Q25 5, Q50 12, Q75 25 days; P < .0001), and hospital mortality 395 (59.4%; P < .0001). Figure 1 depicts how frequently each of the six SOFA categories was missing throughout the surge period (primary outcome). Patients were significantly more likely to have at least one missing SOFA subscore during the crisis period than during the non-crisis surge period (P < .0001). The neurologic subscore was more frequently missing during the crisis period than during the surge period (P < .0001). Conversely, the hepatic category was more likely to be present during the crisis period (P < .0001). Figure 2 presents the distribution of the number of missing SOFA categories within the patient cohort (secondary outcome). Missing one or two categories was common during the crisis period, whereas missing more than two was rare.

Figure 1. Frequency of missing SOFA categories throughout the spring 2020 NYC COVID-19 surge.

Figure 2. Number of missing SOFA categories throughout the spring 2020 NYC COVID-19 surge.
Discussion: To our knowledge, this is the first study to specifically evaluate missing data patterns and their implications in CSC simulation studies. We found that SOFA subscores were more often missing during the period of peak clinical strain within the surge (the crisis period). This suggests that data are not missing at random but are instead associated with operational pressures during crisis periods.
Computational replacement methods, such as zero imputation, regression substitution, or multiple imputation [6,7], often assume data are missing at random, an assumption contradicted by our data. Applying a missing-at-random assumption in this setting is therefore methodologically unsound and may introduce biases that distort CSC performance and undermine fair allocation.
Zero imputation, commonly used in prior CSC simulation studies, replaces missing values with 0 and thereby risks underestimating severity for patients whose missing values would otherwise raise their SOFA score [8]. In CSC protocols using a static SOFA score, zero imputation may introduce an optimism bias [9], as missing data on the day of evaluation might underestimate illness severity and increase a patient’s triage priority. In CSC protocols relying on dynamic SOFA comparisons [10], where current and previous SOFA scores are compared to assess clinical trajectory, missing current-day data may falsely suggest improvement (optimism bias), while missing prior-day data may falsely suggest clinical deterioration (pessimism bias), potentially leading to patients being deprioritized for limited resources. Both scenarios highlight how missing information can profoundly affect resource allocation decisions.
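The two directions of bias can be made concrete with a small worked example. The Python sketch below is illustrative only: the subscore values are invented, and the helper names `sofa_zero_imputed` and `sofa_change` are hypothetical rather than taken from any CSC protocol or from the study's code.

```python
from typing import Dict, Optional

SUBSCORES = ("neurologic", "pulmonary", "cardiovascular",
             "hematologic", "hepatobiliary", "renal")


def sofa_zero_imputed(subscores: Dict[str, Optional[int]]) -> int:
    """Total SOFA score with every missing subscore counted as 0."""
    return sum(subscores.get(name) or 0 for name in SUBSCORES)


def sofa_change(previous: Dict[str, Optional[int]],
                current: Dict[str, Optional[int]]) -> int:
    """Dynamic comparison: negative means apparent improvement."""
    return sofa_zero_imputed(current) - sofa_zero_imputed(previous)


# Invented patient whose true score is 10 on both days (no real change).
full_day = {"neurologic": 3, "pulmonary": 3, "cardiovascular": 2,
            "hematologic": 0, "hepatobiliary": 1, "renal": 1}
neuro_missing = dict(full_day, neurologic=None)

# Static score: the missing neurologic subscore lowers the total from 10 to 7.
static_score = sofa_zero_imputed(neuro_missing)

# Missing current-day data: apparent change of -3 (false improvement).
optimistic = sofa_change(full_day, neuro_missing)

# Missing prior-day data: apparent change of +3 (false deterioration).
pessimistic = sofa_change(neuro_missing, full_day)
```

In this sketch the patient's condition is unchanged, yet the apparent trajectory swings by three points in either direction depending solely on which day's neurologic subscore is absent.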
Importantly, our study suggests that missing data are likely to occur and will affect real CSC triage decisions. The neurologic subscore, most frequently missing during crisis periods, may be under-reported due to factors such as examination limitations under continuous neuromuscular blockade, efforts to limit provider exposure, or diagnostic time constraints when patients arrive in extremis [8]. We were surprised that hepatobiliary subscores were more often available during the crisis period; we hypothesize that this relatively higher availability may reflect automated lab ordering or protocolized care rather than clinical necessity.
Our pragmatic approach to missing data, which imputes missing SOFA scores from the closest available time point and falls back to zero imputation if none is available, acknowledges that perfect data may not be feasible during crisis conditions. While this too could introduce bias, it better reflects operational realities.
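A minimal Python sketch of this two-step rule for a single patient's daily series of one subscore follows. It is an assumed reading of the approach, not the authors' implementation: the function name `impute_nearest_then_zero` is hypothetical, distance is measured in whole days, and ties between an equally close earlier and later day are broken toward the earlier day, a detail the text does not specify.

```python
from typing import List, Optional, Sequence


def impute_nearest_then_zero(daily: Sequence[Optional[int]]) -> List[int]:
    """Fill each missing day from the closest observed day, else use 0.

    `daily` holds one subscore per ventilator day, with None for missing.
    """
    observed = [i for i, value in enumerate(daily) if value is not None]
    filled: List[int] = []
    for i, value in enumerate(daily):
        if value is not None:
            filled.append(value)
        elif not observed:
            # No observation at any time point: fall back to zero imputation.
            filled.append(0)
        else:
            # Closest observed day; ties go to the earlier day (assumption).
            nearest = min(observed, key=lambda j: (abs(j - i), j))
            filled.append(daily[nearest])
    return filled


# Example: days 0, 2, and 3 are missing; days 1 and 4 are observed.
example = impute_nearest_then_zero([None, 2, None, None, 4])
# example == [2, 2, 2, 4, 4]
```

Only a patient with no observation of a given subscore on any day receives the zero fallback, which keeps zero imputation as a last resort rather than the default.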
In summary, missing data in CSC simulations are common, non-random, and can affect triage and reallocation decisions for lifesaving therapies. A thorough understanding of their frequency and patterns is essential for accurately evaluating current CSC performance. Beyond addressing missingness, maintaining accurate, consistent, and timely data input is equally critical to ensuring the integrity and fairness of CSCs under real-world crisis conditions [11]. Together, these efforts form the foundation for designing future CSCs that ensure equitable, accurate, and effective triage decisions.
Jianan Zhu, MS
Deepak Pradhan, MD, MHPE, FCCP, ATSF
I. Obi Emeruwa, MD, PhD, MBA
B. Corbett Walsh, MD, MBE
Competing interests
The authors have no conflicts of interest to disclose.