Automated Sepsis Screening in the ICU
A FHIR-Based Clinical Decision Support Pipeline with Machine Learning Risk Stratification
Synthetic ICU Cohort Generated 2026-03-10
Abstract
Background: Sepsis remains a leading cause of mortality in intensive care units worldwide, with delayed recognition contributing to preventable deaths. Automated screening tools integrated with electronic medical records offer the potential to improve early detection and time-to-treatment.
Methods: We developed an automated sepsis screening and risk stratification pipeline using data extracted via the HL7 FHIR R4 standard from a Cerner EMR system. The pipeline applies SIRS criteria augmented with organ dysfunction markers. Two machine learning classifiers (logistic regression and random forest) were trained and evaluated using stratified 5-fold cross-validation.
Results: The cohort comprised 230 ICU encounters. The SIRS-based screening algorithm flagged 53.5% of encounters as sepsis-positive. The random forest classifier achieved an AUC of 0.9985 with F1 of 0.984, while logistic regression achieved an AUC of 0.9580 with F1 of 0.882.
Conclusion: An automated, FHIR-interoperable pipeline can effectively screen for sepsis and stratify patient risk using routinely collected ICU data.
1. Introduction
Sepsis, defined as life-threatening organ dysfunction caused by a dysregulated host response to infection, affects approximately 49 million people annually worldwide and accounts for nearly 20% of all global deaths [1]. In intensive care units, sepsis and septic shock remain the primary drivers of morbidity, extended length of stay, and mortality.
Early identification is critical. Each hour of delay in appropriate antimicrobial therapy has been associated with measurable increases in mortality [2]. The HL7 FHIR standard provides a modern, RESTful API framework for extracting structured clinical observations from EMR systems such as Cerner Millennium [3].
Machine learning approaches have shown promise in augmenting rule-based sepsis screening, with systematic reviews reporting pooled AUC values of 0.85 for in-ICU prediction models [4].
2. Methods
2.1 Data Source and Extraction
Clinical observations were extracted from the Synthetic ICU Cohort via the HL7 FHIR R4 standard. The following LOINC-coded observations were retrieved:
| Observation | LOINC Code | Unit |
|---|---|---|
| Body Temperature | 8310-5 | °C |
| Heart Rate | 8867-4 | bpm |
| Respiratory Rate | 9279-1 | breaths/min |
| WBC Count | 6690-2 | 10³/µL |
| Lactate | 2524-7 | mmol/L |
| Systolic BP | 8480-6 | mmHg |
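The extraction step can be illustrated with a minimal sketch that builds a FHIR `Observation` search for the LOINC codes listed above and flattens the returned Bundle. This is not the pipeline's actual code: the function names, the short observation names, and the base URL in the usage note are assumptions made for illustration, and the HTTP request itself is left out so the example stays self-contained.

```python
from urllib.parse import urlencode

LOINC_SYSTEM = "http://loinc.org"

# LOINC codes from the table above; the short names are illustrative labels.
LOINC_CODES = {
    "8310-5": "body_temperature",
    "8867-4": "heart_rate",
    "9279-1": "respiratory_rate",
    "6690-2": "wbc_count",
    "2524-7": "lactate",
    "8480-6": "systolic_bp",
}


def build_observation_query(base_url, patient_id):
    """Build a FHIR search URL for one patient's screening observations."""
    # A comma-separated list of system|code tokens is an OR search in FHIR.
    codes = ",".join(f"{LOINC_SYSTEM}|{code}" for code in LOINC_CODES)
    params = urlencode({"patient": patient_id, "code": codes})
    return f"{base_url.rstrip('/')}/Observation?{params}"


def parse_bundle(bundle):
    """Flatten a FHIR searchset Bundle into (name, value, unit) tuples."""
    rows = []
    for entry in bundle.get("entry", []):
        obs = entry.get("resource", {})
        if obs.get("resourceType") != "Observation":
            continue
        quantity = obs.get("valueQuantity")
        if quantity is None:
            continue  # skip observations without a numeric value
        for coding in obs.get("code", {}).get("coding", []):
            name = LOINC_CODES.get(coding.get("code"))
            if coding.get("system") == LOINC_SYSTEM and name:
                rows.append((name, quantity.get("value"), quantity.get("unit")))
                break
    return rows
```

For example, `build_observation_query("https://fhir.example.org/r4", "123")` (a placeholder server and patient id) yields a single search URL covering all six codes, and `parse_bundle` can be applied to the JSON body of the response.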
2.2 Study Population
| Characteristic | Sepsis-Positive | Sepsis-Negative | p-value |
|---|---|---|---|
| N | 123 | 107 | |
| Temperature (°C) | 37.8 ± 1.1 | 37.4 ± 0.7 | 0.0006 |
| Heart rate (bpm) | 102.0 ± 22.4 | 88.3 ± 21.5 | <0.0001 |
| Respiratory rate (breaths/min) | 24.9 ± 6.5 | 21.0 ± 7.5 | <0.0001 |
| WBC (10³/µL) | 11.0 ± 4.1 | 9.6 ± 3.6 | 0.0025 |
| Lactate (mmol/L) | 4.2 ± 1.3 | 2.7 ± 1.9 | <0.0001 |
| Systolic BP (mmHg) | 105.0 ± 27.2 | 110.2 ± 27.4 | 0.1255 |
2.3 Sepsis Screening Algorithm
The automated screening implements a modified SIRS-based approach with organ dysfunction assessment [5].
| Criterion | Threshold |
|---|---|
| Temperature | > 38.0°C or < 36.0°C |
| Heart rate | > 90 bpm |
| Respiratory rate | > 20 breaths/min |
| WBC count | > 12.0 or < 4.0 × 10³/µL |
| Serum lactate | ≥ 2.0 mmol/L |
| Systolic BP | < 90 mmHg |
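The thresholds above can be expressed as a small rule function. The sketch below is illustrative only and rests on two assumptions not stated in the text: that SIRS positivity means at least two of the four SIRS criteria (the conventional definition from [5]), and that the organ dysfunction requirement is met by either the lactate or the systolic BP criterion. The dictionary keys and function names are hypothetical.

```python
# Thresholds copied from the table above. A configurable pipeline would load
# these from a settings file; the key names here are hypothetical.
THRESHOLDS = {
    "temp_high": 38.0,         # °C
    "temp_low": 36.0,          # °C
    "heart_rate": 90.0,        # bpm
    "respiratory_rate": 20.0,  # breaths/min
    "wbc_high": 12.0,          # 10^3/µL
    "wbc_low": 4.0,            # 10^3/µL
    "lactate": 2.0,            # mmol/L
    "systolic_bp": 90.0,       # mmHg
}


def sirs_count(obs, t=THRESHOLDS):
    """Count how many of the four SIRS criteria are met."""
    return sum([
        obs["temperature"] > t["temp_high"] or obs["temperature"] < t["temp_low"],
        obs["heart_rate"] > t["heart_rate"],
        obs["respiratory_rate"] > t["respiratory_rate"],
        obs["wbc"] > t["wbc_high"] or obs["wbc"] < t["wbc_low"],
    ])


def organ_dysfunction(obs, t=THRESHOLDS):
    """True if lactate is elevated or systolic BP is low."""
    return obs["lactate"] >= t["lactate"] or obs["systolic_bp"] < t["systolic_bp"]


def screen(obs, t=THRESHOLDS, min_sirs=2):
    """Flag an encounter as sepsis-positive.

    Assumption: SIRS-positive means >= min_sirs criteria, combined with at
    least one organ dysfunction marker.
    """
    return sirs_count(obs, t) >= min_sirs and organ_dysfunction(obs, t)
```

Passing the thresholds as an argument keeps the rule logic separate from the cut-off values, so the values can be changed without touching the code.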
3. Results
3.1 Cohort Characteristics
A total of 230 encounters were included. The SIRS criteria were met in 156 encounters (67.8%). After applying the organ dysfunction requirement, 123 encounters (53.5%) were flagged as sepsis-positive.
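The classifiers were evaluated with stratified 5-fold cross-validation, which keeps the 123/107 class split roughly constant in every fold. The following is a plain-Python sketch of how such folds can be formed; it is not the pipeline's implementation, and the function name and seed are assumptions chosen for illustration.

```python
import random


def stratified_kfold(labels, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs with class proportions preserved."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]

    # Group sample indices by class label.
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)

    # Shuffle within each class, then deal indices round-robin so every
    # fold receives a near-equal share of that class.
    for indices in by_class.values():
        rng.shuffle(indices)
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)

    for i in range(k):
        test_idx = sorted(folds[i])
        train_idx = sorted(
            j for f, fold in enumerate(folds) if f != i for j in fold
        )
        yield train_idx, test_idx


# Class balance reported above: 123 sepsis-positive, 107 sepsis-negative.
labels = [1] * 123 + [0] * 107
for train_idx, test_idx in stratified_kfold(labels):
    positives = sum(labels[i] for i in test_idx)
    print(len(test_idx), positives, len(test_idx) - positives)
```

Each held-out fold then contains 24 or 25 positive and 21 or 22 negative encounters, so no fold is evaluated on a class mix that differs much from the full cohort.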
3.2 SIRS Distribution
3.3 Machine Learning Model Performance
| Model | AUC | F1 | Precision | Recall | Accuracy |
|---|---|---|---|---|---|
| Logistic Regression | 0.9580 | 0.882 | 0.8852 | 0.8780 | 0.8739 |
| Random Forest | 0.9985 | 0.984 | 0.9917 | 0.9756 | 0.9826 |
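As a consistency check on the table, F1 is the harmonic mean of precision and recall, so the reported F1 values can be recomputed from the other two columns. The helper below is a minimal sketch written for this check; the function name is hypothetical.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the table above.
reported = {
    "Logistic Regression": (0.8852, 0.8780),
    "Random Forest": (0.9917, 0.9756),
}

for model, (precision, recall) in reported.items():
    print(f"{model}: F1 = {f1_score(precision, recall):.3f}")
# Logistic Regression: F1 = 0.882
# Random Forest: F1 = 0.984
```

Both recomputed values match the F1 column to three decimal places.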
3.4 Feature Importance
3.5 Risk Stratification
4. Discussion
This study demonstrates the feasibility of an automated, FHIR-interoperable sepsis screening pipeline that integrates rule-based clinical criteria with machine learning risk stratification. The pipeline successfully extracted, transformed, and analysed clinical observations from a Cerner EMR system.
Strengths: (1) Full automation from data extraction through risk scoring; (2) FHIR R4 interoperability, which should in principle allow deployment on any compliant EMR; (3) YAML-configurable screening thresholds; (4) Reproducible analysis with version-controlled code.
Limitations: The analysis was conducted on synthetic data lacking the complexity of real clinical environments. External validation on de-identified clinical datasets is required before deployment.
5. Conclusion
An automated, FHIR-based sepsis screening and risk stratification pipeline integrating established clinical criteria with machine learning classifiers demonstrates strong technical performance. The critical next step is external validation on real-world clinical data.
References
[1] Singer M, Deutschman CS, Seymour CW, et al. The Third International Consensus Definitions for Sepsis and Septic Shock (Sepsis-3). JAMA. 2016;315(8):801-810.
[2] Seymour CW, Gesten F, Prescott HC, et al. Time to Treatment and Mortality during Mandated Emergency Care for Sepsis. N Engl J Med. 2017;376(23):2235-2244.
[3] Mandel JC, Kreda DA, Mandl KD, Kohane IS, Ramoni RB. SMART on FHIR: a standards-based, interoperable apps platform for electronic health records. J Am Med Inform Assoc. 2016;23(5):899-908.
[4] Fleuren LM, Klausch TLT, Zwager CL, et al. Machine learning for the prediction of sepsis: a systematic review and meta-analysis. Intensive Care Med. 2020;46(3):383-400.
[5] Bone RC, Balk RA, Cerra FB, et al. Definitions for sepsis and organ failure and guidelines for the use of innovative therapies in sepsis. Chest. 1992;101(6):1644-1655.