Cerner FHIR Pipeline
An end-to-end clinical data pipeline that extracts patient data from FHIR servers, screens ICU encounters for sepsis risk using SIRS criteria, and applies machine learning models for risk stratification.
Pipeline Architecture
Pull patient, encounter, observation, and condition resources from a FHIR R4 server (Cerner open sandbox) or generate a synthetic ICU cohort with realistic vital signs and lab values.
Flatten FHIR bundles into tabular Parquet files, load into a DuckDB analytical warehouse, and build a screening-ready dataset joining encounter-level vitals and lab results.
Apply SIRS-based sepsis criteria (temperature, heart rate, respiratory rate, WBC count) plus organ-dysfunction markers (lactate). Flag encounters meeting ≥2 SIRS criteria with elevated lactate.
Train Logistic Regression and Random Forest classifiers to predict sepsis. Compute composite risk scores, rank encounters, and assign risk categories (High / Moderate / Low-Moderate / Low).
Visualise results in this interactive web application with summary metrics, interactive Plotly charts, model comparison views, and a publication-ready clinical research report.
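The flattening step described above can be illustrated with a minimal sketch. It assumes Observation resources carry a single `valueQuantity` and an `effectiveDateTime`, which is common in FHIR R4 but not guaranteed. The function name `flatten_observations` and the output column names are hypothetical and not taken from the pipeline itself.

```python
from typing import Any


def flatten_observations(bundle: dict[str, Any]) -> list[dict[str, Any]]:
    """Turn a FHIR Bundle into one flat row per Observation resource.

    Hypothetical helper: the column names are illustrative only.
    """
    rows = []
    for entry in bundle.get("entry", []):
        resource = entry.get("resource", {})
        if resource.get("resourceType") != "Observation":
            continue  # skip Patient, Encounter, Condition, etc.
        codings = resource.get("code", {}).get("coding", [])
        first = codings[0] if codings else {}
        quantity = resource.get("valueQuantity", {})
        rows.append({
            "observation_id": resource.get("id"),
            "patient_ref": resource.get("subject", {}).get("reference"),
            "encounter_ref": resource.get("encounter", {}).get("reference"),
            "code": first.get("code"),
            "display": first.get("display"),
            "value": quantity.get("value"),
            "unit": quantity.get("unit"),
            "effective": resource.get("effectiveDateTime"),
        })
    return rows


# Made-up sample bundle for illustration (not real patient data).
SAMPLE_BUNDLE = {
    "resourceType": "Bundle",
    "entry": [
        {"resource": {"resourceType": "Patient", "id": "p1"}},
        {"resource": {
            "resourceType": "Observation",
            "id": "obs1",
            "subject": {"reference": "Patient/p1"},
            "encounter": {"reference": "Encounter/e1"},
            "code": {"coding": [{"system": "http://loinc.org",
                                 "code": "8867-4",
                                 "display": "Heart rate"}]},
            "valueQuantity": {"value": 112, "unit": "beats/min"},
            "effectiveDateTime": "2024-01-01T08:00:00Z",
        }},
    ],
}

rows = flatten_observations(SAMPLE_BUNDLE)
```

Rows in this shape map directly onto table columns, so they could be written to Parquet and loaded into the warehouse by whatever tooling the pipeline uses.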
Data Sources
Use the toggle in the navigation bar to switch between sources. All pages update immediately, with no re-processing needed.
Synthetic ICU Cohort
patients · encounters · sepsis-flagged
Programmatically generated with realistic vital-sign distributions and a ~50% sepsis prevalence for model training.
Live FHIR: Cerner Sandbox
30 patients · 265 encounters · 0 sepsis-flagged
Real clinical structure from Cerner's open FHIR R4 endpoint (fhir-open.cerner.com). Demonstrates the pipeline on production-format data.
Sepsis Screening Criteria
An encounter is flagged when ≥2 SIRS criteria are met and lactate indicates organ dysfunction.
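As a rough sketch of this rule, the snippet below counts SIRS criteria and checks lactate. The cut-offs are the conventional SIRS definitions (temperature > 38 °C or < 36 °C, heart rate > 90/min, respiratory rate > 20/min, WBC > 12 or < 4 ×10³/µL), and the lactate threshold of 2.0 mmol/L is an assumption. The pipeline's actual thresholds may differ, and the class and function names are hypothetical.

```python
from dataclasses import dataclass

# Assumed threshold for "elevated lactate"; not confirmed by the pipeline.
LACTATE_THRESHOLD_MMOL_L = 2.0


@dataclass
class EncounterVitals:
    """Hypothetical per-encounter summary of vitals and labs."""
    temperature_c: float
    heart_rate: float        # beats per minute
    respiratory_rate: float  # breaths per minute
    wbc_k_per_ul: float      # white blood cells, x10^3 per microlitre
    lactate_mmol_l: float


def sirs_count(v: EncounterVitals) -> int:
    """Count how many of the four SIRS criteria are met (conventional cut-offs)."""
    return sum([
        v.temperature_c > 38.0 or v.temperature_c < 36.0,
        v.heart_rate > 90,
        v.respiratory_rate > 20,
        v.wbc_k_per_ul > 12.0 or v.wbc_k_per_ul < 4.0,
    ])


def is_flagged(v: EncounterVitals) -> bool:
    """Flag when >=2 SIRS criteria are met AND lactate is elevated."""
    return sirs_count(v) >= 2 and v.lactate_mmol_l > LACTATE_THRESHOLD_MMOL_L


febrile = EncounterVitals(38.6, 112, 24, 13.5, 3.1)
stable = EncounterVitals(37.0, 80, 16, 7.0, 1.0)
print(sirs_count(febrile), is_flagged(febrile))
print(sirs_count(stable), is_flagged(stable))
```

Keeping the SIRS count separate from the lactate check makes it easy to report how many criteria an encounter met even when it is not flagged.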
SIRS Criteria
Organ Dysfunction
Page Guide
API Endpoints
JSON APIs are available at /api/metrics and /api/risk-summary for programmatic access. Both accept the ?source= query parameter.
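A minimal client sketch for these endpoints follows. The base URL `http://localhost:5000` and the source value `synthetic` are assumptions made for illustration; the accepted `?source=` values and the response schema are not documented here, so the code only builds the URL and decodes whatever JSON comes back.

```python
import json
from urllib.parse import urlencode, urljoin
from urllib.request import urlopen

# Hypothetical base URL; replace with wherever the application is served.
BASE_URL = "http://localhost:5000"


def api_url(endpoint: str, source: str) -> str:
    """Build a URL such as <base>/api/metrics?source=<source>."""
    return f"{urljoin(BASE_URL, endpoint)}?{urlencode({'source': source})}"


def fetch_json(endpoint: str, source: str, timeout: float = 10.0):
    """GET the endpoint and decode the JSON body (schema not assumed)."""
    with urlopen(api_url(endpoint, source), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# "synthetic" is an assumed source value, used only as an example.
metrics_url = api_url("/api/metrics", "synthetic")
risk_url = api_url("/api/risk-summary", "synthetic")
```

With the application running, `fetch_json("/api/metrics", "synthetic")` would return the decoded payload for that source.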