Cerner FHIR Pipeline

An end-to-end clinical data pipeline that extracts patient data from FHIR servers, screens ICU encounters for sepsis risk using SIRS criteria, and applies machine learning models for risk stratification.

FHIR R4 · DuckDB · scikit-learn · Plotly · FastAPI

Pipeline Architecture

Extract

Pull patient, encounter, observation, and condition resources from a FHIR R4 server (Cerner open sandbox) or generate a synthetic ICU cohort with realistic vital signs and lab values.
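FHIR search results arrive as paged Bundles, so an extractor has to follow the "next" link until none remains. The sketch below shows one way to do that. The function name `iter_bundle_resources` and the injected `fetch_json` callable are hypothetical and not taken from this project; only the Bundle fields (`entry`, `resource`, `link`, `relation`, `url`) come from the FHIR R4 specification.

```python
from typing import Callable, Iterator


def iter_bundle_resources(
    first_url: str,
    fetch_json: Callable[[str], dict],
) -> Iterator[dict]:
    """Yield every resource from a paged FHIR searchset Bundle.

    `fetch_json` is a caller-supplied function (hypothetical) that takes a
    URL and returns the decoded JSON Bundle, which keeps this sketch
    independent of any particular HTTP client.
    """
    url = first_url
    while url:
        bundle = fetch_json(url)
        for entry in bundle.get("entry", []):
            resource = entry.get("resource")
            if resource is not None:
                yield resource
        # A Bundle advertises its next page as a link with relation "next";
        # when no such link exists, the search is exhausted.
        url = next(
            (
                link["url"]
                for link in bundle.get("link", [])
                if link.get("relation") == "next"
            ),
            None,
        )
```

In practice `fetch_json` would wrap an HTTP GET that requests `application/fhir+json`; passing it in as a parameter also makes the paging logic easy to exercise against in-memory fixtures.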

Transform

Flatten FHIR bundles into tabular Parquet files, load into a DuckDB analytical warehouse, and build a screening-ready dataset joining encounter-level vitals and lab results.
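As an illustration of the flattening step, the sketch below turns one FHIR Observation resource into a flat row suitable for a tabular file. The column names and the helpers `ref_id` and `flatten_observation` are assumptions made for this example; the project's actual schema is not shown here. The resource fields used (`code.coding`, `valueQuantity`, `subject`, `encounter`, `effectiveDateTime`) are standard FHIR R4.

```python
from typing import Optional


def ref_id(reference: Optional[dict]) -> Optional[str]:
    """Return the trailing id of a FHIR reference such as 'Encounter/123'."""
    if not reference or "reference" not in reference:
        return None
    return reference["reference"].rsplit("/", 1)[-1]


def flatten_observation(obs: dict) -> dict:
    """Flatten one Observation into a single row (hypothetical column names)."""
    # Take the first coding only; a fuller version might prefer a LOINC coding.
    coding = (obs.get("code", {}).get("coding") or [{}])[0]
    quantity = obs.get("valueQuantity", {})
    return {
        "observation_id": obs.get("id"),
        "patient_id": ref_id(obs.get("subject")),
        "encounter_id": ref_id(obs.get("encounter")),
        "code": coding.get("code"),
        "display": coding.get("display"),
        "value": quantity.get("value"),
        "unit": quantity.get("unit"),
        "effective_time": obs.get("effectiveDateTime"),
    }
```

Rows of this shape can be written to Parquet and then queried from DuckDB, for example through its `read_parquet` table function, with `encounter_id` serving as the join key for the screening dataset.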

Sepsis Screening

Apply SIRS-based sepsis criteria (temperature, heart rate, respiratory rate, WBC count) plus organ-dysfunction markers (lactate). Flag encounters meeting ≥2 SIRS criteria with elevated lactate.

ML Risk Stratification

Train Logistic Regression and Random Forest classifiers to predict sepsis. Compute composite risk scores, rank encounters, and assign risk categories (High / Moderate / Low-Moderate / Low).
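The scoring and ranking step can be sketched without the models themselves. The example below assumes the composite score is the plain mean of the two classifiers' predicted probabilities and uses illustrative cut-offs of 0.75, 0.50, and 0.25; both the averaging rule and the thresholds are assumptions for this sketch, not values taken from the pipeline.

```python
from typing import Dict, List, Tuple


def composite_score(p_logreg: float, p_forest: float) -> float:
    """Average the two model probabilities (assumed combination rule)."""
    return (p_logreg + p_forest) / 2.0


def risk_category(score: float) -> str:
    """Map a score in [0, 1] to a category using hypothetical thresholds."""
    if score >= 0.75:
        return "High"
    if score >= 0.50:
        return "Moderate"
    if score >= 0.25:
        return "Low-Moderate"
    return "Low"


def rank_encounters(
    probabilities: Dict[str, Tuple[float, float]],
) -> List[Tuple[str, float, str]]:
    """Return (encounter_id, score, category) tuples, highest risk first.

    `probabilities` maps an encounter id to the pair
    (logistic regression probability, random forest probability).
    """
    scored = [
        (enc_id, composite_score(p_lr, p_rf))
        for enc_id, (p_lr, p_rf) in probabilities.items()
    ]
    scored.sort(key=lambda item: item[1], reverse=True)
    return [(enc_id, score, risk_category(score)) for enc_id, score in scored]
```

With scikit-learn, the input pairs would typically come from the positive-class column of each fitted model's `predict_proba` output.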

Dashboard & Reporting

Visualise results in this web application through summary metrics, interactive Plotly charts, model comparison views, and a publication-ready clinical research report.

Data Sources

Use the toggle in the navigation bar to switch between sources. All pages update instantly — no re-processing needed.

Synthetic ICU Cohort

patients · encounters · sepsis-flagged

Programmatically generated with realistic vital-sign distributions and a ~50% sepsis prevalence for model training.
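A minimal sketch of how such a cohort could be generated is shown below. Every distribution parameter here is a hypothetical placeholder chosen for illustration, not the project's actual generator settings and not clinically validated; only the target prevalence of about 50% comes from the description above.

```python
import random
from typing import Dict, List

# Hypothetical (mean, standard deviation) pairs per measurement.
SEPTIC = {
    "temp_c": (38.6, 0.7),
    "heart_rate": (112.0, 15.0),
    "resp_rate": (24.0, 4.0),
    "wbc": (15.0, 4.0),      # k/μL
    "lactate": (3.2, 1.0),   # mmol/L
}
NON_SEPTIC = {
    "temp_c": (37.0, 0.4),
    "heart_rate": (80.0, 12.0),
    "resp_rate": (16.0, 3.0),
    "wbc": (8.0, 2.0),
    "lactate": (1.2, 0.4),
}


def generate_cohort(n: int, prevalence: float = 0.5, seed: int = 0) -> List[Dict]:
    """Generate n synthetic encounters with roughly the given sepsis prevalence."""
    rng = random.Random(seed)  # seeded so the cohort is reproducible
    cohort = []
    for i in range(n):
        septic = rng.random() < prevalence
        params = SEPTIC if septic else NON_SEPTIC
        row = {"encounter_id": f"enc-{i:05d}", "septic": septic}
        for name, (mean, sd) in params.items():
            # Clamp at zero so sampled values never go negative.
            row[name] = round(max(0.0, rng.gauss(mean, sd)), 1)
        cohort.append(row)
    return cohort
```

A balanced prevalence like this is far higher than in real ICU populations; it is a convenience for training classifiers and should not be read as an epidemiological estimate.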

Live FHIR — Cerner Sandbox

30 patients · 265 encounters · 0 sepsis-flagged

Real clinical structure from Cerner's open FHIR R4 endpoint (fhir-open.cerner.com). Demonstrates the pipeline on production-format data.

Sepsis Screening Criteria

An encounter is flagged when ≥2 SIRS criteria are met and lactate indicates organ dysfunction.

SIRS Criteria

Temp (high): ≥ 38.0 °C

Temp (low): ≤ 36.0 °C

Heart rate: ≥ 90 bpm

Resp rate: ≥ 20 /min

WBC (high): ≥ 12.0 k/μL

WBC (low): ≤ 4.0 k/μL

Organ Dysfunction

Lactate: ≥ 2.0 mmol/L
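The thresholds above translate directly into a small screening function. The sketch below makes two assumptions that the page does not spell out: the high and low temperature limits count together as one criterion (likewise for WBC), giving four criteria in total, and a missing measurement is treated as "not met". The class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class EncounterVitals:
    temp_c: Optional[float] = None      # °C
    heart_rate: Optional[float] = None  # bpm
    resp_rate: Optional[float] = None   # breaths per minute
    wbc: Optional[float] = None         # k/μL
    lactate: Optional[float] = None     # mmol/L


def sirs_count(v: EncounterVitals) -> int:
    """Count how many of the four SIRS criteria are met (missing = not met)."""
    criteria = [
        v.temp_c is not None and (v.temp_c >= 38.0 or v.temp_c <= 36.0),
        v.heart_rate is not None and v.heart_rate >= 90,
        v.resp_rate is not None and v.resp_rate >= 20,
        v.wbc is not None and (v.wbc >= 12.0 or v.wbc <= 4.0),
    ]
    return sum(criteria)


def is_sepsis_flagged(v: EncounterVitals) -> bool:
    """Flag when at least 2 SIRS criteria are met and lactate is >= 2.0 mmol/L."""
    return sirs_count(v) >= 2 and v.lactate is not None and v.lactate >= 2.0
```

For example, an encounter with a temperature of 38.5 °C, heart rate of 110 bpm, and lactate of 2.4 mmol/L meets two criteria plus the lactate threshold and is flagged, while the same vitals with a lactate of 1.5 mmol/L are not.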

Page Guide

📊

Dashboard

High-level summary: patient & encounter counts, sepsis screening results, risk distribution charts, model performance snapshot, and top-risk encounters table.

👥

Patients

Browse individual patient records extracted from the FHIR server — ID, name, gender, and date of birth.

🏥

Encounters

Lists ICU encounter records with status and admission and discharge timestamps, each linked to a patient ID.

🧬

ML Analysis

Deep dive into model performance: Logistic Regression vs Random Forest comparison, ROC & PR curves, feature importance rankings, and risk stratification breakdown.

📈

Biostatistics

Kaplan-Meier survival curves, Cox PH hazard ratios, Bayesian logistic regression with credible intervals, and SHAP model explainability.

📝

Clinical Report

Publication-style research report with abstract, methods, Table 1 baseline characteristics, statistical tests, cohort flow diagram, and full results narrative.

📱

Mobile View

Lightweight, phone & tablet-friendly dashboard. Fetches live data from the JSON API with instant source switching and a refresh button.

⚙️

API Endpoints

JSON APIs at /api/metrics and /api/risk-summary for programmatic access. Both accept the ?source= parameter.
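A client call might look like the sketch below. The endpoint paths and the `source` parameter name come from the description above; the base URL `http://localhost:8000` and the value `synthetic` are assumptions, since the accepted source values and the response schema are not documented here.

```python
import json
from urllib.parse import urlencode, urljoin
from urllib.request import urlopen


def api_url(base_url: str, endpoint: str, source: str) -> str:
    """Build an endpoint URL carrying the ?source= query parameter."""
    return urljoin(base_url, endpoint) + "?" + urlencode({"source": source})


def fetch_json(url: str, timeout: float = 10.0):
    """GET a URL and decode the JSON body (response shape is not assumed)."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    # Hypothetical local deployment and source value.
    url = api_url("http://localhost:8000", "/api/metrics", "synthetic")
    print(fetch_json(url))
```

The same helper works for `/api/risk-summary`; only the endpoint argument changes.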

Technology Stack

Python 3.11+: Core language

FastAPI: Web framework

DuckDB: Analytical warehouse

Parquet: Columnar storage

scikit-learn: ML models

Plotly: Interactive charts

Jinja2: Templating

Bootstrap 5: UI framework

FHIR R4: Data standard