Database Open Access
Clinical data from the MIMIC-II database for a case study on indwelling arterial catheters
Published: Oct. 28, 2016. Version: 1.0
When using this resource, please cite:
Raffa, J. (2016). Clinical data from the MIMIC-II database for a case study on indwelling arterial catheters (version 1.0). PhysioNet. https://doi.org/10.13026/C2NC7F.
Please include the standard citation for PhysioNet:
Goldberger, A., Amaral, L., Glass, L., Hausdorff, J., Ivanov, P. C., Mark, R., ... & Stanley, H. E. (2000). PhysioBank, PhysioToolkit, and PhysioNet: Components of a new research resource for complex physiologic signals. Circulation [Online]. 101 (23), pp. e215–e220.
This dataset was created for a case study in the book Secondary Analysis of Electronic Health Records, published by Springer in 2016. In particular, the dataset was used throughout Chapter 16 (Data Analysis) by Raffa J. et al. to investigate the effect of indwelling arterial catheters on mortality in hemodynamically stable patients with respiratory failure. The dataset is derived from MIMIC-II, the publicly accessible critical care database. It contains summary clinical data and outcomes for 1,776 patients.
Indwelling arterial catheters (IACs) are used extensively in the ICU for hemodynamic monitoring and for blood gas analysis. IAC use also poses potentially serious risks, including bloodstream infections and vascular complications. In 2015, Hsu et al. published a study assessing whether IAC use was associated with mortality in patients who were mechanically ventilated and did not require vasopressor support.
The original study was later used as a case study in Chapter 16 of Secondary Analysis of Electronic Health Records, published by Springer in 2016. The dataset shared here was recreated for the purpose of the textbook. It was extracted from MIMIC-II, the publicly available critical care database.
The dataset was generated using code available on GitHub and archived in this repository. In total, 46 variables were extracted from MIMIC-II, including demographics (e.g., age, weight), clinical observations collected during the first day of ICU stay (e.g., white blood cell count, heart rate), and outcomes (e.g., 28-day mortality and length of stay). It is shared as a comma-separated value (CSV) file, along with a data dictionary.
The dataset (`full_cohort_data.csv`) is a comma-separated value file that includes a header with descriptive variable names. `day_28_flg` was the main outcome of interest, while `aline_flg` was the primary covariate of interest. There is an accompanying data dictionary (`data_dictionary.txt`), which is reproduced below:
aline_flg: IAC used (binary, 1 = yes, 0 = no)
icu_los_day: length of stay in ICU (days, numeric)
hospital_los_day: length of stay in hospital (days, numeric)
age: age at baseline (years, numeric)
gender_num: patient gender (1 = male, 0 = female)
weight_first: first weight (kg, numeric)
bmi: patient BMI (numeric)
sapsi_first: first SAPS I score (numeric)
sofa_first: first SOFA score (numeric)
service_unit: type of service unit (character: FICU, MICU, SICU)
service_num: service as a numeric (binary: 0 = MICU or FICU, 1 = SICU)
day_icu_intime: day of week of ICU admission (character)
day_icu_intime_num: day of week of ICU admission (numeric, corresponds with day_icu_intime)
hour_icu_intime: hour of ICU admission (numeric, hour of admission using 24hr clock)
hosp_exp_flg: death in hospital (binary: 1 = yes, 0 = no)
icu_exp_flg: death in ICU (binary: 1 = yes, 0 = no)
day_28_flg: death within 28 days (binary: 1 = yes, 0 = no)
mort_day_censored: day post ICU admission of censoring or death (days, numeric)
censor_flg: censored or death (binary: 0 = death, 1 = censored)
sepsis_flg: sepsis present (binary: 0 = no, 1 = yes -- absent (0) for all)
chf_flg: Congestive heart failure (binary: 0 = no, 1 = yes)
afib_flg: Atrial fibrillation (binary: 0 = no, 1 = yes)
renal_flg: Chronic renal disease (binary: 0 = no, 1 = yes)
liver_flg: Liver Disease (binary: 0 = no, 1 = yes)
copd_flg: Chronic obstructive pulmonary disease (binary: 0 = no, 1 = yes)
cad_flg: Coronary artery disease (binary: 0 = no, 1 = yes)
stroke_flg: Stroke (binary: 0 = no, 1 = yes)
mal_flg: Malignancy (binary: 0 = no, 1 = yes)
resp_flg: Respiratory disease (non-COPD) (binary: 0 = no, 1 = yes)
map_1st: Mean arterial pressure (mmHg, numeric)
hr_1st: Heart Rate (numeric)
temp_1st: Temperature (F, numeric)
spo2_1st: SpO2 (%, numeric)
abg_count: arterial blood gas count (number of tests, numeric)
wbc_first: first White blood cell count (K/uL, numeric)
hgb_first: first Hemoglobin (g/dL, numeric)
platelet_first: first Platelets (K/uL, numeric)
sodium_first: first Sodium (mEq/L, numeric)
potassium_first: first Potassium (mEq/L, numeric)
tco2_first: first Bicarbonate (mEq/L, numeric)
chloride_first: first Chloride (mEq/L, numeric)
bun_first: first Blood urea nitrogen (mg/dL, numeric)
creatinine_first: first Creatinine (mg/dL, numeric)
po2_first: first PaO2 (mmHg, numeric)
pco2_first: first PaCO2 (mmHg, numeric)
iv_day_1: input fluids by IV on day 1 (mL, numeric)
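As a minimal sketch of how the data dictionary maps onto code, the example below reads rows in the documented layout using only the Python standard library. It is an illustration, not the analysis code from the textbook. The helper names (`parse_number`, `load_cohort`, `add_event_indicator`), the assumption that missing values appear as empty fields or `NA`, and the sample rows (made up, not real patient data) are all hypothetical. The one point taken directly from the dictionary is that `censor_flg` is coded 0 = death, 1 = censored, which is the reverse of the event indicator most survival routines expect.

```python
import csv
import io

MISSING = {"", "NA"}  # assumption: tokens that mark a missing value


def parse_number(field):
    """Return the field as a float, or None when it is missing."""
    field = field.strip()
    return None if field in MISSING else float(field)


def load_cohort(handle, numeric_columns):
    """Read CSV rows from a file object, converting the named columns to floats."""
    rows = []
    for raw in csv.DictReader(handle):
        row = dict(raw)
        for name in numeric_columns:
            row[name] = parse_number(raw[name])
        rows.append(row)
    return rows


def add_event_indicator(rows):
    """Add `event` (1 = death, 0 = censored), the reverse of censor_flg."""
    for row in rows:
        flag = row["censor_flg"]
        row["event"] = None if flag is None else 1 - int(flag)
    return rows


# Tiny made-up sample in the documented layout (not real patient data).
SAMPLE = """aline_flg,age,day_28_flg,mort_day_censored,censor_flg,bmi
1,72.4,0,731,1,NA
0,55.1,1,12.5,0,27.3
1,80.0,1,3,0,
"""

COLUMNS = ["aline_flg", "age", "day_28_flg", "mort_day_censored", "censor_flg", "bmi"]
cohort = add_event_indicator(load_cohort(io.StringIO(SAMPLE), COLUMNS))
```

To run this on the real file, the `io.StringIO(SAMPLE)` argument would be replaced by an open handle on `full_cohort_data.csv`, and `COLUMNS` extended to the numeric variables of interest.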
aline-mimic-ii-master.zip is a zipped file containing the SQL code used to extract the dataset from MIMIC-II, as well as sample code for analysis and visualization.
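The archive holds the actual analysis code. Purely as a rough illustration of a first-pass summary relating the primary covariate (`aline_flg`) to the main outcome (`day_28_flg`), the sketch below builds an unadjusted 2x2 table and a crude odds ratio. This is not the method used in the chapter or in the original study, which adjust for confounding; a crude odds ratio should not be read as an effect estimate. The function names and the sample rows are hypothetical, and rows with a missing exposure or outcome are simply skipped.

```python
import csv
import io


def two_by_two(rows, exposure="aline_flg", outcome="day_28_flg"):
    """Count rows in each (exposure, outcome) cell, skipping non-0/1 values."""
    counts = {(e, o): 0 for e in (0, 1) for o in (0, 1)}
    for row in rows:
        e, o = row[exposure].strip(), row[outcome].strip()
        if e in ("0", "1") and o in ("0", "1"):
            counts[(int(e), int(o))] += 1
    return counts


def crude_summary(counts):
    """Return the outcome proportion per group and the unadjusted odds ratio."""
    risk = {}
    for e in (0, 1):
        total = counts[(e, 0)] + counts[(e, 1)]
        risk[e] = counts[(e, 1)] / total if total else None
    denom = counts[(1, 0)] * counts[(0, 1)]
    odds_ratio = counts[(1, 1)] * counts[(0, 0)] / denom if denom else None
    return risk, odds_ratio


# Made-up rows for illustration only (not real patient data).
SAMPLE = """aline_flg,day_28_flg
1,1
1,0
1,0
1,0
0,1
0,1
0,0
0,0
NA,1
"""

table = two_by_two(csv.DictReader(io.StringIO(SAMPLE)))
risk, odds_ratio = crude_summary(table)
```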
The primary use of this dataset is to carry out the case study in Chapter 16 of Secondary Analysis of Electronic Health Records. The case study walks the reader through the process of examining the effect of indwelling arterial catheters (IACs) on 28-day mortality in the intensive care unit (ICU) in patients who were mechanically ventilated during the first day of ICU admission. Sample R code for analyzing and visualizing the data is provided in the textbook and archived in
We would like to thank the authors of the original paper (Hsu DJ, Feng M, Kothari R, Zhou H, Chen KP, Celi LA) for allowing us to base our book chapter on their work.
Conflicts of Interest
The authors have no conflicts of interest to declare.
- Raffa J.D., Ghassemi M., Naumann T., Feng M., Hsu D. (2016) Data Analysis. In: Secondary Analysis of Electronic Health Records. Springer, Cham.
- Hsu DJ, Feng M, Kothari R, Zhou H, Chen KP, Celi LA. The association between indwelling arterial catheters and mortality in hemodynamically stable patients with respiratory failure: A propensity score analysis. Chest, 148(6):1470–1476, Aug. 2015. http://doi.org/10.1378/chest.15-0516
- M. Saeed, M. Villarroel, A.T. Reisner, G. Clifford, L. Lehman, G.B. Moody, T. Heldt, T.H. Kyaw, B.E. Moody, R.G. Mark. Multiparameter intelligent monitoring in intensive care II (MIMIC-II): A public-access ICU database. Critical Care Medicine 39(5):952-960 (2011 May); http://doi.org/10.1097/CCM.0b013e31820a92c6.
- Code for Chapter 16 of Secondary Analysis of Electronic Health Records. https://github.com/MIT-LCP/aline-mimic-ii [Accessed 27 May 2020]
Anyone can access the files, as long as they conform to the terms of the specified license.
License (for files):
Open Data Commons Attribution License v1.0
Total uncompressed size: 6.9 MB.
Access the files
- Download the ZIP file (6.7 MB)
- Download the files using your terminal:
wget -r -N -c -np https://physionet.org/files/mimic2-iaccd/1.0/
|Name|Size|Modified|
|LICENSE.txt|19.9 KB|2020-05-27|
|SHA256SUMS.txt|339 B|2020-05-27|
|aline-mimic-ii-master.zip|6.6 MB|2020-05-27|
|data_dictionary.txt|2.5 KB|2020-05-27|
|full_cohort_data.csv|288.7 KB|2020-05-27|
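After downloading, the files can be checked against `SHA256SUMS.txt`. The sketch below assumes the manifest uses the common `sha256sum` layout (one line per file: hex digest, whitespace, file name); that layout is an assumption and is not stated on this page. The function names are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_checksums(directory, manifest="SHA256SUMS.txt"):
    """Map each listed file to True (match), False (mismatch), or None (not found)."""
    directory = Path(directory)
    results = {}
    for line in (directory / manifest).read_text().splitlines():
        if not line.strip():
            continue
        # Assumed layout: "<hex digest> <file name>", as written by sha256sum.
        expected, name = line.split(maxsplit=1)
        name = name.lstrip("*").strip()
        target = directory / name
        if target.is_file():
            results[name] = sha256_of(target) == expected.lower()
        else:
            results[name] = None
    return results
```

Calling `verify_checksums` on the directory created by the `wget` command returns one entry per file named in the manifest; any value other than `True` suggests the download should be repeated.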