2018 IEEE BHI and BSN Data Challenge 1.0

File: <base>/notebooks/challenge-demo.ipynb (182,200 bytes)
{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Python demo for the 2018 BHI & BSN Data Challenge\n",
    "\n",
    "This notebook provides a simple introduction to analysing the MIMIC-III database. It was created as a demonstrator for the [2018 BHI & BSN Data Challenge](https://mimic.physionet.org/events/bhibsn-challenge/), which explores the following question:\n",
    "\n",
    "> Are patients admitted to the intensive care unit (ICU) on a weekend more likely to die in the hospital than those admitted on a weekday?\n",
    "\n",
    "We have provided an example slide template for final presentations (`slide-template.pptx`) at: https://github.com/MIT-LCP/bhi-bsn-challenge. There is no obligation to use it!\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Background on MIMIC-III\n",
    "\n",
    "MIMIC-III (‘Medical Information Mart for Intensive Care’) is a large, single-center database comprising information relating to patients admitted to critical care units at a large tertiary care hospital. \n",
    "\n",
    "Data includes vital signs, medications, laboratory measurements, observations and notes charted by care providers, fluid balance, procedure codes, diagnostic codes, imaging reports, hospital length of stay, survival data, and more. \n",
    "\n",
    "For details, see: https://mimic.physionet.org/. The data is downloaded as 26 CSV files, which can then be loaded into a database system. Scripts for loading the data into Postgres are provided in the [MIMIC Code Repository](https://mimic.physionet.org/gettingstarted/dbsetup/). A demo dataset is also available: https://mimic.physionet.org/gettingstarted/demo/\n",
    "\n",
    "Points to note:\n",
    "\n",
    "- A patient-level shift has been applied to dates. Day of week is retained. \n",
    "- Patients aged >89 years on first admission have been reassigned an age of ~300 years.\n",
    "- Patients may have multiple hospital admissions. Each hospital admission may comprise multiple ICU stays (e.g. a patient may visit the ICU, leave for surgery, and then return to the ICU for recovery, all within a single hospital admission).\n",
    "\n",
    "If you need help getting set up with access to MIMIC-III, please contact `data-challenge@physionet.org`.\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Import libraries"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "/usr/local/lib/python2.7/site-packages/statsmodels/compat/pandas.py:56: FutureWarning: The pandas.core.datetools module is deprecated and will be removed in a future version. Please use the pandas.tseries module instead.\n",
      "  from pandas.core import datetools\n"
     ]
    }
   ],
   "source": [
    "# Data processing libraries\n",
    "import pandas as pd\n",
    "import numpy as np\n",
    "import itertools\n",
    "\n",
    "# Database libraries\n",
    "import psycopg2\n",
    "\n",
    "# Stats libraries\n",
    "from tableone import TableOne\n",
    "import statsmodels.api as sm\n",
    "import statsmodels.formula.api as smf\n",
    "import scipy.stats\n",
    "\n",
    "# Image libraries\n",
    "# https://jakevdp.github.io/pdvega/\n",
    "# jupyter nbextension enable vega3 --py --sys-prefix\n",
    "import matplotlib.pyplot as plt\n",
    "import pdvega \n",
    "%matplotlib inline"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Connect to the MIMIC-III database\n",
    "\n",
    "If you have created an instance of the MIMIC-III database, then you should be able to connect with the following settings (or similar). If you need help getting set up with access to MIMIC-III, please contact `data-challenge@physionet.org`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Database connection settings: replace the 'XXX' placeholders with your own credentials\n",
    "user = 'XXX'\n",
    "password = 'XXX'\n",
    "host = 'localhost'\n",
    "dbname = 'mimic'\n",
    "schema = 'public, mimiciii_demo'"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Password:········\n"
     ]
    }
   ],
   "source": [
    "# Connect to the database\n",
    "con = psycopg2.connect(dbname=dbname, user=user, host=host, \n",
    "                       password=password)\n",
    "cur = con.cursor()\n",
    "cur.execute('SET search_path to {}'.format(schema))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Extract data from MIMIC-III and assign to a Pandas DataFrame\n",
    "\n",
    "The following query extracts a simple dataset from the MIMIC-III database, comprising demographics, hospital and ICU admission times, and a severity of illness score ([OASIS](https://www.ncbi.nlm.nih.gov/pubmed/23660729)).\n",
    "\n",
    "Before running this query, you must first build the `icustay_detail` and `oasis` materialized views. Code for building these views is available in the MIMIC Code Repository:\n",
    "- `icustay_detail`: https://github.com/MIT-LCP/mimic-code/tree/master/concepts/demographics\n",
    "- `oasis`: https://github.com/MIT-LCP/mimic-code/tree/master/concepts/severityscores\n",
    "\n",
    "You will notice that our example restricts the analysis to:\n",
    "\n",
    "- first hospital admissions\n",
    "- patients aged `>= 16` years at the time of hospital admission\n",
    "- the first ICU stay (patients may move to the ICU multiple times within a hospital stay)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Run query and assign the results to a Pandas DataFrame\n",
    "# Requires the icustay_detail view from:\n",
    "# https://github.com/MIT-LCP/mimic-code/tree/master/concepts/demographics\n",
    "# And the OASIS score from:\n",
    "# https://github.com/MIT-LCP/mimic-code/tree/master/concepts/severityscores\n",
    "query = \\\n",
    "\"\"\"\n",
    "WITH first_icu AS (\n",
    "    SELECT i.subject_id, i.hadm_id, i.icustay_id, i.gender, i.admittime admittime_hospital, \n",
    "      i.dischtime dischtime_hospital, i.los_hospital, i.age, i.admission_type, \n",
    "      i.hospital_expire_flag, i.intime intime_icu, i.outtime outtime_icu, i.los_icu, \n",
    "      s.first_careunit\n",
    "    FROM icustay_detail i\n",
    "    LEFT JOIN icustays s\n",
    "    ON i.icustay_id = s.icustay_id\n",
    "    WHERE i.hospstay_seq = 1\n",
    "      AND i.icustay_seq = 1\n",
    "      AND i.age >= 16\n",
    ")\n",
    "SELECT f.*, o.icustay_expire_flag, o.oasis, o.oasis_prob\n",
    "FROM first_icu f\n",
    "LEFT JOIN oasis o\n",
    "ON f.icustay_id = o.icustay_id;\n",
    "\"\"\"\n",
    "\n",
    "data = pd.read_sql_query(query,con)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Check the extracted data\n",
    "\n",
    "It is always a good idea to inspect the data after you have extracted it. We will list the column names, look at the first five patients (rows), and then compute summary statistics, which include the number of non-missing values in each column."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Index([u'subject_id', u'hadm_id', u'icustay_id', u'gender',\n",
       "       u'admittime_hospital', u'dischtime_hospital', u'los_hospital', u'age',\n",
       "       u'admission_type', u'hospital_expire_flag', u'intime_icu',\n",
       "       u'outtime_icu', u'los_icu', u'first_careunit', u'icustay_expire_flag',\n",
       "       u'oasis', u'oasis_prob'],\n",
       "      dtype='object')"
      ]
     },
     "execution_count": 9,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "data.columns"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>subject_id</th>\n",
       "      <th>hadm_id</th>\n",
       "      <th>icustay_id</th>\n",
       "      <th>gender</th>\n",
       "      <th>admittime_hospital</th>\n",
       "      <th>dischtime_hospital</th>\n",
       "      <th>los_hospital</th>\n",
       "      <th>age</th>\n",
       "      <th>admission_type</th>\n",
       "      <th>hospital_expire_flag</th>\n",
       "      <th>intime_icu</th>\n",
       "      <th>outtime_icu</th>\n",
       "      <th>los_icu</th>\n",
       "      <th>first_careunit</th>\n",
       "      <th>icustay_expire_flag</th>\n",
       "      <th>oasis</th>\n",
       "      <th>oasis_prob</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>10076</td>\n",
       "      <td>198503</td>\n",
       "      <td>201006</td>\n",
       "      <td>M</td>\n",
       "      <td>2107-03-21 21:16:00</td>\n",
       "      <td>2107-03-30 12:00:00</td>\n",
       "      <td>8.6139</td>\n",
       "      <td>68.8636</td>\n",
       "      <td>EMERGENCY</td>\n",
       "      <td>1</td>\n",
       "      <td>2107-03-24 04:06:14</td>\n",
       "      <td>2107-03-31 06:55:09</td>\n",
       "      <td>7.1173</td>\n",
       "      <td>MICU</td>\n",
       "      <td>1</td>\n",
       "      <td>42</td>\n",
       "      <td>0.305849</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>42321</td>\n",
       "      <td>114648</td>\n",
       "      <td>201204</td>\n",
       "      <td>F</td>\n",
       "      <td>2121-12-07 20:49:00</td>\n",
       "      <td>2121-12-12 16:40:00</td>\n",
       "      <td>4.8271</td>\n",
       "      <td>80.5627</td>\n",
       "      <td>EMERGENCY</td>\n",
       "      <td>0</td>\n",
       "      <td>2121-12-07 20:50:36</td>\n",
       "      <td>2121-12-09 18:43:58</td>\n",
       "      <td>1.9121</td>\n",
       "      <td>CSRU</td>\n",
       "      <td>0</td>\n",
       "      <td>37</td>\n",
       "      <td>0.188911</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>10045</td>\n",
       "      <td>126949</td>\n",
       "      <td>203766</td>\n",
       "      <td>F</td>\n",
       "      <td>2129-11-24 00:31:00</td>\n",
       "      <td>2129-12-01 01:45:00</td>\n",
       "      <td>7.0514</td>\n",
       "      <td>68.6669</td>\n",
       "      <td>EMERGENCY</td>\n",
       "      <td>1</td>\n",
       "      <td>2129-11-24 22:46:57</td>\n",
       "      <td>2129-12-01 06:03:55</td>\n",
       "      <td>6.3034</td>\n",
       "      <td>MICU</td>\n",
       "      <td>1</td>\n",
       "      <td>48</td>\n",
       "      <td>0.486353</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>10104</td>\n",
       "      <td>177678</td>\n",
       "      <td>204201</td>\n",
       "      <td>F</td>\n",
       "      <td>2120-08-24 17:39:00</td>\n",
       "      <td>2120-08-31 13:12:00</td>\n",
       "      <td>6.8146</td>\n",
       "      <td>70.5196</td>\n",
       "      <td>EMERGENCY</td>\n",
       "      <td>0</td>\n",
       "      <td>2120-08-24 23:47:23</td>\n",
       "      <td>2120-08-25 15:41:49</td>\n",
       "      <td>0.6628</td>\n",
       "      <td>MICU</td>\n",
       "      <td>0</td>\n",
       "      <td>29</td>\n",
       "      <td>0.077479</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>10017</td>\n",
       "      <td>199207</td>\n",
       "      <td>204881</td>\n",
       "      <td>F</td>\n",
       "      <td>2149-05-26 17:19:00</td>\n",
       "      <td>2149-06-03 18:42:00</td>\n",
       "      <td>8.0576</td>\n",
       "      <td>73.6792</td>\n",
       "      <td>EMERGENCY</td>\n",
       "      <td>0</td>\n",
       "      <td>2149-05-29 18:52:29</td>\n",
       "      <td>2149-05-31 22:19:17</td>\n",
       "      <td>2.1436</td>\n",
       "      <td>CCU</td>\n",
       "      <td>0</td>\n",
       "      <td>30</td>\n",
       "      <td>0.087098</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   subject_id  hadm_id  icustay_id gender  admittime_hospital  \\\n",
       "0       10076   198503      201006      M 2107-03-21 21:16:00   \n",
       "1       42321   114648      201204      F 2121-12-07 20:49:00   \n",
       "2       10045   126949      203766      F 2129-11-24 00:31:00   \n",
       "3       10104   177678      204201      F 2120-08-24 17:39:00   \n",
       "4       10017   199207      204881      F 2149-05-26 17:19:00   \n",
       "\n",
       "   dischtime_hospital  los_hospital      age admission_type  \\\n",
       "0 2107-03-30 12:00:00        8.6139  68.8636      EMERGENCY   \n",
       "1 2121-12-12 16:40:00        4.8271  80.5627      EMERGENCY   \n",
       "2 2129-12-01 01:45:00        7.0514  68.6669      EMERGENCY   \n",
       "3 2120-08-31 13:12:00        6.8146  70.5196      EMERGENCY   \n",
       "4 2149-06-03 18:42:00        8.0576  73.6792      EMERGENCY   \n",
       "\n",
       "   hospital_expire_flag          intime_icu         outtime_icu  los_icu  \\\n",
       "0                     1 2107-03-24 04:06:14 2107-03-31 06:55:09   7.1173   \n",
       "1                     0 2121-12-07 20:50:36 2121-12-09 18:43:58   1.9121   \n",
       "2                     1 2129-11-24 22:46:57 2129-12-01 06:03:55   6.3034   \n",
       "3                     0 2120-08-24 23:47:23 2120-08-25 15:41:49   0.6628   \n",
       "4                     0 2149-05-29 18:52:29 2149-05-31 22:19:17   2.1436   \n",
       "\n",
       "  first_careunit  icustay_expire_flag  oasis  oasis_prob  \n",
       "0           MICU                    1     42    0.305849  \n",
       "1           CSRU                    0     37    0.188911  \n",
       "2           MICU                    1     48    0.486353  \n",
       "3           MICU                    0     29    0.077479  \n",
       "4            CCU                    0     30    0.087098  "
      ]
     },
     "execution_count": 10,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "data.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>count</th>\n",
       "      <th>mean</th>\n",
       "      <th>std</th>\n",
       "      <th>min</th>\n",
       "      <th>25%</th>\n",
       "      <th>50%</th>\n",
       "      <th>75%</th>\n",
       "      <th>max</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>subject_id</th>\n",
       "      <td>99.0</td>\n",
       "      <td>26324.373737</td>\n",
       "      <td>16202.635658</td>\n",
       "      <td>10006.000000</td>\n",
       "      <td>10068.000000</td>\n",
       "      <td>40124.000000</td>\n",
       "      <td>42278.000000</td>\n",
       "      <td>44228.000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>hadm_id</th>\n",
       "      <td>99.0</td>\n",
       "      <td>151749.454545</td>\n",
       "      <td>28975.138680</td>\n",
       "      <td>100375.000000</td>\n",
       "      <td>127326.000000</td>\n",
       "      <td>157466.000000</td>\n",
       "      <td>174868.000000</td>\n",
       "      <td>199395.000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>icustay_id</th>\n",
       "      <td>99.0</td>\n",
       "      <td>249167.949495</td>\n",
       "      <td>27983.121905</td>\n",
       "      <td>201006.000000</td>\n",
       "      <td>224042.000000</td>\n",
       "      <td>246080.000000</td>\n",
       "      <td>271795.500000</td>\n",
       "      <td>298685.000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>los_hospital</th>\n",
       "      <td>99.0</td>\n",
       "      <td>10.006785</td>\n",
       "      <td>14.103571</td>\n",
       "      <td>0.038200</td>\n",
       "      <td>3.543050</td>\n",
       "      <td>6.830600</td>\n",
       "      <td>11.922250</td>\n",
       "      <td>123.984700</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>age</th>\n",
       "      <td>99.0</td>\n",
       "      <td>89.077015</td>\n",
       "      <td>64.855919</td>\n",
       "      <td>17.192000</td>\n",
       "      <td>65.709250</td>\n",
       "      <td>76.931900</td>\n",
       "      <td>85.233050</td>\n",
       "      <td>300.003400</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>hospital_expire_flag</th>\n",
       "      <td>99.0</td>\n",
       "      <td>0.323232</td>\n",
       "      <td>0.470091</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>1.000000</td>\n",
       "      <td>1.000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>los_icu</th>\n",
       "      <td>99.0</td>\n",
       "      <td>4.586355</td>\n",
       "      <td>6.677401</td>\n",
       "      <td>0.105900</td>\n",
       "      <td>1.123700</td>\n",
       "      <td>2.014100</td>\n",
       "      <td>4.597000</td>\n",
       "      <td>35.406500</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>icustay_expire_flag</th>\n",
       "      <td>99.0</td>\n",
       "      <td>0.222222</td>\n",
       "      <td>0.417855</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>1.000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>oasis</th>\n",
       "      <td>99.0</td>\n",
       "      <td>34.747475</td>\n",
       "      <td>8.657123</td>\n",
       "      <td>12.000000</td>\n",
       "      <td>29.000000</td>\n",
       "      <td>34.000000</td>\n",
       "      <td>39.000000</td>\n",
       "      <td>56.000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>oasis_prob</th>\n",
       "      <td>99.0</td>\n",
       "      <td>0.194093</td>\n",
       "      <td>0.162787</td>\n",
       "      <td>0.009522</td>\n",
       "      <td>0.077479</td>\n",
       "      <td>0.137099</td>\n",
       "      <td>0.231102</td>\n",
       "      <td>0.724202</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                      count           mean           std            min  \\\n",
       "subject_id             99.0   26324.373737  16202.635658   10006.000000   \n",
       "hadm_id                99.0  151749.454545  28975.138680  100375.000000   \n",
       "icustay_id             99.0  249167.949495  27983.121905  201006.000000   \n",
       "los_hospital           99.0      10.006785     14.103571       0.038200   \n",
       "age                    99.0      89.077015     64.855919      17.192000   \n",
       "hospital_expire_flag   99.0       0.323232      0.470091       0.000000   \n",
       "los_icu                99.0       4.586355      6.677401       0.105900   \n",
       "icustay_expire_flag    99.0       0.222222      0.417855       0.000000   \n",
       "oasis                  99.0      34.747475      8.657123      12.000000   \n",
       "oasis_prob             99.0       0.194093      0.162787       0.009522   \n",
       "\n",
       "                                25%            50%            75%  \\\n",
       "subject_id             10068.000000   40124.000000   42278.000000   \n",
       "hadm_id               127326.000000  157466.000000  174868.000000   \n",
       "icustay_id            224042.000000  246080.000000  271795.500000   \n",
       "los_hospital               3.543050       6.830600      11.922250   \n",
       "age                       65.709250      76.931900      85.233050   \n",
       "hospital_expire_flag       0.000000       0.000000       1.000000   \n",
       "los_icu                    1.123700       2.014100       4.597000   \n",
       "icustay_expire_flag        0.000000       0.000000       0.000000   \n",
       "oasis                     29.000000      34.000000      39.000000   \n",
       "oasis_prob                 0.077479       0.137099       0.231102   \n",
       "\n",
       "                                max  \n",
       "subject_id             44228.000000  \n",
       "hadm_id               199395.000000  \n",
       "icustay_id            298685.000000  \n",
       "los_hospital             123.984700  \n",
       "age                      300.003400  \n",
       "hospital_expire_flag       1.000000  \n",
       "los_icu                   35.406500  \n",
       "icustay_expire_flag        1.000000  \n",
       "oasis                     56.000000  \n",
       "oasis_prob                 0.724202  "
      ]
     },
     "execution_count": 11,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "data.describe().T"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Add day of week to DataFrame\n",
    "\n",
    "To examine the weekend effect, we need the day of the week, but the extracted data contains only timestamps. We will define a weekend as any time from Saturday (00:00:00) to Sunday (23:59:59). The dates above look odd because a patient-level shift has been applied to them, but the shift preserves the day of the week."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>subject_id</th>\n",
       "      <th>hadm_id</th>\n",
       "      <th>icustay_id</th>\n",
       "      <th>gender</th>\n",
       "      <th>admittime_hospital</th>\n",
       "      <th>dischtime_hospital</th>\n",
       "      <th>los_hospital</th>\n",
       "      <th>age</th>\n",
       "      <th>admission_type</th>\n",
       "      <th>hospital_expire_flag</th>\n",
       "      <th>...</th>\n",
       "      <th>los_icu</th>\n",
       "      <th>first_careunit</th>\n",
       "      <th>icustay_expire_flag</th>\n",
       "      <th>oasis</th>\n",
       "      <th>oasis_prob</th>\n",
       "      <th>admitday_hospital</th>\n",
       "      <th>dischday_hospital</th>\n",
       "      <th>inday_icu</th>\n",
       "      <th>inday_icu_seq</th>\n",
       "      <th>outday_icu</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>27513</td>\n",
       "      <td>163557</td>\n",
       "      <td>200003</td>\n",
       "      <td>M</td>\n",
       "      <td>2199-08-02 17:02:00</td>\n",
       "      <td>2199-08-22 19:00:00</td>\n",
       "      <td>20.0819</td>\n",
       "      <td>48.2960</td>\n",
       "      <td>EMERGENCY</td>\n",
       "      <td>0</td>\n",
       "      <td>...</td>\n",
       "      <td>5.8884</td>\n",
       "      <td>SICU</td>\n",
       "      <td>0</td>\n",
       "      <td>35</td>\n",
       "      <td>0.152892</td>\n",
       "      <td>Friday</td>\n",
       "      <td>Thursday</td>\n",
       "      <td>Friday</td>\n",
       "      <td>4</td>\n",
       "      <td>Thursday</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>20707</td>\n",
       "      <td>129310</td>\n",
       "      <td>200007</td>\n",
       "      <td>M</td>\n",
       "      <td>2109-02-17 10:02:00</td>\n",
       "      <td>2109-02-20 15:47:00</td>\n",
       "      <td>3.2396</td>\n",
       "      <td>43.3450</td>\n",
       "      <td>EMERGENCY</td>\n",
       "      <td>0</td>\n",
       "      <td>...</td>\n",
       "      <td>1.2914</td>\n",
       "      <td>CCU</td>\n",
       "      <td>0</td>\n",
       "      <td>26</td>\n",
       "      <td>0.054187</td>\n",
       "      <td>Sunday</td>\n",
       "      <td>Wednesday</td>\n",
       "      <td>Sunday</td>\n",
       "      <td>6</td>\n",
       "      <td>Monday</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>9514</td>\n",
       "      <td>127229</td>\n",
       "      <td>200014</td>\n",
       "      <td>M</td>\n",
       "      <td>2105-02-16 23:15:00</td>\n",
       "      <td>2105-02-21 13:46:00</td>\n",
       "      <td>4.6049</td>\n",
       "      <td>84.7300</td>\n",
       "      <td>EMERGENCY</td>\n",
       "      <td>0</td>\n",
       "      <td>...</td>\n",
       "      <td>1.7338</td>\n",
       "      <td>SICU</td>\n",
       "      <td>0</td>\n",
       "      <td>56</td>\n",
       "      <td>0.724202</td>\n",
       "      <td>Monday</td>\n",
       "      <td>Saturday</td>\n",
       "      <td>Monday</td>\n",
       "      <td>0</td>\n",
       "      <td>Wednesday</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>21789</td>\n",
       "      <td>112486</td>\n",
       "      <td>200019</td>\n",
       "      <td>F</td>\n",
       "      <td>2178-07-08 09:02:00</td>\n",
       "      <td>2178-07-11 06:45:00</td>\n",
       "      <td>2.9049</td>\n",
       "      <td>82.8831</td>\n",
       "      <td>EMERGENCY</td>\n",
       "      <td>1</td>\n",
       "      <td>...</td>\n",
       "      <td>3.0594</td>\n",
       "      <td>CCU</td>\n",
       "      <td>1</td>\n",
       "      <td>47</td>\n",
       "      <td>0.454600</td>\n",
       "      <td>Wednesday</td>\n",
       "      <td>Saturday</td>\n",
       "      <td>Wednesday</td>\n",
       "      <td>2</td>\n",
       "      <td>Saturday</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>41710</td>\n",
       "      <td>181955</td>\n",
       "      <td>200028</td>\n",
       "      <td>M</td>\n",
       "      <td>2133-10-29 10:00:00</td>\n",
       "      <td>2133-11-01 14:54:00</td>\n",
       "      <td>3.2042</td>\n",
       "      <td>64.8677</td>\n",
       "      <td>ELECTIVE</td>\n",
       "      <td>0</td>\n",
       "      <td>...</td>\n",
       "      <td>2.9038</td>\n",
       "      <td>CCU</td>\n",
       "      <td>0</td>\n",
       "      <td>35</td>\n",
       "      <td>0.152892</td>\n",
       "      <td>Thursday</td>\n",
       "      <td>Sunday</td>\n",
       "      <td>Thursday</td>\n",
       "      <td>3</td>\n",
       "      <td>Sunday</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>5 rows × 22 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "   subject_id  hadm_id  icustay_id gender  admittime_hospital  \\\n",
       "0       27513   163557      200003      M 2199-08-02 17:02:00   \n",
       "1       20707   129310      200007      M 2109-02-17 10:02:00   \n",
       "2        9514   127229      200014      M 2105-02-16 23:15:00   \n",
       "3       21789   112486      200019      F 2178-07-08 09:02:00   \n",
       "4       41710   181955      200028      M 2133-10-29 10:00:00   \n",
       "\n",
       "   dischtime_hospital  los_hospital      age admission_type  \\\n",
       "0 2199-08-22 19:00:00       20.0819  48.2960      EMERGENCY   \n",
       "1 2109-02-20 15:47:00        3.2396  43.3450      EMERGENCY   \n",
       "2 2105-02-21 13:46:00        4.6049  84.7300      EMERGENCY   \n",
       "3 2178-07-11 06:45:00        2.9049  82.8831      EMERGENCY   \n",
       "4 2133-11-01 14:54:00        3.2042  64.8677       ELECTIVE   \n",
       "\n",
       "   hospital_expire_flag    ...     los_icu first_careunit  \\\n",
       "0                     0    ...      5.8884           SICU   \n",
       "1                     0    ...      1.2914            CCU   \n",
       "2                     0    ...      1.7338           SICU   \n",
       "3                     1    ...      3.0594            CCU   \n",
       "4                     0    ...      2.9038            CCU   \n",
       "\n",
       "   icustay_expire_flag oasis  oasis_prob  admitday_hospital  \\\n",
       "0                    0    35    0.152892             Friday   \n",
       "1                    0    26    0.054187             Sunday   \n",
       "2                    0    56    0.724202             Monday   \n",
       "3                    1    47    0.454600          Wednesday   \n",
       "4                    0    35    0.152892           Thursday   \n",
       "\n",
       "   dischday_hospital  inday_icu inday_icu_seq outday_icu  \n",
       "0           Thursday     Friday             4   Thursday  \n",
       "1          Wednesday     Sunday             6     Monday  \n",
       "2           Saturday     Monday             0  Wednesday  \n",
       "3           Saturday  Wednesday             2   Saturday  \n",
       "4             Sunday   Thursday             3     Sunday  \n",
       "\n",
       "[5 rows x 22 columns]"
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    }
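   ],
   "source": [
    "# Derive day-of-week columns from the date-shifted timestamps.\n",
    "# dt.day_name() (pandas >= 0.23) replaces dt.weekday_name, which was removed in pandas 1.0.\n",
    "data['admitday_hospital'] = data.admittime_hospital.dt.day_name()\n",
    "data['dischday_hospital'] = data.dischtime_hospital.dt.day_name()\n",
    "data['inday_icu'] = data.intime_icu.dt.day_name()\n",
    "# dt.weekday is numeric: Monday=0 through Sunday=6\n",
    "data['inday_icu_seq'] = data.intime_icu.dt.weekday\n",
    "data['outday_icu'] = data.outtime_icu.dt.day_name()\n",
    "data.head()"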
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Friday       6263\n",
       "Tuesday      6141\n",
       "Monday       6097\n",
       "Wednesday    5985\n",
       "Thursday     5876\n",
       "Saturday     4235\n",
       "Sunday       3960\n",
       "Name: inday_icu, dtype: int64"
      ]
     },
     "execution_count": 9,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "data['inday_icu'].value_counts()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "weekday    30362\n",
       "weekend     8195\n",
       "Name: inday_icu_wkd, dtype: int64"
      ]
     },
     "execution_count": 10,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Create a weekday vs weekend column from intime_icu (dt.weekday: Monday=0, Sunday=6)\n",
    "data['inday_icu_wkd'] = np.where(data.intime_icu.dt.weekday <= 4, \n",
    "                                 'weekday','weekend')\n",
    "data['inday_icu_wkd'].value_counts()"
   ]
  },
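  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The cutoff `<= 4` above relies on pandas numbering weekdays from Monday (0) to Sunday (6). As a quick check, here is a minimal sketch on three made-up timestamps (the `demo` frame and its dates are illustrative assumptions, not MIMIC-III data):\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "import pandas as pd\n",
    "\n",
    "# Hypothetical timestamps: a Friday, a Saturday and a Monday\n",
    "demo = pd.DataFrame({'intime_icu': pd.to_datetime(\n",
    "    ['2018-03-02 10:00', '2018-03-03 23:30', '2018-03-05 00:15'])})\n",
    "\n",
    "# dt.weekday is an integer with Monday=0 and Sunday=6\n",
    "demo['seq'] = demo.intime_icu.dt.weekday\n",
    "\n",
    "# Values 0-4 are Monday to Friday; 5 and 6 are the weekend\n",
    "demo['wkd'] = np.where(demo.seq <= 4, 'weekday', 'weekend')\n",
    "print(demo)\n",
    "```\n",
    "\n",
    "Note that this definition is based purely on the calendar day of ICU admission; other definitions (for example, counting Friday night as part of the weekend) are possible and may change the results."
   ]
  },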
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": true
   },
   "source": [
    "# Summary statistics by day of week and by weekday vs. weekend\n",
    "\n",
    "Next, it's good to look at some basic summaries of the data. We will compute means (with standard deviations) for the continuous variables and counts (with percentages) for the categorical variables we have extracted, grouped first by day of ICU admission and then by weekday versus weekend."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Index([u'subject_id', u'hadm_id', u'icustay_id', u'gender',\n",
       "       u'admittime_hospital', u'dischtime_hospital', u'los_hospital', u'age',\n",
       "       u'admission_type', u'hospital_expire_flag', u'intime_icu',\n",
       "       u'outtime_icu', u'los_icu', u'first_careunit', u'icustay_expire_flag',\n",
       "       u'oasis', u'oasis_prob', u'admitday_hospital', u'dischday_hospital',\n",
       "       u'inday_icu', u'inday_icu_seq', u'outday_icu', u'inday_icu_wkd'],\n",
       "      dtype='object')"
      ]
     },
     "execution_count": 11,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "data.columns"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead tr th {\n",
       "        text-align: left;\n",
       "    }\n",
       "\n",
       "    .dataframe thead tr:last-of-type th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th colspan=\"8\" halign=\"left\">Grouped by inday_icu</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th>Friday</th>\n",
       "      <th>Monday</th>\n",
       "      <th>Saturday</th>\n",
       "      <th>Sunday</th>\n",
       "      <th>Thursday</th>\n",
       "      <th>Tuesday</th>\n",
       "      <th>Wednesday</th>\n",
       "      <th>isnull</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>variable</th>\n",
       "      <th>level</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>n</th>\n",
       "      <th></th>\n",
       "      <td>6263</td>\n",
       "      <td>6097</td>\n",
       "      <td>4235</td>\n",
       "      <td>3960</td>\n",
       "      <td>5876</td>\n",
       "      <td>6141</td>\n",
       "      <td>5985</td>\n",
       "      <td></td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th rowspan=\"3\" valign=\"top\">admission_type</th>\n",
       "      <th>ELECTIVE</th>\n",
       "      <td>1016 (16.22)</td>\n",
       "      <td>1265 (20.75)</td>\n",
       "      <td>162 (3.83)</td>\n",
       "      <td>101 (2.55)</td>\n",
       "      <td>999 (17.0)</td>\n",
       "      <td>1292 (21.04)</td>\n",
       "      <td>1243 (20.77)</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>EMERGENCY</th>\n",
       "      <td>5118 (81.72)</td>\n",
       "      <td>4687 (76.87)</td>\n",
       "      <td>3852 (90.96)</td>\n",
       "      <td>3681 (92.95)</td>\n",
       "      <td>4746 (80.77)</td>\n",
       "      <td>4704 (76.6)</td>\n",
       "      <td>4600 (76.86)</td>\n",
       "      <td></td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>URGENT</th>\n",
       "      <td>129 (2.06)</td>\n",
       "      <td>145 (2.38)</td>\n",
       "      <td>221 (5.22)</td>\n",
       "      <td>178 (4.49)</td>\n",
       "      <td>131 (2.23)</td>\n",
       "      <td>145 (2.36)</td>\n",
       "      <td>142 (2.37)</td>\n",
       "      <td></td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>age</th>\n",
       "      <th></th>\n",
       "      <td>74.56 (53.22)</td>\n",
       "      <td>73.13 (51.37)</td>\n",
       "      <td>73.92 (58.58)</td>\n",
       "      <td>75.26 (60.66)</td>\n",
       "      <td>75.51 (55.46)</td>\n",
       "      <td>75.48 (55.22)</td>\n",
       "      <td>74.16 (53.88)</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th rowspan=\"5\" valign=\"top\">first_careunit</th>\n",
       "      <th>CCU</th>\n",
       "      <td>838 (13.38)</td>\n",
       "      <td>918 (15.06)</td>\n",
       "      <td>695 (16.41)</td>\n",
       "      <td>621 (15.68)</td>\n",
       "      <td>850 (14.47)</td>\n",
       "      <td>919 (14.96)</td>\n",
       "      <td>851 (14.22)</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>CSRU</th>\n",
       "      <td>1416 (22.61)</td>\n",
       "      <td>1632 (26.77)</td>\n",
       "      <td>237 (5.6)</td>\n",
       "      <td>194 (4.9)</td>\n",
       "      <td>1282 (21.82)</td>\n",
       "      <td>1575 (25.65)</td>\n",
       "      <td>1268 (21.19)</td>\n",
       "      <td></td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>MICU</th>\n",
       "      <td>2139 (34.15)</td>\n",
       "      <td>1940 (31.82)</td>\n",
       "      <td>1765 (41.68)</td>\n",
       "      <td>1706 (43.08)</td>\n",
       "      <td>2020 (34.38)</td>\n",
       "      <td>2019 (32.88)</td>\n",
       "      <td>2020 (33.75)</td>\n",
       "      <td></td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>SICU</th>\n",
       "      <td>1044 (16.67)</td>\n",
       "      <td>865 (14.19)</td>\n",
       "      <td>743 (17.54)</td>\n",
       "      <td>743 (18.76)</td>\n",
       "      <td>996 (16.95)</td>\n",
       "      <td>933 (15.19)</td>\n",
       "      <td>1038 (17.34)</td>\n",
       "      <td></td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>TSICU</th>\n",
       "      <td>826 (13.19)</td>\n",
       "      <td>742 (12.17)</td>\n",
       "      <td>795 (18.77)</td>\n",
       "      <td>696 (17.58)</td>\n",
       "      <td>728 (12.39)</td>\n",
       "      <td>695 (11.32)</td>\n",
       "      <td>808 (13.5)</td>\n",
       "      <td></td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th rowspan=\"2\" valign=\"top\">gender</th>\n",
       "      <th>F</th>\n",
       "      <td>2662 (42.5)</td>\n",
       "      <td>2559 (41.97)</td>\n",
       "      <td>1857 (43.85)</td>\n",
       "      <td>1736 (43.84)</td>\n",
       "      <td>2603 (44.3)</td>\n",
       "      <td>2671 (43.49)</td>\n",
       "      <td>2636 (44.04)</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>M</th>\n",
       "      <td>3601 (57.5)</td>\n",
       "      <td>3538 (58.03)</td>\n",
       "      <td>2378 (56.15)</td>\n",
       "      <td>2224 (56.16)</td>\n",
       "      <td>3273 (55.7)</td>\n",
       "      <td>3470 (56.51)</td>\n",
       "      <td>3349 (55.96)</td>\n",
       "      <td></td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th rowspan=\"2\" valign=\"top\">hospital_expire_flag</th>\n",
       "      <th>0</th>\n",
       "      <td>5576 (89.03)</td>\n",
       "      <td>5468 (89.68)</td>\n",
       "      <td>3658 (86.38)</td>\n",
       "      <td>3388 (85.56)</td>\n",
       "      <td>5202 (88.53)</td>\n",
       "      <td>5491 (89.42)</td>\n",
       "      <td>5350 (89.39)</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>687 (10.97)</td>\n",
       "      <td>629 (10.32)</td>\n",
       "      <td>577 (13.62)</td>\n",
       "      <td>572 (14.44)</td>\n",
       "      <td>674 (11.47)</td>\n",
       "      <td>650 (10.58)</td>\n",
       "      <td>635 (10.61)</td>\n",
       "      <td></td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th rowspan=\"2\" valign=\"top\">icustay_expire_flag</th>\n",
       "      <th>0</th>\n",
       "      <td>5768 (92.1)</td>\n",
       "      <td>5650 (92.67)</td>\n",
       "      <td>3811 (89.99)</td>\n",
       "      <td>3548 (89.6)</td>\n",
       "      <td>5399 (91.88)</td>\n",
       "      <td>5673 (92.38)</td>\n",
       "      <td>5514 (92.13)</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>495 (7.9)</td>\n",
       "      <td>447 (7.33)</td>\n",
       "      <td>424 (10.01)</td>\n",
       "      <td>412 (10.4)</td>\n",
       "      <td>477 (8.12)</td>\n",
       "      <td>468 (7.62)</td>\n",
       "      <td>471 (7.87)</td>\n",
       "      <td></td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th rowspan=\"2\" valign=\"top\">inday_icu_wkd</th>\n",
       "      <th>weekday</th>\n",
       "      <td>6263 (100.0)</td>\n",
       "      <td>6097 (100.0)</td>\n",
       "      <td></td>\n",
       "      <td></td>\n",
       "      <td>5876 (100.0)</td>\n",
       "      <td>6141 (100.0)</td>\n",
       "      <td>5985 (100.0)</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>weekend</th>\n",
       "      <td></td>\n",
       "      <td></td>\n",
       "      <td>4235 (100.0)</td>\n",
       "      <td>3960 (100.0)</td>\n",
       "      <td></td>\n",
       "      <td></td>\n",
       "      <td></td>\n",
       "      <td></td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>los_hospital</th>\n",
       "      <th></th>\n",
       "      <td>10.23 (11.61)</td>\n",
       "      <td>9.71 (9.92)</td>\n",
       "      <td>10.00 (10.84)</td>\n",
       "      <td>9.82 (10.70)</td>\n",
       "      <td>9.88 (10.90)</td>\n",
       "      <td>9.86 (10.58)</td>\n",
       "      <td>9.77 (10.22)</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>los_icu</th>\n",
       "      <th></th>\n",
       "      <td>4.15 (6.19)</td>\n",
       "      <td>3.83 (5.52)</td>\n",
       "      <td>4.42 (6.58)</td>\n",
       "      <td>4.44 (6.29)</td>\n",
       "      <td>3.96 (6.12)</td>\n",
       "      <td>3.80 (5.65)</td>\n",
       "      <td>4.04 (5.94)</td>\n",
       "      <td>2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>oasis</th>\n",
       "      <th></th>\n",
       "      <td>31.17 (9.02)</td>\n",
       "      <td>31.21 (8.77)</td>\n",
       "      <td>31.49 (9.27)</td>\n",
       "      <td>32.07 (9.02)</td>\n",
       "      <td>31.10 (8.82)</td>\n",
       "      <td>30.90 (8.77)</td>\n",
       "      <td>30.58 (9.11)</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>oasis_prob</th>\n",
       "      <th></th>\n",
       "      <td>0.14 (0.14)</td>\n",
       "      <td>0.14 (0.14)</td>\n",
       "      <td>0.15 (0.15)</td>\n",
       "      <td>0.16 (0.15)</td>\n",
       "      <td>0.14 (0.14)</td>\n",
       "      <td>0.14 (0.14)</td>\n",
       "      <td>0.14 (0.14)</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                               Grouped by inday_icu                 \\\n",
       "                                             Friday         Monday   \n",
       "variable             level                                           \n",
       "n                                              6263           6097   \n",
       "admission_type       ELECTIVE          1016 (16.22)   1265 (20.75)   \n",
       "                     EMERGENCY         5118 (81.72)   4687 (76.87)   \n",
       "                     URGENT              129 (2.06)     145 (2.38)   \n",
       "age                                   74.56 (53.22)  73.13 (51.37)   \n",
       "first_careunit       CCU                838 (13.38)    918 (15.06)   \n",
       "                     CSRU              1416 (22.61)   1632 (26.77)   \n",
       "                     MICU              2139 (34.15)   1940 (31.82)   \n",
       "                     SICU              1044 (16.67)    865 (14.19)   \n",
       "                     TSICU              826 (13.19)    742 (12.17)   \n",
       "gender               F                  2662 (42.5)   2559 (41.97)   \n",
       "                     M                  3601 (57.5)   3538 (58.03)   \n",
       "hospital_expire_flag 0                 5576 (89.03)   5468 (89.68)   \n",
       "                     1                  687 (10.97)    629 (10.32)   \n",
       "icustay_expire_flag  0                  5768 (92.1)   5650 (92.67)   \n",
       "                     1                    495 (7.9)     447 (7.33)   \n",
       "inday_icu_wkd        weekday           6263 (100.0)   6097 (100.0)   \n",
       "                     weekend                                         \n",
       "los_hospital                          10.23 (11.61)    9.71 (9.92)   \n",
       "los_icu                                 4.15 (6.19)    3.83 (5.52)   \n",
       "oasis                                  31.17 (9.02)   31.21 (8.77)   \n",
       "oasis_prob                              0.14 (0.14)    0.14 (0.14)   \n",
       "\n",
       "                                                                             \\\n",
       "                                     Saturday         Sunday       Thursday   \n",
       "variable             level                                                    \n",
       "n                                        4235           3960           5876   \n",
       "admission_type       ELECTIVE      162 (3.83)     101 (2.55)     999 (17.0)   \n",
       "                     EMERGENCY   3852 (90.96)   3681 (92.95)   4746 (80.77)   \n",
       "                     URGENT        221 (5.22)     178 (4.49)     131 (2.23)   \n",
       "age                             73.92 (58.58)  75.26 (60.66)  75.51 (55.46)   \n",
       "first_careunit       CCU          695 (16.41)    621 (15.68)    850 (14.47)   \n",
       "                     CSRU           237 (5.6)      194 (4.9)   1282 (21.82)   \n",
       "                     MICU        1765 (41.68)   1706 (43.08)   2020 (34.38)   \n",
       "                     SICU         743 (17.54)    743 (18.76)    996 (16.95)   \n",
       "                     TSICU        795 (18.77)    696 (17.58)    728 (12.39)   \n",
       "gender               F           1857 (43.85)   1736 (43.84)    2603 (44.3)   \n",
       "                     M           2378 (56.15)   2224 (56.16)    3273 (55.7)   \n",
       "hospital_expire_flag 0           3658 (86.38)   3388 (85.56)   5202 (88.53)   \n",
       "                     1            577 (13.62)    572 (14.44)    674 (11.47)   \n",
       "icustay_expire_flag  0           3811 (89.99)    3548 (89.6)   5399 (91.88)   \n",
       "                     1            424 (10.01)     412 (10.4)     477 (8.12)   \n",
       "inday_icu_wkd        weekday                                   5876 (100.0)   \n",
       "                     weekend     4235 (100.0)   3960 (100.0)                  \n",
       "los_hospital                    10.00 (10.84)   9.82 (10.70)   9.88 (10.90)   \n",
       "los_icu                           4.42 (6.58)    4.44 (6.29)    3.96 (6.12)   \n",
       "oasis                            31.49 (9.27)   32.07 (9.02)   31.10 (8.82)   \n",
       "oasis_prob                        0.15 (0.15)    0.16 (0.15)    0.14 (0.14)   \n",
       "\n",
       "                                                                     \n",
       "                                      Tuesday      Wednesday isnull  \n",
       "variable             level                                           \n",
       "n                                        6141           5985         \n",
       "admission_type       ELECTIVE    1292 (21.04)   1243 (20.77)      0  \n",
       "                     EMERGENCY    4704 (76.6)   4600 (76.86)         \n",
       "                     URGENT        145 (2.36)     142 (2.37)         \n",
       "age                             75.48 (55.22)  74.16 (53.88)      0  \n",
       "first_careunit       CCU          919 (14.96)    851 (14.22)      0  \n",
       "                     CSRU        1575 (25.65)   1268 (21.19)         \n",
       "                     MICU        2019 (32.88)   2020 (33.75)         \n",
       "                     SICU         933 (15.19)   1038 (17.34)         \n",
       "                     TSICU        695 (11.32)     808 (13.5)         \n",
       "gender               F           2671 (43.49)   2636 (44.04)      0  \n",
       "                     M           3470 (56.51)   3349 (55.96)         \n",
       "hospital_expire_flag 0           5491 (89.42)   5350 (89.39)      0  \n",
       "                     1            650 (10.58)    635 (10.61)         \n",
       "icustay_expire_flag  0           5673 (92.38)   5514 (92.13)      0  \n",
       "                     1             468 (7.62)     471 (7.87)         \n",
       "inday_icu_wkd        weekday     6141 (100.0)   5985 (100.0)      0  \n",
       "                     weekend                                         \n",
       "los_hospital                     9.86 (10.58)   9.77 (10.22)      0  \n",
       "los_icu                           3.80 (5.65)    4.04 (5.94)      2  \n",
       "oasis                            30.90 (8.77)   30.58 (9.11)      0  \n",
       "oasis_prob                        0.14 (0.14)    0.14 (0.14)      0  "
      ]
     },
     "execution_count": 12,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "columns = ['gender', 'los_hospital', 'age', 'admission_type', 'hospital_expire_flag', \n",
    "           'los_icu','icustay_expire_flag', 'oasis', 'oasis_prob', 'first_careunit',\n",
    "           'inday_icu_wkd']\n",
    "\n",
    "groupby = 'inday_icu'\n",
    "\n",
    "pval = False\n",
    "\n",
    "categorical = ['gender','admission_type','hospital_expire_flag','icustay_expire_flag',\n",
    "               'first_careunit','inday_icu_wkd']\n",
    "\n",
    "t = TableOne(data, columns=columns, categorical=categorical, groupby=groupby, pval=pval)\n",
    "t.tableone"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead tr th {\n",
       "        text-align: left;\n",
       "    }\n",
       "\n",
       "    .dataframe thead tr:last-of-type th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th colspan=\"3\" halign=\"left\">Grouped by inday_icu_wkd</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th>isnull</th>\n",
       "      <th>weekday</th>\n",
       "      <th>weekend</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>variable</th>\n",
       "      <th>level</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>n</th>\n",
       "      <th></th>\n",
       "      <td></td>\n",
       "      <td>30362</td>\n",
       "      <td>8195</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th rowspan=\"3\" valign=\"top\">admission_type</th>\n",
       "      <th>ELECTIVE</th>\n",
       "      <td>0</td>\n",
       "      <td>5815 (19.15)</td>\n",
       "      <td>263 (3.21)</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>EMERGENCY</th>\n",
       "      <td></td>\n",
       "      <td>23855 (78.57)</td>\n",
       "      <td>7533 (91.92)</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>URGENT</th>\n",
       "      <td></td>\n",
       "      <td>692 (2.28)</td>\n",
       "      <td>399 (4.87)</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>age</th>\n",
       "      <th></th>\n",
       "      <td>0</td>\n",
       "      <td>74.56 (53.84)</td>\n",
       "      <td>74.57 (59.59)</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th rowspan=\"5\" valign=\"top\">first_careunit</th>\n",
       "      <th>CCU</th>\n",
       "      <td>0</td>\n",
       "      <td>4376 (14.41)</td>\n",
       "      <td>1316 (16.06)</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>CSRU</th>\n",
       "      <td></td>\n",
       "      <td>7173 (23.62)</td>\n",
       "      <td>431 (5.26)</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>MICU</th>\n",
       "      <td></td>\n",
       "      <td>10138 (33.39)</td>\n",
       "      <td>3471 (42.36)</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>SICU</th>\n",
       "      <td></td>\n",
       "      <td>4876 (16.06)</td>\n",
       "      <td>1486 (18.13)</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>TSICU</th>\n",
       "      <td></td>\n",
       "      <td>3799 (12.51)</td>\n",
       "      <td>1491 (18.19)</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th rowspan=\"2\" valign=\"top\">gender</th>\n",
       "      <th>F</th>\n",
       "      <td>0</td>\n",
       "      <td>13131 (43.25)</td>\n",
       "      <td>3593 (43.84)</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>M</th>\n",
       "      <td></td>\n",
       "      <td>17231 (56.75)</td>\n",
       "      <td>4602 (56.16)</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th rowspan=\"2\" valign=\"top\">hospital_expire_flag</th>\n",
       "      <th>0</th>\n",
       "      <td>0</td>\n",
       "      <td>27087 (89.21)</td>\n",
       "      <td>7046 (85.98)</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td></td>\n",
       "      <td>3275 (10.79)</td>\n",
       "      <td>1149 (14.02)</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th rowspan=\"2\" valign=\"top\">icustay_expire_flag</th>\n",
       "      <th>0</th>\n",
       "      <td>0</td>\n",
       "      <td>28004 (92.23)</td>\n",
       "      <td>7359 (89.8)</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td></td>\n",
       "      <td>2358 (7.77)</td>\n",
       "      <td>836 (10.2)</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>los_hospital</th>\n",
       "      <th></th>\n",
       "      <td>0</td>\n",
       "      <td>9.89 (10.67)</td>\n",
       "      <td>9.91 (10.77)</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>los_icu</th>\n",
       "      <th></th>\n",
       "      <td>2</td>\n",
       "      <td>3.96 (5.89)</td>\n",
       "      <td>4.43 (6.44)</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>oasis</th>\n",
       "      <th></th>\n",
       "      <td>0</td>\n",
       "      <td>30.99 (8.90)</td>\n",
       "      <td>31.77 (9.15)</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>oasis_prob</th>\n",
       "      <th></th>\n",
       "      <td>0</td>\n",
       "      <td>0.14 (0.14)</td>\n",
       "      <td>0.15 (0.15)</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                               Grouped by inday_icu_wkd                 \\\n",
       "                                                 isnull        weekday   \n",
       "variable             level                                               \n",
       "n                                                                30362   \n",
       "admission_type       ELECTIVE                         0   5815 (19.15)   \n",
       "                     EMERGENCY                           23855 (78.57)   \n",
       "                     URGENT                                 692 (2.28)   \n",
       "age                                                   0  74.56 (53.84)   \n",
       "first_careunit       CCU                              0   4376 (14.41)   \n",
       "                     CSRU                                 7173 (23.62)   \n",
       "                     MICU                                10138 (33.39)   \n",
       "                     SICU                                 4876 (16.06)   \n",
       "                     TSICU                                3799 (12.51)   \n",
       "gender               F                                0  13131 (43.25)   \n",
       "                     M                                   17231 (56.75)   \n",
       "hospital_expire_flag 0                                0  27087 (89.21)   \n",
       "                     1                                    3275 (10.79)   \n",
       "icustay_expire_flag  0                                0  28004 (92.23)   \n",
       "                     1                                     2358 (7.77)   \n",
       "los_hospital                                          0   9.89 (10.67)   \n",
       "los_icu                                               2    3.96 (5.89)   \n",
       "oasis                                                 0   30.99 (8.90)   \n",
       "oasis_prob                                            0    0.14 (0.14)   \n",
       "\n",
       "                                               \n",
       "                                      weekend  \n",
       "variable             level                     \n",
       "n                                        8195  \n",
       "admission_type       ELECTIVE      263 (3.21)  \n",
       "                     EMERGENCY   7533 (91.92)  \n",
       "                     URGENT        399 (4.87)  \n",
       "age                             74.57 (59.59)  \n",
       "first_careunit       CCU         1316 (16.06)  \n",
       "                     CSRU          431 (5.26)  \n",
       "                     MICU        3471 (42.36)  \n",
       "                     SICU        1486 (18.13)  \n",
       "                     TSICU       1491 (18.19)  \n",
       "gender               F           3593 (43.84)  \n",
       "                     M           4602 (56.16)  \n",
       "hospital_expire_flag 0           7046 (85.98)  \n",
       "                     1           1149 (14.02)  \n",
       "icustay_expire_flag  0            7359 (89.8)  \n",
       "                     1             836 (10.2)  \n",
       "los_hospital                     9.91 (10.77)  \n",
       "los_icu                           4.43 (6.44)  \n",
       "oasis                            31.77 (9.15)  \n",
       "oasis_prob                        0.15 (0.15)  "
      ]
     },
     "execution_count": 13,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "columns = ['gender', 'los_hospital', 'age', 'admission_type', 'hospital_expire_flag', \n",
    "           'los_icu','icustay_expire_flag', 'oasis', 'oasis_prob', 'first_careunit']\n",
    "\n",
    "groupby = 'inday_icu_wkd'\n",
    "\n",
    "pval = False\n",
    "\n",
    "categorical = ['gender','admission_type','hospital_expire_flag','icustay_expire_flag',\n",
    "               'first_careunit']\n",
    "\n",
    "t = TableOne(data, columns=columns, categorical=categorical, groupby=groupby, pval=pval)\n",
    "t.tableone"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "It looks like hospital mortality (14.0% vs 10.8%) and ICU mortality (10.2% vs 7.8%) are higher for patients admitted to the ICU on a weekend than for those admitted on a weekday. Several other important variables also differ between the two groups, including admission type, the patient's first care unit and, to a lesser extent, disease severity (OASIS), suggesting that these groups may be fundamentally different in some way. Note that no significance tests have been run at this point (`pval = False`). Let's explore this a little further."
   ]
  },
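  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The table above was produced with `pval = False`, so the difference in mortality has not been tested yet. One way to check it is a chi-squared test on the 2x2 counts reported in the table. This is a minimal sketch that assumes `scipy` is installed; it is unadjusted, so it says nothing about the differences in case mix between the groups:\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "from scipy.stats import chi2_contingency\n",
    "\n",
    "# Counts copied from the table above (hospital_expire_flag by inday_icu_wkd)\n",
    "# rows: weekday, weekend; columns: survived, died\n",
    "counts = np.array([[27087, 3275],\n",
    "                   [7046, 1149]])\n",
    "\n",
    "# Crude in-hospital mortality rate for each group\n",
    "rates = counts[:, 1] / counts.sum(axis=1).astype(float)\n",
    "\n",
    "# Chi-squared test of independence (continuity correction is applied by default for 2x2 tables)\n",
    "chi2, p, dof, expected = chi2_contingency(counts)\n",
    "print('mortality: weekday = {:.4f}, weekend = {:.4f}'.format(rates[0], rates[1]))\n",
    "print('chi2 = {:.1f}, dof = {}, p = {:.3g}'.format(chi2, dof, p))\n",
    "```\n",
    "\n",
    "Alternatively, passing `pval=True` to `TableOne` should add a p-value column to the summary table."
   ]
  },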
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Plot the data\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th>admission_type</th>\n",
       "      <th>ELECTIVE</th>\n",
       "      <th>EMERGENCY</th>\n",
       "      <th>URGENT</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>inday_icu_seq</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>0.020553</td>\n",
       "      <td>0.124387</td>\n",
       "      <td>0.137931</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>0.019350</td>\n",
       "      <td>0.130315</td>\n",
       "      <td>0.082759</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>0.026549</td>\n",
       "      <td>0.127391</td>\n",
       "      <td>0.112676</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>0.027027</td>\n",
       "      <td>0.133165</td>\n",
       "      <td>0.114504</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>0.026575</td>\n",
       "      <td>0.126612</td>\n",
       "      <td>0.093023</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>0.080247</td>\n",
       "      <td>0.137072</td>\n",
       "      <td>0.162896</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>0.089109</td>\n",
       "      <td>0.146971</td>\n",
       "      <td>0.123596</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "admission_type  ELECTIVE  EMERGENCY    URGENT\n",
       "inday_icu_seq                                \n",
       "0               0.020553   0.124387  0.137931\n",
       "1               0.019350   0.130315  0.082759\n",
       "2               0.026549   0.127391  0.112676\n",
       "3               0.027027   0.133165  0.114504\n",
       "4               0.026575   0.126612  0.093023\n",
       "5               0.080247   0.137072  0.162896\n",
       "6               0.089109   0.146971  0.123596"
      ]
     },
     "execution_count": 14,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Pivot data to summarise by day\n",
    "dat_dow = data.groupby(['admission_type',\n",
    "                        'inday_icu_seq'])['hospital_expire_flag'].mean().reset_index()\n",
    "\n",
    "dat_dow = dat_dow.pivot(index='inday_icu_seq', \n",
    "                        columns='admission_type', values='hospital_expire_flag')\n",
    "\n",
    "dat_dow"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div class=\"vega-embed\" id=\"947caf96-c17f-4466-ad0c-6634cc666d2a\"></div>\n",
       "\n",
       "<style>\n",
       ".vega-embed svg, .vega-embed canvas {\n",
       "  border: 1px dotted gray;\n",
       "}\n",
       "\n",
       ".vega-embed .vega-actions a {\n",
       "  margin-right: 6px;\n",
       "}\n",
       "</style>\n"
      ]
     },
     "metadata": {
      "jupyter-vega3": "#947caf96-c17f-4466-ad0c-6634cc666d2a"
     },
     "output_type": "display_data"
    },
    {
     "data": {
      "application/javascript": [
       "var spec = {\"selection\": {\"grid\": {\"bind\": \"scales\", \"type\": \"interval\"}}, \"encoding\": {\"y\": {\"field\": \"Hospital mortality rate\", \"type\": \"quantitative\"}, \"x\": {\"field\": \"inday_icu_seq\", \"type\": \"quantitative\"}, \"color\": {\"field\": \"variable\", \"type\": \"nominal\"}}, \"height\": 300, \"width\": 450, \"$schema\": \"https://vega.github.io/schema/vega-lite/v2.json\", \"mark\": \"line\", \"data\": {\"values\": [{\"variable\": \"ELECTIVE\", \"inday_icu_seq\": 0, \"Hospital mortality rate\": 0.020553359683794466}, {\"variable\": \"ELECTIVE\", \"inday_icu_seq\": 1, \"Hospital mortality rate\": 0.01934984520123839}, {\"variable\": \"ELECTIVE\", \"inday_icu_seq\": 2, \"Hospital mortality rate\": 0.02654867256637168}, {\"variable\": \"ELECTIVE\", \"inday_icu_seq\": 3, \"Hospital mortality rate\": 0.02702702702702703}, {\"variable\": \"ELECTIVE\", \"inday_icu_seq\": 4, \"Hospital mortality rate\": 0.0265748031496063}, {\"variable\": \"ELECTIVE\", \"inday_icu_seq\": 5, \"Hospital mortality rate\": 0.08024691358024691}, {\"variable\": \"ELECTIVE\", \"inday_icu_seq\": 6, \"Hospital mortality rate\": 0.0891089108910891}, {\"variable\": \"EMERGENCY\", \"inday_icu_seq\": 0, \"Hospital mortality rate\": 0.12438660123746532}, {\"variable\": \"EMERGENCY\", \"inday_icu_seq\": 1, \"Hospital mortality rate\": 0.13031462585034015}, {\"variable\": \"EMERGENCY\", \"inday_icu_seq\": 2, \"Hospital mortality rate\": 0.12739130434782608}, {\"variable\": \"EMERGENCY\", \"inday_icu_seq\": 3, \"Hospital mortality rate\": 0.13316477033291194}, {\"variable\": \"EMERGENCY\", \"inday_icu_seq\": 4, \"Hospital mortality rate\": 0.12661195779601406}, {\"variable\": \"EMERGENCY\", \"inday_icu_seq\": 5, \"Hospital mortality rate\": 0.13707165109034267}, {\"variable\": \"EMERGENCY\", \"inday_icu_seq\": 6, \"Hospital mortality rate\": 0.1469709318120076}, {\"variable\": \"URGENT\", \"inday_icu_seq\": 0, \"Hospital mortality rate\": 0.13793103448275862}, 
{\"variable\": \"URGENT\", \"inday_icu_seq\": 1, \"Hospital mortality rate\": 0.08275862068965517}, {\"variable\": \"URGENT\", \"inday_icu_seq\": 2, \"Hospital mortality rate\": 0.11267605633802817}, {\"variable\": \"URGENT\", \"inday_icu_seq\": 3, \"Hospital mortality rate\": 0.11450381679389313}, {\"variable\": \"URGENT\", \"inday_icu_seq\": 4, \"Hospital mortality rate\": 0.09302325581395349}, {\"variable\": \"URGENT\", \"inday_icu_seq\": 5, \"Hospital mortality rate\": 0.16289592760180996}, {\"variable\": \"URGENT\", \"inday_icu_seq\": 6, \"Hospital mortality rate\": 0.12359550561797752}]}};\n",
       "var selector = \"#947caf96-c17f-4466-ad0c-6634cc666d2a\";\n",
       "var type = \"vega-lite\";\n",
       "\n",
       "var output_area = this;\n",
       "require(['nbextensions/jupyter-vega3/index'], function(vega) {\n",
       "  vega.render(selector, spec, type, output_area);\n",
       "}, function (err) {\n",
       "  if (err.requireType !== 'scripterror') {\n",
       "    throw(err);\n",
       "  }\n",
       "});\n"
      ]
     },
     "metadata": {
      "jupyter-vega3": "#947caf96-c17f-4466-ad0c-6634cc666d2a"
     },
     "output_type": "display_data"
    },
    {
     "data": {
      "image/png": ""
     },
     "metadata": {
      "jupyter-vega3": "#947caf96-c17f-4466-ad0c-6634cc666d2a"
     },
     "output_type": "display_data"
    }
   ],
   "source": [
    "# day_map = {0:'Mon', 1:'Tue', 2:'Wed', 3:'Thu', 4:'Fri', 5:'Sat', 6:'Sun'}\n",
    "dat_dow.vgplot.line(value_name='Hospital mortality rate')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th>admission_type</th>\n",
       "      <th>ELECTIVE</th>\n",
       "      <th>EMERGENCY</th>\n",
       "      <th>URGENT</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>inday_icu_wkd</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>weekday</th>\n",
       "      <td>0.023732</td>\n",
       "      <td>0.128359</td>\n",
       "      <td>0.108382</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>weekend</th>\n",
       "      <td>0.083650</td>\n",
       "      <td>0.141909</td>\n",
       "      <td>0.145363</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "admission_type  ELECTIVE  EMERGENCY    URGENT\n",
       "inday_icu_wkd                                \n",
       "weekday         0.023732   0.128359  0.108382\n",
       "weekend         0.083650   0.141909  0.145363"
      ]
     },
     "execution_count": 16,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "dat_wkd = data.groupby(['admission_type','inday_icu_wkd'])['hospital_expire_flag'].mean().reset_index()\n",
    "dat_wkd = dat_wkd.pivot(index='inday_icu_wkd', columns='admission_type', values='hospital_expire_flag')\n",
    "dat_wkd.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div class=\"vega-embed\" id=\"5ee9966a-ba11-4902-afdb-475556325e20\"></div>\n",
       "\n",
       "<style>\n",
       ".vega-embed svg, .vega-embed canvas {\n",
       "  border: 1px dotted gray;\n",
       "}\n",
       "\n",
       ".vega-embed .vega-actions a {\n",
       "  margin-right: 6px;\n",
       "}\n",
       "</style>\n"
      ]
     },
     "metadata": {
      "jupyter-vega3": "#5ee9966a-ba11-4902-afdb-475556325e20"
     },
     "output_type": "display_data"
    },
    {
     "data": {
      "application/javascript": [
       "var spec = {\"selection\": {\"grid\": {\"bind\": \"scales\", \"type\": \"interval\"}}, \"encoding\": {\"y\": {\"field\": \"Hospital mortality rate\", \"type\": \"quantitative\"}, \"x\": {\"field\": \"inday_icu_wkd\", \"type\": \"nominal\"}, \"color\": {\"field\": \"variable\", \"type\": \"nominal\"}}, \"height\": 300, \"width\": 450, \"$schema\": \"https://vega.github.io/schema/vega-lite/v2.json\", \"mark\": \"line\", \"data\": {\"values\": [{\"variable\": \"ELECTIVE\", \"Hospital mortality rate\": 0.023731728288907995, \"inday_icu_wkd\": \"weekday\"}, {\"variable\": \"ELECTIVE\", \"Hospital mortality rate\": 0.08365019011406843, \"inday_icu_wkd\": \"weekend\"}, {\"variable\": \"EMERGENCY\", \"Hospital mortality rate\": 0.1283588346258646, \"inday_icu_wkd\": \"weekday\"}, {\"variable\": \"EMERGENCY\", \"Hospital mortality rate\": 0.14190893402362936, \"inday_icu_wkd\": \"weekend\"}, {\"variable\": \"URGENT\", \"Hospital mortality rate\": 0.10838150289017341, \"inday_icu_wkd\": \"weekday\"}, {\"variable\": \"URGENT\", \"Hospital mortality rate\": 0.14536340852130325, \"inday_icu_wkd\": \"weekend\"}]}};\n",
       "var selector = \"#5ee9966a-ba11-4902-afdb-475556325e20\";\n",
       "var type = \"vega-lite\";\n",
       "\n",
       "var output_area = this;\n",
       "require(['nbextensions/jupyter-vega3/index'], function(vega) {\n",
       "  vega.render(selector, spec, type, output_area);\n",
       "}, function (err) {\n",
       "  if (err.requireType !== 'scripterror') {\n",
       "    throw(err);\n",
       "  }\n",
       "});\n"
      ]
     },
     "metadata": {
      "jupyter-vega3": "#5ee9966a-ba11-4902-afdb-475556325e20"
     },
     "output_type": "display_data"
    },
    {
     "data": {
      "image/png": ""
     },
     "metadata": {
      "jupyter-vega3": "#5ee9966a-ba11-4902-afdb-475556325e20"
     },
     "output_type": "display_data"
    }
   ],
   "source": [
    "dat_wkd.vgplot.line(value_name='Hospital mortality rate')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Model building\n",
    "\n",
    "Let's try to incorporate what we saw above into a very simple model. We will use logistic regression with hospital mortality as our outcome. First an unadjusted estimate, and then we will try to adjust for admission type.\n",
    "\n",
    "The unadjusted analysis should mirror pretty closely what we saw in the one of the tables above. The odds ratio corresponding with 14.0% and 10.8% mortality in the the weekend and weekday groups, respectively, is about 1.35. Performing logistic regression on the same data:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<table class=\"simpletable\">\n",
       "<tr>\n",
       "        <td>Model:</td>                 <td>GLM</td>              <td>AIC:</td>        <td>27416.8526</td> \n",
       "</tr>\n",
       "<tr>\n",
       "    <td>Link Function:</td>            <td>logit</td>             <td>BIC:</td>       <td>-379723.8199</td>\n",
       "</tr>\n",
       "<tr>\n",
       "  <td>Dependent Variable:</td> <td>hospital_expire_flag</td> <td>Log-Likelihood:</td>    <td>-13706.</td>  \n",
       "</tr>\n",
       "<tr>\n",
       "         <td>Date:</td>          <td>2018-03-02 10:45</td>      <td>LL-Null:</td>        <td>-13738.</td>  \n",
       "</tr>\n",
       "<tr>\n",
       "   <td>No. Observations:</td>          <td>38557</td>           <td>Deviance:</td>       <td>27413.</td>   \n",
       "</tr>\n",
       "<tr>\n",
       "       <td>Df Model:</td>                <td>1</td>           <td>Pearson chi2:</td>    <td>3.86e+04</td>  \n",
       "</tr>\n",
       "<tr>\n",
       "     <td>Df Residuals:</td>            <td>38555</td>            <td>Scale:</td>         <td>1.0000</td>   \n",
       "</tr>\n",
       "<tr>\n",
       "        <td>Method:</td>               <td>IRLS</td>                <td></td>               <td></td>      \n",
       "</tr>\n",
       "</table>\n",
       "<table class=\"simpletable\">\n",
       "<tr>\n",
       "               <td></td>                <th>Coef.</th>  <th>Std.Err.</th>     <th>z</th>      <th>P>|z|</th> <th>[0.025</th>  <th>0.975]</th> \n",
       "</tr>\n",
       "<tr>\n",
       "  <th>Intercept</th>                   <td>-2.1127</td>  <td>0.0185</td>  <td>-114.2000</td> <td>0.0000</td> <td>-2.1490</td> <td>-2.0765</td>\n",
       "</tr>\n",
       "<tr>\n",
       "  <th>C(inday_icu_wkd)[T.weekend]</th> <td>0.2992</td>   <td>0.0368</td>   <td>8.1288</td>   <td>0.0000</td> <td>0.2270</td>  <td>0.3713</td> \n",
       "</tr>\n",
       "</table>"
      ],
      "text/plain": [
       "<class 'statsmodels.iolib.summary2.Summary'>\n",
       "\"\"\"\n",
       "                      Results: Generalized linear model\n",
       "=============================================================================\n",
       "Model:                  GLM                    AIC:              27416.8526  \n",
       "Link Function:          logit                  BIC:              -379723.8199\n",
       "Dependent Variable:     hospital_expire_flag   Log-Likelihood:   -13706.     \n",
       "Date:                   2018-03-02 10:45       LL-Null:          -13738.     \n",
       "No. Observations:       38557                  Deviance:         27413.      \n",
       "Df Model:               1                      Pearson chi2:     3.86e+04    \n",
       "Df Residuals:           38555                  Scale:            1.0000      \n",
       "Method:                 IRLS                                                 \n",
       "-----------------------------------------------------------------------------\n",
       "                             Coef.  Std.Err.     z     P>|z|   [0.025  0.975]\n",
       "-----------------------------------------------------------------------------\n",
       "Intercept                   -2.1127   0.0185 -114.2000 0.0000 -2.1490 -2.0765\n",
       "C(inday_icu_wkd)[T.weekend]  0.2992   0.0368    8.1288 0.0000  0.2270  0.3713\n",
       "=============================================================================\n",
       "\n",
       "\"\"\""
      ]
     },
     "execution_count": 18,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# R style syntax\n",
    "simple_glm = smf.glm('hospital_expire_flag ~ C(inday_icu_wkd)', \n",
    "                     data=data, family=sm.families.Binomial()).fit()\n",
    "simple_glm.summary2()\n",
    "\n",
    "# Alternative syntax\n",
    "# y = data.hospital_expire_flag\n",
    "# X = sm.tools.add_constant(data.inday_icu_wkd.factorize()[0])\n",
    "# simple_glm = sm.GLM(y, X, family=sm.families.Binomial()).fit()\n",
    "# simple_glm.summary2()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "...yields the same results. The coefficient shown above for weekend is on the log scale, so when we exponentiate it, we get the odds-ratio: `exp(0.2992) = 1.35`. So, looking at these crude rates and odds ratios, we can see that patients admitted on a weekend have about a 35% increase in the odds of dying in the hospital when compared to those on a weekday. This effect is statistically significant (p<0.001). \n",
    "\n",
    "Are we done?\n",
    "\n",
    "I hope not. We saw from the tables and figures above, there is likely some confounding and maybe even effect modification happening. Next let''s look at admission type and weekend ICU admission in the same model. There are two such models we could consider. \n",
    "\n",
    "The first adjusts for admission type, but assumes that the effect of weekend admission is the same regardless if the patient is of any of the admission types. The second one adjusts for admission type, but then allows the effect of weekend ICU admission to vary across the different levels of admission type. \n",
    "\n",
    "The first type of model would be able to account for confounding (when a nuisance variable is associated with both the outcome and the exposure/variable of interest), while the second permits what is called effect modification or a statistical interaction. \n",
    "\n",
    "Interactions are sometimes difficult to understand, but if ignored, can lead to incorrect conclusions about the effect of one or more of the variables. In this example, we fit both models, output estimates of the log-odds ratios, and perform a hypothesis test which evaluates the statistical significance of dropping one of the variables. Below is the resulting output:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<table class=\"simpletable\">\n",
       "<tr>\n",
       "        <td>Model:</td>                 <td>GLM</td>              <td>AIC:</td>        <td>26729.0141</td> \n",
       "</tr>\n",
       "<tr>\n",
       "    <td>Link Function:</td>            <td>logit</td>             <td>BIC:</td>       <td>-380394.5386</td>\n",
       "</tr>\n",
       "<tr>\n",
       "  <td>Dependent Variable:</td> <td>hospital_expire_flag</td> <td>Log-Likelihood:</td>    <td>-13361.</td>  \n",
       "</tr>\n",
       "<tr>\n",
       "         <td>Date:</td>          <td>2018-03-02 10:45</td>      <td>LL-Null:</td>        <td>-13738.</td>  \n",
       "</tr>\n",
       "<tr>\n",
       "   <td>No. Observations:</td>          <td>38557</td>           <td>Deviance:</td>       <td>26721.</td>   \n",
       "</tr>\n",
       "<tr>\n",
       "       <td>Df Model:</td>                <td>3</td>           <td>Pearson chi2:</td>    <td>3.85e+04</td>  \n",
       "</tr>\n",
       "<tr>\n",
       "     <td>Df Residuals:</td>            <td>38553</td>            <td>Scale:</td>         <td>1.0000</td>   \n",
       "</tr>\n",
       "<tr>\n",
       "        <td>Method:</td>               <td>IRLS</td>                <td></td>               <td></td>      \n",
       "</tr>\n",
       "</table>\n",
       "<table class=\"simpletable\">\n",
       "<tr>\n",
       "                 <td></td>                 <th>Coef.</th>  <th>Std.Err.</th>     <th>z</th>     <th>P>|z|</th> <th>[0.025</th>  <th>0.975]</th> \n",
       "</tr>\n",
       "<tr>\n",
       "  <th>Intercept</th>                      <td>-3.6173</td>  <td>0.0801</td>  <td>-45.1364</td> <td>0.0000</td> <td>-3.7743</td> <td>-3.4602</td>\n",
       "</tr>\n",
       "<tr>\n",
       "  <th>C(inday_icu_wkd)[T.weekend]</th>    <td>0.1444</td>   <td>0.0372</td>   <td>3.8871</td>  <td>0.0001</td> <td>0.0716</td>  <td>0.2172</td> \n",
       "</tr>\n",
       "<tr>\n",
       "  <th>C(admission_type)[T.EMERGENCY]</th> <td>1.6944</td>   <td>0.0822</td>   <td>20.6095</td> <td>0.0000</td> <td>1.5333</td>  <td>1.8555</td> \n",
       "</tr>\n",
       "<tr>\n",
       "  <th>C(admission_type)[T.URGENT]</th>    <td>1.5881</td>   <td>0.1231</td>   <td>12.9034</td> <td>0.0000</td> <td>1.3469</td>  <td>1.8293</td> \n",
       "</tr>\n",
       "</table>"
      ],
      "text/plain": [
       "<class 'statsmodels.iolib.summary2.Summary'>\n",
       "\"\"\"\n",
       "                       Results: Generalized linear model\n",
       "===============================================================================\n",
       "Model:                  GLM                     AIC:               26729.0141  \n",
       "Link Function:          logit                   BIC:               -380394.5386\n",
       "Dependent Variable:     hospital_expire_flag    Log-Likelihood:    -13361.     \n",
       "Date:                   2018-03-02 10:45        LL-Null:           -13738.     \n",
       "No. Observations:       38557                   Deviance:          26721.      \n",
       "Df Model:               3                       Pearson chi2:      3.85e+04    \n",
       "Df Residuals:           38553                   Scale:             1.0000      \n",
       "Method:                 IRLS                                                   \n",
       "-------------------------------------------------------------------------------\n",
       "                                Coef.  Std.Err.    z     P>|z|   [0.025  0.975]\n",
       "-------------------------------------------------------------------------------\n",
       "Intercept                      -3.6173   0.0801 -45.1364 0.0000 -3.7743 -3.4602\n",
       "C(inday_icu_wkd)[T.weekend]     0.1444   0.0372   3.8871 0.0001  0.0716  0.2172\n",
       "C(admission_type)[T.EMERGENCY]  1.6944   0.0822  20.6095 0.0000  1.5333  1.8555\n",
       "C(admission_type)[T.URGENT]     1.5881   0.1231  12.9034 0.0000  1.3469  1.8293\n",
       "===============================================================================\n",
       "\n",
       "\"\"\""
      ]
     },
     "execution_count": 19,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Without effect modification\n",
    "adj_glm = smf.glm('hospital_expire_flag ~ C(inday_icu_wkd) + C(admission_type)', \n",
    "                     data=data, family=sm.families.Binomial()).fit()\n",
    "adj_glm.summary2()\n",
    "# drop1(adj.glm,test=\"Chisq\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<table class=\"simpletable\">\n",
       "<tr>\n",
       "        <td>Model:</td>                 <td>GLM</td>              <td>AIC:</td>        <td>26712.4403</td> \n",
       "</tr>\n",
       "<tr>\n",
       "    <td>Link Function:</td>            <td>logit</td>             <td>BIC:</td>       <td>-380393.9926</td>\n",
       "</tr>\n",
       "<tr>\n",
       "  <td>Dependent Variable:</td> <td>hospital_expire_flag</td> <td>Log-Likelihood:</td>    <td>-13350.</td>  \n",
       "</tr>\n",
       "<tr>\n",
       "         <td>Date:</td>          <td>2018-03-02 10:45</td>      <td>LL-Null:</td>        <td>-13738.</td>  \n",
       "</tr>\n",
       "<tr>\n",
       "   <td>No. Observations:</td>          <td>38557</td>           <td>Deviance:</td>       <td>26700.</td>   \n",
       "</tr>\n",
       "<tr>\n",
       "       <td>Df Model:</td>                <td>5</td>           <td>Pearson chi2:</td>    <td>3.86e+04</td>  \n",
       "</tr>\n",
       "<tr>\n",
       "     <td>Df Residuals:</td>            <td>38551</td>            <td>Scale:</td>         <td>1.0000</td>   \n",
       "</tr>\n",
       "<tr>\n",
       "        <td>Method:</td>               <td>IRLS</td>                <td></td>               <td></td>      \n",
       "</tr>\n",
       "</table>\n",
       "<table class=\"simpletable\">\n",
       "<tr>\n",
       "                               <td></td>                               <th>Coef.</th>  <th>Std.Err.</th>     <th>z</th>     <th>P>|z|</th> <th>[0.025</th>  <th>0.975]</th> \n",
       "</tr>\n",
       "<tr>\n",
       "  <th>Intercept</th>                                                  <td>-3.7169</td>  <td>0.0862</td>  <td>-43.1428</td> <td>0.0000</td> <td>-3.8858</td> <td>-3.5481</td>\n",
       "</tr>\n",
       "<tr>\n",
       "  <th>C(inday_icu_wkd)[T.weekend]</th>                                <td>1.3232</td>   <td>0.2388</td>   <td>5.5409</td>  <td>0.0000</td> <td>0.8551</td>  <td>1.7912</td> \n",
       "</tr>\n",
       "<tr>\n",
       "  <th>C(admission_type)[T.EMERGENCY]</th>                             <td>1.8014</td>   <td>0.0883</td>   <td>20.4002</td> <td>0.0000</td> <td>1.6283</td>  <td>1.9744</td> \n",
       "</tr>\n",
       "<tr>\n",
       "  <th>C(admission_type)[T.URGENT]</th>                                <td>1.6095</td>   <td>0.1496</td>   <td>10.7598</td> <td>0.0000</td> <td>1.3164</td>  <td>1.9027</td> \n",
       "</tr>\n",
       "<tr>\n",
       "  <th>C(inday_icu_wkd)[T.weekend]:C(admission_type)[T.EMERGENCY]</th> <td>-1.2071</td>  <td>0.2418</td>   <td>-4.9913</td> <td>0.0000</td> <td>-1.6812</td> <td>-0.7331</td>\n",
       "</tr>\n",
       "<tr>\n",
       "  <th>C(inday_icu_wkd)[T.weekend]:C(admission_type)[T.URGENT]</th>    <td>-0.9872</td>  <td>0.3036</td>   <td>-3.2521</td> <td>0.0011</td> <td>-1.5822</td> <td>-0.3922</td>\n",
       "</tr>\n",
       "</table>"
      ],
      "text/plain": [
       "<class 'statsmodels.iolib.summary2.Summary'>\n",
       "\"\"\"\n",
       "                                     Results: Generalized linear model\n",
       "===========================================================================================================\n",
       "Model:                            GLM                              AIC:                        26712.4403  \n",
       "Link Function:                    logit                            BIC:                        -380393.9926\n",
       "Dependent Variable:               hospital_expire_flag             Log-Likelihood:             -13350.     \n",
       "Date:                             2018-03-02 10:45                 LL-Null:                    -13738.     \n",
       "No. Observations:                 38557                            Deviance:                   26700.      \n",
       "Df Model:                         5                                Pearson chi2:               3.86e+04    \n",
       "Df Residuals:                     38551                            Scale:                      1.0000      \n",
       "Method:                           IRLS                                                                     \n",
       "-----------------------------------------------------------------------------------------------------------\n",
       "                                                            Coef.  Std.Err.    z     P>|z|   [0.025  0.975]\n",
       "-----------------------------------------------------------------------------------------------------------\n",
       "Intercept                                                  -3.7169   0.0862 -43.1428 0.0000 -3.8858 -3.5481\n",
       "C(inday_icu_wkd)[T.weekend]                                 1.3232   0.2388   5.5409 0.0000  0.8551  1.7912\n",
       "C(admission_type)[T.EMERGENCY]                              1.8014   0.0883  20.4002 0.0000  1.6283  1.9744\n",
       "C(admission_type)[T.URGENT]                                 1.6095   0.1496  10.7598 0.0000  1.3164  1.9027\n",
       "C(inday_icu_wkd)[T.weekend]:C(admission_type)[T.EMERGENCY] -1.2071   0.2418  -4.9913 0.0000 -1.6812 -0.7331\n",
       "C(inday_icu_wkd)[T.weekend]:C(admission_type)[T.URGENT]    -0.9872   0.3036  -3.2521 0.0011 -1.5822 -0.3922\n",
       "===========================================================================================================\n",
       "\n",
       "\"\"\""
      ]
     },
     "execution_count": 20,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# With effect modification\n",
    "adj_glm_int = smf.glm('hospital_expire_flag ~ C(inday_icu_wkd) * C(admission_type)', \n",
    "                     data=data, family=sm.families.Binomial()).fit()\n",
    "adj_glm_int.summary2()\n",
    "# drop1(adj.glm,test=\"Chisq\")"
   ]
  },
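  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The `drop1` calls commented out in the two cells above are R syntax and are not run here. As a rough hand check (a sketch, assuming that for this ungrouped binary outcome the deviance equals $-2\\log L$, so that $\\text{AIC} = \\text{deviance} + 2k$ with $k$ the number of fitted coefficients), a likelihood-ratio statistic for the interaction can be recovered from the AIC values reported in the two summaries:\n",
    "\n",
    "$$\\text{LR} = (26729.0141 - 2 \\cdot 4) - (26712.4403 - 2 \\cdot 6) \\approx 20.57$$\n",
    "\n",
    "The interaction adds two coefficients, so this is compared with a $\\chi^2$ distribution with 2 degrees of freedom, whose upper tail probability is $e^{-x/2}$:\n",
    "\n",
    "$$p = e^{-20.57/2} \\approx 3 \\times 10^{-5}$$\n",
    "\n",
    "This agrees with the rounded deviances in the summaries (26721 and 26700, a difference of about 21)."
   ]
  },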
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>admission_type</th>\n",
       "      <th>inday_icu_wkd</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>ELECTIVE</td>\n",
       "      <td>weekday</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>ELECTIVE</td>\n",
       "      <td>weekend</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>EMERGENCY</td>\n",
       "      <td>weekday</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>EMERGENCY</td>\n",
       "      <td>weekend</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>URGENT</td>\n",
       "      <td>weekday</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>URGENT</td>\n",
       "      <td>weekend</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "  admission_type inday_icu_wkd\n",
       "0       ELECTIVE       weekday\n",
       "1       ELECTIVE       weekend\n",
       "2      EMERGENCY       weekday\n",
       "3      EMERGENCY       weekend\n",
       "4         URGENT       weekday\n",
       "5         URGENT       weekend"
      ]
     },
     "execution_count": 21,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Create data structure to hold odds of hospital death\n",
    "def expand_grid(data_dict):\n",
    "    rows = itertools.product(*data_dict.values())\n",
    "    return pd.DataFrame.from_records(rows, columns=data_dict.keys())\n",
    "\n",
    "weekend_grid = expand_grid({'inday_icu_wkd': ['weekday', 'weekend'],\n",
    "                            'admission_type': ['ELECTIVE', 'EMERGENCY', 'URGENT']})\n",
    "\n",
    "weekend_grid"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In the first model (no interaction), we see that although the effect of weekend is almost halved, it remains statistically significant, after adjusting for admission type (p<0.001).\n",
    "\n",
    "In the second model, we are primarily interested in the significance of the interaction.  We can see when assessed with the `drop1` function, the interaction (`weekend:admission_type`) is statistically significant (p<0.001), suggesting that the effect of weekend may be different depending on which hospital admission type you are.  How exactly to interpret this:\n",
    "\n",
    "One way of looking at this complexity is by computing the odds ratio in each of the levels of admission type.  We can do this using the `predict` function, which by default outputs the log-odds of death.  If for each hospital admission type, we calculate the log odds of death for each of the levels of weekend,"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {},
   "outputs": [],
   "source": [
    "def prob2logodds(prob):\n",
    "    odds = prob / (1 - prob)\n",
    "    logodds = np.log(odds)\n",
    "    return logodds"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "metadata": {
    "scrolled": true
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th>predict</th>\n",
       "      <th>log_odds</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>inday_icu_wkd</th>\n",
       "      <th>admission_type</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>weekday</th>\n",
       "      <th>ELECTIVE</th>\n",
       "      <td>0.023732</td>\n",
       "      <td>-3.716925</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>weekend</th>\n",
       "      <th>ELECTIVE</th>\n",
       "      <td>0.083650</td>\n",
       "      <td>-2.393754</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>weekday</th>\n",
       "      <th>EMERGENCY</th>\n",
       "      <td>0.128359</td>\n",
       "      <td>-1.915548</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>weekend</th>\n",
       "      <th>EMERGENCY</th>\n",
       "      <td>0.141909</td>\n",
       "      <td>-1.799525</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>weekday</th>\n",
       "      <th>URGENT</th>\n",
       "      <td>0.108382</td>\n",
       "      <td>-2.107381</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>weekend</th>\n",
       "      <th>URGENT</th>\n",
       "      <td>0.145363</td>\n",
       "      <td>-1.771439</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                               predict  log_odds\n",
       "inday_icu_wkd admission_type                    \n",
       "weekday       ELECTIVE        0.023732 -3.716925\n",
       "weekend       ELECTIVE        0.083650 -2.393754\n",
       "weekday       EMERGENCY       0.128359 -1.915548\n",
       "weekend       EMERGENCY       0.141909 -1.799525\n",
       "weekday       URGENT          0.108382 -2.107381\n",
       "weekend       URGENT          0.145363 -1.771439"
      ]
     },
     "execution_count": 23,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "weekend_grid['predict'] = adj_glm_int.predict(weekend_grid[['inday_icu_wkd','admission_type']])\n",
    "weekend_grid['log_odds'] = prob2logodds(weekend_grid['predict'])\n",
    "weekend_grid.set_index(['inday_icu_wkd','admission_type'], inplace=True)\n",
    "weekend_grid"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can now compute the log odds ratio ($\\log(OR) = \\mathrm{logOdds}_{weekend} - \\mathrm{logOdds}_{weekday}$) and exponentiate it to get the odds ratio:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 24,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "admission_type\n",
       "ELECTIVE     3.755307\n",
       "EMERGENCY    1.123022\n",
       "URGENT       1.399257\n",
       "Name: log_odds, dtype: float64"
      ]
     },
     "execution_count": 24,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "diff_grid = weekend_grid.loc['weekend']['log_odds'] - weekend_grid.loc['weekday']['log_odds']\n",
    "np.exp(diff_grid)"
   ]
  },
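  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a quick sanity check on the output above, we can redo the ELECTIVE calculation by hand. This is our own arithmetic, using the values printed in the table of predictions and rounded to four decimal places:\n",
    "\n",
    "$$\\log(OR_{ELECTIVE}) = -2.3938 - (-3.7169) = 1.3231$$\n",
    "\n",
    "$$OR_{ELECTIVE} = e^{1.3231} \\approx 3.76$$\n",
    "\n",
    "The same ratio follows directly from the predicted probabilities, since the odds are $p / (1 - p)$:\n",
    "\n",
    "$$\\frac{0.083650 / (1 - 0.083650)}{0.023732 / (1 - 0.023732)} \\approx \\frac{0.0913}{0.0243} \\approx 3.76$$"
   ]
  },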
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "So, this mirrors what we saw above. The weekend effect is modest for EMERGENCY (OR ≈ 1.12) and URGENT (OR ≈ 1.40) admissions, but an ELECTIVE admission occurring on a weekend has odds of mortality almost four times those of an ELECTIVE admission on a weekday (OR ≈ 3.76). This seems particularly odd -- patients are not usually admitted to a hospital electively on a weekend.\n",
    "\n",
    "What do you think?\n",
    "\n",
    "- Do patients admitted on a weekend have a higher rate of mortality than those admitted during the week?\n",
    "- Who is most affected, if at all?\n",
    "- Which factors can you rule out as causes of this effect? For example, are the patients simply sicker on a weekend? Are they more likely to have complications?\n",
    "\n",
    "Looking forward to seeing what you come up with!"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 2
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython2",
   "version": "2.7.13"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}