Predicting patient outcomes is critical to healthcare management, enabling hospitals to optimize resources and improve patient care. Machine learning and deep learning techniques have proven valuable for survival prediction, offering insights that can help guide treatment plans and prioritize resources.
This article walks through building, explaining, and deploying a patient survival prediction model.
This article was published as a part of the Data Science Blogathon.
The Hospital Patient Survival Prediction-Deployment project is a machine learning-based web application that predicts the probability of the patient surviving during their stay in a hospital. This project can be helpful for healthcare organizations and hospitals to prioritize care and allocate resources for patients at higher risk of adverse outcomes. It also has the potential to improve patient care and overall hospital efficiency.
The complete code and related files for this project can be found at the following GitHub repository: https://github.com/Abhissaro/Hospital_PatientSurvival_Prediction-Deployment
This project aims to build and deploy a machine learning model that accurately predicts the survival probability of a patient during their hospital stay, based on their medical records and other relevant information.
Our primary goal is to classify the ‘hospital_death’ variable, which is binary in nature, using the 84 available features as predictors. The classification process will be carried out step-by-step, focusing on different aspects of the problem. The model’s performance will be evaluated using accuracy and the area under the Receiver Operating Characteristic (ROC) curve as the primary scoring metrics.
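When the positive class is rare, accuracy alone can look strong even for a model that never predicts a death, which is one reason to report ROC AUC alongside it. The sketch below uses synthetic labels rather than the GOSSIS data; the 10% positive rate and all variable names are assumptions made for illustration.

```python
import numpy as np
from sklearn.metrics import accuracy_score, roc_auc_score

rng = np.random.default_rng(0)

# Synthetic labels with roughly 10% positives (an assumed ratio, for illustration only)
y_true = (rng.random(1000) < 0.10).astype(int)

# A "model" that gives every patient the same score, so it always predicts class 0
constant_scores = np.zeros(len(y_true))
constant_labels = (constant_scores >= 0.5).astype(int)

baseline_accuracy = accuracy_score(y_true, constant_labels)
baseline_auc = roc_auc_score(y_true, constant_scores)

print(f"Accuracy of always predicting 0: {baseline_accuracy:.3f}")
print(f"ROC AUC of a constant score:     {baseline_auc:.3f}")
```

The constant predictor scores high on accuracy but only 0.5 on ROC AUC, because it cannot rank any positive case above a negative one.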
To understand and complete this project, you should have a basic understanding of Python programming, machine learning algorithms, and libraries such as pandas, scikit-learn, Keras, and Streamlit.
The dataset used for this project comes from the Global Open Source Severity of Illness Score (GOSSIS) study. The study, published in the journal Critical Care Medicine, aimed to develop and evaluate a large, open-source, international, multicenter dataset of intensive care unit (ICU) patients. The dataset has nearly 100K rows and includes clinical parameters and patient outcomes, with features such as age, gender, type of admission, length of stay, and diagnosis. Detailed information about each feature is provided in the dataset’s documentation. You can find more information about the dataset and the study in the following link:
Please note that you might need to request access to the dataset from the authors or the institution responsible for the study.
Below is an overview of the basic steps followed to build and fine-tune the patient survival prediction model, given for orientation before we get to the code.
Let’s dive into the code and start building the Hospital Patient Survival Prediction notebook (.ipynb file) and the related app.
!pip install tensorflow
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline
import warnings
warnings.filterwarnings('ignore')
df = pd.read_csv("D:/TMLC/Patient survival DL1/Dataset/Dataset.csv")
df.head()
df.info()
df.shape
# Checking missing values
df.isnull().any().sum()
df.isnull().sum().sort_values(ascending=True)
# Statistical measures about the data
df.describe()
# Checking the distribution of the target variable
df['hospital_death'].value_counts()
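# Restrict the mean to numeric columns; recent pandas versions raise an error on object columns
df.groupby('hospital_death').mean(numeric_only=True)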
sns.set_style('whitegrid')
sns.countplot(x='hospital_death',data=df)
Output observations:
object_columns = df.select_dtypes(include=['object']).columns
cols = df.select_dtypes([np.number]).columns
# Checking outliers
sns.distplot(df['hospital_death'])
sns.distplot(df['hepatic_failure'])
sns.distplot(df['solid_tumor_with_metastasis'])
# Filling missing values
df[cols] = df[cols].fillna(df[cols].mean())
object_columns = ['ethnicity', 'gender', 'hospital_admit_source', 'icu_admit_source',
'icu_stay_type', 'icu_type', 'apache_3j_bodysystem',
'apache_2_bodysystem']
for i in object_columns:
    # Fill each categorical column with its most frequent value
    df[i] = df[i].fillna(df[i].mode()[0])
    print(i)
df.isnull().sum()
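Filling missing values with statistics computed on the full dataset lets information from the eventual test rows influence training. One common alternative, sketched here on a tiny made-up frame, is to learn the fill values from the training split only with scikit-learn's SimpleImputer. The column names and values are hypothetical, and this is not the approach the notebook uses.

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer

# Tiny hypothetical frames standing in for the train and test splits
train = pd.DataFrame({"age": [60.0, np.nan, 80.0], "bmi": [22.0, 28.0, np.nan]})
test = pd.DataFrame({"age": [np.nan], "bmi": [30.0]})

# Learn the column means on the training rows only
imputer = SimpleImputer(strategy="mean")
train_filled = pd.DataFrame(imputer.fit_transform(train), columns=train.columns)

# Reuse the training means to fill the test rows
test_filled = pd.DataFrame(imputer.transform(test), columns=test.columns)

print(train_filled)
print(test_filled)
```

The missing test age is filled with the training mean (70.0), so nothing from the test rows feeds back into the fitted statistics.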
# Encoding categorical features
from sklearn.preprocessing import LabelEncoder
le = LabelEncoder()
df[object_columns] = df[object_columns].apply(le.fit_transform)
df.head()
df.info()
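Passing a single LabelEncoder to apply refits it on every column, so afterwards it only remembers the classes of the last column it saw. If the encodings need to be reused later, for example to encode new input, one option is to keep one encoder per column. A minimal sketch with hypothetical data (the frame and its values are made up for illustration):

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Small made-up frame with two categorical columns
df_demo = pd.DataFrame({
    "gender": ["M", "F", "F", "M"],
    "icu_type": ["MICU", "CCU", "MICU", "SICU"],
})

# Keep a separate fitted encoder for every categorical column
encoders = {}
for col in ["gender", "icu_type"]:
    encoders[col] = LabelEncoder()
    df_demo[col] = encoders[col].fit_transform(df_demo[col])

# The fitted classes stay available per column and can encode new values
gender_classes = list(encoders["gender"].classes_)
icu_code = int(encoders["icu_type"].transform(["SICU"])[0])
print(gender_classes, icu_code)
```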
Output’s key observations:
X = df.drop(columns='hospital_death', axis=1)
Y = df['hospital_death']
from sklearn.model_selection import train_test_split
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=0.2, random_state=2)
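With an imbalanced target, a plain random split can leave the two splits with slightly different class ratios. The stratify argument of train_test_split keeps the ratio the same in both. The sketch below uses synthetic data with an assumed 10% positive rate; it is an optional variation, not what the notebook does.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in: 1000 rows, 10% positives (assumed ratio)
X_demo = np.arange(1000).reshape(-1, 1)
y_demo = np.array([1] * 100 + [0] * 900)

# stratify=y_demo preserves the class ratio in both splits
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=2, stratify=y_demo
)

print("Positive rate (train):", y_tr.mean())
print("Positive rate (test): ", y_te.mean())
```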
from sklearn.feature_selection import mutual_info_classif
mutual_info = mutual_info_classif(X_train, Y_train)
mutual_info = pd.Series(mutual_info)
mutual_info.index = X_train.columns
mutual_info.sort_values(ascending=False).plot.bar(figsize=(20, 8))
from sklearn.feature_selection import SelectKBest
sel_six_cols = SelectKBest(mutual_info_classif, k=6)
sel_six_cols.fit(X_train, Y_train)
X_train_new = sel_six_cols.transform(X_train)
X_test_new = sel_six_cols.transform(X_test)
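SelectKBest.transform returns a plain NumPy array, so the column names are lost. The boolean mask from get_support() can recover them, which is useful when the same features must be requested later in a form. A small sketch on synthetic data, where the column names and the choice of k=1 are assumptions for illustration:

```python
from functools import partial

import numpy as np
import pandas as pd
from sklearn.feature_selection import SelectKBest, mutual_info_classif

rng = np.random.default_rng(0)
y_demo = rng.integers(0, 2, size=300)

# One column that tracks the label closely and two columns of pure noise
X_demo = pd.DataFrame({
    "noise_a": rng.normal(size=300),
    "signal": y_demo + rng.normal(scale=0.1, size=300),
    "noise_b": rng.normal(size=300),
})

# Fix random_state so the mutual information estimate is reproducible
score_func = partial(mutual_info_classif, random_state=0)
selector = SelectKBest(score_func, k=1).fit(X_demo, y_demo)

# Recover the names of the selected columns from the boolean mask
selected_columns = list(X_demo.columns[selector.get_support()])
X_selected = pd.DataFrame(selector.transform(X_demo), columns=selected_columns)
print(selected_columns)
```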
Output’s Key observation:
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
X_train_std = scaler.fit_transform(X_train_new)
X_test_std = scaler.transform(X_test_new)
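Because the network is trained on standardized features, any code that later calls the model on new data needs the same fitted scaler. One way to make it available is to persist it with joblib, as sketched below. The file name scaler.joblib and the demo array are assumptions and do not come from the project repository.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical stand-in for the selected training features
X_train_demo = np.array([[1.0, 10.0], [3.0, 30.0], [5.0, 50.0]])
scaler_demo = StandardScaler().fit(X_train_demo)

# 'scaler.joblib' is an assumed file name, written to a temp directory here
scaler_path = os.path.join(tempfile.gettempdir(), "scaler.joblib")
joblib.dump(scaler_demo, scaler_path)

# Later, for example inside the web app, load it and transform raw input
restored = joblib.load(scaler_path)
new_row = np.array([[3.0, 30.0]])
print(restored.transform(new_row))
```

The new row equals the training mean, so the restored scaler maps it to zeros, confirming that the fitted statistics survived the round trip.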
from keras import Sequential
from keras.layers import Dense
model = Sequential()
model.add(Dense(12, input_dim=6, activation='relu'))
model.add(Dense(8, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
import tensorflow as tf
model.compile(
    loss = tf.keras.losses.binary_crossentropy,
    optimizer = tf.keras.optimizers.Adam(learning_rate = 0.02),
    metrics = [
        tf.keras.metrics.BinaryAccuracy(name='accuracy'),
        tf.keras.metrics.Precision(name='precision'),
        tf.keras.metrics.Recall(name='recall')
    ]
)
# Fit the Keras model on the dataset
history = model.fit(X_train_std, Y_train, epochs=20, validation_split=0.2, batch_size=10)
Output’s key observation:
From these results, it can be observed that the model achieves relatively high accuracy on both the training and validation sets.
# Evaluate the Keras model on the test set
# evaluate() returns the loss followed by the compiled metrics, in order
loss, accuracy, precision, recall = model.evaluate(X_test_std, Y_test)
print('Accuracy: %.2f' % (accuracy*100))
print('Precision: %f' % precision)
print('Recall: %f' % recall)
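Precision and recall depend on the threshold used to turn predicted probabilities into class labels. The sketch below shows that step explicitly on a handful of made-up values. In the notebook the probabilities would come from something like model.predict(X_test_std).ravel(); the arrays here are hypothetical and are not results from the trained model.

```python
import numpy as np
from sklearn.metrics import confusion_matrix, precision_score, recall_score

# Hypothetical true labels and predicted death probabilities
y_true_demo = np.array([0, 0, 0, 0, 1, 1, 0, 1])
y_prob_demo = np.array([0.1, 0.4, 0.6, 0.2, 0.8, 0.3, 0.05, 0.9])

# Convert probabilities to class labels at a 0.5 threshold
y_pred_demo = (y_prob_demo >= 0.5).astype(int)

# Rows of the confusion matrix are true classes, columns are predicted classes
tn, fp, fn, tp = confusion_matrix(y_true_demo, y_pred_demo).ravel()
precision_demo = precision_score(y_true_demo, y_pred_demo)
recall_demo = recall_score(y_true_demo, y_pred_demo)

print(f"TN={tn} FP={fp} FN={fn} TP={tp}")
print(f"Precision={precision_demo:.3f} Recall={recall_demo:.3f}")
```

Lowering the threshold would trade precision for recall, which may matter when missing a high-risk patient is costlier than a false alarm.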
# Plotting accuracy and loss curves
plt.plot(history.history['accuracy'])
plt.plot(history.history['val_accuracy'])
plt.title('model accuracy')
plt.ylabel('accuracy')
plt.xlabel('epoch')
plt.legend(['training data', 'validation data'], loc='lower right')
plt.show()  # finish the accuracy figure so the loss curves get their own axes
plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.title('model loss')
plt.ylabel('loss')
plt.xlabel('epoch')
plt.legend(['training data', 'validation data'], loc='upper right')
plt.show()
Output’s Key observation:
# Saving the model
model.save('keras_model.h5')
# Loading the model
from keras.models import load_model
model = load_model('keras_model.h5')
Key observation:
import shap
shap.initjs()
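# The model was trained on standardized features, so explain it on X_train_std rather than X_train_new
X_sample = pd.DataFrame(X_train_std, columns=X_train.columns[sel_six_cols.get_support()]).sample(100)
# KernelExplainer is slow on large backgrounds, so use a small random sample of the training data
background = shap.sample(X_train_std, 100)
explainer = shap.KernelExplainer(model.predict, background)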
shap_values = explainer.shap_values(X_sample)
# SHAP summary plot
shap.summary_plot(shap_values, X_sample, plot_type="bar")
# Force plot
shap.force_plot(explainer.expected_value[0], shap_values[0], features=X_sample)
# SHAP summary plot for individual features
shap.summary_plot(shap_values[0], features=X_sample)
Key Observations:
To deploy the patient survival prediction model using Streamlit, we need to prepare several files: the Streamlit code for the front end, the Python file that processes user data and returns predictions, and the supporting configuration files. Each is explained below.
import pandas as pd
import numpy as np
import streamlit as st
from keras.models import load_model
from tensorflow import keras
import tensorflow as tf
from prediction import get_prediction
st.set_page_config(page_title='Hospital Patient Survival Prediction', page_icon="🏥", layout="wide", initial_sidebar_state='expanded')
model = load_model('keras_model.h5')
# creating option list for dropdown menu
features = ['apache_3j_diagnosis','gcs_motor_apache', 'd1_lactate_max', 'd1_lactate_min','apache_4a_hospital_death_prob', 'apache_4a_icu_death_prob']
st.markdown("<h1 style='text-align: center;'>Patient Survival Prediction App 🏥 </h1>", unsafe_allow_html=True)
def main():
    with st.form('prediction_form'):
        st.header("Enter values for the following features:")
        apache_3j_diagnosis = st.slider('apache_3j_diagnosis', 0.0300, 2201.05, value=1.0000, format="%f")
        gcs_motor_apache = st.slider('gcs_motor_apache', 1.0000, 6.0000, value=1.0000, format="%f")
        d1_lactate_max = st.selectbox('d1_lactate_max:', [3.970, 5.990, 0.900, 1.100, 1.200, 2.927, 9.960, 19.500])
        d1_lactate_min = st.selectbox('d1_lactate_min:', [2.380, 2.680, 6.860, 0.900, 1.100, 1.200, 1.000, 2.125])
        apache_4a_hospital_death_prob = st.selectbox('apache_4a_hospital_death_prob:', [0.990, 0.980, 0.950, 0.040, 0.030, 0.086, 0.020, 0.010])
        apache_4a_icu_death_prob = st.selectbox('apache_4a_icu_death_prob:', [0.950, 0.940, 0.920, 0.030, 0.043, 0.020, 0.010])
        submit = st.form_submit_button("Predict")

    if submit:
        # Feature order must match the order used during training
        data = np.array([apache_3j_diagnosis, gcs_motor_apache, d1_lactate_max, d1_lactate_min, apache_4a_hospital_death_prob, apache_4a_icu_death_prob]).reshape(1, -1)
        pred = get_prediction(data=data, model=model)
        # The model outputs the probability of hospital_death, so a low value means predicted survival
        survival = 'Yes' if pred[0][0] < 0.5 else 'No'
        st.write(f"The predicted Patient Survival is: {survival}")

if __name__ == '__main__':
    main()
"prediction" file:
import numpy as np
import tensorflow as tf
from tensorflow import keras
from keras.models import load_model
model = load_model('keras_model.h5')
def get_prediction(data, model):
    """
    Predict the class of a given data point.
    """
    return model.predict(data)
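Since the network was trained on standardized features and predicts hospital_death, a prediction helper has to scale the raw form input and map a low death probability to survival. The sketch below shows one possible shape for such a helper. It assumes the StandardScaler fitted during training has been saved and loaded alongside the model; predict_survival and StubModel are hypothetical names used only so the example runs without TensorFlow.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler


class StubModel:
    """Stand-in for the Keras model; returns a fixed death probability."""

    def __init__(self, probability):
        self.probability = probability

    def predict(self, data):
        return np.full((data.shape[0], 1), self.probability)


def predict_survival(data, model, scaler, threshold=0.5):
    """Scale raw inputs, predict the death probability, and map it to a label."""
    scaled = scaler.transform(data)
    death_prob = float(model.predict(scaled)[0][0])
    # A death probability at or above the threshold means predicted non-survival
    survival = "No" if death_prob >= threshold else "Yes"
    return death_prob, survival


# Hypothetical scaler fitted on two made-up rows of six features
scaler_stub = StandardScaler().fit(np.array([[0.0] * 6, [1.0] * 6]))
row = np.array([[0.5] * 6])

high_risk = predict_survival(row, StubModel(0.9), scaler_stub)
low_risk = predict_survival(row, StubModel(0.1), scaler_stub)
print(high_risk, low_risk)
```

Keeping the threshold as a parameter makes it easy to adjust if the application should favor recall over precision.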
Preparing Files for Streamlit Deployment:
Creating “requirements.txt”: This file lists all the libraries required for your project. To generate it, follow these steps:
streamlit==1.15.2
numpy==1.21.5
pandas==1.3.5
keras==2.11.0
tensorflow==2.11.0
scikit-learn==1.0.2
protobuf==3.19.6
"runtime.txt":
Python 3.11.0
"setup.sh" file:
After preparing the necessary files, create a new repository on GitHub and push the project folder containing all the files to the repository. You can follow this guide on adding files to a GitHub repository or watch this video tutorial.
Once your deep learning project is pushed to GitHub, we can proceed with the deployment process on Streamlit.
Deploying the deep learning Project with Streamlit:
To deploy your project on Streamlit, you must have an account on Streamlit Sharing. After creating an account and signing in, follow these steps:
Streamlit will deploy your app, and you will receive a URL to access the deployed application.
That’s all, folks! We’ve successfully created a cutting-edge, visually appealing Hospital Patient Survival Prediction App.
Introducing the Patient Survival Prediction App, an innovative and user-friendly tool designed to revolutionize the healthcare industry. This powerful web-based application is hosted live and can be accessed at Hospital Patient Survival Prediction.
The Patient Survival Prediction deep learning model App for healthcare harnesses the power of machine learning and advanced analytics to predict a patient’s likelihood of survival in a hospital setting. The cutting-edge model, trained on numerous critical features, is adept at processing complex medical data to deliver accurate predictions.
We have chosen the Render cloud platform to deploy this fantastic app, ensuring seamless performance and exceptional user experience. By leveraging the robust capabilities of Render, we have made the app available to medical professionals and researchers globally.
Key Takeaways
The media shown in this article is not owned by Analytics Vidhya and is used at the Author’s discretion.