Hospital Flow Forecasting with LSTM

Applied Machine Learning | 2019 | Applied Prototype

Overview

This project explored neural time-series forecasting for hospital flow data. The workflow transformed emergency-department timestamps into temporal features and tested LSTM-based models for short-horizon forecasting of operational dynamics.

The focus was not deterministic prediction of individual patients. The goal was to build an applied forecasting prototype for aggregated hospital activity: arrival intensity, inter-arrival times, temporal variability and short-term load patterns.

Problem Setting

Hospital operations are shaped by irregular arrivals, daily and weekly cycles, triage composition and local demand shocks. Raw event timestamps therefore need to be converted into structured time-series data before they can support forecasting or planning.

The project framed the problem as multi-step forecasting over aggregated hospital-flow indicators, using historical time windows to forecast future intervals.
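A minimal sketch of this framing, assuming the aggregated indicators are already held in a NumPy array with one row per interval. The function name make_windows and its parameters are illustrative and are not taken from the original project.

```python
import numpy as np


def make_windows(series, n_input, n_output, target_col=0):
    """Slice a time-ordered array into (past window, future window) pairs.

    series: shape (n_steps,) or (n_steps, n_features)
    Returns X with shape (n_samples, n_input, n_features)
    and y with shape (n_samples, n_output).
    """
    series = np.asarray(series, dtype=float)
    if series.ndim == 1:
        series = series[:, None]  # treat a single indicator as one feature

    X, y = [], []
    # Slide one interval at a time; stop when a full future window no longer fits.
    for start in range(len(series) - n_input - n_output + 1):
        split = start + n_input
        X.append(series[start:split])
        y.append(series[split:split + n_output, target_col])
    return np.array(X), np.array(y)


# Example: 4 past intervals are used to forecast the next 2.
X, y = make_windows(np.arange(10), n_input=4, n_output=2)
```

Each row of X is a lagged history window and the matching row of y holds the intervals that follow it, which is the supervised form a sequence model expects.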

Data Preparation

The pipeline started from timestamped triage and acceptance records. These records were cleaned, ordered and converted into time-indexed variables.

Parsed acceptance timestamps into date, hour and weekday fields.
Computed inter-arrival times from consecutive records.
Handled day-boundary effects when differencing timestamps.
Aggregated observations into fixed time windows, including three-hour intervals.
Built train and test datasets across different calendar periods.
Constructed mean, standard deviation and count-based flow indicators.
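
The steps above could be sketched in pandas roughly as follows. The column names (accepted_at, inter_arrival_min, arrivals, mean_gap_min, std_gap_min) and the function name are hypothetical, since the original schema is not described here. Differencing full datetimes rather than clock times is one way to keep gaps that cross midnight positive; the original handling of day boundaries may have differed.

```python
import pandas as pd


def build_flow_indicators(acceptance_times, window=pd.Timedelta(hours=3)):
    """Turn raw acceptance timestamps into fixed-window flow indicators."""
    ts = pd.to_datetime(pd.Series(acceptance_times))
    ts = ts.sort_values().reset_index(drop=True)
    events = pd.DataFrame({"accepted_at": ts})

    # Inter-arrival time in minutes between consecutive records.
    # Full datetimes are differenced, so a 23:50 -> 00:10 gap is 20 minutes.
    gaps = events["accepted_at"].diff().dt.total_seconds() / 60.0
    events["inter_arrival_min"] = gaps

    # Aggregate into fixed windows (three hours by default).
    grouped = events.set_index("accepted_at")["inter_arrival_min"].resample(window)
    flow = pd.DataFrame({
        "arrivals": grouped.size(),       # count of records per window
        "mean_gap_min": grouped.mean(),   # mean inter-arrival time
        "std_gap_min": grouped.std(),     # variability of inter-arrival time
    })

    # Calendar fields derived from the window start.
    flow["hour"] = flow.index.hour
    flow["weekday"] = flow.index.dayofweek
    return flow


flow = build_flow_indicators([
    "2019-03-01 23:50",
    "2019-03-02 00:10",
    "2019-03-02 00:40",
    "2019-03-02 04:00",
])
```

Train and test sets can then be taken as disjoint calendar ranges of the resulting frame, so that no test window overlaps the training period.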

Modeling Approach

The neural forecasting component used LSTM models to learn temporal dependencies from lagged windows of hospital-flow variables. One implementation followed an encoder-decoder structure for multi-step forecasts.

In this setting, the encoder reads the past sequence and compresses it into an internal representation. The decoder uses that representation to generate a future sequence, such as the next set of operational intervals.

LSTM layers for sequential dependency modeling.
RepeatVector and sequence-returning LSTM layers for encoder-decoder forecasting.
TimeDistributed dense layers for step-by-step output generation.
Min-max scaling and supervised learning windows for neural-network input.
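
A hedged sketch of how these components might fit together. The layer width, optimizer, loss and function names are assumptions chosen for illustration, not the original configuration. The scaling helpers are a hand-rolled NumPy equivalent of min-max scaling (scikit-learn's MinMaxScaler offers the same transformation), fitted on the training period only so that test data does not leak into the scaling parameters.

```python
import numpy as np


def fit_minmax(train):
    """Return per-feature minimum and range computed on training data only."""
    train = np.asarray(train, dtype=float)
    lo = train.min(axis=0)
    span = train.max(axis=0) - lo
    span = np.where(span == 0, 1.0, span)  # avoid division by zero
    return lo, span


def apply_minmax(values, lo, span):
    """Scale values with parameters from fit_minmax; test values may fall outside [0, 1]."""
    return (np.asarray(values, dtype=float) - lo) / span


def build_encoder_decoder(n_input, n_features, n_output, units=64):
    """Encoder-decoder LSTM mapping n_input past steps to n_output future steps."""
    # Imported here so the scaling helpers work without TensorFlow installed.
    from tensorflow import keras
    from tensorflow.keras import layers

    model = keras.Sequential([
        keras.Input(shape=(n_input, n_features)),
        layers.LSTM(units),                         # encoder: compress the past window
        layers.RepeatVector(n_output),              # repeat the encoding once per future step
        layers.LSTM(units, return_sequences=True),  # decoder: emit one state per future step
        layers.TimeDistributed(layers.Dense(1)),    # one forecast value per step
    ])
    model.compile(optimizer="adam", loss="mse")
    return model
```

With this layout the model expects inputs of shape (samples, n_input, n_features) and targets of shape (samples, n_output, 1), so a two-dimensional target array needs a trailing axis added before fitting.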

Validation

The prototype included walk-forward validation logic. Each test window was forecast from the history available at that point and compared with the realized values. The realized window was then appended to the history before the next forecast was made.

Evaluation used RMSE-based diagnostics, computed both over the forecast as a whole and separately for each forecast step. This made it possible to inspect whether errors were concentrated in specific parts of the forecast horizon.
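
The validation loop and the two levels of RMSE can be sketched as below. The persistence forecaster is only a placeholder standing in for the trained LSTM, and all function names are illustrative rather than taken from the original code.

```python
import numpy as np


def persistence_forecast(history, n_output):
    """Placeholder model: repeat the most recently observed window."""
    return np.asarray(history[-1], dtype=float)[:n_output]


def walk_forward(train_windows, test_windows, forecast_fn):
    """Forecast each test window, then add the realized window to the history."""
    history = [np.asarray(w, dtype=float) for w in train_windows]
    predictions = []
    for actual in test_windows:
        actual = np.asarray(actual, dtype=float)
        predictions.append(forecast_fn(history, len(actual)))
        history.append(actual)  # realized values become available for the next forecast
    return np.array(predictions)


def rmse_report(actual, predicted):
    """Return overall RMSE and RMSE per forecast step (column)."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    squared_error = (actual - predicted) ** 2
    per_step = np.sqrt(squared_error.mean(axis=0))
    overall = np.sqrt(squared_error.mean())
    return overall, per_step


train = [[1, 2, 3]]
test = [[2, 2, 3], [2, 4, 3]]
predictions = walk_forward(train, test, persistence_forecast)
overall, per_step = rmse_report(test, predictions)
```

Comparing the entries of per_step shows whether error grows toward the end of the horizon or is concentrated at particular steps, while overall gives a single summary figure for model comparison.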

Interpretation

The project is best understood as an applied machine-learning prototype for healthcare operations, not as a production-grade hospital staffing system. Its value is in the full workflow: event-data cleaning, temporal aggregation, supervised sequence construction, LSTM modeling and forecast evaluation.

It also provides a useful bridge between time-series econometrics and modern deep-learning forecasting: the same operational problem can be studied with classical models, recurrent neural networks and more recent sequence models.

Technologies

Python
pandas
NumPy
scikit-learn
Keras / TensorFlow
LSTM recurrent neural networks
Matplotlib

Future Extensions

A modern version of this project would compare LSTM forecasts against classical baselines, gradient-boosted tabular models and newer sequence architectures such as temporal convolutional networks and transformer-based models. It would also add richer calendar features, uncertainty intervals and explicit operational metrics for capacity planning.