Objective
To develop and test predictive models of housing instability and homelessness based on responses to a brief screening instrument administered throughout the Veterans Health Administration (VHA).
Data Sources/Study Setting
Electronic medical record data from 5.8 million Veterans who responded to the VHA's Homelessness Screening Clinical Reminder (HSCR) between October 2012 and September 2015.
Study Design
We randomly selected 80% of Veterans in our sample to develop predictive models. We evaluated the performance of both logistic regression and random forests, a machine learning algorithm, using the remaining 20% of cases.
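The split-and-evaluate design can be illustrated with a minimal sketch in plain Python. It is not the study's code: the function names, the random seed, and the toy labels are hypothetical, and the model-fitting step itself is omitted. The sketch only shows a random 80/20 partition and how sensitivity and specificity would be computed on hold-out predictions.

```python
import random


def split_train_test(ids, train_fraction=0.8, seed=0):
    """Randomly partition ids into a development set and a hold-out set."""
    shuffled = list(ids)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels coded 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Hypothetical cohort of 1,000 identifiers, split 80/20.
train_ids, test_ids = split_train_test(range(1000))

# Toy hold-out labels and predictions (illustrative only, not study data).
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 1, 0]
sens, spec = sensitivity_specificity(y_true, y_pred)
```

In a comparison like the one described, the same hold-out labels would be scored once against each model's predictions, so that differences in sensitivity and specificity reflect the models rather than the sample.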
Data Collection/Extraction Methods
Data were extracted from two sources: the VHA's Corporate Data Warehouse and National Homeless Registry.
Principal Findings
Performance for all models was acceptable or better. Random forest models were more sensitive than logistic regression in predicting both housing instability and homelessness, but less specific in predicting housing instability. Rates of positive screens for both outcomes were highest among Veterans in the top strata of model-predicted risk.
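The comparison of positive-screen rates across risk strata can be sketched as follows. This is a hedged illustration, not the authors' procedure: the number of strata, the function name, and the toy risks and outcomes are all assumptions made for the example.

```python
def positive_rate_by_stratum(risks, outcomes, n_strata=5):
    """Sort cases by predicted risk, cut them into equal-sized strata,
    and return the observed positive rate in each stratum (lowest risk first)."""
    order = sorted(range(len(risks)), key=lambda i: risks[i])
    rates = []
    for s in range(n_strata):
        lo = s * len(order) // n_strata
        hi = (s + 1) * len(order) // n_strata
        members = order[lo:hi]
        if members:
            rates.append(sum(outcomes[i] for i in members) / len(members))
        else:
            rates.append(float("nan"))  # stratum is empty when cases < strata
    return rates


# Toy predicted risks and screening outcomes (1 = positive screen); not study data.
risks = [0.05, 0.10, 0.20, 0.30, 0.40, 0.55, 0.60, 0.70, 0.80, 0.95]
outcomes = [0, 0, 0, 0, 0, 1, 0, 1, 1, 1]
rates = positive_rate_by_stratum(risks, outcomes, n_strata=5)
```

A well-performing model should show rates that rise toward the highest-risk stratum, which is the pattern the findings describe.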
Conclusions
Predictive models based on medical record data can identify Veterans likely to report housing instability and homelessness, making the screening process more efficient and informing new engagement strategies. Our findings have implications for similar instruments in other health care systems.