There has been a dramatic increase in the types of microdata available, which holds great promise for health services research. However, legislative efforts to protect individual privacy have reduced the flow of health care data for research purposes and increased costs and delays, affecting the quality of analysis.
This paper provides an overview of the challenges raised by concerns about data confidentiality in the context of health services research, the current methodologies used to ensure data security, and a description of one successful approach to balancing access and privacy.
We analyze the issues of access and privacy using a conceptual framework based on balancing the risk of reidentification with the utility associated with data analysis. The guiding principle should be to generate released data that are as close to the maximum acceptable risk as possible. HIPAA and other privacy measures can perhaps be seen as having lowered the “maximum acceptable risk” level and rendered some data unreleasable.
We discuss the levels of risk and utility associated with different types of data used in health services research, the ability to link data from multiple sources, and current models of data sharing and their limitations.
One particularly compelling approach is to establish a remote access “data enclave,” where statistical protections are applied to the data, technical protections ensure compliance with data‐sharing requirements, and operational controls limit researchers' access to the data they need for their specific research questions.
We recommend reducing delays in access to data for research, increasing the use of remote access data enclaves, and disseminating knowledge and promulgating standards for best practices related to data protection.