Objective
To improve an existing method, Medicare Bayesian Improved Surname Geocoding (MBISG) 1.0, that augments the Centers for Medicare & Medicaid Services’ (CMS) administrative measure of race/ethnicity with surname and geographic data to estimate race/ethnicity.
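The general idea of combining surname and geographic information can be illustrated with a minimal sketch of a Bayes' rule update, as commonly used in Bayesian surname geocoding. This is not the MBISG model itself, which also incorporates the administrative race/ethnicity measure and other predictors; the function name `bisg_posterior` and all probabilities below are hypothetical and chosen only for illustration.

```python
def bisg_posterior(prior_given_surname, geo_given_group):
    """Update surname-based probabilities with geographic likelihoods.

    prior_given_surname: P(group | surname), keyed by group label.
    geo_given_group: P(residential area | group), keyed by group label.
    Returns the normalized posterior P(group | surname, area).
    """
    unnormalized = {
        group: prior_given_surname[group] * geo_given_group[group]
        for group in prior_given_surname
    }
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("all groups have zero probability")
    return {group: value / total for group, value in unnormalized.items()}


# Hypothetical inputs (illustrative only, not taken from the study).
prior = {"Hispanic": 0.70, "White": 0.20, "Black": 0.05, "API": 0.05}
geo = {"Hispanic": 0.0020, "White": 0.0010, "Black": 0.0005, "API": 0.0005}

posterior = bisg_posterior(prior, geo)
for group, probability in posterior.items():
    print(f"{group}: {probability:.3f}")
```

In this illustration the geographic likelihood favors the group already suggested by the surname, so the posterior probability for that group rises above its surname-only prior.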
Data Sources/Study Setting
Data from 284 627 respondents to the 2014 Medicare survey.
Study Design
We compared the performance (cross‐validated Pearson correlation between estimates and self‐reported race/ethnicity) of several alternative models predicting self‐reported race/ethnicity in cross‐sectional observational data to assess the accuracy of estimates, resulting in MBISG 2.0. MBISG 2.0 adds first name, demographic, and coverage predictors of race/ethnicity to MBISG 1.0 and uses a more flexible data aggregation framework.
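As a rough illustration of the performance metric, the sketch below computes, for each group, the Pearson correlation between predicted probabilities and a 0/1 indicator of self‐reported membership. It assumes the probabilities were already produced for held‐out records (the cross‐validation step itself is not shown); the function names and the four example records are hypothetical and are not study data.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def correlation_by_group(predicted_probs, self_reports, groups):
    """Correlate each group's predicted probability with a 0/1 self-report indicator."""
    result = {}
    for group in groups:
        probs = [record[group] for record in predicted_probs]
        indicators = [1.0 if report == group else 0.0 for report in self_reports]
        result[group] = pearson(probs, indicators)
    return result


# Hypothetical held-out predictions and self-reports (illustrative only).
held_out_probs = [
    {"Hispanic": 0.9, "White": 0.1},
    {"Hispanic": 0.2, "White": 0.8},
    {"Hispanic": 0.6, "White": 0.4},
    {"Hispanic": 0.1, "White": 0.9},
]
self_reports = ["Hispanic", "White", "Hispanic", "White"]

correlations = correlation_by_group(held_out_probs, self_reports, ["Hispanic", "White"])
for group, r in correlations.items():
    print(f"{group}: r = {r:.2f}")
```

Correlating probabilities with a binary indicator rewards estimates that are both well ranked and well separated, which is one reason a correlation can be preferred over a simple classification accuracy for probabilistic estimates.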
Data Collection/Extraction Methods
We linked survey‐reported race/ethnicity to administrative and census data.
Principal Findings
MBISG 2.0 removed 25‐39 percent of the remaining MBISG 1.0 error for Hispanics, Whites, and Asian/Pacific Islanders (API), and 9 percent for Blacks, resulting in correlations of 0.88 to 0.95 with self‐reported race/ethnicity for these groups.
Conclusions
MBISG 2.0 represents a substantial improvement over MBISG 1.0 and over the use of administrative data on race/ethnicity alone. MBISG 2.0 is used in CMS’ public reporting of Medicare Advantage contract measures stratified by race/ethnicity for Hispanics, Whites, Asian/Pacific Islanders, and Blacks.