Threads of racial bias may be entangled in computer models for child protection
The Boston Globe, Letter to the Editor
I WANT to comment on using computer analytics to protect children ("Computers may spot abuse risks," Page A1, Oct. 7). While big data, more powerful computers, and advanced analytics enable us to develop useful models for prediction, existing data can mask prejudice and reify past and existing discriminatory practices.
Using statistical models to analyze and assess so-called risk is not new. My first exposure to the practice came more than 30 years ago, when I prepared a critique of credit risk assessment. While it may be unlawful to use certain demographic characteristics, such as race, ethnicity, and gender, to determine a loan applicant's creditworthiness, there are other variables that can be used, such as an address or ZIP code. However, because of housing discrimination, both formal and informal, where you live can serve as a proxy for your skin color. To the extent that past loan decisions and repayment processes may have been racially biased, the data available to develop new algorithms may be racially biased as well.
The new algorithm under consideration may appear both objective in its use of data and free of the biases it is meant to prevent as it does its best to protect children. But the data collected may be tainted in the same way.
I believe strongly in data and their analysis and in the importance of modeling to better understand, predict, and inform decisions. I'm just cautioning that data generated by biased systems tend to perpetuate aspects of that bias, whether the specific characteristics are included in developing the expert models or not.
This may be a data-analytic extension of eating the fruit of the poisonous tree.