Machine-learning models can fail when they try to make predictions for people who were underrepresented in the datasets they were trained on.
For example, a model that predicts the best treatment option for someone with a chronic disease may be trained using a dataset that contains mostly male patients. That model may then make incorrect predictions for female patients when it is deployed in a hospital.
To improve outcomes, engineers can try balancing the training dataset by removing data points until all subgroups are represented equally. While dataset balancing is promising, it often requires removing a large amount of data, hurting the model's overall performance.
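For illustration, a minimal sketch of that balancing strategy is below: it simply downsamples every subgroup to the size of the smallest one. The `train_df` DataFrame and its `subgroup` column are hypothetical placeholders, not part of the researchers' work.

```python
# Minimal sketch of dataset balancing by subsampling (illustrative only, not the
# MIT technique). Assumes a pandas DataFrame `train_df` with a hypothetical
# "subgroup" column identifying each example's demographic group.
import pandas as pd

def balance_by_subsampling(train_df: pd.DataFrame, group_col: str = "subgroup",
                           seed: int = 0) -> pd.DataFrame:
    """Downsample every subgroup to the size of the smallest one."""
    min_size = train_df[group_col].value_counts().min()
    balanced = (
        train_df.groupby(group_col, group_keys=False)
                .apply(lambda g: g.sample(n=min_size, random_state=seed))
    )
    return balanced.reset_index(drop=True)

# For example, a dataset with 900 male-patient rows and 100 female-patient rows
# shrinks to 200 rows, discarding 800 examples -- the data loss described above.
```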
MIT researchers developed a new technique that identifies and removes the specific points in a training dataset that contribute most to a model's failures on minority subgroups. By removing far fewer data points than other approaches, this technique maintains the overall accuracy of the model while improving its performance on underrepresented groups.
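To make the idea concrete, the sketch below illustrates the general recipe under strong simplifying assumptions; it is not the researchers' algorithm. It scores each training point with a crude gradient-alignment influence proxy against a small held-out set from the underperforming subgroup, then removes only the few highest-scoring points before retraining. The linear model, synthetic data, and function names are all hypothetical.

```python
# Simplified sketch of the general recipe (NOT the researchers' actual algorithm):
# score every training point by a gradient-alignment influence proxy against a small
# held-out set from the underperforming subgroup, then drop only the worst offenders.
# All model, data, and function names here are hypothetical.
import torch
import torch.nn as nn

def per_example_grads(model, loss_fn, xs, ys):
    """Flattened loss gradient with respect to model parameters, one row per example."""
    rows = []
    for x, y in zip(xs, ys):
        loss = loss_fn(model(x.unsqueeze(0)), y.unsqueeze(0))
        grads = torch.autograd.grad(loss, list(model.parameters()))
        rows.append(torch.cat([g.flatten() for g in grads]))
    return torch.stack(rows)

def harmfulness_scores(model, loss_fn, train_x, train_y, group_x, group_y):
    """Proxy score: a training point whose gradient points *against* the average
    gradient of the subgroup's loss is treated as harmful to that subgroup."""
    train_g = per_example_grads(model, loss_fn, train_x, train_y)
    group_g = per_example_grads(model, loss_fn, group_x, group_y).mean(dim=0)
    return -(train_g @ group_g)  # higher = more harmful to the subgroup

if __name__ == "__main__":
    torch.manual_seed(0)
    model = nn.Linear(5, 2)                      # stand-in for a trained classifier
    loss_fn = nn.CrossEntropyLoss()
    train_x, train_y = torch.randn(200, 5), torch.randint(0, 2, (200,))
    group_x, group_y = torch.randn(20, 5), torch.randint(0, 2, (20,))  # minority subgroup

    scores = harmfulness_scores(model, loss_fn, train_x, train_y, group_x, group_y)
    drop = torch.topk(scores, k=10).indices      # remove only the 10 worst offenders
    keep = torch.ones(len(train_x), dtype=torch.bool)
    keep[drop] = False
    print(f"Retrain on {int(keep.sum())} of {len(train_x)} points "
          f"after removing {len(drop)} likely bias-driving examples.")
```

Unlike the balancing sketch above, this approach touches only a handful of training points, which is why it can preserve overall accuracy.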
In addition, the technique can identify hidden sources of bias in a training dataset that lacks labels. Unlabeled data are far more prevalent than labeled data for many applications.
The technique could also be combined with other approaches to improve the fairness of machine-learning models deployed in high-stakes situations. For instance, it might someday help ensure underrepresented patients aren't misdiagnosed due to a biased AI model.
"Many other algorithms that attempt to address this problem assume each datapoint matters as much as every other datapoint. In this paper, we are revealing that presumption is not real. There are specific points in our dataset that are adding to this bias, and we can find those information points, eliminate them, and get better efficiency," says Kimia Hamidieh, an electrical engineering and computer science (EECS) graduate trainee at MIT and co-lead author of a paper on this technique.
She wrote the paper with co-lead authors Saachi Jain PhD '24 and fellow EECS graduate student Kristian Georgiev.