Image databases of skin conditions are notoriously biased towards lighter skin. Rather than wait for the time-consuming process of collecting more images of conditions like cancer or inflammation on darker skin, one research group wants to fill in the gaps using artificial intelligence. The team is developing an AI program that generates synthetic images of disease on darker skin, which it plans to use in a tool that could help diagnose skin cancer.
“Having real images of darker skin is the ultimate solution,” says Eman Rezk, a machine learning expert at McMaster University in Canada who is working on the project. “Until we have that data, we need to find a way to close the gap.”
But other experts working in the field are concerned that the use of synthetic images could present its own problems. The focus should be on adding more diverse real images to existing databases, says Roxana Daneshjou, a clinical research fellow in dermatology at Stanford University. “Creating synthetic data sounds like an easier route than doing the hard work of creating a diverse data set,” she says.
There are dozens of efforts to use AI in dermatology. Researchers are building tools that can scan images of rashes and moles and flag the most likely conditions; dermatologists can then use the results to help them make diagnoses. But most of these tools are built on image databases that include few examples of conditions on darker skin, or that lack good information about the range of skin tones they cover. That makes it hard to trust that a given tool will be as accurate on darker skin.
That’s why Rezk and the team turned to synthetic imaging. The project has four main phases. The team has already analyzed available image sets to understand how badly darker skin tones were underrepresented in the first place. It has also developed an AI program that uses images of skin conditions on lighter skin to produce images of those conditions on darker skin, and has validated the images the model produces. “Thanks to advances in artificial intelligence and deep learning, we were able to use the available images of lighter skin to generate high-quality synthetic images with different skin tones,” says Rezk.
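The article does not say which generative architecture Rezk’s team uses. As a rough illustration of the general idea only, here is a minimal PyTorch sketch of unpaired image-to-image translation (in the style of CycleGAN), where one generator maps lighter-skin lesion photos toward darker skin tones and a cycle-consistency loss discourages it from altering the lesion itself. All layer choices, hyperparameters, and the toy data are assumptions, and only a single generator update step is shown.

```python
# Illustrative sketch only: not the McMaster team's actual model.
import torch
import torch.nn as nn

class Generator(nn.Module):
    """Maps an image from one skin-tone domain toward the other."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 64, 7, padding=3), nn.InstanceNorm2d(64), nn.ReLU(),
            nn.Conv2d(64, 64, 3, padding=1), nn.InstanceNorm2d(64), nn.ReLU(),
            nn.Conv2d(64, 3, 7, padding=3), nn.Tanh(),
        )
    def forward(self, x):
        return self.net(x)

class Discriminator(nn.Module):
    """Scores whether an image looks like a real member of a domain."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(64, 1, 4, stride=2, padding=1),
        )
    def forward(self, x):
        return self.net(x)

# Two generators (light -> dark, dark -> light) and a discriminator
# for the darker-skin domain; the symmetric half is omitted for brevity.
G_light_to_dark, G_dark_to_light = Generator(), Generator()
D_dark = Discriminator()

adv_loss = nn.MSELoss()    # least-squares GAN loss
cycle_loss = nn.L1Loss()   # cycle-consistency keeps the lesion intact

opt_G = torch.optim.Adam(
    list(G_light_to_dark.parameters()) + list(G_dark_to_light.parameters()),
    lr=2e-4,
)

# Placeholder batch in [-1, 1], standing in for real lighter-skin images.
light_batch = torch.rand(4, 3, 128, 128) * 2 - 1

fake_dark = G_light_to_dark(light_batch)
reconstructed_light = G_dark_to_light(fake_dark)

pred = D_dark(fake_dark)
loss = adv_loss(pred, torch.ones_like(pred)) \
    + 10.0 * cycle_loss(reconstructed_light, light_batch)

opt_G.zero_grad()
loss.backward()
opt_G.step()
```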
Next, the team will combine the synthetic images of darker skin with real images of lighter skin to train a program that can detect skin cancer. The team will also keep checking image databases for new, real images of skin conditions on darker skin that can be added to future versions of the model, says Rezk.
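The article gives no details of the planned detection model. As a hedged sketch of how a classifier might be trained on such a mixed dataset, the example below fine-tunes a standard pretrained ResNet in PyTorch on real lighter-skin and synthetic darker-skin images; the folder paths, class labels, and training settings are all hypothetical.

```python
# Minimal sketch, not the team's actual pipeline. Assumes each folder
# contains benign/ and malignant/ subdirectories of lesion images.
import torch
import torch.nn as nn
from torch.utils.data import ConcatDataset, DataLoader
from torchvision import datasets, models, transforms

tfm = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])

# Hypothetical dataset locations.
real_light = datasets.ImageFolder("data/real_light_skin", transform=tfm)
synthetic_dark = datasets.ImageFolder("data/synthetic_dark_skin", transform=tfm)

# Mix real and synthetic images into one training set.
train_loader = DataLoader(ConcatDataset([real_light, synthetic_dark]),
                          batch_size=32, shuffle=True)

# Standard pretrained backbone with a two-class head (benign vs. malignant).
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
model.fc = nn.Linear(model.fc.in_features, 2)

optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()

model.train()
for images, labels in train_loader:  # one illustrative epoch
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()
```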
The team is not the first to create synthetic skin images. A group that included researchers from Google Health published a paper in 2019 describing a method for generating them, one that could produce images of different skin tones. (Google is interested in dermatological AI and announced a tool that can identify skin conditions last spring.)
Rezk says synthetic images are a stopgap until more real images of skin conditions on darker skin become available. But Daneshjou is concerned about using synthetic images even as a temporary solution. Research teams would have to check carefully whether the AI-generated images contain common quirks that people cannot see with the naked eye; that kind of quirk could theoretically skew the results of an AI program. And the only way to confirm that synthetic images perform as well as real ones in a model would be to compare them against real images, which are exactly what is in short supply. “Then it goes back to the fact of, well, why not just work on trying to get more real images?” she says.
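One common way to probe for such invisible quirks, though not one the article attributes to either team, is a real-versus-synthetic classifier test: if an ordinary image classifier can separate the two sets far better than chance on held-out data, the synthetic images likely carry systematic artifacts that a diagnostic model could also latch onto. A minimal sketch, with hypothetical folder paths:

```python
# Illustrative real-vs-synthetic check; paths and settings are assumptions.
import torch
import torch.nn as nn
from torch.utils.data import ConcatDataset, DataLoader, random_split
from torchvision import datasets, models, transforms

tfm = transforms.Compose([transforms.Resize((224, 224)), transforms.ToTensor()])

# Label 0 = real, 1 = synthetic (the folder-derived labels are overridden).
real = datasets.ImageFolder("data/real_dark_skin", transform=tfm,
                            target_transform=lambda _: 0)
fake = datasets.ImageFolder("data/synthetic_dark_skin", transform=tfm,
                            target_transform=lambda _: 1)

full = ConcatDataset([real, fake])
n_test = len(full) // 5
train_set, test_set = random_split(full, [len(full) - n_test, n_test])
train_loader = DataLoader(train_set, batch_size=32, shuffle=True)
test_loader = DataLoader(test_set, batch_size=32)

detector = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
detector.fc = nn.Linear(detector.fc.in_features, 2)
opt = torch.optim.Adam(detector.parameters(), lr=1e-4)
loss_fn = nn.CrossEntropyLoss()

detector.train()
for images, labels in train_loader:  # one illustrative epoch
    opt.zero_grad()
    loss_fn(detector(images), labels).backward()
    opt.step()

# Held-out accuracy near 50% means real and synthetic are hard to tell apart;
# accuracy near 100% points to detectable artifacts in the synthetic set.
detector.eval()
correct = total = 0
with torch.no_grad():
    for images, labels in test_loader:
        correct += (detector(images).argmax(dim=1) == labels).sum().item()
        total += labels.numel()
print(f"real-vs-synthetic accuracy: {correct / total:.2%}")
```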
If a diagnostic model relies on synthetic images for one group of people and real images for another, even temporarily, that’s a concern, says Daneshjou: it could make the model perform differently depending on skin tone.
Leaning on synthetic data could also make people less likely to push for real, diverse images, she says. “If you’re going to do that, are you really going to keep doing the work?” she says. “In fact, I would like to see more people working on getting real data that is diverse, rather than pursuing this solution.”