Reducing Bias and Improving Safety in DALL·E 2
Today, we are implementing a new technique so that DALL·E generates images of people that more accurately reflect the diversity of the world’s population. This technique is applied at the system level when DALL·E is given a prompt describing a person that does not specify race or gender, like “firefighter.”
Based on our internal evaluation, users were 12× more likely to say that DALL·E images included people of diverse backgrounds after the technique was applied. We plan to improve this technique over time as we gather more data and feedback.
In April, we started previewing the DALL·E 2 research to a limited number of people, which has allowed us to better understand the system’s capabilities and limitations and improve our safety systems.
During this preview phase, early users have flagged sensitive and biased images, which has helped inform and evaluate this new mitigation.
We are continuing to research how AI systems like DALL·E might reflect biases in their training data, and the different ways we can address them.
During the research preview we have taken other steps to improve our safety systems, including:
Minimizing the risk of DALL·E being misused to create deceptive content by rejecting image uploads containing realistic faces and attempts to create the likeness of public figures, including celebrities and prominent political figures.
Making our content filters more accurate so that they are more effective at blocking prompts and image uploads that violate our content policy while still allowing creative expression.
Refining automated and human monitoring systems to guard against misuse.
These improvements have given us confidence in our ability to invite more users to experience DALL·E.
Expanding access is an important part of deploying AI systems responsibly because it allows us to learn more about real-world use and continue to iterate on our safety systems.