
A new evaluation framework seeks to determine when text-to-image models start producing unreliable or biased content

A research group has introduced a new way to evaluate the reliability, fairness, and diversity of generative models that produce images from text. The topic has become central because such models can create highly accurate, user-directed images, yet their behavior can also be unpredictable, which makes them susceptible to misuse.

The work is based on the observation that the class or concept representations a model produces can, in some situations, be 'twisted' in a desired direction. In other words, certain inputs can trigger unreliable behavior or biases in how the model represents people, objects, or concepts. The researchers stress that this is not only a technical but also a societal issue: if models do not treat different groups or concepts equally, the consequences may show up in, for example, which kinds of images of people and phenomena are easy to produce.

The proposed evaluation framework examines the model's behavior in its so-called embedding space, the internal representation by which the model converts text prompts into numerical descriptions. Reliability is measured by analyzing how the content the model produces changes when these internal representations are perturbed either broadly (global perturbations) or narrowly (local perturbations). The goal is to identify inputs that are prone to lead to unreliable or biased outcomes.
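To make the idea concrete, here is a minimal toy sketch of what probing an embedding with global versus local perturbations could look like. This is an illustration of the general technique, not the authors' actual method: the embedding is random noise standing in for a text encoder's output, and the perturbation scales, the affected-dimension fraction, and the cosine-distance sensitivity score are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

def perturb_global(embedding: np.ndarray, scale: float = 0.1) -> np.ndarray:
    # Global perturbation: Gaussian noise added to every dimension.
    return embedding + rng.normal(0.0, scale, size=embedding.shape)

def perturb_local(embedding: np.ndarray, scale: float = 0.1,
                  fraction: float = 0.05) -> np.ndarray:
    # Local perturbation: noise added only to a small subset of dimensions.
    noisy = embedding.copy()
    k = max(1, int(fraction * embedding.size))
    idx = rng.choice(embedding.size, size=k, replace=False)
    noisy[idx] += rng.normal(0.0, scale, size=k)
    return noisy

def sensitivity(embedding: np.ndarray, perturb, trials: int = 100) -> float:
    # Mean cosine distance between the original and perturbed embeddings:
    # a rough proxy for how fragile the representation of a prompt is.
    dists = []
    for _ in range(trials):
        p = perturb(embedding)
        cos = np.dot(embedding, p) / (np.linalg.norm(embedding) * np.linalg.norm(p))
        dists.append(1.0 - cos)
    return float(np.mean(dists))

# Toy stand-in for a text encoder's embedding of some prompt.
prompt_embedding = rng.normal(size=768)

g = sensitivity(prompt_embedding, perturb_global)
l = sensitivity(prompt_embedding, perturb_local)
print(f"global sensitivity: {g:.6f}, local sensitivity: {l:.6f}")
```

In a real evaluation one would compare the generated images themselves (or their features) before and after perturbation, and prompts whose outputs drift disproportionately under small perturbations would be flagged as unreliable.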

According to the authors, fairness and diversity are not just ethical add-ons but an essential part of 'robust' and reliable model behavior. The framework aims to provide a deeper insight into when and why models falter in these fundamental requirements.

Source: On the fairness, diversity and reliability of text-to-image generative models, Artificial Intelligence Review.

This text was generated with AI assistance and may contain errors. Please verify details from the original source.

Original research: On the fairness, diversity and reliability of text-to-image generative models
Publisher: Artificial Intelligence Review
Authors: Jordan Vice, Naveed Akhtar, ... Ajmal Mian
January 14, 2026