A few months ago, I presented a data story built on historical data to a room full of historians and experts. They knew the history, the issues, the data. But when I showed them the connection between the data in historical archives and real people, something unexpected happened. The room went silent. A few people teared up. It was one of those rare moments when data transcends numbers and becomes emotion.
Some months later, I gave the exact same presentation to a group of high school students. The data was identical. The visuals were identical. The narrative was identical. The reaction could not have been more different: instead of tears, there was laughter. Some students whispered jokes. Others seemed bored.
That experience stayed with me. It was a vivid reminder that a data story is never universal. The same message that resonates deeply with one audience can completely miss the mark with another. The difference wasn’t the story; it was the audience.
That experience made me wonder: Is it possible to use generative AI to foresee how various audiences might react to a data story, even before sharing it with them?
The Invisible Variable
When we build a data story, we often focus on getting the numbers right, cleaning the data, choosing the best chart, and writing a clear narrative [1]. But every story also carries an invisible variable: the audience’s mindset [3]. Are they skeptical or trusting? Emotionally engaged or detached? Do they seek scientific rigor, or are they motivated by moral urgency?
These differences shape how people interpret data and how they feel about it. And in communication design, feelings often determine whether comprehension turns into action. Traditional user testing can reveal these dynamics, but it usually happens after the story is produced. By then, we’ve already invested hours (or weeks) of work. What we need is a way to test audience reactions early, during design, not as a validation step, but as a creative one.
That’s where generative AI enters the picture. We often think of AI as a content creator: writing text, generating visuals, composing music [2]. But what if we flipped the role? Instead of asking AI to tell the story, we could ask it to listen to one.
In other words, we can use AI as a simulated audience: a stand-in that reacts to a data story as if it were a specific type of reader or viewer. This isn’t about replacing real humans; it’s about gaining early insight and seeing how different kinds of audiences might respond before we test with real people [4].
A Theoretical Workflow
Think of this as a four-stage loop (Figure 1):
Figure 1. The four-stage loop.
- Design the story. Start with the basics: your insight, your visuals, your message. But also define who your audience is. Are they experts, skeptics, or newcomers? What’s their level of trust in data?
- Structure the evaluation. Identify which dimensions you want to test, such as comprehension, emotional engagement, perceived credibility, and behavioral intent. Build a few short survey-style questions around them.
- Test with a small real audience (once). This gives you an anchor in reality. Collect qualitative and quantitative feedback from a few people representing your intended audience. Use their responses to outline behavioral profiles.
- Simulate with generative AI. Now translate those profiles into personas: detailed, natural-language descriptions that capture each audience’s mindset. Feed these personas to a model like ChatGPT, along with your data story and evaluation questions. The model will respond as if it were that type of person.
Over time, you can refine your prompts, compare responses across personas, and even test multiple versions of the same story to see how tone, framing, or design choices affect simulated reactions. This process doesn’t give you the truth. Instead, it gives you signals, which are often enough to guide early design decisions.
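To make the simulation stage concrete, here is a minimal Python sketch, assuming the openai client library (version 1.x). The persona descriptions, evaluation questions, and model name are illustrative placeholders, not a prescribed setup.

```python
# Minimal sketch of the "simulate with generative AI" stage.
# Assumes: openai>=1.0 and OPENAI_API_KEY in the environment.
from openai import OpenAI

client = OpenAI()

# Hypothetical personas distilled from the small real-audience test (stage 3).
PERSONAS = {
    "Skeptical but Attentive": (
        "You are a reader who prefers calm, fact-based messaging, "
        "distrusts alarmist tones, and questions how the data was collected."
    ),
    "Engaged Believer": (
        "You are a reader who already trusts the data, feels emotionally "
        "involved in the topic, and looks for actionable next steps."
    ),
}

# Survey-style evaluation questions from stage 2.
QUESTIONS = [
    "In one sentence, what is the main message of this story?",
    "On a scale of 1 to 5, how credible does the story feel, and why?",
    "How does the story make you feel?",
    "What, if anything, would you do after reading it?",
]

def simulate_reactions(story_text: str) -> dict:
    """Ask the model to answer the evaluation questions in character for each persona."""
    reactions = {}
    for name, description in PERSONAS.items():
        prompt = (
            f"{description}\n\n"
            f"Read the following data story:\n\n{story_text}\n\n"
            "Answer these questions, staying in character:\n"
            + "\n".join(f"- {q}" for q in QUESTIONS)
        )
        response = client.chat.completions.create(
            model="gpt-4o-mini",  # any capable chat model can stand in here
            messages=[{"role": "user", "content": prompt}],
        )
        reactions[name] = response.choices[0].message.content
    return reactions
```

Running the same story through several personas and reading the answers side by side is often enough to reveal where clarity or tone breaks down for one mindset but not another.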
Deeper Than Demographics
The real power of this approach lies in how you design your personas. Too often, personas in communication design are shallow: “female, 35, educated, urban.” These categories tell us little about how people think.
A simulation persona needs to go deeper. It should reflect cognitive and emotional traits that influence interpretation. For example:
- A Skeptical but Attentive persona might be someone who prefers calm, fact-based messaging and resists alarmist tones.
- An Engaged Believer might already trust the data, feel emotionally involved, and look for actionable next steps.
When we ask the AI to adopt one of these perspectives, it doesn’t magically “become” that person, but it generates responses that align with the language, tone, and priorities of that mindset. This gives us a way to see our story through multiple cognitive lenses.
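As an illustration of what “deeper than demographics” can look like in practice, the sketch below encodes a persona as a small set of cognitive and emotional traits and turns it into a prompt. The field names and wording are assumptions chosen for the example, not a fixed schema.

```python
# Trait-based personas: mindset drives the prompt, not demographic labels.
from dataclasses import dataclass

@dataclass
class Persona:
    name: str
    trust_in_data: str        # e.g. "cautious", "high"
    emotional_stance: str     # e.g. "detached", "involved"
    information_need: str     # e.g. "methodological detail", "next steps"
    tone_preference: str      # e.g. "calm and factual", "warm and action-oriented"

    def to_prompt(self) -> str:
        return (
            f"Adopt the mindset of '{self.name}'. Your trust in data is "
            f"{self.trust_in_data}; you are emotionally {self.emotional_stance}; "
            f"you mainly look for {self.information_need}; and you respond best "
            f"to a {self.tone_preference} tone. React to the story in character."
        )

skeptic = Persona(
    name="Skeptical but Attentive",
    trust_in_data="cautious",
    emotional_stance="detached",
    information_need="methodological detail",
    tone_preference="calm and factual",
)

believer = Persona(
    name="Engaged Believer",
    trust_in_data="high",
    emotional_stance="involved",
    information_need="actionable next steps",
    tone_preference="warm and action-oriented",
)
```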
Generative AI performs remarkably well in cognitive domains [4]: it can detect ambiguity, assess clarity, and point out logical inconsistencies. It also mirrors differences in belief or skepticism quite effectively when guided by well-designed personas. Where it struggles is in affective realism. AI can describe emotion, but it does not feel it. It might “say” that a story is moving, but there is no inner resonance behind that statement.
A Mirror, Not a Replacement
There are also important ethical questions. When we use AI to simulate human behavior, we risk introducing bias from its training data, oversimplifying real diversity, or treating machine-generated feedback as objective truth.
To avoid this, we must:
- Be transparent about how personas are created and used.
- Document every prompt and assumption (a minimal logging sketch follows this list).
- Always include human validation before making communication decisions.
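One lightweight way to keep that record is a plain append-only log of every simulated run. The sketch below is a minimal example; the file name and fields are assumptions, not a standard.

```python
# Append each simulated-audience run, with its stated assumptions, to a JSONL log.
import datetime
import json

def log_simulation(persona_name, model, prompt, response, assumptions,
                   path="simulation_log.jsonl"):
    """Record one persona run so prompts and assumptions stay auditable."""
    record = {
        "timestamp": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "persona": persona_name,
        "model": model,
        "prompt": prompt,
        "response": response,
        "assumptions": assumptions,  # e.g. "persona distilled from five interviews"
    }
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record, ensure_ascii=False) + "\n")
```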
Generative AI can expand our empathy, but only if we remember that it is synthetic empathy. It’s a mirror, not a replacement.
Hybrid Workflow
The same story that once made one audience cry and another laugh taught me that data stories live in the space between message and perception. If AI can help us explore that space, then it may become less a machine and more a mirror, showing us how differently truth can sound depending on who listens.
The next step is to integrate AI simulation and human evaluation into a hybrid workflow:
- AI provides the breadth: fast, low-cost iterations across many personas.
- Humans provide the depth: emotional nuance, authenticity, and lived experience.
Together, they can make data storytelling more intentional, ethical, and empathetic. For me, the goal isn’t to automate understanding; it’s to make understanding itself a design material, something we can prototype, test, and refine just like visuals or text.
Disclosure: In creating this post, I used Grammarly to correct grammar and sentence structure, and ChatGPT to write the basic skeleton of the post.
References
1. Dykes, B. (2019). Effective Data Storytelling. John Wiley and Sons.
2. Li, H., Wang, Y., Liao, Q.V., and Qu, H. (2025). Why is AI not a panacea for data workers? An interview study on human-AI collaboration in data storytelling. IEEE Transactions on Visualization and Computer Graphics.
3. Lo Duca, A. (2025). Become a Great Data Storyteller. John Wiley and Sons.
4. Lo Duca, A., and Yocco, V. (forthcoming). Exploring the Use of Generative AI for Assessing Data-Driven Stories. In Computer-Human Interaction Research and Applications. CHIRA 2025. Communications in Computer and Information Science.
**Angelica Lo Duca** is a researcher at the Institute of Informatics and Telematics of the National Research Council, Italy. Her research interests include data storytelling and the application of AI to different domains, including cultural heritage, tourism, education, and more. She is the author of *Data Storytelling with Altair and AI* (Manning Publications, 2024) and *Become a Great Data Storyteller* (Wiley, 2025).