STEMMA hackathon at PorterShed, Galway. Image: STEMMA
Curiosity-driven research can take you to unexpected places, says University of Galway’s Prof Erin McCarthy.
Perhaps unlike many humanities professors, Prof Erin McCarthy is very open to AI. So much so, in fact, that just the other day she used generative AI to create a song to help her learn Irish, she tells me during our recent conversation. “Because I have no hope of remembering the simple prepositions except through song,” she says.
That’s not to say she doesn’t have reservations, but she thinks it’s important for everybody, including in humanities departments, to have conversations about technologies such as AI and “interrogate them critically” and think about what they can and can’t do. “I think everyone would benefit from these technologies and being able to talk about them.”
McCarthy is established professor of English literature and computational humanities at the University of Galway. Her European Research Council-funded project, STEMMA (Systems of Transmitting Early Modern Manuscript Verse, 1475–1700), is mapping about 65,000 manuscript records with the aim of developing a high-level model of poetry circulation for the period.
“[We’ll] be able to get this overview that’s never been possible before,” McCarthy says.
However, before the team can get to analysing the data and creating this model, the data needs to be cleaned.
The team is combining six of the most comprehensive datasets of early modern manuscript poetry available. What they’re finding is that this data is very messy: it contains many duplicate records, comes in different file formats and was collated with different biases. The team needs to get the datasets to talk to each other.
“Reconciling databases is always tricky,” McCarthy says. “I think reconciling databases made by English professors might be even trickier.”
And this is where AI has been a “real gift”, she says. “I don’t know how I thought we were going to do this without the LLMs [large language models].”
A major issue with the dataset is that, over the 225-year period of study, spelling and punctuation vary widely, so determining whether one poem is the same as another confounds simple computer systems. Ten years ago, the process might have been to modernise and standardise the differences in the dataset in order to analyse it. But that would mean “losing all of that historical evidence”, McCarthy explains.
The team can use AI to analyse texts and give them similarity scores and group variations of the same poem together without changing the underlying texts.
“That’s been the methodological innovation – being able to reconcile in that way without destroying what we inherited.
“So, that’s what I try to remind my humanities colleagues when they get a little bit suspicious of these things – it’s actually helping us.”
The work still requires a high level of expert review, she says. The process is not fully automated. But AI has given them a way to “cope with this massive, unwieldy data”.
“I don’t think everyone needs a big database or wants a big database or wants to be using AI. But I think it lets us ask new questions of the material that we already care about, and that’s really powerful.”
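To make the idea of scoring and grouping variant texts more concrete, here is a minimal sketch in Python. It is not the STEMMA team's method, which the article says relies on LLMs and expert review. It swaps in simple character-level matching from Python's standard `difflib` module purely for illustration. The function names, the 0.8 threshold and the sample first lines (illustrative old-spelling variants, not STEMMA records) are all assumptions.

```python
from difflib import SequenceMatcher


def _normalise(text: str) -> str:
    """Lowercase and collapse whitespace for comparison only."""
    return " ".join(text.lower().split())


def similarity(a: str, b: str) -> float:
    """Return a similarity ratio between 0 and 1 for two texts."""
    return SequenceMatcher(None, _normalise(a), _normalise(b)).ratio()


def group_variants(texts: list[str], threshold: float = 0.8) -> list[list[str]]:
    """Greedily group texts that resemble the first member of a group.

    The original strings are stored untouched, so historical spelling
    and punctuation survive the grouping. The threshold is a
    hypothetical value that would need tuning and human checking.
    """
    groups: list[list[str]] = []
    for text in texts:
        for group in groups:
            if similarity(text, group[0]) >= threshold:
                group.append(text)  # keep the text exactly as received
                break
        else:
            groups.append([text])  # no close match: start a new group
    return groups


# Illustrative sample lines, not taken from the STEMMA data.
records = [
    "Goe and catche a falling starre,",
    "Go and catch a falling star",
    "Come live with mee, and be my love,",
    "Come liue with me and be my loue",
]
groups = group_variants(records)
for group in groups:
    print(group)
```

The comparison works on a lowercased copy of each line, while the groups hold the original strings, which mirrors the point about reconciling records without modernising them. A real pipeline would still need expert review of borderline scores.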
Rebel with a book
It’s not surprising that McCarthy has found herself using advanced technologies in her humanities research. “I grew up in an IT house.” Her dad worked for IBM and she grew up in Silicon Valley in the 80s. She taught herself to code from her dad’s books when she was just a small child. “I’ve always loved computers.”
Her decision to study humanities at university was maybe a kind of rebellion, she says. And there weren’t many women choosing to study IT at that time so it didn’t have a huge appeal for her.
It was the rather unglamorous Microsoft Excel that brought her back to tech. She would make spreadsheets during her postgraduate research to manage information about poetry books. She showed these to one of her professors and found “a surprisingly sympathetic audience”.
“And so it’s been a gradual move back into this space.”
Image: Erin McCarthy
Of course, the humanities has generally been opening up to technology for some years now – with the area of digital humanities surging in popularity. “Almost every form of humanities research at this stage has some digital piece to it,” McCarthy says. So much so that we could probably drop the digital and just call it humanities again, she says – then, ever the academic, quickly points out that she isn’t the first person to make this point.
The near-ubiquity of digital humanities can be seen as part of the wider push for interdisciplinarity in academia. Collaboration across disciplines is considered by many to bring innovative thinking and diverse perspectives to research problems.
Bridging disciplines
In this vein, McCarthy and her project team hosted a three-day hackathon at Galway’s PorterShed in July, bringing together students and scholars in the humanities with data scientists and even a rare books librarian to work with the STEMMA dataset. In the end, seven multidisciplinary teams built prototypes and pitched ideas.
“It was a really fun, multidisciplinary collaboration and we learned a lot about the data,” McCarthy says, including what they need to improve and how people will want to use the data.
She found that having the data scientists involved meant the teams could ask different questions of the data. One of her humanities colleagues was particularly excited by the collaborative possibilities of the hackathon. He had questions that he’s wanted to ask for years, and a programmer could help him answer them. “You need to hire this guy,” he enthused.
One of the things McCarthy found really encouraging about the hackathon was “how excited the data people were to work with our material”. She didn’t think data scientists would be so interested in what is for them a “relatively small and kind of chaotic dataset”.
“That’s been really heartening.”
She has thought in the past that maybe the different disciplines can’t talk to each other, but she realises now that’s not the case. “I think we just often don’t have a meeting place or a common ground or a reason to try to work with each other.”
And sometimes, it’s just a case of not knowing how to get started, she says. “But if you put them all in a room together and give them some pizza and some time, suddenly they start coming up with working prototypes.”
McCarthy and her team are editing a special issue of a journal to showcase the projects from the hackathon. She sees it as a chance for the teams to develop the work they started and keep the collaborations going. She also sees it as a chance for reflections on this methodological approach and what it means for the discipline. “What do we gain by working in this way? And are there trade-offs?”
Funding curiosity
McCarthy says one of her overall aims for the STEMMA project has always been to develop a transferable methodological approach. “It’s focusing on this time period because that’s my particular area of expertise. But the methods should really be transferable and extendable.” She sees their data-cleaning method being applied to other humanities and social science projects and even outside of academia to anyone working with qualitative datasets.
This is a good argument, she thinks, for the value of frontier research. So often research is thought about in terms of commercial or practical applications, whereas the STEMMA project is ostensibly about a very niche area of humanities research with arguably limited impact.
Of course, she wants to emphasise the inherent value of this historical research. But the methods they are developing with support from the ERC, and the Irish Research Council before that, could also benefit many other researchers. McCarthy is even talking to the university’s innovation office about a potential spin-out. So, there’s clear economic value now, but it all started with frontier research, and McCarthy is grateful there are still avenues for this kind of research to be supported.
“Frontier research can go in all kinds of directions that the researchers themselves don’t anticipate.” And that’s why it’s so important to fund curiosity and curiosity-driven research, she says.
As for now, McCarthy and the team are still in data-cleaning mode. The plan is to be able to move on to analysis by the midpoint of the ERC grant, so they have just a few months to reach their target. “It’ll be close,” McCarthy says. And she’s excited to get to the next stage. “It’s just been even more complicated than I expected.”
On the plus side, the team are getting to know the data “in a really different way” and turning up poems, including some “wacky” ones, they didn’t know existed. “So that’s been really cool.”
I ask McCarthy if she has time for the kind of traditional humanities research that got her started in this field in the first place.
“There are peaks and troughs,” she says. She’s happy to be distracted by the technical stuff and finds herself going down rabbit holes that maybe she doesn’t need to go down. She’s less enthusiastic about the management and admin side of things. “I had to approve an invoice today. Are you kidding?” she jokes. “I just want to read a book and write another book.”
She’s writing an article at the moment about a poet who featured in her first book. She says it’s really nice when she finds the time to do this work. “It feels like a luxury now when I have a few hours to just sit and write and think.”