ABSTRACT
Horses are depended on as work animals by humans and are used in leisure and sport across the world, but the extent to which humans can recognize pain in horse faces is not known, which could impact their welfare. There are also significant gaps in our understanding of which psychological traits influence recognition of human facial expressions of pain. To address this, 100 participants, with either some (n = 30) or no prior horse-care experience (n = 70), rated 30 human and 30 horse faces for pain, arousal, and valence and completed trait measures of empathy and social anxiety. Ten equine behavior professionals also rated the horse faces as a baseline for assessing accuracy. Overall, the accuracy of pain recognition was higher for human faces, but participants with horse experience were more accurate at pain recognition in horse faces than those without, and years of horse experience predicted horse pain recognition accuracy. Social anxiety traits predicted the accuracy of pain recognition in human but not horse faces, while also predicting subjective ratings of pain in horse but not human faces. Empathy and its cognitive and emotional components were not related to pain recognition accuracy or ratings of horse or human faces. Relationships between trait measures and arousal and valence ratings for both species are reported. Our study is the first to report the human ability to read pain in horse faces and the factors that influence this and extends current knowledge on face processing in social anxiety.
The human face processing system is highly proficient at extracting information from the faces of conspecifics. However, whether this ability extends to nonhuman species remains largely unexplored. The domestic horse (Equus caballus) has been used for millennia (Johns, Citation2006) in work and, more recently, in equine-assisted interventions (Alfonso et al., Citation2015), as well as for leisure and sport. Public scrutiny of horse welfare in sports (Equine Ethics and Wellbeing Commission (FEI), Citation2022; World Horse Welfare, Citation2025) has increased significantly both in mainstream media (Baudet, Citation2024; BBC, Citation2024; Ingle, Citation2024) and within the equestrian sector itself (Furtado et al., Citation2021). Given this concern, understanding how psychological traits influence the recognition of horse pain is crucial for improving equine welfare (Luna & Tadich, Citation2019). This study is the first to examine participants’ accuracy and ratings in recognizing pain in both human and horse facial expressions, along with the roles of trait empathy, social anxiety, and horse-care experience in this process.
Humans are often considered “face experts” due to their unparalleled experience with human faces compared with other objects. This expertise is linked to an own-species bias, where humans process human faces more efficiently than those of other species, an effect attributed to limited experience with nonhuman faces (Jakobsen et al., Citation2021; Scott & Fava, Citation2013; Simpson et al., Citation2014). However, expertise in nonhuman object categories can enhance perceptual processing as seen in bird (Gauthier & Tarr, Citation1997) and dog experts (Diamond & Carey, Citation1986), whose perceptual abilities resemble those observed with human faces. Despite this, substantial individual differences do exist in human face processing (White & Burton, Citation2022).
Facial expressions of pain in humans evolved to signal the presence of danger and elicit support from others (Kappesser, Citation2019; Williams, Citation2002). Implicit in this is that the observer must be able to decode the expression. One factor that may influence this is empathy, the ability to recognize and share in the emotions of others (Abramson et al., Citation2020; Goubert et al., Citation2005). Empathy consists of cognitive and emotional subcomponents, subserved by distinct neural and genetic mechanisms (Abramson et al., Citation2020; Shalev & Uzefovsky, Citation2020; Shamay-Tsoory et al., Citation2009). While emotional empathy has been suggested to enhance pain recognition (Green et al., Citation2009; Ruben & Hall, Citation2013; Schmidt et al., Citation2023), findings remain inconclusive. Social anxiety (SA) may also influence pain recognition as individuals with high SA or diagnosed social anxiety disorder (SAD) exhibit a “negativity bias,” demonstrating heightened sensitivity to seeing threat in faces (Armstrong & Olatunji, Citation2012; Chen & Clarke, Citation2017; Gregory et al., Citation2019; Konovalova et al., Citation2021). However, pain recognition in SA has never been tested.
Horse pain ethograms outline specific facial markers associated with pain and are designed for use by caregivers. Prominent amongst these are the Horse Grimace Scale (HGS; Costa et al., Citation2014, Citation2018) and the Equine Pain Face (EPF; Gleerup et al., Citation2015, Citation2018). However, their effective application depends on the observer’s ability to perceive these subtle cues, given the known variability in human face perception (White & Burton, Citation2022). Indeed, some research suggests the application of the HGS by individuals without horse experience is not straightforward, with only limited improvements in pain detection after a 30-min training session (Dai et al., Citation2020). This may be driven by difficulties in the perceptual discrimination of some subtler pain markers. Pain ethograms have also been used by researchers for initial coding of the stimuli for automatic pain detection models (Broomé et al., Citation2022; Hummel et al., Citation2020; Lencioni et al., Citation2021), yet human errors at the training stage could compromise the resultant models’ accuracy, further highlighting the need to fully understand how humans process equine facial expressions.
Both empathy and social anxiety may influence pain recognition in horse faces as well as in human faces as higher levels of animal-directed empathy have been linked to improved welfare outcomes for working horses (Luna et al., Citation2018), while veterinarians with greater human-directed empathy assign higher pain ratings to descriptions of veterinary procedures in cattle (Norring et al., Citation2014). Additionally, equine-assisted interventions have been successful in improving outcomes in various populations (Hemingway et al., Citation2019; Kendall et al., Citation2015; Sullivan & Hemingway, Citation2024) including SAD (Alfonso et al., Citation2015), but the impact of SA and empathy on the perception of horse faces has never been researched. Understanding these influences is essential for the welfare of both horses and humans.
To address these outstanding issues, we asked the following research questions:
(1) Do accuracy and ratings of pain differ between horse and human faces? (2) Does horse experience influence the accuracy and ratings of pain in horse faces? (3) Do trait SA and trait empathy and its cognitive and emotional subcomponents influence the accuracy and ratings of pain in horse and human faces? (4) Are subjective ratings of valence and arousal of horse and human facial expressions related to pain judgements, horse experience, SA traits, and empathy?
Methods
Ethical approval was granted by the ethics committee of Bournemouth University, and the study adhered to the British Psychological Society code of conduct (ethics ID 54598).
Design
The study utilized a between-subjects correlational design. The between-subjects variable was horse experience. Group allocation was determined by the answer to the question “Have you ever loaned, owned or worked with horses?” Those who answered “yes” were placed into the Experienced group, and those who answered “no” were placed into the No Experience group. Those who answered “yes” were then further asked about the total time in years that they had cared for horses. The dependent variables were the accuracy of pain recognition (see analysis section) and subjective ratings of pain, valence, and arousal for both horse and human faces. Additional variables were self-reported social anxiety scores (via the Liebowitz Social Anxiety Scale – Self-Report [Liebowitz, Citation1987]) and self-reported trait empathy scores, including the cognitive and emotional sub-scores and total score (via the Empathy Quotient [Baron-Cohen & Wheelwright, Citation2004]).
Participants
The original sample consisted of 104 participants across the Experienced and No Experience groups. Four outliers (all from the No Experience group) were removed where mean ratings were less than the first quartile −1.5 × the interquartile range or greater than the third quartile + 1.5 × the interquartile range, as indicated by boxplots (IBM SPSS v. 28) on two or more (out of six) of the subjective rating variables or one or both accuracy variables.
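The boxplot criterion described above is Tukey's fences, the rule SPSS uses to flag boxplot outliers. A minimal sketch of the same rule follows; the function name and the linear-interpolation quartile method are illustrative choices, not a description of SPSS internals:

```python
def tukey_outliers(values, k=1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR] (Tukey's fences)."""
    xs = sorted(values)
    n = len(xs)

    def quantile(q):
        # Linear-interpolation quartiles, as in most statistics packages
        pos = q * (n - 1)
        lo, hi = int(pos), min(int(pos) + 1, n - 1)
        return xs[lo] + (pos - lo) * (xs[hi] - xs[lo])

    q1, q3 = quantile(0.25), quantile(0.75)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]
```

In the study, a participant was excluded only when flagged on two or more of the six subjective rating variables or on an accuracy variable, not on a single flagged value.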
Seventy participants were Bournemouth University undergraduate psychology students recruited through convenience sampling, who received course credits. The remaining 30 participants were gathered through opportunity sampling via a UK-based general equestrian interest group on Facebook. Group allocation (No Experience, Experienced) was determined by participants’ answering yes or no to the question about having equine experience, rather than on the sampling source (despite this, all the students were allocated to the No Experience group). The final sample consisted of 100 participants (No Experience group: 58 women and 12 men; Mean age = 20.01 years, SD = 2.20; Experienced group: 26 women, 3 men, and 1 non-binary; Mean age = 30.23 years, SD = 14.29). Group characteristics can be seen in .
Table 1. Descriptive statistics for pain ratings for the horse and human stimulus sets.
Ten expert raters assessed the horse-face images. This included three of the authors plus another seven equine behavior professionals known to the authors to be appropriately qualified, who were invited to participate and received no compensation. They provided their names so that their eligibility could be verified. It was important that we specifically recruited equine behavior professionals, as even qualified veterinarians, who receive behavior training as only a small part of their qualification, can struggle to accurately judge pain in horses (Broomé et al., Citation2022). By recruiting this group, we provided a robust baseline with which to compare our non-experts’ abilities. The expert raters included accredited equine behavior professionals, equine veterinarians with behavior specialisms, academics specializing in equine behavior, and those with Bachelor-level degrees or higher in equine science or behavior.
Materials
The Empathy Quotient (EQ; Baron-Cohen & Wheelwright, Citation2004) is a self-report scale used to establish empathy levels in the general population and autistic people and is one of the two most commonly used empathy scales in the literature (the other being the Interpersonal Reactivity Index (IRI; Davis, Citation1980; Lima & Osório, Citation2021)). It includes 60 questions: 40 are statements about empathy and 20 are filler items intended to distract participants from the focus on empathy. Each item is a statement such as “I really enjoy caring for others” and “It upsets me to see an animal in pain.” Participants select one of four options: strongly agree, slightly agree, slightly disagree, and strongly disagree. Each question is scored by allocating two, one, or zero points, with some questions being reverse scored. Total scores can range from 0 to 80, with higher total scores indicating higher levels of empathy. The EQ has good internal consistency (Muncer & Ling, Citation2006), ranging from 0.76 to 0.85, with excellent test-retest reliability (an average of 0.89) and moderate concurrent validity with the IRI.
As well as total empathy, we also measured cognitive and emotional empathy, as identified in previous research (Lawrence et al., Citation2004; Muncer & Ling, Citation2006). Cognitive empathy (CE) was calculated using the scores from items 1, 11, 14, 15, 22, 26, 29, 34, 35, 36, and 38, and emotional empathy (EE) was calculated using the scores from questions 3, 12, 13, 16, 18, 19, 27, 28, 31, 33, and 39.
The self-report version of the Liebowitz Social Anxiety Scale (LSAS-SR; Baker et al., Citation2002; Fresco et al., Citation2001; Liebowitz, Citation1987) was used to measure trait social anxiety. The LSAS-SR asks participants to separately rate their fear and avoidance of 24 social and performance situations on a scale of 0–4. Scores across both fear and avoidance subscales are summed to provide a total out of 144. The LSAS-SR has good to excellent internal validity and test-retest reliability in both clinical and non-clinical samples (Baker et al., Citation2002; dos Santos et al., Citation2013; Fresco et al., Citation2001; Oakman et al., Citation2003). It can detect clinically significant levels of social anxiety which would indicate the presence of SAD in both its generalized (social anxiety which occurs across a range of situations; scores of at least 60) and specific forms (i.e., social anxiety which is restricted to specific types of social situations; e.g., performance-based scenarios; scores between 30 and 59) (Mennin et al., Citation2002). In our sample, 13 participants scored under 30, 22 scored between 30 and 60, and 65 scored over 60.
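The LSAS-SR total and the severity bands from Mennin et al. (2002) reduce to a simple sum and threshold rule, sketched below; the function names and the "below threshold" label are illustrative, not part of the published scale:

```python
def lsas_total(fear, avoidance):
    """Sum fear and avoidance ratings (24 items each, scored 0-4)
    into a total out of 144."""
    assert len(fear) == 24 and len(avoidance) == 24
    return sum(fear) + sum(avoidance)

def lsas_band(total):
    """Map a total score onto the bands described by Mennin et al. (2002):
    >= 60 suggests generalized SAD, 30-59 the specific form."""
    if total >= 60:
        return "generalized"
    if total >= 30:
        return "specific"
    return "below threshold"
```

Applied to this sample, the rule places 65 participants in the generalized band, 22 in the specific band, and 13 below threshold.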
The 30 human faces used came from the Delaware Pain Database (DPD; Mende-Siedlecki et al., Citation2018, Citation2020), which comprises 229 stimuli. The 30 images chosen were selected to provide a range of intensities, showing 10 × no pain (i.e., neutral), 10 × low pain (both eyes open but mouth closed with a slight nose scrunch), and 10 × high pain (eyes and mouth closed with a nose scrunch). All 30 images were 400 × 400 pixels, with a white background, and all were of white men and women to reduce ethnicity effects, given the predicted predominantly white sample (see ).
Figure 1. Examples of human face stimuli used in the present study from the Delaware Pain Database (Mende-Siedlecki et al., Citation2018, Citation2020), displaying different intensity expressions: (a) neutral/no pain; (b) low pain, and (c) high pain.
As no validated horse-face databases were available, we gathered our horse-face stimuli from Google images and from personal image collections within the research team (with permission). In all but one case (where the horse had been known to one of the authors as being in pain and at the end of life), we did not know the absolute “ground truth” pain status of these horses. However, to provide a selection of images with a range of painful expressions, we selected images based on the criteria set out in Gleerup et al.’s (Citation2015) Equine Pain Face ethogram. These features included: low ears, enlarged nostrils, tension of the muzzle, and angled eye. Therefore, the 10 images intended to depict high levels of pain included multiple examples of these characteristics. For the 10 low-level pain images, each horse demonstrated one of these characteristics, such as tension in face, whereas the 10 neutral images included none of these pain characteristics. The images were validated for the intended category by two of the authors who are qualified equine behaviorists. After this process, two images that produced ratings inconsistent with the intended category were removed and replaced with more appropriate examples, which were again verified.
Images were horses at rest, with no tack (except a headcollar in some cases) as horses at work and/or wearing bridles can influence the production of natural expressions (Gleerup et al., Citation2018). The faces could be in profile or front-on or any orientation in between and were of horses with a range of colors and marking. All images displayed were as close to 200 × 400 pixels as possible without distorting the original aspect ratio, with the original background removed to avoid inclusion of extraneous pain cues, resulting in the face alone displayed on a white background (see for examples).
Figure 2. Horse face images obtained from Google Images displaying different levels of pain: (a) Horse 01 scored the lowest amongst the expert raters (M = 1.50, SD = 0.71), copyright unknown, (b) Horse 26 was the median scoring face according to the experts (M = 3.70, SD = 2.36), copyright Irina Orlova, Shutterstock, and (c) Horse 13 was the highest scoring face amongst the experts (M = 7.50, SD = 3.83), copyright Zita Stankova, Adobe Stock.
Data Analysis
We validated the pain ratings of the horse images by asking our 10 experts to rate them for pain, arousal, and valence on a scale of 1–10. We assessed the consistency of pain ratings between the experts using measures of inter-rater reliability and intraclass correlation. Cronbach’s alpha was 0.92 and the intraclass correlation coefficient (average measures) was 0.90 for pain, showing excellent agreement. We therefore calculated the mean expert pain rating for each image, and these means became the baseline against which participants’ ratings were compared: for each participant, their pain ratings were correlated with the baseline means across the set of horse faces, yielding an accuracy score (a correlation coefficient) per participant.
To determine accuracy for human faces, we took the mean pain rating of each stimulus from the normative database accompanying the DPD (Mende-Siedlecki et al., Citation2018, Citation2020) and calculated a correlation coefficient for each participant, as with the horse faces. Because correlation is insensitive to the scale of the ratings, this also circumvented the issue that participants in the original DPD study rated images on a scale from 1 to 7, whereas our participants rated on scales from 1 to 10.
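The per-participant accuracy measure can be sketched as follows; the function names are illustrative, and the authors' actual computation was presumably done in SPSS rather than code like this:

```python
from statistics import mean

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length rating vectors."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    denom = (sum((x - mx) ** 2 for x in xs)
             * sum((y - my) ** 2 for y in ys)) ** 0.5
    return cov / denom

def accuracy_scores(participant_ratings, baseline_means):
    """One accuracy coefficient per participant: the correlation between
    their ratings of the 30 faces and the baseline (expert or normative)
    mean rating of each face."""
    return [pearson_r(ratings, baseline_means)
            for ratings in participant_ratings]
```

Because Pearson's r is invariant to linear rescaling, a participant rating on 1–10 can be compared directly with normative means collected on 1–7.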
Levene’s test showed that although the groups (Experienced, No Experience) were of unequal size, they were of equal variance on all the rating measures (all ps > 0.05). All rating measures were normally distributed except for the No Experience group’s human pain accuracy scores. However, standardized scores for skew and kurtosis were in the normal range for all measures (i.e., within ±2.58 for a sample of 100). Therefore, parametric analyses were conducted.
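The ±2.58 criterion refers to skewness (and kurtosis) divided by its standard error, compared against the two-tailed z critical value at the .01 level. A minimal sketch for skew, using the common sqrt(6/n) approximation to the standard error (the exact SE formula differs slightly), is:

```python
import math

def standardized_skew(values):
    """Sample skewness divided by its approximate standard error
    (sqrt(6/n)); |z| < 2.58 is conventionally treated as acceptable
    at the .01 level."""
    n = len(values)
    m = sum(values) / n
    s = (sum((v - m) ** 2 for v in values) / (n - 1)) ** 0.5
    # Bias-corrected sample skewness
    skew = (n / ((n - 1) * (n - 2))) * sum(((v - m) / s) ** 3
                                           for v in values)
    return skew / math.sqrt(6 / n)
```

For n = 100 the approximate standard error of skew is sqrt(6/100) ≈ 0.245, so raw skew values up to about ±0.63 pass the criterion.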
To determine pain recognition accuracy between groups and species, we conducted a mixed ANOVA, with group (No Experience, Experienced) as the between-subjects variable, species (horse, human) as the within-subjects variable, and accuracy (represented by the correlation coefficient) as the dependent variable. As age differed significantly between the groups (p < 0.001), we ran the above ANOVA again but as an ANCOVA with age as a covariate. With two analyses, alpha levels were corrected to p < 0.025.
Bivariate Pearson’s correlations were calculated to assess the relationships between social anxiety, empathy (and CE and EE), experience in years, accuracy, and pain ratings, and multiple linear regression was conducted to assess the predictive power of these variables on accuracy and ratings of pain, where justified statistically.
Procedure
The study was presented online via Qualtrics. After providing written informed consent, participants were questioned about their experience with horses for the purpose of group allocation. Next, 30 human face stimuli were presented in a randomized order for 5 s each. Subsequently, each image was replaced with three sliding rating scales: for valence (from very negative (1) to very positive (10)), arousal (from low arousal (1) to high arousal (10)), and pain (from no pain (1) to high pain (10)). This was repeated with the horse faces, presented randomly. Participants then completed the EQ (Baron-Cohen & Wheelwright, Citation2004) and the LSAS-SR (Baker et al., Citation2002; Liebowitz, Citation1987).
Results
shows independent samples t-tests on single variables. The groups did not differ significantly on LSAS-SR or EQ or EE and CE subscales, but there was a difference in age between groups. Therefore, age was included as a covariate where a group difference was identified.
Table 2. Between groups descriptive statistics of the variables under investigation, plus results of independent samples t-tests between the Experienced and No Experience groups.
In relation to the accuracy of pain recognition in horse and human faces, a significant main effect of species was found (F(1,98) = 139.56, ηp2 = 0.587, p < 0.001), with participants more accurate at recognizing pain in human faces than in horse faces. The main effect of group was also significant (F(1,98) = 6.26, ηp2 = 0.060, p = 0.014), with the Experienced group being more accurate overall than the No Experience group.
The species × group interaction was significant (F(1,98) = 22.76, ηp2 = 0.188, p < 0.001). Planned comparisons showed that the Experienced group was more accurate at recognizing pain in the horse faces than the No Experience group (p < 0.001), but there was no difference between the groups in recognition of human pain (p = 0.114).
With inclusion of age as a covariate, the interaction between group and species was still highly significant (F(1,97) = 9.55, ηp2 = 0.090, p = 0.003) as was the main effect of species (F(1,97) = 35.56, ηp2 = 0.268, p < 0.001). However, with the inclusion of the age covariate, the group main effect was no longer significant (F(1,97) = 1.27, ηp2 = 0.013, p = 0.262).
The same analysis was conducted on the subjective pain ratings of horse and human faces of the participants in the two groups. The main effect of species was significant (F(1,98) = 139.56, ηp2 = 0.587, p < 0.001), with participants rating pain as higher in human faces than in horse faces. Neither the main effect of group nor the interaction was significant (p > 0.376).
Age of the participants was positively correlated with accuracy (for brevity, correlational results are not repeated in the text – see ). We therefore added age into the multiple linear regression with experience and accuracy.
Table 3. Correlation matrix of the variables under investigation.
Using the Enter method, a multiple linear regression was conducted with age and experience (years) as predictors and horse pain recognition accuracy as the outcome variable. The model explained 25% of the variance in horse pain recognition accuracy (Adj. R2 = 0.234), which significantly predicted outcome (F(2,97) = 16.13, p < 0.001). Only experience (years) significantly contributed to the model, with a medium effect size (β = 0.013, t = 3.21, p = 0.002, r2part = 0.080, f2 = 0.087), demonstrating that having more years of experience predicted a higher accuracy in detecting horse pain. Age did not significantly predict horse pain recognition accuracy (β = 0.000, t = −0.037, p = 0.971, r2part < 0.001, f2 < 0.001). Although accuracy increased with increased years of horse experience, years of experience were not significantly related to subjective ratings of horse pain.
SA did not correlate with pain recognition accuracy for horse faces (). However, a positive correlation between SA and horse pain subjective ratings was found. Therefore, a linear regression was conducted, with SA score as the predictor variable and horse pain rating as the outcome variable. The model explained 10.1% of the variance in horse pain subjective ratings (Adj. R2 = 0.092), which significantly predicted outcome (F(1, 98) = 10.99, p = 0.001). SA scores significantly contributed to the model, with a medium effect size (β = 0.014, t = 3.32, p = 0.001, r2part = 0.101, f2 = 0.113), showing that higher levels of SA predicted higher ratings of pain in the horse faces.
There was no relationship between total empathy, CE, or EE in recognition accuracy or subjective ratings of horse pain.
Subjective rating of pain in human faces did not correlate with SA. However, there was a positive correlation between SA and accuracy (). Therefore, a linear regression with SA score as the predictor and human pain accuracy as the outcome variable was conducted. The model was significant (F(1, 98) = 6.60, p = 0.012), explaining 6.3% (Adj. R2 = 0.054) of the variance in accuracy. SA scores significantly contributed to the model, with a small to medium effect size (β = 0.002, t = 2.57, p = 0.012, r2part = 0.063, f2 = 0.067), demonstrating that those with higher SA scores were more accurate at recognizing pain in humans.
Total empathy, EE, and CE were not correlated with human pain ratings or accuracy (see ).
Discussion
This study set out to determine how experience with horses affected individuals’ subjective ratings and recognition accuracy of pain in horse and human faces. It also assessed the impact of SA traits and empathy on these measures. Further, it explored the relationships between these variables and participants’ ratings of arousal and valence in faces of both species. To assess the accuracy of horse pain ratings, we gathered the responses of 10 equine behavior professionals to our equine stimulus set.
Our study revealed several important findings. First, the overall ability of participants to accurately judge the level of pain was higher for human than horse faces, as were subjective ratings of pain. Second, the horse-experienced group was more accurate at judging pain in horse faces than those without experience, and the length of experience predicted horse pain recognition accuracy. Third, we found differences between the accuracy and ratings of horse and human faces depending on SA level. SA traits predicted human, but not horse, pain recognition accuracy, whereas they predicted horse, but not human, pain ratings. Finally, a notable null effect emerged as our measures of empathy (total, cognitive, and emotional) were not related to pain ratings or accuracy for either horse or human faces.
It may be unsurprising that our participants were better able to perceive pain in the human faces they viewed. Horses, despite their long history of domestication, have coexisted with humans for only around 6,000 years, whereas the emergence of Homo sapiens predates this by some 200,000–300,000 years (Galway-Witham & Stringer, Citation2018). From an evolutionary perspective, being able to determine the pain status of conspecifics through facial cues confers considerable advantage on the decoder (Kappesser, Citation2019; Williams, Citation2002). Hence, this finding aligns with an extensive body of evidence supporting superior processing of human faces over those of other species (Jakobsen et al., Citation2021; Scott & Fava, Citation2013; Simpson et al., Citation2014). In addition, the DPD (Mende-Siedlecki et al., Citation2018, Citation2020) is a database of posed facial expressions of pain designed to capture the entire range of intensity. The horse faces, however, were unlikely to display such extreme expressions: their expressions were necessarily spontaneous and, further, as prey animals, horses may be predisposed to conceal pain from predators to hide their vulnerability, which may reduce the apparent intensity of their expressions (Ashley et al., Citation2005; Kata et al., Citation2014; although see Carbone, Citation2020).
Our results showed that not only was the horse-experienced group more accurate at recognizing horse pain, but the length of that experience linearly predicted pain recognition accuracy. This finding is important as it indicates that pain recognition accuracy is malleable and could therefore be targeted directly in interventions to improve this ability in relevant individuals. That our horse caregiver sample was recruited from a local general-interest equestrian Facebook group, rather than one focused on behavior or welfare, supports the idea that regular interaction with horses may be sufficient to improve pain recognition skills. Furthermore, the linear relationship suggests that small but significant improvements could be made over a relatively short period of intensive training. This approach has been used successfully in human emotional expression training for autistic people (Rice et al., Citation2015; Russo-Ponsaran et al., Citation2016; Wieckowski et al., Citation2020).
In addition, one previous study trained individuals with no horse experience to detect pain in horse faces over a single 30-min session using the Horse Grimace Scale (Dai et al., Citation2020). Improvements were found in detecting only two pain markers (stiffly backward ears and orbital tightening), demonstrating the difficulties that naïve participants have in applying the HGS. Future research is clearly warranted to assess the viability of alternative forms of training for pain detection.
Our results demonstrated that SA level influenced perceptions of both human- and horse-pain faces, but in different ways. First, higher SA participants showed superior pain recognition abilities compared with those who were less anxious. This finding adds to the extensive literature that confirms the negativity and interpretation biases in social anxiety (Alden & Bieling, Citation1998; Clark & Wells, Citation1995; Gregory et al., Citation2019; Konovalova et al., Citation2021). However, this result goes further by demonstrating for the first time the enhanced processing ability for pain faces specifically. The mechanism by which this effect emerges requires further investigation. However, some studies have reported superior mentalizing ability in SA samples (Sutterby et al., Citation2012; Tibi-Elhanany & Shamay-Tsoory, Citation2011); therefore, our finding could be explained by an increased aptitude for simulating the mental state of the person in the photograph. An alternative explanation would be that higher SA individuals may be more attentive to certain facial cues related to expressions of pain, such as the wrinkled nose or brow (Prkachin, Citation2011; Williams, Citation2002; Yan et al., Citation2017). Future work might consider employing eye-tracking to explore this possibility.
Being more sensitive to the pain of others would feasibly contribute to the distress felt by SA people when in social situations and may contribute to their wish to avoid social interaction, especially where they are involved with sick or injured individuals. On the other hand, there may be potential benefits associated with enhanced decoding of pain that might be advantageous in health or care settings and especially when interacting with non-verbal individuals. Our study was limited to being able to report the association between pain recognition accuracy and SA, but future research should investigate the subjective emotional reactions of SA participants when they encounter those in pain to better understand this relationship.
For horse faces, higher SA scores predicted higher ratings of pain. This is distinct from the finding with human faces in that with horses, the participants were inaccurate in their judgments. One explanation is that SA participants may be misreading certain facial cues as indicators of pain; for example, they could be misinterpreting the closed eyes of the horses in pain as an indicator of relaxation and therefore rating the images with open eyes as being more painful. Alternatively, the SA participants may have a greater affinity for animals; this might bias their ratings. For example, it has been shown that individuals with mental health conditions or those with socioemotional vulnerabilities or disadvantaged backgrounds often benefit greatly from interventions involving horses (Alfonso et al., Citation2015; Kendall et al., Citation2015; Sullivan & Hemingway, Citation2024) perhaps because the horse provides a psychologically safer form of interaction than they experience with humans, or, as suggested by one reviewer, the animals involved have been selected because they are particularly calm or inexpressive. These possibilities all require additional exploration.
A further notable finding was that our measures of empathy – total score and the CE and EE subscales of the EQ (Baron-Cohen & Wheelwright, 2004) – were not related to any pain judgments. To our knowledge, no previous research has examined relationships between pain recognition and empathy (using the EQ) in either species, although a relationship between pain accuracy and empathic concern (an aspect of emotional empathy) was shown using a different scale (Green et al., 2009; Ruben & Hall, 2013).
Although observing someone in pain activates neural regions similar to those associated with the direct experience (Corradi-Dell’Acqua et al., 2011; Lamm et al., 2011; Singer et al., 2004), this short-term neural response is quite distinct from the general construct of trait empathy. While our findings suggest that the relationship between empathy and pain recognition in human faces may not extend to self-report measures, this issue requires further exploration. Notably, self-report measures of empathy, including the EQ, are subject to methodological criticisms, including their vulnerability to response bias and their tenuous relationships to everyday empathic processes and behaviors (Harrison et al., 2022; Vieten et al., 2024). We suggest that the null findings herein be interpreted cautiously and that the question of relationships between empathy and same- and different-species pain recognition remains open for future investigation.
Our analysis of relationships between arousal and valence ratings and the other measures highlighted several associations that could be explored in future research. Of greatest note, participants who were higher in emotional empathy tended to rate horse faces more negatively. This accords with the finding that SA scores predicted higher horse pain ratings, in that both suggest participants experiencing a heightened emotional state may feel greater emotional resonance with the horse’s mental state (though interestingly, not also with that of humans). For horse and human faces, higher arousal ratings were associated with higher pain judgments, with a stronger effect for horses than humans, and for human faces, this extended to greater pain accuracy as well as ratings. This suggests that participants may have been basing their pain judgments of horses on the same heuristic as for humans (i.e., more arousal cues = more pain), though the results suggest this was not an effective strategy.
Finally, we found a weak positive association between horse experience and both arousal and valence ratings of horse faces, with more experienced participants rating the horse faces as more aroused and more positive, perhaps reflecting their more developed repertoire of horse emotional states. Clearly, these findings must be interpreted with caution, and future work should attempt to replicate these effects, but they do provide the first evidence of how people with and without horse experience interpret the facial expressions of horses.
Our study has demonstrated that, as with the human face processing system, there is individual variability in how people perceive the faces of horses. Our research has begun the process of establishing which variables confer superior pain recognition abilities and which may bias processing in one direction or another. One issue we could not address, even in an exploratory fashion, due to the small number of men in our sample, was the impact of participants’ gender on pain recognition. Previous research suggests attitudes toward horses may differ between equestrians of different genders (Górecka-Bruzda et al., 2011), so future research should explore gender differences in pain recognition by recruiting a more gender-balanced sample.
Our results are suggestive of the potential benefits of training people who regularly interact with horses to recognize facial expressions of pain. Many equestrians may not realize a horse is in pain until the animal’s behavior escalates to potentially dangerous levels. Such education could therefore improve handler safety and positively impact horse welfare. Emotion recognition training is commonly used for human faces in groups of people who struggle in this domain, and translating this model to horse faces would be an obvious area for future research. Such a program could be incorporated into the syllabus of equestrian qualifications and training offered by organizations in the UK such as the Pony Club and the British Horse Society.
Our results will also benefit the creation of more accurate automatic horse pain detection algorithms, which require an initial phase of annotation by human classifiers. By recruiting individuals with enhanced pain-recognition skills to carry out this initial phase of development, the potential of this technology can be maximized, allowing the creation of models with even greater accuracy than those currently available. Finally, the current study demonstrates for the first time that people with higher SA are more sensitive to the pain of others. This novel finding will be of benefit to clinicians and researchers working with patients with SAD, as it provides further insight into the factors potentially involved in developing and maintaining the condition.
Data Sharing
Data are available on request.
Acknowledgements
We thank our participants, without whom the research would not be possible. We especially convey our gratitude to the equine behavior professionals (the “expert group”) who rated the horse faces in the study.
Disclosure Statement
No potential conflict of interest was reported by the authors.