Recent papers with perspectives on CORE translations (3): Evans, Paz, and Mascialino 2021 - do we need more than one translation into the same target language?! : Clinical Outcomes in Routine Evaluation (and CST)

I am coming back to this theme within my CORE work: the issues involved in translating measures, particularly mental health & well-being self-report questionnaires, are not trivial.

This post is the first of two looking at when linguistic variations within a target language, usually between countries, are such that they will impact on score comparability and perhaps mean that we really need more than one translation for the same language.

How to translate “unhappy”?

This is the starting point. So is s/he in the header image unhappy? I think not, but can we assume that the word “unhappy”, as in item 27 of the CORE-OM “I have felt unhappy”, is easy to translate into other languages and meaningful across different cultures? The main challenge of getting a good enough translation of a measure is to find a wording in the target language that is probably close to the meaning in the source language for as wide a membership of those who speak the target language as possible. Achieving this takes far more than getting a forward translation and an independent back translation, comparing them, and accepting the translation if the original and the back translation look very similar. There are a number of good guides to translating self-report measures and I believe that my CORE one (https://www.coresystemtrust.org.uk/home/translations/) is in line with most of the dominant ones, though mine leans more heavily on lay people’s inputs and less on domain and language expert inputs. (The logic is that I think the expert-leaning methods can be more prone to use language that is fine for the well educated but not so good for those with less education, who often form a large part of those asking for help.)

Language variation

One issue for all translations is that of language variation. For some languages variation is mainly oral, a matter of pronunciation, and the written language is very standard even across speakers whose spoken language may be very hard for those from a different area to understand. I am told (by Chinese people) that there are variants of spoken Chinese that really are mutually unintelligible but that written Chinese, at least using the simplified character set rather than the traditional one, is understood by all. It’s a standing joke in the UK that many people from England struggle to understand some Scots, perhaps particularly those from Glasgow. Sadly, I confess that’s true for me.

In this paper we addressed an issue that had concerned Clara (Professor Paz) and myself for some years: was the translation of the CORE-OM into Spanish, done in Spain and with attention to language variation across Spain, OK for use in Ecuador and other Spanish-speaking Latin American countries?

In this paper we think we have pioneered one very solid way to address this question. The method was a mix of qualitative and quantitative steps as follows.

Ten people born in Ecuador, but who had lived in English-speaking countries and are sensitive to cultural differences between those contexts, were asked to translate the CORE-OM from English to Spanish. The instructions were:
Do not translate the items literally but consider each item as a whole. Try to make a translation that can be understood in the Ecuadorian context while conveying the nuances of the English version of each item.
The translations produced were analyzed to identify consensus and issues for translations for each item. These translations were then compared with the Spanish version of the CORE-OM (Trujillo et al., 2016) to generate a pool of possible translations for each item to be used as alternatives in the next step.
Eleven people, none of whom had participated in step one, were asked to “talk aloud” through the items, including alternatives emerging from step 1. These participants were chosen to cover Spanish language variation within Ecuador and to cover a range of ages, gender and education. They were asked:
- (1) Do you understand the item? What do you think it means?
- (2) Do you think the general population could understand this item? If not, why do you think it will be difficult to understand? How would you change the wording to make it more understandable?, and
- (3) Do you think that a person who is experiencing distress could understand this item? If not, how would you change the wording to make it more understandable?
- For items with multiple versions of the translations from phase one, the interviewees were asked which of the multiple versions, the original from Spain, or new alternatives, would be best understood in Ecuador.
At this point it was clear that the translation from Spain of item 27 “I have felt unhappy” as “Me he sentido infeliz” was widely seen as inferior for an Ecuadorian population to “Me he sentido triste”. Simplifying a little, the issue was that “infeliz” can have quite a pejorative trait connotation in Ecuador while “triste” is more simply a state descriptor. We therefore needed to see whether the difference was sufficient to justify using a different translation from the one from Spain. To start to explore this we conducted a fairly classical rating study with 54 psychology students. To avoid over-focusing on the “infeliz” versus “triste” issue they were presented with five items:
- The original: “Me he sentido infeliz”
- The preferred alternative “Me he sentido triste,” that had emerged from the steps above and three other options:
- “Me he sentido intimidado/a” [I have felt afraid],
- “Me he sentido tenso/a” [I have felt stressed] and
- “Me he sentido desesperanzado/a” [I have felt hopeless]
They were asked to rate each item on two visual analogue scales, the first from “This does not fit how I have felt at all today” to “This really fits how I have felt today” and the second from “This does not fit how I have felt in the last seven days” to “This really fits how I have felt in the last 7 days.” The focal analyses were whether there were mean differences between the “infeliz” and the “triste” items on those ratings. In addition the participants were asked:
- How much do you think this item might upset some people?
- Do you think this item might put someone off answering a questionnaire if they found it in there?
- Do you think some people might not answer this item honestly?
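The focal comparison in this rating step is a paired one: each student rates both “infeliz” and “triste”, so the analysis works on within-rater differences. A minimal sketch of that idea follows. It assumes a paired t statistic as the summary, which may not be the exact test used in the paper; the function name `paired_difference` and the ratings are hypothetical illustrations, not data from the study.

```python
import math
import statistics


def paired_difference(a, b):
    """Summarise the paired differences a - b for ratings from the same raters."""
    if len(a) != len(b):
        raise ValueError("both lists must hold one rating per rater")
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample SD (n - 1 denominator)
    # Paired t statistic: mean difference over its standard error
    t_stat = mean_diff / (sd_diff / math.sqrt(n))
    return {"n": n, "mean_diff": mean_diff, "sd_diff": sd_diff,
            "t": t_stat, "df": n - 1}


# Hypothetical 0-100 visual analogue ratings from six raters (invented data)
infeliz = [20, 35, 10, 50, 15, 30]
triste = [30, 40, 25, 55, 20, 45]

result = paired_difference(triste, infeliz)
print(result)
```

Reporting the mean difference and its SD alongside the test statistic keeps the size of the difference visible, not just whether it is statistically detectable.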
Finally, to provide a different quantitative exploration of differences between “infeliz” and “triste”, their mapping to other words was explored. The students were asked to consider a list of adjectives: “culpable” [guilty], “asqueado” [nauseous], “avergonzado” [ashamed], “deprimido” [depressed], “enfadado” [angry], “irritable” [irritable], “desesperado” [desperate], “miserable” [miserable], “desdichado” [unfortunate], and for each of those to say which they thought was closest to each of the five adjectives included in the items presented in previous sections (“infeliz,” “triste,” “intimidado/a,” “tenso/a,” and “desesperanzado/a”). The difference in paired proportions of association of each of the listed adjectives with “infeliz” and with “triste” was tested by McNemar’s test of paired association.
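McNemar’s test uses only the discordant pairs: raters who linked an adjective to one word but not the other. The sketch below is the exact (binomial) form of the test, which is an assumption here; the paper may have used the chi-squared form. It also assumes each rater’s answers have been coded as two yes/no indicators per adjective. The function name `mcnemar_exact` and the counts are hypothetical.

```python
from math import comb


def mcnemar_exact(b, c):
    """Two-sided exact McNemar p-value from the two discordant counts.

    b: raters who linked the adjective to "infeliz" but not to "triste"
    c: raters who linked the adjective to "triste" but not to "infeliz"
    """
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs, so no evidence of a difference
    k = min(b, c)
    # Lower tail of Binomial(n, 0.5), doubled for a two-sided test, capped at 1
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical counts for one adjective (invented, not from the paper)
p_value = mcnemar_exact(3, 12)
print(f"exact McNemar p = {p_value:.4f}")
```

Concordant raters (those who linked the adjective to both words, or to neither) do not enter the calculation at all, which is what makes the test appropriate for paired proportions.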
The final step was a classical psychometric exploration of responses on a modified CORE-OM with an extra item, “Me he sentido triste”. The participants formed two subgroups: a help-seeking group (n = 171) and a non-help-seeking group (n = 1,002). The focal analyses were of mean differences between scores on the two versions of item 27, differences in scores on the scales to which item 27 contributes, and differences in Cronbach’s alpha for those scales.
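To make the reliability comparison concrete, here is a minimal sketch of computing Cronbach’s alpha for a scale scored once with the “infeliz” version of the item and once with the “triste” version. The function name `cronbach_alpha`, the three-item “rest of scale” and all the scores are hypothetical; the real scales have more items and the real analysis used the full samples.

```python
import statistics


def cronbach_alpha(rows):
    """Cronbach's alpha; rows is a list of respondents, each a list of item scores."""
    k = len(rows[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    columns = list(zip(*rows))  # one tuple of scores per item
    item_variances = [statistics.variance(col) for col in columns]
    total_variance = statistics.variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Invented 0-4 item scores for six respondents
other_items = [[0, 1, 1], [2, 2, 3], [1, 1, 2], [3, 4, 3], [4, 3, 4], [2, 3, 2]]
infeliz = [0, 2, 1, 3, 4, 3]  # item 27, wording from Spain
triste = [1, 3, 1, 4, 4, 3]   # item 27, alternative wording

alpha_infeliz = cronbach_alpha([r + [x] for r, x in zip(other_items, infeliz)])
alpha_triste = cronbach_alpha([r + [x] for r, x in zip(other_items, triste)])
print(f"alpha with infeliz: {alpha_infeliz:.3f}")
print(f"alpha with triste:  {alpha_triste:.3f}")
print(f"difference:         {alpha_triste - alpha_infeliz:.3f}")
```

The quantity of interest is the difference between the two alphas, judged against what matters for comparing scores, rather than either alpha on its own.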

All the details are in the paper, which is open access, but to cut a long story short: all the quantitative steps showed statistically significant differences between “infeliz” and “triste”. However, the final, relatively large n study showed that the differences in mean scores and in the internal reliabilities of the scale scores (and total score) were trivial. We were able to conclude that while there clearly are language variation differences between Spain and Ecuador, differences that have detectable, statistically significant effects, their impact on score comparison was small enough to be ignorable.

We encourage others to use these or similar steps to look for potentially important language variation where a target language clearly has substantial variation (Spanish and Arabic are probably the prime examples, but what about English?). If the early steps suggest that differences exist then we believe that, for MH/WB measures such as the CORE measures, OQ, ORS etc., the final step is a more robust and useful assessment of score comparability, and of the possible need for multiple translations, than the current trend for factorial invariance explorations.