The CORE-OM and the h-index

The CORE-OM was the yield from a 3-year grant from the UK Mental Health Foundation, which started in 1995 and resulted in the launch of the CORE-OM and associated system in 1998. One question with all such developments is: what has been its impact?

As those working in university settings will know, one index of the impact of an individual’s research is their h-index. In 2005, the physicist Jorge E. Hirsch proposed this simple metric to quantify the scientific output of an individual researcher: the value of h is the largest number N such that the researcher has N papers with N or more citations each. For example, an h-index of 10 means there are 10 articles that each have 10 citations or more. The metric is useful because it discounts both the disproportionate weight of a few very highly cited papers and the tail of papers that have rarely or never been cited. It also gets progressively harder to raise: for an h-index of 20 to rise to 21, there must be 21 articles each with at least 21 citations.
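The definition above can be made concrete in a few lines of code. This is a minimal sketch (the function name `h_index` is mine, not part of any bibliometric tool): sort the citation counts in descending order and find the last rank at which the count still meets or exceeds the rank.

```python
def h_index(citations):
    """Return the h-index: the largest h such that h papers
    have at least h citations each."""
    counts = sorted(citations, reverse=True)
    h = 0
    # Walk down the sorted counts; while the count at rank r is
    # at least r, an h-index of r is achieved.
    for rank, count in enumerate(counts, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h

# Four papers have at least 4 citations each, so h = 4:
print(h_index([10, 8, 5, 4, 3]))  # → 4
```

Note how the fifth paper (3 citations) contributes nothing: to reach h = 5, all five papers would need at least 5 citations each, which is the "increasingly difficult" property described above.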

So let’s suppose we imagine the CORE-OM as a researcher – what would be the h-index for the CORE-OM? Does this give us an idea of one aspect – there are many others – of the impact of the CORE-OM?

Well, as with everything, nothing is straightforward. But a simple search on the term ‘CORE-OM’ in SCOPUS yields an h-index of 20 (one article was excluded as not relating to the CORE-OM). So, 20 articles using the CORE-OM have each been cited on SCOPUS at least 20 times.

The top 2 publications, each cited over 200 times, are understandably the key initial articles on the development of the CORE-OM, published in the British Journal of Psychiatry (2002) and the Journal of Consulting and Clinical Psychology (2001).

However, some key papers, particularly those early on, did not explicitly use the term ‘CORE-OM’, and some articles might use the CORE-OM in their report but not cite it in the Abstract, which is the usual source for electronic searches to pick up related work. Some early work was not identified, as some journals are not sampled by SCOPUS. However, there was the slightly inexplicable exclusion of an article containing ‘CORE-OM’ in the title!

So a slightly wider search (together with some filtering as more inclusive search terms collect work that is not relevant) yielded an h-index of 23. This search strategy picked up 2 early publications relating to the CORE-OM, both of which appeared in Journal of Mental Health (1998, 2000). However, even this search did not pick up some important articles that used other versions of the CORE-OM, in particular the Short Form A and B versions.

In contrast, a search of ‘CORE-OM’ on Web of Science yielded an h-index of 22 after a couple of papers were deleted as the research did not focus on the CORE-OM. The top 2 articles were the same as in the SCOPUS search. However, WoS does not search Journal of Mental Health, so some of the early work is not detected regardless of extending the search terms, but the search did pick up interesting work using the Short Form versions of the CORE-OM.

So, it would seem that we can say that the CORE-OM has an h-index of 23 according to SCOPUS and 22 for Web of Science. Although the profile of articles for each database is slightly different, the slightly higher h-index for SCOPUS is consistent with the database having a wider scope than WoS – you will likely find the same if you establish the h-index for your own research.

So, what does this mean for the impact of the CORE-OM?

Well, Hirsch suggested that after 20 years of research, an h-index of 20 is good, 40 outstanding, and 60 exceptional. So, on these simple guidelines, in less than 20 years the CORE-OM has had a good impact.

All in all, I guess that’s not bad. Using the h-index in this way is not as precise as applying it to researchers (who either are an author of a paper or are not): the role the CORE-OM plays in the articles varies, and of course this is only one (imperfect) index of the impact that the CORE-OM has had.

If readers have examples of the impact that the CORE-OM has had for their work or practice, then let us know – we would be very interested to hear from you.

Thank you



What’s in a name (2): domains, scales, scores, factors & dimensions

The original commissioning specification for the CORE system required that the items in the measure cover domains of wellbeing (or “well being”, or “well-being”: there’s another naming issue!), problems/symptoms, functioning and risk.  The questions were supposed, where possible, to include both intrapsychic and interpersonal items; functioning was to cover both the more personal/intimate and the more social; and risk was to cover intrapunitive and extrapunitive risk, i.e. risk to self and risk to others.  We liked this framework and noted that the first three domains had some links to the phase model of change in therapy, which suggests that well-being change comes first, then symptom/problem improvement, then functioning improvement (Howard, Lueger, Maling & Martinovich (1993) A phase model of psychotherapy outcome: causal mediation of change. J Consult Clin Psychol. 61(4):678-85).

We thought the commissioning specification was right that these were fairly conceptually distinct domains of experience that should be covered by a measure of change in therapy, a view supported by extensive surveys of therapists/practitioners, managers, commissioners (“purchasers” in the jargon of the time), end users and lay people.  We also thought we should say which items we saw as belonging most strongly to which domains, and offer the opportunity to study scores, and changes in scores, on each domain.  However, we never imagined that these would form clear “factors” or principal components in cross-sectional psychometric studies, nor that the chronological relationships between them over time in cohorts, or even within a single person in therapy, would be neat.  If you feel lousy (low wellbeing) it’s likely that you will have or develop problems and even symptoms, and vice versa.  Similarly, struggling to function well, whether in personal interactions, at work or in caring duties, will dent a sense of wellbeing and lead to problems: these simply aren’t independent factors or dimensions.

With the advantage of hindsight it’s easy to see that we should have been clearer about that.   We tried to use the terms “domains” and “domain scores” in preference to “factors”, “dimensions” or “scales”, but slipped from time to time.  We thought we were sufficiently explicit that our use of exploratory factor analysis was exactly that: exploratory, and mainly to check that there was a large main factor and a good collection of smaller factors.  We were unsurprised in our early work (Evans, Connell, Barkham, Margison, McGrath, Mellor-Clark & Audin (2002) Towards a standardised brief outcome measure: psychometric properties and utility of the CORE-OM. British Journal of Psychiatry, 180, 51–60) to find a structure that didn’t reflect the domains but which seemed to some extent to separate positively cued from negatively cued items, and to separate the risk items from the others.  We never expected that this structure would replicate strongly across different cultures and samples, and we only used confirmatory factor analysis to show just how poor the fit to a simple factor structure was (Lyne, Barrett, Evans & Barkham (2006) Dimensions of variation on the CORE-OM. British Journal of Clinical Psychology, 45, 185–203).  That paper was intended as a definitive statement about the expected psychometric structure, at least in British clinical samples.  Here’s the statement from the abstract:

The CORE-OM has a complex factor structure and may be best scored as 2 scales for risk and psychological distress. The distinct measurement of psychological problems and functioning is problematic, partly because many patients receiving out-patient psychological therapies and counselling services function relatively well in comparison with patients receiving general psychiatric services. In addition, a clear distinction between self-report scales for these variables is overshadowed by their common variance with a general factor for psychological distress.

And the end of the discussion:

These considerations with respect to the CORE-OM domains are of importance for future research and scale development, but the utility of CORE-OM has already been demonstrated as a widely used benchmarking measure and reliable indicator of change in psychotherapy research and practice. The scoring method that has proved most useful in this regard is that in which all 28 non-risk items are scored as one scale and the risk items as the other. This research confirms that the scale quality of CORE-OM when scored in this way is satisfactory.

So some suggestions/pleas:

  1. by all means report change on specific domain scores if they are pertinent to the work done with the client/patient, but don’t imply that the specific scores are well-defined factor-analytic scales;
  2. the risk and non-risk items are sufficiently distinct in cross-sectional psychometric studies that it may be wise to report the non-risk and risk scores as well as the total scores in almost any study;
  3. if you possibly can, talk about the scores from the CORE-OM and CORE-SF/A and SF/B as “domain scores” not “dimensions” or “factors”.

What’s in a name (1): scoring CORE measures

We may have caused a bit of confusion by introducing the term “Clinical score”.  Perhaps it’s not on the scale of the Capulet/Montague name tragedy (Shakespeare, 1591-1595?) but it may be worth explaining the scoring here as I do see mistakes and do get asked about this.


We started out scoring using the mean of the items, recommending pro-rating if not more than 10% of items were missing, i.e. using the mean of the remaining items.  That meant you could get a pro-rated mean overall score for the CORE-OM if as many as three items were missing; for the “non-risk” score if up to two of the non-risk items were missing; for the functioning and problems scores if one of their items was missing; and you couldn’t pro-rate at all if any items were missing for the well-being or risk scores.  You could get overall scores for the CORE-SF/A and CORE-SF/B if one of their items was missing (but not domain scores, as any missing item there means more than 10% of the domain’s items are missing).  Similarly, you could use a pro-rated score for the GP-CORE, the LD-CORE, the YP-CORE and the CORE-10 if one item was missing, but pro-rating the CORE-5 was clearly impossible.
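The pro-rating rule can be expressed compactly. This is a minimal sketch of the logic (the function name `prorated_mean` and its signature are mine, not official CORE software): missing items are represented as `None`, and pro-rating is refused if more than 10% of the measure’s items are missing.

```python
def prorated_mean(item_scores, n_items, max_missing_frac=0.10):
    """Pro-rated mean item score (0-4): the mean of the completed items,
    allowed only if no more than max_missing_frac of items are missing.

    item_scores -- list of length n_items, with None for missing items.
    """
    completed = [s for s in item_scores if s is not None]
    missing = n_items - len(completed)
    if missing > n_items * max_missing_frac:
        raise ValueError("too many missing items to pro-rate")
    return sum(completed) / len(completed)

# CORE-OM (34 items): three missing items is within 10%, so this works.
scores = [2] * 31 + [None] * 3
print(prorated_mean(scores, 34))  # → 2.0
```

Note how the rule reproduces the limits above: 10% of 34 items allows up to 3 missing, 10% of 28 non-risk items allows 2, 10% of a 10-item measure allows 1, and for the 5-item CORE-5 even one missing item (20%) exceeds the threshold.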

All those scores had to lie between 0 and 4 by definition, but they could be awkward-looking numbers like 0.84, and over the early years we got feedback that many clinicians and managers didn’t like these “less than one and fractional” scores.

“Clinical Scores”

With mixed feelings in the team, the idea of “Clinical Scores” came in: the item mean as above, but multiplied by 10, giving a score between 0 and 40 that in clinical samples would pretty much always be an x.y sort of number with x >= 1. The same rules about pro-rating were retained.  This “x10 = Clinical Score” rule gives rather easy scoring for a complete CORE-10 or complete YP-CORE: the Clinical Score is just the sum of the 10 item scores (but if one item is omitted you still have to take the mean of the nine completed items and multiply that by 10).   For a complete CORE-5 the route to the Clinical Score is almost as easy: it is twice (2x) the sum of the five items’ scores.
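Continuing the sketch above (again, `clinical_score` is my own illustrative name, not official CORE software), the Clinical Score is just the pro-rated mean multiplied by 10, which is why it coincides with the item sum for a complete 10-item measure and with twice the item sum for a complete CORE-5:

```python
def clinical_score(item_scores, n_items, max_missing_frac=0.10):
    """Clinical Score (0-40): pro-rated mean item score times 10.
    item_scores is a list of length n_items with None for missing items."""
    completed = [s for s in item_scores if s is not None]
    if (n_items - len(completed)) > n_items * max_missing_frac:
        raise ValueError("too many missing items to pro-rate")
    return 10 * sum(completed) / len(completed)

# Complete CORE-10: the Clinical Score equals the sum of the items.
print(clinical_score([1, 2, 0, 3, 2, 1, 2, 2, 1, 0], 10))  # → 14.0 (sum = 14)

# Complete CORE-5: the Clinical Score is twice the sum of the items.
print(clinical_score([2, 1, 0, 3, 2], 5))  # → 16.0 (2 × 8)
```

The shortcuts fall out of the arithmetic: 10 × (sum / 10) = sum for ten items, and 10 × (sum / 5) = 2 × sum for five.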

We sometimes see people reporting the sum of the items: please don’t do that, we’ve never recommended it anywhere.  We also see people not saying explicitly whether they’re using the original “mean item score” or the “Clinical Score”: please do say which you used, even if it seems very obvious.  Finally, we encourage people always to be explicit about having used pro-rating (if you have), and about the numbers of incomplete questionnaires and of items missed. All this maximises the comparability of reports.  Non-comparable scoring may not be as lethal as the Montague/Capulet feud was to Romeo, Juliet and Mercutio, but it’s definitely to be avoided!


Shakespeare, W. (1591-1595, exact date uncertain) “Romeo and Juliet”. Available in many versions, as the peer-reviewed format hadn’t been invented: quarto 1, quarto 2, first folio and later versions.  However, the fatal name issue is consistent in all.