ME/CFS Research review: Analysis of data from 500,000 individuals in UK Biobank demonstrates an inherited component to ME/CFS, by Simon McGrath, 11 June 2018
With a guest blog by Professor Chris Ponting and colleagues.
UK Biobank – a national biobank different from the ME/CFS biobank – has data from around 500,000 individuals, including both healthy people and those with one or more of the many different diseases in the UK population. About 2,000 people in the sample reported that they had been given a diagnosis of CFS.
Analysis of data from this biobank indicates an inherited biological component for ME/CFS. The results show only one statistically significant change in a particular section of DNA and even this is problematic. This analysis indicates that a much bigger study, with many more ME/CFS cases, will be needed to indicate which genes and biological pathways are altered in people with ME/CFS.
Myalgic encephalomyelitis (ME, also described as chronic fatigue syndrome, CFS) is a devastating long-term condition affecting 250,000 UK individuals. People with ME experience severe, disabling fatigue associated with post-exertional malaise. A few make good progress and may recover, while most others remain ill for years and may never recover. There is no known cause, or effective treatment for most. Consequently, it is vital to try new approaches to understand the reasons for the development of the condition.
This blog sets out what we can glean from the release, last summer, of data from about 500,000 individuals who make up the UK Biobank. (This biobank is not to be confused with the UK ME/CFS Biobank, UKMEB.) The data were acquired from individuals between 40 and 69 years of age in 2006-2010 who live across the UK. These people provided samples (e.g. blood, urine and saliva) and answered questionnaires. In addition, for some of these people their electronic health record data are being linked in. Importantly for this blog, the DNA variation (‘genotype’) of all the volunteer participants has been determined.
Genetic variation can provide insights into the causes of disease when these have a heritable component (i.e. are inherited down through the generations). DNA sequence is not altered by disease (except in cancer) and so variants can reveal the causes, rather than consequences, of disease.
Here we draw heavily from an analysis of the UK Biobank data by Oriol Canela-Xandri, Konrad Rawlik and Albert Tenesa, which is described in a preprint available from bioRxiv. (The authors have kindly made their results available in this way so that others can see them before the findings have been peer reviewed.)
From this (specifically, Supplemental Table 1) we see that data were analysed from 1,829 people among the UK Biobank cohort who self-reported as having been diagnosed with ME/CFS. The table also provides five pieces of information:
(1) The prevalence of ME/CFS among UK Biobank individuals was 0.448%. In other words, a randomly chosen person in the UK who knows about 200 people has a slightly better than even chance of knowing someone with ME/CFS.
(2) There is a reasonably strong female bias: the prevalence rates are female = 0.611%; male = 0.255%; so the prevalence of ME/CFS in the UK Biobank cohort is about 2.4-fold higher in females than in males.
(3) Extrapolating these numbers to the UK as a whole, here are the full population prevalence predictions (using 2016 estimates for UK census populations).
One caveat should be mentioned with respect to these numbers. The 500,000 people assessed in the Biobank, despite being recruited for assessment at 22 centres in Scotland, Wales and England, are not fully representative of the general population. There appears to be a “healthy volunteer” selection bias, which would imply that the prevalence estimates are lower-bound values. Furthermore, if ME/CFS prevalence is different in other groups, this is not accounted for in the numbers above.
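The arithmetic behind points (1) to (3) can be checked with a short sketch. It is only illustrative: it assumes that a person's acquaintances behave like independent random draws from the cohort (real social circles do not), and the population passed to `expected_cases` is a hypothetical placeholder rather than a census figure. The function names are ours, not from the preprint.

```python
import math

# Prevalence figures quoted in the text (as fractions, not percentages).
PREVALENCE_ALL = 0.00448     # 0.448%
PREVALENCE_FEMALE = 0.00611  # 0.611%
PREVALENCE_MALE = 0.00255    # 0.255%


def prob_at_least_one(prevalence: float, n_acquaintances: int) -> float:
    """Chance that at least one of n independent acquaintances has ME/CFS."""
    return 1.0 - (1.0 - prevalence) ** n_acquaintances


def acquaintances_for_even_chance(prevalence: float) -> float:
    """Solve 1 - (1 - prevalence)**n = 0.5 for n."""
    return math.log(0.5) / math.log(1.0 - prevalence)


def expected_cases(population: int, prevalence: float) -> float:
    """Naive extrapolation: population size multiplied by prevalence."""
    return population * prevalence


if __name__ == "__main__":
    print("P(know a case | 200 acquaintances):",
          round(prob_at_least_one(PREVALENCE_ALL, 200), 3))
    print("Acquaintances needed for an even chance:",
          round(acquaintances_for_even_chance(PREVALENCE_ALL)))
    print("Female-to-male prevalence ratio:",
          round(PREVALENCE_FEMALE / PREVALENCE_MALE, 1))
    # Hypothetical population of one million, purely for illustration.
    print("Expected cases per million people:",
          round(expected_cases(1_000_000, PREVALENCE_ALL)))
```

Because of the “healthy volunteer” bias described above, any figure extrapolated this way is better read as a lower bound than as a point estimate.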
(4) ME/CFS has a biological component because the heritability of ME/CFS is not zero. Canela-Xandri et al. estimate that the genetic heritability (liability scale) is 0.080. This is slightly lower than the median heritability of heritable binary traits (0.11; see Figure 1). So among all the traits measured, ME/CFS sits in the lower half of the heritability range, but its heritability is not zero. Note that this doesn’t rule out non-heritable biological causes.
(5) The analysis identifies one, and only one, DNA position whose genetic variation is associated with ME/CFS susceptibility, and it accounts for only part of that susceptibility. (The plot below is called a Manhattan plot, and any point above the dashed line is predicted to be a significant “hit”. Each dot represents a position (X axis) along a chromosome, with chromosomes shown alternately in red and blue, and its position on the Y axis indicates the statistical significance of the association: the higher the dot, the stronger the evidence.)
Statistical significance for the association between each DNA position and ME/CFS across 22 chromosomes. The arrow highlights the one “significant hit”.
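For readers unfamiliar with Manhattan plots, the height of each dot is conventionally the negative base-10 logarithm of the p-value, so smaller p-values sit higher. The sketch below assumes that convention and assumes the widely used genome-wide significance threshold of 5×10⁻⁸ for the dashed line; the preprint may draw its line elsewhere. It uses the p-value reported for the single hit (2.5×10⁻¹²), and the function names are illustrative.

```python
import math

# Assumption: the conventional genome-wide significance threshold.
# The dashed line in the preprint's plot may use a different cutoff.
GENOME_WIDE_THRESHOLD = 5e-8


def manhattan_height(p_value: float) -> float:
    """Y-axis value on a Manhattan plot: -log10 of the association p-value."""
    if not 0.0 < p_value <= 1.0:
        raise ValueError("p-value must be in the interval (0, 1]")
    return -math.log10(p_value)


def is_significant(p_value: float,
                   threshold: float = GENOME_WIDE_THRESHOLD) -> bool:
    """True if the p-value is below the (assumed) significance threshold."""
    return p_value < threshold


if __name__ == "__main__":
    hit_p = 2.5e-12  # p-value quoted in the text for rs150954845
    print("Height of the hit:", round(manhattan_height(hit_p), 2))
    print("Height of the assumed dashed line:",
          round(manhattan_height(GENOME_WIDE_THRESHOLD), 2))
    print("Above the line:", is_significant(hit_p))
```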
This proposed “significant hit” is on chromosome 10 (position 74828696; rs150954845). The calculated p-value is 2.5×10⁻¹². This DNA change (A-to-T) is predicted to alter a protein called P4HA1, changing an aspartic acid (“D”; GAT) for a valine (“V”; GTT) at its 124th amino acid position. P4HA1 is prolyl 4-hydroxylase subunit alpha 1: in other words, one part of prolyl 4-hydroxylase, a key enzyme in collagen synthesis. We know what this molecule looks like and where the aspartic acid (D124) occurs within it (below; courtesy of Luis Sanchez-Pulido).
We can even see, at a resolution of 10⁻¹⁰ of a metre, what effect such a change would have on the protein (below; courtesy of Luis Sanchez-Pulido).
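The link between the single-base change and the amino acid swap can be traced with a small sketch. It uses only a fragment of the standard genetic code (the four GNT codons plus GAC) and takes the codons exactly as quoted above (GAT and GTT); the helper names are hypothetical and the code is not taken from the preprint.

```python
# Fragment of the standard genetic code: just the codons needed here.
CODON_TO_AMINO_ACID = {
    "GAT": "D",  # aspartic acid
    "GAC": "D",  # aspartic acid
    "GTT": "V",  # valine
    "GCT": "A",  # alanine
    "GGT": "G",  # glycine
}


def point_mutation(ref_codon: str, alt_codon: str) -> tuple[int, str, str]:
    """Return (1-based codon position, reference base, altered base).

    Raises ValueError unless the codons differ at exactly one position.
    """
    diffs = [
        (i + 1, ref, alt)
        for i, (ref, alt) in enumerate(zip(ref_codon, alt_codon))
        if ref != alt
    ]
    if len(ref_codon) != 3 or len(alt_codon) != 3 or len(diffs) != 1:
        raise ValueError("expected two codons differing by a single base")
    return diffs[0]


def protein_change(ref_codon: str, alt_codon: str, residue: int) -> str:
    """Short protein-level label for the change, such as 'D124V'."""
    ref_aa = CODON_TO_AMINO_ACID[ref_codon]
    alt_aa = CODON_TO_AMINO_ACID[alt_codon]
    return f"{ref_aa}{residue}{alt_aa}"


if __name__ == "__main__":
    print(point_mutation("GAT", "GTT"))       # which base in the codon changed
    print(protein_change("GAT", "GTT", 124))  # resulting amino acid change
```

The changed base is the middle one of the codon, which is why a single A-to-T substitution is enough to swap a negatively charged aspartic acid for a hydrophobic valine.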