Machine learning analysis of the SF-36 health questionnaire could serve as a diagnostic tool for ME/CFS


Spanish researchers looked for an alternative to using an expensive and invasive exercise test (CPET) as a diagnostic biomarker in patients with ME/CFS. They found that using Machine Learning to analyse the answers to the Short Form-36 (SF-36) questionnaire could predict oxygen consumption and reveal sub-types.


Unsupervised cluster analysis reveals distinct subtypes of ME/CFS patients based on peak oxygen consumption and SF-36 scores, by Marcos Lacasa, Patricia Launois, Ferran Prados, José Alegre, Jordi Casas-Roma in Clinical Therapeutics Oct 4, 2023


  • ME/CFS is a disabling chronic disease with a lack of diagnostic tests.
  • Oxygen consumption is a possible biomarker of ME/CFS.
  • O2 consumption allows classifying patients' status according to Weber’s classification.
  • A worse Weber’s classification implies a worse outcome on the SF-36 questionnaire.
  • Unsupervised machine learning is a powerful tool for analyzing data.

Research abstract

Myalgic encephalomyelitis, commonly referred to as chronic fatigue syndrome (ME/CFS), is a severe, disabling chronic disease, and an objective assessment of prognosis is crucial to evaluate the efficacy of future drugs. Attempts are ongoing to find a biomarker to objectively assess the health status of ME/CFS patients.

This study therefore aims to demonstrate that oxygen consumption is a biomarker of ME/CFS and provides a method to classify patients diagnosed with ME/CFS based on their responses to the Short Form-36 (SF-36) questionnaire, which can predict oxygen consumption as measured by cardiopulmonary exercise testing (CPET).

Two datasets were used in the study. The first contained SF-36 responses from 2,347 validated records of ME/CFS diagnosed participants, and an unsupervised machine learning model was developed to cluster the data. The second dataset was used as a validation set and included the cardiopulmonary exercise test (CPET) results of 239 participants diagnosed with ME/CFS. Participants from this dataset were grouped by peak oxygen consumption according to Weber’s classification.
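The grouping by peak oxygen consumption can be illustrated with a short sketch. The abstract does not state the cutoffs used, so the thresholds below are an assumption: they are the commonly cited Weber boundaries in mL/kg/min (A above 20, B 16 to 20, C 10 to 16, D below 10). The function name `weber_class`, the handling of values that fall exactly on a boundary, and the sample values are all hypothetical and are not taken from the study.

```python
from collections import Counter


def weber_class(peak_vo2: float) -> str:
    """Map peak VO2 (mL/kg/min) to a Weber class A-D.

    ASSUMPTION: cutoffs are the commonly cited Weber boundaries;
    the source abstract does not list the values it used.
    """
    if peak_vo2 < 0:
        raise ValueError("peak VO2 cannot be negative")
    if peak_vo2 > 20:
        return "A"
    if peak_vo2 >= 16:
        return "B"
    if peak_vo2 >= 10:
        return "C"
    return "D"


# Hypothetical peak VO2 values, for illustration only (not study data).
sample_vo2 = [24.1, 18.3, 12.7, 8.9, 15.2, 21.0]
class_counts = Counter(weber_class(v) for v in sample_vo2)
print(dict(class_counts))
```

Applied to a real CPET dataset, the same function would give each participant the categorical label that is later compared with the SF-36 cluster.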

The SF-36 questionnaire was correctly completed by only 92 patients, who were clustered using the machine learning model. Two categorical variables were then entered into a contingency table: the assigned cluster, with values {0, 1}, and the Weber classification, with values {A, B, C, D}. Finally, the Chi-square test of independence was used to assess the statistical significance of the relationship between the two variables.
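As a minimal sketch of the test described above, the following computes the Pearson chi-square statistic for a 2 x 4 contingency table (cluster by Weber class) in plain Python. The counts are invented for illustration and only chosen to sum to 92; they are not the study's data. The function name `chi_square_independence` is hypothetical. Instead of computing a p-value, the sketch compares the statistic with 7.815, the standard critical value for 3 degrees of freedom at the 0.05 level.

```python
def chi_square_independence(table):
    """Return (statistic, degrees_of_freedom) for a contingency table.

    `table` is a list of rows of observed counts. Every row and
    column total must be positive so that expected counts are non-zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    if min(row_totals) <= 0 or min(col_totals) <= 0:
        raise ValueError("every row and column needs a positive total")

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of the two variables.
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected

    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


# HYPOTHETICAL counts (rows: cluster 0 and 1; columns: Weber A, B, C, D).
observed = [
    [20, 15, 8, 3],
    [5, 10, 16, 15],
]
stat, dof = chi_square_independence(observed)

CRITICAL_VALUE_DF3_ALPHA_05 = 7.815  # chi-square table value for dof = 3
reject_independence = stat > CRITICAL_VALUE_DF3_ALPHA_05
print(f"chi2 = {stat:.3f}, dof = {dof}, reject independence: {reject_independence}")
```

With these made-up counts the statistic exceeds the critical value, which would indicate an association between cluster and Weber class. A statistics library would normally be used to obtain an exact p-value.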

The results indicate that the Weber classification is directly linked to the score on the SF-36 questionnaire. Furthermore, the 36-response matrix in the machine learning model was shown to give more reliable results than the subscale matrix (p-value < 0.05) for classifying patients with ME/CFS.

Low oxygen consumption on CPET can be considered a biomarker in patients with ME/CFS. Our analysis showed a close relationship between the cluster based on their SF-36 questionnaire score and the Weber classification, which was based on peak oxygen consumption during CPET. The dataset for the training model comprised raw responses from the SF-36 questionnaire, which is proven to better preserve the original information, thus improving the quality of the model.
