Classification of phonocardiogram signals to detect heart murmurs

N8 Digital Health Community Day

Dr Nicola Rennie

What are phonocardiogram signals?

Time series plot of aortic valve sound recording for subject 13918.

Data

  • CirCor DigiScope Phonocardiogram Dataset [1].

  • 5,272 sound recordings of heartbeats.

  • 1,568 different patients.

  • 4 recording locations.

  • Patient information such as height, weight, age, …, and whether or not they had been diagnosed with a heart murmur.

Time series plot of aortic valve sound recording for subject 13918.

[1] J. H. Oliveira et al. (2021). The CirCor DigiScope Dataset: From Murmur Detection to Murmur Classification. IEEE Journal of Biomedical and Health Informatics.

Why classify phonocardiogram signals?

Aim: predict which recordings (time series) belong to patients with heart murmurs.


  • Additional diagnostic tool
    • can pick up sub-audible sounds
    • cost-effective, first-line screening
  • Longer term monitoring

Grid of 100 users with one highlighted

Classifying time series

Time series plot coloured differently above and below zero

Scatter plot of mean against standard deviation of time series

  • Calculate some features of the time series.

  • Use the features as input to classification algorithms instead of the raw time series data.
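The two steps above can be sketched in a few lines. This is a minimal illustration only: the function name `summary_features`, the three features chosen, and the toy recordings are all assumptions made for the example, not the features or data used in the talk.

```python
from statistics import mean, stdev


def summary_features(series):
    """Return a few simple summary features for one time series (a list of numbers)."""
    above_zero = sum(1 for value in series if value > 0)
    return {
        "mean": mean(series),
        "sd": stdev(series),  # sample standard deviation
        "prop_above_zero": above_zero / len(series),
    }


# Hypothetical toy recordings, not values from the CirCor dataset.
recordings = {
    "rec_a": [0.0, 0.5, -0.5, 1.0, -1.0],
    "rec_b": [0.1, 0.2, 0.1, 0.3, 0.3],
}

# One row of features per recording: this table, rather than the raw
# signal, is what a classification algorithm would take as input.
feature_table = {name: summary_features(values) for name, values in recordings.items()}

for name, features in feature_table.items():
    print(name, features)
```

Each recording, whatever its length, is reduced to a fixed-length row, which is what lets standard classifiers be applied to recordings of different durations.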

Time series features

Some time series features will tell us useful things…

Box plot showing ACF distributions for four locations, coloured by those with and without heart murmurs.

… some won’t.
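As an illustration of one such feature, the sketch below computes the sample autocorrelation (ACF) of a series at a given lag. The function `acf` and the two toy series are assumptions for this example; the talk does not say which lags or which implementation were used.

```python
def acf(series, lag):
    """Sample autocorrelation of `series` at a non-negative integer `lag`."""
    n = len(series)
    if not 0 <= lag < n:
        raise ValueError("lag must satisfy 0 <= lag < len(series)")
    centre = sum(series) / n
    deviations = [value - centre for value in series]
    denominator = sum(d * d for d in deviations)
    if denominator == 0:
        raise ValueError("autocorrelation is undefined for a constant series")
    # Sum of products of deviations that are `lag` steps apart.
    numerator = sum(deviations[t] * deviations[t + lag] for t in range(n - lag))
    return numerator / denominator


# Hypothetical toy series, chosen to give contrasting ACF values.
smooth = [0, 1, 2, 3, 4, 5, 6, 7]            # changes slowly
alternating = [1, -1, 1, -1, 1, -1, 1, -1]   # flips sign at every step

print(acf(smooth, 1))       # positive: neighbouring values are similar
print(acf(alternating, 1))  # negative: neighbouring values oppose each other
```

A feature is only useful for classification if its distribution differs between the two groups, which is what a box plot split by murmur status lets you check by eye.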

Machine learning models

  • Logistic Regression

  • Lasso Logistic Regression

  • Random Forests

  • Support Vector Machines

  • Naive Bayes
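A hedged sketch of how these five model families could be fitted to a feature table with scikit-learn. Everything here is an assumption for illustration: the data are synthetic, the hyperparameters are arbitrary, and the talk does not state which software or settings were used.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for a feature table: 200 recordings, 5 features.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
# Hypothetical label driven by the first two features plus noise.
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=200) > 0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

models = {
    # Note: scikit-learn's default LogisticRegression applies an L2 penalty.
    "Logistic regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    # An L1 (lasso) penalty can shrink some coefficients exactly to zero.
    "Lasso logistic regression": make_pipeline(
        StandardScaler(), LogisticRegression(penalty="l1", solver="liblinear", C=0.5)
    ),
    "Random forest": RandomForestClassifier(n_estimators=200, random_state=0),
    "Support vector machine": make_pipeline(StandardScaler(), SVC()),
    "Naive Bayes": GaussianNB(),
}

test_accuracy = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    test_accuracy[name] = model.score(X_test, y_test)  # accuracy on held-out data
    print(f"{name}: {test_accuracy[name]:.2f}")
```

Keeping the models in one dictionary means every candidate is trained and scored on the same split, so the comparison between them is like for like.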

Initial results

Lasso logistic regression

Accuracy: 0.81
ROC AUC: 0.65

Confusion matrix of lasso logistic regression results

Random forest

Accuracy: 0.80
ROC AUC: 0.71

Confusion matrix of random forest results
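To make the three summaries above concrete, the sketch below computes a confusion matrix, accuracy, and ROC AUC from scratch on a small made-up example. The labels and scores are hypothetical and chosen only to show how the metrics are calculated; they are not the talk's data. The example also shows a general point: when one class is rare, accuracy can look high while ROC AUC, which measures how well the scores rank positives above negatives, is more modest.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives, with 1 as the positive class."""
    pairs = list(zip(y_true, y_pred))
    return {
        "tp": sum(1 for t, p in pairs if t == 1 and p == 1),
        "fp": sum(1 for t, p in pairs if t == 0 and p == 1),
        "fn": sum(1 for t, p in pairs if t == 1 and p == 0),
        "tn": sum(1 for t, p in pairs if t == 0 and p == 0),
    }


def accuracy(y_true, y_pred):
    """Proportion of predictions that match the true labels."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


def roc_auc(y_true, scores):
    """Probability that a random positive outscores a random negative (ties count half)."""
    positives = [s for t, s in zip(y_true, scores) if t == 1]
    negatives = [s for t, s in zip(y_true, scores) if t == 0]
    if not positives or not negatives:
        raise ValueError("ROC AUC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# Hypothetical example: 2 recordings with a murmur (1) and 8 without (0).
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
scores = [0.6, 0.2, 0.4, 0.3, 0.1, 0.1, 0.2, 0.3, 0.1, 0.05]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # classify at a 0.5 threshold

print(confusion_counts(y_true, y_pred))  # {'tp': 1, 'fp': 0, 'fn': 1, 'tn': 8}
print(accuracy(y_true, y_pred))          # 0.9
print(roc_auc(y_true, scores))           # 0.78125
print(accuracy(y_true, [0] * 10))        # 0.8, from always predicting "no murmur"
```

In this toy case, always predicting "no murmur" already reaches 0.8 accuracy, so accuracy alone says little about how well murmurs are detected. Looking at the confusion matrix alongside it shows which kind of error the model makes.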

Future work

  • Issues with bias

  • Multinomial classification

  • Exploiting location information

  • Feature engineering

Diagram showing mixture of high and low accuracy