According to recent statistics, dementia in its various forms afflicts more than 7.5 million people in the United States, with Alzheimer's disease accounting for 60-80% of these cases. Doctors diagnose Alzheimer's and other types of dementia through an examination that includes assessing cognitive function as a key indicator, but the disease is often diagnosed late. Early intervention could improve quality of life by helping to alleviate symptoms and slow the progression of the disease.
This research aims to identify auditory biomarkers that could be used for early detection of dementia. As the basis for our analysis, we used audio data collected from the Framingham Heart Study cohort during neuropsychological testing exams, together with the associated high-fidelity dementia status reports. Acoustic features were computed on the raw audio, and language-based features were computed over the annotated text obtained through both automatic and manual transcription. For the prediction task, we trained a random forest classifier and used the mean area under the receiver operating characteristic curve (AUC) across a 10-fold cross-validation scheme as the performance metric.
Despite some issues in processing the data due to the records’ age and low-quality audio, the results are very encouraging: a model based only on health and demographic features obtained an AUC of 0.52; using simple acoustic features boosted the AUC to 0.81; and including part-of-speech features allowed the model to reach an AUC of 0.91.
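To make the performance metric concrete, the following is a minimal sketch of how a mean AUC across cross-validation folds can be computed from held-out predictions. It is an illustration only, not the study's code: the function names `roc_auc` and `mean_cv_auc` are hypothetical, the scores are made-up toy numbers rather than study data, and the AUC is computed with the standard pairwise-ranking formulation (the probability that a randomly chosen positive case is scored above a randomly chosen negative one, with ties counted as one half).

```python
from statistics import mean


def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def mean_cv_auc(folds):
    """Average the per-fold AUC over (labels, scores) pairs, one per held-out fold."""
    return mean(roc_auc(labels, scores) for labels, scores in folds)


# Toy held-out predictions for two folds (hypothetical values, not study data).
folds = [
    ([1, 0, 1, 0], [0.9, 0.2, 0.6, 0.7]),
    ([0, 1, 0, 1], [0.1, 0.8, 0.3, 0.4]),
]
print(mean_cv_auc(folds))
```

In a 10-fold scheme, `folds` would hold ten such pairs, each containing the true dementia labels and the classifier's predicted scores for the participants held out in that fold. An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation.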