Machine learning approaches to predict age from accelerometer records of physical activity at biobank scale

Alan Le Goallec; Sasha Collin; M'Hamed Jabri; Samuel Diai; Théo Vincent; Chirag J Patel
In: PLOS Digital Health, Vol. 2, No. 1, e0000176, Public Library of Science, San Francisco, CA, USA, 2023.


Physical activity improves quality of life and protects against age-related diseases. Physical activity tends to decrease with age, which increases vulnerability to disease in the elderly. Here, we trained a neural network to predict age from 115,456 one-week-long 100 Hz wrist accelerometer recordings from the UK Biobank (mean absolute error = 3.7±0.2 years), using a variety of data structures to capture the complexity of real-world activity. We achieved this performance by preprocessing the raw frequency data into 2,271 scalar features, 113 time series, and four images. We defined accelerated aging for a participant as being predicted older than their actual age, and identified both genetic and environmental exposure factors associated with this new phenotype. We performed a genome-wide association study on the accelerated aging phenotype to estimate its heritability (h_g^2 = 12.3±0.9%) and identified ten single nucleotide polymorphisms in close proximity to genes in a histone and olfactory cluster on chromosome six (e.g., HIST1H1C, OR5V1). Similarly, we identified biomarkers (e.g., blood pressure), clinical phenotypes (e.g., chest pain), diseases (e.g., hypertension), environmental variables (e.g., smoking), and socioeconomic variables (e.g., income and education) associated with accelerated aging. Physical activity-derived biological age is a complex phenotype associated with both genetic and non-genetic factors.
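
The two quantities the abstract relies on, the mean absolute error of the age predictor and the accelerated aging label, can be illustrated with a minimal Python sketch. It assumes accelerated aging is read off as the raw difference between predicted and actual age; the paper may apply further corrections that the abstract does not describe. The function names and the four toy participants are hypothetical and are not taken from the study.

```python
from statistics import mean


def mean_absolute_error(actual_ages, predicted_ages):
    """Average absolute difference, in years, between predicted and actual age."""
    return mean(abs(p - a) for a, p in zip(actual_ages, predicted_ages))


def age_gap(actual_age, predicted_age):
    """Predicted age minus actual age; positive means predicted older."""
    return predicted_age - actual_age


def is_accelerated_ager(actual_age, predicted_age):
    """Assumed reading of the abstract's definition: predicted older than actual age."""
    return age_gap(actual_age, predicted_age) > 0


# Toy values for illustration only (hypothetical, not UK Biobank data).
actual = [52.0, 61.0, 68.0, 45.0]
predicted = [55.5, 58.0, 70.0, 45.0]

mae = mean_absolute_error(actual, predicted)
flags = [is_accelerated_ager(a, p) for a, p in zip(actual, predicted)]

print(f"MAE = {mae:.3f} years")
print("Accelerated agers:", flags)
```

In this sketch a participant whose predicted age equals their actual age is not counted as an accelerated ager, which is one possible convention for the boundary case.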