Publication
Where traffic meets DNA: mobility mining using biological sequence analysis revisited
Ahmed Jawad; Kristian Kersting; Natalia V. Andrienko
In: Isabel F. Cruz; Divyakant Agrawal; Christian S. Jensen; Eyal Ofek; Egemen Tanin (Eds.). 19th ACM SIGSPATIAL International Symposium on Advances in Geographic Information Systems, ACM-GIS 2011, Proceedings. International Conference on Advances in Geographic Information Systems (ACM GIS-2011), November 1-4, Chicago, IL, USA, Pages 357-360, ACM, 2011.
Abstract
Traffic and mobility mining are fascinating and fast-growing areas of data mining and geographical information systems that impact the lives of billions of people every day. Another well-known scientific field that impacts the lives of billions is biological sequence analysis. It has experienced an incredible evolution over the past decade, especially since the Human Genome Project. Although a first link between the two fields was established in the early 1990s, many recent papers on mobility mining seem to be unaware of it. We therefore revisit the link and show that many unexplored and novel mobility mining methods fall naturally out of it. Specifically, using advanced discretization techniques for stay-point detection and map matching, we turn traffic sequences into "biological" ones. Then, we introduce a novel distance function that enables us to directly apply the rich toolbox of biological sequence analysis to them. For instance, by just looking at complex traffic data through the biological glasses of sequence logos, we get a novel, easy-to-grasp visualization of the data, called "Traffic Logos". For clustering and prediction tasks, our empirical evaluation on three real-world data sets demonstrates that revisiting the link can yield performance as good as state-of-the-art data mining techniques.
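The abstract does not spell out how "Traffic Logos" are computed, so the following is only a minimal sketch of the standard sequence-logo calculation applied to discretized trips, not the authors' implementation. It assumes each trajectory has already been turned into an equal-length string of location symbols (the letters `A` to `D` and the function name `logo_heights` are hypothetical), and it omits the small-sample correction that some logo tools apply. At each position, the information content is the maximum entropy of the alphabet minus the observed entropy, and each symbol's height is its relative frequency times that information content.

```python
import math
from collections import Counter


def logo_heights(sequences, alphabet):
    """Return, per position, a dict mapping symbol -> logo height in bits.

    Assumes all sequences are aligned and of equal length (an assumption
    of this sketch; the paper's discretization step is not reproduced).
    """
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("sequences must be aligned to equal length")

    # Maximum information per position for this alphabet size.
    max_bits = math.log2(len(alphabet))
    columns = []
    for pos in range(length):
        counts = Counter(seq[pos] for seq in sequences)
        total = sum(counts.values())
        freqs = {sym: n / total for sym, n in counts.items()}
        # Shannon entropy of the observed symbols at this position.
        entropy = -sum(p * math.log2(p) for p in freqs.values())
        info = max_bits - entropy
        # Each symbol is drawn with height proportional to its frequency.
        columns.append({sym: p * info for sym, p in freqs.items()})
    return columns


# Hypothetical discretized trips: each letter stands for a visited region.
trips = ["ABCD", "ABCA", "ABDD", "ACCD"]
heights = logo_heights(trips, alphabet="ABCD")
```

Positions where all trips visit the same region get a single tall symbol (the full 2 bits for a four-symbol alphabet), while positions where trips diverge get shorter stacks, which is what makes the logo easy to read at a glance.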