Publication
Server-side Prediction of Source IP Addresses using Density Estimation
Markus Goldstein; Matthias Reif; Armin Stahl; Thomas Breuel
In: Availability, Reliability and Security, 2009. ARES 09. Fourth International Conference on. International Conference on Availability, Reliability and Security (ARES-2009), ARES - The International Dependability Conference, March 16-19, Fukuoka, Japan, Pages 82-89, ISBN 978-0-7695-3564-7, IEEE Computer Society Press, 3/2009.
Abstract
Source IP addresses are often used as a major feature for user modeling in computer networks. Particularly in the field of Distributed Denial of Service (DDoS) attack detection and mitigation, traffic models make extensive use of source IP addresses for detecting anomalies. Typically, the real IP address distribution is strongly undersampled because only a small number of observations is available. Density estimation overcomes this shortage by taking advantage of IP neighborhood relations. In many cases, simple models are used implicitly or chosen intuitively as a network-based heuristic. In this paper, we first review and formalize existing models, including a hierarchical clustering approach. In addition, we present a modified k-means clustering algorithm for source IP density estimation as well as a statistically motivated smoothing approach using the Nadaraya-Watson kernel-weighted average. For performance evaluation, we apply all methods to a 90-day real-world dataset consisting of 1.3 million different source IP addresses and try to predict the users of the following 10 days. ROC curves and an example DDoS mitigation scenario show that there is no uniformly better approach: k-means performs best when a high detection rate is needed, whereas statistical smoothing works better under low false alarm rate requirements such as the DDoS mitigation scenario.
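To illustrate the kernel-weighted average named in the abstract, here is a minimal Python sketch of Nadaraya-Watson smoothing applied to per-address request counts. It is not the authors' implementation: the Gaussian kernel, the fixed bandwidth of 2 addresses, the mapping of addresses to integers, the helper names `ip_to_int` and `nadaraya_watson`, and the sample counts in the documentation range 192.0.2.0/24 are all assumptions made for illustration.

```python
import ipaddress
import math


def ip_to_int(ip: str) -> int:
    """Map a dotted-quad IPv4 address to its 32-bit integer value."""
    return int(ipaddress.IPv4Address(ip))


def nadaraya_watson(x, xs, ys, bandwidth):
    """Kernel-weighted average of ys at query point x.

    Assumption: a Gaussian kernel with a fixed bandwidth, measured in
    address units. Other kernels or bandwidth rules are equally possible.
    """
    weights = [math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in xs]
    total = sum(weights)
    if total == 0.0:
        # All kernel weights underflowed: no sample is close enough to x.
        return 0.0
    return sum(w * y for w, y in zip(weights, ys)) / total


# Hypothetical observations: request counts for a few addresses.
observed = {"192.0.2.10": 5, "192.0.2.12": 3, "192.0.2.13": 1}

# Evaluate on every address of the /24, with a count of 0 where nothing was seen.
base = ip_to_int("192.0.2.0")
grid = [base + offset for offset in range(256)]
counts_by_int = {ip_to_int(ip): count for ip, count in observed.items()}
counts = [counts_by_int.get(addr, 0) for addr in grid]

# An unseen address next to observed ones versus an unseen address far away.
near = nadaraya_watson(ip_to_int("192.0.2.11"), grid, counts, bandwidth=2.0)
far = nadaraya_watson(ip_to_int("192.0.2.200"), grid, counts, bandwidth=2.0)
print(f"smoothed count at 192.0.2.11:  {near:.3f}")
print(f"smoothed count at 192.0.2.200: {far:.3f}")
```

In this sketch, an address that was never observed but sits between observed neighbors receives a positive smoothed count, while an unobserved address far from any observation stays close to zero. This is the neighborhood effect the abstract relies on to compensate for undersampling.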