Fiadino, P. (2015). Traffic characterization, anomaly detection and diagnosis in internet scale services [Dissertation, Technische Universität Wien]. reposiTUm. https://doi.org/10.34726/hss.2015.34158
Rewinding the clock of the Internet by a decade, network traffic was largely dominated by peer-to-peer (P2P) file sharing, and web services were provided by centralized or barely distributed platforms. Today the situation has changed drastically: the most popular services rely on web technologies, while highly dynamic and distributed Content Delivery Networks (CDNs) rule the Internet's landscape. The explosion of cloud-based services, the ever-growing volume of video streaming traffic, and the large user base of online social networks call for sophisticated load balancing and caching techniques to optimize the usage of the underlying transport network, as well as the end-user experience. As a result, current Internet traffic patterns are far more dynamic, posing serious challenges to network operators. Understanding today's traffic has become a daunting task, making traffic engineering, network optimization, and trend analysis arduous processes. The picture is complicated by the growing occurrence of unexpected anomalies, which can affect every stakeholder involved, from the end-user's experience to the network planning carried out by providers. In light of this scenario, we claim that traditional network analysis techniques need to be revised to better capture and explain current and future traffic dynamics. This thesis brings three major contributions to the field of network traffic monitoring and analysis. The first contribution concerns the analysis and characterization of Internet-scale services and large-scale provisioning systems. We have not only analyzed and dissected rarely explored services and popular CDN infrastructures using both passive and active measurements, but also proposed multiple novel techniques to unveil their traffic patterns both in normal operation and during anomalies, even when they run on encrypted protocols.
The second contribution targets the automatic detection of network and traffic anomalies in modern services: we have proposed novel anomaly detection techniques and extended previous proposals so that they self-adapt to current Internet dynamics and flag relevant anomalies. In particular, the proposed techniques have uncovered anomalies affecting both the experience of end users and the performance of the network. The detection performance of our system was compared against well-known solutions, such as entropy-based detectors, and it outperformed them in several cases. The last contribution focuses on the diagnosis of the detected issues. We have provided a framework to unveil the root causes behind the flagged anomalies, relying on Machine Learning techniques and on the combined analysis of symptomatic and diagnostic passive measurements. We have also designed a more advanced approach that relies on the analysis of both passive and distributed active measurements to iteratively investigate the anomalies. To provide strong evidence of the relevance of our contributions, the presented studies were validated using real large-scale traffic measurements from different operational networks, both cellular and fixed-line. Taken together, these contributions offer a holistic approach for network operators to efficiently monitor Internet-scale services and interpret unexpected network traffic behaviors.