Exploring and analyzing the tor hidden services graph

Abstract
The exploration and analysis of Web graphs has flourished in the recent past, producing a large number of relevant and interesting research results. However, the unique characteristics of the Tor network limit the applicability of standard techniques and demand for specific algorithms to explore and analyze it. The attention of the research community has focused on assessing the security of the Tor infrastructure (i.e., its ability to actually provide the intended level of anonymity) and on discussing what Tor is currently being used for. Since there are no foolproof techniques for automatically discovering Tor hidden services, little or no information is available about the topology of the Tor Web graph. Even less is known on the relationship between content similarity and topological structure. The present article aims at addressing such lack of information. Among its contributions: A study on automatic Tor Web exploration/data collection approaches; the adoption of novel representative metrics for evaluating Tor data; a novel in-depth analysis of the hidden services graph; a rich correlation analysis of hidden services' semantics and topology. Finally, a broad interesting set of novel insights/considerations over the TorWeb organization and content are provided.
Anno
2017
Tipo pubblicazione
Altri Autori
Bernaschi, Massimo; Celestini, Alessandro; Guarino, Stefano; Lombardi, Flavio
Editore
Association for Computing Machinery
Rivista
ACM transactions on the web