Content discovery is one of the fundamental research topics in communication networks such as P2P (peer-to-peer) networks and Information-Centric Networking. Traditional content discovery on a network aims to find content that exactly matches a query, i.e., a request issued by a user. However, as the content space has grown and diversified in recent years, it has become necessary to resolve queries flexibly and effectively. A promising approach to this requirement is similarity searching, which resolves queries based on the similarity among contents. In this paper, we therefore incorporate the concept of similarity searching into content discovery on a network and study similar content discovery using a random walk on a graph. Specifically, we introduce a performance metric for similar content discovery, the s-content discovery time, defined as the time taken to discover content whose similarity is larger than or equal to a given similarity s, and we derive this metric. We also analyze, through numerical evaluations, the trade-off between similarity and s-content discovery time. As a result, we reveal the relationship between s-content discovery and the following factors: network topology, random walk mobility model, and distribution of content placement.
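The definition of the s-content discovery time can be illustrated with a small simulation. The following is only a minimal sketch under assumptions that are not taken from the paper: time is counted in hops of a simple (uniform) random walk, each node holds one content item with a fixed similarity to the query, and the 6-node ring, the similarity values, and all function names are hypothetical.

```python
import random

def s_content_discovery_time(adj, similarity, start, s, rng, max_steps=10_000):
    """Hops a simple random walk needs to reach a node whose content
    has similarity >= s. Returns None if nothing is found in max_steps."""
    node = start
    for t in range(max_steps + 1):
        if similarity[node] >= s:
            return t
        # Assumption: move to a uniformly chosen neighbor at each hop.
        node = rng.choice(adj[node])
    return None

# Hypothetical topology: a ring of 6 nodes.
adj = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}

# Hypothetical content placement: similarity of each node's content to the query.
similarity = {0: 0.0, 1: 0.2, 2: 0.5, 3: 1.0, 4: 0.5, 5: 0.2}

def mean_discovery_time(s, trials=2000, seed=0):
    """Average s-content discovery time over independent walks from node 0."""
    rng = random.Random(seed)
    times = [s_content_discovery_time(adj, similarity, 0, s, rng)
             for _ in range(trials)]
    found = [t for t in times if t is not None]
    return sum(found) / len(found)

if __name__ == "__main__":
    for s in (0.2, 0.5, 1.0):
        print(f"s = {s}: mean discovery time ~ {mean_discovery_time(s):.2f} hops")
```

In this toy setting, a larger required similarity s leaves fewer acceptable nodes, so the walk takes longer on average. This is the kind of trade-off between similarity and discovery time that the abstract refers to, although the paper's own model and derivation may differ.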