Fast shortest path distance estimation in large networks
Source:
Conference on Information and Knowledge Management (CIKM) (2009)
Abstract:
In this paper we study approximate landmark-based methods
for point-to-point distance estimation on very large networks.
These methods involve selecting a subset of nodes as landmarks and
computing offline the distances from each node in the graph to
those landmarks. At runtime, when the distance between a pair of
nodes is needed, it can estimated quickly by combining the
precomputed distances.
We prove that selecting the optimal set of landmarks is an
NP-hard problem, and thus heuristic solutions need to be
employed. We therefore explore theoretical insights to devise and
experimentally compare a variety of simple methods that scale well
to very large networks. The efficiency of the suggested techniques
is tested experimentally using five real-world graphs having millions
of edges.
While theoretical bounds support the claim
that random landmarks work well in practice, our extensive
experimentation shows that smart landmark selection can yield
dramatically more accurate results: for a given target accuracy,
our methods require as much as 250 times less space than selecting
landmarks at random.
In addition, we demonstrate that at a very
small accuracy loss our techniques are several orders of magnitude
faster than the state-of-the-art exact methods.
Finally we study
an application of our methods to the task of social search in
large graphs.
Download: