Reduction of large-scale graphs: Effective edge shedding at a controllable ratio under resource constraints

作者:

Highlights:

摘要

As technology advances, many complicated systems can be represented by networks/graphs. However, when using limited computing resources such as portable computers or personal desktop computers, users are not able to store and mine large-scale graphs due to the unparalleled growth of the amount of data we generate. In order to address this challenge, we present effective edge shedding. Effective edge shedding can reduce the amount of data to be processed and the corresponding storage space while speeding up graph algorithms and queries, thereby supporting interactive analysis, helping knowledge discovery, and eliminating noise. In this paper, to extract the underlying features of a graph, we present two effective edge shedding methods on the basis of preserving the expected vertex degree. Both methods allow users to control the edge shedding process, thus generating a reduced graph of the predefined size based on the computing resource constraint. Using four real-world datasets in different domains, we performed an extensive experimental evaluation of our methods and compared them with the state-of-the-art graph summarization method on seven graph analysis tasks. The results indicate that our methods can achieve up to 58.6% higher accuracy on graph analysis tasks compared with the state-of-the-art method. For very large datasets, our methods consumes only 0.3% of the running time of the competitive method when generating the reduced graph. The above results fully illustrate the advantages of our methods.

论文关键词:Degree distribution,Edge shedding,Graph reduction,Limited resources

论文评审过程:Received 24 July 2021, Revised 29 December 2021, Accepted 1 January 2022, Available online 7 January 2022, Version of Record 25 January 2022.

论文官网地址:https://doi.org/10.1016/j.knosys.2022.108126