Distributed Generation of Billion-node Social Graphs with Overlapping Community Structure.


Authors

Chykhradze K., Korshunov A., Buzun N., Pastukhov R., Kuzyurin N., Turdakov D., Kim H.

Abstract

In social community detection, accuracy is commonly evaluated on graphs with a known reference community structure. This paper introduces a method for generating large random social graphs with realistic community structure. The resulting graphs exhibit several recently discovered properties of social community structure that run counter to conventional wisdom: dense community overlaps, superlinear growth of the number of edges inside a community with its size, and a power-law distribution of user-community memberships. Further, the method is distributable by design and showed near-linear scalability on the Amazon EC2 cloud using an Apache Spark implementation.

Keywords

random graph, social network, community detection, benchmark network, graph generation, LFR benchmark, Affiliation Graph Model, SNAP, distributed algorithms, Amazon EC2, Apache Spark

Edition

5th Workshop on Complex Networks, CompleNet 2014, Bologna, Italy. Studies in Computational Intelligence, Volume 549, 2014, pp. 199-208.

Research Group

Information Systems
