2019
Journal article  Open Access

CDLIB: a python library to extract, compare and evaluate communities from complex networks

Rossetti G., Milli L., Cazabet R.

Community detection framework  [INFO.INFO-SI]Computer Science [cs]/Social and Information Networks [cs.SI]  Social network analysis  Community discovery library  Computational Mathematics  [INFO.INFO-LG]Computer Science [cs]/Machine Learning [cs.LG]  community discovery  Computer Networks and Communications  Multidisciplinary 

Community Discovery is among the most studied problems in complex network analysis. During the last decade, many algorithms have been proposed to address such task; however, only a few of them have been integrated into a common framework, making it hard to use and compare different solutions. To support developers, researchers and practitioners, in this paper we introduce a python library - namely CDlib - designed to serve this need. The aim of CDlib is to allow easy and standardized access to a wide variety of network clustering algorithms, to evaluate and compare the results they provide, and to visualize them. It notably provides the largest available collection of community detection implementations, with a total of 39 algorithms.

Source: Applied network science 4 (2019). doi:10.1007/s41109-019-0165-9

Publisher: Springer international, Cham, Svizzera


Ahn Y.-Y., Bagrow JP, Lehmann S (2010) Link communities reveal multiscale complexity in networks. Nature 466(7307):761
Baumes J, Goldberg M, Magdon-Ismail M (2005) Efficient identification of overlapping communities. In: International Conference on Intelligence and Security Informatics. Springer. pp 27-36
Blondel VD, Guillaume J.-L., Lambiotte R, Lefebvre E (2008) Fast unfolding of communities in large networks. J Stat Mech Theory Exp 2008(10):10008
Cazabet R, Rossetti G, Amblard F (2018) Dynamic community detection. Encyclopedia of Social Network Analysis and Mining. Springer, New York. pp 1-10
Chen J, Saad Y (2012) Dense subgraph extraction with application to community detection. IEEE Trans Knowl Data Eng 24(7):1216-1230
Clauset A, Newman ME, Moore C (2004) Finding community structure in very large networks. Phys Rev E 70(6):066111
Coscia M, Giannotti F, Pedreschi D (2011) A classification for community discovery methods in complex networks. Statistical Analysis and Data Mining: The ASA Data Science Journal 4(5):512-546
Coscia M, Rossetti G, Giannotti F, Pedreschi D (2012) Demon: a local-first discovery method for overlapping communities. In: Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM. pp 615-623
Coscia M, Rossetti G, Giannotti F, Pedreschi D (2014) Uncovering hierarchical and overlapping communities with a local-first approach. ACM Trans Knowl Discov Data (TKDD) 9(1):6
Dao V.-L., Bothorel C, Lenca P (2018) Estimating the similarity of community detection methods based on cluster size distribution. In: International Conference on Complex Networks and Their Applications. Springer. pp 183-194
Demšar J (2006) Statistical comparisons of classifiers over multiple data sets. J Mach Learn Res 7(Jan):1-30
Enright AJ, Van Dongen S, Ouzounis CA (2002) An efficient algorithm for large-scale detection of protein families. Nucleic Acids Res 30(7):1575-1584
Erdös P, Rényi A (1959) On random graphs i. Publ Math Debr 6:290
Flake GW, Lawrence S, Giles CL, et al. (2000) Efficient identification of web communities. In: KDD Vol. 2000. pp 150-160
Fortunato S (2010) Community detection in graphs. Phys Rep 486(3-5):75-174
Fortunato S, Hric D (2016) Community detection in networks: A user guide. Phys Rep 659:1-44
Girvan M, Newman ME (2002) Community structure in social and biological networks. Proc Natl Acad Sci 99(12):7821-7826
Gregory S (2007) An algorithm to find overlapping community structure in networks. In: European Conference on Principles of Data Mining and Knowledge Discovery. Springer. pp 91-102
Gregory S (2008) A fast algorithm to find overlapping communities in networks. In: Joint European Conference on Machine Learning and Knowledge Discovery in Databases. Springer. pp 408-423
Harenberg S, Bello G, Gjeltema L, Ranshous S, Harlalka J, Seay R, Padmanabhan K, Samatova N (2014) Community detection in large-scale networks: a survey and empirical evaluation. Wiley Interdiscip Rev Comput Stat 6(6):426-439
Hollocou A, Bonald T, Lelarge M (2017) Multiple local community detection. SIGMETRICS Perform Eval Rev 45(3):76-83. https://doi.org/10.1145/3199524.3199537
Hubert L, Arabie P (1985) Comparing partitions. J Classif 2(1):193-218
Jebabli M, Cherifi H, Cherifi C, Hamouda A (2018) Community detection algorithm evaluation with ground-truth data. Phys A Stat Mech Appl 492:651-706
Kozdoba M, Mannor S (2015) Community detection via measure space embedding. In: Advances in Neural Information Processing Systems. Curran Associates, Inc. pp 2890-2898
Kundu S, Pal SK (2015) Fuzzy-rough community in social networks. Pattern Recognit Lett 67:145-152
Lancichinetti A, Fortunato S (2009) Community detection algorithms: a comparative analysis. Phys Rev E 80(5):056117
Lancichinetti A, Fortunato S, Kertész J (2009) Detecting the overlapping and hierarchical community structure in complex networks. New J Phys 11(3):033015
Lancichinetti A, Fortunato S, Radicchi F (2008) Benchmark graphs for testing community detection algorithms. Phys Rev E 78(4):046110
Leicht EA, Newman ME (2008) Community structure in directed networks. Phys Rev Lett 100(11):118703
Leskovec J, Lang KJ, Mahoney M (2010) Empirical comparison of algorithms for network community detection. In: Proceedings of the 19th International Conference on World Wide Web. ACM. pp 631-640
Li Y, He K, Bindel D, Hopcroft JE (2015) Uncovering the small community structure in large networks: A local spectral approach. In: Proceedings of the 24th International Conference on World Wide Web. International World Wide Web Conferences Steering Committee. pp 658-668
Li Z, Wang R.-S., Zhang S, Zhang X.-S. (2016) Quantitative function and algorithm for community detection in bipartite networks. Inf Sci 367:874-889
Malliaros FD, Vazirgiannis M (2013) Clustering and community detection in directed networks: A survey. Phys Rep 533(4):95-142
McDaid AF, Greene D, Hurley N (2011) Normalized mutual information to evaluate overlapping community finding algorithms. arXiv preprint. arXiv:1110.2515
Meila˘ M (2007) Comparing clusterings-an information based distance. J Multivar Anal 98(5):873-895
Miyauchi A, Kawase Y (2016) Z-score-based modularity for community detection in networks. PloS One 11(1):0147805
Murray G, Carenini G, Ng R (2012) Using the omega index for evaluating abstractive community detection. In: Proceedings of Workshop on Evaluation Metrics and System Comparison for Automatic Summarization. Association for Computational Linguistics. pp 10-18
Newman ME (2006) Finding community structure in networks using the eigenvectors of matrices. Phys Rev E 74(3):036104
Newman ME, Girvan M (2004) Finding and evaluating community structure in networks. Phys Rev E 69(2):026113
Newman ME, Leicht EA (2007) Mixture models and exploratory analysis in networks. Proc Natl Acad Sci 104(23):9564-9569
Nicosia V, Mangioni G, Carchiolo V, Malgeri M (2009) Extending the definition of modularity to directed graphs with overlapping communities. J Stat Mech Theory Exp 2009(03):03024
Orman GK, Labatut V, Cherifi H (2012) Comparative evaluation of community detection algorithms: a topological approach. J Stat Mech Theory Exp 2012(08):08001
Palla G, Derényi I, Farkas I, Vicsek T (2005) Uncovering the overlapping community structure of complex networks in nature and society. Nature 435(7043):814
Parés F, Gasulla DG, Vilalta A, Moreno J, Ayguadé E, Labarta J, Cortés U, Suzumura T (2017) Fluid communities: A competitive, scalable and diverse community detection algorithm. In: International Workshop on Complex Networks and Their Applications. Springer. pp 229-240
Peixoto TP (2014) Efficient monte carlo and greedy heuristic for the inference of stochastic block models. Phys Rev E 89(1):012804
Peixoto TP (2014) Hierarchical block structures and high-resolution model selection in large networks. Phys Rev X 4(1):011047
Pons P, Latapy M (2005) Computing communities in large networks using random walks. In: International Symposium on Computer and Information Sciences. Springer. pp 284-293
Radicchi F, Castellano C, Cecconi F, Loreto V, Parisi D (2004) Defining and identifying communities in networks. Proc Natl Acad Sci 101(9):2658-2663
Raghavan UN, Albert R, Kumara S (2007) Near linear time algorithm to detect community structures in large-scale networks. Phys Rev E 76(3):036106
Reichardt J, Bornholdt S (2006) Statistical mechanics of community detection. Phys Rev E 74(1):016110
Rossetti G (2017) Rdyn: graph benchmark handling community dynamics. J Compl Netw 5(6):893-912
Rossetti G, Cazabet R (2018) Community discovery in dynamic networks: A survey. ACM Comput Surv (CSUR) 51(2):35
Rossetti G, Milli L, Rinzivillo S, Sîrbu A, Pedreschi D, Giannotti F (2018) Ndlib: a python library to model and analyze diffusion processes over complex networks. Int J Data Sci Analytics 5(1):61-79
Rossetti G, Pappalardo L, Rinzivillo S (2016) A novel approach to evaluate community detection algorithms on ground truth. In: Complex Networks VII: Proceedings of the 7th Workshop on Complex Networks CompleNet 2016. Springer International Publishing. pp 133-144
Rossetti G, Pedreschi D, Giannotti F (2017) Node-centric community discovery: From static to dynamic social network analysis. Online Soc Netw Media 3:32-48
Rosvall M, Bergstrom CT (2008) Maps of random walks on complex networks reveal community structure. Proc Natl Acad Sci 105(4):1118-1123
Shi J, Malik J (2000) Normalized cuts and image segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence 22:888-905
Soundarajan S, Hopcroft JE (2015) Use of local group information to identify communities in networks. ACM Trans Knowl Discov Data (TKDD) 9(3):21
Traag VA, Aldecoa R, Delvenne J.-C. (2015) Detecting communities using asymptotical surprise. Phys Rev E 92(2):022816
Traag VA, Krings G, Van Dooren P (2013) Significant scales in community structure. Sci Rep 3:2930
Traag VA, Van Dooren P, Nesterov Y (2011) Narrow scope for resolution-limit-free community detection. Phys Rev E 84(1):016114
Traag V, Waltman L, van Eck NJ (2018) From louvain to leiden: guaranteeing well-connected communities. arXiv preprint. arXiv:1810.08473
Vinh NX, Epps J, Bailey J (2010) Information theoretic measures for clusterings comparison: Variants, properties, normalization and correction for chance. J Mach Learn Res 11(Oct):2837-2854
Whang JJ, Gleich DF, Dhillon IS (2013) Overlapping community detection using seed set expansion. In: Proceedings of the 22nd ACM International Conference on Conference on Information & Knowledge Management. ACM. pp 2099-2108
Xie J, Kelley S, Szymanski BK (2013) Overlapping community detection in networks: The state-of-the-art and comparative study. ACM Comput Surv (csur) 45(4):43
Xie J, Szymanski BK, Liu X (2011) Slpa: Uncovering overlapping communities in social networks via a speaker-listener interaction dynamic process. In: Data Mining Workshops (ICDMW), 2011 IEEE 11th International Conference On. IEEE. pp 344-349
Xu X, Yuruk N, Feng Z, Schweiger TA (2007) Scan: a structural clustering algorithm for networks. In: Proceedings of the 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM. pp 824-833
Yang J, Leskovec J (2013) Overlapping community detection at scale: a nonnegative matrix factorization approach. In: Proceedings of the Sixth ACM International Conference on Web Search and Data Mining. ACM. pp 587-596
Yang J, Leskovec J (2015) Defining and evaluating network communities based on ground-truth. Knowl Inf Syst 42(1):181-213
Zachary WW (1977) An information flow model for conflict and fission in small groups. J Anthropol Res 33(4):452-473
Zhang W, Wang X, Zhao D, Tang X (2012) Graph degree linkage: Agglomerative clustering on a directed graph. In: European Conference on Computer Vision. Springer. pp 428-441

Metrics



Back to previous page
BibTeX entry
@article{oai:it.cnr:prodotti:415651,
	title = {CDLIB: a python library to extract, compare and evaluate communities from complex networks},
	author = {Rossetti G. and Milli L. and Cazabet R.},
	publisher = {Springer international, Cham, Svizzera},
	doi = {10.1007/s41109-019-0165-9},
	journal = {Applied network science},
	volume = {4},
	year = {2019}
}

SoBigData
SoBigData Research Infrastructure


OpenAIRE