Community structure
From Wikipedia, the free encyclopedia
This article may require cleanup to meet Wikipedia's quality standards. Please improve this article if you can. (December 2006) |
Contents |
[edit] What is a community structure?
In the study of real networks, many different characteristics have been found to be common including the small-world property, heavy-tailed degree distributions, and clustering, among others.[1] Another characteristic often seen in real networks is the emergence of community structures.
In the context of networks, community structures are groups of nodes which are more densely interconnected with each other than with the rest of the network. As shown in the example image to the right, nodes within a community are much more likely to be connected with another node in the community than with a node outside the community. This inhomogeneity suggests that the network has certain natural divisions within it.
Not all networks must contain community structures. In fact, many network models, such as the Erdős–Rényi model and the BA model, do not produce community structures.
[edit] Why are we interested in them?
Community structures are quite common in real networks. Social networks often include community groups (the origin of the term, in fact) based on common location, interests, occupation, etc. Metabolic networks have communities based on functional groupings. Citation networks form communities by research topic.[1] Being able to identify these sub-structures within a network can provide insight into how network function and topology affect each other.
[edit] Algorithms for finding communities?
Finding community substructures within an arbitrary network can be a difficult task. The number of communities within the network, if any, is unknown. These communities might not be of equal size, nor mutually exclusive. Despite these difficulties, however, several methods for community finding have been developed and employed with varying success.
[edit] Minimum-cut method
One of the oldest algorithms for dividing networks into parts is the minimum-cut method. This sees prominent use in load-balancing in parallel-computing applications because it minimizes necessary communication between processor nodes.
In the minimum-cut method, the network is divided into a set number of parts, each of approximately the same size chosen such that the number of edges between groups is minimized. This method works well for dividing up a problem for different processors in a parallel computer, but it is less than ideal for finding community structures in a network since it will find communities regardless of whether they are implicit in the structure, and it will only find a set number of them.[2]
[edit] Hierarchical clustering
Another method for finding community structures in networks is hierarchical clustering. A detailed description of this method is beyond the scope of this entry, please see hierarchical clustering.
[edit] Newman-Girvan algorithm
Another fairly successful algorithm for finding clusters is the Newman-Girvan algorithm[1] which is based on the measurement of betweenness centrality. This algorithm works well, but it runs too slowly to be practical on large networks.[3]
[edit] References
- ^ a b c | Community structure in social and biological networks, M. Girvan and M. E. J. Newman, Proc. Natl. Acad. Sci. USA 99, 7821-7826 (2002).
- ^ Personal notes from a seminar given by Mark Newman at the University of Notre Dame, October 27, 2006
- ^ | Fast algorithm for detecting community structure in networks, M. E. J. Newman, Phys. Rev. E 69, 066133 (2004).