Faculté des sciences de base SB, Programme doctoral Physique, Institut de théories des phénomènes physiques ITP (Laboratoire de biophysique statistique LBS)

Simplifying complex networks : from a clustering to a coarse graining strategy

Gfeller, David ; De Los Rios, Paolo (Dir.)

Thèse sciences Ecole polytechnique fédérale de Lausanne EPFL : 2007 ; no 3888.

Ajouter à la liste personnelle
    The framework of complex networks has been shown to describe adequately a wide class of complex systems made up of a large number of interacting units. In this framework, a node is associated to each unit and two nodes are connected by an edge if the two units interact with each other. Examples of such systems can be found in living organisms—the map of interactions between proteins or the network of neurons in the brain. Moreover, artificial systems such as the WWW, electrical grids or airplane connections have been studied using the tools of complex networks. Finally networks have found many applications in social sciences to characterize for instance human interactions of different kinds underlying the spreading of an epidemic. For most of these systems, the complexity arises because of the large number of units and their intricate connection patterns. A natural approach is therefore to simplify the systems by decreasing their size. Different schemes can indeed be designed for each particular system, leading to effective but case-dependent methods. From a more global and statistical perspective, a promising alternative is to reduce the complexity of the corresponding networks. In order to simplify complex networks, two strategies are presented in this Thesis. The first approach refers to the well-known clustering paradigm. It aims at identifying groups of nodes densely connected between each other and much less to the rest of the network. Those groups are referred to as clusters or communities. For most real systems, nodes within a community share some similarity or common feature. For instance, in a synonymy network where nodes are words and edges connect synonymous words, we have shown that finding communities allowed us to identify words corresponding to a single concept. We have also studied a network describing the dynamics of a peptide by associating a node to a microscopic configuration and an edge to a transition. The community structure of the network was shown to provide a new methodology to explore the main characteristics of the peptide dynamics and to unravel the large-scale features of the underlying free-energy landscape. Finally we have designed a new technique to probe the robustness of the community structure against external perturbations of the network topology. This method allows us, among else, to assess whether communities correspond to a real structure of the network, or are simple artifacts of the clustering algorithms. Community detection techniques have found a large number of practical applications as a method to simplify networks since the number of clusters is often much smaller than the number of nodes. However, a crucial issue has often been disregarded: is the network of clusters truly representative of the initial one? In this Thesis, we show that this is indeed not verified for most networks. For example we have considered the evolution of random walks on the network of clusters and found that it behaves quite differently than in the initial network. This observation led us to develop a new strategy to simplify complex networks, ensuring that the reduced network is representative of the initial one. It is based on the idea of grouping nodes, akin to community detection. However, the aim is no longer to identify the "correct" clusters, but to find a smaller network which preserves the relevant features of the initial one, and especially the spectral properties. We therefore refer to our method as Spectral Coarse Graining, by analogy with the coarse graining framework used in Statistical Physics. Applying this method to various kinds of networks, we have shown that the coarse-grained network provides an excellent approximation of the initial one, while the size could be easily reduced by a factor of ten. Therefore, the Spectral Coarse Graining provides a well-defined way of studying large networks and their dynamics considering a much smaller coarse-grained version. Overall, we first discuss the use and the limits of the usual clustering approach to reduce the complexity of networks, and apply it to several real-world systems. In a second part, we develop a new coarse graining strategy to approximate large networks by smaller ones and provide several examples to illustrate the power and the novelty of the method.