^{1,2,a)}, Mason A. Porter

^{3,4}, Nicholas F. Wymbs

^{5}, Scott T. Grafton

^{5}, Jean M. Carlson

^{1}and Peter J. Mucha

^{6,7}

### Abstract

We describe techniques for the robust detection of community structure in some classes of time-dependent networks. Specifically, we consider the use of statistical null models for facilitating the principled identification of structural modules in semi-decomposable systems. Null models play an important role both in the optimization of quality functions such as modularity and in the subsequent assessment of the statistical validity of identified community structure. We examine the sensitivity of such methods to model parameters and show how comparisons to null models can help identify system scales. By considering a large number of optimizations, we quantify the variance of network diagnostics over optimizations (“optimization variance”) and over randomizations of network structure (“randomization variance”). Because the modularity quality function typically has a large number of nearly degenerate local optima for networks constructed using real data, we develop a method to construct representative partitions that uses a null model to correct for statistical noise in sets of partitions. To illustrate our results, we employ ensembles of time-dependent networks extracted from both nonlinear oscillators and empirical neuroscience data.

Many social, physical, technological, and biological systems can be modeled as networks composed of numerous interacting parts.

^{1}As an increasing amount of time-resolved data has become available, it has become increasingly important to develop methods to quantify and characterize dynamic properties of

*temporal networks.*

^{2}Generalizing the study of static networks, which are typically represented using graphs, to temporal networks entails the consideration of nodes (representing entities) and/or edges (representing ties between entities) that vary in time. As one considers data with more complicated structures, the appropriate network analyses must become increasingly nuanced. In the present paper, we discuss methods for algorithmic detection of dense clusters of nodes (i.e., communities) by optimizing quality functions on multilayernetwork representations of temporal networks.

^{3,4}We emphasize the development and analysis of different types of null-model networks, whose appropriateness depends on the structure of the networks one is studying as well as the construction of representative partitions that take advantage of a multilayernetwork framework. To illustrate our ideas, we use ensembles of time-dependent networks from the human brain and human behavior.

We thank L. C. Bassett, M. Bazzi, K. Doron, L. Jeub, S. H. Lee, D. Malinow, and the anonymous referees for useful discussions. This work was supported by the Errett Fisher Foundation, the Templeton Foundation, David and Lucile Packard Foundation, PHS Grant NS44393, Sage Center for the Study of the Mind, and the Institute for Collaborative Biotechnologies through Contract No. W911NF-09-D-0001 from the U.S. Army Research Office. M.A.P. was supported by the James S. McDonnell Foundation (Research Award #220020177), the EPSRC (EP/J001759/1), and the FET-Proactive Project PLEXMATH (FP7-ICT-2011-8; Grant #317614) funded by the European Commission. He also acknowledges SAMSI and KITP for supporting his visits to them. P.J.M. acknowledges support from Award No. R21GM099493 from the National Institute of General Medical Sciences; the content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institute of General Medical Sciences or the National Institutes of Health.

I. INTRODUCTION

II. METHODS

A. Community detection

B. Network diagnostics

C. Data sets

D. Data set 1: Brainnetworks

E. Data set 2: Behavioral networks

III. RESULTS

A. Modularity-optimization null models

1. Optimization null models for ordered node networks

2. Optimization null models for networks derived from time series

B. Post-optimization null models

1. Intra-layer and inter-layer null models

2. Calculation of diagnostics on real versus null-model networks

C. Structural and temporal resolution parameters

D. Examination of data generated from a dynamical system

E. Dealing with degeneracy: Constructing representative partitions

IV. CONCLUSIONS

### Key Topics

- Networks
- 76.0
- Multilayers
- 35.0
- Time series analysis
- 30.0
- Brain
- 28.0
- Community modeling
- 26.0

## Figures

An important property of many real-world networks is community structure, in which there exist cohesive groups of nodes such that a network has stronger connections within such groups than it does between such groups. Community structure often changes in time, which can lead to the rearrangement of cohesive groups, the formation of new groups, and the fragmentation of existing groups.

An important property of many real-world networks is community structure, in which there exist cohesive groups of nodes such that a network has stronger connections within such groups than it does between such groups. Community structure often changes in time, which can lead to the rearrangement of cohesive groups, the formation of new groups, and the fragmentation of existing groups.

Methodological considerations important in the investigation of dynamic community structure in temporal networks. (A) Depending on the system under study, a single network layer (which is represented using an ordinary adjacency matrix with an extra index to indicate the layer) might by definition only allow edges from some subset of the complete set of node pairs, as is the case in the depicted chain-like graph. We call such a situation *partial connectivity*. (B) Although the most common optimization null model employs random graphs (e.g., the Newman-Girvan null model, which is closely related to the configuration model ^{ 1,16 } ), other models can also provide important insights into network community structure. (C) After determining a set of partitions that maximize the modularity *Q* (or a similar quality function), it is interesting to test whether the community structure is different from, for example, what would be expected with a scrambling of time layers (i.e., a temporal null model) or node identities (i.e., a nodal null model). ^{ 4 }

Methodological considerations important in the investigation of dynamic community structure in temporal networks. (A) Depending on the system under study, a single network layer (which is represented using an ordinary adjacency matrix with an extra index to indicate the layer) might by definition only allow edges from some subset of the complete set of node pairs, as is the case in the depicted chain-like graph. We call such a situation *partial connectivity*. (B) Although the most common optimization null model employs random graphs (e.g., the Newman-Girvan null model, which is closely related to the configuration model ^{ 1,16 } ), other models can also provide important insights into network community structure. (C) After determining a set of partitions that maximize the modularity *Q* (or a similar quality function), it is interesting to test whether the community structure is different from, for example, what would be expected with a scrambling of time layers (i.e., a temporal null model) or node identities (i.e., a nodal null model). ^{ 4 }

Network layers and community assignments from two example data sets: (A) a brain network based on correlations between blood-oxygen-level-dependent (BOLD) signals ^{ 4 } and (B) a behavioral network based on similarities in movement times during a simple motor learning experiment. ^{ 13 } We use these data sets to illustrate situations with categorical nodes and ordered nodes, respectively. In the bottom panels, we show community assignments obtained using multilayer community detection for (C) the brain networks and (D) the behavioral networks.

Network layers and community assignments from two example data sets: (A) a brain network based on correlations between blood-oxygen-level-dependent (BOLD) signals ^{ 4 } and (B) a behavioral network based on similarities in movement times during a simple motor learning experiment. ^{ 13 } We use these data sets to illustrate situations with categorical nodes and ordered nodes, respectively. In the bottom panels, we show community assignments obtained using multilayer community detection for (C) the brain networks and (D) the behavioral networks.

Modularity-optimization null models. (A) Example layer from a behavioral network. (B) Newman-Girvan and (C) chain null models for the layer shown in panel (A). (D) Optimized multilayer modularity value *Q*, (E) number of communities *n*, and (F) mean community size *s* for the complete multilayer behavioral network employing the Newman-Girvan (black) and chain (red) optimization null models as a function of the structural resolution parameter γ. (G)Optimized modularity value *Q*, (H) number of communities *n*, and (I) mean community size *s* for the multilayer behavioral network employing chain optimization null models as a function of the effective fraction of edges that have larger weights than their null-model counterparts. We averaged the values of *Q*, *n*, and *s* over the 3 different 12-note sequences and *C* = 100 optimizations. Box plots in (D-F) indicate quartiles and 95% confidence intervals over the 22 individuals in the study. The error bars in panels (G-I) indicate a standard deviation from the mean. In some instances, this is smaller than the line width. The temporal resolution-parameter value is .

Modularity-optimization null models. (A) Example layer from a behavioral network. (B) Newman-Girvan and (C) chain null models for the layer shown in panel (A). (D) Optimized multilayer modularity value *Q*, (E) number of communities *n*, and (F) mean community size *s* for the complete multilayer behavioral network employing the Newman-Girvan (black) and chain (red) optimization null models as a function of the structural resolution parameter γ. (G)Optimized modularity value *Q*, (H) number of communities *n*, and (I) mean community size *s* for the multilayer behavioral network employing chain optimization null models as a function of the effective fraction of edges that have larger weights than their null-model counterparts. We averaged the values of *Q*, *n*, and *s* over the 3 different 12-note sequences and *C* = 100 optimizations. Box plots in (D-F) indicate quartiles and 95% confidence intervals over the 22 individuals in the study. The error bars in panels (G-I) indicate a standard deviation from the mean. In some instances, this is smaller than the line width. The temporal resolution-parameter value is .

Modularity-optimization null models for time series. (A) Example coherence matrix averaged over layers from a brain network. (B) Random time shuffle, (C) FT surrogate, and (D) AAFT surrogate null models averaged over layers. (E) Coherence of each matrix type averaged over subjects, scans, and layers. We note that the apparent lack of structure in (B) is partially related to its significantly decreased coherence in comparison to the other models. (F) Optimized modularity values *Q*, (G) number of communities *n*, and (H) mean community size *s* for the multilayer brain network employing the Newman-Girvan (black), random time-shuffle (blue), FT surrogate (gray), and AAFT surrogate (red) optimization null models as functions of the structural resolution parameter γ. We averaged the values of these diagnostics over 3 different scanning sessions and *C* = 100 optimizations. Box plots indicate quartiles and 95% confidence intervals over the 20 individuals in the study. The temporal resolution parameter is .

Modularity-optimization null models for time series. (A) Example coherence matrix averaged over layers from a brain network. (B) Random time shuffle, (C) FT surrogate, and (D) AAFT surrogate null models averaged over layers. (E) Coherence of each matrix type averaged over subjects, scans, and layers. We note that the apparent lack of structure in (B) is partially related to its significantly decreased coherence in comparison to the other models. (F) Optimized modularity values *Q*, (G) number of communities *n*, and (H) mean community size *s* for the multilayer brain network employing the Newman-Girvan (black), random time-shuffle (blue), FT surrogate (gray), and AAFT surrogate (red) optimization null models as functions of the structural resolution parameter γ. We averaged the values of these diagnostics over 3 different scanning sessions and *C* = 100 optimizations. Box plots indicate quartiles and 95% confidence intervals over the 20 individuals in the study. The temporal resolution parameter is .

Post-optimization null models. We compare four multilayer diagnostics (optimized modularity, number of communities, mean community size, and stationarity) and two single-layer diagnostics (mean and variance of *Q _{s} *) for (A) brain and (B) behavioral networks with the connectional (row 1), nodal (row 2), and temporal (row 3) null-model networks. Box plots indicate quartiles and 95% confidence intervals over the individuals and experimental conditions. The structural resolution parameter is , and the temporal resolution parameter is .

Post-optimization null models. We compare four multilayer diagnostics (optimized modularity, number of communities, mean community size, and stationarity) and two single-layer diagnostics (mean and variance of *Q _{s} *) for (A) brain and (B) behavioral networks with the connectional (row 1), nodal (row 2), and temporal (row 3) null-model networks. Box plots indicate quartiles and 95% confidence intervals over the individuals and experimental conditions. The structural resolution parameter is , and the temporal resolution parameter is .

Optimized modularity *Q* and Rand *z*-score as functions of the resolution parameters γ and ω for the (A) brain and (B) behavioral networks. The top row shows the mean value of maximized *Q* over *C* = 100 optimizations and the mean partition similarity *z* over all possible pairs of the *C* partitions. The bottom row shows the variance of maximized *Q* over the optimizations and the variance of the partition similarity over all possible pairs of partitions. The results shown in this figure come from a single individual and experimental scan, but we obtain qualitatively similar results for other individuals and scans. Note that the axis scalings are nonlinear.

Optimized modularity *Q* and Rand *z*-score as functions of the resolution parameters γ and ω for the (A) brain and (B) behavioral networks. The top row shows the mean value of maximized *Q* over *C* = 100 optimizations and the mean partition similarity *z* over all possible pairs of the *C* partitions. The bottom row shows the variance of maximized *Q* over the optimizations and the variance of the partition similarity over all possible pairs of partitions. The results shown in this figure come from a single individual and experimental scan, but we obtain qualitatively similar results for other individuals and scans. Note that the axis scalings are nonlinear.

Differences, as a function of γ and ω, between the real networks and the (A,B) nodal and (C,D) temporal null models for maximized modularity *Q* and partition similarity *z* for the (A,C) brain and (B,D) behavioral networks. The first row in each panel gives the difference in the mean values of the diagnostic variables between the real and null-model networks. Panels (A,B) show the results for and , and panels (C,D) show the results for and . The quantities *Q* and *z* again denote the modularity and partition similarity of the real network, *Q ^{n} * and

*z*denote the modularity and partition similarity of the nodal null-model network, and

^{n}*Q*and

^{t}*z*denote the modularity and partition similarity of the temporal null-model network. The second row in each panel gives the difference between the optimization variance of the real network and the randomization variance of the null-model network for the same diagnostic variable pairs. The third row in each panel gives the difference in the optimization variance of the real network and the optimization variance of the null-model network for the same diagnostic variable pairs. We show results for a single individual and scan in the experiment, but results are qualitatively similar for other individuals and scans. Note that the axis scalings are nonlinear.

^{t}Differences, as a function of γ and ω, between the real networks and the (A,B) nodal and (C,D) temporal null models for maximized modularity *Q* and partition similarity *z* for the (A,C) brain and (B,D) behavioral networks. The first row in each panel gives the difference in the mean values of the diagnostic variables between the real and null-model networks. Panels (A,B) show the results for and , and panels (C,D) show the results for and . The quantities *Q* and *z* again denote the modularity and partition similarity of the real network, *Q ^{n} * and

*z*denote the modularity and partition similarity of the nodal null-model network, and

^{n}*Q*and

^{t}*z*denote the modularity and partition similarity of the temporal null-model network. The second row in each panel gives the difference between the optimization variance of the real network and the randomization variance of the null-model network for the same diagnostic variable pairs. The third row in each panel gives the difference in the optimization variance of the real network and the optimization variance of the null-model network for the same diagnostic variable pairs. We show results for a single individual and scan in the experiment, but results are qualitatively similar for other individuals and scans. Note that the axis scalings are nonlinear.

^{t}Dynamic community detection in a network of Kuramoto oscillators. (A*, top*) The coupling matrix between *N* = 128 phase oscillators contains 8 communities, each of which has 16 nodes. (A*, bottom*) Over time, oscillators synchronize with one another. Color indicates the mean phase correlation between oscillators, where hotter (darker gray) colors indicate stronger correlations. (B) Phase correlation between oscillators as a function of time. The mean phase correlation between oscillators in the same community (dashed red curve) increases faster than the mean phase correlation between all oscillators in the system (solid gray curve). Regime I encompasses the first 50 time steps, and regime II emcompasses the subsequent 50 time steps. (C) Variance of maximized multilayer modularity (left), number of communities (middle), and partition similarity *z* (right) over 100 optimizations of the multilayer modularity quality function for the temporal network in regime II as a function of the structural resolution parameter for . The shaded gray area indicates values of the structural resolution parameter that provide 0 variance in the number of communities. (D) Example partition of the temporal network in regime II at , which occurs near the troughs in panel (C). (E) Example partition of the temporal network in regime I at . (F) Number of communities as a function of time for (*left*) the temporal network in regime I and (*right*) its corresponding temporal null model. We averaged values over *C* = 100 optimizations of multilayer modularity.

Dynamic community detection in a network of Kuramoto oscillators. (A*, top*) The coupling matrix between *N* = 128 phase oscillators contains 8 communities, each of which has 16 nodes. (A*, bottom*) Over time, oscillators synchronize with one another. Color indicates the mean phase correlation between oscillators, where hotter (darker gray) colors indicate stronger correlations. (B) Phase correlation between oscillators as a function of time. The mean phase correlation between oscillators in the same community (dashed red curve) increases faster than the mean phase correlation between all oscillators in the system (solid gray curve). Regime I encompasses the first 50 time steps, and regime II emcompasses the subsequent 50 time steps. (C) Variance of maximized multilayer modularity (left), number of communities (middle), and partition similarity *z* (right) over 100 optimizations of the multilayer modularity quality function for the temporal network in regime II as a function of the structural resolution parameter for . The shaded gray area indicates values of the structural resolution parameter that provide 0 variance in the number of communities. (D) Example partition of the temporal network in regime II at , which occurs near the troughs in panel (C). (E) Example partition of the temporal network in regime I at . (F) Number of communities as a function of time for (*left*) the temporal network in regime I and (*right*) its corresponding temporal null model. We averaged values over *C* = 100 optimizations of multilayer modularity.

Constructing representative partitions for an example brain network layer. (A) Partitions extracted during *C* optimizations of the quality function *Q*. (B) The *N* × *N* nodal association matrix **T**, whose elements indicate the number of times node *i* and node *j* have been assigned to the same community. (C) The *N* × *N* random nodal association matrix , whose elements indicate the number of times node *i* and node *j* are expected to be assigned to the same community by chance. (D) The thresholded nodal association matrix , where elements with values less than those expected by chance have been set to 0. (E) Partitions extracted during *C* = 100 optimizations of the single-layer modularity quality function *Q _{s} * for the matrix

**T**from panel (D). Note that each of the

*C*optimizations yields the same partition. (F) Visualization of the representative partition given in (E).

^{ 82 }We have reordered the nodes in the matrices in panels (A-E) for better visualization of community structure.

Constructing representative partitions for an example brain network layer. (A) Partitions extracted during *C* optimizations of the quality function *Q*. (B) The *N* × *N* nodal association matrix **T**, whose elements indicate the number of times node *i* and node *j* have been assigned to the same community. (C) The *N* × *N* random nodal association matrix , whose elements indicate the number of times node *i* and node *j* are expected to be assigned to the same community by chance. (D) The thresholded nodal association matrix , where elements with values less than those expected by chance have been set to 0. (E) Partitions extracted during *C* = 100 optimizations of the single-layer modularity quality function *Q _{s} * for the matrix

**T**from panel (D). Note that each of the

*C*optimizations yields the same partition. (F) Visualization of the representative partition given in (E).

^{ 82 }We have reordered the nodes in the matrices in panels (A-E) for better visualization of community structure.

Representative partitions of multilayer brain networks for an example subject and scan. (A) Partitions extracted for *C* = 100 optimizations of the quality function *Q* on the real multilayer network (112 nodes × 25 time windows, which yields 2800 nodes in total). Partitions extracted for *C* randomizations for the (B) temporal and (C) nodal null-model networks. (D) Partitions extracted for *C* optimizations of the quality function *Q* of the thresholded nodal association matrix for the (D) real, (E) temporal null-model, and (F) nodal null-model networks. Note that the partitioning is robust to multiple optimizations. We have reordered the nodes in each column for better visualization of community structure. The structural resolution parameter is , and the temporal resolution parameter is .

Representative partitions of multilayer brain networks for an example subject and scan. (A) Partitions extracted for *C* = 100 optimizations of the quality function *Q* on the real multilayer network (112 nodes × 25 time windows, which yields 2800 nodes in total). Partitions extracted for *C* randomizations for the (B) temporal and (C) nodal null-model networks. (D) Partitions extracted for *C* optimizations of the quality function *Q* of the thresholded nodal association matrix for the (D) real, (E) temporal null-model, and (F) nodal null-model networks. Note that the partitioning is robust to multiple optimizations. We have reordered the nodes in each column for better visualization of community structure. The structural resolution parameter is , and the temporal resolution parameter is .

Article metrics loading...

Full text loading...

Commenting has been disabled for this content