Statistical Analysis of Networks

Networks describe pairwise relationships (or interactions) among a set of objects. Mathematically, a network is a graph in which the objects are treated as vertices (nodes) and an edge (link) is placed between two vertices if they are related to each other. Networks and network data are ubiquitous: examples include online social networks such as Facebook and Twitter, the internet, protein- and gene-interaction networks, and ecological networks of interacting species.

The statistical analysis of networks has been motivated by, and has contributed to, the modeling and understanding of complex systems in a variety of fields. Our research on networks began with the problem of community detection. In community detection, the goal is to identify, in a given network, a group of vertices, called a community, whose members are highly interconnected but have relatively few connections with vertices outside the community. Community detection, which is related to the problem of clustering, is an important exploratory tool for identifying structure in large networks, and is a common first step in network analysis.
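To make the informal definition of a community concrete, the following minimal Python sketch counts the edges inside a candidate vertex set and the edges leaving it. The function names (`build_adjacency`, `internal_and_boundary_edges`) and the toy edge list are illustrative assumptions rather than part of any method described here, and the graph is assumed to be simple and undirected (no self-loops or repeated edges).

```python
from collections import defaultdict


def build_adjacency(edges):
    """Return an undirected adjacency map {vertex: set of neighbours}."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def internal_and_boundary_edges(adj, community):
    """Count edges with both ends in `community` and edges with exactly one end in it."""
    community = set(community)
    internal_ends = 0  # each internal edge is seen once from each endpoint
    boundary = 0
    for u in community:
        for v in adj[u]:
            if v in community:
                internal_ends += 1
            else:
                boundary += 1
    return internal_ends // 2, boundary


# Hypothetical toy network: two triangles joined by the single edge (2, 3).
toy_edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
toy_adj = build_adjacency(toy_edges)
internal, boundary = internal_and_boundary_edges(toy_adj, {0, 1, 2})
print(internal, boundary)
```

A candidate set looks community-like when the first count is large relative to the second. Raw counts alone do not say whether the difference is larger than chance would produce, which is why a null model is needed.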

Our initial work addressed community detection in unweighted networks. We developed a method called Extraction of Statistically Significant Communities (ESSC) that accommodates community overlap and accounts for background nodes that may not belong to any community. ESSC combines an iterative testing procedure with a configuration-based null model in which node degrees are preserved but edges are randomly assigned. In ongoing research, we are extending the null model and testing procedure to identify communities in networks with weighted edges, and in multilayer networks, where the set of objects is fixed but the relationships between the objects change from layer to layer. Multilayer networks are a simple but useful special case of dynamic networks, which we intend to study more carefully in the future.
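As a rough illustration of how a degree-preserving null model can support a significance test, the sketch below asks whether a vertex has more neighbours in a candidate set than chance would suggest. It rests on a stated assumption: under a configuration-type null, each of the d(u) edge ends of vertex u lands in the set B with probability approximately equal to the total degree of B divided by the total degree of the network, so the number of connections is treated as binomial. This is one common approximation, not necessarily the exact test used in ESSC, and the function names and toy data are hypothetical.

```python
from math import comb


def degree_map(edges):
    """Return {vertex: degree} for an undirected edge list."""
    deg = {}
    for u, v in edges:
        deg[u] = deg.get(u, 0) + 1
        deg[v] = deg.get(v, 0) + 1
    return deg


def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def connection_p_value(edges, u, candidate):
    """Observed edges from u into `candidate`, and an approximate p-value.

    Assumption: under the null, each edge end of u falls in `candidate`
    with probability (total degree of candidate) / (total degree of network).
    """
    candidate = set(candidate)
    deg = degree_map(edges)
    total_degree = sum(deg.values())  # equals twice the number of edges
    p = sum(deg[v] for v in candidate) / total_degree
    observed = sum(
        1
        for a, b in edges
        if (a == u and b in candidate) or (b == u and a in candidate)
    )
    return observed, binomial_upper_tail(observed, deg[u], p)


# Hypothetical toy network: two triangles joined by the single edge (2, 3).
toy_edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
obs_in, p_in = connection_p_value(toy_edges, 0, {0, 1, 2})
obs_out, p_out = connection_p_value(toy_edges, 3, {0, 1, 2})
print(obs_in, p_in, obs_out, p_out)
```

In an iterative scheme of this general kind, vertices with small p-values would be kept in the candidate set and the test repeated until the set stops changing. On a network this small the p-values are not informative; the example only shows the mechanics.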