Two-sample hypothesis testing for comparing two networks is an important yet difficult problem. The main challenges are: potentially different network sizes and sparsity levels; non-repeated observations of adjacency matrices; computational scalability; and theoretical investigations, especially on finite-sample accuracy and minimax optimality. In this article, we propose the first provably higher-order accurate two-sample inference method by comparing network moments. Our method extends the classical two-sample t-test to the network setting. It makes weak modeling assumptions and can effectively handle networks of different sizes and sparsity levels. We establish strong finite-sample theoretical guarantees, including rate-optimality properties. Our method is easy to implement and fast to compute. We also devise a novel nonparametric framework of offline hashing and fast querying that is particularly effective for maintaining and querying very large network databases. Comprehensive simulations demonstrate the effectiveness of our method. We apply it to two real-world data sets and discover interesting new structures.
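
To make the idea of "comparing network moments" concrete, the following is a minimal sketch, not the article's actual procedure. It assumes simple undirected networks given as symmetric 0/1 adjacency matrices, and it takes the network moment to be an empirical motif density (here, edges and triangles). The function names `edge_density`, `triangle_density`, and `sample_er` are hypothetical, and the Erdős–Rényi sampler is only a stand-in for real data. The sketch computes the raw difference of moments only; the variance estimate and studentization that turn this difference into a t-type test are not specified in the text above and are not reproduced here.

```python
import numpy as np


def edge_density(A):
    """Fraction of ordered node pairs (i, j), i != j, joined by an edge."""
    n = A.shape[0]
    return A.sum() / (n * (n - 1))


def triangle_density(A):
    """Fraction of node triples that form a triangle.

    For a simple undirected graph, trace(A^3) counts each triangle 6 times,
    and n(n-1)(n-2) counts each unordered triple 6 times, so the ratio is
    (#triangles) / C(n, 3).
    """
    n = A.shape[0]
    return np.trace(A @ A @ A) / (n * (n - 1) * (n - 2))


def sample_er(n, p, rng):
    """Hypothetical stand-in data: an Erdős–Rényi graph on n nodes."""
    upper = np.triu(rng.random((n, n)) < p, k=1)  # strict upper triangle
    return (upper | upper.T).astype(float)        # symmetrize, zero diagonal


rng = np.random.default_rng(0)
A = sample_er(200, 0.10, rng)  # first network, 200 nodes
B = sample_er(300, 0.10, rng)  # second network, 300 nodes (different size)

# Densities are normalized by the number of possible motifs, so they are
# comparable across networks of different sizes.
diff = triangle_density(A) - triangle_density(B)
print(f"triangle densities: {triangle_density(A):.5f} vs {triangle_density(B):.5f}")
print(f"difference of moments: {diff:.5f}")
```

Because each density is a per-motif average, the two networks need not share a node set or have the same number of nodes. Networks with different sparsity levels would additionally need a normalization by edge density, which this sketch leaves out.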
