Statistics for Structures Seminar

The Statistics for Structures Seminar is an informal seminar focusing on recent developments in statistics on structural data. The talks concern methodological, theoretical, and applied findings. The seminar focuses both on discussing the existing literature and on presenting new results on the topic.

The seminar (usually) takes place every second Friday from 15:00 to 16:00 (before the Bayes club).

The location is either the University of Amsterdam, Science Park 105-107, KdVI, room F3.20,
or Leiden University, Snellius Building, varying room.

The complete internet
Source of the picture: http://www.unc.edu/~unclng/Internet_History.htm

For more information please contact Botond Szabó (b.t.szabo@math.leidenuniv.nl) or Moritz Schauer (schauermr@math.leidenuniv.nl).

Schedule of the seminar

Academic year 2017-2018
Date | Speaker | Title | Location
December 1, 15:00-16:00 | Mark van de Wiel (VU medical center) | Empirical Bayes for p>n: use of auxiliary information to improve prediction and variable selection | MI Leiden, Niels Bohrweg 1, room 408
November 3, 15:00-16:00 | I. Gabriel Bucur (Radboud University) | Robust Causal Estimation in the Large-Sample Limit without Strict Faithfulness | MI Leiden, Niels Bohrweg 1, room 408
October 20, 15:00-16:00 | Felix Lucka (CWI & UCL) | Sparse Bayesian Inference and Uncertainty Quantification for Inverse Imaging Problems | MI Leiden, Niels Bohrweg 1, room 408
October 6, 15:00-16:00 | Gino Kpogbezan (Leiden University) | A sparse Bayesian SEM approach to network recovery using external knowledge | UvA, Science Park 904, room D1.110
September 22, 15:00-16:00 | ... (University of Amsterdam) | ... | TBA

Abstracts

Mark van de Wiel (VUMC): Empirical Bayes for p>n: use of auxiliary information to improve prediction and variable selection

Leiden, December 1, 2017

Empirical Bayes (EB) is a versatile approach to 'learn from a lot' in two ways: first, from a large number of variables and second, from a potentially large amount of prior information, e.g. stored in public repositories. I will present applications of a variety of EB methods to several prediction methods, with examples on ridge regression and Bayesian models with a spike-and-slab prior. Both (marginal) likelihood and moment-based EB methods will be discussed. I consider a simple empirical Bayes estimator in a linear model setting to study the relation between the quality of an empirical Bayes estimator and p. I argue that EB is particularly useful when the prior contains multiple parameters, modeling a priori information on variables, termed 'co-data'. This will be illustrated with an application to cancer genomics. Finally, some ideas on how to include prior structural information in a ridge setting will be briefly discussed.

Ioan Gabriel Bucur (RU): Robust Causal Estimation in the Large-Sample Limit without Strict Faithfulness

Leiden, November 3, 2017

Causal effect estimation from observational data is an important and much studied research topic. The instrumental variable (IV) and local causal discovery (LCD) patterns are canonical examples of settings where a closed-form expression exists for the causal effect of one variable on another, given the presence of a third variable. Both rely on faithfulness to infer that the latter only influences the target effect via the cause variable. In reality, it is likely that this assumption only holds approximately and that there will be at least some form of weak interaction. This brings about the paradoxical situation that, in the large-sample limit, no predictions are made, as detecting the weak edge invalidates the setting. We introduce an alternative approach by replacing strict faithfulness with a prior that reflects the existence of many 'weak' (irrelevant) and 'strong' interactions. We obtain a posterior distribution over the target causal effect estimator which shows that, in many cases, we can still make good estimates. We demonstrate the approach in an application on a simple linear-Gaussian setting, using the MultiNest sampling algorithm, and compare it with established techniques to show our method is robust even when strict faithfulness is violated. This is joint work with Tom Claassen and Tom Heskes

Felix Lucka (CWI & UCL): Sparse Bayesian Inference and Uncertainty Quantification for Inverse Imaging Problems

Leiden, October 20, 2017

During the last two decades, sparsity has emerged as a key concept to solve linear and non-linear ill-posed inverse problems, in particular for severely ill-posed problems and applications with incomplete, sub-sampled data. At the same time, there is a growing demand to obtain quantitative instead of just qualitative inverse results together with a systematic assessment of their uncertainties (uncertainty quantification, UQ). Bayesian inference seems like a suitable framework to combine sparsity and UQ, but its application to large-scale inverse problems resulting from fine discretizations of PDE models leads to severe computational and conceptual challenges. In this talk, we will focus on two different Bayesian approaches to model sparsity as a-priori information: via convex but non-smooth prior energies such as total variation and Besov space priors, and via non-convex but smooth priors arising from hierarchical Bayesian modeling. To illustrate our findings, we will rely on experimental data from challenging biomedical imaging applications such as EEG/MEG source localization and limited-angle CT. We want to share the experiences and results we obtained and the open questions we face from our perspective as researchers coming from a background in biomedical imaging rather than in statistics, and hope to stimulate a fruitful discussion for both sides.

Gino Kpogbezan (MI Leiden): A sparse Bayesian SEM approach to network recovery using external knowledge

Amsterdam Science Park, October 6, 2017

We develop a sparse Bayesian Simultaneous Equations Models (SEMs) approach to network reconstruction which incorporates prior knowledge. We use an extended version of the horseshoe prior for the regression parameters, where on the one hand priors have been assigned to hyperparameters and on the other hand the hyperparameters have been estimated by Empirical Bayes (EB). We use a fast variational Bayes method to approximate the posterior densities and compare its accuracy with that of an MCMC strategy. Compared to their ridge counterpart, both models perform well in sparse situations; in particular, the EB approach seems very promising. In a simulation study we show that accurate prior data can greatly improve the reconstruction of the network, but need not harm the reconstruction if wrong.

Past schedules

Academic year 2016-2017
Date | Speaker | Title | Location
February 24, 15:00-16:00 | Jarno Hartog (University of Amsterdam) | Nonparametric Bayesian label prediction on a graph | UvA
March 17, 15:00-16:00 | Loïc Schwaller (Leiden University) | Exact Bayesian inference for off-line change-point detection in tree-structured graphical models | Leiden
April 7, 15:00-16:00 | Peter Bloem (Vrije Universiteit Amsterdam) | Network motif detection at scale | Leiden
May 12, 15:00-16:00 | Nurzhan Nurushev (Vrije Universiteit Amsterdam) | Oracle uncertainty quantification for biclustering model | Leiden
October 7, 15:00-16:00 | Guus Regts (University of Amsterdam) | Approximation algorithms for graph polynomials and partition functions | UvA ScP 105-107 KdVI, room F3.20
November 4, 15:00-16:00 | Marco Grzegorczyk (Rijksuniversiteit Groningen) | Bayesian inference of semi-mechanistic network models | UvA ScP 105-107 KdVI, room F3.20
November 25, 15:00-16:00 | Joris Mooij (University of Amsterdam) | Automating Causal Discovery and Prediction | UvA ScP 105-107 KdVI, room F3.20
December 9, 15:00-16:00 | Stephanie van der Pas (Leiden University) | Bayesian community detection | UvA ScP 105-107 KdVI, room F3.20

Academic year 2015-2016
Date | Speaker | Title | Location
October 16, 15:00-16:00 | Ervin Tánczos (Eindhoven University of Technology) | Adaptive Sensing for Recovering Structured Sparse Sets | UvA ScP 904, room C0.110
October 23, 15:00-16:00 | Moritz Schauer (University of Amsterdam) | Working with graphs in Julia | UvA ScP 904, room A1.04
November 6, 15:00-16:00 | Fengnan Gao (Leiden University) | On the Estimation of the Preferential Attachment Network Model | UvA ScP 904, room A1.04
March 18, 15:00-16:00 | Paulo Serra (University of Amsterdam) | Dimension Estimation using Random Connection Models | UvA ScP 904, room F1.02
April 15, 15:00-16:00 | Rui Castro (Eindhoven University of Technology) | Distribution-Free Detection of Structured Anomalies: Permutation and Rank-Based Scans | UvA ScP 904, room G2.10
April 22, 15:00-16:00 | Koen van Oosten (Leiden University) | Achieving Optimal Misclassification Proportion in Stochastic Block Model | UvA ScP 904, room G2.10
May 13, 15:00-16:00 | Wessel van Wieringen (Vrije Universiteit Amsterdam) | A tale of two networks: two GGMs and their differences | UvA ScP 904, room A1.04
June 3, 15:00-16:00 | Pariya Behrouzi (Rijksuniversiteit Groningen) | Detecting Epistatic Selection in the Genome of RILs via a latent Gaussian Copula Graphical Model | UvA ScP 904, room A1.10

Academic year 2014-2015
Date | Speaker | Title | Location
March 13, 15:30 | Chao Gao (Yale University) | Rate-optimal graphon estimation | Leiden University, room 401
April 1, 15:00-17:00 | Moritz Schauer (University of Amsterdam) | A graphical perspective on Gauss-Markov process priors | VU Amsterdam, room WN-M607
April 1, 15:00-17:00 | Botond Szabó (University of Amsterdam) | Detecting community structures in networks | VU Amsterdam, room WN-M607
April 17, 14:00-16:00 | Bartek Knapik (Vrije Universiteit) | Point process modelling for directed interaction networks | UvA ScP 904, room A1.10
April 17, 14:00-16:00 | Johannes Schmidt-Hieber (Leiden University) | High-dimensional covariance estimation | UvA ScP 904, room A1.10
April 24, 14:00-16:00 | Fengnan Gao (Leiden University) | A quick survey in random graph models | UvA ScP 904, room A1.10
April 24, 14:00-16:00 | Stephanie van der Pas (Leiden University) | Stochastic block models | UvA ScP 904, room A1.10
May 1, 15:00-16:00 | Kolyan Ray (Leiden University) | Estimating Sparse Precision Matrix | UvA ScP 904, room A1.10
May 8, 15:30-17:30 | Gino B. Kpogbezan (Leiden University) | Variational Bayesian SEM for undirected Network recovery using external data | UvA ScP 904, room D1.116
May 8, 15:30-17:30 | Jarno Hartog (University of Amsterdam) | Kernel-based regression | UvA ScP 904, room D1.116

Academic year 2013-2014
Date | Speaker | Title | Location
March 13, 15:30 | Leila Mohammadi (Leiden University) | A nonparametric view of network models | Leiden University, room 312
April 4, 15:00-16:00 | Aad van der Vaart (Leiden University) | Stochastic block models | UvA ScP 904, room A1.06
April 4, 15:00-16:00 | Harry van Zanten (University of Amsterdam) | Regression on graphs | UvA ScP 904, room A1.06

Archive of Abstracts and Slides

Academic year 2016-2017

Nurzhan Nurushev: Oracle uncertainty quantification for biclustering model

Leiden, Friday, May 12, 2017

We study the problem of inference on the unknown parameter in the biclustering model by using the penalization method which originates from the empirical Bayes approach. The underlying biclustering structure is that the high-dimensional parameter consists of a few blocks of equal coordinates. The main inference problem is the uncertainty quantification (i.e., construction of a confidence set for the unknown parameter), but on the way we solve the estimation problem as well. We pursue a novel local approach in that the procedure quality is characterized by a local quantity, the oracle rate, which is the best trade-off between the approximation error by a biclustering structure and the best performance for that approximating biclustering structure. The approach is also robust in that the additive errors in the model are not assumed to be independent with some known distribution (in fact, they are in general dependent), but only to satisfy certain mild exchangeable exponential moment conditions. We introduce the excessive bias restriction (EBR) under which we establish the local (oracle) confidence optimality of the proposed confidence ball. Adaptive minimax results (for the graphon estimation and posterior contraction problems) follow from our local results. The results for the stochastic block model follow, with implications for network modeling. [Joint work with E. Belitser.]

Peter Bloem: Network motif detection at scale

Leiden, April 7, 2017

Network motif analysis is a form of pattern mining on graphs. It searches for subgraphs that are unexpectedly frequent with respect to a null model. To compute the expected frequency of the subgraph, the search for motifs is normally repeated on as many as 1 000 random graphs sampled from the null model. This is an expensive operation that currently limits motif analysis to graphs of around 10 000 links. Using the minimum description length principle, we have developed an approximation that avoids the graph samples and computes motif significance efficiently, allowing us to perform motif detection on graphs with billions of links, using commodity hardware.

Loïc Schwaller: Exact Bayesian inference for off-line change-point detection in tree-structured graphical models

Leiden, March 17, 2017

We consider the problem of change-point detection in multivariate time-series. The multivariate distribution of the observations is supposed to follow a graphical model, whose graph and parameters are affected by abrupt changes throughout time. We demonstrate that it is possible to perform exact Bayesian inference whenever one considers a simple class of undirected graphs called spanning trees as possible structures. We are then able to integrate on the graph and segmentation spaces at the same time by combining classical dynamic programming with algebraic results pertaining to spanning trees. In particular, we show that quantities such as posterior distributions for change-points or posterior edge probabilities over time can efficiently be obtained. We illustrate our results on both synthetic and experimental data arising from biology and neuroscience.

Jarno Hartog: Nonparametric Bayesian label prediction on a graph

UvA, February 24, 2017

I will present an implementation of a nonparametric Bayesian approach to solving binary classification problems on graphs. I consider a hierarchical Bayesian approach with a randomly scaled Gaussian prior.

Guus Regts: Approximation algorithms for graph polynomials and partition functions

The correlation decay method, pioneered by Weitz in 2006, is a method that yields efficient (polynomial time) deterministic approximation algorithms for computing partition functions of several statistical models. While the method yields deterministic algorithms, it has a probabilistic flavour. In this talk I will sketch how this method works for the hardcore model, i.e., for counting independent sets in bounded degree graphs. After that I will discuss a different method pioneered by Barvinok based on Taylor approximations of the logarithm of the partition function and on the location of zeros of the partition function. I will explain how this approach can give polynomial time approximation algorithms for computing several partition functions on bounded degree graphs.
This is based on joint work with Viresh Patel (UvA).

Marco Grzegorczyk: Bayesian inference of semi-mechanistic network models

A topical and challenging problem for statistics and machine learning is to infer the structure of complex systems of interacting units. In many scientific disciplines such systems are represented by interaction networks described by systems of differential equations. My presentation is about a novel semi-mechanistic Bayesian modelling approach for inferring the structures and parameters of these interaction networks from data. The inference approach is based on gradient matching and a non-linear Bayesian regression model. My real-world applications stem from the topical field of computational systems biology, where researchers aim to reconstruct the structure of biopathways or regulatory networks from postgenomic data. My focus is on investigating to what extent certain factors influence the network reconstruction accuracy. To this end, I compare not only (i) different methods for model selection, including various Bayesian information criteria and marginal likelihood approximation methods, but also (ii) different ways to approximate the gradients of the observed time series. Finally, I cross-compare the performance of the new method with a set of state-of-the-art network reconstruction methods, such as Bayesian networks. Within the comparative evaluation studies I employ ANOVA schemes to disambiguate to what extent confounding factors impact the network reconstruction accuracies.

Joris Mooij: Automating Causal Discovery and Prediction

The discovery of causal relationships from experimental data and the construction of causal theories to describe phenomena are fundamental pillars of the scientific method. How to reason effectively with causal models, how to learn these from data, and how to obtain causal predictions has been traditionally considered to be outside of the realm of statistics. Therefore, most empirical scientists still perform these tasks informally, without the help of mathematical tools and algorithms. This traditional informal way of causal inference does not scale, and this is becoming a serious bottleneck in the analysis of the outcomes of large-scale experiments nowadays. In this talk I will describe formal causal reasoning methods and algorithms that can help to automate the process of scientific discovery from data.

Stephanie van der Pas: Bayesian community detection

In the stochastic block model, nodes in a graph are partitioned into classes ('communities') and it is assumed that the probability of the presence of an edge between two nodes depends solely on their class labels. We are interested in recovering the class labels, and employ the Bayesian posterior mode for this purpose. We present results on weak consistency (where the fraction of misclassified nodes converges to zero) and strong consistency (where the number of misclassified nodes converges to zero) of the posterior mode, in the 'dense' regime where the probability of an edge occurring between two nodes remains bounded away from zero, and in the 'sparse' regime where this probability does go to zero as the number of nodes increases.
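
The model definition in this abstract can be illustrated with a minimal sketch that samples a graph from a stochastic block model. The function name `sample_sbm`, the argument names, and the example probabilities are illustrative assumptions and do not come from the talk; the sketch covers only the generative model, not the posterior-mode estimator.

```python
import random


def sample_sbm(labels, edge_prob, seed=0):
    """Sample an undirected graph from a stochastic block model.

    labels[i] is the class ('community') of node i, and edge_prob[a][b]
    is the probability of an edge between a node of class a and a node
    of class b. Both names are hypothetical, chosen for illustration.
    Returns a symmetric 0/1 adjacency matrix without self-loops.
    """
    rng = random.Random(seed)
    n = len(labels)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # The edge probability depends only on the two class labels.
            p = edge_prob[labels[i]][labels[j]]
            if rng.random() < p:
                adj[i][j] = adj[j][i] = 1
    return adj


# Two communities of five nodes: dense within, sparse between (assumed values).
labels = [0] * 5 + [1] * 5
edge_prob = [[0.9, 0.1], [0.1, 0.9]]
adj = sample_sbm(labels, edge_prob)
```

Community detection then amounts to recovering `labels` from `adj` alone.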

Academic year 2015-2016

Ervin Tánczos: Adaptive Sensing for Recovering Structured Sparse Sets

Consider the problem of recovering the support of a sparse signal, that is, we are given an unknown s-sparse vector $x$, whose non-zero elements are $\mu>0$, and we are tasked with recovering the support of $x$. Suppose each coordinate of $x$ is measured independently with additive standard normal noise. In case the support can be any s-sparse set, we know that $\mu$ needs to scale as $\sqrt{\log n}$ for us to be able to reliably recover the support. However, in some practical settings the support set has a certain structure. For instance, in gene-expression studies the signal support can be viewed as a submatrix of the gene-expression matrix, or when searching for network anomalies the support set can be viewed as a star in the network graph. In such cases we might be able to recover the support of weaker signals. This question has been recently addressed by various authors. Now consider a setting where instead of measuring every coordinate of $x$ the same way, we can collect observations sequentially using our knowledge accumulated from previous observations. This setup is usually referred to as "active learning" or "adaptive sensing". We aim to characterize the difficulty of accurately recovering structured support sets using adaptive sensing, and also provide near optimal procedures for support recovery. In particular we are interested in the gains adaptive sensing provides over non-adaptive sensing in these situations. We consider two measurement models, namely coordinate-wise observations and compressive sensing. Our results show that adaptive sensing strategies can improve on non-adaptive ones both by better mitigating the effect of measurement noise, and capitalizing on structural information to a larger extent.

Moritz Schauer: Working with graphs in Julia

Julia is an emerging technical programming language with some properties that make it especially interesting for the implementation of Bayesian methods. In this talk I give an introduction to the graph-related functionality Julia provides. After a demonstration of how to create and display graphs in Julia using the package Graphs.jl, I show how to perform Bayesian inference on a "smooth" function defined on a graph in Julia.

Fengnan Gao: On the Estimation of the Preferential Attachment Network Model

The preferential attachment (PA) network is a popular way of modeling social networks, collaboration networks, etc. The PA network model is an evolving network where new nodes keep coming in. When a new node comes in, it establishes only one connection with an existing node. The random choice of the existing node is via a multinomial distribution with probability weights based on a preferential function $f$ on the degrees. $f$ is assumed a priori nondecreasing, which means the nodes with high degrees are more likely to get new connections, i.e. "the rich get richer". We propose an estimator of $f$, which maps the natural numbers to the positive real line. We show, with techniques from branching processes, that our estimator is consistent. If $f$ is affine, meaning $f(k) = k + \delta$, it is well known that such a model leads to a power-law degree distribution. We propose a maximum likelihood estimator for $\delta$ and establish a central limit result on the MLE of $\delta$.
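
The growth mechanism described in this abstract can be sketched as a short simulation for the affine case $f(k) = k + \delta$. The function name `simulate_pa_tree`, the choice to start from a single edge between two nodes, and the example value of `delta` are assumptions made for illustration; the sketch only simulates the model and does not implement the estimators from the talk.

```python
import random


def simulate_pa_tree(n_nodes, delta=0.0, seed=0):
    """Grow a preferential attachment tree with f(k) = k + delta.

    Assumption for illustration: the network starts as one edge between
    nodes 0 and 1. Each new node attaches to exactly one existing node,
    chosen with probability proportional to f(degree). delta must be
    larger than -1 so that all weights stay positive.
    """
    if n_nodes < 2:
        raise ValueError("need at least two nodes")
    if delta <= -1:
        raise ValueError("delta must be larger than -1")
    rng = random.Random(seed)
    degrees = [1, 1]
    edges = [(0, 1)]
    for new in range(2, n_nodes):
        # Attachment weights of the existing nodes: f(k) = k + delta.
        weights = [k + delta for k in degrees]
        target = rng.choices(range(new), weights=weights, k=1)[0]
        edges.append((new, target))
        degrees[target] += 1
        degrees.append(1)  # the new node enters with degree one
    return degrees, edges


degrees, edges = simulate_pa_tree(1000, delta=0.5)
```

Sorting `degrees` in decreasing order shows a few high-degree hubs next to many nodes of degree one, which is the "rich get richer" effect.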

Paulo Serra: Dimension Estimation using Random Connection Models

In statistics we often want to discover (sometimes impose) structure on observed data, and dimension plays a crucial role in this task. The setting that I will consider in this talk is the following: some high-dimensional data has been collected but it (potentially) lives in some lower dimensional space (this lower dimension is called the intrinsic dimension of the dataset); the objective is to estimate the intrinsic dimension of the high-dimensional dataset.
Why would we want to do this? Dimensionality reduction techniques (e.g., PCA, manifold learning) usually rely on knowledge about intrinsic dimension. Knowledge about dimension is also important to try to avoid the curse of dimensionality. From a computational perspective, the dimension of a dataset has an impact on the amount of space needed to store data (compressibility). The speed of algorithms is also commonly affected by the dimension of input data. One can also envision situations where we have access to some regression data, but the design points are unknown (this occurs, for example, in graphon estimation problems); the dimension of the design space has a large impact on the rate with which the regression function can be recovered.
Our approach relies on having access to a certain graph: each vertex represents an observation, and there is an edge between two vertices if the corresponding observations are close in some metric. We model this graph as a random connection model (a model from continuum percolation), and use this to propose estimators for the intrinsic dimension based on the doubling property of the Lebesgue measure. I will give some conditions under which the dimension can be estimated consistently, and some bounds on the probability of correctly recovering an integer dimension. I will also show some numerical results and compare our estimators with some competing approaches from the literature.
This is joint work with Michel Mandjes.
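
The doubling property mentioned above says that, in dimension $d$, a ball of radius $2r$ has $2^d$ times the Lebesgue measure of a ball of radius $r$. The sketch below uses that fact in the simplest possible way: it counts pairs of observations within distance $r$ and within distance $2r$ and takes the base-2 logarithm of the ratio. This is an illustrative assumption, not the estimator from the talk: it works directly with Euclidean distances rather than with a random connection graph, and the function name `doubling_dimension_estimate` is hypothetical.

```python
import math
import random


def doubling_dimension_estimate(points, r):
    """Estimate intrinsic dimension from the doubling property.

    Counts pairs of points at distance <= r and <= 2r. For data spread
    evenly over a d-dimensional set, the ratio of the two counts is
    roughly 2**d, so its base-2 logarithm approximates d.
    """
    pairs_r = 0
    pairs_2r = 0
    n = len(points)
    for i in range(n):
        for j in range(i + 1, n):
            dist = math.dist(points[i], points[j])
            if dist <= r:
                pairs_r += 1
            if dist <= 2 * r:
                pairs_2r += 1
    if pairs_r == 0:
        raise ValueError("no pairs within distance r; increase r")
    return math.log2(pairs_2r / pairs_r)


# A 2-dimensional square embedded in 5-dimensional space (assumed example).
rng = random.Random(0)
points = [(rng.random(), rng.random(), 0.0, 0.0, 0.0) for _ in range(400)]
estimate = doubling_dimension_estimate(points, r=0.05)
```

Points near the boundary of the square have truncated neighbourhoods, so the estimate tends to fall slightly below the true dimension for finite $r$.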

Rui Castro (TU/e): Distribution-Free Detection of Structured Anomalies: Permutation and Rank-Based Scans

The scan statistic is by far the most popular method for anomaly detection, widely used in syndromic surveillance, signal and image processing, and target detection based on sensor networks, among other applications. The use of scan statistics in such settings yields a hypothesis testing procedure, where the null hypothesis corresponds to the absence of anomalous behavior. If the null distribution is known, calibration of such tests is relatively easy, as it can be done by Monte-Carlo simulation. However, when the null distribution is unknown the story is less straightforward. We investigate two procedures: (i) calibration by permutation and (ii) a rank-based scan test, which is distribution-free and less sensitive to outliers. A further advantage of the rank-scan test is that it requires only a one-time calibration for a given data size, making it computationally much more appealing than the permutation-based test. In both cases, we quantify the performance loss with respect to an oracle scan test that knows the null distribution. We show that using one of these calibration procedures results in only a very small loss of power in the context of a natural exponential family. This includes for instance the classical normal location model, popular in signal processing, and the Poisson model, popular in syndromic surveillance. Numerical experiments further support our theory and results (joint work with Ery Arias-Castro, Meng Wang (UCSD) and Ervin Tánczos (TU/e)).

Koen van Oosten: Achieving Optimal Misclassification Proportion in Stochastic Block Model

Community detection is a fundamental statistical problem in network data analysis. Many algorithms have been proposed to tackle this problem. Most of these algorithms are not guaranteed to achieve the statistical optimality of the problem, while procedures that achieve information theoretic limits for general parameter spaces are not computationally tractable. In my talk I present a computationally feasible two-stage method that achieves optimal statistical performance in misclassification proportion for the stochastic block model under weak regularity conditions. The procedure includes a refinement stage motivated by penalized local maximum likelihood estimation. This stage can take a wide range of weakly consistent community detection procedures as initializer, and refines their output into a community assignment that achieves the optimal misclassification proportion with high probability.

Wessel van Wieringen: A tale of two networks: two GGMs and their differences

The two-sample problem is addressed from the perspective of Gaussian graphical models (GGMs), in exploratory and confirmatory fashion. The former amounts to the estimation of a precision matrix for each group. First, this is done group-wise by means of penalized maximum likelihood with an algebraically proper l2-penalty, for which an analytic expression of the estimator and its properties are derived. To link the groups, the ridge penalty is then augmented with a fused term, which penalizes the difference between the group precisions. The confirmatory part concentrates on the situation in which partial correlations are systematically smaller/larger (in an absolute sense) in one of the groups. Data in both groups again are assumed to follow a GGM, but now their partial correlations are proportional, differing by a multiplier (common to all partial correlations). The multiplier reflects the overall strength of the conditional dependencies. As before, model parameters are estimated by means of penalized maximum likelihood, now using a ridge-like penalty. A permutation scheme to test for the multiplier differing from zero is proposed. A re-analysis of publicly available gene expression data on the Hedgehog pathway in normal and cancer prostate tissue combines both strategies to show its activation in the disease group.

Pariya Behrouzi: Detecting Epistatic Selection in the Genome of RILs via a latent Gaussian Copula Graphical Model

Recombinant Inbred Lines (RILs) derived from divergent parental lines can display extensive segregation distortion and long-range linkage disequilibrium (LD) between distant loci on the same or different chromosomes. These genomic signatures are consistent with epistatic selection having acted on entire networks of interacting parental alleles during inbreeding. The reconstruction of these interaction networks from observations of pair-wise marker-marker correlations or pair-wise genotype frequency distortions is challenging, as multiple testing approaches are under-powered and true long-range LD is difficult to distinguish from drift, particularly in small RIL panels. Here we develop an efficient method for reconstructing an underlying network of genomic signatures of high-dimensional epistatic selection from multi-locus genotype data. The network captures the conditionally dependent short- and long-range LD structure of RIL genomes and thus reveals aberrant marker-marker associations that are due to epistatic selection rather than gametic linkage. The network estimation relies on penalized Gaussian copula graphical models, which account for the large number of markers p and the small number of individuals n. We overcome the p >> n problem by using a penalized maximum likelihood technique that imposes an l1 penalty on the precision matrix of the latent process inside the EM estimation. A multi-core implementation of our algorithm makes it feasible to estimate the graph in high dimensions (max markers approximately 3000). We demonstrate the efficiency of the proposed method on simulated datasets as well as on genotyping data in A. thaliana and maize.

Academic year 2014-2015

Chao Gao: Rate Optimal Graphon Estimation

Network analysis is becoming one of the most active research areas in statistics. Significant advances have been made recently on developing theories, methodologies and algorithms for analyzing networks. However, there has been little fundamental study on optimal estimation. In this paper, we establish optimal rate of convergence for graphon estimation. Link: http://arxiv.org/abs/1410.5837

Botond Szabo: Detecting community structures in networks

I will review different algorithms for community detection described in: Newman, M. E. J. (2004). Detecting community structure in networks. Eur. Phys. J. B 38, 321-330; and Newman, M. E. J. (2006). Finding community structure in networks using the eigenvectors of matrices. Phys. Rev. E (3) 74, 036104.

Moritz Schauer: A graphical perspective on Gauss-Markov process priors

In this short talk I look at the connections between Gauss-Markov process priors on a line and Gaussian Markov random fields on a tree via the midpoint displacement procedure. The Markov property of the prior corresponds to a sparsity constraint for the prior precision on the tree, which allows one to solve the Gaussian inverse problem under quasi-linear time and space constraints using a divide and conquer algorithm. This leads to the notion of computationally desirable sparsity properties connecting the Gramian matrix stemming from a Gaussian inverse problem and the prior precision matrix.

Johannes Schmidt-Hieber: High-dimensional covariance estimation

I am going to talk about the paper: Ravikumar, Pradeep, Martin J. Wainwright, Garvesh Raskutti and Bin Yu. High-dimensional covariance estimation by minimizing l1-penalized log-determinant divergence. EJS, 2011.

Bartek Knapik: Point process modelling for directed interaction networks

I will present the paper: Perry, Patrick O.; Wolfe, Patrick J. Point process modelling for directed interaction networks. J. R. Stat. Soc. Ser. B. Stat. Methodol. 75 (2013), no. 5, 821-849

Fengnan Gao: A quick survey in random graph models

We will review several important random graph models, their definitions and important results on them. The models include the Erdős–Rényi model, the configuration model and the preferential attachment model. We will focus on the preferential attachment model. Most of the presentation is based on Remco van der Hofstad's lecture notes http://www.win.tue.nl/~rhofstad/NotesRGCN.pdf

Stephanie van der Pas: Stochastic block models

I will review the paper: Antoine Channarond, Jean-Jacques Daudin, and Stéphane Robin. Classification and estimation in the Stochastic Blockmodel based on the empirical degrees. Electron. J. Statist. Volume 6 (2012), 2574-2601. Link: http://projecteuclid.org/euclid.ejs/1357913089

Kolyan Ray: Estimating Sparse Precision Matrix

I will present the paper: Cai, Tony, Weidong Liu and Harrison H. Zhou. Estimating Sparse Precision Matrix: Optimal Rates of Convergence and Adaptive Estimation Link: http://arxiv.org/abs/1212.2882

Gino B. Kpogbezan: Variational Bayesian SEM for undirected Network recovery using external data

Recently we developed a Bayesian structural equation model (SEM) framework with shrinkage priors for undirected network reconstruction. It was shown that Bayesian SEM in combination with variational Bayes is particularly attractive, as it performs well, is computationally very fast and is a flexible framework. A posteriori variable selection is feasible in our Bayesian SEM and so is the use of shrinkage priors. These shrinkage priors depend on all regression equations, allowing borrowing of information across equations and improving inference when the number of features is large. An empirical Bayes procedure is used to estimate our hyperparameters. We also showed in simulations that our approach can outperform popular (sparse) methods. Here, we focus on addressing the problem of incorporating external data and/or prior information into network inference. In many settings information regarding network connectivity is available. It is then natural to take such information into account during network reconstruction. Based on Bayesian SEM we propose a new model that focuses on the use of external data. It performs better than our Bayesian SEM when the external information is relevant, and as well when it is not.

Jarno Hartog: Kernel-based regression

I will discuss the basic kernel approach to regression in the context of graphs and present a number of methods to construct such kernels, as in section 8.4 of Statistical Analysis of Network Data by Eric D. Kolaczyk.