A GLOBAL-LOCAL APPROACH FOR DETECTING EQTL HOTSPOTS IN ULTRA-HIGH MULTIPLE RESPONSE REGRESSIONS

Speaker: Leonardo Bottolo (University of Cambridge / Alan Turing Institute)

Title: A global-local approach for detecting eQTL hotspots in ultra-high multiple response regressions

This event was co-organised by the Institute for Data Science & Artificial Intelligence and the MBS Big Data Forum.

Abstract:

We consider how to specify prior distributions for top-level scale parameters in a sparse hierarchical regression model with many predictors and many responses. Our model borrows information across responses through a parameter that captures the “propensity” of a predictor to be a hotspot, i.e., to influence several responses at once.

It can detect associations between p = 10^4 to 10^6 predictors (e.g., genetic variants) and q = 10^2 to 10^4 responses (e.g., molecular expression levels), but for very large q, inference can be sensitive to the variance of the hotspot “propensity”. While this sensitivity can be cast as a general problem of specifying prior distributions for variance components, we show that it is also caused by a lack of adjustment for the number of responses considered. To solve this problem, we introduce a control parameter which depends on q as part of a global-local hotspot prior variance based on the Horseshoe prior. Our proposal shrinks noise globally and hence adapts to the sparse context of eQTL analyses, while being robust to individual signals, thus leaving the effects of hotspot genetic variants unshrunk. It can, therefore, detect important pleiotropic effects, which are of particular interest for current research in genetics. Inference is carried out using an annealed Variational Bayes procedure, which allows fast and efficient exploration of multimodal distributions. If time permits, we will also illustrate an extension that includes annotation to help the detection of important associations while retaining the computational advantages of the Variational Bayes formulation. We illustrate the benefits of the proposed models on simulated data sets and two real examples that aim to detect hotspot pleiotropic effects in eQTL experiments.
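As a rough illustration of the global-local idea mentioned in the abstract, the sketch below simulates the shrinkage weights implied by a generic Horseshoe prior. It is not the speaker's model: the q-dependent control parameter is not specified in this announcement and is not reproduced here. The function name `horseshoe_shrinkage_weights` and the parameter `global_scale` are hypothetical, and unit noise variance is assumed.

```python
import numpy as np


def horseshoe_shrinkage_weights(n_draws, global_scale=1.0, seed=0):
    """Simulate shrinkage weights under a generic Horseshoe prior.

    Illustrative assumption (not taken from the talk): each effect has a
    local scale lambda ~ half-Cauchy(0, 1), a shared global scale tau, and
    unit noise variance, giving kappa = 1 / (1 + tau^2 * lambda^2).
    kappa near 1 means the effect is shrunk almost to zero;
    kappa near 0 means it is left almost unshrunk.
    """
    rng = np.random.default_rng(seed)
    # A half-Cauchy draw is the absolute value of a standard Cauchy draw.
    local_scales = np.abs(rng.standard_cauchy(n_draws))
    return 1.0 / (1.0 + (global_scale * local_scales) ** 2)


kappa = horseshoe_shrinkage_weights(100_000)
near_zero = np.mean(kappa < 0.1)   # share of effects left almost unshrunk
near_one = np.mean(kappa > 0.9)    # share of effects shrunk almost to zero
middle = np.mean((kappa >= 0.45) & (kappa <= 0.55))

# A smaller global scale pushes the weights toward 1 (stronger global shrinkage).
kappa_small_tau = horseshoe_shrinkage_weights(100_000, global_scale=0.01)

print(f"near 0: {near_zero:.3f}, middle: {middle:.3f}, near 1: {near_one:.3f}")
print(f"mean weight, tau=1: {kappa.mean():.3f}; tau=0.01: {kappa_small_tau.mean():.3f}")
```

Under these assumptions the weights pile up near 0 and near 1 rather than in the middle, which is the sense in which such a prior can shrink noise strongly while leaving large individual signals close to unshrunk.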

This is joint work with Helene Ruffieux (EPFL Lausanne and MRC-BSU Cambridge) and Sylvia Richardson (MRC-BSU Cambridge).

Bio:

Dr Leonardo Bottolo is Reader in Statistics for Biomedicine at the University of Cambridge. He received his PhD in Methodological Statistics from the University of Trento, Italy, in 2001. Before joining the University of Cambridge, he was appointed Senior Lecturer in Statistics in the Department of Mathematics, Imperial College. He worked as a postdoc in the Mathematical Genetics group, University of Oxford, and at the Institute of Mathematical Sciences, Imperial College. He is currently interested in inference for tall data, y, collected on n data points with n very large, and in approximate Bayesian methods such as Variational Bayes.