Partial Least Squares methods for omics data sets

Jeanine Houwing-Duistermaat, University of Leeds

  • Date: 27 SEPTEMBER 2018 at 14:15

  • Event location: Room III, 2nd floor, Department of Statistical Sciences, Via Belle Arti 41, Bologna

  • Type: Statistics Seminars

Abstract:

The availability of large omics datasets in epidemiological and clinical studies provides many
opportunities for research in statistical bioinformatics. The hope is that the abundance of
information will provide a better understanding of underlying disease mechanisms and accurate
prediction models, enabling patient-targeted screening and treatment. The statistical challenges
include data cleaning, heterogeneity across omics datasets, high dimensionality, data integration,
and high correlation within and between datasets (Morris et al., 2017; Houwing-Duistermaat et al.,
2017). In this talk I will present Partial Least Squares (PLS) methods for multivariate regression,
and for data integration and dimension reduction when analysing several omics datasets
simultaneously.
Three PLS-type methods for omics analysis will be considered, namely the standard PLS
algorithm (Wold, 1972), Envelope (Cook et al., 2015) and our recently developed Probabilistic PLS
(PPLS) (Bouhaddani et al., 2018). Envelope and PPLS are maximum likelihood methods. PLS and
PPLS can deal with high-dimensional data, while Envelope requires the sample size n to be larger
than the number of variables p. PPLS maximizes a constrained log-likelihood to ensure that the
solution is unique. The methods will be illustrated with several data examples, and the results of
simulation studies comparing their performance will be shown.
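
As a rough illustration of why standard PLS remains usable when p exceeds n, the sketch below
computes a first PLS component from the singular value decomposition of the cross-product matrix
of two centred data blocks. It is a minimal sketch under stated assumptions, not the speaker's
implementation: the function name first_pls_component, the simulated data and the dimensions
(n = 20, p = 200, q = 5) are all hypothetical, and Envelope and PPLS are not covered.

```python
import numpy as np


def first_pls_component(X, Y):
    """Return weights (w, c) and scores (t, u) of the first PLS component.

    Hypothetical helper for illustration: the weights are the leading left and
    right singular vectors of the cross-product matrix X^T Y, which maximise
    the covariance between the scores X w and Y c under unit-norm weights.
    """
    # Centre each column so that cross-products behave like covariances.
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    # SVD of the p x q cross-product matrix; no inverse of X^T X is needed,
    # so the computation is well defined even when p > n.
    U, s, Vt = np.linalg.svd(Xc.T @ Yc, full_matrices=False)
    w = U[:, 0]   # X-weights (length p)
    c = Vt[0, :]  # Y-weights (length q)
    t = Xc @ w    # X-scores (length n)
    u = Yc @ c    # Y-scores (length n)
    return w, c, t, u


# Simulated example (assumed setup): one shared latent variable drives both
# blocks, and there are far more variables than samples.
rng = np.random.default_rng(0)
n, p, q = 20, 200, 5
latent = rng.normal(size=(n, 1))
X = latent @ rng.normal(size=(1, p)) + 0.1 * rng.normal(size=(n, p))
Y = latent @ rng.normal(size=(1, q)) + 0.1 * rng.normal(size=(n, q))

w, c, t, u = first_pls_component(X, Y)
score_correlation = np.corrcoef(t, u)[0, 1]
```

Further components would normally be obtained by deflating X and Y and repeating the step; that
part is omitted here to keep the sketch short.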


The Director
Prof. Angela Montanari

You are cordially invited to attend