Graphical models for zero-inflated single cell gene expression

Publication: Contribution to journal, Journal article, Research, peer-reviewed

Standard

Graphical models for zero-inflated single cell gene expression. / McDavid, Andrew; Gottardo, Raphael; Simon, Noah; Drton, Mathias.

In: Annals of Applied Statistics, Vol. 13, No. 2, 2019, pp. 848-873.

Harvard

McDavid, A, Gottardo, R, Simon, N & Drton, M 2019, 'Graphical models for zero-inflated single cell gene expression', Annals of Applied Statistics, vol. 13, no. 2, pp. 848-873. https://doi.org/10.1214/18-AOAS1213

APA

McDavid, A., Gottardo, R., Simon, N., & Drton, M. (2019). Graphical models for zero-inflated single cell gene expression. Annals of Applied Statistics, 13(2), 848-873. https://doi.org/10.1214/18-AOAS1213

Vancouver

McDavid A, Gottardo R, Simon N, Drton M. Graphical models for zero-inflated single cell gene expression. Annals of Applied Statistics. 2019;13(2):848-873. https://doi.org/10.1214/18-AOAS1213

Author

McDavid, Andrew ; Gottardo, Raphael ; Simon, Noah ; Drton, Mathias. / Graphical models for zero-inflated single cell gene expression. In: Annals of Applied Statistics. 2019 ; Vol. 13, No. 2. pp. 848-873.

Bibtex

@article{f505a514637c443299acf0049d208318,
title = "Graphical models for zero-inflated single cell gene expression",
abstract = "Bulk gene expression experiments relied on aggregations of thousands of cells to measure the average expression in an organism. Advances in microfluidic and droplet sequencing now permit expression profiling in single cells. This study of cell-to-cell variation reveals that individual cells lack detectable expression of transcripts that appear abundant on a population level, giving rise to zero-inflated expression patterns. To infer gene coregulatory networks from such data, we propose a multivariate Hurdle model. It is comprised of a mixture of singular Gaussian distributions. We employ neighborhood selection with the pseudo-likelihood and a group lasso penalty to select and fit undirected graphical models that capture conditional independences between genes. The proposed method is more sensitive than existing approaches in simulations, even under departures from our Hurdle model. The method is applied to data for T follicular helper cells, and a high-dimensional profile of mouse dendritic cells. It infers network structure not revealed by other methods, or in bulk data sets. An R implementation is available at https://github.com/amcdavid/HurdleNormal.",
keywords = "Gene network, Graphical model, Group lasso, Single cell gene expression",
author = "Andrew McDavid and Raphael Gottardo and Noah Simon and Mathias Drton",
year = "2019",
doi = "10.1214/18-AOAS1213",
language = "English",
volume = "13",
pages = "848--873",
journal = "Annals of Applied Statistics",
issn = "1932-6157",
publisher = "Institute of Mathematical Statistics",
number = "2",

}

RIS

TY - JOUR

T1 - Graphical models for zero-inflated single cell gene expression

AU - McDavid, Andrew

AU - Gottardo, Raphael

AU - Simon, Noah

AU - Drton, Mathias

PY - 2019

Y1 - 2019

N2 - Bulk gene expression experiments relied on aggregations of thousands of cells to measure the average expression in an organism. Advances in microfluidic and droplet sequencing now permit expression profiling in single cells. This study of cell-to-cell variation reveals that individual cells lack detectable expression of transcripts that appear abundant on a population level, giving rise to zero-inflated expression patterns. To infer gene coregulatory networks from such data, we propose a multivariate Hurdle model. It is comprised of a mixture of singular Gaussian distributions. We employ neighborhood selection with the pseudo-likelihood and a group lasso penalty to select and fit undirected graphical models that capture conditional independences between genes. The proposed method is more sensitive than existing approaches in simulations, even under departures from our Hurdle model. The method is applied to data for T follicular helper cells, and a high-dimensional profile of mouse dendritic cells. It infers network structure not revealed by other methods, or in bulk data sets. An R implementation is available at https://github.com/amcdavid/HurdleNormal.

AB - Bulk gene expression experiments relied on aggregations of thousands of cells to measure the average expression in an organism. Advances in microfluidic and droplet sequencing now permit expression profiling in single cells. This study of cell-to-cell variation reveals that individual cells lack detectable expression of transcripts that appear abundant on a population level, giving rise to zero-inflated expression patterns. To infer gene coregulatory networks from such data, we propose a multivariate Hurdle model. It is comprised of a mixture of singular Gaussian distributions. We employ neighborhood selection with the pseudo-likelihood and a group lasso penalty to select and fit undirected graphical models that capture conditional independences between genes. The proposed method is more sensitive than existing approaches in simulations, even under departures from our Hurdle model. The method is applied to data for T follicular helper cells, and a high-dimensional profile of mouse dendritic cells. It infers network structure not revealed by other methods, or in bulk data sets. An R implementation is available at https://github.com/amcdavid/HurdleNormal.

KW - Gene network

KW - Graphical model

KW - Group lasso

KW - Single cell gene expression

UR - http://www.scopus.com/inward/record.url?scp=85068503200&partnerID=8YFLogxK

U2 - 10.1214/18-AOAS1213

DO - 10.1214/18-AOAS1213

M3 - Journal article

C2 - 31388390

AN - SCOPUS:85068503200

VL - 13

SP - 848

EP - 873

JO - Annals of Applied Statistics

JF - Annals of Applied Statistics

SN - 1932-6157

IS - 2

ER -