Program:

Thursday 3/3

9:15-10:35
Dynamic stochastic models


Massimiliano Tamborrino Weak convergence of k-dimensional Stein's processes to k-dimensional Ornstein-Uhlenbeck processes
Stein's processes are a commonly used description of spontaneous neuronal activity, where the discharge of a neural impulse, called a spike, is modeled as the first passage time of the process through a certain threshold. Most of these studies perform a diffusion limit on Stein's equation to obtain a mathematically tractable stochastic process. The use of these continuous processes has allowed the discovery of various neuronal features that are hidden in the original Stein's model, such as stochastic resonance. In general, the existing diffusion models are one-dimensional. However, neuroscientists are now interested in modeling groups of neurons and investigating their dependencies. For this reason, this work deals with the weak convergence of a k-dimensional Stein's process to a k-dimensional Ornstein-Uhlenbeck diffusion process, as well as the weak convergence of their first passage times.
Joint work with Laura Sacerdote and Martin Jacobsen
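The first passage time mentioned in the abstract can be illustrated by simulation. The following is only a minimal sketch, not the method of the talk: it assumes a one-dimensional Ornstein-Uhlenbeck process dX = (-X/tau + mu) dt + sigma dW discretized with the Euler-Maruyama scheme, and the function name and all parameter names (mu, tau, sigma, threshold, dt, t_max) are hypothetical choices made for illustration.

```python
import math
import random


def ou_first_passage_time(mu, tau, sigma, threshold,
                          x0=0.0, dt=1e-3, t_max=100.0, rng=None):
    """Approximate the first time an OU path reaches `threshold`.

    Assumed model (illustrative only): dX = (-X / tau + mu) dt + sigma dW.
    Returns the crossing time, or None if no crossing occurs before t_max.
    """
    rng = rng or random.Random()
    x, t = x0, 0.0
    sqrt_dt = math.sqrt(dt)
    while t < t_max:
        # Euler-Maruyama step: drift term plus scaled Gaussian increment.
        x += (-x / tau + mu) * dt + sigma * sqrt_dt * rng.gauss(0.0, 1.0)
        t += dt
        if x >= threshold:
            return t
    return None


if __name__ == "__main__":
    rng = random.Random(1)
    times = [ou_first_passage_time(1.5, 1.0, 0.5, 1.0, rng=rng) for _ in range(5)]
    print(times)
```

Because the path is only observed on a time grid, crossings between grid points are missed, so the simulated times tend to overestimate the true first passage time; a smaller dt reduces this bias.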
Martin Jacobsen Diffusions with jumps: introduction and overview
Jump diffusions as Markov processes, the generator and Ito's formula, three methods of construction, the problem of stationarity, nice models for finding martingale estimating functions.

10:50-11:30
Statistical computing


Klaus Holst, Thomas Scheike and Thomas Gerds will speak about their R packages.
Thomas Gerds Calibration plots for predictions of absolute risk using R
Klaus Holst Latent Variable Models in R
Thomas Scheike HaploSurvival: Haplotype effects for survival data

11:40-13:00
Survival analysis


Torben Martinussen Quantifying the magnitude of confounding using the Cox model and the Aalen additive hazards model.
When estimating the association between an exposure and an outcome, a simple approach to quantifying the size of confounding by a factor Z is to compare the estimates of the exposure-outcome association with and without adjustment for Z. This approach can be problematic, as the adjusted and unadjusted exposure effects can differ even in the absence of confounding (Greenland, Robins and Pearl, 1999), which is referred to as the nonlinearity effect. In this talk I will explore this problem when the response is a (possibly right-censored) survival time and when we assume either the Cox model or the Aalen additive hazards model. Under the latter model I show that there is no nonlinearity problem (perhaps not surprisingly), while the problem is present under the Cox model. I will show how to correct the measure of confounding under the Cox model.
Ulla B Mogensen Comparison of predictions in multiclass decision problems
Many medical settings face a decision problem with a multiclass outcome. In a diagnostic study of inflammatory bowel disease (IBD), two major types (Crohn's disease and ulcerative colitis) must be discriminated from each other and distinguished from patients without IBD, based on microarray data. In the Copenhagen stroke study, patients can die from stroke-related causes, die from other causes, or survive within a 10-year follow-up. The aim is to predict the outcome from baseline covariates. We first discuss criteria for predictions of mutually exclusive events and then use cross-validation to compare the predictive performance of rival prediction models in both the diagnostic and the competing risks study.

14:40-15:20
Functional data analysis and image analysis


Lars Lau Hansen Operator approximations and analysis of multivariate functional data
Abstract: In this talk we will consider models for multidimensional functional data where the roughness of the underlying functions is penalized. Thinking of the data as an observed function rather than discretely sampled points turns out to provide considerable benefits. It will be shown how statistical quantities can be identified by solving partial differential equations, and that the need for computational resources is dramatically reduced when working in the functional domain.

17:15-18:00
Invited Lecture


Carsten Wiuf Stochastic Modeling and Analysis of DNA Sequence Data from Heterogeneous Tumors
Abstract: Many cancers are believed to have clonal origin, starting from a single cell with a defining mutation and further acquiring one or more additional mutations before the first cancerous cell is established. A population of cancer cells evolves further over time and accumulates further genetic changes. Consequently, cells in different parts of a tumor might show differences in their genomes, or DNA. This phenomenon is referred to as genetic heterogeneity.
Here, I address the problem of modeling how the tumor evolves over time and accumulates changes in the DNA, starting from the initial cell with a defining mutation. The model is stochastic and relies on birth-death processes. I show that there is a simple description of how the (stochastic) number of tumor cells in the system changes over time and that the model imposes constraints on parameters that determine the cell replication; thus the model leads to biological insight.
Further, the model leads to a simple way of simulating tumor evolution. Based on this, two Follicular Lymphoma data sets are used to draw inference on model parameters and the relative ages of tumor origin, defining and subsequent mutations. The latter might have clinical relevance.
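To give a flavor of how a birth-death process can be simulated, here is a minimal sketch. It is not the model from the talk: it assumes a plain linear birth-death process in which each cell divides at rate birth_rate and dies at rate death_rate, simulated event by event (Gillespie-style). The function name and all parameters are hypothetical.

```python
import random


def simulate_birth_death(birth_rate, death_rate, t_end, n0=1, rng=None):
    """Simulate a linear birth-death process up to time t_end.

    Assumption (illustrative only): every cell independently divides at
    rate `birth_rate` and dies at rate `death_rate`.
    Returns a list of (time, population size) pairs, one per event.
    """
    rng = rng or random.Random()
    t, n = 0.0, n0
    path = [(t, n)]
    per_cell_rate = birth_rate + death_rate
    if per_cell_rate <= 0:
        return path  # no events can occur
    while n > 0:
        # Waiting time to the next event is exponential with rate n * (b + d).
        t += rng.expovariate(n * per_cell_rate)
        if t > t_end:
            break
        # The event is a birth with probability b / (b + d), else a death.
        if rng.random() < birth_rate / per_cell_rate:
            n += 1
        else:
            n -= 1
        path.append((t, n))
    return path


if __name__ == "__main__":
    path = simulate_birth_death(1.0, 0.5, t_end=5.0, rng=random.Random(0))
    print(len(path), path[-1])
```

Starting from a single cell (n0=1) mirrors the clonal origin described in the abstract; the population either dies out or grows, and repeated runs show the spread between those outcomes.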


Friday 4/3

9:15-10:35
Bioinformatics


Jessica Kasza Methods for the estimation of Bayesian networks with exogenous variables
Methods for the estimation of Bayesian networks, flexible frameworks allowing the representation of conditional independence relationships of sets of variables, typically require a data set that consists of independent and identically distributed samples. Often the data set available will be more complex, containing information on exogenous variables thought to affect the variables of interest. Here, two methods for the estimation of a Bayesian network given such a data set will be discussed. These approaches will be compared, and their use demonstrated through their application to a gene expression data set that contains data on covariates thought to affect gene expression levels.
Martin Vincent Regularized multinomial regression using sparse group lasso
We consider regularized multinomial regression using sparse group lasso. The sparse group lasso penalty combines the lasso penalty (L1-norm) with the group lasso penalty (L2-norm). After introducing the sparse group lasso method, we shall investigate some of its characteristics. In particular, we shall see that the method, in some cases, produces predictors with higher sparsity at the feature level (i.e. it selects fewer features) than multinomial regression using only the L1 penalty (lasso).
As a practical example we apply the sparse group lasso method to a cancer data set. The data set consists of RT-qPCR measurements of microRNA expression levels of 197 primary cancer tumors divided into 9 classes.
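The way the two penalties combine can be shown with a short computation. This is a sketch under assumptions: it uses one common parametrization, lam * (alpha * L1 + (1 - alpha) * sum over groups of sqrt(group size) * L2), which may differ from the weighting used in the talk. The function name and the arguments groups, lam and alpha are hypothetical.

```python
import math


def sparse_group_lasso_penalty(groups, lam, alpha):
    """Evaluate a sparse group lasso penalty for grouped coefficients.

    Assumed form (one common parametrization, illustrative only):
        lam * (alpha * sum |b| + (1 - alpha) * sum_g sqrt(p_g) * ||b_g||_2)
    `groups` is a list of coefficient lists, one list per group.
    alpha = 1 gives the plain lasso, alpha = 0 the plain group lasso.
    """
    # Lasso part: L1 norm over all coefficients.
    l1 = sum(abs(b) for g in groups for b in g)
    # Group lasso part: L2 norm per group, weighted by sqrt of group size.
    l2 = sum(math.sqrt(len(g)) * math.sqrt(sum(b * b for b in g))
             for g in groups)
    return lam * (alpha * l1 + (1 - alpha) * l2)


if __name__ == "__main__":
    coefficient_groups = [[3.0, 4.0], [0.0, 0.0]]
    print(sparse_group_lasso_penalty(coefficient_groups, lam=1.0, alpha=0.5))
```

The L2 part can set a whole group to zero at once, while the L1 part can zero individual coefficients within a retained group, which is the source of the sparsity behavior discussed in the abstract.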

11:05-11:50
Invited Lecture


Bjarke Feenstra Genome-wide association studies based on Danish health register data
With the advent of high-throughput genotyping microarrays some 5 years ago, genome-wide association studies (GWAS) emerged as a hypothesis-free method of screening the entire genome for disease-related genetic variants. A typical GWAS data set consists of a few thousand persons, each genotyped for more than 500,000 single nucleotide polymorphisms (SNPs). These genetic data are analyzed for association with phenotypic data, such as disease status. In Denmark, we benefit from detailed nationwide health registers, which allow very cost-efficient genetic screening for many diseases. At Statens Serum Institut, we are currently conducting several GWAS based on health register data. In the talk, I will present an example. Using dental data from the nationwide orthodontic registry for children, we conducted a GWAS on the timing of permanent tooth eruption and identified four genomic regions with robust association. All four signals were replicated in independent sample sets from the United States and Denmark. I will round off by discussing some of the additional possibilities offered by this type of data.

13:30-14:30
Invited Lecture


Jens Ledet Jensen Context-dependent evolutionary models
I consider continuous-time Markov models for the evolution of a DNA string. The models are used for the analysis of aligned DNA sequences. In the simplest case one has two aligned sequences, and more generally one has several aligned sequences connected in a known phylogenetic tree. The talk will partly be a review, starting with simple models for independent nucleotides and progressing to independent codons before coming to the context-dependent models. In the latter models the instantaneous rates for a change at a position depend on the values of the neighbouring sites. I will spend some time discussing reversibility of the process before turning to inference problems. Simulations, as an aid in the estimation process, seem unavoidable. Asymptotic normality of the estimates can be treated through the theory of hidden Markov models, but I do not plan to dwell on this. If time permits, I will make some remarks on calculations in endpoint-conditioned Markov chains.


