WORKSHOP ON STATISTICAL INFERENCE FOR STOCHASTIC PROCESSES:

THEORETICAL AND COMPUTATIONAL ASPECTS


ABSTRACTS:


Yacine Ait-Sahalia (Chicago):   Maximum-Likelihood Estimation of Discretely-Sampled Diffusions: A Closed-Form Approach.

When a continuous-time diffusion is observed only at discrete dates, not necessarily close together, the likelihood function of the observations is in most cases not explicitly computable. Researchers have relied on simulations of sample paths in between the observation points, or numerical solutions of partial differential equations, to obtain estimates of the function to be maximized. By contrast, we construct a sequence of fully explicit functions which we show converge under very general conditions, including non-ergodicity, to the true (but unknown) likelihood function of the discretely-sampled diffusion. We document that the rate of convergence of the sequence is extremely fast for a number of examples relevant in finance. We then show that maximizing the sequence instead of the true function results in an estimator which converges to the true maximum-likelihood estimator and shares its asymptotic properties of consistency, asymptotic normality and efficiency. Applications to the valuation of derivative securities are also discussed.


Mikkel Baadsgaard (Copenhagen):   Estimation of Continuous Time Models of the Term Structure of Interest Rates Using Second Order Filtering.

This paper presents an econometric analysis of continuous time models of the term structure of interest rates, using the prices of coupon bonds as the observed entity. The model considered is a three factor model of the term structure of interest rates. In the model the prices of coupon bonds depend on the following state variables: 1) the current short rate, 2) the short term mean of the short rate and 3) the current volatility of the short rate. The three state variables are assumed to be stochastic and the processes are described by appropriate SDEs. In order to obtain a reasonable estimate of the entire term structure, a panel of bond prices with different maturities is used in the estimation. As it is assumed that the observed bond prices are encumbered with measurement noise, a second order filtering approach is used to estimate the unobservable state variables, and the parameters are estimated by a quasi maximum likelihood method. Monte Carlo simulation as well as empirical results based on the Danish bond market are presented.


Bo Martin Bibby (Copenhagen):   Simplified Estimating Functions for Diffusion Models with a High-Dimensional Parameter.

A satisfactory description of complex dynamical systems often leads to a parametric model with a high-dimensional parameter. We consider observations at discrete time-points of a diffusion process where the parameter of interest is of high dimension. We propose to estimate part of the parameter using a simple estimating function and to use a martingale estimating function to estimate the remaining part of the parameter. We consider the asymptotic properties of the resulting estimator. In the talk, data from an experiment involving the monitoring of wind speed at the west coast of Jutland will be presented as a motivating example. The estimation procedure will also be considered for other simple examples of diffusion models.


Ola Elerian (Oxford):   Likelihood Inference for Discretely Observed Non-linear Diffusions. Co-authors: Neil Shephard (Nuffield) and Siddhartha Chib (John M. Olin School of Business, Washington University)

This paper is concerned with the Bayesian estimation of non-linear stochastic differential equations when only discrete observations are available. The estimation is carried out using MCMC methods, in particular the Metropolis-Hastings algorithm, by introducing auxiliary points and using the Euler-Maruyama discretisation scheme. We develop efficient simulation routines and show that naive MCMC methods can perform dramatically poorly in this situation. Techniques for computing the likelihood function, the marginal likelihood and diagnostic measures (all based on the MCMC output) are presented. Examples using simulated and real data are presented and discussed in detail.

Keywords: Bayes estimation, nonlinear diffusion, Euler-Maruyama approximation, Maximum Likelihood, Markov chain Monte Carlo, Metropolis Hastings algorithm, missing data, Simulation, Stochastic Differential Equation.
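

The Euler-Maruyama discretisation named in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function name euler_maruyama, its arguments and the Ornstein-Uhlenbeck example are assumptions made for illustration only.

```python
import math
import random


def euler_maruyama(x0, drift, diffusion, t_end, n_steps, rng):
    """Simulate dX = drift(X) dt + diffusion(X) dW on [0, t_end].

    Hypothetical helper: returns the n_steps + 1 path values on an
    equally spaced grid, starting from x0.
    """
    dt = t_end / n_steps
    x = x0
    path = [x]
    for _ in range(n_steps):
        # Brownian increment over one step: N(0, dt)
        dw = rng.gauss(0.0, math.sqrt(dt))
        x = x + drift(x) * dt + diffusion(x) * dw
        path.append(x)
    return path


# Example (assumed model): Ornstein-Uhlenbeck process dX = -X dt + 0.5 dW
rng = random.Random(1)
ou_path = euler_maruyama(1.0, lambda x: -x, lambda x: 0.5, 1.0, 100, rng)
```

In an MCMC scheme of the kind described, auxiliary points between two observations would be imputed on such a grid, so that the Gaussian one-step transition of the scheme approximates the unknown transition density.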


Bjørn Eraker (Bergen):   Bayesian Analysis for Discretely Observed Diffusions with Unobserved State Variables.

In this paper a new method is proposed for estimation of parameters in diffusion processes from discrete observations. The proposed simulation based MCMC methodology applies to a wide class of models including systems with unobservable state variables and non-linearities. We apply the method to the estimation of parameters in one-factor interest rate models of the CEV class and to a generalization of this model to a two-factor model with a stochastic volatility component. The small sample properties of the estimator are studied through sampling experiments for the stochastic volatility model and the results indicate that the method provides accurate estimates at moderate sample sizes.


Alexander Gushchin (Moscow):   On Convergence to Exponential-Type Statistical Models.

In this talk we discuss conditions for a sequence of filtered statistical experiments to converge to an experiment generated by an exponential-type family of probability measures. Our approach covers many practically important models. The non-filtered case is also considered.


Ernst Hansen (Copenhagen):   Pulse Dimension.

Pulse dimension is introduced as a tool for discussion of correlation dimension. It is defined for an arbitrary continuous distribution function F on [0, \infty) as dim_P(F) = \lim_{x \to 0^+} \frac{\log F(x)}{\log x}, provided that the limit exists. We propose and investigate a class of estimators of pulse dimension, IGP_n(p, \xi), depending on two auxiliary parameters, p and \xi; the index n designates the number of independent observations from F. These estimators are based on the empirical quantile function, and they are quite naive, but they have the advantage over adaptations of more common estimators of correlation dimension that they are always well defined. Analysis based on the asymptotic distribution of IGP_n for a fixed choice of the auxiliary parameters identifies the optimal choice of these parameters. The optimal parameters depend on the number of observations. It is shown that asymptotic normality is preserved if we calculate IGP_n not for fixed values of the auxiliary parameters, but for the optimal choice. Finally, simulation results are presented to justify the approximations involved in the analysis.

Keywords: Correlation dimension, Grassberger Procaccia estimator, optimal asymptotic scheme, pulse dimension, tail index.
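

As a quick check of the definition, consider the assumed example F(x) = c x^\alpha for small x, with constants c > 0 and \alpha > 0 (this example is not taken from the talk). The ratio in the definition then converges, since \log x \to -\infty:

```latex
\frac{\log F(x)}{\log x}
  = \frac{\log c + \alpha \log x}{\log x}
  = \alpha + \frac{\log c}{\log x}
  \longrightarrow \alpha
  \qquad (x \to 0^+),
```

so that dim_P(F) = \alpha in this case; the constant c does not affect the pulse dimension.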


Peter Honoré (Aarhus):   Panel-Data Estimation of Non-Linear Term-Structure Models.

In this paper we present a maximum likelihood estimator of non-linear term-structure models based on a panel-data approach, which facilitates examining a broader class of term-structure models compared to the majority of recent panel-data literature. It is assumed that all zero-coupon yields are observed with measurement error. By imposing linear restrictions on the errors, the underlying state variables are recovered. Hence, the explicit likelihood function is directly available without the need for a filtering algorithm. Furthermore, we use a finite-difference approach to calculate the yields of zero-coupon bonds. Monte Carlo simulations show the benefit of adopting the panel-data approach. Empirical results based on the U.S. term structure of interest rates are presented.

Keywords: Panel-Data Approach, Non-Linear Spot-Rate Diffusion Models, Maximum Likelihood Estimation, Finite-Difference Method, Linear Inversion Restriction.

JEL Codes: C15, C23, C60, G12.


Reinhard Höpfner (Paderborn):   Nonparametric Estimation in Birth and Death on a Flow.

Birth and death on a flow (bdf), introduced by Çinlar and Kao (1992), is a model for the random evolution of finite point configurations in Euclidean space. We speak of these points as particles. The process starts at time $t=0$ in a finite deterministic initial configuration. New particles appear at random times/positions, modelled by a Poisson random measure $\mu(dt,dy)$ with intensity $dt \, \pi(dy)$ for some finite measure $\pi$. During their random lifetime, particles are transported in space by a stochastic Brownian flow $\Phi$, independent of $\mu$, which is the solution flow to an SDE $dX_t = b(X_t)dt + \sigma(X_t)dW_t$ with drift $b(.)$ and diffusion coefficient $\sigma(.)$; on this flow, particles living at the same time move dependently. Particles die according to a position-dependent killing rate $k(.)$ and vanish from the configuration. bdf is the resulting càdlàg process $(\varphi_t)_{t\geq 0}$ of finite particle configurations. A statistical model for bdf is a class of probability laws on a canonical path space defined in terms of a birth measure $\pi$, a drift function $b(.)$ and a death rate function $k(.)$ (we assume $\sigma(.)$ to be fixed and known), under certain assumptions on these. For one-dimensional state space $\mathbb{R}$, we prove asymptotic normality of kernel estimators for $k(.)$ if the bdf process $\varphi$ is observed over a long time interval. Key tools are limit theorems for local time of the bdf process.


Valerie Isham (London):   Inference for Spatio-Temporal Processes: A Hydrological Case-Study.

Formal theories of inference for stochastic processes are generally likelihood-based, but for many spatio-temporal processes, the likelihood function is not readily available. As an illustration of some of the problems involved, we consider the case of spatio-temporal precipitation fields, where the fundamental underlying binary (wet/dry) structure makes Gaussian-based models inappropriate. Some models developed for use in addressing specific problems arising in hydrology will be described; one purpose of this work is to enable the continuous simulation of rainfall fields over very long time periods, for input into distributed rainfall-runoff models. The fitting and assessment of the adequacy of such models raises many interesting statistical and computational issues which will be discussed.


Mathieu Kessler (Murcia):   Simulations Based Estimating Functions for a Discretely Observed Diffusion.

When estimating a parameter theta in the coefficients of a stochastic differential equation from a discrete observation of a trajectory, it is well known that, since the likelihood is intractable, the MLE is not available in practice. Estimating functions have turned out to provide a quite satisfactory way to overcome this difficulty: they are easily implemented and yield a consistent and asymptotically normal estimator (see Sørensen, 1997). However, the expression of an estimating function usually involves a quantity that cannot be computed exactly (typically some moment of the transition density) and therefore relies on a numerical approximation, obtained by simulating a "large" number of paths of the solution by means of an approximation scheme. This procedure has turned out to yield satisfactory results from a practical point of view but, up to now, there have been no theoretical results about the loss of precision due to these numerical approximations of the theoretically well-behaved estimating functions. In this work, we address these issues and provide results to assess the choice of the number of approximating paths as well as the choice of the approximating scheme.

Sørensen, M. (1997): Estimating functions for discretely observed diffusions: A review. In Basawa, I.V., Godambe, V.P. and Taylor, R.L. (eds.): Selected Proceedings of the Symposium on Estimating Functions. IMS Lecture Notes - Monograph Series, Vol. 32, 305-325.


Henrik Madsen (Copenhagen):   Methods for Estimating Embedded Parameters in Linear and Non-linear SDEs Using State Filtering Techniques.

During the last couple of decades state filtering techniques have been used for estimating parameters in stochastic differential equations based on discrete time measurements. In the talk some of the methods will be described, and examples from engineering applications will be provided. Finally, some of the numerical details used in an implementation of the methods will be outlined.


Jan Nygaard Nielsen (Copenhagen):   Estimation in Continuous-time Stochastic Volatility Models Using Nonlinear Filters.

Volatility modelling and estimation play an important role in the valuation and hedging of financial derivatives. The stylized facts of stock prices, interest and exchange rates have led econometricians to propose stochastic volatility models in both discrete and continuous time. However, the volatility as a measure of economic uncertainty is not directly observable in the financial markets. The objective of the continuous-discrete filtering problem considered here is to obtain estimates of the stock price and, in particular, the volatility using discrete-time observations of the stock price. In general, only approximate solutions to the continuous-discrete filtering problem exist, under some regularity conditions, in the form of two ordinary differential equations for the mean and variance of the state variables. In the present paper a nonlinear, second order filter is examined for some bivariate stochastic volatility models and the filter is applied to US stock market data using a maximum likelihood method. The filter is a generalization of the extended Kalman filter, which cannot handle state-dependent diffusion terms.


Søren Feodor Nielsen (Copenhagen):   On Simulated EM Algorithms.

The EM algorithm is a useful method for finding the maximum likelihood estimator in incomplete data problems. However, in some cases we cannot calculate the conditional expectation required in the E-step of the algorithm. Instead an estimate can be formed by simulation, leading to a so-called simulated EM algorithm. The simulations can in principle be done in two ways; we can either draw new random numbers in each iteration or we can re-use the random numbers in each iteration. This leads to two rather different algorithms, which will be discussed and compared during the talk.
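

The two ways of simulating can be contrasted in a minimal sketch. The setting is an assumption chosen for simplicity and is not taken from the talk: exponential lifetimes, some of them right-censored, with the E-step replaced by an average over simulated completions. The function name simulated_em and all of its arguments are hypothetical.

```python
import math
import random


def simulated_em(observed, censored_at, n_iter, reuse, seed=0, n_draws=20):
    """Simulated EM for the rate of an exponential law with right-censoring.

    reuse=False draws new uniform numbers in every iteration;
    reuse=True re-uses one fixed set of uniform numbers throughout.
    Returns the sequence of rate estimates.
    """
    rng = random.Random(seed)
    n = len(observed) + len(censored_at)
    # Fixed uniforms in (0, 1], used only when reuse=True
    fixed_u = [[1.0 - rng.random() for _ in range(n_draws)]
               for _ in censored_at]
    rate = 1.0
    path = []
    for _ in range(n_iter):
        total = sum(observed)
        for i, c in enumerate(censored_at):
            if reuse:
                us = fixed_u[i]
            else:
                us = [1.0 - rng.random() for _ in range(n_draws)]
            # Simulated E-step: X given X > c equals c + Exp(rate),
            # generated by inversion as c - log(U) / rate
            total += sum(c - math.log(u) / rate for u in us) / n_draws
        # M-step: complete-data MLE of the exponential rate
        rate = n / total
        path.append(rate)
    return path


lifetimes = [0.5, 1.2, 0.3, 2.0, 0.7]
censoring_times = [1.0, 1.5]
fresh = simulated_em(lifetimes, censoring_times, 200, reuse=False)
reused = simulated_em(lifetimes, censoring_times, 200, reuse=True)
```

With re-used random numbers the update is a deterministic map, so the iterates settle at a fixed point; with new random numbers in each iteration they keep fluctuating around the estimator.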


Jan Pedersen (Aarhus):   Weak Convergence of Generating Strategies.

The binomial model is a discrete time complete model. This means that an arbitrary claim is generated by an appropriate trading strategy. Further, this strategy is explicitly known and is easy to represent in terms of stock and claim prices. In continuous time the Black-Scholes model is complete as well. However, some generating strategies are not explicitly known. We show that a weak approximation to a (generally unknown) continuous time strategy is obtained in terms of suitable binomial models.
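

The explicit generating strategy in the binomial model can be sketched as follows. The code is an illustration under assumed notation (up and down factors, a per-period growth factor of the bank account) and is not taken from the talk; the names hedge_one_step and generating_strategy are hypothetical.

```python
def hedge_one_step(stock, up, down, growth, claim_up, claim_down):
    """Stock and bank holdings that reproduce the claim after one period."""
    delta = (claim_up - claim_down) / (stock * (up - down))
    bank = (claim_up - delta * stock * up) / growth
    return delta, bank


def generating_strategy(s0, up, down, growth, n_steps, payoff):
    """Backward induction on a recombining binomial tree.

    Returns the initial claim price and a dict mapping (step, up_moves)
    to the pair (stock units, bank account holding) held at that node.
    """
    # Claim values at maturity, indexed by the number of up-moves
    values = [payoff(s0 * up**j * down**(n_steps - j))
              for j in range(n_steps + 1)]
    strategy = {}
    for step in range(n_steps - 1, -1, -1):
        new_values = []
        for j in range(step + 1):
            stock = s0 * up**j * down**(step - j)
            delta, bank = hedge_one_step(stock, up, down, growth,
                                         values[j + 1], values[j])
            strategy[(step, j)] = (delta, bank)
            # Cost of the hedge equals the claim price at this node
            new_values.append(delta * stock + bank)
        values = new_values
    return values[0], strategy


# Assumed example: call with strike 100, two periods, no interest
price, strategy = generating_strategy(100.0, 2.0, 0.5, 1.0, 2,
                                      lambda s: max(s - 100.0, 0.0))
```

At every node the holdings are expressed purely in terms of stock and claim prices one period ahead, which is the sense in which the discrete strategy is explicit.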


Ulrike Putschke (Berlin):   Properties of the ML-Estimator for Homogeneous Gaussian Diffusions.

For a (0,B,C)-diffusion in finite dimensions the asymptotic properties of the likelihood-function are studied in the sense of Le Cam with a special emphasis on the case in which the matrix B is normal. The essential influence comes from the spectral properties of the unknown matrix B. They lead to a rich structure of different cases classified in four categories -- LAN, LAMN, PLAMN and LAQ -- where each of them is divided into several sub-classes. Applications to the asymptotic behaviour of the MLE are given.


Anders Rahbek (Copenhagen):   Unit-root Inference in Autoregressive (AR) Models with Autoregressive Conditional Heteroscedastic Innovations (ARCH). Co-authors: Peter Boswijk (Amsterdam) and Anders Svennesen (Copenhagen).

In unit-root and, in particular, cointegration analysis involving financial data it is often the case that estimated residuals appear to have ARCH-like behaviour. Implications and modelling of ARCH innovations are discussed for the class of AR models with emphasis on unit-root hypotheses. Two invariance principles hold. The first implies that likelihood ratio tests for unit-roots in AR models with ARCH innovations have non-standard asymptotic distributions. These may be characterized in terms of two, possibly dependent, Brownian motions and stochastic integrals in terms of these. The second invariance principle implies that unit-root likelihood ratio tests derived in AR models with i.i.d. Gaussian innovations have the same asymptotic distribution whether the innovations are i.i.d. or ARCH.


Tina Rydberg (Oxford):   A Modelling Framework for the Prices and Times of Trades Made on the NYSE.

We set down a framework for the modelling of the price and time of each trade made on a particular stock on the New York Stock Exchange (NYSE). The model has two main characteristics: (i) prices only occur on a non-negative lattice of points separated by 1/8 of a dollar; (ii) the times of the trades occur randomly, and the times between consecutive trades have distinct intra-daily patterns and are serially correlated around the pattern. We model the time between trades as a Cox process and the price movements as being very close to a compound Poisson process. Our models have the advantage that they have no direct latent variable and so likelihood inference is straightforward. There is no time deformation or stochastic volatility component, which makes the model easy to simulate.

Keywords: Complete models, Compound Poisson processes, Cox processes, Randomly spaced observations.


Tobias Rydén (Lund):   Bayesian Inference in Hidden Markov Models through Reversible Jump Markov Chain Monte Carlo.

A hidden Markov model (HMM) is a bivariate stochastic process $\{(X_k,Y_k)\}$ such that (i) $\{X_k\}$ is a finite state Markov chain and (ii) given $\{X_k\}$, the process $\{Y_k\}$ is a sequence of conditionally independent random variables with the conditional distribution of $Y_n$ depending on $X_n$ only. The chain $\{X_k\}$ is generally not observable, hence the word `hidden', so that inference has to be based on $\{Y_k\}$ alone.

HMMs have during the last decade become widespread for modelling sequences of weakly dependent random variables, with applications in areas like speech processing, communication networks, biochemistry, biology, medicine, econometrics, environmetrics, etc. Sometimes the hidden Markov chain $\{X_k\}$ does indeed exist, so that the physical nature of the problem suggests the use of an HMM; in other cases HMMs just provide a good fit to data.

One of the most difficult problems in inference for HMMs is to estimate the number of states, $d$ say, of $\{X_k\}$. Classical approaches to this problem include likelihood ratio tests and penalized likelihoods (AIC/BIC). In this talk we present a Bayesian approach: by placing a prior on the unknown $d$ we obtain a posterior distribution for $d$ and the other parameters of the model. This distribution is analytically intractable but can be explored using reversible jump Markov chain Monte Carlo algorithms. Finally an application to stock market data is presented.
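

The definition of an HMM given at the start of this abstract can be made concrete by simulation. The sketch below is an illustration only; the function name simulate_hmm, the two-state chain and the Gaussian emission law are assumptions, not part of the talk.

```python
import random


def simulate_hmm(n, initial, transition, emit, rng):
    """Simulate n steps of an HMM.

    initial: list of initial state probabilities
    transition: row-stochastic matrix as a list of lists
    emit: function (state, rng) -> observation, so that Y_k depends
          on X_k only
    Returns (states, observations); only observations would be seen.
    """
    states, observations = [], []
    x = rng.choices(range(len(initial)), weights=initial)[0]
    for _ in range(n):
        states.append(x)
        observations.append(emit(x, rng))
        # Markov step of the hidden chain
        x = rng.choices(range(len(transition[x])), weights=transition[x])[0]
    return states, observations


# Assumed example: d = 2 states with Gaussian emissions of different scale
rng = random.Random(2)
hidden, seen = simulate_hmm(
    500,
    [0.5, 0.5],
    [[0.95, 0.05], [0.10, 0.90]],
    lambda x, r: r.gauss(0.0, 1.0 if x == 0 else 3.0),
    rng,
)
```

Estimating the number of states $d$ means recovering the dimension of the transition matrix from the observations alone.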


Vladimir Spokoiny (Berlin):   Adaptive Estimation for Non-stationary Stochastic Systems.

We consider a stochastic system which is perturbed at some unknown moments. Each perturbation may drastically change characteristics of the system. The goal is to estimate adaptively the current system parameters from the observations of this system.


Anders Stockmarr (Copenhagen):   Asymptotic Behavior of the MLE's in Models for Multivariate Time-homogeneous Gaussian Diffusions.

The asymptotic behavior of the maximum likelihood estimators in these models has proven to be very complex and hard to describe, and though the behavior is well known in the stationary case, a general result has not yet been obtained. The talk will go a bit of the way via an approach based on the Jordan decomposition of the linear drift parameter.


Helgi Tomasson (Reykjavik):   Estimation of Market Value When Trading is Infrequent.

The value process of a stock market is assumed to evolve continuously. When trading takes place, a noisy observation (the price process) of the value process is obtained. Individual stocks have different trading intensities. A computational algorithm based on the Kalman filter for estimating the value process based on observations on the price process is given. This is implemented on Icelandic data to get an on-line estimate of a market value index.
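

A Kalman filter for irregularly spaced trades can be sketched for a single stock. The model here is an assumption made for illustration and is not the speaker's specification: the value follows a Brownian motion with variance diffusion_var per unit of time, and each trade price equals the value plus independent noise with variance noise_var. The function name filter_value is hypothetical.

```python
def filter_value(times, prices, diffusion_var, noise_var, mean0, var0):
    """Scalar Kalman filter with unequal gaps between observations.

    Returns a list of (filtered mean, filtered variance), one pair
    per trade.
    """
    mean, var = mean0, var0
    last_time = times[0]
    estimates = []
    for t, price in zip(times, prices):
        # Prediction: uncertainty grows with the time since the last trade
        var += diffusion_var * (t - last_time)
        # Update: weight the new price by the Kalman gain
        gain = var / (var + noise_var)
        mean += gain * (price - mean)
        var *= 1.0 - gain
        last_time = t
        estimates.append((mean, var))
    return estimates


# Assumed example: two trades of the same stock, one time unit apart
estimates = filter_value([0.0, 1.0], [12.0, 12.0], 1.0, 1.0, 10.0, 1.0)
```

After a long period without trading the predicted variance is large, so the next observed price receives a weight close to one; this is how infrequent trading enters the estimate.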


Esko Valkeila (Helsinki):   Stock Prices Driven by Fractional Brownian Motion.

It is common to use geometric Brownian motion as a model for stock prices. In this talk we look at what happens when the driving Brownian motion is replaced by a fractional Brownian motion.


Jeanette Wörner (Freiburg):   Optimal Estimation for Discretely Observed Diffusion Processes.

In the ergodic case we prove LAN for a discretely observed diffusion process and apply the results to martingale estimating functions. Furthermore, we derive a relation between the LAN property and the optimality in the sense of Godambe and Heyde.