Goodness-of-Fit in GLM

Master's thesis defense: Finn Olai Milhøj

Title: Goodness-of-Fit in GLM

Abstract: In this thesis, several ways of measuring goodness-of-fit for count data in the generalized linear model (GLM) are presented and investigated. The methods are applied to simulated data and to real-life data from Topdanmark A/S. The most commonly used goodness-of-fit tests in GLM, the scaled deviance and the scaled Pearson chi^2-statistic, are flawed as they rely on grouped-data asymptotics, which can be questioned in practical applications. A method based on the weak convergence of the empirical residual process in the non-parametrized GLM is introduced. It is proved that this process converges towards a complicated time-scaled Brownian motion. The difficulty posed by this complicated limit is proved to be overcome by an application of the wild bootstrap method. Another method introduced is the unrestricted bootstrapped Pearson chi^2-test. Whereas the Pearson chi^2-test based on the maximum likelihood estimator does not have an exact chi^2 limit, it is proved that the randomness of the bootstrap reconstructs the exact chi^2 limit. The simulation study shows that the wild bootstrap clearly works better than the classical methods when testing the mean structure, whereas the bootstrapped Pearson chi^2-test is superior when testing the distributional assumption. Tests for overdispersion are constructed and their asymptotics proved, since they work as goodness-of-fit tests for Poisson regressions for count data. Simulation shows that testing against an overdispersed alternative is preferable to testing alpha=0 in the negative binomial regression.
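
For orientation, the two classical statistics named in the abstract can be written out for a Poisson regression with log link. The following is a minimal sketch and is not taken from the thesis: the function names (fit_poisson_irls, pearson_chi2, poisson_deviance) are hypothetical, the fit uses plain iteratively reweighted least squares, and the simulated data are illustrative only. For the Poisson family the dispersion is 1, so the scaled and unscaled statistics coincide.

```python
import numpy as np


def fit_poisson_irls(X, y, n_iter=100, tol=1e-10):
    """Fit a Poisson regression with log link by IRLS (hypothetical helper)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        eta = X @ beta
        mu = np.exp(eta)
        z = eta + (y - mu) / mu      # working response
        XtW = X.T * mu               # IRLS weights equal mu for Poisson/log link
        beta_new = np.linalg.solve(XtW @ X, XtW @ z)
        if np.max(np.abs(beta_new - beta)) < tol:
            return beta_new
        beta = beta_new
    return beta


def pearson_chi2(y, mu):
    """Pearson statistic: sum of (y - mu)^2 / V(mu), with V(mu) = mu for Poisson."""
    y = np.asarray(y, dtype=float)
    mu = np.asarray(mu, dtype=float)
    return np.sum((y - mu) ** 2 / mu)


def poisson_deviance(y, mu):
    """Poisson deviance: 2 * sum(y * log(y / mu) - (y - mu)), with 0 * log(0) = 0."""
    y = np.asarray(y, dtype=float)
    mu = np.asarray(mu, dtype=float)
    safe_y = np.where(y > 0, y, 1.0)  # avoids log(0); masked out on the next line
    log_term = np.where(y > 0, y * np.log(safe_y / mu), 0.0)
    return 2.0 * np.sum(log_term - (y - mu))


if __name__ == "__main__":
    # Illustrative simulated data (assumed coefficients 0.5 and 1.0).
    rng = np.random.default_rng(0)
    n = 500
    x = rng.uniform(size=n)
    X = np.column_stack([np.ones(n), x])
    y = rng.poisson(np.exp(0.5 + 1.0 * x))

    beta = fit_poisson_irls(X, y)
    mu = np.exp(X @ beta)
    print("Pearson chi^2:", pearson_chi2(y, mu))
    print("Deviance:     ", poisson_deviance(y, mu))
    print("Residual df:  ", n - X.shape[1])
```

Comparing either statistic with a chi^2 distribution on the residual degrees of freedom is the classical procedure that the abstract questions. The bootstrap-based tests developed in the thesis are not reproduced here.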

Supervisor: Jostein Paulsen
External examiner: Mette Havning