An Analysis of Measures of Predictive Performance

Master's thesis defense by Eva Johnsen

Title:  An Analysis of Measures of Predictive Performance

Abstract: This thesis examines and discusses how to evaluate a predictive model and how to determine whether a variable included in a model has added predictive value. The problem is relevant because there is no standardized procedure for evaluating a predictive model for binary data. In a simulation study, four performance measures (Accuracy, AUC, IDI and NRI) are compared and discussed. These performance measures are also compared to a likelihood ratio test statistic, and it is concluded that the likelihood ratio test generally has the greatest power of all the measures analysed. A Shiny app is created to investigate the performance measures under different assumptions that the user can control. The inputs that can be varied are the data-generating process, the choice of predictive model, and the size of the simulation study.
Guidance on how to apply the results to real-world data is provided, and the distribution of each performance measure under the null hypothesis is analysed in a simulation study. Some rough guidelines for critical values are given.

Supervisor:        Anders Tolver
External examiner: Anders Rønn-Nielsen, CBS