Analyzing the effect of extrapolation on feature importance measures

Master's thesis defence: Rikke Birgitte Melander

Title: Analyzing the effect of extrapolation on feature importance measures

Abstract: Machine learning models perform particularly well at prediction tasks. However, despite their predictive power, they are often difficult to interpret due to the complexity of the underlying models. In this thesis, we consider three interpretation methods, i.e. methods used to interpret black-box prediction models: the Partial Derivative method, the Permutation method and the Shapley value method. Inspired by these methods, we construct feature influence measures and feature importance measures, which assign, respectively, a signed and a positive value to each feature. The aim of the thesis is to analyse the effect of extrapolation on such measures, which we do for three different machine learning models: the linear regression model, the kernel ridge regression model and the random forest regression model. To obtain a more nuanced picture of the measures, we furthermore analyse the effect of correlation between the features.
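To illustrate one of the measures discussed, the sketch below computes a simple permutation feature importance: the increase in mean squared error when one feature's column is shuffled. This is a minimal NumPy illustration, not the thesis's own construction; the simulated data, the least-squares model and all variable names are assumptions for the example. Note that shuffling a column can place the model on inputs far from the training distribution, which is exactly the extrapolation issue the thesis analyses.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated data (hypothetical): y depends strongly on x0, weakly on x1,
# and not at all on x2.
n = 500
X = rng.normal(size=(n, 3))
y = 3.0 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.1, size=n)

# Ordinary least squares fit, a simple stand-in prediction model.
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
predict = lambda X_: X_ @ beta

def mse(y_true, y_pred):
    return np.mean((y_true - y_pred) ** 2)

def permutation_importance(X, y, predict, n_repeats=10):
    """Increase in MSE when feature j is permuted, averaged over repeats."""
    base = mse(y, predict(X))
    importances = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        gains = []
        for _ in range(n_repeats):
            Xp = X.copy()
            Xp[:, j] = rng.permutation(Xp[:, j])  # break the link to y
            gains.append(mse(y, predict(Xp)) - base)
        importances[j] = np.mean(gains)
    return importances

imp = permutation_importance(X, y, predict)
```

Here `imp` assigns the largest value to the strongly relevant feature x0 and a value near zero to the irrelevant x2, matching the intuition behind a positive-valued feature importance measure.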

Supervisor: Niklas Pfister
External examiner: Sören Möller, SDU