Regression analysis of pharmaceutical drug production
Master's thesis defense: Simon Mejlhede Jensen
Title: Regression analysis of pharmaceutical drug production
Abstract: One method for industrial drug production is to use bioprocesses, i.e. processes that involve living cells. The yield of such bioprocesses typically exhibits considerable stochastic variation due to the biological variation of the living cells. At the same time, it is economically valuable to be able to predict the final yield while the bioprocess is still in progress. This thesis analyses the behavior of a bioprocess that yielded antibody. The bioprocess was repeated 10 times. For each run, 13 observables (e.g. CO2 concentration) were measured about 15 times each. These measurements, together with the yield of each run, form the data set on which the thesis is based. The purpose of the thesis was to construct a model that can predict the final yield from the measurements. First, we tried to predict the yield with a linear model. The number of explanatory variables was considerably larger than the number of observations (runs), and for this reason we used the lasso method to select which explanatory variables to include in the model. The linear model showed that it was possible to predict the yield after 201 hours. In particular, it found that one particular observable among the 13 was the best predictor of the yield. Second, we tried to reduce the number of explanatory variables by using spectral clustering to separate them into groups, and we compared the performance of different variants of spectral clustering. To assess the predictive strength of the linear model and the clustering algorithm, we applied cross-validation to both models. Through cross-validation we were able to show that the yield could be predicted after 201 hours using measurements of a particular observable.
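The combination of lasso variable selection and cross-validation described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation. The data are synthetic (10 runs with 13 x 15 = 195 explanatory variables, where the yield is assumed to depend on the first variable only). The objective scaling (1/(2n) times the squared error plus lam times the L1 norm), the coordinate-descent solver, the leave-one-out scheme and the penalty grid are all assumptions made for illustration.

```python
import random

def soft_threshold(z, t):
    """Soft-thresholding operator used in the lasso coordinate update."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0

def lasso_cd(X, y, lam, n_iter=50):
    """Coordinate descent for (1/(2n))*||y - b0 - X b||^2 + lam*||b||_1."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    b0 = sum(y) / n
    resid = [y[i] - b0 for i in range(n)]  # current residuals y - b0 - X b
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of column j with the residual that excludes column j.
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * beta[j]
            new = soft_threshold(rho, lam) / col_sq[j]
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= delta * X[i][j]
                beta[j] = new
        # Re-centre the (unpenalised) intercept.
        shift = sum(resid) / n
        b0 += shift
        resid = [r - shift for r in resid]
    return b0, beta

def loo_cv_error(X, y, lam):
    """Leave-one-out cross-validated mean squared prediction error."""
    n = len(X)
    err = 0.0
    for k in range(n):
        b0, beta = lasso_cd(X[:k] + X[k + 1:], y[:k] + y[k + 1:], lam)
        pred = b0 + sum(b * x for b, x in zip(beta, X[k]))
        err += (y[k] - pred) ** 2
    return err / n

# Synthetic stand-in for the data set: 10 runs, 13 observables x 15 time points.
random.seed(0)
n_runs, n_features = 10, 13 * 15
X = [[random.gauss(0.0, 1.0) for _ in range(n_features)] for _ in range(n_runs)]
y = [2.0 * row[0] + random.gauss(0.0, 0.1) for row in X]  # assumed relationship

lambdas = [0.1, 0.3, 1.0]  # hypothetical penalty grid
cv_errors = {lam: loo_cv_error(X, y, lam) for lam in lambdas}
best_lam = min(cv_errors, key=cv_errors.get)

b0_hat, beta_hat = lasso_cd(X, y, best_lam)
selected = [j for j, b in enumerate(beta_hat) if b != 0.0]
print("chosen penalty:", best_lam)
print("selected variable indices:", selected)
```

With only 10 runs, leave-one-out is a natural choice of cross-validation scheme, since each fold still trains on 9 runs. On real measurements the explanatory variables would normally be standardized before fitting.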
Supervisor: Olivier Wintenberger
External examiner: Pierre Pinson