Abstract

A Finnish pension insurance company provides statutory earnings-related pension insurance. In addition to old-age pensions, this insurance provides protection against disability. Insuring against disability causes a pension insurance company to accumulate data about disability pension applications. The main purpose of this data is to decide an applicant's entitlement to a disability pension. However, the considerable volume of data makes it possible to apply machine learning techniques to complex problems around the phenomenon of disability.

In this paper, machine learning is applied to a binary classification problem: predicting the outcome of a disability pension application. This is done with the Finnish pension insurance company Varma's data on disability pension applications from the last ten years. The classification problem is solved with four machine learning models: logistic regression, random forest and two versions of gradient boosting. The paper also covers the basics of interpreting the behavior of a machine learning model.

As a result, the best machine learning model achieves an accuracy of 84 %. In addition, some basic behaviors of the model can be explained. The applicant's age seems to have the largest effect on the model output. The model output is also observed to be sensitive to some combinations of the applicant's diagnosis and other explanatory variables.

At the end of the paper, it is briefly considered how the prediction could be useful from the point of view of a pension insurance company and an actuary. Three points of view are introduced. The first focuses on how the disability pension handling process could be optimized to achieve a desirable handling time. The second concerns the quality control of disability pension decisions. The last focuses on the short-term prediction of disability pension expenses.
