Predicting online gambling self-exclusion: an analysis of the performance of supervised machine learning models

Abstract

As gambling operators become increasingly sophisticated in analysing individual gambling behaviour, this study evaluates whether machine learning techniques can identify, from trends in their gambling behaviour, the individuals in a sample of 845 online gamblers who used self-exclusion tools. Identifying other gamblers whose behaviour resembles that of self-excluders could, for instance, allow operators to share responsible gaming messages or other information that supports self-aware gambling and reduces the risk of adverse outcomes. To do so, however, operators need to understand how accurate such models can be and which techniques work well. This article identifies the most accurate of four highly diverse techniques and discusses how to deal, analytically and practically, with a rare event such as self-exclusion, which was used by fewer than 1% of gamblers in our data-set. We conclude that balanced training data-sets are necessary for creating effective models and that, on our data-set, the most effective method is the random forest technique, which achieves an accuracy improvement of 35 percentage points over baseline estimates.
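
To illustrate why a rare event calls for a balanced training data-set, the following is a minimal sketch on synthetic labels only. It assumes random undersampling of the majority class as the balancing method, which the abstract does not specify; the function names `majority_baseline_accuracy` and `undersample_majority`, the 1,000-record sample, and the placeholder features are all hypothetical. It shows that, when about 1% of gamblers self-exclude, a model that always predicts "no self-exclusion" already reaches a very high accuracy, whereas after balancing the same trivial baseline falls to 50%, so any improvement over it reflects real discrimination between the two groups.

```python
import random
from collections import Counter


def majority_baseline_accuracy(labels):
    """Accuracy obtained by always predicting the most common label."""
    counts = Counter(labels)
    return max(counts.values()) / len(labels)


def undersample_majority(rows, labels, seed=0):
    """Return a class-balanced subset by randomly dropping majority-class rows.

    Assumption: undersampling is only one possible balancing strategy;
    it is used here purely for illustration.
    """
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    # Every class is cut down to the size of the rarest class.
    n_min = min(len(group) for group in by_class.values())
    balanced = []
    for label, group in by_class.items():
        for row in rng.sample(group, n_min):
            balanced.append((row, label))
    rng.shuffle(balanced)
    balanced_rows = [row for row, _ in balanced]
    balanced_labels = [label for _, label in balanced]
    return balanced_rows, balanced_labels


# Synthetic illustration only: 1,000 hypothetical gamblers, 1% self-excluders.
labels = [1] * 10 + [0] * 990
rows = [[i] for i in range(len(labels))]  # placeholder feature vectors

raw_baseline = majority_baseline_accuracy(labels)
bal_rows, bal_labels = undersample_majority(rows, labels, seed=42)
balanced_baseline = majority_baseline_accuracy(bal_labels)

print(f"baseline accuracy, raw data:      {raw_baseline:.2f}")
print(f"baseline accuracy, balanced data: {balanced_baseline:.2f}")
print(f"balanced sample size:             {len(bal_labels)}")
```

Undersampling discards most majority-class records, so with very few self-excluders the balanced set is small; oversampling or class weighting are alternative choices with different trade-offs.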
