Using the Area Under the Curve (AUC) as the performance measure, we compare the Logit regression model with the LightGBM algorithm. Although both methods are common in the literature, our study emphasizes the role of statistical inference in evaluating and comparing their results. We use the training set of the Vesta (2018) dataset, provided by Vesta, a global fraud prevention company headquartered in the United States that specializes in payment solutions and risk management. Originally released as part of a Kaggle competition on credit card fraud detection, the dataset comprises diverse transaction records and is a rich source for exploring advanced fraud detection methods. Our analysis reveals that, while the LightGBM algorithm generally yields higher predictive accuracy, the differences between the estimated AUCs of the two methods are not statistically significant. This underscores the importance of using inferential techniques to validate differences in model performance in fraud detection.
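To make the idea of inference on an AUC difference concrete, the following is a minimal sketch, not the procedure used in this study, which the abstract does not specify. It assumes a paired percentile bootstrap over the evaluation observations; the function names `auc` and `bootstrap_auc_diff`, and the defaults `n_boot`, `alpha`, and `seed`, are hypothetical choices made for illustration. Other approaches, such as the DeLong test, could be used instead.

```python
import random


def auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    The pairwise loop is quadratic, so this suits small illustrations only.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_diff(labels, scores_a, scores_b,
                       n_boot=1000, alpha=0.05, seed=0):
    """Paired percentile-bootstrap interval for AUC(a) - AUC(b).

    Both models are scored on the same resampled rows, so the dependence
    between the two AUC estimates is preserved. All defaults are
    illustrative assumptions, not values taken from the study.
    """
    if len(set(labels)) < 2:
        raise ValueError("labels must contain both classes")
    rng = random.Random(seed)
    n = len(labels)
    diffs = []
    while len(diffs) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        y = [labels[i] for i in idx]
        if len(set(y)) < 2:
            continue  # skip resamples that lost one of the classes
        a = [scores_a[i] for i in idx]
        b = [scores_b[i] for i in idx]
        diffs.append(auc(y, a) - auc(y, b))
    diffs.sort()
    lower = diffs[int((alpha / 2) * n_boot)]
    upper = diffs[int((1 - alpha / 2) * n_boot) - 1]
    point = auc(labels, scores_a) - auc(labels, scores_b)
    return point, (lower, upper)
```

Under these assumptions, `scores_a` and `scores_b` would hold the predicted fraud probabilities of the two models on the same held-out transactions. If the returned interval contains zero, the observed AUC difference would not be judged statistically significant at level `alpha`.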
Rights and permissions

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.