A dual-model machine learning approach to Medicare fraud detection: combining unsupervised anomaly detection with supervised learning

Jesu Marcus Immanuvel Arockiasamy, Gowrishankar Bhoopathi

Abstract


Medicare fraud, costing $54.35 billion in improper payments in 2024, undermines U.S. healthcare by draining resources meant for vulnerable populations. Traditional detection methods are hampered by reactive designs, high false-positive rates, and reliance on scarce labeled data, a problem exacerbated by a fraud prevalence of only 0.017%. This paper proposes a dual-model machine learning framework to tackle these challenges. In the first stage, unsupervised anomaly detection uses cluster-based local outlier factor (CBLOF) and empirical cumulative outlier detection (ECOD) to identify novel fraud patterns across 37 million records; the flagged anomalies are validated against the List of Excluded Individuals/Entities (LEIE). In the second stage, supervised classification with C4.5 decision trees and logistic regression refines these anomalies using an 80:20 balanced dataset, reducing false positives by 63%. Key innovations include hybrid sampling to address class imbalance, LEIE integration for labeled validation, and parallelized processing of 2.1 million claims per hour. The approach achieves an area under the curve (AUC), a measure of classification performance, of 88.3% and outperforms single-model systems by 24%, blending exploratory detection with actionable precision. This scalable, interpretable framework potentially advances fraud detection, safeguarding public funds and Medicare’s integrity with a practical, adaptable solution for evolving threats.
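To make the first stage concrete, the following is a minimal sketch of ECOD-style scoring followed by an AUC check against known labels. It is not the authors' implementation: it is a simplified variant that sums the negative log of the left-tail and right-tail empirical CDFs per feature and keeps the larger of the two sums, omitting the skewness-corrected term of full ECOD as well as the CBLOF and supervised stages. The feature columns (claim count, average payment), the toy rows, and the labels standing in for LEIE matches are all hypothetical.

```python
import math
from bisect import bisect_left, bisect_right


def ecod_style_scores(rows):
    """Simplified ECOD-style outlier scores (illustrative assumption,
    not the paper's code). A higher score means the row sits further
    into the tails of the per-feature empirical distributions."""
    n = len(rows)
    dims = len(rows[0])
    left = [0.0] * n
    right = [0.0] * n
    for d in range(dims):
        col = sorted(r[d] for r in rows)
        for i, r in enumerate(rows):
            # Left-tail ECDF: fraction of values <= x (at least 1/n).
            f_left = bisect_right(col, r[d]) / n
            # Right-tail ECDF: fraction of values >= x (at least 1/n).
            f_right = (n - bisect_left(col, r[d])) / n
            left[i] += -math.log(f_left)
            right[i] += -math.log(f_right)
    # Keep the more extreme tail; the skewness-corrected sum is omitted.
    return [max(l, r) for l, r in zip(left, right)]


def roc_auc(labels, scores):
    """AUC as the probability that a positive outranks a negative,
    counting ties as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical provider features: (claim count, average payment).
rows = [
    (100, 52.0),
    (110, 49.0),
    (95, 51.0),
    (105, 50.0),
    (98, 50.5),
    (400, 300.0),  # deliberately extreme in both features
]
# Hypothetical labels standing in for LEIE matches (1 = excluded).
labels = [0, 0, 0, 0, 0, 1]

scores = ecod_style_scores(rows)
auc = roc_auc(labels, scores)
print(scores)
print(auc)
```

In this toy data the last row is the largest value in both features, so it receives the highest score. On real claims data, a maintained library implementation of ECOD and CBLOF would be a more sensible starting point than this hand-rolled version.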

Keywords


Artificial intelligence; Cluster-based local outlier factor; Empirical cumulative outlier detection; Machine learning; Medicare fraud; Unsupervised learning



DOI: https://doi.org/10.11591/csit.v6i3.p245-252



Computer Science and Information Technologies
p-ISSN: 2722-323X, e-ISSN: 2722-3221
This journal is published by the Institute of Advanced Engineering and Science (IAES) in collaboration with Universitas Ahmad Dahlan (UAD).


This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.