Dissertation/Thesis Abstract

Big Data Analytics and Engineering for Medicare Fraud Detection
by Herland, Matthew Andrew, Ph.D., Florida Atlantic University, 2019, 205; 13814186
Abstract (Summary)

The United States (U.S.) healthcare system produces an enormous volume of data with a vast number of financial transactions generated by physicians administering healthcare services. This makes healthcare fraud difficult to detect, especially when there are considerably fewer fraudulent transactions than non-fraudulent ones. Fraud is an extremely important issue for healthcare, as fraudulent activities within the U.S. healthcare system contribute to significant financial losses. In the U.S., the elderly population continues to rise, increasing the need for programs, such as Medicare, to help with associated medical expenses. Unfortunately, healthcare fraud adversely affects these programs, draining resources and reducing the quality and accessibility of necessary healthcare services. In response, advanced data analytics have recently been explored to detect possible fraudulent activities. The Centers for Medicare and Medicaid Services (CMS) released several 'Big Data' Medicare claims datasets for different parts of their Medicare program to help facilitate this effort.

In this dissertation, we employ three CMS Medicare Big Data datasets to evaluate the fraud detection performance achievable with advanced data analytics techniques, specifically machine learning. We use two distinct approaches, designated as anomaly detection and traditional fraud detection, each with its own data processing and feature engineering. Anomaly detection experiments classify by provider specialty, determining whether outlier physicians within the same specialty signal fraudulent behavior. Traditional fraud detection refers to the experiments that directly classify physicians as fraudulent or non-fraudulent, leveraging machine learning algorithms to discriminate between the classes. We present our novel data engineering approaches for both anomaly detection and traditional fraud detection, including data processing, fraud mapping, and the creation of a combined dataset consisting of all three Medicare parts. We incorporate the List of Excluded Individuals and Entities database to identify real-world fraudulent physicians for model evaluation. Regarding features, the final datasets for anomaly detection contain only claim counts for every procedure a physician submits, while those for traditional fraud detection incorporate aggregated counts and payment information, specialty, and gender. Additionally, we compare cross-validation to the real-world application of building a model on a training dataset and evaluating it on a separate test dataset under severe class imbalance and rarity.
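
As a rough illustration of the aggregation and fraud-mapping steps described above, the following minimal Python sketch sums claim counts and payments per provider and labels a provider as fraudulent when its identifier appears in an exclusion list. The row layout, the provider identifiers, the function name `build_provider_dataset`, and the `excluded_ids` set are all hypothetical placeholders standing in for the CMS claims data and the List of Excluded Individuals and Entities; this is not the dissertation's actual pipeline.

```python
from collections import defaultdict

# Hypothetical claim rows: (provider_id, procedure_code, claim_count, payment).
# These values are made up for illustration and are not CMS data.
claims = [
    ("P001", "99213", 120, 8400.0),
    ("P001", "93000", 30, 600.0),
    ("P002", "99213", 15, 1050.0),
    ("P003", "99214", 400, 44000.0),
]

# Hypothetical set of excluded provider identifiers, standing in for
# a lookup against the List of Excluded Individuals and Entities.
excluded_ids = {"P003"}


def build_provider_dataset(claims, excluded_ids):
    """Aggregate claims per provider and attach a fraud label."""
    totals = defaultdict(lambda: {"claim_count": 0, "payment": 0.0})
    for provider_id, _procedure, count, payment in claims:
        totals[provider_id]["claim_count"] += count
        totals[provider_id]["payment"] += payment
    # A provider is labeled fraudulent only if it appears in the exclusion set.
    return {
        pid: {**agg, "fraud": pid in excluded_ids}
        for pid, agg in totals.items()
    }


dataset = build_provider_dataset(claims, excluded_ids)
```

With only one of three providers labeled fraudulent, even this toy dataset hints at the class imbalance that the evaluation must account for.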

Indexing (document details)
Advisor: Khoshgoftaar, Taghi M.
Committee: Solomon, Martin; Zhu, Xingquan; Zhuang, Hanqi
School: Florida Atlantic University
Department: Computer Engineering
School Location: United States -- Florida
Source: DAI-B 80/11(E), Dissertation Abstracts International
Source Type: DISSERTATION
Subjects: Computer Engineering, Health care management
Keywords: Anomaly detection, Big data analytics, Machine learning, Medicare fraud, Rarity, Severe class imbalance
Publication Number: 13814186
ISBN: 9781392274026
Copyright © 2019 ProQuest LLC. All rights reserved.