Credit Card Fraud Detection Using SMOTE (Classification Approach):
This is the second approach I’m sharing for credit card fraud detection.
In this approach we explore resampling techniques, specifically oversampling the minority (fraud) class. The key steps in this kernel are:
1) Balance the dataset by oversampling the fraud-class records with SMOTE.
2) Train a Random Forest model on the oversampled data.
3) Evaluate the model's performance using its predictions on the original, imbalanced test data.
4) Add cluster segments to the original train and test data using the K-Means algorithm.
5) Repeat steps 1 to 3 and check whether adding the clusters improves the Random Forest's performance.
6) Finally, use K-fold cross-validation on the original train data to check whether the model generalizes well to unseen data.
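To make step 1 more concrete, here is a minimal sketch of the idea behind SMOTE: each synthetic fraud record is placed at a random point on the line segment between an existing fraud record and one of its nearest fraud-class neighbours. This is an illustration only, written with the standard library; the function name `smote_sketch`, its parameters, and the toy data are assumptions made for the example and are not taken from this kernel. In practice, a library implementation such as the `SMOTE` class in imbalanced-learn would normally be used instead.

```python
import math
import random


def smote_sketch(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points by interpolating between minority
    samples and their k nearest minority-class neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other samples
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base point.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical toy data: 2 features, 4 fraud rows vs. 12 legitimate rows.
fraud = [[1.0, 1.0], [1.2, 0.9], [0.8, 1.1], [1.1, 1.3]]
legit = [[float(i), float(-i)] for i in range(12)]

# Generate just enough synthetic fraud rows to match the majority class size.
new_fraud = smote_sketch(fraud, n_new=len(legit) - len(fraud), k=3)
balanced_fraud = fraud + new_fraud
```

Because every synthetic row is an interpolation between two real fraud rows, it always lies within the range already covered by the fraud class. Consistent with step 3, oversampling of this kind would be applied to the training data only, so that the test data keeps its original imbalance.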
Reference – Anomaly Detection using Gaussian (Normal) Distribution (Kaggle)