Built LDA and QDA models to predict fraudulent online advertising click traffic. The models use variables obtained from Principal Component Analysis (PCA) and the Kolmogorov-Smirnov (KS) test, and were tuned with leave-one-out cross-validation (LOOCV).
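The modeling approach described above can be illustrated with a minimal sketch. It is not the code in Fraud Detection.R: it assumes the MASS package is installed, uses synthetic data in place of train_50k.csv, and the names `X`, `y`, `scores`, and the choice `k <- 3` are hypothetical. It relies on the `CV = TRUE` argument of `lda()` and `qda()`, which returns leave-one-out predictions.

```r
library(MASS)  # provides lda() and qda()

set.seed(42)

# Synthetic stand-in data (NOT the project's train_50k.csv):
# 200 rows, 6 numeric features, binary label y (1 = "fraud").
n <- 200
y <- rep(c(0, 1), each = n / 2)
X <- matrix(rnorm(n * 6), nrow = n)
# Shift the first two features for class 1 so the classes are separable.
X[y == 1, 1:2] <- X[y == 1, 1:2] + 1.5

# PCA on centered and scaled features; keep the first k component scores.
pca <- prcomp(X, center = TRUE, scale. = TRUE)
k <- 3  # hypothetical number of components
scores <- as.data.frame(pca$x[, 1:k])
scores$label <- factor(y)

# CV = TRUE makes lda()/qda() return leave-one-out cross-validated
# class predictions and posterior probabilities instead of a fitted model.
lda_cv <- lda(label ~ ., data = scores, CV = TRUE)
qda_cv <- qda(label ~ ., data = scores, CV = TRUE)

lda_acc <- mean(lda_cv$class == scores$label)
qda_acc <- mean(qda_cv$class == scores$label)
cat("LOOCV accuracy  LDA:", lda_acc, " QDA:", qda_acc, "\n")
```

Comparing `lda_acc` and `qda_acc` for several values of `k` is one way LOOCV could guide how many components to keep.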
- Download both data files and the R code into the same folder
- Run Fraud Detection.R in RStudio
- Source files: train_50k.csv, ks_variables.csv
- Supporting files: Project Instructions.pdf, data dictionary.xlsx
- R code: Fraud Detection.R
- Outputs: PCA.png, PCA_cum.png, PCA_varaiance.csv, ROC_LDA_PCA.png, ROC_QDA_PCA.png, ROC_LDA_KS.png, ROC_QDA_KS.png
- Final Report: Final presentation.pptx, Final Report on Fraud Detection.pdf
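The ROC_*.png outputs listed above are ROC curves for the four model and variable-set combinations. As a rough sketch of how such a curve and its area under the curve (AUC) could be computed in base R, the example below uses synthetic scores; `roc_points`, `score`, and `y` are hypothetical names and do not come from Fraud Detection.R.

```r
set.seed(1)

# Synthetic example: y is the true label (1 = "fraud"), score is a
# hypothetical model output where higher means more likely fraud.
y <- rep(c(0, 1), each = 50)
score <- rnorm(100, mean = y)

# Compute ROC points by sweeping the threshold from high to low.
# Assumes no tied scores (true for continuous synthetic scores).
roc_points <- function(score, y) {
  ord <- order(score, decreasing = TRUE)
  y_sorted <- y[ord]
  tpr <- c(0, cumsum(y_sorted == 1) / sum(y_sorted == 1))
  fpr <- c(0, cumsum(y_sorted == 0) / sum(y_sorted == 0))
  data.frame(fpr = fpr, tpr = tpr)
}

roc <- roc_points(score, y)

# Area under the curve by the trapezoidal rule.
auc <- sum(diff(roc$fpr) * (head(roc$tpr, -1) + tail(roc$tpr, -1)) / 2)
cat("AUC:", auc, "\n")

# To draw the curve: plot(roc$fpr, roc$tpr, type = "l",
#                         xlab = "False positive rate",
#                         ylab = "True positive rate")
```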
P.S. The results obtained by running the code may differ from those shown in the final report, because the uploaded dataset is only a subset of the original data due to the file size limit.
- Ian Chi
- Mahalakshmi Raghavan
- Ushita Palande
- Yu Luo
- Riddhi Malviya