#### 1. Import libraries and set working directory ####
library(readr)
library(plyr)   # load plyr before dplyr so that dplyr's verbs are not masked
library(dplyr)
library(ggplot2)
library(lubridate)
library(pwr)
setwd("C:\\Users\\sssssss\\Desktop\\TM\\")
#### 2. Read in files and merge ####
#****************************************************************************************
#
# PROJECT: 20181002
#
# MODULE: 020 - ANALYSE - PREDICTIVE MODELLING
#
# DESCRIPTION:
#
#
## 1. Set libraries and import data ####
library(data.table)
library(dplyr)
library(padr)
library(xgboost)
library(Matrix)
library(RcppRoll)
library(zoo)
library(readr)
+---------------------+--------------------+--------------------+--------------------+
| Variable Name       | Variable Type      | Variable Name      | Variable Type      |
+---------------------+--------------------+--------------------+--------------------+
| Title               | String             | Telecommuting      | Binary             |
| Location            | String             | Company Logo       | Binary             |
| Department          | String             | Questions          | Binary             |
| Salary range        | String             | Fraudulent         | Binary             |
| Company profile     | String             | In balanced        | Binary             |
| Description         | String             | Employment Type    | Categorical/Factor |
| Requirements        | String             | Benefits           | String             |
+------------+---------------------+
| Model Type | Model Accuracy(AUC) |
+------------+---------------------+
| DRF        | 0.962               |
| GBM        | 0.882               |
| GLM        | 0.928               |
+------------+---------------------+
+----------------+----------------+------------+-----------------+
|                | Predicted      |            |                 |
+----------------+----------------+------------+-----------------+
| Actual         | Non-Fraudulent | Fraudulent | Error Rate      |
| Non-Fraudulent | 297            | 63         | 17.5% (63/360)  |
| Fraudulent     | 29             | 327        | 8.15% (29/356)  |
| Total          | 326            | 390        | 12.85% (92/716) |
+----------------+----------------+------------+-----------------+
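The error-rate column can be recomputed directly from the four cell counts. The snippet below is a minimal sketch, not the original analysis code: the object names (`cm`, `class_error`, `overall_error`) are illustrative, and the only inputs are the counts shown in the table.

```r
# Counts copied from the confusion matrix above
# (rows = actual class, columns = predicted class).
cm <- matrix(
  c(297,  63,
     29, 327),
  nrow = 2, byrow = TRUE,
  dimnames = list(
    actual    = c("Non-Fraudulent", "Fraudulent"),
    predicted = c("Non-Fraudulent", "Fraudulent")
  )
)

# Per-class error rate: misclassified count divided by the row (actual) total.
class_error <- 1 - diag(cm) / rowSums(cm)

# Overall error rate: all misclassified postings divided by all postings.
overall_error <- 1 - sum(diag(cm)) / sum(cm)

# Column totals reproduce the "Total" row of the table.
predicted_totals <- colSums(cm)

print(round(100 * class_error, 2))
print(round(100 * overall_error, 2))
print(predicted_totals)
```

Each row's error rate uses the actual-class total as its denominator, while the overall rate divides the 92 misclassified postings by all 716.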
+---------------------+----------------------------+
| Predictor           | Rank (Variable Importance) |
+---------------------+----------------------------+
| Location            | 1                          |
| Company logo        | 2                          |
| Industry            | 3                          |
| Function            | (a) 4 (b) 5                |
| Salary range        | (a) 5 (b) 8                |
| Department          | (a) 6 (b) 4                |
| Required education  | (a) 7 (b) 6                |
+----------------------------------------------+---------+--------+------+
| Predictor                                    | Coef    | OR     | Prob |
+----------------------------------------------+---------+--------+------+
| Has company logo (True)                      | -1.5688 | 0.2083 | 17%  |
| Industry – Consumer Services                 | 0.5842  | 1.7937 | 64%  |
| Has company logo (False)                     | 1.5583  | 4.7511 | 83%  |
| Required experience – Unknown                | 0.5407  | 1.7172 | 63%  |
| Required education – Bachelor’s Degree       | -1.2558 | 0.2849 | 22%  |
| Required experience – Mid senior level       | -0.5168 | 0.5964 | 37%  |
| Company function – Administrative            | 1.0324  | 2.8080 | 74%  |
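The OR and Prob columns appear to follow from the coefficients alone: the odds ratio is `exp(Coef)`, and the probability shown matches `OR / (1 + OR)`. That reading is an assumption inferred from the numbers in the table, not something the source states. The sketch below applies it to the listed coefficients; the variable names and ASCII row labels are illustrative.

```r
# Coefficients copied from the table above (labels simplified to ASCII).
coefs <- c(
  "Has company logo (True)"                = -1.5688,
  "Industry - Consumer Services"           =  0.5842,
  "Has company logo (False)"               =  1.5583,
  "Required experience - Unknown"          =  0.5407,
  "Required education - Bachelor's Degree" = -1.2558,
  "Required experience - Mid senior level" = -0.5168,
  "Company function - Administrative"      =  1.0324
)

# Odds ratio: exponentiate the logistic-regression coefficient.
odds_ratio <- exp(coefs)

# Assumed meaning of "Prob": odds converted to a probability,
# which is the same as the inverse logit, plogis(coefs).
prob <- odds_ratio / (1 + odds_ratio)

result <- data.frame(
  coef     = coefs,
  OR       = round(odds_ratio, 4),
  prob_pct = round(100 * prob)
)
print(result)
```

Because the published coefficients are rounded to four decimals, the recomputed odds ratios can differ from the table in the last digit.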
+----+--------------+-----------------+--------------------------+---------------+
| ID | Income Level | Education Level | Number of Family Members | Outcome       |
+----+--------------+-----------------+--------------------------+---------------+
| 1  | <= 75,000    | <= 12 years     | <= 4                     | Purchaser     |
| 2  | > 75,000     | > 12 years      | <= 4                     | Non-Purchaser |
| 3  | <= 75,000    | <= 12 years     | <= 4                     | Purchaser     |
| 4  | <= 75,000    | > 12 years      | > 4                      | Purchaser     |
| 5  | <= 75,000    | <= 12 years     | > 4                      | Non-Purchaser |
| 6  | > 75,000     | > 12 years      | > 4                      | Purchaser     |
| 7  | > 75,000     | <= 12 years     | > 4                      | Non-Purchaser |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------+
| Advantages | Disadvantages |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------+
| Easy to Understand |