# list and drop columns that are less related to the target, based on my judgment
cols_to_drop = ['duration', 'emp.var.rate', 'cons.price.idx', 'cons.conf.idx', 'euribor3m', 'nr.employed']
# at the same time, rename the columns so they are understandable. Please read the UCI page (https://archive.ics.uci.edu/ml/datasets/bank+marketing) for details
df = df.drop(columns=cols_to_drop).rename(columns={'job': 'job_type', 'default': 'default_status',
                                                   'housing': 'housing_loan_status', 'loan': 'personal_loan_status',
                                                   'contact': 'contact_type', 'month': 'contact_month',
                                                   'day_of_week': 'contact_day_of_week', 'campaign': 'num_contacts',
                                                   'pdays': 'days_last_contact', 'previous': 'previous_contacts',
                                                   'poutcome': 'previous_outcome'})
import pandas as pd
# please use the dataset bank-additional.zip and extract it
df = pd.read_csv('bank-additional/bank-additional/bank-additional-full.csv', delimiter=';')
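# forecast as many steps ahead as there are test observations
forecast_test_auto = auto_arima.predict(n_periods=len(df_test))
# pad the training period with None so the forecast lines up with the test period
df['forecast_auto'] = [None]*len(df_train) + list(forecast_test_auto)
df.plot()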
# assumes numpy (as np) and the sklearn.metrics functions are already imported
mae = mean_absolute_error(df_test, forecast_test_auto)
mape = mean_absolute_percentage_error(df_test, forecast_test_auto)
rmse = np.sqrt(mean_squared_error(df_test, forecast_test_auto))
print(f'mae - auto: {mae}')
print(f'mape - auto: {mape}')
print(f'rmse - auto: {rmse}')
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_absolute_percentage_error, mean_squared_error
mae = mean_absolute_error(df_test, forecast_test)
mape = mean_absolute_percentage_error(df_test, forecast_test)
rmse = np.sqrt(mean_squared_error(df_test, forecast_test))
print(f'mae - manual: {mae}')
print(f'mape - manual: {mape}')
print(f'rmse - manual: {rmse}')
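As a cross-check on the three metrics printed above, here is a minimal sketch of what they compute, written in plain NumPy. The helper names `mae`, `mape` and `rmse` and the toy numbers are hypothetical, chosen only for illustration; they are not part of the original snippets. Note that scikit-learn's `mean_absolute_percentage_error` returns a fraction (0.05 means 5%), and this sketch follows the same convention while omitting scikit-learn's guard against zero targets.

```python
import numpy as np

def mae(y_true, y_pred):
    # mean absolute error: average size of the errors, in the units of the data
    y_true, y_pred = np.asarray(y_true, dtype=float), np.asarray(y_pred, dtype=float)
    return np.mean(np.abs(y_true - y_pred))

def mape(y_true, y_pred):
    # mean absolute percentage error, returned as a fraction (not multiplied by 100)
    # assumption: y_true contains no zeros
    y_true, y_pred = np.asarray(y_true, dtype=float), np.asarray(y_pred, dtype=float)
    return np.mean(np.abs((y_true - y_pred) / y_true))

def rmse(y_true, y_pred):
    # root mean squared error: penalises large errors more heavily than MAE
    y_true, y_pred = np.asarray(y_true, dtype=float), np.asarray(y_pred, dtype=float)
    return np.sqrt(np.mean((y_true - y_pred) ** 2))

# made-up values, for illustration only
y_true = [100.0, 200.0, 400.0]
y_pred = [110.0, 180.0, 400.0]

print(f'mae:  {mae(y_true, y_pred)}')
print(f'mape: {mape(y_true, y_pred)}')
print(f'rmse: {rmse(y_true, y_pred)}')
```

Because RMSE squares the errors before averaging, it is never smaller than MAE on the same data; a large gap between the two suggests a few large forecast misses.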
auto_arima.summary()
import pmdarima as pm
# stepwise=False searches the full grid of candidate orders; seasonal=False fits a non-seasonal ARIMA
auto_arima = pm.auto_arima(df_train, stepwise=False, seasonal=False)
auto_arima
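# forecast as many steps ahead as there are test observations
forecast_test = model_fit.forecast(len(df_test))
# pad the training period with None so the forecast lines up with the test period
df['forecast_manual'] = [None]*len(df_train) + list(forecast_test)
df.plot()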
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf
acf_res = plot_acf(residuals)
pacf_res = plot_pacf(residuals)
import matplotlib.pyplot as plt
# skip the first residual before plotting
residuals = model_fit.resid[1:]
# left panel: residuals over time; right panel: their estimated density
fig, ax = plt.subplots(1, 2)
residuals.plot(title='Residuals', ax=ax[0])
residuals.plot(title='Density', kind='kde', ax=ax[1])
plt.show()