Anthony Carlleston de Lima carlleston

@carlleston
carlleston / 3-ln_model.ipynb
Last active August 23, 2020 12:09
pre-processing and linear model in pyspark
@carlleston
carlleston / 2-ETL_news_Studio.ipynb
Created July 28, 2020 12:57
Extract, transform, classify the news and load.
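# sklearn.externals.six is no longer available in recent scikit-learn releases;
# the standard-library StringIO works the same way here.
from io import StringIO
from IPython.display import Image
from sklearn.tree import export_graphviz
import pydotplus
import os
# Windows/conda layout: add the Graphviz binaries to PATH so pydotplus can find `dot`.
os.environ['PATH'] = os.environ['PATH'] + ';' + os.environ['CONDA_PREFIX'] + r"\Library\bin\graphviz"
# Export the first tree of the fitted random forest (see `pipe`) to DOT text.
dot_data = StringIO()
export_graphviz(pipe.named_steps['regressor'].estimators_[0], out_file=dot_data)
graph = pydotplus.graph_from_dot_data(dot_data.getvalue())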
conda install pydotplus
conda install graphviz
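from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestRegressor
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
# A deliberately small forest (3 shallow trees) so individual trees stay readable.
regr = RandomForestRegressor(random_state=100, bootstrap=True, max_depth=2, max_features=2,
                             min_samples_leaf=3, min_samples_split=5, n_estimators=3)
# Scale the features, project them with PCA, then fit the regressor.
pipe = Pipeline([
    ('scaler', StandardScaler()),
    ('reduce_dim', PCA()),
    ('regressor', regr)
])
pipe.fit(X_train, y_train)
ypipe = pipe.predict(X_test)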
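from sklearn.model_selection import train_test_split
# `df` is a pandas DataFrame with GDP, UnemploymentRate and Revenue columns.
X = df[['GDP', 'UnemploymentRate']]
y = df['Revenue']
# Hold out 20% of the rows for testing.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)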
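The snippets above are listed newest first, so they read in the reverse of the order they would run. The following is a minimal sketch of how they might fit together end to end. The synthetic `df` is a hypothetical stand-in (the original data is not shown), and its column relationships are invented purely so the code runs.

```python
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Hypothetical stand-in for the original `df`; values are synthetic.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "GDP": rng.normal(100.0, 10.0, size=50),
    "UnemploymentRate": rng.normal(8.0, 2.0, size=50),
})
df["Revenue"] = (3.0 * df["GDP"] - 5.0 * df["UnemploymentRate"]
                 + rng.normal(0.0, 1.0, size=50))

# Step 1: select features and target, then split (as in the last snippet).
X = df[["GDP", "UnemploymentRate"]]
y = df["Revenue"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Step 2: build and fit the scaler -> PCA -> random forest pipeline.
regr = RandomForestRegressor(
    random_state=100, bootstrap=True, max_depth=2, max_features=2,
    min_samples_leaf=3, min_samples_split=5, n_estimators=3,
)
pipe = Pipeline([
    ("scaler", StandardScaler()),
    ("reduce_dim", PCA()),
    ("regressor", regr),
])
pipe.fit(X_train, y_train)
ypipe = pipe.predict(X_test)

# Step 3: the first fitted tree is what the export_graphviz snippet visualizes.
first_tree = pipe.named_steps["regressor"].estimators_[0]
```

The Graphviz export is left out of this sketch because it depends on a local Graphviz install; `first_tree` is the object that would be passed to `export_graphviz`.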