{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Construyendo un Random Forest Classifier"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Partición entre Training y Test Set "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Procedemos a importar las librerías que van a ser necesarias, tanto para el modelo de `Random Forest` como para la partición del dataset de cara al entrenamiento."
]
},
{
"cell_type": "code",
"execution_count": 21,
"metadata": {},
"outputs": [],
"source": [
"from sklearn.model_selection import train_test_split\n",
"from sklearn.ensemble import RandomForestClassifier"
]
},
{
"cell_type": "code",
"execution_count": 22,
"metadata": {},
"outputs": [],
"source": [
"labels = df['class']\n",
"features = df.drop(['class', 'objid', 'run', 'rerun', \n",
" 'camcol', 'field', 'specobjid', 'plate',\n",
" 'mjd', 'fiberid', 'redshift'], axis=1)"
]
},
{
"cell_type": "code",
"execution_count": 23,
"metadata": {
"scrolled": true
},
"outputs": [],
"source": [
"x_train, x_test, y_train, y_test = train_test_split(features, labels, test_size=0.3, random_state=123, stratify=labels)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Llamamos al clasificador:"
]
},
{
"cell_type": "code",
"execution_count": 24,
"metadata": {},
"outputs": [],
"source": [
"clf = RandomForestClassifier()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Feature Scaling"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Estandarizamos nuestro dataset en el intervalo $[-1, 1]$ con `minMaxScaler()` "
]
},
{
"cell_type": "code",
"execution_count": 25,
"metadata": {},
"outputs": [],
"source": [
"from sklearn.preprocessing import MinMaxScaler\n",
"scaling = MinMaxScaler(feature_range=(-1,1)).fit(x_train)\n",
"x_train_scaled = scaling.transform(x_train)\n",
"x_test_scaled = scaling.transform(x_test)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.7"
},
"latex_envs": {
"LaTeX_envs_menu_present": true,
"autoclose": false,
"autocomplete": true,
"bibliofile": "biblio.bib",
"cite_by": "apalike",
"current_citInitial": 1,
"eqLabelWithNumbers": true,
"eqNumInitial": 1,
"hotkeys": {
"equation": "Ctrl-E",
"itemize": "Ctrl-I"
},
"labels_anchors": false,
"latex_user_defs": false,
"report_style_numbering": false,
"user_envs_cfg": false
},
"toc": {
"base_numbering": 1,
"nav_menu": {},
"number_sections": true,
"sideBar": true,
"skip_h1_title": false,
"title_cell": "Table of Contents",
"title_sidebar": "Contents",
"toc_cell": true,
"toc_position": {},
"toc_section_display": true,
"toc_window_display": false
}
},
"nbformat": 4,
"nbformat_minor": 2
}