omairaasim/sklearn_exercise_07.ipynb

## sklearn_exercise_07.ipynb
{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# How to use k-fold cross-validation in sklearn?\n",
    "1. `KFold` is scikit-learn's K-Folds cross-validator.\n",
    "2. The k-fold cross-validation procedure estimates the performance of an ML algorithm on a dataset. It divides a limited dataset into k non-overlapping folds. A total of k models are fit and evaluated on the k hold-out test sets, and the mean performance is reported.\n",
    "3. Each fold is used once as the validation set while the k - 1 remaining folds form the training set."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Step 1: Import Libraries"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "from sklearn.model_selection import KFold\n",
    "import numpy as np"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "# create the range 1 to 25\n",
    "rn = range(1,26)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Step 2: Creating Folds"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [],
   "source": [
    "# To demonstrate how the data are split, we create a 3-fold and a 5-fold splitter.\n",
    "# Their split() method yields the positions (indices) of the train and test samples.\n",
    "kf5 = KFold(n_splits=5, shuffle=False)\n",
    "kf3 = KFold(n_splits=3, shuffle=False)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[ 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24] [0 1 2 3 4 5 6 7 8]\n",
      "[ 0  1  2  3  4  5  6  7  8 17 18 19 20 21 22 23 24] [ 9 10 11 12 13 14 15 16]\n",
      "[ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16] [17 18 19 20 21 22 23 24]\n"
     ]
    }
   ],
   "source": [
    "# KFold.split() returns indices into the data, not the values. Our range holds 1 to 25, so the indices run from 0 to 24.\n",
    "for train_index, test_index in kf3.split(rn):\n",
    "    print(train_index, test_index)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25] [1 2 3 4 5 6 7 8 9]\n",
      "[ 1  2  3  4  5  6  7  8  9 18 19 20 21 22 23 24 25] [10 11 12 13 14 15 16 17]\n",
      "[ 1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17] [18 19 20 21 22 23 24 25]\n"
     ]
    }
   ],
   "source": [
    "# To get the values from our data, we use np.take() to look up the elements at the given indices.\n",
    "for train_index, test_index in kf3.split(rn):\n",
    "    print(np.take(rn,train_index), np.take(rn,test_index))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
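
The notebook only prints the fold indices, so the "k models are fit and the mean performance is reported" step from the introduction is not shown. A minimal sketch of that step follows, assuming a hypothetical toy regression target built on the same 1 to 25 range and `LinearRegression` as a stand-in model; neither the target nor the model appears in the notebook, and the names `X`, `y`, `fold_scores`, and `test_positions` are illustrative only.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold

# Hypothetical toy data (not from the notebook): y = 2x + 1 plus a little noise,
# built on the same 1..25 range the notebook uses.
rng = np.random.default_rng(0)
X = np.arange(1, 26, dtype=float).reshape(-1, 1)
y = 2.0 * X.ravel() + 1.0 + rng.normal(scale=0.5, size=X.shape[0])

# Same splitter configuration as kf5 in the notebook.
kf5 = KFold(n_splits=5, shuffle=False)

fold_scores = []      # one R^2 score per hold-out fold
test_positions = []   # every index that was ever used for testing

for train_index, test_index in kf5.split(X):
    # Fit a fresh model on the k - 1 training folds.
    model = LinearRegression()
    model.fit(X[train_index], y[train_index])

    # Evaluate on the single hold-out fold.
    fold_scores.append(model.score(X[test_index], y[test_index]))
    test_positions.extend(test_index.tolist())

# Report the mean performance across the k folds.
mean_score = float(np.mean(fold_scores))
print("R^2 per fold:", np.round(fold_scores, 3))
print("mean R^2:", round(mean_score, 3))
```

Collecting `test_positions` is only there to make the "non-overlapping folds" property checkable: after the loop, each of the 25 indices should appear exactly once. For routine use, `sklearn.model_selection.cross_val_score` wraps the same loop in a single call.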