bfarzin/Pytorch DataLoader Speed.ipynb

## Pytorch DataLoader Speed.ipynb
{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Testing Data Loader Speed"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0.5.0a0+8f3653f\n"
     ]
    }
   ],
   "source": [
    "import copy\n",
    "import pandas as pd\n",
    "import numpy as np\n",
    "gSeed=154\n",
    "np.random.seed(gSeed) # for reproducibility\n",
    "\n",
    "\n",
    "import torch\n",
    "import torch.nn as nn\n",
    "import torch.utils.data as datautils\n",
    "import torch.nn.functional as F\n",
    "from torch.autograd import Variable\n",
    "torch.manual_seed(gSeed)\n",
    "print(torch.__version__)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The time per hit (per batch) stays the same regardless of the data size"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "(2096279, 16) (2096279, 1)\n"
     ]
    }
   ],
   "source": [
    "data_length = 2096279\n",
    "X = np.random.rand(data_length,16)\n",
    "y = np.random.rand(data_length,1)\n",
    "print(X.shape,y.shape)\n",
    "\n",
    "# convert to float32 tensors in one step instead of stacking row by row\n",
    "tensor_X = torch.from_numpy(X).float()\n",
    "tensor_y = torch.from_numpy(y).float()\n",
    "\n",
    "batch_size = 256\n",
    "# note: casting uniform [0,1) labels to int64 truncates them all to 0; fine for a timing test\n",
    "my_dataset = datautils.TensorDataset(tensor_X,tensor_y.to(torch.int64)) # create your dataset\n",
    "data_loader = datautils.DataLoader(my_dataset,batch_size=batch_size,shuffle=True) # create your dataloader\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [],
   "source": [
    "def total_data_loop(data_loader):\n",
    "    for (feat,label) in data_loader:\n",
    "        pass\n",
    "    return 0"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [],
   "source": [
    "%load_ext line_profiler"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [],
   "source": [
    "%lprun -f total_data_loop total_data_loop(data_loader)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Timer unit: 1e-06 s\n",
    "\n",
    "Total time: 20.8734 s\n",
    "File: <ipython-input-10-ed5e115f2b3e>\n",
    "Function: total_data_loop at line 1\n",
    "\n",
    "Line #      Hits         Time  Per Hit   % Time  Line Contents\n",
    "==============================================================\n",
    "     1                                           def total_data_loop(data_loader):\n",
    "     2      8190   20866215.0   2547.8    100.0      for (feat,label) in data_loader:\n",
    "     3      8189       7175.0      0.9      0.0          pass\n",
    "     4         1          0.0      0.0      0.0      return 0"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "* Time goes up/down with batch size (per hit) but the total is about the same\n",
    "\n",
    "So the problem is not related to the incoming data set.  It must have to do with the \\__next\\__ call?"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [],
   "source": [
    "%lprun -f total_data_loop total_data_loop(data_loader)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Timer unit: 1e-06 s\n",
    "\n",
    "Total time: 15.7686 s\n",
    "File: <ipython-input-3-ed5e115f2b3e>\n",
    "Function: total_data_loop at line 1\n",
    "\n",
    "Line #      Hits         Time  Per Hit   % Time  Line Contents\n",
    "==============================================================\n",
    "     1                                           def total_data_loop(data_loader):\n",
    "     2      8190   15756034.0   1923.8     99.9      for (feat,label) in data_loader:\n",
    "     3      8189      12559.0      1.5      0.1          pass\n",
    "     4         1          3.0      3.0      0.0      return 0"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Timing to permute the data (fast on a per-epoch basis)\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "57.1 ms ± 1.23 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)\n"
     ]
    }
   ],
   "source": [
    "%timeit permed = np.random.permutation(np.arange(len(tensor_X)))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "61.4 ms ± 647 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)\n"
     ]
    }
   ],
   "source": [
    "permed = np.random.permutation(np.arange(len(tensor_X)))\n",
    "%timeit tensor_X[torch.from_numpy(permed),:]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Custom Dataloader\n",
    "Try a custom dataloader that does what we expect (and confirm it is a similar speed)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [],
   "source": [
    "from torch.utils.data import RandomSampler,SequentialSampler,BatchSampler\n",
    "from torch.utils.data.dataloader import default_collate"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Showing the steps of how the data_loader works (from the code, simplified)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "8189\n",
      "[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189, 190, 191, 192, 193, 194, 195, 196, 197, 198, 199, 200, 201, 202, 203, 204, 205, 206, 207, 208, 209, 210, 211, 212, 213, 214, 215, 216, 217, 218, 219, 220, 221, 222, 223, 224, 225, 226, 227, 228, 229, 230, 231, 232, 233, 234, 235, 236, 237, 238, 239, 240, 241, 242, 243, 244, 245, 246, 247, 248, 249, 250, 251, 252, 253, 254, 255]\n",
      "[tensor([[0.5077, 0.6306, 0.6856,  ..., 0.5328, 0.6881, 0.8779],\n",
      "        [0.8011, 0.2601, 0.1394,  ..., 0.2662, 0.3528, 0.0364],\n",
      "        [0.6469, 0.7477, 0.7283,  ..., 0.4790, 0.2592, 0.1350],\n",
      "        ...,\n",
      "        [0.3677, 0.8976, 0.6331,  ..., 0.2510, 0.8928, 0.1734],\n",
      "        [0.0917, 0.1114, 0.8487,  ..., 0.3839, 0.2294, 0.0816],\n",
      "        [0.9121, 0.6420, 0.2897,  ..., 0.9560, 0.2802, 0.8563]]), tensor([[0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0],\n",
      "        [0]])]\n"
     ]
    }
   ],
   "source": [
    "my_dataset = datautils.TensorDataset(tensor_X,tensor_y.to(torch.int64)) # create your dataset\n",
    "\n",
    "batch_sample = BatchSampler(SequentialSampler(my_dataset),256,False)\n",
    "sample_iter = iter(batch_sample)\n",
    "indices = next(sample_iter)\n",
    "print(len(batch_sample))\n",
    "print(indices)\n",
    "batch = default_collate([my_dataset[i] for i in indices])\n",
    "print(batch)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Example of a custom data loader that replicates the current data loader"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [],
   "source": [
    "## Example data loader, replicates original\n",
    "class myDataLoader(object):\n",
    "    def __init__(self,dataset,batch_size=16,shuffle=False):\n",
    "        self.dataset = dataset\n",
    "        self.batch_size = batch_size\n",
    "        self.shuffle = shuffle\n",
    "        \n",
    "        if self.shuffle:\n",
    "            self.batch_sampler = BatchSampler(RandomSampler(self.dataset),self.batch_size,False)\n",
    "        else:\n",
    "            self.batch_sampler = BatchSampler(SequentialSampler(self.dataset),self.batch_size,False)\n",
    "        self.sample_iter = iter(self.batch_sampler)  \n",
    "    def __len__(self):\n",
    "        # number of batches\n",
    "        return len(self.batch_sampler)\n",
    "    \n",
    "    def __iter__(self):\n",
    "        # restart the sampler so the loader can be reused for another epoch\n",
    "        self.sample_iter = iter(self.batch_sampler)\n",
    "        return self\n",
    "    \n",
    "    def __next__(self):\n",
    "        # fetch one sample per index, then collate them into a batch\n",
    "        indices = next(self.sample_iter)\n",
    "        batch = default_collate([self.dataset[i] for i in indices])\n",
    "        return batch"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [],
   "source": [
    "#testing\n",
    "my_data_loader = myDataLoader(my_dataset,batch_size=256,shuffle=False)\n",
    "%lprun -f total_data_loop total_data_loop(my_data_loader)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Timer unit: 1e-06 s\n",
    "\n",
    "Total time: 21.0301 s\n",
    "File: <ipython-input-3-ed5e115f2b3e>\n",
    "Function: total_data_loop at line 1\n",
    "\n",
    "Line #      Hits         Time  Per Hit   % Time  Line Contents\n",
    "==============================================================\n",
    "     1                                           def total_data_loop(data_loader):\n",
    "     2      8190   21023529.0   2567.0    100.0      for (feat,label) in data_loader:\n",
    "     3      8189       6531.0      0.8      0.0          pass\n",
    "     4         1          0.0      0.0      0.0      return 0"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "For the sequential case, you can be smarter about the indexing.\n",
    "It is a small change, but it has a big impact."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [],
   "source": [
    "class myDataLoader(object):\n",
    "    def __init__(self,dataset,batch_size=16,shuffle=False):\n",
    "        self.dataset = dataset\n",
    "        self.batch_size = batch_size\n",
    "        self.shuffle = shuffle\n",
    "        \n",
    "        if self.shuffle:\n",
    "            self.batch_sampler = BatchSampler(RandomSampler(self.dataset),self.batch_size,False)\n",
    "        else:\n",
    "            self.batch_sampler = BatchSampler(SequentialSampler(self.dataset),self.batch_size,False)\n",
    "        self.sample_iter = iter(self.batch_sampler)  \n",
    "    def __len__(self):\n",
    "        # number of batches\n",
    "        return len(self.batch_sampler)\n",
    "    \n",
    "    def __iter__(self):\n",
    "        # restart the sampler so the loader can be reused for another epoch\n",
    "        self.sample_iter = iter(self.batch_sampler)\n",
    "        return self\n",
    "    \n",
    "    def __next__(self):\n",
    "        indices = next(self.sample_iter)\n",
    "        if self.shuffle:\n",
    "            # random indices: fetch one sample at a time, then collate\n",
    "            batch = default_collate([self.dataset[i] for i in indices])\n",
    "        else:\n",
    "            # sequential indices are contiguous, so one slice fetches the whole batch\n",
    "            # (relies on TensorDataset passing the slice through to its tensors)\n",
    "            batch = list(self.dataset[indices[0]:(indices[-1] + 1)])\n",
    "        return batch"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [],
   "source": [
    "my_data_loader = myDataLoader(my_dataset,batch_size=256,shuffle=False)\n",
    "%lprun -f total_data_loop total_data_loop(my_data_loader)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Timer unit: 1e-06 s\n",
    "\n",
    "Total time: 1.40686 s\n",
    "File: <ipython-input-3-ed5e115f2b3e>\n",
    "Function: total_data_loop at line 1\n",
    "\n",
    "Line #      Hits         Time  Per Hit   % Time  Line Contents\n",
    "==============================================================\n",
    "     1                                           def total_data_loop(data_loader):\n",
    "     2      8190    1403383.0    171.4     99.8      for (feat,label) in data_loader:\n",
    "     3      8189       3473.0      0.4      0.2          pass\n",
    "     4         1          0.0      0.0      0.0      return 0"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This difference comes from how you index into the dataset: one lookup per sample versus a single slice per batch.  Timing the two indexing styles directly shows a speed-up of the same kind."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [],
   "source": [
    "indices = list(range(4))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "18.6 µs ± 376 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)\n"
     ]
    }
   ],
   "source": [
    "%timeit [my_dataset[i] for i in indices]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "5 µs ± 75.2 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)\n"
     ]
    }
   ],
   "source": [
    "%timeit my_dataset[indices[0]:indices[-1] + 1]"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python (pytorch python 3.6 CUDA 9.2",
   "language": "python",
   "name": "pytorch37_cuda92"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.0"
  },
  "varInspector": {
   "cols": {
    "lenName": 16,
    "lenType": 16,
    "lenVar": 40
   },
   "kernels_config": {
    "python": {
     "delete_cmd_postfix": "",
     "delete_cmd_prefix": "del ",
     "library": "var_list.py",
     "varRefreshCmd": "print(var_dic_list())"
    },
    "r": {
     "delete_cmd_postfix": ") ",
     "delete_cmd_prefix": "rm(",
     "library": "var_list.r",
     "varRefreshCmd": "cat(var_dic_list()) "
    }
   },
   "types_to_exclude": [
    "module",
    "function",
    "builtin_function_or_method",
    "instance",
    "_Feature"
   ],
   "window_display": false
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
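The sequential speed-up above comes from taking one slice per batch instead of one lookup per sample. Below is a minimal sketch of that idea on its own, outside the notebook. `SliceBatchLoader` is a hypothetical helper, not part of `torch.utils.data`, and plain Python lists stand in for `tensor_X` and `tensor_y` so the block runs without PyTorch. It only covers the sequential (unshuffled) case.

```python
import math


class SliceBatchLoader:
    """Yield contiguous batches by slicing, one slice per batch.

    Hypothetical helper for illustration. `arrays` can be any equal-length
    sequences that support slicing (lists here, tensors in the notebook).
    """

    def __init__(self, *arrays, batch_size=16):
        if not arrays:
            raise ValueError("need at least one array")
        n = len(arrays[0])
        if any(len(a) != n for a in arrays):
            raise ValueError("all arrays must have the same length")
        self.arrays = arrays
        self.batch_size = batch_size
        self.n = n

    def __len__(self):
        # Number of batches, counting a final partial batch.
        return math.ceil(self.n / self.batch_size)

    def __iter__(self):
        # A generator gives a fresh pass on every call, so the loader
        # can be reused across epochs.
        for start in range(0, self.n, self.batch_size):
            stop = start + self.batch_size
            yield tuple(a[start:stop] for a in self.arrays)


# Toy data standing in for tensor_X / tensor_y.
features = [[i, i + 0.5] for i in range(10)]
labels = list(range(10))

loader = SliceBatchLoader(features, labels, batch_size=4)
batches = list(loader)
print(len(loader))                    # number of batches
print([len(b[1]) for b in batches])   # size of each batch
```

With tensors in place of the lists, each slice should be a view rather than a copy, which is presumably why the sliced path in the notebook is so much cheaper than collating 256 single-sample lookups.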
	{
	"cells": [
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"### Testing Data Loader Speed"
	]
	},
	{
	"cell_type": "code",
	"execution_count": 1,
	"metadata": {},
	"outputs": [
	{
	"name": "stdout",
	"output_type": "stream",
	"text": [
	"0.5.0a0+8f3653f\n"
	]
	}
	],
	"source": [
	"import copy\n",
	"import pandas as pd\n",
	"import numpy as np\n",
	"gSeed=154\n",
	"np.random.seed(gSeed) # for reproducibility\n",
	"\n",
	"\n",
	"import torch\n",
	"import torch.nn as nn\n",
	"import torch.utils.data as datautils\n",
	"import torch.nn.functional as F\n",
	"from torch.autograd import Variable\n",
	"torch.manual_seed(gSeed)\n",
	"print(torch.__version__)"
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"Times are all the same (per hit) regardless of the data size"
	]
	},
	{
	"cell_type": "code",
	"execution_count": 2,
	"metadata": {},
	"outputs": [
	{
	"name": "stdout",
	"output_type": "stream",
	"text": [
	"(2096279, 16) (2096279, 1)\n"
	]
	}
	],
	"source": [
	"data_length = 2096279\n",
	"X = np.random.rand(data_length,16)\n",
	"y = np.random.rand(data_length,1)\n",
	"print(X.shape,y.shape)\n",
	"\n",
	"tensor_X = torch.stack([torch.Tensor(i) for i in X])\n",
	"tensor_y = torch.stack([torch.Tensor(i) for i in y])\n",
	"\n",
	"batch_size = 256\n",
	"my_dataset = datautils.TensorDataset(tensor_X,tensor_y.to(torch.int64)) # create your datset\n",
	"data_loader = datautils.DataLoader(my_dataset,batch_size=batch_size,shuffle=True) # create your dataloader\n"
	]
	},
	{
	"cell_type": "code",
	"execution_count": 3,
	"metadata": {},
	"outputs": [],
	"source": [
	"def total_data_loop(data_loader):\n",
	" for (feat,label) in data_loader:\n",
	" pass\n",
	" return 0"
	]
	},
	{
	"cell_type": "code",
	"execution_count": 4,
	"metadata": {},
	"outputs": [],
	"source": [
	"%load_ext line_profiler"
	]
	},
	{
	"cell_type": "code",
	"execution_count": 5,
	"metadata": {},
	"outputs": [],
	"source": [
	"%lprun -f total_data_loop total_data_loop(data_loader)"
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"Timer unit: 1e-06 s\n",
	"\n",
	"Total time: 20.8734 s\n",
	"File: <ipython-input-10-ed5e115f2b3e>\n",
	"Function: total_data_loop at line 1\n",
	"\n",
	"Line # Hits Time Per Hit % Time Line Contents\n",
	"==============================================================\n",
	" 1 def total_data_loop(data_loader):\n",
	" 2 8190 20866215.0 2547.8 100.0 for (feat,label) in data_loader:\n",
	" 3 8189 7175.0 0.9 0.0 pass\n",
	" 4 1 0.0 0.0 0.0 return 0"
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"* Time goes up/down with batch size (per hit) but total is about the same\n",
	"\n",
	"So the problem is not realted to the data set that is incoming. It must have to do with the \\__next\\__ call?"
	]
	},
	{
	"cell_type": "code",
	"execution_count": 6,
	"metadata": {},
	"outputs": [],
	"source": [
	"%lprun -f total_data_loop total_data_loop(data_loader)"
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"Timer unit: 1e-06 s\n",
	"\n",
	"Total time: 15.7686 s\n",
	"File: <ipython-input-3-ed5e115f2b3e>\n",
	"Function: total_data_loop at line 1\n",
	"\n",
	"Line # Hits Time Per Hit % Time Line Contents\n",
	"==============================================================\n",
	" 1 def total_data_loop(data_loader):\n",
	" 2 8190 15756034.0 1923.8 99.9 for (feat,label) in data_loader:\n",
	" 3 8189 12559.0 1.5 0.1 pass\n",
	" 4 1 3.0 3.0 0.0 return 0"
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"### timing to permute the data (fast on a per-epoch basis)\n"
	]
	},
	{
	"cell_type": "code",
	"execution_count": 7,
	"metadata": {},
	"outputs": [
	{
	"name": "stdout",
	"output_type": "stream",
	"text": [
	"57.1 ms ± 1.23 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)\n"
	]
	}
	],
	"source": [
	"%timeit permed = np.random.permutation(np.arange(len(tensor_X)))"
	]
	},
	{
	"cell_type": "code",
	"execution_count": 8,
	"metadata": {},
	"outputs": [
	{
	"name": "stdout",
	"output_type": "stream",
	"text": [
	"61.4 ms ± 647 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)\n"
	]
	}
	],
	"source": [
	"permed = np.random.permutation(np.arange(len(tensor_X)))\n",
	"%timeit tensor_X[torch.from_numpy(permed),:]"
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"### Custom Dataloader\n",
	"Try a custom dataloader that does what we expect (and confirm it is a similar speed)"
	]
	},
	{
	"cell_type": "code",
	"execution_count": 9,
	"metadata": {},
	"outputs": [],
	"source": [
	"from torch.utils.data import RandomSampler,SequentialSampler,BatchSampler\n",
	"from torch.utils.data.dataloader import default_collate"
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"Showing steps of how the data_loader works (from the code, simplified"
	]
	},
	{
	"cell_type": "code",
	"execution_count": 10,
	"metadata": {},
	"outputs": [
	{
	"name": "stdout",
	"output_type": "stream",
	"text": [
	"8189\n",
	"[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189, 190, 191, 192, 193, 194, 195, 196, 197, 198, 199, 200, 201, 202, 203, 204, 205, 206, 207, 208, 209, 210, 211, 212, 213, 214, 215, 216, 217, 218, 219, 220, 221, 222, 223, 224, 225, 226, 227, 228, 229, 230, 231, 232, 233, 234, 235, 236, 237, 238, 239, 240, 241, 242, 243, 244, 245, 246, 247, 248, 249, 250, 251, 252, 253, 254, 255]\n",
	"[tensor([[0.5077, 0.6306, 0.6856, ..., 0.5328, 0.6881, 0.8779],\n",
	" [0.8011, 0.2601, 0.1394, ..., 0.2662, 0.3528, 0.0364],\n",
	" [0.6469, 0.7477, 0.7283, ..., 0.4790, 0.2592, 0.1350],\n",
	" ...,\n",
	" [0.3677, 0.8976, 0.6331, ..., 0.2510, 0.8928, 0.1734],\n",
	" [0.0917, 0.1114, 0.8487, ..., 0.3839, 0.2294, 0.0816],\n",
	" [0.9121, 0.6420, 0.2897, ..., 0.9560, 0.2802, 0.8563]]), tensor([[0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0],\n",
	" [0]])]\n"
	]
	}
	],
	"source": [
	"my_dataset = datautils.TensorDataset(tensor_X,tensor_y.to(torch.int64)) # create your datset\n",
	"\n",
	"batch_sample = BatchSampler(SequentialSampler(my_dataset),256,False)\n",
	"sample_iter = iter(batch_sample)\n",
	"indices = next(sample_iter)\n",
	"print(len(batch_sample))\n",
	"print(indices)\n",
	"batch = default_collate([my_dataset[i] for i in indices])\n",
	"print(batch)"
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"Example of a custom data loader, replicates the current data loader"
	]
	},
	{
	"cell_type": "code",
	"execution_count": 11,
	"metadata": {},
	"outputs": [],
	"source": [
	"## Example data loader, replicates the original\n",
	"class myDataLoader(object):\n",
	"    def __init__(self,dataset,batch_size=16,shuffle=False):\n",
	"        self.dataset = dataset\n",
	"        self.batch_size = batch_size\n",
	"        self.shuffle = shuffle\n",
	"\n",
	"        if self.shuffle:\n",
	"            self.batch_sampler = BatchSampler(RandomSampler(self.dataset),self.batch_size,False)\n",
	"        else:\n",
	"            self.batch_sampler = BatchSampler(SequentialSampler(self.dataset),self.batch_size,False)\n",
	"        self.sample_iter = iter(self.batch_sampler)\n",
	"\n",
	"    def __len__(self):\n",
	"        # number of batches\n",
	"        return len(self.batch_sampler)\n",
	"\n",
	"    def __iter__(self):\n",
	"        # restart the sampler so the loader can be iterated more than once\n",
	"        self.sample_iter = iter(self.batch_sampler)\n",
	"        return self\n",
	"\n",
	"    def __next__(self):\n",
	"        # next() raises StopIteration once the sampler is exhausted\n",
	"        indices = next(self.sample_iter)\n",
	"        batch = default_collate([self.dataset[i] for i in indices])\n",
	"        return batch"
	]
	},
	{
	"cell_type": "code",
	"execution_count": 12,
	"metadata": {},
	"outputs": [],
	"source": [
	"#testing\n",
	"my_data_loader = myDataLoader(my_dataset,batch_size=256,shuffle=False)\n",
	"%lprun -f total_data_loop total_data_loop(my_data_loader)"
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"Timer unit: 1e-06 s\n",
	"\n",
	"Total time: 21.0301 s\n",
	"\n",
	"File: `<ipython-input-3-ed5e115f2b3e>`\n",
	"\n",
	"Function: `total_data_loop` at line 1\n",
	"\n",
	"| Line # | Hits | Time | Per Hit | % Time | Line Contents |\n",
	"|---|---|---|---|---|---|\n",
	"| 1 | | | | | `def total_data_loop(data_loader):` |\n",
	"| 2 | 8190 | 21023529.0 | 2567.0 | 100.0 | `for (feat,label) in data_loader:` |\n",
	"| 3 | 8189 | 6531.0 | 0.8 | 0.0 | `pass` |\n",
	"| 4 | 1 | 0.0 | 0.0 | 0.0 | `return 0` |"
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"For the sequential case, you can be smarter about the indexing: take one slice of the dataset per batch instead of fetching items one at a time.\n",
	"It is a small change, but it has a big impact."
	]
	},
	{
	"cell_type": "code",
	"execution_count": 13,
	"metadata": {},
	"outputs": [],
	"source": [
	"class myDataLoader(object):\n",
	"    def __init__(self,dataset,batch_size=16,shuffle=False):\n",
	"        self.dataset = dataset\n",
	"        self.batch_size = batch_size\n",
	"        self.shuffle = shuffle\n",
	"\n",
	"        if self.shuffle:\n",
	"            self.batch_sampler = BatchSampler(RandomSampler(self.dataset),self.batch_size,False)\n",
	"        else:\n",
	"            self.batch_sampler = BatchSampler(SequentialSampler(self.dataset),self.batch_size,False)\n",
	"        self.sample_iter = iter(self.batch_sampler)\n",
	"\n",
	"    def __len__(self):\n",
	"        # number of batches\n",
	"        return len(self.batch_sampler)\n",
	"\n",
	"    def __iter__(self):\n",
	"        # restart the sampler so the loader can be iterated more than once\n",
	"        self.sample_iter = iter(self.batch_sampler)\n",
	"        return self\n",
	"\n",
	"    def __next__(self):\n",
	"        indices = next(self.sample_iter)\n",
	"        if self.shuffle:\n",
	"            batch = default_collate([self.dataset[i] for i in indices])\n",
	"        else:\n",
	"            # sequential indices are contiguous, so one slice fetches the whole batch\n",
	"            batch = [x for x in self.dataset[indices[0]:(indices[-1] + 1)]]\n",
	"        return batch"
	]
	},
	{
	"cell_type": "code",
	"execution_count": 14,
	"metadata": {},
	"outputs": [],
	"source": [
	"my_data_loader = myDataLoader(my_dataset,batch_size=256,shuffle=False)\n",
	"%lprun -f total_data_loop total_data_loop(my_data_loader)"
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"Timer unit: 1e-06 s\n",
	"\n",
	"Total time: 1.40686 s\n",
	"\n",
	"File: `<ipython-input-3-ed5e115f2b3e>`\n",
	"\n",
	"Function: `total_data_loop` at line 1\n",
	"\n",
	"| Line # | Hits | Time | Per Hit | % Time | Line Contents |\n",
	"|---|---|---|---|---|---|\n",
	"| 1 | | | | | `def total_data_loop(data_loader):` |\n",
	"| 2 | 8190 | 1403383.0 | 171.4 | 99.8 | `for (feat,label) in data_loader:` |\n",
	"| 3 | 8189 | 3473.0 | 0.4 | 0.2 | `pass` |\n",
	"| 4 | 1 | 0.0 | 0.0 | 0.0 | `return 0` |"
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"The difference comes from how you index into the dataset: one slice per batch instead of one lookup per item. Timing the two indexing styles directly on a small batch shows the same effect."
	]
	},
	{
	"cell_type": "code",
	"execution_count": 15,
	"metadata": {},
	"outputs": [],
	"source": [
	"indices = list(range(4))"
	]
	},
	{
	"cell_type": "code",
	"execution_count": 16,
	"metadata": {},
	"outputs": [
	{
	"name": "stdout",
	"output_type": "stream",
	"text": [
	"18.6 µs ± 376 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)\n"
	]
	}
	],
	"source": [
	"%timeit [my_dataset[i] for i in indices]"
	]
	},
	{
	"cell_type": "code",
	"execution_count": 17,
	"metadata": {},
	"outputs": [
	{
	"name": "stdout",
	"output_type": "stream",
	"text": [
	"5 µs ± 75.2 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)\n"
	]
	}
	],
	"source": [
	"%timeit my_dataset[indices[0]:indices[-1]]"
	]
	}
	],
	"metadata": {
	"kernelspec": {
	"display_name": "Python (pytorch python 3.6 CUDA 9.2",
	"language": "python",
	"name": "pytorch37_cuda92"
	},
	"language_info": {
	"codemirror_mode": {
	"name": "ipython",
	"version": 3
	},
	"file_extension": ".py",
	"mimetype": "text/x-python",
	"name": "python",
	"nbconvert_exporter": "python",
	"pygments_lexer": "ipython3",
	"version": "3.7.0"
	},
	"varInspector": {
	"cols": {
	"lenName": 16,
	"lenType": 16,
	"lenVar": 40
	},
	"kernels_config": {
	"python": {
	"delete_cmd_postfix": "",
	"delete_cmd_prefix": "del ",
	"library": "var_list.py",
	"varRefreshCmd": "print(var_dic_list())"
	},
	"r": {
	"delete_cmd_postfix": ") ",
	"delete_cmd_prefix": "rm(",
	"library": "var_list.r",
	"varRefreshCmd": "cat(var_dic_list()) "
	}
	},
	"types_to_exclude": [
	"module",
	"function",
	"builtin_function_or_method",
	"instance",
	"_Feature"
	],
	"window_display": false
	}
	},
	"nbformat": 4,
	"nbformat_minor": 2
	}