Anirban Das (akaanirban)
  • Capital One
  • San Mateo, CA
@akaanirban
akaanirban / README.md
Created November 8, 2023 18:39
Reverse Engineer docker image to build Dockerfile using docker history

The following shows how you can approximately reverse engineer a Docker image to recover a Dockerfile using docker history. You can use a more sophisticated tool like dive, but that has its own problems.

Let's assume you have the image nvcr.io/nvidia/pytorch:23.10-py3.

Run the following command to create a semi-correct Dockerfile: docker history --no-trunc nvcr.io/nvidia/pytorch:23.10-py3 --format '{{ .CreatedBy }}' | tail -r > Dockerfile (note: tail -r is BSD/macOS-only; on Linux, pipe through tac instead).

The generated Dockerfile begins with lines like:

/bin/sh -c #(nop)  ARG RELEASE
/bin/sh -c #(nop)  ARG LAUNCHPAD_BUILD_ARCH
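The only non-obvious step in the pipeline is the line reversal: docker history prints the newest layer first, while a Dockerfile reads oldest first. A minimal sketch of that step with made-up layer commands (tac is the GNU equivalent of BSD's tail -r):

```shell
# Sample input stands in for `docker history --no-trunc <image> --format '{{ .CreatedBy }}'`.
# Reversing puts the oldest layer command (FROM) first, as a Dockerfile expects.
printf 'CMD ["bash"]\nRUN apt-get update\nFROM ubuntu:22.04\n' | tac
```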
@akaanirban
akaanirban / README.md
Created September 26, 2023 03:40
Download Models from Huggingface to a local directory and exclude `*.bin` files

Run the following code to download a model from Hugging Face to a local directory, excluding *.bin files:

import huggingface_hub

huggingface_hub.snapshot_download(
    repo_id="meta-llama/Llama-2-7b-chat-hf",
    local_dir="./meta-llama_Llama-2-7b-chat-hf",
    local_dir_use_symlinks=False,
    resume_download=True,
    ignore_patterns=["*.msgpack", "*.h5", "*.bin"],
)
print("done")
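For intuition, ignore_patterns behaves like shell-style globs matched against repo file names. A small illustrative sketch (the file list is made up; huggingface_hub does its own matching internally):

```python
import fnmatch

# Hypothetical repo listing; files matching any ignore pattern are skipped.
files = ["pytorch_model.bin", "model.safetensors", "tokenizer.json", "tf_model.h5"]
ignore = ["*.msgpack", "*.h5", "*.bin"]
kept = [f for f in files if not any(fnmatch.fnmatch(f, p) for p in ignore)]
print(kept)  # ['model.safetensors', 'tokenizer.json']
```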
@akaanirban
akaanirban / parallel.py
Created August 31, 2021 02:33 — forked from thomwolf/parallel.py
Data Parallelism in PyTorch for modules and losses
##+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
## Created by: Hang Zhang, Rutgers University, Email: zhang.hang@rutgers.edu
## Modified by Thomas Wolf, HuggingFace Inc., Email: thomas@huggingface.co
## Copyright (c) 2017-2018
##
## This source code is licensed under the MIT-style license found in the
## LICENSE file in the root directory of this source tree
##+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
"""Encoding Data Parallel"""
@akaanirban
akaanirban / different-ways-to-perform-gradient-accumulation.ipynb
Created August 30, 2021 21:26
Different ways to perform gradient accumulation.ipynb
@akaanirban
akaanirban / spacy_preprocessor.py
Created August 5, 2021 14:52 — forked from omri374/spacy_preprocessor.py
Text preprocessing using spaCy
import re
from typing import List
import spacy
from spacy.tokens import Doc
from tqdm import tqdm
class SpacyPreprocessor:
def __init__(
@akaanirban
akaanirban / README.md
Last active July 16, 2021 17:50
Setup script to configure a (GCP/AWS) Ubuntu VM with NVIDIA drivers and NVIDIA docker container toolkit.
#!/bin/bash

set -e

# More details on other OS in https://cloud.google.com/compute/docs/gpus/install-drivers-gpu

# install docker 
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add - 
sudo add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable" 
import dask
import dask.dataframe as dd
import pandas as pd
import numpy as np
from pandas.tseries.holiday import USFederalHolidayCalendar
import os
import time
import pyarrow.dataset as ds
@akaanirban
akaanirban / setup.md
Created April 23, 2021 22:47
How to set up Kind with multiple nodes and connect from a remote computer

The following shows how to set up Kind locally with multiple nodes and connect to it from a remote computer.

WARNING: DO NOT DO THIS UNLESS YOU KNOW WHAT YOU ARE DOING, OR UNLESS YOU ARE ON A PRIVATE SUBNET. KIND HAS VERY LITTLE SECURITY, AND EXPOSING IT TO THE OUTSIDE MAY COMPROMISE YOUR SYSTEM!

Step 1:

  • Install Kind on the local computer. Let's assume the IP of the local computer is a.b.c.d and you want the Kubernetes control plane to run on port 4321.
  • Let's further suppose you want a Kind deployment with 1 control-plane node and 3 worker nodes. Some of this is taken from kubernetes-sigs/kind#873 (comment).
  • Make a file kind_node_config and paste the following in it:

    # four node (three workers) cluster config
    kind: Cluster

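The preview cuts off after the first line of the config. A plausible full kind_node_config for the one-control-plane, three-worker setup described above might look like the following (the port and bind address are the assumed values from Step 1; binding to all interfaces is what exposes the API server to remote machines, hence the warning above):

```yaml
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
networking:
  # Bind the API server to all interfaces so a remote computer can reach it.
  apiServerAddress: "0.0.0.0"
  apiServerPort: 4321
nodes:
  - role: control-plane
  - role: worker
  - role: worker
  - role: worker
```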
@akaanirban
akaanirban / get_matplotlib_cmap_color_list.md
Created March 13, 2021 21:14
Get Color list from matplotlib Cmap
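The gist body is not shown in this preview; a sketch of the idea in the title, sampling N evenly spaced colors from a named colormap as RGBA tuples:

```python
import matplotlib.pyplot as plt
import numpy as np

# Sample 5 evenly spaced colors from the "viridis" colormap.
# Each entry is an (r, g, b, a) tuple of floats in [0, 1].
cmap = plt.get_cmap("viridis")
colors = [cmap(x) for x in np.linspace(0, 1, 5)]
print(colors[0])  # RGBA tuple at the low end of the colormap
```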
@akaanirban
akaanirban / read_cuda_tensor_in_cpu.md
Created January 24, 2021 16:16
How to read a pickled collection (list or dictionary etc.) of pytorch cuda tensor in cpu

What if you saved some loss or accuracy values as a list of PyTorch tensors on a system with CUDA, and then try to plot the losses on a system with no GPU?

With some googling I found that the following code (from pytorch/pytorch#16797 (comment)) works fine! You just need to define the custom unpickler and call its load method in place of pickle.load.

import io
import pickle
import torch

class CPU_Unpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Redirect CUDA-serialized storages to CPU at load time.
        if module == 'torch.storage' and name == '_load_from_bytes':
            return lambda b: torch.load(io.BytesIO(b), map_location='cpu')
        return super().find_class(module, name)
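A self-contained round-trip sketch of the approach (it repeats the class so the snippet runs on its own; the pickled object here is a CPU tensor, since the trick is safe to use whether or not the pickle came from a CUDA machine):

```python
import io
import pickle
import torch

class CPU_Unpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Redirect CUDA-serialized storages to CPU at load time.
        if module == 'torch.storage' and name == '_load_from_bytes':
            return lambda b: torch.load(io.BytesIO(b), map_location='cpu')
        return super().find_class(module, name)

# Pickle a list of tensors, then load it back with the CPU unpickler.
buf = io.BytesIO()
pickle.dump([torch.tensor([1.0, 2.0])], buf)
buf.seek(0)
losses = CPU_Unpickler(buf).load()
print(losses[0].device)  # cpu
```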