Karthik Duddu gokart23

@gokart23
gokart23 / README.md
Created March 25, 2022 21:47
Subtoken mask for subword vocabs
gokart23 / resave_electra_ckpt_w_variables.py
Last active November 30, 2020 21:44
Resave ELECTRA checkpoint with optimizer variables reinitialized and added back in
import tensorflow as tf
import re
from model import modeling
from model import optimization
from pretrain import pretrain_data
from pretrain import pretrain_helpers
from util import training_utils
from util import utils
gokart23 / useful_tf_tips.md
Created June 18, 2020 02:37
Useful TF tips/tricks
  1. Creating binary "input values" file for TFLite benchmarking tool:
import numpy as np

inp_feat_array = ....
...
# tofile() writes the raw array bytes with no header
inp_feat_array.astype(TYPE_OF_TFLITE_ARRAY_DTYPE).tofile(FNAME)
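A minimal sketch of the tip above, assuming the benchmarking tool expects the raw, headerless bytes of the input tensor. The shape, dtype, and file name are placeholders chosen for illustration, not values from the original note.

```python
import os
import tempfile

import numpy as np

# Hypothetical input tensor: shape and dtype are assumptions for illustration.
inp_feat_array = np.random.rand(1, 8, 8, 3)

fname = os.path.join(tempfile.gettempdir(), "input_values.bin")

# tofile() writes the raw bytes in C order with no header, so the dtype must
# match the dtype of the model's input tensor exactly.
inp_feat_array.astype(np.float32).tofile(fname)

# Sanity check: file size should equal element count * bytes per element.
expected_bytes = inp_feat_array.size * np.dtype(np.float32).itemsize
actual_bytes = os.path.getsize(fname)

# Reading the file back with the same dtype recovers the values.
round_trip = np.fromfile(fname, dtype=np.float32).reshape(inp_feat_array.shape)
```

Comparing the file size against the expected byte count is a cheap way to catch a dtype mismatch before handing the file to the benchmark.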
gokart23 / bert_gen_each_token_masked.py
Last active May 21, 2020 01:29
Computation graph operations for tokenizing, masking each token, and adding BOS/EOS for processing by BERT
import tensorflow as tf
import tensorflow_text as text
tf.enable_eager_execution()
CLS_TOKEN, SEP_TOKEN, MASK_TOKEN = 101, 102, 103
def merge_dims(rt, axis=0):
    to_expand = rt.nested_row_lengths()[axis]
    to_elim = rt.nested_row_lengths()[axis + 1]
gokart23 / mask_each_out.py
Last active April 17, 2020 02:27
Mask each token out in a batch of sentences of variable length (PyTorch)
def mask_all_out(self, x, lengths):
    """
    Returns tensor with all tokens masked out one at a time, with respect for seq lengths
    Inputs:
        x : (slen, bsz) LongTensor
        lengths : (bsz) LongTensor
    Returns:
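
The preview above is truncated, so the following is only an illustration of the idea in the description, not the gist's code. It uses NumPy instead of PyTorch; the function name `mask_each_token`, the `mask_id` parameter, and the output layout (one column per masked position) are all assumptions.

```python
import numpy as np


def mask_each_token(x, lengths, mask_id):
    """Mask every real token once, skipping padding positions.

    x       : (slen, bsz) integer array of token ids
    lengths : (bsz,) true length of each sentence
    Returns : (slen, sum(lengths)) array; each column is a copy of one
              sentence with exactly one of its real tokens replaced.
    """
    lengths = np.asarray(lengths)
    # Repeat column b lengths[b] times: one copy per token to be masked.
    out = np.repeat(x, lengths, axis=1)
    # Row index to mask in each copy: 0..len-1 for every sentence in turn.
    positions = np.concatenate([np.arange(n) for n in lengths])
    out[positions, np.arange(out.shape[1])] = mask_id
    return out


# Two sentences of lengths 3 and 1; the second is padded with 0.
x = np.array([[1, 4],
              [2, 5],
              [3, 0]])
masked = mask_each_token(x, lengths=[3, 1], mask_id=-1)
```

Because the copies are made with `np.repeat`, padding rows are carried along unchanged and never selected for masking.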
gokart23 / useful_bash_tricks.md
Last active October 19, 2020 23:02
Useful bash tricks
  1. Filter CSV based on number of words in a field using awk:

awk -F ',' 'split($2,a," ") == 2 {print $2}' fname

  2. Convert Kubernetes events to local timezone (for comparison with other logs):

kubectl get events -o json | jq '.items[] | .lastTimestamp + "," + .message' | tr -d '"' | awk -F',' '{cmd="date +\"%d/%b/%Y:%T\" --date=" $1; cmd | getline conv_date; print "[" conv_date "] (kubectl events) " $2}'

  3. Remove EOS/BOS from a text data file using sed
  4. Setting up docker to use a different image directory: https://stackoverflow.com/a/34731550
  5. Jellyfin docker image: docker run --name "jellyfin-server" --restart=unless-stopped -d -v $(pwd)/config/:/config -v $(pwd)/cache/:/cache -v /home/nemty/media:/media --net=host jellyfin/jellyfin
gokart23 / run-arm-chroot-on-x86.md
Last active June 3, 2024 00:57
Run ARM chroot on x86 machine (both ArchLinux)

Steps

  1. Install qemu-user-static (yay -S qemu-user-static).
    • This might need you to install pcre-static and update PGP keys (follow the tips in the comments here).
  2. If not already present, install systemd-binfmt, the revamped version of binfmt-support tools
    • Your system has to support transparent Qemu user emulation, but fortunately, that is mostly enabled once the steps here have been followed.
  3. Check the status of the systemd-binfmt unit (systemctl status systemd-binfmt) and (re)start if needed (sudo systemctl restart systemd-binfmt)
    • This unit has ARM support by default, but check the current documentation to install it if needed
  4. Mount the root partition of the ARM system into a folder (e.g., sudo mount /dev/sdb2 arm_mountpoint)
  5. Copy the QEMU ARM static binary (/usr/bin/qemu-arm-static on my distro) to the mounted root directory's usr/bin
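
Before entering the chroot, it can help to confirm that the kernel actually has an ARM handler registered. The sketch below reads the binfmt_misc entry directly; the handler name `qemu-arm` is an assumption (it may be named differently on your system), and the function name is hypothetical.

```python
from pathlib import Path

# Standard mount point for binfmt_misc on Linux.
BINFMT_DIR = Path("/proc/sys/fs/binfmt_misc")


def binfmt_handler_enabled(name, binfmt_dir=BINFMT_DIR):
    """Return True if a binfmt_misc handler called `name` exists and is enabled."""
    entry = Path(binfmt_dir) / name
    if not entry.is_file():
        return False
    lines = entry.read_text().splitlines()
    # The first line of a handler entry reads "enabled" or "disabled".
    return bool(lines) and lines[0].strip() == "enabled"


if __name__ == "__main__":
    # "qemu-arm" is an assumed handler name; list BINFMT_DIR to see yours.
    print("ARM handler enabled:", binfmt_handler_enabled("qemu-arm"))
```

If this prints `False`, restarting the systemd-binfmt unit as described in step 3 is the first thing to try.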
gokart23 / download.py
Last active December 19, 2019 00:28 — forked from rjchee/download.py
Download videos from Panopto based on RSS feed file
#!/usr/bin/env python3
import argparse, sys
import os, subprocess
import xml.etree.ElementTree as ET
parser = argparse.ArgumentParser()
parser.add_argument("--rssfile", help="Location of RSS feed file downloaded from Panopto",
                    required=True, type=argparse.FileType('r'))
args = parser.parse_args()
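
The rest of the script is not shown in this preview. As a rough sketch of what reading such a feed could look like, the function below collects title and URL pairs, assuming a standard RSS 2.0 layout in which each `<item>` carries an `<enclosure url="...">`. Whether the Panopto feed follows exactly this layout is an assumption, and `list_enclosures` is a hypothetical name.

```python
import io
import xml.etree.ElementTree as ET


def list_enclosures(rss_source):
    """Return (title, url) pairs for every <item> that has an <enclosure> URL.

    rss_source may be a file name or an open file object, such as the one
    argparse.FileType('r') produces.
    """
    root = ET.parse(rss_source).getroot()
    results = []
    for item in root.iter("item"):
        enclosure = item.find("enclosure")
        if enclosure is None or not enclosure.get("url"):
            continue  # item has nothing to download
        title = item.findtext("title", default="untitled").strip()
        results.append((title, enclosure.get("url")))
    return results


# Tiny made-up feed for illustration only.
sample_feed = io.StringIO(
    "<rss><channel>"
    "<item><title>Lecture 1</title>"
    "<enclosure url='https://example.com/l1.mp4' type='video/mp4'/></item>"
    "<item><title>No video</title></item>"
    "</channel></rss>"
)
videos = list_enclosures(sample_feed)
```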
gokart23 / kubernetes-rest-apply.sh
Created February 9, 2018 11:16
Kubernetes REST API for `kubectl apply -f <file>`
# Kubernetes Server: 1.3
# If this is the first time that apply is being called
curl -X POST -H "Content-Type: application/json" "${KUBERNETES_API_SERVER}"/apis/extensions/v1beta1/namespaces/my-namespace/deployments/ --data @my-deployment-spec.json
# If the deployment already exists (apply has been called before)
curl -X PATCH -H "Content-Type: application/strategic-merge-patch+json" "${KUBERNETES_API_SERVER}"/apis/extensions/v1beta1/namespaces/my-namespace/deployments/my-deployment --data @my-deployment-spec.json
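
The create-or-patch decision can also be sketched in Python with only the standard library. The function name `build_apply_request` and the `exists` flag are hypothetical, and authentication and TLS setup are left out. The sketch sends the full spec as a strategic merge patch, on the assumption that a complete object spec is not a valid JSON Patch operation list.

```python
import json
import urllib.request

# Path taken from the curl commands above (extensions/v1beta1, Kubernetes 1.3).
DEPLOYMENTS_PATH = "/apis/extensions/v1beta1/namespaces/{namespace}/deployments"


def build_apply_request(api_server, namespace, spec, exists):
    """Build a POST (create) or PATCH (update) request for a deployment spec."""
    body = json.dumps(spec).encode("utf-8")
    collection = api_server.rstrip("/") + DEPLOYMENTS_PATH.format(namespace=namespace)
    if not exists:
        # First apply: create the object in the collection.
        return urllib.request.Request(
            collection + "/", data=body, method="POST",
            headers={"Content-Type": "application/json"},
        )
    # Later applies: patch the named object.
    name = spec["metadata"]["name"]
    return urllib.request.Request(
        collection + "/" + name, data=body, method="PATCH",
        headers={"Content-Type": "application/strategic-merge-patch+json"},
    )


# Minimal made-up spec; a real one would carry the full deployment definition.
spec = {"metadata": {"name": "my-deployment"}}
create_req = build_apply_request("http://localhost:8080", "my-namespace", spec, exists=False)
patch_req = build_apply_request("http://localhost:8080", "my-namespace", spec, exists=True)
```

Passing either request to `urllib.request.urlopen` would send it; the sketch stops short of that so it can be inspected without a cluster.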