Manuel Romero mrm8488

@mrm8488
mrm8488 / smallberta_pretraining.ipynb
Created February 25, 2020 20:44 — forked from aditya-malte/smallberta_pretraining.ipynb
smallBERTa_Pretraining.ipynb
from tkinter import *
from PIL import ImageTk,Image
import time
import os
targetImageWidth = 850
targetImageHeight = 400
inputImageWidth = 0
inputImageHeight = 0
from torch.utils.data import IterableDataset

class CustomIterableDatasetv2(IterableDataset):
    def __init__(self, filename_en, filename_gm):
        # Store the filenames in the object's memory
        self.filename_en = filename_en
        self.filename_gm = filename_gm
        # And that's it; we no longer need to store the contents in memory
from torch.utils.data import DataLoader

dataset = CustomIterableDatasetv1('path_to/somefile')
dataloader = DataLoader(dataset, batch_size=64)
for X, y in dataloader:
    print(len(X))   # 64
    print(y.shape)  # (64,)
    ### Do something with X and y
    ###
from torch.utils.data import IterableDataset

class CustomIterableDatasetv1(IterableDataset):
    def __init__(self, filename):
        # Store the filename in the object's memory
        self.filename = filename
        # And that's it; we no longer need to store the contents in memory

    def preprocess(self, text):
        ...
# Create the iterable dataset object
dataset = CustomIterableDataset('path_to/somefile')
# Create the dataloader
dataloader = DataLoader(dataset, batch_size=64)
for data in dataloader:
    # data is a list containing 64 (= batch_size) consecutive lines of the file
    print(len(data))  # 64
    # We still need to separate the text and labels from each other and preprocess the text
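Conceptually, the loader above groups consecutive items from the dataset's iterator into batches. The following is a plain-Python sketch of that grouping, not `DataLoader`'s actual implementation; the helper name `batches` is hypothetical.

```python
from itertools import islice


def batches(iterable, batch_size):
    """Group consecutive items into lists of at most batch_size elements."""
    it = iter(iterable)
    while True:
        # Pull the next batch_size items; the final batch may be shorter
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


# Five stand-in "lines" grouped two at a time
lines = [f"line {i}" for i in range(5)]
for batch in batches(lines, batch_size=2):
    print(len(batch))
```

The last batch is shorter when the number of lines is not a multiple of the batch size, which mirrors `DataLoader`'s default of `drop_last=False`.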
from torch.utils.data import IterableDataset

class CustomIterableDataset(IterableDataset):
    def __init__(self, filename):
        # Store the filename in the object's memory
        self.filename = filename
        # And that's it; we no longer need to store the contents in memory
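The class above only stores the filename; to be iterated it also needs an `__iter__` method. Below is a minimal sketch, assuming one sample per line of the file. The class name `LineStreamDataset` is hypothetical, and the fallback to `object` is an assumption added only so that the sketch runs without PyTorch installed.

```python
try:
    from torch.utils.data import IterableDataset
except ImportError:  # assumption: fall back so the sketch runs without PyTorch
    IterableDataset = object


class LineStreamDataset(IterableDataset):
    """Hypothetical name: streams a text file one line at a time."""

    def __init__(self, filename):
        # Store only the path; the file contents are never held in memory
        self.filename = filename

    def __iter__(self):
        # Open the file lazily and yield one line (without its newline) per sample
        with open(self.filename, encoding="utf-8") as f:
            for line in f:
                yield line.rstrip("\n")
```

Because the file is reopened on every call to `__iter__`, the dataset can be iterated once per epoch without keeping any state between passes.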
@mrm8488
mrm8488 / ngrok_tensorboard_colab.md
Last active September 19, 2022 09:35
Setup ngrok and run TensorBoard on Colab
@mrm8488
mrm8488 / app.js
Created February 4, 2020 01:32 — forked from stongo/app.js
Joi validation in a Mongoose model
var mongoose = require('mongoose');
mongoose.connect('mongodb://localhost/test');
var db = mongoose.connection;
// Log connection errors with a descriptive prefix
db.on('error', console.error.bind(console, 'connection error:'));
@mrm8488
mrm8488 / memory_profiling.sh
Created January 24, 2020 15:09
Profile memory usage of a script
while ps auxw | grep '[m]yscript'; do sleep 30; done | stdbuf -o0 uniq | ts
# Monitor changes in memory usage of myscript and timestamp the lines using ts. stdbuf -o0 turns off output buffering. [m] in the grep expression prevents the grep process line itself from being matched.
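The `[m]` trick can be checked in isolation. The sketch below uses Python's `re` module purely as an illustration (the pipeline itself uses grep, whose bracket expression behaves the same way for this pattern); the helper name `matches` and the sample command lines are hypothetical.

```python
import re

# Same pattern as in the grep call: a one-character class followed by "yscript"
PATTERN = re.compile(r"[m]yscript")


def matches(command_line):
    """Return True if the pattern occurs anywhere in the given command line."""
    return PATTERN.search(command_line) is not None


# The script's own process line contains the text "myscript"
print(matches("python myscript.py"))

# grep's command line contains the literal text "[m]yscript"; there the "m"
# is followed by "]", so the sequence "myscript" never appears
print(matches("grep [m]yscript"))
```

The first call prints `True` and the second prints `False`, which is why the grep process never reports itself.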