Manuel Romero mrm8488

🏠
Working from home
View GitHub Profile
@mrm8488
mrm8488 / finetune_llama_v2.py
Created July 19, 2023 14:25 — forked from younesbelkada/finetune_llama_v2.py
Fine tune Llama v2 models on Guanaco Dataset
# coding=utf-8
# Copyright 2023 The HuggingFace Inc. team. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
mrm8488 / smallberta_pretraining.ipynb
Created February 25, 2020 20:44 — forked from aditya-malte/smallberta_pretraining.ipynb
smallBERTa_Pretraining.ipynb
from tkinter import *
from PIL import ImageTk,Image
import time
import os
targetImageWidth = 850
targetImageHeight = 400
inputImageWidth = 0
inputImageHeight = 0
from torch.utils.data import IterableDataset

class CustomIterableDatasetv2(IterableDataset):
    def __init__(self, filename_en, filename_gm):
        # Store the filenames in the object's memory
        self.filename_en = filename_en
        self.filename_gm = filename_gm
        # And that's it; we no longer need to store the contents in memory
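The preview above ends at the constructor. As a rough illustration of how two line-aligned files could then be read lazily in lockstep, here is a torch-free sketch. `ParallelLineDataset`, the demo file names, and the sample sentences are hypothetical, and reading `gm` as German is an assumption; this is not the gist author's implementation.

```python
import os
import tempfile


class ParallelLineDataset:
    """Hypothetical torch-free stand-in that pairs up lines of two files."""

    def __init__(self, filename_en, filename_gm):
        # Only the paths are stored, mirroring the constructor above.
        self.filename_en = filename_en
        self.filename_gm = filename_gm

    def __iter__(self):
        # zip() advances both files in lockstep and stops at the shorter one.
        with open(self.filename_en, encoding="utf-8") as f_en, \
             open(self.filename_gm, encoding="utf-8") as f_gm:
            for line_en, line_gm in zip(f_en, f_gm):
                yield line_en.strip(), line_gm.strip()


# Demo with two tiny temporary files (made-up sentences).
with tempfile.TemporaryDirectory() as tmp_dir:
    path_en = os.path.join(tmp_dir, "demo.en")
    path_gm = os.path.join(tmp_dir, "demo.gm")
    with open(path_en, "w", encoding="utf-8") as f:
        f.write("hello\ngood morning\n")
    with open(path_gm, "w", encoding="utf-8") as f:
        f.write("hallo\nguten Morgen\n")
    pairs = list(ParallelLineDataset(path_en, path_gm))

print(pairs)
```

Because the files are opened inside `__iter__`, each new pass over the object starts again from the first line of both files.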
from torch.utils.data import DataLoader

dataset = CustomIterableDatasetv1('path_to/somefile')
dataloader = DataLoader(dataset, batch_size=64)
for X, y in dataloader:
    print(len(X))   # 64
    print(y.shape)  # (64,)
    # Do something with X and y
class CustomIterableDatasetv1(IterableDataset):
    def __init__(self, filename):
        # Store the filename in the object's memory
        self.filename = filename
        # And that's it; we no longer need to store the contents in memory

    def preprocess(self, text):
# Create the iterable dataset object
dataset = CustomIterableDataset('path_to/somefile')
# Create the dataloader
dataloader = DataLoader(dataset, batch_size=64)
for data in dataloader:
    # data is a list containing 64 (= batch_size) consecutive lines of the file
    print(len(data))  # 64
    # We still need to separate the text and labels from each other and preprocess the text
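The last comment mentions separating text and labels, but the preview never shows the file layout. The sketch below assumes, purely for illustration, that each line looks like `label,text` with an integer label, and that preprocessing means lowercasing and splitting on whitespace. `split_and_preprocess` and the sample lines are hypothetical.

```python
def split_and_preprocess(line):
    """Split one 'label,text' line (assumed layout) into (tokens, label)."""
    # Split only on the first comma so commas inside the text survive.
    label, text = line.rstrip("\n").split(",", 1)
    # Assumed preprocessing: lowercase, then whitespace tokenization.
    tokens = text.lower().split()
    return tokens, int(label)


# A made-up batch of raw lines, as a dataloader over strings might return.
batch = ["1,Great movie", "0,Terrible plot, weak acting"]

processed = [split_and_preprocess(line) for line in batch]
texts = [tokens for tokens, _ in processed]
labels = [label for _, label in processed]

print(texts)
print(labels)
```

Doing this split inside the dataset itself, rather than in the training loop, is what lets the dataloader yield `(X, y)` pairs directly.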
from torch.utils.data import IterableDataset

class CustomIterableDataset(IterableDataset):
    def __init__(self, filename):
        # Store the filename in the object's memory
        self.filename = filename
        # And that's it; we no longer need to store the contents in memory
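The class preview above stores only the filename; the iteration logic is not shown. A minimal, torch-free sketch of what lazy line-by-line iteration with fixed-size batching can look like follows. `LazyLineDataset`, `batched`, and the demo file are hypothetical names for illustration, and the plain-Python batching only stands in for what a `DataLoader` with `batch_size` would do.

```python
import itertools
import os
import tempfile


class LazyLineDataset:
    """Hypothetical stand-in for an IterableDataset: stores only the filename."""

    def __init__(self, filename):
        self.filename = filename

    def __iter__(self):
        # Open the file on every new iteration and yield one line at a time,
        # so the whole file never has to sit in memory.
        with open(self.filename, encoding="utf-8") as handle:
            for line in handle:
                yield line.rstrip("\n")


def batched(iterable, batch_size):
    """Group consecutive items into lists of at most batch_size elements."""
    iterator = iter(iterable)
    while True:
        batch = list(itertools.islice(iterator, batch_size))
        if not batch:
            return
        yield batch


# Demo on a small temporary file (made-up contents).
with tempfile.NamedTemporaryFile(
    "w", suffix=".txt", delete=False, encoding="utf-8"
) as tmp:
    tmp.write("\n".join(f"line {i}" for i in range(5)))
    demo_path = tmp.name

batches = list(batched(LazyLineDataset(demo_path), batch_size=2))
os.remove(demo_path)
print(batches)
```

The final batch is shorter than `batch_size` when the number of lines is not a multiple of it, which matches the default behavior of a dataloader that does not drop the last batch.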
mrm8488 / app.js
Created February 4, 2020 01:32 — forked from stongo/app.js
Joi validation in a Mongoose model
var mongoose = require('mongoose');
mongoose.connect('mongodb://localhost/test');
var db = mongoose.connection;
// Pass the bound logger itself as the handler so connection errors are actually printed
db.on('error', console.error.bind(console, 'connection error:'));