Ravish Chawla ravishchawla

| Model Configuration | Number of Epochs | Training F1 Score | Validation F1 Score |
| --- | --- | --- | --- |
| 2 LSTM Layers with 64 Hidden Units | 5 | 0.637 | 0.540 |
| 2 LSTM Layers with 128 Hidden Units | 5 | 0.625 | 0.567 |
| 2 LSTM Layers with 128 Hidden Units | 15 | 0.661 | 0.558 |
| 1 LSTM Layer and a Self Attention Layer | 5 | 0.771 | 0.601 |
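The F1 scores in the table above combine precision and recall into one number. As a reminder of how the metric is computed for binary labels, here is a minimal sketch in plain Python. The function name `f1_score_binary` and the sample labels are illustrative assumptions and are not taken from the original gists.

```python
def f1_score_binary(y_true, y_pred):
    """F1 score for binary labels, where 1 is the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    if tp == 0:
        # No true positives: precision and recall are both zero (or undefined)
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels, for illustration only
example_true = [1, 0, 1, 1, 0]
example_pred = [1, 1, 0, 1, 0]
example_f1 = f1_score_binary(example_true, example_pred)
```

A large gap between training and validation F1, as in the last row of the table, is usually read as a sign of overfitting, so both columns are worth tracking together.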
@ravishchawla
ravishchawla / quora_model_training.py
Last active March 25, 2020 21:10
Torch Model Training
import time

def train(nn_model, nn_optimizer, nn_criterion, data_loader, val_loader=None, num_epochs=5, print_ratio=0.1, verbose=True):
    for epoch in range(num_epochs):
        # Enable training mode for the model
        nn_model.train()
        running_loss = 0
        for ite, (x, y, l) in enumerate(data_loader):
            init_time = time.time()
            # Convert our tensors to GPU tensors
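The preview above stops before the body of the batch loop. For orientation, a generic PyTorch epoch might look like the sketch below. It is not the author's code: the function name `train_one_epoch` is hypothetical, batches are assumed to be `(x, y)` pairs (the third element `l` from the original loop is left out), and the device defaults to CPU.

```python
import torch
from torch import nn


def train_one_epoch(model, optimizer, criterion, batches, device="cpu"):
    """Run one pass over `batches` and return the mean batch loss."""
    model.train()  # enable training behaviour (dropout, batch norm updates)
    running_loss = 0.0
    num_batches = 0
    for x, y in batches:
        # Move the batch to the target device (for example "cuda")
        x, y = x.to(device), y.to(device)
        optimizer.zero_grad()          # clear gradients from the previous step
        loss = criterion(model(x), y)  # forward pass and loss
        loss.backward()                # backpropagate
        optimizer.step()               # update parameters
        running_loss += loss.item()
        num_batches += 1
    return running_loss / max(num_batches, 1)
```

Any iterable of tensor pairs works as `batches`, including a `torch.utils.data.DataLoader`.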
ravishchawla / quora_torch_model.py
Created March 25, 2020 21:06
Torch Model Basic
import torch.nn as nn

class Model(nn.Module):
    def __init__(self, embedding_matrix, hidden_unit=64):
        super(Model, self).__init__()
        # embedding_matrix is assumed to be a float tensor of shape (vocab_size, embedding_dim)
        vocab_size = embedding_matrix.shape[0]
        embedding_dim = embedding_matrix.shape[1]
        self.embedding_layer = nn.Embedding(vocab_size, embedding_dim)
        # Initialise the layer with the pretrained vectors and keep them trainable
        self.embedding_layer.weight = nn.Parameter(embedding_matrix)
        self.embedding_layer.weight.requires_grad = True
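The constructor above copies pretrained vectors into an `nn.Embedding` layer and leaves them trainable. PyTorch also offers `nn.Embedding.from_pretrained`, which does the same in a single call. The sketch below is an alternative and not the author's method; the helper name `build_embedding` and the toy matrix are assumptions made for illustration.

```python
import torch
from torch import nn


def build_embedding(embedding_matrix, trainable=True):
    """Create an nn.Embedding initialised from a pretrained matrix."""
    weights = torch.as_tensor(embedding_matrix, dtype=torch.float32)
    # freeze=False keeps the pretrained weights trainable (requires_grad=True)
    return nn.Embedding.from_pretrained(weights, freeze=not trainable)


# Hypothetical 3-word vocabulary with 2-dimensional vectors
toy_matrix = [[0.0, 0.0], [0.1, 0.2], [0.3, 0.4]]
layer = build_embedding(toy_matrix)
vectors = layer(torch.tensor([1, 2]))  # look up rows 1 and 2
```

Passing `trainable=False` freezes the vectors, which can help when the training set is small relative to the vocabulary.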
ravishchawla / quora_torch_model
Created March 25, 2020 21:05
torch_model_basic.py
import torch.nn as nn

class Model(nn.Module):
    def __init__(self, embedding_matrix, hidden_unit=64):
        super(Model, self).__init__()
        # embedding_matrix is assumed to be a float tensor of shape (vocab_size, embedding_dim)
        vocab_size = embedding_matrix.shape[0]
        embedding_dim = embedding_matrix.shape[1]
        self.embedding_layer = nn.Embedding(vocab_size, embedding_dim)
        # Initialise the layer with the pretrained vectors and keep them trainable
        self.embedding_layer.weight = nn.Parameter(embedding_matrix)
        self.embedding_layer.weight.requires_grad = True
ravishchawla / data_augmentation.py
Created March 20, 2020 21:19
quora_data_augmentation
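import random

import numpy as np
from sklearn.neighbors import NearestNeighbors

# Find the nearest neighbours of each word vector in embedding space
nearest_syns = NearestNeighbors(n_neighbors=total_syns + 1).fit(embeddings_matrix)
neighbours_mat = nearest_syns.kneighbors(embeddings_matrix[1:top_k])[1]
# Key: first returned index (normally the query word itself); value: the remaining neighbour indices
synonyms = {x[0]: x[1:] for x in neighbours_mat}

def augment_sentence(encoded_sentence, prob=0.5):
    for posit in range(len(encoded_sentence)):
        if random.random() > prob:
            try:
                syns = synonyms[encoded_sentence[posit]]
                rand_syn = np.random.choice(syns)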
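The preview above ends inside the `try` block, so the replacement step itself is not visible. A self-contained sketch of the same idea, synonym replacement on encoded sentences, is given below. It assumes the chosen synonym overwrites the token at that position and that tokens without an entry are left unchanged. The extra `synonyms` and `rng` parameters and the toy synonym table are hypothetical additions.

```python
import random


def augment_sentence(encoded_sentence, synonyms, prob=0.5, rng=random):
    """Return a copy in which some token ids are swapped for a listed neighbour."""
    augmented = list(encoded_sentence)  # do not modify the caller's list
    for posit, token in enumerate(augmented):
        # Same test as the original snippet: replace when the draw exceeds prob
        if rng.random() > prob and token in synonyms:
            augmented[posit] = rng.choice(synonyms[token])
    return augmented


# Hypothetical synonym table: token id -> ids of nearby words
toy_synonyms = {1: [10, 11], 2: [20]}
sample = augment_sentence([1, 2, 3], toy_synonyms, prob=0.5, rng=random.Random(0))
```

With this comparison, each token is replaced with probability `1 - prob`, so a higher `prob` means fewer replacements.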
ravishchawla / data_cleaning_methods.csv
Last active March 20, 2020 16:59
quora_data_cleaning
| Data Cleaning Procedure | Coverage of Vocabulary | Coverage of Dataset |
| --- | --- | --- |
| Raw Data (all records) | 0.18 | 0.71 |
| Raw Data (on 10% sample) | 0.08 | 0.71 |
| Lower Casing all words (on 10% sample) | 0.10 | 0.87 |
| Removing and Replacing Non-Alpha Numeric Characters (on 10% sample) | 0.11 | 0.98 |
| Replacing Contractions with Full words (on 10% sample) | 0.11 | 0.98 |
| All methods (all records) | 0.27 | 0.98 |
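The three cleaning steps named in the table (lower casing, replacing non-alphanumeric characters, and expanding contractions) can be sketched with the standard library alone. The contraction map below is a tiny hypothetical sample and the function name `clean_text` is an assumption; the author's actual rules are not shown in the gist preview.

```python
import re

# Hypothetical, deliberately small contraction map for illustration
CONTRACTIONS = {
    "can't": "cannot",
    "won't": "will not",
    "it's": "it is",
    "i'm": "i am",
}


def clean_text(text):
    """Lower-case, expand known contractions, and strip non-alphanumerics."""
    text = text.lower()
    # Expand contractions first, while the apostrophes are still present
    for short, full in CONTRACTIONS.items():
        text = text.replace(short, full)
    # Replace every character that is not a letter, digit, or whitespace
    text = re.sub(r"[^a-z0-9\s]", " ", text)
    # Collapse runs of whitespace left behind by the replacements
    return re.sub(r"\s+", " ", text).strip()


cleaned = clean_text("I can't BELIEVE it's 2020!")
```

Contractions are expanded before punctuation is stripped here because stripping apostrophes first would leave fragments such as `can t` that no longer match the map.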
ravishchawla / check_coverage.py
Created March 20, 2020 16:52
quora_check_coverage
def check_coverage(text, embeddings_dict):
    known_words, unknown_words = {}, {}
    total_known, total_unknown = 0, 0
    for sentence in text:
        for word in sentence.split(' '):
            if word in known_words:
                total_known = total_known + 1
            elif word in embeddings_dict:
                known_words[word] = embeddings_dict[word]
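The preview above is cut off before any result is returned. One way to compute both coverage figures is sketched below, assuming that "coverage of vocabulary" means the share of distinct words that have an embedding and "coverage of dataset" means the share of all word occurrences that do. These definitions, the function name `coverage`, and the toy inputs are assumptions and are not confirmed by the source.

```python
def coverage(text, embeddings_dict):
    """Return (vocabulary coverage, dataset coverage, unknown word counts)."""
    known, unknown = {}, {}
    for sentence in text:
        for word in sentence.split():
            # Count each occurrence in the bucket the word belongs to
            bucket = known if word in embeddings_dict else unknown
            bucket[word] = bucket.get(word, 0) + 1
    total_known = sum(known.values())
    total_unknown = sum(unknown.values())
    # Share of distinct words that have an embedding
    vocab_cov = len(known) / max(len(known) + len(unknown), 1)
    # Share of all word occurrences that have an embedding
    data_cov = total_known / max(total_known + total_unknown, 1)
    return vocab_cov, data_cov, unknown


# Hypothetical toy corpus and embedding table
toy_text = ["the cat sat", "the dog"]
toy_embeddings = {"the": [0.1], "cat": [0.2]}
vocab_cov, data_cov, missing = coverage(toy_text, toy_embeddings)
```

Returning the unknown words with their counts makes it easy to see which frequent tokens a further cleaning step should target.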
{
  "openapi": "3.0.1",
  "info": {
    "title": "The Jira Cloud platform REST API",
    "description": "Jira Cloud platform REST API documentation",
    "termsOfService": "http://atlassian.com/terms/",
    "contact": {
      "email": "ecosystem@atlassian.com"
    },
    "license": {
Agent Hyperparameters

| Hyperparameter | Value |
| --- | --- |
| Replay Buffer Size | 1e5 |
| Minibatch Size | 128 |
| Discount Rate | 0.99 |
| TAU | 1e-3 |
| Actor Learning Rate | 1e-4 |
| Critic Learning Rate | 1e-4 |
| L2 Weight Decay | 1e-6 |
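In actor-critic agents that keep separate target networks, a hyperparameter named TAU usually controls a soft update of the target weights. Assuming it plays that role here (the gist preview does not show the agent code), the update can be sketched as follows. The function name `soft_update` and the use of plain lists of floats in place of network parameters are illustrative assumptions.

```python
TAU = 1e-3  # value listed in the hyperparameter table


def soft_update(local_params, target_params, tau=TAU):
    """Blend target parameters towards the local ones.

    Computes tau * local + (1 - tau) * target for each pair of values.
    """
    return [
        tau * local + (1.0 - tau) * target
        for local, target in zip(local_params, target_params)
    ]


# Hypothetical parameter values, for illustration only
updated = soft_update([1.0, -2.0], [0.0, 0.0])
```

With a small TAU the target values move only a fraction of the way towards the local values on each step, which is generally meant to keep the learning targets stable.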
Actor Model Hyperparameters