Elvis Saravia (omarsar)
@omarsar
omarsar / submitting_newsletter_pr.md
Last active February 19, 2020 17:42
Guide for submitting a PR on NLP Newsletter translations.

These are the instructions for submitting an NLP Newsletter PR.

First, I need to send you an invite to push to the repository. Just send me an email with your GitHub username and I will add you. If I have already added you as a contributor, ignore this step.

If you would like to be added as an official writer to the publication, I will need the following information from you (replace the items in CAPS):

GITHUB_USERNAME:
  name: NAME
  web: PERSONAL_WEBSITE (optional)
####################
# Helsinki Meetup
# Machine learning in the Elastic Stack
####################
####### Transforms ############
## 1. Check and explore the index you are working with:
GET kibana_sample_data_ecommerce/_search
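## 2. A possible next step, not part of the original snippet: create a pivot
## transform that aggregates total spend per customer. This is a minimal sketch;
## the transform id and the choice of group_by field and aggregation are
## illustrative assumptions based on the sample data's schema.
PUT _transform/ecommerce_customer_spend
{
  "source": { "index": "kibana_sample_data_ecommerce" },
  "pivot": {
    "group_by": {
      "customer_id": { "terms": { "field": "customer_id" } }
    },
    "aggregations": {
      "total_spend": { "sum": { "field": "taxful_total_price" } }
    }
  },
  "dest": { "index": "ecommerce_customer_spend" }
}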

.es(index=apa*,q=geoip.country_code2:FR,metric=sum:bytes).bars().label("France"), .es(index=apa*,q=geoip.country_code2:FR,metric=sum:bytes,offset=-10d).bars(stack=false).label("France 10 days ago")

@omarsar
omarsar / odsc_nlp.md
Last active September 17, 2020 15:55

Title

Applied Deep Learning for NLP Applications

Abstract

Natural language processing (NLP) has become an important field, drawing interest from many sectors that leverage modern deep learning methods to approach NLP problems and tasks such as text summarization, question answering, and sentiment classification, to name a few. In this tutorial, we will introduce several fundamental NLP techniques and more modern approaches (BERT, GPT-2, etc.) and show how they can be applied via transfer learning to many real-world NLP problems. We will focus on how to build an NLP pipeline using several open-source tools such as Transformers, Tokenizers, spaCy, TensorFlow, and PyTorch, among others. Then we will learn how to use the NLP model to search over documents based on semantic relationships. We will use open-source technologies such as BERT and Elasticsearch for this segment to build a proof of concept. In essence, the learner will take away the important theoretical pieces needed to apply these techniques to their own NLP problems.
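To illustrate the transfer-learning idea the abstract describes, here is a minimal sketch using the Hugging Face Transformers pipeline API for sentiment classification; the default model and the example sentence are illustrative assumptions, not material from the talk.

# a minimal sketch, assuming the Hugging Face Transformers library is installed
from transformers import pipeline

# downloads a default pretrained sentiment model (model choice is an assumption)
classifier = pipeline("sentiment-analysis")
print(classifier("This tutorial on applied deep learning for NLP was excellent!"))
# e.g. [{'label': 'POSITIVE', 'score': 0.99...}]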

@omarsar
omarsar / timelion-percentage-kibana.md
Last active January 14, 2020 12:23
This timelion code snippet shows percentages per buckets

.es(index=apa*,q=geoip.country_code2:FR,metric=sum:bytes).label("France").divide(.es(index=apa*,q=geoip.country_code2:*,metric=sum:bytes)).multiply(100).yaxis(units="custom::%"), .es(index=apa*,q=geoip.country_code2:DE,metric=sum:bytes).label("Germany").divide(.es(index=apa*,q=geoip.country_code2:*,metric=sum:bytes)).multiply(100).yaxis(units="custom::%")

Cost after iteration 0: 0.6931470036506653 | Train Acc: 50.40983581542969 | Test Acc: 45.75163269042969
Cost after iteration 10: 0.6691471934318542 | Train Acc: 64.3442611694336 | Test Acc: 54.24836730957031
Cost after iteration 20: 0.6513187885284424 | Train Acc: 68.44261932373047 | Test Acc: 54.24836730957031
Cost after iteration 30: 0.6367831230163574 | Train Acc: 68.03278350830078 | Test Acc: 54.24836730957031
Cost after iteration 40: 0.6245343685150146 | Train Acc: 69.67213439941406 | Test Acc: 54.90196228027344
Cost after iteration 50: 0.6139233112335205 | Train Acc: 70.90164184570312 | Test Acc: 56.20914840698242
Cost after iteration 60: 0.6045243740081787 | Train Acc: 72.54098510742188 | Test Acc: 56.86274337768555
Cost after iteration 70: 0.5960519909858704 | Train Acc: 74.18032836914062 | Test Acc: 57.51633834838867
Cost after iteration 80: 0.5883094668388367 | Train Acc: 73.77049255371094 | Test Acc: 57.51633834838867
Cost after iteration 90: 0.581156849861145 | Train Acc: 74.59016418457031 | Test
## Logistic regression from scratch in PyTorch (fragments reordered into a runnable script)
import torch
import torch.nn as nn
import matplotlib.pyplot as plt

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

class LR(nn.Module):
    def __init__(self, dim, lr=torch.scalar_tensor(0.01)):
        super(LR, self).__init__()
        # initialize parameters to zero
        self.w = torch.zeros(dim, 1, dtype=torch.float).to(device)
        self.b = torch.scalar_tensor(0).to(device)
        self.grads = {"dw": torch.zeros(dim, 1, dtype=torch.float).to(device),
                      "db": torch.scalar_tensor(0).to(device)}
        self.lr = lr.to(device)

## transform the data: flatten each image into a column vector
def transform_data(x, y):
    x_flatten = x.T
    y = y.unsqueeze(0)
    return x_flatten.to(device), y.to(device)

## model pretesting (train_dataset is defined earlier in the full gist, not shown here)
x, y = next(iter(train_dataset))

## flatten/transform the data
x_flatten, y = transform_data(x, y)

## dim is the flattened dimension of the images (num_px * num_px * channels)
dim = x_flatten.shape[0]

## hyperparams
costs = []
learning_rate = torch.scalar_tensor(0.0001).to(device)
num_iterations = 100
lrmodel = LR(dim, learning_rate)
lrmodel.to(device)

## the training loop (see the sketch below) populates costs

## plot the trend in the context of loss
plt.plot(costs)
plt.show()
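The training loop that produced the log above is not part of this excerpt. Below is a minimal sketch consistent with the logged output, assuming binary labels in {0, 1} and manual gradient-descent updates on the parameters defined in the script above; nothing in it is taken from the original gist.

for i in range(num_iterations):
    # forward pass: sigmoid of the linear score w^T x + b
    yhat = torch.sigmoid(torch.mm(lrmodel.w.T, x_flatten) + lrmodel.b)
    # binary cross-entropy cost (assumes y contains 0/1 labels)
    cost = -torch.mean(y * torch.log(yhat) + (1 - y) * torch.log(1 - yhat))
    # manual gradients for logistic regression
    m = x_flatten.shape[1]
    dz = yhat - y
    lrmodel.grads["dw"] = torch.mm(x_flatten, dz.T) / m
    lrmodel.grads["db"] = torch.sum(dz) / m
    # vanilla gradient-descent update
    lrmodel.w = lrmodel.w - lrmodel.lr * lrmodel.grads["dw"]
    lrmodel.b = lrmodel.b - lrmodel.lr * lrmodel.grads["db"]
    # record and echo the cost every 10 iterations, as in the log above
    if i % 10 == 0:
        costs.append(cost.item())
        print(f"Cost after iteration {i}: {cost.item()}")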