PUT /test_index
{
  "mappings": {
    "properties": {
      "title": {
        "type": "text",
        "analyzer": "synonym_analyzer",
        "search_analyzer": "synonym_analyzer"
      },
PUT /test_index
{
  "mappings": {
    "properties": {
      "title": {
        "type": "text",
        "analyzer": "synonym_analyzer",
        "search_analyzer": "synonym_analyzer"
      }
# https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-synonym-tokenfilter.html
# Explicit mappings match any token sequence on the LHS of "=>"
# and replace with all alternatives on the RHS. These types of mappings
# ignore the expand parameter in the schema.
# Examples:
i-pod, i pod => ipod,
sea biscuit, sea biscit => seabiscuit
# Equivalent synonyms may be separated with commas and give
# no explicit mapping. In this case the mapping behavior will
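
The explicit-mapping rule described in the comments above can be illustrated with a small Python sketch. It is not how Elasticsearch parses synonym files. The helper name `parse_explicit_rule` is hypothetical, and the sketch only assumes that each rule has the form `lhs1, lhs2 => rhs1, rhs2`, where every left-hand entry is replaced by all right-hand alternatives.

```python
def parse_explicit_rule(line):
    """Parse one explicit rule into {lhs_entry: [rhs alternatives]}.

    Hypothetical helper for illustration only; empty entries (for example
    from a trailing comma) are dropped.
    """
    lhs, rhs = line.split("=>")
    sources = [s.strip() for s in lhs.split(",") if s.strip()]
    targets = [t.strip() for t in rhs.split(",") if t.strip()]
    return {source: targets for source in sources}


# The two example rules from the synonyms file above.
example_rules = [
    "i-pod, i pod => ipod,",
    "sea biscuit, sea biscit => seabiscuit",
]

rules = {}
for rule_line in example_rules:
    rules.update(parse_explicit_rule(rule_line))

for source, targets in rules.items():
    print(f"{source!r} is replaced by {targets}")
```

Each left-hand entry becomes its own key, so a lookup for either `i-pod` or `i pod` yields the same replacement list.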
import os
from abc import ABCMeta, abstractmethod


class DataProcessor(metaclass=ABCMeta):
    """Base processor to be used for all preparation."""

    def __init__(self, input_directory, output_directory):
        self.input_directory = input_directory
        self.output_directory = output_directory
[loggers]
keys=root

[logger_root]
level=INFO
handlers=screen,file

[formatters]
keys=simple
import datetime
import json
import logging
import ntpath
import os


def create_folder(directory):
    try:
        if not os.path.exists(directory):
from fastprogress.fastprogress import master_bar, progress_bar
from time import sleep

mb = master_bar(range(10))
for i in mb:
    for j in progress_bar(range(100), parent=mb):
        sleep(0.01)
        mb.child.comment = f'second bar stat'
    mb.first_bar.comment = f'first bar stat'
    mb.write(f'Finished loop {i}.')
from tqdm import tqdm
import time

# Register progress_apply on pandas objects.
tqdm.pandas()
# Assumes `df` is an existing pandas DataFrame with a numeric column 'col'.
df['col'] = df['col'].progress_apply(lambda x: x**2)

text = ""
for char in tqdm(["a", "b", "c", "d"]):
    time.sleep(0.25)
import json
import os

from sklearn.metrics import (accuracy_score, classification_report,
                             confusion_matrix, f1_score, fbeta_score)


def get_metrics(y, y_pred, beta=2, average_method='macro', y_encoder=None):
    if y_encoder:
        y = y_encoder.inverse_transform(y)
        y_pred = y_encoder.inverse_transform(y_pred)
import random

import numpy as np
import torch


def set_seed(args):
    random.seed(args.seed)
    np.random.seed(args.seed)
    torch.manual_seed(args.seed)
    if args.n_gpu > 0:
        torch.cuda.manual_seed_all(args.seed)