Carlo Moro cnmoro

cnmoro / tiktoken_chunkenizer_with_overlap.py
Created March 13, 2024 23:35
tiktoken_chunkenizer_with_overlap.py
import tiktoken

gpt_encoding = tiktoken.encoding_for_model("gpt-3.5-turbo-16k")

def chunk_text(full_text, tokens_per_chunk=300, chunk_overlap=20):
    chunks = []
    current_chunk = []
    current_chunk_length = 0
    tokens = gpt_encoding.encode(full_text)
    for i, token in enumerate(tokens):
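
The preview above is cut off inside the loop, so the gist's actual chunking logic is not shown. As a minimal sketch of the general idea (fixed-size token windows that share `chunk_overlap` tokens with the previous window), assuming the tokens are already available as a list, something like the following could work. The name `chunk_tokens` is hypothetical and this is not the author's implementation.

```python
def chunk_tokens(tokens, tokens_per_chunk=300, chunk_overlap=20):
    """Split a token list into windows that overlap by `chunk_overlap` tokens.

    Hypothetical helper: works on any list, so it needs no tokenizer.
    """
    if tokens_per_chunk <= 0:
        raise ValueError("tokens_per_chunk must be positive")
    if not 0 <= chunk_overlap < tokens_per_chunk:
        raise ValueError("chunk_overlap must be in [0, tokens_per_chunk)")

    # Each new window starts `step` tokens after the previous one.
    step = tokens_per_chunk - chunk_overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + tokens_per_chunk])
        # Stop once a window has reached the end of the input.
        if start + tokens_per_chunk >= len(tokens):
            break
    return chunks
```

With tiktoken, the input would be `gpt_encoding.encode(full_text)`, and each chunk could be turned back into text with `gpt_encoding.decode(chunk)`.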
@cnmoro
cnmoro / zram Arch.md
Created January 17, 2024 19:07 — forked from zax4r0/zram Arch.md
Zram On Arch

zRam is a virtual memory compression mechanism that creates compressed block devices in RAM, named /dev/zram (for example /dev/zram0). Using a fast compression algorithm (LZ4), it compresses the least recently used (LRU) or inactive pages in memory, which lets the GNU/Linux kernel free up more memory with a smaller performance hit.

zRam greatly increases the amount of available memory by compressing it in RAM, without needing a swap disk or partition. Using zRam is recommended over running with no swap or with swap disabled, since it helps keep the out-of-memory (OOM) killer from being triggered.

Create a zRam block device

Load the zRam module into the kernel using modprobe:

sudo modprobe zram

Set the zRam compression algorithm to lz4, which is extremely fast:

@cnmoro
cnmoro / fastapi_mongo_non_blocking_example.py
Created November 1, 2023 03:56
fastapi_mongo_non_blocking_example.py
import asyncio

import motor.motor_asyncio
import pandas as pd
import uvicorn
from fastapi import FastAPI

app = FastAPI()
client = motor.motor_asyncio.AsyncIOMotorClient()
db = client['MYDB']

def process_data(data):
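
The preview stops at the signature of `process_data`, so its body and the way the gist calls it are not shown. One common way to keep an async handler responsive while a synchronous function does heavy work is to hand that function to a thread pool with `loop.run_in_executor`. The sketch below illustrates that pattern with the standard library only; the body of `process_data`, the `handle_request` coroutine, and the sample data are all assumptions for illustration, not the author's code.

```python
import asyncio
import time

def process_data(data):
    # Hypothetical stand-in for blocking work (e.g. building a DataFrame).
    time.sleep(0.05)
    return [row["value"] * 2 for row in data]

async def handle_request(data):
    # Run the blocking function in the default thread pool so the
    # event loop stays free to serve other requests in the meantime.
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(None, process_data, data)

async def main():
    data = [{"value": 1}, {"value": 2}]
    # Both calls overlap instead of running one after the other.
    return await asyncio.gather(handle_request(data), handle_request(data))

if __name__ == "__main__":
    print(asyncio.run(main()))
```

In a FastAPI app, the same `await loop.run_in_executor(...)` call would sit inside an `async def` route, after the documents have been fetched with Motor.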
@cnmoro
cnmoro / get_woe_mappings.py
Created May 11, 2022 17:37
Extract WOE Encoder Mappings
import category_encoders as ce

categoric_features = ['a', 'b', 'c']

# X (features DataFrame) and y (binary target) are assumed to be defined already.
woe_encoder = ce.WOEEncoder(cols=categoric_features)
woe_encoder.fit(X[categoric_features], y)

def get_woe_value(feature):
    for mapping in woe_encoder.ordinal_encoder.mapping:
        if mapping['col'] == feature:
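
For context on what the extracted mappings contain, the sketch below computes the textbook weight of evidence per category: the log of the category's share of positives divided by its share of negatives. It is a plain-Python illustration with a hypothetical function name, and it omits the smoothing that an encoder may apply, so its numbers are not guaranteed to match `WOEEncoder` output.

```python
import math
from collections import Counter

def woe_by_category(categories, targets):
    """Textbook WOE per category for a binary target (hypothetical helper)."""
    pos = Counter(c for c, t in zip(categories, targets) if t == 1)
    neg = Counter(c for c, t in zip(categories, targets) if t == 0)
    total_pos = sum(pos.values())
    total_neg = sum(neg.values())

    woe = {}
    for category in set(categories):
        # WOE is undefined when a category has no positives or no negatives.
        if pos[category] == 0 or neg[category] == 0:
            continue
        share_pos = pos[category] / total_pos
        share_neg = neg[category] / total_neg
        woe[category] = math.log(share_pos / share_neg)
    return woe
```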
@cnmoro
cnmoro / install-oracle-client-ubuntu.md
Created March 29, 2022 20:17 — forked from bmaupin/install-oracle-client-ubuntu.md
Install Oracle client on Ubuntu

Reference: https://help.ubuntu.com/community/Oracle%20Instant%20Client

Tested on: Ubuntu 18.04, 20.04

  1. Decide which version of the Oracle client to install

  2. Download the Oracle client packages

@cnmoro
cnmoro / SKLearn Snippets.py
Created January 4, 2022 23:06
SKLearn Snippets
# CLUSTERING
# Davies-Bouldin Index -> lower is better when choosing K
# Descriptive statistics of the features, per cluster
df.groupby("cluster").describe()
centroids = kmeans.cluster_centers_
max = centroids[0]
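
The comment about the Davies-Bouldin index can be made concrete with a small plain-Python sketch: for each cluster, take the worst ratio of combined within-cluster scatter to centroid distance against any other cluster, then average over clusters. The function name is hypothetical, and it assumes at least two clusters with distinct centroids. scikit-learn ships its own version as `sklearn.metrics.davies_bouldin_score`, which is the better choice in practice.

```python
import math

def davies_bouldin(points, labels):
    """Davies-Bouldin index for points (tuples) and their cluster labels."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)

    # Centroid of each cluster: per-dimension mean of its points.
    centroids = {
        label: [sum(dim) / len(pts) for dim in zip(*pts)]
        for label, pts in clusters.items()
    }
    # Scatter: mean distance from a cluster's points to its centroid.
    scatter = {
        label: sum(math.dist(p, centroids[label]) for p in pts) / len(pts)
        for label, pts in clusters.items()
    }

    keys = list(clusters)
    total = 0.0
    for i in keys:
        total += max(
            (scatter[i] + scatter[j]) / math.dist(centroids[i], centroids[j])
            for j in keys if j != i
        )
    return total / len(keys)
```

To choose K, the index would be computed for each candidate K and the value with the lowest score preferred.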
@cnmoro
cnmoro / shap_feature_importance.py
Created December 2, 2021 19:49
shap_feature_importance.py
import shap
import numpy as np
import pandas as pd

categoric_features = tuple(['FEATURE1', 'FEATURE2', 'ETC'])

def avaliar_importancias_features(modelo_treinado, X):
    explainer = shap.Explainer(modelo_treinado)
    shap_values = explainer.shap_values(X)
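
The preview ends once the SHAP values are computed, so how the gist turns them into importances is not shown. A common approach is to rank features by their mean absolute SHAP value. The sketch below does that in plain Python, assuming the values arrive as one row per sample and one column per feature; the function name and sample numbers are hypothetical.

```python
def mean_abs_importance(shap_values, feature_names):
    """Rank features by mean absolute SHAP value (hypothetical helper).

    shap_values: list of rows, one per sample, one value per feature.
    """
    n_samples = len(shap_values)
    scores = [
        sum(abs(row[j]) for row in shap_values) / n_samples
        for j in range(len(feature_names))
    ]
    # Highest mean |SHAP| first.
    return sorted(zip(feature_names, scores), key=lambda pair: pair[1], reverse=True)
```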
@cnmoro
cnmoro / WireGuard
Last active August 22, 2021 03:00
WireGuard Server+Client Configuration
SERVER-SIDE
$ sudo apt install wireguard
$ sudo -i
$ cd /etc/wireguard/
$ umask 077; wg genkey | tee privatekey | wg pubkey > publickey
$ cat privatekey
( Save the key )
$ cat publickey
( Save the key )
@cnmoro
cnmoro / MLFlow + Authentication
Created September 17, 2020 19:42
MLFlow + Authentication
sudo apt install nginx (ubuntu)
sudo yum install nginx (rhel)
#apache2
#httpd
sudo apt install apache2-utils (ubuntu)
sudo yum install httpd-tools (rhel)
sudo htpasswd -c /etc/nginx/.htpasswd USERNAME
@cnmoro
cnmoro / PIP Ignore SSL Certificate Verification
Created July 21, 2020 20:53
PIP Ignore SSL Certificate Verification
$ pip install --trusted-host pypi.org --trusted-host files.pythonhosted.org pip setuptools