Felipe Farias ftfarias

  • Data Lead at Alice
  • São Paulo, Brazil
@ftfarias
ftfarias / Bigram.py
Last active November 28, 2023 03:14
Bigram Detector
import math
from collections import Counter
def calculate_mutual_information(words, bigrams):
    # Count the frequency of individual words and bigrams
    word_counts = Counter(words)
    bigram_counts = Counter(bigrams)
    # Calculate the total number of words and bigrams
    total_words = sum(word_counts.values())
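The preview stops after the counting step. A minimal sketch of how such counts could feed a score, assuming the standard pointwise mutual information definition log2(p(x, y) / (p(x) * p(y))). The helper name `pmi_scores` and the sample sentence are hypothetical and do not come from the gist.

```python
import math
from collections import Counter


def pmi_scores(words):
    """Return {bigram: PMI} for adjacent word pairs (illustrative helper, not from the gist)."""
    bigrams = list(zip(words, words[1:]))
    word_counts = Counter(words)
    bigram_counts = Counter(bigrams)
    total_words = sum(word_counts.values())
    total_bigrams = sum(bigram_counts.values())
    scores = {}
    for (w1, w2), count in bigram_counts.items():
        p_xy = count / total_bigrams          # joint probability of the pair
        p_x = word_counts[w1] / total_words   # marginal probability of the first word
        p_y = word_counts[w2] / total_words   # marginal probability of the second word
        scores[(w1, w2)] = math.log2(p_xy / (p_x * p_y))
    return scores


# Hypothetical sample text
words = "new york is a big city and new york is busy".split()
scores = pmi_scores(words)
```

Raw PMI tends to favor pairs of rare words, so in practice a minimum bigram count is often applied before ranking.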
@ftfarias
ftfarias / linear_regression.sql
Last active December 27, 2022 18:46
how to do a linear regression in SQL
drop table if exists linear_test;
create table linear_test (
    x numeric(19,4),
    y numeric(19,4),
    workspace_id text
);
insert into linear_test (x,y, workspace_id) values (1, 10, 'a'), (2, 20, 'a'), (3, 30, 'a'), (4, 40, 'a'), (5, 50, 'a');
insert into linear_test (x,y, workspace_id) values (5, 10, 'b'), (6, 20, 'b'), (7, 30, 'b'), (8, 40, 'b'), (9, 50, 'b');
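The preview shows only the test data, not the regression query itself. As a cross-check for whatever the SQL computes, here is a small sketch of ordinary least squares per `workspace_id`, using the same rows. The function name `fit_by_group` is hypothetical.

```python
from collections import defaultdict

# Same rows as the two INSERT statements above: (x, y, workspace_id)
rows = [
    (1, 10, 'a'), (2, 20, 'a'), (3, 30, 'a'), (4, 40, 'a'), (5, 50, 'a'),
    (5, 10, 'b'), (6, 20, 'b'), (7, 30, 'b'), (8, 40, 'b'), (9, 50, 'b'),
]


def fit_by_group(rows):
    """Return {workspace_id: (slope, intercept)} via ordinary least squares."""
    groups = defaultdict(list)
    for x, y, key in rows:
        groups[key].append((x, y))
    result = {}
    for key, pts in groups.items():
        n = len(pts)
        mean_x = sum(x for x, _ in pts) / n
        mean_y = sum(y for _, y in pts) / n
        sxy = sum((x - mean_x) * (y - mean_y) for x, y in pts)
        sxx = sum((x - mean_x) ** 2 for x, _ in pts)
        slope = sxy / sxx
        intercept = mean_y - slope * mean_x
        result[key] = (slope, intercept)
    return result


fits = fit_by_group(rows)
```

For this data both groups have slope 10; group 'a' has intercept 0 and group 'b' has intercept -40, which a correct SQL version should reproduce.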
@ftfarias
ftfarias / docker-compose.yml
Created May 7, 2020 14:55
docker-compose with 3 kafka instances in cluster + 3 zookeepers
version: '2'
networks:
  kafka:
    driver: bridge
services:
  zookeeper-1:
    image: confluentinc/cp-zookeeper:latest
    hostname: zookeeper-1
@ftfarias
ftfarias / docker-compose.yml
Created April 6, 2020 20:53
docker-compose with Kafka and Zookeeper
version: '2'
networks:
  kafka:
    driver: bridge
services:
  zookeeper-1:
    image: confluentinc/cp-zookeeper:latest
    hostname: zookeeper-1
@ftfarias
ftfarias / python_nlp_packages.md
Created October 14, 2019 20:19 — forked from brianspiering/python_nlp_packages.md
A Hacker's Guide to Python string and Natural Language Processing (NLP) packages

Preprocessing

  • The Python Standard Library, especially the str methods and the string module, is powerful for text processing. Start there.
  • regex - Extends Python's Standard Library re module while remaining backwards-compatible.
  • chardet - Detects character encoding.
  • ftfy - Takes in bad Unicode and outputs good Unicode. Seriously automagical.
  • polyglot - Helpful for multilingual preprocessing.
@ftfarias
ftfarias / docker.txt
Created September 4, 2019 14:06
Docker / Kubernetes cheat sheet
# Purging All Unused or Dangling Images, Containers, Volumes, and Networks
docker system prune -a
# Remove all images
docker rmi $(docker images -a -q)
import copy
def verify_partial_valid_solution(A, w):
    if len(w[0]) > 2:
        return False
    w1 = sum(w[1])
    w2 = sum(w[2])
    w3 = sum(w[3])
    a = sum(A)
    if a + w1 < w2:
        return False
@ftfarias
ftfarias / small_funcs.py
Last active January 27, 2022 12:06
Small but practical functions
# parse iso time
from datetime import datetime
datetime.strptime('2019-04-17T09:02:16.00428Z', '%Y-%m-%dT%H:%M:%S.%fZ')
import itertools
def grouper(iterable, n):
    it = iter(iterable)
    while True:
        chunk = tuple(itertools.islice(it, n))
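The preview cuts off inside the loop. One common way to finish this chunking pattern, offered as an assumption rather than the gist's actual code, is to stop as soon as `islice` yields an empty chunk. The name `grouper_sketch` is hypothetical.

```python
import itertools


def grouper_sketch(iterable, n):
    """Yield successive tuples of up to n items; the last tuple may be shorter."""
    it = iter(iterable)
    while True:
        chunk = tuple(itertools.islice(it, n))
        if not chunk:
            # The iterator is exhausted, so end the generator
            return
        yield chunk


chunks = list(grouper_sketch(range(7), 3))
```

Because it pulls from a single shared iterator, this works on generators and other one-shot iterables as well as on lists.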
@ftfarias
ftfarias / AirflowCleaning.py
Created January 23, 2019 18:18
How to clean airflow dags
import sqlalchemy
import sys
def print_clear_dag(dag_id):
    """Clear all information for a DAG from airflow postgres database"""
    list_tables = ['xcom', 'task_instance', 'sla_miss', 'log', 'job', 'dag_run', 'dag',
                   'dag_stats', 'task_fail']
    for table in list_tables:
        queries = ["DELETE FROM {} WHERE dag_id='{}'".format(table, dag_id),
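The preview interpolates `dag_id` directly into the SQL string. A hedged sketch of a variant that keeps the same table list but passes `dag_id` as a bound parameter, assuming a driver that accepts the `%(name)s` placeholder style (psycopg2 does). The names `AIRFLOW_TABLES` and `build_clear_statements` are hypothetical, and the rest of the gist is not shown here.

```python
# Table names copied from the preview above; they come from a fixed list,
# so formatting them into the statement does not expose user input.
AIRFLOW_TABLES = ['xcom', 'task_instance', 'sla_miss', 'log', 'job', 'dag_run', 'dag',
                  'dag_stats', 'task_fail']


def build_clear_statements(dag_id):
    """Return (statement, params) pairs; dag_id is left for the driver to bind."""
    return [
        ("DELETE FROM {} WHERE dag_id = %(dag_id)s".format(table), {"dag_id": dag_id})
        for table in AIRFLOW_TABLES
    ]


# Hypothetical DAG id
statements = build_clear_statements("example_dag")
```

Each pair can then be handed to a cursor's `execute(statement, params)` call, so the driver handles quoting of the DAG id.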
--------------------------------- Process
SELECT * FROM pg_stat_activity WHERE state = 'active';
SELECT pg_cancel_backend(<pid of the process>);
SELECT pg_terminate_backend(<pid of the process>);
--------------------------------- LOCKS
Ref: https://wiki.postgresql.org/wiki/Lock_Monitoring
SELECT relation::regclass, * FROM pg_locks WHERE NOT GRANTED;