Álvaro Justen turicas

@turicas
turicas / create-tags
Created April 4, 2024 22:16
Create ctags-like file using tree-sitter and awk
#!/bin/bash
filename="$1"
if [[ -z $filename ]]; then
    echo "ERROR - usage: $0 <filename>"
    exit 1
fi
transformation_code='
@turicas
turicas / cidade_exterior.csv
Last active February 9, 2024 02:13
Script that creates sample data about people and companies for use in demonstrations
cidade_exterior
ABLASSERDAN
A CORUNA
ACRA
ADDISON
AGUEDA
AICHI-KEN
ALAMEDA
ALBLASSERDAM
ALCALA DE HENARES
@turicas
turicas / gist:27aa6a137f3e239c1535818fe82799d5
Created September 20, 2023 17:22
Python's timezoned now() (using only standard library)
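import datetime
import time


def timezoned_now():
    """Equivalent to `datetime.datetime.now()` but replaces `tzinfo` with local timezone offset"""
    # time.timezone and time.altzone are seconds WEST of UTC, hence the sign flip below
    offset = time.timezone if time.localtime().tm_isdst == 0 else time.altzone
    delta = datetime.timedelta(seconds=-offset)
    timezone = datetime.timezone(offset=delta)
    # Assumed final step (the preview is cut off here): attach the offset to the local time
    return datetime.datetime.now().replace(tzinfo=timezone)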
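For comparison, the standard library can also produce a timezone-aware local time without computing the offset by hand: calling `astimezone()` with no arguments on a naive `datetime` attaches the system's local timezone. The sketch below is an alternative under that assumption, not the gist's own method, and the function name `local_now` is hypothetical.

```python
import datetime


def local_now():
    """Timezone-aware local time (hypothetical helper, not from the gist)."""
    # astimezone() with no argument treats the naive value as local time
    # and attaches the system's local UTC offset as tzinfo
    return datetime.datetime.now().astimezone()


now = local_now()
print(now.isoformat())   # ISO 8601 string including the UTC offset
print(now.utcoffset())   # local offset as a timedelta
```

Either approach yields an aware `datetime`; the manual version makes the offset arithmetic explicit, while `astimezone()` leaves the DST decision to the platform.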
@turicas
turicas / rescue.sh
Created May 4, 2023 03:51
Copy all the possible files from a failing disk (I/O error)
#!/bin/bash
# XXX: replace /dev/sdb1 with the partition you'd like to rescue
sudo apt update && sudo apt install -y gddrescue
sudo time ddrescue /dev/sdb1 /path/to/file.raw /path/to/ddrescue.log --try-again --force --verbose
sudo mkdir -p /mnt/rescued
sudo mount -o ro,loop /path/to/file.raw /mnt/rescued
cd /mnt/rescued # enjoy :)
@turicas
turicas / srt_unique.py
Created May 4, 2023 00:05
Filter repeated captions on SRT files
# pip install srt
import argparse
import srt
parser = argparse.ArgumentParser()
parser.add_argument("input_filename")
parser.add_argument("output_filename")
args = parser.parse_args()
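The preview stops before the filtering logic, so the following is only a sketch of one plausible reading of "repeated captions": consecutive entries with identical text are merged into a single entry whose end time is extended. To stay dependency-free it uses a hypothetical `Caption` tuple instead of the `srt` library's own objects, and `drop_repeated` is an illustrative name, not the gist's.

```python
from datetime import timedelta
from typing import Iterable, List, NamedTuple


class Caption(NamedTuple):
    # Hypothetical stand-in for a parsed SRT entry
    start: timedelta
    end: timedelta
    content: str


def drop_repeated(captions: Iterable[Caption]) -> List[Caption]:
    """Merge consecutive captions whose text is identical."""
    result: List[Caption] = []
    for caption in captions:
        text = caption.content.strip()
        if result and result[-1].content.strip() == text:
            # Same text as the previous caption: keep one entry, extend its end
            previous = result[-1]
            result[-1] = previous._replace(end=max(previous.end, caption.end))
        else:
            result.append(caption)
    return result


captions = [
    Caption(timedelta(seconds=0), timedelta(seconds=2), "Hello"),
    Caption(timedelta(seconds=2), timedelta(seconds=4), "Hello"),
    Caption(timedelta(seconds=4), timedelta(seconds=6), "World"),
]
for caption in drop_repeated(captions):
    print(caption.start, caption.end, caption.content)
```

Only adjacent duplicates are merged here, so a line that legitimately recurs later in the file is kept.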
@turicas
turicas / pdf_plot.py
Created April 26, 2023 00:22
Plot PDF text/rect objects using rows + Pillow
# pip install pillow cached-property pdfminer.six https://github.com/turicas/rows/archive/develop.zip
import argparse
from rows.plugins.plugin_pdf import (
    RectObject,
    TextObject,
    PDFMinerBackend,
    group_objects,
    YGroupsAlgorithm,
    plot_objects,
@turicas
turicas / Transcrição de textos em Português com whisper (OpenAI).ipynb
Last active May 8, 2024 16:10
Transcription of Portuguese texts with whisper (OpenAI)
@turicas
turicas / README.md
Last active March 5, 2024 15:08
View to get postgres table sizes (table, toast, index and total size for each one)

postgres table sizes

The result will be like this:

 schema |                table                | row_estimate  | total_size | index_size | toast_size | table_size | table_size_ratio | avg_row_size | total_bytes | index_bytes | toast_bytes | table_bytes 
--------+-------------------------------------+---------------+------------+------------+------------+------------+------------------+--------------+-------------+-------------+-------------+-------------
 public | estabelecimento                     | 5.4933384e+07 | 17 GB      | 4901 MB    | 8192 bytes | 12 GB      |             0.72 |       332.22 | 18249842688 |  5139447808 |        8192 | 13110386688
 public | empresa                             |  5.211064e+07 | 9883 MB    | 3985 MB    | 8192 bytes | 5898 MB    |             0.60 |       198.87 | 10363297792 |  4178649088 |        8192 |  6184640512
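The view definition is not shown in this preview, but the derived columns can be checked against the raw byte counts. The sketch below assumes, based only on the two sample rows, that `table_size_ratio` is `table_bytes / total_bytes`, that `avg_row_size` is `total_bytes / row_estimate`, and that the index, TOAST and table bytes add up to `total_bytes`. The helper name `derived_columns` is hypothetical.

```python
# Byte counts and row estimates copied from the sample output above
sample = {
    "estabelecimento": {
        "row_estimate": 5.4933384e07,
        "total_bytes": 18249842688,
        "index_bytes": 5139447808,
        "toast_bytes": 8192,
        "table_bytes": 13110386688,
    },
    "empresa": {
        "row_estimate": 5.211064e07,
        "total_bytes": 10363297792,
        "index_bytes": 4178649088,
        "toast_bytes": 8192,
        "table_bytes": 6184640512,
    },
}


def derived_columns(row):
    """Recompute the ratio and average row size from the raw byte counts."""
    parts = row["index_bytes"] + row["toast_bytes"] + row["table_bytes"]
    return {
        "table_size_ratio": round(row["table_bytes"] / row["total_bytes"], 2),
        "avg_row_size": round(row["total_bytes"] / row["row_estimate"], 2),
        "parts_sum_to_total": parts == row["total_bytes"],
    }


for name, row in sample.items():
    print(name, derived_columns(row))
```

Under these assumptions both rows reproduce the displayed values, which suggests `avg_row_size` counts index and TOAST bytes as well as table data.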
@turicas
turicas / README.md
Created January 23, 2023 22:20
How to copy data quickly to MinIO

How to copy data quickly to MinIO

Throughput with s3cmd and aws-cli can be very slow, and s4cmd and s5cmd are not easy to install in every environment. If you're using MinIO to host your files, use mc instead.

# Get the latest `mc` version:
docker pull minio/mc