Simone Guardati pistocop

@pistocop
pistocop / main.py
Created June 6, 2023 10:26
[template][py] Minimal script template using standard libraries
"""
<Script description here>
"""
import argparse
import logging
import sys
try:
    import requests  # example of required library
except ImportError:
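The preview above is cut off at the `except ImportError:` line. As a rough illustration of how a template built on `argparse`, `logging`, and `sys` could be laid out, here is a minimal sketch. It is not the gist's actual content: the `--verbose` flag, the `parse_args` and `main` helpers, and the choice to fall back to `requests = None` are all assumptions made for this example.

```python
"""Illustrative script template (a sketch, not the original gist)."""
import argparse
import logging
import sys

try:
    import requests  # example of required library
except ImportError:
    # Assumption: degrade gracefully; the original handler is not shown.
    requests = None


def parse_args(argv=None):
    """Parse command-line arguments; --verbose is a hypothetical flag."""
    parser = argparse.ArgumentParser(description=__doc__)
    parser.add_argument("--verbose", action="store_true", help="enable debug logging")
    return parser.parse_args(argv)


def main(argv=None):
    """Configure logging, report a missing optional dependency, return an exit code."""
    args = parse_args(argv)
    logging.basicConfig(
        level=logging.DEBUG if args.verbose else logging.INFO,
        format="%(asctime)s %(levelname)s %(message)s",
    )
    if requests is None:
        logging.warning("requests is not installed; network features are disabled")
    logging.info("script started")
    return 0
```

To use it as a standalone script, one would typically end the file with `sys.exit(main())` under the usual `if __name__ == "__main__":` guard.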
@pistocop
pistocop / script.sh
Last active September 8, 2023 09:12
bash script template
#!/usr/bin/env bash
# Template from: https://sharats.me/posts/shell-script-best-practices/
set -o errexit
set -o nounset
set -o pipefail
if [[ "${TRACE-0}" == "1" ]]; then
    set -o xtrace
fi
@pistocop
pistocop / es8.2.2-debian-installer.sh
Created June 9, 2022 15:14
Install Elasticsearch client on Debian VM and launch the service using systemctl (from official documentation)
#!/usr/bin/env bash
# From official documentation: https://www.elastic.co/guide/en/elasticsearch/reference/current/deb.html
sudo apt install wget
wget -qO - https://artifacts.elastic.co/GPG-KEY-elasticsearch | sudo gpg --dearmor -o /usr/share/keyrings/elasticsearch-keyring.gpg
wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-8.2.2-amd64.deb
sudo dpkg -i elasticsearch-8.2.2-amd64.deb
sudo /bin/systemctl daemon-reload
@pistocop
pistocop / debian-python.sh
Last active March 29, 2022 09:57
debian python dependencies installation
#!/usr/bin/env bash
# Code from:
# https://www.codegrepper.com/code-examples/whatever/python+++++ModuleNotFoundError%3A+No+module+named+%27_ctypes%27
sudo apt-get update
sudo apt-get upgrade
sudo apt-get dist-upgrade
sudo apt-get install build-essential python3-dev python3-setuptools python3-pip python3-smbus
sudo apt-get install libncursesw5-dev libgdbm-dev libc6-dev
@pistocop
pistocop / [medium][garrascobike] recommendation trainer
Created March 16, 2022 12:38
[medium][garrascobike] recommendation trainer
$ python garrascobike/04_recommendation_trainer.py --presence_data_path ./data/03_correlation_data/presence_dataset/20210331124802/presences.csv \
    --output_path ./data/04_recommendation_models/knn/
@pistocop
pistocop / [medium][garrascobike] correlation extraction
Created March 16, 2022 12:37
[medium][garrascobike] correlation extraction
$ python garrascobike/03_correlation_extraction.py --es_host localhost \
    --es_port 9200 \
    --es_index_list my_index opt_my_2nd_index
@pistocop
pistocop / [medium][garrascobike] es uploader
Last active March 16, 2022 12:37
[medium][garrascobike] es uploader
# folder garrascobike-core
$ python garrascobike/02_es_uploader.py --es_index my_index \
    --es_host http://localhost \
    --es_port 9200 \
    --input_file ./data/02_entities_extractions/extraction.parquet
@pistocop
pistocop / [medium][subreddit-downloader]dataset-builder.sh
Created February 11, 2021 20:24
[medium][subreddit-downloader]dataset-builder.sh
# Build the dataset, the results will be under `./dataset/` path
$ python src/dataset_builder.py
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:00<00:00, 84.56it/s]
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 16/16 [00:00<00:00, 348.01it/s]
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 963.54it/s]
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:00<00:00, 23.11it/s]
$ ls dataset/20210211210341
comments.csv submissions.csv
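To sanity-check the two CSV files listed above, something like the following sketch could be used. It assumes each file starts with a header row, which the listing does not confirm, and it makes no assumption about the column names. The helpers `summarize_csv` and `summarize_dataset` are hypothetical names introduced for this example.

```python
import csv
from pathlib import Path


def summarize_csv(path):
    """Return (column names, number of data rows) for a CSV file.

    Assumption: the first row of the file is a header.
    """
    with Path(path).open(newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        header = next(reader, [])  # empty list if the file has no rows
        row_count = sum(1 for _ in reader)
    return header, row_count


def summarize_dataset(dataset_dir):
    """Summarize every CSV file directly under dataset_dir, keyed by file name."""
    return {
        csv_path.name: summarize_csv(csv_path)
        for csv_path in sorted(Path(dataset_dir).glob("*.csv"))
    }
```

For the run shown above, `summarize_dataset("dataset/20210211210341")` would return one entry each for `comments.csv` and `submissions.csv`.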
@pistocop
pistocop / [medium][subreddit-downloader]example.sh
Last active February 11, 2021 19:36
[medium][subreddit-downloader]example.sh
# Init
$ git clone https://github.com/pistocop/subreddit-comments-dl.git
$ cd subreddit-comments-dl
$ pip install -r requirements.txt
# Download the AskReddit comments of the last 30 submissions
$ python src/subreddit_downloader.py AskReddit --batch-size 10 --laps 3 --reddit-id <reddit_id> --reddit-secret <reddit_secret> --reddit-username <reddit_username>
2021-02-11 19:54:44.175 | INFO | __main__:main:241 - Start download: UTC range: [None, None], direction: `before`, batch size: 10, total submissions to fetch: 30
2021-02-11 19:54:49.769 | INFO | codetiming._timer:stop:57 - Lap 0/3 completed in 0.1m | [new/tot]: 0/0
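The log line above reports 30 submissions to fetch for `--batch-size 10 --laps 3`, which suggests the total is simply the batch size multiplied by the number of laps. A small sketch of that arithmetic follows. The relationship is inferred from the log output, not taken from the downloader's source, and `total_submissions` is a hypothetical helper.

```python
def total_submissions(batch_size: int, laps: int) -> int:
    """Submissions requested overall, assuming each lap fetches one full batch."""
    if batch_size < 0 or laps < 0:
        raise ValueError("batch_size and laps must be non-negative")
    return batch_size * laps


# Values from the command shown above: --batch-size 10 --laps 3
print(total_submissions(batch_size=10, laps=3))  # 30
```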
@pistocop
pistocop / [medium][subreddit-downloader]simple-scraper.sh
Last active February 11, 2021 17:44
subreddit-text-downloader simple scrape
# Init
$ git clone https://github.com/pistocop/subreddit-comments-dl.git
$ cd subreddit-comments-dl
$ pip install -r requirements.txt
# Download the AskReddit comments of the last 30 submissions
$ python src/subreddit_downloader.py AskReddit --batch-size 10 --laps 3 --reddit-id <reddit_id> --reddit-secret <reddit_secret> --reddit-username <reddit_username>