Martin Becker mgbckr

@mgbckr
mgbckr / cpu_mem.py
Created May 15, 2023 16:08
Get %CPU and %MEM for each process on Linux (using Python)
import pandas as pd
import subprocess
# 'ps aux'
output = subprocess.check_output(['ps', 'aux']).decode('utf-8')
# split lines
lines = output.split('\n')
# headers
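
The preview above stops at the header step. A minimal sketch of how the remaining parsing could look, assuming the usual `ps aux` layout in which the last column (COMMAND) may contain spaces and `%CPU`/`%MEM` are printed with a decimal point. `parse_ps_aux` is a hypothetical helper name, not taken from the gist, and it builds plain Python dicts rather than the pandas DataFrame the gist imports:

```python
def parse_ps_aux(output):
    """Parse the text printed by `ps aux` into a list of dicts, one per process."""
    lines = [line for line in output.split('\n') if line.strip()]
    # the first line holds the column names, e.g. USER PID %CPU %MEM ... COMMAND
    headers = lines[0].split()
    rows = []
    for line in lines[1:]:
        # split on whitespace but keep the last column (the command line) intact
        fields = line.split(None, len(headers) - 1)
        if len(fields) != len(headers):
            continue  # skip malformed lines
        row = dict(zip(headers, fields))
        # assumption: both columns exist and use '.' as the decimal separator
        row['%CPU'] = float(row['%CPU'])
        row['%MEM'] = float(row['%MEM'])
        rows.append(row)
    return rows
```

Passing the result to `pd.DataFrame(rows)` would give one row per process with the `ps` column names as headers.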

Script to configure displays via EDID rather than outputs with xrandr

Sometimes output designations change in xrandr, particularly when using a docking station (e.g., a Lenovo Thunderbolt 4 Dock). In those cases, scripts that use output names to configure a display setup fail. This script provides functions to use a screens.ini file (see example) with EDIDs rather than output names.

# get EDIDs for output names
bash screens.sh map
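
The mapping idea can be sketched in Python as follows. This is not the gist's `screens.sh`; it assumes the text comes from `xrandr --props`, where each output line (`NAME connected ...`) is followed by an `EDID:` property and indented hex lines. `map_outputs_to_edids` is a hypothetical name chosen for illustration:

```python
import re


def map_outputs_to_edids(xrandr_props_text):
    """Map xrandr output names (e.g. 'DP-1') to their EDID hex strings."""
    edids = {}
    current_output = None
    in_edid = False
    for line in xrandr_props_text.splitlines():
        header = re.match(r'^(\S+) (connected|disconnected)', line)
        if header:
            # a new output section starts here
            current_output = header.group(1)
            in_edid = False
            continue
        stripped = line.strip()
        if stripped == 'EDID:' and current_output is not None:
            in_edid = True
            edids[current_output] = ''
            continue
        if in_edid and re.fullmatch(r'[0-9a-fA-F]+', stripped):
            # EDID payload lines consist of hex digits only
            edids[current_output] += stripped.lower()
        else:
            in_edid = False
    return edids
```

Inverting the returned dict gives an EDID-to-output lookup, so a setup keyed by EDID can be translated back into whatever output names xrandr currently reports.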
@mgbckr
mgbckr / Dockerfile
Last active December 23, 2022 09:49
Dockerfile with JupyterLab and custom environment based on micromamba and mamba
# ATTENTION: This container is intended to run in ROOTLESS mode:
# https://docs.docker.com/engine/security/rootless/
# TODO:
# * check this for better approach to run in user space rather than root if rootless mode is not an option:
# https://github.com/jupyter/docker-stacks/blob/main/base-notebook/Dockerfile
# * secure Jupyter Lab
ARG BASE_IMAGE=ubuntu:22.04
FROM ${BASE_IMAGE}
# source: https://bcrf.biochem.wisc.edu/2021/02/05/download-google-drive-files-using-wget/
download() {
    # $1: Google Drive file ID, $2: output path
    FILEID=$1
    OUT=$2
    # first request: store the session cookies and extract the confirmation token
    confirm=$(wget --quiet --save-cookies ./tmp-cookies.txt --keep-session-cookies --no-check-certificate "https://docs.google.com/uc?export=download&id=${FILEID}" -O- | sed -rn 's/.*confirm=([0-9A-Za-z_]+).*/\1\n/p')
    # second request: download the file using the cookies and the token
    wget --load-cookies ./tmp-cookies.txt "https://docs.google.com/uc?export=download&confirm=${confirm}&id=${FILEID}" -O "${OUT}"
    rm -f ./tmp-cookies.txt
}
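
The token-extraction step of the function above can be sketched in Python. The two helpers are hypothetical names; they only mirror the `sed` expression and the second URL, and whether Google Drive still returns such a token is an assumption the source does not confirm:

```python
import re


def extract_confirm_token(html):
    """Return the last `confirm=<token>` value found in the page text, or None."""
    # same character class as the sed expression; taking the last hit mirrors
    # the greedy `.*confirm=` match on single-line input
    tokens = re.findall(r'confirm=([0-9A-Za-z_]+)', html)
    return tokens[-1] if tokens else None


def build_download_url(file_id, confirm):
    """Build the second-request URL used by the shell function."""
    return (
        "https://docs.google.com/uc?export=download"
        f"&confirm={confirm}&id={file_id}"
    )
```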
@mgbckr
mgbckr / _read_csv_joblib.md
Last active February 9, 2022 17:46
Reading a large CSV file via pandas and joblib. Probably degrades due to pd.concat usage. Tests and better function parameter definitions and documentation pending.

Parallelized pd.read_csv with joblib

A very objective test on a 5 GB CSV file (shape=()) ended with a "Kernel died" message when using pd.read_csv directly (run in a Jupyter notebook, repeated twice). In contrast, read_csv_joblib with the following settings returned in 3h 4m: Concatenating the row chunks took the longest.
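
The gist's `read_csv_joblib` is not shown in this preview. A minimal sketch of the general idea, assuming the number of data rows is known in advance and the file has a single header line; `read_csv_in_row_chunks` and `_read_rows` are hypothetical names, not the gist's implementation:

```python
import pandas as pd
from joblib import Parallel, delayed


def _read_rows(path, start, nrows, **kwargs):
    # skip the data rows before `start` but keep the header line (row 0)
    return pd.read_csv(path, skiprows=range(1, start + 1), nrows=nrows, **kwargs)


def read_csv_in_row_chunks(path, n_rows, chunk_size=100_000, n_jobs=2, **kwargs):
    """Read `n_rows` data rows of a CSV in parallel row chunks and concatenate them."""
    starts = range(0, n_rows, chunk_size)
    parts = Parallel(n_jobs=n_jobs)(
        delayed(_read_rows)(path, start, chunk_size, **kwargs) for start in starts
    )
    # the final concat copies every chunk once more, which fits the note
    # above that concatenation took the longest
    return pd.concat(parts, ignore_index=True)
```

Each worker still has to scan the file from the top to skip its leading rows, so this layout trades repeated scanning for smaller per-call memory use; column dtypes are inferred per chunk unless `dtype=` is passed through `kwargs`.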

@mgbckr
mgbckr / _nbconvert_hide_input.md
Last active January 31, 2022 23:21
Remove cells after execution when converting Jupyter Notebook to markdown via nbconvert by using custom templates

There is no postprocessor to exclude cells from the final output when converting Jupyter notebooks to markdown. One solution for this is to use custom templates as outlined in this gist.

Instructions:

  • Put the conf.json and index.md.j2 into a folder, for example ./templates/markdown2
  • In your Jupyter notebook, add the tag remove to the cells that should be removed after execution
  • Run the following command to execute and convert the notebook:
jupyter nbconvert notebook.ipynb --execute --to markdown --TemplateExporter.extra_template_basedirs=./templates --template markdown2 
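
The effect of the remove tag can also be sketched in plain Python on the notebook's JSON. This is an alternative illustration, not the gist's template: it assumes the notebook has already been executed and follows the nbformat 4 layout, where tags live in each cell's `metadata.tags`. `drop_cells_with_tag` is a hypothetical helper name:

```python
import copy


def drop_cells_with_tag(notebook, tag="remove"):
    """Return a copy of a notebook dict without the cells carrying `tag`."""
    result = copy.deepcopy(notebook)
    result["cells"] = [
        cell
        for cell in result.get("cells", [])
        # cells without metadata or tags are always kept
        if tag not in cell.get("metadata", {}).get("tags", [])
    ]
    return result
```

The notebook dict would come from `json.load` on the `.ipynb` file; after writing the filtered copy back with `json.dump`, a plain markdown conversion no longer contains the tagged cells.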
@mgbckr
mgbckr / test_memmap_memory_consumption.py
Last active November 29, 2021 18:19
Testing numpy's memmap memory consumption
# testing numpy's memmap memory consumption
# see: https://numpy.org/doc/stable/reference/generated/numpy.memmap.html
import os
import numpy as np
from memory_profiler import memory_usage
# benchmark parameters
memory_usage_kwargs = dict(
@mgbckr
mgbckr / prepare_references_for_science_publication.sh
Last active August 4, 2021 16:43
Create joint references across main text and supplement for Science publication in LaTeX
# Science publications want the citations to be numbered sequentially across the main text and then the supplement.
# Additionally all references (including those in the supplement) are to appear in the main text.
# Reference: https://www.sciencemag.org/authors/instructions-preparing-initial-manuscript#format-supplemental
# After struggling a bit with the `xr` package in Overleaf, I gave up and just scripted this quick and dirty hack.
#
# Instructions:
# * place `% CITATION HANDLE` right before your bibliography in the main file
# and right after `\begin{document}` in the supplement
# * run the file via `bash prepare_references_for_science_publication.sh`
#
@mgbckr
mgbckr / switch_bluetooth_profile.sh
Last active November 22, 2023 09:03
Method to improve audio quality for Bluetooth headsets on Linux (Fedora 34) using mSBC
#!/bin/bash
# source: https://bbs.archlinux.org/viewtopic.php?pid=1973004
# prepare by enabling the mSBC codec for HSP/HFP:
# https://wiki.archlinux.org/title/PipeWire#Low_audio_quality_on_Bluetooth
# the `bluez-monitor.conf` is located at:
# `/usr/share/pipewire/media-session.d/bluez-monitor.conf`
# note that the settings gui needs to be restarted after editing the file and calling
# `systemctl --user restart pipewire.service`
#msbc=`pactl list | grep Active | grep msbc`
@mgbckr
mgbckr / matplotlib_subplots_legend_bottom_center.py
Last active February 2, 2020 06:30
Matplotlib subplots legend bottom center; does not work in Jupyter but the output is fine
import matplotlib.pyplot as plt
import matplotlib.lines
# matplotlib.rcParams["legend.frameon"] = False # need to set this once to enable styling??? WTF???
titles = ["title 1", "title 2", "title 3", "title 4"]
colors = ['#ffd600', '#f44336', '#43a047', '#1e88e5', '#ab47bc', '#3f51b5', '#f57c00']
# init figure / plot
fig, axes = plt.subplots(1, len(titles), figsize=(5*len(titles) - 3, 5))
for i_t, ax in enumerate(axes):