Martin Becker mgbckr

## cpu_mem.py
import pandas as pd
import subprocess

# 'ps aux'
output = subprocess.check_output(['ps', 'aux']).decode('utf-8')

# split lines
lines = output.split('\n')

# headers
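The preview above stops at the header step. What follows is only a minimal sketch of how the header line and the process rows of `ps aux` text could be split, not the author's code. The function name `parse_ps_aux` and the sample text are hypothetical.

```python
def parse_ps_aux(output: str) -> list:
    """Turn the text printed by `ps aux` into one dict per process."""
    lines = [line.strip() for line in output.split("\n") if line.strip()]
    if not lines:
        return []
    # The first line holds the column names (USER, PID, %CPU, ...).
    headers = lines[0].split()
    rows = []
    for line in lines[1:]:
        # The last column (COMMAND) may contain spaces, so cap the number of splits.
        fields = line.split(None, len(headers) - 1)
        rows.append(dict(zip(headers, fields)))
    return rows


# Made-up sample in the usual `ps aux` layout.
SAMPLE = """USER         PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root           1  0.0  0.1 167744 11520 ?        Ss   09:12   0:01 /sbin/init splash
alice       2042  1.5  2.3 912340 94212 ?        Sl   09:14   0:42 python train.py --epochs 3
"""

rows = parse_ps_aux(SAMPLE)
print(rows[1]["PID"], rows[1]["%CPU"], rows[1]["%MEM"], rows[1]["COMMAND"])
```

The `output` string from the snippet above can be passed in place of `SAMPLE`, and the resulting list of dicts can be handed to `pd.DataFrame(rows)` if a table is wanted.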

## README.md

      
              3 files
            
          
              0 forks
            
          
              0 comments
            
          
              0 stars
            
          
                mgbckr
                / README.md
            
            
              Created
              February 1, 2023 20:29
            
          
    Script to configure displays via EDID rather than outputs with xrandr

Sometimes output designations change in xrandr, particularly when using a docking station (e.g., a Lenovo Thunderbolt 4 Dock).
In those cases, scripts that use output names to configure a display setup fail.
This script provides functions to use a screens.ini file (see example) with EDIDs rather than output names.
# get EDIDs for output names
bash screens.sh map
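To illustrate the idea of keying displays by EDID, here is a minimal Python sketch that maps EDIDs to output names from the text printed by `xrandr --verbose`. It is not the logic of `screens.sh`, which is not shown here. The function name `map_edids_to_outputs` and the shortened, made-up sample are assumptions.

```python
import re

# An output block starts with an unindented line such as "eDP-1 connected ...".
_OUTPUT_RE = re.compile(r"^(\S+) (connected|disconnected)\b")
_HEX_RE = re.compile(r"^[0-9a-fA-F]+$")


def map_edids_to_outputs(xrandr_verbose: str) -> dict:
    """Return {edid_hex: output_name} parsed from `xrandr --verbose` text."""
    edid_chunks = {}  # output name -> list of hex lines
    output = None     # output block currently being read
    in_edid = False   # True while reading the hex lines below "EDID:"
    for line in xrandr_verbose.splitlines():
        match = _OUTPUT_RE.match(line)
        if match:
            output, in_edid = match.group(1), False
        elif output is not None and line.strip() == "EDID:":
            in_edid = True
            edid_chunks[output] = []
        elif in_edid and _HEX_RE.match(line.strip()):
            edid_chunks[output].append(line.strip())
        else:
            in_edid = False
    return {"".join(chunks): name for name, chunks in edid_chunks.items() if chunks}


# Shortened, made-up sample; a real EDID block is longer.
SAMPLE = (
    "Screen 0: minimum 320 x 200, current 1920 x 1080, maximum 16384 x 16384\n"
    "eDP-1 connected primary 1920x1080+0+0 (normal left inverted right x axis y axis) 309mm x 174mm\n"
    "\tIdentifier: 0x42\n"
    "\tEDID: \n"
    "\t\t00ffffffffffff0006af3d5700000000\n"
    "\t\t001c0104a51f1178022285a5544d9a27\n"
    "\tscaling mode: Full aspect\n"
    "DP-3 disconnected (normal left inverted right x axis y axis)\n"
    "\tIdentifier: 0x43\n"
)

for edid, name in map_edids_to_outputs(SAMPLE).items():
    print(name, edid[:16] + "...")
```

With such a mapping, a configuration keyed by EDID can be translated to whatever output name the display currently has before calling xrandr.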

  
## Dockerfile
# ATTENTION: This container is intended to run in ROOTLESS mode:
# https://docs.docker.com/engine/security/rootless/

# TODO:
# * check this for better approach to run in user space rather than root if rootless mode is not an option:
#   https://github.com/jupyter/docker-stacks/blob/main/base-notebook/Dockerfile
# * secure Jupyter Lab

ARG BASE_IMAGE=ubuntu:22.04
FROM ${BASE_IMAGE}

## gdrive_download_file.sh
# source: https://bcrf.biochem.wisc.edu/2021/02/05/download-google-drive-files-using-wget/
download() {
  local FILEID=$1
  local OUT=$2
  local confirm

  # first request: store the session cookies and extract the confirmation token
  confirm=$(wget --quiet --save-cookies ./tmp-cookies.txt --keep-session-cookies --no-check-certificate "https://docs.google.com/uc?export=download&id=${FILEID}" -O- | sed -rn 's/.*confirm=([0-9A-Za-z_]+).*/\1\n/p')
  # second request: download the file with the stored cookies and the token
  wget --load-cookies ./tmp-cookies.txt "https://docs.google.com/uc?export=download&confirm=${confirm}&id=${FILEID}" -O "${OUT}"
  rm -f ./tmp-cookies.txt
}
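The token extraction done by `sed` in the function above can be sketched in Python as follows. This only mirrors the regular-expression step. The helper name `extract_confirm_token` and the sample HTML are made up, and whether the download page still contains such a token is not verified here.

```python
import re

# Same character class as the sed expression: letters, digits, underscore.
_CONFIRM_RE = re.compile(r"confirm=([0-9A-Za-z_]+)")


def extract_confirm_token(html: str):
    """Return the first `confirm=` token found in the page text, or None."""
    match = _CONFIRM_RE.search(html)
    return match.group(1) if match else None


# Made-up response fragment for illustration.
page = '<a href="/uc?export=download&amp;confirm=AbC_9&amp;id=FILE123">Download anyway</a>'
print(extract_confirm_token(page))
```

Unlike the greedy `sed` pattern, which keeps the last match on each line, this helper returns the first match in the whole text.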

## _read_csv_joblib.md

      
              2 files
            
          
              0 forks
            
          
              0 comments
            
          
              0 stars
            
          
                mgbckr
                / _read_csv_joblib.md
            
            
              Last active
              February 9, 2022 17:46
            
              
                Reading a large CSV file via pandas and joblib. Probably degrades due to pd.concat usage. Tests and better function parameter definitions and documentation pending. 
              
          
    Parallelized pd.read_csv with joblib

Reading a large CSV file via pandas and joblib. Performance probably degrades due to the pd.concat usage.
Tests, better function parameter definitions, and documentation are pending.
A very objective test on a 5GB CSV file (shape=()) resulted in a "Kernel died" message when using pd.read_csv directly (it was run in a Jupyter notebook and repeated twice).
In contrast, using read_csv_joblib with the following settings returned in 3h 4m:
Concatenating the row chunks took the longest.
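The general approach can be sketched as below: split the file into row ranges, read each range with `pd.read_csv` in a joblib worker, and concatenate the pieces. This is not the gist's `read_csv_joblib`, whose code and settings are not shown here. The name `read_csv_row_chunks` and its parameters are hypothetical, and the sketch assumes one record per physical line (no quoted newlines) and no trailing blank lines.

```python
import pandas as pd
from joblib import Parallel, delayed


def _read_rows(path, first_line, n_rows, names):
    """Read `n_rows` records, skipping the first `first_line` physical lines."""
    return pd.read_csv(path, header=None, names=names,
                       skiprows=first_line, nrows=n_rows)


def read_csv_row_chunks(path, chunk_rows=100_000, n_jobs=2):
    """Read a CSV file in row chunks with joblib workers and concatenate them."""
    # Column names come from the header line only.
    names = list(pd.read_csv(path, nrows=0).columns)
    # Count the data rows (all lines minus the header).
    with open(path, "rb") as handle:
        n_data_rows = sum(1 for _ in handle) - 1
    if n_data_rows <= 0:
        return pd.read_csv(path)
    chunks = Parallel(n_jobs=n_jobs)(
        delayed(_read_rows)(path, 1 + start, chunk_rows, names)
        for start in range(0, n_data_rows, chunk_rows)
    )
    # The concatenation step is the part the text above reports as slowest.
    return pd.concat(chunks, ignore_index=True)
```

A call such as `read_csv_row_chunks("data.csv", chunk_rows=1_000_000, n_jobs=4)` would then return one DataFrame. Because each chunk infers its dtypes independently, passing explicit dtypes would be a sensible extension for real data.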

  
## _nbconvert_hide_input.md

      
              3 files
            
          
              0 forks
            
          
              0 comments
            
          
              0 stars
            
          
                mgbckr
                / _nbconvert_hide_input.md
            
            
              Last active
              January 31, 2022 23:21
            
              
                Remove cells after execution when converting Jupyer Notebook to markdown via nbconvert by using custom templates
              
          
There is no postprocessor to exclude cells from the final output when converting Jupyter notebooks to markdown. One solution is to use custom templates, as outlined in this gist.

Instructions:

1. Put the conf.json and index.md.j2 into a folder, for example ./templates/markdown2.
2. In your Jupyter notebook, add the tag remove to the cells that should be removed after execution.
3. Run the following command to execute and convert the notebook:

jupyter nbconvert notebook.ipynb --execute --to markdown --TemplateExporter.extra_template_basedirs=./templates --template markdown2 
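For comparison, a different technique than the template approach of this gist is to filter the already executed notebook file directly. The following minimal sketch drops every cell whose metadata carries the tag `remove`. The function name `drop_tagged_cells` and the sample notebook are made up for illustration.

```python
import json


def drop_tagged_cells(notebook: dict, tag: str = "remove") -> dict:
    """Return a copy of the notebook without the cells that carry `tag`."""
    kept = [
        cell
        for cell in notebook.get("cells", [])
        if tag not in cell.get("metadata", {}).get("tags", [])
    ]
    return {**notebook, "cells": kept}


# Made-up, minimal notebook in the nbformat 4 layout.
SAMPLE_NOTEBOOK = json.loads("""
{
  "nbformat": 4,
  "nbformat_minor": 5,
  "metadata": {},
  "cells": [
    {"cell_type": "code", "metadata": {"tags": ["remove"]}, "source": "import os", "outputs": [], "execution_count": 1},
    {"cell_type": "markdown", "metadata": {}, "source": "# Results"},
    {"cell_type": "code", "metadata": {"tags": ["plot"]}, "source": "print(1)", "outputs": [], "execution_count": 2}
  ]
}
""")

cleaned = drop_tagged_cells(SAMPLE_NOTEBOOK)
print([cell["source"] for cell in cleaned["cells"]])
```

The cleaned dict can be written back with `json.dump` and converted without `--execute`, at the cost of an extra step compared to the single nbconvert command above.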

  
## test_memmap_memory_consumption.py
# testing numpy's memmap memory consumption
# see: https://numpy.org/doc/stable/reference/generated/numpy.memmap.html

import os
import numpy as np
from memory_profiler import memory_usage


# benchmark parameters
memory_usage_kwargs = dict(

## prepare_references_for_science_publication.sh
# Science publications want the citations to be numbered sequentially across the main text and then the supplement.
# Additionally all references (including those in the supplement) are to appear in the main text.
# Reference: https://www.sciencemag.org/authors/instructions-preparing-initial-manuscript#format-supplemental
# After struggling a bit with the `xr` package in Overleaf, I gave up and just scripted this quick and dirty hack.
#
# Instructions:
# * place `% CITATION HANDLE` right before your bibliography in the main file
#   and right after `\begin{document}` in the supplement
# * run the file via `bash prepare_references_for_science_publication.sh`
#

## switch_bluetooth_profile.sh
#!/bin/bash
# source: https://bbs.archlinux.org/viewtopic.php?pid=1973004
# prepare via enabling mSBC codec for HSP/HFP:
# https://wiki.archlinux.org/title/PipeWire#Low_audio_quality_on_Bluetooth
# the `bluez-monitor.conf` is located at:
# `/usr/share/pipewire/media-session.d/bluez-monitor.conf`
# note that the settings gui needs to be restarted after editing the file and calling
# `systemctl --user restart pipewire.service`

#msbc=`pactl list | grep Active | grep msbc`

## matplotlib_subplots_legend_bottom_center.py
import matplotlib.pyplot as plt
import matplotlib.lines

# matplotlib.rcParams["legend.frameon"] = False # need to set this once to enable styling??? WTF???
titles = ["title 1", "title 2", "title 3", "title 4"]
colors = ['#ffd600', '#f44336', '#43a047', '#1e88e5', '#ab47bc', '#3f51b5', '#f57c00']

# init figure / plot
fig, axes = plt.subplots(1, len(titles), figsize=(5*len(titles) - 3, 5))
for i_t, ax in enumerate(axes):