Jarkko Saltiola jasalt

jasalt / pdfplumber-tesseract.py
Created January 26, 2021 06:25
Extract tables from pdf using pdfplumber and pytesseract
# Extracting tabular data from pdf using Python pdfplumber together with Tesseract OCR
# Author Jarkko Saltiola 2021 (MIT License, Python 3.8.6)
# Pdfplumber, tabula, camelot and probably some other PDF parser utilities have a hard
# time parsing tables whose column data overlaps other columns, and
# probably in many other cases too.
# Pdfplumber gives a good level of control for splitting a PDF into parts, which can
# be read with its methods or passed to pytesseract as a PIL image.
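The approach described in the comments above can be sketched roughly as follows. This is a minimal sketch, not the gist's actual code: the helper names `column_bboxes` and `ocr_columns` and the `x_splits` parameter are hypothetical, and the split positions are assumed to be known in advance (for example, measured by hand from the PDF). It relies on pdfplumber's `Page.crop` and `Page.to_image` and on `pytesseract.image_to_string`.

```python
from typing import List, Tuple

# pdfplumber bounding boxes are ordered (x0, top, x1, bottom).
BBox = Tuple[float, float, float, float]


def column_bboxes(page_bbox: BBox, x_splits: List[float]) -> List[BBox]:
    """Split a page bounding box into vertical strips at the given x positions.

    Hypothetical helper: split positions outside the page are ignored.
    """
    x0, top, x1, bottom = page_bbox
    edges = [x0] + sorted(x for x in x_splits if x0 < x < x1) + [x1]
    return [(left, top, right, bottom) for left, right in zip(edges, edges[1:])]


def ocr_columns(pdf_path: str, page_number: int, x_splits: List[float],
                resolution: int = 300) -> List[str]:
    """Crop each column strip of one page and read it with Tesseract."""
    # Imported lazily so column_bboxes works without the OCR stack installed.
    import pdfplumber
    import pytesseract

    texts = []
    with pdfplumber.open(pdf_path) as pdf:
        page = pdf.pages[page_number]
        for bbox in column_bboxes(page.bbox, x_splits):
            cropped = page.crop(bbox)
            # .original is the rendered strip as a PIL image.
            image = cropped.to_image(resolution=resolution).original
            texts.append(pytesseract.image_to_string(image))
    return texts
```

Cropping each column separately before OCR keeps text that spills past a column boundary from being merged into its neighbour, at the cost of having to choose the split positions yourself.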
#! /usr/bin/env python3
import toga
from toga.style.pack import CENTER, COLUMN, ROW, Pack
import asyncio
# Both fail on Windows 10 with Python 3.9.7 and 3.8.6.
# Toga does not install on 3.9.1 because pythonnet is not supported there yet.
#!/bin/bash
# Adjust homeserver, room, and accesstoken to your particular setup
# The script expects data to be piped in on STDIN
# Example:
# echo "some text" | sendmatrix
msgtype=m.text
homeserver=<homeserver>
room=<room id>
# Record frames from a public camera in Jyväskylä city centre for timelapse purposes etc.
# Uses threads, probably to avoid missing frames; a non-threaded version is commented out at the bottom.
# Written in 2016 during a random student event in the city centre.
# Demo video https://youtu.be/tMSrBDwmtUo compiled with ffmpeg.
# It stutters a bit because duplicate frames are not processed.
import requests
from io import open as iopen
from urlparse import urlsplit
jasalt / gist:bd29e4fc773a5b8f6c0faeb7b4bd3ba7
Created June 28, 2020 20:07 — forked from jaukia/gist:1d41a0045ab8e9f411ff
Most active public GitHub users in Finland — Feb 2015

Most active public GitHub users in Finland

The count of contributions (summary of Pull Requests, opened issues and commits) to public repos at GitHub.com from Wed, 12 Feb 2014 13:09:28 GMT till Thu, 12 Feb 2015 13:09:28 GMT.

Only the first 1000 GitHub users, ranked by follower count, are included. This is due to limitations of GitHub search. Sorting algorithm in pseudocode:

githubUsers
 .filter((user) -> user.followers > 11)
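
The pseudocode excerpt shows only the follower filter. As a rough Python sketch of how such a ranking could continue, the function below applies that filter and then orders users by contribution count. The sort step, the `rank_users` name, and the `followers` and `contributions` keys are assumptions for illustration, not taken from the original gist.

```python
from typing import Dict, List


def rank_users(users: List[Dict], min_followers: int = 11) -> List[Dict]:
    """Keep users with more than min_followers, most contributions first."""
    # Filter step, mirroring the pseudocode: user.followers > 11.
    eligible = [user for user in users if user["followers"] > min_followers]
    # Assumed step: rank the remaining users by contribution count, descending.
    return sorted(eligible, key=lambda user: user["contributions"], reverse=True)


# Made-up sample data, purely for illustration.
sample = [
    {"login": "a", "followers": 5, "contributions": 900},
    {"login": "b", "followers": 12, "contributions": 300},
    {"login": "c", "followers": 40, "contributions": 700},
]
ranked = rank_users(sample)
```

With this sample, user `a` is dropped despite the high contribution count because the follower threshold is applied before ranking.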

Oh my zsh.

Install with curl

sh -c "$(curl -fsSL https://raw.githubusercontent.com/robbyrussell/oh-my-zsh/master/tools/install.sh)"

Enabling Plugins (zsh-autosuggestions & zsh-syntax-highlighting)

  • Download zsh-autosuggestions by
jasalt / Instructions.sh
Created June 7, 2018 16:10 — forked from GhazanfarMir/Instructions.sh
Install PHP7.2 NGINX and PHP7.2-FPM on Ubuntu 16.04
########## Install NGINX ##############
# Install software-properties-common package to give us add-apt-repository package
sudo apt-get install -y software-properties-common
# Install latest nginx version from community maintained ppa
sudo add-apt-repository ppa:nginx/stable
# Update packages after adding ppa
jasalt / gist:0a684cf0bb620bf95f02582f6dbc1a97
Created September 19, 2017 16:33
Making money is killing your business highlights
==========
Making Money is Killing Your Business (Blakeman, Chuck;Seeling, Caleb)
- Added on Thursday, June 19, 2014 10:07:49 AM
We’re too busy making money to get to the important stuff. As a result everything is backwards. We build a business and take whatever lifestyle that business happens to throw off for us, which at best usually involves having money, but rarely a lot of time, and almost never significance. This isn’t surprising because “he who makes the rules wins,” and we too often let our business and the business world around us make the rules for us.
==========
- Added on Thursday, June 19, 2014 10:23:04 AM
If we gain clarity about where we are and where we want to go, and a measure of clarity about the first few steps to get there, that gives us hope. Not hope in the modern sense of wishing, but hope in the correct sense of the word—believing with conviction that I can do it.
==========
jasalt / dldjbb.py
Last active September 26, 2016 10:15 — forked from kamiheku/djbb.py
Simple Break Beat Paradise downloader
#!/usr/bin/env python3
# Download all zip files from a single bbp page, e.g.:
# ./djbb.py http://www.breakbeat-paradise.com/samplesite/bb_synth.php
import requests
import re
import shutil
import sys