@mjbommar
mjbommar / pystat.py
Created August 11, 2022 16:05
pystat - a very old python relic from 2005
#--#--------
# pystat
# A full-featured statistical analysis
# and inference module for Python.
# Intended to be used alongside the
# numarray package, the successor to
# Numeric.
#
# Author: Michael Bommarito
# michael.bommarito@gmail.com
# Copyright: Licensio, LLC 2022
# License: AGPL-3.0
import argparse
import multiprocessing
import os
import subprocess
import pandas
@mjbommar
mjbommar / dynamic_import.py
Created April 1, 2022 16:19
example of dynamic import
import importlib
import subprocess
import requests
if __name__ == "__main__":
    for library in requests.get("https://licens.io/dynamic_import_example.txt").text.splitlines():
        # get name if version provided
        library_name = library.split("=")[0].strip()
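The preview above stops after the library name is parsed, so the rest of the gist is not visible here. As a rough sketch of how a dynamic import step could continue from that point, the hypothetical helper below (`import_or_install` is not a name from the gist) tries `importlib.import_module` first and falls back to installing with pip. It assumes the distribution name matches the importable module name, which is not true for every package.

```python
import importlib
import subprocess
import sys


def import_or_install(requirement):
    """Import the module named in a requirement line, installing it if missing.

    Hypothetical helper: assumes the pip distribution name is also the
    importable module name (e.g. "requests==2.27.1" -> "requests").
    """
    # Strip an optional version pin, mirroring the split("=") in the gist.
    library_name = requirement.split("=")[0].strip()
    try:
        return importlib.import_module(library_name)
    except ImportError:
        # Install with the same interpreter that is running this script.
        subprocess.check_call(
            [sys.executable, "-m", "pip", "install", requirement.strip()]
        )
        # Make the import system notice the newly installed package.
        importlib.invalidate_caches()
        return importlib.import_module(library_name)


if __name__ == "__main__":
    # "json" ships with Python, so no install is triggered here.
    module = import_or_install("json==2.0.9")
    print(module.__name__)
```

Installing whatever a remote text file lists runs third-party code on the local machine, so a pattern like this only makes sense when the list comes from a trusted source.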
@mjbommar
mjbommar / ee_subsidiary_new_industry_2021.csv
Created March 18, 2022 13:20
Number of new filers with eastern European exposure in 2021 filing year by industry
industry,count
Services-Prepackaged Software,16
Real Estate Investment Trusts,11
"Services-Business Services, NEC",7
"Services-Computer Programming, Data Processing, Etc.",6
Pharmaceutical Preparations,6
Motor Vehicle Parts & Accessories,6
Semiconductors & Related Devices,5
Services-Commercial Physical & Biological Research,4
Services-Computer Programming Services,3
@mjbommar
mjbommar / ru_uk_by_subsidiary_entity_10k.csv
Created March 14, 2022 14:58
RU/UK/BY Subsidiary Entity Info
name,industry,count
GOODYEAR TIRE & RUBBER CO /OH/,Tires & Inner Tubes,18
WATERS CORP /DE/,Laboratory Analytical Instruments,17
"DIEBOLD NIXDORF, Inc",Calculating & Accounting Machines (No Electronic Computers),16
AMPHENOL CORP /DE/,Electronic Connectors,15
JABIL INC,Printed Circuit Boards,15
RPM INTERNATIONAL INC/DE/,"Paints, Varnishes, Lacquers, Enamels & Allied Prods",14
PARK OHIO HOLDINGS CORP,Metal Forgings & Stampings,14
NOV Inc.,Oil & Gas Field Machinery & Equipment,13
SITE Centers Corp.,Real Estate Investment Trusts,12
@mjbommar
mjbommar / ru_uk_by_subsidiary_10k.csv
Created March 14, 2022 14:25
Public company subsidiaries from 10-K Ex 21, RU, UK, BY
filing_year,Russia,Ukraine,Belarus
2002,0,0,0
2003,3,0,0
2004,3,1,0
2005,7,2,0
2006,8,1,0
2007,13,1,0
2008,16,0,0
2009,36,1,0
2010,40,1,0
@mjbommar
mjbommar / constant_column.py
Created January 31, 2022 19:57
pipeline_dp-issue-237 constant_column.py
# package imports
import numpy.random
import pandas
import sklearn.datasets # scikit-learn==1.0.2
# dp imports
import pipeline_dp
if __name__ == "__main__":
    # setup random state
@mjbommar
mjbommar / pipeline_dp-issue-237-gdb-log.txt
Created January 31, 2022 19:56
pipeline_dp-issue-237-gdb-log.txt
(gdb) run constant_column.py
Starting program: /home/mjbommar/.cache/pypoetry/virtualenvs/pipeline-dp-uBTWvmGw-py3.8/bin/python constant_column.py
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[New Thread 0x7ffff464e700 (LWP 51015)]
[New Thread 0x7ffff3e4d700 (LWP 51016)]
[New Thread 0x7ffff164c700 (LWP 51017)]
[New Thread 0x7fffece4b700 (LWP 51018)]
[New Thread 0x7fffea64a700 (LWP 51019)]
[New Thread 0x7fffe7e49700 (LWP 51020)]
% kappa risk measure in pure Matlab
% see also:
% * https://gist.github.com/mjbommar/320298
% * https://gist.github.com/mjbommar/320296
% D : return series vector
% r : return threshold
% n : Kappa order
function k = kappa(D, r, n)
k = (mean(D) - r) ./ nthroot(mean((D < r) .* (r-D).^n), n);
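For readers working in Python rather than Matlab, the same formula can be sketched with the standard library alone. This is a minimal translation of the one-liner above, not the author's code: the function and argument names are chosen for illustration, and unlike Matlab (which yields Inf or NaN) it raises `ZeroDivisionError` when no return falls below the threshold.

```python
from statistics import fmean


def kappa(returns, threshold, order):
    """Kappa risk measure of a return series.

    returns   : sequence of returns (D in the Matlab version)
    threshold : return threshold (r)
    order     : Kappa order (n)
    """
    # n-th lower partial moment: only shortfalls below the threshold count,
    # matching (D < r) .* (r - D).^n in the Matlab expression.
    lower_partial_moment = fmean(
        [max(threshold - x, 0.0) ** order for x in returns]
    )
    # Excess mean return divided by the n-th root of that moment.
    return (fmean(returns) - threshold) / lower_partial_moment ** (1.0 / order)


if __name__ == "__main__":
    sample = [0.02, -0.01, 0.03, -0.02]
    print(kappa(sample, 0.0, 1))
    print(kappa(sample, 0.0, 2))
```

With a threshold of zero, order 1 divides the mean return by the mean shortfall and order 2 divides it by the root mean squared shortfall.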
@mjbommar
mjbommar / isotonic_test_case_20150129.json
Last active August 29, 2015 14:14
Test case for Isotonic Regression regression in fit vs. fit_transform
{"nbformat_minor": 0, "cells": [{"execution_count": 1, "cell_type": "code", "source": "# Imports\nimport matplotlib.pyplot as plt\nimport numpy\nimport pandas\nimport scipy\nimport sklearn\nimport sklearn.isotonic\nimport sys", "outputs": [], "metadata": {"collapsed": true, "trusted": true}}, {"source": "## Version strings", "cell_type": "markdown", "metadata": {}}, {"execution_count": 2, "cell_type": "code", "source": "print(sys.version)\nprint(sklearn.__version__)", "outputs": [{"output_type": "stream", "name": "stdout", "text": "2.7.3 (default, Mar 13 2014, 11:03:55) \n[GCC 4.7.2]\n0.16.dev\n"}], "metadata": {"collapsed": false, "trusted": true}}, {"source": "## Generate samples with and without ties", "cell_type": "markdown", "metadata": {}}, {"execution_count": 3, "cell_type": "code", "source": "# Sample with x ties\ndata_with_ties = pandas.DataFrame()\ndata_with_ties[\"feature\"] = [0, 0, 1, 2, 3]\ndata_with_ties[\"target\"] = [0.1, 0.05, 0.15, 0.2, 0.35]\n\n# Sample without x ties\ndata_without_ties =