Peter Organisciak organisciak

@organisciak
organisciak / README.md
Last active Apr 4, 2019
Jupyterhub as a service on RHEL

Place the below file in /lib/systemd/system/jupyterhub.service.

Then run systemctl daemon-reload and start the service with service jupyterhub start. Both commands must be run as root.

The file below runs as root, but you'll likely want to specify a more restricted User and WorkingDirectory. You may also need to extend the PATH set on the Environment line.

Additional environment variables, like GITHUB_CLIENT_ID, can be added with more Environment="VAR=value" lines.
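
As an illustration of how the Environment lines stack up, here is a minimal Python sketch that renders a unit file as text. This is not the service file from this gist: the ExecStart path, user name, working directory, and client ID value are all placeholder assumptions, and render_unit is a hypothetical helper.

```python
def render_unit(exec_start, user, working_dir, env):
    """Return systemd unit text with one Environment= line per variable."""
    lines = [
        "[Unit]",
        "Description=JupyterHub",
        "After=network.target",
        "",
        "[Service]",
        f"User={user}",
        f"WorkingDirectory={working_dir}",
    ]
    # Each variable gets its own Environment="VAR=value" line
    lines += [f'Environment="{name}={value}"' for name, value in env.items()]
    lines += [f"ExecStart={exec_start}", "", "[Install]", "WantedBy=multi-user.target"]
    return "\n".join(lines) + "\n"


unit_text = render_unit(
    exec_start="/usr/local/bin/jupyterhub",  # assumed install location
    user="jupyterhub",                       # hypothetical restricted account
    working_dir="/etc/jupyterhub",           # hypothetical directory
    env={
        "PATH": "/usr/local/bin:/usr/bin:/bin",
        "GITHUB_CLIENT_ID": "your-client-id",  # placeholder value
    },
)
print(unit_text)
```

The printed text would still need to be saved to /lib/systemd/system/jupyterhub.service by root before reloading systemd.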

organisciak / Instructions.md
Created Mar 20, 2019
Running OpenRefine on a Pixelbook or Other Chromebook

If you have a Chromebook with Linux, you can run OpenRefine on your computer.

Running OpenRefine

If you're new to this, see the 'First Time Preparation' section below.

  1. Go to the OpenRefine folder.
    • cd openrefine-3.1 for the version that I have; your directory name may be different.
  2. Run OpenRefine on the Linux container's internal IP address (which is not 127.0.0.1).
    • ./refine -i $(hostname -I)
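
One thing to watch: hostname -I prints every address the machine has, separated by spaces, so on some setups the command above may pass more than one value to -i. A small Python sketch for picking just the first one follows; first_address is a hypothetical helper and the sample output is made up.

```python
def first_address(hostname_output):
    """Return the first whitespace-separated address, or None if there is none."""
    addresses = hostname_output.split()
    return addresses[0] if addresses else None


# Made-up example of what `hostname -I` might print on a machine with two addresses
sample = "100.115.92.200 fd00::1 \n"
print(first_address(sample))
```

In practice the input could come from subprocess.run(["hostname", "-I"], capture_output=True, text=True).stdout, and the result would be the value handed to ./refine -i.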
organisciak / sestina.txt
Last active Jul 14, 2017
NOD WEED ROAD CRUDE CLOSE DEPOT COMMUTE CRUMB BOBBINET SWEDISH IMPAIRMENT ARGIL MIGHT PROFANATION COMPARABLE CORONADITE EXCESS BASE FLESH BANDOG READY REPARATION LEGIBLE DESPITE ANCHORESS IGNOMINY AFFRIGHT ID DISABILITY IMPEDIMENT STROKESMAN DECLARATION AUTOPSY BEARD SLEW BEDCLOTHES FLORID/READY REPLY/ASCENDANT BICKER/DEDUCTION
Quick declination of the head.
Free from weeds.
Thoroughfare: way.
Raw, unprepared.
Enclosed place.
Railway station.
Effect of commutation.
Small bit of bread.
Machine-made net or lace.
View calculate_tfidf
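import numpy as np
import pandas as pd

def calculate_tfidf(tokencounts, idf_df, df='PF', case=True, log_tf=True):
    '''Takes a 'token, count' DF and returns TF*IDF weights '''
    # Work on a copy so the caller's DataFrame is not modified
    tc = tokencounts.copy()
    if not case:
        # Fold case, then merge the counts of tokens that now collide
        tc['token'] = tc['token'].str.lower()
        tc = tc.groupby('token', as_index=False).sum()
    tfidf = pd.merge(tc.set_index('token'), idf_df, left_index=True, right_index=True)
    if log_tf:
        # Dampen raw counts: TF = log10(count + 1)
        tfidf['TF'] = tfidf['count'].add(1).apply(np.log10)
    else:
        tfidf['TF'] = tfidf['count']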
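
The preview above cuts off before the weights are computed. As a rough illustration of the same steps on toy data, the sketch below folds case, merges against an IDF table, and multiplies log-scaled TF by IDF. The counts, the IDF values, and the column names IDF and TFIDF are assumptions made for the example, not taken from the gist.

```python
import numpy as np
import pandas as pd

# Toy token counts; "Whale" and "whale" should merge once case is folded
tokencounts = pd.DataFrame({"token": ["Whale", "whale", "the"], "count": [4, 5, 99]})

# Hypothetical IDF table indexed by token; the column name "IDF" is an assumption
idf_df = pd.DataFrame({"IDF": [2.0, 0.1]}, index=pd.Index(["whale", "the"], name="token"))

# Case-insensitive counts: lowercase, then sum the rows that now share a token
tc = tokencounts.copy()
tc["token"] = tc["token"].str.lower()
tc = tc.groupby("token", as_index=False).sum()

# Join counts to IDF on the token index, then weight: log10(count + 1) * IDF
tfidf = pd.merge(tc.set_index("token"), idf_df, left_index=True, right_index=True)
tfidf["TF"] = tfidf["count"].add(1).apply(np.log10)
tfidf["TFIDF"] = tfidf["TF"] * tfidf["IDF"]
print(tfidf)
```

The frequent word ends up with the larger TF but the smaller final weight, which is the point of multiplying by IDF.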
organisciak / gist:d5d0ff1e0dc48f7424e16ea723ca338a
Last active Dec 3, 2016
Marc Distribution in the HathiTrust
Field  Coverage  Description
035    100%      SYSTEM CONTROL NUMBER (R)
245    100%      TITLE STATEMENT (NR)
538    100%      SYSTEM DETAILS NOTE (R)
974    100%      NA
260    99.7%     PUBLICATION, DISTRIBUTION, ETC. (IMPRINT) (R)
300    99.0%     PHYSICAL DESCRIPTION (R)
040    95.8%     CATALOGING SOURCE (NR)
100    73.9%     MAIN ENTRY--PERSONAL NAME (NR)
650    64.9%     SUBJECT ADDED ENTRY--TOPICAL TERM (R)
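
The gist does not show how these percentages were produced. Assuming coverage means the share of records that contain a field at least once, a count along those lines could be sketched as below; field_coverage is a hypothetical helper and the four records are invented.

```python
from collections import Counter


def field_coverage(records):
    """Return {tag: share of records that contain the tag at least once}."""
    if not records:
        return {}
    counts = Counter()
    for tags in records:
        counts.update(set(tags))  # a repeated field counts once per record
    total = len(records)
    return {tag: n / total for tag, n in counts.items()}


# Invented records, each reduced to the list of MARC tags it contains
records = [
    ["035", "245", "650", "650"],
    ["035", "245", "100"],
    ["035", "245"],
    ["035", "245", "100", "650"],
]
coverage = field_coverage(records)
for tag, share in sorted(coverage.items(), key=lambda kv: (-kv[1], kv[0])):
    print(f"{tag} {share:.1%}")
```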
View gist:163e59ea6cf71c3cd12de410d075567c
tl = vol.tokenlist(pages=False)
just_nouns = tl.loc[(slice(None), slice(None), ["NN", "NNS"]),]
top_nouns = just_nouns.sort_values('count', ascending=False)
top_nouns.head(5)
# OUTPUT:
#                        count
# section token  pos
# body    doctor NN         83
#         time   NN         80
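
The same selection can be tried without a HathiTrust volume. The sketch below builds a small stand-in for the token list with plain pandas, using the same three index levels (section, token, pos). The first two rows reuse the counts shown in the output above; the other two rows are invented for illustration.

```python
import pandas as pd

rows = [
    ("body", "doctor", "NN", 83),
    ("body", "time", "NN", 80),
    ("body", "days", "NNS", 40),  # invented row
    ("body", "run", "VB", 12),    # invented row
]
tl = pd.DataFrame(rows, columns=["section", "token", "pos", "count"])
# Sorting the MultiIndex keeps label-based slicing predictable
tl = tl.set_index(["section", "token", "pos"]).sort_index()

# slice(None) means "any value" for that level; the list filters the pos level
just_nouns = tl.loc[(slice(None), slice(None), ["NN", "NNS"]), :]
top_nouns = just_nouns.sort_values("count", ascending=False)
print(top_nouns.head(2))
```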
organisciak / map-stats.py
Created Mar 18, 2016
Calculate frequencies in many books
from htrc_features import FeatureReader
import argparse
import pandas as pd
import numpy as np
import random
import string

def main():
    parser = argparse.ArgumentParser(description='Calculate Collection '
organisciak / process.py
Created Feb 24, 2016
Script to process a Wordpress Export for Mallet
'''
Author: Peter Organisciak
Convert Day of DH (or other Wordpress) export to Mallet import format.
[url] [user] [post text]
Use in the following way:
>> python process.py input-file output-file --split [post|author]
For the split argument, choose either post (a document representation is the words of a post) or author (a document representation is the words that an author has written).
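
The script itself is cut off here, so what follows is only a sketch of what the two split modes could look like, not the author's code. build_documents is a hypothetical helper, the rows are invented, and joining an author's posts with spaces is an assumption.

```python
from collections import defaultdict


def build_documents(posts, split="post"):
    """Group (url, user, text) rows into documents keyed by post URL or by author."""
    if split not in ("post", "author"):
        raise ValueError("split must be 'post' or 'author'")
    docs = defaultdict(list)
    for url, user, text in posts:
        key = url if split == "post" else user
        docs[key].append(text)
    # One document per key: all of its texts joined into a single string
    return {key: " ".join(texts) for key, texts in docs.items()}


# Invented rows in [url] [user] [post text] order
posts = [
    ("http://example.org/a/1", "ana", "mapping archives today"),
    ("http://example.org/b/1", "ben", "teaching a seminar"),
    ("http://example.org/a/2", "ana", "more archive work"),
]
by_post = build_documents(posts, split="post")
by_author = build_documents(posts, split="author")
print(len(by_post), len(by_author))
```

Splitting by author gives fewer, longer documents, which changes what a topic model treats as co-occurrence.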
organisciak / post-commit
Created Dec 16, 2015
githook to convert iPython README to Markdown
READPY=$(git log --name-only HEAD^.. | grep "^README.ipynb$")
READMD=$(git log --name-only HEAD^.. | grep "^README.md$")
if [ -n "$READPY" ] && [ -z "$READMD" ]; then
    echo "It looks like a new README was committed, appending a Markdown version"
    # Newer installs provide this command as `jupyter nbconvert`
    ipython nbconvert --to markdown README.ipynb
    # Adding this file doesn't work in pre-commit hooks, which is
    # why we're appending post-commit
    git add README.md
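
The condition the hook tests can be stated on its own: convert only when README.ipynb was in the last commit and README.md was not. A minimal Python sketch of that check, using a hypothetical needs_markdown helper over the list of file names from the commit:

```python
def needs_markdown(changed_files):
    """True when README.ipynb was committed without a matching README.md."""
    changed = set(changed_files)
    return "README.ipynb" in changed and "README.md" not in changed


print(needs_markdown(["README.ipynb", "map-stats.py"]))
print(needs_markdown(["README.ipynb", "README.md"]))
```

Matching whole file names mirrors the ^...$ anchors in the grep patterns, so a notebook in a subdirectory such as docs/README.ipynb would not trigger the conversion.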