Russell Jurney rjurney

View keybase.md

Keybase proof

I hereby claim:

To claim this, I am signing this object:

@rjurney
rjurney / remove_all_security_groups_boto3.py
Created Jan 7, 2020
A script that removes all non-default security group rules and groups in a single REGION using boto3
View remove_all_security_groups_boto3.py
import boto3
from botocore.exceptions import ClientError
REGION = 'us-east-1'
ec2 = boto3.client('ec2', region_name=REGION)
# Keep removing until all are gone
while True:
    groups = ec2.describe_security_groups()['SecurityGroups']
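The preview stops after the first describe call. One way such a loop might continue is sketched below, assuming rules are revoked before deletion so that groups referencing each other no longer block one another. The helper name `remove_non_default_groups` and the `max_passes` parameter are hypothetical and not taken from the gist; `ec2` is anything that exposes the same methods as `boto3.client('ec2')`.

```python
def remove_non_default_groups(ec2, max_passes=10):
    """Revoke rules on, then delete, every non-default security group.

    `ec2` is assumed to behave like boto3.client('ec2').
    Returns the number of groups deleted.
    """
    deleted = 0
    for _ in range(max_passes):
        groups = [
            g for g in ec2.describe_security_groups()["SecurityGroups"]
            if g["GroupName"] != "default"  # the default group cannot be deleted
        ]
        if not groups:
            break

        # First strip the rules, so cross-references between groups go away
        for g in groups:
            if g.get("IpPermissions"):
                ec2.revoke_security_group_ingress(
                    GroupId=g["GroupId"], IpPermissions=g["IpPermissions"]
                )
            if g.get("IpPermissionsEgress"):
                ec2.revoke_security_group_egress(
                    GroupId=g["GroupId"], IpPermissions=g["IpPermissionsEgress"]
                )

        # Then try to delete each group; failures are retried on the next pass
        for g in groups:
            try:
                ec2.delete_security_group(GroupId=g["GroupId"])
                deleted += 1
            except Exception as exc:  # with boto3 this would be botocore's ClientError
                print(f"Could not delete {g['GroupId']} yet: {exc}")
    return deleted
```

With a real client this would be called as `remove_non_default_groups(boto3.client('ec2', region_name=REGION))`. Capping the number of passes is a design choice of this sketch, so that a group still attached to a running instance cannot keep the loop spinning forever.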
@rjurney
rjurney / github.sh
Created Dec 19, 2019
How to fetch the README of any GitHub repository in one line of bash
View github.sh
curl -s "https://api.github.com/repos/<user>/<repo>/readme" | jq -r .content | base64 --decode
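Where `jq` is not available, roughly the same steps can be sketched in Python with only the standard library. The function names `decode_readme` and `fetch_readme` are illustrative and not part of the gist; the sketch assumes, as the one-liner does, that the API response carries the README as a base64 string in its `content` field.

```python
import base64
import json
import urllib.request


def decode_readme(payload):
    """Decode the base64 `content` field of a README API response."""
    # b64decode ignores the newlines that wrap the encoded content
    return base64.b64decode(payload["content"]).decode("utf-8")


def fetch_readme(user, repo):
    """Fetch and decode the README of github.com/<user>/<repo>."""
    url = f"https://api.github.com/repos/{user}/{repo}/readme"
    with urllib.request.urlopen(url) as resp:
        return decode_readme(json.load(resp))
```

Splitting the decoding from the network call keeps the decoding step usable on a response that was saved earlier.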
@rjurney
rjurney / pre.py
Last active Dec 4, 2019
How do you chain a preprocessor for an LF to occur AFTER SpacyPreprocessor?
View pre.py
spacy = SpacyPreprocessor(
    text_field='body',
    doc_field='spacy',
    memoize=True,
    language='en_core_web_lg',
    disable=['vectors']
)
@preprocessor(memoize=True, pre=[spacy])
def restore_entity(x):
@rjurney
rjurney / matcher_lf.py
Created Dec 2, 2019
Example of spaCy object Labeling Function
View matcher_lf.py
from spacy.matcher import Matcher
matcher = Matcher(nlp.vocab)
pattern = [{'POS': 'VERB'}, {'POS': 'ADP'}, {'POS': 'PROPN'}]
matcher.add("VERB_ADP_PROPN", None, pattern)
@labeling_function()
def lf_verb_in_noun(x):
    """Return positive if the pattern matches"""
    sp = x['spacy']
    matches = matcher(sp)
View candidates.py
window = 5
candidates = []
for index, row in df.iterrows():
    doc = nlp(row['_Body'])
    for ent in doc.ents:
        rec = {}
        rec['body'] = doc.text
        rec['entity'] = ent
        rec['entity_text'] = ent.text
        rec['entity_start'] = ent.start
@rjurney
rjurney / tty.txt
Created Nov 11, 2019
What /dev/ttyS* port does this correspond to?
View tty.txt
T: Bus=01 Lev=01 Prnt=01 Port=08 Cnt=04 Dev#= 5 Spd=12 MxCh= 0
D: Ver= 2.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs= 1
P: Vendor=051d ProdID=0002 Rev=00.90
S: Manufacturer=American Power Conversion
S: Product=Back-UPS ES 850M2 FW:931.a7 .D USB FW:a7
S: SerialNumber=4B1716P37698
C: #Ifs= 1 Cfg#= 1 Atr=e0 MxPwr=2mA
I: If#= 0 Alt= 0 #EPs= 1 Cls=03(HID ) Sub=00 Prot=00 Driver=usbhid
@rjurney
rjurney / spark_mongo_kafka_predictions.py
Created Nov 4, 2019
Writing Predictions to MongoDB using Kafka and Structured Streaming
View spark_mongo_kafka_predictions.py
# Make the prediction
predictions = rfc.transform(final_vectorized_features)
# Drop the features vector and prediction metadata to give the original fields
predictions = predictions.drop("Features_vec")
final_predictions = predictions.drop("indices", "values", "rawPrediction", "probability")
# Store the results to MongoDB
class MongoWriter:
@rjurney
rjurney / pad.py
Created Oct 22, 2019
Custom padding of dense vectors with min/max or mean
View pad.py
padded_posts = []
for post in encoded_docs:
    # Pad short posts with alternating min/max
    if len(post) < MAX_LENGTH:
        # Method 1
        pointwise_min = np.minimum.reduce(post)
        pointwise_max = np.maximum.reduce(post)
        padding = [pointwise_max, pointwise_min]
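The preview ends before the padding is applied. A minimal sketch of how "Method 1" could be completed is given below, assuming each post is a non-empty sequence of equal-length vectors. The function name `pad_with_min_max` is hypothetical, and truncating posts that are already too long is an assumption of this sketch, since the preview only shows the short-post branch.

```python
import numpy as np


def pad_with_min_max(post, max_length):
    """Pad a post (sequence of vectors) to max_length rows.

    Padding rows alternate between the element-wise max and the
    element-wise min of the post's own vectors, starting with the max.
    """
    post = np.asarray(post, dtype=float)
    if len(post) >= max_length:
        # Assumption: long posts are simply cut to max_length
        return post[:max_length]

    pointwise_min = np.minimum.reduce(post)
    pointwise_max = np.maximum.reduce(post)
    padding = [pointwise_max, pointwise_min]

    missing = max_length - len(post)
    pad_rows = np.array([padding[i % 2] for i in range(missing)])
    return np.vstack([post, pad_rows])
```

For example, `padded_posts = [pad_with_min_max(p, MAX_LENGTH) for p in encoded_docs]` would give every post the same number of rows.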
@rjurney
rjurney / gensim_word2vec.py
Last active Oct 22, 2019
Encoding tokenized text with gensim.models.Word2Vec
View gensim_word2vec.py
import os
from gensim.models import Word2Vec
w2v_model = None
model_path = 'models/word2vec.model'
# Load the Word2Vec model if it exists
if os.path.exists(model_path):
    w2v_model = Word2Vec.load(model_path)
else:
    w2v_model = Word2Vec(