Mike Taylor rolyatm

rolyatm / gist:56c2b1a81df10173c9295d8d0fbd8a7a
Last active January 14, 2020 15:09
Cognito Password Change Flow
aws cognito-idp initiate-auth --client-id <CLIENTID> --auth-flow USER_PASSWORD_AUTH --auth-parameters USERNAME=<USER>,PASSWORD="<PASSWORD>"
The response will contain a Session token; pass it as --session in the next command.
aws cognito-idp admin-respond-to-auth-challenge --user-pool-id <USERPOOLID> --client-id <CLIENTID> --challenge-responses "NEW_PASSWORD=<NEW>,USERNAME=<USER>" --challenge-name NEW_PASSWORD_REQUIRED --session "<SESSION TOKEN>"
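The same two-step flow can be sketched in Python. This is a minimal sketch, not the author's code: the function name `complete_new_password_challenge` is hypothetical, and `client` is assumed to be a boto3 `cognito-idp` client (or anything exposing the same two methods), passed in by the caller.

```python
def complete_new_password_challenge(client, user_pool_id, client_id,
                                    username, password, new_password):
    """Sign in, then answer NEW_PASSWORD_REQUIRED if Cognito asks for it.

    `client` is assumed to behave like boto3.client("cognito-idp").
    """
    # Step 1: same call as `aws cognito-idp initiate-auth`.
    auth = client.initiate_auth(
        ClientId=client_id,
        AuthFlow="USER_PASSWORD_AUTH",
        AuthParameters={"USERNAME": username, "PASSWORD": password},
    )

    # If no password change is demanded, hand back the response untouched.
    if auth.get("ChallengeName") != "NEW_PASSWORD_REQUIRED":
        return auth

    # Step 2: same call as `aws cognito-idp admin-respond-to-auth-challenge`,
    # reusing the Session token from step 1.
    return client.admin_respond_to_auth_challenge(
        UserPoolId=user_pool_id,
        ClientId=client_id,
        ChallengeName="NEW_PASSWORD_REQUIRED",
        ChallengeResponses={"NEW_PASSWORD": new_password,
                            "USERNAME": username},
        Session=auth["Session"],
    )

# Usage (requires boto3 and AWS credentials; placeholders as in the CLI lines):
#   import boto3
#   client = boto3.client("cognito-idp")
#   complete_new_password_challenge(client, "<USERPOOLID>", "<CLIENTID>",
#                                   "<USER>", "<PASSWORD>", "<NEW>")
```

Taking the client as a parameter keeps the sketch free of credentials and makes it easy to exercise with a stub.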
rolyatm / readme
Last active April 10, 2018 14:33
Sample Spark Application
Create a sample Spark application to process the example data and report some sort of interesting results.
Send the source code and your findings.
Example data:
https://s3.amazonaws.com/ipsos-rad-sample-data/0000_part_00.gz
https://s3.amazonaws.com/ipsos-rad-sample-data/0001_part_00.gz
Hint:
The delimiter in the example data is a non-standard character.
In Python I use: sc.textFile('file').map(lambda x: x.split(chr(31)))
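The hint can be tried without a Spark cluster. `chr(31)` is the ASCII unit separator (`\x1f`), so the same split works on plain Python strings. The sketch below is illustrative only: the helper names are hypothetical, and UTF-8 encoding of the sample files is an assumption.

```python
import gzip

# chr(31) is the ASCII "unit separator" control character, i.e. "\x1f".
UNIT_SEPARATOR = chr(31)

def split_records(lines):
    """Split each line into fields on the unit separator."""
    return [line.rstrip("\n").split(UNIT_SEPARATOR) for line in lines]

def read_gzip_records(path):
    """Read a gzip-compressed text file and split every line into fields.

    UTF-8 is assumed here; the original readme does not state the encoding.
    """
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        return split_records(handle)
```

Previewing a few rows this way is a cheap check of the column count before writing the Spark job.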
rolyatm / url_regex.py
Created April 10, 2018 14:11
URL Regex
'''
url_pattern regex will very generously match URL patterns. It also matches numbers, email addresses and
a few other funky cases.
Please update the code below to eliminate the special cases.
'''
# very liberal match of a possible URL pattern
import re
url_pattern = r'(([\w]+:)?//)?(([\d\w]|%[a-fA-F\d]{2,2})+(:([\d\w]|%[a-fA-F\d]{2,2})+)?@)?([\d\w][-\d\w]{0,253}[\d\w]?\.)+[\w]{2,63}(:[\d]+)?(/([-+_~.\d\w]|%[a-fA-F\d]{2,2})*)*(\?(&?([-+_~.\d\w]|%[a-fA-F\d]{2,2})=?)*)?(#([-+_~.\d\w]|%[a-fA-F\d]{2,2})*)?'
matched_url = []
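One possible way to eliminate the special cases, sketched under assumptions, is to post-filter whatever the liberal pattern matched rather than to rewrite the pattern itself. The names `NUMBER_LIKE`, `EMAIL_LIKE` and `drop_special_cases` are hypothetical and not part of the original gist; the filter also treats any scheme-less `user@host` string as an email, which may be too strict for some inputs.

```python
import re

# Hypothetical helpers for illustration; not part of the original gist.
# Strings made only of digits and common numeric punctuation, e.g. "12.34".
NUMBER_LIKE = re.compile(r'[\d.,:]+')
# "local@domain" with no scheme and no path, e.g. "user@example.com".
EMAIL_LIKE = re.compile(r'[^/@\s]+@[^/@\s]+')

def drop_special_cases(candidates):
    """Keep only candidates that are neither number-like nor email-like."""
    kept = []
    for text in candidates:
        if NUMBER_LIKE.fullmatch(text):
            continue  # bare number such as a decimal or version string
        if EMAIL_LIKE.fullmatch(text):
            continue  # email address without a URL scheme or path
        kept.append(text)
    return kept
```

A URL with credentials and a scheme, such as `https://user:pw@example.com`, survives the filter because the `//` prevents the email-like pattern from matching the whole string.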
rolyatm / create_rtree_file_index.py
Created February 23, 2017 09:52
Create rtree file index from geojson with X,Y extents
import json
from rtree import index
FILE = 'admin_v2.geojson'
OUTPUT = 'geotag_admin_v2'
# index settings
# http://libspatialindex.github.io/overview.html#references
LEAFCAPACITY = 100
INDEXCAPACITY = 100
FILLFACTOR = 0.7
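The "X,Y extents" in the title are the bounding box of each feature. A minimal sketch of how such extents could be computed from a GeoJSON geometry is shown here in plain Python; the helper names are hypothetical and this is not necessarily how the original script does it.

```python
def iter_positions(coords):
    """Yield every [x, y, ...] position from nested GeoJSON coordinates."""
    if coords and isinstance(coords[0], (int, float)):
        # Reached a single position such as [x, y].
        yield coords
    else:
        # Still inside a ring, polygon, or multi-geometry: recurse.
        for part in coords:
            yield from iter_positions(part)

def geometry_bounds(geometry):
    """Return (min_x, min_y, max_x, max_y) for a GeoJSON geometry dict."""
    xs, ys = [], []
    for position in iter_positions(geometry["coordinates"]):
        xs.append(position[0])
        ys.append(position[1])
    return (min(xs), min(ys), max(xs), max(ys))

# With rtree installed, a tuple in this order is what Index.insert expects
# for a 2D index, e.g. idx.insert(feature_id, geometry_bounds(geometry)).
```

The recursion handles Point, LineString, Polygon and the Multi* types alike, since they differ only in how deeply the positions are nested.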
rolyatm / emojis.py
Created February 23, 2017 07:51
Example emoticon/emoji process
# -*- coding: utf-8 -*-
# Scorpion Emojis
# converts emoticons to emoji equivalents
# add synonyms of emojis for better search support
from __future__ import unicode_literals, print_function
import re
class Emojis():
    def __init__(self):
rolyatm / tokenizer.py
Created February 23, 2017 07:49
Example tokenizing process
# -*- coding: utf-8 -*-
# Scorpion Tokenizer
# includes Arabic specific stemmer
from __future__ import unicode_literals, print_function
import os
import re
import itertools
from nltk.tokenize import TweetTokenizer
from nltk.stem.porter import PorterStemmer
rolyatm / nltk_stanford_segmenter.py
Created February 23, 2017 07:45
NLTK implementation of Stanford Segmenter
#!/usr/bin/env python
# -*- coding: utf-8 -*-
# Natural Language Toolkit: Interface to the Stanford Segmenter
# for Chinese and Arabic
#
# Copyright (C) 2001-2017 NLTK Project
# Author: 52nlp <52nlpcn@gmail.com>
#         Casper Lehmann-Strøm <casperlehmann@gmail.com>
#         Alex Constantin <alex@keyworder.ch>
#