Amaan Rizvi amnrzv

  • London
{
"products": [
{
"productId": 1,
"purchaseable": true,
"prices": {
"usd": 1750,
"gbp": 1250
}
},
@amnrzv
amnrzv / a_PhoneticTranslations_main.py
Last active November 1, 2017 13:35
Get phonetic transcriptions of words by scraping them from the website http://www.phonemicchart.com/
import urllib.request
import urllib.error
import urllib.parse
import re
from bs4 import BeautifulSoup
from bs4 import UnicodeDammit
lines = []
base_url = "http://www.phonemicchart.com/transcribe/?w=%s"
output_file = open("output.txt", 'w', encoding='utf-8')
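The preview above stops before any request is made. As a minimal sketch, assuming each word is substituted into `base_url` one at a time, the request URL can be built safely with `urllib.parse.quote`. The helper name `build_url` is hypothetical and does not come from the gist.

```python
from urllib.parse import quote

# Same URL template as in the gist preview.
base_url = "http://www.phonemicchart.com/transcribe/?w=%s"


def build_url(word: str) -> str:
    """Return the transcription URL for one word (hypothetical helper)."""
    # quote() percent-encodes spaces and non-ASCII characters so the
    # query string stays valid; strip() drops the trailing newline that
    # a word read from a file would carry.
    return base_url % quote(word.strip())


if __name__ == "__main__":
    for word in ["believe", "ice cream"]:
        print(build_url(word))
```

Encoding the word before substitution matters mostly for multi-word or accented entries; plain ASCII words pass through unchanged.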
@amnrzv
amnrzv / nltk_pos_workaround.py
Last active November 1, 2017 13:10
Python NLTK POS tagger workaround example.
import nltk
import re
from nltk.tokenize import word_tokenize, sent_tokenize
text = "I'm not going to the party."
words = word_tokenize(text)
pos_tags = nltk.pos_tag(words)
print(pos_tags)
act | 0
be | 6
begin | 0
believe | 0
break | 0
call | 0
can | 5
change | 0
choose | 0
clean | 0
@amnrzv
amnrzv / words.txt
Last active October 19, 2017 17:45
Word list for the language analysis done here: https://gist.github.com/amnrzv/596ba910524e0b1b4e8fa2167fd773bf
act
be
begin
believe
break
call
can
change
choose
clean
@amnrzv
amnrzv / input.txt
Last active October 19, 2017 17:44
Input script file for the language analysis done here: https://gist.github.com/amnrzv/596ba910524e0b1b4e8fa2167fd773bf
Ruby... Ruby, can you hear me?
Moli? Moli, where are you?
Moli?
Ruby, I've crashed.
Yeah... But where?
I'm hurt, Ruby. Can you find me?
Okay, I can see plants.
I can see rocks.
@amnrzv
amnrzv / a_language_analysis.py
Last active November 1, 2017 13:02
Python NLTK vocabulary analysis example.
import nltk
import re
from nltk.tokenize import word_tokenize
from nltk.stem import WordNetLemmatizer
input_file = "./input.txt"
words_file = "./words.txt"
output_file = "./output.txt"
curriculum_words = []
pos_tagged_array = []
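The preview ends before the counting logic, so the following is only a rough sketch of the general idea: count how often each curriculum word occurs in a script and print the result as `word | count`. It assumes plain lowercase matching with the standard library and skips the lemmatization and POS tagging the gist imports NLTK for, so inflected forms such as "crashed" are not folded into their base word. The function names `count_curriculum_words` and `format_counts` are hypothetical.

```python
import re
from collections import Counter


def count_curriculum_words(text, curriculum_words):
    """Count exact (case-insensitive) occurrences of each curriculum word."""
    # Rough stand-in for word_tokenize: runs of letters and apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    # A Counter returns 0 for missing keys, so unseen words are reported too.
    return {word: counts[word] for word in curriculum_words}


def format_counts(counts):
    """Render counts in the 'word | count' layout, one word per line."""
    return "\n".join(f"{word} | {n}" for word, n in counts.items())


if __name__ == "__main__":
    # A few lines from the input script, not the whole file.
    script = (
        "Ruby... Ruby, can you hear me?\n"
        "I'm hurt, Ruby. Can you find me?\n"
        "Okay, I can see plants.\n"
        "I can see rocks."
    )
    print(format_counts(count_curriculum_words(script, ["act", "be", "can"])))
```

Because this sketch matches surface forms only, a lemmatizing version would generally report higher counts for verbs such as "be".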
@amnrzv
amnrzv / ntlk_lemmatizer.py
Last active November 1, 2017 13:07
An example of NLTK's WordNet Lemmatizer.
from nltk.stem import WordNetLemmatizer
wordnet_lemmatizer = WordNetLemmatizer()
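# With no second argument the word is treated as a noun.
print(wordnet_lemmatizer.lemmatize("geese"))
# The optional second argument is the part of speech:
# 'n' noun, 'v' verb, 'a' adjective, 'r' adverb.
print(wordnet_lemmatizer.lemmatize("bottles", 'n'))
print(wordnet_lemmatizer.lemmatize("said", 'v'))
print(wordnet_lemmatizer.lemmatize("better", 'a'))
print(wordnet_lemmatizer.lemmatize("quickly", 'r'))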
@amnrzv
amnrzv / nltk_pos_tags
Created October 19, 2017 15:08
A list of POS tags used in NLTK and what they mean
POS tag list:
CC coordinating conjunction
CD cardinal number
DT determiner
EX existential there (like: "there is" ... think of it like "there exists")
FW foreign word
IN preposition/subordinating conjunction
JJ adjective 'big'
JJR adjective, comparative 'bigger'
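One way to use such a list in code is a lookup table from tag to description. The sketch below covers only the tags visible in this preview, and the `(word, tag)` pairs are hand-written in the shape that `nltk.pos_tag` returns rather than taken from a real tagger run. The names `POS_TAGS` and `describe` are hypothetical.

```python
# Only the tags shown in the preview; the full tag set is longer.
POS_TAGS = {
    "CC": "coordinating conjunction",
    "CD": "cardinal number",
    "DT": "determiner",
    "EX": "existential there",
    "FW": "foreign word",
    "IN": "preposition/subordinating conjunction",
    "JJ": "adjective",
    "JJR": "adjective, comparative",
}


def describe(tagged_words):
    """Attach a readable description to each (word, tag) pair."""
    return [
        (word, tag, POS_TAGS.get(tag, "unknown tag"))
        for word, tag in tagged_words
    ]


if __name__ == "__main__":
    # Hand-written sample pairs, not actual tagger output.
    sample = [("big", "JJ"), ("and", "CC"), ("bigger", "JJR"), ("ran", "VBD")]
    for word, tag, meaning in describe(sample):
        print(f"{word}\t{tag}\t{meaning}")
```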
@amnrzv
amnrzv / nltk_pos_tags.py
Last active November 1, 2017 13:06
An example of NLTK's POS tagging. Output here: https://gist.github.com/amnrzv/2d726c1f107d444b0cdaed21299b7da1
import nltk
from nltk.tokenize import word_tokenize, sent_tokenize
text1 = "I'm going to watch a play tonight."
text2 = "I like to play guitar."
words1 = word_tokenize(text1)
pos_tags1 = nltk.pos_tag(words1)
words2 = word_tokenize(text2)