Jadesse Chan jadessechan

Save this as a skill called voice-dna.md in my context folder. Reference it every time you write content for me.
# Voice DNA
Voice reference for AI-assisted writing. ALWAYS apply when writing content meant for publication (social posts, newsletters, emails, articles, threads).
## Writing Rules
- Write like a sharp human, not a language model
- Use contractions naturally (don't, can't, won't)
- Short paragraphs. 1-3 sentences max.
@jadessechan
jadessechan / main.py
Created March 19, 2021 03:25
Picking the next word by weighted random choice, using the predicted words and their probabilities as weights
import random

word = []
weight = []
# collect each candidate next word and its probability for the two-word context
for key, prob in dict(model[prev_words[0], prev_words[1]]).items():
    word.append(key)
    weight.append(prob)
# sample one candidate in proportion to its probability;
# random.choices returns a list, so next_word is a one-element list
next_word = random.choices(word, weights=weight, k=1)
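The snippet above depends on a `model` and `prev_words` defined elsewhere. As a minimal self-contained sketch of the same weighted-sampling idea, the example below uses a hypothetical `toy_model` dictionary and a hypothetical helper `pick_next_word`; neither name comes from the original gist, and the probabilities are made up for illustration.

```python
import random

# Hypothetical toy model (not from the gist): maps a two-word context
# to a dictionary of {next_word: probability}.
toy_model = {
    ("the", "quick"): {"brown": 0.7, "red": 0.2, "lazy": 0.1},
}


def pick_next_word(model, prev_words, rng=random):
    """Sample one next word for the two-word context in prev_words."""
    candidates = model[(prev_words[0], prev_words[1])]
    words = list(candidates.keys())
    weights = list(candidates.values())
    # random.choices returns a list of length k, so take its single element
    return rng.choices(words, weights=weights, k=1)[0]


rng = random.Random(0)  # seeded so repeated runs give the same samples
samples = [pick_next_word(toy_model, ["the", "quick"], rng) for _ in range(1000)]
# count how often each candidate was drawn
print({w: samples.count(w) for w in toy_model[("the", "quick")]})
```

Indexing with `[0]` returns a plain string instead of a one-element list, which is usually more convenient when appending the word to generated text. The counts should roughly follow the 0.7 / 0.2 / 0.1 weights.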
jadessechan / main.py
Created March 19, 2021 03:20
Calculating the probability of a word (w3) preceded by 2 words (w1 and w2)
from nltk.probability import ConditionalFreqDist

# count how often each w3 follows the pair (w1, w2)
cfdist = ConditionalFreqDist()
for w1, w2, w3 in trigrams:
    cfdist[(w1, w2)][w3] += 1
# convert the counts for each (w1, w2) context into probabilities
for w1_w2 in cfdist:
    total_count = float(sum(cfdist[w1_w2].values()))
    for w3 in cfdist[w1_w2]:
        cfdist[w1_w2][w3] /= total_count
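To make the count-then-normalize step concrete, here is a small sketch that swaps NLTK's `ConditionalFreqDist` for the standard library's `defaultdict(Counter)`, so it runs without extra dependencies. The sample sentence and the names `tokens`, `counts`, and `probs` are illustrative assumptions, not part of the original gist.

```python
from collections import Counter, defaultdict

# Illustrative corpus (an assumption, not the gist's data)
tokens = "the cat sat on the mat and the cat sat on the hat".split()
# consecutive triples (w1, w2, w3)
trigrams = list(zip(tokens, tokens[1:], tokens[2:]))

# count how often each w3 follows the context (w1, w2)
counts = defaultdict(Counter)
for w1, w2, w3 in trigrams:
    counts[(w1, w2)][w3] += 1

# divide each count by the total for its context: P(w3 | w1, w2)
probs = {}
for context, followers in counts.items():
    total = sum(followers.values())
    probs[context] = {w3: n / total for w3, n in followers.items()}

print(probs[("on", "the")])   # "mat" and "hat" each follow "on the" once
print(probs[("the", "cat")])  # "sat" always follows "the cat"
```

Building a separate `probs` dictionary, instead of overwriting the counts in place, keeps the raw frequencies available for later inspection.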
jadessechan / main.py
Created March 19, 2021 03:13
Trigram frequency information about the corpus
import nltk

# get frequency distribution of trigrams in corpus
freq_tri = nltk.FreqDist(trigrams)
# plot the 30 most frequent trigrams (plotting requires matplotlib)
freq_tri.plot(30, cumulative=False)
print("Most common trigrams: ", freq_tri.most_common(5))
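`nltk.FreqDist` is a subclass of `collections.Counter`, so the counting part of the snippet above can be tried out with the standard library alone. The sketch below assumes a tiny made-up token list and builds the trigrams with `zip`; it leaves out the plot.

```python
from collections import Counter

# Made-up token sequence for illustration (not the gist's corpus)
tokens = "a b c a b c a b c d".split()
# consecutive triples of tokens
trigrams = list(zip(tokens, tokens[1:], tokens[2:]))

# count occurrences of each trigram
freq_tri = Counter(trigrams)
print("Most common trigrams: ", freq_tri.most_common(2))
```

A sequence of n tokens yields n - 2 trigrams, so the counts here sum to 8.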
jadessechan / main.py
Created March 19, 2021 02:47
Example of cleaning text with Python's re library (regex substitution)
import re
import string
import unicodedata

# fold accented characters to plain ASCII
text = unicodedata.normalize('NFKD', text).encode('ascii', 'ignore').decode('utf-8', 'ignore')
# replace HTML tags with ' '
text = re.sub(r'<.*?>', ' ', text)
# remove punctuation
text = text.translate(str.maketrans('', '', string.punctuation))
# keep letters only; digits and all other characters become ' '
text = re.sub(r'[^a-zA-Z]', ' ', text)
# replace newline with space (a no-op here: the previous step already replaced newlines)
text = re.sub(r"\n", " ", text)
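The cleaning steps above can be wrapped in a single function for reuse. This is a sketch under a few assumptions: the function name `clean_text` is hypothetical, and the final whitespace-collapsing step is an addition that the original snippet does not perform.

```python
import re
import string
import unicodedata


def clean_text(text):
    """Reduce raw text to space-separated ASCII letters."""
    # fold accented characters to their closest ASCII form, dropping the rest
    text = unicodedata.normalize("NFKD", text).encode("ascii", "ignore").decode("ascii")
    # replace anything that looks like an HTML tag with a space
    text = re.sub(r"<.*?>", " ", text)
    # delete punctuation characters outright
    text = text.translate(str.maketrans("", "", string.punctuation))
    # replace every remaining non-letter (digits, newlines) with a space
    text = re.sub(r"[^a-zA-Z]", " ", text)
    # collapse runs of whitespace (added step, not in the original snippet)
    return re.sub(r"\s+", " ", text).strip()


print(clean_text("<p>Café déjà-vu,\n2021!</p>"))
```

Punctuation is deleted, not replaced with a space, so hyphenated words such as "déjà-vu" merge into one token ("dejavu"). Digits are dropped entirely by the letters-only substitution.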