| Save this as a skill called voice-dna.md in my context folder. Reference it every time you write content for me. | |
| # Voice DNA | |
| Voice reference for AI-assisted writing. ALWAYS apply when writing content meant for publication (social posts, newsletters, emails, articles, threads). | |
| ## Writing Rules | |
| - Write like a sharp human, not a language model | |
| - Use contractions naturally (don't, can't, won't) | |
| - Short paragraphs. 1-3 sentences max. |
| import random | |
| word = [] | |
| weight = [] | |
| # make a list of candidate next words and their respective weights/probabilities | |
| for key, prob in dict(model[prev_words[0], prev_words[1]]).items(): | |
|     word.append(key) | |
|     weight.append(prob) | |
| # pick the next word by weighted random choice over the predictions | |
| # (random.choices returns a list, so next_word is a one-element list) | |
| next_word = random.choices(word, weights=weight, k=1) |
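The sampling step above can be sketched as a small standalone function. This is a minimal sketch, assuming `model` maps a pair of previous words to a dictionary of next-word probabilities; the toy `model` and the helper name `sample_next_word` are hypothetical and not part of the original snippet.

```python
import random

# Hypothetical toy model: (w1, w2) -> {next_word: probability}
model = {
    ("the", "cat"): {"sat": 0.75, "ran": 0.25},
}


def sample_next_word(model, prev_words, rng=random):
    """Draw one next word, weighted by its probability under the model."""
    candidates = model[prev_words[0], prev_words[1]]
    words = list(candidates.keys())
    weights = list(candidates.values())
    # random.choices returns a list of length k, so take its single element
    return rng.choices(words, weights=weights, k=1)[0]


next_word = sample_next_word(model, ("the", "cat"))
print(next_word)
```

Returning the element rather than the one-item list keeps the caller from having to index into the result, which is a design choice of this sketch rather than of the original code.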
| from nltk import ConditionalFreqDist | |
| # build a conditional frequency dictionary: (w1, w2) -> counts of w3 | |
| cfdist = ConditionalFreqDist() | |
| for w1, w2, w3 in trigrams: | |
|     cfdist[(w1, w2)][w3] += 1 | |
| # transform frequencies to probabilities by dividing by each context's total | |
| for w1_w2 in cfdist: | |
|     total_count = float(sum(cfdist[w1_w2].values())) | |
|     for w3 in cfdist[w1_w2]: | |
|         cfdist[w1_w2][w3] /= total_count |
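The same count-then-normalize idea can be illustrated with the standard library only. This is a sketch under assumptions: it swaps NLTK's `ConditionalFreqDist` for `collections.Counter`, and the function name `trigram_probabilities` and the toy token list are hypothetical.

```python
from collections import Counter, defaultdict


def trigram_probabilities(tokens):
    """Map each (w1, w2) context to {w3: P(w3 | w1, w2)}."""
    counts = defaultdict(Counter)
    # slide a window of three tokens over the sequence
    for w1, w2, w3 in zip(tokens, tokens[1:], tokens[2:]):
        counts[(w1, w2)][w3] += 1

    probs = {}
    for context, followers in counts.items():
        total = sum(followers.values())
        # divide each count by the total for its context
        probs[context] = {w3: c / total for w3, c in followers.items()}
    return probs


tokens = "the cat sat the cat ran the cat sat".split()
probs = trigram_probabilities(tokens)
print(probs[("the", "cat")])
```

Here "the cat" is followed by "sat" twice and "ran" once, so the probabilities for that context are 2/3 and 1/3 and sum to 1.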
| import nltk | |
| # get the frequency distribution of trigrams in the corpus | |
| freq_tri = nltk.FreqDist(trigrams) | |
| freq_tri.plot(30, cumulative=False) | |
| print("Most common trigrams: ", freq_tri.most_common(5)) |
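For readers without NLTK installed, the counting part of the snippet above can be approximated with `collections.Counter`. This is a sketch only: it omits the plot, and the toy token list is a made-up example rather than the original corpus.

```python
from collections import Counter

# Hypothetical toy corpus, already tokenized
tokens = "a b c a b c a b d".split()

# build trigrams with a sliding window of three tokens
trigrams = list(zip(tokens, tokens[1:], tokens[2:]))

# count how often each trigram occurs
freq_tri = Counter(trigrams)
top = freq_tri.most_common(2)
print("Most common trigrams: ", top)
```

`Counter.most_common` plays the same role here as `FreqDist.most_common` in the original, since NLTK's `FreqDist` is built on `Counter`.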
| import re | |
| import string | |
| import unicodedata | |
| # normalize text: decompose accented characters, then drop anything non-ASCII | |
| text = unicodedata.normalize('NFKD', text).encode('ascii', 'ignore').decode('utf-8', 'ignore') | |
| # replace HTML tags with ' ' | |
| text = re.sub(r'<.*?>', ' ', text) | |
| # remove punctuation | |
| text = text.translate(str.maketrans('', '', string.punctuation)) | |
| # keep only ASCII letters (digits are replaced with spaces too) | |
| text = re.sub(r'[^a-zA-Z]', ' ', text) | |
| # replace newline with space | |
| text = re.sub(r"\n", " ", text) |
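The cleaning steps above can be wrapped in one function for reuse. This is a minimal sketch: the function name `clean_text` and the sample string are hypothetical, and the final whitespace-collapsing step is an addition that the original snippet does not perform.

```python
import re
import string
import unicodedata


def clean_text(text):
    """Apply the normalization steps in order and return the cleaned string."""
    # decompose accented characters, then drop anything that is not ASCII
    text = unicodedata.normalize("NFKD", text).encode("ascii", "ignore").decode("ascii")
    # replace HTML tags with a space
    text = re.sub(r"<.*?>", " ", text)
    # remove punctuation
    text = text.translate(str.maketrans("", "", string.punctuation))
    # keep only ASCII letters; digits and newlines become spaces
    text = re.sub(r"[^a-zA-Z]", " ", text)
    # assumption (not in the original): collapse runs of whitespace
    return re.sub(r"\s+", " ", text).strip()


cleaned = clean_text("<p>Café\nau lait, 2 cups!</p>")
print(cleaned)
```

Because the letter-only filter already turns newlines into spaces, a separate newline replacement is not needed in this version.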