@deblnia
Created January 2, 2025 02:54
Some of the code behind @petebutbot
# I scraped existing tweets using R
# library(rtweet)
# library(tidyverse)
# petey <- get_timeline("PeteButtigieg", n = 3200, include_rts = FALSE, exclude_replies = FALSE)
# petey %>%
#   select(text) %>%
#   rename(tweets = text) %>%
#   write_csv("petebuttigieg_tweets.csv")
import markovify
# Get raw text as string.
with open("petebuttigieg_tweets.csv") as f:
    text = f.read()
# Build the model.
text_model = markovify.Text(text)
# # Print five randomly generated sentences
# for i in range(5):
#     print(text_model.make_sentence())
# Collect 366 generated tweets of no more than 280 characters each,
# skipping any that contain a mention ("@") or a link ("https://").
import json
jsonfile = {"origin": []}
stopWords = ["@", "https://"]
while len(jsonfile["origin"]) <= 365:
    tweet = text_model.make_short_sentence(280)
    # make_short_sentence returns None when it cannot build a sentence.
    if tweet is None:
        continue
    shouldInclude = all(stopword not in tweet for stopword in stopWords)
    if shouldInclude:
        jsonfile["origin"].append(tweet)
# Emit valid JSON: json.dumps escapes quotes and leaves no trailing comma.
print(json.dumps(jsonfile, indent=2))
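For anyone curious what "build the model" means, here is a rough, standard-library-only sketch of a word-level Markov chain. It is a simplified illustration, not markovify's actual implementation: it assumes a one-word state, plain whitespace splitting, and no check against copying the source text. The names `build_chain`, `generate`, `<START>` and `<END>` are made up for this example.

```python
import random
from collections import defaultdict


def build_chain(sentences):
    """Map each word to the list of words observed right after it."""
    chain = defaultdict(list)
    for sentence in sentences:
        # Hypothetical sentinel tokens mark where a sentence begins and ends.
        words = ["<START>"] + sentence.split() + ["<END>"]
        for current, following in zip(words, words[1:]):
            chain[current].append(following)
    return chain


def generate(chain, max_chars=280, rng=random):
    """Walk the chain from <START> to <END>; return None if the result is too long."""
    words = []
    word = rng.choice(chain["<START>"])
    while word != "<END>":
        words.append(word)
        # Repeated entries in the list make frequent successors more likely.
        word = rng.choice(chain[word])
    sentence = " ".join(words)
    return sentence if len(sentence) <= max_chars else None
```

Calling `generate(build_chain(list_of_tweets))` returns either a new string stitched together from word pairs seen in the input or `None` when the result exceeds `max_chars`, which is the same reason a `None` check is needed around a length-limited generator. The input list must contain at least one sentence.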