
@nic-hartley
Last active January 17, 2018 21:29

General notes, first:

  • Each of these blocks runs repeatedly on its own thread.
    • There might be some delay (to let queues refill, or to keep from getting ratelimited by Reddit, or whatever) between each run; in theory, that shouldn't matter to the actual operation.
    • Some additional boilerplate is probably also required (checking if they're empty, etc.) but screw writing all that
  • Anything before .push or .pop is a queue
  • logging and metrics have been intentionally omitted; they can be added where necessary.
  • also, there are probably better ways to write this code, but I'm not good at Python
  • I'm only looking at Reddit interaction for this bit. Discord interaction could probably be added relatively easily, but that's too much to think about right now.
  • I'm not looking at first-time posts yet. It should just be a couple of extra lines of code, but... later.
  • I don't know exactly how the Reddit API works; for this document, I'm assuming that it gives you all of the information about a given comment, rather than just an ID that you then have to look up. However, it shouldn't be too difficult to adapt it if the latter is the case.
  • I intentionally didn't implement the class or helper functions; this is meant to be higher-level than that. When it comes down to actually writing the code, I'd be happy to :)
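The "each block runs repeatedly on its own thread, with a delay between runs" setup from the first bullet can be sketched roughly like this. The `worker` helper, the queue names, and the delay value are my assumptions, not anything from the eventual bot:

```python
import queue
import threading
import time

def worker(in_queue, process, delay=0.1, stop_event=None):
    # Generic worker loop: repeatedly pop one item and process it.
    # The get() timeout doubles as the between-run delay that keeps
    # the bot from hammering Reddit's rate limits.
    while stop_event is None or not stop_event.is_set():
        try:
            item = in_queue.get(timeout=delay)
        except queue.Empty:
            continue  # the "checking if they're empty" boilerplate
        process(item)

# Usage sketch: one thread per processing step, all sharing queues.
requests = queue.Queue()
handled = []
stop = threading.Event()
thread = threading.Thread(target=worker, args=(requests, handled.append, 0.05, stop))
thread.start()
requests.put("some transcription request")
time.sleep(0.3)  # give the worker a moment to pick the item up
stop.set()
thread.join()
```

Each of the blocks below would be the `process` function of one such worker.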

TR class

Basically, a class which represents a transcription request. This class allows us to query data about each transcription request without needing to go to Reddit every single time. That way, we can use API hits for only the things that are necessary, rather than using them to check if something has been claimed.

This mostly just collects a couple of posts/comments and holds them together.

  • Store information about / reference to the original post to be transcribed and the post on r/ToR about the request.
  • Record the "claim" and "done" comments
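A minimal sketch of what that class might look like. The field names beyond the ones mentioned above, the `register` helper, and the URL cache are guesses on my part:

```python
class TR:
    """One transcription request: the original post plus everything the
    bot knows about it, so we never spend an API hit just to check
    whether something has been claimed."""

    _by_url = {}  # cache of known requests, keyed by their r/ToR URL

    def __init__(self, post):
        self.post = post        # the original post to be transcribed
        self.r_tor_post = None  # the mirror post on r/ToR
        self.r_tor_url = None
        self.claim = None       # the comment that claimed this request
        self.done = None        # the comment that marked it done

    def valid(self):
        # hypothetical filter; the real check would look at the post itself
        return self.post is not None

    def register(self, r_tor_url):
        # remember the mirror URL so inbox replies can be matched back to us
        self.r_tor_url = r_tor_url
        TR._by_url[r_tor_url] = self

    @classmethod
    def get_by_url(cls, url):
        return cls._by_url.get(url)  # None for URLs we don't know about
```

`register` would be called right after posting the mirror in the "Moving them to r/ToR" step; `get_by_url` is what the inbox processor uses to match a reply back to its request.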

Collecting posts from subreddits and normalizing

In short, we want to get everything all in one place, all in one format.

This could be split into two separate workers (one to just query subreddits, one just to normalize them into a unified format), but the tasks are related enough that it feels kinda wasteful.

for subreddit in partners:
  for post in subreddit.new_posts():
    tr = TR(post)
    if tr.valid():
      requests.push(tr)

Moving them to r/ToR

...and adding a little extra info.

tr = requests.pop()
post = r_tor.post_url('{} | {} | "{}"'.format(tr.subreddit, tr.type, tr.title), tr.url)
post.set_flair(Flair.UNCLAIMED)
to_clear.push(tr)
tr.r_tor_url = post.url
post.comment(Comments.CLAIM_HERE)
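For what it's worth, the r/ToR post title is meant to come out as `subreddit | type | "title"`; in real Python that format call would look like this (the subreddit/type/title values here are made up):

```python
# Build the r/ToR mirror post's title from the request's metadata.
title = '{} | {} | "{}"'.format("pics", "image", "A cool sunset")
# -> 'pics | image | "A cool sunset"'
```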

Processing inbox contents

Process replies to ToRBot's posts and comments. Farms out the actual work of claiming, etc. to other workers.

Note that this, done-process, claim-process, unclaim-process, etc. could all be condensed into one step. I've separated them just because otherwise this one gets to be like double the length of the others, but that's not strictly necessary.

for item in unread_inbox_items:
  url = item.parent_post_url
  tr = TR.get_by_url(url)
  if not tr:
    continue # can this happen? what does it mean?
             # (because in theory, we should only be getting replies to our
             #  posts, which _should_ all be on r/ToR posts)
  if "unclaim" in item.text: # must come before "claim", since "claim" is a substring of "unclaim"
    unclaims.push([item, tr])
  elif "claim" in item.text:
    claims.push([item, tr])
  elif "done" in item.text:
    dones.push([item, tr])
  else: # did I miss something the bot has to do?
    item.reply(Comments.NO_ACTION_IN_COMMENT)

Action processing

Process the various actions a user can take on a given TR. These are all, by their nature, very similar. However, they're all different enough that I can't see an easy, simple way to unify them, so I don't think it's worth it.

Process claims

claim_comment, tr = claims.pop()
if tr.claim:
  claim_comment.reply(Comments.ALREADY_CLAIMED) # aka Comments.PIXIED
elif tr.done:
  claim_comment.reply(Comments.ALREADY_DONE)
else:
  tr.claim = claim_comment
  tr.r_tor_post.flair = Flair.IN_PROGRESS
  claim_notifications.push(tr)

Process dones

done_comment, tr = dones.pop()
if not tr.claim:
  done_comment.reply(Comments.NO_DONE_WITHOUT_CLAIM)
elif tr.claim.author != done_comment.author: #or, possibly, if they're a mod
  done_comment.reply(Comments.NOT_YOUR_CLAIM)
else:
  tr.done = done_comment
  tr.r_tor_post.flair = Flair.COMPLETE
  to_clear.push(tr)
  done_comment.reply(Comments.DONED)
  increment_flair(done_comment.author)

Process unclaims

unclaim_comment, tr = unclaims.pop()
if not tr.claim:
  unclaim_comment.reply(Comments.NO_UNCLAIM_WITHOUT_CLAIM)
elif tr.claim.author != unclaim_comment.author:
  unclaim_comment.reply(Comments.NOT_YOUR_CLAIM)
else:
  tr.claim = None
  unclaim_comment.reply(Comments.UNCLAIMED)

Process late notifications

while claim_notifications:
  top = claim_notifications.pop()
  if top.claim.time > datetime.now() - timedelta(hours = 6):
    # claimed less than 6 hours ago; everything behind it in the queue
    # is newer still, so put it back and stop
    claim_notifications.unpop(top) # could also be done with peek/pop, whatever
    break
  elif not top.done:
    top.claim.reply(Comments.DID_YOU_FORGET)
  # else no-op
@nic-hartley (author):
My pseudo-PRAW here is a little inconsistent, but I hope it gets the message across anyway.
