@jhutchings
Last active November 4, 2022 15:04
Folding at Home Babysitter - Original Code @danielocdh
#!/usr/bin/python3
# 1.0 - Original Code Belongs to @danielocdh
# 1.1 - Added ability to check FAH Control APIs on other ports
# 1.2 - Added support for Linux Control API (Slight changes in response checks)
# 1.3 - Fix for latest version of FAH Control API and Client (7.6.9)
# 1.4 - Added @danielocdh's feedback and his local changes around !# and spacing in the substitution
# 1.5 - @danielocdh updated expected result for authentication issues
# 1.6 - Removed un-needed tEnd references to end of readResult
# 1.7 - Added getting started
# 1.8 - Added "hacky" check to see if there is a new version of the gist using github rest api
################################################################################
## getting started ##
################################################################################
# 1. Install Python 3
# - https://www.howtogeek.com/197947/how-to-install-python-on-windows/
# - Scroll down the page until you get to "How to Install Python 3" and follow it
# 2. Create a folder somewhere memorable and easy to find, e.g. C:\babysitter
# 3. Copy and paste the contents of this gist into notepad and save it as "babysitter.py" in C:\babysitter\
# 4. Next open a Command Prompt
# - Start Button > Command Prompt
# - Type: cd C:\babysitter
# - Hit Enter
# - Type: python babysitter.py
# 5. That should be all you need to do if you are just running 1 computer.
# If you need to run on more machines there are further instructions below
################################################################################
## options ##
################################################################################
hosts = [ #list of quoted strings (hosts or IPs), separated by commas, each with an optional colon-separated port (e.g. 'localhost:36331')
    'localhost'
]
hostsPassword = '' #quoted string, if the host(s) don't use a password just leave it as: ''
restartLimit = 10 * 60 #in seconds, pause+unpause a slot if its next attempt to get a WU is this far away or more
checkEvery = 2 * 60 #in seconds, interval between checks of all hosts
checkUpdate = True # True or False, check for update in the script
checkUpdateCycles = 30 # number, multiply this by checkEvery and it will tell you how long between checks (defaults: 30 * 2 * 60 = 1 hour)
tConTimeout = 15 #in seconds, connection timeout
tReadTimeout = 10 #in seconds, read timeout
testMode = False # if set to True: checkEvery=6 and restartLimit=0 but won't actually pause+unpause slots
################################################################################
## code ##
################################################################################
import json
import re
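import telnetlib # removed from the standard library in Python 3.13 (PEP 594), so run this script with Python 3.12 or older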
import time
import datetime
import urllib.request
import urllib.parse
if testMode:
    restartLimit = 0
    checkEvery = 6
version = 10 # internal version number that equals the number of commits in https://api.github.com/gists/1f3ac2f27790506b5e9bd0c1ec356d49/commits
countCycles = checkUpdateCycles # counter for cycles passed, set initially to the same as the interval so it'll give user feedback
countEvery = 1 #in seconds, has to be a factor of checkEvery, default: 1
countEveryDec = max(0, str(countEvery)[::-1].find('.'))
countEveryDecStr = f'{{:.{countEveryDec}f}}'
def remSeconds(seconds):
    if seconds > 0:
        if (seconds * 10000) % (countEvery * 10000) == 0:
            secondsP = countEveryDecStr.format(seconds)
            pr(f'Next check in {secondsP} seconds', same=True)
        time.sleep(countEvery)
        seconds = round((seconds - countEvery) * 10000) / 10000
        remSeconds(seconds)
checkUpdateEnabled = checkUpdate # keep the option's value, the function defined next reuses the name checkUpdate
def checkUpdate():
    global countCycles
    countCycles += 1
    if checkUpdateEnabled and countCycles >= checkUpdateCycles:
        countCycles = 0
        try:
            resp = urllib.request.urlopen('https://api.github.com/gists/1f3ac2f27790506b5e9bd0c1ec356d49/commits', timeout=tConTimeout)
            if resp:
                commits = json.loads(resp.read().decode('utf-8'))
                if len(commits) > version:
                    print("New version of babysitter script is available at https://gist.github.com/jhutchings/1f3ac2f27790506b5e9bd0c1ec356d49")
        except Exception as err:
            print("Error checking version, continuing to run. Will check later!")
prLastLen = 0
prLastSame = False
def pr(t, indent=0, same=False, overPrev=False):
    global prLastLen, prLastSame
    if not overPrev and not same and prLastSame:
        prLastLen = 0
        print('')
    t = str(t)
    toPrint = (' ' * indent) + t
    tLen = len(toPrint)
    print(toPrint + (' ' * max(0, prLastLen - tLen)), end='\r')
    prLastSame = same
    prLastLen = tLen
    if not same:
        print('')
        prLastLen = 0
def checkKeep():
    while True:
        checkAll()
        checkUpdate()
        remSeconds(checkEvery)
def checkAll():
    for host in hosts: check(host)
    now = datetime.datetime.now().strftime('%Y-%m-%d %H:%M:%S')
    pr(f'check complete at {now}', 0, False, True)
prompt = r'\n*>\s*'.encode('utf-8') # raw string, so \s reaches the regex engine without an invalid-escape warning
pyonEnd = '\n---\n'.encode('utf-8')
def readResult(expected, expectedResult=''):
    index = expected[0]
    readB = expected[2]
    read = readB.decode('utf-8')
    #nothing
    if index < 0 or read == '': return [False, 'nothing was read']
    #expected result
    if expectedResult:
        result = re.sub(r'\s+>$', '', read.strip())
        if result != expectedResult:
            return [False, f'{readB}']
    #PyON->json
    match = re.search(r'\n*PyON\s+(\d+)\s+([-_a-zA-Z\d]+)\n(.*)\n---\n', read, re.DOTALL)
    #print('');print('');print('');print(index);print(match);print("read");print(read);print("readB");print(readB);print('');
    if match:
        version = match.group(1)
        if version != '1': raise Exception('Response data version does not match')
        data = match.group(3)
        #to json
        data = re.sub(r'(:\s*)False', r'\1false', data)
        data = re.sub(r'(:\s*)True', r'\1true', data)
        data = re.sub(r'(:\s*)None', r'\1null', data)
        data = json.loads(data)
        return [True, data]
    #auth error
    match = re.search(r'\nERROR: unknown command or variable', read, re.DOTALL)
    if match:
        raise Exception('error sending command, wrong password?')
    #return read
    return [True, read]
def tnCreate(host):
    match = re.search(r'(.*):(\d+)', host)
    port = 36330
    tEnd = [prompt]
    if match:
        host = match.group(1)
        port = int(match.group(2))
    tn = telnetlib.Telnet(host, port, tConTimeout)
    readResult(tn.expect(tEnd, tReadTimeout))
    return tn
def sendCmd(tn, cmd, par=''):
    #print(cmd);
    if cmd == 'auth':
        tEnd = [prompt]
        if hostsPassword:
            cmdStr = f'auth {hostsPassword}'
            tn.write(f'{cmdStr}\n'.encode('utf-8'))
            res = readResult(tn.expect(tEnd, tReadTimeout), 'OK')
            if not res[0]: raise Exception(f'Error with {cmd}, {res[1]}')
            return res[1]
        return True
    elif cmd == 'exit':
        cmdStr = f'{cmd}'
        tn.write(f'{cmdStr}\n'.encode('utf-8'))
        tEnd = [prompt]
        res = readResult(tn.expect(tEnd, tReadTimeout))
        if not res[0]: raise Exception(f'Error with {cmd}, {res[1]}')
        return res[1]
    elif cmd == 'slot-info' or cmd == 'queue-info':
        cmdStr = f'{cmd}'
        tn.write(f'{cmdStr}\n'.encode('utf-8'))
        tEnd = [pyonEnd]
        res = readResult(tn.expect(tEnd, tReadTimeout))
        if not res[0]: raise Exception(f'Error with {cmd}, {res[1]}')
        return res[1]
    elif cmd == 'get-info-and-restart':
        queueData = sendCmd(tn, 'queue-info')
        slotData = sendCmd(tn, 'slot-info')
        ###
        #if type(queueData) == str: print('');print('');print('');print(queueData);print(queueData.encode('utf-8'));print('');
        #if type(slotData) == str: print('');print('');print('');print(slotData);print(slotData.encode('utf-8'));print('');
        restarted = []
        for slot in slotData:
            isStillRunning = False
            queueDl = False
            for queue in queueData:
                if queue['slot'] == slot['id']:
                    if queue['state'] == 'RUNNING': isStillRunning = True
                    if queue['state'] == 'DOWNLOAD': queueDl = queue
            if not isStillRunning and queueDl and queueDl['waitingon'] == 'WS Assignment':
                match = re.match(r'\s?(\d+ days?)?\s?(\d+ hours?)?\s?(\d+ mins?)?\s?([\d.]+ secs?)?', queueDl['nextattempt'])
                if match:
                    seconds = 0
                    if match.group(1): seconds += int(re.sub(r'[^\d.]', '', match.group(1))) * 3600 * 24
                    if match.group(2): seconds += int(re.sub(r'[^\d.]', '', match.group(2))) * 3600
                    if match.group(3): seconds += int(re.sub(r'[^\d.]', '', match.group(3))) * 60
                    if match.group(4): seconds += round(float(re.sub(r'[^\d.]', '', match.group(4))) * 1)
                    if seconds >= restartLimit:
                        if not testMode:
                            sendCmd(tn, 'pause', queueDl['slot'])
                            time.sleep(1)
                            sendCmd(tn, 'unpause', queueDl['slot'])
                        restarted.append([queueDl['slot'], queueDl['nextattempt']])
                else: raise Exception(f'Error with {cmd}, parsing queue nextattempt:{queueDl["nextattempt"]}')
        return restarted
    elif par and (cmd == 'pause' or cmd == 'unpause'):
        tEnd = [prompt]
        cmdStr = f'{cmd} {par}'
        tn.write(f'{cmdStr}\n'.encode('utf-8'))
        res = readResult(tn.expect(tEnd, tReadTimeout))
        if not res[0]: raise Exception(f'Error with {cmd}, {res[1]}')
        return res[1]
    else: return False
def check(host):
    st = time.time()
    pr(f'checking {host}', 1, True)
    try:
        tn = tnCreate(host)
        sendCmd(tn, 'auth')
        restarted = sendCmd(tn, 'get-info-and-restart')
        if len(restarted):
            pr(f'{host}: restarted {len(restarted)} slot{"s" if len(restarted) > 1 else ""}: ' + ', '.join(map(lambda item: '' + (' with '.join(item)), restarted)), 1, False, True)
        sendCmd(tn, 'exit')
        ed = time.time()
        time.sleep(max(0, 1 - (ed - st)))
    except Exception as err:
        pr(f'{host} error: {err}', 1, False, True)
checkKeep()
@danielocdh

Oh wow, that's... wow. Gonna have to think what to do about it.

Anyway, the authentication was not working with 1.4. I made some changes and tested with Windows (7.5.1 and 7.6.9) and Ubuntu (7.5.1 and 7.6.9) FAHClients, and it seems to be working now.
I changed prompt to prompt = '\n*>\s*'.encode('utf-8') and replaced the #expected result block inside readResult with:

    #expected result
    if expectedResult:
        result = re.sub('\s+>$', '', read.strip())
        if (result != expectedResult):
            return [False, f'{readB}']

I didn't have any "waiting on WS assignment" slots on the 4 systems, so I tested pause/unpause semi-manually.
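
For anyone following along, here is a minimal standalone sketch of that check. It assumes the client answers a successful auth with OK followed by a fresh > prompt. The helper name matchesExpected is made up for illustration and is not part of the script.

```python
import re

def matchesExpected(readB, expectedResult):
    # hypothetical helper, mirrors the "#expected result" block in readResult
    read = readB.decode('utf-8')
    # drop surrounding whitespace, then the trailing prompt character and the whitespace before it
    result = re.sub(r'\s+>$', '', read.strip())
    return result == expectedResult

print(matchesExpected(b'OK\n> ', 'OK'))                                     # True
print(matchesExpected(b'ERROR: unknown command or variable\n> ', 'OK'))     # False
```

Anything other than the bare expected text before the prompt counts as a failure, which is how a wrong password ends up reported as an auth error.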

@tamaracha

Hi, I was not aware of this gist until I was mentioned here. Indeed, both decisions (PyON and telnet) look really weird from today's technical point of view. ;-) I have no experience in Python, but I thought that YAML was the Python way of serialization if JSON was not sufficient. Both would be fine in my opinion. The telnet interface serves as a frontend for humans and as an API for machines at the same time, which leads to mixed concerns and makes it difficult for both kinds of users.

I didn't find anything about PyON online, which is why I wrote my fah-pyon package. You are welcome to try it out if you find it useful. It's not on npm, but it can be installed from GitHub releases. I may need to update the readme since I migrated to nearley.

I also have a fah-client package which contains helper stuff for command generation and response parsing. This is more related to the telnet topic, but it uses fah-pyon.
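
For comparison, the regex route that the babysitter script above takes for PyON can be sketched in Python as below. This is only a sketch under assumptions: the name parsePyON and the sample message are invented for illustration, and the message shape (a "PyON 1 name" header, a body, then "---") is taken from what the script matches, not from any specification.

```python
import json
import re

# header "PyON <version> <name>", then the body, then a line with "---"
PYON_MESSAGE = re.compile(r'PyON\s+(\d+)\s+([-_a-zA-Z\d]+)\n(.*?)\n---', re.DOTALL)

def parsePyON(text):
    # hypothetical helper: returns (name, data) or None when no PyON message is found
    match = PYON_MESSAGE.search(text)
    if not match:
        return None
    version, name, body = match.groups()
    if version != '1':
        raise ValueError('Response data version does not match')
    # turn Python literals into JSON literals, only where they follow a colon
    body = re.sub(r'(:\s*)False', r'\1false', body)
    body = re.sub(r'(:\s*)True', r'\1true', body)
    body = re.sub(r'(:\s*)None', r'\1null', body)
    return name, json.loads(body)

# made-up sample message, not captured from a real client
sample = 'PyON 1 slots\n[{"id": "00", "status": "RUNNING", "idle": False, "reason": None}]\n---\n'
name, data = parsePyON(sample)
print(name, data[0]['idle'], data[0]['reason'])  # slots False None
```

The substitution is not a real parser: a string value that happens to contain ": True" would be altered as well, which is one argument for a proper grammar.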

In fact, at the beginning of this journey, I wanted to create a more screenreader-friendly web frontend. The FahControl GUI is completely inaccessible and the WebControl could be nicer. I am trying to create an electron app which uses my fah libraries. It seems to work well, but I haven't published a project yet, because I wanted to explore my architecture decisions a bit further.

@jhutchings, if you're still interested in working with me (I am only one person), we should discuss the fields and tasks that are still open.

@jhutchings
Author

@danielocdh updated to 1.5 and then 1.6 to clean up the reference to tEnd that was no longer needed after cleaning up the expectedResult check

@tamaracha I wanted to bring visibility to your projects from these two, as I've been checking out your updates and am excited to have another person working on this. Do you have a board that you're working from? I'll try to be as active as possible to help update these libs. I like the idea of an Electron app; I haven't done any work in Electron yet but I can do some research. I was hoping to write a client on top of an initial grammar and lib, or, now that you've advanced so far, to work with you on what you have developed. After that, maybe host a service that the client communicates with and build a UI on top of that. Long term, it could potentially pull together data from other users of said client, since that data is more real-time and comes directly from the client, and then cross-relate it to the results from F@H Stats to shore up the values later or show the delta between client predictions and actual WS/CS values. Extreme OC's F@H stats are great, but their weak point is that they are fully dependent on the F@H team's updates, which are done only periodically and in bulk.

Granted, I have other after-work projects that I should be working on, so it might be a slow burn, but the more help I can give any of the aforementioned project owners alongside whatever I end up playing with, the better 😃

@jhutchings
Author

Added 1.7 with getting started info based on what marknd59 wrote in the forum

@bafoah

bafoah commented Apr 21, 2020

Hi, I just started writing my own code because I didn't know this was already written. My goal is just to get to know Python better (I also think this project could be interesting and useful).

Besides the WU shortage (and this pause-unpause solution), I think it would be better if this babysitter could notify us when something is wrong (like FAHClient freezing, i.e. no log updates for a looooong time). For now I'm just thinking of a Telegram bot.
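
A rough sketch of what such a notification could look like with only the standard library, using the Telegram Bot API sendMessage method. The function names, the token, the chat id and the message text are placeholders invented for this example, and nothing here is part of the babysitter script.

```python
import urllib.parse
import urllib.request

def buildTelegramRequest(botToken, chatId, text):
    # hypothetical helper: prepares a POST to the Bot API sendMessage method
    url = f'https://api.telegram.org/bot{botToken}/sendMessage'
    data = urllib.parse.urlencode({'chat_id': chatId, 'text': text}).encode('utf-8')
    return urllib.request.Request(url, data=data)

def notify(botToken, chatId, text, timeout=15):
    # actually sends the message; returns True on HTTP 200
    with urllib.request.urlopen(buildTelegramRequest(botToken, chatId, text), timeout=timeout) as resp:
        return resp.status == 200

# placeholder credentials, only builds the request and does not contact Telegram
req = buildTelegramRequest('123:ABC', '42', 'host1: no log updates for 30 minutes')
print(req.get_method(), req.full_url)
```

In the babysitter, a call like notify(...) could sit in the except branch of check(host), so that an unreachable or frozen client produces a message instead of only a console line.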

I also fold on several clients, so I think it would be better if I could "simplify" the process. For example, after installing FAHClient, instead of setting up the machine by hand, I want to just run this babysitter, and voilà!! Yes, I know it's only a one-time setup (and I'm just very lazy), but just imagine if you fold in the cloud and you're abusing Google free credit (yes, I do it, so I know the pain...)

I use eval just because I followed this reference https://github.com/FoldingAtHome/fah-control/wiki/3rd-party-FAHClient-API but since @jhutchings mentioned it, I plan to make my own "parsing" method (basically it just converts a string to a Python variable), maybe....
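
One middle ground between eval and a hand-written parser is ast.literal_eval from the standard library, which only accepts Python literals (lists, dicts, strings, numbers, True/False/None) and refuses anything else. A minimal sketch, assuming the PyON body is a plain Python literal; the helper name and the sample body are invented for illustration.

```python
import ast

def parsePyONBody(body):
    # hypothetical helper: evaluates literals only, function calls and names are rejected
    return ast.literal_eval(body)

# made-up sample body, not captured from a real client
slots = parsePyONBody('[{"id": "00", "idle": False, "reason": None}]')
print(slots[0]['idle'])

try:
    parsePyONBody('__import__("os").getcwd()')
except ValueError:
    print('rejected: not a literal')
```

Unlike eval, a malicious or broken response cannot run code this way; it just raises an exception that the caller can report.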

My babysitter future plan

  1. Have a config file, so the babysitter can run on several machines with different configurations (host list etc.)
  2. Manage FAHClient settings (user, passkey, team, next-unit-percentage, client-type etc.)
  3. Summarize all machines and folding slots into a simple matrix (e.g. PPD, GPU/CPU count)
  4. Display other information from external sites (like the project cause, because I think many people will be excited if they know they are folding for Covid-19, plus team info, total points, etc.)
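
As a sketch of the first point, a JSON config file could override the defaults that are currently hard-coded in the options block of the script. The file name babysitter.json, the loadConfig helper and the example host are assumptions made up for this illustration.

```python
import json
import os
import tempfile

# defaults taken from the options block of the script
DEFAULTS = {'hosts': ['localhost'], 'hostsPassword': '', 'restartLimit': 10 * 60, 'checkEvery': 2 * 60}

def loadConfig(path):
    # hypothetical helper: start from the defaults, override with whatever the file provides
    config = dict(DEFAULTS)
    try:
        with open(path, encoding='utf-8') as f:
            config.update(json.load(f))
    except FileNotFoundError:
        pass  # no config file on this machine, keep the defaults
    return config

# demo: write a small config file to a temporary folder and load it back
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, 'babysitter.json')
    with open(path, 'w', encoding='utf-8') as f:
        json.dump({'hosts': ['localhost', '192.168.1.20:36331']}, f)
    config = loadConfig(path)

print(config['hosts'], config['checkEvery'])
```

Each machine could then carry its own babysitter.json next to an unmodified copy of the script.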

Maybe all of this isn't a good fit for Python (because it involves some kind of "GUI"). Maybe in the future I can create an "http-server" and just display everything in a web browser... maybe, I just don't know...

Maybe I'm just trying to make folding a little bit fun... for me and everyone else... Because, you know, it's boring, and it eats my electricity like a monster...

@jhutchings
Author

Added 1.8 with an optional check for script updates 😃

@jhutchings
Author

@bafoah I completely agree with writing something yourself, it's fun to do, right? I was about to start writing something for the pause issue, then found this and modified it. Then I was going to write some Node stuff and found the node-fah-xyz stuff, so I figured I might as well help that out.

The Telegram stuff is kinda cool, I haven't used that in a while (one of my servers used to Telegram me if there were issues with downloads). It can likely be useful for some people too! I kinda like the idea of having the default values for your setup in babysitter as well, since it's not too hard and it makes sure people are configured correctly. Heck, the other day I found out my Azure client had lost a config entry and I was folding for Default on Anonymous for I don't know how long.

But honestly keep on building! It's fun right? Honestly whenever something I'm playing with has an API I have to immediately look at what data is available via that API to imagine new things I could build from it...
I have similar ideas to you for that UI I was talking about above 👍

As far as the eval stuff goes, the comments above are purely observational and nothing against the approach. Heck, the Folding@Home people are using it for their implementation of FAHControl. There are inherent risks with it, but if you look at the possibility of attack, it's fairly low, right?
