Folding at Home Babysitter - Original Code @danielocdh
#!/usr/bin/python3
# 1.0 - Original Code Belongs to @danielocdh
# 1.1 - Added ability to check FAH Control APIs on other ports
# 1.2 - Added support for Linux Control API (Slight changes in response checks)
# 1.3 - Fix for latest version of FAH Control API and Client (7.6.9)
# 1.4 - Added @danielocdh's feedback and his local changes around !# and spacing in the substitution
# 1.5 - @danielocdh updated expected result for authentication issues
# 1.6 - Removed un-needed tEnd references to end of readResult
# 1.7 - Added getting started
# 1.8 - Added "hacky" check to see if there is a new version of the gist using github rest api
################################################################################
## getting started ##
################################################################################
# 1. Install Python 3
# - https://www.howtogeek.com/197947/how-to-install-python-on-windows/
# - Scroll down the page until you get to "How to Install Python 3" and follow it
# 2. Create a folder somewhere memorable, e.g. C:\babysitter
# 3. Copy and paste the contents of this gist into notepad and save it as "babysitter.py" in C:\babysitter\
# 4. Next open a Command Prompt
# - Start Button > Command Prompt
# - Type: cd C:\babysitter
# - Hit Enter
# - Type: python babysitter.py
# 5. That should be all you need to do if you are just running 1 computer.
# If you need to run on more machines there are further instructions below
################################################################################
## options ##
################################################################################
hosts = [ #list of quoted strings, hosts or IPs, each with an optional colon-separated port (e.g. localhost:36331), separated by commas
    'localhost'
]
hostsPassword = '' #quoted string, if the host(s) don't use a password just leave it as: ''
restartLimit = 10 * 60 #in seconds, pause+unpause a slot if its next attempt to get a WU is this far away or more
checkEvery = 2 * 60 #in seconds, check all hosts once every this many seconds
checkUpdate = True # True or False, check whether a newer version of this script is available
checkUpdateCycles = 30 # number of check cycles between update checks; multiply by checkEvery for the interval (defaults: 30 * 2 * 60 = 1 hour)
tConTimeout = 15 #in seconds, connection timeout
tReadTimeout = 10 #in seconds, read timeout
testMode = False # if set to True: checkEvery=6 and restartLimit=0 but won't actually pause+unpause slots
################################################################################
## code ##
################################################################################
import json
import re
import telnetlib
import time
import datetime
import urllib.request
import urllib.parse
if testMode:
    restartLimit = 0
    checkEvery = 6
version = 10 # internal version number that equals the number of commits in https://api.github.com/gists/1f3ac2f27790506b5e9bd0c1ec356d49/commits
countCycles = checkUpdateCycles # counter for cycles passed, set initially to the same as the interval so it'll give user feedback
countEvery = 1 #seconds, have to be a factor of checkEvery, default: 1
countEveryDec = max(0, str(countEvery)[::-1].find('.'))
countEveryDecStr = f'{{:.{countEveryDec}f}}'
def remSeconds(seconds):
    if seconds > 0:
        if (seconds * 10000) % (countEvery * 10000) == 0:
            secondsP = countEveryDecStr.format(seconds)
            pr(f'Next check in {secondsP} seconds', same=True)
        time.sleep(countEvery)
        seconds = round((seconds - countEvery) * 10000) / 10000
        remSeconds(seconds)
checkUpdateEnabled = checkUpdate # keep the option's value; the function below reuses the name checkUpdate and would otherwise always be truthy
def checkUpdate():
    global countCycles
    countCycles += 1
    if checkUpdateEnabled and countCycles >= checkUpdateCycles:
        countCycles = 0
        try:
            resp = urllib.request.urlopen('https://api.github.com/gists/1f3ac2f27790506b5e9bd0c1ec356d49/commits')
            if (resp):
                commits = json.loads(resp.read().decode('utf-8'))
                if (len(commits) > version):
                    print("New version of babysitter script is available at https://gist.github.com/jhutchings/1f3ac2f27790506b5e9bd0c1ec356d49")
        except Exception as err:
            print("Error checking version, continuing to run. Will check later!")
prLastLen = 0
prLastSame = False
def pr(t, indent=0, same=False, overPrev=False):
    global prLastLen, prLastSame
    if not overPrev and not same and prLastSame:
        prLastLen = 0
        print('')
    t = str(t)
    toPrint = (' ' * indent) + t
    tLen = len(toPrint)
    print(toPrint + (' ' * max(0, prLastLen - tLen)), end='\r')
    prLastSame = same
    prLastLen = tLen
    if not same:
        print('')
        prLastLen = 0
def checkKeep():
    while (True):
        checkAll()
        checkUpdate()
        remSeconds(checkEvery)
def checkAll():
    for host in hosts: check(host)
    now = datetime.datetime.now().strftime('%Y-%m-%d %H:%M:%S')
    pr(f'check complete at {now}', 0, False, True)
prompt = r'\n*>\s*'.encode('utf-8')
pyonEnd = '\n---\n'.encode('utf-8')
def readResult(expected, expectedResult=''):
    index = expected[0]
    readB = expected[2]
    read = readB.decode('utf-8')
    #nothing
    if index < 0 or read == '': return [False, 'nothing was read']
    #expected result
    if expectedResult:
        result = re.sub(r'\s+>$', '', read.strip())
        if (result != expectedResult):
            return [False, f'{readB}']
    #PyON->json
    match = re.search(r'\n*PyON\s+(\d+)\s+([-_a-zA-Z\d]+)\n(.*)\n---\n', read, re.DOTALL)
    #print('');print('');print('');print(index);print(match);print("read");print(read);print("readB");print(readB);print('');
    if match:
        version = match.group(1)
        if version != '1': raise Exception('Response data version does not match')
        data = match.group(3)
        #to json
        data = re.sub(r'(:\s*)False', r'\1false', data)
        data = re.sub(r'(:\s*)True', r'\1true', data)
        data = re.sub(r'(:\s*)None', r'\1null', data)
        data = json.loads(data)
        return [True, data]
    #auth error
    match = re.search('\nERROR: unknown command or variable', read, re.DOTALL)
    if match:
        raise Exception('error sending command, wrong password?')
    #return read
    return [True, read]
def tnCreate(host):
    match = re.search(r'(.*):(\d+)', host)
    port = 36330 #default FAH Control API port
    tEnd = [prompt]
    if match:
        host = match.group(1)
        port = int(match.group(2)) #the regex yields a string, convert it to a number
    tn = telnetlib.Telnet(host, port, tConTimeout)
    readResult(tn.expect(tEnd, tReadTimeout))
    return tn
def sendCmd(tn, cmd, par=''):
    #print(cmd);
    if cmd == 'auth':
        tEnd = [prompt]
        if hostsPassword:
            cmdStr = f'auth {hostsPassword}'
            tn.write(f'{cmdStr}\n'.encode('utf-8'))
            res = readResult(tn.expect(tEnd, tReadTimeout), 'OK')
            if not res[0]: raise Exception(f'Error with {cmd}, {res[1]}')
            return res[1]
        return True
    elif cmd == 'exit':
        cmdStr = f'{cmd}'
        tn.write(f'{cmdStr}\n'.encode('utf-8'))
        tEnd = [prompt]
        res = readResult(tn.expect(tEnd, tReadTimeout))
        if not res[0]: raise Exception(f'Error with {cmd}, {res[1]}')
        return res[1]
    elif cmd == 'slot-info' or cmd == 'queue-info':
        cmdStr = f'{cmd}'
        tn.write(f'{cmdStr}\n'.encode('utf-8'))
        tEnd = [pyonEnd]
        res = readResult(tn.expect(tEnd, tReadTimeout))
        if not res[0]: raise Exception(f'Error with {cmd}, {res[1]}')
        return res[1]
    elif cmd == 'get-info-and-restart':
        queueData = sendCmd(tn, 'queue-info')
        slotData = sendCmd(tn, 'slot-info')
        ###
        #if type(queueData) == str: print('');print('');print('');print(queueData);print(queueData.encode('utf-8'));print('');
        #if type(slotData) == str: print('');print('');print('');print(slotData);print(slotData.encode('utf-8'));print('');
        restarted = []
        for slot in slotData:
            isStillRunning = False
            queueDl = False
            for queue in queueData:
                if queue['slot'] == slot['id']:
                    if queue['state'] == 'RUNNING': isStillRunning = True
                    if queue['state'] == 'DOWNLOAD': queueDl = queue
            if not isStillRunning and queueDl and queueDl['waitingon'] == 'WS Assignment':
                match = re.match(r'\s?(\d+ days?)?\s?(\d+ hours?)?\s?(\d+ mins?)?\s?([\d.]+ secs?)?', queueDl['nextattempt'])
                if match:
                    seconds = 0
                    if match.group(1): seconds += int(re.sub(r'[^\d.]', '', match.group(1))) * 3600 * 24
                    if match.group(2): seconds += int(re.sub(r'[^\d.]', '', match.group(2))) * 3600
                    if match.group(3): seconds += int(re.sub(r'[^\d.]', '', match.group(3))) * 60
                    if match.group(4): seconds += round(float(re.sub(r'[^\d.]', '', match.group(4))) * 1)
                    if seconds >= restartLimit:
                        if not testMode:
                            sendCmd(tn, 'pause', queueDl['slot'])
                            time.sleep(1)
                            sendCmd(tn, 'unpause', queueDl['slot'])
                        restarted.append([queueDl['slot'], queueDl['nextattempt']])
                else: raise Exception(f'Error with {cmd}, parsing queue nextattempt:{queueDl["nextattempt"]}')
        return restarted
    elif par and (cmd == 'pause' or cmd == 'unpause'):
        tEnd = [prompt]
        cmdStr = f'{cmd} {par}'
        tn.write(f'{cmdStr}\n'.encode('utf-8'))
        res = readResult(tn.expect(tEnd, tReadTimeout))
        if not res[0]: raise Exception(f'Error with {cmd}, {res[1]}')
        return res[1]
    else: return False
def check(host):
    st = time.time()
    pr(f'checking {host}', 1, True)
    try:
        tn = tnCreate(host)
        sendCmd(tn, 'auth')
        restarted = sendCmd(tn, 'get-info-and-restart')
        if len(restarted):
            pr(f'{host}: restarted {len(restarted)} slot{"s" if len(restarted) > 1 else ""}: ' + ', '.join(map(lambda item: '' + (' with '.join(item)), restarted)), 1, False, True)
        sendCmd(tn, 'exit')
        ed = time.time()
        time.sleep(max(0, 1 - (ed - st)))
    except Exception as err:
        pr(f'{host} error: {err}', 1, False, True)
checkKeep()

bafoah commented Apr 21, 2020

Hi, I just started writing my own code because I didn't know this was already written. My goal is just to get to know Python better (and I find this project could be interesting and useful too).

Besides the WU shortage (and this pause-unpause solution), I think it would be better if this babysitter could notify us when something is wrong (like FAHClient freezing, with no log updates for a long time). For now I'm just thinking of a Telegram bot.
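
A minimal sketch of the freeze check described above, assuming the client's log file path is known and that "frozen" can be approximated by the log file's modification time not changing. `is_log_stale` and its parameters are hypothetical names, and delivering the alert (for example through a Telegram bot) is left out.

```python
import os
import time

def is_log_stale(log_path, max_idle_seconds, now=None):
    """Return True if log_path was last modified max_idle_seconds or more ago.

    `now` is a Unix timestamp; it defaults to the current time and exists
    mainly so the check can be tested without waiting.
    """
    now = time.time() if now is None else now
    idle_seconds = now - os.path.getmtime(log_path)
    return idle_seconds >= max_idle_seconds
```

A babysitter loop could call this once per cycle and send a notification only when the result flips from False to True, so a frozen client produces one alert instead of one per check.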

I also fold on several clients, so I think it would be better if I could "simplify" the process. For example, after installing FAHClient, instead of setting up the machine by hand, I want to just run this babysitter, and voilà! Yes, I know it's only a one-time setup (and I'm just very lazy), but just imagine you fold in the cloud and are abusing Google free credit (yes, I do it, so I know the pain...)

I use eval just because I followed this reference https://github.com/FoldingAtHome/fah-control/wiki/3rd-party-FAHClient-API but since @jhutchings mentioned it, I plan to write my own "parsing" method (basically it just converts a string to a Python variable), maybe....
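
One possible shape for such a parsing method, sketched with the standard library's `ast.literal_eval`, which only accepts Python literals and rejects function calls. The message layout (a `PyON <version> <name>` header, a body, and a `---` terminator) is taken from the regex in the script above; `parse_pyon` is a hypothetical name.

```python
import ast
import re

# Header, body and terminator, following the layout the script's own regex expects.
PYON_RE = re.compile(r'PyON\s+(\d+)\s+([-\w]+)\n(.*?)\n---', re.DOTALL)

def parse_pyon(message):
    """Convert one PyON message into Python data without calling eval()."""
    match = PYON_RE.search(message)
    if match is None:
        raise ValueError('no PyON block found')
    if match.group(1) != '1':
        raise ValueError('unsupported PyON version')
    # literal_eval handles lists, dicts, strings, numbers, True, False and None,
    # and raises ValueError for anything that is not a plain literal.
    return ast.literal_eval(match.group(3).strip())
```

Compared with the `re.sub` plus `json.loads` route used in the script, this avoids rewriting `True`, `False` and `None` by hand, at the cost of accepting any Python literal rather than only JSON-compatible ones.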

My future plans for the babysitter

  1. Have a config file, so the babysitter can run on several machines with different configurations (host list, etc.)
  2. Manage FAHClient settings (user, passkey, team, next-unit-percentage, client-type, etc.)
  3. Summarize all machines and folding slots into a simple matrix (e.g. PPD, GPU/CPU count)
  4. Display other information from external sites (like the project cause, because I think many people will be excited to know they are folding for Covid-19; also team info, total points, etc.)
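
As a rough illustration of the first item, here is a sketch that reads per-machine settings from a JSON file and falls back to defaults for anything missing. The key names mirror the option variables in the script above; the JSON format, the `load_config` helper and the file name in the usage note are assumptions, not something the gist provides.

```python
import json

# Defaults mirror the option values at the top of the script.
DEFAULTS = {
    'hosts': ['localhost'],
    'hostsPassword': '',
    'restartLimit': 10 * 60,
    'checkEvery': 2 * 60,
}

def load_config(path):
    """Load babysitter options from a JSON file, filling gaps with DEFAULTS."""
    with open(path, encoding='utf-8') as handle:
        loaded = json.load(handle)
    unknown = set(loaded) - set(DEFAULTS)
    if unknown:
        # Fail loudly on typos instead of silently ignoring a setting.
        raise ValueError(f'unknown config keys: {sorted(unknown)}')
    return {**DEFAULTS, **loaded}
```

Each machine could then keep its own file, for example a hypothetical `babysitter.json` containing only `{"hosts": ["localhost", "localhost:36331"]}`, and leave every other option at its default.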

Maybe all of this isn't a good fit for Python (because it involves some kind of "GUI"). Maybe in the future I can create an "http-server" and just display everything in a web browser... maybe, I just don't know...

Maybe I'm just trying to make folding a little bit more fun... for me and everyone else... Because, you know, it's boring, and it eats my electricity like a monster...

jhutchings (Author) commented Apr 21, 2020

Added 1.8 with an optional check for script updates 😃

jhutchings (Author) commented Apr 21, 2020

@bafoah I completely agree about writing something yourself; it's fun to do, right? I was about to write something for the pause issue, then found this and modified it. Then I was going to write some Node stuff and found the node-fah-xyz stuff, so I figured I might as well help that out.
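
For anyone writing their own version, the core of the pause issue is turning the client's `nextattempt` text into seconds and comparing it with a limit. A minimal sketch, assuming the text looks like what the script's regex expects (for example `2 hours 3 mins` or `12.50 secs`); `next_attempt_seconds` is a hypothetical name and this is a simplified variant, not the script's exact logic.

```python
import re

UNIT_SECONDS = {'day': 86400, 'hour': 3600, 'min': 60, 'sec': 1}
# One "<number> <unit>" pair, with an optional plural "s".
PART_RE = re.compile(r'([\d.]+)\s*(day|hour|min|sec)s?')

def next_attempt_seconds(text):
    """Convert a string such as '1 days 2 hours' into whole seconds."""
    total = 0.0
    for amount, unit in PART_RE.findall(text):
        total += float(amount) * UNIT_SECONDS[unit]
    return round(total)
```

A slot would then be paused and unpaused only when `next_attempt_seconds(...)` is at least the configured limit, which is 600 seconds with the script's default `restartLimit`.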

The Telegram stuff is kinda cool; I haven't used that in a while (one of my servers used to Telegram me if there were issues with downloads). It would likely be useful for some people too! I kinda like the idea of having the default values for your setup in the babysitter as well, since it's not too hard and it makes sure people are configured correctly. Heck, the other day I found out my Azure client had lost a config entry and I was folding for Default as Anonymous for I don't know how long.
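
A small sketch of that kind of configuration check, assuming the client's options have already been fetched into a dict with `user` and `team` keys, and that `Anonymous` and team `0` are the unconfigured values; both the dict shape and the helper name `find_default_identity` are assumptions for illustration.

```python
def find_default_identity(options):
    """Return a list of warnings if the client still folds under default identity."""
    problems = []
    # A missing entry is treated the same as the default value.
    if options.get('user', 'Anonymous') == 'Anonymous':
        problems.append('user is Anonymous')
    if str(options.get('team', '0')) == '0':
        problems.append('team is 0 (default)')
    return problems
```

Running this once per check cycle would have flagged the lost config entry on the first pass instead of after an unknown amount of folding.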

But honestly, keep on building! It's fun, right? Whenever something I'm playing with has an API, I immediately have to look at what data is available via that API to imagine new things I could build from it...
I have similar ideas to yours for the UI I was talking about above 👍

As far as the eval stuff goes, the comments above are purely observational and nothing against the approach. Heck, the Folding@Home people are using it in their own implementation of FAHControl. There are inherent risks with it, but the likelihood of an attack is fairly low, right?
