@dkatz23238
Last active April 8, 2019 19:21
pybotlib-protocol

The Pybotlib Protocol Spec Sheet

This document specifies best practices for building RPAs in Python using pybotlib. These requirements and specifications are the result of many years of building RPAs with Python and deploying them to remote virtual desktops running both Linux and Windows.

Definitions

  • pybotlib RPA: A specific robotic process automation developed using pybotlib and other Python tools. An individual pybotlib RPA comprises all of the static source code files that are hosted in a git repository and ready to be deployed.

A pybotlib RPA must use the following directory structure:

<pybotlib RPA Name>
│   README.md # Documentation for the RPA
│   run_RPA.sh # Runs one or more Python scripts that execute the RPA
│   run_RPA.py # The RPA Python code
│   get_pybotlib.sh # A script to get the latest version of pybotlib and other dependencies
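
The required layout can be checked before deployment. The following is a minimal sketch using only the Python standard library; the helper name `missing_files` is hypothetical and is not part of pybotlib.

```python
from pathlib import Path

# File names taken from the directory structure listed above.
REQUIRED_FILES = ("README.md", "run_RPA.sh", "run_RPA.py", "get_pybotlib.sh")


def missing_files(rpa_dir):
    """Return the required files that are absent from rpa_dir, in listed order."""
    root = Path(rpa_dir)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]
```

Calling `missing_files(".")` from the root of an RPA repository returns an empty list when all four files are present.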

Specifications

0 -> Strive to use Linux

  • Name: Strive to use Linux
  • Description: Unless the RPA depends on applications that only run on Windows, it is ideal to develop RPAs with Python and pybotlib in Ubuntu desktop environments. The ubuntu-client branch is tested and can be deployed to any Ubuntu server running a desktop environment such as GNOME, LXQt, MATE, and others.
  • Additional-Information: Additional relevant information to the specification
  • Category: Technology

1 -> Decouple Business Data

  • Name: Decouple business input and output data
  • Description: Any business process input data that is subject to change must be decoupled from the RPA and kept in a separate source that human operators can easily edit. For example, a Google Sheets spreadsheet that the RPA reads on each execution keeps the business data separate from the RPA source code. The source code of a pybotlib RPA must be static and should only change to fix bugs or to improve or add features.
  • Additional-Information: I recommend running a local instance of MinIO object storage and using it as the data input and output source that the RPAs read from and write to. Google Drive, OneDrive, and other cloud storage options are also viable.
  • Category: Business
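
As an illustration of this decoupling, the sketch below reads business rows from a CSV source that lives outside the RPA code. It uses only the standard library. The environment variable `RPA_INPUT_CSV`, the default file name, and the column names are assumptions made for the example; a real deployment might fetch the file from object storage or a spreadsheet service instead.

```python
import csv
import io
import os


def load_business_rows(stream):
    """Parse business input rows from a CSV text stream into dictionaries."""
    return [dict(row) for row in csv.DictReader(stream)]


def open_input_source():
    """Open the file named by the hypothetical RPA_INPUT_CSV variable."""
    path = os.environ.get("RPA_INPUT_CSV", "input_data.csv")
    return open(path, newline="", encoding="utf-8")


# Demonstration with an in-memory stand-in for the operator-maintained file.
sample = io.StringIO("customer,amount\nACME,100\nGlobex,250\n")
rows = load_business_rows(sample)
print(rows[0]["customer"], rows[1]["amount"])  # prints: ACME 250
```

Because the RPA only sees rows, operators can change the data without anyone touching the source code.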

2 -> Single entrypoints for RPA

  • Name: Single entrypoint for RPAs
  • Description: The entrypoint for the RPA must be a bash-executable script called run_RPA.sh (or run_RPA.bat on Windows). This script must first run get_pybotlib.sh, a simple script that clones the most recent version of pybotlib. After this initial git clone, the script should call any other scripts needed to execute the RPA. Ideally the script should also clean up the environment it ran in, since any persisted data should be decoupled from the virtual desktop running the RPA. A single, well-commented entrypoint script allows for simple deployment of individual RPAs. Complex RPAs may need several Python scripts to run in a specific order; the single entrypoint can control all of this.
  • Additional-Information: An example of run_RPA.sh is provided in this gist.
  • Category: Technology
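
A deployment tool that launches RPAs on both operating systems could pick the entrypoint by platform. This is a rough sketch rather than part of pybotlib; the function name `entrypoint_command` is hypothetical, and only the two file names come from the specification above.

```python
import platform


def entrypoint_command(system=None):
    """Return the command that starts the RPA's single entrypoint.

    system defaults to the value reported by platform.system().
    """
    system = system or platform.system()
    if system == "Windows":
        return ["cmd", "/c", "run_RPA.bat"]
    return ["bash", "run_RPA.sh"]
```

The returned list can be passed to `subprocess.run(..., check=True)` from the root of the RPA repository so that a failing entrypoint raises an error.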

3 -> Use Git for Deploying and Maintaining

  • Name: Use Git for versioning and deploying
  • Description: Python RPAs are like other Python programs and can be versioned using git. It is recommended to keep a centralized collection of Python RPAs within your organization, with an individual repository for every RPA. RPA projects tend to evolve over time and need to be accessed by various team members throughout a project. Using either an internal git server or a private external git service such as GitHub or Bitbucket is essential for deploying, maintaining, and versioning RPAs.
  • Additional-Information: You can spin up a simple internal private git server using an open source project called Gitea.
  • Category: Technology

4 -> Use intelligent logging

  • Name: Use pybotlib logging
  • Description: Pybotlib can create CSV log files on the fly, and the library takes care of managing them. It is recommended to use this functionality to accurately log transactional and execution information and to have the log files uploaded or sent to a decoupled cloud file storage service.
  • Additional-Information: An example of the logging capability of pybotlib can be found in the pybotlib git repository.
  • Category: Technology
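
To show what transactional CSV logging can look like, here is a generic sketch built on the standard library. It is not pybotlib's logging API, which is documented in the pybotlib repository; the function name `append_log_row` and the column names are assumptions chosen for the example.

```python
import csv
import os
from datetime import datetime, timezone

# Hypothetical column layout for a transaction log.
LOG_FIELDS = ("timestamp", "transaction_id", "status", "message")


def append_log_row(log_path, transaction_id, status, message):
    """Append one row to a CSV log, writing the header if the file is new."""
    is_new = not os.path.exists(log_path) or os.path.getsize(log_path) == 0
    with open(log_path, "a", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=LOG_FIELDS)
        if is_new:
            writer.writeheader()
        writer.writerow({
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "transaction_id": transaction_id,
            "status": status,
            "message": message,
        })
```

Appending one row per transaction keeps the log readable in any spreadsheet tool, and the finished file can be uploaded to external storage before the environment is cleaned up.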
# run_RPA.sh: single entrypoint for the RPA
# First run get_pybotlib.sh to fetch pybotlib
bash get_pybotlib.sh
# Then run the developed RPA
python --version
echo "Executing pybot RPA"
echo "RPA Initializing"
# Install any RPA-specific dependencies provided in the pybotlib RPA git repo
python -m pip install -r requirements.txt
# Run the pybotlib RPA
python "./run_RPA.py"
# Finish and clean up (upload or send the logs elsewhere first if they must be kept)
sudo rm -r ./pybotlib_logs
echo "Process Complete!"
# get_pybotlib.sh: fetch pybotlib and its dependencies
# sudo apt-get update
# sudo apt install nano git python-pip -y
# sudo chmod -R 777 /home/robot/Desktop
# Clone pybotlib
git clone -b ubuntu-client https://github.com/dkatz23238/pybotlib.git pybotlib-clone
cp -r pybotlib-clone/pybotlib ./pybotlib
cp pybotlib-clone/requirements.txt ./pybotlib-requirements.txt
cp pybotlib-clone/get_geckodriver.sh ./get_geckodriver.sh
rm -rf ./pybotlib-clone
python -m pip install -r pybotlib-requirements.txt
rm ./pybotlib-requirements.txt
# Run the geckodriver installer before deleting it
bash get_geckodriver.sh
rm ./get_geckodriver.sh
# Start Firefox once to run the default geckodriver init scripts, then close it
firefox &
sleep 15
pkill firefox