@WhoAteDaCake
Created June 4, 2019 08:43
Processing pipeline

Overview

Processor

  • Individual servers that perform the tasks
  • Endpoints
    • GET /meta
      • Provides resource usage as well as name and id
    • POST /process
  • Should register with the manager
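The gist only fixes that GET /meta returns resource usage plus a name and id; a minimal sketch of such a payload in Python, where the field names `memory_kb` and `cpu_seconds` are assumptions (and `ru_maxrss` is kilobytes on Linux, bytes on macOS):

```python
import resource

def meta_payload(name: str, proc_id: str) -> dict:
    """Body a processor could return from GET /meta. The exact field
    names are assumptions; the gist only says 'resource usage as well
    as name and id'."""
    usage = resource.getrusage(resource.RUSAGE_SELF)
    return {
        "name": name,
        "id": proc_id,
        "memory_kb": usage.ru_maxrss,                   # peak resident set size
        "cpu_seconds": usage.ru_utime + usage.ru_stime, # user + system time
    }
```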

Manager

Split into three parts. Uses a Redis database to store metadata.

Handler

  • Endpoints
    • POST /register
      • Register a new processor
      • Should pass a name and id
      • On register
        • The handler will ping the processor back
          • /meta to validate the name and id
          • (maybe) ping to validate the /process endpoint
    • GET /meta
      • Resource usage of handler
      • List of process data:
        • Name, id, usage stats, average time taken, items processed, connected on
    • POST /action
      • Create a new action that will be put in the queue for processors to execute
      • Usually an array of jobs together with a data entry
      • The data item is 'reduced' over the processors
    • POST /complete
      • A processor makes this request to signify the end of a job
      • The item will be:
        • If it has any processors left -> put back in the pending queue
        • Otherwise moved to the completion queue
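Assuming the item shape (action id, list of processor ids, data) given in the Database section, the /complete routing could be sketched as below; the head-of-list reduction order and the in-memory queue objects are assumptions for illustration:

```python
from collections import deque

def on_complete(item, result, pending_q, completed_q):
    """Route an item after a processor reports POST /complete.
    `item` is (action_id, remaining_processor_ids, data); `result` is
    the output of the processor that just finished, so the data entry
    is 'reduced' over the processor list one step at a time."""
    action_id, remaining, _old_data = item
    remaining = remaining[1:]  # assumption: the head processor just ran
    if remaining:              # processors left -> back to the pending queue
        pending_q.append((action_id, remaining, result))
    else:                      # done -> completion queue
        completed_q.append((action_id, result))

# An action with two jobs: the data passes through both processors.
pending, completed = deque(), deque()
on_complete(("a1", ["resize", "compress"], b"raw"), b"resized", pending, completed)
on_complete(pending.popleft(), b"small", pending, completed)
```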
Scheduler

Uses FIFO scheduling (may be updated in the future)

  • Handles input queue
    • Sends out items from the queue to processors
    • Puts the item into pending queue
  • Handles pending queue

  • Handles output queue
    • Sends responses back combined with the action id
    • Also calculates average processor execution times

TODO: kick an item off the queue if it takes too long?
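One FIFO pass over the input queue might look like this sketch; `send` is a hypothetical stand-in for the HTTP POST to a processor's /process endpoint, which the gist doesn't spell out:

```python
from collections import deque

def schedule_step(input_q, pending_q, send):
    """One FIFO scheduling step: oldest item out of the input queue,
    off to a processor, then parked in the pending queue until that
    processor calls POST /complete. Returns False when idle."""
    if not input_q:
        return False
    item = input_q.popleft()   # FIFO: oldest item first
    send(item)                 # hypothetical POST /process call
    pending_q.append(item)
    return True

# Drain a small input queue; `sent.append` plays the role of `send`.
sent = []
input_q = deque([("a1", ["resize"], b"x"), ("a2", ["crop"], b"y")])
pending_q = deque()
while schedule_step(input_q, pending_q, sent.append):
    pass
```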

Health check
  • Runs periodic health checks to make sure the processors are still working
    • Updates resource usage
  • Resends failed processes
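A sketch of one such periodic pass, assuming the item shape from the Database section; `probe(proc_id)` is a hypothetical stand-in for calling the processor's GET /meta, and requeuing a dead processor's in-flight items onto the input queue is one way to "resend failed processes":

```python
from collections import deque

def health_pass(processors, pending_q, input_q, probe):
    """One periodic health pass. `probe(proc_id)` returns fresh usage
    stats, or raises if the processor is down. A dead processor is
    dropped and its in-flight items go back to the input queue so the
    scheduler resends them."""
    for proc_id, meta in list(processors.items()):
        try:
            meta.update(probe(proc_id))   # refresh resource usage
        except Exception:
            del processors[proc_id]
            for item in [i for i in pending_q if proc_id in i[1]]:
                pending_q.remove(item)
                input_q.append(item)

# p2 fails its probe: it is dropped and its pending item is requeued.
procs = {"p1": {}, "p2": {}}
pending = deque([("a1", ["p1"], b"x"), ("a2", ["p2"], b"y")])
inp = deque()

def probe(pid):
    if pid == "p2":
        raise ConnectionError("no response from /meta")
    return {"cpu": 0.1}

health_pass(procs, pending, inp, probe)
```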

How it works

Each processor is provided with the URL of a manager. As soon as it starts, it registers itself with that manager.
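The startup registration could be built like this; per the Handler section only a name and id are sent, while JSON as the body format is an assumption:

```python
import json
import urllib.request

def build_register_request(manager_url: str, name: str, proc_id: str):
    """Build the POST /register call a processor issues on startup.
    The manager then calls the processor's /meta back to validate the
    name and id it received."""
    body = json.dumps({"name": name, "id": proc_id}).encode()
    return urllib.request.Request(
        manager_url.rstrip("/") + "/register",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_register_request("http://localhost:8000", "resize", "proc-1")
# urllib.request.urlopen(req) would then send it to the manager.
```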

Database

Could potentially be any database in the future, but I chose to use Redis

  • Processors meta
    • url
    • name
    • id
    • memory usage
    • cpu usage
    • last average time
    • processed item count
    • connected on
  • Actions queue
    • action id
    • response url
  • Input queue
    • Items here are waiting their turn to be processed
    • Contains (action id, list of processor ids, data)
  • Pending queue
    • Items here have been sent to a processor and are awaiting completion
  • Output queue
    • Items here are ready to be sent back to the issuers
    • Contains (action id, data)
  • Failed queue (FUTURE ADDITION)
    • Contains (action id, list of processor ids, data)
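The gist fixes the fields and queue names above but not the Redis keys, so the naming scheme in this sketch is an assumption:

```python
# Hypothetical Redis key layout for the structures described above.

def processor_key(proc_id: str) -> str:
    # hash: url, name, id, memory usage, cpu usage, last average time,
    # processed item count, connected on
    return f"processor:{proc_id}:meta"

def action_key(action_id: str) -> str:
    # hash: response url
    return f"action:{action_id}"

# Redis lists used as FIFO queues (e.g. LPUSH by the producer,
# RPOP by the scheduler).
QUEUES = {
    "input": "queue:input",      # items waiting their turn
    "pending": "queue:pending",  # items sent to a processor
    "output": "queue:output",    # items ready to go back to the issuers
    "failed": "queue:failed",    # future addition
}
```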

TODO

  • Dealing with failure?
    • Should an action allow specifying whether partial failure is acceptable?