Can we formulate ML job scheduling as a reinforcement learning problem while accounting for jobs that may be preempted?

Paper | State Representation | Action Representation | Reward Function | RL Customizations | Policy Representation | Goal |
---|---|---|---|---|---|---|
Resource Management with Deep Reinforcement Learning (HotNets '16) | An image of the cluster's current resource allocation and the resource demands of the jobs waiting to be scheduled | Whether to admit a job or not | Negative of the total slowdown | REINFORCE with a baseline subtracted from the returns to reduce variance | Simple feed-forward network | Reduce the average job slowdown |
Device Placement Optimization with Reinforcement Learning (ICML '17) | A computation graph + a list of devices | The given placement strategy | Square root of the total running time in a given placement or a large const |