I'm putting this list together as a reading plan for myself, to learn more about cluster scheduling and utilization in general, and about the various ways of programming generically against clusters. The entries are direct links to PDFs, in an order that I think makes some sense from skimming the papers' reference sections.
Happy to hear of any additions that might be sensible.
- Google File System since everything references it and data locality is a thing.
- Google MapReduce because it's one of the earlier well-known functional approaches to programming against a cluster.
- Dryad for a more general (iterative?) programming model.
- Quincy for a different take on scheduling.
- [Delay Scheduling](h