@nqn
Created July 18, 2017 18:54

Meeting notes

Topology and NUMA awareness

Skipping for now. Device plugin is more pressing.

Agenda

Vish is out this week, so the roadmap discussion is pushed out to next week.

Device plugin

Renaud: What problems does the alternative proposal address? The critique has been that the two-way gRPC model is complicated to maintain and test. Where exactly are the problems with the current proposal?

Ram: Before we get started on the API, can I get to my topic? I have to step out at 11:30am. There isn't much use for hot swap for NICs (context: this came up while working on the plugin for SolarFlare NICs).

We do init and all the steps, and are pretty much ready for the allocate and monitor phases.

Essence: hot swap won't be a requirement. Maybe discovery can be skipped.

Niklas: How does the resource mapping happen if you skip discovery?

Ram: on_load stacks will populate.

Renaud: discovery is not about hot swap. It's about the plugin having more context about the device made available.

Jiaying: the information from the resource should be sent to the kubelet and api server.

Ram: Maybe the term 'discovery' isn't the right one. Proposing not having an explicit discover phase.

In the plugin you do lspci and you start initializing. During init, the software gains knowledge about the devices. At the end of the init phase, report back what resources you have. Init may take minutes and should be pulled out of the life cycle.
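Sketch (not from the meeting): a minimal Python model of the flow Ram describes, where the plugin finishes its own init first and only then reports its resources in a single one-way call. The class names, the `register` method, the resource name `example.com/nic`, and the PCI addresses are all hypothetical and are not the device plugin API under discussion.

```python
from dataclasses import dataclass, field


@dataclass
class Device:
    # Hypothetical device record; field names are illustrative only.
    device_id: str
    healthy: bool = True


@dataclass
class Kubelet:
    # Stand-in for the kubelet side: resource name -> advertised devices.
    resources: dict = field(default_factory=dict)

    def register(self, resource_name, devices):
        # One-way call: registration carries the device list, so the
        # kubelet never issues a separate discover request to the plugin.
        self.resources[resource_name] = list(devices)
        return len(devices)


class NicPlugin:
    def __init__(self, resource_name, pci_addresses):
        self.resource_name = resource_name
        # A real plugin would obtain these from something like lspci.
        self.pci_addresses = pci_addresses
        self.devices = []

    def init(self):
        # Potentially slow vendor initialization happens here, before
        # the plugin contacts the kubelet at all.
        self.devices = [Device(device_id=addr) for addr in self.pci_addresses]

    def run(self, kubelet):
        # Init first, then report what was found.
        self.init()
        return kubelet.register(self.resource_name, self.devices)


kubelet = Kubelet()
plugin = NicPlugin("example.com/nic", ["0000:03:00.0", "0000:03:00.1"])
advertised = plugin.run(kubelet)
```

In this model the long-running init sits entirely outside the exchange with the kubelet, which is one way to read "pulled out of the life cycle".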

Connor: move discover into registration, is that what it's about?

Jiaying: register was for authentication. Agreeing with Ram.

Renaud: What's the issue with them being separate?

Ram: To simplify.

Niklas: Suggesting to get a PoC done on the one-way version to convince the group about the reduced complexity. Feeling we are rabbit-holing on the RPC model.

Renaud: Using OIRs (opaque integer resources) in the PoC.

Jiaying: Last time we talked about extending the Node object.

Renaud: Because we are giving the devices when advertising, we don't need checkpointing.
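Sketch (not from the meeting): one possible reading of Renaud's point, assuming the kubelet keeps the device list only in memory and the plugin re-advertises its full list whenever it connects. Under that assumption a restarted kubelet recovers its view without persisting anything. `InMemoryKubelet`, `AdvertisingPlugin`, and the resource and device names are hypothetical.

```python
class InMemoryKubelet:
    def __init__(self):
        # Nothing is written to disk: state lives only in this dict.
        self.devices = {}

    def advertise(self, resource_name, device_ids):
        # Each advertisement replaces the previous view for that resource.
        self.devices[resource_name] = set(device_ids)


class AdvertisingPlugin:
    def __init__(self, resource_name, device_ids):
        self.resource_name = resource_name
        self.device_ids = list(device_ids)

    def connect(self, kubelet):
        # The plugin sends its full device list on every (re)connect.
        kubelet.advertise(self.resource_name, self.device_ids)


plugin = AdvertisingPlugin("example.com/nic", ["nic0", "nic1"])

first = InMemoryKubelet()
plugin.connect(first)

# Simulate a kubelet restart: a fresh instance starts with no state...
restarted = InMemoryKubelet()
state_before_reconnect = dict(restarted.devices)

# ...and recovers the device list as soon as the plugin reconnects.
plugin.connect(restarted)
```

The sketch only covers the advertised device list; it says nothing about which devices were already allocated to pods before the restart.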

Niklas: Suggesting to get to a conclusion by Tuesday.

Connor: How about turning the counterproposal into PRs and comments on the feature PR?

Jiaying: Agree with Connor. Just trying to reduce the scope of the initial prototype.

Renaud: Right now, the two-way version has 80% test coverage. Take a look at the current code and see how it looks. Going from one-way to two-way may require redoing the API.

Jiaying: That can be avoided.

Renaud: Can you provide a path for the transition from one-way to two-way?

Jiaying: (Listing ways to change one-way registration to two-way)

Renaud: Some parts of the device plugin will be in gRPC. Some parts of discovery in the device plugin will be pushed directly to the kubelet.

Niklas: Suggesting timeboxing.

Derek: do not aim for code by Tuesday (extreme requirement). By next: aim for one document and one PR. Proposal items need to be up about this time. Don't aim for code, folks get invested. False requirement.

Connor: Aim for getting the second proposal merged into the first one?

Derek: Makes sense. We have plenty of time to get code checked in. Get the proposals out.

Talk on Slack afterwards and pick two topics to talk about. Probably this and the CPU manager static proposal.

Renaud: Has anyone taken a look at the PoC PR?

Derek: It's about 8000 lines at this point, so don't expect too many reviews. BTW, can we discuss the versioning issue? How do we upgrade a node?

Draining a node first is OK. Just the management of the plugin DaemonSets needs to be answered.

Derek: What happens if 1.9 has new features? What's the impact on node upgrades?

Jiaying: Renaud, we should add that section in your PR.

Renaud: I'll add it.

Christopher: There was some mention of GPU metrics for 1.8.
