@Carreau
Last active March 14, 2024 08:51
NumFOCUS Summit 2023


Two weeks ago, from September 11th to 13th, the NumFOCUS Summit 2023 took place. NumFOCUS is the US-based nonprofit that acts as a financial entity for many projects of the Python ecosystem and beyond.

Every year, many projects send a few representatives to the NumFOCUS summit for a couple of days, and this year the summit was held near Amsterdam in the Netherlands. As this was in Europe, I was able to attend as one of the representatives of Jupyter (and IPython).

The NumFOCUS summit was roughly structured as an unconference, with pre-scheduled sessions in the morning, and allocated time in the afternoon for concurrent working groups. Those groups would be proposed in the morning and voted upon during lunch time.

My Experience

Here are some of my experiences attending the summit.

  • It was great having the summit in Europe, not only because it meant less travel for some of the participants, but also because getting visas for countries in North America can be difficult. Not having any time-zone change (or at least only an hour or two) is far less tiring, at least for European attendees.

  • The summit being in-person is IMHO important, not only to meet people you might already know from online, but also to have a chance to meet contributors and maintainers from other projects. A large share of the discussions happens outside of scheduled conference time and in smaller groups.

  • The unconference format is really useful for bringing out topics that would likely not come up otherwise. I would have loved to have better-structured notes of some of the meetings (the blame is partly on me), but it is quite hard to take notes and participate in a session at the same time.

What topics were brought up

This will of course be a biased view, as I attended only some of the sessions and have my own interests.

Learning about NumFOCUS processes and difficulties.

A few of the morning sessions were presented by NumFOCUS staff. We were shown a few numbers, and the main point I took away was that NumFOCUS has grown a lot since its inception and that the processes that worked when it was smaller don't scale to its current size. With now more than 60 projects under its umbrella, NumFOCUS, like many organisations, is experiencing growing pains.

Unlike companies, which have a certain uniformity, NumFOCUS works with a number of projects that each have their idiosyncrasies, which sometimes makes it difficult to find tools that suit all of them. For example, Jupyter is one of the last projects not (yet) using Open Collective to manage some of its finances.

I'm also happy to learn that NumFOCUS is working on changing its ticketing system, for a more streamlined interaction workflow between project members and NumFOCUS staff. This comes with its own challenges, as ticketing systems are often designed around a hard employee/customer relationship, whereas project members and NumFOCUS don't have such a hard divide.

Some unconference sessions

Security

One of the unconference sessions was about security, where we talked about many things related to security processes. One of the long topics was the scoring of CVEs, the assessment of their impact, and the review of fixes. This is something that can be fairly difficult for maintainers not trained in security, and one of the questions was whether the PSF should offer such a service for the Python ecosystem.

There was some mention of the new NumFOCUS Security committee, and what its role would be. The committee currently has 7 members and its formation is still in progress, but from the draft charter the high-level role would be:

The Security Committee will improve the safety of research, data, and scientific computing communities by assessing and making recommendations to address cybersecurity-related risk in NumFOCUS sponsored projects.

From my experience as part of the Jupyter Security committee, I think that having a role at the NumFOCUS level to centralise the reception of security reports, high-level triage, forwarding to the relevant project maintainers, and monitoring of responses would be helpful, as it would avoid duplicating this process in each project.

Some discussion happened around the conflicting operating modes of security vs open source and the difficulties this entails.

I mentioned that Jupyter recently participated in a bug bounty program financed by the European Commission. This was a successful program, with I believe more than 10k€ distributed as rewards. The problem is that, as a project, we did not get funds to pay developers to fix the issues. We are thus ambivalent about it.

We wondered whether a security bug bounty program should be run at the NumFOCUS level and how to fund it.

Documentation

Another unconference session I participated in was the documentation one. This was less structured than the previous ones, but there were some good interactions.

My main impression is that while people are happy with Sphinx, which is still widely used in the ecosystem, many projects are feeling its limitations and have the impression that we are trying to make Sphinx do more and more things it was not meant to do in the first place. Should, for example, API and narrative docs use the same tools?

MyST is growing in popularity, but often as a Sphinx extension. The way each project builds its documentation also has its idiosyncrasies, and there is some lack of cohesiveness in the ecosystem.

Other languages (Rust, Haskell) do have central documentation hubs, and unified/automatic ways of building docs.

Sphinx has a builder that can output unstyled JSON, which can then be used to generate HTML. New builders could also be added. Sphinx/docutils has historically been difficult to contribute to. (I'm personally working on https://github.com/jupyter/papyri to solve some of these issues, so I'm biased.)
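To make the JSON-to-HTML idea a bit more concrete, here is a minimal sketch of what a custom renderer on top of the JSON builder output could look like. It assumes the docs were built with `sphinx-build -b json`, and that each page file is a JSON object with `title` and `body` keys holding HTML fragments. The `render_page` function and the template are hypothetical names made up for this illustration, not part of Sphinx.

```python
import json
import re

# Hypothetical page template; a real site would bring its own styling.
PAGE_TEMPLATE = """<!DOCTYPE html>
<html>
<head><meta charset="utf-8"><title>{title}</title></head>
<body>
<main>{body}</main>
</body>
</html>
"""


def render_page(page_json: str) -> str:
    """Wrap one page of JSON builder output in a custom HTML shell.

    Assumption: the page is a JSON object whose "title" and "body"
    values are HTML fragments.
    """
    page = json.loads(page_json)
    # The title may itself contain markup, so strip tags for <title>.
    title = re.sub(r"<[^>]+>", "", page.get("title", "Untitled"))
    body = page.get("body", "")
    return PAGE_TEMPLATE.format(title=title, body=body)


# Stand-in for the content of one page file produced by the builder.
sample = json.dumps({"title": "Intro", "body": "<p>Hello</p>"})
html = render_page(sample)
```

The point is only that the content and its presentation are separated: the same JSON could just as well feed a different template or a dynamic front end.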

More questions were raised around documentation processes and tooling to "verify" docs. How can we ensure docs are written? The Rust project has tools for "docs coverage"; could we have this for Python? There was some mention of PEP 782, but many attendees are not familiar with it yet, as it is quite recent and still being written.

The session finished with a wishlist of things that Sphinx or current tooling cannot do (easily):

  • in IDEs, have documentation in a viewer that is as good as the rendered docs

  • have multiple input types as first-class citizens (Sphinx is still very rst-centric). For example, translation of notebooks breaks down easily

  • have documentation for the whole ecosystem under the same UI

  • break the 1-1 connection between rst <-> html

  • not be limited by static HTML as an output, i.e. can I take all the existing docs and render them in, for example, a Single Page Application with dynamic rendering options

  • reduce the friction of writing documentation. Docusaurus was mentioned as an example of another tool with a much faster rendering cycle, but it doesn't have "intersphinx"

  • numpydoc could be stricter to enforce more consistency across projects

We ended up creating a Discord channel for documentation: https://discord.com/invite/vur45CbwMz
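As a rough illustration of what "docs coverage" could mean for Python, here is a small sketch using only the standard library. It counts how many public functions and classes in a piece of source code have a docstring. The `docstring_coverage` function and the example source are made up for this post, and the definition of "public" (name not starting with an underscore) is an assumption; Sphinx also ships a `sphinx.ext.coverage` extension that approaches the problem from the documentation side.

```python
import ast


def docstring_coverage(source: str) -> tuple[int, int]:
    """Return (documented, total) for public functions and classes.

    Assumption: "public" means the name does not start with "_".
    """
    tree = ast.parse(source)
    documented = 0
    total = 0
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            if node.name.startswith("_"):
                continue  # skip private helpers and dunder methods
            total += 1
            if ast.get_docstring(node):
                documented += 1
    return documented, total


# Hypothetical module source used only for the demonstration.
EXAMPLE = '''
def documented():
    """Has a docstring."""

def undocumented():
    pass

class Thing:
    """A documented class."""
    def method(self):
        pass

def _private():
    pass
'''

documented, total = docstring_coverage(EXAMPLE)
print(f"{documented}/{total} public objects documented")  # prints "2/4 public objects documented"
```

A check like this could run in CI and fail below a chosen threshold, though it says nothing about whether the docstrings are any good.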

Academic recognition and Licenses

A discussion thread throughout the summit was, as usual, contribution recognition, in particular in the academic world. Academic careers rarely reward software contributions, and even when they do, they favour new software over contributions to existing projects.

Publications also often forgo software citations, and as usual there were many discussions around this topic.

One thing that caught my attention was a discussion about software licensing, and the question of what effect it would have to use a licence that includes a clause requiring a piece of software to be cited in academic papers when it is used for the research. Funding sources often have similar clauses requiring that the institution giving money for the research be disclosed. Is the legal avenue one of the ways to incentivise institutions in making sure

--

This is licensed under CC0, AKA you can do what you like with it, though links to the original are appreciated.
