@tony-o
Last active August 3, 2020 13:06

Raku Ecosystem

  • Name: Tony O'Dell
  • Amount Requested: $12,000

Synopsis

Redesign the Raku/zef ecosystem to be robust and to make submitting distributions to it easier.

Benefits to the Raku Community

Currently there are two ways to maintain a distribution in the Raku ecosystem. The first is uploading to CPAN, which comes with its own set of limitations because CPAN was not designed to handle the way Raku uses distributions (the same distribution name can be used multiple times and pared down by the consumer using :auth and/or :ver). The second is a GitHub repo containing a file that distribution authors must update in order to keep the ecosystem fresh, which comes with its own set of challenges and a barrier to entry for the user.
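To make the :auth/:ver point concrete, here is a minimal sketch of how a consumer might pare down several distributions that share one name. The `Dist` record, the `candidates` helper, and the sample auth strings are all hypothetical illustrations, not part of zef or of this proposal.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Dist:
    """Hypothetical record for one uploaded distribution."""
    name: str
    auth: str
    ver: Tuple[int, ...]


def candidates(index: List[Dist], name: str,
               auth: Optional[str] = None,
               ver: Optional[Tuple[int, ...]] = None) -> List[Dist]:
    """Return every distribution matching the name, narrowed by auth/ver if given."""
    return [
        d for d in index
        if d.name == name
        and (auth is None or d.auth == auth)
        and (ver is None or d.ver == ver)
    ]


# Illustrative index: the same name published by two authors (made-up data).
INDEX = [
    Dist("ACME::Test", "github:alice", (0, 0, 1)),
    Dist("ACME::Test", "github:bob", (0, 0, 1)),
    Dist("ACME::Test", "github:bob", (0, 0, 2)),
]

if __name__ == "__main__":
    print(len(candidates(INDEX, "ACME::Test")))
    print(candidates(INDEX, "ACME::Test", auth="github:bob", ver=(0, 0, 2)))
```

An index keyed only on the distribution name would collapse these three entries into one, which is the limitation described above.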

This project would create an ecosystem that is friendly to the end user, provides secure access and storage to distribution consumers, and promotes the development of distributions by Raku authors.

Deliverables

  • Fault tolerant ecosystem
  • Provide hooks for author tooling

Project Details

The primary deliverable of this project would be a fault tolerant ecosystem that is consumable both via zef (so, a zef plugin) and via a website similar to metacpan for browsing and finding packages. It also includes guidelines for distribution authors and tooling for them to test the quality of their uploads.

The secondary deliverable would be to create an expandable API for further development (testing, quality checks, health checks, etc.). It is limited in scope to designing the accommodation for hooks, not implementing what the hooks will or could be used for.

The operating costs of this project will be paid for by donations from the community or out of my pocket. Donations exceeding operating costs will be used to further develop tooling and expand on the secondary deliverable.

zef.pm has been suggested as the domain for this project, but this is not set in stone and the proposal is open to using a different domain.

Project Schedule

The primary deliverable of this project is reckoned to take about two months to complete. This includes the administration and automation of server maintenance, storage space, and writing the necessary code to store/save/retrieve these distributions both from a zef plugin (or another CLI) and from the WWW.

The secondary deliverable is likely to take closer to a month to design and build.

Bio

I am Tony O'Dell. I have written a good number of Raku modules and have been writing Perl for about 15 years. I have sysadmin experience and am the initial committer and co-author of zef. Prior to writing software as a full-time job, my primary areas of expertise were data warehouse management and design, and statistics.

@patrickbkr

@tony-o Can you elaborate a bit on how the proposed ecosystem works from an architectural point of view and a user point of view?

I think it's good to know a bit about your approach when deciding on this grant.

@tony-o

tony-o commented Jul 30, 2020

Addendum

The ecosystem in the context of this grant means a repository from which packages can be downloaded/consumed by Raku package managers and processes. The structure of the repository will be similar to CPAN, maintaining the ability to be mirrored, with modifications to handle the added complexity of :ver, :auth, :api in Raku.
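As a rough illustration of what handling :ver, :auth, and :api in the repository structure could look like, the sketch below derives a storage path from the full identity rather than from the name alone. The path layout, the `dists/` prefix, and both function names are assumptions made for this example; the proposal does not specify a layout.

```python
import hashlib


def identity(name: str, ver: str, auth: str, api: str = "") -> str:
    """Build an identity string in the Name:ver<..>:auth<..>:api<..> style."""
    return f"{name}:ver<{ver}>:auth<{auth}>:api<{api}>"


def storage_key(name: str, ver: str, auth: str, api: str = "") -> str:
    """Map a full identity to a mirror-friendly path (hypothetical layout)."""
    digest = hashlib.sha1(identity(name, ver, auth, api).encode("utf-8")).hexdigest()
    # Shard on the first two hex characters so no single directory grows too large.
    return f"dists/{digest[:2]}/{digest}.tar.gz"


if __name__ == "__main__":
    # Same name and version from two (made-up) authors land at different paths.
    print(storage_key("ACME::Test", "0.0.1", "github:alice"))
    print(storage_key("ACME::Test", "0.0.1", "github:bob"))
```

Because the key depends only on the identity, a mirror can copy the tree as plain files and any index can recompute where a given distribution lives.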

The API in the context of this grant is meant to give authors a way to write to the repository and upload their dists (more analogous to PAUSE). The functions of the API will center on user creation and uploading distributions to the ecosystem.

This project is in no way meant to replace modules.raku.org or metacpan. It will make modules.raku.org's job much easier in displaying/searching for modules.

Milestones

  1. Design and architecture of dist storage on servers, permissions, indexing, and mirroring
  2. Build out for distribution with a sandbox environment and tests
  3. Write tooling around the API to manage users and to let users manage their dists in the repository. This milestone includes the zef plugin mentioned above
    1. Registering a user
    2. Managing user information, including user deletion (GDPR)
    3. Managing packages
    4. Uploading via zef plugin

Reasons for not mending what's there

  • CPAN

    1. .. isn't designed to handle multiple modules of the same name, something that very easily happens in Raku
    2. .. name-based indexing doesn't let you find a specific ACME-Test-0.0.0.tar when 35 different authors have each tried to upload ACME
    3. The amount of code involved is the same whether we "rewrite" or "mend"; because the specs and needs are different, divorcing the two makes sense.
  • Github

    1. .. the repo with the index in it is a list of github repos that may or may not exist
    2. .. this is clunky and is a poor experience for both the consumers and authors
    3. .. in order for this to be fully indexed you need something running to maintain a translation from repo name to dist name
    4. The level of effort to make this a pleasant experience for both consumers and authors is large. It involves a somewhat fragile architecture of indexing services; retrieving repositories for display on sites like modules.raku.org has a lot of overhead; and searching for modules in a package manager is tedious, through no fault of the package manager or of the modules.raku.org design.

Out of Scope

This is a list of things that are out of scope for this grant.

Search Engine

This grant is not meant to produce a search engine. It is not meant to replace metacpan or modules.raku.org. The ecosystem file structure will be indexed and searchable by any front end or package manager.

Package Management

Zef will persist and tooling will be built around enabling this gateway through the zef cli as part of milestone 3.

Expected Costs

Nothing below is designed or set in stone; it is merely meant to show roughly that the cost of the redesign can and will be kept minimal, both for the managing body and in operating costs moving forward.

Ecosystem

The expected costs for hosting the ecosystem itself are minimal. If S3 were to be used and we calculate costs from that:

A mirror of the current ecosystem in GitHub is roughly 65MB without supporting prior versions. The cost for S3 per GB is $0.023 per month. So, the cost of hosting on S3 is less than $1/mo. Using the S3 calculator, the ecosystem could be downloaded 157 times per month and still have operating costs of less than $1/mo.
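The arithmetic behind that estimate can be sketched as follows. The 65MB size, the $0.023 per GB storage rate, and the 157 downloads come from the text above; the $0.09 per GB data transfer rate is an assumption added for this example (the proposal only refers to the S3 calculator), and per-request charges are ignored.

```python
ECOSYSTEM_MB = 65                 # from the proposal
STORAGE_USD_PER_GB_MONTH = 0.023  # from the proposal
DOWNLOADS_PER_MONTH = 157         # from the proposal
TRANSFER_USD_PER_GB = 0.09        # ASSUMPTION: illustrative egress rate, not from the proposal


def s3_monthly_cost(size_mb: float, downloads: int,
                    storage_rate: float, transfer_rate: float) -> float:
    """Storage cost plus the cost of serving `downloads` full copies per month."""
    size_gb = size_mb / 1024
    storage = size_gb * storage_rate
    transfer = size_gb * downloads * transfer_rate
    return storage + transfer


if __name__ == "__main__":
    storage_only = s3_monthly_cost(ECOSYSTEM_MB, 0, STORAGE_USD_PER_GB_MONTH, TRANSFER_USD_PER_GB)
    total = s3_monthly_cost(ECOSYSTEM_MB, DOWNLOADS_PER_MONTH,
                            STORAGE_USD_PER_GB_MONTH, TRANSFER_USD_PER_GB)
    print(f"storage only: ${storage_only:.4f}/mo")
    print(f"with {DOWNLOADS_PER_MONTH} downloads: ${total:.2f}/mo")
```

Under these assumptions storage is a fraction of a cent, and almost all of the cost comes from data transfer.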

Another option is to use one user on CPAN and create a separate index file that indexes the SHA1-hashed modules.tar.gz. This is hacky, less secure, and difficult to consume for things like modules.raku.org.

API

AWS Lambda is an option worth exploring. With very conservative numbers and guesses: the compute price is roughly $0.0000002083 / GB-s with 128MB of memory allocated. If it takes 20s to upload and index a package, then the operating costs for running this function 100k times per month can be kept below $1/mo. (The calculation, using https://aws.amazon.com/lambda/pricing/, is .0000002083 * 20 * 100000 * (128/1024) = $0.05.)
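The same calculation, written out as a small script. Every input is the figure quoted in the paragraph above and is taken as given; in particular, the rate and its unit have not been checked here against the pricing page, and per-request charges are left out. The function name is made up for this example.

```python
RATE_USD = 0.0000002083        # compute price as quoted in the proposal
SECONDS_PER_UPLOAD = 20        # proposal's guess for upload + index time
INVOCATIONS_PER_MONTH = 100_000
MEMORY_MB = 128


def lambda_monthly_cost(rate: float, seconds: float,
                        invocations: int, memory_mb: int) -> float:
    """Reproduce the proposal's formula: rate * seconds * invocations * (memory in GB)."""
    return rate * seconds * invocations * (memory_mb / 1024)


if __name__ == "__main__":
    cost = lambda_monthly_cost(RATE_USD, SECONDS_PER_UPLOAD,
                               INVOCATIONS_PER_MONTH, MEMORY_MB)
    print(f"${cost:.2f}/mo")
```

Changing any one input (for example a slower upload or a different quoted rate) scales the result linearly, which makes it easy to re-check the estimate against current pricing.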

@patrickbkr

@tony-o: Is it possible to get the above on the grant proposal page? (https://news.perlfoundation.org/post/grant_proprosal_raku_ecosystem)
