To-Do List
Major Problems:
- PROB-1: Performance of main generate task
- The main generate task is barely keeping up with the incoming reports. A cursory examination reveals nothing that can be easily improved without parallelization.
- Solution: Refactor into a Minion task that can be parallelized using Beam::Minion
- Critical chain: BACKEND-1 API-3 BACKEND-2
- PROB-2: Amazon SimpleDB Metabase is expensive (in monetary cost)
- Solution: Make new API to write to local Metabase cache
- Make backend generate task read from local Metabase cache instead of remote Amazon SimpleDB
- Critical chain: SCHEMA-1 API-1 API-2
- PROB-3: view-report.cgi has very poor performance
- Solution: Make new app using Mojolicious for this one page
- Critical chain: WEB-1 SCHEMA-3 WEB-2
- In progress by @glasswalk3r
- PROB-4: Existing website is difficult to navigate and maintain
- Solution: A new website using Mojolicious
- New website must maintain all existing APIs
- It will not maintain any existing HTML. Anyone scraping the pages has had years to stop.
- PROB-5: Backend is disorganized and shares very little code and configuration
- Solution: Build small, reusable ETL components using Beam::Runner
- Copy the code from the existing backend into new modules
- Beam::Wire allows for sharing of configuration files
- Clean up and share code by using CPAN::Testers::Schema
To-Do List:
- SCHEMA-1: Add read/write metabase cache methods to schema
- This should abstract the current metabase cache so that tasks like API-1 and WEB-1 become easier
- Refs PROB-1 PROB-2
- cpan-testers/cpantesters-schema#4
- SCHEMA-2: Allow metabase table to have no report id
- This allows us to store data in this table before it's processed by the backend
- This requires us to start storing schema upgrades/changes using DBIx::Class::DeploymentHandler
- API-1: Make new Metabase API
- This API will replace the existing Metabase API but write to the local MySQL metabase cache instead
- This is almost completed and needs robust testing with existing reporters
- Requires SCHEMA-2
- cpan-testers/cpantesters-api#3
- BACKEND-1: Copy "generate" task into new module that uses Beam::Runner
- This is the start of the PROB-1 solution
- Only the initial "generate" task needs to be moved
- cpan-testers/cpantesters-backend#1
- API-2: Design a new incoming test report document schema
- Since we're not using the Metabase anymore, we can make a JSON document that is simpler and more flexible
- This will need coordination with the Reporters. Talk to Garu during the toolchain summit
- This needs to be expressed as an OpenAPI schema
- cpan-testers/cpantesters-api#5
- API-3: Make new Metabase API use Schema metabase APIs
- This will allow us to make a better raw report format
- Requires SCHEMA-1
- This can be done as part of SCHEMA-1, since the code for SCHEMA-1 will be copied from the new Metabase API
- cpan-testers/cpantesters-api#4
- API-4: Add incoming Metabase reports websocket feed
- This allows us to trigger generate jobs
- This should use the new report format that we develop, so we don't have to make a second version of the feed once that format is in use
- Refs PROB-1
- Requires API-2
- cpan-testers/cpantesters-api#6
- BACKEND-2: Queue generate jobs for each incoming report using Beam::Minion
- This will allow us to use more hardware for processing jobs and improve performance
- Requires API-2
- cpan-testers/cpantesters-backend#5
- BACKEND-3: Make "generate" task use new metabase schema API
- This should simplify the generate task a little and allow us to change the metabase report storage
- Requires SCHEMA-1
- cpan-testers/cpantesters-backend#3
- SCHEMA-3: Make easier-to-use stored raw report
- By storing the plain text report in a more useful way, we can improve the performance of viewing the plain text report
- Refs PROB-3
- Requires API-3 BACKEND-3
- cpan-testers/cpantesters-schema#5
- WEB-1: Make view-report.cgi replacement
- By replacing the existing CGI script we can improve performance
- Refs PROB-3
- cpan-testers/cpantesters-web#1
- WEB-2: Make view-report.cgi replacement pre-store the almost-ready-to-serve form of the report
- By storing a better form of the report, we can cut down on processing time and improve performance
- Requires SCHEMA-3
- cpan-testers/cpantesters-web#3
- BACKEND-4: Make CPAN/BackPAN mirrors deployed as part of the backend
- The CPAN/BackPAN mirrors are required data sources for the backend, but not for any other part of the site.
- cpan-testers/cpantesters-backend#4
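
The "Requires" lines above imply a work order. As a rough sketch (not part of the project's tooling), the order can be derived with a topological sort. The example below uses Python's standard `graphlib` module; the `REQUIRES` table is transcribed by hand from the to-do items, and the names `REQUIRES` and `work_order` are made up for illustration. Tasks that neither have a "Requires" line nor are required by another task (BACKEND-1, WEB-1, BACKEND-4) are left out and can start at any time.

```python
from graphlib import TopologicalSorter

# "Requires" relations transcribed from the to-do items above:
# task -> set of tasks that must be finished first.
REQUIRES = {
    "API-1": {"SCHEMA-2"},
    "API-3": {"SCHEMA-1"},
    "API-4": {"API-2"},
    "BACKEND-2": {"API-2"},
    "BACKEND-3": {"SCHEMA-1"},
    "SCHEMA-3": {"API-3", "BACKEND-3"},
    "WEB-2": {"SCHEMA-3"},
}


def work_order(requires):
    """Group tasks into waves; each task's requirements are in earlier waves."""
    sorter = TopologicalSorter(requires)
    sorter.prepare()  # raises graphlib.CycleError if requirements are circular
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # tasks whose requirements are done
        waves.append(ready)
        sorter.done(*ready)
    return waves


if __name__ == "__main__":
    for number, wave in enumerate(work_order(REQUIRES), start=1):
        print(f"Wave {number}: {', '.join(wave)}")
```

Tasks in the same wave do not depend on each other, so they can be worked on in parallel. The sketch only reflects the "Requires" lines; the "Critical chain" entries under Major Problems are not encoded.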