
@DrYak
Last active May 28, 2024 16:48
Google Season of Docs - Better V-pipe tutorial for virus variant surveillance in wastewater

Better V-pipe tutorial for virus variant surveillance in wastewater

Technical Writer Hiring

The V-pipe dev team is proud to have been awarded a Google Season of Docs grant, and is happy to announce the partnership with "van Geest Consultancy" for our tutorial-writing project.

About V-pipe

The Computational Biology Group of ETH Zürich (CBG-ETHZ), member of the Swiss Institute of Bioinformatics (SIB), has developed the SIB Software Resource V-pipe: an Apache-licensed computational pipeline for the analysis of virus next-generation sequencing (NGS) data, specializing in samples of mixed viral populations.

This bioinformatics workflow has important applications in several different settings:

  • In clinical virology, information about the detection and quantification of viral quasispecies in a single patient sample can assist clinicians in the optimization of the treatment for this patient.
  • In clinical epidemiology, sequencing test samples during a viral outbreak provides information about the viral variants found in the population, which helps to better understand the epidemic dynamics and to guide the decision-making process of public health authorities in a timely fashion.
  • In environmental epidemiology, the sequencing of environmental samples enables the assessment of the circulation of viral variants independently of any large-scale public testing effort, but it requires specialized methods for deconvolution as a single sample provides information covering a large population of hosts living in that environment.

Role during the pandemic

In addition to its earlier uses in clinical research settings such as HIV, our pipeline saw increased use during the COVID-19 pandemic, which started spreading into Europe in 2020. V-pipe's automatic workflow played an important role in the genomic surveillance of SARS-CoV-2 in Switzerland: for example, the 75’000 sequences published on ENA and GISAID by the Swiss SARS-CoV-2 sequencing consortium (S3C) were analyzed using our workflow.

Application to wastewater

In particular, the pipeline has been successfully adapted to the analysis of environmental wastewater samples, building on its focus on analyzing mixtures of viral variants in a single sample. Such environmental surveillance of the SARS-CoV-2 pandemic has become an increasingly important source of information on the spread of variants, since clinical testing has declined and is currently close to disappearing. Our bioinformatics workflow V-pipe, including its components specialized in the analysis of wastewater, is at the core of the ongoing wastewater-based SARS-CoV-2 variant monitoring commissioned by the Swiss Federal Office of Public Health, a cornerstone of the current pandemic surveillance in Switzerland. These surveillance efforts enable early warning of the introduction of new variants, provide estimates of their spread, and evaluate epidemiological characteristics, earlier than traditional clinical surveillance and at a fraction of the cost (10.1038/s41564-022-01185-x, 10.1101/2022.11.02.22281825, 10.4414/SMW.2022.w30202).

V-pipe uses outside of Switzerland

Abroad, in late 2021, a software component developed as part of the V-pipe bioinformatic workflow, the GPL-licensed COJAC, was used by the UK Health Security Agency to monitor the spread of the Omicron variant across 450 wastewater sampling sites in the UK (Omicron, VOC-21NOV-01 (B.1.1.529) Technical briefing 30), a critical step in understanding the dynamics of the SARS-CoV-2 pandemic. Since autumn 2023, V-pipe has also been applied in a surveillance program in Northern Italy at Arpa Piemonte, to help reduce the tedious work of searching for the SARS-CoV-2 variant BA.2.86 "Pirola" in wastewater.

The Problem

Given how critical wastewater-based epidemiology (WBE) is becoming in the face of declining clinical testing and sequencing, we would like to make it easier for other groups to replicate our bioinformatics data analysis for viral variant surveillance. We are considering several strategies to enhance discoverability and ease of onboarding, including improved documentation. We are currently modernizing the website (old version, draft of new design) and would like to take the opportunity to upgrade the documentation through the Google Season of Docs project.

The current state of the documentation:

The SARS-CoV-2 tutorial needs to be rewritten, incorporating the draft HOWTO on wastewater analysis. In addition, information is spread across multiple places, which confuses potential users and hampers discoverability.

As a consequence of the current state of the documentation, onboarding new users has so far required booking a video-conference call to brief them and walk them through the steps, which does not scale in the long term.

The Scope

Main scope

The main scope of this project will consist of the following deliverables:

  • Updated tutorial for SARS-CoV-2 covering the analysis of wastewater samples sequencing data. This is the minimum viable deliverable for the project.
  • Expanded tutorial about the installation of V-pipe, with an added "Reusing an existing conda installation" section.
  • Organise these tutorials in the docs/ folder, and write an additional introduction, to make them available on the Read The Docs platform.
  • Review the main website and the README files to ensure that information is properly linked, easily discoverable and reachable.
    • Reduce duplication, while still fulfilling the mandatory requirements (e.g. the presence of the file config/README.md is required by the Snakemake Workflow Catalog).

Additional goals

Additional goals fitting within the scope if time permits:

  • Integrating the content of config/config.html into the Read The Docs documentation
  • Passing an end-to-end CI test using the updated SARS-CoV-2 tutorial

Out-of-scope for this project

  • Recording an updated video presentation introducing this application of V-pipe will be dealt with at a later point in time, outside this Google Season of Docs project.

Measuring the results

Internal measures

  • Test the tutorial by sending it to users with no prior experience analyzing this type of data:
    • ask members of the SIB to follow the tutorial and answer a short survey
    • new users at external partner centers replicating our methods (the current person working in Davos is defending his PhD soon and will then need to train a replacement)
  • Test discoverability of information by giving an exercise (again asking members of the SIB) in the form of: "Use V-pipe to analyse this dataset..."

External measures

  • the next 5 new centers interested in replicating our method should be able to do so without booking a video-conference call with the developer of V-pipe
  • students at the next course using V-pipe should be able to complete tasks without opening issues on V-pipe

Timeline

The project itself will take approximately four months to complete. The period after that will be used to track the performance of the updated documentation (see External measures above). Once the tech writer is hired, we'll spend two weeks on tech writer orientation and familiarization with the technologies specific to our needs and pipeline (conda, jupytext, etc.). We will then spend two more weeks on the inventory of existing documentation and tutorials, and on defining the content required to set up proper Read The Docs documentation. The last three months will go to creating the defined content and adding it to the documentation, through an iterative process in collaboration with the pipeline developers.

Note: Timeline updated to match availability of technical writer.

| Dates | Action Items |
|---|---|
| July | Orientation and familiarization |
| | Inventory of existing tutorials and documentation |
| | Review proposed content to produce |
| | Inventory of existing training material and proposed content to produce |
| August | Create all identified content |
| | Set up Read the Docs |
| | Populate branch of repository |
| September | Review of contributions made to the documentation |
| | Final changes requested by project members |
| | Merge the updated docs/ and README into master |
| | Linking documentation into website |
| | Start to advertise the updated documentation |
| November | Review the Measures |
| | Project report |

Project Budget

| Budget item | Amount | Running Total | Notes/justifications |
|---|---|---|---|
| Technical writer | $ 5,000 | $ 5,000 | |
| Project swag | $ 200 | $ 5,200 | |
| TOTAL | $ 5,200 | | |

Changelog

2024-05-28 - Announcement of Technical Writer, Adjusting the Timeline based on availability of van Geest Consultancy.

@Dhriti03

Hello @DrYak !!!

As a B.Tech student from VJTI, Mumbai, set to complete my degree in May, I will be fully dedicated to this project. With a robust background in both Software and Hardware domains, I have authored comprehensive documentation for numerous projects, accumulating over 4 years of experience. I am eager to lend my expertise to the V-pipe initiative, pivotal in the early detection of new variants, estimating their transmission dynamics, and assessing their characteristics prior to conventional clinical evaluations. Recognizing the significance and urgency of this endeavor, I am enthusiastic about contributing to its success. You can find the link to my documentation folder here
Documentations link

@nitishmalang

@DrYak Is there any link to apply? I am highly interested in working on this project.

@bilal-aamer

Greetings @DrYak,

First off, congratulations on being selected for GSoD 2024!

I am a Dual Degree (BTech + MTech) final-year student from JNTUH. I have extensive experience in technical writing. My latest project was the documentation of a Python package, for which I received a contract to perform technical writing. I was actively involved in the early development of the project, and the founder, an ETH Zurich PhD candidate, shared similar concerns about setting up tutorials, similar to this project statement.

I am particularly motivated to contribute to V-pipe due to my previous experiences. As a 16-year-old, I developed a full-stack application (while learning full stack + applied ML) to support my community in screening themselves as either positive or negative for alpha-COVID-19. You can find the project here.

In addition to developing the software, I maintained close contact with medical institutions for their data sources and collaborated with medical professionals such as pathologists and virologists to build responsibly.

I am excited about the opportunity to further contribute to this field by leveraging my skillset to "Enhance the V-pipe tutorial for virus variant surveillance in wastewater".

Here are some of my other closed PRs

Here are my links for your reference: LinkedIn | Twitter | bilalahmedaamer@gmail.com

Best Regards,
Bilal Aamer

@dubewarsagar

Hi @DrYak,
I'm really interested in contributing to V-pipe this summer as a technical documentation writer. I am an open source contributor to various projects.

Kindly inform about the further details.

@Shubhra-Narang

Dear @DrYak ,

I am really interested in working as a technical writer for the mentioned project. Although I have not held a position with the title of Technical Writer, my interest in computer science and technologies has equipped me with a unique set of skills that are directly applicable to technical writing. My enthusiasm to learn and evolve with new technologies day by day has honed my ability to distill complex information into understandable content, a key competency for any technical writer. I offer a unique blend of technical expertise and communication skills.

In my previous roles, I have contributed to open-source projects demonstrating my ability to translate complex concepts into clear and concise content. As a part of Google Developers Student Club, I have also written tech blogs, showcasing my commitment to delivering high-quality technical content.

I am excited about the opportunity to leverage my skills to streamline technical documentation.
For reference : Linkedin | Github | 41shubhranarang@gmail.com

Thank you for considering my application. I look forward to the possibility of contributing to your team.

Best regards,
Shubhra Narang

@mygitl2022

Hello @DrYak 😃

Any update on the application process?

@Richiio

Richiio commented Apr 28, 2024

Hello @DrYak

This is my proposal for GSOD:
https://docs.google.com/document/d/1XHFD7OAWImVbip6L4olTrDOnlWJFRN38xS7XwHqzDLw/edit?usp=sharing

I am waiting for your feedback on the proposal, @DrYak

Kind Regards,
Sarima Chiorlu

@rpsmaini

rpsmaini commented May 2, 2024

Hi there my name is Ravpreet Singh Maini. I am eager to work with such an opportunity. By contributing to this project I will expand my expertise and will acquire a foundational skill in the industry.
Moreover, I have the advantage of contributing to this project full-time.
For reference here is my LinkedIn profile:
https://www.linkedin.com/in/ravpreet-singh-maini-0346b21b6
For further communication here is the email and contact number:
8871712525
rpsmaini@yahoo.com
@DrYak

@g-anush

g-anush commented May 4, 2024

Hi @DrYak,

I am very interested in Project - "Better V-pipe tutorial for virus variant surveillance in wastewaters" and would like to take the next steps. I have thoroughly reviewed the project documents and am confident that I have the relevant skills and experience to accomplish the project. I would love to discuss the possibilities further.

For further communication here is the email and contact number:
+91-8319566004
anushgupta2001@gmail.com

Thanks

@Rafiea-Ashraf

Dear HR,

I'm excited about the opportunity to contribute to your project. With two years of experience in writing reports for the IEEE Society and a year working on the IEEE student branch newsletter, I believe I could make a valuable addition to your team. Here is my LinkedIn to get to know me more: https://www.linkedin.com/in/rafiea-ashraf-16445b221/
I'm also attaching an email and am looking forward to the possibility of working together: rafiea.ashraf@gmail.com
Hope to hear from you soon.
Best regards,
Rafiea Ashraf

@kapelnick

Hello,
I am interested in volunteering a few hours to this project, to gain experience with documenting open-source projects and GSoD in general!

Many thanks,
Nikos | kapelnick.mud@gmail.com

@TheRaj71

Hello, DrYak at GSOD

My name is Raj, and I would like to contribute to the V-pipe bioinformatics workflow documentation enhancement project. My email address is theraj714@gmail.com. I think that as a seasoned technical writer, I can improve this crucial tool's discoverability and onboarding process.

The V-pipe pipeline has been critical for SARS-CoV-2 genomic surveillance, especially in wastewater analysis. Improving the documentation is crucial as wastewater-based epidemiology becomes more important. I'm excited to update the SARS-CoV-2 tutorial, expand the installation guide, and reorganize the docs for better discoverability.

I'm sure I can complete the project's success metrics, which include having new users test the tutorials and making sure the documentation makes it possible for others to duplicate your techniques without further assistance. I'm excited to help with this important project and enhance viral variant surveillance through wastewater analysis.

Please consider my application. I look forward to discussing my qualifications and approach in more detail.

Best regards,
Raj

@elabongaatuo

Hello @DrYak,

Congratulations on being selected as one of the GSOD '24 participants.

As an electrical engineer with a strong interest in technical writing, I'm excited to apply for the V-pipe documentation project in Google Season of Docs. V-pipe's use of genomics, indexing, and mapping for wastewater monitoring aligns perfectly with my skill set.

I'm passionate about V-pipe's potential to revolutionize pandemic preparedness through wastewater analysis. My technical background and dedication to clear communication make me a valuable asset to your team.

Please find my Statement of Interest and a snippet of code from when I happened to tinker with V-pipe.

Thank you for your time and consideration. I look forward to working with you.

Regards,

Yvonne Elabonga.
