George Carvalho geocarvalho

## CHANGELOG.md

      
Created June 9, 2023 22:23, forked from juampynr/CHANGELOG.md

Sample CHANGELOG

Change Log

All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog
and this project adheres to Semantic Versioning.
[Unreleased] - yyyy-mm-dd

Here we write upgrading notes for brands. It's a team effort to make them as

  
## awesome-long-read.md

      
Last active June 14, 2022 11:30

Long read references


2022 Accelerating minimap2 for long-read sequencing applications on modern CPUs
2021 Long-read sequencing of 3,622 Icelanders provides insight into the role of structural variants in human diseases and other traits
2021 Towards population-scale long-read sequencing
2020 Opportunities and challenges in long-read sequencing data analysis
2020 Long-read human genome sequencing and its applications
2019 Long-Read Sequencing Emerging in Medical Genetics
2019 A new era of long-read sequencing for cancer genomics
2019 Using long-read sequencing to detect imprinted DNA


## awesome-pharmacogenomics.md

      
Last active June 15, 2022 19:58

Awesome pharmacogenomics papers


2022 Automated Pharmacogenomic Reports for Clinical Genome Sequencing | github
2022 Pharmacogenomics decision support in the U-PGx project: Results and advice from clinical implementation across seven European countries
2021 Applying Next-Generation Sequencing Platforms for Pharmacogenomic Testing in Clinical Practice
2021 StellarPGx: A Nextflow Pipeline for Calling Star Alleles in Cytochrome P450 Genes
2021 Toward predicting CYP2D6-mediated variable drug response from CYP2D6 gene sequencing data ✅ 🔐
2021 Cyrius: accurate CYP2D6 genotyping using whole-genome sequen


## awesome_bioinformatics.md

      
Last active March 24, 2023 23:27

Learn clinical bioinformatics: List of my recommendations to study genomics bioinformatics and clinical bioinformatics

Learn Clinical Bioinformatics 📚


My personal list of recommended resources for studying genomics bioinformatics and clinical bioinformatics.


Courses

1. Applied Computational Genomics - Aaron Quinlan


This is the best open-source course I have found so far.


## gatk_variantcalling_tutorials.md

      
Created May 14, 2021 19:18

Tutorials


Variant Calling Pipeline using GATK4
GATK4 Best Practice Nextflow Pipeline
Variant calling in human whole genome/exome sequencing data
Germline short variant discovery (SNPs + Indels)
W8: Variant Calling with GATK - Day 1
W8: Variant Calling with GATK - Day 2

Datasets


## bed_to_list.py
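import pandas as pd

# Six-column BED file; the fourth column is expected to look like "GENE_suffix".
file = "input.bed"
df = pd.read_csv(file, sep="\t", names=["chr", "start", "end", "interval", "score", "strand"])
# Split on the first underscore only; n must be passed by keyword in pandas 2.x.
df[["gene", "extra"]] = df["interval"].str.split("_", n=1, expand=True)
df = df.drop(columns=["interval", "score", "strand", "extra"])
# Collapse to one row per gene: smallest start, largest end.
new_df = df.groupby("gene").agg({"chr": "unique", "start": "min", "end": "max"})
new_df = new_df.reset_index()
# Keep the first chromosome seen for each gene.
new_df["chr"] = new_df["chr"].apply(lambda chroms: chroms[0])
new_df["start"] = new_df["start"].astype("str")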
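
The per-gene collapse can be tried on a few in-memory rows without touching a real file. The following is a plain-Python sketch of the same grouping idea, not the pandas code itself; the sample rows and the `collapse_by_gene` helper are hypothetical, and it assumes the gene name is the text before the first underscore in the fourth column.

```python
import csv
import io

# Hypothetical sample rows in six-column BED layout (tab-separated).
SAMPLE_BED = (
    "chr1\t100\t200\tBRCA2_exon1\t0\t+\n"
    "chr1\t300\t450\tBRCA2_exon2\t0\t+\n"
    "chr2\t50\t80\tTP53_exon1\t0\t-\n"
)


def collapse_by_gene(handle):
    """Map each gene to (chrom, smallest start, largest end)."""
    genes = {}
    for chrom, start, end, name, _score, _strand in csv.reader(handle, delimiter="\t"):
        gene = name.split("_", 1)[0]  # assumed: gene is the text before the first "_"
        start, end = int(start), int(end)
        if gene in genes:
            first_chrom, lo, hi = genes[gene]
            # Keep the first chromosome seen, widen the span as needed.
            genes[gene] = (first_chrom, min(lo, start), max(hi, end))
        else:
            genes[gene] = (chrom, start, end)
    return genes


regions = collapse_by_gene(io.StringIO(SAMPLE_BED))
for gene, (chrom, start, end) in regions.items():
    print(gene, chrom, start, end)
```

Each gene ends up with a single span from its lowest start to its highest end, which mirrors the `min`/`max` aggregation in the script.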

## WDL_tutorials.md

      
Last active October 13, 2022 13:31

WDL tutorials


learn-wdl


Introducing the Learn WDL Course


Step-by-step tutorials for writing pipelines in WDL.
WDL - style_guide
(How to) Execute Workflows from the gatk-workflows Git Organization
DAVE TANG - Learning WDL
Running a Bioinformatics Software Pipeline with Wdl/Cromwell
17. Hello World WDL Tutorial - Geraldine Van der Auwera


## liftover_bed.py

#!/usr/bin/python3
from pyliftover import LiftOver
import pandas as pd
import argparse
import mapply
import sys
import os

mapply.init(

## code_cnv_bed.md

      
              1 file
            
          
              0 forks
            
          
              0 comments
            
          
              0 stars
            
          
                geocarvalho
                / code_cnv_bed.md
            
            
              Created
              April 15, 2019 16:01
            
              
                Commands to create bed inputs for exomedepth and devicnv using normal bed
              
          
Exomedepth:
$ awk -F"__" '$1=$1' OFS="\t" PAHC44_1_CDHS-17427Z-2274_sorted.bed | cut -f1,2,3,4 > qiaseq_PAHC44_1_CDHS-17427Z-2274_no_header.bed
DeviCNV:
$ echo -e "Amplicon_ID\tChr\tAmplicon_Start\tAmplicon_End\tInsert_Start\tInsert_End\tGene\tTranscript\tExon\tPool" > qiaseq_PAHC44_1_CDHS-17427Z-2274_no_header.devicnv.bed
$ awk -F"__" '$1=$1' OFS="\t" PAHC44_1_CDHS-17427Z-2274_sorted.bed | awk '{ print $4"."$7"\t"$1"\t"$2"\t"$3"\t"$2"\t"$3"\t"$4"\t"$4"\t"$7"\tPool1"}' >> qiaseq_PAHC44_1_CDHS-17427Z-2274_no_header.devicnv.bed
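
The DeviCNV column mapping in the awk one-liner can also be written out in Python, which makes the field order easier to check. This is a rough sketch under assumptions: the `to_devicnv_row` helper and the sample line are hypothetical, the real name-field layout in the BED file may differ, and the pool value is assumed to be the literal label `Pool1`.

```python
# Column names taken from the echo command above.
HEADER = [
    "Amplicon_ID", "Chr", "Amplicon_Start", "Amplicon_End",
    "Insert_Start", "Insert_End", "Gene", "Transcript", "Exon", "Pool",
]


def to_devicnv_row(bed_line, pool="Pool1"):
    """Convert one BED line into a DeviCNV row (list of ten strings)."""
    # Like awk -F"__" with OFS="\t": turn every "__" into a tab,
    # then split on whitespace as the second awk call does.
    fields = bed_line.replace("__", "\t").split()
    chrom, start, end, gene, exon = fields[0], fields[1], fields[2], fields[3], fields[6]
    # Insert coordinates reuse the amplicon coordinates; transcript reuses the gene.
    return [f"{gene}.{exon}", chrom, start, end, start, end, gene, gene, exon, pool]


# Hypothetical input line; only the positions of fields 4 and 7 matter here.
line = "chr1\t100\t200\tGENE1__NM_000001__x__5"
row = to_devicnv_row(line)
print("\t".join(HEADER))
print("\t".join(row))
```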

  
## learn_git_branching.md

      
              1 file
            
          
              0 forks
            
          
              0 comments
            
          
              0 stars
            
          
                geocarvalho
                / learn_git_branching.md
            
            
              Created
              July 1, 2018 22:03
            
              
                Learn_git_branching
              
          
Learn Git Branching

Introduction to Git commits


A commit in a git repository records a snapshot of all the files in your directory. It's like a giant copy and paste, but even better!
Git aims to keep commits as lightweight as possible, so it doesn't blindly copy the entire directory every time you commit. It can (when possible) compress a commit as a set of changes (or a "delta") from one version of the repository to the next.
Git also maintains a history of when each commit was made. That is why most commits have ancestor commits above them, which we indicate with arrows in our visualization. Maintaining history is great for everyone working on the project!
There is a lot to learn, but for now think of commits as snapshots of your project. Commits are very lightweight, and switching between them is extremely fast!
Let's see what this means in practice. Below, we have a vis
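
The snapshot, delta, and ancestor ideas can be modeled in a few lines of Python. This is a toy sketch for intuition only, not how Git stores objects internally; the `Commit` class and the `history` and `delta` helpers are hypothetical names.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Commit:
    message: str
    snapshot: Dict[str, str]            # file path -> file contents
    parent: Optional["Commit"] = None   # the commit this one was built on


def history(commit):
    """Walk parent links from a commit back to the first commit."""
    messages = []
    while commit is not None:
        messages.append(commit.message)
        commit = commit.parent
    return messages


def delta(commit):
    """Files added or changed relative to the parent snapshot."""
    base = commit.parent.snapshot if commit.parent else {}
    return {path: text for path, text in commit.snapshot.items() if base.get(path) != text}


c0 = Commit("initial", {"README.md": "hello"})
c1 = Commit("add script", {"README.md": "hello", "run.py": "print(1)"}, parent=c0)
print(history(c1))
print(delta(c1))
```

Each commit holds a full snapshot, yet only `run.py` differs between `c0` and `c1`, which is the kind of change set the text calls a delta.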