Randy H RandyHarr

RandyHarr / fixFTDNAvcf.sh
Last active February 26, 2022 20:43
Shell script wrapping the fixFTDNAvcf.py script to fix an FTDNA BigY VCF and then annotate it with yBrowse SNP names and haplogroups
#!/bin/bash
#
# Fixes an FTDNA VCF file so it can be processed by standard tools that follow the VCF standard
# Annotates the FTDNA BigY VCF file with the latest yBrowse DB entries for SNP names, yFull and ISOGG haplogroups (HG)
#
# This is all handled behind the scenes (automagically) by WGS Extract (in the next release)
# This is simply a stand-alone, simple-scenario script installation, provided here for demonstration purposes
#
# Relies on htslib bgzip and bcftools, along with wget, python, rm, zip and unzip.
# Relies on access to yBrowse DB file and WGS Extract python utility fixFTDNAvcf.py
RandyHarr / fixFTDNAbam.sh
Last active February 26, 2022 21:10
For fixing FTDNA version 1 BAM files that incorrectly include a space in the QNAME field
#!/bin/bash
#
# Fixes FTDNA BAM version 1 files so they can be processed by standard bioinformatics tools.
# Applies only to BigY files (not needed for BigY2 or BigY3)
#
# This is handled behind the scenes (automagically) by WGS Extract (in the next release)
# This is simply a stand-alone, simple-scenario script installation, provided here for demonstration purposes
#
# Relies on htslib bgzip and samtools, along with wget, python, rm, zip and unzip.
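
The body of fixFTDNAbam.sh is not shown in this preview, so the following is only a minimal Python sketch of the kind of repair the description implies: replacing spaces in the QNAME (first tab-separated field) of SAM-formatted alignment lines. The function names `fix_qname` and `fix_sam_stream` and the underscore replacement character are assumptions made for illustration, not taken from the gist.

```python
from typing import Iterable, Iterator


def fix_qname(sam_line: str, replacement: str = "_") -> str:
    """Return a SAM line with spaces in its QNAME replaced.

    Hypothetical helper: the replacement character is an assumption.
    """
    # Header lines start with '@' and carry no QNAME, so pass them through.
    if sam_line.startswith("@"):
        return sam_line
    # QNAME is the first tab-separated field of an alignment line.
    qname, sep, rest = sam_line.partition("\t")
    return qname.replace(" ", replacement) + sep + rest


def fix_sam_stream(lines: Iterable[str]) -> Iterator[str]:
    """Apply fix_qname lazily to every line of a SAM text stream."""
    for line in lines:
        yield fix_qname(line)
```

A filter built on these helpers could plausibly sit between `samtools view -h` (BAM to SAM text) and `samtools view -b` (SAM text back to BAM); whether the actual script works this way is not stated here.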
RandyHarr / countingNs.py
Last active March 22, 2024 00:25
Python stand-alone program to analyze a FASTA Human Reference Model for runs of N (masked out) entries. See WGSE.bio for latest version.
#!/usr/bin/env python3
# coding: utf8
#
# Counting Reference Model Final Assembly N's (BED, region, etc output files)
#
# Part of the WGS Extract (https://wgse.bio/) system (standalone)
#
# Copyright (C) 2022-2024 Randy Harr
#
# License: GNU General Public License v3 or later
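
The rest of countingNs.py is not shown in this preview. As a rough illustration of what "runs of N" means in practice, here is a minimal sketch that scans FASTA text and reports each run as a BED-style interval (0-based start, exclusive end). The function name `n_runs` and the decision to count lowercase `n` alongside `N` are assumptions for illustration, not the author's implementation.

```python
import re
from typing import Iterable, Iterator, Optional, Tuple


def n_runs(fasta_lines: Iterable[str]) -> Iterator[Tuple[str, int, int]]:
    """Yield (sequence_name, start, end) for each run of N in FASTA text.

    Hypothetical helper: coordinates are 0-based, half-open (BED-style).
    """
    name = ""
    pos = 0                           # bases seen so far in the current sequence
    run_start: Optional[int] = None   # start of an N run still open, if any
    for raw in fasta_lines:
        line = raw.strip()
        if line.startswith(">"):
            # A new sequence closes any run still open in the previous one.
            if run_start is not None:
                yield name, run_start, pos
            parts = line[1:].split()
            name = parts[0] if parts else ""
            pos, run_start = 0, None
            continue
        # Split the line into alternating N and non-N stretches.
        for match in re.finditer(r"[Nn]+|[^Nn]+", line):
            if match.group()[0] in "Nn":
                if run_start is None:
                    run_start = pos + match.start()
            elif run_start is not None:
                yield name, run_start, pos + match.start()
                run_start = None
        pos += len(line)
    # Close a run that extends to the end of the last sequence.
    if run_start is not None:
        yield name, run_start, pos
```

Because an open run is carried across line breaks, runs that span several FASTA lines come out as a single interval; each tuple can be written as a BED line by joining its fields with tabs.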