cfarkas / unique_gene_id.py
Created December 27, 2022 14:47 — forked from moble/unique_gene_id.py
Ensure GTF-format files have `gene_id` fields unique to each `gene_name` or `transcript_id`
#!/usr/bin/env python
# Copyright (c) 2020, Michael Boyle
#
# MIT License
#
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
cfarkas / mstrg_prep.pl
Created December 27, 2022 12:46 — forked from gpertea/mstrg_prep.pl
post-processing of StringTie merge output to append ref_gene_id info to the MSTRG gene_id
#!/usr/bin/env perl
#Usage: mstrg_prep.pl merged.gtf > merged_prep.gtf
use strict;
my %g; # gene_id => \%ref_gene_ids (or gene_names)
my @prep; # array of [line, original_id]
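while (<>) {
  s/ +$//; # strip trailing spaces
  my @t=split(/\t/); # GTF columns are tab-separated
  unless (@t>8) { print $_; next } # pass through lines with fewer than 9 columns (e.g. comments)
  my ($gid)=($t[8]=~m/gene_id "(MSTRG\.\d+)"/); # MSTRG gene_id from the attributes column, if present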
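
The preview above is cut off inside the loop. As a rough illustration of only the part that is visible (splitting a GTF line on tabs and pulling the MSTRG gene_id out of the ninth column), here is a minimal Python sketch. The function name `extract_mstrg_gene_id` and the sample record are hypothetical and are not part of the gist. The sketch does not attempt the ref_gene_id rewriting that the script performs later.

```python
import re

# Same pattern as in the Perl snippet: capture a StringTie-assigned gene_id.
MSTRG_GENE_ID = re.compile(r'gene_id "(MSTRG\.\d+)"')


def extract_mstrg_gene_id(line):
    """Return the MSTRG gene_id of one GTF line, or None if there is none.

    Hypothetical helper for illustration only; it mirrors the visible
    steps of the Perl loop, not the full script.
    """
    # Drop trailing spaces and the newline, then split into GTF columns.
    fields = line.rstrip(" \n").split("\t")
    # Lines with fewer than 9 columns (e.g. comments) carry no attributes.
    if len(fields) < 9:
        return None
    match = MSTRG_GENE_ID.search(fields[8])
    return match.group(1) if match else None


# Made-up record in the shape of StringTie merge output.
record = (
    "chr1\tStringTie\ttranscript\t100\t200\t1000\t+\t.\t"
    'gene_id "MSTRG.12"; transcript_id "MSTRG.12.1";\n'
)
print(extract_mstrg_gene_id(record))
```

Records whose gene_id does not start with `MSTRG.` return `None` here, just as `$gid` stays undefined in the Perl version.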