Fabrice Jossinet fjossinet

✏️
Drawing an RNA
@fjossinet
fjossinet / gist:9909294
Last active August 29, 2015 13:57
PyRNA Cookbook
{
"metadata": {
"name": "PyRNA Cookbook"
},
"nbformat": 3,
"nbformat_minor": 0,
"worksheets": [
{
"cells": [
{
@fjossinet
fjossinet / gist:9909238
Last active August 29, 2015 13:57
Create and manipulate tertiary structures with PyRNA
This file has been truncated, but you can view the full file.
{
"metadata": {
"name": "Create and manipulate tertiary structures"
},
"nbformat": 3,
"nbformat_minor": 0,
"worksheets": [
{
"cells": [
{
@fjossinet
fjossinet / gist:9035788
Last active August 29, 2015 13:56
Create and manipulate secondary structures with PyRNA
{
"metadata": {
"name": "Create and manipulate secondary structures."
},
"nbformat": 3,
"nbformat_minor": 0,
"worksheets": [
{
"cells": [
{
@fjossinet
fjossinet / gist:9033572
Last active August 29, 2015 13:56
Create and manipulate molecules with PyRNA
{
"metadata": {
"name": "Create and manipulate molecules."
},
"nbformat": 3,
"nbformat_minor": 0,
"worksheets": [
{
"cells": [
{
@fjossinet
fjossinet / taxid_2_gbids.py
Last active December 17, 2015 21:19
This Python script retrieves the GenBank IDs of all nucleotide entries linked to a taxon ID. It minimizes the number of requests by paging through the results with the retmax and retstart parameters of the Entrez Utilities.
#!/usr/bin/env python
import xml.etree.ElementTree as ET
import sys, urllib, urllib2
eutils_base_url = "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/"
def get_ids(taxid):
    accession_numbers = []
    retstart = 0
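
The preview above stops before the paging loop. As a rough illustration of how retstart and retmax can keep the number of requests low, here is a minimal sketch in Python 3 (the original script targets Python 2). The search term `txid<taxid>[Organism:exp]`, the page size, the `https` base URL, and the helper names `esearch_url`, `parse_page`, and the `fetch` parameter are assumptions made for this example, not taken from the gist. Note that esearch returns Entrez UIDs.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: the current https endpoint of the Entrez Utilities.
EUTILS_BASE_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/"


def esearch_url(taxid, retstart, retmax):
    """Build an esearch URL for one page of nucleotide entries of a taxon."""
    params = {
        "db": "nucleotide",
        # Assumption: this term selects the taxon and its descendants.
        "term": "txid%s[Organism:exp]" % taxid,
        "retstart": retstart,
        "retmax": retmax,
    }
    return EUTILS_BASE_URL + "esearch.fcgi?" + urlencode(params)


def parse_page(xml_text):
    """Return (total hit count, ids on this page) from an eSearchResult."""
    root = ET.fromstring(xml_text)
    total = int(root.findtext("Count"))
    ids = [node.text for node in root.findall("IdList/Id")]
    return total, ids


def get_ids(taxid, retmax=500, fetch=None):
    """Collect all ids, one request per page of at most retmax entries."""
    if fetch is None:
        fetch = lambda url: urlopen(url).read()
    ids = []
    retstart = 0
    while True:
        total, page = parse_page(fetch(esearch_url(taxid, retstart, retmax)))
        ids.extend(page)
        retstart += retmax
        # Stop once every hit is covered, or if the server sends an empty page.
        if retstart >= total or not page:
            break
    return ids
```

The `fetch` parameter exists only so the paging logic can be exercised without network access; calling `get_ids(taxid)` without it sends real requests to NCBI.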
@fjossinet
fjossinet / gist:2942223
Created June 16, 2012 18:42
Select CDS by keyword in all E. coli genomes
#!/bin/bash
query=$1
genome_ids=$(wget -qO - "http://www.ncbi.nlm.nih.gov/genome/genomes/167?&subset=complete&limit=refseq" | grep 'title="chromosome">Chr' | sed -E 's/.+(NC_.+|NZ_.+)/\1/' | cut -d \< -f 1)
for genome_id in $genome_ids
do
    wget -qO - "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=nucleotide&id=$genome_id&rettype=gb&retmode=xml" > genome.xml
    gene_ids=$(xmllint --xpath "//GBFeature[GBFeature_key[.='CDS'] and GBFeature_quals/GBQualifier[GBQualifier_name[.='product'] and GBQualifier_value[contains(.,\"$query\")]]]" genome.xml | grep "GI:" | sed -E 's/.+GI:(.+)<.+/\1/')
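
The XPath filter in the script above keeps CDS features whose product qualifier contains the query and then pulls out their GI numbers. The same selection can be sketched in Python with the standard library; the function name `select_cds` and the shape of its return value are choices made for this example, and the sketch assumes the GenBank XML layout that the XPath expression itself relies on.

```python
import xml.etree.ElementTree as ET


def select_cds(genbank_xml, query):
    """Return (product, [GI numbers]) for each CDS whose product contains query."""
    root = ET.fromstring(genbank_xml)
    hits = []
    for feature in root.iter("GBFeature"):
        if feature.findtext("GBFeature_key") != "CDS":
            continue
        # Group qualifier values by name; some qualifiers carry no value.
        quals = {}
        for qualifier in feature.iter("GBQualifier"):
            name = qualifier.findtext("GBQualifier_name")
            value = qualifier.findtext("GBQualifier_value") or ""
            quals.setdefault(name, []).append(value)
        if any(query in product for product in quals.get("product", [])):
            # db_xref values look like "GI:12345"; keep only the number.
            gis = [x[3:] for x in quals.get("db_xref", []) if x.startswith("GI:")]
            hits.append((quals["product"][0], gis))
    return hits
```

Like `contains()` in the XPath version, the match is a case-sensitive substring test.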
@fjossinet
fjossinet / gist:2941281
Created June 16, 2012 13:00
Download all data from chromosome I of Arabidopsis thaliana through NCBI FTP
wget -r ftp://anonymous:anonymous@ftp.ncbi.nih.gov/genomes/Arabidopsis_thaliana/CHR_I/
@fjossinet
fjossinet / gist:2941262
Created June 16, 2012 12:45
Extract all accession ids from the RFAM webpage
wget -qO - "http://rfam.sanger.ac.uk/family/browse" | grep ">RF" | tr -d ' ' | cut -d \> -f 2 | cut -d \< -f 1
@fjossinet
fjossinet / gist:2941217
Created June 16, 2012 12:28
Extract all RefSeq ids from NCBI genomes list for E. coli
wget -qO - "http://www.ncbi.nlm.nih.gov/genome/genomes/167?&subset=complete&limit=refseq" | grep 'title="chromosome">Chr' | sed -E 's/.+(NC_.+|NZ_.+)/\1/' | cut -d \< -f 1
@fjossinet
fjossinet / ids.txt
Created June 15, 2012 14:37
How to download protein or nucleotide sequences from a list of gene ids?
cat gene_ids.txt | xargs -I % wget -qO %.fasta "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=nucleotide&id=%&rettype=fasta"
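
For readers who prefer a script to the one-liner, the same idea (one efetch request per id, saved as `<id>.fasta`) might look like this in Python 3. The function names and the `https` endpoint are assumptions made for this sketch; the query parameters mirror the URL used in the command above.

```python
from urllib.parse import urlencode
from urllib.request import urlretrieve

# Assumption: the current https endpoint of the Entrez efetch utility.
EFETCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def read_ids(lines):
    """Strip whitespace and drop blank lines from an iterable of lines."""
    return [line.strip() for line in lines if line.strip()]


def efetch_fasta_url(gene_id, db="nucleotide"):
    """Build the efetch URL that returns one entry in FASTA format."""
    query = urlencode({"db": db, "id": gene_id, "rettype": "fasta"})
    return EFETCH_URL + "?" + query


def download_all(ids_path):
    """Download each id listed in ids_path to a file named <id>.fasta."""
    with open(ids_path) as handle:
        for gene_id in read_ids(handle):
            urlretrieve(efetch_fasta_url(gene_id), gene_id + ".fasta")
```

Calling `download_all("gene_ids.txt")` would then fetch every listed entry; passing `db="protein"` to `efetch_fasta_url` is one way to target protein sequences instead.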