Mitsuteru Nakao nakao

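# Split a multi-FASTA file into single-record FASTA files (1.fasta, 2.fasta, ...)
# Usage: ruby a.rb multi.fasta
require 'rubygems'
gem 'bio'
require 'bio' # `gem 'bio'` only activates the gem; the library still has to be loaded
# $< is ARGF, so entries are read from the file(s) named on the command line.
Bio::FlatFile.auto($<).each_with_index do |entry, i|
  File.open("#{i+1}.fasta", "w") do |f|
    f.puts entry.to_s
  end
end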
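The same split can be sketched without BioRuby, which may be handy when the gem is not installed. This is a minimal stdlib-only sketch, not the gist's method: `split_fasta` and `write_records` are hypothetical helper names, and the code assumes every record starts with a `>` header line.

```ruby
# Minimal sketch (ASSUMPTION: each record begins with a ">" header line).
# split_fasta and write_records are hypothetical helpers, not part of BioRuby.
def split_fasta(text)
  # Split in front of every ">" found at the start of a line, keeping the header.
  text.split(/^(?=>)/).map(&:strip).reject(&:empty?)
end

# Write each record to 1.fasta, 2.fasta, ... inside dir and return the paths.
def write_records(records, dir = ".")
  records.each_with_index.map do |record, i|
    path = File.join(dir, "#{i + 1}.fasta")
    File.write(path, record + "\n")
    path
  end
end

# Made-up two-record input for illustration.
sample = ">seq1 first\nACGT\nACGT\n>seq2 second\nGGCC\n"
records = split_fasta(sample)
puts records.length
```

Unlike `Bio::FlatFile.auto`, this does no format detection, so it is only suitable for input already known to be FASTA.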
nakao / gist:5966847
Last active December 19, 2015 14:09
Fix invalid ChEMBL-RDF files. Put the script in the ChEMBL-RDF directory, then run it. The fixed files are written as "*.ttl.new".
files = ["chembl_16_biocmpt.ttl", "chembl_16_target.ttl", "chembl_16_targetcmpt.ttl"]
def iotax(arg)
  "<http://identifiers.org/taxonomy/#{arg}>"
end
def ncbitax(arg)
  "<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Info&id=#{arg}>"
end
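The preview stops before the helpers are used, so the actual fix is not visible here. The following sketch shows one way a line-by-line rewrite into "*.ttl.new" could look. The direction of the substitution (NCBI Taxonomy Browser URL replaced by the identifiers.org URI) is purely an assumption for illustration, and `NCBI_TAX`, `fix_line`, and `fix_file` are hypothetical names.

```ruby
# Helpers as shown in the gist preview.
def iotax(arg)
  "<http://identifiers.org/taxonomy/#{arg}>"
end

def ncbitax(arg)
  "<http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Info&id=#{arg}>"
end

# ASSUMPTION: rewriting NCBI browser URLs to identifiers.org URIs is a guess;
# the truncated gist does not show which substitution it performs.
NCBI_TAX = %r{<http://www\.ncbi\.nlm\.nih\.gov/Taxonomy/Browser/wwwtax\.cgi\?mode=Info&id=(\d+)>}

def fix_line(line)
  line.gsub(NCBI_TAX) { iotax(Regexp.last_match(1)) }
end

# Stream a file to "<name>.new", leaving the original untouched.
def fix_file(path)
  File.open("#{path}.new", "w") do |out|
    File.foreach(path) { |line| out.write(fix_line(line)) }
  end
end

# Made-up triple for illustration.
puts fix_line("ex:target1 ex:taxon #{ncbitax(9606)} .")
```

Writing to a separate ".new" file matches the behavior described above and keeps the original download intact for comparison.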
nakao / gist:5882712
Last active December 19, 2015 02:28
List human proteins that have disease annotations, returning the protein id, gene label, disease mnemonic, disease id, and OMIM id. Runs against the UniProt SPARQL endpoint http://beta.sparql.uniprot.org/
PREFIX rdf:<http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX skos:<http://www.w3.org/2004/02/skos/core#>
PREFIX owl:<http://www.w3.org/2002/07/owl#>
PREFIX bibo:<http://purl.org/ontology/bibo/>
PREFIX dc:<http://purl.org/dc/terms/>
PREFIX xsd:<http://www.w3.org/2001/XMLSchema#>
PREFIX up:<http://purl.uniprot.org/core/>
PREFIX rdfs:<http://www.w3.org/2000/01/rdf-schema#>
SELECT ?protein ?geneLabel ?dMnemonic ?disease ?omim
nakao / gist:5881623
Last active December 19, 2015 02:18
List human proteins that have natural variants also described in OMIM. The query returns the protein id, gene label, mutation (e.g. A301R), OMIM id, and citation id from the UniProt SPARQL endpoint http://beta.sparql.uniprot.org/sparql .
PREFIX rdf:<http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX skos:<http://www.w3.org/2004/02/skos/core#>
PREFIX owl:<http://www.w3.org/2002/07/owl#>
PREFIX bibo:<http://purl.org/ontology/bibo/>
PREFIX dc:<http://purl.org/dc/terms/>
PREFIX xsd:<http://www.w3.org/2001/XMLSchema#>
PREFIX up:<http://purl.uniprot.org/core/>
PREFIX rdfs:<http://www.w3.org/2000/01/rdf-schema#>
SELECT ?protein ?geneLabel ?hgsv ?omim ?citation
nakao / gist:1172813
Created August 26, 2011 06:13
Code that generates probeset.js
require 'rubygems'
require 'rdf'
require 'rest_client'
require 'json'
task "probeset.js" do
  endpoint = "http://open-biomed.org:8890/sparql"
  query = "SELECT * WHERE { GRAPH <http://open-biomed.org:8890/DAV/BH11Ujicha/HG-U133A.na31.annot> { ?subject ?predicate ?object } }"
nakao / sample-sparql.rb
Created August 25, 2011 08:01
Sample Ruby code for the BH11Ujicha SPARQL endpoint
require 'rubygems'
require 'rdf' # gem install rdf
require 'rest_client' # gem install rest-client
require 'json' # gem install json
endpoint = "http://open-biomed.org:8890/sparql"
query = "SELECT * WHERE { <http://bio2rdf.org/affymetrix:1007_s_at> ?predicate ?object }"
# XML, "application/sparql-results+xml" ; CSV, "text/csv"
response = RestClient.post endpoint, :query => query, :format => "application/sparql-results+json"
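Once the endpoint answers with "application/sparql-results+json", the response body can be flattened into plain hashes. The sketch below assumes the body follows the W3C SPARQL Query Results JSON Format. It parses a hard-coded sample string so that it runs offline; the binding values are made-up placeholders, not real endpoint output, and `sparql_rows` is a hypothetical helper name.

```ruby
require 'json'

# Illustrative body in the SPARQL Query Results JSON Format.
# ASSUMPTION: the values below are placeholders, not data from the endpoint.
body = <<~JSON
  {
    "head": { "vars": ["predicate", "object"] },
    "results": {
      "bindings": [
        { "predicate": { "type": "uri", "value": "http://example.org/label" },
          "object":    { "type": "literal", "value": "example probe set" } }
      ]
    }
  }
JSON

# Turn each solution into a { variable name => value } hash.
# Unbound variables come back as nil.
def sparql_rows(json_text)
  data = JSON.parse(json_text)
  vars = data["head"]["vars"]
  data["results"]["bindings"].map do |solution|
    vars.to_h { |v| [v, solution.dig(v, "value")] }
  end
end

rows = sparql_rows(body)
rows.each { |row| puts row.values_at("predicate", "object").join("\t") }
```

With the real request, the string passed to `sparql_rows` would be the body of `response` rather than the sample.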
nakao / usegalaxy-tweets.rb
Created May 27, 2011 11:08
Extracting tweets from 2011 Galaxy Community Conference #usegalaxy
# https://twitter.com/#!/pjacock/status/74029124242505728
# https://twitter.com/#!/32nm/status/74070121219497984
require 'rubygems'
require 'open-uri'
require 'nokogiri'
url = "http://togetter.com/li/140056"
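# With open-uri, Kernel#open no longer opens URLs on Ruby 3.0+; URI.open works on Ruby 2.5 and later.
doc = Nokogiri::HTML(URI.open(url).read)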
puts doc.xpath("//div[@class='status']/a").map {|x| x.attribute("href").value }
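The extracted links have the shape shown in the two comment lines at the top of the script. As a possible follow-up step, the user name and status id can be pulled out of each URL with a regular expression. This is a sketch only: `STATUS_URL` and `parse_status_url` are hypothetical names, and the pattern assumes the old hash-bang URL form or the plain form.

```ruby
# Matches both the old hash-bang form and the plain form of a status URL.
# ASSUMPTION: other URL shapes are simply treated as non-matches.
STATUS_URL = %r{\Ahttps?://twitter\.com/(?:#!/)?(?<user>\w+)/status(?:es)?/(?<id>\d+)}

# Returns { user:, id: } for a status URL, or nil when the URL does not match.
def parse_status_url(url)
  m = STATUS_URL.match(url)
  m && { user: m[:user], id: m[:id] }
end

# The two example URLs from the comments in the script.
urls = [
  "https://twitter.com/#!/pjacock/status/74029124242505728",
  "https://twitter.com/#!/32nm/status/74070121219497984"
]

urls.each do |url|
  info = parse_status_url(url)
  puts "#{info[:user]}\t#{info[:id]}"
end
```

The id is kept as a string, which avoids any surprises if it is later serialized to a format with limited integer precision.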
require 'bio'
class Bio::Sequence::NA
  def self.compute_melting_temperature(nastr)
    10 # code here ...
  end
  def melting_temperature
    if @melting_temperature
      @melting_temperature
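The preview ends inside `melting_temperature`, which appears to cache its result in an instance variable. The same compute-once pattern can be sketched with `||=`. The sketch uses a stand-in class named `Oligo` (hypothetical) so that it runs without BioRuby, and it swaps in the Wallace rule, Tm = 2(A+T) + 4(G+C), as a placeholder for the elided computation; that rule is an assumption and only a rough estimate for short oligonucleotides.

```ruby
# Stand-in class so the sketch runs without BioRuby; the gist reopens
# Bio::Sequence::NA instead.
class Oligo
  def initialize(seq)
    @seq = seq.downcase
  end

  # ASSUMPTION: Wallace rule, Tm = 2(A+T) + 4(G+C), used as a placeholder
  # for the computation the gist leaves as "code here".
  def self.compute_melting_temperature(nastr)
    2 * nastr.count("at") + 4 * nastr.count("gc")
  end

  # Compute on the first call, then return the cached value afterwards.
  def melting_temperature
    @melting_temperature ||= self.class.compute_melting_temperature(@seq)
  end
end

oligo = Oligo.new("ATGCATGC")
puts oligo.melting_temperature
```

`||=` recomputes when the cached value is nil or false, which is harmless here because the result is always a number.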
# ruby tkana.rb http://twitter.com/synobu/status/20976018979
require 'rubygems'
require 'nokogiri'
require 'open-uri'
require 'htmlentities'
url = ARGV[0]
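# With open-uri, Kernel#open no longer opens URLs on Ruby 3.0+; URI.open works on Ruby 2.5 and later.
doc = Nokogiri::XML.parse(URI.open(url).read)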
t = (doc/"span.entry-content").to_s.gsub(/<.+?>/,'')
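The script strips tags with `gsub`, and it requires htmlentities, presumably to decode character entities afterwards (the preview ends before that step). A stdlib-only sketch of both steps follows. It swaps in `CGI.unescapeHTML` for the htmlentities gem, which is an assumption and covers only the basic named entities and numeric references; `plain_text` is a hypothetical helper name and the sample markup is made up.

```ruby
require 'cgi'

# Stdlib stand-in for the htmlentities gem (ASSUMPTION: only basic entities
# such as &amp;, &lt;, &gt;, &quot; and numeric references need decoding).
def plain_text(html_fragment)
  # Drop tags with the same non-greedy pattern the script uses, then decode.
  CGI.unescapeHTML(html_fragment.gsub(/<.+?>/, '')).strip
end

# Made-up fragment for illustration.
sample = '<span class="entry-content">R &amp; Ruby &lt;3 <a href="#">link</a></span>'
puts plain_text(sample)
```

Decoding after the tags are removed matters: decoding first would turn `&lt;3` into `<3`, which the tag pattern could then mistake for the start of a tag.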