Ben J Woodcroft wwood

@wwood
wwood / public_metagenome_size.sql
Created February 20, 2023 01:04
Estimate the total amount of public metagenome sequence using Google BigQuery
-- Loop counter
DECLARE counter int64 default 1;
-- Create intermediate table with organisms that are metagenomes
CREATE OR REPLACE TABLE test.emp
AS
WITH cte AS
(
SELECT 1 AS xlevel, tax_id, parent_id, sci_name
samples = [1,2,3,4,5]
rule all:
input:
"test.out"
rule a:
output:
options(repr.plot.width=15)
analyse <- function(current_samples, p0) {
# p0 = 0.5
print("Current samples:")
print(current_samples)
proportion = sum(current_samples<(p0*n))
print(paste("Proportion less than",p0*n,'is',proportion,"out of",length(current_samples)))
binom_test_result = binom.test(proportion, length(current_samples), p=p0)
wwood / sequence1.fna
Last active April 30, 2020 09:09
fastani issue seqs
>sequence1
CATCCTTTAACAATAGCCGTAGCCTTCACTTTTTTCTGGCTGCTTGGCCTGTTGTCGGTATTTGGTTTGCTGCTCTGGGTGTGTCTACAATGGCTTTTAATCTTAACGGTTTCAATTTTAACCAGTCCCTACTGGATAGTCAGGGGCGTGTTGTGCGTACTTGGGCTGACATCCTTAACCAAGCTAACCTTGGATTTGAAGTCATGCACGAACGTAATGCGCACAACTTTCCGCTAGACCTCGCTGCAGCAGACGTAACTCCTGTTGCTCTTACTGCACCTGCTGTAGGCTAATCTCCGTCCGTTCATCCCTCACAAGGGACGCATGAAGTTTGATCATGGAACGGGGGTCAAACACTTGGAGATTATCATGGCTTATCAAGTCACCTACAAGTATCGCGGCGTTTCTTACACTAAAACGGTAGTCCGTTAAAGCGGCATTGGGAGGTGCAAACCCTCCCTTACCTATTGGCGTTGGCCCTTACGAGGACACCCTTCGCCGTCTAGACGGTGGGATAGACCACAATAAAAACCCGAAAAAAATTTTCCAAAGCTTTGGAGAGAACCTATTTGTAATTTCTCTCTTTTTTAAAAATGGCACAACAAGCTGGCTCTGGTCTGCTTCAGTCTGCACTGACGCGTCCAGGCTCTCTTAATGGTGCGGCAGATGCCCGCGCCCTGTATCTCAAGCTTTTTAGCGGTGAGATGTTCAAAGGCTTCCAGCACAATGCGATCGCTCGTGACCTTGTGATGAAGCGCACCCTCAAGGGTGGCAAGAGTCTCCAGTTCATCTACACTGGACACACTCAAGCTGAGTTCCATACCCCCGGTCGGGCTATTCTTGGTAATGACCAGGGTGCTCCCCCGGTGGCTGAGAAGACCATCACCTGTGATGACCTTCTGATCAGCTCGGCTTTCCTGTACGAACTCGATGAAGTTCTGTCTCATTACGATCTGCGTAGTGAGATCTCCCGTAAGATCGGCTATGCTCTGGCTCAGAAGTATGACCGTCTG
wwood / plasmid_and_random.fna
Created April 27, 2020 05:28
plasmid_and_random.fna for plasmidVerify
>random_sequence_length_50000_1
CCGTCACGATCATAAGCCCAGGTATTTTCTCATAGGATTAAGCCTTAGCTCCGTACAAGCCAAGCGCTAACGAACGTAGCTCGGGAATTAGCCGTACATAATGACTGCCGGCATGACATAAAGTGGCAGATTTGCAAAATTTGCTCGTACCTCCGGATTTGCTCCTGATCTCCGTTACAATGTGGTCTCTATAGACTGGTAAATCCCATTTAGGCCGAATTGTAGCGGACCAATGCCCCTAGATGAACTCAATGACCACCCACGTTGCCGACATAAATAGGCATCCTGGGTCTCTGGAGGAGCATAGCCGGGATTAGCATGTATTGGAACAGATACGACGATTGGAATAAGTCGGCAGGACGTCACGATGCCCACCTTTCGTGGTATTCTTCGATATTAGCTGACTCCTTGGGTTCTGAGGACATCGAATTTTTATGTGTTTAGACTAACCCACTAATTATTGACGAAGGACTCGATTAACTGGCATTTTGCGCACCGTCTCTTATTTGACGGCAAGTAGCAGTCTAGCACACGACCTAGTGAGCAAAAATACTTCCTTGTGAGACGGGCGGTCTAGCGGCGTGGCTAAGGTAAAAGAAACGGCGCAAGCGCTACCCTAGCGCGGCGTAGCCGATCACACACCGGTCCCGCAGGAGCCAGCGCATTGCGGGTTGGTACAACCAAAGAGGAATCGGTGTCGCGGAACGATAACTTTGAATTAGGTCAGGTTCTTTAAAGACTGTTCTCACCTGAAATTCATTTAGACTGTTCCCGGGCGCCATTGAGATCGTAATCCGGCCCACGGTGATACAAGTCCTTTACGATACATATGCGGGTGCAAGCCACCCAACTGCCCATCGCCCTTTCATCTGCCCCAGAAGCGGGGACCCGAAATTGAAGGACACTAATAAATTGACGTAATGTACGGTAATAAAGAATCTGGTAGGGTGCCCCACTCCATGTTGAGAGGATTGCCTCATCGCCCTAAACGT
wwood / ecoli100.1.head1000.fq
Last active October 21, 2019 22:42
ecoli100.1.head1000.fq.gz
@NS500333:300:HNGK5AFXY:1:11101:14311:1054 1:N:0:GGAGCTAC+NTTAGACG
CGGGTNCACTGCATCTTCACAGCGAGTTCAATTTCACTGAGTCTCGGGTGGAGACAGCCTGGCCATCATTACGCCA
+
AAAAA#EEEEEEEEEEEEA6EEEEEEEE<EEEEEE/EEEEEEEEEEEEEEEEEEE/EEEE/EEEEEEEAEEE<AE<
@NS500333:300:HNGK5AFXY:1:11101:22036:1054 1:N:0:GGAGCTAC+NTTAGACG
GTTCCNGGCCGGACCGCTGGCAACAAAGGATAAGGGTTGCGCTCGTTGCGGGACTTAACCCAACATTTCACAACA
+
AAAAA#EEEEEEEEEAEEEEEEEEEEEEEEEAEEEEEEEE/EEEEAEEEEAEEEEE/AEEEAEEEEEEEEEE<E/
@NS500333:300:HNGK5AFXY:1:11101:13098:1055 1:N:0:GGAGCTAC+NTTAGACG
TTAGCNAGTAGGTGGGGTAACGGCTCACCTAGGCGACGATCCCTAGCTGGTCTGAGAGGATGACCAGCCACACTGG
wwood / _dbg.txt
Created January 14, 2018 05:26
_dbg.txt
~/git/mrustc/output$ tail *_dbg.txt
==> liballoc.hir_dbg.txt <==
(0.00 s) MIR Validate PO: DONE
MIR Validate Full: V V V
(0.00 s) MIR Validate Full: DONE
Trans Enumerate: V V V
(0.00 s) Trans Enumerate: DONE
Trans Codegen: V V V
Running comamnd - "gcc" "-ffunction-sections" "-pthread" "-O2" "-g" "-o" "output/liballoc.hir.o" "output/liballoc.hir.o.c" "-c"
(0.01 s) Trans Codegen: DONE
wwood / ngs-bits build
Last active September 13, 2017 07:05
phase `unpack' succeeded after 0.5 seconds
starting phase `patch-usr-bin-file'
phase `patch-usr-bin-file' succeeded after 0.0 seconds
starting phase `patch-source-shebangs'
phase `patch-source-shebangs' succeeded after 0.2 seconds
starting phase `patch-generated-file-shebangs'
phase `patch-generated-file-shebangs' succeeded after 0.0 seconds
starting phase `build'
cd bamtools && rm -rf bin build lib include src/toolkit/bamtools_version.h
mkdir bamtools/build
starting phase `check'
t/00_requires_external.t
1..8
ok 1 - blastp in PATH
ok 2 - makeblastdb in PATH
ok 3 - mcl in PATH
ok 4 - mcxdeblast in PATH
ok 5 - bedtools in PATH
ok 6 - prank in PATH
ok 7 - parallel in PATH
$ ./pre-inst-env guix build newick-utils
;;; note: source file /home/ben/git/guix/gnu/packages/bioinformatics.scm
;;; newer than compiled /home/ben/git/guix/gnu/packages/bioinformatics.go
substitute: /gnu/store/dlr4klmngff66l845n8kwj61f11a6mh2-bash-4.3.33/bin/bash: warning: setlocale: LC_ALL: cannot change locale (en_US.UTF-8)
substitute: warning: failed to install locale: Invalid argument
substitute: updating list of substitutes from 'http://hydra.gnu.org'... 100.0%
The following derivation will be built:
/gnu/store/nyp6g625brz4mkl249s4f1c8v4baq1g1-newick-utils-1.6.1.acb33ebdf.drv
The following file will be downloaded:
/gnu/store/fl27mjm8kxp0rj989cd8mj67qjvl0jr3-lua-5.1.5