Created January 24, 2023 12:37
my first ever Julia script - reads fasta from stdin and prints mean length of sequences
# julia shebang here
# mean fasta length
# input without any ">" header lines gives nseq == 0, so the result is NaN or Inf !!
# use at your own risk

# typed globals for the counters updated in the loop below
# (type annotations on globals require julia v1.8+)
nseq::Int = 0
len::Int = 0

# read STDIN
for line in eachline(stdin)
    #print("Found $line")
    if startswith(line, ">")    # header line: count one more sequence
        global nseq += 1
    else                        # sequence line: add its length
        global len += length(line)
    end
end

println(len/nseq) # Int / Int is always a Float64 in Julia

## apply function to a file
# function read_and_capitalize(f::IOStream)
#     return uppercase(read(f, String))
# end
## call
# open(read_and_capitalize, "hello.txt")

## OPEN FILE
# open("lom300tf.fasta", "r") do fasta
#     for line in eachline(fasta)
#         #println(line)
#         # the following block can be in a function
#         if startswith(line, ">")
#             global nseq += 1    # `global` is needed inside the do-block
#         else
#             global len += length(line)
#         end
#     end
#     println(len/nseq) # Int / Int is always a Float64 in Julia
# end

## DOCS
## https://docs.julialang.org/en/v1/manual/networking-and-streams/
## https://docs.julialang.org/en/v1/manual/strings/#Common-Operations
## https://syl1.gitbook.io/julia-language-a-concise-tutorial/language-core/input-output
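
The commented-out notes above mention that the counting block "can be in a function". A minimal sketch of that idea follows, assuming a hypothetical helper named `mean_fasta_length` (not part of the original script). It takes any `IO`, so it works for both `stdin` and an opened file, needs no `global` declarations, and returns `nothing` instead of dividing by zero when no header line is found.

```julia
# Hypothetical function-based variant; `mean_fasta_length` is an assumed name,
# not something defined in the original script.
function mean_fasta_length(io::IO)
    nseq = 0    # number of ">" header lines seen
    len = 0     # total number of characters on sequence lines
    for line in eachline(io)
        if startswith(line, ">")
            nseq += 1
        else
            len += length(line)
        end
    end
    # No headers found: input was empty or not FASTA, so there is no mean.
    nseq == 0 && return nothing
    return len / nseq
end

# Possible use at the end of the script:
# println(mean_fasta_length(stdin))
```

Keeping the counters local to a function also removes the need for typed globals, so this form should not depend on Julia v1.8+.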
Try parallelism
With the @threads macro and atomic assignments!
https://docs.julialang.org/en/v1/manual/multi-threading/#Atomic-Operations
julia> using Base.Threads

julia> nthreads()
4

julia> acc = Ref(0)
Base.RefValue{Int64}(0)

julia> @threads for i in 1:1000
           acc[] += 1
       end

julia> acc[]
926

julia> acc = Atomic{Int64}(0)
Atomic{Int64}(0)

julia> @threads for i in 1:1000
           atomic_add!(acc, 1)
       end

julia> acc[]
1000
Update
Doesn't work - stdin doesn't support the method. I would probably need to create an iterator that collects from stdin, but that sounds advanced for now.
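
One way to "collect from stdin" without writing a custom iterator might be to read all lines into a vector first and then loop over its indices with `@threads`. The sketch below assumes that approach; `threaded_mean_length` is a hypothetical name, and the whole input is held in memory, which may not suit very large files.

```julia
using Base.Threads

# Hypothetical helper, not part of the original gist.
function threaded_mean_length(io::IO)
    lines = readlines(io)       # materialize the stream into an indexable Vector
    nseq = Atomic{Int}(0)       # atomic counters avoid the race shown with Ref above
    len = Atomic{Int}(0)
    @threads for i in eachindex(lines)
        line = lines[i]
        if startswith(line, ">")
            atomic_add!(nseq, 1)
        else
            atomic_add!(len, length(line))
        end
    end
    return len[] / nseq[]       # NaN or Inf if no header lines were found
end

# Possible use at the end of the script:
# println(threaded_mean_length(stdin))
```

To actually use more than one thread, Julia has to be started with several, for example `julia -t 4 meanfaslen.jl < genes.fa`. For work this light per line, the atomic updates may well cost more than the threading saves, so this is more of an exercise than an optimization.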
CSV.Rows
https://csv.juliadata.org/stable/reading.html#CSV.Rows
Code example:
for row in CSV.Rows(file)
    println("a=$(row.a), b=$(row.b), c=$(row.c)")
end
Options:
• reusebuffer=true
• header=1
meanfaslen.jl
Use like this:
julia meanfaslen.jl < genes.fa