@ https://gist.github.com/o314/214e26c6fb70512b56597d633dd87e6f
see JuliaLang/julia#13942
OK, unzip(a) = zip(a...) fails in Julia.
But the Zen of Python is great. OK, that's somewhat a lie: it's also great for Julia!
So let's try to deliver simple, correct things first, and (maybe) complicated ones later, or never.
FWIW, here is my good-enough poor man's unzip (usable in prod, isn't it?), considering that:
- writing 5 lines > waiting 5 years
- not optimized for nonexistent use cases of unzip (streaming, TensorFlow, or whatever)
- a third-party pkg for 5 lines is a joke or an industrial hazard (software BOM, anybody?)
- con: 5 SLOC of source, 50 lines of tests. So what?
using Test
import Base.Iterators as _I
unzip(s...) = unzip(collect(s))
unzip(vs::Vector{<:Vector}) =
    let M=length(vs), N=mapfoldl(length, min, vs); # todo remove me when SVector is in Base
        ([vs[i][j] for i in 1:M] for j in 1:N)
    end
unzip(a::Vector{<:Pair}) = [k for (k,_) in a], [v for (_,v) in a]
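To see what the three methods return, here is a small usage sketch. The definitions are repeated so the snippet runs on its own; the input values and the names cols, short, ks and vals are made up for illustration. Note that the vector method hands back a lazy generator (so it needs a collect), while the pair method returns a tuple of two vectors right away.

```julia
# Definitions repeated from the snippet above so this runs standalone.
unzip(s...) = unzip(collect(s))
unzip(vs::Vector{<:Vector}) =
    let M = length(vs), N = mapfoldl(length, min, vs)
        ([vs[i][j] for i in 1:M] for j in 1:N)
    end
unzip(a::Vector{<:Pair}) = [k for (k, _) in a], [v for (_, v) in a]

# Vectors passed as separate arguments: the result is a lazy generator,
# so collect it to materialize the transposed vectors.
cols = collect(unzip([1, 2, 3], [10, 20, 30], [100, 200, 300]))

# Ragged input is truncated to the length of the shortest vector.
short = collect(unzip([1, 2, 3], [10, 20]))

# A vector of pairs is split eagerly into (keys, values).
ks, vals = unzip(['a' => 1, 'b' => 2])
```

The truncation on ragged input comes from N being the minimum length, which mirrors how zip stops at the shortest iterator.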
import Base.Iterators as _I
using Test
# zipdata(M,N) = let v=collect(1:M), vt=ntuple(N) do _; copy(v) end; vt end
data(M,N) = ntuple(M) do i; fill(i,N) end
data(N) = let ks=_I.take(_I.cycle('a':'z'), N), vs=(1:N...,); (k=>v for (k,v) in zip(ks,vs)) end
# unzip of vector
@test data(5,3) == ([1,1,1],[2,2,2],[3,3,3],[4,4,4],[5,5,5])
@test unzip(data(5,3)...) |> collect == ([1,2,3,4,5],[1,2,3,4,5],[1,2,3,4,5]) |> collect
# unzip of pair vector
@test data(5) |> collect == ('a'=>1, 'b'=>2, 'c'=>3, 'd'=>4, 'e'=>5) |> collect
@test unzip(data(5) |> collect) |> collect == (['a','b','c','d','e'], [1,2,3,4,5]) |> collect
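The pair test collects the generator before calling unzip because only a Vector{<:Pair} method exists; a generator passed directly appears to fall through to the varargs method and is not split into keys and values. If that matters, one possible workaround is a separate helper that accepts any iterable of pairs. The name unzip_pairs is hypothetical and not part of the original snippet; this is a sketch, not a tuned implementation.

```julia
import Base.Iterators as _I

# Hypothetical helper (not in the original snippet): materialize any iterable
# of pairs once, then split it into keys and values by broadcasting.
function unzip_pairs(itr)
    ps = collect(itr)
    return first.(ps), last.(ps)
end

# Same shape of input as data(5) above: a lazy generator of Char => Int pairs.
gen = (k => v for (k, v) in zip(_I.take(_I.cycle('a':'z'), 5), 1:5))
ks, vals = unzip_pairs(gen)
```

Keeping it under a separate name avoids changing how the existing unzip methods dispatch.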
# unzip of vector
julia> @time unzip(data(1000,3)...);
0.029086 seconds (42.07 k allocations: 2.766 MiB, 99.28% compilation time)
julia> @time unzip(data(1_000_000,3)...);
1.507531 seconds (4.04 M allocations: 223.797 MiB, 18.21% gc time, 72.32% compilation time)
julia> @time unzip(data(1_000_000,3)...);
0.294386 seconds (1.00 M allocations: 152.588 MiB)
julia> @time unzip(data(1000,50)...);
0.000727 seconds (1.01 k allocations: 531.922 KiB)
julia> @time unzip(data(1_000_000,50)...);
1.082615 seconds (1.00 M allocations: 518.799 MiB, 48.09% gc time)
julia> @time unzip(data(1_000_000,50)...);
0.527460 seconds (1.00 M allocations: 518.799 MiB)
# unzip of pair vector
julia> @time unzip(data(1000));
2.728774 seconds (166.12 k allocations: 10.524 MiB, 99.98% compilation time)
julia> @time unzip(data(1000));
0.000334 seconds (2.00 k allocations: 116.375 KiB)
julia> @time unzip(data(1_000_000) |> collect); # BUG without collect
0.634841 seconds (3.50 M allocations: 178.888 MiB, 18.39% gc time, 57.41% compilation time)
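Several of the first-call timings above are dominated by compilation. A minimal sketch of separating the two, assuming nothing beyond Base: time the same call twice with @elapsed and compare. The names input, t_first and t_second are made up here. Since the vector method returns a lazy generator, the sketch also collects the result so the transposition itself gets timed.

```julia
# Same vector-of-vectors method as above, repeated so the snippet is standalone.
unzip(vs::Vector{<:Vector}) =
    let M = length(vs), N = mapfoldl(length, min, vs)
        ([vs[i][j] for i in 1:M] for j in 1:N)
    end

# Build the input once, outside the timed region.
input = [fill(i, 3) for i in 1:1000]

t_first = @elapsed collect(unzip(input))   # first call: includes JIT compilation
t_second = @elapsed collect(unzip(input))  # second call: already compiled
println("first call: ", t_first, " s, second call: ", t_second, " s")
```

For anything more serious than a quick look, a dedicated benchmarking package would give more stable numbers, but the warm-up trick is often enough to avoid being misled by compilation time.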
Mandatory follow-up reading: https://medium.com/@mbostock/what-makes-software-good-943557f8a488