@shashi
Created October 12, 2017 05:12
using Distributions
using PooledArrays
N=Int64(2e8); K=100;
pool = [@sprintf "id%03d" k for k in 1:K]
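# Build a length-N PooledArray of strings by drawing random UInt8 indices into `pool`
# (UInt8 references limit the pool to at most 255 entries).
function randstrarray(pool, N)
    PooledArray(PooledArrays.RefArray(rand(UInt8(1):UInt8(length(pool)), N)), pool)
end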
using JuliaDB
@time T = IndexedTable(Columns([1:N;]), Columns(
    id1 = randstrarray(pool, N),
    id2 = randstrarray(pool, N),
    id3 = randstrarray(pool, N),
    id4 = rand(1:K, N), # large groups (int)
    id5 = rand(1:K, N), # large groups (int)
    id6 = rand(1:div(N, K), N), # small groups (int); div keeps the range integer-valued
    v1 = rand(1:5, N), # int in range [1,5]
    v2 = rand(1:5, N), # int in range [1,5]
    v3 = rand(round.(rand(Uniform(0,100),100),4), N) # numeric e.g. 23.5749
))
using BenchmarkTools
# Interpolate the global table with $ so the benchmark does not measure global-variable overhead.
@benchmark aggregate(+, $T, by=(:id1,), with=:v1)
@xiaodaigh

I got this error: function aggregate does not accept keyword arguments

@andreasnoack

@xiaodaigh Make sure that your packages are up to date. I just tried successfully with the latest released versions.

@xiaodaigh

@andreasnoack OK, it's working now.
