@hpoit

hpoit/pv2 Secret

Created May 4, 2017 11:05
pv2
Last login: Thu May 4 07:11:19 on ttys001
               _
   _       _ _(_)_     |  A fresh approach to technical computing
  (_)     | (_) (_)    |  Documentation: http://docs.julialang.org
   _ _   _| |_  __ _   |  Type "?help" for help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 0.5.0 (2016-09-19 18:14 UTC)
 _/ |\__'_|_|_|\__'_|  |  Official http://julialang.org/ release
|__/                   |  x86_64-apple-darwin13.4.0

julia> using DataFrames, Query, CSV
INFO: Recompiling stale cache file /Users/Corvus/.julia/lib/v0.5/DataFrames.ji for module DataFrames.
WARNING: Method definition describe(AbstractArray) in module StatsBase at /Users/Corvus/.julia/v0.5/StatsBase/src/scalarstats.jl:573 overwritten in module DataFrames at /Users/Corvus/.julia/v0.5/DataFrames/src/abstractdataframe/abstractdataframe.jl:407.
WARNING: Method definition describe(AbstractArray) in module StatsBase at /Users/Corvus/.julia/v0.5/StatsBase/src/scalarstats.jl:573 overwritten in module DataFrames at /Users/Corvus/.julia/v0.5/DataFrames/src/abstractdataframe/abstractdataframe.jl:407.
WARNING: Method definition require(Symbol) in module Base at loading.jl:345 overwritten in module Query at /Users/Corvus/.julia/v0.5/Requires/src/require.jl:12.
INFO: Recompiling stale cache file /Users/Corvus/.julia/lib/v0.5/NamedTuples.ji for module NamedTuples.
INFO: Recompiling stale cache file /Users/Corvus/.julia/lib/v0.5/CSV.ji for module CSV.
WARNING: Method definition describe(AbstractArray) in module StatsBase at /Users/Corvus/.julia/v0.5/StatsBase/src/scalarstats.jl:573 overwritten in module DataFrames at /Users/Corvus/.julia/v0.5/DataFrames/src/abstractdataframe/abstractdataframe.jl:407.
WARNING: Method definition describe(AbstractArray) in module StatsBase at /Users/Corvus/.julia/v0.5/StatsBase/src/scalarstats.jl:573 overwritten in module DataFrames at /Users/Corvus/.julia/v0.5/DataFrames/src/abstractdataframe/abstractdataframe.jl:407.
julia> q = @from i in CSV.Source("/file.csv") begin
           @where i.Type == "Trade"
           @select {i.Price, i.Volume}
       end
Query.EnumerableConvert2Nullable{NamedTuples._NT_Price_Volume{Nullable{Float64},Nullable{Int64}},Query.EnumerableSelect{NamedTuples._NT_Price_Volume{Query.DataValue{Float64},Query.DataValue{Int64}},Query.EnumerableWhere{NamedTuples._NT_#RIC_Date[G]_Time[G]_GMT Offset_Type_Price_Volume_Bid Price_Bid Size_Ask Price_Ask Size_Qualifiers{Query.DataValue{String},Query.DataValue{String},Query.DataValue{String},Query.DataValue{Int64},Query.DataValue{String},Query.DataValue{Float64},Query.DataValue{Int64},Query.DataValue{Float64},Query.DataValue{Int64},Query.DataValue{Float64},Query.DataValue{Int64},Query.DataValue{String}},Query.EnumerableConvert2DataValue{NamedTuples._NT_#RIC_Date[G]_Time[G]_GMT Offset_Type_Price_Volume_Bid Price_Bid Size_Ask Price_Ask Size_Qualifiers{Query.DataValue{String},Query.DataValue{String},Query.DataValue{String},Query.DataValue{Int64},Query.DataValue{String},Query.DataValue{Float64},Query.DataValue{Int64},Query.DataValue{Float64},Query.DataValue{Int64},Query.DataValue{Float64},Query.DataValue{Int64},Query.DataValue{String}},Query.EnumerableIterable{NamedTuples._NT_#RIC_Date[G]_Time[G]_GMT Offset_Type_Price_Volume_Bid Price_Bid Size_Ask Price_Ask Size_Qualifiers{Nullable{String},Nullable{String},Nullable{String},Nullable{Int64},Nullable{String},Nullable{Float64},Nullable{Int64},Nullable{Float64},Nullable{Int64},Nullable{Float64},Nullable{Int64},Nullable{String}},IterableTables.DataStreamIterator{NamedTuples._NT_#RIC_Date[G]_Time[G]_GMT Offset_Type_Price_Volume_Bid Price_Bid Size_Ask Price_Ask 
Size_Qualifiers{Nullable{String},Nullable{String},Nullable{String},Nullable{Int64},Nullable{String},Nullable{Float64},Nullable{Int64},Nullable{Float64},Nullable{Int64},Nullable{Float64},Nullable{Int64},Nullable{String}},CSV.Source,Tuple{Nullable{String},Nullable{String},Nullable{String},Nullable{Int64},Nullable{String},Nullable{Float64},Nullable{Int64},Nullable{Float64},Nullable{Int64},Nullable{Float64},Nullable{Int64},Nullable{String}},Tuple{Nullable{WeakRefString{UInt8}},Nullable{WeakRefString{UInt8}},Nullable{WeakRefString{UInt8}},Nullable{Int64},Nullable{WeakRefString{UInt8}},Nullable{Float64},Nullable{Int64},Nullable{Float64},Nullable{Int64},Nullable{Float64},Nullable{Int64},Nullable{WeakRefString{UInt8}}}}}},##1#3},##2#4}}(Query.EnumerableSelect{NamedTuples._NT_Price_Volume{Query.DataValue{Float64},Query.DataValue{Int64}},Query.EnumerableWhere{NamedTuples._NT_#RIC_Date[G]_Time[G]_GMT Offset_Type_Price_Volume_Bid Price_Bid Size_Ask Price_Ask Size_Qualifiers{Query.DataValue{String},Query.DataValue{String},Query.DataValue{String},Query.DataValue{Int64},Query.DataValue{String},Query.DataValue{Float64},Query.DataValue{Int64},Query.DataValue{Float64},Query.DataValue{Int64},Query.DataValue{Float64},Query.DataValue{Int64},Query.DataValue{String}},Query.EnumerableConvert2DataValue{NamedTuples._NT_#RIC_Date[G]_Time[G]_GMT Offset_Type_Price_Volume_Bid Price_Bid Size_Ask Price_Ask Size_Qualifiers{Query.DataValue{String},Query.DataValue{String},Query.DataValue{String},Query.DataValue{Int64},Query.DataValue{String},Query.DataValue{Float64},Query.DataValue{Int64},Query.DataValue{Float64},Query.DataValue{Int64},Query.DataValue{Float64},Query.DataValue{Int64},Query.DataValue{String}},Query.EnumerableIterable{NamedTuples._NT_#RIC_Date[G]_Time[G]_GMT Offset_Type_Price_Volume_Bid Price_Bid Size_Ask Price_Ask 
Size_Qualifiers{Nullable{String},Nullable{String},Nullable{String},Nullable{Int64},Nullable{String},Nullable{Float64},Nullable{Int64},Nullable{Float64},Nullable{Int64},Nullable{Float64},Nullable{Int64},Nullable{String}},IterableTables.DataStreamIterator{NamedTuples._NT_#RIC_Date[G]_Time[G]_GMT Offset_Type_Price_Volume_Bid Price_Bid Size_Ask Price_Ask Size_Qualifiers{Nullable{String},Nullable{String},Nullable{String},Nullable{Int64},Nullable{String},Nullable{Float64},Nullable{Int64},Nullable{Float64},Nullable{Int64},Nullable{Float64},Nullable{Int64},Nullable{String}},CSV.Source,Tuple{Nullable{String},Nullable{String},Nullable{String},Nullable{Int64},Nullable{String},Nullable{Float64},Nullable{Int64},Nullable{Float64},Nullable{Int64},Nullable{Float64},Nullable{Int64},Nullable{String}},Tuple{Nullable{WeakRefString{UInt8}},Nullable{WeakRefString{UInt8}},Nullable{WeakRefString{UInt8}},Nullable{Int64},Nullable{WeakRefString{UInt8}},Nullable{Float64},Nullable{Int64},Nullable{Float64},Nullable{Int64},Nullable{Float64},Nullable{Int64},Nullable{WeakRefString{UInt8}}}}}},##1#3},##2#4}(Query.EnumerableWhere{NamedTuples._NT_#RIC_Date[G]_Time[G]_GMT Offset_Type_Price_Volume_Bid Price_Bid Size_Ask Price_Ask Size_Qualifiers{Query.DataValue{String},Query.DataValue{String},Query.DataValue{String},Query.DataValue{Int64},Query.DataValue{String},Query.DataValue{Float64},Query.DataValue{Int64},Query.DataValue{Float64},Query.DataValue{Int64},Query.DataValue{Float64},Query.DataValue{Int64},Query.DataValue{String}},Query.EnumerableConvert2DataValue{NamedTuples._NT_#RIC_Date[G]_Time[G]_GMT Offset_Type_Price_Volume_Bid Price_Bid Size_Ask Price_Ask 
Size_Qualifiers{Query.DataValue{String},Query.DataValue{String},Query.DataValue{String},Query.DataValue{Int64},Query.DataValue{String},Query.DataValue{Float64},Query.DataValue{Int64},Query.DataValue{Float64},Query.DataValue{Int64},Query.DataValue{Float64},Query.DataValue{Int64},Query.DataValue{String}},Query.EnumerableIterable{NamedTuples._NT_#RIC_Date[G]_Time[G]_GMT Offset_Type_Price_Volume_Bid Price_Bid Size_Ask Price_Ask Size_Qualifiers{Nullable{String},Nullable{String},Nullable{String},Nullable{Int64},Nullable{String},Nullable{Float64},Nullable{Int64},Nullable{Float64},Nullable{Int64},Nullable{Float64},Nullable{Int64},Nullable{String}},IterableTables.DataStreamIterator{NamedTuples._NT_#RIC_Date[G]_Time[G]_GMT Offset_Type_Price_Volume_Bid Price_Bid Size_Ask Price_Ask Size_Qualifiers{Nullable{String},Nullable{String},Nullable{String},Nullable{Int64},Nullable{String},Nullable{Float64},Nullable{Int64},Nullable{Float64},Nullable{Int64},Nullable{Float64},Nullable{Int64},Nullable{String}},CSV.Source,Tuple{Nullable{String},Nullable{String},Nullable{String},Nullable{Int64},Nullable{String},Nullable{Float64},Nullable{Int64},Nullable{Float64},Nullable{Int64},Nullable{Float64},Nullable{Int64},Nullable{String}},Tuple{Nullable{WeakRefString{UInt8}},Nullable{WeakRefString{UInt8}},Nullable{WeakRefString{UInt8}},Nullable{Int64},Nullable{WeakRefString{UInt8}},Nullable{Float64},Nullable{Int64},Nullable{Float64},Nullable{Int64},Nullable{Float64},Nullable{Int64},Nullable{WeakRefString{UInt8}}}}}},##1#3}(Query.EnumerableConvert2DataValue{NamedTuples._NT_#RIC_Date[G]_Time[G]_GMT Offset_Type_Price_Volume_Bid Price_Bid Size_Ask Price_Ask 
Size_Qualifiers{Query.DataValue{String},Query.DataValue{String},Query.DataValue{String},Query.DataValue{Int64},Query.DataValue{String},Query.DataValue{Float64},Query.DataValue{Int64},Query.DataValue{Float64},Query.DataValue{Int64},Query.DataValue{Float64},Query.DataValue{Int64},Query.DataValue{String}},Query.EnumerableIterable{NamedTuples._NT_#RIC_Date[G]_Time[G]_GMT Offset_Type_Price_Volume_Bid Price_Bid Size_Ask Price_Ask Size_Qualifiers{Nullable{String},Nullable{String},Nullable{String},Nullable{Int64},Nullable{String},Nullable{Float64},Nullable{Int64},Nullable{Float64},Nullable{Int64},Nullable{Float64},Nullable{Int64},Nullable{String}},IterableTables.DataStreamIterator{NamedTuples._NT_#RIC_Date[G]_Time[G]_GMT Offset_Type_Price_Volume_Bid Price_Bid Size_Ask Price_Ask Size_Qualifiers{Nullable{String},Nullable{String},Nullable{String},Nullable{Int64},Nullable{String},Nullable{Float64},Nullable{Int64},Nullable{Float64},Nullable{Int64},Nullable{Float64},Nullable{Int64},Nullable{String}},CSV.Source,Tuple{Nullable{String},Nullable{String},Nullable{String},Nullable{Int64},Nullable{String},Nullable{Float64},Nullable{Int64},Nullable{Float64},Nullable{Int64},Nullable{Float64},Nullable{Int64},Nullable{String}},Tuple{Nullable{WeakRefString{UInt8}},Nullable{WeakRefString{UInt8}},Nullable{WeakRefString{UInt8}},Nullable{Int64},Nullable{WeakRefString{UInt8}},Nullable{Float64},Nullable{Int64},Nullable{Float64},Nullable{Int64},Nullable{Float64},Nullable{Int64},Nullable{WeakRefString{UInt8}}}}}}(Query.EnumerableIterable{NamedTuples._NT_#RIC_Date[G]_Time[G]_GMT Offset_Type_Price_Volume_Bid Price_Bid Size_Ask Price_Ask Size_Qualifiers{Nullable{String},Nullable{String},Nullable{String},Nullable{Int64},Nullable{String},Nullable{Float64},Nullable{Int64},Nullable{Float64},Nullable{Int64},Nullable{Float64},Nullable{Int64},Nullable{String}},IterableTables.DataStreamIterator{NamedTuples._NT_#RIC_Date[G]_Time[G]_GMT Offset_Type_Price_Volume_Bid Price_Bid Size_Ask Price_Ask 
Size_Qualifiers{Nullable{String},Nullable{String},Nullable{String},Nullable{Int64},Nullable{String},Nullable{Float64},Nullable{Int64},Nullable{Float64},Nullable{Int64},Nullable{Float64},Nullable{Int64},Nullable{String}},CSV.Source,Tuple{Nullable{String},Nullable{String},Nullable{String},Nullable{Int64},Nullable{String},Nullable{Float64},Nullable{Int64},Nullable{Float64},Nullable{Int64},Nullable{Float64},Nullable{Int64},Nullable{String}},Tuple{Nullable{WeakRefString{UInt8}},Nullable{WeakRefString{UInt8}},Nullable{WeakRefString{UInt8}},Nullable{Int64},Nullable{WeakRefString{UInt8}},Nullable{Float64},Nullable{Int64},Nullable{Float64},Nullable{Int64},Nullable{Float64},Nullable{Int64},Nullable{WeakRefString{UInt8}}}}}(IterableTables.DataStreamIterator{NamedTuples._NT_#RIC_Date[G]_Time[G]_GMT Offset_Type_Price_Volume_Bid Price_Bid Size_Ask Price_Ask Size_Qualifiers{Nullable{String},Nullable{String},Nullable{String},Nullable{Int64},Nullable{String},Nullable{Float64},Nullable{Int64},Nullable{Float64},Nullable{Int64},Nullable{Float64},Nullable{Int64},Nullable{String}},CSV.Source,Tuple{Nullable{String},Nullable{String},Nullable{String},Nullable{Int64},Nullable{String},Nullable{Float64},Nullable{Int64},Nullable{Float64},Nullable{Int64},Nullable{Float64},Nullable{Int64},Nullable{String}},Tuple{Nullable{WeakRefString{UInt8}},Nullable{WeakRefString{UInt8}},Nullable{WeakRefString{UInt8}},Nullable{Int64},Nullable{WeakRefString{UInt8}},Nullable{Float64},Nullable{Int64},Nullable{Float64},Nullable{Int64},Nullable{Float64},Nullable{Int64},Nullable{WeakRefString{UInt8}}}}(CSV.Source: /Users/Corvus/Desktop/reuters1A.csv
CSV.Options:
delim: ','
quotechar: '"'
escapechar: '\\'
null: ""
dateformat: Base.Dates.DateFormat(Base.Dates.Slot[Base.Dates.DelimitedSlot{Base.Dates.Year}(Base.Dates.Year,'y',4,"-"),Base.Dates.DelimitedSlot{Base.Dates.Month}(Base.Dates.Month,'m',2,"-"),Base.Dates.DelimitedSlot{Base.Dates.Day}(Base.Dates.Day,'d',2,r"(?=\s|$)")],"","english")
Data.Schema{true}:
rows: 80000534 cols: 12
Columns:
"#RIC" Nullable{WeakRefString{UInt8}}
"Date[G]" Nullable{WeakRefString{UInt8}}
"Time[G]" Nullable{WeakRefString{UInt8}}
"GMT Offset" Nullable{Int64}
"Type" Nullable{WeakRefString{UInt8}}
"Price" Nullable{Float64}
"Volume" Nullable{Int64}
"Bid Price" Nullable{Float64}
"Bid Size" Nullable{Int64}
"Ask Price" Nullable{Float64}
"Ask Size" Nullable{Int64}
"Qualifiers" Nullable{WeakRefString{UInt8}},Data.Schema{true}:
rows: 80000534 cols: 12
Columns:
"#RIC" Nullable{WeakRefString{UInt8}}
"Date[G]" Nullable{WeakRefString{UInt8}}
"Time[G]" Nullable{WeakRefString{UInt8}}
"GMT Offset" Nullable{Int64}
"Type" Nullable{WeakRefString{UInt8}}
"Price" Nullable{Float64}
"Volume" Nullable{Int64}
"Bid Price" Nullable{Float64}
"Bid Size" Nullable{Int64}
"Ask Price" Nullable{Float64}
"Ask Size" Nullable{Int64}
"Qualifiers" Nullable{WeakRefString{UInt8}}))),#1),#2))
julia> df = DataFrame(take(q, 3_000_000))
3000000×2 DataFrames.DataFrame
│ Row     │ Price │ Volume │
├─────────┼───────┼────────┤
│ 1       │ 0.0   │ 0      │
│ 2       │ 19.05 │ 100    │
│ 3       │ 19.05 │ 600    │
│ 4       │ 19.05 │ 200    │
│ 5       │ 19.05 │ 200    │
│ 6       │ 19.05 │ 100    │
│ 7       │ 19.05 │ 400    │
│ 8       │ 19.05 │ 100    │
⋮
│ 2999992 │ 30.45 │ 100    │
│ 2999993 │ 30.44 │ 100    │
│ 2999994 │ 30.44 │ 500    │
│ 2999995 │ 30.44 │ 100    │
│ 2999996 │ 30.41 │ 100    │
│ 2999997 │ 30.41 │ 100    │
│ 2999998 │ 30.42 │ 400    │
│ 2999999 │ 30.42 │ 300    │
│ 3000000 │ 30.43 │ 100    │
julia>
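
The session above builds a query whose printed value is a nested enumerable type rather than data, then pulls only the first 3,000,000 matching rows with `take`. The sketch below illustrates that same lazy filter, select, take pattern using only Base Julia iterators. It is an analogy, not Query.jl's implementation: it assumes Julia 1.x (not the 0.5 session shown), and the `rows` vector and its values are hypothetical stand-ins for the CSV file.

```julia
# Hypothetical in-memory rows standing in for the CSV source; values are made up.
rows = [
    (Type = "Quote", Price = 19.04, Volume = 0),
    (Type = "Trade", Price = 19.05, Volume = 100),
    (Type = "Trade", Price = 19.05, Volume = 600),
    (Type = "Quote", Price = 19.06, Volume = 0),
    (Type = "Trade", Price = 19.05, Volume = 200),
]

# Build the pipeline lazily: nothing is filtered or projected at this point.
trades   = Iterators.filter(r -> r.Type == "Trade", rows)          # like @where
selected = ((Price = r.Price, Volume = r.Volume) for r in trades)  # like @select

# Materialize only the first two matching rows (like take(q, n) above).
first_two = collect(Iterators.take(selected, 2))
```

Because `Iterators.take` stops pulling from the pipeline once it has enough elements, rows after the second match are never examined, which is why limiting the result is useful on a large source.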