@jskDr
Created October 27, 2015 03:28
Extension of Julia's DataFrames by James
# This code extends DataFrames so that fingerprints can be loaded as strings.
# Each fingerprint is a sequence of binary digits (0 or 1) produced in Python.
# However, it cannot be loaded directly in Julia, which automatically
# translates a string of binary digits into a big integer, even when
# quotation marks are added. Hence, "Julia" is added to each binary string as a prefix.
# That way, Julia reads it as a character string rather than a big integer value.
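# Illustration only (a sketch, not part of the original gist): why the prefix helps.
# The name prefix_roundtrip is hypothetical. Parsing digit-only text as a number
# drops leading zeros, while the prefixed text is not numeric, so the digits
# survive intact and can be recovered by slicing off the prefix.
function prefix_roundtrip(raw::AbstractString)
    as_number = parse(BigInt, raw)               # numeric reading of the digits
    prefixed = "Julia" * raw                     # what the data file stores instead
    recovered = prefixed[length("Julia")+1:end]  # digits with the prefix removed
    return string(as_number), recovered
end
# prefix_roundtrip("0010110") returns ("10110", "0010110")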
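# bs2ba() transforms a binary string into a binary array (an ln x 1 Float64 matrix).
function bs2ba(fp)
    ln = length(fp)  # number of characters; the fingerprint is assumed to be ASCII
    x = zeros(Float64, ln, 1)
    for ii = 1:ln
        x[ii, 1] = parse(Int, fp[ii])  # each character must be a digit, '0' or '1'
    end
    return x
end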
# tofp() transforms a binary string prefixed with "Julia" into a binary array after removing the prefix.
function tofp(fp_julia)
    # The "Julia" prefix is removed so that the binary string can be used directly.
    # The prefix is added only so that the DataFrame operates properly in Julia.
    lc = length("Julia")
    fp = fp_julia[lc+1:end]
    return bs2ba(fp)
end
# to_xM() translates an array of binary strings with the special prefix "Julia" into
# a matrix of binary elements, each of which is 0 or 1. Each column holds one fingerprint.
# The output array is floating point since floating point is used in machine learning
# libraries such as sklearn.
function to_xM(fp_l)
    M = length(fp_l[1]) - length("Julia")  # number of digits per fingerprint
    N = size(fp_l, 1)                      # number of fingerprints
    xM = zeros(Float64, M, N)
    for n = 1:N
        xM[:, n] = tofp(fp_l[n])
    end
    return xM
end
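# A more compact sketch of the same conversion (not part of the original gist).
# The name to_xM_compact and the keyword prefix are hypothetical. It assumes every
# string is ASCII and starts with the same prefix; any character other than '1'
# is treated as 0, and a string of the wrong length raises an error.
function to_xM_compact(fp_l; prefix = "Julia")
    lc = length(prefix)
    M = length(fp_l[1]) - lc               # number of digits per fingerprint
    xM = zeros(Float64, M, length(fp_l))
    for (n, s) in enumerate(fp_l)
        bits = s[lc+1:end]                 # digits with the prefix removed
        if length(bits) != M
            throw(DimensionMismatch("fingerprint $n has $(length(bits)) digits, expected $M"))
        end
        xM[:, n] = [c == '1' ? 1.0 : 0.0 for c in bits]
    end
    return xM
end
# Usage: each column of the result is one fingerprint.
# to_xM_compact(["Julia0101", "Julia1100"]) == [0.0 1.0; 1.0 1.0; 0.0 0.0; 1.0 0.0]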