Lokesh Sharma lokeshh


(taken from GSOC proposal)

...

contrast_interact: This codes interaction terms. In a dataframe with columns ‘a’ and ‘b’, ‘a:b’ is an interaction term. As before, we need to code this term to produce some number of variables, but the coding differs somewhat. I’ll explain with an example how to code ‘a:b’, and the behavior generalizes from there. Let’s say column ‘a’ has m categories and ‘b’ has n categories. If ‘a’ has been mentioned in our regression expression, we code column ‘b’ with n-1 variables; similarly, if ‘b’ has been mentioned, we code column ‘a’ with m-1 variables. If ‘a’ hasn’t been mentioned in our regression expression, ‘b’ is coded with n variables; similarly, if ‘b’ hasn’t been mentioned, ‘a’ is coded with m variables.
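The rule above for ‘a:b’ can be sketched in a few lines of Ruby. This is only an illustration of the counting rule as described, not code from daru: the method name interaction_coding_sizes and its arguments are hypothetical, and main_effects stands for the columns that appear on their own in the regression expression.

```ruby
# Hypothetical helper (not part of daru): how many variables each factor of
# the interaction 'a:b' is coded with, following the rule described above.
#
# m            - number of categories in column 'a'
# n            - number of categories in column 'b'
# main_effects - columns mentioned on their own in the regression expression
def interaction_coding_sizes(m, n, main_effects)
  {
    # 'a' is reduced to m-1 variables only when 'b' is mentioned
    a: main_effects.include?(:b) ? m - 1 : m,
    # 'b' is reduced to n-1 variables only when 'a' is mentioned
    b: main_effects.include?(:a) ? n - 1 : n
  }
end

# 'a' has 3 categories, 'b' has 4, and only 'a' is mentioned on its own:
sizes = interaction_coding_sizes(3, 4, [:a])
puts "a: #{sizes[:a]} variables, b: #{sizes[:b]} variables"
```

With only ‘a’ mentioned, ‘b’ drops to n-1 variables while ‘a’ keeps all m. Mentioning both columns reduces both factors, and mentioning neither leaves both at full size.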

Here’s a general rule to follow when we have more than a two-way interaction. Say we have ‘a:b:c’ and we need to decide whether to code ‘a’ w

# Code for benchmark
require 'benchmark'
vector = Daru::Vector.new(
  10000.times.map.to_a.shuffle,
  missing_values: 100.times.map.to_a.shuffle
)
Benchmark.bm do |x|
  x.report("Sum of a vector using compact") do
lokeshh:~/workspace/daru (master) $ bundle install
Fetching gem metadata from https://rubygems.org/
Fetching version metadata from https://rubygems.org/
Resolving dependencies..............
Installing rake 10.5.0
Installing i18n 0.7.0
Installing json 1.8.3
Installing minitest 5.8.4
Installing thread_safe 0.3.5
Installing builder 3.2.2
require 'rspec/core/rake_task'
require 'bundler/gem_tasks'
lib_folder = File.expand_path("../lib", __FILE__)
RUBIES = ['ruby-2.0.0', 'ruby-2.1.1', 'ruby-2.2.1', 'jruby']
task :spec do |task|
  RUBIES.each do |ruby_v|
    `rvm use #{ruby_v}; rspec spec`
..................F...........******************************...........F...***********************....*.....*....................................../home/ubuntu/workspace/daru/lib/daru/index.rb:102: [BUG] Segmentation fault at 0x007fe1e31ce030
ruby 2.2.1p85 (2015-02-26 revision 49769) [x86_64-linux]
-- Control frame information -----------------------------------------------
c:0048 p:0021 s:0197 e:000196 METHOD /home/ubuntu/workspace/daru/lib/daru/index.rb:102
c:0047 p:0015 s:0193 e:000192 METHOD /home/ubuntu/workspace/daru/lib/daru/vector.rb:207
c:0046 p:0015 s:0188 e:000187 BLOCK /home/ubuntu/workspace/daru/lib/daru/maths/statistics/vector.rb:373 [FINISH]
c:0045 p:---- s:0186 e:000185 IFUNC
c:0044 p:---- s:0184 e:000183 CFUNC :each
c:0043 p:0019 s:0181 e:000180 METHOD /home/ubuntu/.rvm/gems/ruby-2.2.1/gems/activesupport-4.2.6/lib/active_support/core_ext/range/each.rb:7 [FINISH]
source 'https://rubygems.org'
gemspec
pry -r '/home/ubuntu/workspace/daru/lib/daru.rb'
[1] pry(main)> i = Daru::Vector.new [1, 2, 3, nil], dtype: :nmatrix, nm_dtype: :object
=> #<Daru::Vector(4)>
0 1
1 2
2 3
3 nil
[2] pry(main)> i = Daru::Vector.new [1, 2, 3, nil], dtype: :nmatrix
TypeError: no implicit conversion from nil to integer
from /home/ubuntu/workspace/daru/lib/daru/accessors/nmatrix_wrapper.rb:28:in `initialize'
lokeshh:~/workspace/daru (rake_task) $ rspec spec/ -r ./formatter.rb -f SimpleFormatter
Rserve: no process found
/usr/lib/R/bin/Rcmd: 62: exec: Rserve: not found
Rserve: no process found
/usr/lib/R/bin/Rcmd: 62: exec: Rserve: not found
Failures:
1) Daru rserve extension Daru::Vector#to_REXP converts to and from R data
Got 0 failures and 2 other errors:
require 'memory_profiler'
require 'daru'
array1 = (300.times.map { ('a'..'z').to_a.shuffle }).to_a
array2 = ['a'*26]*100 + ['b'*26]*100 + ['c'*26]*100
report = MemoryProfiler.report do
  a = Daru::Index.new array1
[1] pry(main)> dv = Daru::Vector.new ['third', 'third', 'first', 'second'], type: :category
=> #<Daru::Vector(4)>
0 third
1 third
2 first
3 second
[2] pry(main)> dv.order
=> ["third", "first", "second"]
[3] pry(main)> dv.order = ['first', 'second', 'thrd']
ArgumentError: The contents of new and old order must be the same.