Kenta Murata mrkn

## iruby_folium_sample_ja.ipynb

      
Created March 6, 2019 03:06

## gist:8319b4c4b6b8477f519fc9504e7ab90d
===== LIMIT=1000 =====
Calculating -------------------------------------
Mysql2Test.test_pluck_by_arrow(n)    64.717M bytes -     100.000 times
         Mysql2Test.test_pluck(n)   154.227M bytes -     100.000 times

Comparison:
Mysql2Test.test_pluck_by_arrow(n):  64716800.0 bytes
         Mysql2Test.test_pluck(n): 154226688.0 bytes - 2.38x  larger

===== LIMIT=2000 =====
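
As a quick sanity check on the LIMIT=1000 comparison above, the reported "2.38x larger" figure can be recomputed from the two byte totals. This is only a sketch: the variable names are made up for illustration, and the numbers are copied from the output rather than measured again.

```python
# Byte totals copied from the "Comparison" lines of the LIMIT=1000 run.
# The variable names are illustrative and do not come from the benchmark script.
arrow_bytes = 64_716_800    # Mysql2Test.test_pluck_by_arrow(n)
pluck_bytes = 154_226_688   # Mysql2Test.test_pluck(n)

# How many times larger the plain pluck total is than the Arrow-based one.
ratio = pluck_bytes / arrow_bytes

# Share of bytes that the Arrow-based path avoids relative to plain pluck.
saved_fraction = 1 - arrow_bytes / pluck_bytes

print(f"{ratio:.2f}x larger")
print(f"{saved_fraction:.1%} fewer bytes with the Arrow-based path")
```

The first printed line should agree with the ratio shown in the benchmark output.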

## julia-dot-optimize
julia> function dot(a, b)
         s = zero(eltype(a))
         for i in 1:endof(a)
           s += a[i] * b[i]
         end
         return s
       end
dot (generic function with 1 method)

julia> a = ones(100000); b = ones(100000);

## gist:98e4bd96068a02b173cdd5e9e4c0df2a
```
$ python
Python 3.6.4 (default, Apr  3 2018, 09:35:44)
[GCC 4.2.1 Compatible Apple LLVM 9.0.0 (clang-900.0.39.2)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from mxnet.gluon.parameter import ParameterDict
>>> pd1 = ParameterDict()
>>> pd2 = ParameterDict()
>>> pd1.get('a')
Parameter a (shape=None, dtype=<class 'numpy.float32'>)
```

## gist:549e82c47f5ffe407dc1fce29b657be2
compiling arrow-nmatrix.c
arrow-nmatrix.c: In function ‘garrow_type_to_nmatrix_dtype’:
arrow-nmatrix.c:57:8: error: ‘GARROW_TYPE_BOOL’ undeclared (first use in this function)
   case GARROW_TYPE_BOOL:
        ^
arrow-nmatrix.c:57:8: note: each undeclared identifier is reported only once for each function it appears in
arrow-nmatrix.c:34:3: warning: enumeration value ‘GARROW_TYPE_BOOLEAN’ not handled in switch [-Wswitch-enum]
   switch (arrow_type) {
   ^
arrow-nmatrix.c: In function ‘nmatrix_dtype_to_garrow_data_type’:

## rgb2lab.py
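from PIL import Image, ImageCms

im = Image.open(image_path)  # image_path: input file path, not defined in this excerpt
if im.mode != "RGB":
    im = im.convert("RGB")  # the transform below expects RGB input

srgb_profile = ImageCms.createProfile("sRGB")  # built-in sRGB profile
lab_profile = ImageCms.createProfile("LAB")  # built-in CIE Lab profile

rgb2lab_transform = ImageCms.buildTransformFromOpenProfiles(srgb_profile, lab_profile, "RGB", "LAB")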

## flu-roxonin.csv

          
time,temp,headache
0,38.5,1
20,38.4,1
25,38.3,1
30,38.1,1
35,37.9,1
40,37.6,1
45,37.6,1
50,37.3,1
55,37.1,0

## same_all_bench.rb
require 'benchmark'

LEN = 10000
TRY = 1000

s0 = 'x' * 100
s1 = 'x' * 99 + 'y'

cases = {
  shuffle: Array.new(LEN) {|i| i }.shuffle,

## rubydatatokyoworkshop20171026.md

      
Last active October 11, 2017 02:58

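RubyData Tokyo Workshop 2017.10.26

We are re-running in Tokyo the RubyData Workshop that was held at RubyKaigi 2017.
This is a chance to get hands-on experience with data science in Ruby.
We look forward to seeing you there.
Venue and date

Date: Thursday, October 26, 14:00-16:30
Venue: Speee, Inc. (Seminar Room, 5F Kurosaki Building, 4-1-4 Roppongi, Minato-ku, Tokyo)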

  
## gist:8cc05a2602caed0dce7617dbdebdd394
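# The relationship between Ruby and data science so far

- Long ago (around the Ruby 1.6 days), Ruby had NArray, a NumPy-like numerical array library, and people used it for linear algebra operations.
- Some time after NArray development went inactive, John Woods created a library called NMatrix, influenced by NArray.
- John founded SciRuby and began steady work to grow the set of scientific computing libraries for Ruby.
- SciRuby had momentum at first (?), but gradually went quiet. It runs GSoC projects every year, but the ideas proposed each year lack a long-term view and have no continuity, so the resulting libraries are of poor quality and most of them could hardly be called practical.
- In the meantime, Ruby was left out of the data science boom.
- Around 2015, Mr. Tanaka, the author of NArray, returned, started a new project called Ruby Numo, and released a new NArray.
- In 2016, alarmed that "at this rate Ruby will never be practically usable for data science," I started developing PyCall.
- In 2017, I released the first stable version of PyCall, which created the minimum conditions for using Ruby in data science by having Python do the heavy lifting behind the scenes.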