@nouiz
Created August 17, 2012 14:02
Theano vs minivect
$\time make
gfortran -O3 -fno-underscoring -fno-second-underscore -g -Wall -march=native -fPIC -c fbench.f90
python2.7 ~/repos/cython/bin/cython bench.pyx
CC=gcc LD="ld" LDFLAGS="" CFLAGS="-O3 -lgfortran -g -Wall -march=native -fPIC" python2.7 setup.py build_ext --inplace
running build_ext
skipping 'bench.c' Cython extension (up-to-date)
building 'bench' extension
gcc -DNDEBUG -O2 -O3 -lgfortran -g -Wall -march=native -fPIC -fPIC -I/opt/lisa/os/epd-7.1.2/lib/python2.7/site-packages/numpy/core/include -I/opt/lisa/os/epd-7.1.2/include/python2.7 -c bench.c -o build/temp.linux-x86_64-2.7/bench.o
bench.c: In function ‘get_memview_MemoryView_5array_7memview___get__’:
bench.c:5381:3: warning: dereferencing type-punned pointer will break strict-aliasing rules
bench.c:5381:3: warning: dereferencing type-punned pointer will break strict-aliasing rules
bench.c: In function ‘__pyx_array_new’:
bench.c:5689:5: warning: dereferencing type-punned pointer will break strict-aliasing rules
bench.c:5689:5: warning: dereferencing type-punned pointer will break strict-aliasing rules
bench.c: In function ‘__pyx_memoryview_is_slice’:
bench.c:7047:9: warning: dereferencing type-punned pointer will break strict-aliasing rules
bench.c:7047:9: warning: dereferencing type-punned pointer will break strict-aliasing rules
bench.c: In function ‘__pyx_memoryview_MemoryView_10memoryview_16is_c_contig’:
bench.c:9193:3: warning: dereferencing type-punned pointer will break strict-aliasing rules
bench.c:9193:3: warning: dereferencing type-punned pointer will break strict-aliasing rules
bench.c: In function ‘__pyx_memoryview_MemoryView_10memoryview_18is_f_contig’:
bench.c:9258:3: warning: dereferencing type-punned pointer will break strict-aliasing rules
bench.c:9258:3: warning: dereferencing type-punned pointer will break strict-aliasing rules
bench.c: In function ‘__pyx_memoryview_new’:
bench.c:9476:3: warning: dereferencing type-punned pointer will break strict-aliasing rules
bench.c:9476:3: warning: dereferencing type-punned pointer will break strict-aliasing rules
bench.c: In function ‘_unellipsify’:
bench.c:9664:3: warning: dereferencing type-punned pointer will break strict-aliasing rules
bench.c:9664:3: warning: dereferencing type-punned pointer will break strict-aliasing rules
bench.c:9819:7: warning: dereferencing type-punned pointer will break strict-aliasing rules
bench.c:9819:7: warning: dereferencing type-punned pointer will break strict-aliasing rules
bench.c:9885:9: warning: dereferencing type-punned pointer will break strict-aliasing rules
bench.c:9885:9: warning: dereferencing type-punned pointer will break strict-aliasing rules
bench.c: In function ‘__pyx_memoryview_fromslice’:
bench.c:11902:3: warning: dereferencing type-punned pointer will break strict-aliasing rules
bench.c:11902:3: warning: dereferencing type-punned pointer will break strict-aliasing rules
bench.c: In function ‘__pyx_pf_5bench_2theano_compile’:
bench.c:14840:7: warning: dereferencing type-punned pointer will break strict-aliasing rules
bench.c:14840:7: warning: dereferencing type-punned pointer will break strict-aliasing rules
bench.c:14908:3: warning: dereferencing type-punned pointer will break strict-aliasing rules
bench.c:14908:3: warning: dereferencing type-punned pointer will break strict-aliasing rules
bench.c: In function ‘__pyx_pf_5bench_12MixedContig2_fortran’:
bench.c:20760:7: warning: implicit declaration of function ‘aplusb_fcf’
bench.c: In function ‘__mini_mangle___pyx_array_expression_1tiled_c’:
bench.c:32877:22: warning: unused variable ‘__mini_mangle_temp0’
bench.c: In function ‘__mini_mangle___pyx_array_expression_2tiled_fortran’:
bench.c:32952:22: warning: unused variable ‘__mini_mangle_temp0’
bench.c: In function ‘__mini_mangle___pyx_array_expression_3inner_contig_c’:
bench.c:33025:22: warning: unused variable ‘__mini_mangle_temp0’
bench.c: In function ‘__mini_mangle___pyx_array_expression_4inner_contig_fortran’:
bench.c:33141:22: warning: unused variable ‘__mini_mangle_temp0’
bench.c: In function ‘__mini_mangle___pyx_array_expression_5strength_reduced_strided’:
bench.c:33259:22: warning: unused variable ‘__mini_mangle_temp0’
bench.c: In function ‘__mini_mangle___pyx_array_expression_6strength_reduced_strided_fortran’:
bench.c:33309:22: warning: unused variable ‘__mini_mangle_temp0’
bench.c: In function ‘__mini_mangle_final_assignment_8tiled_c’:
bench.c:33408:22: warning: unused variable ‘__mini_mangle_temp0’
bench.c: In function ‘__mini_mangle_final_assignment_9tiled_fortran’:
bench.c:33467:22: warning: unused variable ‘__mini_mangle_temp0’
bench.c: In function ‘__mini_mangle_final_assignment_10inner_contig_c’:
bench.c:33525:22: warning: unused variable ‘__mini_mangle_temp0’
bench.c: In function ‘__mini_mangle_final_assignment_11inner_contig_fortran’:
bench.c:33625:22: warning: unused variable ‘__mini_mangle_temp0’
bench.c: In function ‘__mini_mangle_final_assignment_12strength_reduced_strided’:
bench.c:33726:22: warning: unused variable ‘__mini_mangle_temp0’
bench.c: In function ‘__mini_mangle_final_assignment_13strength_reduced_strided_fortran’:
bench.c:33766:22: warning: unused variable ‘__mini_mangle_temp0’
bench.c: In function ‘__mini_mangle___pyx_array_expression_15tiled_c’:
bench.c:33886:22: warning: unused variable ‘__mini_mangle_temp0’
bench.c: In function ‘__mini_mangle___pyx_array_expression_16tiled_fortran’:
bench.c:34009:22: warning: unused variable ‘__mini_mangle_temp0’
bench.c: In function ‘__mini_mangle___pyx_array_expression_17inner_contig_c’:
bench.c:34127:22: warning: unused variable ‘__mini_mangle_temp0’
bench.c: In function ‘__mini_mangle___pyx_array_expression_18inner_contig_fortran’:
bench.c:34291:22: warning: unused variable ‘__mini_mangle_temp0’
bench.c: In function ‘__mini_mangle___pyx_array_expression_19strength_reduced_strided’:
bench.c:34460:22: warning: unused variable ‘__mini_mangle_temp0’
bench.c: In function ‘__mini_mangle___pyx_array_expression_20strength_reduced_strided_fortran’:
bench.c:34540:22: warning: unused variable ‘__mini_mangle_temp0’
bench.c: In function ‘__mini_mangle___pyx_array_expression_22tiled_c’:
bench.c:34681:22: warning: unused variable ‘__mini_mangle_temp0’
bench.c: In function ‘__mini_mangle___pyx_array_expression_23tiled_fortran’:
bench.c:34788:22: warning: unused variable ‘__mini_mangle_temp0’
bench.c: In function ‘__mini_mangle___pyx_array_expression_24inner_contig_c’:
bench.c:34891:22: warning: unused variable ‘__mini_mangle_temp0’
bench.c: In function ‘__mini_mangle___pyx_array_expression_25inner_contig_fortran’:
bench.c:35039:22: warning: unused variable ‘__mini_mangle_temp0’
bench.c: In function ‘__mini_mangle___pyx_array_expression_26strength_reduced_strided’:
bench.c:35191:22: warning: unused variable ‘__mini_mangle_temp0’
bench.c: In function ‘__mini_mangle___pyx_array_expression_27strength_reduced_strided_fortran’:
bench.c:35261:22: warning: unused variable ‘__mini_mangle_temp0’
bench.c: At top level:
bench.c:5747:32: warning: ‘__pyx_array_new_simple’ defined but not used
gcc -pthread -shared -g -O3 -lgfortran -g -Wall -march=native -fPIC build/temp.linux-x86_64-2.7/bench.o fbench.o -L/opt/lisa/os/epd-7.1.2/lib -lpython2.7 -o /u/bastienf/repos/minivect/bench/bench.so
python2.7 -c 'import bench; bench.run()'
2D Double Precision, Fortran Contig
400 Cython 948.148148148 MFlops 0.000160932540894 seconds
400 NumPy 935.67251462 MFlops 0.000163078308105 seconds
400 Theano 96.081669419 MFlops 0.0015881061554 seconds
400 Fortran 1113.04347826 MFlops 0.000137090682983 seconds
800 Cython 258.716523497 MFlops 0.00235915184021 seconds
800 NumPy 244.251502719 MFlops 0.00249886512756 seconds
800 Theano 96.3746564771 MFlops 0.00633311271667 seconds
800 Fortran 259.056871079 MFlops 0.00235605239868 seconds
1200 Cython 287.841687072 MFlops 0.0047709941864 seconds
1200 NumPy 280.715434475 MFlops 0.00489211082458 seconds
1200 Theano 86.7796610169 MFlops 0.0158250331879 seconds
1200 Fortran 284.979220265 MFlops 0.0048189163208 seconds
1600 Cython 277.777777778 MFlops 0.0087890625 seconds
1600 NumPy 257.804632427 MFlops 0.00946998596191 seconds
1600 Theano 83.1060901182 MFlops 0.0293769836426 seconds
1600 Fortran 254.10059803 MFlops 0.00960803031921 seconds
2000 Cython 281.195079086 MFlops 0.0135660171509 seconds
2000 NumPy 281.358256986 MFlops 0.0135581493378 seconds
2000 Theano 77.9875317434 MFlops 0.0489141941071 seconds
2000 Fortran 278.789357216 MFlops 0.0136830806732 seconds
2400 Cython 280.534281435 MFlops 0.019581079483 seconds
2400 NumPy 279.965004374 MFlops 0.0196208953857 seconds
2400 Theano 72.7996587516 MFlops 0.075455904007 seconds
2400 Fortran 280.691495194 MFlops 0.0195701122284 seconds
2D Double Precision, Strided, C Order
2 Cython 397.268777157 MFlops 0.000384092330933 seconds
2 NumPy 288.418206399 MFlops 0.000529050827026 seconds
2 Theano 196.620583717 MFlops 0.000776052474976 seconds
2 Fortran 432.432432432 MFlops 0.00035285949707 seconds
4 Cython 77.2200772201 MFlops 0.00197601318359 seconds
4 NumPy 72.3490843319 MFlops 0.00210905075073 seconds
4 Theano 86.4981754291 MFlops 0.00176405906677 seconds
4 Fortran 72.4555643609 MFlops 0.0021059513092 seconds
8 Cython 35.5674113593 MFlops 0.00429010391235 seconds
8 NumPy 35.543707653 MFlops 0.0042929649353 seconds
8 Theano 46.8041538687 MFlops 0.00326013565063 seconds
8 Fortran 35.6347438753 MFlops 0.00428199768066 seconds
16 Cython 28.4951024043 MFlops 0.00535488128662 seconds
16 NumPy 28.0751008949 MFlops 0.0054349899292 seconds
16 Theano 38.7761284459 MFlops 0.00393509864807 seconds
16 Fortran 28.5637775596 MFlops 0.00534200668335 seconds
32 Cython 27.8854951854 MFlops 0.00547194480896 seconds
32 NumPy 28.0136566576 MFlops 0.00544691085815 seconds
32 Theano 38.3210586192 MFlops 0.00398182868958 seconds
32 Fortran 28.2935455349 MFlops 0.00539302825928 seconds
64 Cython 25.700746928 MFlops 0.00593709945679 seconds
64 NumPy 25.6832136121 MFlops 0.00594115257263 seconds
64 Theano 35.8443013162 MFlops 0.00425696372986 seconds
64 Fortran 25.801249748 MFlops 0.00591397285461 seconds
128 Cython 26.2402624026 MFlops 0.00581502914429 seconds
128 NumPy 26.1769397521 MFlops 0.00582909584045 seconds
128 Theano 37.2786579683 MFlops 0.00409317016602 seconds
128 Fortran 25.9856267002 MFlops 0.00587201118469 seconds
2D Double Precision, Strided Inner Contig, 4 operands
400 Cython 1658.03108808 MFlops 0.0001380443573 seconds
400 NumPy 197.652872143 MFlops 0.0011579990387 seconds
400 Theano 177.154456542 MFlops 0.00129199028015 seconds
400 Fortran 1787.70949721 MFlops 0.000128030776978 seconds
800 Cython 495.164410058 MFlops 0.00184893608093 seconds
800 NumPy 143.949617634 MFlops 0.00636005401611 seconds
800 Theano 124.087119498 MFlops 0.00737810134888 seconds
800 Fortran 497.538222337 MFlops 0.00184011459351 seconds
1200 Cython 522.433184182 MFlops 0.00394296646118 seconds
1200 NumPy 143.52874728 MFlops 0.0143520832062 seconds
1200 Theano 112.651081529 MFlops 0.0182859897614 seconds
1200 Fortran 523.097414785 MFlops 0.00393795967102 seconds
1600 Cython 469.437652812 MFlops 0.0078010559082 seconds
1600 NumPy 142.03547188 MFlops 0.0257830619812 seconds
1600 Theano 101.57856798 MFlops 0.0360519886017 seconds
1600 Fortran 481.293476217 MFlops 0.00760889053345 seconds
2000 Cython 525.198590717 MFlops 0.0108950138092 seconds
2000 NumPy 143.69020392 MFlops 0.039822101593 seconds
2000 Theano 92.3730640145 MFlops 0.0619449615479 seconds
2000 Fortran 531.585009303 MFlops 0.0107641220093 seconds
2400 Cython 525.331752474 MFlops 0.0156848430634 seconds
2400 NumPy 147.275028445 MFlops 0.0559480190277 seconds
2400 Theano 88.1313401778 MFlops 0.093493938446 seconds
2400 Fortran 535.257949107 MFlops 0.0153939723969 seconds
2D Double Precision, Strided, Mixed Order
2 Cython 412.371134021 MFlops 0.000370025634766 seconds
2 NumPy 81.7786864298 MFlops 0.00186586380005 seconds
2 Theano 157.596651071 MFlops 0.000968217849731 seconds
2 Fortran 112.182296231 MFlops 0.00136017799377 seconds
4 Cython 68.0633840264 MFlops 0.00224184989929 seconds
4 NumPy 42.4938583095 MFlops 0.00359082221985 seconds
4 Theano 70.4457897633 MFlops 0.00216603279114 seconds
4 Fortran 58.1712415924 MFlops 0.00262308120728 seconds
8 Cython 33.0715171558 MFlops 0.00461387634277 seconds
8 NumPy 25.8889203511 MFlops 0.00589394569397 seconds
8 Theano 36.45269693 MFlops 0.00418591499329 seconds
8 Fortran 29.0697674419 MFlops 0.0052490234375 seconds
16 Cython 25.7794247966 MFlops 0.00591897964478 seconds
16 NumPy 21.5640688702 MFlops 0.00707602500916 seconds
16 Theano 31.9952007199 MFlops 0.00476908683777 seconds
16 Fortran 23.088023088 MFlops 0.0066089630127 seconds
32 Cython 25.8397932817 MFlops 0.00590515136719 seconds
32 NumPy 18.9787082617 MFlops 0.00803995132446 seconds
32 Theano 31.8091451292 MFlops 0.00479698181152 seconds
32 Fortran 21.9931271478 MFlops 0.00693798065186 seconds
64 Cython 24.912417283 MFlops 0.00612497329712 seconds
64 NumPy 18.1246637025 MFlops 0.00841879844666 seconds
64 Theano 32.2792152116 MFlops 0.00472712516785 seconds
64 Fortran 20.4760685948 MFlops 0.0074520111084 seconds
128 Cython 21.7731509832 MFlops 0.00700807571411 seconds
128 NumPy 16.0650635072 MFlops 0.00949811935425 seconds
128 Theano 31.5457413249 MFlops 0.00483703613281 seconds
128 Fortran 20.8768267223 MFlops 0.00730895996094 seconds
2D Double Precision, Mixed Contig Order
400 Cython 1320.9494324 MFlops 0.000231027603149 seconds
400 NumPy 184.944372201 MFlops 0.00165009498596 seconds
400 Theano 566.121185316 MFlops 0.000539064407349 seconds
400 Fortran 280.026252461 MFlops 0.00108981132507 seconds
800 Cython 392.397302269 MFlops 0.00311088562012 seconds
800 NumPy 134.719115906 MFlops 0.00906109809875 seconds
800 Theano 326.926760743 MFlops 0.00373387336731 seconds
800 Fortran 154.575370588 MFlops 0.00789713859558 seconds
1200 Cython 390.919271098 MFlops 0.00702595710754 seconds
1200 NumPy 125.009495079 MFlops 0.0219709873199 seconds
1200 Theano 293.652816722 MFlops 0.00935316085815 seconds
1200 Fortran 139.88901167 MFlops 0.0196340084076 seconds
1600 Cython 383.836869331 MFlops 0.0127210617065 seconds
1600 NumPy 115.332199534 MFlops 0.0423369407654 seconds
1600 Theano 280.459581228 MFlops 0.0174100399017 seconds
1600 Fortran 130.825838103 MFlops 0.0373229980469 seconds
2000 Cython 381.0158836 MFlops 0.0200238227844 seconds
2000 NumPy 109.57065424 MFlops 0.069629907608 seconds
2000 Theano 264.357940652 MFlops 0.0288600921631 seconds
2000 Fortran 128.742587243 MFlops 0.0592608451843 seconds
2400 Cython 376.101860921 MFlops 0.0292110443115 seconds
2400 NumPy 104.49429794 MFlops 0.105138063431 seconds
2400 Theano 253.854925876 MFlops 0.0432779788971 seconds
2400 Fortran 100.765141559 MFlops 0.109029054642 seconds
2D Double Precision, Mixed Strided Order, 6 operands
400 Cython 759.734093067 MFlops 0.000251054763794 seconds
400 NumPy 155.18913676 MFlops 0.00122904777527 seconds
400 Theano 424.853956452 MFlops 0.000448942184448 seconds
400 Fortran 363.967242948 MFlops 0.000524044036865 seconds
800 Cython 236.354235911 MFlops 0.00322794914246 seconds
800 NumPy 115.336096594 MFlops 0.00661492347717 seconds
800 Theano 217.125797259 MFlops 0.0035138130188 seconds
800 Fortran 139.118337536 MFlops 0.00548410415649 seconds
1200 Cython 243.317224832 MFlops 0.00705504417419 seconds
1200 NumPy 94.024237359 MFlops 0.0182571411133 seconds
1200 Theano 213.219616205 MFlops 0.0080509185791 seconds
1200 Fortran 113.940276305 MFlops 0.015065908432 seconds
1600 Cython 235.510579577 MFlops 0.0129580497742 seconds
1600 NumPy 97.1618123714 MFlops 0.0314090251923 seconds
1600 Theano 207.431895895 MFlops 0.0147120952606 seconds
1600 Fortran 119.111872104 MFlops 0.0256209373474 seconds
2000 Cython 230.011960622 MFlops 0.02073097229 seconds
2000 NumPy 95.9720529382 MFlops 0.0496850013733 seconds
2000 Theano 196.303603153 MFlops 0.0242908000946 seconds
2000 Fortran 112.682404643 MFlops 0.0423169136047 seconds
2400 Cython 230.602930579 MFlops 0.029776096344 seconds
2400 NumPy 93.4439952499 MFlops 0.0734820365906 seconds
2400 Theano 200.709452161 MFlops 0.0342109203339 seconds
2400 Fortran 108.112978062 MFlops 0.0635118484497 seconds
51.54user 7.38system 1:00.41elapsed 97%CPU (0text+0data 2076016max)k 11800inputs+47088outputs (9major+3319093minor)pagefaults 0nbswaps 7499nbcontext-switch 3270nbwait
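
The result lines above all share one shape: a size or stride, an implementation name, a MFlops figure, and a wall time in seconds. To compare implementations, a small script can group those lines by section and report each timing relative to NumPy. The following is only a sketch and not part of the benchmark itself: the names `parse_results` and `speedups` are hypothetical, and recognizing a section header by its `2D ` prefix is an assumption based on this particular log. The sample rows are copied from the output above.

```python
import re

# One result line looks like:
#   "16 Theano 38.7761284459 MFlops 0.00393509864807 seconds"
RESULT_RE = re.compile(r"^(\d+) (\w+) ([\d.]+) MFlops ([\d.]+) seconds$")


def parse_results(text):
    """Return {section header: {size or stride: {implementation: seconds}}}."""
    sections = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        match = RESULT_RE.match(line)
        if match and current is not None:
            size, impl, _mflops, seconds = match.groups()
            sections[current].setdefault(int(size), {})[impl] = float(seconds)
        elif line.startswith("2D "):
            # Assumption: every section header in this log starts with "2D ".
            current = line
            sections[current] = {}
    return sections


def speedups(timings, baseline="NumPy"):
    """Ratio of the baseline time to each time; values above 1 beat the baseline."""
    base = timings[baseline]
    return {impl: base / secs for impl, secs in timings.items()}


# Rows copied verbatim from the benchmark output above.
SAMPLE = """\
2D Double Precision, Strided, C Order
16 Cython 28.4951024043 MFlops 0.00535488128662 seconds
16 NumPy 28.0751008949 MFlops 0.0054349899292 seconds
16 Theano 38.7761284459 MFlops 0.00393509864807 seconds
16 Fortran 28.5637775596 MFlops 0.00534200668335 seconds
2D Double Precision, Mixed Contig Order
400 Cython 1320.9494324 MFlops 0.000231027603149 seconds
400 NumPy 184.944372201 MFlops 0.00165009498596 seconds
400 Theano 566.121185316 MFlops 0.000539064407349 seconds
400 Fortran 280.026252461 MFlops 0.00108981132507 seconds
"""

if __name__ == "__main__":
    for header, by_size in parse_results(SAMPLE).items():
        print(header)
        for size, timings in sorted(by_size.items()):
            for impl, ratio in sorted(speedups(timings).items()):
                print("  %5d %-8s %.2fx vs NumPy" % (size, impl, ratio))
```

Feeding the full log to `parse_results` instead of `SAMPLE` should work the same way, since compiler warnings and other lines that match neither pattern are skipped.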