@marty1885
Created July 27, 2016 03:46
args: bin/deepcl_unittests --gtest_filter=-DATA*:SLOW*
Note: Google Test filter = -DATA*:SLOW*
[==========] Running 158 tests from 29 test cases.
[----------] Global test environment set-up.
[----------] 7 tests from testClBlas
[ RUN ] testClBlas.basic
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
initializing clblas
clblas teardown
[ OK ] testClBlas.basic (82 ms)
[ RUN ] testClBlas.transA
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
1 2 9
3 7 5
initializing clblas
clblas teardown
[ OK ] testClBlas.transA (36 ms)
[ RUN ] testClBlas.transB
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
3
-1
initializing clblas
clblas teardown
[ OK ] testClBlas.transB (37 ms)
[ RUN ] testClBlas.colMajor
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
initializing clblas
clblas teardown
[ OK ] testClBlas.colMajor (34 ms)
[ RUN ] testClBlas.colMajor2
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
initializing clblas
clblas teardown
[ OK ] testClBlas.colMajor2 (34 ms)
[ RUN ] testClBlas.colMajorTransA
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
initializing clblas
clblas teardown
[ OK ] testClBlas.colMajorTransA (37 ms)
[ RUN ] testClBlas.colMajorTransB
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
initializing clblas
clblas teardown
[ OK ] testClBlas.colMajorTransB (35 ms)
[----------] 7 tests from testClBlas (295 ms total)
[----------] 1 test from testDeepCL
[ RUN ] testDeepCL.basic
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
initializing clblas
expected number of output: 4
clblas teardown
[ OK ] testDeepCL.basic (176 ms)
[----------] 1 test from testDeepCL (176 ms total)
[----------] 23 tests from testupdateweights
[ RUN ] testupdateweights.conv1
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
initializing clblas
layer 0:InputLayer{ outputPlanes=2 outputSize=5 }
layer 1:ConvolutionalLayer{ LayerDimensions{ inputPlanes=2 inputSize=5 numFilters=2 filterSize=3 outputSize=3 padZeros=0 biased=0 skip=0} }
layer 2:SquareLossLayer{}
layer 0:InputLayer{ outputPlanes=2 outputSize=5 }
layer 1:ConvolutionalLayer{ LayerDimensions{ inputPlanes=2 inputSize=5 numFilters=2 filterSize=3 outputSize=3 padZeros=0 biased=0 skip=0} }
layer 2:SquareLossLayer{}
batchSize: 4
inputtotalsize=200 outputTotalSize=72
layer ConvolutionalLayer{ LayerDimensions{ inputPlanes=2 inputSize=5 numFilters=2 filterSize=3 outputSize=3 padZeros=0 biased=0 skip=0} }
weightsize=36 biassize=0
statefultimer v0.7
forward try kernel 0
... not plausibly optimal, skipping
forward try kernel 1
... seems valid
ForwardAuto: kernel 1 0ms
layer 0:InputLayer{ outputPlanes=2 outputSize=5 }
layer 1:ConvolutionalLayer{ LayerDimensions{ inputPlanes=2 inputSize=5 numFilters=2 filterSize=3 outputSize=3 padZeros=0 biased=0 skip=0} }
layer 2:SquareLossLayer{}
Parameters overview: (skipping 2 layers with 0 params)
layer 1: params=36 100.0%
TOTAL : params=36
calcGradWeights try kernel 0
... not plausibly optimal, skipping
calcGradWeights try kernel 1
... seems valid
BackpropWeightsAuto: kernel 1 0ms
forward try kernel 2
... seems valid
ForwardAuto: kernel 2 0ms
idx=8 predicted losschange=0.000111445 actual=0.000112534
forward try kernel 3
... seems valid
ForwardAuto: kernel 3 0ms
idx=13 predicted losschange=-0.000886715 actual=-0.000884056
forward try kernel 4
... seems valid
ForwardAuto: kernel 4 0ms
idx=0 predicted losschange=0.000210491 actual=0.000212669
forward try kernel 5
ForwardAuto: kernel 5: this instance cant be used: For ForwardFc, filtersize and inputimagesize must be identical
... not valid
forward try kernel 6
... seems valid
ForwardAuto: kernel 6 0ms
idx=22 predicted losschange=-0.000164224 actual=-0.000163078
forward try kernel 7
... seems valid
ForwardAuto: kernel 7 152ms
idx=22 predicted losschange=-0.000164224 actual=-0.000163078
forward kernel 0: cannot be used
forward kernel 1 time: 0ms
forward kernel 2 time: 0ms
forward kernel 3 time: 0ms
forward kernel 4 time: 0ms
forward kernel 5: cannot be used
forward kernel 6 time: 0ms
forward kernel 7 time: 152ms
forward layer selected kernel 1
idx=35 predicted losschange=-0.000391028 actual=-0.000391006
idx=26 predicted losschange=2.23142e-05 actual=2.57492e-05
idx=27 predicted losschange=9.38328e-05 actual=9.44138e-05
idx=27 predicted losschange=9.38328e-05 actual=9.44138e-05
idx=10 predicted losschange=0.00186697 actual=0.00187111
clblas teardown
[ OK ] testupdateweights.conv1 (566 ms)
[ RUN ] testupdateweights.conv1z
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
initializing clblas
layer 0:InputLayer{ outputPlanes=2 outputSize=3 }
layer 1:ConvolutionalLayer{ LayerDimensions{ inputPlanes=2 inputSize=3 numFilters=2 filterSize=3 outputSize=3 padZeros=1 biased=0 skip=0} }
layer 2:SquareLossLayer{}
layer 0:InputLayer{ outputPlanes=2 outputSize=3 }
layer 1:ConvolutionalLayer{ LayerDimensions{ inputPlanes=2 inputSize=3 numFilters=2 filterSize=3 outputSize=3 padZeros=1 biased=0 skip=0} }
layer 2:SquareLossLayer{}
batchSize: 4
inputtotalsize=72 outputTotalSize=72
layer ConvolutionalLayer{ LayerDimensions{ inputPlanes=2 inputSize=3 numFilters=2 filterSize=3 outputSize=3 padZeros=1 biased=0 skip=0} }
weightsize=36 biassize=0
forward try kernel 0
... not plausibly optimal, skipping
forward try kernel 1
... seems valid
ForwardAuto: kernel 1 0ms
layer 0:InputLayer{ outputPlanes=2 outputSize=3 }
layer 1:ConvolutionalLayer{ LayerDimensions{ inputPlanes=2 inputSize=3 numFilters=2 filterSize=3 outputSize=3 padZeros=1 biased=0 skip=0} }
layer 2:SquareLossLayer{}
Parameters overview: (skipping 2 layers with 0 params)
layer 1: params=36 100.0%
TOTAL : params=36
calcGradWeights try kernel 0
... not plausibly optimal, skipping
calcGradWeights try kernel 1
... seems valid
BackpropWeightsAuto: kernel 1 0ms
forward try kernel 2
... seems valid
ForwardAuto: kernel 2 0ms
idx=8 predicted losschange=0.00039831 actual=0.000397682
forward try kernel 3
... seems valid
ForwardAuto: kernel 3 0ms
idx=13 predicted losschange=-0.000426502 actual=-0.000426292
forward try kernel 4
... seems valid
ForwardAuto: kernel 4 0ms
idx=0 predicted losschange=0.000143287 actual=0.000144005
forward try kernel 5
ForwardAuto: kernel 5: this instance cant be used: For ForwardFc, padzeros must be disabled
... not valid
forward try kernel 6
... seems valid
ForwardAuto: kernel 6 0ms
idx=22 predicted losschange=-1.7916e-06 actual=0
forward try kernel 7
... seems valid
ForwardAuto: kernel 7 148ms
idx=22 predicted losschange=-1.7916e-06 actual=0
forward kernel 0: cannot be used
forward kernel 1 time: 0ms
forward kernel 2 time: 0ms
forward kernel 3 time: 0ms
forward kernel 4 time: 0ms
forward kernel 5: cannot be used
forward kernel 6 time: 0ms
forward kernel 7 time: 148ms
forward layer selected kernel 1
idx=35 predicted losschange=-2.82565e-05 actual=-2.76566e-05
idx=26 predicted losschange=3.62191e-05 actual=3.71933e-05
idx=27 predicted losschange=-0.000319862 actual=-0.000317574
idx=27 predicted losschange=-0.000319862 actual=-0.000317574
idx=10 predicted losschange=-0.000883857 actual=-0.000883102
clblas teardown
[ OK ] testupdateweights.conv1z (554 ms)
[ RUN ] testupdateweights.numericallytest
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
forward try kernel 0
... not plausibly optimal, skipping
forward try kernel 1
... seems valid
ForwardAuto: kernel 1 0ms
layer 0:InputLayer{ outputPlanes=1 outputSize=1 }
layer 1:ConvolutionalLayer{ LayerDimensions{ inputPlanes=1 inputSize=1 numFilters=1 filterSize=1 outputSize=1 padZeros=0 biased=0 skip=0} }
layer 2:ActivationLayer{ TANH }
layer 3:SquareLossLayer{}
Parameters overview: (skipping 3 layers with 0 params)
layer 1: params=1 100.0%
TOTAL : params=1
forward try kernel 2
... seems valid
ForwardAuto: kernel 2 0ms
calcGradWeights try kernel 0
... not plausibly optimal, skipping
calcGradWeights try kernel 1
... seems valid
BackpropWeightsAuto: kernel 1 0ms
forward try kernel 3
... seems valid
ForwardAuto: kernel 3 0ms
layer 0:InputLayer{ outputPlanes=1 outputSize=1 }
layer 1:ConvolutionalLayer{ LayerDimensions{ inputPlanes=1 inputSize=1 numFilters=1 filterSize=1 outputSize=1 padZeros=0 biased=0 skip=0} }
layer 2:ActivationLayer{ TANH }
layer 3:SquareLossLayer{}
Parameters overview: (skipping 3 layers with 0 params)
layer 1: params=1 100.0%
TOTAL : params=1
loss 0.0367983 loss2 0.0367913 change: 7.01472e-06
sumweightsdiff -0.000264842
loss change 7.01472e-06
estimatedLossChangeFromW 7.01413e-06
[ OK ] testupdateweights.numericallytest (388 ms)
[ RUN ] testupdateweights.numericallytest_imagesize3
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
forward try kernel 0
... not plausibly optimal, skipping
forward try kernel 1
... seems valid
ForwardAuto: kernel 1 0ms
layer 0:InputLayer{ outputPlanes=1 outputSize=3 }
layer 1:ConvolutionalLayer{ LayerDimensions{ inputPlanes=1 inputSize=3 numFilters=1 filterSize=1 outputSize=3 padZeros=0 biased=0 skip=0} }
layer 2:ActivationLayer{ TANH }
layer 3:SquareLossLayer{}
Parameters overview: (skipping 3 layers with 0 params)
layer 1: params=1 100.0%
TOTAL : params=1
forward try kernel 2
... seems valid
ForwardAuto: kernel 2 0ms
calcGradWeights try kernel 0
... not plausibly optimal, skipping
calcGradWeights try kernel 1
... seems valid
BackpropWeightsAuto: kernel 1 0ms
forward try kernel 3
... seems valid
ForwardAuto: kernel 3 0ms
layer 0:InputLayer{ outputPlanes=1 outputSize=3 }
layer 1:ConvolutionalLayer{ LayerDimensions{ inputPlanes=1 inputSize=3 numFilters=1 filterSize=1 outputSize=3 padZeros=0 biased=0 skip=0} }
layer 2:ActivationLayer{ TANH }
layer 3:SquareLossLayer{}
Parameters overview: (skipping 3 layers with 0 params)
layer 1: params=1 100.0%
TOTAL : params=1
loss 1.23358 loss2 1.21612 change: 0.0174605
sumweightsdiff -0.0132709
loss change 0.0174605
estimatedLossChangeFromW 0.0176118
[ OK ] testupdateweights.numericallytest_imagesize3 (394 ms)
[ RUN ] testupdateweights.numericallytest_imagesize5
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
forward try kernel 0
... not plausibly optimal, skipping
forward try kernel 1
... seems valid
ForwardAuto: kernel 1 0ms
layer 0:InputLayer{ outputPlanes=1 outputSize=5 }
layer 1:ConvolutionalLayer{ LayerDimensions{ inputPlanes=1 inputSize=5 numFilters=1 filterSize=1 outputSize=5 padZeros=0 biased=0 skip=0} }
layer 2:ActivationLayer{ TANH }
layer 3:SquareLossLayer{}
Parameters overview: (skipping 3 layers with 0 params)
layer 1: params=1 100.0%
TOTAL : params=1
forward try kernel 2
... seems valid
ForwardAuto: kernel 2 0ms
calcGradWeights try kernel 0
... not plausibly optimal, skipping
calcGradWeights try kernel 1
... seems valid
BackpropWeightsAuto: kernel 1 0ms
forward try kernel 3
... seems valid
ForwardAuto: kernel 3 0ms
layer 0:InputLayer{ outputPlanes=1 outputSize=5 }
layer 1:ConvolutionalLayer{ LayerDimensions{ inputPlanes=1 inputSize=5 numFilters=1 filterSize=1 outputSize=5 padZeros=0 biased=0 skip=0} }
layer 2:ActivationLayer{ TANH }
layer 3:SquareLossLayer{}
Parameters overview: (skipping 3 layers with 0 params)
layer 1: params=1 100.0%
TOTAL : params=1
loss 4.12958 loss2 4.11952 change: 0.0100665
sumweightsdiff -0.0101708
loss change 0.0100665
estimatedLossChangeFromW 0.0103444
[ OK ] testupdateweights.numericallytest_imagesize5 (398 ms)
[ RUN ] testupdateweights.numericallytest_imagesize9
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
forward try kernel 0
... not plausibly optimal, skipping
forward try kernel 1
... seems valid
ForwardAuto: kernel 1 0ms
layer 0:InputLayer{ outputPlanes=1 outputSize=9 }
layer 1:ConvolutionalLayer{ LayerDimensions{ inputPlanes=1 inputSize=9 numFilters=1 filterSize=1 outputSize=9 padZeros=0 biased=0 skip=0} }
layer 2:ActivationLayer{ TANH }
layer 3:SquareLossLayer{}
Parameters overview: (skipping 3 layers with 0 params)
layer 1: params=1 100.0%
TOTAL : params=1
forward try kernel 2
... seems valid
ForwardAuto: kernel 2 0ms
calcGradWeights try kernel 0
... not plausibly optimal, skipping
calcGradWeights try kernel 1
... seems valid
BackpropWeightsAuto: kernel 1 0ms
forward try kernel 3
... seems valid
ForwardAuto: kernel 3 0ms
layer 0:InputLayer{ outputPlanes=1 outputSize=9 }
layer 1:ConvolutionalLayer{ LayerDimensions{ inputPlanes=1 inputSize=9 numFilters=1 filterSize=1 outputSize=9 padZeros=0 biased=0 skip=0} }
layer 2:ActivationLayer{ TANH }
layer 3:SquareLossLayer{}
Parameters overview: (skipping 3 layers with 0 params)
layer 1: params=1 100.0%
TOTAL : params=1
loss 13.4341 loss2 13.4339 change: 0.000207901
sumweightsdiff 0.00153953
loss change 0.000207901
estimatedLossChangeFromW 0.000237015
[ OK ] testupdateweights.numericallytest_imagesize9 (392 ms)
[ RUN ] testupdateweights.numericallytest_imagesize9_filtersize9
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
forward try kernel 0
... not plausibly optimal, skipping
forward try kernel 1
... seems valid
ForwardAuto: kernel 1 0ms
layer 0:InputLayer{ outputPlanes=1 outputSize=9 }
layer 1:ConvolutionalLayer{ LayerDimensions{ inputPlanes=1 inputSize=9 numFilters=1 filterSize=9 outputSize=1 padZeros=0 biased=0 skip=0} }
layer 2:ActivationLayer{ TANH }
layer 3:SquareLossLayer{}
Parameters overview: (skipping 3 layers with 0 params)
layer 1: params=81 100.0%
TOTAL : params=81
forward try kernel 2
... seems valid
ForwardAuto: kernel 2 0ms
calcGradWeights try kernel 0
... not plausibly optimal, skipping
calcGradWeights try kernel 1
... seems valid
BackpropWeightsAuto: kernel 1 0ms
forward try kernel 3
... seems valid
ForwardAuto: kernel 3 0ms
layer 0:InputLayer{ outputPlanes=1 outputSize=9 }
layer 1:ConvolutionalLayer{ LayerDimensions{ inputPlanes=1 inputSize=9 numFilters=1 filterSize=9 outputSize=1 padZeros=0 biased=0 skip=0} }
layer 2:ActivationLayer{ TANH }
layer 3:SquareLossLayer{}
Parameters overview: (skipping 3 layers with 0 params)
layer 1: params=81 100.0%
TOTAL : params=81
loss 0.135896 loss2 0.0848782 change: 0.0510182
sumweightsdiff -0.0322406
loss change 0.0510182
estimatedLossChangeFromW 0.0555841
[ OK ] testupdateweights.numericallytest_imagesize9_filtersize9 (456 ms)
[ RUN ] testupdateweights.numericallytest_imagesize9_filtersize3
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
forward try kernel 0
... not plausibly optimal, skipping
forward try kernel 1
... seems valid
ForwardAuto: kernel 1 0ms
layer 0:InputLayer{ outputPlanes=1 outputSize=9 }
layer 1:ConvolutionalLayer{ LayerDimensions{ inputPlanes=1 inputSize=9 numFilters=1 filterSize=3 outputSize=7 padZeros=0 biased=0 skip=0} }
layer 2:ActivationLayer{ TANH }
layer 3:SquareLossLayer{}
Parameters overview: (skipping 3 layers with 0 params)
layer 1: params=9 100.0%
TOTAL : params=9
forward try kernel 2
... seems valid
ForwardAuto: kernel 2 0ms
calcGradWeights try kernel 0
... not plausibly optimal, skipping
calcGradWeights try kernel 1
... seems valid
BackpropWeightsAuto: kernel 1 0ms
forward try kernel 3
... seems valid
ForwardAuto: kernel 3 0ms
layer 0:InputLayer{ outputPlanes=1 outputSize=9 }
layer 1:ConvolutionalLayer{ LayerDimensions{ inputPlanes=1 inputSize=9 numFilters=1 filterSize=3 outputSize=7 padZeros=0 biased=0 skip=0} }
layer 2:ActivationLayer{ TANH }
layer 3:SquareLossLayer{}
Parameters overview: (skipping 3 layers with 0 params)
layer 1: params=9 100.0%
TOTAL : params=9
loss 7.70633 loss2 7.41581 change: 0.290529
sumweightsdiff -0.0898813
loss change 0.290529
estimatedLossChangeFromW 0.316231
[ OK ] testupdateweights.numericallytest_imagesize9_filtersize3 (444 ms)
[ RUN ] testupdateweights.numericallytest_imagesize3_filtersize3
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
forward try kernel 0
... not plausibly optimal, skipping
forward try kernel 1
... seems valid
ForwardAuto: kernel 1 0ms
layer 0:InputLayer{ outputPlanes=1 outputSize=3 }
layer 1:ConvolutionalLayer{ LayerDimensions{ inputPlanes=1 inputSize=3 numFilters=1 filterSize=3 outputSize=1 padZeros=0 biased=0 skip=0} }
layer 2:ActivationLayer{ TANH }
layer 3:SquareLossLayer{}
Parameters overview: (skipping 3 layers with 0 params)
layer 1: params=9 100.0%
TOTAL : params=9
forward try kernel 2
... seems valid
ForwardAuto: kernel 2 0ms
calcGradWeights try kernel 0
... not plausibly optimal, skipping
calcGradWeights try kernel 1
... seems valid
BackpropWeightsAuto: kernel 1 0ms
forward try kernel 3
... seems valid
ForwardAuto: kernel 3 0ms
layer 0:InputLayer{ outputPlanes=1 outputSize=3 }
layer 1:ConvolutionalLayer{ LayerDimensions{ inputPlanes=1 inputSize=3 numFilters=1 filterSize=3 outputSize=1 padZeros=0 biased=0 skip=0} }
layer 2:ActivationLayer{ TANH }
layer 3:SquareLossLayer{}
Parameters overview: (skipping 3 layers with 0 params)
layer 1: params=9 100.0%
TOTAL : params=9
loss 0.0719101 loss2 0.0694461 change: 0.00246406
sumweightsdiff -0.0110647
loss change 0.00246406
estimatedLossChangeFromW 0.00248372
[ OK ] testupdateweights.numericallytest_imagesize3_filtersize3 (401 ms)
[ RUN ] testupdateweights.numericallytest_imagesize5_filtersize3
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
forward try kernel 0
... not plausibly optimal, skipping
forward try kernel 1
... seems valid
ForwardAuto: kernel 1 0ms
layer 0:InputLayer{ outputPlanes=1 outputSize=5 }
layer 1:ConvolutionalLayer{ LayerDimensions{ inputPlanes=1 inputSize=5 numFilters=1 filterSize=3 outputSize=3 padZeros=0 biased=0 skip=0} }
layer 2:ActivationLayer{ TANH }
layer 3:SquareLossLayer{}
Parameters overview: (skipping 3 layers with 0 params)
layer 1: params=9 100.0%
TOTAL : params=9
forward try kernel 2
... seems valid
ForwardAuto: kernel 2 0ms
calcGradWeights try kernel 0
... not plausibly optimal, skipping
calcGradWeights try kernel 1
... seems valid
BackpropWeightsAuto: kernel 1 0ms
forward try kernel 3
... seems valid
ForwardAuto: kernel 3 0ms
layer 0:InputLayer{ outputPlanes=1 outputSize=5 }
layer 1:ConvolutionalLayer{ LayerDimensions{ inputPlanes=1 inputSize=5 numFilters=1 filterSize=3 outputSize=3 padZeros=0 biased=0 skip=0} }
layer 2:ActivationLayer{ TANH }
layer 3:SquareLossLayer{}
Parameters overview: (skipping 3 layers with 0 params)
layer 1: params=9 100.0%
TOTAL : params=9
loss 1.20022 loss2 1.17241 change: 0.0278131
sumweightsdiff -0.0203888
loss change 0.0278131
estimatedLossChangeFromW 0.0280929
[ OK ] testupdateweights.numericallytest_imagesize5_filtersize3 (411 ms)
[ RUN ] testupdateweights.numericallytest_imagesize5_filtersize3_batchsize3
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
forward try kernel 0
... not plausibly optimal, skipping
forward try kernel 1
... seems valid
ForwardAuto: kernel 1 0ms
layer 0:InputLayer{ outputPlanes=1 outputSize=5 }
layer 1:ConvolutionalLayer{ LayerDimensions{ inputPlanes=1 inputSize=5 numFilters=1 filterSize=3 outputSize=3 padZeros=0 biased=0 skip=0} }
layer 2:ActivationLayer{ TANH }
layer 3:SquareLossLayer{}
Parameters overview: (skipping 3 layers with 0 params)
layer 1: params=9 100.0%
TOTAL : params=9
forward try kernel 2
... seems valid
ForwardAuto: kernel 2 0ms
calcGradWeights try kernel 0
... not plausibly optimal, skipping
calcGradWeights try kernel 1
... seems valid
BackpropWeightsAuto: kernel 1 0ms
forward try kernel 3
... seems valid
ForwardAuto: kernel 3 0ms
layer 0:InputLayer{ outputPlanes=1 outputSize=5 }
layer 1:ConvolutionalLayer{ LayerDimensions{ inputPlanes=1 inputSize=5 numFilters=1 filterSize=3 outputSize=3 padZeros=0 biased=0 skip=0} }
layer 2:ActivationLayer{ TANH }
layer 3:SquareLossLayer{}
Parameters overview: (skipping 3 layers with 0 params)
layer 1: params=9 100.0%
TOTAL : params=9
loss 4.97142 loss2 4.78768 change: 0.183744
sumweightsdiff -0.056004
loss change 0.183744
estimatedLossChangeFromW 0.193264
[ OK ] testupdateweights.numericallytest_imagesize5_filtersize3_batchsize3 (409 ms)
[ RUN ] testupdateweights.numericallytest_imagesize5_filtersize3_planes3
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
forward try kernel 0
... not plausibly optimal, skipping
forward try kernel 1
... seems valid
ForwardAuto: kernel 1 0ms
layer 0:InputLayer{ outputPlanes=3 outputSize=5 }
layer 1:ConvolutionalLayer{ LayerDimensions{ inputPlanes=3 inputSize=5 numFilters=1 filterSize=3 outputSize=3 padZeros=0 biased=0 skip=0} }
layer 2:ActivationLayer{ TANH }
layer 3:SquareLossLayer{}
Parameters overview: (skipping 3 layers with 0 params)
layer 1: params=27 100.0%
TOTAL : params=27
forward try kernel 2
... seems valid
ForwardAuto: kernel 2 0ms
calcGradWeights try kernel 0
... not plausibly optimal, skipping
calcGradWeights try kernel 1
... seems valid
BackpropWeightsAuto: kernel 1 0ms
forward try kernel 3
... seems valid
ForwardAuto: kernel 3 0ms
layer 0:InputLayer{ outputPlanes=3 outputSize=5 }
layer 1:ConvolutionalLayer{ LayerDimensions{ inputPlanes=3 inputSize=5 numFilters=1 filterSize=3 outputSize=3 padZeros=0 biased=0 skip=0} }
layer 2:ActivationLayer{ TANH }
layer 3:SquareLossLayer{}
Parameters overview: (skipping 3 layers with 0 params)
layer 1: params=27 100.0%
TOTAL : params=27
loss 1.08887 loss2 0.9575 change: 0.13137
sumweightsdiff -0.00764531
loss change 0.13137
estimatedLossChangeFromW 0.134379
[ OK ] testupdateweights.numericallytest_imagesize5_filtersize3_planes3 (440 ms)
[ RUN ] testupdateweights.numericallytest_imagesize5_filtersize3_planes3_batchsize3
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
forward try kernel 0
... not plausibly optimal, skipping
forward try kernel 1
... seems valid
ForwardAuto: kernel 1 0ms
layer 0:InputLayer{ outputPlanes=3 outputSize=5 }
layer 1:ConvolutionalLayer{ LayerDimensions{ inputPlanes=3 inputSize=5 numFilters=1 filterSize=3 outputSize=3 padZeros=0 biased=0 skip=0} }
layer 2:ActivationLayer{ TANH }
layer 3:SquareLossLayer{}
Parameters overview: (skipping 3 layers with 0 params)
layer 1: params=27 100.0%
TOTAL : params=27
forward try kernel 2
... seems valid
ForwardAuto: kernel 2 0ms
calcGradWeights try kernel 0
... not plausibly optimal, skipping
calcGradWeights try kernel 1
... seems valid
BackpropWeightsAuto: kernel 1 0ms
forward try kernel 3
... seems valid
ForwardAuto: kernel 3 0ms
layer 0:InputLayer{ outputPlanes=3 outputSize=5 }
layer 1:ConvolutionalLayer{ LayerDimensions{ inputPlanes=3 inputSize=5 numFilters=1 filterSize=3 outputSize=3 padZeros=0 biased=0 skip=0} }
layer 2:ActivationLayer{ TANH }
layer 3:SquareLossLayer{}
Parameters overview: (skipping 3 layers with 0 params)
layer 1: params=27 100.0%
TOTAL : params=27
loss 4.76631 loss2 4.18154 change: 0.584769
sumweightsdiff 0.029606
loss change 0.584769
estimatedLossChangeFromW 0.620442
[ OK ] testupdateweights.numericallytest_imagesize5_filtersize3_planes3_batchsize3 (424 ms)
[ RUN ] testupdateweights.backprop_weights_2
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
options: -D gNumInputPlanes=1 -D gInputPlanes=1 -D gInputSize=1 -D gInputSizeSquared=1 -D gNumFilters=1 -D gFilterSize=1 -D gHalfFilterSize=0 -D gFilterSizeSquared=1 -D gNumOutputPlanes=1 -D gOutputPlanes=1 -D gOutputSize=1 -D gOutputSizeSquared=1 -D gPadZeros=0 -D gMargin=0 -D gEven=0 -D gSkip=0 -DgNumStripes=1 -DgInputStripeMarginRows=0 -DgInputStripeInnerNumRows=1 -DgInputStripeOuterNumRows=1 -DgInputStripeInnerSize=1 -DgInputStripeOuterSize=1 -DgInputStripeMarginSize=0 -DgOutputStripeNumRows=1 -DgOutputStripeSize=1
mismatch for i 0
[ OK ] testupdateweights.backprop_weights_2 (38 ms)
[ RUN ] testupdateweights.backprop_weights_2_upstreamimagesize2
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
options: -D gNumInputPlanes=1 -D gInputPlanes=1 -D gInputSize=2 -D gInputSizeSquared=4 -D gNumFilters=1 -D gFilterSize=1 -D gHalfFilterSize=0 -D gFilterSizeSquared=1 -D gNumOutputPlanes=1 -D gOutputPlanes=1 -D gOutputSize=2 -D gOutputSizeSquared=4 -D gPadZeros=0 -D gMargin=0 -D gEven=0 -D gSkip=0 -DgNumStripes=1 -DgInputStripeMarginRows=0 -DgInputStripeInnerNumRows=2 -DgInputStripeOuterNumRows=2 -DgInputStripeInnerSize=4 -DgInputStripeOuterSize=4 -DgInputStripeMarginSize=0 -DgOutputStripeNumRows=2 -DgOutputStripeSize=4
mismatch for i 0
[ OK ] testupdateweights.backprop_weights_2_upstreamimagesize2 (40 ms)
[ RUN ] testupdateweights.backprop_weights_2_upstreamimagesize3_filtersize3
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
options: -D gNumInputPlanes=1 -D gInputPlanes=1 -D gInputSize=3 -D gInputSizeSquared=9 -D gNumFilters=1 -D gFilterSize=3 -D gHalfFilterSize=1 -D gFilterSizeSquared=9 -D gNumOutputPlanes=1 -D gOutputPlanes=1 -D gOutputSize=1 -D gOutputSizeSquared=1 -D gPadZeros=0 -D gMargin=0 -D gEven=0 -D gSkip=0 -DgNumStripes=1 -DgInputStripeMarginRows=2 -DgInputStripeInnerNumRows=3 -DgInputStripeOuterNumRows=7 -DgInputStripeInnerSize=9 -DgInputStripeOuterSize=21 -DgInputStripeMarginSize=6 -DgOutputStripeNumRows=1 -DgOutputStripeSize=1
mismatch for i 0
mismatch for i 1
mismatch for i 2
mismatch for i 3
mismatch for i 4
mismatch for i 5
mismatch for i 6
mismatch for i 7
mismatch for i 8
[ OK ] testupdateweights.backprop_weights_2_upstreamimagesize3_filtersize3 (41 ms)
[ RUN ] testupdateweights.backprop_weights_2_upstreamimagesize4_filtersize3
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
options: -D gNumInputPlanes=1 -D gInputPlanes=1 -D gInputSize=4 -D gInputSizeSquared=16 -D gNumFilters=1 -D gFilterSize=3 -D gHalfFilterSize=1 -D gFilterSizeSquared=9 -D gNumOutputPlanes=1 -D gOutputPlanes=1 -D gOutputSize=2 -D gOutputSizeSquared=4 -D gPadZeros=0 -D gMargin=0 -D gEven=0 -D gSkip=0 -DgNumStripes=1 -DgInputStripeMarginRows=2 -DgInputStripeInnerNumRows=4 -DgInputStripeOuterNumRows=8 -DgInputStripeInnerSize=16 -DgInputStripeOuterSize=32 -DgInputStripeMarginSize=8 -DgOutputStripeNumRows=2 -DgOutputStripeSize=4
mismatch for i 0
mismatch for i 8
[ OK ] testupdateweights.backprop_weights_2_upstreamimagesize4_filtersize3 (44 ms)
[ RUN ] testupdateweights.backprop_weights_2_upstreamimagesize5_filtersize3
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
options: -D gNumInputPlanes=1 -D gInputPlanes=1 -D gInputSize=5 -D gInputSizeSquared=25 -D gNumFilters=1 -D gFilterSize=3 -D gHalfFilterSize=1 -D gFilterSizeSquared=9 -D gNumOutputPlanes=1 -D gOutputPlanes=1 -D gOutputSize=3 -D gOutputSizeSquared=9 -D gPadZeros=0 -D gMargin=0 -D gEven=0 -D gSkip=0 -DgNumStripes=1 -DgInputStripeMarginRows=2 -DgInputStripeInnerNumRows=5 -DgInputStripeOuterNumRows=9 -DgInputStripeInnerSize=25 -DgInputStripeOuterSize=45 -DgInputStripeMarginSize=10 -DgOutputStripeNumRows=3 -DgOutputStripeSize=9
mismatch for i 0
mismatch for i 4
mismatch for i 8
[ OK ] testupdateweights.backprop_weights_2_upstreamimagesize5_filtersize3 (48 ms)
[ RUN ] testupdateweights.backprop_weights_2_upstreamimagesize3_filtersize1
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
options: -D gNumInputPlanes=1 -D gInputPlanes=1 -D gInputSize=3 -D gInputSizeSquared=9 -D gNumFilters=1 -D gFilterSize=1 -D gHalfFilterSize=0 -D gFilterSizeSquared=1 -D gNumOutputPlanes=1 -D gOutputPlanes=1 -D gOutputSize=3 -D gOutputSizeSquared=9 -D gPadZeros=0 -D gMargin=0 -D gEven=0 -D gSkip=0 -DgNumStripes=1 -DgInputStripeMarginRows=0 -DgInputStripeInnerNumRows=3 -DgInputStripeOuterNumRows=3 -DgInputStripeInnerSize=9 -DgInputStripeOuterSize=9 -DgInputStripeMarginSize=0 -DgOutputStripeNumRows=3 -DgOutputStripeSize=9
mismatch for i 0
[ OK ] testupdateweights.backprop_weights_2_upstreamimagesize3_filtersize1 (46 ms)
[ RUN ] testupdateweights.backprop_weights_2_upstreamimagesize16_filtersize1
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
options: -D gNumInputPlanes=1 -D gInputPlanes=1 -D gInputSize=16 -D gInputSizeSquared=256 -D gNumFilters=1 -D gFilterSize=1 -D gHalfFilterSize=0 -D gFilterSizeSquared=1 -D gNumOutputPlanes=1 -D gOutputPlanes=1 -D gOutputSize=16 -D gOutputSizeSquared=256 -D gPadZeros=0 -D gMargin=0 -D gEven=0 -D gSkip=0 -DgNumStripes=1 -DgInputStripeMarginRows=0 -DgInputStripeInnerNumRows=16 -DgInputStripeOuterNumRows=16 -DgInputStripeInnerSize=256 -DgInputStripeOuterSize=256 -DgInputStripeMarginSize=0 -DgOutputStripeNumRows=16 -DgOutputStripeSize=256
mismatch for i 0
[ OK ] testupdateweights.backprop_weights_2_upstreamimagesize16_filtersize1 (46 ms)
[ RUN ] testupdateweights.backprop_weights_2_upstreamimagesize17_filtersize1
LayerDimensions{ inputPlanes=1 inputSize=17 numFilters=1 filterSize=1 outputSize=17 padZeros=0 biased=0 skip=0}
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
options: -D gNumInputPlanes=1 -D gInputPlanes=1 -D gInputSize=17 -D gInputSizeSquared=289 -D gNumFilters=1 -D gFilterSize=1 -D gHalfFilterSize=0 -D gFilterSizeSquared=1 -D gNumOutputPlanes=1 -D gOutputPlanes=1 -D gOutputSize=17 -D gOutputSizeSquared=289 -D gPadZeros=0 -D gMargin=0 -D gEven=0 -D gSkip=0 -DgNumStripes=1 -DgInputStripeMarginRows=0 -DgInputStripeInnerNumRows=17 -DgInputStripeOuterNumRows=17 -DgInputStripeInnerSize=289 -DgInputStripeOuterSize=289 -DgInputStripeMarginSize=0 -DgOutputStripeNumRows=17 -DgOutputStripeSize=289
mismatch for i 0
[ OK ] testupdateweights.backprop_weights_2_upstreamimagesize17_filtersize1 (46 ms)
[ RUN ] testupdateweights.backprop_weights_2_upstreamimagesize17_filtersize1_moredata
expectedresult: -958.715
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
options: -D gNumInputPlanes=1 -D gInputPlanes=1 -D gInputSize=17 -D gInputSizeSquared=289 -D gNumFilters=1 -D gFilterSize=1 -D gHalfFilterSize=0 -D gFilterSizeSquared=1 -D gNumOutputPlanes=1 -D gOutputPlanes=1 -D gOutputSize=17 -D gOutputSizeSquared=289 -D gPadZeros=0 -D gMargin=0 -D gEven=0 -D gSkip=0 -DgNumStripes=1 -DgInputStripeMarginRows=0 -DgInputStripeInnerNumRows=17 -DgInputStripeOuterNumRows=17 -DgInputStripeInnerSize=289 -DgInputStripeOuterSize=289 -DgInputStripeMarginSize=0 -DgOutputStripeNumRows=17 -DgOutputStripeSize=289
mismatch for i 0
[ OK ] testupdateweights.backprop_weights_2_upstreamimagesize17_filtersize1_moredata (44 ms)
[ RUN ] testupdateweights.backprop_instance3_smaller2
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
numweights: 36
options: -D gNumInputPlanes=1 -D gInputPlanes=1 -D gInputSize=96 -D gInputSizeSquared=9216 -D gNumFilters=1 -D gFilterSize=6 -D gHalfFilterSize=3 -D gFilterSizeSquared=36 -D gNumOutputPlanes=1 -D gOutputPlanes=1 -D gOutputSize=91 -D gOutputSizeSquared=8281 -D gPadZeros=0 -D gMargin=0 -D gEven=1 -D gSkip=0 -DgNumStripes=8 -DgInputStripeMarginRows=5 -DgInputStripeInnerNumRows=12 -DgInputStripeOuterNumRows=22 -DgInputStripeInnerSize=1152 -DgInputStripeOuterSize=2112 -DgInputStripeMarginSize=480 -DgOutputStripeNumRows=12 -DgOutputStripeSize=1092
138 0 0 0 0 0
132 0 0 0 0 0
138 0 0 0 0 0
138 0 0 0 0 0
138 0 0 0 0 0
132 0 0 0 0 0
138 0 0 0 0 0
132 0 0 0 0 0
138 0 0 0 0 0
138 0 0 0 0 0
138 0 0 0 0 0
132 0 0 0 0 0
......
......
......
......
......
......
0=0 0 0 0 0 0 0 0
1=0 0 0 0 0 0 0 0
2=0 0 0 0 0 0 0 0
3=0 0 0 0 0 0 0 0
4=0 0 0 0 0 0 0 0
5=0 0 0 0 0 0 0 0
6=0 0 0 0 0 0 0 0
7=0 0 0 0 0 0 0 0
8=0 0 0 0 0 0 0 0
9=0 0 0 0 0 0 0 0
10=0 0 0 0 0 0 0 0
11=0 0 0 0 0 0 0 0
0=0 0 0 0 0 0 0 0
1=0 0 0 0 0 0 0 0
2=0 0 0 0 0 0 0 0
3=0 0 0 0 0 0 0 0
4=0 0 0 0 0 0 0 0
5=0 0 0 0 0 0 0 0
6=0 0 0 0 0 0 0 0
7=0 0 0 0 0 0 0 0
8=0 0 0 0 0 0 0 0
9=0 0 0 0 0 0 0 0
10=0 0 0 0 0 0 0 0
11=0 0 0 0 0 0 0 0
12=0 0 0 0 0 0 0 0
13=0 0 0 0 0 0 0 0
14=0 0 0 0 0 0 0 0
15=0 0 0 0 0 0 0 0
16=0 0 0 0 0 0 0 0
17=0 0 0 0 0 0 0 0
18=0 0 0 0 0 0 0 0
19=0 0 0 0 0 0 0 0
[ OK ] testupdateweights.backprop_instance3_smaller2 (72 ms)
[----------] 23 tests from testupdateweights (6142 ms total)
[----------] 17 tests from testforward
[ RUN ] testforward.imagesize2_nopadzeros
expected number of output: 4
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
[ OK ] testforward.imagesize2_nopadzeros (177 ms)
[ RUN ] testforward.imagesize2_padzeros
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
checking result[0]=0 expecting: 0
checking result[1]=0 expecting: 0
checking result[2]=0 expecting: 0
checking result[3]=0.2 expecting: 0.2
checking result[4]=-0.13 expecting: -0.13
checking result[5]=-0.15 expecting: -0.15
checking result[6]=0 expecting: 0
checking result[7]=0 expecting: 0
checking result[8]=0 expecting: 0
checking result[9]=0 expecting: 0
checking result[10]=0 expecting: 0
checking result[11]=0 expecting: 0
checking result[12]=-0.55 expecting: -0.55
checking result[13]=0.02 expecting: 0.02
checking result[14]=0.21 expecting: 0.21
checking result[27]=-14.3 expecting: -14.3
checking result[28]=-9.6 expecting: -9.6
checking result[29]=11.9 expecting: 11.9
checking result[35]=0.46 expecting: 0.46
[ OK ] testforward.imagesize2_padzeros (70 ms)
[ RUN ] testforward.imagesize3
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
test1 ok
[ OK ] testforward.imagesize3 (67 ms)
[ RUN ] testforward.test2
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
[ OK ] testforward.test2 (64 ms)
[ RUN ] testforward.test3
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
[ OK ] testforward.test3 (67 ms)
[ RUN ] testforward.compare_0_1_biased_nopad
LayerDimensions{ inputPlanes=8 inputSize=19 numFilters=8 filterSize=5 outputSize=15 padZeros=0 biased=1 skip=0}
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
initializing clblas
batch 0 batchsize 4
dump enabled=0
batch 0 batchsize 4
dump enabled=0
LayerDimensions{ inputPlanes=8 inputSize=19 numFilters=8 filterSize=5 outputSize=15 padZeros=0 biased=1 skip=0}
clblas teardown
[ OK ] testforward.compare_0_1_biased_nopad (89 ms)
[ RUN ] testforward.compare_0_1_biased_pad
LayerDimensions{ inputPlanes=8 inputSize=19 numFilters=8 filterSize=5 outputSize=19 padZeros=1 biased=1 skip=0}
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
initializing clblas
batch 0 batchsize 4
dump enabled=0
batch 0 batchsize 4
dump enabled=0
LayerDimensions{ inputPlanes=8 inputSize=19 numFilters=8 filterSize=5 outputSize=19 padZeros=1 biased=1 skip=0}
clblas teardown
[ OK ] testforward.compare_0_1_biased_pad (91 ms)
[ RUN ] testforward.compare_1_n_biased_nopad
instance: 2
LayerDimensions{ inputPlanes=8 inputSize=19 numFilters=8 filterSize=5 outputSize=15 padZeros=0 biased=1 skip=0}
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
initializing clblas
batch 0 batchsize 4
dump enabled=0
batch 0 batchsize 4
dump enabled=0
LayerDimensions{ inputPlanes=8 inputSize=19 numFilters=8 filterSize=5 outputSize=15 padZeros=0 biased=1 skip=0}
clblas teardown
instance: 3
LayerDimensions{ inputPlanes=8 inputSize=19 numFilters=8 filterSize=5 outputSize=15 padZeros=0 biased=1 skip=0}
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
initializing clblas
batch 0 batchsize 4
dump enabled=0
batch 0 batchsize 4
dump enabled=0
LayerDimensions{ inputPlanes=8 inputSize=19 numFilters=8 filterSize=5 outputSize=15 padZeros=0 biased=1 skip=0}
clblas teardown
instance: 4
LayerDimensions{ inputPlanes=8 inputSize=19 numFilters=8 filterSize=5 outputSize=15 padZeros=0 biased=1 skip=0}
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
initializing clblas
batch 0 batchsize 4
dump enabled=0
batch 0 batchsize 4
dump enabled=0
LayerDimensions{ inputPlanes=8 inputSize=19 numFilters=8 filterSize=5 outputSize=15 padZeros=0 biased=1 skip=0}
clblas teardown
instance: 6
LayerDimensions{ inputPlanes=8 inputSize=19 numFilters=8 filterSize=5 outputSize=15 padZeros=0 biased=1 skip=0}
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
initializing clblas
batch 0 batchsize 4
dump enabled=0
batch 0 batchsize 4
dump enabled=0
LayerDimensions{ inputPlanes=8 inputSize=19 numFilters=8 filterSize=5 outputSize=15 padZeros=0 biased=1 skip=0}
clblas teardown
instance: 7
LayerDimensions{ inputPlanes=8 inputSize=19 numFilters=8 filterSize=5 outputSize=15 padZeros=0 biased=1 skip=0}
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
initializing clblas
batch 0 batchsize 4
dump enabled=0
batch 0 batchsize 4
dump enabled=0
LayerDimensions{ inputPlanes=8 inputSize=19 numFilters=8 filterSize=5 outputSize=15 padZeros=0 biased=1 skip=0}
clblas teardown
[ OK ] testforward.compare_1_n_biased_nopad (913 ms)
[ RUN ] testforward.compare_1_n_biased_pad
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
instance: 2
LayerDimensions{ inputPlanes=8 inputSize=19 numFilters=8 filterSize=5 outputSize=19 padZeros=1 biased=1 skip=0}
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
initializing clblas
batch 0 batchsize 4
dump enabled=0
batch 0 batchsize 4
dump enabled=0
LayerDimensions{ inputPlanes=8 inputSize=19 numFilters=8 filterSize=5 outputSize=19 padZeros=1 biased=1 skip=0}
clblas teardown
instance: 3
LayerDimensions{ inputPlanes=8 inputSize=19 numFilters=8 filterSize=5 outputSize=19 padZeros=1 biased=1 skip=0}
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
initializing clblas
batch 0 batchsize 4
dump enabled=0
batch 0 batchsize 4
dump enabled=0
LayerDimensions{ inputPlanes=8 inputSize=19 numFilters=8 filterSize=5 outputSize=19 padZeros=1 biased=1 skip=0}
clblas teardown
instance: 4
LayerDimensions{ inputPlanes=8 inputSize=19 numFilters=8 filterSize=5 outputSize=19 padZeros=1 biased=1 skip=0}
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
initializing clblas
batch 0 batchsize 4
dump enabled=0
batch 0 batchsize 4
dump enabled=0
LayerDimensions{ inputPlanes=8 inputSize=19 numFilters=8 filterSize=5 outputSize=19 padZeros=1 biased=1 skip=0}
clblas teardown
instance: 6
LayerDimensions{ inputPlanes=8 inputSize=19 numFilters=8 filterSize=5 outputSize=19 padZeros=1 biased=1 skip=0}
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
initializing clblas
batch 0 batchsize 4
dump enabled=0
batch 0 batchsize 4
dump enabled=0
LayerDimensions{ inputPlanes=8 inputSize=19 numFilters=8 filterSize=5 outputSize=19 padZeros=1 biased=1 skip=0}
clblas teardown
instance: 7
LayerDimensions{ inputPlanes=8 inputSize=19 numFilters=8 filterSize=5 outputSize=19 padZeros=1 biased=1 skip=0}
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
initializing clblas
batch 0 batchsize 4
dump enabled=0
batch 0 batchsize 4
dump enabled=0
LayerDimensions{ inputPlanes=8 inputSize=19 numFilters=8 filterSize=5 outputSize=19 padZeros=1 biased=1 skip=0}
clblas teardown
[ OK ] testforward.compare_1_n_biased_pad (876 ms)
[ RUN ] testforward.compare_1_5_biased_nopad
LayerDimensions{ inputPlanes=8 inputSize=19 numFilters=8 filterSize=19 outputSize=1 padZeros=0 biased=1 skip=0}
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
initializing clblas
batch 0 batchsize 4
dump enabled=0
batch 0 batchsize 4
dump enabled=0
LayerDimensions{ inputPlanes=8 inputSize=19 numFilters=8 filterSize=19 outputSize=1 padZeros=0 biased=1 skip=0}
clblas teardown
[ OK ] testforward.compare_1_5_biased_nopad (141 ms)
[ RUN ] testforward.compare_1_4_fcscenario
LayerDimensions{ inputPlanes=10 inputSize=24 numFilters=10 filterSize=24 outputSize=1 padZeros=0 biased=1 skip=0}
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
initializing clblas
batch 0 batchsize 4
dump enabled=0
batch 0 batchsize 4
dump enabled=0
LayerDimensions{ inputPlanes=10 inputSize=24 numFilters=10 filterSize=24 outputSize=1 padZeros=0 biased=1 skip=0}
clblas teardown
[ OK ] testforward.compare_1_4_fcscenario (121 ms)
[ RUN ] testforward.compare_break1_0_1
LayerDimensions{ inputPlanes=1 inputSize=33 numFilters=1 filterSize=1 outputSize=33 padZeros=0 biased=0 skip=0}
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
initializing clblas
batch 0 batchsize 1
dump enabled=0
batch 0 batchsize 1
dump enabled=0
LayerDimensions{ inputPlanes=1 inputSize=33 numFilters=1 filterSize=1 outputSize=33 padZeros=0 biased=0 skip=0}
clblas teardown
[ OK ] testforward.compare_break1_0_1 (66 ms)
[ RUN ] testforward.compare_break1_0_4
LayerDimensions{ inputPlanes=1 inputSize=33 numFilters=1 filterSize=1 outputSize=33 padZeros=0 biased=0 skip=0}
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
initializing clblas
batch 0 batchsize 1
dump enabled=0
batch 0 batchsize 1
dump enabled=0
LayerDimensions{ inputPlanes=1 inputSize=33 numFilters=1 filterSize=1 outputSize=33 padZeros=0 biased=0 skip=0}
clblas teardown
[ OK ] testforward.compare_break1_0_4 (70 ms)
[ RUN ] testforward.comparespecific_break2
LayerDimensions{ inputPlanes=64 inputSize=19 numFilters=64 filterSize=19 outputSize=1 padZeros=0 biased=0 skip=0}
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
initializing clblas
batch 0 batchsize 4
dump enabled=0
batch 0 batchsize 4
dump enabled=0
LayerDimensions{ inputPlanes=64 inputSize=19 numFilters=64 filterSize=19 outputSize=1 padZeros=0 biased=0 skip=0}
clblas teardown
[ OK ] testforward.comparespecific_break2 (175 ms)
[ RUN ] testforward.softmax
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
output[0]=0.0320586
output[1]=0.0871443
output[2]=0.643914
output[3]=0.236883
loss 0.44019
loss 3.44019
loss 2.44019
loss 1.44019
[ OK ] testforward.softmax (2 ms)
[ RUN ] testforward.softmax_byplane
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
output[0]=0.0320586
output[1]=0.0871443
output[2]=0.643914
output[3]=0.236883
loss 0.44019
loss 3.44019
loss 2.44019
loss 1.44019
[ OK ] testforward.softmax_byplane (1 ms)
[ RUN ] testforward.crash_from_jm
-D gNumInputPlanes=32 -D gInputPlanes=32 -D gInputSize=28 -D gInputSizeSquared=784 -D gNumFilters=20 -D gFilterSize=28 -D gHalfFilterSize=14 -D gFilterSizeSquared=784 -D gNumOutputPlanes=20 -D gOutputPlanes=20 -D gOutputSize=1 -D gOutputSizeSquared=1 -D gPadZeros=0 -D gMargin=0 -D gEven=1 -D gSkip=0
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
dump enabled=0
[ OK ] testforward.crash_from_jm (160 ms)
[----------] 17 tests from testforward (3151 ms total)
[----------] 2 tests from testfilehelper
[ RUN ] testfilehelper.testfilehelper
[ OK ] testfilehelper.testfilehelper (4 ms)
[ RUN ] testfilehelper.testreadchunk
[ OK ] testfilehelper.testreadchunk (2 ms)
[----------] 2 tests from testfilehelper (6 ms total)
[----------] 12 tests from testsimpleconvolvenet
[ RUN ] testsimpleconvolvenet.imagesize1_planes2_filters2_unbiased_tanh
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
initializing clblas
forward try kernel 0
... not plausibly optimal, skipping
forward try kernel 1
... seems valid
ForwardAuto: kernel 1 0ms
calcGradWeights try kernel 0
... not plausibly optimal, skipping
calcGradWeights try kernel 1
... seems valid
BackpropWeightsAuto: kernel 1 0ms
loss, E, 0.141046
accuracy: 2/2 100%
forward try kernel 2
... seems valid
ForwardAuto: kernel 2 0ms
calcGradWeights try kernel 2
... seems valid
BackpropWeightsAuto: kernel 2 0ms
forward try kernel 3
... seems valid
ForwardAuto: kernel 3 0ms
calcGradWeights try kernel 3
options: -D gNumInputPlanes=1 -D gInputPlanes=1 -D gInputSize=1 -D gInputSizeSquared=1 -D gNumFilters=2 -D gFilterSize=1 -D gHalfFilterSize=0 -D gFilterSizeSquared=1 -D gNumOutputPlanes=2 -D gOutputPlanes=2 -D gOutputSize=1 -D gOutputSizeSquared=1 -D gPadZeros=0 -D gMargin=0 -D gEven=0 -D gSkip=0 -DgNumStripes=1 -DgInputStripeMarginRows=0 -DgInputStripeInnerNumRows=1 -DgInputStripeOuterNumRows=1 -DgInputStripeInnerSize=1 -DgInputStripeOuterSize=1 -DgInputStripeMarginSize=0 -DgOutputStripeNumRows=1 -DgOutputStripeSize=1
... seems valid
BackpropWeightsAuto: kernel 3 0ms
forward try kernel 4
... seems valid
ForwardAuto: kernel 4 0ms
calcGradWeights try kernel 4
... seems valid
BackpropWeightsAuto: kernel 4 144ms
forward try kernel 5
... seems valid
ForwardAuto: kernel 5 0ms
calcGradWeights kernel 0: cannot be used
calcGradWeights kernel 1 time: 0ms
calcGradWeights kernel 2 time: 0ms
calcGradWeights kernel 3 time: 0ms
calcGradWeights kernel 4 time: 144ms
calcGradWeights layer selected kernel 1
forward try kernel 6
... seems valid
ForwardAuto: kernel 6 0ms
forward try kernel 7
... seems valid
ForwardAuto: kernel 7 94ms
forward kernel 0: cannot be used
forward kernel 1 time: 0ms
forward kernel 2 time: 0ms
forward kernel 3 time: 0ms
forward kernel 4 time: 0ms
forward kernel 5 time: 0ms
forward kernel 6 time: 0ms
forward kernel 7 time: 94ms
forward layer selected kernel 1
loss, E, 0.0733091
accuracy: 2/2 100%
loss, E, 0.0426809
accuracy: 2/2 100%
loss, E, 0.0262453
accuracy: 2/2 100%
loss, E, 0.0164245
accuracy: 2/2 100%
loss, E, 0.0107573
accuracy: 2/2 100%
accuracy: 2/2
clblas teardown
[ OK ] testsimpleconvolvenet.imagesize1_planes2_filters2_unbiased_tanh (927 ms)
[ RUN ] testsimpleconvolvenet.imagesize1_planes2_filters2_tanh
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
initializing clblas
forward try kernel 0
... not plausibly optimal, skipping
forward try kernel 1
... seems valid
ForwardAuto: kernel 1 0ms
calcGradWeights try kernel 0
... not plausibly optimal, skipping
calcGradWeights try kernel 1
... seems valid
BackpropWeightsAuto: kernel 1 0ms
loss, E, 0.964924
accuracy: 1/2 50%
forward try kernel 2
... seems valid
ForwardAuto: kernel 2 0ms
calcGradWeights try kernel 2
... seems valid
BackpropWeightsAuto: kernel 2 0ms
forward try kernel 3
... seems valid
ForwardAuto: kernel 3 0ms
calcGradWeights try kernel 3
options: -D BIASED -D gNumInputPlanes=1 -D gInputPlanes=1 -D gInputSize=1 -D gInputSizeSquared=1 -D gNumFilters=2 -D gFilterSize=1 -D gHalfFilterSize=0 -D gFilterSizeSquared=1 -D gNumOutputPlanes=2 -D gOutputPlanes=2 -D gOutputSize=1 -D gOutputSizeSquared=1 -D gPadZeros=0 -D gMargin=0 -D gEven=0 -D gSkip=0 -DgNumStripes=1 -DgInputStripeMarginRows=0 -DgInputStripeInnerNumRows=1 -DgInputStripeOuterNumRows=1 -DgInputStripeInnerSize=1 -DgInputStripeOuterSize=1 -DgInputStripeMarginSize=0 -DgOutputStripeNumRows=1 -DgOutputStripeSize=1
... seems valid
BackpropWeightsAuto: kernel 3 0ms
forward try kernel 4
... seems valid
ForwardAuto: kernel 4 0ms
calcGradWeights try kernel 4
... seems valid
BackpropWeightsAuto: kernel 4 206ms
forward try kernel 5
... seems valid
ForwardAuto: kernel 5 0ms
calcGradWeights kernel 0: cannot be used
calcGradWeights kernel 1 time: 0ms
calcGradWeights kernel 2 time: 0ms
calcGradWeights kernel 3 time: 0ms
calcGradWeights kernel 4 time: 206ms
calcGradWeights layer selected kernel 1
forward try kernel 6
... seems valid
ForwardAuto: kernel 6 0ms
forward try kernel 7
... seems valid
ForwardAuto: kernel 7 97ms
forward kernel 0: cannot be used
forward kernel 1 time: 0ms
forward kernel 2 time: 0ms
forward kernel 3 time: 0ms
forward kernel 4 time: 0ms
forward kernel 5 time: 0ms
forward kernel 6 time: 0ms
forward kernel 7 time: 97ms
forward layer selected kernel 1
loss, E, 0.00570459
accuracy: 2/2 100%
loss, E, 1.34828e-05
accuracy: 2/2 100%
loss, E, 3.61852e-08
accuracy: 2/2 100%
accuracy: 2/2
clblas teardown
[ OK ] testsimpleconvolvenet.imagesize1_planes2_filters2_tanh (1033 ms)
[ RUN ] testsimpleconvolvenet.imagesize3_n4_filtersize3_tanh
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
initializing clblas
forward try kernel 0
... not plausibly optimal, skipping
forward try kernel 1
... seems valid
ForwardAuto: kernel 1 0ms
calcGradWeights try kernel 0
... not plausibly optimal, skipping
calcGradWeights try kernel 1
... seems valid
BackpropWeightsAuto: kernel 1 0ms
loss, E, 1.13283
accuracy: 3/4 75%
forward try kernel 2
... seems valid
ForwardAuto: kernel 2 0ms
calcGradWeights try kernel 2
... seems valid
BackpropWeightsAuto: kernel 2 0ms
forward try kernel 3
... seems valid
ForwardAuto: kernel 3 0ms
calcGradWeights try kernel 3
options: -D BIASED -D gNumInputPlanes=1 -D gInputPlanes=1 -D gInputSize=3 -D gInputSizeSquared=9 -D gNumFilters=2 -D gFilterSize=3 -D gHalfFilterSize=1 -D gFilterSizeSquared=9 -D gNumOutputPlanes=2 -D gOutputPlanes=2 -D gOutputSize=1 -D gOutputSizeSquared=1 -D gPadZeros=0 -D gMargin=0 -D gEven=0 -D gSkip=0 -DgNumStripes=1 -DgInputStripeMarginRows=2 -DgInputStripeInnerNumRows=3 -DgInputStripeOuterNumRows=7 -DgInputStripeInnerSize=9 -DgInputStripeOuterSize=21 -DgInputStripeMarginSize=6 -DgOutputStripeNumRows=1 -DgOutputStripeSize=1
... seems valid
BackpropWeightsAuto: kernel 3 0ms
forward try kernel 4
... seems valid
ForwardAuto: kernel 4 0ms
calcGradWeights try kernel 4
... seems valid
BackpropWeightsAuto: kernel 4 229ms
forward try kernel 5
... seems valid
ForwardAuto: kernel 5 0ms
calcGradWeights kernel 0: cannot be used
calcGradWeights kernel 1 time: 0ms
calcGradWeights kernel 2 time: 0ms
calcGradWeights kernel 3 time: 0ms
calcGradWeights kernel 4 time: 229ms
calcGradWeights layer selected kernel 1
forward try kernel 6
... seems valid
ForwardAuto: kernel 6 0ms
loss, E, 0.00996342
accuracy: 4/4 100%
forward try kernel 7
... seems valid
ForwardAuto: kernel 7 114ms
forward kernel 0: cannot be used
forward kernel 1 time: 0ms
forward kernel 2 time: 0ms
forward kernel 3 time: 0ms
forward kernel 4 time: 0ms
forward kernel 5 time: 0ms
forward kernel 6 time: 0ms
forward kernel 7 time: 114ms
forward layer selected kernel 1
loss, E, 4.70668e-05
accuracy: 4/4 100%
loss, E, 4.09802e-07
accuracy: 4/4 100%
accuracy: 4/4
clblas teardown
[ OK ] testsimpleconvolvenet.imagesize3_n4_filtersize3_tanh (1064 ms)
[ RUN ] testsimpleconvolvenet.imagesize1_2planes_filtersize1
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
initializing clblas
forward try kernel 0
... not plausibly optimal, skipping
forward try kernel 1
... seems valid
ForwardAuto: kernel 1 0ms
calcGradWeights try kernel 0
... not plausibly optimal, skipping
calcGradWeights try kernel 1
... seems valid
BackpropWeightsAuto: kernel 1 0ms
loss, E, 0.751601
accuracy: 2/2 100%
forward try kernel 2
... seems valid
ForwardAuto: kernel 2 0ms
calcGradWeights try kernel 2
... seems valid
BackpropWeightsAuto: kernel 2 0ms
forward try kernel 3
... seems valid
ForwardAuto: kernel 3 0ms
calcGradWeights try kernel 3
options: -D BIASED -D gNumInputPlanes=1 -D gInputPlanes=1 -D gInputSize=1 -D gInputSizeSquared=1 -D gNumFilters=2 -D gFilterSize=1 -D gHalfFilterSize=0 -D gFilterSizeSquared=1 -D gNumOutputPlanes=2 -D gOutputPlanes=2 -D gOutputSize=1 -D gOutputSizeSquared=1 -D gPadZeros=0 -D gMargin=0 -D gEven=0 -D gSkip=0 -DgNumStripes=1 -DgInputStripeMarginRows=0 -DgInputStripeInnerNumRows=1 -DgInputStripeOuterNumRows=1 -DgInputStripeInnerSize=1 -DgInputStripeOuterSize=1 -DgInputStripeMarginSize=0 -DgOutputStripeNumRows=1 -DgOutputStripeSize=1
... seems valid
BackpropWeightsAuto: kernel 3 0ms
forward try kernel 4
... seems valid
ForwardAuto: kernel 4 0ms
calcGradWeights try kernel 4
... seems valid
BackpropWeightsAuto: kernel 4 227ms
forward try kernel 5
... seems valid
ForwardAuto: kernel 5 0ms
calcGradWeights kernel 0: cannot be used
calcGradWeights kernel 1 time: 0ms
calcGradWeights kernel 2 time: 0ms
calcGradWeights kernel 3 time: 0ms
calcGradWeights kernel 4 time: 227ms
calcGradWeights layer selected kernel 1
forward try kernel 6
... seems valid
ForwardAuto: kernel 6 0ms
loss, E, 0.195916
accuracy: 2/2 100%
forward try kernel 7
... seems valid
ForwardAuto: kernel 7 101ms
forward kernel 0: cannot be used
forward kernel 1 time: 0ms
forward kernel 2 time: 0ms
forward kernel 3 time: 0ms
forward kernel 4 time: 0ms
forward kernel 5 time: 0ms
forward kernel 6 time: 0ms
forward kernel 7 time: 101ms
forward layer selected kernel 1
loss, E, 0.0679117
accuracy: 2/2 100%
loss, E, 0.023677
accuracy: 2/2 100%
loss, E, 0.00825563
accuracy: 2/2 100%
loss, E, 0.00287856
accuracy: 2/2 100%
loss, E, 0.00100369
accuracy: 2/2 100%
loss, E, 0.000349964
accuracy: 2/2 100%
accuracy: 2/2 100%
accuracy: 2/2
loss, E, 0.000150648
clblas teardown
[ OK ] testsimpleconvolvenet.imagesize1_2planes_filtersize1 (1025 ms)
[ RUN ] testsimpleconvolvenet.imagesize3_n4_filtersize3_relu
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
initializing clblas
forward try kernel 0
... not plausibly optimal, skipping
forward try kernel 1
... seems valid
ForwardAuto: kernel 1 0ms
calcGradWeights try kernel 0
... not plausibly optimal, skipping
calcGradWeights try kernel 1
... seems valid
BackpropWeightsAuto: kernel 1 0ms
loss, E, 1.48951
accuracy: 2/4 50%
forward try kernel 2
... seems valid
ForwardAuto: kernel 2 0ms
calcGradWeights try kernel 2
... seems valid
BackpropWeightsAuto: kernel 2 0ms
forward try kernel 3
... seems valid
ForwardAuto: kernel 3 0ms
calcGradWeights try kernel 3
options: -D BIASED -D gNumInputPlanes=1 -D gInputPlanes=1 -D gInputSize=3 -D gInputSizeSquared=9 -D gNumFilters=2 -D gFilterSize=3 -D gHalfFilterSize=1 -D gFilterSizeSquared=9 -D gNumOutputPlanes=2 -D gOutputPlanes=2 -D gOutputSize=1 -D gOutputSizeSquared=1 -D gPadZeros=0 -D gMargin=0 -D gEven=0 -D gSkip=0 -DgNumStripes=1 -DgInputStripeMarginRows=2 -DgInputStripeInnerNumRows=3 -DgInputStripeOuterNumRows=7 -DgInputStripeInnerSize=9 -DgInputStripeOuterSize=21 -DgInputStripeMarginSize=6 -DgOutputStripeNumRows=1 -DgOutputStripeSize=1
... seems valid
BackpropWeightsAuto: kernel 3 0ms
forward try kernel 4
... seems valid
ForwardAuto: kernel 4 0ms
calcGradWeights try kernel 4
... seems valid
BackpropWeightsAuto: kernel 4 213ms
forward try kernel 5
... seems valid
ForwardAuto: kernel 5 0ms
calcGradWeights kernel 0: cannot be used
calcGradWeights kernel 1 time: 0ms
calcGradWeights kernel 2 time: 0ms
calcGradWeights kernel 3 time: 0ms
calcGradWeights kernel 4 time: 213ms
calcGradWeights layer selected kernel 1
forward try kernel 6
... seems valid
ForwardAuto: kernel 6 0ms
loss, E, 1.12957
accuracy: 2/4 50%
forward try kernel 7
... seems valid
ForwardAuto: kernel 7 112ms
forward kernel 0: cannot be used
forward kernel 1 time: 0ms
forward kernel 2 time: 0ms
forward kernel 3 time: 0ms
forward kernel 4 time: 0ms
forward kernel 5 time: 0ms
forward kernel 6 time: 0ms
forward kernel 7 time: 112ms
forward layer selected kernel 1
loss, E, 0.070782
accuracy: 4/4 100%
loss, E, 0.003026
accuracy: 4/4 100%
loss, E, 0.00021158
accuracy: 4/4 100%
loss, E, 1.96858e-05
accuracy: 4/4 100%
loss, E, 2.03002e-06
accuracy: 4/4 100%
loss, E, 2.15572e-07
accuracy: 4/4 100%
loss, E, 2.3083e-08
accuracy: 4/4 100%
loss, E, 2.48239e-09
accuracy: 4/4 100%
loss, E, 4.14442e-10
accuracy: 4/4 100%
accuracy: 4/4
loss, E, 4.14442e-10
clblas teardown
[ OK ] testsimpleconvolvenet.imagesize3_n4_filtersize3_relu (1085 ms)
[ RUN ] testsimpleconvolvenet.imagesize3_n4_filtersize3_linear
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
initializing clblas
forward try kernel 0
... not plausibly optimal, skipping
forward try kernel 1
... seems valid
ForwardAuto: kernel 1 0ms
calcGradWeights try kernel 0
... not plausibly optimal, skipping
calcGradWeights try kernel 1
... seems valid
BackpropWeightsAuto: kernel 1 0ms
loss, E, 0.50604
accuracy: 4/4 100%
forward try kernel 2
... seems valid
ForwardAuto: kernel 2 0ms
calcGradWeights try kernel 2
... seems valid
BackpropWeightsAuto: kernel 2 0ms
forward try kernel 3
... seems valid
ForwardAuto: kernel 3 0ms
calcGradWeights try kernel 3
options: -D BIASED -D gNumInputPlanes=1 -D gInputPlanes=1 -D gInputSize=3 -D gInputSizeSquared=9 -D gNumFilters=2 -D gFilterSize=3 -D gHalfFilterSize=1 -D gFilterSizeSquared=9 -D gNumOutputPlanes=2 -D gOutputPlanes=2 -D gOutputSize=1 -D gOutputSizeSquared=1 -D gPadZeros=0 -D gMargin=0 -D gEven=0 -D gSkip=0 -DgNumStripes=1 -DgInputStripeMarginRows=2 -DgInputStripeInnerNumRows=3 -DgInputStripeOuterNumRows=7 -DgInputStripeInnerSize=9 -DgInputStripeOuterSize=21 -DgInputStripeMarginSize=6 -DgOutputStripeNumRows=1 -DgOutputStripeSize=1
... seems valid
BackpropWeightsAuto: kernel 3 0ms
forward try kernel 4
... seems valid
ForwardAuto: kernel 4 0ms
calcGradWeights try kernel 4
... seems valid
BackpropWeightsAuto: kernel 4 229ms
forward try kernel 5
... seems valid
ForwardAuto: kernel 5 0ms
calcGradWeights kernel 0: cannot be used
calcGradWeights kernel 1 time: 0ms
calcGradWeights kernel 2 time: 0ms
calcGradWeights kernel 3 time: 0ms
calcGradWeights kernel 4 time: 229ms
calcGradWeights layer selected kernel 1
forward try kernel 6
... seems valid
ForwardAuto: kernel 6 0ms
loss, E, 0.0565529
accuracy: 4/4 100%
forward try kernel 7
... seems valid
ForwardAuto: kernel 7 115ms
forward kernel 0: cannot be used
forward kernel 1 time: 0ms
forward kernel 2 time: 0ms
forward kernel 3 time: 0ms
forward kernel 4 time: 0ms
forward kernel 5 time: 0ms
forward kernel 6 time: 0ms
forward kernel 7 time: 115ms
forward layer selected kernel 1
loss, E, 0.00777245
accuracy: 4/4 100%
loss, E, 0.00106831
accuracy: 4/4 100%
loss, E, 0.000218376
accuracy: 4/4 100%
accuracy: 4/4
loss, E, 0.000218376
clblas teardown
[ OK ] testsimpleconvolvenet.imagesize3_n4_filtersize3_linear (1025 ms)
[ RUN ] testsimpleconvolvenet.imagesize1_n2_2layers_unbiased
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
initializing clblas
forward try kernel 0
... not plausibly optimal, skipping
forward try kernel 1
... seems valid
ForwardAuto: kernel 1 0ms
forward try kernel 0
... not plausibly optimal, skipping
forward try kernel 1
... seems valid
ForwardAuto: kernel 1 1ms
backward try kernel 0
... not plausibly optimal, skipping
backward try kernel 1
... seems valid
BackwardAuto: kernel 1 0ms
calcGradWeights try kernel 0
... not plausibly optimal, skipping
calcGradWeights try kernel 1
... seems valid
BackpropWeightsAuto: kernel 1 0ms
calcGradWeights try kernel 0
... not plausibly optimal, skipping
calcGradWeights try kernel 1
... seems valid
BackpropWeightsAuto: kernel 1 0ms
epoch 0 loss, E, 0.0559531
forward try kernel 2
... seems valid
ForwardAuto: kernel 2 0ms
forward try kernel 2
... seems valid
ForwardAuto: kernel 2 0ms
backward try kernel 2
... seems valid
BackwardAuto: kernel 2 0ms
calcGradWeights try kernel 2
... seems valid
BackpropWeightsAuto: kernel 2 0ms
calcGradWeights try kernel 2
... seems valid
BackpropWeightsAuto: kernel 2 0ms
epoch 1 loss, E, 0.0254554
forward try kernel 3
... seems valid
ForwardAuto: kernel 3 0ms
forward try kernel 3
... seems valid
ForwardAuto: kernel 3 0ms
backward try kernel 3
... seems valid
BackwardAuto: kernel 3 97ms
calcGradWeights try kernel 3
options: -D BIASED -D gNumInputPlanes=2 -D gInputPlanes=2 -D gInputSize=1 -D gInputSizeSquared=1 -D gNumFilters=2 -D gFilterSize=1 -D gHalfFilterSize=0 -D gFilterSizeSquared=1 -D gNumOutputPlanes=2 -D gOutputPlanes=2 -D gOutputSize=1 -D gOutputSizeSquared=1 -D gPadZeros=0 -D gMargin=0 -D gEven=0 -D gSkip=0 -DgNumStripes=1 -DgInputStripeMarginRows=0 -DgInputStripeInnerNumRows=1 -DgInputStripeOuterNumRows=1 -DgInputStripeInnerSize=1 -DgInputStripeOuterSize=1 -DgInputStripeMarginSize=0 -DgOutputStripeNumRows=1 -DgOutputStripeSize=1
... seems valid
BackpropWeightsAuto: kernel 3 0ms
calcGradWeights try kernel 3
options: -D BIASED -D gNumInputPlanes=1 -D gInputPlanes=1 -D gInputSize=1 -D gInputSizeSquared=1 -D gNumFilters=2 -D gFilterSize=1 -D gHalfFilterSize=0 -D gFilterSizeSquared=1 -D gNumOutputPlanes=2 -D gOutputPlanes=2 -D gOutputSize=1 -D gOutputSizeSquared=1 -D gPadZeros=0 -D gMargin=0 -D gEven=0 -D gSkip=0 -DgNumStripes=1 -DgInputStripeMarginRows=0 -DgInputStripeInnerNumRows=1 -DgInputStripeOuterNumRows=1 -DgInputStripeInnerSize=1 -DgInputStripeOuterSize=1 -DgInputStripeMarginSize=0 -DgOutputStripeNumRows=1 -DgOutputStripeSize=1
... seems valid
BackpropWeightsAuto: kernel 3 0ms
epoch 2 loss, E, 0.0172943
forward try kernel 4
... seems valid
ForwardAuto: kernel 4 0ms
forward try kernel 4
... seems valid
ForwardAuto: kernel 4 0ms
backward kernel 0: cannot be used
backward kernel 1 time: 0ms
backward kernel 2 time: 0ms
backward kernel 3 time: 97ms
backward layer selected kernel 1
calcGradWeights try kernel 4
... seems valid
BackpropWeightsAuto: kernel 4 201ms
calcGradWeights try kernel 4
... seems valid
BackpropWeightsAuto: kernel 4 34ms
epoch 3 loss, E, 0.0138013
forward try kernel 5
... seems valid
ForwardAuto: kernel 5 0ms
forward try kernel 5
... seems valid
ForwardAuto: kernel 5 0ms
calcGradWeights kernel 0: cannot be used
calcGradWeights kernel 1 time: 0ms
calcGradWeights kernel 2 time: 0ms
calcGradWeights kernel 3 time: 0ms
calcGradWeights kernel 4 time: 201ms
calcGradWeights layer selected kernel 1
calcGradWeights kernel 0: cannot be used
calcGradWeights kernel 1 time: 0ms
calcGradWeights kernel 2 time: 0ms
calcGradWeights kernel 3 time: 0ms
calcGradWeights kernel 4 time: 34ms
calcGradWeights layer selected kernel 1
epoch 4 loss, E, 0.0115848
forward try kernel 6
... seems valid
ForwardAuto: kernel 6 0ms
forward try kernel 6
... seems valid
ForwardAuto: kernel 6 2ms
epoch 5 loss, E, 0.00987036
forward try kernel 7
... seems valid
ForwardAuto: kernel 7 93ms
forward try kernel 7
... seems valid
ForwardAuto: kernel 7 33ms
epoch 6 loss, E, 0.00844797
forward kernel 0: cannot be used
forward kernel 1 time: 0ms
forward kernel 2 time: 0ms
forward kernel 3 time: 0ms
forward kernel 4 time: 0ms
forward kernel 5 time: 0ms
forward kernel 6 time: 0ms
forward kernel 7 time: 93ms
forward layer selected kernel 1
forward kernel 0: cannot be used
forward kernel 1 time: 1ms
forward kernel 2 time: 0ms
forward kernel 3 time: 0ms
forward kernel 4 time: 0ms
forward kernel 5 time: 0ms
forward kernel 6 time: 2ms
forward kernel 7 time: 33ms
forward layer selected kernel 2
epoch 7 loss, E, 0.00724182
epoch 8 loss, E, 0.00621212
epoch 9 loss, E, 0.00533106
epoch 10 loss, E, 0.00457645
epoch 11 loss, E, 0.00392979
epoch 12 loss, E, 0.00337539
epoch 13 loss, E, 0.00289992
epoch 14 loss, E, 0.002492
epoch 15 loss, E, 0.00214191
epoch 16 loss, E, 0.00184138
epoch 17 loss, E, 0.00158331
epoch 18 loss, E, 0.00136164
epoch 19 loss, E, 0.0011712
epoch 20 loss, E, 0.00100754
epoch 21 loss, E, 0.000866877
epoch 22 loss, E, 0.000745946
epoch 23 loss, E, 0.000641966
epoch 24 loss, E, 0.000552543
epoch 25 loss, E, 0.000475625
epoch 26 loss, E, 0.000409454
epoch 27 loss, E, 0.000352522
epoch 28 loss, E, 0.000303531
epoch 29 loss, E, 0.00026137
epoch 30 loss, E, 0.000225082
epoch 31 loss, E, 0.000193845
epoch 32 loss, E, 0.000166954
epoch 33 loss, E, 0.000143801
epoch 34 loss, E, 0.000123866
epoch 35 loss, E, 0.000106699
epoch 36 loss, E, 9.19176e-05
epoch 37 loss, E, 7.91864e-05
epoch 38 loss, E, 6.82211e-05
epoch 39 loss, E, 5.87767e-05
layer 0:InputLayer{ outputPlanes=1 outputSize=1 }
layer 1:ConvolutionalLayer{ LayerDimensions{ inputPlanes=1 inputSize=1 numFilters=2 filterSize=1 outputSize=1 padZeros=0 biased=1 skip=0} }
layer 2:ActivationLayer{ RELU }
layer 3:ConvolutionalLayer{ LayerDimensions{ inputPlanes=2 inputSize=1 numFilters=2 filterSize=1 outputSize=1 padZeros=0 biased=1 skip=0} }
layer 4:SquareLossLayer{}
Parameters overview: (skipping 3 layers with 0 params)
layer 1: params=4 40.0%
layer 3: params=6 60.0%
TOTAL : params=10
loss, E, 5.87767e-05
accuracy: 2/2 100%
accuracy: 2/2
loss, E, 5.87767e-05
loss, E, 5.87767e-05
layer 0:InputLayer{ outputPlanes=1 outputSize=1 }
layer 1:ConvolutionalLayer{ LayerDimensions{ inputPlanes=1 inputSize=1 numFilters=2 filterSize=1 outputSize=1 padZeros=0 biased=1 skip=0} }
layer 2:ActivationLayer{ RELU }
layer 3:ConvolutionalLayer{ LayerDimensions{ inputPlanes=2 inputSize=1 numFilters=2 filterSize=1 outputSize=1 padZeros=0 biased=1 skip=0} }
layer 4:SquareLossLayer{}
Parameters overview: (skipping 3 layers with 0 params)
layer 1: params=4 40.0%
layer 3: params=6 60.0%
TOTAL : params=10
float weights1[] = {-0.303866f, -1.59823f};
float weights3[] = {0.426358f, -0.719592f, -0.420361f, 0.719566f};
float bias1[] = {-0.324465f, 0.60279f};
float bias3[] = {0.506862f, -0.506837f};
clblas teardown
[ OK ] testsimpleconvolvenet.imagesize1_n2_2layers_unbiased (1727 ms)
[ RUN ] testsimpleconvolvenet.imagesize1_n2_2layers_biased
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
initializing clblas
forward try kernel 0
... not plausibly optimal, skipping
forward try kernel 1
... seems valid
ForwardAuto: kernel 1 0ms
forward try kernel 0
... not plausibly optimal, skipping
forward try kernel 1
... seems valid
ForwardAuto: kernel 1 0ms
backward try kernel 0
... not plausibly optimal, skipping
backward try kernel 1
... seems valid
BackwardAuto: kernel 1 0ms
calcGradWeights try kernel 0
... not plausibly optimal, skipping
calcGradWeights try kernel 1
... seems valid
BackpropWeightsAuto: kernel 1 1ms
calcGradWeights try kernel 0
... not plausibly optimal, skipping
calcGradWeights try kernel 1
... seems valid
BackpropWeightsAuto: kernel 1 0ms
loss, E, 1.19067
forward try kernel 2
... seems valid
ForwardAuto: kernel 2 0ms
forward try kernel 2
... seems valid
ForwardAuto: kernel 2 0ms
backward try kernel 2
... seems valid
BackwardAuto: kernel 2 0ms
calcGradWeights try kernel 2
... seems valid
BackpropWeightsAuto: kernel 2 0ms
calcGradWeights try kernel 2
... seems valid
BackpropWeightsAuto: kernel 2 0ms
forward try kernel 3
... seems valid
ForwardAuto: kernel 3 0ms
forward try kernel 3
... seems valid
ForwardAuto: kernel 3 0ms
backward try kernel 3
... seems valid
BackwardAuto: kernel 3 102ms
calcGradWeights try kernel 3
options: -D BIASED -D gNumInputPlanes=2 -D gInputPlanes=2 -D gInputSize=1 -D gInputSizeSquared=1 -D gNumFilters=2 -D gFilterSize=1 -D gHalfFilterSize=0 -D gFilterSizeSquared=1 -D gNumOutputPlanes=2 -D gOutputPlanes=2 -D gOutputSize=1 -D gOutputSizeSquared=1 -D gPadZeros=0 -D gMargin=0 -D gEven=0 -D gSkip=0 -DgNumStripes=1 -DgInputStripeMarginRows=0 -DgInputStripeInnerNumRows=1 -DgInputStripeOuterNumRows=1 -DgInputStripeInnerSize=1 -DgInputStripeOuterSize=1 -DgInputStripeMarginSize=0 -DgOutputStripeNumRows=1 -DgOutputStripeSize=1
... seems valid
BackpropWeightsAuto: kernel 3 0ms
calcGradWeights try kernel 3
options: -D BIASED -D gNumInputPlanes=1 -D gInputPlanes=1 -D gInputSize=1 -D gInputSizeSquared=1 -D gNumFilters=2 -D gFilterSize=1 -D gHalfFilterSize=0 -D gFilterSizeSquared=1 -D gNumOutputPlanes=2 -D gOutputPlanes=2 -D gOutputSize=1 -D gOutputSizeSquared=1 -D gPadZeros=0 -D gMargin=0 -D gEven=0 -D gSkip=0 -DgNumStripes=1 -DgInputStripeMarginRows=0 -DgInputStripeInnerNumRows=1 -DgInputStripeOuterNumRows=1 -DgInputStripeInnerSize=1 -DgInputStripeOuterSize=1 -DgInputStripeMarginSize=0 -DgOutputStripeNumRows=1 -DgOutputStripeSize=1
... seems valid
BackpropWeightsAuto: kernel 3 0ms
forward try kernel 4
... seems valid
ForwardAuto: kernel 4 0ms
forward try kernel 4
... seems valid
ForwardAuto: kernel 4 0ms
backward kernel 0: cannot be used
backward kernel 1 time: 0ms
backward kernel 2 time: 0ms
backward kernel 3 time: 102ms
backward layer selected kernel 1
calcGradWeights try kernel 4
... seems valid
BackpropWeightsAuto: kernel 4 195ms
calcGradWeights try kernel 4
... seems valid
BackpropWeightsAuto: kernel 4 36ms
forward try kernel 5
... seems valid
ForwardAuto: kernel 5 0ms
forward try kernel 5
... seems valid
ForwardAuto: kernel 5 0ms
calcGradWeights kernel 0: cannot be used
calcGradWeights kernel 1 time: 1ms
calcGradWeights kernel 2 time: 0ms
calcGradWeights kernel 3 time: 0ms
calcGradWeights kernel 4 time: 195ms
calcGradWeights layer selected kernel 2
calcGradWeights kernel 0: cannot be used
calcGradWeights kernel 1 time: 0ms
calcGradWeights kernel 2 time: 0ms
calcGradWeights kernel 3 time: 0ms
calcGradWeights kernel 4 time: 36ms
calcGradWeights layer selected kernel 1
forward try kernel 6
... seems valid
ForwardAuto: kernel 6 0ms
forward try kernel 6
... seems valid
ForwardAuto: kernel 6 0ms
loss, E, 0.0667568
forward try kernel 7
... seems valid
ForwardAuto: kernel 7 93ms
forward try kernel 7
... seems valid
ForwardAuto: kernel 7 33ms
forward kernel 0: cannot be used
forward kernel 1 time: 0ms
forward kernel 2 time: 0ms
forward kernel 3 time: 0ms
forward kernel 4 time: 0ms
forward kernel 5 time: 0ms
forward kernel 6 time: 0ms
forward kernel 7 time: 93ms
forward layer selected kernel 1
forward kernel 0: cannot be used
forward kernel 1 time: 0ms
forward kernel 2 time: 0ms
forward kernel 3 time: 0ms
forward kernel 4 time: 0ms
forward kernel 5 time: 0ms
forward kernel 6 time: 0ms
forward kernel 7 time: 33ms
forward layer selected kernel 1
loss, E, 0.00923595
loss, E, 0.00112611
loss, E, 0.0001174
loss, E, 1.15642e-05
dump enabled=0
loss, E, 1.78564e-06
accuracy: 2/2 100%
accuracy: 2/2
loss, E, 1.78564e-06
clblas teardown
[ OK ] testsimpleconvolvenet.imagesize1_n2_2layers_biased (1593 ms)
[ RUN ] testsimpleconvolvenet.imagesize_5_4_2layers_filtersize_2_4_biased_n3
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
initializing clblas
forward try kernel 0
... not plausibly optimal, skipping
forward try kernel 1
... seems valid
ForwardAuto: kernel 1 0ms
forward try kernel 0
... not plausibly optimal, skipping
forward try kernel 1
... seems valid
ForwardAuto: kernel 1 0ms
backward try kernel 0
... not plausibly optimal, skipping
backward try kernel 1
... seems valid
BackwardAuto: kernel 1 0ms
calcGradWeights try kernel 0
... not plausibly optimal, skipping
calcGradWeights try kernel 1
... seems valid
BackpropWeightsAuto: kernel 1 0ms
calcGradWeights try kernel 0
... not plausibly optimal, skipping
calcGradWeights try kernel 1
... seems valid
BackpropWeightsAuto: kernel 1 0ms
loss, E, 2.45455
forward try kernel 2
... seems valid
ForwardAuto: kernel 2 0ms
forward try kernel 2
... seems valid
ForwardAuto: kernel 2 0ms
backward try kernel 2
... seems valid
BackwardAuto: kernel 2 0ms
calcGradWeights try kernel 2
... seems valid
BackpropWeightsAuto: kernel 2 0ms
calcGradWeights try kernel 2
... seems valid
BackpropWeightsAuto: kernel 2 0ms
forward try kernel 3
... seems valid
ForwardAuto: kernel 3 0ms
forward try kernel 3
... seems valid
ForwardAuto: kernel 3 0ms
backward try kernel 3
... seems valid
BackwardAuto: kernel 3 110ms
calcGradWeights try kernel 3
options: -D BIASED -D gNumInputPlanes=3 -D gInputPlanes=3 -D gInputSize=4 -D gInputSizeSquared=16 -D gNumFilters=3 -D gFilterSize=4 -D gHalfFilterSize=2 -D gFilterSizeSquared=16 -D gNumOutputPlanes=3 -D gOutputPlanes=3 -D gOutputSize=1 -D gOutputSizeSquared=1 -D gPadZeros=0 -D gMargin=0 -D gEven=1 -D gSkip=0 -DgNumStripes=1 -DgInputStripeMarginRows=3 -DgInputStripeInnerNumRows=4 -DgInputStripeOuterNumRows=10 -DgInputStripeInnerSize=16 -DgInputStripeOuterSize=40 -DgInputStripeMarginSize=12 -DgOutputStripeNumRows=1 -DgOutputStripeSize=1
... seems valid
BackpropWeightsAuto: kernel 3 0ms
calcGradWeights try kernel 3
options: -D BIASED -D gNumInputPlanes=1 -D gInputPlanes=1 -D gInputSize=5 -D gInputSizeSquared=25 -D gNumFilters=3 -D gFilterSize=2 -D gHalfFilterSize=1 -D gFilterSizeSquared=4 -D gNumOutputPlanes=3 -D gOutputPlanes=3 -D gOutputSize=4 -D gOutputSizeSquared=16 -D gPadZeros=0 -D gMargin=0 -D gEven=1 -D gSkip=0 -DgNumStripes=1 -DgInputStripeMarginRows=1 -DgInputStripeInnerNumRows=5 -DgInputStripeOuterNumRows=7 -DgInputStripeInnerSize=25 -DgInputStripeOuterSize=35 -DgInputStripeMarginSize=5 -DgOutputStripeNumRows=4 -DgOutputStripeSize=16
... seems valid
BackpropWeightsAuto: kernel 3 0ms
forward try kernel 4
... seems valid
ForwardAuto: kernel 4 0ms
forward try kernel 4
... seems valid
ForwardAuto: kernel 4 0ms
backward kernel 0: cannot be used
backward kernel 1 time: 0ms
backward kernel 2 time: 0ms
backward kernel 3 time: 110ms
backward layer selected kernel 1
calcGradWeights try kernel 4
... seems valid
BackpropWeightsAuto: kernel 4 205ms
calcGradWeights try kernel 4
... seems valid
BackpropWeightsAuto: kernel 4 193ms
forward try kernel 5
ForwardAuto: kernel 5: this instance cant be used: For ForwardFc, filtersize and inputimagesize must be identical
... not valid
forward try kernel 6
... seems valid
ForwardAuto: kernel 6 0ms
forward try kernel 5
... seems valid
ForwardAuto: kernel 5 0ms
calcGradWeights kernel 0: cannot be used
calcGradWeights kernel 1 time: 0ms
calcGradWeights kernel 2 time: 0ms
calcGradWeights kernel 3 time: 0ms
calcGradWeights kernel 4 time: 205ms
calcGradWeights layer selected kernel 1
calcGradWeights kernel 0: cannot be used
calcGradWeights kernel 1 time: 0ms
calcGradWeights kernel 2 time: 0ms
calcGradWeights kernel 3 time: 0ms
calcGradWeights kernel 4 time: 193ms
calcGradWeights layer selected kernel 1
forward try kernel 7
... seems valid
ForwardAuto: kernel 7 110ms
forward try kernel 6
... seems valid
ForwardAuto: kernel 6 0ms
forward kernel 0: cannot be used
forward kernel 1 time: 0ms
forward kernel 2 time: 0ms
forward kernel 3 time: 0ms
forward kernel 4 time: 0ms
forward kernel 5: cannot be used
forward kernel 6 time: 0ms
forward kernel 7 time: 110ms
forward layer selected kernel 1
forward try kernel 7
... seems valid
ForwardAuto: kernel 7 103ms
forward kernel 0: cannot be used
forward kernel 1 time: 0ms
forward kernel 2 time: 0ms
forward kernel 3 time: 0ms
forward kernel 4 time: 0ms
forward kernel 5 time: 0ms
forward kernel 6 time: 0ms
forward kernel 7 time: 103ms
forward layer selected kernel 1
loss, E, 0.000668798
loss, E, 8.79736e-08
loss, E, 4.64206e-11
loss, E, 3.85469e-13
loss, E, 1.32339e-13
loss, E, 1.14131e-13
loss, E, 8.9706e-14
loss, E, 6.83897e-14
loss, E, 6.83897e-14
loss, E, 6.83897e-14
accuracy: 3/3 100%
accuracy: 3/3
loss, E, 6.83897e-14
clblas teardown
[ OK ] testsimpleconvolvenet.imagesize_5_4_2layers_filtersize_2_4_biased_n3 (3364 ms)
[ RUN ] testsimpleconvolvenet.imagesize_5_4_2layers_filtersize_2_4_biased_n6
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
initializing clblas
forward try kernel 0
... not plausibly optimal, skipping
forward try kernel 1
... seems valid
ForwardAuto: kernel 1 0ms
forward try kernel 0
... not plausibly optimal, skipping
forward try kernel 1
... seems valid
ForwardAuto: kernel 1 0ms
backward try kernel 0
... not plausibly optimal, skipping
backward try kernel 1
... seems valid
BackwardAuto: kernel 1 0ms
calcGradWeights try kernel 0
... not plausibly optimal, skipping
calcGradWeights try kernel 1
... seems valid
BackpropWeightsAuto: kernel 1 0ms
calcGradWeights try kernel 0
... not plausibly optimal, skipping
calcGradWeights try kernel 1
... seems valid
BackpropWeightsAuto: kernel 1 0ms
loss, E, 3.64011
forward try kernel 2
... seems valid
ForwardAuto: kernel 2 0ms
forward try kernel 2
... seems valid
ForwardAuto: kernel 2 0ms
backward try kernel 2
... seems valid
BackwardAuto: kernel 2 0ms
calcGradWeights try kernel 2
... seems valid
BackpropWeightsAuto: kernel 2 0ms
calcGradWeights try kernel 2
... seems valid
BackpropWeightsAuto: kernel 2 0ms
forward try kernel 3
... seems valid
ForwardAuto: kernel 3 0ms
forward try kernel 3
... seems valid
ForwardAuto: kernel 3 0ms
backward try kernel 3
... seems valid
BackwardAuto: kernel 3 116ms
calcGradWeights try kernel 3
options: -D BIASED -D gNumInputPlanes=3 -D gInputPlanes=3 -D gInputSize=4 -D gInputSizeSquared=16 -D gNumFilters=3 -D gFilterSize=4 -D gHalfFilterSize=2 -D gFilterSizeSquared=16 -D gNumOutputPlanes=3 -D gOutputPlanes=3 -D gOutputSize=1 -D gOutputSizeSquared=1 -D gPadZeros=0 -D gMargin=0 -D gEven=1 -D gSkip=0 -DgNumStripes=1 -DgInputStripeMarginRows=3 -DgInputStripeInnerNumRows=4 -DgInputStripeOuterNumRows=10 -DgInputStripeInnerSize=16 -DgInputStripeOuterSize=40 -DgInputStripeMarginSize=12 -DgOutputStripeNumRows=1 -DgOutputStripeSize=1
... seems valid
BackpropWeightsAuto: kernel 3 0ms
calcGradWeights try kernel 3
options: -D BIASED -D gNumInputPlanes=1 -D gInputPlanes=1 -D gInputSize=5 -D gInputSizeSquared=25 -D gNumFilters=3 -D gFilterSize=2 -D gHalfFilterSize=1 -D gFilterSizeSquared=4 -D gNumOutputPlanes=3 -D gOutputPlanes=3 -D gOutputSize=4 -D gOutputSizeSquared=16 -D gPadZeros=0 -D gMargin=0 -D gEven=1 -D gSkip=0 -DgNumStripes=1 -DgInputStripeMarginRows=1 -DgInputStripeInnerNumRows=5 -DgInputStripeOuterNumRows=7 -DgInputStripeInnerSize=25 -DgInputStripeOuterSize=35 -DgInputStripeMarginSize=5 -DgOutputStripeNumRows=4 -DgOutputStripeSize=16
... seems valid
BackpropWeightsAuto: kernel 3 0ms
forward try kernel 4
... seems valid
ForwardAuto: kernel 4 5ms
forward try kernel 4
... seems valid
ForwardAuto: kernel 4 0ms
backward kernel 0: cannot be used
backward kernel 1 time: 0ms
backward kernel 2 time: 0ms
backward kernel 3 time: 116ms
backward layer selected kernel 1
calcGradWeights try kernel 4
... seems valid
BackpropWeightsAuto: kernel 4 238ms
calcGradWeights try kernel 4
... seems valid
BackpropWeightsAuto: kernel 4 213ms
forward try kernel 5
ForwardAuto: kernel 5: this instance cant be used: For ForwardFc, filtersize and inputimagesize must be identical
... not valid
forward try kernel 6
... seems valid
ForwardAuto: kernel 6 0ms
forward try kernel 5
... seems valid
ForwardAuto: kernel 5 0ms
calcGradWeights kernel 0: cannot be used
calcGradWeights kernel 1 time: 0ms
calcGradWeights kernel 2 time: 0ms
calcGradWeights kernel 3 time: 0ms
calcGradWeights kernel 4 time: 238ms
calcGradWeights layer selected kernel 1
calcGradWeights kernel 0: cannot be used
calcGradWeights kernel 1 time: 0ms
calcGradWeights kernel 2 time: 0ms
calcGradWeights kernel 3 time: 0ms
calcGradWeights kernel 4 time: 213ms
calcGradWeights layer selected kernel 1
forward try kernel 7
... seems valid
ForwardAuto: kernel 7 115ms
forward try kernel 6
... seems valid
ForwardAuto: kernel 6 0ms
forward kernel 0: cannot be used
forward kernel 1 time: 0ms
forward kernel 2 time: 0ms
forward kernel 3 time: 0ms
forward kernel 4 time: 5ms
forward kernel 5: cannot be used
forward kernel 6 time: 0ms
forward kernel 7 time: 115ms
forward layer selected kernel 1
forward try kernel 7
... seems valid
ForwardAuto: kernel 7 103ms
forward kernel 0: cannot be used
forward kernel 1 time: 0ms
forward kernel 2 time: 0ms
forward kernel 3 time: 0ms
forward kernel 4 time: 0ms
forward kernel 5 time: 0ms
forward kernel 6 time: 0ms
forward kernel 7 time: 103ms
forward layer selected kernel 1
loss, E, 4.13952e-10
loss, E, 2.13163e-14
loss, E, 1.77636e-14
loss, E, 1.68754e-14
loss, E, 8.88178e-15
accuracy: 6/6 100%
accuracy: 6/6
loss, E, 8.88178e-15
clblas teardown
[ OK ] testsimpleconvolvenet.imagesize_5_4_2layers_filtersize_2_4_biased_n6 (2886 ms)
[ RUN ] testsimpleconvolvenet.imagesize_5_3_2layers_filtersize_3_3_biased_n6
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
initializing clblas
forward try kernel 0
... not plausibly optimal, skipping
forward try kernel 1
... seems valid
ForwardAuto: kernel 1 0ms
forward try kernel 0
... not plausibly optimal, skipping
forward try kernel 1
... seems valid
ForwardAuto: kernel 1 0ms
backward try kernel 0
... not plausibly optimal, skipping
backward try kernel 1
... seems valid
BackwardAuto: kernel 1 0ms
calcGradWeights try kernel 0
... not plausibly optimal, skipping
calcGradWeights try kernel 1
... seems valid
BackpropWeightsAuto: kernel 1 0ms
calcGradWeights try kernel 0
... not plausibly optimal, skipping
calcGradWeights try kernel 1
... seems valid
BackpropWeightsAuto: kernel 1 0ms
loss, E, 4.00796
forward try kernel 2
... seems valid
ForwardAuto: kernel 2 0ms
forward try kernel 2
... seems valid
ForwardAuto: kernel 2 0ms
backward try kernel 2
... seems valid
BackwardAuto: kernel 2 0ms
calcGradWeights try kernel 2
... seems valid
BackpropWeightsAuto: kernel 2 0ms
calcGradWeights try kernel 2
... seems valid
BackpropWeightsAuto: kernel 2 0ms
forward try kernel 3
... seems valid
ForwardAuto: kernel 3 0ms
forward try kernel 3
... seems valid
ForwardAuto: kernel 3 0ms
backward try kernel 3
... seems valid
BackwardAuto: kernel 3 122ms
calcGradWeights try kernel 3
options: -D BIASED -D gNumInputPlanes=3 -D gInputPlanes=3 -D gInputSize=3 -D gInputSizeSquared=9 -D gNumFilters=3 -D gFilterSize=3 -D gHalfFilterSize=1 -D gFilterSizeSquared=9 -D gNumOutputPlanes=3 -D gOutputPlanes=3 -D gOutputSize=1 -D gOutputSizeSquared=1 -D gPadZeros=0 -D gMargin=0 -D gEven=0 -D gSkip=0 -DgNumStripes=1 -DgInputStripeMarginRows=2 -DgInputStripeInnerNumRows=3 -DgInputStripeOuterNumRows=7 -DgInputStripeInnerSize=9 -DgInputStripeOuterSize=21 -DgInputStripeMarginSize=6 -DgOutputStripeNumRows=1 -DgOutputStripeSize=1
... seems valid
BackpropWeightsAuto: kernel 3 0ms
calcGradWeights try kernel 3
options: -D BIASED -D gNumInputPlanes=1 -D gInputPlanes=1 -D gInputSize=5 -D gInputSizeSquared=25 -D gNumFilters=3 -D gFilterSize=3 -D gHalfFilterSize=1 -D gFilterSizeSquared=9 -D gNumOutputPlanes=3 -D gOutputPlanes=3 -D gOutputSize=3 -D gOutputSizeSquared=9 -D gPadZeros=0 -D gMargin=0 -D gEven=0 -D gSkip=0 -DgNumStripes=1 -DgInputStripeMarginRows=2 -DgInputStripeInnerNumRows=5 -DgInputStripeOuterNumRows=9 -DgInputStripeInnerSize=25 -DgInputStripeOuterSize=45 -DgInputStripeMarginSize=10 -DgOutputStripeNumRows=3 -DgOutputStripeSize=9
... seems valid
BackpropWeightsAuto: kernel 3 0ms
forward try kernel 4
... seems valid
ForwardAuto: kernel 4 0ms
forward try kernel 4
... seems valid
ForwardAuto: kernel 4 0ms
backward kernel 0: cannot be used
backward kernel 1 time: 0ms
backward kernel 2 time: 0ms
backward kernel 3 time: 122ms
backward layer selected kernel 1
calcGradWeights try kernel 4
... seems valid
BackpropWeightsAuto: kernel 4 227ms
calcGradWeights try kernel 4
... seems valid
BackpropWeightsAuto: kernel 4 253ms
forward try kernel 5
ForwardAuto: kernel 5: this instance cant be used: For ForwardFc, filtersize and inputimagesize must be identical
... not valid
forward try kernel 6
... seems valid
ForwardAuto: kernel 6 0ms
forward try kernel 5
... seems valid
ForwardAuto: kernel 5 0ms
calcGradWeights kernel 0: cannot be used
calcGradWeights kernel 1 time: 0ms
calcGradWeights kernel 2 time: 0ms
calcGradWeights kernel 3 time: 0ms
calcGradWeights kernel 4 time: 227ms
calcGradWeights layer selected kernel 1
calcGradWeights kernel 0: cannot be used
calcGradWeights kernel 1 time: 0ms
calcGradWeights kernel 2 time: 0ms
calcGradWeights kernel 3 time: 0ms
calcGradWeights kernel 4 time: 253ms
calcGradWeights layer selected kernel 1
forward try kernel 7
... seems valid
ForwardAuto: kernel 7 158ms
forward try kernel 6
... seems valid
ForwardAuto: kernel 6 0ms
forward kernel 0: cannot be used
forward kernel 1 time: 0ms
forward kernel 2 time: 0ms
forward kernel 3 time: 0ms
forward kernel 4 time: 0ms
forward kernel 5: cannot be used
forward kernel 6 time: 0ms
forward kernel 7 time: 158ms
forward layer selected kernel 1
forward try kernel 7
... seems valid
ForwardAuto: kernel 7 109ms
forward kernel 0: cannot be used
forward kernel 1 time: 0ms
forward kernel 2 time: 0ms
forward kernel 3 time: 0ms
forward kernel 4 time: 0ms
forward kernel 5 time: 0ms
forward kernel 6 time: 0ms
forward kernel 7 time: 109ms
forward layer selected kernel 1
loss, E, 1.87774e-08
loss, E, 5.06262e-14
loss, E, 6.21725e-15
accuracy: 6/6 100%
accuracy: 6/6
loss, E, 6.21725e-15
clblas teardown
[ OK ] testsimpleconvolvenet.imagesize_5_3_2layers_filtersize_3_3_biased_n6 (2494 ms)
[ RUN ] testsimpleconvolvenet.imagesize_5_3_2layers_filtersize_3_3_biased_n18
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
initializing clblas
forward try kernel 0
... not plausibly optimal, skipping
forward try kernel 1
... seems valid
ForwardAuto: kernel 1 0ms
forward try kernel 0
... not plausibly optimal, skipping
forward try kernel 1
... seems valid
ForwardAuto: kernel 1 0ms
backward try kernel 0
... not plausibly optimal, skipping
backward try kernel 1
... seems valid
BackwardAuto: kernel 1 0ms
calcGradWeights try kernel 0
... not plausibly optimal, skipping
calcGradWeights try kernel 1
... seems valid
BackpropWeightsAuto: kernel 1 0ms
calcGradWeights try kernel 0
... not plausibly optimal, skipping
calcGradWeights try kernel 1
... seems valid
BackpropWeightsAuto: kernel 1 0ms
loss, E, 7.78557
forward try kernel 2
... seems valid
ForwardAuto: kernel 2 0ms
forward try kernel 2
... seems valid
ForwardAuto: kernel 2 0ms
backward try kernel 2
... seems valid
BackwardAuto: kernel 2 0ms
calcGradWeights try kernel 2
... seems valid
BackpropWeightsAuto: kernel 2 0ms
calcGradWeights try kernel 2
... seems valid
BackpropWeightsAuto: kernel 2 0ms
forward try kernel 3
... seems valid
ForwardAuto: kernel 3 0ms
forward try kernel 3
... seems valid
ForwardAuto: kernel 3 0ms
backward try kernel 3
... seems valid
BackwardAuto: kernel 3 129ms
calcGradWeights try kernel 3
options: -D BIASED -D gNumInputPlanes=3 -D gInputPlanes=3 -D gInputSize=3 -D gInputSizeSquared=9 -D gNumFilters=3 -D gFilterSize=3 -D gHalfFilterSize=1 -D gFilterSizeSquared=9 -D gNumOutputPlanes=3 -D gOutputPlanes=3 -D gOutputSize=1 -D gOutputSizeSquared=1 -D gPadZeros=0 -D gMargin=0 -D gEven=0 -D gSkip=0 -DgNumStripes=1 -DgInputStripeMarginRows=2 -DgInputStripeInnerNumRows=3 -DgInputStripeOuterNumRows=7 -DgInputStripeInnerSize=9 -DgInputStripeOuterSize=21 -DgInputStripeMarginSize=6 -DgOutputStripeNumRows=1 -DgOutputStripeSize=1
... seems valid
BackpropWeightsAuto: kernel 3 0ms
calcGradWeights try kernel 3
options: -D BIASED -D gNumInputPlanes=1 -D gInputPlanes=1 -D gInputSize=5 -D gInputSizeSquared=25 -D gNumFilters=3 -D gFilterSize=3 -D gHalfFilterSize=1 -D gFilterSizeSquared=9 -D gNumOutputPlanes=3 -D gOutputPlanes=3 -D gOutputSize=3 -D gOutputSizeSquared=9 -D gPadZeros=0 -D gMargin=0 -D gEven=0 -D gSkip=0 -DgNumStripes=1 -DgInputStripeMarginRows=2 -DgInputStripeInnerNumRows=5 -DgInputStripeOuterNumRows=9 -DgInputStripeInnerSize=25 -DgInputStripeOuterSize=45 -DgInputStripeMarginSize=10 -DgOutputStripeNumRows=3 -DgOutputStripeSize=9
... seems valid
BackpropWeightsAuto: kernel 3 0ms
forward try kernel 4
... seems valid
ForwardAuto: kernel 4 0ms
forward try kernel 4
... seems valid
ForwardAuto: kernel 4 0ms
backward kernel 0: cannot be used
backward kernel 1 time: 0ms
backward kernel 2 time: 0ms
backward kernel 3 time: 129ms
backward layer selected kernel 1
calcGradWeights try kernel 4
... seems valid
BackpropWeightsAuto: kernel 4 219ms
calcGradWeights try kernel 4
... seems valid
BackpropWeightsAuto: kernel 4 247ms
forward try kernel 5
ForwardAuto: kernel 5: this instance cant be used: For ForwardFc, filtersize and inputimagesize must be identical
... not valid
forward try kernel 6
... seems valid
ForwardAuto: kernel 6 3ms
forward try kernel 5
... seems valid
ForwardAuto: kernel 5 0ms
calcGradWeights kernel 0: cannot be used
calcGradWeights kernel 1 time: 0ms
calcGradWeights kernel 2 time: 0ms
calcGradWeights kernel 3 time: 0ms
calcGradWeights kernel 4 time: 219ms
calcGradWeights layer selected kernel 1
calcGradWeights kernel 0: cannot be used
calcGradWeights kernel 1 time: 0ms
calcGradWeights kernel 2 time: 0ms
calcGradWeights kernel 3 time: 0ms
calcGradWeights kernel 4 time: 247ms
calcGradWeights layer selected kernel 1
forward try kernel 7
... seems valid
ForwardAuto: kernel 7 155ms
forward try kernel 6
... seems valid
ForwardAuto: kernel 6 0ms
forward kernel 0: cannot be used
forward kernel 1 time: 0ms
forward kernel 2 time: 0ms
forward kernel 3 time: 0ms
forward kernel 4 time: 0ms
forward kernel 5: cannot be used
forward kernel 6 time: 3ms
forward kernel 7 time: 155ms
forward layer selected kernel 1
forward try kernel 7
... seems valid
ForwardAuto: kernel 7 111ms
forward kernel 0: cannot be used
forward kernel 1 time: 0ms
forward kernel 2 time: 0ms
forward kernel 3 time: 0ms
forward kernel 4 time: 0ms
forward kernel 5 time: 0ms
forward kernel 6 time: 0ms
forward kernel 7 time: 111ms
forward layer selected kernel 1
loss, E, 0.0959845
loss, E, 0.0247032
loss, E, 0.0102922
loss, E, 0.00504672
loss, E, 0.00281197
loss, E, 0.00167871
loss, E, 0.00107965
loss, E, 0.000748767
loss, E, 0.00055091
loss, E, 0.000425275
loss, E, 0.000341032
loss, E, 0.000281538
loss, E, 0.000237609
loss, E, 0.000203943
loss, E, 0.000177351
loss, E, 0.000155821
loss, E, 0.000138037
loss, E, 0.000123113
loss, E, 0.000110426
loss, E, 9.95232e-05
loss, E, 9.00714e-05
loss, E, 8.18146e-05
loss, E, 7.45556e-05
loss, E, 6.81377e-05
loss, E, 6.24353e-05
loss, E, 5.73468e-05
loss, E, 5.27886e-05
loss, E, 4.86925e-05
loss, E, 4.49983e-05
layer 0:InputLayer{ outputPlanes=1 outputSize=5 }
layer 1:ConvolutionalLayer{ LayerDimensions{ inputPlanes=1 inputSize=5 numFilters=3 filterSize=3 outputSize=3 padZeros=0 biased=1 skip=0} }
layer 2:ActivationLayer{ RELU }
layer 3:ConvolutionalLayer{ LayerDimensions{ inputPlanes=3 inputSize=3 numFilters=3 filterSize=3 outputSize=1 padZeros=0 biased=1 skip=0} }
layer 4:SquareLossLayer{}
Parameters overview: (skipping 3 layers with 0 params)
layer 1: params=30 26.3%
layer 3: params=84 73.7%
TOTAL : params=114
loss, E, 4.16905e-05
accuracy: 18/18 100%
accuracy: 18/18
loss, E, 4.16905e-05
clblas teardown
[ OK ] testsimpleconvolvenet.imagesize_5_3_2layers_filtersize_3_3_biased_n18 (6496 ms)
[----------] 12 tests from testsimpleconvolvenet (24719 ms total)
[----------] 3 tests from testlogicaloperators
[ RUN ] testlogicaloperators.Convolve_1layer_biased_And
And
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
initializing clblas
forward try kernel 0
... not plausibly optimal, skipping
forward try kernel 1
... seems valid
ForwardAuto: kernel 1 0ms
calcGradWeights try kernel 0
... not plausibly optimal, skipping
calcGradWeights try kernel 1
... seems valid
BackpropWeightsAuto: kernel 1 0ms
Loss L 3.55932
forward try kernel 2
... seems valid
ForwardAuto: kernel 2 0ms
calcGradWeights try kernel 2
... seems valid
BackpropWeightsAuto: kernel 2 0ms
forward try kernel 3
... seems valid
ForwardAuto: kernel 3 0ms
calcGradWeights try kernel 3
options: -D BIASED -D gNumInputPlanes=2 -D gInputPlanes=2 -D gInputSize=1 -D gInputSizeSquared=1 -D gNumFilters=2 -D gFilterSize=1 -D gHalfFilterSize=0 -D gFilterSizeSquared=1 -D gNumOutputPlanes=2 -D gOutputPlanes=2 -D gOutputSize=1 -D gOutputSizeSquared=1 -D gPadZeros=0 -D gMargin=0 -D gEven=0 -D gSkip=0 -DgNumStripes=1 -DgInputStripeMarginRows=0 -DgInputStripeInnerNumRows=1 -DgInputStripeOuterNumRows=1 -DgInputStripeInnerSize=1 -DgInputStripeOuterSize=1 -DgInputStripeMarginSize=0 -DgOutputStripeNumRows=1 -DgOutputStripeSize=1
... seems valid
BackpropWeightsAuto: kernel 3 0ms
forward try kernel 4
... seems valid
ForwardAuto: kernel 4 0ms
calcGradWeights try kernel 4
... seems valid
BackpropWeightsAuto: kernel 4 210ms
forward try kernel 5
... seems valid
ForwardAuto: kernel 5 0ms
calcGradWeights kernel 0: cannot be used
calcGradWeights kernel 1 time: 0ms
calcGradWeights kernel 2 time: 0ms
calcGradWeights kernel 3 time: 0ms
calcGradWeights kernel 4 time: 210ms
calcGradWeights layer selected kernel 1
forward try kernel 6
... seems valid
ForwardAuto: kernel 6 0ms
Loss L 0.914111
forward try kernel 7
... seems valid
ForwardAuto: kernel 7 96ms
forward kernel 0: cannot be used
forward kernel 1 time: 0ms
forward kernel 2 time: 0ms
forward kernel 3 time: 0ms
forward kernel 4 time: 0ms
forward kernel 5 time: 0ms
forward kernel 6 time: 0ms
forward kernel 7 time: 96ms
forward layer selected kernel 1
Loss L 0.4786
Loss L 0.32969
accuracy: 4/4
loss, E, 0.284304
clblas teardown
[ OK ] testlogicaloperators.Convolve_1layer_biased_And (925 ms)
[ RUN ] testlogicaloperators.Convolve_1layerbiased_Or
Or, convolve
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
initializing clblas
forward try kernel 0
... not plausibly optimal, skipping
forward try kernel 1
... seems valid
ForwardAuto: kernel 1 0ms
calcGradWeights try kernel 0
... not plausibly optimal, skipping
calcGradWeights try kernel 1
... seems valid
BackpropWeightsAuto: kernel 1 0ms
Loss L 4.72064
forward try kernel 2
... seems valid
ForwardAuto: kernel 2 0ms
calcGradWeights try kernel 2
... seems valid
BackpropWeightsAuto: kernel 2 0ms
forward try kernel 3
... seems valid
ForwardAuto: kernel 3 0ms
calcGradWeights try kernel 3
options: -D BIASED -D gNumInputPlanes=2 -D gInputPlanes=2 -D gInputSize=1 -D gInputSizeSquared=1 -D gNumFilters=2 -D gFilterSize=1 -D gHalfFilterSize=0 -D gFilterSizeSquared=1 -D gNumOutputPlanes=2 -D gOutputPlanes=2 -D gOutputSize=1 -D gOutputSizeSquared=1 -D gPadZeros=0 -D gMargin=0 -D gEven=0 -D gSkip=0 -DgNumStripes=1 -DgInputStripeMarginRows=0 -DgInputStripeInnerNumRows=1 -DgInputStripeOuterNumRows=1 -DgInputStripeInnerSize=1 -DgInputStripeOuterSize=1 -DgInputStripeMarginSize=0 -DgOutputStripeNumRows=1 -DgOutputStripeSize=1
... seems valid
BackpropWeightsAuto: kernel 3 0ms
forward try kernel 4
... seems valid
ForwardAuto: kernel 4 0ms
calcGradWeights try kernel 4
... seems valid
BackpropWeightsAuto: kernel 4 203ms
forward try kernel 5
... seems valid
ForwardAuto: kernel 5 0ms
calcGradWeights kernel 0: cannot be used
calcGradWeights kernel 1 time: 0ms
calcGradWeights kernel 2 time: 0ms
calcGradWeights kernel 3 time: 0ms
calcGradWeights kernel 4 time: 203ms
calcGradWeights layer selected kernel 1
forward try kernel 6
... seems valid
ForwardAuto: kernel 6 0ms
Loss L 0.631151
forward try kernel 7
... seems valid
ForwardAuto: kernel 7 96ms
forward kernel 0: cannot be used
forward kernel 1 time: 0ms
forward kernel 2 time: 0ms
forward kernel 3 time: 0ms
forward kernel 4 time: 0ms
forward kernel 5 time: 0ms
forward kernel 6 time: 0ms
forward kernel 7 time: 96ms
forward layer selected kernel 1
Loss L 0.375778
Loss L 0.293813
accuracy: 4/4 100%
loss, E, 0.26886
clblas teardown
[ OK ] testlogicaloperators.Convolve_1layerbiased_Or (927 ms)
[ RUN ] testlogicaloperators.Convolve_2layers_relu_Xor
Xor, convolve
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
initializing clblas
hand-setting weights...
forward try kernel 0
... not plausibly optimal, skipping
forward try kernel 1
... seems valid
ForwardAuto: kernel 1 0ms
forward try kernel 0
... not plausibly optimal, skipping
forward try kernel 1
... seems valid
ForwardAuto: kernel 1 0ms
backward try kernel 0
... not plausibly optimal, skipping
backward try kernel 1
... seems valid
BackwardAuto: kernel 1 0ms
calcGradWeights try kernel 0
... not plausibly optimal, skipping
calcGradWeights try kernel 1
... seems valid
BackpropWeightsAuto: kernel 1 0ms
calcGradWeights try kernel 0
... not plausibly optimal, skipping
calcGradWeights try kernel 1
... seems valid
BackpropWeightsAuto: kernel 1 0ms
Loss L 0.152638
forward try kernel 2
... seems valid
ForwardAuto: kernel 2 0ms
forward try kernel 2
... seems valid
ForwardAuto: kernel 2 0ms
backward try kernel 2
... seems valid
BackwardAuto: kernel 2 0ms
calcGradWeights try kernel 2
... seems valid
BackpropWeightsAuto: kernel 2 0ms
calcGradWeights try kernel 2
... seems valid
BackpropWeightsAuto: kernel 2 0ms
forward try kernel 3
... seems valid
ForwardAuto: kernel 3 0ms
forward try kernel 3
... seems valid
ForwardAuto: kernel 3 0ms
backward try kernel 3
... seems valid
BackwardAuto: kernel 3 101ms
calcGradWeights try kernel 3
options: -D BIASED -D gNumInputPlanes=2 -D gInputPlanes=2 -D gInputSize=1 -D gInputSizeSquared=1 -D gNumFilters=2 -D gFilterSize=1 -D gHalfFilterSize=0 -D gFilterSizeSquared=1 -D gNumOutputPlanes=2 -D gOutputPlanes=2 -D gOutputSize=1 -D gOutputSizeSquared=1 -D gPadZeros=0 -D gMargin=0 -D gEven=0 -D gSkip=0 -DgNumStripes=1 -DgInputStripeMarginRows=0 -DgInputStripeInnerNumRows=1 -DgInputStripeOuterNumRows=1 -DgInputStripeInnerSize=1 -DgInputStripeOuterSize=1 -DgInputStripeMarginSize=0 -DgOutputStripeNumRows=1 -DgOutputStripeSize=1
... seems valid
BackpropWeightsAuto: kernel 3 0ms
calcGradWeights try kernel 3
options: -D BIASED -D gNumInputPlanes=2 -D gInputPlanes=2 -D gInputSize=1 -D gInputSizeSquared=1 -D gNumFilters=2 -D gFilterSize=1 -D gHalfFilterSize=0 -D gFilterSizeSquared=1 -D gNumOutputPlanes=2 -D gOutputPlanes=2 -D gOutputSize=1 -D gOutputSizeSquared=1 -D gPadZeros=0 -D gMargin=0 -D gEven=0 -D gSkip=0 -DgNumStripes=1 -DgInputStripeMarginRows=0 -DgInputStripeInnerNumRows=1 -DgInputStripeOuterNumRows=1 -DgInputStripeInnerSize=1 -DgInputStripeOuterSize=1 -DgInputStripeMarginSize=0 -DgOutputStripeNumRows=1 -DgOutputStripeSize=1
... seems valid
BackpropWeightsAuto: kernel 3 0ms
forward try kernel 4
... seems valid
ForwardAuto: kernel 4 0ms
forward try kernel 4
... seems valid
ForwardAuto: kernel 4 0ms
backward kernel 0: cannot be used
backward kernel 1 time: 0ms
backward kernel 2 time: 0ms
backward kernel 3 time: 101ms
backward layer selected kernel 1
calcGradWeights try kernel 4
... seems valid
BackpropWeightsAuto: kernel 4 202ms
calcGradWeights try kernel 4
... seems valid
BackpropWeightsAuto: kernel 4 43ms
forward try kernel 5
... seems valid
ForwardAuto: kernel 5 0ms
forward try kernel 5
... seems valid
ForwardAuto: kernel 5 0ms
calcGradWeights kernel 0: cannot be used
calcGradWeights kernel 1 time: 0ms
calcGradWeights kernel 2 time: 0ms
calcGradWeights kernel 3 time: 0ms
calcGradWeights kernel 4 time: 202ms
calcGradWeights layer selected kernel 1
calcGradWeights kernel 0: cannot be used
calcGradWeights kernel 1 time: 0ms
calcGradWeights kernel 2 time: 0ms
calcGradWeights kernel 3 time: 0ms
calcGradWeights kernel 4 time: 43ms
calcGradWeights layer selected kernel 1
forward try kernel 6
... seems valid
ForwardAuto: kernel 6 0ms
forward try kernel 6
... seems valid
ForwardAuto: kernel 6 0ms
Loss L 0.00640068
forward try kernel 7
... seems valid
ForwardAuto: kernel 7 95ms
forward try kernel 7
... seems valid
ForwardAuto: kernel 7 34ms
forward kernel 0: cannot be used
forward kernel 1 time: 0ms
forward kernel 2 time: 0ms
forward kernel 3 time: 0ms
forward kernel 4 time: 0ms
forward kernel 5 time: 0ms
forward kernel 6 time: 0ms
forward kernel 7 time: 95ms
forward layer selected kernel 1
forward kernel 0: cannot be used
forward kernel 1 time: 0ms
forward kernel 2 time: 0ms
forward kernel 3 time: 0ms
forward kernel 4 time: 0ms
forward kernel 5 time: 0ms
forward kernel 6 time: 0ms
forward kernel 7 time: 34ms
forward layer selected kernel 1
Loss L 0.00139435
Loss L 0.000383307
Loss L 0.000117079
Loss L 4.63626e-05
Loss L 1.8873e-05
Loss L 7.15534e-06
Loss L 2.83958e-06
Loss L 1.12727e-06
Loss L 4.44109e-07
Loss L 1.72233e-07
Loss L 6.82345e-08
Loss L 2.76343e-08
Loss L 1.04286e-08
Loss L 4.13357e-09
Loss L 1.67201e-09
Loss L 6.29148e-10
Loss L 2.4837e-10
Loss L 1.00833e-10
Loss L 3.80673e-11
Loss L 1.5131e-11
Loss L 5.84421e-12
Loss L 2.16893e-12
Loss L 9.52127e-13
Loss L 3.58824e-13
Loss L 1.56319e-13
Loss L 9.9476e-14
Loss L 9.9476e-14
Loss L 9.9476e-14
Loss L 9.9476e-14
Loss L 9.23706e-14
Loss L 9.23706e-14
Loss L 9.41469e-14
Loss L 8.70415e-14
Loss L 9.41469e-14
Loss L 8.52651e-14
Loss L 8.52651e-14
Loss L 8.52651e-14
Loss L 8.52651e-14
layer 0:InputLayer{ outputPlanes=2 outputSize=1 }
layer 1:ConvolutionalLayer{ LayerDimensions{ inputPlanes=2 inputSize=1 numFilters=2 filterSize=1 outputSize=1 padZeros=0 biased=1 skip=0} }
layer 2:ActivationLayer{ RELU }
layer 3:ConvolutionalLayer{ LayerDimensions{ inputPlanes=2 inputSize=1 numFilters=2 filterSize=1 outputSize=1 padZeros=0 biased=1 skip=0} }
layer 4:ActivationLayer{ RELU }
layer 5:SquareLossLayer{}
Parameters overview: (skipping 4 layers with 0 params)
layer 1: params=6 50.0%
layer 3: params=6 50.0%
TOTAL : params=12
accuracy: 4/4 100%
loss, E, 8.52651e-14
clblas teardown
[ OK ] testlogicaloperators.Convolve_2layers_relu_Xor (1969 ms)
[----------] 3 tests from testlogicaloperators (3821 ms total)
[----------] 12 tests from testbackward
[ RUN ] testbackward.squareloss
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
layer 0:InputLayer{ outputPlanes=3 outputSize=5 }
layer 1:ForceBackpropLayer{ outputPlanes=3 outputSize=5 }
layer 2:SquareLossLayer{}
inputtotalsize=2400 outputTotalSize=2400
layer 0:InputLayer{ outputPlanes=3 outputSize=5 }
layer 1:ForceBackpropLayer{ outputPlanes=3 outputSize=5 }
layer 2:SquareLossLayer{}
Parameters overview: (skipping 3 layers with 0 params)
TOTAL : params=0
idx=44 predicted losschange=-0.000912508 actual=-0.000976562
idx=2245 predicted losschange=0.00785823 actual=0.00805664
idx=648 predicted losschange=0.00965759 actual=0.00976562
idx=586 predicted losschange=0.0136895 actual=0.0136719
idx=730 predicted losschange=0.00117897 actual=0.00146484
idx=611 predicted losschange=0.00152302 actual=0.00195312
idx=1130 predicted losschange=0.0159167 actual=0.0161133
idx=15 predicted losschange=0.0434798 actual=0.0439453
idx=1923 predicted losschange=-0.00790002 actual=-0.0078125
idx=670 predicted losschange=0.0335141 actual=0.0336914
[ OK ] testbackward.squareloss (2 ms)
[ RUN ] testbackward.crossentropyloss
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
layer 0:InputLayer{ outputPlanes=3 outputSize=5 }
layer 1:ForceBackpropLayer{ outputPlanes=3 outputSize=5 }
layer 2:Layer{}
inputtotalsize=300 outputTotalSize=300
layer 0:InputLayer{ outputPlanes=3 outputSize=5 }
layer 1:ForceBackpropLayer{ outputPlanes=3 outputSize=5 }
layer 2:Layer{}
Parameters overview: (skipping 3 layers with 0 params)
TOTAL : params=0
idx=44 predicted losschange=0.000274935 actual=0.000274658
idx=145 predicted losschange=-0.000885784 actual=-0.00088501
idx=48 predicted losschange=-0.000859834 actual=-0.000854492
idx=286 predicted losschange=0.00713042 actual=0.00717163
idx=130 predicted losschange=-0.000264829 actual=-0.000244141
idx=11 predicted losschange=-1.98163e-05 actual=0
idx=230 predicted losschange=-0.000594819 actual=-0.000610352
idx=15 predicted losschange=-0.0006499 actual=-0.000640869
idx=123 predicted losschange=-0.000846121 actual=-0.000823975
idx=70 predicted losschange=0.000790196 actual=0.000793457
[ OK ] testbackward.crossentropyloss (1 ms)
[ RUN ] testbackward.softmaxloss
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
layer 0:InputLayer{ outputPlanes=5 outputSize=1 }
layer 1:ForceBackpropLayer{ outputPlanes=5 outputSize=1 }
layer 2:SoftMaxLayer{ perPlane=0 numPlanes=5 imageSize=1 }
inputtotalsize=10 outputTotalSize=10
layer 0:InputLayer{ outputPlanes=5 outputSize=1 }
layer 1:ForceBackpropLayer{ outputPlanes=5 outputSize=1 }
layer 2:SoftMaxLayer{ perPlane=0 numPlanes=5 imageSize=1 }
Parameters overview: (skipping 3 layers with 0 params)
TOTAL : params=0
idx=4 predicted losschange=0.000113075 actual=0.00011301
idx=5 predicted losschange=0.000145627 actual=0.000145674
idx=8 predicted losschange=3.16699e-05 actual=3.19481e-05
idx=6 predicted losschange=4.89271e-06 actual=5.24521e-06
idx=0 predicted losschange=2.29469e-05 actual=2.28882e-05
idx=1 predicted losschange=-8.26119e-05 actual=-8.27312e-05
idx=0 predicted losschange=2.29469e-05 actual=2.28882e-05
idx=5 predicted losschange=0.000145627 actual=0.000145674
idx=3 predicted losschange=-5.50179e-05 actual=-5.50747e-05
idx=0 predicted losschange=2.29469e-05 actual=2.28882e-05
[ OK ] testbackward.softmaxloss (2 ms)
[ RUN ] testbackward.squareloss2
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
layer 0:InputLayer{ outputPlanes=5 outputSize=1 }
layer 1:ForceBackpropLayer{ outputPlanes=5 outputSize=1 }
layer 2:SquareLossLayer{}
layer 0:InputLayer{ outputPlanes=5 outputSize=1 }
layer 1:ForceBackpropLayer{ outputPlanes=5 outputSize=1 }
layer 2:SquareLossLayer{}
batchSize: 32
inputtotalsize=160 outputTotalSize=160
layer SquareLossLayer{}
layer 0:InputLayer{ outputPlanes=5 outputSize=1 }
layer 1:ForceBackpropLayer{ outputPlanes=5 outputSize=1 }
layer 2:SquareLossLayer{}
Parameters overview: (skipping 3 layers with 0 params)
TOTAL : params=0
idx=44 predicted losschange=0.000126406 actual=0.000125885
idx=5 predicted losschange=0.00461891 actual=0.00464439
idx=8 predicted losschange=0.000356787 actual=0.000356674
idx=106 predicted losschange=0.00716324 actual=0.00719643
idx=90 predicted losschange=0.000474759 actual=0.000480652
idx=131 predicted losschange=0.000979017 actual=0.000984192
idx=10 predicted losschange=0.000660134 actual=0.000663757
idx=15 predicted losschange=0.00961313 actual=0.00965118
idx=3 predicted losschange=0.00264732 actual=0.00267029
idx=30 predicted losschange=0.00865312 actual=0.00868607
[ OK ] testbackward.squareloss2 (1 ms)
[ RUN ] testbackward.crossentropy2
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
layer 0:InputLayer{ outputPlanes=5 outputSize=1 }
layer 1:ForceBackpropLayer{ outputPlanes=5 outputSize=1 }
layer 2:Layer{}
layer 0:InputLayer{ outputPlanes=5 outputSize=1 }
layer 1:ForceBackpropLayer{ outputPlanes=5 outputSize=1 }
layer 2:Layer{}
batchSize: 2
inputtotalsize=10 outputTotalSize=10
layer Layer{}
layer 0:InputLayer{ outputPlanes=5 outputSize=1 }
layer 1:ForceBackpropLayer{ outputPlanes=5 outputSize=1 }
layer 2:Layer{}
Parameters overview: (skipping 3 layers with 0 params)
TOTAL : params=0
idx=4 predicted losschange=0.00258649 actual=nan
idx=5 predicted losschange=0.0227095 actual=nan
idx=8 predicted losschange=-0.00202714 actual=nan
idx=6 predicted losschange=-0.000846508 actual=nan
idx=0 predicted losschange=-0.000424821 actual=nan
idx=1 predicted losschange=-0.00171216 actual=nan
idx=0 predicted losschange=-0.000424821 actual=nan
idx=5 predicted losschange=0.0227095 actual=nan
idx=3 predicted losschange=0.0123444 actual=nan
idx=0 predicted losschange=-0.000424821 actual=nan
[ OK ] testbackward.crossentropy2 (2 ms)
[ RUN ] testbackward.softmax2
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
layer 0:InputLayer{ outputPlanes=5 outputSize=1 }
layer 1:ForceBackpropLayer{ outputPlanes=5 outputSize=1 }
layer 2:SoftMaxLayer{ perPlane=0 numPlanes=5 imageSize=1 }
layer 0:InputLayer{ outputPlanes=5 outputSize=1 }
layer 1:ForceBackpropLayer{ outputPlanes=5 outputSize=1 }
layer 2:SoftMaxLayer{ perPlane=0 numPlanes=5 imageSize=1 }
batchSize: 2
inputtotalsize=10 outputTotalSize=10
layer SoftMaxLayer{ perPlane=0 numPlanes=5 imageSize=1 }
layer 0:InputLayer{ outputPlanes=5 outputSize=1 }
layer 1:ForceBackpropLayer{ outputPlanes=5 outputSize=1 }
layer 2:SoftMaxLayer{ perPlane=0 numPlanes=5 imageSize=1 }
Parameters overview: (skipping 3 layers with 0 params)
TOTAL : params=0
idx=4 predicted losschange=0.00035729 actual=0.000357628
idx=5 predicted losschange=0.0015055 actual=0.00151086
idx=8 predicted losschange=-5.63632e-05 actual=-5.65052e-05
idx=6 predicted losschange=-1.48864e-05 actual=-1.4782e-05
idx=0 predicted losschange=1.96542e-05 actual=1.95503e-05
idx=1 predicted losschange=-0.000287167 actual=-0.000287056
idx=0 predicted losschange=1.96542e-05 actual=1.95503e-05
idx=5 predicted losschange=0.0015055 actual=0.00151086
idx=3 predicted losschange=-0.000152824 actual=-0.00014782
idx=0 predicted losschange=1.96542e-05 actual=1.95503e-05
[ OK ] testbackward.softmax2 (1 ms)
[ RUN ] testbackward.conv1
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
initializing clblas
layer 0:InputLayer{ outputPlanes=2 outputSize=4 }
layer 1:ForceBackpropLayer{ outputPlanes=2 outputSize=4 }
layer 2:ConvolutionalLayer{ LayerDimensions{ inputPlanes=2 inputSize=4 numFilters=2 filterSize=3 outputSize=2 padZeros=0 biased=0 skip=0} }
layer 3:SquareLossLayer{}
layer 0:InputLayer{ outputPlanes=2 outputSize=4 }
layer 1:ForceBackpropLayer{ outputPlanes=2 outputSize=4 }
layer 2:ConvolutionalLayer{ LayerDimensions{ inputPlanes=2 inputSize=4 numFilters=2 filterSize=3 outputSize=2 padZeros=0 biased=0 skip=0} }
layer 3:SquareLossLayer{}
batchSize: 4
inputtotalsize=128 outputTotalSize=32
layer ConvolutionalLayer{ LayerDimensions{ inputPlanes=2 inputSize=4 numFilters=2 filterSize=3 outputSize=2 padZeros=0 biased=0 skip=0} }
forward try kernel 0
... not plausibly optimal, skipping
forward try kernel 1
... seems valid
ForwardAuto: kernel 1 0ms
layer 0:InputLayer{ outputPlanes=2 outputSize=4 }
layer 1:ForceBackpropLayer{ outputPlanes=2 outputSize=4 }
layer 2:ConvolutionalLayer{ LayerDimensions{ inputPlanes=2 inputSize=4 numFilters=2 filterSize=3 outputSize=2 padZeros=0 biased=0 skip=0} }
layer 3:SquareLossLayer{}
Parameters overview: (skipping 3 layers with 0 params)
layer 2: params=36 100.0%
TOTAL : params=36
backward try kernel 0
... not plausibly optimal, skipping
backward try kernel 1
... seems valid
BackwardAuto: kernel 1 0ms
calcGradWeights try kernel 0
... not plausibly optimal, skipping
calcGradWeights try kernel 1
... seems valid
BackpropWeightsAuto: kernel 1 0ms
forward try kernel 2
... seems valid
ForwardAuto: kernel 2 0ms
idx=44 predicted losschange=-0.000314065 actual=-0.000314236
forward try kernel 3
... seems valid
ForwardAuto: kernel 3 0ms
idx=37 predicted losschange=0.00253314 actual=0.00254202
forward try kernel 4
... seems valid
ForwardAuto: kernel 4 0ms
idx=40 predicted losschange=0.00496457 actual=0.00497627
forward try kernel 5
ForwardAuto: kernel 5: this instance cant be used: For ForwardFc, filtersize and inputimagesize must be identical
... not valid
forward try kernel 6
... seems valid
ForwardAuto: kernel 6 0ms
idx=106 predicted losschange=-0.000453683 actual=-0.000446796
forward try kernel 7
... seems valid
ForwardAuto: kernel 7 117ms
idx=122 predicted losschange=0.000748635 actual=-0.000446796
forward kernel 0: cannot be used
forward kernel 1 time: 0ms
forward kernel 2 time: 0ms
forward kernel 3 time: 0ms
forward kernel 4 time: 0ms
forward kernel 5: cannot be used
forward kernel 6 time: 0ms
forward kernel 7 time: 117ms
forward layer selected kernel 1
idx=99 predicted losschange=5.24616e-05 actual=5.38826e-05
idx=10 predicted losschange=0.000438654 actual=0.000439644
idx=47 predicted losschange=-0.0013164 actual=-0.00131559
idx=67 predicted losschange=0.00172771 actual=0.0017333
idx=126 predicted losschange=0.00328649 actual=0.00329351
clblas teardown
[ OK ] testbackward.conv1 (608 ms)
[ RUN ] testbackward.fc1
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
initializing clblas
layer 0:InputLayer{ outputPlanes=2 outputSize=4 }
layer 1:ForceBackpropLayer{ outputPlanes=2 outputSize=4 }
layer 2:FullyConnectedLayer{ numPlanes=4 imageSize=1 }
layer 3:SquareLossLayer{}
layer 0:InputLayer{ outputPlanes=2 outputSize=4 }
layer 1:ForceBackpropLayer{ outputPlanes=2 outputSize=4 }
layer 2:FullyConnectedLayer{ numPlanes=4 imageSize=1 }
layer 3:SquareLossLayer{}
batchSize: 4
inputtotalsize=128 outputTotalSize=16
layer FullyConnectedLayer{ numPlanes=4 imageSize=1 }
forward try kernel 0
... not plausibly optimal, skipping
forward try kernel 1
... seems valid
ForwardAuto: kernel 1 0ms
layer 0:InputLayer{ outputPlanes=2 outputSize=4 }
layer 1:ForceBackpropLayer{ outputPlanes=2 outputSize=4 }
layer 2:FullyConnectedLayer{ numPlanes=4 imageSize=1 }
layer 3:SquareLossLayer{}
Parameters overview: (skipping 3 layers with 0 params)
layer 2: params=128 100.0%
TOTAL : params=128
backward try kernel 0
... not plausibly optimal, skipping
backward try kernel 1
... seems valid
BackwardAuto: kernel 1 0ms
calcGradWeights try kernel 0
... not plausibly optimal, skipping
calcGradWeights try kernel 1
... seems valid
BackpropWeightsAuto: kernel 1 0ms
forward try kernel 2
... seems valid
ForwardAuto: kernel 2 0ms
idx=44 predicted losschange=0.000349482 actual=0.000349522
forward try kernel 3
... seems valid
ForwardAuto: kernel 3 0ms
idx=37 predicted losschange=0.00073425 actual=0.000735283
forward try kernel 4
... seems valid
ForwardAuto: kernel 4 0ms
idx=40 predicted losschange=0.000336202 actual=0.0003438
forward try kernel 5
... seems valid
ForwardAuto: kernel 5 0ms
idx=106 predicted losschange=-0.00125048 actual=-0.00124693
forward try kernel 6
... seems valid
ForwardAuto: kernel 6 0ms
idx=122 predicted losschange=-0.000898851 actual=-0.000895023
forward try kernel 7
... seems valid
ForwardAuto: kernel 7 110ms
idx=99 predicted losschange=0.000183326 actual=-0.000895023
forward kernel 0: cannot be used
forward kernel 1 time: 0ms
forward kernel 2 time: 0ms
forward kernel 3 time: 0ms
forward kernel 4 time: 0ms
forward kernel 5 time: 0ms
forward kernel 6 time: 0ms
forward kernel 7 time: 110ms
forward layer selected kernel 1
idx=10 predicted losschange=0.000889723 actual=0.000889778
idx=47 predicted losschange=-0.000766629 actual=-0.0007658
idx=67 predicted losschange=0.00080667 actual=0.000810146
idx=126 predicted losschange=-0.00017344 actual=-0.000169754
clblas teardown
[ OK ] testbackward.fc1 (620 ms)
[ RUN ] testbackward.act1
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
layer 0:InputLayer{ outputPlanes=1 outputSize=2 }
layer 1:ForceBackpropLayer{ outputPlanes=1 outputSize=2 }
layer 2:ActivationLayer{ RELU }
layer 3:SquareLossLayer{}
layer 0:InputLayer{ outputPlanes=1 outputSize=2 }
layer 1:ForceBackpropLayer{ outputPlanes=1 outputSize=2 }
layer 2:ActivationLayer{ RELU }
layer 3:SquareLossLayer{}
batchSize: 1
inputtotalsize=4 outputTotalSize=4
layer ActivationLayer{ RELU }
layer 0:InputLayer{ outputPlanes=1 outputSize=2 }
layer 1:ForceBackpropLayer{ outputPlanes=1 outputSize=2 }
layer 2:ActivationLayer{ RELU }
layer 3:SquareLossLayer{}
Parameters overview: (skipping 4 layers with 0 params)
TOTAL : params=0
idx=0 predicted losschange=-0.000880961 actual=-0.00088048
idx=1 predicted losschange=-0.00151209 actual=-0.00151044
idx=0 predicted losschange=-0.000880961 actual=-0.00088048
idx=2 predicted losschange=-0.00245153 actual=-0.0024423
idx=2 predicted losschange=-0.00245153 actual=-0.0024423
idx=3 predicted losschange=-0.00214455 actual=-0.00212085
idx=2 predicted losschange=-0.00245153 actual=-0.0024423
idx=3 predicted losschange=-0.00214455 actual=-0.00212085
idx=3 predicted losschange=-0.00214455 actual=-0.00212085
idx=2 predicted losschange=-0.00245153 actual=-0.0024423
[ OK ] testbackward.act1 (65 ms)
[ RUN ] testbackward.checknumerically
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
initializing clblas
forward try kernel 0
... not plausibly optimal, skipping
forward try kernel 1
... seems valid
ForwardAuto: kernel 1 0ms
forward try kernel 0
... not plausibly optimal, skipping
forward try kernel 1
... seems valid
ForwardAuto: kernel 1 0ms
forward try kernel 2
... seems valid
ForwardAuto: kernel 2 0ms
forward try kernel 2
... seems valid
ForwardAuto: kernel 2 0ms
backward try kernel 0
... not plausibly optimal, skipping
backward try kernel 1
... seems valid
BackwardAuto: kernel 1 0ms
calcGradWeights try kernel 0
... not plausibly optimal, skipping
calcGradWeights try kernel 1
... seems valid
BackpropWeightsAuto: kernel 1 0ms
calcGradWeights try kernel 0
... not plausibly optimal, skipping
calcGradWeights try kernel 1
... seems valid
BackpropWeightsAuto: kernel 1 0ms
forward try kernel 3
... seems valid
ForwardAuto: kernel 3 0ms
forward try kernel 3
... seems valid
ForwardAuto: kernel 3 0ms
loss 0.0986296 loss2 0.0984814 change: 0.000148199
sumweightsdiff 0.0038507
loss change 0.000148199
estimatedLossChangeFromW 0.000148279
forward try kernel 4
... seems valid
ForwardAuto: kernel 4 0ms
forward try kernel 4
... seems valid
ForwardAuto: kernel 4 0ms
forward try kernel 5
... seems valid
ForwardAuto: kernel 5 0ms
forward try kernel 5
... seems valid
ForwardAuto: kernel 5 0ms
backward try kernel 2
... seems valid
BackwardAuto: kernel 2 0ms
calcGradWeights try kernel 2
... seems valid
BackpropWeightsAuto: kernel 2 0ms
calcGradWeights try kernel 2
... seems valid
BackpropWeightsAuto: kernel 2 0ms
forward try kernel 6
... seems valid
ForwardAuto: kernel 6 0ms
forward try kernel 6
... seems valid
ForwardAuto: kernel 6 0ms
loss 0.0984814 loss2 0.0983336 change: 0.000147872
sumweightsdiff 0.00384641
loss change 0.000147872
estimatedLossChangeFromW 0.000147948
forward try kernel 7
... seems valid
ForwardAuto: kernel 7 65ms
forward try kernel 7
... seems valid
ForwardAuto: kernel 7 39ms
forward kernel 0: cannot be used
forward kernel 1 time: 0ms
forward kernel 2 time: 0ms
forward kernel 3 time: 0ms
forward kernel 4 time: 0ms
forward kernel 5 time: 0ms
forward kernel 6 time: 0ms
forward kernel 7 time: 65ms
forward layer selected kernel 1
forward kernel 0: cannot be used
forward kernel 1 time: 0ms
forward kernel 2 time: 0ms
forward kernel 3 time: 0ms
forward kernel 4 time: 0ms
forward kernel 5 time: 0ms
forward kernel 6 time: 0ms
forward kernel 7 time: 39ms
forward layer selected kernel 1
backward try kernel 3
... seems valid
BackwardAuto: kernel 3 67ms
calcGradWeights try kernel 3
options: -D gNumInputPlanes=1 -D gInputPlanes=1 -D gInputSize=1 -D gInputSizeSquared=1 -D gNumFilters=1 -D gFilterSize=1 -D gHalfFilterSize=0 -D gFilterSizeSquared=1 -D gNumOutputPlanes=1 -D gOutputPlanes=1 -D gOutputSize=1 -D gOutputSizeSquared=1 -D gPadZeros=0 -D gMargin=0 -D gEven=0 -D gSkip=0 -DgNumStripes=1 -DgInputStripeMarginRows=0 -DgInputStripeInnerNumRows=1 -DgInputStripeOuterNumRows=1 -DgInputStripeInnerSize=1 -DgInputStripeOuterSize=1 -DgInputStripeMarginSize=0 -DgOutputStripeNumRows=1 -DgOutputStripeSize=1
... seems valid
BackpropWeightsAuto: kernel 3 0ms
calcGradWeights try kernel 3
options: -D gNumInputPlanes=1 -D gInputPlanes=1 -D gInputSize=1 -D gInputSizeSquared=1 -D gNumFilters=1 -D gFilterSize=1 -D gHalfFilterSize=0 -D gFilterSizeSquared=1 -D gNumOutputPlanes=1 -D gOutputPlanes=1 -D gOutputSize=1 -D gOutputSizeSquared=1 -D gPadZeros=0 -D gMargin=0 -D gEven=0 -D gSkip=0 -DgNumStripes=1 -DgInputStripeMarginRows=0 -DgInputStripeInnerNumRows=1 -DgInputStripeOuterNumRows=1 -DgInputStripeInnerSize=1 -DgInputStripeOuterSize=1 -DgInputStripeMarginSize=0 -DgOutputStripeNumRows=1 -DgOutputStripeSize=1
... seems valid
BackpropWeightsAuto: kernel 3 0ms
loss 0.0983336 loss2 0.098186 change: 0.000147544
sumweightsdiff 0.00384223
loss change 0.000147544
estimatedLossChangeFromW 0.000147628
backward kernel 0: cannot be used
backward kernel 1 time: 0ms
backward kernel 2 time: 0ms
backward kernel 3 time: 67ms
backward layer selected kernel 1
calcGradWeights try kernel 4
... seems valid
BackpropWeightsAuto: kernel 4 101ms
calcGradWeights try kernel 4
... seems valid
BackpropWeightsAuto: kernel 4 34ms
loss 0.098186 loss2 0.0980388 change: 0.000147216
sumweightsdiff 0.00383794
loss change 0.000147216
estimatedLossChangeFromW 0.000147298
calcGradWeights kernel 0: cannot be used
calcGradWeights kernel 1 time: 0ms
calcGradWeights kernel 2 time: 0ms
calcGradWeights kernel 3 time: 0ms
calcGradWeights kernel 4 time: 101ms
calcGradWeights layer selected kernel 1
calcGradWeights kernel 0: cannot be used
calcGradWeights kernel 1 time: 0ms
calcGradWeights kernel 2 time: 0ms
calcGradWeights kernel 3 time: 0ms
calcGradWeights kernel 4 time: 34ms
calcGradWeights layer selected kernel 1
loss 0.0980388 loss2 0.0978919 change: 0.000146888
sumweightsdiff 0.00383377
loss change 0.000146888
estimatedLossChangeFromW 0.000146978
clblas teardown
[ OK ] testbackward.checknumerically (1510 ms)
[ RUN ] testbackward.checknumerically_imagesize5_filter3_relu
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
initializing clblas
forward try kernel 0
... not plausibly optimal, skipping
forward try kernel 1
... seems valid
ForwardAuto: kernel 1 0ms
forward try kernel 0
... not plausibly optimal, skipping
forward try kernel 1
... seems valid
ForwardAuto: kernel 1 0ms
forward try kernel 2
... seems valid
ForwardAuto: kernel 2 0ms
forward try kernel 2
... seems valid
ForwardAuto: kernel 2 0ms
backward try kernel 0
... not plausibly optimal, skipping
backward try kernel 1
... seems valid
BackwardAuto: kernel 1 0ms
calcGradWeights try kernel 0
... not plausibly optimal, skipping
calcGradWeights try kernel 1
... seems valid
BackpropWeightsAuto: kernel 1 0ms
calcGradWeights try kernel 0
... not plausibly optimal, skipping
calcGradWeights try kernel 1
... seems valid
BackpropWeightsAuto: kernel 1 0ms
forward try kernel 3
... seems valid
ForwardAuto: kernel 3 0ms
forward try kernel 3
... seems valid
ForwardAuto: kernel 3 0ms
loss 630.466 loss2 608.021 change: 22.4443
sumweightsdiff -0.035685
loss change 22.4443
estimatedLossChangeFromW 22.6629
forward try kernel 4
... seems valid
ForwardAuto: kernel 4 0ms
forward try kernel 4
... seems valid
ForwardAuto: kernel 4 0ms
forward try kernel 5
ForwardAuto: kernel 5: this instance cant be used: For ForwardFc, filtersize and inputimagesize must be identical
... not valid
forward try kernel 6
... seems valid
ForwardAuto: kernel 6 0ms
forward try kernel 5
ForwardAuto: kernel 5: this instance cant be used: For ForwardFc, filtersize and inputimagesize must be identical
... not valid
forward try kernel 6
... seems valid
ForwardAuto: kernel 6 0ms
backward try kernel 2
... seems valid
BackwardAuto: kernel 2 0ms
calcGradWeights try kernel 2
... seems valid
BackpropWeightsAuto: kernel 2 0ms
calcGradWeights try kernel 2
... seems valid
BackpropWeightsAuto: kernel 2 0ms
forward try kernel 7
... seems valid
ForwardAuto: kernel 7 99ms
forward try kernel 7
... seems valid
ForwardAuto: kernel 7 44ms
loss 608.021 loss2 586.349 change: 21.672
sumweightsdiff -0.0350289
loss change 21.672
estimatedLossChangeFromW 21.7974
forward kernel 0: cannot be used
forward kernel 1 time: 0ms
forward kernel 2 time: 0ms
forward kernel 3 time: 0ms
forward kernel 4 time: 0ms
forward kernel 5: cannot be used
forward kernel 6 time: 0ms
forward kernel 7 time: 99ms
forward layer selected kernel 1
forward kernel 0: cannot be used
forward kernel 1 time: 0ms
forward kernel 2 time: 0ms
forward kernel 3 time: 0ms
forward kernel 4 time: 0ms
forward kernel 5: cannot be used
forward kernel 6 time: 0ms
forward kernel 7 time: 44ms
forward layer selected kernel 1
backward try kernel 3
... seems valid
BackwardAuto: kernel 3 175ms
calcGradWeights try kernel 3
options: -D gNumInputPlanes=1 -D gInputPlanes=1 -D gInputSize=5 -D gInputSizeSquared=25 -D gNumFilters=1 -D gFilterSize=3 -D gHalfFilterSize=1 -D gFilterSizeSquared=9 -D gNumOutputPlanes=1 -D gOutputPlanes=1 -D gOutputSize=5 -D gOutputSizeSquared=25 -D gPadZeros=1 -D gMargin=1 -D gEven=0 -D gSkip=0 -DgNumStripes=1 -DgInputStripeMarginRows=2 -DgInputStripeInnerNumRows=5 -DgInputStripeOuterNumRows=9 -DgInputStripeInnerSize=25 -DgInputStripeOuterSize=45 -DgInputStripeMarginSize=10 -DgOutputStripeNumRows=5 -DgOutputStripeSize=25
... seems valid
BackpropWeightsAuto: kernel 3 0ms
calcGradWeights try kernel 3
options: -D gNumInputPlanes=1 -D gInputPlanes=1 -D gInputSize=5 -D gInputSizeSquared=25 -D gNumFilters=1 -D gFilterSize=3 -D gHalfFilterSize=1 -D gFilterSizeSquared=9 -D gNumOutputPlanes=1 -D gOutputPlanes=1 -D gOutputSize=5 -D gOutputSizeSquared=25 -D gPadZeros=1 -D gMargin=1 -D gEven=0 -D gSkip=0 -DgNumStripes=1 -DgInputStripeMarginRows=2 -DgInputStripeInnerNumRows=5 -DgInputStripeOuterNumRows=9 -DgInputStripeInnerSize=25 -DgInputStripeOuterSize=45 -DgInputStripeMarginSize=10 -DgOutputStripeNumRows=5 -DgOutputStripeSize=25
... seems valid
BackpropWeightsAuto: kernel 3 0ms
loss 586.349 loss2 565.324 change: 21.025
sumweightsdiff -0.0345262
loss change 21.025
estimatedLossChangeFromW 21.2378
backward kernel 0: cannot be used
backward kernel 1 time: 0ms
backward kernel 2 time: 0ms
backward kernel 3 time: 175ms
backward layer selected kernel 1
calcGradWeights try kernel 4
... seems valid
BackpropWeightsAuto: kernel 4 135ms
calcGradWeights try kernel 4
... seems valid
BackpropWeightsAuto: kernel 4 40ms
loss 565.324 loss2 545.133 change: 20.1916
sumweightsdiff -0.0338754
loss change 20.1916
estimatedLossChangeFromW 20.3956
calcGradWeights kernel 0: cannot be used
calcGradWeights kernel 1 time: 0ms
calcGradWeights kernel 2 time: 0ms
calcGradWeights kernel 3 time: 0ms
calcGradWeights kernel 4 time: 135ms
calcGradWeights layer selected kernel 1
calcGradWeights kernel 0: cannot be used
calcGradWeights kernel 1 time: 0ms
calcGradWeights kernel 2 time: 0ms
calcGradWeights kernel 3 time: 0ms
calcGradWeights kernel 4 time: 40ms
calcGradWeights layer selected kernel 1
loss 545.133 loss2 525.742 change: 19.3912
sumweightsdiff -0.0332378
loss change 19.3912
estimatedLossChangeFromW 19.5872
loss 525.742 loss2 507.119 change: 18.6229
sumweightsdiff -0.0326132
loss change 18.6229
estimatedLossChangeFromW 18.8111
loss 507.119 loss2 489.233 change: 17.8853
sumweightsdiff -0.032001
loss change 17.8853
estimatedLossChangeFromW 18.066
loss 489.233 loss2 472.056 change: 17.1772
sumweightsdiff -0.0314012
loss change 17.1772
estimatedLossChangeFromW 17.3506
loss 472.056 loss2 455.559 change: 16.4975
sumweightsdiff -0.0308135
loss change 16.4975
estimatedLossChangeFromW 16.6639
loss 455.559 loss2 439.714 change: 15.8447
sumweightsdiff -0.0302379
loss change 15.8447
estimatedLossChangeFromW 16.0046
loss 439.714 loss2 424.416 change: 15.2976
sumweightsdiff -0.0296733
loss change 15.2976
estimatedLossChangeFromW 15.3717
loss 424.416 loss2 409.545 change: 14.871
sumweightsdiff -0.0299227
loss change 14.871
estimatedLossChangeFromW 15.0234
loss 409.545 loss2 395.271 change: 14.274
sumweightsdiff -0.0293575
loss change 14.274
estimatedLossChangeFromW 14.4202
loss 395.271 loss2 381.57 change: 13.7013
sumweightsdiff -0.0288033
loss change 13.7013
estimatedLossChangeFromW 13.8415
loss 381.57 loss2 368.418 change: 13.1519
sumweightsdiff -0.0282608
loss change 13.1519
estimatedLossChangeFromW 13.2864
loss 368.418 loss2 355.794 change: 12.6248
sumweightsdiff -0.0277294
loss change 12.6248
estimatedLossChangeFromW 12.7538
loss 355.794 loss2 343.675 change: 12.119
sumweightsdiff -0.027209
loss change 12.119
estimatedLossChangeFromW 12.2429
loss 343.675 loss2 332.041 change: 11.634
sumweightsdiff -0.0266991
loss change 11.634
estimatedLossChangeFromW 11.7526
loss 332.041 loss2 320.872 change: 11.1684
sumweightsdiff -0.0261997
loss change 11.1684
estimatedLossChangeFromW 11.2823
loss 320.872 loss2 310.15 change: 10.7218
sumweightsdiff -0.0257105
loss change 10.7218
estimatedLossChangeFromW 10.8312
clblas teardown
[ OK ] testbackward.checknumerically_imagesize5_filter3_relu (1877 ms)
[ RUN ] testbackward.compare_1_n_kgsgo_32c5
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
-D BIASED -D gNumInputPlanes=32 -D gInputPlanes=32 -D gInputSize=19 -D gInputSizeSquared=361 -D gNumFilters=32 -D gFilterSize=5 -D gHalfFilterSize=2 -D gFilterSizeSquared=25 -D gNumOutputPlanes=32 -D gOutputPlanes=32 -D gOutputSize=19 -D gOutputSizeSquared=361 -D gPadZeros=1 -D gMargin=2 -D gEven=0 -D gSkip=0
batchsize=8 LayerDimensions{ inputPlanes=32 inputSize=19 numFilters=32 filterSize=5 outputSize=19 padZeros=1 biased=1 skip=0}
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
initializing clblas
LayerDimensions{ inputPlanes=32 inputSize=19 numFilters=32 filterSize=5 outputSize=19 padZeros=1 biased=1 skip=0}
output[0]=-0.0308112 -0.0308112 SAME || -0.129603 || -0.048413 || 0.07916 || -0.118675 || 0.0416933 || 0.100887 || -0.106013
output[1]=-0.0574008 -0.0574008 SAME || 0.099984 || 0.0155394 || 0.00411644 || 0.131031 || -0.0107744 || 0.121347 || 0.0437087
output[2]=-0.0227139 -0.0227139 SAME || -0.0115189 || -0.190989 || -0.0445787 || -0.013341 || -0.04953 || -0.109186 || 0.104814
output[3]=-0.0805896 -0.0805896 SAME || 0.0216207 || -0.128649 || -0.0159031 || 0.0534839 || 0.0301581 || 0.104269 || -0.0841106
output[4]=-0.0723994 -0.0723994 SAME || -0.0164838 || -0.00649171 || -0.042007 || 0.147102 || -0.0702085 || -0.0120931 || 0.0597854
output[5]=0.130336 0.130336 SAME || -0.0816751 || -0.272227 || 0.0707071 || 0.133967 || 0.0323092 || 0.124248 || -0.0138626
output[6]=-0.00415662 -0.00415662 SAME || -0.0920411 || 0.0352436 || 0.0541946 || 0.00491123 || -0.0805987 || 0.0834764 || 0.0631893
output[7]=-0.0915931 -0.0915931 SAME || -0.0358497 || 0.0445722 || -0.0472172 || 0.0778742 || -0.0550363 || -0.179262 || -0.0812755
output[8]=0.0556533 0.0556533 SAME || -0.0684331 || -0.0243033 || -0.0822076 || -0.0104788 || -0.043145 || -0.0481164 || 0.0538944
output[9]=-0.0725742 -0.0725742 SAME || 0.0486592 || -0.0286811 || -0.0249626 || 0.0394469 || -0.144496 || 0.0909432 || -0.0152857
output[10]=-0.0153476 -0.0153476 SAME || -0.0677297 || -0.140709 || -0.0161164 || 0.131645 || 0.0545684 || -0.0210541 || 0.0611338
output[11]=-0.0212713 -0.0212713 SAME || 0.100494 || 0.2122 || -0.0812487 || 0.0532493 || -0.0183774 || -0.0937923 || -0.069912
output[12]=0.0389741 0.0389741 SAME || 0.0809882 || 0.0370538 || 0.0241565 || -0.0582968 || 0.0437625 || 0.139931 || -0.065007
output[13]=0.0349705 0.0349705 SAME || -0.0251775 || -0.0759114 || 0.0945214 || 0.00389841 || -0.0377205 || 0.17624 || -0.114476
output[14]=0.0366689 0.0366689 SAME || -0.0348694 || -0.0581568 || 0.0376178 || -0.0298947 || -0.0299259 || -0.0913825 || -0.0745193
output[15]=0.0186965 0.0186965 SAME || 0.0281147 || 0.00937999 || 0.108983 || -0.0505074 || -0.0573388 || 0.067382 || 0.0387854
output[16]=0.0658136 0.0658136 SAME || -0.0412163 || -0.128719 || 0.150029 || 0.0555238 || -0.0203267 || -0.0795422 || -0.123847
output[17]=0.0705919 0.0705919 SAME || 0.147334 || 0.151016 || -0.0122364 || 0.0360484 || -0.0609187 || 0.0166715 || -0.141399
output[18]=-0.0508929 -0.0508929 SAME || 0.0131358 || -0.0101773 || -0.120741 || -0.00821514 || 0.00894922 || -0.117651 || 0.0631629
output[19]=-0.0110406 -0.0110406 SAME || 0.189081 || 0.0665268 || 0.0622702 || 0.151629 || -0.0172241 || -0.0215623 || 0.0457666
clblas teardown
instance 2
batchsize=8 LayerDimensions{ inputPlanes=32 inputSize=19 numFilters=32 filterSize=5 outputSize=19 padZeros=1 biased=1 skip=0}
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
initializing clblas
LayerDimensions{ inputPlanes=32 inputSize=19 numFilters=32 filterSize=5 outputSize=19 padZeros=1 biased=1 skip=0}
output[0]=-0.0308112 -0.0308112 SAME || -0.129603 || -0.048413 || 0.07916 || -0.118675 || 0.0416933 || 0.100887 || -0.106013
output[1]=-0.0574008 -0.0574008 SAME || 0.099984 || 0.0155394 || 0.00411644 || 0.131031 || -0.0107744 || 0.121347 || 0.0437087
output[2]=-0.0227139 -0.0227139 SAME || -0.0115189 || -0.190989 || -0.0445787 || -0.013341 || -0.04953 || -0.109186 || 0.104814
output[3]=-0.0805896 -0.0805896 SAME || 0.0216207 || -0.128649 || -0.0159031 || 0.0534839 || 0.0301581 || 0.104269 || -0.0841106
output[4]=-0.0723994 -0.0723994 SAME || -0.0164838 || -0.00649171 || -0.042007 || 0.147102 || -0.0702085 || -0.0120931 || 0.0597854
output[5]=0.130336 0.130336 SAME || -0.0816751 || -0.272227 || 0.0707071 || 0.133967 || 0.0323092 || 0.124248 || -0.0138626
output[6]=-0.00415662 -0.00415662 SAME || -0.0920411 || 0.0352436 || 0.0541946 || 0.00491123 || -0.0805987 || 0.0834764 || 0.0631893
output[7]=-0.0915931 -0.0915931 SAME || -0.0358497 || 0.0445722 || -0.0472172 || 0.0778742 || -0.0550363 || -0.179262 || -0.0812755
output[8]=0.0556533 0.0556533 SAME || -0.0684331 || -0.0243033 || -0.0822076 || -0.0104788 || -0.043145 || -0.0481164 || 0.0538944
output[9]=-0.0725742 -0.0725742 SAME || 0.0486592 || -0.0286811 || -0.0249626 || 0.0394469 || -0.144496 || 0.0909432 || -0.0152857
output[10]=-0.0153476 -0.0153476 SAME || -0.0677297 || -0.140709 || -0.0161164 || 0.131645 || 0.0545684 || -0.0210541 || 0.0611338
output[11]=-0.0212713 -0.0212713 SAME || 0.100494 || 0.2122 || -0.0812487 || 0.0532493 || -0.0183774 || -0.0937923 || -0.069912
output[12]=0.0389741 0.0389741 SAME || 0.0809882 || 0.0370538 || 0.0241565 || -0.0582968 || 0.0437625 || 0.139931 || -0.065007
output[13]=0.0349705 0.0349705 SAME || -0.0251775 || -0.0759114 || 0.0945214 || 0.00389841 || -0.0377205 || 0.17624 || -0.114476
output[14]=0.0366689 0.0366689 SAME || -0.0348694 || -0.0581568 || 0.0376178 || -0.0298947 || -0.0299259 || -0.0913825 || -0.0745193
output[15]=0.0186965 0.0186965 SAME || 0.0281147 || 0.00937999 || 0.108983 || -0.0505074 || -0.0573388 || 0.067382 || 0.0387854
output[16]=0.0658136 0.0658136 SAME || -0.0412163 || -0.128719 || 0.150029 || 0.0555238 || -0.0203267 || -0.0795422 || -0.123847
output[17]=0.0705919 0.0705919 SAME || 0.147334 || 0.151016 || -0.0122364 || 0.0360484 || -0.0609187 || 0.0166715 || -0.141399
output[18]=-0.0508929 -0.0508929 SAME || 0.0131358 || -0.0101773 || -0.120741 || -0.00821514 || 0.00894922 || -0.117651 || 0.0631629
output[19]=-0.0110406 -0.0110406 SAME || 0.189081 || 0.0665268 || 0.0622702 || 0.151629 || -0.0172241 || -0.0215623 || 0.0457666
clblas teardown
instance 3
batchsize=8 LayerDimensions{ inputPlanes=32 inputSize=19 numFilters=32 filterSize=5 outputSize=19 padZeros=1 biased=1 skip=0}
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
initializing clblas
LayerDimensions{ inputPlanes=32 inputSize=19 numFilters=32 filterSize=5 outputSize=19 padZeros=1 biased=1 skip=0}
output[0]=-0.0308112 -0.0308112 SAME || -0.129603 || -0.048413 || 0.0791599 || -0.118675 || 0.0416933 || 0.100887 || -0.106013
output[1]=-0.0574008 -0.0574008 SAME || 0.0999841 || 0.0155394 || 0.00411648 || 0.131031 || -0.0107744 || 0.121347 || 0.0437087
output[2]=-0.0227139 -0.0227139 SAME || -0.0115189 || -0.190989 || -0.0445787 || -0.013341 || -0.0495299 || -0.109186 || 0.104814
output[3]=-0.0805896 -0.0805895 SAME || 0.0216206 || -0.128649 || -0.0159031 || 0.053484 || 0.0301581 || 0.104269 || -0.0841106
output[4]=-0.0723994 -0.0723994 SAME || -0.0164838 || -0.0064917 || -0.042007 || 0.147102 || -0.0702085 || -0.0120931 || 0.0597853
output[5]=0.130336 0.130336 SAME || -0.0816751 || -0.272227 || 0.0707071 || 0.133967 || 0.0323092 || 0.124248 || -0.0138626
output[6]=-0.00415662 -0.00415662 SAME || -0.0920411 || 0.0352436 || 0.0541946 || 0.00491123 || -0.0805988 || 0.0834764 || 0.0631893
output[7]=-0.0915931 -0.0915931 SAME || -0.0358497 || 0.0445723 || -0.0472172 || 0.0778742 || -0.0550363 || -0.179262 || -0.0812755
output[8]=0.0556533 0.0556533 SAME || -0.0684331 || -0.0243033 || -0.0822077 || -0.0104788 || -0.043145 || -0.0481164 || 0.0538944
output[9]=-0.0725742 -0.0725742 SAME || 0.0486591 || -0.0286811 || -0.0249626 || 0.0394469 || -0.144496 || 0.0909431 || -0.0152858
output[10]=-0.0153476 -0.0153476 SAME || -0.0677297 || -0.140709 || -0.0161163 || 0.131645 || 0.0545684 || -0.0210541 || 0.0611338
output[11]=-0.0212713 -0.0212713 SAME || 0.100494 || 0.2122 || -0.0812488 || 0.0532493 || -0.0183774 || -0.0937924 || -0.069912
output[12]=0.0389741 0.0389741 SAME || 0.0809881 || 0.0370537 || 0.0241565 || -0.0582967 || 0.0437625 || 0.139931 || -0.0650069
output[13]=0.0349705 0.0349705 SAME || -0.0251774 || -0.0759114 || 0.0945214 || 0.00389844 || -0.0377205 || 0.17624 || -0.114476
output[14]=0.0366689 0.0366688 SAME || -0.0348695 || -0.0581568 || 0.0376178 || -0.0298947 || -0.0299259 || -0.0913827 || -0.0745193
output[15]=0.0186965 0.0186966 SAME || 0.0281147 || 0.00938 || 0.108983 || -0.0505074 || -0.0573388 || 0.067382 || 0.0387854
output[16]=0.0658136 0.0658136 SAME || -0.0412163 || -0.128719 || 0.150029 || 0.0555237 || -0.0203267 || -0.0795422 || -0.123847
output[17]=0.0705919 0.0705919 SAME || 0.147334 || 0.151016 || -0.0122364 || 0.0360484 || -0.0609187 || 0.0166715 || -0.141399
output[18]=-0.0508929 -0.0508929 SAME || 0.0131358 || -0.0101772 || -0.120741 || -0.00821514 || 0.00894924 || -0.117651 || 0.0631629
output[19]=-0.0110406 -0.0110407 SAME || 0.189081 || 0.0665268 || 0.0622703 || 0.151629 || -0.0172242 || -0.0215623 || 0.0457666
clblas teardown
[ OK ] testbackward.compare_1_n_kgsgo_32c5 (964 ms)
[----------] 12 tests from testbackward (5653 ms total)
[----------] 5 tests from testsinglebatch
[ RUN ] testsinglebatch.imagesize5_filtersize3_batchsize2
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
initializing clblas
layer 0:InputLayer{ outputPlanes=1 outputSize=5 }
layer 1:ConvolutionalLayer{ LayerDimensions{ inputPlanes=1 inputSize=5 numFilters=5 filterSize=3 outputSize=3 padZeros=0 biased=1 skip=0} }
layer 2:ActivationLayer{ LINEAR }
layer 3:FullyConnectedLayer{ numPlanes=5 imageSize=1 }
layer 4:ActivationLayer{ TANH }
layer 5:SquareLossLayer{}
Parameters overview: (skipping 4 layers with 0 params)
layer 1: params=50 17.9%
layer 3: params=230 82.1%
TOTAL : params=280
weightsTotalSize=280
forward try kernel 0
... not plausibly optimal, skipping
forward try kernel 1
... seems valid
ForwardAuto: kernel 1 0ms
forward try kernel 0
... not plausibly optimal, skipping
forward try kernel 1
... seems valid
ForwardAuto: kernel 1 0ms
backward try kernel 0
... not plausibly optimal, skipping
backward try kernel 1
... seems valid
BackwardAuto: kernel 1 0ms
calcGradWeights try kernel 0
... not plausibly optimal, skipping
calcGradWeights try kernel 1
... seems valid
BackpropWeightsAuto: kernel 1 0ms
calcGradWeights try kernel 0
... not plausibly optimal, skipping
calcGradWeights try kernel 1
... seems valid
BackpropWeightsAuto: kernel 1 0ms
forward try kernel 2
... seems valid
ForwardAuto: kernel 2 0ms
forward try kernel 2
... seems valid
ForwardAuto: kernel 2 0ms
forward try kernel 3
... seems valid
ForwardAuto: kernel 3 0ms
forward try kernel 3
... seems valid
ForwardAuto: kernel 3 0ms
backward try kernel 2
... seems valid
BackwardAuto: kernel 2 0ms
calcGradWeights try kernel 2
... seems valid
BackpropWeightsAuto: kernel 2 0ms
calcGradWeights try kernel 2
... seems valid
BackpropWeightsAuto: kernel 2 0ms
forward try kernel 4
... seems valid
ForwardAuto: kernel 4 0ms
forward try kernel 4
... seems valid
ForwardAuto: kernel 4 0ms
forward try kernel 5
ForwardAuto: kernel 5: this instance cant be used: For ForwardFc, filtersize and inputimagesize must be identical
... not valid
forward try kernel 6
... seems valid
ForwardAuto: kernel 6 0ms
forward try kernel 5
... seems valid
ForwardAuto: kernel 5 0ms
backward try kernel 3
... seems valid
BackwardAuto: kernel 3 160ms
calcGradWeights try kernel 3
options: -D BIASED -D gNumInputPlanes=5 -D gInputPlanes=5 -D gInputSize=3 -D gInputSizeSquared=9 -D gNumFilters=5 -D gFilterSize=3 -D gHalfFilterSize=1 -D gFilterSizeSquared=9 -D gNumOutputPlanes=5 -D gOutputPlanes=5 -D gOutputSize=1 -D gOutputSizeSquared=1 -D gPadZeros=0 -D gMargin=0 -D gEven=0 -D gSkip=0 -DgNumStripes=1 -DgInputStripeMarginRows=2 -DgInputStripeInnerNumRows=3 -DgInputStripeOuterNumRows=7 -DgInputStripeInnerSize=9 -DgInputStripeOuterSize=21 -DgInputStripeMarginSize=6 -DgOutputStripeNumRows=1 -DgOutputStripeSize=1
... seems valid
BackpropWeightsAuto: kernel 3 0ms
calcGradWeights try kernel 3
options: -D BIASED -D gNumInputPlanes=1 -D gInputPlanes=1 -D gInputSize=5 -D gInputSizeSquared=25 -D gNumFilters=5 -D gFilterSize=3 -D gHalfFilterSize=1 -D gFilterSizeSquared=9 -D gNumOutputPlanes=5 -D gOutputPlanes=5 -D gOutputSize=3 -D gOutputSizeSquared=9 -D gPadZeros=0 -D gMargin=0 -D gEven=0 -D gSkip=0 -DgNumStripes=1 -DgInputStripeMarginRows=2 -DgInputStripeInnerNumRows=5 -DgInputStripeOuterNumRows=9 -DgInputStripeInnerSize=25 -DgInputStripeOuterSize=45 -DgInputStripeMarginSize=10 -DgOutputStripeNumRows=3 -DgOutputStripeSize=9
... seems valid
BackpropWeightsAuto: kernel 3 0ms
forward try kernel 7
... seems valid
ForwardAuto: kernel 7 239ms
forward try kernel 6
... seems valid
ForwardAuto: kernel 6 0ms
forward kernel 0: cannot be used
forward kernel 1 time: 0ms
forward kernel 2 time: 0ms
forward kernel 3 time: 0ms
forward kernel 4 time: 0ms
forward kernel 5: cannot be used
forward kernel 6 time: 0ms
forward kernel 7 time: 239ms
forward layer selected kernel 1
forward try kernel 7
... seems valid
ForwardAuto: kernel 7 136ms
backward kernel 0: cannot be used
backward kernel 1 time: 0ms
backward kernel 2 time: 0ms
backward kernel 3 time: 160ms
backward layer selected kernel 1
calcGradWeights try kernel 4
... seems valid
BackpropWeightsAuto: kernel 4 316ms
calcGradWeights try kernel 4
... seems valid
BackpropWeightsAuto: kernel 4 402ms
forward kernel 0: cannot be used
forward kernel 1 time: 0ms
forward kernel 2 time: 0ms
forward kernel 3 time: 0ms
forward kernel 4 time: 0ms
forward kernel 5 time: 0ms
forward kernel 6 time: 0ms
forward kernel 7 time: 136ms
forward layer selected kernel 1
calcGradWeights kernel 0: cannot be used
calcGradWeights kernel 1 time: 0ms
calcGradWeights kernel 2 time: 0ms
calcGradWeights kernel 3 time: 0ms
calcGradWeights kernel 4 time: 316ms
calcGradWeights layer selected kernel 1
calcGradWeights kernel 0: cannot be used
calcGradWeights kernel 1 time: 0ms
calcGradWeights kernel 2 time: 0ms
calcGradWeights kernel 3 time: 0ms
calcGradWeights kernel 4 time: 402ms
calcGradWeights layer selected kernel 1
batch time 2352 ms
dump enabled=0
clblas teardown
[ OK ] testsinglebatch.imagesize5_filtersize3_batchsize2 (2617 ms)
[ RUN ] testsinglebatch.imagesize28
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
initializing clblas
layer 0:InputLayer{ outputPlanes=1 outputSize=28 }
layer 1:ConvolutionalLayer{ LayerDimensions{ inputPlanes=1 inputSize=28 numFilters=10 filterSize=3 outputSize=26 padZeros=0 biased=1 skip=0} }
layer 2:ActivationLayer{ RELU }
layer 3:FullyConnectedLayer{ numPlanes=10 imageSize=1 }
layer 4:ActivationLayer{ TANH }
layer 5:SquareLossLayer{}
Parameters overview: (skipping 4 layers with 0 params)
layer 1: params=100 0.1%
layer 3: params=67610 99.9%
TOTAL : params=67710
weightsTotalSize=67710
forward try kernel 0
... not plausibly optimal, skipping
forward try kernel 1
... seems valid
ForwardAuto: kernel 1 1ms
forward try kernel 0
... not plausibly optimal, skipping
forward try kernel 1
... seems valid
ForwardAuto: kernel 1 2ms
backward try kernel 0
... not plausibly optimal, skipping
backward try kernel 1
... seems valid
BackwardAuto: kernel 1 1ms
calcGradWeights try kernel 0
... not plausibly optimal, skipping
calcGradWeights try kernel 1
... seems valid
BackpropWeightsAuto: kernel 1 1ms
calcGradWeights try kernel 0
... not plausibly optimal, skipping
calcGradWeights try kernel 1
... seems valid
BackpropWeightsAuto: kernel 1 1ms
forward try kernel 2
ForwardAuto: kernel 2: this instance cant be used: cannot use forward2, since outputimagesize * outputimagesize > maxworkgroupsize
... not valid
forward try kernel 3
ForwardAuto: kernel 3: this instance cant be used: cannot use forward3, since outputimagesize * outputimagesize > maxworkgroupsize
... not valid
forward try kernel 4
... seems valid
ForwardAuto: kernel 4 0ms
forward try kernel 2
... seems valid
ForwardAuto: kernel 2 2ms
forward try kernel 5
ForwardAuto: kernel 5: this instance cant be used: For ForwardFc, filtersize and inputimagesize must be identical
... not valid
forward try kernel 6
... seems valid
ForwardAuto: kernel 6 0ms
forward try kernel 3
... seems valid
ForwardAuto: kernel 3 3ms
backward try kernel 2
BackwardAuto: kernel 2: this instance cant be used: cannot use BackwardGpuCached, since inputSize * inputSize > maxworkgroupsize
... not valid
backward try kernel 3
... seems valid
BackwardAuto: kernel 3 137ms
calcGradWeights try kernel 2
BackpropWeightsAuto: kernel 2: this instance cant be used: cannot use BackpropWeightsScratch, since filterSize * filterSize > maxworkgroupsize
... not valid
calcGradWeights try kernel 3
BackpropWeightsAuto: kernel 3: this instance cant be used: cannot use BackpropWeightsScratchLarge, since filterSize * filterSize > maxworkgroupsize
... not valid
calcGradWeights try kernel 4
... seems valid
BackpropWeightsAuto: kernel 4 316ms
calcGradWeights try kernel 2
... seems valid
BackpropWeightsAuto: kernel 2 0ms
forward try kernel 7
... seems valid
ForwardAuto: kernel 7 231ms
forward try kernel 4
... seems valid
ForwardAuto: kernel 4 2ms
forward kernel 0: cannot be used
forward kernel 1 time: 1ms
forward kernel 2: cannot be used
forward kernel 3: cannot be used
forward kernel 4 time: 0ms
forward kernel 5: cannot be used
forward kernel 6 time: 0ms
forward kernel 7 time: 231ms
forward layer selected kernel 4
forward try kernel 5
... seems valid
ForwardAuto: kernel 5 0ms
backward kernel 0: cannot be used
backward kernel 1 time: 1ms
backward kernel 2: cannot be used
backward kernel 3 time: 137ms
backward layer selected kernel 1
calcGradWeights kernel 0: cannot be used
calcGradWeights kernel 1 time: 1ms
calcGradWeights kernel 2: cannot be used
calcGradWeights kernel 3: cannot be used
calcGradWeights kernel 4 time: 316ms
calcGradWeights layer selected kernel 1
calcGradWeights try kernel 3
options: -D BIASED -D gNumInputPlanes=1 -D gInputPlanes=1 -D gInputSize=28 -D gInputSizeSquared=784 -D gNumFilters=10 -D gFilterSize=3 -D gHalfFilterSize=1 -D gFilterSizeSquared=9 -D gNumOutputPlanes=10 -D gOutputPlanes=10 -D gOutputSize=26 -D gOutputSizeSquared=676 -D gPadZeros=0 -D gMargin=0 -D gEven=0 -D gSkip=0 -DgNumStripes=1 -DgInputStripeMarginRows=2 -DgInputStripeInnerNumRows=28 -DgInputStripeOuterNumRows=32 -DgInputStripeInnerSize=784 -DgInputStripeOuterSize=896 -DgInputStripeMarginSize=56 -DgOutputStripeNumRows=26 -DgOutputStripeSize=676
... seems valid
BackpropWeightsAuto: kernel 3 0ms
forward try kernel 6
... seems valid
ForwardAuto: kernel 6 1ms
forward try kernel 7
... seems valid
ForwardAuto: kernel 7 122ms
calcGradWeights try kernel 4
... seems valid
BackpropWeightsAuto: kernel 4 327ms
forward kernel 0: cannot be used
forward kernel 1 time: 2ms
forward kernel 2 time: 2ms
forward kernel 3 time: 3ms
forward kernel 4 time: 2ms
forward kernel 5 time: 0ms
forward kernel 6 time: 1ms
forward kernel 7 time: 122ms
forward layer selected kernel 5
calcGradWeights kernel 0: cannot be used
calcGradWeights kernel 1 time: 1ms
calcGradWeights kernel 2 time: 0ms
calcGradWeights kernel 3 time: 0ms
calcGradWeights kernel 4 time: 327ms
calcGradWeights layer selected kernel 2
batch time 3039 ms
dump enabled=0
clblas teardown
[ OK ] testsinglebatch.imagesize28 (3289 ms)
[ RUN ] testsinglebatch.imagesize28_filtersize5
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
initializing clblas
layer 0:InputLayer{ outputPlanes=1 outputSize=28 }
layer 1:ConvolutionalLayer{ LayerDimensions{ inputPlanes=1 inputSize=28 numFilters=10 filterSize=5 outputSize=24 padZeros=0 biased=1 skip=0} }
layer 2:ActivationLayer{ RELU }
layer 3:FullyConnectedLayer{ numPlanes=10 imageSize=1 }
layer 4:ActivationLayer{ TANH }
layer 5:SquareLossLayer{}
Parameters overview: (skipping 4 layers with 0 params)
layer 1: params=260 0.4%
layer 3: params=57610 99.6%
TOTAL : params=57870
weightsTotalSize=57870
forward try kernel 0
... not plausibly optimal, skipping
forward try kernel 1
... seems valid
ForwardAuto: kernel 1 1ms
forward try kernel 0
... not plausibly optimal, skipping
forward try kernel 1
... seems valid
ForwardAuto: kernel 1 2ms
backward try kernel 0
... not plausibly optimal, skipping
backward try kernel 1
... seems valid
BackwardAuto: kernel 1 1ms
calcGradWeights try kernel 0
... not plausibly optimal, skipping
calcGradWeights try kernel 1
... seems valid
BackpropWeightsAuto: kernel 1 1ms
calcGradWeights try kernel 0
... not plausibly optimal, skipping
calcGradWeights try kernel 1
... seems valid
BackpropWeightsAuto: kernel 1 2ms
forward try kernel 2
ForwardAuto: kernel 2: this instance cant be used: cannot use forward2, since outputimagesize * outputimagesize > maxworkgroupsize
... not valid
forward try kernel 3
ForwardAuto: kernel 3: this instance cant be used: cannot use forward3, since outputimagesize * outputimagesize > maxworkgroupsize
... not valid
forward try kernel 4
... seems valid
ForwardAuto: kernel 4 0ms
forward try kernel 2
... seems valid
ForwardAuto: kernel 2 2ms
forward try kernel 5
ForwardAuto: kernel 5: this instance cant be used: For ForwardFc, filtersize and inputimagesize must be identical
... not valid
forward try kernel 6
... seems valid
ForwardAuto: kernel 6 0ms
forward try kernel 3
... seems valid
ForwardAuto: kernel 3 4ms
backward try kernel 2
BackwardAuto: kernel 2: this instance cant be used: cannot use BackwardGpuCached, since inputSize * inputSize > maxworkgroupsize
... not valid
backward try kernel 3
... seems valid
BackwardAuto: kernel 3 140ms
calcGradWeights try kernel 2
BackpropWeightsAuto: kernel 2: this instance cant be used: cannot use BackpropWeightsScratch, since filterSize * filterSize > maxworkgroupsize
... not valid
calcGradWeights try kernel 3
BackpropWeightsAuto: kernel 3: this instance cant be used: cannot use BackpropWeightsScratchLarge, since filterSize * filterSize > maxworkgroupsize
... not valid
calcGradWeights try kernel 4
... seems valid
BackpropWeightsAuto: kernel 4 308ms
calcGradWeights try kernel 2
... seems valid
BackpropWeightsAuto: kernel 2 1ms
forward try kernel 7
... seems valid
ForwardAuto: kernel 7 242ms
forward try kernel 4
... seems valid
ForwardAuto: kernel 4 2ms
forward kernel 0: cannot be used
forward kernel 1 time: 1ms
forward kernel 2: cannot be used
forward kernel 3: cannot be used
forward kernel 4 time: 0ms
forward kernel 5: cannot be used
forward kernel 6 time: 0ms
forward kernel 7 time: 242ms
forward layer selected kernel 4
forward try kernel 5
... seems valid
ForwardAuto: kernel 5 1ms
backward kernel 0: cannot be used
backward kernel 1 time: 1ms
backward kernel 2: cannot be used
backward kernel 3 time: 140ms
backward layer selected kernel 1
calcGradWeights kernel 0: cannot be used
calcGradWeights kernel 1 time: 1ms
calcGradWeights kernel 2: cannot be used
calcGradWeights kernel 3: cannot be used
calcGradWeights kernel 4 time: 308ms
calcGradWeights layer selected kernel 1
calcGradWeights try kernel 3
options: -D BIASED -D gNumInputPlanes=1 -D gInputPlanes=1 -D gInputSize=28 -D gInputSizeSquared=784 -D gNumFilters=10 -D gFilterSize=5 -D gHalfFilterSize=2 -D gFilterSizeSquared=25 -D gNumOutputPlanes=10 -D gOutputPlanes=10 -D gOutputSize=24 -D gOutputSizeSquared=576 -D gPadZeros=0 -D gMargin=0 -D gEven=0 -D gSkip=0 -DgNumStripes=1 -DgInputStripeMarginRows=4 -DgInputStripeInnerNumRows=28 -DgInputStripeOuterNumRows=36 -DgInputStripeInnerSize=784 -DgInputStripeOuterSize=1008 -DgInputStripeMarginSize=112 -DgOutputStripeNumRows=24 -DgOutputStripeSize=576
... seems valid
BackpropWeightsAuto: kernel 3 1ms
forward try kernel 6
... seems valid
ForwardAuto: kernel 6 4ms
forward try kernel 7
... seems valid
ForwardAuto: kernel 7 127ms
calcGradWeights try kernel 4
... seems valid
BackpropWeightsAuto: kernel 4 300ms
forward kernel 0: cannot be used
forward kernel 1 time: 2ms
forward kernel 2 time: 2ms
forward kernel 3 time: 4ms
forward kernel 4 time: 2ms
forward kernel 5 time: 1ms
forward kernel 6 time: 4ms
forward kernel 7 time: 127ms
forward layer selected kernel 5
calcGradWeights kernel 0: cannot be used
calcGradWeights kernel 1 time: 2ms
calcGradWeights kernel 2 time: 1ms
calcGradWeights kernel 3 time: 1ms
calcGradWeights kernel 4 time: 300ms
calcGradWeights layer selected kernel 2
batch time 2752 ms
dump enabled=0
clblas teardown
[ OK ] testsinglebatch.imagesize28_filtersize5 (3025 ms)
[ RUN ] testsinglebatch.imagesize5_filtersize3_batchsize2_softmax
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
initializing clblas
layer 0:InputLayer{ outputPlanes=1 outputSize=5 }
layer 1:ConvolutionalLayer{ LayerDimensions{ inputPlanes=1 inputSize=5 numFilters=5 filterSize=3 outputSize=5 padZeros=1 biased=1 skip=0} }
layer 2:ActivationLayer{ RELU }
layer 3:ConvolutionalLayer{ LayerDimensions{ inputPlanes=5 inputSize=5 numFilters=5 filterSize=3 outputSize=5 padZeros=1 biased=1 skip=0} }
layer 4:ActivationLayer{ RELU }
layer 5:FullyConnectedLayer{ numPlanes=5 imageSize=1 }
layer 6:SoftMaxLayer{ perPlane=0 numPlanes=5 imageSize=1 }
Parameters overview: (skipping 4 layers with 0 params)
layer 1: params=50 5.5%
layer 3: params=230 25.3%
layer 5: params=630 69.2%
TOTAL : params=910
forward try kernel 0
... not plausibly optimal, skipping
forward try kernel 1
... seems valid
ForwardAuto: kernel 1 0ms
forward try kernel 0
... not plausibly optimal, skipping
forward try kernel 1
... seems valid
ForwardAuto: kernel 1 0ms
forward try kernel 0
... not plausibly optimal, skipping
forward try kernel 1
... seems valid
ForwardAuto: kernel 1 0ms
forward try kernel 2
... seems valid
ForwardAuto: kernel 2 0ms
forward try kernel 2
... seems valid
ForwardAuto: kernel 2 0ms
forward try kernel 2
... seems valid
ForwardAuto: kernel 2 0ms
backward try kernel 0
... not plausibly optimal, skipping
backward try kernel 1
... seems valid
BackwardAuto: kernel 1 0ms
calcGradWeights try kernel 0
... not plausibly optimal, skipping
calcGradWeights try kernel 1
... seems valid
BackpropWeightsAuto: kernel 1 0ms
backward try kernel 0
... not plausibly optimal, skipping
backward try kernel 1
... seems valid
BackwardAuto: kernel 1 0ms
calcGradWeights try kernel 0
... not plausibly optimal, skipping
calcGradWeights try kernel 1
... seems valid
BackpropWeightsAuto: kernel 1 0ms
calcGradWeights try kernel 0
... not plausibly optimal, skipping
calcGradWeights try kernel 1
... seems valid
BackpropWeightsAuto: kernel 1 0ms
layer 1 offset: 0
forward try kernel 3
... seems valid
ForwardAuto: kernel 3 0ms
forward try kernel 3
... seems valid
ForwardAuto: kernel 3 0ms
forward try kernel 3
... seems valid
ForwardAuto: kernel 3 0ms
layer 1
from w: 0
actual: -3.14851
layer 2 offset: 50
layer 3 offset: 50
forward try kernel 4
... seems valid
ForwardAuto: kernel 4 0ms
forward try kernel 4
... seems valid
ForwardAuto: kernel 4 0ms
forward try kernel 4
... seems valid
ForwardAuto: kernel 4 0ms
layer 3
from w: 0
actual: -3.14851
layer 4 offset: 280
layer 5 offset: 280
forward try kernel 5
ForwardAuto: kernel 5: this instance cant be used: For ForwardFc, filtersize and inputimagesize must be identical
... not valid
forward try kernel 6
... seems valid
ForwardAuto: kernel 6 0ms
forward try kernel 5
ForwardAuto: kernel 5: this instance cant be used: For ForwardFc, filtersize and inputimagesize must be identical
... not valid
forward try kernel 6
... seems valid
ForwardAuto: kernel 6 0ms
forward try kernel 5
... seems valid
ForwardAuto: kernel 5 0ms
layer 5
from w: 0
actual: -3.14851
layer 6 offset: 910
forward try kernel 7
... seems valid
ForwardAuto: kernel 7 245ms
forward try kernel 7
... seems valid
ForwardAuto: kernel 7 42ms
forward try kernel 6
... seems valid
ForwardAuto: kernel 6 0ms
full thisloss: 3.14851
forward kernel 0: cannot be used
forward kernel 1 time: 0ms
forward kernel 2 time: 0ms
forward kernel 3 time: 0ms
forward kernel 4 time: 0ms
forward kernel 5: cannot be used
forward kernel 6 time: 0ms
forward kernel 7 time: 245ms
forward layer selected kernel 1
forward kernel 0: cannot be used
forward kernel 1 time: 0ms
forward kernel 2 time: 0ms
forward kernel 3 time: 0ms
forward kernel 4 time: 0ms
forward kernel 5: cannot be used
forward kernel 6 time: 0ms
forward kernel 7 time: 42ms
forward layer selected kernel 1
forward try kernel 7
... seems valid
ForwardAuto: kernel 7 139ms
backward try kernel 2
... seems valid
BackwardAuto: kernel 2 0ms
calcGradWeights try kernel 2
... seems valid
BackpropWeightsAuto: kernel 2 0ms
backward try kernel 2
... seems valid
BackwardAuto: kernel 2 0ms
calcGradWeights try kernel 2
... seems valid
BackpropWeightsAuto: kernel 2 0ms
calcGradWeights try kernel 2
... seems valid
BackpropWeightsAuto: kernel 2 0ms
layer 1 offset: 0
forward kernel 0: cannot be used
forward kernel 1 time: 0ms
forward kernel 2 time: 0ms
forward kernel 3 time: 0ms
forward kernel 4 time: 0ms
forward kernel 5 time: 0ms
forward kernel 6 time: 0ms
forward kernel 7 time: 139ms
forward layer selected kernel 1
layer 1
from w: 0
actual: 0
layer 2 offset: 50
layer 3 offset: 50
layer 3
from w: 0
actual: 0
layer 4 offset: 280
layer 5 offset: 280
layer 5
from w: 0
actual: 0
layer 6 offset: 910
full thisloss: 3.14851
backward try kernel 3
... seems valid
BackwardAuto: kernel 3 162ms
calcGradWeights try kernel 3
options: -D BIASED -D gNumInputPlanes=5 -D gInputPlanes=5 -D gInputSize=5 -D gInputSizeSquared=25 -D gNumFilters=5 -D gFilterSize=5 -D gHalfFilterSize=2 -D gFilterSizeSquared=25 -D gNumOutputPlanes=5 -D gOutputPlanes=5 -D gOutputSize=1 -D gOutputSizeSquared=1 -D gPadZeros=0 -D gMargin=0 -D gEven=0 -D gSkip=0 -DgNumStripes=1 -DgInputStripeMarginRows=4 -DgInputStripeInnerNumRows=5 -DgInputStripeOuterNumRows=13 -DgInputStripeInnerSize=25 -DgInputStripeOuterSize=65 -DgInputStripeMarginSize=20 -DgOutputStripeNumRows=1 -DgOutputStripeSize=1
... seems valid
BackpropWeightsAuto: kernel 3 0ms
backward try kernel 3
... seems valid
BackwardAuto: kernel 3 397ms
calcGradWeights try kernel 3
options: -D BIASED -D gNumInputPlanes=5 -D gInputPlanes=5 -D gInputSize=5 -D gInputSizeSquared=25 -D gNumFilters=5 -D gFilterSize=3 -D gHalfFilterSize=1 -D gFilterSizeSquared=9 -D gNumOutputPlanes=5 -D gOutputPlanes=5 -D gOutputSize=5 -D gOutputSizeSquared=25 -D gPadZeros=1 -D gMargin=1 -D gEven=0 -D gSkip=0 -DgNumStripes=1 -DgInputStripeMarginRows=2 -DgInputStripeInnerNumRows=5 -DgInputStripeOuterNumRows=9 -DgInputStripeInnerSize=25 -DgInputStripeOuterSize=45 -DgInputStripeMarginSize=10 -DgOutputStripeNumRows=5 -DgOutputStripeSize=25
... seems valid
BackpropWeightsAuto: kernel 3 0ms
calcGradWeights try kernel 3
options: -D BIASED -D gNumInputPlanes=1 -D gInputPlanes=1 -D gInputSize=5 -D gInputSizeSquared=25 -D gNumFilters=5 -D gFilterSize=3 -D gHalfFilterSize=1 -D gFilterSizeSquared=9 -D gNumOutputPlanes=5 -D gOutputPlanes=5 -D gOutputSize=5 -D gOutputSizeSquared=25 -D gPadZeros=1 -D gMargin=1 -D gEven=0 -D gSkip=0 -DgNumStripes=1 -DgInputStripeMarginRows=2 -DgInputStripeInnerNumRows=5 -DgInputStripeOuterNumRows=9 -DgInputStripeInnerSize=25 -DgInputStripeOuterSize=45 -DgInputStripeMarginSize=10 -DgOutputStripeNumRows=5 -DgOutputStripeSize=25
... seems valid
BackpropWeightsAuto: kernel 3 0ms
layer 1 offset: 0
layer 1
from w: 0
actual: 0
layer 2 offset: 50
layer 3 offset: 50
layer 3
from w: 0
actual: 0
layer 4 offset: 280
layer 5 offset: 280
layer 5
from w: 0
actual: 0
layer 6 offset: 910
full thisloss: 3.14851
backward kernel 0: cannot be used
backward kernel 1 time: 0ms
backward kernel 2 time: 0ms
backward kernel 3 time: 162ms
backward layer selected kernel 1
calcGradWeights try kernel 4
... seems valid
BackpropWeightsAuto: kernel 4 315ms
backward kernel 0: cannot be used
backward kernel 1 time: 0ms
backward kernel 2 time: 0ms
backward kernel 3 time: 397ms
backward layer selected kernel 1
calcGradWeights try kernel 4
... seems valid
BackpropWeightsAuto: kernel 4 408ms
calcGradWeights try kernel 4
... seems valid
BackpropWeightsAuto: kernel 4 43ms
layer 1 offset: 0
layer 1
from w: 0
actual: 0
layer 2 offset: 50
layer 3 offset: 50
layer 3
from w: 0
actual: 0
layer 4 offset: 280
layer 5 offset: 280
layer 5
from w: 0
actual: 0
layer 6 offset: 910
full thisloss: 3.14851
calcGradWeights kernel 0: cannot be used
calcGradWeights kernel 1 time: 0ms
calcGradWeights kernel 2 time: 0ms
calcGradWeights kernel 3 time: 0ms
calcGradWeights kernel 4 time: 315ms
calcGradWeights layer selected kernel 1
calcGradWeights kernel 0: cannot be used
calcGradWeights kernel 1 time: 0ms
calcGradWeights kernel 2 time: 0ms
calcGradWeights kernel 3 time: 0ms
calcGradWeights kernel 4 time: 408ms
calcGradWeights layer selected kernel 1
calcGradWeights kernel 0: cannot be used
calcGradWeights kernel 1 time: 0ms
calcGradWeights kernel 2 time: 0ms
calcGradWeights kernel 3 time: 0ms
calcGradWeights kernel 4 time: 43ms
calcGradWeights layer selected kernel 1
layer 1 offset: 0
layer 1
from w: 0
actual: 0
layer 2 offset: 50
layer 3 offset: 50
layer 3
from w: 0
actual: 0
layer 4 offset: 280
layer 5 offset: 280
layer 5
from w: 0
actual: 0
layer 6 offset: 910
full thisloss: 3.14851
layer 1 offset: 0
layer 1
from w: 0
actual: 0
layer 2 offset: 50
layer 3 offset: 50
layer 3
from w: 0
actual: 0
layer 4 offset: 280
layer 5 offset: 280
layer 5
from w: 0
actual: 0
layer 6 offset: 910
full thisloss: 3.14851
layer 1 offset: 0
layer 1
from w: 0
actual: 0
layer 2 offset: 50
layer 3 offset: 50
layer 3
from w: 0
actual: 0
layer 4 offset: 280
layer 5 offset: 280
layer 5
from w: 0
actual: 0
layer 6 offset: 910
full thisloss: 3.14851
layer 1 offset: 0
layer 1
from w: 0
actual: 0
layer 2 offset: 50
layer 3 offset: 50
layer 3
from w: 0
actual: 0
layer 4 offset: 280
layer 5 offset: 280
layer 5
from w: 0
actual: 0
layer 6 offset: 910
full thisloss: 3.14851
layer 1 offset: 0
layer 1
from w: 0
actual: 0
layer 2 offset: 50
layer 3 offset: 50
layer 3
from w: 0
actual: 0
layer 4 offset: 280
layer 5 offset: 280
layer 5
from w: 0
actual: 0
layer 6 offset: 910
full thisloss: 3.14851
layer 1 offset: 0
layer 1
from w: 0
actual: 0
layer 2 offset: 50
layer 3 offset: 50
layer 3
from w: 0
actual: 0
layer 4 offset: 280
layer 5 offset: 280
layer 5
from w: 0
actual: 0
layer 6 offset: 910
full thisloss: 3.14851
layer 1 offset: 0
layer 1
from w: 0
actual: 0
layer 2 offset: 50
layer 3 offset: 50
layer 3
from w: 0
actual: 0
layer 4 offset: 280
layer 5 offset: 280
layer 5
from w: 0
actual: 0
layer 6 offset: 910
full thisloss: 3.14851
layer 1 offset: 0
layer 1
from w: 0
actual: 0
layer 2 offset: 50
layer 3 offset: 50
layer 3
from w: 0
actual: 0
layer 4 offset: 280
layer 5 offset: 280
layer 5
from w: 0
actual: 0
layer 6 offset: 910
full thisloss: 3.14851
layer 1 offset: 0
layer 1
from w: 0
actual: 0
layer 2 offset: 50
layer 3 offset: 50
layer 3
from w: 0
actual: 0
layer 4 offset: 280
layer 5 offset: 280
layer 5
from w: 0
actual: 0
layer 6 offset: 910
full thisloss: 3.14851
layer 1 offset: 0
layer 1
from w: 0
actual: 0
layer 2 offset: 50
layer 3 offset: 50
layer 3
from w: 0
actual: 0
layer 4 offset: 280
layer 5 offset: 280
layer 5
from w: 0
actual: 0
layer 6 offset: 910
full thisloss: 3.14851
layer 1 offset: 0
layer 1
from w: 0
actual: 0
layer 2 offset: 50
layer 3 offset: 50
layer 3
from w: 0
actual: 0
layer 4 offset: 280
layer 5 offset: 280
layer 5
from w: 0
actual: 0
layer 6 offset: 910
full thisloss: 3.14851
layer 1 offset: 0
layer 1
from w: 0
actual: 0
layer 2 offset: 50
layer 3 offset: 50
layer 3
from w: 0
actual: 0
layer 4 offset: 280
layer 5 offset: 280
layer 5
from w: 0
actual: 0
layer 6 offset: 910
full thisloss: 3.14851
layer 1 offset: 0
layer 1
from w: 0
actual: 0
layer 2 offset: 50
layer 3 offset: 50
layer 3
from w: 0
actual: 0
layer 4 offset: 280
layer 5 offset: 280
layer 5
from w: 0
actual: 0
layer 6 offset: 910
full thisloss: 3.14851
layer 1 offset: 0
layer 1
from w: 0
actual: 0
layer 2 offset: 50
layer 3 offset: 50
layer 3
from w: 0
actual: 0
layer 4 offset: 280
layer 5 offset: 280
layer 5
from w: 0
actual: 0
layer 6 offset: 910
full thisloss: 3.14851
layer 1 offset: 0
layer 1
from w: 0
actual: 0
layer 2 offset: 50
layer 3 offset: 50
layer 3
from w: 0
actual: 0
layer 4 offset: 280
layer 5 offset: 280
layer 5
from w: 0
actual: 0
layer 6 offset: 910
full thisloss: 3.14851
layer 1 offset: 0
layer 1
from w: 0
actual: 0
layer 2 offset: 50
layer 3 offset: 50
layer 3
from w: 0
actual: 0
layer 4 offset: 280
layer 5 offset: 280
layer 5
from w: 0
actual: 0
layer 6 offset: 910
full thisloss: 3.14851
batch time 3481 ms
dump enabled=0
clblas teardown
[ OK ] testsinglebatch.imagesize5_filtersize3_batchsize2_softmax (3705 ms)
[ RUN ] testsinglebatch.imagesize4_filtersize3_batchsize2_pooling
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
initializing clblas
layer 0:InputLayer{ outputPlanes=1 outputSize=12 }
layer 1:ConvolutionalLayer{ LayerDimensions{ inputPlanes=1 inputSize=12 numFilters=5 filterSize=3 outputSize=12 padZeros=1 biased=1 skip=0} }
layer 2:ActivationLayer{ RELU }
layer 3:PoolingLayer{ inputPlanes=5 inputSize=12 poolingSize=2 }
layer 4:ConvolutionalLayer{ LayerDimensions{ inputPlanes=5 inputSize=6 numFilters=5 filterSize=3 outputSize=6 padZeros=1 biased=1 skip=0} }
layer 5:ActivationLayer{ RELU }
layer 6:PoolingLayer{ inputPlanes=5 inputSize=6 poolingSize=2 }
layer 7:FullyConnectedLayer{ numPlanes=5 imageSize=1 }
layer 8:SoftMaxLayer{ perPlane=0 numPlanes=5 imageSize=1 }
Parameters overview: (skipping 6 layers with 0 params)
layer 1: params=50 9.8%
layer 4: params=230 45.1%
layer 7: params=230 45.1%
TOTAL : params=510
forward try kernel 0
... not plausibly optimal, skipping
forward try kernel 1
... seems valid
ForwardAuto: kernel 1 0ms
forward try kernel 0
... not plausibly optimal, skipping
forward try kernel 1
... seems valid
ForwardAuto: kernel 1 0ms
forward try kernel 0
... not plausibly optimal, skipping
forward try kernel 1
... seems valid
ForwardAuto: kernel 1 0ms
forward try kernel 2
... seems valid
ForwardAuto: kernel 2 0ms
forward try kernel 2
... seems valid
ForwardAuto: kernel 2 0ms
forward try kernel 2
... seems valid
ForwardAuto: kernel 2 0ms
backward try kernel 0
... not plausibly optimal, skipping
backward try kernel 1
... seems valid
BackwardAuto: kernel 1 0ms
calcGradWeights try kernel 0
... not plausibly optimal, skipping
calcGradWeights try kernel 1
... seems valid
BackpropWeightsAuto: kernel 1 0ms
backward try kernel 0
... not plausibly optimal, skipping
backward try kernel 1
... seems valid
BackwardAuto: kernel 1 0ms
calcGradWeights try kernel 0
... not plausibly optimal, skipping
calcGradWeights try kernel 1
... seems valid
BackpropWeightsAuto: kernel 1 0ms
calcGradWeights try kernel 0
... not plausibly optimal, skipping
calcGradWeights try kernel 1
... seems valid
BackpropWeightsAuto: kernel 1 0ms
layer 1 offset: 0
forward try kernel 3
... seems valid
ForwardAuto: kernel 3 0ms
forward try kernel 3
... seems valid
ForwardAuto: kernel 3 0ms
forward try kernel 3
... seems valid
ForwardAuto: kernel 3 0ms
layer 1
from w: 0
actual: -3.55299
layer 2 offset: 50
layer 3 offset: 50
layer 4 offset: 50
forward try kernel 4
... seems valid
ForwardAuto: kernel 4 0ms
forward try kernel 4
... seems valid
ForwardAuto: kernel 4 0ms
forward try kernel 4
... seems valid
ForwardAuto: kernel 4 0ms
layer 4
from w: 0
actual: -3.55299
layer 5 offset: 280
layer 6 offset: 280
layer 7 offset: 280
forward try kernel 5
ForwardAuto: kernel 5: this instance cant be used: For ForwardFc, filtersize and inputimagesize must be identical
... not valid
forward try kernel 6
... seems valid
ForwardAuto: kernel 6 0ms
forward try kernel 5
ForwardAuto: kernel 5: this instance cant be used: For ForwardFc, filtersize and inputimagesize must be identical
... not valid
forward try kernel 6
... seems valid
ForwardAuto: kernel 6 0ms
forward try kernel 5
... seems valid
ForwardAuto: kernel 5 0ms
layer 7
from w: 0
actual: -3.55299
layer 8 offset: 510
forward try kernel 7
... seems valid
ForwardAuto: kernel 7 210ms
forward try kernel 7
... seems valid
ForwardAuto: kernel 7 224ms
forward try kernel 6
... seems valid
ForwardAuto: kernel 6 0ms
full thisloss: 3.55299
forward kernel 0: cannot be used
forward kernel 1 time: 0ms
forward kernel 2 time: 0ms
forward kernel 3 time: 0ms
forward kernel 4 time: 0ms
forward kernel 5: cannot be used
forward kernel 6 time: 0ms
forward kernel 7 time: 210ms
forward layer selected kernel 1
forward kernel 0: cannot be used
forward kernel 1 time: 0ms
forward kernel 2 time: 0ms
forward kernel 3 time: 0ms
forward kernel 4 time: 0ms
forward kernel 5: cannot be used
forward kernel 6 time: 0ms
forward kernel 7 time: 224ms
forward layer selected kernel 1
forward try kernel 7
... seems valid
ForwardAuto: kernel 7 137ms
backward try kernel 2
... seems valid
BackwardAuto: kernel 2 0ms
calcGradWeights try kernel 2
... seems valid
BackpropWeightsAuto: kernel 2 0ms
backward try kernel 2
... seems valid
BackwardAuto: kernel 2 0ms
calcGradWeights try kernel 2
... seems valid
BackpropWeightsAuto: kernel 2 0ms
calcGradWeights try kernel 2
... seems valid
BackpropWeightsAuto: kernel 2 0ms
layer 1 offset: 0
forward kernel 0: cannot be used
forward kernel 1 time: 0ms
forward kernel 2 time: 0ms
forward kernel 3 time: 0ms
forward kernel 4 time: 0ms
forward kernel 5 time: 0ms
forward kernel 6 time: 0ms
forward kernel 7 time: 137ms
forward layer selected kernel 1
layer 1
from w: 0
actual: 0
layer 2 offset: 50
layer 3 offset: 50
layer 4 offset: 50
layer 4
from w: 0
actual: 0
layer 5 offset: 280
layer 6 offset: 280
layer 7 offset: 280
layer 7
from w: 0
actual: 0
layer 8 offset: 510
full thisloss: 3.55299
backward try kernel 3
... seems valid
BackwardAuto: kernel 3 158ms
calcGradWeights try kernel 3
options: -D BIASED -D gNumInputPlanes=5 -D gInputPlanes=5 -D gInputSize=3 -D gInputSizeSquared=9 -D gNumFilters=5 -D gFilterSize=3 -D gHalfFilterSize=1 -D gFilterSizeSquared=9 -D gNumOutputPlanes=5 -D gOutputPlanes=5 -D gOutputSize=1 -D gOutputSizeSquared=1 -D gPadZeros=0 -D gMargin=0 -D gEven=0 -D gSkip=0 -DgNumStripes=1 -DgInputStripeMarginRows=2 -DgInputStripeInnerNumRows=3 -DgInputStripeOuterNumRows=7 -DgInputStripeInnerSize=9 -DgInputStripeOuterSize=21 -DgInputStripeMarginSize=6 -DgOutputStripeNumRows=1 -DgOutputStripeSize=1
... seems valid
BackpropWeightsAuto: kernel 3 0ms
backward try kernel 3
... seems valid
BackwardAuto: kernel 3 362ms
calcGradWeights try kernel 3
options: -D BIASED -D gNumInputPlanes=5 -D gInputPlanes=5 -D gInputSize=6 -D gInputSizeSquared=36 -D gNumFilters=5 -D gFilterSize=3 -D gHalfFilterSize=1 -D gFilterSizeSquared=9 -D gNumOutputPlanes=5 -D gOutputPlanes=5 -D gOutputSize=6 -D gOutputSizeSquared=36 -D gPadZeros=1 -D gMargin=1 -D gEven=0 -D gSkip=0 -DgNumStripes=1 -DgInputStripeMarginRows=2 -DgInputStripeInnerNumRows=6 -DgInputStripeOuterNumRows=10 -DgInputStripeInnerSize=36 -DgInputStripeOuterSize=60 -DgInputStripeMarginSize=12 -DgOutputStripeNumRows=6 -DgOutputStripeSize=36
... seems valid
BackpropWeightsAuto: kernel 3 0ms
calcGradWeights try kernel 3
options: -D BIASED -D gNumInputPlanes=1 -D gInputPlanes=1 -D gInputSize=12 -D gInputSizeSquared=144 -D gNumFilters=5 -D gFilterSize=3 -D gHalfFilterSize=1 -D gFilterSizeSquared=9 -D gNumOutputPlanes=5 -D gOutputPlanes=5 -D gOutputSize=12 -D gOutputSizeSquared=144 -D gPadZeros=1 -D gMargin=1 -D gEven=0 -D gSkip=0 -DgNumStripes=1 -DgInputStripeMarginRows=2 -DgInputStripeInnerNumRows=12 -DgInputStripeOuterNumRows=16 -DgInputStripeInnerSize=144 -DgInputStripeOuterSize=192 -DgInputStripeMarginSize=24 -DgOutputStripeNumRows=12 -DgOutputStripeSize=144
... seems valid
BackpropWeightsAuto: kernel 3 0ms
layer 1 offset: 0
layer 1
from w: 0
actual: 0
layer 2 offset: 50
layer 3 offset: 50
layer 4 offset: 50
layer 4
from w: 0
actual: 0
layer 5 offset: 280
layer 6 offset: 280
layer 7 offset: 280
layer 7
from w: 0
actual: 0
layer 8 offset: 510
full thisloss: 3.55299
backward kernel 0: cannot be used
backward kernel 1 time: 0ms
backward kernel 2 time: 0ms
backward kernel 3 time: 158ms
backward layer selected kernel 1
calcGradWeights try kernel 4
... seems valid
BackpropWeightsAuto: kernel 4 315ms
backward kernel 0: cannot be used
backward kernel 1 time: 0ms
backward kernel 2 time: 0ms
backward kernel 3 time: 362ms
backward layer selected kernel 1
calcGradWeights try kernel 4
... seems valid
BackpropWeightsAuto: kernel 4 321ms
calcGradWeights try kernel 4
... seems valid
BackpropWeightsAuto: kernel 4 212ms
layer 1 offset: 0
layer 1
from w: 0
actual: 0
layer 2 offset: 50
layer 3 offset: 50
layer 4 offset: 50
layer 4
from w: 0
actual: 0
layer 5 offset: 280
layer 6 offset: 280
layer 7 offset: 280
layer 7
from w: 0
actual: 0
layer 8 offset: 510
full thisloss: 3.55299
calcGradWeights kernel 0: cannot be used
calcGradWeights kernel 1 time: 0ms
calcGradWeights kernel 2 time: 0ms
calcGradWeights kernel 3 time: 0ms
calcGradWeights kernel 4 time: 315ms
calcGradWeights layer selected kernel 1
calcGradWeights kernel 0: cannot be used
calcGradWeights kernel 1 time: 0ms
calcGradWeights kernel 2 time: 0ms
calcGradWeights kernel 3 time: 0ms
calcGradWeights kernel 4 time: 321ms
calcGradWeights layer selected kernel 1
calcGradWeights kernel 0: cannot be used
calcGradWeights kernel 1 time: 0ms
calcGradWeights kernel 2 time: 0ms
calcGradWeights kernel 3 time: 0ms
calcGradWeights kernel 4 time: 212ms
calcGradWeights layer selected kernel 1
layer 1 offset: 0
layer 1
from w: 0
actual: 0
layer 2 offset: 50
layer 3 offset: 50
layer 4 offset: 50
layer 4
from w: 0
actual: 0
layer 5 offset: 280
layer 6 offset: 280
layer 7 offset: 280
layer 7
from w: 0
actual: 0
layer 8 offset: 510
full thisloss: 3.55299
layer 1 offset: 0
layer 1
from w: 0
actual: 0
layer 2 offset: 50
layer 3 offset: 50
layer 4 offset: 50
layer 4
from w: 0
actual: 0
layer 5 offset: 280
layer 6 offset: 280
layer 7 offset: 280
layer 7
from w: 0
actual: 0
layer 8 offset: 510
full thisloss: 3.55299
layer 1 offset: 0
layer 1
from w: 0
actual: 0
layer 2 offset: 50
layer 3 offset: 50
layer 4 offset: 50
layer 4
from w: 0
actual: 0
layer 5 offset: 280
layer 6 offset: 280
layer 7 offset: 280
layer 7
from w: 0
actual: 0
layer 8 offset: 510
full thisloss: 3.55299
layer 1 offset: 0
layer 1
from w: 0
actual: 0
layer 2 offset: 50
layer 3 offset: 50
layer 4 offset: 50
layer 4
from w: 0
actual: 0
layer 5 offset: 280
layer 6 offset: 280
layer 7 offset: 280
layer 7
from w: 0
actual: 0
layer 8 offset: 510
full thisloss: 3.55299
layer 1 offset: 0
layer 1
from w: 0
actual: 0
layer 2 offset: 50
layer 3 offset: 50
layer 4 offset: 50
layer 4
from w: 0
actual: 0
layer 5 offset: 280
layer 6 offset: 280
layer 7 offset: 280
layer 7
from w: 0
actual: 0
layer 8 offset: 510
full thisloss: 3.55299
layer 1 offset: 0
layer 1
from w: 0
actual: 0
layer 2 offset: 50
layer 3 offset: 50
layer 4 offset: 50
layer 4
from w: 0
actual: 0
layer 5 offset: 280
layer 6 offset: 280
layer 7 offset: 280
layer 7
from w: 0
actual: 0
layer 8 offset: 510
full thisloss: 3.55299
layer 1 offset: 0
layer 1
from w: 0
actual: 0
layer 2 offset: 50
layer 3 offset: 50
layer 4 offset: 50
layer 4
from w: 0
actual: 0
layer 5 offset: 280
layer 6 offset: 280
layer 7 offset: 280
layer 7
from w: 0
actual: 0
layer 8 offset: 510
full thisloss: 3.55299
layer 1 offset: 0
layer 1
from w: 0
actual: 0
layer 2 offset: 50
layer 3 offset: 50
layer 4 offset: 50
layer 4
from w: 0
actual: 0
layer 5 offset: 280
layer 6 offset: 280
layer 7 offset: 280
layer 7
from w: 0
actual: 0
layer 8 offset: 510
full thisloss: 3.55299
layer 1 offset: 0
layer 1
from w: 0
actual: 0
layer 2 offset: 50
layer 3 offset: 50
layer 4 offset: 50
layer 4
from w: 0
actual: 0
layer 5 offset: 280
layer 6 offset: 280
layer 7 offset: 280
layer 7
from w: 0
actual: 0
layer 8 offset: 510
full thisloss: 3.55299
layer 1 offset: 0
layer 1
from w: 0
actual: 0
layer 2 offset: 50
layer 3 offset: 50
layer 4 offset: 50
layer 4
from w: 0
actual: 0
layer 5 offset: 280
layer 6 offset: 280
layer 7 offset: 280
layer 7
from w: 0
actual: 0
layer 8 offset: 510
full thisloss: 3.55299
layer 1 offset: 0
layer 1
from w: 0
actual: 0
layer 2 offset: 50
layer 3 offset: 50
layer 4 offset: 50
layer 4
from w: 0
actual: 0
layer 5 offset: 280
layer 6 offset: 280
layer 7 offset: 280
layer 7
from w: 0
actual: 0
layer 8 offset: 510
full thisloss: 3.55299
layer 1 offset: 0
layer 1
from w: 0
actual: 0
layer 2 offset: 50
layer 3 offset: 50
layer 4 offset: 50
layer 4
from w: 0
actual: 0
layer 5 offset: 280
layer 6 offset: 280
layer 7 offset: 280
layer 7
from w: 0
actual: 0
layer 8 offset: 510
full thisloss: 3.55299
layer 1 offset: 0
layer 1
from w: 0
actual: 0
layer 2 offset: 50
layer 3 offset: 50
layer 4 offset: 50
layer 4
from w: 0
actual: 0
layer 5 offset: 280
layer 6 offset: 280
layer 7 offset: 280
layer 7
from w: 0
actual: 0
layer 8 offset: 510
full thisloss: 3.55299
layer 1 offset: 0
layer 1
from w: 0
actual: 0
layer 2 offset: 50
layer 3 offset: 50
layer 4 offset: 50
layer 4
from w: 0
actual: 0
layer 5 offset: 280
layer 6 offset: 280
layer 7 offset: 280
layer 7
from w: 0
actual: 0
layer 8 offset: 510
full thisloss: 3.55299
layer 1 offset: 0
layer 1
from w: 0
actual: 0
layer 2 offset: 50
layer 3 offset: 50
layer 4 offset: 50
layer 4
from w: 0
actual: 0
layer 5 offset: 280
layer 6 offset: 280
layer 7 offset: 280
layer 7
from w: 0
actual: 0
layer 8 offset: 510
full thisloss: 3.55299
layer 1 offset: 0
layer 1
from w: 0
actual: 0
layer 2 offset: 50
layer 3 offset: 50
layer 4 offset: 50
layer 4
from w: 0
actual: 0
layer 5 offset: 280
layer 6 offset: 280
layer 7 offset: 280
layer 7
from w: 0
actual: 0
layer 8 offset: 510
full thisloss: 3.55299
batch time 3659 ms
dump enabled=0
clblas teardown
[ OK ] testsinglebatch.imagesize4_filtersize3_batchsize2_pooling (4086 ms)
[----------] 5 tests from testsinglebatch (16722 ms total)
[----------] 1 test from EXCLUDED_testsinglebatch
[ RUN ] EXCLUDED_testsinglebatch.imagesize5_filtersize3_batchsize2_10filters
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
initializing clblas
layer 0:InputLayer{ outputPlanes=1 outputSize=5 }
layer 1:ConvolutionalLayer{ LayerDimensions{ inputPlanes=1 inputSize=5 numFilters=10 filterSize=3 outputSize=3 padZeros=0 biased=1 skip=0} }
layer 2:ActivationLayer{ RELU }
layer 3:FullyConnectedLayer{ numPlanes=10 imageSize=1 }
layer 4:ActivationLayer{ TANH }
layer 5:SquareLossLayer{}
Parameters overview: (skipping 4 layers with 0 params)
layer 1: params=100 9.9%
layer 3: params=910 90.1%
TOTAL : params=1010
weightsTotalSize=1010
forward try kernel 0
... not plausibly optimal, skipping
forward try kernel 1
... seems valid
ForwardAuto: kernel 1 0ms
forward try kernel 0
... not plausibly optimal, skipping
forward try kernel 1
... seems valid
ForwardAuto: kernel 1 0ms
backward try kernel 0
... not plausibly optimal, skipping
backward try kernel 1
... seems valid
BackwardAuto: kernel 1 0ms
calcGradWeights try kernel 0
... not plausibly optimal, skipping
calcGradWeights try kernel 1
... seems valid
BackpropWeightsAuto: kernel 1 0ms
calcGradWeights try kernel 0
... not plausibly optimal, skipping
calcGradWeights try kernel 1
... seems valid
BackpropWeightsAuto: kernel 1 0ms
forward try kernel 2
... seems valid
ForwardAuto: kernel 2 0ms
forward try kernel 2
... seems valid
ForwardAuto: kernel 2 0ms
forward try kernel 3
... seems valid
ForwardAuto: kernel 3 0ms
forward try kernel 3
... seems valid
ForwardAuto: kernel 3 0ms
backward try kernel 2
... seems valid
BackwardAuto: kernel 2 0ms
calcGradWeights try kernel 2
... seems valid
BackpropWeightsAuto: kernel 2 0ms
calcGradWeights try kernel 2
... seems valid
BackpropWeightsAuto: kernel 2 0ms
forward try kernel 4
... seems valid
ForwardAuto: kernel 4 0ms
forward try kernel 4
... seems valid
ForwardAuto: kernel 4 0ms
forward try kernel 5
ForwardAuto: kernel 5: this instance cant be used: For ForwardFc, filtersize and inputimagesize must be identical
... not valid
forward try kernel 6
... seems valid
ForwardAuto: kernel 6 0ms
forward try kernel 5
... seems valid
ForwardAuto: kernel 5 0ms
backward try kernel 3
... seems valid
BackwardAuto: kernel 3 152ms
calcGradWeights try kernel 3
options: -D BIASED -D gNumInputPlanes=10 -D gInputPlanes=10 -D gInputSize=3 -D gInputSizeSquared=9 -D gNumFilters=10 -D gFilterSize=3 -D gHalfFilterSize=1 -D gFilterSizeSquared=9 -D gNumOutputPlanes=10 -D gOutputPlanes=10 -D gOutputSize=1 -D gOutputSizeSquared=1 -D gPadZeros=0 -D gMargin=0 -D gEven=0 -D gSkip=0 -DgNumStripes=1 -DgInputStripeMarginRows=2 -DgInputStripeInnerNumRows=3 -DgInputStripeOuterNumRows=7 -DgInputStripeInnerSize=9 -DgInputStripeOuterSize=21 -DgInputStripeMarginSize=6 -DgOutputStripeNumRows=1 -DgOutputStripeSize=1
... seems valid
BackpropWeightsAuto: kernel 3 0ms
calcGradWeights try kernel 3
options: -D BIASED -D gNumInputPlanes=1 -D gInputPlanes=1 -D gInputSize=5 -D gInputSizeSquared=25 -D gNumFilters=10 -D gFilterSize=3 -D gHalfFilterSize=1 -D gFilterSizeSquared=9 -D gNumOutputPlanes=10 -D gOutputPlanes=10 -D gOutputSize=3 -D gOutputSizeSquared=9 -D gPadZeros=0 -D gMargin=0 -D gEven=0 -D gSkip=0 -DgNumStripes=1 -DgInputStripeMarginRows=2 -DgInputStripeInnerNumRows=5 -DgInputStripeOuterNumRows=9 -DgInputStripeInnerSize=25 -DgInputStripeOuterSize=45 -DgInputStripeMarginSize=10 -DgOutputStripeNumRows=3 -DgOutputStripeSize=9
... seems valid
BackpropWeightsAuto: kernel 3 0ms
forward try kernel 7
... seems valid
ForwardAuto: kernel 7 242ms
forward try kernel 6
... seems valid
ForwardAuto: kernel 6 0ms
forward kernel 0: cannot be used
forward kernel 1 time: 0ms
forward kernel 2 time: 0ms
forward kernel 3 time: 0ms
forward kernel 4 time: 0ms
forward kernel 5: cannot be used
forward kernel 6 time: 0ms
forward kernel 7 time: 242ms
forward layer selected kernel 1
forward try kernel 7
... seems valid
ForwardAuto: kernel 7 135ms
backward kernel 0: cannot be used
backward kernel 1 time: 0ms
backward kernel 2 time: 0ms
backward kernel 3 time: 152ms
backward layer selected kernel 1
calcGradWeights try kernel 4
... seems valid
BackpropWeightsAuto: kernel 4 326ms
calcGradWeights try kernel 4
... seems valid
BackpropWeightsAuto: kernel 4 447ms
forward kernel 0: cannot be used
forward kernel 1 time: 0ms
forward kernel 2 time: 0ms
forward kernel 3 time: 0ms
forward kernel 4 time: 0ms
forward kernel 5 time: 0ms
forward kernel 6 time: 0ms
forward kernel 7 time: 135ms
forward layer selected kernel 1
calcGradWeights kernel 0: cannot be used
calcGradWeights kernel 1 time: 0ms
calcGradWeights kernel 2 time: 0ms
calcGradWeights kernel 3 time: 0ms
calcGradWeights kernel 4 time: 326ms
calcGradWeights layer selected kernel 1
calcGradWeights kernel 0: cannot be used
calcGradWeights kernel 1 time: 0ms
calcGradWeights kernel 2 time: 0ms
calcGradWeights kernel 3 time: 0ms
calcGradWeights kernel 4 time: 447ms
calcGradWeights layer selected kernel 1
batch time 2668 ms
dump enabled=0
clblas teardown
[ OK ] EXCLUDED_testsinglebatch.imagesize5_filtersize3_batchsize2_10filters (2916 ms)
[----------] 1 test from EXCLUDED_testsinglebatch (2917 ms total)
[----------] 9 tests from testpoolingforward
[ RUN ] testpoolingforward.basic
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
[ OK ] testpoolingforward.basic (51 ms)
[ RUN ] testpoolingforward.basic_2plane_batchsize2
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
[ OK ] testpoolingforward.basic_2plane_batchsize2 (44 ms)
[ RUN ] testpoolingforward.fromwrappers
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
[ OK ] testpoolingforward.fromwrappers (42 ms)
[ RUN ] testpoolingforward.comparespecific_0_1_pooling2
instance0: 0
instance1: 1
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
[ OK ] testpoolingforward.comparespecific_0_1_pooling2 (43 ms)
[ RUN ] testpoolingforward.comparespecific_0_1_pooling3
instance0: 0
instance1: 1
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
[ OK ] testpoolingforward.comparespecific_0_1_pooling3 (48 ms)
[ RUN ] testpoolingforward.comparespecific_0_1_pooling2_pz
instance0: 0
instance1: 1
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
[ OK ] testpoolingforward.comparespecific_0_1_pooling2_pz (42 ms)
[ RUN ] testpoolingforward.comparespecific_0_1_pooling3_pz
instance0: 0
instance1: 1
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
[ OK ] testpoolingforward.comparespecific_0_1_pooling3_pz (49 ms)
[ RUN ] testpoolingforward.comparespecific_0_1_pooling3_small
instance0: 0
instance1: 1
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
[ OK ] testpoolingforward.comparespecific_0_1_pooling3_small (40 ms)
[ RUN ] testpoolingforward.comparespecific_0_1_pooling3_small2
instance0: 0
instance1: 1
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
[ OK ] testpoolingforward.comparespecific_0_1_pooling3_small2 (39 ms)
[----------] 9 tests from testpoolingforward (398 ms total)
[----------] 2 tests from testpoolingbackward
[ RUN ] testpoolingbackward.basic
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
[ OK ] testpoolingbackward.basic (4 ms)
[ RUN ] testpoolingbackward.basic_2plane_batchsize2
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
[ OK ] testpoolingbackward.basic_2plane_batchsize2 (1 ms)
[----------] 2 tests from testpoolingbackward (5 ms total)
[----------] 7 tests from teststringhelper
[ RUN ] teststringhelper.split
[ OK ] teststringhelper.split (0 ms)
[ RUN ] teststringhelper.split2
[ OK ] teststringhelper.split2 (0 ms)
[ RUN ] teststringhelper.split3
[ OK ] teststringhelper.split3 (0 ms)
[ RUN ] teststringhelper.tolower
[ OK ] teststringhelper.tolower (0 ms)
[ RUN ] teststringhelper.replace
[ OK ] teststringhelper.replace (0 ms)
[ RUN ] teststringhelper.replaceglobal
[ OK ] teststringhelper.replaceglobal (0 ms)
[ RUN ] teststringhelper.strcpy_safe
[ OK ] teststringhelper.strcpy_safe (0 ms)
[----------] 7 tests from teststringhelper (0 ms total)
[----------] 1 test from testGtestGlobals
[ RUN ] testGtestGlobals.basic
There are 1 parameters:
argv[0]=bin/deepcl_unittests
[ OK ] testGtestGlobals.basic (0 ms)
[----------] 1 test from testGtestGlobals (0 ms total)
[----------] 1 test from testMemset
[ RUN ] testMemset.basic
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
[ OK ] testMemset.basic (38 ms)
[----------] 1 test from testMemset (38 ms total)
[----------] 2 tests from testCopyBuffer
[ RUN ] testCopyBuffer.floats
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
3
4
5
6
7
8
9
10
11
12
[ OK ] testCopyBuffer.floats (110 ms)
[ RUN ] testCopyBuffer.ints
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
3
4
5
6
7
8
9
10
11
12
[ OK ] testCopyBuffer.ints (125 ms)
[----------] 2 tests from testCopyBuffer (235 ms total)
[----------] 2 tests from testCopyBlock
[ RUN ] testCopyBlock.testPos
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
in[0]=3076
in[1]=8
in[2]=14
res[0]=3
res[1]=4
res[2]=8206
res[3]=8
res[4]=14
[ OK ] testCopyBlock.testPos (48 ms)
[ RUN ] testCopyBlock.basic
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
2 3 4
6 7 8
0 0 0 0
5 6 7
9 10 11
0 0 0 0
[ OK ] testCopyBlock.basic (49 ms)
[----------] 2 tests from testCopyBlock (97 ms total)
[----------] 1 test from testCopyLocal
[ RUN ] testCopyLocal.basic
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
0 0 0 0
1 2 3 4
5 6 7 8
9 10 11 12
0 0 0 0
[ OK ] testCopyLocal.basic (40 ms)
[----------] 1 test from testCopyLocal (40 ms total)
[----------] 8 tests from testNetdefToNet
[ RUN ] testNetdefToNet.empty
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
[ OK ] testNetdefToNet.empty (2 ms)
[ RUN ] testNetdefToNet.onefc
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
[ OK ] testNetdefToNet.onefc (76 ms)
[ RUN ] testNetdefToNet.onefclinear
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
[ OK ] testNetdefToNet.onefclinear (73 ms)
[ RUN ] testNetdefToNet.150n_10n
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
[ OK ] testNetdefToNet.150n_10n (75 ms)
[ RUN ] testNetdefToNet.3xfclinear
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
nnString: [3]
repeatNum 3
remainderString [150n]
inner [150n]
multiplied string: 150n-150n-150n
layer 0:InputLayer{ outputPlanes=1 outputSize=19 }
layer 1:FullyConnectedLayer{ numPlanes=150 imageSize=1 }
layer 2:FullyConnectedLayer{ numPlanes=150 imageSize=1 }
layer 3:FullyConnectedLayer{ numPlanes=150 imageSize=1 }
layer 4:SoftMaxLayer{ perPlane=0 numPlanes=150 imageSize=1 }
Parameters overview: (skipping 2 layers with 0 params)
layer 1: params=54300 54.5%
layer 2: params=22650 22.7%
layer 3: params=22650 22.7%
TOTAL : params=99600
[ OK ] testNetdefToNet.3xfclinear (74 ms)
[ RUN ] testNetdefToNet.mp2_3x32c5z_10n
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
prefix: [mp2]
nnString: [3]
repeatNum 3
remainderString [32c5z-10n ]
postfix [10n ]
inner [32c5z]
multiplied string: mp2-32c5z-32c5z-32c5z-10n
layer 0:InputLayer{ outputPlanes=1 outputSize=19 }
layer 1:PoolingLayer{ inputPlanes=1 inputSize=19 poolingSize=2 }
layer 2:ConvolutionalLayer{ LayerDimensions{ inputPlanes=1 inputSize=9 numFilters=32 filterSize=5 outputSize=9 padZeros=1 biased=1 skip=0} }
layer 3:ConvolutionalLayer{ LayerDimensions{ inputPlanes=32 inputSize=9 numFilters=32 filterSize=5 outputSize=9 padZeros=1 biased=1 skip=0} }
layer 4:ConvolutionalLayer{ LayerDimensions{ inputPlanes=32 inputSize=9 numFilters=32 filterSize=5 outputSize=9 padZeros=1 biased=1 skip=0} }
layer 5:FullyConnectedLayer{ numPlanes=10 imageSize=1 }
layer 6:SoftMaxLayer{ perPlane=0 numPlanes=10 imageSize=1 }
Parameters overview: (skipping 3 layers with 0 params)
layer 2: params=832 1.1%
layer 3: params=25632 32.9%
layer 4: params=25632 32.9%
layer 5: params=25930 33.2%
TOTAL : params=78026
[ OK ] testNetdefToNet.mp2_3x32c5z_10n (182 ms)
[ RUN ] testNetdefToNet.3x32c5zmp2
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
nnString: [3]
repeatNum 3
remainderString [(32c5z-mp2)-10n]
inner [32c5z-mp2]
newRemainder [-10n]
postfix [10n]
multiplied string: 32c5z-mp2-32c5z-mp2-32c5z-mp2-10n
layer 0:InputLayer{ outputPlanes=1 outputSize=128 }
layer 1:ConvolutionalLayer{ LayerDimensions{ inputPlanes=1 inputSize=128 numFilters=32 filterSize=5 outputSize=128 padZeros=1 biased=1 skip=0} }
layer 2:PoolingLayer{ inputPlanes=32 inputSize=128 poolingSize=2 }
layer 3:ConvolutionalLayer{ LayerDimensions{ inputPlanes=32 inputSize=64 numFilters=32 filterSize=5 outputSize=64 padZeros=1 biased=1 skip=0} }
layer 4:PoolingLayer{ inputPlanes=32 inputSize=64 poolingSize=2 }
layer 5:ConvolutionalLayer{ LayerDimensions{ inputPlanes=32 inputSize=32 numFilters=32 filterSize=5 outputSize=32 padZeros=1 biased=1 skip=0} }
layer 6:PoolingLayer{ inputPlanes=32 inputSize=32 poolingSize=2 }
layer 7:FullyConnectedLayer{ numPlanes=10 imageSize=1 }
layer 8:SoftMaxLayer{ perPlane=0 numPlanes=10 imageSize=1 }
Parameters overview: (skipping 5 layers with 0 params)
layer 1: params=832 0.6%
layer 3: params=25632 19.1%
layer 5: params=25632 19.1%
layer 7: params=81930 61.1%
TOTAL : params=134026
[ OK ] testNetdefToNet.3x32c5zmp2 (385 ms)
[ RUN ] testNetdefToNet.2x32c7_3x32c5z
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
nnString: [2]
repeatNum 2
remainderString [32c7z-3*32c5z-10n]
postfix [3*32c5z-10n]
inner [32c7z]
nnString: [3]
repeatNum 3
remainderString [32c5z-10n]
postfix [10n]
inner [32c5z]
multiplied string: 32c5z-32c5z-32c5z-10n
multiplied string: 32c7z-32c7z-32c5z-32c5z-32c5z-10n
layer 0:InputLayer{ outputPlanes=1 outputSize=19 }
layer 1:ConvolutionalLayer{ LayerDimensions{ inputPlanes=1 inputSize=19 numFilters=32 filterSize=7 outputSize=19 padZeros=1 biased=1 skip=0} }
layer 2:ConvolutionalLayer{ LayerDimensions{ inputPlanes=32 inputSize=19 numFilters=32 filterSize=7 outputSize=19 padZeros=1 biased=1 skip=0} }
layer 3:ConvolutionalLayer{ LayerDimensions{ inputPlanes=32 inputSize=19 numFilters=32 filterSize=5 outputSize=19 padZeros=1 biased=1 skip=0} }
layer 4:ConvolutionalLayer{ LayerDimensions{ inputPlanes=32 inputSize=19 numFilters=32 filterSize=5 outputSize=19 padZeros=1 biased=1 skip=0} }
layer 5:ConvolutionalLayer{ LayerDimensions{ inputPlanes=32 inputSize=19 numFilters=32 filterSize=5 outputSize=19 padZeros=1 biased=1 skip=0} }
layer 6:FullyConnectedLayer{ numPlanes=10 imageSize=1 }
layer 7:SoftMaxLayer{ perPlane=0 numPlanes=10 imageSize=1 }
Parameters overview: (skipping 2 layers with 0 params)
layer 1: params=1600 0.7%
layer 2: params=50208 20.6%
layer 3: params=25632 10.5%
layer 4: params=25632 10.5%
layer 5: params=25632 10.5%
layer 6: params=115530 47.3%
TOTAL : params=244234
[ OK ] testNetdefToNet.2x32c7_3x32c5z (69 ms)
[----------] 8 tests from testNetdefToNet (936 ms total)
[----------] 10 tests from testactivationforward
[ RUN ] testactivationforward.basic
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
[ OK ] testactivationforward.basic (1 ms)
[ RUN ] testactivationforward.basic_2plane_batchsize2
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
[ OK ] testactivationforward.basic_2plane_batchsize2 (2 ms)
[ RUN ] testactivationforward.fromwrappers
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
[ OK ] testactivationforward.fromwrappers (34 ms)
[ RUN ] testactivationforward.comparespecific_0_1_activation2
instance0: 0
instance1: 1
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
[ OK ] testactivationforward.comparespecific_0_1_activation2 (34 ms)
[ RUN ] testactivationforward.comparespecific_0_1_activation3
instance0: 0
instance1: 1
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
[ OK ] testactivationforward.comparespecific_0_1_activation3 (34 ms)
[ RUN ] testactivationforward.comparespecific_0_1_activation2_pz
instance0: 0
instance1: 1
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
[ OK ] testactivationforward.comparespecific_0_1_activation2_pz (35 ms)
[ RUN ] testactivationforward.comparespecific_0_1_activation3_pz
instance0: 0
instance1: 1
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
[ OK ] testactivationforward.comparespecific_0_1_activation3_pz (34 ms)
[ RUN ] testactivationforward.comparespecific_0_1_activation3_small
instance0: 0
instance1: 1
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
[ OK ] testactivationforward.comparespecific_0_1_activation3_small (34 ms)
[ RUN ] testactivationforward.comparespecific_0_1_activation3_small2
instance0: 0
instance1: 1
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
[ OK ] testactivationforward.comparespecific_0_1_activation3_small2 (35 ms)
[ RUN ] testactivationforward.comparespecific_0_1_activation3_small2_tanh
instance0: 0
instance1: 1
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
[ OK ] testactivationforward.comparespecific_0_1_activation3_small2_tanh (70 ms)
[----------] 10 tests from testactivationforward (313 ms total)
[----------] 2 tests from testactivationbackward
[ RUN ] testactivationbackward.basic
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
gradInput=3
gradInput=0
gradInput=-2.7
gradInput=2
gradInput=-0
gradInput=2.1
gradInput=0
gradInput=-1.1
gradInput=0
[ OK ] testactivationbackward.basic (2 ms)
[ RUN ] testactivationbackward.basic_2plane_batchsize2
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
gradInput=3
gradInput=0
gradInput=0
gradInput=9
[ OK ] testactivationbackward.basic_2plane_batchsize2 (1 ms)
[----------] 2 tests from testactivationbackward (3 ms total)
[----------] 1 test from testRandomSingleton
[ RUN ] testRandomSingleton.testMockRandom
0.569795
0.59168
0.620742
0.807657
0.0113285
0.359743
0.556429
0.334354
0.476656
0.0844408
[ OK ] testRandomSingleton.testMockRandom (0 ms)
[----------] 1 test from testRandomSingleton (0 ms total)
[----------] 10 tests from testdropoutforward
[ RUN ] testdropoutforward.basic
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
[ OK ] testdropoutforward.basic (1 ms)
[ RUN ] testdropoutforward.basic_2plane_batchsize2
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
[ OK ] testdropoutforward.basic_2plane_batchsize2 (1 ms)
[ RUN ] testdropoutforward.fromwrappers
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
[ OK ] testdropoutforward.fromwrappers (1 ms)
[ RUN ] testdropoutforward.comparespecific_0_1_dropout2
instance0: 0
instance1: 1
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
[ OK ] testdropoutforward.comparespecific_0_1_dropout2 (35 ms)
[ RUN ] testdropoutforward.comparespecific_0_1_dropout3
instance0: 0
instance1: 1
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
[ OK ] testdropoutforward.comparespecific_0_1_dropout3 (34 ms)
[ RUN ] testdropoutforward.comparespecific_0_1_dropout2_pz
instance0: 0
instance1: 1
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
[ OK ] testdropoutforward.comparespecific_0_1_dropout2_pz (47 ms)
[ RUN ] testdropoutforward.comparespecific_0_1_dropout3_pz
instance0: 0
instance1: 1
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
[ OK ] testdropoutforward.comparespecific_0_1_dropout3_pz (36 ms)
[ RUN ] testdropoutforward.comparespecific_0_1_dropout3_small
instance0: 0
instance1: 1
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
[ OK ] testdropoutforward.comparespecific_0_1_dropout3_small (37 ms)
[ RUN ] testdropoutforward.comparespecific_0_1_dropout3_small2
instance0: 0
instance1: 1
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
[ OK ] testdropoutforward.comparespecific_0_1_dropout3_small2 (36 ms)
[ RUN ] testdropoutforward.comparespecific_0_1_dropout3_small2_tanh
instance0: 0
instance1: 1
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
[ OK ] testdropoutforward.comparespecific_0_1_dropout3_small2_tanh (34 ms)
[----------] 10 tests from testdropoutforward (262 ms total)
[----------] 3 tests from testdropoutbackward
[ RUN ] testdropoutbackward.basic
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
[ OK ] testdropoutbackward.basic (34 ms)
[ RUN ] testdropoutbackward.basic_2plane_batchsize2
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
[ OK ] testdropoutbackward.basic_2plane_batchsize2 (36 ms)
[ RUN ] testdropoutbackward.compare_args
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
[ OK ] testdropoutbackward.compare_args (35 ms)
[----------] 3 tests from testdropoutbackward (105 ms total)
[----------] 1 test from testsgd
[ RUN ] testsgd.basic
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
layer 0:InputLayer{ outputPlanes=1 outputSize=5 }
layer 1:ConvolutionalLayer{ LayerDimensions{ inputPlanes=1 inputSize=5 numFilters=1 filterSize=3 outputSize=3 padZeros=0 biased=0 skip=0} }
layer 2:SquareLossLayer{}
inputtotalsize=50 outputTotalSize=18
forward try kernel 0
... not plausibly optimal, skipping
forward try kernel 1
... seems valid
ForwardAuto: kernel 1 0ms
calcGradWeights try kernel 0
... not plausibly optimal, skipping
calcGradWeights try kernel 1
... seems valid
BackpropWeightsAuto: kernel 1 0ms
[ OK ] testsgd.basic (281 ms)
[----------] 1 test from testsgd (282 ms total)
[----------] 9 tests from testCLMathWrapper
[ RUN ] testCLMathWrapper.assign
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
a[0]=4
a[1]=2.1
a[2]=5
a[3]=3
a[4]=9.2
[ OK ] testCLMathWrapper.assign (35 ms)
[ RUN ] testCLMathWrapper.assignScalar
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
a[0]=3.4
a[1]=3.4
a[2]=3.4
a[3]=3.4
a[4]=3.4
[ OK ] testCLMathWrapper.assignScalar (36 ms)
[ RUN ] testCLMathWrapper.addinplace
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
a[0]=5
a[1]=5.1
a[2]=14
a[3]=15.5
a[4]=11.7
[ OK ] testCLMathWrapper.addinplace (35 ms)
[ RUN ] testCLMathWrapper.multiplyinplace
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
a[0]=1.5
a[1]=4.5
a[2]=13.5
a[3]=18.75
a[4]=3.75
[ OK ] testCLMathWrapper.multiplyinplace (35 ms)
[ RUN ] testCLMathWrapper.addscalar
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
a[0]=2.5
a[1]=4.5
a[2]=10.5
a[3]=14
a[4]=4
[ OK ] testCLMathWrapper.addscalar (34 ms)
[ RUN ] testCLMathWrapper.sqrt
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
a[0]=1
a[1]=1.73205
a[2]=3
a[3]=3.53553
a[4]=1.58114
[ OK ] testCLMathWrapper.sqrt (35 ms)
[ RUN ] testCLMathWrapper.squared
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
a[0]=1
a[1]=9
a[2]=81
a[3]=156.25
a[4]=6.25
[ OK ] testCLMathWrapper.squared (34 ms)
[ RUN ] testCLMathWrapper.inverse
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
a[0]=1
a[1]=0.333333
a[2]=0.111111
a[3]=0.08
a[4]=0.4
[ OK ] testCLMathWrapper.inverse (34 ms)
[ RUN ] testCLMathWrapper.perelementmult
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
a[0]=4
a[1]=6.3
a[2]=45
a[3]=37.5
a[4]=23
[ OK ] testCLMathWrapper.perelementmult (34 ms)
[----------] 9 tests from testCLMathWrapper (313 ms total)
[----------] 1 test from testreducesegments
[ RUN ] testreducesegments.basic
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
[ OK ] testreducesegments.basic (34 ms)
[----------] 1 test from testreducesegments (34 ms total)
[----------] 4 tests from testGpuOp
[ RUN ] testGpuOp.addinplace
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
a[0]=5
a[1]=5.1
a[2]=14
a[3]=15.5
a[4]=11.7
[ OK ] testGpuOp.addinplace (34 ms)
[ RUN ] testGpuOp.addoutofplace
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
a[0]=1
a[1]=3
a[2]=9
a[3]=12.5
a[4]=2.5
c[0]=5
c[1]=5.1
c[2]=14
c[3]=15.5
c[4]=11.7
[ OK ] testGpuOp.addoutofplace (35 ms)
[ RUN ] testGpuOp.inverse
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
a[0]=1
a[1]=0.333333
a[2]=0.111111
a[3]=0.08
a[4]=0.4
[ OK ] testGpuOp.inverse (34 ms)
[ RUN ] testGpuOp.addscalarinplace
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics Skylake Desktop GT2
a[0]=5.2
a[1]=7.2
a[2]=13.2
a[3]=16.7
a[4]=6.7
[ OK ] testGpuOp.addscalarinplace (34 ms)
[----------] 4 tests from testGpuOp (137 ms total)
[----------] 1 test from testjpeghelper
[ RUN ] testjpeghelper.writeread
[ OK ] testjpeghelper.writeread (0 ms)
[----------] 1 test from testjpeghelper (0 ms total)
[----------] Global test environment tear-down
[==========] 158 tests from 29 test cases ran. (66801 ms total)
[ PASSED ] 158 tests.
YOU HAVE 2 DISABLED TESTS