Created
January 6, 2016 22:24
MNIST tutorial error with more detail
Interactive GPU job:
qsub -q gpu -l nodes=1:gpu -A minervaG -I
. caffe_gpu_setup.sh
caffe.bin device_query -gpu 0
I0106 16:20:46.109344 29397 caffe.cpp:111] Querying GPUs 0
I0106 16:20:48.749228 29397 common.cpp:168] Device id: 0
I0106 16:20:48.749281 29397 common.cpp:169] Major revision number: 3
I0106 16:20:48.749287 29397 common.cpp:170] Minor revision number: 5
I0106 16:20:48.749294 29397 common.cpp:171] Name: Tesla K20m
I0106 16:20:48.749299 29397 common.cpp:172] Total global memory: 5032706048
I0106 16:20:48.749307 29397 common.cpp:173] Total shared memory per block: 49152
I0106 16:20:48.749312 29397 common.cpp:174] Total registers per block: 65536
I0106 16:20:48.749317 29397 common.cpp:175] Warp size: 32
I0106 16:20:48.749323 29397 common.cpp:176] Maximum memory pitch: 2147483647
I0106 16:20:48.749328 29397 common.cpp:177] Maximum threads per block: 1024
I0106 16:20:48.749332 29397 common.cpp:178] Maximum dimension of block: 1024, 1024, 64
I0106 16:20:48.749341 29397 common.cpp:181] Maximum dimension of grid: 2147483647, 65535, 65535
I0106 16:20:48.749346 29397 common.cpp:184] Clock rate: 705500
I0106 16:20:48.749351 29397 common.cpp:185] Total constant memory: 65536
I0106 16:20:48.749356 29397 common.cpp:186] Texture alignment: 512
I0106 16:20:48.749361 29397 common.cpp:187] Concurrent copy and execution: Yes
I0106 16:20:48.749372 29397 common.cpp:189] Number of multiprocessors: 13
I0106 16:20:48.749377 29397 common.cpp:190] Kernel execution timeout: No
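The memory and clock figures in the query output are raw values (bytes, and kHz as CUDA reports clock rates), so they are easier to read after conversion. A minimal sketch; the helper names `bytes_to_gib` and `khz_to_mhz` are made up for illustration and are not part of Caffe:

```shell
#!/usr/bin/env sh
# Hypothetical helpers (not part of Caffe) for reading device_query output.

# Convert a byte count to GiB, two decimals.
bytes_to_gib() {
  LC_ALL=C awk -v b="$1" 'BEGIN { printf "%.2f\n", b / (1024 * 1024 * 1024) }'
}

# Convert a clock rate in kHz to MHz, one decimal.
khz_to_mhz() {
  LC_ALL=C awk -v k="$1" 'BEGIN { printf "%.1f\n", k / 1000 }'
}

bytes_to_gib 5032706048   # "Total global memory" from the log above
khz_to_mhz 705500         # "Clock rate" from the log above
```

For the Tesla K20m in the log this gives roughly 4.69 GiB of global memory and a 705.5 MHz clock.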
First, get the data...
perdue@gpu1> pwd
/home/perdue/caffe/data/mnist
perdue@gpu1> ls -l
total 53744
-rwxr-xr-x 1 perdue e-938 788 Dec 20 17:02 get_mnist.sh
-rw-r--r-- 1 perdue e-938 7840016 Jul 21 2000 t10k-images-idx3-ubyte
-rw-r--r-- 1 perdue e-938 10008 Jul 21 2000 t10k-labels-idx1-ubyte
-rw-r--r-- 1 perdue e-938 47040016 Jul 21 2000 train-images-idx3-ubyte
-rw-r--r-- 1 perdue e-938 60008 Jul 21 2000 train-labels-idx1-ubyte
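The sizes in the listing match what the IDX layout predicts, which is a quick way to confirm the download and unzip were complete: an image file is a 16-byte header plus 28x28 bytes per image, and a label file is an 8-byte header plus one byte per label. A minimal sketch of that check; the function names are hypothetical and not part of Caffe's scripts:

```shell
#!/usr/bin/env sh
# Hypothetical helpers (not part of Caffe) to sanity-check MNIST IDX files.

# Expected size in bytes of an IDX image file holding N 28x28 images.
idx_images_size() { echo $(( 16 + $1 * 28 * 28 )); }

# Expected size in bytes of an IDX label file holding N labels.
idx_labels_size() { echo $(( 8 + $1 )); }

# check_size FILE EXPECTED_BYTES: report whether FILE has the expected size.
check_size() {
  if [ ! -f "$1" ]; then
    echo "MISSING $1"
    return 1
  fi
  actual=$(wc -c < "$1" | tr -d ' ')
  if [ "$actual" = "$2" ]; then
    echo "ok      $1"
  else
    echo "BAD     $1 ($actual bytes, expected $2)"
    return 1
  fi
}

idx_images_size 60000   # training images
idx_labels_size 60000   # training labels
idx_images_size 10000   # test images
idx_labels_size 10000   # test labels
```

Run from `data/mnist`, a call such as `check_size train-images-idx3-ubyte "$(idx_images_size 60000)"` would compare one file against its expected size.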
Next, prepare the data. The conversion script needed a small edit to work here: the version below calls `convert_mnist_data.bin` directly and leaves `$BUILD` unused.
perdue@gpu1> more examples/mnist/create_mnist.sh
#!/usr/bin/env sh
# This script converts the mnist data into lmdb/leveldb format,
# depending on the value assigned to $BACKEND.
EXAMPLE=examples/mnist
DATA=data/mnist
BUILD=build/examples/mnist
BACKEND="lmdb"
echo "Creating ${BACKEND}..."
rm -rf $EXAMPLE/mnist_train_${BACKEND}
rm -rf $EXAMPLE/mnist_test_${BACKEND}
convert_mnist_data.bin $DATA/train-images-idx3-ubyte \
  $DATA/train-labels-idx1-ubyte $EXAMPLE/mnist_train_${BACKEND} --backend=${BACKEND}
convert_mnist_data.bin $DATA/t10k-images-idx3-ubyte \
  $DATA/t10k-labels-idx1-ubyte $EXAMPLE/mnist_test_${BACKEND} --backend=${BACKEND}
echo "Done."
Note the comment in `get_mnist.sh`:
# Creation is split out because leveldb sometimes causes segfault
# and needs to be re-created.
Go to prep area...
perdue@gpu1> pwd
/home/perdue/caffe
perdue@gpu1> ls examples/mnist/create_mnist.sh
examples/mnist/create_mnist.sh
perdue@gpu1> ./examples/mnist/create_mnist.sh
Creating lmdb...
F0106 14:51:25.869540 29018 convert_mnist_data.cpp:91] Check failed: mdb_env_open(mdb_env, db_path, 0, 0664) == 0 (5 vs. 0) mdb_env_open failed
*** Check failure stack trace: ***
@ 0x2b5816751b4d google::LogMessage::Fail()
@ 0x2b5816755b67 google::LogMessage::SendToLog()
@ 0x2b58167539e9 google::LogMessage::Flush()
@ 0x2b5816753ced google::LogMessageFatal::~LogMessageFatal()
@ 0x403d29 convert_dataset()
@ 0x40462c main
@ 0x2b581ed19d5d __libc_start_main
@ 0x4025a9 (unknown)
./examples/mnist/create_mnist.sh: line 17: 29018 Aborted convert_mnist_data.bin $DATA/train-images-idx3-ubyte $DATA/train-labels-idx1-ubyte $EXAMPLE/mnist_train_${BACKEND} --backend=${BACKEND}
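The `(5 vs. 0)` in the failed check means `mdb_env_open` returned 5 rather than 0; on Linux, 5 is the errno value `EIO` (input/output error). One possible cause, which the log alone does not confirm, is that the target directory is on a network filesystem, where LMDB's memory-mapped files and locks are not reliable. A minimal diagnostic sketch under that assumption; the function names, the `MNIST_DB_DIR` variable, and the list of filesystem types are all illustrative, and `stat -f -c %T` assumes GNU coreutils:

```shell
#!/usr/bin/env sh
# Hypothetical diagnostic helpers. The network-filesystem explanation is an
# assumption, not a confirmed cause of the failure above.

# Print the filesystem type of a path (GNU stat), or "unknown" if that fails.
fs_type() {
  stat -f -c %T "$1" 2>/dev/null || echo unknown
}

# Succeed if the type name looks like a network or cluster filesystem.
# The list is illustrative, not exhaustive.
is_network_fs() {
  case "$1" in
    nfs*|lustre|gpfs|cifs|smb*|afs|ceph|panfs) return 0 ;;
    *) return 1 ;;
  esac
}

# MNIST_DB_DIR is an assumed variable name; it defaults to the current directory.
dir="${MNIST_DB_DIR:-.}"
t=$(fs_type "$dir")
if is_network_fs "$t"; then
  echo "$dir is on $t: consider writing the LMDB to node-local disk instead"
else
  echo "$dir is on $t"
fi
```

If the directory does turn out to be on a network mount, pointing `EXAMPLE` in `create_mnist.sh` at a node-local directory would be one thing to try.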