@shelhamer / readme.md (secret gist)
Last active Jan 26, 2018

FCN-32s Fully Convolutional Semantic Segmentation on PASCAL-Context

weiliu89 commented Apr 7, 2015

I can only reach 33.40 mean I/U after 80,000 iterations. I used the 4,998 training images for training and the 5,015 validation images for testing. I can get to 35.0 after 160,000 iterations, though. Are there other details I need to watch to reach 35.1 in only 80,000 iterations?

whjxnyzh commented Apr 8, 2015

@weiliu89 I use this code to train on my own data. My images are 300x300, the batch size is 1, and my GPU is a K20, but I get an out-of-memory error. Why?

I0408 09:45:56.056066  5286 solver.cpp:214] Iteration 0, loss = 62383.2
I0408 09:45:56.056125  5286 solver.cpp:229]     Train net output #0: loss = 62383.2 (* 1 = 62383.2 loss)
I0408 09:45:56.056143  5286 solver.cpp:489] Iteration 0, lr = 1e-10
I0408 09:46:06.473623  5286 solver.cpp:214] Iteration 20, loss = 40320.5
I0408 09:46:06.473659  5286 solver.cpp:229]     Train net output #0: loss = 17528.8 (* 1 = 17528.8 loss)
I0408 09:46:06.473671  5286 solver.cpp:489] Iteration 20, lr = 1e-10
I0408 09:46:16.893838  5286 solver.cpp:214] Iteration 40, loss = 4002.79
I0408 09:46:16.893874  5286 solver.cpp:229]     Train net output #0: loss = 798.329 (* 1 = 798.329 loss)
I0408 09:46:16.893885  5286 solver.cpp:489] Iteration 40, lr = 1e-10
I0408 09:46:27.312593  5286 solver.cpp:214] Iteration 60, loss = 224.324
I0408 09:46:27.312628  5286 solver.cpp:229]     Train net output #0: loss = 57.2293 (* 1 = 57.2293 loss)
I0408 09:46:27.312639  5286 solver.cpp:489] Iteration 60, lr = 1e-10
I0408 09:46:37.731933  5286 solver.cpp:214] Iteration 80, loss = 7.85354
I0408 09:46:37.731976  5286 solver.cpp:229]     Train net output #0: loss = 0.486825 (* 1 = 0.486825 loss)
I0408 09:46:37.731987  5286 solver.cpp:489] Iteration 80, lr = 1e-10
I0408 09:46:47.627682  5286 solver.cpp:291] Iteration 100, Testing net (#0)
F0408 09:46:48.053794  5286 syncedmem.cpp:51] Check failed: error == cudaSuccess (2 vs. 0)  out of memory
*** Check failure stack trace: ***
Aborted (core dumped)

shelhamer commented Apr 9, 2015

@weiliu89 according to my notes this should be sufficient to reach 35.1 mean I/U in 80,000 iterations when fine-tuning from the VGG16 classifier weights.

weiliu89 commented Apr 9, 2015

@whjxnyzh I merged @shelhamer's pull request BVLC/caffe#2016, which saves some memory. I was using a K40, which has enough memory, but the PR helped when I was dealing with 1500x1500 images.

@shelhamer I used the same protocol described in your notes but cannot reach the same number when fine-tuning from VGG16. A 1.7-point (35.1 - 33.4) mean IoU gap is large in my opinion; I'm not sure whether I'm missing some detail. However, I did run the model you provide and verified that it indeed achieves 35.1 mean IoU on the PASCAL-Context validation set. If I fine-tune for up to 160,000 iterations, my model also catches up to 35.1 mean IoU.

likaidlut commented Apr 13, 2015

@whjxnyzh I have the same problem.

weiliu89 commented Apr 15, 2015

@whjxnyzh @likaidlut My loss doesn't decrease that quickly, and it ends up at more or less the reported results. Did you merge all the necessary pull requests (as listed in https://github.com/longjon/caffe/blob/future/future.sh) before training? Especially the gradient accumulation pull #1977.

KimHoon commented Apr 16, 2015

How do you make 'pascal-context-val-gt59-lmdb'? I know how to make 'pascal-context-val-lmdb' using convert_imageset.cpp, but I don't know how to make the LMDB for the ground-truth labels.

whjxnyzh commented Apr 17, 2015

@shelhamer @weiliu89 What should the loss look like? Mine stays at around 10,000. Is that too big?

weiliu89 commented Apr 17, 2015

@whjxnyzh My loss looks like the following, using the exact solver provided by the author. 10,000 is not too big, because by default the softmax loss is not normalized by the number of pixels.

I0406 11:30:43.457891 23294 solver.cpp:248] Learning Rate Policy: fixed
I0406 11:30:44.058550 23294 solver.cpp:214] Iteration 0, loss = 904850
I0406 11:30:44.058609 23294 solver.cpp:229] Train net output #0: loss = 904850 (* 1 = 904850 loss)
I0406 11:30:44.058651 23294 solver.cpp:587] Iteration 0, lr = 1e-10
I0406 11:30:58.970970 23294 solver.cpp:214] Iteration 20, loss = 742877
I0406 11:30:58.971009 23294 solver.cpp:229] Train net output #0: loss = 674992 (* 1 = 674992 loss)
I0406 11:30:58.971016 23294 solver.cpp:587] Iteration 20, lr = 1e-10
I0406 11:31:13.578385 23294 solver.cpp:214] Iteration 40, loss = 668181
I0406 11:31:13.578558 23294 solver.cpp:229] Train net output #0: loss = 617561 (* 1 = 617561 loss)
I0406 11:31:13.578567 23294 solver.cpp:587] Iteration 40, lr = 1e-10
I0406 11:31:28.126888 23294 solver.cpp:214] Iteration 60, loss = 610147
I0406 11:31:28.126927 23294 solver.cpp:229] Train net output #0: loss = 620603 (* 1 = 620603 loss)
I0406 11:31:28.126935 23294 solver.cpp:587] Iteration 60, lr = 1e-10
I0406 11:31:42.699487 23294 solver.cpp:214] Iteration 80, loss = 580835
I0406 11:31:42.699523 23294 solver.cpp:229] Train net output #0: loss = 654719 (* 1 = 654719 loss)
I0406 11:31:42.699532 23294 solver.cpp:587] Iteration 80, lr = 1e-10
I0406 11:31:56.649629 23294 solver.cpp:214] Iteration 100, loss = 500280
I0406 11:31:56.649773 23294 solver.cpp:229] Train net output #0: loss = 582982 (* 1 = 582982 loss)
I0406 11:31:56.649782 23294 solver.cpp:587] Iteration 100, lr = 1e-10




I0407 03:56:52.927431 23294 solver.cpp:214] Iteration 79900, loss = 59123.9
I0407 03:56:52.927711 23294 solver.cpp:229] Train net output #0: loss = 57815.9 (* 1 = 57815.9 loss)
I0407 03:56:52.927719 23294 solver.cpp:587] Iteration 79900, lr = 1e-10
I0407 03:57:07.777854 23294 solver.cpp:214] Iteration 79920, loss = 73536.9
I0407 03:57:07.777889 23294 solver.cpp:229] Train net output #0: loss = 96303.8 (* 1 = 96303.8 loss)
I0407 03:57:07.777897 23294 solver.cpp:587] Iteration 79920, lr = 1e-10
I0407 03:57:22.179699 23294 solver.cpp:214] Iteration 79940, loss = 61383.8
I0407 03:57:22.179733 23294 solver.cpp:229] Train net output #0: loss = 76730.4 (* 1 = 76730.4 loss)
I0407 03:57:22.179741 23294 solver.cpp:587] Iteration 79940, lr = 1e-10
I0407 03:57:36.917495 23294 solver.cpp:214] Iteration 79960, loss = 57981
I0407 03:57:36.917778 23294 solver.cpp:229] Train net output #0: loss = 32195.5 (* 1 = 32195.5 loss)
I0407 03:57:36.917786 23294 solver.cpp:587] Iteration 79960, lr = 1e-10
I0407 03:57:51.527885 23294 solver.cpp:214] Iteration 79980, loss = 70032.2
I0407 03:57:51.527925 23294 solver.cpp:229] Train net output #0: loss = 48702 (* 1 = 48702 loss)
I0407 03:57:51.527932 23294 solver.cpp:587] Iteration 79980, lr = 1e-10
I0407 03:58:24.327311 23294 solver.cpp:273] Iteration 80000, loss = 64353.8
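
To put those numbers in perspective: the loss here is effectively summed over every pixel (it is only divided by the batch size, which is 1), so its scale is roughly the number of pixels times ln(number of classes). A quick back-of-the-envelope check in Python (the image size below is only an illustrative assumption, since PASCAL-Context image sizes vary):

import math

# With a near-uniform initial prediction over 60 classes (59 + background),
# the per-pixel cross-entropy is about ln(60) nats, and the reported loss is
# the sum of that over all pixels in the image.
num_classes = 60
per_pixel = math.log(num_classes)   # ~4.09

h, w = 500, 375                     # illustrative image size only
print(per_pixel * h * w)            # ~7.7e5, the same order as the ~9e5 starting loss above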

debadeepta commented May 4, 2015

@weiliu89 My loss after 80,000 iterations stays at around 600,000, not around the 60,000 you report. How did you make the LMDBs for the input? I converted the PASCAL-Context labels so that 0 is the background class and 1-59 are the other classes.

Fchaubard commented Jun 5, 2015

@KimHoon Same question. Did you figure it out?

hkcqr commented Jul 23, 2015

@debadeepta Have you figured out why? I got a similar convergence value, around 650,000.

kashefy commented Aug 4, 2015

Same here. I also made sure to map all classes outside the 59-class set to zero (background). Note that classes 1, 2, 3, ... in the 59-class set are not necessarily 1, 2, 3, ... in the full label set; you need to remap them so you end up with indices in the [0, 59] range. I'm still getting a loss above 666,790, with fluctuations that never seem to go below 440K.

I tried describing my procedure in this post.
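
For reference, that remapping can be done with a small lookup table. A minimal sketch (the labels_59 list below is a placeholder; the real indices of the 59 retained categories come from the PASCAL-Context 59-class definition):

import numpy as np

def remap_to_59(full_label, labels_59):
    # full_label: 2-D integer array of full-set PASCAL-Context category indices.
    # labels_59:  the 59 retained full-set indices, in the order that defines
    #             classes 1..59; everything else is mapped to 0 (background).
    lut = np.zeros(int(full_label.max()) + 1, dtype=np.uint8)
    for new_idx, full_idx in enumerate(labels_59, start=1):
        if full_idx < lut.size:
            lut[full_idx] = new_idx
    return lut[full_label]

# Toy illustration only; replace labels_59 with the real 59-entry list.
print(remap_to_59(np.array([[0, 7, 12], [7, 3, 0]]), labels_59=[7, 12]))
# -> [[0 1 2]
#     [1 0 0]]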

irri commented Sep 8, 2015

I've set up Caffe (the future branch) and successfully run the FCN-32s Fully Convolutional Semantic Segmentation on PASCAL-Context model. However, I'm unable to produce cleanly labeled images with it. Here are my results:

Input image: [image]

Ground truth (example from the PASCAL webpage; a good output, but from another model): [image]

My result (not very good): [image]

Any idea where I'm going wrong?

shaibagon commented Sep 9, 2015

@irri please see my answer on Stack Overflow.

Eniac-Xie commented Sep 15, 2015

How can I convert JPEG images into data that Caffe can process?

I used convert_imageset in caffe/tools, but it needs a list of images together with their labels (one integer per image), whereas in FCN-32s the labels are images.

arunmallya commented Oct 1, 2015

@Eniac-Xie: you can use a function like this one, https://gist.github.com/arunmallya/9b67faf63405389afb83, to load CSV data (segmentation labels) into a Datum and then store it in an LMDB database.
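
The linked gist is the real reference; as a rough illustration of the same pattern, an integer label map can be wrapped in a Datum and written to LMDB like this (the array shape and database name here are hypothetical):

import caffe
import lmdb
import numpy as np

# Hypothetical 500x500 label map with class indices in [0, 59].
label = np.zeros((500, 500), dtype=np.uint8)

# caffe.io.array_to_datum expects C x H x W, so add a singleton channel axis.
datum = caffe.io.array_to_datum(label[np.newaxis, :, :])

db = lmdb.open('pascal-context-train-gt59-lmdb', map_size=int(1e12))
with db.begin(write=True) as txn:
    txn.put('{:0>12d}'.format(0), datum.SerializeToString())
db.close()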

mtrth commented Oct 28, 2015

My dataset has 2 classes, with 1000 training images of shape (5, 256, 256) and corresponding ground truth of shape (1, 256, 256), a binary image of 0s and 1s representing the two classes.

When training, solve.py loads the existing caffemodel, which I assume is 3-channel. Since I want to apply it to my 5-channel dataset, can I use the same provided model?
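
One common workaround is net surgery: rename the first convolution layer in the 5-channel prototxt so its weights are not copied automatically, load the released weights for everything else, and transplant the RGB filters by hand. A sketch under those assumptions (the prototxt and layer names below are hypothetical, not files from this gist):

import caffe

# In the hypothetical 5-channel prototxt the first conv is renamed,
# e.g. conv1_1 -> conv1_1_5ch, so copy_from() skips it.
net = caffe.Net('fcn32s_5ch_train_val.prototxt', caffe.TRAIN)
net.copy_from('vgg16fc.caffemodel')   # copies every layer whose name and shape match

# Reference net with the original 3-channel conv1_1.
ref = caffe.Net('fcn32s_train_val.prototxt', 'vgg16fc.caffemodel', caffe.TRAIN)

w3 = ref.params['conv1_1'][0].data          # (64, 3, 3, 3) RGB filters
b3 = ref.params['conv1_1'][1].data
w5 = net.params['conv1_1_5ch'][0].data      # (64, 5, 3, 3) new filters

w5[:, :3] = w3                              # reuse the RGB filters as-is
w5[:, 3:] = w3.mean(axis=1, keepdims=True)  # one heuristic for the extra channels
net.params['conv1_1_5ch'][1].data[...] = b3

net.save('fcn32s_5ch_init.caffemodel')

The same rename-and-copy_from trick is also the usual way to change the number of output classes: give the score layer a new name and the desired num_output, and let the rest be copied.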

ghost commented Nov 5, 2015

I have just tried to run eval.py with the FCN-32s model, but I get the following error:

F1105 19:29:04.896656 25800 layer_factory.hpp:77] Check failed: registry.count(type) == 1 (0 vs. 1) Unknown layer type: Crop (known types: AbsVal, Accuracy, ArgMax, BNLL, Concat, ContrastiveLoss, Convolution, Data, Deconvolution, Dropout, DummyData, Eltwise, EuclideanLoss, Exp, Flatten, HDF5Data, HDF5Output, HingeLoss, Im2col, ImageData, InfogainLoss, InnerProduct, LRN, MVN, MemoryData, MultinomialLogisticLoss, PReLU, Pooling, Power, Python, ReLU, Sigmoid, SigmoidCrossEntropyLoss, Silence, Slice, Softmax, SoftmaxWithLoss, Split, TanH, Threshold, WindowData)
*** Check failure stack trace: ***
Aborted (core dumped)

which says that Crop is an unknown layer type. I have no idea what is going wrong. Could anyone give me some suggestions?

By the way, I am using the required caffe-future branch: I downloaded and unzipped caffe-future.zip, copied the Makefile.config from caffe-master into it, and successfully ran make all and make pycaffe.

mtrth commented Nov 10, 2015

@Jianchao-ICT update your LD_LIBRARY_PATH and PYTHONPATH to point to the new caffe-future branch, which has the Crop layer.
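
If the wrong build still gets imported, one quick sanity check from Python (the checkout path below is a placeholder; LD_LIBRARY_PATH itself has to be set before Python starts):

import sys

# Placeholder path: point this at the caffe-future checkout built with the Crop layer.
sys.path.insert(0, '/path/to/caffe-future/python')

import caffe
print(caffe.__file__)   # should resolve inside the caffe-future checkout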

acgtyrant commented Nov 19, 2015

Hello everyone, I ran eval.py and it prints the following at the end:

...
I1119 19:59:50.529485  6746 net.cpp:241] conv1_1 does not need backward computation.
I1119 19:59:50.529494  6746 net.cpp:241] data_input_0_split does not need backward computation.
I1119 19:59:50.529501  6746 net.cpp:284] This network produces output score
I1119 19:59:50.529527  6746 net.cpp:298] Network initialization done.
I1119 19:59:50.529536  6746 net.cpp:299] Memory required for data: 1277452160
[libprotobuf WARNING google/protobuf/io/coded_stream.cc:537] Reading dangerously large protocol message.  If the message turns out to be larger than 2147483647 bytes, parsing will be halted for security reasons.  To increase the limit (or to disable these warnings), see CodedInputStream::SetTotalBytesLimit() in google/protobuf/io/coded_stream.h.
[libprotobuf WARNING google/protobuf/io/coded_stream.cc:78] The total number of bytes read was 597011289

So do I have to increase the limit and recompile protobuf now? Is there a better solution? Thank you!

acgtyrant commented Nov 23, 2015

I had to use a quick and dirty approach: hack kDefaultTotalBytesLimit and kDefaultTotalBytesWarningThreshold in coded_stream.h, recompile, and reinstall. After that, eval.py finally runs successfully.

I will try to limit the sizes of the big messages now.

xuzhenqi commented Nov 29, 2015

Could anybody explain what the 'Crop' layer is for?

eswears commented Dec 17, 2015

What is the train/val split on the PASCAL Context data that is needed to get the 35.1 mean I/U?

There is a train.txt and val.txt in the VOC2010/ImageSets/Segmentation folder of the PASCAL VOC 2010 download, but these only have 964 images in each. The Long et al. paper seems to use half of the 10,103 images available for training and the other half for testing. So, I don't think these are the correct files. The PASCAL Context Dataset download also doesn't have any files for the train/val split.

eswears commented Jan 6, 2016

I was able to get 53.5% mean accuracy after 10,150 iterations, with a loss of 67,968. This was with a 50/50 random train/test split. The accuracy was determined by adding an 'accuracy' layer to train_val.prototxt. How is the mean I/U determined? Is there a similar layer that can be added?
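
Mean I/U is normally computed offline from a confusion matrix accumulated over the whole validation set rather than with an in-network layer. A rough sketch of that standard recipe (not the authors' scoring script):

import numpy as np

def fast_hist(label, pred, num_classes):
    # Accumulate a num_classes x num_classes confusion matrix.
    mask = (label >= 0) & (label < num_classes)
    return np.bincount(num_classes * label[mask].astype(int) + pred[mask],
                       minlength=num_classes ** 2).reshape(num_classes, num_classes)

def mean_iou(hist):
    # Per-class intersection over union, averaged over the classes that occur.
    inter = np.diag(hist)
    union = hist.sum(1) + hist.sum(0) - inter
    with np.errstate(divide='ignore', invalid='ignore'):
        iu = inter / union.astype(float)
    return np.nanmean(iu)

# Usage: hist = sum of fast_hist(gt.flatten(), out.argmax(0).flatten(), 60)
# over all validation images, then report mean_iou(hist).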

ch977 commented Jan 15, 2016

Why is the learning rate so small? Can anyone explain it, please?

JoestarK commented Jan 16, 2016

Which should I use to train my own data, solver.py or solver.prototxt? I used solver.prototxt to train my data, but the loss didn't decrease. Does anyone have a solution? Thanks.

zmonoid commented Jan 16, 2016

Hi, does anyone know where to download the "vgg16fc.caffemodel" referenced in solver.py?

mhsung commented Jan 18, 2016

I would like to share my script for creating a custom dataset in LMDB format. I would appreciate it if you could let me know if you find any bugs in this code.

#!/usr/bin/python

import glob
import os

import caffe
import lmdb
import numpy as np
from PIL import Image

# Variables: every image is expected to be img_width x img_height.
img_width = 500
img_height = 500


# Paths
# PNG color images
color_dir = './input/color_image_dir'
# PNG label images: per-pixel labels stored in a grayscale image
label_dir = './input/label_image_dir'
output_dir = './lmdb/'


inputs = glob.glob(os.path.join(color_dir, '*.png'))

color_lmdb_name = os.path.join(output_dir, 'color-lmdb')
if not os.path.isdir(color_lmdb_name):
    os.makedirs(color_lmdb_name)
color_in_db = lmdb.open(color_lmdb_name, map_size=int(1e12))

label_lmdb_name = os.path.join(output_dir, 'label-lmdb')
if not os.path.isdir(label_lmdb_name):
    os.makedirs(label_lmdb_name)
label_in_db = lmdb.open(label_lmdb_name, map_size=int(1e12))

num_images = 0
color_mean_color = np.zeros(3)


with color_in_db.begin(write=True) as color_in_txn:
    with label_in_db.begin(write=True) as label_in_txn:

        for in_idx, in_ in enumerate(inputs):
            img_name = os.path.splitext(os.path.basename(in_))[0]
            color_filename = os.path.join(color_dir, img_name + '.png')
            label_filename = os.path.join(label_dir, img_name + '.png')
            print(str(in_idx + 1) + ' / ' + str(len(inputs)))

            # load the color image (or whatever ndarray you need)
            im = np.array(Image.open(color_filename))
            assert im.dtype == np.uint8
            # RGB to BGR (Caffe convention)
            im = im[:, :, ::-1]
            # to Channel x Height x Width order (switch from H x W x C)
            im = im.transpose((2, 0, 1))

            # accumulate the mean BGR color over the dataset
            for i in range(3):
                color_mean_color[i] += im[i, :, :].mean()
            num_images += 1

            # store the color image as a Datum
            # (equivalently: color_im_dat = caffe.io.array_to_datum(im))
            color_im_dat = caffe.proto.caffe_pb2.Datum()
            color_im_dat.channels, color_im_dat.height, color_im_dat.width = im.shape
            assert color_im_dat.height == img_height
            assert color_im_dat.width == img_width
            color_im_dat.data = im.tostring()
            color_in_txn.put('{:0>12d}'.format(in_idx), color_im_dat.SerializeToString())

            # load the label image and store it as a single-channel Datum
            im = np.array(Image.open(label_filename))
            assert im.dtype == np.uint8
            label_im_dat = caffe.proto.caffe_pb2.Datum()
            label_im_dat.channels = 1
            label_im_dat.height, label_im_dat.width = im.shape
            assert label_im_dat.height == img_height
            assert label_im_dat.width == img_width
            label_im_dat.data = im.tostring()
            label_in_txn.put('{:0>12d}'.format(in_idx), label_im_dat.SerializeToString())

label_in_db.close()
color_in_db.close()

# write the mean BGR color (averaged over all images) to a CSV file
color_mean_color /= num_images
np.savetxt(os.path.join(output_dir, 'color-mean.csv'), color_mean_color,
           delimiter=",", fmt='%.4f')

fqnchina commented Feb 12, 2016

pad: 100 in the conv1_1 layer is wrong.

masakinakada commented Mar 3, 2016

When I try to run deploy.prototxt, it uses a very large amount of memory and the process crashes on the CPU. Can anyone help me reduce the memory use or otherwise solve this problem?

Thanks in advance.

smajida commented Mar 10, 2016

Hi guys, how do I fine-tune on a different number of classes? Thanks.

laotao commented Mar 25, 2016

Can I fine-tune this model for high-resolution image classification? The input sizes of AlexNet/CaffeNet/GoogLeNet are too small for my application.

CarrieHui commented Mar 31, 2016

@weiliu89 Hi, I ran into conflicts when I merged PR #2016; it says "Automatic merge failed". What should I do next? Thank you in advance.

twtygqyy commented Apr 17, 2016

Hi, I tried to use a deconvolution layer with group and a bilinear filler for upsampling instead of using the solver script, but I could hardly reproduce the result. Does anybody know the reason?
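
One difference to check is how the upsampling weights are initialized: the Python solver route typically sets the deconvolution filters to an explicit bilinear kernel before training. A sketch of that kind of initialization (my own adaptation, not code taken from this gist):

import numpy as np

def upsample_filt(size):
    # 2-D bilinear kernel of the given size, suitable as a deconvolution filter.
    factor = (size + 1) // 2
    center = factor - 1 if size % 2 == 1 else factor - 0.5
    og = np.ogrid[:size, :size]
    return ((1 - abs(og[0] - center) / float(factor)) *
            (1 - abs(og[1] - center) / float(factor)))

def interp_surgery(net, layers):
    # Set each named deconvolution layer to bilinear upsampling weights.
    for l in layers:
        m, k, h, w = net.params[l][0].data.shape
        assert m == k and h == w, 'expects one filter per class and square kernels'
        net.params[l][0].data[range(m), range(k), :, :] = upsample_filt(h)

A deconvolution layer configured with a bilinear weight filler, the right group/num_output, and a zero learning rate should in principle match this, so group settings and whether the upsampling is learned or fixed are the first things I would compare.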

yjc04 commented Apr 21, 2016

@acgtyrant Hi, I am getting this error:
I0421 02:38:28.223543 24891 net.cpp:299] Memory required for data: 1277452160
[libprotobuf WARNING google/protobuf/io/coded_stream.cc:505] Reading dangerously large protocol message. If the message turns out to be larger than 2147483647 bytes, parsing will be halted for security reasons. To increase the limit (or to disable these warnings), see CodedInputStream::SetTotalBytesLimit() in google/protobuf/io/coded_stream.h.
[libprotobuf WARNING google/protobuf/io/coded_stream.cc:78] The total number of bytes read was 597011289

I installed protobuf using the command given on the Google TensorFlow website:
$ pip install --upgrade https://storage.googleapis.com/tensorflow/linux/cpu/protobuf-3.0.0b2.post2-cp27-none-linux_x86_64.whl

However, once I uninstalled that and tried to compile from the GitHub source of Google protobuf, Python couldn't find google.protobuf at all. Can you help me find a way forward here?

Thanks in advance

tianzq commented Apr 22, 2016

@yjc04 I had the same error. I just modified "kDefaultTotalBytesLimit" and "kDefaultTotalBytesWarningThreshold" in "/usr/include/google/protobuf/io/coded_stream.h". I didn't recompile or reinstall, and it works well.

mjohn123 commented Feb 6, 2017

Hello all, has anyone tried to export the prediction image using C++ instead of Python? I am not very familiar with Python. Thanks, all.

wgeppert commented Feb 12, 2017

Any C++ example for classification or segmentation would be great.

sara-eb commented Mar 6, 2017

Hi, I trained FCN-32s from scratch, but the output is all zeros (a black image). Could someone help me understand the reason? Thanks.
