batch norm is bad (td3/sac)
Check out the properties whose names start with `running_` (either in your batch norm layer or in its `state_dict()`). They are "learnable" in the sense that they change during training, but not via gradients, and they are not present in `parameters()`. All learnable parameters are in `state_dict()`; `parameters()` contains only those that are updated by gradients.
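
For concreteness, a minimal sketch (assuming PyTorch) of where those running statistics live:

```python
# Minimal sketch (assumes PyTorch): batch-norm running statistics appear
# in state_dict() but not in parameters().
import torch
import torch.nn as nn

bn = nn.BatchNorm1d(4)

print([name for name, _ in bn.named_parameters()])
# ['weight', 'bias']  -- the only things updated by gradients

print(list(bn.state_dict().keys()))
# ['weight', 'bias', 'running_mean', 'running_var', 'num_batches_tracked']

# running_mean / running_var change during forward passes in train mode,
# without any gradient step:
bn.train()
x = torch.randn(8, 4)
bn(x)
print(bn.running_mean)  # no longer all zeros
```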
> My code is not from stable baselines.
Ahh... so this misunderstanding has spread wider than I thought. Maybe there is a chain of misuse and people never bother to check.
When stable-baselines came out, there was no such thing as batch norm, by the way. The code is great and should indeed be our baseline implementation. But we, "the later generations", really have a greater responsibility when working on earlier code.
@honglu2875 Hi. My code is not from stable baselines. Also, the batch norm learnable parameters that have to be updated on the target are present in the `parameters()` method: which variables are missing? And what do you mean when you say I abused `.eval()` and `.train()`?
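
For reference, here is a minimal sketch of the failure mode under discussion (assuming PyTorch; `critic`, `critic_target`, and `tau` are hypothetical names): a Polyak update that iterates over `parameters()` never touches the batch norm buffers, so the target network's `running_mean`/`running_var` stay stale unless the buffers are synced separately.

```python
# Sketch only (assumptions: PyTorch, a TD3/SAC-style target network;
# `critic`, `critic_target`, and `tau` are hypothetical names).
import copy
import torch
import torch.nn as nn

critic = nn.Sequential(
    nn.Linear(4, 8), nn.BatchNorm1d(8), nn.ReLU(), nn.Linear(8, 1)
)
critic_target = copy.deepcopy(critic)
tau = 0.005

with torch.no_grad():
    # This only touches weight/bias; running_mean/running_var are buffers
    # and are never synced this way:
    for p, p_t in zip(critic.parameters(), critic_target.parameters()):
        p_t.mul_(1 - tau).add_(tau * p)

    # One way to keep the target's statistics in sync is to also
    # Polyak-average (or just copy) the buffers:
    for b, b_t in zip(critic.buffers(), critic_target.buffers()):
        if b.dtype.is_floating_point:
            b_t.mul_(1 - tau).add_(tau * b)
        else:
            b_t.copy_(b)  # e.g. num_batches_tracked (integer)
```

Whether to Polyak-average the float buffers or simply hard-copy them from the online network is a design choice; copying is the simpler option.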