Last active: September 7, 2023 15:01
batch norm is bad (td3/sac)
My code is not from stable baselines.
Ahh... so this misunderstanding has spread wider than I thought. Maybe there is a chain of misuse, and people never bothered to check.
By the way, when stable-baselines came out, there was no such thing as batch norm in it. The code is great and should indeed be our implementation baseline. But we, "the later generations", really have more responsibility when building on earlier code.
Check out the properties whose names start with `running_` (either in your batch norm layer or in its `state_dict`). They are "learnable" in the sense that they change during training, but not via gradients. They are not present in `parameters()`. All learnable state is in `state_dict()`; `parameters()` contains only the tensors updated by gradients.
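A minimal sketch of this distinction, assuming PyTorch (the framework implied by `state_dict()`/`parameters()`): the batch norm layer's `running_mean` and `running_var` appear in `state_dict()` but not among `parameters()`.

```python
import torch.nn as nn

bn = nn.BatchNorm1d(4)

# Gradient-trained parameters: only the affine weight and bias.
param_names = {name for name, _ in bn.named_parameters()}
print(sorted(param_names))  # ['bias', 'weight']

# state_dict() also holds the running statistics, which are updated
# by exponential moving averages during training, not by gradients.
state_names = set(bn.state_dict().keys())
print(sorted(state_names - param_names))
# ['num_batches_tracked', 'running_mean', 'running_var']
```

This is why copying only `parameters()` between an actor/critic and its target network (as some TD3/SAC implementations do) silently drops the running statistics; copying via `state_dict()` carries them along.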