> The example just below the sentence gives the impression that weights are not shared when using net_arch=[128, 128]
It's not an impression, it is indeed the case: we changed the behavior in SB3 v1.8.0+ to match the off-policy algorithms and to simplify the code, see #1292 and #1252.
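To illustrate, here is a minimal pure-Python sketch (not the actual SB3 implementation; the function and layer names are made up for illustration) of how SB3 v1.8.0+ conceptually handles `net_arch`: a plain list is applied to both the policy and value networks, and separate layer stacks are built in either case, so no weights are shared and the two spellings really are equivalent.

```python
def build_mlp_dims(net_arch):
    """Normalize net_arch the way SB3 >= 1.8.0 does conceptually:
    a plain list applies to both networks, and separate layer
    stacks are built either way (no weight sharing)."""
    if isinstance(net_arch, dict):
        pi_dims = net_arch.get("pi", [])
        vf_dims = net_arch.get("vf", [])
    else:
        # A plain list is used for both the actor and the critic.
        pi_dims = vf_dims = list(net_arch)
    # Two independent layer lists -> independent parameters.
    policy_net = [("pi_linear", dim) for dim in pi_dims]
    value_net = [("vf_linear", dim) for dim in vf_dims]
    return policy_net, value_net

# Both spellings now produce the same pair of separate architectures:
assert build_mlp_dims([128, 128]) == build_mlp_dims(
    dict(pi=[128, 128], vf=[128, 128])
)
```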
📚 Documentation
Hello,
I was reading the documentation about how to customize the networks' architecture here:
https://stable-baselines3.readthedocs.io/en/master/guide/custom_policy.html#on-policy-algorithms
And I found this sentence:

"Otherwise, to have actor and critic that share the same network architecture, you only need to specify net_arch=[128, 128] (here, two hidden layers of 128 units each; this is equivalent to net_arch=dict(pi=[128, 128], vf=[128, 128]))."

I think it might cause confusion because (unless I missed something) weights are shared by the actor and critic in the first case (with net_arch=[128, 128]) but not in the second one (with net_arch=dict(pi=[128, 128], vf=[128, 128])). Thus, these syntaxes are not really equivalent.

The example just below that sentence gives the impression that weights are not shared when using net_arch=[128, 128]:

"Same architecture for actor and critic with two layers of size 128: net_arch=[128, 128]"
"Maybe you could use:
Hoping I did not misunderstand something.
Eva