Unclear sentence about the net_arch argument for on policy algorithms #1408

EBoguslawski · 2023-03-24T15:45:06Z

📚 Documentation

Hello,
I was reading the documentation about how to customize the networks' architecture here:
https://stable-baselines3.readthedocs.io/en/master/guide/custom_policy.html#on-policy-algorithms

And I found this sentence:
"Otherwise, to have actor and critic that share the same network architecture, you only need to specify net_arch=[128, 128] (here, two hidden layers of 128 units each, this is equivalent to net_arch=dict(pi=[128, 128], vf=[128, 128]))."
I think this may cause confusion because, unless I am missing something, the actor and critic share weights in the first case (net_arch=[128, 128]) but not in the second (net_arch=dict(pi=[128, 128], vf=[128, 128])). The two syntaxes are therefore not really equivalent.
The example just below the sentence gives the impression that weights are not shared when using net_arch=[128, 128].
"Same architecture for actor and critic with two layers of size 128: net_arch=[128, 128]"

        obs
   /            \
 <128>          <128>
  |              |
 <128>          <128>
  |              |
action         value

Maybe you could use:

        obs
         |
        <128>
         |
        <128>
   /            \
action         value

Hoping I did not misunderstand something.
Eva

Checklist

I have checked that there is no similar issue in the repo
I have read the documentation

araffin · 2023-03-24T16:10:47Z

The example just below the sentence gives the impression that weights are not shared when using

It's not an impression, it is the case. We changed the behavior in SB3 v1.8.0+ to match the off-policy algorithms and simplify the code: #1292 and #1252
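
To make the difference between the two layouts concrete, here is a rough parameter count in plain Python. It is only a sketch: `OBS_DIM` and `N_ACTIONS` are hypothetical values chosen for illustration, the helper `mlp_param_count` is not part of SB3, and each head is assumed to be a single linear layer on top of the last hidden layer. It compares one shared two-layer trunk (the layout described for v1.7.0 in the report above) with two separate two-layer networks (the v1.8.0+ behavior).

```python
def mlp_param_count(sizes):
    """Count weights and biases of a fully connected stack with the given layer sizes."""
    return sum(n_in * n_out + n_out for n_in, n_out in zip(sizes[:-1], sizes[1:]))


# Hypothetical problem sizes, for illustration only.
OBS_DIM = 4
N_ACTIONS = 2
HIDDEN = [128, 128]  # corresponds to net_arch=[128, 128]

# Hidden layers: obs -> 128 -> 128.
trunk = mlp_param_count([OBS_DIM, *HIDDEN])
# Output heads, assumed to be one linear layer each.
action_head = mlp_param_count([HIDDEN[-1], N_ACTIONS])
value_head = mlp_param_count([HIDDEN[-1], 1])

# Shared layout: one trunk feeds both heads.
shared_total = trunk + action_head + value_head
# Separate layout: actor and critic each own a copy of the hidden layers.
separate_total = 2 * trunk + action_head + value_head

print(f"shared trunk:      {shared_total} parameters")
print(f"separate networks: {separate_total} parameters")
```

With separate networks the hidden layers are duplicated, so the actor and critic have the same architecture but no weights in common.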

EBoguslawski · 2023-03-29T12:58:43Z

Indeed, I am using SB3 v1.7.0. Thank you for your answer!

EBoguslawski added the documentation Improvements or additions to documentation label Mar 24, 2023

EBoguslawski closed this as completed Mar 29, 2023