-
Notifications
You must be signed in to change notification settings - Fork 8.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
New info
API for vectorized environments #2657
#2773
Conversation
info
API for vectorized environments #2657info
API for vectorized environments #2657
I think all changes are in place now; the new |
Thanks for the PR, it looks good to me. I will get someone else to review as well. |
Add some fixes and solved a merge conflict
Yup, will update the docs 👍 |
@gianlucadecola Sorry, I did a bit more complete review and found a couple more things that could be updated. |
No problem 👍 I'll take a look |
So I'm converting this as a draft since it will require a different approach based on: google/brax#194 |
Did the renaming and incorporate the info_processor into VecEnv |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left a bunch of comments, it's on the right track - most comments are stylistic and nitpicks, with I think one or two about functionality.
@@ -211,6 +217,42 @@ def seed(self, seed=None): | |||
"Please use `env.reset(seed=seed) instead in VectorEnvs." | |||
) | |||
|
|||
def _add_info(self, infos: dict, info: dict, env_num: int) -> dict: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems that this and the next method are basically static, correct? The only thing is the number of envs which can be passed as an argument. I think it's better to extract them to separate functions.
for _ in range(ENV_STEPS): | ||
action = env.action_space.sample() | ||
_, _, dones, infos = env.step(action) | ||
if any(dones): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is my understanding correct that the way this test is done, there's a reasonable chance that there will never be two environments terminating at the same time? This leaves us with a pretty important untested functionality. Please design a test for various cases, i.e. 1, 2, 3 (all) envs terminating at once.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct. Will add other test cases 👍
Hey, thanks a lot for the review, added a bunch of comments, will work on that soon 👍 |
@RedTachyon I've done some changes, when you have time could you please take a look? Particularly at: |
This is a draft referring to issue #2657 in order to gain some feedback before moving on.
For the new
info
API I'm working to make the code more modular and backward-compatible with the oldinfo
in order to make switching to a differentinfo
format in the future eventually easier.Default behaviour is still the same so
envs = gym.vector.make("CartPole-v1", asynchronous=False, num_envs=3)
will return
info
in the format:If we want to switch to another format
envs = gym.vector.make("CartPole-v1", asynchronous=False, num_envs=3, info_format="brax")
we get
If we then want to format info for example like proposed by @pseudo-rnd-thoughts
All we need to do is registering a new
VecEnvInfoStrategy
The same approach applies to wrappers like
RecordEpisodeStatistics
which get theinfo_strategy
by the wrapped environments and process the info accordingly.Current draft works on SyncVectorEnv only. If we agree on this approach I will continue implementing it with Async and other wrappers