New `info` API for vectorized environments #2657 #2773

gianlucadecola · 2022-04-23T09:56:33Z

This is a draft referring to issue #2657 in order to gain some feedback before moving on.
For the new info API I'm working to make the code more modular and backward-compatible with the old info in order to make switching to a different info format in the future eventually easier.

Default behaviour is still the same so
envs = gym.vector.make("CartPole-v1", asynchronous=False, num_envs=3)
will return info in the format:

[{'terminal_observation': array([ 0.1307018,  1.7346063, -0.2267962, -2.7928443], dtype=float32)},
 {},
 {}]

If we want to switch to another format
envs = gym.vector.make("CartPole-v1", asynchronous=False, num_envs=3, info_format="brax")
we get

{'terminal_observation': [None,
                          array([-0.09317806, -1.5342456 ,  0.22191852,  2.548524  ], dtype=float32),
                          None]}

If we then want to format info for example like proposed by @pseudo-rnd-thoughts

{
   "key": [
         (data, environment number) 
         ...
   ], 
   ...
}

All we need to do is registering a new VecEnvInfoStrategy

The same approach applies to wrappers like RecordEpisodeStatistics which get the info_strategy by the wrapped environments and process the info accordingly.

Current draft works on SyncVectorEnv only. If we agree on this approach I will continue implementing it with Async and other wrappers

gianlucadecola · 2022-04-25T22:58:34Z

I think all changes are in place now; the new info API works for both async and sync vector envs; the changes are backward compatible with the previous gym version and, in the future, if we want to add different info formatting all we need to do is adding a new ProcessingStrategy

gym/vector/async_vector_env.py

gym/vector/sync_vector_env.py

gym/vector/utils/info_strategies.py

gym/wrappers/record_episode_statistics.py

pseudo-rnd-thoughts · 2022-04-26T16:32:00Z

Thanks for the PR, it looks good to me. I will get someone else to review as well.
Would you be able to add documentation for this PR please, https://github.com/Farama-Foundation/gym-docs

gianlucadecola · 2022-04-26T21:57:48Z

Add some fixes and solved a merge conflict

Thanks for the PR, it looks good to me. I will get someone else to review as well. Would you be able to add documentation for this PR please, https://github.com/Farama-Foundation/gym-docs

Yup, will update the docs 👍

gym/vector/__init__.py

gym/vector/utils/info_strategies.py

gym/wrappers/record_episode_statistics.py

tests/vector/test_info_format_strategies.py

tests/wrappers/test_record_video.py

pseudo-rnd-thoughts · 2022-04-27T10:27:49Z

@gianlucadecola Sorry, I did a bit more complete review and found a couple more things that could be updated.

gianlucadecola · 2022-04-27T10:53:35Z

@gianlucadecola Sorry, I did a bit more complete review and found a couple more things that could be updated.

No problem 👍 I'll take a look

gym/vector/utils/info_strategies.py

gianlucadecola · 2022-05-07T18:11:30Z

So I'm converting this as a draft since it will require a different approach based on: google/brax#194
The new default behavior will be the Brax format so this PR will introduce a breaking change. Instead of having different strategies a wrapper will be added to convert back the info to the "classic" format.

gianlucadecola · 2022-05-14T15:35:59Z

Did the renaming and incorporate the info_processor into VecEnv

gym/wrappers/record_episode_statistics.py

gym/vector/sync_vector_env.py

gym/vector/vector_env.py

gym/wrappers/record_episode_statistics.py

gym/wrappers/vec_info_to_classic.py

gym/vector/async_vector_env.py

gym/wrappers/record_episode_statistics.py

gym/wrappers/vec_info_to_classic.py

tests/wrappers/test_record_episode_statistics.py

…ining. naming.

RedTachyon

I left a bunch of comments, it's on the right track - most comments are stylistic and nitpicks, with I think one or two about functionality.

RedTachyon · 2022-05-19T13:50:23Z

gym/vector/vector_env.py

@@ -211,6 +217,42 @@ def seed(self, seed=None):
            "Please use `env.reset(seed=seed) instead in VectorEnvs."
        )

+    def _add_info(self, infos: dict, info: dict, env_num: int) -> dict:


It seems that this and the next method are basically static, correct? The only thing is the number of envs which can be passed as an argument. I think it's better to extract them to separate functions.

gym/vector/vector_env.py

gym/wrappers/record_episode_statistics.py

gym/wrappers/vector_list_info.py

tests/vector/test_vector_env_info.py

RedTachyon · 2022-05-19T14:28:48Z

tests/vector/test_vector_env_info.py

+    for _ in range(ENV_STEPS):
+        action = env.action_space.sample()
+        _, _, dones, infos = env.step(action)
+        if any(dones):


Is my understanding correct that the way this test is done, there's a reasonable chance that there will never be two environments terminating at the same time? This leaves us with a pretty important untested functionality. Please design a test for various cases, i.e. 1, 2, 3 (all) envs terminating at once.

Correct. Will add other test cases 👍

tests/wrappers/test_vector_list_info.py

gianlucadecola · 2022-05-19T16:42:46Z

Hey, thanks a lot for the review, added a bunch of comments, will work on that soon 👍

gianlucadecola · 2022-05-23T10:40:10Z

@RedTachyon I've done some changes, when you have time could you please take a look? Particularly at:
#2773 (comment)
#2773 (comment)
#2773 (comment)

gianlucadecola marked this pull request as ready for review April 25, 2022 22:51

gianlucadecola changed the title ~~[Draft] New info API for vectorized environments #2657~~ New info API for vectorized environments #2657 Apr 25, 2022

pseudo-rnd-thoughts reviewed Apr 26, 2022

View reviewed changes

gym/vector/async_vector_env.py Outdated Show resolved Hide resolved

pseudo-rnd-thoughts reviewed Apr 26, 2022

View reviewed changes

gym/vector/sync_vector_env.py Outdated Show resolved Hide resolved

pseudo-rnd-thoughts reviewed Apr 26, 2022

View reviewed changes

gym/vector/utils/info_strategies.py Outdated Show resolved Hide resolved

pseudo-rnd-thoughts reviewed Apr 26, 2022

View reviewed changes

gym/vector/utils/info_strategies.py Outdated Show resolved Hide resolved

pseudo-rnd-thoughts reviewed Apr 26, 2022

View reviewed changes

gym/wrappers/record_episode_statistics.py Outdated Show resolved Hide resolved