
Reproducing Table 3 Diffuser numbers #27

Open · joeybose opened this issue Jan 9, 2023 · 10 comments

Comments
@joeybose commented Jan 9, 2023

Hi,

It's unclear how to reproduce the Diffuser column in Table 3 of the paper. For example, for the unconditional stacking experiment, the reported table number is 58.7 +/- 2.5. When I run the provided script with the pretrained weights, I get a reward mean of 1.6 +/- 0.067. How can I get roughly the same numbers as reported in the table? I'm sorry if I'm missing something obvious.

@yilundu (Collaborator) commented Jan 9, 2023

Hi,

Sorry about the confusion; we normalize the reported values to lie between 0 and 100. The max reward in the environment is 3.0, so a raw reward of 1.6 corresponds to 1.6 / 3 * 100 ≈ 53.3.
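For anyone else checking their runs against the table, here is a minimal sketch of that conversion. It assumes only the 0-100 scaling and the 3.0 max reward mentioned above; the function name is ours, not from the codebase:

```python
MAX_ENV_REWARD = 3.0  # max reward in the Kuka environment, per the comment above

def normalize_score(raw_reward, max_reward=MAX_ENV_REWARD):
    """Rescale a raw environment reward to the 0-100 scale used in the paper's tables."""
    return raw_reward / max_reward * 100.0

print(normalize_score(1.6))  # -> 53.33..., i.e. the Table 3 scale
```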

@joeybose (Author) commented Jan 9, 2023

Ah, that makes a lot of sense. The number is still a bit off from the one reported in the paper, but much closer now. Do the released pre-trained weights correspond to the final model used to construct the table? Thanks again for your help!

@yilundu (Collaborator) commented Jan 9, 2023

I think the pre-trained weights should give values very close to those in the table, but I restructured the underlying code quite a bit and retrained a model on the new codebase. As a result, the numbers won't be exactly the same (though they should be very close); let me know if there are any drastic differences.

@joeybose (Author) commented Jan 9, 2023

The 53.3 score was obtained with the pre-trained weights, which is well outside the reported standard deviation. Perhaps this is due to the restructuring changes? If it's not too much trouble, would you be able to independently verify what reward you get from your own pretrained weights?

@yilundu (Collaborator) commented Jan 15, 2023

Hi, sure. I reran the exact code that was released with the pretrained model and obtained a score of 55.67 with a standard deviation of 2.4. I'm also running the evaluation with 1000 trials and will report that number once it finishes.
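For reference, a minimal sketch of how such an aggregate score could be computed over many evaluation episodes, using only the 0-100 normalization from earlier in this thread (the `run_episode` helper in the usage comment is hypothetical, standing in for whatever the evaluation script returns per episode):

```python
import numpy as np

MAX_ENV_REWARD = 3.0  # max environment reward mentioned earlier in this thread

def aggregate_scores(raw_rewards):
    """Convert raw episode rewards to the 0-100 scale and return (mean, std)."""
    scores = np.asarray(raw_rewards, dtype=float) / MAX_ENV_REWARD * 100.0
    return scores.mean(), scores.std()

# e.g. for 1000 evaluation episodes:
# mean, std = aggregate_scores([run_episode() for _ in range(1000)])
```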

@joeybose (Author)

Excellent! Thank you for re-running the models. I will close this issue once you report the 1000-trial number.

@Looomo commented Jan 21, 2023

@joeybose Hi, I was trying to reproduce the walker2d-medium-replay-v2 results in Table 2 by training a model from scratch, but couldn't. The results are very different from those obtained with the pretrained weights. Just curious, have you tried training from scratch to reproduce the results? If so, did it work? Thanks!

@joeybose (Author)

@Looomo I've only tried the Kuka experiments, as those were what I was most interested in. Unfortunately, I haven't tried the MuJoCo experiments.

@Looomo commented Jan 21, 2023

@joeybose Thanks, I'll try other datasets and other seeds.

@jannerm (Owner) commented Jan 21, 2023

@Looomo, sorry to hear about the locomotion troubles. I can help you over in #29.

@yilundu, how did the 1000-trial Kuka evaluation go?
