Dominant model data loading and training problems #1

Closed
YingtongDou opened this issue Feb 26, 2022 · 1 comment

@YingtongDou
Member

@kayzliu
While writing the Dominant example, I ran into the following issues. Please fix or answer them accordingly.

  1. The current process_graph function is specific to the BlogCatalog dataset. We need a general data loader that can handle any PyG data object. The BlogCatalog preprocessing code can be moved into dominant.py under /example.
  2. When I run model.fit(), train_loss becomes NaN after 5-6 epochs.
  3. How are the outlier labels for BlogCatalog generated?
  4. Should we train the model on clean data and evaluate it on data with outliers?
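
For item 2, here is a minimal sketch of a guard that stops training at the first non-finite loss, which may help locate the epoch where the NaN first appears. The names `fit_with_nan_guard` and `step_fn` are hypothetical and not part of the library; the toy step function only simulates the failure and stands in for one real training epoch.

```python
import math


def fit_with_nan_guard(step_fn, epochs):
    """Run `step_fn(epoch)` once per epoch and stop at the first non-finite loss.

    `step_fn` is a hypothetical callable standing in for one training epoch;
    it must return the epoch loss as a number.
    Returns (history, bad_epoch), where bad_epoch is None if no NaN/inf occurred.
    """
    history = []
    for epoch in range(epochs):
        loss = float(step_fn(epoch))
        if not math.isfinite(loss):
            # Stop here so the inputs and weights of this epoch can be inspected.
            return history, epoch
        history.append(loss)
    return history, None


# Toy stand-in (assumption): the loss turns into NaN from epoch 5 onward.
def toy_step(epoch):
    return float("nan") if epoch >= 5 else 1.0 / (epoch + 1)


history, bad_epoch = fit_with_nan_guard(toy_step, epochs=10)
```

With a real model, `step_fn` would wrap one optimizer pass and return the scalar loss, so the guard reports the first failing epoch instead of letting the NaN propagate.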
@kayzliu kayzliu self-assigned this Feb 26, 2022
@kayzliu
Member

kayzliu commented Feb 27, 2022

The BlogCatalog dataset is for code validation only. Its labels make no sense as outlier labels. I have updated the code in the latest commit. It should work well with correct outlier labels. Shall we override the original labels in the dataset with outlier labels (in pygod/utils/outlier_generator.py)?
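
For illustration, overriding the labels could look roughly like the sketch below. It assumes structural outliers are injected by wiring a few randomly chosen nodes into a clique, and that the label vector is then replaced with 0/1 outlier flags. `inject_clique_outliers` is a hypothetical helper written for this comment; it is not taken from pygod/utils/outlier_generator.py and works on a plain edge list rather than a PyG data object.

```python
import random


def inject_clique_outliers(num_nodes, edges, clique_size, seed=0):
    """Return (new_edges, labels) after injecting one fully connected clique.

    edges: iterable of (u, v) undirected node pairs.
    labels: list of 0/1 ints, 1 for nodes that belong to the injected clique.
    """
    rng = random.Random(seed)
    # Pick distinct nodes that will become structural outliers.
    outliers = rng.sample(range(num_nodes), clique_size)

    # Store each undirected edge once, as a sorted pair.
    edge_set = {tuple(sorted(e)) for e in edges}
    for i, u in enumerate(outliers):
        for v in outliers[i + 1:]:
            edge_set.add((min(u, v), max(u, v)))

    # Override any original labels with binary outlier flags.
    labels = [0] * num_nodes
    for u in outliers:
        labels[u] = 1
    return sorted(edge_set), labels


new_edges, y = inject_clique_outliers(
    num_nodes=10, edges=[(0, 1), (1, 2)], clique_size=3
)
```

The returned `y` would then replace the dataset's original label vector before evaluation.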

kayzliu added a commit that referenced this issue May 10, 2023
* fix requirement

* fix requirement

* fix requirement

* fix requirement

* simplify requirements