Dominant model data loading and training problems #1

YingtongDou · 2022-02-26T21:38:10Z

@kayzliu
While writing the Dominant example, I found the following issues. Please fix or answer them accordingly.

1. The current process_graph function is specific to the BlogCatalog dataset. We need a general data loader that can handle any PyG data object. The BlogCatalog preprocessing code can be moved into dominant.py under /example.
2. When I run model.fit(), train_loss becomes NaN after 5-6 epochs.
3. How are the outlier labels for BlogCatalog generated?
4. Should we train the model on clean data and evaluate it on data with outliers?
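
For the NaN loss in particular, one way to narrow it down is to stop training at the first non-finite loss and record the epoch. The following is a minimal sketch, not PyGOD's fit() implementation: `train_with_guard`, `first_non_finite_epoch`, and `fake_step` are hypothetical names, and `step_fn` stands in for whatever runs one training epoch and returns the loss as a number.

```python
import math


def first_non_finite_epoch(losses):
    """Return the 0-based index of the first NaN/inf loss, or None if all are finite."""
    for epoch, loss in enumerate(losses):
        if not math.isfinite(loss):
            return epoch
    return None


def train_with_guard(step_fn, epochs):
    """Call step_fn(epoch) once per epoch and stop right after the first non-finite loss.

    step_fn is a hypothetical callable that runs one epoch and returns the loss.
    """
    history = []
    for epoch in range(epochs):
        loss = float(step_fn(epoch))
        history.append(loss)
        if not math.isfinite(loss):
            break  # keep the offending value in history for inspection
    return history


def fake_step(epoch):
    """Stand-in for a real training step: the loss turns NaN from epoch 5 on."""
    return float("nan") if epoch >= 5 else 1.0 / (epoch + 1)


history = train_with_guard(fake_step, epochs=20)
print("stopped after", len(history), "epochs")
print("first non-finite epoch:", first_non_finite_epoch(history))
```

Stopping at the first bad epoch leaves the model state and inputs from that step available for inspection, which is usually easier than diagnosing a run that continued with NaN weights.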


kayzliu · 2022-02-27T22:32:13Z

The BlogCatalog dataset is used for code validation only, and its labels make no sense. I have updated the code in the latest commit; it should work well with correct outlier labels. Shall we override the original labels in the dataset with outlier labels (in pygod/utils/outlier_generator.py)?
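
To make the label override concrete, here is a minimal sketch of the idea. It does not reproduce the contents of pygod/utils/outlier_generator.py: the function name `override_with_outlier_labels` and the assumption that the injected outliers are available as a list of node indices are both hypothetical.

```python
def override_with_outlier_labels(num_nodes, outlier_indices):
    """Build binary labels (1 = outlier, 0 = normal) that replace the original labels.

    outlier_indices is assumed to hold the node indices marked as outliers.
    """
    outliers = set(outlier_indices)
    invalid = sorted(i for i in outliers if not 0 <= i < num_nodes)
    if invalid:
        raise ValueError(f"outlier indices out of range: {invalid}")
    return [1 if i in outliers else 0 for i in range(num_nodes)]


# Six nodes, with nodes 1 and 4 marked as outliers.
labels = override_with_outlier_labels(6, [1, 4])
print(labels)
```

The original class labels are discarded entirely, so evaluation then measures outlier detection rather than the dataset's original classification task.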


kayzliu self-assigned this Feb 26, 2022

kayzliu closed this as completed in e392427 Feb 27, 2022

kayzliu added a commit that referenced this issue May 10, 2023

Simplify Requirements (#1)

7e5ffb3

* fix requirement
* fix requirement
* fix requirement
* fix requirement
* simplify requirements
