Hello,
I am very impressed by your work, and I am trying to start my anomaly detection research based on it.
The first thing I am trying to do is to reproduce the results for the SWaT dataset given in Table 2.
I followed the exact steps you provided in scripts/readme.md for SWaT preprocessing.
After running process_swat.py, I got the following shapes for the final data:
train.csv : (47520, 52)
test.csv : (44991, 52)
I noticed that this is slightly different from the data statistics given in Table 1 (my processed data has 5 extra data points). After creating train.csv, test.csv, and list.txt, I also compared the generated files with the demo data (swat_train_demo.csv, swat_test_demo.csv) given in https://drive.google.com/drive/folders/1_4TlatKh-f7QhstaaY7YTSCs8D4ywbWc?usp=sharing, and the first 999 rows of the data didn't match.
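To locate exactly where the processed file starts to diverge from the demo file, a quick pandas sketch like the one below can report the first mismatching row (the function name and file paths are mine, not from the repo; it assumes both CSVs share the same columns):

```python
import pandas as pd

def first_mismatch(path_a: str, path_b: str):
    """Return the 0-based index of the first differing row, or None if the files match.

    Note: NaN != NaN evaluates to True, so fill or drop NaNs first if the
    data contains missing values.
    """
    a = pd.read_csv(path_a)
    b = pd.read_csv(path_b)
    n = min(len(a), len(b))
    diff = (a.iloc[:n].reset_index(drop=True)
            != b.iloc[:n].reset_index(drop=True)).any(axis=1)
    if diff.any():
        return int(diff.idxmax())  # index of the first row-level mismatch
    # identical prefix: files match, or one is a strict prefix of the other
    return None if len(a) == len(b) else n

# e.g. first_mismatch("train.csv", "swat_train_demo.csv")
```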
Finally, I ran your code multiple times with the same seed and data to see whether the performance varies between runs. Unfortunately, fixing the seed didn't help: the performance varied considerably from run to run. (For reference, I used the hyperparameter settings from #4.) I also tried running the code in a CPU-only environment, but the results are still not reproducible:
(1)
F1 score: 0.8163308589607635
precision: 0.9778963414634146
recall: 0.7007099945385036
(2)
F1 score: 0.7394631639063391
precision: 0.9926402943882244
recall: 0.5892954669579464
(3)
F1 score: 0.8220572640509013
precision: 0.9845020325203252
recall: 0.7054432914618606
(4)
F1 score: 0.8120639690887624
precision: 0.9895370128171593
recall: 0.6886947023484434
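As a sanity check on the logged numbers themselves: each printed F1 agrees with the harmonic mean of the printed precision and recall to about 1e-4, so the variance is in the model runs rather than in the metric computation. A minimal check (helper name is mine):

```python
def f1_score(precision: float, recall: float) -> float:
    """F1 as the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Run (1) above: matches the reported F1 up to ~1e-4.
print(f1_score(0.9778963414634146, 0.7007099945385036))  # ≈ 0.8164
```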
How did you evaluate your model for the numbers reported in the paper? Have you come across this problem before?
My questions can be summarized as follows:
1. Why does the difference in data statistics occur?
2. Why does following the exact preprocessing steps produce data that differs from the provided demo data?
3. Why does fixing the seed not work in GDN? Is it related to the atomic (non-deterministic) operations in torch_scatter and torch_sparse?
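On point 3: seeding alone is not enough on GPU, because atomic-add based kernels sum in a non-deterministic order. A seeding sketch I've been using (helper name is mine; assumes PyTorch ≥ 1.8) that at least makes PyTorch raise on its own non-deterministic ops:

```python
import os
import random

import numpy as np
import torch

def seed_everything(seed: int = 42) -> None:
    """Seed all RNGs and opt in to deterministic kernels where available."""
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)  # seeds CPU and all CUDA devices
    os.environ["PYTHONHASHSEED"] = str(seed)
    os.environ["CUBLAS_WORKSPACE_CONFIG"] = ":4096:8"  # required for cuBLAS determinism
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False
    # Raise a RuntimeError instead of silently running a non-deterministic
    # built-in kernel. Caveat: custom extension kernels (torch_scatter,
    # torch_sparse) may not honor this flag, so their atomic adds can still
    # make GPU runs differ even with everything above in place.
    torch.use_deterministic_algorithms(True)
```

Even with this, I'd expect run-to-run differences on GPU as long as the extension kernels use atomics; it only rules out the sources PyTorch itself controls.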
The same thing happened for WADI as well:
- The data statistics are different: train.csv: (102697, 128), test.csv: (17280, 128).
- The processed data and the demo data do not match.
- The code is not reproducible with a fixed seed.
- The results are nowhere near those reported in the paper.
Has anyone been successful at reproducing the results for SWaT and WADI?
I tried to reproduce the results on the WADI and SWaT datasets on my computer, but my results are much worse than both the original paper's and the ones you got. If it is convenient for you, could you please send me a copy of the code you used following ./scripts/readme.md? Thank you very much. My email address is [email protected].