Time-Series Representation Learning via Temporal and Contextual Contrasting (TS-TCC) [Paper]
by: Emadeldeen Eldele, Mohamed Ragab, Zhenghua Chen, Min Wu, Chee Keong Kwoh, Xiaoli Li and Cuntai Guan
This work has been accepted for publication at the International Joint Conference on Artificial Intelligence (IJCAI-21) (Acceptance Rate: 13.9%).
- Python3.x
- Pytorch==1.7
- Numpy
- Sklearn
- Pandas
- openpyxl (for classification reports)
- mne=='0.20.7' (For Sleep-EDF preprocessing)
- mat4py (for Fault diagnosis preprocessing)
We used four public datasets in this study:
- Sleep-EDF
- HAR
- Epilepsy (this dataset was recently removed, so the data file has been uploaded to the repo)
- Fault Diagnosis
The data should be in a separate folder called "data" inside the project folder.
Inside that folder, you should have a separate subfolder for each dataset. Each subfolder should contain "train.pt", "val.pt" and "test.pt" files.
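To check this layout before training, a minimal sketch along the following lines may help. The helper name missing_split_files is hypothetical and not part of the repo; only the folder and file names come from the description above.

```python
from pathlib import Path

# Split file names taken from the README; the helper itself is hypothetical.
SPLIT_FILES = ("train.pt", "val.pt", "test.pt")


def missing_split_files(data_root, dataset):
    """Return the expected split files that are absent for one dataset."""
    dataset_dir = Path(data_root) / dataset
    return [name for name in SPLIT_FILES if not (dataset_dir / name).is_file()]


if __name__ == "__main__":
    # "HAR" is one of the dataset folder names used in this README.
    missing = missing_split_files("data", "HAR")
    print("missing files:", missing or "none")
```

An empty result means all three split files are present for that dataset.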
The data files should be structured as dictionaries, as follows:
train.pt = {"samples": data, "labels": labels}
and similarly for val.pt and test.pt.
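The dictionary layout can be sanity-checked with a small sketch like the one below. The function check_split_dict is hypothetical, and it assumes one label per sample along the first dimension, which the description above does not state explicitly. Plain lists stand in for the real data so the snippet runs without PyTorch.

```python
def check_split_dict(split):
    """Raise ValueError if `split` does not follow the documented layout."""
    for key in ("samples", "labels"):
        if key not in split:
            raise ValueError(f"missing key: {key!r}")
    # Assumption: one label per sample, indexed along the first dimension.
    n_samples, n_labels = len(split["samples"]), len(split["labels"])
    if n_samples != n_labels:
        raise ValueError(f"{n_samples} samples but {n_labels} labels")
    return n_samples


# Plain lists stand in for the real tensors or arrays here.
toy_split = {"samples": [[0.1, 0.2], [0.3, 0.4]], "labels": [0, 1]}
print(check_split_dict(toy_split))
```

The same check should apply to a dictionary loaded from one of the .pt files, since len() on a tensor returns the size of its first dimension.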
The details of preprocessing are as follows:
Create a folder named data_files in the path data_preprocessing/sleep-edf/.
Download the dataset files and place them in this folder.
Run the script preprocess_sleep_edf.py to generate the numpy files. You will find the numpy files of each PSG file in another folder named sleepEDF20_fpzcz (you can change these names from args).
You will also find the data of each subject in the folder sleepEDF20_fpzcz_subjects (since each subject has two nights of data).
Finally, run the file generate_train_val_test.py to generate the files; it will automatically place them in the data/sleepEDF folder.
When you download the dataset and extract the zip file, you will find the data in a folder named UCI HAR Dataset. Place it in the data_preprocessing/uci_har/ folder and run the preprocess_har.py file.
Download the data file into the data_files folder and run the preprocessing scripts.
The configuration files in the config_files folder should have the same name as the dataset folder name. For example, for the HAR dataset, the data folder name is HAR and the configuration file is HAR_Configs.py.
From these files, you can update the training parameters.
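As an illustration of this naming convention, the sketch below derives the expected configuration file path from a dataset folder name. The helper config_file_for is hypothetical, and it assumes every dataset follows the same <name>_Configs.py pattern shown for HAR.

```python
from pathlib import Path


def config_file_for(dataset, config_dir="config_files"):
    """Build the expected config path for a dataset folder name."""
    # Assumption: all datasets use the "<name>_Configs.py" pattern shown for HAR.
    return Path(config_dir) / f"{dataset}_Configs.py"


print(config_file_for("HAR").as_posix())
```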
You can select one of several training modes:
- Random Initialization (random_init)
- Supervised training (supervised)
- Self-supervised training (self_supervised)
- Fine-tuning the self-supervised model (fine_tune)
- Training a linear classifier (train_linear)
The code also lets you set a name for the experiment and a name for each separate run within it. It also lets you choose a random seed value.
To use these options:
python main.py --experiment_description exp1 --run_description run_1 --seed 123 --training_mode random_init --selected_dataset HAR
Note that the dataset name should match the folder name inside the "data" folder, and the training mode should be one of those listed above.
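If you want to repeat a run over several seeds, the command above can be assembled programmatically. The sketch below only builds and prints the command lines; it does not execute them. The helper build_command and the seed values are illustrative assumptions, while the flags and mode names are the ones listed above.

```python
import shlex

# Training modes as listed in this README.
TRAINING_MODES = (
    "random_init",
    "supervised",
    "self_supervised",
    "fine_tune",
    "train_linear",
)


def build_command(dataset, mode, seed, experiment="exp1", run="run_1"):
    """Return the main.py invocation as an argument list."""
    if mode not in TRAINING_MODES:
        raise ValueError(f"unknown training mode: {mode!r}")
    return [
        "python", "main.py",
        "--experiment_description", experiment,
        "--run_description", run,
        "--seed", str(seed),
        "--training_mode", mode,
        "--selected_dataset", dataset,
    ]


# Illustrative seeds; each run gets its own run description.
for i, seed in enumerate((123, 456, 789), start=1):
    cmd = build_command("HAR", "random_init", seed, run=f"run_{i}")
    print(shlex.join(cmd))
```

Each printed line can be pasted into a shell, or the argument list can be passed to subprocess.run from the project folder.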
To train the model in the fine_tune and train_linear modes, you have to run self_supervised first.
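This ordering constraint can be expressed as a small sketch that returns the modes to run, in order, for a given target mode. The helper modes_to_run is hypothetical and not part of the repo.

```python
# Modes that need a self-supervised model to exist first (per this README).
REQUIRES_PRETRAINING = {"fine_tune", "train_linear"}


def modes_to_run(target_mode):
    """Return the training modes to run, in order, to reach `target_mode`."""
    if target_mode in REQUIRES_PRETRAINING:
        return ["self_supervised", target_mode]
    return [target_mode]


print(modes_to_run("fine_tune"))
print(modes_to_run("supervised"))
```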
- The experiments are saved in "experiments_logs" directory by default (you can change that from args too).
- Each experiment will have a log file and a final classification report for modes other than "self-supervised".
If you find this work useful, please consider citing it.
@inproceedings{ijcai2021-324,
title = {Time-Series Representation Learning via Temporal and Contextual Contrasting},
author = {Eldele, Emadeldeen and Ragab, Mohamed and Chen, Zhenghua and Wu, Min and Kwoh, Chee Keong and Li, Xiaoli and Guan, Cuntai},
booktitle = {Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, {IJCAI-21}},
pages = {2352--2359},
year = {2021},
}
For any issues/questions regarding the paper or reproducing the results, please contact me.
Emadeldeen Eldele
School of Computer Science and Engineering (SCSE),
Nanyang Technological University (NTU), Singapore.
Email: emad0002{at}e.ntu.edu.sg