Commit

Final update for this cleanup phase
ceshine committed Oct 31, 2019
1 parent 99b6ddc commit 2827ec2
Showing 10 changed files with 48 additions and 28 deletions.
Binary file added Lee2019.pdf
26 changes: 20 additions & 6 deletions README.md
@@ -1,8 +1,26 @@
# 7th place solution to The 3rd YouTube-8M Video Understanding Challenge

(WIP) This is the final state of the codebase at the end of the competition. Code cleanup and documentation are under way.
A brief model summary can be found [here](https://www.kaggle.com/c/youtube8m-2019/discussion/112349). Please refer to the [workshop paper](Lee2019.pdf) for more details.

A brief model summary can be found [here](https://www.kaggle.com/c/youtube8m-2019/discussion/112349). A more detailed summary will be added later as a paper.
## 20191031 Update

- Redundant functions and classes have been removed.
- Some minor refactoring.
- **Manage the models using YAML config files**: a YAML config file now specifies the model architecture and training parameters. An exported model consists of a YAML file and a pickled state dictionary.
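
The export format described above can be sketched as follows (a minimal illustration, not the repository's actual code; the config keys are stand-ins, and an in-memory buffer takes the place of the file on disk):

```python
import io
import pickle

# Stand-in for the YAML config that describes the architecture and
# training parameters (in the repo this lives in a .yaml file).
config = {
    "model": {"fcn_dim": 2048, "drop": 0.5},
    "training": {"lr": "2e-4", "batch_size": 128},
}

# Stand-in for a PyTorch state dict: parameter name -> tensor values.
state_dict = {"fc.weight": [[0.1, 0.2], [0.3, 0.4]], "fc.bias": [0.0, 0.0]}

# Export: only the state dict is pickled; the config stays human-readable.
buffer = io.BytesIO()
pickle.dump(state_dict, buffer)

# Restore: rebuild the model from the config, then load the pickled weights.
buffer.seek(0)
restored = pickle.load(buffer)
assert restored == state_dict
```

The benefit over pickling the whole model object is that the weights file no longer depends on the exact class definition at pickling time; the config alone is enough to rebuild the architecture.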

**Correction to the paper** (and potential bugs): During the code cleanup, I found that near the end of the competition I had set `num_workers=1` for the train data loader when training segment classifiers. In the paper I wrote that I used `num_workers>1` to add more randomness. That was a mistake. In fact, using `num_workers>1` caused some convergence issues when I tried to reproduce the results. There might be some undiscovered bugs in the data loader. Using only one worker, although slower, should reproduce the results correctly.
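
One well-known pitfall that could account for this (an assumption on my part, not something confirmed in the repository): with a PyTorch `IterableDataset`, every DataLoader worker receives its own full copy of the dataset, so `num_workers>1` duplicates and interleaves the stream unless per-worker sharding is implemented. A pure-Python simulation of that failure mode:

```python
def iterable_dataset():
    # Stand-in for an IterableDataset yielding training examples.
    yield from range(5)

def simulate_loader(num_workers):
    # Each worker independently iterates its own full copy of the dataset,
    # which is what happens when an IterableDataset does no worker sharding.
    streams = [list(iterable_dataset()) for _ in range(num_workers)]
    return [example for stream in streams for example in stream]

print(len(simulate_loader(1)))  # 5 examples, as intended
print(len(simulate_loader(2)))  # 10 examples: every item is seen twice
```

With one worker the epoch has the intended size; with two, every example appears twice per epoch, which changes the effective training distribution and could plausibly cause convergence issues.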

### Model Reproduction

I've managed to reproduce the results with the cleaned codebase and Docker image (using some of the remaining GCP credit). Two base models and seven segment classifiers are enough to obtain the 7th place:

![reproduction results](images/reproduction_results.png)

Notes:

1. Because the data loader is reshuffled after each resumption from an instance preemption, the base model cannot be exactly reproduced. The base model performs slightly worse this time (in terms of local CV results), which affected the downstream models.
2. The Dockerized version seems to be slower, but your mileage may vary.
3. The training scripts under the `/scripts` folder have been updated.

## System Environment

@@ -99,7 +117,3 @@ The submission file will be created as `sub.csv` at the project root folder.
## Troubleshooting

- **RuntimeError: received 0 items of ancdata**: [Increasing ulimit and file descriptors limit on Linux](https://glassonionblog.wordpress.com/2013/01/27/increase-ulimit-and-file-descriptors-limit/).
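
As an alternative to raising the limit shell-wide, the soft file-descriptor limit can also be raised from inside the training process itself (a sketch; the target value of 4096 is illustrative, and the `resource` module is Unix-only):

```python
import resource

# "received 0 items of ancdata" usually surfaces when the process runs out
# of file descriptors while data-loader workers pass data over Unix sockets.
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)

# Raise the soft limit toward 4096, but never above the hard limit.
target = 4096 if hard == resource.RLIM_INFINITY else min(4096, hard)
resource.setrlimit(resource.RLIMIT_NOFILE, (max(soft, target), hard))
```

Raising the limit persistently still requires the `ulimit`/`limits.conf` route described in the linked post; this in-process change only lasts for the lifetime of the process.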

## Potential Improvements

1. **Config-file-based model creation**: currently the entire PyTorch model object is pickled into a file on disk. This avoids having to remember the hyper-parameters when restoring models, and thus accelerates model iteration. However, it is not considered best practice. Storing the hyper-parameters in a config file is a better solution. I'll have to do some research to find out how to implement this properly.
Binary file added images/reproduction_results.png
13 changes: 7 additions & 6 deletions scripts/context-agnostic.bash
@@ -1,6 +1,7 @@
SEED=31537 python -m yt8m.train_pure_segment data/cache/video/ dbof-3.pth --name dbof-3 --steps 8000 --ckpt-interval 4000 --offset 3 --lr 2e-4 --fold 2 --batch-size 128
SEED=31537 python -m yt8m.train_pure_segment data/cache/video/ dbof-3.pth --name dbof-3 --steps 8000 --ckpt-interval 4000 --offset 3 --lr 2e-4 --fold 1 --batch-size 128
SEED=1822 python -m yt8m.train_pure_segment data/cache/video/ dbof-3.pth --name dbof-3 --steps 9000 --ckpt-interval 3000 --offset 3 --lr 2e-4 --fold 0 --batch-size 128
SEED=1423 python -m yt8m.train_pure_segment data/cache/video/ nxvlad-2.pth --name nxvlad-2 --steps 12000 --ckpt-interval 4000 --offset 3 --lr 2e-4 --fold 0 --batch-size 128
SEED=423 python -m yt8m.train_pure_segment data/cache/video/ nxvlad-2.pth --name nxvlad-2 --steps 8000 --ckpt-interval 4000 --offset 3 --lr 2e-4 --fold 1 --batch-size 128
SEED=33537 python -m yt8m.train_pure_segment data/cache/video/ nxvlad-2.pth --name nxvlad-2 --steps 12000 --ckpt-interval 4000 --offset 3 --lr 2e-4 --fold 2 --batch-size 128
SEED=1213 python -m yt8m.train_pure_segment scripts/pure_segment_dbof.yaml data/cache/video/dbof-3/ --fold 0 --name dbof-3
SEED=1216 python -m yt8m.train_pure_segment scripts/pure_segment_dbof.yaml data/cache/video/dbof-3/ --fold 1 --name dbof-3
SEED=1351 python -m yt8m.train_pure_segment scripts/pure_segment_dbof.yaml data/cache/video/dbof-3/ --fold 2 --name dbof-3

SEED=5696 python -m yt8m.train_pure_segment scripts/pure_segment_nextvlad.yaml data/cache/video/nextvlad-2/ --fold 0 --name nextvlad-2
SEED=1696 python -m yt8m.train_pure_segment scripts/pure_segment_nextvlad.yaml data/cache/video/nextvlad-2/ --fold 1 --name nextvlad-2 --steps 12000
SEED=2396 python -m yt8m.train_pure_segment scripts/pure_segment_nextvlad.yaml data/cache/video/nextvlad-2/ --fold 2 --name nextvlad-2
11 changes: 5 additions & 6 deletions scripts/context-aware.bash
@@ -1,6 +1,5 @@
SEED=98998 python -m yt8m.train_segment_w_context data/cache/video/ dbof-1.pth nxvlad-2.pth --name dbof-1_nxvlad-2 --steps 12000 --ckpt-interval 4000 --offset 3 --lr 2e-4 --fcn-dim 2048 --drop 0.5 --fold 6 --se-reduction 4 --max-len 150 --batch-size 128
SEED=93498 python -m yt8m.train_segment_w_context data/cache/video/ dbof-1.pth nxvlad-2.pth --name dbof-1_nxvlad-2 --steps 12000 --ckpt-interval 4000 --offset 3 --lr 2e-4 --fcn-dim 2048 --drop 0.5 --fold 7 --se-reduction 4 --max-len 150 --batch-size 128
SEED=23498 python -m yt8m.train_segment_w_context data/cache/video/ dbof-2.pth nxvlad-2.pth --name dbof-2_nxvlad-2 --steps 12000 --ckpt-interval 3000 --offset 3 --lr 2e-4 --fcn-dim 2048 --drop 0.5 --fold 5 --se-reduction 4 --max-len 150 --batch-size 128
SEED=18448 python -m yt8m.train_segment_w_context data/cache/video/ dbof-3.pth nxvlad-2.pth --name dbof-3_nxvlad-2 --steps 9000 --ckpt-interval 3000 --offset 3 --lr 2e-4 --fcn-dim 2048 --drop 0.5 --fold 3 --se-reduction 4 --max-len 150 --batch-size 128
SEED=28448 python -m yt8m.train_segment_w_context data/cache/video/ dbof-3.pth nxvlad-2.pth --name dbof-3_nxvlad-2 --steps 9000 --ckpt-interval 3000 --offset 3 --lr 2e-4 --fcn-dim 2048 --drop 0.5 --fold 4 --se-reduction 4 --max-len 150 --batch-size 128
SEED=7498 python -m yt8m.train_segment_w_context data/cache/video/ nxvlad-2.pth nxvlad-2.pth --name nxvlad-2_nxvlad-2 --steps 15000 --ckpt-interval 5000 --offset 3 --lr 2e-4 --fcn-dim 2048 --drop 0.5 --fold 4 --se-reduction 4 --max-len 150 --batch-size 128
SEED=4055 python -m yt8m.train_segment_w_context scripts/segment_with_context.yaml data/cache/video/dbof-3 data/cache/video/nextvlad-2 --fold 3 --name dbof-3_nextvlad-2 --steps 12000
SEED=5055 python -m yt8m.train_segment_w_context scripts/segment_with_context.yaml data/cache/video/dbof-3 data/cache/video/nextvlad-2 --fold 4 --name dbof-3_nextvlad-2

SEED=5455 python -m yt8m.train_segment_w_context scripts/segment_with_context.yaml data/cache/video/nextvlad-2 data/cache/video/nextvlad-2 --fold 5 --name nextvlad-2_x2
SEED=3055 python -m yt8m.train_segment_w_context scripts/segment_with_context.yaml data/cache/video/dbof-3 data/cache/video/dbof-3 --fold 4 --name dbof-3_x2
12 changes: 4 additions & 8 deletions scripts/pretraining.bash
@@ -1,8 +1,4 @@
SEED=27805 python -m yt8m.train_video nextvlad --steps 200000 --ckpt-interval 10000 --lr 3e-4 --groups 16 --batch-size 48 --n-clusters 64 --max-len 150
mv data/cache/video/baseline_model.pth data/cache/video/nxvlad-2.pth
SEED=17805 python -m yt8m.train_video dbof --steps 100000 --ckpt-interval 10000 --lr 3e-4 --batch-size 128 --max-len 150
mv data/cache/video/baseline_model.pth data/cache/video/dbof-3.pth
SEED=4827 python -m yt8m.train_video dbof --steps 120000 --ckpt-interval 10000 --lr 3e-4 --batch-size 32 --n-mixtures 5
mv data/cache/video/baseline_model.pth data/cache/video/dbof-1.pth
SEED=1635 python -m yt8m.train_video dbof --steps 100000 --ckpt-interval 10000 --lr 4e-4 --batch-size 32 --max-len 200
mv data/cache/video/baseline_model.pth data/cache/video/dbof-2.pth
SEED=4827 python -m yt8m.train_video scripts/video_gated_dbof.yaml
mv $(find data/cache/video/ -name "20*" | head -1) data/cache/video/dbof-3
SEED=1635 python -m yt8m.train_video scripts/video_nextvlad.yaml
mv $(find data/cache/video/ -name "20*" | head -1) data/cache/video/nextvlad-2
9 changes: 9 additions & 0 deletions scripts/pure_segment_nextvlad.yaml
@@ -0,0 +1,9 @@
pure_segment:
  training:
    lr: 2e-4
    batch_size: 128
    steps: 8000
    ckpt_interval: 4000
    offset: 3
    weight_decay: 0.02
    eps: 1e-7
2 changes: 1 addition & 1 deletion scripts/segment_with_context.yaml
@@ -8,7 +8,7 @@ segment_w_context:
    n_mixture: 4
  training:
    lr: 2e-4
    batch_size: 64
    batch_size: 128
    steps: 9000
    ckpt_interval: 3000
    offset: 3
1 change: 1 addition & 0 deletions yt8m/dataloader.py
@@ -37,6 +37,7 @@ def __init__(self, file_paths, seed=939, debug=False,
                 vocab_path="./data/segment_vocabulary.csv",
                 epochs=1, max_examples=None, offset=0):
        super(YoutubeSegmentDataset).__init__()
        print("Offset:", offset)
        self.file_paths = file_paths
        self.seed = seed
        self.debug = debug
2 changes: 1 addition & 1 deletion yt8m/train_pure_segment.py
Expand Up @@ -183,7 +183,7 @@ def main():
        torch.optim.Adam(
            optimizer_grouped_parameters,
            lr=lr, eps=float(training_config["eps"])),
        [training_config["weight_decay"], 0]
        [float(training_config["weight_decay"]), 0]
    )
    # optimizer = torch.optim.Adam(
    #     optimizer_grouped_parameters, lr=lr, eps=1e-7)
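
The `float(...)` wrappers in this hunk are needed because YAML 1.1 parsers such as PyYAML read scientific notation without a decimal point (`1e-7`, `2e-4`) as strings rather than floats, so some values from the config files above arrive as `str`. A stdlib-only illustration of the conversion (the dict mimics what the parser typically returns; no PyYAML call is made here):

```python
# What a YAML 1.1 parser typically produces for the training config:
# "1e-7" lacks a decimal point, so it stays a string; 0.02 becomes a float.
training_config = {"lr": "2e-4", "eps": "1e-7", "weight_decay": 0.02}

# Explicit conversion makes the values safe to hand to an optimizer,
# regardless of whether the parser returned str or float.
lr = float(training_config["lr"])
eps = float(training_config["eps"])
weight_decay = float(training_config["weight_decay"])

print(lr, eps, weight_decay)  # 0.0002 1e-07 0.02
```

Writing the values as `2.0e-4` and `1.0e-7` in the YAML would also make the parser return floats directly, but the defensive `float()` call handles both spellings.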
