Update README.md
binli123 authored Dec 16, 2020
1 parent 1c82123 commit 6a65271
Showing 1 changed file with 17 additions and 12 deletions.
Precomputed features for the [TCGA Lung Cancer dataset](https://portal.gdc.cancer.gov/) can be downloaded:
```
$ python download.py --dataset=tcga
```
This dataset requires 20GB of free disk space.

## Process WSI data
If you are processing WSIs from raw images, you will need to download the WSIs first.
1. **Download WSIs.**
Navigate to './tcga-download/' and download the WSIs from the [TCGA data portal](https://docs.gdc.cancer.gov/Data_Transfer_Tool/Users_Guide/Getting_Started/) using the manifest and configuration files.
The example assumes a Windows operating system. The WSIs will be saved in './WSI/TCGA-lung/LUAD' and './WSI/TCGA-lung/LUSC'.
The raw WSIs take about 1 TB of disk space and may take several days to download. Open a command-line tool (*Command Prompt* on Windows), navigate to './tcga-download', and use the following commands:
```
$ cd tcga-download
```
```
$ gdc-client -m gdc_manifest.2020-09-06-TCGA-LUAD.txt --config config-LUAD.dtt
$ gdc-client -m gdc_manifest.2020-09-06-TCGA-LUSC.txt --config config-LUSC.dtt
```
2. **Prepare the patches.**
We will be using [OpenSlide](https://openslide.org/), a C library with a [Python API](https://pypi.org/project/openslide-python/), which provides a simple interface for reading WSI data. We refer users to the [OpenSlide Python API documentation](https://openslide.org/api/python/) for details on using this tool.
The patches will be saved in './WSI/TCGA-lung/pyramid' in a pyramidal structure for 20x and 5x magnifications. Navigate to './tcga-download/OpenSlide/bin' and run the script 'TCGA-pre-crop.py' (a minimal OpenSlide sketch follows the command below):
```
$ python TCGA-pre-crop.py
```
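The actual cropping logic lives in 'TCGA-pre-crop.py'; the snippet below is only a minimal sketch of how OpenSlide can tile a single WSI into fixed-size patches at roughly 20x and 5x. The slide path, patch size, and output layout are illustrative assumptions, not the script's real settings.
```python
# Minimal sketch (assumptions: illustrative paths and a 224x224 patch size).
import os
import openslide

slide = openslide.OpenSlide("WSI/TCGA-lung/LUAD/example.svs")  # hypothetical file name
patch_size = 224

# Level 0 of TCGA slides is usually 40x, so a downsample of 2 gives ~20x and 8 gives ~5x.
for tag, downsample in [("20x", 2), ("5x", 8)]:
    level = slide.get_best_level_for_downsample(downsample)
    width, height = slide.level_dimensions[level]
    scale = round(slide.level_downsamples[level])
    out_dir = os.path.join("WSI", "TCGA-lung", "pyramid", tag)
    os.makedirs(out_dir, exist_ok=True)
    for x in range(0, width - patch_size, patch_size):
        for y in range(0, height - patch_size, patch_size):
            # read_region expects level-0 coordinates, so scale (x, y) back up.
            patch = slide.read_region((x * scale, y * scale), level, (patch_size, patch_size)).convert("RGB")
            patch.save(os.path.join(out_dir, f"{x}_{y}.jpeg"))
```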
3. **Train the embedder.**
For training the embedder, we provide a script modified from the [PyTorch implementation of SimCLR](https://github.com/sthalles/SimCLR).
Navigate to './simclr' and edit the attributes in the configuration file 'config.yaml'. You will need to determine a batch size that fits your GPU(s); we recommend a batch size of at least 512 to obtain good SimCLR features (see the check after the command below). The trained model weights and the loss log are saved in the folder './simclr/runs'.
```
$ python run.py
```
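Before launching the run, it can help to inspect the configuration you just edited. The snippet below is a small illustrative check, not part of the repository; the 'batch_size' key name is an assumption about how 'config.yaml' is organized.
```python
# Minimal sketch (assumption: 'config.yaml' exposes a top-level 'batch_size' key).
import yaml

with open("config.yaml") as f:
    cfg = yaml.safe_load(f)

print(cfg)  # inspect all attributes before starting training
if cfg.get("batch_size", 0) < 512:
    print("Warning: a batch size of at least 512 is recommended for good SimCLR features.")
```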

## Training on default datasets
To train DSMIL on a standard MIL benchmark dataset:
```
$ python train_mil.py
```
To switch between MIL benchmark datasets, use the option:
```
[--datasets] # musk1, musk2, elephant, fox, tiger
```
Other options are available for the learning rate (0.0002), the number of cross-validation folds (5), weight decay (5e-3), and the number of epochs (40).

To train DSMIL on the TCGA Lung Cancer dataset (precomputed features):
```
$ python train_tcga.py
```