Quick Run
Train A Segmentor
CSSegmentation only supports distributed training which uses DistributedDataParallel.
All outputs (log files and checkpoints) will be saved to the working directory, which is specified by “work_dir” in the config file.
Train on a single machine
You can train the segmentors in a single machine as follows,
bash scripts/distrain.sh ${NGPUS} ${CFGFILEPATH} [optional arguments]
where “${NGPUS}” means the number of GPUS you want to use and “${CFGFILEPATH}” denotes for the config file path. For example, you can train a segmentor on a single machine with the following commands,
bash scripts/distrain.sh 4 csseg/configs/annnet/annnet_resnet50os16_ade20k.py
If you want to resume from the checkpoints, you can run as follows,
bash scripts/distrain.sh 4 csseg/configs/annnet/annnet_resnet50os16_ade20k.py --ckptspath annnet_resnet50os16_ade20k/epoch_44.pth
Train with multiple machines
Now, we only support training with multiple machines with Slurm. Slurm is a good job scheduling system for computing clusters. On a cluster managed by Slurm, you can use “slurmtrain.sh” to spawn training jobs. It supports both single-node and multi-node training.
Specifically, you can train the segmentors with multiple machines as follows,
bash scripts/slurmtrain.sh ${PARTITION} ${JOBNAME} ${NGPUS} ${CFGFILEPATH} [optional arguments]
Here is an example of using 16 GPUs to train PSPNet on the dev partition,
bash scripts/slurmtrain.sh dev pspnet 16 csseg/configs/pspnet/pspnet_resnet101os8_ade20k.py
Test A Segmentor
We provide testing scripts to evaluate a whole dataset (Cityscapes, PASCAL VOC, ADE20k, etc.), and also some high-level apis for easier integration to other projects.
Test on a single machine
You can test the segmentors in a single machine as follows,
bash scripts/distest.sh ${NGPUS} ${CFGFILEPATH} ${ckptspath} [optional arguments]
For example, you can test a segmentor on a single machine with the following commands,
bash scripts/distest.sh 4 csseg/configs/annnet/annnet_resnet50os16_ade20k.py annnet_resnet50os16_ade20k/epoch_130.pth
Test with multiple machines
Now, we only support testing with multiple machines with Slurm. Slurm is a good job scheduling system for computing clusters. On a cluster managed by Slurm, you can use “slurmtest.sh” to spawn testing jobs. It supports both single-node and multi-node testing.
Specifically, you can test the segmentors with multiple machines as follows,
bash scripts/slurmtest.sh ${PARTITION} ${JOBNAME} ${NGPUS} ${CFGFILEPATH} ${ckptspath} [optional arguments]
Here is an example of using 16 GPUs to test PSPNet on the dev partition,
bash scripts/slurmtest.sh dev pspnet 16 csseg/configs/pspnet/pspnet_resnet101os8_ade20k.py pspnet_resnet101os8_ade20k/epoch_130.pth
Inference A Segmentor
You can apply the segmentor as follows:
bash scripts/inference.sh ${CFGFILEPATH} ${ckptspath} [optional arguments]
For example, if you want to inference one image, the command can be,
bash scripts/inference.sh csseg/configs/pspnet/pspnet_resnet101os8_ade20k.py pspnet_resnet101os8_ade20k/epoch_130.pth --imagepath dog.jpg
If you want to inference the images in one directory, the command can be,
bash scripts/inference.sh csseg/configs/pspnet/pspnet_resnet101os8_ade20k.py pspnet_resnet101os8_ade20k/epoch_130.pth --imagedir dogs