Reproducing Detectron2 Pretrained Models: Data Preparation, Training Commands, Log Analysis, and the Output Directory
Object detection is a core task in deep learning projects. This article walks through reproducing an object detection model with Detectron2, covering training data preparation, the training commands, analysis of the training log, the training metrics, and the files in the training output directory and their roles. In particular, we demonstrate how to continue training with the resume feature after an interruption, and we compare our reproduced model against the corresponding model in the Model Zoo.
1. Training Data Preparation
COCO (Common Objects in Context) is a widely used dataset for image recognition, object detection, and segmentation. We use it for model training and evaluation. Its directory layout:
/mnt/coco
├── annotations
├── annotations_trainval2014.zip
├── annotations_trainval2017.zip
├── test2014
├── test2014.zip
├── test2017
├── test2017.zip
├── train2014
├── train2014.zip
├── train2017
├── train2017.zip
├── val2014
├── val2014.zip
├── val2017
└── val2017.zip
Directories and files
- annotations/: the COCO annotation files, typically JSON, containing image labels, bounding boxes, segmentation masks, and so on.
- annotations_trainval2014.zip and annotations_trainval2017.zip: zipped annotation files for the COCO 2014 and 2017 train/val splits.
- test2014/ and test2017/: COCO 2014 and 2017 test images, used for model testing.
- test2014.zip and test2017.zip: zipped COCO 2014 and 2017 test images.
- train2014/ and train2017/: COCO 2014 and 2017 training images, used for model training.
- train2014.zip and train2017.zip: zipped COCO 2014 and 2017 training images.
- val2014/ and val2017/: COCO 2014 and 2017 validation images, used for model validation.
- val2014.zip and val2017.zip: zipped COCO 2014 and 2017 validation images.
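Before launching a long run, it is worth sanity-checking this layout. Below is a minimal sketch using only the standard library; the root path and the expected image counts (taken from the training log later in this article) are assumptions about this particular setup:

```python
import json
import os

coco_root = "/mnt/coco"  # assumed location, matching the layout above

for split, expected in [("train2017", 118287), ("val2017", 5000)]:
    img_dir = os.path.join(coco_root, split)
    ann_file = os.path.join(coco_root, "annotations", f"instances_{split}.json")
    assert os.path.isdir(img_dir), f"missing image directory: {img_dir}"
    assert os.path.isfile(ann_file), f"missing annotation file: {ann_file}"
    with open(ann_file) as f:
        ann = json.load(f)
    # COCO 2017 ships 118287 train and 5000 val images (see the training log below).
    print(f"{split}: {len(ann['images'])} images (expected {expected}), "
          f"{len(ann['annotations'])} annotations")
```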
2. Training Commands
Before starting training, set an environment variable that tells Detectron2 where the datasets live. With the layout above, the builtin COCO loader looks for the data under $DETECTRON2_DATASETS/coco/, so the variable points at /mnt/ rather than /mnt/coco:
export DETECTRON2_DATASETS=/mnt/
First training command
nohup ./train_net.py --config-file ../configs/COCO-Detection/faster_rcnn_R_50_FPN_1x.yaml --num-gpus 8 OUTPUT_DIR /mnt/output/ > train.log 2>&1 &
- nohup: keeps the command running even after the terminal is closed.
- ./train_net.py: the training script that drives the whole training process.
- --config-file: path to the config file.
- --num-gpus 8: train on 8 GPUs.
- OUTPUT_DIR /mnt/output/: a config override that sets the output directory.
- > train.log 2>&1: redirects stdout and stderr to train.log.
- &: puts the command in the background.
Second training command (resuming after an interruption)
If training is interrupted, it can be continued from the last checkpoint with the resume feature:
nohup ./train_net.py --config-file ../configs/COCO-Detection/faster_rcnn_R_50_FPN_3x.yaml --num-gpus 8 --resume OUTPUT_DIR /mnt/output/ MODEL.WEIGHTS /mnt/output/model_0029999.pth > train.log 2>&1 &
- --config-file: path to the config file (this run uses the 3x schedule, whose MAX_ITER is 270000).
- --resume: continue from where the previous run stopped.
- MODEL.WEIGHTS: path to the checkpoint weights to resume from.
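For context on what --resume actually does, here is a minimal sketch of a train_net.py-style entry point, assuming the standard structure of detectron2's tools/train_net.py (the real script additionally registers evaluators and has an --eval-only branch):

```python
from detectron2.config import get_cfg
from detectron2.engine import (DefaultTrainer, default_argument_parser,
                               default_setup, launch)

def main(args):
    cfg = get_cfg()
    cfg.merge_from_file(args.config_file)  # e.g. faster_rcnn_R_50_FPN_3x.yaml
    cfg.merge_from_list(args.opts)         # e.g. OUTPUT_DIR, MODEL.WEIGHTS overrides
    cfg.freeze()
    default_setup(cfg, args)

    trainer = DefaultTrainer(cfg)
    # With resume=True, the trainer restores model weights, optimizer state,
    # LR scheduler and iteration counter from the checkpoint recorded in
    # OUTPUT_DIR/last_checkpoint, then continues instead of starting at 0.
    trainer.resume_or_load(resume=args.resume)
    return trainer.train()

if __name__ == "__main__":
    args = default_argument_parser().parse_args()
    launch(main, args.num_gpus, num_machines=args.num_machines,
           machine_rank=args.machine_rank, dist_url=args.dist_url, args=(args,))
```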
3. Training Log Analysis
The log from the resumed run is reproduced below, abridged in the middle (note the jump from iteration ~30k to ~270k). The key excerpts are explained afterwards.
nohup: ignoring input
Command Line Args: Namespace(config_file='../configs/COCO-Detection/faster_rcnn_R_50_FPN_3x.yaml', dist_url='tcp://127.0.0.1:49152', eval_only=False, machine_rank=0, num_gpus=8, num_machines=1, opts=['OUTPUT_DIR', '/mnt/output/', 'MODEL.WEIGHTS', '/mnt/output/model_0029999.pth'], resume=True)
[09/06 02:16:26 detectron2]: Rank of current process: 0. World size: 8
[09/06 02:16:30 detectron2]: Environment info:
------------------------------- --------------------------------------------------------------
sys.platform linux
Python 3.8.10 (default, Nov 14 2022, 12:59:47) [GCC 9.4.0]
numpy 1.22.2
detectron2 0.6 @/root/detectron2/detectron2
Compiler GCC 9.4
CUDA compiler CUDA 12.0
detectron2 arch flags 5.2, 6.0, 6.1, 7.0, 7.5, 8.0, 8.6, 9.0
DETECTRON2_ENV_MODULE <not set>
PyTorch 1.14.0a0+44dac51 @/usr/local/lib/python3.8/dist-packages/torch
PyTorch debug build False
torch._C._GLIBCXX_USE_CXX11_ABI True
GPU available Yes
GPU 0,1,2,3,4,5,6,7 Tesla V100-SXM2-16GB (arch=7.0)
Driver version 535.161.08
CUDA_HOME /usr/local/cuda
Pillow 9.2.0
torchvision 0.15.0a0 @/usr/local/lib/python3.8/dist-packages/torchvision
torchvision arch flags 5.2, 6.0, 6.1, 7.0, 7.5, 8.0, 8.6, 9.0
fvcore 0.1.5.post20221221
iopath 0.1.9
cv2 4.6.0
------------------------------- --------------------------------------------------------------
PyTorch built with:
  - GCC 9.4
  - C++ Version: 201402
  - Intel(R) oneAPI Math Kernel Library Version 2021.1-Product Build 20201104 for Intel(R) 64 architecture applications
  - Intel(R) MKL-DNN v2.7.0 (Git Hash N/A)
  - OpenMP 201511 (a.k.a. OpenMP 4.5)
  - LAPACK is enabled (usually provided by MKL)
  - NNPACK is enabled
  - CPU capability usage: NO AVX
  - CUDA Runtime 12.0
  - NVCC architecture flags: -gencode;arch=compute_52,code=sm_52;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86;-gencode;arch=compute_90,code=sm_90;-gencode;arch=compute_90,code=compute_90
  - CuDNN 8.7 (built against CUDA 11.8)
  - Magma 2.6.2
  - Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=12.0, CUDNN_VERSION=8.7.0, CXX_COMPILER=/usr/bin/c++, CXX_FLAGS=-fno-gnu-unique -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -DEDGE_PROFILER_USE_KINETO -O2 -fPIC -Wall -Wextra -Werror=return-type -Werror=non-virtual-dtor -Wnarrowing -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wunused-local-typedefs -Wno-unused-parameter -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Werror=cast-function-type -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX512=1, TORCH_DISABLE_GPU_ASSERTS=ON, TORCH_VERSION=1.14.0, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=ON, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, USE_ROCM=OFF,
[09/06 02:16:30 detectron2]: Command line arguments: Namespace(config_file='../configs/COCO-Detection/faster_rcnn_R_50_FPN_3x.yaml', dist_url='tcp://127.0.0.1:49152', eval_only=False, machine_rank=0, num_gpus=8, num_machines=1, opts=['OUTPUT_DIR', '/mnt/output/', 'MODEL.WEIGHTS', '/mnt/output/model_0029999.pth'], resume=True)
[09/06 02:16:30 detectron2]: Contents of args.config_file=../configs/COCO-Detection/faster_rcnn_R_50_FPN_3x.yaml:
_BASE_: "../Base-RCNN-FPN.yaml"
MODEL:
  WEIGHTS: "detectron2://ImageNetPretrained/MSRA/R-50.pkl"
  MASK_ON: False
  RESNETS:
    DEPTH: 50
SOLVER:
  STEPS: (210000, 250000)
  MAX_ITER: 270000
[09/06 02:16:30 detectron2]: Running with full config:
CUDNN_BENCHMARK: false
DATALOADER:
  ASPECT_RATIO_GROUPING: true
  FILTER_EMPTY_ANNOTATIONS: true
  NUM_WORKERS: 4
  REPEAT_SQRT: true
  REPEAT_THRESHOLD: 0.0
  SAMPLER_TRAIN: TrainingSampler
DATASETS:
  PRECOMPUTED_PROPOSAL_TOPK_TEST: 1000
  PRECOMPUTED_PROPOSAL_TOPK_TRAIN: 2000
  PROPOSAL_FILES_TEST: []
  PROPOSAL_FILES_TRAIN: []
  TEST:
  - coco_2017_val
  TRAIN:
  - coco_2017_train
GLOBAL:
  HACK: 1.0
INPUT:
  CROP:
    ENABLED: false
    SIZE:
    - 0.9
    - 0.9
    TYPE: relative_range
  FORMAT: BGR
  MASK_FORMAT: polygon
  MAX_SIZE_TEST: 1333
  MAX_SIZE_TRAIN: 1333
  MIN_SIZE_TEST: 800
  MIN_SIZE_TRAIN:
  - 640
  - 672
  - 704
  - 736
  - 768
  - 800
  MIN_SIZE_TRAIN_SAMPLING: choice
  RANDOM_FLIP: horizontal
MODEL:
  ANCHOR_GENERATOR:
    ANGLES:
    - - -90
      - 0
      - 90
    ASPECT_RATIOS:
    - - 0.5
      - 1.0
      - 2.0
    NAME: DefaultAnchorGenerator
    OFFSET: 0.0
    SIZES:
    - - 32
    - - 64
    - - 128
    - - 256
    - - 512
  BACKBONE:
    FREEZE_AT: 2
    NAME: build_resnet_fpn_backbone
  DEVICE: cuda
  FPN:
    FUSE_TYPE: sum
    IN_FEATURES:
    - res2
    - res3
    - res4
    - res5
    NORM: ''
    OUT_CHANNELS: 256
  KEYPOINT_ON: false
  LOAD_PROPOSALS: false
  MASK_ON: false
  META_ARCHITECTURE: GeneralizedRCNN
  PANOPTIC_FPN:
    COMBINE:
      ENABLED: true
      INSTANCES_CONFIDENCE_THRESH: 0.5
      OVERLAP_THRESH: 0.5
      STUFF_AREA_LIMIT: 4096
    INSTANCE_LOSS_WEIGHT: 1.0
  PIXEL_MEAN:
  - 103.53
  - 116.28
  - 123.675
  PIXEL_STD:
  - 1.0
  - 1.0
  - 1.0
  PROPOSAL_GENERATOR:
    MIN_SIZE: 0
    NAME: RPN
  RESNETS:
    DEFORM_MODULATED: false
    DEFORM_NUM_GROUPS: 1
    DEFORM_ON_PER_STAGE:
    - false
    - false
    - false
    - false
    DEPTH: 50
    NORM: FrozenBN
    NUM_GROUPS: 1
    OUT_FEATURES:
    - res2
    - res3
    - res4
    - res5
    RES2_OUT_CHANNELS: 256
    RES5_DILATION: 1
    STEM_OUT_CHANNELS: 64
    STRIDE_IN_1X1: true
    WIDTH_PER_GROUP: 64
  RETINANET:
    BBOX_REG_LOSS_TYPE: smooth_l1
    BBOX_REG_WEIGHTS: &id002
    - 1.0
    - 1.0
    - 1.0
    - 1.0
    FOCAL_LOSS_ALPHA: 0.25
    FOCAL_LOSS_GAMMA: 2.0
    IN_FEATURES:
    - p3
    - p4
    - p5
    - p6
    - p7
    IOU_LABELS:
    - 0
    - -1
    - 1
    IOU_THRESHOLDS:
    - 0.4
    - 0.5
    NMS_THRESH_TEST: 0.5
    NORM: ''
    NUM_CLASSES: 80
    NUM_CONVS: 4
    PRIOR_PROB: 0.01
    SCORE_THRESH_TEST: 0.05
    SMOOTH_L1_LOSS_BETA: 0.1
    TOPK_CANDIDATES_TEST: 1000
  ROI_BOX_CASCADE_HEAD:
    BBOX_REG_WEIGHTS:
    - &id001
      - 10.0
      - 10.0
      - 5.0
      - 5.0
    - - 20.0
      - 20.0
      - 10.0
      - 10.0
    - - 30.0
      - 30.0
      - 15.0
      - 15.0
    IOUS:
    - 0.5
    - 0.6
    - 0.7
  ROI_BOX_HEAD:
    BBOX_REG_LOSS_TYPE: smooth_l1
    BBOX_REG_LOSS_WEIGHT: 1.0
    BBOX_REG_WEIGHTS: *id001
    CLS_AGNOSTIC_BBOX_REG: false
    CONV_DIM: 256
    FC_DIM: 1024
    FED_LOSS_FREQ_WEIGHT_POWER: 0.5
    FED_LOSS_NUM_CLASSES: 50
    NAME: FastRCNNConvFCHead
    NORM: ''
    NUM_CONV: 0
    NUM_FC: 2
    POOLER_RESOLUTION: 7
    POOLER_SAMPLING_RATIO: 0
    POOLER_TYPE: ROIAlignV2
    SMOOTH_L1_BETA: 0.0
    TRAIN_ON_PRED_BOXES: false
    USE_FED_LOSS: false
    USE_SIGMOID_CE: false
  ROI_HEADS:
    BATCH_SIZE_PER_IMAGE: 512
    IN_FEATURES:
    - p2
    - p3
    - p4
    - p5
    IOU_LABELS:
    - 0
    - 1
    IOU_THRESHOLDS:
    - 0.5
    NAME: StandardROIHeads
    NMS_THRESH_TEST: 0.5
    NUM_CLASSES: 80
    POSITIVE_FRACTION: 0.25
    PROPOSAL_APPEND_GT: true
    SCORE_THRESH_TEST: 0.05
  ROI_KEYPOINT_HEAD:
    CONV_DIMS:
    - 512
    - 512
    - 512
    - 512
    - 512
    - 512
    - 512
    - 512
    LOSS_WEIGHT: 1.0
    MIN_KEYPOINTS_PER_IMAGE: 1
    NAME: KRCNNConvDeconvUpsampleHead
    NORMALIZE_LOSS_BY_VISIBLE_KEYPOINTS: true
    NUM_KEYPOINTS: 17
    POOLER_RESOLUTION: 14
    POOLER_SAMPLING_RATIO: 0
    POOLER_TYPE: ROIAlignV2
  ROI_MASK_HEAD:
    CLS_AGNOSTIC_MASK: false
    CONV_DIM: 256
    NAME: MaskRCNNConvUpsampleHead
    NORM: ''
    NUM_CONV: 4
    POOLER_RESOLUTION: 14
    POOLER_SAMPLING_RATIO: 0
    POOLER_TYPE: ROIAlignV2
  RPN:
    BATCH_SIZE_PER_IMAGE: 256
    BBOX_REG_LOSS_TYPE: smooth_l1
    BBOX_REG_LOSS_WEIGHT: 1.0
    BBOX_REG_WEIGHTS: *id002
    BOUNDARY_THRESH: -1
    CONV_DIMS:
    - -1
    HEAD_NAME: StandardRPNHead
    IN_FEATURES:
    - p2
    - p3
    - p4
    - p5
    - p6
    IOU_LABELS:
    - 0
    - -1
    - 1
    IOU_THRESHOLDS:
    - 0.3
    - 0.7
    LOSS_WEIGHT: 1.0
    NMS_THRESH: 0.7
    POSITIVE_FRACTION: 0.5
    POST_NMS_TOPK_TEST: 1000
    POST_NMS_TOPK_TRAIN: 1000
    PRE_NMS_TOPK_TEST: 1000
    PRE_NMS_TOPK_TRAIN: 2000
    SMOOTH_L1_BETA: 0.0
  SEM_SEG_HEAD:
    COMMON_STRIDE: 4
    CONVS_DIM: 128
    IGNORE_VALUE: 255
    IN_FEATURES:
    - p2
    - p3
    - p4
    - p5
    LOSS_WEIGHT: 1.0
    NAME: SemSegFPNHead
    NORM: GN
    NUM_CLASSES: 54
  WEIGHTS: /mnt/output/model_0029999.pth
OUTPUT_DIR: /mnt/output/
SEED: -1
SOLVER:
  AMP:
    ENABLED: false
  BASE_LR: 0.02
  BASE_LR_END: 0.0
  BIAS_LR_FACTOR: 1.0
  CHECKPOINT_PERIOD: 5000
  CLIP_GRADIENTS:
    CLIP_TYPE: value
    CLIP_VALUE: 1.0
    ENABLED: false
    NORM_TYPE: 2.0
  GAMMA: 0.1
  IMS_PER_BATCH: 16
  LR_SCHEDULER_NAME: WarmupMultiStepLR
  MAX_ITER: 270000
  MOMENTUM: 0.9
  NESTEROV: false
  NUM_DECAYS: 3
  REFERENCE_WORLD_SIZE: 0
  RESCALE_INTERVAL: false
  STEPS:
  - 210000
  - 250000
  WARMUP_FACTOR: 0.001
  WARMUP_ITERS: 1000
  WARMUP_METHOD: linear
  WEIGHT_DECAY: 0.0001
  WEIGHT_DECAY_BIAS: null
  WEIGHT_DECAY_NORM: 0.0
TEST:
  AUG:
    ENABLED: false
    FLIP: true
    MAX_SIZE: 4000
    MIN_SIZES:
    - 400
    - 500
    - 600
    - 700
    - 800
    - 900
    - 1000
    - 1100
    - 1200
  DETECTIONS_PER_IMAGE: 100
  EVAL_PERIOD: 0
  EXPECTED_RESULTS: []
  KEYPOINT_OKS_SIGMAS: []
  PRECISE_BN:
    ENABLED: false
    NUM_ITER: 200
VERSION: 2
VIS_PERIOD: 0
[09/06 02:16:30 detectron2]: Full config saved to /mnt/output/config.yaml
[09/06 02:16:30 d2.utils.env]: Using a generated random seed 30790687
[09/06 02:16:32 d2.engine.defaults]: Model:
GeneralizedRCNN((backbone): FPN((fpn_lateral2): Conv2d(256, 256, kernel_size=(1, 1), stride=(1, 1))(fpn_output2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))(fpn_lateral3): Conv2d(512, 256, kernel_size=(1, 1), stride=(1, 1))(fpn_output3): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))(fpn_lateral4): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1))(fpn_output4): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))(fpn_lateral5): Conv2d(2048, 256, kernel_size=(1, 1), stride=(1, 1))(fpn_output5): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))(top_block): LastLevelMaxPool()(bottom_up): ResNet((stem): BasicStem((conv1): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False(norm): FrozenBatchNorm2d(num_features=64, eps=1e-05)))(res2): Sequential((0): BottleneckBlock((shortcut): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False(norm): FrozenBatchNorm2d(num_features=256, eps=1e-05))(conv1): Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1), bias=False(norm): FrozenBatchNorm2d(num_features=64, eps=1e-05))(conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False(norm): FrozenBatchNorm2d(num_features=64, eps=1e-05))(conv3): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False(norm): FrozenBatchNorm2d(num_features=256, eps=1e-05)))(1): BottleneckBlock((conv1): Conv2d(256, 64, kernel_size=(1, 1), stride=(1, 1), bias=False(norm): FrozenBatchNorm2d(num_features=64, eps=1e-05))(conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False(norm): FrozenBatchNorm2d(num_features=64, eps=1e-05))(conv3): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False(norm): FrozenBatchNorm2d(num_features=256, eps=1e-05)))(2): BottleneckBlock((conv1): Conv2d(256, 64, kernel_size=(1, 1), stride=(1, 1), bias=False(norm): FrozenBatchNorm2d(num_features=64, eps=1e-05))(conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False(norm): FrozenBatchNorm2d(num_features=64, eps=1e-05))(conv3): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False(norm): FrozenBatchNorm2d(num_features=256, eps=1e-05))))(res3): Sequential((0): BottleneckBlock((shortcut): Conv2d(256, 512, kernel_size=(1, 1), stride=(2, 2), bias=False(norm): FrozenBatchNorm2d(num_features=512, eps=1e-05))(conv1): Conv2d(256, 128, kernel_size=(1, 1), stride=(2, 2), bias=False(norm): FrozenBatchNorm2d(num_features=128, eps=1e-05))(conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False(norm): FrozenBatchNorm2d(num_features=128, eps=1e-05))(conv3): Conv2d(128, 512, kernel_size=(1, 1), stride=(1, 1), bias=False(norm): FrozenBatchNorm2d(num_features=512, eps=1e-05)))(1): BottleneckBlock((conv1): Conv2d(512, 128, kernel_size=(1, 1), stride=(1, 1), bias=False(norm): FrozenBatchNorm2d(num_features=128, eps=1e-05))(conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False(norm): FrozenBatchNorm2d(num_features=128, eps=1e-05))(conv3): Conv2d(128, 512, kernel_size=(1, 1), stride=(1, 1), bias=False(norm): FrozenBatchNorm2d(num_features=512, eps=1e-05)))(2): BottleneckBlock((conv1): Conv2d(512, 128, kernel_size=(1, 1), stride=(1, 1), bias=False(norm): FrozenBatchNorm2d(num_features=128, eps=1e-05))(conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False(norm): FrozenBatchNorm2d(num_features=128, eps=1e-05))(conv3): Conv2d(128, 512, kernel_size=(1, 1), stride=(1, 1), 
bias=False(norm): FrozenBatchNorm2d(num_features=512, eps=1e-05)))(3): BottleneckBlock((conv1): Conv2d(512, 128, kernel_size=(1, 1), stride=(1, 1), bias=False(norm): FrozenBatchNorm2d(num_features=128, eps=1e-05))(conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False(norm): FrozenBatchNorm2d(num_features=128, eps=1e-05))(conv3): Conv2d(128, 512, kernel_size=(1, 1), stride=(1, 1), bias=False(norm): FrozenBatchNorm2d(num_features=512, eps=1e-05))))(res4): Sequential((0): BottleneckBlock((shortcut): Conv2d(512, 1024, kernel_size=(1, 1), stride=(2, 2), bias=False(norm): FrozenBatchNorm2d(num_features=1024, eps=1e-05))(conv1): Conv2d(512, 256, kernel_size=(1, 1), stride=(2, 2), bias=False(norm): FrozenBatchNorm2d(num_features=256, eps=1e-05))(conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False(norm): FrozenBatchNorm2d(num_features=256, eps=1e-05))(conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False(norm): FrozenBatchNorm2d(num_features=1024, eps=1e-05)))(1): BottleneckBlock((conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False(norm): FrozenBatchNorm2d(num_features=256, eps=1e-05))(conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False(norm): FrozenBatchNorm2d(num_features=256, eps=1e-05))(conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False(norm): FrozenBatchNorm2d(num_features=1024, eps=1e-05)))(2): BottleneckBlock((conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False(norm): FrozenBatchNorm2d(num_features=256, eps=1e-05))(conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False(norm): FrozenBatchNorm2d(num_features=256, eps=1e-05))(conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False(norm): FrozenBatchNorm2d(num_features=1024, eps=1e-05)))(3): BottleneckBlock((conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False(norm): FrozenBatchNorm2d(num_features=256, eps=1e-05))(conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False(norm): FrozenBatchNorm2d(num_features=256, eps=1e-05))(conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False(norm): FrozenBatchNorm2d(num_features=1024, eps=1e-05)))(4): BottleneckBlock((conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False(norm): FrozenBatchNorm2d(num_features=256, eps=1e-05))(conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False(norm): FrozenBatchNorm2d(num_features=256, eps=1e-05))(conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False(norm): FrozenBatchNorm2d(num_features=1024, eps=1e-05)))(5): BottleneckBlock((conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False(norm): FrozenBatchNorm2d(num_features=256, eps=1e-05))(conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False(norm): FrozenBatchNorm2d(num_features=256, eps=1e-05))(conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False(norm): FrozenBatchNorm2d(num_features=1024, eps=1e-05))))(res5): Sequential((0): BottleneckBlock((shortcut): Conv2d(1024, 2048, kernel_size=(1, 1), stride=(2, 2), bias=False(norm): FrozenBatchNorm2d(num_features=2048, eps=1e-05))(conv1): Conv2d(1024, 512, kernel_size=(1, 1), stride=(2, 2), bias=False(norm): FrozenBatchNorm2d(num_features=512, eps=1e-05))(conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False(norm): 
FrozenBatchNorm2d(num_features=512, eps=1e-05))(conv3): Conv2d(512, 2048, kernel_size=(1, 1), stride=(1, 1), bias=False(norm): FrozenBatchNorm2d(num_features=2048, eps=1e-05)))(1): BottleneckBlock((conv1): Conv2d(2048, 512, kernel_size=(1, 1), stride=(1, 1), bias=False(norm): FrozenBatchNorm2d(num_features=512, eps=1e-05))(conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False(norm): FrozenBatchNorm2d(num_features=512, eps=1e-05))(conv3): Conv2d(512, 2048, kernel_size=(1, 1), stride=(1, 1), bias=False(norm): FrozenBatchNorm2d(num_features=2048, eps=1e-05)))(2): BottleneckBlock((conv1): Conv2d(2048, 512, kernel_size=(1, 1), stride=(1, 1), bias=False(norm): FrozenBatchNorm2d(num_features=512, eps=1e-05))(conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False(norm): FrozenBatchNorm2d(num_features=512, eps=1e-05))(conv3): Conv2d(512, 2048, kernel_size=(1, 1), stride=(1, 1), bias=False(norm): FrozenBatchNorm2d(num_features=2048, eps=1e-05))))))(proposal_generator): RPN((rpn_head): StandardRPNHead((conv): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)(activation): ReLU())(objectness_logits): Conv2d(256, 3, kernel_size=(1, 1), stride=(1, 1))(anchor_deltas): Conv2d(256, 12, kernel_size=(1, 1), stride=(1, 1)))(anchor_generator): DefaultAnchorGenerator((cell_anchors): BufferList()))(roi_heads): StandardROIHeads((box_pooler): ROIPooler((level_poolers): ModuleList((0): ROIAlign(output_size=(7, 7), spatial_scale=0.25, sampling_ratio=0, aligned=True)(1): ROIAlign(output_size=(7, 7), spatial_scale=0.125, sampling_ratio=0, aligned=True)(2): ROIAlign(output_size=(7, 7), spatial_scale=0.0625, sampling_ratio=0, aligned=True)(3): ROIAlign(output_size=(7, 7), spatial_scale=0.03125, sampling_ratio=0, aligned=True)))(box_head): FastRCNNConvFCHead((flatten): Flatten(start_dim=1, end_dim=-1)(fc1): Linear(in_features=12544, out_features=1024, bias=True)(fc_relu1): ReLU()(fc2): Linear(in_features=1024, out_features=1024, bias=True)(fc_relu2): ReLU())(box_predictor): FastRCNNOutputLayers((cls_score): Linear(in_features=1024, out_features=81, bias=True)(bbox_pred): Linear(in_features=1024, out_features=320, bias=True)))
)
[09/06 02:16:53 d2.data.datasets.coco]: Loading /mnt/coco/annotations/instances_train2017.json takes 21.55 seconds.
[09/06 02:16:55 d2.data.datasets.coco]: Loaded 118287 images in COCO format from /mnt/coco/annotations/instances_train2017.json
[09/06 02:17:05 d2.data.build]: Removed 1021 images with no usable annotations. 117266 images left.
[09/06 02:17:10 d2.data.build]: Distribution of instances among all 80 categories:
| category | #instances | category | #instances | category | #instances |
|:-------------:|:-------------|:------------:|:-------------|:-------------:|:-------------|
| person | 257253 | bicycle | 7056 | car | 43533 |
| motorcycle | 8654 | airplane | 5129 | bus | 6061 |
| train | 4570 | truck | 9970 | boat | 10576 |
| traffic light | 12842 | fire hydrant | 1865 | stop sign | 1983 |
| parking meter | 1283 | bench | 9820 | bird | 10542 |
| cat | 4766 | dog | 5500 | horse | 6567 |
| sheep | 9223 | cow | 8014 | elephant | 5484 |
| bear | 1294 | zebra | 5269 | giraffe | 5128 |
| backpack | 8714 | umbrella | 11265 | handbag | 12342 |
| tie | 6448 | suitcase | 6112 | frisbee | 2681 |
| skis | 6623 | snowboard | 2681 | sports ball | 6299 |
| kite | 8802 | baseball bat | 3273 | baseball gl.. | 3747 |
| skateboard | 5536 | surfboard | 6095 | tennis racket | 4807 |
| bottle | 24070 | wine glass | 7839 | cup | 20574 |
| fork | 5474 | knife | 7760 | spoon | 6159 |
| bowl | 14323 | banana | 9195 | apple | 5776 |
| sandwich | 4356 | orange | 6302 | broccoli | 7261 |
| carrot | 7758 | hot dog | 2884 | pizza | 5807 |
| donut | 7005 | cake | 6296 | chair | 38073 |
| couch | 5779 | potted plant | 8631 | bed | 4192 |
| dining table | 15695 | toilet | 4149 | tv | 5803 |
| laptop | 4960 | mouse | 2261 | remote | 5700 |
| keyboard | 2854 | cell phone | 6422 | microwave | 1672 |
| oven | 3334 | toaster | 225 | sink | 5609 |
| refrigerator | 2634 | book | 24077 | clock | 6320 |
| vase | 6577 | scissors | 1464 | teddy bear | 4729 |
| hair drier | 198 | toothbrush | 1945 | | |
| total | 849949 | | | | |
[09/06 02:17:10 d2.data.dataset_mapper]: [DatasetMapper] Augmentations used in training: [ResizeShortestEdge(short_edge_length=(640, 672, 704, 736, 768, 800), max_size=1333, sample_style='choice'), RandomFlip()]
[09/06 02:17:10 d2.data.build]: Using training sampler TrainingSampler
[09/06 02:17:11 d2.data.common]: Serializing the dataset using: <class 'detectron2.data.common._TorchSerializedList'>
[09/06 02:17:11 d2.data.common]: Serializing 117266 elements to byte tensors and concatenating them all ...
[09/06 02:17:16 d2.data.common]: Serialized dataset takes 450.77 MiB
[09/06 02:17:16 d2.data.build]: Making batched data loader with batch_size=2
[09/06 02:17:19 d2.checkpoint.detection_checkpoint]: [DetectionCheckpointer] Loading from /mnt/output/model_0029999.pth ...
[09/06 02:17:19 fvcore.common.checkpoint]: [Checkpointer] Loading from /mnt/output/model_0029999.pth ...
[09/06 02:17:19 fvcore.common.checkpoint]: Loading trainer from /mnt/output/model_0029999.pth ...
[09/06 02:17:19 d2.engine.hooks]: Loading scheduler from state_dict ...
[09/06 02:17:20 d2.engine.train_loop]: Starting training from iteration 30000
[09/06 02:17:35 d2.utils.events]: eta: 13:29:58 iter: 30019 total_loss: 0.6783 loss_cls: 0.2512 loss_box_reg: 0.2507 loss_rpn_cls: 0.05119 loss_rpn_loc: 0.08998 time: 0.2031 last_time: 0.2046 data_time: 0.4823 last_data_time: 0.0075 lr: 0.02 max_mem: 2903M
[09/06 02:17:39 d2.utils.events]: eta: 13:29:54 iter: 30039 total_loss: 0.7224 loss_cls: 0.2514 loss_box_reg: 0.2828 loss_rpn_cls: 0.05798 loss_rpn_loc: 0.09668 time: 0.2028 last_time: 0.2075 data_time: 0.0088 last_data_time: 0.0105 lr: 0.02 max_mem: 2903M
[09/06 02:17:43 d2.utils.events]: eta: 13:22:12 iter: 30059 total_loss: 0.6778 loss_cls: 0.2609 loss_box_reg: 0.28 loss_rpn_cls: 0.0531 loss_rpn_loc: 0.08019 time: 0.2015 last_time: 0.1991 data_time: 0.0087 last_data_time: 0.0075 lr: 0.02 max_mem: 2903M
[09/06 02:17:47 d2.utils.events]: eta: 13:20:22 iter: 30079 total_loss: 0.6472 loss_cls: 0.2415 loss_box_reg: 0.2495 loss_rpn_cls: 0.04993 loss_rpn_loc: 0.09604 time: 0.2003 last_time: 0.1824 data_time: 0.0092 last_data_time: 0.0114 lr: 0.02 max_mem: 2903M
[09/06 02:17:51 d2.utils.events]: eta: 13:20:18 iter: 30099 total_loss: 0.6032 loss_cls: 0.2444 loss_box_reg: 0.2445 loss_rpn_cls: 0.05288 loss_rpn_loc: 0.0732 time: 0.2008 last_time: 0.2081 data_time: 0.0090 last_data_time: 0.0170 lr: 0.02 max_mem: 2904M
[09/06 02:17:55 d2.utils.events]: eta: 13:20:14 iter: 30119 total_loss: 0.5806 loss_cls: 0.2233 loss_box_reg: 0.2357 loss_rpn_cls: 0.04176 loss_rpn_loc: 0.07658 time: 0.2006 last_time: 0.2017 data_time: 0.0080 last_data_time: 0.0070 lr: 0.02 max_mem: 2904M
[09/06 15:45:25 d2.utils.events]: eta: 0:01:19 iter: 269599 total_loss: 0.4819 loss_cls: 0.1778 loss_box_reg: 0.221 loss_rpn_cls: 0.02964 loss_rpn_loc: 0.05748 time: 0.1988 last_time: 0.2063 data_time: 0.0083 last_data_time: 0.0076 lr: 0.0002 max_mem: 2904M
[09/06 15:45:29 d2.utils.events]: eta: 0:01:15 iter: 269619 total_loss: 0.4636 loss_cls: 0.1539 loss_box_reg: 0.2098 loss_rpn_cls: 0.02256 loss_rpn_loc: 0.06406 time: 0.1988 last_time: 0.2030 data_time: 0.0086 last_data_time: 0.0077 lr: 0.0002 max_mem: 2904M
[09/06 15:45:33 d2.utils.events]: eta: 0:01:11 iter: 269639 total_loss: 0.5086 loss_cls: 0.1783 loss_box_reg: 0.2321 loss_rpn_cls: 0.02421 loss_rpn_loc: 0.06799 time: 0.1988 last_time: 0.1933 data_time: 0.0089 last_data_time: 0.0124 lr: 0.0002 max_mem: 2904M
[09/06 15:45:37 d2.utils.events]: eta: 0:01:07 iter: 269659 total_loss: 0.4706 loss_cls: 0.1592 loss_box_reg: 0.2124 loss_rpn_cls: 0.02371 loss_rpn_loc: 0.06897 time: 0.1988 last_time: 0.2003 data_time: 0.0083 last_data_time: 0.0089 lr: 0.0002 max_mem: 2904M
[09/06 15:45:41 d2.utils.events]: eta: 0:01:03 iter: 269679 total_loss: 0.4713 loss_cls: 0.1709 loss_box_reg: 0.2129 loss_rpn_cls: 0.02803 loss_rpn_loc: 0.06166 time: 0.1988 last_time: 0.1977 data_time: 0.0081 last_data_time: 0.0068 lr: 0.0002 max_mem: 2904M
[09/06 15:45:45 d2.utils.events]: eta: 0:00:59 iter: 269699 total_loss: 0.4516 loss_cls: 0.1572 loss_box_reg: 0.2083 loss_rpn_cls: 0.02273 loss_rpn_loc: 0.06108 time: 0.1988 last_time: 0.1891 data_time: 0.0086 last_data_time: 0.0120 lr: 0.0002 max_mem: 2904M
[09/06 15:45:49 d2.utils.events]: eta: 0:00:55 iter: 269719 total_loss: 0.4771 loss_cls: 0.1766 loss_box_reg: 0.2144 loss_rpn_cls: 0.02578 loss_rpn_loc: 0.06036 time: 0.1988 last_time: 0.1882 data_time: 0.0081 last_data_time: 0.0060 lr: 0.0002 max_mem: 2904M
[09/06 15:45:53 d2.utils.events]: eta: 0:00:51 iter: 269739 total_loss: 0.4586 loss_cls: 0.168 loss_box_reg: 0.2119 loss_rpn_cls: 0.02173 loss_rpn_loc: 0.05767 time: 0.1988 last_time: 0.1921 data_time: 0.0091 last_data_time: 0.0063 lr: 0.0002 max_mem: 2904M
[09/06 15:45:57 d2.utils.events]: eta: 0:00:47 iter: 269759 total_loss: 0.442 loss_cls: 0.1605 loss_box_reg: 0.2144 loss_rpn_cls: 0.02168 loss_rpn_loc: 0.05123 time: 0.1988 last_time: 0.1978 data_time: 0.0084 last_data_time: 0.0089 lr: 0.0002 max_mem: 2904M
[09/06 15:46:01 d2.utils.events]: eta: 0:00:43 iter: 269779 total_loss: 0.4803 loss_cls: 0.1671 loss_box_reg: 0.2056 loss_rpn_cls: 0.02325 loss_rpn_loc: 0.06247 time: 0.1988 last_time: 0.1842 data_time: 0.0091 last_data_time: 0.0063 lr: 0.0002 max_mem: 2904M
[09/06 15:46:05 d2.utils.events]: eta: 0:00:39 iter: 269799 total_loss: 0.4994 loss_cls: 0.181 loss_box_reg: 0.2173 loss_rpn_cls: 0.02877 loss_rpn_loc: 0.06976 time: 0.1988 last_time: 0.2037 data_time: 0.0082 last_data_time: 0.0078 lr: 0.0002 max_mem: 2904M
[09/06 15:46:09 d2.utils.events]: eta: 0:00:35 iter: 269819 total_loss: 0.4605 loss_cls: 0.162 loss_box_reg: 0.2145 loss_rpn_cls: 0.02834 loss_rpn_loc: 0.06403 time: 0.1988 last_time: 0.2047 data_time: 0.0078 last_data_time: 0.0067 lr: 0.0002 max_mem: 2904M
[09/06 15:46:13 d2.utils.events]: eta: 0:00:31 iter: 269839 total_loss: 0.5042 loss_cls: 0.1746 loss_box_reg: 0.2268 loss_rpn_cls: 0.02664 loss_rpn_loc: 0.05538 time: 0.1988 last_time: 0.2013 data_time: 0.0087 last_data_time: 0.0077 lr: 0.0002 max_mem: 2904M
[09/06 15:46:17 d2.utils.events]: eta: 0:00:27 iter: 269859 total_loss: 0.4772 loss_cls: 0.1592 loss_box_reg: 0.2132 loss_rpn_cls: 0.02413 loss_rpn_loc: 0.05851 time: 0.1988 last_time: 0.1826 data_time: 0.0074 last_data_time: 0.0107 lr: 0.0002 max_mem: 2904M
[09/06 15:46:21 d2.utils.events]: eta: 0:00:23 iter: 269879 total_loss: 0.4978 loss_cls: 0.1759 loss_box_reg: 0.2295 loss_rpn_cls: 0.02774 loss_rpn_loc: 0.07485 time: 0.1988 last_time: 0.2152 data_time: 0.0080 last_data_time: 0.0072 lr: 0.0002 max_mem: 2904M
[09/06 15:46:26 d2.utils.events]: eta: 0:00:19 iter: 269899 total_loss: 0.4582 loss_cls: 0.157 loss_box_reg: 0.2078 loss_rpn_cls: 0.02078 loss_rpn_loc: 0.05431 time: 0.1988 last_time: 0.2094 data_time: 0.0076 last_data_time: 0.0074 lr: 0.0002 max_mem: 2904M
[09/06 15:46:30 d2.utils.events]: eta: 0:00:15 iter: 269919 total_loss: 0.477 loss_cls: 0.1648 loss_box_reg: 0.2149 loss_rpn_cls: 0.02556 loss_rpn_loc: 0.06299 time: 0.1988 last_time: 0.1939 data_time: 0.0075 last_data_time: 0.0061 lr: 0.0002 max_mem: 2904M
[09/06 15:46:34 d2.utils.events]: eta: 0:00:11 iter: 269939 total_loss: 0.4678 loss_cls: 0.1682 loss_box_reg: 0.2207 loss_rpn_cls: 0.02335 loss_rpn_loc: 0.06278 time: 0.1988 last_time: 0.1984 data_time: 0.0086 last_data_time: 0.0074 lr: 0.0002 max_mem: 2904M
[09/06 15:46:38 d2.utils.events]: eta: 0:00:07 iter: 269959 total_loss: 0.4705 loss_cls: 0.1607 loss_box_reg: 0.2123 loss_rpn_cls: 0.02339 loss_rpn_loc: 0.06207 time: 0.1988 last_time: 0.1914 data_time: 0.0090 last_data_time: 0.0083 lr: 0.0002 max_mem: 2904M
[09/06 15:46:42 d2.utils.events]: eta: 0:00:03 iter: 269979 total_loss: 0.4843 loss_cls: 0.168 loss_box_reg: 0.2255 loss_rpn_cls: 0.0248 loss_rpn_loc: 0.07147 time: 0.1988 last_time: 0.2150 data_time: 0.0081 last_data_time: 0.0128 lr: 0.0002 max_mem: 2904M
[09/06 15:46:46 fvcore.common.checkpoint]: Saving checkpoint to /mnt/output/model_0269999.pth
[09/06 15:46:47 fvcore.common.checkpoint]: Saving checkpoint to /mnt/output/model_final.pth
[09/06 15:46:48 d2.utils.events]: eta: 0:00:00 iter: 269999 total_loss: 0.4217 loss_cls: 0.1577 loss_box_reg: 0.191 loss_rpn_cls: 0.02127 loss_rpn_loc: 0.0584 time: 0.1988 last_time: 0.2100 data_time: 0.0084 last_data_time: 0.0062 lr: 0.0002 max_mem: 2904M
[09/06 15:46:48 d2.engine.hooks]: Overall training speed: 239998 iterations in 13:15:06 (0.1988 s / it)
[09/06 15:46:48 d2.engine.hooks]: Total training time: 13:29:17 (0:14:10 on hooks)
/usr/local/lib/python3.8/dist-packages/torch/functional.py:504: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at /opt/pytorch/pytorch/aten/src/ATen/native/TensorShape.cpp:3435.)
  return _VF.meshgrid(tensors, **kwargs)  # type: ignore[attr-defined]
[09/06 15:46:49 d2.data.datasets.coco]: Loaded 5000 images in COCO format from /mnt/coco/annotations/instances_val2017.json
[09/06 15:46:49 d2.data.build]: Distribution of instances among all 80 categories:
| category | #instances | category | #instances | category | #instances |
|:-------------:|:-------------|:------------:|:-------------|:-------------:|:-------------|
| person | 10777 | bicycle | 314 | car | 1918 |
| motorcycle | 367 | airplane | 143 | bus | 283 |
| train | 190 | truck | 414 | boat | 424 |
| traffic light | 634 | fire hydrant | 101 | stop sign | 75 |
| parking meter | 60 | bench | 411 | bird | 427 |
| cat | 202 | dog | 218 | horse | 272 |
| sheep | 354 | cow | 372 | elephant | 252 |
| bear | 71 | zebra | 266 | giraffe | 232 |
| backpack | 371 | umbrella | 407 | handbag | 540 |
| tie | 252 | suitcase | 299 | frisbee | 115 |
| skis | 241 | snowboard | 69 | sports ball | 260 |
| kite | 327 | baseball bat | 145 | baseball gl.. | 148 |
| skateboard | 179 | surfboard | 267 | tennis racket | 225 |
| bottle | 1013 | wine glass | 341 | cup | 895 |
| fork | 215 | knife | 325 | spoon | 253 |
| bowl | 623 | banana | 370 | apple | 236 |
| sandwich | 177 | orange | 285 | broccoli | 312 |
| carrot | 365 | hot dog | 125 | pizza | 284 |
| donut | 328 | cake | 310 | chair | 1771 |
| couch | 261 | potted plant | 342 | bed | 163 |
| dining table | 695 | toilet | 179 | tv | 288 |
| laptop | 231 | mouse | 106 | remote | 283 |
| keyboard | 153 | cell phone | 262 | microwave | 55 |
| oven | 143 | toaster | 9 | sink | 225 |
| refrigerator | 126 | book | 1129 | clock | 267 |
| vase | 274 | scissors | 36 | teddy bear | 190 |
| hair drier | 11 | toothbrush | 57 | | |
| total | 36335 | | | | |
[09/06 15:46:49 d2.data.dataset_mapper]: [DatasetMapper] Augmentations used in inference: [ResizeShortestEdge(short_edge_length=(800, 800), max_size=1333, sample_style='choice')]
[09/06 15:46:49 d2.data.common]: Serializing the dataset using: <class 'detectron2.data.common._TorchSerializedList'>
[09/06 15:46:49 d2.data.common]: Serializing 5000 elements to byte tensors and concatenating them all ...
[09/06 15:46:50 d2.data.common]: Serialized dataset takes 19.08 MiB
[09/06 15:46:50 d2.evaluation.evaluator]: Start inference on 625 batches
/usr/local/lib/python3.8/dist-packages/torch/functional.py:504: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at /opt/pytorch/pytorch/aten/src/ATen/native/TensorShape.cpp:3435.)
  return _VF.meshgrid(tensors, **kwargs)  # type: ignore[attr-defined]
[09/06 15:46:54 d2.evaluation.evaluator]: Inference done 11/625. Dataloading: 0.0013 s/iter. Inference: 0.0479 s/iter. Eval: 0.0003 s/iter. Total: 0.0495 s/iter. ETA=0:00:30
[09/06 15:46:59 d2.evaluation.evaluator]: Inference done 121/625. Dataloading: 0.0018 s/iter. Inference: 0.0437 s/iter. Eval: 0.0003 s/iter. Total: 0.0459 s/iter. ETA=0:00:23
[09/06 15:47:04 d2.evaluation.evaluator]: Inference done 218/625. Dataloading: 0.0018 s/iter. Inference: 0.0463 s/iter. Eval: 0.0004 s/iter. Total: 0.0486 s/iter. ETA=0:00:19
[09/06 15:47:09 d2.evaluation.evaluator]: Inference done 327/625. Dataloading: 0.0019 s/iter. Inference: 0.0454 s/iter. Eval: 0.0004 s/iter. Total: 0.0477 s/iter. ETA=0:00:14
[09/06 15:47:14 d2.evaluation.evaluator]: Inference done 439/625. Dataloading: 0.0019 s/iter. Inference: 0.0447 s/iter. Eval: 0.0004 s/iter. Total: 0.0470 s/iter. ETA=0:00:08
[09/06 15:47:19 d2.evaluation.evaluator]: Inference done 548/625. Dataloading: 0.0018 s/iter. Inference: 0.0446 s/iter. Eval: 0.0004 s/iter. Total: 0.0468 s/iter. ETA=0:00:03
[09/06 15:47:23 d2.evaluation.evaluator]: Total inference time: 0:00:29.375996 (0.047381 s / iter per device, on 8 devices)
[09/06 15:47:23 d2.evaluation.evaluator]: Total inference pure compute time: 0:00:27 (0.044494 s / iter per device, on 8 devices)
[09/06 15:47:25 d2.evaluation.coco_evaluation]: Preparing results for COCO format ...
[09/06 15:47:25 d2.evaluation.coco_evaluation]: Saving results to /mnt/output/inference/coco_instances_results.json
[09/06 15:47:26 d2.evaluation.coco_evaluation]: Evaluating predictions with unofficial COCO API...
Loading and preparing results...
DONE (t=0.87s)
creating index...
index created!
[09/06 15:47:27 d2.evaluation.fast_eval_api]: Evaluate annotation type *bbox*
[09/06 15:47:38 d2.evaluation.fast_eval_api]: COCOeval_opt.evaluate() finished in 11.66 seconds.
[09/06 15:47:39 d2.evaluation.fast_eval_api]: Accumulating evaluation results...
[09/06 15:47:40 d2.evaluation.fast_eval_api]: COCOeval_opt.accumulate() finished in 1.11 seconds.
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.401
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.608
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.435
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.238
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.434
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.521
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.326
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.512
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.537
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.350
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.573
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.675
[09/06 15:47:40 d2.evaluation.coco_evaluation]: Evaluation results for bbox:
| AP | AP50 | AP75 | APs | APm | APl |
|:------:|:------:|:------:|:------:|:------:|:------:|
| 40.064 | 60.844 | 43.457 | 23.807 | 43.418 | 52.071 |
[09/06 15:47:40 d2.evaluation.coco_evaluation]: Per-category bbox AP:
| category | AP | category | AP | category | AP |
|:--------------|:-------|:-------------|:-------|:---------------|:-------|
| person | 54.495 | bicycle | 30.683 | car | 44.135 |
| motorcycle | 42.270 | airplane | 63.287 | bus | 63.137 |
| train | 60.485 | truck | 33.669 | boat | 26.893 |
| traffic light | 27.209 | fire hydrant | 66.158 | stop sign | 65.836 |
| parking meter | 43.886 | bench | 24.038 | bird | 36.551 |
| cat | 62.168 | dog | 58.581 | horse | 56.822 |
| sheep | 49.903 | cow | 53.541 | elephant | 59.534 |
| bear | 68.135 | zebra | 65.108 | giraffe | 64.143 |
| backpack | 15.814 | umbrella | 37.993 | handbag | 14.656 |
| tie | 32.060 | suitcase | 37.098 | frisbee | 63.240 |
| skis | 22.691 | snowboard | 32.603 | sports ball | 46.379 |
| kite | 41.527 | baseball bat | 27.225 | baseball glove | 34.695 |
| skateboard | 49.108 | surfboard | 35.116 | tennis racket | 47.308 |
| bottle | 38.765 | wine glass | 35.143 | cup | 40.831 |
| fork | 34.533 | knife | 17.379 | spoon | 15.792 |
| bowl | 40.787 | banana | 23.224 | apple | 19.382 |
| sandwich | 31.478 | orange | 29.419 | broccoli | 21.541 |
| carrot | 21.991 | hot dog | 30.782 | pizza | 50.570 |
| donut | 42.976 | cake | 34.088 | chair | 26.252 |
| couch | 39.110 | potted plant | 26.181 | bed | 37.305 |
| dining table | 26.871 | toilet | 58.641 | tv | 54.275 |
| laptop | 57.916 | mouse | 62.390 | remote | 30.654 |
| keyboard | 52.291 | cell phone | 33.529 | microwave | 52.223 |
| oven | 31.390 | toaster | 44.195 | sink | 37.394 |
| refrigerator | 52.529 | book | 15.838 | clock | 47.557 |
| vase | 37.813 | scissors | 23.726 | teddy bear | 43.291 |
| hair drier | 4.950 | toothbrush | 21.977 | | |
[09/06 15:47:41 d2.engine.defaults]: Evaluation results for coco_2017_val in csv format:
[09/06 15:47:41 d2.evaluation.testing]: copypaste: Task: bbox
[09/06 15:47:41 d2.evaluation.testing]: copypaste: AP,AP50,AP75,APs,APm,APl
[09/06 15:47:41 d2.evaluation.testing]: copypaste: 40.0645,60.8442,43.4570,23.8066,43.4178,52.0706
The training log records the run in detail: environment info, the config contents, the model definition, dataset loading, the training loop itself, and the final evaluation. The key excerpts and what they show:
Environment info
[09/06 02:16:30 detectron2]: Environment info:
------------------------------- --------------------------------------------------------------
sys.platform linux
Python 3.8.10 (default, Nov 14 2022, 12:59:47) [GCC 9.4.0]
numpy 1.22.2
detectron2 0.6 @/root/detectron2/detectron2
Compiler GCC 9.4
CUDA compiler CUDA 12.0
...
------------------------------- --------------------------------------------------------------
Config file contents
[09/06 02:16:30 detectron2]: Contents of args.config_file=../configs/COCO-Detection/faster_rcnn_R_50_FPN_3x.yaml:
_BASE_: "../Base-RCNN-FPN.yaml"
MODEL:
  WEIGHTS: "detectron2://ImageNetPretrained/MSRA/R-50.pkl"
  MASK_ON: False
  RESNETS:
    DEPTH: 50
SOLVER:
  STEPS: (210000, 250000)
  MAX_ITER: 270000
Model definition
[09/06 02:16:32 d2.engine.defaults]: Model: GeneralizedRCNN(...)
Dataset loading
[09/06 02:16:53 d2.data.datasets.coco]: Loading /mnt/coco/annotations/instances_train2017.json takes 21.55 seconds.
[09/06 02:16:55 d2.data.datasets.coco]: Loaded 118287 images in COCO format from /mnt/coco/annotations/instances_train2017.json
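When DETECTRON2_DATASETS is set, the builtin COCO splits used in this log (coco_2017_train, coco_2017_val) are registered automatically and can be inspected from Python. A minimal sketch, assuming detectron2 is installed and the variable points at /mnt/ as above:

```python
import os
os.environ.setdefault("DETECTRON2_DATASETS", "/mnt/")

from detectron2.data import DatasetCatalog, MetadataCatalog

# Builtin datasets are registered on import; this loads the same dicts the
# trainer uses (118287 train / 5000 val images per the log above).
dicts = DatasetCatalog.get("coco_2017_val")
meta = MetadataCatalog.get("coco_2017_val")
print(len(dicts), "images; first classes:", meta.thing_classes[:5])
```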
Training loop
[09/06 02:17:20 d2.engine.train_loop]: Starting training from iteration 30000
...
[09/06 15:46:48 d2.engine.hooks]: Overall training speed: 239998 iterations in 13:15:06 (0.1988 s / it)
Inference time
[09/06 15:47:23 d2.evaluation.evaluator]: Total inference time: 0:00:29.375996 (0.047381 s / iter per device, on 8 devices)
[09/06 15:47:23 d2.evaluation.evaluator]: Total inference pure compute time: 0:00:27 (0.044494 s / iter per device, on 8 devices)
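For scale: 0.047381 s per image per device across 8 devices is roughly 8 / 0.047381 ≈ 169 images per second in aggregate, so the 5000 val2017 images take about 30 seconds, consistent with the total inference time above.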
Average precision (AP)
Note that these tables report COCO AP scaled by 100, so the 0.401 in the raw COCOeval summary above appears here as 40.064 (the table keeps more decimal places).
[09/06 15:47:40 d2.evaluation.coco_evaluation]: Evaluation results for bbox:
| AP | AP50 | AP75 | APs | APm | APl |
|:------:|:------:|:------:|:------:|:------:|:------:|
| 40.064 | 60.844 | 43.457 | 23.807 | 43.418 | 52.071 |
Per-category AP
[09/06 15:47:40 d2.evaluation.coco_evaluation]: Per-category bbox AP:
| category | AP | category | AP | category | AP |
|:--------------|:-------|:-------------|:-------|:---------------|:-------|
| person | 54.495 | bicycle | 30.683 | car | 44.135 |
| motorcycle | 42.270 | airplane | 63.287 | bus | 63.137 |
| train | 60.485 | truck | 33.669 | boat | 26.893 |
...
| hair drier | 4.950 | toothbrush | 21.977 | | |
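Beyond reading the log by eye, the loss curve can be pulled out of train.log with a few lines of Python. A minimal sketch (the regex is an assumption about the exact line layout shown in the excerpts above):

```python
import re

# Extract (iteration, total_loss) pairs from the d2.utils.events log lines.
pattern = re.compile(r"iter: (\d+).*?total_loss: ([\d.]+)")

points = []
with open("train.log") as f:
    for line in f:
        m = pattern.search(line)
        if m:
            points.append((int(m.group(1)), float(m.group(2))))

print("first logged:", points[0])   # e.g. (30019, 0.6783) for the resumed run
print("last logged:", points[-1])   # e.g. (269999, 0.4217)
```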
4. Comparing Training Metrics
Metrics from the Model Zoo
| Name | lr sched | train time (s/iter) | inference time (s/im) | train mem (GB) | box AP | model id |
|:---|:---:|:---:|:---:|:---:|:---:|:---|
| R50-C4 | 1x | 0.551 | 0.102 | 4.8 | 35.7 | 137257644 |
| R50-DC5 | 1x | 0.380 | 0.068 | 5.0 | 37.3 | 137847829 |
| R50-FPN | 1x | 0.210 | 0.038 | 3.0 | 37.9 | 137257794 |
| R50-C4 | 3x | 0.543 | 0.104 | 4.8 | 38.4 | 137849393 |
| R50-DC5 | 3x | 0.378 | 0.070 | 5.0 | 39.0 | 137849425 |
| R50-FPN | 3x | 0.209 | 0.038 | 3.0 | 40.2 | 137849458 |
| R101-C4 | 3x | 0.619 | 0.139 | 5.9 | 41.1 | 138204752 |
| R101-DC5 | 3x | 0.452 | 0.086 | 6.1 | 40.6 | 138204841 |
| R101-FPN | 3x | 0.286 | 0.051 | 4.1 | 42.0 | 137851257 |
| X101-FPN | 3x | 0.638 | 0.098 | 6.7 | 43.0 | 139173657 |
Metrics of our reproduced model
- Train time (s/iter): 0.1988
- Inference time (s/im): 0.047381
- Train memory (GB): peak about 2.9 GB
- Detection accuracy (box AP): 40.0645
Comparison and analysis
1. Train time (s/iter): the Model Zoo R50-FPN 3x model trains at 0.209 s/iter, versus 0.1988 s/iter for our run. Ours is slightly faster, likely due to differences in hardware or the software stack.
2. Inference time (s/im): 0.038 s/im for the Model Zoo model versus 0.047381 s/im for ours, somewhat slower.
3. Train memory (GB): 3.0 GB for the Model Zoo model versus about 2.9 GB for ours, essentially the same.
4. Detection accuracy (box AP): 40.2 for the Model Zoo model versus 40.0645 for ours, essentially on par and only marginally lower.
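As a quick sanity check on the timing figures: the log reports 239998 iterations in 13:15:06, and 239998 × 0.1988 s/it ≈ 47,700 s ≈ 13.25 h, which agrees with the logged wall-clock time.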
Summary of results
Our reproduced model is very close to the Model Zoo R50-FPN 3x model on every metric: slightly faster training, comparable memory use, and marginally lower AP. Overall the reproduction succeeded, confirming that the training procedure is reliable and the model effective.
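To re-verify the final checkpoint at any time without further training, the same script can be run in evaluation-only mode (a sketch using train_net.py's standard --eval-only flag; adjust paths as needed):
./train_net.py --config-file ../configs/COCO-Detection/faster_rcnn_R_50_FPN_3x.yaml --num-gpus 8 --eval-only OUTPUT_DIR /mnt/output/ MODEL.WEIGHTS /mnt/output/model_final.pth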
5. The Training Output Directory
The output directory collects everything the run produces: the config, event logs, training logs, evaluation metrics, and a series of model checkpoints.
Directory structure
/mnt/output
├── config.yaml
├── events.out.tfevents.*
├── inference/
├── last_checkpoint
├── log.txt
├── log.txt.rank*
├── metrics.json
├── model_*.pth
└── model_final.pth
File descriptions
- config.yaml: the full training configuration, saved so the experiment can be reproduced.
- events.out.tfevents.*: event files for TensorBoard visualization, useful for monitoring training.
- inference/: inference results (e.g. coco_instances_results.json from the final evaluation).
- last_checkpoint: records the name of the most recent checkpoint so training can be resumed.
- log.txt and log.txt.rank*: detailed training logs (one per process rank), useful for debugging and analysis.
- metrics.json: the metrics computed during training and validation, convenient for analysis and comparison.
- model_*.pth and model_final.pth: model weights, used to resume training, evaluate, or deploy.
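The event files can be viewed with tensorboard --logdir /mnt/output, and metrics.json is easy to inspect directly. A minimal sketch, assuming the JSON-lines format that detectron2's JSON writer produces (one object per logging event, with keys matching the log excerpts above):

```python
import json

records = []
with open("/mnt/output/metrics.json") as f:
    for line in f:
        if line.strip():
            records.append(json.loads(line))

# Keep only events that logged a training loss, then look at the endpoints.
losses = [(r["iteration"], r["total_loss"]) for r in records if "total_loss" in r]
print("first:", losses[0], "last:", losses[-1])
```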
Together, these files form a complete record of the training run and support later analysis, debugging, and deployment.
Why reproduce?
Reproducing a result not only validates the reliability of the original numbers, it also builds a deeper understanding of the training and evaluation pipeline. Comparing metrics across models then makes it easier to pick the right architecture and training strategy for a given performance and efficiency target.
Conclusion
This article walked through training an object detection model with Detectron2: data preparation, the training commands, analysis of the training log, the training metrics, and the files in the output directory and their roles. By reproducing a Model Zoo training run, including resuming with the resume feature after an interruption, we verified that the results are reliable and gained a concrete feel for the model's performance metrics. We hope it serves as a useful reference for your own model training and evaluation work.