Repository: Kedreamix/YoloGesture Branch: main Commit: f4e9ddb54510 Files: 77 Total size: 475.9 KB Directory structure: gitextract_n_8zvszz/ ├── .devcontainer/ │ └── devcontainer.json ├── .gitignore ├── 2007_train.txt ├── 2007_val.txt ├── Pipfile ├── README.md ├── VOCdevkit/ │ └── VOC2007/ │ ├── Annotations/ │ │ ├── 1.xml │ │ ├── 2.xml │ │ ├── 3.xml │ │ ├── 4.xml │ │ ├── 5.xml │ │ └── README.md │ └── ImageSets/ │ └── Main/ │ ├── README.md │ ├── test.txt │ ├── train.txt │ ├── trainval.txt │ └── val.txt ├── YOLOv4-study学习资料md ├── detect.py ├── gen_annotation.py ├── gesture.streamlit.py ├── get_map.py ├── get_yaml.py ├── instructions.md ├── kmeans_for_anchors.py ├── logs/ │ ├── README.md │ ├── gesture_loss_2021_11_14_22_04_00/ │ │ ├── epoch_loss_2021_11_14_22_04_00.txt │ │ └── epoch_val_loss_2021_11_14_22_04_00.txt │ ├── loss_2022_04_27_08_48_16/ │ │ ├── epoch_loss.txt │ │ ├── epoch_val_loss.txt │ │ └── events.out.tfevents.1651049298.fef10e9dbba1.425.0 │ ├── loss_2022_04_27_10_38_48/ │ │ ├── epoch_loss.txt │ │ ├── epoch_val_loss.txt │ │ └── events.out.tfevents.1651055931.9b45dd4991ae.367.0 │ ├── loss_2022_04_27_12_50_47/ │ │ ├── epoch_loss.txt │ │ ├── epoch_val_loss.txt │ │ └── events.out.tfevents.1651063849.274e119c63fb.1015.0 │ ├── loss_2022_04_28_00_40_54/ │ │ ├── epoch_loss.txt │ │ ├── epoch_val_loss.txt │ │ └── events.out.tfevents.1651106457.117e69507361.564.0 │ ├── loss_2022_04_28_14_54_17/ │ │ ├── epoch_loss.txt │ │ ├── epoch_val_loss.txt │ │ └── events.out.tfevents.1651128857.LAPTOP-IE5MVR15.24536.0 │ └── loss_2022_05_02_14_57_57/ │ ├── epoch_loss.txt │ ├── epoch_val_loss.txt │ └── events.out.tfevents.1651503480.437fb01f4bb0.370.0 ├── model_data/ │ ├── .gitattributes │ ├── gesture.yaml │ ├── gesture_classes.txt │ ├── yolo_anchors.txt │ └── yolotiny_anchors.txt ├── nets/ │ ├── CSPdarknet.py │ ├── CSPdarknet53_tiny.py │ ├── __init__.py │ ├── attention.py │ ├── yolo.py │ ├── yolo_tiny.py │ ├── yolo_training.py │ └── yolotiny_training.py ├── packages.txt ├── 
predict.py ├── requirements.txt ├── summary.py ├── train.py ├── utils/ │ ├── __init__.py │ ├── callbacks.py │ ├── dataloader.py │ ├── utils.py │ ├── utils_bbox.py │ ├── utils_fit.py │ └── utils_map.py ├── utils_coco/ │ ├── coco_annotation.py │ └── get_map_coco.py ├── voc_annotation.py ├── yolo.py ├── yolo_anchors.txt └── yolov4-gesture-tutorial.ipynb ================================================ FILE CONTENTS ================================================ ================================================ FILE: .devcontainer/devcontainer.json ================================================ { "name": "Python 3", // Or use a Dockerfile or Docker Compose file. More info: https://containers.dev/guide/dockerfile "image": "mcr.microsoft.com/devcontainers/python:1-3.11-bullseye", "customizations": { "codespaces": { "openFiles": [ "README.md", "gesture_streamlit.py" ] }, "vscode": { "settings": {}, "extensions": [ "ms-python.python", "ms-python.vscode-pylance" ] } }, "updateContentCommand": "[ -f packages.txt ] && sudo apt update && sudo apt upgrade -y && sudo xargs apt install -y 1600 - [x] 利用Mosaic数据增强,但是结果不好,之后训练不会采用,除非数据足够多 - [x] 增加yaml文件,利用yaml配置所有参数 - [x] 提高图片的输入shape,从256x256 -> 416x416 - [x] 由于结果不理想,使用部分自制数据集替换,数据集总数不变 - [x] 添加yolov4 tiny 轻量化模型 - [x] 增加注意力机制,可以比轻量化模型得到更不错的结果 ## 2. 部分尝试结果 - [x] 使用Mosaic 结果较差 - [x] 在运行过程中结果十分差,原因是数据集标注出现错误,会重新修改数据集 - [x] 用SGD的结果没有Adam好 - [x] front的数据集需要重新修改才能得到更好的结果 - [x] 使用tiny模型速度更快,结果虽然差一点,但是只是一个速度与精度的trade off ## 3. 项目整体框架 1、 了解项目研究的背景以及其意义,学习其中的创新点和科研价值。 2、 使用python语言对项目中的代码进行编写。研究项目源代码,理解项目工程的代码结构、原理及其功能。 3、 学习深度学习算法。理解卷积神经网络的相关概念,包括神经元系统、局部感受野、权值共享和卷积神经网络总体结构;了解目前常见的目标检测方法和YOLOv4算法框架,以及基于YOLOv4的手势识别算法。 4、 设计并制作针对本项目手势控制数据集,并使用数据增广的方式对数据集进行扩充,同时使用图像处理的方法包括中值滤波、阈值分割等对数据进行预处理。 5、 训练模型,对目标检测性能进行测试。了解实验环境以及评价标准,测试本项目研究的手势识别算法的实验结果,然后通过采用控制变量方法对手势识别算法进行多组实验,以评估其在不同环境下的识别效果,使用验证集对手势识别算法的精度和速度进行性能测试。 6、 总结本项目的研究工作,对基于无人机的手势识别演剧提供创新点与发展建议。 ### 3.1. 数据集构建 1. 
设计并制作针对本项目手势控制数据集,对数据集进行分类。 ![在这里插入图片描述](https://img-blog.csdnimg.cn/5b3c7cc2c58c404987d54d9a2f5bb68d.png) 2. 使用Labelimg标注工具设计针对本项目的手势数据集,对数据集进行标注。 ![在这里插入图片描述](https://img-blog.csdnimg.cn/3312206223674362b64026836184e3e4.png) ### 3.2. 模型选择 在前期的模型选择中,简单的选择了YOLOv4的模型进行训练和测试 **YOLOv4 = CSPDarknet53(主干) + SPP** **附加模块(颈** **) +** **PANet** **路径聚合(颈** **) + YOLOv3(头部)** ![img](https://pdf.cdn.readpaper.com/parsed/fetch_target/699143cdb334ecfc63caf8192472490c_0_Figure_1.png) ### 3.3. 代码实现 - [x] 主干特征提取网络:DarkNet53 => CSPDarkNet53 - [x] 特征金字塔:SPP,PAN - [x] 训练用到的小技巧:Mosaic数据增强、Label Smoothing平滑、CIOU、学习率余弦退火衰减 - [x] 激活函数:使用Mish激活函数 - [x] 增加yaml配置文件,只需要修改配置文件即可 - [x] 添加detect.py,利用此进行半自动标注,可以方便标注其他类似于对应👋的数据集 - [x] 修改成命令行运行的快速模式,很方便,快速运行和理解 - [x] 利用streamlit部署到服务器上,可以随时使用,在线demo [https://kedreamix-yologesture.streamlit.app/](https://kedreamix-yologesture.streamlit.app/) - [ ] ...... ## 4. 实验结果详情 | 训练数据集 | 权值文件名称 | 迭代次数 | Batch-size | 图片shape | 平均准确率 | mAP 0.5 | fps | | :--------: | :----------------------------------------------------------: | :------: | :--------: | :-------: | :--------: | :-----: | ----- | | Gesture v1 | yolo4_gesture_weights.pth | 150 | 4->8 | 256x256 | 61.65 | 51.66 | | | Gesture v2 | yolo4tiny_gesture_SE.pth | 100 | 64->32 | 416x416 | 83.6 | 95.18 | 76.08 | | Gesture v2 | yolo4tiny_gesture_CBAM.pth | 100 | 64->32 | 416x416 | 89.35 | 98.85 | 70.01 | | Gesture v2 | yolo4tiny_gesture_ECA.pth | 100 | 64->32 | 416x416 | 88.37 | 96.26 | 77.19 | | Gesture v2 | yolo4tiny_gesture.pth | 100 | 64->32 | 416x416 | 87.01 | 95.86 | 81.81 | | Gesture v2 | yolo4_gesture_weightsv2.pth | 100 | 4->8 | 256x256 | 84.51 | 90.77 | 24.21 | | Gesture v3 | [yolov4_tiny.pth](https://github.com/Kedreamix/YoloGesture/releases/download/v1.0/yolov4_tiny.pth) | 150 | 64->32 | 416x416 | 75.05 | 91.30 | | | Gesture v3 | [yolov4_SE.pth](https://github.com/Kedreamix/YoloGesture/releases/download/v1.0/yolov4_SE.pth) | 150 | 64->32 | 416x416 | 78.06 | 90.13 | | | Gesture v3 | 
[yolov4_CBAM.pth](https://github.com/Kedreamix/YoloGesture/releases/download/v1.0/yolov4_CBAM.pth) | 150 | 64->32 | 416x416 | 91.09 | 94.97 | | | Gesture v3 | [yolov4_ECA.pth](https://github.com/Kedreamix/YoloGesture/releases/download/v1.0/yolov4_ECA.pth) | 150 | 64->32 | 416x416 | 94.58 | 83.24 | | | Gesture v3 | [yolov4_weights_ep150_416.pth](https://github.com/Kedreamix/YoloGesture/releases/download/v1.0/yolov4_weights_ep150_416.pth) | 150 | 64->32 | 416x416 | 95.145 | 98.35 | | | Gesture v3 | [yolov4_weights_ep150_608.pth](https://github.com/Kedreamix/YoloGesture/releases/download/v1.0/yolov4_weights_ep150_608.pth) | 150 | 64->32 | 608x608 | 93.64 | 97.23 | | > Gesture v1中存在数据集问题,所以模型结构不好 > > Gesture v2中重新修改数据集 > > Gesture v3中修改front数据集 Batch-Size 64->32是指在进行训练的时候,前半段冻结的时候使用的bs为64,在后续不冻结训练使用bs=32 ### 4.1. 训练权重文件下载 训练所需的yolo4_weights.pth有两种方式下载。(release包含所有过程的权重,百度网盘和奶牛只记录最新的权重) - 可以从我的release下载权重 [https://github.com/Kedreamix/YoloGesture/releases/tag/v1.0](https://github.com/Kedreamix/YoloGesture/releases/tag/v1.0) - 可以从我的huggingface的model下载权重 [https://huggingface.co/Kedreamix/YoloGestureWeights](https://huggingface.co/Kedreamix/YoloGestureWeights) - 也可以百度网盘下载 链接:https://pan.baidu.com/s/1Pt11VHMaHqSsPjb50W5IeQ 提取码:6666 - 由于百度网盘下载速度较慢,这里也给一个不限速的链接 (永久有效) 传输链接:https://cowtransfer.com/s/dc5e0f7f43a940 或 打开【奶牛快传】cowtransfer.com 使用传输口令:ftyvu0 提取; ### 4.2. 
数据集概况 ![在这里插入图片描述](https://img-blog.csdnimg.cn/5b3c7cc2c58c404987d54d9a2f5bb68d.png) - **Gesture v1** 只有800张图片,数量较少 - **Gesture** **v2** 增加了800张图片,数量增多,一共1600张图片 在运行过程中结果十分差,原因是数据集标注出现错误,会重新修改数据集 - **Gesture v3** 中修改了front的手势,使得front结果大大提升,平均准确率增大 > 上述展示图是关于Gesture v1的手势,后续数据进行了修改 整体数据集一共含有1600张,8个类别的手势,我的Gesture v3最后就是8个类别,大概1600张的数据集,类别分别是 - up - down - left - right - front - back - clockwise - anticlockwise 数据已经放在release中了,可以下载自用 > 之后我也做了类似的手势识别的任务,里面的数据集有18个类别 HaGRID手势识别数据集,里面的手势结果更多,并且也更大,总共有716G,建议可以缩小以后进行训练增强,如果有机会,我可以拿一个多类别的我也来训练一下 > > 以下是HaGRID的手势识别的类别,支持更多的手势识别的结果,这是官方下载地址:[https://github.com/hukenovs/hagrid](https://github.com/hukenovs/hagrid) > > [![gestures](https://github.com/hukenovs/hagrid/raw/master/images/gestures.jpg)](https://github.com/hukenovs/hagrid/blob/master/images/gestures.jpg) ## 5. 环境配置 我用的是torch==1.8.1 torchvision==0.9.1 > 代码在更高的版本也是适配的,我觉得可能去>=1.7的应该都是可以的 相对应的库可以直接利用以下代码在当前路径进行运行,利用清华源进行换源 ```bash pip install -r requirements.txt -i https://pypi.tuna.tsinghua.edu.cn/simple/ ``` ## 6. 快速运行代码 以下可以在命令行中运行,在命令行运行可能会更好一点
Install

```bash
git clone https://github.com/Kedreamix/YoloGesture.git
cd YoloGesture
# Optionally append the Tsinghua mirror: -i https://pypi.tuna.tsinghua.edu.cn/simple/
pip install -r requirements.txt
```
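Data

This step generates two files, `2007_train.txt` and `2007_val.txt`. Each line holds an image path and the labels for that image. The training code then reads the images and labels under `VOCdevkit/VOC2007`.

```bash
python voc_annotation.py
```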
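To make the "image path plus labels" layout concrete, here is a minimal sketch of reading one such line. The exact field layout is an assumption, not something this README states: it assumes a space-separated image path followed by boxes written as `x_min,y_min,x_max,y_max,class_id`. The function name `parse_annotation_line` and the sample line are hypothetical.

```python
from typing import List, Tuple

# (x_min, y_min, x_max, y_max, class_id)
Box = Tuple[int, int, int, int, int]


def parse_annotation_line(line: str) -> Tuple[str, List[Box]]:
    """Split one annotation line into an image path and its boxes.

    Assumed layout (not confirmed by the README):
        <image_path> <x_min>,<y_min>,<x_max>,<y_max>,<class_id> ...
    Paths that contain spaces are not handled by this sketch.
    """
    parts = line.split()
    if not parts:
        raise ValueError("empty annotation line")
    image_path, raw_boxes = parts[0], parts[1:]
    boxes: List[Box] = []
    for raw in raw_boxes:
        fields = raw.split(",")
        if len(fields) != 5:
            raise ValueError(f"expected 5 comma-separated values, got {raw!r}")
        x_min, y_min, x_max, y_max, class_id = (int(v) for v in fields)
        boxes.append((x_min, y_min, x_max, y_max, class_id))
    return image_path, boxes


# Hypothetical line: one box, class index 1 ("down" in the class list).
sample = "VOCdevkit/VOC2007/JPEGImages/1.jpg 21,7,174,210,1"
path, boxes = parse_annotation_line(sample)
print(path, boxes)
```

If the generated files use a different separator or field order, only the two `split` calls need to change.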
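Optional

This step is optional. The anchors can be recomputed for your own dataset with k-means, but the anchors that ship with YOLOv4 also work. The script produces a visualization of the clustering result when it finishes.

```bash
python kmeans_for_anchors.py
```

![kmeans_for_anchors.jpg](https://github.com/Kedreamix/YoloGesture/blob/main/kmeans_for_anchors.jpg?raw=true)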
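As a rough illustration of what anchor clustering does, the sketch below runs k-means on box widths and heights, using IoU instead of Euclidean distance to assign boxes to clusters. It is not the code in `kmeans_for_anchors.py`: the function names are hypothetical, cluster centers are updated with the mean (a simplification; implementations may use the median instead), and the sample sizes assume that the four trailing numbers in the five sample annotation files are `xmin, ymin, xmax, ymax`.

```python
import random
from typing import List, Tuple

Size = Tuple[float, float]  # (width, height)


def wh_iou(a: Size, b: Size) -> float:
    """IoU of two boxes aligned at the same top-left corner."""
    inter = min(a[0], b[0]) * min(a[1], b[1])
    union = a[0] * a[1] + b[0] * b[1] - inter
    return inter / union


def kmeans_anchors(sizes: List[Size], k: int, iters: int = 100, seed: int = 0) -> List[Size]:
    """Cluster (width, height) pairs into k anchors, sorted by area."""
    rng = random.Random(seed)
    centers = rng.sample(sizes, k)
    for _ in range(iters):
        # Assign every box to the center it overlaps most.
        clusters: List[List[Size]] = [[] for _ in range(k)]
        for s in sizes:
            best = max(range(k), key=lambda i: wh_iou(s, centers[i]))
            clusters[best].append(s)
        # Move each center to the mean size of its members.
        new_centers = []
        for i, members in enumerate(clusters):
            if not members:
                new_centers.append(centers[i])  # keep an empty cluster's center
                continue
            w = sum(m[0] for m in members) / len(members)
            h = sum(m[1] for m in members) / len(members)
            new_centers.append((w, h))
        if new_centers == centers:
            break  # assignments are stable
        centers = new_centers
    return sorted(centers, key=lambda c: c[0] * c[1])


# Box sizes derived from the five sample annotations (see assumption above).
sizes = [(153, 203), (215, 244), (231, 238), (220, 211), (121, 148)]
print(kmeans_anchors(sizes, k=2))
```

A real run would use every box in the dataset and the number of anchors the chosen model expects.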
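Training

The training parameters are set on the command line: `phi` selects the attention mechanism, `weights` is the path to the initial weights, and the remaining options are tunable training settings. `python train.py --help` explains each of them.

```bash
usage: train.py [-h] [--init INIT] [--epochs EPOCHS] [--weights WEIGHTS]
                [--freeze] [--freeze-epochs FREEZE_EPOCHS]
                [--freeze-size FREEZE_SIZE] [--batch-size BATCH_SIZE]
                [--optimizer {sgd,adam,adamw}] [--num_workers NUM_WORKERS]
                [--lr LR] [--tiny] [--phi PHI] [--weight-decay WEIGHT_DECAY]
                [--momentum MOMENTUM] [--save-period SAVE_PERIOD] [--cuda]
                [--shape SHAPE] [--fp16] [--mosaic]
                [--lr_decay_type {cos,step}] [--distributed]

optional arguments:
  -h, --help            show this help message and exit
  --init INIT           start training from the init epoch
  --epochs EPOCHS       epochs for training
  --weights WEIGHTS     initial weights path
  --freeze              whether to train with frozen layers
  --freeze-epochs FREEZE_EPOCHS
                        epochs for feeze (number of frozen-training epochs)
  --freeze-size FREEZE_SIZE
                        total batch size for Freezeing
  --batch-size BATCH_SIZE
                        total batch size for all GPUs
  --optimizer {sgd,adam,adamw}
                        optimizer used for training
  --num_workers NUM_WORKERS
                        number of worker threads used to load data
  --lr LR               Learning Rate (initial value)
  --tiny                use the yolov4-tiny model
  --phi PHI             attention mechanism used by yolov4-tiny
  --weight-decay WEIGHT_DECAY
                        weight decay, helps prevent overfitting
  --momentum MOMENTUM   optimizer parameter
  --save-period SAVE_PERIOD
                        save the weights every this many epochs
  --cuda                whether to use the GPU
  --shape SHAPE         input image shape, must be a multiple of 32
  --fp16                whether to use mixed-precision training
  --mosaic              YOLOv4 trick: Mosaic data augmentation
  --lr_decay_type {cos,step}
                        cos
  --distributed         whether to run on multiple GPUs
```

Some commonly used parameters:

- fp16

  Network weights are usually stored as 32-bit (sometimes 64-bit) floats, which takes a lot of GPU memory during training and a lot of space when saved. That precision is often unnecessary, so `--fp16` trains with 16-bit precision, which can cut GPU memory use roughly in half.

- phi

  `phi = 0` is yolov4_tiny, the lightweight variant of YOLOv4. `phi = 1, 2, 3` add the SE, CBAM, and ECA attention mechanisms respectively. For background on SE, CBAM, and ECA, see this blog post: [https://blog.csdn.net/weixin_44791964/article/details/121371986](https://blog.csdn.net/weixin_44791964/article/details/121371986). They are not covered further here.

- freeze

  As the commands below show, training can run with frozen layers for transfer learning, or without freezing; this is controlled by `--freeze`. The number of frozen epochs and the batch size used while frozen are set with `--freeze-epochs` and `--freeze-size`. The batch size can be set higher during the frozen stage.

If it is unclear how to train the different models, see the training and prediction details section below.

```bash
# Frozen training for transfer learning, starting from the existing yolov4_SE.pth weights
python train.py --tiny --phi 1 \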
  --epochs 100 \
  --weights model_data/yolov4_SE.pth \
  --freeze --freeze-epochs 50 --freeze-size 8 \
  --batch-size 4 --shape 416 \
  --fp16 --cuda

# Quick trial run, training from scratch
python train.py --tiny --phi 1 --epochs 10 \
  --batch-size 4 --shape 416 \
  --fp16 --cuda
```

To save typing, several options also have short forms: `--freeze` becomes `-f`, `--weights` becomes `-w`, `--freeze-epochs` becomes `-fe`, `--freeze-size` becomes `-fs`, and `--batch-size` becomes `-bs`.

The following commands are equivalent to the ones above:

```bash
# Frozen training for transfer learning
python train.py --tiny --phi 1 --epochs 100 \
  -w model_data/yolov4_SE.pth \
  -f -fe 50 -fs 8 \
  -bs 4 --shape 416 \
  --fp16 --cuda

# Quick trial run, training from scratch
python train.py --tiny --phi 1 --epochs 10 \
  --batch-size 4 --shape 416 \
  --fp16 --cuda
```
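Short and long option names like these can be declared as aliases of one another with `argparse`. The sketch below is a hypothetical parser, not the repository's `train.py`: it assumes the short spellings `-w`, `-f`, `-fe`, `-fs`, and `-bs`, and the default values are placeholders.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser with short aliases for a few training options."""
    parser = argparse.ArgumentParser(description="alias sketch")
    # Each option lists its short alias first, then the long name.
    parser.add_argument("-w", "--weights", type=str, default="")
    parser.add_argument("-f", "--freeze", action="store_true")
    parser.add_argument("-fe", "--freeze-epochs", type=int, default=50)
    parser.add_argument("-fs", "--freeze-size", type=int, default=8)
    parser.add_argument("-bs", "--batch-size", type=int, default=4)
    return parser


long_form = build_parser().parse_args(
    ["--weights", "model_data/yolov4_SE.pth", "--freeze",
     "--freeze-epochs", "50", "--freeze-size", "8", "--batch-size", "4"]
)
short_form = build_parser().parse_args(
    ["-w", "model_data/yolov4_SE.pth", "-f", "-fe", "50", "-fs", "8", "-bs", "4"]
)
# Both spellings fill the same attributes (dashes become underscores).
print(vars(long_form) == vars(short_form))
```

Because both names map to the same destination, the rest of a script can read `args.freeze_epochs` or `args.batch_size` without caring which spelling was typed.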
Predict

`predict.py` also takes parameters, such as the run mode, which is one of ['dir_predict', 'video', 'fps', 'predict', 'heatmap']. By default, `predict.py` runs inference on every image in the `img` folder.

```bash
# python predict.py --mode dir_predict \
#   --tiny --phi 1 \
#   --weights model_data/yolov4_SE.pth \
#   --cuda --shape 416
python predict.py --tiny --cuda
```
Get Map

This step produces plots of recall, precision, and related metrics, so the results are easy to inspect.

```bash
# Evaluate on the validation set
# python get_map.py --mode 0 \
#   --tiny --phi 1 \
#   --weights model_data/yolov4_SE.pth \
#   --cuda --shape 416
# python .\get_map.py --cuda --mode 0 --tiny --phi 3 --weights model_data/yolov4_ECA.pth
python get_map.py --tiny --cuda
```
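For readers unfamiliar with how precision, recall, and average precision relate, here is a minimal sketch for a single class. It is not the code in `get_map.py`: it assumes the detections are already sorted by descending confidence and already marked as true or false positives (for example by matching against ground truth at IoU 0.5), and it assumes a VOC-style area under the precision envelope. Both function names are hypothetical.

```python
from typing import List, Tuple


def precision_recall(is_tp: List[bool], num_gt: int) -> Tuple[List[float], List[float]]:
    """Cumulative precision and recall after each detection.

    is_tp must be ordered by descending confidence; num_gt is the number
    of ground-truth boxes of this class.
    """
    tp = fp = 0
    precisions, recalls = [], []
    for hit in is_tp:
        if hit:
            tp += 1
        else:
            fp += 1
        precisions.append(tp / (tp + fp))
        recalls.append(tp / num_gt)
    return precisions, recalls


def average_precision(precisions: List[float], recalls: List[float]) -> float:
    """Area under the precision-recall curve after making precision monotonic."""
    p = [0.0] + list(precisions) + [0.0]
    r = [0.0] + list(recalls) + [1.0]
    # Replace each precision with the best precision at any higher recall.
    for i in range(len(p) - 2, -1, -1):
        p[i] = max(p[i], p[i + 1])
    # Sum the rectangles between consecutive recall values.
    return sum((r[i] - r[i - 1]) * p[i] for i in range(1, len(r)))


# Three detections (hit, miss, hit) against two ground-truth boxes.
prec, rec = precision_recall([True, False, True], num_gt=2)
print(prec, rec, average_precision(prec, rec))
```

The mAP figure is then the mean of this per-class value over all classes.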
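All parameters are explained in each script's help output:

```bash
python train.py -h
python get_map.py -h
python predict.py -h
```

On a machine with several GPUs, select a specific one by setting `CUDA_VISIBLE_DEVICES` before `python`. GPU indices start at 0, so `CUDA_VISIBLE_DEVICES=3` selects the fourth GPU.

```bash
# Use the fourth GPU (index 3)
CUDA_VISIBLE_DEVICES=3 python train.py ...
```

To use several GPUs, for example GPUs 0 and 1:

```bash
# Use GPUs 0 and 1
CUDA_VISIBLE_DEVICES=0,1 python train.py ...
```

## 7. Training and prediction details

### 7.1. Training configuration file (important)

This is the most important part. The `model_data` folder contains a YAML file that holds some of the run parameters.

Adjusting the values in that file and running the scripts is enough to get results. The remaining parameters can be changed on the command line, and the defaults also work as they are. The details follow below; the part to edit is mainly the `train.py` section, which makes training more convenient.

```yaml
#------------------------------detect.py--------------------------------#
# This section is for semi-automatic labeling, which reduces the manual workload.
# It needs previously trained weights; results are saved in Labelme format.
#   dir_detect_path     where the images are stored
#   detect_save_path    where the annotations are saved
# ----------------------------------------------------------------------#
dir_detect_path: ./JPEGImages
detect_save_path: ./Annotation

# ----------------------------- train.py -------------------------------#
nc: 8 # number of classes
classes: ["up","down","left","right","front","back","clockwise","anticlockwise"] # class names
confidence: 0.5 # confidence threshold
nms_iou: 0.3
letterbox_image: False
lr_decay_type: cos # learning-rate decay schedule, either step or cos

# Number of worker threads used to load data.
# More workers speed up data loading but use more memory.
# On machines with little memory use 2 or 0; on Windows, 0 is recommended.
num_workers: 4
```

### 7.2. Training on your own dataset

1. Prepare the dataset

   Before training, put the label files in `Annotations` under `VOCdevkit/VOC2007`.

   Before training, put the image files in `JPEGImages` under `VOCdevkit/VOC2007`.

2. Process the dataset

   Once the dataset is in place, use `voc_annotation.py` to generate the `2007_train.txt` and `2007_val.txt` files used for training.

   Edit the parameters in `voc_annotation.py`. For a first run it is enough to change `classes_path`, which points to the txt file listing the detection classes.

   Then write your own classes and the number of classes in the YAML file described above (`model_data/gesture.yaml` in this repository):

   ```yaml
   nc: 8 # number of classes
   classes: ["up","down","left","right","front","back","clockwise","anticlockwise"] # class names
   ```

3.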
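Start training

   Run `train.py` as described in the quick-run section to start training. After several epochs the weights are written to the `logs` folder. How often the weights are saved can be set as shown in the quick-run section.

   Five models are built in: the original yolov4 model, plus yolov4-tiny, yolov4-SE, yolov4-ECA, and yolov4-CBAM. All of them can be trained. The last four are small models and are treated as tiny models, so their weights are smaller and they run faster. The models are selected with the `tiny` and `phi` parameters:

   - phi = 0 yolov4-tiny
   - phi = 1 yolov4-SE
   - phi = 2 yolov4-CBAM
   - phi = 3 yolov4-ECA

   ```bash
   # yolov4 model
   python train.py --epochs 10 --shape 416 --cuda --batch-size 4

   # yolov4-tiny
   python train.py --epochs 10 --shape 416 --cuda --batch-size 8 --tiny --phi 0

   # yolov4-SE
   python train.py --epochs 10 --shape 416 --cuda --batch-size 8 --tiny --phi 1

   # yolov4-CBAM
   python train.py --epochs 10 --shape 416 --cuda --batch-size 8 --tiny --phi 2

   # yolov4-ECA
   python train.py --epochs 10 --shape 416 --cuda --batch-size 8 --tiny --phi 3
   ```

   To set other parameters, see the Training part of the quick-run section and add the options you need.

4. Predict with the trained model

   Prediction uses two files, `yolo.py` and `predict.py`.

   After making the changes, run `predict.py` to detect. Once it is running, enter an image path to run detection on it.

   (The mode can be changed to get other kinds of output.)

### 7.3. Visualizing results with TensorBoard

During training, TensorBoard shows the training progress in real time and can also display the network structure, which is very convenient.

From a command line in the project folder, run

```bash
tensorboard --logdir='logs/'
```

The results are then available in real time on port 6006, at http://localhost:6006

> On Ubuntu the command may fail with a "command not found" error, in which case a few extra steps are needed.
>
> First find where TensorBoard is installed:
>
> ```bash
> pip show tensorboard
> ```
>
> This prints the path where tensorboard was installed.
>
> Then locate the `main.py` file in the TensorBoard folder and run it by its absolute path:
>
> ```
> python .../python3.8/site-packages/tensorboard/main.py --logdir='logs/'
> ```

### 7.4. Prediction steps

1.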
下载完库后解压,在百度网盘后者其他地方下载yolo_gesture_weights.pth,放入model_data,运行predict.py,调整权重路径 在predict.py中事先设置了`dir_predict`表示遍历文件夹进行检测并保存。默认遍历img文件夹,保存img_out文件夹,这样就可以在img_out中得到文件 有很多种模式,可以通过mode来调节,这一部分还可以设置参数,我们都可以从help里看到 ```bash predict.py -h usage: predict.py [-h] [--weights WEIGHTS] [--tiny] [--phi PHI] [--mode {dir_predict,video,fps,predict,heatmap,export_onnx}] [--cuda] [--shape SHAPE] [--video VIDEO] [--save-video SAVE_VIDEO] [--confidence CONFIDENCE] [--nms_iou NMS_IOU] optional arguments: -h, --help show this help message and exit --weights WEIGHTS initial weights path --tiny 使用yolotiny模型 --phi PHI yolov4tiny注意力机制类型 --mode {dir_predict,video,fps,predict,heatmap,export_onnx} 预测的模式 --cuda 表示是否使用GPU --shape SHAPE 输入图像的shape --video VIDEO 需要检测的视频文件 --save-video SAVE_VIDEO 保存视频的位置 --confidence CONFIDENCE 只有得分大于置信度的预测框会被保留下来 --nms_iou NMS_IOU 非极大抑制所用到的nms_iou大小 ``` 如果下载了权重于路径model_data/yolov4_tiny.pth,默认是文件夹中的图片模式运行,我们就可以直接运行得到结果 ```python python predict.py --tiny --phi 0 --weights model_data/yolov4_tiny.pth ``` 2. 在predict.py里面进行设置可以进行fps测试和video视频检测。 (这一部分可以自己尝试) 这一部分只要设置一下路径和视频即可,分别有多种模式 ### 7.5. 评估步骤 1. 本文使用VOC格式进行评估。 2. 如果在训练前已经运行过voc_annotation.py文件,代码会自动将数据集划分成训练集、验证集和测试集。如果想要修改测试集的比例,可以修改voc_annotation.py文件下的trainval_percent。trainval_percent用于指定(训练集+验证集)与测试集的比例,默认情况下 (训练集+验证集):测试集 = 9:1。train_percent用于指定(训练集+验证集)中训练集与验证集的比例,默认情况下 训练集:验证集 = 9:1。 3. 利用voc_annotation.py划分测试集 4. 运行get_map.py即可获得评估结果,评估结果会保存在map_out文件夹中。 ## 8. Streamlit 项目部署 经过上述学习过程,最后我利用streamlit进行了项目部署,可以在本地部署,也可以在云端部署,代码已经上传的了,并且我最后部署到了streamlit的服务器中,大家都可以在线体验 [https://kedreamix-yologesture.streamlit.app/](https://kedreamix-yologesture.streamlit.app/),然后选择“Run the app”即可,不需要过多的操作,云端服务器会会自动从我的release中下载模型,所以这个不用担心。 > 关于streamlit的一些方法,可以看一下我另一篇博客,也有对应的github链接,那个简单一点[https://redamancy.blog.csdn.net/article/details/121788919](https://redamancy.blog.csdn.net/article/details/121788919) ![在这里插入图片描述](https://img-blog.csdnimg.cn/c62927213dc04faf855ed667d65602bd.png) ### 8.1. 
本地运行 打开命令行运行以下代码,记住,首先进行pip install streamlit ```python streamlit run gesture.streamlit.py ``` 运行之后,打开的 https://localhost:8501 就可以看到自己的streamlit的界面了 ![](https://img-blog.csdnimg.cn/45d6624063f649ec8083e75e94afaf28.png) ### 8.2. 检测方法 在这个部署界面中,我一共设了几种方式,分别是 | 测试模型方式 | 测试方式描述 | | ------------ | ------------------------------------------------------------ | | Example | 已有一部分数据在服务器的文件夹里,可以读取进行检测 | | Image | 可以自主上传对应的图片进行检测![在这里插入图片描述](https://img-blog.csdnimg.cn/7db5de05f119415cae6a9cb0ef39d092.png) | | Camera | 利用电脑摄像头进行检测,对摄像头进行拍照,然后可以检测fps,heatmap![在这里插入图片描述](https://img-blog.csdnimg.cn/6c1b8f721ebe4e0f86d68bf321e210b2.png) | | FPS | 上传一张图片进行FPS | | Heatmap | 进行一个热力图的检测,可以看到模型关注哪一部分![在这里插入图片描述](https://img-blog.csdnimg.cn/e2c53a474f764a0f99adb05b39c3bee9.png) | | Real Time | 实时检测,可能这一部分在云服务器有点bug,可能要在自己电脑下才能正常运行,云端不可用 | | Video | 传入视频进行检测![在这里插入图片描述](https://img-blog.csdnimg.cn/ccde1a9f657d4e7587012bf753f36095.png) | ### 8.3. 选择模型以及参数 并且在下面的部分,也设置了几个参数,首先是使用的模型,根据前面所说的,一共有五种模型,并且可以调节传入的shape,这里注意一下,如果选择tiny模型,要勾选☑️使用tiny模型,要不默认全是yolov4模型,然后tiny模型的shape统一只有416,只有yolov4模型有一个608和416,可以根据自己的情况选择。 除此之外,还可以调节一下confidence和nms的参数,默认分别是0.5和0.3,这个是可以通过滑动杆来修改的 ![在这里插入图片描述](https://img-blog.csdnimg.cn/88e387f27b3747c58a6c910281837c03.png) ## 9. 参考Reference - [https://github.com/bubbliiiing/yolov4-pytorch](https://github.com/bubbliiiing/yolov4-pytorch) - [https://github.com/qqwweee/keras-yolo3/](https://github.com/qqwweee/keras-yolo3/) - [https://github.com/Ma-Dan/keras-yolo4](https://github.com/Ma-Dan/keras-yolo4) ## 10. 
代码权重可复现,已开源(求🌟🌟🌟) 所有的上述代码权重全部可复现,已经全部开源,有需要可以自取https://github.com/Kedreamix/YoloGesture 有问题欢迎在issue中讨论,最后创作不易,给我个星星吧哈哈哈star一下,🌟🌟🌟 ================================================ FILE: VOCdevkit/VOC2007/Annotations/1.xml ================================================ JPEGImages 1.jpg E:\handpose_x_gesture_v2\JPEGImages\1.jpg Unknown 175 223 3 0 down Unspecified 0 0 21 7 174 210 ================================================ FILE: VOCdevkit/VOC2007/Annotations/2.xml ================================================ JPEGImages 2.jpg E:\handpose_x_gesture_v2\JPEGImages\2.jpg Unknown 274 295 3 0 down Unspecified 0 0 44 20 259 264 ================================================ FILE: VOCdevkit/VOC2007/Annotations/3.xml ================================================ JPEGImages 3.jpg E:\handpose_x_gesture_v2\JPEGImages\3.jpg Unknown 325 363 3 0 down Unspecified 0 0 30 59 261 297 ================================================ FILE: VOCdevkit/VOC2007/Annotations/4.xml ================================================ JPEGImages 4.jpg E:\handpose_x_gesture_v2\JPEGImages\4.jpg Unknown 306 299 3 0 down Unspecified 0 0 44 45 264 256 ================================================ FILE: VOCdevkit/VOC2007/Annotations/5.xml ================================================ JPEGImages 5.jpg E:\handpose_x_gesture_v2\JPEGImages\5.jpg Unknown 191 211 3 0 down Unspecified 0 0 31 19 152 167 ================================================ FILE: VOCdevkit/VOC2007/Annotations/README.md ================================================ 存放标签文件 ================================================ FILE: VOCdevkit/VOC2007/ImageSets/Main/README.md ================================================ 存放训练索引文件 ================================================ FILE: VOCdevkit/VOC2007/ImageSets/Main/test.txt ================================================ ================================================ FILE: VOCdevkit/VOC2007/ImageSets/Main/train.txt ================================================ 
10 100 1000 1001 1002 1003 1005 1006 1007 1008 1009 101 1010 1011 1012 1013 1014 1015 1016 1017 1019 102 1020 1021 1022 1023 1024 1025 1026 1028 1029 103 1030 1031 1032 1033 1034 1035 1036 1037 1038 1039 104 1040 1041 1042 1043 1044 1045 1046 1047 1048 105 1050 1051 1052 1053 1054 1056 1057 1058 1059 106 1060 1061 1062 1063 1064 1065 1066 1067 1068 1069 107 1070 1071 1072 1074 1075 1076 1077 1078 1079 108 1080 1081 1083 1084 1085 1086 1088 1089 109 1090 1091 1094 1096 1098 11 110 1100 1101 1102 1103 1104 1105 1106 1107 1108 1109 111 1111 1112 1113 1114 1115 1116 1117 1118 1119 112 1120 1121 1122 1123 1124 1125 1126 1127 1130 1131 1132 1133 1134 1135 1136 1137 1138 1139 114 1140 1141 1142 1144 1145 1146 1147 1148 1149 1150 1151 1152 1153 1155 1156 1157 1158 116 1160 1161 1162 1163 1164 1165 1166 1167 1168 1169 117 1170 1171 1172 1173 1174 1175 1176 1178 1179 118 1180 1181 1182 1183 1184 1185 1186 1187 1188 1189 119 1190 1191 1192 1193 1195 1196 1197 1198 1199 12 120 1202 1203 1204 1205 1206 1207 1208 1209 121 1210 1211 1213 1214 1216 1217 1218 1219 122 1220 1221 1222 1223 1224 1225 1226 1227 1228 1229 123 1230 1231 1232 1234 1235 1236 1237 1238 1239 124 1240 1241 1242 1243 1244 1247 1248 1249 1250 1251 1252 1253 1254 1255 1256 1257 1259 126 1260 1261 1263 1264 1265 1266 1267 1268 1269 127 1270 1271 1272 1273 1274 1275 1276 1277 1278 1279 128 1280 1281 1282 1284 1285 1286 1287 1288 1289 129 1290 1291 1292 1293 1295 1296 1297 1298 1299 13 130 1300 1301 1302 1303 1304 1305 1306 1307 1308 131 1310 1311 1312 1313 1314 1316 1317 1318 1320 1321 1322 1323 1324 1325 1326 1327 1328 1329 133 1330 1331 1332 1333 1334 1335 1336 1338 134 1340 1342 1343 1344 1345 1346 1347 1348 1349 135 1350 1352 1353 1354 1355 1356 1357 1358 1359 136 1360 1361 1362 1363 1364 1365 1366 1367 1369 137 1370 1372 1373 1374 1375 1376 1377 1378 1379 138 1380 1381 1382 1383 1384 1385 1386 1387 1388 1389 139 1390 1391 1392 1394 1395 1396 1397 1398 1399 14 1400 1401 1402 1403 1404 1405 1406 1407 1409 141 
1410 1411 1412 1413 1414 1416 1417 1418 142 1420 1421 1422 1424 1425 1426 1427 1428 1429 143 1430 1431 1432 1433 1434 1435 1436 1437 1439 1440 1441 1442 1443 1444 1445 1446 1448 1449 145 1450 1452 1454 1456 1458 1459 146 1460 1461 1462 1463 1464 1465 1466 1467 1468 147 1470 1471 1472 1473 1474 1475 1476 1477 1478 1479 148 1480 1482 1483 1484 1485 1487 1488 1489 149 1490 1491 1492 1494 1495 1496 1497 1498 1499 15 150 1500 1501 1502 1503 1504 1506 1507 1508 1509 151 1510 1511 1512 1513 1515 1516 1517 1518 1519 152 1520 1521 1522 1523 1524 1525 1526 1527 1528 1529 1530 1532 1533 1534 1535 1536 1537 1539 154 1540 1541 1542 1543 1544 1545 1546 1547 1548 1549 1550 1551 1552 1554 1555 1556 1557 1558 1559 156 1560 1561 1562 1563 1564 1565 1566 1567 1568 1569 157 1571 1572 1573 1574 1575 1576 1577 1578 1579 158 1580 1581 1582 1583 1584 1585 1586 1587 1588 1589 159 1591 1592 1593 1594 1595 1596 1597 1598 1599 16 160 1600 161 162 163 164 165 166 168 169 17 170 171 172 173 174 175 176 177 178 18 182 183 184 185 186 187 189 19 190 193 194 195 196 197 198 199 2 200 201 202 203 204 205 206 207 208 209 21 210 212 213 214 215 216 217 218 219 22 220 221 222 223 224 225 226 227 228 230 231 232 233 234 235 236 237 238 239 24 240 241 242 243 244 245 246 247 25 250 251 252 253 254 255 256 257 258 259 26 260 261 262 263 265 266 267 268 269 27 270 271 272 273 274 275 276 277 278 279 28 280 281 284 285 286 287 288 291 294 295 296 297 298 299 3 30 300 301 302 303 304 305 306 307 308 309 31 310 311 312 313 314 315 316 317 318 319 32 320 321 322 323 324 325 326 327 328 329 33 330 331 332 334 335 337 338 339 34 340 341 342 343 344 345 346 347 348 349 35 350 351 353 354 355 356 357 358 36 360 361 362 363 364 365 366 367 368 369 37 370 371 372 373 374 375 376 377 378 379 38 380 382 383 384 385 386 387 388 389 39 390 391 392 393 394 395 396 397 398 399 4 40 400 401 402 404 405 406 407 408 409 41 410 411 412 413 414 415 416 417 418 419 42 420 421 422 423 425 426 427 428 429 43 430 431 432 433 434 
436 437 438 439 44 440 441 442 443 444 445 446 447 448 449 45 450 451 452 453 454 455 457 458 459 46 461 462 464 465 466 467 468 47 470 471 472 473 474 475 476 477 478 479 48 480 481 482 483 484 485 486 487 488 489 49 490 491 492 493 494 495 496 497 498 499 5 50 500 501 502 503 504 505 506 507 508 509 51 512 513 514 515 516 517 518 519 520 521 522 523 524 526 527 528 529 53 530 532 533 534 536 537 539 54 540 541 542 543 544 545 546 547 549 55 551 552 553 554 556 557 559 56 561 562 563 564 565 566 567 568 569 57 570 571 572 573 575 577 578 579 58 580 581 582 583 585 586 587 588 589 59 590 591 592 593 594 596 597 598 599 6 600 601 602 603 604 605 606 608 609 61 610 611 612 613 614 615 616 617 618 619 62 620 621 622 623 625 626 627 628 629 63 630 631 632 633 634 635 636 637 638 639 64 640 641 642 643 644 645 646 647 648 649 65 650 651 652 653 654 655 656 657 658 66 660 662 663 664 665 666 667 668 669 67 670 671 672 673 675 676 677 678 679 68 680 681 682 683 684 685 686 687 688 689 69 690 691 692 693 694 695 696 698 699 70 700 701 702 703 704 705 706 708 709 71 710 711 713 714 715 716 717 718 719 72 720 721 722 723 724 725 726 727 729 730 731 732 733 734 735 736 737 738 739 740 741 742 743 744 745 746 747 748 749 75 750 751 752 753 755 756 757 759 76 760 761 762 763 764 766 767 769 77 770 771 772 773 774 775 776 777 778 779 78 780 781 783 784 785 786 787 788 789 79 790 791 792 793 794 795 796 797 798 799 8 80 800 801 802 803 804 805 806 807 808 809 81 810 812 813 814 815 816 819 82 820 821 822 823 824 825 826 827 828 829 83 830 831 832 833 834 836 837 838 839 84 840 841 842 843 844 845 846 847 848 849 85 850 851 852 853 854 855 856 857 858 859 86 860 861 862 863 864 866 867 868 869 87 870 872 873 874 876 877 878 879 88 880 882 883 884 885 886 887 888 889 89 890 891 892 893 895 896 897 899 9 900 901 902 903 904 905 906 908 909 91 910 911 912 913 914 915 916 917 918 919 92 920 921 923 924 925 926 927 928 929 93 930 931 932 933 934 935 936 938 94 940 941 943 944 945 946 
947 948 949 95 950 951 952 953 954 955 956 957 958 959 96 960 961 963 965 966 967 968 969 97 970 971 972 973 974 975 976 977 978 979 98 981 982 983 984 985 986 987 988 989 99 990 991 992 994 995 996 997 998 ================================================ FILE: VOCdevkit/VOC2007/ImageSets/Main/trainval.txt ================================================ 1 10 100 1000 1001 1002 1003 1004 1005 1006 1007 1008 1009 101 1010 1011 1012 1013 1014 1015 1016 1017 1018 1019 102 1020 1021 1022 1023 1024 1025 1026 1027 1028 1029 103 1030 1031 1032 1033 1034 1035 1036 1037 1038 1039 104 1040 1041 1042 1043 1044 1045 1046 1047 1048 1049 105 1050 1051 1052 1053 1054 1055 1056 1057 1058 1059 106 1060 1061 1062 1063 1064 1065 1066 1067 1068 1069 107 1070 1071 1072 1073 1074 1075 1076 1077 1078 1079 108 1080 1081 1082 1083 1084 1085 1086 1087 1088 1089 109 1090 1091 1092 1093 1094 1095 1096 1097 1098 1099 11 110 1100 1101 1102 1103 1104 1105 1106 1107 1108 1109 111 1110 1111 1112 1113 1114 1115 1116 1117 1118 1119 112 1120 1121 1122 1123 1124 1125 1126 1127 1128 1129 113 1130 1131 1132 1133 1134 1135 1136 1137 1138 1139 114 1140 1141 1142 1143 1144 1145 1146 1147 1148 1149 115 1150 1151 1152 1153 1154 1155 1156 1157 1158 1159 116 1160 1161 1162 1163 1164 1165 1166 1167 1168 1169 117 1170 1171 1172 1173 1174 1175 1176 1177 1178 1179 118 1180 1181 1182 1183 1184 1185 1186 1187 1188 1189 119 1190 1191 1192 1193 1194 1195 1196 1197 1198 1199 12 120 1200 1201 1202 1203 1204 1205 1206 1207 1208 1209 121 1210 1211 1212 1213 1214 1215 1216 1217 1218 1219 122 1220 1221 1222 1223 1224 1225 1226 1227 1228 1229 123 1230 1231 1232 1233 1234 1235 1236 1237 1238 1239 124 1240 1241 1242 1243 1244 1245 1246 1247 1248 1249 125 1250 1251 1252 1253 1254 1255 1256 1257 1258 1259 126 1260 1261 1262 1263 1264 1265 1266 1267 1268 1269 127 1270 1271 1272 1273 1274 1275 1276 1277 1278 1279 128 1280 1281 1282 1283 1284 1285 1286 1287 1288 1289 129 1290 1291 1292 1293 1294 1295 1296 1297 1298 1299 13 130 1300 
1301 1302 1303 1304 1305 1306 1307 1308 1309 131 1310 1311 1312 1313 1314 1315 1316 1317 1318 1319 132 1320 1321 1322 1323 1324 1325 1326 1327 1328 1329 133 1330 1331 1332 1333 1334 1335 1336 1337 1338 1339 134 1340 1341 1342 1343 1344 1345 1346 1347 1348 1349 135 1350 1351 1352 1353 1354 1355 1356 1357 1358 1359 136 1360 1361 1362 1363 1364 1365 1366 1367 1368 1369 137 1370 1371 1372 1373 1374 1375 1376 1377 1378 1379 138 1380 1381 1382 1383 1384 1385 1386 1387 1388 1389 139 1390 1391 1392 1393 1394 1395 1396 1397 1398 1399 14 140 1400 1401 1402 1403 1404 1405 1406 1407 1408 1409 141 1410 1411 1412 1413 1414 1415 1416 1417 1418 1419 142 1420 1421 1422 1423 1424 1425 1426 1427 1428 1429 143 1430 1431 1432 1433 1434 1435 1436 1437 1438 1439 144 1440 1441 1442 1443 1444 1445 1446 1447 1448 1449 145 1450 1451 1452 1453 1454 1455 1456 1457 1458 1459 146 1460 1461 1462 1463 1464 1465 1466 1467 1468 1469 147 1470 1471 1472 1473 1474 1475 1476 1477 1478 1479 148 1480 1481 1482 1483 1484 1485 1486 1487 1488 1489 149 1490 1491 1492 1493 1494 1495 1496 1497 1498 1499 15 150 1500 1501 1502 1503 1504 1505 1506 1507 1508 1509 151 1510 1511 1512 1513 1514 1515 1516 1517 1518 1519 152 1520 1521 1522 1523 1524 1525 1526 1527 1528 1529 153 1530 1531 1532 1533 1534 1535 1536 1537 1538 1539 154 1540 1541 1542 1543 1544 1545 1546 1547 1548 1549 155 1550 1551 1552 1553 1554 1555 1556 1557 1558 1559 156 1560 1561 1562 1563 1564 1565 1566 1567 1568 1569 157 1570 1571 1572 1573 1574 1575 1576 1577 1578 1579 158 1580 1581 1582 1583 1584 1585 1586 1587 1588 1589 159 1590 1591 1592 1593 1594 1595 1596 1597 1598 1599 16 160 1600 161 162 163 164 165 166 167 168 169 17 170 171 172 173 174 175 176 177 178 179 18 180 181 182 183 184 185 186 187 188 189 19 190 191 192 193 194 195 196 197 198 199 2 20 200 201 202 203 204 205 206 207 208 209 21 210 211 212 213 214 215 216 217 218 219 22 220 221 222 223 224 225 226 227 228 229 23 230 231 232 233 234 235 236 237 238 239 24 240 241 242 243 244 245 246 
247 248 249 25 250 251 252 253 254 255 256 257 258 259 26 260 261 262 263 264 265 266 267 268 269 27 270 271 272 273 274 275 276 277 278 279 28 280 281 282 283 284 285 286 287 288 289 29 290 291 292 293 294 295 296 297 298 299 3 30 300 301 302 303 304 305 306 307 308 309 31 310 311 312 313 314 315 316 317 318 319 32 320 321 322 323 324 325 326 327 328 329 33 330 331 332 333 333 334 335 336 337 338 339 34 340 341 342 343 344 345 346 347 348 349 35 350 351 352 353 354 355 356 357 358 359 36 360 361 362 363 364 365 366 367 368 369 37 370 371 372 373 374 375 376 377 378 379 38 380 381 382 383 384 385 386 387 388 389 39 390 391 392 393 394 395 396 397 398 399 4 40 400 401 402 403 404 405 406 407 408 409 41 410 411 412 413 414 415 416 417 418 419 42 420 421 422 423 424 425 426 427 428 429 43 430 431 432 433 434 435 436 437 438 439 44 440 441 442 443 444 445 446 447 448 449 45 450 451 452 453 454 455 456 457 458 459 46 460 461 462 463 464 465 466 467 468 469 47 470 471 472 473 474 475 476 477 478 479 48 480 481 482 483 484 485 486 487 488 489 49 490 491 492 493 494 495 496 497 498 499 5 50 500 501 502 503 504 505 506 507 508 509 51 510 511 512 513 514 515 516 517 518 519 52 520 521 522 523 524 525 526 527 528 529 53 530 531 532 533 534 535 536 537 538 539 54 540 541 542 543 544 545 546 547 548 549 55 550 551 552 553 554 555 556 557 558 559 56 560 561 562 563 564 565 566 567 568 569 57 570 571 572 573 574 575 576 577 578 579 58 580 581 582 583 584 585 586 587 588 589 59 590 591 592 593 594 595 596 597 598 599 6 60 600 601 602 603 604 605 606 607 608 609 61 610 611 612 613 614 615 616 617 618 619 62 620 621 622 623 624 625 626 627 628 629 63 630 631 632 633 634 635 636 637 638 639 64 640 641 642 643 644 645 646 647 648 649 65 650 651 652 653 654 655 656 657 658 659 66 660 661 662 663 664 665 666 667 668 669 67 670 671 672 673 674 675 676 677 678 679 68 680 681 682 683 684 685 686 687 688 689 69 690 691 692 693 694 695 696 697 698 699 7 70 700 701 702 703 704 705 706 707 708 
709 71 710 711 712 713 714 715 716 717 718 719 72 720 721 722 723 724 725 726 727 728 729 73 730 731 732 733 734 735 736 737 738 739 74 740 741 742 743 744 745 746 747 748 749 75 750 751 752 753 754 755 756 757 758 759 76 760 761 762 763 764 765 766 767 768 769 77 770 771 772 773 774 775 776 777 778 779 78 780 781 782 783 784 785 786 787 788 789 79 790 791 792 793 794 795 796 797 798 799 8 80 800 801 802 803 804 805 806 807 808 809 81 810 811 812 813 814 815 816 817 818 819 82 820 821 822 823 824 825 826 827 828 829 83 830 831 832 833 834 835 836 837 838 839 84 840 841 842 843 844 845 846 847 848 849 85 850 851 852 853 854 855 856 857 858 859 86 860 861 862 863 864 865 866 867 868 869 87 870 871 872 873 874 875 876 877 878 879 88 880 881 882 883 884 885 886 887 888 889 89 890 891 892 893 894 895 896 897 898 899 9 90 900 901 902 903 904 905 906 907 908 909 91 910 911 912 913 914 915 916 917 918 919 92 920 921 922 923 924 925 926 927 928 929 93 930 931 932 933 934 935 936 937 938 939 94 940 941 942 943 944 945 946 947 948 949 95 950 951 952 953 954 955 956 957 958 959 96 960 961 962 963 964 965 966 967 968 969 97 970 971 972 973 974 975 976 977 978 979 98 980 981 982 983 984 985 986 987 988 989 99 990 991 992 993 994 995 996 997 998 999 ================================================ FILE: VOCdevkit/VOC2007/ImageSets/Main/val.txt ================================================ 1 1004 1018 1027 1049 1055 1073 1082 1087 1092 1093 1095 1097 1099 1110 1128 1129 113 1143 115 1154 1159 1177 1194 1200 1201 1212 1215 1233 1245 1246 125 1258 1262 1283 1294 1309 1315 1319 132 1337 1339 1341 1351 1368 1371 1393 140 1408 1415 1419 1423 1438 144 1447 1451 1453 1455 1457 1469 1481 1486 1493 1505 1514 153 1531 1538 155 1553 1570 1590 167 179 180 181 188 191 192 20 211 229 23 248 249 264 282 283 289 29 290 292 293 333 333 336 352 359 381 403 424 435 456 460 463 469 510 511 52 525 531 535 538 548 550 555 558 560 574 576 584 595 60 607 624 659 661 674 697 7 707 712 728 73 74 754 758 
765 768 782 811 817 818 835 865 871 875 881 894 898 90 907 922 937 939 942 962 964 980 993 999

================================================
FILE: YOLOv4-study学习资料md
================================================
# YOLOv4 Study Resources

![Image description](https://img-blog.csdnimg.cn/e3551e344873465d8ad884f856d652ed.png)

[Tianxiaomo](https://github.com/Tianxiaomo)/**[pytorch-YOLOv4](https://github.com/Tianxiaomo/pytorch-YOLOv4)** star 3.5k

PyTorch, ONNX and TensorRT implementation of *YOLOv4*

[WongKinYiu](https://github.com/WongKinYiu)/**[PyTorch_YOLOv4](https://github.com/WongKinYiu/PyTorch_YOLOv4)** star 1.5k

PyTorch implementation of *YOLOv4*

[argusswift](https://github.com/argusswift)/**[YOLOv4-pytorch](https://github.com/argusswift/YOLOv4-pytorch)** star 1.4k

This is a pytorch repository of *YOLOv4*, attentive *YOLOv4* and mobilenet *YOLOv4* with PASCAL VOC and COCO

[bubbliiiing/*yolov4*-pytorch](https://github.com/bubbliiiing/yolov4-pytorch) star 1.2k

Source code for *YoloV4*-pytorch that can be used to train your own models.

## Extensions

[Bil369](https://github.com/Bil369)/**[MaskDetect-YOLOv4-PyTorch](https://github.com/Bil369/MaskDetect-YOLOv4-PyTorch)**

Face-mask detection built on *PyTorch* & *YOLOv4* ⭐ shares a self-built mask dataset

[bobo0810](https://github.com/bobo0810)/**[PytorchNetHub](https://github.com/bobo0810/PytorchNetHub)**

Annotated projects + paper reproductions + algorithm competitions + PyTorch guide

[Bil369](https://github.com/Bil369)/**[YOLOv4-PyTorch-Simple-Implementation](https://github.com/Bil369/YOLOv4-PyTorch-Simple-Implementation)**

*YOLOv4* *PyTorch* Simple Implementation

================================================
FILE: detect.py
================================================
#-----------------------------------------------------------------------#
#   detect.py uses a small model to semi-automatically annotate data
#-----------------------------------------------------------------------#
import numpy as np
from PIL import Image
from get_yaml import get_config
from yolo import YOLO
from gen_annotation import GEN_Annotations

if __name__ == "__main__":
    # Configuration file
    config = get_config()
    yolo = YOLO()
    dir_detect_path = config['dir_detect_path']
    detect_save_path = config['detect_save_path']

    import os
    from tqdm import tqdm

    img_names = os.listdir(dir_detect_path)
    for img_name in tqdm(img_names):
        if img_name.lower().endswith(('.bmp', '.dib', '.png', '.jpg', '.jpeg', '.pbm', '.pgm', '.ppm', '.tif', '.tiff')):
            # if int(img_name.split('.')[0][-4:]) < 355:
            #     continue
            image_path = os.path.join(dir_detect_path, img_name)
            image = Image.open(image_path)
            boxes = yolo.get_box(image)
            if not os.path.exists(detect_save_path):
                os.makedirs(detect_save_path)
            annotation = GEN_Annotations(img_name)
            # PIL reports size as (width, height); np.shape(image) would give (height, width, channels)
            w, h = image.size
            annotation.set_size(w,h,3)
            if boxes:
                for box in boxes:
                    label,ymin,xmin,ymax,xmax = box
                    annotation.add_pic_attr(label,xmin,ymin,xmax,ymax)
            annotation_path = os.path.join(detect_save_path, img_name.split('.')[0])
            annotation.savefile("{}.xml".format(annotation_path))
            # print(img_name, 'has been semi-automatically annotated')

================================================
FILE: gen_annotation.py
================================================
from lxml import etree

class GEN_Annotations:
    def __init__(self, filename):
        self.root = etree.Element("annotation")

        child1 = etree.SubElement(self.root, "folder")
        child1.text = "VOC2007"

        child2 = etree.SubElement(self.root, "filename")
        child2.text = filename

        child3 = etree.SubElement(self.root, "source")

        child4 = etree.SubElement(child3, "annotation")
        child4.text = "PASCAL VOC2007"
        child5 = etree.SubElement(child3, "database")
        child5.text = "Unknown"

        ## child6 = etree.SubElement(child3, "image")
        ## child6.text = "flickr"
        ## child7 = etree.SubElement(child3, "flickrid")
        ## child7.text = "35435"

    def set_size(self,width,height,channel):
        size = etree.SubElement(self.root, "size")
        widthn = etree.SubElement(size, "width")
        widthn.text = str(width)
        heightn = etree.SubElement(size, "height")
        heightn.text = str(height)
        channeln = etree.SubElement(size, "depth")
        channeln.text = str(channel)

    def savefile(self,filename):
        tree = etree.ElementTree(self.root)
        tree.write(filename, pretty_print=True, xml_declaration=False, encoding='utf-8')

    def add_pic_attr(self,label,xmin,ymin,xmax,ymax):
        # Named obj rather than object to avoid shadowing the builtin
        obj = etree.SubElement(self.root, "object")
        namen = etree.SubElement(obj, "name")
        namen.text = label
        bndbox = etree.SubElement(obj, "bndbox")
        xminn = etree.SubElement(bndbox, "xmin")
        xminn.text = str(xmin)
        yminn = etree.SubElement(bndbox, "ymin")
        yminn.text = str(ymin)
        xmaxn = etree.SubElement(bndbox, "xmax")
        xmaxn.text = str(xmax)
        ymaxn = etree.SubElement(bndbox, "ymax")
        ymaxn.text = str(ymax)

if __name__ == '__main__':
    filename="000001.jpg"
    anno= GEN_Annotations(filename)
    anno.set_size(1280,720,3)
    for i in range(3):
        xmin=i+1
        ymin=i+10
        xmax=i+100
        ymax=i+100
        anno.add_pic_attr("pikachu",xmin,ymin,xmax,ymax)
    anno.savefile("00001.xml")

================================================
FILE: gesture.streamlit.py
================================================
"""Create an Object Detection Web App using PyTorch and Streamlit."""
# import libraries
from PIL import Image
from torchvision import models, transforms
import torch
import streamlit as st
from yolo import YOLO
import os
import urllib.request  # download_file() calls urllib.request.urlopen
import numpy as np
from streamlit_webrtc import webrtc_streamer, WebRtcMode, RTCConfiguration
import av

# Set the page icon
st.set_page_config(page_title='Gesture Detector', page_icon='✌',
                   layout='centered', initial_sidebar_state='expanded')

# WebRTC reads ICE servers from the "iceServers" key
RTC_CONFIGURATION = RTCConfiguration(
    {
      "iceServers": [{
        "urls": ["stun:stun.l.google.com:19302"],
        "username": "pikachu",
        "credential": "1234",
      }]
    }
)

def main():
    # Render the readme as markdown using st.markdown.
    readme_text = st.markdown(open("instructions.md",encoding='utf-8').read())

    # Once we have the dependencies, add a selector for the app mode on the sidebar.
    st.sidebar.title("What to do")
    app_mode = st.sidebar.selectbox("Choose the app mode",
        ["Show instructions", "Run the app", "Show the source code"])
    if app_mode == "Show instructions":
        st.sidebar.success('To continue select "Run the app".')
    elif app_mode == "Show the source code":
        readme_text.empty()
        st.code(open("gesture.streamlit.py",encoding='utf-8').read())
    elif app_mode == "Run the app":
        # Download external dependencies.
        for filename in EXTERNAL_DEPENDENCIES.keys():
            download_file(filename)

        readme_text.empty()
        run_the_app()

# External files to download.
EXTERNAL_DEPENDENCIES = {
    "yolov4_tiny.pth": {
        "url": "https://github.com/Kedreamix/YoloGesture/releases/download/v1.0/yolov4_tiny.pth",
        "size": 23631189
    },
    "yolov4_SE.pth": {
        "url": "https://github.com/Kedreamix/YoloGesture/releases/download/v1.0/yolov4_SE.pth",
        "size": 23806027
    },
    "yolov4_CBAM.pth":{
        "url": "https://github.com/Kedreamix/YoloGesture/releases/download/v1.0/yolov4_CBAM.pth",
        "size": 23981478
    },
    "yolov4_ECA.pth":{
        "url": "https://github.com/Kedreamix/YoloGesture/releases/download/v1.0/yolov4_ECA.pth",
        "size": 23632688
    },
    "yolov4_weights_ep150_608.pth":{
        "url": "https://github.com/Kedreamix/YoloGesture/releases/download/v1.0/yolov4_weights_ep150_608.pth",
        "size": 256423031
    },
    "yolov4_weights_ep150_416.pth":{
        "url": "https://github.com/Kedreamix/YoloGesture/releases/download/v1.0/yolov4_weights_ep150_416.pth",
        "size": 256423031
    },
}

# This file downloader demonstrates Streamlit animation.
def download_file(file_path):
    # Don't download the file twice. (If possible, verify the download using the file length.)
    if os.path.exists(file_path):
        if "size" not in EXTERNAL_DEPENDENCIES[file_path]:
            return
        elif os.path.getsize(file_path) == EXTERNAL_DEPENDENCIES[file_path]["size"]:
            return
    # print(os.path.getsize(file_path))

    # These are handles to two visual elements to animate.
    weights_warning, progress_bar = None, None
    try:
        weights_warning = st.warning("Downloading %s..." % file_path)
        progress_bar = st.progress(0)
        with open(file_path, "wb") as output_file:
            with urllib.request.urlopen(EXTERNAL_DEPENDENCIES[file_path]["url"]) as response:
                length = int(response.info()["Content-Length"])
                counter = 0.0
                MEGABYTES = 2.0 ** 20.0
                while True:
                    data = response.read(8192)
                    if not data:
                        break
                    counter += len(data)
                    output_file.write(data)

                    # We perform animation by overwriting the elements.
                    weights_warning.warning("Downloading %s... (%6.2f/%6.2f MB)" %
                        (file_path, counter / MEGABYTES, length / MEGABYTES))
                    progress_bar.progress(min(counter / length, 1.0))
    except Exception as e:
        print(e)
    # Finally, we remove these visual elements by calling .empty().
    finally:
        if weights_warning is not None:
            weights_warning.empty()
        if progress_bar is not None:
            progress_bar.empty()

# This is the main app itself, which appears when the user selects "Run the app".
def run_the_app():
    class Config():
        def __init__(self, weights = 'yolov4_tiny.pth', tiny = True, phi = 0, shape = 416,nms_iou = 0.3, confidence = 0.5):
            self.weights = weights
            self.tiny = tiny
            self.phi = phi
            self.cuda = False
            self.shape = shape
            self.confidence = confidence
            self.nms_iou = nms_iou

    # set title of app
    st.markdown('

✌ Gesture Detection

', unsafe_allow_html=True)
    st.sidebar.markdown("# Gesture Detection on?")
    activities = ["Example","Image", "Camera", "FPS", "Heatmap","Real Time", "Video"]
    choice = st.sidebar.selectbox("Choose among the given options:", activities)
    phi = st.sidebar.selectbox("Attention mode used by yolov4-tiny:",('0tiny','1SE','2CBAM','3ECA'))
    print("")

    tiny = st.sidebar.checkbox('Use the yolov4 tiny model')
    if not tiny:
        shape = st.sidebar.selectbox("Choose shape to Input:", [416,608])
    conf,nms = object_detector_ui()

    @st.cache
    def get_yolo(tiny,phi,conf,nms,shape=416):
        weights = 'yolov4_tiny.pth'
        if tiny:
            if phi == '0tiny':
                weights = 'yolov4_tiny.pth'
            elif phi == '1SE':
                weights = 'yolov4_SE.pth'
            elif phi == '2CBAM':
                weights = 'yolov4_CBAM.pth'
            elif phi == '3ECA':
                weights = 'yolov4_ECA.pth'
        else:
            if shape == 608:
                weights = 'yolov4_weights_ep150_608.pth'
            elif shape == 416:
                weights = 'yolov4_weights_ep150_416.pth'
        # The leading digit of phi selects the attention type
        opt = Config(weights = weights, tiny = tiny , phi = int(phi[0]), shape = shape,nms_iou = nms, confidence = conf)
        yolo = YOLO(opt)
        return yolo

    if tiny:
        yolo = get_yolo(tiny, phi, conf, nms)
        st.write("YOLOv4 tiny model loaded")
    else:
        yolo = get_yolo(tiny, phi, conf, nms, shape)
        st.write("YOLOv4 model loaded")

    if choice == 'Image':
        detect_image(yolo)
    elif choice =='Camera':
        detect_camera(yolo)
    elif choice == 'FPS':
        detect_fps(yolo)
    elif choice == "Heatmap":
        detect_heatmap(yolo)
    elif choice == "Example":
        detect_example(yolo)
    elif choice == "Real Time":
        detect_realtime(yolo)
    elif choice == "Video":
        detect_video(yolo)

# This sidebar UI lets the user select parameters for the YOLO object detector.
def object_detector_ui():
    st.sidebar.markdown("# Model")
    confidence_threshold = st.sidebar.slider("Confidence threshold", 0.0, 1.0, 0.5, 0.01)
    overlap_threshold = st.sidebar.slider("Overlap threshold", 0.0, 1.0, 0.3, 0.01)
    return confidence_threshold, overlap_threshold

def predict(image,yolo):
    """Return predictions.
    Parameters
    ----------
    :param image: uploaded image
    :type image: jpg
    :rtype: list
    :return: none
    """
    crop = False
    count = False
    try:
        # image = Image.open(image)
        r_image = yolo.detect_image(image, crop = crop, count=count)
        transform = transforms.Compose([transforms.ToTensor()])
        result = transform(r_image)
        st.image(result.permute(1,2,0).numpy(), caption = 'Processed Image.', use_column_width = True)
    except Exception as e:
        print(e)

def fps(image,yolo):
    test_interval = 50
    tact_time = yolo.get_FPS(image, test_interval)
    st.write(str(tact_time) + ' seconds, ', str(1/tact_time),'FPS, @batch_size 1')
    return tact_time
    # print(str(tact_time) + ' seconds, ' + str(1/tact_time) + 'FPS, @batch_size 1')

def detect_image(yolo):
    # enable users to upload images for the model to make predictions
    file_up = st.file_uploader("Upload an image", type = ["jpg","png","jpeg"])
    classes = ["up","down","left","right","front","back","clockwise","anticlockwise"]
    class_to_idx = {cls: idx for (idx, cls) in enumerate(classes)}
    st.sidebar.markdown("See the model performance and play with it")
    if file_up is not None:
        with st.spinner(text='Preparing Image'):
            # display image that user uploaded
            image = Image.open(file_up)
            st.image(image, caption = 'Uploaded Image.', use_column_width = True)
            st.balloons()
        detect = st.button("Start detection: Image")
        if detect:
            st.write("")
            st.write("Just a second ...")
            predict(image,yolo)
            st.balloons()

def detect_camera(yolo):
    picture = st.camera_input("Take a picture")
    if picture:
        filters_to_funcs = {
            "No filter": predict,
            "Heatmap": heatmap,
            "FPS": fps,
        }
        filters = st.selectbox("...and now, apply a filter!", filters_to_funcs.keys())
        image = Image.open(picture)
        with st.spinner(text='Preparing Image'):
            filters_to_funcs[filters](image,yolo)
        st.balloons()

def detect_fps(yolo):
    file_up = st.file_uploader("Upload an image", type = ["jpg","png","jpeg"])
    classes = ["up","down","left","right","front","back","clockwise","anticlockwise"]
    class_to_idx = {cls: idx for (idx, cls) in enumerate(classes)}
    st.sidebar.markdown("See the model performance and play with it")
    if file_up is not None:
        # display image that user uploaded
        image = Image.open(file_up)
        st.image(image, caption = 'Uploaded Image.', use_column_width = True)
        st.balloons()
        detect = st.button("Start detection: FPS")
        if detect:
            with st.spinner(text='Preparing Image'):
                st.write("")
                st.write("Just a second ...")
                tact_time = fps(image,yolo)
                # st.write(str(tact_time) + ' seconds, ', str(1/tact_time),'FPS, @batch_size 1')
            st.balloons()

def heatmap(image,yolo):
    heatmap_save_path = "heatmap_vision.png"
    yolo.detect_heatmap(image, heatmap_save_path)
    img = Image.open(heatmap_save_path)
    transform = transforms.Compose([transforms.ToTensor()])
    result = transform(img)
    st.image(result.permute(1,2,0).numpy(), caption = 'Processed Image.', use_column_width = True)

def detect_heatmap(yolo):
    file_up = st.file_uploader("Upload an image", type = ["jpg","png","jpeg"])
    classes = ["up","down","left","right","front","back","clockwise","anticlockwise"]
    class_to_idx = {cls: idx for (idx, cls) in enumerate(classes)}
    st.sidebar.markdown("See the model performance and play with it")
    if file_up is not None:
        # display image that user uploaded
        image = Image.open(file_up)
        st.image(image, caption = 'Uploaded Image.', use_column_width = True)
        st.balloons()
        detect = st.button("Start detection: heatmap")
        if detect:
            with st.spinner(text='Preparing Heatmap'):
                st.write("")
                st.write("Just a second ...")
                heatmap(image,yolo)
            st.balloons()

def detect_example(yolo):
    st.sidebar.title("Choose an image as an example")
    images = os.listdir('./img')
    images.sort()
    image = st.sidebar.selectbox("Image Name", images)
    st.sidebar.markdown("See the model performance and play with it")
    image = Image.open(os.path.join('img',image))
    st.image(image, caption = 'Choose Image.', use_column_width = True)
    st.balloons()
    detect = st.button("Start detection: Image")
    if detect:
        st.write("")
        st.write("Just a second ...")
        predict(image,yolo)
        st.balloons()

def detect_realtime(yolo):
    class VideoProcessor:
        def recv(self, frame):
            # Decode as RGB so the PIL image handed to the detector has the
            # same channel order that detect_video() feeds it
            img = frame.to_ndarray(format="rgb24")
            img = Image.fromarray(img)
            crop = False
            count = False
            r_image = yolo.detect_image(img, crop = crop, count=count)
            transform = transforms.Compose([transforms.ToTensor()])
            result = transform(r_image)
            result = result.permute(1,2,0).numpy()
            result = (result * 255).astype(np.uint8)
            return av.VideoFrame.from_ndarray(result, format="rgb24")

    webrtc_ctx = webrtc_streamer(
        key="example",
        mode=WebRtcMode.SENDRECV,
        rtc_configuration=RTC_CONFIGURATION,
        media_stream_constraints={"video": True, "audio": False},
        async_processing=False,
        video_processor_factory=VideoProcessor
    )

import cv2
import time

def detect_video(yolo):
    file_up = st.file_uploader("Upload a video", type = ["mp4"])
    print(file_up)
    classes = ["up","down","left","right","front","back","clockwise","anticlockwise"]
    if file_up is not None:
        video_path = 'video.mp4'
        st.video(file_up)
        with open(video_path, 'wb') as f:
            f.write(file_up.read())
        detect = st.button("Start detection: Video")
        if detect:
            video_save_path = 'video2.mp4'
            # open the video that the user uploaded
            capture = cv2.VideoCapture(video_path)
            video_fps = st.slider("Video FPS", 5, 30, int(capture.get(cv2.CAP_PROP_FPS)), 1)
            fourcc = cv2.VideoWriter_fourcc(*'XVID')
            size = (int(capture.get(cv2.CAP_PROP_FRAME_WIDTH)), int(capture.get(cv2.CAP_PROP_FRAME_HEIGHT)))
            out = cv2.VideoWriter(video_save_path, fourcc, video_fps, size)
            while(True):
                # Read one frame
                ref, frame = capture.read()
                if not ref:
                    break
                # Convert to Image
                # frame = Image.fromarray(np.uint8(frame))
                # Convert format, BGR to RGB
                frame = cv2.cvtColor(frame,cv2.COLOR_BGR2RGB)
                # Convert to Image
                frame = Image.fromarray(np.uint8(frame))
                # Run detection
                frame = np.array(yolo.detect_image(frame))
                # RGB to BGR to match the OpenCV format
                frame = cv2.cvtColor(frame,cv2.COLOR_RGB2BGR)

                # print("fps= %.2f"%(fps))
                # frame = cv2.putText(frame, "fps= %.2f"%(fps), (0, 40), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)
                out.write(frame)

            out.release()
            capture.release()
            print("Save processed video to the path :" + video_save_path)
            with open(video_save_path, "rb") as file:
                btn = st.download_button(
                    label="Download Video",
                    data=file,
                    file_name="video.mp4",
                )
            st.balloons()

if __name__ == "__main__":
    main()

================================================
FILE: get_map.py
================================================
import os
import xml.etree.ElementTree as ET

from PIL import Image
from tqdm import tqdm
import yaml
from utils.utils import get_classes
from utils.utils_map import get_coco_map, get_map
from yolo import YOLO
from get_yaml import get_config
import argparse

if __name__ == "__main__":
    '''
    Unlike AP, Recall and Precision are not area-based measures; they take different values at different confidence thresholds.
    The Recall and Precision reported in the mAP results are the values at a confidence threshold of 0.5.

    The txt files in ./map_out/detection-results/ contain more boxes than a direct predict run because the threshold used here is low,
    so that Recall and Precision can be computed at different thresholds and the mAP derived from them.
    '''
    parser = argparse.ArgumentParser()
    parser.add_argument('--weights',type=str,default='model_data/yolotiny_SE_ep100.pth',help='initial weights path')
    parser.add_argument('--tiny',action='store_true',help='use the yolotiny model')
    parser.add_argument('--phi',type=int,default=1,help='attention mechanism type for yolov4tiny')
    parser.add_argument('--mode',type=int,default=0,help='get_map mode')
    parser.add_argument('--cuda',action='store_true',help='whether to use the GPU')
    parser.add_argument('--shape',type=int,default=416,help='input image shape')
    parser.add_argument('--confidence',type=float,default=0.5,help='only predicted boxes with a score above this confidence are kept')
    parser.add_argument('--nms_iou',type=float,default=0.3,help='IoU threshold used for non-maximum suppression')
    opt = parser.parse_args()
    print(opt)

    # Configuration file
    config = get_config()
    #------------------------------------------------------------------------------------------------------------------#
    #   map_mode selects what this script computes when it runs
    #   map_mode 0: the full mAP pipeline (get predictions, get ground-truth boxes, compute VOC_map).
    #   map_mode 1: only get the predictions.
    #   map_mode 2: only get the ground-truth boxes.
    #   map_mode 3: only compute VOC_map.
    #   map_mode 4: compute the 0.50:0.95 mAP of the current dataset with the COCO toolbox. Requires the predictions and ground-truth boxes to exist and pycocotools to be installed
    #-------------------------------------------------------------------------------------------------------------------#
    map_mode = opt.mode
    #-------------------------------------------------------#
    #   MINOVERLAP sets which mAP@0.x to compute
    #   e.g. for mAP@0.75, set MINOVERLAP = 0.75.
    #-------------------------------------------------------#
    MINOVERLAP = 0.5
    #-------------------------------------------------------#
    #   map_vis toggles visualization of the VOC_map computation
    #-------------------------------------------------------#
    map_vis = False
    #-------------------------------------------------------#
    #   Path to the folder containing the VOC dataset
    #   Defaults to the VOC dataset in the repository root
    #-------------------------------------------------------#
    VOCdevkit_path = 'VOCdevkit'
    #-------------------------------------------------------#
    #   Output folder for results, map_out by default
    #-------------------------------------------------------#
    map_out_path = 'map_out'

    image_ids = open(os.path.join(VOCdevkit_path, "VOC2007/ImageSets/Main/val.txt")).read().strip().split()

    if not os.path.exists(map_out_path):
        os.makedirs(map_out_path)
    if not os.path.exists(os.path.join(map_out_path, 'ground-truth')):
        os.makedirs(os.path.join(map_out_path, 'ground-truth'))
    if not os.path.exists(os.path.join(map_out_path, 'detection-results')):
        os.makedirs(os.path.join(map_out_path, 'detection-results'))
    if not os.path.exists(os.path.join(map_out_path, 'images-optional')):
        os.makedirs(os.path.join(map_out_path, 'images-optional'))

    class_names = config['classes']

    if map_mode == 0 or map_mode == 1:
        print("Load model.")
        yolo = YOLO(opt, confidence = 0.001, nms_iou = 0.5)
        print("Load model done.")

        print("Get predict result.")
        for image_id in tqdm(image_ids):
            image_path = os.path.join(VOCdevkit_path, "VOC2007/JPEGImages/"+image_id+".jpg")
            image = Image.open(image_path)
            if map_vis:
                image.save(os.path.join(map_out_path, "images-optional/" + image_id + ".jpg"))
            yolo.get_map_txt(image_id, image, class_names, map_out_path)
        print("Get predict result done.")

    if map_mode == 0 or map_mode == 2:
        print("Get ground truth result.")
        for image_id in tqdm(image_ids):
            with open(os.path.join(map_out_path, "ground-truth/"+image_id+".txt"), "w") as new_f:
                root = ET.parse(os.path.join(VOCdevkit_path, "VOC2007/Annotations/"+image_id+".xml")).getroot()
                for obj in root.findall('object'):
                    difficult_flag = False
                    if obj.find('difficult') is not None:
                        difficult = obj.find('difficult').text
                        if int(difficult)==1:
                            difficult_flag = True
                    obj_name = obj.find('name').text
                    if obj_name not in class_names:
                        continue
                    bndbox = obj.find('bndbox')
                    left = bndbox.find('xmin').text
                    top = bndbox.find('ymin').text
                    right = bndbox.find('xmax').text
                    bottom = bndbox.find('ymax').text

                    if difficult_flag:
                        new_f.write("%s %s %s %s %s difficult\n" % (obj_name, left, top, right, bottom))
                    else:
                        new_f.write("%s %s %s %s %s\n" % (obj_name, left, top, right, bottom))
        print("Get ground truth result done.")

    if map_mode == 0 or map_mode == 3:
        print("Get map.")
        get_map(MINOVERLAP, True, path = map_out_path)
        print("Get map done.")

    if map_mode == 4:
        print("Get map.")
        get_coco_map(class_names = class_names, path = map_out_path)
        print("Get map done.")

================================================
FILE: get_yaml.py
================================================
import os
import sys
import yaml

def get_config():
    yaml_path = 'model_data/gesture.yaml'
    # The with block closes the file even if parsing fails
    with open(yaml_path,'r',encoding='utf-8') as f:
        config = yaml.load(f,Loader =yaml.FullLoader)
    return config

if __name__ == "__main__":
    config = get_config()
    print(config)

================================================
FILE: instructions.md
================================================
# ✌ Gesture Detection

This is a gesture-recognition control system based on UAV vision images; the YOLOv4 model was chosen for training.

**YOLOv4 = CSPDarknet53 (backbone) + SPP additional module (neck) + PANet path aggregation (neck) + YOLOv3 (head)**

![img](https://pdf.cdn.readpaper.com/parsed/fetch_target/699143cdb334ecfc63caf8192472490c_0_Figure_1.png)

================================================
FILE: kmeans_for_anchors.py
================================================
#-------------------------------------------------------------------------------------------------------#
#   k-means clusters the boxes in the dataset, but in many datasets the boxes are similar in size,
#   so the 9 clustered anchors differ very little, which actually hurts training.
#   Different feature layers suit anchors of different sizes: the smaller the feature map, the larger the anchors it suits.
#   The original network's anchors are already distributed across large, medium and small scales,
#   so good results are possible without clustering.
#-------------------------------------------------------------------------------------------------------#
import glob
import xml.etree.ElementTree as ET

import matplotlib.pyplot as plt
import numpy as np
from tqdm import tqdm


def cas_iou(box, cluster):
    x = np.minimum(cluster[:, 0], box[0])
    y = np.minimum(cluster[:, 1], box[1])

    intersection = x * y
    area1 = box[0] * box[1]

    area2 = cluster[:,0] * cluster[:,1]
    iou = intersection / (area1 + area2 - intersection)

    return iou

def avg_iou(box, cluster):
    return np.mean([np.max(cas_iou(box[i], cluster)) for i in range(box.shape[0])])

def kmeans(box, k):
    #-------------------------------------------------------------#
    #   Total number of boxes
    #-------------------------------------------------------------#
    row = box.shape[0]

    #-------------------------------------------------------------#
    #   Distance from each box to every cluster center
    #-------------------------------------------------------------#
    distance = np.empty((row, k))

    #-------------------------------------------------------------#
    #   Cluster assignment from the previous iteration
    #-------------------------------------------------------------#
    last_clu = np.zeros((row, ))

    np.random.seed()

    #-------------------------------------------------------------#
    #   Randomly pick k boxes as the initial cluster centers
    #-------------------------------------------------------------#
    cluster = box[np.random.choice(row, k, replace = False)]

    iter = 0
    while True:
        #-------------------------------------------------------------#
        #   Distance (1 - IoU) between each box and the cluster centers
        #-------------------------------------------------------------#
        for i in range(row):
            distance[i] = 1 - cas_iou(box[i], cluster)

        #-------------------------------------------------------------#
        #   Pick the nearest cluster for each box
        #-------------------------------------------------------------#
        near = np.argmin(distance, axis=1)

        if (last_clu == near).all():
            break
        #-------------------------------------------------------------#
        #   Take the median box of each cluster
        #-------------------------------------------------------------#
        for j in range(k):
            cluster[j] = np.median(
                box[near == j],axis=0)

        last_clu = near
        if iter % 5 == 0:
            print('iter: {:d}. avg_iou:{:.2f}'.format(iter, avg_iou(box, cluster)))
        iter += 1

    return cluster, near

def load_data(path):
    data = []
    #-------------------------------------------------------------#
    #   Find the boxes in every xml file
    #-------------------------------------------------------------#
    for xml_file in tqdm(glob.glob('{}/*xml'.format(path))):
        tree = ET.parse(xml_file)
        height = int(tree.findtext('./size/height'))
        width = int(tree.findtext('./size/width'))
        if height<=0 or width<=0:
            continue

        #-------------------------------------------------------------#
        #   Get the width and height of every object
        #-------------------------------------------------------------#
        for obj in tree.iter('object'):
            xmin = int(float(obj.findtext('bndbox/xmin'))) / width
            ymin = int(float(obj.findtext('bndbox/ymin'))) / height
            xmax = int(float(obj.findtext('bndbox/xmax'))) / width
            ymax = int(float(obj.findtext('bndbox/ymax'))) / height

            xmin = np.float64(xmin)
            ymin = np.float64(ymin)
            xmax = np.float64(xmax)
            ymax = np.float64(ymax)
            # Width and height as fractions of the image size
            data.append([xmax - xmin, ymax - ymin])
    return np.array(data)

if __name__ == '__main__':
    np.random.seed(0)
    #-------------------------------------------------------------#
    #   Running this script processes the xml files in
    #   './VOCdevkit/VOC2007/Annotations' and generates yolo_anchors.txt
    #-------------------------------------------------------------#
    input_shape = [224, 224]
    anchors_num = 9
    #-------------------------------------------------------------#
    #   Load the dataset; VOC xml files can be used
    #-------------------------------------------------------------#
    path = 'VOCdevkit/VOC2007/Annotations'

    #-------------------------------------------------------------#
    #   Load all xml files
    #   Boxes are stored as width, height normalized to ratios
    #-------------------------------------------------------------#
    print('Load xmls.')
    data = load_data(path)
    print('Load xmls done.')
    #-------------------------------------------------------------#
    #   Run k-means clustering
    #-------------------------------------------------------------#
    print('K-means boxes.')
    cluster, near = kmeans(data, anchors_num)
    print('K-means boxes done.')
    data = data * np.array([input_shape[1], input_shape[0]])
    cluster = cluster * np.array([input_shape[1], input_shape[0]])

    #-------------------------------------------------------------#
    #   Plot
    #-------------------------------------------------------------#
    for j in range(anchors_num):
        plt.scatter(data[near == j][:,0], data[near == j][:,1])
        plt.scatter(cluster[j][0], cluster[j][1], marker='x', c='black')
    plt.savefig("kmeans_for_anchors.jpg")
    plt.show()
    print('Save kmeans_for_anchors.jpg in root dir.')

    # Sort anchors by area, smallest first
    cluster = cluster[np.argsort(cluster[:, 0] * cluster[:, 1])]
    print('avg_ratio:{:.2f}'.format(avg_iou(data, cluster)))
    print(cluster)

    f = open("yolo_anchors.txt", 'w')
    row = np.shape(cluster)[0]
    for i in range(row):
        if i == 0:
            x_y = "%d,%d" % (cluster[i][0], cluster[i][1])
        else:
            x_y = ", %d,%d" % (cluster[i][0], cluster[i][1])
        f.write(x_y)
    f.close()

================================================
FILE: logs/README.md
================================================
Used to store trained files

================================================
FILE: logs/gesture_loss_2021_11_14_22_04_00/epoch_loss_2021_11_14_22_04_00.txt
================================================
390.34399642473386 21.87092101721116 14.030741856421953 11.276778338867942 9.814540598127577 8.89100271978496 8.609104168267898 7.924442773983802 7.723959027984996 7.2670195367601185 7.255199196897907 6.893556188654016 6.661026071619104 6.5294443972316785 6.535371827490536 6.529178083678822 6.403998654565694 6.439444012112087 6.092924733220795 5.926193254965323 5.9576384785734575 5.8119951972255 5.6878520206168846 5.819804650765878 5.707105348139633 5.458082881974585 5.665041320117903 5.317585485952872 5.349038653903538 5.283199619363855 5.064980445084749 5.070186079284291 4.9681971073150635
4.793072164794545 4.973145805759194 4.918124354915855 4.663256362632469 4.837633197690233 4.743683688434554 4.616998254516979 4.524823586146037 4.4209345593864535 4.558955289699413 4.333138801433422 4.426347941528132 4.412103137852233 4.295655697952082 4.364107617625484 4.211893027211413 4.084111590444306 4.018801480163763 3.8647101366961443 3.734398615581018 3.6937082122873375 3.793811333032302 3.429193678093545 3.6194330038111886 3.4087822738988898 3.331124112193967 3.3782305434162234 3.3561593158009613 3.25705443598606 3.2106575075490973 3.0107549484129303 3.0536143231539077 2.9674469438599953 3.1300665189822516 2.909559675204901 2.9446099194479576 2.8209132660686236 2.8917798992292383 2.815192371238897 2.861111617566627 2.9016677490722986 2.8193857658792427 2.8216423440126723 2.777715330874478 2.725730179820532 2.589312877184079 2.670389473438263 2.626439411331106 2.57100960759469 2.6649326178026786 2.449705180930503 2.6089335954115715 2.666015229291386 2.5139025822281837 2.4510488511971484 2.60918134248551 2.615589211384455 2.4221341083815067 2.5034887735490448 2.3411180855315408 2.3742799654970934 2.4252039420383946 2.5134657593788923 2.5887757239886273 2.5031773506859203 2.3927585335425388 2.4924555529414874 2.3816184005987497 2.3525361067351 2.35756847280779 2.4606370890030154 2.262793848084079 2.283497501026701 2.2522216586419095 2.3806339068177307 2.345363718767961 2.305632569723659 2.1932848855669116 2.332635486199532 2.2705356725204138 2.233249652992796 2.4728508678115446 2.3142452859952125 2.3585592800820314 2.335805359078042 2.337391757118849 2.391327069129473 2.3404054016242792 2.3145943543425314 2.196398460570677 2.2358641638248056 2.3038836038774915 2.2790947368851415 2.2812541202630525 2.2533860233278924 2.3108025224488458 2.2092323683110284 2.308551702050515 2.2422945557369127 2.1741022714126257 2.44105933992951 2.3797168718811905 2.231722431326354 2.3973163276174922 2.1568032256615015 2.239097781571341 2.2258979082107544 2.1682290563612807 
2.2031694714117935 2.2706658139272973 2.329095835854978 2.255610410262037 2.2977319957665454 2.3046101513836117 2.249893919369321 2.2964354607242123 2.315463280696192 ================================================ FILE: logs/gesture_loss_2021_11_14_22_04_00/epoch_val_loss_2021_11_14_22_04_00.txt ================================================ 28.558996200561523 15.032766554090712 11.545120133293999 9.72215329276191 8.58862935172187 8.486469162835014 7.804132832421197 7.238262918260363 6.890773402320014 6.530833350287543 6.475247330135769 6.4751937124464245 6.239521026611328 6.0489738782246905 6.12673372692532 5.641317420535618 6.040707217322455 5.724527147081163 5.265863656997681 5.316834555731879 5.4665877024332685 5.622564209832086 5.04600026872423 5.060362259546916 5.527375910017225 5.435662375556098 5.021538707945082 5.028834872775608 4.896508720186022 4.989696582158406 5.161070320341322 5.098267449273004 4.707995070351495 4.600137048297459 4.426739745669895 4.481476042005751 4.555791060129802 4.693203316794501 4.515556865268284 4.371145274904039 4.138098372353448 4.548380348417494 4.3106510109371605 4.320602138837178 4.131023804346721 4.0555612511105 4.217087030410767 4.128190358479817 4.032541698879665 3.99964001443651 3.741890834437476 3.749820719162623 3.6366468982564077 3.5657983157369824 3.9311270780033536 3.6530382368299694 4.012030104796092 3.8975768751568265 3.764561494191488 3.4476174149248333 3.535598119099935 3.998010264502631 3.88807831870185 3.810675323009491 3.8832875225279064 3.532531124022272 3.9232571257485285 3.58525949716568 3.7238865759637623 3.7168162133958607 3.503431843386756 3.5310314959949918 3.7993387116326227 3.5516341394848294 3.6795931648876934 3.564246873060862 3.484692699379391 3.7236365245448217 3.7466657956441245 3.66163033246994 3.751209259033203 3.6696145402060614 3.5883768465783863 3.853155712286631 3.4928252498308816 3.602889382176929 3.7287648055288525 3.6207654832137957 3.610999337500996 3.8127831634547977 
3.6820534533924527 3.716387847231494 3.6561857561270394 3.703249845239851 3.686804783013132 3.687538597318861 3.8072550859716205 3.6593143989642463 3.707283900843726 3.7246316257450314 3.8617856800556183 3.573318580786387 3.531035871969329 3.6177483134799533 3.6122085054715476 3.5437003208531275 3.5555910716454187 3.6909723381201425 3.5987775524457297 3.646808198756642 3.6476809779802957 3.615621048543188 3.8375576469633312 3.7161912678016558 3.694040416015519 3.677286409669452 3.6777902278635235 3.7830483151806726 3.707444575097826 3.7904206779268055 3.5872142712275186 3.6864367392328052 3.7757607218292026 3.835707320107354 3.6799587168627315 3.8233094347847834 3.6921923756599426 3.7244893974728055 3.6797771288288965 3.711515542533663 3.8481360466943846 3.8577410876750946 3.710074722766876 3.8249045742882624 3.7864705423514047 3.6575047771135965 3.8352384832170276 3.7801570263173847 3.7013448344336615 3.6655967930952706 3.657223959763845 3.722360614273283 3.772919843594233 3.7007322708765664 3.7042017413510218 3.8934470083978443 3.8964318566852145 3.6877589921156564 3.713595751259062 3.597744878795412 ================================================ FILE: logs/loss_2022_04_27_08_48_16/epoch_loss.txt ================================================ 4.311199968511408 2.641528855670582 1.0470811074430293 0.3173784383318641 0.1660231321372769 0.12659757448868317 0.11646865105087106 0.1186594499105757 0.11129742149602283 0.09524408660151741 0.09781679036942395 0.09211275726556778 0.08542741784317927 0.08707698925652287 0.08003932000561194 0.09124952453103932 0.07743281058289787 0.07542280463332479 0.062316759235479614 0.07161653380502354 0.06821535866368901 0.07083209519359199 0.07460641437633471 0.07450477220118046 0.06487809985198757 0.050884095443920654 0.07091375355693427 0.06433163752610033 0.0656029749661684 0.05935167453505776 0.06459851512177424 0.06376675008372827 0.05718133259903301 0.05716039274226536 0.05911739483814348 0.05761603875593706 
0.051265862939709965 0.047803171148354355 0.0480937244031917 0.05439905263483524 0.058482232080264526 0.05515999550169164 0.049258994361893696 0.050817748277702114 0.05204320927573876 0.04787483066320419 0.050909879194064575 0.04848571375689723 0.050943345593457874 0.04928677469830622 0.05230807525416215 0.054047910206847724 0.04724785503413942 0.04339685816731718 0.04393725813262993 0.04542147194345792 0.046219487115740775 0.04159199959701962 0.0356766721026765 0.0347878428383006 0.0335447210404608 0.03512532735864322 0.032664823532104495 0.035281008275018795 0.027731664727131525 0.03222298233045472 0.03146794889536169 0.02836602210170693 0.028307923198574118 0.027572717414134078 0.026898101448184913 0.029324432876374987 0.02880634083929989 0.024556251760158274 0.027897736864785354 0.024288477210534943 0.022848848750193915 0.023355903372996385 0.02707639779481623 0.022250585506359735 0.025191593791047732 0.022139282586673897 0.02378465121404992 0.02341305265824 0.02176100810368856 0.025529090170231132 0.023221762292087077 0.02107305938584937 0.019723483237127463 0.027768902087377176 0.023790666233334277 0.02183559000906017 0.019348353561427858 0.021541342077155908 0.020851219362682766 0.01955224501176013 0.02228688634932041 0.018856989074912338 0.01816959279692835 0.024754421909650166 ================================================ FILE: logs/loss_2022_04_27_08_48_16/epoch_val_loss.txt ================================================ 3.5736865997314453 1.7812694907188416 0.5147329270839691 0.15201690793037415 0.10024188458919525 0.08380990475416183 0.07576803863048553 0.06853799521923065 0.06467496231198311 0.060902709141373634 0.05481202341616154 0.05164487101137638 0.046625690534710884 0.046081338077783585 0.04508414678275585 0.046726442873477936 0.041066285222768784 0.039722129702568054 0.0392248947173357 0.04033488966524601 0.03738676756620407 0.0356711745262146 0.03774934820830822 0.035463595762848854 0.03278419189155102 0.03250573016703129 
0.03182028792798519 0.031694755889475346 0.03182463627308607 0.028715165331959724 0.03064714837819338 0.028574727475643158 0.031066023744642735 0.028762156143784523 0.027465523220598698 0.02787941414862871 0.02755015157163143 0.02802269347012043 0.028581750579178333 0.026334763504564762 0.026825452223420143 0.02670316770672798 0.02603335492312908 0.025488858111202717 0.027477828785777092 0.02550355065613985 0.026508965529501438 0.02424653246998787 0.02420251350849867 0.024741491302847862 0.03815543949604035 0.024845311418175697 0.024306144565343857 0.02493119016289711 0.024438758194446564 0.021836227178573607 0.022118838876485823 0.02276018038392067 0.019801595807075502 0.018804560229182244 0.01913141254335642 0.018066196143627165 0.018252668902277946 0.017480477318167688 0.016695075295865537 0.018235534615814685 0.016669700480997564 0.01745656579732895 0.01661595106124878 0.014982381090521812 0.014259136654436589 0.01617119237780571 0.01583776492625475 0.015838896110653877 0.015466723032295704 0.014705226197838784 0.014486565068364144 0.0142423365265131 0.013639062829315662 0.013229098543524742 0.013664134219288826 0.014067459665238858 0.014119291864335536 0.014162952080368996 0.014096969552338124 0.014010479114949704 0.013855390436947345 0.01369147039949894 0.013611100800335407 0.013387569226324558 0.013233654387295245 0.013060701824724675 0.01311743687838316 0.013459368608891964 0.013417618162930012 0.013188641518354416 0.013131854124367237 0.013138605654239655 0.013040048442780972 0.013191545940935611 ================================================ FILE: logs/loss_2022_04_27_10_38_48/epoch_loss.txt ================================================ 4.417048931121826 2.7174118811433967 1.0889532132582231 0.3425311154939912 0.17422378638928587 0.13641497018662366 0.11632075736468489 0.11424875665794719 0.10951222343878313 0.11042191968722777 0.0965666960586201 0.09156128205358982 0.09250037236647173 0.09282402846623551 0.08625757846642625 0.07673129354688255 
0.07389622215520252 0.07624811069531874 0.08134209279986945 0.08268712799657475 0.06569299051030116 0.06593379310586235 0.07313475605439056 0.06932794980027458 0.07105197571218014 0.05761696923185478 0.05699523843147538 0.05502087775279175 0.056425975635647774 0.060862130570140754 0.05275308594784953 0.05468131161548875 0.06639060936868191 0.0586402067406611 0.05531726946884936 0.05826686415821314 0.05614634239199487 0.060194396329197014 0.056169633330269295 0.05521787144243717 0.05759791826659983 0.06400778830390084 0.048669698648154736 0.05138815820894458 0.05391152406280691 0.048903680660507896 0.05098097136413509 0.046242827380245384 0.05179907051338391 0.0525860372422771 0.05424936364094416 0.049993348659740554 0.04597619854741626 0.04917745155592759 0.05255601741373539 0.04698830768465996 0.041387100517749784 0.04129959721532133 0.04556649559073978 0.036499715513653226 0.03981801929573218 0.04143420826229784 0.03435336612164974 0.03496221779949135 0.03109016865491867 0.03035914318429099 0.029583082410196464 0.03257722655932108 0.030363482443822754 0.027382713970210817 0.03354052487346861 0.02999182954016659 0.027540474219454658 0.03399232141673565 0.027007617097761897 0.025914737520118556 0.0295799125606815 0.02715012611200412 0.025495433765980933 0.0296443536463711 0.023164296481344434 0.025637096497747633 0.024675296164221233 0.02778547273741828 0.021970178662902778 0.023107113461527558 0.024780070698923535 0.022441018600430754 0.023930547055270937 0.0282184108470877 0.023034340888261794 0.024948879559006956 0.021047428602145778 0.019247366736332577 0.019984866658018696 0.02513700392511156 0.02460642974409792 0.0241888129669759 0.024461371141175428 0.023433364638023906 ================================================ FILE: logs/loss_2022_04_27_10_38_48/epoch_val_loss.txt ================================================ 3.682404637336731 1.8932517766952515 0.5478550791740417 0.1596439927816391 0.1100359559059143 0.0877840518951416 0.07812783867120743 
0.07114855200052261 0.06861080229282379 0.059281766414642334 0.057694293558597565 0.051728978753089905 0.052549805492162704 0.04606110043823719 0.04738330654799938 0.04431380145251751 0.04233948327600956 0.04040302708745003 0.038821205496788025 0.0383895430713892 0.03584542125463486 0.03636615164577961 0.03440128639340401 0.031500913202762604 0.03160226531326771 0.03259335644543171 0.03182834479957819 0.03255347441881895 0.03205320052802563 0.03115831222385168 0.030962957069277763 0.03099967911839485 0.028362704440951347 0.029792566783726215 0.029385950416326523 0.028081808239221573 0.02900168113410473 0.028213596902787685 0.026003092527389526 0.029015707783401012 0.027079648338258266 0.02746042888611555 0.026224803179502487 0.02623423095792532 0.026428623124957085 0.025775899179279804 0.025982394814491272 0.02434847690165043 0.027825096622109413 0.026163294911384583 0.029283170774579047 0.025315795838832856 0.027043038606643678 0.028298694640398026 0.024901207908987998 0.021958087757229804 0.02251458093523979 0.022333519905805586 0.021478286758065224 0.021176514402031898 0.018941503018140793 0.019572099670767784 0.018108497187495232 0.018086655251681804 0.017889507673680784 0.01727491766214371 0.01810304317623377 0.020134907588362692 0.018655003793537617 0.018117578141391276 0.017840097844600677 0.01779591590166092 0.016621771082282067 0.017149972915649413 0.016952383518218993 0.015586855821311474 0.01567951999604702 0.0161365307867527 0.01567267570644617 0.01678410042077303 0.015898118540644646 0.01655469797551632 0.015443072095513344 0.015269587188959122 0.015318373404443263 0.015480193309485912 0.015252745896577834 0.015485197678208351 0.01524040475487709 0.015235877968370915 0.015190575830638408 0.01506870575249195 0.015268886275589467 0.015318392775952816 0.015248116478323937 0.01509730275720358 0.015357919968664646 0.015471475012600423 0.015338210947811603 0.015286244638264179 ================================================ FILE: 
logs/loss_2022_04_27_12_50_47/epoch_loss.txt ================================================ 4.458093025467613 2.7262558070096103 1.0888537033037706 0.3306311368942261 0.1712129498747262 0.12332972951910713 0.1077601161192764 0.10889687660065564 0.10751076347448608 0.09971555254676125 0.09748913144523447 0.09051749330352653 0.08674890751188452 0.09196238592267036 0.0813636336136948 0.08286366950381886 0.07791051878170534 0.0753517130559141 0.07469043592837724 0.07069844498553059 0.06863954527811571 0.05802192301912741 0.07001199353147637 0.0646351370960474 0.0635682385076176 0.06396392174065113 0.062142887406728485 0.0702532638203014 0.056375787339427254 0.06388939967886968 0.05778990279544483 0.06408696647056124 0.06048921140080148 0.046278277158059856 0.05944571127607064 0.05725045552985235 0.05380251800472086 0.053617957894775 0.053481346842917526 0.05578712136908011 0.05615681384436109 0.0525641811334274 0.04595534486526793 0.04221054826947776 0.0491331076588143 0.04645225058563731 0.047417608005079354 0.045993872325528755 0.04980102206834338 0.05388529971241951 0.04780796766281128 0.051682502610815896 0.05296175873114003 0.04763079182141357 0.03715274184942245 0.038538362830877304 0.03803896543880304 0.04017537732919057 0.03992160202728377 0.03339115016990238 0.03391021318319771 0.03317808165318436 0.033503353450861244 0.034213335605131255 0.037453227241834 0.033429956477549344 0.032547304261889724 0.03456145400802294 0.026851379209094577 0.029029812270568476 0.02536299385958248 0.02381322646720542 0.02601998903685146 0.020065840913189782 0.02312256395816803 0.028637176213992966 0.023025286176966295 0.023644178753925695 0.024718130793836383 0.02247788065837489 0.023494062303668923 0.025069689253966014 0.02251974062787162 0.024839345862468085 0.021578845319648585 0.022635220984617867 0.022249876335263253 0.01972206729567713 0.018786311563518312 0.02083740762124459 0.02136736027896404 0.019557259066237342 0.018951669645806152 0.020326226308114 
0.021592341653174824 0.019481366727915075 0.018176950762669244 0.02213383706079589 0.019981356461842854 0.020978835970163347 ================================================ FILE: logs/loss_2022_04_27_12_50_47/epoch_val_loss.txt ================================================ 3.7051011323928833 1.8262890577316284 0.5144035518169403 0.16302762925624847 0.10760901868343353 0.09057768434286118 0.07540924847126007 0.07146378979086876 0.06520375981926918 0.05898746848106384 0.054325105622410774 0.05058479495346546 0.0504811592400074 0.046029604971408844 0.04258855804800987 0.042371716350317 0.040247365832328796 0.04038912057876587 0.03568720445036888 0.038001520559191704 0.03973718546330929 0.035464052110910416 0.03202499449253082 0.02998754195868969 0.032502518966794014 0.03302299045026302 0.03285937011241913 0.029083450324833393 0.029631994664669037 0.03396240994334221 0.029673300683498383 0.028280221857130527 0.027639511972665787 0.028393579646945 0.027291471138596535 0.026989608071744442 0.02653918694704771 0.027808908373117447 0.027841621078550816 0.02570505067706108 0.025745649822056293 0.026372630149126053 0.024600804783403873 0.026447951793670654 0.02569119818508625 0.026840184815227985 0.024051610380411148 0.02362955827265978 0.024365886114537716 0.024577765725553036 0.031041909381747244 0.02641780823469162 0.02472583018243313 0.02326701581478119 0.019615407288074493 0.021174174174666403 0.019675580970942973 0.01869105324149132 0.018909885734319686 0.019662134535610675 0.01899590715765953 0.016179793514311314 0.01545619908720255 0.015423668920993805 0.018800214119255542 0.0158102760091424 0.0158376544713974 0.01783675402402878 0.015972125343978405 0.01454415861517191 0.014743064902722836 0.013825051300227643 0.01407058835029602 0.013598379865288734 0.013919505663216114 0.013623752258718013 0.014403878897428512 0.014411385357379913 0.01337964329868555 0.013076365552842617 0.013368507660925389 0.013667609356343747 0.013365321420133114 0.013264597952365875 
0.013465055078268052 0.01281917616724968 0.01263135802000761 0.012750985845923424 0.01290153805166483 0.01281326413154602 0.012850469164550304 0.012885735556483268 0.013168741390109063 0.013198709674179554 0.0126633545383811 0.012886124104261399 0.012797533720731735 0.012569484673440457 0.012130422703921794 0.012647346407175065 ================================================ FILE: logs/loss_2022_04_28_00_40_54/epoch_loss.txt ================================================ 4.65520715713501 3.142860672690652 1.5020794109864668 0.5057930661873384 0.231415910476988 0.1739024357362227 0.1501499665054408 0.13435510004108603 0.12552000412886793 0.1170116358182647 0.1097346202216365 0.10218094119971449 0.09653170305219563 0.09267877211624925 0.08959556709636342 0.08778026801618663 0.0813840397379615 0.08208547498692166 0.07795694809068333 0.0774568762968887 0.07742892002517526 0.07316952097144994 0.0717044398188591 0.07023497687822039 0.07019331865012646 0.06709351390600204 0.06731417910619215 0.06743009134449741 0.06635952317579226 0.06368578191507947 0.06163112514398315 0.06230247410183603 0.0609466726468368 0.059141877869313415 0.059421493925831535 0.05991599742661823 0.05664417435499755 0.05543165823275393 0.055084149945865975 0.05501931634816257 0.05503683621910485 0.05480257303199985 0.05537006275897676 0.05448474125428633 0.05232419649308378 0.05311859653077342 0.05284474231302738 0.051879515532742844 0.052160846746780655 0.048417276787486946 0.07137971396247546 0.06579171708888477 0.06337685022089216 0.058213022185696496 0.06011202625102467 0.05577432778146532 0.05307989873819881 0.05232232163349788 0.047045067904724014 0.045659234002232554 0.046541030332446096 0.041184055474069385 0.04066362182299296 0.041569982427689764 0.03817177605297831 0.0390163982907931 0.041840214654803275 0.038884344117509 0.03724856765733825 0.03528667270309395 0.03439781483676699 0.03381528837813271 0.03448933532668485 0.03202489465475082 0.03492107921176486 0.029904662817716598 
0.03170571397576067 0.03179397972093688 0.0303279221471813 0.029197406230701342 0.02931012755466832 0.029168612303005326 0.027595289217101204 0.02744665356973807 0.026995969439546266 0.027659311725033654 0.02661879969139894 0.027540806722309855 0.025905532100134427 0.0255900744555725 0.026152818650007247 0.025521984696388243 0.025769058614969254 0.02644038177612755 0.02754443759719531 0.024427745077345107 0.025285613785187403 0.026757355800105465 0.02632749622894658 0.026431108307507303 ================================================ FILE: logs/loss_2022_04_28_00_40_54/epoch_val_loss.txt ================================================ 3.979103207588196 2.2379150390625 0.7213477790355682 0.20374882966279984 0.13149111717939377 0.10669583082199097 0.08946957811713219 0.07844944670796394 0.07209542766213417 0.06465885788202286 0.060964012518525124 0.05698745884001255 0.053726550191640854 0.053231727331876755 0.05091492086648941 0.04869535565376282 0.045929690822958946 0.043502215296030045 0.04109686613082886 0.042073581367731094 0.03760443814098835 0.036989014595746994 0.0369559321552515 0.03501574695110321 0.03553796745836735 0.03463827446103096 0.03613190911710262 0.03488997742533684 0.03165611159056425 0.03400527499616146 0.03399870544672012 0.03354485519230366 0.030975072644650936 0.0297493115067482 0.029600737616419792 0.02729297336190939 0.027453931979835033 0.028598678298294544 0.027731974609196186 0.030310326255857944 0.026450641453266144 0.027599090710282326 0.027010041289031506 0.026624951511621475 0.027538660913705826 0.026772234588861465 0.026853609830141068 0.027332110330462456 0.026638195849955082 0.026076992973685265 0.029674236476421357 0.03184238411486149 0.02579696960747242 0.026541008800268173 0.028798045963048934 0.02365291155874729 0.024432314187288286 0.024038903787732123 0.022221024334430694 0.022891897335648538 0.01906990371644497 0.021012770757079125 0.020605479553341865 0.020398029685020448 0.019171418249607088 0.01934974603354931 
0.020316287130117416 0.019410957768559455 0.018952558375895025 0.017280998453497887 0.0177790354937315 0.018064785189926623 0.01828454677015543 0.01720294840633869 0.01639395747333765 0.016722467541694642 0.016642549820244313 0.01656894329935312 0.015701821073889732 0.015975065901875495 0.016035530529916287 0.015547602623701095 0.01571439057588577 0.01621132455766201 0.015737788379192354 0.01545789260417223 0.015475354716181755 0.015286277420818806 0.015320570766925811 0.015739747881889345 0.015467294491827488 0.015462711267173291 0.015299991890788078 0.014891423098742963 0.014959413185715675 0.015149685740470886 0.015103902481496335 0.014999320358037948 0.015079839341342448 0.0150094548240304 ================================================ FILE: logs/loss_2022_04_28_14_54_17/epoch_loss.txt ================================================ 3.3427013629012636 0.590641807185279 0.20623346173928844 0.13935681179993684 0.11779505432479911 0.10669546342558331 0.0995730339239041 0.09289641034685903 0.08960233483877447 0.08865145291719172 0.08199652650703987 0.08332964736554357 0.08082385785463783 0.07951261059691508 0.07187494143015809 0.07693152552884486 0.07002928235257665 0.06805908863122265 0.06391975372615788 0.06560571603477001 0.06688064705166552 0.062423851289269 0.06189305805083778 0.06095021272905999 0.05913820943484704 0.05766822151425812 0.05601171863575776 0.050846687311099634 0.0500038359210723 0.05070198744845887 0.04995435054620935 0.04775367355905473 0.04747431728368004 0.05075365285803046 0.049145943324805964 0.04660840377522012 0.04236363642849028 0.04308449916231136 0.04134128590942257 0.04134896128024492 0.040451034003247816 0.040809157501078316 0.04189636699027485 0.03930564734877812 0.04004836426013046 0.03825837828529378 0.03547370240299238 0.03609677294476165 0.035196643017439376 0.03430712522628407 0.04613391875237641 0.05915206435084757 0.045893035898916426 0.04116026466298434 0.0429476417420018 0.03999344222247601 0.034763063090698175 
0.03578517514720766 0.03375119598996308 0.03283411696967151 0.03579546554893669 0.03182236654813298 0.03289871994768166 0.03093694845964718 0.028104687105709066 0.0279214970392382 0.02814181201522135 0.026209147684534806 0.024499411086758807 0.02420818345660033 0.02401729004470528 0.02229024926847261 0.021894857381832684 0.021454263018677013 0.020758730825683518 0.02169692176976241 0.019593946940343207 0.019191343562367062 0.0194984604876178 0.02022809916266447 0.017767922341590747 0.01808840037944416 0.018055611812613077 0.017147960676164885 0.015863009145121194 0.015711418758534514 0.016356725540633003 0.016216116898512052 0.015499612758867442 0.015379458964647104 0.016735805649982973 0.014799573211025239 0.015743958410651734 0.014708074144113601 0.014328512709148021 0.015710317682371373 0.01542505334622951 0.014101080921439765 0.014700241691510503 0.014981216627832812 ================================================ FILE: logs/loss_2022_04_28_14_54_17/epoch_val_loss.txt ================================================ 1.1948505997657777 0.2769960485398769 0.1309874437749386 0.10720247365534305 0.0823921812698245 0.06992402952164412 0.0779087346047163 0.06684023551642895 0.06127838855609298 0.06253754440695047 0.06560290511697531 0.05028826054185629 0.05307867294177413 0.046788199059665206 0.05016098273918033 0.041087670251727104 0.049103803001344204 0.04360529286786914 0.04554138630628586 0.03290841649286449 0.04053358295932412 0.038861811719834806 0.040706123877316716 0.03609397481195629 0.03557254578918219 0.03464236315339804 0.03329266821965575 0.03151600556448102 0.030487440805882216 0.03179679936729372 0.030378894181922078 0.03546885224059224 0.028008161624893547 0.030146837001666427 0.028426590701565148 0.030748564330860973 0.028618200030177832 0.03007163112051785 0.02537959101609886 0.028373095905408263 0.025091598788276315 0.027431158255785702 0.0274854336399585 0.0238998107612133 0.024188394332304596 0.025603410461917518 0.022463220916688443 
0.021122918161563576 0.023449525656178593 0.02241856213659048 0.030004368303343652 0.03465683250688016 0.025661695492453875 0.025751420808956028 0.0250759432092309 0.024298161384649575 0.023818821809254587 0.02544179279357195 0.02248522681184113 0.02272053265478462 0.021450468467082828 0.022059163730591535 0.01965688676573336 0.019216149824205785 0.020135902601759882 0.02419198288116604 0.017368705407716335 0.01844585470389575 0.015960348234511913 0.017440078582149 0.015858469036174938 0.01589310457929969 0.01708033775212243 0.030576034029945732 0.014990652166306972 0.020580469502601773 0.01814356680260971 0.016363495017867536 0.016028978914255275 0.015470803889911622 0.017227034358074888 0.016705141763668507 0.01754759649047628 0.02099468276137486 0.02627454571193084 0.016601535107474773 0.019520913722226398 0.016074266715440898 0.015431905922014266 0.015508590545505286 0.013960553548531606 0.015237966080894694 0.015095379657577724 0.01584624971728772 0.015998882468556984 0.01559915920952335 0.01576072332682088 0.016472871112637223 0.014691755402600393 0.014136423316085712 ================================================ FILE: logs/loss_2022_05_02_14_57_57/epoch_loss.txt ================================================ 17.101406224568684 10.8318008740743 4.240671507517496 1.0019958794116974 0.37954812149206796 0.2687491794427236 0.22754189471403757 0.19753684798876445 0.1771739900112152 0.16613257378339769 0.14869885842005412 0.13755213419596354 0.13448657716313997 0.12195368086298307 0.1128251701593399 0.10961388771732648 0.10665635019540787 0.10061115821202596 0.0969288428624471 0.09855932394663493 0.0889915977915128 0.08737521395087242 0.08142138893405597 0.081571697195371 0.08513322671254477 0.0799174178391695 0.07576848641037941 0.07407469501097998 0.07028314856191477 0.07057048715651035 0.0709464654326439 0.07267625791331132 0.06727536929150423 0.0662232073644797 0.06310114165147146 0.06374188972016176 0.06626531345148881 0.05850081816315651 
0.056352414563298224 0.05607227062185605 0.057017019018530846 0.05952403930326303 0.057178026810288426 0.051601182545224826 0.051208433136343955 0.051774655406673746 0.050313881536324816 0.04995381236076355 0.048258970181147255 0.04914092607796192 0.06768884502040844 0.06370118060149252 0.05913636611464123 0.05405666360942026 0.052676150932287176 0.04658079737176498 0.0453374430614834 0.04464669832183669 0.04386587947762261 0.038802354324919484 0.038647202278176945 0.03676449179959794 0.03481319181931516 0.0347878224371622 0.03463629183825105 0.03564592384112378 0.03169099524772415 0.03046195216011256 0.029932656922998527 0.02693921811878681 0.02624520653237899 0.02643638541145871 0.024267646336617568 0.02276813123996059 0.022201836206174146 0.025956252019386738 0.022044219623785465 0.01913531731891756 0.018665816611610354 0.020095466733134042 0.019377306945777186 0.019703271872519204 0.017145425283039608 0.017283631632259735 0.015655260040269545 0.017102580536932994 0.01568767197119693 0.015433585511830945 0.01649760961299762 0.01480112192220986 0.01458095806495597 0.01634620662080124 0.014586444144758086 0.01412225275610884 0.014443966598870853 0.014422304722635696 0.014611958689056338 0.01421121487316365 0.014518235716968775 0.01446291058867549 ================================================ FILE: logs/loss_2022_05_02_14_57_57/epoch_val_loss.txt ================================================ 14.182828585306803 6.964454015096028 1.7364161411921184 0.4160226086775462 0.23061403135458627 0.18009933829307556 0.15316933890183768 0.12558546662330627 0.11013514300187428 0.10292657961448033 0.09011622269948323 0.0910362775127093 0.07362671693166097 0.06496318926413854 0.06620646268129349 0.05724670241276423 0.05412605529030164 0.05476600428422292 0.04998553295930227 0.04453219473361969 0.046111090729633965 0.03964699556430181 0.04128604009747505 0.0385576585928599 0.040300281097491585 0.036520869781573616 0.03233897313475609 0.03402836248278618 0.029543195540706318 
0.03613479311267535 0.030847225338220596 0.03196833903590838 0.030614140133063 0.027615018809835117 0.029661099116007488 0.028920121490955353 0.031096385171016056 0.026975831637779873 0.02437760556737582 0.024089227120081585 0.024140140662590664 0.02602989909549554 0.023526831219593685 0.023234928647677105 0.02490025262037913 0.024476055055856705 0.02195119174818198 0.02400912468632062 0.021773086860775948 0.021737251430749893 0.03704084885808138 0.027747553415023364 0.02609148549918945 0.027060106253394715 0.02310138403509672 0.02209098207262846 0.019444907299027994 0.01728303673175665 0.022116302154385127 0.017028711091440458 0.018385969388943452 0.020397630233604174 0.017034396529197693 0.0161269146662492 0.014033435915525142 0.015593188958099255 0.015342251899150701 0.015232413147504512 0.01195777920432962 0.013383755532021705 0.01376453500527602 0.012433087345785819 0.010423123764877137 0.011021508405414911 0.010145062186683599 0.011127662809135823 0.009687475251177182 0.010067089210049463 0.008900713497916093 0.009318945392106589 0.008838421199470758 0.008917749107170563 0.008874757430301262 0.00834214468844808 0.009231974191677112 0.00839424731496435 0.00878818673439897 0.008268425169472512 0.008394974642075025 0.008387481507200461 0.008073390604784856 0.008447423434028259 0.007967768595195733 0.008031251589552714 0.007093459976693759 0.0077013208960684445 0.008188612150171628 0.008229664276139094 0.008362234892466893 0.0081037561624096

================================================
FILE: model_data/.gitattributes
================================================
*.pth filter=lfs diff=lfs merge=lfs -text

================================================
FILE: model_data/gesture.yaml
================================================
#------------------------------detect.py--------------------------------#
# This section supports semi-automatic data labeling to reduce the annotation workload.
# It needs weights trained in advance; the results are saved in Labelme format.
# dir_detect_path   directory containing the images
# detect_save_path  directory where the annotations are saved
# ----------------------------------------------------------------------#
dir_detect_path: ./JPEGImages
detect_save_path: ./Annotation

# ----------------------------- train.py -------------------------------#
nc: 8 # number of classes
classes: ["up","down","left","right","front","back","clockwise","anticlockwise"] # class names
confidence: 0.5 # confidence threshold
nms_iou: 0.3
letterbox_image: False

lr_decay_type: cos # learning-rate decay schedule; options are step and cos
# Controls whether data is loaded with multiple workers.
# Enabling it speeds up data loading but uses more memory.
# On machines with little memory set it to 2 or 0; 0 is recommended on Windows.
num_workers: 4

================================================
FILE: model_data/gesture_classes.txt
================================================
up
down
left
right
front
back
clockwise
anticlockwise

================================================
FILE: model_data/yolo_anchors.txt
================================================
12, 16, 19, 36, 40, 28, 36, 75, 76, 55, 72, 146, 142, 110, 192, 243, 459, 401

================================================
FILE: model_data/yolotiny_anchors.txt
================================================
10,14, 23,27, 37,58, 81,82, 135,169, 344,319

================================================
FILE: nets/CSPdarknet.py
================================================
import math
from collections import OrderedDict

import torch
import torch.nn as nn
import torch.nn.functional as F

#-------------------------------------------------#
#   Mish activation function
#-------------------------------------------------#
class Mish(nn.Module):
    def __init__(self):
        super(Mish, self).__init__()

    def forward(self, x):
        return x * torch.tanh(F.softplus(x))

#---------------------------------------------------#
#   Convolution block -> convolution + normalization + activation
#   Conv2d + BatchNormalization + Mish
#---------------------------------------------------#
class BasicConv(nn.Module):
    def __init__(self, in_channels, out_channels, kernel_size, stride=1):
        super(BasicConv, self).__init__()

        self.conv = nn.Conv2d(in_channels, out_channels, kernel_size, stride, kernel_size//2, bias=False)
        self.bn = nn.BatchNorm2d(out_channels)
        self.activation = Mish()

    def forward(self, x):
        x = self.conv(x)
        x = self.bn(x)
        x = self.activation(x)
        return x

#---------------------------------------------------#
#   Component of the CSPdarknet stage:
#   the residual block that is stacked inside it
#---------------------------------------------------#
class Resblock(nn.Module):
    def __init__(self, channels, hidden_channels=None):
        super(Resblock, self).__init__()

        if hidden_channels is None:
            hidden_channels = channels

        self.block = nn.Sequential(
            BasicConv(channels, hidden_channels, 1),
            BasicConv(hidden_channels, channels, 3)
        )

    def forward(self, x):
        return x + self.block(x)

#--------------------------------------------------------------------#
#   CSPdarknet stage
#   First, a stride-2 convolution block halves the height and width
#   (padding is applied inside the conv; there is no separate padding layer).
#   Then a large residual edge (split_conv0 in the code) is built;
#   it bypasses the stacked residual structures.
#   The main branch loops over num_blocks; each iteration is a residual structure.
#   The whole stage is therefore one large residual block
#   wrapping several small residual blocks.
#--------------------------------------------------------------------#
class Resblock_body(nn.Module):
    def __init__(self, in_channels, out_channels, num_blocks, first):
        super(Resblock_body, self).__init__()
        #----------------------------------------------------------------#
        #   Compress height and width with a stride-2 convolution block
        #----------------------------------------------------------------#
        self.downsample_conv = BasicConv(in_channels, out_channels, 3, stride=2)

        if first:
            #--------------------------------------------------------------------------#
            #   Build the large residual edge self.split_conv0,
            #   which bypasses the stacked residual structures
            #--------------------------------------------------------------------------#
            self.split_conv0 = BasicConv(out_channels, out_channels, 1)

            #----------------------------------------------------------------#
            #   Main branch: the residual structure
            #   (a single Resblock in this first stage)
            #----------------------------------------------------------------#
            self.split_conv1 = BasicConv(out_channels, out_channels, 1)
            self.blocks_conv = nn.Sequential(
                Resblock(channels=out_channels, hidden_channels=out_channels//2),
                BasicConv(out_channels, out_channels, 1)
            )

            self.concat_conv = BasicConv(out_channels*2, out_channels, 1)
        else:
#--------------------------------------------------------------------------# # 然后建立一个大的残差边self.split_conv0、这个大残差边绕过了很多的残差结构 #--------------------------------------------------------------------------# self.split_conv0 = BasicConv(out_channels, out_channels//2, 1) #----------------------------------------------------------------# # 主干部分会对num_blocks进行循环,循环内部是残差结构。 #----------------------------------------------------------------# self.split_conv1 = BasicConv(out_channels, out_channels//2, 1) self.blocks_conv = nn.Sequential( *[Resblock(out_channels//2) for _ in range(num_blocks)], BasicConv(out_channels//2, out_channels//2, 1) ) self.concat_conv = BasicConv(out_channels, out_channels, 1) def forward(self, x): x = self.downsample_conv(x) x0 = self.split_conv0(x) x1 = self.split_conv1(x) x1 = self.blocks_conv(x1) #------------------------------------# # 将大残差边再堆叠回来 #------------------------------------# x = torch.cat([x1, x0], dim=1) #------------------------------------# # 最后对通道数进行整合 #------------------------------------# x = self.concat_conv(x) return x #---------------------------------------------------# # CSPdarknet53 的主体部分 # 输入为一张416x416x3的图片 # 输出为三个有效特征层 #---------------------------------------------------# class CSPDarkNet(nn.Module): def __init__(self, layers): super(CSPDarkNet, self).__init__() self.inplanes = 32 # 416,416,3 -> 416,416,32 self.conv1 = BasicConv(3, self.inplanes, kernel_size=3, stride=1) self.feature_channels = [64, 128, 256, 512, 1024] self.stages = nn.ModuleList([ # 416,416,32 -> 208,208,64 Resblock_body(self.inplanes, self.feature_channels[0], layers[0], first=True), # 208,208,64 -> 104,104,128 Resblock_body(self.feature_channels[0], self.feature_channels[1], layers[1], first=False), # 104,104,128 -> 52,52,256 Resblock_body(self.feature_channels[1], self.feature_channels[2], layers[2], first=False), # 52,52,256 -> 26,26,512 Resblock_body(self.feature_channels[2], self.feature_channels[3], layers[3], first=False), # 26,26,512 -> 13,13,1024 
Resblock_body(self.feature_channels[3], self.feature_channels[4], layers[4], first=False) ]) self.num_features = 1 for m in self.modules(): if isinstance(m, nn.Conv2d): n = m.kernel_size[0] * m.kernel_size[1] * m.out_channels m.weight.data.normal_(0, math.sqrt(2. / n)) elif isinstance(m, nn.BatchNorm2d): m.weight.data.fill_(1) m.bias.data.zero_() def forward(self, x): x = self.conv1(x) x = self.stages[0](x) x = self.stages[1](x) out3 = self.stages[2](x) out4 = self.stages[3](out3) out5 = self.stages[4](out4) return out3, out4, out5 def darknet53(pretrained): model = CSPDarkNet([1, 2, 8, 8, 4]) if pretrained: model.load_state_dict(torch.load("model_data/CSPdarknet53_backbone_weights.pth")) return model ================================================ FILE: nets/CSPdarknet53_tiny.py ================================================ import math import torch import torch.nn as nn #-------------------------------------------------# # 卷积块 # Conv2d + BatchNorm2d + LeakyReLU #-------------------------------------------------# class BasicConv(nn.Module): def __init__(self, in_channels, out_channels, kernel_size, stride=1): super(BasicConv, self).__init__() self.conv = nn.Conv2d(in_channels, out_channels, kernel_size, stride, kernel_size//2, bias=False) self.bn = nn.BatchNorm2d(out_channels) self.activation = nn.LeakyReLU(0.1) def forward(self, x): x = self.conv(x) x = self.bn(x) x = self.activation(x) return x ''' input | BasicConv ----------------------- | | route_group route | | BasicConv | | | ------------------- | | | | route_1 BasicConv | | | | -----------------cat | | | ---- BasicConv | | | | feat cat--------------------- | MaxPooling2D ''' #---------------------------------------------------# # CSPdarknet53-tiny的结构块 # 存在一个大残差边 # 这个大残差边绕过了很多的残差结构 #---------------------------------------------------# class Resblock_body(nn.Module): def __init__(self, in_channels, out_channels): super(Resblock_body, self).__init__() self.out_channels = out_channels self.conv1 = 
BasicConv(in_channels, out_channels, 3) self.conv2 = BasicConv(out_channels//2, out_channels//2, 3) self.conv3 = BasicConv(out_channels//2, out_channels//2, 3) self.conv4 = BasicConv(out_channels, out_channels, 1) self.maxpool = nn.MaxPool2d([2,2],[2,2]) def forward(self, x): # 利用一个3x3卷积进行特征整合 x = self.conv1(x) # 引出一个大的残差边route route = x c = self.out_channels # 对特征层的通道进行分割,取第二部分作为主干部分。 x = torch.split(x, c//2, dim = 1)[1] # 对主干部分进行3x3卷积 x = self.conv2(x) # 引出一个小的残差边route_1 route1 = x # 对第主干部分进行3x3卷积 x = self.conv3(x) # 主干部分与残差部分进行相接 x = torch.cat([x,route1], dim = 1) # 对相接后的结果进行1x1卷积 x = self.conv4(x) feat = x x = torch.cat([route, x], dim = 1) # 利用最大池化进行高和宽的压缩 x = self.maxpool(x) return x,feat class CSPDarkNet(nn.Module): def __init__(self): super(CSPDarkNet, self).__init__() # 首先利用两次步长为2x2的3x3卷积进行高和宽的压缩 # 416,416,3 -> 208,208,32 -> 104,104,64 self.conv1 = BasicConv(3, 32, kernel_size=3, stride=2) self.conv2 = BasicConv(32, 64, kernel_size=3, stride=2) # 104,104,64 -> 52,52,128 self.resblock_body1 = Resblock_body(64, 64) # 52,52,128 -> 26,26,256 self.resblock_body2 = Resblock_body(128, 128) # 26,26,256 -> 13,13,512 self.resblock_body3 = Resblock_body(256, 256) # 13,13,512 -> 13,13,512 self.conv3 = BasicConv(512, 512, kernel_size=3) self.num_features = 1 # 进行权值初始化 for m in self.modules(): if isinstance(m, nn.Conv2d): n = m.kernel_size[0] * m.kernel_size[1] * m.out_channels m.weight.data.normal_(0, math.sqrt(2. 
/ n)) elif isinstance(m, nn.BatchNorm2d): m.weight.data.fill_(1) m.bias.data.zero_() def forward(self, x): # 416,416,3 -> 208,208,32 -> 104,104,64 x = self.conv1(x) x = self.conv2(x) # 104,104,64 -> 52,52,128 x, _ = self.resblock_body1(x) # 52,52,128 -> 26,26,256 x, _ = self.resblock_body2(x) # 26,26,256 -> x为13,13,512 # -> feat1为26,26,256 x, feat1 = self.resblock_body3(x) # 13,13,512 -> 13,13,512 x = self.conv3(x) feat2 = x return feat1,feat2 def darknet53_tiny(pretrained, **kwargs): model = CSPDarkNet() if pretrained: model.load_state_dict(torch.load("model_data/CSPdarknet53_tiny_backbone_weights.pth")) return model ================================================ FILE: nets/__init__.py ================================================ # ================================================ FILE: nets/attention.py ================================================ import torch import torch.nn as nn import math class se_block(nn.Module): def __init__(self, channel, ratio=16): super(se_block, self).__init__() self.avg_pool = nn.AdaptiveAvgPool2d(1) self.fc = nn.Sequential( nn.Linear(channel, channel // ratio, bias=False), nn.ReLU(inplace=True), nn.Linear(channel // ratio, channel, bias=False), nn.Sigmoid() ) def forward(self, x): b, c, _, _ = x.size() y = self.avg_pool(x).view(b, c) y = self.fc(y).view(b, c, 1, 1) return x * y class ChannelAttention(nn.Module): def __init__(self, in_planes, ratio=8): super(ChannelAttention, self).__init__() self.avg_pool = nn.AdaptiveAvgPool2d(1) self.max_pool = nn.AdaptiveMaxPool2d(1) # 利用1x1卷积代替全连接 self.fc1 = nn.Conv2d(in_planes, in_planes // ratio, 1, bias=False) self.relu1 = nn.ReLU() self.fc2 = nn.Conv2d(in_planes // ratio, in_planes, 1, bias=False) self.sigmoid = nn.Sigmoid() def forward(self, x): avg_out = self.fc2(self.relu1(self.fc1(self.avg_pool(x)))) max_out = self.fc2(self.relu1(self.fc1(self.max_pool(x)))) out = avg_out + max_out return self.sigmoid(out) class SpatialAttention(nn.Module): def __init__(self, kernel_size=7): 
super(SpatialAttention, self).__init__() assert kernel_size in (3, 7), 'kernel size must be 3 or 7' padding = 3 if kernel_size == 7 else 1 self.conv1 = nn.Conv2d(2, 1, kernel_size, padding=padding, bias=False) self.sigmoid = nn.Sigmoid() def forward(self, x): avg_out = torch.mean(x, dim=1, keepdim=True) max_out, _ = torch.max(x, dim=1, keepdim=True) x = torch.cat([avg_out, max_out], dim=1) x = self.conv1(x) return self.sigmoid(x) class cbam_block(nn.Module): def __init__(self, channel, ratio=8, kernel_size=7): super(cbam_block, self).__init__() self.channelattention = ChannelAttention(channel, ratio=ratio) self.spatialattention = SpatialAttention(kernel_size=kernel_size) def forward(self, x): x = x*self.channelattention(x) x = x*self.spatialattention(x) return x class eca_block(nn.Module): def __init__(self, channel, b=1, gamma=2): super(eca_block, self).__init__() kernel_size = int(abs((math.log(channel, 2) + b) / gamma)) kernel_size = kernel_size if kernel_size % 2 else kernel_size + 1 self.avg_pool = nn.AdaptiveAvgPool2d(1) self.conv = nn.Conv1d(1, 1, kernel_size=kernel_size, padding=(kernel_size - 1) // 2, bias=False) self.sigmoid = nn.Sigmoid() def forward(self, x): y = self.avg_pool(x) y = self.conv(y.squeeze(-1).transpose(-1, -2)).transpose(-1, -2).unsqueeze(-1) y = self.sigmoid(y) return x * y.expand_as(x) class CA_Block(nn.Module): def __init__(self, channel, reduction=16): super(CA_Block, self).__init__() self.conv_1x1 = nn.Conv2d(in_channels=channel, out_channels=channel//reduction, kernel_size=1, stride=1, bias=False) self.relu = nn.ReLU() self.bn = nn.BatchNorm2d(channel//reduction) self.F_h = nn.Conv2d(in_channels=channel//reduction, out_channels=channel, kernel_size=1, stride=1, bias=False) self.F_w = nn.Conv2d(in_channels=channel//reduction, out_channels=channel, kernel_size=1, stride=1, bias=False) self.sigmoid_h = nn.Sigmoid() self.sigmoid_w = nn.Sigmoid() def forward(self, x): _, _, h, w = x.size() x_h = torch.mean(x, dim = 3, keepdim = 
True).permute(0, 1, 3, 2) x_w = torch.mean(x, dim = 2, keepdim = True) x_cat_conv_relu = self.relu(self.bn(self.conv_1x1(torch.cat((x_h, x_w), 3)))) x_cat_conv_split_h, x_cat_conv_split_w = x_cat_conv_relu.split([h, w], 3) s_h = self.sigmoid_h(self.F_h(x_cat_conv_split_h.permute(0, 1, 3, 2))) s_w = self.sigmoid_w(self.F_w(x_cat_conv_split_w)) out = x * s_h.expand_as(x) * s_w.expand_as(x) return out ================================================ FILE: nets/yolo.py ================================================ from collections import OrderedDict import torch import torch.nn as nn from nets.CSPdarknet import darknet53 def conv2d(filter_in, filter_out, kernel_size, stride=1): pad = (kernel_size - 1) // 2 if kernel_size else 0 return nn.Sequential(OrderedDict([ ("conv", nn.Conv2d(filter_in, filter_out, kernel_size=kernel_size, stride=stride, padding=pad, bias=False)), ("bn", nn.BatchNorm2d(filter_out)), ("relu", nn.LeakyReLU(0.1)), ])) #---------------------------------------------------# # SPP结构,利用不同大小的池化核进行池化 # 池化后堆叠 #---------------------------------------------------# class SpatialPyramidPooling(nn.Module): def __init__(self, pool_sizes=[5, 9, 13]): super(SpatialPyramidPooling, self).__init__() self.maxpools = nn.ModuleList([nn.MaxPool2d(pool_size, 1, pool_size//2) for pool_size in pool_sizes]) def forward(self, x): features = [maxpool(x) for maxpool in self.maxpools[::-1]] features = torch.cat(features + [x], dim=1) return features #---------------------------------------------------# # 卷积 + 上采样 #---------------------------------------------------# class Upsample(nn.Module): def __init__(self, in_channels, out_channels): super(Upsample, self).__init__() self.upsample = nn.Sequential( conv2d(in_channels, out_channels, 1), nn.Upsample(scale_factor=2, mode='nearest') ) def forward(self, x,): x = self.upsample(x) return x #---------------------------------------------------# # 三次卷积块 #---------------------------------------------------# def 
make_three_conv(filters_list, in_filters): m = nn.Sequential( conv2d(in_filters, filters_list[0], 1), conv2d(filters_list[0], filters_list[1], 3), conv2d(filters_list[1], filters_list[0], 1), ) return m #---------------------------------------------------# # 五次卷积块 #---------------------------------------------------# def make_five_conv(filters_list, in_filters): m = nn.Sequential( conv2d(in_filters, filters_list[0], 1), conv2d(filters_list[0], filters_list[1], 3), conv2d(filters_list[1], filters_list[0], 1), conv2d(filters_list[0], filters_list[1], 3), conv2d(filters_list[1], filters_list[0], 1), ) return m #---------------------------------------------------# # 最后获得yolov4的输出 #---------------------------------------------------# def yolo_head(filters_list, in_filters): m = nn.Sequential( conv2d(in_filters, filters_list[0], 3), nn.Conv2d(filters_list[0], filters_list[1], 1), ) return m #---------------------------------------------------# # yolo_body #---------------------------------------------------# class YoloBody(nn.Module): def __init__(self, anchors_mask, num_classes, pretrained = False): super(YoloBody, self).__init__() #---------------------------------------------------# # 生成CSPdarknet53的主干模型 # 获得三个有效特征层,他们的shape分别是: # 52,52,256 # 26,26,512 # 13,13,1024 #---------------------------------------------------# self.backbone = darknet53(pretrained) self.conv1 = make_three_conv([512,1024],1024) self.SPP = SpatialPyramidPooling() self.conv2 = make_three_conv([512,1024],2048) self.upsample1 = Upsample(512,256) self.conv_for_P4 = conv2d(512,256,1) self.make_five_conv1 = make_five_conv([256, 512],512) self.upsample2 = Upsample(256,128) self.conv_for_P3 = conv2d(256,128,1) self.make_five_conv2 = make_five_conv([128, 256],256) # 3*(5+num_classes) = 3*(5+20) = 3*(4+1+20)=75 self.yolo_head3 = yolo_head([256, len(anchors_mask[0]) * (5 + num_classes)],128) self.down_sample1 = conv2d(128,256,3,stride=2) self.make_five_conv3 = make_five_conv([256, 512],512) # 
3*(5+num_classes) = 3*(5+20) = 3*(4+1+20)=75 self.yolo_head2 = yolo_head([512, len(anchors_mask[1]) * (5 + num_classes)],256) self.down_sample2 = conv2d(256,512,3,stride=2) self.make_five_conv4 = make_five_conv([512, 1024],1024) # 3*(5+num_classes)=3*(5+20)=3*(4+1+20)=75 self.yolo_head1 = yolo_head([1024, len(anchors_mask[2]) * (5 + num_classes)],512) def forward(self, x): # backbone x2, x1, x0 = self.backbone(x) # 13,13,1024 -> 13,13,512 -> 13,13,1024 -> 13,13,512 -> 13,13,2048 P5 = self.conv1(x0) P5 = self.SPP(P5) # 13,13,2048 -> 13,13,512 -> 13,13,1024 -> 13,13,512 P5 = self.conv2(P5) # 13,13,512 -> 13,13,256 -> 26,26,256 P5_upsample = self.upsample1(P5) # 26,26,512 -> 26,26,256 P4 = self.conv_for_P4(x1) # 26,26,256 + 26,26,256 -> 26,26,512 P4 = torch.cat([P4,P5_upsample],axis=1) # 26,26,512 -> 26,26,256 -> 26,26,512 -> 26,26,256 -> 26,26,512 -> 26,26,256 P4 = self.make_five_conv1(P4) # 26,26,256 -> 26,26,128 -> 52,52,128 P4_upsample = self.upsample2(P4) # 52,52,256 -> 52,52,128 P3 = self.conv_for_P3(x2) # 52,52,128 + 52,52,128 -> 52,52,256 P3 = torch.cat([P3,P4_upsample],axis=1) # 52,52,256 -> 52,52,128 -> 52,52,256 -> 52,52,128 -> 52,52,256 -> 52,52,128 P3 = self.make_five_conv2(P3) # 52,52,128 -> 26,26,256 P3_downsample = self.down_sample1(P3) # 26,26,256 + 26,26,256 -> 26,26,512 P4 = torch.cat([P3_downsample,P4],axis=1) # 26,26,512 -> 26,26,256 -> 26,26,512 -> 26,26,256 -> 26,26,512 -> 26,26,256 P4 = self.make_five_conv3(P4) # 26,26,256 -> 13,13,512 P4_downsample = self.down_sample2(P4) # 13,13,512 + 13,13,512 -> 13,13,1024 P5 = torch.cat([P4_downsample,P5],axis=1) # 13,13,1024 -> 13,13,512 -> 13,13,1024 -> 13,13,512 -> 13,13,1024 -> 13,13,512 P5 = self.make_five_conv4(P5) #---------------------------------------------------# # 第三个特征层 # y3=(batch_size,75,52,52) #---------------------------------------------------# out2 = self.yolo_head3(P3) #---------------------------------------------------# # 第二个特征层 # y2=(batch_size,75,26,26) 
#---------------------------------------------------# out1 = self.yolo_head2(P4) #---------------------------------------------------# # First feature layer # y1=(batch_size,75,13,13) for 20 classes, i.e. 3*(5+num_classes); with the 8 gesture classes this is 3*(5+8)=39 channels #---------------------------------------------------# out0 = self.yolo_head1(P5) return out0, out1, out2 ================================================ FILE: nets/yolo_tiny.py ================================================ import torch import torch.nn as nn from nets.CSPdarknet53_tiny import darknet53_tiny from nets.attention import cbam_block, eca_block, se_block, CA_Block attention_block = [se_block, cbam_block, eca_block, CA_Block] #-------------------------------------------------# # Convolution block -> convolution + normalization + activation # Conv2d + BatchNormalization + LeakyReLU #-------------------------------------------------# class BasicConv(nn.Module): def __init__(self, in_channels, out_channels, kernel_size, stride=1): super(BasicConv, self).__init__() self.conv = nn.Conv2d(in_channels, out_channels, kernel_size, stride, kernel_size//2, bias=False) self.bn = nn.BatchNorm2d(out_channels) self.activation = nn.LeakyReLU(0.1) def forward(self, x): x = self.conv(x) x = self.bn(x) x = self.activation(x) return x #---------------------------------------------------# # Convolution + upsampling #---------------------------------------------------# class Upsample(nn.Module): def __init__(self, in_channels, out_channels): super(Upsample, self).__init__() self.upsample = nn.Sequential( BasicConv(in_channels, out_channels, 1), nn.Upsample(scale_factor=2, mode='nearest') ) def forward(self, x): x = self.upsample(x) return x #---------------------------------------------------# # Head that produces the final YOLOv4 output #---------------------------------------------------# def yolo_head(filters_list, in_filters): m = nn.Sequential( BasicConv(in_filters, filters_list[0], 3), nn.Conv2d(filters_list[0], filters_list[1], 1), ) return m #---------------------------------------------------# # yolo_body #---------------------------------------------------# class
YoloBodytiny(nn.Module): def __init__(self, anchors_mask, num_classes, phi=0, pretrained=False): super(YoloBodytiny, self).__init__() self.phi = phi self.backbone = darknet53_tiny(pretrained) self.conv_for_P5 = BasicConv(512,256,1) self.yolo_headP5 = yolo_head([512, len(anchors_mask[0]) * (5 + num_classes)],256) self.upsample = Upsample(256,128) self.yolo_headP4 = yolo_head([256, len(anchors_mask[1]) * (5 + num_classes)],384) if 1 <= self.phi <= 4: self.feat1_att = attention_block[self.phi - 1](256) self.feat2_att = attention_block[self.phi - 1](512) self.upsample_att = attention_block[self.phi - 1](128) def forward(self, x): #---------------------------------------------------# # Run the CSPdarknet53_tiny backbone # feat1 has shape 26,26,256 # feat2 has shape 13,13,512 #---------------------------------------------------# feat1, feat2 = self.backbone(x) if 1 <= self.phi <= 4: feat1 = self.feat1_att(feat1) feat2 = self.feat2_att(feat2) # 13,13,512 -> 13,13,256 P5 = self.conv_for_P5(feat2) # 13,13,256 -> 13,13,512 -> 13,13,len(anchors_mask[0])*(5+num_classes) out0 = self.yolo_headP5(P5) # 13,13,256 -> 13,13,128 -> 26,26,128 P5_Upsample = self.upsample(P5) # 26,26,256 + 26,26,128 -> 26,26,384 if 1 <= self.phi <= 4: P5_Upsample = self.upsample_att(P5_Upsample) P4 = torch.cat([P5_Upsample,feat1],dim=1) # 26,26,384 -> 26,26,256 -> 26,26,len(anchors_mask[1])*(5+num_classes) out1 = self.yolo_headP4(P4) return out0, out1 ================================================ FILE: nets/yolo_training.py ================================================ import math from functools import partial import numpy as np import torch import torch.nn as nn class YOLOLoss(nn.Module): def __init__(self, anchors, num_classes, input_shape, cuda, anchors_mask = [[6,7,8], [3,4,5], [0,1,2]], label_smoothing = 0, focal_loss = False, alpha = 0.25, gamma = 2): super(YOLOLoss, self).__init__() #-----------------------------------------------------------# # Anchors for the 13x13 feature layer: [142, 110],[192, 243],[459, 401] # Anchors for the 26x26 feature layer: [36, 75],[76,
55],[72, 146] # 52x52的特征层对应的anchor是[12, 16],[19, 36],[40, 28] #-----------------------------------------------------------# self.anchors = anchors self.num_classes = num_classes self.bbox_attrs = 5 + num_classes self.input_shape = input_shape self.anchors_mask = anchors_mask self.label_smoothing = label_smoothing self.balance = [0.4, 1.0, 4] self.box_ratio = 0.05 self.obj_ratio = 5 * (input_shape[0] * input_shape[1]) / (416 ** 2) self.cls_ratio = 1 * (num_classes / 80) self.focal_loss = focal_loss self.focal_loss_ratio = 10 self.alpha = alpha self.gamma = gamma self.ignore_threshold = 0.5 self.cuda = cuda def clip_by_tensor(self, t, t_min, t_max): t = t.float() result = (t >= t_min).float() * t + (t < t_min).float() * t_min result = (result <= t_max).float() * result + (result > t_max).float() * t_max return result def MSELoss(self, pred, target): return torch.pow(pred - target, 2) def BCELoss(self, pred, target): epsilon = 1e-7 pred = self.clip_by_tensor(pred, epsilon, 1.0 - epsilon) output = - target * torch.log(pred) - (1.0 - target) * torch.log(1.0 - pred) return output def box_ciou(self, b1, b2): """ 输入为: ---------- b1: tensor, shape=(batch, feat_w, feat_h, anchor_num, 4), xywh b2: tensor, shape=(batch, feat_w, feat_h, anchor_num, 4), xywh 返回为: ------- ciou: tensor, shape=(batch, feat_w, feat_h, anchor_num, 1) """ #----------------------------------------------------# # 求出预测框左上角右下角 #----------------------------------------------------# b1_xy = b1[..., :2] b1_wh = b1[..., 2:4] b1_wh_half = b1_wh/2. b1_mins = b1_xy - b1_wh_half b1_maxes = b1_xy + b1_wh_half #----------------------------------------------------# # 求出真实框左上角右下角 #----------------------------------------------------# b2_xy = b2[..., :2] b2_wh = b2[..., 2:4] b2_wh_half = b2_wh/2. 
b2_mins = b2_xy - b2_wh_half b2_maxes = b2_xy + b2_wh_half #----------------------------------------------------# # 求真实框和预测框所有的iou #----------------------------------------------------# intersect_mins = torch.max(b1_mins, b2_mins) intersect_maxes = torch.min(b1_maxes, b2_maxes) intersect_wh = torch.max(intersect_maxes - intersect_mins, torch.zeros_like(intersect_maxes)) intersect_area = intersect_wh[..., 0] * intersect_wh[..., 1] b1_area = b1_wh[..., 0] * b1_wh[..., 1] b2_area = b2_wh[..., 0] * b2_wh[..., 1] union_area = b1_area + b2_area - intersect_area iou = intersect_area / torch.clamp(union_area,min = 1e-6) #----------------------------------------------------# # 计算中心的差距 #----------------------------------------------------# center_distance = torch.sum(torch.pow((b1_xy - b2_xy), 2), axis=-1) #----------------------------------------------------# # 找到包裹两个框的最小框的左上角和右下角 #----------------------------------------------------# enclose_mins = torch.min(b1_mins, b2_mins) enclose_maxes = torch.max(b1_maxes, b2_maxes) enclose_wh = torch.max(enclose_maxes - enclose_mins, torch.zeros_like(intersect_maxes)) #----------------------------------------------------# # 计算对角线距离 #----------------------------------------------------# enclose_diagonal = torch.sum(torch.pow(enclose_wh,2), axis=-1) ciou = iou - 1.0 * (center_distance) / torch.clamp(enclose_diagonal,min = 1e-6) v = (4 / (math.pi ** 2)) * torch.pow((torch.atan(b1_wh[..., 0] / torch.clamp(b1_wh[..., 1],min = 1e-6)) - torch.atan(b2_wh[..., 0] / torch.clamp(b2_wh[..., 1], min = 1e-6))), 2) alpha = v / torch.clamp((1.0 - iou + v), min=1e-6) ciou = ciou - alpha * v return ciou #---------------------------------------------------# # 平滑标签 #---------------------------------------------------# def smooth_labels(self, y_true, label_smoothing, num_classes): return y_true * (1.0 - label_smoothing) + label_smoothing / num_classes def forward(self, l, input, targets=None): #----------------------------------------------------# # l 
代表使用的是第几个有效特征层 # input的shape为 bs, 3*(5+num_classes), 13, 13 # bs, 3*(5+num_classes), 26, 26 # bs, 3*(5+num_classes), 52, 52 # targets 真实框的标签情况 [batch_size, num_gt, 5] #----------------------------------------------------# #--------------------------------# # 获得图片数量,特征层的高和宽 #--------------------------------# bs = input.size(0) in_h = input.size(2) in_w = input.size(3) #-----------------------------------------------------------------------# # 计算步长 # 每一个特征点对应原来的图片上多少个像素点 # # 如果特征层为13x13的话,一个特征点就对应原来的图片上的32个像素点 # 如果特征层为26x26的话,一个特征点就对应原来的图片上的16个像素点 # 如果特征层为52x52的话,一个特征点就对应原来的图片上的8个像素点 # stride_h = stride_w = 32、16、8 #-----------------------------------------------------------------------# stride_h = self.input_shape[0] / in_h stride_w = self.input_shape[1] / in_w #-------------------------------------------------# # 此时获得的scaled_anchors大小是相对于特征层的 #-------------------------------------------------# scaled_anchors = [(a_w / stride_w, a_h / stride_h) for a_w, a_h in self.anchors] #-----------------------------------------------# # 输入的input一共有三个,他们的shape分别是 # bs, 3 * (5+num_classes), 13, 13 => bs, 3, 5 + num_classes, 13, 13 => batch_size, 3, 13, 13, 5 + num_classes # batch_size, 3, 13, 13, 5 + num_classes # batch_size, 3, 26, 26, 5 + num_classes # batch_size, 3, 52, 52, 5 + num_classes #-----------------------------------------------# prediction = input.view(bs, len(self.anchors_mask[l]), self.bbox_attrs, in_h, in_w).permute(0, 1, 3, 4, 2).contiguous() #-----------------------------------------------# # 先验框的中心位置的调整参数 #-----------------------------------------------# x = torch.sigmoid(prediction[..., 0]) y = torch.sigmoid(prediction[..., 1]) #-----------------------------------------------# # 先验框的宽高调整参数 #-----------------------------------------------# w = prediction[..., 2] h = prediction[..., 3] #-----------------------------------------------# # 获得置信度,是否有物体 #-----------------------------------------------# conf = torch.sigmoid(prediction[..., 4]) 
#-----------------------------------------------# # 种类置信度 #-----------------------------------------------# pred_cls = torch.sigmoid(prediction[..., 5:]) #-----------------------------------------------# # 获得网络应该有的预测结果 #-----------------------------------------------# y_true, noobj_mask, box_loss_scale = self.get_target(l, targets, scaled_anchors, in_h, in_w) #---------------------------------------------------------------# # 将预测结果进行解码,判断预测结果和真实值的重合程度 # 如果重合程度过大则忽略,因为这些特征点属于预测比较准确的特征点 # 作为负样本不合适 #----------------------------------------------------------------# noobj_mask, pred_boxes = self.get_ignore(l, x, y, h, w, targets, scaled_anchors, in_h, in_w, noobj_mask) if self.cuda: y_true = y_true.type_as(x) noobj_mask = noobj_mask.type_as(x) box_loss_scale = box_loss_scale.type_as(x) #--------------------------------------------------------------------------# # box_loss_scale是真实框宽高的乘积,宽高均在0-1之间,因此乘积也在0-1之间。 # 2-宽高的乘积代表真实框越大,比重越小,小框的比重更大。 # 使用iou损失时,大中小目标的回归损失不存在比例失衡问题,故弃用 #--------------------------------------------------------------------------# box_loss_scale = 2 - box_loss_scale loss = 0 obj_mask = y_true[..., 4] == 1 n = torch.sum(obj_mask) if n != 0: #---------------------------------------------------------------# # 计算预测结果和真实结果的差距 # loss_loc ciou回归损失 # loss_cls 分类损失 #---------------------------------------------------------------# ciou = self.box_ciou(pred_boxes, y_true[..., :4]).type_as(x) # loss_loc = torch.mean((1 - ciou)[obj_mask] * box_loss_scale[obj_mask]) loss_loc = torch.mean((1 - ciou)[obj_mask]) loss_cls = torch.mean(self.BCELoss(pred_cls[obj_mask], y_true[..., 5:][obj_mask])) loss += loss_loc * self.box_ratio + loss_cls * self.cls_ratio #---------------------------------------------------------------# # 计算是否包含物体的置信度损失 #---------------------------------------------------------------# if self.focal_loss: pos_neg_ratio = torch.where(obj_mask, torch.ones_like(conf) * self.alpha, torch.ones_like(conf) * (1 - self.alpha)) hard_easy_ratio = 
torch.where(obj_mask, torch.ones_like(conf) - conf, conf) ** self.gamma loss_conf = torch.mean((self.BCELoss(conf, obj_mask.type_as(conf)) * pos_neg_ratio * hard_easy_ratio)[noobj_mask.bool() | obj_mask]) * self.focal_loss_ratio else: loss_conf = torch.mean(self.BCELoss(conf, obj_mask.type_as(conf))[noobj_mask.bool() | obj_mask]) loss += loss_conf * self.balance[l] * self.obj_ratio # if n != 0: # print(loss_loc * self.box_ratio, loss_cls * self.cls_ratio, loss_conf * self.balance[l] * self.obj_ratio) return loss def calculate_iou(self, _box_a, _box_b): #-----------------------------------------------------------# # 计算真实框的左上角和右下角 #-----------------------------------------------------------# b1_x1, b1_x2 = _box_a[:, 0] - _box_a[:, 2] / 2, _box_a[:, 0] + _box_a[:, 2] / 2 b1_y1, b1_y2 = _box_a[:, 1] - _box_a[:, 3] / 2, _box_a[:, 1] + _box_a[:, 3] / 2 #-----------------------------------------------------------# # 计算先验框获得的预测框的左上角和右下角 #-----------------------------------------------------------# b2_x1, b2_x2 = _box_b[:, 0] - _box_b[:, 2] / 2, _box_b[:, 0] + _box_b[:, 2] / 2 b2_y1, b2_y2 = _box_b[:, 1] - _box_b[:, 3] / 2, _box_b[:, 1] + _box_b[:, 3] / 2 #-----------------------------------------------------------# # 将真实框和预测框都转化成左上角右下角的形式 #-----------------------------------------------------------# box_a = torch.zeros_like(_box_a) box_b = torch.zeros_like(_box_b) box_a[:, 0], box_a[:, 1], box_a[:, 2], box_a[:, 3] = b1_x1, b1_y1, b1_x2, b1_y2 box_b[:, 0], box_b[:, 1], box_b[:, 2], box_b[:, 3] = b2_x1, b2_y1, b2_x2, b2_y2 #-----------------------------------------------------------# # A为真实框的数量,B为先验框的数量 #-----------------------------------------------------------# A = box_a.size(0) B = box_b.size(0) #-----------------------------------------------------------# # 计算交的面积 #-----------------------------------------------------------# max_xy = torch.min(box_a[:, 2:].unsqueeze(1).expand(A, B, 2), box_b[:, 2:].unsqueeze(0).expand(A, B, 2)) min_xy = torch.max(box_a[:, 
:2].unsqueeze(1).expand(A, B, 2), box_b[:, :2].unsqueeze(0).expand(A, B, 2)) inter = torch.clamp((max_xy - min_xy), min=0) inter = inter[:, :, 0] * inter[:, :, 1] #-----------------------------------------------------------# # 计算预测框和真实框各自的面积 #-----------------------------------------------------------# area_a = ((box_a[:, 2]-box_a[:, 0]) * (box_a[:, 3]-box_a[:, 1])).unsqueeze(1).expand_as(inter) # [A,B] area_b = ((box_b[:, 2]-box_b[:, 0]) * (box_b[:, 3]-box_b[:, 1])).unsqueeze(0).expand_as(inter) # [A,B] #-----------------------------------------------------------# # 求IOU #-----------------------------------------------------------# union = area_a + area_b - inter return inter / union # [A,B] def get_target(self, l, targets, anchors, in_h, in_w): #-----------------------------------------------------# # 计算一共有多少张图片 #-----------------------------------------------------# bs = len(targets) #-----------------------------------------------------# # 用于选取哪些先验框不包含物体 #-----------------------------------------------------# noobj_mask = torch.ones(bs, len(self.anchors_mask[l]), in_h, in_w, requires_grad = False) #-----------------------------------------------------# # 让网络更加去关注小目标 #-----------------------------------------------------# box_loss_scale = torch.zeros(bs, len(self.anchors_mask[l]), in_h, in_w, requires_grad = False) #-----------------------------------------------------# # batch_size, 3, 13, 13, 5 + num_classes #-----------------------------------------------------# y_true = torch.zeros(bs, len(self.anchors_mask[l]), in_h, in_w, self.bbox_attrs, requires_grad = False) for b in range(bs): if len(targets[b])==0: continue batch_target = torch.zeros_like(targets[b]) #-------------------------------------------------------# # 计算出正样本在特征层上的中心点 #-------------------------------------------------------# batch_target[:, [0,2]] = targets[b][:, [0,2]] * in_w batch_target[:, [1,3]] = targets[b][:, [1,3]] * in_h batch_target[:, 4] = targets[b][:, 4] batch_target = 
batch_target.cpu() #-------------------------------------------------------# # Reshape the ground-truth boxes to (0, 0, w, h) # num_true_box, 4 #-------------------------------------------------------# gt_box = torch.FloatTensor(torch.cat((torch.zeros((batch_target.size(0), 2)), batch_target[:, 2:4]), 1)) #-------------------------------------------------------# # Reshape the anchors to (0, 0, w, h) # 9, 4 #-------------------------------------------------------# anchor_shapes = torch.FloatTensor(torch.cat((torch.zeros((len(anchors), 2)), torch.FloatTensor(anchors)), 1)) #-------------------------------------------------------# # Compute the IoU # self.calculate_iou(gt_box, anchor_shapes) = [num_true_box, 9], the overlap of every ground-truth box with each of the 9 anchors # best_ns: # [num_true_box], the index of the best-overlapping anchor for each ground-truth box (argmax returns indices only, not the max IoU) #-------------------------------------------------------# best_ns = torch.argmax(self.calculate_iou(gt_box, anchor_shapes), dim=-1) for t, best_n in enumerate(best_ns): if best_n not in self.anchors_mask[l]: continue #----------------------------------------# # Find which anchor of the current feature layer this is #----------------------------------------# k = self.anchors_mask[l].index(best_n) #----------------------------------------# # Find the grid cell that contains the ground-truth box centre #----------------------------------------# i = torch.floor(batch_target[t, 0]).long() j = torch.floor(batch_target[t, 1]).long() #----------------------------------------# # Get the class of the ground-truth box #----------------------------------------# c = batch_target[t, 4].long() #----------------------------------------# # noobj_mask marks the feature points without an object #----------------------------------------# noobj_mask[b, k, j, i] = 0 #----------------------------------------# # Store the ground-truth box (x, y, w, h in feature-layer units), the objectness flag and the one-hot class #----------------------------------------# y_true[b, k, j, i, 0] = batch_target[t, 0] y_true[b, k, j, i, 1] = batch_target[t, 1] y_true[b, k, j, i, 2] = batch_target[t, 2] y_true[b, k, j, i, 3] = batch_target[t, 3] y_true[b, k, j, i, 4] = 1 y_true[b, k, j, i, c + 5] = 1 #----------------------------------------# # Relative box area (w*h divided by the feature-layer area) # Large objects get a smaller loss weight, small objects a larger one #----------------------------------------#
                box_loss_scale[b, k, j, i] = batch_target[t, 2] * batch_target[t, 3] / in_w / in_h
        return y_true, noobj_mask, box_loss_scale

    def get_ignore(self, l, x, y, h, w, targets, scaled_anchors, in_h, in_w, noobj_mask):
        #-----------------------------------------------------#
        #   Number of images in the batch
        #-----------------------------------------------------#
        bs = len(targets)

        #-----------------------------------------------------#
        #   Build the grid: anchor centers sit at the top-left corner of each cell
        #-----------------------------------------------------#
        grid_x = torch.linspace(0, in_w - 1, in_w).repeat(in_h, 1).repeat(
            int(bs * len(self.anchors_mask[l])), 1, 1).view(x.shape).type_as(x)
        grid_y = torch.linspace(0, in_h - 1, in_h).repeat(in_w, 1).t().repeat(
            int(bs * len(self.anchors_mask[l])), 1, 1).view(y.shape).type_as(x)

        # Anchor widths and heights
        scaled_anchors_l = np.array(scaled_anchors)[self.anchors_mask[l]]
        anchor_w = torch.Tensor(scaled_anchors_l).index_select(1, torch.LongTensor([0])).type_as(x)
        anchor_h = torch.Tensor(scaled_anchors_l).index_select(1, torch.LongTensor([1])).type_as(x)

        anchor_w = anchor_w.repeat(bs, 1).repeat(1, 1, in_h * in_w).view(w.shape)
        anchor_h = anchor_h.repeat(bs, 1).repeat(1, 1, in_h * in_w).view(h.shape)
        #-------------------------------------------------------#
        #   Compute the adjusted anchor centers, widths and heights
        #-------------------------------------------------------#
        pred_boxes_x = torch.unsqueeze(x + grid_x, -1)
        pred_boxes_y = torch.unsqueeze(y + grid_y, -1)
        pred_boxes_w = torch.unsqueeze(torch.exp(w) * anchor_w, -1)
        pred_boxes_h = torch.unsqueeze(torch.exp(h) * anchor_h, -1)
        pred_boxes = torch.cat([pred_boxes_x, pred_boxes_y, pred_boxes_w, pred_boxes_h], dim = -1)
        for b in range(bs):
            #-------------------------------------------------------#
            #   Flatten the predictions
            #   pred_boxes_for_ignore      num_anchors, 4
            #-------------------------------------------------------#
            pred_boxes_for_ignore = pred_boxes[b].view(-1, 4)
            #-------------------------------------------------------#
            #   Build the ground-truth boxes and rescale them to feature-layer units
            #   gt_box      num_true_box, 4
            #-------------------------------------------------------#
            if len(targets[b]) > 0:
                batch_target = torch.zeros_like(targets[b])
                #-------------------------------------------------------#
                #   Map the positive samples' centers and sizes onto the feature layer
                #-------------------------------------------------------#
                batch_target[:, [0,2]] = targets[b][:, [0,2]] * in_w
                batch_target[:, [1,3]] = targets[b][:, [1,3]] * in_h
                batch_target = batch_target[:, :4].type_as(x)
                #-------------------------------------------------------#
                #   Compute the IoU
                #   anch_ious       num_true_box, num_anchors
                #-------------------------------------------------------#
                anch_ious = self.calculate_iou(batch_target, pred_boxes_for_ignore)
                #-------------------------------------------------------#
                #   Highest overlap of each anchor with any ground-truth box
                #   anch_ious_max   num_anchors
                #-------------------------------------------------------#
                anch_ious_max, _ = torch.max(anch_ious, dim = 0)
                anch_ious_max = anch_ious_max.view(pred_boxes[b].size()[:3])
                noobj_mask[b][anch_ious_max > self.ignore_threshold] = 0
        return noobj_mask, pred_boxes

def weights_init(net, init_type='normal', init_gain = 0.02):
    def init_func(m):
        classname = m.__class__.__name__
        if hasattr(m, 'weight') and classname.find('Conv') != -1:
            if init_type == 'normal':
                torch.nn.init.normal_(m.weight.data, 0.0, init_gain)
            elif init_type == 'xavier':
                torch.nn.init.xavier_normal_(m.weight.data, gain=init_gain)
            elif init_type == 'kaiming':
                torch.nn.init.kaiming_normal_(m.weight.data, a=0, mode='fan_in')
            elif init_type == 'orthogonal':
                torch.nn.init.orthogonal_(m.weight.data, gain=init_gain)
            else:
                raise NotImplementedError('initialization method [%s] is not implemented' % init_type)
        elif classname.find('BatchNorm2d') != -1:
            torch.nn.init.normal_(m.weight.data, 1.0, 0.02)
            torch.nn.init.constant_(m.bias.data, 0.0)
    print('initialize network with %s type' % init_type)
    net.apply(init_func)

def get_lr_scheduler(lr_decay_type, lr, min_lr, total_iters, warmup_iters_ratio = 0.05, warmup_lr_ratio = 0.1, no_aug_iter_ratio = 0.05, step_num = 10):
    def yolox_warm_cos_lr(lr,
min_lr, total_iters, warmup_total_iters, warmup_lr_start, no_aug_iter, iters):
        if iters <= warmup_total_iters:
            # lr = (lr - warmup_lr_start) * iters / float(warmup_total_iters) + warmup_lr_start
            lr = (lr - warmup_lr_start) * pow(iters / float(warmup_total_iters), 2) + warmup_lr_start
        elif iters >= total_iters - no_aug_iter:
            lr = min_lr
        else:
            lr = min_lr + 0.5 * (lr - min_lr) * (
                1.0 + math.cos(math.pi* (iters - warmup_total_iters) / (total_iters - warmup_total_iters - no_aug_iter))
            )
        return lr

    def step_lr(lr, decay_rate, step_size, iters):
        if step_size < 1:
            raise ValueError("step_size must be at least 1.")
        n = iters // step_size
        out_lr = lr * decay_rate ** n
        return out_lr

    if lr_decay_type == "cos":
        warmup_total_iters = min(max(warmup_iters_ratio * total_iters, 1), 3)
        warmup_lr_start = max(warmup_lr_ratio * lr, 1e-6)
        no_aug_iter = min(max(no_aug_iter_ratio * total_iters, 1), 15)
        func = partial(yolox_warm_cos_lr ,lr, min_lr, total_iters, warmup_total_iters, warmup_lr_start, no_aug_iter)
    else:
        decay_rate = (min_lr / lr) ** (1 / (step_num - 1))
        step_size = total_iters / step_num
        func = partial(step_lr, lr, decay_rate, step_size)

    return func

def set_optimizer_lr(optimizer, lr_scheduler_func, epoch):
    lr = lr_scheduler_func(epoch)
    for param_group in optimizer.param_groups:
        param_group['lr'] = lr

================================================
FILE: nets/yolotiny_training.py
================================================
import math
from functools import partial

import numpy as np
import torch
import torch.nn as nn


class YOLOLosstiny(nn.Module):
    def __init__(self, anchors, num_classes, input_shape, cuda, anchors_mask = [[6,7,8], [3,4,5], [0,1,2]], label_smoothing = 0):
        super(YOLOLosstiny, self).__init__()
        #-----------------------------------------------------------#
        #   Anchors for the 13x13 feature layer: [81,82],[135,169],[344,319]
        #   Anchors for the 26x26 feature layer: [10,14],[23,27],[37,58]
        #-----------------------------------------------------------#
        self.anchors = anchors
        self.num_classes = num_classes
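# Sketch (illustrative only, not part of the original module): how an anchors
# mask selects the anchors of one feature layer and rescales them from pixels
# to feature-map cells, in the spirit of the scaled_anchors computation in
# forward(). The anchor values are taken from the comment above; the mask
# [[3, 4, 5], [0, 1, 2]] and the helper name are assumptions for this example.
def _sketch_layer_anchors(anchors, anchors_mask, input_shape, layer, in_h, in_w):
    # One feature cell covers stride_h x stride_w input pixels.
    stride_h = input_shape[0] / in_h
    stride_w = input_shape[1] / in_w
    # Keep only this layer's anchors and express them in cell units.
    return [(anchors[i][0] / stride_w, anchors[i][1] / stride_h) for i in anchors_mask[layer]]

_sketch_anchors = [[10, 14], [23, 27], [37, 58], [81, 82], [135, 169], [344, 319]]
# 13x13 layer of a 416x416 input: stride 32, anchors with indices 3, 4, 5.
_sketch_scaled = _sketch_layer_anchors(_sketch_anchors, [[3, 4, 5], [0, 1, 2]], (416, 416), 0, 13, 13)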
        self.bbox_attrs = 5 + num_classes
        self.input_shape = input_shape
        self.anchors_mask = anchors_mask
        self.label_smoothing = label_smoothing

        self.balance = [0.4, 1.0, 4]
        self.box_ratio = 0.05
        self.obj_ratio = 5 * (input_shape[0] * input_shape[1]) / (416 ** 2)
        self.cls_ratio = 1 * (num_classes / 80)

        self.ignore_threshold = 0.5
        self.cuda = cuda

    def clip_by_tensor(self, t, t_min, t_max):
        t = t.float()
        result = (t >= t_min).float() * t + (t < t_min).float() * t_min
        result = (result <= t_max).float() * result + (result > t_max).float() * t_max
        return result

    def MSELoss(self, pred, target):
        return torch.pow(pred - target, 2)

    def BCELoss(self, pred, target):
        epsilon = 1e-7
        pred = self.clip_by_tensor(pred, epsilon, 1.0 - epsilon)
        output = - target * torch.log(pred) - (1.0 - target) * torch.log(1.0 - pred)
        return output

    def box_ciou(self, b1, b2):
        """
        Inputs:
        ----------
        b1: tensor, shape=(batch, feat_w, feat_h, anchor_num, 4), xywh
        b2: tensor, shape=(batch, feat_w, feat_h, anchor_num, 4), xywh

        Returns:
        -------
        ciou: tensor, shape=(batch, feat_w, feat_h, anchor_num, 1)
        """
        #----------------------------------------------------#
        #   Top-left and bottom-right corners of the predicted boxes
        #----------------------------------------------------#
        b1_xy = b1[..., :2]
        b1_wh = b1[..., 2:4]
        b1_wh_half = b1_wh/2.
        b1_mins = b1_xy - b1_wh_half
        b1_maxes = b1_xy + b1_wh_half
        #----------------------------------------------------#
        #   Top-left and bottom-right corners of the ground-truth boxes
        #----------------------------------------------------#
        b2_xy = b2[..., :2]
        b2_wh = b2[..., 2:4]
        b2_wh_half = b2_wh/2.
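# Sketch (illustrative only, not part of the original module): the same CIoU
# arithmetic as box_ciou, written in plain Python for a single pair of
# (x_center, y_center, w, h) boxes so the terms are easy to follow. The helper
# name and the eps default are assumptions made for this example.
import math

def _sketch_ciou(b1, b2, eps=1e-6):
    x1, y1, w1, h1 = b1
    x2, y2, w2, h2 = b2
    # Intersection and union areas give the plain IoU.
    inter_w = max(min(x1 + w1 / 2, x2 + w2 / 2) - max(x1 - w1 / 2, x2 - w2 / 2), 0.0)
    inter_h = max(min(y1 + h1 / 2, y2 + h2 / 2) - max(y1 - h1 / 2, y2 - h2 / 2), 0.0)
    inter = inter_w * inter_h
    union = w1 * h1 + w2 * h2 - inter
    iou = inter / max(union, eps)
    # Squared center distance over the squared diagonal of the enclosing box.
    center_distance = (x1 - x2) ** 2 + (y1 - y2) ** 2
    enclose_w = max(x1 + w1 / 2, x2 + w2 / 2) - min(x1 - w1 / 2, x2 - w2 / 2)
    enclose_h = max(y1 + h1 / 2, y2 + h2 / 2) - min(y1 - h1 / 2, y2 - h2 / 2)
    enclose_diagonal = enclose_w ** 2 + enclose_h ** 2
    ciou = iou - center_distance / max(enclose_diagonal, eps)
    # Aspect-ratio consistency term v and its weight alpha.
    v = (4 / math.pi ** 2) * (math.atan(w1 / max(h1, eps)) - math.atan(w2 / max(h2, eps))) ** 2
    alpha = v / max(1.0 - iou + v, eps)
    return ciou - alpha * v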
        b2_mins = b2_xy - b2_wh_half
        b2_maxes = b2_xy + b2_wh_half

        #----------------------------------------------------#
        #   IoU between every ground-truth box and predicted box
        #----------------------------------------------------#
        intersect_mins = torch.max(b1_mins, b2_mins)
        intersect_maxes = torch.min(b1_maxes, b2_maxes)
        intersect_wh = torch.max(intersect_maxes - intersect_mins, torch.zeros_like(intersect_maxes))
        intersect_area = intersect_wh[..., 0] * intersect_wh[..., 1]
        b1_area = b1_wh[..., 0] * b1_wh[..., 1]
        b2_area = b2_wh[..., 0] * b2_wh[..., 1]
        union_area = b1_area + b2_area - intersect_area
        iou = intersect_area / torch.clamp(union_area,min = 1e-6)

        #----------------------------------------------------#
        #   Squared distance between the box centers
        #----------------------------------------------------#
        center_distance = torch.sum(torch.pow((b1_xy - b2_xy), 2), axis=-1)

        #----------------------------------------------------#
        #   Top-left and bottom-right corners of the smallest box enclosing both boxes
        #----------------------------------------------------#
        enclose_mins = torch.min(b1_mins, b2_mins)
        enclose_maxes = torch.max(b1_maxes, b2_maxes)
        enclose_wh = torch.max(enclose_maxes - enclose_mins, torch.zeros_like(intersect_maxes))
        #----------------------------------------------------#
        #   Squared diagonal of the enclosing box
        #----------------------------------------------------#
        enclose_diagonal = torch.sum(torch.pow(enclose_wh,2), axis=-1)
        ciou = iou - 1.0 * (center_distance) / torch.clamp(enclose_diagonal,min = 1e-6)

        v = (4 / (math.pi ** 2)) * torch.pow((torch.atan(b1_wh[..., 0] / torch.clamp(b1_wh[..., 1],min = 1e-6)) - torch.atan(b2_wh[..., 0] / torch.clamp(b2_wh[..., 1], min = 1e-6))), 2)
        alpha = v / torch.clamp((1.0 - iou + v), min=1e-6)
        ciou = ciou - alpha * v
        return ciou

    #---------------------------------------------------#
    #   Label smoothing
    #---------------------------------------------------#
    def smooth_labels(self, y_true, label_smoothing, num_classes):
        return y_true * (1.0 - label_smoothing) + label_smoothing / num_classes

    def forward(self, l, input, targets=None):
        #----------------------------------------------------#
        #   l
is the index of the effective feature layer in use
        #   input shape:    bs, 3*(5+num_classes), 13, 13
        #                   bs, 3*(5+num_classes), 26, 26
        #   targets: ground-truth labels, [batch_size, num_gt, 5]
        #----------------------------------------------------#
        #--------------------------------#
        #   Batch size and feature-layer height and width
        #--------------------------------#
        bs = input.size(0)
        in_h = input.size(2)
        in_w = input.size(3)
        #-----------------------------------------------------------------------#
        #   Compute the stride:
        #   how many pixels of the original image one feature cell covers
        #
        #   For a 13x13 feature layer, one cell covers 32 pixels of the original image
        #   For a 26x26 feature layer, one cell covers 16 pixels of the original image
        #   stride_h = stride_w = 32, 16
        #-----------------------------------------------------------------------#
        stride_h = self.input_shape[0] / in_h
        stride_w = self.input_shape[1] / in_w
        #-------------------------------------------------#
        #   scaled_anchors are expressed in feature-layer units
        #-------------------------------------------------#
        scaled_anchors = [(a_w / stride_w, a_h / stride_h) for a_w, a_h in self.anchors]
        #-----------------------------------------------#
        #   There is one input per feature layer; the shapes are
        #   bs, 3 * (5+num_classes), 13, 13 => bs, 3, 5 + num_classes, 13, 13 => batch_size, 3, 13, 13, 5 + num_classes
        #   batch_size, 3, 13, 13, 5 + num_classes
        #   batch_size, 3, 26, 26, 5 + num_classes
        #-----------------------------------------------#
        prediction = input.view(bs, len(self.anchors_mask[l]), self.bbox_attrs, in_h, in_w).permute(0, 1, 3, 4, 2).contiguous()

        #-----------------------------------------------#
        #   Adjustment parameters for the anchor center
        #-----------------------------------------------#
        x = torch.sigmoid(prediction[..., 0])
        y = torch.sigmoid(prediction[..., 1])
        #-----------------------------------------------#
        #   Adjustment parameters for the anchor width and height
        #-----------------------------------------------#
        w = prediction[..., 2]
        h = prediction[..., 3]
        #-----------------------------------------------#
        #   Objectness confidence: whether an object is present
        #-----------------------------------------------#
        conf = torch.sigmoid(prediction[..., 4])
        #-----------------------------------------------#
        #   Class confidence
        #-----------------------------------------------#
        pred_cls =
torch.sigmoid(prediction[..., 5:])

        #-----------------------------------------------#
        #   Build the targets the network should predict
        #-----------------------------------------------#
        y_true, noobj_mask, box_loss_scale = self.get_target(l, targets, scaled_anchors, in_h, in_w)

        #---------------------------------------------------------------#
        #   Decode the predictions and measure their overlap with the ground truth.
        #   Predictions that overlap a ground-truth box too much are ignored, because
        #   those cells already predict fairly accurately and are unsuitable as negatives.
        #---------------------------------------------------------------#
        noobj_mask, pred_boxes = self.get_ignore(l, x, y, h, w, targets, scaled_anchors, in_h, in_w, noobj_mask)

        if self.cuda:
            y_true = y_true.type_as(x)
            noobj_mask = noobj_mask.type_as(x)
            box_loss_scale = box_loss_scale.type_as(x)
        #--------------------------------------------------------------------------#
        #   box_loss_scale is the product of the ground-truth width and height; both lie
        #   in 0-1, so the product does too.
        #   2 - (width * height) gives larger boxes a smaller weight and smaller boxes a larger one.
        #   With an IoU loss the regression loss is not imbalanced across large, medium and
        #   small objects, so this weight is no longer used.
        #--------------------------------------------------------------------------#
        box_loss_scale = 2 - box_loss_scale

        loss = 0
        obj_mask = y_true[..., 4] == 1
        n = torch.sum(obj_mask)
        if n != 0:
            #---------------------------------------------------------------#
            #   Difference between the predictions and the ground truth
            #   loss_loc    CIoU regression loss
            #   loss_cls    classification loss
            #---------------------------------------------------------------#
            ciou = self.box_ciou(pred_boxes, y_true[..., :4]).type_as(x)
            # loss_loc = torch.mean((1 - ciou)[obj_mask] * box_loss_scale[obj_mask])
            loss_loc = torch.mean((1 - ciou)[obj_mask])

            loss_cls = torch.mean(self.BCELoss(pred_cls[obj_mask], y_true[..., 5:][obj_mask]))
            loss += loss_loc * self.box_ratio + loss_cls * self.cls_ratio

        loss_conf = torch.mean(self.BCELoss(conf, obj_mask.type_as(conf))[noobj_mask.bool() | obj_mask])
        loss += loss_conf * self.balance[l] * self.obj_ratio
        # if n != 0:
        #     print(loss_loc * self.box_ratio, loss_cls * self.cls_ratio, loss_conf * self.balance[l] * self.obj_ratio)
        return loss

    def calculate_iou(self, _box_a, _box_b):
        #-----------------------------------------------------------#
        #   Top-left and bottom-right corners of the ground-truth boxes
        #-----------------------------------------------------------#
        b1_x1, b1_x2 = _box_a[:, 0] - _box_a[:, 2] / 2, _box_a[:, 0] + _box_a[:, 2] / 2
        b1_y1, b1_y2 = _box_a[:, 1] - _box_a[:, 3] / 2, _box_a[:, 1] + _box_a[:, 3] / 2
        #-----------------------------------------------------------#
        #   Top-left and bottom-right corners of the boxes predicted from the anchors
        #-----------------------------------------------------------#
        b2_x1, b2_x2 = _box_b[:, 0] - _box_b[:, 2] / 2, _box_b[:, 0] + _box_b[:, 2] / 2
        b2_y1, b2_y2 = _box_b[:, 1] - _box_b[:, 3] / 2, _box_b[:, 1] + _box_b[:, 3] / 2

        #-----------------------------------------------------------#
        #   Convert both ground-truth and predicted boxes to corner form (x1, y1, x2, y2)
        #-----------------------------------------------------------#
        box_a = torch.zeros_like(_box_a)
        box_b = torch.zeros_like(_box_b)
        box_a[:, 0], box_a[:, 1], box_a[:, 2], box_a[:, 3] = b1_x1, b1_y1, b1_x2, b1_y2
        box_b[:, 0], box_b[:, 1], box_b[:, 2], box_b[:, 3] = b2_x1, b2_y1, b2_x2, b2_y2

        #-----------------------------------------------------------#
        #   A is the number of ground-truth boxes, B the number of anchors
        #-----------------------------------------------------------#
        A = box_a.size(0)
        B = box_b.size(0)

        #-----------------------------------------------------------#
        #   Compute the intersection area
        #-----------------------------------------------------------#
        max_xy = torch.min(box_a[:, 2:].unsqueeze(1).expand(A, B, 2), box_b[:, 2:].unsqueeze(0).expand(A, B, 2))
        min_xy = torch.max(box_a[:, :2].unsqueeze(1).expand(A, B, 2), box_b[:, :2].unsqueeze(0).expand(A, B, 2))
        inter = torch.clamp((max_xy - min_xy), min=0)
        inter = inter[:, :, 0] * inter[:, :, 1]
        #-----------------------------------------------------------#
        #   Compute the areas of the predicted and ground-truth boxes
        #-----------------------------------------------------------#
        area_a = ((box_a[:, 2]-box_a[:, 0]) * (box_a[:, 3]-box_a[:, 1])).unsqueeze(1).expand_as(inter)  # [A,B]
        area_b = ((box_b[:, 2]-box_b[:, 0]) * (box_b[:, 3]-box_b[:, 1])).unsqueeze(0).expand_as(inter)  # [A,B]
        #-----------------------------------------------------------#
        #   Compute the IoU
        #-----------------------------------------------------------#
        union =
area_a + area_b - inter
        return inter / union  # [A,B]

    def get_target(self, l, targets, anchors, in_h, in_w):
        #-----------------------------------------------------#
        #   Number of images in the batch
        #-----------------------------------------------------#
        bs = len(targets)
        #-----------------------------------------------------#
        #   Mask marking which anchors contain no object
        #-----------------------------------------------------#
        noobj_mask = torch.ones(bs, len(self.anchors_mask[l]), in_h, in_w, requires_grad = False)
        #-----------------------------------------------------#
        #   Makes the network pay more attention to small objects
        #-----------------------------------------------------#
        box_loss_scale = torch.zeros(bs, len(self.anchors_mask[l]), in_h, in_w, requires_grad = False)
        #-----------------------------------------------------#
        #   batch_size, 3, 13, 13, 5 + num_classes
        #-----------------------------------------------------#
        y_true = torch.zeros(bs, len(self.anchors_mask[l]), in_h, in_w, self.bbox_attrs, requires_grad = False)
        for b in range(bs):
            if len(targets[b])==0:
                continue
            batch_target = torch.zeros_like(targets[b])
            #-------------------------------------------------------#
            #   Map the positive samples' centers and sizes onto the feature layer
            #-------------------------------------------------------#
            batch_target[:, [0,2]] = targets[b][:, [0,2]] * in_w
            batch_target[:, [1,3]] = targets[b][:, [1,3]] * in_h
            batch_target[:, 4] = targets[b][:, 4]
            batch_target = batch_target.cpu()

            #-------------------------------------------------------#
            #   Convert the ground-truth boxes to the form (0, 0, w, h)
            #   num_true_box, 4
            #-------------------------------------------------------#
            gt_box = torch.FloatTensor(torch.cat((torch.zeros((batch_target.size(0), 2)), batch_target[:, 2:4]), 1))
            #-------------------------------------------------------#
            #   Convert the anchors to the form (0, 0, w, h)
            #   9, 4
            #-------------------------------------------------------#
            anchor_shapes = torch.FloatTensor(torch.cat((torch.zeros((len(anchors), 2)), torch.FloatTensor(anchors)), 1))
            #-------------------------------------------------------#
            #   Compute the IoU
            #   self.calculate_iou(gt_box, anchor_shapes) = [num_true_box,
9], the overlap of each ground-truth box with the 9 anchors
            #   best_ns:
            #   for each ground-truth box, the index of the anchor that overlaps it most
            #-------------------------------------------------------#
            iou = self.calculate_iou(gt_box, anchor_shapes)
            best_ns = torch.argmax(iou, dim=-1)
            sort_ns = torch.argsort(iou, dim=-1, descending=True)

            def check_in_anchors_mask(index, anchors_mask):
                for sub_anchors_mask in anchors_mask:
                    if index in sub_anchors_mask:
                        return True
                return False

            for t, best_n in enumerate(best_ns):
                #----------------------------------------#
                #   Guard against a best match that is not in anchors_mask:
                #   fall back to the next-best anchor that is
                #----------------------------------------#
                if not check_in_anchors_mask(best_n, self.anchors_mask):
                    for index in sort_ns[t]:
                        if check_in_anchors_mask(index, self.anchors_mask):
                            best_n = index
                            break

                if best_n not in self.anchors_mask[l]:
                    continue
                #----------------------------------------#
                #   Find which anchor of the current feature layer this is
                #----------------------------------------#
                k = self.anchors_mask[l].index(best_n)
                #----------------------------------------#
                #   Find the grid cell the ground-truth box belongs to
                #----------------------------------------#
                i = torch.floor(batch_target[t, 0]).long()
                j = torch.floor(batch_target[t, 1]).long()
                #----------------------------------------#
                #   Class of the ground-truth box
                #----------------------------------------#
                c = batch_target[t, 4].long()
                #----------------------------------------#
                #   noobj_mask marks the cells that contain no object
                #----------------------------------------#
                noobj_mask[b, k, j, i] = 0
                #----------------------------------------#
                #   Ground-truth box (x, y, w, h in feature-map units), objectness and class
                #----------------------------------------#
                y_true[b, k, j, i, 0] = batch_target[t, 0]
                y_true[b, k, j, i, 1] = batch_target[t, 1]
                y_true[b, k, j, i, 2] = batch_target[t, 2]
                y_true[b, k, j, i, 3] = batch_target[t, 3]
                y_true[b, k, j, i, 4] = 1
                y_true[b, k, j, i, c + 5] = 1
                #----------------------------------------#
                #   Box area relative to the feature map
                #   Large objects get a small loss weight, small objects a large one
                #----------------------------------------#
                box_loss_scale[b, k, j, i] = batch_target[t, 2] * batch_target[t, 3] / in_w / in_h
        return y_true, noobj_mask, box_loss_scale

    def get_ignore(self, l, x, y, h, w, targets,
scaled_anchors, in_h, in_w, noobj_mask):
        #-----------------------------------------------------#
        #   Number of images in the batch
        #-----------------------------------------------------#
        bs = len(targets)

        #-----------------------------------------------------#
        #   Build the grid: anchor centers sit at the top-left corner of each cell
        #-----------------------------------------------------#
        grid_x = torch.linspace(0, in_w - 1, in_w).repeat(in_h, 1).repeat(
            int(bs * len(self.anchors_mask[l])), 1, 1).view(x.shape).type_as(x)
        grid_y = torch.linspace(0, in_h - 1, in_h).repeat(in_w, 1).t().repeat(
            int(bs * len(self.anchors_mask[l])), 1, 1).view(y.shape).type_as(x)

        # Anchor widths and heights
        scaled_anchors_l = np.array(scaled_anchors)[self.anchors_mask[l]]
        anchor_w = torch.Tensor(scaled_anchors_l).index_select(1, torch.LongTensor([0])).type_as(x)
        anchor_h = torch.Tensor(scaled_anchors_l).index_select(1, torch.LongTensor([1])).type_as(x)

        anchor_w = anchor_w.repeat(bs, 1).repeat(1, 1, in_h * in_w).view(w.shape)
        anchor_h = anchor_h.repeat(bs, 1).repeat(1, 1, in_h * in_w).view(h.shape)
        #-------------------------------------------------------#
        #   Compute the adjusted anchor centers, widths and heights
        #-------------------------------------------------------#
        pred_boxes_x = torch.unsqueeze(x + grid_x, -1)
        pred_boxes_y = torch.unsqueeze(y + grid_y, -1)
        pred_boxes_w = torch.unsqueeze(torch.exp(w) * anchor_w, -1)
        pred_boxes_h = torch.unsqueeze(torch.exp(h) * anchor_h, -1)
        pred_boxes = torch.cat([pred_boxes_x, pred_boxes_y, pred_boxes_w, pred_boxes_h], dim = -1)
        for b in range(bs):
            #-------------------------------------------------------#
            #   Flatten the predictions
            #   pred_boxes_for_ignore      num_anchors, 4
            #-------------------------------------------------------#
            pred_boxes_for_ignore = pred_boxes[b].view(-1, 4)
            #-------------------------------------------------------#
            #   Build the ground-truth boxes and rescale them to feature-layer units
            #   gt_box      num_true_box, 4
            #-------------------------------------------------------#
            if len(targets[b]) > 0:
                batch_target = torch.zeros_like(targets[b])
                #-------------------------------------------------------#
                #   Map the positive samples' centers and sizes onto the feature layer
                #-------------------------------------------------------#
                batch_target[:, [0,2]] = targets[b][:, [0,2]] * in_w
                batch_target[:, [1,3]] = targets[b][:, [1,3]] * in_h
                batch_target = batch_target[:, :4].type_as(x)
                #-------------------------------------------------------#
                #   Compute the IoU
                #   anch_ious       num_true_box, num_anchors
                #-------------------------------------------------------#
                anch_ious = self.calculate_iou(batch_target, pred_boxes_for_ignore)
                #-------------------------------------------------------#
                #   Highest overlap of each anchor with any ground-truth box
                #   anch_ious_max   num_anchors
                #-------------------------------------------------------#
                anch_ious_max, _ = torch.max(anch_ious, dim = 0)
                anch_ious_max = anch_ious_max.view(pred_boxes[b].size()[:3])
                noobj_mask[b][anch_ious_max > self.ignore_threshold] = 0
        return noobj_mask, pred_boxes

def weights_init(net, init_type='normal', init_gain = 0.02):
    def init_func(m):
        classname = m.__class__.__name__
        if hasattr(m, 'weight') and classname.find('Conv') != -1:
            if init_type == 'normal':
                torch.nn.init.normal_(m.weight.data, 0.0, init_gain)
            elif init_type == 'xavier':
                torch.nn.init.xavier_normal_(m.weight.data, gain=init_gain)
            elif init_type == 'kaiming':
                torch.nn.init.kaiming_normal_(m.weight.data, a=0, mode='fan_in')
            elif init_type == 'orthogonal':
                torch.nn.init.orthogonal_(m.weight.data, gain=init_gain)
            else:
                raise NotImplementedError('initialization method [%s] is not implemented' % init_type)
        elif classname.find('BatchNorm2d') != -1:
            torch.nn.init.normal_(m.weight.data, 1.0, 0.02)
            torch.nn.init.constant_(m.bias.data, 0.0)
    print('initialize network with %s type' % init_type)
    net.apply(init_func)

def get_lr_scheduler(lr_decay_type, lr, min_lr, total_iters, warmup_iters_ratio = 0.05, warmup_lr_ratio = 0.1, no_aug_iter_ratio = 0.05, step_num = 10):
    def yolox_warm_cos_lr(lr, min_lr, total_iters, warmup_total_iters, warmup_lr_start, no_aug_iter, iters):
        if iters <= warmup_total_iters:
            # lr = (lr - warmup_lr_start) * iters / float(warmup_total_iters) + warmup_lr_start
            lr =
(lr - warmup_lr_start) * pow(iters / float(warmup_total_iters), 2) + warmup_lr_start
        elif iters >= total_iters - no_aug_iter:
            lr = min_lr
        else:
            lr = min_lr + 0.5 * (lr - min_lr) * (
                1.0 + math.cos(math.pi* (iters - warmup_total_iters) / (total_iters - warmup_total_iters - no_aug_iter))
            )
        return lr

    def step_lr(lr, decay_rate, step_size, iters):
        if step_size < 1:
            raise ValueError("step_size must be at least 1.")
        n = iters // step_size
        out_lr = lr * decay_rate ** n
        return out_lr

    if lr_decay_type == "cos":
        warmup_total_iters = min(max(warmup_iters_ratio * total_iters, 1), 3)
        warmup_lr_start = max(warmup_lr_ratio * lr, 1e-6)
        no_aug_iter = min(max(no_aug_iter_ratio * total_iters, 1), 15)
        func = partial(yolox_warm_cos_lr ,lr, min_lr, total_iters, warmup_total_iters, warmup_lr_start, no_aug_iter)
    else:
        decay_rate = (min_lr / lr) ** (1 / (step_num - 1))
        step_size = total_iters / step_num
        func = partial(step_lr, lr, decay_rate, step_size)

    return func

def set_optimizer_lr(optimizer, lr_scheduler_func, epoch):
    lr = lr_scheduler_func(epoch)
    for param_group in optimizer.param_groups:
        param_group['lr'] = lr

================================================
FILE: packages.txt
================================================
freeglut3-dev
libgtk2.0-dev

================================================
FILE: predict.py
================================================
#-----------------------------------------------------------------------#
#   predict.py combines single-image prediction, camera/video detection,
#   FPS testing and directory-wide detection in one file;
#   the behavior is selected with the mode option.
#-----------------------------------------------------------------------#
import time
import yaml
import cv2
import numpy as np
from PIL import Image
from get_yaml import get_config
from yolo import YOLO
import argparse

if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument('--weights',type=str,default='model_data/yolotiny_SE_ep100.pth',help='initial weights path')
    parser.add_argument('--tiny',action='store_true',help='use the yolotiny model')
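# Sketch (illustrative only, not part of the original script): how the
# store_true flags used by this parser behave. A flag such as --tiny is False
# unless it appears on the command line. The helper name is an assumption; it
# rebuilds only two of the options so the example runs on its own.
import argparse

def _sketch_parse(argv):
    sketch_parser = argparse.ArgumentParser()
    sketch_parser.add_argument('--weights', type=str, default='model_data/yolotiny_SE_ep100.pth')
    sketch_parser.add_argument('--tiny', action='store_true')
    # Passing an explicit list avoids reading sys.argv.
    return sketch_parser.parse_args(argv)

_sketch_default = _sketch_parse([])
_sketch_tiny = _sketch_parse(['--tiny', '--weights', 'logs/last.pth'])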
    parser.add_argument('--phi',type=int,default=1,help='attention mechanism type for yolov4-tiny')
    parser.add_argument('--mode',type=str,choices=['dir_predict', 'video', 'fps','predict','heatmap','export_onnx'],default="dir_predict",help='prediction mode')
    parser.add_argument('--cuda',action='store_true',help='use the GPU')
    parser.add_argument('--shape',type=int,default=416,help='input image shape')
    parser.add_argument('--video',type=str,default='',help='video file to run detection on')
    parser.add_argument('--save-video',type=str,default='',help='where to save the output video')
    parser.add_argument('--confidence',type=float,default=0.5,help='only boxes scoring above this confidence are kept')
    parser.add_argument('--nms_iou',type=float,default=0.3,help='IoU threshold used by non-maximum suppression')
    opt = parser.parse_args()
    print(opt)

    # Configuration file
    config = get_config()
    yolo = YOLO(opt)

    #----------------------------------------------------------------------------------------------------------#
    #   mode selects the test mode:
    #   'predict'       single-image prediction; to change the prediction flow (save the image, crop objects, ...), see the detailed notes below
    #   'video'         video detection from a camera or a video file; see the notes below
    #   'fps'           FPS test; the image used is set by fps_image_path; see the notes below
    #   'dir_predict'   detect every image in a folder and save the results; reads the img folder and writes to img_out by default; see the notes below
    #   'heatmap'       heatmap visualization of the prediction; see the notes below
    #   'export_onnx'   export the model to ONNX; requires PyTorch 1.7.1 or later
    #----------------------------------------------------------------------------------------------------------#
    mode = opt.mode
    #-------------------------------------------------------------------------#
    #   crop    whether to crop the detected objects after single-image prediction
    #   count   whether to count the detected objects
    #   crop and count only take effect when mode='predict'
    #-------------------------------------------------------------------------#
    crop = False
    count = False
    #----------------------------------------------------------------------------------------------------------#
    #   video_path          path of the video; video_path=0 means the camera is used
    #                       To run on a video file, set e.g. video_path = "xxx.mp4" to read xxx.mp4 from the root directory.
    #   video_save_path     where the video is saved; video_save_path="" means it is not saved
    #                       To save the video, set e.g. video_save_path = "yyy.mp4" to write yyy.mp4 in the root directory.
    #   video_fps           fps of the saved video
    #
    #   video_path, video_save_path and video_fps only take effect when mode='video'
    #   
When saving a video, the file is only completed after exiting with ctrl+c or after the last frame has been processed.
    #----------------------------------------------------------------------------------------------------------#
    video_path = 0 if opt.video == '' else opt.video
    video_save_path = opt.save_video
    video_fps = 25.0
    #----------------------------------------------------------------------------------------------------------#
    #   test_interval       number of detections run when measuring fps; in theory a larger test_interval gives a more accurate fps
    #   fps_image_path      image used for the fps test
    #
    #   test_interval and fps_image_path only take effect when mode='fps'
    #----------------------------------------------------------------------------------------------------------#
    test_interval = 100
    fps_image_path = "img/up.jpg"
    #-------------------------------------------------------------------------#
    #   dir_origin_path     folder containing the images to run detection on
    #   dir_save_path       folder where the detected images are saved
    #
    #   dir_origin_path and dir_save_path only take effect when mode='dir_predict'
    #-------------------------------------------------------------------------#
    dir_origin_path = "img/"
    dir_save_path = "img_out/"
    #-------------------------------------------------------------------------#
    #   heatmap_save_path   where the heatmap is saved; defaults to model_data
    #
    #   heatmap_save_path only takes effect when mode='heatmap'
    #-------------------------------------------------------------------------#
    heatmap_save_path = "model_data/heatmap_vision.png"
    #-------------------------------------------------------------------------#
    #   simplify            whether to simplify the exported ONNX model
    #   onnx_save_path      where the ONNX model is saved
    #-------------------------------------------------------------------------#
    simplify = True
    onnx_save_path = "model_data/models.onnx"

    if mode == "predict":
        '''
        1. To save the detected image, call r_image.save("img.jpg"); edit predict.py directly to do so.
        2. To get the coordinates of the predicted boxes, go into yolo.detect_image and read top, left, bottom and right in the drawing section.
        3. To crop the detected objects, go into yolo.detect_image and use the top, left, bottom and right values from the drawing section
        to slice the original image as an array.
        4. To write extra text on the result, such as the number of objects of a particular class, go into yolo.detect_image and check predicted_class in the drawing section,
        for example if predicted_class == 'car': tells you whether the current object is a car, so you can count it. Use draw.text to write the text.
        '''
        while True:
            img = input('Input image
filename:')
            try:
                image = Image.open(img)
            except Exception:
                print('Open Error! Try again!')
                continue
            else:
                r_image = yolo.detect_image(image, crop = crop, count=count)
                r_image.show()
                r_image.save(dir_save_path + 'img_result.jpg')

    elif mode == "video":
        capture = cv2.VideoCapture(video_path)
        if video_save_path != '':
            fourcc = cv2.VideoWriter_fourcc(*'XVID')
            size = (int(capture.get(cv2.CAP_PROP_FRAME_WIDTH)), int(capture.get(cv2.CAP_PROP_FRAME_HEIGHT)))
            out = cv2.VideoWriter(video_save_path, fourcc, video_fps, size)

        ref, frame = capture.read()
        if not ref:
            raise ValueError("Could not read from the camera (or video). Check that the camera is installed correctly (or that the video path is correct).")

        fps = 0.0
        while(True):
            t1 = time.time()
            # Read one frame
            ref, frame = capture.read()
            if not ref:
                break
            # Convert the color order from BGR to RGB
            frame = cv2.cvtColor(frame,cv2.COLOR_BGR2RGB)
            # Convert to a PIL Image
            frame = Image.fromarray(np.uint8(frame))
            # Run detection
            frame = np.array(yolo.detect_image(frame))
            # Convert RGB back to BGR, the order OpenCV expects for display
            frame = cv2.cvtColor(frame,cv2.COLOR_RGB2BGR)

            fps = ( fps + (1./(time.time()-t1)) ) / 2
            print("fps= %.2f"%(fps))
            frame = cv2.putText(frame, "fps= %.2f"%(fps), (0, 40), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)

            cv2.imshow("video",frame)
            c= cv2.waitKey(1) & 0xff
            if video_save_path != '':
                out.write(frame)

            if c==27:
                capture.release()
                break

        print("Video Detection Done!")
        capture.release()
        if video_save_path != '':
            print("Save processed video to the path :" + video_save_path)
            out.release()
        cv2.destroyAllWindows()

    elif mode == "fps":
        img = Image.open(fps_image_path)
        tact_time = yolo.get_FPS(img, test_interval)
        print(str(tact_time) + ' seconds, ' + str(1/tact_time) + 'FPS, @batch_size 1')

    elif mode == "dir_predict":
        import os

        from tqdm import tqdm

        img_names = os.listdir(dir_origin_path)
        for img_name in tqdm(img_names):
            if img_name.lower().endswith(('.bmp', '.dib', '.png', '.jpg', '.jpeg', '.pbm', '.pgm', '.ppm', '.tif', '.tiff')):
                image_path = os.path.join(dir_origin_path, img_name)
                image = Image.open(image_path)
                r_image = yolo.detect_image(image)
                if not os.path.exists(dir_save_path):
                    os.makedirs(dir_save_path)
                r_image.save(os.path.join(dir_save_path, img_name.replace(".jpg", ".png")), quality=95, subsampling=0)

    elif mode == "heatmap":
        while True:
            img = input('Input image filename:')
            try:
                image = Image.open(img)
            except Exception:
                print('Open Error! Try again!')
                continue
            else:
                yolo.detect_heatmap(image, heatmap_save_path)

    elif mode == "export_onnx":
        yolo.convert_to_onnx(simplify, onnx_save_path)

    else:
        raise AssertionError("Please specify the correct mode: 'predict', 'video', 'fps', 'heatmap', 'export_onnx', 'dir_predict'.")

================================================
FILE: requirements.txt
================================================
scipy
numpy
matplotlib==3.7.0
opencv_python
torch==1.8.1
torchvision==0.9.1
tqdm==4.60.0
Pillow==8.2.0
h5py==2.10.0
tensorboard
pyyaml==6.0
torchinfo
labelimg==1.8.6
streamlit==1.8.1
opencv-python-headless==4.5.2.52
streamlit<=1.11.*

================================================
FILE: summary.py
================================================
#--------------------------------------------#
#   This script prints the network structure
#--------------------------------------------#
import torch
from torchinfo import summary

from nets.yolo import YoloBody
from nets.yolo_tiny import YoloBodytiny

if __name__ == "__main__":
    # device selects whether the network runs on the GPU or the CPU
    device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
    m = YoloBody([[6, 7, 8], [3, 4, 5], [0, 1, 2]], 80).to(device)
    summary(m, input_size=(1,3, 416, 416))

    m = YoloBodytiny([[3, 4, 5], [1, 2, 3]], 80, phi = 1).to(device)
    summary(m, input_size=(1,3, 416, 416))

================================================
FILE: train.py
================================================
#-------------------------------------#
#   Train on the dataset
#-------------------------------------#
import os

import numpy as np
import torch
import torch.backends.cudnn as cudnn
import torch.distributed as dist
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader

from get_yaml import
get_config from nets.yolo import YoloBody from nets.yolo_tiny import YoloBodytiny from nets.yolotiny_training import YOLOLosstiny from nets.yolo_training import (YOLOLoss, get_lr_scheduler, set_optimizer_lr, weights_init) from utils.callbacks import LossHistory from utils.dataloader import YoloDataset, yolo_dataset_collate from utils.utils import get_anchors, get_classes from utils.utils_fit import fit_one_epoch from utils.utils import get_lr import warnings import yaml warnings.filterwarnings('ignore') import argparse ''' 训练自己的目标检测模型一定需要注意以下几点: 1、训练前仔细检查自己的格式是否满足要求,该库要求数据集格式为VOC格式,需要准备好的内容有输入图片和标签 输入图片为.jpg图片,无需固定大小,传入训练前会自动进行resize。 灰度图会自动转成RGB图片进行训练,无需自己修改。 输入图片如果后缀非jpg,需要自己批量转成jpg后再开始训练。 标签为.xml格式,文件中会有需要检测的目标信息,标签文件和输入图片文件相对应。 2、训练好的权值文件保存在logs文件夹中,每个epoch都会保存一次,如果只是训练了几个step是不会保存的,epoch和step的概念要捋清楚一下。 在训练过程中,该代码并没有设定只保存最低损失的,因此按默认参数训练完会有100个权值,如果空间不够可以自行删除。 这个并不是保存越少越好也不是保存越多越好,有人想要都保存、有人想只保存一点,为了满足大多数的需求,还是都保存可选择性高。 3、损失值的大小用于判断是否收敛,比较重要的是有收敛的趋势,即验证集损失不断下降,如果验证集损失基本上不改变的话,模型基本上就收敛了。 损失值的具体大小并没有什么意义,大和小只在于损失的计算方式,并不是接近于0才好。如果想要让损失好看点,可以直接到对应的损失函数里面除上10000。 训练过程中的损失值会保存在logs文件夹下的loss_%Y_%m_%d_%H_%M_%S文件夹中 4、调参是一门蛮重要的学问,没有什么参数是一定好的,现有的参数是我测试过可以正常训练的参数,因此我会建议用现有的参数。 但是参数本身并不是绝对的,比如随着batch的增大学习率也可以增大,效果也会好一些;过深的网络不要用太大的学习率等等。 这些都是经验上,只能靠各位同学多查询资料和自己试试了。 ''' if __name__ == "__main__": parser = argparse.ArgumentParser() parser.add_argument('--init',type=int,default=0,help='从init epoch开始训练') parser.add_argument('--epochs',type=int, default=100,help='epochs for training') parser.add_argument('--weights','-w',type=str,default='',help='initial weights path 初始权重的路径') parser.add_argument('--freeze','-f',action='store_true',help='表示是否冻结训练') parser.add_argument('--freeze-epochs','-fe',type=int,default=50,help='epochs for feeze 冻结训练的迭代次数') parser.add_argument('--freeze-size', '-fb',type=int, default=32, help='total batch size for Freezeing') parser.add_argument('--batch-size','-bs',type=int, default=10, help='total batch size for all GPUs') 
    parser.add_argument('--optimizer',type=str, choices=['sgd', 'adam', 'adamw'], default='adam', help='optimizer used for training')
    parser.add_argument('--workers',type=int, default=4, help='number of worker processes used to load data')
    parser.add_argument('--lr',type=float,default=0.02,help='initial learning rate')
    parser.add_argument('--tiny',action='store_true',help='use the yolov4-tiny model')
    parser.add_argument('--phi',type=int,default=0,help='type of attention mechanism used by yolov4-tiny')
    parser.add_argument('--weight-decay',type=float,default=0,help='weight decay; helps prevent overfitting')
    parser.add_argument('--momentum',type=float,default=0.937,help='momentum parameter of the optimizer')
    parser.add_argument('--save-period','-save',type=int,default=4,help='save the weights every this many epochs')
    parser.add_argument('--cuda',action='store_true',help='use the GPU')
    parser.add_argument('--shape',type=int,default=416,help='input image size; must be a multiple of 32')
    parser.add_argument('--fp16',action='store_true',help='use mixed-precision training')
    parser.add_argument('--mosaic',action='store_true',help='YOLOv4 trick: Mosaic data augmentation')
    parser.add_argument('--lr_decay_type',type=str,default='cos',choices=['cos','step'],help='learning rate decay schedule')
    parser.add_argument('--distributed',action='store_true',help='run on multiple GPUs')
    parser.add_argument('--local_rank',type=int)
    parser.add_argument('--resume','-r',action='store_true',help='resume training from the last checkpoint')
    opt = parser.parse_args()
    print(opt)

    # Configuration file
    config = get_config()

    # Resume from the last checkpoint
    resume = opt.resume
    if resume:
        try:
            if os.path.exists('logs/opt.yaml'):
                with open('logs/opt.yaml') as f:
                    opt = yaml.load(f,Loader =yaml.FullLoader)
                opt = argparse.Namespace(**opt)
                opt.weights = 'logs/last.pth'
                print("mode resume :",opt)
        except Exception as e:
            print(e)
            print("Resume failed; restarting training")

    #---------------------------------#
    #   Cuda    Whether to use CUDA
    #           Set to False if there is no GPU
    #---------------------------------#
    Cuda = opt.cuda
    #---------------------------------------------------------------------#
    #   distributed     Whether to run distributed on a single machine with multiple GPUs.
    #                   The terminal commands only support Ubuntu. CUDA_VISIBLE_DEVICES selects the GPUs on Ubuntu.
    #                   On Windows, DP mode is used by default with all GPUs; DDP is not supported.
    #   DP mode:
    #       Set                 distributed = False
    #       In a terminal run   CUDA_VISIBLE_DEVICES=0,1 python train.py
    #   DDP mode:
    #       Set                 distributed = True
    #       In a terminal run   CUDA_VISIBLE_DEVICES=0,1 python -m torch.distributed.launch --nproc_per_node=2 train.py
    #---------------------------------------------------------------------#
    distributed = opt.distributed
    #---------------------------------------------------------------------#
    #   sync_bn     Whether to use sync_bn; available for multi-GPU DDP mode
    #---------------------------------------------------------------------#
    sync_bn = False
    #---------------------------------------------------------------------#
    #   fp16        Whether to use mixed-precision training
    #               Cuts GPU memory use roughly in half; requires pytorch 1.7.1 or later
    #---------------------------------------------------------------------#
    fp16 = opt.fp16
    #---------------------------------------------------------------------#
    #   anchors_path    The txt file holding the anchor boxes; usually left unchanged.
    #   anchors_mask    Helps the code find the matching anchor boxes; usually left unchanged.
    #---------------------------------------------------------------------#
    anchors_path = 'model_data/yolo_anchors.txt'
    if not opt.tiny:
        anchors_mask = [[6, 7, 8], [3, 4, 5], [0, 1, 2]]
    elif opt.tiny:
        anchors_mask = [[3,4,5], [1,2,3]]
        anchors_path = 'model_data/yolotiny_anchors.txt'
    #----------------------------------------------------------------------------------------------------------------------------#
    #   See the README for downloading the weight files; they can be downloaded from a network drive.
    #   The model's pretrained weights are general across datasets, because the features are general.
    #   The most important part of the pretrained weights is the backbone feature-extraction network.
    #   Pretrained weights are needed in 99% of cases; without them the backbone weights are too random,
    #   feature extraction is poor, and the training result will not be good.
    #
    #   If training was interrupted, set model_path to a weight file in the logs folder to reload the partially trained weights.
    #   Also adjust the frozen-stage or unfrozen-stage parameters below to keep the epoch count continuous.
    #
    #   When model_path = '', the weights of the whole model are not loaded.
    #
    #   The weights used here are those of the whole model, so they are loaded in train.py; pretrained below does not affect this loading.
    #   To start from pretrained backbone weights, set model_path = '' and pretrained = True below; only the backbone is loaded.
    #   To train from scratch, set model_path = '', pretrained = False and Freeze_Train = False below;
    #   training then starts from scratch with no frozen-backbone stage.
    #
    #   In general, training from scratch gives poor results because the weights are too random and feature extraction is poor,
    #   so training from scratch is very, very, very strongly discouraged!
# 从0开始训练有两个方案: # 1、得益于Mosaic数据增强方法强大的数据增强能力,将UnFreeze_Epoch设置的较大(300及以上)、batch较大(16及以上)、数据较多(万以上)的情况下, # 可以设置mosaic=True,直接随机初始化参数开始训练,但得到的效果仍然不如有预训练的情况。(像COCO这样的大数据集可以这样做) # 2、了解imagenet数据集,首先训练分类模型,获得网络的主干部分权值,分类模型的 主干部分 和该模型通用,基于此进行训练。 #----------------------------------------------------------------------------------------------------------------------------# model_path = opt.weights #------------------------------------------------------# # input_shape 输入的shape大小,一定要是32的倍数 #------------------------------------------------------# # input_shape = [416, 416] input_shape = [opt.shape,opt.shape] #----------------------------------------------------------------------------------------------------------------------------# # pretrained 是否使用主干网络的预训练权重,此处使用的是主干的权重,因此是在模型构建的时候进行加载的。 # 如果设置了model_path,则主干的权值无需加载,pretrained的值无意义。 # 如果不设置model_path,pretrained = True,此时仅加载主干开始训练。 # 如果不设置model_path,pretrained = False,Freeze_Train = Fasle,此时从0开始训练,且没有冻结主干的过程。 #----------------------------------------------------------------------------------------------------------------------------# pretrained = False #------------------------------------------------------# # Yolov4的tricks应用 # mosaic 马赛克数据增强 # 参考YoloX,由于Mosaic生成的训练图片, # 远远脱离自然图片的真实分布。 # 本代码会在训练结束前的N个epoch自动关掉Mosaic # 100个世代会关闭30个世代(比例可在dataloader.py调整) # label_smoothing 标签平滑。一般0.01以下。如0.01、0.005 # # 余弦退火算法的参数放到下面的lr_decay_type中设置 #------------------------------------------------------# mosaic = opt.mosaic label_smoothing = 0 #----------------------------------------------------------------------------------------------------------------------------# # 训练分为两个阶段,分别是冻结阶段和解冻阶段。设置冻结阶段是为了满足机器性能不足的同学的训练需求。 # 冻结训练需要的显存较小,显卡非常差的情况下,可设置Freeze_Epoch等于UnFreeze_Epoch,此时仅仅进行冻结训练。 # # 在此提供若干参数设置建议,各位训练者根据自己的需求进行灵活调整: # (一)从整个模型的预训练权重开始训练: # Adam: # Init_Epoch = 0,Freeze_Epoch = 50,UnFreeze_Epoch = 100,Freeze_Train = True,optimizer_type = 'adam',Init_lr = 1e-3,weight_decay = 0。(冻结) # Init_Epoch = 0,UnFreeze_Epoch = 100,Freeze_Train = 
False,optimizer_type = 'adam',Init_lr = 1e-3,weight_decay = 0。(不冻结) # SGD: # Init_Epoch = 0,Freeze_Epoch = 50,UnFreeze_Epoch = 100,Freeze_Train = True,optimizer_type = 'sgd',Init_lr = 1e-2,weight_decay = 5e-4。(冻结) # Init_Epoch = 0,UnFreeze_Epoch = 100,Freeze_Train = False,optimizer_type = 'sgd',Init_lr = 1e-2,weight_decay = 5e-4。(不冻结) # 其中:UnFreeze_Epoch可以在100-300之间调整。 # (二)从主干网络的预训练权重开始训练: # Adam: # Init_Epoch = 0,Freeze_Epoch = 50,UnFreeze_Epoch = 100,Freeze_Train = True,optimizer_type = 'adam',Init_lr = 1e-3,weight_decay = 0。(冻结) # Init_Epoch = 0,UnFreeze_Epoch = 100,Freeze_Train = False,optimizer_type = 'adam',Init_lr = 1e-3,weight_decay = 0。(不冻结) # SGD: # Init_Epoch = 0,Freeze_Epoch = 50,UnFreeze_Epoch = 300,Freeze_Train = True,optimizer_type = 'sgd',Init_lr = 1e-2,weight_decay = 5e-4。(冻结) # Init_Epoch = 0,UnFreeze_Epoch = 300,Freeze_Train = False,optimizer_type = 'sgd',Init_lr = 1e-2,weight_decay = 5e-4。(不冻结) # 其中:由于从主干网络的预训练权重开始训练,主干的权值不一定适合目标检测,需要更多的训练跳出局部最优解。 # UnFreeze_Epoch可以在150-300之间调整,YOLOV5和YOLOX均推荐使用300。 # Adam相较于SGD收敛的快一些。因此UnFreeze_Epoch理论上可以小一点,但依然推荐更多的Epoch。 # (三)从0开始训练: # Init_Epoch = 0,UnFreeze_Epoch >= 300,Unfreeze_batch_size >= 16,Freeze_Train = False(不冻结训练) # 其中:UnFreeze_Epoch尽量不小于300。optimizer_type = 'sgd',Init_lr = 1e-2,mosaic = True。 # (四)batch_size的设置: # 在显卡能够接受的范围内,以大为好。显存不足与数据集大小无关,提示显存不足(OOM或者CUDA out of memory)请调小batch_size。 # 受到BatchNorm层影响,batch_size最小为2,不能为1。 # 正常情况下Freeze_batch_size建议为Unfreeze_batch_size的1-2倍。不建议设置的差距过大,因为关系到学习率的自动调整。 #----------------------------------------------------------------------------------------------------------------------------# #------------------------------------------------------------------# # 冻结阶段训练参数 # 此时模型的主干被冻结了,特征提取网络不发生改变 # 占用的显存较小,仅对网络进行微调 # Init_Epoch 模型当前开始的训练世代,其值可以大于Freeze_Epoch,如设置: # Init_Epoch = 60、Freeze_Epoch = 50、UnFreeze_Epoch = 100 # 会跳过冻结阶段,直接从60代开始,并调整对应的学习率。 # (断点续练时使用) # Freeze_Epoch 模型冻结训练的Freeze_Epoch # (当Freeze_Train=False时失效) # Freeze_batch_size 模型冻结训练的batch_size # 
(当Freeze_Train=False时失效) #------------------------------------------------------------------# Init_Epoch = opt.init Freeze_Epoch = opt.freeze_epochs Freeze_batch_size = opt.freeze_size #------------------------------------------------------------------# # 解冻阶段训练参数 # 此时模型的主干不被冻结了,特征提取网络会发生改变 # 占用的显存较大,网络所有的参数都会发生改变 # UnFreeze_Epoch 模型总共训练的epoch # Unfreeze_batch_size 模型在解冻后的batch_size #------------------------------------------------------------------# UnFreeze_Epoch = opt.epochs Unfreeze_batch_size = opt.batch_size #------------------------------------------------------------------# # Freeze_Train 是否进行冻结训练 # 默认先冻结主干训练后解冻训练。 #------------------------------------------------------------------# Freeze_Train = opt.freeze #------------------------------------------------------------------# # 其它训练参数:学习率、优化器、学习率下降有关 #------------------------------------------------------------------# #------------------------------------------------------------------# # Init_lr 模型的最大学习率 # Min_lr 模型的最小学习率,默认为最大学习率的0.01 #------------------------------------------------------------------# Init_lr = opt.lr Min_lr = Init_lr * 0.01 #------------------------------------------------------------------# # optimizer_type 使用到的优化器种类,可选的有adam、sgd # 当使用Adam优化器时建议设置 Init_lr=1e-3 # 当使用SGD优化器时建议设置 Init_lr=1e-2 # momentum 优化器内部使用到的momentum参数 # weight_decay 权值衰减,可防止过拟合 # adam会导致weight_decay错误,使用adam时建议设置为0。 #------------------------------------------------------------------# optimizer_type = opt.optimizer momentum = opt.momentum weight_decay = opt.weight_decay #------------------------------------------------------------------# # lr_decay_type 使用到的学习率下降方式,可选的有step、cos #------------------------------------------------------------------# lr_decay_type = opt.lr_decay_type #------------------------------------------------------------------# # focal_loss 是否使用Focal Loss平衡正负样本 # focal_alpha Focal Loss的正负样本平衡参数 # focal_gamma Focal Loss的难易分类样本平衡参数 #------------------------------------------------------------------# 
focal_loss = False focal_alpha = 0.25 focal_gamma = 2 #------------------------------------------------------------------# # save_period 多少个epoch保存一次权值,默认每个世代都保存 #------------------------------------------------------------------# save_period = opt.save_period #------------------------------------------------------------------# # save_dir 权值与日志文件保存的文件夹 #------------------------------------------------------------------# save_dir = 'logs' #------------------------------------------------------------------# # num_workers 用于设置是否使用多线程读取数据 # 开启后会加快数据读取速度,但是会占用更多内存 # 内存较小的电脑可以设置为2或者0 #------------------------------------------------------------------# num_workers = opt.workers #------------------------------------------------------# # train_annotation_path 训练图片路径和标签 # val_annotation_path 验证图片路径和标签 #------------------------------------------------------# train_annotation_path = '2007_train.txt' val_annotation_path = '2007_val.txt' #------------------------------------------------------# # 设置用到的显卡 #------------------------------------------------------# ngpus_per_node = torch.cuda.device_count() if distributed: dist.init_process_group(backend="nccl") local_rank = int(os.environ["LOCAL_RANK"]) rank = int(os.environ["RANK"]) device = torch.device("cuda", local_rank) if local_rank == 0: print(f"[{os.getpid()}] (rank = {rank}, local_rank = {local_rank}) training...") print("Gpu Device Count : ", ngpus_per_node) else: device = torch.device('cuda' if torch.cuda.is_available() else 'cpu') local_rank = 0 #------------------------------------------------------# # 获取classes和anchor #------------------------------------------------------# class_names, num_classes = config['classes'], config['nc'] anchors, num_anchors = get_anchors(anchors_path) #------------------------------------------------------# # 创建yolo模型 #------------------------------------------------------# if not opt.tiny: model = YoloBody(anchors_mask, num_classes, pretrained = pretrained) elif opt.tiny: model = 
YoloBodytiny(anchors_mask, num_classes, pretrained = pretrained, phi = opt.phi)
    if not pretrained:
        weights_init(model)
    if model_path != "":
        if local_rank == 0:
            #------------------------------------------------------#
            #   See the README for the weight files (Baidu Netdisk download)
            #------------------------------------------------------#
            print('Load weights {}.'.format(model_path))
        model_dict      = model.state_dict()
        pretrained_dict = torch.load(model_path, map_location = device)
        # Keep only entries that exist in the model with a matching shape;
        # the membership test avoids a KeyError when the checkpoint has extra keys.
        pretrained_dict = {k: v for k, v in pretrained_dict.items() if k in model_dict and np.shape(model_dict[k]) == np.shape(v)}
        model_dict.update(pretrained_dict)
        model.load_state_dict(model_dict)

    if not opt.tiny:
        yolo_loss = YOLOLoss(anchors, num_classes, input_shape, Cuda, anchors_mask, label_smoothing, focal_loss, focal_alpha, focal_gamma)
    elif opt.tiny:
        yolo_loss = YOLOLosstiny(anchors, num_classes, input_shape, Cuda, anchors_mask, label_smoothing)
    if local_rank == 0:
        loss_history = LossHistory(save_dir, model, input_shape=input_shape)
    else:
        loss_history = None

    if fp16:
        #------------------------------------------------------------------#
        #   torch 1.2 does not support amp; use torch 1.7.1 or later for fp16.
        #   This is why torch 1.2 reports "could not be resolve" here.
        #------------------------------------------------------------------#
        from torch.cuda.amp import GradScaler as GradScaler
        scaler = GradScaler()
    else:
        scaler = None

    model_train = model.train()
    #----------------------------#
    #   Multi-GPU synchronized BN
    #----------------------------#
    if sync_bn and ngpus_per_node > 1 and distributed:
        model_train = torch.nn.SyncBatchNorm.convert_sync_batchnorm(model_train)
    elif sync_bn:
        print("Sync_bn is not supported on a single GPU or without distributed mode.")

    if Cuda:
        if distributed:
            #----------------------------#
            #   Run in parallel on multiple GPUs
            #----------------------------#
            model_train = model_train.cuda(local_rank)
            model_train = torch.nn.parallel.DistributedDataParallel(model_train, device_ids=[local_rank], find_unused_parameters=True)
        else:
            print("Gpu Device Count : ", ngpus_per_node)
            devices = [i for i in range(ngpus_per_node)]
            model_train =
torch.nn.DataParallel(model, device_ids=devices) cudnn.benchmark = True model_train = model_train.cuda() #---------------------------# # 读取数据集对应的txt #---------------------------# with open(train_annotation_path, encoding='utf-8') as f: train_lines = f.readlines() with open(val_annotation_path, encoding='utf-8') as f: val_lines = f.readlines() num_train = len(train_lines) num_val = len(val_lines) #------------------------------------------------------# # 主干特征提取网络特征通用,冻结训练可以加快训练速度 # 也可以在训练初期防止权值被破坏。 # Init_Epoch为起始世代 # Freeze_Epoch为冻结训练的世代 # UnFreeze_Epoch总训练世代 # 提示OOM或者显存不足请调小Batch_size #------------------------------------------------------# if True: UnFreeze_flag = False #------------------------------------# # 冻结一定部分训练 #------------------------------------# if Freeze_Train: for param in model.backbone.parameters(): param.requires_grad = False #-------------------------------------------------------------------# # 如果不冻结训练的话,直接设置batch_size为Unfreeze_batch_size #-------------------------------------------------------------------# batch_size = Freeze_batch_size if Freeze_Train else Unfreeze_batch_size #-------------------------------------------------------------------# # 判断当前batch_size,自适应调整学习率 #-------------------------------------------------------------------# nbs = 64 lr_limit_max = 1e-3 if optimizer_type in ['adam', 'adamw'] else 5e-2 lr_limit_min = 3e-4 if optimizer_type in ['adam', 'adamw'] else 5e-4 Init_lr_fit = min(max(batch_size / nbs * Init_lr, lr_limit_min), lr_limit_max) Min_lr_fit = min(max(batch_size / nbs * Min_lr, lr_limit_min * 1e-2), lr_limit_max * 1e-2) #---------------------------------------# # 根据optimizer_type选择优化器 #---------------------------------------# pg0, pg1, pg2 = [], [], [] for k, v in model.named_modules(): if hasattr(v, "bias") and isinstance(v.bias, nn.Parameter): pg2.append(v.bias) if isinstance(v, nn.BatchNorm2d) or "bn" in k: pg0.append(v.weight) elif hasattr(v, "weight") and isinstance(v.weight, nn.Parameter): 
pg1.append(v.weight) optimizer = { 'adam' : optim.Adam(pg0, Init_lr_fit, betas = (momentum, 0.999)), 'adamw' : optim.AdamW(pg0, Init_lr_fit, betas = (momentum, 0.999)), 'sgd' : optim.SGD(pg0, Init_lr_fit, momentum = momentum, nesterov=True) }[optimizer_type] optimizer.add_param_group({"params": pg1, "weight_decay": weight_decay}) optimizer.add_param_group({"params": pg2}) #---------------------------------------# # 获得学习率下降的公式 #---------------------------------------# lr_scheduler_func = get_lr_scheduler(lr_decay_type, Init_lr_fit, Min_lr_fit, UnFreeze_Epoch) #---------------------------------------# # 判断每一个世代的长度 #---------------------------------------# epoch_step = num_train // batch_size epoch_step_val = num_val // batch_size if epoch_step == 0 or epoch_step_val == 0: raise ValueError("数据集过小,无法继续进行训练,请扩充数据集。") #---------------------------------------# # 构建数据集加载器。 #---------------------------------------# train_dataset = YoloDataset(train_lines, input_shape, num_classes, epoch_length = UnFreeze_Epoch, mosaic=mosaic, train = True) val_dataset = YoloDataset(val_lines, input_shape, num_classes, epoch_length = UnFreeze_Epoch, mosaic=False, train = False) if distributed: train_sampler = torch.utils.data.distributed.DistributedSampler(train_dataset, shuffle=True,) val_sampler = torch.utils.data.distributed.DistributedSampler(val_dataset, shuffle=False,) batch_size = batch_size // ngpus_per_node shuffle = False else: train_sampler = None val_sampler = None shuffle = True gen = DataLoader(train_dataset, shuffle = shuffle, batch_size = batch_size, num_workers = num_workers, pin_memory=True, drop_last=True, collate_fn=yolo_dataset_collate, sampler=train_sampler) gen_val = DataLoader(val_dataset , shuffle = shuffle, batch_size = batch_size, num_workers = num_workers, pin_memory=True, drop_last=True, collate_fn=yolo_dataset_collate, sampler=val_sampler) #---------------------------------------# # 开始模型训练 #---------------------------------------# for epoch in range(Init_Epoch, 
UnFreeze_Epoch): #---------------------------------------# # 如果模型有冻结学习部分 # 则解冻,并设置参数 #---------------------------------------# if epoch >= Freeze_Epoch and not UnFreeze_flag and Freeze_Train: batch_size = Unfreeze_batch_size #-------------------------------------------------------------------# # 判断当前batch_size,自适应调整学习率 #-------------------------------------------------------------------# nbs = 64 lr_limit_max = 1e-3 if optimizer_type in ['adam', 'adamw'] else 5e-2 lr_limit_min = 3e-4 if optimizer_type in ['adam', 'adamw'] else 5e-4 Init_lr_fit = min(max(batch_size / nbs * Init_lr, lr_limit_min), lr_limit_max) Min_lr_fit = min(max(batch_size / nbs * Min_lr, lr_limit_min * 1e-2), lr_limit_max * 1e-2) #---------------------------------------# # 获得学习率下降的公式 #---------------------------------------# lr_scheduler_func = get_lr_scheduler(lr_decay_type, Init_lr_fit, Min_lr_fit, UnFreeze_Epoch) for param in model.backbone.parameters(): param.requires_grad = True epoch_step = num_train // batch_size epoch_step_val = num_val // batch_size if epoch_step == 0 or epoch_step_val == 0: raise ValueError("数据集过小,无法继续进行训练,请扩充数据集。") if distributed: batch_size = batch_size // ngpus_per_node gen = DataLoader(train_dataset, shuffle = shuffle, batch_size = batch_size, num_workers = num_workers, pin_memory=True, drop_last=True, collate_fn=yolo_dataset_collate, sampler=train_sampler) gen_val = DataLoader(val_dataset , shuffle = shuffle, batch_size = batch_size, num_workers = num_workers, pin_memory=True, drop_last=True, collate_fn=yolo_dataset_collate, sampler=val_sampler) UnFreeze_flag = True gen.dataset.epoch_now = epoch gen_val.dataset.epoch_now = epoch if distributed: train_sampler.set_epoch(epoch) set_optimizer_lr(optimizer, lr_scheduler_func, epoch) fit_one_epoch(model_train, model, yolo_loss, loss_history, optimizer, epoch, epoch_step, epoch_step_val, gen, gen_val, UnFreeze_Epoch, Cuda, fp16, scaler, save_period, save_dir, local_rank) # 记录 opt.lr = get_lr(optimizer) opt.init = epoch + 
1 with open('logs/opt.yaml','w') as f: yaml.dump(vars(opt),f,encoding='utf-8',allow_unicode=True) if local_rank == 0: loss_history.writer.close() ================================================ FILE: utils/__init__.py ================================================ # ================================================ FILE: utils/callbacks.py ================================================ import datetime import os import torch import matplotlib matplotlib.use('Agg') import scipy.signal from matplotlib import pyplot as plt from torch.utils.tensorboard import SummaryWriter class LossHistory(): def __init__(self, log_dir, model, input_shape): time_str = datetime.datetime.strftime(datetime.datetime.now(),'%Y_%m_%d_%H_%M_%S') self.log_dir = os.path.join(log_dir, "loss_" + str(time_str)) self.losses = [] self.val_loss = [] os.makedirs(self.log_dir) self.writer = SummaryWriter(self.log_dir) try: dummy_input = torch.randn(2, 3, input_shape[0], input_shape[1]) self.writer.add_graph(model, dummy_input) except: pass def append_loss(self, epoch, loss, val_loss): if not os.path.exists(self.log_dir): os.makedirs(self.log_dir) self.losses.append(loss) self.val_loss.append(val_loss) with open(os.path.join(self.log_dir, "epoch_loss.txt"), 'a') as f: f.write(str(loss)) f.write("\n") with open(os.path.join(self.log_dir, "epoch_val_loss.txt"), 'a') as f: f.write(str(val_loss)) f.write("\n") self.writer.add_scalar('loss', loss, epoch) self.writer.add_scalar('val_loss', val_loss, epoch) self.loss_plot() def loss_plot(self): iters = range(len(self.losses)) plt.figure() plt.plot(iters, self.losses, 'red', linewidth = 2, label='train loss') plt.plot(iters, self.val_loss, 'coral', linewidth = 2, label='val loss') try: if len(self.losses) < 25: num = 5 else: num = 15 plt.plot(iters, scipy.signal.savgol_filter(self.losses, num, 3), 'green', linestyle = '--', linewidth = 2, label='smooth train loss') plt.plot(iters, scipy.signal.savgol_filter(self.val_loss, num, 3), '#8B4513', linestyle = 
'--', linewidth = 2, label='smooth val loss') except: pass plt.grid(True) plt.xlabel('Epoch') plt.ylabel('Loss') plt.legend(loc="upper right") plt.savefig(os.path.join(self.log_dir, "epoch_loss.png")) plt.cla() plt.close("all") ================================================ FILE: utils/dataloader.py ================================================ from random import sample, shuffle import cv2 import numpy as np import torch from PIL import Image from torch.utils.data.dataset import Dataset from utils.utils import cvtColor, preprocess_input class YoloDataset(Dataset): def __init__(self, annotation_lines, input_shape, num_classes, epoch_length, mosaic, train, mosaic_ratio = 0.7): super(YoloDataset, self).__init__() self.annotation_lines = annotation_lines self.input_shape = input_shape self.num_classes = num_classes self.epoch_length = epoch_length self.mosaic = mosaic self.train = train self.mosaic_ratio = mosaic_ratio self.epoch_now = -1 self.length = len(self.annotation_lines) def __len__(self): return self.length def __getitem__(self, index): index = index % self.length #---------------------------------------------------# # 训练时进行数据的随机增强 # 验证时不进行数据的随机增强 #---------------------------------------------------# if self.mosaic: if self.rand() < 0.5 and self.epoch_now < self.epoch_length * self.mosaic_ratio: lines = sample(self.annotation_lines, 3) lines.append(self.annotation_lines[index]) shuffle(lines) image, box = self.get_random_data_with_Mosaic(lines, self.input_shape) else: image, box = self.get_random_data(self.annotation_lines[index], self.input_shape, random = self.train) else: image, box = self.get_random_data(self.annotation_lines[index], self.input_shape, random = self.train) image = np.transpose(preprocess_input(np.array(image, dtype=np.float32)), (2, 0, 1)) box = np.array(box, dtype=np.float32) if len(box) != 0: box[:, [0, 2]] = box[:, [0, 2]] / self.input_shape[1] box[:, [1, 3]] = box[:, [1, 3]] / self.input_shape[0] box[:, 2:4] = box[:, 2:4] - box[:, 
0:2] box[:, 0:2] = box[:, 0:2] + box[:, 2:4] / 2 return image, box def rand(self, a=0, b=1): return np.random.rand()*(b-a) + a def get_random_data(self, annotation_line, input_shape, jitter=.3, hue=.1, sat=0.7, val=0.4, random=True): line = annotation_line.split() #------------------------------# # 读取图像并转换成RGB图像 #------------------------------# image = Image.open(line[0]) image = cvtColor(image) #------------------------------# # 获得图像的高宽与目标高宽 #------------------------------# iw, ih = image.size h, w = input_shape #------------------------------# # 获得预测框 #------------------------------# box = np.array([np.array(list(map(int,box.split(',')))) for box in line[1:]]) if not random: scale = min(w/iw, h/ih) nw = int(iw*scale) nh = int(ih*scale) dx = (w-nw)//2 dy = (h-nh)//2 #---------------------------------# # 将图像多余的部分加上灰条 #---------------------------------# image = image.resize((nw,nh), Image.BICUBIC) new_image = Image.new('RGB', (w,h), (128,128,128)) new_image.paste(image, (dx, dy)) image_data = np.array(new_image, np.float32) #---------------------------------# # 对真实框进行调整 #---------------------------------# if len(box)>0: np.random.shuffle(box) box[:, [0,2]] = box[:, [0,2]]*nw/iw + dx box[:, [1,3]] = box[:, [1,3]]*nh/ih + dy box[:, 0:2][box[:, 0:2]<0] = 0 box[:, 2][box[:, 2]>w] = w box[:, 3][box[:, 3]>h] = h box_w = box[:, 2] - box[:, 0] box_h = box[:, 3] - box[:, 1] box = box[np.logical_and(box_w>1, box_h>1)] # discard invalid box return image_data, box #------------------------------------------# # 对图像进行缩放并且进行长和宽的扭曲 #------------------------------------------# new_ar = iw/ih * self.rand(1-jitter,1+jitter) / self.rand(1-jitter,1+jitter) scale = self.rand(.25, 2) if new_ar < 1: nh = int(scale*h) nw = int(nh*new_ar) else: nw = int(scale*w) nh = int(nw/new_ar) image = image.resize((nw,nh), Image.BICUBIC) #------------------------------------------# # 将图像多余的部分加上灰条 #------------------------------------------# dx = int(self.rand(0, w-nw)) dy = int(self.rand(0, h-nh)) 
new_image = Image.new('RGB', (w,h), (128,128,128)) new_image.paste(image, (dx, dy)) image = new_image #------------------------------------------# # 翻转图像 #------------------------------------------# flip = self.rand()<.5 if flip: image = image.transpose(Image.FLIP_LEFT_RIGHT) image_data = np.array(image, np.uint8) #---------------------------------# # 对图像进行色域变换 # 计算色域变换的参数 #---------------------------------# r = np.random.uniform(-1, 1, 3) * [hue, sat, val] + 1 #---------------------------------# # 将图像转到HSV上 #---------------------------------# hue, sat, val = cv2.split(cv2.cvtColor(image_data, cv2.COLOR_RGB2HSV)) dtype = image_data.dtype #---------------------------------# # 应用变换 #---------------------------------# x = np.arange(0, 256, dtype=r.dtype) lut_hue = ((x * r[0]) % 180).astype(dtype) lut_sat = np.clip(x * r[1], 0, 255).astype(dtype) lut_val = np.clip(x * r[2], 0, 255).astype(dtype) image_data = cv2.merge((cv2.LUT(hue, lut_hue), cv2.LUT(sat, lut_sat), cv2.LUT(val, lut_val))) image_data = cv2.cvtColor(image_data, cv2.COLOR_HSV2RGB) #---------------------------------# # 对真实框进行调整 #---------------------------------# if len(box)>0: np.random.shuffle(box) box[:, [0,2]] = box[:, [0,2]]*nw/iw + dx box[:, [1,3]] = box[:, [1,3]]*nh/ih + dy if flip: box[:, [0,2]] = w - box[:, [2,0]] box[:, 0:2][box[:, 0:2]<0] = 0 box[:, 2][box[:, 2]>w] = w box[:, 3][box[:, 3]>h] = h box_w = box[:, 2] - box[:, 0] box_h = box[:, 3] - box[:, 1] box = box[np.logical_and(box_w>1, box_h>1)] return image_data, box def merge_bboxes(self, bboxes, cutx, cuty): merge_bbox = [] for i in range(len(bboxes)): for box in bboxes[i]: tmp_box = [] x1, y1, x2, y2 = box[0], box[1], box[2], box[3] if i == 0: if y1 > cuty or x1 > cutx: continue if y2 >= cuty and y1 <= cuty: y2 = cuty if x2 >= cutx and x1 <= cutx: x2 = cutx if i == 1: if y2 < cuty or x1 > cutx: continue if y2 >= cuty and y1 <= cuty: y1 = cuty if x2 >= cutx and x1 <= cutx: x2 = cutx if i == 2: if y2 < cuty or x2 < cutx: continue if y2 >= 
cuty and y1 <= cuty: y1 = cuty if x2 >= cutx and x1 <= cutx: x1 = cutx if i == 3: if y1 > cuty or x2 < cutx: continue if y2 >= cuty and y1 <= cuty: y2 = cuty if x2 >= cutx and x1 <= cutx: x1 = cutx tmp_box.append(x1) tmp_box.append(y1) tmp_box.append(x2) tmp_box.append(y2) tmp_box.append(box[-1]) merge_bbox.append(tmp_box) return merge_bbox def get_random_data_with_Mosaic(self, annotation_line, input_shape, jitter=0.3, hue=.1, sat=0.7, val=0.4): h, w = input_shape min_offset_x = self.rand(0.3, 0.7) min_offset_y = self.rand(0.3, 0.7) image_datas = [] box_datas = [] index = 0 for line in annotation_line: #---------------------------------# # 每一行进行分割 #---------------------------------# line_content = line.split() #---------------------------------# # 打开图片 #---------------------------------# image = Image.open(line_content[0]) image = cvtColor(image) #---------------------------------# # 图片的大小 #---------------------------------# iw, ih = image.size #---------------------------------# # 保存框的位置 #---------------------------------# box = np.array([np.array(list(map(int,box.split(',')))) for box in line_content[1:]]) #---------------------------------# # 是否翻转图片 #---------------------------------# flip = self.rand()<.5 if flip and len(box)>0: image = image.transpose(Image.FLIP_LEFT_RIGHT) box[:, [0,2]] = iw - box[:, [2,0]] #------------------------------------------# # 对图像进行缩放并且进行长和宽的扭曲 #------------------------------------------# new_ar = iw/ih * self.rand(1-jitter,1+jitter) / self.rand(1-jitter,1+jitter) scale = self.rand(.4, 1) if new_ar < 1: nh = int(scale*h) nw = int(nh*new_ar) else: nw = int(scale*w) nh = int(nw/new_ar) image = image.resize((nw, nh), Image.BICUBIC) #-----------------------------------------------# # 将图片进行放置,分别对应四张分割图片的位置 #-----------------------------------------------# if index == 0: dx = int(w*min_offset_x) - nw dy = int(h*min_offset_y) - nh elif index == 1: dx = int(w*min_offset_x) - nw dy = int(h*min_offset_y) elif index == 2: dx = 
int(w*min_offset_x) dy = int(h*min_offset_y) elif index == 3: dx = int(w*min_offset_x) dy = int(h*min_offset_y) - nh new_image = Image.new('RGB', (w,h), (128,128,128)) new_image.paste(image, (dx, dy)) image_data = np.array(new_image) index = index + 1 box_data = [] #---------------------------------# # 对box进行重新处理 #---------------------------------# if len(box)>0: np.random.shuffle(box) box[:, [0,2]] = box[:, [0,2]]*nw/iw + dx box[:, [1,3]] = box[:, [1,3]]*nh/ih + dy box[:, 0:2][box[:, 0:2]<0] = 0 box[:, 2][box[:, 2]>w] = w box[:, 3][box[:, 3]>h] = h box_w = box[:, 2] - box[:, 0] box_h = box[:, 3] - box[:, 1] box = box[np.logical_and(box_w>1, box_h>1)] box_data = np.zeros((len(box),5)) box_data[:len(box)] = box image_datas.append(image_data) box_datas.append(box_data) #---------------------------------# # 将图片分割,放在一起 #---------------------------------# cutx = int(w * min_offset_x) cuty = int(h * min_offset_y) new_image = np.zeros([h, w, 3]) new_image[:cuty, :cutx, :] = image_datas[0][:cuty, :cutx, :] new_image[cuty:, :cutx, :] = image_datas[1][cuty:, :cutx, :] new_image[cuty:, cutx:, :] = image_datas[2][cuty:, cutx:, :] new_image[:cuty, cutx:, :] = image_datas[3][:cuty, cutx:, :] new_image = np.array(new_image, np.uint8) #---------------------------------# # 对图像进行色域变换 # 计算色域变换的参数 #---------------------------------# r = np.random.uniform(-1, 1, 3) * [hue, sat, val] + 1 #---------------------------------# # 将图像转到HSV上 #---------------------------------# hue, sat, val = cv2.split(cv2.cvtColor(new_image, cv2.COLOR_RGB2HSV)) dtype = new_image.dtype #---------------------------------# # 应用变换 #---------------------------------# x = np.arange(0, 256, dtype=r.dtype) lut_hue = ((x * r[0]) % 180).astype(dtype) lut_sat = np.clip(x * r[1], 0, 255).astype(dtype) lut_val = np.clip(x * r[2], 0, 255).astype(dtype) new_image = cv2.merge((cv2.LUT(hue, lut_hue), cv2.LUT(sat, lut_sat), cv2.LUT(val, lut_val))) new_image = cv2.cvtColor(new_image, cv2.COLOR_HSV2RGB) 

        #---------------------------------#
        #   Clip the boxes to their mosaic quadrants
        #---------------------------------#
        new_boxes = self.merge_bboxes(box_datas, cutx, cuty)

        return new_image, new_boxes

# collate_fn used by the DataLoader
def yolo_dataset_collate(batch):
    images = []
    bboxes = []
    for img, box in batch:
        images.append(img)
        bboxes.append(box)
    images = torch.from_numpy(np.array(images)).type(torch.FloatTensor)
    bboxes = [torch.from_numpy(ann).type(torch.FloatTensor) for ann in bboxes]
    return images, bboxes

================================================
FILE: utils/utils.py
================================================
import numpy as np
from PIL import Image

#---------------------------------------------------------#
#   Convert the image to RGB so that grayscale images do
#   not raise errors at prediction time. The code only
#   supports RGB prediction; every other image type is
#   converted to RGB.
#---------------------------------------------------------#
def cvtColor(image):
    if len(np.shape(image)) == 3 and np.shape(image)[2] == 3:
        return image
    else:
        image = image.convert('RGB')
        return image

#---------------------------------------------------#
#   Resize the input image
#---------------------------------------------------#
def resize_image(image, size, letterbox_image):
    iw, ih = image.size
    w, h = size
    if letterbox_image:
        # Keep the aspect ratio and pad the remainder with gray
        scale = min(w/iw, h/ih)
        nw = int(iw*scale)
        nh = int(ih*scale)

        image = image.resize((nw,nh), Image.BICUBIC)
        new_image = Image.new('RGB', size, (128,128,128))
        new_image.paste(image, ((w-nw)//2, (h-nh)//2))
    else:
        new_image = image.resize((w, h), Image.BICUBIC)
    return new_image

#---------------------------------------------------#
#   Load the class names
#---------------------------------------------------#
def get_classes(classes_path):
    with open(classes_path, encoding='utf-8') as f:
        class_names = f.readlines()
    class_names = [c.strip() for c in class_names]
    return class_names, len(class_names)

#---------------------------------------------------#
#   Load the anchor boxes
#---------------------------------------------------#
def get_anchors(anchors_path):
    '''loads the anchors from a file'''
    with open(anchors_path, encoding='utf-8') as f:
        anchors
= f.readline()
    anchors = [float(x) for x in anchors.split(',')]
    anchors = np.array(anchors).reshape(-1, 2)
    return anchors, len(anchors)

#---------------------------------------------------#
#   Get the current learning rate
#---------------------------------------------------#
def get_lr(optimizer):
    for param_group in optimizer.param_groups:
        return param_group['lr']

def preprocess_input(image):
    # In-place division: the caller must pass a float array
    image /= 255.0
    return image

================================================
FILE: utils/utils_bbox.py
================================================
import torch
import torch.nn as nn
from torchvision.ops import nms
import numpy as np

class DecodeBox():
    def __init__(self, anchors, num_classes, input_shape, anchors_mask = [[6,7,8], [3,4,5], [0,1,2]]):
        super(DecodeBox, self).__init__()
        self.anchors = anchors
        self.num_classes = num_classes
        self.bbox_attrs = 5 + num_classes
        self.input_shape = input_shape
        #-----------------------------------------------------------#
        #   The 13x13 feature map uses anchors [142, 110],[192, 243],[459, 401]
        #   The 26x26 feature map uses anchors [36, 75],[76, 55],[72, 146]
        #   The 52x52 feature map uses anchors [12, 16],[19, 36],[40, 28]
        #-----------------------------------------------------------#
        self.anchors_mask = anchors_mask

    def decode_box(self, inputs):
        outputs = []
        for i, input in enumerate(inputs):
            #-----------------------------------------------#
            #   There are three inputs. With 80 classes
            #   (3 * (5 + 80) = 255 channels) their shapes are
            #   batch_size, 255, 13, 13
            #   batch_size, 255, 26, 26
            #   batch_size, 255, 52, 52
            #-----------------------------------------------#
            batch_size = input.size(0)
            input_height = input.size(2)
            input_width = input.size(3)

            #-----------------------------------------------#
            #   For a 416x416 input,
            #   stride_h = stride_w = 32, 16, 8
            #-----------------------------------------------#
            stride_h = self.input_shape[0] / input_height
            stride_w = self.input_shape[1] / input_width
            #-------------------------------------------------#
            #   scaled_anchors are expressed in feature-map units
            #-------------------------------------------------#
            scaled_anchors = [(anchor_width / stride_w, anchor_height / stride_h)
for anchor_width, anchor_height in self.anchors[self.anchors_mask[i]]]

            #-----------------------------------------------#
            #   After reshaping, the three predictions have
            #   the shapes (with 80 classes, 5 + 80 = 85)
            #   batch_size, 3, 13, 13, 85
            #   batch_size, 3, 26, 26, 85
            #   batch_size, 3, 52, 52, 85
            #-----------------------------------------------#
            prediction = input.view(batch_size, len(self.anchors_mask[i]),
                                    self.bbox_attrs, input_height, input_width).permute(0, 1, 3, 4, 2).contiguous()

            #-----------------------------------------------#
            #   Centre offsets of the anchor box
            #-----------------------------------------------#
            x = torch.sigmoid(prediction[..., 0])
            y = torch.sigmoid(prediction[..., 1])
            #-----------------------------------------------#
            #   Width and height adjustments of the anchor box
            #-----------------------------------------------#
            w = prediction[..., 2]
            h = prediction[..., 3]
            #-----------------------------------------------#
            #   Objectness confidence (is there an object?)
            #-----------------------------------------------#
            conf = torch.sigmoid(prediction[..., 4])
            #-----------------------------------------------#
            #   Per-class confidence
            #-----------------------------------------------#
            pred_cls = torch.sigmoid(prediction[..., 5:])

            FloatTensor = torch.cuda.FloatTensor if x.is_cuda else torch.FloatTensor
            LongTensor = torch.cuda.LongTensor if x.is_cuda else torch.LongTensor

            #----------------------------------------------------------#
            #   Build the grid: each entry is the top-left corner
            #   of a cell, from which the box centre is offset
            #   batch_size,3,13,13
            #----------------------------------------------------------#
            grid_x = torch.linspace(0, input_width - 1, input_width).repeat(input_height, 1).repeat(
                batch_size * len(self.anchors_mask[i]), 1, 1).view(x.shape).type(FloatTensor)
            grid_y = torch.linspace(0, input_height - 1, input_height).repeat(input_width, 1).t().repeat(
                batch_size * len(self.anchors_mask[i]), 1, 1).view(y.shape).type(FloatTensor)

            #----------------------------------------------------------#
            #   Expand the anchor widths and heights to the grid layout
            #   batch_size,3,13,13
            #----------------------------------------------------------#
            anchor_w = FloatTensor(scaled_anchors).index_select(1, LongTensor([0]))
            anchor_h =
FloatTensor(scaled_anchors).index_select(1, LongTensor([1]))
            anchor_w = anchor_w.repeat(batch_size, 1).repeat(1, 1, input_height * input_width).view(w.shape)
            anchor_h = anchor_h.repeat(batch_size, 1).repeat(1, 1, input_height * input_width).view(h.shape)

            #----------------------------------------------------------#
            #   Adjust the anchors with the predictions.
            #   First shift the centre from the cell's top-left
            #   corner toward the bottom-right,
            #   then rescale the anchor width and height.
            #----------------------------------------------------------#
            pred_boxes = FloatTensor(prediction[..., :4].shape)
            pred_boxes[..., 0] = x.data + grid_x
            pred_boxes[..., 1] = y.data + grid_y
            pred_boxes[..., 2] = torch.exp(w.data) * anchor_w
            pred_boxes[..., 3] = torch.exp(h.data) * anchor_h

            #----------------------------------------------------------#
            #   Normalise the boxes to fractions of the feature-map size
            #----------------------------------------------------------#
            _scale = torch.Tensor([input_width, input_height, input_width, input_height]).type(FloatTensor)
            output = torch.cat((pred_boxes.view(batch_size, -1, 4) / _scale,
                                conf.view(batch_size, -1, 1), pred_cls.view(batch_size, -1, self.num_classes)), -1)
            outputs.append(output.data)
        return outputs

    def yolo_correct_boxes(self, box_xy, box_wh, input_shape, image_shape, letterbox_image):
        #-----------------------------------------------------------------#
        #   The y axis is put first so the boxes can be multiplied
        #   directly by the image (height, width)
        #-----------------------------------------------------------------#
        box_yx = box_xy[..., ::-1]
        box_hw = box_wh[..., ::-1]
        input_shape = np.array(input_shape)
        image_shape = np.array(image_shape)

        if letterbox_image:
            #-----------------------------------------------------------------#
            #   offset is the position of the valid image region relative to
            #   the top-left corner of the letterboxed input;
            #   new_shape is the scaled (height, width) of the image
            #-----------------------------------------------------------------#
            new_shape = np.round(image_shape * np.min(input_shape/image_shape))
            offset = (input_shape - new_shape)/2./input_shape
            scale = input_shape/new_shape

            box_yx = (box_yx - offset) * scale
            box_hw *= scale

        box_mins = box_yx - (box_hw / 2.)
        box_maxes = box_yx + (box_hw / 2.)
        boxes = np.concatenate([box_mins[..., 0:1], box_mins[..., 1:2], box_maxes[..., 0:1], box_maxes[..., 1:2]], axis=-1)
        boxes *= np.concatenate([image_shape, image_shape], axis=-1)
        return boxes

    def non_max_suppression(self, prediction, num_classes, input_shape, image_shape, letterbox_image, conf_thres=0.5, nms_thres=0.4):
        #----------------------------------------------------------#
        #   Convert the predictions from (centre, size) to
        #   (top-left, bottom-right) corner format.
        #   prediction [batch_size, num_anchors, 85]
        #----------------------------------------------------------#
        box_corner = prediction.new(prediction.shape)
        box_corner[:, :, 0] = prediction[:, :, 0] - prediction[:, :, 2] / 2
        box_corner[:, :, 1] = prediction[:, :, 1] - prediction[:, :, 3] / 2
        box_corner[:, :, 2] = prediction[:, :, 0] + prediction[:, :, 2] / 2
        box_corner[:, :, 3] = prediction[:, :, 1] + prediction[:, :, 3] / 2
        prediction[:, :, :4] = box_corner[:, :, :4]

        output = [None for _ in range(len(prediction))]
        for i, image_pred in enumerate(prediction):
            #----------------------------------------------------------#
            #   Take the max over the class predictions.
            #   class_conf [num_anchors, 1] class confidence
            #   class_pred [num_anchors, 1] class index
            #----------------------------------------------------------#
            class_conf, class_pred = torch.max(image_pred[:, 5:5 + num_classes], 1, keepdim=True)

            #----------------------------------------------------------#
            #   First filtering pass: objectness * class confidence
            #----------------------------------------------------------#
            conf_mask = (image_pred[:, 4] * class_conf[:, 0] >= conf_thres).squeeze()

            #----------------------------------------------------------#
            #   Keep only the predictions that pass the threshold
            #----------------------------------------------------------#
            image_pred = image_pred[conf_mask]
            class_conf = class_conf[conf_mask]
            class_pred = class_pred[conf_mask]
            if not image_pred.size(0):
                continue
            #-------------------------------------------------------------------------#
            #   detections [num_anchors, 7]
            #   The 7 columns are: x1, y1, x2, y2, obj_conf, class_conf, class_pred
            #-------------------------------------------------------------------------#
            detections =
torch.cat((image_pred[:, :5], class_conf.float(), class_pred.float()), 1)

            #------------------------------------------#
            #   Collect every class present in the predictions
            #------------------------------------------#
            unique_labels = detections[:, -1].cpu().unique()

            if prediction.is_cuda:
                unique_labels = unique_labels.cuda()
                detections = detections.cuda()

            for c in unique_labels:
                #------------------------------------------#
                #   All score-filtered predictions of one class
                #------------------------------------------#
                detections_class = detections[detections[:, -1] == c]

                #------------------------------------------#
                #   The built-in torchvision NMS is faster
                #------------------------------------------#
                keep = nms(
                    detections_class[:, :4],
                    detections_class[:, 4] * detections_class[:, 5],
                    nms_thres
                )
                max_detections = detections_class[keep]

                # # Sort by objectness * class confidence
                # _, conf_sort_index = torch.sort(detections_class[:, 4]*detections_class[:, 5], descending=True)
                # detections_class = detections_class[conf_sort_index]
                # # Run non-maximum suppression
                # max_detections = []
                # while detections_class.size(0):
                #     # Take the most confident box of this class, then drop every remaining box whose overlap with it exceeds nms_thres
                #     max_detections.append(detections_class[0].unsqueeze(0))
                #     if len(detections_class) == 1:
                #         break
                #     ious = bbox_iou(max_detections[-1], detections_class[1:])
                #     detections_class = detections_class[1:][ious < nms_thres]
                # # Stack the results
                # max_detections = torch.cat(max_detections).data

                # Add max detections to outputs
                output[i] = max_detections if output[i] is None else torch.cat((output[i], max_detections))

            if output[i] is not None:
                output[i] = output[i].cpu().numpy()
                box_xy, box_wh = (output[i][:, 0:2] + output[i][:, 2:4])/2, output[i][:, 2:4] - output[i][:, 0:2]
                output[i][:, :4] = self.yolo_correct_boxes(box_xy, box_wh, input_shape, image_shape, letterbox_image)
        return output

================================================
FILE: utils/utils_fit.py
================================================
import os

import torch
from tqdm import tqdm

from utils.utils import get_lr

def fit_one_epoch(model_train, model, yolo_loss, loss_history, optimizer,
epoch, epoch_step, epoch_step_val, gen, gen_val, Epoch, cuda, fp16, scaler, save_period, save_dir, local_rank=0):
    loss = 0
    val_loss = 0

    if local_rank == 0:
        print('Start Train')
        pbar = tqdm(total=epoch_step,desc=f'Epoch {epoch + 1}/{Epoch}',postfix=dict,mininterval=0.3)
    model_train.train()
    for iteration, batch in enumerate(gen):
        if iteration >= epoch_step:
            break

        images, targets = batch[0], batch[1]
        with torch.no_grad():
            if cuda:
                images = images.cuda()
                targets = [ann.cuda() for ann in targets]
        #----------------------#
        #   Zero the gradients
        #----------------------#
        optimizer.zero_grad()
        if not fp16:
            #----------------------#
            #   Forward pass
            #----------------------#
            outputs = model_train(images)

            loss_value_all = 0
            #----------------------#
            #   Compute the loss
            #----------------------#
            for l in range(len(outputs)):
                loss_item = yolo_loss(l, outputs[l], targets)
                loss_value_all += loss_item
            loss_value = loss_value_all

            #----------------------#
            #   Backward pass
            #----------------------#
            loss_value.backward()
            optimizer.step()
        else:
            from torch.cuda.amp import autocast
            with autocast():
                #----------------------#
                #   Forward pass
                #----------------------#
                outputs = model_train(images)

                loss_value_all = 0
                #----------------------#
                #   Compute the loss
                #----------------------#
                for l in range(len(outputs)):
                    loss_item = yolo_loss(l, outputs[l], targets)
                    loss_value_all += loss_item
                loss_value = loss_value_all

            #----------------------#
            #   Backward pass
            #----------------------#
            scaler.scale(loss_value).backward()
            scaler.step(optimizer)
            scaler.update()

        loss += loss_value.item()

        if local_rank == 0:
            pbar.set_postfix(**{'loss' : loss / (iteration + 1),
                                'lr' : get_lr(optimizer)})
            pbar.update(1)

    if local_rank == 0:
        pbar.close()
        print('Finish Train')
        print('Start Validation')
        pbar = tqdm(total=epoch_step_val, desc=f'Epoch {epoch + 1}/{Epoch}',postfix=dict,mininterval=0.3)

    model_train.eval()
    for iteration, batch in enumerate(gen_val):
        if iteration >= epoch_step_val:
            break
        images, targets = batch[0], batch[1]
        with torch.no_grad():
            if cuda:
                images = images.cuda()
                targets = [ann.cuda()
for ann in targets]
            #----------------------#
            #   Zero the gradients
            #----------------------#
            optimizer.zero_grad()
            #----------------------#
            #   Forward pass
            #----------------------#
            outputs = model_train(images)

            loss_value_all = 0
            #----------------------#
            #   Compute the loss
            #----------------------#
            for l in range(len(outputs)):
                loss_item = yolo_loss(l, outputs[l], targets)
                loss_value_all += loss_item
            loss_value = loss_value_all

        val_loss += loss_value.item()
        if local_rank == 0:
            pbar.set_postfix(**{'val_loss': val_loss / (iteration + 1)})
            pbar.update(1)

    if local_rank == 0:
        pbar.close()
        print('Finish Validation')
        loss_history.append_loss(epoch + 1, loss / epoch_step, val_loss / epoch_step_val)
        print('Epoch:'+ str(epoch + 1) + '/' + str(Epoch))
        print('Total Loss: %.3f || Val Loss: %.3f ' % (loss / epoch_step, val_loss / epoch_step_val))
        if (epoch + 1) % save_period == 0 or epoch + 1 == Epoch:
            torch.save(model.state_dict(), os.path.join(save_dir, "ep%03d-loss%.3f-val_loss%.3f.pth" % (epoch + 1, loss / epoch_step, val_loss / epoch_step_val)))
        # Always save the latest weights
        torch.save(model.state_dict(), os.path.join(save_dir, "last.pth" ))

================================================
FILE: utils/utils_map.py
================================================
import glob
import json
import math
import operator
import os
import shutil
import sys

import cv2
import matplotlib.pyplot as plt
import numpy as np

'''
    0,0 ------> x (width)
     |
     |  (Left,Top)
     |      *_________
     |      |         |
            |         |
     y      |_________|
  (height)            *
                (Right,Bottom)
'''

def log_average_miss_rate(precision, fp_cumsum, num_images):
    """
        log-average miss rate:
            Calculated by averaging miss rates at 9 evenly spaced FPPI points
            between 1e-2 and 1e0, in log-space.

        output:
                lamr | log-average miss rate
                mr | miss rate
                fppi | false positives per image

        references:
            [1] Dollar, Piotr, et al. "Pedestrian Detection: An Evaluation of the
               State of the Art." Pattern Analysis and Machine Intelligence, IEEE
               Transactions on 34.4 (2012): 743 - 761.
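
        example:
            The definition above can be sketched in isolation. This is a minimal,
            illustrative version, not the function used by the evaluation code:
            `lamr_sketch` and its two arguments are hypothetical names, and it
            assumes the miss rates and FPPI values are already given per detection,
            ordered by decreasing confidence.

```python
import math

import numpy as np


def lamr_sketch(miss_rate, fppi):
    # Hypothetical helper: both arguments hold one value per detection,
    # ordered by decreasing confidence (so fppi is non-decreasing).
    miss_rate = np.asarray(miss_rate, dtype=float)
    fppi = np.asarray(fppi, dtype=float)
    # Sentinel entries guarantee that every reference point finds a match.
    fppi_tmp = np.insert(fppi, 0, -1.0)
    mr_tmp = np.insert(miss_rate, 0, 1.0)
    # Nine reference points from 1e-2 to 1e0, evenly spaced in log-space.
    ref = np.logspace(-2.0, 0.0, num=9)
    sampled = np.empty_like(ref)
    for i, ref_i in enumerate(ref):
        # Last detection whose FPPI does not exceed the reference point.
        j = np.where(fppi_tmp <= ref_i)[0][-1]
        sampled[i] = mr_tmp[j]
    # Geometric mean of the sampled miss rates; the floor avoids log(0).
    return math.exp(np.mean(np.log(np.maximum(1e-10, sampled))))
```

            With a constant miss rate of 0.25 reached at zero false positives,
            lamr_sketch([0.25], [0.0]) evaluates to approximately 0.25, because
            all nine reference points sample the same miss rate.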
""" if precision.size == 0: lamr = 0 mr = 1 fppi = 0 return lamr, mr, fppi fppi = fp_cumsum / float(num_images) mr = (1 - precision) fppi_tmp = np.insert(fppi, 0, -1.0) mr_tmp = np.insert(mr, 0, 1.0) ref = np.logspace(-2.0, 0.0, num = 9) for i, ref_i in enumerate(ref): j = np.where(fppi_tmp <= ref_i)[-1][-1] ref[i] = mr_tmp[j] lamr = math.exp(np.mean(np.log(np.maximum(1e-10, ref)))) return lamr, mr, fppi """ throw error and exit """ def error(msg): print(msg) sys.exit(0) """ check if the number is a float between 0.0 and 1.0 """ def is_float_between_0_and_1(value): try: val = float(value) if val > 0.0 and val < 1.0: return True else: return False except ValueError: return False """ Calculate the AP given the recall and precision array 1st) We compute a version of the measured precision/recall curve with precision monotonically decreasing 2nd) We compute the AP as the area under this curve by numerical integration. """ def voc_ap(rec, prec): """ --- Official matlab code VOC2012--- mrec=[0 ; rec ; 1]; mpre=[0 ; prec ; 0]; for i=numel(mpre)-1:-1:1 mpre(i)=max(mpre(i),mpre(i+1)); end i=find(mrec(2:end)~=mrec(1:end-1))+1; ap=sum((mrec(i)-mrec(i-1)).*mpre(i)); """ rec.insert(0, 0.0) # insert 0.0 at begining of list rec.append(1.0) # insert 1.0 at end of list mrec = rec[:] prec.insert(0, 0.0) # insert 0.0 at begining of list prec.append(0.0) # insert 0.0 at end of list mpre = prec[:] """ This part makes the precision monotonically decreasing (goes from the end to the beginning) matlab: for i=numel(mpre)-1:-1:1 mpre(i)=max(mpre(i),mpre(i+1)); """ for i in range(len(mpre)-2, -1, -1): mpre[i] = max(mpre[i], mpre[i+1]) """ This part creates a list of indexes where the recall changes matlab: i=find(mrec(2:end)~=mrec(1:end-1))+1; """ i_list = [] for i in range(1, len(mrec)): if mrec[i] != mrec[i-1]: i_list.append(i) # if it was matlab would be i + 1 """ The Average Precision (AP) is the area under the curve (numerical integration) matlab: ap=sum((mrec(i)-mrec(i-1)).*mpre(i)); 
""" ap = 0.0 for i in i_list: ap += ((mrec[i]-mrec[i-1])*mpre[i]) return ap, mrec, mpre """ Convert the lines of a file to a list """ def file_lines_to_list(path): # open txt file lines to a list with open(path) as f: content = f.readlines() # remove whitespace characters like `\n` at the end of each line content = [x.strip() for x in content] return content """ Draws text in image """ def draw_text_in_image(img, text, pos, color, line_width): font = cv2.FONT_HERSHEY_PLAIN fontScale = 1 lineType = 1 bottomLeftCornerOfText = pos cv2.putText(img, text, bottomLeftCornerOfText, font, fontScale, color, lineType) text_width, _ = cv2.getTextSize(text, font, fontScale, lineType)[0] return img, (line_width + text_width) """ Plot - adjust axes """ def adjust_axes(r, t, fig, axes): # get text width for re-scaling bb = t.get_window_extent(renderer=r) text_width_inches = bb.width / fig.dpi # get axis width in inches current_fig_width = fig.get_figwidth() new_fig_width = current_fig_width + text_width_inches propotion = new_fig_width / current_fig_width # get axis limit x_lim = axes.get_xlim() axes.set_xlim([x_lim[0], x_lim[1]*propotion]) """ Draw plot using Matplotlib """ def draw_plot_func(dictionary, n_classes, window_title, plot_title, x_label, output_path, to_show, plot_color, true_p_bar): # sort the dictionary by decreasing value, into a list of tuples sorted_dic_by_value = sorted(dictionary.items(), key=operator.itemgetter(1)) # unpacking the list of tuples into two lists sorted_keys, sorted_values = zip(*sorted_dic_by_value) # if true_p_bar != "": """ Special case to draw in: - green -> TP: True Positives (object detected and matches ground-truth) - red -> FP: False Positives (object detected but does not match ground-truth) - orange -> FN: False Negatives (object not detected but present in the ground-truth) """ fp_sorted = [] tp_sorted = [] for key in sorted_keys: fp_sorted.append(dictionary[key] - true_p_bar[key]) tp_sorted.append(true_p_bar[key]) 
        plt.barh(range(n_classes), fp_sorted, align='center', color='crimson', label='False Positive')
        plt.barh(range(n_classes), tp_sorted, align='center', color='forestgreen', label='True Positive', left=fp_sorted)
        # add legend
        plt.legend(loc='lower right')
        """
         Write number on side of bar
        """
        fig = plt.gcf() # gcf - get current figure
        axes = plt.gca()
        r = fig.canvas.get_renderer()
        for i, val in enumerate(sorted_values):
            fp_val = fp_sorted[i]
            tp_val = tp_sorted[i]
            fp_str_val = " " + str(fp_val)
            tp_str_val = fp_str_val + " " + str(tp_val)
            # trick to paint multicolor with offset:
            # first paint everything and then repaint the first number
            t = plt.text(val, i, tp_str_val, color='forestgreen', va='center', fontweight='bold')
            plt.text(val, i, fp_str_val, color='crimson', va='center', fontweight='bold')
            if i == (len(sorted_values)-1): # largest bar
                adjust_axes(r, t, fig, axes)
    else:
        plt.barh(range(n_classes), sorted_values, color=plot_color)
        """
         Write number on side of bar
        """
        fig = plt.gcf() # gcf - get current figure
        axes = plt.gca()
        r = fig.canvas.get_renderer()
        for i, val in enumerate(sorted_values):
            str_val = " " + str(val) # add a space before
            if val < 1.0:
                str_val = " {0:.2f}".format(val)
            t = plt.text(val, i, str_val, color=plot_color, va='center', fontweight='bold')
            # re-set axes to show number inside the figure
            if i == (len(sorted_values)-1): # largest bar
                adjust_axes(r, t, fig, axes)
    # set window title
    fig.canvas.manager.set_window_title(window_title)
    # write classes in y axis
    tick_font_size = 12
    plt.yticks(range(n_classes), sorted_keys, fontsize=tick_font_size)
    """
     Re-scale height accordingly
    """
    init_height = fig.get_figheight()
    # compute the matrix height in points and inches
    dpi = fig.dpi
    height_pt = n_classes * (tick_font_size * 1.4) # 1.4 (some spacing)
    height_in = height_pt / dpi
    # compute the required figure height
    top_margin = 0.15 # as a fraction of the figure height
    bottom_margin = 0.05 # as a fraction of the figure height
    figure_height = height_in / (1 -
top_margin - bottom_margin)
    # set new height
    if figure_height > init_height:
        fig.set_figheight(figure_height)

    # set plot title
    plt.title(plot_title, fontsize=14)
    # set axis titles
    # plt.xlabel('classes')
    plt.xlabel(x_label, fontsize='large')
    # adjust size of window
    fig.tight_layout()
    # save the plot
    fig.savefig(output_path)
    # show image
    if to_show:
        plt.show()
    # close the plot
    plt.close()

def get_map(MINOVERLAP, draw_plot, path = './map_out'):
    GT_PATH = os.path.join(path, 'ground-truth')
    DR_PATH = os.path.join(path, 'detection-results')
    IMG_PATH = os.path.join(path, 'images-optional')
    TEMP_FILES_PATH = os.path.join(path, '.temp_files')
    RESULTS_FILES_PATH = os.path.join(path, 'results')

    show_animation = True
    if os.path.exists(IMG_PATH):
        for dirpath, dirnames, files in os.walk(IMG_PATH):
            if not files:
                show_animation = False
    else:
        show_animation = False

    if not os.path.exists(TEMP_FILES_PATH):
        os.makedirs(TEMP_FILES_PATH)

    if os.path.exists(RESULTS_FILES_PATH):
        shutil.rmtree(RESULTS_FILES_PATH)
    if draw_plot:
        os.makedirs(os.path.join(RESULTS_FILES_PATH, "AP"))
        os.makedirs(os.path.join(RESULTS_FILES_PATH, "F1"))
        os.makedirs(os.path.join(RESULTS_FILES_PATH, "Recall"))
        os.makedirs(os.path.join(RESULTS_FILES_PATH, "Precision"))
    if show_animation:
        os.makedirs(os.path.join(RESULTS_FILES_PATH, "images", "detections_one_by_one"))

    ground_truth_files_list = glob.glob(GT_PATH + '/*.txt')
    if len(ground_truth_files_list) == 0:
        error("Error: No ground-truth files found!")
    ground_truth_files_list.sort()
    gt_counter_per_class = {}
    counter_images_per_class = {}

    for txt_file in ground_truth_files_list:
        file_id = txt_file.split(".txt", 1)[0]
        file_id = os.path.basename(os.path.normpath(file_id))
        temp_path = os.path.join(DR_PATH, (file_id + ".txt"))
        if not os.path.exists(temp_path):
            error_msg = "Error.
File not found: {}\n".format(temp_path)
            error(error_msg)
        lines_list = file_lines_to_list(txt_file)
        bounding_boxes = []
        is_difficult = False
        already_seen_classes = []
        for line in lines_list:
            try:
                if "difficult" in line:
                    class_name, left, top, right, bottom, _difficult = line.split()
                    is_difficult = True
                else:
                    class_name, left, top, right, bottom = line.split()
            except ValueError:
                # The unpacking failed: the class name contains spaces,
                # so read the fields from the end of the line instead
                if "difficult" in line:
                    line_split = line.split()
                    _difficult = line_split[-1]
                    bottom = line_split[-2]
                    right = line_split[-3]
                    top = line_split[-4]
                    left = line_split[-5]
                    class_name = ""
                    for name in line_split[:-5]:
                        class_name += name + " "
                    class_name = class_name[:-1]
                    is_difficult = True
                else:
                    line_split = line.split()
                    bottom = line_split[-1]
                    right = line_split[-2]
                    top = line_split[-3]
                    left = line_split[-4]
                    class_name = ""
                    for name in line_split[:-4]:
                        class_name += name + " "
                    class_name = class_name[:-1]

            bbox = left + " " + top + " " + right + " " + bottom
            if is_difficult:
                bounding_boxes.append({"class_name":class_name, "bbox":bbox, "used":False, "difficult":True})
                is_difficult = False
            else:
                bounding_boxes.append({"class_name":class_name, "bbox":bbox, "used":False})
                if class_name in gt_counter_per_class:
                    gt_counter_per_class[class_name] += 1
                else:
                    gt_counter_per_class[class_name] = 1

                if class_name not in already_seen_classes:
                    if class_name in counter_images_per_class:
                        counter_images_per_class[class_name] += 1
                    else:
                        counter_images_per_class[class_name] = 1
                    already_seen_classes.append(class_name)

        with open(TEMP_FILES_PATH + "/" + file_id + "_ground_truth.json", 'w') as outfile:
            json.dump(bounding_boxes, outfile)

    gt_classes = list(gt_counter_per_class.keys())
    gt_classes = sorted(gt_classes)
    n_classes = len(gt_classes)

    dr_files_list = glob.glob(DR_PATH + '/*.txt')
    dr_files_list.sort()
    for class_index, class_name in enumerate(gt_classes):
        bounding_boxes = []
        for txt_file in dr_files_list:
            file_id = txt_file.split(".txt",1)[0]
            file_id = os.path.basename(os.path.normpath(file_id))
            temp_path =
os.path.join(GT_PATH, (file_id + ".txt"))
            if class_index == 0:
                if not os.path.exists(temp_path):
                    error_msg = "Error. File not found: {}\n".format(temp_path)
                    error(error_msg)
            lines = file_lines_to_list(txt_file)
            for line in lines:
                try:
                    tmp_class_name, confidence, left, top, right, bottom = line.split()
                except ValueError:
                    # The class name contains spaces: read the fields from the end
                    line_split = line.split()
                    bottom = line_split[-1]
                    right = line_split[-2]
                    top = line_split[-3]
                    left = line_split[-4]
                    confidence = line_split[-5]
                    tmp_class_name = ""
                    for name in line_split[:-5]:
                        tmp_class_name += name + " "
                    tmp_class_name = tmp_class_name[:-1]

                if tmp_class_name == class_name:
                    bbox = left + " " + top + " " + right + " " +bottom
                    bounding_boxes.append({"confidence":confidence, "file_id":file_id, "bbox":bbox})

        bounding_boxes.sort(key=lambda x:float(x['confidence']), reverse=True)
        with open(TEMP_FILES_PATH + "/" + class_name + "_dr.json", 'w') as outfile:
            json.dump(bounding_boxes, outfile)

    sum_AP = 0.0
    ap_dictionary = {}
    lamr_dictionary = {}
    with open(RESULTS_FILES_PATH + "/results.txt", 'w') as results_file:
        results_file.write("# AP and precision/recall per class\n")
        count_true_positives = {}

        for class_index, class_name in enumerate(gt_classes):
            count_true_positives[class_name] = 0
            dr_file = TEMP_FILES_PATH + "/" + class_name + "_dr.json"
            dr_data = json.load(open(dr_file))

            nd = len(dr_data)
            tp = [0] * nd
            fp = [0] * nd
            score = [0] * nd
            score05_idx = 0
            for idx, detection in enumerate(dr_data):
                file_id = detection["file_id"]
                score[idx] = float(detection["confidence"])
                # Detections are sorted by confidence, so this ends up as the
                # index of the last detection scoring above 0.5
                if score[idx] > 0.5:
                    score05_idx = idx

                if show_animation:
                    ground_truth_img = glob.glob1(IMG_PATH, file_id + ".*")
                    if len(ground_truth_img) == 0:
                        error("Error. Image not found with id: " + file_id)
                    elif len(ground_truth_img) > 1:
                        error("Error.
Multiple image with id: " + file_id)
                    else:
                        img = cv2.imread(IMG_PATH + "/" + ground_truth_img[0])
                        img_cumulative_path = RESULTS_FILES_PATH + "/images/" + ground_truth_img[0]
                        if os.path.isfile(img_cumulative_path):
                            img_cumulative = cv2.imread(img_cumulative_path)
                        else:
                            img_cumulative = img.copy()
                        bottom_border = 60
                        BLACK = [0, 0, 0]
                        img = cv2.copyMakeBorder(img, 0, bottom_border, 0, 0, cv2.BORDER_CONSTANT, value=BLACK)

                gt_file = TEMP_FILES_PATH + "/" + file_id + "_ground_truth.json"
                ground_truth_data = json.load(open(gt_file))
                ovmax = -1
                gt_match = -1
                bb = [float(x) for x in detection["bbox"].split()]
                for obj in ground_truth_data:
                    if obj["class_name"] == class_name:
                        bbgt = [ float(x) for x in obj["bbox"].split() ]
                        bi = [max(bb[0],bbgt[0]), max(bb[1],bbgt[1]), min(bb[2],bbgt[2]), min(bb[3],bbgt[3])]
                        iw = bi[2] - bi[0] + 1
                        ih = bi[3] - bi[1] + 1
                        if iw > 0 and ih > 0:
                            ua = (bb[2] - bb[0] + 1) * (bb[3] - bb[1] + 1) + (bbgt[2] - bbgt[0]
                                    + 1) * (bbgt[3] - bbgt[1] + 1) - iw * ih
                            ov = iw * ih / ua
                            if ov > ovmax:
                                ovmax = ov
                                gt_match = obj

                if show_animation:
                    status = "NO MATCH FOUND!"

                min_overlap = MINOVERLAP
                if ovmax >= min_overlap:
                    if "difficult" not in gt_match:
                        if not bool(gt_match["used"]):
                            tp[idx] = 1
                            gt_match["used"] = True
                            count_true_positives[class_name] += 1
                            with open(gt_file, 'w') as f:
                                f.write(json.dumps(ground_truth_data))
                            if show_animation:
                                status = "MATCH!"
                        else:
                            fp[idx] = 1
                            if show_animation:
                                status = "REPEATED MATCH!"
                else:
                    fp[idx] = 1
                    if ovmax > 0:
                        status = "INSUFFICIENT OVERLAP"

                """
                Draw image to show animation
                """
                if show_animation:
                    height, width = img.shape[:2]
                    white = (255,255,255)
                    light_blue = (255,200,100)
                    green = (0,255,0)
                    light_red = (30,30,255)
                    margin = 10
                    # 1st line
                    v_pos = int(height - margin - (bottom_border / 2.0))
                    text = "Image: " + ground_truth_img[0] + " "
                    img, line_width = draw_text_in_image(img, text, (margin, v_pos), white, 0)
                    text = "Class [" + str(class_index) + "/" + str(n_classes) + "]: " + class_name + " "
                    img, line_width = draw_text_in_image(img, text, (margin + line_width, v_pos), light_blue, line_width)
                    if ovmax != -1:
                        color = light_red
                        if status == "INSUFFICIENT OVERLAP":
                            text = "IoU: {0:.2f}% ".format(ovmax*100) + "< {0:.2f}% ".format(min_overlap*100)
                        else:
                            text = "IoU: {0:.2f}% ".format(ovmax*100) + ">= {0:.2f}% ".format(min_overlap*100)
                            color = green
                        img, _ = draw_text_in_image(img, text, (margin + line_width, v_pos), color, line_width)
                    # 2nd line
                    v_pos += int(bottom_border / 2.0)
                    rank_pos = str(idx+1)
                    text = "Detection #rank: " + rank_pos + " confidence: {0:.2f}% ".format(float(detection["confidence"])*100)
                    img, line_width = draw_text_in_image(img, text, (margin, v_pos), white, 0)
                    color = light_red
                    if status == "MATCH!":
                        color = green
                    text = "Result: " + status + " "
                    img, line_width = draw_text_in_image(img, text, (margin + line_width, v_pos), color, line_width)

                    font = cv2.FONT_HERSHEY_SIMPLEX
                    if ovmax > 0:
                        bbgt = [ int(round(float(x))) for x in gt_match["bbox"].split() ]
                        cv2.rectangle(img,(bbgt[0],bbgt[1]),(bbgt[2],bbgt[3]),light_blue,2)
                        cv2.rectangle(img_cumulative,(bbgt[0],bbgt[1]),(bbgt[2],bbgt[3]),light_blue,2)
                        cv2.putText(img_cumulative, class_name, (bbgt[0],bbgt[1] - 5), font, 0.6, light_blue, 1, cv2.LINE_AA)
                    bb = [int(i) for i in bb]
                    cv2.rectangle(img,(bb[0],bb[1]),(bb[2],bb[3]),color,2)
                    cv2.rectangle(img_cumulative,(bb[0],bb[1]),(bb[2],bb[3]),color,2)
                    cv2.putText(img_cumulative, class_name, (bb[0],bb[1] - 5), font, 0.6, color, 1,
cv2.LINE_AA)

                    cv2.imshow("Animation", img)
                    cv2.waitKey(20)
                    output_img_path = RESULTS_FILES_PATH + "/images/detections_one_by_one/" + class_name + "_detection" + str(idx) + ".jpg"
                    cv2.imwrite(output_img_path, img)
                    cv2.imwrite(img_cumulative_path, img_cumulative)

            cumsum = 0
            for idx, val in enumerate(fp):
                fp[idx] += cumsum
                cumsum += val

            cumsum = 0
            for idx, val in enumerate(tp):
                tp[idx] += cumsum
                cumsum += val

            rec = tp[:]
            for idx, val in enumerate(tp):
                rec[idx] = float(tp[idx]) / np.maximum(gt_counter_per_class[class_name], 1)

            prec = tp[:]
            for idx, val in enumerate(tp):
                prec[idx] = float(tp[idx]) / np.maximum((fp[idx] + tp[idx]), 1)

            ap, mrec, mprec = voc_ap(rec[:], prec[:])
            F1 = np.array(rec)*np.array(prec)*2 / np.where((np.array(prec)+np.array(rec))==0, 1, (np.array(prec)+np.array(rec)))

            sum_AP += ap
            text = "{0:.2f}%".format(ap*100) + " = " + class_name + " AP " #class_name + " AP = {0:.2f}%".format(ap*100)

            if len(prec)>0:
                F1_text = "{0:.2f}".format(F1[score05_idx]) + " = " + class_name + " F1 "
                Recall_text = "{0:.2f}%".format(rec[score05_idx]*100) + " = " + class_name + " Recall "
                Precision_text = "{0:.2f}%".format(prec[score05_idx]*100) + " = " + class_name + " Precision "
            else:
                F1_text = "0.00" + " = " + class_name + " F1 "
                Recall_text = "0.00%" + " = " + class_name + " Recall "
                Precision_text = "0.00%" + " = " + class_name + " Precision "

            rounded_prec = [ '%.2f' % elem for elem in prec ]
            rounded_rec = [ '%.2f' % elem for elem in rec ]
            results_file.write(text + "\n Precision: " + str(rounded_prec) + "\n Recall :" + str(rounded_rec) + "\n\n")
            if len(prec)>0:
                print(text + "\t||\tscore_threhold=0.5 : " + "F1=" + "{0:.2f}".format(F1[score05_idx])
                    + " ; Recall=" + "{0:.2f}%".format(rec[score05_idx]*100) + " ; Precision=" + "{0:.2f}%".format(prec[score05_idx]*100))
            else:
                print(text + "\t||\tscore_threhold=0.5 : F1=0.00% ; Recall=0.00% ; Precision=0.00%")
            ap_dictionary[class_name] = ap

            n_images = counter_images_per_class[class_name]
            lamr, mr, fppi =
log_average_miss_rate(np.array(rec), np.array(fp), n_images) lamr_dictionary[class_name] = lamr if draw_plot: plt.plot(rec, prec, '-o') area_under_curve_x = mrec[:-1] + [mrec[-2]] + [mrec[-1]] area_under_curve_y = mprec[:-1] + [0.0] + [mprec[-1]] plt.fill_between(area_under_curve_x, 0, area_under_curve_y, alpha=0.2, edgecolor='r') fig = plt.gcf() fig.canvas.manager.set_window_title('AP ' + class_name) plt.title('class: ' + text) plt.xlabel('Recall') plt.ylabel('Precision') axes = plt.gca() axes.set_xlim([0.0,1.0]) axes.set_ylim([0.0,1.05]) fig.savefig(RESULTS_FILES_PATH + "/AP/" + class_name + ".png") plt.cla() plt.plot(score, F1, "-", color='orangered') plt.title('class: ' + F1_text + "\nscore_threhold=0.5") plt.xlabel('Score_Threhold') plt.ylabel('F1') axes = plt.gca() axes.set_xlim([0.0,1.0]) axes.set_ylim([0.0,1.05]) fig.savefig(RESULTS_FILES_PATH + "/F1/" + class_name + ".png") plt.cla() plt.plot(score, rec, "-H", color='gold') plt.title('class: ' + Recall_text + "\nscore_threhold=0.5") plt.xlabel('Score_Threhold') plt.ylabel('Recall') axes = plt.gca() axes.set_xlim([0.0,1.0]) axes.set_ylim([0.0,1.05]) fig.savefig(RESULTS_FILES_PATH + "/Recall/" + class_name + ".png") plt.cla() plt.plot(score, prec, "-s", color='palevioletred') plt.title('class: ' + Precision_text + "\nscore_threhold=0.5") plt.xlabel('Score_Threhold') plt.ylabel('Precision') axes = plt.gca() axes.set_xlim([0.0,1.0]) axes.set_ylim([0.0,1.05]) fig.savefig(RESULTS_FILES_PATH + "/Precision/" + class_name + ".png") plt.cla() if show_animation: cv2.destroyAllWindows() results_file.write("\n# mAP of all classes\n") mAP = sum_AP / n_classes text = "mAP = {0:.2f}%".format(mAP*100) results_file.write(text + "\n") print(text) shutil.rmtree(TEMP_FILES_PATH) """ Count total of detection-results """ det_counter_per_class = {} for txt_file in dr_files_list: lines_list = file_lines_to_list(txt_file) for line in lines_list: class_name = line.split()[0] if class_name in det_counter_per_class: 
                det_counter_per_class[class_name] += 1
            else:
                det_counter_per_class[class_name] = 1
    dr_classes = list(det_counter_per_class.keys())

    """
    Write number of ground-truth objects per class to results.txt
    """
    with open(RESULTS_FILES_PATH + "/results.txt", 'a') as results_file:
        results_file.write("\n# Number of ground-truth objects per class\n")
        for class_name in sorted(gt_counter_per_class):
            results_file.write(class_name + ": " + str(gt_counter_per_class[class_name]) + "\n")

    """
    Finish counting true positives
    """
    for class_name in dr_classes:
        if class_name not in gt_classes:
            count_true_positives[class_name] = 0

    """
    Write number of detected objects per class to results.txt
    """
    with open(RESULTS_FILES_PATH + "/results.txt", 'a') as results_file:
        results_file.write("\n# Number of detected objects per class\n")
        for class_name in sorted(dr_classes):
            n_det = det_counter_per_class[class_name]
            text = class_name + ": " + str(n_det)
            text += " (tp:" + str(count_true_positives[class_name]) + ""
            text += ", fp:" + str(n_det - count_true_positives[class_name]) + ")\n"
            results_file.write(text)

    """
    Plot the total number of occurrences of each class in the ground-truth
    """
    if draw_plot:
        window_title = "ground-truth-info"
        plot_title = "ground-truth\n"
        plot_title += "(" + str(len(ground_truth_files_list)) + " files and " + str(n_classes) + " classes)"
        x_label = "Number of objects per class"
        output_path = RESULTS_FILES_PATH + "/ground-truth-info.png"
        to_show = False
        plot_color = 'forestgreen'
        draw_plot_func(
            gt_counter_per_class,
            n_classes,
            window_title,
            plot_title,
            x_label,
            output_path,
            to_show,
            plot_color,
            '',
            )

    # """
    # Plot the total number of occurrences of each class in the "detection-results" folder
    # """
    # if draw_plot:
    #     window_title = "detection-results-info"
    #     # Plot title
    #     plot_title = "detection-results\n"
    #     plot_title += "(" + str(len(dr_files_list)) + " files and "
    #     count_non_zero_values_in_dictionary = sum(int(x) > 0 for x in list(det_counter_per_class.values()))
    #     plot_title += str(count_non_zero_values_in_dictionary) + " detected classes)"
    #     # end Plot title
    #     x_label = "Number of objects per class"
    #     output_path = RESULTS_FILES_PATH + "/detection-results-info.png"
    #     to_show = False
    #     plot_color = 'forestgreen'
    #     true_p_bar = count_true_positives
    #     draw_plot_func(
    #         det_counter_per_class,
    #         len(det_counter_per_class),
    #         window_title,
    #         plot_title,
    #         x_label,
    #         output_path,
    #         to_show,
    #         plot_color,
    #         true_p_bar
    #         )

    """
    Draw log-average miss rate plot (Show lamr of all classes in decreasing order)
    """
    if draw_plot:
        window_title = "lamr"
        plot_title = "log-average miss rate"
        x_label = "log-average miss rate"
        output_path = RESULTS_FILES_PATH + "/lamr.png"
        to_show = False
        plot_color = 'royalblue'
        draw_plot_func(
            lamr_dictionary,
            n_classes,
            window_title,
            plot_title,
            x_label,
            output_path,
            to_show,
            plot_color,
            ""
            )

    """
    Draw mAP plot (Show AP's of all classes in decreasing order)
    """
    if draw_plot:
        window_title = "mAP"
        plot_title = "mAP = {0:.2f}%".format(mAP*100)
        x_label = "Average Precision"
        output_path = RESULTS_FILES_PATH + "/mAP.png"
        to_show = True
        plot_color = 'royalblue'
        draw_plot_func(
            ap_dictionary,
            n_classes,
            window_title,
            plot_title,
            x_label,
            output_path,
            to_show,
            plot_color,
            ""
            )

def preprocess_gt(gt_path, class_names):
    image_ids = os.listdir(gt_path)
    results = {}

    images = []
    bboxes = []
    for i, image_id in enumerate(image_ids):
        lines_list = file_lines_to_list(os.path.join(gt_path, image_id))
        boxes_per_image = []
        image = {}
        image_id = os.path.splitext(image_id)[0]
        image['file_name'] = image_id + '.jpg'
        image['width'] = 1
        image['height'] = 1
        #-----------------------------------------------------------------#
        #   Thanks to 多学学英语吧 for the tip:
        #   this fixes the 'Results do not correspond to current coco set' error
        #-----------------------------------------------------------------#
        image['id'] = str(image_id)

        for line in lines_list:
            difficult = 0
            if "difficult" in line:
                line_split = line.split()
                left, top, right, bottom, _difficult = line_split[-5:]
                class_name = ""
                for name in line_split[:-5]:
                    class_name += name + " "
                class_name = class_name[:-1]
                difficult = 1
            else:
                line_split = line.split()
                left, top, right, bottom = line_split[-4:]
                class_name = ""
                for name in line_split[:-4]:
                    class_name += name + " "
                class_name = class_name[:-1]

            left, top, right, bottom = float(left), float(top), float(right), float(bottom)
            cls_id = class_names.index(class_name) + 1
            bbox = [left, top, right - left, bottom - top, difficult, str(image_id), cls_id, (right - left) * (bottom - top) - 10.0]
            boxes_per_image.append(bbox)
        images.append(image)
        bboxes.extend(boxes_per_image)
    results['images'] = images

    categories = []
    for i, cls in enumerate(class_names):
        category = {}
        category['supercategory'] = cls
        category['name'] = cls
        category['id'] = i + 1
        categories.append(category)
    results['categories'] = categories

    annotations = []
    for i, box in enumerate(bboxes):
        annotation = {}
        annotation['area'] = box[-1]
        annotation['category_id'] = box[-2]
        annotation['image_id'] = box[-3]
        annotation['iscrowd'] = box[-4]
        annotation['bbox'] = box[:4]
        annotation['id'] = i
        annotations.append(annotation)
    results['annotations'] = annotations
    return results

def preprocess_dr(dr_path, class_names):
    image_ids = os.listdir(dr_path)
    results = []
    for image_id in image_ids:
        lines_list = file_lines_to_list(os.path.join(dr_path, image_id))
        image_id = os.path.splitext(image_id)[0]
        for line in lines_list:
            line_split = line.split()
            confidence, left, top, right, bottom = line_split[-5:]
            class_name = ""
            for name in line_split[:-5]:
                class_name += name + " "
            class_name = class_name[:-1]
            left, top, right, bottom = float(left), float(top), float(right), float(bottom)
            result = {}
            result["image_id"] = str(image_id)
            result["category_id"] = class_names.index(class_name) + 1
            result["bbox"] = [left, top, right - left, bottom - top]
            result["score"] = float(confidence)
            results.append(result)
    return results

def get_coco_map(class_names, path):
    from pycocotools.coco import COCO
    from pycocotools.cocoeval import COCOeval

    GT_PATH = os.path.join(path, 'ground-truth')
    DR_PATH = os.path.join(path, 'detection-results')
    COCO_PATH = os.path.join(path, 'coco_eval')

    if not os.path.exists(COCO_PATH):
        os.makedirs(COCO_PATH)

    GT_JSON_PATH = os.path.join(COCO_PATH, 'instances_gt.json')
    DR_JSON_PATH = os.path.join(COCO_PATH, 'instances_dr.json')

    with open(GT_JSON_PATH, "w") as f:
        results_gt = preprocess_gt(GT_PATH, class_names)
        json.dump(results_gt, f, indent=4)

    with open(DR_JSON_PATH, "w") as f:
        results_dr = preprocess_dr(DR_PATH, class_names)
        json.dump(results_dr, f, indent=4)

    cocoGt = COCO(GT_JSON_PATH)
    cocoDt = cocoGt.loadRes(DR_JSON_PATH)
    cocoEval = COCOeval(cocoGt, cocoDt, 'bbox')
    cocoEval.evaluate()
    cocoEval.accumulate()
    cocoEval.summarize()

================================================
FILE: utils_coco/coco_annotation.py
================================================
#-------------------------------------------------------#
#   Processes the COCO dataset: generates the txt files
#   used for training from the json annotation files
#-------------------------------------------------------#
import json
import os
from collections import defaultdict

#-------------------------------------------------------#
#   Paths to the COCO training and validation images
#-------------------------------------------------------#
train_datasets_path = "coco_dataset/train2017"
val_datasets_path = "coco_dataset/val2017"

#-------------------------------------------------------#
#   Paths to the COCO training and validation annotations
#-------------------------------------------------------#
train_annotation_path = "coco_dataset/annotations/instances_train2017.json"
val_annotation_path = "coco_dataset/annotations/instances_val2017.json"

#-------------------------------------------------------#
#   Paths of the generated txt files
#-------------------------------------------------------#
train_output_path = "coco_train.txt"
val_output_path = "coco_val.txt"

def remap_category_id(cat):
    # Map the sparse COCO category ids (1..90 with gaps) onto contiguous ids 0..79.
    # Ids outside the listed ranges are returned unchanged, as in the original chain.
    if cat >= 1 and cat <= 11:
        cat = cat - 1
    elif cat >= 13 and cat <= 25:
        cat = cat - 2
    elif cat >= 27 and cat <= 28:
        cat = cat - 3
    elif cat >= 31 and cat <= 44:
        cat = cat - 5
    elif cat >= 46 and cat <= 65:
        cat = cat - 6
    elif cat == 67:
        cat = cat - 7
    elif cat == 70:
        cat = cat - 9
    elif cat >= 72 and cat <= 82:
        cat = cat - 10
    elif cat >= 84 and cat <= 90:
        cat = cat - 11
    return cat

def convert_annotations(annotation_path, datasets_path, output_path):
    # The train and val passes were two identical copies of this code; they now share one function.
    name_box_id = defaultdict(list)
    with open(annotation_path, encoding='utf-8') as f:
        data = json.load(f)

    for ant in data['annotations']:
        image_id = ant['image_id']
        name = os.path.join(datasets_path, '%012d.jpg' % image_id)
        cat = remap_category_id(ant['category_id'])
        name_box_id[name].append([ant['bbox'], cat])

    # One line per image: "<image path> x_min,y_min,x_max,y_max,class ..."
    with open(output_path, 'w') as f:
        for key in name_box_id.keys():
            f.write(key)
            box_infos = name_box_id[key]
            for info in box_infos:
                x_min = int(info[0][0])
                y_min = int(info[0][1])
                x_max = x_min + int(info[0][2])
                y_max = y_min + int(info[0][3])

                box_info = " %d,%d,%d,%d,%d" % (
                    x_min, y_min, x_max, y_max, int(info[1]))
                f.write(box_info)
            f.write('\n')

if __name__ == "__main__":
    convert_annotations(train_annotation_path, train_datasets_path, train_output_path)
    convert_annotations(val_annotation_path, val_datasets_path, val_output_path)

================================================
FILE: utils_coco/get_map_coco.py
================================================
import json
import os

import numpy as np
import torch
from PIL import Image
from pycocotools.coco import COCO
from pycocotools.cocoeval import COCOeval
from tqdm import tqdm

from utils.utils import cvtColor, preprocess_input, resize_image
from yolo import YOLO

#---------------------------------------------------------------------------#
#   map_mode selects what this script computes when run.
#   map_mode = 0: the full mAP pipeline (get predictions, then compute mAP).
#   map_mode = 1: only get the predictions.
#   map_mode = 2: only compute the mAP.
#---------------------------------------------------------------------------#
map_mode = 0
#-------------------------------------------------------#
#   Paths to the validation annotations and images
#-------------------------------------------------------#
cocoGt_path = 'coco_dataset/annotations/instances_val2017.json'
dataset_img_path = 'coco_dataset/val2017'
#-------------------------------------------------------#
#   Output folder for the results; defaults to map_out
#-------------------------------------------------------#
temp_save_path = 'map_out/coco_eval'

class mAP_YOLO(YOLO):
    #---------------------------------------------------#
    #   Detect an image
    #---------------------------------------------------#
    def detect_image(self, image_id, image, results):
        #---------------------------------------------------#
        #   Get the height and width of the input image
        #---------------------------------------------------#
        image_shape = np.array(np.shape(image)[0:2])
        #---------------------------------------------------------#
        #   Convert the image to RGB here so that grayscale images
        #   do not raise errors during prediction.
        #   The code only supports RGB prediction; every other
        #   image type is converted to RGB.
        #---------------------------------------------------------#
        image = cvtColor(image)
        #---------------------------------------------------------#
        #   Pad the image with gray bars for a distortion-free resize.
        #   A plain resize can also be used for detection.
        #---------------------------------------------------------#
        image_data = resize_image(image, (self.input_shape[1],self.input_shape[0]), self.letterbox_image)
        #---------------------------------------------------------#
        #   Add the batch_size dimension
        #---------------------------------------------------------#
        image_data = np.expand_dims(np.transpose(preprocess_input(np.array(image_data, dtype='float32')), (2, 0, 1)), 0)

        with torch.no_grad():
            images = torch.from_numpy(image_data)
            if self.cuda:
                images = images.cuda()
            #---------------------------------------------------------#
            #   Feed the image through the network for prediction
            #---------------------------------------------------------#
            outputs = self.net(images)
            outputs = self.bbox_util.decode_box(outputs)
            #---------------------------------------------------------#
            #   Concatenate the predicted boxes, then apply non-maximum suppression
            #---------------------------------------------------------#
            outputs = self.bbox_util.non_max_suppression(torch.cat(outputs, 1), self.num_classes, self.input_shape,
                        image_shape, self.letterbox_image, conf_thres = self.confidence, nms_thres = self.nms_iou)

            if outputs[0] is None:
                return results

            top_label = np.array(outputs[0][:, 6], dtype = 'int32')
            top_conf = outputs[0][:, 4] * outputs[0][:, 5]
            top_boxes = outputs[0][:, :4]

        for i, c in enumerate(top_label):
            result = {}
            top, left, bottom, right = top_boxes[i]

            result["image_id"] = int(image_id)
            result["category_id"] = clsid2catid[c]
            result["bbox"] = [float(left),float(top),float(right-left),float(bottom-top)]
            result["score"] = float(top_conf[i])
            results.append(result)
        return results

if __name__ == "__main__":
    if not os.path.exists(temp_save_path):
        os.makedirs(temp_save_path)

    cocoGt = COCO(cocoGt_path)
    ids = list(cocoGt.imgToAnns.keys())
    clsid2catid = cocoGt.getCatIds()

    if map_mode == 0 or map_mode == 1:
        # NOTE: YOLO.__init__ in yolo.py takes a required `opt` argument and then
        # overwrites confidence and nms_iou from it, so this call needs such an
        # object before it can run with the thresholds given here.
        yolo = mAP_YOLO(confidence = 0.001, nms_iou = 0.65)

        with open(os.path.join(temp_save_path, 'eval_results.json'),"w") as f:
            results = []
            for image_id in tqdm(ids):
                image_path = os.path.join(dataset_img_path, cocoGt.loadImgs(image_id)[0]['file_name'])
                image = Image.open(image_path)
                results = yolo.detect_image(image_id, image, results)
            json.dump(results, f)

    if map_mode == 0 or map_mode == 2:
        cocoDt = cocoGt.loadRes(os.path.join(temp_save_path, 'eval_results.json'))
        cocoEval = COCOeval(cocoGt, cocoDt, 'bbox')
        cocoEval.evaluate()
        cocoEval.accumulate()
        cocoEval.summarize()
        print("Get map done.")

================================================
FILE: voc_annotation.py
================================================
import os
import random
import xml.etree.ElementTree as ET
from get_yaml import get_config
from utils.utils import get_classes

#--------------------------------------------------------------------------------------------------------------------------------#
#   annotation_mode selects what this script computes when run.
#   annotation_mode = 0: the full label pipeline, i.e. the txt files in VOCdevkit/VOC2007/ImageSets plus the
#                        2007_train.txt and 2007_val.txt used for training
#   annotation_mode = 1: only the txt files in VOCdevkit/VOC2007/ImageSets
#   annotation_mode = 2: only the 2007_train.txt and 2007_val.txt used for training
#--------------------------------------------------------------------------------------------------------------------------------#
annotation_mode = 0
#-------------------------------------------------------------------#
#   Must be set: provides the object classes written into
#   2007_train.txt and 2007_val.txt.
#   Keep it identical to the classes_path used for training and prediction.
#   If the generated 2007_train.txt contains no object information,
#   the classes are not configured correctly.
#   Only used when annotation_mode is 0 or 2.
#   (In this repository the classes are read from the yaml config below.)
#-------------------------------------------------------------------#
# classes_path = 'model_data/gesture_classes.txt'
#--------------------------------------------------------------------------------------------------------------------------------#
#   trainval_percent sets the ratio of (train + val) to test. The upstream default is (train + val):test = 9:1;
#   it is set to 1 here, so every image goes to (train + val) and the test split stays empty.
#   train_percent sets the ratio of train to val inside (train + val); by default train:val = 9:1.
#   Only used when annotation_mode is 0 or 1.
#--------------------------------------------------------------------------------------------------------------------------------#
trainval_percent = 1
train_percent = 0.9
#-------------------------------------------------------#
#   Folder that contains the VOC dataset.
#   Defaults to the VOC dataset in the repository root.
#-------------------------------------------------------#
VOCdevkit_path = 'VOCdevkit'

VOCdevkit_sets = [('2007', 'train'), ('2007', 'val')]
# classes, _ = get_classes(classes_path)
config = get_config()
classes = config['classes']

def convert_annotation(year, image_id, list_file):
    # Append " xmin,ymin,xmax,ymax,cls_id" to list_file for every usable object in the image's xml file
    with open(os.path.join(VOCdevkit_path, 'VOC%s/Annotations/%s.xml'%(year, image_id)), encoding='utf-8') as in_file:
        tree = ET.parse(in_file)
    root = tree.getroot()

    for obj in root.iter('object'):
        difficult = 0
        if obj.find('difficult') is not None:
            difficult = obj.find('difficult').text
        cls = obj.find('name').text
        # Skip unknown classes and objects flagged as difficult
        if cls not in classes or int(difficult)==1:
            continue
        cls_id = classes.index(cls)
        xmlbox = obj.find('bndbox')
        b = (int(float(xmlbox.find('xmin').text)), int(float(xmlbox.find('ymin').text)), int(float(xmlbox.find('xmax').text)), int(float(xmlbox.find('ymax').text)))
        list_file.write(" " + ",".join([str(a) for a in b]) + ',' + str(cls_id))

if __name__ == "__main__":
    random.seed(0)
    if annotation_mode == 0 or annotation_mode == 1:
        print("Generate txt in ImageSets.")
        xmlfilepath = os.path.join(VOCdevkit_path, 'VOC2007/Annotations')
        saveBasePath = os.path.join(VOCdevkit_path, 'VOC2007/ImageSets/Main')
        temp_xml = os.listdir(xmlfilepath)
        total_xml = []
        for xml in temp_xml:
            if xml.endswith(".xml"):
                total_xml.append(xml)

        num = len(total_xml)
        indices = range(num)    # renamed from `list` to avoid shadowing the builtin
        tv = int(num*trainval_percent)
        tr = int(tv*train_percent)
        trainval = random.sample(indices,tv)
        train = random.sample(trainval,tr)

        print("train and val size",tv)
        print("train size",tr)
        ftrainval = open(os.path.join(saveBasePath,'trainval.txt'), 'w')
        ftest = open(os.path.join(saveBasePath,'test.txt'), 'w')
        ftrain = open(os.path.join(saveBasePath,'train.txt'), 'w')
        fval = open(os.path.join(saveBasePath,'val.txt'), 'w')

        for i in indices:
            name=total_xml[i][:-4]+'\n'
            if i in trainval:
                ftrainval.write(name)
                if i in train:
                    ftrain.write(name)
                else:
                    fval.write(name)
            else:
                ftest.write(name)

        ftrainval.close()
        ftrain.close()
        fval.close()
        ftest.close()
        print("Generate txt in ImageSets done.")

    if annotation_mode == 0 or annotation_mode == 2:
        print("Generate 2007_train.txt and 2007_val.txt for train.")
        for year, image_set in VOCdevkit_sets:
            image_ids = open(os.path.join(VOCdevkit_path, 'VOC%s/ImageSets/Main/%s.txt'%(year, image_set)), encoding='utf-8').read().strip().split()
            list_file = open('%s_%s.txt'%(year, image_set), 'w', encoding='utf-8')
            for image_id in image_ids:
                list_file.write('%s/VOC%s/JPEGImages/%s.jpg'%(os.path.abspath(VOCdevkit_path), year, image_id))

                convert_annotation(year, image_id, list_file)
                list_file.write('\n')
            list_file.close()
        print("Generate 2007_train.txt and 2007_val.txt for train done.")

================================================
FILE: yolo.py
================================================
import colorsys
import os
import time

import numpy as np
import torch
import torch.nn as nn
from PIL import ImageDraw, ImageFont

from nets.yolo import YoloBody
from nets.yolo_tiny import YoloBodytiny
from utils.utils import (cvtColor, get_anchors, get_classes, preprocess_input,
                         resize_image)
from utils.utils_bbox import DecodeBox
from get_yaml import get_config
import argparse
'''
Read these comments before training on your own dataset!
'''
class YOLO(object):
    # Configuration file
    config = get_config()
    _defaults = {
        #--------------------------------------------------------------------------#
        #   To predict with your own trained model you must change model_path and classes_path!
        #   model_path points to a weights file under logs/; classes_path points to the txt under model_data/
        #
        #   After training, logs/ holds several weights files; pick one with a low validation loss.
        #   A low validation loss does not imply a high mAP; it only means the weights generalize well on the validation set.
        #   If a shape mismatch occurs, also check the model_path and classes_path used during training.
        #--------------------------------------------------------------------------#
        "class_names" : config['classes'],
        "num_classes" : config['nc'],
        #---------------------------------------------------------------------#
        #   anchors_path is the txt file holding the anchor boxes; usually left unchanged.
        #   anchors_mask tells the code which anchors belong to each output head; usually left unchanged.
        #---------------------------------------------------------------------#
        "anchors_path" : 'model_data/yolo_anchors.txt',
        "anchors_mask" : [[6, 7, 8], [3, 4, 5], [0, 1, 2]],
        #---------------------------------------------------------------------#
        #   Only predicted boxes with a score above this confidence are kept
        #---------------------------------------------------------------------#
        "confidence" : 0.5, # 0.5,
        #---------------------------------------------------------------------#
        #   IoU threshold used for non-maximum suppression
        #---------------------------------------------------------------------#
        "nms_iou" : 0.3, # 0.3,
        #---------------------------------------------------------------------#
        #   Controls whether letterbox_image is used for a distortion-free resize of the input.
        #   After repeated tests, disabling letterbox_image and resizing directly gave better results.
        #---------------------------------------------------------------------#
        "letterbox_image" : config['letterbox_image'], # False,
    }

    @classmethod
    def get_defaults(cls, n):
        if n in cls._defaults:
            return cls._defaults[n]
        else:
            return "Unrecognized attribute name '" + n + "'"

    #---------------------------------------------------#
    #   Initialize YOLO
    #---------------------------------------------------#
    def __init__(self, opt, **kwargs):
        self.__dict__.update(self._defaults)
        for name, value in kwargs.items():
            setattr(self, name, value)
        # Values from `opt` take precedence over both the defaults and kwargs
        self.phi = opt.phi
        self.tiny = opt.tiny
        self.cuda = opt.cuda
        self.input_shape = [opt.shape,opt.shape]
        self.model_path = opt.weights
        self.confidence = opt.confidence
        self.nms_iou = opt.nms_iou
        if self.tiny:
            self.anchors_mask = [[3,4,5], [1,2,3]]
            self.anchors_path = 'model_data/yolotiny_anchors.txt'
        #---------------------------------------------------#
        #   Get the classes and the number of anchors
        #---------------------------------------------------#
        # self.class_names, self.num_classes = get_classes(self.classes_path)
        self.anchors, self.num_anchors = get_anchors(self.anchors_path)
        self.bbox_util = DecodeBox(self.anchors, self.num_classes, (self.input_shape[0], self.input_shape[1]), self.anchors_mask)

        #---------------------------------------------------#
        #   Assign a different box color to each class
        #---------------------------------------------------#
        hsv_tuples = [(x / self.num_classes, 1., 1.) for x in range(self.num_classes)]
        self.colors = list(map(lambda x: colorsys.hsv_to_rgb(*x), hsv_tuples))
        self.colors = list(map(lambda x: (int(x[0] * 255), int(x[1] * 255), int(x[2] * 255)), self.colors))
        self.generate()

    #---------------------------------------------------#
    #   Build the model
    #---------------------------------------------------#
    def generate(self, onnx=False):
        #---------------------------------------------------#
        #   Build the YOLO model and load its weights
        #---------------------------------------------------#
        if not self.tiny:
            self.net = YoloBody(self.anchors_mask, self.num_classes)
        else:
            self.net = YoloBodytiny(self.anchors_mask, self.num_classes, self.phi)
        device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
        self.net.load_state_dict(torch.load(self.model_path, map_location=device))
        self.net = self.net.eval()
        print('{} model, anchors, and classes loaded.'.format(self.model_path))
        if not onnx:
            if self.cuda:
                self.net = nn.DataParallel(self.net)
                self.net = self.net.cuda()

    #---------------------------------------------------#
    #   Detect an image
    #---------------------------------------------------#
    def detect_image(self, image, crop = False, count = False):
        #---------------------------------------------------#
        #   Get the height and width of the input image
        #---------------------------------------------------#
        image_shape = np.array(np.shape(image)[0:2])
        #---------------------------------------------------------#
        #   Convert the image to RGB here so that grayscale images
        #   do not raise errors during prediction.
        #   The code only supports RGB prediction; every other
        #   image type is converted to RGB.
        #---------------------------------------------------------#
        image = cvtColor(image)
        #---------------------------------------------------------#
        #   Pad the image with gray bars for a distortion-free resize.
        #   A plain resize can also be used for detection.
        #---------------------------------------------------------#
        image_data = resize_image(image, (self.input_shape[1],self.input_shape[0]), self.letterbox_image)
        #---------------------------------------------------------#
        #   Add the batch_size dimension
        #---------------------------------------------------------#
        image_data = np.expand_dims(np.transpose(preprocess_input(np.array(image_data, dtype='float32')), (2, 0, 1)), 0)

        with torch.no_grad():
            images = torch.from_numpy(image_data)
            if self.cuda:
                images = images.cuda()
            #---------------------------------------------------------#
            #   Feed the image through the network for prediction
            #---------------------------------------------------------#
            outputs = self.net(images)
            outputs = self.bbox_util.decode_box(outputs)
            #---------------------------------------------------------#
            #   Concatenate the predicted boxes, then apply non-maximum suppression
            #---------------------------------------------------------#
            results = self.bbox_util.non_max_suppression(torch.cat(outputs, 1), self.num_classes, self.input_shape,
                        image_shape, self.letterbox_image, conf_thres = self.confidence, nms_thres = self.nms_iou)

            if results[0] is None:
                return image

            top_label = np.array(results[0][:, 6], dtype = 'int32')
            top_conf = results[0][:, 4] * results[0][:, 5]
            top_boxes = results[0][:, :4]
        #---------------------------------------------------------#
        #   Set the font and the box thickness
        #---------------------------------------------------------#
        font = ImageFont.truetype(font='model_data/simhei.ttf', size=np.floor(3e-2 * image.size[1] + 0.5).astype('int32'))
        thickness = int(max((image.size[0] + image.size[1]) // np.mean(self.input_shape), 1))
        #---------------------------------------------------------#
        #   Count the detections per class
        #---------------------------------------------------------#
        if count:
            print("top_label:", top_label)
            classes_nums = np.zeros([self.num_classes])
            for i in range(self.num_classes):
                num = np.sum(top_label == i)
                if num > 0:
                    print(self.class_names[i], " : ", num)
                classes_nums[i] = num
            print("classes_nums:", classes_nums)
        #---------------------------------------------------------#
        #   Optionally crop the detected objects
        #---------------------------------------------------------#
        if crop:
            for i, c in list(enumerate(top_label)):
                top, left, bottom, right = top_boxes[i]
                top = max(0, np.floor(top).astype('int32'))
                left = max(0, np.floor(left).astype('int32'))
                bottom = min(image.size[1], np.floor(bottom).astype('int32'))
                right = min(image.size[0], np.floor(right).astype('int32'))

                dir_save_path = "img_crop"
                if not os.path.exists(dir_save_path):
                    os.makedirs(dir_save_path)
                crop_image = image.crop([left, top, right, bottom])
                crop_image.save(os.path.join(dir_save_path, "crop_" + str(i) + ".png"), quality=95, subsampling=0)
                print("save crop_" + str(i) + ".png to " + dir_save_path)
        #---------------------------------------------------------#
        #   Draw the results on the image
        #---------------------------------------------------------#
        for i, c in list(enumerate(top_label)):
            predicted_class = self.class_names[int(c)]
            box = top_boxes[i]
            score = top_conf[i]

            top, left, bottom, right = box

            top = max(0, np.floor(top).astype('int32'))
            left = max(0, np.floor(left).astype('int32'))
            bottom = min(image.size[1], np.floor(bottom).astype('int32'))
            right = min(image.size[0], np.floor(right).astype('int32'))

            label = '{} {:.2f}'.format(predicted_class, score)
            draw = ImageDraw.Draw(image)
            # NOTE: ImageDraw.textsize was removed in Pillow 10; newer Pillow offers draw.textbbox instead.
            label_size = draw.textsize(label, font)
            label = label.encode('utf-8')
            print(label, top, left, bottom, right)

            if top - label_size[1] >= 0:
                text_origin = np.array([left, top - label_size[1]])
            else:
                text_origin = np.array([left, top + 1])

            for i in range(thickness):
                draw.rectangle([left + i, top + i, right - i, bottom - i], outline=self.colors[c])
            draw.rectangle([tuple(text_origin), tuple(text_origin + label_size)], fill=self.colors[c])
            draw.text(text_origin, str(label,'UTF-8'), fill=(0, 0, 0), font=font)
            del draw

        return image

    def get_FPS(self, image, test_interval):
        image_shape = np.array(np.shape(image)[0:2])
        #---------------------------------------------------------#
        #   Convert the image to RGB here so that grayscale images
        #   do not raise errors during prediction.
        #   The code only supports RGB prediction; every other
        #   image type is converted to RGB.
        #---------------------------------------------------------#
        image = cvtColor(image)
        #---------------------------------------------------------#
        #   Pad the image with gray bars for a distortion-free resize.
        #   A plain resize can also be used for detection.
        #---------------------------------------------------------#
        image_data = resize_image(image, (self.input_shape[1],self.input_shape[0]), self.letterbox_image)
        #---------------------------------------------------------#
        #   Add the batch_size dimension
        #---------------------------------------------------------#
        image_data = np.expand_dims(np.transpose(preprocess_input(np.array(image_data, dtype='float32')), (2, 0, 1)), 0)

        # One warm-up pass that is not included in the timing
        with torch.no_grad():
            images = torch.from_numpy(image_data)
            if self.cuda:
                images = images.cuda()
            #---------------------------------------------------------#
            #   Feed the image through the network for prediction
            #---------------------------------------------------------#
            outputs = self.net(images)
            outputs = self.bbox_util.decode_box(outputs)
            #---------------------------------------------------------#
            #   Concatenate the predicted boxes, then apply non-maximum suppression
            #---------------------------------------------------------#
            results = self.bbox_util.non_max_suppression(torch.cat(outputs, 1), self.num_classes, self.input_shape,
                        image_shape, self.letterbox_image, conf_thres=self.confidence, nms_thres=self.nms_iou)

        t1 = time.time()
        for _ in range(test_interval):
            with torch.no_grad():
                #---------------------------------------------------------#
                #   Feed the image through the network for prediction
                #---------------------------------------------------------#
                outputs = self.net(images)
                outputs = self.bbox_util.decode_box(outputs)
                #---------------------------------------------------------#
                #   Concatenate the predicted boxes, then apply non-maximum suppression
                #---------------------------------------------------------#
                results = self.bbox_util.non_max_suppression(torch.cat(outputs, 1), self.num_classes, self.input_shape,
                            image_shape, self.letterbox_image, conf_thres=self.confidence, nms_thres=self.nms_iou)

        t2 = time.time()
        # Average time per forward pass plus decoding and NMS, in seconds
        tact_time = (t2 - t1) / test_interval
        return tact_time

    def detect_heatmap(self, image, heatmap_save_path):
        import cv2
        import matplotlib.pyplot as plt
        def sigmoid(x):
            y = 1.0 / (1.0 + np.exp(-x))
            return y
        #---------------------------------------------------------#
        #   Convert the image to RGB here so that grayscale images
        #   do not raise errors during prediction.
        #   The code only supports RGB prediction; every other
        #   image type is converted to RGB.
        #---------------------------------------------------------#
        image = cvtColor(image)
        #---------------------------------------------------------#
        #   Pad the image with gray bars for a distortion-free resize.
        #   A plain resize can also be used for detection.
        #---------------------------------------------------------#
        image_data = resize_image(image, (self.input_shape[1],self.input_shape[0]), self.letterbox_image)
        #---------------------------------------------------------#
        #   Add the batch_size dimension
        #---------------------------------------------------------#
        image_data = np.expand_dims(np.transpose(preprocess_input(np.array(image_data, dtype='float32')), (2, 0, 1)), 0)

        with torch.no_grad():
            images = torch.from_numpy(image_data)
            if self.cuda:
                images = images.cuda()
            #---------------------------------------------------------#
            #   Feed the image through the network for prediction
            #---------------------------------------------------------#
            outputs = self.net(images)

        plt.imshow(image, alpha=1)
        plt.axis('off')
        mask    = np.zeros((image.size[1], image.size[0]))
        for sub_output in outputs:
            sub_output = sub_output.cpu().numpy()
            b, c, h, w = np.shape(sub_output)
            # Split the channel axis into 3 anchors, move it last, and take the first image: (h, w, 3, channels_per_anchor).
            sub_output = np.transpose(np.reshape(sub_output, [b, 3, -1, h, w]), [0, 3, 4, 1, 2])[0]
            # Index 4 is the objectness logit; keep the highest score across the 3 anchors.
            score      = np.max(sigmoid(sub_output[..., 4]), -1)
            score      = cv2.resize(score, (image.size[0], image.size[1]))
            normed_score    = (score * 255).astype('uint8')
            mask            = np.maximum(mask, normed_score)

        plt.imshow(mask, alpha=0.5, interpolation='nearest', cmap="jet")

        plt.axis('off')
        plt.subplots_adjust(top=1, bottom=0, right=1, left=0, hspace=0, wspace=0)
        plt.margins(0, 0)
        plt.savefig(heatmap_save_path, dpi=200, bbox_inches='tight', pad_inches = -0.1)
        print("Save to the " + heatmap_save_path)
        plt.show()

    def convert_to_onnx(self, simplify, model_path):
        import onnx
        self.generate(onnx=True)

        im                  = torch.zeros(1, 3, *self.input_shape).to('cpu')  # dummy input, shape (1, 3, H, W) = BCHW, with H, W taken from self.input_shape
        input_layer_names   = ["images"]
        output_layer_names  = ["output"]

        # Export the model
        print(f'Starting export with onnx {onnx.__version__}.')
        torch.onnx.export(self.net,
                        im,
                        f               = model_path,
                        verbose         = False,
                        opset_version   = 12,
                        training        = torch.onnx.TrainingMode.EVAL,
                        do_constant_folding = True,
                        input_names     = input_layer_names,
                        output_names    = output_layer_names,
                        dynamic_axes    = None)

        # Checks
        model_onnx = onnx.load(model_path)  # load onnx model
        onnx.checker.check_model(model_onnx)  # check onnx model

        # Simplify onnx
        if simplify:
            import onnxsim
            print(f'Simplifying with onnx-simplifier {onnxsim.__version__}.')
            model_onnx, check = onnxsim.simplify(
                model_onnx,
                dynamic_input_shape=False,
                input_shapes=None)
            assert check, 'assert check failed'
            onnx.save(model_onnx, model_path)

        print('Onnx model save as {}'.format(model_path))

    def get_map_txt(self, image_id, image, class_names, map_out_path):
        f = open(os.path.join(map_out_path,
            "detection-results/"+image_id+".txt"),"w")
        image_shape = np.array(np.shape(image)[0:2])
        #---------------------------------------------------------#
        #   Convert the image to RGB here so that grayscale images
        #   do not raise an error during prediction.
        #   The code only supports prediction on RGB images; every
        #   other image type is converted to RGB.
        #---------------------------------------------------------#
        image       = cvtColor(image)
        #---------------------------------------------------------#
        #   Add gray bars to the image for a distortion-free resize.
        #   A direct resize can also be used for detection.
        #---------------------------------------------------------#
        image_data  = resize_image(image, (self.input_shape[1],self.input_shape[0]), self.letterbox_image)
        #---------------------------------------------------------#
        #   Add the batch_size dimension.
        #---------------------------------------------------------#
        image_data  = np.expand_dims(np.transpose(preprocess_input(np.array(image_data, dtype='float32')), (2, 0, 1)), 0)

        with torch.no_grad():
            images = torch.from_numpy(image_data)
            if self.cuda:
                images = images.cuda()
            #---------------------------------------------------------#
            #   Feed the image into the network for prediction.
            #---------------------------------------------------------#
            outputs = self.net(images)
            outputs = self.bbox_util.decode_box(outputs)
            #---------------------------------------------------------#
            #   Stack the predicted boxes, then apply non-maximum suppression.
            #---------------------------------------------------------#
            results = self.bbox_util.non_max_suppression(torch.cat(outputs, 1), self.num_classes, self.input_shape,
                        image_shape, self.letterbox_image, conf_thres = self.confidence, nms_thres = self.nms_iou)

            if results[0] is None:
                # Close the results file before the early return so the handle is not leaked.
                f.close()
                return

            top_label   = np.array(results[0][:, 6], dtype = 'int32')
            top_conf    = results[0][:, 4] * results[0][:, 5]
            top_boxes   = results[0][:, :4]

        for i, c in list(enumerate(top_label)):
            predicted_class = self.class_names[int(c)]
            box             = top_boxes[i]
            score           = str(top_conf[i])

            top, left, bottom, right = box
            if predicted_class not in class_names:
                continue

            f.write("%s %s %s %s %s %s\n" % (predicted_class, score[:6], str(int(left)), str(int(top)), str(int(right)),str(int(bottom))))

        f.close()
        return
================================================
FILE: yolo_anchors.txt
================================================
105,107, 118,136, 152,122, 114,165, 139,151, 160,156, 152,185, 181,167, 192,197

================================================
FILE: yolov4-gesture-tutorial.ipynb
================================================
{"cells":[{"cell_type":"markdown","metadata":{"id":"9MEPFVpX4mRS"},"source":["# Mount Drive (only needed when using Colab)"]},{"cell_type":"code","execution_count":1,"metadata":{"colab":{"base_uri":"https://localhost:8080/"},"executionInfo":{"elapsed":34099,"status":"ok","timestamp":1651105771374,"user":{"displayName":"KaiJun Deng","userId":"04642544504944131029"},"user_tz":-480},"id":"TjAeId9H0zl7","outputId":"32dd1d61-331c-46b4-a190-cef10340e732"},"outputs":[{"output_type":"stream","name":"stdout","text":["/content\n","Thu Apr 28 00:28:57 2022 \n","+-----------------------------------------------------------------------------+\n","| NVIDIA-SMI 460.32.03 Driver Version: 460.32.03 CUDA Version: 11.2 |\n","|-------------------------------+----------------------+----------------------+\n","| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |\n","| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |\n","| | | MIG M. 
|\n","|===============================+======================+======================|\n","| 0 Tesla K80 Off | 00000000:00:04.0 Off | 0 |\n","| N/A 33C P8 29W / 149W | 0MiB / 11441MiB | 0% Default |\n","| | | N/A |\n","+-------------------------------+----------------------+----------------------+\n"," \n","+-----------------------------------------------------------------------------+\n","| Processes: |\n","| GPU GI CI PID Type Process name GPU Memory |\n","| ID ID Usage |\n","|=============================================================================|\n","| No running processes found |\n","+-----------------------------------------------------------------------------+\n","Mounted at /content/gdrive\n"]}],"source":["%cd /content\n","!nvidia-smi\n","from google.colab import drive\n","drive.mount('/content/gdrive')"]},{"cell_type":"code","execution_count":24,"metadata":{"colab":{"base_uri":"https://localhost:8080/"},"executionInfo":{"elapsed":30262,"status":"ok","timestamp":1651106445511,"user":{"displayName":"KaiJun Deng","userId":"04642544504944131029"},"user_tz":-480},"id":"1qANLkv-F8Ho","outputId":"13ee99dc-9e24-4ab3-8146-1a70322cc750"},"outputs":[{"output_type":"stream","name":"stdout","text":["/content\n","Cloning into 'College-Students-Innovative-Entrepreneurial-Training-Plan-Program'...\n","warning: redirecting to https://github.com/Dreaming-future/College-Students-Innovative-Entrepreneurial-Training-Plan-Program/\n","remote: Enumerating objects: 7027, done.\u001b[K\n","remote: Counting objects: 100% (12/12), done.\u001b[K\n","remote: Compressing objects: 100% (9/9), done.\u001b[K\n","remote: Total 7027 (delta 6), reused 9 (delta 3), pack-reused 7015\u001b[K\n","Receiving objects: 100% (7027/7027), 326.24 MiB | 12.62 MiB/s, done.\n","Resolving deltas: 100% (3882/3882), done.\n","Checking out files: 100% (4448/4448), done.\n","/content/College-Students-Innovative-Entrepreneurial-Training-Plan-Program/yolov4-gesture\n"]}],"source":["# git clone 
the code for use in Colab\n","%cd /content\n","!rm -rf /content/College-Students-Innovative-Entrepreneurial-Training-Plan-Program\n","!git clone http://project:REDACTED_TOKEN@github.com/Dreaming-future/College-Students-Innovative-Entrepreneurial-Training-Plan-Program\n","%cd College-Students-Innovative-Entrepreneurial-Training-Plan-Program/yolov4-gesture"]},{"cell_type":"markdown","metadata":{"id":"WZwtRt1N4uEP"},"source":["# Link Drive (only needed when using Colab)"]},{"cell_type":"code","execution_count":25,"metadata":{"executionInfo":{"elapsed":664,"status":"ok","timestamp":1651106446164,"user":{"displayName":"KaiJun Deng","userId":"04642544504944131029"},"user_tz":-480},"id":"LErdhCFizPuU"},"outputs":[],"source":["!rm -rf logs\n","# Create a symlink to Drive\n","!ln -s /content/gdrive/MyDrive/weights/logs logs\n","# # Copy the weights file \n","# !cp /content/gdrive/MyDrive/weights/yolo4_gesture_weightsv3.pth \\\n","# /content/College-Students-Innovative-Entrepreneurial-Training-Plan-Program/yolov4-gesture/model_data/yolo4_gesture_weights.pth"]},{"cell_type":"markdown","metadata":{"id":"iLMsxSTSGUFi"},"source":["# Preprocess the dataset"]},{"cell_type":"code","execution_count":3,"metadata":{"colab":{"base_uri":"https://localhost:8080/"},"executionInfo":{"elapsed":64099,"status":"ok","timestamp":1651105859081,"user":{"displayName":"KaiJun Deng","userId":"04642544504944131029"},"user_tz":-480},"id":"l0LFj3WTyaKt","outputId":"2d011aa2-668a-4e18-8cad-94db14edfc98"},"outputs":[{"output_type":"stream","name":"stdout","text":["Requirement already satisfied: scipy in /usr/local/lib/python3.7/dist-packages (from -r requirements.txt (line 1)) (1.4.1)\n","Requirement already satisfied: numpy in /usr/local/lib/python3.7/dist-packages (from -r requirements.txt (line 2)) (1.21.6)\n","Requirement already satisfied: matplotlib in /usr/local/lib/python3.7/dist-packages (from -r requirements.txt (line 3)) (3.2.2)\n","Requirement already satisfied: opencv_python in /usr/local/lib/python3.7/dist-packages (from -r 
requirements.txt (line 4)) (4.1.2.30)\n","Collecting torch==1.8.1\n"," Downloading torch-1.8.1-cp37-cp37m-manylinux1_x86_64.whl (804.1 MB)\n","\u001b[K |████████████████████████████████| 804.1 MB 2.7 kB/s \n","\u001b[?25hCollecting torchvision==0.9.1\n"," Downloading torchvision-0.9.1-cp37-cp37m-manylinux1_x86_64.whl (17.4 MB)\n","\u001b[K |████████████████████████████████| 17.4 MB 545 kB/s \n","\u001b[?25hCollecting tqdm==4.60.0\n"," Downloading tqdm-4.60.0-py2.py3-none-any.whl (75 kB)\n","\u001b[K |████████████████████████████████| 75 kB 4.2 MB/s \n","\u001b[?25hCollecting Pillow==8.2.0\n"," Downloading Pillow-8.2.0-cp37-cp37m-manylinux1_x86_64.whl (3.0 MB)\n","\u001b[K |████████████████████████████████| 3.0 MB 36.0 MB/s \n","\u001b[?25hCollecting h5py==2.10.0\n"," Downloading h5py-2.10.0-cp37-cp37m-manylinux1_x86_64.whl (2.9 MB)\n","\u001b[K |████████████████████████████████| 2.9 MB 38.9 MB/s \n","\u001b[?25hRequirement already satisfied: tensorboard in /usr/local/lib/python3.7/dist-packages (from -r requirements.txt (line 10)) (2.8.0)\n","Collecting pyyaml==6.0\n"," Downloading PyYAML-6.0-cp37-cp37m-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_12_x86_64.manylinux2010_x86_64.whl (596 kB)\n","\u001b[K |████████████████████████████████| 596 kB 55.7 MB/s \n","\u001b[31mERROR: Could not find a version that satisfies the requirement tochinfo (from versions: none)\u001b[0m\n","\u001b[31mERROR: No matching distribution found for tochinfo\u001b[0m\n","\u001b[?25hCollecting tqdm==4.60.0\n"," Using cached tqdm-4.60.0-py2.py3-none-any.whl (75 kB)\n","Collecting h5py==2.10.0\n"," Using cached h5py-2.10.0-cp37-cp37m-manylinux1_x86_64.whl (2.9 MB)\n","Collecting pyyaml==6.0\n"," Using cached PyYAML-6.0-cp37-cp37m-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_12_x86_64.manylinux2010_x86_64.whl (596 kB)\n","Requirement already satisfied: six in /usr/local/lib/python3.7/dist-packages (from h5py==2.10.0) (1.15.0)\n","Requirement already satisfied: numpy>=1.7 in 
/usr/local/lib/python3.7/dist-packages (from h5py==2.10.0) (1.21.6)\n","Installing collected packages: tqdm, pyyaml, h5py\n"," Attempting uninstall: tqdm\n"," Found existing installation: tqdm 4.64.0\n"," Uninstalling tqdm-4.64.0:\n"," Successfully uninstalled tqdm-4.64.0\n"," Attempting uninstall: pyyaml\n"," Found existing installation: PyYAML 3.13\n"," Uninstalling PyYAML-3.13:\n"," Successfully uninstalled PyYAML-3.13\n"," Attempting uninstall: h5py\n"," Found existing installation: h5py 3.1.0\n"," Uninstalling h5py-3.1.0:\n"," Successfully uninstalled h5py-3.1.0\n","\u001b[31mERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.\n","tensorflow 2.8.0 requires tf-estimator-nightly==2.8.0.dev2021122109, which is not installed.\u001b[0m\n","Successfully installed h5py-2.10.0 pyyaml-6.0 tqdm-4.60.0\n"]}],"source":["# Install packages with pip\n","!pip install -r requirements.txt\n","!pip install tqdm==4.60.0 h5py==2.10.0 pyyaml==6.0"]},{"cell_type":"code","execution_count":26,"metadata":{"colab":{"base_uri":"https://localhost:8080/"},"executionInfo":{"elapsed":4,"status":"ok","timestamp":1651106446164,"user":{"displayName":"KaiJun Deng","userId":"04642544504944131029"},"user_tz":-480},"id":"e5j2Q9-zxbuN","outputId":"94f411c1-8256-4a02-93c3-9c095e941935"},"outputs":[{"output_type":"stream","name":"stdout","text":["Generate txt in ImageSets.\n","train and val size 1601\n","train size 1440\n","Generate txt in ImageSets done.\n","Generate gesture_train.txt and 2007_val.txt for train.\n","Generate gesture_train.txt and gesture_val.txt for train done.\n"]}],"source":["!python voc_annotation.py"]},{"cell_type":"code","execution_count":null,"metadata":{"id":"vX7FkUmiOUQR"},"outputs":[],"source":["# # Copy the latest weights file to use as the backbone\n","# !cp /content/gdrive/MyDrive/模型权重文件/logs/ep050-loss0.077-val_loss0.052.pth \\\n","# 
/content/College-Students-Innovative-Entrepreneurial-Training-Plan-Program/yolov4-gesture/model_data/yolo4_gesture_weight.pth"]},{"cell_type":"code","execution_count":27,"metadata":{"colab":{"base_uri":"https://localhost:8080/"},"executionInfo":{"elapsed":2408,"status":"ok","timestamp":1651106448570,"user":{"displayName":"KaiJun Deng","userId":"04642544504944131029"},"user_tz":-480},"id":"zKn6VU7gnaeU","outputId":"0d37fb42-36ac-4db4-d35a-407391460865"},"outputs":[{"output_type":"stream","name":"stdout","text":["Load xmls.\n","\r 0% 0/1601 [00:00\n","Save kmeans_for_anchors.jpg in root dir.\n","avg_ratio:0.89\n","[[105.91304348 105.77777778]\n"," [ 97.60893855 140.28282828]\n"," [128.99895507 134.10934744]\n"," [159.6 122.79069767]\n"," [121.625 161.66037736]\n"," [147.69230769 153.6 ]\n"," [155.27272727 185.21212121]\n"," [178.21422887 162.90909091]\n"," [192.22695035 196. ]]\n"]}],"source":["!python kmeans_for_anchors.py"]},{"cell_type":"markdown","metadata":{"id":"Y-SAO6KltlfF"},"source":["# Train on the data"]},{"cell_type":"markdown","source":["# yolov4 version"],"metadata":{"id":"efuI36jfn2fL"}},{"cell_type":"code","source":["!wget -nc https://github.com/bubbliiiing/yolov4-pytorch/releases/download/v1.0/yolo4_weights.pth -O model_data/yolo4_weights.pth"],"metadata":{"colab":{"base_uri":"https://localhost:8080/"},"id":"1K3isfpZn6Kw","executionInfo":{"status":"ok","timestamp":1651112763279,"user_tz":-480,"elapsed":59782,"user":{"displayName":"KaiJun Deng","userId":"04642544504944131029"}},"outputId":"23141ca8-580f-4b86-d232-495f277003c2"},"execution_count":33,"outputs":[{"output_type":"stream","name":"stdout","text":["--2022-04-28 02:25:04-- https://github.com/bubbliiiing/yolov4-pytorch/releases/download/v1.0/yolo4_weights.pth\n","Resolving github.com (github.com)... 13.114.40.48\n","Connecting to github.com (github.com)|13.114.40.48|:443... connected.\n","HTTP request sent, awaiting response... 
302 Found\n","Location: https://objects.githubusercontent.com/github-production-release-asset-2e65be/266480370/07efad80-654c-11eb-9fc4-8055eb471ae0?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAIWNJYAX4CSVEH53A%2F20220428%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20220428T022504Z&X-Amz-Expires=300&X-Amz-Signature=c60bad9aed15ce878abe1d659e0ff80078d0064b64d4ff1d470fac80845e84d0&X-Amz-SignedHeaders=host&actor_id=0&key_id=0&repo_id=266480370&response-content-disposition=attachment%3B%20filename%3Dyolo4_weights.pth&response-content-type=application%2Foctet-stream [following]\n","--2022-04-28 02:25:04-- https://objects.githubusercontent.com/github-production-release-asset-2e65be/266480370/07efad80-654c-11eb-9fc4-8055eb471ae0?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAIWNJYAX4CSVEH53A%2F20220428%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20220428T022504Z&X-Amz-Expires=300&X-Amz-Signature=c60bad9aed15ce878abe1d659e0ff80078d0064b64d4ff1d470fac80845e84d0&X-Amz-SignedHeaders=host&actor_id=0&key_id=0&repo_id=266480370&response-content-disposition=attachment%3B%20filename%3Dyolo4_weights.pth&response-content-type=application%2Foctet-stream\n","Resolving objects.githubusercontent.com (objects.githubusercontent.com)... 185.199.108.133, 185.199.109.133, 185.199.110.133, ...\n","Connecting to objects.githubusercontent.com (objects.githubusercontent.com)|185.199.108.133|:443... connected.\n","HTTP request sent, awaiting response... 
200 OK\n","Length: 257863693 (246M) [application/octet-stream]\n","Saving to: ‘model_data/yolo4_weights.pth’\n","\n","model_data/yolo4_we 100%[===================>] 245.92M 3.96MB/s in 57s \n","\n","2022-04-28 02:26:03 (4.31 MB/s) - ‘model_data/yolo4_weights.pth’ saved [257863693/257863693]\n","\n"]}]},{"cell_type":"code","source":["# Train on the data; the default is 100 epochs\n","!python train.py --epochs 100 \\\n"," --weights model_data/yolo4_weights.pth \\\n"," --freeze --freeze-epochs 50 --freeze-size 32 \\\n"," --batch-size 10 --shape 416 \\\n"," --fp16 --cuda"],"metadata":{"colab":{"base_uri":"https://localhost:8080/"},"id":"lzgTnAW7n8ig","outputId":"9f6f2374-3573-430b-a1b1-55af6ad997d8"},"execution_count":null,"outputs":[{"output_type":"stream","name":"stdout","text":["Namespace(batch_size=10, cuda=True, epochs=100, fp16=True, freeze=True, freeze_epochs=50, freeze_size=32, init=0, lr=0.02, momentum=0.937, mosaic=False, optimizer='adam', phi=0, save_period=4, shape=416, tiny=False, weight_decay=0, weights='model_data/yolo4_weights.pth')\n","initialize network with normal type\n","Load weights model_data/yolo4_weights.pth.\n","Start Train\n","Epoch 1/100: 100% 45/45 [01:55<00:00, 2.56s/it, loss=14.7, lr=0.0001]\n","Finish Train\n","Start Validation\n","Epoch 1/100: 100% 5/5 [00:17<00:00, 3.42s/it, val_loss=10.6]\n","Finish Validation\n","Epoch:1/100\n","Total Loss: 14.700 || Val Loss: 10.551 \n","Start Train\n","Epoch 2/100: 100% 45/45 [01:51<00:00, 2.47s/it, loss=7.37, lr=0.0002]\n","Finish Train\n","Start Validation\n","Epoch 2/100: 100% 5/5 [00:12<00:00, 2.54s/it, val_loss=4.12]\n","Finish Validation\n","Epoch:2/100\n","Total Loss: 7.371 || Val Loss: 4.121 \n","Start Train\n","Epoch 3/100: 100% 45/45 [01:49<00:00, 2.44s/it, loss=1.99, lr=0.0005]\n","Finish Train\n","Start Validation\n","Epoch 3/100: 100% 5/5 [00:12<00:00, 2.43s/it, val_loss=0.791]\n","Finish Validation\n","Epoch:3/100\n","Total Loss: 1.995 || Val Loss: 0.791 \n","Start Train\n","Epoch 4/100: 100% 45/45 
[01:49<00:00, 2.42s/it, loss=0.523, lr=0.001]\n","Finish Train\n","Start Validation\n","Epoch 4/100: 100% 5/5 [00:11<00:00, 2.37s/it, val_loss=0.281]\n","Finish Validation\n","Epoch:4/100\n","Total Loss: 0.523 || Val Loss: 0.281 \n","Start Train\n","Epoch 5/100: 100% 45/45 [01:51<00:00, 2.48s/it, loss=0.268, lr=0.001]\n","Finish Train\n","Start Validation\n","Epoch 5/100: 100% 5/5 [00:11<00:00, 2.36s/it, val_loss=0.176]\n","Finish Validation\n","Epoch:5/100\n","Total Loss: 0.268 || Val Loss: 0.176 \n","Start Train\n","Epoch 6/100: 100% 45/45 [01:48<00:00, 2.42s/it, loss=0.209, lr=0.000999]\n","Finish Train\n","Start Validation\n","Epoch 6/100: 100% 5/5 [00:12<00:00, 2.42s/it, val_loss=0.132]\n","Finish Validation\n","Epoch:6/100\n","Total Loss: 0.209 || Val Loss: 0.132 \n","Start Train\n","Epoch 7/100: 100% 45/45 [01:50<00:00, 2.47s/it, loss=0.179, lr=0.000997]\n","Finish Train\n","Start Validation\n","Epoch 7/100: 100% 5/5 [00:12<00:00, 2.53s/it, val_loss=0.108]\n","Finish Validation\n","Epoch:7/100\n","Total Loss: 0.179 || Val Loss: 0.108 \n","Start Train\n","Epoch 8/100: 100% 45/45 [01:50<00:00, 2.45s/it, loss=0.158, lr=0.000995]\n","Finish Train\n","Start Validation\n","Epoch 8/100: 100% 5/5 [00:12<00:00, 2.52s/it, val_loss=0.0953]\n","Finish Validation\n","Epoch:8/100\n","Total Loss: 0.158 || Val Loss: 0.095 \n","Start Train\n","Epoch 9/100: 100% 45/45 [01:51<00:00, 2.47s/it, loss=0.139, lr=0.000993]\n","Finish Train\n","Start Validation\n","Epoch 9/100: 100% 5/5 [00:11<00:00, 2.37s/it, val_loss=0.0818]\n","Finish Validation\n","Epoch:9/100\n","Total Loss: 0.139 || Val Loss: 0.082 \n","Start Train\n","Epoch 10/100: 100% 45/45 [01:51<00:00, 2.47s/it, loss=0.129, lr=0.00099]\n","Finish Train\n","Start Validation\n","Epoch 10/100: 100% 5/5 [00:12<00:00, 2.57s/it, val_loss=0.0734]\n","Finish Validation\n","Epoch:10/100\n","Total Loss: 0.129 || Val Loss: 0.073 \n","Start Train\n","Epoch 11/100: 100% 45/45 [01:49<00:00, 2.43s/it, loss=0.126, lr=0.000986]\n","Finish 
Train\n","Start Validation\n","Epoch 11/100: 100% 5/5 [00:11<00:00, 2.35s/it, val_loss=0.0664]\n","Finish Validation\n","Epoch:11/100\n","Total Loss: 0.126 || Val Loss: 0.066 \n","Start Train\n","Epoch 12/100: 100% 45/45 [01:48<00:00, 2.41s/it, loss=0.115, lr=0.000982]\n","Finish Train\n","Start Validation\n","Epoch 12/100: 100% 5/5 [00:12<00:00, 2.43s/it, val_loss=0.063]\n","Finish Validation\n","Epoch:12/100\n","Total Loss: 0.115 || Val Loss: 0.063 \n","Start Train\n","Epoch 13/100: 100% 45/45 [01:49<00:00, 2.44s/it, loss=0.105, lr=0.000977]\n","Finish Train\n","Start Validation\n","Epoch 13/100: 100% 5/5 [00:11<00:00, 2.30s/it, val_loss=0.0522]\n","Finish Validation\n","Epoch:13/100\n","Total Loss: 0.105 || Val Loss: 0.052 \n","Start Train\n","Epoch 14/100: 100% 45/45 [01:49<00:00, 2.44s/it, loss=0.1, lr=0.000971]\n","Finish Train\n","Start Validation\n","Epoch 14/100: 100% 5/5 [00:11<00:00, 2.28s/it, val_loss=0.0489]\n","Finish Validation\n","Epoch:14/100\n","Total Loss: 0.100 || Val Loss: 0.049 \n","Start Train\n","Epoch 15/100: 100% 45/45 [01:49<00:00, 2.43s/it, loss=0.0965, lr=0.000965]\n","Finish Train\n","Start Validation\n","Epoch 15/100: 100% 5/5 [00:12<00:00, 2.53s/it, val_loss=0.0468]\n","Finish Validation\n","Epoch:15/100\n","Total Loss: 0.097 || Val Loss: 0.047 \n","Start Train\n","Epoch 16/100: 100% 45/45 [01:47<00:00, 2.40s/it, loss=0.0992, lr=0.000959]\n","Finish Train\n","Start Validation\n","Epoch 16/100: 100% 5/5 [00:12<00:00, 2.43s/it, val_loss=0.045]\n","Finish Validation\n","Epoch:16/100\n","Total Loss: 0.099 || Val Loss: 0.045 \n","Start Train\n","Epoch 17/100: 100% 45/45 [01:49<00:00, 2.44s/it, loss=0.0935, lr=0.000952]\n","Finish Train\n","Start Validation\n","Epoch 17/100: 100% 5/5 [00:12<00:00, 2.46s/it, val_loss=0.0408]\n","Finish Validation\n","Epoch:17/100\n","Total Loss: 0.094 || Val Loss: 0.041 \n","Start Train\n","Epoch 18/100: 100% 45/45 [01:49<00:00, 2.43s/it, loss=0.0881, lr=0.000945]\n","Finish Train\n","Start 
Validation\n","Epoch 18/100: 100% 5/5 [00:11<00:00, 2.36s/it, val_loss=0.0389]\n","Finish Validation\n","Epoch:18/100\n","Total Loss: 0.088 || Val Loss: 0.039 \n","Start Train\n","Epoch 19/100: 100% 45/45 [01:48<00:00, 2.42s/it, loss=0.0884, lr=0.000936]\n","Finish Train\n","Start Validation\n","Epoch 19/100: 100% 5/5 [00:11<00:00, 2.25s/it, val_loss=0.04]\n","Finish Validation\n","Epoch:19/100\n","Total Loss: 0.088 || Val Loss: 0.040 \n","Start Train\n","Epoch 20/100: 100% 45/45 [01:49<00:00, 2.44s/it, loss=0.077, lr=0.000928]\n","Finish Train\n","Start Validation\n","Epoch 20/100: 100% 5/5 [00:12<00:00, 2.46s/it, val_loss=0.0346]\n","Finish Validation\n","Epoch:20/100\n","Total Loss: 0.077 || Val Loss: 0.035 \n","Start Train\n","Epoch 21/100: 100% 45/45 [01:48<00:00, 2.41s/it, loss=0.0783, lr=0.000919]\n","Finish Train\n","Start Validation\n","Epoch 21/100: 100% 5/5 [00:12<00:00, 2.44s/it, val_loss=0.0323]\n","Finish Validation\n","Epoch:21/100\n","Total Loss: 0.078 || Val Loss: 0.032 \n","Start Train\n","Epoch 22/100: 100% 45/45 [01:48<00:00, 2.41s/it, loss=0.0769, lr=0.000909]\n","Finish Train\n","Start Validation\n","Epoch 22/100: 100% 5/5 [00:11<00:00, 2.37s/it, val_loss=0.0304]\n","Finish Validation\n","Epoch:22/100\n","Total Loss: 0.077 || Val Loss: 0.030 \n","Start Train\n","Epoch 23/100: 100% 45/45 [01:47<00:00, 2.40s/it, loss=0.0735, lr=0.000899]\n","Finish Train\n","Start Validation\n","Epoch 23/100: 100% 5/5 [00:12<00:00, 2.43s/it, val_loss=0.0338]\n","Finish Validation\n","Epoch:23/100\n","Total Loss: 0.073 || Val Loss: 0.034 \n","Start Train\n","Epoch 24/100: 100% 45/45 [01:48<00:00, 2.41s/it, loss=0.0718, lr=0.000889]\n","Finish Train\n","Start Validation\n","Epoch 24/100: 100% 5/5 [00:12<00:00, 2.44s/it, val_loss=0.0296]\n","Finish Validation\n","Epoch:24/100\n","Total Loss: 0.072 || Val Loss: 0.030 \n","Start Train\n","Epoch 25/100: 100% 45/45 [01:48<00:00, 2.42s/it, loss=0.0707, lr=0.000878]\n","Finish Train\n","Start Validation\n","Epoch 25/100: 
100% 5/5 [00:12<00:00, 2.47s/it, val_loss=0.0255]\n","Finish Validation\n","Epoch:25/100\n","Total Loss: 0.071 || Val Loss: 0.026 \n","Start Train\n","Epoch 26/100: 100% 45/45 [01:47<00:00, 2.39s/it, loss=0.0666, lr=0.000867]\n","Finish Train\n","Start Validation\n","Epoch 26/100: 100% 5/5 [00:12<00:00, 2.45s/it, val_loss=0.0256]\n","Finish Validation\n","Epoch:26/100\n","Total Loss: 0.067 || Val Loss: 0.026 \n","Start Train\n","Epoch 27/100: 100% 45/45 [01:49<00:00, 2.43s/it, loss=0.0691, lr=0.000855]\n","Finish Train\n","Start Validation\n","Epoch 27/100: 100% 5/5 [00:12<00:00, 2.43s/it, val_loss=0.0271]\n","Finish Validation\n","Epoch:27/100\n","Total Loss: 0.069 || Val Loss: 0.027 \n","Start Train\n","Epoch 28/100: 100% 45/45 [01:48<00:00, 2.41s/it, loss=0.0636, lr=0.000843]\n","Finish Train\n","Start Validation\n","Epoch 28/100: 100% 5/5 [00:11<00:00, 2.40s/it, val_loss=0.0259]\n","Finish Validation\n","Epoch:28/100\n","Total Loss: 0.064 || Val Loss: 0.026 \n","Start Train\n","Epoch 29/100: 100% 45/45 [01:50<00:00, 2.46s/it, loss=0.066, lr=0.00083]\n","Finish Train\n","Start Validation\n","Epoch 29/100: 100% 5/5 [00:11<00:00, 2.27s/it, val_loss=0.0231]\n","Finish Validation\n","Epoch:29/100\n","Total Loss: 0.066 || Val Loss: 0.023 \n","Start Train\n","Epoch 30/100: 100% 45/45 [01:48<00:00, 2.42s/it, loss=0.0672, lr=0.000817]\n","Finish Train\n","Start Validation\n","Epoch 30/100: 100% 5/5 [00:12<00:00, 2.45s/it, val_loss=0.0327]\n","Finish Validation\n","Epoch:30/100\n","Total Loss: 0.067 || Val Loss: 0.033 \n","Start Train\n","Epoch 31/100: 100% 45/45 [01:49<00:00, 2.43s/it, loss=0.0605, lr=0.000804]\n","Finish Train\n","Start Validation\n","Epoch 31/100: 100% 5/5 [00:11<00:00, 2.26s/it, val_loss=0.0269]\n","Finish Validation\n","Epoch:31/100\n","Total Loss: 0.061 || Val Loss: 0.027 \n","Start Train\n","Epoch 32/100: 100% 45/45 [01:51<00:00, 2.47s/it, loss=0.0571, lr=0.00079]\n","Finish Train\n","Start Validation\n","Epoch 32/100: 100% 5/5 [00:11<00:00, 
2.30s/it, val_loss=0.0266]\n","Finish Validation\n","Epoch:32/100\n","Total Loss: 0.057 || Val Loss: 0.027 \n","Start Train\n","Epoch 33/100: 33% 15/45 [00:40<01:09, 2.30s/it, loss=0.063, lr=0.000776]"]}]},{"cell_type":"markdown","source":["## yolov4-tiny version"],"metadata":{"id":"6YU6lPwJnfPT"}},{"cell_type":"markdown","source":["### CSPdarknet53-tiny without an attention mechanism"],"metadata":{"id":"Zwlh5OyelE4n"}},{"cell_type":"code","source":["!wget -nc https://github.91chi.fun/https://github.com//bubbliiiing/yolov4-tiny-pytorch/releases/download/v1.0/yolov4_tiny_weights_coco.pth -O model_data/yolov4_tiny_weights_coco.pth"],"metadata":{"colab":{"base_uri":"https://localhost:8080/"},"id":"tFSRGVZBlMd5","executionInfo":{"status":"ok","timestamp":1651106450640,"user_tz":-480,"elapsed":2072,"user":{"displayName":"KaiJun Deng","userId":"04642544504944131029"}},"outputId":"ec00320e-6ddf-4a98-f1d6-e70449e90da1"},"execution_count":28,"outputs":[{"output_type":"stream","name":"stdout","text":["--2022-04-28 00:40:48-- https://github.91chi.fun/https://github.com//bubbliiiing/yolov4-tiny-pytorch/releases/download/v1.0/yolov4_tiny_weights_coco.pth\n","Resolving github.91chi.fun (github.91chi.fun)... 104.21.48.120, 172.67.151.57, 2606:4700:3037::6815:3078, ...\n","Connecting to github.91chi.fun (github.91chi.fun)|104.21.48.120|:443... connected.\n","HTTP request sent, awaiting response... 
200 OK\n","Length: 24274351 (23M) [application/octet-stream]\n","Saving to: ‘model_data/yolov4_tiny_weights_coco.pth’\n","\n","model_data/yolov4_t 100%[===================>] 23.15M 40.3MB/s in 0.6s \n","\n","2022-04-28 00:40:50 (40.3 MB/s) - ‘model_data/yolov4_tiny_weights_coco.pth’ saved [24274351/24274351]\n","\n"]}]},{"cell_type":"code","source":["# Train on the data; the default is 100 epochs\n","!python train.py --tiny --phi 0 --epochs 100 \\\n"," --weights model_data/yolov4_tiny_weights_coco.pth \\\n"," --freeze --freeze-epochs 50 --freeze-size 64 \\\n"," --batch-size 32 --shape 416 \\\n"," --fp16 --cuda"],"metadata":{"colab":{"base_uri":"https://localhost:8080/"},"id":"BCQ-ywOulHkQ","executionInfo":{"status":"ok","timestamp":1651112371419,"user_tz":-480,"elapsed":5920782,"user":{"displayName":"KaiJun Deng","userId":"04642544504944131029"}},"outputId":"df858913-5b20-4a74-d7f9-d30cb4491c4e"},"execution_count":29,"outputs":[{"output_type":"stream","name":"stdout","text":["Namespace(batch_size=32, cuda=True, epochs=100, fp16=True, freeze=True, freeze_epochs=50, freeze_size=64, init=0, lr=0.02, momentum=0.937, mosaic=False, optimizer='adam', phi=0, save_period=4, shape=416, tiny=True, weight_decay=0, weights='model_data/yolov4_tiny_weights_coco.pth')\n","initialize network with normal type\n","Load weights model_data/yolov4_tiny_weights_coco.pth.\n","Start Train\n","Epoch 1/100: 100% 22/22 [00:50<00:00, 2.31s/it, loss=4.66, lr=0.0001]\n","Finish Train\n","Start Validation\n","Epoch 1/100: 100% 2/2 [00:08<00:00, 4.01s/it, val_loss=3.98]\n","Finish Validation\n","Epoch:1/100\n","Total Loss: 4.655 || Val Loss: 3.979 \n","Start Train\n","Epoch 2/100: 100% 22/22 [00:49<00:00, 2.24s/it, loss=3.14, lr=0.0002]\n","Finish Train\n","Start Validation\n","Epoch 2/100: 100% 2/2 [00:05<00:00, 2.52s/it, val_loss=2.24]\n","Finish Validation\n","Epoch:2/100\n","Total Loss: 3.143 || Val Loss: 2.238 \n","Start Train\n","Epoch 3/100: 100% 22/22 [00:49<00:00, 2.24s/it, loss=1.5, lr=0.0005]\n","Finish 
Train\n","Start Validation\n","Epoch 3/100: 100% 2/2 [00:05<00:00, 2.95s/it, val_loss=0.721]\n","Finish Validation\n","Epoch:3/100\n","Total Loss: 1.502 || Val Loss: 0.721 \n","Start Train\n","Epoch 4/100: 100% 22/22 [00:49<00:00, 2.26s/it, loss=0.506, lr=0.001]\n","Finish Train\n","Start Validation\n","Epoch 4/100: 100% 2/2 [00:05<00:00, 2.93s/it, val_loss=0.204]\n","Finish Validation\n","Epoch:4/100\n","Total Loss: 0.506 || Val Loss: 0.204 \n","Start Train\n","Epoch 5/100: 100% 22/22 [00:50<00:00, 2.30s/it, loss=0.231, lr=0.001]\n","Finish Train\n","Start Validation\n","Epoch 5/100: 100% 2/2 [00:06<00:00, 3.10s/it, val_loss=0.131]\n","Finish Validation\n","Epoch:5/100\n","Total Loss: 0.231 || Val Loss: 0.131 \n","Start Train\n","Epoch 6/100: 100% 22/22 [00:49<00:00, 2.25s/it, loss=0.174, lr=0.000999]\n","Finish Train\n","Start Validation\n","Epoch 6/100: 100% 2/2 [00:05<00:00, 2.99s/it, val_loss=0.107]\n","Finish Validation\n","Epoch:6/100\n","Total Loss: 0.174 || Val Loss: 0.107 \n","Start Train\n","Epoch 7/100: 100% 22/22 [00:50<00:00, 2.28s/it, loss=0.15, lr=0.000997]\n","Finish Train\n","Start Validation\n","Epoch 7/100: 100% 2/2 [00:05<00:00, 2.79s/it, val_loss=0.0895]\n","Finish Validation\n","Epoch:7/100\n","Total Loss: 0.150 || Val Loss: 0.089 \n","Start Train\n","Epoch 8/100: 100% 22/22 [00:49<00:00, 2.27s/it, loss=0.134, lr=0.000995]\n","Finish Train\n","Start Validation\n","Epoch 8/100: 100% 2/2 [00:05<00:00, 2.78s/it, val_loss=0.0784]\n","Finish Validation\n","Epoch:8/100\n","Total Loss: 0.134 || Val Loss: 0.078 \n","Start Train\n","Epoch 9/100: 100% 22/22 [00:49<00:00, 2.23s/it, loss=0.126, lr=0.000993]\n","Finish Train\n","Start Validation\n","Epoch 9/100: 100% 2/2 [00:05<00:00, 2.81s/it, val_loss=0.0721]\n","Finish Validation\n","Epoch:9/100\n","Total Loss: 0.126 || Val Loss: 0.072 \n","Start Train\n","Epoch 10/100: 100% 22/22 [00:48<00:00, 2.21s/it, loss=0.117, lr=0.00099]\n","Finish Train\n","Start Validation\n","Epoch 10/100: 100% 2/2 
[00:05<00:00, 2.92s/it, val_loss=0.0647]\n","Finish Validation\n","Epoch:10/100\n","Total Loss: 0.117 || Val Loss: 0.065 \n","Start Train\n","Epoch 11/100: 100% 22/22 [00:49<00:00, 2.24s/it, loss=0.11, lr=0.000986]\n","Finish Train\n","Start Validation\n","Epoch 11/100: 100% 2/2 [00:05<00:00, 2.73s/it, val_loss=0.061]\n","Finish Validation\n","Epoch:11/100\n","Total Loss: 0.110 || Val Loss: 0.061 \n","Start Train\n","Epoch 12/100: 100% 22/22 [00:48<00:00, 2.22s/it, loss=0.102, lr=0.000982]\n","Finish Train\n","Start Validation\n","Epoch 12/100: 100% 2/2 [00:06<00:00, 3.08s/it, val_loss=0.057]\n","Finish Validation\n","Epoch:12/100\n","Total Loss: 0.102 || Val Loss: 0.057 \n","Start Train\n","Epoch 13/100: 100% 22/22 [00:50<00:00, 2.29s/it, loss=0.0965, lr=0.000977]\n","Finish Train\n","Start Validation\n","Epoch 13/100: 100% 2/2 [00:05<00:00, 2.82s/it, val_loss=0.0537]\n","Finish Validation\n","Epoch:13/100\n","Total Loss: 0.097 || Val Loss: 0.054 \n","Start Train\n","Epoch 14/100: 100% 22/22 [00:49<00:00, 2.25s/it, loss=0.0927, lr=0.000971]\n","Finish Train\n","Start Validation\n","Epoch 14/100: 100% 2/2 [00:05<00:00, 2.64s/it, val_loss=0.0532]\n","Finish Validation\n","Epoch:14/100\n","Total Loss: 0.093 || Val Loss: 0.053 \n","Start Train\n","Epoch 15/100: 100% 22/22 [00:51<00:00, 2.33s/it, loss=0.0896, lr=0.000965]\n","Finish Train\n","Start Validation\n","Epoch 15/100: 100% 2/2 [00:05<00:00, 2.64s/it, val_loss=0.0509]\n","Finish Validation\n","Epoch:15/100\n","Total Loss: 0.090 || Val Loss: 0.051 \n","Start Train\n","Epoch 16/100: 100% 22/22 [00:50<00:00, 2.29s/it, loss=0.0878, lr=0.000959]\n","Finish Train\n","Start Validation\n","Epoch 16/100: 100% 2/2 [00:05<00:00, 2.81s/it, val_loss=0.0487]\n","Finish Validation\n","Epoch:16/100\n","Total Loss: 0.088 || Val Loss: 0.049 \n","Start Train\n","Epoch 17/100: 100% 22/22 [00:50<00:00, 2.30s/it, loss=0.0814, lr=0.000952]\n","Finish Train\n","Start Validation\n","Epoch 17/100: 100% 2/2 [00:05<00:00, 2.84s/it, 
val_loss=0.0459]\n","Finish Validation\n","Epoch:17/100\n","Total Loss: 0.081 || Val Loss: 0.046 \n","Start Train\n","Epoch 18/100: 100% 22/22 [00:51<00:00, 2.33s/it, loss=0.0821, lr=0.000945]\n","Finish Train\n","Start Validation\n","Epoch 18/100: 100% 2/2 [00:06<00:00, 3.15s/it, val_loss=0.0435]\n","Finish Validation\n","Epoch:18/100\n","Total Loss: 0.082 || Val Loss: 0.044 \n","Start Train\n","Epoch 19/100: 100% 22/22 [00:50<00:00, 2.30s/it, loss=0.078, lr=0.000936]\n","Finish Train\n","Start Validation\n","Epoch 19/100: 100% 2/2 [00:05<00:00, 2.55s/it, val_loss=0.0411]\n","Finish Validation\n","Epoch:19/100\n","Total Loss: 0.078 || Val Loss: 0.041 \n","Start Train\n","Epoch 20/100: 100% 22/22 [00:50<00:00, 2.29s/it, loss=0.0775, lr=0.000928]\n","Finish Train\n","Start Validation\n","Epoch 20/100: 100% 2/2 [00:05<00:00, 2.78s/it, val_loss=0.0421]\n","Finish Validation\n","Epoch:20/100\n","Total Loss: 0.077 || Val Loss: 0.042 \n","Start Train\n","Epoch 21/100: 100% 22/22 [00:51<00:00, 2.35s/it, loss=0.0774, lr=0.000919]\n","Finish Train\n","Start Validation\n","Epoch 21/100: 100% 2/2 [00:05<00:00, 2.95s/it, val_loss=0.0376]\n","Finish Validation\n","Epoch:21/100\n","Total Loss: 0.077 || Val Loss: 0.038 \n","Start Train\n","Epoch 22/100: 100% 22/22 [00:50<00:00, 2.29s/it, loss=0.0732, lr=0.000909]\n","Finish Train\n","Start Validation\n","Epoch 22/100: 100% 2/2 [00:05<00:00, 2.94s/it, val_loss=0.037]\n","Finish Validation\n","Epoch:22/100\n","Total Loss: 0.073 || Val Loss: 0.037 \n","Start Train\n","Epoch 23/100: 100% 22/22 [00:50<00:00, 2.30s/it, loss=0.0717, lr=0.000899]\n","Finish Train\n","Start Validation\n","Epoch 23/100: 100% 2/2 [00:05<00:00, 2.87s/it, val_loss=0.037]\n","Finish Validation\n","Epoch:23/100\n","Total Loss: 0.072 || Val Loss: 0.037 \n","Start Train\n","Epoch 24/100: 100% 22/22 [00:50<00:00, 2.28s/it, loss=0.0702, lr=0.000889]\n","Finish Train\n","Start Validation\n","Epoch 24/100: 100% 2/2 [00:05<00:00, 2.64s/it, val_loss=0.035]\n","Finish 
Validation\n","Epoch:24/100\n","Total Loss: 0.070 || Val Loss: 0.035 \n","Start Train\n","Epoch 25/100: 100% 22/22 [00:49<00:00, 2.27s/it, loss=0.0702, lr=0.000878]\n","Finish Train\n","Start Validation\n","Epoch 25/100: 100% 2/2 [00:05<00:00, 2.72s/it, val_loss=0.0355]\n","Finish Validation\n","Epoch:25/100\n","Total Loss: 0.070 || Val Loss: 0.036 \n","Start Train\n","Epoch 26/100: 100% 22/22 [00:49<00:00, 2.23s/it, loss=0.0671, lr=0.000867]\n","Finish Train\n","Start Validation\n","Epoch 26/100: 100% 2/2 [00:05<00:00, 2.86s/it, val_loss=0.0346]\n","Finish Validation\n","Epoch:26/100\n","Total Loss: 0.067 || Val Loss: 0.035 \n","Start Train\n","Epoch 27/100: 100% 22/22 [00:49<00:00, 2.26s/it, loss=0.0673, lr=0.000855]\n","Finish Train\n","Start Validation\n","Epoch 27/100: 100% 2/2 [00:05<00:00, 2.81s/it, val_loss=0.0361]\n","Finish Validation\n","Epoch:27/100\n","Total Loss: 0.067 || Val Loss: 0.036 \n","Start Train\n","Epoch 28/100: 100% 22/22 [00:51<00:00, 2.33s/it, loss=0.0674, lr=0.000843]\n","Finish Train\n","Start Validation\n","Epoch 28/100: 100% 2/2 [00:05<00:00, 2.71s/it, val_loss=0.0349]\n","Finish Validation\n","Epoch:28/100\n","Total Loss: 0.067 || Val Loss: 0.035 \n","Start Train\n","Epoch 29/100: 100% 22/22 [00:49<00:00, 2.25s/it, loss=0.0664, lr=0.00083]\n","Finish Train\n","Start Validation\n","Epoch 29/100: 100% 2/2 [00:05<00:00, 2.88s/it, val_loss=0.0317]\n","Finish Validation\n","Epoch:29/100\n","Total Loss: 0.066 || Val Loss: 0.032 \n","Start Train\n","Epoch 30/100: 100% 22/22 [00:49<00:00, 2.25s/it, loss=0.0637, lr=0.000817]\n","Finish Train\n","Start Validation\n","Epoch 30/100: 100% 2/2 [00:05<00:00, 2.93s/it, val_loss=0.034]\n","Finish Validation\n","Epoch:30/100\n","Total Loss: 0.064 || Val Loss: 0.034 \n","Start Train\n","Epoch 31/100: 100% 22/22 [00:49<00:00, 2.26s/it, loss=0.0616, lr=0.000804]\n","Finish Train\n","Start Validation\n","Epoch 31/100: 100% 2/2 [00:05<00:00, 2.85s/it, val_loss=0.034]\n","Finish 
Validation\n","Epoch:31/100\n","Total Loss: 0.062 || Val Loss: 0.034 \n","Start Train\n","Epoch 32/100: 100% 22/22 [00:49<00:00, 2.27s/it, loss=0.0623, lr=0.00079]\n","Finish Train\n","Start Validation\n","Epoch 32/100: 100% 2/2 [00:04<00:00, 2.36s/it, val_loss=0.0335]\n","Finish Validation\n","Epoch:32/100\n","Total Loss: 0.062 || Val Loss: 0.034 \n","Start Train\n","Epoch 33/100: 100% 22/22 [00:50<00:00, 2.30s/it, loss=0.0609, lr=0.000776]\n","Finish Train\n","Start Validation\n","Epoch 33/100: 100% 2/2 [00:05<00:00, 2.78s/it, val_loss=0.031]\n","Finish Validation\n","Epoch:33/100\n","Total Loss: 0.061 || Val Loss: 0.031 \n","Start Train\n","Epoch 34/100: 100% 22/22 [00:49<00:00, 2.26s/it, loss=0.0591, lr=0.000762]\n","Finish Train\n","Start Validation\n","Epoch 34/100: 100% 2/2 [00:05<00:00, 2.74s/it, val_loss=0.0297]\n","Finish Validation\n","Epoch:34/100\n","Total Loss: 0.059 || Val Loss: 0.030 \n","Start Train\n","Epoch 35/100: 100% 22/22 [00:49<00:00, 2.27s/it, loss=0.0594, lr=0.000748]\n","Finish Train\n","Start Validation\n","Epoch 35/100: 100% 2/2 [00:05<00:00, 2.97s/it, val_loss=0.0296]\n","Finish Validation\n","Epoch:35/100\n","Total Loss: 0.059 || Val Loss: 0.030 \n","Start Train\n","Epoch 36/100: 100% 22/22 [00:49<00:00, 2.27s/it, loss=0.0599, lr=0.000733]\n","Finish Train\n","Start Validation\n","Epoch 36/100: 100% 2/2 [00:05<00:00, 2.85s/it, val_loss=0.0273]\n","Finish Validation\n","Epoch:36/100\n","Total Loss: 0.060 || Val Loss: 0.027 \n","Start Train\n","Epoch 37/100: 100% 22/22 [00:50<00:00, 2.29s/it, loss=0.0566, lr=0.000718]\n","Finish Train\n","Start Validation\n","Epoch 37/100: 100% 2/2 [00:05<00:00, 2.96s/it, val_loss=0.0275]\n","Finish Validation\n","Epoch:37/100\n","Total Loss: 0.057 || Val Loss: 0.027 \n","Start Train\n","Epoch 38/100: 100% 22/22 [00:49<00:00, 2.27s/it, loss=0.0554, lr=0.000702]\n","Finish Train\n","Start Validation\n","Epoch 38/100: 100% 2/2 [00:06<00:00, 3.11s/it, val_loss=0.0286]\n","Finish 
Validation\n","Epoch:38/100\n","Total Loss: 0.055 || Val Loss: 0.029 \n","Start Train\n","Epoch 39/100: 100% 22/22 [00:49<00:00, 2.27s/it, loss=0.0551, lr=0.000687]\n","Finish Train\n","Start Validation\n","Epoch 39/100: 100% 2/2 [00:05<00:00, 2.88s/it, val_loss=0.0277]\n","Finish Validation\n","Epoch:39/100\n","Total Loss: 0.055 || Val Loss: 0.028 \n","Start Train\n","Epoch 40/100: 100% 22/22 [00:50<00:00, 2.28s/it, loss=0.055, lr=0.000671]\n","Finish Train\n","Start Validation\n","Epoch 40/100: 100% 2/2 [00:06<00:00, 3.16s/it, val_loss=0.0303]\n","Finish Validation\n","Epoch:40/100\n","Total Loss: 0.055 || Val Loss: 0.030 \n","Start Train\n","Epoch 41/100: 100% 22/22 [00:51<00:00, 2.34s/it, loss=0.055, lr=0.000655]\n","Finish Train\n","Start Validation\n","Epoch 41/100: 100% 2/2 [00:05<00:00, 2.89s/it, val_loss=0.0265]\n","Finish Validation\n","Epoch:41/100\n","Total Loss: 0.055 || Val Loss: 0.026 \n","Start Train\n","Epoch 42/100: 100% 22/22 [00:50<00:00, 2.29s/it, loss=0.0548, lr=0.000639]\n","Finish Train\n","Start Validation\n","Epoch 42/100: 100% 2/2 [00:05<00:00, 2.91s/it, val_loss=0.0276]\n","Finish Validation\n","Epoch:42/100\n","Total Loss: 0.055 || Val Loss: 0.028 \n","Start Train\n","Epoch 43/100: 100% 22/22 [00:50<00:00, 2.30s/it, loss=0.0554, lr=0.000622]\n","Finish Train\n","Start Validation\n","Epoch 43/100: 100% 2/2 [00:05<00:00, 2.99s/it, val_loss=0.027]\n","Finish Validation\n","Epoch:43/100\n","Total Loss: 0.055 || Val Loss: 0.027 \n","Start Train\n","Epoch 44/100: 100% 22/22 [00:50<00:00, 2.29s/it, loss=0.0545, lr=0.000606]\n","Finish Train\n","Start Validation\n","Epoch 44/100: 100% 2/2 [00:05<00:00, 2.64s/it, val_loss=0.0266]\n","Finish Validation\n","Epoch:44/100\n","Total Loss: 0.054 || Val Loss: 0.027 \n","Start Train\n","Epoch 45/100: 100% 22/22 [00:49<00:00, 2.23s/it, loss=0.0523, lr=0.000589]\n","Finish Train\n","Start Validation\n","Epoch 45/100: 100% 2/2 [00:05<00:00, 2.81s/it, val_loss=0.0275]\n","Finish 
Validation\n","Epoch:45/100\n","Total Loss: 0.052 || Val Loss: 0.028 \n","Start Train\n","Epoch 46/100: 100% 22/22 [00:49<00:00, 2.24s/it, loss=0.0531, lr=0.000572]\n","Finish Train\n","Start Validation\n","Epoch 46/100: 100% 2/2 [00:05<00:00, 2.84s/it, val_loss=0.0268]\n","Finish Validation\n","Epoch:46/100\n","Total Loss: 0.053 || Val Loss: 0.027 \n","Start Train\n","Epoch 47/100: 100% 22/22 [00:49<00:00, 2.25s/it, loss=0.0528, lr=0.000556]\n","Finish Train\n","Start Validation\n","Epoch 47/100: 100% 2/2 [00:05<00:00, 2.87s/it, val_loss=0.0269]\n","Finish Validation\n","Epoch:47/100\n","Total Loss: 0.053 || Val Loss: 0.027 \n","Start Train\n","Epoch 48/100: 100% 22/22 [00:48<00:00, 2.20s/it, loss=0.0519, lr=0.000539]\n","Finish Train\n","Start Validation\n","Epoch 48/100: 100% 2/2 [00:05<00:00, 2.79s/it, val_loss=0.0273]\n","Finish Validation\n","Epoch:48/100\n","Total Loss: 0.052 || Val Loss: 0.027 \n","Start Train\n","Epoch 49/100: 100% 22/22 [00:49<00:00, 2.27s/it, loss=0.0522, lr=0.000522]\n","Finish Train\n","Start Validation\n","Epoch 49/100: 100% 2/2 [00:05<00:00, 2.98s/it, val_loss=0.0266]\n","Finish Validation\n","Epoch:49/100\n","Total Loss: 0.052 || Val Loss: 0.027 \n","Start Train\n","Epoch 50/100: 100% 22/22 [00:49<00:00, 2.24s/it, loss=0.0484, lr=0.000505]\n","Finish Train\n","Start Validation\n","Epoch 50/100: 100% 2/2 [00:05<00:00, 2.68s/it, val_loss=0.0261]\n","Finish Validation\n","Epoch:50/100\n","Total Loss: 0.048 || Val Loss: 0.026 \n","Start Train\n","Epoch 51/100: 100% 45/45 [00:56<00:00, 1.26s/it, loss=0.0714, lr=0.000488]\n","Finish Train\n","Start Validation\n","Epoch 51/100: 100% 5/5 [00:06<00:00, 1.40s/it, val_loss=0.0297]\n","Finish Validation\n","Epoch:51/100\n","Total Loss: 0.071 || Val Loss: 0.030 \n","Start Train\n","Epoch 52/100: 100% 45/45 [00:54<00:00, 1.21s/it, loss=0.0658, lr=0.000471]\n","Finish Train\n","Start Validation\n","Epoch 52/100: 100% 5/5 [00:06<00:00, 1.34s/it, val_loss=0.0318]\n","Finish 
Validation\n","Epoch:52/100\n","Total Loss: 0.066 || Val Loss: 0.032 \n","Start Train\n","Epoch 53/100: 100% 45/45 [00:54<00:00, 1.20s/it, loss=0.0634, lr=0.000454]\n","Finish Train\n","Start Validation\n","Epoch 53/100: 100% 5/5 [00:06<00:00, 1.38s/it, val_loss=0.0258]\n","Finish Validation\n","Epoch:53/100\n","Total Loss: 0.063 || Val Loss: 0.026 \n","Start Train\n","Epoch 54/100: 100% 45/45 [00:54<00:00, 1.21s/it, loss=0.0582, lr=0.000438]\n","Finish Train\n","Start Validation\n","Epoch 54/100: 100% 5/5 [00:06<00:00, 1.35s/it, val_loss=0.0265]\n","Finish Validation\n","Epoch:54/100\n","Total Loss: 0.058 || Val Loss: 0.027 \n","Start Train\n","Epoch 55/100: 100% 45/45 [00:56<00:00, 1.25s/it, loss=0.0601, lr=0.000421]\n","Finish Train\n","Start Validation\n","Epoch 55/100: 100% 5/5 [00:07<00:00, 1.40s/it, val_loss=0.0288]\n","Finish Validation\n","Epoch:55/100\n","Total Loss: 0.060 || Val Loss: 0.029 \n","Start Train\n","Epoch 56/100: 100% 45/45 [00:54<00:00, 1.22s/it, loss=0.0558, lr=0.000404]\n","Finish Train\n","Start Validation\n","Epoch 56/100: 100% 5/5 [00:06<00:00, 1.31s/it, val_loss=0.0237]\n","Finish Validation\n","Epoch:56/100\n","Total Loss: 0.056 || Val Loss: 0.024 \n","Start Train\n","Epoch 57/100: 100% 45/45 [00:54<00:00, 1.21s/it, loss=0.0531, lr=0.000388]\n","Finish Train\n","Start Validation\n","Epoch 57/100: 100% 5/5 [00:06<00:00, 1.32s/it, val_loss=0.0244]\n","Finish Validation\n","Epoch:57/100\n","Total Loss: 0.053 || Val Loss: 0.024 \n","Start Train\n","Epoch 58/100: 100% 45/45 [00:55<00:00, 1.22s/it, loss=0.0523, lr=0.000371]\n","Finish Train\n","Start Validation\n","Epoch 58/100: 100% 5/5 [00:06<00:00, 1.36s/it, val_loss=0.024]\n","Finish Validation\n","Epoch:58/100\n","Total Loss: 0.052 || Val Loss: 0.024 \n","Start Train\n","Epoch 59/100: 100% 45/45 [00:55<00:00, 1.23s/it, loss=0.047, lr=0.000355]\n","Finish Train\n","Start Validation\n","Epoch 59/100: 100% 5/5 [00:06<00:00, 1.35s/it, val_loss=0.0222]\n","Finish 
Validation\n","Epoch:59/100\n","Total Loss: 0.047 || Val Loss: 0.022 \n","Start Train\n","Epoch 60/100: 100% 45/45 [00:54<00:00, 1.22s/it, loss=0.0457, lr=0.000339]\n","Finish Train\n","Start Validation\n","Epoch 60/100: 100% 5/5 [00:06<00:00, 1.35s/it, val_loss=0.0229]\n","Finish Validation\n","Epoch:60/100\n","Total Loss: 0.046 || Val Loss: 0.023 \n","Start Train\n","Epoch 61/100: 100% 45/45 [00:55<00:00, 1.22s/it, loss=0.0465, lr=0.000323]\n","Finish Train\n","Start Validation\n","Epoch 61/100: 100% 5/5 [00:06<00:00, 1.35s/it, val_loss=0.0191]\n","Finish Validation\n","Epoch:61/100\n","Total Loss: 0.047 || Val Loss: 0.019 \n","Start Train\n","Epoch 62/100: 100% 45/45 [00:55<00:00, 1.23s/it, loss=0.0412, lr=0.000308]\n","Finish Train\n","Start Validation\n","Epoch 62/100: 100% 5/5 [00:06<00:00, 1.37s/it, val_loss=0.021]\n","Finish Validation\n","Epoch:62/100\n","Total Loss: 0.041 || Val Loss: 0.021 \n","Start Train\n","Epoch 63/100: 100% 45/45 [00:54<00:00, 1.21s/it, loss=0.0407, lr=0.000292]\n","Finish Train\n","Start Validation\n","Epoch 63/100: 100% 5/5 [00:06<00:00, 1.34s/it, val_loss=0.0206]\n","Finish Validation\n","Epoch:63/100\n","Total Loss: 0.041 || Val Loss: 0.021 \n","Start Train\n","Epoch 64/100: 100% 45/45 [00:56<00:00, 1.25s/it, loss=0.0416, lr=0.000277]\n","Finish Train\n","Start Validation\n","Epoch 64/100: 100% 5/5 [00:07<00:00, 1.41s/it, val_loss=0.0204]\n","Finish Validation\n","Epoch:64/100\n","Total Loss: 0.042 || Val Loss: 0.020 \n","Start Train\n","Epoch 65/100: 100% 45/45 [00:55<00:00, 1.23s/it, loss=0.0382, lr=0.000262]\n","Finish Train\n","Start Validation\n","Epoch 65/100: 100% 5/5 [00:07<00:00, 1.41s/it, val_loss=0.0192]\n","Finish Validation\n","Epoch:65/100\n","Total Loss: 0.038 || Val Loss: 0.019 \n","Start Train\n","Epoch 66/100: 100% 45/45 [00:55<00:00, 1.23s/it, loss=0.039, lr=0.000248]\n","Finish Train\n","Start Validation\n","Epoch 66/100: 100% 5/5 [00:06<00:00, 1.31s/it, val_loss=0.0193]\n","Finish 
Validation\n","Epoch:66/100\n","Total Loss: 0.039 || Val Loss: 0.019 \n","Start Train\n","Epoch 67/100: 100% 45/45 [00:55<00:00, 1.22s/it, loss=0.0418, lr=0.000234]\n","Finish Train\n","Start Validation\n","Epoch 67/100: 100% 5/5 [00:07<00:00, 1.41s/it, val_loss=0.0203]\n","Finish Validation\n","Epoch:67/100\n","Total Loss: 0.042 || Val Loss: 0.020 \n","Start Train\n","Epoch 68/100: 100% 45/45 [00:55<00:00, 1.22s/it, loss=0.0389, lr=0.00022]\n","Finish Train\n","Start Validation\n","Epoch 68/100: 100% 5/5 [00:06<00:00, 1.36s/it, val_loss=0.0194]\n","Finish Validation\n","Epoch:68/100\n","Total Loss: 0.039 || Val Loss: 0.019 \n","Start Train\n","Epoch 69/100: 100% 45/45 [00:54<00:00, 1.22s/it, loss=0.0372, lr=0.000206]\n","Finish Train\n","Start Validation\n","Epoch 69/100: 100% 5/5 [00:06<00:00, 1.32s/it, val_loss=0.019]\n","Finish Validation\n","Epoch:69/100\n","Total Loss: 0.037 || Val Loss: 0.019 \n","Start Train\n","Epoch 70/100: 100% 45/45 [00:54<00:00, 1.22s/it, loss=0.0353, lr=0.000193]\n","Finish Train\n","Start Validation\n","Epoch 70/100: 100% 5/5 [00:06<00:00, 1.35s/it, val_loss=0.0173]\n","Finish Validation\n","Epoch:70/100\n","Total Loss: 0.035 || Val Loss: 0.017 \n","Start Train\n","Epoch 71/100: 100% 45/45 [00:55<00:00, 1.23s/it, loss=0.0344, lr=0.00018]\n","Finish Train\n","Start Validation\n","Epoch 71/100: 100% 5/5 [00:06<00:00, 1.34s/it, val_loss=0.0178]\n","Finish Validation\n","Epoch:71/100\n","Total Loss: 0.034 || Val Loss: 0.018 \n","Start Train\n","Epoch 72/100: 100% 45/45 [00:53<00:00, 1.20s/it, loss=0.0338, lr=0.000167]\n","Finish Train\n","Start Validation\n","Epoch 72/100: 100% 5/5 [00:06<00:00, 1.34s/it, val_loss=0.0181]\n","Finish Validation\n","Epoch:72/100\n","Total Loss: 0.034 || Val Loss: 0.018 \n","Start Train\n","Epoch 73/100: 100% 45/45 [00:54<00:00, 1.20s/it, loss=0.0345, lr=0.000155]\n","Finish Train\n","Start Validation\n","Epoch 73/100: 100% 5/5 [00:06<00:00, 1.36s/it, val_loss=0.0183]\n","Finish 
Validation\n","Epoch:73/100\n","Total Loss: 0.034 || Val Loss: 0.018 \n","Start Train\n","Epoch 74/100: 100% 45/45 [00:55<00:00, 1.23s/it, loss=0.032, lr=0.000143]\n","Finish Train\n","Start Validation\n","Epoch 74/100: 100% 5/5 [00:06<00:00, 1.35s/it, val_loss=0.0172]\n","Finish Validation\n","Epoch:74/100\n","Total Loss: 0.032 || Val Loss: 0.017 \n","Start Train\n","Epoch 75/100: 100% 45/45 [00:53<00:00, 1.20s/it, loss=0.0349, lr=0.000132]\n","Finish Train\n","Start Validation\n","Epoch 75/100: 100% 5/5 [00:06<00:00, 1.35s/it, val_loss=0.0164]\n","Finish Validation\n","Epoch:75/100\n","Total Loss: 0.035 || Val Loss: 0.016 \n","Start Train\n","Epoch 76/100: 100% 45/45 [00:55<00:00, 1.23s/it, loss=0.0299, lr=0.000121]\n","Finish Train\n","Start Validation\n","Epoch 76/100: 100% 5/5 [00:06<00:00, 1.38s/it, val_loss=0.0167]\n","Finish Validation\n","Epoch:76/100\n","Total Loss: 0.030 || Val Loss: 0.017 \n","Start Train\n","Epoch 77/100: 100% 45/45 [00:54<00:00, 1.21s/it, loss=0.0317, lr=0.000111]\n","Finish Train\n","Start Validation\n","Epoch 77/100: 100% 5/5 [00:06<00:00, 1.35s/it, val_loss=0.0166]\n","Finish Validation\n","Epoch:77/100\n","Total Loss: 0.032 || Val Loss: 0.017 \n","Start Train\n","Epoch 78/100: 100% 45/45 [00:54<00:00, 1.21s/it, loss=0.0318, lr=0.000101]\n","Finish Train\n","Start Validation\n","Epoch 78/100: 100% 5/5 [00:06<00:00, 1.38s/it, val_loss=0.0166]\n","Finish Validation\n","Epoch:78/100\n","Total Loss: 0.032 || Val Loss: 0.017 \n","Start Train\n","Epoch 79/100: 100% 45/45 [00:55<00:00, 1.23s/it, loss=0.0303, lr=9.11e-5]\n","Finish Train\n","Start Validation\n","Epoch 79/100: 100% 5/5 [00:06<00:00, 1.35s/it, val_loss=0.0157]\n","Finish Validation\n","Epoch:79/100\n","Total Loss: 0.030 || Val Loss: 0.016 \n","Start Train\n","Epoch 80/100: 100% 45/45 [00:55<00:00, 1.23s/it, loss=0.0292, lr=8.21e-5]\n","Finish Train\n","Start Validation\n","Epoch 80/100: 100% 5/5 [00:06<00:00, 1.34s/it, val_loss=0.016]\n","Finish 
Validation\n","Epoch:80/100\n","Total Loss: 0.029 || Val Loss: 0.016 \n","Start Train\n","Epoch 81/100: 100% 45/45 [00:54<00:00, 1.22s/it, loss=0.0293, lr=7.35e-5]\n","Finish Train\n","Start Validation\n","Epoch 81/100: 100% 5/5 [00:06<00:00, 1.33s/it, val_loss=0.016]\n","Finish Validation\n","Epoch:81/100\n","Total Loss: 0.029 || Val Loss: 0.016 \n","Start Train\n","Epoch 82/100: 100% 45/45 [00:55<00:00, 1.24s/it, loss=0.0292, lr=6.55e-5]\n","Finish Train\n","Start Validation\n","Epoch 82/100: 100% 5/5 [00:06<00:00, 1.38s/it, val_loss=0.0155]\n","Finish Validation\n","Epoch:82/100\n","Total Loss: 0.029 || Val Loss: 0.016 \n","Start Train\n","Epoch 83/100: 100% 45/45 [00:55<00:00, 1.23s/it, loss=0.0276, lr=5.8e-5]\n","Finish Train\n","Start Validation\n","Epoch 83/100: 100% 5/5 [00:06<00:00, 1.34s/it, val_loss=0.0157]\n","Finish Validation\n","Epoch:83/100\n","Total Loss: 0.028 || Val Loss: 0.016 \n","Start Train\n","Epoch 84/100: 100% 45/45 [00:55<00:00, 1.24s/it, loss=0.0274, lr=5.1e-5]\n","Finish Train\n","Start Validation\n","Epoch 84/100: 100% 5/5 [00:07<00:00, 1.42s/it, val_loss=0.0162]\n","Finish Validation\n","Epoch:84/100\n","Total Loss: 0.027 || Val Loss: 0.016 \n","Start Train\n","Epoch 85/100: 100% 45/45 [00:54<00:00, 1.22s/it, loss=0.027, lr=4.45e-5]\n","Finish Train\n","Start Validation\n","Epoch 85/100: 100% 5/5 [00:06<00:00, 1.37s/it, val_loss=0.0157]\n","Finish Validation\n","Epoch:85/100\n","Total Loss: 0.027 || Val Loss: 0.016 \n","Start Train\n","Epoch 86/100: 100% 45/45 [00:53<00:00, 1.20s/it, loss=0.0277, lr=3.86e-5]\n","Finish Train\n","Start Validation\n","Epoch 86/100: 100% 5/5 [00:06<00:00, 1.37s/it, val_loss=0.0155]\n","Finish Validation\n","Epoch:86/100\n","Total Loss: 0.028 || Val Loss: 0.015 \n","Start Train\n","Epoch 87/100: 100% 45/45 [00:55<00:00, 1.23s/it, loss=0.0266, lr=3.32e-5]\n","Finish Train\n","Start Validation\n","Epoch 87/100: 100% 5/5 [00:06<00:00, 1.35s/it, val_loss=0.0155]\n","Finish 
Validation\n","Epoch:87/100\n","Total Loss: 0.027 || Val Loss: 0.015 \n","Start Train\n","Epoch 88/100: 100% 45/45 [00:56<00:00, 1.25s/it, loss=0.0275, lr=2.84e-5]\n","Finish Train\n","Start Validation\n","Epoch 88/100: 100% 5/5 [00:06<00:00, 1.40s/it, val_loss=0.0153]\n","Finish Validation\n","Epoch:88/100\n","Total Loss: 0.028 || Val Loss: 0.015 \n","Start Train\n","Epoch 89/100: 100% 45/45 [00:56<00:00, 1.25s/it, loss=0.0259, lr=2.41e-5]\n","Finish Train\n","Start Validation\n","Epoch 89/100: 100% 5/5 [00:06<00:00, 1.35s/it, val_loss=0.0153]\n","Finish Validation\n","Epoch:89/100\n","Total Loss: 0.026 || Val Loss: 0.015 \n","Start Train\n","Epoch 90/100: 100% 45/45 [00:56<00:00, 1.26s/it, loss=0.0256, lr=2.04e-5]\n","Finish Train\n","Start Validation\n","Epoch 91/100: 100% 45/45 [00:56<00:00, 1.25s/it, loss=0.0262, lr=1.72e-5]\n","Finish Train\n","Start Validation\n","Epoch 91/100: 100% 5/5 [00:06<00:00, 1.36s/it, val_loss=0.0155]\n","Finish Validation\n","Epoch:91/100\n","Total Loss: 0.026 || Val Loss: 0.015 \n","Start Train\n","Epoch 92/100: 100% 45/45 [00:56<00:00, 1.25s/it, loss=0.0255, lr=1.46e-5]\n","Finish Train\n","Start Validation\n","Epoch 92/100: 100% 5/5 [00:06<00:00, 1.37s/it, val_loss=0.0155]\n","Finish Validation\n","Epoch:92/100\n","Total Loss: 0.026 || Val Loss: 0.015 \n","Start Train\n","Epoch 93/100: 100% 45/45 [00:55<00:00, 1.23s/it, loss=0.0258, lr=1.26e-5]\n","Finish Train\n","Start Validation\n","Epoch 93/100: 100% 5/5 [00:06<00:00, 1.35s/it, val_loss=0.0153]\n","Finish Validation\n","Epoch:93/100\n","Total Loss: 0.026 || Val Loss: 0.015 \n","Start Train\n","Epoch 94/100: 100% 45/45 [00:56<00:00, 1.25s/it, loss=0.0264, lr=1.12e-5]\n","Finish Train\n","Start Validation\n","Epoch 94/100: 100% 5/5 [00:06<00:00, 1.31s/it, val_loss=0.0149]\n","Finish Validation\n","Epoch:94/100\n","Total Loss: 0.026 || Val Loss: 0.015 \n","Start Train\n","Epoch 95/100: 100% 45/45 [00:55<00:00, 1.23s/it, loss=0.0275, lr=1.03e-5]\n","Finish Train\n","Start 
Validation\n","Epoch 95/100: 100% 5/5 [00:06<00:00, 1.37s/it, val_loss=0.015]\n","Finish Validation\n","Epoch:95/100\n","Total Loss: 0.028 || Val Loss: 0.015 \n","Start Train\n","Epoch 96/100: 100% 45/45 [00:55<00:00, 1.22s/it, loss=0.0244, lr=1e-5]\n","Finish Train\n","Start Validation\n","Epoch 96/100: 100% 5/5 [00:06<00:00, 1.37s/it, val_loss=0.0151]\n","Finish Validation\n","Epoch:96/100\n","Total Loss: 0.024 || Val Loss: 0.015 \n","Start Train\n","Epoch 97/100: 100% 45/45 [00:53<00:00, 1.20s/it, loss=0.0253, lr=1e-5]\n","Finish Train\n","Start Validation\n","Epoch 97/100: 100% 5/5 [00:06<00:00, 1.39s/it, val_loss=0.0151]\n","Finish Validation\n","Epoch:97/100\n","Total Loss: 0.025 || Val Loss: 0.015 \n","Start Train\n","Epoch 98/100: 100% 45/45 [00:56<00:00, 1.25s/it, loss=0.0268, lr=1e-5]\n","Finish Train\n","Start Validation\n","Epoch 98/100: 100% 5/5 [00:06<00:00, 1.35s/it, val_loss=0.015]\n","Finish Validation\n","Epoch:98/100\n","Total Loss: 0.027 || Val Loss: 0.015 \n","Start Train\n","Epoch 99/100: 100% 45/45 [00:55<00:00, 1.22s/it, loss=0.0263, lr=1e-5]\n","Finish Train\n","Start Validation\n","Epoch 99/100: 100% 5/5 [00:06<00:00, 1.34s/it, val_loss=0.0151]\n","Finish Validation\n","Epoch:99/100\n","Total Loss: 0.026 || Val Loss: 0.015 \n","Start Train\n","Epoch 100/100: 100% 45/45 [00:54<00:00, 1.21s/it, loss=0.0264, lr=1e-5]\n","Finish Train\n","Start Validation\n","Epoch 100/100: 100% 5/5 [00:06<00:00, 1.37s/it, val_loss=0.015]\n","Finish Validation\n","Epoch:100/100\n","Total Loss: 0.026 || Val Loss: 0.015 \n"]}]},{"cell_type":"markdown","metadata":{"id":"Xdk8w9AgkX9x"},"source":["### SE version"]},{"cell_type":"code","execution_count":null,"metadata":{"id":"pT9v141CkX9x"},"outputs":[],"source":["# Train on the data; the default is 100 epochs\n","!python train.py --tiny --phi 1 --epochs 100 \\\n"," --weights model_data/yolotiny_SE_ep100.pth \\\n"," --freeze --freeze-epochs 50 --freeze-size 64 \\\n"," --batch-size 32 --shape 416 \\\n"," --fp16 
--cuda"]},{"cell_type":"markdown","metadata":{"id":"LvfM3YsGkX9y"},"source":["### CBAM version"]},{"cell_type":"code","execution_count":null,"metadata":{"colab":{"base_uri":"https://localhost:8080/"},"executionInfo":{"elapsed":5705245,"status":"ok","timestamp":1651069772733,"user":{"displayName":"Pikachu Pika-pika","userId":"18122152102369823184"},"user_tz":-480},"id":"0HRM9brRwa7U","outputId":"979cb976-476d-4359-ace6-aaf29a99a201"},"outputs":[{"name":"stdout","output_type":"stream","text":["Namespace(batch_size=32, cuda=True, epochs=100, fp16=True, freeze=True, freeze_epochs=50, freeze_size=64, lr=0.02, momentum=0.937, optimizer='adam', phi=2, save_period=4, shape=416, tiny=True, weight_decay=0, weights='model_data/yolov4_tiny_weights_voc_CBAM.pth')\n","initialize network with normal type\n","Load weights model_data/yolov4_tiny_weights_voc_CBAM.pth.\n","Start Train\n","Epoch 1/100: 100% 22/22 [00:52<00:00, 2.40s/it, loss=4.46, lr=0.0001]\n","Finish Train\n","Start Validation\n","Epoch 1/100: 100% 2/2 [00:07<00:00, 3.76s/it, val_loss=3.71]\n","Finish Validation\n","Epoch:1/100\n","Total Loss: 4.458 || Val Loss: 3.705 \n","Start Train\n","Epoch 2/100: 100% 22/22 [00:52<00:00, 2.38s/it, loss=2.73, lr=0.0002]\n","Finish Train\n","Start Validation\n","Epoch 2/100: 100% 2/2 [00:04<00:00, 2.45s/it, val_loss=1.83]\n","Finish Validation\n","Epoch:2/100\n","Total Loss: 2.726 || Val Loss: 1.826 \n","Start Train\n","Epoch 3/100: 100% 22/22 [00:51<00:00, 2.36s/it, loss=1.09, lr=0.0005]\n","Finish Train\n","Start Validation\n","Epoch 3/100: 100% 2/2 [00:04<00:00, 2.45s/it, val_loss=0.514]\n","Finish Validation\n","Epoch:3/100\n","Total Loss: 1.089 || Val Loss: 0.514 \n","Start Train\n","Epoch 4/100: 100% 22/22 [00:50<00:00, 2.30s/it, loss=0.331, lr=0.001]\n","Finish Train\n","Start Validation\n","Epoch 4/100: 100% 2/2 [00:05<00:00, 2.60s/it, val_loss=0.163]\n","Finish Validation\n","Epoch:4/100\n","Total Loss: 0.331 || Val Loss: 0.163 \n","Start Train\n","Epoch 5/100: 100% 22/22 
[00:52<00:00, 2.39s/it, loss=0.171, lr=0.001]\n","Finish Train\n","Start Validation\n","Epoch 5/100: 100% 2/2 [00:05<00:00, 2.79s/it, val_loss=0.108]\n","Finish Validation\n","Epoch:5/100\n","Total Loss: 0.171 || Val Loss: 0.108 \n","Start Train\n","Epoch 6/100: 100% 22/22 [00:50<00:00, 2.31s/it, loss=0.123, lr=0.000999]\n","Finish Train\n","Start Validation\n","Epoch 6/100: 100% 2/2 [00:04<00:00, 2.18s/it, val_loss=0.0906]\n","Finish Validation\n","Epoch:6/100\n","Total Loss: 0.123 || Val Loss: 0.091 \n","Start Train\n","Epoch 7/100: 100% 22/22 [00:51<00:00, 2.35s/it, loss=0.108, lr=0.000997]\n","Finish Train\n","Start Validation\n","Epoch 7/100: 100% 2/2 [00:04<00:00, 2.42s/it, val_loss=0.0754]\n","Finish Validation\n","Epoch:7/100\n","Total Loss: 0.108 || Val Loss: 0.075 \n","Start Train\n","Epoch 8/100: 100% 22/22 [00:52<00:00, 2.38s/it, loss=0.109, lr=0.000995]\n","Finish Train\n","Start Validation\n","Epoch 8/100: 100% 2/2 [00:05<00:00, 2.59s/it, val_loss=0.0715]\n","Finish Validation\n","Epoch:8/100\n","Total Loss: 0.109 || Val Loss: 0.071 \n","Start Train\n","Epoch 9/100: 100% 22/22 [00:51<00:00, 2.34s/it, loss=0.108, lr=0.000993]\n","Finish Train\n","Start Validation\n","Epoch 9/100: 100% 2/2 [00:04<00:00, 2.30s/it, val_loss=0.0652]\n","Finish Validation\n","Epoch:9/100\n","Total Loss: 0.108 || Val Loss: 0.065 \n","Start Train\n","Epoch 10/100: 100% 22/22 [00:51<00:00, 2.33s/it, loss=0.0997, lr=0.00099]\n","Finish Train\n","Start Validation\n","Epoch 10/100: 100% 2/2 [00:05<00:00, 2.67s/it, val_loss=0.059]\n","Finish Validation\n","Epoch:10/100\n","Total Loss: 0.100 || Val Loss: 0.059 \n","Start Train\n","Epoch 11/100: 100% 22/22 [00:52<00:00, 2.38s/it, loss=0.0975, lr=0.000986]\n","Finish Train\n","Start Validation\n","Epoch 11/100: 100% 2/2 [00:05<00:00, 2.75s/it, val_loss=0.0543]\n","Finish Validation\n","Epoch:11/100\n","Total Loss: 0.097 || Val Loss: 0.054 \n","Start Train\n","Epoch 12/100: 100% 22/22 [00:51<00:00, 2.35s/it, loss=0.0905, 
lr=0.000982]\n","Finish Train\n","Start Validation\n","Epoch 12/100: 100% 2/2 [00:05<00:00, 2.55s/it, val_loss=0.0506]\n","Finish Validation\n","Epoch:12/100\n","Total Loss: 0.091 || Val Loss: 0.051 \n","Start Train\n","Epoch 13/100: 100% 22/22 [00:51<00:00, 2.34s/it, loss=0.0867, lr=0.000977]\n","Finish Train\n","Start Validation\n","Epoch 13/100: 100% 2/2 [00:04<00:00, 2.50s/it, val_loss=0.0505]\n","Finish Validation\n","Epoch:13/100\n","Total Loss: 0.087 || Val Loss: 0.050 \n","Start Train\n","Epoch 14/100: 100% 22/22 [00:50<00:00, 2.31s/it, loss=0.092, lr=0.000971]\n","Finish Train\n","Start Validation\n","Epoch 14/100: 100% 2/2 [00:04<00:00, 2.25s/it, val_loss=0.046]\n","Finish Validation\n","Epoch:14/100\n","Total Loss: 0.092 || Val Loss: 0.046 \n","Start Train\n","Epoch 15/100: 100% 22/22 [00:50<00:00, 2.32s/it, loss=0.0814, lr=0.000965]\n","Finish Train\n","Start Validation\n","Epoch 15/100: 100% 2/2 [00:05<00:00, 2.54s/it, val_loss=0.0426]\n","Finish Validation\n","Epoch:15/100\n","Total Loss: 0.081 || Val Loss: 0.043 \n","Start Train\n","Epoch 16/100: 100% 22/22 [00:50<00:00, 2.28s/it, loss=0.0829, lr=0.000959]\n","Finish Train\n","Start Validation\n","Epoch 16/100: 100% 2/2 [00:05<00:00, 2.50s/it, val_loss=0.0424]\n","Finish Validation\n","Epoch:16/100\n","Total Loss: 0.083 || Val Loss: 0.042 \n","Start Train\n","Epoch 17/100: 100% 22/22 [00:51<00:00, 2.33s/it, loss=0.0779, lr=0.000952]\n","Finish Train\n","Start Validation\n","Epoch 17/100: 100% 2/2 [00:05<00:00, 2.55s/it, val_loss=0.0402]\n","Finish Validation\n","Epoch:17/100\n","Total Loss: 0.078 || Val Loss: 0.040 \n","Start Train\n","Epoch 18/100: 100% 22/22 [00:49<00:00, 2.27s/it, loss=0.0754, lr=0.000945]\n","Finish Train\n","Start Validation\n","Epoch 18/100: 100% 2/2 [00:04<00:00, 2.38s/it, val_loss=0.0404]\n","Finish Validation\n","Epoch:18/100\n","Total Loss: 0.075 || Val Loss: 0.040 \n","Start Train\n","Epoch 19/100: 100% 22/22 [00:50<00:00, 2.31s/it, loss=0.0747, lr=0.000936]\n","Finish 
Train\n","Start Validation\n","Epoch 19/100: 100% 2/2 [00:05<00:00, 2.61s/it, val_loss=0.0357]\n","Finish Validation\n","Epoch:19/100\n","Total Loss: 0.075 || Val Loss: 0.036 \n","Start Train\n","Epoch 20/100: 100% 22/22 [00:50<00:00, 2.29s/it, loss=0.0707, lr=0.000928]\n","Finish Train\n","Start Validation\n","Epoch 20/100: 100% 2/2 [00:05<00:00, 2.63s/it, val_loss=0.038]\n","Finish Validation\n","Epoch:20/100\n","Total Loss: 0.071 || Val Loss: 0.038 \n","Start Train\n","Epoch 21/100: 100% 22/22 [00:50<00:00, 2.28s/it, loss=0.0686, lr=0.000919]\n","Finish Train\n","Start Validation\n","Epoch 21/100: 100% 2/2 [00:05<00:00, 2.69s/it, val_loss=0.0397]\n","Finish Validation\n","Epoch:21/100\n","Total Loss: 0.069 || Val Loss: 0.040 \n","Start Train\n","Epoch 22/100: 100% 22/22 [00:51<00:00, 2.36s/it, loss=0.058, lr=0.000909]\n","Finish Train\n","Start Validation\n","Epoch 22/100: 100% 2/2 [00:05<00:00, 2.53s/it, val_loss=0.0355]\n","Finish Validation\n","Epoch:22/100\n","Total Loss: 0.058 || Val Loss: 0.035 \n","Start Train\n","Epoch 23/100: 100% 22/22 [00:50<00:00, 2.32s/it, loss=0.07, lr=0.000899]\n","Finish Train\n","Start Validation\n","Epoch 23/100: 100% 2/2 [00:05<00:00, 2.72s/it, val_loss=0.032]\n","Finish Validation\n","Epoch:23/100\n","Total Loss: 0.070 || Val Loss: 0.032 \n","Start Train\n","Epoch 24/100: 100% 22/22 [00:50<00:00, 2.32s/it, loss=0.0646, lr=0.000889]\n","Finish Train\n","Start Validation\n","Epoch 24/100: 100% 2/2 [00:05<00:00, 2.64s/it, val_loss=0.03]\n","Finish Validation\n","Epoch:24/100\n","Total Loss: 0.065 || Val Loss: 0.030 \n","Start Train\n","Epoch 25/100: 100% 22/22 [00:49<00:00, 2.27s/it, loss=0.0636, lr=0.000878]\n","Finish Train\n","Start Validation\n","Epoch 25/100: 100% 2/2 [00:04<00:00, 2.16s/it, val_loss=0.0325]\n","Finish Validation\n","Epoch:25/100\n","Total Loss: 0.064 || Val Loss: 0.033 \n","Start Train\n","Epoch 26/100: 100% 22/22 [00:50<00:00, 2.28s/it, loss=0.064, lr=0.000867]\n","Finish Train\n","Start 
Validation\n","Epoch 26/100: 100% 2/2 [00:05<00:00, 2.70s/it, val_loss=0.033]\n","Finish Validation\n","Epoch:26/100\n","Total Loss: 0.064 || Val Loss: 0.033 \n","Start Train\n","Epoch 27/100: 100% 22/22 [00:49<00:00, 2.27s/it, loss=0.0621, lr=0.000855]\n","Finish Train\n","Start Validation\n","Epoch 27/100: 100% 2/2 [00:05<00:00, 2.54s/it, val_loss=0.0329]\n","Finish Validation\n","Epoch:27/100\n","Total Loss: 0.062 || Val Loss: 0.033 \n","Start Train\n","Epoch 28/100: 100% 22/22 [00:50<00:00, 2.31s/it, loss=0.0703, lr=0.000843]\n","Finish Train\n","Start Validation\n","Epoch 28/100: 100% 2/2 [00:05<00:00, 2.67s/it, val_loss=0.0291]\n","Finish Validation\n","Epoch:28/100\n","Total Loss: 0.070 || Val Loss: 0.029 \n","Start Train\n","Epoch 29/100: 100% 22/22 [00:51<00:00, 2.34s/it, loss=0.0564, lr=0.00083]\n","Finish Train\n","Start Validation\n","Epoch 29/100: 100% 2/2 [00:04<00:00, 2.48s/it, val_loss=0.0296]\n","Finish Validation\n","Epoch:29/100\n","Total Loss: 0.056 || Val Loss: 0.030 \n","Start Train\n","Epoch 30/100: 100% 22/22 [00:49<00:00, 2.27s/it, loss=0.0639, lr=0.000817]\n","Finish Train\n","Start Validation\n","Epoch 30/100: 100% 2/2 [00:04<00:00, 2.41s/it, val_loss=0.034]\n","Finish Validation\n","Epoch:30/100\n","Total Loss: 0.064 || Val Loss: 0.034 \n","Start Train\n","Epoch 31/100: 100% 22/22 [00:50<00:00, 2.31s/it, loss=0.0578, lr=0.000804]\n","Finish Train\n","Start Validation\n","Epoch 31/100: 100% 2/2 [00:05<00:00, 2.54s/it, val_loss=0.0297]\n","Finish Validation\n","Epoch:31/100\n","Total Loss: 0.058 || Val Loss: 0.030 \n","Start Train\n","Epoch 32/100: 100% 22/22 [00:50<00:00, 2.27s/it, loss=0.0641, lr=0.00079]\n","Finish Train\n","Start Validation\n","Epoch 32/100: 100% 2/2 [00:05<00:00, 2.65s/it, val_loss=0.0283]\n","Finish Validation\n","Epoch:32/100\n","Total Loss: 0.064 || Val Loss: 0.028 \n","Start Train\n","Epoch 33/100: 100% 22/22 [00:51<00:00, 2.32s/it, loss=0.0605, lr=0.000776]\n","Finish Train\n","Start Validation\n","Epoch 33/100: 
100% 2/2 [00:04<00:00, 2.32s/it, val_loss=0.0276]\n","Finish Validation\n","Epoch:33/100\n","Total Loss: 0.060 || Val Loss: 0.028 \n","Start Train\n","Epoch 34/100: 100% 22/22 [00:50<00:00, 2.30s/it, loss=0.0463, lr=0.000762]\n","Finish Train\n","Start Validation\n","Epoch 34/100: 100% 2/2 [00:04<00:00, 2.43s/it, val_loss=0.0284]\n","Finish Validation\n","Epoch:34/100\n","Total Loss: 0.046 || Val Loss: 0.028 \n","Start Train\n","Epoch 35/100: 100% 22/22 [00:51<00:00, 2.34s/it, loss=0.0594, lr=0.000748]\n","Finish Train\n","Start Validation\n","Epoch 35/100: 100% 2/2 [00:05<00:00, 2.51s/it, val_loss=0.0273]\n","Finish Validation\n","Epoch:35/100\n","Total Loss: 0.059 || Val Loss: 0.027 \n","Start Train\n","Epoch 36/100: 100% 22/22 [00:51<00:00, 2.35s/it, loss=0.0573, lr=0.000733]\n","Finish Train\n","Start Validation\n","Epoch 36/100: 100% 2/2 [00:05<00:00, 2.51s/it, val_loss=0.027]\n","Finish Validation\n","Epoch:36/100\n","Total Loss: 0.057 || Val Loss: 0.027 \n","Start Train\n","Epoch 37/100: 100% 22/22 [00:50<00:00, 2.32s/it, loss=0.0538, lr=0.000718]\n","Finish Train\n","Start Validation\n","Epoch 37/100: 100% 2/2 [00:04<00:00, 2.32s/it, val_loss=0.0265]\n","Finish Validation\n","Epoch:37/100\n","Total Loss: 0.054 || Val Loss: 0.027 \n","Start Train\n","Epoch 38/100: 100% 22/22 [00:49<00:00, 2.26s/it, loss=0.0536, lr=0.000702]\n","Finish Train\n","Start Validation\n","Epoch 38/100: 100% 2/2 [00:05<00:00, 2.57s/it, val_loss=0.0278]\n","Finish Validation\n","Epoch:38/100\n","Total Loss: 0.054 || Val Loss: 0.028 \n","Start Train\n","Epoch 39/100: 100% 22/22 [00:50<00:00, 2.30s/it, loss=0.0535, lr=0.000687]\n","Finish Train\n","Start Validation\n","Epoch 39/100: 100% 2/2 [00:04<00:00, 2.36s/it, val_loss=0.0278]\n","Finish Validation\n","Epoch:39/100\n","Total Loss: 0.053 || Val Loss: 0.028 \n","Start Train\n","Epoch 40/100: 100% 22/22 [00:50<00:00, 2.30s/it, loss=0.0558, lr=0.000671]\n","Finish Train\n","Start Validation\n","Epoch 40/100: 100% 2/2 [00:04<00:00, 
2.34s/it, val_loss=0.0257]\n","Finish Validation\n","Epoch:40/100\n","Total Loss: 0.056 || Val Loss: 0.026 \n","Start Train\n","Epoch 41/100: 100% 22/22 [00:50<00:00, 2.29s/it, loss=0.0562, lr=0.000655]\n","Finish Train\n","Start Validation\n","Epoch 41/100: 100% 2/2 [00:04<00:00, 2.46s/it, val_loss=0.0257]\n","Finish Validation\n","Epoch:41/100\n","Total Loss: 0.056 || Val Loss: 0.026 \n","Start Train\n","Epoch 42/100: 100% 22/22 [00:50<00:00, 2.29s/it, loss=0.0526, lr=0.000639]\n","Finish Train\n","Start Validation\n","Epoch 42/100: 100% 2/2 [00:04<00:00, 2.43s/it, val_loss=0.0264]\n","Finish Validation\n","Epoch:42/100\n","Total Loss: 0.053 || Val Loss: 0.026 \n","Start Train\n","Epoch 43/100: 100% 22/22 [00:51<00:00, 2.36s/it, loss=0.046, lr=0.000622]\n","Finish Train\n","Start Validation\n","Epoch 43/100: 100% 2/2 [00:05<00:00, 2.54s/it, val_loss=0.0246]\n","Finish Validation\n","Epoch:43/100\n","Total Loss: 0.046 || Val Loss: 0.025 \n","Start Train\n","Epoch 44/100: 100% 22/22 [00:50<00:00, 2.27s/it, loss=0.0422, lr=0.000606]\n","Finish Train\n","Start Validation\n","Epoch 44/100: 100% 2/2 [00:05<00:00, 2.59s/it, val_loss=0.0264]\n","Finish Validation\n","Epoch:44/100\n","Total Loss: 0.042 || Val Loss: 0.026 \n","Start Train\n","Epoch 45/100: 100% 22/22 [00:50<00:00, 2.29s/it, loss=0.0491, lr=0.000589]\n","Finish Train\n","Start Validation\n","Epoch 45/100: 100% 2/2 [00:04<00:00, 2.39s/it, val_loss=0.0257]\n","Finish Validation\n","Epoch:45/100\n","Total Loss: 0.049 || Val Loss: 0.026 \n","Start Train\n","Epoch 46/100: 100% 22/22 [00:49<00:00, 2.26s/it, loss=0.0465, lr=0.000572]\n","Finish Train\n","Start Validation\n","Epoch 46/100: 100% 2/2 [00:05<00:00, 2.68s/it, val_loss=0.0268]\n","Finish Validation\n","Epoch:46/100\n","Total Loss: 0.046 || Val Loss: 0.027 \n","Start Train\n","Epoch 47/100: 100% 22/22 [00:50<00:00, 2.29s/it, loss=0.0474, lr=0.000556]\n","Finish Train\n","Start Validation\n","Epoch 47/100: 100% 2/2 [00:04<00:00, 2.47s/it, 
val_loss=0.0241]\n","Finish Validation\n","Epoch:47/100\n","Total Loss: 0.047 || Val Loss: 0.024 \n","Start Train\n","Epoch 48/100: 100% 22/22 [00:50<00:00, 2.31s/it, loss=0.046, lr=0.000539]\n","Finish Train\n","Start Validation\n","Epoch 48/100: 100% 2/2 [00:04<00:00, 2.43s/it, val_loss=0.0236]\n","Finish Validation\n","Epoch:48/100\n","Total Loss: 0.046 || Val Loss: 0.024 \n","Start Train\n","Epoch 49/100: 100% 22/22 [00:50<00:00, 2.30s/it, loss=0.0498, lr=0.000522]\n","Finish Train\n","Start Validation\n","Epoch 49/100: 100% 2/2 [00:04<00:00, 2.44s/it, val_loss=0.0244]\n","Finish Validation\n","Epoch:49/100\n","Total Loss: 0.050 || Val Loss: 0.024 \n","Start Train\n","Epoch 50/100: 100% 22/22 [00:49<00:00, 2.26s/it, loss=0.0539, lr=0.000505]\n","Finish Train\n","Start Validation\n","Epoch 50/100: 100% 2/2 [00:04<00:00, 2.39s/it, val_loss=0.0246]\n","Finish Validation\n","Epoch:50/100\n","Total Loss: 0.054 || Val Loss: 0.025 \n","Start Train\n","Epoch 51/100: 100% 45/45 [00:56<00:00, 1.26s/it, loss=0.0478, lr=0.000488]\n","Finish Train\n","Start Validation\n","Epoch 51/100: 100% 5/5 [00:06<00:00, 1.35s/it, val_loss=0.031]\n","Finish Validation\n","Epoch:51/100\n","Total Loss: 0.048 || Val Loss: 0.031 \n","Start Train\n","Epoch 52/100: 100% 45/45 [00:55<00:00, 1.24s/it, loss=0.0517, lr=0.000471]\n","Finish Train\n","Start Validation\n","Epoch 52/100: 100% 5/5 [00:06<00:00, 1.26s/it, val_loss=0.0264]\n","Finish Validation\n","Epoch:52/100\n","Total Loss: 0.052 || Val Loss: 0.026 \n","Start Train\n","Epoch 53/100: 100% 45/45 [00:55<00:00, 1.22s/it, loss=0.053, lr=0.000454]\n","Finish Train\n","Start Validation\n","Epoch 53/100: 100% 5/5 [00:06<00:00, 1.24s/it, val_loss=0.0247]\n","Finish Validation\n","Epoch:53/100\n","Total Loss: 0.053 || Val Loss: 0.025 \n","Start Train\n","Epoch 54/100: 100% 45/45 [00:57<00:00, 1.27s/it, loss=0.0476, lr=0.000438]\n","Finish Train\n","Start Validation\n","Epoch 54/100: 100% 5/5 [00:05<00:00, 1.20s/it, val_loss=0.0233]\n","Finish 
Validation\n","Epoch:54/100\n","Total Loss: 0.048 || Val Loss: 0.023 \n","Start Train\n","Epoch 55/100: 100% 45/45 [00:55<00:00, 1.23s/it, loss=0.0372, lr=0.000421]\n","Finish Train\n","Start Validation\n","Epoch 55/100: 100% 5/5 [00:06<00:00, 1.20s/it, val_loss=0.0196]\n","Finish Validation\n","Epoch:55/100\n","Total Loss: 0.037 || Val Loss: 0.020 \n","Start Train\n","Epoch 56/100: 100% 45/45 [00:55<00:00, 1.23s/it, loss=0.0385, lr=0.000404]\n","Finish Train\n","Start Validation\n","Epoch 56/100: 100% 5/5 [00:05<00:00, 1.19s/it, val_loss=0.0212]\n","Finish Validation\n","Epoch:56/100\n","Total Loss: 0.039 || Val Loss: 0.021 \n","Start Train\n","Epoch 57/100: 100% 45/45 [00:55<00:00, 1.23s/it, loss=0.038, lr=0.000388]\n","Finish Train\n","Start Validation\n","Epoch 57/100: 100% 5/5 [00:06<00:00, 1.24s/it, val_loss=0.0197]\n","Finish Validation\n","Epoch:57/100\n","Total Loss: 0.038 || Val Loss: 0.020 \n","Start Train\n","Epoch 58/100: 100% 45/45 [00:55<00:00, 1.22s/it, loss=0.0402, lr=0.000371]\n","Finish Train\n","Start Validation\n","Epoch 58/100: 100% 5/5 [00:05<00:00, 1.11s/it, val_loss=0.0187]\n","Finish Validation\n","Epoch:58/100\n","Total Loss: 0.040 || Val Loss: 0.019 \n","Start Train\n","Epoch 59/100: 100% 45/45 [00:55<00:00, 1.23s/it, loss=0.0399, lr=0.000355]\n","Finish Train\n","Start Validation\n","Epoch 59/100: 100% 5/5 [00:05<00:00, 1.18s/it, val_loss=0.0189]\n","Finish Validation\n","Epoch:59/100\n","Total Loss: 0.040 || Val Loss: 0.019 \n","Start Train\n","Epoch 60/100: 100% 45/45 [00:55<00:00, 1.23s/it, loss=0.0334, lr=0.000339]\n","Finish Train\n","Start Validation\n","Epoch 60/100: 100% 5/5 [00:06<00:00, 1.25s/it, val_loss=0.0197]\n","Finish Validation\n","Epoch:60/100\n","Total Loss: 0.033 || Val Loss: 0.020 \n","Start Train\n","Epoch 61/100: 100% 45/45 [00:55<00:00, 1.24s/it, loss=0.0339, lr=0.000323]\n","Finish Train\n","Start Validation\n","Epoch 61/100: 100% 5/5 [00:06<00:00, 1.21s/it, val_loss=0.019]\n","Finish 
Validation\n","Epoch:61/100\n","Total Loss: 0.034 || Val Loss: 0.019 \n","Start Train\n","Epoch 62/100: 100% 45/45 [00:54<00:00, 1.22s/it, loss=0.0332, lr=0.000308]\n","Finish Train\n","Start Validation\n","Epoch 62/100: 100% 5/5 [00:06<00:00, 1.23s/it, val_loss=0.0162]\n","Finish Validation\n","Epoch:62/100\n","Total Loss: 0.033 || Val Loss: 0.016 \n","Start Train\n","Epoch 63/100: 100% 45/45 [00:54<00:00, 1.22s/it, loss=0.0335, lr=0.000292]\n","Finish Train\n","Start Validation\n","Epoch 63/100: 100% 5/5 [00:05<00:00, 1.16s/it, val_loss=0.0155]\n","Finish Validation\n","Epoch:63/100\n","Total Loss: 0.034 || Val Loss: 0.015 \n","Start Train\n","Epoch 64/100: 100% 45/45 [00:54<00:00, 1.22s/it, loss=0.0342, lr=0.000277]\n","Finish Train\n","Start Validation\n","Epoch 64/100: 100% 5/5 [00:05<00:00, 1.20s/it, val_loss=0.0154]\n","Finish Validation\n","Epoch:64/100\n","Total Loss: 0.034 || Val Loss: 0.015 \n","Start Train\n","Epoch 65/100: 100% 45/45 [00:56<00:00, 1.26s/it, loss=0.0375, lr=0.000262]\n","Finish Train\n","Start Validation\n","Epoch 65/100: 100% 5/5 [00:05<00:00, 1.18s/it, val_loss=0.0188]\n","Finish Validation\n","Epoch:65/100\n","Total Loss: 0.037 || Val Loss: 0.019 \n","Start Train\n","Epoch 66/100: 100% 45/45 [00:54<00:00, 1.22s/it, loss=0.0334, lr=0.000248]\n","Finish Train\n","Start Validation\n","Epoch 66/100: 100% 5/5 [00:06<00:00, 1.22s/it, val_loss=0.0158]\n","Finish Validation\n","Epoch:66/100\n","Total Loss: 0.033 || Val Loss: 0.016 \n","Start Train\n","Epoch 67/100: 100% 45/45 [00:56<00:00, 1.25s/it, loss=0.0325, lr=0.000234]\n","Finish Train\n","Start Validation\n","Epoch 67/100: 100% 5/5 [00:06<00:00, 1.22s/it, val_loss=0.0158]\n","Finish Validation\n","Epoch:67/100\n","Total Loss: 0.033 || Val Loss: 0.016 \n","Start Train\n","Epoch 68/100: 100% 45/45 [00:54<00:00, 1.22s/it, loss=0.0346, lr=0.00022]\n","Finish Train\n","Start Validation\n","Epoch 68/100: 100% 5/5 [00:06<00:00, 1.21s/it, val_loss=0.0178]\n","Finish 
Validation\n","Epoch:68/100\n","Total Loss: 0.035 || Val Loss: 0.018 \n","Start Train\n","Epoch 69/100: 100% 45/45 [00:57<00:00, 1.27s/it, loss=0.0269, lr=0.000206]\n","Finish Train\n","Start Validation\n","Epoch 69/100: 100% 5/5 [00:06<00:00, 1.23s/it, val_loss=0.016]\n","Finish Validation\n","Epoch:69/100\n","Total Loss: 0.027 || Val Loss: 0.016 \n","Start Train\n","Epoch 70/100: 100% 45/45 [00:55<00:00, 1.24s/it, loss=0.029, lr=0.000193]\n","Finish Train\n","Start Validation\n","Epoch 70/100: 100% 5/5 [00:05<00:00, 1.19s/it, val_loss=0.0145]\n","Finish Validation\n","Epoch:70/100\n","Total Loss: 0.029 || Val Loss: 0.015 \n","Start Train\n","Epoch 71/100: 100% 45/45 [00:56<00:00, 1.25s/it, loss=0.0254, lr=0.00018]\n","Finish Train\n","Start Validation\n","Epoch 71/100: 100% 5/5 [00:06<00:00, 1.23s/it, val_loss=0.0147]\n","Finish Validation\n","Epoch:71/100\n","Total Loss: 0.025 || Val Loss: 0.015 \n","Start Train\n","Epoch 72/100: 100% 45/45 [00:56<00:00, 1.25s/it, loss=0.0238, lr=0.000167]\n","Finish Train\n","Start Validation\n","Epoch 72/100: 100% 5/5 [00:06<00:00, 1.29s/it, val_loss=0.0138]\n","Finish Validation\n","Epoch:72/100\n","Total Loss: 0.024 || Val Loss: 0.014 \n","Start Train\n","Epoch 73/100: 100% 45/45 [00:55<00:00, 1.24s/it, loss=0.026, lr=0.000155]\n","Finish Train\n","Start Validation\n","Epoch 73/100: 100% 5/5 [00:06<00:00, 1.25s/it, val_loss=0.0141]\n","Finish Validation\n","Epoch:73/100\n","Total Loss: 0.026 || Val Loss: 0.014 \n","Start Train\n","Epoch 74/100: 100% 45/45 [00:56<00:00, 1.27s/it, loss=0.0201, lr=0.000143]\n","Finish Train\n","Start Validation\n","Epoch 74/100: 100% 5/5 [00:06<00:00, 1.23s/it, val_loss=0.0136]\n","Finish Validation\n","Epoch:74/100\n","Total Loss: 0.020 || Val Loss: 0.014 \n","Start Train\n","Epoch 75/100: 100% 45/45 [00:56<00:00, 1.26s/it, loss=0.0231, lr=0.000132]\n","Finish Train\n","Start Validation\n","Epoch 75/100: 100% 5/5 [00:06<00:00, 1.26s/it, val_loss=0.0139]\n","Finish 
Validation\n","Epoch:75/100\n","Total Loss: 0.023 || Val Loss: 0.014 \n","Start Train\n","Epoch 76/100: 100% 45/45 [00:56<00:00, 1.26s/it, loss=0.0286, lr=0.000121]\n","Finish Train\n","Start Validation\n","Epoch 76/100: 100% 5/5 [00:06<00:00, 1.21s/it, val_loss=0.0136]\n","Finish Validation\n","Epoch:76/100\n","Total Loss: 0.029 || Val Loss: 0.014 \n","Start Train\n","Epoch 77/100: 100% 45/45 [00:55<00:00, 1.23s/it, loss=0.023, lr=0.000111]\n","Finish Train\n","Start Validation\n","Epoch 77/100: 100% 5/5 [00:06<00:00, 1.21s/it, val_loss=0.0144]\n","Finish Validation\n","Epoch:77/100\n","Total Loss: 0.023 || Val Loss: 0.014 \n","Start Train\n","Epoch 78/100: 100% 45/45 [00:55<00:00, 1.23s/it, loss=0.0236, lr=0.000101]\n","Finish Train\n","Start Validation\n","Epoch 78/100: 100% 5/5 [00:06<00:00, 1.24s/it, val_loss=0.0144]\n","Finish Validation\n","Epoch:78/100\n","Total Loss: 0.024 || Val Loss: 0.014 \n","Start Train\n","Epoch 79/100: 100% 45/45 [00:56<00:00, 1.25s/it, loss=0.0247, lr=9.11e-5]\n","Finish Train\n","Start Validation\n","Epoch 79/100: 100% 5/5 [00:05<00:00, 1.19s/it, val_loss=0.0134]\n","Finish Validation\n","Epoch:79/100\n","Total Loss: 0.025 || Val Loss: 0.013 \n","Start Train\n","Epoch 80/100: 100% 45/45 [00:55<00:00, 1.24s/it, loss=0.0225, lr=8.21e-5]\n","Finish Train\n","Start Validation\n","Epoch 80/100: 100% 5/5 [00:05<00:00, 1.17s/it, val_loss=0.0131]\n","Finish Validation\n","Epoch:80/100\n","Total Loss: 0.022 || Val Loss: 0.013 \n","Start Train\n","Epoch 81/100: 100% 45/45 [00:55<00:00, 1.24s/it, loss=0.0235, lr=7.35e-5]\n","Finish Train\n","Start Validation\n","Epoch 81/100: 100% 5/5 [00:05<00:00, 1.19s/it, val_loss=0.0134]\n","Finish Validation\n","Epoch:81/100\n","Total Loss: 0.023 || Val Loss: 0.013 \n","Start Train\n","Epoch 82/100: 100% 45/45 [00:55<00:00, 1.24s/it, loss=0.0251, lr=6.55e-5]\n","Finish Train\n","Start Validation\n","Epoch 82/100: 100% 5/5 [00:05<00:00, 1.18s/it, val_loss=0.0137]\n","Finish 
Validation\n","Epoch:82/100\n","Total Loss: 0.025 || Val Loss: 0.014 \n","Start Train\n","Epoch 83/100: 100% 45/45 [00:55<00:00, 1.24s/it, loss=0.0225, lr=5.8e-5]\n","Finish Train\n","Start Validation\n","Epoch 83/100: 100% 5/5 [00:05<00:00, 1.18s/it, val_loss=0.0134]\n","Finish Validation\n","Epoch:83/100\n","Total Loss: 0.023 || Val Loss: 0.013 \n","Start Train\n","Epoch 84/100: 100% 45/45 [00:56<00:00, 1.25s/it, loss=0.0248, lr=5.1e-5]\n","Finish Train\n","Start Validation\n","Epoch 84/100: 100% 5/5 [00:06<00:00, 1.21s/it, val_loss=0.0133]\n","Finish Validation\n","Epoch:84/100\n","Total Loss: 0.025 || Val Loss: 0.013 \n","Start Train\n","Epoch 85/100: 100% 45/45 [00:55<00:00, 1.23s/it, loss=0.0216, lr=4.45e-5]\n","Finish Train\n","Start Validation\n","Epoch 85/100: 100% 5/5 [00:05<00:00, 1.16s/it, val_loss=0.0135]\n","Finish Validation\n","Epoch:85/100\n","Total Loss: 0.022 || Val Loss: 0.013 \n","Start Train\n","Epoch 86/100: 100% 45/45 [00:56<00:00, 1.26s/it, loss=0.0226, lr=3.86e-5]\n","Finish Train\n","Start Validation\n","Epoch 86/100: 100% 5/5 [00:05<00:00, 1.18s/it, val_loss=0.0128]\n","Finish Validation\n","Epoch:86/100\n","Total Loss: 0.023 || Val Loss: 0.013 \n","Start Train\n","Epoch 87/100: 100% 45/45 [00:55<00:00, 1.23s/it, loss=0.0222, lr=3.32e-5]\n","Finish Train\n","Start Validation\n","Epoch 87/100: 100% 5/5 [00:06<00:00, 1.25s/it, val_loss=0.0126]\n","Finish Validation\n","Epoch:87/100\n","Total Loss: 0.022 || Val Loss: 0.013 \n","Start Train\n","Epoch 88/100: 100% 45/45 [00:56<00:00, 1.26s/it, loss=0.0197, lr=2.84e-5]\n","Finish Train\n","Start Validation\n","Epoch 88/100: 100% 5/5 [00:06<00:00, 1.23s/it, val_loss=0.0128]\n","Finish Validation\n","Epoch:88/100\n","Total Loss: 0.020 || Val Loss: 0.013 \n","Start Train\n","Epoch 89/100: 100% 45/45 [00:56<00:00, 1.25s/it, loss=0.0188, lr=2.41e-5]\n","Finish Train\n","Start Validation\n","Epoch 89/100: 100% 5/5 [00:05<00:00, 1.18s/it, val_loss=0.0129]\n","Finish 
Validation\n","Epoch:89/100\n","Total Loss: 0.019 || Val Loss: 0.013 \n","Start Train\n","Epoch 90/100: 100% 45/45 [00:56<00:00, 1.25s/it, loss=0.0208, lr=2.04e-5]\n","Finish Train\n","Start Validation\n","Epoch 90/100: 100% 5/5 [00:05<00:00, 1.18s/it, val_loss=0.0128]\n","Finish Validation\n","Epoch:90/100\n","Total Loss: 0.021 || Val Loss: 0.013 \n","Start Train\n","Epoch 91/100: 100% 45/45 [00:56<00:00, 1.26s/it, loss=0.0214, lr=1.72e-5]\n","Finish Train\n","Start Validation\n","Epoch 91/100: 100% 5/5 [00:05<00:00, 1.20s/it, val_loss=0.0129]\n","Finish Validation\n","Epoch:91/100\n","Total Loss: 0.021 || Val Loss: 0.013 \n","Start Train\n","Epoch 92/100: 100% 45/45 [00:55<00:00, 1.24s/it, loss=0.0196, lr=1.46e-5]\n","Finish Train\n","Start Validation\n","Epoch 92/100: 100% 5/5 [00:06<00:00, 1.22s/it, val_loss=0.0129]\n","Finish Validation\n","Epoch:92/100\n","Total Loss: 0.020 || Val Loss: 0.013 \n","Start Train\n","Epoch 93/100: 100% 45/45 [00:56<00:00, 1.25s/it, loss=0.019, lr=1.26e-5]\n","Finish Train\n","Start Validation\n","Epoch 93/100: 100% 5/5 [00:05<00:00, 1.18s/it, val_loss=0.0132]\n","Finish Validation\n","Epoch:93/100\n","Total Loss: 0.019 || Val Loss: 0.013 \n","Start Train\n","Epoch 94/100: 100% 45/45 [00:55<00:00, 1.23s/it, loss=0.0203, lr=1.12e-5]\n","Finish Train\n","Start Validation\n","Epoch 94/100: 100% 5/5 [00:06<00:00, 1.21s/it, val_loss=0.0132]\n","Finish Validation\n","Epoch:94/100\n","Total Loss: 0.020 || Val Loss: 0.013 \n","Start Train\n","Epoch 95/100: 100% 45/45 [00:56<00:00, 1.25s/it, loss=0.0216, lr=1.03e-5]\n","Finish Train\n","Start Validation\n","Epoch 95/100: 100% 5/5 [00:05<00:00, 1.20s/it, val_loss=0.0127]\n","Finish Validation\n","Epoch:95/100\n","Total Loss: 0.022 || Val Loss: 0.013 \n","Start Train\n","Epoch 96/100: 100% 45/45 [00:55<00:00, 1.23s/it, loss=0.0195, lr=1e-5]\n","Finish Train\n","Start Validation\n","Epoch 96/100: 100% 5/5 [00:06<00:00, 1.24s/it, val_loss=0.0129]\n","Finish 
Validation\n","Epoch:96/100\n","Total Loss: 0.019 || Val Loss: 0.013 \n","Start Train\n","Epoch 97/100: 100% 45/45 [00:56<00:00, 1.25s/it, loss=0.0182, lr=1e-5]\n","Finish Train\n","Start Validation\n","Epoch 97/100: 100% 5/5 [00:06<00:00, 1.20s/it, val_loss=0.0128]\n","Finish Validation\n","Epoch:97/100\n","Total Loss: 0.018 || Val Loss: 0.013 \n","Start Train\n","Epoch 98/100: 100% 45/45 [00:56<00:00, 1.25s/it, loss=0.0221, lr=1e-5]\n","Finish Train\n","Start Validation\n","Epoch 98/100: 100% 5/5 [00:06<00:00, 1.25s/it, val_loss=0.0126]\n","Finish Validation\n","Epoch:98/100\n","Total Loss: 0.022 || Val Loss: 0.013 \n","Start Train\n","Epoch 99/100: 100% 45/45 [00:55<00:00, 1.24s/it, loss=0.02, lr=1e-5]\n","Finish Train\n","Start Validation\n","Epoch 99/100: 100% 5/5 [00:05<00:00, 1.19s/it, val_loss=0.0121]\n","Finish Validation\n","Epoch:99/100\n","Total Loss: 0.020 || Val Loss: 0.012 \n","Start Train\n","Epoch 100/100: 100% 45/45 [00:55<00:00, 1.23s/it, loss=0.021, lr=1e-5]\n","Finish Train\n","Start Validation\n","Epoch 100/100: 100% 5/5 [00:06<00:00, 1.23s/it, val_loss=0.0126]\n","Finish Validation\n","Epoch:100/100\n","Total Loss: 0.021 || Val Loss: 0.013 \n"]}],"source":["# Train the model (100 epochs by default)\n","!python train.py --tiny --phi 2 --epochs 100 \\\n"," --weights model_data/yolotiny_CBAM_ep100.pth \\\n"," --freeze --freeze-epochs 50 --freeze-size 64 \\\n"," --batch-size 32 --shape 416 \\\n"," --fp16 --cuda"]},{"cell_type":"markdown","metadata":{"id":"FcfrujCdkX9z"},"source":["### ECA version"]},{"cell_type":"code","execution_count":null,"metadata":{"id":"jaRikD6LkX9z"},"outputs":[],"source":["# Train the model (100 epochs by default)\n","!python train.py --tiny --phi 3 --epochs 100 \\\n"," --weights model_data/yolotiny_ECA_ep100.pth \\\n"," --freeze --freeze-epochs 50 --freeze-size 64 \\\n"," --batch-size 32 --shape 416 \\\n"," --fp16 --cuda"]},{"cell_type":"markdown","metadata":{"id":"ZdUR507vGaq8"},"source":["# Prediction results"]},{"cell_type":"markdown","source":["### 
Video prediction, folder prediction, etc."],"metadata":{"id":"l7tb4OUQnpx-"}},{"cell_type":"code","execution_count":31,"metadata":{"colab":{"base_uri":"https://localhost:8080/"},"executionInfo":{"elapsed":8279,"status":"ok","timestamp":1651112609364,"user":{"displayName":"KaiJun Deng","userId":"04642544504944131029"},"user_tz":-480},"id":"Hb5urvahM-SB","outputId":"bfd5b6de-8a07-4a12-d514-7cd88cc60004"},"outputs":[{"output_type":"stream","name":"stdout","text":["Namespace(cuda=True, mode='dir_predict', phi=0, shape=416, tiny=True, weights='logs/ep100-loss0.026-val_loss0.015.pth')\n","logs/ep100-loss0.026-val_loss0.015.pth model, anchors, and classes loaded.\n"," 0% 0/8 [00:00\n","Get map done.\n"]}],"source":["# # Evaluate on the validation set\n","# !python get_map.py --mode 0 \\\n","# --tiny --phi 1 \\\n","# --weights model_data/yolotiny_SE_ep100.pth \\\n","# --cuda --shape 416\n","# !python get_map.py --tiny --cuda\n","!python get_map.py --tiny --cuda --phi 0 --weights logs/ep100-loss0.026-val_loss0.015.pth"]}],"metadata":{"accelerator":"GPU","colab":{"collapsed_sections":[],"name":"yolov4-gesture-tutorial.ipynb","provenance":[]},"kernelspec":{"display_name":"Python 3","name":"python3"},"language_info":{"codemirror_mode":{"name":"ipython","version":3},"file_extension":".py","mimetype":"text/x-python","name":"python","nbconvert_exporter":"python","pygments_lexer":"ipython3","version":"3.8.13"}},"nbformat":4,"nbformat_minor":0}