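Training mask_rcnn on a custom dataset (working on a local Windows 10 machine with CPU only; the many pitfalls hit along the way are listed with their fixes)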

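Source code: GitHub - junlintianxiatjm/Mask_RCNN-master007, an object detection example based on mask_rcnn, tested on a local Windows 10 machine with CPU only. The pitfalls hit along the way and their fixes are all given.

Follow-up post: "[Tested] How to upgrade MaskRcnn_tf1.x to MaskRcnn_tf2.x and train a custom-dataset model on an RTX 3090" (君临天下tjm's blog, CSDN). Background given there: the present post gets Mask R-CNN training working with TensorFlow 1.15.0 on Windows 10 with CPU only, but CPU training is extremely slow. TF 1.x reportedly supports the GTX 1060 (not tested by the author) but not the newer RTX 3090. From the material the author consulted, the cause appears to be the large differences between TF 1.x and TF 2.x, so an upgrade to TF 2.x is needed before the RTX 3090 can be used. The TF 1.15.0 example described below has been tested by the author and works. Link: https://blog.csdn.net/shanxiderenheni/article/details/123423905

1. Version information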

python 3.6.9

Tensorflow 1.15.0

keras 2.2.5

Pillow 5.3.0 (required; otherwise labelme's json_to_dataset fails)

cv2, i.e. opencv-python (required; used when training the model)

Wrapt

opt_einsum

Gast

scikit-image

IPython

The full package list of the virtual environment:

(py36_maskrcnn_env) C:\Users\DELL>pip list

WARNING: Ignoring invalid distribution -ip (f:\programfiles\anaconda3\envs\py36_maskrcnn_env\lib\site-packages)

Package Version

------------------------ -------------------

absl-py 0.13.0

astor 0.8.1

astunparse 1.6.3

backcall 0.2.0

bleach 1.5.0

cached-property 1.5.2

cachetools 4.2.2

certifi 2021.5.30

chardet 4.0.0

colorama 0.4.4

cycler 0.10.0

dataclasses 0.8

decorator 4.4.2

enum34 1.1.10

flatbuffers 1.12

gast 0.2.2

google-auth 1.32.1

google-auth-oauthlib 0.4.4

google-pasta 0.2.0

grpcio 1.32.0

h5py 2.10.0

html5lib 0.9999999

idna 2.10

imageio 2.9.0

imgviz 1.2.6

importlib-metadata 4.6.0

ipython 7.16.1

ipython-genutils 0.2.0

jedi 0.18.0

Keras 2.2.5

Keras-Applications 1.0.8

keras-nightly 2.5.0.dev2021032900

Keras-Preprocessing 1.1.2

kiwisolver 1.3.1

labelme 4.5.9

Markdown 3.3.4

matplotlib 3.2.2

networkx 2.5.1

numpy 1.19.5

oauthlib 3.1.1

object-detection 0.1

opencv-python 4.5.2.54

opt-einsum 3.3.0

parso 0.8.2

pickleshare 0.7.5

Pillow 5.3.0

pip 21.1.3

prompt-toolkit 3.0.19

protobuf 3.17.3

pyasn1 0.4.8

pyasn1-modules 0.2.8

Pygments 2.9.0

pyparsing 2.4.7

PyQt5 5.15.2

PyQt5-sip 12.9.0

python-dateutil 2.8.1

PyWavelets 1.1.1

PyYAML 5.4.1

QtPy 1.9.0

requests 2.25.1

requests-oauthlib 1.3.0

rsa 4.7.2

scikit-image 0.16.2

scipy 1.4.1

setuptools 52.0.0.post20210125

six 1.15.0

tensorboard 1.15.0

tensorboard-data-server 0.6.1

tensorboard-plugin-wit 1.8.0

tensorflow 1.15.0

tensorflow-estimator 1.15.1

tensorflow-gpu 2.2.0

tensorflow-gpu-estimator 2.2.0

termcolor 1.1.0

tifffile 2020.9.3

traitlets 4.3.3

typing-extensions 3.7.4.3

urllib3 1.26.6

wcwidth 0.2.5

Werkzeug 2.0.1

wheel 0.36.2

wincertstore 0.2

wrapt 1.12.1

zipp 3.4.1

WARNING: Ignoring invalid distribution -ip (f:\programfiles\anaconda3\envs\py36_maskrcnn_env\lib\site-packages)

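2. Modify the source of json_to_dataset.py in labelme: add the YAML-related code from the old version to the new version.

3. Create the folders pic, json, train_data and transform_json, and annotate the images with labelme.

4. Right-click and choose "Run makedir", or run the command "python makedir.py", to generate the four subdirectories.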
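
The content of makedir.py is not shown in this post. As a minimal sketch of what such a script could do, the following creates four subdirectories under train_data. The names pic, json and labelme_json come from step 7 below; cv2_mask is an assumption based on the rename_cv2_mask.py script in step 6, and the function name make_train_dirs is hypothetical.

```python
from pathlib import Path

# Assumed subdirectory names: pic, json and labelme_json appear in step 7,
# cv2_mask is a guess derived from the script name rename_cv2_mask.py.
SUBDIRS = ("pic", "json", "labelme_json", "cv2_mask")


def make_train_dirs(root="train_data"):
    """Create root and its four subdirectories; existing folders are left alone."""
    created = []
    for name in SUBDIRS:
        path = Path(root) / name
        path.mkdir(parents=True, exist_ok=True)
        created.append(path)
    return created


# Example call from the project root:
#   make_train_dirs("train_data")
```

Because exist_ok=True is used, running the script a second time does not fail or delete anything.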

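5. Convert the json files into the mask files the model needs

Reference: "Super detailed! A record of training Mask R-CNN on your own data" (常鸿宇's blog, CSDN)

This step uses labelme's labelme_json_to_dataset command.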
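
labelme_json_to_dataset converts one json file per call, so a folder of annotations needs a loop. The sketch below only builds the command lines, one per file, using the "-o" output option of that command. The function name build_convert_commands and the folder names in the usage comment are assumptions taken from step 3, not the project's actual transform_json.py.

```python
from pathlib import Path


def build_convert_commands(json_dir, out_dir):
    """Return one labelme_json_to_dataset command (as an argument list) per *.json file."""
    commands = []
    for json_path in sorted(Path(json_dir).glob("*.json")):
        # Output folder is named <file stem>_json, e.g. img1.json -> img1_json
        target = Path(out_dir) / (json_path.stem + "_json")
        commands.append(["labelme_json_to_dataset", str(json_path), "-o", str(target)])
    return commands


# To actually run the conversion (requires labelme to be installed):
#   import subprocess
#   for cmd in build_convert_commands("json", "transform_json"):
#       subprocess.run(cmd, check=True)
```

Keeping command construction separate from execution makes it easy to print and inspect the paths first, which helps with the "empty output folder" path problem listed in the troubleshooting section.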

6. Run rename_cv2_mask.py, which copies each label.png and renames it.
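
A minimal sketch of the copy-and-rename step, assuming each converted folder is named <image name>_json and contains a label.png, and that the renamed copies go into a separate mask folder. The function name copy_label_masks and this layout are assumptions for illustration, not the project's actual script.

```python
import shutil
from pathlib import Path


def copy_label_masks(labelme_json_dir, mask_dir):
    """Copy <name>_json/label.png to <mask_dir>/<name>.png and return the new file names."""
    mask_dir = Path(mask_dir)
    mask_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for folder in sorted(Path(labelme_json_dir).iterdir()):
        label = folder / "label.png"
        # Skip anything that is not a converted folder with a label image inside
        if not (folder.is_dir() and folder.name.endswith("_json") and label.is_file()):
            continue
        name = folder.name[: -len("_json")]
        shutil.copyfile(label, mask_dir / (name + ".png"))
        copied.append(name + ".png")
    return copied
```

Naming each mask after its source image lets the training code pair an image with its mask by file name alone.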

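7. Put the contents of pic and json into the corresponding pic and json folders under train_data, and copy every converted folder under transform_json into train_data/labelme_json.

Reference: "Keras-MaskRCNN: training on your own data" (hhhuua's blog, CSDN)

8. Download mask_rcnn_coco.h5 and put it in the project root directory.

9. Add a new file train.py, taking care to adjust the relevant parameters.

Reference: "Keras-MaskRCNN: training on your own data" (hhhuua's blog, CSDN)

Right-click in the workspace and choose "Run 'train'" to start training.

Dataset used: pipeline defects (338 images).

10. Start TensorBoard with the command: tensorboard --logdir=<path to the log directory>.

Reference: "[TensorBoard] Detailed steps for starting tensorboard" (fuqiuai's blog, CSDN)

Command used in cmd: tensorboard --logdir=shapes20210707T1008

Open the TensorBoard interface at: http://localhost:6006/#scalars

11. Test the model: run "Run 'test'".

Reference: "Mask R-CNN tensorflow: training on your own data, from annotation to final training and testing, a complete tutorial that runs on Ubuntu 16.04" (Somafish's blog, CSDN)

If the detection result shows many boxes, the model parameters are not set correctly; several parameter settings may need to be tried.

Then run "Run 'test2'".

Test results on the pipeline dataset.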

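Troubleshooting summary:

1. Converting the json files into the mask files the model needs requires modifying the labelme source, D:\Anaconda3\Lib\site-packages\labelme\cli\json_to_dataset.py

If the conversion returns empty folders, the cause is a path problem.

Solution:

Reference: "Batch json_to_dataset conversion with labelme" (简简单单's blog, CSDN)

Change the json_to_dataset.py source and correct the paths in it.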

2. Problem: UnicodeDecodeError: 'gbk' codec can't decode byte 0xac in position 35: illegal multibyte sequence

Solution:

Reference: "python: 'gbk' codec can't decode byte 0xbe in position 18: illegal multibyte sequence" (腾阳's blog, CSDN)
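
On a Chinese-language Windows system, open() without an encoding argument decodes text with the locale codec (gbk), so a UTF-8 file that contains non-ASCII characters can raise this error. A common remedy, sketched here under the assumption that the file being read is UTF-8 encoded, is to pass the encoding explicitly. The helper name load_json_utf8 is illustrative only.

```python
import json


def load_json_utf8(path):
    """Read a JSON file as UTF-8 regardless of the system's default encoding."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)
```

The same encoding="utf-8" argument applies to any other open() call that reads the annotation or yaml files.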

3. Problem: json_to_dataset.py does not generate the info.yaml file.

Solution:

Reference: "Keras-MaskRCNN: training on your own data" (hhhuua's blog, CSDN)

4. Problem: AttributeError: 'Model' object has no attribute 'metrics_tensors' and AttributeError: module 'tensorflow' has no attribute 'placeholder'. Errors of this kind are caused by mismatched TensorFlow and Keras versions.

Solution:

Reference: "AttributeError: 'Model' object has no attribute 'metrics_tensors'" (mjiansun's column, CSDN)

Tensorflow=1.15.0

Keras=2.2.5

5. Problem: test.py, the script that tests the .h5 model, fails with ValueError: Layer #391 (named "mrcnn_bbox_fc"), weight <tf.Variable 'mrcnn_bbox_fc_1/kernel:0' shape=(1024, 8) dtype=float32_ref> has shape (1024, 8), but the saved weight has shape (1024, 12).

Solution:

Reference: "Solving problems encountered when testing mask rcnn" (ilinda's blog, CSDN)
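
Assuming this project follows the widely used matterport Mask_RCNN implementation, the mrcnn_bbox_fc layer has 4 outputs per class (background included), so its width equals 4 × NUM_CLASSES. Under that assumption the two shapes in the error can be read as class counts, as this small sketch shows; the function name is hypothetical.

```python
def classes_from_bbox_fc_width(width):
    """Infer the class count (background included) from the mrcnn_bbox_fc output width."""
    if width % 4 != 0:
        raise ValueError("bbox head width must be a multiple of 4")
    return width // 4


config_classes = classes_from_bbox_fc_width(8)   # width built from the test config
saved_classes = classes_from_bbox_fc_width(12)   # width stored in the .h5 file
print(config_classes, saved_classes)
```

Read this way, the test configuration declares one class fewer than the weights were trained with, so making NUM_CLASSES in test.py match the value used in train.py is the likely direction of the fix.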

6. Problem: running "Run 'train.py'" to train the model prints this warning: F:\ProgramFiles\anaconda3\envs\py36_maskrcnn_env\lib\site-packages\skimage\transform\_warps.py:830: FutureWarning: Input image dtype is bool. Interpolation is not defined with bool data type. Please set order to 0 or explicitely cast input image to another data type. Starting from version 0.19 a ValueError will be raised instead of this warning.

order = _validate_interpolation_order(image.dtype, order)

Solution:

Downgrade scikit-image to 0.16.2. According to the Stack Overflow answer referenced below, the same warning appeared with version 0.17.2.

pip install -U scikit-image==0.16.2

Reference: "python - Input image dtype is bool. Interpolation is not defined with bool data type" (Stack Overflow)

7. Problem: running train.py to train the model prints this warning:

image_id 4

D:/python-workspace/Mask_RCNN-master2/train.py:89: YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated, as the default Loader is unsafe. Please read https://msg.pyyaml.org/load for full details.

temp = yaml.load(f.read())

Solution:

Reference: "On the YAML update that deprecates yaml.load() and makes old code report YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated" (R_Rick's blog, CSDN)
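
The warning goes away once a loader is named explicitly. A minimal sketch with PyYAML 5.1 or later, showing the two usual options; the YAML text and the label name "crack" are made-up stand-ins for the content of an info.yaml file.

```python
import yaml

# Made-up stand-in for the text read from info.yaml
text = "label_names:\n- _background_\n- crack\n"

# Option 1: keep yaml.load but name the loader explicitly
temp = yaml.load(text, Loader=yaml.FullLoader)

# Option 2: safe_load, which only builds plain Python types
same = yaml.safe_load(text)

print(temp["label_names"])
```

In train.py this would mean changing the line quoted in the warning to pass Loader=yaml.FullLoader, or replacing the call with yaml.safe_load.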

8. Problem: running transform_json.py to convert the json files into the mask files the model needs fails with:

json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

Solution:
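
This message means the parser found no JSON value at the very first character, which typically indicates an empty file or a file that is not JSON at all. A minimal sketch for locating such files before running the conversion; the function name find_bad_json is hypothetical and not part of the project.

```python
import json
from pathlib import Path


def find_bad_json(folder):
    """Return (file name, error message) for every *.json file that fails to parse."""
    bad = []
    for path in sorted(Path(folder).glob("*.json")):
        try:
            with open(path, encoding="utf-8") as f:
                json.load(f)
        except (json.JSONDecodeError, UnicodeDecodeError) as exc:
            bad.append((path.name, str(exc)))
    return bad
```

Any file reported here can be re-exported from labelme or removed from the input folder before the conversion is run again.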

9. Problem: AttributeError: module 'tensorboard.plugins.pr_curve.summary' has no attribute 'pb'

Solution:

Reference: "AttributeError: module 'tensorboard.plugins.pr_curve.summary' has no attribute 'pb'" (简书)

10. Problem: IndexError: boolean index did not match indexed array along dimension 0; dimension is 0 but corresponding boolean dimension is 1

Solution:

11. Problem: module 'tensorflow' has no attribute 'placeholder'

Solution:

Reference: "Correspondence between tensorflow and keras versions" (知乎)

12. Problem: 'Model' object has no attribute 'metrics_tensors'

Solution:

Reference: "'Model' object has no attribute 'metrics_tensors' problem solved" (qq_643582002's blog, CSDN)

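13. How to save a screenshot of the detection result:

Solution: two lines of code need to be added for this. The save directory must exist beforehand, or the code must check for it and create it.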
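
A minimal sketch of the directory check described above, using only the standard library. The helper name, the directory name and the file name are hypothetical; the returned path would then be handed to whatever call writes the image (for instance matplotlib's plt.savefig, assuming the result is drawn with matplotlib).

```python
import os


def prepare_save_path(save_dir, file_name):
    """Create save_dir if it is missing and return the full path of the output file."""
    os.makedirs(save_dir, exist_ok=True)
    return os.path.join(save_dir, file_name)


# Hypothetical use with a matplotlib figure:
#   plt.savefig(prepare_save_path("results", "detect_001.png"))
```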

The result of the run is as follows:

14. Error: AssertionError: len(images) must be equal to BATCH_SIZE.

Solution: in train.py, change the following two parameter values:

GPU_COUNT = 1
IMAGES_PER_GPU = 1
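
Why these two values matter: assuming the project follows the matterport Mask_RCNN code, the batch size is computed as IMAGES_PER_GPU × GPU_COUNT, and detection requires exactly that many images per call. The sketch below imitates this relationship with a simplified stand-in class; SketchConfig and check_batch are illustrative names, not the library's classes.

```python
class SketchConfig:
    """Simplified stand-in for a Mask R-CNN config; not the real class."""
    GPU_COUNT = 1
    IMAGES_PER_GPU = 1

    def __init__(self):
        # Effective batch size is derived from the two settings above
        self.BATCH_SIZE = self.IMAGES_PER_GPU * self.GPU_COUNT


def check_batch(images, config):
    """Imitate the failing check: one call needs exactly BATCH_SIZE images."""
    assert len(images) == config.BATCH_SIZE, "len(images) must be equal to BATCH_SIZE"
    return len(images)


config = SketchConfig()
print(check_batch(["one_image"], config))
```

With both values set to 1, BATCH_SIZE is 1 and a single test image passes the check; with IMAGES_PER_GPU = 2, the same single image would trigger the assertion.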
