Repository: fengzhenHIT/OpenCV-contrib-module-Chinese-Tutorials
Branch: master
Commit: d21869d80572
Files: 79
Total size: 203.0 KB

Directory structure:
gitextract_r6c960p4/

├── .github/
│   └── FUNDING.yml
├── README.md
├── chapter 1/
│   ├── Contrib Modules模块内容介绍.md
│   └── Windows系统中安装扩展模块.md
├── chapter 10/
│   ├── OpenCV中的人脸识别.md
│   ├── w
│   ├── 使用人脸标志 API.md
│   ├── 使用人脸标志AMM.md
│   ├── 使用人脸标志检测进行人脸交换.md
│   ├── 向人脸标志API添加新算法.md
│   ├── 图像中的人脸标志检测.md
│   ├── 视频中的人脸标志检测.md
│   └── 训练面部标志探测器.md
├── chapter 11/
│   ├── 3
│   ├── 使用模糊变换滤波.md
│   ├── 模糊变换理论.md
│   └── 通过模糊变换进行图像修复.md
├── chapter 12/
│   ├── 2
│   ├── 创建、写入和读取数据集.md
│   ├── 建立群组.md
│   └── 读写属性.md
├── chapter 13/
│   ├── 4
│   └── 线特征教程.md
├── chapter 14/
│   ├── 4
│   └── 二维相位图展开.md
├── chapter 15/
│   ├── SFM 模块安装.md
│   ├── e
│   ├── 场景重建.md
│   ├── 导入重建模型.md
│   └── 相机运动估计.md
├── chapter 16/
│   ├── 5
│   ├── 准稠密立体视觉.md
│   └── 生成模板参数文件.md
├── chapter 17/
│   ├── 3
│   ├── 捕捉格雷码.md
│   ├── 捕捉正弦图案.md
│   └── 解码格雷码教程.md
├── chapter 18/
│   ├── 4
│   ├── OpenCV跟踪器介绍.md
│   ├── 使用MultiTracker跟踪.md
│   └── 自定义CN跟踪器.md
├── chapter 19/
│   ├── 3
│   ├── 位姿变换.md
│   ├── 创建3D直方图.md
│   ├── 创建小部件.md
│   ├── 设置物体位姿.md
│   └── 运行viz.md
├── chapter 2/
│   ├── ArUco标记板的检测.md
│   ├── ArUco标记检测.md
│   ├── Aruco模块常见问题.md
│   ├── ChArUco角的检测.md
│   ├── 使用ArUco和ChArUco进行相机标定.md
│   └── 菱形标记检测.md
├── chapter 20/
│   ├── 用于快速边缘检测的结构化森.md
│   ├── 视差图滤波.md
│   └── 训练结构化森林.md
├── chapter 21/
│   ├── 图像修复.md
│   ├── 油画效果.md
│   └── 训练基于学习的白平衡算法.md
├── chapter 3/
│   ├── 3
│   └── 背景分割.md
├── chapter 4/
│   ├── 3
│   ├── 处理引起视错觉的图像.md
│   └── 视网膜视觉和真实世界的视觉.md
├── chapter 5/
│   ├── 5
│   ├── 多相机标定.md
│   └── 广角相机标定.md
├── chapter 6/
│   ├── e
│   ├── 使用Icosphere训练数据.md
│   ├── 分析训练模型.md
│   └── 分类.md
├── chapter 7/
│   ├── 3
│   └── 计算机视觉应用的交互式可视化调试.md
├── chapter 8/
│   └── 使用CNNs进行目标检测.md
└── chapter 9/
    ├── e
    ├── 放大图像：单输出.md
    ├── 放大图像：多输出.md
    ├── 放大视频.md
    └── 超分辨率基准测试.md

================================================
FILE CONTENTS
================================================

================================================
FILE: .github/FUNDING.yml
================================================
# These are supported funding model platforms

github: # Replace with up to 4 GitHub Sponsors-enabled usernames e.g., [user1, user2]
patreon: # Replace with a single Patreon username
open_collective: # Replace with a single Open Collective username
ko_fi: # Replace with a single Ko-fi username
tidelift: # Replace with a single Tidelift platform-name/package-name e.g., npm/babel
community_bridge: # Replace with a single Community Bridge project-name e.g., cloud-foundry
liberapay: # Replace with a single Liberapay username
issuehunt: # Replace with a single IssueHunt username
otechie: # Replace with a single Otechie username
custom: # Replace with up to 4 custom sponsorship URLs e.g., ['link1', 'link2']


================================================
FILE: README.md
================================================
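# Chinese Tutorials for the OpenCV contrib Extension Modules

&nbsp;
<p align="center"><strong>Translated by the 小白学视觉 team</strong></p>
&nbsp;
<p align="center"><img src="https://img-blog.csdnimg.cn/20200301190623909.png" height="200"></img></p>
<p align="center"><strong>Follow the WeChat official account “小白学视觉” for more computer vision learning resources</strong></p>

小白学视觉 is a technical WeChat official account founded in May 2018 by a PhD student at Harbin Institute of Technology. It is aimed at enthusiasts of computer vision and image processing, shares learning resources, study notes, and cutting-edge techniques, and provides a platform for learning and discussing computer vision and image processing.

Topics covered by 小白学视觉 include:  **Getting started with OpenCV &emsp; Learning SLAM &emsp; Recommended books &emsp; Recommended courses &emsp; Programming tips &emsp; Paper walkthroughs &emsp; Emerging research directions &emsp; Job referrals &emsp; Job postings** 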

&nbsp;

- [Introduction](#引言)
- Chapter 1: Introduction to and Installation of the Contrib Modules
    - 1.1 [Installing the Contrib Modules](https://github.com/fengzhenHIT/OpenCV-contrib-module-Chinese-Tutorials/blob/master/chapter%201/Windows%E7%B3%BB%E7%BB%9F%E4%B8%AD%E5%AE%89%E8%A3%85%E6%89%A9%E5%B1%95%E6%A8%A1%E5%9D%97.md)
    - 1.2 [Overview of the Contrib Modules](https://github.com/fengzhenHIT/OpenCV-contrib-module-Chinese-Tutorials/blob/master/chapter%201/Contrib%20Modules%E6%A8%A1%E5%9D%97%E5%86%85%E5%AE%B9%E4%BB%8B%E7%BB%8D.md)
- Chapter 2: ArUco Marker Detection (aruco module)
    - 2.1 [Detection of ArUco Markers](https://github.com/fengzhenHIT/OpenCV-contrib-module-Chinese-Tutorials/blob/master/chapter%202/ArUco%E6%A0%87%E8%AE%B0%E6%A3%80%E6%B5%8B.md)
    - 2.2 [Detection of ArUco Boards](https://github.com/fengzhenHIT/OpenCV-contrib-module-Chinese-Tutorials/blob/master/chapter%202/ArUco%E6%A0%87%E8%AE%B0%E6%9D%BF%E7%9A%84%E6%A3%80%E6%B5%8B.md)
    - 2.3 [Detection of ChArUco Corners](https://github.com/fengzhenHIT/OpenCV-contrib-module-Chinese-Tutorials/blob/master/chapter%202/ChArUco%E8%A7%92%E7%9A%84%E6%A3%80%E6%B5%8B.md)
    - 2.4 [Detection of Diamond Markers](https://github.com/fengzhenHIT/OpenCV-contrib-module-Chinese-Tutorials/blob/master/chapter%202/%E8%8F%B1%E5%BD%A2%E6%A0%87%E8%AE%B0%E6%A3%80%E6%B5%8B.md)
    - 2.5 [Camera Calibration with ArUco and ChArUco](https://github.com/fengzhenHIT/OpenCV-contrib-module-Chinese-Tutorials/blob/master/chapter%202/%E4%BD%BF%E7%94%A8ArUco%E5%92%8CChArUco%E8%BF%9B%E8%A1%8C%E7%9B%B8%E6%9C%BA%E6%A0%87%E5%AE%9A.md)
    - 2.6 [Aruco Module FAQ](https://github.com/fengzhenHIT/OpenCV-contrib-module-Chinese-Tutorials/blob/master/chapter%202/Aruco%E6%A8%A1%E5%9D%97%E5%B8%B8%E8%A7%81%E9%97%AE%E9%A2%98.md)
- Chapter 3: Background Segmentation (bgsegm module)
    - 3.1 [Background Segmentation](https://github.com/fengzhenHIT/OpenCV-contrib-module-Chinese-Tutorials/blob/master/chapter%203/%E8%83%8C%E6%99%AF%E5%88%86%E5%89%B2.md)
- Chapter 4: Biologically Inspired Vision (bioinspired module)
    - 4.1 [Retina and Real-World Vision](https://github.com/fengzhenHIT/OpenCV-contrib-module-Chinese-Tutorials/blob/master/chapter%204/%E8%A7%86%E7%BD%91%E8%86%9C%E8%A7%86%E8%A7%89%E5%92%8C%E7%9C%9F%E5%AE%9E%E4%B8%96%E7%95%8C%E7%9A%84%E8%A7%86%E8%A7%89.md)
    - 4.2 [Processing Images That Cause Optical Illusions](https://github.com/fengzhenHIT/OpenCV-contrib-module-Chinese-Tutorials/blob/master/chapter%204/%E5%A4%84%E7%90%86%E5%BC%95%E8%B5%B7%E8%A7%86%E9%94%99%E8%A7%89%E7%9A%84%E5%9B%BE%E5%83%8F.md)
- Chapter 5: Camera Calibration (ccalib module)
    - 5.1 [Wide-Angle Camera Calibration](https://github.com/fengzhenHIT/OpenCV-contrib-module-Chinese-Tutorials/blob/master/chapter%205/%E5%B9%BF%E8%A7%92%E7%9B%B8%E6%9C%BA%E6%A0%87%E5%AE%9A.md)
    - 5.2 [Multi-Camera Calibration](https://github.com/fengzhenHIT/OpenCV-contrib-module-Chinese-Tutorials/blob/master/chapter%205/%E5%A4%9A%E7%9B%B8%E6%9C%BA%E6%A0%87%E5%AE%9A.md)
- Chapter 6: 3D Object Classification and Pose Estimation (cnn_3dobj module)
    - 6.1 [Training Data Using Icosphere](https://github.com/fengzhenHIT/OpenCV-contrib-module-Chinese-Tutorials/blob/master/chapter%206/%E4%BD%BF%E7%94%A8Icosphere%E8%AE%AD%E7%BB%83%E6%95%B0%E6%8D%AE.md)
    - 6.2 [Classification](https://github.com/fengzhenHIT/OpenCV-contrib-module-Chinese-Tutorials/blob/master/chapter%206/%E5%88%86%E7%B1%BB.md)
    - 6.3 [Analyzing the Trained Model](https://github.com/fengzhenHIT/OpenCV-contrib-module-Chinese-Tutorials/blob/master/chapter%206/%E5%88%86%E6%9E%90%E8%AE%AD%E7%BB%83%E6%A8%A1%E5%9E%8B.md)
- Chapter 7: Visual Debugging (cvv module)
    - 7.1 [Interactive Visual Debugging of Computer Vision Applications](https://github.com/fengzhenHIT/OpenCV-contrib-module-Chinese-Tutorials/blob/master/chapter%207/%E8%AE%A1%E7%AE%97%E6%9C%BA%E8%A7%86%E8%A7%89%E5%BA%94%E7%94%A8%E7%9A%84%E4%BA%A4%E4%BA%92%E5%BC%8F%E5%8F%AF%E8%A7%86%E5%8C%96%E8%B0%83%E8%AF%95.md)
- Chapter 8: Object Detection with CNNs (dnn_objdetect module)
    - 8.1 [Object Detection Using CNNs](https://github.com/fengzhenHIT/OpenCV-contrib-module-Chinese-Tutorials/blob/master/chapter%208/%E4%BD%BF%E7%94%A8CNNs%E8%BF%9B%E8%A1%8C%E7%9B%AE%E6%A0%87%E6%A3%80%E6%B5%8B.md)
- Chapter 9: Image Super-Resolution (dnn_superres module)
    - 9.1 [Upscaling Images: Single Output](https://github.com/fengzhenHIT/OpenCV-contrib-module-Chinese-Tutorials/blob/master/chapter%209/%E6%94%BE%E5%A4%A7%E5%9B%BE%E5%83%8F%EF%BC%9A%E5%8D%95%E8%BE%93%E5%87%BA.md)
    - 9.2 [Upscaling Images: Multiple Outputs](https://github.com/fengzhenHIT/OpenCV-contrib-module-Chinese-Tutorials/blob/master/chapter%209/%E6%94%BE%E5%A4%A7%E5%9B%BE%E5%83%8F%EF%BC%9A%E5%A4%9A%E8%BE%93%E5%87%BA.md)
    - 9.3 [Upscaling Video](https://github.com/fengzhenHIT/OpenCV-contrib-module-Chinese-Tutorials/blob/master/chapter%209/%E6%94%BE%E5%A4%A7%E8%A7%86%E9%A2%91.md)
    - 9.4 [Super-Resolution Benchmarking](https://github.com/fengzhenHIT/OpenCV-contrib-module-Chinese-Tutorials/blob/master/chapter%209/%E8%B6%85%E5%88%86%E8%BE%A8%E7%8E%87%E5%9F%BA%E5%87%86%E6%B5%8B%E8%AF%95.md)
- Chapter 10: Face Recognition (face module)
    - 10.1 [Face Recognition with OpenCV](https://github.com/fengzhenHIT/OpenCV-contrib-module-Chinese-Tutorials/blob/master/chapter%2010/OpenCV%E4%B8%AD%E7%9A%84%E4%BA%BA%E8%84%B8%E8%AF%86%E5%88%AB.md)
    - 10.2 [Facial Landmark Detection in an Image](https://github.com/fengzhenHIT/OpenCV-contrib-module-Chinese-Tutorials/blob/master/chapter%2010/%E5%9B%BE%E5%83%8F%E4%B8%AD%E7%9A%84%E4%BA%BA%E8%84%B8%E6%A0%87%E5%BF%97%E6%A3%80%E6%B5%8B.md)
    - 10.3 [Training a Facial Landmark Detector](https://github.com/fengzhenHIT/OpenCV-contrib-module-Chinese-Tutorials/blob/master/chapter%2010/%E8%AE%AD%E7%BB%83%E9%9D%A2%E9%83%A8%E6%A0%87%E5%BF%97%E6%8E%A2%E6%B5%8B%E5%99%A8.md)
    - 10.4 [Facial Landmark Detection in a Video](https://github.com/fengzhenHIT/OpenCV-contrib-module-Chinese-Tutorials/blob/master/chapter%2010/%E8%A7%86%E9%A2%91%E4%B8%AD%E7%9A%84%E4%BA%BA%E8%84%B8%E6%A0%87%E5%BF%97%E6%A3%80%E6%B5%8B.md)
    - 10.5 [Face Swapping Using Facial Landmark Detection](https://github.com/fengzhenHIT/OpenCV-contrib-module-Chinese-Tutorials/blob/master/chapter%2010/%E4%BD%BF%E7%94%A8%E4%BA%BA%E8%84%B8%E6%A0%87%E5%BF%97%E6%A3%80%E6%B5%8B%E8%BF%9B%E8%A1%8C%E4%BA%BA%E8%84%B8%E4%BA%A4%E6%8D%A2.md)
    - 10.6 [Adding a New Algorithm to the Facemark API](https://github.com/fengzhenHIT/OpenCV-contrib-module-Chinese-Tutorials/blob/master/chapter%2010/%E5%90%91%E4%BA%BA%E8%84%B8%E6%A0%87%E5%BF%97API%E6%B7%BB%E5%8A%A0%E6%96%B0%E7%AE%97%E6%B3%95.md)
    - 10.7 [Using the Facemark API](https://github.com/fengzhenHIT/OpenCV-contrib-module-Chinese-Tutorials/blob/master/chapter%2010/%E4%BD%BF%E7%94%A8%E4%BA%BA%E8%84%B8%E6%A0%87%E5%BF%97%20API.md)
    - 10.8 [Using Facemark AAM](https://github.com/fengzhenHIT/OpenCV-contrib-module-Chinese-Tutorials/blob/master/chapter%2010/%E4%BD%BF%E7%94%A8%E4%BA%BA%E8%84%B8%E6%A0%87%E5%BF%97AMM.md)
- Chapter 11: Fuzzy Image Processing (fuzzy module)
    - 11.1 [Fuzzy Transform Theory](https://github.com/fengzhenHIT/OpenCV-contrib-module-Chinese-Tutorials/blob/master/chapter%2011/%E6%A8%A1%E7%B3%8A%E5%8F%98%E6%8D%A2%E7%90%86%E8%AE%BA.md)
    - 11.2 [Image Inpainting with the Fuzzy Transform](https://github.com/fengzhenHIT/OpenCV-contrib-module-Chinese-Tutorials/blob/master/chapter%2011/%E9%80%9A%E8%BF%87%E6%A8%A1%E7%B3%8A%E5%8F%98%E6%8D%A2%E8%BF%9B%E8%A1%8C%E5%9B%BE%E5%83%8F%E4%BF%AE%E5%A4%8D.md)
    - 11.3 [Filtering with the Fuzzy Transform](https://github.com/fengzhenHIT/OpenCV-contrib-module-Chinese-Tutorials/blob/master/chapter%2011/%E4%BD%BF%E7%94%A8%E6%A8%A1%E7%B3%8A%E5%8F%98%E6%8D%A2%E6%BB%A4%E6%B3%A2.md)
- Chapter 12: Hierarchical Data Format Input and Output (hdf)
    - 12.1 [Creating Groups](https://github.com/fengzhenHIT/OpenCV-contrib-module-Chinese-Tutorials/blob/master/chapter%2012/%E5%BB%BA%E7%AB%8B%E7%BE%A4%E7%BB%84.md)
    - 12.2 [Creating, Writing, and Reading Datasets](https://github.com/fengzhenHIT/OpenCV-contrib-module-Chinese-Tutorials/blob/master/chapter%2012/%E5%88%9B%E5%BB%BA%E3%80%81%E5%86%99%E5%85%A5%E5%92%8C%E8%AF%BB%E5%8F%96%E6%95%B0%E6%8D%AE%E9%9B%86.md)
    - 12.3 [Reading and Writing Attributes](https://github.com/fengzhenHIT/OpenCV-contrib-module-Chinese-Tutorials/blob/master/chapter%2012/%E8%AF%BB%E5%86%99%E5%B1%9E%E6%80%A7.md)
- Chapter 13: Line Features (line_descriptor module)
    - 13.1 [Line Features Tutorial](https://github.com/fengzhenHIT/OpenCV-contrib-module-Chinese-Tutorials/blob/master/chapter%2013/%E7%BA%BF%E7%89%B9%E5%BE%81%E6%95%99%E7%A8%8B.md)
- Chapter 14: Phase Unwrapping (phase_unwrapping)
    - 14.1 [Unwrapping Two-Dimensional Phase Maps](https://github.com/fengzhenHIT/OpenCV-contrib-module-Chinese-Tutorials/blob/master/chapter%2014/%E4%BA%8C%E7%BB%B4%E7%9B%B8%E4%BD%8D%E5%9B%BE%E5%B1%95%E5%BC%80.md)
- Chapter 15: Structure from Motion (sfm module)
    - 15.1 [SFM Module Installation](https://github.com/fengzhenHIT/OpenCV-contrib-module-Chinese-Tutorials/blob/master/chapter%2015/SFM%20%E6%A8%A1%E5%9D%97%E5%AE%89%E8%A3%85.md)
    - 15.2 [Camera Motion Estimation](https://github.com/fengzhenHIT/OpenCV-contrib-module-Chinese-Tutorials/blob/master/chapter%2015/%E7%9B%B8%E6%9C%BA%E8%BF%90%E5%8A%A8%E4%BC%B0%E8%AE%A1.md)
    - 15.3 [Scene Reconstruction](https://github.com/fengzhenHIT/OpenCV-contrib-module-Chinese-Tutorials/blob/master/chapter%2015/%E5%9C%BA%E6%99%AF%E9%87%8D%E5%BB%BA.md)
    - 15.4 [Importing a Reconstruction Model](https://github.com/fengzhenHIT/OpenCV-contrib-module-Chinese-Tutorials/blob/master/chapter%2015/%E5%AF%BC%E5%85%A5%E9%87%8D%E5%BB%BA%E6%A8%A1%E5%9E%8B.md)
- Chapter 16: Quasi-Dense Stereo Matching (stereo module)
    - 16.1 [Quasi-Dense Stereo](https://github.com/fengzhenHIT/OpenCV-contrib-module-Chinese-Tutorials/blob/master/chapter%2016/%E5%87%86%E7%A8%A0%E5%AF%86%E7%AB%8B%E4%BD%93%E8%A7%86%E8%A7%89.md)
    - 16.2 [Generating a Template Parameter File](https://github.com/fengzhenHIT/OpenCV-contrib-module-Chinese-Tutorials/blob/master/chapter%2016/%E7%94%9F%E6%88%90%E6%A8%A1%E6%9D%BF%E5%8F%82%E6%95%B0%E6%96%87%E4%BB%B6.md)
- Chapter 17: Structured Light Tutorials (structured_light module)
    - 17.1 [Capturing Gray Code Patterns](https://github.com/fengzhenHIT/OpenCV-contrib-module-Chinese-Tutorials/blob/master/chapter%2017/%E6%8D%95%E6%8D%89%E6%A0%BC%E9%9B%B7%E7%A0%81.md)
    - 17.2 [Decoding Gray Code Patterns Tutorial](https://github.com/fengzhenHIT/OpenCV-contrib-module-Chinese-Tutorials/blob/master/chapter%2017/%E8%A7%A3%E7%A0%81%E6%A0%BC%E9%9B%B7%E7%A0%81%E6%95%99%E7%A8%8B.md)
    - 17.3 [Capturing Sinusoidal Patterns](https://github.com/fengzhenHIT/OpenCV-contrib-module-Chinese-Tutorials/blob/master/chapter%2017/%E6%8D%95%E6%8D%89%E6%AD%A3%E5%BC%A6%E5%9B%BE%E6%A1%88.md)
- Chapter 18: Object Tracking (tracking module)
    - 18.1 [Tracking with MultiTracker](https://github.com/fengzhenHIT/OpenCV-contrib-module-Chinese-Tutorials/blob/master/chapter%2018/%E4%BD%BF%E7%94%A8MultiTracker%E8%B7%9F%E8%B8%AA.md)
    - 18.2 [Introduction to OpenCV Trackers](https://github.com/fengzhenHIT/OpenCV-contrib-module-Chinese-Tutorials/blob/master/chapter%2018/OpenCV%E8%B7%9F%E8%B8%AA%E5%99%A8%E4%BB%8B%E7%BB%8D.md)
    - 18.3 [Customizing the CN Tracker](https://github.com/fengzhenHIT/OpenCV-contrib-module-Chinese-Tutorials/blob/master/chapter%2018/%E8%87%AA%E5%AE%9A%E4%B9%89CN%E8%B7%9F%E8%B8%AA%E5%99%A8.md)
- Chapter 19: 3D Visualization (viz module)
    - 19.1 [Launching viz](https://github.com/fengzhenHIT/OpenCV-contrib-module-Chinese-Tutorials/blob/master/chapter%2019/%E8%BF%90%E8%A1%8Cviz.md)
    - 19.2 [Creating Widgets](https://github.com/fengzhenHIT/OpenCV-contrib-module-Chinese-Tutorials/blob/master/chapter%2019/%E5%88%9B%E5%BB%BA%E5%B0%8F%E9%83%A8%E4%BB%B6.md)
    - 19.3 [Setting the Pose of an Object](https://github.com/fengzhenHIT/OpenCV-contrib-module-Chinese-Tutorials/blob/master/chapter%2019/%E8%AE%BE%E7%BD%AE%E7%89%A9%E4%BD%93%E4%BD%8D%E5%A7%BF.md)
    - 19.4 [Pose Transformations](https://github.com/fengzhenHIT/OpenCV-contrib-module-Chinese-Tutorials/blob/master/chapter%2019/%E4%BD%8D%E5%A7%BF%E5%8F%98%E6%8D%A2.md)
    - 19.5 [Creating a 3D Histogram](https://github.com/fengzhenHIT/OpenCV-contrib-module-Chinese-Tutorials/blob/master/chapter%2019/%E5%88%9B%E5%BB%BA3D%E7%9B%B4%E6%96%B9%E5%9B%BE.md)
- Chapter 20: Extended Image Processing (ximgproc module)
    - 20.1 [Disparity Map Filtering](https://github.com/fengzhenHIT/OpenCV-contrib-module-Chinese-Tutorials/blob/master/chapter%2020/%E8%A7%86%E5%B7%AE%E5%9B%BE%E6%BB%A4%E6%B3%A2.md)
    - 20.2 [Structured Forests for Fast Edge Detection](https://github.com/fengzhenHIT/OpenCV-contrib-module-Chinese-Tutorials/blob/master/chapter%2020/%E7%94%A8%E4%BA%8E%E5%BF%AB%E9%80%9F%E8%BE%B9%E7%BC%98%E6%A3%80%E6%B5%8B%E7%9A%84%E7%BB%93%E6%9E%84%E5%8C%96%E6%A3%AE.md)
    - 20.3 [Training Structured Forests](https://github.com/fengzhenHIT/OpenCV-contrib-module-Chinese-Tutorials/blob/master/chapter%2020/%E8%AE%AD%E7%BB%83%E7%BB%93%E6%9E%84%E5%8C%96%E6%A3%AE%E6%9E%97.md)
- Chapter 21: Photo Processing (xphoto module)
    - 21.1 [Image Inpainting](https://github.com/fengzhenHIT/OpenCV-contrib-module-Chinese-Tutorials/blob/master/chapter%2021/%E5%9B%BE%E5%83%8F%E4%BF%AE%E5%A4%8D.md)
    - 21.2 [Oil Painting Effect](https://github.com/fengzhenHIT/OpenCV-contrib-module-Chinese-Tutorials/blob/master/chapter%2021/%E6%B2%B9%E7%94%BB%E6%95%88%E6%9E%9C.md)
    - 21.3 [Training the Learning-Based White Balance Algorithm](https://github.com/fengzhenHIT/OpenCV-contrib-module-Chinese-Tutorials/blob/master/chapter%2021/%E8%AE%AD%E7%BB%83%E5%9F%BA%E4%BA%8E%E5%AD%A6%E4%B9%A0%E7%9A%84%E7%99%BD%E5%B9%B3%E8%A1%A1%E7%AE%97%E6%B3%95.md)

<a id="引言"></a>

# Introduction
<p align="center">Why translate this book?</p>
<p align="center">小白</p>
<p align="center">February 8, 2020</p>

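### 1 What are the OpenCV-Contrib Modules
The Contrib Modules are OpenCV's extension modules, a collection of sub-modules that each implement specific algorithms. Research in computer vision and image processing moves quickly, and many excellent algorithms cannot be integrated promptly into the base library (the OpenCV library we normally use). Packaging them as separate modules makes new algorithms easier to promote and maintain, and it also keeps the base library from becoming more bloated and larger with every release, which makes it easier to use. In addition, because some algorithms are protected by patents, distributing them as extension modules avoids many unnecessary disputes; the well-known SIFT algorithm is one such patent-protected example.

### 2 Why this book
In recent years artificial intelligence has developed rapidly, and many accurate, fast computer vision algorithms have been proposed. The number of algorithms added to the OpenCV base library, however, has been limited, which creates the false impression that OpenCV is only suitable for learning basic algorithms. In fact, every OpenCV release brings many recent algorithms; much of that content is simply placed in the Contrib Modules.
Most books on OpenCV currently on the market explain how to use the base library and rarely mention the extension modules. Books target different audiences, but once readers have finished an introduction to OpenCV, few books help them go further, so a more advanced book on the OpenCV extension modules is much needed. Drawing on the official documentation, this book organizes the usage of the OpenCV extension modules and can help readers with some image processing background take their use of OpenCV a step further.

### 3 How to use this book
This book is an advanced companion for learning OpenCV; we recommend studying it together with 《OpenCV 4计算机视觉编程实战》.

### 4 Intended audience
This book is intended for readers who are interested in computer vision and image processing and who use OpenCV for academic research. Readers should already have some familiarity with OpenCV; studying alongside 《OpenCV 4计算机视觉编程实战》 is also recommended. The tutorials are written in C++, so the book is mainly aimed at readers who program in C++.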

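### 5 Translators
- 庞家明, undergraduate student in measurement and control. Mainly responsible for Sections 2.5 and 2.6 and Chapters 5 through 9.
- 王润泽, graduate student at 圣马家沟男子职业技术学院. Mainly responsible for Sections 2.2 and 2.3 and Chapters 11 through 14.
- 唐佳满, first-year graduate student in signal processing at 北下关军事基地. Mainly responsible for Section 2.4 and Chapters 17, 20, and 21.
- 吴鹏飞, graduate student in unknown-environment reconnaissance at 珠江环岛工业大学. Mainly responsible for Chapter 10.
- 税科, C++ image processing engineer (postdoctoral workstation of 天津开发区中环系统电子工程股份有限公司). Mainly responsible for Chapters 18 and 19.
- 瀚海古月, first-year graduate student in robotics at 马家沟大沙河驻地. Mainly responsible for Chapters 3, 15, and 16.
- 赵宏峰, software engineer in Honeywell's aerospace division. Mainly responsible for Section 2.1 and Chapter 4.
- 小白, author of the 小白学视觉 official account. Mainly responsible for Chapter 1 and for typesetting and proofreading the whole book.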

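### Disclaimer
Because the translators' skill and time are limited, errors in the translation are hard to avoid, and we ask for readers' understanding. If you find a problem while reading this document, please contact us and we will correct it as soon as possible.

&nbsp;
&nbsp;
<p align="center"><strong>Special note</strong></p>
&nbsp;

Because 小白 is still a student and project deadlines have been tight lately, there is little time available for organizing this GitHub repository, so the parts that have already been prepared are being published now for everyone to study and discuss.

&nbsp;

The complete e-book PDF is finished. Readers who prefer an e-book, or who urgently need the full version, can get it by replying **扩展模块中文教程** in the **小白学视觉** WeChat official account.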


================================================
FILE: chapter 1/Contrib Modules模块内容介绍.md
================================================
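The Contrib Modules currently contain fifty sub-modules, covering areas that range from traditional machine vision to deep neural networks, from camera calibration to stereo vision, and from background segmentation to image recognition. This section introduces what each sub-module does, so that readers learn about the many powerful features hidden in OpenCV-Contrib, can see at a glance which modules they actually need, and can study and use them with a clear target.

The names and purposes of all modules are given in Table 1-1.


<p align="center">
  <strong>Table 1-1 Names and meanings of the contrib modules</strong>
</p>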

<center>
  
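Module|Description
---|---
aruco|ArUco marker detection
bgsegm|Improved background-foreground segmentation methods
bioinspired|Biologically inspired vision models and derived tools
ccalib|Custom calibration patterns for 3D reconstruction
cnn_3dobj|3D object recognition and pose estimation API
cudaarithm|Operations on matrices (CUDA)
cudabgsegm|Background segmentation
cudacodec|Video encoding/decoding
cudafeatures2d|Feature detection and description
cudafilters|Image filtering
cudaimgproc|Image processing
cudalegacy|Legacy support
cudaobjdetect|Object detection
cudaoptflow|Optical flow
cudastereo|Stereo correspondence
cudawarping|Image warping
cudev|Device layer
cvv|GUI for interactive visual debugging of computer vision programs
datasets|Framework for working with different datasets
dnn_objdetect|DNN used for object detection
dpm|Deformable part-based models
face|Face analysis
freetype|Drawing UTF-8 strings with freetype/harfbuzz
fuzzy|Image processing based on fuzzy mathematics
hdf|Hierarchical Data Format I/O routines
hfs|Hierarchical feature selection for efficient image segmentation
img_hash|Implementations of different image hashing algorithms
line_descriptor|Binary descriptors for lines extracted from an image
optflow|Optical flow algorithms
ovis|OGRE 3D visualiser
phase_unwrapping|Phase unwrapping API
plot|Plot functions for Mat data
quality|Image quality analysis (IQA) API
reg|Image registration
rgbd|RGB-depth processing
saliency|Saliency API
sfm|Structure from motion
shape|Shape distance and matching
stereo|Stereo correspondence algorithms
structured_light|Structured light API
superres|Super resolution
surface_matching|Surface matching
text|Scene text detection and recognition
tracking|Tracking API
videostab|Video stabilization
viz|3D visualizer
xfeatures2d|Extra 2D features framework
ximgproc|Extended image processing
xobjdetect|Extended object detection
xphoto|Additional photo processing algorithms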

</center>

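The following describes some of the sub-modules in more detail:

- aruco: full name “ArUco and ChArUco Markers”, an augmented reality (AR) module. ArUco and ChArUco are two kinds of markers commonly used in augmented reality. In a ChArUco board, ArUco markers are embedded in the white squares of a chessboard.
- bgsegm: full name “Background segmentation”, background segmentation algorithms. The module mainly includes statistical background image estimation and per-pixel Bayesian segmentation.
- bioinspired: full name “Biological Vision -- Biologically inspired vision model”. The module mainly covers methods for minimizing noise and luminance variance, transient event segmentation, and high dynamic range tone mapping.
- ccalib: full name “Custom Calibration”. The module mainly includes 3D reconstruction, wide-angle camera calibration, random pattern calibration, and multi-camera calibration.
- cnn_3dobj: full name “cnn_ 3D object recognition and pose estimation”, deep object recognition and pose estimation. The module uses the Caffe deep neural network library to build, train, and test CNN models for visual object recognition and pose.
- cvv: full name “Computer Vision Debugger”. The module provides a GUI that makes it easy to debug computer vision programs interactively.
- datasets: full name “Datasets Reader”. The module contains code for reading existing computer vision databases, together with sample programs that use those readers to train, test, and run on the datasets.
- text: full name “Visual Text Matching”, text detection and recognition. The module mainly includes detecting text, segmenting words, and recognizing text.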


================================================
FILE: chapter 1/Windows系统中安装扩展模块.md
================================================
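This section describes how to install the contrib extension modules. Since readers may be using either Windows or Ubuntu, the installation process is described separately for each system.


Installing the contrib extension modules requires not only the extension module package but also the base library package. The contrib packages are uniformly named opencv_contrib-x.x.x, and the version of the base library package must correspond to the version of the contrib package. An opencv_contrib package matching your OpenCV version can be obtained from GitHub.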
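The naming rule above can be checked mechanically before starting a long build. The following is only a minimal sketch: `contribVersion` and `versionsMatch` are hypothetical helper names introduced here for illustration, and the check assumes the package name follows the opencv_contrib-x.x.x pattern exactly.

```cpp
#include <string>

// Hypothetical helper: extracts "x.x.x" from "opencv_contrib-x.x.x".
// Returns an empty string when the name does not follow that pattern.
std::string contribVersion(const std::string& packageName) {
    const std::string prefix = "opencv_contrib-";
    if (packageName.compare(0, prefix.size(), prefix) != 0) return "";
    return packageName.substr(prefix.size());
}

// True only when the contrib package carries exactly the same version
// string as the base library (an assumption: versions are compared as text).
bool versionsMatch(const std::string& opencvVersion, const std::string& contribPackage) {
    const std::string version = contribVersion(contribPackage);
    return !version.empty() && version == opencvVersion;
}
```

For example, `versionsMatch("4.0.0", "opencv_contrib-4.0.0")` holds, while pairing a 4.0.0 base library with `opencv_contrib-3.4.5` does not.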

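## 1.1.1 Installing the Extension Modules on Windows
Installing the opencv_contrib extension modules on Windows essentially means building OpenCV again from source, so the CMake build tool is required. Download the .msi installer from the CMake website and double-click it to complete the installation. This book uses CMake 3.7.0. Note that OpenCV is updated more slowly than CMake, so a very recent CMake version is not recommended; any version that meets the minimum required by OpenCV 4.0 is enough. To reduce errors during the build, we suggest readers use the same CMake version as this book. Open the installed CMake program and you will see the window shown in Figure 1-1. We need to choose the location of the OpenCV source code and the output location for the build files. The source code is in the “…\opencv\sources” folder; to keep the output separate from the original OpenCV files, create a folder named newbuild in “…\opencv” to hold the build output.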

<p align="center">
  <img src="https://img-blog.csdnimg.cn/20200226130256257.png" align="center" height="300"></img>
</p>

 
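Next, click 【Configure】 and choose the project generator that corresponds to your Visual Studio version. The author has Visual Studio 2015 installed and wants a 64-bit OpenCV, so the “Visual Studio 14 2015 Win64” option is selected, together with “Use default native compilers”. The configuration dialog is shown in Figure 1-2.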

<p align="center">
<img src="https://img-blog.csdnimg.cn/2020022613035076.png" align="center" height="250">
</p>

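Then click 【Configure】 again to start configuring. When “Configuring done” appears, configuration has succeeded and many variables appear in the CMake window, as shown in Figure 1-3. First find the two variables “BUILD_opencv_world” and “OPENCV_ENABLE_NONFREE” and tick the checkbox after each. The first produces a single large .lib file, so that only “opencv_world400d.lib” has to be given to the linker. The second makes the patent-protected algorithms available after the build; if it is not selected, patent-protected algorithms such as SIFT cannot be used. Next find the “OPENCV_EXTRA_MODULES_PATH” variable, which tells the build where the source code of the extension modules is located, and set it to the modules folder inside the opencv_contrib package downloaded earlier. If this variable is left empty the build will not report an error, but only the base version of OpenCV will be installed.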

<p align="center">
<img src="https://img-blog.csdnimg.cn/20200226130522407.png" align="center" height="300">
</p>

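After configuration finishes, the window lists the sub-modules of the extension package that will be built; some of them, such as aruco and bgsegm, are shown in Figure 1-4. The checkbox on the right selects whether each module is built. By default all contrib sub-modules are built, but to be safe readers can review and confirm these settings before building.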

<p align="center">
<img src="https://img-blog.csdnimg.cn/20200226130629146.png" align="center" height="250">
</p>

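Click 【Configure】 again until all red variables turn white, then click 【Generate】 to generate the project files. On success, many files are created in the newbuild folder. Find OpenCV.sln, open it with Visual Studio 2015, and rebuild the solution; this takes quite a long time. When it finishes, find the INSTALL project under CMakeTargets in the Solution Explorer, right-click it, and choose “Build Only INSTALL” under “Project Only”. This creates a folder named install inside newbuild, which holds all the files needed for environment configuration. Configure the environment in the same way as for the OpenCV base library.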

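**Tip
After the build, the newbuild folder is very large, several gigabytes in total. Apart from the install folder, which is the important one, most of the files are no longer needed, so if disk space is limited you can selectively delete some of them.**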

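**Note
*We assume that readers are proficient in configuring the environment for the OpenCV base library. Readers who are not familiar with that process can refer to the companion book we recommend, 《OpenCV 4计算机视觉编程实战》.***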

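***Warning
The build process is fairly tedious, and network problems, system versions, and similar factors can lead to obscure errors. If you run into a difficult problem, you can discuss it on the “小白学视觉” WeChat official account.
If you only want to use the extension modules, we have prepared several versions of prebuilt libraries that include the contrib extension modules on the “小白学视觉” WeChat official account; after downloading, you only need to configure the environment.***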

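## 1.1.2 Installing the Extension Modules on Ubuntu
Installing the extension modules on Ubuntu is comparatively easy. When running cmake, use the command in Listing 1-1 and put the path to the extension package after “OPENCV_EXTRA_MODULES_PATH=”; the remaining steps are the same as for installing the OpenCV base modules.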

```cpp
# Listing 1-1: command for configuring the OpenCV build
cmake -D CMAKE_BUILD_TYPE=RELEASE -D CMAKE_INSTALL_PREFIX=/usr/local -D OPENCV_EXTRA_MODULES_PATH=../opencv_contrib/modules ..
```

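***Note
We assume readers are proficient in building and configuring OpenCV on Ubuntu. Plenty of material on this is available online, so it is not covered further here.***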


================================================
FILE: chapter 10/OpenCV中的人脸识别.md
================================================
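## 10.1.1 Introduction to Face Recognition
Face recognition is an easy task for humans. Experiments have shown that even one- to three-day-old babies are able to distinguish between known faces. So how hard could it be for a computer? It turns out that, to date, we know little about how humans recognize faces. Are inner features (eyes, nose, mouth) or outer features (head shape, hairline) used for successful face recognition? How do we analyze an image, and how does the brain encode it? David Hubel and Torsten Wiesel showed that our brain has specialized nerve cells that respond to specific local features of a scene, such as lines, edges, angles, or movement. Since we do not see the world as scattered pieces, our visual cortex must somehow combine the different sources of information into useful patterns. Automatic face recognition is all about extracting those meaningful features from an image, putting them into a useful representation, and performing some kind of classification on them.

Face recognition based on the geometric features of a face is probably the most intuitive approach. One of the first automated face recognition systems was described in *Picture processing system by computer complex and recognition of human faces*: marker points (the positions of the eyes, ears, nose, and so on) were used to build a feature vector (distances between the points, angles between them, and so on). Recognition was performed by computing the Euclidean distance between the feature vectors of a probe image and a reference image. Such a method is by its nature robust against changes in illumination, but it has a huge drawback: accurate registration of the marker points is complicated, even with state-of-the-art algorithms. Some of the latest work on geometric face recognition was carried out in *Face recognition through geometrical features*. A 22-dimensional feature vector was used, and experiments on large datasets showed that geometric features alone may not carry enough information for face recognition.

The Eigenfaces method described in *Eigenfaces for recognition* takes a holistic approach to face recognition: a face image is a point in a high-dimensional image space, and a lower-dimensional representation is found in which classification becomes easy. The lower-dimensional subspace is found with Principal Component Analysis, which identifies the axes with maximum variance. While this transformation is optimal from a reconstruction standpoint, it does not take any class labels into account. Imagine a situation where the variance is generated by external sources: the axes with maximum variance need not contain any discriminative information at all, so classification becomes impossible. For this reason a class-specific projection with Linear Discriminant Analysis was applied to face recognition in *Eigenfaces vs. fisherfaces: Recognition using class specific linear projection*. The basic idea is to minimize the variance within a class while maximizing the variance between classes.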
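For reference, the idea of minimizing within-class variance while maximizing between-class variance is usually written as the Fisher criterion below. The notation is introduced here for illustration and is not taken from this section: there are c classes X<sub>i</sub> with N<sub>i</sub> samples each, μ<sub>i</sub> is the mean of class i, and μ is the overall mean.

```latex
S_B = \sum_{i=1}^{c} N_i \,(\mu_i - \mu)(\mu_i - \mu)^T, \qquad
S_W = \sum_{i=1}^{c} \sum_{x_j \in X_i} (x_j - \mu_i)(x_j - \mu_i)^T

W_{opt} = \arg\max_{W} \frac{\left| W^T S_B W \right|}{\left| W^T S_W W \right|}
```

Here S<sub>B</sub> is the between-class scatter and S<sub>W</sub> the within-class scatter, so a projection W scores well when class means are spread apart while samples of the same class stay close together.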

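Recently various methods for local feature extraction have emerged. To avoid the high dimensionality of the input data, only local regions of an image are described, and the extracted features are (hopefully) more robust against partial occlusion, illumination changes, and small sample sizes. Algorithms used for local feature extraction include Gabor wavelets, the Discrete Cosine Transform, and Local Binary Patterns. What the best way is to preserve spatial information when applying local feature extraction remains an open research question, because spatial information is potentially useful.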

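## 10.1.2 Face Databases
Face recognition needs some face images. We can create our own dataset or take images from one of the available face databases; http://face-rec.org/databases/ gives an up-to-date overview. Three interesting databases are described below:

- AT&T Facedatabase: The AT&T Facedatabase, sometimes also referred to as the ORL Database of Faces, contains ten different images of each of 40 distinct subjects. For some subjects, the images were taken at different times, varying the lighting, facial expressions (open/closed eyes, smiling/not smiling), and facial details (glasses/no glasses). All images were taken against a dark homogeneous background with the subjects in an upright, frontal position (with tolerance for some side movement).

- Yale Facedatabase A, also known as Yalefaces: The AT&T Facedatabase is good for initial tests, but it is a fairly easy database. The Eigenfaces method already reaches a 97% recognition rate on it, so other algorithms will not show any major improvement. Yale Facedatabase A is a more appropriate dataset for initial experiments because the recognition problem is harder. The database consists of 15 people (14 male, 1 female), each with 11 grayscale images of 320×243 pixels. There are changes in lighting conditions (center light, left light, right light), facial expressions (happy, normal, sad, sleepy, surprised, wink), and glasses (glasses, no glasses). The original images are, however, not cropped or aligned.

- Extended Yale Facedatabase B: In its cropped version, the Extended Yale Facedatabase B contains 2414 images of 38 different persons. The focus of this database is on extracting features that are robust to illumination; the images show almost no variation in emotion, occlusion, and so on. I personally think this dataset is too large for the experiments I perform in this document; you are better off using the AT&T Facedatabase for initial testing. A first version of Yale Facedatabase B was used to see how the Eigenfaces and Fisherfaces methods perform under heavy illumination changes.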

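## 10.1.3 Preparing the Data
Once we have acquired some data, we need to read it in our program. In the demo applications I decided to read the images from a very simple CSV file, because it is the simplest platform-independent approach. Basically all the CSV file needs to contain are lines composed of a filename, followed by a separator, followed by a label (as an integer), in the form shown in Listing 10-1.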

```cpp
Listing 10-1
/path/to/image.ext;0
```

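Here /path/to/image.ext is the path to an image, which on Windows might look like C:/faces/person0/image0.jpg. Then comes the separator “;”, and finally we assign the label 0 to the image. Think of the label as the subject (the person) the image belongs to, so the same subject (person) should have the same label.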
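To make the line format concrete, here is a minimal sketch of splitting one such line into its path and label using only the standard library. `parseCsvLine` is a hypothetical helper name chosen for illustration; it assumes the “;” separator described above and rejects lines whose label is not a plain integer.

```cpp
#include <cstddef>
#include <exception>
#include <sstream>
#include <string>

// Splits one "path;label" line into its path and integer label.
// Returns false for lines that lack a separator, a path, or a numeric label.
bool parseCsvLine(const std::string& line, std::string& path, int& label,
                  char separator = ';') {
    std::istringstream in(line);
    std::string labelText;
    // First field: everything up to the separator. Second field: the rest.
    if (!std::getline(in, path, separator) || !std::getline(in, labelText)) return false;
    if (path.empty() || labelText.empty()) return false;
    try {
        std::size_t used = 0;
        label = std::stoi(labelText, &used);
        return used == labelText.size();  // reject trailing characters such as "3x"
    } catch (const std::exception&) {
        return false;  // not a number, or out of range for int
    }
}
```

Validating the label up front is a design choice in this sketch: a malformed line is reported to the caller instead of silently becoming label 0.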

Download the AT&T Facedatabase and the corresponding CSV file, at.txt, whose contents look like Listing 10-2.

```cpp
Listing 10-2
./at/s1/1.pgm;0
./at/s1/2.pgm;0
...
./at/s2/1.pgm;1
./at/s2/2.pgm;1
...
./at/s40/1.pgm;39
./at/s40/2.pgm;39
```

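Imagine I have extracted the files to D:/data/at and downloaded the CSV file to D:/data/at.txt. Then we simply need to search and replace ./ with D:/data/. This can be done in any editor of your choice; every sufficiently advanced editor can do it. Once we have a CSV file with valid filenames and labels, we can run any of the demo programs by passing the path to the CSV file as a parameter, as shown in Listing 10-3.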


```cpp
Listing 10-3
facerec_demo.exe D:/data/at.txt
```
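The search-and-replace step on the CSV file can also be scripted instead of done in an editor. The sketch below rewrites a single entry; `rebasePath` is a hypothetical helper name, and it assumes every relative entry starts with “./” as in Listing 10-2.

```cpp
#include <string>

// Replaces a leading "./" in a CSV entry with an absolute base directory.
// Entries that do not start with "./" are returned unchanged.
std::string rebasePath(const std::string& entry, const std::string& baseDir) {
    const std::string relative = "./";
    if (entry.compare(0, relative.size(), relative) != 0) return entry;
    return baseDir + entry.substr(relative.size());
}
```

Applied to each line of at.txt with the base directory D:/data/, the entry ./at/s1/1.pgm;0 becomes D:/data/at/s1/1.pgm;0. Only the leading prefix is touched, so a “./” appearing later in a path is left alone.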
	
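## 10.1.4 Introduction to Eigenfaces
The problem with the image representation we are given is its high dimensionality. A two-dimensional p×q grayscale image spans an m = pq-dimensional vector space, so an image of 100×100 pixels already lies in a 10,000-dimensional image space. But not all dimensions are equally useful to us. We can only make a decision if there is variance in the data, so what we are looking for are the components that account for most of the information. Principal Component Analysis (PCA) was independently proposed by Karl Pearson (1901) and Harold Hotelling (1933) to turn a set of possibly correlated variables into a smaller set of uncorrelated variables. The idea is that a high-dimensional dataset is often described by correlated variables, so only a few meaningful dimensions account for most of the information. The PCA method finds the directions with the greatest variance in the data, called principal components.

The algorithm of the Eigenfaces method is described next.

Let X={x<sub>1</sub>,x<sub>2</sub>,…,x<sub>n</sub>} be a random vector with observations x<sub>i</sub>∈R<sup>d</sup>. Compute the mean μ: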

<p align="center"><a href="https://www.codecogs.com/eqnedit.php?latex=\mu&space;=&space;\frac{1}{n}\sum\limits_{i&space;=&space;1}^n&space;{{x_i}}" target="_blank"><img src="https://latex.codecogs.com/gif.latex?\mu&space;=&space;\frac{1}{n}\sum\limits_{i&space;=&space;1}^n&space;{{x_i}}" title="\mu = \frac{1}{n}\sum\limits_{i = 1}^n {{x_i}}" /></a></p>
Compute the covariance matrix S:

<p align="center"><a href="https://www.codecogs.com/eqnedit.php?latex=S&space;=&space;\frac{1}{n}\sum\limits_{i&space;=&space;1}^n&space;{\left(&space;{{x_i}&space;-&space;\mu&space;}&space;\right){{\left(&space;{{x_i}&space;-&space;\mu&space;}&space;\right)}^T}}" target="_blank"><img src="https://latex.codecogs.com/gif.latex?S&space;=&space;\frac{1}{n}\sum\limits_{i&space;=&space;1}^n&space;{\left(&space;{{x_i}&space;-&space;\mu&space;}&space;\right){{\left(&space;{{x_i}&space;-&space;\mu&space;}&space;\right)}^T}}" title="S = \frac{1}{n}\sum\limits_{i = 1}^n {\left( {{x_i} - \mu } \right){{\left( {{x_i} - \mu } \right)}^T}}" /></a></p>

Compute the eigenvalues λ<sub>i</sub> and eigenvectors v<sub>i</sub> of S:
<center>Sv<sub>i</sub>=λ<sub>i</sub>v<sub>i</sub>, i=1,2,…,n</center>

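Order the eigenvectors in descending order by their eigenvalue. The k principal components are the eigenvectors corresponding to the k largest eigenvalues. The k principal components of an observed vector x are then given by: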

<center>y=W<sup>T</sup> (x-μ)</center>

where W={v<sub>1</sub>,v<sub>2</sub>,…,v<sub>k</sub>} is the matrix whose columns are the k leading eigenvectors.

The reconstruction from the PCA basis is given by:
<center>x=Wy+μ</center>

with the same W.
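The projection y = W<sup>T</sup>(x − μ) and the reconstruction x = Wy + μ can be tried out on a toy example. The sketch below is not the OpenCV implementation; the type aliases and function names are made up for illustration, and it assumes the basis vectors are already orthonormal and stored one per entry.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Vec = std::vector<double>;
using Basis = std::vector<Vec>;  // each entry is one basis vector v_i of length d

// y = W^T (x - mu): one coefficient per basis vector.
Vec project(const Basis& W, const Vec& mu, const Vec& x) {
    assert(x.size() == mu.size());
    Vec y(W.size(), 0.0);
    for (std::size_t k = 0; k < W.size(); ++k) {
        assert(W[k].size() == x.size());
        for (std::size_t j = 0; j < x.size(); ++j) y[k] += W[k][j] * (x[j] - mu[j]);
    }
    return y;
}

// x = W y + mu: weighted sum of the basis vectors plus the mean.
Vec reconstruct(const Basis& W, const Vec& mu, const Vec& y) {
    assert(y.size() == W.size());
    Vec x = mu;
    for (std::size_t k = 0; k < W.size(); ++k) {
        assert(W[k].size() == x.size());
        for (std::size_t j = 0; j < x.size(); ++j) x[j] += W[k][j] * y[k];
    }
    return x;
}
```

With only the first two coordinate axes as basis, mean (1, 1, 1), and input (3, 4, 5), the projection is (2, 3) and the reconstruction is (3, 4, 1): the coordinate along the discarded direction falls back to the mean, which is exactly the information lost by keeping k < d components.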

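The Eigenfaces method then performs face recognition by:

- Projecting all training samples into the PCA subspace.
- Projecting the query image into the PCA subspace.
- Finding the nearest neighbor between the projected training images and the projected query image.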
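The last of these steps can be sketched as a brute-force search over the projected training samples. This is an illustration only: `nearestLabel` is a hypothetical name, and the use of squared Euclidean distance is an assumption made here, since the text only asks for the nearest neighbor.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Returns the label of the training projection closest to the query
// projection, measured by squared Euclidean distance.
int nearestLabel(const std::vector<std::vector<double>>& projections,
                 const std::vector<int>& labels,
                 const std::vector<double>& query) {
    assert(!projections.empty() && projections.size() == labels.size());
    double best = std::numeric_limits<double>::max();
    int bestLabel = -1;
    for (std::size_t i = 0; i < projections.size(); ++i) {
        assert(projections[i].size() == query.size());
        double dist = 0.0;
        for (std::size_t j = 0; j < query.size(); ++j) {
            const double diff = projections[i][j] - query[j];
            dist += diff * diff;
        }
        if (dist < best) {
            best = dist;
            bestLabel = labels[i];
        }
    }
    return bestLabel;
}
```

Because the comparison happens in the k-dimensional subspace rather than in the full image space, each distance costs k multiplications instead of one per pixel.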

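There is still one problem left to solve. Imagine we are given 400 images of 100×100 pixels. Principal Component Analysis solves for the covariance matrix S = XX<sup>T</sup>, where size(X) = 10000×400 in our example. We would end up with a 10000×10000 matrix, roughly 0.8 GB (10<sup>8</sup> entries, assuming 8 bytes per entry). Solving this problem directly is not feasible, so we need to apply a trick. From your linear algebra lessons you know that an M×N matrix with M > N can have at most N−1 non-zero eigenvalues once the mean has been subtracted from the data. So it is possible to take the eigenvalue decomposition of X<sup>T</sup>X, which has size N×N, instead:
<center>X<sup>T</sup>Xv<sub>i</sub>=λ<sub>i</sub>v<sub>i</sub></center>

and get the original eigenvectors of S = XX<sup>T</sup> with a left multiplication by the data matrix:
<center>XX<sup>T</sup>(Xv<sub>i</sub>)=λ<sub>i</sub>(Xv<sub>i</sub>)</center>

The resulting eigenvectors are orthogonal; to get orthonormal eigenvectors they need to be normalized to unit length. Details can be found in *Pattern classification*.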
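The trick can be checked numerically on a toy case. The sketch below assumes d = 3 and n = 2 so that X<sup>T</sup>X is a 2×2 matrix whose largest eigenpair has a closed form; a real implementation would use a general eigen-solver instead. All type and function names are made up for illustration.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>

using Vec3 = std::array<double, 3>;
using Mat3x2 = std::array<std::array<double, 2>, 3>;  // d = 3 rows, n = 2 sample columns

struct EigenPair {
    double value;  // eigenvalue shared by X^T X and X X^T
    Vec3 vector;   // unit-length eigenvector of X X^T
};

// Largest eigenpair of S = X X^T (3x3), computed through the 2x2 matrix X^T X.
EigenPair leadingEigenpair(const Mat3x2& X) {
    // G = X^T X = [[a, b], [b, c]]
    double a = 0.0, b = 0.0, c = 0.0;
    for (const auto& row : X) {
        a += row[0] * row[0];
        b += row[0] * row[1];
        c += row[1] * row[1];
    }
    // Closed-form largest eigenvalue of a symmetric 2x2 matrix.
    const double half = 0.5 * (a - c);
    const double lambda = 0.5 * (a + c) + std::sqrt(half * half + b * b);
    // Matching eigenvector v of G; the diagonal case is handled separately.
    double v0 = b, v1 = lambda - a;
    if (b == 0.0) {
        v0 = (a >= c) ? 1.0 : 0.0;
        v1 = 1.0 - v0;
    }
    // Left-multiply by X to obtain an eigenvector of X X^T, then normalize.
    Vec3 u{};
    double norm = 0.0;
    for (std::size_t i = 0; i < 3; ++i) {
        u[i] = X[i][0] * v0 + X[i][1] * v1;
        norm += u[i] * u[i];
    }
    norm = std::sqrt(norm);
    assert(norm > 0.0);  // zero only when X itself is all zeros
    for (double& ui : u) ui /= norm;
    return {lambda, u};
}
```

The eigen-decomposition itself only ever touches the n×n matrix; the d-dimensional eigenvector appears at the very end through the product Xv, followed by the normalization to unit length mentioned above.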

## 10.1.5 Eigenface Example

```cpp
代码清单10-4
/*
 * Copyright (c) 2011. Philipp Wagner <bytefish[at]gmx[dot]de>.
 * Released to public domain under terms of the BSD Simplified license.
 *
 * Redistribution and use in source and binary forms, with or without
 * modification, are permitted provided that the following conditions are met:
 *   * Redistributions of source code must retain the above copyright
 *     notice, this list of conditions and the following disclaimer.
 *   * Redistributions in binary form must reproduce the above copyright
 *     notice, this list of conditions and the following disclaimer in the
 *     documentation and/or other materials provided with the distribution.
 *   * Neither the name of the organization nor the names of its contributors
 *     may be used to endorse or promote products derived from this software
 *     without specific prior written permission.
 *
 *   See <http://www.opensource.org/licenses/bsd-license>
 */
#include "opencv2/core.hpp"
#include "opencv2/face.hpp"
#include "opencv2/highgui.hpp"
#include "opencv2/imgproc.hpp"
#include <cstdlib>  // atoi, exit
#include <iostream>
#include <fstream>
#include <sstream>
using namespace cv;
using namespace cv::face;
using namespace std;
static Mat norm_0_255(InputArray _src) {
    Mat src = _src.getMat();
    // Create and return normalized image:
    Mat dst;
    switch(src.channels()) {
    case 1:
        cv::normalize(_src, dst, 0, 255, NORM_MINMAX, CV_8UC1);
        break;
    case 3:
        cv::normalize(_src, dst, 0, 255, NORM_MINMAX, CV_8UC3);
        break;
    default:
        src.copyTo(dst);
        break;
    }
    return dst;
}
static void read_csv(const string& filename, vector<Mat>& images, vector<int>& labels, char separator = ';') {
    std::ifstream file(filename.c_str(), ifstream::in);
    if (!file) {
        string error_message = "No valid input file was given, please check the given filename.";
        CV_Error(Error::StsBadArg, error_message);
    }
    string line, path, classlabel;
    while (getline(file, line)) {
        stringstream liness(line);
        getline(liness, path, separator);
        getline(liness, classlabel);
        if(!path.empty() && !classlabel.empty()) {
            images.push_back(imread(path, 0));
            labels.push_back(atoi(classlabel.c_str()));
        }
    }
}
int main(int argc, const char *argv[]) {
    // Check for valid command line arguments, print usage
    // if no arguments were given.
    if (argc < 2) {
        cout << "usage: " << argv[0] << " <csv.ext> <output_folder> " << endl;
        exit(1);
    }
    string output_folder = ".";
    if (argc == 3) {
        output_folder = string(argv[2]);
    }
    // Get the path to your CSV.
    string fn_csv = string(argv[1]);
    // These vectors hold the images and corresponding labels.
    vector<Mat> images;
    vector<int> labels;
    // Read in the data. This can fail if no valid
    // input filename is given.
    try {
        read_csv(fn_csv, images, labels);
    } catch (const cv::Exception& e) {
        cerr << "Error opening file \"" << fn_csv << "\". Reason: " << e.msg << endl;
        // nothing more we can do
        exit(1);
    }
    // Quit if there are not enough images for this demo.
    if(images.size() <= 1) {
        string error_message = "This demo needs at least 2 images to work. Please add more images to your data set!";
        CV_Error(Error::StsError, error_message);
    }
    // Get the height from the first image. We'll need this
    // later in code to reshape the images to their original
    // size:
    int height = images[0].rows;
    // The following lines simply get the last images from
    // your dataset and remove it from the vector. This is
    // done, so that the training data (which we learn the
    // cv::BasicFaceRecognizer on) and the test data we test
    // the model with, do not overlap.
    Mat testSample = images[images.size() - 1];
    int testLabel = labels[labels.size() - 1];
    images.pop_back();
    labels.pop_back();
    // The following lines create an Eigenfaces model for
    // face recognition and train it with the images and
    // labels read from the given CSV file.
    // This here is a full PCA, if you just want to keep
    // 10 principal components (read Eigenfaces), then call
    // the factory method like this:
    //
    //      EigenFaceRecognizer::create(10);
    //
    // If you want to create a FaceRecognizer with a
    // confidence threshold (e.g. 123.0), call it with:
    //
    //      EigenFaceRecognizer::create(10, 123.0);
    //
    // If you want to use _all_ Eigenfaces and have a threshold,
    // then call the method like this:
    //
    //      EigenFaceRecognizer::create(0, 123.0);
    //
    Ptr<EigenFaceRecognizer> model = EigenFaceRecognizer::create();
    model->train(images, labels);
    // The following line predicts the label of a given
    // test image:
    int predictedLabel = model->predict(testSample);
    //
    // To get the confidence of a prediction call the model with:
    //
    //      int predictedLabel = -1;
    //      double confidence = 0.0;
    //      model->predict(testSample, predictedLabel, confidence);
    //
    string result_message = format("Predicted class = %d / Actual class = %d.", predictedLabel, testLabel);
    cout << result_message << endl;
    // Here is how to get the eigenvalues of this Eigenfaces model:
    Mat eigenvalues = model->getEigenValues();
    // And we can do the same to display the Eigenvectors (read Eigenfaces):
    Mat W = model->getEigenVectors();
    // Get the sample mean from the training data
    Mat mean = model->getMean();
    // Display or save:
    if(argc == 2) {
        imshow("mean", norm_0_255(mean.reshape(1, images[0].rows)));
    } else {
        imwrite(format("%s/mean.png", output_folder.c_str()), norm_0_255(mean.reshape(1, images[0].rows)));
    }
    // Display or save the Eigenfaces:
    for (int i = 0; i < min(10, W.cols); i++) {
        string msg = format("Eigenvalue #%d = %.5f", i, eigenvalues.at<double>(i));
        cout << msg << endl;
        // get eigenvector #i
        Mat ev = W.col(i).clone();
        // Reshape to original size & normalize to [0...255] for imshow.
        Mat grayscale = norm_0_255(ev.reshape(1, height));
        // Show the image & apply a Jet colormap for better sensing.
        Mat cgrayscale;
        applyColorMap(grayscale, cgrayscale, COLORMAP_JET);
        // Display or save:
        if(argc == 2) {
            imshow(format("eigenface_%d", i), cgrayscale);
        } else {
            imwrite(format("%s/eigenface_%d.png", output_folder.c_str(), i), norm_0_255(cgrayscale));
        }
    }
    // Display or save the image reconstruction at some predefined steps:
    for(int num_components = min(W.cols, 10); num_components < min(W.cols, 300); num_components+=15) {
        // slice the eigenvectors from the model
        Mat evs = Mat(W, Range::all(), Range(0, num_components));
        Mat projection = LDA::subspaceProject(evs, mean, images[0].reshape(1,1));
        Mat reconstruction = LDA::subspaceReconstruct(evs, mean, projection);
        // Normalize the result:
        reconstruction = norm_0_255(reconstruction.reshape(1, images[0].rows));
        // Display or save:
        if(argc == 2) {
            imshow(format("eigenface_reconstruction_%d", num_components), reconstruction);
        } else {
            imwrite(format("%s/eigenface_reconstruction_%d.png", output_folder.c_str(), num_components), reconstruction);
        }
    }
    // Display if we are not writing to an output folder:
    if(argc == 2) {
        waitKey(0);
    }
    return 0;
}
```

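We used the jet colormap so that you can see how the grayscale values are distributed within each Eigenface. The Eigenfaces encode not only facial features but also the illumination in the images (see the light from the left in Eigenface #4 and from the right in Eigenface #5). The results are shown in Figure 10-1.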

<p align="center">
<img src="https://img-blog.csdnimg.cn/20200226135302981.png" height="300">
</p>
 
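We have seen that a face can be reconstructed from its lower-dimensional approximation. So let's see how many Eigenfaces a good reconstruction needs. Listing 10-5 reconstructs the first training image with 10, 25, 40, … Eigenfaces, adding 15 components per step and stopping below 300.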

```cpp
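// Listing 10-5
// Display or save the image reconstruction at some predefined steps.
// The bounds are clamped to W.cols so that Range() never exceeds the
// number of available eigenvectors:
for(int num_components = min(W.cols, 10); num_components < min(W.cols, 300); num_components+=15) {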
    // slice the eigenvectors from the model
    Mat evs = Mat(W, Range::all(), Range(0, num_components));
    Mat projection = LDA::subspaceProject(evs, mean, images[0].reshape(1,1));
    Mat reconstruction = LDA::subspaceReconstruct(evs, mean, projection);
    // Normalize the result:
    reconstruction = norm_0_255(reconstruction.reshape(1, images[0].rows));
    // Display or save:
    if(argc == 2) {
        imshow(format("eigenface_reconstruction_%d", num_components), reconstruction);
    } else {
        imwrite(format("%s/eigenface_reconstruction_%d.png", output_folder.c_str(), num_components), reconstruction);
    }
}
```

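Ten eigenvectors are obviously not sufficient for a good image reconstruction, while 50 eigenvectors may already be enough to encode the important facial features. With about 300 eigenvectors you get a good reconstruction for the AT&T Facedatabase. There are rules of thumb for how many Eigenfaces to choose for successful face recognition, but the right number depends heavily on the input data. Figure 10-2 shows how the reconstruction of the same image improves as the number of eigenvectors increases.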
<p align="center">
<img src="https://img-blog.csdnimg.cn/20200226135512221.png" height="300">
</p>
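
To make the reconstruction step concrete, here is a minimal sketch in plain C++ (no OpenCV) of the arithmetic behind projecting a sample onto the first k basis vectors and reconstructing it again. It assumes the basis vectors are orthonormal, as PCA eigenvectors are. The names `subspace_project`, `subspace_reconstruct`, and `squared_error` are hypothetical helpers chosen for illustration; they are not the OpenCV API, although they follow the same idea as the `LDA::subspaceProject` and `LDA::subspaceReconstruct` calls in the listing.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Vec = std::vector<double>;
// basis[j] is the j-th basis vector (one Eigenface), each of length d.
using Basis = std::vector<Vec>;

// Project x onto the first k basis vectors: y_j = w_j^T (x - mean).
Vec subspace_project(const Basis& basis, const Vec& mean, const Vec& x, std::size_t k) {
    assert(k <= basis.size() && x.size() == mean.size());
    Vec y(k, 0.0);
    for (std::size_t j = 0; j < k; ++j) {
        assert(basis[j].size() == x.size());
        for (std::size_t i = 0; i < x.size(); ++i) {
            y[j] += basis[j][i] * (x[i] - mean[i]);
        }
    }
    return y;
}

// Reconstruct from the coefficients: x' = mean + sum_j y_j * w_j.
Vec subspace_reconstruct(const Basis& basis, const Vec& mean, const Vec& y) {
    assert(y.size() <= basis.size());
    Vec x(mean);
    for (std::size_t j = 0; j < y.size(); ++j) {
        assert(basis[j].size() == mean.size());
        for (std::size_t i = 0; i < x.size(); ++i) {
            x[i] += y[j] * basis[j][i];
        }
    }
    return x;
}

// Squared Euclidean distance between the original and its reconstruction.
double squared_error(const Vec& a, const Vec& b) {
    assert(a.size() == b.size());
    double e = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        e += (a[i] - b[i]) * (a[i] - b[i]);
    }
    return e;
}
```

As a toy check, take the standard basis of a 3-dimensional space, the mean (1, 1, 1), and the sample (2, 3, 4). The squared reconstruction error is 13 with one component, 9 with two, and 0 with all three, which is the same trend the figure shows for faces: more components give a closer reconstruction.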
 

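## 10.1.6 Introduction to Fisherfaces
Principal Component Analysis (PCA), the core of the Eigenfaces method, finds a linear combination of features that maximizes the total variance in the data. While this is clearly a powerful way to represent data, it does not consider any classes, so a lot of discriminative information may be lost when components are thrown away. Imagine a situation where the variance in your data is generated by an external source, for example the lighting. The components identified by PCA then need not contain any discriminative information at all, so the projected samples are smeared together and classification becomes impossible (see http://www.bytefish.de/wiki/pca_lda_with_gnu_octave for an example).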

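Linear Discriminant Analysis performs a class-specific dimensionality reduction and was invented by the great statistician Sir R. A. Fisher. He used it successfully to classify flowers in his 1936 paper "The use of multiple measurements in taxonomic problems". To find the combination of features that separates the classes best, Linear Discriminant Analysis maximizes the ratio of between-class scatter to within-class scatter instead of maximizing the overall scatter. The idea is simple: samples of the same class should cluster tightly together, while different classes should lie as far apart as possible in the lower-dimensional representation. Belhumeur, Hespanha and Kriegman recognized this as well and applied discriminant analysis to face recognition in [14].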

The algorithm of the **Fisherfaces** method is described next.

Let X be a random vector with samples drawn from c classes:
<center>X={X<sub>1</sub>,X<sub>2</sub>,…,X<sub>c</sub>}</center>

<center>X<sub>i</sub>={x<sub>1</sub>,x<sub>2</sub>,…,x<sub>n</sub>}</center>

The scatter matrices S<sub>B</sub> (between classes) and S<sub>W</sub> (within classes) are calculated as:

<p align="center"><a href="https://www.codecogs.com/eqnedit.php?latex={S_B}&space;=&space;\sum\limits_{i&space;=&space;1}^c&space;{{N_i}\left(&space;{{\mu&space;_i}&space;-&space;\mu}&space;\right){{\left(&space;{{\mu&space;_i}&space;-&space;\mu}&space;\right)}^T}}" target="_blank"><img src="https://latex.codecogs.com/gif.latex?{S_B}&space;=&space;\sum\limits_{i&space;=&space;1}^c&space;{{N_i}\left(&space;{{\mu&space;_i}&space;-&space;\mu}&space;\right){{\left(&space;{{\mu&space;_i}&space;-&space;\mu}&space;\right)}^T}}" title="{S_B} = \sum\limits_{i = 1}^c {{N_i}\left( {{\mu _i} - \mu} \right){{\left( {{\mu _i} - \mu} \right)}^T}}" /></a></p>


<p align="center"><a href="https://www.codecogs.com/eqnedit.php?latex={S_W}&space;=&space;\sum\limits_{i&space;=&space;1}^c&space;{\sum\limits_{{x_j}&space;\in&space;{X_i}}&space;{\left(&space;{{x_j}&space;-&space;{\mu&space;_i}}&space;\right){{\left(&space;{{x_j}&space;-&space;{\mu&space;_i}}&space;\right)}^T}}&space;}" target="_blank"><img src="https://latex.codecogs.com/gif.latex?{S_W}&space;=&space;\sum\limits_{i&space;=&space;1}^c&space;{\sum\limits_{{x_j}&space;\in&space;{X_i}}&space;{\left(&space;{{x_j}&space;-&space;{\mu&space;_i}}&space;\right){{\left(&space;{{x_j}&space;-&space;{\mu&space;_i}}&space;\right)}^T}}&space;}" title="{S_W} = \sum\limits_{i = 1}^c {\sum\limits_{{x_j} \in {X_i}} {\left( {{x_j} - {\mu _i}} \right){{\left( {{x_j} - {\mu _i}} \right)}^T}} }" /></a></p>

where N<sub>i</sub> is the number of samples in class i and μ is the total mean:

<p align="center"><a href="https://www.codecogs.com/eqnedit.php?latex=\mu&space;=&space;\frac{1}{N}\sum\limits_{i&space;=&space;1}^N&space;{{x_i}}" target="_blank"><img src="https://latex.codecogs.com/gif.latex?\mu&space;=&space;\frac{1}{N}\sum\limits_{i&space;=&space;1}^N&space;{{x_i}}" title="\mu = \frac{1}{N}\sum\limits_{i = 1}^N {{x_i}}" /></a></p>

and μ<sub>i</sub> is the mean of class i∈{1,…,c}:

<p align="center"><a href="https://www.codecogs.com/eqnedit.php?latex={\mu&space;_i}&space;=&space;\frac{1}{{\left|&space;{{X_i}}&space;\right|}}\sum\limits_{{x_j}&space;\in&space;{X_i}}&space;{{x_j}}" target="_blank"><img src="https://latex.codecogs.com/gif.latex?{\mu&space;_i}&space;=&space;\frac{1}{{\left|&space;{{X_i}}&space;\right|}}\sum\limits_{{x_j}&space;\in&space;{X_i}}&space;{{x_j}}" title="{\mu _i} = \frac{1}{{\left| {{X_i}} \right|}}\sum\limits_{{x_j} \in {X_i}} {{x_j}}" /></a></p>

Fisher's classic algorithm now looks for a projection W that maximizes the class separability criterion:

<p align="center"><a href="https://www.codecogs.com/eqnedit.php?latex={W_{opt}}&space;=&space;\arg&space;{\max&space;_W}\frac{{\left|&space;{{W^T}{S_B}W}&space;\right|}}{{\left|&space;{{W^T}{S_W}W}&space;\right|}}" target="_blank"><img src="https://latex.codecogs.com/gif.latex?{W_{opt}}&space;=&space;\arg&space;{\max&space;_W}\frac{{\left|&space;{{W^T}{S_B}W}&space;\right|}}{{\left|&space;{{W^T}{S_W}W}&space;\right|}}" title="{W_{opt}} = \arg {\max _W}\frac{{\left| {{W^T}{S_B}W} \right|}}{{\left| {{W^T}{S_W}W} \right|}}" /></a></p>

The solution to this optimization problem is given by solving the generalized eigenvalue problem:

<p align="center"><a href="https://www.codecogs.com/eqnedit.php?latex={S_B}{\upsilon&space;_i}&space;=&space;{\lambda&space;_i}{S_W}{\upsilon&space;_i}" target="_blank"><img src="https://latex.codecogs.com/gif.latex?{S_B}{\upsilon&space;_i}&space;=&space;{\lambda&space;_i}{S_W}{\upsilon&space;_i}" title="{S_B}{\upsilon _i} = {\lambda _i}{S_W}{\upsilon _i}" /></a></p>

<p align="center"><a href="https://www.codecogs.com/eqnedit.php?latex=S_W^{&space;-&space;1}{S_B}{\upsilon&space;_i}&space;=&space;{\lambda&space;_i}{\upsilon&space;_i}" target="_blank"><img src="https://latex.codecogs.com/gif.latex?S_W^{&space;-&space;1}{S_B}{\upsilon&space;_i}&space;=&space;{\lambda&space;_i}{\upsilon&space;_i}" title="S_W^{ - 1}{S_B}{\upsilon _i} = {\lambda _i}{\upsilon _i}" /></a></p>
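
The scatter matrices and the eigenvalue problem above can be sketched for a toy case: two classes of 2-D points. With only two classes, S<sub>B</sub> has rank 1, and the single discriminant direction is proportional to S<sub>W</sub><sup>-1</sup>(μ<sub>1</sub> - μ<sub>2</sub>), so no eigen-solver is needed. The code below is a minimal illustration in plain C++ under that assumption; the types and function names (`Scatter`, `scatter_matrices`, `fisher_direction`) are hypothetical and are not part of OpenCV.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

using Vec2 = std::array<double, 2>;
using Mat2 = std::array<std::array<double, 2>, 2>;
using Class = std::vector<Vec2>;   // all samples of one class

// Mean of the samples of one class.
Vec2 mean_of(const Class& xs) {
    assert(!xs.empty());
    Vec2 m{0.0, 0.0};
    for (const Vec2& x : xs) { m[0] += x[0]; m[1] += x[1]; }
    m[0] /= static_cast<double>(xs.size());
    m[1] /= static_cast<double>(xs.size());
    return m;
}

// M += s * d * d^T (scaled outer product).
void add_outer(Mat2& M, const Vec2& d, double s) {
    for (std::size_t r = 0; r < 2; ++r)
        for (std::size_t c = 0; c < 2; ++c)
            M[r][c] += s * d[r] * d[c];
}

struct Scatter { Mat2 between; Mat2 within; };

// Between-class scatter S_B and within-class scatter S_W.
Scatter scatter_matrices(const std::vector<Class>& classes) {
    Vec2 mu{0.0, 0.0};             // total mean over all samples
    std::size_t n = 0;
    for (const Class& cls : classes)
        for (const Vec2& x : cls) { mu[0] += x[0]; mu[1] += x[1]; ++n; }
    assert(n > 0);
    mu[0] /= static_cast<double>(n);
    mu[1] /= static_cast<double>(n);

    Scatter s{};                   // both matrices start at zero
    for (const Class& cls : classes) {
        const Vec2 mi = mean_of(cls);
        add_outer(s.between, Vec2{mi[0] - mu[0], mi[1] - mu[1]},
                  static_cast<double>(cls.size()));
        for (const Vec2& x : cls)
            add_outer(s.within, Vec2{x[0] - mi[0], x[1] - mi[1]}, 1.0);
    }
    return s;
}

// Two-class special case: w is proportional to S_W^{-1} (mu_a - mu_b).
Vec2 fisher_direction(const Class& a, const Class& b) {
    const Scatter s = scatter_matrices({a, b});
    const Mat2& W = s.within;
    const double det = W[0][0] * W[1][1] - W[0][1] * W[1][0];
    assert(det != 0.0);            // S_W must be invertible
    const Vec2 ma = mean_of(a), mb = mean_of(b);
    const Vec2 d{ma[0] - mb[0], ma[1] - mb[1]};
    return Vec2{( W[1][1] * d[0] - W[0][1] * d[1]) / det,
                (-W[1][0] * d[0] + W[0][0] * d[1]) / det};
}
```

For two square clusters that differ only by a shift along the x axis, the resulting direction points along x, which is the axis that separates the classes. The `assert` on the determinant marks the place where a singular S<sub>W</sub> would break this approach.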

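There is one problem left to solve: the rank of S<sub>W</sub> is at most (N-c), with N samples and c classes. In pattern recognition problems the number of samples N is almost always smaller than the dimension of the input data (the number of pixels), so the scatter matrix S<sub>W</sub> becomes singular (see [173]). This is solved by first performing a Principal Component Analysis on the data and projecting the samples into the (N-c)-dimensional space. Linear Discriminant Analysis is then performed on the reduced data, because S<sub>W</sub> is no longer singular there.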

The optimization problem can then be rewritten as follows, where S<sub>T</sub> is the total scatter matrix:
<p align="center"><a href="https://www.codecogs.com/eqnedit.php?latex={W_{pca}}&space;=&space;\arg&space;{\max&space;_W}\left|&space;{{W^T}{S_T}W}&space;\right|" target="_blank"><img src="https://latex.codecogs.com/gif.latex?{W_{pca}}&space;=&space;\arg&space;{\max&space;_W}\left|&space;{{W^T}{S_T}W}&space;\right|" title="{W_{pca}} = \arg {\max _W}\left| {{W^T}{S_T}W} \right|" /></a></p>


<p align="center"><a href="https://www.codecogs.com/eqnedit.php?latex={W_{fld}}&space;=&space;\arg&space;{\max&space;_W}\frac{{\left|&space;{{W^T}W_{pca}^T{S_B}{W_{pca}}W}&space;\right|}}{{\left|&space;{{W^T}W_{pca}^T{S_W}{W_{pca}}W}&space;\right|}}" target="_blank"><img src="https://latex.codecogs.com/gif.latex?{W_{fld}}&space;=&space;\arg&space;{\max&space;_W}\frac{{\left|&space;{{W^T}W_{pca}^T{S_B}{W_{pca}}W}&space;\right|}}{{\left|&space;{{W^T}W_{pca}^T{S_W}{W_{pca}}W}&space;\right|}}" title="{W_{fld}} = \arg {\max _W}\frac{{\left| {{W^T}W_{pca}^T{S_B}{W_{pca}}W} \right|}}{{\left| {{W^T}W_{pca}^T{S_W}{W_{pca}}W} \right|}}" /></a></p>


The transformation matrix W that projects a sample into the (c-1)-dimensional space is then given by:

<p align="center"><a href="https://www.codecogs.com/eqnedit.php?latex=W&space;=&space;W_{fld}^TW_{pca}^T" target="_blank"><img src="https://latex.codecogs.com/gif.latex?W&space;=&space;W_{fld}^TW_{pca}^T" title="W = W_{fld}^TW_{pca}^T" /></a></p>
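
The composition of the two projections can be checked with a few lines of plain C++. The sketch below assumes W<sub>pca</sub> is a d×(N-c) matrix and W<sub>fld</sub> is an (N-c)×(c-1) matrix, both stored row-major, and it leaves out the mean subtraction for brevity. The helper names (`transpose`, `multiply`, `project`, `combined_projection`) are hypothetical and only serve this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Vec = std::vector<double>;
using Mat = std::vector<Vec>;      // row-major: m[row][col]

Mat transpose(const Mat& m) {
    assert(!m.empty());
    Mat t(m[0].size(), Vec(m.size(), 0.0));
    for (std::size_t r = 0; r < m.size(); ++r) {
        assert(m[r].size() == m[0].size());
        for (std::size_t c = 0; c < m[r].size(); ++c) t[c][r] = m[r][c];
    }
    return t;
}

Mat multiply(const Mat& a, const Mat& b) {
    assert(!a.empty() && !b.empty() && a[0].size() == b.size());
    Mat out(a.size(), Vec(b[0].size(), 0.0));
    for (std::size_t r = 0; r < a.size(); ++r)
        for (std::size_t k = 0; k < b.size(); ++k)
            for (std::size_t c = 0; c < b[0].size(); ++c)
                out[r][c] += a[r][k] * b[k][c];
    return out;
}

// y = M * x
Vec project(const Mat& m, const Vec& x) {
    assert(!m.empty() && m[0].size() == x.size());
    Vec y(m.size(), 0.0);
    for (std::size_t r = 0; r < m.size(); ++r)
        for (std::size_t c = 0; c < x.size(); ++c)
            y[r] += m[r][c] * x[c];
    return y;
}

// W = W_fld^T * W_pca^T maps a d-dimensional sample to (c-1) dimensions.
Mat combined_projection(const Mat& w_pca, const Mat& w_fld) {
    return multiply(transpose(w_fld), transpose(w_pca));
}
```

Applying `combined_projection(w_pca, w_fld)` to a sample gives the same result as projecting with the transpose of `w_pca` first and then with the transpose of `w_fld`. The combined matrix has (c-1) rows and d columns, so each of its rows has the length of an input image, which is why a Fisherface can be displayed as an image.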


## 10.1.7 Fisherfaces Example

```cpp
// Listing 10-6
/*
 * Copyright (c) 2011. Philipp Wagner <bytefish[at]gmx[dot]de>.
 * Released to public domain under terms of the BSD Simplified license.
 *
 * Redistribution and use in source and binary forms, with or without
 * modification, are permitted provided that the following conditions are met:
 *   * Redistributions of source code must retain the above copyright
 *     notice, this list of conditions and the following disclaimer.
 *   * Redistributions in binary form must reproduce the above copyright
 *     notice, this list of conditions and the following disclaimer in the
 *     documentation and/or other materials provided with the distribution.
 *   * Neither the name of the organization nor the names of its contributors
 *     may be used to endorse or promote products derived from this software
 *     without specific prior written permission.
 *
 *   See <http://www.opensource.org/licenses/bsd-license>
 */
#include "opencv2/core.hpp"
#include "opencv2/face.hpp"
#include "opencv2/highgui.hpp"
#include "opencv2/imgproc.hpp"
#include <iostream>
#include <fstream>
#include <sstream>
#include <cstdlib>   // atoi
using namespace cv;
using namespace cv::face;
using namespace std;
static Mat norm_0_255(InputArray _src) {
    Mat src = _src.getMat();
    // Create and return normalized image:
    Mat dst;
    switch(src.channels()) {
    case 1:
        cv::normalize(_src, dst, 0, 255, NORM_MINMAX, CV_8UC1);
        break;
    case 3:
        cv::normalize(_src, dst, 0, 255, NORM_MINMAX, CV_8UC3);
        break;
    default:
        src.copyTo(dst);
        break;
    }
    return dst;
}
static void read_csv(const string& filename, vector<Mat>& images, vector<int>& labels, char separator = ';') {
    std::ifstream file(filename.c_str(), ifstream::in);
    if (!file) {
        string error_message = "No valid input file was given, please check the given filename.";
        CV_Error(Error::StsBadArg, error_message);
    }
    string line, path, classlabel;
    while (getline(file, line)) {
        stringstream liness(line);
        getline(liness, path, separator);
        getline(liness, classlabel);
        if(!path.empty() && !classlabel.empty()) {
            images.push_back(imread(path, 0));
            labels.push_back(atoi(classlabel.c_str()));
        }
    }
}
int main(int argc, const char *argv[]) {
    // Check for valid command line arguments, print usage
    // if no arguments were given.
    if (argc < 2) {
        cout << "usage: " << argv[0] << " <csv.ext> <output_folder> " << endl;
        exit(1);
    }
    string output_folder = ".";
    if (argc == 3) {
        output_folder = string(argv[2]);
    }
    // Get the path to your CSV.
    string fn_csv = string(argv[1]);
    // These vectors hold the images and corresponding labels.
    vector<Mat> images;
    vector<int> labels;
    // Read in the data. This can fail if no valid
    // input filename is given.
    try {
        read_csv(fn_csv, images, labels);
    } catch (const cv::Exception& e) {
        cerr << "Error opening file \"" << fn_csv << "\". Reason: " << e.msg << endl;
        // nothing more we can do
        exit(1);
    }
    // Quit if there are not enough images for this demo.
    if(images.size() <= 1) {
        string error_message = "This demo needs at least 2 images to work. Please add more images to your data set!";
        CV_Error(Error::StsError, error_message);
    }
    // Get the height from the first image. We'll need this
    // later in code to reshape the images to their original
    // size:
    int height = images[0].rows;
    // The following lines simply get the last images from
    // your dataset and remove it from the vector. This is
    // done, so that the training data (which we learn the
    // cv::BasicFaceRecognizer on) and the test data we test
    // the model with, do not overlap.
    Mat testSample = images[images.size() - 1];
    int testLabel = labels[labels.size() - 1];
    images.pop_back();
    labels.pop_back();
    // The following lines create a Fisherfaces model for
    // face recognition and train it with the images and
    // labels read from the given CSV file.
    // If you just want to keep 10 Fisherfaces, then call
    // the factory method like this:
    //
    //      FisherFaceRecognizer::create(10);
    //
    // However it is not useful to discard Fisherfaces! Please
    // always try to use _all_ available Fisherfaces for
    // classification.
    //
    // If you want to create a FaceRecognizer with a
    // confidence threshold (e.g. 123.0) and use _all_
    // Fisherfaces, then call it with:
    //
    //      FisherFaceRecognizer::create(0, 123.0);
    //
    Ptr<FisherFaceRecognizer> model = FisherFaceRecognizer::create();
    model->train(images, labels);
    // The following line predicts the label of a given
    // test image:
    int predictedLabel = model->predict(testSample);
    //
    // To get the confidence of a prediction call the model with:
    //
    //      int predictedLabel = -1;
    //      double confidence = 0.0;
    //      model->predict(testSample, predictedLabel, confidence);
    //
    string result_message = format("Predicted class = %d / Actual class = %d.", predictedLabel, testLabel);
    cout << result_message << endl;
    // Here is how to get the eigenvalues of this Fisherfaces model:
    Mat eigenvalues = model->getEigenValues();
    // And we can do the same to display the Eigenvectors (read Fisherfaces):
    Mat W = model->getEigenVectors();
    // Get the sample mean from the training data
    Mat mean = model->getMean();
    // Display or save:
    if(argc == 2) {
        imshow("mean", norm_0_255(mean.reshape(1, images[0].rows)));
    } else {
        imwrite(format("%s/mean.png", output_folder.c_str()), norm_0_255(mean.reshape(1, images[0].rows)));
    }
    // Display or save the first, at most 16 Fisherfaces:
    for (int i = 0; i < min(16, W.cols); i++) {
        string msg = format("Eigenvalue #%d = %.5f", i, eigenvalues.at<double>(i));
        cout << msg << endl;
        // get eigenvector #i
        Mat ev = W.col(i).clone();
        // Reshape to original size & normalize to [0...255] for imshow.
        Mat grayscale = norm_0_255(ev.reshape(1, height));
        // Show the image & apply a Bone colormap for better sensing.
        Mat cgrayscale;
        applyColorMap(grayscale, cgrayscale, COLORMAP_BONE);
        // Display or save:
        if(argc == 2) {
            imshow(format("fisherface_%d", i), cgrayscale);
        } else {
            imwrite(format("%s/fisherface_%d.png", output_folder.c_str(), i), norm_0_255(cgrayscale));
        }
    }
    // Display or save the image reconstruction at some predefined steps:
    for(int num_component = 0; num_component < min(16, W.cols); num_component++) {
        // Slice the Fisherface from the model:
        Mat ev = W.col(num_component);
        Mat projection = LDA::subspaceProject(ev, mean, images[0].reshape(1,1));
        Mat reconstruction = LDA::subspaceReconstruct(ev, mean, projection);
        // Normalize the result:
        reconstruction = norm_0_255(reconstruction.reshape(1, images[0].rows));
        // Display or save:
        if(argc == 2) {
            imshow(format("fisherface_reconstruction_%d", num_component), reconstruction);
        } else {
            imwrite(format("%s/fisherface_reconstruction_%d.png", output_folder.c_str(), num_component), reconstruction);
        }
    }
    // Display if we are not writing to an output folder:
    if(argc == 2) {
        waitKey(0);
    }
    return 0;
}
```

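For the program in Listing 10-6 I use the Yale Facedatabase A, simply because the plots look nicer. Each Fisherface has the same length as an original image, so it can be displayed as an image. The demo shows (or saves) the first Fisherfaces, at most 16 of them; the results are shown in Figure 10-3.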

<p align="center">
<img src="https://img-blog.csdnimg.cn/20200226144508114.png" height="300">
</p>

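The Fisherfaces method learns a class-specific transformation matrix, so the Fisherfaces do not capture illumination as obviously as the Eigenfaces do. Discriminant analysis instead finds the facial features that discriminate between persons. It is important to mention that the performance of Fisherfaces also depends heavily on the input data. Practically speaking: if you learn the Fisherfaces from well-illuminated pictures only and then try to recognize faces in badly illuminated scenes, the method is likely to find the wrong components (simply because those features may not be predominant in badly illuminated images). This is somewhat logical, since the method had no chance to learn the illumination.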

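Fisherfaces allow a reconstruction of the projected image, just as Eigenfaces do. But since we only identified the features that distinguish between subjects, you cannot expect a good reconstruction of the original image. For the Fisherfaces method we instead project the sample image onto each Fisherface separately, which gives a nice visualization of the feature each Fisherface describes. The code is given in Listing 10-7 and the result in Figure 10-4. The differences may be subtle to the human eye, but you should still be able to see some: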

```cpp
// Listing 10-7
// Display or save the image reconstruction at some predefined steps:
for(int num_component = 0; num_component < min(16, W.cols); num_component++) {
    // Slice the Fisherface from the model:
    Mat ev = W.col(num_component);
    Mat projection = LDA::subspaceProject(ev, mean, images[0].reshape(1,1));
    Mat reconstruction = LDA::subspaceReconstruct(ev, mean, projection);
    // Normalize the result:
    reconstruction = norm_0_255(reconstruction.reshape(1, images[0].rows));
    // Display or save:
    if(argc == 2) {
        imshow(format("fisherface_reconstruction_%d", num_component), reconstruction);
    } else {
        imwrite(format("%s/fisherface_reconstruction_%d.png", output_folder.c_str(), num_component), reconstruction);
    }
}
```
<p align="center">
<img src="https://img-blog.csdnimg.cn/20200226144628258.png" height="300">
</p>


================================================
FILE: chapter 10/w
================================================


================================================
FILE: chapter 10/使用人脸标志 API.md
================================================
<p align="center">
  <img src="https://img-blog.csdnimg.cn/202003011934311.jpg" height="300"></img>
</p>

&nbsp;
&nbsp;
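<p align="center"><strong>Special Note</strong></p>
&nbsp;

Because Xiaobai is still a student and the project schedule has been tight lately, there is little time available to organize this GitHub repository. The parts that are already finished are therefore published now for everyone to study and discuss.

&nbsp;

The **complete e-book PDF** is finished. Readers who prefer an e-book or urgently need the complete version can get it by replying **扩展模块中文教程** in the **小白学视觉** WeChat official account.

&nbsp;
[An early-access version can also be downloaded from CSDN](https://download.csdn.net/download/qq_42722197/12178880)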


================================================
FILE: chapter 10/使用人脸标志AMM.md
================================================


================================================
FILE: chapter 10/使用人脸标志检测进行人脸交换.md
================================================


================================================
FILE: chapter 10/向人脸标志API添加新算法.md
================================================


================================================
FILE: chapter 10/图像中的人脸标志检测.md
================================================


================================================
FILE: chapter 10/视频中的人脸标志检测.md
================================================


================================================
FILE: chapter 10/训练面部标志探测器.md
================================================


================================================
FILE: chapter 11/3
================================================


================================================
FILE: chapter 11/使用模糊变换滤波.md
================================================


================================================
FILE: chapter 11/模糊变换理论.md
================================================


================================================
FILE: chapter 11/通过模糊变换进行图像修复.md
================================================


================================================
FILE: chapter 12/2
================================================


================================================
FILE: chapter 12/创建、写入和读取数据集.md
================================================


================================================
FILE: chapter 12/建立群组.md
================================================


================================================
FILE: chapter 12/读写属性.md
================================================


================================================
FILE: chapter 13/4
================================================


================================================
FILE: chapter 13/线特征教程.md
================================================


================================================
FILE: chapter 14/4
================================================


================================================
FILE: chapter 14/二维相位图展开.md
================================================


================================================
FILE: chapter 15/SFM 模块安装.md
================================================


================================================
FILE: chapter 15/e
================================================


================================================
FILE: chapter 15/场景重建.md
================================================


================================================
FILE: chapter 15/导入重建模型.md
================================================


================================================
FILE: chapter 15/相机运动估计.md
================================================


================================================
FILE: chapter 16/5
================================================


================================================
FILE: chapter 16/准稠密立体视觉.md
================================================


================================================
FILE: chapter 16/生成模板参数文件.md
================================================


================================================
FILE: chapter 17/3
================================================


================================================
FILE: chapter 17/捕捉格雷码.md
================================================


================================================
FILE: chapter 17/捕捉正弦图案.md
================================================


================================================
FILE: chapter 17/解码格雷码教程.md
================================================


================================================
FILE: chapter 18/4
================================================


================================================
FILE: chapter 18/OpenCV跟踪器介绍.md
================================================


================================================
FILE: chapter 18/使用MultiTracker跟踪.md
================================================


================================================
FILE: chapter 18/自定义CN跟踪器.md
================================================


================================================
FILE: chapter 19/3
================================================


================================================
FILE: chapter 19/位姿变换.md
================================================


================================================
FILE: chapter 19/创建3D直方图.md
================================================


================================================
FILE: chapter 19/创建小部件.md
================================================


================================================
FILE: chapter 19/设置物体位姿.md
================================================


================================================
FILE: chapter 19/运行viz.md
================================================


================================================
FILE: chapter 2/ArUco标记板的检测.md
================================================
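An ArUco board is a set of markers, as shown in Figure 2-13, that acts like a single marker in the sense that it provides a single pose for the camera. The most common boards have all their markers in the same plane, because they can be printed directly. However, a board is not limited to this arrangement and can represent any 2D or 3D layout.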

<p align="center">
 <img src="https://img-blog.csdnimg.cn/20200223141221760.jpg" height="300">
</p>

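The difference between a board and a set of independent markers is that the relative positions of the markers in a board are known a priori. This allows the corners of all the markers to be used to estimate the pose of the camera with respect to the whole board. With a set of independent markers, the relative positions of the markers in the environment are unknown, so the pose of each marker can only be estimated individually.

The main benefits of using a board are:
-	Pose estimation is more versatile. Only some of the markers are needed to estimate the pose, so it can still be computed when there are occlusions or only a partial view.
-	The estimated pose is usually more accurate, because a larger number of point correspondences (marker corners) is used.

The aruco module supports boards through the cv::aruco::Board class, which defines the board layout as shown in Listing 2-15.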
```cpp
// Listing 2-15: Definition of the Board class
class Board {
public:
    std::vector<std::vector<cv::Point3f> > objPoints;
    cv::Ptr<cv::aruco::Dictionary> dictionary;
    std::vector<int> ids;
};
```

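A Board object has three parameters:
-	objPoints is the list of corner positions in the 3D board reference system, i.e. the board layout. For each marker, its four corners are stored in the standard order: clockwise, starting with the top-left corner.
-	dictionary indicates which marker dictionary the markers on the board belong to.
-	ids gives the identifier, within the specified dictionary, of each marker in objPoints.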

## 2.2.1	Board detection
The aruco module provides a dedicated function, estimatePoseBoard(), for estimating the pose of a board. Its usage is shown in Listing 2-16.
```cpp
// Listing 2-16: Board pose estimation
cv::Mat inputImage;
// Read the camera parameters from somewhere
cv::Mat cameraMatrix, distCoeffs;
readCameraParameters(cameraMatrix, distCoeffs);
// Placeholder: assume we have some function that creates the board object
cv::Ptr<cv::aruco::Board> board = cv::aruco::Board::create();
...
std::vector<int> markerIds;
std::vector<std::vector<cv::Point2f>> markerCorners;
// board is a cv::Ptr, so its members are accessed with ->
cv::aruco::detectMarkers(inputImage, board->dictionary, markerCorners, markerIds);
// If at least one marker was detected
if(markerIds.size() > 0) {
    cv::Vec3d rvec, tvec;
    int valid = cv::aruco::estimatePoseBoard(markerCorners, markerIds, board, cameraMatrix, distCoeffs, rvec, tvec);
}
```
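The parameters of estimatePoseBoard() are:

-	markerCorners and markerIds: the marker detections returned by detectMarkers().
-	board: the Board object that defines the board layout and its ids.
-	cameraMatrix and distCoeffs: the camera calibration parameters required for pose estimation.
-	rvec and tvec: the estimated pose of the board.
-	The return value is the total number of markers used to estimate the board pose. Note that not all the markers in markerCorners and markerIds are necessarily used, because only markers whose ids are listed in Board::ids are considered.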
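The last point can be illustrated with a small standard-library sketch. The helper `countBoardMarkers` is hypothetical and not part of OpenCV; it only shows, under that assumption, why the returned count can be smaller than the number of detected markers.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper (not an OpenCV function): counts how many detected ids
// are listed in the board's id vector. Only those markers can contribute to
// the board pose.
int countBoardMarkers(const std::vector<int>& detectedIds,
                      const std::vector<int>& boardIds) {
    int count = 0;
    for (int id : detectedIds) {
        assert(id >= 0);  // marker ids are non-negative
        if (std::find(boardIds.begin(), boardIds.end(), id) != boardIds.end())
            ++count;
    }
    return count;
}
```

For a board with ids 0 to 7 and detections {0, 3, 40, 7}, marker 40 does not belong to the board, so only three markers would be counted.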

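The drawAxis() function can be used to draw the estimated pose; the result is shown in Figure 2-14. Even when the board is partially occluded, the pose can still be estimated and the axes drawn, as shown in Figure 2-15.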
<p align="center">
 <img src="https://img-blog.csdnimg.cn/20200223142039640.jpg" height="300">
</p>

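## 2.2.2	Creating a board
Creating a Board object requires specifying the corner positions of every marker in the environment. In many cases, however, a board is simply a set of markers in the same plane, arranged in a grid. The aruco module provides functions that make it easy to create and print boards of this kind. The GridBoard class inherits from Board and represents a board whose markers all lie in the same plane in a grid layout, as shown in Figure 2-13.

Specifically, the board coordinate system lies in the board plane, with its origin at the bottom-left corner of the board and the Z axis pointing out of the board, as shown in Figure 2-14 (X axis: red; Y axis: green; Z axis: blue).

The appearance of the board is controlled by the parameters of the GridBoard object:
-	the number of markers in the X direction;
-	the number of markers in the Y direction;
-	the side length of the markers;
-	the separation between markers;
-	the dictionary of the markers;
-	the ids of all the markers (X×Y markers).

The GridBoard class provides the static function cv::aruco::GridBoard::create() for setting these parameters. Its usage is shown in Listing 2-17.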
```cpp
// Listing 2-17: Creating a GridBoard object
cv::Ptr<cv::aruco::GridBoard> board = cv::aruco::GridBoard::create(5, 7, 0.04, 0.01, dictionary);
```
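-	The first and second parameters are the number of markers in the X and Y directions respectively.
-	The third and fourth parameters are the marker side length and the marker separation. They can be given in any unit (metres are typical), bearing in mind that the estimated pose of the board will be expressed in the same unit.
-	The last parameter is the dictionary of the markers.

The board created in Listing 2-17 consists of 5×7=35 markers. By default, the ids are assigned in ascending order starting from 0, so they are 0, 1, 2, ..., 34. This can easily be customized by accessing the id vector of the board (Board::ids).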
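As a rough check of these numbers, the following standard-library sketch reproduces the default id sequence and the physical extent of the grid. Both helper names are hypothetical, and the extent formula assumes that the separation only appears between neighbouring markers.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Default ids of a grid board: 0, 1, ..., markersX * markersY - 1.
std::vector<int> defaultGridIds(int markersX, int markersY) {
    assert(markersX > 0 && markersY > 0);
    std::vector<int> ids(static_cast<std::size_t>(markersX) * markersY);
    std::iota(ids.begin(), ids.end(), 0);
    return ids;
}

// Length covered by n markers with (n - 1) gaps between them,
// in the same unit as markerLength and markerSeparation.
double gridExtent(int n, double markerLength, double markerSeparation) {
    assert(n > 0);
    return n * markerLength + (n - 1) * markerSeparation;
}
```

With the values from Listing 2-17 (5, 7, 0.04, 0.01), this gives 35 ids and a board of roughly 0.24 by 0.34 in the chosen unit.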

Once the board has been created, the next step is to print it and use it. cv::aruco::GridBoard::draw() generates an image of a GridBoard, as shown in Listing 2-18.
```cpp
// Listing 2-18: Generating a GridBoard image
cv::Ptr<cv::aruco::GridBoard> board = cv::aruco::GridBoard::create(5, 7, 0.04, 0.01, dictionary);
cv::Mat boardImage;
board->draw( cv::Size(600, 500), boardImage, 10, 1 );
```
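-	The first parameter is the size of the output image in pixels, here 600x500 pixels. If this is not proportional to the board dimensions, the board is centered in the image.
-	boardImage: the output image containing the board.
-	The third parameter is the (optional) margin in pixels, so that none of the markers touches the image border. Here the margin is 10.
-	The last parameter is the size of the marker border, as in the drawMarker() function. The default value is 1.

The code in Listing 2-18 outputs the board shown in Figure 2-16. For convenience, the aruco module includes a sample program, create_board.cpp, that can be run directly to create a board. The arguments to pass are shown in Listing 2-19.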

```cpp
// Listing 2-19: Arguments for the sample program
"_output path_/aboard.png" -w=5 -h=7 -l=100 -s=10 -d=10
```

<p align="center">
 <img src="https://img-blog.csdnimg.cn/20200223162544554.jpg" height="300">
</p>
 
Listing 2-20 gives a complete sample program that detects a board and estimates its pose.

```cpp
// Listing 2-20: Board detection sample
cv::VideoCapture inputVideo;
inputVideo.open(0);
cv::Mat cameraMatrix, distCoeffs;
// camera parameters are read from somewhere
readCameraParameters(cameraMatrix, distCoeffs);
cv::Ptr<cv::aruco::Dictionary> dictionary = cv::aruco::getPredefinedDictionary(cv::aruco::DICT_6X6_250);
cv::Ptr<cv::aruco::GridBoard> board = cv::aruco::GridBoard::create(5, 7, 0.04, 0.01, dictionary);
// delay between frames in ms; not defined in the original listing, value assumed
const int waitTime = 10;
while (inputVideo.grab()) {
    cv::Mat image, imageCopy;
    inputVideo.retrieve(image);
    image.copyTo(imageCopy);
    std::vector<int> ids;
    std::vector<std::vector<cv::Point2f> > corners;
    cv::aruco::detectMarkers(image, dictionary, corners, ids);
    // if at least one marker detected
    if (ids.size() > 0) {
        cv::aruco::drawDetectedMarkers(imageCopy, corners, ids);
        cv::Vec3d rvec, tvec;
        int valid = cv::aruco::estimatePoseBoard(corners, ids, board, cameraMatrix, distCoeffs, rvec, tvec);
        // if at least one board marker detected
        if(valid > 0)
            cv::aruco::drawAxis(imageCopy, cameraMatrix, distCoeffs, rvec, tvec, 0.1);
    }
    cv::imshow("out", imageCopy);
    char key = (char) cv::waitKey(waitTime);
    if (key == 27)  // ESC quits
        break;
}
```

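For convenience, OpenCV provides the file detect_board.cpp in the aruco module, which can be run to detect a board and estimate its pose. The arguments to pass are shown in Listing 2-21.
```cpp
// Listing 2-21
-c="_path_/calib.txt" "_path_/aboard.png" -w=5 -h=7 -l=100 -s=10 -d=10
```
A screenshot from the board detection video is shown in Figure 2-17.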
<p align="center">
 <img src="https://img-blog.csdnimg.cn/20200223163022569.jpg" height="300">
</p>

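**Tip**
***The complete video is available by replying “ArUco标记板检测” in the 小白学视觉 WeChat official account.***

## 2.2.3	Refining marker detection
If the board layout is already known, that information can be used to recover markers on an occluded board, i.e. to find markers of the board that were not identified. OpenCV provides the refineDetectedMarkers() function for this step. Note that it should be called after detectMarkers().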

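The main parameters of this function are the original image in which the markers were detected, the Board object, the corners of the detected markers, the ids of the detected markers and the corners of the rejected candidates. The rejected candidates, also simply called candidates, can be obtained from detectMarkers(). They are squares that were found in the original image but failed the identification step (perhaps because their inner coding contained too many errors), so they were not recognized as markers.

Because of high-frequency noise in the image, low resolution or other problems that affect the extraction of the binary code, these candidates are sometimes real markers that could not be identified correctly. refineDetectedMarkers() looks for correspondences between these candidates and the missing markers of the board. The search is based on two pieces of information:
-	The distance between the candidate and the projection of the missing marker. To obtain these projections, at least one marker of the board must have been detected. If camera parameters are provided, they are used to obtain the projections. If not, the projections are obtained from a local homography, in which case only planar boards are allowed (i.e. the Z coordinate of all marker corners must be the same). The minRepDistance parameter of refineDetectedMarkers() determines the minimum Euclidean distance between the candidate corners and the projected marker corners (default value 10).
-	The binary code. If a candidate passes the minimum distance condition, its inner bits are analysed again to determine whether it really is the projected marker. In this case, however, the condition is less strict and more erroneous bits may be allowed. The errorCorrectionRate parameter of refineDetectedMarkers() (default value 3.0) sets the rate of erroneous bits allowed. If it is negative, the inner bits are not analysed at all and only the corner distances are evaluated.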
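The distance criterion can be sketched with the standard library alone. This is not OpenCV's implementation: the `Pt` and `Quad` types are hypothetical, and using the largest corner-to-corner distance as the score is an assumption made for illustration. The bit re-analysis step is not modelled.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>

struct Pt { double x, y; };      // hypothetical 2D point type
using Quad = std::array<Pt, 4>;  // the four corners of one marker

// Largest corner-to-corner Euclidean distance between a rejected candidate
// and the projection of a missing board marker.
double maxCornerDistance(const Quad& candidate, const Quad& projected) {
    double worst = 0.0;
    for (std::size_t i = 0; i < 4; ++i) {
        worst = std::max(worst, std::hypot(candidate[i].x - projected[i].x,
                                           candidate[i].y - projected[i].y));
    }
    return worst;
}

// Distance test only: the candidate is kept for further analysis when all
// of its corners lie within minRepDistance of the projected corners.
bool passesDistanceTest(const Quad& candidate, const Quad& projected,
                        double minRepDistance = 10.0) {
    assert(minRepDistance >= 0.0);
    return maxCornerDistance(candidate, projected) < minRepDistance;
}
```

A candidate shifted by (3, 4) pixels from the projection is 5 pixels away and would pass with the default threshold of 10, while one shifted by (30, 40) would not.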

The usage of refineDetectedMarkers() is shown in Listing 2-22.
```cpp
// Listing 2-22: Refining marker detection
cv::Mat inputImage;  // the image to analyse, loaded elsewhere
cv::Ptr<cv::aruco::Dictionary> dictionary = cv::aruco::getPredefinedDictionary(cv::aruco::DICT_6X6_250);
cv::Ptr<cv::aruco::GridBoard> board = cv::aruco::GridBoard::create(5, 7, 0.04, 0.01, dictionary);
std::vector<int> markerIds;
std::vector<std::vector<cv::Point2f>> markerCorners, rejectedCandidates;
cv::aruco::detectMarkers(inputImage, dictionary, markerCorners, markerIds, cv::aruco::DetectorParameters::create(), rejectedCandidates);
cv::aruco::refineDetectedMarkers(inputImage, board, markerCorners, markerIds, rejectedCandidates);
// After this call, any newly detected markers are removed from rejectedCandidates
// and appended to the end of markerCorners and markerIds
```
Note also that if the number of markers detected at first is too small (for example only 1 or 2), the projections of the missing markers may be of poor quality and produce wrong correspondences.


================================================
FILE: chapter 2/ArUco标记检测.md
================================================

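Pose estimation is very important in many computer vision applications, such as robot navigation and augmented reality. The process is based on finding correspondences between points in the real environment and their projections in the 2D image. Because the projection goes from 3D to 2D, one dimension is lost, which makes pose estimation a difficult problem. For this reason, synthetic or fiducial markers are commonly used to add information and make pose estimation easier.

One of the most popular approaches is to use binary square fiducial markers. Their main advantage is that a single marker provides enough correspondences (its four corners) to obtain the camera pose. In addition, the inner binary coding makes them robust to identify and allows error detection and correction techniques to be applied.

The aruco module is based on the ArUco library, a popular library for detecting square fiducial markers developed by Rafael Muñoz and Sergio Garrido.

The aruco functions can be used in a program by adding “#include <opencv2/aruco.hpp>”.
## 2.1.1 Markers and dictionaries
An ArUco marker is a square marker composed of a wide black border and an inner binary matrix that determines its identifier (id). The black border allows fast detection in the image, and the binary coding is used to identify the marker and to provide error detection and correction. The marker size determines the size of the inner matrix; for example, a 4x4 marker is composed of 16 bits. Some examples of ArUco markers are shown in Figure 2-1.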

<p align="center">
 <img src="https://img-blog.csdnimg.cn/20200223104948460.jpg" height="350">
</p>
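
Note that a marker may appear rotated or upside down in the environment. The detection process has to determine its original rotation from the binary coding, so that each corner is identified unambiguously. For this reason, ArUco markers do not use patterns that are point-symmetric or axis-symmetric.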
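Why rotational symmetry is a problem can be seen in a small standard-library sketch. The `Bits` type and both functions are hypothetical and only illustrate the idea: if a bit matrix equals one of its own rotations, the original orientation cannot be recovered from the code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Bits = std::vector<std::vector<int>>;  // square matrix of 0/1 values

// Rotate a square bit matrix by 90 degrees clockwise.
Bits rotate90(const Bits& m) {
    const std::size_t n = m.size();
    Bits r(n, std::vector<int>(n));
    for (std::size_t i = 0; i < n; ++i) {
        assert(m[i].size() == n);  // the matrix must be square
        for (std::size_t j = 0; j < n; ++j)
            r[j][n - 1 - i] = m[i][j];
    }
    return r;
}

// True if the matrix differs from its 90, 180 and 270 degree rotations,
// so that the orientation of the marker can be determined from its bits.
bool hasUniqueOrientation(const Bits& m) {
    const Bits r1 = rotate90(m);
    const Bits r2 = rotate90(r1);
    const Bits r3 = rotate90(r2);
    return m != r1 && m != r2 && m != r3;
}
```

A pattern with a single white bit in one corner has a unique orientation, whereas a diagonal pattern looks the same after a 180 degree rotation and would be ambiguous.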

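A dictionary of markers is the set of markers considered in a specific application; it stores the binary coding of each of its markers.
A dictionary has two main properties, the dictionary size and the marker size:

-	The dictionary size is the number of markers that compose the dictionary.
-	The marker size is the size of those markers, i.e. the number of bits.


The aruco module includes some predefined dictionaries covering a range of dictionary sizes and marker sizes.

Some marker systems store information directly in the bits, so that converting the binary code to a decimal number gives the encoded value. ArUco markers do not work this way, because with large markers the number of bits becomes high, and managing such large amounts of data in a real-time system is not practical. Instead, a marker id is simply the index of the marker within its dictionary, which greatly reduces the amount of data to handle. For example, the first 5 markers of a dictionary have the ids 0, 1, 2, 3 and 4, and an application can associate any data it likes with them.
## 2.1.2 Creating markers
Before markers can be detected, they need to be printed and placed in the environment. Marker images can be generated with the drawMarker() function, for example with the code in Listing 2-1.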
```cpp
// Listing 2-1: Generating a marker image
cv::Mat markerImage;
cv::Ptr<cv::aruco::Dictionary> dictionary = cv::aruco::getPredefinedDictionary(cv::aruco::DICT_6X6_250);
cv::aruco::drawMarker(dictionary, 23, 200, markerImage, 1);
```
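To generate a marker, a Dictionary object is first created by choosing one of the predefined dictionaries in the aruco module. Listing 2-1 uses a dictionary of 250 markers with a marker size of 6×6 bits (DICT_6X6_250).

The drawMarker() function takes five parameters:
-	The first parameter is the Dictionary object created previously.
-	The second parameter is the marker id, here marker 23 of the dictionary DICT_6X6_250. Note that each dictionary contains a different number of markers. In DICT_6X6_250 the valid ids run from 0 to 249. Any id outside the valid range produces an exception.
-	The third parameter, 200, is the size of the output marker image; here the output image is 200x200 pixels. Note that this value must be large enough to hold the number of bits of the chosen dictionary. For example, a 5×5 pixel image cannot be generated for a marker of 6×6 bits (and that is without counting the marker border). In addition, to avoid deformation, this value should be proportional to the number of bits plus the border size, or at least much larger than the marker size, so that the deformation is insignificant.
-	The fourth parameter is the output image.
-	The last parameter is optional and specifies the width of the black marker border. The size is given in proportion to the number of bits; for example, a value of 2 means that the border is as wide as two inner bits. The default value is 1.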
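The proportionality advice for the third parameter can be checked with simple integer arithmetic. The sketch below uses hypothetical helper names and assumes that a marker is free of deformation when every cell maps to the same whole number of pixels.

```cpp
#include <cassert>

// Number of cells along one side of a marker: the inner bits plus the
// border on both sides.
int cellsPerSide(int markerBits, int borderBits) {
    assert(markerBits > 0 && borderBits >= 0);
    return markerBits + 2 * borderBits;
}

// True if sidePixels is a whole multiple of the cell count, so that every
// cell is drawn with the same number of pixels.
bool isDistortionFree(int sidePixels, int markerBits, int borderBits) {
    const int cells = cellsPerSide(markerBits, borderBits);
    return sidePixels >= cells && sidePixels % cells == 0;
}
```

For the 6×6 marker of Listing 2-1 with a border of 1, there are 8 cells per side, so a 200 pixel image gives exactly 25 pixels per cell, while a 5 pixel image is too small to hold the marker at all.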

The image generated by Listing 2-1 is shown in Figure 2-2.
 
<p align="center">
<img src="https://img-blog.csdnimg.cn/20200223105957333.jpg" height="300">
</p>

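The aruco module samples include create_marker.cpp, which generates a specified ArUco marker. The arguments passed to the program must follow the format shown in Listing 2-2.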
```cpp
// Listing 2-2: Arguments for create_marker.cpp
"/Users/Sarthak/Dropbox/OpenCV_GSoC/marker.png" -d=10 -id=1
```
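## 2.1.3 Marker detection
Detecting ArUco markers in an image returns a list of detected markers. Each detected marker includes:
-	the positions of its four corners in the image (in their original order);
-	the id of the marker.

The marker detection process has two main steps:
1.	Detection of marker candidates. In this step the image is analysed to find square shapes that could be markers. It starts with adaptive thresholding to segment the markers, then contours are extracted from the thresholded image, and those that are not convex or do not approximate a square are discarded. Some extra filtering can also be applied, such as removing contours that are too small or too big, or removing contours that are too close to each other.
2.	Once the candidates have been found, their inner coding is analysed to determine whether they really are ArUco markers. This step starts by extracting the bits of each marker. First, a perspective transformation is applied to obtain the marker in its canonical form. The canonical image is then thresholded with Otsu's method to separate white and black bits. The image is divided into cells according to the marker size and the border size, and the black and white pixels in each cell are counted to decide whether it is a white or a black bit. Finally, the bits are analysed to determine whether the marker belongs to the specific dictionary, applying error correction techniques when necessary.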
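The pixel-counting part of step 2 can be sketched as a majority vote over the pixels of one cell. The `Image` type and the `cellBit` function are hypothetical stand-ins written with the standard library only; they are not how OpenCV stores images, and ties are resolved as black here by assumption.

```cpp
#include <cassert>
#include <vector>

using Image = std::vector<std::vector<int>>;  // binary image: 0 = black, 1 = white

// Decide the bit of one cell by majority vote over its pixels.
// (row, col) index the cell; cellSize is the cell side length in pixels.
int cellBit(const Image& img, int row, int col, int cellSize) {
    assert(cellSize > 0 && row >= 0 && col >= 0);
    assert((row + 1) * cellSize <= static_cast<int>(img.size()));
    int white = 0;
    for (int y = row * cellSize; y < (row + 1) * cellSize; ++y)
        for (int x = col * cellSize; x < (col + 1) * cellSize; ++x)
            white += img[y][x];
    // More than half of the pixels white: the cell is read as a 1 bit.
    return 2 * white > cellSize * cellSize ? 1 : 0;
}
```

In a 4×4 pixel image split into 2×2 cells, a cell with three white pixels is read as 1 and a cell with one white pixel is read as 0, which is how isolated noisy pixels are tolerated.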

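For example, given the image in Figure 2-3, the detected markers are shown in Figure 2-4, and the candidates rejected during the second step are shown in Figure 2-5.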
<p align="center">
<img src="https://img-blog.csdnimg.cn/20200223111435710.jpg" height="350">
</p>

<p align="center">
<img src="https://img-blog.csdnimg.cn/20200223111650824.png" height="350">
</p>

<p align="center">
<img src="https://img-blog.csdnimg.cn/2020022311195836.jpg" height="350">
</p>

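In the aruco module, detection is performed by the detectMarkers() function. It is the most important function in the module, since all the other functions work on the markers it returns. Listing 2-3 shows an example of marker detection.
```cpp
// Listing 2-3: Detecting markers
cv::Mat inputImage;
...
std::vector<int> markerIds;
std::vector<std::vector<cv::Point2f>> markerCorners, rejectedCandidates;
// Create the parameters object; a default-constructed cv::Ptr would be empty
cv::Ptr<cv::aruco::DetectorParameters> parameters = cv::aruco::DetectorParameters::create();
cv::Ptr<cv::aruco::Dictionary> dictionary = cv::aruco::getPredefinedDictionary(cv::aruco::DICT_6X6_250);
cv::aruco::detectMarkers(inputImage, dictionary, markerCorners, markerIds, parameters, rejectedCandidates);
```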
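detectMarkers() takes six parameters:
1.	The first parameter is the image in which markers are to be detected.
2.	The second parameter is the dictionary object, here the predefined dictionary DICT_6X6_250.
3.	The third parameter is the list of corners of the detected markers. For each marker, its four corners are returned in their original order (clockwise, starting from the top-left). So the first corner is the top-left corner, followed by the top-right, bottom-right and bottom-left corners.
4.	The fourth parameter is the list of ids of the detected markers. Note that the third and fourth parameters have the same size.
5.	The fifth parameter is an object of type DetectorParameters. It contains all the parameters that can be customized during detection, which are described in detail in Section 2.1.6.
6.	The sixth parameter is the list of rejected candidates, i.e. squares that were found but did not contain a valid coding. Each candidate is also defined by its four corners, in the same format as the third parameter. This parameter can be omitted.

After the markers have been detected and identified, the drawDetectedMarkers() function of the aruco module can be used to draw them on the input image, as shown in Listing 2-4.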
```cpp
// Listing 2-4: Drawing the detected markers
cv::Mat outputImage = inputImage.clone();
cv::aruco::drawDetectedMarkers(outputImage, markerCorners, markerIds);
```
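drawDetectedMarkers() takes three parameters:
1.	The first parameter is the input/output image on which the markers are drawn (normally the same image in which they were detected).
2.	The second parameter is the list of corners of the detected markers.
3.	The third parameter is the list of ids of the detected markers.

The result is shown in Figure 2-6.
**Note**
*This function is only for visualization and can be omitted in a real project.*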

<p align="center">
<img src="https://img-blog.csdnimg.cn/202002231129324.jpg" height="350">
</p>

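Using these two functions, we can write a basic marker detection loop that detects markers directly in the frames captured by a camera, as shown in Listing 2-5.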

```cpp
// Listing 2-5: Detecting markers from a camera
cv::VideoCapture inputVideo;
inputVideo.open(0);
cv::Ptr<cv::aruco::Dictionary> dictionary = cv::aruco::getPredefinedDictionary(cv::aruco::DICT_6X6_250);
// delay between frames in ms; not defined in the original listing, value assumed
const int waitTime = 10;
while (inputVideo.grab()) {
    cv::Mat image, imageCopy;
    inputVideo.retrieve(image);
    image.copyTo(imageCopy);
    std::vector<int> ids;
    std::vector<std::vector<cv::Point2f> > corners;
    cv::aruco::detectMarkers(image, dictionary, corners, ids);
    // if at least one marker detected
    if (ids.size() > 0)
        cv::aruco::drawDetectedMarkers(imageCopy, corners, ids);
    cv::imshow("out", imageCopy);
    char key = (char) cv::waitKey(waitTime);
    if (key == 27)  // ESC quits
        break;
}
```
The complete sample program is detect_markers.cpp in the aruco module folder. It can be run with the arguments in Listing 2-6.

```cpp
// Listing 2-6: Arguments for detect_markers.cpp
-c="_path_/calib.txt" -d=10
```
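## 2.1.4 Pose estimation
After detecting the markers, the next step is to obtain the camera pose from them. Camera pose estimation requires the calibration parameters of the camera, namely the camera matrix and the distortion coefficients. The calibration functions of the main OpenCV library can be used for this; camera calibration is not covered here, and we assume that the reader has already calibrated the camera.

When estimating pose with ArUco markers, the pose of each marker is estimated individually. To estimate a single pose from a set of markers, use ArUco boards instead (see the ArUco boards tutorial).

The camera pose with respect to a marker is the 3D transformation from the marker coordinate system to the camera coordinate system. It is specified by a rotation vector and a translation vector. The aruco module provides cv::aruco::estimatePoseSingleMarkers() to estimate the pose of all detected markers, as shown in Listing 2-7.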

```cpp
// Listing 2-7: Estimating the pose
cv::Mat cameraMatrix, distCoeffs;
...
std::vector<cv::Vec3d> rvecs, tvecs;
cv::aruco::estimatePoseSingleMarkers(corners, 0.05, cameraMatrix, distCoeffs, rvecs, tvecs);
```
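cv::aruco::estimatePoseSingleMarkers() takes six parameters:
1.	The first parameter is the vector of marker corners returned by detectMarkers().
2.	The second parameter is the marker side length, in metres or any other unit. The translation vectors of the estimated poses are expressed in the same unit.
3.	The third and fourth parameters are the camera calibration parameters.
4.	The last two parameters are the output rotation and translation vectors, one pair for each marker in corners.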
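To make the meaning of a rotation vector and a translation vector concrete, the sketch below applies such a pose to a point using only the standard library. It assumes that the rotation vector is in axis-angle form (direction = axis, length = angle in radians), which is the convention handled by cv::Rodrigues; the `Vec3` and `Mat3` aliases and both function names are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cmath>

using Vec3 = std::array<double, 3>;
using Mat3 = std::array<std::array<double, 3>, 3>;

// Rodrigues' formula: convert an axis-angle rotation vector to a 3x3 matrix.
Mat3 rotationMatrix(const Vec3& rvec) {
    const double theta = std::sqrt(rvec[0] * rvec[0] + rvec[1] * rvec[1] + rvec[2] * rvec[2]);
    assert(std::isfinite(theta));
    Mat3 R = {{{1, 0, 0}, {0, 1, 0}, {0, 0, 1}}};
    if (theta < 1e-12) return R;  // no rotation
    const Vec3 k = {rvec[0] / theta, rvec[1] / theta, rvec[2] / theta};
    const double c = std::cos(theta), s = std::sin(theta), v = 1.0 - c;
    R = {{{c + k[0] * k[0] * v,        k[0] * k[1] * v - k[2] * s, k[0] * k[2] * v + k[1] * s},
          {k[1] * k[0] * v + k[2] * s, c + k[1] * k[1] * v,        k[1] * k[2] * v - k[0] * s},
          {k[2] * k[0] * v - k[1] * s, k[2] * k[1] * v + k[0] * s, c + k[2] * k[2] * v}}};
    return R;
}

// Map a point from marker coordinates to camera coordinates: p_cam = R * p + t.
Vec3 markerToCamera(const Vec3& p, const Vec3& rvec, const Vec3& tvec) {
    const Mat3 R = rotationMatrix(rvec);
    Vec3 out{};
    for (int i = 0; i < 3; ++i)
        out[i] = R[i][0] * p[0] + R[i][1] * p[1] + R[i][2] * p[2] + tvec[i];
    return out;
}
```

For instance, a rotation of 90 degrees about the Z axis combined with a translation of 0.5 along Z maps the marker point (1, 0, 0) to approximately (0, 1, 0.5) in camera coordinates.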

The aruco module also provides cv::aruco::drawAxis(), which draws the coordinate axes so that the estimated pose can be checked. Its usage is shown in Listing 2-8.

```cpp
// Listing 2-8: Drawing the coordinate axes
cv::aruco::drawAxis(image, cameraMatrix, distCoeffs, rvec, tvec, 0.1);
```
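cv::aruco::drawAxis() takes six parameters:
1.	The first parameter is the input/output image on which the axes are drawn (normally the same image in which the markers were detected).
2.	The second and third parameters are the camera calibration parameters.
3.	The fourth and fifth parameters are the pose (rvec and tvec) of the object whose axes are drawn.
4.	The sixth parameter is the length of the axes, in the same unit as tvec (usually metres).

The function assumes that the marker coordinate system is centered on the middle of the marker, with the Z axis pointing out of it. The axis colours are X: red, Y: green, Z: blue, as shown in Figure 2-7.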

<p align="center">
 <img src="https://img-blog.csdnimg.cn/20200223114221549.jpg" height="350">
</p>

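**Tip**
***The full version of this detection video is available by replying “坐标轴检测” in the 小白学视觉 WeChat official account.***

Listing 2-9 performs real-time pose estimation from single markers using a camera. A screenshot of the program running is shown in Figure 2-8.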

```cpp
// Listing 2-9: Real-time pose estimation
cv::VideoCapture inputVideo;
inputVideo.open(0);
cv::Mat cameraMatrix, distCoeffs;
// camera parameters are read from somewhere
readCameraParameters(cameraMatrix, distCoeffs);
cv::Ptr<cv::aruco::Dictionary> dictionary = cv::aruco::getPredefinedDictionary(cv::aruco::DICT_6X6_250);
// delay between frames in ms; not defined in the original listing, value assumed
const int waitTime = 10;
while (inputVideo.grab()) {
    cv::Mat image, imageCopy;
    inputVideo.retrieve(image);
    image.copyTo(imageCopy);
    std::vector<int> ids;
    std::vector<std::vector<cv::Point2f>> corners;
    cv::aruco::detectMarkers(image, dictionary, corners, ids);
    // if at least one marker detected
    if (ids.size() > 0) {
        cv::aruco::drawDetectedMarkers(imageCopy, corners, ids);
        std::vector<cv::Vec3d> rvecs, tvecs;
        cv::aruco::estimatePoseSingleMarkers(corners, 0.05, cameraMatrix, distCoeffs, rvecs, tvecs);
        // draw axis for each marker
        for (size_t i = 0; i < ids.size(); i++)
            cv::aruco::drawAxis(imageCopy, cameraMatrix, distCoeffs, rvecs[i], tvecs[i], 0.1);
    }
    cv::imshow("out", imageCopy);
    char key = (char) cv::waitKey(waitTime);
    if (key == 27)  // ESC quits
        break;
}
```
<p align="center">
<img src="https://img-blog.csdnimg.cn/20200223114941709.jpg" height="350">
</p>


The complete program is detect_markers.cpp in the aruco module folder. It can be run with the arguments in Listing 2-10.
```cpp
// Listing 2-10: Arguments for real-time pose estimation
-c="_path_/calib.txt" -d=10
```
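## 2.1.5 Selecting a dictionary
The aruco module provides the Dictionary class to represent a dictionary of markers. Besides the marker size and the number of markers in the dictionary, another important property of a dictionary is its inter-marker distance. The inter-marker distance is the minimum distance among all the markers of the dictionary, and it determines the error detection and error correction capabilities of the dictionary.

In general, a smaller dictionary size and a larger marker size increase the inter-marker distance, and vice versa. However, larger markers require more data to be extracted from the image, which makes them harder to detect, so the usual way to increase the inter-marker distance is to keep the dictionary as small as possible. For example, if only 10 markers are needed, a dictionary composed of exactly 10 markers is better than one composed of 1000 markers, because the 10-marker dictionary has a higher inter-marker distance and is therefore more robust to errors. So that users can pick a suitable dictionary and make their programs more robust, the aruco module offers several ways of selecting a dictionary, described below.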
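The relation between dictionary size and inter-marker distance can be illustrated with a toy example. The sketch below treats each marker as a bit pattern and measures distance as the number of differing bits (Hamming distance). It is a simplification: marker rotations are ignored, and the function names are hypothetical rather than part of the aruco module.

```cpp
#include <algorithm>
#include <bitset>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Number of differing bits between two markers stored as bit patterns.
int hammingDistance(std::uint64_t a, std::uint64_t b) {
    return static_cast<int>(std::bitset<64>(a ^ b).count());
}

// Smallest pairwise Hamming distance in a dictionary (rotations ignored).
int minInterMarkerDistance(const std::vector<std::uint64_t>& markers) {
    assert(markers.size() >= 2);
    int best = std::numeric_limits<int>::max();
    for (std::size_t i = 0; i < markers.size(); ++i)
        for (std::size_t j = i + 1; j < markers.size(); ++j)
            best = std::min(best, hammingDistance(markers[i], markers[j]));
    return best;
}
```

With 4-bit patterns, the two-marker set {0000, 1111} has a minimum distance of 4, while adding 0011 and 0001 to the same set drops the minimum distance to 1, so a single flipped bit could already turn one marker into another.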
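### 1. Predefined dictionaries

Using a predefined dictionary is the simplest way to select a dictionary. The aruco module includes a set of predefined dictionaries covering a range of marker sizes and numbers of markers. A dictionary with the desired size and marker size can be selected as shown in Listing 2-11.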
```cpp
// Listing 2-11: Selecting a predefined dictionary
cv::Ptr<cv::aruco::Dictionary> dictionary = cv::aruco::getPredefinedDictionary(cv::aruco::DICT_6X6_250);
```
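Listing 2-11 creates a dictionary of 250 markers with a marker size of 6×6 bits. Among the predefined dictionaries, the marker size ranges from 4×4 to 7×7, and the dictionary size can be 50, 100, 250 or 1000. Choose the dictionary that fits the application. For example, if both satisfy the requirements, DICT_6X6_250 is a better choice than DICT_6X6_1000, because the smaller the dictionary, the larger the inter-marker distance.
### 2. 	Automatic dictionary generation

When none of the predefined dictionaries is a good fit, the choice can be left to the program: a dictionary can be generated automatically for the desired number of markers and marker size, so that the inter-marker distance is as large as possible. This is done with the call in Listing 2-12.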
```cpp
// Listing 2-12: Generating a dictionary automatically
cv::Ptr<cv::aruco::Dictionary> dictionary = cv::aruco::generateCustomDictionary(36, 5);
```
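The call in Listing 2-12 generates a customized dictionary composed of 36 markers of 5×5 bits. Depending on the parameters, the process can take a few seconds (it is slower for larger dictionaries and larger marker sizes).

### 3. 	Manually defined dictionaries

Finally, a dictionary can be defined manually, which makes it possible to use any coding. To do so, it helps to look first at the definition of the Dictionary class, given in Listing 2-13.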

```cpp
// Listing 2-13: Definition of the Dictionary class
class Dictionary {
    public:
    Mat bytesList;
    int markerSize;
    int maxCorrectionBits;
    ...
};
```
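The members shown in Listing 2-13 are:
-	bytesList is the array that contains all the information about the marker codes.
-	markerSize is the size of each marker (for example, 5 for markers of 5×5 bits).
-	maxCorrectionBits is the maximum number of erroneous bits that can be corrected during marker detection. If this value is too high, it can lead to a large number of false positives.

**Note**
***Unless there is a special requirement, the predefined dictionaries shipped with the module are recommended.***

Each row of bytesList represents one marker of the dictionary. However, the markers are not stored in binary form; they are stored in a special format that simplifies their detection. The Dictionary class provides Dictionary::getByteListFromBits() to convert a marker to this format, as shown in Listing 2-14.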
```cpp
// Listing 2-14: Example of a manually defined dictionary
cv::aruco::Dictionary dictionary;
// markers of 6x6 bits
dictionary.markerSize = 6;
// maximum number of bit corrections
dictionary.maxCorrectionBits = 3;
// lets create a dictionary of 100 markers
for(int i=0; i<100; i++)
{
    // assume generateMarkerBits() generate a new marker in binary 
    //format, so that markerBits is a 6x6 matrix of CV_8UC1 type,
    // only containing 0s and 1s
    cv::Mat markerBits = generateMarkerBits();
    cv::Mat markerCompressed = cv::aruco::Dictionary::getByteListFromBits(markerBits);
    // add the marker as a new row
    dictionary.bytesList.push_back(markerCompressed);
}
```

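## 2.1.6 Detector parameters
Earlier we saw the DetectorParameters object that is passed to detectMarkers(). It contains all the options that can be customized during the marker detection process. This section describes those parameters in detail.

### 1. Thresholding

The first step of the marker detection process is adaptive thresholding of the input image. For example, the thresholded version of the sample image used in this tutorial is shown in the figure below.
Thresholding can be customized with the following parameters:
-	int  adaptiveThreshWinSizeMin
-	int  adaptiveThreshWinSizeMax
-	int  adaptiveThreshWinSizeStep
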
<p align="center">
 <img src="https://img-blog.csdnimg.cn/20200223120037423.jpg" height="350">
</p>
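
The adaptiveThreshWinSizeMin and adaptiveThreshWinSizeMax parameters define the interval from which the thresholding window sizes (in pixels) are selected for adaptive thresholding, and adaptiveThreshWinSizeStep is the increment between consecutive window sizes. For example, with the default values adaptiveThreshWinSizeMin=3, adaptiveThreshWinSizeMax=23 and adaptiveThreshWinSizeStep=10, thresholding is applied three times, with window sizes of 3×3, 13×13 and 23×23.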
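The sequence of window sizes follows directly from the three parameters. A minimal standard-library sketch, with a hypothetical function name, is:

```cpp
#include <cassert>
#include <vector>

// Window sizes tried by adaptive thresholding: start at winSizeMin and add
// winSizeStep until winSizeMax is exceeded.
std::vector<int> thresholdWindowSizes(int winSizeMin, int winSizeMax, int winSizeStep) {
    assert(winSizeMin >= 3 && winSizeMax >= winSizeMin && winSizeStep > 0);
    std::vector<int> sizes;
    for (int w = winSizeMin; w <= winSizeMax; w += winSizeStep)
        sizes.push_back(w);
    return sizes;
}
```

Calling it with the defaults (3, 23, 10) yields the sizes 3, 13 and 23 quoted above. More window sizes mean more thresholding passes per image, so widening the interval or shrinking the step trades speed for robustness.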

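The size of the markers matters as well: after thresholding, candidate contours are filtered by size using the following parameters:

-	double  minMarkerPerimeterRate
-	double  maxMarkerPerimeterRate

These parameters determine the minimum and maximum size of a marker, specifically its minimum and maximum perimeter. They are not given in absolute pixel values; instead they are relative to the largest dimension of the input image. For example, for an image of 640×480 pixels and a minimum relative perimeter of 0.05, the minimum marker perimeter is 640×0.05 = 32 pixels, because 640 is the largest dimension of the image.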
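The conversion from a rate to pixels is a one-line computation. The helper below is a hypothetical standard-library sketch of the arithmetic just described:

```cpp
#include <algorithm>
#include <cassert>

// Convert a perimeter rate into pixels, relative to the larger image dimension.
double perimeterInPixels(int imageWidth, int imageHeight, double rate) {
    assert(imageWidth > 0 && imageHeight > 0 && rate >= 0.0);
    return rate * std::max(imageWidth, imageHeight);
}
```

For a 640×480 image, a rate of 0.05 corresponds to about 32 pixels, and the same result is obtained for a 480×640 portrait image because only the larger dimension is used.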

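If minMarkerPerimeterRate is too low, detection performance can drop considerably, because many more contours have to be considered in the later stages. The effect is less noticeable for maxMarkerPerimeterRate, since there are usually many more small contours than big ones. A minMarkerPerimeterRate of 0 together with a maxMarkerPerimeterRate of 4 (or more) is equivalent to considering all the contours in the image, which is not recommended for performance reasons.

A polygonal approximation is applied to each candidate contour, and only those that approximate a square are kept. The accuracy of the approximation is controlled by:

-	double  polygonalApproxAccuracyRate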

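This value determines the maximum error that the polygonal approximation can produce. It is relative to the perimeter of the candidate. For example, if the candidate has a perimeter of 100 pixels and polygonalApproxAccuracyRate is 0.04, the maximum error is 100x0.04=4 pixels. The default value works well in most cases, but higher values may be needed for highly distorted images. The default value is 0.05.

The remaining parameters in this group are distance thresholds:

-	double  minCornerDistanceRate

The minimum distance between any pair of corners of the same marker. It is expressed relative to the marker perimeter; the minimum distance in pixels is perimeter * minCornerDistanceRate. The default value is 0.05.

-	double  minMarkerDistanceRate

The minimum distance between two different markers. It is expressed relative to the smaller perimeter of the two markers. If two candidates are too close, the smaller one is ignored. The default value is 0.05.

-	int  minDistanceToBorder

The minimum distance, in pixels, from any marker corner to the image border. The default value is 3.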

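### 2.	Bit extraction

After the candidates have been detected, the bits of each candidate are analysed to determine whether it is an ArUco marker. A “bit” is the smallest unit of a marker and represents one binary digit; a 6×6 marker, for example, has 36 bits.

Before the binary code can be analysed, the bits have to be extracted. To extract them accurately, the perspective distortion is first removed, and the resulting image is thresholded with Otsu's method to separate black and white pixels. Figure 2-10 shows the result after removing the distortion and binarizing.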

<p align="center">
 <img src="https://img-blog.csdnimg.cn/20200223120719951.jpg" height="350">
</p>

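The image is then divided into a grid with the same number of cells as there are bits in the marker. In each cell, the black and white pixels are counted to decide the colour of the cell, and therefore whether the bit is 0 or 1. Figure 2-11 shows the resulting grid of cells.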

<p align="center">
 <img src="https://img-blog.csdnimg.cn/202002231208441.jpg" height="350">
</p>

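Several parameters customize this process:

-	int  markerBorderBits

This parameter indicates the width of the marker border, relative to the size of each bit. A value of 2, for example, means that the border is twice as wide as one bit. It must match the border size of the markers being used. The border size is set when the marker is drawn with drawMarker(), so take note of the value used there. The default value is 1.

-	double  minOtsuStdDev

This value determines the minimum standard deviation of the pixel values required to apply Otsu thresholding. If the deviation is low, the whole cell is probably black (or white) and applying Otsu makes no sense. In that case, all the bits are set to 0 (or 1) depending on whether the mean value is lower or higher than 128. The default value is 5.0.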

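-	int  perspectiveRemovePixelPerCell

This parameter is the number of pixels per cell in the image obtained after removing the perspective distortion (border cells included), i.e. the size of the red squares in Figure 2-11. For example, for markers of 5×5 bits with a border of 1 bit, the number of cells per side is 5 + 2*1 = 7 (the border is counted twice), so there are 7×7 cells in total. If perspectiveRemovePixelPerCell is 10, the rectified image is 70×70 pixels. A higher value can improve the bit extraction up to a point, but it costs performance. The default value is 4.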
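The size of the rectified image follows from the marker size, the border width and this parameter. A minimal sketch of the arithmetic, using a hypothetical function name:

```cpp
#include <cassert>

// Side length in pixels of the image obtained after removing the perspective
// distortion: (inner bits + border on both sides) cells, each pixelsPerCell wide.
int rectifiedSidePixels(int markerBits, int borderBits, int pixelsPerCell) {
    assert(markerBits > 0 && borderBits >= 0 && pixelsPerCell > 0);
    return (markerBits + 2 * borderBits) * pixelsPerCell;
}
```

A 5×5 marker with a border of 1 and 10 pixels per cell gives a 70 pixel side, matching the example above. With the default of 4 pixels per cell, a 6×6 marker with a border of 1 is rectified to 32 pixels per side.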

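-	double  perspectiveRemoveIgnoredMarginPerCell

When the bit of each cell is extracted, the numbers of black and white pixels are counted. In general, it is not advisable to consider all the pixels of a cell; it is better to ignore some pixels along the cell margins. The reason is that, after the perspective distortion is removed, the cell colours are usually not perfectly separated, and white cells can invade some pixels of black cells (and vice versa). To avoid counting wrong pixels, it is therefore better to ignore some of them and consider only the pixels inside the green squares, as shown in Figure 2-12. The right-hand image shows that the resulting pixels contain less noise from neighbouring cells. This parameter sets the difference between the red and the green squares.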

<p align="center">
 <img src="https://img-blog.csdnimg.cn/20200223121121631.jpg" height="300">
</p>

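The parameter is the ignored margin outside the green square relative to the total size of the cell. For example, if the cell size is 40 pixels and the parameter is 0.1, a margin of 40*0.1=4 pixels is ignored on each side of the cell. This means that 32x32 pixels are actually analysed in each cell instead of 40x40. The default value is 0.13.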
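The margin arithmetic can be written out as follows. Both helpers are hypothetical, and rounding the margin down to whole pixels is an assumption of this sketch.

```cpp
#include <cassert>

// Pixels ignored along each side of a cell (rounded down to whole pixels).
int ignoredMarginPixels(int cellSize, double marginRate) {
    assert(cellSize > 0 && marginRate >= 0.0 && marginRate < 0.5);
    return static_cast<int>(cellSize * marginRate);
}

// Side length of the region that is actually analysed inside a cell.
int analysedSidePixels(int cellSize, double marginRate) {
    return cellSize - 2 * ignoredMarginPixels(cellSize, marginRate);
}
```

For a 40 pixel cell and a rate of 0.1, the margin is 4 pixels and the analysed region is 32 pixels per side, as in the example above. The margin is subtracted twice because it applies to both sides of the cell.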

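### 3.	Marker identification

After the bits have been extracted, the next step is to check whether the extracted code belongs to the marker dictionary and, if necessary, to perform error correction. The parameters involved are:

-	double  maxErroneousBitsInBorderRate

The cells of the marker border should be black. This parameter specifies the allowed number of erroneous bits in the border, i.e. the maximum number of white cells in the border. It is expressed relative to the total number of cells in the marker. The default value is 0.35.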

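-	double  errorCorrectionRate

Each marker dictionary has a theoretical maximum number of bits that can be corrected (Dictionary.maxCorrectionBits). This value can be modified with the errorCorrectionRate parameter. For example, if the dictionary allows up to 6 bits to be corrected and errorCorrectionRate is 0.5, the actual maximum number of bits that can be corrected is 6*0.5=3. Lowering this value reduces the error correction capability in order to avoid false positives. The default value is 0.6.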
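The scaling can be sketched in a few lines. The function name is hypothetical, and rounding the product down to a whole number of bits is an assumption of this sketch.

```cpp
#include <cassert>

// Number of bits that may actually be corrected, given the dictionary limit
// and the errorCorrectionRate scaling factor (rounded down).
int effectiveCorrectionBits(int maxCorrectionBits, double errorCorrectionRate) {
    assert(maxCorrectionBits >= 0 && errorCorrectionRate >= 0.0);
    return static_cast<int>(maxCorrectionBits * errorCorrectionRate);
}
```

With a dictionary limit of 6 bits, a rate of 0.5 allows 3 corrected bits, and under the rounding assumption the default rate of 0.6 also allows 3.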

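### 4.	Corner refinement

After the markers have been detected and identified, the last step is sub-pixel refinement of the corner positions. This step is optional and only makes sense when the corner positions have to be accurate, for instance for pose estimation or motion measurement. It is usually time consuming and is disabled by default. The parameters involved are:

-	int  cornerRefinementMethod

This parameter determines whether the corner sub-pixel process is performed. It can be left disabled when accurate corners are not needed. The default value is CORNER_REFINE_NONE, which means that no corner refinement is performed.

-	int  cornerRefinementWinSize

This parameter determines the window size of the sub-pixel refinement process. With a large value, the window may include a nearby image corner, so that the marker corner moves to a wrong location during refinement. Large values also increase the computation time. The default value is 5.

-	int  cornerRefinementMaxIterations
-	double  cornerRefinementMinAccuracy

These two parameters determine the stop criteria of the sub-pixel refinement process. cornerRefinementMaxIterations is the maximum number of iterations and cornerRefinementMinAccuracy is the minimum error value before stopping; refinement stops as soon as either condition is met. If the number of iterations is too high, the computation time increases considerably, whereas if it is too low, the sub-pixel refinement can be poor. The default values are 30 and 0.1 respectively.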


================================================
FILE: chapter 2/Aruco模块常见问题.md
================================================
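# 2.6	Aruco module FAQ
Some problems come up frequently when using the aruco module. The most common ones are collected below:
-	I only want to label some objects. What should I use?

In that case single ArUco markers are all you need. Place one or more markers with different ids on each of the objects you want to identify.

-	Which algorithm is used for marker detection?

The aruco module is based on the original ArUco library. A full description of the detection process can be found in the paper “Automatic generation and detection of highly reliable fiducial markers under occlusion”.

-	What should I do if my markers are not being detected correctly?

Many factors can prevent markers from being detected correctly. You may need to adjust some of the parameters of the DetectorParameters object. First check whether detectMarkers() returns your markers as rejected candidates. Depending on the result, try modifying different parameters. If you are using an ArUco board, you can also try refineDetectedMarkers().

-	What are the advantages and disadvantages of ArUco boards?

With an ArUco board, the camera pose is estimated from a set of markers instead of a single marker. This solves the problem of occlusions and partial views of the board, because a single visible marker is enough to obtain the pose. In most cases, a pose estimated from more markers is also more accurate than one estimated from a single marker.
The main drawback is that a board is not as versatile as a single marker.

-	What are the advantages and disadvantages of ChArUco boards compared with ArUco boards?

A ChArUco board combines a chessboard with an ArUco board. As a result, the corners provided by a ChArUco board are more accurate than those provided by an ArUco board (or a single marker).
The main drawback is that a ChArUco board is not as versatile as an ArUco board. A ChArUco board is a planar board with a specific marker layout, while an ArUco board can have any layout, even in 3D. In addition, the markers on a ChArUco board are usually smaller and harder to detect.

-	I do not need camera pose estimation. Should I use a ChArUco board?

No. The main goal of ChArUco boards is to provide highly accurate corners for camera pose estimation and camera calibration.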

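-	Should all the markers of an ArUco board be in the same plane?

No. The markers of an ArUco board can be placed anywhere in its 3D coordinate system.

-	Should all the markers of a ChArUco board be in the same plane?

Yes. All the markers of a ChArUco board must be in the same plane, and their layout is fixed by the chessboard shape.

-	I want to calibrate my camera. Can I use this module?

Yes. The aruco module provides functions to calibrate the camera using either ArUco boards or ChArUco boards.

-	Should I calibrate with a ChArUco board or an ArUco board?

Calibrating with a ChArUco board is strongly recommended, because it is more accurate.

-	Can I use this module to detect markers from other libraries based on binary fiducial markers?

Probably yes, but you will need to convert the dictionary of the original library to the format used by the aruco module.

-	Do I need to store the dictionary information in a file so that different applications can read it?

If you are using one of the predefined dictionaries, this is not necessary. Otherwise, saving it to a file is recommended.

-	Do I need to store the board information in a file so that it can be used in different executions?

If you are using a GridBoard or a ChArUco board, you only need to store the board measurements that are passed to GridBoard::create() or CharucoBoard::create(). If you modify the marker ids of the board manually, or use a different type of board, you should save the board object to a file.

-	Does the aruco module provide functions to save the dictionary or board to a file?

Not yet. However, the data members of the Dictionary and Board classes are public and can easily be stored.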


================================================
FILE: chapter 2/ChArUco角的检测.md
================================================
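ArUco markers and boards are very useful because of their fast detection and their versatility. However, one problem with ArUco markers is that the accuracy of their corner positions is not very high, even after applying sub-pixel refinement. By contrast, the corners of a chessboard pattern can be refined more accurately, since each corner is surrounded by two black squares. However, finding a chessboard pattern is not as versatile as finding an ArUco board: the chessboard has to be completely visible and occlusions are not permitted. A ChArUco board tries to combine the benefits of these two approaches, as shown in Figure 2-18.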

<p align="center">
 <img src="https://img-blog.csdnimg.cn/2020022316374383.jpg" height="250">
</p>

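The ArUco part is used to interpolate the positions of the chessboard corners, so the board has the versatility of marker boards, since it allows occlusions or partial views. Moreover, since the interpolated corners belong to a chessboard, they are very accurate in terms of sub-pixel accuracy. When high precision is needed, such as in camera calibration, ChArUco boards are a better option than standard ArUco boards.

## 2.3.1	Creating a ChArUco board
The aruco module provides the cv::aruco::CharucoBoard class to represent a ChArUco board; it inherits from the Board class. The class is defined in the header <opencv2/aruco/charuco.hpp>, which must be included in order to use it. The class requires the following information:
-	the number of chessboard squares in the X direction
-	the number of chessboard squares in the Y direction
-	the side length of the squares
-	the side length of the markers
-	the dictionary of the markers
-	the ids of the markers

The aruco module provides the static function cv::aruco::CharucoBoard::create() to create CharucoBoard objects. Its usage is shown in Listing 2-23.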
```cpp
// Listing 2-23: Creating a ChArUco board
cv::Ptr<cv::aruco::CharucoBoard> board = cv::aruco::CharucoBoard::create(5, 7, 0.04, 0.02, dictionary);
```
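-	The first and second parameters are the number of squares in the X and Y directions respectively.
-	The third and fourth parameters are the side lengths of the squares and of the markers respectively. They can be given in any unit (metres are typical), bearing in mind that the estimated pose will be expressed in the same unit.
-	The last parameter is the dictionary of the markers.

By default, the id of each marker is assigned in ascending order starting from 0, as in GridBoard::create(). As with the parent Board class, this can easily be customized by accessing the id vector through board->ids. Once we have a CharucoBoard object, we can create an image of it and print it. The image is generated with the CharucoBoard::draw() function, as shown in Listing 2-24.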
```cpp
// Listing 2-24: Creating the board image
cv::Ptr<cv::aruco::CharucoBoard> board = cv::aruco::CharucoBoard::create(5, 7, 0.04, 0.02, dictionary);
cv::Mat boardImage;
board->draw( cv::Size(600, 500), boardImage, 10, 1 );
```
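-	The first parameter is the size of the output image in pixels, 600×500 in this example.
-	boardImage: the output image containing the board;
-	The third parameter (optional) is the margin in pixels; in this example the margin is 10;
-	The last parameter is the size of the marker border, as in the drawMarker() function. The default value is 1.

Running the code in Listing 2-24 produces the image shown in Figure 2-19.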
<p align="center">
 <img src="https://img-blog.csdnimg.cn/20200223164241928.jpg" height="300">
</p>
 
The aruco module ships create_board_charuco.cpp, a complete program for generating ChArUco boards. It is run with arguments such as those in Listing 2-25.
```cpp
// Listing 2-25
"_output path_/chboard.png" -w=5 -h=7 -sl=200 -ml=120 -d=10
```
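## 2.3.2	ChArUco board detection
When a ChArUco board is detected, what is actually detected is each chessboard corner of the board. Every corner on a ChArUco board has a unique identifier (id). The ids start at 0 and run up to one less than the total number of corners on the board. A ChArUco board detection therefore consists of:

-	std::vector<cv::Point2f> charucoCorners: the image positions of the detected corners;
-	std::vector<int> charucoIds: the id of each detected corner in charucoCorners.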
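
The relationship between the board size and the corner ids can be sketched with plain arithmetic. The snippet below is only an illustration: the function names are hypothetical, and the row-major numbering of the corners is an assumption about the board layout, not something stated in this tutorial. For the 5×7 board used throughout this section it gives 24 inner corners, i.e. ids 0 to 23.

```cpp
#include <cassert>
#include <utility>

// Number of inner chessboard corners on a squaresX x squaresY board.
// Each inner corner is shared by four squares, hence the "minus one".
int charucoCornerCount(int squaresX, int squaresY) {
    assert(squaresX > 1 && squaresY > 1);
    return (squaresX - 1) * (squaresY - 1);
}

// Grid position {column, row} of a corner id.
// ASSUMPTION: ids are numbered row by row, left to right.
std::pair<int, int> cornerGridPosition(int id, int squaresX, int squaresY) {
    assert(id >= 0 && id < charucoCornerCount(squaresX, squaresY));
    const int cornersPerRow = squaresX - 1;
    return std::make_pair(id % cornersPerRow, id / cornersPerRow);
}
```

A check such as `id < charucoCornerCount(5, 7)` is a cheap way to validate the contents of charucoIds before using them as indices.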

ChArUco corner detection is based on previously detected markers: the markers are detected first, and the ChArUco corners are then interpolated from them. The aruco module provides cv::aruco::interpolateCornersCharuco() for this, used as in Listing 2-26.
```cpp
// Listing 2-26: detecting ChArUco corners
cv::Mat inputImage;
cv::Mat cameraMatrix, distCoeffs;
// camera parameters are read from somewhere
readCameraParameters(cameraMatrix, distCoeffs);
cv::Ptr<cv::aruco::Dictionary> dictionary = cv::aruco::getPredefinedDictionary(cv::aruco::DICT_6X6_250);
cv::Ptr<cv::aruco::CharucoBoard> board = cv::aruco::CharucoBoard::create(5, 7, 0.04, 0.02, dictionary);
...
std::vector<int> markerIds;
std::vector<std::vector<cv::Point2f>> markerCorners;
// board is a cv::Ptr, so its members are accessed with ->
cv::aruco::detectMarkers(inputImage, board->dictionary, markerCorners, markerIds);
// if at least one marker was detected
if(markerIds.size() > 0) {
    std::vector<cv::Point2f> charucoCorners;
    std::vector<int> charucoIds;
    cv::aruco::interpolateCornersCharuco(markerCorners, markerIds, inputImage, board, charucoCorners, charucoIds, cameraMatrix, distCoeffs);
}
```
The parameters of interpolateCornersCharuco() are:

-	markerCorners and markerIds: the markers detected by detectMarkers();
-	inputImage: the original image in which the markers were detected. The image is needed to perform subpixel refinement of the ChArUco corners;
-	board: the CharucoBoard object;
-	charucoCorners and charucoIds: the output interpolated ChArUco corners;
-	cameraMatrix and distCoeffs: the optional camera calibration parameters;
-	The function returns the number of interpolated ChArUco corners.

In Listing 2-26, interpolateCornersCharuco() is called with the camera calibration parameters. These parameters are optional; Listing 2-27 shows the same detection without them.
```cpp
// Listing 2-27: detecting ChArUco corners without camera calibration parameters
cv::Mat inputImage;
cv::Ptr<cv::aruco::Dictionary> dictionary = cv::aruco::getPredefinedDictionary(cv::aruco::DICT_6X6_250);
cv::Ptr<cv::aruco::CharucoBoard> board = cv::aruco::CharucoBoard::create(5, 7, 0.04, 0.02, dictionary);
...
std::vector<int> markerIds;
std::vector<std::vector<cv::Point2f>> markerCorners;
// create() allocates the parameters; a default-constructed cv::Ptr is null
cv::Ptr<cv::aruco::DetectorParameters> params = cv::aruco::DetectorParameters::create();
params->cornerRefinementMethod = cv::aruco::CORNER_REFINE_NONE;
cv::aruco::detectMarkers(inputImage, board->dictionary, markerCorners, markerIds, params);
// if at least one marker was detected
if(markerIds.size() > 0) {
    std::vector<cv::Point2f> charucoCorners;
    std::vector<int> charucoIds;
    cv::aruco::interpolateCornersCharuco(markerCorners, markerIds, inputImage, board, charucoCorners, charucoIds);
}
```
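If calibration parameters are provided, the ChArUco corners are interpolated by first estimating a rough pose from the ArUco markers and then reprojecting the ChArUco corners back into the image. If calibration parameters are not provided, the ChArUco corners are interpolated by computing the homography between the ChArUco plane and its projection in the image. The main problem of using a homography is that the interpolation is more sensitive to image distortion. For this reason, the homography is computed using only the markers closest to each ChArUco corner, which reduces the effect of distortion.

When detecting the markers of a ChArUco board, and especially when using a homography, it is recommended to disable the corner refinement of the markers. Because the chessboard squares are so close to the markers, the subpixel process can produce large deviations in the marker corner positions, and these deviations propagate to the ChArUco corner interpolation and give poor results.

Furthermore, only those corners whose two surrounding markers have both been found are returned. If either of the two surrounding markers has not been detected, this usually means that there is some occlusion or that the image quality is poor in that area. In either case it is better not to consider that corner, since we want the interpolated ChArUco corners to be very accurate. After the ChArUco corners have been interpolated, a subpixel refinement is performed. Once the corners are interpolated, we will probably want to draw them to check that the detection is correct. OpenCV provides drawDetectedCornersCharuco() for displaying the result, used as in Listing 2-28. Detection results are shown in Figure 2-20 and Figure 2-21.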
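```cpp
// Listing 2-28: drawing the detected corners
cv::aruco::drawDetectedCornersCharuco(image, charucoCorners, charucoIds, color);
```

-	image is the image in which the corners will be drawn (usually the same image in which they were detected); it is also the output image;
-	charucoCorners and charucoIds are the ChArUco corners obtained from interpolateCornersCharuco();
-	The last argument (optional) is the color used to draw the corners, of type cv::Scalar.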
 
<p align="center">
 <img src="https://img-blog.csdnimg.cn/20200223165150456.jpg" height="300">
</p>

Listing 2-29 gives a sample program that detects a ChArUco board without camera calibration parameters.
```cpp
// Listing 2-29: complete ChArUco detection example
cv::VideoCapture inputVideo;
inputVideo.open(0);
cv::Ptr<cv::aruco::Dictionary> dictionary = cv::aruco::getPredefinedDictionary(cv::aruco::DICT_6X6_250);
cv::Ptr<cv::aruco::CharucoBoard> board = cv::aruco::CharucoBoard::create(5, 7, 0.04, 0.02, dictionary);
// create() allocates the parameters; a default-constructed cv::Ptr is null
cv::Ptr<cv::aruco::DetectorParameters> params = cv::aruco::DetectorParameters::create();
params->cornerRefinementMethod = cv::aruco::CORNER_REFINE_NONE;
const int waitTime = 10; // delay between frames in ms (assumed value, not defined in the original listing)
while (inputVideo.grab()) {
    cv::Mat image, imageCopy;
    inputVideo.retrieve(image);
    image.copyTo(imageCopy);
    std::vector<int> ids;
    std::vector<std::vector<cv::Point2f>> corners;
    cv::aruco::detectMarkers(image, dictionary, corners, ids, params);
    // if at least one marker detected
    if (ids.size() > 0) {
        cv::aruco::drawDetectedMarkers(imageCopy, corners, ids);
        std::vector<cv::Point2f> charucoCorners;
        std::vector<int> charucoIds;
        cv::aruco::interpolateCornersCharuco(corners, ids, image, board, charucoCorners, charucoIds);
        // if at least one charuco corner detected
        if(charucoIds.size() > 0)
            cv::aruco::drawDetectedCornersCharuco(imageCopy, charucoCorners, charucoIds, cv::Scalar(255, 0, 0));
    }
    cv::imshow("out", imageCopy);
    char key = (char) cv::waitKey(waitTime);
    if (key == 27) // Esc
        break;
}
```
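The aruco module includes a complete sample program for ChArUco board detection in detect_board_charuco.cpp. It can be run with the arguments in Listing 2-30. The detection result is shown in Figure 2-22.

**Tip:** *The full video can be obtained by replying “ChArUco标记板检测” in the 小白学视觉 WeChat official account.*
```cpp
// Listing 2-30: sample input
-c="_path_/calib.txt" -dp="_path_/detector_params.yml" -w=5 -h=7 -sl=0.04 -ml=0.02 -d=10
```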
<p align="center">
 <img src="https://img-blog.csdnimg.cn/20200223165438347.jpg" height="300">
</p>


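## 2.3.3	ChArUco pose estimation
The final goal of a ChArUco board is to find corners very accurately for high-precision calibration or pose estimation. The aruco module provides estimatePoseCharucoBoard() for ChArUco pose estimation. As with a GridBoard, the coordinate system of a ChArUco board lies in the board plane, with the Z axis pointing out of the board and the origin at the bottom-left corner of the board. Listing 2-31 shows how the function is used.
```cpp
// Listing 2-31: pose estimation function
cv::aruco::estimatePoseCharucoBoard(charucoCorners, charucoIds, board, cameraMatrix, distCoeffs, rvec, tvec);
```
-	charucoCorners and charucoIds are the ChArUco corners obtained from interpolateCornersCharuco();
-	The third argument is the CharucoBoard object;
-	cameraMatrix and distCoeffs are the camera calibration parameters required for pose estimation;
-	rvec and tvec are the output pose of the ChArUco board;
-	The function returns true if the pose was estimated correctly and false otherwise. The main reasons for failure are that there are not enough corners for pose estimation or that the corners all lie on the same line.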
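
The two failure conditions in the last item can be tested before calling the pose function. The following is a minimal sketch, not the library's own check: the `Point2` type and both function names are hypothetical, the points are assumed to be the board-plane coordinates of the detected corners, and the minimum of four corners is an assumption based on the usual requirement of planar pose solvers.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct Point2 { double x, y; };

// True when every point lies on one straight line (within eps).
bool allCollinear(const std::vector<Point2>& pts, double eps = 1e-9) {
    if (pts.size() < 3) return true;
    // Find a second point that differs from the first to define a direction.
    std::size_t j = 1;
    while (j < pts.size() &&
           std::hypot(pts[j].x - pts[0].x, pts[j].y - pts[0].y) <= eps) ++j;
    if (j == pts.size()) return true; // all points coincide
    const double dx = pts[j].x - pts[0].x;
    const double dy = pts[j].y - pts[0].y;
    for (const Point2& p : pts) {
        // The 2D cross product is zero for points on the line through pts[0].
        const double cross = dx * (p.y - pts[0].y) - dy * (p.x - pts[0].x);
        if (std::fabs(cross) > eps) return false;
    }
    return true;
}

// ASSUMPTION: at least 4 non-collinear corners are needed for a pose.
bool enoughCornersForPose(const std::vector<Point2>& boardCorners) {
    return boardCorners.size() >= 4 && !allCollinear(boardCorners);
}
```

With this guard, a frame in which only one row of corners is visible can be skipped instead of being passed to the pose estimator.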

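The axes can be drawn with drawAxis() to check that the pose was estimated correctly. The result is shown in the figure below (X axis: red; Y axis: green; Z axis: blue).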
<p align="center">
 <img src="https://img-blog.csdnimg.cn/20200223170110524.jpg" height="300">
</p>

Listing 2-33 gives a complete pose estimation example. The program is in detect_board_charuco.cpp in the aruco module and can be run with the arguments in Listing 2-32.
```cpp
// Listing 2-32: sample input
-c="_path_/calib.txt" -dp="_path_/detector_params.yml" -w=5 -h=7 -sl=0.04 -ml=0.02 -d=10
```
```cpp
// Listing 2-33: ChArUco detection with pose estimation
cv::VideoCapture inputVideo;
inputVideo.open(0);
cv::Mat cameraMatrix, distCoeffs;
// camera parameters are read from somewhere
readCameraParameters(cameraMatrix, distCoeffs);
cv::Ptr<cv::aruco::Dictionary> dictionary = cv::aruco::getPredefinedDictionary(cv::aruco::DICT_6X6_250);
cv::Ptr<cv::aruco::CharucoBoard> board = cv::aruco::CharucoBoard::create(5, 7, 0.04, 0.02, dictionary);
const int waitTime = 10; // delay between frames in ms (assumed value, not defined in the original listing)
while (inputVideo.grab()) {
    cv::Mat image, imageCopy;
    inputVideo.retrieve(image);
    image.copyTo(imageCopy);
    std::vector<int> ids;
    std::vector<std::vector<cv::Point2f>> corners;
    cv::aruco::detectMarkers(image, dictionary, corners, ids);
    // if at least one marker detected
    if (ids.size() > 0) {
        std::vector<cv::Point2f> charucoCorners;
        std::vector<int> charucoIds;
        cv::aruco::interpolateCornersCharuco(corners, ids, image, board, charucoCorners, charucoIds, cameraMatrix, distCoeffs);
        // if at least one charuco corner detected
        if(charucoIds.size() > 0) {
            cv::aruco::drawDetectedCornersCharuco(imageCopy, charucoCorners, charucoIds, cv::Scalar(255, 0, 0));
            cv::Vec3d rvec, tvec;
            bool valid = cv::aruco::estimatePoseCharucoBoard(charucoCorners, charucoIds, board, cameraMatrix, distCoeffs, rvec, tvec);
            // if charuco pose is valid
            if(valid)
                cv::aruco::drawAxis(imageCopy, cameraMatrix, distCoeffs, rvec, tvec, 0.1);
        }
    }
    cv::imshow("out", imageCopy);
    char key = (char) cv::waitKey(waitTime);
    if (key == 27) // Esc
        break;
}
```


================================================
FILE: chapter 2/使用ArUco和ChArUco进行相机标定.md
================================================
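The ArUco module can also be used to calibrate a camera. Camera calibration consists of obtaining the camera intrinsic parameters and distortion coefficients. These parameters remain fixed unless the camera optics are modified, so calibration only needs to be done once. However, if the camera has not been used for a long time, it is advisable to calibrate it again before use.

Camera calibration is usually performed with the calibrateCamera() function of the core OpenCV library. This function requires correspondences between points in space and their projections in the image. In general, these correspondences are obtained from the corners of a chessboard pattern. For more details, see the calibrateCamera() documentation or the OpenCV calibration tutorial. The book 《OpenCV 4计算机视觉编程实战》 also describes the calibration algorithm and workflow in detail.

With the ArUco module, calibration can be based on ArUco marker corners or on ChArUco corners. Calibrating with ArUco is more versatile than using a traditional chessboard pattern, because the camera can still be calibrated when part of the calibration board is occluded.

Calibration can be done with either marker corners or ChArUco corners. The ChArUco approach is strongly recommended, however, because the corners it provides are more accurate. An ArUco board should only be used when some restriction makes it impossible to use a ChArUco board.

## 2.5.1	Calibration with ChArUco boards
To calibrate with ChArUco, images of the board must be captured from different viewpoints, just as the traditional method does with a chessboard. The benefit of ChArUco is that calibration still works when the board is partially occluded, as long as some corners can still be detected. Figure 2-28 shows images of a ChArUco board captured during calibration.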
 <center>
<img src="https://img-blog.csdnimg.cn/20200224113108960.png">
</center>

The function for calibrating with a ChArUco board is calibrateCameraCharuco(). Its use is shown in Listing 2-40.

```cpp
// Listing 2-40: calibration with a ChArUco board
cv::Ptr<cv::aruco::CharucoBoard> board = ... // create charuco board
cv::Size imgSize = ... // camera image size
std::vector<std::vector<cv::Point2f>> allCharucoCorners;
std::vector<std::vector<int>> allCharucoIds;
// Detect charuco board from several viewpoints and fill allCharucoCorners and allCharucoIds
...
...
// After capturing in several viewpoints, start calibration
cv::Mat cameraMatrix, distCoeffs;
std::vector<cv::Mat> rvecs, tvecs;
int calibrationFlags = ... // Set calibration flags (same as in the calibrateCamera() function)
double repError = cv::aruco::calibrateCameraCharuco(allCharucoCorners, allCharucoIds, board, imgSize, cameraMatrix, distCoeffs, rvecs, tvecs, calibrationFlags);
```
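The ChArUco corners and ChArUco identifiers captured at each viewpoint are stored in the vectors allCharucoCorners and allCharucoIds, one element per viewpoint.

calibrateCameraCharuco() stores the camera parameters in cameraMatrix and distCoeffs. The return value of the function is the reprojection error of the calibration. rvecs and tvecs hold the camera pose (relative to the ChArUco board) for each image. Finally, calibrationFlags sets some options for the calibration; its format is the same as the flags parameter of the calibrateCamera() function.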
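
To make the returned value easier to interpret, here is a small sketch of a root-mean-square reprojection error. It assumes the usual definition (RMS distance in pixels between each observed corner and the projection of the corresponding board point); the `Point2` type and the function name are hypothetical, and the exact normalization used inside the library is not confirmed by this tutorial.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct Point2 { double x, y; };

// RMS distance between detected corners and their reprojections, in pixels.
double rmsReprojectionError(const std::vector<Point2>& observed,
                            const std::vector<Point2>& projected) {
    assert(observed.size() == projected.size() && !observed.empty());
    double sumSquared = 0.0;
    for (std::size_t i = 0; i < observed.size(); ++i) {
        const double dx = observed[i].x - projected[i].x;
        const double dy = observed[i].y - projected[i].y;
        sumSquared += dx * dx + dy * dy;
    }
    return std::sqrt(sumSquared / static_cast<double>(observed.size()));
}
```

A lower value means the estimated camera model explains the detected corners better; comparing the value across runs with different image sets is a simple way to spot a bad capture.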

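A complete working example is included in calibrate_camera_charuco.cpp in the module's samples folder, which can be found in the contrib module sources. The arguments needed to run it are given in Listing 2-41.

```cpp
// Listing 2-41: arguments
"_output path_" -dp="_path_/detector_params.yml" -w=5 -h=7 -sl=0.04 -ml=0.02 -d=10
```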

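## 2.5.2	Calibration with ArUco boards
As mentioned above, ChArUco corners are more accurate than marker corners, so calibrating the camera with a ChArUco board is recommended. In some special cases, however, an ArUco board has to be used. In that case, use the calibrateCameraAruco() function, which likewise requires images of the ArUco board taken from different viewpoints. Figure 2-29 shows images of an ArUco board captured during calibration.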

 <center>
<img src="https://img-blog.csdnimg.cn/20200224113411996.png">
</center>

The use of calibrateCameraAruco() is shown in Listing 2-42.
```cpp
// Listing 2-42: calibration with an ArUco board
cv::Ptr<cv::aruco::Board> board = ... // create aruco board
cv::Size imgSize = ... // camera image size
std::vector<std::vector<cv::Point2f>> allCornersConcatenated;
std::vector<int> allIdsConcatenated;
std::vector<int> markerCounterPerFrame;
// Detect aruco board from several viewpoints and fill allCornersConcatenated, allIdsConcatenated and markerCounterPerFrame
...
...
// After capturing in several viewpoints, start calibration
cv::Mat cameraMatrix, distCoeffs;
std::vector<cv::Mat> rvecs, tvecs;
int calibrationFlags = ... // Set calibration flags (same as in the calibrateCamera() function)
double repError = cv::aruco::calibrateCameraAruco(allCornersConcatenated, allIdsConcatenated, markerCounterPerFrame, board, imgSize, cameraMatrix, distCoeffs, rvecs, tvecs, calibrationFlags);
```

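In contrast to calibrateCameraCharuco(), the markers detected in each image are stored concatenated in the arrays allCornersConcatenated and allIdsConcatenated (the first two arguments of the function). The third argument, the array markerCounterPerFrame, gives the number of markers detected in each image.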
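
The concatenated layout can be built from per-frame detections as sketched below. This is an illustration using only the standard library: `Point2f`, `FrameDetections`, `ConcatenatedDetections` and `concatenate` are hypothetical stand-ins, and only the three output array names come from the listing above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Point2f { float x, y; };               // stand-in for cv::Point2f
using MarkerCorners = std::vector<Point2f>;   // four corners per marker

// Detections of a single image, as returned by a marker detector.
struct FrameDetections {
    std::vector<MarkerCorners> corners;
    std::vector<int> ids;
};

struct ConcatenatedDetections {
    std::vector<MarkerCorners> allCornersConcatenated;
    std::vector<int> allIdsConcatenated;
    std::vector<int> markerCounterPerFrame;
};

// Append the markers of every frame one after another and remember
// how many markers each frame contributed.
ConcatenatedDetections concatenate(const std::vector<FrameDetections>& frames) {
    ConcatenatedDetections out;
    for (const FrameDetections& frame : frames) {
        assert(frame.corners.size() == frame.ids.size());
        out.markerCounterPerFrame.push_back(static_cast<int>(frame.ids.size()));
        for (std::size_t i = 0; i < frame.ids.size(); ++i) {
            out.allCornersConcatenated.push_back(frame.corners[i]);
            out.allIdsConcatenated.push_back(frame.ids[i]);
        }
    }
    return out;
}
```

The per-frame counter is what lets the calibration split the flat arrays back into viewpoints, so the sum of its entries must equal the length of the two concatenated arrays.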

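A complete working example is included in calibrate_camera.cpp in the module's samples folder, which can be found in the contrib module sources. The arguments needed to run it are given in Listing 2-43.

```cpp
// Listing 2-43: arguments
"_path_/calib.txt" -w=5 -h=7 -l=100 -s=10 -d=10
```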


================================================
FILE: chapter 2/菱形标记检测.md
================================================
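A ChArUco diamond marker (or simply diamond marker) is a chessboard of 3×3 squares with 4 ArUco markers inside the white squares, as shown in Figure 2-24. It looks similar to a ChArUco board, but the two are conceptually different.  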
<p align="center">
<img src="https://img-blog.csdnimg.cn/20200224110740699.png" height="300">
</p>

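For both ChArUco boards and diamond markers, detection is based on previously detected ArUco markers. In the ChArUco case, the markers to use are selected by looking directly at their identifiers. This means that if a marker that is part of the board is found in an image, it is automatically assumed to belong to the board. Furthermore, if several boards are visible in the same image, this produces ambiguity, because the system cannot know which board a marker should be used for.

The detection of diamond markers, in contrast, is not based on identifiers. Instead, it is based on the relative position of the markers. As a consequence, identifiers can be repeated within one diamond or across different diamonds, and the diamonds can be detected simultaneously without ambiguity. However, because finding markers by their relative position is complex, the size of a diamond marker is limited to 3×3 squares and 4 markers.

As with a single ArUco marker, each diamond marker consists of 4 corners and an identifier. The four corners correspond to the 4 chessboard corners in the marker, and the identifier is actually an array of 4 numbers, namely the identifiers of the four ArUco markers inside the diamond.

Diamond markers are useful when repeated markers need to be allowed in an image. For example, diamond markers can be used to increase the number of identifiers available with single markers: they allow up to N^4 different ids, where N is the number of markers in the dictionary used. As another example, one of the four marker ids can be used to indicate the scale of the marker (i.e. the size of the square), so that the same diamond can be found at different sizes. In addition, since its corners are chessboard corners, a diamond can be used for accurate pose estimation. The diamond detection functions are declared in <opencv2/aruco/charuco.hpp>, and OpenCV provides a sample program for detecting diamond markers in detect_diamonds.cpp in the samples folder of the aruco module.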
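
Because a diamond identifier is an ordered group of four marker ids, and each of the four can be any marker of the dictionary, the number of distinct diamond ids is the dictionary size raised to the fourth power. A short sketch of that count follows; the function name is hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Number of distinct diamond ids for a dictionary with dictionarySize markers:
// four independent choices, so dictionarySize^4.
std::uint64_t diamondIdCount(std::uint64_t dictionarySize) {
    // 64-bit arithmetic: a 250-marker dictionary already exceeds the 32-bit range.
    return dictionarySize * dictionarySize * dictionarySize * dictionarySize;
}
```

For a 250-marker dictionary such as DICT_6X6_250, `diamondIdCount(250)` gives 3,906,250,000 possible diamond ids.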

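## 2.4.1	Creating a ChArUco diamond
A diamond marker image can be created easily with the drawCharucoDiamond() function, as shown in Listing 2-34.
```cpp
// Listing 2-34
cv::Mat diamondImage;
cv::Ptr<cv::aruco::Dictionary> dictionary = cv::aruco::getPredefinedDictionary(cv::aruco::DICT_6X6_250);
// the output goes into the Mat declared above
cv::aruco::drawCharucoDiamond(dictionary, cv::Vec4i(45,68,28,74), 200, 120, diamondImage);
```
The code in Listing 2-34 creates a diamond marker image with a square size of 200 pixels and a marker size of 120 pixels. The marker ids are given in the second argument as a Vec4i object. The order of the marker ids in the diamond layout is the same as in a standard ChArUco board, i.e. top, left, right and bottom. The resulting image is shown in Figure 2-25.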
<p align="center">
<img src="https://img-blog.csdnimg.cn/20200224111303855.png" height="300">
</p>
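create_diamond.cpp in the module's samples folder contains a complete program for generating diamond markers, which can also be used directly. It is run with arguments such as those in Listing 2-35.

```cpp
// Listing 2-35
"_path_/mydiamond.png" -sl=200 -ml=120 -d=10 -ids=45,68,28,74
```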
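## 2.4.2	ChArUco diamond detection
In most cases, detecting diamond markers requires the ArUco markers to be detected first. After the markers have been detected, the diamonds are detected with the detectCharucoDiamond() function, as shown in Listing 2-36.

```cpp
// Listing 2-36
cv::Mat inputImage;
float squareLength = 0.40;
float markerLength = 0.25;
...
std::vector<int> markerIds;
std::vector<std::vector< cv::Point2f>> markerCorners;
// detect ArUco markers
cv::aruco::detectMarkers(inputImage, dictionary, markerCorners, markerIds);
std::vector<cv::Vec4i> diamondIds;
std::vector<std::vector<cv::Point2f>> diamondCorners;
// detect diamonds
cv::aruco::detectCharucoDiamond(inputImage, markerCorners, markerIds, squareLength / markerLength, diamondCorners, diamondIds);
```
detectCharucoDiamond() takes the original image, the detected marker corners and the corresponding marker ids. The input image is needed to perform subpixel refinement of the ChArUco corners. The function also takes the ratio between the square size and the marker size, which is required both to detect diamonds from the relative positions of the markers and to interpolate the ChArUco corners. The fifth argument, diamondCorners, is an output array containing the four corners of every detected diamond. Its format is similar to that of the corners returned by detectMarkers(): for each diamond, the corners are given in the same order as for ArUco markers, i.e. clockwise starting from the top-left corner. The sixth argument, diamondIds, returns the ids of all detected diamonds. Each id is actually an array of 4 integers that can be represented as a Vec4i. The detected diamonds can be visualized with drawDetectedDiamonds(), which only takes the image and the diamond corners and ids. The result is shown in Figure 2-26.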

<p align="center">
<img src="https://img-blog.csdnimg.cn/20200224111709324.png" height="300">
</p>

OpenCV provides a complete diamond detection program in detect_diamonds.cpp in the module's samples folder. It is run as shown in Listing 2-37.

```cpp
// Listing 2-37
-c="_path_/calib.txt" -dp="_path_/detector_params.yml" -sl=0.04 -ml=0.02 -d=10
```
## 2.4.3	ChArUco diamond pose estimation
Since a ChArUco diamond is represented by its four corners, its pose can be estimated in the same way as that of a single ArUco marker, i.e. with the estimatePoseSingleMarkers() function. Listing 2-38 gives an example.
```cpp
// Listing 2-38
std::vector<cv::Vec4i> diamondIds;
std::vector<std::vector<cv::Point2f>> diamondCorners;
// detect diamonds
cv::aruco::detectCharucoDiamond(inputImage, markerCorners, markerIds, squareLength / markerLength, diamondCorners, diamondIds);
// estimate poses
std::vector<cv::Vec3d> rvecs, tvecs;
cv::aruco::estimatePoseSingleMarkers(diamondCorners, squareLength, camMatrix, distCoeffs, rvecs, tvecs);
// draw axis
for(unsigned int i=0; i<rvecs.size(); i++)
    cv::aruco::drawAxis(inputImage, camMatrix, distCoeffs, rvecs[i], tvecs[i], axisLength);
```
The function obtains the rotation and translation vector of each diamond marker and stores them in rvecs and tvecs. Note that the diamond corners are chessboard square corners, so the square length, not the marker length, must be provided for pose estimation. The camera calibration parameters are also required. Finally, the axes can be drawn with drawAxis() to check that the estimated pose is correct, as shown in Figure 2-27.

<p align="center">
<img src="https://img-blog.csdnimg.cn/20200224112203454.png" height="300">
</p>
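The coordinate system of the diamond pose is located at the center of the marker with the Z axis pointing out, as in simple ArUco marker pose estimation. OpenCV provides the complete program in detect_diamonds.cpp in the module's samples folder. It can be run with the arguments in Listing 2-39.

```cpp
// Listing 2-39
-c="_output path_/calib.txt" -dp="_path_/detector_params.yml" -sl=0.04 -ml=0.02 -d=10
```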


================================================
FILE: chapter 20/用于快速边缘检测的结构化森.md
================================================
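# 20.2 Structured forests for fast edge detection

In this tutorial we learn how to use structured forests for image edge detection. The details of the algorithm are described in the following two papers:

- Structured forests for fast edge detection
- Sketch tokens: A learned mid-level representation for contour and object detection

## 20.2.1 Example images

All images used in this example, together with the processing results, are shown below. They can also be obtained by replying “**快速边缘检测**” in the 小白学视觉 WeChat official account.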

<img src="./img/20-5.png">

<img src="./img/20-6.png">

<img src="./img/20-7.png">

<img src="./img/20-8.png">

> Note:
>
> Binarization techniques such as Canny edge detection can be applied to the edges produced by both algorithms (Sobel and StructuredEdgeDetection::detectEdges).

## 20.2.2 C++ code

```
/**************************************************************************************
 The structured edge demo requires you to provide a model.
 This model can be found at the opencv_extra repository on Github on the following link:
 https://github.com/opencv/opencv_extra/blob/master/testdata/cv/ximgproc/model.yml.gz
 ***************************************************************************************/
 
 #include <opencv2/ximgproc.hpp>
 #include "opencv2/highgui.hpp"
 #include "opencv2/core/utility.hpp"
 #include <iostream>
 
 using namespace cv;
 using namespace cv::ximgproc;
 
 const char* keys =
 {
     "{i || input image name}"
     "{m || model name}"
     "{o || output image name}"
 };
 
 int main( int argc, const char** argv )
 {
     bool printHelp = ( argc == 1 );
     printHelp = printHelp || ( argc == 2 && String(argv[1]) == "--help" );
     printHelp = printHelp || ( argc == 2 && String(argv[1]) == "-h" );
 
     if ( printHelp )
     {
         std::cout << "\nThis sample demonstrates structured forests for fast edge detection\n"
                "Call:\n"
                "    structured_edge_detection -i=in_image_name -m=model_name [-o=out_image_name]\n\n";
         return 0;
     }
 
     CommandLineParser parser(argc, argv, keys);
     if ( !parser.check() )
     {
         parser.printErrors();
         return -1;
     }
 
     String modelFilename = parser.get<String>("m");
     String inFilename = parser.get<String>("i");
     String outFilename = parser.get<String>("o");
 
     Mat image = imread(inFilename, 1);
     if ( image.empty() )
         CV_Error(Error::StsError, String("Cannot read image file: ") + inFilename);
 
     if ( modelFilename.size() == 0)
         CV_Error(Error::StsError, String("Empty model name"));
 
     image.convertTo(image, DataType<float>::type, 1/255.0);
 
     Mat edges(image.size(), image.type());
 
     Ptr<StructuredEdgeDetection> pDollar =
         createStructuredEdgeDetection(modelFilename);
     pDollar->detectEdges(image, edges);
 
     // computes orientation from edge map
     Mat orientation_map;
     pDollar->computeOrientation(edges, orientation_map);
 
     // suppress edges
     Mat edge_nms;
     pDollar->edgesNms(edges, orientation_map, edge_nms, 2, 0, 1, true);
 
     if ( outFilename.size() == 0 )
     {
         namedWindow("edges", 1);
         imshow("edges", edges);
         namedWindow("edges nms", 1);
         imshow("edges nms", edge_nms);
         waitKey(0);
     }
     else
         imwrite(outFilename, 255*edges);
 
     return 0;
 }
```

## 20.2.3 Explanation

1. Load the source color image

Listing 20-7: loading the image

```
cv::Mat image = cv::imread(inFilename, 1);
if ( image.empty() )
{
    printf("Cannot read image file: %s\n", inFilename.c_str());
    return -1;
}
```

2. Convert the source image to the [0,1] range

Listing 20-8: changing the image data type

```
image.convertTo(image, cv::DataType<float>::type, 1/255.0);
```

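3. Run the main algorithm

Listing 20-9: running the algorithm

```
cv::Mat edges(image.size(), image.type());
// StructuredEdgeDetection lives in the cv::ximgproc namespace
cv::Ptr<cv::ximgproc::StructuredEdgeDetection> pDollar =
    cv::ximgproc::createStructuredEdgeDetection(modelFilename);
pDollar->detectEdges(image, edges);
```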

4. Show the results

Listing 20-10: showing the results

```
if ( outFilename == "" )
{
    cv::namedWindow("edges", 1);
    cv::imshow("edges", edges);
    cv::waitKey(0);
}
else
    cv::imwrite(outFilename, 255*edges);
```


================================================
FILE: chapter 20/视差图滤波.md
================================================
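# 20.1 Disparity map filtering

## 20.1.1 Introduction

Stereo matching algorithms, especially highly optimized ones intended for real-time processing on a CPU, tend to make quite a few errors on challenging sequences. These errors are usually concentrated in uniform texture-less areas, half-occlusions and regions near depth discontinuities. One way of dealing with stereo matching errors is to use various techniques to detect potentially inaccurate disparity values and invalidate them, which makes the disparity map semi-sparse. Several such techniques are already implemented in StereoBM and StereoSGBM. Another way is to use some kind of filtering procedure to align the edges of the disparity map with those of the source image and to propagate the disparity values from high-confidence regions into low-confidence regions such as half-occlusions. Recent advances in edge-aware filtering make it possible to perform this kind of post-filtering under the constraints of real-time processing on a CPU.

In this tutorial we learn how to filter a disparity map to improve the results of the StereoBM and StereoSGBM algorithms. The two images used in this tutorial are shown in Figure 20-1 and Figure 20-2.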

<img src="./img/20-1.png">

<img src="./img/20-2.png">

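## 20.1.2 C++ code

We use code snippets from the sample application for the explanation. The complete code can be [downloaded](https://github.com/opencv/opencv_contrib/blob/master/modules/ximgproc/samples/disparity_filtering.cpp) from the opencv_contrib repository on GitHub.

> Tip
>
> The complete program for this tutorial can be obtained by replying “视差图滤波” in the 小白学视觉 WeChat official account.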

  
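## 20.1.3 Explanation

The provided sample has several options that yield different trade-offs between the speed and the quality of the resulting disparity map. Both speed and quality are measured if the user provides a ground-truth disparity map. In this tutorial we take a detailed look at the default pipeline, which is designed to provide the best possible quality under the constraints of real-time processing on a CPU.

**Listing 20-1: loading the left and right views**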

```
    Mat left  = imread(left_im ,IMREAD_COLOR);
    if ( left.empty() )
    {
        cout<<"Cannot read image file: "<<left_im;
        return -1;
    }
    Mat right = imread(right_im,IMREAD_COLOR);
    if ( right.empty() )
    {
        cout<<"Cannot read image file: "<<right_im;
        return -1;
    }
```

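We start by loading the source stereo pair. For this tutorial we take a rather challenging example from the MPI-Sintel dataset with many texture-less regions.

## 20.1.4 Preparing the views for matching

**Listing 20-2**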

```
            max_disp/=2;
            if(max_disp%16!=0)
                max_disp += 16-(max_disp%16);
            resize(left ,left_for_matcher ,Size(),0.5,0.5, INTER_LINEAR_EXACT);
            resize(right,right_for_matcher,Size(),0.5,0.5, INTER_LINEAR_EXACT);
```

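We downscale the views to speed up the matching stage at the cost of a slight loss in quality. To get the best possible quality, the downscaling should be avoided.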
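
The disparity arithmetic in the listing above can be isolated into a small helper. This is a sketch with a hypothetical function name; it mirrors the two steps of the listing: the maximum disparity is halved because the views are half-size, and the result is then rounded up to a multiple of 16, since the block matcher expects the number of disparities to be divisible by 16.

```cpp
#include <cassert>

// Maximum disparity to pass to the matcher after the views were halved.
int matcherMaxDisparity(int maxDisp) {
    assert(maxDisp > 0);
    maxDisp /= 2;                          // views are downscaled by 0.5
    if (maxDisp % 16 != 0)
        maxDisp += 16 - (maxDisp % 16);    // round up to a multiple of 16
    return maxDisp;
}
```

For example, a full-resolution range of 160 becomes 80, while 100 becomes 50 and is then rounded up to 64.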

## 20.1.5 Performing matching and creating the filter instance

```
            Ptr<StereoBM> left_matcher = StereoBM::create(max_disp,wsize);
            wls_filter = createDisparityWLSFilter(left_matcher);
            Ptr<StereoMatcher> right_matcher = createRightMatcher(left_matcher);
            cvtColor(left_for_matcher,  left_for_matcher,  COLOR_BGR2GRAY);
            cvtColor(right_for_matcher, right_for_matcher, COLOR_BGR2GRAY);
            matching_time = (double)getTickCount();
            left_matcher-> compute(left_for_matcher, right_for_matcher,left_disp);
            right_matcher->compute(right_for_matcher,left_for_matcher, right_disp);
            matching_time = ((double)getTickCount()-matching_time)/getTickFrequency();
```

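We use StereoBM for faster processing. If speed is not critical, StereoSGBM provides better quality. The filter instance is created from the StereoMatcher instance that we intend to use. createRightMatcher() returns another matcher instance. The two matcher instances are then used to compute the disparity maps for the left and the right view.

## 20.1.6 Performing filtering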

```
        wls_filter->setLambda(lambda);
        wls_filter->setSigmaColor(sigma);
        filtering_time = (double)getTickCount();
        wls_filter->filter(left_disp,left,filtered_disp,right_disp);
        filtering_time = ((double)getTickCount() - filtering_time)/getTickFrequency();
```

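The disparity maps computed by the matcher instances, together with the source left view, are passed to the filter. Note that the original, non-downscaled view is used for filtering. The disparity map is automatically upscaled in an edge-aware fashion to match the resolution of the original view. The result is stored in filtered_disp.

## 20.1.7 Visualizing the disparity maps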

```
        Mat raw_disp_vis;
        getDisparityVis(left_disp,raw_disp_vis,vis_mult);
        namedWindow("raw disparity", WINDOW_AUTOSIZE);
        imshow("raw disparity", raw_disp_vis);
        Mat filtered_disp_vis;
        getDisparityVis(filtered_disp,filtered_disp_vis,vis_mult);
        namedWindow("filtered disparity", WINDOW_AUTOSIZE);
        imshow("filtered disparity", filtered_disp_vis);
        if(!solved_disp.empty())
        {
            Mat solved_disp_vis;
            getDisparityVis(solved_disp,solved_disp_vis,vis_mult);
            namedWindow("solved disparity", WINDOW_AUTOSIZE);
            imshow("solved disparity", solved_disp_vis);
            Mat solved_filtered_disp_vis;
            getDisparityVis(solved_filtered_disp,solved_filtered_disp_vis,vis_mult);
            namedWindow("solved wls disparity", WINDOW_AUTOSIZE);
            imshow("solved wls disparity", solved_filtered_disp_vis);
        }
        while(1)
        {
            char key = (char)waitKey();
            if( key == 27 || key == 'q' || key == 'Q') // 'ESC'
                break;
        }
```

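For convenience we use the getDisparityVis() function to visualize the disparity maps. Its scale argument (vis_mult in the listing above) controls the contrast: all disparity values are multiplied by this value in the visualization.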
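
The effect of the scale factor on a single value can be sketched as follows. This is an assumption-laden illustration rather than the library code: it assumes the disparity map uses the 16-bit fixed-point format with 4 fractional bits produced by the block matchers, it ignores the handling of invalid pixels, and the function name is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>

// Map one fixed-point disparity value to an 8-bit gray level.
// ASSUMPTION: 4 fractional bits, i.e. the stored value is disparity * 16.
std::uint8_t disparityToGray(std::int16_t fixedPointDisparity, double scale) {
    const double disparityInPixels = fixedPointDisparity / 16.0;
    const double scaled = std::round(disparityInPixels * scale);
    // Saturate to the displayable 8-bit range.
    return static_cast<std::uint8_t>(std::min(255.0, std::max(0.0, scaled)));
}
```

A larger scale stretches small disparities over more gray levels, at the price of saturating large disparities at white.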

## 20.1.8 Results

<img src="./img/20-3.png">

<img src="./img/20-4.png">

================================================
FILE: chapter 20/训练结构化森林.md
================================================
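# 20.3 Training structured forests

In this tutorial we show how to train our own structured forest using Matlab files.

## 20.3.1 Training pipeline

1. Download Piotr's Toolbox and put it in a separate directory, e.g. PToolbox.

Download link: http://pdollar.github.io/toolbox/index.html

2. Download the BSDS500 dataset and put it in a separate directory named BSR. Download link: https://www2.eecs.berkeley.edu/Research/Projects/CS/vision/grouping/BSR/

3.  Add both directories and their subdirectories to the Matlab path.

4. Download the detector code from http://research.microsoft.com/en-us/downloads/389109f6-b4e8-404c-84bf-239f7cbf4e3d/

   Put it in the root directory. You should now have the following files: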

   ```
       BSR
       PToolbox
       models
       private
       Contents.m
       edgesChns.m
       edgesDemo.m
       edgesDemoRgbd.m
       edgesDetect.m
       edgesEval.m
       edgesEvalDir.m
       edgesEvalImg.m
       edgesEvalPlot.m
       edgesSweeps.m
       edgesTrain.m
       license.txt
       readme.txt
   ```

5. Rename models/forest/modelFinal.mat to models/forest/modelFinal.mat.backup.

6. Open edgesChns.m, comment out lines 26-41, and add the code from Listing 20-11 after the commented lines:

   Listing 20-11

   ```
   shrink=opts.shrink;
   chns = single(getFeatures( im2double(I) ));
   ```

7. Then compile getFeatures. Listing 20-12 gives the implementation.

```
#include <cv.h>
#include <highgui.h>
#include <mat.h>
#include <mex.h>
#include "MxArray.hpp" // https://github.com/kyamagu/mexopencv
class NewRFFeatureGetter : public cv::RFFeatureGetter
{
public:
    NewRFFeatureGetter() : name("NewRFFeatureGetter"){}
    virtual void getFeatures(const cv::Mat &src, NChannelsMat &features,
                             const int gnrmRad, const int gsmthRad,
                             const int shrink, const int outNum, const int gradNum) const
    {
        // here your feature extraction code, the default one is:
        // resulting features Mat should be n-channels, floating point matrix
    }
protected:
    cv::String name;
};
MEXFUNCTION_LINKAGE void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])
{
    if (nlhs != 1) mexErrMsgTxt("nlhs != 1");
    if (nrhs != 1) mexErrMsgTxt("nrhs != 1");
    cv::Mat src = MxArray(prhs[0]).toMat();
    src.convertTo(src, cv::DataType<float>::type);
    std::string modelFile = MxArray(prhs[1]).toString();
    NewRFFeatureGetter *pDollar = createNewRFFeatureGetter();
    cv::Mat edges;
    pDollar->getFeatures(src, edges, 4, 0, 2, 13, 4);
    // you can use other numbers here
    edges.convertTo(edges, cv::DataType<double>::type);
    plhs[0] = MxArray(edges);
}
```

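8. Put the compiled mex file in the root directory and run edgesDemo. After a few hours of training, the new model will appear in the models/forest/ folder.
9. The final step is to convert the trained model from the Matlab binary format to YAML so that it can be used with cv::ximgproc::StructuredEdgeDetection. To do so, run opencv_contrib/ximgproc/tutorials/scripts/modelConvert(model, "model.yml").

## 20.3.2 How to use your model

To use your own model, pass an instance of the NewRFFeatureGetter class defined above as the second argument when creating the detector.

Listing 20-13

```
// createStructuredEdgeDetection() returns a cv::Ptr
cv::Ptr<cv::ximgproc::StructuredEdgeDetection> pDollar
    = cv::ximgproc::createStructuredEdgeDetection( modelName, cv::makePtr<NewRFFeatureGetter>() );
```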


================================================
FILE: chapter 21/图像修复.md
================================================
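# 21.1 Image inpainting

In this tutorial we learn how to inpaint an image using fast Frequency Selective Reconstruction (FSR).

## 21.1.1 Basics

Image inpainting is the process of reconstructing damaged or missing parts of an image. It is achieved by replacing distorted pixels with values similar to those of the neighboring pixels. There are several inpainting algorithms; this tutorial uses fast Frequency Selective Reconstruction (FSR). FSR reconstructs the image signal by exploiting the property that small areas of an image can be represented sparsely in the Fourier domain. Readers interested in the algorithm can consult the following two papers: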

- Signal and loss geometry aware frequency selective extrapolation for error concealment
- Resampling images to a regular grid from a non-regular subset of pixel positions using
  frequency selective reconstruction

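FSR can be applied in the following areas:

- Inpainting: the sampling mask marks the missing pixels of the distorted input image that are to be reconstructed.
-  Non-regular sampling: here the quality of the result depends on how the sampling mask is chosen; see the literature for guidance on choosing a good sampling mask.

## 21.1.2 Example

Listing 21-1 shows how to use FSR for inpainting. Non-zero pixels of the mask mark valid image areas, and zero pixels mark areas to be reconstructed. An arbitrary mask can be created by hand with tools such as Paint or GIMP, for example by starting from a plain white image and drawing some distortions in black. The original image, the image with distorted regions, and the reconstructed image are shown in Figure 21-1 and Figure 21-2.

**Listing 21-1: inpainting example**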

```c++
#include <opencv2/opencv.hpp>
#include <opencv2/xphoto/inpainting.hpp>
#include <iostream>
using namespace cv;
int main(int argc, char** argv)
{
    // read image and error pattern
    Mat original_, mask_;
    original_ = imread("images/kodim22.png");
    mask_ = imread("images/pattern_random.png", IMREAD_GRAYSCALE);
    // make sure that mask and source image have the same size
    Mat mask;
    resize(mask_, mask, original_.size(), 0.0, 0.0, cv::INTER_NEAREST);
    // distort image
    Mat im_distorted(original_.size(), original_.type(), Scalar::all(0));
    original_.copyTo(im_distorted, mask); // copy valid pixels only (i.e. non-zero pixels in mask)
    // reconstruct the distorted image
    // choose quality profile fast (xphoto::INPAINT_FSR_FAST) or best (xphoto::INPAINT_FSR_BEST)
    Mat reconstructed;
    xphoto::inpaint(im_distorted, mask, reconstructed, xphoto::INPAINT_FSR_FAST);
    imshow("original image", original_);
    imshow("distorted image", im_distorted);
    imshow("reconstructed image", reconstructed);
    waitKey();
    return 0;
}
```
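
The mask convention used in the listing (non-zero means valid, zero means to be reconstructed) can be illustrated on a flat pixel buffer. The sketch below uses only the standard library; the function name and the one-byte-per-pixel layout are assumptions made for the illustration, and it mirrors only the masked copy step, not the reconstruction itself.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Build the distorted image: keep pixels where the mask is non-zero,
// leave the others at 0 so that they can be reconstructed later.
std::vector<std::uint8_t> keepValidPixels(const std::vector<std::uint8_t>& image,
                                          const std::vector<std::uint8_t>& mask) {
    assert(image.size() == mask.size());
    std::vector<std::uint8_t> distorted(image.size(), 0);
    for (std::size_t i = 0; i < image.size(); ++i) {
        if (mask[i] != 0)
            distorted[i] = image[i];
    }
    return distorted;
}
```

Any non-zero mask value counts as valid, which is why a mask painted in pure black and white works without further thresholding.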

<p align="center">

<img src="./img/21-1.png">

</p>

<p align="center">

<img src="./img/21-2.png">

</p>


================================================
FILE: chapter 21/油画效果.md
================================================
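# 21.2 Oil painting effect

An image can be represented in several color spaces; the default color space conversion used here is COLOR_BGR2GRAY. For each pixel, compute the histogram (of the first plane of the color space) over its neighborhood of size 2×size+1 and assign the most frequent value to the pixel; the result looks much like an oil painting. The oilPainting() function in the xphoto module converts an ordinary image into this oil painting style. Its fourth parameter reduces the image dynamics and thereby strengthens the oil painting effect. The use of the function is shown in Listing 21-2, and the resulting oil painting effect is shown in Figure 21-4.

**Listing 21-2**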

```
Mat img;
Mat dst;
img = imread("opencv/samples/data/baboon.jpg");
xphoto::oilPainting(img, dst, 10, 1, COLOR_BGR2Lab);
imshow("oil painting effect", dst);
```
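
The per-pixel rule described above can be written out for a single-channel image. This is a simplified sketch under stated assumptions, not the xphoto implementation: it works on one 8-bit plane only, pixels outside the image are ignored, ties go to the lowest level, and the `GrayImage` type and function name are hypothetical. The `dynRatio` argument plays the role of the dynamics-reducing parameter: intensities are divided by it before the histogram is built.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstdint>
#include <vector>

// Row-major 8-bit single-channel image.
struct GrayImage {
    int width;
    int height;
    std::vector<std::uint8_t> data;
    std::uint8_t at(int x, int y) const { return data[y * width + x]; }
};

// For every pixel, take the most frequent quantized level in the
// (2*size+1) x (2*size+1) neighborhood and write it back.
GrayImage oilPaintSketch(const GrayImage& src, int size, int dynRatio) {
    assert(size >= 0 && dynRatio >= 1);
    GrayImage dst{src.width, src.height,
                  std::vector<std::uint8_t>(src.data.size())};
    for (int y = 0; y < src.height; ++y) {
        for (int x = 0; x < src.width; ++x) {
            std::array<int, 256> histogram{};
            for (int dy = -size; dy <= size; ++dy) {
                for (int dx = -size; dx <= size; ++dx) {
                    const int nx = x + dx, ny = y + dy;
                    if (nx < 0 || ny < 0 || nx >= src.width || ny >= src.height)
                        continue; // neighbors outside the image are skipped
                    ++histogram[src.at(nx, ny) / dynRatio];
                }
            }
            const int mode = static_cast<int>(
                std::max_element(histogram.begin(), histogram.end()) -
                histogram.begin());
            dst.data[y * src.width + x] =
                static_cast<std::uint8_t>(std::min(255, mode * dynRatio));
        }
    }
    return dst;
}
```

Isolated outliers disappear because they never win the neighborhood vote, which is what produces the flat, brush-like regions.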

<p align="center">

<img src="./img/21-3.png">

</p>


================================================
FILE: chapter 21/训练基于学习的白平衡算法.md
================================================
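# 21.3 Training a learning-based white balance algorithm

## 21.3.1 Introduction

Many traditional white balance algorithms are statistics-based, i.e. they rely on the fact that certain assumptions, such as the well-known grey-world assumption, should hold in properly white-balanced images. However, learning-based frameworks that make use of large labeled image datasets can often achieve better results. This tutorial demonstrates how to train a learning-based white balance algorithm and evaluate its quality.

## 21.3.2 How to train a model

First download the training dataset. In this tutorial we use the [Gehler-Shi dataset](http://www.cs.sfu.ca/~colour/data/shi_gehler/). Extracting the dataset gives a folder with 568 training images. We also need a file named real_illum_568..mat, which contains the ground truth. The files used in the tutorial have to be downloaded separately; the download links can be found on the OpenCV website.

> Tip
>
> The data files and scripts used in this tutorial can be obtained by replying “Gehler-Shi dataset” in the 小白学视觉 WeChat official account.

Next, train the model with a [Python script](https://github.com/opencv/opencv_contrib/tree/master/modules/xphoto/samples/learn_color_balance.py), called with the arguments in Listing 21-3.

**Listing 21-3: calling the script**

```
python learn_color_balance.py -i <path to the folder with training images> -g <path to real_illum_568..mat> -r 0,378 --num_trees 30 --max_tree_depth 6 --num_augmented 0
```

The script called in Listing 21-3 trains the model on 378 images (two thirds of the dataset). The size of the model is set to 30 regression tree pairs per feature, and the tree depth is limited to at most 6. By default the trained model is saved to color_balance_model.yml.

The path of the trained model is passed in when constructing a LearningBasedWB instance, as shown in Listing 21-4.

**Listing 21-4: loading the model**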

```
Ptr<xphoto::LearningBasedWB> wb = xphoto::createLearningBasedWB(modelFilename);
```

## 21.3.3 如何评估模型

我们将使用 [benchmarking 脚本](https://github.com/opencv/opencv_contrib/tree/master/modules/xphoto/samples/color_balance_benchmark.py)在剩余的 1/3 数据集上比较我们训练的模型与经典的算法。脚本调用参数在代码清单 21-5 中给出。

**代码清单 21-5**

```
python color_balance_benchmark.py -a grayworld,learning_based:color_balance_model.yml -m <full path to folder containing the model> -i <path to the folder with training images> -g <path to real_illum_568..mat> -r 379,567 -d "img"
```

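By default, the evaluation results are stored in white_balance_eval_result.html, and the white-balanced images are stored in the img folder for qualitative comparison of the algorithms.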
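For a quantitative comparison, illuminant estimates are commonly scored by the angle between the estimated and the ground-truth illuminant vectors. The sketch below shows that metric; whether the benchmarking script reports exactly this quantity is an assumption, and the function name `angular_error_deg` is hypothetical.

```python
import math

def angular_error_deg(estimated, ground_truth):
    """Angle in degrees between two RGB illuminant vectors."""
    dot = sum(e * g for e, g in zip(estimated, ground_truth))
    norm_e = math.sqrt(sum(e * e for e in estimated))
    norm_g = math.sqrt(sum(g * g for g in ground_truth))
    # Clamp to [-1, 1] so rounding errors cannot push acos out of its domain.
    cos_angle = max(-1.0, min(1.0, dot / (norm_e * norm_g)))
    return math.degrees(math.acos(cos_angle))

# The metric ignores overall brightness: scaling a vector does not change the angle.
same_direction = angular_error_deg((1.0, 1.0, 1.0), (2.0, 2.0, 2.0))
orthogonal = angular_error_deg((1.0, 0.0, 0.0), (0.0, 1.0, 0.0))
```

A smaller angle means the estimated illuminant color is closer to the ground truth.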


================================================
FILE: chapter 3/3
================================================


================================================
FILE: chapter 3/背景分割.md
================================================
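In some applications we need to extract the foreground for further processing, for example in object tracking or moving-object detection. Background segmentation is one of the most common and most needed techniques in these cases.

In this tutorial we will learn the background subtraction methods available in OpenCV.

## 3.1.1	Overview
Background subtraction is a major preprocessing step in many vision-based applications. Examples include a visitor counter in which a static camera counts the people entering or leaving a room, or a traffic camera that extracts information about vehicles. In all of these cases we first need to extract the people or vehicles alone. Technically, we need to extract the moving foreground from the static background.

If we have an image of the background alone, such as an image of the room without visitors or of the road without vehicles, extracting the moving foreground is easy: take the difference between the new image and the background image, and only the foreground objects remain. In most cases, however, no such image is available, so the background has to be estimated from the images we do have. The problem becomes more complicated when vehicles cast shadows: because the shadows also move, simple subtraction marks them as foreground as well.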
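The easy case described above, where a clean background image is available, can be sketched with plain NumPy. The function name `foreground_mask` and the threshold value are assumptions chosen for illustration; this is not one of the bgsegm algorithms.

```python
import numpy as np

def foreground_mask(background, frame, threshold=30):
    """Mark pixels whose absolute difference from the background exceeds a threshold."""
    # Work in a signed type so the subtraction cannot wrap around in uint8.
    diff = np.abs(frame.astype(np.int16) - background.astype(np.int16))
    return (diff > threshold).astype(np.uint8) * 255

# Synthetic example: a uniform background and a frame with one changed pixel.
background = np.full((4, 4), 50, dtype=np.uint8)
frame = background.copy()
frame[1, 1] = 200
mask = foreground_mask(background, frame)
```

A moving shadow would also exceed the threshold here, which is exactly the weakness mentioned in the text.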

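In the following, we introduce two background subtraction algorithms from the bgsegm module.

## 3.1.2	BackgroundSubtractorMOG
This is a Gaussian-mixture-based background/foreground segmentation algorithm. It was introduced by P. KaewTraKulPong and R. Bowden in the 2001 paper An improved adaptive background mixture model for real-time tracking with shadow detection. The method models each background pixel by a mixture of K Gaussian distributions (K = 3 to 5). The weights of the mixture represent the proportion of time that those colors stay in the scene; the probable background colors are the ones that stay longer and are more static.

OpenCV provides cv.bgsegm.createBackgroundSubtractorMOG() to create the background subtractor object. It has some optional parameters, such as the history length, the number of Gaussian mixtures and the threshold, all of which are set to default values. Inside the video loop, call the subtractor's apply() method to obtain the foreground mask. Listing 3-1 gives a sample program that uses this function for background subtraction.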
```python
# Listing 3-1
import cv2 as cv

cap = cv.VideoCapture('vtest.avi')
fgbg = cv.bgsegm.createBackgroundSubtractorMOG()
while True:
    ret, frame = cap.read()
    if not ret:  # stop when the video ends or a frame cannot be read
        break
    fgmask = fgbg.apply(frame)
    cv.imshow('frame', fgmask)
    k = cv.waitKey(30) & 0xff
    if k == 27:  # Esc key
        break
cap.release()
cv.destroyAllWindows()
```

## 3.1.3	BackgroundSubtractorGMG
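This algorithm combines statistical background image estimation with per-pixel Bayesian segmentation. It was introduced in 2012 by Andrew B. Godbehere, Akihiro Matsukawa and Ken Goldberg in the paper Visual Tracking of Human Visitors under Variable-Lighting Conditions for a Responsive Audio Art Installation.

The method uses the first few frames (120 by default) for background modelling. It employs a probabilistic foreground segmentation algorithm that identifies possible foreground objects using Bayesian inference. The estimates are adaptive: newer observations are weighted more heavily than older ones in order to accommodate changing illumination. To get a good result, some morphological filtering, such as closing and opening, is needed to remove unwanted noise. During the first few frames the output window is black. OpenCV implements the algorithm as cv.bgsegm.createBackgroundSubtractorGMG(); Listing 3-2 gives a sample program that uses it for background subtraction.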

```python
# Listing 3-2
import cv2 as cv

cap = cv.VideoCapture('vtest.avi')
kernel = cv.getStructuringElement(cv.MORPH_ELLIPSE, (3, 3))
fgbg = cv.bgsegm.createBackgroundSubtractorGMG()
while True:
    ret, frame = cap.read()
    if not ret:  # stop when the video ends or a frame cannot be read
        break
    fgmask = fgbg.apply(frame)
    # morphological opening removes small noise blobs from the mask
    fgmask = cv.morphologyEx(fgmask, cv.MORPH_OPEN, kernel)
    cv.imshow('frame', fgmask)
    k = cv.waitKey(30) & 0xff
    if k == 27:  # Esc key
        break
cap.release()
cv.destroyAllWindows()
```
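To see why the opening step in Listing 3-2 removes noise, the following sketch implements binary opening (erosion followed by dilation) with NumPy only. It is an illustration under simplifying assumptions: it uses a full 3×3 square as the structuring element rather than the elliptical kernel of the listing, and the helper names are hypothetical.

```python
import numpy as np

def _shifted_views(mask):
    """Yield the 9 views of `mask` shifted over a 3x3 neighborhood (zero padded)."""
    padded = np.pad(mask, 1, mode="constant", constant_values=False)
    height, width = mask.shape
    for dy in range(3):
        for dx in range(3):
            yield padded[dy:dy + height, dx:dx + width]

def erode(mask):
    out = np.ones_like(mask, dtype=bool)
    for view in _shifted_views(mask):
        out &= view          # a pixel survives only if all 9 neighbors are set
    return out

def dilate(mask):
    out = np.zeros_like(mask, dtype=bool)
    for view in _shifted_views(mask):
        out |= view          # a pixel is set if any of its 9 neighbors is set
    return out

def opening(mask):
    return dilate(erode(mask))

mask = np.zeros((7, 7), dtype=bool)
mask[1:4, 1:4] = True   # a 3x3 foreground blob
mask[5, 5] = True       # an isolated noise pixel
cleaned = opening(mask)
```

The blob is restored by the dilation, while the isolated pixel does not survive the erosion.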

## 3.1.4	Results
The original image and the results of the two background subtraction methods are shown in Figures 3-1, 3-2 and 3-3, respectively.

<p align="center">
<img src="https://img-blog.csdnimg.cn/20200224120425374.png" height="300">
</p>

<p align="center">
<img src="https://img-blog.csdnimg.cn/20200224120430639.png" height="300">
</p>

<p align="center">
<img src="https://img-blog.csdnimg.cn/20200224120438176.png" height="300">
</p>


================================================
FILE: chapter 4/3
================================================


================================================
FILE: chapter 4/处理引起视错觉的图像.md
================================================
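## 4.2.1	Goal
In this tutorial we show how to reproduce a visual illusion that our eyes perceive under particular lighting conditions: the Adelson checkerboard.

## 4.2.2	The Adelson Checkerboard
Looking at the checkerboard in Figure 4-6, human eyes perceive square "B" as lighter than square "A", although the two have exactly the same RGB color. In the real world, of course, the checkerboard has a square "B" that is lighter than "A", but in this picture the shadow cast by the green cylinder onto "B" ends up making squares "A" and "B" equally bright. This checkerboard is known as the Adelson checkerboard.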

<p align="center">
<img src="https://img-blog.csdnimg.cn/2020022412270887.png" height="300">
</p>
 
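Our visual system does "compensate" for the shadow, making us perceive square "B" as lighter, as if the shadow were not there. This is caused by the local adaptation process that takes place in the foveal area of the retina. We can cut out a part of each square and look at the pieces without any background, or measure the RGB values of the two squares with a tool. For Figure 4-7, small patches of squares A and B were cropped and placed side by side; it is clear that they have the same luminance.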

<p align="center">
<img src="https://img-blog.csdnimg.cn/20200224122822926.png" height="100">
</p>

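What actually happens is that when we look at an area, our eyes adapt locally to luminance, filter noise, enhance contours and so on, taking the surrounding area into account, and this produces the illusion. Square A, surrounded by lighter squares, is therefore perceived as darker. Conversely, the neighborhood of square B is darker, so B is perceived as lighter.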
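The effect of the surround can be illustrated with a deliberately crude model: divide the luminance of a patch by the mean luminance of its neighbors. This is only a stand-in for local adaptation, not the retina model of the bioinspired module, and the function name and the sample values are assumptions.

```python
def perceived_brightness(center, surround):
    """Luminance of a patch relative to the mean luminance of its surround."""
    local_mean = sum(surround) / len(surround)
    return center / local_mean

# Both squares have the same measured luminance (120), but different neighbors.
a = perceived_brightness(120, [200, 200, 200, 200])  # A: bright neighbors
b = perceived_brightness(120, [60, 60, 60, 60])      # B: shadowed, darker neighbors
```

With identical inputs, B comes out brighter than A purely because of its darker surround, which is the direction of the illusion.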

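## 4.2.3	Reproducing the Illusion
The bioinspired module mimics the processing performed by the retina and reproduces the local adaptation ability of our eyes. This means that the output of the parvocellular channel should contain luminance values similar to those we perceive with our eyes. Specifically, in this case we expect the RGB values of square "B" to actually be lighter (larger) than those of square "A".

To mimic our eyes correctly, OpenCV must perform local adaptation on the right portion of the image. This means we must make sure that OpenCV's notion of "local" matches the image dimensions; otherwise local adaptation will not work as expected. For this reason we may have to adjust the hcellsSpatialConstant parameter (which technically specifies the low spatial cut frequency, that is, the sensitivity to slow luminance changes) according to the image resolution. For the image in this tutorial, the default retina parameters can be used.

The bioinspired module ships with a demonstration program, example_bioinspired_retinaDemo, that feeds an image to the retina model. The command for running it is given in Listing 4-10.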

```bash
# Listing 4-10
example_bioinspired_retinaDemo -image checkershadow_illusion4med.jpg
```

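This is a static image, but our retina has just started moving into a new context (the eyes have just opened) and has to adapt to it. During this transient state the luminance information matters, and we see more or less the absolute luminance values. Absolute luminance is exactly what we do not want to look at in order to reproduce the illusion.

Once the steady state is reached, we receive more contextual luminance information. The eyes work in a center-surround manner and take the surrounding luminance into account when evaluating the luminance level of the region of interest. That is when the illusion appears!

When processing a single frame, only the steady-state response is needed. All we have to do is feed the retina with the same frame repeatedly (this is what the example code does), as if processing a still video. Alternatively, we can set the retina temporal parameters to 0 to reach the steady state immediately (the photoreceptorsTemporalConstant and hcellsTemporalConstant parameters of the XML file). In that case, however, we should be aware that the experiment deliberately reproduces the behavior of a real retina less accurately!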
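Why repeated frames lead to a steady state, and why a temporal constant of 0 reaches it immediately, can be seen with a first-order temporal low-pass filter. This is a generic sketch, not the retina's actual filter equations; `run_lowpass` and `tau` are hypothetical names.

```python
def run_lowpass(frames, tau):
    """First-order temporal low-pass filter; a larger tau adapts more slowly."""
    alpha = 1.0 / (1.0 + tau)   # tau = 0 gives alpha = 1: no memory of the past
    state = 0.0
    history = []
    for value in frames:
        state += alpha * (value - state)   # move part of the way toward the input
        history.append(state)
    return history

# Feeding the same value 20 times lets the filter settle close to that value.
slow = run_lowpass([100.0] * 20, tau=3.0)
# With tau = 0 the very first output already equals the input.
instant = run_lowpass([100.0], tau=0.0)
```

The first output of the slow filter is far from the input; only after many identical frames does it approach the steady value.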

Listing 4-11 gives a short Python script that processes the image; it iterates 20 times, a number that was found experimentally to work well.

```python
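# Listing 4-11: Python code
import cv2 as cv
inputImage = cv.imread('checkershadow_illusion4med.jpg', 1)
# note: depending on the OpenCV version, the factory may instead be exposed as
# cv.bioinspired_Retina.create(...) or cv.bioinspired.Retina_create(...)
retina = cv.bioinspired.createRetina((inputImage.shape[1], inputImage.shape[0]))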
# the retina object is created with default parameters. If you want to read
# the parameters from an external XML file, uncomment the next line
#retina.setup('MyRetinaParameters.xml')
# feed the retina with several frames, in order to reach 'steady' state
for i in range(20):
    retina.run(inputImage)
# get our processed image :)
retinaOut_parvo = retina.getParvo()
# show both the original image and the processed one
cv.imshow('image', inputImage)
cv.imshow('retina parvo out', retinaOut_parvo)
# wait for a key to be pressed and exit
cv.waitKey(0)
cv.destroyAllWindows()
# write the output image on a file
cv.imwrite('checkershadow_parvo.png', retinaOut_parvo)
```
The final result is shown in Figure 4-8.

<p align="center">
<img src="https://img-blog.csdnimg.cn/2020022412331978.png" height="300">
</p>

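## 4.2.4	Analysis of the Result
Cropping squares "A" and "B" from Figure 4-8 and placing them side by side gives Figure 4-9. After processing, the pixel values of square "B" are now larger than those of square "A", which agrees with what we actually perceive.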

<p align="center">
<img src="https://img-blog.csdnimg.cn/20200224123523891.png" height="100">
</p>

 
================================================
FILE: chapter 4/视网膜视觉和真实世界的视觉.md
================================================
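## 4.1.1	Goal
In this tutorial we introduce a model of the human retina that exhibits some interesting image preprocessing and enhancement properties. The main topics are:

-	The two main channels of the retina
-	Basic use of the retina model
-	Fine-tuning the parameters

## 4.1.2	General Overview
The model originates from Jeanny Herault's research at Gipsa; the main results are presented in Vision: Images, Signals and Neural Networks - Models of Neural Processing in Visual Perception. It is used in image processing applications at the Listic lab (which maintains and uses the code). It is not a complete model, but it already offers interesting properties that can be used for enhanced image processing. The model gives access to the following properties of the human retina:

-	Spectral whitening, which has three important effects: cancellation of high-frequency signals (noise), enhancement of mid-frequency details and reduction of low-frequency luminance energy.
-	Local logarithmic luminance compression, which enhances details even in low-light conditions.
-	Decorrelation of detail information and transient information.

The first two properties are illustrated below.

In Figure 4-1, the image on the left is a high-dynamic-range image. To make it viewable, the luminance range of the original input was rescaled linearly and converted to 8 bits per channel. Because the local contrast is too strong, this crude conversion hides many details.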
 
<p align="center">
<img src="https://img-blog.csdnimg.cn/20200224120835276.png" height="300">
</p>
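The loss of detail caused by the linear rescaling described above can be reproduced with a handful of numbers. The sketch compares a linear conversion to 8 bits with a global logarithmic one; the global logarithm is only a simplified stand-in for the retina's local logarithmic compression, and the function names and sample values are assumptions.

```python
import math

def to_8bit_linear(values):
    """Rescale linearly so that the largest value maps to 255."""
    peak = max(values)
    return [round(255 * v / peak) for v in values]

def to_8bit_log(values):
    """Compress with log(1 + v) before rescaling to 0..255."""
    peak = math.log1p(max(values))
    return [round(255 * math.log1p(v) / peak) for v in values]

# Luminances spanning four orders of magnitude, as in a high-dynamic-range scene.
hdr = [1.0, 10.0, 100.0, 10000.0]
linear = to_8bit_linear(hdr)
compressed = to_8bit_log(hdr)
```

After the linear conversion the two darkest values fall into the same 8-bit level, whereas the logarithmic conversion keeps all four values distinct.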

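In Figure 4-2, local luminance adaptation, spatial noise removal and spectral whitening are applied together, just as the retina does, so that accurate information is transmitted over the lower-range 8-bit data channels. In this picture the noise has been removed significantly, and local details that were hidden by strong luminance contrast are enhanced. The output image stays natural while its visual content is enhanced.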
<p align="center">
<img src="https://img-blog.csdnimg.cn/20200224121000879.png" height="300">
</p>
 
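## 4.1.3	Output Channels of the Retina Model
The retina model has two outputs:
-	The first is the parvocellular channel. It is mainly active in the foveal area of the retina (high-resolution central vision with color-sensitive photoreceptors), and its purpose is to provide accurate color vision for visual details that remain static on the retina. Objects that move across the retina, on the other hand, are blurred.

-	The second is the Magnocellular channel. It is mainly active in peripheral vision and sends signals related to change events (motion, transient events and so on). These signals also help the visual system focus the retina on "transient"/moving areas for more detailed analysis, which improves understanding of the scene context and object classification.

Note: unlike in a real retina, both channels are applied to the whole input image at the same resolution. This makes enhanced visual details and motion information available everywhere in the image. The two channels are, however, complementary. For example, if the Magnocellular channel delivers strong energy in an area, the parvocellular channel is certainly blurred there, because a transient event is taking place.

To illustrate this, we use a video captured by a camera in a rather dark environment as the input and analyze what each of the two channels of the retina model does. The video was recorded in a university lecture hall, where some students move around while talking to the teacher.

Because of the dark environment, the signal-to-noise ratio of this video is low, so the acquired images are of low quality and false colors appear on the edges of visual features, as shown in Figure 4-3.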

<p align="center">
<img src="https://img-blog.csdnimg.cn/20200224121200816.png" height="300">
</p>

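The retina model is then applied to Figure 4-3; the parvocellular output is shown in Figure 4-4. With the retina configuration used here, global luminance is preserved and local contrast is enhanced. The signal-to-noise ratio is also improved: since high-frequency spatio-temporal noise is reduced, the enhanced details are not corrupted by any amplified noise.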

<p align="center">
<img src="https://img-blog.csdnimg.cn/20200224121316946.png" height="300">
</p>

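Next comes the output of the Magnocellular channel, shown in Figure 4-5. Its signal is strong where transient events occur: a student moving at the bottom of the image produces high energy. The rest of the image is static but corrupted by strong noise. Here the retina filters out most of the noise, so it raises few false motion "alarms". This channel can be used as a transient/moving-area detector: it can provide relevant information to a low-cost segmentation tool that highlights the areas in which an event occurs.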

<p align="center">
<img src="https://img-blog.csdnimg.cn/20200224121508878.png" height="300">
</p>

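## 4.1.4	Code Tutorial
The source code used in this tutorial can be found at opencv_folder/samples/cpp/tutorial_code/bioinspired/retina_tutorial.cpp. The retina model lives in the cv::bioinspired namespace, so the bioinspired module must be built. It depends on two basic libraries: opencv_core (management of cv::Mat and related objects) and opencv_highgui (display and image/video reading). The three headers can be included as shown in Listing 4-1.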

```cpp
// Listing 4-1: Including the headers
#include <opencv2/bioinspired.hpp>
#include <opencv2/core.hpp>
#include <opencv2/highgui.hpp>
```
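The program first declares a cv::Mat into which the input image is loaded. It also allocates a cv::VideoCapture object in case a video stream has to be loaded, as shown in Listing 4-2.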

```cpp
// Listing 4-2
int main(int argc, char* argv[]) {
    // declare the retina input buffer... that will be fed differently in
    // regard of the input media
    cv::Mat inputFrame;
    cv::VideoCapture videoCapture; // in case a video media is used, its
```

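Before processing, the command-line arguments are checked. The type of data to load depends on the user's input: command -image loads an image, and command -video loads a video.

In addition, if the user appends log at the end of the command line, the boolean flag useLogSampling takes into account the spatial log sampling performed by the retina.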

```cpp
// Listing 4-3
// welcome message
std::cout<<"*********************************************"<<std::endl;
std::cout<<"* Retina demonstration : demonstrates the use of is a wrapper class of the Gipsa/Listic Labs retina model."<<std::endl;
std::cout<<"* This demo will try to load the file 'RetinaSpecificParameters.xml' (if exists).\nTo create it, copy the autogenerated template 'RetinaDefaultParameters.xml'.\nThen tweak it with your own retina parameters."<<std::endl;
// basic input arguments checking
if (argc<2)
{
  	help("bad number of parameter");
  	return -1;
}
bool useLogSampling = !strcmp(argv[argc-1], "log"); // check if user wants retina log sampling processing
std::string inputMediaType=argv[1];
// checking input media type (still image, video file, live video acquisition)
if (!strcmp(inputMediaType.c_str(), "-image") && argc >= 3)
{
    std::cout<<"RetinaDemo: processing image "<<argv[2]<<std::endl;
    // image processing case
    inputFrame = cv::imread(std::string(argv[2]), 1); // load as a 3-channel color image (BGR order)
}
else if (!strcmp(inputMediaType.c_str(), "-video")) // "else if": otherwise -image falls into the error branch below
{
    if (argc == 2 || (argc == 3 && useLogSampling)) // attempt to grab images from a video capture device
    {
        videoCapture.open(0);
    }
    else// attempt to grab images from a video filestream
    {
        std::cout<<"RetinaDemo: processing video stream "<<argv[2]<<std::endl;
        videoCapture.open(argv[2]);
    }
   	// grab a first frame to check if everything is ok
    videoCapture>>inputFrame;
}
else
{
    // bad command parameter
  	help("bad command parameter");
  	return -1;
}
```

Check whether the input was read; if not, display an error and stop the program.

```cpp
// Listing 4-4
if (inputFrame.empty())
{
    help("Input media could not be loaded, aborting");
    return -1;
}
```
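Next the retina model is run. At this point a retina instance is allocated and the optional log sampling is handled. The retina constructor requires at least a cv::Size object that specifies the size of the input data to be managed. Other options can be set, such as color processing and the associated color multiplexing strategy (here, Bayer multiplexing is selected with the enum flag cv::bioinspired::RETINA_COLOR_BAYER). If log sampling is used, the image reduction factor (smaller output images) and the log sampling strength can be adjusted.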

```cpp
// Listing 4-5
// pointer to a retina object
cv::Ptr<cv::bioinspired::Retina> myRetina;
// if the last parameter is 'log', then activate log sampling (favour foveal vision and subsamples peripheral vision)
if (useLogSampling)
{
    myRetina = cv::bioinspired::createRetina(inputFrame.size(), true, cv::bioinspired::RETINA_COLOR_BAYER, true, 2.0, 10.0);
}
else// -> else allocate "classical" retina :
    myRetina = cv::bioinspired::createRetina(inputFrame.size());
```

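The retina parameters are then written to an XML file that holds the default retina parameters and serves as a template. The generated template file is named RetinaDefaultParameters.xml.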

```cpp
// Listing 4-6
// save default retina parameters file in order to let you see this and 
//maybe modify it and reload using method "setup"
myRetina->write("RetinaDefaultParameters.xml");
```

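The retina then tries to load another XML file named RetinaSpecificParameters.xml. If we have created it and put our own settings in it, it is loaded; otherwise the default retina parameters are used.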

```cpp
// Listing 4-7
// load parameters if file exists
myRetina->setup("RetinaSpecificParameters.xml");
```

Some output buffers are created to receive the outputs of the two retina channels.

```cpp
// Listing 4-8
// declare retina output buffers
cv::Mat retinaOutput_parvo;
cv::Mat retinaOutput_magno;
```

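Then the retina is run in a loop: new frames are loaded from the video sequence if needed, and the retina outputs are retrieved into the dedicated buffers.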

```cpp
// Listing 4-9
// processing loop with no stop condition
while(true)
{
    // if using video stream, then, grabbing a new frame, else, input remains the same
    if (videoCapture.isOpened())
        videoCapture>>inputFrame;
    // run retina filter on the loaded input frame
    myRetina->run(inputFrame);
    // Retrieve and display retina output
    myRetina->getParvo(retinaOutput_parvo);
    myRetina->getMagno(retinaOutput_magno);
    cv::imshow("retina input", inputFrame);
    cv::imshow("Retina Parvo", retinaOutput_parvo);
    cv::imshow("Retina Magno", retinaOutput_magno);
    cv::waitKey(10);
}
```


================================================
FILE: chapter 5/5
================================================


================================================
FILE: chapter 5/多相机标定.md
================================================
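## 5.2.1	Goal
In this tutorial we show how to use the multi-camera calibration toolbox. The main topics are:

1.	The "random" pattern and single-camera calibration
2.	Multi-camera calibration

## 5.2.2	The "Random" Pattern and Calibration
The "random" pattern is a randomly generated image. Because it is random, it contains many feature points. Once generated, it can be printed and used as a calibration board. Figures 5-10 and 5-11 show a random pattern and a photo taken of it, respectively.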

<p align="center">
<img src="https://img-blog.csdnimg.cn/20200224130609660.png" height="300">
</p>

<p align="center">
<img src="https://img-blog.csdnimg.cn/20200224130634268.png" height="300">
</p>
 
To generate a random pattern, use the class cv::randpattern::RandomPatternGenerator from the ccalib module, as shown in Listing 5-9.

```cpp
// Listing 5-9: Generating a random pattern
cv::randpattern::RandomPatternGenerator generator(width, height);
generator.generatePattern();
pattern = generator.getPattern();
```
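Here width and height are the width and height of the pattern image. After generating the pattern, print it and take a few photos of it; these images can then be used to calibrate the camera.

First, the objectPoints and imagePoints have to be detected in the captured images. This is done with the class cv::randpattern::RandomPatternCornerFinder, as shown in Listing 5-10.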

```cpp
// Listing 5-10: Detecting the pattern points
cv::randpattern::RandomPatternCornerFinder finder(patternWidth, patternHeight, nMiniMatches);
finder.loadPattern(pattern);
finder.computeObjectImagePoints(vecImg);
vector<Mat> objectPoints = finder.getObjectPoints();
vector<Mat> imagePoints = finder.getImagePoints();
```

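Here patternWidth and patternHeight are the physical width and height of the printed pattern, in a user-defined unit. nMiniMatches is the minimum number of points that must be detected in an image for it to be used, and vecImg is the vector that stores the calibration images.

Then a calibration function such as cv::calibrateCamera or cv::omnidir::calibrate is used to calibrate the camera.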

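## 5.2.3	Multi-Camera Calibration
We now turn to the calibration of multiple cameras. For the time being, this toolbox must be used with the random pattern described above.

To calibrate multiple cameras, we first take photos of the random pattern. To calibrate the extrinsic parameters, the same pattern has to be photographed by several cameras (at least two) at the same time. To tell the cameras and their images apart, the image files should be named "cameraIdx-timestamp.*". Photos with the same timestamp show the same pattern pose captured by several cameras. In addition, cameraIdx should start from 0. Example file names: "0-129.png", "0-187.png", "1-187.png", "2-129.png".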
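The naming convention can be checked before running the calibration with a small script that groups the files by timestamp. This is a sketch that uses only the Python standard library; the helper `group_by_timestamp` is hypothetical and not part of the toolbox.

```python
from collections import defaultdict
from pathlib import Path

def group_by_timestamp(filenames):
    """Map each timestamp to {camera index: file name}."""
    groups = defaultdict(dict)
    for name in filenames:
        # "1-187.png" -> camera index "1", timestamp "187"
        camera_idx, timestamp = Path(name).stem.split("-", 1)
        groups[timestamp][int(camera_idx)] = name
    return dict(groups)

files = ["0-129.png", "0-187.png", "1-187.png", "2-129.png"]
shots = group_by_timestamp(files)
# Timestamps seen by at least two cameras can constrain the extrinsic parameters.
shared = sorted(t for t, cams in shots.items() if len(cams) >= 2)
```

In this example both timestamps are shared: 129 links cameras 0 and 2, and 187 links cameras 0 and 1.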

Then run the multi-camera calibration, as shown in Listing 5-11.

```cpp
// Listing 5-11: Multi-camera calibration
cv::multicalib::MultiCameraCalibration multiCalib(cameraType, nCamera, inputFilename,patternWidth, patternHeight, showFeatureExtraction, nMiniMatches);
multiCalib.run();
multiCalib.writeParameters(outputFilename);
```

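Here cameraType is the camera type; multicalib::MultiCameraCalibration::PINHOLE and multicalib::MultiCameraCalibration::OMNIDIRECTIONAL are supported. For more about wide-angle (omnidirectional) cameras, see the cv::omnidir module. nCamera is the number of cameras. inputFilename is the name of the file generated by imagelist_creator from opencv/sample. patternWidth and patternHeight are the physical width and height of the pattern. showFeatureExtraction is a flag that indicates whether the feature extraction process is displayed. nMiniMatches is the minimum number of points that must be detected in each image; images that do not satisfy this condition are discarded. outputFilename is the name of the XML file in which the parameters are stored.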


================================================
FILE: chapter 5/广角相机标定.md
================================================
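This chapter introduces the camera calibration module of the contrib extension modules and uses examples to focus on the calibration of wide-angle (omnidirectional) cameras and the joint calibration of multiple cameras. The part on wide-angle cameras covers calibration, rectification and stereo reconstruction; the part on multi-camera calibration covers the "random" pattern and multi-camera calibration.


## 5.1.1	Goal
In this tutorial we show how to calibrate wide-angle cameras. The main topics are:

1.	Calibrating a single camera
2.	Calibrating a stereo pair of cameras
3.	Rectifying images to remove most of the distortion
4.	3D reconstruction from two stereo images with a large field of view
5.	Comparison with the fisheye model in opencv/calib3d/

## 5.1.2	Single-Camera Calibration
The first step in calibrating a camera is to obtain a calibration pattern and take some photos of it. OpenCV supports several patterns, such as the checkerboard and the circle grid. A new pattern called the random pattern can also be used; for more details, see the official documentation in opencv_contrib/modules/ccalib.

The second step is to extract corners from the calibration pattern. For a checkerboard, use the function cv::findChessboardCorners from the OpenCV core libraries; for a circle grid, use cv::findCirclesGrid; for a random pattern, use the randomPatternCornerFinder class in opencv_contrib/modules/ccalib/src/randomPattern.hpp. The corner coordinates in the images are then stored in a variable such as imagePoints. Its type can be std::vector<std::vector<cv::Vec2f>>, where the outer vector holds one entry per image and each inner vector holds the corners detected in that image. Alternatively, imagePoints can be a std::vector<cv::Mat> in which each cv::Mat has type CV_32FC2.

The corresponding 3D points in world coordinates are also required. The world coordinate system can be chosen freely, and the 3D coordinates are computed from the physical size of the pattern. These points are stored in objectPoints. Like imagePoints, it can be a std::vector<std::vector<Vec3f>> or a std::vector<cv::Mat> in which each cv::Mat has type CV_32FC3.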
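For a checkerboard, computing the 3D points from the physical size of the pattern usually means placing the board in the plane z = 0 and spacing the corners by the square size. The sketch below assumes that choice of world coordinate system; the function name and the row-major corner order are assumptions and must match the order in which the corners were detected.

```python
def chessboard_object_points(cols, rows, square_size):
    """3D corner coordinates of a planar board lying in the z = 0 plane."""
    return [(c * square_size, r * square_size, 0.0)
            for r in range(rows)        # row by row ...
            for c in range(cols)]       # ... left to right within a row

# A board with 3 x 2 inner corners and 25 mm squares.
points = chessboard_object_points(cols=3, rows=2, square_size=25.0)
```

The same list is reused for every image, since the physical pattern does not change between shots.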

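The last input is the size of the images.

**Warning**
*objectPoints and imagePoints must have the same size, because they correspond to each other.*

In the official example, the data needed for calibration is stored in an XML file, which can be found at opencv_contrib/modules/ccalib/tutorial/data/omni_calib_data.xml. The file stores three items: objectPoints, imagePoints and imageSize. Listing 5-1 shows the code that loads the file.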

```cpp
// Listing 5-1: Loading the data
cv::FileStorage fs("omni_calib_data.xml", cv::FileStorage::READ);
std::vector<cv::Mat> objectPoints, imagePoints;
cv::Size imgSize;
fs["objectPoints"] >> objectPoints;
fs["imagePoints"] >> imagePoints;
fs["imageSize"] >> imgSize;
```

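**Tip**
*If you have not downloaded the opencv_contrib package, you can obtain this data file by replying "omni_calib_data" to the 小白学视觉 WeChat official account.*

Then define some variables to store the output parameters and run the calibration, as follows: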

```cpp
// Listing 5-2: Calibration
cv::Mat K, xi, D, idx;
int flags = 0;
cv::TermCriteria criteria(cv::TermCriteria::COUNT + cv::TermCriteria::EPS, 200, 0.0001);
std::vector<cv::Mat> rvecs, tvecs;
double rms = cv::omnidir::calibrate(objectPoints, imagePoints, imgSize, K, xi, D, rvecs, tvecs, flags, criteria, idx);
```

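K, xi and D are the intrinsic parameters; rvecs and tvecs are the extrinsic parameters, which store the pose of the pattern in each image. All of them have depth CV_64F. xi is a single-valued parameter of the omnidirectional camera model. idx is a CV_32S Mat that stores the indices of the images actually used in the calibration: some images fail in the initialization step, so they are not used in the final optimization. The return value rms is the root mean square of the reprojection errors.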
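To make the meaning of the rms return value concrete, the sketch below computes a root-mean-square reprojection error from observed image points and the points obtained by projecting the 3D points with the estimated parameters. The exact normalization used inside cv::omnidir::calibrate is an assumption here, and the function name is hypothetical.

```python
import math

def rms_reprojection_error(observed, reprojected):
    """Root mean square of the distances between observed and reprojected points."""
    squared = [(ox - px) ** 2 + (oy - py) ** 2
               for (ox, oy), (px, py) in zip(observed, reprojected)]
    return math.sqrt(sum(squared) / len(squared))

# Two points: one is off by a (3, 4) pixel offset, the other matches exactly.
observed = [(0.0, 0.0), (10.0, 10.0)]
reprojected = [(3.0, 4.0), (10.0, 10.0)]
error = rms_reprojection_error(observed, reprojected)
```

A value well below one pixel is usually taken as a sign of a good calibration, but the acceptable level depends on the camera and the pattern.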

The calibration supports several options; flags is an enumeration of these options, including:

-	cv::omnidir::CALIB_FIX_SKEW
-	cv::omnidir::CALIB_FIX_K1
-	cv::omnidir::CALIB_FIX_K2
-	cv::omnidir::CALIB_FIX_P1
-	cv::omnidir::CALIB_FIX_P2
-	cv::omnidir::CALIB_FIX_XI
-	cv::omnidir::CALIB_FIX_GAMMA
-	cv::omnidir::CALIB_FIX_CENTER

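flags specifies which parameters are held fixed (not optimized) during calibration. Several flags can be combined with the "+" operator. For example, CALIB_FIX_SKEW+CALIB_FIX_K1 means that the skew and K1 are held fixed.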
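Combining flags with "+" works because each flag occupies its own bit. The sketch below illustrates this with made-up constants; the numeric values are assumptions for illustration only, and real code should use the constants exported by cv::omnidir.

```python
# Illustrative bit values only; they are not taken from the OpenCV headers.
FIX_SKEW = 1 << 1
FIX_K1 = 1 << 2
FIX_XI = 1 << 6

def is_fixed(flags, flag):
    """Return True if `flag` is set in the combined `flags` value."""
    return (flags & flag) != 0

flags = FIX_SKEW + FIX_K1          # same result as FIX_SKEW | FIX_K1
safe = flags | FIX_SKEW            # "|" is idempotent: adding a flag twice is harmless
broken = flags + FIX_SKEW          # "+" is not: the bits carry over into FIX_K1's neighbor
```

For this reason "|" is the safer way to combine flags, although "+" gives the same result as long as every flag is used at most once.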

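criteria is the stopping condition of the optimization. For example, cv::TermCriteria(cv::TermCriteria::COUNT + cv::TermCriteria::EPS, 200, 0.0001) stops after 200 iterations or when the relative change falls below 0.0001, whichever comes first.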
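The combined COUNT + EPS condition can be pictured as a loop with two exits. The sketch below applies it to a generic fixed-point iteration; how cv::omnidir measures the "relative change" internally is an assumption, and `iterate_until` is a hypothetical helper.

```python
def iterate_until(step, x0, max_count=200, epsilon=1e-4):
    """Repeat x = step(x) until the relative change is small or max_count is reached."""
    x = x0
    for iteration in range(1, max_count + 1):
        new_x = step(x)
        # EPS exit: the estimate barely moved relative to its magnitude.
        if abs(new_x - x) <= epsilon * max(abs(new_x), 1e-12):
            return new_x, iteration
        x = new_x
    # COUNT exit: the iteration budget is used up.
    return x, max_count

# Newton iteration for sqrt(2) converges quickly and leaves through the EPS exit.
root, used = iterate_until(lambda x: 0.5 * (x + 2.0 / x), 1.0)
# A sequence that never settles leaves through the COUNT exit.
_, exhausted = iterate_until(lambda x: x + 1.0, 0.0)
```

Raising the iteration count or lowering epsilon trades run time for a tighter solution.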

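## 5.1.3	Stereo Calibration
Stereo calibration calibrates two cameras together. The output parameters include the intrinsic parameters of both cameras and the relative pose between them. To determine the relative pose, both cameras must observe the same pattern at the same time, so that the objectPoints of the two cameras are identical.

First, detect the corners in the images of each camera with the method of the previous subsection to obtain imagePoints1 and imagePoints2. Then compute the shared objectPoints.

The data needed for stereo calibration is stored in an XML file, which can be found at opencv_contrib/modules/ccalib/tutorial/data/omni_stereocalib_data.xml. It can be loaded with the code in Listing 5-3.

**Tip**
*This data file can likewise be obtained by replying "omni_stereocalib_data.xml" to the 小白学视觉 WeChat official account.*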

```cpp
// Listing 5-3: Loading the stereo calibration data
cv::FileStorage fs("omni_stereocalib_data.xml", cv::FileStorage::READ);
std::vector<cv::Mat> objectPoints, imagePoints1, imagePoints2;
cv::Size imgSize1, imgSize2;
fs["objectPoints"] >> objectPoints;
fs["imagePoints1"] >> imagePoints1;
fs["imagePoints2"] >> imagePoints2;
fs["imageSize1"] >> imgSize1;
fs["imageSize2"] >> imgSize2;
```
Stereo calibration is then performed with the code in Listing 5-4.

```cpp
// Listing 5-4: Stereo calibration
cv::Mat K1, K2, xi1, xi2, D1, D2, idx;  // idx receives the indices of the images actually used
int flags = 0;
cv::TermCriteria criteria(cv::TermCriteria::COUNT + cv::TermCriteria::EPS, 200, 0.0001);
std::vector<cv::Mat> rvecsL, tvecsL;
cv::Mat rvec, tvec;
double rms = cv::omnidir::stereoCalibrate(objectPoints, imagePoints1, imagePoints2, imgSize1, imgSize2, K1, xi1, D1, K2, xi2, D2, rvec, tvec, rvecsL, tvecsL, flags, criteria, idx);
```

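Here rvec and tvec describe the transform between the first and the second camera. rvecsL and tvecsL are the transforms between the calibration pattern and the first camera.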

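## 5.1.4	Image Rectification
Omnidirectional images are heavily distorted and look very different from what the human eye sees. If the camera parameters are known, however, they can be rectified. Figure 5-1 shows an example of an omnidirectional image with a 360-degree horizontal field of view.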

<p align="center">
<img src="https://img-blog.csdnimg.cn/20200224124608869.png" height="300">
</p>
 

Rectification produces an image that resembles a perspective view. The image is rectified with the function cv::omnidir::undistortImage from the ccalib module.

```cpp
// Listing 5-5: Omnidirectional image rectification
cv::omnidir::undistortImage(distorted, undistorted, K, D, xi, flags, Knew, new_size);
```

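The parameters distorted and undistorted are the original image and the rectified image. K, D and xi are the camera parameters. Knew and new_size are the camera matrix and image size of the rectified image. flags is the rectification type, which can be one of the following:

-	RECTIFY_PERSPECTIVE: rectifies to a perspective image, which loses part of the field of view.
-	RECTIFY_CYLINDRICAL: rectifies to a cylindrical image, which preserves the whole field of view.
-	RECTIFY_STEREOGRAPHIC: rectifies to a stereographic image, which may lose a little of the field of view.
-	RECTIFY_LONGLATI: rectifies to a longitude-latitude map, like a world map. This rectification can be used for stereo reconstruction but may not be convenient to view.

Figures 5-2, 5-3, 5-4 and 5-5 show the results of these four rectification types, respectively.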

<p align="center">
<img src="https://img-blog.csdnimg.cn/20200224125217615.png" height="300">
</p>

<p align="center">
<img src="https://img-blog.csdnimg.cn/20200224125252628.png" height="300">
</p>

<p align="center">
<img src="https://img-blog.csdnimg.cn/2020022412531392.png" height="300">
</p>

<p align="center">
<img src="https://img-blog.csdnimg.cn/2020022412542145.png" height="300">
</p>
 
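As can be seen, the perspective-rectified image keeps only a small part of the field of view. Cylindrical rectification keeps the whole field of view and looks unnatural only in the middle of the bottom. In that region the stereographic rectification has less distortion than the cylindrical one, but its distortion elsewhere is larger and it cannot keep the whole field of view. For images with very large distortion, longitude-latitude rectification does not look good, but it puts the epipolar constraint on a single line, so that stereo matching can be applied to omnidirectional images.

**Note**
*To obtain a good result, choose the parameter Knew carefully; it depends on the camera. Generally speaking, a smaller focal length leads to a smaller field of view, and vice versa. Some recommended settings follow.*

For RECTIFY_PERSPECTIVE (perspective rectification), Knew is computed as shown in Listing 5-6.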

```cpp
// Listing 5-6: Computing Knew
Knew = Matx33f(new_size.width/4, 0, new_size.width/2, 0, new_size.height/4, new_size.height/2, 0, 0, 1);
```
For RECTIFY_CYLINDRICAL, RECTIFY_STEREOGRAPHIC and RECTIFY_LONGLATI (cylindrical, stereographic and longitude-latitude rectification), Knew is computed as shown in Listing 5-7.

```cpp
// Listing 5-7: Computing Knew
Knew = Matx33f(new_size.width/3.1415, 0, 0, 0, new_size.height/3.1415, 0,0, 0, 1);
```

In addition, (u0, v0) may need to be changed to obtain a better view.
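The two recommended settings differ only in how the focal lengths and the principal point (u0, v0) are derived from the output size. The sketch below writes them as plain 3×3 lists so the structure is easy to compare; the function names are hypothetical, and the entries follow Listings 5-6 and 5-7.

```python
import math

def knew_perspective(width, height):
    """Knew for RECTIFY_PERSPECTIVE: principal point at the image center."""
    return [[width / 4, 0, width / 2],
            [0, height / 4, height / 2],
            [0, 0, 1]]

def knew_panoramic(width, height):
    """Knew for the cylindrical, stereographic and longitude-latitude modes."""
    # A focal length of width / pi spreads an angular range of pi over the image width.
    return [[width / math.pi, 0, 0],
            [0, height / math.pi, 0],
            [0, 0, 1]]

perspective = knew_perspective(800, 600)
panoramic = knew_panoramic(800, 600)
```

In both matrices, (u0, v0) is the pair of entries in the last column of the first two rows; shifting it moves the visible part of the scene within the output image.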

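## 5.1.5	Stereo Reconstruction
Stereo reconstruction recovers 3D points in space from a calibrated stereo camera pair. It is a fundamental problem in computer vision. For omnidirectional cameras, however, the large distortion makes reconstruction difficult. The conventional approach is to rectify the images to perspective images and perform stereo reconstruction on those. As the previous subsection showed, though, perspective rectification loses too much of the field of view, which wastes the biggest advantage of an omnidirectional camera: its large field of view.

The first step of stereo reconstruction is to stereo-rectify the images so that the epipolar lines become horizontal. We use longitude-latitude rectification to keep the whole field of view; perspective rectification can also be used, but it is not recommended. The second step is stereo matching to obtain a disparity map. Finally, 3D points are generated from the disparity map.

The function for stereo reconstruction with omnidirectional cameras is omnidir::stereoReconstruct. The following example shows how to use it.

First, calibrate a stereo camera pair as described earlier to obtain the parameters K1, D1, xi1, K2, D2, xi2, rvec and tvec. Then read one image from each of the two cameras, for example image1 and image2, as shown in Figure 5-6.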

<p align="center">
<img src="https://img-blog.csdnimg.cn/20200224125836319.png">
</p>

Then run omnidir::stereoReconstruct; sample code is given in Listing 5-8.

```cpp
// Listing 5-8: 3D reconstruction
cv::Size imgSize = img1.size();
int numDisparities = 16*5;
int SADWindowSize = 5;
cv::Mat disMap;
int flag = cv::omnidir::RECTIFY_LONGLATI;
int pointType = cv::omnidir::XYZRGB;
// the range of theta is (0, pi) and the range of phi is (0, pi)
cv::Matx33d KNew(imgSize.width / 3.1415, 0, 0, 0, imgSize.height / 3.1415, 0, 0, 0, 1);
cv::Mat imageRec1, imageRec2, pointCloud;
// R and T: rotation and translation between the two cameras, from the stereo calibration
cv::omnidir::stereoReconstruct(img1, img2, K1, D1, xi1, K2, D2, xi2, R, T, flag, numDisparities, SADWindowSize, disMap, imageRec1, imageRec2, imgSize, KNew, pointCloud, pointType);
```

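Here the variable flag is the rectification type; only RECTIFY_LONGLATI (recommended) or RECTIFY_PERSPECTIVE can be used. numDisparities is the maximum disparity, and SADWindowSize is the window size of cv::StereoSGBM. pointType is a flag that defines the type of the point cloud: omnidir::XYZRGB means that each point is a 6-dimensional vector whose first three elements are the xyz coordinates and whose last three elements are the rgb color; the other type, omnidir::XYZ, means that each point is 3-dimensional and has only xyz coordinates. imageRec1 and imageRec2 are the rectified versions of the first and second image. Their epipolar lines have the same y coordinate, which makes stereo matching easier; the result is shown in Figure 5-7.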
 
<p align="center">
<img src="https://img-blog.csdnimg.cn/20200224130019967.png" height="300">
</p>

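The result shows that they are well aligned. The variable disMap stores the disparity map computed from imageRec1 and imageRec2 by cv::StereoSGBM. The disparity map of the two images in Figure 5-7 is shown in Figure 5-8.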

<p align="center">
<img src="https://img-blog.csdnimg.cn/20200224130123151.png" height="300">
</p>
 
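With the disparity map, the 3D position of each pixel can be computed. The point cloud is stored in the variable pointCloud, which is a 3-channel or 6-channel cv::Mat. The resulting point cloud is shown in Figure 5-9.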

<p align="center">
<img src="https://img-blog.csdnimg.cn/20200224130231143.png" height="300">
</p>


================================================
FILE: chapter 6/e
================================================


================================================
FILE: chapter 6/使用Icosphere训练数据.md
================================================
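The material in this chapter requires an OpenCV version newer than OpenCV 3.0.0.

## 6.1.1	Goal
In this tutorial we learn how to generate training images with suitable poses from a 3D model for CNN training. The main topics are:

-	How to generate a 3D point cloud model of an object's surface
-	How to generate training images from a 3D model

## 6.1.2	C++ Code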
```cpp
// Listing 6-1
/*
 * Software License Agreement (BSD License)
 *
 *  Copyright (c) 2009, Willow Garage, Inc.
 *  All rights reserved.
 *
 *  Redistribution and use in source and binary forms, with or without
 *  modification, are permitted provided that the following conditions
 *  are met:
 *
 *   * Redistributions of source code must retain the above copyright
 *     notice, this list of conditions and the following disclaimer.
 *   * Redistributions in binary form must reproduce the above
 *     copyright notice, this list of conditions and the following
 *     disclaimer in the documentation and/or other materials provided
 *     with the distribution.
 *   * Neither the name of Willow Garage, Inc. nor the names of its
 *     contributors may be used to endorse or promote products derived
 *     from this software without specific prior written permission.
 *
 *  THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
 *  "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
 *  LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
 *  FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
 *  COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
 *  INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
 *  BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
 *  LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
 *  CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
 *  LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN
 *  ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
 *  POSSIBILITY OF SUCH DAMAGE.
 *
 */
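#include <opencv2/cnn_3dobj.hpp>
#include <opencv2/viz/vizcore.hpp>
#include <algorithm>   // sort
#include <cstdio>      // sprintf
#include <cstring>     // memset, strcmp
#include <fstream>     // std::fstream
#include <iostream>
#include <dirent.h>    // opendir, readdir (POSIX)
#include <stdlib.h>
#include <time.h>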
using namespace cv;
using namespace std;
using namespace cv::cnn_3dobj;
static void listDir(const char *path, std::vector<String>& files, bool r)
{
    DIR *pDir;
    struct dirent *ent;
    char childpath[512];
    pDir = opendir(path);
    if (pDir == NULL)  // directory missing or unreadable: nothing to list
        return;
    memset(childpath, 0, sizeof(childpath));
    while ((ent = readdir(pDir)) != NULL)
    {
        if (ent->d_type & DT_DIR)
        {
            if (strcmp(ent->d_name, ".") == 0 || strcmp(ent->d_name, "..") == 0 || strcmp(ent->d_name, ".DS_Store") == 0)
            {
                continue;
            }
            if (r)
            {
                sprintf(childpath, "%s/%s", path, ent->d_name);
                listDir(childpath,files,false);
            }
        }
        else
        {
            if (strcmp(ent->d_name, ".DS_Store") != 0)
                files.push_back(ent->d_name);
        }
    }
    closedir(pDir);  // release the directory handle
    sort(files.begin(),files.end());
}
int main(int argc, char *argv[])
{
    const String keys = "{help | | demo :$ ./sphereview_test -ite_depth=2 -plymodel=../data/3Dmodel/ape.ply -imagedir=../data/images_all/ -labeldir=../data/label_all.txt -num_class=6 -label_class=0, then press 'q' to run the demo for images generation when you see the gray background and a coordinate.}"
    "{ite_depth | 3 | Iteration of sphere generation.}"
    "{plymodel | ../data/3Dmodel/ape.ply | Path of the '.ply' file for image rendering. }"
    "{imagedir | ../data/images_all/ | Path of the generated images for one particular .ply model. }"
    "{labeldir | ../data/label_all.txt | Path of the generated images for one particular .ply model. }"
    "{bakgrdir | | Path of the backgroud images sets. }"
    "{cam_head_x | 0 | Head of the camera. }"
    "{cam_head_y | 0 | Head of the camera. }"
    "{cam_head_z | -1 | Head of the camera. }"
    "{semisphere | 1 | Camera only has positions on half of the whole sphere. }"
    "{z_range | 0.6 | Maximum camera position on z axis. }"
    "{center_gen | 0 | Find center from all points. }"
    "{image_size | 128 | Size of captured images. }"
    "{label_class |  | Class label of current .ply model. }"
    "{label_item |  | Item label of current .ply model. }"
    "{rgb_use | 0 | Use RGB image or grayscale. }"
    "{num_class | 6 | Total number of classes of models. }"
    "{binary_out | 0 | Produce binaryfiles for images and label. }"
    "{view_region | 0 | Take a special view of front or back angle}";
    /* Get parameters from command line. */
    cv::CommandLineParser parser(argc, argv, keys);
    parser.about("Generating training data for CNN with triplet loss");
    if (parser.has("help"))
    {
        parser.printMessage();
        return 0;
    }
    int ite_depth = parser.get<int>("ite_depth");
    String plymodel = parser.get<String>("plymodel");
    String imagedir = parser.get<String>("imagedir");
    String labeldir = parser.get<String>("labeldir");
    String bakgrdir = parser.get<String>("bakgrdir");
    int label_class = parser.get<int>("label_class");
    int label_item = parser.get<int>("label_item");
    float cam_head_x = parser.get<float>("cam_head_x");
    float cam_head_y = parser.get<float>("cam_head_y");
    float cam_head_z = parser.get<float>("cam_head_z");
    int semisphere = parser.get<int>("semisphere");
    float z_range = parser.get<float>("z_range");
    int center_gen = parser.get<int>("center_gen");
    int image_size = parser.get<int>("image_size");
    int rgb_use = parser.get<int>("rgb_use");
    int num_class = parser.get<int>("num_class");
    int binary_out = parser.get<int>("binary_out");
    int view_region = parser.get<int>("view_region");
    double obj_dist, bg_dist, y_range;
    if (view_region == 1 || view_region == 2)
    {
        /* Set for TV */
        if (label_class == 12)
            obj_dist = 340;
        else
            obj_dist = 250;
        ite_depth = ite_depth + 1;
        bg_dist = 700;
        y_range = 0.85;
    }
    else if (view_region == 0)
    {
        obj_dist = 370;
        bg_dist = 400;
    }
    if (label_class == 5 || label_class == 10 || label_class == 11 || label_class == 12)
        ite_depth = ite_depth + 1;
    cv::cnn_3dobj::icoSphere ViewSphere(10,ite_depth);
    std::vector<cv::Point3d> campos;
    std::vector<cv::Point3d> campos_temp = ViewSphere.CameraPos;
    /* Regular objects on the ground using a semisphere view system */
    if (semisphere == 1)
    {
        if (view_region == 1)
        {
            for (int pose = 0; pose < static_cast<int>(campos_temp.size()); pose++)
            {
                if (campos_temp.at(pose).z >= 0 && campos_temp.at(pose).z < z_range && campos_temp.at(pose).y < -y_range)
                    campos.push_back(campos_temp.at(pose));
            }
        }
        else if (view_region == 2)
        {
            for (int pose = 0; pose < static_cast<int>(campos_temp.size()); pose++)
            {
                if (campos_temp.at(pose).z >= 0 && campos_temp.at(pose).z < z_range && campos_temp.at(pose).y > y_range)
                    campos.push_back(campos_temp.at(pose));
            }
        }
        else
        {
            /* Set for sofa */
            if (label_class == 10)
            {
                for (int pose = 0; pose < static_cast<int>(campos_temp.size()); pose++)
                {
                    if (campos_temp.at(pose).z >= 0 && campos_temp.at(pose).z < z_range && campos_temp.at(pose).y < -0.4)
                        campos.push_back(campos_temp.at(pose));
                }
            }
            else
            {
                for (int pose = 0; pose < static_cast<int>(campos_temp.size()); pose++)
                {
                    if (campos_temp.at(pose).z >= 0 && campos_temp.at(pose).z < z_range)
                        campos.push_back(campos_temp.at(pose));
                }
            }
        }
    }
    /* Special object such as plane using a full space of view sphere */
    else
    {
        if (view_region == 1)
        {
            for (int pose = 0; pose < static_cast<int>(campos_temp.size()); pose++)
            {
                if (campos_temp.at(pose).z < 0.2 && campos_temp.at(pose).z > -0.2 && campos_temp.at(pose).y < -y_range)
                    campos.push_back(campos_temp.at(pose));
            }
        }
        else if (view_region == 2)
        {
            for (int pose = 0; pose < static_cast<int>(campos_temp.size()); pose++)
            {
                if (campos_temp.at(pose).z < 0.2 && campos_temp.at(pose).z > -0.2 && campos_temp.at(pose).y > y_range)
                    campos.push_back(campos_temp.at(pose));
            }
        }
        else
        {
            for (int pose = 0; pose < static_cast<int>(campos_temp.size()); pose++)
            {
                if (campos_temp.at(pose).z < 0.2 && campos_temp.at(pose).z > -0.6)
                    campos.push_back(campos_temp.at(pose));
            }
        }
    }
    std::fstream imglabel;
    imglabel.open(labeldir.c_str(), fstream::app|fstream::out);
    bool camera_pov = true;
    /* Create a window using viz. */
    viz::Viz3d myWindow("Coordinate Frame");
    /* Set window size. */
    myWindow.setWindowSize(Size(image_size,image_size));
    /* Set background color. */
    myWindow.setBackgroundColor(viz::Color::gray());
    myWindow.spinOnce();
    /* Create a Mesh widget, loading .ply models. */
    viz::Mesh objmesh = viz::Mesh::load(plymodel);
    /* Get the center of the generated mesh widget. Some .ply files need this; it can be skipped when using the PASCAL database. */
    Point3d cam_focal_point;
    if (center_gen)
        cam_focal_point = ViewSphere.getCenter(objmesh.cloud);
    else
        cam_focal_point = Point3d(0,0,0);
    const char* headerPath = "../data/header_for_";
    const char* binaryPath = "../data/binary_";
    if (binary_out)
    {
        ViewSphere.createHeader(static_cast<int>(campos.size()), image_size, image_size, headerPath);
    }
    float radius = ViewSphere.getRadius(objmesh.cloud, cam_focal_point);
    objmesh.cloud = objmesh.cloud/radius*100;
    cam_focal_point = cam_focal_point/radius*100;
    Point3d cam_y_dir;
    cam_y_dir.x = cam_head_x;
    cam_y_dir.y = cam_head_y;
    cam_y_dir.z = cam_head_z;
    char temp[1024];
    std::vector<String> name_bkg;
    if (bakgrdir.size() != 0)
    {
        /* List the file names under a given path */
        listDir(bakgrdir.c_str(), name_bkg, false);
        for (unsigned int i = 0; i < name_bkg.size(); i++)
        {
            name_bkg.at(i) = bakgrdir + name_bkg.at(i);
        }
    }
    /* Images will be saved as .png files. */
    size_t cnt_img;
    srand((int)time(0));
    do
    {
        cnt_img = 0;
        for(int pose = 0; pose < static_cast<int>(campos.size()); pose++){
            /* Add light. */
            // double alpha1 = rand()%(314/2)/100;
            // double alpha2 = rand()%(314*2)/100;
            // printf("%f %f %f\n", ceil(10000*sqrt(1 - sin(alpha1)*sin(alpha1))*sin(alpha2)), 10000*sqrt(1 - sin(alpha1)*sin(alpha1))*cos(alpha2), sin(alpha1)*10000);
            // myWindow.addLight(Vec3d(10000*sqrt(1 - sin(alpha1)*sin(alpha1))*sin(alpha2),10000*sqrt(1 - sin(alpha1)*sin(alpha1))*cos(alpha2),sin(alpha1)*10000), Vec3d(0,0,0), viz::Color::white(), viz::Color::white(), viz::Color::black(), viz::Color::white());
            int label_x, label_y, label_z;
            label_x = static_cast<int>(campos.at(pose).x*100);
            label_y = static_cast<int>(campos.at(pose).y*100);
            label_z = static_cast<int>(campos.at(pose).z*100);
            sprintf (temp,"%02i_%02i_%04i_%04i_%04i_%02i", label_class, label_item, label_x, label_y, label_z, static_cast<int>(obj_dist/100));
            String filename = temp;
            filename += ".png";
            imglabel << filename << ' ' << label_class << endl;
            filename = imagedir + filename;
            /* Get the pose of the camera using makeCameraPose. */
            if (view_region != 0)
            {
                cam_focal_point.x = cam_focal_point.y - label_x/5;
            }
            Affine3f cam_pose = viz::makeCameraPose(campos.at(pose)*obj_dist+cam_focal_point, cam_focal_point, cam_y_dir*obj_dist+cam_focal_point);
            /* Get the transformation matrix from camera coordinate system to global. */
            Affine3f transform = viz::makeTransformToGlobal(Vec3f(1.0f,0.0f,0.0f), Vec3f(0.0f,1.0f,0.0f), Vec3f(0.0f,0.0f,1.0f), campos.at(pose));
            viz::WMesh mesh_widget(objmesh);
            /* Pose of the widget in camera frame. */
            Affine3f cloud_pose = Affine3f().translate(Vec3f(1.0f,1.0f,1.0f));
            /* Pose of the widget in global frame. */
            Affine3f cloud_pose_global = transform * cloud_pose;
            /* Visualize camera frame. */
            if (!camera_pov)
            {
                viz::WCameraPosition cpw(1); // Coordinate axes
                viz::WCameraPosition cpw_frustum(Vec2f(0.5, 0.5)); // Camera frustum
                myWindow.showWidget("CPW", cpw, cam_pose);
                myWindow.showWidget("CPW_FRUSTUM", cpw_frustum, cam_pose);
            }
            /* Visualize widget. */
            if (bakgrdir.size() != 0)
            {
                cv::Mat img_bg = cv::imread(name_bkg.at(rand()%name_bkg.size()));
                /* The background image is placed at twice the camera viewing distance. */
                cv::viz::WImage3D background_widget(img_bg, Size2d(image_size*4.2, image_size*4.2), Vec3d(-campos.at(pose)*bg_dist+cam_focal_point), Vec3d(campos.at(pose)*bg_dist-cam_focal_point), Vec3d(0,0,-1)*bg_dist+Vec3d(0,2*cam_focal_point.y,0));
                myWindow.showWidget("bgwidget", background_widget, cloud_pose_global);
            }
            // mesh_widget.setRenderingProperty(viz::LINE_WIDTH, 4.0);
            myWindow.showWidget("targetwidget", mesh_widget, cloud_pose_global);
            /* Set the viewer pose to that of camera. */
            if (camera_pov)
                myWindow.setViewerPose(cam_pose);
            /* Save screen shot as images. */
            myWindow.saveScreenshot(filename);
            if (binary_out)
            {
                /* Write images into binary files for further use in CNN training. */
                ViewSphere.writeBinaryfile(filename, binaryPath, headerPath,static_cast<int>(campos.size())*num_class, label_class, static_cast<int>(campos.at(pose).x*100), static_cast<int>(campos.at(pose).y*100), static_cast<int>(campos.at(pose).z*100), rgb_use);
            }
            cnt_img++;
        }
    } while (cnt_img != campos.size());
    imglabel.close();
    return 1;
}
```
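The view-region branches in the listing above all apply the same test: a candidate camera position is kept only if its z coordinate lies inside a band and, for regions 1 and 2, its y coordinate lies beyond a threshold. The following is a minimal standalone sketch of that idea. `Vec3`, `RegionFilter` and `filterPositions` are illustrative names that do not exist in the cnn_3dobj module, and the sketch treats the lower z bound as inclusive, which matches only the half-sphere branches of the listing.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for cv::Point3d so the sketch needs no OpenCV.
struct Vec3 { double x, y, z; };

// Illustrative description of one view region (not an OpenCV type).
struct RegionFilter {
    double z_min;   // inclusive lower bound on z
    double z_max;   // exclusive upper bound on z
    int    y_side;  // -1: keep y < -y_range, +1: keep y > y_range, 0: ignore y
    double y_range; // non-negative y threshold
};

// Keep only the candidate camera positions that fall inside the region.
std::vector<Vec3> filterPositions(const std::vector<Vec3>& candidates,
                                  const RegionFilter& f)
{
    assert(f.z_min < f.z_max && f.y_range >= 0.0);
    std::vector<Vec3> kept;
    for (const Vec3& p : candidates) {
        const bool z_ok = p.z >= f.z_min && p.z < f.z_max;
        const bool y_ok = (f.y_side == 0)
                       || (f.y_side < 0 && p.y < -f.y_range)
                       || (f.y_side > 0 && p.y >  f.y_range);
        if (z_ok && y_ok)
            kept.push_back(p);
    }
    return kept;
}
```

Expressing each region as data rather than as a separate loop would let the six near-identical loops of the listing collapse into one call; whether that is worth changing in the sample itself is a matter of taste.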

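## 6.1.3	Code Explanation
The program in Listing 6-1 is explained step by step below.

-	Create a window.

```cpp
// Listing 6-2: Creating the window
viz::Viz3d myWindow("Coordinate Frame");
```
-	Set the window size to 64×64, the default used in this tutorial: `myWindow.setWindowSize(Size(64,64));`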

-	Add the coordinate axes.
```cpp
// Listing 6-3: Adding coordinate axes to the window
myWindow.showWidget("Coordinate Widget", viz::WCoordinateSystem());
myWindow.setBackgroundColor(viz::Color::gray());
myWindow.spin();
```

-	Create a Mesh widget and load the .ply model.
```cpp
// Listing 6-4: Loading the .ply model
viz::Mesh objmesh = viz::Mesh::load(plymodel);
```

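-	Get the center of the generated mesh widget. Some .ply files need this step; it can be skipped for models from the PASCAL database.
```cpp
// Listing 6-5: Getting the center of the widget
Point3d cam_focal_point = ViewSphere.getCenter(objmesh.cloud);
```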
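The full listing follows `getCenter` with `getRadius` and then rescales both the cloud and the focal point by `100/radius`. The sketch below shows that normalization without OpenCV. It assumes that the center is the mean of all vertices and that the radius is the largest distance from the center to a vertex; these are assumptions about the helper functions, not documented guarantees, and `Point3`, `centerOf`, `radiusOf` and `rescaleToRadius` are illustrative names.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical stand-in for cv::Point3d.
struct Point3 { double x, y, z; };

// Mean of all vertices (assumed behaviour of getCenter).
Point3 centerOf(const std::vector<Point3>& cloud)
{
    assert(!cloud.empty());
    Point3 c{0.0, 0.0, 0.0};
    for (const Point3& p : cloud) { c.x += p.x; c.y += p.y; c.z += p.z; }
    const double n = static_cast<double>(cloud.size());
    return Point3{c.x / n, c.y / n, c.z / n};
}

// Largest distance from the center to any vertex (assumed behaviour of getRadius).
double radiusOf(const std::vector<Point3>& cloud, const Point3& center)
{
    double r = 0.0;
    for (const Point3& p : cloud) {
        const double dx = p.x - center.x, dy = p.y - center.y, dz = p.z - center.z;
        r = std::max(r, std::sqrt(dx * dx + dy * dy + dz * dz));
    }
    return r;
}

// Rescale cloud and center in place so that the radius becomes `target`
// (the listing uses 100).
void rescaleToRadius(std::vector<Point3>& cloud, Point3& center, double target)
{
    const double r = radiusOf(cloud, center);
    assert(r > 0.0);
    const double s = target / r;
    for (Point3& p : cloud) { p.x *= s; p.y *= s; p.z *= s; }
    center.x *= s; center.y *= s; center.z *= s;
}
```

Normalizing every model to the same radius means a single camera distance frames objects of very different physical size in the same way.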

-	Get the camera pose with makeCameraPose.
```cpp
// Listing 6-6: Getting the camera pose
Affine3f cam_pose = viz::makeCameraPose(campos.at(pose) * radius+cam_focal_point, cam_focal_point, cam_y_dir * radius+cam_focal_point);
```
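`makeCameraPose` takes a camera position, a focal point and a y direction. The sketch below shows the usual look-at construction that turns these three inputs into an orthonormal camera basis. It is assumed, not verified here, that `viz::makeCameraPose` follows the same convention; `V3`, `CameraAxes` and `lookAt` are illustrative names only.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-in for cv::Vec3d.
struct V3 { double x, y, z; };

V3 sub(const V3& a, const V3& b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }

V3 cross(const V3& a, const V3& b)
{
    return {a.y * b.z - a.z * b.y,
            a.z * b.x - a.x * b.z,
            a.x * b.y - a.y * b.x};
}

V3 unit(const V3& v)
{
    const double len = std::sqrt(v.x * v.x + v.y * v.y + v.z * v.z);
    assert(len > 0.0);
    return {v.x / len, v.y / len, v.z / len};
}

// Camera axes expressed in world coordinates.
struct CameraAxes { V3 right, up, forward; };

CameraAxes lookAt(const V3& position, const V3& focal_point, const V3& y_dir)
{
    const V3 forward = unit(sub(focal_point, position)); // viewing direction
    const V3 right   = unit(cross(y_dir, forward));      // perpendicular to y_dir and forward
    const V3 up      = cross(forward, right);            // completes the right-handed basis
    return {right, up, forward};
}
```

For a camera at (0, 0, -1) looking at the origin with y direction (0, 1, 0), this yields the axes (1, 0, 0), (0, 1, 0) and (0, 0, 1). The construction breaks down when `y_dir` is parallel to the viewing direction, which is why the sketch asserts a non-zero length before normalizing.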

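-	Get the transformation matrix from the camera coordinate system to the global coordinate system.
```cpp
// Listing 6-7: Getting the transformation matrix
Affine3f transform = viz::makeTransformToGlobal(Vec3f(1.0f,0.0f,0.0f), Vec3f(0.0f,1.0f,0.0f), Vec3f(0.0f,0.0f,1.0f), campos.at(pose));
viz::WMesh mesh_widget(objmesh);
```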

-	Save the screenshot as an image.
```cpp
// Listing 6-8: Taking a screenshot
myWindow.saveScreenshot(filename);
```

-	Write the images to a binary file for later use in CNN training.
```cpp
// Listing 6-9: Generating the binary file
ViewSphere.writeBinaryfile(filename, binaryPath, headerPath,(int)campos.size()*num_class, label_class, (int)(campos.at(pose).x*100), (int)(campos.at(pose).y*100), (int)(campos.at(pose).z*100), rgb_use);
```
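The file name handed to `saveScreenshot` is built inside the loop with `sprintf` and the format `%02i_%02i_%04i_%04i_%04i_%02i`, so every image name encodes its class, item, camera position and distance. A minimal sketch of that naming scheme follows; `makeLabelName` is a hypothetical helper, and it uses `snprintf` instead of `sprintf` so the buffer cannot overflow.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Build "<class>_<item>_<x>_<y>_<z>_<dist>.png" from integer labels,
// using the same printf format as the listing.
std::string makeLabelName(int label_class, int label_item,
                          int label_x, int label_y, int label_z, int dist)
{
    char buf[128];
    const int n = std::snprintf(buf, sizeof(buf), "%02i_%02i_%04i_%04i_%04i_%02i",
                                label_class, label_item, label_x, label_y, label_z, dist);
    assert(n > 0 && n < static_cast<int>(sizeof(buf)));
    (void)n; // silence the unused-variable warning when NDEBUG disables assert
    return std::string(buf) + ".png";
}
```

Note that a negative coordinate keeps the field width of four characters including the sign, so `-30` is written as `-030`. Any code that parses these names back has to allow for the minus sign.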


================================================
FILE: chapter 6/分析训练模型.md
================================================
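This tutorial shows how to analyze the performance of a trained model.
## 6.3.1	Goal
In this tutorial we learn how to analyze the performance of a trained model. It covers:

-	How to extract features from specific images
-	How to compare the extracted features in a meaningful way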

## 6.3.2	C++ Code
```cpp
// Listing 6-18
/*
 * Software License Agreement (BSD License)
 *
 *  Copyright (c) 2009, Willow Garage, Inc.
 *  All rights reserved.
 *
 *  Redistribution and use in source and binary forms, with or without
 *  modification, are permitted provided that the following conditions
 *  are met:
 *
 *   * Redistributions of source code must retain the above copyright
 *     notice, this list of conditions and the following disclaimer.
 *   * Redistributions in binary form must reproduce the above
 *     copyright notice, this list of conditions and the following
 *     disclaimer in the documentation and/or other materials provided
 *     with the distribution.
 *   * Neither the name of Willow Garage, Inc. nor the names of its
 *     contributors may be used to endorse or promote products derived
 *     from this software without specific prior written permission.
 *
 *  THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
 *  "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
 *  LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
 *  FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
 *  COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
 *  INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
 *  BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
 *  LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
 *  CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
 *  LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN
 *  ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
 *  POSSIBILITY OF SUCH DAMAGE.
 *
 */
#include <cstdio>
#include <cstring>
#include <iostream>
#include <vector>
#include "opencv2/imgproc.hpp"
#include "opencv2/cnn_3dobj.hpp"
using namespace cv;
using namespace cv::cnn_3dobj;
int main(int argc, char** argv)
{
    const String keys = "{help | | this demo will have an analysis on the trained model, it will print information about whether the model is suitable for setting different classes apart and also discriminant on object pose at the same time.}"
"{caffemodel | ../../testdata/cv/3d_triplet_iter_30000.caffemodel | caffe model for feature extraction.}"
"{network_forIMG | ../../testdata/cv/3d_triplet_testIMG.prototxt | Network definition file used for extracting feature from a single image and making a classification}"
"{mean_file | no | The mean file generated by Caffe from all gallery images, this could be used for mean value subtraction from all images. If you want to use the mean file, you can set this as ../data/images_mean/triplet_mean.binaryproto.}"
"{target_img | ../data/images_all/4_78.png | Path of image in reference.}"
"{ref_img1 | ../data/images_all/4_79.png | Path of closest image.}"
"{ref_img2 | ../data/images_all/4_87.png | Path of less closer image in the same class with reference image.}"
"{ref_img3 | ../data/images_all/3_78.png | Path of image with the same pose in another class.}"
"{feature_blob | feat | Name of layer which will represent as the feature, in this network, ip1 or feat is well.}"
"{device | CPU | device}"
"{dev_id | 0 | dev_id}";
    /* Get parameters from command line. */
    cv::CommandLineParser parser(argc, argv, keys);
    parser.about("Demo for object data classification and pose estimation");
    if (parser.has("help"))
    {
        parser.printMessage();
        return 0;
    }
    String caffemodel = parser.get<String>("caffemodel");
    String network_forIMG = parser.get<String>("network_forIMG");
    String mean_file = parser.get<String>("mean_file");
    String target_img = parser.get<String>("target_img");
    String ref_img1 = parser.get<String>("ref_img1");
    String ref_img2 = parser.get<String>("ref_img2");
    String ref_img3 = parser.get<String>("ref_img3");
    String feature_blob = parser.get<String>("feature_blob");
    String device = parser.get<String>("device");
    int dev_id = parser.get<int>("dev_id");
    std::vector<String> ref_img;
    /* Sample of the same class that is closest in pose
    * to the reference image.
    */
    ref_img.push_back(ref_img1);
    /* Sample of the same class that is less close in pose
    * to the reference image.
    */
    ref_img.push_back(ref_img2);
    /* Sample that is very close in pose to the reference image
    * but belongs to another class.
    */
    ref_img.push_back(ref_img3);
    /* Initialize a net work with Device. */
    cv::cnn_3dobj::descriptorExtractor descriptor(device, dev_id);
    /* Load net with the caffe trained net work parameter and structure. */
    if (strcmp(mean_file.c_str(), "no") == 0)
        descriptor.loadNet(network_forIMG, caffemodel);
    else
        descriptor.loadNet(network_forIMG, caffemodel, mean_file);
    cv::Mat img_base = cv::imread(target_img, -1);
    if (img_base.empty())
    {
        /* Stop here: extracting features from an empty image would fail later. */
        printf("could not read target image %s, make sure the image paths are set properly.\n", target_img.c_str());
        return 1;
    }
    std::vector<cv::Mat> img;
    for (unsigned int i = 0; i < ref_img.size(); i++)
    {
        img.push_back(cv::imread(ref_img[i], -1));
        if (img[i].empty())
        {
            printf("could not read reference image %s, make sure the image paths are set properly.\n", ref_img[i].c_str());
            return 1;
        }
    }
    cv::Mat feature_test;
    descriptor.extract(img_base, feature_test, feature_blob);
    if (feature_test.empty())
    {
        printf("could not extract feature from test image which is read into cv::Mat.\n");
        return 1;
    }
    cv::Mat feature_reference;
    descriptor.extract(img, feature_reference, feature_blob);
    if (feature_reference.empty())
    {
        printf("could not extract feature from reference images which is already stored in vector<cv::Mat>.\n");
        return 1;
    }
    std::vector<float> matches;
    for (int i = 0; i < feature_reference.rows; i++)
    {
        cv::Mat distance = feature_test-feature_reference.row(i);
        matches.push_back(cv::norm(distance));
    }
    bool pose_pass = false;
    bool class_pass = false;
    /* Compare the distances between the reference image and the 3 other images:
    * the distance to the closest sample should be the smallest and
    * the distance to the sample from another class should be the largest.
    */
    if (matches[0] < matches[1] && matches[0] < matches[2])
        pose_pass = true;
    if (matches[1] < matches[2])
        class_pass = true;
    if (!pose_pass)
    {
        printf("\n =========== Model %s ========== \nIs not trained properly that the similar pose could not be tell from a cluster of features.\n", caffemodel.c_str());
    }
    else if (!class_pass)
    {
        printf("\n =========== Model %s ========== \nIs not trained properly that feature from the same class is not discriminant from the one of another class with similar pose.\n", caffemodel.c_str());
    }
    else
    {
        printf("\n =========== Model %s ========== \nSuits for setting different classes apart and also discriminant on object pose at the same time.\n", caffemodel.c_str());
    }
    return 0;
}
```

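## 6.3.3	Code Explanation
The program in Listing 6-18 is explained step by step below.

-	A sample of the same class that is closest in pose to the reference image.

```cpp
// Listing 6-19: Most similar
ref_img.push_back(ref_img1);
```

-	A sample of the same class that is less close in pose to the reference image.

```cpp
// Listing 6-20: Same class, less similar
ref_img.push_back(ref_img2);
```

-	A sample that is very close in pose to the reference image but belongs to another class.

```cpp
// Listing 6-21: Similar pose, different class
ref_img.push_back(ref_img3);
```

-	Initialize the network with the chosen device.
```cpp
// Listing 6-22: Initializing the network
cv::cnn_3dobj::descriptorExtractor descriptor(device, dev_id);
```

-	Load the network structure and the parameters trained with Caffe.
```cpp
// Listing 6-23: Loading the network
if (strcmp(mean_file.c_str(), "no") == 0)
    descriptor.loadNet(network_forIMG, caffemodel);
else
    descriptor.loadNet(network_forIMG, caffemodel, mean_file);
```

-	Compare the distances between the reference (target) image and the three other sample images. The sample closest in pose should have the smallest distance, and the sample from another class should have the largest.
```cpp
// Listing 6-24: Comparing the results
if (matches[0] < matches[1] && matches[0] < matches[2])
    pose_pass = true;
if (matches[1] < matches[2])
    class_pass = true;
```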
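The comparison above can be tried without a trained network by feeding in hand-written feature vectors. The sketch below reproduces the two checks on plain `std::vector<float>` data. `l2Distance`, `Verdict` and `analyse` are illustrative names, and the L2 distance stands in for the `cv::norm` call of the full listing.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Euclidean (L2) distance between two feature vectors of equal length.
float l2Distance(const std::vector<float>& a, const std::vector<float>& b)
{
    assert(a.size() == b.size());
    double sum = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const double d = static_cast<double>(a[i]) - b[i];
        sum += d * d;
    }
    return static_cast<float>(std::sqrt(sum));
}

struct Verdict { bool pose_pass; bool class_pass; };

// refs[0]: same class, closest pose; refs[1]: same class, farther pose;
// refs[2]: other class, similar pose.
Verdict analyse(const std::vector<float>& target,
                const std::vector<std::vector<float> >& refs)
{
    assert(refs.size() == 3);
    std::vector<float> matches;
    for (const std::vector<float>& r : refs)
        matches.push_back(l2Distance(target, r));
    Verdict v;
    v.pose_pass  = matches[0] < matches[1] && matches[0] < matches[2];
    v.class_pass = matches[1] < matches[2];
    return v;
}
```

With a target at the origin and references at distances 1, 2 and 5, both checks pass. Swapping the first two references makes `pose_pass` fail while `class_pass` still holds, which corresponds to the first error message of the demo.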


================================================
FILE: chapter 6/分类.md
================================================
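## 6.2.1	Goal
In this tutorial we learn how to extract features from images and use the descriptors for prediction. It covers:

1.	How to extract features from a single image
2.	How to extract features from the images under a given root path
3.	How to make a prediction from reference images and a target image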

## 6.2.2	C++ Code
```cpp
// Listing 6-10
/*
 * Software License Agreement (BSD License)
 *
 *  Copyright (c) 2009, Willow Garage, Inc.
 *  All rights reserved.
 *
 *  Redistribution and use in source and binary forms, with or without
 *  modification, are permitted provided that the following conditions
 *  are met:
 *
 *   * Redistributions of source code must retain the above copyright
 *     notice, this list of conditions and the following disclaimer.
 *   * Redistributions in binary form must reproduce the above
 *     copyright notice, this list of conditions and the following
 *     disclaimer in the documentation and/or other materials provided
 *     with the distribution.
 *   * Neither the name of Willow Garage, Inc. nor the names of its
 *     contributors may be used to endorse or promote products derived
 *     from this software without specific prior written permission.
 *
 *  THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
 *  "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
 *  LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
 *  FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
 *  COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
 *  INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
 *  BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
 *  LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
 *  CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
 *  LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN
 *  ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
 *  POSSIBILITY OF SUCH DAMAGE.
 *
 */
#include <opencv2/cnn_3dobj.hpp>
#include <opencv2/features2d.hpp>
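#include <dirent.h>
#include <algorithm>
#include <cstdio>
#include <cstring>
#include <fstream>
#include <iomanip>
#include <iostream>
#include <vector>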
using namespace cv;
using namespace std;
using namespace cv::cnn_3dobj;
static void listDir(const char *path, std::vector<String>& files, bool r)
{
    DIR *pDir;
    struct dirent *ent;
    char childpath[512];
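    pDir = opendir(path);
    /* opendir returns NULL when the directory cannot be opened. */
    if (pDir == NULL)
    {
        cerr << "failed to open the directory : " << path << endl;
        return;
    }
    memset(childpath, 0, sizeof(childpath));
    while ((ent = readdir(pDir)) != NULL)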
    {
        if (ent->d_type & DT_DIR)
        {
            if (strcmp(ent->d_name, ".") == 0 || strcmp(ent->d_name, "..") == 0 || strcmp(ent->d_name, ".DS_Store") == 0)
            {
                continue;
            }
            if (r)
            {
                snprintf(childpath, sizeof(childpath), "%s/%s", path, ent->d_name);
                listDir(childpath,files,false);
            }
        }
        else
        {
            if (strcmp(ent->d_name, ".DS_Store") != 0)
                files.push_back(ent->d_name);
        }
    }
    /* Release the directory handle before returning. */
    closedir(pDir);
    sort(files.begin(),files.end());
}
static int featureWrite(const Mat &features, const String &fname)
{
    ofstream ouF;
    ouF.open(fname.c_str(), std::ofstream::binary);
    if (!ouF)
    {
        cerr << "failed to open the file : " << fname << endl;
        return 0;
    }
    for (int r = 0; r < features.rows; r++)
    {
        ouF.write(reinterpret_cast<const char*>(features.ptr(r)), features.cols*features.elemSize());
    }
    ouF.close();
    return 1;
}
int main(int argc, char** argv)
{
    const String keys = "{help | | This sample will extract features from reference images and target image for classification. You can add a mean_file if there is little variance in data such as human faces, otherwise it is not so useful}"
    "{src_dir | ../data/images_all/ | Source directory of the images ready for being used for extract feature as gallery.}"
    "{caffemodel | ../../testdata/cv/3d_triplet_iter_30000.caffemodel | caffe model for feature extraction.}"
    "{network_forIMG | ../../testdata/cv/3d_triplet_testIMG.prototxt | Network definition file used for extracting feature from a single image and making a classification}"
    "{mean_file | no | The mean file generated by Caffe from all gallery images, this could be used for mean value subtraction from all images. If you want to use the mean file, you can set this as ../data/images_mean/triplet_mean.binaryproto.}"
    "{target_img | ../data/images_all/4_78.png | Path of image waiting to be classified.}"
    "{feature_blob | feat | Name of layer which will represent as the feature, in this network, ip1 or feat is well.}"
    "{num_candidate | 15 | Number of candidates in gallery as the prediction result.}"
    "{device | CPU | Device type: CPU or GPU}"
    "{dev_id | 0 | Device id}"
    "{gallery_out | 0 | Option on output binary features on gallery images}";
    /* get parameters from command line */
    cv::CommandLineParser parser(argc, argv, keys);
    parser.about("Feature extraction and classification");
    if (parser.has("help"))
    {
        parser.printMessage();
        return 0;
    }
    String src_dir = parser.get<String>("src_dir");
    String caffemodel = parser.get<String>("caffemodel");
    String network_forIMG   = parser.get<String>("network_forIMG");
    String mean_file    = parser.get<String>("mean_file");
    String target_img   = parser.get<String>("target_img");
    String feature_blob = parser.get<String>("feature_blob");
    int num_candidate = parser.get<int>("num_candidate");
    String device = parser.get<String>("device");
    int gallery_out = parser.get<int>("gallery_out");
    /* Initialize a net work with Device */
    cv::cnn_3dobj::descriptorExtractor descriptor(device);
    std::cout << "Using " << descriptor.getDeviceType() << std::endl;
    /* Load net with the caffe trained net work parameter and structure */
    if (strcmp(mean_file.c_str(), "no") == 0)
        descriptor.loadNet(network_forIMG, caffemodel);
    else
        descriptor.loadNet(network_forIMG, caffemodel, mean_file);
    std::vector<String> name_gallery;
    /* List the file names under a given path */
    listDir(src_dir.c_str(), name_gallery, false);
    if (gallery_out)
    {
        ofstream namelist_out("gallelist.txt");
        /* Writing name of the reference images. */
        for (unsigned int i = 0; i < name_gallery.size(); i++)
            namelist_out << name_gallery.at(i) << endl;
    }
    for (unsigned int i = 0; i < name_gallery.size(); i++)
    {
        name_gallery[i] = src_dir + name_gallery[i];
    }
    std::vector<cv::Mat> img_gallery;
    cv::Mat feature_reference;
    for (unsigned int i = 0; i < name_gallery.size(); i++)
    {
        img_gallery.push_back(cv::imread(name_gallery[i]));
    }
    /* Extract feature from a set of images */
    descriptor.extract(img_gallery, feature_reference, feature_blob);
    if (gallery_out)
    {
        std::cout << std::endl << "---------- Features of gallery images ----------" << std::endl;
        /* Print features of the reference images. */
        for (int i = 0; i < feature_reference.rows; i++)
            std::cout << feature_reference.row(i) << endl;
        std::cout << std::endl << "---------- Saving features of gallery images into feature.bin ----------" << std::endl;
        featureWrite(feature_reference, "feature.bin");
    }
    else
    {
        std::cout << std::endl << "---------- Prediction for " << target_img << " ----------" << std::endl;
        cv::Mat img = cv::imread(target_img);
        std::cout << std::endl << "---------- Features of gallery images ----------" << std::endl;
        std::vector<std::pair<String, float> > prediction;
        /* Print features of the reference images. */
        for (int i = 0; i < feature_reference.rows; i++)
            std::cout << feature_reference.row(i) << endl;
        cv::Mat feature_test;
        descriptor.extract(img, feature_test, feature_blob);
        /* Initialize a matcher which using L2 distance. */
        cv::BFMatcher matcher(NORM_L2);
        std::vector<std::vector<cv::DMatch> > matches;
        /* Have a KNN match on the target and reference images. */
        matcher.knnMatch(feature_test, feature_reference, matches, num_candidate);
        /* Print feature of the target image waiting to be classified. */
        std::cout << std::endl << "---------- Features of target image: " << target_img << "----------" << endl << feature_test << std::endl;
        /* Print the top N prediction. */
        std::cout << std::endl << "---------- Prediction result(Distance - File Name in Gallery) ----------" << std::endl;
        for (size_t i = 0; i < matches[0].size(); ++i)
        {
            std::cout << i << " - " << std::fixed << std::setprecision(2) << name_gallery[matches[0][i].trainIdx] << " - \""  << matches[0][i].distance << "\"" << std::endl;
        }
    }
    return 0;
}
```

## 6.2.3	Code Explanation
The program in Listing 6-10 is explained step by step below.
-	Initialize the network with the chosen device
```cpp
// Listing 6-11: Initializing the network
cv::cnn_3dobj::descriptorExtractor descriptor(device);
```

-	Load the network structure and the parameters trained with Caffe.
```cpp
// Listing 6-12: Loading the network
if (strcmp(mean_file.c_str(), "no") == 0)
    descriptor.loadNet(network_forIMG, caffemodel);
else
    descriptor.loadNet(network_forIMG, caffemodel, mean_file);
```

-	List the file names under the given path.
```cpp
// Listing 6-13: Getting the file names
listDir(src_dir.c_str(), name_gallery, false);
for (unsigned int i = 0; i < name_gallery.size(); i++)
{
    name_gallery[i] = src_dir + name_gallery[i];
}
```

-	Extract features from a set of images
```cpp
// Listing 6-14: Extracting features
descriptor.extract(img_gallery, feature_reference, feature_blob);
```

-	Initialize a matcher that uses the L2 distance
```cpp
// Listing 6-15: Initializing the matcher
cv::BFMatcher matcher(NORM_L2);
std::vector<std::vector<cv::DMatch> > matches;
```

-	Run a KNN match between the target image and the reference images
```cpp
// Listing 6-16: KNN matching
matcher.knnMatch(feature_test, feature_reference, matches,num_candidate);
```
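Conceptually, a brute-force KNN match with the L2 norm computes the distance from the query feature to every gallery feature, sorts the results and keeps the `k` closest. The sketch below spells this out on plain vectors. `Neighbor` and `knnL2` are illustrative names, and the code makes no claim about how `cv::BFMatcher` is implemented internally.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical, simplified counterpart of cv::DMatch.
struct Neighbor { std::size_t index; float distance; };

// Return the k gallery entries closest to the query, nearest first.
std::vector<Neighbor> knnL2(const std::vector<float>& query,
                            const std::vector<std::vector<float> >& gallery,
                            std::size_t k)
{
    std::vector<Neighbor> all;
    for (std::size_t i = 0; i < gallery.size(); ++i) {
        assert(gallery[i].size() == query.size());
        double sum = 0.0;
        for (std::size_t j = 0; j < query.size(); ++j) {
            const double d = static_cast<double>(query[j]) - gallery[i][j];
            sum += d * d;
        }
        all.push_back(Neighbor{i, static_cast<float>(std::sqrt(sum))});
    }
    // Sort by ascending distance; stable so that ties keep gallery order.
    std::stable_sort(all.begin(), all.end(),
                     [](const Neighbor& a, const Neighbor& b) { return a.distance < b.distance; });
    if (all.size() > k)
        all.resize(k);
    return all;
}
```

The returned `index` plays the role of `trainIdx` in the full listing: it selects the gallery file name that is printed next to each distance.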

-	Print the features of the target image
```cpp
// Listing 6-17: Printing the result
std::cout << std::endl << "---------- Features of target image: " << target_img << "----------" << endl << feature_test << std::endl;
```


================================================
FILE: chapter 7/3
================================================


================================================
FILE: chapter 7/计算机视觉应用的交互式可视化调试.md
================================================
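What is the most common way to debug computer vision applications? Usually the answer is temporary, hacked-together custom code that has to be removed again before the release build.

An IDE makes it easy to set breakpoints in a program, but without a powerful IDE such as Visual Studio, debugging a complete program becomes very difficult. This section therefore shows how to debug programs on Linux with the cvv module.

## 7.1.1	Goal
In this tutorial we show how to use the visual debugging features of the cvv module (`opencv2/cvv.hpp`). It covers:

1.	Adding cvv debug calls to your application
2.	Using the visual debug GUI
3.	Enabling and disabling visual debugging at compile time (with no resource overhead when disabled)

## 7.1.2	C++ Code
The code in Listing 7-1 does the following:

-	Captures images (video), for example from a webcam
-	Applies filters to each image (imgproc)
-	Detects image features and matches them against the previous image (features2d)

If the program is compiled without visual debugging (see the CMakeLists.txt below), the only result is some information printed on the command line. We want to demonstrate how much debugging or development functionality a few lines of cvv calls can add.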

```cpp
// Listing 7-1
// system includes
 #include <algorithm>
 #include <cstdio>
 #include <iostream>
 #include <sstream>
 #include <string>
 #include <vector>
 
 // library includes
 #include <opencv2/imgproc.hpp>
 #include <opencv2/features2d.hpp>
 #include <opencv2/imgproc/types_c.h>
 #include <opencv2/videoio.hpp>
 #include <opencv2/videoio/videoio_c.h>
 
 #define CVVISUAL_DEBUGMODE
 #include <opencv2/cvv/debug_mode.hpp>
 #include <opencv2/cvv/show_image.hpp>
 #include <opencv2/cvv/filter.hpp>
 #include <opencv2/cvv/dmatch.hpp>
 #include <opencv2/cvv/final_show.hpp>
 
 using namespace std;
 using namespace cv;
 
 template<class T> std::string toString(const T& p_arg)
 {
   std::stringstream ss;
 
   ss << p_arg;
 
   return ss.str();
 }
 
 
 int
 main(int argc, char** argv)
 {
   cv::Size* resolution = nullptr;
 
   // parser keys
   const char *keys =
       "{ help h usage ?  |   | show this message }"
       "{ width W         |  0| camera resolution width. leave at 0 to use defaults }"
       "{ height H        |  0| camera resolution height. leave at 0 to use defaults }";
 
   CommandLineParser parser(argc, argv, keys);
   if (parser.has("help")) {
     parser.printMessage();
     return 0;
   }
   int res_w = parser.get<int>("width");
   int res_h = parser.get<int>("height");
 
   // setup video capture
   cv::VideoCapture capture(0);
   if (!capture.isOpened()) {
     std::cout << "Could not open VideoCapture" << std::endl;
     return 1;
   }
 
   if (res_w>0 && res_h>0) {
     printf("Setting resolution to %dx%d\n", res_w, res_h);
     capture.set(CV_CAP_PROP_FRAME_WIDTH, res_w);
     capture.set(CV_CAP_PROP_FRAME_HEIGHT, res_h);
   }
 
 
   cv::Mat prevImgGray;
   std::vector<cv::KeyPoint> prevKeypoints;
   cv::Mat prevDescriptors;
 
   int maxFeatureCount = 500;
   Ptr<ORB> detector = ORB::create(maxFeatureCount);
 
   cv::BFMatcher matcher(cv::NORM_HAMMING);
 
   for (int imgId = 0; imgId < 10; imgId++) {
     // capture a frame
     cv::Mat imgRead;
     capture >> imgRead;
     printf("%d: image captured\n", imgId);
 
     std::string imgIdString{"imgRead"};
     imgIdString += toString(imgId);
     cvv::showImage(imgRead, CVVISUAL_LOCATION, imgIdString.c_str());
 
     // convert to grayscale
     cv::Mat imgGray;
     cv::cvtColor(imgRead, imgGray, COLOR_BGR2GRAY);
     cvv::debugFilter(imgRead, imgGray, CVVISUAL_LOCATION, "to gray");
 
     // detect ORB features
     std::vector<cv::KeyPoint> keypoints;
     cv::Mat descriptors;
     detector->detectAndCompute(imgGray, cv::noArray(), keypoints, descriptors);
     printf("%d: detected %zu keypoints\n", imgId, keypoints.size());
 
     // match them to previous image (if available)
     if (!prevImgGray.empty()) {
       std::vector<cv::DMatch> matches;
       matcher.match(prevDescriptors, descriptors, matches);
       printf("%d: all matches size=%zu\n", imgId, matches.size());
       std::string allMatchIdString{"all matches "};
       allMatchIdString += toString(imgId-1) + "<->" + toString(imgId);
       cvv::debugDMatch(prevImgGray, prevKeypoints, imgGray, keypoints, matches, CVVISUAL_LOCATION, allMatchIdString.c_str());
 
       // remove worst (as defined by match distance) bestRatio quantile
       double bestRatio = 0.8;
       std::sort(matches.begin(), matches.end());
       matches.resize(int(bestRatio * matches.size()));
       printf("%d: best matches size=%zu\n", imgId, matches.size());
       std::string bestMatchIdString{"best " + toString(bestRatio) + " matches "};
       bestMatchIdString += toString(imgId-1) + "<->" + toString(imgId);
       cvv::debugDMatch(prevImgGray, prevKeypoints, imgGray, keypoints, matches, CVVISUAL_LOCATION, bestMatchIdString.c_str());
     }
 
     prevImgGray = imgGray;
     prevKeypoints = keypoints;
     prevDescriptors = descriptors;
   }
 
   cvv::finalShow();
 
   return 0;
 }
```
Below is the CMakeLists.txt file used to build the program.

```cmake
# Listing 7-2: CMakeLists.txt
cmake_minimum_required(VERSION 2.8)
project(cvvisual_test)
SET(CMAKE_PREFIX_PATH ~/software/opencv/install)
SET(CMAKE_CXX_COMPILER "g++-4.8")
SET(CMAKE_CXX_FLAGS "-std=c++11 -O2 -pthread -Wall -Werror")
# (un)set: cmake -DCVV_DEBUG_MODE=OFF ..
OPTION(CVV_DEBUG_MODE "cvvisual-debug-mode" ON)
if(CVV_DEBUG_MODE MATCHES ON)
  set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -DCVVISUAL_DEBUGMODE")
endif()
FIND_PACKAGE(OpenCV REQUIRED)
include_directories(${OpenCV_INCLUDE_DIRS})
add_executable(cvvt main.cpp)
target_link_libraries(cvvt
  opencv_core opencv_videoio opencv_imgproc opencv_features2d
  opencv_cvv
)
```

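## 7.1.3	Code Explanation
We can compile the program with the CMakeLists.txt above and the option CVV_DEBUG_MODE=ON (`cmake -DCVV_DEBUG_MODE=ON`). Alternatively, the corresponding macro CVVISUAL_DEBUGMODE can be passed to the compiler directly (for example `g++ -DCVVISUAL_DEBUGMODE`).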
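The general pattern behind such a compile-time switch is a pair of function definitions selected by the preprocessor: a real one when the macro is defined and an empty inline one otherwise. The sketch below illustrates the pattern only. `DEMO_DEBUGMODE`, `demo::debugShow` and `demo::debugCallCount` are hypothetical names, and this is not the actual cvv implementation.

```cpp
#include <cassert>
#include <string>

// DEMO_DEBUGMODE is a hypothetical macro that plays the role of
// CVVISUAL_DEBUGMODE. Build with -DDEMO_DEBUGMODE to enable the debug path.

namespace demo {

// Number of debug calls that actually did work; stays 0 when disabled.
inline int& debugCallCount()
{
    static int count = 0;
    return count;
}

#ifdef DEMO_DEBUGMODE
// Enabled: a real tool would hand the data to a GUI here.
inline void debugShow(const std::string& description)
{
    assert(!description.empty());
    ++debugCallCount();
}
#else
// Disabled: an empty inline function that the compiler can drop entirely.
inline void debugShow(const std::string&) {}
#endif

} // namespace demo
```

Because the call sites stay in the source in both configurations, nothing has to be deleted before a release build; only the compiler flag changes.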

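-	The first cvv call simply shows the image (similar to imshow) and uses imgIdString as its description.

```cpp
// Listing 7-3: Simply showing an image
cvv::showImage(imgRead, CVVISUAL_LOCATION, imgIdString.c_str());
```

The image is added to the Overview tab of the visual debug GUI and the cvv call blocks, as shown in Figure 7-1.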
<p align="center">
<img src="https://img-blog.csdnimg.cn/20200226101731829.png" height="300">
</p>

The image can then be selected and inspected, as shown in Figure 7-2.
 
<p align="center">
<img src="https://img-blog.csdnimg.cn/2020022610193981.png" height="300">
</p>

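When you want to continue execution, i.e. unblock the cvv call, you can run to the next cvv call (Step, the green button), run on to the last cvv call (*>>*, yellow), or let the application run until it exits (Close, red).

Below we choose to continue to the next cvv call (Step, the green button).

-	The next cvv calls are used to debug filter operations, i.e. operations that take an image as input and return an image as output.
```cpp
// Listing 7-4: debugging the filter operation "to gray"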
cvv::debugFilter(imgRead, imgGray, CVVISUAL_LOCATION, "to gray");
```
As with every cvv call, you first land in the Overview tab, which now looks as shown in Figure 7-3.

<p align="center">
<img src="https://img-blog.csdnimg.cn/20200226102233572.png" height="300">
</p>

Press Step again.
```cpp
// Listing 7-5: debugging the filter operation "smoothed"
cvv::debugFilter(imgGray, imgGraySmooth, CVVISUAL_LOCATION, "smoothed");
```
If you open the filter call, you end up in the “DefaultFilterView”. Both images are shown side by side and can be zoomed in sync; the zoomed view is shown in Figure 7-4.

<p align="center">
<img src="https://img-blog.csdnimg.cn/20200226102457489.png" height="300">
</p>

We press Step twice and look at the enlarged image. At a very high zoom level the gray value of every pixel becomes visible, as shown in Figure 7-5.

<p align="center">
<img src="https://img-blog.csdnimg.cn/20200226102603106.png" height="300">
</p>

Inspect the debugging result for the edge detection.
```cpp
// Listing 7-6: inspecting the dilated edges
cvv::debugFilter(imgEdges, imgEdgesDilated, CVVISUAL_LOCATION, "dilated edges");
```
The DefaultFilterView now shows the two images, as in Figure 7-6.

<p align="center">
<img src="https://img-blog.csdnimg.cn/2020022610275344.png" height="300">
</p>

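Now we use the View selector in the top right corner to choose the “DualFilterView”. Selecting “Changed Pixels” as the filter and applying it to the middle image gives the result shown in Figure 7-7.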
<p align="center">
<img src="https://img-blog.csdnimg.cn/20200226103030594.png" height="300">
</p>

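After a closer look at these images (perhaps using different views, filters or other GUI features), we decide to let the program run to the end, so we press the yellow *>>* button. The program then blocks at cvv::finalShow() and displays the Overview with everything that was passed to cvv, as shown in Figure 7-8.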

<p align="center">
<img src="https://img-blog.csdnimg.cn/20200226103153100.png" height="300">
</p>

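-	The cvv debugDMatch call is meant for the case where two images each have a set of descriptors that are matched against each other.

We pass the two images, the two sets of keypoints and their matches to the visual debugging module.
```cpp
// Listing 7-7: keypoint matching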
cvv::debugDMatch(prevImgGray, prevKeypoints, imgGray, keypoints, matches, CVVISUAL_LOCATION, allMatchIdString.c_str());
```
We use the filter function of the Overview (*#type match*) to show only the match calls, as in Figure 7-9.
 
<p align="center">
<img src="https://img-blog.csdnimg.cn/20200226103414506.png" height="300">
</p>

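We may want to look at one of them in more detail, for example to tune the parameters used for matching. The view offers various settings for how keypoints and matches are displayed, as well as a mouse-over tooltip. The result of this step is shown in Figure 7-10.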

<p align="center">
<img src="https://img-blog.csdnimg.cn/20200226103509610.png" height="300">
</p>


There are clearly many mismatches. We therefore show only the 70% of matches with the smallest match distance; the result is given in Figure 7-11.
 
<p align="center">
<img src="https://img-blog.csdnimg.cn/20200226103602764.png" height="300">
</p>
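
The same kind of filtering can be sketched in plain C++. The snippet below is a minimal, OpenCV-free illustration of keeping the fraction of matches with the smallest distance, in the spirit of the sort-and-resize step in Listing 7-1. The `Match` struct and the function `keepBestMatches` are hypothetical names introduced only for this sketch; they are not part of cvv or OpenCV.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for cv::DMatch: only the fields this sketch needs.
struct Match {
    int queryIdx;
    int trainIdx;
    float distance;
};

// Keep the `ratio` fraction of matches with the smallest distance.
// `ratio` is expected to lie in [0, 1].
std::vector<Match> keepBestMatches(std::vector<Match> matches, double ratio)
{
    assert(ratio >= 0.0 && ratio <= 1.0);
    // Sort ascending by distance so the worst matches end up at the back.
    std::sort(matches.begin(), matches.end(),
              [](const Match& a, const Match& b) { return a.distance < b.distance; });
    // Truncate to the requested fraction (rounded down).
    matches.resize(static_cast<std::size_t>(ratio * matches.size()));
    return matches;
}
```

Calling `keepBestMatches(matches, 0.7)` would correspond to the 70% setting chosen in the GUI above, while Listing 7-1 uses a ratio of 0.8 in code.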

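With the visual clutter reduced, we want to see more clearly what changed between the two images. We select the “TranslationMatchView”, which displays the positions of the matched keypoints in a different way, as shown in Figure 7-12.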
 
<p align="center">
<img src="https://img-blog.csdnimg.cn/20200226103731534.png" height="300">
</p>

It is easy to see that the cup in the second image has moved some distance to the left relative to the first.
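
What the view shows visually can also be expressed as a number. The following sketch, under the assumption that `prev[i]` and `curr[i]` are the two ends of match `i`, computes the mean displacement of matched keypoints; a negative x component means a shift to the left in image coordinates. `Point` and `meanDisplacement` are hypothetical names used only here, not cvv or OpenCV API.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical 2D point type standing in for cv::Point2f.
struct Point {
    float x;
    float y;
};

// Mean displacement (second image minus first) over matched keypoint pairs.
// prev[i] and curr[i] are assumed to belong to the same match.
Point meanDisplacement(const std::vector<Point>& prev, const std::vector<Point>& curr)
{
    assert(prev.size() == curr.size() && !prev.empty());
    float dx = 0.f;
    float dy = 0.f;
    for (std::size_t i = 0; i < prev.size(); ++i) {
        dx += curr[i].x - prev[i].x;
        dy += curr[i].y - prev[i].y;
    }
    const float n = static_cast<float>(prev.size());
    return Point{dx / n, dy / n};
}
```

With many mismatches a plain mean is easily skewed, which is one more reason to filter the matches first.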

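-	Although the purpose of cvv is to let us inspect the results of computer vision operations interactively, the “RawView” complements this by letting us look at the underlying data, as shown in Figure 7-13.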
 
<p align="center">
<img src="https://img-blog.csdnimg.cn/2020022610385539.png" height="300">
</p>

-	The cvv GUI contains many more useful features. For example, the entries of the Overview tab can be grouped, as shown in Figure 7-14.
 
<p align="center">
<img src="https://img-blog.csdnimg.cn/20200226104011120.png" height="300">
</p>

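By adding visual debugging calls to a computer vision program, we can debug it interactively through the GUI. Once development and debugging are finished, these lines do not have to be removed: simply disable cvv debugging (cmake -DCVV_DEBUG_MODE=OFF, or drop *-DCVVISUAL_DEBUGMODE*) and the program runs normally without any debugging overhead.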
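
The general pattern behind such a switch can be sketched as follows. This is not cvv's actual implementation, only an illustration of how a macro-guarded debug call can stay in the source yet cost nothing when the macro is absent. `SKETCH_DEBUGMODE`, `debugShow`, `g_debugCalls` and `processImage` are all hypothetical names.

```cpp
#include <cassert>
#include <cstdio>

// Counts the debug calls that actually did some work.
int g_debugCalls = 0;

#ifdef SKETCH_DEBUGMODE
// Debug mode on: the call performs real work (here it just prints).
inline void debugShow(const char* description)
{
    assert(description != nullptr);
    ++g_debugCalls;
    std::printf("debug: %s\n", description);
}
#else
// Debug mode off: the call stays in the source but is an empty inline function.
inline void debugShow(const char*) {}
#endif

int processImage(int value)
{
    debugShow("before processing");
    const int result = value * 2;  // stand-in for real image processing
    debugShow("after processing");
    return result;
}
```

Compiled without `-DSKETCH_DEBUGMODE`, `processImage` behaves exactly as if the two `debugShow` lines were not there.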


================================================
FILE: chapter 8/使用CNNs进行目标检测.md
================================================
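This chapter uses the “dnn_objdetect” module, and the sample programs have to be built along with it, i.e. the BUILD_EXAMPLES option must be enabled during the build. Figure 8-1 shows this option in the CMake GUI; on Linux it is set with -DBUILD_EXAMPLES=ON.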
<p align="center">
<img src="https://img-blog.csdnimg.cn/20200226104443881.png" height="200">
</p>
 
**Tip
*With BUILD_EXAMPLES=ON the sample programs of all modules are built, so if possible it is advisable to build only the “dnn_objdetect” module.***

## 8.1.1	Goal
In this tutorial we use the bundled sample models to detect and classify objects in images. The following sample programs are used:

1.	example_dnn_objdetect_obj_detect
2.	example_dnn_objdetect_image_classification

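The model definition files and weight files needed in this tutorial can be found in opencv_extra/dnn_objdetect. Their names are listed in Table 8-1.

<p align="center">
  Table 8-1 Model definition and weight file names
</p>

Model definition file|Weight file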
-----|-----
SqueezeDet_deploy.prototxt |SqueezeDet.caffemodel	
SqueezeNet_deploy.prototxt|SqueezeNet.caffemodel

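**Tip
*The model definitions, weight files and test images used in this chapter can also be obtained by replying “dnn_objdetect” to the 小白学视觉 WeChat official account.***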

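## 8.1.2	Object detection
Object detection is run with the command format given in Listing 8-1.
```bash
# Listing 8-1: object detection command format
example_dnn_objdetect_obj_detect  <model definition file>  <model weights file>  <image>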
```

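All examples in this chapter were run on a laptop with an Intel(R) Core(TM)2 i3-4005U CPU @ 1.70GHz (no GPU).

In this tutorial the model predicts multiple bounding boxes in only 0.172091 seconds on average, which is fast even without a GPU.
```bash
# Listing 8-2: testing an aeroplane image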
<bin_path>/example_dnn_objdetect_obj_detect  SqueezeDet_deploy.prototxt  SqueezeDet.caffemodel  tutorials/images/aeroplane.jpg
Total objects detected: 1 in 0.168792 seconds
------
Class: aeroplane
Probability: 0.845181
Co-ordinates: 41 116 415 254
------
 ```

<p align="center">
<img src="https://img-blog.csdnimg.cn/20200226110321167.png" height="300">
</p>

```bash
# Listing 8-3: testing a bus image
<bin_path>/example_dnn_objdetect_obj_detect  SqueezeDet_deploy.prototxt  SqueezeDet.caffemodel  tutorials/images/bus.jpg
Total objects detected: 1 in 0.201276 seconds
------
Class: bus
Probability: 0.701829
Co-ordinates: 0 32 415 244
------
 ```
<p align="center">
<img src="https://img-blog.csdnimg.cn/20200226110603905.png" height="300">
</p>

```bash
# Listing 8-4: testing a cat image
<bin_path>/example_dnn_objdetect_obj_detect  SqueezeDet_deploy.prototxt  SqueezeDet.caffemodel  tutorials/images/cat.jpg
Total objects detected: 1 in 0.190335 seconds
------
Class: cat
Probability: 0.703465
Co-ordinates: 34 0 381 282
------
 ```

<p align="center">
<img src="https://img-blog.csdnimg.cn/20200226110709223.png" height="300">
</p>

```bash
# Listing 8-5: testing an image with people
<bin_path>/example_dnn_objdetect_obj_detect  SqueezeDet_deploy.prototxt  SqueezeDet.caffemodel  tutorials/images/persons_mutli.jpg
Total objects detected: 2 in 0.169152 seconds
------
Class: person
Probability: 0.737349
Co-ordinates: 160 67 313 363
------
Class: person
Probability: 0.720328
Co-ordinates: 187 198 222 323
------
 ```

<p align="center">
<img src="https://img-blog.csdnimg.cn/20200226110812788.png" height="300">
</p>


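## 8.1.3	Changing the threshold
By default the model thresholds detections at a confidence of 0.53. The network predicts a number of candidate bounding boxes, and the threshold decides which of them are kept. It can be controlled manually by passing a threshold value as an additional argument, as in Listing 8-6.
```bash
# Listing 8-6: changing the threshold
<bin_path>/example_dnn_objdetect_obj_detect  <model definition file>  <model weights file>   <test image> <threshold>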
```
Setting the threshold to 0 and detecting the aeroplane again gives the result shown in Figure 8-6.

<p align="center">
<img src="https://img-blog.csdnimg.cn/20200226111045592.png" height="300">
</p>
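
The effect of the threshold can be illustrated with a small sketch. It assumes that every candidate box carries a confidence value and that candidates at or above the threshold are kept; whether the sample program compares with `>=` or `>` is not stated in this tutorial. `Detection` and `filterByThreshold` are hypothetical names, not part of the dnn_objdetect API.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical container for one candidate detection.
struct Detection {
    std::string label;
    double probability;  // confidence in [0, 1]
};

// Keep only the candidates whose confidence reaches the threshold.
std::vector<Detection> filterByThreshold(const std::vector<Detection>& candidates,
                                         double threshold)
{
    assert(threshold >= 0.0 && threshold <= 1.0);
    std::vector<Detection> kept;
    for (const Detection& d : candidates) {
        if (d.probability >= threshold)
            kept.push_back(d);
    }
    return kept;
}
```

With a threshold of 0 every candidate passes, which is why many more boxes appear in Figure 8-6 than with the default value.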

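## 8.1.4	Image classification
-	The command format for classifying an image with an existing model is given in Listing 8-7.

```bash
# Listing 8-7: classifying an image
example_dnn_objdetect_image_classification  <model definition file>  <model weights file>   <test image>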
```

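The model is 4.9 MB in size and classifies an image in only 0.136401 seconds on average.

-	Listing 8-8 gives the command for classifying the aeroplane image together with its output.
```bash
# Listing 8-8: classifying the aeroplane image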
<bin_path>/example_dnn_objdetect_image_classification  SqueezeNet_deploy.prototxt  SqueezeNet.caffemodel  tutorials/images/aeroplane.jpg
Best class Index: 404
Time taken: 0.137722
Probability: 77.1757
```

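The predicted class index is 404. Looking it up in synset_words.txt shows that it belongs to the airliner class.

-	Next we classify the cat image. Listing 8-9 gives the command and the classification result.

```bash
# Listing 8-9: classifying the cat image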
<bin_path>/example_dnn_objdetect_image_classification  SqueezeNet_deploy.prototxt  SqueezeNet.caffemodel  tutorials/images/cat.jpg
Best class Index: 285
Time taken: 0.136401
Probability: 40.7111
```

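The predicted class index is 285. Looking it up in synset_words.txt shows that it belongs to the Egyptian cat class.

-	Next we classify the space shuttle image. Listing 8-10 gives the command and the classification result.
```bash
# Listing 8-10: classifying the space shuttle image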
<bin_path>/example_dnn_objdetect_image_classification  SqueezeNet_deploy.prototxt  SqueezeNet.caffemodel  tutorials/images/space_shuttle.jpg
Best class Index: 812
Time taken: 0.137792
Probability: 15.8467
```

The predicted class index is 812. Looking it up in synset_words.txt shows that it belongs to the space shuttle class.
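
Looking up an index by hand can be replaced with a few lines of code. The sketch below assumes that synset_words.txt holds one class per line in the form `<id> <label text>` and that the class index reported by the sample is the zero-based line number; both points are assumptions about the file, and `labelForIndex` is a hypothetical helper.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

// Return the label text for a zero-based class index.
// Assumes one class per line in the form "<id> <label text>".
// Returns an empty string if the index is out of range.
std::string labelForIndex(std::istream& synset, int index)
{
    assert(index >= 0);
    std::string line;
    for (int i = 0; std::getline(synset, line); ++i) {
        if (i == index) {
            // Drop the leading id; keep everything after the first space.
            const std::string::size_type space = line.find(' ');
            return space == std::string::npos ? line : line.substr(space + 1);
        }
    }
    return "";
}
```

To use it on the real file, open synset_words.txt with `std::ifstream` (from `<fstream>`) and pass the stream together with the index printed by the sample, for example 812.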


================================================
FILE: chapter 9/e
================================================


================================================
FILE: chapter 9/放大图像：单输出.md
================================================
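The examples in this chapter require the dnn_superres module. On Linux the module is enabled with the command in Listing 9-1; alternatively, select the dnn_superres module in CMake-GUI.

```bash
# Listing 9-1: building with the dnn_superres module
cmake -DOPENCV_EXTRA_MODULES_PATH=<opencv_contrib>/modules -Dopencv_dnn_superres=ON <opencv_source_dir>
```

In this section we learn how to use 'dnn_superres' to upscale an image with a pre-trained neural network.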

## 9.1.1	C++ code
```cpp
// Listing 9-2
// This file is part of OpenCV project.
 // It is subject to the license terms in the LICENSE file found in the top-level directory
 // of this distribution and at http://opencv.org/license.html.
 
 #include <iostream>
 
 #include <opencv2/dnn_superres.hpp>
 
 #include <opencv2/imgproc.hpp>
 #include <opencv2/highgui.hpp>
 
 using namespace std;
 using namespace cv;
 using namespace dnn;
 using namespace dnn_superres;
 
 int main(int argc, char *argv[])
 {
     // Check for valid command line arguments, print usage
     // if insufficient arguments were given.
     if ( argc < 4 ) {
         cout << "usage:   Arg 1: image     | Path to image" << endl;
         cout << "\t Arg 2: algorithm | bilinear, bicubic, edsr, espcn, fsrcnn or lapsrn" << endl;
         cout << "\t Arg 3: scale     | 2, 3 or 4 \n";
         cout << "\t Arg 4: path to model file \n";
         return -1;
     }
 
     string img_path = string(argv[1]);
     string algorithm = string(argv[2]);
     int scale = atoi(argv[3]);
     string path = "";
 
     if( argc > 4)
         path = string(argv[4]);
 
     // Load the image
     Mat img = cv::imread(img_path);
     Mat original_img(img);
     if ( img.empty() )
     {
         std::cerr << "Couldn't load image: " << img_path << "\n";
         return -2;
     }
 
     //Make dnn super resolution instance
     DnnSuperResImpl sr;
 
     Mat img_new;
 
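     if( algorithm == "bilinear" ){
         // INTER_LINEAR selects bilinear interpolation
         resize(img, img_new, Size(), scale, scale, INTER_LINEAR);
     }
     else if( algorithm == "bicubic" )
     {
         // INTER_CUBIC selects bicubic interpolation
         resize(img, img_new, Size(), scale, scale, INTER_CUBIC);
     }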
     else if( algorithm == "edsr" || algorithm == "espcn" || algorithm == "fsrcnn" || algorithm == "lapsrn" )
     {
         sr.readModel(path);
         sr.setModel(algorithm, scale);
         sr.upsample(img, img_new);
     }
     else{
         std::cerr << "Algorithm not recognized. \n";
     }
 
     if ( img_new.empty() )
     {
         std::cerr << "Upsampling failed. \n";
         return -3;
     }
     cout << "Upsampling succeeded. \n";
 
     // Display image
     cv::namedWindow("Initial Image", WINDOW_AUTOSIZE);
     cv::imshow("Initial Image", img_new);
     //cv::imwrite("./saved.jpg", img_new);
     cv::waitKey(0);
 
     return 0;
 }
```

## 9.1.2	Code explanation
-	Include the header and set the namespaces

```cpp
// Listing 9-3: header and namespaces
#include <opencv2/dnn_superres.hpp>
using namespace std;
using namespace cv;
using namespace dnn;
using namespace dnn_superres;
```
-	Create the Dnn Superres object

```cpp
// Listing 9-4: creating the Dnn Superres object
DnnSuperResImpl sr;
```
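This only creates the object, registers the custom dnn layers and gives access to the class functions.

-	Read the model

```cpp
// Listing 9-5: reading the model
string path = "models/FSRCNN_x2.pb";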
sr.readModel(path);
```
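This code reads the TensorFlow model from the .pb file. Here “path” is the path to one of the pre-trained TensorFlow model files. The models can be downloaded from the “dnn_superres” module on OpenCV's GitHub, or obtained by replying “dnn_superres” to the 小白学视觉 WeChat official account.

-	Set the model

```cpp
// Listing 9-6: setting the model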
sr.setModel("fsrcnn", 2);
```
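Depending on the model you want to run, the algorithm and the upscaling factor have to be set. This way the program knows the algorithm and the factor even if the .pb file has been renamed. For example, for FSRCNN_x2.pb the algorithm and factor are 'fsrcnn' and 2. (Other algorithm options include “edsr”, “espcn” and “lapsrn”.)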
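
When the model files keep their original names, the two arguments could in principle be derived from the file name. The helper below is purely hypothetical and not part of dnn_superres; it assumes the `<ALGORITHM>_x<SCALE>.pb` naming seen in FSRCNN_x2.pb and returns an empty algorithm and scale 0 for any other name.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <cstdlib>
#include <string>
#include <utility>

// Hypothetical helper: derive {algorithm, scale} from a file name such as
// "models/FSRCNN_x2.pb". Returns {"", 0} if the name does not follow the
// "<ALGORITHM>_x<SCALE>.pb" convention.
std::pair<std::string, int> parseModelName(const std::string& path)
{
    // Strip any directory part.
    const std::string::size_type slash = path.find_last_of("/\\");
    const std::string file = (slash == std::string::npos) ? path : path.substr(slash + 1);

    const std::string::size_type sep = file.rfind("_x");
    const std::string::size_type dot = file.rfind('.');
    if (sep == std::string::npos || sep == 0 || dot == std::string::npos || dot < sep + 3)
        return {"", 0};
    assert(sep + 2 < dot);  // at least one character between "_x" and "."

    // The scale part must consist of digits only.
    const std::string digits = file.substr(sep + 2, dot - sep - 2);
    for (unsigned char c : digits) {
        if (!std::isdigit(c))
            return {"", 0};
    }

    // setModel expects the algorithm name in lower case, e.g. "fsrcnn".
    std::string algorithm = file.substr(0, sep);
    std::transform(algorithm.begin(), algorithm.end(), algorithm.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });

    return {algorithm, std::atoi(digits.c_str())};
}
```

The result could then be forwarded as `sr.setModel(result.first, result.second)`. Passing the values explicitly, as the tutorial does, remains the more robust choice because it does not depend on file names.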

-	Upscale the image

```cpp
// Listing 9-7: upscaling the image
Mat img = cv::imread(img_path);
Mat img_new;
sr.upsample(img, img_new);
```

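Now any image can be upscaled. Load the image with the standard “imread” function and create a new Mat for the destination image. Then upscale the image; the result is stored in 'img_new'. Figure 9-1 shows the original image, Figure 9-2 the image upscaled with the FSRCNN algorithm, and Figure 9-3 the image upscaled with bicubic interpolation.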
 
<p align="center">
<img src="https://img-blog.csdnimg.cn/20200226112711741.png" height="150">
</p>

<p align="center">
<img src="https://img-blog.csdnimg.cn/20200226112836749.png" height="300">
</p>

<p align="center">
<img src="https://img-blog.csdnimg.cn/20200226112902925.png" height="300">
</p>


================================================
FILE: chapter 9/放大图像：多输出.md
================================================
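In this chapter we learn how to use 'dnn_superres' to upscale an image with a pre-trained neural network that has multiple outputs. OpenCV's dnn module can access several nodes in a single inference if the node names are given. The LapSRN model can deliver several outputs in one inference: it supports the scales 2x, 4x and 8x as well as the multi-output combinations (2x, 4x) and (2x, 4x, 8x). The uploaded trained model files have the following output node names: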

-	2x model: NCHW_output
-	4x model: NCHW_output_2x, NCHW_output_4x
-	8x model: NCHW_output_2x, NCHW_output_4x, NCHW_output_8x

## 9.2.1	C++ code
Run the sample with the command in Listing 9-8; the sample code itself is given in Listing 9-9.

```bash
# Listing 9-8: running the sample
./bin/example_dnn_superres_dnn_superres_multioutput path/to/image.png 2,4 NCHW_output_2x,NCHW_output_4x \
path/to/opencv_contrib/modules/dnn_superres/models/LapSRN_x4.pb
```

```cpp
// Listing 9-9
// This file is part of OpenCV project.
 // It is subject to the license terms in the LICENSE file found in the top-level directory
 // of this distribution and at http://opencv.org/license.html.
 
 #include <iostream>
 #include <sstream>
 #include <opencv2/dnn_superres.hpp>
 
 #include <opencv2/imgproc.hpp>
 #include <opencv2/highgui.hpp>
 
 using namespace std;
 using namespace cv;
 using namespace dnn_superres;
 
 int main(int argc, char *argv[])
 {
     // Check for valid command line arguments, print usage
     // if insufficient arguments were given.
     if (argc < 5) {  // program name plus four required arguments
         cout << "usage:   Arg 1: image     | Path to image" << endl;
         cout << "\t Arg 2: scales in a format of 2,4,8\n";
         cout << "\t Arg 3: output node names in a format of nchw_output_0,nchw_output_1\n";
         cout << "\t Arg 4: path to model file \n";
         return -1;
     }
 
     string img_path = string(argv[1]);
     string scales_str = string(argv[2]);
     string output_names_str = string(argv[3]);
     std::string path = string(argv[4]);
 
     //Parse the scaling factors
     std::vector<int> scales;
     char delim = ',';
     {
         std::stringstream ss(scales_str);
         std::string token;
         while (std::getline(ss, token, delim)) {
             scales.push_back(atoi(token.c_str()));
         }
     }
 
     //Parse the output node names
     std::vector<String> node_names;
     {
         std::stringstream ss(output_names_str);
         std::string token;
         while (std::getline(ss, token, delim)) {
             node_names.push_back(token);
         }
     }
 
     // Load the image
     Mat img = cv::imread(img_path);
     Mat original_img(img);
     if (img.empty())
     {
         std::cerr << "Couldn't load image: " << img_path << "\n";
         return -2;
     }
 
     //Make dnn super resolution instance
     DnnSuperResImpl sr;
     int scale = *max_element(scales.begin(), scales.end());
     std::vector<Mat> outputs;
     sr.readModel(path);
     sr.setModel("lapsrn", scale);
 
     sr.upsampleMultioutput(img, outputs, scales, node_names);
 
     for(unsigned int i = 0; i < outputs.size(); i++)
     {
         cv::namedWindow("Upsampled image", WINDOW_AUTOSIZE);
         cv::imshow("Upsampled image", outputs[i]);
         //cv::imwrite("./saved.jpg", img_new);
         cv::waitKey(0);
     }
 
     return 0;
 }
```

## 9.2.2	Code explanation
-	Include the header and set the namespaces

```cpp
// Listing 9-10: header and namespaces
#include <opencv2/dnn_superres.hpp>
using namespace std;
using namespace cv;
using namespace dnn_superres;
```

-	Create the Dnn Superres object

```cpp
// Listing 9-11: creating the Dnn Superres object
DnnSuperResImpl sr;
```

-	Read the model

```cpp
// Listing 9-12: reading the model
string path = "models/LapSRN_x8.pb";
sr.readModel(path);
```

-	Set the model

```cpp
// Listing 9-13: setting the model
sr.setModel("lapsrn", 8);
```

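Set the algorithm and the scale factor. The last (largest) scale factor should be given here.

-	Give the node names and scale factors

```cpp
// Listing 9-14: setting the node names and scale factors
std::vector<int> scales{2, 4, 8};
std::vector<String> node_names{"NCHW_output_2x", "NCHW_output_4x", "NCHW_output_8x"};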
```

-	Upscale the image

```cpp
// Listing 9-15: upscaling the image
Mat img = cv::imread(img_path);
std::vector<Mat> outputs;
sr.upsampleMultioutput(img, outputs, scales, node_names);
```
Run the upsampling; the output images are stored in a vector of Mat objects.
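
The command-line parsing in Listing 9-9 can be isolated into small helpers. The sketch below reproduces the comma-splitting approach of the sample without any OpenCV dependency and adds a consistency check. The assumption that every scale needs exactly one node name is ours, and `splitList`, `parseScales` and `listsMatch` are hypothetical names.

```cpp
#include <cassert>
#include <cstdlib>
#include <sstream>
#include <string>
#include <vector>

// Split "a,b,c" into {"a", "b", "c"}.
std::vector<std::string> splitList(const std::string& text, char delim = ',')
{
    std::vector<std::string> tokens;
    std::stringstream ss(text);
    std::string token;
    while (std::getline(ss, token, delim))
        tokens.push_back(token);
    return tokens;
}

// Convert "2,4,8" to {2, 4, 8}; tokens that are not numbers become 0 (atoi).
std::vector<int> parseScales(const std::string& text)
{
    std::vector<int> scales;
    for (const std::string& token : splitList(text))
        scales.push_back(std::atoi(token.c_str()));
    return scales;
}

// Assumption of this sketch: each requested scale needs one output node name.
bool listsMatch(const std::vector<int>& scales, const std::vector<std::string>& nodeNames)
{
    return !scales.empty() && scales.size() == nodeNames.size();
}
```

Checking `listsMatch(parseScales(argv[2]), splitList(argv[3]))` before calling `upsampleMultioutput` would catch a mistyped command line early.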


================================================
FILE: chapter 9/放大视频.md
================================================
In this chapter we learn how to use 'dnn_superres' to upscale a video with a pre-trained neural network.

## 9.3.1	C++ code
```cpp
// Listing 9-16
// This file is part of OpenCV project.
 // It is subject to the license terms in the LICENSE file found in the top-level directory
 // of this distribution and at http://opencv.org/license.html.
 
 #include <iostream>
 
 #include <opencv2/dnn_superres.hpp>
 
 #include <opencv2/imgproc.hpp>
 #include <opencv2/highgui.hpp>
 
 using namespace std;
 using namespace cv;
 using namespace dnn_superres;
 
 int main(int argc, char *argv[])
 {
     // Check for valid command line arguments, print usage
     // if insufficient arguments were given.
     if (argc < 6) {  // program name plus five required arguments
         cout << "usage:   Arg 1: input video path" << endl;
         cout << "\t Arg 2: output video path" << endl;
         cout << "\t Arg 3: algorithm | edsr, espcn, fsrcnn or lapsrn" << endl;
         cout << "\t Arg 4: scale     | 2, 3, 4 or 8 \n";
         cout << "\t Arg 5: path to model file \n";
         return -1;
     }
 
     string input_path = string(argv[1]);
     string output_path = string(argv[2]);
     string algorithm = string(argv[3]);
     int scale = atoi(argv[4]);
     string path = string(argv[5]);
 
     VideoCapture input_video(input_path);
     // Check the input before querying its properties
     if (!input_video.isOpened())
     {
         std::cerr << "Could not open the video." << std::endl;
         return -1;
     }
 
     int ex = static_cast<int>(input_video.get(CAP_PROP_FOURCC));
     Size S = Size((int) input_video.get(CAP_PROP_FRAME_WIDTH) * scale,
                   (int) input_video.get(CAP_PROP_FRAME_HEIGHT) * scale);
 
     VideoWriter output_video;
     output_video.open(output_path, ex, input_video.get(CAP_PROP_FPS), S, true);
 
     DnnSuperResImpl sr;
     sr.readModel(path);
     sr.setModel(algorithm, scale);
 
     for(;;)
     {
         Mat frame, output_frame;
         input_video >> frame;
 
         if ( frame.empty() )
             break;
 
         sr.upsample(frame, output_frame);
         output_video << output_frame;
 
         namedWindow("Upsampled video", WINDOW_AUTOSIZE);
         imshow("Upsampled video", output_frame);
 
         namedWindow("Original video", WINDOW_AUTOSIZE);
         imshow("Original video", frame);
 
         char c=(char)waitKey(25);
         if(c==27)
             break;
     }
 
     input_video.release();
     output_video.release();
 
     return 0;
 }
```

## 9.3.2	Code explanation
-	Include the header and set the namespaces

```cpp
// Listing 9-17: header and namespaces
#include <opencv2/dnn_superres.hpp>
using namespace std;
using namespace cv;
using namespace dnn_superres;
```

-	Create the Dnn Superres object

```cpp
// Listing 9-18: creating the Dnn Superres object
DnnSuperResImpl sr;
```

-	Read the model and set the algorithm and scale factor.

```cpp
// Listing 9-19: reading the model
string path = "models/ESPCN_x2.pb";
sr.readModel(path);
sr.setModel("espcn", 2);
```

-	Upscale the video

```cpp
// Listing 9-20: upscaling the video
for(;;)
{
    Mat frame, output_frame;
    input_video >> frame;
    if ( frame.empty() )
        break;
    sr.upsample(frame, output_frame);
    ...
}
```
The video is upscaled by processing it frame by frame.


================================================
FILE: chapter 9/超分辨率基准测试.md
================================================

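The super-resolution module contains sample code for benchmarking, which can be used to compare different models and algorithms. This tutorial gives the benchmarking sample code, shown in Listing 9-21, together with some benchmark results. The test PC was configured as follows: CPU: Intel i7-9700K; operating system: Ubuntu 18.04.02.

## 9.4.1	C++ code

```cpp
// Listing 9-21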
// This file is part of OpenCV project.
 // It is subject to the license terms in the LICENSE file found in the top-level directory
 // of this distribution and at http://opencv.org/license.html.
 
 #include <iostream>
 #include <opencv2/opencv_modules.hpp>
 
 #ifdef HAVE_OPENCV_QUALITY
 #include <opencv2/dnn_superres.hpp>
 #include <opencv2/quality.hpp>
 #include <opencv2/imgproc.hpp>
 #include <opencv2/highgui.hpp>
 
 using namespace std;
 using namespace cv;
 using namespace dnn_superres;
 
 static void showBenchmark(vector<Mat> images, string title, Size imageSize,
                           const vector<String> imageTitles,
                           const vector<double> psnrValues,
                           const vector<double> ssimValues)
 {
     int fontFace = FONT_HERSHEY_COMPLEX_SMALL;
     int fontScale = 1;
     Scalar fontColor = Scalar(255, 255, 255);
 
     int len = static_cast<int>(images.size());
 
     int cols = 2, rows = 2;
 
     Mat fullImage = Mat::zeros(Size((cols * 10) + imageSize.width * cols, (rows * 10) + imageSize.height * rows),
                                images[0].type());
 
     stringstream ss;
     int h_ = -1;
     for (int i = 0; i < len; i++) {
 
         int fontStart = 15;
         int w_ = i % cols;
         if (i % cols == 0)
             h_++;
 
         Rect ROI((w_ * (10 + imageSize.width)), (h_ * (10 + imageSize.height)), imageSize.width, imageSize.height);
         Mat tmp;
         resize(images[i], tmp, Size(ROI.width, ROI.height));
 
         ss << imageTitles[i];
         putText(tmp,
                 ss.str(),
                 Point(5, fontStart),
                 fontFace,
                 fontScale,
                 fontColor,
                 1,
                 16);
 
         ss.str("");
         fontStart += 20;
 
         ss << "PSNR: " << psnrValues[i];
         putText(tmp,
                 ss.str(),
                 Point(5, fontStart),
                 fontFace,
                 fontScale,
                 fontColor,
                 1,
                 16);
 
         ss.str("");
         fontStart += 20;
 
         ss << "SSIM: " << ssimValues[i];
         putText(tmp,
                 ss.str(),
                 Point(5, fontStart),
                 fontFace,
                 fontScale,
                 fontColor,
                 1,
                 16);
 
         ss.str("");
         fontStart += 20;
 
         tmp.copyTo(fullImage(ROI));
     }
 
     namedWindow(title, 1);
     imshow(title, fullImage);
     waitKey();
 }
 
 static Vec2d getQualityValues(Mat orig, Mat upsampled)
 {
     double psnr = PSNR(upsampled, orig);
     Scalar q = quality::QualitySSIM::compute(upsampled, orig, noArray());
     double ssim = mean(Vec3d((q[0]), q[1], q[2]))[0];
     return Vec2d(psnr, ssim);
 }
 
 int main(int argc, char *argv[])
 {
     // Check for valid command line arguments, print usage
     // if insufficient arguments were given.
     if (argc < 5) {  // program name plus four required arguments
         cout << "usage:   Arg 1: image path  | Path to image" << endl;
         cout << "\t Arg 2: algorithm | edsr, espcn, fsrcnn or lapsrn" << endl;
         cout << "\t Arg 3: path to model file \n";
         cout << "\t Arg 4: scale  | 2, 3, 4 or 8 \n";
         return -1;
     }
 
     string path = string(argv[1]);
     string algorithm = string(argv[2]);
     string model = string(argv[3]);
     int scale = atoi(argv[4]);
 
     Mat img = imread(path);
     if (img.empty()) {
         cerr << "Couldn't load image: " << path << "\n";
         return -2;
     }
 
     //Crop the image so the images will be aligned
     int width = img.cols - (img.cols % scale);
     int height = img.rows - (img.rows % scale);
     Mat cropped = img(Rect(0, 0, width, height));
 
     //Downscale the image for benchmarking
     Mat img_downscaled;
     resize(cropped, img_downscaled, Size(), 1.0 / scale, 1.0 / scale);
 
     //Make dnn super resolution instance
     DnnSuperResImpl sr;
 
     vector <Mat> allImages;
     Mat img_new;
 
     //Read and set the dnn model
     sr.readModel(model);
     sr.setModel(algorithm, scale);
     sr.upsample(img_downscaled, img_new);
 
     vector<double> psnrValues = vector<double>();
     vector<double> ssimValues = vector<double>();
 
     //DL MODEL
     Vec2f quality = getQualityValues(cropped, img_new);
 
     psnrValues.push_back(quality[0]);
     ssimValues.push_back(quality[1]);
 
     cout << sr.getAlgorithm() << ":" << endl;
     cout << "PSNR: " << quality[0] << " SSIM: " << quality[1] << endl;
     cout << "----------------------" << endl;
 
     //BICUBIC
     Mat bicubic;
     resize(img_downscaled, bicubic, Size(), scale, scale, INTER_CUBIC);
     quality = getQualityValues(cropped, bicubic);
 
     psnrValues.push_back(quality[0]);
     ssimValues.push_back(quality[1]);
 
     cout << "Bicubic " << endl;
     cout << "PSNR: " << quality[0] << " SSIM: " << quality[1] << endl;
     cout << "----------------------" << endl;
 
     //NEAREST NEIGHBOR
     Mat nearest;
     resize(img_downscaled, nearest, Size(), scale, scale, INTER_NEAREST);
     quality = getQualityValues(cropped, nearest);
 
     psnrValues.push_back(quality[0]);
     ssimValues.push_back(quality[1]);
 
     cout << "Nearest neighbor" << endl;
     cout << "PSNR: " << quality[0] << " SSIM: " << quality[1] << endl;
     cout << "----------------------" << endl;
 
     //LANCZOS
     Mat lanczos;
     resize(img_downscaled, lanczos, Size(), scale, scale, INTER_LANCZOS4);
     quality = getQualityValues(cropped, lanczos);
 
     psnrValues.push_back(quality[0]);
     ssimValues.push_back(quality[1]);
 
     cout << "Lanczos" << endl;
     cout << "PSNR: " << quality[0] << " SSIM: " << quality[1] << endl;
     cout << "-----------------------------------------------" << endl;
 
     vector <Mat> imgs{img_new, bicubic, nearest, lanczos};
     vector <String> titles{sr.getAlgorithm(), "Bicubic", "Nearest neighbor", "Lanczos"};
     showBenchmark(imgs, "Quality benchmark", Size(bicubic.cols, bicubic.rows), titles, psnrValues, ssimValues);
 
     waitKey(0);
 
     return 0;
 }
 #else
 int main()
 {
     std::cout << "This sample requires the OpenCV Quality module." << std::endl;
     return 0;
 }
 #endif
```

## 9.4.2	Code explanation
-	Read and downscale (downsample) the image

```cpp
// Listing 9-22: image preprocessing
int width = img.cols - (img.cols % scale);
int height = img.rows - (img.rows % scale);
Mat cropped = img(Rect(0, 0, width, height));
Mat img_downscaled;
cv::resize(cropped, img_downscaled, cv::Size(), 1.0 / scale, 1.0 / scale);
```

The image is resized by the scale factor. It has to be cropped before the algorithms run so that the images line up.
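
The arithmetic behind the cropping step can be checked in isolation. In this sketch `Dims`, `croppedDims` and `downscaledDims` are hypothetical names; the formulas are the ones from Listing 9-22.

```cpp
#include <cassert>

// Hypothetical width/height pair standing in for cv::Size.
struct Dims {
    int width;
    int height;
};

// Largest size not exceeding (cols, rows) whose sides are multiples of scale.
Dims croppedDims(int cols, int rows, int scale)
{
    assert(scale > 0);
    return Dims{cols - (cols % scale), rows - (rows % scale)};
}

// Size of the downscaled image produced from the cropped one.
Dims downscaledDims(Dims cropped, int scale)
{
    assert(scale > 0);
    return Dims{cropped.width / scale, cropped.height / scale};
}
```

For a 1023 x 767 image and a scale of 4, the cropped image is 1020 x 764 and the downscaled one 255 x 191. Upscaling by 4 gives 1020 x 764 again, so the result can be compared pixel by pixel with the cropped original.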

-	Set up the model
```cpp
// Listing 9-23: model configuration
DnnSuperResImpl sr;
sr.readModel(model);
sr.setModel(algorithm, scale);
sr.upsample(img_downscaled, img_new);
```

Instantiate a Dnn Superres object, then read the model and set the algorithm and scale factor.

-	Run the benchmark

```cpp
// Listing 9-24: running the benchmark
double psnr = PSNR(img_new, cropped);
Scalar q = cv::quality::QualitySSIM::compute(img_new, cropped, cv::noArray());
double ssim = mean(cv::Vec3f(q[0], q[1], q[2]))[0];
```

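Compute PSNR and SSIM. The images are compared using PSNR (core OpenCV) and SSIM (OpenCV contrib). The same two measures are computed repeatedly to compare several methods, such as other DL models or interpolation methods (bicubic, nearest neighbor).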
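
For reference, PSNR follows from the mean squared error (MSE) as PSNR = 10 · log10(MAX² / MSE), with MAX = 255 for 8-bit images. The sketch below implements this textbook definition on flat pixel buffers. It is an illustration only; `cv::PSNR` may treat details such as identical inputs differently, and `meanSquaredError` and `psnr` are names chosen for this sketch.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Mean squared error between two equally sized 8-bit pixel buffers.
double meanSquaredError(const std::vector<std::uint8_t>& a,
                        const std::vector<std::uint8_t>& b)
{
    assert(a.size() == b.size() && !a.empty());
    double sum = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const double diff = static_cast<double>(a[i]) - static_cast<double>(b[i]);
        sum += diff * diff;
    }
    return sum / static_cast<double>(a.size());
}

// PSNR in dB for 8-bit data (peak value 255).
// Identical inputs give +infinity in this sketch.
double psnr(const std::vector<std::uint8_t>& a, const std::vector<std::uint8_t>& b)
{
    const double mse = meanSquaredError(a, b);
    if (mse == 0.0)
        return std::numeric_limits<double>::infinity();
    return 10.0 * std::log10(255.0 * 255.0 / mse);
}
```

A higher PSNR means the upscaled image is closer to the original, which is how the PSNR columns in the tables below should be read.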

## 9.4.3	Benchmark results
<p align="center">Table 9-1 Upscaling by 2x</p>


Method|Average time (s)|Average PSNR|Average SSIM
---|---|---|---
ESPCN|**0.008795**|32.7059|0.9276
EDSR|5.923450|**34.1300**|**0.9447**
FSRCNN|0.021741|32.8886|0.9301
LapSRN|0.114812|32.2681|0.9248
Bicubic|0.000208|32.1638|0.9305
Nearest neighbor|0.000114|29.1665|0.9049
Lanczos|0.001094|32.4687|0.9327


<p align="center">Table 9-2 Upscaling by 3x</p>

Method|Average time (s)|Average PSNR|Average SSIM
---|---|---|---
ESPCN|**0.005495**|28.4229|0.8474
EDSR|2.455510|**29.9828**|**0.8801**
FSRCNN|0.008807|28.3068|0.8429
LapSRN|0.282575|26.7330|0.8862
Bicubic|0.000311|26.0635|0.8754
Nearest neighbor|0.000148|23.5628|0.8174
Lanczos|0.001012|25.9115|0.8706

<p align="center">Table 9-3 Upscaling by 4x</p>


Method|Average time (s)|Average PSNR|Average SSIM
---|---|---|---
ESPCN|**0.004311**|26.6870|0.7891
EDSR|1.607570|**28.1552**|**0.8317**
FSRCNN|0.005302|26.6088|0.7863
LapSRN|0.121229|26.7383|0.7896
Bicubic|0.000311|26.0635|0.8754
Nearest neighbor|0.000148|23.5628|0.8174
Lanczos|0.001012|25.9115|0.8706

<p align="center">Table 9-4 Images upscaled by 2x</p>

<p align="center">
<img src="https://img-blog.csdnimg.cn/20200226120400463.png">
</p>
 	 	 	 
<p align="center">Table 9-5 Images upscaled by 4x</p>

<p align="center">
<img src="https://img-blog.csdnimg.cn/20200226120442832.png">
</p>