Repository: jikexueyuanwiki/tensorflow-zh
Branch: master
Commit: 28dd80bc65d6
Files: 181
Total size: 2.2 MB
Directory structure:
gitextract_iq8tfs9_/
├── .gitignore
├── README.md
├── README副本.md
├── SOURCE/
│ ├── api_docs/
│ │ ├── cc/
│ │ │ ├── ClassEnv.md
│ │ │ ├── ClassEnvWrapper.md
│ │ │ ├── ClassRandomAccessFile.md
│ │ │ ├── ClassSession.md
│ │ │ ├── ClassStatus.md
│ │ │ ├── ClassTensor.md
│ │ │ ├── ClassTensorBuffer.md
│ │ │ ├── ClassTensorShape.md
│ │ │ ├── ClassTensorShapeIter.md
│ │ │ ├── ClassTensorShapeUtils.md
│ │ │ ├── ClassThread.md
│ │ │ ├── ClassWritableFile.md
│ │ │ ├── StructSessionOptions.md
│ │ │ ├── StructState.md
│ │ │ ├── StructTensorShapeDim.md
│ │ │ ├── StructThreadOptions.md
│ │ │ └── index.md
│ │ ├── index.md
│ │ └── python/
│ │ ├── array_ops.md
│ │ ├── client.md
│ │ ├── constant_op.md
│ │ ├── control_flow_ops.md
│ │ ├── framework.md
│ │ ├── image.md
│ │ ├── index.md
│ │ ├── io_ops.md
│ │ ├── math_ops.md
│ │ ├── nn.md
│ │ ├── python_io.md
│ │ ├── sparse_ops.md
│ │ ├── state_ops.md
│ │ └── train.md
│ ├── extras/
│ │ └── README.txt
│ ├── faq.md
│ ├── get_started/
│ │ ├── basic_usage.md
│ │ ├── introduction.md
│ │ └── os_setup.md
│ ├── how_tos/
│ │ ├── __init__.py
│ │ ├── adding_an_op/
│ │ │ ├── __init__.py
│ │ │ ├── attr_examples.cc
│ │ │ ├── fact_test.py
│ │ │ ├── index.md
│ │ │ ├── zero_out_1_test.py
│ │ │ ├── zero_out_2_test.py
│ │ │ ├── zero_out_3_test.py
│ │ │ ├── zero_out_grad_2.py
│ │ │ ├── zero_out_op_kernel_1.cc
│ │ │ ├── zero_out_op_kernel_2.cc
│ │ │ └── zero_out_op_kernel_3.cc
│ │ ├── adding_an_op.md
│ │ ├── graph_viz/
│ │ │ └── index.md
│ │ ├── graph_viz.md
│ │ ├── new_data_formats/
│ │ │ ├── Custom Data Readers.md
│ │ │ └── index.md
│ │ ├── new_data_formats.md
│ │ ├── overview.md
│ │ ├── reading_data/
│ │ │ ├── __init__.py
│ │ │ ├── convert_to_records.py
│ │ │ ├── fully_connected_preloaded.py
│ │ │ ├── fully_connected_preloaded_var.py
│ │ │ ├── fully_connected_reader.py
│ │ │ └── index.md
│ │ ├── reading_data.md
│ │ ├── summaries_and_tensorboard/
│ │ │ └── index.md
│ │ ├── summaries_and_tensorboard.md
│ │ ├── threading_and_queues/
│ │ │ └── index.md
│ │ ├── threading_and_queues.md
│ │ ├── using_gpu/
│ │ │ └── index.md
│ │ ├── using_gpu.md
│ │ ├── variable_scope/
│ │ │ └── index.md
│ │ ├── variable_scope.md
│ │ ├── variables/
│ │ │ └── index.md
│ │ └── variables.md
│ ├── images/
│ │ └── getting_started.dot
│ ├── index.md
│ ├── personal.md
│ ├── resource.md
│ ├── resources/
│ │ ├── bib.md
│ │ ├── dims_types.md
│ │ ├── faq.md
│ │ ├── glossary.md
│ │ ├── overview.md
│ │ └── uses.md
│ └── tutorials/
│ ├── BUILD
│ ├── __init__.py
│ ├── deep_cnn/
│ │ ├── cifar_tensorboard.html
│ │ └── index.md
│ ├── deep_cnn.md
│ ├── mandelbrot/
│ │ └── index.md
│ ├── mandelbrot.md
│ ├── mnist/
│ │ ├── __init__.py
│ │ ├── beginners/
│ │ │ └── index.md
│ │ ├── download/
│ │ │ └── index.md
│ │ ├── fully_connected_feed.py
│ │ ├── input_data.py
│ │ ├── mnist.py
│ │ ├── mnist_softmax.py
│ │ ├── pros/
│ │ │ └── index.md
│ │ └── tf/
│ │ └── index.md
│ ├── mnist_beginners.md
│ ├── mnist_download.md
│ ├── mnist_pros.md
│ ├── mnist_tf.md
│ ├── overview.md
│ ├── pdes/
│ │ └── index.md
│ ├── pdes.md
│ ├── recurrent/
│ │ └── index.md
│ ├── recurrent.md
│ ├── seq2seq/
│ │ └── index.md
│ ├── seq2seq.md
│ ├── word2vec/
│ │ ├── __init__.py
│ │ ├── index.md
│ │ └── word2vec_basic.py
│ └── word2vec.md
├── TOC.md
├── config.json
├── learn-github.md
├── markdown.md
└── tex_pdf/
├── api/
│ ├── c4s00.tex
│ ├── cc/
│ │ ├── ClassEnv.md
│ │ ├── ClassEnvWrapper.md
│ │ ├── ClassRandomAccessFile.md
│ │ ├── ClassSession.md
│ │ ├── ClassStatus.md
│ │ ├── ClassTensor.md
│ │ ├── ClassTensorBuffer.md
│ │ ├── ClassTensorShape.md
│ │ ├── ClassTensorShapeIter.md
│ │ ├── ClassTensorShapeUtils.md
│ │ ├── ClassThread.md
│ │ ├── ClassWritableFile.md
│ │ ├── StructSessionOptions.md
│ │ ├── StructState.md
│ │ ├── StructTensorShapeDim.md
│ │ ├── StructThreadOptions.md
│ │ └── index.md
│ └── python/
│ ├── c4s01_framework.tex
│ ├── c4s02_constant_op.tex
│ ├── c4s03_state_ops.tex
│ ├── c4s04_array_ops.tex
│ ├── c4s05_math_ops.tex
│ ├── c4s06_control_flow_ops.tex
│ ├── c4s07_image.tex
│ ├── c4s08_sparse_ops.tex
│ ├── c4s09_io_ops.tex
│ ├── c4s10_python_io.tex
│ ├── c4s11_nn.tex
│ ├── c4s12_client.tex
│ ├── c4s13_train.tex
│ └── index.tex
├── get_started/
│ ├── c1s01_introduction.tex
│ ├── c1s02_os_setup.tex
│ └── c1s03_basic_usage.tex
├── how_tos/
│ ├── c3s00_overview.tex
│ ├── c3s01_variables.tex
│ ├── c3s02_variable_scope.tex
│ ├── c3s03_viz_learning.tex
│ ├── c3s04_graph_viz.tex
│ ├── c3s05_reading_data.tex
│ ├── c3s06_threading_and_queues.tex
│ ├── c3s07_adding_an_op.tex
│ ├── c3s08_new_data_formats.tex
│ └── c3s09_using_gpu.tex
├── tensorflow_manual_cn.idx
├── tensorflow_manual_cn.ilg
├── tensorflow_manual_cn.ind
├── tensorflow_manual_cn.tex
└── tutorials/
├── c2s00_overview.tex
├── c2s01_minist_beginners.tex
├── c2s02_minist_pros.tex
├── c2s03_.tex
├── c2s04_.tex
├── c2s05_word2vec.tex
├── c2s06_recurrent.tex
├── c2s07_seq2seq.tex
├── c2s08_2_9_mandelbrot.tex
├── c2s09_pdes.tex
└── c2s10_mnist_download.tex
================================================
FILE CONTENTS
================================================
================================================
FILE: .gitignore
================================================
*.aux
*.fdb_latexmk
*.log
*.out
*.synctex.gz
*.toc
*.fls
*.sublime-project
*.sublime-workspace
*.DS_Store
*.idea
================================================
FILE: README.md
================================================
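# TensorFlow Official Documentation, Chinese Edition

### The project you are reading about may shape the world even more profoundly than Android!
## Origins
On November 9, 2015, Google released its artificial intelligence system TensorFlow as open source. That same day, Jikexueyuan (极客学院) began organizing an online Chinese translation of the TensorFlow documentation.
Machine learning, as a branch of artificial intelligence, lets software describe or predict future situations based on large amounts of data. Today's leading technology giants are all investing heavily in machine learning: Facebook, Apple, Microsoft, and even Baidu in China. Google is naturally among them. TensorFlow is the machine learning system Google has used internally for years. Google is now making this system open source and publishing its parameters to industry engineers, academics, and technically skilled programmers. What does this mean?
To use a not entirely apt analogy, Google's treatment of TensorFlow today is somewhat like its treatment of its mobile operating system, Android. If more data scientists start using Google's system for machine learning research, Google will gain more influence over the growing machine learning industry.
To help developers in China master this world-leading AI system as quickly as possible, the Jikexueyuan Wiki team launched a collaborative Chinese translation of the official TensorFlow documentation. Within one week every chapter had been claimed; one month later all 30 chapters had been translated and proofread, published on the Jikexueyuan Wiki platform, and made available for download.
Jeff Dean, who leads the TensorFlow project at Google, replied to the Chinese translation project: "*I am very excited to see TensorFlow being translated into Chinese. One of the main reasons we open-sourced TensorFlow was so that people all over the world could benefit from machine learning and artificial intelligence. Collaborative translations like this make it easier for more people to get access to TensorFlow, and I look forward to seeing the project used around the world!*"
Jeff's original reply:

Once again, heartfelt thanks to everyone who contributed to this translation project. We will keep following TensorFlow, the AI field, and other new technologies, keep maintaining this collaborative translation, and keep providing more and better content for IT learners!
## Content Sources
Official English website:
Official GitHub repository:
Chinese edition GitHub repository:
## Contributors (in order of claimed chapters)
### Translation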
- [@PFZheng](https://github.com/PFZheng)
- [@Tony Jin](https://github.com/linbojin)
- [@chenweican](https://github.com/chenweican)
- [@bingjin](https://github.com/bingjin)
- [@oskycar](https://github.com/oskycar)
- [@btpeter](https://github.com/btpeter)
- [@Warln](https://github.com/Warln)
- [@ericxk](https://github.com/ericxk)
- [@wangaicc](https://github.com/wangaicc)
- [@Terence Cooper](https://github.com/TerenceCooper)
- [@zhyhooo](https://github.com/zhyhooo)
- [@thylaco1eo](https://github.com/thylaco1eo)
- [@volvet](https://github.com/volvet)
- [@zhangkom](https://github.com/zhangkom)
- [@derekshang](https://github.com/derekshang)
- [@lianghyv](https://github.com/lianghyv)
- [@nb312](https://github.com/nb312)
- [@Jim-Zenn](https://github.com/Jim-Zenn)
- [@andyiac](https://github.com/andyiac)
- [@Terence Cooper](https://github.com/TerenceCooper)
- [@leege100](https://github.com/leege100)
### Proofreading
- [@yangtze](https://github.com/sstruct)
- [@ericxk](https://github.com/ericxk)
- [@HongyangWang](https://github.com/WangHong-yang)
- [@LichAmnesia](https://github.com/LichAmnesia)
- [@zhyhooo](https://github.com/zhyhooo)
- [@waiwaizheng](https://github.com/waiwaizheng)
- [@HongyangWang](https://github.com/WangHong-yang)
- [@tensorfly](https://github.com/tensorfly)
- [@lonlonago](https://github.com/lonlonago)
- [@jishaoming](https://github.com/jishaoming)
- [@lucky521](https://github.com/lucky521)
- [@allensummer](http://github.com/allensummer)
- [@volvet](https://github.com/volvet)
- [@ZHNathanielLee](https://github.com/ZHNathanielLee)
- [@pengfoo](https://github.com/PengFoo)
- [@qiaohaijun](https://github.com/qiaohaijun)
- [@Seika](https://github.com/SeikaScarlet)
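## Progress Log
- 2015-11-10: Google released its new artificial intelligence system TensorFlow as open source; Jikexueyuan Wiki started the collaborative translation, created the GitHub repository, and set the collaboration guidelines
- 2015-11-18: all chapters claimed, 18 chapters translated, 7 chapters claimed for proofreading, 361 stars, 100 forks, nearly 300 TF enthusiasts in the translation QQ group and the technical discussion group, ranked second in GitHub search results for TensorFlow
- 2015-12-10: passed 500 stars
- 2015-12-15: project officially launched
## Behind the Scenes
Some moments from organizing the translation left a deep impression. We record them here so that future readers know how this documentation came about:
- The translators included students and teachers; AI/ML specialists and people who were simply interested; contributors in China and as far away as New York; engineers as well as PhDs and domain experts
- One contributor, [恩泽](http://www.longmotto.com), had not slept the night before and still pushed through to finish translating a document, going 20 hours without rest
- One teacher sent us revision suggestions for the translation right after stepping down from the lectern
- Many contributors volunteered to collect the problems they hit while setting up their environments into the FAQ to help others
- A single translation detail often went through several rounds of discussion with others before it was settled
## Continuous Improvement
Documentation for such a highly technical field will inevitably have imperfections in translation. We hope you will help us keep improving its quality so that it can help more people. Ways to contribute:
- Open an Issue or Pull Request on GitHub: [https://github.com/jikexueyuanwiki/tensorflow-zh](https://github.com/jikexueyuanwiki/tensorflow-zh)
- Join the TensorFlow technical discussion group to exchange technical know-how with other TensorFlow users: 782484288
- Interested in translating? Join the collaborative translation group, 248320884, and work on localizing TensorFlow with the translators
- Email us: wiki@jikexueyuan.com
## Acknowledgements
- [Jikexueyuan Wiki](http://wiki.jikexueyuan.com) provides hosting for the illustrated tutorials
## Offline Versions
Offline versions (PDF, ePub) are currently available for download and use
## Tex-PDF Revised Edition
The [Tex-PDF revised edition](tex_pdf) is currently being edited, and you are welcome to join the effort. You can see its latest state in the [preview](tex_pdf/tensorflow_manual_cn.pdf).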
================================================
FILE: README副本.md
================================================
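# TensorFlow Official Documentation, Chinese Edition

### The project you are translating may shape the world even more profoundly than Android!
## Origins
On November 9, 2015, Google announced on its official blog that Google Research was releasing TensorFlow, its second-generation machine learning system. It improves on the shortcomings of its predecessor, DistBelief, in many respects and, more importantly, it is open source, so anyone can use it.
Machine learning, as a branch of artificial intelligence, lets software describe or predict future situations based on large amounts of data. Today's leading technology giants are all investing heavily in machine learning: Facebook, Apple, Microsoft, and even Baidu in China. Google is naturally among them. TensorFlow is the machine learning system Google has used internally for years. Google is now making this system open source and publishing its parameters to industry engineers, academics, and technically skilled programmers. What does this mean?
To use a not entirely apt analogy, Google's treatment of TensorFlow today is somewhat like its treatment of its mobile operating system, Android. If more data scientists start using Google's system for machine learning research, Google will gain more influence over the growing machine learning industry.
To help developers in China master this world-leading AI system as quickly as possible, the Jikexueyuan Wiki team launched a collaborative Chinese translation of the official TensorFlow documentation.
Experts and enthusiasts in AI and machine learning are welcome to join this translation project. Every translator and proofreader will be credited by name.
## Get Involved
If you would like to contribute (translating or proofreading), please join QQ group 248320884. Thank you!
PS: To discuss TensorFlow itself, join the "TensorFlow technical discussion group": 495115006
## Content Sources
Official English website:
Official GitHub repository:
Chinese edition GitHub repository:
## Press Coverage of TensorFlow
[Ambition: just as it opened up Android, Google has now decided to open up its machine learning system, TensorFlow](http://www.huxiu.com/article/130763/1.html?f=wangzhan)
[The most powerful machine learning platform goes open source: is Google preparing to dominate artificial intelligence?](http://m.wallstreetcn.com/node/226012)
## How to Participate
* Fork the main repository ()
* Claim a chapter to translate (one chapter per request) or to proofread (several chapters may be requested): in the `README.md` below, find a chapter that nobody has claimed yet, add (@your GitHub username), and open a pull request against the `master` branch of the main repository;
* Once your pull request is approved and merged into the main repository, your *claim* on the chapter is complete and you can start translating or proofreading;
* The files to translate or proofread are the md files linked from README.md or TOC.md; please do not translate or proofread the index.md files in the individual folders
* While translating, follow the *translation collaboration guidelines* (see the next section); when you finish, open a pull request against the `master` branch of the main repository;
* When you finish proofreading, likewise open a pull request against the `master` branch of the main repository;
* Once all translation and proofreading is done, we will generate PDF/ePub documents, publish them on the Jikexueyuan Wiki platform, and credit every participant!
## Translation Collaboration Guidelines
To keep the collaboration smooth, everyone needs to follow these guidelines:
- If you are not familiar with Markdown and GitHub, first read [How to use Markdown](markdown.md) and [How to use GitHub](learn-github.md)
- Translate in Markdown; file names must be in English
- Put translated documents in the corresponding chapter under the SOURCE folder, then open a pull request
- Put any images from the text in the SOURCE/images directory
- Do not modify or translate HTML tags and code in the original text
- For any other questions, feel free to open an issue; we will reply as soon as we see it
- Translators must add the URL of the original text and their own name at the end of the translation, and proofreaders must add their own name at the end of the translation. See the sample for the exact format:
> Original: [Color Palettes](http://www.google.com/design/spec/resources/color-palettes.html) Translator: [iceskysl](https://github.com/iceskysl) Proofreader: [PoppinLp](https://github.com/poppinlp)
## Proofreading Guidelines
- When claiming a chapter to proofread, describe your professional background in machine learning or AI so that proofreading quality can be assured
- The text should read smoothly and be easy for humans to read and understand; it should not come across as machine translation
- Make sure every image displays correctly, that its relative path is SOURCE/images, and that the image is stored in the right place
- Technical terms must follow the glossary; for new terms you are unsure about, give the Chinese translation followed by the English in parentheses
- Proofreaders need relevant professional background to ensure the translation contains no technical errors
- Check that the translator has placed the link to the original and the translator link at the bottom of the translation; proofreaders must add their own name at the end of the translation. See the sample for the exact format:
> Original: [Color Palettes](http://www.google.com/design/spec/resources/color-palettes.html) Translator: [iceskysl](https://github.com/iceskysl) Proofreader: [PoppinLp](https://github.com/poppinlp)
- If you have any questions, open an Issue or discuss them in the collaborative translation group. After proofreading, open a PR and wait for it to be merged; once an administrator approves it, a √ is added after the chapter in the table of contents to mark it complete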
## Contributors (in order of claimed chapters)
### Translation & Proofreading
- Getting Started
  - [Introduction](get_started/introduction.md) Translator: ([@PFZheng](https://github.com/PFZheng)) √ Proofreader: ([@yangtze](https://github.com/sstruct)) √
  - [Download and Setup](get_started/os_setup.md) Translator: ([@PFZheng](https://github.com/PFZheng)) √ Proofreader: ([@yangtze](https://github.com/sstruct)) √
  - [Basic Usage](get_started/basic_usage.md) Translator: ([@PFZheng](https://github.com/PFZheng)) √ Proofreader: ([@yangtze](https://github.com/sstruct)) √
- Tutorials
  - [Overview](tutorials/overview.md) Translator: ([@PFZheng](https://github.com/PFZheng)) √ Proofreader: ([@ericxk](https://github.com/ericxk)) √
  - [MNIST for ML Beginners](tutorials/mnist_beginners.md) Translator: ([@Tony Jin](https://github.com/linbojin)) √ Proofreader: ([@ericxk](https://github.com/ericxk))
  - [Deep MNIST for Experts](tutorials/mnist_pros.md) Translator: ([@chenweican](https://github.com/chenweican)) √ Proofreader: ([@HongyangWang](https://github.com/WangHong-yang)) √
  - [TensorFlow Mechanics 101](tutorials/mnist_tf.md) Translator: ([@bingjin](https://github.com/bingjin)) √ Proofreader: ([@LichAmnesia](https://github.com/LichAmnesia)) √
  - [Convolutional Neural Networks](tutorials/deep_cnn.md) Translator: ([@oskycar](https://github.com/oskycar)) √ Proofreader: ([@zhyhooo](https://github.com/zhyhooo))
  - [Vector Representations of Words](tutorials/word2vec.md) Translator: ([@btpeter](https://github.com/btpeter)) √ Proofreader: ([@waiwaizheng](https://github.com/waiwaizheng))
  - [Recurrent Neural Networks](tutorials/recurrent.md) Translator: ([@Warln](https://github.com/Warln)) √ Proofreader: ([@HongyangWang](https://github.com/WangHong-yang))
  - [Mandelbrot Set](tutorials/mandelbrot.md) Translator: ([@ericxk](https://github.com/ericxk)) √ Proofreader: ([@tensorfly](https://github.com/tensorfly)) √
  - [Partial Differential Equations](tutorials/pdes.md) Translator: ([@wangaicc](https://github.com/wangaicc)) √ Proofreader: ([@tensorfly](https://github.com/tensorfly)) √
  - [MNIST Data Download](tutorials/mnist_download.md) Translator: ([@btpeter](https://github.com/btpeter)) √ Proofreader: ([@waiwaizheng](https://github.com/waiwaizheng))
- How-Tos
  - [Overview](how_tos/overview.md) Translator: ([@Terence Cooper](https://github.com/TerenceCooper)) √ Proofreader: ([@lonlonago](https://github.com/lonlonago)) √
  - [Variables: Creation, Initialization, Saving, and Loading](how_tos/variables.md) Translator: ([@zhyhooo](https://github.com/zhyhooo)) √ Proofreader: ([@jishaoming](https://github.com/jishaoming))
  - [TensorBoard: Visualizing Learning](how_tos/summaries_and_tensorboard.md) Translator: ([@thylaco1eo](https://github.com/thylaco1eo)) √ Proofreader: ([@lucky521](https://github.com/lucky521)) √
  - [TensorBoard: Graph Visualization](how_tos/graph_viz.md) Translator: ([@Warln](https://github.com/Warln)) √ Proofreader: ([@lucky521](https://github.com/lucky521)) √
  - [Reading Data](how_tos/reading_data.md) Translator: ([@volvet](https://github.com/volvet)) Proofreader: ([@allensummer](http://github.com/allensummer))
  - [Threading and Queues](how_tos/threading_and_queues.md) Translator: ([@zhangkom](https://github.com/zhangkom)) √ Proofreader: ([@volvet](https://github.com/volvet)) √
  - [Adding a New Op](how_tos/adding_an_op.md) Translator: ([@PFZheng](https://github.com/PFZheng)) √ Proofreader: ([@ZHNathanielLee](https://github.com/ZHNathanielLee)) √
  - [Custom Data Readers](how_tos/new_data_formats.md) Translator: ([@derekshang](https://github.com/derekshang)) √ Proofreader: ([@pengfoo](https://github.com/PengFoo))
  - [Using GPUs](how_tos/using_gpu.md) Translator: ([@lianghyv](https://github.com/lianghyv)) √ Proofreader: ([@qiaohaijun](https://github.com/qiaohaijun))
  - [Sharing Variables](how_tos/variable_scope.md) Translator: ([@nb312](https://github.com/nb312)) √ Proofreader: ([@pengfoo](https://github.com/PengFoo))
- Resources
  - [Overview](resources/overview.md) Translator: ([@Jim-Zenn](https://github.com/Jim-Zenn)) √ Proofreader: ([@lonlonago](https://github.com/lonlonago)) √
  - [BibTex Citation](resources/bib.md) Translator: ([@Jim-Zenn](https://github.com/Jim-Zenn)) √ Proofreader: ([@pengfoo](https://github.com/PengFoo))
  - [Example Uses](resources/uses.md) Translator: ([@andyiac](https://github.com/andyiac)) √ Proofreader: ([@lonlonago](https://github.com/lonlonago)) √
  - [FAQ](resources/faq.md) Translator: ([@Terence Cooper](https://github.com/TerenceCooper)) √ Proofreader: ([@pengfoo](https://github.com/PengFoo))
  - [Glossary](resources/glossary.md) Translator: ([@leege100](https://github.com/leege100)) √ Proofreader: ([@lonlonago](https://github.com/lonlonago)) √
  - [Tensor Ranks, Shapes, and Types](resources/dims_types.md) Translator: ([@nb312](https://github.com/nb312)) √ Proofreader: ([@lonlonago](https://github.com/lonlonago)) √
- Other
  - [Collected Frequently Asked Questions](faq.md)
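## Progress Log
- 2015-11-10: Google released its new artificial intelligence system TensorFlow as open source; Jikexueyuan Wiki started the collaborative translation, created the GitHub repository, and set the collaboration guidelines
- 2015-11-18: all chapters claimed, 18 chapters translated, 7 chapters claimed for proofreading, 361 stars, 100 forks, nearly 300 TF enthusiasts in the translation QQ group and the technical discussion group, ranked second in GitHub search results for TensorFlow
- 2015-12-10: passed 500 stars
## TensorFlow Resources
- [Google's official blog post announcing that TensorFlow is open source](https://googleblog.blogspot.com/2015/11/tensorflow-smarter-machine-learning-for.html)
- [TensorFlow white paper (PDF download)](http://download.tensorflow.org/paper/whitepaper2015.pdf)
- [Jeff Dean introduces TensorFlow (video)](https://www.youtube.com/watch?v=90-S1M7Ny_o&t=21m2s)
- [Scikit Flow, a simplified interface for TensorFlow](https://github.com/google/skflow)
- [TensorFlow usage examples](https://github.com/aymericdamien/TensorFlow-Examples)
- [TensorFlow compared with mxnet and caffe, by @chenrudan](http://chenrudan.github.io/blog/2015/11/18/comparethreeopenlib.html)
- [Memory usage and performance of TensorFlow compared with torch7, caffe, and mxnet, by @mli](https://github.com/dmlc/mxnet/blob/master/doc/overview_chn.md)
## Acknowledgements
## Offline Versions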
================================================
FILE: SOURCE/api_docs/cc/ClassEnv.md
================================================
# Class `tensorflow::Env`
An interface used by the tensorflow implementation to access operating system functionality like the filesystem etc.
Callers may wish to provide a custom Env object to get fine-grained control.
All Env implementations are safe for concurrent access from multiple threads without any external synchronization.
## Member Summary
* [`tensorflow::Env::Env()`](#tensorflow_Env_Env)
* [`virtual tensorflow::Env::~Env()`](#virtual_tensorflow_Env_Env)
* [`virtual Status tensorflow::Env::NewRandomAccessFile(const string &fname, RandomAccessFile **result)=0`](#virtual_Status_tensorflow_Env_NewRandomAccessFile)
* Creates a brand new random access read-only file with the specified name.
* [`virtual Status tensorflow::Env::NewWritableFile(const string &fname, WritableFile **result)=0`](#virtual_Status_tensorflow_Env_NewWritableFile)
* Creates an object that writes to a new file with the specified name.
* [`virtual Status tensorflow::Env::NewAppendableFile(const string &fname, WritableFile **result)=0`](#virtual_Status_tensorflow_Env_NewAppendableFile)
* Creates an object that either appends to an existing file, or writes to a new file (if the file does not exist to begin with).
* [`virtual bool tensorflow::Env::FileExists(const string &fname)=0`](#virtual_bool_tensorflow_Env_FileExists)
* Returns true iff the named file exists.
* [`virtual Status tensorflow::Env::GetChildren(const string &dir, std::vector< string > *result)=0`](#virtual_Status_tensorflow_Env_GetChildren)
* Stores in *result the names of the children of the specified directory. The names are relative to "dir".
* [`virtual Status tensorflow::Env::DeleteFile(const string &fname)=0`](#virtual_Status_tensorflow_Env_DeleteFile)
* Deletes the named file.
* [`virtual Status tensorflow::Env::CreateDir(const string &dirname)=0`](#virtual_Status_tensorflow_Env_CreateDir)
* Creates the specified directory.
* [`virtual Status tensorflow::Env::DeleteDir(const string &dirname)=0`](#virtual_Status_tensorflow_Env_DeleteDir)
* Deletes the specified directory.
* [`virtual Status tensorflow::Env::GetFileSize(const string &fname, uint64 *file_size)=0`](#virtual_Status_tensorflow_Env_GetFileSize)
* Stores the size of fname in *file_size.
* [`virtual Status tensorflow::Env::RenameFile(const string &src, const string &target)=0`](#virtual_Status_tensorflow_Env_RenameFile)
* Renames file src to target. If target already exists, it will be replaced.
* [`virtual uint64 tensorflow::Env::NowMicros()=0`](#virtual_uint64_tensorflow_Env_NowMicros)
* Returns the number of micro-seconds since some fixed point in time. Only useful for computing deltas of time.
* [`virtual void tensorflow::Env::SleepForMicroseconds(int micros)=0`](#virtual_void_tensorflow_Env_SleepForMicroseconds)
* Sleeps/delays the thread for the prescribed number of micro-seconds.
* [`virtual Thread* tensorflow::Env::StartThread(const ThreadOptions &thread_options, const string &name, std::function< void()> fn) TF_MUST_USE_RESULT=0`](#virtual_Thread_tensorflow_Env_StartThread)
* Returns a new thread that is running fn() and is identified (for debugging/performance-analysis) by "name".
* [`static Env* tensorflow::Env::Default()`](#static_Env_tensorflow_Env_Default)
* Returns a default environment suitable for the current operating system.
## Member Details
#### `tensorflow::Env::Env()`
#### `virtual tensorflow::Env::~Env()`
#### `virtual Status tensorflow::Env::NewRandomAccessFile(const string &fname, RandomAccessFile **result)=0`
Creates a brand new random access read-only file with the specified name.
On success, stores a pointer to the new file in *result and returns OK. On failure stores NULL in *result and returns non-OK. If the file does not exist, returns a non-OK status.
The returned file may be concurrently accessed by multiple threads.
#### `virtual Status tensorflow::Env::NewWritableFile(const string &fname, WritableFile **result)=0`
Creates an object that writes to a new file with the specified name.
Deletes any existing file with the same name and creates a new file. On success, stores a pointer to the new file in *result and returns OK. On failure stores NULL in *result and returns non-OK.
The returned file will only be accessed by one thread at a time.
#### `virtual Status tensorflow::Env::NewAppendableFile(const string &fname, WritableFile **result)=0`
Creates an object that either appends to an existing file, or writes to a new file (if the file does not exist to begin with).
On success, stores a pointer to the new file in *result and returns OK. On failure stores NULL in *result and returns non-OK.
The returned file will only be accessed by one thread at a time.
#### `virtual bool tensorflow::Env::FileExists(const string &fname)=0`
Returns true iff the named file exists.
#### `virtual Status tensorflow::Env::GetChildren(const string &dir, std::vector< string > *result)=0`
Stores in *result the names of the children of the specified directory. The names are relative to "dir".
Original contents of *result are dropped.
#### `virtual Status tensorflow::Env::DeleteFile(const string &fname)=0`
Deletes the named file.
#### `virtual Status tensorflow::Env::CreateDir(const string &dirname)=0`
Creates the specified directory.
#### `virtual Status tensorflow::Env::DeleteDir(const string &dirname)=0`
Deletes the specified directory.
#### `virtual Status tensorflow::Env::GetFileSize(const string &fname, uint64 *file_size)=0`
Stores the size of fname in *file_size.
#### `virtual Status tensorflow::Env::RenameFile(const string &src, const string &target)=0`
Renames file src to target. If target already exists, it will be replaced.
#### `virtual uint64 tensorflow::Env::NowMicros()=0`
Returns the number of micro-seconds since some fixed point in time. Only useful for computing deltas of time.
#### `virtual void tensorflow::Env::SleepForMicroseconds(int micros)=0`
Sleeps/delays the thread for the prescribed number of micro-seconds.
#### `virtual Thread* tensorflow::Env::StartThread(const ThreadOptions &thread_options, const string &name, std::function< void()> fn) TF_MUST_USE_RESULT=0`
Returns a new thread that is running fn() and is identified (for debugging/performance-analysis) by "name".
Caller takes ownership of the result and must delete it eventually (the deletion will block until fn() stops running).
#### `static Env* tensorflow::Env::Default()`
Returns a default environment suitable for the current operating system.
Sophisticated users may wish to provide their own Env implementation instead of relying on this default environment.
The result of Default() belongs to this library and must never be deleted.
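
The out-parameter convention used by the file-creating methods above (store a pointer in `*result` and return OK on success; store NULL and return non-OK on failure) can be sketched without TensorFlow itself. Everything in the block below is a hypothetical stand-in written for illustration: `MiniStatus`, `MiniFile`, `MapEnv`, and `OpenOrNull` are not part of the TensorFlow API, and the sketch assumes that the caller owns the returned object, as this page states explicitly only for `StartThread`.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

// Hypothetical stand-in for tensorflow::Status (not the real class).
struct MiniStatus {
  bool ok_flag;
  std::string message;
  bool ok() const { return ok_flag; }
};

// Hypothetical stand-in for a file object.
struct MiniFile {
  std::string contents;
};

// Toy environment backed by an in-memory map instead of a filesystem.
class MapEnv {
 public:
  void AddFile(const std::string& fname, const std::string& contents) {
    files_[fname] = contents;
  }

  // Mirrors the documented contract: on success *result points at a new
  // object and the status is OK; on failure *result is set to nullptr.
  MiniStatus NewRandomAccessFile(const std::string& fname,
                                 MiniFile** result) const {
    auto it = files_.find(fname);
    if (it == files_.end()) {
      *result = nullptr;
      return MiniStatus{false, "not found: " + fname};
    }
    *result = new MiniFile{it->second};
    return MiniStatus{true, ""};
  }

 private:
  std::map<std::string, std::string> files_;
};

// Caller-side helper (assumption: the caller owns the result), which wraps
// the raw pointer in a unique_ptr right away so it is always freed.
std::unique_ptr<MiniFile> OpenOrNull(const MapEnv& env,
                                     const std::string& fname) {
  MiniFile* raw = nullptr;
  MiniStatus s = env.NewRandomAccessFile(fname, &raw);
  if (!s.ok()) return nullptr;
  return std::unique_ptr<MiniFile>(raw);
}
```

Checking the status before touching `*result` keeps the failure path explicit, and taking ownership immediately avoids leaking the object on early returns.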
================================================
FILE: SOURCE/api_docs/cc/ClassEnvWrapper.md
================================================
# Class `tensorflow::EnvWrapper`
An implementation of Env that forwards all calls to another Env.
May be useful to clients who wish to override just part of the functionality of another Env.
## Member Summary
* [`tensorflow::EnvWrapper::EnvWrapper(Env *t)`](#tensorflow_EnvWrapper_EnvWrapper)
* Initializes an EnvWrapper that delegates all calls to *t.
* [`virtual tensorflow::EnvWrapper::~EnvWrapper()`](#virtual_tensorflow_EnvWrapper_EnvWrapper)
* [`Env* tensorflow::EnvWrapper::target() const`](#Env_tensorflow_EnvWrapper_target)
* Returns the target to which this Env forwards all calls.
* [`Status tensorflow::EnvWrapper::NewRandomAccessFile(const string &f, RandomAccessFile **r) override`](#Status_tensorflow_EnvWrapper_NewRandomAccessFile)
* Creates a brand new random access read-only file with the specified name.
* [`Status tensorflow::EnvWrapper::NewWritableFile(const string &f, WritableFile **r) override`](#Status_tensorflow_EnvWrapper_NewWritableFile)
* Creates an object that writes to a new file with the specified name.
* [`Status tensorflow::EnvWrapper::NewAppendableFile(const string &f, WritableFile **r) override`](#Status_tensorflow_EnvWrapper_NewAppendableFile)
* Creates an object that either appends to an existing file, or writes to a new file (if the file does not exist to begin with).
* [`bool tensorflow::EnvWrapper::FileExists(const string &f) override`](#bool_tensorflow_EnvWrapper_FileExists)
* Returns true iff the named file exists.
* [`Status tensorflow::EnvWrapper::GetChildren(const string &dir, std::vector< string > *r) override`](#Status_tensorflow_EnvWrapper_GetChildren)
* Stores in *result the names of the children of the specified directory. The names are relative to "dir".
* [`Status tensorflow::EnvWrapper::DeleteFile(const string &f) override`](#Status_tensorflow_EnvWrapper_DeleteFile)
* Deletes the named file.
* [`Status tensorflow::EnvWrapper::CreateDir(const string &d) override`](#Status_tensorflow_EnvWrapper_CreateDir)
* Creates the specified directory.
* [`Status tensorflow::EnvWrapper::DeleteDir(const string &d) override`](#Status_tensorflow_EnvWrapper_DeleteDir)
* Deletes the specified directory.
* [`Status tensorflow::EnvWrapper::GetFileSize(const string &f, uint64 *s) override`](#Status_tensorflow_EnvWrapper_GetFileSize)
* Stores the size of fname in *file_size.
* [`Status tensorflow::EnvWrapper::RenameFile(const string &s, const string &t) override`](#Status_tensorflow_EnvWrapper_RenameFile)
* Renames file src to target. If target already exists, it will be replaced.
* [`uint64 tensorflow::EnvWrapper::NowMicros() override`](#uint64_tensorflow_EnvWrapper_NowMicros)
* Returns the number of micro-seconds since some fixed point in time. Only useful for computing deltas of time.
* [`void tensorflow::EnvWrapper::SleepForMicroseconds(int micros) override`](#void_tensorflow_EnvWrapper_SleepForMicroseconds)
* Sleeps/delays the thread for the prescribed number of micro-seconds.
* [`Thread* tensorflow::EnvWrapper::StartThread(const ThreadOptions &thread_options, const string &name, std::function< void()> fn) override`](#Thread_tensorflow_EnvWrapper_StartThread)
* Returns a new thread that is running fn() and is identified (for debugging/performance-analysis) by "name".
## Member Details
#### `tensorflow::EnvWrapper::EnvWrapper(Env *t)`
Initializes an EnvWrapper that delegates all calls to *t.
#### `virtual tensorflow::EnvWrapper::~EnvWrapper()`
#### `Env* tensorflow::EnvWrapper::target() const`
Returns the target to which this Env forwards all calls.
#### `Status tensorflow::EnvWrapper::NewRandomAccessFile(const string &f, RandomAccessFile **r) override`
Creates a brand new random access read-only file with the specified name.
On success, stores a pointer to the new file in *result and returns OK. On failure stores NULL in *result and returns non-OK. If the file does not exist, returns a non-OK status.
The returned file may be concurrently accessed by multiple threads.
#### `Status tensorflow::EnvWrapper::NewWritableFile(const string &f, WritableFile **r) override`
Creates an object that writes to a new file with the specified name.
Deletes any existing file with the same name and creates a new file. On success, stores a pointer to the new file in *result and returns OK. On failure stores NULL in *result and returns non-OK.
The returned file will only be accessed by one thread at a time.
#### `Status tensorflow::EnvWrapper::NewAppendableFile(const string &f, WritableFile **r) override`
Creates an object that either appends to an existing file, or writes to a new file (if the file does not exist to begin with).
On success, stores a pointer to the new file in *result and returns OK. On failure stores NULL in *result and returns non-OK.
The returned file will only be accessed by one thread at a time.
#### `bool tensorflow::EnvWrapper::FileExists(const string &f) override`
Returns true iff the named file exists.
#### `Status tensorflow::EnvWrapper::GetChildren(const string &dir, std::vector< string > *r) override`
Stores in *result the names of the children of the specified directory. The names are relative to "dir".
Original contents of *result are dropped.
#### `Status tensorflow::EnvWrapper::DeleteFile(const string &f) override`
Deletes the named file.
#### `Status tensorflow::EnvWrapper::CreateDir(const string &d) override`
Creates the specified directory.
#### `Status tensorflow::EnvWrapper::DeleteDir(const string &d) override`
Deletes the specified directory.
#### `Status tensorflow::EnvWrapper::GetFileSize(const string &f, uint64 *s) override`
Stores the size of fname in *file_size.
#### `Status tensorflow::EnvWrapper::RenameFile(const string &s, const string &t) override`
Renames file src to target. If target already exists, it will be replaced.
#### `uint64 tensorflow::EnvWrapper::NowMicros() override`
Returns the number of micro-seconds since some fixed point in time. Only useful for computing deltas of time.
#### `void tensorflow::EnvWrapper::SleepForMicroseconds(int micros) override`
Sleeps/delays the thread for the prescribed number of micro-seconds.
#### `Thread* tensorflow::EnvWrapper::StartThread(const ThreadOptions &thread_options, const string &name, std::function< void()> fn) override`
Returns a new thread that is running fn() and is identified (for debugging/performance-analysis) by "name".
Caller takes ownership of the result and must delete it eventually (the deletion will block until fn() stops running).
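
As a rough illustration of why a forwarding wrapper is useful, the sketch below overrides a single method (the clock) while every other call still reaches the wrapped object. It does not use TensorFlow: `MiniEnv`, `StubEnv`, `MiniEnvWrapper`, and `FakeClockEnv` are hypothetical, heavily reduced stand-ins for `Env` and `EnvWrapper`, with only two of the methods listed above.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical, reduced stand-in for tensorflow::Env (two methods only).
class MiniEnv {
 public:
  virtual ~MiniEnv() = default;
  virtual bool FileExists(const std::string& fname) = 0;
  virtual uint64_t NowMicros() = 0;
};

// A trivial base environment used only for this illustration.
class StubEnv : public MiniEnv {
 public:
  bool FileExists(const std::string& fname) override {
    return fname == "present.txt";
  }
  uint64_t NowMicros() override { return 1000; }
};

// Forwards every call to the wrapped target, in the spirit of EnvWrapper.
class MiniEnvWrapper : public MiniEnv {
 public:
  explicit MiniEnvWrapper(MiniEnv* t) : target_(t) {}
  MiniEnv* target() const { return target_; }
  bool FileExists(const std::string& f) override {
    return target_->FileExists(f);
  }
  uint64_t NowMicros() override { return target_->NowMicros(); }

 private:
  MiniEnv* target_;  // Not owned by the wrapper.
};

// Overrides only the clock; FileExists() is still forwarded to the target.
class FakeClockEnv : public MiniEnvWrapper {
 public:
  using MiniEnvWrapper::MiniEnvWrapper;
  uint64_t NowMicros() override { return now_; }
  void Advance(uint64_t micros) { now_ += micros; }

 private:
  uint64_t now_ = 0;
};
```

A test could construct `FakeClockEnv` around a `StubEnv`, call `Advance(250)`, and observe a delta of 250 from `NowMicros()`, which matches the note above that the clock is only meaningful for computing time differences.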
================================================
FILE: SOURCE/api_docs/cc/ClassRandomAccessFile.md
================================================
# Class `tensorflow::RandomAccessFile`
A file abstraction for randomly reading the contents of a file.
## Member Summary
* [`tensorflow::RandomAccessFile::RandomAccessFile()`](#tensorflow_RandomAccessFile_RandomAccessFile)
* [`virtual tensorflow::RandomAccessFile::~RandomAccessFile()`](#virtual_tensorflow_RandomAccessFile_RandomAccessFile)
* [`virtual Status tensorflow::RandomAccessFile::Read(uint64 offset, size_t n, StringPiece *result, char *scratch) const =0`](#virtual_Status_tensorflow_RandomAccessFile_Read)
* Reads up to "n" bytes from the file starting at "offset".
## Member Details
#### `tensorflow::RandomAccessFile::RandomAccessFile()`
#### `virtual tensorflow::RandomAccessFile::~RandomAccessFile()`
#### `virtual Status tensorflow::RandomAccessFile::Read(uint64 offset, size_t n, StringPiece *result, char *scratch) const =0`
Reads up to "n" bytes from the file starting at "offset".
"scratch[0..n-1]" may be written by this routine. Sets "*result" to the data that was read (including if fewer than "n" bytes were successfully read). May set "*result" to point at data in "scratch[0..n-1]", so "scratch[0..n-1]" must be live when "*result" is used.
On OK returned status: "n" bytes have been stored in "*result". On non-OK returned status: [0..n] bytes have been stored in "*result".
Returns `OUT_OF_RANGE` if fewer than n bytes were stored in "*result" because of EOF.
Safe for concurrent use by multiple threads.
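
The `Read()` contract above (a caller-supplied scratch buffer, a result that may point into it, and `OUT_OF_RANGE` on a short read at end of file) can be sketched with an in-memory file. This is a minimal illustration under stated assumptions, not the TensorFlow implementation: `Code`, `Piece`, and `InMemoryFile` are hypothetical stand-ins for `Status`, `StringPiece`, and a concrete `RandomAccessFile`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <utility>

// Hypothetical stand-in for the status codes relevant to Read().
enum class Code { OK, OUT_OF_RANGE };

// Hypothetical stand-in for StringPiece: a non-owning view of bytes.
struct Piece {
  const char* data = nullptr;
  size_t size = 0;
  std::string ToString() const { return std::string(data, size); }
};

// In-memory file that follows the Read() contract described above.
class InMemoryFile {
 public:
  explicit InMemoryFile(std::string contents)
      : contents_(std::move(contents)) {}

  // Copies up to n bytes starting at offset into scratch and points *result
  // at them. Returns OUT_OF_RANGE when fewer than n bytes were available.
  Code Read(uint64_t offset, size_t n, Piece* result, char* scratch) const {
    size_t available = 0;
    if (offset < contents_.size()) {
      available = contents_.size() - static_cast<size_t>(offset);
    }
    size_t to_copy = std::min(n, available);
    if (to_copy > 0) {
      std::memcpy(scratch, contents_.data() + offset, to_copy);
    }
    result->data = scratch;  // scratch must outlive any use of *result.
    result->size = to_copy;
    return to_copy < n ? Code::OUT_OF_RANGE : Code::OK;
  }

 private:
  std::string contents_;
};
```

Because the result only views the scratch buffer, a caller that needs the bytes after the buffer is reused has to copy them out first, for example with `ToString()` in this sketch.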
================================================
FILE: SOURCE/api_docs/cc/ClassSession.md
================================================
# Class `tensorflow::Session`
A Session instance lets a caller drive a TensorFlow graph computation.
When a Session is created with a given target, a new Session object is bound to the universe of resources specified by that target. Those resources are available to this session to perform computation described in the GraphDef. After extending the session with a graph, the caller uses the Run() API to perform the computation and potentially fetch outputs as Tensors.
Example:
```c++
tensorflow::GraphDef graph;
// ... Create or load graph into "graph".
// This example uses the default options, which connect
// to a local runtime.
tensorflow::SessionOptions options;
std::unique_ptr<tensorflow::Session>
session(tensorflow::NewSession(options));
// Create the session with this graph.
tensorflow::Status s = session->Create(graph);
if (!s.ok()) { ... }
// Run the graph and fetch the first output of the "output"
// operation, and also run to but do not return anything
// for the "update_state" operation.
std::vector<tensorflow::Tensor> outputs;
s = session->Run({}, {"output:0"}, {"update_state"}, &outputs);
if (!s.ok()) { ... }
// Map the output as a flattened float tensor, and do something
// with it.
auto output_tensor = outputs[0].flat<float>();
if (output_tensor(0) > 0.5) { ... }
// Close the session to release the resources associated with
// this session.
session->Close();
```
A Session allows concurrent calls to `Run()`, though a Session must be created / extended by a single thread.
Only one thread must call `Close()`, and `Close()` must only be called after all other calls to `Run()` have returned.
## Member Summary
* [`virtual Status tensorflow::Session::Create(const GraphDef &graph)=0`](#virtual_Status_tensorflow_Session_Create)
* Create the graph to be used for the session.
* [`virtual Status tensorflow::Session::Extend(const GraphDef &graph)=0`](#virtual_Status_tensorflow_Session_Extend)
* Adds operations to the graph that is already registered with the Session.
* [`virtual Status tensorflow::Session::Run(const std::vector< std::pair< string, Tensor > > &inputs, const std::vector< string > &output_tensor_names, const std::vector< string > &target_node_names, std::vector< Tensor > *outputs)=0`](#virtual_Status_tensorflow_Session_Run)
* Runs the graph with the provided input tensors and fills `outputs` for the endpoints specified in `output_tensor_names`. Runs to but does not return Tensors for the nodes in `target_node_names`.
* [`virtual Status tensorflow::Session::Close()=0`](#virtual_Status_tensorflow_Session_Close)
* Closes this session.
* [`virtual tensorflow::Session::~Session()`](#virtual_tensorflow_Session_Session)
## Member Details
#### `virtual Status tensorflow::Session::Create(const GraphDef &graph)=0`
Create the graph to be used for the session.
Returns an error if this session has already been created with a graph. To re-use the session with a different graph, the caller must Close() the session first.
#### `virtual Status tensorflow::Session::Extend(const GraphDef &graph)=0`
Adds operations to the graph that is already registered with the Session.
The names of new operations in "graph" must not exist in the graph that is already registered.
#### `virtual Status tensorflow::Session::Run(const std::vector< std::pair< string, Tensor > > &inputs, const std::vector< string > &output_tensor_names, const std::vector< string > &target_node_names, std::vector< Tensor > *outputs)=0`
Runs the graph with the provided input tensors and fills `outputs` for the endpoints specified in `output_tensor_names`. Runs to but does not return Tensors for the nodes in `target_node_names`.
The order of tensors in `outputs` will match the order provided by `output_tensor_names`.
If `Run` returns `OK()`, then `outputs->size()` will be equal to `output_tensor_names.size()`. If `Run` does not return `OK()`, the state of `outputs` is undefined.
REQUIRES: The name of each Tensor of the input or output must match a "Tensor endpoint" in the `GraphDef` passed to `Create()`.
REQUIRES: outputs is not nullptr if `output_tensor_names` is non-empty.
#### `virtual Status tensorflow::Session::Close()=0`
Closes this session.
Closing a session releases the resources used by this session on the TensorFlow runtime (specified during session creation by the `SessionOptions::target` field).
#### `virtual tensorflow::Session::~Session()`
================================================
FILE: SOURCE/api_docs/cc/ClassStatus.md
================================================
# Class `tensorflow::Status`
## Member Summary
* [`tensorflow::Status::Status()`](#tensorflow_Status_Status)
* Create a success status.
* [`tensorflow::Status::~Status()`](#tensorflow_Status_Status)
* [`tensorflow::Status::Status(tensorflow::error::Code code, tensorflow::StringPiece msg)`](#tensorflow_Status_Status)
* Create a status with the specified error code and msg as a human-readable string containing more detailed information.
* [`tensorflow::Status::Status(const Status &s)`](#tensorflow_Status_Status)
* Copy the specified status.
* [`void tensorflow::Status::operator=(const Status &s)`](#void_tensorflow_Status_operator_)
* [`bool tensorflow::Status::ok() const`](#bool_tensorflow_Status_ok)
* Returns true iff the status indicates success.
* [`tensorflow::error::Code tensorflow::Status::code() const`](#tensorflow_error_Code_tensorflow_Status_code)
* [`const string& tensorflow::Status::error_message() const`](#const_string_tensorflow_Status_error_message)
* [`bool tensorflow::Status::operator==(const Status &x) const`](#bool_tensorflow_Status_operator_)
* [`bool tensorflow::Status::operator!=(const Status &x) const`](#bool_tensorflow_Status_operator_)
* [`void tensorflow::Status::Update(const Status &new_status)`](#void_tensorflow_Status_Update)
* If `ok()`, stores `new_status` into `*this`. If `!ok()`, preserves the current status, but may augment with additional information about `new_status`.
* [`string tensorflow::Status::ToString() const`](#string_tensorflow_Status_ToString)
* Return a string representation of this status suitable for printing. Returns the string `"OK"` for success.
* [`static Status tensorflow::Status::OK()`](#static_Status_tensorflow_Status_OK)
## Member Details
#### `tensorflow::Status::Status()`
Create a success status.
#### `tensorflow::Status::~Status()`
#### `tensorflow::Status::Status(tensorflow::error::Code code, tensorflow::StringPiece msg)`
Create a status with the specified error code and msg as a human-readable string containing more detailed information.
#### `tensorflow::Status::Status(const Status &s)`
Copy the specified status.
#### `void tensorflow::Status::operator=(const Status &s)`
#### `bool tensorflow::Status::ok() const`
Returns true iff the status indicates success.
#### `tensorflow::error::Code tensorflow::Status::code() const`
#### `const string& tensorflow::Status::error_message() const`
#### `bool tensorflow::Status::operator==(const Status &x) const`
#### `bool tensorflow::Status::operator!=(const Status &x) const`
#### `void tensorflow::Status::Update(const Status &new_status)`
If `ok()`, stores `new_status` into `*this`. If `!ok()`, preserves the current status, but may augment with additional information about `new_status`.
Convenient way of keeping track of the first error encountered. Instead of: `if (overall_status.ok()) overall_status = new_status` Use: `overall_status.Update(new_status);`
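The first-error pattern can be sketched without TensorFlow as follows. `MiniStatus` and `FirstError` are hypothetical names used only for this illustration, and the integer error codes are arbitrary. The sketch keeps the current error untouched when it is already non-OK and does not model the optional augmentation mentioned above.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for illustration only; not tensorflow::Status.
class MiniStatus {
 public:
  MiniStatus() : code_(0) {}  // success
  MiniStatus(int code, std::string msg)
      : code_(code), msg_(std::move(msg)) {
    assert(code != 0);  // in this sketch, code 0 is reserved for success
  }

  bool ok() const { return code_ == 0; }
  int code() const { return code_; }
  const std::string& error_message() const { return msg_; }

  // Keeps the first error: overwrites *this only while it is still OK.
  void Update(const MiniStatus& new_status) {
    if (ok()) *this = new_status;
  }

 private:
  int code_;
  std::string msg_;
};

// Folds a sequence of results into the first failure, or OK if none failed.
inline MiniStatus FirstError(const std::vector<MiniStatus>& results) {
  MiniStatus overall;
  for (const MiniStatus& r : results) overall.Update(r);
  return overall;
}
```

Calling `Update()` after every step lets a function run all of its cleanup work and still report the earliest failure at the end.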
#### `string tensorflow::Status::ToString() const`
Return a string representation of this status suitable for printing. Returns the string `"OK"` for success.
#### `static Status tensorflow::Status::OK()`
================================================
FILE: SOURCE/api_docs/cc/ClassTensor.md
================================================
# Class `tensorflow::Tensor`
Represents an n-dimensional array of values.
## Member Summary
* [`tensorflow::Tensor::Tensor()`](#tensorflow_Tensor_Tensor)
* Default Tensor constructor. Creates a 1-dimension, 0-element float tensor.
* [`tensorflow::Tensor::Tensor(DataType type, const TensorShape &shape)`](#tensorflow_Tensor_Tensor)
* Creates a Tensor of the given `type` and `shape`.
* [`tensorflow::Tensor::Tensor(Allocator *a, DataType type, const TensorShape &shape)`](#tensorflow_Tensor_Tensor)
* Creates a tensor with the input `type` and `shape`, using the allocator `a` to allocate the underlying buffer.
* [`tensorflow::Tensor::Tensor(DataType type)`](#tensorflow_Tensor_Tensor)
* Creates an uninitialized Tensor of the given data type.
* [`tensorflow::Tensor::Tensor(const Tensor &other)`](#tensorflow_Tensor_Tensor)
* Copy constructor.
* [`tensorflow::Tensor::~Tensor()`](#tensorflow_Tensor_Tensor)
* [`DataType tensorflow::Tensor::dtype() const`](#DataType_tensorflow_Tensor_dtype)
* Returns the data type.
* [`const TensorShape& tensorflow::Tensor::shape() const`](#const_TensorShape_tensorflow_Tensor_shape)
* Returns the shape of the tensor.
* [`int tensorflow::Tensor::dims() const`](#int_tensorflow_Tensor_dims)
* Convenience accessor for the tensor shape.
* [`int64 tensorflow::Tensor::dim_size(int d) const`](#int64_tensorflow_Tensor_dim_size)
* Convenience accessor for the tensor shape.
* [`int64 tensorflow::Tensor::NumElements() const`](#int64_tensorflow_Tensor_NumElements)
* Convenience accessor for the tensor shape.
* [`bool tensorflow::Tensor::IsSameSize(const Tensor &b) const`](#bool_tensorflow_Tensor_IsSameSize)
* [`bool tensorflow::Tensor::IsInitialized() const`](#bool_tensorflow_Tensor_IsInitialized)
* Has this Tensor been initialized?
* [`size_t tensorflow::Tensor::TotalBytes() const`](#size_t_tensorflow_Tensor_TotalBytes)
* Returns the estimated memory usage of this tensor.
* [`Tensor& tensorflow::Tensor::operator=(const Tensor &other)`](#Tensor_tensorflow_Tensor_operator_)
* Assign operator. This tensor shares other's underlying storage.
* [`bool tensorflow::Tensor::CopyFrom(const Tensor &other, const TensorShape &shape) TF_MUST_USE_RESULT`](#bool_tensorflow_Tensor_CopyFrom)
* Copy the other tensor into this tensor and reshape it.
* [`Tensor tensorflow::Tensor::Slice(int64 dim0_start, int64 dim0_limit) const`](#Tensor_tensorflow_Tensor_Slice)
* Slice this tensor along the 1st dimension.
* [`bool tensorflow::Tensor::FromProto(const TensorProto &other) TF_MUST_USE_RESULT`](#bool_tensorflow_Tensor_FromProto)
* Parse `other` and construct the tensor.
* [`bool tensorflow::Tensor::FromProto(Allocator *a, const TensorProto &other) TF_MUST_USE_RESULT`](#bool_tensorflow_Tensor_FromProto)
* [`void tensorflow::Tensor::AsProtoField(TensorProto *proto) const`](#void_tensorflow_Tensor_AsProtoField)
* Fills in `proto` with `*this` tensor's content.
* [`void tensorflow::Tensor::AsProtoTensorContent(TensorProto *proto) const`](#void_tensorflow_Tensor_AsProtoTensorContent)
* [`TTypes< T >::Vec tensorflow::Tensor::vec()`](#TTypes_T_Vec_tensorflow_Tensor_vec)
* Return the tensor data as an `Eigen::Tensor` with the type and sizes of this `Tensor`.
* [`TTypes< T >::Matrix tensorflow::Tensor::matrix()`](#TTypes_T_Matrix_tensorflow_Tensor_matrix)
* [`TTypes< T, NDIMS >::Tensor tensorflow::Tensor::tensor()`](#TTypes_T_NDIMS_Tensor_tensorflow_Tensor_tensor)
* [`TTypes< T >::Flat tensorflow::Tensor::flat()`](#TTypes_T_Flat_tensorflow_Tensor_flat)
* Return the tensor data as an `Eigen::Tensor` of the data type and a specified shape.
* [`TTypes< T >::UnalignedFlat tensorflow::Tensor::unaligned_flat()`](#TTypes_T_UnalignedFlat_tensorflow_Tensor_unaligned_flat)
* [`TTypes< T >::Matrix tensorflow::Tensor::flat_inner_dims()`](#TTypes_T_Matrix_tensorflow_Tensor_flat_inner_dims)
* [`TTypes< T >::Matrix tensorflow::Tensor::flat_outer_dims()`](#TTypes_T_Matrix_tensorflow_Tensor_flat_outer_dims)
* [`TTypes< T, NDIMS >::Tensor tensorflow::Tensor::shaped(gtl::ArraySlice< int64 > new_sizes)`](#TTypes_T_NDIMS_Tensor_tensorflow_Tensor_shaped)
* [`TTypes< T, NDIMS >::UnalignedTensor tensorflow::Tensor::unaligned_shaped(gtl::ArraySlice< int64 > new_sizes)`](#TTypes_T_NDIMS_UnalignedTensor_tensorflow_Tensor_unaligned_shaped)
* [`TTypes< T >::Scalar tensorflow::Tensor::scalar()`](#TTypes_T_Scalar_tensorflow_Tensor_scalar)
* Return the Tensor data as a `TensorMap` of fixed size 1: `TensorMap<TensorFixedSize<T, 1>>`.
* [`TTypes< T >::ConstVec tensorflow::Tensor::vec() const`](#TTypes_T_ConstVec_tensorflow_Tensor_vec)
* Const versions of all the methods above.
* [`TTypes< T >::ConstMatrix tensorflow::Tensor::matrix() const`](#TTypes_T_ConstMatrix_tensorflow_Tensor_matrix)
* [`TTypes< T, NDIMS >::ConstTensor tensorflow::Tensor::tensor() const`](#TTypes_T_NDIMS_ConstTensor_tensorflow_Tensor_tensor)
* [`TTypes< T >::ConstFlat tensorflow::Tensor::flat() const`](#TTypes_T_ConstFlat_tensorflow_Tensor_flat)
* [`TTypes< T >::UnalignedConstFlat tensorflow::Tensor::unaligned_flat() const`](#TTypes_T_UnalignedConstFlat_tensorflow_Tensor_unaligned_flat)
* [`TTypes< T >::ConstMatrix tensorflow::Tensor::flat_inner_dims() const`](#TTypes_T_ConstMatrix_tensorflow_Tensor_flat_inner_dims)
* [`TTypes< T >::ConstMatrix tensorflow::Tensor::flat_outer_dims() const`](#TTypes_T_ConstMatrix_tensorflow_Tensor_flat_outer_dims)
* [`TTypes< T, NDIMS >::ConstTensor tensorflow::Tensor::shaped(gtl::ArraySlice< int64 > new_sizes) const`](#TTypes_T_NDIMS_ConstTensor_tensorflow_Tensor_shaped)
* [`TTypes< T, NDIMS >::UnalignedConstTensor tensorflow::Tensor::unaligned_shaped(gtl::ArraySlice< int64 > new_sizes) const`](#TTypes_T_NDIMS_UnalignedConstTensor_tensorflow_Tensor_unaligned_shaped)
* [`TTypes< T >::ConstScalar tensorflow::Tensor::scalar() const`](#TTypes_T_ConstScalar_tensorflow_Tensor_scalar)
* [`string tensorflow::Tensor::SummarizeValue(int64 max_entries) const`](#string_tensorflow_Tensor_SummarizeValue)
* Render the first `max_entries` values in `*this` into a string.
* [`string tensorflow::Tensor::DebugString() const`](#string_tensorflow_Tensor_DebugString)
* A human-readable summary of the tensor suitable for debugging.
* [`void tensorflow::Tensor::FillDescription(TensorDescription *description) const`](#void_tensorflow_Tensor_FillDescription)
* [`StringPiece tensorflow::Tensor::tensor_data() const`](#StringPiece_tensorflow_Tensor_tensor_data)
* Returns a `StringPiece` mapping the current tensor's buffer.
## Member Details
#### `tensorflow::Tensor::Tensor()`
Default Tensor constructor. Creates a 1-dimension, 0-element float tensor.
#### `tensorflow::Tensor::Tensor(DataType type, const TensorShape &shape)`
Creates a Tensor of the given `type` and `shape`.
The underlying buffer is allocated using a `CPUAllocator`.
#### `tensorflow::Tensor::Tensor(Allocator *a, DataType type, const TensorShape &shape)`
Creates a tensor with the input `type` and `shape`, using the allocator `a` to allocate the underlying buffer.
`a` must outlive the lifetime of this Tensor.
#### `tensorflow::Tensor::Tensor(DataType type)`
Creates an uninitialized Tensor of the given data type.
#### `tensorflow::Tensor::Tensor(const Tensor &other)`
Copy constructor.
#### `tensorflow::Tensor::~Tensor()`
#### `DataType tensorflow::Tensor::dtype() const`
Returns the data type.
#### `const TensorShape& tensorflow::Tensor::shape() const`
Returns the shape of the tensor.
#### `int tensorflow::Tensor::dims() const`
Convenience accessor for the tensor shape.
For all shape accessors, see comments for relevant methods of `TensorShape` in `tensor_shape.h`.
#### `int64 tensorflow::Tensor::dim_size(int d) const`
Convenience accessor for the tensor shape.
#### `int64 tensorflow::Tensor::NumElements() const`
Convenience accessor for the tensor shape.
#### `bool tensorflow::Tensor::IsSameSize(const Tensor &b) const`
#### `bool tensorflow::Tensor::IsInitialized() const`
Has this Tensor been initialized?
#### `size_t tensorflow::Tensor::TotalBytes() const`
Returns the estimated memory usage of this tensor.
#### `Tensor& tensorflow::Tensor::operator=(const Tensor &other)`
Assign operator. This tensor shares other's underlying storage.
#### `bool tensorflow::Tensor::CopyFrom(const Tensor &other, const TensorShape &shape) TF_MUST_USE_RESULT`
Copy the other tensor into this tensor and reshape it.
This tensor shares other's underlying storage. Returns `true` iff `other.shape()` has the same number of elements as the given `shape`.
#### `Tensor tensorflow::Tensor::Slice(int64 dim0_start, int64 dim0_limit) const`
Slice this tensor along the 1st dimension.
I.e., the returned tensor satisfies returned[i, ...] == this[dim0_start + i, ...]. The returned tensor shares the underlying tensor buffer with this tensor.
NOTE: Depending on the shape, the returned tensor may not satisfy the same alignment requirements as this tensor. The caller must check the returned tensor's alignment before calling methods that have alignment requirements (e.g., `flat()`, `tensor()`).
REQUIRES: `dims()` >= 1
REQUIRES: `0 <= dim0_start <= dim0_limit <= dim_size(0)`
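The buffer-sharing behaviour of `Slice()` can be sketched with a small row-major 2-D view. `MiniMatrix` is a hypothetical stand-in written for this illustration, not `tensorflow::Tensor`, and it assumes a row-major layout and a reference-counted buffer modelled with `std::shared_ptr`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <vector>

// Hypothetical stand-in for illustration only; not tensorflow::Tensor.
// A row-major 2-D float array whose storage can be shared between views.
class MiniMatrix {
 public:
  MiniMatrix(int64_t rows, int64_t cols)
      : buf_(std::make_shared<std::vector<float> >(
            static_cast<size_t>(rows * cols))),
        offset_(0),
        rows_(rows),
        cols_(cols) {
    assert(rows >= 0 && cols >= 0);
  }

  int64_t rows() const { return rows_; }
  int64_t cols() const { return cols_; }

  float& at(int64_t r, int64_t c) {
    assert(0 <= r && r < rows_ && 0 <= c && c < cols_);
    return (*buf_)[static_cast<size_t>(offset_ + r * cols_ + c)];
  }

  // Returns a view of rows [dim0_start, dim0_limit) that shares this
  // buffer, so view(i, j) aliases (*this)(dim0_start + i, j).
  MiniMatrix Slice(int64_t dim0_start, int64_t dim0_limit) const {
    assert(0 <= dim0_start && dim0_start <= dim0_limit &&
           dim0_limit <= rows_);
    MiniMatrix view(*this);  // copies the shared_ptr, not the data
    // The view starts part-way into the buffer, which is why its first
    // element may have a different alignment from the original's.
    view.offset_ = offset_ + dim0_start * cols_;
    view.rows_ = dim0_limit - dim0_start;
    return view;
  }

 private:
  std::shared_ptr<std::vector<float> > buf_;
  int64_t offset_;
  int64_t rows_;
  int64_t cols_;
};
```

Because the view aliases the original storage, a write through the slice is visible through the original, and the buffer stays alive until the last view is destroyed.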
#### `bool tensorflow::Tensor::FromProto(const TensorProto &other) TF_MUST_USE_RESULT`
Parse `other` and construct the tensor.
Returns `true` iff the parsing succeeds. If the parsing fails, the state of `*this` is unchanged.
#### `bool tensorflow::Tensor::FromProto(Allocator *a, const TensorProto &other) TF_MUST_USE_RESULT`
#### `void tensorflow::Tensor::AsProtoField(TensorProto *proto) const`
Fills in `proto` with `*this` tensor's content.
`AsProtoField()` fills in the repeated field for `proto.dtype()`, while `AsProtoTensorContent()` encodes the content in `proto.tensor_content()` in a compact form.
#### `void tensorflow::Tensor::AsProtoTensorContent(TensorProto *proto) const`
#### `TTypes< T >::Vec tensorflow::Tensor::vec()`
Return the tensor data as an `Eigen::Tensor` with the type and sizes of this `Tensor`.
Use these methods when you know the data type and the number of dimensions of the Tensor and you want an `Eigen::Tensor` automatically sized to the `Tensor` sizes. The implementation `CHECK`-fails if either the type or the sizes mismatch.
Example:
```c++
typedef float T;
Tensor my_mat(...built with Shape{rows: 3, cols: 5}...);
auto mat = my_mat.matrix<T>();    // 2D Eigen::Tensor, 3 x 5.
auto mat = my_mat.tensor<T, 2>(); // 2D Eigen::Tensor, 3 x 5.
auto vec = my_mat.vec<T>();       // CHECK fails as my_mat is 2D.
auto vec = my_mat.tensor<T, 3>(); // CHECK fails as my_mat is 2D.
auto mat = my_mat.matrix<int32>();// CHECK fails as type mismatch.
```
#### `TTypes< T >::Matrix tensorflow::Tensor::matrix()`
#### `TTypes< T, NDIMS >::Tensor tensorflow::Tensor::tensor()`
#### `TTypes< T >::Flat tensorflow::Tensor::flat()`
Return the tensor data as an `Eigen::Tensor` of the data type and a specified shape.
These methods allow you to access the data with the dimensions and sizes of your choice. You do not need to know the number of dimensions of the Tensor to call them. However, they `CHECK` that the type matches and that the requested dimensions create an `Eigen::Tensor` with the same number of elements as the tensor.
Example:
```c++
typedef float T;
Tensor my_ten(...built with Shape{planes: 4, rows: 3, cols: 5}...);
// 1D Eigen::Tensor, size 60:
auto flat = my_ten.flat<T>();
// 2D Eigen::Tensor 12 x 5:
auto inner = my_ten.flat_inner_dims<T>();
// 2D Eigen::Tensor 4 x 15:
auto outer = my_ten.shaped<T, 2>({4, 15});
// CHECK fails, bad num elements:
auto outer = my_ten.shaped<T, 2>({4, 8});
// 3D Eigen::Tensor 6 x 5 x 2:
auto weird = my_ten.shaped<T, 3>({6, 5, 2});
// CHECK fails, type mismatch:
auto bad = my_ten.flat<int32>();
```
#### `TTypes< T >::UnalignedFlat tensorflow::Tensor::unaligned_flat()`
#### `TTypes< T >::Matrix tensorflow::Tensor::flat_inner_dims()`
Returns the data as an Eigen::Tensor with 2 dimensions, collapsing all Tensor dimensions but the last one into the first dimension of the result.
#### `TTypes< T >::Matrix tensorflow::Tensor::flat_outer_dims()`
Returns the data as an Eigen::Tensor with 2 dimensions, collapsing all Tensor dimensions but the first one into the last dimension of the result.
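The two collapsing rules reduce to simple products over the dimension sizes. The helpers below are a hypothetical sketch written for this illustration, not part of TensorFlow, and they only compute the resulting 2-D sizes rather than producing an `Eigen::Tensor`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical helpers for illustration only; not part of TensorFlow.
// Each returns the {rows, cols} sizes of the 2-D view described above.

// Collapses all dimensions but the last into the first result dimension.
inline std::pair<int64_t, int64_t> InnerDims2D(
    const std::vector<int64_t>& dims) {
  assert(!dims.empty());
  int64_t rows = 1;
  for (size_t i = 0; i + 1 < dims.size(); ++i) rows *= dims[i];
  return std::make_pair(rows, dims.back());
}

// Collapses all dimensions but the first into the last result dimension.
inline std::pair<int64_t, int64_t> OuterDims2D(
    const std::vector<int64_t>& dims) {
  assert(!dims.empty());
  int64_t cols = 1;
  for (size_t i = 1; i < dims.size(); ++i) cols *= dims[i];
  return std::make_pair(dims.front(), cols);
}
```

For a shape of 4 x 3 x 5, the inner form gives 12 x 5 and the outer form gives 4 x 15; both keep the total of 60 elements.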
#### `TTypes< T, NDIMS >::Tensor tensorflow::Tensor::shaped(gtl::ArraySlice< int64 > new_sizes)`
#### `TTypes< T, NDIMS >::UnalignedTensor tensorflow::Tensor::unaligned_shaped(gtl::ArraySlice< int64 > new_sizes)`
#### `TTypes< T >::Scalar tensorflow::Tensor::scalar()`
Return the Tensor data as a `TensorMap` of fixed size 1: `TensorMap<TensorFixedSize<T, 1>>`.
Using `scalar()` allows the compiler to perform optimizations as the size of the tensor is known at compile time.
#### `TTypes< T >::ConstVec tensorflow::Tensor::vec() const`
Const versions of all the methods above.
#### `TTypes< T >::ConstMatrix tensorflow::Tensor::matrix() const`
#### `TTypes< T, NDIMS >::ConstTensor tensorflow::Tensor::tensor() const`
#### `TTypes< T >::ConstFlat tensorflow::Tensor::flat() const`
#### `TTypes< T >::UnalignedConstFlat tensorflow::Tensor::unaligned_flat() const`
#### `TTypes< T >::ConstMatrix tensorflow::Tensor::flat_inner_dims() const`
#### `TTypes< T >::ConstMatrix tensorflow::Tensor::flat_outer_dims() const`
#### `TTypes< T, NDIMS >::ConstTensor tensorflow::Tensor::shaped(gtl::ArraySlice< int64 > new_sizes) const`
#### `TTypes< T, NDIMS >::UnalignedConstTensor tensorflow::Tensor::unaligned_shaped(gtl::ArraySlice< int64 > new_sizes) const`
#### `TTypes< T >::ConstScalar tensorflow::Tensor::scalar() const`
#### `string tensorflow::Tensor::SummarizeValue(int64 max_entries) const`
Render the first `max_entries` values in `*this` into a string.
#### `string tensorflow::Tensor::DebugString() const`
A human-readable summary of the tensor suitable for debugging.
#### `void tensorflow::Tensor::FillDescription(TensorDescription *description) const`
Fill in the `TensorDescription` proto with metadata about the tensor that is useful for monitoring and debugging.
#### `StringPiece tensorflow::Tensor::tensor_data() const`
Returns a `StringPiece` mapping the current tensor's buffer.
The returned `StringPiece` may point to memory location on devices that the CPU cannot address directly.
NOTE: The underlying tensor buffer is refcounted, so the lifetime of the contents mapped by the `StringPiece` matches the lifetime of the buffer; callers should arrange to make sure the buffer does not get destroyed while the `StringPiece` is still used.
REQUIRES: `DataTypeCanUseMemcpy(dtype())`.
================================================
FILE: SOURCE/api_docs/cc/ClassTensorBuffer.md
================================================
# Class `tensorflow::TensorBuffer`
## Member Summary
* [`tensorflow::TensorBuffer::~TensorBuffer() override`](#tensorflow_TensorBuffer_TensorBuffer)
* [`virtual void* tensorflow::TensorBuffer::data() const =0`](#virtual_void_tensorflow_TensorBuffer_data)
* [`virtual size_t tensorflow::TensorBuffer::size() const =0`](#virtual_size_t_tensorflow_TensorBuffer_size)
* [`virtual TensorBuffer* tensorflow::TensorBuffer::root_buffer()=0`](#virtual_TensorBuffer_tensorflow_TensorBuffer_root_buffer)
* [`virtual void tensorflow::TensorBuffer::FillAllocationDescription(AllocationDescription *proto) const =0`](#virtual_void_tensorflow_TensorBuffer_FillAllocationDescription)
* [`T* tensorflow::TensorBuffer::base() const`](#T_tensorflow_TensorBuffer_base)
## Member Details
#### `tensorflow::TensorBuffer::~TensorBuffer() override`
#### `virtual void* tensorflow::TensorBuffer::data() const =0`
#### `virtual size_t tensorflow::TensorBuffer::size() const =0`
#### `virtual TensorBuffer* tensorflow::TensorBuffer::root_buffer()=0`
#### `virtual void tensorflow::TensorBuffer::FillAllocationDescription(AllocationDescription *proto) const =0`
#### `T* tensorflow::TensorBuffer::base() const`
================================================
FILE: SOURCE/api_docs/cc/ClassTensorShape.md
================================================
# Class `tensorflow::TensorShape`
Manages the dimensions of a Tensor and their sizes.
## Member Summary
* [`tensorflow::TensorShape::TensorShape(gtl::ArraySlice< int64 > dim_sizes)`](#tensorflow_TensorShape_TensorShape)
* Construct a `TensorShape` from the provided sizes. REQUIRES: `dim_sizes[i] >= 0`
* [`tensorflow::TensorShape::TensorShape(std::initializer_list< int64 > dim_sizes)`](#tensorflow_TensorShape_TensorShape)
* [`tensorflow::TensorShape::TensorShape(const TensorShapeProto &proto)`](#tensorflow_TensorShape_TensorShape)
* REQUIRES: `IsValid(proto)`
* [`tensorflow::TensorShape::TensorShape()`](#tensorflow_TensorShape_TensorShape)
* [`void tensorflow::TensorShape::Clear()`](#void_tensorflow_TensorShape_Clear)
* Clear a tensor shape.
* [`void tensorflow::TensorShape::AddDim(int64 size)`](#void_tensorflow_TensorShape_AddDim)
* Add a dimension to the end ("inner-most"). REQUIRES: `size >= 0`
* [`void tensorflow::TensorShape::AppendShape(const TensorShape &shape)`](#void_tensorflow_TensorShape_AppendShape)
* Appends all the dimensions from `shape`.
* [`void tensorflow::TensorShape::InsertDim(int d, int64 size)`](#void_tensorflow_TensorShape_InsertDim)
* Insert a dimension somewhere in the `TensorShape`. REQUIRES: `0 <= d <= dims()` REQUIRES: `size >= 0`
* [`void tensorflow::TensorShape::set_dim(int d, int64 size)`](#void_tensorflow_TensorShape_set_dim)
* Modifies the size of the dimension `d` to be `size`. REQUIRES: `0 <= d < dims()` REQUIRES: `size >= 0`
* [`void tensorflow::TensorShape::RemoveDim(int d)`](#void_tensorflow_TensorShape_RemoveDim)
* Removes dimension `d` from the `TensorShape`. REQUIRES: `0 <= d < dims()`
* [`int tensorflow::TensorShape::dims() const`](#int_tensorflow_TensorShape_dims)
* Return the number of dimensions in the tensor.
* [`int64 tensorflow::TensorShape::dim_size(int d) const`](#int64_tensorflow_TensorShape_dim_size)
* Returns the number of elements in dimension `d`. REQUIRES: `0 <= d < dims()`
* [`gtl::ArraySlice< int64 > tensorflow::TensorShape::dim_sizes() const`](#gtl_ArraySlice_int64_tensorflow_TensorShape_dim_sizes)
* Returns sizes of all dimensions.
* [`int64 tensorflow::TensorShape::num_elements() const`](#int64_tensorflow_TensorShape_num_elements)
* Returns the number of elements in the tensor.
* [`bool tensorflow::TensorShape::IsSameSize(const TensorShape &b) const`](#bool_tensorflow_TensorShape_IsSameSize)
* [`bool tensorflow::TensorShape::operator==(const TensorShape &b) const`](#bool_tensorflow_TensorShape_operator_)
* [`void tensorflow::TensorShape::AsProto(TensorShapeProto *proto) const`](#void_tensorflow_TensorShape_AsProto)
* Fill `*proto` from `*this`.
* [`Eigen::DSizes< Eigen::DenseIndex, NDIMS > tensorflow::TensorShape::AsEigenDSizes() const`](#Eigen_DSizes_Eigen_DenseIndex_NDIMS_tensorflow_TensorShape_AsEigenDSizes)
* Fill `*dsizes` from `*this`.
* [`Eigen::DSizes< Eigen::DenseIndex, NDIMS > tensorflow::TensorShape::AsEigenDSizesWithPadding() const`](#Eigen_DSizes_Eigen_DenseIndex_NDIMS_tensorflow_TensorShape_AsEigenDSizesWithPadding)
* [`TensorShapeIter tensorflow::TensorShape::begin() const`](#TensorShapeIter_tensorflow_TensorShape_begin)
* For iterating through the dimensions.
* [`TensorShapeIter tensorflow::TensorShape::end() const`](#TensorShapeIter_tensorflow_TensorShape_end)
* [`string tensorflow::TensorShape::DebugString() const`](#string_tensorflow_TensorShape_DebugString)
* For error messages.
* [`string tensorflow::TensorShape::ShortDebugString() const`](#string_tensorflow_TensorShape_ShortDebugString)
* [`static bool tensorflow::TensorShape::IsValid(const TensorShapeProto &proto)`](#static_bool_tensorflow_TensorShape_IsValid)
* Returns `true` iff `proto` is a valid tensor shape.
## Member Details
#### `tensorflow::TensorShape::TensorShape(gtl::ArraySlice< int64 > dim_sizes)`
Construct a `TensorShape` from the provided sizes. REQUIRES: `dim_sizes[i] >= 0`
#### `tensorflow::TensorShape::TensorShape(std::initializer_list< int64 > dim_sizes)`
#### `tensorflow::TensorShape::TensorShape(const TensorShapeProto &proto)`
REQUIRES: `IsValid(proto)`
#### `tensorflow::TensorShape::TensorShape()`
Create a tensor shape with no dimensions and one element, on which you can then call `AddDim()`.
#### `void tensorflow::TensorShape::Clear()`
Clear a tensor shape.
#### `void tensorflow::TensorShape::AddDim(int64 size)`
Add a dimension to the end ("inner-most"). REQUIRES: `size >= 0`
#### `void tensorflow::TensorShape::AppendShape(const TensorShape &shape)`
Appends all the dimensions from `shape`.
#### `void tensorflow::TensorShape::InsertDim(int d, int64 size)`
Insert a dimension somewhere in the `TensorShape`. REQUIRES: `0 <= d <= dims()` REQUIRES: `size >= 0`
#### `void tensorflow::TensorShape::set_dim(int d, int64 size)`
Modifies the size of the dimension `d` to be `size`. REQUIRES: `0 <= d < dims()` REQUIRES: `size >= 0`
#### `void tensorflow::TensorShape::RemoveDim(int d)`
Removes dimension `d` from the `TensorShape`. REQUIRES: `0 <= d < dims()`
#### `int tensorflow::TensorShape::dims() const`
Return the number of dimensions in the tensor.
#### `int64 tensorflow::TensorShape::dim_size(int d) const`
Returns the number of elements in dimension `d`. REQUIRES: `0 <= d < dims()`
#### `gtl::ArraySlice< int64 > tensorflow::TensorShape::dim_sizes() const`
Returns sizes of all dimensions.
#### `int64 tensorflow::TensorShape::num_elements() const`
Returns the number of elements in the tensor.
We use `int64` and not `size_t` to be compatible with `Eigen::Tensor` which uses `ptrdiff_t`.
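The dimension-editing methods and `num_elements()` described above can be modelled with a plain vector of sizes. `MiniShape` is a hypothetical stand-in written for this illustration, not `tensorflow::TensorShape`, and it turns each REQUIRES clause into an `assert`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for illustration only; not tensorflow::TensorShape.
class MiniShape {
 public:
  int dims() const { return static_cast<int>(sizes_.size()); }

  int64_t dim_size(int d) const {
    assert(0 <= d && d < dims());
    return sizes_[static_cast<size_t>(d)];
  }

  // Adds a dimension at the end ("inner-most").
  void AddDim(int64_t size) {
    assert(size >= 0);
    sizes_.push_back(size);
  }

  // Inserts a dimension before position d; d == dims() appends.
  void InsertDim(int d, int64_t size) {
    assert(0 <= d && d <= dims());
    assert(size >= 0);
    sizes_.insert(sizes_.begin() + d, size);
  }

  void RemoveDim(int d) {
    assert(0 <= d && d < dims());
    sizes_.erase(sizes_.begin() + d);
  }

  // Product of all sizes; a shape with no dimensions has one element.
  int64_t num_elements() const {
    int64_t n = 1;
    for (size_t i = 0; i < sizes_.size(); ++i) n *= sizes_[i];
    return n;
  }

 private:
  std::vector<int64_t> sizes_;
};
```

Starting from an empty shape, `AddDim(3)`, `AddDim(5)`, and `InsertDim(0, 4)` yield a 4 x 3 x 5 shape with 60 elements; `RemoveDim(1)` then leaves 4 x 5 with 20 elements.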
#### `bool tensorflow::TensorShape::IsSameSize(const TensorShape &b) const`
Returns true if `*this` and `b` have the same sizes. Ignores dimension names.
#### `bool tensorflow::TensorShape::operator==(const TensorShape &b) const`
#### `void tensorflow::TensorShape::AsProto(TensorShapeProto *proto) const`
Fill `*proto` from `*this`.
#### `Eigen::DSizes< Eigen::DenseIndex, NDIMS > tensorflow::TensorShape::AsEigenDSizes() const`
Fill `*dsizes` from `*this`.
#### `Eigen::DSizes< Eigen::DenseIndex, NDIMS > tensorflow::TensorShape::AsEigenDSizesWithPadding() const`
Same as `AsEigenDSizes()` but allows for `NDIMS > dims()`, in which case we pad the rest of the sizes with 1.
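The padding rule can be sketched as follows. `PadSizes` is a hypothetical helper written for this illustration; it returns a `std::array` rather than an `Eigen::DSizes`, and it assumes that "the rest" means the trailing positions after the real dimensions.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper for illustration only; not part of TensorFlow.
// Copies dims into a fixed-size array of NDIMS entries and fills any
// remaining trailing entries with 1. REQUIRES: dims.size() <= NDIMS.
template <size_t NDIMS>
std::array<int64_t, NDIMS> PadSizes(const std::vector<int64_t>& dims) {
  assert(dims.size() <= NDIMS);
  std::array<int64_t, NDIMS> out;
  for (size_t d = 0; d < NDIMS; ++d) {
    out[d] = d < dims.size() ? dims[d] : 1;
  }
  return out;
}
```

Padding with 1 leaves the total number of elements unchanged, so a 4 x 3 shape padded to four dimensions becomes 4 x 3 x 1 x 1 and still describes 12 elements.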
#### `TensorShapeIter tensorflow::TensorShape::begin() const`
For iterating through the dimensions.
#### `TensorShapeIter tensorflow::TensorShape::end() const`
#### `string tensorflow::TensorShape::DebugString() const`
For error messages.
#### `string tensorflow::TensorShape::ShortDebugString() const`
#### `static bool tensorflow::TensorShape::IsValid(const TensorShapeProto &proto)`
Returns `true` iff `proto` is a valid tensor shape.
================================================
FILE: SOURCE/api_docs/cc/ClassTensorShapeIter.md
================================================
# Class `tensorflow::TensorShapeIter`
## Member Summary
* [`tensorflow::TensorShapeIter::TensorShapeIter(const TensorShape *shape, int d)`](#tensorflow_TensorShapeIter_TensorShapeIter)
* [`bool tensorflow::TensorShapeIter::operator==(const TensorShapeIter &rhs)`](#bool_tensorflow_TensorShapeIter_operator_)
* [`bool tensorflow::TensorShapeIter::operator!=(const TensorShapeIter &rhs)`](#bool_tensorflow_TensorShapeIter_operator_)
* [`void tensorflow::TensorShapeIter::operator++()`](#void_tensorflow_TensorShapeIter_operator_)
* [`TensorShapeDim tensorflow::TensorShapeIter::operator*()`](#TensorShapeDim_tensorflow_TensorShapeIter_operator_)
## Member Details
#### `tensorflow::TensorShapeIter::TensorShapeIter(const TensorShape *shape, int d)`
#### `bool tensorflow::TensorShapeIter::operator==(const TensorShapeIter &rhs)`
#### `bool tensorflow::TensorShapeIter::operator!=(const TensorShapeIter &rhs)`
#### `void tensorflow::TensorShapeIter::operator++()`
#### `TensorShapeDim tensorflow::TensorShapeIter::operator*()`
================================================
FILE: SOURCE/api_docs/cc/ClassTensorShapeUtils.md
================================================
# Class `tensorflow::TensorShapeUtils`
Static helper routines for `TensorShape`. Includes a few common predicates on a tensor shape.
## Member Summary
* [`static bool tensorflow::TensorShapeUtils::IsScalar(const TensorShape &shape)`](#static_bool_tensorflow_TensorShapeUtils_IsScalar)
* [`static bool tensorflow::TensorShapeUtils::IsVector(const TensorShape &shape)`](#static_bool_tensorflow_TensorShapeUtils_IsVector)
* [`static bool tensorflow::TensorShapeUtils::IsLegacyScalar(const TensorShape &shape)`](#static_bool_tensorflow_TensorShapeUtils_IsLegacyScalar)
* [`static bool tensorflow::TensorShapeUtils::IsLegacyVector(const TensorShape &shape)`](#static_bool_tensorflow_TensorShapeUtils_IsLegacyVector)
* [`static bool tensorflow::TensorShapeUtils::IsVectorOrHigher(const TensorShape &shape)`](#static_bool_tensorflow_TensorShapeUtils_IsVectorOrHigher)
* [`static bool tensorflow::TensorShapeUtils::IsMatrix(const TensorShape &shape)`](#static_bool_tensorflow_TensorShapeUtils_IsMatrix)
* [`static bool tensorflow::TensorShapeUtils::IsMatrixOrHigher(const TensorShape &shape)`](#static_bool_tensorflow_TensorShapeUtils_IsMatrixOrHigher)
* [`static TensorShape tensorflow::TensorShapeUtils::MakeShape(const T *dims, int n)`](#static_TensorShape_tensorflow_TensorShapeUtils_MakeShape)
* Returns a `TensorShape` whose dimensions are `dims[0]`, `dims[1]`, ..., `dims[n-1]`.
* [`static string tensorflow::TensorShapeUtils::ShapeListString(const gtl::ArraySlice< TensorShape > &shapes)`](#static_string_tensorflow_TensorShapeUtils_ShapeListString)
* [`static bool tensorflow::TensorShapeUtils::StartsWith(const TensorShape &shape0, const TensorShape &shape1)`](#static_bool_tensorflow_TensorShapeUtils_StartsWith)
##Member Details
#### `static bool tensorflow::TensorShapeUtils::IsScalar(const TensorShape &shape)`
#### `static bool tensorflow::TensorShapeUtils::IsVector(const TensorShape &shape)`
#### `static bool tensorflow::TensorShapeUtils::IsLegacyScalar(const TensorShape &shape)`
#### `static bool tensorflow::TensorShapeUtils::IsLegacyVector(const TensorShape &shape)`
#### `static bool tensorflow::TensorShapeUtils::IsVectorOrHigher(const TensorShape &shape)`
#### `static bool tensorflow::TensorShapeUtils::IsMatrix(const TensorShape &shape)`
#### `static bool tensorflow::TensorShapeUtils::IsMatrixOrHigher(const TensorShape &shape)`
#### `static TensorShape tensorflow::TensorShapeUtils::MakeShape(const T *dims, int n)`
Returns a `TensorShape` whose dimensions are `dims[0]`, `dims[1]`, ..., `dims[n-1]`.
#### `static string tensorflow::TensorShapeUtils::ShapeListString(const gtl::ArraySlice< TensorShape > &shapes)`
#### `static bool tensorflow::TensorShapeUtils::StartsWith(const TensorShape &shape0, const TensorShape &shape1)`
================================================
FILE: SOURCE/api_docs/cc/ClassThread.md
================================================
# Class `tensorflow::Thread`
##Member Summary
* [`tensorflow::Thread::Thread()`](#tensorflow_Thread_Thread)
* [`virtual tensorflow::Thread::~Thread()`](#virtual_tensorflow_Thread_Thread)
* Blocks until the thread of control stops running.
##Member Details
#### `tensorflow::Thread::Thread()`
#### `virtual tensorflow::Thread::~Thread()`
Blocks until the thread of control stops running.
================================================
FILE: SOURCE/api_docs/cc/ClassWritableFile.md
================================================
# Class `tensorflow::WritableFile`
A file abstraction for sequential writing.
The implementation must provide buffering since callers may append small fragments at a time to the file.
##Member Summary
* [`tensorflow::WritableFile::WritableFile()`](#tensorflow_WritableFile_WritableFile)
* [`virtual tensorflow::WritableFile::~WritableFile()`](#virtual_tensorflow_WritableFile_WritableFile)
* [`virtual Status tensorflow::WritableFile::Append(const StringPiece &data)=0`](#virtual_Status_tensorflow_WritableFile_Append)
* [`virtual Status tensorflow::WritableFile::Close()=0`](#virtual_Status_tensorflow_WritableFile_Close)
* [`virtual Status tensorflow::WritableFile::Flush()=0`](#virtual_Status_tensorflow_WritableFile_Flush)
* [`virtual Status tensorflow::WritableFile::Sync()=0`](#virtual_Status_tensorflow_WritableFile_Sync)
##Member Details
#### `tensorflow::WritableFile::WritableFile()`
#### `virtual tensorflow::WritableFile::~WritableFile()`
#### `virtual Status tensorflow::WritableFile::Append(const StringPiece &data)=0`
#### `virtual Status tensorflow::WritableFile::Close()=0`
#### `virtual Status tensorflow::WritableFile::Flush()=0`
#### `virtual Status tensorflow::WritableFile::Sync()=0`
================================================
FILE: SOURCE/api_docs/cc/StructSessionOptions.md
================================================
# Struct `tensorflow::SessionOptions`
Configuration information for a `Session`.
##Member Summary
* [`Env* tensorflow::SessionOptions::env`](#Env_tensorflow_SessionOptions_env)
* The environment to use.
* [`string tensorflow::SessionOptions::target`](#string_tensorflow_SessionOptions_target)
* The TensorFlow runtime to connect to.
* [`ConfigProto tensorflow::SessionOptions::config`](#ConfigProto_tensorflow_SessionOptions_config)
* Configuration options.
* [`tensorflow::SessionOptions::SessionOptions()`](#tensorflow_SessionOptions_SessionOptions)
##Member Details
#### `Env* tensorflow::SessionOptions::env`
The environment to use.
#### `string tensorflow::SessionOptions::target`
The TensorFlow runtime to connect to.
If 'target' is empty or unspecified, the local TensorFlow runtime implementation will be used. Otherwise, the TensorFlow engine defined by 'target' will be used to perform all computations.
"target" can be either a single entry or a comma-separated list of entries. Each entry is a resolvable address in one of the following formats: `local`, `ip:port`, `host:port`, or another system-specific format that identifies tasks and jobs.
NOTE: at the moment 'local' maps to an in-process service-based runtime.
Upon creation, a single session affines itself to one of the remote processes, with possible load balancing choices when the "target" resolves to a list of possible processes.
If the session disconnects from the remote process during its lifetime, session calls may fail immediately.
#### `ConfigProto tensorflow::SessionOptions::config`
Configuration options.
#### `tensorflow::SessionOptions::SessionOptions()`
================================================
FILE: SOURCE/api_docs/cc/StructState.md
================================================
# Struct `tensorflow::Status::State`
##Member Summary
* [`tensorflow::error::Code tensorflow::Status::State::code`](#tensorflow_error_Code_tensorflow_Status_State_code)
* [`string tensorflow::Status::State::msg`](#string_tensorflow_Status_State_msg)
##Member Details
#### `tensorflow::error::Code tensorflow::Status::State::code`
#### `string tensorflow::Status::State::msg`
================================================
FILE: SOURCE/api_docs/cc/StructTensorShapeDim.md
================================================
# Struct `tensorflow::TensorShapeDim`
##Member Summary
* [`int tensorflow::TensorShapeDim::size`](#int_tensorflow_TensorShapeDim_size)
* [`tensorflow::TensorShapeDim::TensorShapeDim(int64 s)`](#tensorflow_TensorShapeDim_TensorShapeDim)
##Member Details
#### `int tensorflow::TensorShapeDim::size`
#### `tensorflow::TensorShapeDim::TensorShapeDim(int64 s)`
================================================
FILE: SOURCE/api_docs/cc/StructThreadOptions.md
================================================
# Struct `tensorflow::ThreadOptions`
Options to configure a `Thread`.
Note that the options are all hints, and the underlying implementation may choose to ignore them.
##Member Summary
* [`size_t tensorflow::ThreadOptions::stack_size`](#size_t_tensorflow_ThreadOptions_stack_size)
* Thread stack size to use (in bytes).
* [`size_t tensorflow::ThreadOptions::guard_size`](#size_t_tensorflow_ThreadOptions_guard_size)
* Guard area size to use near thread stacks (in bytes).
##Member Details
#### `size_t tensorflow::ThreadOptions::stack_size`
Thread stack size to use (in bytes).
#### `size_t tensorflow::ThreadOptions::guard_size`
Guard area size to use near thread stacks (in bytes).
================================================
FILE: SOURCE/api_docs/cc/index.md
================================================
# TensorFlow C++ Session API reference documentation
TensorFlow's public C++ API includes only the API for executing graphs, as of
version 0.5. To control the execution of a graph from C++:
1. Build the computation graph using the [Python API](../../api_docs/python/).
1. Use [tf.train.write_graph()](../../api_docs/python/train.md#write_graph) to
write the graph to a file.
1. Load the graph using the C++ Session API. For example:
```c++
// Reads a model graph definition from disk, and creates a session object you
// can use to run it.
Status LoadGraph(string graph_file_name, Session** session) {
  GraphDef graph_def;
  TF_RETURN_IF_ERROR(
      ReadBinaryProto(Env::Default(), graph_file_name, &graph_def));
  TF_RETURN_IF_ERROR(NewSession(SessionOptions(), session));
  TF_RETURN_IF_ERROR((*session)->Create(graph_def));
  return Status::OK();
}
```
1. Run the graph with a call to `session->Run()`.
##Classes
* [tensorflow::Env](../../api_docs/cc/ClassEnv.md)
* [tensorflow::EnvWrapper](../../api_docs/cc/ClassEnvWrapper.md)
* [tensorflow::RandomAccessFile](../../api_docs/cc/ClassRandomAccessFile.md)
* [tensorflow::Session](../../api_docs/cc/ClassSession.md)
* [tensorflow::Status](../../api_docs/cc/ClassStatus.md)
* [tensorflow::Tensor](../../api_docs/cc/ClassTensor.md)
* [tensorflow::TensorBuffer](../../api_docs/cc/ClassTensorBuffer.md)
* [tensorflow::TensorShape](../../api_docs/cc/ClassTensorShape.md)
* [tensorflow::TensorShapeIter](../../api_docs/cc/ClassTensorShapeIter.md)
* [tensorflow::TensorShapeUtils](../../api_docs/cc/ClassTensorShapeUtils.md)
* [tensorflow::Thread](../../api_docs/cc/ClassThread.md)
* [tensorflow::WritableFile](../../api_docs/cc/ClassWritableFile.md)
##Structs
* [tensorflow::SessionOptions](../../api_docs/cc/StructSessionOptions.md)
* [tensorflow::Status::State](../../api_docs/cc/StructState.md)
* [tensorflow::TensorShapeDim](../../api_docs/cc/StructTensorShapeDim.md)
* [tensorflow::ThreadOptions](../../api_docs/cc/StructThreadOptions.md)
-->
================================================
FILE: SOURCE/api_docs/index.md
================================================
# Overview
TensorFlow has APIs available in several languages both for constructing and
executing a TensorFlow graph. The Python API is at present the most complete
and the easiest to use, but the C++ API may offer some performance advantages
in graph execution, and supports deployment to small devices such as Android.
Over time, we hope that the TensorFlow community will develop front ends for
languages like Go, Java, JavaScript, Lua, R, and perhaps others. With
[SWIG](http://swig.org), it's relatively easy to develop a TensorFlow interface
for your favorite language.
Note: Many practical aspects of usage are covered in the Mechanics tab, and
some additional documentation not specific to any particular language API is
available in the Resources tab.
================================================
FILE: SOURCE/api_docs/python/array_ops.md
================================================
# Tensor Transformations
Note: Functions taking `Tensor` arguments can also take anything accepted by
[`tf.convert_to_tensor`](../../api_docs/python/framework.md#convert_to_tensor).
## Contents
### [Tensor Transformations](#AUTOGENERATED-tensor-transformations)
* [Casting](#AUTOGENERATED-casting)
* [`tf.string_to_number(string_tensor, out_type=None, name=None)`](#string_to_number)
* [`tf.to_double(x, name='ToDouble')`](#to_double)
* [`tf.to_float(x, name='ToFloat')`](#to_float)
* [`tf.to_bfloat16(x, name='ToBFloat16')`](#to_bfloat16)
* [`tf.to_int32(x, name='ToInt32')`](#to_int32)
* [`tf.to_int64(x, name='ToInt64')`](#to_int64)
* [`tf.cast(x, dtype, name=None)`](#cast)
* [Shapes and Shaping](#AUTOGENERATED-shapes-and-shaping)
* [`tf.shape(input, name=None)`](#shape)
* [`tf.size(input, name=None)`](#size)
* [`tf.rank(input, name=None)`](#rank)
* [`tf.reshape(tensor, shape, name=None)`](#reshape)
* [`tf.squeeze(input, squeeze_dims=None, name=None)`](#squeeze)
* [`tf.expand_dims(input, dim, name=None)`](#expand_dims)
* [Slicing and Joining](#AUTOGENERATED-slicing-and-joining)
* [`tf.slice(input_, begin, size, name=None)`](#slice)
* [`tf.split(split_dim, num_split, value, name='split')`](#split)
* [`tf.tile(input, multiples, name=None)`](#tile)
* [`tf.pad(input, paddings, name=None)`](#pad)
* [`tf.concat(concat_dim, values, name='concat')`](#concat)
* [`tf.pack(values, name='pack')`](#pack)
* [`tf.unpack(value, num=None, name='unpack')`](#unpack)
* [`tf.reverse_sequence(input, seq_lengths, seq_dim, name=None)`](#reverse_sequence)
* [`tf.reverse(tensor, dims, name=None)`](#reverse)
* [`tf.transpose(a, perm=None, name='transpose')`](#transpose)
* [`tf.gather(params, indices, name=None)`](#gather)
* [`tf.dynamic_partition(data, partitions, num_partitions, name=None)`](#dynamic_partition)
* [`tf.dynamic_stitch(indices, data, name=None)`](#dynamic_stitch)
## Casting
TensorFlow provides several operations that you can use to cast tensor data
types in your graph.
- - -
### `tf.string_to_number(string_tensor, out_type=None, name=None)`
Converts each string in the input Tensor to the specified numeric type.
(Note that int32 overflow results in an error while float overflow
results in a rounded value.)
##### Args:
* `string_tensor`: A `Tensor` of type `string`.
* `out_type`: An optional `tf.DType` from: `tf.float32, tf.int32`. Defaults to `tf.float32`.
The numeric type to interpret each string in string_tensor as.
* `name`: A name for the operation (optional).
##### Returns:
A `Tensor` of type `out_type`.
A Tensor of the same shape as the input string_tensor.
- - -
### `tf.to_double(x, name='ToDouble')`
Casts a tensor to type `float64`.
##### Args:
* `x`: A `Tensor` or `SparseTensor`.
* `name`: A name for the operation (optional).
##### Returns:
A `Tensor` or `SparseTensor` with same shape as `x` with type `float64`.
##### Raises:
* `TypeError`: If `x` cannot be cast to `float64`.
- - -
### `tf.to_float(x, name='ToFloat')`
Casts a tensor to type `float32`.
##### Args:
* `x`: A `Tensor` or `SparseTensor`.
* `name`: A name for the operation (optional).
##### Returns:
A `Tensor` or `SparseTensor` with same shape as `x` with type `float32`.
##### Raises:
* `TypeError`: If `x` cannot be cast to `float32`.
- - -
### `tf.to_bfloat16(x, name='ToBFloat16')`
Casts a tensor to type `bfloat16`.
##### Args:
* `x`: A `Tensor` or `SparseTensor`.
* `name`: A name for the operation (optional).
##### Returns:
A `Tensor` or `SparseTensor` with same shape as `x` with type `bfloat16`.
##### Raises:
* `TypeError`: If `x` cannot be cast to `bfloat16`.
- - -
### `tf.to_int32(x, name='ToInt32')`
Casts a tensor to type `int32`.
##### Args:
* `x`: A `Tensor` or `SparseTensor`.
* `name`: A name for the operation (optional).
##### Returns:
A `Tensor` or `SparseTensor` with same shape as `x` with type `int32`.
##### Raises:
* `TypeError`: If `x` cannot be cast to `int32`.
- - -
### `tf.to_int64(x, name='ToInt64')`
Casts a tensor to type `int64`.
##### Args:
* `x`: A `Tensor` or `SparseTensor`.
* `name`: A name for the operation (optional).
##### Returns:
A `Tensor` or `SparseTensor` with same shape as `x` with type `int64`.
##### Raises:
* `TypeError`: If `x` cannot be cast to `int64`.
- - -
### `tf.cast(x, dtype, name=None)`
Casts a tensor to a new type.
The operation casts `x` (in case of `Tensor`) or `x.values`
(in case of `SparseTensor`) to `dtype`.
For example:
```python
# tensor `a` is [1.8, 2.2], dtype=tf.float32
tf.cast(a, tf.int32) ==> [1, 2] # dtype=tf.int32
```
##### Args:
* `x`: A `Tensor` or `SparseTensor`.
* `dtype`: The destination type.
* `name`: A name for the operation (optional).
##### Returns:
A `Tensor` or `SparseTensor` with same shape as `x`.
##### Raises:
* `TypeError`: If `x` cannot be cast to `dtype`.
## Shapes and Shaping
TensorFlow provides several operations that you can use to determine the shape
of a tensor and change the shape of a tensor.
- - -
### `tf.shape(input, name=None)`
Returns the shape of a tensor.
This operation returns a 1-D integer tensor representing the shape of `input`.
For example:
```prettyprint
# 't' is [[[1, 1, 1], [2, 2, 2]], [[3, 3, 3], [4, 4, 4]]]
shape(t) ==> [2, 2, 3]
```
##### Args:
* `input`: A `Tensor`.
* `name`: A name for the operation (optional).
##### Returns:
A `Tensor` of type `int32`.
- - -
### `tf.size(input, name=None)`
Returns the size of a tensor.
This operation returns an integer representing the number of elements in
`input`.
For example:
```prettyprint
# 't' is [[[1, 1, 1], [2, 2, 2]], [[3, 3, 3], [4, 4, 4]]]
size(t) ==> 12
```
##### Args:
* `input`: A `Tensor`.
* `name`: A name for the operation (optional).
##### Returns:
A `Tensor` of type `int32`.
- - -
### `tf.rank(input, name=None)`
Returns the rank of a tensor.
This operation returns an integer representing the rank of `input`.
For example:
```prettyprint
# 't' is [[[1, 1, 1], [2, 2, 2]], [[3, 3, 3], [4, 4, 4]]]
# shape of tensor 't' is [2, 2, 3]
rank(t) ==> 3
```
**Note**: The rank of a tensor is not the same as the rank of a matrix. The rank
of a tensor is the number of indices required to uniquely select each element
of the tensor. Rank is also known as "order", "degree", or "ndims."
##### Args:
* `input`: A `Tensor`.
* `name`: A name for the operation (optional).
##### Returns:
A `Tensor` of type `int32`.
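
The relationship between `tf.shape`, `tf.size`, and `tf.rank` can be illustrated on plain nested Python lists. This is a minimal sketch of the semantics only, not TensorFlow code; the helper names `nested_shape`, `nested_size`, and `nested_rank` are hypothetical, and the input is assumed to be rectangular.

```python
def nested_shape(t):
    """Return the shape of a rectangular nested list as a list of ints."""
    shape = []
    while isinstance(t, list):
        shape.append(len(t))
        t = t[0] if t else None  # descend into the first element, if any
    return shape


def nested_size(t):
    """Number of elements: the product of all dimension sizes."""
    size = 1
    for d in nested_shape(t):
        size *= d
    return size


def nested_rank(t):
    """Rank: the number of indices needed to select one element."""
    return len(nested_shape(t))


t = [[[1, 1, 1], [2, 2, 2]], [[3, 3, 3], [4, 4, 4]]]
print(nested_shape(t))  # [2, 2, 3]
print(nested_size(t))   # 12
print(nested_rank(t))   # 3
```

The size is always the product of the entries of the shape, and the rank is the length of the shape.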
- - -
### `tf.reshape(tensor, shape, name=None)`
Reshapes a tensor.
Given `tensor`, this operation returns a tensor that has the same values
as `tensor` with shape `shape`.
If `shape` is the special value `[-1]`, then `tensor` is flattened and the
operation outputs a 1-D tensor with all elements of `tensor`.
If `shape` is 1-D or higher, then the operation returns a tensor with shape
`shape` filled with the values of `tensor`. In this case, the number of elements
implied by `shape` must be the same as the number of elements in `tensor`.
For example:
```prettyprint
# tensor 't' is [1, 2, 3, 4, 5, 6, 7, 8, 9]
# tensor 't' has shape [9]
reshape(t, [3, 3]) ==> [[1, 2, 3]
[4, 5, 6]
[7, 8, 9]]
# tensor 't' is [[[1, 1], [2, 2]],
#                [[3, 3], [4, 4]]]
# tensor 't' has shape [2, 2, 2]
reshape(t, [2, 4]) ==> [[1, 1, 2, 2]
[3, 3, 4, 4]]
# tensor 't' is [[[1, 1, 1],
# [2, 2, 2]],
# [[3, 3, 3],
# [4, 4, 4]],
# [[5, 5, 5],
# [6, 6, 6]]]
# tensor 't' has shape [3, 2, 3]
# pass '[-1]' to flatten 't'
reshape(t, [-1]) ==> [1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5, 5, 6, 6, 6]
```
##### Args:
* `tensor`: A `Tensor`.
* `shape`: A `Tensor` of type `int32`. Defines the shape of the output tensor.
* `name`: A name for the operation (optional).
##### Returns:
A `Tensor`. Has the same type as `tensor`.
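
The element-count rule and the `[-1]` flattening case described above can be sketched in plain Python. This is an illustration on nested lists under stated assumptions, not the TensorFlow implementation: the helpers `flatten` and `reshape_nested` are hypothetical, and `shape` is assumed to be either `[-1]` or a non-empty list of non-negative integers.

```python
def flatten(t):
    """Return all scalar elements of a nested list in row-major order."""
    if not isinstance(t, list):
        return [t]
    out = []
    for item in t:
        out.extend(flatten(item))
    return out


def reshape_nested(t, shape):
    flat = flatten(t)
    if shape == [-1]:
        return flat  # special value: flatten to 1-D
    expected = 1
    for d in shape:
        expected *= d
    if expected != len(flat):
        raise ValueError("shape %r implies %d elements, tensor has %d"
                         % (shape, expected, len(flat)))

    def build(values, dims):
        if len(dims) == 1:
            return list(values)
        step = len(values) // dims[0] if dims[0] else 0
        return [build(values[i * step:(i + 1) * step], dims[1:])
                for i in range(dims[0])]

    return build(flat, shape)


print(reshape_nested([1, 2, 3, 4, 5, 6, 7, 8, 9], [3, 3]))
# [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
print(reshape_nested([[[1, 1], [2, 2]], [[3, 3], [4, 4]]], [2, 4]))
# [[1, 1, 2, 2], [3, 3, 4, 4]]
```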
- - -
### `tf.squeeze(input, squeeze_dims=None, name=None)`
Removes dimensions of size 1 from the shape of a tensor.
Given a tensor `input`, this operation returns a tensor of the same type with
all dimensions of size 1 removed. If you don't want to remove all size 1
dimensions, you can remove specific size 1 dimensions by specifying
`squeeze_dims`.
For example:
```prettyprint
# 't' is a tensor of shape [1, 2, 1, 3, 1, 1]
shape(squeeze(t)) ==> [2, 3]
```
Or, to remove specific size 1 dimensions:
```prettyprint
# 't' is a tensor of shape [1, 2, 1, 3, 1, 1]
shape(squeeze(t, [2, 4])) ==> [1, 2, 3, 1]
```
##### Args:
* `input`: A `Tensor`. The `input` to squeeze.
* `squeeze_dims`: An optional list of `ints`. Defaults to `[]`.
If specified, only squeezes the dimensions listed. The dimension
index starts at 0. It is an error to squeeze a dimension that is not 1.
* `name`: A name for the operation (optional).
##### Returns:
A `Tensor`. Has the same type as `input`.
Contains the same data as `input`, but has one or more dimensions of
size 1 removed.
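
The effect of `squeeze_dims` on the output shape can be sketched with a small shape-only helper. This is a plain-Python illustration, not TensorFlow code; `squeezed_shape` is a hypothetical name.

```python
def squeezed_shape(shape, squeeze_dims=None):
    """Return the shape that results from squeezing `shape`."""
    if not squeeze_dims:
        # Default: drop every dimension of size 1.
        return [d for d in shape if d != 1]
    for i in squeeze_dims:
        if shape[i] != 1:
            raise ValueError("cannot squeeze dimension %d of size %d"
                             % (i, shape[i]))
    # Drop only the listed dimensions.
    return [d for i, d in enumerate(shape) if i not in squeeze_dims]


print(squeezed_shape([1, 2, 1, 3, 1, 1]))          # [2, 3]
print(squeezed_shape([1, 2, 1, 3, 1, 1], [2, 4]))  # [1, 2, 3, 1]
```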
- - -
### `tf.expand_dims(input, dim, name=None)`
Inserts a dimension of 1 into a tensor's shape.
Given a tensor `input`, this operation inserts a dimension of 1 at the
dimension index `dim` of `input`'s shape. The dimension index `dim` starts at
zero; if you specify a negative number for `dim` it is counted backward from
the end.
This operation is useful if you want to add a batch dimension to a single
element. For example, if you have a single image of shape `[height, width,
channels]`, you can make it a batch of 1 image with `expand_dims(image, 0)`,
which will make the shape `[1, height, width, channels]`.
Other examples:
```prettyprint
# 't' is a tensor of shape [2]
shape(expand_dims(t, 0)) ==> [1, 2]
shape(expand_dims(t, 1)) ==> [2, 1]
shape(expand_dims(t, -1)) ==> [2, 1]
# 't2' is a tensor of shape [2, 3, 5]
shape(expand_dims(t2, 0)) ==> [1, 2, 3, 5]
shape(expand_dims(t2, 2)) ==> [2, 3, 1, 5]
shape(expand_dims(t2, 3)) ==> [2, 3, 5, 1]
```
This operation requires that:
`-1-input.dims() <= dim <= input.dims()`
This operation is related to `squeeze()`, which removes dimensions of
size 1.
##### Args:
* `input`: A `Tensor`.
* `dim`: A `Tensor` of type `int32`.
0-D (scalar). Specifies the dimension index at which to
expand the shape of `input`.
* `name`: A name for the operation (optional).
##### Returns:
A `Tensor`. Has the same type as `input`.
Contains the same data as `input`, but its shape has an additional
dimension of size 1 added.
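
The handling of negative `dim` values and the range requirement can be sketched on shapes alone. This is a plain-Python illustration of the rule stated above, not TensorFlow code; `expanded_shape` is a hypothetical helper.

```python
def expanded_shape(shape, dim):
    """Return `shape` with a dimension of size 1 inserted at index `dim`."""
    rank = len(shape)
    if not -1 - rank <= dim <= rank:
        raise ValueError("dim %d is outside [%d, %d]" % (dim, -1 - rank, rank))
    if dim < 0:
        dim += rank + 1  # count backward from the end
    return shape[:dim] + [1] + shape[dim:]


print(expanded_shape([2], 0))         # [1, 2]
print(expanded_shape([2], -1))        # [2, 1]
print(expanded_shape([2, 3, 5], 2))   # [2, 3, 1, 5]
```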
## Slicing and Joining
TensorFlow provides several operations to slice or extract parts of a tensor,
or join multiple tensors together.
- - -
### `tf.slice(input_, begin, size, name=None)`
Extracts a slice from a tensor.
This operation extracts a slice of size `size` from a tensor `input` starting
at the location specified by `begin`. The slice `size` is represented as a
tensor shape, where `size[i]` is the number of elements of the 'i'th dimension
of `input` that you want to slice. The starting location (`begin`) for the
slice is represented as an offset in each dimension of `input`. In other
words, `begin[i]` is the offset into the 'i'th dimension of `input` that you
want to slice from.
`begin` is zero-based; `size` is one-based. If `size[i]` is -1,
all remaining elements in dimension i are included in the
slice. In other words, this is equivalent to setting:
`size[i] = input.dim_size(i) - begin[i]`
This operation requires that:
`0 <= begin[i] <= begin[i] + size[i] <= Di` for `i` in `[0, n)`, where `Di` is the size of dimension `i` of `input` and `n` is its rank.
For example:
```
# 'input' is [[[1, 1, 1], [2, 2, 2]],
# [[3, 3, 3], [4, 4, 4]],
# [[5, 5, 5], [6, 6, 6]]]
tf.slice(input, [1, 0, 0], [1, 1, 3]) ==> [[[3, 3, 3]]]
tf.slice(input, [1, 0, 0], [1, 2, 3]) ==> [[[3, 3, 3],
[4, 4, 4]]]
tf.slice(input, [1, 0, 0], [2, 1, 3]) ==> [[[3, 3, 3]],
[[5, 5, 5]]]
```
##### Args:
* `input_`: A `Tensor`.
* `begin`: An `int32` or `int64` `Tensor`.
* `size`: An `int32` or `int64` `Tensor`.
* `name`: A name for the operation (optional).
##### Returns:
A `Tensor` of the same type as `input_`.
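
The `begin`/`size` semantics, including the `-1` shorthand for "all remaining elements", can be sketched recursively on nested lists. This is a plain-Python illustration, not the TensorFlow implementation; `slice_nested` is a hypothetical helper.

```python
def slice_nested(t, begin, size):
    """Extract a slice from a nested list, one dimension at a time."""
    if not begin:
        return t  # no dimensions left to slice
    b, s = begin[0], size[0]
    if s == -1:
        s = len(t) - b  # all remaining elements in this dimension
    if not 0 <= b <= b + s <= len(t):
        raise ValueError("slice [%d, %d) is out of range" % (b, b + s))
    return [slice_nested(row, begin[1:], size[1:]) for row in t[b:b + s]]


inp = [[[1, 1, 1], [2, 2, 2]],
       [[3, 3, 3], [4, 4, 4]],
       [[5, 5, 5], [6, 6, 6]]]
print(slice_nested(inp, [1, 0, 0], [1, 1, 3]))    # [[[3, 3, 3]]]
print(slice_nested(inp, [1, 0, 0], [-1, 1, -1]))  # [[[3, 3, 3]], [[5, 5, 5]]]
```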
- - -
### `tf.split(split_dim, num_split, value, name='split')`
Splits a tensor into `num_split` tensors along one dimension.
Splits `value` along dimension `split_dim` into `num_split` smaller tensors.
Requires that `num_split` evenly divide `value.shape[split_dim]`.
For example:
```python
# 'value' is a tensor with shape [5, 30]
# Split 'value' into 3 tensors along dimension 1
split0, split1, split2 = tf.split(1, 3, value)
tf.shape(split0) ==> [5, 10]
```
##### Args:
* `split_dim`: A 0-D `int32` `Tensor`. The dimension along which to split.
Must be in the range `[0, rank(value))`.
* `num_split`: A 0-D `int32` `Tensor`. The number of ways to split.
* `value`: The `Tensor` to split.
* `name`: A name for the operation (optional).
##### Returns:
`num_split` `Tensor` objects resulting from splitting `value`.
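
The even-division requirement and the shape of each piece can be sketched on nested lists. This is a plain-Python illustration, not TensorFlow code; `split_nested` is a hypothetical helper that keeps the same argument order as `tf.split`.

```python
def split_nested(split_dim, num_split, value):
    """Split a nested list into `num_split` equal pieces along `split_dim`."""
    if split_dim == 0:
        n = len(value)
        if n % num_split != 0:
            raise ValueError("%d does not evenly divide %d" % (num_split, n))
        step = n // num_split
        return [value[i * step:(i + 1) * step] for i in range(num_split)]
    # Split every sub-list one dimension down, then regroup piece by piece.
    per_row = [split_nested(split_dim - 1, num_split, row) for row in value]
    return [[pieces[k] for pieces in per_row] for k in range(num_split)]


# 'value' has shape [5, 30]
value = [[r * 30 + c for c in range(30)] for r in range(5)]
split0, split1, split2 = split_nested(1, 3, value)
print(len(split0), len(split0[0]))  # 5 10
```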
- - -
### `tf.tile(input, multiples, name=None)`
Constructs a tensor by tiling a given tensor.
This operation creates a new tensor by replicating `input` `multiples` times.
The output tensor's i'th dimension has `input.dims(i) * multiples[i]` elements,
and the values of `input` are replicated `multiples[i]` times along the 'i'th
dimension. For example, tiling `[a b c d]` by `[2]` produces
`[a b c d a b c d]`.
##### Args:
* `input`: A `Tensor`. 1-D or higher.
* `multiples`: A `Tensor` of type `int32`.
1-D. Length must be the same as the number of dimensions in `input`
* `name`: A name for the operation (optional).
##### Returns:
A `Tensor`. Has the same type as `input`.
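
The replication along each dimension can be sketched recursively on nested lists. This is a plain-Python illustration, not the TensorFlow implementation; `tile_nested` is a hypothetical helper, and the inner lists of its result may be shared references, so treat the result as read-only.

```python
def tile_nested(t, multiples):
    """Replicate a nested list `multiples[i]` times along dimension i."""
    if not multiples:
        return t  # reached a scalar
    tiled_rows = [tile_nested(row, multiples[1:]) for row in t]
    return tiled_rows * multiples[0]


print(tile_nested(['a', 'b', 'c', 'd'], [2]))
# ['a', 'b', 'c', 'd', 'a', 'b', 'c', 'd']
print(tile_nested([[1, 2], [3, 4]], [2, 2]))
# [[1, 2, 1, 2], [3, 4, 3, 4], [1, 2, 1, 2], [3, 4, 3, 4]]
```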
- - -
### `tf.pad(input, paddings, name=None)`
Pads a tensor with zeros.
This operation pads `input` with zeros according to the `paddings` you
specify. `paddings` is an integer tensor with shape `[n, 2]`, where n is the
rank of `input`. For each dimension D of `input`, `paddings[D, 0]` indicates
how many zeros to add before the contents of `input` in that dimension, and
`paddings[D, 1]` indicates how many zeros to add after the contents of `input`
in that dimension.
The padded size of each dimension D of the output is:
`paddings[D, 0] + input.dim_size(D) + paddings[D, 1]`
For example:
```prettyprint
# 't' is [[1, 1], [2, 2]]
# 'paddings' is [[1, 1], [2, 2]]
# rank of 't' is 2
pad(t, paddings) ==> [[0, 0, 0, 0, 0, 0]
                      [0, 0, 1, 1, 0, 0]
                      [0, 0, 2, 2, 0, 0]
                      [0, 0, 0, 0, 0, 0]]
```
##### Args:
* `input`: A `Tensor`.
* `paddings`: A `Tensor` of type `int32`.
* `name`: A name for the operation (optional).
##### Returns:
A `Tensor`. Has the same type as `input`.
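
The padded-size rule can be checked with a small sketch on nested lists. This is a plain-Python illustration, not the TensorFlow implementation; `shape_of`, `zeros`, and `pad_nested` are hypothetical helpers, and the input is assumed to be non-empty and rectangular.

```python
def shape_of(t):
    """Shape of a rectangular nested list."""
    shape = []
    while isinstance(t, list):
        shape.append(len(t))
        t = t[0] if t else None
    return shape


def zeros(shape):
    """Nested list of zeros with the given shape (a scalar 0 for [])."""
    if not shape:
        return 0
    return [zeros(shape[1:]) for _ in range(shape[0])]


def pad_nested(t, paddings):
    """Pad each dimension D with paddings[D][0] zeros before and
    paddings[D][1] zeros after."""
    if not paddings:
        return t
    before, after = paddings[0]
    rows = [pad_nested(row, paddings[1:]) for row in t]
    inner = shape_of(rows[0])  # assumes at least one row
    return ([zeros(inner) for _ in range(before)]
            + rows
            + [zeros(inner) for _ in range(after)])


t = [[1, 1], [2, 2]]
padded = pad_nested(t, [[1, 1], [2, 2]])
for row in padded:
    print(row)
print(shape_of(padded))  # [4, 6]
```

Each output dimension has size `before + original + after`, here `1 + 2 + 1 = 4` rows and `2 + 2 + 2 = 6` columns.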
- - -
### `tf.concat(concat_dim, values, name='concat')`
Concatenates tensors along one dimension.
Concatenates the list of tensors `values` along dimension `concat_dim`. If
`values[i].shape = [D0, D1, ... Dconcat_dim(i), ...Dn]`, the concatenated
result has shape
[D0, D1, ... Rconcat_dim, ...Dn]
where
Rconcat_dim = sum(Dconcat_dim(i))
That is, the data from the input tensors is joined along the `concat_dim`
dimension.
The number of dimensions of the input tensors must match, and all dimensions
except `concat_dim` must be equal.
For example:
```python
t1 = [[1, 2, 3], [4, 5, 6]]
t2 = [[7, 8, 9], [10, 11, 12]]
tf.concat(0, [t1, t2]) ==> [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]]
tf.concat(1, [t1, t2]) ==> [[1, 2, 3, 7, 8, 9], [4, 5, 6, 10, 11, 12]]
# tensor t3 with shape [2, 3]
# tensor t4 with shape [2, 3]
tf.shape(tf.concat(0, [t3, t4])) ==> [4, 3]
tf.shape(tf.concat(1, [t3, t4])) ==> [2, 6]
```
##### Args:
* `concat_dim`: 0-D `int32` `Tensor`. Dimension along which to concatenate.
* `values`: A list of `Tensor` objects or a single `Tensor`.
* `name`: A name for the operation (optional).
##### Returns:
A `Tensor` resulting from concatenation of the input tensors.
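
The way data is joined along `concat_dim` can be sketched recursively on nested lists. This is a plain-Python illustration, not TensorFlow code; `concat_nested` is a hypothetical helper that keeps the argument order of `tf.concat` and assumes all dimensions other than `concat_dim` already match.

```python
def concat_nested(concat_dim, values):
    """Concatenate a list of nested lists along dimension `concat_dim`."""
    if concat_dim == 0:
        out = []
        for v in values:
            out.extend(v)
        return out
    # zip(*values) groups the i-th sub-list of every input together.
    return [concat_nested(concat_dim - 1, list(group))
            for group in zip(*values)]


t1 = [[1, 2, 3], [4, 5, 6]]
t2 = [[7, 8, 9], [10, 11, 12]]
print(concat_nested(0, [t1, t2]))
# [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]]
print(concat_nested(1, [t1, t2]))
# [[1, 2, 3, 7, 8, 9], [4, 5, 6, 10, 11, 12]]
```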
- - -
### `tf.pack(values, name='pack')`
Packs a list of rank-`R` tensors into one rank-`(R+1)` tensor.
Packs tensors in `values` into a tensor with rank one higher than each tensor
in `values` and shape `[len(values)] + values[0].shape`. The output satisfies
`output[i, ...] = values[i][...]`.
This is the opposite of unpack. The numpy equivalent is
tf.pack([x, y, z]) = np.asarray([x, y, z])
##### Args:
* `values`: A list of `Tensor` objects with the same shape and type.
* `name`: A name for this operation (optional).
##### Returns:
* `output`: A packed `Tensor` with the same type as `values`.
- - -
### `tf.unpack(value, num=None, name='unpack')`
Unpacks the outer dimension of a rank-`R` tensor into rank-`(R-1)` tensors.
Unpacks `num` tensors from `value` along the first dimension.
If `num` is not specified (the default), it is inferred from `value`'s shape.
If `value.shape[0]` is not known, `ValueError` is raised.
The ith tensor in `output` is the slice `value[i, ...]`. Each tensor in
`output` has shape `value.shape[1:]`.
This is the opposite of pack. The numpy equivalent is
tf.unpack(x, n) = list(x)
##### Args:
* `value`: A rank `R > 0` `Tensor` to be unpacked.
* `num`: An `int`. The first dimension of value. Automatically inferred if
`None` (the default).
* `name`: A name for the operation (optional).
##### Returns:
The list of `Tensor` objects unpacked from `value`.
##### Raises:
* `ValueError`: If `num` is unspecified and cannot be inferred.
- - -
### `tf.reverse_sequence(input, seq_lengths, seq_dim, name=None)`
Reverses variable length slices in dimension `seq_dim`.
This op first slices `input` along the first dimension, and for each slice `i`,
reverses the first `seq_lengths[i]` elements along the dimension `seq_dim`.
The elements of `seq_lengths` must obey `seq_lengths[i] < input.dims(seq_dim)`,
and `seq_lengths` must be a vector of length `input.dims(0)`.
The output slice `i` along dimension 0 is then given by input slice `i`, with
the first `seq_lengths[i]` slices along dimension `seq_dim` reversed.
For example:
```prettyprint
# Given this:
seq_dim = 1
input.dims = (4, ...)
seq_lengths = [7, 2, 3, 5]
# then slices of input are reversed on seq_dim, but only up to seq_lengths:
output[0, 0:7, :, ...] = input[0, 7:0:-1, :, ...]
output[1, 0:2, :, ...] = input[1, 2:0:-1, :, ...]
output[2, 0:3, :, ...] = input[2, 3:0:-1, :, ...]
output[3, 0:5, :, ...] = input[3, 5:0:-1, :, ...]
# while entries past seq_lengths are copied through:
output[0, 7:, :, ...] = input[0, 7:, :, ...]
output[1, 2:, :, ...] = input[1, 2:, :, ...]
output[2, 3:, :, ...] = input[2, 3:, :, ...]
output[3, 5:, :, ...] = input[3, 5:, :, ...]
```
##### Args:
* `input`: A `Tensor`. The input to reverse.
* `seq_lengths`: A `Tensor` of type `int64`.
1-D with length `input.dims(0)` and
`max(seq_lengths) < input.dims(seq_dim)`
* `seq_dim`: An `int`. The dimension which is partially reversed.
* `name`: A name for the operation (optional).
##### Returns:
A `Tensor`. Has the same type as `input`.
The partially reversed input. It has the same shape as `input`.
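
The per-slice behaviour can be sketched for the common case `seq_dim = 1`. This is a plain-Python illustration on nested lists, not the TensorFlow implementation; `reverse_sequence_dim1` is a hypothetical helper that handles only `seq_dim = 1` and does not enforce the upper bound on `seq_lengths`.

```python
def reverse_sequence_dim1(inp, seq_lengths):
    """For each slice i along dimension 0, reverse the first
    seq_lengths[i] entries along dimension 1 and copy the rest."""
    if len(seq_lengths) != len(inp):
        raise ValueError("seq_lengths must have one entry per slice")
    out = []
    for row, n in zip(inp, seq_lengths):
        out.append(row[:n][::-1] + row[n:])
    return out


inp = [[1, 2, 3, 4, 5],
       [6, 7, 8, 9, 10]]
print(reverse_sequence_dim1(inp, [3, 2]))
# [[3, 2, 1, 4, 5], [7, 6, 8, 9, 10]]
```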
- - -
### `tf.reverse(tensor, dims, name=None)`
Reverses specific dimensions of a tensor.
Given a `tensor`, and a `bool` tensor `dims` representing the dimensions
of `tensor`, this operation reverses each dimension i of `tensor` where
`dims[i]` is `True`.
`tensor` can have up to 8 dimensions. The number of dimensions
of `tensor` must equal the number of elements in `dims`. In other words:
`rank(tensor) = size(dims)`
For example:
```prettyprint
# tensor 't' is [[[[ 0, 1, 2, 3],
# [ 4, 5, 6, 7],
# [ 8, 9, 10, 11]],
# [[12, 13, 14, 15],
# [16, 17, 18, 19],
# [20, 21, 22, 23]]]]
# tensor 't' shape is [1, 2, 3, 4]
# 'dims' is [False, False, False, True]
reverse(t, dims) ==> [[[[ 3, 2, 1, 0],
[ 7, 6, 5, 4],
[ 11, 10, 9, 8]],
[[15, 14, 13, 12],
[19, 18, 17, 16],
[23, 22, 21, 20]]]]
# 'dims' is [False, True, False, False]
reverse(t, dims) ==> [[[[12, 13, 14, 15],
[16, 17, 18, 19],
[20, 21, 22, 23]],
[[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11]]]]
# 'dims' is [False, False, True, False]
reverse(t, dims) ==> [[[[8, 9, 10, 11],
[4, 5, 6, 7],
[0, 1, 2, 3]]
[[20, 21, 22, 23],
[16, 17, 18, 19],
[12, 13, 14, 15]]]]
```
##### Args:
* `tensor`: A `Tensor`. Must be one of the following types: `uint8`, `int8`, `int32`, `bool`, `float32`, `float64`.
Up to 8-D.
* `dims`: A `Tensor` of type `bool`. 1-D. The dimensions to reverse.
* `name`: A name for the operation (optional).
##### Returns:
A `Tensor`. Has the same type as `tensor`. The same shape as `tensor`.
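
The role of the boolean `dims` vector can be sketched recursively on nested lists. This is a plain-Python illustration, not TensorFlow code; `reverse_nested` is a hypothetical helper.

```python
def reverse_nested(t, dims):
    """Reverse every dimension i of a nested list where dims[i] is True."""
    if not dims:
        return t  # reached a scalar
    rows = [reverse_nested(row, dims[1:]) for row in t]
    return rows[::-1] if dims[0] else rows


# 't' has shape [1, 2, 3, 4]
t = [[[[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]],
      [[12, 13, 14, 15], [16, 17, 18, 19], [20, 21, 22, 23]]]]
print(reverse_nested(t, [False, False, False, True])[0][0])
# [[3, 2, 1, 0], [7, 6, 5, 4], [11, 10, 9, 8]]
```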
- - -
### `tf.transpose(a, perm=None, name='transpose')`
Transposes `a`. Permutes the dimensions according to `perm`.
The returned tensor's dimension i will correspond to the input dimension
`perm[i]`. If `perm` is not given, it is set to (n-1...0), where n is
the rank of the input tensor. Hence by default, this operation performs a
regular matrix transpose on 2-D input Tensors.
For example:
```python
# 'x' is [[1 2 3]
# [4 5 6]]
tf.transpose(x) ==> [[1 4]
[2 5]
[3 6]]
# Equivalently
tf.transpose(x, perm=[1, 0]) ==> [[1 4]
[2 5]
[3 6]]
# 'perm' is more useful for n-dimensional tensors, for n > 2
# 'x' is [[[1 2 3]
# [4 5 6]]
# [[7 8 9]
# [10 11 12]]]
# Take the transpose of the matrices in dimension-0
tf.transpose(x, perm=[0, 2, 1]) ==> [[[1 4]
[2 5]
[3 6]]
[[7 10]
[8 11]
[9 12]]]
```
##### Args:
* `a`: A `Tensor`.
* `perm`: A permutation of the dimensions of `a`.
* `name`: A name for the operation (optional).
##### Returns:
A transposed `Tensor`.
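
The rule that output dimension `i` corresponds to input dimension `perm[i]` can be sketched on nested lists. This is a plain-Python illustration, not the TensorFlow implementation; `shape_of` and `transpose_nested` are hypothetical helpers, and the input is assumed to be non-empty and rectangular.

```python
def shape_of(t):
    """Shape of a rectangular nested list."""
    shape = []
    while isinstance(t, list):
        shape.append(len(t))
        t = t[0] if t else None
    return shape


def transpose_nested(a, perm=None):
    shape = shape_of(a)
    n = len(shape)
    if perm is None:
        perm = list(range(n - 1, -1, -1))  # default: reverse all dimensions
    out_shape = [shape[p] for p in perm]

    def build(prefix):
        if len(prefix) == n:
            # Output index `prefix` reads the input element whose index
            # along dimension perm[i] is prefix[i].
            src = [0] * n
            for i, p in enumerate(perm):
                src[p] = prefix[i]
            element = a
            for k in src:
                element = element[k]
            return element
        return [build(prefix + [k]) for k in range(out_shape[len(prefix)])]

    return build([])


x = [[1, 2, 3], [4, 5, 6]]
print(transpose_nested(x))               # [[1, 4], [2, 5], [3, 6]]
print(transpose_nested(x, perm=[1, 0]))  # [[1, 4], [2, 5], [3, 6]]
```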
- - -
### `tf.gather(params, indices, name=None)`
Gather slices from `params` according to `indices`.
`indices` must be an integer tensor of any dimension (usually 0-D or 1-D).
Produces an output tensor with shape `indices.shape + params.shape[1:]` where:
# Scalar indices
output[:, ..., :] = params[indices, :, ... :]
# Vector indices
output[i, :, ..., :] = params[indices[i], :, ... :]
# Higher rank indices
output[i, ..., j, :, ... :] = params[indices[i, ..., j], :, ..., :]
If `indices` is a permutation and `len(indices) == params.shape[0]` then
this operation will permute `params` accordingly.
##### Args:
* `params`: A `Tensor`.
* `indices`: A `Tensor`. Must be one of the following types: `int32`, `int64`.
* `name`: A name for the operation (optional).
##### Returns:
A `Tensor`. Has the same type as `params`.
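
The scalar, vector, and higher-rank index cases above can be sketched with one recursive helper on nested lists. This is a plain-Python illustration, not TensorFlow code; `gather_nested` is a hypothetical name.

```python
def gather_nested(params, indices):
    """Select slices of `params` along its first dimension.

    The result has the nesting structure of `indices`, with each index
    replaced by the slice params[index].
    """
    if isinstance(indices, list):
        return [gather_nested(params, i) for i in indices]
    return params[indices]


params = [[1, 2], [3, 4], [5, 6]]
print(gather_nested(params, 2))          # [5, 6]
print(gather_nested(params, [2, 0]))     # [[5, 6], [1, 2]]
print(gather_nested(params, [2, 0, 1]))  # [[5, 6], [1, 2], [3, 4]]
```

The last call passes a permutation of `[0, 1, 2]`, so the rows of `params` come back reordered.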
- - -
### `tf.dynamic_partition(data, partitions, num_partitions, name=None)`
Partitions `data` into `num_partitions` tensors using indices from `partitions`.
For each index tuple `js` of size `partitions.ndim`, the slice `data[js, ...]`
becomes part of `outputs[partitions[js]]`. The slices with `partitions[js] = i`
are placed in `outputs[i]` in lexicographic order of `js`, and the first
dimension of `outputs[i]` is the number of entries in `partitions` equal to `i`.
In detail,

    outputs[i].shape = [sum(partitions == i)] + data.shape[partitions.ndim:]
    outputs[i] = pack([data[js, ...] for js if partitions[js] == i])

`data.shape` must start with `partitions.shape`.

For example:

    # Scalar partitions
    partitions = 1
    num_partitions = 2
    data = [10, 20]
    outputs[0] = []  # Empty with shape [0, 2]
    outputs[1] = [[10, 20]]

    # Vector partitions
    partitions = [0, 0, 1, 1, 0]
    num_partitions = 2
    data = [10, 20, 30, 40, 50]
    outputs[0] = [10, 20, 50]
    outputs[1] = [30, 40]

##### Args:
* `data`: A `Tensor`.
* `partitions`: A `Tensor` of type `int32`.
Any shape. Indices in the range `[0, num_partitions)`.
* `num_partitions`: An `int` that is `>= 1`.
The number of partitions to output.
* `name`: A name for the operation (optional).
##### Returns:
A list of `num_partitions` `Tensor` objects of the same type as data.
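
The partitioning rule can be mimicked on nested Python lists. This is a minimal sketch of the semantics only; `partition_nested` is a hypothetical name, and `partitions` is assumed to be an integer or a nested list of integers whose shape is a prefix of the shape of `data`.

```python
def partition_nested(data, partitions, num_partitions):
    """Route each slice data[js, ...] to outputs[partitions[js]]."""
    outputs = [[] for _ in range(num_partitions)]

    def visit(d, p):
        if isinstance(p, list):
            # Walk both structures together, in lexicographic order of js.
            for d_i, p_i in zip(d, p):
                visit(d_i, p_i)
        else:
            outputs[p].append(d)

    visit(data, partitions)
    return outputs


outputs = partition_nested([10, 20, 30, 40, 50], [0, 0, 1, 1, 0], 2)
print(outputs[0])  # [10, 20, 50]
print(outputs[1])  # [30, 40]
```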
- - -
### `tf.dynamic_stitch(indices, data, name=None)`
Interleave the values from the `data` tensors into a single tensor.
Builds a merged tensor such that

    merged[indices[m][i, ..., j], ...] = data[m][i, ..., j, ...]

For example, if each `indices[m]` is scalar or vector, we have

    # Scalar indices
    merged[indices[m], ...] = data[m][...]

    # Vector indices
    merged[indices[m][i], ...] = data[m][i, ...]

Each `data[i].shape` must start with the corresponding `indices[i].shape`,
and the rest of `data[i].shape` must be constant w.r.t. `i`. That is, we
must have `data[i].shape = indices[i].shape + constant`. In terms of this
`constant`, the output shape is

    merged.shape = [max(indices) + 1] + constant

Values are merged in order, so if an index appears in both `indices[m][i]` and
`indices[n][j]` for `(m,i) < (n,j)` the slice `data[n][j]` will appear in the
merged result.
For example:

    indices[0] = 6
    indices[1] = [4, 1]
    indices[2] = [[5, 2], [0, 3]]
    data[0] = [61, 62]
    data[1] = [[41, 42], [11, 12]]
    data[2] = [[[51, 52], [21, 22]], [[1, 2], [31, 32]]]
    merged = [[1, 2], [11, 12], [21, 22], [31, 32], [41, 42],
              [51, 52], [61, 62]]

##### Args:
* `indices`: A list of at least 2 `Tensor` objects of type `int32`.
* `data`: A list of `Tensor` objects of the same type, containing the same number of tensors as `indices`.
* `name`: A name for the operation (optional).
##### Returns:
A `Tensor`. Has the same type as `data`.
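
The merge rule, including the "later value wins" behavior for repeated indices, can be sketched on nested Python lists. `stitch_nested` is a hypothetical name, and the sketch assumes that every index from `0` to the largest index is supplied at least once; it is not TensorFlow's implementation.

```python
def stitch_nested(indices, data):
    """Place each slice data[m][i, ...] at position indices[m][i]."""
    merged = {}

    def visit(idx, d):
        if isinstance(idx, list):
            for idx_i, d_i in zip(idx, d):
                visit(idx_i, d_i)
        else:
            # Later assignments overwrite earlier ones.
            merged[idx] = d

    for idx, d in zip(indices, data):
        visit(idx, d)
    # The first dimension of the result has max(indices) + 1 entries.
    return [merged[k] for k in range(max(merged) + 1)]


indices = [6, [4, 1], [[5, 2], [0, 3]]]
data = [[61, 62],
        [[41, 42], [11, 12]],
        [[[51, 52], [21, 22]], [[1, 2], [31, 32]]]]
print(stitch_nested(indices, data))
```

Run on the example above, the sketch reproduces the seven rows of `merged`, from `[1, 2]` at index 0 to `[61, 62]` at index 6.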
================================================
FILE: SOURCE/api_docs/python/client.md
================================================
# Running Graphs
## Contents
### [Running Graphs](#AUTOGENERATED-running-graphs)
* [Session management](#AUTOGENERATED-session-management)
* [`class tf.Session`](#Session)
* [`class tf.InteractiveSession`](#InteractiveSession)
* [`tf.get_default_session()`](#get_default_session)
* [Error classes](#AUTOGENERATED-error-classes)
* [`class tf.OpError`](#OpError)
* [`class tf.errors.CancelledError`](#CancelledError)
* [`class tf.errors.UnknownError`](#UnknownError)
* [`class tf.errors.InvalidArgumentError`](#InvalidArgumentError)
* [`class tf.errors.DeadlineExceededError`](#DeadlineExceededError)
* [`class tf.errors.NotFoundError`](#NotFoundError)
* [`class tf.errors.AlreadyExistsError`](#AlreadyExistsError)
* [`class tf.errors.PermissionDeniedError`](#PermissionDeniedError)
* [`class tf.errors.UnauthenticatedError`](#UnauthenticatedError)
* [`class tf.errors.ResourceExhaustedError`](#ResourceExhaustedError)
* [`class tf.errors.FailedPreconditionError`](#FailedPreconditionError)
* [`class tf.errors.AbortedError`](#AbortedError)
* [`class tf.errors.OutOfRangeError`](#OutOfRangeError)
* [`class tf.errors.UnimplementedError`](#UnimplementedError)
* [`class tf.errors.InternalError`](#InternalError)
* [`class tf.errors.UnavailableError`](#UnavailableError)
* [`class tf.errors.DataLossError`](#DataLossError)
This library contains classes for launching graphs and executing operations.
The [basic usage](../../get_started/introduction.md#basic-usage) guide has
examples of how a graph is launched in a [`tf.Session`](#Session).
## Session management
- - -
### `class tf.Session`
A class for running TensorFlow operations.
A `Session` object encapsulates the environment in which `Operation`
objects are executed, and `Tensor` objects are evaluated. For
example:
```python
# Build a graph.
a = tf.constant(5.0)
b = tf.constant(6.0)
c = a * b
# Launch the graph in a session.
sess = tf.Session()
# Evaluate the tensor `c`.
print(sess.run(c))
```
A session may own resources, such as
[variables](../../api_docs/python/state_ops.md#Variable), [queues](../../api_docs/python/io_ops.md#QueueBase),
and [readers](../../api_docs/python/io_ops.md#ReaderBase). It is important to release
these resources when they are no longer required. To do this, either
invoke the [`close()`](#Session.close) method on the session, or use
the session as a context manager. The following two examples are
equivalent:
```python
# Using the `close()` method.
sess = tf.Session()
sess.run(...)
sess.close()
# Using the context manager.
with tf.Session() as sess:
    sess.run(...)
```
The [`ConfigProto`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/framework/config.proto)
protocol buffer exposes various configuration options for a
session. For example, to create a session that uses soft constraints
for device placement, and log the resulting placement decisions,
create a session as follows:
```python
# Launch the graph in a session that allows soft device placement and
# logs the placement decisions.
sess = tf.Session(config=tf.ConfigProto(allow_soft_placement=True,
                                        log_device_placement=True))
```
- - -
#### `tf.Session.__init__(target='', graph=None, config=None)`
Creates a new TensorFlow session.
If no `graph` argument is specified when constructing the session,
the default graph will be launched in the session. If you are
using more than one graph (created with `tf.Graph()` in the same
process), you will have to use different sessions for each graph,
but each graph can be used in multiple sessions. In this case, it
is often clearer to pass the graph to be launched explicitly to
the session constructor.
##### Args:
* `target`: (Optional.) The execution engine to connect to.
Defaults to using an in-process engine. At present, no value
other than the empty string is supported.
* `graph`: (Optional.) The `Graph` to be launched (described above).
* `config`: (Optional.) A [`ConfigProto`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/framework/config.proto)
protocol buffer with configuration options for the session.
- - -
#### `tf.Session.run(fetches, feed_dict=None)`
Runs the operations and evaluates the tensors in `fetches`.
This method runs one "step" of TensorFlow computation, by
running the necessary graph fragment to execute every `Operation`
and evaluate every `Tensor` in `fetches`, substituting the values in
`feed_dict` for the corresponding input values.
The `fetches` argument may be a list of graph elements or a single
graph element, and these determine the return value of this
method. A graph element can be one of the following types:
* If the *i*th element of `fetches` is an
[`Operation`](../../api_docs/python/framework.md#Operation), the *i*th
return value will be `None`.
* If the *i*th element of `fetches` is a
[`Tensor`](../../api_docs/python/framework.md#Tensor), the *i*th return
value will be a numpy ndarray containing the value of that tensor.
* If the *i*th element of `fetches` is a
[`SparseTensor`](../../api_docs/python/sparse_ops.md#SparseTensor),
the *i*th return value will be a
[`SparseTensorValue`](../../api_docs/python/sparse_ops.md#SparseTensorValue)
containing the value of that sparse tensor.
The optional `feed_dict` argument allows the caller to override
the value of tensors in the graph. Each key in `feed_dict` can be
one of the following types:
* If the key is a [`Tensor`](../../api_docs/python/framework.md#Tensor), the
value may be a Python scalar, string, list, or numpy ndarray
that can be converted to the same `dtype` as that
tensor. Additionally, if the key is a
[placeholder](../../api_docs/python/io_ops.md#placeholder), the shape of
the value will be checked for compatibility with the placeholder.
* If the key is a
[`SparseTensor`](../../api_docs/python/sparse_ops.md#SparseTensor),
the value should be a
[`SparseTensorValue`](../../api_docs/python/sparse_ops.md#SparseTensorValue).
##### Args:
* `fetches`: A single graph element, or a list of graph elements
(described above).
* `feed_dict`: A dictionary that maps graph elements to values
(described above).
##### Returns:
Either a single value if `fetches` is a single graph element, or
a list of values if `fetches` is a list (described above).
##### Raises:
* `RuntimeError`: If this `Session` is in an invalid state (e.g. has been
closed).
* `TypeError`: If `fetches` or `feed_dict` keys are of an inappropriate type.
* `ValueError`: If `fetches` or `feed_dict` keys are invalid or refer to a
`Tensor` that doesn't exist.
- - -
#### `tf.Session.close()`
Closes this session.
Calling this method frees all resources associated with the session.
##### Raises:
* `RuntimeError`: If an error occurs while closing the session.
- - -
#### `tf.Session.graph`
The graph that was launched in this session.
- - -
#### `tf.Session.as_default()`
Returns a context manager that makes this object the default session.
Use with the `with` keyword to specify that calls to
[`Operation.run()`](../../api_docs/python/framework.md#Operation.run) or
[`Tensor.eval()`](../../api_docs/python/framework.md#Tensor.eval) should be
executed in this session.
```python
c = tf.constant(...)
sess = tf.Session()
with sess.as_default():
    assert tf.get_default_session() is sess
    print(c.eval())
```
To get the current default session, use
[`tf.get_default_session()`](#get_default_session).
*N.B.* The `as_default` context manager *does not* close the
session when you exit the context, and you must close the session
explicitly.
```python
c = tf.constant(...)
sess = tf.Session()
with sess.as_default():
    print(c.eval())
# ...
with sess.as_default():
    print(c.eval())
sess.close()
```
Alternatively, you can use `with tf.Session():` to create a
session that is automatically closed on exiting the context,
including when an uncaught exception is raised.
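
The "closed on exit, even when an exception is raised" behavior is the ordinary Python context-manager protocol. The toy class below, `ClosingSession`, is a hypothetical stand-in used only to show that protocol; it runs no TensorFlow code and makes no claim about how `tf.Session` is implemented internally.

```python
class ClosingSession(object):
    """Toy stand-in for a session that closes itself on context exit."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.close()
        return False  # Do not swallow the exception.


with ClosingSession() as ok_sess:
    pass  # Normal exit.

try:
    with ClosingSession() as failing_sess:
        raise RuntimeError("step failed")
except RuntimeError:
    pass

print(ok_sess.closed, failing_sess.closed)  # True True
```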
*N.B.* The default session is a property of the current thread. If you
create a new thread, and wish to use the default session in that
thread, you must explicitly add a `with sess.as_default():` in that
thread's function.
##### Returns:
A context manager using this session as the default session.
- - -
### `class tf.InteractiveSession`
A TensorFlow `Session` for use in interactive contexts, such as a shell.
The only difference from a regular `Session` is that an `InteractiveSession`
installs itself as the default session on construction.
The methods [`Tensor.eval()`](../../api_docs/python/framework.md#Tensor.eval)
and [`Operation.run()`](../../api_docs/python/framework.md#Operation.run)
will use that session to run ops.
This is convenient in interactive shells and [IPython
notebooks](http://ipython.org), as it avoids having to pass an explicit
`Session` object to run ops.
For example:
```python
sess = tf.InteractiveSession()
a = tf.constant(5.0)
b = tf.constant(6.0)
c = a * b
# We can just use 'c.eval()' without passing 'sess'
print(c.eval())
sess.close()
```
Note that a regular session installs itself as the default session when it
is created in a `with` statement. The common usage in non-interactive
programs is to follow that pattern:
```python
a = tf.constant(5.0)
b = tf.constant(6.0)
c = a * b
with tf.Session():
    # We can also use 'c.eval()' here.
    print(c.eval())
```
- - -
#### `tf.InteractiveSession.__init__(target='', graph=None)`
Creates a new interactive TensorFlow session.
If no `graph` argument is specified when constructing the session,
the default graph will be launched in the session. If you are
using more than one graph (created with `tf.Graph()` in the same
process), you will have to use different sessions for each graph,
but each graph can be used in multiple sessions. In this case, it
is often clearer to pass the graph to be launched explicitly to
the session constructor.
##### Args:
* `target`: (Optional.) The execution engine to connect to.
Defaults to using an in-process engine. At present, no value
other than the empty string is supported.
* `graph`: (Optional.) The `Graph` to be launched (described above).
- - -
#### `tf.InteractiveSession.close()`
Closes an `InteractiveSession`.
- - -
### `tf.get_default_session()`
Returns the default session for the current thread.
The returned `Session` will be the innermost session on which a
`Session` or `Session.as_default()` context has been entered.
*N.B.* The default session is a property of the current thread. If you
create a new thread, and wish to use the default session in that
thread, you must explicitly add a `with sess.as_default():` in that
thread's function.
##### Returns:
The default `Session` being used in the current thread.
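
To see why a new thread does not inherit the default session, it can help to model the default as a per-thread stack. The sketch below is a toy model built on `threading.local`; `ToySession` and this `get_default_session` are hypothetical names for illustration and are not claimed to match TensorFlow's internals.

```python
import threading
from contextlib import contextmanager

_local = threading.local()  # Each thread sees its own attributes.


class ToySession(object):
    @contextmanager
    def as_default(self):
        if not hasattr(_local, "stack"):
            _local.stack = []
        _local.stack.append(self)
        try:
            yield self
        finally:
            _local.stack.pop()


def get_default_session():
    stack = getattr(_local, "stack", [])
    return stack[-1] if stack else None


sess = ToySession()
seen = {}


def worker_without_context():
    seen["without"] = get_default_session()


def worker_with_context():
    with sess.as_default():
        seen["with"] = get_default_session()


with sess.as_default():
    seen["main"] = get_default_session()
    for target in (worker_without_context, worker_with_context):
        thread = threading.Thread(target=target)
        thread.start()
        thread.join()

print(seen["main"] is sess, seen["without"] is None, seen["with"] is sess)
```

In this model the worker that does not enter `sess.as_default()` sees no default session, even though the main thread has one installed at the time.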
## Error classes
- - -
### `class tf.OpError`
A generic error that is raised when TensorFlow execution fails.
Whenever possible, the session will raise a more specific subclass
of `OpError` from the `tf.errors` module.
- - -
#### `tf.OpError.op`
The operation that failed, if known.
*N.B.* If the failed op was synthesized at runtime, e.g. a `Send`
or `Recv` op, there will be no corresponding
[`Operation`](../../api_docs/python/framework.md#Operation) object. In that case, this
will return `None`, and you should instead use the
[`OpError.node_def`](#OpError.node_def) to discover information about the
op.
##### Returns:
The `Operation` that failed, or None.
- - -
#### `tf.OpError.node_def`
The `NodeDef` proto representing the op that failed.
#### Other Methods
- - -
#### `tf.OpError.__init__(node_def, op, message, error_code)`
Creates a new OpError indicating that a particular op failed.
##### Args:
* `node_def`: The graph_pb2.NodeDef proto representing the op that failed.
* `op`: The ops.Operation that failed, if known; otherwise None.
* `message`: The message string describing the failure.
* `error_code`: The error_codes_pb2.Code describing the error.
- - -
#### `tf.OpError.error_code`
The integer error code that describes the error.
- - -
#### `tf.OpError.message`
The error message that describes the error.
- - -
### `class tf.errors.CancelledError`
Raised when an operation or step is cancelled.
For example, a long-running operation (e.g.
[`queue.enqueue()`](../../api_docs/python/io_ops.md#QueueBase.enqueue)) may be
cancelled by running another operation (e.g.
[`queue.close(cancel_pending_enqueues=True)`](../../api_docs/python/io_ops.md#QueueBase.close)),
or by [closing the session](../../api_docs/python/client.md#Session.close).
A step that is running such a long-running operation will fail by raising
`CancelledError`.
- - -
#### `tf.errors.CancelledError.__init__(node_def, op, message)`
Creates a `CancelledError`.
- - -
### `class tf.errors.UnknownError`
Unknown error.
An example of where this error may be returned is if a Status value
received from another address space belongs to an error-space that
is not known to this address space. Also errors raised by APIs that
do not return enough error information may be converted to this
error.
- - -
#### `tf.errors.UnknownError.__init__(node_def, op, message, error_code=2)`
Creates an `UnknownError`.
- - -
### `class tf.errors.InvalidArgumentError`
Raised when an operation receives an invalid argument.
This may occur, for example, if an operation receives an input
tensor that has an invalid value or shape. For example, the
[`tf.matmul()`](../../api_docs/python/math_ops.md#matmul) op will raise this
error if it receives an input that is not a matrix, and the
[`tf.reshape()`](../../api_docs/python/array_ops.md#reshape) op will raise
this error if the new shape does not match the number of elements in the input
tensor.
- - -
#### `tf.errors.InvalidArgumentError.__init__(node_def, op, message)`
Creates an `InvalidArgumentError`.
- - -
### `class tf.errors.DeadlineExceededError`
Raised when a deadline expires before an operation could complete.
This exception is not currently used.
- - -
#### `tf.errors.DeadlineExceededError.__init__(node_def, op, message)`
Creates a `DeadlineExceededError`.
- - -
### `class tf.errors.NotFoundError`
Raised when a requested entity (e.g., a file or directory) was not found.
For example, running the
[`tf.WholeFileReader.read()`](../../api_docs/python/io_ops.md#WholeFileReader)
operation could raise `NotFoundError` if it receives the name of a file that
does not exist.
- - -
#### `tf.errors.NotFoundError.__init__(node_def, op, message)`
Creates a `NotFoundError`.
- - -
### `class tf.errors.AlreadyExistsError`
Raised when an entity that we attempted to create already exists.
For example, running an operation that saves a file
(e.g. [`tf.train.Saver.save()`](../../api_docs/python/train.md#Saver.save))
could potentially raise this exception if an explicit filename for an
existing file was passed.
- - -
#### `tf.errors.AlreadyExistsError.__init__(node_def, op, message)`
Creates an `AlreadyExistsError`.
- - -
### `class tf.errors.PermissionDeniedError`
Raised when the caller does not have permission to run an operation.
For example, running the
[`tf.WholeFileReader.read()`](../../api_docs/python/io_ops.md#WholeFileReader)
operation could raise `PermissionDeniedError` if it receives the name of a
file for which the user does not have the read file permission.
- - -
#### `tf.errors.PermissionDeniedError.__init__(node_def, op, message)`
Creates a `PermissionDeniedError`.
- - -
### `class tf.errors.UnauthenticatedError`
The request does not have valid authentication credentials.
This exception is not currently used.
- - -
#### `tf.errors.UnauthenticatedError.__init__(node_def, op, message)`
Creates an `UnauthenticatedError`.
- - -
### `class tf.errors.ResourceExhaustedError`
Some resource has been exhausted.
For example, this error might be raised if a per-user quota is
exhausted, or perhaps the entire file system is out of space.
- - -
#### `tf.errors.ResourceExhaustedError.__init__(node_def, op, message)`
Creates a `ResourceExhaustedError`.
- - -
### `class tf.errors.FailedPreconditionError`
Operation was rejected because the system is not in a state to execute it.
This exception is most commonly raised when running an operation
that reads a [`tf.Variable`](../../api_docs/python/state_ops.md#Variable)
before it has been initialized.
- - -
#### `tf.errors.FailedPreconditionError.__init__(node_def, op, message)`
Creates a `FailedPreconditionError`.
- - -
### `class tf.errors.AbortedError`
The operation was aborted, typically due to a concurrent action.
For example, running a
[`queue.enqueue()`](../../api_docs/python/io_ops.md#QueueBase.enqueue)
operation may raise `AbortedError` if a
[`queue.close()`](../../api_docs/python/io_ops.md#QueueBase.close) operation
previously ran.
- - -
#### `tf.errors.AbortedError.__init__(node_def, op, message)`
Creates an `AbortedError`.
- - -
### `class tf.errors.OutOfRangeError`
Raised when an operation executed past the valid range.
This exception is raised in "end-of-file" conditions, such as when a
[`queue.dequeue()`](../../api_docs/python/io_ops.md#QueueBase.dequeue)
operation is blocked on an empty queue, and a
[`queue.close()`](../../api_docs/python/io_ops.md#QueueBase.close)
operation executes.
- - -
#### `tf.errors.OutOfRangeError.__init__(node_def, op, message)`
Creates an `OutOfRangeError`.
- - -
### `class tf.errors.UnimplementedError`
Raised when an operation has not been implemented.
Some operations may raise this error when passed otherwise-valid
arguments that they do not currently support. For example, running
the [`tf.nn.max_pool()`](../../api_docs/python/nn.md#max_pool) operation
would raise this error if pooling was requested on the batch dimension,
because this is not yet supported.
- - -
#### `tf.errors.UnimplementedError.__init__(node_def, op, message)`
Creates an `UnimplementedError`.
- - -
### `class tf.errors.InternalError`
Raised when the system experiences an internal error.
This exception is raised when some invariant expected by the runtime
has been broken. Catching this exception is not recommended.
- - -
#### `tf.errors.InternalError.__init__(node_def, op, message)`
Creates an `InternalError`.
- - -
### `class tf.errors.UnavailableError`
Raised when the runtime is currently unavailable.
This exception is not currently used.
- - -
#### `tf.errors.UnavailableError.__init__(node_def, op, message)`
Creates an `UnavailableError`.
- - -
### `class tf.errors.DataLossError`
Raised when unrecoverable data loss or corruption is encountered.
For example, this may be raised by running a
[`tf.WholeFileReader.read()`](../../api_docs/python/io_ops.md#WholeFileReader)
operation, if the file is truncated while it is being read.
- - -
#### `tf.errors.DataLossError.__init__(node_def, op, message)`
Creates a `DataLossError`.
================================================
FILE: SOURCE/api_docs/python/constant_op.md
================================================
# Constants, Sequences, and Random Values
Note: Functions taking `Tensor` arguments can also take anything accepted by
[`tf.convert_to_tensor`](../../api_docs/python/framework.md#convert_to_tensor).
## Contents
### [Constants, Sequences, and Random Values](#AUTOGENERATED-constants--sequences--and-random-values)
* [Constant Value Tensors](#AUTOGENERATED-constant-value-tensors)
* [`tf.zeros(shape, dtype=tf.float32, name=None)`](#zeros)
* [`tf.zeros_like(tensor, dtype=None, name=None)`](#zeros_like)
* [`tf.ones(shape, dtype=tf.float32, name=None)`](#ones)
* [`tf.ones_like(tensor, dtype=None, name=None)`](#ones_like)
* [`tf.fill(dims, value, name=None)`](#fill)
* [`tf.constant(value, dtype=None, shape=None, name='Const')`](#constant)
* [Sequences](#AUTOGENERATED-sequences)
* [`tf.linspace(start, stop, num, name=None)`](#linspace)
* [`tf.range(start, limit, delta=1, name='range')`](#range)
* [Random Tensors](#AUTOGENERATED-random-tensors)
* [Examples:](#AUTOGENERATED-examples-)
* [`tf.random_normal(shape, mean=0.0, stddev=1.0, dtype=tf.float32, seed=None, name=None)`](#random_normal)
* [`tf.truncated_normal(shape, mean=0.0, stddev=1.0, dtype=tf.float32, seed=None, name=None)`](#truncated_normal)
* [`tf.random_uniform(shape, minval=0.0, maxval=1.0, dtype=tf.float32, seed=None, name=None)`](#random_uniform)
* [`tf.random_shuffle(value, seed=None, name=None)`](#random_shuffle)
* [`tf.set_random_seed(seed)`](#set_random_seed)
## Constant Value Tensors
TensorFlow provides several operations that you can use to generate constants.
- - -
### `tf.zeros(shape, dtype=tf.float32, name=None)`
Creates a tensor with all elements set to zero.
This operation returns a tensor of type `dtype` with shape `shape` and
all elements set to zero.
For example:
```python
tf.zeros([3, 4], tf.int32) ==> [[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
```
##### Args:
* `shape`: Either a list of integers, or a 1-D `Tensor` of type `int32`.
* `dtype`: The type of an element in the resulting `Tensor`.
* `name`: A name for the operation (optional).
##### Returns:
A `Tensor` with all elements set to zero.
- - -
### `tf.zeros_like(tensor, dtype=None, name=None)`
Creates a tensor with all elements set to zero.
Given a single tensor (`tensor`), this operation returns a tensor of the
same type and shape as `tensor` with all elements set to zero. Optionally,
you can use `dtype` to specify a new type for the returned tensor.
For example:
```python
# 'tensor' is [[1, 2, 3], [4, 5, 6]]
tf.zeros_like(tensor) ==> [[0, 0, 0], [0, 0, 0]]
```
##### Args:
* `tensor`: A `Tensor`.
* `dtype`: A type for the returned `Tensor`. Must be `float32`, `float64`,
`int8`, `int16`, `int32`, `int64`, `uint8`, or `complex64`.
* `name`: A name for the operation (optional).
##### Returns:
A `Tensor` with all elements set to zero.
- - -
### `tf.ones(shape, dtype=tf.float32, name=None)`
Creates a tensor with all elements set to 1.
This operation returns a tensor of type `dtype` with shape `shape` and all
elements set to 1.
For example:
```python
tf.ones([2, 3], tf.int32) ==> [[1, 1, 1], [1, 1, 1]]
```
##### Args:
* `shape`: Either a list of integers, or a 1-D `Tensor` of type `int32`.
* `dtype`: The type of an element in the resulting `Tensor`.
* `name`: A name for the operation (optional).
##### Returns:
A `Tensor` with all elements set to 1.
- - -
### `tf.ones_like(tensor, dtype=None, name=None)`
Creates a tensor with all elements set to 1.
Given a single tensor (`tensor`), this operation returns a tensor of the same
type and shape as `tensor` with all elements set to 1. Optionally, you can
specify a new type (`dtype`) for the returned tensor.
For example:
```python
# 'tensor' is [[1, 2, 3], [4, 5, 6]]
tf.ones_like(tensor) ==> [[1, 1, 1], [1, 1, 1]]
```
##### Args:
* `tensor`: A `Tensor`.
* `dtype`: A type for the returned `Tensor`. Must be `float32`, `float64`,
`int8`, `int16`, `int32`, `int64`, `uint8`, or `complex64`.
* `name`: A name for the operation (optional).
##### Returns:
A `Tensor` with all elements set to 1.
- - -
### `tf.fill(dims, value, name=None)`
Creates a tensor filled with a scalar value.
This operation creates a tensor of shape `dims` and fills it with `value`.
For example:
```prettyprint
# output tensor shape needs to be [2, 3]
# so 'dims' is [2, 3]
fill(dims, 9) ==> [[9, 9, 9]
                   [9, 9, 9]]
```
##### Args:
* `dims`: A `Tensor` of type `int32`.
1-D. Represents the shape of the output tensor.
* `value`: A `Tensor`. 0-D (scalar). Value to fill the returned tensor.
* `name`: A name for the operation (optional).
##### Returns:
A `Tensor`. Has the same type as `value`.
- - -
### `tf.constant(value, dtype=None, shape=None, name='Const')`
Creates a constant tensor.
The resulting tensor is populated with values of type `dtype`, as
specified by arguments `value` and (optionally) `shape` (see examples
below).
The argument `value` can be a constant value, or a list of values of type
`dtype`. If `value` is a list, then the length of the list must be less
than or equal to the number of elements implied by the `shape` argument (if
specified). In the case where the list length is less than the number of
elements specified by `shape`, the last element in the list will be used
to fill the remaining entries.
The argument `shape` is optional. If present, it specifies the dimensions
of the resulting tensor. If not present, then the tensor is a scalar (0-D)
if `value` is a scalar, or 1-D otherwise.
If the argument `dtype` is not specified, then the type is inferred from
the type of `value`.
For example:
```python
# Constant 1-D Tensor populated with value list.
tensor = tf.constant([1, 2, 3, 4, 5, 6, 7]) => [1 2 3 4 5 6 7]
# Constant 2-D tensor populated with scalar value -1.
tensor = tf.constant(-1.0, shape=[2, 3]) => [[-1. -1. -1.]
                                             [-1. -1. -1.]]
```
##### Args:
* `value`: A constant value (or list) of output type `dtype`.
* `dtype`: The type of the elements of the resulting tensor.
* `shape`: Optional dimensions of resulting tensor.
* `name`: Optional name for the tensor.
##### Returns:
A Constant Tensor.
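
The padding rule for a `value` list that is shorter than `shape` requires can be sketched in plain Python. `fill_constant` is a hypothetical helper that returns nested lists; the row-major ordering of elements is an assumption of this sketch, and the code is not TensorFlow's implementation.

```python
def fill_constant(value, shape):
    """Build a nested list of `shape`, padding with the last value."""
    values = list(value) if isinstance(value, (list, tuple)) else [value]
    total = 1
    for dim in shape:
        total *= dim
    if len(values) > total:
        raise ValueError("too many values for the requested shape")
    # Repeat the last element to fill the remaining entries.
    values = values + [values[-1]] * (total - len(values))

    def reshape(flat, dims):
        if not dims:
            return flat[0]
        step = len(flat) // dims[0]
        return [reshape(flat[i * step:(i + 1) * step], dims[1:])
                for i in range(dims[0])]

    return reshape(values, list(shape))


print(fill_constant(-1.0, [2, 3]))        # [[-1.0, -1.0, -1.0], [-1.0, -1.0, -1.0]]
print(fill_constant([1, 2, 3, 4], [2, 3]))  # [[1, 2, 3], [4, 4, 4]]
```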
## Sequences
- - -
### `tf.linspace(start, stop, num, name=None)`
Generates values in an interval.
A sequence of `num` evenly-spaced values is generated beginning at `start`.
If `num > 1`, the values in the sequence increase by `(stop - start) / (num - 1)`,
so that the last one is exactly `stop`.
For example:
```
tf.linspace(10.0, 12.0, 3, name="linspace") => [ 10.0 11.0 12.0]
```
##### Args:
* `start`: A `Tensor`. Must be one of the following types: `float32`, `float64`.
First entry in the range.
* `stop`: A `Tensor`. Must have the same type as `start`.
Last entry in the range.
* `num`: A `Tensor` of type `int32`. Number of values to generate.
* `name`: A name for the operation (optional).
##### Returns:
A `Tensor`. Has the same type as `start`. 1-D. The generated values.
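
The step size `(stop - start) / (num - 1)` can be checked with a small plain-Python sketch. `linspace_list` is a hypothetical name; the sketch returns a Python list and is only meant to illustrate the arithmetic.

```python
def linspace_list(start, stop, num):
    """Return `num` evenly spaced values from `start` to `stop` inclusive."""
    if num == 1:
        return [start]
    step = (stop - start) / float(num - 1)
    # Append `stop` explicitly so the last value is exact.
    return [start + i * step for i in range(num - 1)] + [stop]


print(linspace_list(10.0, 12.0, 3))  # [10.0, 11.0, 12.0]
```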
- - -
### `tf.range(start, limit, delta=1, name='range')`
Creates a sequence of integers.
This operation creates a sequence of integers that begins at `start` and
extends by increments of `delta` up to but not including `limit`.
For example:
```
# 'start' is 3
# 'limit' is 18
# 'delta' is 3
tf.range(start, limit, delta) ==> [3, 6, 9, 12, 15]
```
##### Args:
* `start`: A 0-D `Tensor` (scalar) of type `int32`. First entry in sequence.
* `limit`: A 0-D `Tensor` (scalar) of type `int32`. Upper limit of sequence,
exclusive.
* `delta`: A 0-D `Tensor` (scalar) of type `int32`. Optional. Default is 1.
Number that increments `start`.
* `name`: A name for the operation (optional).
##### Returns:
A 1-D `int32` `Tensor`.
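
Python's built-in `range` follows the same half-open convention (start included, limit excluded), so it can serve as a quick mental model for the values this op produces. The snippet uses only the Python built-in, not TensorFlow.

```python
start, limit, delta = 3, 18, 3

# `limit` itself is never part of the sequence.
values = list(range(start, limit, delta))
print(values)  # [3, 6, 9, 12, 15]
```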
## Random Tensors
TensorFlow has several ops that create random tensors with different
distributions. The random ops are stateful, and create new random values each
time they are evaluated.
The `seed` keyword argument in these functions acts in conjunction with
the graph-level random seed. Changing either the graph-level seed using
[`set_random_seed`](../../api_docs/python/constant_op.md#set_random_seed) or the
op-level seed will change the underlying seed of these operations. Setting
neither the graph-level nor the op-level seed results in a random seed for all
operations.
See [`set_random_seed`](../../api_docs/python/constant_op.md#set_random_seed)
for details on the interaction between operation-level and graph-level random
seeds.
### Examples:
```python
# Create a tensor of shape [2, 3] consisting of random normal values, with mean
# -1 and standard deviation 4.
norm = tf.random_normal([2, 3], mean=-1, stddev=4)
# Shuffle the first dimension of a tensor
c = tf.constant([[1, 2], [3, 4], [5, 6]])
shuff = tf.random_shuffle(c)
# Each time we run these ops, different results are generated
sess = tf.Session()
print(sess.run(norm))
print(sess.run(norm))
# Set an op-level seed to generate repeatable sequences across sessions.
sess = tf.Session()
norm = tf.random_normal([2, 3], seed=1234)
print(sess.run(norm))
print(sess.run(norm))
```
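
As a loose analogy for "repeatable sequences across sessions", Python's standard `random` module behaves similarly: a generator created with a fixed seed yields the same sequence every time it is recreated, while successive draws within one run still differ. `draw_sequence` is a hypothetical helper, and the analogy does not describe how TensorFlow derives its seeds.

```python
import random


def draw_sequence(seed, count=2):
    """Stand-in for one 'session': a fresh generator with a fixed seed."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(count)]


first_run = draw_sequence(1234)
second_run = draw_sequence(1234)

# Same seed, same sequence; but the draws inside a run are different values.
print(first_run == second_run)
print(first_run[0] != first_run[1])
```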
Another common use of random values is the initialization of variables. Also see
the [Variables How To](../../how_tos/variables/index.md).
```python
# Use random uniform values in [0, 1) as the initializer for a variable of shape
# [2, 3]. The default type is float32.
var = tf.Variable(tf.random_uniform([2, 3]), name="var")
init = tf.initialize_all_variables()
sess = tf.Session()
sess.run(init)
print(sess.run(var))
```
- - -
### `tf.random_normal(shape, mean=0.0, stddev=1.0, dtype=tf.float32, seed=None, name=None)`
Outputs random values from a normal distribution.
##### Args:
* `shape`: A 1-D integer Tensor or Python array. The shape of the output tensor.
* `mean`: A 0-D Tensor or Python value of type `dtype`. The mean of the normal
distribution.
* `stddev`: A 0-D Tensor or Python value of type `dtype`. The standard deviation
of the normal distribution.
* `dtype`: The type of the output.
* `seed`: A Python integer. Used to create a random seed for the distribution.
See
[`set_random_seed`](../../api_docs/python/constant_op.md#set_random_seed)
for behavior.
* `name`: A name for the operation (optional).
##### Returns:
A tensor of the specified shape filled with random normal values.
- - -
### `tf.truncated_normal(shape, mean=0.0, stddev=1.0, dtype=tf.float32, seed=None, name=None)`
Outputs random values from a truncated normal distribution.
The generated values follow a normal distribution with specified mean and
standard deviation, except that values whose magnitude is more than 2 standard
deviations from the mean are dropped and re-picked.
##### Args:
* `shape`: A 1-D integer Tensor or Python array. The shape of the output tensor.
* `mean`: A 0-D Tensor or Python value of type `dtype`. The mean of the
truncated normal distribution.
* `stddev`: A 0-D Tensor or Python value of type `dtype`. The standard deviation
of the truncated normal distribution.
* `dtype`: The type of the output.
* `seed`: A Python integer. Used to create a random seed for the distribution.
See
[`set_random_seed`](../../api_docs/python/constant_op.md#set_random_seed)
for behavior.
* `name`: A name for the operation (optional).
##### Returns:
A tensor of the specified shape filled with random truncated normal values.
- - -
### `tf.random_uniform(shape, minval=0.0, maxval=1.0, dtype=tf.float32, seed=None, name=None)`
Outputs random values from a uniform distribution.
The generated values follow a uniform distribution in the range
`[minval, maxval)`. The lower bound `minval` is included in the range, while
the upper bound `maxval` is excluded.
##### Args:
* `shape`: A 1-D integer Tensor or Python array. The shape of the output tensor.
* `minval`: A 0-D Tensor or Python value of type `dtype`. The lower bound on the
range of random values to generate.
* `maxval`: A 0-D Tensor or Python value of type `dtype`. The upper bound on
the range of random values to generate.
* `dtype`: The type of the output.
* `seed`: A Python integer. Used to create a random seed for the distribution.
See
[`set_random_seed`](../../api_docs/python/constant_op.md#set_random_seed)
for behavior.
* `name`: A name for the operation (optional).
##### Returns:
A tensor of the specified shape filled with random uniform values.
- - -
### `tf.random_shuffle(value, seed=None, name=None)`
Randomly shuffles a tensor along its first dimension.
The tensor is shuffled along dimension 0, such that each `value[j]` is mapped
to one and only one `output[i]`. For example, a mapping that might occur for a
3x2 tensor is:
```python
[[1, 2],       [[5, 6],
 [3, 4],  ==>   [1, 2],
 [5, 6]]        [3, 4]]
```
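As a rough illustration of "shuffled along dimension 0", the following pure-Python sketch permutes the outermost entries of a nested list while leaving each row intact. `shuffle_first_dim` is a hypothetical helper, not a TensorFlow function, and it uses Python's `random` module.

```python
import random


def shuffle_first_dim(value, rng=None):
    """Return a copy of `value` with its outermost entries permuted.

    Hypothetical helper for illustration only; not part of TensorFlow.
    """
    rng = rng or random.Random()
    rows = list(value)   # copy, so the input is left untouched
    rng.shuffle(rows)    # permute the rows; each row's contents stay intact
    return rows


value = [[1, 2], [3, 4], [5, 6]]
output = shuffle_first_dim(value, random.Random(0))
```

Each row of `value` appears exactly once in `output`; only the order of the rows may change.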
##### Args:
* `value`: A Tensor to be shuffled.
* `seed`: A Python integer. Used to create a random seed for the distribution.
See
[`set_random_seed`](../../api_docs/python/constant_op.md#set_random_seed)
for behavior.
* `name`: A name for the operation (optional).
##### Returns:
A tensor of same shape and type as `value`, shuffled along its first
dimension.
- - -
### `tf.set_random_seed(seed)`
Sets the graph-level random seed.
Operations that rely on a random seed actually derive it from two seeds:
the graph-level and operation-level seeds. This sets the graph-level seed.
Its interactions with operation-level seeds are as follows:
1. If neither the graph-level nor the operation seed is set:
A random seed is used for this op.
2. If the graph-level seed is set, but the operation seed is not:
The system deterministically picks an operation seed in conjunction
with the graph-level seed so that it gets a unique random sequence.
3. If the graph-level seed is not set, but the operation seed is set:
A default graph-level seed and the specified operation seed are used to
determine the random sequence.
4. If both the graph-level and the operation seed are set:
Both seeds are used in conjunction to determine the random sequence.
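The four cases can be summarized in a short pure-Python sketch. All names here are hypothetical: `resolve_seeds` is not a TensorFlow function, `DEFAULT_GRAPH_SEED` is a placeholder value rather than the real default, and `op_counter` merely stands in for whatever deterministic per-op value the system picks in case 2.

```python
# Placeholder: the actual default graph-level seed is not stated here.
DEFAULT_GRAPH_SEED = 0


def resolve_seeds(graph_seed, op_seed, op_counter):
    """Return the (graph_seed, op_seed) pair an op would use.

    Hypothetical helper mirroring the four documented cases.
    """
    if graph_seed is None and op_seed is None:
        return None, None                    # case 1: a random seed is used
    if graph_seed is not None and op_seed is None:
        return graph_seed, op_counter        # case 2: op seed is derived
    if graph_seed is None:
        return DEFAULT_GRAPH_SEED, op_seed   # case 3: default graph seed
    return graph_seed, op_seed               # case 4: both seeds as given


case_1 = resolve_seeds(None, None, op_counter=7)
case_2 = resolve_seeds(1234, None, op_counter=7)
case_3 = resolve_seeds(None, 1, op_counter=7)
case_4 = resolve_seeds(1234, 1, op_counter=7)
```

Only case 1 leaves the sequence unrepeatable; in the other three cases the pair of seeds is fully determined, which is why the sequences repeat across sessions.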
To illustrate the user-visible effects, consider these examples:
To generate different sequences across sessions, set neither
graph-level nor op-level seeds:
```python
a = tf.random_uniform([1])
b = tf.random_normal([1])
print "Session 1"
with tf.Session() as sess1:
print sess1.run(a) # generates 'A1'
print sess1.run(a) # generates 'A2'
print sess1.run(b) # generates 'B1'
print sess1.run(b) # generates 'B2'
print "Session 2"
with tf.Session() as sess2:
print sess2.run(a) # generates 'A3'
print sess2.run(a) # generates 'A4'
print sess2.run(b) # generates 'B3'
print sess2.run(b) # generates 'B4'
```
To generate the same repeatable sequence for an op across sessions, set the
seed for the op:
```python
a = tf.random_uniform([1], seed=1)
b = tf.random_normal([1])
# Repeatedly running this block with the same graph will generate the same
# sequence of values for 'a', but different sequences of values for 'b'.
print "Session 1"
with tf.Session() as sess1:
print sess1.run(a) # generates 'A1'
print sess1.run(a) # generates 'A2'
print sess1.run(b) # generates 'B1'
print sess1.run(b) # generates 'B2'
print "Session 2"
with tf.Session() as sess2:
print sess2.run(a) # generates 'A1'
print sess2.run(a) # generates 'A2'
print sess2.run(b) # generates 'B3'
print sess2.run(b) # generates 'B4'
```
To make the random sequences generated by all ops be repeatable across
sessions, set a graph-level seed:
```python
tf.set_random_seed(1234)
a = tf.random_uniform([1])
b = tf.random_normal([1])
# Repeatedly running this block with the same graph will generate the same
# sequences of 'a' and 'b'.
print "Session 1"
with tf.Session() as sess1:
print sess1.run(a) # generates 'A1'
print sess1.run(a) # generates 'A2'
print sess1.run(b) # generates 'B1'
print sess1.run(b) # generates 'B2'
print "Session 2"
with tf.Session() as sess2:
print sess2.run(a) # generates 'A1'
print sess2.run(a) # generates 'A2'
print sess2.run(b) # generates 'B1'
print sess2.run(b) # generates 'B2'
```
##### Args:
* `seed`: integer.
================================================
FILE: SOURCE/api_docs/python/control_flow_ops.md
================================================
# Control Flow
Note: Functions taking `Tensor` arguments can also take anything accepted by
[`tf.convert_to_tensor`](../../api_docs/python/framework.md#convert_to_tensor).
## Contents
### [Control Flow](#AUTOGENERATED-control-flow)
* [Control Flow Operations](#AUTOGENERATED-control-flow-operations)
* [`tf.identity(input, name=None)`](#identity)
* [`tf.tuple(tensors, name=None, control_inputs=None)`](#tuple)
* [`tf.group(*inputs, **kwargs)`](#group)
* [`tf.no_op(name=None)`](#no_op)
* [`tf.count_up_to(ref, limit, name=None)`](#count_up_to)
* [Logical Operators](#AUTOGENERATED-logical-operators)
* [`tf.logical_and(x, y, name=None)`](#logical_and)
* [`tf.logical_not(x, name=None)`](#logical_not)
* [`tf.logical_or(x, y, name=None)`](#logical_or)
* [`tf.logical_xor(x, y, name='LogicalXor')`](#logical_xor)
* [Comparison Operators](#AUTOGENERATED-comparison-operators)
* [`tf.equal(x, y, name=None)`](#equal)
* [`tf.not_equal(x, y, name=None)`](#not_equal)
* [`tf.less(x, y, name=None)`](#less)
* [`tf.less_equal(x, y, name=None)`](#less_equal)
* [`tf.greater(x, y, name=None)`](#greater)
* [`tf.greater_equal(x, y, name=None)`](#greater_equal)
* [`tf.select(condition, t, e, name=None)`](#select)
* [`tf.where(input, name=None)`](#where)
* [Debugging Operations](#AUTOGENERATED-debugging-operations)
* [`tf.is_finite(x, name=None)`](#is_finite)
* [`tf.is_inf(x, name=None)`](#is_inf)
* [`tf.is_nan(x, name=None)`](#is_nan)
* [`tf.verify_tensor_all_finite(t, msg, name=None)`](#verify_tensor_all_finite)
* [`tf.check_numerics(tensor, message, name=None)`](#check_numerics)
* [`tf.add_check_numerics_ops()`](#add_check_numerics_ops)
* [`tf.Assert(condition, data, summarize=None, name=None)`](#Assert)
* [`tf.Print(input_, data, message=None, first_n=None, summarize=None, name=None)`](#Print)
## Control Flow Operations
TensorFlow provides several operations and classes that you can use to control
the execution of operations and add conditional dependencies to your graph.
- - -
### `tf.identity(input, name=None)`
Return a tensor with the same shape and contents as the input tensor or value.
##### Args:
* `input`: A `Tensor`.
* `name`: A name for the operation (optional).
##### Returns:
A `Tensor`. Has the same type as `input`.
- - -
### `tf.tuple(tensors, name=None, control_inputs=None)`
Group tensors together.
This creates a tuple of tensors with the same values as the `tensors`
argument, except that the value of each tensor is only returned after the
values of all tensors have been computed.
`control_inputs` contains additional ops that have to finish before this op
finishes, but whose outputs are not returned.
This can be used as a "join" mechanism for parallel computations: all the
argument tensors can be computed in parallel, but the values of any tensor
returned by `tuple` are only available after all the parallel computations
are done.
See also `group` and `with_dependencies`.
##### Args:
* `tensors`: A list of `Tensor`s or `IndexedSlices`; some entries can be `None`.
* `name`: (optional) A name to use as a `name_scope` for the operation.
* `control_inputs`: List of additional ops to finish before returning.
##### Returns:
Same as `tensors`.
##### Raises:
* `ValueError`: If `tensors` does not contain any `Tensor` or `IndexedSlices`.
- - -
### `tf.group(*inputs, **kwargs)`
Create an op that groups multiple operations.
When this op finishes, all ops in `inputs` have finished. This op has no
output.
See also `tuple` and `with_dependencies`.
##### Args:
* `*inputs`: One or more tensors to group.
* `**kwargs`: Optional parameters to pass when constructing the NodeDef.
* `name`: A name for this operation (optional).
##### Returns:
An Operation that executes all its inputs.
##### Raises:
* `ValueError`: If an unknown keyword argument is provided, or if there are
no inputs.
- - -
### `tf.no_op(name=None)`
Does nothing. Only useful as a placeholder for control edges.
##### Args:
* `name`: A name for the operation (optional).
##### Returns:
The created Operation.
- - -
### `tf.count_up_to(ref, limit, name=None)`
Increments 'ref' until it reaches 'limit'.
This operation outputs the value that `ref` held before the increment. If
nothing else modifies `ref`, successive runs therefore produce distinct values.
##### Args:
* `ref`: A mutable `Tensor`. Must be one of the following types: `int32`, `int64`.
Should be from a scalar `Variable` node.
* `limit`: An `int`.
If incrementing ref would bring it above limit, instead generates an
'OutOfRange' error.
* `name`: A name for the operation (optional).
##### Returns:
A `Tensor`. Has the same type as `ref`.
A copy of the input before increment. If nothing else modifies the
input, the values produced will all be distinct.
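The counting behavior can be sketched in plain Python. This is an assumption-laden illustration, not the TensorFlow kernel: `Counter` and `OutOfRange` are hypothetical stand-ins for a scalar integer `Variable` and the 'OutOfRange' error.

```python
class OutOfRange(Exception):
    """Hypothetical stand-in for the 'OutOfRange' error."""


class Counter(object):
    """Hypothetical stand-in for a scalar integer `Variable`."""

    def __init__(self, value=0):
        self.value = value


def count_up_to(ref, limit):
    """Increment `ref` and return its value from before the increment."""
    if ref.value >= limit:
        # Incrementing would bring the counter above `limit`.
        raise OutOfRange("counter reached limit %d" % limit)
    before = ref.value
    ref.value += 1
    return before


counter = Counter(0)
produced = [count_up_to(counter, 3) for _ in range(3)]
```

With a limit of 3 the calls produce 0, 1 and 2; a fourth call raises `OutOfRange`. This pattern is useful for bounding the number of iterations of a loop.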
## Logical Operators
TensorFlow provides several operations that you can use to add logical operators
to your graph.
- - -
### `tf.logical_and(x, y, name=None)`
Returns the truth value of x AND y element-wise.
##### Args:
* `x`: A `Tensor` of type `bool`.
* `y`: A `Tensor` of type `bool`.
* `name`: A name for the operation (optional).
##### Returns:
A `Tensor` of type `bool`.
- - -
### `tf.logical_not(x, name=None)`
Returns the truth value of NOT x element-wise.
##### Args:
* `x`: A `Tensor` of type `bool`.
* `name`: A name for the operation (optional).
##### Returns:
A `Tensor` of type `bool`.
- - -
### `tf.logical_or(x, y, name=None)`
Returns the truth value of x OR y element-wise.
##### Args:
* `x`: A `Tensor` of type `bool`.
* `y`: A `Tensor` of type `bool`.
* `name`: A name for the operation (optional).
##### Returns:
A `Tensor` of type `bool`.
- - -
### `tf.logical_xor(x, y, name='LogicalXor')`
x ^ y = (x | y) & ~(x & y).
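The identity above can be checked over the full truth table with a small pure-Python sketch. The `logical_xor` defined here is a hypothetical list-based helper, not the TensorFlow op.

```python
def logical_xor(x, y):
    """Element-wise XOR of two equally sized lists of bools (illustrative)."""
    # (x | y) & ~(x & y), written with Python's boolean operators.
    return [(a or b) and not (a and b) for a, b in zip(x, y)]


x = [True, True, False, False]
y = [True, False, True, False]
result = logical_xor(x, y)
```

`result` is true exactly where the two inputs differ.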
## Comparison Operators
TensorFlow provides several operations that you can use to add comparison
operators to your graph.
- - -
### `tf.equal(x, y, name=None)`
Returns the truth value of (x == y) element-wise.
##### Args:
* `x`: A `Tensor`. Must be one of the following types: `float32`, `float64`, `int32`, `int64`, `complex64`, `quint8`, `qint8`, `qint32`.
* `y`: A `Tensor`. Must have the same type as `x`.
* `name`: A name for the operation (optional).
##### Returns:
A `Tensor` of type `bool`.
- - -
### `tf.not_equal(x, y, name=None)`
Returns the truth value of (x != y) element-wise.
##### Args:
* `x`: A `Tensor`. Must be one of the following types: `float32`, `float64`, `int32`, `int64`, `complex64`, `quint8`, `qint8`, `qint32`.
* `y`: A `Tensor`. Must have the same type as `x`.
* `name`: A name for the operation (optional).
##### Returns:
A `Tensor` of type `bool`.
- - -
### `tf.less(x, y, name=None)`
Returns the truth value of (x < y) element-wise.
##### Args:
* `x`: A `Tensor`. Must be one of the following types: `float32`, `float64`, `int32`, `int64`.
* `y`: A `Tensor`. Must have the same type as `x`.
* `name`: A name for the operation (optional).
##### Returns:
A `Tensor` of type `bool`.
- - -
### `tf.less_equal(x, y, name=None)`
Returns the truth value of (x <= y) element-wise.
##### Args:
* `x`: A `Tensor`. Must be one of the following types: `float32`, `float64`, `int32`, `int64`.
* `y`: A `Tensor`. Must have the same type as `x`.
* `name`: A name for the operation (optional).
##### Returns:
A `Tensor` of type `bool`.
- - -
### `tf.greater(x, y, name=None)`
Returns the truth value of (x > y) element-wise.
##### Args:
* `x`: A `Tensor`. Must be one of the following types: `float32`, `float64`, `int32`, `int64`.
* `y`: A `Tensor`. Must have the same type as `x`.
* `name`: A name for the operation (optional).
##### Returns:
A `Tensor` of type `bool`.
- - -
### `tf.greater_equal(x, y, name=None)`
Returns the truth value of (x >= y) element-wise.
##### Args:
* `x`: A `Tensor`. Must be one of the following types: `float32`, `float64`, `int32`, `int64`.
* `y`: A `Tensor`. Must have the same type as `x`.
* `name`: A name for the operation (optional).
##### Returns:
A `Tensor` of type `bool`.
- - -
### `tf.select(condition, t, e, name=None)`
Selects elements from `t` or `e`, depending on `condition`.
The `condition`, `t`, and `e` tensors must all have the same shape,
and the output will also have that shape. The `condition` tensor acts
as an element-wise mask that chooses, based on the value at each
element, whether the corresponding element in the output should be
taken from `t` (if true) or `e` (if false). For example:
```prettyprint
# 'condition' tensor is [[True, False]
# [True, False]]
# 't' is [[1, 1],
# [1, 1]]
# 'e' is [[2, 2],
# [2, 2]]
select(condition, t, e) ==> [[1, 2],
                             [1, 2]]
```
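The element-wise masking can be mimicked on nested Python lists. The `select` below is a hypothetical pure-Python helper written only to illustrate the semantics; it assumes all three arguments have the same nested shape.

```python
def select(condition, t, e):
    """Element-wise choice between `t` and `e` for nested lists of equal shape."""
    if isinstance(condition, list):
        # Recurse into each dimension in lockstep.
        return [select(c, a, b) for c, a, b in zip(condition, t, e)]
    return t if condition else e


condition = [[True, False], [True, False]]
t = [[1, 1], [1, 1]]
e = [[2, 2], [2, 2]]
result = select(condition, t, e)
```

Here `result` reproduces the output of the example above.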
##### Args:
* `condition`: A `Tensor` of type `bool`.
* `t`: A `Tensor` with the same shape as `condition`.
* `e`: A `Tensor` with the same type and shape as `t`.
* `name`: A name for the operation (optional).
##### Returns:
A `Tensor` with the same type and shape as `t` and `e`.
- - -
### `tf.where(input, name=None)`
Returns locations of true values in a boolean tensor.
This operation returns the coordinates of true elements in `input`. The
coordinates are returned in a 2-D tensor where the first dimension (rows)
represents the number of true elements, and the second dimension (columns)
represents the coordinates of the true elements. Keep in mind, the shape of
the output tensor can vary depending on how many true values there are in
`input`. Indices are output in row-major order.
For example:
```prettyprint
# 'input' tensor is [[True, False]
# [True, False]]
# 'input' has two true values, so output has two coordinates.
# 'input' has rank of 2, so coordinates have two indices.
where(input) ==> [[0, 0],
                  [1, 0]]
# `input` tensor is [[[True, False]
# [True, False]]
# [[False, True]
# [False, True]]
# [[False, False]
# [False, True]]]
# 'input' has 5 true values, so output has 5 coordinates.
# 'input' has rank of 3, so coordinates have three indices.
where(input) ==> [[0, 0, 0],
                  [0, 1, 0],
                  [1, 0, 1],
                  [1, 1, 1],
                  [2, 1, 1]]
```
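The coordinate listing can be reproduced on nested Python lists with a short recursive walk. The `where` below is a hypothetical pure-Python helper, shown only to make the row-major ordering concrete.

```python
def where(values, prefix=()):
    """Return coordinates of true entries in a nested list, in row-major order."""
    coords = []
    for i, item in enumerate(values):
        index = prefix + (i,)
        if isinstance(item, list):
            coords.extend(where(item, index))   # descend one dimension
        elif item:
            coords.append(list(index))          # record a true element
    return coords


coords_2d = where([[True, False],
                   [True, False]])
coords_3d = where([[[True, False], [True, False]],
                   [[False, True], [False, True]],
                   [[False, False], [False, True]]])
```

The number of rows in each result equals the number of true values, and the number of columns equals the rank of the input.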
##### Args:
* `input`: A `Tensor` of type `bool`.
* `name`: A name for the operation (optional).
##### Returns:
A `Tensor` of type `int64`.
## Debugging Operations
TensorFlow provides several operations that you can use to validate values and
debug your graph.
- - -
### `tf.is_finite(x, name=None)`
Returns which elements of x are finite.
##### Args:
* `x`: A `Tensor`. Must be one of the following types: `float32`, `float64`.
* `name`: A name for the operation (optional).
##### Returns:
A `Tensor` of type `bool`.
- - -
### `tf.is_inf(x, name=None)`
Returns which elements of x are Inf.
##### Args:
* `x`: A `Tensor`. Must be one of the following types: `float32`, `float64`.
* `name`: A name for the operation (optional).
##### Returns:
A `Tensor` of type `bool`.
- - -
### `tf.is_nan(x, name=None)`
Returns which elements of x are NaN.
##### Args:
* `x`: A `Tensor`. Must be one of the following types: `float32`, `float64`.
* `name`: A name for the operation (optional).
##### Returns:
A `Tensor` of type `bool`.
- - -
### `tf.verify_tensor_all_finite(t, msg, name=None)`
Assert that the tensor does not contain any NaN's or Inf's.
##### Args:
* `t`: Tensor to check.
* `msg`: Message to log on failure.
* `name`: A name for this operation (optional).
##### Returns:
Same tensor as `t`.
- - -
### `tf.check_numerics(tensor, message, name=None)`
Checks a tensor for NaN and Inf values.
When run, reports an `InvalidArgument` error if `tensor` has any values
that are not a number (NaN) or infinity (Inf). Otherwise, passes `tensor` as-is.
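A minimal pure-Python sketch of this check follows. It is an illustration under assumptions: the `check_numerics` defined here is a hypothetical helper operating on a flat list of floats, `ValueError` stands in for the `InvalidArgument` error, and the error text is made up.

```python
import math


def check_numerics(values, message):
    """Return `values` unchanged, or raise if any entry is NaN or Inf.

    Hypothetical helper; ValueError stands in for `InvalidArgument`.
    """
    for v in values:
        if math.isnan(v) or math.isinf(v):
            # `message` is used as the prefix of the error text.
            raise ValueError("%s: found NaN or Inf" % message)
    return values


clean = check_numerics([1.0, 2.5, -3.0], "weights")
```

Passing a list that contains `float("nan")` or `float("inf")` raises instead of returning.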
##### Args:
* `tensor`: A `Tensor`. Must be one of the following types: `float32`, `float64`.
* `message`: A `string`. Prefix of the error message.
* `name`: A name for the operation (optional).
##### Returns:
A `Tensor`. Has the same type as `tensor`.
- - -
### `tf.add_check_numerics_ops()`
Connect a check_numerics to every floating point tensor.
`check_numerics` operations themselves are added for each `float` or `double`
tensor in the graph. For all ops in the graph, the `check_numerics` op for
all of its (`float` or `double`) inputs is guaranteed to run before the
`check_numerics` op on any of its outputs.
##### Returns:
A `group` op depending on all `check_numerics` ops added.
- - -
### `tf.Assert(condition, data, summarize=None, name=None)`
Asserts that the given condition is true.
If `condition` evaluates to false, print the list of tensors in `data`.
`summarize` determines how many entries of the tensors to print.
##### Args:
* `condition`: The condition to evaluate.
* `data`: The tensors to print out when condition is false.
* `summarize`: Print this many entries of each tensor.
* `name`: A name for this operation (optional).
- - -
### `tf.Print(input_, data, message=None, first_n=None, summarize=None, name=None)`
Prints a list of tensors.
This is an identity op with the side effect of printing `data` when
evaluating.
##### Args:
* `input_`: A tensor passed through this op.
* `data`: A list of tensors to print out when op is evaluated.
* `message`: A string, prefix of the printed message.
* `first_n`: Only log `first_n` number of times. Negative numbers log always;
this is the default.
* `summarize`: Only print this many entries of each tensor.
* `name`: A name for the operation (optional).
##### Returns:
Same tensor as `input_`.
================================================
FILE: SOURCE/api_docs/python/framework.md
================================================
# Building Graphs
## Contents
### [Building Graphs](#AUTOGENERATED-building-graphs)
* [Core graph data structures](#AUTOGENERATED-core-graph-data-structures)
* [`class tf.Graph`](#Graph)
* [`class tf.Operation`](#Operation)
* [`class tf.Tensor`](#Tensor)
* [Tensor types](#AUTOGENERATED-tensor-types)
* [`class tf.DType`](#DType)
* [`tf.as_dtype(type_value)`](#as_dtype)
* [Utility functions](#AUTOGENERATED-utility-functions)
* [`tf.device(dev)`](#device)
* [`tf.name_scope(name)`](#name_scope)
* [`tf.control_dependencies(control_inputs)`](#control_dependencies)
* [`tf.convert_to_tensor(value, dtype=None, name=None)`](#convert_to_tensor)
* [`tf.get_default_graph()`](#get_default_graph)
* [`tf.import_graph_def(graph_def, input_map=None, return_elements=None, name=None, op_dict=None)`](#import_graph_def)
* [Graph collections](#AUTOGENERATED-graph-collections)
* [`tf.add_to_collection(name, value)`](#add_to_collection)
* [`tf.get_collection(key, scope=None)`](#get_collection)
* [`class tf.GraphKeys`](#GraphKeys)
* [Defining new operations](#AUTOGENERATED-defining-new-operations)
* [`class tf.RegisterGradient`](#RegisterGradient)
* [`tf.NoGradient(op_type)`](#NoGradient)
* [`class tf.RegisterShape`](#RegisterShape)
* [`class tf.TensorShape`](#TensorShape)
* [`class tf.Dimension`](#Dimension)
* [`tf.op_scope(values, name, default_name)`](#op_scope)
* [`tf.get_seed(op_seed)`](#get_seed)
Classes and functions for building TensorFlow graphs.
## Core graph data structures
- - -
### `class tf.Graph`
A TensorFlow computation, represented as a dataflow graph.
A `Graph` contains a set of
[`Operation`](../../api_docs/python/framework.md#Operation) objects,
which represent units of computation; and
[`Tensor`](../../api_docs/python/framework.md#Tensor) objects, which represent
the units of data that flow between operations.
A default `Graph` is always registered, and accessible by calling
[`tf.get_default_graph()`](../../api_docs/python/framework.md#get_default_graph).
To add an operation to the default graph, simply call one of the functions
that defines a new `Operation`:
```python
c = tf.constant(4.0)
assert c.graph is tf.get_default_graph()
```
Another typical usage involves the
[`Graph.as_default()`](../../api_docs/python/framework.md#Graph.as_default)
context manager, which overrides the current default graph for the
lifetime of the context:
```python
g = tf.Graph()
with g.as_default():
# Define operations and tensors in `g`.
c = tf.constant(30.0)
assert c.graph is g
```
Important note: This class *is not* thread-safe for graph construction. All
operations should be created from a single thread, or external
synchronization must be provided. Unless otherwise specified, all methods
are not thread-safe.
- - -
#### `tf.Graph.__init__()`
Creates a new, empty Graph.
- - -
#### `tf.Graph.as_default()`
Returns a context manager that makes this `Graph` the default graph.
This method should be used if you want to create multiple graphs
in the same process. For convenience, a global default graph is
provided, and all ops will be added to this graph if you do not
create a new graph explicitly. Use this method with the `with` keyword
to specify that ops created within the scope of a block should be
added to this graph.
The default graph is a property of the current thread. If you
create a new thread, and wish to use the default graph in that
thread, you must explicitly add a `with g.as_default():` in that
thread's function.
The following code examples are equivalent:
```python
# 1. Using Graph.as_default():
g = tf.Graph()
with g.as_default():
c = tf.constant(5.0)
assert c.graph is g
# 2. Constructing and making default:
with tf.Graph().as_default() as g:
c = tf.constant(5.0)
assert c.graph is g
```
##### Returns:
A context manager for using this graph as the default graph.
- - -
#### `tf.Graph.as_graph_def(from_version=None)`
Returns a serialized `GraphDef` representation of this graph.
The serialized `GraphDef` can be imported into another `Graph`
(using [`import_graph_def()`](#import_graph_def)) or used with the
[C++ Session API](../../api_docs/cc/index.md).
This method is thread-safe.
##### Args:
* `from_version`: Optional. If this is set, returns a `GraphDef`
containing only the nodes that were added to this graph since
its `version` property had the given value.
##### Returns:
A [`GraphDef`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/framework/graph.proto)
protocol buffer.
- - -
#### `tf.Graph.finalize()`
Finalizes this graph, making it read-only.
After calling `g.finalize()`, no new operations can be added to
`g`. This method is used to ensure that no operations are added
to a graph when it is shared between multiple threads, for example
when using a [`QueueRunner`](../../api_docs/python/train.md#QueueRunner).
- - -
#### `tf.Graph.finalized`
True if this graph has been finalized.
- - -
#### `tf.Graph.control_dependencies(control_inputs)`
Returns a context manager that specifies control dependencies.
Use with the `with` keyword to specify that all operations constructed
within the context should have control dependencies on
`control_inputs`. For example:
```python
with g.control_dependencies([a, b, c]):
# `d` and `e` will only run after `a`, `b`, and `c` have executed.
d = ...
e = ...
```
Multiple calls to `control_dependencies()` can be nested, and in
that case a new `Operation` will have control dependencies on the union
of `control_inputs` from all active contexts.
```python
with g.control_dependencies([a, b]):
# Ops declared here run after `a` and `b`.
with g.control_dependencies([c, d]):
# Ops declared here run after `a`, `b`, `c`, and `d`.
```
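The "union of `control_inputs` from all active contexts" rule can be modeled with a simple stack. The sketch below is pure Python and entirely hypothetical: `DependencyTracker` is not a TensorFlow class, and plain strings stand in for operations.

```python
import contextlib


class DependencyTracker(object):
    """Hypothetical stand-in for a graph's stack of control-dependency contexts."""

    def __init__(self):
        self._stack = []

    @contextlib.contextmanager
    def control_dependencies(self, control_inputs):
        self._stack.append(list(control_inputs))
        try:
            yield
        finally:
            self._stack.pop()   # leaving the block removes its inputs

    def current_control_inputs(self):
        """Union of the inputs from all active contexts, outermost first."""
        seen = []
        for inputs in self._stack:
            for item in inputs:
                if item not in seen:
                    seen.append(item)
        return seen


g = DependencyTracker()
with g.control_dependencies(["a", "b"]):
    outer = g.current_control_inputs()
    with g.control_dependencies(["c", "d"]):
        inner = g.current_control_inputs()
after = g.current_control_inputs()
```

An op created in the inner block would depend on `a`, `b`, `c` and `d`, one created in the outer block only on `a` and `b`, and one created after both blocks on none of them.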
*N.B.* The control dependencies context applies *only* to ops that
are constructed within the context. Merely using an op or tensor
in the context does not add a control dependency. The following
example illustrates this point:
```python
# WRONG
def my_func(pred, tensor):
t = tf.matmul(tensor, tensor)
with tf.control_dependencies([pred]):
# The matmul op is created outside the context, so no control
# dependency will be added.
return t
# RIGHT
def my_func(pred, tensor):
with tf.control_dependencies([pred]):
# The matmul op is created in the context, so a control dependency
# will be added.
return tf.matmul(tensor, tensor)
```
##### Args:
* `control_inputs`: A list of `Operation` or `Tensor` objects, which
must be executed or computed before running the operations
defined in the context.
##### Returns:
A context manager that specifies control dependencies for all
operations constructed within the context.
##### Raises:
* `TypeError`: If `control_inputs` is not a list of `Operation` or
`Tensor` objects.
- - -
#### `tf.Graph.device(device_name_or_function)`
Returns a context manager that specifies the default device to use.
The `device_name_or_function` argument may either be a device name
string, a device function, or None:
* If it is a device name string, all operations constructed in
this context will be assigned to the device with that name.
* If it is a function, it will be treated as a function from
Operation objects to device name strings, and invoked each time
a new Operation is created. The Operation will be assigned to
the device with the returned name.
* If it is None, the default device will be cleared.
For example:
```python
with g.device('/gpu:0'):
# All operations constructed in this context will be placed
# on GPU 0.
with g.device(None):
# All operations constructed in this context will have no
# assigned device.
# Defines a function from `Operation` to device string.
def matmul_on_gpu(n):
if n.type == "MatMul":
return "/gpu:0"
else:
return "/cpu:0"
with g.device(matmul_on_gpu):
# All operations of type "MatMul" constructed in this context
# will be placed on GPU 0; all other operations will be placed
# on CPU 0.
```
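To see what a device function returns for different operation types, the following pure-Python sketch calls `matmul_on_gpu` on stand-in objects. `FakeOp` is hypothetical and models only the `type` attribute that the function reads.

```python
import collections

# Hypothetical stand-in for an Operation; only `type` is modeled.
FakeOp = collections.namedtuple("FakeOp", ["type"])


def matmul_on_gpu(n):
    """Device function from the example: MatMul on GPU 0, the rest on CPU 0."""
    if n.type == "MatMul":
        return "/gpu:0"
    else:
        return "/cpu:0"


ops = [FakeOp("MatMul"), FakeOp("Add"), FakeOp("Const")]
placements = [matmul_on_gpu(op) for op in ops]
```

Only the `MatMul` stand-in maps to `/gpu:0`; the others map to `/cpu:0`.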
##### Args:
* `device_name_or_function`: The device name or function to use in
the context.
##### Returns:
A context manager that specifies the default device to use for newly
created ops.
- - -
#### `tf.Graph.name_scope(name)`
Returns a context manager that creates hierarchical names for operations.
A graph maintains a stack of name scopes. A `with name_scope(...):`
statement pushes a new name onto the stack for the lifetime of the context.
The `name` argument will be interpreted as follows:
* A string (not ending with '/') will create a new name scope, in which
`name` is appended to the prefix of all operations created in the
context. If `name` has been used before, it will be made unique by
calling `self.unique_name(name)`.
* A scope previously captured from a `with g.name_scope(...) as
scope:` statement will be treated as an "absolute" name scope, which
makes it possible to re-enter existing scopes.
* A value of `None` or the empty string will reset the current name scope
to the top-level (empty) name scope.
For example:
```python
with tf.Graph().as_default() as g:
    c = tf.constant(5.0, name="c")
    assert c.op.name == "c"
    c_1 = tf.constant(6.0, name="c")
    assert c_1.op.name == "c_1"
    # Creates a scope called "nested".
    with g.name_scope("nested") as scope:
        nested_c = tf.constant(10.0, name="c")
        assert nested_c.op.name == "nested/c"
        # Creates a nested scope called "inner".
        with g.name_scope("inner"):
            nested_inner_c = tf.constant(20.0, name="c")
            assert nested_inner_c.op.name == "nested/inner/c"
        # Creates a nested scope called "inner_1".
        with g.name_scope("inner"):
            nested_inner_1_c = tf.constant(30.0, name="c")
            assert nested_inner_1_c.op.name == "nested/inner_1/c"
            # Treats `scope` as an absolute name scope, and
            # switches to the "nested/" scope.
            with g.name_scope(scope):
                nested_d = tf.constant(40.0, name="d")
                assert nested_d.op.name == "nested/d"
                # The empty string resets to the top-level scope.
                with g.name_scope(""):
                    e = tf.constant(50.0, name="e")
                    assert e.op.name == "e"
```
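The `_1` suffixes in the example come from making repeated names unique. A rough pure-Python sketch of that idea is shown below; `NameRegistry` is hypothetical, and the real `unique_name` may handle more cases (for instance, a collision with a name that already ends in a numeric suffix).

```python
class NameRegistry(object):
    """Hypothetical sketch of how repeated names could be made unique."""

    def __init__(self):
        self._names_in_use = {}

    def unique_name(self, name):
        count = self._names_in_use.get(name, 0)
        self._names_in_use[name] = count + 1
        if count == 0:
            return name                      # first use: keep the name as is
        return "%s_%d" % (name, count)       # later uses: append a counter


registry = NameRegistry()
names = [registry.unique_name(n)
         for n in ["c", "c", "nested/inner", "nested/inner", "c"]]
```

The first use of a name is returned unchanged, and each later use gets the next numeric suffix.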
The name of the scope itself can be captured by `with
g.name_scope(...) as scope:`, which stores the name of the scope
in the variable `scope`. This value can be used to name an
operation that represents the overall result of executing the ops
in a scope. For example:
```python
inputs = tf.constant(...)
with g.name_scope('my_layer') as scope:
weights = tf.Variable(..., name="weights")
biases = tf.Variable(..., name="biases")
affine = tf.matmul(inputs, weights) + biases
output = tf.nn.relu(affine, name=scope)
```
##### Args:
* `name`: A name for the scope.
##### Returns:
A context manager that installs `name` as a new name scope.
A `Graph` instance supports an arbitrary number of "collections"
that are identified by name. For convenience when building a large
graph, collections can store groups of related objects: for
example, the `tf.Variable` class uses a collection (named
[`tf.GraphKeys.VARIABLES`](../../api_docs/python/framework.md#GraphKeys)) for
all variables that are created during the construction of a graph. The caller
may define additional collections by specifying a new name.
- - -
#### `tf.Graph.add_to_collection(name, value)`
Stores `value` in the collection with the given `name`.
##### Args:
* `name`: The key for the collection. For example, the `GraphKeys` class
contains many standard names for collections.
* `value`: The value to add to the collection.
- - -
#### `tf.Graph.get_collection(name, scope=None)`
Returns a list of values in the collection with the given `name`.
##### Args:
* `name`: The key for the collection. For example, the `GraphKeys` class
contains many standard names for collections.
* `scope`: (Optional.) If supplied, the resulting list is filtered to include
only items whose name begins with this string.
##### Returns:
The list of values in the collection with the given `name`, or
an empty list if no value has been added to that collection. The
list contains the values in the order under which they were
collected.
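The behavior of named collections, including the `scope` prefix filter, can be sketched in plain Python. Everything below is hypothetical: `CollectionStore` and `Item` are illustration-only stand-ins, and the filter assumes each stored value has a `name` attribute.

```python
import collections

# Hypothetical named value; real collections usually hold variables or ops.
Item = collections.namedtuple("Item", ["name"])


class CollectionStore(object):
    """Hypothetical sketch of named collections with scope filtering."""

    def __init__(self):
        self._collections = {}

    def add_to_collection(self, name, value):
        self._collections.setdefault(name, []).append(value)

    def get_collection(self, name, scope=None):
        values = self._collections.get(name, [])   # empty list if unknown
        if scope is None:
            return list(values)
        # Keep only the items whose name begins with `scope`.
        return [v for v in values if v.name.startswith(scope)]


store = CollectionStore()
for item_name in ["layer1/weights", "layer1/biases", "layer2/weights"]:
    store.add_to_collection("variables", Item(item_name))

all_items = store.get_collection("variables")
layer1_items = store.get_collection("variables", scope="layer1")
missing = store.get_collection("no_such_collection")
```

Values come back in the order they were added, and asking for a collection that was never populated yields an empty list.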
- - -
#### `tf.Graph.as_graph_element(obj, allow_tensor=True, allow_operation=True)`
Returns the object referred to by `obj`, as an `Operation` or `Tensor`.
This function validates that `obj` represents an element of this
graph, and gives an informative error message if it is not.
This function is the canonical way to get/validate an object of
one of the allowed types from an external argument reference in the
Session API.
This method may be called concurrently from multiple threads.
##### Args:
* `obj`: A `Tensor`, an `Operation`, or the name of a tensor or operation.
Can also be any object with an `_as_graph_element()` method that returns
a value of one of these types.
* `allow_tensor`: If true, `obj` may refer to a `Tensor`.
* `allow_operation`: If true, `obj` may refer to an `Operation`.
##### Returns:
The `Tensor` or `Operation` in the Graph corresponding to `obj`.
##### Raises:
* `TypeError`: If `obj` is not of a type that can be converted to a `Tensor`
or `Operation`.
* `ValueError`: If `obj` is of an appropriate type but invalid. For
example, an invalid string.
* `KeyError`: If `obj` is not an object in the graph.
- - -
#### `tf.Graph.get_operation_by_name(name)`
Returns the `Operation` with the given `name`.
This method may be called concurrently from multiple threads.
##### Args:
* `name`: The name of the `Operation` to return.
##### Returns:
The `Operation` with the given `name`.
##### Raises:
* `TypeError`: If `name` is not a string.
* `KeyError`: If `name` does not correspond to an operation in this graph.
- - -
#### `tf.Graph.get_tensor_by_name(name)`
Returns the `Tensor` with the given `name`.
This method may be called concurrently from multiple threads.
##### Args:
* `name`: The name of the `Tensor` to return.
##### Returns:
The `Tensor` with the given `name`.
##### Raises:
* `TypeError`: If `name` is not a string.
* `KeyError`: If `name` does not correspond to a tensor in this graph.
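Tensor names follow the pattern `<op_name>:<output_index>`, so the string passed to `get_tensor_by_name()` identifies both the producing operation and one of its outputs (for example `"MatMul:0"`), whereas `get_operation_by_name()` takes the bare operation name. The following is a minimal plain-Python sketch of that naming convention; `split_tensor_name` is a hypothetical helper for illustration and is not part of the TensorFlow API.

```python
def split_tensor_name(tensor_name):
    """Split "<op_name>:<output_index>" into (op_name, output_index).

    Hypothetical helper: it only models the naming convention and does
    not look anything up in a graph.
    """
    op_name, sep, index = tensor_name.rpartition(":")
    if not sep or not op_name or not index.isdigit():
        raise ValueError(
            "expected '<op_name>:<output_index>', got %r" % (tensor_name,))
    return op_name, int(index)


# The first output of the operation named "scope/MatMul".
op_name, output_index = split_tensor_name("scope/MatMul:0")
```

A bare operation name such as `"MatMul"` is rejected by this sketch because it carries no output index.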
- - -
#### `tf.Graph.get_operations()`
Return the list of operations in the graph.
You can modify the operations in place, but modifications
to the list such as insertions or deletions have no effect on the
list of operations known to the graph.
This method may be called concurrently from multiple threads.
##### Returns:
A list of Operations.
- - -
#### `tf.Graph.get_default_device()`
Returns the default device.
##### Returns:
A string.
- - -
#### `tf.Graph.seed`
- - -
#### `tf.Graph.unique_name(name)`
Return a unique Operation name for "name".
Note: You rarely need to call unique_name() directly. Most of the time you
just need to create "with g.name_scope()" blocks to generate structured
names.
`unique_name` is used to generate structured names, separated by "/",
to help identify Operations when debugging a Graph. Operation names
are displayed in error messages reported by the TensorFlow runtime,
and in various visualization tools such as TensorBoard.
##### Args:
* `name`: The name for an `Operation`.
##### Returns:
A string to be passed to `create_op()` that will be used
to name the operation being created.
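As a rough illustration of how repeated requests for the same name can be made unique, the sketch below keeps a set of names already handed out and appends a numeric suffix on collision. The class `NameUniquifier` and the exact `_1`, `_2` suffix scheme are assumptions made for this example; it models the idea only and is not the `Graph` implementation.

```python
class NameUniquifier(object):
    """Hypothetical stand-in for the name bookkeeping inside a graph."""

    def __init__(self):
        self._names_in_use = set()

    def unique_name(self, name):
        # Try the requested name first, then name_1, name_2, ...
        candidate = name
        suffix = 0
        while candidate in self._names_in_use:
            suffix += 1
            candidate = "%s_%d" % (name, suffix)
        self._names_in_use.add(candidate)
        return candidate


names = NameUniquifier()
first = names.unique_name("layer/weights")
second = names.unique_name("layer/weights")
third = names.unique_name("layer/weights")
```

Because names are structured with "/", the same base name under a different scope (for example `"other/weights"`) does not collide.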
- - -
#### `tf.Graph.version`
Returns a version number that increases as ops are added to the graph.
- - -
#### `tf.Graph.create_op(op_type, inputs, dtypes, input_types=None, name=None, attrs=None, op_def=None, compute_shapes=True)`
Creates an `Operation` in this graph.
This is a low-level interface for creating an `Operation`. Most
programs will not call this method directly, and instead use the
Python op constructors, such as `tf.constant()`, which add ops to
the default graph.
##### Args:
* `op_type`: The `Operation` type to create. This corresponds to the
`OpDef.name` field for the proto that defines the operation.
* `inputs`: A list of `Tensor` objects that will be inputs to the `Operation`.
* `dtypes`: A list of `DType` objects that will be the types of the tensors
that the operation produces.
* `input_types`: (Optional.) A list of `DType`s that will be the types of
the tensors that the operation consumes. By default, uses the base
`DType` of each input in `inputs`. Operations that expect
reference-typed inputs must specify `input_types` explicitly.
* `name`: (Optional.) A string name for the operation. If not specified, a
name is generated based on `op_type`.
* `attrs`: (Optional.) A list of `AttrValue` protos for the `attr` field of
the `NodeDef` proto that will represent the operation.
* `op_def`: (Optional.) The `OpDef` proto that describes the `op_type` that
the operation will have.
* `compute_shapes`: (Optional.) If True, shape inference will be performed
to compute the shapes of the outputs.
##### Raises:
* `TypeError`: if any of the inputs is not a `Tensor`.
##### Returns:
An `Operation` object.
- - -
#### `tf.Graph.gradient_override_map(op_type_map)`
EXPERIMENTAL: A context manager for overriding gradient functions.
This context manager can be used to override the gradient function
that will be used for ops within the scope of the context.
For example:
```python
@tf.RegisterGradient("CustomSquare")
def _custom_square_grad(op, grad):
  # ...

with tf.Graph().as_default() as g:
  c = tf.constant(5.0)
  s_1 = tf.square(c)  # Uses the default gradient for tf.square.
  with g.gradient_override_map({"Square": "CustomSquare"}):
    s_2 = tf.square(c)  # Uses _custom_square_grad to compute the
                        # gradient of s_2.
```
##### Args:
* `op_type_map`: A dictionary mapping op type strings to alternative op
type strings.
##### Returns:
A context manager that sets the alternative op type to be used for one
or more ops created in that context.
##### Raises:
* `TypeError`: If `op_type_map` is not a dictionary mapping strings to
strings.
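To make the scoping behaviour concrete, here is a small plain-Python model of an override map that applies only to ops created inside the `with` block. `OverrideStack` and `gradient_type_for_new_op` are hypothetical names invented for this sketch, and the stack-of-dictionaries design is an assumption; the real `Graph` bookkeeping may differ.

```python
import contextlib


class OverrideStack(object):
    """Hypothetical model of scoped gradient overrides."""

    def __init__(self):
        self._maps = []

    @contextlib.contextmanager
    def gradient_override_map(self, op_type_map):
        valid = isinstance(op_type_map, dict) and all(
            isinstance(k, str) and isinstance(v, str)
            for k, v in op_type_map.items())
        if not valid:
            raise TypeError(
                "op_type_map must be a dictionary mapping strings to strings")
        self._maps.append(dict(op_type_map))
        try:
            yield
        finally:
            # Leaving the context restores the previous behaviour.
            self._maps.pop()

    def gradient_type_for_new_op(self, op_type):
        # The innermost active map wins; otherwise the op's own type is used.
        for mapping in reversed(self._maps):
            if op_type in mapping:
                return mapping[op_type]
        return op_type


g = OverrideStack()
before = g.gradient_type_for_new_op("Square")
with g.gradient_override_map({"Square": "CustomSquare"}):
    inside = g.gradient_type_for_new_op("Square")
after = g.gradient_type_for_new_op("Square")
```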
- - -
### `class tf.Operation`
Represents a graph node that performs computation on tensors.
An `Operation` is a node in a TensorFlow `Graph` that takes zero or
more `Tensor` objects as input, and produces zero or more `Tensor`
objects as output. Objects of type `Operation` are created by
calling a Python op constructor (such as
[`tf.matmul()`](../../api_docs/python/math_ops.md#matmul))
or [`Graph.create_op()`](../../api_docs/python/framework.md#Graph.create_op).
For example `c = tf.matmul(a, b)` creates an `Operation` of type
"MatMul" that takes tensors `a` and `b` as input, and produces `c`
as output.
After the graph has been launched in a session, an `Operation` can
be executed by passing it to
[`Session.run()`](../../api_docs/python/client.md#Session.run).
`op.run()` is a shortcut for calling `tf.get_default_session().run(op)`.
- - -
#### `tf.Operation.name`
The full name of this operation.
- - -
#### `tf.Operation.type`
The type of the op (e.g. `"MatMul"`).
- - -
#### `tf.Operation.inputs`
The list of `Tensor` objects representing the data inputs of this op.
- - -
#### `tf.Operation.control_inputs`
The `Operation` objects on which this op has a control dependency.
Before this op is executed, TensorFlow will ensure that the
operations in `self.control_inputs` have finished executing. This
mechanism can be used to run ops sequentially for performance
reasons, or to ensure that the side effects of an op are observed
in the correct order.
##### Returns:
A list of `Operation` objects.
- - -
#### `tf.Operation.outputs`
The list of `Tensor` objects representing the outputs of this op.
- - -
#### `tf.Operation.device`
The name of the device to which this op has been assigned, if any.
##### Returns:
The string name of the device to which this op has been
assigned, or None if it has not been assigned to a device.
- - -
#### `tf.Operation.graph`
The `Graph` that contains this operation.
- - -
#### `tf.Operation.run(feed_dict=None, session=None)`
Runs this operation in a `Session`.
Calling this method will execute all preceding operations that
produce the inputs needed for this operation.
*N.B.* Before invoking `Operation.run()`, its graph must have been
launched in a session, and either a default session must be
available, or `session` must be specified explicitly.
##### Args:
* `feed_dict`: A dictionary that maps `Tensor` objects to feed values.
See [`Session.run()`](../../api_docs/python/client.md#Session.run)
for a description of the valid feed values.
* `session`: (Optional.) The `Session` to be used to run this operation. If
none, the default session will be used.
- - -
#### `tf.Operation.get_attr(name)`
Returns the value of the attr of this op with the given `name`.
##### Args:
* `name`: The name of the attr to fetch.
##### Returns:
The value of the attr, as a Python object.
##### Raises:
* `ValueError`: If this op does not have an attr with the given `name`.
- - -
#### `tf.Operation.traceback`
Returns the call stack from when this operation was constructed.
#### Other Methods
- - -
#### `tf.Operation.__init__(node_def, g, inputs=None, output_types=None, control_inputs=None, input_types=None, original_op=None, op_def=None)`
Creates an `Operation`.
NOTE: This constructor validates the name of the Operation (passed
as "node_def.name"). Valid Operation names match the following
regular expression:
`[A-Za-z0-9.][A-Za-z0-9_.\-/]*`
##### Args:
* `node_def`: graph_pb2.NodeDef. NodeDef for the Operation.
Used for attributes of graph_pb2.NodeDef, typically "name",
"op", and "device". The "input" attribute is irrelevant here
as it will be computed when generating the model.
* `g`: Graph. The parent graph.
* `inputs`: list of Tensor objects. The inputs to this Operation.
* `output_types`: list of types_pb2.DataType. List of the types of the
Tensors computed by this operation. The length of this list indicates
the number of output endpoints of the Operation.
* `control_inputs`: list of operations or tensors from which to have a
control dependency.
* `input_types`: List of types_pb2.DataType representing the
types of the Tensors accepted by the Operation. By default
uses [x.dtype.base_dtype for x in inputs]. Operations that expect
reference-typed inputs must specify these explicitly.
* `original_op`: Optional. Used to associate the new Operation with an
existing Operation (for example, a replica with the op that was
replicated).
* `op_def`: Optional. The op_def_pb2.OpDef proto that describes the
op type that this Operation represents.
##### Raises:
* `TypeError`: if control inputs are not Operations or Tensors,
or if node_def is not a NodeDef,
or if g is not a Graph,
or if inputs are not Tensors,
or if inputs and input_types are incompatible.
* `ValueError`: if the node_def name is not valid.
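The name rule quoted in the note above can be checked directly with Python's `re` module. This is a minimal sketch that applies the same regular expression to whole strings; `is_valid_op_name` is a hypothetical helper, not a TensorFlow function.

```python
import re

# The pattern from the constructor note, applied to the entire name.
_VALID_OP_NAME = re.compile(r"[A-Za-z0-9.][A-Za-z0-9_.\-/]*")


def is_valid_op_name(name):
    """Return True if `name` matches the documented Operation name pattern."""
    return _VALID_OP_NAME.fullmatch(name) is not None


checks = {
    "layer1/MatMul": is_valid_op_name("layer1/MatMul"),
    "MatMul_1": is_valid_op_name("MatMul_1"),
    "_hidden": is_valid_op_name("_hidden"),    # leading underscore
    "my op": is_valid_op_name("my op"),        # contains a space
}
```

Note that the first character may not be an underscore, a hyphen, or a slash, although all three are allowed in later positions.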
- - -
#### `tf.Operation.node_def`
Returns a serialized `NodeDef` representation of this operation.
##### Returns:
A
[`NodeDef`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/framework/graph.proto)
protocol buffer.
- - -
#### `tf.Operation.op_def`
Returns the `OpDef` proto that represents the type of this op.
##### Returns:
An
[`OpDef`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/framework/op_def.proto)
protocol buffer.
- - -
#### `tf.Operation.values()`
DEPRECATED: Use outputs.
- - -
### `class tf.Tensor`
Represents a value produced by an `Operation`.
A `Tensor` is a symbolic handle to one of the outputs of an
`Operation`. It does not hold the values of that operation's output,
but instead provides a means of computing those values in a
TensorFlow [`Session`](../../api_docs/python/client.md#Session).
This class has two primary purposes:
1. A `Tensor` can be passed as an input to another `Operation`.
This builds a dataflow connection between operations, which
enables TensorFlow to execute an entire `Graph` that represents a
large, multi-step computation.
2. After the graph has been launched in a session, the value of the
`Tensor` can be computed by passing it to
[`Session.run()`](../../api_docs/python/client.md#Session.run).
`t.eval()` is a shortcut for calling
`tf.get_default_session().run(t)`.
In the following example, `c`, `d`, and `e` are symbolic `Tensor`
objects, whereas `result` is a numpy array that stores a concrete
value:
```python
# Build a dataflow graph.
c = tf.constant([[1.0, 2.0], [3.0, 4.0]])
d = tf.constant([[1.0, 1.0], [0.0, 1.0]])
e = tf.matmul(c, d)
# Construct a `Session` to execute the graph.
sess = tf.Session()
# Execute the graph and store the value that `e` represents in `result`.
result = sess.run(e)
```
- - -
#### `tf.Tensor.dtype`
The `DType` of elements in this tensor.
- - -
#### `tf.Tensor.name`
The string name of this tensor.
- - -
#### `tf.Tensor.value_index`
The index of this tensor in the outputs of its `Operation`.
- - -
#### `tf.Tensor.graph`
The `Graph` that contains this tensor.
- - -
#### `tf.Tensor.op`
The `Operation` that produces this tensor as an output.
- - -
#### `tf.Tensor.consumers()`
Returns a list of `Operation`s that consume this tensor.
##### Returns:
A list of `Operation`s.
- - -
#### `tf.Tensor.eval(feed_dict=None, session=None)`
Evaluates this tensor in a `Session`.
Calling this method will execute all preceding operations that
produce the inputs needed for the operation that produces this
tensor.
*N.B.* Before invoking `Tensor.eval()`, its graph must have been
launched in a session, and either a default session must be
available, or `session` must be specified explicitly.
##### Args:
* `feed_dict`: A dictionary that maps `Tensor` objects to feed values.
See [`Session.run()`](../../api_docs/python/client.md#Session.run) for a
description of the valid feed values.
* `session`: (Optional.) The `Session` to be used to evaluate this tensor. If
none, the default session will be used.
##### Returns:
A numpy array corresponding to the value of this tensor.
- - -
#### `tf.Tensor.get_shape()`
Returns the `TensorShape` that represents the shape of this tensor.
The shape is computed using shape inference functions that are
registered for each `Operation` type using `tf.RegisterShape`.
See [`TensorShape`](../../api_docs/python/framework.md#TensorShape) for more
details of what a shape represents.
The inferred shape of a tensor is used to provide shape
information without having to launch the graph in a session. This
can be used for debugging, and providing early error messages. For
example:
```python
c = tf.constant([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
print c.get_shape()
==> TensorShape([Dimension(2), Dimension(3)])
d = tf.constant([[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]])
print d.get_shape()
==> TensorShape([Dimension(4), Dimension(2)])
# Raises a ValueError, because `c` and `d` do not have compatible
# inner dimensions.
e = tf.matmul(c, d)
f = tf.matmul(c, d, transpose_a=True, transpose_b=True)
print f.get_shape()
==> TensorShape([Dimension(3), Dimension(4)])
```
In some cases, the inferred shape may have unknown dimensions. If
the caller has additional information about the values of these
dimensions, `Tensor.set_shape()` can be used to augment the
inferred shape.
##### Returns:
A `TensorShape` representing the shape of this tensor.
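The shape arithmetic in the `tf.matmul` example above can be reproduced without launching a session. The sketch below is a plain-Python model of shape inference for a 2-D matrix product, with `None` standing for an unknown dimension. `matmul_output_shape` is a hypothetical helper written for illustration and is not the shape function TensorFlow registers for `MatMul`.

```python
def matmul_output_shape(a_shape, b_shape, transpose_a=False, transpose_b=False):
    """Infer the [rows, cols] shape of a 2-D matrix product.

    Shapes are two-element lists; None marks an unknown dimension.
    """
    a_rows, a_cols = a_shape[::-1] if transpose_a else a_shape
    b_rows, b_cols = b_shape[::-1] if transpose_b else b_shape
    # Inner dimensions must agree whenever both are known.
    if a_cols is not None and b_rows is not None and a_cols != b_rows:
        raise ValueError(
            "Incompatible inner dimensions: %r and %r" % (a_cols, b_rows))
    return [a_rows, b_cols]


c_shape = [2, 3]
d_shape = [4, 2]
f_shape = matmul_output_shape(c_shape, d_shape,
                              transpose_a=True, transpose_b=True)
```

Calling `matmul_output_shape(c_shape, d_shape)` without the transposes raises a `ValueError`, mirroring the early error described above.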
- - -
#### `tf.Tensor.set_shape(shape)`
Updates the shape of this tensor.
This method can be called multiple times, and will merge the given
`shape` with the current shape of this tensor. It can be used to
provide additional information about the shape of this tensor that
cannot be inferred from the graph alone. For example, this can be used
to provide additional information about the shapes of images:
```python
_, image_data = tf.TFRecordReader(...).read(...)
image = tf.image.decode_png(image_data, channels=3)
# The height and width dimensions of `image` are data dependent, and
# cannot be computed without executing the op.
print image.get_shape()
==> TensorShape([Dimension(None), Dimension(None), Dimension(3)])
# We know that each image in this dataset is 28 x 28 pixels.
image.set_shape([28, 28, 3])
print image.get_shape()
==> TensorShape([Dimension(28), Dimension(28), Dimension(3)])
```
##### Args:
* `shape`: A `TensorShape` representing the shape of this tensor.
##### Raises:
* `ValueError`: If `shape` is not compatible with the current shape of
this tensor.
#### Other Methods
- - -
#### `tf.Tensor.__init__(op, value_index, dtype)`
Creates a new `Tensor`.
##### Args:
* `op`: An `Operation`. `Operation` that computes this tensor.
* `value_index`: An `int`. Index of the operation's endpoint that produces
this tensor.
* `dtype`: A `types.DType`. Type of data stored in this tensor.
##### Raises:
* `TypeError`: If the op is not an `Operation`.
- - -
#### `tf.Tensor.device`
The name of the device on which this tensor will be produced, or None.
## Tensor types
- - -
### `class tf.DType`
Represents the type of the elements in a `Tensor`.
The following `DType` objects are defined:
* `tf.float32`: 32-bit single-precision floating-point.
* `tf.float64`: 64-bit double-precision floating-point.
* `tf.bfloat16`: 16-bit truncated floating-point.
* `tf.complex64`: 64-bit single-precision complex.
* `tf.int8`: 8-bit signed integer.
* `tf.uint8`: 8-bit unsigned integer.
* `tf.int32`: 32-bit signed integer.
* `tf.int64`: 64-bit signed integer.
* `tf.bool`: Boolean.
* `tf.string`: String.
* `tf.qint8`: Quantized 8-bit signed integer.
* `tf.quint8`: Quantized 8-bit unsigned integer.
* `tf.qint32`: Quantized 32-bit signed integer.
In addition, variants of these types with the `_ref` suffix are
defined for reference-typed tensors.
The `tf.as_dtype()` function converts numpy types and string type
names to a `DType` object.
- - -
#### `tf.DType.is_compatible_with(other)`
Returns True if the `other` DType will be converted to this DType.
The conversion rules are as follows:
```
DType(T) .is_compatible_with(DType(T)) == True
DType(T) .is_compatible_with(DType(T).as_ref) == True
DType(T).as_ref.is_compatible_with(DType(T)) == False
DType(T).as_ref.is_compatible_with(DType(T).as_ref) == True
```
##### Args:
* `other`: A `DType` (or object that may be converted to a `DType`).
##### Returns:
True if a Tensor of the `other` `DType` will be implicitly converted to
this `DType`.
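The four conversion rules above reduce to one check: a reference-typed `DType` only accepts another reference type with the same base type. Here is a minimal plain-Python model of that rule. `SketchDType` is a hypothetical stand-in for `tf.DType`, and the assumption that different base types are never compatible is stated in the code.

```python
import collections

# Hypothetical stand-in for tf.DType: a base type name plus a reference flag.
SketchDType = collections.namedtuple("SketchDType", ["base", "is_ref"])


def is_compatible_with(this, other):
    """Model of DType.is_compatible_with for the four documented cases."""
    # Assumption: different base types (e.g. float32 vs int32) never match.
    if this.base != other.base:
        return False
    # A reference-typed `this` requires `other` to be a reference too.
    return other.is_ref or not this.is_ref


float32 = SketchDType("float32", is_ref=False)
float32_ref = SketchDType("float32", is_ref=True)

table = [
    is_compatible_with(float32, float32),
    is_compatible_with(float32, float32_ref),
    is_compatible_with(float32_ref, float32),
    is_compatible_with(float32_ref, float32_ref),
]
```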
- - -
#### `tf.DType.name`
Returns the string name for this `DType`.
- - -
#### `tf.DType.base_dtype`
Returns a non-reference `DType` based on this `DType`.
- - -
#### `tf.DType.is_ref_dtype`
Returns `True` if this `DType` represents a reference type.
- - -
#### `tf.DType.as_ref`
Returns a reference `DType` based on this `DType`.
- - -
#### `tf.DType.is_integer`
Returns whether this is a (non-quantized) integer type.
- - -
#### `tf.DType.is_quantized`
Returns whether this is a quantized data type.
- - -
#### `tf.DType.as_numpy_dtype`
Returns a `numpy.dtype` based on this `DType`.
- - -
#### `tf.DType.as_datatype_enum`
Returns a `types_pb2.DataType` enum value based on this `DType`.
#### Other Methods
- - -
#### `tf.DType.__init__(type_enum)`
Creates a new `DType`.
NOTE: In normal circumstances, you should not need to
construct a `DType` object directly. Instead, use the
`tf.as_dtype()` function.
##### Args:
* `type_enum`: A `types_pb2.DataType` enum value.
##### Raises:
* `TypeError`: If `type_enum` is not a valid `types_pb2.DataType` value.
- - -
#### `tf.DType.max`
Returns the maximum representable value in this data type.
##### Raises:
* `TypeError`: if this is a non-numeric, unordered, or quantized type.
- - -
#### `tf.DType.min`
Returns the minimum representable value in this data type.
##### Raises:
* `TypeError`: if this is a non-numeric, unordered, or quantized type.
- - -
### `tf.as_dtype(type_value)`
Converts the given `type_value` to a `DType`.
##### Args:
* `type_value`: A value that can be converted to a `tf.DType`
object. This may currently be a `tf.DType` object, a
[`DataType` enum](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/framework/types.proto),
a string type name, or a `numpy.dtype`.
##### Returns:
A `DType` corresponding to `type_value`.
##### Raises:
* `TypeError`: If `type_value` cannot be converted to a `DType`.
## Utility functions
- - -
### `tf.device(dev)`
Wrapper for `Graph.device()` using the default graph.
See
[`Graph.device()`](../../api_docs/python/framework.md#Graph.device)
for more details.
##### Args:
* `dev`: The device name or function to use in
the context.
##### Returns:
A context manager that specifies the default device to use for newly
created ops.
- - -
### `tf.name_scope(name)`
Wrapper for `Graph.name_scope()` using the default graph.
See
[`Graph.name_scope()`](../../api_docs/python/framework.md#Graph.name_scope)
for more details.
##### Args:
* `name`: A name for the scope.
##### Returns:
A context manager that installs `name` as a new name scope in the
default graph.
- - -
### `tf.control_dependencies(control_inputs)`
Wrapper for `Graph.control_dependencies()` using the default graph.
See [`Graph.control_dependencies()`](../../api_docs/python/framework.md#Graph.control_dependencies)
for more details.
##### Args:
* `control_inputs`: A list of `Operation` or `Tensor` objects, which
must be executed or computed before running the operations
defined in the context.
##### Returns:
A context manager that specifies control dependencies for all
operations constructed within the context.
- - -
### `tf.convert_to_tensor(value, dtype=None, name=None)`
Converts the given `value` to a `Tensor`.
This function converts Python objects of various types to `Tensor`
objects. It accepts `Tensor` objects, numpy arrays, Python lists,
and Python scalars. For example:
```python
import numpy as np
array = np.random.rand(32, 100, 100)

def my_func(arg):
  arg = tf.convert_to_tensor(arg, dtype=tf.float32)
  return tf.matmul(arg, arg) + arg

# The following calls are equivalent.
value_1 = my_func(tf.constant([[1.0, 2.0], [3.0, 4.0]]))
value_2 = my_func([[1.0, 2.0], [3.0, 4.0]])
value_3 = my_func(np.array([[1.0, 2.0], [3.0, 4.0]], dtype=np.float32))
```
This function can be useful when composing a new operation in Python
(such as `my_func` in the example above). All standard Python op
constructors apply this function to each of their Tensor-valued
inputs, which allows those ops to accept numpy arrays, Python lists,
and scalars in addition to `Tensor` objects.
##### Args:
* `value`: An object whose type has a registered `Tensor` conversion function.
* `dtype`: Optional element type for the returned tensor. If missing, the
type is inferred from the type of `value`.
* `name`: Optional name to use if a new `Tensor` is created.
##### Returns:
A `Tensor` based on `value`.
##### Raises:
* `TypeError`: If no conversion function is registered for `value`.
* `RuntimeError`: If a registered conversion function returns an invalid value.
- - -
### `tf.get_default_graph()`
Returns the default graph for the current thread.
The returned graph will be the innermost graph on which a
`Graph.as_default()` context has been entered, or a global default
graph if none has been explicitly created.
*N.B.* The default graph is a property of the current thread. If you
create a new thread, and wish to use the default graph in that
thread, you must explicitly add a `with g.as_default():` in that
thread's function.
##### Returns:
The default `Graph` being used in the current thread.
- - -
### `tf.import_graph_def(graph_def, input_map=None, return_elements=None, name=None, op_dict=None)`
Imports the TensorFlow graph in `graph_def` into the Python `Graph`.
This function provides a way to import a serialized TensorFlow
[`GraphDef`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/framework/graph.proto)
protocol buffer, and extract individual objects in the `GraphDef` as
[`Tensor`](#Tensor) and [`Operation`](#Operation) objects. See
[`Graph.as_graph_def()`](#Graph.as_graph_def) for a way to create a
`GraphDef` proto.
##### Args:
* `graph_def`: A `GraphDef` proto containing operations to be imported into
the default graph.
* `input_map`: A dictionary mapping input names (as strings) in `graph_def`
to `Tensor` objects. The values of the named input tensors in the
imported graph will be re-mapped to the respective `Tensor` values.
* `return_elements`: A list of strings containing operation names in
`graph_def` that will be returned as `Operation` objects; and/or
tensor names in `graph_def` that will be returned as `Tensor` objects.
* `name`: (Optional.) A prefix that will be prepended to the names in
`graph_def`. Defaults to `"import"`.
* `op_dict`: (Optional.) A dictionary mapping op type names to `OpDef` protos.
Must contain an `OpDef` proto for each op type named in `graph_def`.
If omitted, uses the `OpDef` protos registered in the global registry.
##### Returns:
A list of `Operation` and/or `Tensor` objects from the imported graph,
corresponding to the names in `return_elements`.
##### Raises:
* `TypeError`: If `graph_def` is not a `GraphDef` proto,
`input_map` is not a dictionary mapping strings to `Tensor` objects,
or `return_elements` is not a list of strings.
* `ValueError`: If `input_map` or `return_elements` contains names that
do not appear in `graph_def`, or `graph_def` is not well-formed (e.g.
it refers to an unknown tensor).
## Graph collections
- - -
### `tf.add_to_collection(name, value)`
Wrapper for `Graph.add_to_collection()` using the default graph.
See [`Graph.add_to_collection()`](../../api_docs/python/framework.md#Graph.add_to_collection)
for more details.
##### Args:
* `name`: The key for the collection. For example, the `GraphKeys` class
contains many standard names for collections.
* `value`: The value to add to the collection.
- - -
### `tf.get_collection(key, scope=None)`
Wrapper for `Graph.get_collection()` using the default graph.
See [`Graph.get_collection()`](../../api_docs/python/framework.md#Graph.get_collection)
for more details.
##### Args:
* `key`: The key for the collection. For example, the `GraphKeys` class
contains many standard names for collections.
* `scope`: (Optional.) If supplied, the resulting list is filtered to include
only items whose name begins with this string.
##### Returns:
The list of values in the collection with the given `key`, or
an empty list if no value has been added to that collection. The
list contains the values in the order in which they were
collected.
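A collection behaves like an ordered list stored under a key, with `scope` acting as a prefix filter on each item's `name`. The following plain-Python sketch models that behaviour; `CollectionStore` and `Item` are hypothetical names, and the choice to skip items that have no `name` attribute when a scope is given is an assumption of this sketch.

```python
import collections


class CollectionStore(object):
    """Hypothetical model of the collections kept by a graph."""

    def __init__(self):
        self._collections = {}

    def add_to_collection(self, name, value):
        self._collections.setdefault(name, []).append(value)

    def get_collection(self, key, scope=None):
        values = self._collections.get(key, [])
        if scope is None:
            return list(values)
        # Keep only items whose name begins with the given scope string.
        return [v for v in values
                if hasattr(v, "name") and v.name.startswith(scope)]


Item = collections.namedtuple("Item", ["name"])  # stand-in for a Variable

store = CollectionStore()
store.add_to_collection("variables", Item("layer1/weights"))
store.add_to_collection("variables", Item("layer2/weights"))

everything = store.get_collection("variables")
only_layer2 = store.get_collection("variables", scope="layer2")
missing = store.get_collection("summaries")
```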
- - -
### `class tf.GraphKeys`
Standard names to use for graph collections.
The standard library uses various well-known names to collect and
retrieve values associated with a graph. For example, the
`tf.Optimizer` subclasses default to optimizing the variables
collected under `tf.GraphKeys.TRAINABLE_VARIABLES` if none is
specified, but it is also possible to pass an explicit list of
variables.
The following standard keys are defined:
* `VARIABLES`: the `Variable` objects that comprise a model, and
must be saved and restored together. See
[`tf.all_variables()`](../../api_docs/python/state_ops.md#all_variables)
for more details.
* `TRAINABLE_VARIABLES`: the subset of `Variable` objects that will
be trained by an optimizer. See
[`tf.trainable_variables()`](../../api_docs/python/state_ops.md#trainable_variables)
for more details.
* `SUMMARIES`: the summary `Tensor` objects that have been created in the
graph. See
[`tf.merge_all_summaries()`](../../api_docs/python/train.md#merge_all_summaries)
for more details.
* `QUEUE_RUNNERS`: the `QueueRunner` objects that are used to
produce input for a computation. See
[`tf.start_queue_runners()`](../../api_docs/python/train.md#start_queue_runners)
for more details.
## Defining new operations
- - -
### `class tf.RegisterGradient`
A decorator for registering the gradient function for an op type.
This decorator is only used when defining a new op type. For an op
with `m` inputs and `n` outputs, the gradient function is a function
that takes the original `Operation` and `n` `Tensor` objects
(representing the gradients with respect to each output of the op),
and returns `m` `Tensor` objects (representing the partial gradients
with respect to each input of the op).
For example, assuming that operations of type `"Sub"` take two
inputs `x` and `y`, and return a single output `x - y`, the
following gradient function would be registered:
```python
@tf.RegisterGradient("Sub")
def _sub_grad(unused_op, grad):
  return grad, tf.neg(grad)
```
The decorator argument `op_type` is the string type of an
operation. This corresponds to the `OpDef.name` field for the proto
that defines the operation.
- - -
#### `tf.RegisterGradient.__init__(op_type)`
Creates a new decorator with `op_type` as the Operation type.
##### Args:
* `op_type`: The string type of an operation. This corresponds to the
`OpDef.name` field for the proto that defines the operation.
- - -
### `tf.NoGradient(op_type)`
Specifies that ops of type `op_type` do not have a defined gradient.
This function is only used when defining a new op type. It may be
used for ops such as `tf.size()` that are not differentiable. For
example:
```python
tf.NoGradient("Size")
```
##### Args:
* `op_type`: The string type of an operation. This corresponds to the
`OpDef.name` field for the proto that defines the operation.
##### Raises:
* `TypeError`: If `op_type` is not a string.
- - -
### `class tf.RegisterShape`
A decorator for registering the shape function for an op type.
This decorator is only used when defining a new op type. A shape
function is a function from an `Operation` object to a list of
`TensorShape` objects, with one `TensorShape` for each output of the
operation.
For example, assuming that operations of type `"Sub"` take two
inputs `x` and `y`, and return a single output `x - y`, all with the
same shape, the following shape function would be registered:
```python
@tf.RegisterShape("Sub")
def _sub_shape(op):
  return [op.inputs[0].get_shape().merge_with(op.inputs[1].get_shape())]
```
The decorator argument `op_type` is the string type of an
operation. This corresponds to the `OpDef.name` field for the proto
that defines the operation.
- - -
#### `tf.RegisterShape.__init__(op_type)`
Saves the "op_type" as the Operation type.
- - -
### `class tf.TensorShape`
Represents the shape of a `Tensor`.
A `TensorShape` represents a possibly-partial shape specification for a
`Tensor`. It may be one of the following:
* *Fully-known shape:* has a known number of dimensions and a known size
for each dimension.
* *Partially-known shape:* has a known number of dimensions, and an unknown
size for one or more dimensions.
* *Unknown shape:* has an unknown number of dimensions, and an unknown
size in all dimensions.
If a tensor is produced by an operation of type `"Foo"`, its shape
may be inferred if there is a registered shape function for
`"Foo"`. See [`tf.RegisterShape()`](../../api_docs/python/framework.md#RegisterShape)
for details of shape
functions and how to register them. Alternatively, the shape may be set
explicitly using [`Tensor.set_shape()`](../../api_docs/python/framework.md#Tensor.set_shape).
- - -
#### `tf.TensorShape.merge_with(other)`
Returns a `TensorShape` combining the information in `self` and `other`.
The dimensions in `self` and `other` are merged elementwise,
according to the rules defined for `Dimension.merge_with()`.
##### Args:
* `other`: Another `TensorShape`.
##### Returns:
A `TensorShape` containing the combined information of `self` and
`other`.
##### Raises:
* `ValueError`: If `self` and `other` are not compatible.
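A plain-Python model of elementwise merging may help here. Shapes are represented as lists with `None` for an unknown size, and a shape of `None` means the rank itself is unknown. `merge_dims` and `merge_shapes` are hypothetical helpers that mirror the rule that a known size fills in an unknown one and two different known sizes conflict.

```python
def merge_dims(a, b):
    """Merge two dimension sizes, where None means unknown."""
    if a is None:
        return b
    if b is None or a == b:
        return a
    raise ValueError("Dimensions %r and %r are not compatible" % (a, b))


def merge_shapes(self_dims, other_dims):
    """Merge two shapes given as lists of sizes, or None for unknown rank."""
    if self_dims is None:
        return other_dims
    if other_dims is None:
        return self_dims
    if len(self_dims) != len(other_dims):
        raise ValueError("Shapes have different ranks")
    return [merge_dims(a, b) for a, b in zip(self_dims, other_dims)]


# Each shape contributes the sizes that the other one is missing.
merged = merge_shapes([None, 28, 3], [28, None, 3])
```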
- - -
#### `tf.TensorShape.concatenate(other)`
Returns the concatenation of the dimensions in `self` and `other`.
*N.B.* If either `self` or `other` is completely unknown,
concatenation will discard information about the other shape. In
the future, we might support concatenation that preserves this
information for use with slicing.
##### Args:
* `other`: Another `TensorShape`.
##### Returns:
A `TensorShape` whose dimensions are the concatenation of the
dimensions in `self` and `other`.
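The information loss mentioned in the note is easy to see in a plain-Python model, again using lists with `None` for unknown sizes and `None` for a shape of unknown rank. `concatenate_shapes` is a hypothetical helper for illustration only.

```python
def concatenate_shapes(self_dims, other_dims):
    """Concatenate two shapes; None stands for a completely unknown shape."""
    if self_dims is None or other_dims is None:
        # The rank of the result cannot be known, so nothing is preserved.
        return None
    return list(self_dims) + list(other_dims)


batch_of_images = concatenate_shapes([32], [28, 28, 3])
partially_known = concatenate_shapes([None], [28, 28])
unknown = concatenate_shapes(None, [28, 28, 3])
```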
- - -
#### `tf.TensorShape.ndims`
Returns the rank of this shape, or None if it is unspecified.
- - -
#### `tf.TensorShape.dims`
Returns a list of Dimensions, or None if the shape is unspecified.
- - -
#### `tf.TensorShape.as_list()`
Returns a list of integers or None for each dimension.
- - -
#### `tf.TensorShape.is_compatible_with(other)`
Returns True iff `self` is compatible with `other`.
Two possibly-partially-defined shapes are compatible if there
exists a fully-defined shape that both shapes can represent. Thus,
compatibility allows the shape inference code to reason about
partially-defined shapes. For example:
* TensorShape(None) is compatible with all shapes.
* TensorShape([None, None]) is compatible with all two-dimensional
shapes, such as TensorShape([32, 784]), and also TensorShape(None). It is
not compatible with, for example, TensorShape([None]) or
TensorShape([None, None, None]).
* TensorShape([32, None]) is compatible with all two-dimensional shapes
with size 32 in the 0th dimension, and also TensorShape([None, None])
and TensorShape(None). It is not compatible with, for example,
TensorShape([32]), TensorShape([32, None, 1]) or TensorShape([64, None]).
* TensorShape([32, 784]) is compatible with itself, and also
TensorShape([32, None]), TensorShape([None, 784]), TensorShape([None,
None]) and TensorShape(None). It is not compatible with, for example,
TensorShape([32, 1, 784]) or TensorShape([None]).
The compatibility relation is reflexive and symmetric, but not
transitive. For example, TensorShape([32, 784]) is compatible with
TensorShape(None), and TensorShape(None) is compatible with
TensorShape([4, 4]), but TensorShape([32, 784]) is not compatible with
TensorShape([4, 4]).
##### Args:
* `other`: Another TensorShape.
##### Returns:
True iff `self` is compatible with `other`.
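The compatibility rules listed above fit in a few lines of plain Python if shapes are written as lists with `None` for unknown sizes and `None` for an unknown rank. `shapes_compatible` is a hypothetical helper, offered as a sketch of the relation rather than the library code.

```python
def shapes_compatible(a, b):
    """Return True if some fully-defined shape is represented by both a and b."""
    if a is None or b is None:
        # An unknown shape is compatible with everything.
        return True
    if len(a) != len(b):
        return False
    # Each pair of sizes must agree unless at least one is unknown.
    return all(x is None or y is None or x == y for x, y in zip(a, b))


# The relation is symmetric but not transitive:
step_1 = shapes_compatible([32, 784], None)
step_2 = shapes_compatible(None, [4, 4])
direct = shapes_compatible([32, 784], [4, 4])
```

Here `step_1` and `step_2` both hold while `direct` does not, which is the non-transitivity described above.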
- - -
#### `tf.TensorShape.is_fully_defined()`
Returns True iff `self` is fully defined in every dimension.
- - -
#### `tf.TensorShape.with_rank(rank)`
Returns a shape based on `self` with the given rank.
This method promotes a completely unknown shape to one with a
known rank.
##### Args:
* `rank`: An integer.
##### Returns:
A shape that is at least as specific as `self` with the given rank.
##### Raises:
* `ValueError`: If `self` does not represent a shape with the given `rank`.
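As a rough illustration of the promotion described above, the sketch below models a shape as `None` (unknown rank) or a list of integers/`None`. The function name `with_rank` is reused only for readability; this is a plain-Python stand-in, not the library method.

```python
def with_rank(shape, rank):
    """Return a shape of the given rank that is at least as specific as `shape`."""
    if shape is None:
        # A completely unknown shape is promoted to `rank` unknown dimensions.
        return [None] * rank
    if len(shape) != rank:
        raise ValueError("Shape %r does not have rank %d" % (shape, rank))
    return list(shape)


print(with_rank(None, 2))        # [None, None]
print(with_rank([32, None], 2))  # [32, None]
```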
- - -
#### `tf.TensorShape.with_rank_at_least(rank)`
Returns a shape based on `self` with at least the given rank.
##### Args:
* `rank`: An integer.
##### Returns:
A shape that is at least as specific as `self` with at least the given
rank.
##### Raises:
* `ValueError`: If `self` does not represent a shape with at least the given
`rank`.
- - -
#### `tf.TensorShape.with_rank_at_most(rank)`
Returns a shape based on `self` with at most the given rank.
##### Args:
* `rank`: An integer.
##### Returns:
A shape that is at least as specific as `self` with at most the given
rank.
##### Raises:
* `ValueError`: If `self` does not represent a shape with at most the given
`rank`.
- - -
#### `tf.TensorShape.assert_has_rank(rank)`
Raises an exception if `self` is not compatible with the given `rank`.
##### Args:
* `rank`: An integer.
##### Raises:
* `ValueError`: If `self` does not represent a shape with the given `rank`.
- - -
#### `tf.TensorShape.assert_same_rank(other)`
Raises an exception if `self` and `other` do not have compatible ranks.
##### Args:
* `other`: Another `TensorShape`.
##### Raises:
* `ValueError`: If `self` and `other` do not represent shapes with the
same rank.
- - -
#### `tf.TensorShape.assert_is_compatible_with(other)`
Raises an exception if `self` and `other` do not represent the same shape.
This method can be used to assert that there exists a shape that both
`self` and `other` represent.
##### Args:
* `other`: Another TensorShape.
##### Raises:
* `ValueError`: If `self` and `other` do not represent the same shape.
- - -
#### `tf.TensorShape.assert_is_fully_defined()`
Raises an exception if `self` is not fully defined in every dimension.
##### Raises:
* `ValueError`: If `self` does not have a known value for every dimension.
#### Other Methods
- - -
#### `tf.TensorShape.__init__(dims)`
Creates a new TensorShape with the given dimensions.
##### Args:
* `dims`: A list of Dimensions, or None if the shape is unspecified.
DEPRECATED: A single integer is treated as a singleton list.
- - -
#### `tf.TensorShape.as_dimension_list()`
DEPRECATED: use as_list().
- - -
#### `tf.TensorShape.num_elements()`
Returns the total number of elements, or None for incomplete shapes.
- - -
### `class tf.Dimension`
Represents the value of one dimension in a TensorShape.
- - -
#### `tf.Dimension.__init__(value)`
Creates a new Dimension with the given value.
- - -
#### `tf.Dimension.assert_is_compatible_with(other)`
Raises an exception if `other` is not compatible with this Dimension.
##### Args:
* `other`: Another Dimension.
##### Raises:
* `ValueError`: If `self` and `other` are not compatible (see
is_compatible_with).
- - -
#### `tf.Dimension.is_compatible_with(other)`
Returns True if `other` is compatible with this Dimension.
Two known Dimensions are compatible if they have the same value.
An unknown Dimension is compatible with all other Dimensions.
##### Args:
* `other`: Another Dimension.
##### Returns:
True if this Dimension and `other` are compatible.
- - -
#### `tf.Dimension.merge_with(other)`
Returns a Dimension that combines the information in `self` and `other`.
Dimensions are combined as follows:
    Dimension(n)   .merge_with(Dimension(n))    == Dimension(n)
    Dimension(n)   .merge_with(Dimension(None)) == Dimension(n)
    Dimension(None).merge_with(Dimension(n))    == Dimension(n)
    Dimension(None).merge_with(Dimension(None)) == Dimension(None)
    Dimension(n)   .merge_with(Dimension(m)) raises ValueError for n != m
##### Args:
* `other`: Another Dimension.
##### Returns:
A Dimension containing the combined information of `self` and
`other`.
##### Raises:
* `ValueError`: If `self` and `other` are not compatible (see
is_compatible_with).
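The merge rules listed for `merge_with` can be sketched in a few lines of plain Python. This sketch assumes a dimension is represented by an integer or `None`; `merge_dims` is a hypothetical helper name, not part of the library.

```python
def merge_dims(a, b):
    """Combine two dimension values, where None means 'unknown'."""
    if a is None:
        return b  # covers both (None, n) -> n and (None, None) -> None
    if b is None or a == b:
        return a
    raise ValueError("Dimensions %r and %r are not compatible" % (a, b))


print(merge_dims(32, None))    # 32
print(merge_dims(None, None))  # None
```

Merging never loses information: the result is known whenever either input is known.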
- - -
#### `tf.Dimension.value`
The value of this dimension, or None if it is unknown.
- - -
### `tf.op_scope(values, name, default_name)`
Returns a context manager for use when defining a Python op.
This context manager validates that the given `values` are from the
same graph, ensures that that graph is the default graph, and pushes a
name scope.
For example, to define a new Python op called `my_op`:
```python
def my_op(a, b, c, name=None):
  with tf.op_scope([a, b, c], name, "MyOp") as scope:
    a = tf.convert_to_tensor(a, name="a")
    b = tf.convert_to_tensor(b, name="b")
    c = tf.convert_to_tensor(c, name="c")
    # Define some computation that uses `a`, `b`, and `c`.
    return foo_op(..., name=scope)
```
##### Args:
* `values`: The list of `Tensor` arguments that are passed to the op function.
* `name`: The name argument that is passed to the op function.
* `default_name`: The default name to use if the `name` argument is `None`.
##### Returns:
A context manager for use in defining a Python op.
- - -
### `tf.get_seed(op_seed)`
Returns the local seeds an operation should use given an op-specific seed.
Given an operation-specific seed, `op_seed`, this helper function returns two
seeds derived from graph-level and op-level seeds. Many random operations
internally use the two seeds to allow the user to change the seed globally for a
graph, or for only specific operations.
For details on how the graph-level seed interacts with op seeds, see
[`set_random_seed`](../../api_docs/python/constant_op.md#set_random_seed).
##### Args:
* `op_seed`: integer.
##### Returns:
A tuple of two integers that should be used for the local seed of this
operation.
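To make the interaction between the two seed levels more concrete, here is a hedged plain-Python sketch of one way a graph-level seed and an op-level seed could be combined into a pair. It is not TensorFlow's implementation: the name `sketch_get_seed`, the constant `ASSUMED_DEFAULT_GRAPH_SEED`, the `fallback_op_seed` parameter, and the choice to return `(None, None)` when nothing is set are all assumptions made for illustration.

```python
# Hypothetical stand-in for a library-chosen default graph-level seed.
ASSUMED_DEFAULT_GRAPH_SEED = 0


def sketch_get_seed(graph_seed, op_seed, fallback_op_seed=0):
    """Combine a graph-level and an op-level seed into a (seed1, seed2) pair.

    `fallback_op_seed` is a hypothetical deterministic per-op value used
    when only the graph-level seed has been set.
    """
    if graph_seed is None and op_seed is None:
        return None, None  # nothing fixed: leave both seeds unset
    if graph_seed is None:
        return ASSUMED_DEFAULT_GRAPH_SEED, op_seed
    if op_seed is None:
        return graph_seed, fallback_op_seed
    return graph_seed, op_seed


print(sketch_get_seed(1234, 7))  # (1234, 7)
```

The point of the pairing is that changing the graph-level seed alters every op's pair, while changing one op's seed alters only that op.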
================================================
FILE: SOURCE/api_docs/python/image.md
================================================
# Images
Note: Functions taking `Tensor` arguments can also take anything accepted by
[`tf.convert_to_tensor`](../../api_docs/python/framework.md#convert_to_tensor).
## Contents
### [Images](#AUTOGENERATED-images)
* [Encoding and Decoding](#AUTOGENERATED-encoding-and-decoding)
* [`tf.image.decode_jpeg(contents, channels=None, ratio=None, fancy_upscaling=None, try_recover_truncated=None, acceptable_fraction=None, name=None)`](#decode_jpeg)
* [`tf.image.encode_jpeg(image, format=None, quality=None, progressive=None, optimize_size=None, chroma_downsampling=None, density_unit=None, x_density=None, y_density=None, xmp_metadata=None, name=None)`](#encode_jpeg)
* [`tf.image.decode_png(contents, channels=None, name=None)`](#decode_png)
* [`tf.image.encode_png(image, compression=None, name=None)`](#encode_png)
* [Resizing](#AUTOGENERATED-resizing)
* [`tf.image.resize_images(images, new_height, new_width, method=0)`](#resize_images)
* [`tf.image.resize_area(images, size, name=None)`](#resize_area)
* [`tf.image.resize_bicubic(images, size, name=None)`](#resize_bicubic)
* [`tf.image.resize_bilinear(images, size, name=None)`](#resize_bilinear)
* [`tf.image.resize_nearest_neighbor(images, size, name=None)`](#resize_nearest_neighbor)
* [Cropping](#AUTOGENERATED-cropping)
* [`tf.image.resize_image_with_crop_or_pad(image, target_height, target_width)`](#resize_image_with_crop_or_pad)
* [`tf.image.pad_to_bounding_box(image, offset_height, offset_width, target_height, target_width)`](#pad_to_bounding_box)
* [`tf.image.crop_to_bounding_box(image, offset_height, offset_width, target_height, target_width)`](#crop_to_bounding_box)
* [`tf.image.random_crop(image, size, seed=None, name=None)`](#random_crop)
* [`tf.image.extract_glimpse(input, size, offsets, centered=None, normalized=None, uniform_noise=None, name=None)`](#extract_glimpse)
* [Flipping and Transposing](#AUTOGENERATED-flipping-and-transposing)
* [`tf.image.flip_up_down(image)`](#flip_up_down)
* [`tf.image.random_flip_up_down(image, seed=None)`](#random_flip_up_down)
* [`tf.image.flip_left_right(image)`](#flip_left_right)
* [`tf.image.random_flip_left_right(image, seed=None)`](#random_flip_left_right)
* [`tf.image.transpose_image(image)`](#transpose_image)
* [Image Adjustments](#AUTOGENERATED-image-adjustments)
* [`tf.image.adjust_brightness(image, delta, min_value=None, max_value=None)`](#adjust_brightness)
* [`tf.image.random_brightness(image, max_delta, seed=None)`](#random_brightness)
* [`tf.image.adjust_contrast(images, contrast_factor, min_value=None, max_value=None)`](#adjust_contrast)
* [`tf.image.random_contrast(image, lower, upper, seed=None)`](#random_contrast)
* [`tf.image.per_image_whitening(image)`](#per_image_whitening)
## Encoding and Decoding
TensorFlow provides Ops to decode and encode JPEG and PNG formats. Encoded
images are represented by scalar string Tensors, decoded images by 3-D uint8
tensors of shape `[height, width, channels]`.
The encode and decode Ops apply to one image at a time. Their input and output
are all of variable size. If you need fixed size images, pass the output of
the decode Ops to one of the cropping and resizing Ops.
Note: The PNG encode and decode Ops support RGBA, but the conversion Ops
presently only support RGB, HSV, and GrayScale.
- - -
### `tf.image.decode_jpeg(contents, channels=None, ratio=None, fancy_upscaling=None, try_recover_truncated=None, acceptable_fraction=None, name=None)`
Decode a JPEG-encoded image to a uint8 tensor.
The attr `channels` indicates the desired number of color channels for the
decoded image.
Accepted values are:
* 0: Use the number of channels in the JPEG-encoded image.
* 1: output a grayscale image.
* 3: output an RGB image.
If needed, the JPEG-encoded image is transformed to match the requested number
of color channels.
The attr `ratio` allows downscaling the image by an integer factor during
decoding. Allowed values are: 1, 2, 4, and 8. This is much faster than
downscaling the image later.
##### Args:
* `contents`: A `Tensor` of type `string`. 0-D. The JPEG-encoded image.
* `channels`: An optional `int`. Defaults to `0`.
Number of color channels for the decoded image.
* `ratio`: An optional `int`. Defaults to `1`. Downscaling ratio.
* `fancy_upscaling`: An optional `bool`. Defaults to `True`.
If true use a slower but nicer upscaling of the
chroma planes (yuv420/422 only).
* `try_recover_truncated`: An optional `bool`. Defaults to `False`.
If true try to recover an image from truncated input.
* `acceptable_fraction`: An optional `float`. Defaults to `1`.
The minimum required fraction of lines before a truncated
input is accepted.
* `name`: A name for the operation (optional).
##### Returns:
A `Tensor` of type `uint8`. 3-D with shape `[height, width, channels]`.
- - -
### `tf.image.encode_jpeg(image, format=None, quality=None, progressive=None, optimize_size=None, chroma_downsampling=None, density_unit=None, x_density=None, y_density=None, xmp_metadata=None, name=None)`
JPEG-encode an image.
`image` is a 3-D uint8 Tensor of shape `[height, width, channels]`.
The attr `format` can be used to override the color format of the encoded
output. Values can be:
* `''`: Use a default format based on the number of channels in the image.
* `grayscale`: Output a grayscale JPEG image. The `channels` dimension
of `image` must be 1.
* `rgb`: Output an RGB JPEG image. The `channels` dimension
of `image` must be 3.
If `format` is not specified or is the empty string, a default format is picked
based on the number of channels in `image`:
* 1: Output a grayscale image.
* 3: Output an RGB image.
##### Args:
* `image`: A `Tensor` of type `uint8`.
3-D with shape `[height, width, channels]`.
* `format`: An optional `string` from: `"", "grayscale", "rgb"`. Defaults to `""`.
Per pixel image format.
* `quality`: An optional `int`. Defaults to `95`.
Quality of the compression from 0 to 100 (higher is better and slower).
* `progressive`: An optional `bool`. Defaults to `False`.
If True, create a JPEG that loads progressively (coarse to fine).
* `optimize_size`: An optional `bool`. Defaults to `False`.
If True, spend CPU/RAM to reduce size with no quality change.
* `chroma_downsampling`: An optional `bool`. Defaults to `True`.
See http://en.wikipedia.org/wiki/Chroma_subsampling.
* `density_unit`: An optional `string` from: `"in", "cm"`. Defaults to `"in"`.
Unit used to specify `x_density` and `y_density`:
pixels per inch (`'in'`) or centimeter (`'cm'`).
* `x_density`: An optional `int`. Defaults to `300`.
Horizontal pixels per density unit.
* `y_density`: An optional `int`. Defaults to `300`.
Vertical pixels per density unit.
* `xmp_metadata`: An optional `string`. Defaults to `""`.
If not empty, embed this XMP metadata in the image header.
* `name`: A name for the operation (optional).
##### Returns:
A `Tensor` of type `string`. 0-D. JPEG-encoded image.
- - -
### `tf.image.decode_png(contents, channels=None, name=None)`
Decode a PNG-encoded image to a uint8 tensor.
The attr `channels` indicates the desired number of color channels for the
decoded image.
Accepted values are:
* 0: Use the number of channels in the PNG-encoded image.
* 1: output a grayscale image.
* 3: output an RGB image.
* 4: output an RGBA image.
If needed, the PNG-encoded image is transformed to match the requested number
of color channels.
##### Args:
* `contents`: A `Tensor` of type `string`. 0-D. The PNG-encoded image.
* `channels`: An optional `int`. Defaults to `0`.
Number of color channels for the decoded image.
* `name`: A name for the operation (optional).
##### Returns:
A `Tensor` of type `uint8`. 3-D with shape `[height, width, channels]`.
- - -
### `tf.image.encode_png(image, compression=None, name=None)`
PNG-encode an image.
`image` is a 3-D uint8 Tensor of shape `[height, width, channels]` where
`channels` is:
* 1: for grayscale.
* 3: for RGB.
* 4: for RGBA.
The ZLIB compression level, `compression`, can be -1 for the PNG-encoder
default or a value from 0 to 9. 9 is the highest compression level, generating
the smallest output, but is slower.
##### Args:
* `image`: A `Tensor` of type `uint8`.
3-D with shape `[height, width, channels]`.
* `compression`: An optional `int`. Defaults to `-1`. Compression level.
* `name`: A name for the operation (optional).
##### Returns:
A `Tensor` of type `string`. 0-D. PNG-encoded image.
## Resizing
The resizing Ops accept input images as tensors of several types. Except where
an Op's return type says otherwise, they output resized images as float32 tensors.
The convenience function [resize_images()](#resize_images) supports both 4-D
and 3-D tensors as input and output. 4-D tensors are for batches of images,
3-D tensors for individual images.
Other resizing Ops only support 4-D batches of images as input:
[resize_area](#resize_area), [resize_bicubic](#resize_bicubic),
[resize_bilinear](#resize_bilinear),
[resize_nearest_neighbor](#resize_nearest_neighbor).
Example:
```python
# Decode a JPG image and resize it to 299 by 299 using the default method.
image = tf.image.decode_jpeg(...)
resized_image = tf.image.resize_images(image, 299, 299)
```
- - -
### `tf.image.resize_images(images, new_height, new_width, method=0)`
Resize `images` to `new_width`, `new_height` using the specified `method`.
Resized images will be distorted if their original aspect ratio is not
the same as `new_width`, `new_height`. To avoid distortions see
[resize_image_with_crop_or_pad](#resize_image_with_crop_or_pad).
`method` can be one of:
* ResizeMethod.BILINEAR: [Bilinear interpolation.](https://en.wikipedia.org/wiki/Bilinear_interpolation)
* ResizeMethod.NEAREST_NEIGHBOR: [Nearest neighbor interpolation.](https://en.wikipedia.org/wiki/Nearest-neighbor_interpolation)
* ResizeMethod.BICUBIC: [Bicubic interpolation.](https://en.wikipedia.org/wiki/Bicubic_interpolation)
* ResizeMethod.AREA: Area interpolation.
##### Args:
* `images`: 4-D Tensor of shape `[batch, height, width, channels]` or
3-D Tensor of shape `[height, width, channels]`.
* `new_height`: integer.
* `new_width`: integer.
* `method`: ResizeMethod. Defaults to `ResizeMethod.BILINEAR`.
##### Raises:
* `ValueError`: if the shape of `images` is incompatible with the
shape arguments to this function
* `ValueError`: if an unsupported resize method is specified.
##### Returns:
If `images` was 4-D, a 4-D float Tensor of shape
`[batch, new_height, new_width, channels]`.
If `images` was 3-D, a 3-D float Tensor of shape
`[new_height, new_width, channels]`.
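The relationship between input and output shapes described above can be written down as a small helper. This is a sketch only: `resized_shape` is a hypothetical name, shapes are modelled as plain Python lists, and no pixel data is touched.

```python
def resized_shape(shape, new_height, new_width):
    """Return the output shape for a 3-D image or a 4-D batch of images."""
    if len(shape) == 4:
        batch, _, _, channels = shape
        return [batch, new_height, new_width, channels]
    if len(shape) == 3:
        _, _, channels = shape
        return [new_height, new_width, channels]
    raise ValueError("images must be 3-D or 4-D, got rank %d" % len(shape))


print(resized_shape([480, 640, 3], 299, 299))      # [299, 299, 3]
print(resized_shape([16, 480, 640, 3], 299, 299))  # [16, 299, 299, 3]
```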
- - -
### `tf.image.resize_area(images, size, name=None)`
Resize `images` to `size` using area interpolation.
Input images can be of different types but output images are always float.
##### Args:
* `images`: A `Tensor`. Must be one of the following types: `uint8`, `int8`, `int32`, `float32`, `float64`.
4-D with shape `[batch, height, width, channels]`.
* `size`: A 1-D int32 Tensor of 2 elements: `new_height, new_width`. The
new size for the images.
* `name`: A name for the operation (optional).
##### Returns:
A `Tensor` of type `float32`. 4-D with shape
`[batch, new_height, new_width, channels]`.
- - -
### `tf.image.resize_bicubic(images, size, name=None)`
Resize `images` to `size` using bicubic interpolation.
Input images can be of different types but output images are always float.
##### Args:
* `images`: A `Tensor`. Must be one of the following types: `uint8`, `int8`, `int32`, `float32`, `float64`.
4-D with shape `[batch, height, width, channels]`.
* `size`: A 1-D int32 Tensor of 2 elements: `new_height, new_width`. The
new size for the images.
* `name`: A name for the operation (optional).
##### Returns:
A `Tensor` of type `float32`. 4-D with shape
`[batch, new_height, new_width, channels]`.
- - -
### `tf.image.resize_bilinear(images, size, name=None)`
Resize `images` to `size` using bilinear interpolation.
Input images can be of different types but output images are always float.
##### Args:
* `images`: A `Tensor`. Must be one of the following types: `uint8`, `int8`, `int32`, `float32`, `float64`.
4-D with shape `[batch, height, width, channels]`.
* `size`: A 1-D int32 Tensor of 2 elements: `new_height, new_width`. The
new size for the images.
* `name`: A name for the operation (optional).
##### Returns:
A `Tensor` of type `float32`. 4-D with shape
`[batch, new_height, new_width, channels]`.
- - -
### `tf.image.resize_nearest_neighbor(images, size, name=None)`
Resize `images` to `size` using nearest neighbor interpolation.
Input images can be of different types; the output has the same type as the input.
##### Args:
* `images`: A `Tensor`. Must be one of the following types: `uint8`, `int8`, `int32`, `float32`, `float64`.
4-D with shape `[batch, height, width, channels]`.
* `size`: A 1-D int32 Tensor of 2 elements: `new_height, new_width`. The
new size for the images.
* `name`: A name for the operation (optional).
##### Returns:
A `Tensor`. Has the same type as `images`. 4-D with shape
`[batch, new_height, new_width, channels]`.
## Cropping
- - -
### `tf.image.resize_image_with_crop_or_pad(image, target_height, target_width)`
Crops and/or pads an image to a target width and height.
Resizes an image to a target width and height by either centrally
cropping the image or padding it evenly with zeros.
If `width` or `height` is greater than the specified `target_width` or
`target_height` respectively, this op centrally crops along that dimension.
If `width` or `height` is smaller than the specified `target_width` or
`target_height` respectively, this op centrally pads with 0 along that
dimension.
##### Args:
* `image`: 3-D tensor of shape [height, width, channels]
* `target_height`: Target height.
* `target_width`: Target width.
##### Raises:
* `ValueError`: if `target_height` or `target_width` are zero or negative.
##### Returns:
Cropped and/or padded image of shape
`[target_height, target_width, channels]`
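A minimal plain-Python sketch of central cropping and padding follows. It assumes a single-channel image stored as a list of rows, and it assumes that when the amount to crop or pad is odd, the extra row or column goes on the bottom/right; the source does not specify that detail. The helper names `_fit` and `crop_or_pad` are hypothetical.

```python
def _fit(seq, target, make_fill):
    """Centrally crop or zero-pad a list so that it has `target` items."""
    n = len(seq)
    if n > target:
        start = (n - target) // 2
        return list(seq[start:start + target])
    before = (target - n) // 2
    after = target - n - before
    return ([make_fill() for _ in range(before)] + list(seq) +
            [make_fill() for _ in range(after)])


def crop_or_pad(image, target_height, target_width):
    """Crop and/or pad a 2-D list-of-rows image to the target size."""
    if target_height <= 0 or target_width <= 0:
        raise ValueError("target_height and target_width must be positive")
    # Fix the width of every row first, then fix the number of rows.
    rows = [_fit(row, target_width, lambda: 0) for row in image]
    return _fit(rows, target_height, lambda: [0] * target_width)


image = [[1, 2, 3, 4],
         [5, 6, 7, 8]]
# Width 4 -> 2 is cropped centrally; height 2 -> 4 is padded evenly.
print(crop_or_pad(image, 4, 2))  # [[0, 0], [2, 3], [6, 7], [0, 0]]
```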
- - -
### `tf.image.pad_to_bounding_box(image, offset_height, offset_width, target_height, target_width)`
Pad `image` with zeros to the specified `height` and `width`.
Adds `offset_height` rows of zeros on top, `offset_width` columns of
zeros on the left, and then pads the image on the bottom and right
with zeros until it has dimensions `target_height`, `target_width`.
This op does nothing if `offset_*` is zero and the image already has size
`target_height` by `target_width`.
##### Args:
* `image`: 3-D tensor with shape `[height, width, channels]`
* `offset_height`: Number of rows of zeros to add on top.
* `offset_width`: Number of columns of zeros to add on the left.
* `target_height`: Height of output image.
* `target_width`: Width of output image.
##### Returns:
3-D tensor of shape `[target_height, target_width, channels]`
##### Raises:
* `ValueError`: If the shape of `image` is incompatible with the `offset_*` or
`target_*` arguments
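The padding arithmetic can be sketched in plain Python as follows. The sketch assumes a single-channel image stored as a non-empty list of rows; `pad_2d` is a hypothetical helper, not the library Op.

```python
def pad_2d(image, offset_height, offset_width, target_height, target_width):
    """Zero-pad a 2-D list-of-rows image to target_height x target_width."""
    height, width = len(image), len(image[0])
    after_h = target_height - offset_height - height  # rows added at the bottom
    after_w = target_width - offset_width - width     # columns added on the right
    if min(offset_height, offset_width, after_h, after_w) < 0:
        raise ValueError("image does not fit the requested bounding box")
    rows = [[0] * offset_width + list(row) + [0] * after_w for row in image]
    top = [[0] * target_width for _ in range(offset_height)]
    bottom = [[0] * target_width for _ in range(after_h)]
    return top + rows + bottom


print(pad_2d([[1, 2], [3, 4]], 1, 1, 4, 3))
# [[0, 0, 0], [0, 1, 2], [0, 3, 4], [0, 0, 0]]
```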
- - -
### `tf.image.crop_to_bounding_box(image, offset_height, offset_width, target_height, target_width)`
Crops an image to a specified bounding box.
This op cuts a rectangular part out of `image`. The top-left corner of the
returned image is at `offset_height, offset_width` in `image`, and its
lower-right corner is at
`offset_height + target_height, offset_width + target_width`.
##### Args:
* `image`: 3-D tensor with shape `[height, width, channels]`
* `offset_height`: Vertical coordinate of the top-left corner of the result in
the input.
* `offset_width`: Horizontal coordinate of the top-left corner of the result in
the input.
* `target_height`: Height of the result.
* `target_width`: Width of the result.
##### Returns:
3-D tensor of image with shape `[target_height, target_width, channels]`
##### Raises:
* `ValueError`: If the shape of `image` is incompatible with the `offset_*` or
`target_*` arguments
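For a single-channel image stored as a list of rows, the bounding-box crop reduces to two slices. The sketch below is illustrative only; `crop_2d` is a hypothetical helper, and the exact validation performed by the library Op is not reproduced here.

```python
def crop_2d(image, offset_height, offset_width, target_height, target_width):
    """Cut a target_height x target_width window out of a 2-D list-of-rows image."""
    height, width = len(image), len(image[0])
    if (offset_height < 0 or offset_width < 0 or
            target_height <= 0 or target_width <= 0 or
            offset_height + target_height > height or
            offset_width + target_width > width):
        raise ValueError("bounding box does not fit inside the image")
    rows = image[offset_height:offset_height + target_height]
    return [list(row[offset_width:offset_width + target_width]) for row in rows]


image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
print(crop_2d(image, 1, 1, 2, 2))  # [[5, 6], [8, 9]]
```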
- - -
### `tf.image.random_crop(image, size, seed=None, name=None)`
Randomly crops `image` to size `[target_height, target_width]`.
The offset of the output within `image` is uniformly random. `image` always
fully contains the result.
##### Args:
* `image`: 3-D tensor of shape `[height, width, channels]`
* `size`: 1-D tensor with two elements, specifying target `[height, width]`
* `seed`: A Python integer. Used to create a random seed. See
[`set_random_seed`](../../api_docs/python/constant_op.md#set_random_seed)
for behavior.
* `name`: A name for this operation (optional).
##### Returns:
A cropped 3-D tensor of shape `[target_height, target_width, channels]`.
- - -
### `tf.image.extract_glimpse(input, size, offsets, centered=None, normalized=None, uniform_noise=None, name=None)`
Extracts a glimpse from the input tensor.
Returns a set of windows called glimpses extracted at location `offsets`
from the input tensor. If a window only partially overlaps the input, the
non-overlapping areas are filled with random noise.
The result is a 4-D tensor of shape `[batch_size, glimpse_height,
glimpse_width, channels]`. The channels and batch dimensions are the same as
those of the input tensor. The height and width of the output windows are
specified in the `size` parameter.
The arguments `normalized` and `centered` control how the windows are built:
* If the coordinates are normalized but not centered, 0.0 and 1.0
correspond to the minimum and maximum of each height and width dimension.
* If the coordinates are both normalized and centered, they range from -1.0 to
1.0. The coordinates (-1.0, -1.0) correspond to the upper left corner, the
lower right corner is located at (1.0, 1.0) and the center is at (0, 0).
* If the coordinates are not normalized they are interpreted as numbers of pixels.
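The coordinate conventions above can be sketched as a conversion from an offset to a pixel position along one axis. This is an illustration under stated assumptions, not the Op's implementation: `offset_to_pixel` is a hypothetical helper, the maximum normalized coordinate is assumed to map to `size` (rather than `size - 1`), and an unnormalized, centered offset is assumed to be measured in pixels from the image center.

```python
def offset_to_pixel(offset, size, centered=True, normalized=True):
    """Map one offset coordinate to a pixel position along an axis of length `size`."""
    if normalized:
        if centered:
            # -1.0 -> 0, 0.0 -> center, 1.0 -> size
            return (offset + 1.0) / 2.0 * size
        # 0.0 -> 0, 1.0 -> size
        return offset * float(size)
    if centered:
        # Pixel offset measured from the center of the axis.
        return offset + size / 2.0
    # Pixel offset measured from the upper-left corner.
    return float(offset)


print(offset_to_pixel(0.0, 100))                    # 50.0
print(offset_to_pixel(0.25, 100, centered=False))   # 25.0
```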
##### Args:
* `input`: A `Tensor` of type `float32`.
A 4-D float tensor of shape `[batch_size, height, width, channels]`.
* `size`: A `Tensor` of type `int32`.
A 1-D tensor of 2 elements containing the size of the glimpses to extract.
The glimpse height must be specified first, followed by the glimpse width.
* `offsets`: A `Tensor` of type `float32`.
A 2-D tensor of shape `[batch_size, 2]` containing the x, y
locations of the center of each window.
* `centered`: An optional `bool`. Defaults to `True`.
Indicates if the offset coordinates are centered relative to
the image, in which case the (0, 0) offset is relative to the center of the
input images. If false, the (0, 0) offset corresponds to the upper left corner
of the input images.
* `normalized`: An optional `bool`. Defaults to `True`.
Indicates if the offset coordinates are normalized.
* `uniform_noise`: An optional `bool`. Defaults to `True`.
Indicates if the noise should be generated using a
uniform distribution or a Gaussian distribution.
* `name`: A name for the operation (optional).
##### Returns:
A `Tensor` of type `float32`.
A tensor representing the glimpses `[batch_size, glimpse_height,
glimpse_width, channels]`.
## Flipping and Transposing
- - -
### `tf.image.flip_up_down(image)`
Flip an image vertically (upside down).
Outputs the contents of `image` flipped along the first dimension, which is
`height`.
See also `reverse()`.
##### Args:
* `image`: A 3-D tensor of shape `[height, width, channels]`.
##### Returns:
A 3-D tensor of the same type and shape as `image`.
##### Raises:
* `ValueError`: if the shape of `image` is not supported.
- - -
### `tf.image.random_flip_up_down(image, seed=None)`
Randomly flips an image vertically (upside down).
With a 1 in 2 chance, outputs the contents of `image` flipped along the first
dimension, which is `height`. Otherwise outputs the image as-is.
##### Args:
* `image`: A 3-D tensor of shape `[height, width, channels]`.
* `seed`: A Python integer. Used to create a random seed. See
[`set_random_seed`](../../api_docs/python/constant_op.md#set_random_seed)
for behavior.
##### Returns:
A 3-D tensor of the same type and shape as `image`.
##### Raises:
* `ValueError`: if the shape of `image` is not supported.
- - -
### `tf.image.flip_left_right(image)`
Flip an image horizontally (left to right).
Outputs the contents of `image` flipped along the second dimension, which is
`width`.
See also `reverse()`.
##### Args:
* `image`: A 3-D tensor of shape `[height, width, channels]`.
##### Returns:
A 3-D tensor of the same type and shape as `image`.
##### Raises:
* `ValueError`: if the shape of `image` is not supported.
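Both deterministic flips amount to reversing one axis. The sketch below shows this on an image stored as nested Python lists indexed `[height][width][channels]`; the helper names are hypothetical stand-ins for the Ops.

```python
def flip_up_down(image):
    """Reverse the first axis (height): the top row becomes the bottom row."""
    return [row for row in reversed(image)]


def flip_left_right(image):
    """Reverse the second axis (width) of every row."""
    return [list(reversed(row)) for row in image]


# A 2 x 2 image with one channel per pixel.
image = [[[1], [2]],
         [[3], [4]]]
print(flip_up_down(image))     # [[[3], [4]], [[1], [2]]]
print(flip_left_right(image))  # [[[2], [1]], [[4], [3]]]
```

Applying either flip twice returns the original image.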
- - -
### `tf.image.random_flip_left_right(image, seed=None)`
Randomly flip an image horizontally (left to right).
With a 1 in 2 chance, outputs the contents of `image` flipped along the
second dimension, which is `width`. Otherwise outputs the image as-is.
##### Args:
* `image`: A 3-D tensor of shape `[height, width, channels]`.
* `seed`: A Python integer. Used to create a random seed. See
[`set_random_seed`](../../api_docs/python/constant_op.md#set_random_seed)
for behavior.
##### Returns:
A 3-D tensor of the same type and shape as `image`.
##### Raises:
* `ValueError`: if the shape of `image` is not supported.
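The "1 in 2 chance" behaviour can be imitated with Python's standard `random` module. This sketch is an assumption-laden stand-in: it uses `random.Random(seed)` rather than TensorFlow's graph-level and op-level seeds, and `sketch_random_flip_left_right` is a hypothetical name.

```python
import random


def sketch_random_flip_left_right(image, seed=None):
    """Flip a list-of-rows image along its width with probability 1/2."""
    rng = random.Random(seed)  # a fixed seed makes the coin toss repeatable
    if rng.random() < 0.5:
        return [list(reversed(row)) for row in image]
    return [list(row) for row in image]


image = [[1, 2, 3],
         [4, 5, 6]]
first = sketch_random_flip_left_right(image, seed=42)
second = sketch_random_flip_left_right(image, seed=42)
print(first == second)  # True
```

Whatever the outcome of the toss, the result has the same shape as the input.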
- - -
### `tf.image.transpose_image(image)`
Transpose an image by swapping the first and second dimension.
See also `transpose()`.
##### Args:
* `image`: 3-D tensor of shape `[height, width, channels]`
##### Returns:
A 3-D tensor of shape `[width, height, channels]`
##### Raises:
* `ValueError`: if the shape of `image` is not supported.
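Swapping the first two dimensions can be illustrated on nested Python lists indexed `[height][width][channels]`. The function below is a plain-Python sketch that borrows the Op's name for readability; it is not the library implementation.

```python
def transpose_image(image):
    """Swap height and width; each pixel's channel list is left untouched."""
    return [list(column) for column in zip(*image)]


# A 2 x 3 image with one channel per pixel.
image = [[[1], [2], [3]],
         [[4], [5], [6]]]
print(transpose_image(image))
# [[[1], [4]], [[2], [5]], [[3], [6]]]
```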
## Image Adjustments
TensorFlow provides functions to adjust images in various ways: brightness,
contrast, hue, and saturation. Each adjustment can be done with predefined
parameters or with random parameters picked from predefined intervals. Random
adjustments are often useful to expand a training set and reduce overfitting.
- - -
### `tf.image.adjust_brightness(image, delta, min_value=None, max_value=None)`
Adjust the brightness of RGB or Grayscale images.
The value `delta` is added to all components of the tensor `image`. `image`
and `delta` are cast to `float` before adding, and the resulting values are
clamped to `[min_value, max_value]`. Finally, the result is cast back to
`image.dtype`.
If `min_value` or `max_value` is not given, it is set to the minimum or
maximum allowed value for `image.dtype`, respectively.
##### Args:
* `image`: A tensor.
* `delta`: A scalar. Amount to add to the pixel values.
* `min_value`: Minimum value for output.
* `max_value`: Maximum value for output.
##### Returns:
A tensor of the same shape and type as `image`.
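The cast, add, clamp, cast-back sequence can be sketched for 8-bit pixel values as follows. This is an illustration, not the Op itself: `adjust_brightness_uint8` is a hypothetical helper working on a flat list, the defaults 0 and 255 are the `uint8` limits, and rounding before the final cast is an assumption.

```python
def adjust_brightness_uint8(pixels, delta, min_value=0, max_value=255):
    """Add `delta` to every value, clamp to [min_value, max_value], return ints."""
    result = []
    for p in pixels:
        value = float(p) + float(delta)                 # compute in float
        value = max(float(min_value), min(float(max_value), value))  # clamp
        result.append(int(round(value)))                # cast back to integer
    return result


print(adjust_brightness_uint8([0, 100, 250], 10))   # [10, 110, 255]
print(adjust_brightness_uint8([0, 100, 250], -20))  # [0, 80, 230]
```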
- - -
### `tf.image.random_brightness(image, max_delta, seed=None)`
Adjust the brightness of images by a random factor.
Equivalent to `adjust_brightness()` using a `delta` randomly picked in the
interval `[-max_delta, max_delta)`.
Note that `delta` is picked as a float. For integer-type images, the
brightness-adjusted result is rounded before casting, so integer images may
have modifications in the range `[-max_delta, max_delta]`.
##### Args:
* `image`: 3-D tensor of shape `[height, width, channels]`.
* `max_delta`: float, must be non-negative.
* `seed`: A Python integer. Used to create a random seed. See
[`set_random_seed`](../../api_docs/python/constant_op.md#set_random_seed)
for behavior.
##### Returns:
3-D tensor of images of shape `[height, width, channels]`
##### Raises:
* `ValueError`: if max_delta is negative.
- - -
### `tf.image.adjust_contrast(images, contrast_factor, min_value=None, max_value=None)`
Adjust contrast of RGB or grayscale images.
`images` is a tensor of at least 3 dimensions. The last 3 dimensions are
interpreted as `[height, width, channels]`. The other dimensions only
represent a collection of images, such as `[batch, height, width, channels]`.
Contrast is adjusted independently for each channel of each image.
For each channel, this Op first computes the mean of the image pixels in the
channel and then adjusts each component `x` of each pixel to
`(x - mean) * contrast_factor + mean`.
The adjusted values are then clipped to fit in the `[min_value, max_value]`
interval. If `min_value` or `max_value` is not given, it is replaced with the
minimum or maximum value for the data type of `images`, respectively.
The contrast-adjusted image is always computed as `float`, and it is
cast back to its original type after clipping.
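The per-channel formula `(x - mean) * contrast_factor + mean` followed by clipping can be sketched on the values of a single channel. `adjust_contrast_channel` is a hypothetical helper, the defaults 0 and 255 assume `uint8` data, and the final cast back to the original type is omitted.

```python
def adjust_contrast_channel(values, contrast_factor, min_value=0, max_value=255):
    """Scale each value's distance from the channel mean, then clip."""
    mean = sum(values) / float(len(values))
    adjusted = []
    for x in values:
        y = (x - mean) * contrast_factor + mean
        adjusted.append(max(min_value, min(max_value, y)))
    return adjusted


channel = [50, 100, 150]                    # mean is 100
print(adjust_contrast_channel(channel, 2))  # [0.0, 100.0, 200.0]
print(adjust_contrast_channel(channel, 0))  # [100.0, 100.0, 100.0]
```

A factor above 1 spreads values away from the mean, a factor between 0 and 1 pulls them toward it, and a factor of 0 collapses the channel to its mean.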
##### Args:
* `images`: Images to adjust. At least 3-D.
* `contrast_factor`: A float multiplier for adjusting contrast.
* `min_value`: Minimum value for clipping the adjusted pixels.
* `max_value`: Maximum value for clipping the adjusted pixels.
##### Returns:
The contrast-adjusted image or images.
##### Raises:
* `ValueError`: if the arguments are invalid.
- - -
### `tf.image.random_contrast(image, lower, upper, seed=None)`
Adjust the contrast of an image by a random factor.
Equivalent to `adjust_contrast()` but uses a `contrast_factor` randomly
picked in the interval `[lower, upper]`.
##### Args:
* `image`: 3-D tensor of shape `[height, width, channels]`.
* `lower`: float. Lower bound for the random contrast factor.
* `upper`: float. Upper bound for the random contrast factor.
* `seed`: A Python integer. Used to create a random seed. See
[`set_random_seed`](../../api_docs/python/constant_op.md#set_random_seed)
for behavior.
##### Returns:
3-D tensor of shape `[height, width, channels]`.
##### Raises:
* `ValueError`: if `upper <= lower` or if `lower < 0`.
- - -
### `tf.image.per_image_whitening(image)`
Linearly scales `image` to have zero mean and unit norm.
This op computes `(x - mean) / adjusted_stddev`, where `mean` is the average
of all values in image, and
`adjusted_stddev = max(stddev, 1.0/sqrt(image.NumElements()))`.
`stddev` is the standard deviation of all values in `image`. It is bounded
away from zero to protect against division by 0 when handling uniform images.
Note that this implementation is limited:
* It only whitens based on the statistics of an individual image.
* It does not take into account the covariance structure.
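The whitening formula above can be checked with a short plain-Python sketch over a flat list of pixel values. It assumes the population standard deviation (dividing by the number of elements), which the source does not state explicitly; `whiten` is a hypothetical helper name.

```python
import math


def whiten(values):
    """Compute (x - mean) / max(stddev, 1 / sqrt(N)) for every value."""
    n = len(values)
    mean = sum(values) / float(n)
    variance = sum((x - mean) ** 2 for x in values) / float(n)
    # The lower bound keeps the divisor non-zero for uniform images.
    adjusted_stddev = max(math.sqrt(variance), 1.0 / math.sqrt(n))
    return [(x - mean) / adjusted_stddev for x in values]


print(whiten([5, 5, 5, 5]))  # [0.0, 0.0, 0.0, 0.0]
```

For a non-uniform image whose standard deviation exceeds the lower bound, the output has zero mean and unit standard deviation.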
##### Args:
* `image`: 3-D tensor of shape `[height, width, channels]`.
##### Returns:
The whitened image with same shape as `image`.
##### Raises:
* `ValueError`: if the shape of `image` is incompatible with this function.
================================================
FILE: SOURCE/api_docs/python/index.md
================================================
# TensorFlow Python reference documentation
* **[Building Graphs](../../api_docs/python/framework.md)**:
* [`add_to_collection`](../../api_docs/python/framework.md#add_to_collection)
* [`as_dtype`](../../api_docs/python/framework.md#as_dtype)
* [`control_dependencies`](../../api_docs/python/framework.md#control_dependencies)
* [`convert_to_tensor`](../../api_docs/python/framework.md#convert_to_tensor)
* [`device`](../../api_docs/python/framework.md#device)
* [`Dimension`](../../api_docs/python/framework.md#Dimension)
* [`DType`](../../api_docs/python/framework.md#DType)
* [`get_collection`](../../api_docs/python/framework.md#get_collection)
* [`get_default_graph`](../../api_docs/python/framework.md#get_default_graph)
* [`get_seed`](../../api_docs/python/framework.md#get_seed)
* [`Graph`](../../api_docs/python/framework.md#Graph)
* [`GraphKeys`](../../api_docs/python/framework.md#GraphKeys)
* [`import_graph_def`](../../api_docs/python/framework.md#import_graph_def)
* [`name_scope`](../../api_docs/python/framework.md#name_scope)
* [`NoGradient`](../../api_docs/python/framework.md#NoGradient)
* [`op_scope`](../../api_docs/python/framework.md#op_scope)
* [`Operation`](../../api_docs/python/framework.md#Operation)
* [`RegisterGradient`](../../api_docs/python/framework.md#RegisterGradient)
* [`RegisterShape`](../../api_docs/python/framework.md#RegisterShape)
* [`Tensor`](../../api_docs/python/framework.md#Tensor)
* [`TensorShape`](../../api_docs/python/framework.md#TensorShape)
* **[Constants, Sequences, and Random Values](../../api_docs/python/constant_op.md)**:
* [`constant`](../../api_docs/python/constant_op.md#constant)
* [`fill`](../../api_docs/python/constant_op.md#fill)
* [`linspace`](../../api_docs/python/constant_op.md#linspace)
* [`ones`](../../api_docs/python/constant_op.md#ones)
* [`ones_like`](../../api_docs/python/constant_op.md#ones_like)
* [`random_normal`](../../api_docs/python/constant_op.md#random_normal)
* [`random_shuffle`](../../api_docs/python/constant_op.md#random_shuffle)
* [`random_uniform`](../../api_docs/python/constant_op.md#random_uniform)
* [`range`](../../api_docs/python/constant_op.md#range)
* [`set_random_seed`](../../api_docs/python/constant_op.md#set_random_seed)
* [`truncated_normal`](../../api_docs/python/constant_op.md#truncated_normal)
* [`zeros`](../../api_docs/python/constant_op.md#zeros)
* [`zeros_like`](../../api_docs/python/constant_op.md#zeros_like)
* **[Variables](../../api_docs/python/state_ops.md)**:
* [`all_variables`](../../api_docs/python/state_ops.md#all_variables)
* [`assert_variables_initialized`](../../api_docs/python/state_ops.md#assert_variables_initialized)
* [`assign`](../../api_docs/python/state_ops.md#assign)
* [`assign_add`](../../api_docs/python/state_ops.md#assign_add)
* [`assign_sub`](../../api_docs/python/state_ops.md#assign_sub)
* [`constant_initializer`](../../api_docs/python/state_ops.md#constant_initializer)
* [`count_up_to`](../../api_docs/python/state_ops.md#count_up_to)
* [`device`](../../api_docs/python/state_ops.md#device)
* [`get_checkpoint_state`](../../api_docs/python/state_ops.md#get_checkpoint_state)
* [`get_variable`](../../api_docs/python/state_ops.md#get_variable)
* [`get_variable_scope`](../../api_docs/python/state_ops.md#get_variable_scope)
* [`IndexedSlices`](../../api_docs/python/state_ops.md#IndexedSlices)
* [`initialize_all_variables`](../../api_docs/python/state_ops.md#initialize_all_variables)
* [`initialize_variables`](../../api_docs/python/state_ops.md#initialize_variables)
* [`latest_checkpoint`](../../api_docs/python/state_ops.md#latest_checkpoint)
* [`random_normal_initializer`](../../api_docs/python/state_ops.md#random_normal_initializer)
* [`random_uniform_initializer`](../../api_docs/python/state_ops.md#random_uniform_initializer)
* [`Saver`](../../api_docs/python/state_ops.md#Saver)
* [`scatter_add`](../../api_docs/python/state_ops.md#scatter_add)
* [`scatter_sub`](../../api_docs/python/state_ops.md#scatter_sub)
* [`scatter_update`](../../api_docs/python/state_ops.md#scatter_update)
* [`sparse_mask`](../../api_docs/python/state_ops.md#sparse_mask)
* [`trainable_variables`](../../api_docs/python/state_ops.md#trainable_variables)
* [`truncated_normal_initializer`](../../api_docs/python/state_ops.md#truncated_normal_initializer)
* [`uniform_unit_scaling_initializer`](../../api_docs/python/state_ops.md#uniform_unit_scaling_initializer)
* [`update_checkpoint_state`](../../api_docs/python/state_ops.md#update_checkpoint_state)
* [`Variable`](../../api_docs/python/state_ops.md#Variable)
* [`variable_scope`](../../api_docs/python/state_ops.md#variable_scope)
* [`zeros_initializer`](../../api_docs/python/state_ops.md#zeros_initializer)
* **[Tensor Transformations](../../api_docs/python/array_ops.md)**:
* [`cast`](../../api_docs/python/array_ops.md#cast)
* [`concat`](../../api_docs/python/array_ops.md#concat)
* [`dynamic_partition`](../../api_docs/python/array_ops.md#dynamic_partition)
* [`dynamic_stitch`](../../api_docs/python/array_ops.md#dynamic_stitch)
* [`expand_dims`](../../api_docs/python/array_ops.md#expand_dims)
* [`gather`](../../api_docs/python/array_ops.md#gather)
* [`pack`](../../api_docs/python/array_ops.md#pack)
* [`pad`](../../api_docs/python/array_ops.md#pad)
* [`rank`](../../api_docs/python/array_ops.md#rank)
* [`reshape`](../../api_docs/python/array_ops.md#reshape)
* [`reverse`](../../api_docs/python/array_ops.md#reverse)
* [`reverse_sequence`](../../api_docs/python/array_ops.md#reverse_sequence)
* [`shape`](../../api_docs/python/array_ops.md#shape)
* [`size`](../../api_docs/python/array_ops.md#size)
* [`slice`](../../api_docs/python/array_ops.md#slice)
* [`split`](../../api_docs/python/array_ops.md#split)
* [`squeeze`](../../api_docs/python/array_ops.md#squeeze)
* [`string_to_number`](../../api_docs/python/array_ops.md#string_to_number)
* [`tile`](../../api_docs/python/array_ops.md#tile)
* [`to_bfloat16`](../../api_docs/python/array_ops.md#to_bfloat16)
* [`to_double`](../../api_docs/python/array_ops.md#to_double)
* [`to_float`](../../api_docs/python/array_ops.md#to_float)
* [`to_int32`](../../api_docs/python/array_ops.md#to_int32)
* [`to_int64`](../../api_docs/python/array_ops.md#to_int64)
* [`transpose`](../../api_docs/python/array_ops.md#transpose)
* [`unpack`](../../api_docs/python/array_ops.md#unpack)
* **[Math](../../api_docs/python/math_ops.md)**:
* [`abs`](../../api_docs/python/math_ops.md#abs)
* [`accumulate_n`](../../api_docs/python/math_ops.md#accumulate_n)
* [`add`](../../api_docs/python/math_ops.md#add)
* [`add_n`](../../api_docs/python/math_ops.md#add_n)
* [`argmax`](../../api_docs/python/math_ops.md#argmax)
* [`argmin`](../../api_docs/python/math_ops.md#argmin)
* [`batch_cholesky`](../../api_docs/python/math_ops.md#batch_cholesky)
* [`batch_matmul`](../../api_docs/python/math_ops.md#batch_matmul)
* [`batch_matrix_determinant`](../../api_docs/python/math_ops.md#batch_matrix_determinant)
* [`batch_matrix_inverse`](../../api_docs/python/math_ops.md#batch_matrix_inverse)
* [`ceil`](../../api_docs/python/math_ops.md#ceil)
* [`cholesky`](../../api_docs/python/math_ops.md#cholesky)
* [`complex`](../../api_docs/python/math_ops.md#complex)
* [`complex_abs`](../../api_docs/python/math_ops.md#complex_abs)
* [`conj`](../../api_docs/python/math_ops.md#conj)
* [`cos`](../../api_docs/python/math_ops.md#cos)
* [`diag`](../../api_docs/python/math_ops.md#diag)
* [`div`](../../api_docs/python/math_ops.md#div)
* [`edit_distance`](../../api_docs/python/math_ops.md#edit_distance)
* [`exp`](../../api_docs/python/math_ops.md#exp)
* [`floor`](../../api_docs/python/math_ops.md#floor)
* [`imag`](../../api_docs/python/math_ops.md#imag)
* [`inv`](../../api_docs/python/math_ops.md#inv)
* [`invert_permutation`](../../api_docs/python/math_ops.md#invert_permutation)
* [`listdiff`](../../api_docs/python/math_ops.md#listdiff)
* [`log`](../../api_docs/python/math_ops.md#log)
* [`matmul`](../../api_docs/python/math_ops.md#matmul)
* [`matrix_determinant`](../../api_docs/python/math_ops.md#matrix_determinant)
* [`matrix_inverse`](../../api_docs/python/math_ops.md#matrix_inverse)
* [`maximum`](../../api_docs/python/math_ops.md#maximum)
* [`minimum`](../../api_docs/python/math_ops.md#minimum)
* [`mod`](../../api_docs/python/math_ops.md#mod)
* [`mul`](../../api_docs/python/math_ops.md#mul)
* [`neg`](../../api_docs/python/math_ops.md#neg)
* [`pow`](../../api_docs/python/math_ops.md#pow)
* [`real`](../../api_docs/python/math_ops.md#real)
* [`reduce_all`](../../api_docs/python/math_ops.md#reduce_all)
* [`reduce_any`](../../api_docs/python/math_ops.md#reduce_any)
* [`reduce_max`](../../api_docs/python/math_ops.md#reduce_max)
* [`reduce_mean`](../../api_docs/python/math_ops.md#reduce_mean)
* [`reduce_min`](../../api_docs/python/math_ops.md#reduce_min)
* [`reduce_prod`](../../api_docs/python/math_ops.md#reduce_prod)
* [`reduce_sum`](../../api_docs/python/math_ops.md#reduce_sum)
* [`round`](../../api_docs/python/math_ops.md#round)
* [`rsqrt`](../../api_docs/python/math_ops.md#rsqrt)
* [`segment_max`](../../api_docs/python/math_ops.md#segment_max)
* [`segment_mean`](../../api_docs/python/math_ops.md#segment_mean)
* [`segment_min`](../../api_docs/python/math_ops.md#segment_min)
* [`segment_prod`](../../api_docs/python/math_ops.md#segment_prod)
* [`segment_sum`](../../api_docs/python/math_ops.md#segment_sum)
* [`sign`](../../api_docs/python/math_ops.md#sign)
* [`sin`](../../api_docs/python/math_ops.md#sin)
* [`sparse_segment_mean`](../../api_docs/python/math_ops.md#sparse_segment_mean)
* [`sparse_segment_sum`](../../api_docs/python/math_ops.md#sparse_segment_sum)
* [`sqrt`](../../api_docs/python/math_ops.md#sqrt)
* [`square`](../../api_docs/python/math_ops.md#square)
* [`sub`](../../api_docs/python/math_ops.md#sub)
* [`transpose`](../../api_docs/python/math_ops.md#transpose)
* [`unique`](../../api_docs/python/math_ops.md#unique)
* [`unsorted_segment_sum`](../../api_docs/python/math_ops.md#unsorted_segment_sum)
* [`where`](../../api_docs/python/math_ops.md#where)
* **[Control Flow](../../api_docs/python/control_flow_ops.md)**:
* [`add_check_numerics_ops`](../../api_docs/python/control_flow_ops.md#add_check_numerics_ops)
* [`Assert`](../../api_docs/python/control_flow_ops.md#Assert)
* [`check_numerics`](../../api_docs/python/control_flow_ops.md#check_numerics)
* [`count_up_to`](../../api_docs/python/control_flow_ops.md#count_up_to)
* [`equal`](../../api_docs/python/control_flow_ops.md#equal)
* [`greater`](../../api_docs/python/control_flow_ops.md#greater)
* [`greater_equal`](../../api_docs/python/control_flow_ops.md#greater_equal)
* [`group`](../../api_docs/python/control_flow_ops.md#group)
* [`identity`](../../api_docs/python/control_flow_ops.md#identity)
* [`is_finite`](../../api_docs/python/control_flow_ops.md#is_finite)
* [`is_inf`](../../api_docs/python/control_flow_ops.md#is_inf)
* [`is_nan`](../../api_docs/python/control_flow_ops.md#is_nan)
* [`less`](../../api_docs/python/control_flow_ops.md#less)
* [`less_equal`](../../api_docs/python/control_flow_ops.md#less_equal)
* [`logical_and`](../../api_docs/python/control_flow_ops.md#logical_and)
* [`logical_not`](../../api_docs/python/control_flow_ops.md#logical_not)
* [`logical_or`](../../api_docs/python/control_flow_ops.md#logical_or)
* [`logical_xor`](../../api_docs/python/control_flow_ops.md#logical_xor)
* [`no_op`](../../api_docs/python/control_flow_ops.md#no_op)
* [`not_equal`](../../api_docs/python/control_flow_ops.md#not_equal)
* [`Print`](../../api_docs/python/control_flow_ops.md#Print)
* [`select`](../../api_docs/python/control_flow_ops.md#select)
* [`tuple`](../../api_docs/python/control_flow_ops.md#tuple)
* [`verify_tensor_all_finite`](../../api_docs/python/control_flow_ops.md#verify_tensor_all_finite)
* [`where`](../../api_docs/python/control_flow_ops.md#where)
* **[Images](../../api_docs/python/image.md)**:
* [`adjust_brightness`](../../api_docs/python/image.md#adjust_brightness)
* [`adjust_contrast`](../../api_docs/python/image.md#adjust_contrast)
* [`crop_to_bounding_box`](../../api_docs/python/image.md#crop_to_bounding_box)
* [`decode_jpeg`](../../api_docs/python/image.md#decode_jpeg)
* [`decode_png`](../../api_docs/python/image.md#decode_png)
* [`encode_jpeg`](../../api_docs/python/image.md#encode_jpeg)
* [`encode_png`](../../api_docs/python/image.md#encode_png)
* [`extract_glimpse`](../../api_docs/python/image.md#extract_glimpse)
* [`flip_left_right`](../../api_docs/python/image.md#flip_left_right)
* [`flip_up_down`](../../api_docs/python/image.md#flip_up_down)
* [`pad_to_bounding_box`](../../api_docs/python/image.md#pad_to_bounding_box)
* [`per_image_whitening`](../../api_docs/python/image.md#per_image_whitening)
* [`random_brightness`](../../api_docs/python/image.md#random_brightness)
* [`random_contrast`](../../api_docs/python/image.md#random_contrast)
* [`random_crop`](../../api_docs/python/image.md#random_crop)
* [`random_flip_left_right`](../../api_docs/python/image.md#random_flip_left_right)
* [`random_flip_up_down`](../../api_docs/python/image.md#random_flip_up_down)
* [`resize_area`](../../api_docs/python/image.md#resize_area)
* [`resize_bicubic`](../../api_docs/python/image.md#resize_bicubic)
* [`resize_bilinear`](../../api_docs/python/image.md#resize_bilinear)
* [`resize_image_with_crop_or_pad`](../../api_docs/python/image.md#resize_image_with_crop_or_pad)
* [`resize_images`](../../api_docs/python/image.md#resize_images)
* [`resize_nearest_neighbor`](../../api_docs/python/image.md#resize_nearest_neighbor)
* [`transpose_image`](../../api_docs/python/image.md#transpose_image)
* **[Sparse Tensors](../../api_docs/python/sparse_ops.md)**:
* [`shape`](../../api_docs/python/sparse_ops.md#shape)
* [`sparse_concat`](../../api_docs/python/sparse_ops.md#sparse_concat)
* [`sparse_fill_empty_rows`](../../api_docs/python/sparse_ops.md#sparse_fill_empty_rows)
* [`sparse_reorder`](../../api_docs/python/sparse_ops.md#sparse_reorder)
* [`sparse_retain`](../../api_docs/python/sparse_ops.md#sparse_retain)
* [`sparse_tensor_to_dense`](../../api_docs/python/sparse_ops.md#sparse_tensor_to_dense)
* [`sparse_to_dense`](../../api_docs/python/sparse_ops.md#sparse_to_dense)
* [`sparse_to_indicator`](../../api_docs/python/sparse_ops.md#sparse_to_indicator)
* [`SparseTensor`](../../api_docs/python/sparse_ops.md#SparseTensor)
* [`SparseTensorValue`](../../api_docs/python/sparse_ops.md#SparseTensorValue)
* **[Inputs and Readers](../../api_docs/python/io_ops.md)**:
* [`batch`](../../api_docs/python/io_ops.md#batch)
* [`batch_join`](../../api_docs/python/io_ops.md#batch_join)
* [`decode_csv`](../../api_docs/python/io_ops.md#decode_csv)
* [`decode_raw`](../../api_docs/python/io_ops.md#decode_raw)
* [`FIFOQueue`](../../api_docs/python/io_ops.md#FIFOQueue)
* [`FixedLengthRecordReader`](../../api_docs/python/io_ops.md#FixedLengthRecordReader)
* [`IdentityReader`](../../api_docs/python/io_ops.md#IdentityReader)
* [`limit_epochs`](../../api_docs/python/io_ops.md#limit_epochs)
* [`match_filenames_once`](../../api_docs/python/io_ops.md#match_filenames_once)
* [`matching_files`](../../api_docs/python/io_ops.md#matching_files)
* [`parse_example`](../../api_docs/python/io_ops.md#parse_example)
* [`parse_single_example`](../../api_docs/python/io_ops.md#parse_single_example)
* [`placeholder`](../../api_docs/python/io_ops.md#placeholder)
* [`QueueBase`](../../api_docs/python/io_ops.md#QueueBase)
* [`RandomShuffleQueue`](../../api_docs/python/io_ops.md#RandomShuffleQueue)
* [`range_input_producer`](../../api_docs/python/io_ops.md#range_input_producer)
* [`read_file`](../../api_docs/python/io_ops.md#read_file)
* [`ReaderBase`](../../api_docs/python/io_ops.md#ReaderBase)
* [`shuffle_batch`](../../api_docs/python/io_ops.md#shuffle_batch)
* [`shuffle_batch_join`](../../api_docs/python/io_ops.md#shuffle_batch_join)
* [`size`](../../api_docs/python/io_ops.md#size)
* [`slice_input_producer`](../../api_docs/python/io_ops.md#slice_input_producer)
* [`string_input_producer`](../../api_docs/python/io_ops.md#string_input_producer)
* [`TextLineReader`](../../api_docs/python/io_ops.md#TextLineReader)
* [`TFRecordReader`](../../api_docs/python/io_ops.md#TFRecordReader)
* [`WholeFileReader`](../../api_docs/python/io_ops.md#WholeFileReader)
* **[Data IO (Python functions)](../../api_docs/python/python_io.md)**:
* [`tf_record_iterator`](../../api_docs/python/python_io.md#tf_record_iterator)
* [`TFRecordWriter`](../../api_docs/python/python_io.md#TFRecordWriter)
* **[Neural Network](../../api_docs/python/nn.md)**:
* [`avg_pool`](../../api_docs/python/nn.md#avg_pool)
* [`bias_add`](../../api_docs/python/nn.md#bias_add)
* [`compute_accidental_hits`](../../api_docs/python/nn.md#compute_accidental_hits)
* [`conv2d`](../../api_docs/python/nn.md#conv2d)
* [`depthwise_conv2d`](../../api_docs/python/nn.md#depthwise_conv2d)
* [`dropout`](../../api_docs/python/nn.md#dropout)
* [`embedding_lookup`](../../api_docs/python/nn.md#embedding_lookup)
* [`fixed_unigram_candidate_sampler`](../../api_docs/python/nn.md#fixed_unigram_candidate_sampler)
* [`in_top_k`](../../api_docs/python/nn.md#in_top_k)
* [`l2_loss`](../../api_docs/python/nn.md#l2_loss)
* [`l2_normalize`](../../api_docs/python/nn.md#l2_normalize)
* [`learned_unigram_candidate_sampler`](../../api_docs/python/nn.md#learned_unigram_candidate_sampler)
* [`local_response_normalization`](../../api_docs/python/nn.md#local_response_normalization)
* [`log_uniform_candidate_sampler`](../../api_docs/python/nn.md#log_uniform_candidate_sampler)
* [`max_pool`](../../api_docs/python/nn.md#max_pool)
* [`max_pool_with_argmax`](../../api_docs/python/nn.md#max_pool_with_argmax)
* [`moments`](../../api_docs/python/nn.md#moments)
* [`nce_loss`](../../api_docs/python/nn.md#nce_loss)
* [`relu`](../../api_docs/python/nn.md#relu)
* [`relu6`](../../api_docs/python/nn.md#relu6)
* [`sampled_softmax_loss`](../../api_docs/python/nn.md#sampled_softmax_loss)
* [`separable_conv2d`](../../api_docs/python/nn.md#separable_conv2d)
* [`sigmoid`](../../api_docs/python/nn.md#sigmoid)
* [`sigmoid_cross_entropy_with_logits`](../../api_docs/python/nn.md#sigmoid_cross_entropy_with_logits)
* [`softmax`](../../api_docs/python/nn.md#softmax)
* [`softmax_cross_entropy_with_logits`](../../api_docs/python/nn.md#softmax_cross_entropy_with_logits)
* [`softplus`](../../api_docs/python/nn.md#softplus)
* [`tanh`](../../api_docs/python/nn.md#tanh)
* [`top_k`](../../api_docs/python/nn.md#top_k)
* [`uniform_candidate_sampler`](../../api_docs/python/nn.md#uniform_candidate_sampler)
* **[Running Graphs](../../api_docs/python/client.md)**:
* [`AbortedError`](../../api_docs/python/client.md#AbortedError)
* [`AlreadyExistsError`](../../api_docs/python/client.md#AlreadyExistsError)
* [`CancelledError`](../../api_docs/python/client.md#CancelledError)
* [`DataLossError`](../../api_docs/python/client.md#DataLossError)
* [`DeadlineExceededError`](../../api_docs/python/client.md#DeadlineExceededError)
* [`FailedPreconditionError`](../../api_docs/python/client.md#FailedPreconditionError)
* [`get_default_session`](../../api_docs/python/client.md#get_default_session)
* [`InteractiveSession`](../../api_docs/python/client.md#InteractiveSession)
* [`InternalError`](../../api_docs/python/client.md#InternalError)
* [`InvalidArgumentError`](../../api_docs/python/client.md#InvalidArgumentError)
* [`NotFoundError`](../../api_docs/python/client.md#NotFoundError)
* [`OpError`](../../api_docs/python/client.md#OpError)
* [`OutOfRangeError`](../../api_docs/python/client.md#OutOfRangeError)
* [`PermissionDeniedError`](../../api_docs/python/client.md#PermissionDeniedError)
* [`ResourceExhaustedError`](../../api_docs/python/client.md#ResourceExhaustedError)
* [`Session`](../../api_docs/python/client.md#Session)
* [`UnauthenticatedError`](../../api_docs/python/client.md#UnauthenticatedError)
* [`UnavailableError`](../../api_docs/python/client.md#UnavailableError)
* [`UnimplementedError`](../../api_docs/python/client.md#UnimplementedError)
* [`UnknownError`](../../api_docs/python/client.md#UnknownError)
* **[Training](../../api_docs/python/train.md)**:
* [`AdagradOptimizer`](../../api_docs/python/train.md#AdagradOptimizer)
* [`AdamOptimizer`](../../api_docs/python/train.md#AdamOptimizer)
* [`add_queue_runner`](../../api_docs/python/train.md#add_queue_runner)
* [`AggregationMethod`](../../api_docs/python/train.md#AggregationMethod)
* [`clip_by_average_norm`](../../api_docs/python/train.md#clip_by_average_norm)
* [`clip_by_global_norm`](../../api_docs/python/train.md#clip_by_global_norm)
* [`clip_by_norm`](../../api_docs/python/train.md#clip_by_norm)
* [`clip_by_value`](../../api_docs/python/train.md#clip_by_value)
* [`Coordinator`](../../api_docs/python/train.md#Coordinator)
* [`exponential_decay`](../../api_docs/python/train.md#exponential_decay)
* [`ExponentialMovingAverage`](../../api_docs/python/train.md#ExponentialMovingAverage)
* [`FtrlOptimizer`](../../api_docs/python/train.md#FtrlOptimizer)
* [`global_norm`](../../api_docs/python/train.md#global_norm)
* [`global_step`](../../api_docs/python/train.md#global_step)
* [`GradientDescentOptimizer`](../../api_docs/python/train.md#GradientDescentOptimizer)
* [`gradients`](../../api_docs/python/train.md#gradients)
* [`histogram_summary`](../../api_docs/python/train.md#histogram_summary)
* [`image_summary`](../../api_docs/python/train.md#image_summary)
* [`merge_all_summaries`](../../api_docs/python/train.md#merge_all_summaries)
* [`merge_summary`](../../api_docs/python/train.md#merge_summary)
* [`MomentumOptimizer`](../../api_docs/python/train.md#MomentumOptimizer)
* [`Optimizer`](../../api_docs/python/train.md#Optimizer)
* [`QueueRunner`](../../api_docs/python/train.md#QueueRunner)
* [`RMSPropOptimizer`](../../api_docs/python/train.md#RMSPropOptimizer)
* [`scalar_summary`](../../api_docs/python/train.md#scalar_summary)
* [`start_queue_runners`](../../api_docs/python/train.md#start_queue_runners)
* [`stop_gradient`](../../api_docs/python/train.md#stop_gradient)
* [`summary_iterator`](../../api_docs/python/train.md#summary_iterator)
* [`SummaryWriter`](../../api_docs/python/train.md#SummaryWriter)
* [`write_graph`](../../api_docs/python/train.md#write_graph)
* [`zero_fraction`](../../api_docs/python/train.md#zero_fraction)
================================================
FILE: SOURCE/api_docs/python/io_ops.md
================================================
# Inputs and Readers
Note: Functions taking `Tensor` arguments can also take anything accepted by
[`tf.convert_to_tensor`](../../api_docs/python/framework.md#convert_to_tensor).
## Contents
### [Inputs and Readers](#AUTOGENERATED-inputs-and-readers)
* [Placeholders](#AUTOGENERATED-placeholders)
* [`tf.placeholder(dtype, shape=None, name=None)`](#placeholder)
* [Readers](#AUTOGENERATED-readers)
* [`class tf.ReaderBase`](#ReaderBase)
* [`class tf.TextLineReader`](#TextLineReader)
* [`class tf.WholeFileReader`](#WholeFileReader)
* [`class tf.IdentityReader`](#IdentityReader)
* [`class tf.TFRecordReader`](#TFRecordReader)
* [`class tf.FixedLengthRecordReader`](#FixedLengthRecordReader)
* [Converting](#AUTOGENERATED-converting)
* [`tf.decode_csv(records, record_defaults, field_delim=None, name=None)`](#decode_csv)
* [`tf.decode_raw(bytes, out_type, little_endian=None, name=None)`](#decode_raw)
* [Example protocol buffer](#AUTOGENERATED-example-protocol-buffer)
* [`tf.parse_example(serialized, names=None, sparse_keys=None, sparse_types=None, dense_keys=None, dense_types=None, dense_defaults=None, dense_shapes=None, name='ParseExample')`](#parse_example)
* [`tf.parse_single_example(serialized, names=None, sparse_keys=None, sparse_types=None, dense_keys=None, dense_types=None, dense_defaults=None, dense_shapes=None, name='ParseSingleExample')`](#parse_single_example)
* [Queues](#AUTOGENERATED-queues)
* [`class tf.QueueBase`](#QueueBase)
* [`class tf.FIFOQueue`](#FIFOQueue)
* [`class tf.RandomShuffleQueue`](#RandomShuffleQueue)
* [Dealing with the filesystem](#AUTOGENERATED-dealing-with-the-filesystem)
* [`tf.matching_files(pattern, name=None)`](#matching_files)
* [`tf.read_file(filename, name=None)`](#read_file)
* [Input pipeline](#AUTOGENERATED-input-pipeline)
* [Beginning of an input pipeline](#AUTOGENERATED-beginning-of-an-input-pipeline)
* [`tf.train.match_filenames_once(pattern, name=None)`](#match_filenames_once)
* [`tf.train.limit_epochs(tensor, num_epochs=None, name=None)`](#limit_epochs)
* [`tf.train.range_input_producer(limit, num_epochs=None, shuffle=True, seed=None, capacity=32, name=None)`](#range_input_producer)
* [`tf.train.slice_input_producer(tensor_list, num_epochs=None, shuffle=True, seed=None, capacity=32, name=None)`](#slice_input_producer)
* [`tf.train.string_input_producer(string_tensor, num_epochs=None, shuffle=True, seed=None, capacity=32, name=None)`](#string_input_producer)
* [Batching at the end of an input pipeline](#AUTOGENERATED-batching-at-the-end-of-an-input-pipeline)
* [`tf.train.batch(tensor_list, batch_size, num_threads=1, capacity=32, enqueue_many=False, shapes=None, name=None)`](#batch)
* [`tf.train.batch_join(tensor_list_list, batch_size, capacity=32, enqueue_many=False, shapes=None, name=None)`](#batch_join)
* [`tf.train.shuffle_batch(tensor_list, batch_size, capacity, min_after_dequeue, num_threads=1, seed=None, enqueue_many=False, shapes=None, name=None)`](#shuffle_batch)
* [`tf.train.shuffle_batch_join(tensor_list_list, batch_size, capacity, min_after_dequeue, seed=None, enqueue_many=False, shapes=None, name=None)`](#shuffle_batch_join)
## Placeholders
TensorFlow provides a placeholder operation that must be fed with data
on execution. For more info, see the section on [Feeding
data](../../how_tos/reading_data/index.md#feeding).
- - -
### `tf.placeholder(dtype, shape=None, name=None)`
Inserts a placeholder for a tensor that will always be fed.
**Important**: This tensor will produce an error if evaluated. Its value must
be fed using the `feed_dict` optional argument to `Session.run()`,
`Tensor.eval()`, or `Operation.run()`.
For example:
```python
import numpy as np
import tensorflow as tf

x = tf.placeholder(tf.float32, shape=(1024, 1024))
y = tf.matmul(x, x)

with tf.Session() as sess:
    print(sess.run(y))  # ERROR: will fail because x was not fed.

    rand_array = np.random.rand(1024, 1024)
    print(sess.run(y, feed_dict={x: rand_array}))  # Will succeed.
```
##### Args:
* `dtype`: The type of elements in the tensor to be fed.
* `shape`: The shape of the tensor to be fed (optional). If the shape is not
specified, you can feed a tensor of any shape.
* `name`: A name for the operation (optional).
##### Returns:
A `Tensor` that may be used as a handle for feeding a value, but not
evaluated directly.
## Readers
TensorFlow provides a set of Reader classes for reading data formats.
For more information on inputs and readers, see [Reading
data](../../how_tos/reading_data/index.md).
- - -
### `class tf.ReaderBase`
Base class for different Reader types that produce a record every step.
Conceptually, Readers convert string 'work units' into records (key,
value pairs). Typically the 'work units' are filenames and the
records are extracted from the contents of those files. We want a
single record produced per step, but a work unit can correspond to
many records.
Therefore we introduce some decoupling using a queue. The queue
contains the work units, and the Reader dequeues a new work unit only
when it is asked to produce a record (via `read()`) and has already
finished the last work unit.
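
The decoupling can be pictured with a small plain-Python sketch. Everything here is illustrative and not part of TensorFlow: `SketchReader` is a hypothetical class, a `deque` stands in for the queue of work units, a dictionary stands in for the filesystem, and the `name:index` key format is an assumption.

```python
from collections import deque


class SketchReader(object):
    """Toy reader: turns string work units into (key, value) records."""

    def __init__(self, contents_by_name):
        # Hypothetical stand-in for files: work unit name -> list of records.
        self._contents = contents_by_name
        self._current = None
        self._pending = deque()
        self.num_records_produced = 0
        self.num_work_units_completed = 0

    def read(self, queue):
        """Return the next record, dequeuing a work unit only if needed."""
        while not self._pending:
            # No records left (or nothing started yet): take a new work unit.
            self._current = queue.popleft()
            records = self._contents[self._current]
            self._pending.extend(enumerate(records))
            if not records:
                self.num_work_units_completed += 1  # empty work unit
        index, value = self._pending.popleft()
        self.num_records_produced += 1
        if not self._pending:
            self.num_work_units_completed += 1
        return "%s:%d" % (self._current, index), value


work_queue = deque(["a.txt", "b.txt"])
reader = SketchReader({"a.txt": ["a0", "a1"], "b.txt": ["b0"]})
records = [reader.read(work_queue) for _ in range(3)]
```

Three calls to `read()` yield one record each, although only two work units were queued; the second work unit is dequeued only once the first is exhausted.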
- - -
#### `tf.ReaderBase.__init__(reader_ref, supports_serialize=False)`
Creates a new ReaderBase.
##### Args:
* `reader_ref`: The operation that implements the reader.
* `supports_serialize`: True if the reader implementation can
serialize its state.
- - -
#### `tf.ReaderBase.num_records_produced(name=None)`
Returns the number of records this reader has produced.
This is the same as the number of Read executions that have
succeeded.
##### Args:
* `name`: A name for the operation (optional).
##### Returns:
An int64 Tensor.
- - -
#### `tf.ReaderBase.num_work_units_completed(name=None)`
Returns the number of work units this reader has finished processing.
##### Args:
* `name`: A name for the operation (optional).
##### Returns:
An int64 Tensor.
- - -
#### `tf.ReaderBase.read(queue, name=None)`
Returns the next record (key, value pair) produced by a reader.
Will dequeue a work unit from queue if necessary (e.g. when the
Reader needs to start reading from a new file since it has
finished with the previous file).
##### Args:
* `queue`: A Queue or a mutable string Tensor representing a handle
to a Queue, with string work items.
* `name`: A name for the operation (optional).
##### Returns:
A tuple of Tensors (key, value).
* `key`: A string scalar Tensor.
* `value`: A string scalar Tensor.
- - -
#### `tf.ReaderBase.reader_ref`
Op that implements the reader.
- - -
#### `tf.ReaderBase.reset(name=None)`
Restore a reader to its initial clean state.
##### Args:
* `name`: A name for the operation (optional).
##### Returns:
The created Operation.
- - -
#### `tf.ReaderBase.restore_state(state, name=None)`
Restore a reader to a previously saved state.
Not all Readers support being restored, so this can produce an
Unimplemented error.
##### Args:
* `state`: A string Tensor.
Result of a SerializeState of a Reader with matching type.
* `name`: A name for the operation (optional).
##### Returns:
The created Operation.
- - -
#### `tf.ReaderBase.serialize_state(name=None)`
Produce a string tensor that encodes the state of a reader.
Not all Readers support being serialized, so this can produce an
Unimplemented error.
##### Args:
* `name`: A name for the operation (optional).
##### Returns:
A string Tensor.
- - -
#### `tf.ReaderBase.supports_serialize`
Whether the Reader implementation can serialize its state.
- - -
### `class tf.TextLineReader`
A Reader that outputs the lines of a file delimited by newlines.
Newlines are stripped from the output.
See ReaderBase for supported methods.
- - -
#### `tf.TextLineReader.__init__(skip_header_lines=None, name=None)`
Create a TextLineReader.
##### Args:
* `skip_header_lines`: An optional int. Defaults to 0. Number of lines
to skip from the beginning of every file.
* `name`: A name for the operation (optional).
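
The effect of `skip_header_lines` can be sketched in plain Python. This is only an illustration of the documented behavior (newlines stripped, leading lines skipped per file): `read_lines` is a hypothetical helper operating on an in-memory string, and its newline handling via `str.splitlines()` is an assumption, not a statement about the real reader.

```python
def read_lines(file_text, skip_header_lines=0):
    """Yield the lines of file_text without newlines, skipping header lines."""
    for line_number, line in enumerate(file_text.splitlines()):
        if line_number < skip_header_lines:
            continue  # header line: produce no record for it
        yield line


csv_text = "name,score\nada,90\nlin,85\n"
rows = list(read_lines(csv_text, skip_header_lines=1))
```

With one header line skipped, `rows` holds only the two data lines, which could then be passed to a decoder such as `tf.decode_csv`.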
- - -
#### `tf.TextLineReader.num_records_produced(name=None)`
Returns the number of records this reader has produced.
This is the same as the number of Read executions that have
succeeded.
##### Args:
* `name`: A name for the operation (optional).
##### Returns:
An int64 Tensor.
- - -
#### `tf.TextLineReader.num_work_units_completed(name=None)`
Returns the number of work units this reader has finished processing.
##### Args:
* `name`: A name for the operation (optional).
##### Returns:
An int64 Tensor.
- - -
#### `tf.TextLineReader.read(queue, name=None)`
Returns the next record (key, value pair) produced by a reader.
Will dequeue a work unit from queue if necessary (e.g. when the
Reader needs to start reading from a new file since it has
finished with the previous file).
##### Args:
* `queue`: A Queue or a mutable string Tensor representing a handle
to a Queue, with string work items.
* `name`: A name for the operation (optional).
##### Returns:
A tuple of Tensors (key, value).
* `key`: A string scalar Tensor.
* `value`: A string scalar Tensor.
- - -
#### `tf.TextLineReader.reader_ref`
Op that implements the reader.
- - -
#### `tf.TextLineReader.reset(name=None)`
Restore a reader to its initial clean state.
##### Args:
* `name`: A name for the operation (optional).
##### Returns:
The created Operation.
- - -
#### `tf.TextLineReader.restore_state(state, name=None)`
Restore a reader to a previously saved state.
Not all Readers support being restored, so this can produce an
Unimplemented error.
##### Args:
* `state`: A string Tensor.
Result of a SerializeState of a Reader with matching type.
* `name`: A name for the operation (optional).
##### Returns:
The created Operation.
- - -
#### `tf.TextLineReader.serialize_state(name=None)`
Produce a string tensor that encodes the state of a reader.
Not all Readers support being serialized, so this can produce an
Unimplemented error.
##### Args:
* `name`: A name for the operation (optional).
##### Returns:
A string Tensor.
- - -
#### `tf.TextLineReader.supports_serialize`
Whether the Reader implementation can serialize its state.
- - -
### `class tf.WholeFileReader`
A Reader that outputs the entire contents of a file as a value.
To use, enqueue filenames in a Queue. The output of Read will
be a filename (key) and the contents of that file (value).
See ReaderBase for supported methods.
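The key/value contract described above can be modelled in plain Python. This is only an illustrative sketch, not TensorFlow code: the helper `whole_file_read` and the use of a `collections.deque` as the filename queue are assumptions made for the example.

```python
import collections
import os
import tempfile


def whole_file_read(filename_queue):
    """Hypothetical stand-in for Read: returns (filename, file contents)."""
    filename = filename_queue.popleft()  # take the next work unit
    with open(filename, "rb") as f:
        return filename, f.read()


# Write a small file, enqueue its name, and read it back as one record.
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "a.txt")
with open(path, "wb") as f:
    f.write(b"hello\nworld\n")

queue = collections.deque([path])
key, value = whole_file_read(queue)
```

Here `key` is the filename and `value` is the whole file, newlines included, which is the difference from a line-oriented reader.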
- - -
#### `tf.WholeFileReader.__init__(name=None)`
Create a WholeFileReader.
##### Args:
* `name`: A name for the operation (optional).
- - -
#### `tf.WholeFileReader.num_records_produced(name=None)`
Returns the number of records this reader has produced.
This is the same as the number of Read executions that have
succeeded.
##### Args:
* `name`: A name for the operation (optional).
##### Returns:
An int64 Tensor.
- - -
#### `tf.WholeFileReader.num_work_units_completed(name=None)`
Returns the number of work units this reader has finished processing.
##### Args:
* `name`: A name for the operation (optional).
##### Returns:
An int64 Tensor.
- - -
#### `tf.WholeFileReader.read(queue, name=None)`
Returns the next record (key, value pair) produced by a reader.
Will dequeue a work unit from queue if necessary (e.g. when the
Reader needs to start reading from a new file since it has
finished with the previous file).
##### Args:
* `queue`: A Queue or a mutable string Tensor representing a handle
to a Queue, with string work items.
* `name`: A name for the operation (optional).
##### Returns:
A tuple of Tensors (key, value).
* `key`: A string scalar Tensor.
* `value`: A string scalar Tensor.
- - -
#### `tf.WholeFileReader.reader_ref`
Op that implements the reader.
- - -
#### `tf.WholeFileReader.reset(name=None)`
Restore a reader to its initial clean state.
##### Args:
* `name`: A name for the operation (optional).
##### Returns:
The created Operation.
- - -
#### `tf.WholeFileReader.restore_state(state, name=None)`
Restore a reader to a previously saved state.
Not all Readers support being restored, so this can produce an
Unimplemented error.
##### Args:
* `state`: A string Tensor.
Result of a SerializeState of a Reader with matching type.
* `name`: A name for the operation (optional).
##### Returns:
The created Operation.
- - -
#### `tf.WholeFileReader.serialize_state(name=None)`
Produce a string tensor that encodes the state of a reader.
Not all Readers support being serialized, so this can produce an
Unimplemented error.
##### Args:
* `name`: A name for the operation (optional).
##### Returns:
A string Tensor.
- - -
#### `tf.WholeFileReader.supports_serialize`
Whether the Reader implementation can serialize its state.
- - -
### `class tf.IdentityReader`
A Reader that outputs the queued work as both the key and value.
To use, enqueue strings in a Queue. Read will take the front
work string and output (work, work).
See ReaderBase for supported methods.
- - -
#### `tf.IdentityReader.__init__(name=None)`
Create an IdentityReader.
##### Args:
* `name`: A name for the operation (optional).
- - -
#### `tf.IdentityReader.num_records_produced(name=None)`
Returns the number of records this reader has produced.
This is the same as the number of Read executions that have
succeeded.
##### Args:
* `name`: A name for the operation (optional).
##### Returns:
An int64 Tensor.
- - -
#### `tf.IdentityReader.num_work_units_completed(name=None)`
Returns the number of work units this reader has finished processing.
##### Args:
* `name`: A name for the operation (optional).
##### Returns:
An int64 Tensor.
- - -
#### `tf.IdentityReader.read(queue, name=None)`
Returns the next record (key, value pair) produced by a reader.
Will dequeue a work unit from queue if necessary (e.g. when the
Reader needs to start reading from a new file since it has
finished with the previous file).
##### Args:
* `queue`: A Queue or a mutable string Tensor representing a handle
to a Queue, with string work items.
* `name`: A name for the operation (optional).
##### Returns:
A tuple of Tensors (key, value).
* `key`: A string scalar Tensor.
* `value`: A string scalar Tensor.
- - -
#### `tf.IdentityReader.reader_ref`
Op that implements the reader.
- - -
#### `tf.IdentityReader.reset(name=None)`
Restore a reader to its initial clean state.
##### Args:
* `name`: A name for the operation (optional).
##### Returns:
The created Operation.
- - -
#### `tf.IdentityReader.restore_state(state, name=None)`
Restore a reader to a previously saved state.
Not all Readers support being restored, so this can produce an
Unimplemented error.
##### Args:
* `state`: A string Tensor.
Result of a SerializeState of a Reader with matching type.
* `name`: A name for the operation (optional).
##### Returns:
The created Operation.
- - -
#### `tf.IdentityReader.serialize_state(name=None)`
Produce a string tensor that encodes the state of a reader.
Not all Readers support being serialized, so this can produce an
Unimplemented error.
##### Args:
* `name`: A name for the operation (optional).
##### Returns:
A string Tensor.
- - -
#### `tf.IdentityReader.supports_serialize`
Whether the Reader implementation can serialize its state.
- - -
### `class tf.TFRecordReader`
A Reader that outputs the records from a TFRecords file.
See ReaderBase for supported methods.
- - -
#### `tf.TFRecordReader.__init__(name=None)`
Create a TFRecordReader.
##### Args:
* `name`: A name for the operation (optional).
- - -
#### `tf.TFRecordReader.num_records_produced(name=None)`
Returns the number of records this reader has produced.
This is the same as the number of Read executions that have
succeeded.
##### Args:
* `name`: A name for the operation (optional).
##### Returns:
An int64 Tensor.
- - -
#### `tf.TFRecordReader.num_work_units_completed(name=None)`
Returns the number of work units this reader has finished processing.
##### Args:
* `name`: A name for the operation (optional).
##### Returns:
An int64 Tensor.
- - -
#### `tf.TFRecordReader.read(queue, name=None)`
Returns the next record (key, value pair) produced by a reader.
Will dequeue a work unit from queue if necessary (e.g. when the
Reader needs to start reading from a new file since it has
finished with the previous file).
##### Args:
* `queue`: A Queue or a mutable string Tensor representing a handle
to a Queue, with string work items.
* `name`: A name for the operation (optional).
##### Returns:
A tuple of Tensors (key, value).
* `key`: A string scalar Tensor.
* `value`: A string scalar Tensor.
- - -
#### `tf.TFRecordReader.reader_ref`
Op that implements the reader.
- - -
#### `tf.TFRecordReader.reset(name=None)`
Restore a reader to its initial clean state.
##### Args:
* `name`: A name for the operation (optional).
##### Returns:
The created Operation.
- - -
#### `tf.TFRecordReader.restore_state(state, name=None)`
Restore a reader to a previously saved state.
Not all Readers support being restored, so this can produce an
Unimplemented error.
##### Args:
* `state`: A string Tensor.
Result of a SerializeState of a Reader with matching type.
* `name`: A name for the operation (optional).
##### Returns:
The created Operation.
- - -
#### `tf.TFRecordReader.serialize_state(name=None)`
Produce a string tensor that encodes the state of a reader.
Not all Readers support being serialized, so this can produce an
Unimplemented error.
##### Args:
* `name`: A name for the operation (optional).
##### Returns:
A string Tensor.
- - -
#### `tf.TFRecordReader.supports_serialize`
Whether the Reader implementation can serialize its state.
- - -
### `class tf.FixedLengthRecordReader`
A Reader that outputs fixed-length records from a file.
See ReaderBase for supported methods.
- - -
#### `tf.FixedLengthRecordReader.__init__(record_bytes, header_bytes=None, footer_bytes=None, name=None)`
Create a FixedLengthRecordReader.
##### Args:
* `record_bytes`: An int.
* `header_bytes`: An optional int. Defaults to 0.
* `footer_bytes`: An optional int. Defaults to 0.
* `name`: A name for the operation (optional).
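As a rough illustration of how the three byte counts relate, the sketch below splits a byte string into fixed-length records in plain Python. It assumes that `header_bytes` are skipped at the start of the file and `footer_bytes` are ignored at the end; the helper name `split_fixed_length_records` is hypothetical and this is not the reader's actual implementation.

```python
def split_fixed_length_records(data, record_bytes, header_bytes=0, footer_bytes=0):
    """Split raw file contents into fixed-length records (illustrative model)."""
    # Assumption: the header is skipped and the footer is ignored.
    body = data[header_bytes:len(data) - footer_bytes]
    if len(body) % record_bytes != 0:
        raise ValueError("body length is not a multiple of record_bytes")
    return [body[i:i + record_bytes] for i in range(0, len(body), record_bytes)]


# A 3-byte header, three 4-byte records, and a 2-byte footer.
data = b"HDR" + b"aaaa" + b"bbbb" + b"cccc" + b"FT"
records = split_fixed_length_records(
    data, record_bytes=4, header_bytes=3, footer_bytes=2)
```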
- - -
#### `tf.FixedLengthRecordReader.num_records_produced(name=None)`
Returns the number of records this reader has produced.
This is the same as the number of Read executions that have
succeeded.
##### Args:
* `name`: A name for the operation (optional).
##### Returns:
An int64 Tensor.
- - -
#### `tf.FixedLengthRecordReader.num_work_units_completed(name=None)`
Returns the number of work units this reader has finished processing.
##### Args:
* `name`: A name for the operation (optional).
##### Returns:
An int64 Tensor.
- - -
#### `tf.FixedLengthRecordReader.read(queue, name=None)`
Returns the next record (key, value pair) produced by a reader.
Will dequeue a work unit from queue if necessary (e.g. when the
Reader needs to start reading from a new file since it has
finished with the previous file).
##### Args:
* `queue`: A Queue or a mutable string Tensor representing a handle
to a Queue, with string work items.
* `name`: A name for the operation (optional).
##### Returns:
A tuple of Tensors (key, value).
* `key`: A string scalar Tensor.
* `value`: A string scalar Tensor.
- - -
#### `tf.FixedLengthRecordReader.reader_ref`
Op that implements the reader.
- - -
#### `tf.FixedLengthRecordReader.reset(name=None)`
Restore a reader to its initial clean state.
##### Args:
* `name`: A name for the operation (optional).
##### Returns:
The created Operation.
- - -
#### `tf.FixedLengthRecordReader.restore_state(state, name=None)`
Restore a reader to a previously saved state.
Not all Readers support being restored, so this can produce an
Unimplemented error.
##### Args:
* `state`: A string Tensor.
Result of a SerializeState of a Reader with matching type.
* `name`: A name for the operation (optional).
##### Returns:
The created Operation.
- - -
#### `tf.FixedLengthRecordReader.serialize_state(name=None)`
Produce a string tensor that encodes the state of a reader.
Not all Readers support being serialized, so this can produce an
Unimplemented error.
##### Args:
* `name`: A name for the operation (optional).
##### Returns:
A string Tensor.
- - -
#### `tf.FixedLengthRecordReader.supports_serialize`
Whether the Reader implementation can serialize its state.
## Converting
TensorFlow provides several operations that you can use to convert various data
formats into tensors.
- - -
### `tf.decode_csv(records, record_defaults, field_delim=None, name=None)`
Convert CSV records to tensors. Each column maps to one tensor.
RFC 4180 format is expected for the CSV records.
(https://tools.ietf.org/html/rfc4180)
Note that leading and trailing spaces are allowed in int and float fields.
##### Args:
* `records`: A `Tensor` of type `string`.
Each string is a record/row in the csv and all records should have
the same format.
* `record_defaults`: A list of `Tensor` objects with types from: `float32`, `int32`, `int64`, `string`.
One tensor per column of the input record, with either a
scalar default value for that column or empty if the column is required.
* `field_delim`: An optional `string`. Defaults to `","`.
Delimiter used to separate fields in a record.
* `name`: A name for the operation (optional).
##### Returns:
A list of `Tensor` objects. Has the same type as `record_defaults`.
Each tensor will have the same shape as records.
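The column-wise output and the role of the defaults can be sketched with Python's standard `csv` module. This is a model of the behaviour described above, not the op itself. The helper `decode_csv_sketch` is hypothetical, and representing each default as a `(type, default)` pair, with `None` marking a required column, is an assumption made because plain Python values carry no tensor dtype.

```python
import csv


def decode_csv_sketch(records, record_defaults, field_delim=","):
    """Pure-Python model of column-wise CSV decoding (hypothetical helper).

    record_defaults is a list of (type, default) pairs; a default of None
    marks the column as required.
    """
    columns = [[] for _ in record_defaults]
    for row in csv.reader(records, delimiter=field_delim):
        if len(row) != len(record_defaults):
            raise ValueError("expected %d fields, got %d"
                             % (len(record_defaults), len(row)))
        for column, field, (field_type, default) in zip(
                columns, row, record_defaults):
            if field == "":
                if default is None:
                    raise ValueError("a required field is empty")
                column.append(default)
            else:
                # int() and float() tolerate leading and trailing spaces.
                column.append(field_type(field))
    return columns


defaults = [(int, 0), (float, None), (str, "n/a")]
ids, scores, labels = decode_csv_sketch(["1, 2.5 ,x", ",3.0,"], defaults)
```

Each returned list holds one column, with one entry per input record.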
- - -
### `tf.decode_raw(bytes, out_type, little_endian=None, name=None)`
Reinterpret the bytes of a string as a vector of numbers.
##### Args:
* `bytes`: A `Tensor` of type `string`.
All the elements must have the same length.
* `out_type`: A `tf.DType` from: `tf.float32, tf.float64, tf.int32, tf.uint8, tf.int16, tf.int8, tf.int64`.
* `little_endian`: An optional `bool`. Defaults to `True`.
Whether the input bytes are in little-endian order.
Ignored for out_types that are stored in a single byte like uint8.
* `name`: A name for the operation (optional).
##### Returns:
A `Tensor` of type `out_type`.
A Tensor with one more dimension than the input bytes. The
added dimension will have size equal to the length of the elements
of bytes divided by the number of bytes to represent out_type.
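The reinterpretation and the size of the added dimension can be illustrated with Python's `struct` module. This is a sketch for a single string only; the helper `decode_raw_sketch` and the string type names in `_FORMATS` are hypothetical stand-ins for the `tf.DType` values listed above.

```python
import struct

# Hypothetical mapping from type names to struct format characters.
_FORMATS = {
    "uint8": "B", "int8": "b", "int16": "h", "int32": "i",
    "int64": "q", "float32": "f", "float64": "d",
}


def decode_raw_sketch(data, out_type, little_endian=True):
    """Reinterpret one byte string as a list of numbers (illustrative)."""
    code = _FORMATS[out_type]
    size = struct.calcsize("<" + code)  # bytes per output element
    if len(data) % size != 0:
        raise ValueError("input length is not a multiple of the element size")
    count = len(data) // size  # size of the added dimension
    order = "<" if little_endian else ">"
    return list(struct.unpack(order + str(count) + code, data))


raw = b"\x01\x00\x02\x00"
as_int16 = decode_raw_sketch(raw, "int16")
as_int16_big_endian = decode_raw_sketch(raw, "int16", little_endian=False)
as_uint8 = decode_raw_sketch(raw, "uint8")
```

Four input bytes give two `int16` values or four `uint8` values, and byte order only matters for the multi-byte type.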
- - -
### Example protocol buffer
TensorFlow's [recommended format for training
examples](../../how_tos/reading_data/index.md#standard-tensorflow-format)
is serialized `Example` protocol buffers, [described
here](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/example/example.proto).
They contain `Features`, [described
here](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/example/feature.proto).
- - -
### `tf.parse_example(serialized, names=None, sparse_keys=None, sparse_types=None, dense_keys=None, dense_types=None, dense_defaults=None, dense_shapes=None, name='ParseExample')`
Parses `Example` protos.
Parses a number of serialized
[`Example`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/example/example.proto)
protos given in `serialized`.
`names` may contain descriptive names for the corresponding serialized protos.
These may be useful for debugging purposes, but they have no effect on the
output. If not `None`, `names` must be the same length as `serialized`.
This op parses serialized examples into a dictionary mapping keys to `Tensor`
and `SparseTensor` objects respectively, depending on whether the keys appear
in `sparse_keys` or `dense_keys`.
The key `dense_keys[j]` is mapped to a `Tensor` of type `dense_types[j]` and
of shape `(serialized.size(),) + dense_shapes[j]`.
`dense_defaults` provides defaults for values referenced using `dense_keys`.
If a key is not present in this dictionary, the corresponding dense `Feature`
is required in all elements of `serialized`.
`dense_shapes[j]` provides the shape of each `Feature` entry referenced by
`dense_keys[j]`. The `Feature` corresponding to `dense_keys[j]` must always
have `np.prod(dense_shapes[j])` entries. The returned `Tensor` for
`dense_keys[j]` has shape `[N] + dense_shapes[j]`, where `N` is the number of
`Example`s in `serialized`.
The key `sparse_keys[j]` is mapped to a `SparseTensor` of type
`sparse_types[j]`. The `SparseTensor` represents a ragged matrix.
Its indices are `[batch, index]` where `batch` is the batch entry the value
is from, and `index` is the value's index in the list of values associated
with that feature and example.
Examples:
For example, if one expects a `tf.float32` sparse feature `ft` and three
serialized `Example`s are provided:
```
serialized = [
features:
{ feature: [ key: { "ft" value: float_list: { value: [1.0, 2.0] } } ] },
features:
{ feature: [] },
features:
{ feature: [ key: { "ft" value: float_list: { value: [3.0] } } ] }
]
```
then the output will look like:
```
{"ft": SparseTensor(indices=[[0, 0], [0, 1], [2, 0]],
values=[1.0, 2.0, 3.0],
shape=(3, 2)) }
```
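The mapping from a ragged batch of value lists to `indices`, `values`, and `shape` can be reproduced in a few lines of plain Python. The helper `ragged_to_sparse` is hypothetical and returns ordinary lists and tuples rather than a `SparseTensor`. It assumes the second shape entry is the length of the longest row, which is consistent with the outputs shown in this section.

```python
def ragged_to_sparse(rows):
    """Convert a list of per-example value lists to (indices, values, shape)."""
    indices, values = [], []
    for batch, row in enumerate(rows):
        for index, value in enumerate(row):
            indices.append([batch, index])  # [batch entry, position in list]
            values.append(value)
    longest = max((len(row) for row in rows), default=0)
    return indices, values, (len(rows), longest)


# The "ft" feature from the three Examples above; the second has no values.
indices, values, shape = ragged_to_sparse([[1.0, 2.0], [], [3.0]])
```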
Given two `Example` input protos in `serialized`:
```
[
features: {
feature: { key: "kw" value: { bytes_list: { value: [ "knit", "big" ] } } }
feature: { key: "gps" value: { float_list: { value: [] } } }
},
features: {
feature: { key: "kw" value: { bytes_list: { value: [ "emmy" ] } } }
feature: { key: "dank" value: { int64_list: { value: [ 42 ] } } }
feature: { key: "gps" value: { } }
}
]
```
And arguments
```
names: ["input0", "input1"],
sparse_keys: ["kw", "dank", "gps"]
sparse_types: [DT_STRING, DT_INT64, DT_FLOAT]
```
Then the output is a dictionary:
```python
{
"kw": SparseTensor(
indices=[[0, 0], [0, 1], [1, 0]],
values=["knit", "big", "emmy"],
shape=[2, 2]),
"dank": SparseTensor(
indices=[[1, 0]],
values=[42],
shape=[2, 1]),
"gps": SparseTensor(
indices=[],
values=[],
shape=[2, 0]),
}
```
For dense results in two serialized `Example`s:
```
[
features: {
feature: { key: "age" value: { int64_list: { value: [ 0 ] } } }
feature: { key: "gender" value: { bytes_list: { value: [ "f" ] } } }
},
features: {
feature: { key: "age" value: { int64_list: { value: [] } } }
feature: { key: "gender" value: { bytes_list: { value: [ "f" ] } } }
}
]
```
We can use arguments:
```
names: ["input0", "input1"],
dense_keys: np.array(["age", "gender"]),
dense_types: [tf.int64, tf.string],
dense_defaults: {
"age": -1 # "age" defaults to -1 if missing
# "gender" has no specified default so it's required
}
dense_shapes: [(1,), (1,)], # age, gender
```
And the expected output is:
```python
{
"age": [[0], [-1]],
"gender": [["f"], ["f"]],
}
```
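The way defaults fill in missing dense values can be modelled as follows. This sketch only handles features of shape `(1,)`, as in the example above, and represents each `Example` as a plain dict from key to value list. The helper `dense_feature` is hypothetical.

```python
def dense_feature(examples, key, default=None):
    """Collect one dense column of shape [N, 1] from dict-based examples."""
    column = []
    for example in examples:
        values = example.get(key, [])
        if values:
            column.append(list(values))
        elif default is not None:
            column.append([default])  # missing value: fall back to the default
        else:
            raise ValueError("feature %r is required but missing" % key)
    return column


examples = [
    {"age": [0], "gender": ["f"]},
    {"age": [], "gender": ["f"]},
]
age = dense_feature(examples, "age", default=-1)
gender = dense_feature(examples, "gender")
```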
##### Args:
* `serialized`: A list of strings, a batch of binary serialized `Example`
protos.
* `names`: A list of strings, the names of the serialized protos.
* `sparse_keys`: A list of string keys in the examples' features.
The results for these keys will be returned as `SparseTensor` objects.
* `sparse_types`: A list of `DTypes` of the same length as `sparse_keys`.
Only `tf.float32` (`FloatList`), `tf.int64` (`Int64List`),
and `tf.string` (`BytesList`) are supported.
* `dense_keys`: A list of string keys in the examples' features.
The results for these keys will be returned as `Tensor`s.
* `dense_types`: A list of DTypes of the same length as `dense_keys`.
Only `tf.float32` (`FloatList`), `tf.int64` (`Int64List`),
and `tf.string` (`BytesList`) are supported.
* `dense_defaults`: A dict mapping string keys to `Tensor`s.
The keys of the dict must match the dense_keys of the feature.
* `dense_shapes`: A list of tuples with the same length as `dense_keys`.
The shape of the data for each dense feature referenced by `dense_keys`.
* `name`: A name for this operation (optional).
##### Returns:
A `dict` mapping keys to `Tensor`s and `SparseTensor`s.
##### Raises:
* `ValueError`: If sparse and dense key sets intersect, or input lengths do not
match up.
- - -
### `tf.parse_single_example(serialized, names=None, sparse_keys=None, sparse_types=None, dense_keys=None, dense_types=None, dense_defaults=None, dense_shapes=None, name='ParseSingleExample')`
Parses a single `Example` proto.
Similar to `parse_example`, except:
For dense tensors, the returned `Tensor` is identical to the output of
`parse_example`, except that there is no batch dimension: the output shape is
the same as the shape given in `dense_shapes`.
For `SparseTensor`s, the first (batch) column of the indices matrix is removed
(the indices matrix is a column vector), the values vector is unchanged, and
the first (batch_size) entry of the shape vector is removed (it is now a
single element vector).
See also `parse_example`.
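The change to the sparse output can be pictured as dropping the batch column from a one-example batch. The sketch below works on plain lists and tuples. The helper `drop_batch_dimension` is hypothetical and only illustrates the shape bookkeeping described above.

```python
def drop_batch_dimension(indices, values, shape):
    """Remove the leading batch entry from sparse indices and shape."""
    return [row[1:] for row in indices], values, tuple(shape[1:])


# Batched form for a single Example whose "kw" feature has two values.
batched = ([[0, 0], [0, 1]], ["knit", "big"], (1, 2))
single_indices, single_values, single_shape = drop_batch_dimension(*batched)
```

The indices become a column vector, the values are untouched, and the shape has a single entry.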
##### Args:
* `serialized`: A scalar string, a single serialized Example.
See parse_example documentation for more details.
* `names`: (Optional) A scalar string, the associated name.
See parse_example documentation for more details.
* `sparse_keys`: See parse_example documentation for more details.
* `sparse_types`: See parse_example documentation for more details.
* `dense_keys`: See parse_example documentation for more details.
* `dense_types`: See parse_example documentation for more details.
* `dense_defaults`: See parse_example documentation for more details.
* `dense_shapes`: See parse_example documentation for more details.
* `name`: A name for this operation (optional).
##### Returns:
A dictionary mapping keys to Tensors and SparseTensors.
##### Raises:
* `ValueError`: if `serialized` or `names` have known shapes, and are not scalars.
## Queues
TensorFlow provides several implementations of 'Queues', which are
structures within the TensorFlow computation graph to stage pipelines
of tensors together. The following describe the basic Queue interface
and some implementations. To see an example use, see [Threading and
Queues](../../how_tos/threading_and_queues/index.md).
- - -
### `class tf.QueueBase`
Base class for queue implementations.
A queue is a TensorFlow data structure that stores tensors across
multiple steps, and exposes operations that enqueue and dequeue
tensors.
Each queue element is a tuple of one or more tensors, where each
tuple component has a static dtype, and may have a static shape. The
queue implementations support versions of enqueue and dequeue that
handle single elements, and versions that enqueue and dequeue a
batch of elements at once.
See [`tf.FIFOQueue`](#FIFOQueue) and
[`tf.RandomShuffleQueue`](#RandomShuffleQueue) for concrete
implementations of this class, and instructions on how to create
them.
- - -
#### `tf.QueueBase.enqueue(vals, name=None)`
Enqueues one element to this queue.
If the queue is full when this operation executes, it will block
until the element has been enqueued.
##### Args:
* `vals`: The tuple of `Tensor` objects to be enqueued.
* `name`: A name for the operation (optional).
##### Returns:
The operation that enqueues a new tuple of tensors to the queue.
- - -
#### `tf.QueueBase.enqueue_many(vals, name=None)`
Enqueues zero or more elements to this queue.
This operation slices each component tensor along the 0th dimension to
make multiple queue elements. All of the tensors in `vals` must have the
same size in the 0th dimension.
If the queue is full when this operation executes, it will block
until all of the elements have been enqueued.
##### Args:
* `vals`: The tensor or tuple of tensors from which the queue elements
are taken.
* `name`: A name for the operation (optional).
##### Returns:
The operation that enqueues a batch of tuples of tensors to the queue.
- - -
#### `tf.QueueBase.dequeue(name=None)`
Dequeues one element from this queue.
If the queue is empty when this operation executes, it will block
until there is an element to dequeue.
##### Args:
* `name`: A name for the operation (optional).
##### Returns:
The tuple of tensors that was dequeued.
- - -
#### `tf.QueueBase.dequeue_many(n, name=None)`
Dequeues and concatenates `n` elements from this queue.
This operation concatenates queue-element component tensors along
the 0th dimension to make a single component tensor. All of the
components in the dequeued tuple will have size `n` in the 0th dimension.
If the queue contains fewer than `n` elements when this operation
executes, it will block until `n` elements have been dequeued.
##### Args:
* `n`: A scalar `Tensor` containing the number of elements to dequeue.
* `name`: A name for the operation (optional).
##### Returns:
The tuple of concatenated tensors that was dequeued.
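The slicing done by `enqueue_many` and the concatenation done by `dequeue_many` can be modelled with nested Python lists. The class `QueueSketch` is a hypothetical, single-threaded stand-in. Where the real operations would block, it raises an error instead.

```python
import collections


class QueueSketch(object):
    """Single-threaded model of a queue whose elements are tuples."""

    def __init__(self):
        self._elements = collections.deque()

    def enqueue_many(self, vals):
        # All components must agree on their size in the 0th dimension.
        if len({len(component) for component in vals}) != 1:
            raise ValueError("components differ in size along dimension 0")
        # Slice along dimension 0: one queue element per row.
        for element in zip(*vals):
            self._elements.append(element)

    def dequeue_many(self, n):
        if len(self._elements) < n:
            raise RuntimeError("the real operation would block here")
        batch = [self._elements.popleft() for _ in range(n)]
        # Concatenate along dimension 0: one list per component.
        return tuple(list(component) for component in zip(*batch))

    def size(self):
        return len(self._elements)


q = QueueSketch()
q.enqueue_many(([1, 2, 3], ["a", "b", "c"]))  # three two-component elements
numbers, letters = q.dequeue_many(2)
```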
- - -
#### `tf.QueueBase.size(name=None)`
Compute the number of elements in this queue.
##### Args:
* `name`: A name for the operation (optional).
##### Returns:
A scalar tensor containing the number of elements in this queue.
- - -
#### `tf.QueueBase.close(cancel_pending_enqueues=False, name=None)`
Closes this queue.
This operation signals that no more elements will be enqueued in
the given queue. Subsequent `enqueue` and `enqueue_many`
operations will fail. Subsequent `dequeue` and `dequeue_many`
operations will continue to succeed if sufficient elements remain
in the queue. Subsequent `dequeue` and `dequeue_many` operations
that would block will fail immediately.
If `cancel_pending_enqueues` is `True`, all pending enqueue requests will
also be cancelled.
##### Args:
* `cancel_pending_enqueues`: (Optional.) A boolean, defaulting to
`False` (described above).
* `name`: A name for the operation (optional).
##### Returns:
The operation that closes the queue.
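The rules for a closed queue can be summarised in a small single-threaded model. `ClosableQueueSketch` is hypothetical, and the `RuntimeError` exceptions merely stand in for whatever errors or blocking the real operations produce.

```python
import collections


class ClosableQueueSketch(object):
    """Models which operations succeed before and after close()."""

    def __init__(self):
        self._elements = collections.deque()
        self._closed = False

    def enqueue(self, element):
        if self._closed:
            raise RuntimeError("enqueue fails: the queue is closed")
        self._elements.append(element)

    def dequeue(self):
        if self._elements:
            return self._elements.popleft()  # still succeeds after close
        if self._closed:
            raise RuntimeError("dequeue fails: closed and empty")
        raise RuntimeError("the real operation would block here")

    def close(self):
        self._closed = True


q = ClosableQueueSketch()
q.enqueue("a")
q.close()
remaining = q.dequeue()  # elements enqueued before close can still be read
```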
#### Other Methods
- - -
#### `tf.QueueBase.__init__(dtypes, shapes, queue_ref)`
Constructs a queue object from a queue reference.
##### Args:
* `dtypes`: A list of types. The length of dtypes must equal the number
of tensors in each element.
* `shapes`: Constraints on the shapes of tensors in an element:
A list of shape tuples or None. This list is the same length
as dtypes. If the shape of any tensors in the element are constrained,
all must be; shapes can be None if the shapes should not be constrained.
* `queue_ref`: The queue reference, i.e. the output of the queue op.
- - -
#### `tf.QueueBase.dtypes`
The list of dtypes for each component of a queue element.
- - -
#### `tf.QueueBase.name`
The name of the underlying queue.
- - -
#### `tf.QueueBase.queue_ref`
The underlying queue reference.
- - -
### `class tf.FIFOQueue`
A queue implementation that dequeues elements in first-in first-out order.
See [`tf.QueueBase`](#QueueBase) for a description of the methods on
this class.
- - -
#### `tf.FIFOQueue.__init__(capacity, dtypes, shapes=None, shared_name=None, name='fifo_queue')`
Creates a queue that dequeues elements in a first-in first-out order.
A `FIFOQueue` has bounded capacity; supports multiple concurrent
producers and consumers; and provides exactly-once delivery.
A `FIFOQueue` holds a list of up to `capacity` elements. Each
element is a fixed-length tuple of tensors whose dtypes are
described by `dtypes`, and whose shapes are optionally described
by the `shapes` argument.
If the `shapes` argument is specified, each component of a queue
element must have the respective fixed shape. If it is
unspecified, different queue elements may have different shapes,
but the use of `dequeue_many` is disallowed.
##### Args:
* `capacity`: An integer. The upper bound on the number of elements
that may be stored in this queue.
* `dtypes`: A list of `DType` objects. The length of `dtypes` must equal
the number of tensors in each queue element.
* `shapes`: (Optional.) A list of fully-defined `TensorShape` objects,
with the same length as `dtypes` or `None`.
* `shared_name`: (Optional.) If non-empty, this queue will be shared under
the given name across multiple sessions.
* `name`: Optional name for the queue operation.
- - -
### `class tf.RandomShuffleQueue`
A queue implementation that dequeues elements in a random order.
See [`tf.QueueBase`](#QueueBase) for a description of the methods on
this class.
- - -
#### `tf.RandomShuffleQueue.__init__(capacity, min_after_dequeue, dtypes, shapes=None, seed=None, shared_name=None, name='random_shuffle_queue')`
Create a queue that dequeues elements in a random order.
A `RandomShuffleQueue` has bounded capacity; supports multiple
concurrent producers and consumers; and provides exactly-once
delivery.
A `RandomShuffleQueue` holds a list of up to `capacity`
elements. Each element is a fixed-length tuple of tensors whose
dtypes are described by `dtypes`, and whose shapes are optionally
described by the `shapes` argument.
If the `shapes` argument is specified, each component of a queue
element must have the respective fixed shape. If it is
unspecified, different queue elements may have different shapes,
but the use of `dequeue_many` is disallowed.
The `min_after_dequeue` argument allows the caller to specify a
minimum number of elements that will remain in the queue after a
`dequeue` or `dequeue_many` operation completes, to ensure a
minimum level of mixing of elements. This invariant is maintained
by blocking those operations until sufficient elements have been
enqueued. The `min_after_dequeue` argument is ignored after the
queue has been closed.
##### Args:
* `capacity`: An integer. The upper bound on the number of elements
that may be stored in this queue.
* `min_after_dequeue`: An integer (described above).
* `dtypes`: A list of `DType` objects. The length of `dtypes` must equal
the number of tensors in each queue element.
* `shapes`: (Optional.) A list of fully-defined `TensorShape` objects,
with the same length as `dtypes` or `None`.
* `seed`: A Python integer. Used to create a random seed. See
[`set_random_seed`](../../api_docs/python/constant_op.md#set_random_seed)
for behavior.
* `shared_name`: (Optional.) If non-empty, this queue will be shared under
the given name across multiple sessions.
* `name`: Optional name for the queue operation.
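The effect of `min_after_dequeue` can be seen in a single-threaded model: a dequeue is only allowed when at least `min_after_dequeue` elements would remain afterwards, and the limit is dropped once the queue is closed. `ShuffleQueueSketch` is hypothetical and raises an error where the real queue would block.

```python
import random


class ShuffleQueueSketch(object):
    """Models the min_after_dequeue invariant of a shuffling queue."""

    def __init__(self, capacity, min_after_dequeue, seed=None):
        self._capacity = capacity
        self._min_after_dequeue = min_after_dequeue
        self._elements = []
        self._rng = random.Random(seed)
        self._closed = False

    def enqueue(self, element):
        if len(self._elements) >= self._capacity:
            raise RuntimeError("the real operation would block: queue is full")
        self._elements.append(element)

    def close(self):
        self._closed = True

    def dequeue(self):
        # After close, min_after_dequeue is ignored.
        required = 0 if self._closed else self._min_after_dequeue
        if len(self._elements) <= required:
            raise RuntimeError("the real operation would block or fail here")
        index = self._rng.randrange(len(self._elements))
        return self._elements.pop(index)  # a uniformly random element

    def size(self):
        return len(self._elements)


q = ShuffleQueueSketch(capacity=10, min_after_dequeue=2, seed=0)
for value in (0, 1, 2):
    q.enqueue(value)
first = q.dequeue()  # allowed: two elements remain afterwards
```

A larger `min_after_dequeue` gives better mixing, since each dequeue chooses among more elements, at the cost of more buffered data before the first element is available.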
## Dealing with the filesystem
- - -
### `tf.matching_files(pattern, name=None)`
Returns the set of files matching a pattern.
Note that this routine only supports wildcard characters in the
basename portion of the pattern, not in the directory portion.
##### Args:
* `pattern`: A `Tensor` of type `string`. A (scalar) shell wildcard pattern.
* `name`: A name for the operation (optional).
##### Returns:
A `Tensor` of type `string`. A vector of matching filenames.
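The restriction to wildcards in the basename can be illustrated with Python's `fnmatch` module: the directory part of the pattern is used literally and only the final path component is matched. The helper `matching_files_sketch` is hypothetical, and sorting the result is a choice made here for determinism, not something the op is documented to do.

```python
import fnmatch
import os
import tempfile


def matching_files_sketch(pattern):
    """Match wildcards in the basename only (illustrative model)."""
    directory, basename_pattern = os.path.split(pattern)
    names = os.listdir(directory or ".")  # the directory part is literal
    return sorted(os.path.join(directory, name)
                  for name in names
                  if fnmatch.fnmatch(name, basename_pattern))


# Create three files and match only the CSV ones.
tmp_dir = tempfile.mkdtemp()
for name in ("a.csv", "b.csv", "c.txt"):
    open(os.path.join(tmp_dir, name), "w").close()

matches = matching_files_sketch(os.path.join(tmp_dir, "*.csv"))
```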
- - -
### `tf.read_file(filename, name=None)`
Reads and outputs the entire contents of the input filename.
##### Args:
* `filename`: A `Tensor` of type `string`.
* `name`: A name for the operation (optional).
##### Returns:
A `Tensor` of type `string`.
## Input pipeline
TensorFlow functions for setting up an input-prefetching pipeline.
Please see the [reading data how-to](../../how_tos/reading_data/index.md)
for context.
### Beginning of an input pipeline
The "producer" functions add a queue to the graph and a corresponding
`QueueRunner` for running the subgraph that fills that queue.
- - -
### `tf.train.match_filenames_once(pattern, name=None)`
Save the list of files matching pattern, so it is only computed once.
##### Args:
* `pattern`: A file pattern (glob).
* `name`: A name for the operations (optional).
##### Returns:
A variable that is initialized to the list of files matching pattern.
- - -
### `tf.train.limit_epochs(tensor, num_epochs=None, name=None)`
Returns tensor num_epochs times and then raises an OutOfRange error.
##### Args:
* `tensor`: Any Tensor.
* `num_epochs`: An integer (optional). If specified, limits the number
of steps the output tensor may be evaluated.
* `name`: A name for the operations (optional).
##### Returns:
tensor or OutOfRange.
- - -
### `tf.train.range_input_producer(limit, num_epochs=None, shuffle=True, seed=None, capacity=32, name=None)`
Produces the integers from 0 to limit-1 in a queue.
##### Args:
* `limit`: An int32 scalar tensor.
* `num_epochs`: An integer (optional). If specified, `range_input_producer`
produces each integer `num_epochs` times before generating an
OutOfRange error. If not specified, `range_input_producer` can cycle
through the integers an unlimited number of times.
* `shuffle`: Boolean. If true, the integers are randomly shuffled within each
epoch.
* `seed`: An integer (optional). Seed used if shuffle == True.
* `capacity`: An integer. Sets the queue capacity.
* `name`: A name for the operations (optional).
##### Returns:
A Queue with the output integers. A QueueRunner for the Queue
is added to the current Graph's QUEUE_RUNNER collection.
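The order in which integers are produced can be modelled with a generator. In this sketch the generator simply ends after `num_epochs` passes, which stands in for the OutOfRange error. The function `range_epochs` is hypothetical and involves no queue or `QueueRunner`.

```python
import random


def range_epochs(limit, num_epochs=None, shuffle=True, seed=None):
    """Yield 0..limit-1 once per epoch, optionally shuffled within each epoch."""
    rng = random.Random(seed)
    epoch = 0
    while num_epochs is None or epoch < num_epochs:
        order = list(range(limit))
        if shuffle:
            rng.shuffle(order)  # shuffling never crosses an epoch boundary
        for value in order:
            yield value
        epoch += 1


in_order = list(range_epochs(3, num_epochs=2, shuffle=False))
shuffled = list(range_epochs(4, num_epochs=2, seed=0))
```

With shuffling enabled, every block of `limit` consecutive outputs is still a permutation of the full range.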
- - -
### `tf.train.slice_input_producer(tensor_list, num_epochs=None, shuffle=True, seed=None, capacity=32, name=None)`
Produces a slice of each Tensor in tensor_list.
Implemented using a Queue -- a QueueRunner for the Queue
is added to the current Graph's QUEUE_RUNNER collection.
##### Args:
* `tensor_list`: A list of Tensors. Every Tensor in tensor_list must
have the same size in the first dimension.
* `num_epochs`: An integer (optional). If specified, `slice_input_producer`
produces each slice `num_epochs` times before generating
an OutOfRange error. If not specified, `slice_input_producer` can cycle
through the slices an unlimited number of times.
* `shuffle`: Boolean. If true, the slices are randomly shuffled within each
epoch.
* `seed`: An integer (optional). Seed used if shuffle == True.
* `capacity`: An integer. Sets the queue capacity.
* `name`: A name for the operations (optional).
##### Returns:
A list of tensors, one for each element of tensor_list. If the tensor
in tensor_list has shape [N, a, b, ..., z], then the corresponding output
tensor will have shape [a, b, ..., z].
- - -
### `tf.train.string_input_producer(string_tensor, num_epochs=None, shuffle=True, seed=None, capacity=32, name=None)`
Output strings (e.g. filenames) to a queue for an input pipeline.
##### Args:
* `string_tensor`: A 1-D string tensor with the strings to produce.
* `num_epochs`: An integer (optional). If specified, `string_input_producer`
produces each string from `string_tensor` `num_epochs` times before
generating an OutOfRange error. If not specified, `string_input_producer`
can cycle through the strings in `string_tensor` an unlimited number of
times.
* `shuffle`: Boolean. If true, the strings are randomly shuffled within each
epoch.
* `seed`: An integer (optional). Seed used if shuffle == True.
* `capacity`: An integer. Sets the queue capacity.
* `name`: A name for the operations (optional).
##### Returns:
A queue with the output strings. A QueueRunner for the Queue
is added to the current Graph's QUEUE_RUNNER collection.
### Batching at the end of an input pipeline
These functions add a queue to the graph to assemble a batch of examples, with
possible shuffling. They also add a `QueueRunner` for running the subgraph
that fills that queue.
Use [batch](#batch) or [batch_join](#batch_join) for batching examples that have
already been well shuffled. Use [shuffle_batch](#shuffle_batch) or
[shuffle_batch_join](#shuffle_batch_join) for examples that
would benefit from additional shuffling.
Use [batch](#batch) or [shuffle_batch](#shuffle_batch) if you want a
single thread producing examples to batch, or if you have a
single subgraph producing examples but you want to run it in N threads
(where you increase N until it can keep the queue full). Use
[batch_join](#batch_join) or [shuffle_batch_join](#shuffle_batch_join)
if you have N different subgraphs producing examples to batch and you
want them run by N threads.
- - -
### `tf.train.batch(tensor_list, batch_size, num_threads=1, capacity=32, enqueue_many=False, shapes=None, name=None)`
Creates batches of tensors in `tensor_list`.
This function is implemented using a queue. A `QueueRunner` for the
queue is added to the current `Graph`'s `QUEUE_RUNNER` collection.
If `enqueue_many` is `False`, `tensor_list` is assumed to represent a
single example. An input tensor with shape `[x, y, z]` will be output
as a tensor with shape `[batch_size, x, y, z]`.
If `enqueue_many` is `True`, `tensor_list` is assumed to represent a
batch of examples, where the first dimension is indexed by example,
and all members of `tensor_list` should have the same size in the
first dimension. If an input tensor has shape `[*, x, y, z]`, the
output will have shape `[batch_size, x, y, z]`. The `capacity` argument
controls how long the prefetching is allowed to grow the queues.
##### Args:
* `tensor_list`: The list of tensors to enqueue.
* `batch_size`: The new batch size pulled from the queue.
* `num_threads`: The number of threads enqueuing `tensor_list`.
* `capacity`: An integer. The maximum number of elements in the queue.
* `enqueue_many`: If `False`, `tensor_list` represents a single example; if `True`, each tensor in `tensor_list` holds a batch of examples indexed by the first dimension.
* `shapes`: (Optional) The shapes for each example. Defaults to the
inferred shapes for `tensor_list`.
* `name`: (Optional) A name for the operations.
##### Returns:
A list of tensors with the same number and types as `tensor_list`.
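The two shape rules above can be written as a small helper. This is a plain-Python sketch of the documented shapes only; `batched_shape` is a hypothetical name and is not part of TensorFlow.

```python
def batched_shape(input_shape, batch_size, enqueue_many=False):
    """Return the output shape described above for one input tensor."""
    if enqueue_many:
        # The first dimension indexes examples, so it is replaced by batch_size.
        return [batch_size] + list(input_shape[1:])
    # The whole tensor is one example, so a batch dimension is prepended.
    return [batch_size] + list(input_shape)


single = batched_shape([28, 28, 3], batch_size=32)
many = batched_shape([100, 28, 28, 3], batch_size=32, enqueue_many=True)
```

Both calls yield `[32, 28, 28, 3]`: with `enqueue_many=True` the leading dimension of 100 is treated as the example index and does not appear in the output.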
- - -
### `tf.train.batch_join(tensor_list_list, batch_size, capacity=32, enqueue_many=False, shapes=None, name=None)`
Runs a list of tensors to fill a queue to create batches of examples.
Enqueues a different list of tensors in different threads.
Implemented using a queue -- a `QueueRunner` for the queue
is added to the current `Graph`'s `QUEUE_RUNNER` collection.
`len(tensor_list_list)` threads will be started,
with thread `i` enqueuing the tensors from
`tensor_list_list[i]`. `tensor_list_list[i1][j]` must match
`tensor_list_list[i2][j]` in type and shape, except in the first
dimension if `enqueue_many` is true.
If `enqueue_many` is `False`, each `tensor_list_list[i]` is assumed
to represent a single example. An input tensor `x` will be output as a
tensor with shape `[batch_size] + x.shape`.
If `enqueue_many` is `True`, `tensor_list_list[i]` is assumed to
represent a batch of examples, where the first dimension is indexed
by example, and all members of `tensor_list_list[i]` should have the
same size in the first dimension. The slices of any input tensor
`x` are treated as examples, and the output tensors will have shape
`[batch_size] + x.shape[1:]`.
The `capacity` argument controls how long the prefetching is allowed to
grow the queues.
##### Args:
* `tensor_list_list`: A list of tuples of tensors to enqueue.
* `batch_size`: An integer. The new batch size pulled from the queue.
* `capacity`: An integer. The maximum number of elements in the queue.
* `enqueue_many`: If `False`, each `tensor_list_list[i]` represents a single
example; if `True`, it represents a batch of examples indexed by the first dimension.
* `shapes`: (Optional) The shapes for each example. Defaults to the
inferred shapes for `tensor_list_list[i]`.
* `name`: (Optional) A name for the operations.
##### Returns:
A list of tensors with the same number and types as
`tensor_list_list[i]`.
- - -
### `tf.train.shuffle_batch(tensor_list, batch_size, capacity, min_after_dequeue, num_threads=1, seed=None, enqueue_many=False, shapes=None, name=None)`
Creates batches by randomly shuffling tensors.
This function adds the following to the current `Graph`:
* A shuffling queue into which tensors from `tensor_list` are enqueued.
* A `dequeue_many` operation to create batches from the queue.
* A `QueueRunner`, added to the `QUEUE_RUNNER` collection, that enqueues the tensors
from `tensor_list`.
If `enqueue_many` is `False`, `tensor_list` is assumed to represent a
single example. An input tensor with shape `[x, y, z]` will be output
as a tensor with shape `[batch_size, x, y, z]`.
If `enqueue_many` is `True`, `tensor_list` is assumed to represent a
batch of examples, where the first dimension is indexed by example,
and all members of `tensor_list` should have the same size in the
first dimension. If an input tensor has shape `[*, x, y, z]`, the
output will have shape `[batch_size, x, y, z]`.
The `capacity` argument controls how long the prefetching is allowed to
grow the queues.
For example:
```python
# Creates batches of 32 images and 32 labels.
image_batch, label_batch = tf.train.shuffle_batch(
[single_image, single_label],
batch_size=32,
num_threads=4,
capacity=50000,
min_after_dequeue=10000)
```
##### Args:
* `tensor_list`: The list of tensors to enqueue.
* `batch_size`: The new batch size pulled from the queue.
* `capacity`: An integer. The maximum number of elements in the queue.
* `min_after_dequeue`: Minimum number of elements in the queue after a
dequeue, used to ensure a level of mixing of elements.
* `num_threads`: The number of threads enqueuing `tensor_list`.
* `seed`: Seed for the random shuffling within the queue.
* `enqueue_many`: If `False`, `tensor_list` represents a single example; if `True`, each tensor in `tensor_list` holds a batch of examples indexed by the first dimension.
* `shapes`: (Optional) The shapes for each example. Defaults to the
inferred shapes for `tensor_list`.
* `name`: (Optional) A name for the operations.
##### Returns:
A list of tensors with the same number and types as `tensor_list`.
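The interaction of `capacity` and `min_after_dequeue` can be illustrated with a small single-threaded simulation. This is a sketch under stated assumptions, not TensorFlow's queue: the class `ShuffleQueueSketch` is hypothetical, and where a real queue would block the calling thread, the sketch raises an error instead.

```python
import random


class ShuffleQueueSketch:
    """Toy model of a shuffling queue with a minimum fill level."""

    def __init__(self, capacity, min_after_dequeue, seed=None):
        self.capacity = capacity
        self.min_after_dequeue = min_after_dequeue
        self._items = []
        self._rng = random.Random(seed)

    def can_enqueue(self):
        return len(self._items) < self.capacity

    def enqueue(self, item):
        if not self.can_enqueue():
            raise RuntimeError("queue is full; a real queue would block")
        self._items.append(item)

    def can_dequeue_many(self, n):
        # At least min_after_dequeue elements must remain to keep mixing.
        return len(self._items) - n >= self.min_after_dequeue

    def dequeue_many(self, n):
        if not self.can_dequeue_many(n):
            raise RuntimeError("too few elements; a real queue would block")
        batch = []
        for _ in range(n):
            # Pick a random element, move it to the end, and remove it.
            index = self._rng.randrange(len(self._items))
            self._items[index], self._items[-1] = self._items[-1], self._items[index]
            batch.append(self._items.pop())
        return batch


queue = ShuffleQueueSketch(capacity=10, min_after_dequeue=4, seed=0)
for example in range(10):
    queue.enqueue(example)
batch = queue.dequeue_many(3)
```

After this dequeue, 7 elements remain, so another batch of 3 is allowed (4 would remain) but a batch of 4 is not. A larger `min_after_dequeue` gives better mixing at the cost of more memory and a longer warm-up.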
- - -
### `tf.train.shuffle_batch_join(tensor_list_list, batch_size, capacity, min_after_dequeue, seed=None, enqueue_many=False, shapes=None, name=None)`
Creates batches by randomly shuffling tensors.
This version enqueues a different list of tensors in different threads.
It adds the following to the current `Graph`:
* A shuffling queue into which tensors from `tensor_list_list` are enqueued.
* A `dequeue_many` operation to create batches from the queue.
* A `QueueRunner`, added to the `QUEUE_RUNNER` collection, that enqueues the tensors
from `tensor_list_list`.
`len(tensor_list_list)` threads will be started, with thread `i` enqueuing
the tensors from `tensor_list_list[i]`. `tensor_list_list[i1][j]` must match
`tensor_list_list[i2][j]` in type and shape, except in the first dimension if
`enqueue_many` is true.
If `enqueue_many` is `False`, each `tensor_list_list[i]` is assumed
to represent a single example. An input tensor with shape `[x, y,
z]` will be output as a tensor with shape `[batch_size, x, y, z]`.
If `enqueue_many` is `True`, `tensor_list_list[i]` is assumed to
represent a batch of examples, where the first dimension is indexed
by example, and all members of `tensor_list_list[i]` should have the
same size in the first dimension. If an input tensor has shape `[*, x,
y, z]`, the output will have shape `[batch_size, x, y, z]`.
The `capacity` argument controls how long the prefetching is allowed to
grow the queues.
##### Args:
* `tensor_list_list`: A list of tuples of tensors to enqueue.
* `batch_size`: An integer. The new batch size pulled from the queue.
* `capacity`: An integer. The maximum number of elements in the queue.
* `min_after_dequeue`: Minimum number of elements in the queue after a
dequeue, used to ensure a level of mixing of elements.
* `seed`: Seed for the random shuffling within the queue.
* `enqueue_many`: If `False`, each `tensor_list_list[i]` represents a single
example; if `True`, it represents a batch of examples indexed by the first dimension.
* `shapes`: (Optional) The shapes for each example. Defaults to the
inferred shapes for `tensor_list_list[i]`.
* `name`: (Optional) A name for the operations.
##### Returns:
A list of tensors with the same number and types as `tensor_list_list[i]`.
================================================
FILE: SOURCE/api_docs/python/math_ops.md
================================================
# Math
Note: Functions taking `Tensor` arguments can also take anything accepted by
[`tf.convert_to_tensor`](../../api_docs/python/framework.md#convert_to_tensor).
## Contents
### [Math](#AUTOGENERATED-math)
* [Arithmetic Operators](#AUTOGENERATED-arithmetic-operators)
* [`tf.add(x, y, name=None)`](#add)
* [`tf.sub(x, y, name=None)`](#sub)
* [`tf.mul(x, y, name=None)`](#mul)
* [`tf.div(x, y, name=None)`](#div)
* [`tf.mod(x, y, name=None)`](#mod)
* [Basic Math Functions](#AUTOGENERATED-basic-math-functions)
* [`tf.add_n(inputs, name=None)`](#add_n)
* [`tf.abs(x, name=None)`](#abs)
* [`tf.neg(x, name=None)`](#neg)
* [`tf.sign(x, name=None)`](#sign)
* [`tf.inv(x, name=None)`](#inv)
* [`tf.square(x, name=None)`](#square)
* [`tf.round(x, name=None)`](#round)
* [`tf.sqrt(x, name=None)`](#sqrt)
* [`tf.rsqrt(x, name=None)`](#rsqrt)
* [`tf.pow(x, y, name=None)`](#pow)
* [`tf.exp(x, name=None)`](#exp)
* [`tf.log(x, name=None)`](#log)
* [`tf.ceil(x, name=None)`](#ceil)
* [`tf.floor(x, name=None)`](#floor)
* [`tf.maximum(x, y, name=None)`](#maximum)
* [`tf.minimum(x, y, name=None)`](#minimum)
* [`tf.cos(x, name=None)`](#cos)
* [`tf.sin(x, name=None)`](#sin)
* [Matrix Math Functions](#AUTOGENERATED-matrix-math-functions)
* [`tf.diag(diagonal, name=None)`](#diag)
* [`tf.transpose(a, perm=None, name='transpose')`](#transpose)
* [`tf.matmul(a, b, transpose_a=False, transpose_b=False, a_is_sparse=False, b_is_sparse=False, name=None)`](#matmul)
* [`tf.batch_matmul(x, y, adj_x=None, adj_y=None, name=None)`](#batch_matmul)
* [`tf.matrix_determinant(input, name=None)`](#matrix_determinant)
* [`tf.batch_matrix_determinant(input, name=None)`](#batch_matrix_determinant)
* [`tf.matrix_inverse(input, name=None)`](#matrix_inverse)
* [`tf.batch_matrix_inverse(input, name=None)`](#batch_matrix_inverse)
* [`tf.cholesky(input, name=None)`](#cholesky)
* [`tf.batch_cholesky(input, name=None)`](#batch_cholesky)
* [Complex Number Functions](#AUTOGENERATED-complex-number-functions)
* [`tf.complex(real, imag, name=None)`](#complex)
* [`tf.complex_abs(x, name=None)`](#complex_abs)
* [`tf.conj(in_, name=None)`](#conj)
* [`tf.imag(in_, name=None)`](#imag)
* [`tf.real(in_, name=None)`](#real)
* [Reduction](#AUTOGENERATED-reduction)
* [`tf.reduce_sum(input_tensor, reduction_indices=None, keep_dims=False, name=None)`](#reduce_sum)
* [`tf.reduce_prod(input_tensor, reduction_indices=None, keep_dims=False, name=None)`](#reduce_prod)
* [`tf.reduce_min(input_tensor, reduction_indices=None, keep_dims=False, name=None)`](#reduce_min)
* [`tf.reduce_max(input_tensor, reduction_indices=None, keep_dims=False, name=None)`](#reduce_max)
* [`tf.reduce_mean(input_tensor, reduction_indices=None, keep_dims=False, name=None)`](#reduce_mean)
* [`tf.reduce_all(input_tensor, reduction_indices=None, keep_dims=False, name=None)`](#reduce_all)
* [`tf.reduce_any(input_tensor, reduction_indices=None, keep_dims=False, name=None)`](#reduce_any)
* [`tf.accumulate_n(inputs, shape=None, tensor_dtype=None, name=None)`](#accumulate_n)
* [Segmentation](#AUTOGENERATED-segmentation)
* [`tf.segment_sum(data, segment_ids, name=None)`](#segment_sum)
* [`tf.segment_prod(data, segment_ids, name=None)`](#segment_prod)
* [`tf.segment_min(data, segment_ids, name=None)`](#segment_min)
* [`tf.segment_max(data, segment_ids, name=None)`](#segment_max)
* [`tf.segment_mean(data, segment_ids, name=None)`](#segment_mean)
* [`tf.unsorted_segment_sum(data, segment_ids, num_segments, name=None)`](#unsorted_segment_sum)
* [`tf.sparse_segment_sum(data, indices, segment_ids, name=None)`](#sparse_segment_sum)
* [`tf.sparse_segment_mean(data, indices, segment_ids, name=None)`](#sparse_segment_mean)
* [Sequence Comparison and Indexing](#AUTOGENERATED-sequence-comparison-and-indexing)
* [`tf.argmin(input, dimension, name=None)`](#argmin)
* [`tf.argmax(input, dimension, name=None)`](#argmax)
* [`tf.listdiff(x, y, name=None)`](#listdiff)
* [`tf.where(input, name=None)`](#where)
* [`tf.unique(x, name=None)`](#unique)
* [`tf.edit_distance(hypothesis, truth, normalize=True, name='edit_distance')`](#edit_distance)
* [`tf.invert_permutation(x, name=None)`](#invert_permutation)
## Arithmetic Operators
TensorFlow provides several operations that you can use to add basic arithmetic
operators to your graph.
- - -
### `tf.add(x, y, name=None)`
Returns x + y element-wise.
*NOTE*: Add supports broadcasting. AddN does not.
##### Args:
* `x`: A `Tensor`. Must be one of the following types: `float32`, `float64`, `int8`, `int16`, `int32`, `complex64`, `int64`.
* `y`: A `Tensor`. Must have the same type as `x`.
* `name`: A name for the operation (optional).
##### Returns:
A `Tensor`. Has the same type as `x`.
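As a rough illustration of what broadcasting means for `tf.add`, the following plain-Python sketch adds a 1-D row to every row of a 2-D list. It assumes NumPy-style broadcasting and covers only this one case; the helper `add` here is a hypothetical stand-in, not the TensorFlow op.

```python
def add(x, y):
    """Element-wise x + y for a 2-D list `x` and a 1-D or 2-D list `y`."""
    if y and not isinstance(y[0], list):
        # Broadcast the 1-D row `y` across every row of `x`.
        y = [y] * len(x)
    return [[a + b for a, b in zip(row_x, row_y)] for row_x, row_y in zip(x, y)]


matrix = [[1, 2, 3], [4, 5, 6]]
same_shape = add(matrix, [[10, 20, 30], [40, 50, 60]])
broadcast = add(matrix, [10, 20, 30])
```

Here `same_shape` is `[[11, 22, 33], [44, 55, 66]]`, while `broadcast` reuses the row `[10, 20, 30]` for both rows of `matrix`.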
- - -
### `tf.sub(x, y, name=None)`
Returns x - y element-wise.
##### Args:
* `x`: A `Tensor`. Must be one of the following types: `float32`, `float64`, `int32`, `complex64`, `int64`.
* `y`: A `Tensor`. Must have the same type as `x`.
* `name`: A name for the operation (optional).
##### Returns:
A `Tensor`. Has the same type as `x`.
- - -
### `tf.mul(x, y, name=None)`
Returns x * y element-wise.
##### Args:
* `x`: A `Tensor`. Must be one of the following types: `float32`, `float64`, `int8`, `int16`, `int32`, `complex64`, `int64`.
* `y`: A `Tensor`. Must have the same type as `x`.
* `name`: A name for the operation (optional).
##### Returns:
A `Tensor`. Has the same type as `x`.
- - -
### `tf.div(x, y, name=None)`
Returns x / y element-wise.
##### Args:
* `x`: A `Tensor`. Must be one of the following types: `float32`, `float64`, `int32`, `complex64`, `int64`.
* `y`: A `Tensor`. Must have the same type as `x`.
* `name`: A name for the operation (optional).
##### Returns:
A `Tensor`. Has the same type as `x`.
- - -
### `tf.mod(x, y, name=None)`
Returns element-wise remainder of division.
##### Args:
* `x`: A `Tensor`. Must be one of the following types: `int32`, `int64`, `float32`, `float64`.
* `y`: A `Tensor`. Must have the same type as `x`.
* `name`: A name for the operation (optional).
##### Returns:
A `Tensor`. Has the same type as `x`.
## Basic Math Functions
TensorFlow provides several operations that you can use to add basic
mathematical functions to your graph.
- - -
### `tf.add_n(inputs, name=None)`
Adds all input tensors element-wise.
##### Args:
* `inputs`: A list of at least 1 `Tensor` objects of the same type in: `float32`, `float64`, `int64`, `int32`, `uint8`, `int16`, `int8`, `complex64`, `qint8`, `quint8`, `qint32`.
Must all be the same size and shape.
* `name`: A name for the operation (optional).
##### Returns:
A `Tensor`. Has the same type as `inputs`.
- - -
### `tf.abs(x, name=None)`
Computes the absolute value of a tensor.
Given a tensor of real numbers `x`, this operation returns a tensor
containing the absolute value of each element in `x`. For example, if x is
an input element and y is an output element, this operation computes
\\(y = |x|\\).
See [`tf.complex_abs()`](#complex_abs) to compute the absolute value of a complex
number.
##### Args:
* `x`: A `Tensor` of type `float`, `double`, `int32`, or `int64`.
* `name`: A name for the operation (optional).
##### Returns:
A `Tensor` the same size and type as `x` with absolute values.
- - -
### `tf.neg(x, name=None)`
Computes numerical negative value element-wise.
I.e., \\(y = -x\\).
##### Args:
* `x`: A `Tensor`. Must be one of the following types: `float32`, `float64`, `int32`, `complex64`, `int64`.
* `name`: A name for the operation (optional).
##### Returns:
A `Tensor`. Has the same type as `x`.
- - -
### `tf.sign(x, name=None)`
Returns an element-wise indication of the sign of a number.
y = sign(x) = -1 if x < 0; 0 if x == 0; 1 if x > 0.
##### Args:
* `x`: A `Tensor`. Must be one of the following types: `float32`, `float64`, `int32`, `int64`.
* `name`: A name for the operation (optional).
##### Returns:
A `Tensor`. Has the same type as `x`.
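The piecewise definition above can be sketched in plain Python as follows. The helper `sign` is a hypothetical stand-in for the op and works on a flat list of numbers.

```python
def sign(values):
    """Return -1, 0, or 1 per element, keeping each element's numeric type."""
    # (v > 0) - (v < 0) evaluates to 1, 0, or -1; type(v) keeps int vs. float.
    return [type(v)((v > 0) - (v < 0)) for v in values]


signs = sign([-3.5, 0.0, 2.0])
int_signs = sign([-7, 0, 9])
```

The result keeps the input type, matching the statement that the output has the same type as `x`.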
- - -
### `tf.inv(x, name=None)`
Computes the reciprocal of x element-wise.
I.e., \\(y = 1 / x\\).
##### Args:
* `x`: A `Tensor`. Must be one of the following types: `float32`, `float64`, `int32`, `complex64`, `int64`.
* `name`: A name for the operation (optional).
##### Returns:
A `Tensor`. Has the same type as `x`.
- - -
### `tf.square(x, name=None)`
Computes square of x element-wise.
I.e., \\(y = x * x = x^2\\).
##### Args:
* `x`: A `Tensor`. Must be one of the following types: `float32`, `float64`, `int32`, `complex64`, `int64`.
* `name`: A name for the operation (optional).
##### Returns:
A `Tensor`. Has the same type as `x`.
- - -
### `tf.round(x, name=None)`
Rounds the values of a tensor to the nearest integer, element-wise.
For example:
```python
# 'a' is [0.9, 2.5, 2.3, -4.4]
tf.round(a) ==> [ 1.0, 3.0, 2.0, -4.0 ]
```
##### Args:
* `x`: A `Tensor` of type `float` or `double`.
* `name`: A name for the operation (optional).
##### Returns:
A `Tensor` of same shape and type as `x`.
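The example above rounds `2.5` up to `3.0`, whereas Python's built-in `round` rounds halves to the nearest even integer. The sketch below reproduces the documented example with a round-half-up rule; that this rule matches the op for all inputs is an assumption inferred from the example, and `round_half_up` is a hypothetical helper.

```python
import math


def round_half_up(values):
    """Round each value to the nearest integer, with ties going upwards."""
    return [float(math.floor(v + 0.5)) for v in values]


a = [0.9, 2.5, 2.3, -4.4]
rounded = round_half_up(a)
builtin_tie = round(2.5)  # Python's built-in rounds this tie to 2, not 3.
```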
- - -
### `tf.sqrt(x, name=None)`
Computes square root of x element-wise.
I.e., \\(y = \sqrt{x} = x^{1/2}\\).
##### Args:
* `x`: A `Tensor`. Must be one of the following types: `float32`, `float64`, `int32`, `complex64`, `int64`.
* `name`: A name for the operation (optional).
##### Returns:
A `Tensor`. Has the same type as `x`.
- - -
### `tf.rsqrt(x, name=None)`
Computes reciprocal of square root of x element-wise.
I.e., \\(y = 1 / \sqrt{x}\\).
##### Args:
* `x`: A `Tensor`. Must be one of the following types: `float32`, `float64`, `int32`, `complex64`, `int64`.
* `name`: A name for the operation (optional).
##### Returns:
A `Tensor`. Has the same type as `x`.
- - -
### `tf.pow(x, y, name=None)`
Computes the power of one value to another.
Given a tensor `x` and a tensor `y`, this operation computes \\(x^y\\) for
corresponding elements in `x` and `y`. For example:
```
# tensor 'x' is [[2, 2], [3, 3]]
# tensor 'y' is [[8, 16], [2, 3]]
tf.pow(x, y) ==> [[256, 65536], [9, 27]]
```
##### Args:
* `x`: A `Tensor` of type `float`, `double`, `int32`, `complex64`, or `int64`.
* `y`: A `Tensor` of type `float`, `double`, `int32`, `complex64`, or `int64`.
* `name`: A name for the operation (optional).
##### Returns:
A `Tensor`.
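The element-wise pairing used by `tf.pow` can be checked with a plain-Python sketch over nested lists of equal shape. The helper `elementwise_pow` is hypothetical and does not handle broadcasting.

```python
def elementwise_pow(x, y):
    """Raise each element of `x` to the corresponding element of `y`."""
    if isinstance(x, list):
        return [elementwise_pow(xi, yi) for xi, yi in zip(x, y)]
    return x ** y


x = [[2, 2], [3, 3]]
y = [[8, 16], [2, 3]]
result = elementwise_pow(x, y)
```

This reproduces the values from the example above: `[[256, 65536], [9, 27]]`.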
- - -
### `tf.exp(x, name=None)`
Computes exponential of x element-wise. \\(y = e^x\\).
##### Args:
* `x`: A `Tensor`. Must be one of the following types: `float32`, `float64`, `int32`, `complex64`, `int64`.
* `name`: A name for the operation (optional).
##### Returns:
A `Tensor`. Has the same type as `x`.
- - -
### `tf.log(x, name=None)`
Computes natural logarithm of x element-wise.
I.e., \\(y = \log_e x\\).
##### Args:
* `x`: A `Tensor`. Must be one of the following types: `float32`, `float64`, `int32`, `complex64`, `int64`.
* `name`: A name for the operation (optional).
##### Returns:
A `Tensor`. Has the same type as `x`.
- - -
### `tf.ceil(x, name=None)`
Returns element-wise smallest integer not less than x.
##### Args:
* `x`: A `Tensor`. Must be one of the following types: `float32`, `float64`.
* `name`: A name for the operation (optional).
##### Returns:
A `Tensor`. Has the same type as `x`.
- - -
### `tf.floor(x, name=None)`
Returns element-wise largest integer not greater than x.
##### Args:
* `x`: A `Tensor`. Must be one of the following types: `float32`, `float64`.
* `name`: A name for the operation (optional).
##### Returns:
A `Tensor`. Has the same type as `x`.
- - -
### `tf.maximum(x, y, name=None)`
Returns the max of x and y (i.e. x > y ? x : y) element-wise, broadcasts.
##### Args:
* `x`: A `Tensor`. Must be one of the following types: `float32`, `float64`, `int32`, `int64`.
* `y`: A `Tensor`. Must have the same type as `x`.
* `name`: A name for the operation (optional).
##### Returns:
A `Tensor`. Has the same type as `x`.
- - -
### `tf.minimum(x, y, name=None)`
Returns the min of x and y (i.e. x < y ? x : y) element-wise, broadcasts.
##### Args:
* `x`: A `Tensor`. Must be one of the following types: `float32`, `float64`, `int32`, `int64`.
* `y`: A `Tensor`. Must have the same type as `x`.
* `name`: A name for the operation (optional).
##### Returns:
A `Tensor`. Has the same type as `x`.
- - -
### `tf.cos(x, name=None)`
Computes cos of x element-wise.
##### Args:
* `x`: A `Tensor`. Must be one of the following types: `float32`, `float64`, `int32`, `complex64`, `int64`.
* `name`: A name for the operation (optional).
##### Returns:
A `Tensor`. Has the same type as `x`.
- - -
### `tf.sin(x, name=None)`
Computes sin of x element-wise.
##### Args:
* `x`: A `Tensor`. Must be one of the following types: `float32`, `float64`, `int32`, `complex64`, `int64`.
* `name`: A name for the operation (optional).
##### Returns:
A `Tensor`. Has the same type as `x`.
## Matrix Math Functions
TensorFlow provides several operations that you can use to add basic
mathematical functions for matrices to your graph.
- - -
### `tf.diag(diagonal, name=None)`
Returns a diagonal tensor with the given diagonal values.
Given a `diagonal`, this operation returns a tensor with the `diagonal` and
everything else padded with zeros. The diagonal is computed as follows:
Assume `diagonal` has dimensions [D1,..., Dk], then the output is a tensor of
rank 2k with dimensions [D1,..., Dk, D1,..., Dk] where:
`output[i1,..., ik, i1,..., ik] = diagonal[i1, ..., ik]` and 0 everywhere else.
For example:
```prettyprint
# 'diagonal' is [1, 2, 3, 4]
tf.diag(diagonal) ==> [[1, 0, 0, 0]
[0, 2, 0, 0]
[0, 0, 3, 0]
[0, 0, 0, 4]]
```
##### Args:
* `diagonal`: A `Tensor`. Must be one of the following types: `float32`, `float64`, `int32`, `int64`.
Rank k tensor where k is at most 3.
* `name`: A name for the operation (optional).
##### Returns:
A `Tensor`. Has the same type as `diagonal`.
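For the rank-1 case, the rule `output[i, i] = diagonal[i]` with zeros elsewhere can be sketched in plain Python. The helper `diag` below is a hypothetical illustration for 1-D input only; higher-rank diagonals are not covered.

```python
def diag(diagonal):
    """Build an n x n nested list with `diagonal` on the main diagonal."""
    n = len(diagonal)
    return [[diagonal[i] if i == j else 0 for j in range(n)] for i in range(n)]


matrix = diag([1, 2, 3, 4])
```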
- - -
### `tf.transpose(a, perm=None, name='transpose')`
Transposes `a`. Permutes the dimensions according to `perm`.
The returned tensor's dimension i will correspond to the input dimension
`perm[i]`. If `perm` is not given, it is set to (n-1...0), where n is
the rank of the input tensor. Hence by default, this operation performs a
regular matrix transpose on 2-D input Tensors.
For example:
```python
# 'x' is [[1 2 3]
# [4 5 6]]
tf.transpose(x) ==> [[1 4]
[2 5]
[3 6]]
# Equivalently
tf.transpose(x, perm=[1, 0]) ==> [[1 4]
[2 5]
[3 6]]
# 'perm' is more useful for n-dimensional tensors, for n > 2
# 'x' is [[[1 2 3]
# [4 5 6]]
# [[7 8 9]
# [10 11 12]]]
# Take the transpose of the matrices in dimension-0
tf.transpose(x, perm=[0, 2, 1]) ==> [[[1 4]
[2 5]
[3 6]]
[[7 10]
[8 11]
[9 12]]]
```
##### Args:
* `a`: A `Tensor`.
* `perm`: A permutation of the dimensions of `a`.
* `name`: A name for the operation (optional).
##### Returns:
A transposed `Tensor`.
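The index rule above (output dimension `i` corresponds to input dimension `perm[i]`) can be sketched in plain Python on non-empty nested lists. This only illustrates the semantics and is not TensorFlow's implementation; the helper names `shape_of`, `element_at`, and `transpose` are hypothetical.

```python
def shape_of(nested):
    """Return the shape of a regular, non-empty nested list."""
    shape = []
    while isinstance(nested, list):
        shape.append(len(nested))
        nested = nested[0]
    return shape


def element_at(nested, index):
    """Follow a multi-dimensional index into a nested list."""
    for i in index:
        nested = nested[i]
    return nested


def transpose(a, perm=None):
    in_shape = shape_of(a)
    rank = len(in_shape)
    if perm is None:
        # Default: reverse the dimensions, i.e. (n-1 ... 0).
        perm = list(range(rank - 1, -1, -1))
    out_shape = [in_shape[p] for p in perm]

    def build(out_index):
        if len(out_index) == rank:
            # Output dimension i takes its index from input dimension perm[i].
            in_index = [0] * rank
            for i, p in enumerate(perm):
                in_index[p] = out_index[i]
            return element_at(a, in_index)
        size = out_shape[len(out_index)]
        return [build(out_index + [k]) for k in range(size)]

    return build([])


x2 = [[1, 2, 3], [4, 5, 6]]
x3 = [[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]]
t2 = transpose(x2)
t3 = transpose(x3, perm=[0, 2, 1])
```

With `perm=[0, 2, 1]` the leading dimension stays in place and each inner matrix is transposed, as in the last example above.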
- - -
### `tf.matmul(a, b, transpose_a=False, transpose_b=False, a_is_sparse=False, b_is_sparse=False, name=None)`
Multiplies matrix `a` by matrix `b`, producing `a` * `b`.
The inputs must be two-dimensional matrices, with matching inner dimensions,
possibly after transposition.
Both matrices must be of the same type. The supported types are:
`float`, `double`, `int32`, `complex64`.
Either matrix can be transposed on the fly by setting the corresponding flag
to `True`. This is `False` by default.
If one or both of the matrices contain a lot of zeros, a more efficient
multiplication algorithm can be used by setting the corresponding
`a_is_sparse` or `b_is_sparse` flag to `True`. These are `False` by default.
For example:
```python
# 2-D tensor `a`
a = tf.constant([1, 2, 3, 4, 5, 6], shape=[2, 3]) => [[1. 2. 3.]
[4. 5. 6.]]
# 2-D tensor `b`
b = tf.constant([7, 8, 9, 10, 11, 12], shape=[3, 2]) => [[7. 8.]
[9. 10.]
[11. 12.]]
c = tf.matmul(a, b) => [[58 64]
[139 154]]
```
##### Args:
* `a`: `Tensor` of type `float`, `double`, `int32` or `complex64`.
* `b`: `Tensor` with same type as `a`.
* `transpose_a`: If `True`, `a` is transposed before multiplication.
* `transpose_b`: If `True`, `b` is transposed before multiplication.
* `a_is_sparse`: If `True`, `a` is treated as a sparse matrix.
* `b_is_sparse`: If `True`, `b` is treated as a sparse matrix.
* `name`: Name for the operation (optional).
##### Returns:
A `Tensor` of the same type as `a`.
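The effect of the `transpose_a` and `transpose_b` flags can be sketched with a plain-Python matrix product on nested lists. The function below is a hypothetical illustration, not the TensorFlow kernel, and it ignores the sparsity flags, which affect only the algorithm and not the result.

```python
def transpose_2d(m):
    """Swap the rows and columns of a 2-D nested list."""
    return [list(row) for row in zip(*m)]


def matmul(a, b, transpose_a=False, transpose_b=False):
    if transpose_a:
        a = transpose_2d(a)
    if transpose_b:
        b = transpose_2d(b)
    if len(a[0]) != len(b):
        raise ValueError("inner dimensions must match")
    columns_of_b = transpose_2d(b)
    # Each output entry is the dot product of a row of `a` and a column of `b`.
    return [[sum(x * y for x, y in zip(row, col)) for col in columns_of_b]
            for row in a]


a = [[1, 2, 3], [4, 5, 6]]
b = [[7, 8], [9, 10], [11, 12]]
c = matmul(a, b)
gram = matmul(a, a, transpose_b=True)
```

`c` matches the example above, and `gram` shows that a `[2, 3]` matrix can be multiplied by itself once the second operand is transposed.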
- - -
### `tf.batch_matmul(x, y, adj_x=None, adj_y=None, name=None)`
Multiplies slices of two tensors in batches.
Multiplies all slices of `Tensor` `x` and `y` (each slice can be
viewed as an element of a batch), and arranges the individual results
in a single output tensor of the same batch size. Each of the
individual slices can optionally be adjointed (to adjoint a matrix
means to transpose and conjugate it) before multiplication by setting
the `adj_x` or `adj_y` flag to `True`, which are by default `False`.
The input tensors `x` and `y` are 3-D or higher with shape `[..., r_x, c_x]`
and `[..., r_y, c_y]`.
The output tensor is 3-D or higher with shape `[..., r_o, c_o]`, where:
r_o = c_x if adj_x else r_x
c_o = r_y if adj_y else c_y
It is computed as:
out[..., :, :] = matrix(x[..., :, :]) * matrix(y[..., :, :])
##### Args:
* `x`: A `Tensor`. Must be one of the following types: `float32`, `float64`, `int32`, `complex64`.
3-D or higher with shape `[..., r_x, c_x]`.
* `y`: A `Tensor`. Must have the same type as `x`.
3-D or higher with shape `[..., r_y, c_y]`.
* `adj_x`: An optional `bool`. Defaults to `False`.
If `True`, adjoint the slices of `x`.
* `adj_y`: An optional `bool`. Defaults to `False`.
If `True`, adjoint the slices of `y`.
* `name`: A name for the operation (optional).
##### Returns:
A `Tensor`. Has the same type as `x`.
3-D or higher with shape `[..., r_o, c_o]`
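The output-shape rule for `tf.batch_matmul` can be written out as a small helper. This is a sketch of the shape arithmetic only; the function name `batch_matmul_shape` is hypothetical, and the checks that the batch dimensions and the inner dimensions match are assumptions not stated above.

```python
def batch_matmul_shape(x_shape, y_shape, adj_x=False, adj_y=False):
    """Return the output shape [..., r_o, c_o] for the given input shapes."""
    *batch_x, r_x, c_x = x_shape
    *batch_y, r_y, c_y = y_shape
    if batch_x != batch_y:
        raise ValueError("batch dimensions must match (assumed)")
    r_o = c_x if adj_x else r_x
    c_o = r_y if adj_y else c_y
    # The dimensions contracted by the product must agree.
    inner_x = r_x if adj_x else c_x
    inner_y = c_y if adj_y else r_y
    if inner_x != inner_y:
        raise ValueError("inner dimensions must match (assumed)")
    return batch_x + [r_o, c_o]


plain = batch_matmul_shape([5, 2, 3], [5, 3, 4])
with_adj_x = batch_matmul_shape([5, 3, 2], [5, 3, 4], adj_x=True)
with_adj_y = batch_matmul_shape([5, 2, 3], [5, 4, 3], adj_y=True)
```

All three calls describe the same product shape, `[5, 2, 4]`; the flags only change which stored layout of `x` or `y` is expected.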
- - -
### `tf.matrix_determinant(input, name=None)`
Calculates the determinant of a square matrix.
##### Args:
* `input`: A `Tensor`. Must be one of the following types: `float32`, `float64`.
A tensor of shape `[M, M]`.
* `name`: A name for the operation (optional).
##### Returns:
A `Tensor`. Has the same type as `input`.
A scalar, equal to the determinant of the input.
- - -
### `tf.batch_matrix_determinant(input, name=None)`
Calculates the determinants for a batch of square matrices.
The input is a tensor of shape `[..., M, M]` whose inner-most 2 dimensions
form square matrices. The output is a tensor of shape `[...]` containing the determinants
for all input submatrices `[..., :, :]`.
##### Args:
* `input`: A `Tensor`. Must be one of the following types: `float32`, `float64`.
Shape is `[..., M, M]`.
* `name`: A name for the operation (optional).
##### Returns:
A `Tensor`. Has the same type as `input`. Shape is `[...]`.
- - -
### `tf.matrix_inverse(input, name=None)`
Calculates the inverse of a square invertible matrix. Checks for invertibility.
##### Args:
* `input`: A `Tensor`. Must be one of the following types: `float32`, `float64`.
Shape is `[M, M]`.
* `name`: A name for the operation (optional).
##### Returns:
A `Tensor`. Has the same type as `input`.
Shape is `[M, M]` containing the matrix inverse of the input.
- - -
### `tf.batch_matrix_inverse(input, name=None)`
Calculates the inverse of square invertible matrices. Checks for invertibility.
The input is a tensor of shape `[..., M, M]` whose inner-most 2 dimensions
form square matrices. The output is a tensor of the same shape as the input
containing the inverse for all input submatrices `[..., :, :]`.
##### Args:
* `input`: A `Tensor`. Must be one of the following types: `float32`, `float64`.
Shape is `[..., M, M]`.
* `name`: A name for the operation (optional).
##### Returns:
A `Tensor`. Has the same type as `input`. Shape is `[..., M, M]`.
- - -
### `tf.cholesky(input, name=None)`
Calculates the Cholesky decomposition of a square matrix.
The input has to be symmetric and positive definite. Only the lower-triangular
part of the input will be used for this operation. The upper-triangular part
will not be read.
The result is the lower-triangular matrix of the Cholesky decomposition of the
input.
##### Args:
* `input`: A `Tensor`. Must be one of the following types: `float64`, `float32`.
Shape is `[M, M]`.
* `name`: A name for the operation (optional).
##### Returns:
A `Tensor`. Has the same type as `input`. Shape is `[M, M]`.
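To make the "only the lower-triangular part is read" statement concrete, here is a plain-Python sketch of a textbook Cholesky factorization. It is not TensorFlow's kernel; `cholesky_lower` is a hypothetical helper, and the input is assumed to be symmetric positive definite.

```python
import math


def cholesky_lower(a):
    """Return lower-triangular L with L * L^T equal to the symmetric matrix."""
    n = len(a)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            # Only a[i][j] with j <= i (the lower triangle) is ever read.
            partial = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(a[i][i] - partial)
            else:
                lower[i][j] = (a[i][j] - partial) / lower[j][j]
    return lower


# The upper triangle is deliberately zeroed to show that it is not used.
a = [[4.0, 0.0, 0.0],
     [12.0, 37.0, 0.0],
     [-16.0, -43.0, 98.0]]
factor = cholesky_lower(a)
```

The result is the factor of the full symmetric matrix whose lower triangle is given in `a`.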
- - -
### `tf.batch_cholesky(input, name=None)`
Calculates the Cholesky decomposition of a batch of square matrices.
The input is a tensor of shape `[..., M, M]` whose inner-most 2 dimensions
form square matrices, with the same constraints as the single matrix Cholesky
decomposition above. The output is a tensor of the same shape as the input
containing the Cholesky decompositions for all input submatrices `[..., :, :]`.
##### Args:
* `input`: A `Tensor`. Must be one of the following types: `float64`, `float32`.
Shape is `[..., M, M]`.
* `name`: A name for the operation (optional).
##### Returns:
A `Tensor`. Has the same type as `input`. Shape is `[..., M, M]`.
## Complex Number Functions
TensorFlow provides several operations that you can use to add complex number
functions to your graph.
- - -
### `tf.complex(real, imag, name=None)`
Converts two real numbers to a complex number.
Given a tensor `real` representing the real part of a complex number, and a
tensor `imag` representing the imaginary part of a complex number, this
operation computes complex numbers elementwise of the form \\(a + bj\\),
where *a* represents the `real` part and *b* represents the `imag` part.
The input tensors `real` and `imag` must be the same shape.
For example:
```
# tensor 'real' is [2.25, 3.25]
# tensor `imag` is [4.75, 5.75]
tf.complex(real, imag) ==> [2.25 + 4.75j, 3.25 + 5.75j]
```
##### Args:
* `real`: A `Tensor` of type `float`.
* `imag`: A `Tensor` of type `float`.
* `name`: A name for the operation (optional).
##### Returns:
A `Tensor` of type `complex64`.
- - -
### `tf.complex_abs(x, name=None)`
Computes the complex absolute value of a tensor.
Given a tensor `x` of complex numbers, this operation returns a tensor of type
`float` that is the absolute value of each element in `x`. All elements in `x`
must be complex numbers of the form \\(a + bj\\). The absolute value is
computed as \\( \sqrt{a^2 + b^2}\\).
For example:
```
# tensor 'x' is [[-2.25 + 4.75j], [-3.25 + 5.75j]]
tf.complex_abs(x) ==> [5.25594902, 6.60492229]
```
##### Args:
* `x`: A `Tensor` of type `complex64`.
* `name`: A name for the operation (optional).
##### Returns:
A `Tensor` of type `float32`.
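The formula \\( \sqrt{a^2 + b^2}\\) can be verified with Python's built-in complex numbers. This sketch works on a flat list and uses a hypothetical helper name, `complex_abs`; it computes in double precision, so the values agree with the `float32` results above only to several decimal places.

```python
import math


def complex_abs(values):
    """Return sqrt(a^2 + b^2) for each complex number a + bj."""
    return [math.hypot(z.real, z.imag) for z in values]


x = [-2.25 + 4.75j, -3.25 + 5.75j]
magnitudes = complex_abs(x)
```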
- - -
### `tf.conj(in_, name=None)`
Returns the complex conjugate of a complex number.
Given a tensor `in` of complex numbers, this operation returns a tensor of
complex numbers that are the complex conjugate of each element in `in`. The
complex numbers in `in` must be of the form \\(a + bj\\), where *a* is the real
part and *b* is the imaginary part.
The complex conjugate returned by this operation is of the form \\(a - bj\\).
For example:
```
# tensor 'in' is [-2.25 + 4.75j, 3.25 + 5.75j]
tf.conj(in) ==> [-2.25 - 4.75j, 3.25 - 5.75j]
```
##### Args:
* `in_`: A `Tensor` of type `complex64`.
* `name`: A name for the operation (optional).
##### Returns:
A `Tensor` of type `complex64`.
- - -
### `tf.imag(in_, name=None)`
Returns the imaginary part of a complex number.
Given a tensor `in` of complex numbers, this operation returns a tensor of type
`float` that is the imaginary part of each element in `in`. All elements in `in`
must be complex numbers of the form \\(a + bj\\), where *a* is the real part
and *b* is the imaginary part returned by this operation.
For example:
```
# tensor 'in' is [-2.25 + 4.75j, 3.25 + 5.75j]
tf.imag(in) ==> [4.75, 5.75]
```
##### Args:
* `in_`: A `Tensor` of type `complex64`.
* `name`: A name for the operation (optional).
##### Returns:
A `Tensor` of type `float32`.
- - -
### `tf.real(in_, name=None)`
Returns the real part of a complex number.
Given a tensor `in` of complex numbers, this operation returns a tensor of type
`float` that is the real part of each element in `in`. All elements in `in`
must be complex numbers of the form \\(a + bj\\), where *a* is the real part
returned by this operation and *b* is the imaginary part.
For example:
```
# tensor 'in_' is [-2.25 + 4.75j, 3.25 + 5.75j]
tf.real(in_) ==> [-2.25, 3.25]
```
##### Args:
* `in_`: A `Tensor` of type `complex64`.
* `name`: A name for the operation (optional).
##### Returns:
A `Tensor` of type `float32`.
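The conjugate, imaginary part, and real part can be illustrated together with Python's built-in `complex` type. This is only a sketch of the element-wise arithmetic; the variable names are illustrative and not part of the TensorFlow API.

```python
# Python complex numbers stand in for complex64 tensor elements.
values = [-2.25 + 4.75j, 3.25 + 5.75j]

conjugates = [z.conjugate() for z in values]  # a - bj
imag_parts = [z.imag for z in values]         # b
real_parts = [z.real for z in values]         # a

print(conjugates)  # [(-2.25-4.75j), (3.25-5.75j)]
print(imag_parts)  # [4.75, 5.75]
print(real_parts)  # [-2.25, 3.25]
```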
## Reduction
TensorFlow provides several operations that you can use to perform
common math computations that reduce various dimensions of a tensor.
- - -
### `tf.reduce_sum(input_tensor, reduction_indices=None, keep_dims=False, name=None)`
Computes the sum of elements across dimensions of a tensor.
Reduces `input_tensor` along the dimensions given in `reduction_indices`.
Unless `keep_dims` is true, the rank of the tensor is reduced by 1 for each
entry in `reduction_indices`. If `keep_dims` is true, the reduced dimensions
are retained with length 1.
If `reduction_indices` has no entries, all dimensions are reduced, and a
tensor with a single element is returned.
For example:
```python
# 'x' is [[1, 1, 1]
#         [1, 1, 1]]
tf.reduce_sum(x) ==> 6
tf.reduce_sum(x, 0) ==> [2, 2, 2]
tf.reduce_sum(x, 1) ==> [3, 3]
tf.reduce_sum(x, 1, keep_dims=True) ==> [[3], [3]]
tf.reduce_sum(x, [0, 1]) ==> 6
```
##### Args:
* `input_tensor`: The tensor to reduce. Should have numeric type.
* `reduction_indices`: The dimensions to reduce. If `None` (the default),
reduces all dimensions.
* `keep_dims`: If true, retains reduced dimensions with length 1.
* `name`: A name for the operation (optional).
##### Returns:
The reduced tensor.
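The effect of `reduction_indices` and `keep_dims` on a rank-2 input can be sketched with nested Python lists. The helper `reduce_sum_2d` is hypothetical and only handles rectangular lists of lists; it is not how TensorFlow implements the reduction.

```python
def reduce_sum_2d(x, reduction_indices=None, keep_dims=False):
    """Hypothetical helper: sum a list of lists over dimension 0, 1, or both."""
    if reduction_indices is None:
        dims = {0, 1}  # None means "reduce every dimension"
    elif isinstance(reduction_indices, int):
        dims = {reduction_indices}
    else:
        dims = set(reduction_indices)

    rows = [list(row) for row in x]
    if 1 in dims:  # collapse each row to a single column
        rows = [[sum(row)] for row in rows]
    if 0 in dims:  # collapse all rows into one row
        rows = [[sum(col) for col in zip(*rows)]]
    if keep_dims:
        return rows  # reduced dimensions are kept with length 1

    # Drop the dimensions that were reduced.
    if dims == {0, 1}:
        return rows[0][0]
    if dims == {0}:
        return rows[0]
    if dims == {1}:
        return [row[0] for row in rows]
    return rows

x = [[1, 1, 1],
     [1, 1, 1]]
print(reduce_sum_2d(x))                     # 6
print(reduce_sum_2d(x, 0))                  # [2, 2, 2]
print(reduce_sum_2d(x, 1))                  # [3, 3]
print(reduce_sum_2d(x, 1, keep_dims=True))  # [[3], [3]]
```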
- - -
### `tf.reduce_prod(input_tensor, reduction_indices=None, keep_dims=False, name=None)`
Computes the product of elements across dimensions of a tensor.
Reduces `input_tensor` along the dimensions given in `reduction_indices`.
Unless `keep_dims` is true, the rank of the tensor is reduced by 1 for each
entry in `reduction_indices`. If `keep_dims` is true, the reduced dimensions
are retained with length 1.
If `reduction_indices` has no entries, all dimensions are reduced, and a
tensor with a single element is returned.
##### Args:
* `input_tensor`: The tensor to reduce. Should have numeric type.
* `reduction_indices`: The dimensions to reduce. If `None` (the default),
reduces all dimensions.
* `keep_dims`: If true, retains reduced dimensions with length 1.
* `name`: A name for the operation (optional).
##### Returns:
The reduced tensor.
- - -
### `tf.reduce_min(input_tensor, reduction_indices=None, keep_dims=False, name=None)`
Computes the minimum of elements across dimensions of a tensor.
Reduces `input_tensor` along the dimensions given in `reduction_indices`.
Unless `keep_dims` is true, the rank of the tensor is reduced by 1 for each
entry in `reduction_indices`. If `keep_dims` is true, the reduced dimensions
are retained with length 1.
If `reduction_indices` has no entries, all dimensions are reduced, and a
tensor with a single element is returned.
##### Args:
* `input_tensor`: The tensor to reduce. Should have numeric type.
* `reduction_indices`: The dimensions to reduce. If `None` (the default),
reduces all dimensions.
* `keep_dims`: If true, retains reduced dimensions with length 1.
* `name`: A name for the operation (optional).
##### Returns:
The reduced tensor.
- - -
### `tf.reduce_max(input_tensor, reduction_indices=None, keep_dims=False, name=None)`
Computes the maximum of elements across dimensions of a tensor.
Reduces `input_tensor` along the dimensions given in `reduction_indices`.
Unless `keep_dims` is true, the rank of the tensor is reduced by 1 for each
entry in `reduction_indices`. If `keep_dims` is true, the reduced dimensions
are retained with length 1.
If `reduction_indices` has no entries, all dimensions are reduced, and a
tensor with a single element is returned.
##### Args:
* `input_tensor`: The tensor to reduce. Should have numeric type.
* `reduction_indices`: The dimensions to reduce. If `None` (the default),
reduces all dimensions.
* `keep_dims`: If true, retains reduced dimensions with length 1.
* `name`: A name for the operation (optional).
##### Returns:
The reduced tensor.
- - -
### `tf.reduce_mean(input_tensor, reduction_indices=None, keep_dims=False, name=None)`
Computes the mean of elements across dimensions of a tensor.
Reduces `input_tensor` along the dimensions given in `reduction_indices`.
Unless `keep_dims` is true, the rank of the tensor is reduced by 1 for each
entry in `reduction_indices`. If `keep_dims` is true, the reduced dimensions
are retained with length 1.
If `reduction_indices` has no entries, all dimensions are reduced, and a
tensor with a single element is returned.
For example:
```python
# 'x' is [[1., 1.]
#         [2., 2.]]
tf.reduce_mean(x) ==> 1.5
tf.reduce_mean(x, 0) ==> [1.5, 1.5]
tf.reduce_mean(x, 1) ==> [1., 2.]
```
##### Args:
* `input_tensor`: The tensor to reduce. Should have numeric type.
* `reduction_indices`: The dimensions to reduce. If `None` (the default),
reduces all dimensions.
* `keep_dims`: If true, retains reduced dimensions with length 1.
* `name`: A name for the operation (optional).
##### Returns:
The reduced tensor.
- - -
### `tf.reduce_all(input_tensor, reduction_indices=None, keep_dims=False, name=None)`
Computes the "logical and" of elements across dimensions of a tensor.
Reduces `input_tensor` along the dimensions given in `reduction_indices`.
Unless `keep_dims` is true, the rank of the tensor is reduced by 1 for each
entry in `reduction_indices`. If `keep_dims` is true, the reduced dimensions
are retained with length 1.
If `reduction_indices` has no entries, all dimensions are reduced, and a
tensor with a single element is returned.
For example:
```python
# 'x' is [[True,  True]
#         [False, False]]
tf.reduce_all(x) ==> False
tf.reduce_all(x, 0) ==> [False, False]
tf.reduce_all(x, 1) ==> [True, False]
```
##### Args:
* `input_tensor`: The boolean tensor to reduce.
* `reduction_indices`: The dimensions to reduce. If `None` (the default),
reduces all dimensions.
* `keep_dims`: If true, retains reduced dimensions with length 1.
* `name`: A name for the operation (optional).
##### Returns:
The reduced tensor.
- - -
### `tf.reduce_any(input_tensor, reduction_indices=None, keep_dims=False, name=None)`
Computes the "logical or" of elements across dimensions of a tensor.
Reduces `input_tensor` along the dimensions given in `reduction_indices`.
Unless `keep_dims` is true, the rank of the tensor is reduced by 1 for each
entry in `reduction_indices`. If `keep_dims` is true, the reduced dimensions
are retained with length 1.
If `reduction_indices` has no entries, all dimensions are reduced, and a
tensor with a single element is returned.
For example:
```python
# 'x' is [[True,  True]
#         [False, False]]
tf.reduce_any(x) ==> True
tf.reduce_any(x, 0) ==> [True, True]
tf.reduce_any(x, 1) ==> [True, False]
```
##### Args:
* `input_tensor`: The boolean tensor to reduce.
* `reduction_indices`: The dimensions to reduce. If `None` (the default),
reduces all dimensions.
* `keep_dims`: If true, retains reduced dimensions with length 1.
* `name`: A name for the operation (optional).
##### Returns:
The reduced tensor.
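The boolean reductions behave like Python's built-in `all()` and `any()` applied along rows or columns. The sketch below reproduces the `reduce_all` and `reduce_any` examples on a nested list; the variable names are illustrative only.

```python
x = [[True, True],
     [False, False]]

# zip(*x) iterates over columns, i.e. it reduces along dimension 0.
all_everything = all(all(row) for row in x)  # False
all_dim0 = [all(col) for col in zip(*x)]     # [False, False]
all_dim1 = [all(row) for row in x]           # [True, False]

any_everything = any(any(row) for row in x)  # True
any_dim0 = [any(col) for col in zip(*x)]     # [True, True]
any_dim1 = [any(row) for row in x]           # [True, False]
```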
- - -
### `tf.accumulate_n(inputs, shape=None, tensor_dtype=None, name=None)`
Returns the element-wise sum of a list of tensors.
Optionally, pass `shape` and `tensor_dtype` for shape and type checking,
otherwise, these are inferred.
For example:
```python
# tensor 'a' is [[1, 2], [3, 4]]
# tensor 'b' is [[5, 0], [0, 6]]
tf.accumulate_n([a, b, a]) ==> [[7, 4], [6, 14]]
# Explicitly pass shape and type
tf.accumulate_n([a, b, a], shape=[2, 2], tensor_dtype=tf.int32)
==> [[7, 4], [6, 14]]
```
##### Args:
* `inputs`: A list of `Tensor` objects, each with same shape and type.
* `shape`: Shape of elements of `inputs`.
* `tensor_dtype`: The type of `inputs`.
* `name`: A name for the operation (optional).
##### Returns:
A `Tensor` of same shape and type as the elements of `inputs`.
##### Raises:
* `ValueError`: If `inputs` don't all have same shape and dtype or the shape
cannot be inferred.
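As an illustration of the shape check and the element-wise sum, here is a minimal sketch over nested Python lists. `accumulate_n_lists` is a hypothetical helper restricted to rank-2 inputs; it does not check dtypes.

```python
def accumulate_n_lists(inputs):
    """Hypothetical helper: element-wise sum of equally shaped 2-D lists."""
    if not inputs:
        raise ValueError("inputs must contain at least one matrix")
    shape = (len(inputs[0]), len(inputs[0][0]))
    for m in inputs:
        if len(m) != shape[0] or any(len(row) != shape[1] for row in m):
            raise ValueError("all inputs must have the same shape")
    return [[sum(m[i][j] for m in inputs) for j in range(shape[1])]
            for i in range(shape[0])]

a = [[1, 2], [3, 4]]
b = [[5, 0], [0, 6]]
total = accumulate_n_lists([a, b, a])
print(total)  # [[7, 4], [6, 14]]
```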
## Segmentation
TensorFlow provides several operations that you can use to perform common
math computations on tensor segments.
Here a segmentation is a partitioning of a tensor along
the first dimension, i.e. it defines a mapping from the first dimension onto
`segment_ids`. The `segment_ids` tensor should be the size of
the first dimension, `d0`, with consecutive IDs in the range `0` to `k`,
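where `k < d0`.
For example:
```python
c = tf.constant([[1,2,3,4], [-1,-2,-3,-4], [5,6,7,8]])
tf.segment_sum(c, tf.constant([0, 0, 1]))
  ==> [[0 0 0 0]
       [5 6 7 8]]
```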
- - -
### `tf.segment_sum(data, segment_ids, name=None)`
Computes the sum along segments of a tensor.
Read [the section on Segmentation](../../api_docs/python/math_ops.md#segmentation)
for an explanation of segments.
Computes a tensor such that
\\(output_i = \sum_j data_j\\) where sum is over `j` such
that `segment_ids[j] == i`.
##### Args:
* `data`: A `Tensor`. Must be one of the following types: `float32`, `float64`, `int32`, `int64`, `uint8`, `int16`, `int8`.
* `segment_ids`: A `Tensor`. Must be one of the following types: `int32`, `int64`.
A 1-D tensor whose size is equal to the size of `data`'s
first dimension. Values should be sorted and can be repeated.
* `name`: A name for the operation (optional).
##### Returns:
A `Tensor`. Has the same type as `data`.
Has same shape as data, except for dimension_0 which
has size `k`, the number of segments.
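The segment sum can be sketched in plain Python for a matrix, where each row belongs to the segment named by the matching entry of `segment_ids`. `segment_sum_rows` is a hypothetical helper, assuming sorted IDs that start at 0.

```python
def segment_sum_rows(data, segment_ids):
    """Hypothetical helper: sum the rows of `data` that share a segment ID."""
    if len(segment_ids) != len(data):
        raise ValueError("segment_ids must have one entry per row of data")
    if list(segment_ids) != sorted(segment_ids):
        raise ValueError("segment_ids must be sorted")
    num_segments = segment_ids[-1] + 1 if segment_ids else 0
    width = len(data[0]) if data else 0
    output = [[0] * width for _ in range(num_segments)]
    for row, seg in zip(data, segment_ids):
        for j, value in enumerate(row):
            output[seg][j] += value  # output_i = sum of rows j with segment_ids[j] == i
    return output

c = [[1, 2, 3, 4], [-1, -2, -3, -4], [5, 6, 7, 8]]
print(segment_sum_rows(c, [0, 0, 1]))  # [[0, 0, 0, 0], [5, 6, 7, 8]]
```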
- - -
### `tf.segment_prod(data, segment_ids, name=None)`
Computes the product along segments of a tensor.
Read [the section on
Segmentation](../../api_docs/python/math_ops.md#segmentation) for an explanation
of segments.
Computes a tensor such that
\\(output_i = \prod_j data_j\\) where the product is over `j` such
that `segment_ids[j] == i`.
##### Args:
* `data`: A `Tensor`. Must be one of the following types: `float32`, `float64`, `int32`, `int64`, `uint8`, `int16`, `int8`.
* `segment_ids`: A `Tensor`. Must be one of the following types: `int32`, `int64`.
A 1-D tensor whose size is equal to the size of `data`'s
first dimension. Values should be sorted and can be repeated.
* `name`: A name for the operation (optional).
##### Returns:
A `Tensor`. Has the same type as `data`.
Has same shape as data, except for dimension_0 which
has size `k`, the number of segments.
- - -
### `tf.segment_min(data, segment_ids, name=None)`
Computes the minimum along segments of a tensor.
Read [the section on
Segmentation](../../api_docs/python/math_ops.md#segmentation) for an explanation
of segments.
Computes a tensor such that
\\(output_i = \min_j(data_j)\\) where `min` is over `j` such
that `segment_ids[j] == i`.
##### Args:
* `data`: A `Tensor`. Must be one of the following types: `float32`, `float64`, `int32`, `int64`, `uint8`, `int16`, `int8`.
* `segment_ids`: A `Tensor`. Must be one of the following types: `int32`, `int64`.
A 1-D tensor whose size is equal to the size of `data`'s
first dimension. Values should be sorted and can be repeated.
* `name`: A name for the operation (optional).
##### Returns:
A `Tensor`. Has the same type as `data`.
Has same shape as data, except for dimension_0 which
has size `k`, the number of segments.
- - -
### `tf.segment_max(data, segment_ids, name=None)`
Computes the maximum along segments of a tensor.
Read [the section on Segmentation](../../api_docs/python/math_ops.md#segmentation)
for an explanation of segments.
Computes a tensor such that
\\(output_i = \max_j(data_j)\\) where `max` is over `j` such
that `segment_ids[j] == i`.
##### Args:
* `data`: A `Tensor`. Must be one of the following types: `float32`, `float64`, `int32`, `int64`, `uint8`, `int16`, `int8`.
* `segment_ids`: A `Tensor`. Must be one of the following types: `int32`, `int64`.
A 1-D tensor whose size is equal to the size of `data`'s
first dimension. Values should be sorted and can be repeated.
* `name`: A name for the operation (optional).
##### Returns:
A `Tensor`. Has the same type as `data`.
Has same shape as data, except for dimension_0 which
has size `k`, the number of segments.
- - -
### `tf.segment_mean(data, segment_ids, name=None)`
Computes the mean along segments of a tensor.
Read [the section on
Segmentation](../../api_docs/python/math_ops.md#segmentation) for an explanation
of segments.
Computes a tensor such that
\\(output_i = \frac{\sum_j data_j}{N}\\) where `mean` is
over `j` such that `segment_ids[j] == i` and `N` is the total number of
values summed.
##### Args:
* `data`: A `Tensor`. Must be one of the following types: `float32`, `float64`, `int32`, `int64`, `uint8`, `int16`, `int8`.
* `segment_ids`: A `Tensor`. Must be one of the following types: `int32`, `int64`.
A 1-D tensor whose size is equal to the size of `data`'s
first dimension. Values should be sorted and can be repeated.
* `name`: A name for the operation (optional).
##### Returns:
A `Tensor`. Has the same type as `data`.
Has same shape as data, except for dimension_0 which
has size `k`, the number of segments.
- - -
### `tf.unsorted_segment_sum(data, segment_ids, num_segments, name=None)`
Computes the sum along segments of a tensor.
Read [the section on
Segmentation](../../api_docs/python/math_ops.md#segmentation) for an explanation
of segments.
Computes a tensor such that
\\(output_i = \sum_j data_j\\) where sum is over `j` such
that `segment_ids[j] == i`. Unlike `SegmentSum`, `segment_ids`
need not be sorted and need not cover all values in the full
range of valid values.
If the sum is empty for a given segment ID `i`, `output[i] = 0`.
`num_segments` should equal the number of distinct segment IDs.
##### Args:
* `data`: A `Tensor`. Must be one of the following types: `float32`, `float64`, `int32`, `int64`, `uint8`, `int16`, `int8`.
* `segment_ids`: A `Tensor`. Must be one of the following types: `int32`, `int64`.
A 1-D tensor whose size is equal to the size of `data`'s
first dimension.
* `num_segments`: A `Tensor` of type `int32`.
* `name`: A name for the operation (optional).
##### Returns:
A `Tensor`. Has the same type as `data`.
Has same shape as data, except for dimension_0 which
has size `num_segments`.
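The difference from the sorted variant can be sketched as follows: IDs may appear in any order, and a segment that receives no rows stays at zero. `unsorted_segment_sum_rows` is a hypothetical pure-Python helper for rank-2 data.

```python
def unsorted_segment_sum_rows(data, segment_ids, num_segments):
    """Hypothetical helper: like a segment sum, but IDs need not be sorted."""
    width = len(data[0]) if data else 0
    output = [[0] * width for _ in range(num_segments)]  # empty segments stay 0
    for row, seg in zip(data, segment_ids):
        for j, value in enumerate(row):
            output[seg][j] += value
    return output

c = [[1, 2, 3, 4], [-1, -2, -3, -4], [5, 6, 7, 8]]
# Rows 0 and 2 go to segment 2, row 1 goes to segment 0, segment 1 is empty.
print(unsorted_segment_sum_rows(c, [2, 0, 2], 3))
# [[-1, -2, -3, -4], [0, 0, 0, 0], [6, 8, 10, 12]]
```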
- - -
### `tf.sparse_segment_sum(data, indices, segment_ids, name=None)`
Computes the sum along sparse segments of a tensor.
Read [the section on
Segmentation](../../api_docs/python/math_ops.md#segmentation) for an explanation
of segments.
Like `SegmentSum`, but `segment_ids` can have fewer entries than `data`'s first
dimension, selecting a subset of dimension_0, specified by `indices`.
For example:
```prettyprint
c = tf.constant([[1,2,3,4], [-1,-2,-3,-4], [5,6,7,8]])
# Select two rows, one segment.
tf.sparse_segment_sum(c, tf.constant([0, 1]), tf.constant([0, 0]))
==> [[0 0 0 0]]
# Select two rows, two segments.
tf.sparse_segment_sum(c, tf.constant([0, 1]), tf.constant([0, 1]))
==> [[ 1 2 3 4]
[-1 -2 -3 -4]]
# Select all rows, two segments.
tf.sparse_segment_sum(c, tf.constant([0, 1, 2]), tf.constant([0, 0, 1]))
==> [[0 0 0 0]
[5 6 7 8]]
# Which is equivalent to:
tf.segment_sum(c, tf.constant([0, 0, 1]))
```
##### Args:
* `data`: A `Tensor`. Must be one of the following types: `float32`, `float64`, `int32`, `int64`, `uint8`, `int16`, `int8`.
* `indices`: A `Tensor` of type `int32`.
A 1-D tensor. Has the same number of elements as `segment_ids`.
* `segment_ids`: A `Tensor` of type `int32`.
A 1-D tensor. Values should be sorted and can be repeated.
* `name`: A name for the operation (optional).
##### Returns:
A `Tensor`. Has the same type as `data`.
Has same shape as data, except for dimension_0 which
has size `k`, the number of segments.
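The row selection can be sketched in plain Python: `indices` picks rows of `data`, and the entry of `segment_ids` at the same position says which output row each picked row is added to. `sparse_segment_sum_rows` is a hypothetical helper that assumes `indices` and `segment_ids` have the same length.

```python
def sparse_segment_sum_rows(data, indices, segment_ids):
    """Hypothetical helper: segment sum over the rows selected by `indices`."""
    if len(indices) != len(segment_ids):
        raise ValueError("indices and segment_ids must have the same length")
    num_segments = segment_ids[-1] + 1 if segment_ids else 0
    width = len(data[0]) if data else 0
    output = [[0] * width for _ in range(num_segments)]
    for idx, seg in zip(indices, segment_ids):
        for j, value in enumerate(data[idx]):
            output[seg][j] += value
    return output

c = [[1, 2, 3, 4], [-1, -2, -3, -4], [5, 6, 7, 8]]
print(sparse_segment_sum_rows(c, [0, 1], [0, 0]))        # [[0, 0, 0, 0]]
print(sparse_segment_sum_rows(c, [0, 1], [0, 1]))        # [[1, 2, 3, 4], [-1, -2, -3, -4]]
print(sparse_segment_sum_rows(c, [0, 1, 2], [0, 0, 1]))  # [[0, 0, 0, 0], [5, 6, 7, 8]]
```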
- - -
### `tf.sparse_segment_mean(data, indices, segment_ids, name=None)`
Computes the mean along sparse segments of a tensor.
Read [the section on
Segmentation](../../api_docs/python/math_ops.md#segmentation) for an explanation
of segments.
Like `SegmentMean`, but `segment_ids` can have fewer entries than `data`'s first
dimension, selecting a subset of dimension_0, specified by `indices`.
##### Args:
* `data`: A `Tensor`. Must be one of the following types: `float32`, `float64`.
* `indices`: A `Tensor` of type `int32`.
A 1-D tensor. Has the same number of elements as `segment_ids`.
* `segment_ids`: A `Tensor` of type `int32`.
A 1-D tensor. Values should be sorted and can be repeated.
* `name`: A name for the operation (optional).
##### Returns:
A `Tensor`. Has the same type as `data`.
Has same shape as data, except for dimension_0 which
has size `k`, the number of segments.
## Sequence Comparison and Indexing
TensorFlow provides several operations that you can use to add sequence
comparison and index extraction to your graph. You can use these operations to
determine sequence differences and determine the indexes of specific values in
a tensor.
- - -
### `tf.argmin(input, dimension, name=None)`
Returns the index with the smallest value across dimensions of a tensor.
##### Args:
* `input`: A `Tensor`. Must be one of the following types: `float32`, `float64`, `int64`, `int32`, `uint8`, `int16`, `int8`, `complex64`, `qint8`, `quint8`, `qint32`.
* `dimension`: A `Tensor` of type `int32`.
int32, 0 <= dimension < rank(input). Describes which dimension
of the input Tensor to reduce across. For vectors, use dimension = 0.
* `name`: A name for the operation (optional).
##### Returns:
A `Tensor` of type `int64`.
- - -
### `tf.argmax(input, dimension, name=None)`
Returns the index with the largest value across dimensions of a tensor.
##### Args:
* `input`: A `Tensor`. Must be one of the following types: `float32`, `float64`, `int64`, `int32`, `uint8`, `int16`, `int8`, `complex64`, `qint8`, `quint8`, `qint32`.
* `dimension`: A `Tensor` of type `int32`.
int32, 0 <= dimension < rank(input). Describes which dimension
of the input Tensor to reduce across. For vectors, use dimension = 0.
* `name`: A name for the operation (optional).
##### Returns:
A `Tensor` of type `int64`.
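The role of `dimension` in `argmin` and `argmax` can be sketched for a rank-2 input with nested lists. `argmax_2d` is a hypothetical helper; returning the first index on ties is an assumption of this sketch, not something this document specifies.

```python
def argmax_2d(x, dimension):
    """Hypothetical helper: index of the largest value along `dimension`."""
    if dimension == 0:
        # Reduce across rows: one result per column.
        return [col.index(max(col)) for col in zip(*x)]
    if dimension == 1:
        # Reduce across columns: one result per row.
        return [row.index(max(row)) for row in x]
    raise ValueError("dimension must be 0 or 1 for a rank-2 input")

x = [[1, 9, 3],
     [7, 2, 8]]
print(argmax_2d(x, 0))  # [1, 0, 1]
print(argmax_2d(x, 1))  # [1, 2]
```

Replacing `max` with `min` gives the corresponding `argmin` sketch.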
- - -
### `tf.listdiff(x, y, name=None)`
Computes the difference between two lists of numbers.
Given a list `x` and a list `y`, this operation returns a list `out` that
represents all numbers that are in `x` but not in `y`. The returned list `out`
is sorted in the same order that the numbers appear in `x` (duplicates are
preserved). This operation also returns a list `idx` that represents the
position of each `out` element in `x`. In other words:
`out[i] = x[idx[i]] for i in [0, 1, ..., len(out) - 1]`
For example, given this input:
```prettyprint
x = [1, 2, 3, 4, 5, 6]
y = [1, 3, 5]
```
This operation would return:
```prettyprint
out ==> [2, 4, 6]
idx ==> [1, 3, 5]
```
##### Args:
* `x`: A `Tensor`. 1-D. Values to keep.
* `y`: A `Tensor`. Must have the same type as `x`. 1-D. Values to remove.
* `name`: A name for the operation (optional).
##### Returns:
A tuple of `Tensor` objects (out, idx).
* `out`: A `Tensor`. Has the same type as `x`. 1-D. Values present in `x` but not in `y`.
* `idx`: A `Tensor` of type `int32`. 1-D. Positions of `x` values preserved in `out`.
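The relationship between `out` and `idx` can be sketched in plain Python. The function below is an illustrative stand-in that works on Python lists, not the TensorFlow op.

```python
def listdiff(x, y):
    """Illustrative stand-in: values of x not in y, plus their positions in x."""
    remove = set(y)
    out, idx = [], []
    for position, value in enumerate(x):
        if value not in remove:
            out.append(value)     # order and duplicates from x are preserved
            idx.append(position)  # so that out[i] == x[idx[i]]
    return out, idx

out, idx = listdiff([1, 2, 3, 4, 5, 6], [1, 3, 5])
print(out)  # [2, 4, 6]
print(idx)  # [1, 3, 5]
```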
- - -
### `tf.where(input, name=None)`
Returns locations of true values in a boolean tensor.
This operation returns the coordinates of true elements in `input`. The
coordinates are returned in a 2-D tensor where the first dimension (rows)
represents the number of true elements, and the second dimension (columns)
represents the coordinates of the true elements. Keep in mind, the shape of
the output tensor can vary depending on how many true values there are in
`input`. Indices are output in row-major order.
For example:
```prettyprint
# 'input' tensor is [[True, False]
# [True, False]]
# 'input' has two true values, so output has two coordinates.
# 'input' has rank of 2, so coordinates have two indices.
where(input) ==> [[0, 0],
[1, 0]]
# `input` tensor is [[[True, False]
# [True, False]]
# [[False, True]
# [False, True]]
# [[False, False]
# [False, True]]]
# 'input' has 5 true values, so output has 5 coordinates.
# 'input' has rank of 3, so coordinates have three indices.
where(input) ==> [[0, 0, 0],
[0, 1, 0],
[1, 0, 1],
[1, 1, 1],
[2, 1, 1]]
```
##### Args:
* `input`: A `Tensor` of type `bool`.
* `name`: A name for the operation (optional).
##### Returns:
A `Tensor` of type `int64`.
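The row-major ordering of the returned coordinates can be sketched with a recursive walk over a nested list. `where_true` is a hypothetical helper name; the output has one row per true element and one column per input dimension.

```python
def where_true(tensor):
    """Hypothetical helper: coordinates of True entries, in row-major order."""
    coords = []

    def visit(node, prefix):
        if isinstance(node, list):
            for i, child in enumerate(node):
                visit(child, prefix + [i])
        elif node:
            coords.append(prefix)

    visit(tensor, [])
    return coords

print(where_true([[True, False],
                  [True, False]]))  # [[0, 0], [1, 0]]
```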
- - -
### `tf.unique(x, name=None)`
Finds unique elements in a 1-D tensor.
This operation returns a tensor `y` containing all of the unique elements of `x`
sorted in the same order that they occur in `x`. This operation also returns a
tensor `idx` the same size as `x` that contains the index of each value of `x`
in the unique output `y`. In other words:
`y[idx[i]] = x[i] for i in [0, 1, ..., len(x) - 1]`
For example:
```prettyprint
# tensor 'x' is [1, 1, 2, 4, 4, 4, 7, 8, 8]
y, idx = unique(x)
y ==> [1, 2, 4, 7, 8]
idx ==> [0, 0, 1, 2, 2, 2, 3, 4, 4]
```
##### Args:
* `x`: A `Tensor`. 1-D.
* `name`: A name for the operation (optional).
##### Returns:
A tuple of `Tensor` objects (y, idx).
* `y`: A `Tensor`. Has the same type as `x`. 1-D.
* `idx`: A `Tensor` of type `int32`. 1-D.
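The invariant `y[idx[i]] = x[i]` can be sketched in plain Python. `unique_with_index` is a hypothetical helper operating on a Python list.

```python
def unique_with_index(x):
    """Hypothetical helper: unique values in first-seen order, plus an index map."""
    y, idx, position = [], [], {}
    for value in x:
        if value not in position:
            position[value] = len(y)  # first time this value is seen
            y.append(value)
        idx.append(position[value])
    return y, idx

x = [1, 1, 2, 4, 4, 4, 7, 8, 8]
y, idx = unique_with_index(x)
print(y)    # [1, 2, 4, 7, 8]
print(idx)  # [0, 0, 1, 2, 2, 2, 3, 4, 4]
```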
- - -
### `tf.edit_distance(hypothesis, truth, normalize=True, name='edit_distance')`
Computes the Levenshtein distance between sequences.
This operation takes variable-length sequences (`hypothesis` and `truth`),
each provided as a `SparseTensor`, and computes the Levenshtein distance.
You can normalize the edit distance by length of `truth` by setting
`normalize` to true.
For example, given the following input:
```python
# 'hypothesis' is a tensor of shape `[2, 1]` with variable-length values:
# (0,0) = ["a"]
# (1,0) = ["b"]
hypothesis = tf.SparseTensor(
    [[0, 0, 0],
     [1, 0, 0]],
    ["a", "b"],
    (2, 1, 1))
# 'truth' is a tensor of shape `[2, 2]` with variable-length values:
# (0,0) = []
# (0,1) = ["a"]
# (1,0) = ["b", "c"]
# (1,1) = ["a"]
truth = tf.SparseTensor(
    [[0, 1, 0],
     [1, 0, 0],
     [1, 0, 1],
     [1, 1, 0]],
    ["a", "b", "c", "a"],
    (2, 2, 2))
normalize = True
```
This operation would return the following:
```python
# 'output' is a tensor of shape `[2, 2]` with edit distances normalized
# by 'truth' lengths.
output ==> [[inf, 1.0], # (0,0): no truth, (0,1): no hypothesis
[0.5, 1.0]] # (1,0): addition, (1,1): no hypothesis
```
##### Args:
* `hypothesis`: A `SparseTensor` containing hypothesis sequences.
* `truth`: A `SparseTensor` containing truth sequences.
* `normalize`: A `bool`. If `True`, normalizes the Levenshtein distance by
length of `truth`.
* `name`: A name for the operation (optional).
##### Returns:
A dense `Tensor` with rank `R - 1`, where R is the rank of the
`SparseTensor` inputs `hypothesis` and `truth`.
##### Raises:
* `TypeError`: If either `hypothesis` or `truth` is not a `SparseTensor`.
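The per-sequence computation can be sketched in plain Python with the standard dynamic-programming form of the Levenshtein distance. Both function names are hypothetical, the sequences are ordinary lists instead of `SparseTensor` slices, and returning `inf` for an empty `truth` follows the example output above (the case where both sequences are empty is an assumption of this sketch).

```python
def levenshtein(hypothesis, truth):
    """Minimum number of insertions, deletions, and substitutions."""
    previous = list(range(len(truth) + 1))
    for i, h in enumerate(hypothesis, start=1):
        current = [i]
        for j, t in enumerate(truth, start=1):
            cost = 0 if h == t else 1
            current.append(min(previous[j] + 1,          # drop h
                               current[j - 1] + 1,       # add t
                               previous[j - 1] + cost))  # substitute h with t
        previous = current
    return previous[-1]

def edit_distance_pair(hypothesis, truth, normalize=True):
    """Hypothetical helper: distance for one pair, optionally divided by len(truth)."""
    distance = levenshtein(hypothesis, truth)
    if not normalize:
        return float(distance)
    if len(truth) == 0:
        return float("inf")  # assumption: empty truth is reported as inf
    return distance / len(truth)

print(edit_distance_pair(["a"], []))          # inf
print(edit_distance_pair([], ["a"]))          # 1.0
print(edit_distance_pair(["b"], ["b", "c"]))  # 0.5
```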
- - -
### `tf.invert_permutation(x, name=None)`
Computes the inverse permutation of a tensor.
This operation computes the inverse of an index permutation. It takes a 1-D
integer tensor `x`, which represents the indices of a zero-based array, and
swaps each value with its index position. In other words, for an output tensor
`y` and an input tensor `x`, this operation computes the following:
`y[x[i]] = i for i in [0, 1, ..., len(x) - 1]`
The values must include 0. There can be no duplicate values or negative values.
For example:
```prettyprint
# tensor `x` is [3, 4, 0, 2, 1]
invert_permutation(x) ==> [2, 4, 3, 0, 1]
```
##### Args:
* `x`: A `Tensor` of type `int32`. 1-D.
* `name`: A name for the operation (optional).
##### Returns:
A `Tensor` of type `int32`. 1-D.
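The rule `y[x[i]] = i` can be sketched directly in plain Python. The function below is an illustrative stand-in that works on Python lists; the validity check is an addition of this sketch.

```python
def invert_permutation(x):
    """Illustrative stand-in: y[x[i]] = i for every position i."""
    n = len(x)
    if sorted(x) != list(range(n)):
        raise ValueError("x must contain each of 0..len(x)-1 exactly once")
    y = [0] * n
    for i, value in enumerate(x):
        y[value] = i
    return y

print(invert_permutation([3, 4, 0, 2, 1]))  # [2, 4, 3, 0, 1]
```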
================================================
FILE: SOURCE/api_docs/python/nn.md
================================================
# Neural Network
Note: Functions taking `Tensor` arguments can also take anything accepted by
[`tf.convert_to_tensor`](../../api_docs/python/framework.md#convert_to_tensor).
## Contents
### [Neural Network](#AUTOGENERATED-neural-network)
* [Activation Functions](#AUTOGENERATED-activation-functions)
* [`tf.nn.relu(features, name=None)`](#relu)
* [`tf.nn.relu6(features, name=None)`](#relu6)
* [`tf.nn.softplus(features, name=None)`](#softplus)
* [`tf.nn.dropout(x, keep_prob, noise_shape=None, seed=None, name=None)`](#dropout)
* [`tf.nn.bias_add(value, bias, name=None)`](#bias_add)
* [`tf.sigmoid(x, name=None)`](#sigmoid)
* [`tf.tanh(x, name=None)`](#tanh)
* [Convolution](#AUTOGENERATED-convolution)
* [`tf.nn.conv2d(input, filter, strides, padding, use_cudnn_on_gpu=None, name=None)`](#conv2d)
* [`tf.nn.depthwise_conv2d(input, filter, strides, padding, name=None)`](#depthwise_conv2d)
* [`tf.nn.separable_conv2d(input, depthwise_filter, pointwise_filter, strides, padding, name=None)`](#separable_conv2d)
* [Pooling](#AUTOGENERATED-pooling)
* [`tf.nn.avg_pool(value, ksize, strides, padding, name=None)`](#avg_pool)
* [`tf.nn.max_pool(value, ksize, strides, padding, name=None)`](#max_pool)
* [`tf.nn.max_pool_with_argmax(input, ksize, strides, padding, Targmax=None, name=None)`](#max_pool_with_argmax)
* [Normalization](#AUTOGENERATED-normalization)
* [`tf.nn.l2_normalize(x, dim, epsilon=1e-12, name=None)`](#l2_normalize)
* [`tf.nn.local_response_normalization(input, depth_radius=None, bias=None, alpha=None, beta=None, name=None)`](#local_response_normalization)
* [`tf.nn.moments(x, axes, name=None)`](#moments)
* [Losses](#AUTOGENERATED-losses)
* [`tf.nn.l2_loss(t, name=None)`](#l2_loss)
* [Classification](#AUTOGENERATED-classification)
* [`tf.nn.sigmoid_cross_entropy_with_logits(logits, targets, name=None)`](#sigmoid_cross_entropy_with_logits)
* [`tf.nn.softmax(logits, name=None)`](#softmax)
* [`tf.nn.softmax_cross_entropy_with_logits(logits, labels, name=None)`](#softmax_cross_entropy_with_logits)
* [Embeddings](#AUTOGENERATED-embeddings)
* [`tf.nn.embedding_lookup(params, ids, name=None)`](#embedding_lookup)
* [Evaluation](#AUTOGENERATED-evaluation)
* [`tf.nn.top_k(input, k, name=None)`](#top_k)
* [`tf.nn.in_top_k(predictions, targets, k, name=None)`](#in_top_k)
* [Candidate Sampling](#AUTOGENERATED-candidate-sampling)
* [Sampled Loss Functions](#AUTOGENERATED-sampled-loss-functions)
* [`tf.nn.nce_loss(weights, biases, inputs, labels, num_sampled, num_classes, num_true=1, sampled_values=None, remove_accidental_hits=False, name='nce_loss')`](#nce_loss)
* [`tf.nn.sampled_softmax_loss(weights, biases, inputs, labels, num_sampled, num_classes, num_true=1, sampled_values=None, remove_accidental_hits=True, name='sampled_softmax_loss')`](#sampled_softmax_loss)
* [Candidate Samplers](#AUTOGENERATED-candidate-samplers)
* [`tf.nn.uniform_candidate_sampler(true_classes, num_true, num_sampled, unique, range_max, seed=None, name=None)`](#uniform_candidate_sampler)
* [`tf.nn.log_uniform_candidate_sampler(true_classes, num_true, num_sampled, unique, range_max, seed=None, name=None)`](#log_uniform_candidate_sampler)
* [`tf.nn.learned_unigram_candidate_sampler(true_classes, num_true, num_sampled, unique, range_max, seed=None, name=None)`](#learned_unigram_candidate_sampler)
* [`tf.nn.fixed_unigram_candidate_sampler(true_classes, num_true, num_sampled, unique, range_max, vocab_file='', distortion=0.0, num_reserved_ids=0, num_shards=1, shard=0, unigrams=[], seed=None, name=None)`](#fixed_unigram_candidate_sampler)
* [Miscellaneous candidate sampling utilities](#AUTOGENERATED-miscellaneous-candidate-sampling-utilities)
* [`tf.nn.compute_accidental_hits(true_classes, sampled_candidates, num_true, seed=None, name=None)`](#compute_accidental_hits)
## Activation Functions
The activation ops provide different types of nonlinearities for use in
neural networks. These include smooth nonlinearities (`sigmoid`,
`tanh`, and `softplus`), continuous but not everywhere differentiable
functions (`relu`, `relu6`, and `relu_x`), and random regularization
(`dropout`).
All activation ops apply componentwise, and produce a tensor of the same
shape as the input tensor.
- - -
### `tf.nn.relu(features, name=None)`
Computes rectified linear: `max(features, 0)`.
##### Args:
* `features`: A `Tensor`. Must be one of the following types: `float32`, `float64`, `int32`, `int64`, `uint8`, `int16`, `int8`.
* `name`: A name for the operation (optional).
##### Returns:
A `Tensor`. Has the same type as `features`.
- - -
### `tf.nn.relu6(features, name=None)`
Computes Rectified Linear 6: `min(max(features, 0), 6)`.
##### Args:
* `features`: A `Tensor` with type `float`, `double`, `int32`, `int64`, `uint8`,
`int16`, or `int8`.
* `name`: A name for the operation (optional).
##### Returns:
A `Tensor` with the same type as `features`.
- - -
### `tf.nn.softplus(features, name=None)`
Computes softplus: `log(exp(features) + 1)`.
##### Args:
* `features`: A `Tensor`. Must be one of the following types: `float32`, `float64`, `int32`, `int64`, `uint8`, `int16`, `int8`.
* `name`: A name for the operation (optional).
##### Returns:
A `Tensor`. Has the same type as `features`.
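The three formulas above (`relu`, `relu6`, and `softplus`) can be sketched for a single scalar in plain Python. These are illustrative functions only; the TensorFlow ops apply the same formulas to every component of a tensor.

```python
import math

def relu(v):
    return max(v, 0.0)

def relu6(v):
    return min(max(v, 0.0), 6.0)

def softplus(v):
    return math.log(math.exp(v) + 1.0)

print(relu(-1.5))   # 0.0
print(relu6(7.5))   # 6.0
print(relu6(2.0))   # 2.0
```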
- - -
### `tf.nn.dropout(x, keep_prob, noise_shape=None, seed=None, name=None)`
Computes dropout.
With probability `keep_prob`, outputs the input element scaled up by
`1 / keep_prob`, otherwise outputs `0`. The scaling is so that the expected
sum is unchanged.
By default, each element is kept or dropped independently. If `noise_shape`
is specified, it must be
[broadcastable](http://docs.scipy.org/doc/numpy/user/basics.broadcasting.html)
to the shape of `x`, and only dimensions with `noise_shape[i] == shape(x)[i]`
will make independent decisions. For example, if `shape(x) = [k, l, m, n]`
and `noise_shape = [k, 1, 1, n]`, each batch and channel component will be
kept independently and each row and column will be kept or not kept together.
##### Args:
* `x`: A tensor.
* `keep_prob`: A Python float. The probability that each element is kept.
* `noise_shape`: A 1-D `Tensor` of type `int32`, representing the
shape for randomly generated keep/drop flags.
* `seed`: A Python integer. Used to create random seeds. See
[`set_random_seed`](../../api_docs/python/constant_op.md#set_random_seed)
for behavior.
* `name`: A name for this operation (optional).
##### Returns:
A Tensor of the same shape as `x`.
##### Raises:
* `ValueError`: If `keep_prob` is not in `(0, 1]`.
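The keep-and-rescale rule can be sketched for a flat list in plain Python. This is an illustrative function that makes an independent decision per element and ignores `noise_shape`; it uses Python's `random` module, so its random stream is unrelated to TensorFlow's.

```python
import random

def dropout(x, keep_prob, seed=None):
    """Illustrative sketch: keep each element with probability keep_prob."""
    if not 0.0 < keep_prob <= 1.0:
        raise ValueError("keep_prob must be in (0, 1]")
    rng = random.Random(seed)
    # Kept elements are scaled by 1 / keep_prob so the expected sum is unchanged.
    return [v / keep_prob if rng.random() < keep_prob else 0.0 for v in x]

sample = dropout([2.0] * 1000, keep_prob=0.5, seed=0)
print(sorted(set(sample)))  # [0.0, 4.0]
```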
- - -
### `tf.nn.bias_add(value, bias, name=None)`
Adds `bias` to `value`.
This is (mostly) a special case of `tf.add` where `bias` is restricted to 1-D.
Broadcasting is supported, so `value` may have any number of dimensions.
Unlike `tf.add`, the type of `bias` is allowed to differ from `value` in the
case where both types are quantized.
##### Args:
* `value`: A `Tensor` with type `float`, `double`, `int64`, `int32`, `uint8`,
`int16`, `int8`, or `complex64`.
* `bias`: A 1-D `Tensor` with size matching the last dimension of `value`.
Must be the same type as `value` unless `value` is a quantized type,
in which case a different quantized type may be used.
* `name`: A name for the operation (optional).
##### Returns:
A `Tensor` with the same type as `value`.
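The broadcasting rule (a 1-D `bias` added along the last dimension of `value`) can be sketched with nested lists. The function below is an illustrative stand-in that recurses until it reaches the innermost lists.

```python
def bias_add(value, bias):
    """Illustrative stand-in: add 1-D `bias` along the last dimension of `value`."""
    if value and isinstance(value[0], list):
        return [bias_add(sub, bias) for sub in value]
    if len(value) != len(bias):
        raise ValueError("bias size must match the last dimension of value")
    return [v + b for v, b in zip(value, bias)]

print(bias_add([[1, 2], [3, 4]], [10, 20]))  # [[11, 22], [13, 24]]
```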
- - -
### `tf.sigmoid(x, name=None)`
Computes sigmoid of `x` element-wise.
Specifically, `y = 1 / (1 + exp(-x))`.
##### Args:
* `x`: A Tensor with type `float`, `double`, `int32`, `complex64`, `int64`,
or `qint32`.
* `name`: A name for the operation (optional).
##### Returns:
A Tensor with the same type as `x` if `x.dtype != qint32`
otherwise the return type is `quint8`.
- - -
### `tf.tanh(x, name=None)`
Computes hyperbolic tangent of `x` element-wise.
##### Args:
* `x`: A Tensor with type `float`, `double`, `int32`, `complex64`, `int64`,
or `qint32`.
* `name`: A name for the operation (optional).
##### Returns:
A Tensor with the same type as `x` if `x.dtype != qint32` otherwise
the return type is `quint8`.
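For a single float, the sigmoid formula and the hyperbolic tangent can be sketched with Python's `math` module. This illustrates the arithmetic only and does not cover the quantized types.

```python
import math

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

print(sigmoid(0.0))    # 0.5
print(math.tanh(0.0))  # 0.0
```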
## Convolution
The convolution ops sweep a 2-D filter over a batch of images, applying the
filter to each window of each image of the appropriate size. The different
ops trade off between generic vs. specific filters:
* `conv2d`: Arbitrary filters that can mix channels together.
* `depthwise_conv2d`: Filters that operate on each channel independently.
* `separable_conv2d`: A depthwise spatial filter followed by a pointwise filter.
Note that although these ops are called "convolution", they are strictly
speaking "cross-correlation" since the filter is combined with an input window
without reversing the filter. For details, see [the properties of
cross-correlation](https://en.wikipedia.org/wiki/Cross-correlation#Properties).
The filter is applied to image patches of the same size as the filter and
strided according to the `strides` argument. `strides = [1, 1, 1, 1]` applies
the filter to a patch at every offset, `strides = [1, 2, 2, 1]` applies the
filter to every other image patch in each dimension, etc.
Ignoring channels for the moment, the spatial semantics of the convolution ops
are as follows. If the 4-D `input` has shape
`[batch, in_height, in_width, ...]` and the 4-D `filter` has shape
`[filter_height, filter_width, ...]`, then
shape(output) = [batch,
(in_height - filter_height + 1) / strides[1],
(in_width - filter_width + 1) / strides[2],
...]
output[b, i, j, :] =
sum_{di, dj} input[b, strides[1] * i + di, strides[2] * j + dj, ...] *
filter[di, dj, ...]
Since `input` is 4-D, each `input[b, i, j, :]` is a vector. For `conv2d`, these
vectors are multiplied by the `filter[di, dj, :, :]` matrices to produce new
vectors. For `depthwise_conv2d`, each scalar component `input[b, i, j, k]`
is multiplied by a vector `filter[di, dj, k]`, and all the vectors are
concatenated.
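In the formula for `shape(output)`, the result is rounded up, and the numerator depends on padding:
* `padding = 'VALID'`: Only full-size windows are considered, so the output size is `ceil((in_size - filter_size + 1) / stride)`, as in the formula above.
* `padding = 'SAME'`: The input is zero-padded so that partial windows are included, and the output size is `ceil(in_size / stride)`.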
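As a rough sketch of how the output size along one spatial dimension follows from the input size, filter size, and stride: this assumes that `'VALID'` keeps only windows lying fully inside the input and that `'SAME'` zero-pads the input so that every stride offset yields a window. The helper name `conv_output_size` is illustrative and not part of TensorFlow.

```python
def conv_output_size(in_size, filter_size, stride, padding):
    """Output size along one spatial dimension (illustrative helper)."""
    if padding == "VALID":
        # Only windows that fit entirely inside the input.
        numerator = in_size - filter_size + 1
    elif padding == "SAME":
        # Zero-padding makes a window available at every stride offset.
        numerator = in_size
    else:
        raise ValueError("padding must be 'SAME' or 'VALID'")
    # Integer ceiling division: ceil(numerator / stride).
    return -(-numerator // stride)

valid_size = conv_output_size(28, 5, 1, "VALID")
same_size = conv_output_size(28, 5, 1, "SAME")
```

With a stride of 1, `'SAME'` preserves the spatial size, while `'VALID'` shrinks it by `filter_size - 1`.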
- - -
### `tf.nn.conv2d(input, filter, strides, padding, use_cudnn_on_gpu=None, name=None)`
Computes a 2-D convolution given 4-D `input` and `filter` tensors.
Given an input tensor of shape `[batch, in_height, in_width, in_channels]`
and a filter / kernel tensor of shape
`[filter_height, filter_width, in_channels, out_channels]`, this op
performs the following:
1. Flattens the filter to a 2-D matrix with shape
`[filter_height * filter_width * in_channels, out_channels]`.
2. Extracts image patches from the input tensor to form a *virtual*
tensor of shape `[batch, out_height, out_width,
filter_height * filter_width * in_channels]`.
3. For each patch, right-multiplies the filter matrix and the image patch
vector.
In detail,
output[b, i, j, k] =
sum_{di, dj, q} input[b, strides[1] * i + di, strides[2] * j + dj, q] *
filter[di, dj, q, k]
Must have `strides[0] = strides[3] = 1`. For the most common case of the same
horizontal and vertical strides, `strides = [1, stride, stride, 1]`.
##### Args:
* `input`: A `Tensor`. Must be one of the following types: `float32`, `float64`.
* `filter`: A `Tensor`. Must have the same type as `input`.
* `strides`: A list of `ints`.
1-D of length 4. The stride of the sliding window for each dimension
of `input`.
* `padding`: A `string` from: `"SAME", "VALID"`.
The type of padding algorithm to use.
* `use_cudnn_on_gpu`: An optional `bool`. Defaults to `True`.
* `name`: A name for the operation (optional).
##### Returns:
A `Tensor`. Has the same type as `input`.
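The `output[b, i, j, k]` sum can be evaluated directly with nested loops. The following is a minimal sketch on nested Python lists, not the TensorFlow implementation: the helper `conv2d_valid` is hypothetical, handles only `'VALID'` padding, and uses the same stride in both spatial dimensions.

```python
def conv2d_valid(inp, filt, stride=1):
    """Direct evaluation of the conv2d sum with 'VALID' padding.

    inp:  nested lists, shape [batch, in_height, in_width, in_channels]
    filt: nested lists, shape [filter_height, filter_width, in_channels, out_channels]
    """
    batch, in_h, in_w = len(inp), len(inp[0]), len(inp[0][0])
    f_h, f_w = len(filt), len(filt[0])
    in_c, out_c = len(filt[0][0]), len(filt[0][0][0])
    out_h = -(-(in_h - f_h + 1) // stride)  # ceiling division
    out_w = -(-(in_w - f_w + 1) // stride)
    out = [[[[0.0] * out_c for _ in range(out_w)]
            for _ in range(out_h)] for _ in range(batch)]
    for b in range(batch):
        for i in range(out_h):
            for j in range(out_w):
                for k in range(out_c):
                    out[b][i][j][k] = sum(
                        inp[b][stride * i + di][stride * j + dj][q] * filt[di][dj][q][k]
                        for di in range(f_h)
                        for dj in range(f_w)
                        for q in range(in_c))
    return out

# One 3x3 single-channel image and a 2x2 filter of ones.
image = [[[[1.0], [2.0], [3.0]],
          [[4.0], [5.0], [6.0]],
          [[7.0], [8.0], [9.0]]]]
kernel = [[[[1.0]], [[1.0]]],
          [[[1.0]], [[1.0]]]]
result = conv2d_valid(image, kernel)
```

Each output entry here is the sum of one 2x2 window, for example `1 + 2 + 4 + 5 = 12` at position `[0, 0, 0, 0]`.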
- - -
### `tf.nn.depthwise_conv2d(input, filter, strides, padding, name=None)`
Depthwise 2-D convolution.
Given an input tensor of shape `[batch, in_height, in_width, in_channels]`
and a filter tensor of shape
`[filter_height, filter_width, in_channels, channel_multiplier]`
containing `in_channels` convolutional filters of depth 1, `depthwise_conv2d`
applies a different filter to each input channel (expanding from 1 channel
to `channel_multiplier` channels for each), then concatenates the results
together. The output has `in_channels * channel_multiplier` channels.
In detail,
output[b, i, j, k * channel_multiplier + q] =
sum_{di, dj} input[b, strides[1] * i + di, strides[2] * j + dj, k] *
filter[di, dj, k, q]
Must have `strides[0] = strides[3] = 1`. For the most common case of the
same horizontal and vertical strides, `strides = [1, stride, stride, 1]`.
##### Args:
* `input`: 4-D with shape `[batch, in_height, in_width, in_channels]`.
* `filter`: 4-D with shape
`[filter_height, filter_width, in_channels, channel_multiplier]`.
* `strides`: 1-D of size 4. The stride of the sliding window for each
dimension of `input`.
* `padding`: A string, either `'VALID'` or `'SAME'`. The padding algorithm.
* `name`: A name for this operation (optional).
##### Returns:
A 4-D `Tensor` of shape
`[batch, out_height, out_width, in_channels * channel_multiplier].`
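To make the output channel layout `k * channel_multiplier + q` concrete, here is a small sketch on nested Python lists. It is not the TensorFlow kernel: the helper `depthwise_conv2d_valid` is hypothetical, supports only `'VALID'` padding, and uses one stride for both spatial dimensions.

```python
def depthwise_conv2d_valid(inp, filt, stride=1):
    """Direct evaluation of the depthwise sum with 'VALID' padding.

    inp:  shape [batch, in_height, in_width, in_channels]
    filt: shape [filter_height, filter_width, in_channels, channel_multiplier]
    """
    batch, in_h, in_w = len(inp), len(inp[0]), len(inp[0][0])
    f_h, f_w = len(filt), len(filt[0])
    in_c, mult = len(filt[0][0]), len(filt[0][0][0])
    out_h = -(-(in_h - f_h + 1) // stride)  # ceiling division
    out_w = -(-(in_w - f_w + 1) // stride)
    out = [[[[0.0] * (in_c * mult) for _ in range(out_w)]
            for _ in range(out_h)] for _ in range(batch)]
    for b in range(batch):
        for i in range(out_h):
            for j in range(out_w):
                for k in range(in_c):
                    for q in range(mult):
                        # Channel k only ever sees its own filters filt[:, :, k, :].
                        out[b][i][j][k * mult + q] = sum(
                            inp[b][stride * i + di][stride * j + dj][k] * filt[di][dj][k][q]
                            for di in range(f_h)
                            for dj in range(f_w))
    return out

inp = [[[[1.0, 2.0], [3.0, 4.0]]]]               # shape [1, 1, 2, 2]
filt = [[[[1.0, 10.0], [1.0, 10.0]],
         [[1.0, 10.0], [1.0, 10.0]]]]             # shape [1, 2, 2, 2]
result = depthwise_conv2d_valid(inp, filt)
```

The two outputs derived from input channel 0 (`4.0`, `40.0`) come first, followed by the two derived from input channel 1 (`6.0`, `60.0`).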
- - -
### `tf.nn.separable_conv2d(input, depthwise_filter, pointwise_filter, strides, padding, name=None)`
2-D convolution with separable filters.
Performs a depthwise convolution that acts separately on channels followed by
a pointwise convolution that mixes channels. Note that this is separability
between dimensions `[1, 2]` and `3`, not spatial separability between
dimensions `1` and `2`.
In detail,
output[b, i, j, k] = sum_{di, dj, q, r}
input[b, strides[1] * i + di, strides[2] * j + dj, q] *
depthwise_filter[di, dj, q, r] *
pointwise_filter[0, 0, q * channel_multiplier + r, k]
`strides` controls the strides for the depthwise convolution only, since
the pointwise convolution has implicit strides of `[1, 1, 1, 1]`. Must have
`strides[0] = strides[3] = 1`. For the most common case of the same
horizontal and vertical strides, `strides = [1, stride, stride, 1]`.
##### Args:
* `input`: 4-D `Tensor` with shape `[batch, in_height, in_width, in_channels]`.
* `depthwise_filter`: 4-D `Tensor` with shape
`[filter_height, filter_width, in_channels, channel_multiplier]`.
Contains `in_channels` convolutional filters of depth 1.
* `pointwise_filter`: 4-D `Tensor` with shape
`[1, 1, channel_multiplier * in_channels, out_channels]`. Pointwise
filter to mix channels after `depthwise_filter` has convolved spatially.
* `strides`: 1-D of size 4. The strides for the depthwise convolution for
each dimension of `input`.
* `padding`: A string, either `'VALID'` or `'SAME'`. The padding algorithm.
* `name`: A name for this operation (optional).
##### Returns:
A 4-D `Tensor` of shape `[batch, out_height, out_width, out_channels]`.
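The separable sum over `di, dj, q, r` can also be written out directly. This is a sketch on nested Python lists under simplifying assumptions (`'VALID'` padding only, one stride for both spatial dimensions); the helper name `separable_conv2d_valid` is hypothetical.

```python
def separable_conv2d_valid(inp, depthwise_filter, pointwise_filter, stride=1):
    """Direct evaluation of the separable sum with 'VALID' padding.

    inp:              shape [batch, in_height, in_width, in_channels]
    depthwise_filter: shape [filter_height, filter_width, in_channels, channel_multiplier]
    pointwise_filter: shape [1, 1, channel_multiplier * in_channels, out_channels]
    """
    batch, in_h, in_w = len(inp), len(inp[0]), len(inp[0][0])
    f_h, f_w = len(depthwise_filter), len(depthwise_filter[0])
    in_c, mult = len(depthwise_filter[0][0]), len(depthwise_filter[0][0][0])
    out_c = len(pointwise_filter[0][0][0])
    out_h = -(-(in_h - f_h + 1) // stride)  # ceiling division
    out_w = -(-(in_w - f_w + 1) // stride)
    out = [[[[0.0] * out_c for _ in range(out_w)]
            for _ in range(out_h)] for _ in range(batch)]
    for b in range(batch):
        for i in range(out_h):
            for j in range(out_w):
                for k in range(out_c):
                    out[b][i][j][k] = sum(
                        inp[b][stride * i + di][stride * j + dj][q]
                        * depthwise_filter[di][dj][q][r]
                        * pointwise_filter[0][0][q * mult + r][k]
                        for di in range(f_h)
                        for dj in range(f_w)
                        for q in range(in_c)
                        for r in range(mult))
    return out

inp = [[[[1.0, 2.0], [3.0, 4.0]]]]                        # shape [1, 1, 2, 2]
depthwise = [[[[1.0, 10.0], [1.0, 10.0]],
              [[1.0, 10.0], [1.0, 10.0]]]]                 # shape [1, 2, 2, 2]
pointwise = [[[[1.0, 0.0], [1.0, 0.0],
               [1.0, 0.0], [1.0, 1.0]]]]                   # shape [1, 1, 4, 2]
result = separable_conv2d_valid(inp, depthwise, pointwise)
```

In this example the depthwise stage would produce the four channels `[4, 40, 6, 60]`; the first pointwise output sums all of them and the second picks only the last.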
## Pooling
The pooling ops sweep a rectangular window over the input tensor, computing a
reduction operation for each window (average, max, or max with argmax). Each
pooling op uses rectangular windows of size `ksize` separated by offset
`strides`. For example, if `strides` is all ones, every window is used; if
`strides` is all twos, every other window is used in each dimension; and so on.
In detail, the output is
output[i] = reduce(value[strides * i:strides * i + ksize])
for each tuple of indices `i`. The output shape is
shape(output) = (shape(value) - ksize + 1) / strides
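where the result is rounded up and the numerator depends on padding:
* `padding = 'VALID'`: Only full-size windows are considered, so the formula above applies as written.
* `padding = 'SAME'`: Partial windows at the edges are included, and the output shape is `ceil(shape(value) / strides)`.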
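The `output[i] = reduce(value[strides * i:strides * i + ksize])` rule can be sketched for a single 2-D plane (one batch element, one channel). This is an illustration in plain Python, assuming `'VALID'`-style full windows only and a square window; `pool2d_valid` and `mean` are hypothetical helper names.

```python
def pool2d_valid(plane, ksize, stride, reduce_fn):
    """Pool one [height][width] plane using only full windows."""
    in_h, in_w = len(plane), len(plane[0])
    out_h = -(-(in_h - ksize + 1) // stride)  # ceiling division
    out_w = -(-(in_w - ksize + 1) // stride)
    return [[reduce_fn([plane[stride * i + di][stride * j + dj]
                        for di in range(ksize)
                        for dj in range(ksize)])
             for j in range(out_w)]
            for i in range(out_h)]

def mean(values):
    return sum(values) / len(values)

plane = [[1.0, 2.0, 3.0, 4.0],
         [5.0, 6.0, 7.0, 8.0],
         [9.0, 10.0, 11.0, 12.0],
         [13.0, 14.0, 15.0, 16.0]]
max_pooled = pool2d_valid(plane, 2, 2, max)
avg_pooled = pool2d_valid(plane, 2, 2, mean)
```

Passing `max` or `mean` as `reduce_fn` corresponds to max pooling and average pooling respectively.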
- - -
### `tf.nn.avg_pool(value, ksize, strides, padding, name=None)`
Performs the average pooling on the input.
Each entry in `output` is the mean of the corresponding size `ksize`
window in `value`.
##### Args:
* `value`: A 4-D `Tensor` of shape `[batch, height, width, channels]` and type
`float32`, `float64`, `qint8`, `quint8`, or `qint32`.
* `ksize`: A list of ints that has length >= 4.
The size of the window for each dimension of the input tensor.
* `strides`: A list of ints that has length >= 4.
The stride of the sliding window for each dimension of the
input tensor.
* `padding`: A string, either `'VALID'` or `'SAME'`. The padding algorithm.
* `name`: Optional name for the operation.
##### Returns:
A `Tensor` with the same type as `value`. The average pooled output tensor.
- - -
### `tf.nn.max_pool(value, ksize, strides, padding, name=None)`
Performs the max pooling on the input.
##### Args:
* `value`: A 4-D `Tensor` with shape `[batch, height, width, channels]` and
type `float32`, `float64`, `qint8`, `quint8`, or `qint32`.
* `ksize`: A list of ints that has length >= 4. The size of the window for
each dimension of the input tensor.
* `strides`: A list of ints that has length >= 4. The stride of the sliding
window for each dimension of the input tensor.
* `padding`: A string, either `'VALID'` or `'SAME'`. The padding algorithm.
* `name`: Optional name for the operation.
##### Returns:
A `Tensor` with the same type as `value`. The max pooled output tensor.
- - -
### `tf.nn.max_pool_with_argmax(input, ksize, strides, padding, Targmax=None, name=None)`
Performs max pooling on the input and outputs both max values and indices.
The indices in `argmax` are flattened, so that a maximum value at position
`[b, y, x, c]` becomes flattened index
`((b * height + y) * width + x) * channels + c`.
##### Args:
* `input`: A `Tensor` of type `float32`.
4-D with shape `[batch, height, width, channels]`. Input to pool over.
* `ksize`: A list of `ints` that has length `>= 4`.
The size of the window for each dimension of the input tensor.
* `strides`: A list of `ints` that has length `>= 4`.
The stride of the sliding window for each dimension of the
input tensor.
* `padding`: A `string` from: `"SAME", "VALID"`.
The type of padding algorithm to use.
* `Targmax`: An optional `tf.DType` from: `tf.int32, tf.int64`. Defaults to `tf.int64`.
* `name`: A name for the operation (optional).
##### Returns:
A tuple of `Tensor` objects (output, argmax).
* `output`: A `Tensor` of type `float32`. The max pooled output tensor.
* `argmax`: A `Tensor` of type `Targmax`. 4-D. The flattened indices of the max values chosen for each output.
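The flattened index `((b * height + y) * width + x) * channels + c` can be computed and inverted with ordinary integer arithmetic. The helper names below are illustrative and not part of TensorFlow.

```python
def flatten_index(b, y, x, c, height, width, channels):
    """Flattened position of [b, y, x, c] in a [batch, height, width, channels] tensor."""
    return ((b * height + y) * width + x) * channels + c

def unflatten_index(flat, height, width, channels):
    """Inverse of flatten_index: recover (b, y, x, c)."""
    flat, c = divmod(flat, channels)
    flat, x = divmod(flat, width)
    b, y = divmod(flat, height)
    return b, y, x, c

flat = flatten_index(1, 2, 3, 0, height=4, width=5, channels=3)
position = unflatten_index(flat, height=4, width=5, channels=3)
```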
## Normalization
Normalization is useful to prevent neurons from saturating when inputs may
have varying scale, and to aid generalization.
- - -
### `tf.nn.l2_normalize(x, dim, epsilon=1e-12, name=None)`
Normalizes along dimension `dim` using an L2 norm.
For a 1-D tensor with `dim = 0`, computes
output = x / sqrt(max(sum(x**2), epsilon))
For `x` with more dimensions, independently normalizes each 1-D slice along
dimension `dim`.
##### Args:
* `x`: A `Tensor`.
* `dim`: Dimension along which to normalize.
* `epsilon`: A lower bound value for the norm. Will use `sqrt(epsilon)` as the
divisor if `norm < sqrt(epsilon)`.
* `name`: A name for this operation (optional).
##### Returns:
A `Tensor` with the same shape as `x`.
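For the 1-D case, the formula `x / sqrt(max(sum(x**2), epsilon))` can be sketched in plain Python as follows (the helper name `l2_normalize_1d` is an assumption for illustration):

```python
import math

def l2_normalize_1d(x, epsilon=1e-12):
    """Normalize a list of floats to unit L2 norm, with a lower bound on the norm."""
    norm = math.sqrt(max(sum(v * v for v in x), epsilon))
    return [v / norm for v in x]

unit = l2_normalize_1d([3.0, 4.0])
zero = l2_normalize_1d([0.0, 0.0])
```

The `epsilon` bound is what keeps the all-zero input from dividing by zero: the divisor becomes `sqrt(epsilon)` and the output stays at zero.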
- - -
### `tf.nn.local_response_normalization(input, depth_radius=None, bias=None, alpha=None, beta=None, name=None)`
Local Response Normalization.
The 4-D `input` tensor is treated as a 3-D array of 1-D vectors (along the last
dimension), and each vector is normalized independently. Within a given vector,
each component is divided by the weighted, squared sum of inputs within
`depth_radius`. In detail,
sqr_sum[a, b, c, d] =
sum(input[a, b, c, d - depth_radius : d + depth_radius + 1] ** 2)
output = input / (bias + alpha * sqr_sum) ** beta
For details, see [Krizhevsky et al., ImageNet classification with deep convolutional neural networks (NIPS 2012)](http://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks).
##### Args:
* `input`: A `Tensor` of type `float32`. 4-D.
* `depth_radius`: An optional `int`. Defaults to `5`.
0-D. Half-width of the 1-D normalization window.
* `bias`: An optional `float`. Defaults to `1`.
An offset (usually positive to avoid dividing by 0).
* `alpha`: An optional `float`. Defaults to `1`.
A scale factor, usually positive.
* `beta`: An optional `float`. Defaults to `0.5`. An exponent.
* `name`: A name for the operation (optional).
##### Returns:
A `Tensor` of type `float32`.
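A sketch of the normalization for a single 1-D vector (one position along the last dimension) is shown below. It assumes that the exponent `beta` applies to the whole denominator `bias + alpha * sqr_sum`, as in Krizhevsky et al., and that the window is simply clipped at the ends of the vector; `lrn_1d` is a hypothetical helper, not the TensorFlow kernel.

```python
def lrn_1d(vec, depth_radius=5, bias=1.0, alpha=1.0, beta=0.5):
    """Local response normalization of one depth vector (illustrative)."""
    out = []
    for d in range(len(vec)):
        lo = max(0, d - depth_radius)               # clip window at the start
        hi = min(len(vec), d + depth_radius + 1)    # clip window at the end
        sqr_sum = sum(v * v for v in vec[lo:hi])
        out.append(vec[d] / (bias + alpha * sqr_sum) ** beta)
    return out

normalized = lrn_1d([1.0, 2.0, 2.0], depth_radius=1, bias=0.0)
```

With `bias=0`, `alpha=1`, and `beta=0.5`, the middle component is divided by `sqrt(1 + 4 + 4) = 3`.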
- - -
### `tf.nn.moments(x, axes, name=None)`
Calculate the mean and variance of `x`.
The mean and variance are calculated by aggregating the contents of `x`
across `axes`. If `x` is 1-D and `axes = [0]` this is just the mean
and variance of a vector.
For so-called "global normalization", as needed for convolutional filters, pass
`axes=[0, 1, 2]` (batch, height, width). For batch normalization, pass
`axes=[0]` (batch).
##### Args:
* `x`: A `Tensor`.
* `axes`: array of ints. Axes along which to compute mean and
variance.
* `name`: Name used to scope the operations that compute the moments.
##### Returns:
Two `Tensors`: `mean` and `variance`.
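For a 2-D input of shape `[batch, features]` and `axes=[0]`, the aggregation can be sketched in plain Python. The sketch assumes the variance is the population variance (dividing by the number of aggregated elements); `moments_axis0` is a hypothetical helper name.

```python
def moments_axis0(x):
    """Per-feature mean and population variance over the batch axis."""
    n = len(x)
    columns = list(zip(*x))  # one tuple of values per feature
    mean = [sum(col) / n for col in columns]
    variance = [sum((v - m) ** 2 for v in col) / n
                for col, m in zip(columns, mean)]
    return mean, variance

mean, variance = moments_axis0([[1.0, 10.0],
                                [3.0, 30.0]])
```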
## Losses
The loss ops measure error between two tensors, or between a tensor and zero.
These can be used for measuring accuracy of a network in a regression task
or for regularization purposes (weight decay).
- - -
### `tf.nn.l2_loss(t, name=None)`
L2 Loss.
Computes half the L2 norm of a tensor without the `sqrt`:
output = sum(t ** 2) / 2
##### Args:
* `t`: A `Tensor`. Must be one of the following types: `float32`, `float64`, `int64`, `int32`, `uint8`, `int16`, `int8`, `complex64`, `qint8`, `quint8`, `qint32`.
Typically 2-D, but may have any dimensions.
* `name`: A name for the operation (optional).
##### Returns:
A `Tensor`. Has the same type as `t`. 0-D.
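The formula `sum(t ** 2) / 2` is simple enough to check by hand. A minimal plain-Python sketch for a flat list of values (the helper name is illustrative):

```python
def l2_loss(t):
    """Half the sum of squares of a flat list of numbers."""
    return sum(v * v for v in t) / 2

loss = l2_loss([3.0, 4.0])  # (9 + 16) / 2
```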
## Classification
TensorFlow provides several operations that help you perform classification.
- - -
### `tf.nn.sigmoid_cross_entropy_with_logits(logits, targets, name=None)`
Computes sigmoid cross entropy given `logits`.
Measures the probability error in discrete classification tasks in which each
class is independent and not mutually exclusive. For instance, one could
perform multilabel classification where a picture can contain both an elephant
and a dog at the same time.
For brevity, let `x = logits`, `z = targets`. The logistic loss is
x - x * z + log(1 + exp(-x))
To ensure stability and avoid overflow, the implementation uses
max(x, 0) - x * z + log(1 + exp(-abs(x)))
`logits` and `targets` must have the same type and shape.
##### Args:
* `logits`: A `Tensor` of type `float32` or `float64`.
* `targets`: A `Tensor` of the same type and shape as `logits`.
* `name`: A name for the operation (optional).
##### Returns:
A `Tensor` of the same shape as `logits` with the componentwise
logistic losses.
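The two expressions for the logistic loss agree wherever the first one can be evaluated, which the following plain-Python sketch checks for scalar inputs. The function names are illustrative; only the formulas come from the text above.

```python
import math

def logistic_loss_naive(x, z):
    """x - x * z + log(1 + exp(-x)); overflows for very negative x."""
    return x - x * z + math.log(1.0 + math.exp(-x))

def logistic_loss_stable(x, z):
    """max(x, 0) - x * z + log(1 + exp(-abs(x))); exp never sees a positive argument."""
    return max(x, 0.0) - x * z + math.log(1.0 + math.exp(-abs(x)))

pairs = [(x, z) for x in (-3.0, 0.5, 4.0) for z in (0.0, 1.0)]
max_gap = max(abs(logistic_loss_naive(x, z) - logistic_loss_stable(x, z))
              for x, z in pairs)
extreme = logistic_loss_stable(-1000.0, 1.0)
```

For `x = -1000` the naive form would call `exp(1000)` and overflow, while the stable form evaluates `exp(-1000)`, which underflows harmlessly to zero.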
- - -
### `tf.nn.softmax(logits, name=None)`
Computes softmax activations.
For each batch `i` and class `j` we have
softmax[i, j] = exp(logits[i, j]) / sum(exp(logits[i]))
##### Args:
* `logits`: A `Tensor`. Must be one of the following types: `float32`, `float64`.
2-D with shape `[batch_size, num_classes]`.
* `name`: A name for the operation (optional).
##### Returns:
A `Tensor`. Has the same type as `logits`. Same shape as `logits`.
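A row-wise sketch of the softmax formula in plain Python follows. Subtracting the row maximum before exponentiating is an addition made here for numerical safety; it does not change the result, and it is not claimed to be how the TensorFlow op is implemented.

```python
import math

def softmax(logits):
    """Row-wise softmax for a [batch_size][num_classes] list of floats."""
    out = []
    for row in logits:
        shift = max(row)  # exp(v - shift) cannot overflow
        exps = [math.exp(v - shift) for v in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out

probs = softmax([[0.0, 0.0], [1000.0, 1000.0], [1.0, 2.0, 3.0]])
```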
- - -
### `tf.nn.softmax_cross_entropy_with_logits(logits, labels, name=None)`
Computes softmax cross entropy between `logits` and `labels`.
Measures the probability error in discrete classification tasks in which the
classes are mutually exclusive (each entry is in exactly one class). For
example, each CIFAR-10 image is labeled with one and only one label: an image
can be a dog or a truck, but not both.
**WARNING:** This op expects unscaled logits, since it performs a `softmax`
on `logits` internally for efficiency. Do not call this op with the
output of `softmax`, as it will produce incorrect results.
`logits` and `labels` must have the same shape `[batch_size, num_classes]`
and the same dtype (either `float32` or `float64`).
##### Args:
* `logits`: Unscaled log probabilities.
* `labels`: Each row `labels[i]` must be a valid probability distribution.
* `name`: A name for the operation (optional).
##### Returns:
A 1-D `Tensor` of length `batch_size` of the same type as `logits` with the
softmax cross entropy loss.
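As a sketch of what this loss computes for each row, the cross entropy `-sum(labels * log(softmax(logits)))` can be evaluated from the unscaled logits in plain Python. The log-sum-exp shift is an assumption added here for numerical safety, and `softmax_xent_sketch` is a hypothetical name rather than the TensorFlow implementation.

```python
import math

def softmax_xent_sketch(logits, labels):
    """Per-example softmax cross entropy from unscaled logits."""
    losses = []
    for row, target in zip(logits, labels):
        shift = max(row)
        log_sum_exp = shift + math.log(sum(math.exp(v - shift) for v in row))
        log_probs = [v - log_sum_exp for v in row]  # log(softmax(row))
        losses.append(-sum(t * lp for t, lp in zip(target, log_probs)))
    return losses

losses = softmax_xent_sketch([[0.0, 0.0], [1000.0, 0.0]],
                             [[1.0, 0.0], [1.0, 0.0]])
```

Equal logits give a loss of `log(2)` for two classes, and a confidently correct prediction gives a loss near zero.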
## Embeddings
TensorFlow provides library support for looking up values in embedding
tensors.
- - -
### `tf.nn.embedding_lookup(params, ids, name=None)`
Looks up `ids` in a list of embedding tensors.
This function is used to perform parallel lookups on the list of
tensors in `params`. It is a generalization of
[`tf.gather()`](../../api_docs/python/array_ops.md#gather), where `params` is
interpreted as a partition of a larger embedding tensor.
If `len(params) > 1`, each element `id` of `ids` is partitioned between
the elements of `params` by computing `p = id % len(params)`, and is
then used to look up the slice `params[p][id // len(params), ...]`.
The results of the lookup are then concatenated into a dense
tensor. The returned tensor has shape `shape(ids) + shape(params)[1:]`.
##### Args:
* `params`: A list of tensors with the same shape and type.
* `ids`: A `Tensor` with type `int32` containing the ids to be looked
up in `params`.
* `name`: A name for the operation (optional).
##### Returns:
A `Tensor` with the same type as the tensors in `params`.
##### Raises:
* `ValueError`: If `params` is empty.
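The partitioning rule `p = id % len(params)` followed by a lookup of row `id // len(params)` can be sketched with nested Python lists. The helper below is illustrative only and handles a flat list of ids.

```python
def embedding_lookup_sketch(params, ids):
    """Look up each id in a list of equally sized partitions."""
    num_parts = len(params)
    if num_parts == 0:
        raise ValueError("params must not be empty")
    return [params[i % num_parts][i // num_parts] for i in ids]

# Six embedding rows split over two partitions:
# partition 0 holds ids 0, 2, 4 and partition 1 holds ids 1, 3, 5.
params = [
    [[0.0, 0.1], [2.0, 2.1], [4.0, 4.1]],
    [[1.0, 1.1], [3.0, 3.1], [5.0, 5.1]],
]
rows = embedding_lookup_sketch(params, [3, 0, 4])
```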
## Evaluation
The evaluation ops are useful for measuring the performance of a network.
Since they are nondifferentiable, they are typically used at evaluation time.
- - -
### `tf.nn.top_k(input, k, name=None)`
Returns the values and indices of the k largest elements for each row.
\\(values_{i, j}\\) represents the j-th largest element in \\(input_i\\).
\\(indices_{i, j}\\) gives the column index of the corresponding element,
such that \\(input_{i, indices_{i, j}} = values_{i, j}\\). If two
elements are equal, the lower-index element appears first.
##### Args:
* `input`: A `Tensor`. Must be one of the following types: `float32`, `float64`, `int32`, `int64`, `uint8`, `int16`, `int8`.
A batch_size x classes tensor
* `k`: An `int` that is `>= 1`.
Number of top elements to look for within each row
* `name`: A name for the operation (optional).
##### Returns:
A tuple of `Tensor` objects (values, indices).
* `values`: A `Tensor`. Has the same type as `input`. A batch_size x k tensor with the k largest elements for each row,
sorted in descending order
* `indices`: A `Tensor` of type `int32`. A batch_size x k tensor with the index of each value within each row
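A plain-Python sketch of the row-wise selection, including the rule that the lower index comes first among equal elements, might look like this (`top_k_sketch` is a hypothetical helper):

```python
def top_k_sketch(rows, k):
    """Values and column indices of the k largest entries in each row."""
    values, indices = [], []
    for row in rows:
        # Sort by descending value; ties are broken by the lower index.
        order = sorted(range(len(row)), key=lambda j: (-row[j], j))[:k]
        indices.append(order)
        values.append([row[j] for j in order])
    return values, indices

values, indices = top_k_sketch([[1.0, 3.0, 3.0, 2.0]], k=2)
```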
- - -
### `tf.nn.in_top_k(predictions, targets, k, name=None)`
Says whether the targets are in the top K predictions.
This outputs a bool array of length `batch_size`. An entry `out[i]` is true if the
prediction for the target class is among the top `k` predictions
for example `i`. Note that the behavior of InTopK differs
from the TopK op in its handling of ties; if multiple classes have the
same prediction value and straddle the top-k boundary, all of those
classes are considered to be in the top k.
More formally, let
\\(predictions_i\\) be the predictions for all classes for example i,
\\(targets_i\\) be the target class for example i,
\\(out_i\\) be the output for example i,
$$out_i = predictions_{i, targets_i} \in TopKIncludingTies(predictions_i)$$
##### Args:
* `predictions`: A `Tensor` of type `float32`. A batch_size x classes tensor
* `targets`: A `Tensor` of type `int32`. A batch_size vector of class ids
* `k`: An `int`. Number of top elements to look at for computing precision
* `name`: A name for the operation (optional).
##### Returns:
A `Tensor` of type `bool`. Computed Precision at k as a bool Tensor
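One way to express the tie rule is to count how many classes score strictly higher than the target: the target is in the top `k`, ties included, when fewer than `k` classes beat it. The sketch below follows that reading of the description; it is an assumption about the semantics, not the actual kernel.

```python
def in_top_k_sketch(predictions, targets, k):
    """True where the target class is in the top k, counting ties as included."""
    out = []
    for row, target in zip(predictions, targets):
        strictly_greater = sum(1 for p in row if p > row[target])
        out.append(strictly_greater < k)
    return out

predictions = [[0.1, 0.5, 0.5, 0.9]] * 3
hits = in_top_k_sketch(predictions, targets=[1, 2, 0], k=2)
```

Classes 1 and 2 are tied across the top-2 boundary, so both count as hits, while class 0 does not.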
## Candidate Sampling
Do you want to train a multiclass or multilabel model with thousands
or millions of output classes (for example, a language model with a
large vocabulary)? Training with a full Softmax is slow in this case,
since all of the classes are evaluated for every training example.
Candidate Sampling training algorithms can speed up your step times by
only considering a small randomly-chosen subset of contrastive classes
(called candidates) for each batch of training examples.
See our [Candidate Sampling Algorithms Reference](../../extras/candidate_sampling.pdf).
### Sampled Loss Functions
TensorFlow provides the following sampled loss functions for faster training.
- - -
### `tf.nn.nce_loss(weights, biases, inputs, labels, num_sampled, num_classes, num_true=1, sampled_values=None, remove_accidental_hits=False, name='nce_loss')`
Computes and returns the noise-contrastive estimation training loss.
See [Noise-contrastive estimation: A new estimation principle for unnormalized statistical models](http://www.jmlr.org/proceedings/papers/v9/gutmann10a/gutmann10a.pdf).
Also see our [Candidate Sampling Algorithms Reference](http://www.tensorflow.org/extras/candidate_sampling.pdf).
Note: In the case where num_true > 1, we assign to each target class
the target probability 1 / num_true so that the target probabilities
sum to 1 per-example.
Note: It would be useful to allow a variable number of target classes per
example. We hope to provide this functionality in a future release.
For now, if you have a variable number of target classes, you can pad them
out to a constant number by either repeating them or by padding
with an otherwise unused class.
##### Args:
* `weights`: A `Tensor` of shape [num_classes, dim]. The class embeddings.
* `biases`: A `Tensor` of shape [num_classes]. The class biases.
* `inputs`: A `Tensor` of shape [batch_size, dim]. The forward
activations of the input network.
* `labels`: A `Tensor` of type `int64` and shape `[batch_size,
num_true]`. The target classes.
* `num_sampled`: An `int`. The number of classes to randomly sample per batch.
* `num_classes`: An `int`. The number of possible classes.
* `num_true`: An `int`. The number of target classes per training example.
* `sampled_values`: a tuple of `(sampled_candidates, true_expected_count,
sampled_expected_count)` returned by a *_candidate_sampler function.
(if None, we default to LogUniformCandidateSampler)
* `remove_accidental_hits`: A `bool`. Whether to remove "accidental hits"
where a sampled class equals one of the target classes. If set to
`True`, this is a "Sampled Logistic" loss instead of NCE, and we are
learning to generate log-odds instead of log probabilities. See
our [Candidate Sampling Algorithms Reference](http://www.tensorflow.org/extras/candidate_sampling.pdf).
Default is False.
* `name`: A name for the operation (optional).
##### Returns:
A batch_size 1-D tensor of per-example NCE losses.
- - -
### `tf.nn.sampled_softmax_loss(weights, biases, inputs, labels, num_sampled, num_classes, num_true=1, sampled_values=None, remove_accidental_hits=True, name='sampled_softmax_loss')`
Computes and returns the sampled softmax training loss.
This is a faster way to train a softmax classifier over a huge number of
classes.
This operation is for training only. It is generally an underestimate of
the full softmax loss.
At inference time, you can compute full softmax probabilities with the
expression `tf.nn.softmax(tf.matmul(inputs, weights) + biases)`.
See our [Candidate Sampling Algorithms Reference](http://www.tensorflow.org/extras/candidate_sampling.pdf).
Also see Section 3 of http://arxiv.org/abs/1412.2007 for the math.
##### Args:
* `weights`: A `Tensor` of shape [num_classes, dim]. The class embeddings.
* `biases`: A `Tensor` of shape [num_classes]. The class biases.
* `inputs`: A `Tensor` of shape [batch_size, dim]. The forward
activations of the input network.
* `labels`: A `Tensor` of type `int64` and shape `[batch_size,
num_true]`. The target classes. Note that this format differs from
the `labels` argument of `nn.softmax_cross_entropy_with_logits`.
* `num_sampled`: An `int`. The number of classes to randomly sample per batch.
* `num_classes`: An `int`. The number of possible classes.
* `num_true`: An `int`. The number of target classes per training example.
* `sampled_values`: a tuple of `(sampled_candidates, true_expected_count,
sampled_expected_count)` returned by a *_candidate_sampler function.
(if None, we default to LogUniformCandidateSampler)
* `remove_accidental_hits`: A `bool`. Whether to remove "accidental hits"
where a sampled class equals one of the target classes. Default is
True.
* `name`: A name for the operation (optional).
##### Returns:
A batch_size 1-D tensor of per-example sampled softmax losses.
### Candidate Samplers
TensorFlow provides the following samplers for randomly sampling candidate
classes when using one of the sampled loss functions above.
- - -
### `tf.nn.uniform_candidate_sampler(true_classes, num_true, num_sampled, unique, range_max, seed=None, name=None)`
Samples a set of classes using a uniform base distribution.
This operation randomly samples a tensor of sampled classes
(`sampled_candidates`) from the range of integers `[0, range_max)`.
The elements of `sampled_candidates` are drawn without replacement
(if `unique=True`) or with replacement (if `unique=False`) from
the base distribution.
The base distribution for this operation is the uniform distribution
over the range of integers `[0, range_max)`.
In addition, this operation returns tensors `true_expected_count`
and `sampled_expected_count` representing the number of times each
of the target classes (`true_classes`) and the sampled
classes (`sampled_candidates`) is expected to occur in an average
tensor of sampled classes. These values correspond to `Q(y|x)`
defined in [this
document](http://www.tensorflow.org/extras/candidate_sampling.pdf).
If `unique=True`, then these are post-rejection probabilities and we
compute them approximately.
##### Args:
* `true_classes`: A `Tensor` of type `int64` and shape `[batch_size,
num_true]`. The target classes.
* `num_true`: An `int`. The number of target classes per training example.
* `num_sampled`: An `int`. The number of classes to randomly sample per batch.
* `unique`: A `bool`. Determines whether all sampled classes in a batch are
unique.
* `range_max`: An `int`. The number of possible classes.
* `seed`: An `int`. An operation-specific seed. Default is 0.
* `name`: A name for the operation (optional).
##### Returns:
* `sampled_candidates`: A tensor of type `int64` and shape `[num_sampled]`.
The sampled classes.
* `true_expected_count`: A tensor of type `float`. Same shape as
`true_classes`. The expected counts under the sampling distribution
of each of `true_classes`.
* `sampled_expected_count`: A tensor of type `float`. Same shape as
`sampled_candidates`. The expected counts under the sampling distribution
of each of `sampled_candidates`.
- - -
### `tf.nn.log_uniform_candidate_sampler(true_classes, num_true, num_sampled, unique, range_max, seed=None, name=None)`
Samples a set of classes using a log-uniform (Zipfian) base distribution.
This operation randomly samples a tensor of sampled classes
(`sampled_candidates`) from the range of integers `[0, range_max)`.
The elements of `sampled_candidates` are drawn without replacement
(if `unique=True`) or with replacement (if `unique=False`) from
the base distribution.
The base distribution for this operation is an approximately log-uniform
or Zipfian distribution:
`P(class) = (log(class + 2) - log(class + 1)) / log(range_max + 1)`
This sampler is useful when the target classes approximately follow such
a distribution - for example, if the classes represent words in a lexicon
sorted in decreasing order of frequency. If your classes are not ordered by
decreasing frequency, do not use this op.
In addition, this operation returns tensors `true_expected_count`
and `sampled_expected_count` representing the number of times each
of the target classes (`true_classes`) and the sampled
classes (`sampled_candidates`) is expected to occur in an average
tensor of sampled classes. These values correspond to `Q(y|x)`
defined in [this
document](http://www.tensorflow.org/extras/candidate_sampling.pdf).
If `unique=True`, then these are post-rejection probabilities and we
compute them approximately.
##### Args:
* `true_classes`: A `Tensor` of type `int64` and shape `[batch_size,
num_true]`. The target classes.
* `num_true`: An `int`. The number of target classes per training example.
* `num_sampled`: An `int`. The number of classes to randomly sample per batch.
* `unique`: A `bool`. Determines whether all sampled classes in a batch are
unique.
* `range_max`: An `int`. The number of possible classes.
* `seed`: An `int`. An operation-specific seed. Default is 0.
* `name`: A name for the operation (optional).
##### Returns:
* `sampled_candidates`: A tensor of type `int64` and shape `[num_sampled]`.
The sampled classes.
* `true_expected_count`: A tensor of type `float`. Same shape as
`true_classes`. The expected counts under the sampling distribution
of each of `true_classes`.
* `sampled_expected_count`: A tensor of type `float`. Same shape as
`sampled_candidates`. The expected counts under the sampling distribution
of each of `sampled_candidates`.
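The base distribution `P(class) = (log(class + 2) - log(class + 1)) / log(range_max + 1)` telescopes, so it sums to 1 when `class` runs over the `range_max` integers starting at 0. The following plain-Python check illustrates this; taking the classes to be `0` through `range_max - 1` is an assumption based on `range_max` being the number of possible classes.

```python
import math

def log_uniform_prob(cls, range_max):
    """Probability of one class under the approximately log-uniform distribution."""
    return (math.log(cls + 2) - math.log(cls + 1)) / math.log(range_max + 1)

range_max = 1000
probs = [log_uniform_prob(c, range_max) for c in range(range_max)]
total = sum(probs)
```

The probabilities decrease with the class id, which is why the classes should be sorted in decreasing order of frequency.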
- - -
### `tf.nn.learned_unigram_candidate_sampler(true_classes, num_true, num_sampled, unique, range_max, seed=None, name=None)`
Samples a set of classes from a distribution learned during training.
This operation randomly samples a tensor of sampled classes
(`sampled_candidates`) from the range of integers `[0, range_max)`.
The elements of `sampled_candidates` are drawn without replacement
(if `unique=True`) or with replacement (if `unique=False`) from
the base distribution.
The base distribution for this operation is constructed on the fly
during training. It is a unigram distribution over the target
classes seen so far during training. Every integer in `[0, range_max)`
begins with a weight of 1, and is incremented by 1 each time it is
seen as a target class. The base distribution is not saved to checkpoints,
so it is reset when the model is reloaded.
In addition, this operation returns tensors `true_expected_count`
and `sampled_expected_count` representing the number of times each
of the target classes (`true_classes`) and the sampled
classes (`sampled_candidates`) is expected to occur in an average
tensor of sampled classes. These values correspond to `Q(y|x)`
defined in [this
document](http://www.tensorflow.org/extras/candidate_sampling.pdf).
If `unique=True`, then these are post-rejection probabilities and we
compute them approximately.
##### Args:
* `true_classes`: A `Tensor` of type `int64` and shape `[batch_size,
num_true]`. The target classes.
* `num_true`: An `int`. The number of target classes per training example.
* `num_sampled`: An `int`. The number of classes to randomly sample per batch.
* `unique`: A `bool`. Determines whether all sampled classes in a batch are
unique.
* `range_max`: An `int`. The number of possible classes.
* `seed`: An `int`. An operation-specific seed. Default is 0.
* `name`: A name for the operation (optional).
##### Returns:
* `sampled_candidates`: A tensor of type `int64` and shape `[num_sampled]`.
The sampled classes.
* `true_expected_count`: A tensor of type `float`. Same shape as
`true_classes`. The expected counts under the sampling distribution
of each of `true_classes`.
* `sampled_expected_count`: A tensor of type `float`. Same shape as
`sampled_candidates`. The expected counts under the sampling distribution
of each of `sampled_candidates`.
- - -
### `tf.nn.fixed_unigram_candidate_sampler(true_classes, num_true, num_sampled, unique, range_max, vocab_file='', distortion=0.0, num_reserved_ids=0, num_shards=1, shard=0, unigrams=[], seed=None, name=None)`
Samples a set of classes using the provided (fixed) base distribution.
This operation randomly samples a tensor of sampled classes
(`sampled_candidates`) from the range of integers `[0, range_max)`.
The elements of `sampled_candidates` are drawn without replacement
(if `unique=True`) or with replacement (if `unique=False`) from
the base distribution.
The base distribution is read from a file or passed in as an
in-memory array. There is also an option to skew the distribution by
applying a distortion power to the weights.
In addition, this operation returns tensors `true_expected_count`
and `sampled_expected_count` representing the number of times each
of the target classes (`true_classes`) and the sampled
classes (`sampled_candidates`) is expected to occur in an average
tensor of sampled classes. These values correspond to `Q(y|x)`
defined in [this
document](http://www.tensorflow.org/extras/candidate_sampling.pdf).
If `unique=True`, then these are post-rejection probabilities and we
compute them approximately.
##### Args:
* `true_classes`: A `Tensor` of type `int64` and shape `[batch_size,
num_true]`. The target classes.
* `num_true`: An `int`. The number of target classes per training example.
* `num_sampled`: An `int`. The number of classes to randomly sample per batch.
* `unique`: A `bool`. Determines whether all sampled classes in a batch are
unique.
* `range_max`: An `int`. The number of possible classes.
* `vocab_file`: Each valid line in this file (which should have a CSV-like
format) corresponds to a valid word ID. IDs are in sequential order,
starting from num_reserved_ids. The last entry in each line is expected
to be a value corresponding to the count or relative probability. Exactly
one of `vocab_file` and `unigrams` needs to be passed to this operation.
* `distortion`: The distortion is used to skew the unigram probability
distribution. Each weight is first raised to the distortion's power
before adding to the internal unigram distribution. As a result,
`distortion = 1.0` gives regular unigram sampling (as defined by the vocab
file), and `distortion = 0.0` gives a uniform distribution.
* `num_reserved_ids`: Optionally some reserved IDs can be added in the range
`[0, num_reserved_ids)` by the users. One use case is that a special
unknown word token is used as ID 0. These IDs will have a sampling
probability of 0.
* `num_shards`: A sampler can be used to sample from a subset of the original
range in order to speed up the whole computation through parallelism. This
parameter (together with `shard`) indicates the number of partitions that
are being used in the overall computation.
* `shard`: A sampler can be used to sample from a subset of the original range
in order to speed up the whole computation through parallelism. This
parameter (together with `num_shards`) indicates the particular partition
number of the operation, when partitioning is being used.
* `unigrams`: A list of unigram counts or probabilities, one per ID in
sequential order. Exactly one of `vocab_file` and `unigrams` should be
passed to this operation.
* `seed`: An `int`. An operation-specific seed. Default is 0.
* `name`: A name for the operation (optional).
##### Returns:
* `sampled_candidates`: A tensor of type `int64` and shape `[num_sampled]`.
The sampled classes.
* `true_expected_count`: A tensor of type `float`. Same shape as
`true_classes`. The expected counts under the sampling distribution
of each of `true_classes`.
* `sampled_expected_count`: A tensor of type `float`. Same shape as
`sampled_candidates`. The expected counts under the sampling distribution
of each of `sampled_candidates`.
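
The effect of `distortion` described in the Args above can be illustrated with a minimal pure-Python sketch. The helper name `distorted_distribution` is hypothetical, and normalizing the weights to sum to 1 is an assumption made for illustration; this is not the op's actual implementation.

```python
def distorted_distribution(unigrams, distortion):
    """Raise each unigram weight to `distortion`, then normalize (assumed)."""
    weights = [float(count) ** distortion for count in unigrams]
    total = sum(weights)
    return [weight / total for weight in weights]


# distortion = 1.0 keeps the original unigram proportions.
regular = distorted_distribution([1, 3], 1.0)
# distortion = 0.0 flattens the weights into a uniform distribution.
uniform = distorted_distribution([1, 3], 0.0)
```

Intermediate values such as `distortion = 0.5` would sit between these two extremes.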
### Miscellaneous candidate sampling utilities
- - -
### `tf.nn.compute_accidental_hits(true_classes, sampled_candidates, num_true, seed=None, name=None)`
Compute the ids of positions in sampled_candidates matching true_classes.
In Candidate Sampling, this operation facilitates virtually removing
sampled classes which happen to match target classes. This is done
in Sampled Softmax and Sampled Logistic.
See our [Candidate Sampling Algorithms
Reference](http://www.tensorflow.org/extras/candidate_sampling.pdf).
We presuppose that the `sampled_candidates` are unique.
We call it an 'accidental hit' when one of the target classes
matches one of the sampled classes. This operation reports
accidental hits as triples `(index, id, weight)`, where `index`
represents the row number in `true_classes`, `id` represents the
position in `sampled_candidates`, and weight is `-FLOAT_MAX`.
The result of this op should be passed through a `sparse_to_dense`
operation, then added to the logits of the sampled classes. This
removes the contradictory effect of accidentally sampling the true
target classes as noise classes for the same example.
##### Args:
* `true_classes`: A `Tensor` of type `int64` and shape `[batch_size,
num_true]`. The target classes.
* `sampled_candidates`: A tensor of type `int64` and shape `[num_sampled]`.
The sampled_candidates output of CandidateSampler.
* `num_true`: An `int`. The number of target classes per training example.
* `seed`: An `int`. An operation-specific seed. Default is 0.
* `name`: A name for the operation (optional).
##### Returns:
* `indices`: A `Tensor` of type `int32` and shape `[num_accidental_hits]`.
Values indicate rows in `true_classes`.
* `ids`: A `Tensor` of type `int64` and shape `[num_accidental_hits]`.
Values indicate positions in `sampled_candidates`.
* `weights`: A `Tensor` of type `float` and shape `[num_accidental_hits]`.
Each value is `-FLOAT_MAX`.
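
As a rough illustration of the `(index, id, weight)` triples described above, here is a pure-Python sketch. The function name `accidental_hits`, the output ordering, and the reading of `FLOAT_MAX` as the largest finite 32-bit float are assumptions, not documented behaviour of the op.

```python
# Assumption: FLOAT_MAX is the largest finite 32-bit float.
NEG_FLOAT_MAX = -3.4028234663852886e+38


def accidental_hits(true_classes, sampled_candidates):
    """Return (row, position, weight) for each target that was also sampled."""
    # `sampled_candidates` is presupposed unique, so each class has one position.
    position_of = {cls: pos for pos, cls in enumerate(sampled_candidates)}
    hits = []
    for row, targets in enumerate(true_classes):
        for target in targets:
            if target in position_of:
                hits.append((row, position_of[target], NEG_FLOAT_MAX))
    return hits


hits = accidental_hits([[1, 7], [3, 4]], [7, 5, 3])
```

Here class 7 (row 0) matches position 0 of the sampled classes and class 3 (row 1) matches position 2.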
================================================
FILE: SOURCE/api_docs/python/python_io.md
================================================
# Data IO (Python functions)
## Contents
### [Data IO (Python functions)](#AUTOGENERATED-data-io--python-functions-)
* [Data IO (Python Functions)](#AUTOGENERATED-data-io--python-functions-)
* [`class tf.python_io.TFRecordWriter`](#TFRecordWriter)
* [`tf.python_io.tf_record_iterator(path)`](#tf_record_iterator)
* [TFRecords Format Details](#AUTOGENERATED-tfrecords-format-details)
## Data IO (Python Functions)
A TFRecords file represents a sequence of (binary) strings. The format is not
random access, so it is suitable for streaming large amounts of data but not
suitable if fast sharding or other non-sequential access is desired.
- - -
### `class tf.python_io.TFRecordWriter`
A class to write records to a TFRecords file.
This class implements `__enter__` and `__exit__`, and can be used
in `with` blocks like a normal file.
- - -
#### `tf.python_io.TFRecordWriter.__init__(path)`
Opens file `path` and creates a `TFRecordWriter` writing to it.
##### Args:
* `path`: The path to the TFRecords file.
##### Raises:
* `IOError`: If `path` cannot be opened for writing.
- - -
#### `tf.python_io.TFRecordWriter.write(record)`
Write a string record to the file.
##### Args:
* `record`: str
- - -
#### `tf.python_io.TFRecordWriter.close()`
Close the file.
- - -
### `tf.python_io.tf_record_iterator(path)`
An iterator that reads the records from a TFRecords file.
##### Args:
* `path`: The path to the TFRecords file.
##### Yields:
Strings.
##### Raises:
* `IOError`: If `path` cannot be opened for reading.
- - -
### TFRecords Format Details
A TFRecords file contains a sequence of strings with CRC hashes. Each record
has the format
    uint64 length
    uint32 masked_crc32_of_length
    byte   data[length]
    uint32 masked_crc32_of_data
and the records are concatenated together to produce the file. The CRC32s
are [described here](https://en.wikipedia.org/wiki/Cyclic_redundancy_check),
and the mask of a CRC is
    masked_crc = ((crc >> 15) | (crc << 17)) + 0xa282ead8ul
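
The masking formula above operates on 32-bit unsigned values, so a Python version has to truncate explicitly. The following is a minimal sketch; the helper names `mask_crc` and `unmask_crc` are hypothetical, and `unmask_crc` is simply the arithmetic inverse of the formula rather than something this document describes.

```python
MASK_DELTA = 0xa282ead8
UINT32_MAX = 0xffffffff


def mask_crc(crc):
    """Rotate a 32-bit CRC right by 15 bits and add the constant, mod 2**32."""
    crc &= UINT32_MAX
    rotated = ((crc >> 15) | (crc << 17)) & UINT32_MAX
    return (rotated + MASK_DELTA) & UINT32_MAX


def unmask_crc(masked_crc):
    """Invert mask_crc: subtract the constant, then rotate left by 15 bits."""
    rotated = (masked_crc - MASK_DELTA) & UINT32_MAX
    return ((rotated >> 17) | (rotated << 15)) & UINT32_MAX
```

Computing the CRC itself is left out, since this section only specifies the mask.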
================================================
FILE: SOURCE/api_docs/python/sparse_ops.md
================================================
# Sparse Tensors
Note: Functions taking `Tensor` arguments can also take anything accepted by
[`tf.convert_to_tensor`](../../api_docs/python/framework.md#convert_to_tensor).
## Contents
### [Sparse Tensors](#AUTOGENERATED-sparse-tensors)
* [Sparse Tensor Representation](#AUTOGENERATED-sparse-tensor-representation)
* [`class tf.SparseTensor`](#SparseTensor)
* [`class tf.SparseTensorValue`](#SparseTensorValue)
* [Sparse to Dense Conversion](#AUTOGENERATED-sparse-to-dense-conversion)
* [`tf.sparse_to_dense(sparse_indices, output_shape, sparse_values, default_value, name=None)`](#sparse_to_dense)
* [`tf.sparse_tensor_to_dense(sp_input, default_value, name=None)`](#sparse_tensor_to_dense)
* [`tf.sparse_to_indicator(sp_input, vocab_size, name=None)`](#sparse_to_indicator)
* [Manipulation](#AUTOGENERATED-manipulation)
* [`tf.sparse_concat(concat_dim, sp_inputs, name=None)`](#sparse_concat)
* [`tf.sparse_reorder(sp_input, name=None)`](#sparse_reorder)
* [`tf.sparse_retain(sp_input, to_retain)`](#sparse_retain)
* [`tf.sparse_fill_empty_rows(sp_input, default_value, name=None)`](#sparse_fill_empty_rows)
## Sparse Tensor Representation
TensorFlow supports a `SparseTensor` representation for data that is sparse
in multiple dimensions. Contrast this representation with `IndexedSlices`,
which is efficient for representing tensors that are sparse in their first
dimension, and dense along all other dimensions.
- - -
### `class tf.SparseTensor`
Represents a sparse tensor.
TensorFlow represents a sparse tensor as three separate dense tensors:
`indices`, `values`, and `shape`. In Python, the three tensors are
collected into a `SparseTensor` class for ease of use. If you have separate
`indices`, `values`, and `shape` tensors, wrap them in a `SparseTensor`
object before passing to the Ops below.
Concretely, the sparse tensor `SparseTensor(indices, values, shape)` consists of
* `indices`: A 2-D int64 tensor of shape `[N, ndims]`.
* `values`: A 1-D tensor of any type and shape `[N]`.
* `shape`: A 1-D int64 tensor of shape `[ndims]`, giving the shape of the dense tensor.
where `N` and `ndims` are the number of values and the number of dimensions in
the `SparseTensor`, respectively.
The corresponding dense tensor satisfies
```python
dense.shape = dense_shape
dense[tuple(indices[i])] = values[i]
```
By convention, `indices` should be sorted in row-major order (or equivalently
lexicographic order on the tuples `indices[i]`). This is not enforced when
`SparseTensor` objects are constructed, but most Ops assume correct ordering.
If the ordering is wrong, it can be fixed by calling `sparse_reorder` on the
misordered `SparseTensor`.
Example: The sparse tensor
```python
SparseTensor(values=[1, 2], indices=[[0, 0], [1, 2]], shape=[3, 4])
```
represents the dense tensor
```python
[[1, 0, 0, 0],
 [0, 0, 2, 0],
 [0, 0, 0, 0]]
```
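
The relationship between the three component tensors and the dense tensor can be sketched in plain Python with nested lists. The helper name `to_dense` is hypothetical and the code only illustrates the semantics; it is not how TensorFlow performs the conversion.

```python
def to_dense(indices, values, shape, default=0):
    """Build a nested-list dense tensor from sparse components."""
    def build(dims):
        # Recursively allocate a nested list filled with `default`.
        if len(dims) == 1:
            return [default] * dims[0]
        return [build(dims[1:]) for _ in range(dims[0])]

    dense = build(list(shape))
    for index, value in zip(indices, values):
        cell = dense
        for i in index[:-1]:
            cell = cell[i]
        cell[index[-1]] = value  # dense[tuple(indices[i])] = values[i]
    return dense


dense = to_dense(indices=[[0, 0], [1, 2]], values=[1, 2], shape=[3, 4])
```

With the example above, `dense` has three rows of four elements, with 1 at `[0, 0]` and 2 at `[1, 2]`.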
- - -
#### `tf.SparseTensor.__init__(indices, values, shape)`
Creates a `SparseTensor`.
##### Args:
* `indices`: A 2-D int64 tensor of shape `[N, ndims]`.
* `values`: A 1-D tensor of any type and shape `[N]`.
* `shape`: A 1-D int64 tensor of shape `[ndims]`.
##### Returns:
A `SparseTensor`
- - -
#### `tf.SparseTensor.indices`
The indices of non-zero values in the represented dense tensor.
##### Returns:
A 2-D Tensor of int64 with shape `[N, ndims]`, where `N` is the
number of non-zero values in the tensor, and `ndims` is the rank.
- - -
#### `tf.SparseTensor.values`
The non-zero values in the represented dense tensor.
##### Returns:
A 1-D Tensor of any data type.
- - -
#### `tf.SparseTensor.dtype`
The `DType` of elements in this tensor.
- - -
#### `tf.SparseTensor.shape`
A 1-D Tensor of int64 representing the shape of the dense tensor.
- - -
#### `tf.SparseTensor.graph`
The `Graph` that contains the index, value, and shape tensors.
- - -
### `class tf.SparseTensorValue`
SparseTensorValue(indices, values, shape)
- - -
#### `tf.SparseTensorValue.indices`
Alias for field number 0
- - -
#### `tf.SparseTensorValue.shape`
Alias for field number 2
- - -
#### `tf.SparseTensorValue.values`
Alias for field number 1
## Sparse to Dense Conversion
- - -
### `tf.sparse_to_dense(sparse_indices, output_shape, sparse_values, default_value, name=None)`
Converts a sparse representation into a dense tensor.
Builds an array `dense` with shape `output_shape` such that
```prettyprint
# If sparse_indices is scalar
dense[i] = (i == sparse_indices ? sparse_values : default_value)
# If sparse_indices is a vector, then for each i
dense[sparse_indices[i]] = sparse_values[i]
# If sparse_indices is an n by d matrix, then for each i in [0, n)
dense[sparse_indices[i][0], ..., sparse_indices[i][d-1]] = sparse_values[i]
```
All other values in `dense` are set to `default_value`. If `sparse_values` is a
scalar, all sparse indices are set to this single value.
##### Args:
* `sparse_indices`: A `Tensor`. Must be one of the following types: `int32`, `int64`.
0-D, 1-D, or 2-D. `sparse_indices[i]` contains the complete
index where `sparse_values[i]` will be placed.
* `output_shape`: A `Tensor`. Must have the same type as `sparse_indices`.
1-D. Shape of the dense output tensor.
* `sparse_values`: A `Tensor`.
1-D. Values corresponding to each row of `sparse_indices`,
or a scalar value to be used for all sparse indices.
* `default_value`: A `Tensor`. Must have the same type as `sparse_values`.
Scalar value to set for indices not specified in
`sparse_indices`.
* `name`: A name for the operation (optional).
##### Returns:
A `Tensor`. Has the same type as `sparse_values`.
Dense output tensor of shape `output_shape`.
- - -
### `tf.sparse_tensor_to_dense(sp_input, default_value, name=None)`
Converts a `SparseTensor` into a dense tensor.
This op is a convenience wrapper around `sparse_to_dense` for `SparseTensor`s.
For example, if `sp_input` has shape `[3, 5]` and non-empty string values:
    [0, 1]: a
    [0, 3]: b
    [2, 0]: c
and `default_value` is `x`, then the output will be a dense `[3, 5]`
string tensor with values:
    [[x a x b x]
     [x x x x x]
     [c x x x x]]
##### Args:
* `sp_input`: The input `SparseTensor`.
* `default_value`: Scalar value to set for indices not specified in
`sp_input`.
* `name`: A name prefix for the returned tensors (optional).
##### Returns:
A dense tensor with shape `sp_input.shape` and values specified by
the non-empty values in `sp_input`. Indices not in `sp_input` are assigned
`default_value`.
##### Raises:
* `TypeError`: If `sp_input` is not a `SparseTensor`.
- - -
### `tf.sparse_to_indicator(sp_input, vocab_size, name=None)`
Converts a `SparseTensor` of ids into a dense bool indicator tensor.
The last dimension of `sp_input` is discarded and replaced with the values of
`sp_input`. If `sp_input.shape = [D0, D1, ..., Dn, K]`, then
`output.shape = [D0, D1, ..., Dn, vocab_size]`, where
    output[d_0, d_1, ..., d_n, sp_input[d_0, d_1, ..., d_n, k]] = True
and False elsewhere in `output`.
For example, if `sp_input.shape = [2, 3, 4]` with non-empty values:
    [0, 0, 0]: 0
    [0, 1, 0]: 10
    [1, 0, 3]: 103
    [1, 1, 2]: 112
    [1, 1, 3]: 113
    [1, 2, 1]: 121
and `vocab_size = 200`, then the output will be a `[2, 3, 200]` dense bool
tensor with False everywhere except at positions
    (0, 0, 0), (0, 1, 10), (1, 0, 103), (1, 1, 112), (1, 1, 113), (1, 2, 121).
This op is useful for converting `SparseTensor`s into dense formats for
compatibility with ops that expect dense tensors.
The input `SparseTensor` must be in row-major order.
##### Args:
* `sp_input`: A `SparseTensor` of type `int32` or `int64`.
* `vocab_size`: The new size of the last dimension, with
`all(0 <= sp_input.values < vocab_size)`.
* `name`: A name prefix for the returned tensors (optional)
##### Returns:
A dense bool indicator tensor representing the indices with specified value.
##### Raises:
* `TypeError`: If `sp_input` is not a `SparseTensor`.
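
The mapping from sparse ids to indicator positions described above can be sketched in pure Python. The function name and the choice to return the set of `True` positions (instead of materializing the full bool tensor) are assumptions made to keep the example short.

```python
def indicator_positions(indices, values, vocab_size):
    """Return the positions that would be True in the indicator tensor."""
    positions = set()
    for index, value in zip(indices, values):
        if not 0 <= value < vocab_size:
            raise ValueError("value %r is outside [0, %d)" % (value, vocab_size))
        # Drop the last index component and replace it with the stored id.
        positions.add(tuple(index[:-1]) + (value,))
    return positions


positions = indicator_positions(
    indices=[[0, 0, 0], [0, 1, 0], [1, 0, 3], [1, 1, 2], [1, 1, 3], [1, 2, 1]],
    values=[0, 10, 103, 112, 113, 121],
    vocab_size=200)
```

Every position not in the returned set would be `False` in the `[2, 3, 200]` output.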
## Manipulation
- - -
### `tf.sparse_concat(concat_dim, sp_inputs, name=None)`
Concatenates a list of `SparseTensor` along the specified dimension.
Concatenation is with respect to the dense versions of each sparse input.
It is assumed that each input is a `SparseTensor` whose elements are ordered
along increasing dimension number.
All inputs' shapes must match, except for the concat dimension. The
`indices`, `values`, and `shapes` lists must have the same length.
The output shape is identical to the inputs', except along the concat
dimension, where it is the sum of the inputs' sizes along that dimension.
The output elements will be resorted to preserve the sort order along
increasing dimension number.
This op runs in `O(M log M)` time, where `M` is the total number of non-empty
values across all inputs. This is due to the need for an internal sort in
order to concatenate efficiently across an arbitrary dimension.
For example, if `concat_dim = 1` and the inputs are
    sp_inputs[0]: shape = [2, 3]
    [0, 2]: "a"
    [1, 0]: "b"
    [1, 1]: "c"
    sp_inputs[1]: shape = [2, 4]
    [0, 1]: "d"
    [0, 2]: "e"
then the output will be
    shape = [2, 7]
    [0, 2]: "a"
    [0, 4]: "d"
    [0, 5]: "e"
    [1, 0]: "b"
    [1, 1]: "c"
Graphically this is equivalent to doing
    [    a] concat [  d e  ] = [    a   d e  ]
    [b c  ]        [       ]   [b c          ]
##### Args:
* `concat_dim`: Dimension to concatenate along.
* `sp_inputs`: List of `SparseTensor` to concatenate.
* `name`: A name prefix for the returned tensors (optional).
##### Returns:
A `SparseTensor` with the concatenated output.
##### Raises:
* `TypeError`: If `sp_inputs` is not a list of `SparseTensor`.
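
The concatenation rule (shift indices along `concat_dim` by the accumulated size, then re-sort) can be sketched in pure Python. Here each sparse input is represented as a hypothetical `(indices, values, shape)` tuple; this illustrates the semantics only and is not the op's implementation.

```python
def sparse_concat_sketch(concat_dim, sp_inputs):
    """Concatenate (indices, values, shape) triples along `concat_dim`."""
    out_shape = list(sp_inputs[0][2])
    entries = []
    offset = 0
    for indices, values, shape in sp_inputs:
        for index, value in zip(indices, values):
            shifted = list(index)
            shifted[concat_dim] += offset  # move past the earlier inputs
            entries.append((shifted, value))
        offset += shape[concat_dim]
    out_shape[concat_dim] = offset
    # The sort is what makes the cost O(M log M) in the number of values.
    entries.sort(key=lambda entry: entry[0])
    return [e[0] for e in entries], [e[1] for e in entries], out_shape


first = ([[0, 2], [1, 0], [1, 1]], ["a", "b", "c"], [2, 3])
second = ([[0, 1], [0, 2]], ["d", "e"], [2, 4])
out_indices, out_values, out_shape = sparse_concat_sketch(1, [first, second])
```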
- - -
### `tf.sparse_reorder(sp_input, name=None)`
Reorders a `SparseTensor` into the canonical, row-major ordering.
Note that by convention, all sparse ops preserve the canonical ordering
along increasing dimension number. The only time ordering can be violated
is during manual manipulation of the indices and values to add entries.
Reordering does not affect the shape of the `SparseTensor`.
For example, if `sp_input` has shape `[4, 5]` and `indices` / `values`:
    [0, 3]: b
    [0, 1]: a
    [3, 1]: d
    [2, 0]: c
then the output will be a `SparseTensor` of shape `[4, 5]` and
`indices` / `values`:
    [0, 1]: a
    [0, 3]: b
    [2, 0]: c
    [3, 1]: d
##### Args:
* `sp_input`: The input `SparseTensor`.
* `name`: A name prefix for the returned tensors (optional)
##### Returns:
A `SparseTensor` with the same shape and non-empty values, but in
canonical ordering.
##### Raises:
* `TypeError`: If `sp_input` is not a `SparseTensor`.
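
Row-major (lexicographic) reordering amounts to sorting the index tuples while keeping each value attached to its index. A minimal pure-Python sketch, with a hypothetical helper name:

```python
def reorder_sketch(indices, values):
    """Sort (index, value) pairs lexicographically by index."""
    pairs = sorted(zip(indices, values), key=lambda pair: list(pair[0]))
    return [list(index) for index, _ in pairs], [value for _, value in pairs]


sorted_indices, sorted_values = reorder_sketch(
    [[0, 3], [0, 1], [3, 1], [2, 0]], ["b", "a", "d", "c"])
```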
- - -
### `tf.sparse_retain(sp_input, to_retain)`
Retains specified non-empty values within a `SparseTensor`.
For example, if `sp_input` has shape `[4, 5]` and 4 non-empty string values:
    [0, 1]: a
    [0, 3]: b
    [2, 0]: c
    [3, 1]: d
and `to_retain = [True, False, False, True]`, then the output will
be a `SparseTensor` of shape `[4, 5]` with 2 non-empty values:
    [0, 1]: a
    [3, 1]: d
##### Args:
* `sp_input`: The input `SparseTensor` with `N` non-empty elements.
* `to_retain`: A bool vector of length `N` with `M` true values.
##### Returns:
A `SparseTensor` with the same shape as the input and `M` non-empty
elements corresponding to the true positions in `to_retain`.
##### Raises:
* `TypeError`: If `sp_input` is not a `SparseTensor`.
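
Retaining values is a filter over the `N` stored entries. The following pure-Python sketch (hypothetical helper name, list-based representation) mirrors the example above:

```python
def retain_sketch(indices, values, to_retain):
    """Keep only the entries whose flag in `to_retain` is True."""
    if len(to_retain) != len(values):
        raise ValueError("to_retain must have one flag per stored value")
    kept = [(index, value)
            for index, value, keep in zip(indices, values, to_retain) if keep]
    return [index for index, _ in kept], [value for _, value in kept]


kept_indices, kept_values = retain_sketch(
    [[0, 1], [0, 3], [2, 0], [3, 1]], ["a", "b", "c", "d"],
    [True, False, False, True])
```

The dense shape is unchanged by this operation, so it is not part of the sketch.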
- - -
### `tf.sparse_fill_empty_rows(sp_input, default_value, name=None)`
Fills empty rows in the input 2-D `SparseTensor` with a default value.
This op adds entries with the specified `default_value` at index
`[row, 0]` for any row in the input that does not already have a value.
For example, suppose `sp_input` has shape `[5, 6]` and non-empty values:
    [0, 1]: a
    [0, 3]: b
    [2, 0]: c
    [3, 1]: d
Rows 1 and 4 are empty, so the output will be of shape `[5, 6]` with values:
    [0, 1]: a
    [0, 3]: b
    [1, 0]: default_value
    [2, 0]: c
    [3, 1]: d
    [4, 0]: default_value
Note that the input may have empty columns at the end, with no effect on
this op.
The output `SparseTensor` will be in row-major order and will have the
same shape as the input.
This op also returns an indicator vector such that
    empty_row_indicator[i] = True iff row i was an empty row.
##### Args:
* `sp_input`: A `SparseTensor` with shape `[N, M]`.
* `default_value`: The value to fill for empty rows, with the same type as
`sp_input`.
* `name`: A name prefix for the returned tensors (optional).
##### Returns:
* `sp_ordered_output`: A `SparseTensor` with shape `[N, M]`, and with all empty
rows filled in with `default_value`.
* `empty_row_indicator`: A bool vector of length `N` indicating whether each
input row was empty.
##### Raises:
* `TypeError`: If `sp_input` is not a `SparseTensor`.
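
The row-filling behaviour and the indicator vector can be sketched in pure Python for a 2-D input. The helper name and the list-based representation are assumptions for illustration only.

```python
def fill_empty_rows_sketch(indices, values, shape, default_value):
    """Add [row, 0] = default_value for every row with no stored entry."""
    num_rows = shape[0]
    occupied = set(index[0] for index in indices)
    empty_row_indicator = [row not in occupied for row in range(num_rows)]
    entries = [(list(index), value) for index, value in zip(indices, values)]
    entries += [([row, 0], default_value)
                for row in range(num_rows) if empty_row_indicator[row]]
    entries.sort(key=lambda entry: entry[0])  # keep row-major order
    out_indices = [index for index, _ in entries]
    out_values = [value for _, value in entries]
    return out_indices, out_values, empty_row_indicator


filled_indices, filled_values, indicator = fill_empty_rows_sketch(
    [[0, 1], [0, 3], [2, 0], [3, 1]], ["a", "b", "c", "d"], [5, 6], "default")
```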
================================================
FILE: SOURCE/api_docs/python/state_ops.md
================================================
# Variables
Note: Functions taking `Tensor` arguments can also take anything accepted by
[`tf.convert_to_tensor`](../../api_docs/python/framework.md#convert_to_tensor).
## Contents
### [Variables](#AUTOGENERATED-variables)
* [Variables](#AUTOGENERATED-variables)
* [`class tf.Variable`](#Variable)
* [Variable helper functions](#AUTOGENERATED-variable-helper-functions)
* [`tf.all_variables()`](#all_variables)
* [`tf.trainable_variables()`](#trainable_variables)
* [`tf.initialize_all_variables()`](#initialize_all_variables)
* [`tf.initialize_variables(var_list, name='init')`](#initialize_variables)
* [`tf.assert_variables_initialized(var_list=None)`](#assert_variables_initialized)
* [Saving and Restoring Variables](#AUTOGENERATED-saving-and-restoring-variables)
* [`class tf.train.Saver`](#Saver)
* [`tf.train.latest_checkpoint(checkpoint_dir, latest_filename=None)`](#latest_checkpoint)
* [`tf.train.get_checkpoint_state(checkpoint_dir, latest_filename=None)`](#get_checkpoint_state)
* [`tf.train.update_checkpoint_state(save_dir, model_checkpoint_path, all_model_checkpoint_paths=None, latest_filename=None)`](#update_checkpoint_state)
* [Sharing Variables](#AUTOGENERATED-sharing-variables)
* [`tf.get_variable(name, shape=None, dtype=tf.float32, initializer=None, trainable=True, collections=None)`](#get_variable)
* [`tf.get_variable_scope()`](#get_variable_scope)
* [`tf.variable_scope(name_or_scope, reuse=None, initializer=None)`](#variable_scope)
* [`tf.constant_initializer(value=0.0)`](#constant_initializer)
* [`tf.random_normal_initializer(mean=0.0, stddev=1.0, seed=None)`](#random_normal_initializer)
* [`tf.truncated_normal_initializer(mean=0.0, stddev=1.0, seed=None)`](#truncated_normal_initializer)
* [`tf.random_uniform_initializer(minval=0.0, maxval=1.0, seed=None)`](#random_uniform_initializer)
* [`tf.uniform_unit_scaling_initializer(factor=1.0, seed=None)`](#uniform_unit_scaling_initializer)
* [`tf.zeros_initializer(shape, dtype=tf.float32)`](#zeros_initializer)
* [Sparse Variable Updates](#AUTOGENERATED-sparse-variable-updates)
* [`tf.scatter_update(ref, indices, updates, use_locking=None, name=None)`](#scatter_update)
* [`tf.scatter_add(ref, indices, updates, use_locking=None, name=None)`](#scatter_add)
* [`tf.scatter_sub(ref, indices, updates, use_locking=None, name=None)`](#scatter_sub)
* [`tf.sparse_mask(a, mask_indices, name=None)`](#sparse_mask)
* [`class tf.IndexedSlices`](#IndexedSlices)
## Variables
- - -
### `class tf.Variable`
See the [Variables How To](../../how_tos/variables/index.md) for a high
level overview.
A variable maintains state in the graph across calls to `run()`. You add a
variable to the graph by constructing an instance of the class `Variable`.
The `Variable()` constructor requires an initial value for the variable,
which can be a `Tensor` of any type and shape. The initial value defines the
type and shape of the variable. After construction, the type and shape of
the variable are fixed. The value can be changed using one of the assign
methods.
If you want to change the shape of a variable later you have to use an
`assign` Op with `validate_shape=False`.
Just like any `Tensor`, variables created with `Variable()` can be used as
inputs for other Ops in the graph. Additionally, all the operators
overloaded for the `Tensor` class are carried over to variables, so you can
also add nodes to the graph by just doing arithmetic on variables.
```python
import tensorflow as tf
# Create a variable.
w = tf.Variable(<initial-value>, name=<optional-name>)
# Use the variable in the graph like any Tensor.
y = tf.matmul(w, ...another variable or tensor...)
# The overloaded operators are available too.
z = tf.sigmoid(w + b)
# Assign a new value to the variable with `assign()` or a related method.
w.assign(w + 1.0)
w.assign_add(1.0)
```
When you launch the graph, variables have to be explicitly initialized before
you can run Ops that use their value. You can initialize a variable by
running its *initializer op*, restoring the variable from a save file, or
simply running an `assign` Op that assigns a value to the variable. In fact,
the variable *initializer op* is just an `assign` Op that assigns the
variable's initial value to the variable itself.
```python
# Launch the graph in a session.
with tf.Session() as sess:
    # Run the variable initializer.
    sess.run(w.initializer)
    # ...you now can run ops that use the value of 'w'...
```
The most common initialization pattern is to use the convenience function
`initialize_all_variables()` to add an Op to the graph that initializes
all the variables. You then run that Op after launching the graph.
```python
# Add an Op to initialize all variables.
init_op = tf.initialize_all_variables()
# Launch the graph in a session.
with tf.Session() as sess:
    # Run the Op that initializes all variables.
    sess.run(init_op)
    # ...you can now run any Op that uses variable values...
```
If you need to create a variable with an initial value dependent on another
variable, use the other variable's `initialized_value()`. This ensures that
variables are initialized in the right order.
All variables are automatically collected in the graph where they are
created. By default, the constructor adds the new variable to the graph
collection `GraphKeys.VARIABLES`. The convenience function
`all_variables()` returns the contents of that collection.
When building a machine learning model it is often convenient to distinguish
between variables holding the trainable model parameters and other variables
such as a `global step` variable used to count training steps. To make this
easier, the variable constructor supports a `trainable=` parameter. If
`True`, the new variable is also added to the graph collection
`GraphKeys.TRAINABLE_VARIABLES`. The convenience function
`trainable_variables()` returns the contents of this collection. The
various `Optimizer` classes use this collection as the default list of
variables to optimize.
Creating a variable.
- - -
#### `tf.Variable.__init__(initial_value, trainable=True, collections=None, validate_shape=True, name=None)`
Creates a new variable with value `initial_value`.
The new variable is added to the graph collections listed in `collections`,
which defaults to `[GraphKeys.VARIABLES]`.
If `trainable` is `True` the variable is also added to the graph collection
`GraphKeys.TRAINABLE_VARIABLES`.
This constructor creates both a `variable` Op and an `assign` Op to set the
variable to its initial value.
##### Args:
* `initial_value`: A `Tensor`, or Python object convertible to a `Tensor`.
The initial value for the Variable. Must have a shape specified unless
`validate_shape` is set to False.
* `trainable`: If `True`, the default, also adds the variable to the graph
collection `GraphKeys.TRAINABLE_VARIABLES`. This collection is used as
the default list of variables to use by the `Optimizer` classes.
* `collections`: List of graph collections keys. The new variable is added to
these collections. Defaults to `[GraphKeys.VARIABLES]`.
* `validate_shape`: If `False`, allows the variable to be initialized with a
value of unknown shape. If `True`, the default, the shape of
`initial_value` must be known.
* `name`: Optional name for the variable. Defaults to `'Variable'` and gets
uniquified automatically.
##### Returns:
A Variable.
##### Raises:
* `ValueError`: If the initial value does not have a shape and
`validate_shape` is `True`.
- - -
#### `tf.Variable.initialized_value()`
Returns the value of the initialized variable.
You should use this instead of the variable itself to initialize another
variable with a value that depends on the value of this variable.
```python
# Initialize 'v' with a random tensor.
v = tf.Variable(tf.truncated_normal([10, 40]))
# Use `initialized_value` to guarantee that `v` has been
# initialized before its value is used to initialize `w`.
# The random values are picked only once.
w = tf.Variable(v.initialized_value() * 2.0)
```
##### Returns:
A `Tensor` holding the value of this variable after its initializer
has run.
Changing a variable value.
- - -
#### `tf.Variable.assign(value, use_locking=False)`
Assigns a new value to the variable.
This is essentially a shortcut for `assign(self, value)`.
##### Args:
* `value`: A `Tensor`. The new value for this variable.
* `use_locking`: If `True`, use locking during the assignment.
##### Returns:
A `Tensor` that will hold the new value of this variable after
the assignment has completed.
- - -
#### `tf.Variable.assign_add(delta, use_locking=False)`
Adds a value to this variable.
This is essentially a shortcut for `assign_add(self, delta)`.
##### Args:
* `delta`: A `Tensor`. The value to add to this variable.
* `use_locking`: If `True`, use locking during the operation.
##### Returns:
A `Tensor` that will hold the new value of this variable after
the addition has completed.
- - -
#### `tf.Variable.assign_sub(delta, use_locking=False)`
Subtracts a value from this variable.
This is essentially a shortcut for `assign_sub(self, delta)`.
##### Args:
* `delta`: A `Tensor`. The value to subtract from this variable.
* `use_locking`: If `True`, use locking during the operation.
##### Returns:
A `Tensor` that will hold the new value of this variable after
the subtraction has completed.
- - -
#### `tf.Variable.scatter_sub(sparse_delta, use_locking=False)`
Subtracts `IndexedSlices` from this variable.
This is essentially a shortcut for `scatter_sub(self, sparse_delta.indices,
sparse_delta.values)`.
##### Args:
* `sparse_delta`: `IndexedSlices` to be subtracted from this variable.
* `use_locking`: If `True`, use locking during the operation.
##### Returns:
A `Tensor` that will hold the new value of this variable after
the scattered subtraction has completed.
##### Raises:
* `ValueError`: if `sparse_delta` is not an `IndexedSlices`.
- - -
#### `tf.Variable.count_up_to(limit)`
Increments this variable until it reaches `limit`.
When that Op is run it tries to increment the variable by `1`. If
incrementing the variable would bring it above `limit` then the Op raises
the exception `OutOfRangeError`.
If no error is raised, the Op outputs the value of the variable before
the increment.
This is essentially a shortcut for `count_up_to(self, limit)`.
##### Args:
* `limit`: value at which incrementing the variable raises an error.
##### Returns:
A `Tensor` that will hold the variable value before the increment. If no
other Op modifies this variable, the values produced will all be
distinct.
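
The counting semantics described above (return the value before the increment, fail once the increment would exceed `limit`) can be mimicked with a small pure-Python class. `Counter` and `OutOfRange` are hypothetical stand-ins for the variable and for `OutOfRangeError`; this is only a model of the behaviour, not TensorFlow code.

```python
class OutOfRange(Exception):
    """Hypothetical stand-in for TensorFlow's `OutOfRangeError`."""


class Counter(object):
    """Toy model of a variable that supports `count_up_to`."""

    def __init__(self, value=0):
        self.value = value

    def count_up_to(self, limit):
        # Refuse the increment if it would push the value above `limit`.
        if self.value + 1 > limit:
            raise OutOfRange("counter reached limit %d" % limit)
        before = self.value
        self.value += 1
        return before


counter = Counter()
produced = [counter.count_up_to(3) for _ in range(3)]
```

Under this model, a limit of 3 produces three distinct values and the fourth call raises.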
- - -
#### `tf.Variable.eval(session=None)`
In a session, computes and returns the value of this variable.
This is not a graph construction method, it does not add ops to the graph.
This convenience method requires a session where the graph containing this
variable has been launched. If no session is passed, the default session is
used. See the [Session class](../../api_docs/python/client.md#Session) for more information on
launching a graph and on sessions.
```python
v = tf.Variable([1, 2])
init = tf.initialize_all_variables()
with tf.Session() as sess:
    sess.run(init)
    # Usage passing the session explicitly.
    print(v.eval(sess))
    # Usage with the default session. The 'with' block
    # above makes 'sess' the default session.
    print(v.eval())
```
##### Args:
* `session`: The session to use to evaluate this variable. If
none, the default session is used.
##### Returns:
A numpy `ndarray` with a copy of the value of this variable.
Properties.
- - -
#### `tf.Variable.name`
The name of this variable.
- - -
#### `tf.Variable.dtype`
The `DType` of this variable.
- - -
#### `tf.Variable.get_shape()`
The `TensorShape` of this variable.
##### Returns:
A `TensorShape`.
- - -
#### `tf.Variable.device`
The device of this variable.
- - -
#### `tf.Variable.initializer`
The initializer operation for this variable.
- - -
#### `tf.Variable.graph`
The `Graph` of this variable.
- - -
#### `tf.Variable.op`
The `Operation` of this variable.
## Variable helper functions
TensorFlow provides a set of functions to help manage the set of variables
collected in the graph.
- - -
### `tf.all_variables()`
Returns all variables collected in the graph.
The `Variable()` constructor automatically adds new variables to the graph
collection `GraphKeys.VARIABLES`. This convenience function returns the
contents of that collection.
##### Returns:
A list of `Variable` objects.
- - -
### `tf.trainable_variables()`
Returns all variables created with `trainable=True`.
When passed `trainable=True`, the `Variable()` constructor automatically
adds new variables to the graph collection
`GraphKeys.TRAINABLE_VARIABLES`. This convenience function returns the
contents of that collection.
##### Returns:
A list of Variable objects.
- - -
### `tf.initialize_all_variables()`
Returns an Op that initializes all variables.
This is just a shortcut for `initialize_variables(all_variables())`.
##### Returns:
An Op that initializes all variables in the graph.
- - -
### `tf.initialize_variables(var_list, name='init')`
Returns an Op that initializes a list of variables.
After you launch the graph in a session, you can run the returned Op to
initialize all the variables in `var_list`. This Op runs all the
initializers of the variables in `var_list` in parallel.
Calling `initialize_variables()` is equivalent to passing the list of
initializers to `tf.group()`.
If `var_list` is empty, however, the function still returns an Op that can
be run. That Op just has no effect.
##### Args:
* `var_list`: List of `Variable` objects to initialize.
* `name`: Optional name for the returned operation.
##### Returns:
An Op that runs the initializers of all the specified variables.
- - -
### `tf.assert_variables_initialized(var_list=None)`
Returns an Op to check if variables are initialized.
When run, the returned Op will raise the exception `FailedPreconditionError`
if any of the variables has not yet been initialized.
Note: This function is implemented by trying to fetch the values of the
variables. If one of the variables is not initialized a message may be
logged by the C++ runtime. This is expected.
##### Args:
* `var_list`: List of `Variable` objects to check. Defaults to the
value of `all_variables()`.
##### Returns:
An Op, or None if there are no variables.
## Saving and Restoring Variables
- - -
### `class tf.train.Saver`
Saves and restores variables.
See [Variables](../../how_tos/variables/index.md)
for an overview of variables, saving and restoring.
The `Saver` class adds ops to save and restore variables to and from
*checkpoints*. It also provides convenience methods to run these ops.
Checkpoints are binary files in a proprietary format which map variable names
to tensor values. The best way to examine the contents of a checkpoint is to
load it using a `Saver`.
Savers can automatically number checkpoint filenames with a provided counter.
This lets you keep multiple checkpoints at different steps while training a
model. For example, you can number the checkpoint filenames with the training
step number. To avoid filling up disks, savers manage checkpoint files
automatically. For example, they can keep only the N most recent files, or
one checkpoint for every N hours of training.
You number checkpoint filenames by passing a value to the optional
`global_step` argument to `save()`:
```python
saver.save(sess, 'my-model', global_step=0)  # ==> filename: 'my-model-0'
...
saver.save(sess, 'my-model', global_step=1000)  # ==> filename: 'my-model-1000'
```
Additionally, optional arguments to the `Saver()` constructor let you control
the proliferation of checkpoint files on disk:
* `max_to_keep` indicates the maximum number of recent checkpoint files to
keep. As new files are created, older files are deleted. If None or 0,
all checkpoint files are kept. Defaults to 5 (that is, the 5 most recent
checkpoint files are kept.)
* `keep_checkpoint_every_n_hours`: In addition to keeping the most recent
`max_to_keep` checkpoint files, you might want to keep one checkpoint file
for every N hours of training. This can be useful if you want to later
analyze how a model progressed during a long training session. For
example, passing `keep_checkpoint_every_n_hours=2` ensures that you keep
one checkpoint file for every 2 hours of training. The default value of
10,000 hours effectively disables the feature.
Note that you still have to call the `save()` method to save the model.
Passing these arguments to the constructor will not save variables
automatically for you.
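The numbering and retention rules described above can be illustrated without TensorFlow. The following is a minimal pure-Python sketch, not the `Saver` implementation: `CheckpointTracker` and its attributes are hypothetical names, and it models only the `global_step` suffix and the `max_to_keep` policy (no files are written, and `keep_checkpoint_every_n_hours` is ignored).

```python
class CheckpointTracker(object):
    """Toy model of checkpoint numbering and the max_to_keep policy."""

    def __init__(self, max_to_keep=5):
        self.max_to_keep = max_to_keep
        self.kept = []     # Filenames still "on disk", oldest first.
        self.deleted = []  # Filenames removed by the retention policy.

    def save(self, save_path, global_step=None):
        # Append the step number to the path, as in 'my-model-1000'.
        if global_step is None:
            filename = save_path
        else:
            filename = '%s-%d' % (save_path, global_step)
        self.kept.append(filename)
        # A max_to_keep of None or 0 means "keep everything".
        if self.max_to_keep:
            while len(self.kept) > self.max_to_keep:
                self.deleted.append(self.kept.pop(0))
        return filename


tracker = CheckpointTracker(max_to_keep=2)
for step in (0, 1000, 2000):
    tracker.save('my-model', global_step=step)
```

After the loop, `tracker.kept` holds the two most recent names and `tracker.deleted` holds the oldest one, mirroring how older checkpoint files are dropped as new ones are created.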
A training program that saves regularly looks like:
```python
...
# Create a saver.
saver = tf.train.Saver(...variables...)
# Launch the graph and train, saving the model every 1,000 steps.
sess = tf.Session()
for step in range(1000000):
    sess.run(..training_op..)
    if step % 1000 == 0:
        # Append the step number to the checkpoint name:
        saver.save(sess, 'my-model', global_step=step)
```
In addition to checkpoint files, savers keep a protocol buffer on disk with
the list of recent checkpoints. This is used to manage numbered checkpoint
files and by `latest_checkpoint()`, which makes it easy to discover the path
to the most recent checkpoint. That protocol buffer is stored in a file named
'checkpoint' next to the checkpoint files.
If you create several savers, you can specify a different filename for the
protocol buffer file in the call to `save()`.
- - -
#### `tf.train.Saver.__init__(var_list=None, reshape=False, sharded=False, max_to_keep=5, keep_checkpoint_every_n_hours=10000.0, name=None, restore_sequentially=False, saver_def=None, builder=None)`
Creates a `Saver`.
The constructor adds ops to save and restore variables.
`var_list` specifies the variables that will be saved and restored. It can
be passed as a `dict` or a list:
* A `dict` of names to variables: The keys are the names that will be
used to save or restore the variables in the checkpoint files.
* A list of variables: The variables will be keyed with their op name in
the checkpoint files.
For example:
```python
v1 = tf.Variable(..., name='v1')
v2 = tf.Variable(..., name='v2')
# Pass the variables as a dict:
saver = tf.train.Saver({'v1': v1, 'v2': v2})
# Or pass them as a list.
saver = tf.train.Saver([v1, v2])
# Passing a list is equivalent to passing a dict with the variable op names
# as keys:
saver = tf.train.Saver({v.op.name: v for v in [v1, v2]})
```
The optional `reshape` argument, if True, allows restoring a variable from
a save file where the variable had a different shape, but the same number
of elements and type. This is useful if you have reshaped a variable and
want to reload it from an older checkpoint.
The optional `sharded` argument, if True, instructs the saver to shard
checkpoints per device.
##### Args:
* `var_list`: A list of Variables or a dictionary mapping names to
Variables. If None, defaults to the list of all variables.
* `reshape`: If True, allows restoring parameters from a checkpoint
where the variables have a different shape.
* `sharded`: If True, shard the checkpoints, one per device.
* `max_to_keep`: Maximum number of recent checkpoints to keep.
Defaults to 5.
* `keep_checkpoint_every_n_hours`: How often to keep checkpoints.
Defaults to 10,000 hours.
* `name`: string. Optional name to use as a prefix when adding operations.
* `restore_sequentially`: A Bool, which if true, causes restore of different
variables to happen sequentially within each device. This can lower
memory usage when restoring very large models.
* `saver_def`: Optional SaverDef proto to use instead of running the builder.
This is only useful for specialty code that wants to recreate a Saver
object for a previously built Graph that had a Saver. The saver_def
proto should be the one returned by the as_saver_def() call of the
Saver that was created for that Graph.
* `builder`: Optional SaverBuilder to use if a saver_def was not provided.
Defaults to BaseSaverBuilder().
##### Raises:
* `TypeError`: If `var_list` is invalid.
* `ValueError`: If any of the keys or values in `var_list` is not unique.
- - -
#### `tf.train.Saver.save(sess, save_path, global_step=None, latest_filename=None)`
Saves variables.
This method runs the ops added by the constructor for saving variables.
It requires a session in which the graph was launched. The variables to
save must also have been initialized.
The method returns the path of the newly created checkpoint file. This
path can be passed directly to a call to `restore()`.
##### Args:
* `sess`: A Session to use to save the variables.
* `save_path`: string. Path to the checkpoint filename. If the saver is
`sharded`, this is the prefix of the sharded checkpoint filename.
* `global_step`: If provided, the global step number is appended to
`save_path` to create the checkpoint filename. The optional argument
can be a Tensor, a Tensor name or an integer.
* `latest_filename`: Optional name for the protocol buffer file that will
contain the list of most recent checkpoint filenames. That file,
kept in the same directory as the checkpoint files, is automatically
managed by the saver to keep track of recent checkpoints. Defaults to
'checkpoint'.
##### Returns:
A string: path at which the variables were saved. If the saver is
sharded, this string ends with: '-?????-of-nnnnn' where 'nnnnn'
is the number of shards created.
##### Raises:
* `TypeError`: If `sess` is not a Session.
- - -
#### `tf.train.Saver.restore(sess, save_path)`
Restores previously saved variables.
This method runs the ops added by the constructor for restoring variables.
It requires a session in which the graph was launched. The variables to
restore do not have to have been initialized, as restoring is itself a way
to initialize variables.
The `save_path` argument is typically a value previously returned from a
`save()` call, or a call to `latest_checkpoint()`.
##### Args:
* `sess`: A Session to use to restore the parameters.
* `save_path`: Path where parameters were previously saved.
Other utility methods.
- - -
#### `tf.train.Saver.last_checkpoints`
List of not-yet-deleted checkpoint filenames.
You can pass any of the returned values to `restore()`.
##### Returns:
A list of checkpoint filenames, sorted from oldest to newest.
- - -
#### `tf.train.Saver.set_last_checkpoints(last_checkpoints)`
Sets the list of not-yet-deleted checkpoint filenames.
##### Args:
* `last_checkpoints`: a list of checkpoint filenames.
##### Raises:
* `AssertionError`: if the list of checkpoint filenames has already been set.
- - -
#### `tf.train.Saver.as_saver_def()`
Generates a `SaverDef` representation of this saver.
##### Returns:
A `SaverDef` proto.
- - -
### `tf.train.latest_checkpoint(checkpoint_dir, latest_filename=None)`
Finds the filename of the latest saved checkpoint file.
##### Args:
* `checkpoint_dir`: Directory where the variables were saved.
* `latest_filename`: Optional name for the protocol buffer file that
contains the list of most recent checkpoint filenames.
See the corresponding argument to `Saver.save()`.
##### Returns:
The full path to the latest checkpoint or None if no checkpoint was found.
- - -
### `tf.train.get_checkpoint_state(checkpoint_dir, latest_filename=None)`
Returns CheckpointState proto from the "checkpoint" file.
If the "checkpoint" file contains a valid CheckpointState
proto, returns it.
##### Args:
* `checkpoint_dir`: The directory of checkpoints.
* `latest_filename`: Optional name of the checkpoint file. Defaults to
'checkpoint'.
##### Returns:
A CheckpointState if the state was available, None
otherwise.
- - -
### `tf.train.update_checkpoint_state(save_dir, model_checkpoint_path, all_model_checkpoint_paths=None, latest_filename=None)`
Updates the content of the 'checkpoint' file.
This updates the checkpoint file containing a CheckpointState
proto.
##### Args:
* `save_dir`: Directory where the model was saved.
* `model_checkpoint_path`: The checkpoint file.
* `all_model_checkpoint_paths`: list of strings. Paths to all not-yet-deleted
checkpoints, sorted from oldest to newest. If this is a non-empty list,
the last element must be equal to model_checkpoint_path. These paths
are also saved in the CheckpointState proto.
* `latest_filename`: Optional name of the checkpoint file. Defaults to
'checkpoint'.
##### Raises:
* `RuntimeError`: If the save paths conflict.
## Sharing Variables
TensorFlow provides several classes and operations that you can use to
create variables contingent on certain conditions.
- - -
### `tf.get_variable(name, shape=None, dtype=tf.float32, initializer=None, trainable=True, collections=None)`
Gets an existing variable with these parameters or creates a new one.
This function prefixes the name with the current variable scope
and performs reuse checks. See the
[Variable Scope How To](../../how_tos/variable_scope/index.md)
for an extensive description of how reusing works. Here is a basic example:
```python
with tf.variable_scope("foo"):
    v = tf.get_variable("v", [1])  # v.name == "foo/v:0"
    w = tf.get_variable("w", [1])  # w.name == "foo/w:0"
with tf.variable_scope("foo", reuse=True):
    v1 = tf.get_variable("v")  # The same as v above.
```
If `initializer` is `None` (the default), the default initializer of the
current variable scope is used. If that one is `None` too, a
`UniformUnitScalingInitializer` will be used.
##### Args:
* `name`: the name of the new or existing variable.
* `shape`: shape of the new or existing variable.
* `dtype`: type of the new or existing variable (defaults to `DT_FLOAT`).
* `initializer`: initializer for the variable if one is created.
* `trainable`: If `True` also add the variable to the graph collection
`GraphKeys.TRAINABLE_VARIABLES` (see variables.Variable).
* `collections`: List of graph collections keys to add the Variable to.
Defaults to `[GraphKeys.VARIABLES]` (see variables.Variable).
##### Returns:
The created or existing variable.
##### Raises:
* `ValueError`: when creating a new variable and shape is not declared,
or when violating reuse during variable creation. Reuse is set inside
`variable_scope`.
- - -
### `tf.get_variable_scope()`
Returns the current variable scope.
- - -
### `tf.variable_scope(name_or_scope, reuse=None, initializer=None)`
Returns a context for variable scope.
Variable scope allows you to create new variables and to share already created
ones while providing checks against creating or sharing by accident. For details,
see the [Variable Scope How To](../../how_tos/variable_scope/index.md);
here we present only a few basic examples.
Simple example of how to create a new variable:
```python
with tf.variable_scope("foo"):
    with tf.variable_scope("bar"):
        v = tf.get_variable("v", [1])
        assert v.name == "foo/bar/v:0"
```
Basic example of sharing a variable:
```python
with tf.variable_scope("foo"):
    v = tf.get_variable("v", [1])
with tf.variable_scope("foo", reuse=True):
    v1 = tf.get_variable("v", [1])
assert v1 == v
```
Sharing a variable by capturing a scope and setting reuse:
```python
with tf.variable_scope("foo") as scope:
    v = tf.get_variable("v", [1])
    scope.reuse_variables()
    v1 = tf.get_variable("v", [1])
assert v1 == v
```
To prevent accidental sharing of variables, we raise an exception when
getting an existing variable in a non-reusing scope.
```python
with tf.variable_scope("foo") as scope:
    v = tf.get_variable("v", [1])
    v1 = tf.get_variable("v", [1])
    #  Raises ValueError("... v already exists ...").
```
Similarly, we raise an exception when trying to get a variable that
does not exist in reuse mode.
```python
with tf.variable_scope("foo", reuse=True):
    v = tf.get_variable("v", [1])
    #  Raises ValueError("... v does not exist ...").
```
Note that the `reuse` flag is inherited: if we open a reusing scope,
then all its sub-scopes become reusing as well.
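The reuse checks described above can be modelled in a few lines of plain Python. This is only a sketch of the rules, not TensorFlow's implementation: `ToyVariableStore`, its `get_variable` signature, and the `raises_value_error` helper are hypothetical, and a list of zeros stands in for a variable.

```python
class ToyVariableStore(object):
    """Toy model of the create/reuse checks; not TensorFlow's implementation."""

    def __init__(self):
        self._vars = {}

    def get_variable(self, scope, name, shape=None, reuse=False):
        # The scope name is prefixed to the variable name, as in "foo/v".
        full_name = '%s/%s' % (scope, name) if scope else name
        if reuse:
            # Reuse mode: the variable must already exist.
            if full_name not in self._vars:
                raise ValueError('Variable %s does not exist' % full_name)
            return self._vars[full_name]
        # Create mode: the variable must not exist yet and needs a shape.
        if full_name in self._vars:
            raise ValueError('Variable %s already exists' % full_name)
        if shape is None:
            raise ValueError('Shape of %s must be declared' % full_name)
        self._vars[full_name] = [0.0] * shape[0]
        return self._vars[full_name]


def raises_value_error(fn):
    """Returns True if calling fn() raises ValueError."""
    try:
        fn()
    except ValueError:
        return True
    return False


store = ToyVariableStore()
v = store.get_variable('foo', 'v', [1])
v1 = store.get_variable('foo', 'v', reuse=True)  # Same object as v.
```

Creating `foo/v` a second time without `reuse=True`, or asking for a missing `foo/w` with `reuse=True`, both raise `ValueError` in this model, which is the behavior the two error examples above describe.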
##### Args:
* `name_or_scope`: `string` or `VariableScope`: the scope to open.
* `reuse`: `True` or `None`; if `True`, we go into reuse mode for this scope as
well as all sub-scopes; if `None`, we just inherit the parent scope reuse.
* `initializer`: default initializer for variables within this scope.
##### Yields:
A scope that can be captured and reused.
##### Raises:
* `ValueError`: when trying to reuse within a create scope, or create within
a reuse scope, or if reuse is not `None` or `True`.
* `TypeError`: when the types of some arguments are not appropriate.
- - -
### `tf.constant_initializer(value=0.0)`
Returns an initializer that generates Tensors with a single value.
##### Args:
* `value`: A Python scalar. All elements of the initialized variable
will be set to this value.
##### Returns:
An initializer that generates Tensors with a single value.
- - -
### `tf.random_normal_initializer(mean=0.0, stddev=1.0, seed=None)`
Returns an initializer that generates Tensors with a normal distribution.
##### Args:
* `mean`: a python scalar or a scalar tensor. Mean of the random values
to generate.
* `stddev`: a python scalar or a scalar tensor. Standard deviation of the
random values to generate.
* `seed`: A Python integer. Used to create random seeds. See
[`set_random_seed`](../../api_docs/python/constant_op.md#set_random_seed)
for behavior.
##### Returns:
An initializer that generates Tensors with a normal distribution.
- - -
### `tf.truncated_normal_initializer(mean=0.0, stddev=1.0, seed=None)`
Returns an initializer that generates a truncated normal distribution.
These values are similar to values from a random_normal_initializer
except that values more than two standard deviations from the mean
are discarded and re-drawn. This is the recommended initializer for
neural network weights and filters.
##### Args:
* `mean`: a python scalar or a scalar tensor. Mean of the random values
to generate.
* `stddev`: a python scalar or a scalar tensor. Standard deviation of the
random values to generate.
* `seed`: A Python integer. Used to create random seeds. See
[`set_random_seed`](../../api_docs/python/constant_op.md#set_random_seed)
for behavior.
##### Returns:
An initializer that generates Tensors with a truncated normal
distribution.
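The "discard and re-draw" rule can be sketched with the standard library alone. This is an illustration of the sampling rule for a single scalar, assuming Python's `random.gauss` as the underlying normal generator; the helper name `truncated_normal` is hypothetical and this is not how TensorFlow generates tensors.

```python
import random


def truncated_normal(mean=0.0, stddev=1.0, rng=random):
    """Draws one value, re-drawing while it is more than 2 stddevs from mean."""
    while True:
        value = rng.gauss(mean, stddev)
        if abs(value - mean) <= 2.0 * stddev:
            return value


# A seeded generator makes the sketch repeatable.
rng = random.Random(0)
samples = [truncated_normal(mean=1.0, stddev=0.5, rng=rng)
           for _ in range(1000)]
```

With `mean=1.0` and `stddev=0.5`, every sample falls inside `[0.0, 2.0]`.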
- - -
### `tf.random_uniform_initializer(minval=0.0, maxval=1.0, seed=None)`
Returns an initializer that generates Tensors with a uniform distribution.
##### Args:
* `minval`: a python scalar or a scalar tensor. lower bound of the range
of random values to generate.
* `maxval`: a python scalar or a scalar tensor. upper bound of the range
of random values to generate.
* `seed`: A Python integer. Used to create random seeds. See
[`set_random_seed`](../../api_docs/python/constant_op.md#set_random_seed)
for behavior.
##### Returns:
An initializer that generates Tensors with a uniform distribution.
- - -
### `tf.uniform_unit_scaling_initializer(factor=1.0, seed=None)`
Returns an initializer that generates tensors without scaling variance.
When initializing a deep network, it is in principle advantageous to keep
the scale of the input variance constant, so it does not explode or diminish
by reaching the final layer. If the input is `x` and the operation `x * W`,
and we want to initialize `W` uniformly at random, we need to pick `W` from
`[-sqrt(3) / sqrt(dim), sqrt(3) / sqrt(dim)]`
to keep the scale intact, where `dim = W.shape[0]` (the size of the input).
A similar calculation for convolutional networks gives an analogous result
with `dim` equal to the product of the first 3 dimensions. When
nonlinearities are present, we need to multiply this by a constant `factor`.
See for deeper motivation, experiments
and the calculation of constants. In section 2.3 there, the constants were
numerically computed: for a linear layer it's 1.0, relu: ~1.43, tanh: ~1.15.
##### Args:
* `factor`: Float. A multiplicative factor by which the values will be scaled.
* `seed`: A Python integer. Used to create random seeds. See
[`set_random_seed`](../../api_docs/python/constant_op.md#set_random_seed)
for behavior.
##### Returns:
An initializer that generates tensors with unit variance.
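As a worked illustration of the range above: a uniform distribution on `[-b, b]` has variance `b**2 / 3`, so choosing `b = sqrt(3) / sqrt(dim)` gives each weight variance `1 / dim`. The sketch below computes that bound and samples a weight matrix in plain Python. The helper names are hypothetical, and applying `factor` as a multiplier on the bound is an assumption based on the description of `factor` as a multiplicative scale.

```python
import math
import random


def uniform_unit_scaling_bound(dim, factor=1.0):
    """Returns b such that weights are drawn uniformly from [-b, b]."""
    return math.sqrt(3.0 / dim) * factor


def init_weights(n_in, n_out, factor=1.0, seed=None):
    """Samples an n_in x n_out matrix (list of rows) within the bound."""
    rng = random.Random(seed)
    bound = uniform_unit_scaling_bound(n_in, factor)
    return [[rng.uniform(-bound, bound) for _ in range(n_out)]
            for _ in range(n_in)]


bound = uniform_unit_scaling_bound(300)  # sqrt(3 / 300), about 0.1
weights = init_weights(300, 4, seed=0)
```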
- - -
### `tf.zeros_initializer(shape, dtype=tf.float32)`
An adaptor for zeros() to match the Initializer spec.
## Sparse Variable Updates
The sparse update ops modify a subset of the entries in a dense `Variable`,
either overwriting the entries or adding / subtracting a delta. These are
useful for training embedding models and similar lookup-based networks, since
only a small subset of embedding vectors change in any given step.
Since a sparse update of a large tensor may be generated automatically during
gradient computation (as in the gradient of
[`tf.gather`](../../api_docs/python/array_ops.md#gather)),
an [`IndexedSlices`](#IndexedSlices) class is provided that encapsulates a set
of sparse indices and values. `IndexedSlices` objects are detected and handled
automatically by the optimizers in most cases.
- - -
### `tf.scatter_update(ref, indices, updates, use_locking=None, name=None)`
Applies sparse updates to a variable reference.
This operation computes

    # Scalar indices
    ref[indices, ...] = updates[...]

    # Vector indices (for each i)
    ref[indices[i], ...] = updates[i, ...]

    # High rank indices (for each i, ..., j)
    ref[indices[i, ..., j], ...] = updates[i, ..., j, ...]

This operation outputs `ref` after the update is done.
This makes it easier to chain operations that need to use the updated value.
If `indices` contains duplicate entries, lexicographically later entries
override earlier entries.
Requires `updates.shape = indices.shape + ref.shape[1:]`.
##### Args:
* `ref`: A mutable `Tensor`. Should be from a `Variable` node.
* `indices`: A `Tensor`. Must be one of the following types: `int32`, `int64`.
A tensor of indices into the first dimension of `ref`.
* `updates`: A `Tensor`. Must have the same type as `ref`.
A tensor of updated values to store in `ref`.
* `use_locking`: An optional `bool`. Defaults to `True`.
If True, the assignment will be protected by a lock;
otherwise the behavior is undefined, but may exhibit less contention.
* `name`: A name for the operation (optional).
##### Returns:
Same as `ref`. Returned as a convenience for operations that want
to use the updated values after the update is done.
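The vector-index case can be mimicked on nested Python lists. This is a sketch of the indexing rule only, using a hypothetical pure-Python `scatter_update` helper rather than the TensorFlow op; it processes indices in order, so a later duplicate overwrites an earlier one.

```python
def scatter_update(ref, indices, updates):
    """Overwrites rows of `ref` in place: ref[indices[i]] = updates[i]."""
    if len(indices) != len(updates):
        raise ValueError('updates must have one row per index')
    for i, row_index in enumerate(indices):
        # Later duplicates overwrite earlier ones.
        ref[row_index] = list(updates[i])
    return ref


ref = [[0, 0], [0, 0], [0, 0], [0, 0]]
result = scatter_update(ref, [3, 1, 3], [[1, 1], [2, 2], [9, 9]])
```

Row 3 ends up as `[9, 9]` because index 3 appears twice and the last update wins; rows 0 and 2 are untouched.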
- - -
### `tf.scatter_add(ref, indices, updates, use_locking=None, name=None)`
Adds sparse updates to a variable reference.
This operation computes

    # Scalar indices
    ref[indices, ...] += updates[...]

    # Vector indices (for each i)
    ref[indices[i], ...] += updates[i, ...]

    # High rank indices (for each i, ..., j)
    ref[indices[i, ..., j], ...] += updates[i, ..., j, ...]

This operation outputs `ref` after the update is done.
This makes it easier to chain operations that need to use the updated value.
Duplicate entries are handled correctly: if multiple `indices` reference
the same location, their contributions add.
Requires `updates.shape = indices.shape + ref.shape[1:]`.
##### Args:
* `ref`: A mutable `Tensor`. Must be one of the following types: `float32`, `float64`, `int64`, `int32`, `uint8`, `int16`, `int8`, `complex64`, `qint8`, `quint8`, `qint32`.
Should be from a `Variable` node.
* `indices`: A `Tensor`. Must be one of the following types: `int32`, `int64`.
A tensor of indices into the first dimension of `ref`.
* `updates`: A `Tensor`. Must have the same type as `ref`.
A tensor of updated values to add to `ref`.
* `use_locking`: An optional `bool`. Defaults to `False`.
If True, the addition will be protected by a lock;
otherwise the behavior is undefined, but may exhibit less contention.
* `name`: A name for the operation (optional).
##### Returns:
Same as `ref`. Returned as a convenience for operations that want
to use the updated values after the update is done.
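The accumulation of duplicate indices can be seen in a small list-based sketch. The `scatter_add` function below is a hypothetical pure-Python stand-in for the vector-index case, not the TensorFlow op.

```python
def scatter_add(ref, indices, updates):
    """Adds updates[i] to row ref[indices[i]] in place, element by element."""
    for i, row_index in enumerate(indices):
        ref[row_index] = [r + u for r, u in zip(ref[row_index], updates[i])]
    return ref


ref = [[1.0, 1.0], [1.0, 1.0], [1.0, 1.0]]
# Index 0 appears twice, so both of its updates are added to row 0.
scatter_add(ref, [0, 2, 0], [[0.5, 0.5], [1.0, 2.0], [0.25, 0.25]])
```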
- - -
### `tf.scatter_sub(ref, indices, updates, use_locking=None, name=None)`
Subtracts sparse updates from a variable reference.
This operation computes

    # Scalar indices
    ref[indices, ...] -= updates[...]

    # Vector indices (for each i)
    ref[indices[i], ...] -= updates[i, ...]

    # High rank indices (for each i, ..., j)
    ref[indices[i, ..., j], ...] -= updates[i, ..., j, ...]

This operation outputs `ref` after the update is done.
This makes it easier to chain operations that need to use the updated value.
Duplicate entries are handled correctly: if multiple `indices` reference
the same location, their (negated) contributions add.
Requires `updates.shape = indices.shape + ref.shape[1:]`.
##### Args:
* `ref`: A mutable `Tensor`. Must be one of the following types: `float32`, `float64`, `int64`, `int32`, `uint8`, `int16`, `int8`, `complex64`, `qint8`, `quint8`, `qint32`.
Should be from a `Variable` node.
* `indices`: A `Tensor`. Must be one of the following types: `int32`, `int64`.
A tensor of indices into the first dimension of `ref`.
* `updates`: A `Tensor`. Must have the same type as `ref`.
A tensor of updated values to subtract from `ref`.
* `use_locking`: An optional `bool`. Defaults to `False`.
If True, the subtraction will be protected by a lock;
otherwise the behavior is undefined, but may exhibit less contention.
* `name`: A name for the operation (optional).
##### Returns:
Same as `ref`. Returned as a convenience for operations that want
to use the updated values after the update is done.
- - -
### `tf.sparse_mask(a, mask_indices, name=None)`
Masks elements of `IndexedSlices`.
Given an `IndexedSlices` instance `a`, returns another `IndexedSlices` that
contains a subset of the slices of `a`. Only the slices at indices specified
in `mask_indices` are returned.
This is useful when you need to extract a subset of slices in an
`IndexedSlices` object.
For example:
```python
# `a` contains slices at indices [12, 26, 37, 45] from a large tensor
# with shape [1000, 10]
a.indices  # => [12, 26, 37, 45]
tf.shape(a.values)  # => [4, 10]
# `b` will be the subset of `a` slices at its second and third indices, so
# we want to mask out its first and last indices (which are at absolute
# indices 12, 45)
b = tf.sparse_mask(a, [12, 45])
b.indices  # => [26, 37]
tf.shape(b.values)  # => [2, 10]
```
##### Args:
* `a`: An `IndexedSlices` instance.
* `mask_indices`: Indices of elements to mask.
* `name`: A name for the operation (optional).
##### Returns:
The masked `IndexedSlices` instance.
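The same masking can be reproduced on plain Python lists. In this sketch an `IndexedSlices` is represented as a separate `indices` list and `values` list, and `sparse_mask` is a hypothetical pure-Python helper with a different signature from the TensorFlow function.

```python
def sparse_mask(indices, values, mask_indices):
    """Drops the slices whose index appears in mask_indices."""
    masked = set(mask_indices)
    kept = [(i, v) for i, v in zip(indices, values) if i not in masked]
    return [i for i, _ in kept], [v for _, v in kept]


a_indices = [12, 26, 37, 45]
# One row of 10 values per index; the row content is arbitrary.
a_values = [[float(i)] * 10 for i in a_indices]
b_indices, b_values = sparse_mask(a_indices, a_values, [12, 45])
```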
- - -
### `class tf.IndexedSlices`
A sparse representation of a set of tensor slices at given indices.
This class is a simple wrapper for a pair of `Tensor` objects:
* `values`: A `Tensor` of any dtype with shape `[D0, D1, ..., Dn]`.
* `indices`: A 1-D integer `Tensor` with shape `[D0]`.
An `IndexedSlices` is typically used to represent a subset of a larger
tensor `dense` of shape `[LARGE0, D1, ..., Dn]` where `LARGE0 >> D0`.
The values in `indices` are the indices in the first dimension of
the slices that have been extracted from the larger tensor.
The dense tensor `dense` represented by an `IndexedSlices` `slices` has
```python
dense[slices.indices[i], :, :, :, ...] = slices.values[i, :, :, :, ...]
```
The `IndexedSlices` class is used principally in the definition of
gradients for operations that have sparse gradients
(e.g. [`tf.gather`](../../api_docs/python/array_ops.md#gather)).
Contrast this representation with
[`SparseTensor`](../../api_docs/python/sparse_ops.md#SparseTensor),
which uses multi-dimensional indices and scalar values.
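The relationship between an `IndexedSlices` and its dense tensor can be sketched for the 2-D case with nested lists. The helper below is hypothetical, and it assumes that rows not listed in `indices` are zero, which the description above does not state explicitly.

```python
def indexed_slices_to_dense(indices, values, dense_shape):
    """Builds a 2-D dense list with dense[indices[i]] = values[i]."""
    rows, cols = dense_shape
    # Assumption: rows that are not listed in `indices` are all zeros.
    dense = [[0.0] * cols for _ in range(rows)]
    for i, row_index in enumerate(indices):
        dense[row_index] = list(values[i])
    return dense


dense = indexed_slices_to_dense(
    indices=[1, 4],
    values=[[1.0, 2.0], [3.0, 4.0]],
    dense_shape=[6, 2])
```

Only two of the six rows carry data here, which is the situation (`LARGE0 >> D0`) where the sparse representation saves space.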
- - -
#### `tf.IndexedSlices.__init__(values, indices, dense_shape=None)`
Creates an `IndexedSlices`.
- - -
#### `tf.IndexedSlices.values`
A `Tensor` containing the values of the slices.
- - -
#### `tf.IndexedSlices.indices`
A 1-D `Tensor` containing the indices of the slices.
- - -
#### `tf.IndexedSlices.dense_shape`
A 1-D `Tensor` containing the shape of the corresponding dense tensor.
- - -
#### `tf.IndexedSlices.name`
The name of this `IndexedSlices`.
- - -
#### `tf.IndexedSlices.dtype`
The `DType` of elements in this tensor.
- - -
#### `tf.IndexedSlices.device`
The name of the device on which `values` will be produced, or `None`.
- - -
#### `tf.IndexedSlices.op`
The `Operation` that produces `values` as an output.
================================================
FILE: SOURCE/api_docs/python/train.md
================================================
# Training
## Contents
### [Training](#AUTOGENERATED-training)
* [Optimizers](#AUTOGENERATED-optimizers)
* [`class tf.train.Optimizer`](#Optimizer)
* [Usage](#AUTOGENERATED-usage)
* [Processing gradients before applying them.](#AUTOGENERATED-processing-gradients-before-applying-them.)
* [Gating Gradients](#AUTOGENERATED-gating-gradients)
* [Slots](#AUTOGENERATED-slots)
* [`class tf.train.GradientDescentOptimizer`](#GradientDescentOptimizer)
* [`class tf.train.AdagradOptimizer`](#AdagradOptimizer)
* [`class tf.train.MomentumOptimizer`](#MomentumOptimizer)
* [`class tf.train.AdamOptimizer`](#AdamOptimizer)
* [`class tf.train.FtrlOptimizer`](#FtrlOptimizer)
* [`class tf.train.RMSPropOptimizer`](#RMSPropOptimizer)
* [Gradient Computation](#AUTOGENERATED-gradient-computation)
* [`tf.gradients(ys, xs, grad_ys=None, name='gradients', colocate_gradients_with_ops=False, gate_gradients=False, aggregation_method=None)`](#gradients)
* [`class tf.AggregationMethod`](#AggregationMethod)
* [`tf.stop_gradient(input, name=None)`](#stop_gradient)
* [Gradient Clipping](#AUTOGENERATED-gradient-clipping)
* [`tf.clip_by_value(t, clip_value_min, clip_value_max, name=None)`](#clip_by_value)
* [`tf.clip_by_norm(t, clip_norm, name=None)`](#clip_by_norm)
* [`tf.clip_by_average_norm(t, clip_norm, name=None)`](#clip_by_average_norm)
* [`tf.clip_by_global_norm(t_list, clip_norm, use_norm=None, name=None)`](#clip_by_global_norm)
* [`tf.global_norm(t_list, name=None)`](#global_norm)
* [Decaying the learning rate](#AUTOGENERATED-decaying-the-learning-rate)
* [`tf.train.exponential_decay(learning_rate, global_step, decay_steps, decay_rate, staircase=False, name=None)`](#exponential_decay)
* [Moving Averages](#AUTOGENERATED-moving-averages)
* [`class tf.train.ExponentialMovingAverage`](#ExponentialMovingAverage)
* [Coordinator and QueueRunner](#AUTOGENERATED-coordinator-and-queuerunner)
* [`class tf.train.Coordinator`](#Coordinator)
* [`class tf.train.QueueRunner`](#QueueRunner)
* [`tf.train.add_queue_runner(qr, collection='queue_runners')`](#add_queue_runner)
* [`tf.train.start_queue_runners(sess=None, coord=None, daemon=True, start=True, collection='queue_runners')`](#start_queue_runners)
* [Summary Operations](#AUTOGENERATED-summary-operations)
* [`tf.scalar_summary(tags, values, collections=None, name=None)`](#scalar_summary)
* [`tf.image_summary(tag, tensor, max_images=None, collections=None, name=None)`](#image_summary)
* [`tf.histogram_summary(tag, values, collections=None, name=None)`](#histogram_summary)
* [`tf.nn.zero_fraction(value, name=None)`](#zero_fraction)
* [`tf.merge_summary(inputs, collections=None, name=None)`](#merge_summary)
* [`tf.merge_all_summaries(key='summaries')`](#merge_all_summaries)
* [Adding Summaries to Event Files](#AUTOGENERATED-adding-summaries-to-event-files)
* [`class tf.train.SummaryWriter`](#SummaryWriter)
* [`tf.train.summary_iterator(path)`](#summary_iterator)
* [Training utilities](#AUTOGENERATED-training-utilities)
* [`tf.train.global_step(sess, global_step_tensor)`](#global_step)
* [`tf.train.write_graph(graph_def, logdir, name, as_text=True)`](#write_graph)
This library provides a set of classes and functions that help train models.
## Optimizers
The Optimizer base class provides methods to compute gradients for a loss and
apply gradients to variables. A collection of subclasses implement classic
optimization algorithms such as GradientDescent and Adagrad.
You never instantiate the Optimizer class itself, but instead instantiate one
of the subclasses.
- - -
### `class tf.train.Optimizer`
Base class for optimizers.
This class defines the API to add Ops to train a model. You never use this
class directly, but instead instantiate one of its subclasses such as
`GradientDescentOptimizer`, `AdagradOptimizer`, or `MomentumOptimizer`.
### Usage
```
# Create an optimizer with the desired parameters.
opt = tf.train.GradientDescentOptimizer(learning_rate=0.1)
# Add Ops to the graph to minimize a cost by updating a list of variables.
# "cost" is a Tensor, and the list of variables contains variables.Variable
# objects.
opt_op = opt.minimize(cost, var_list=<list of variables>)
```
In the training program you will just have to run the returned Op.
```
# Execute opt_op to do one step of training:
opt_op.run()
```
### Processing gradients before applying them.
Calling `minimize()` takes care of both computing the gradients and
applying them to the variables. If you want to process the gradients
before applying them you can instead use the optimizer in three steps:
1. Compute the gradients with `compute_gradients()`.
2. Process the gradients as you wish.
3. Apply the processed gradients with `apply_gradients()`.
Example:
```
# Create an optimizer.
opt = tf.train.GradientDescentOptimizer(learning_rate=0.1)
# Compute the gradients for a list of variables.
grads_and_vars = opt.compute_gradients(loss, <list of variables>)
# grads_and_vars is a list of tuples (gradient, variable). Do whatever you
# need to the 'gradient' part, for example cap them, etc.
capped_grads_and_vars = [(MyCapper(gv[0]), gv[1]) for gv in grads_and_vars]
# Ask the optimizer to apply the capped gradients.
opt.apply_gradients(capped_grads_and_vars)
```
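The three-step pattern can also be followed end to end in a TensorFlow-free sketch. Everything below is hypothetical illustration code: the loss is fixed to `sum((w - t) ** 2)`, variables are identified by their list index, and `cap` plays the role of `MyCapper` by clipping each gradient to `[-1, 1]`.

```python
def compute_gradients(weights, targets):
    """Step 1: returns (gradient, index) pairs for loss = sum((w - t) ** 2)."""
    return [(2.0 * (w - t), i)
            for i, (w, t) in enumerate(zip(weights, targets))]


def cap(gradient, limit=1.0):
    """Step 2: clips a scalar gradient to the range [-limit, limit]."""
    return max(-limit, min(limit, gradient))


def apply_gradients(weights, grads_and_vars, learning_rate=0.1):
    """Step 3: applies one gradient descent update in place."""
    for gradient, i in grads_and_vars:
        weights[i] -= learning_rate * gradient


weights = [5.0, -3.0]
targets = [1.0, 1.0]
for _ in range(200):
    grads_and_vars = compute_gradients(weights, targets)
    capped_grads_and_vars = [(cap(g), i) for g, i in grads_and_vars]
    apply_gradients(weights, capped_grads_and_vars)
```

While a weight is far from its target the capped gradient moves it by a fixed `0.1` per step; once the raw gradient falls inside `[-1, 1]` the updates become ordinary gradient descent steps and the weights converge to the targets.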
- - -
#### `tf.train.Optimizer.__init__(use_locking, name)`
Create a new Optimizer.
This must be called by the constructors of subclasses.
##### Args:
* `use_locking`: Bool. If True, use locks to prevent concurrent updates
to variables.
* `name`: A non-empty string. The name to use for accumulators created
for the optimizer.
##### Raises:
* `ValueError`: if name is malformed.
- - -
#### `tf.train.Optimizer.minimize(loss, global_step=None, var_list=None, gate_gradients=1, name=None)`
Add operations to minimize 'loss' by updating 'var_list'.
This method simply combines calls to `compute_gradients()` and
`apply_gradients()`. If you want to process the gradients before applying
them, call `compute_gradients()` and `apply_gradients()` explicitly instead
of using this function.
##### Args:
* `loss`: A Tensor containing the value to minimize.
* `global_step`: Optional Variable to increment by one after the
variables have been updated.
* `var_list`: Optional list of variables.Variable to update to minimize
'loss'. Defaults to the list of variables collected in the graph
under the key GraphKeys.TRAINABLE_VARIABLES.
* `gate_gradients`: How to gate the computation of gradients. Can be
GATE_NONE, GATE_OP, or GATE_GRAPH.
* `name`: Optional name for the returned operation.
##### Returns:
An Operation that updates the variables in 'var_list'. If 'global_step'
was not None, that operation also increments global_step.
##### Raises:
* `ValueError`: if some of the variables are not variables.Variable objects.
- - -
#### `tf.train.Optimizer.compute_gradients(loss, var_list=None, gate_gradients=1)`
Compute gradients of "loss" for the variables in "var_list".
This is the first part of minimize(). It returns a list
of (gradient, variable) pairs where "gradient" is the gradient
for "variable". Note that "gradient" can be a Tensor, an
IndexedSlices, or None if there is no gradient for the
given variable.
##### Args:
* `loss`: A Tensor containing the value to minimize.
* `var_list`: Optional list of variables.Variable to update to minimize
"loss". Defaults to the list of variables collected in the graph
under the key GraphKeys.TRAINABLE_VARIABLES.
* `gate_gradients`: How to gate the computation of gradients. Can be
GATE_NONE, GATE_OP, or GATE_GRAPH.
##### Returns:
A list of (gradient, variable) pairs.
##### Raises:
* `TypeError`: If var_list contains anything other than variables.Variable.
* `ValueError`: If some arguments are invalid.
- - -
#### `tf.train.Optimizer.apply_gradients(grads_and_vars, global_step=None, name=None)`
Apply gradients to variables.
This is the second part of minimize(). It returns an Operation that
applies gradients.
##### Args:
* `grads_and_vars`: List of (gradient, variable) pairs as returned by
compute_gradients().
* `global_step`: Optional Variable to increment by one after the
variables have been updated.
* `name`: Optional name for the returned operation. Defaults to the
name passed to the Optimizer constructor.
##### Returns:
An Operation that applies the specified gradients. If 'global_step'
was not None, that operation also increments global_step.
##### Raises:
* `TypeError`: if grads_and_vars is malformed.
### Gating Gradients
Both `minimize()` and `compute_gradients()` accept a `gate_gradients` argument
that controls the degree of parallelism during the application of the
gradients.
The possible values are: `GATE_NONE`, `GATE_OP`, and `GATE_GRAPH`.
`GATE_NONE`: Compute and apply gradients in parallel. This provides the
maximum parallelism in execution, at the cost of some non-reproducibility in
the results. For example, the two gradients of MatMul depend on the input
values. With `GATE_NONE`, one of the gradients could be applied to one of the
inputs _before_ the other gradient is computed, resulting in non-reproducible
results.
`GATE_OP`: For each Op, make sure all gradients are computed before they
are used. This prevents race conditions for Ops that generate gradients for
multiple inputs where the gradients depend on the inputs.
`GATE_GRAPH`: Make sure all gradients for all variables are computed
before any one of them is used. This provides the least parallelism but can
be useful if you want to process all gradients before applying any of them.
### Slots
Some optimizer subclasses, such as `MomentumOptimizer` and `AdagradOptimizer`,
allocate and manage additional variables associated with the variables to
train. These are called Slots. Slots have names and you can ask the
optimizer for the names of the slots that it uses. Once you have a slot name
you can ask the optimizer for the variable it created to hold the slot value.
This can be useful if you want to debug a training algorithm, report stats
about the slots, etc.
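To make the idea of a slot concrete, here is a minimal plain-Python sketch of an optimizer that keeps one accumulator per variable under a named slot. The class `ToyMomentum`, its method signatures, and the momentum rule it uses (`accum = momentum * accum + gradient`) are assumptions made for illustration; this is not the TensorFlow implementation.

```python
class ToyMomentum:
    """Hypothetical momentum optimizer that keeps one slot per variable."""

    def __init__(self, learning_rate, momentum):
        self.learning_rate = learning_rate
        self.momentum = momentum
        # slot name -> {variable name -> accumulator value}
        self._slots = {"momentum": {}}

    def get_slot_names(self):
        """Return the names of the slots this optimizer maintains."""
        return sorted(self._slots)

    def get_slot(self, var_name, slot_name):
        """Return the slot value for a variable, or None if none was created."""
        return self._slots.get(slot_name, {}).get(var_name)

    def apply_gradient(self, params, var_name, gradient):
        """Update the accumulator, then the variable (assumed momentum rule)."""
        accum = self._slots["momentum"].get(var_name, 0.0)
        accum = self.momentum * accum + gradient
        self._slots["momentum"][var_name] = accum
        params[var_name] -= self.learning_rate * accum


params = {"w": 1.0}
opt = ToyMomentum(learning_rate=0.1, momentum=0.9)
opt.apply_gradient(params, "w", gradient=2.0)
opt.apply_gradient(params, "w", gradient=2.0)
```

After two steps, `opt.get_slot("w", "momentum")` holds the accumulated update for `w`, which is the kind of value you might log while debugging.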
- - -
#### `tf.train.Optimizer.get_slot_names()`
Return a list of the names of slots created by the Optimizer.
See get_slot().
##### Returns:
A list of strings.
- - -
#### `tf.train.Optimizer.get_slot(var, name)`
Return a slot named "name" created for "var" by the Optimizer.
Some Optimizer subclasses use additional variables. For example
Momentum and Adagrad use variables to accumulate updates. This method
gives access to these Variables if for some reason you need them.
Use get_slot_names() to get the list of slot names created by the Optimizer.
##### Args:
* `var`: A variable passed to minimize() or apply_gradients().
* `name`: A string.
##### Returns:
The Variable for the slot if it was created, None otherwise.
- - -
### `class tf.train.GradientDescentOptimizer`
Optimizer that implements the gradient descent algorithm.
- - -
#### `tf.train.GradientDescentOptimizer.__init__(learning_rate, use_locking=False, name='GradientDescent')`
Construct a new gradient descent optimizer.
##### Args:
* `learning_rate`: A Tensor or a floating point value. The learning
rate to use.
* `use_locking`: If True use locks for update operations.
* `name`: Optional name prefix for the operations created when applying
gradients. Defaults to "GradientDescent".
- - -
### `class tf.train.AdagradOptimizer`
Optimizer that implements the Adagrad algorithm.
- - -
#### `tf.train.AdagradOptimizer.__init__(learning_rate, initial_accumulator_value=0.1, use_locking=False, name='Adagrad')`
Construct a new Adagrad optimizer.
##### Args:
* `learning_rate`: A `Tensor` or a floating point value. The learning rate.
* `initial_accumulator_value`: A floating point value.
Starting value for the accumulators, must be positive.
* `use_locking`: If `True` use locks for update operations.
* `name`: Optional name prefix for the operations created when applying
gradients. Defaults to "Adagrad".
##### Raises:
* `ValueError`: If the initial_accumulator_value is invalid.
- - -
### `class tf.train.MomentumOptimizer`
Optimizer that implements the Momentum algorithm.
- - -
#### `tf.train.MomentumOptimizer.__init__(learning_rate, momentum, use_locking=False, name='Momentum')`
Construct a new Momentum optimizer.
##### Args:
* `learning_rate`: A `Tensor` or a floating point value. The learning rate.
* `momentum`: A `Tensor` or a floating point value. The momentum.
* `use_locking`: If `True` use locks for update operations.
* `name`: Optional name prefix for the operations created when applying
gradients. Defaults to "Momentum".
- - -
### `class tf.train.AdamOptimizer`
Optimizer that implements the Adam algorithm.
- - -
#### `tf.train.AdamOptimizer.__init__(learning_rate=0.001, beta1=0.9, beta2=0.999, epsilon=1e-08, use_locking=False, name='Adam')`
Construct a new Adam optimizer.
Implementation is based on: http://arxiv.org/pdf/1412.6980v7.pdf
Initialization:
```
m_0 <- 0 (Initialize initial 1st moment vector)
v_0 <- 0 (Initialize initial 2nd moment vector)
t <- 0 (Initialize timestep)
```
The update rule for `variable` with gradient `g` uses an optimization
described at the end of section 2 of the paper:
```
t <- t + 1
lr_t <- learning_rate * sqrt(1 - beta2^t) / (1 - beta1^t)
m_t <- beta1 * m_{t-1} + (1 - beta1) * g
v_t <- beta2 * v_{t-1} + (1 - beta2) * g * g
variable <- variable - lr_t * m_t / (sqrt(v_t) + epsilon)
```
The default value of 1e-8 for epsilon might not be a good default in
general. For example, when training an Inception network on ImageNet a
current good choice is 1.0 or 0.1.
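As a rough illustration, the update rule above can be written for a single scalar variable in plain Python. This is only a sketch that transcribes the formulas term by term; the function name `adam_step` and the scalar state tuple are assumptions, and the real optimizer operates on tensors inside the graph.

```python
import math


def adam_step(variable, g, m, v, t, learning_rate=0.001,
              beta1=0.9, beta2=0.999, epsilon=1e-8):
    """Apply one Adam update to a scalar, following the formulas above."""
    t += 1
    lr_t = learning_rate * math.sqrt(1 - beta2 ** t) / (1 - beta1 ** t)
    m = beta1 * m + (1 - beta1) * g          # 1st moment estimate
    v = beta2 * v + (1 - beta2) * g * g      # 2nd moment estimate
    variable -= lr_t * m / (math.sqrt(v) + epsilon)
    return variable, m, v, t


# Initialization: m_0 = 0, v_0 = 0, t = 0.
variable, m, v, t = 1.0, 0.0, 0.0, 0
variable, m, v, t = adam_step(variable, g=0.5, m=m, v=v, t=t)
```

With the default `epsilon`, the first step moves the variable by very nearly `learning_rate`, whatever the gradient magnitude. A larger `epsilon` damps that step, which is one way to read the note about choosing 1.0 or 0.1.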
##### Args:
* `learning_rate`: A Tensor or a floating point value. The learning rate.
* `beta1`: A float value or a constant float tensor.
The exponential decay rate for the 1st moment estimates.
* `beta2`: A float value or a constant float tensor.
The exponential decay rate for the 2nd moment estimates.
* `epsilon`: A small constant for numerical stability.
* `use_locking`: If True use locks for update operations.
* `name`: Optional name for the operations created when applying gradients.
Defaults to "Adam".
- - -
### `class tf.train.FtrlOptimizer`
Optimizer that implements the FTRL algorithm.
- - -
#### `tf.train.FtrlOptimizer.__init__(learning_rate, learning_rate_power=-0.5, initial_accumulator_value=0.1, l1_regularization_strength=0.0, l2_regularization_strength=0.0, use_locking=False, name='Ftrl')`
Construct a new FTRL optimizer.
The Ftrl-proximal algorithm, where Ftrl is short for Follow-the-regularized-leader,
is described in the paper [Ad Click Prediction: a View from the Trenches](
https://www.eecs.tufts.edu/~dsculley/papers/ad-click-prediction.pdf).
It can give a good performance vs. sparsity tradeoff.
Ftrl-proximal uses its own global base learning rate and can behave like
Adagrad with `learning_rate_power=-0.5`, or like gradient descent with
`learning_rate_power=0.0`.
The effective learning rate is adjusted per parameter, relative to this
base learning rate as:
```
effective_learning_rate_i = (learning_rate /
pow(k + summed_squared_gradients_for_i, learning_rate_power));
```
where k is the small constant `initial_accumulator_value`.
Note that the real regularization coefficient of `|w|^2` for objective
function is `1 / lambda_2` if specifying `l2 = lambda_2` as argument when
using this function.
##### Args:
* `learning_rate`: A float value or a constant float `Tensor`.
* `learning_rate_power`: A float value, must be less than or equal to zero.
* `initial_accumulator_value`: The starting value for accumulators.
Only positive values are allowed.
* `l1_regularization_strength`: A float value, must be greater than or
equal to zero.
* `l2_regularization_strength`: A float value, must be greater than or
equal to zero.
* `use_locking`: If `True` use locks for update operations.
* `name`: Optional name prefix for the operations created when applying
gradients. Defaults to "Ftrl".
##### Raises:
* `ValueError`: if one of the arguments is invalid.
- - -
### `class tf.train.RMSPropOptimizer`
Optimizer that implements the RMSProp algorithm.
- - -
#### `tf.train.RMSPropOptimizer.__init__(learning_rate, decay, momentum=0.0, epsilon=1e-10, use_locking=False, name='RMSProp')`
Construct a new RMSProp optimizer.
##### Args:
* `learning_rate`: A Tensor or a floating point value. The learning rate.
* `decay`: Discounting factor for the history/coming gradient.
* `momentum`: A scalar tensor.
* `epsilon`: Small value to avoid a zero denominator.
* `use_locking`: If True use locks for update operations.
* `name`: Optional name prefix for the operations created when applying
gradients. Defaults to "RMSProp".
## Gradient Computation
TensorFlow provides functions to compute the derivatives for a given
TensorFlow computation graph, adding operations to the graph. The
optimizer classes automatically compute derivatives on your graph, but
creators of new Optimizers or expert users can call the lower-level
functions below.
- - -
### `tf.gradients(ys, xs, grad_ys=None, name='gradients', colocate_gradients_with_ops=False, gate_gradients=False, aggregation_method=None)`
Constructs symbolic partial derivatives of `ys` w.r.t. x in `xs`.
`ys` and `xs` are each a `Tensor` or a list of tensors. `grad_ys`
is a list of `Tensor`, holding the gradients received by the
`ys`. The list must be the same length as `ys`.
`gradients()` adds ops to the graph to output the partial
derivatives of `ys` with respect to `xs`. It returns a list of
`Tensor` of length `len(xs)` where each tensor is the `sum(dy/dx)`
for y in `ys`.
`grad_ys` is a list of tensors of the same length as `ys` that holds
the initial gradients for each y in `ys`. When `grad_ys` is None,
we fill in a tensor of '1's of the shape of y for each y in `ys`. A
user can provide their own initial `grad_ys` to compute the
derivatives using a different initial gradient for each y (e.g., if
one wanted to weight the gradient differently for each value in
each y).
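The summing behaviour and the role of `grad_ys` can be checked numerically on scalars. The sketch below uses central finite differences rather than symbolic differentiation, so it only mimics the result and not the mechanism; `numeric_derivative` and `summed_gradient` are hypothetical helpers.

```python
def numeric_derivative(f, x, h=1e-6):
    """Central finite-difference estimate of df/dx at x."""
    return (f(x + h) - f(x - h)) / (2 * h)


def summed_gradient(ys, x, grad_ys=None):
    """Return sum over y of grad_y * dy/dx, with grad_y defaulting to 1."""
    if grad_ys is None:
        grad_ys = [1.0] * len(ys)
    return sum(w * numeric_derivative(y, x) for y, w in zip(ys, grad_ys))


# Two outputs that depend on the same input x.
ys = [lambda x: x * x, lambda x: 3.0 * x]

default = summed_gradient(ys, 2.0)                # 2*x + 3 at x = 2
weighted = summed_gradient(ys, 2.0, [0.5, 2.0])   # 0.5*(2*x) + 2.0*3 at x = 2
```

With the default initial gradients of 1 the two contributions simply add; supplying `[0.5, 2.0]` weights each output's contribution before the sum.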
##### Args:
* `ys`: A `Tensor` or list of tensors to be differentiated.
* `xs`: A `Tensor` or list of tensors to be used for differentiation.
* `grad_ys`: Optional. A `Tensor` or list of tensors the same size as
`ys` and holding the gradients computed for each y in `ys`.
* `name`: Optional name to use for grouping all the gradient ops together.
Defaults to 'gradients'.
* `colocate_gradients_with_ops`: If True, try colocating gradients with
the corresponding op.
* `gate_gradients`: If True, add a tuple around the gradients returned
for an operation. This avoids some race conditions.
* `aggregation_method`: Specifies the method used to combine gradient terms.
Accepted values are constants defined in the class `AggregationMethod`.
##### Returns:
A list of `sum(dy/dx)` for each x in `xs`.
##### Raises:
* `LookupError`: if one of the operations between `x` and `y` does not
have a registered gradient function.
* `ValueError`: if the arguments are invalid.
- - -
### `class tf.AggregationMethod`
A class listing aggregation methods used to combine gradients.
Computing partial derivatives can require aggregating gradient
contributions. This class lists the various methods that can
be used to combine gradients in the graph:
* `ADD_N`: All of the gradient terms are summed as part of one
operation using the "AddN" op. It has the property that all
gradients must be ready before any aggregation is performed.
* `DEFAULT`: The system-chosen default aggregation method.
- - -
### `tf.stop_gradient(input, name=None)`
Stops gradient computation.
When executed in a graph, this op outputs its input tensor as-is.
When building ops to compute gradients, this op prevents the contribution of
its inputs from being taken into account. Normally, the gradient generator adds
ops to a graph to compute the derivatives of a specified 'loss' by recursively
finding out inputs that contributed to its computation. If you insert this op
in the graph, its inputs are masked from the gradient generator. They are not
taken into account for computing gradients.
This is useful any time you want to compute a value with TensorFlow but need
to pretend that the value was a constant. Some examples include:
* The *EM* algorithm where the *M-step* should not involve backpropagation
through the output of the *E-step*.
* Contrastive divergence training of Boltzmann machines where, when
differentiating the energy function, the training must not backpropagate
through the graph that generated the samples from the model.
* Adversarial training, where no backprop should happen through the adversarial
example generation process.
##### Args:
* `input`: A `Tensor`.
* `name`: A name for the operation (optional).
##### Returns:
A `Tensor`. Has the same type as `input`.
## Gradient Clipping
TensorFlow provides several operations that you can use to add clipping
functions to your graph. You can use these functions to perform general data
clipping, but they're particularly useful for handling exploding or vanishing
gradients.
- - -
### `tf.clip_by_value(t, clip_value_min, clip_value_max, name=None)`
Clips tensor values to a specified min and max.
Given a tensor `t`, this operation returns a tensor of the same type and
shape as `t` with its values clipped to `clip_value_min` and `clip_value_max`.
Any values less than `clip_value_min` are set to `clip_value_min`. Any values
greater than `clip_value_max` are set to `clip_value_max`.
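A minimal plain-Python sketch of the same element-wise rule, applied to a flat list of floats rather than a tensor (the list-based function here is an illustration only):

```python
def clip_by_value(values, clip_value_min, clip_value_max):
    """Clamp every element of a flat list to [clip_value_min, clip_value_max]."""
    return [min(max(v, clip_value_min), clip_value_max) for v in values]


clipped = clip_by_value([-3.0, -0.5, 0.0, 2.5], -1.0, 1.0)
```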
##### Args:
* `t`: A `Tensor`.
* `clip_value_min`: A 0-D (scalar) `Tensor`. The minimum value to clip by.
* `clip_value_max`: A 0-D (scalar) `Tensor`. The maximum value to clip by.
* `name`: A name for the operation (optional).
##### Returns:
A clipped `Tensor`.
- - -
### `tf.clip_by_norm(t, clip_norm, name=None)`
Clips tensor values to a maximum L2-norm.
Given a tensor `t`, and a maximum clip value `clip_norm`, this operation
normalizes `t` so that its L2-norm is less than or equal to `clip_norm`.
Specifically, if the L2-norm is already less than or equal to `clip_norm`,
then `t` is not modified. If the L2-norm is greater than `clip_norm`, then
this operation returns a tensor of the same type and shape as `t` with its
values set to:
`t * clip_norm / l2norm(t)`
In this case, the L2-norm of the output tensor is `clip_norm`.
This operation is typically used to clip gradients before applying them with
an optimizer.
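The rule can be sketched in plain Python for a flat list of floats. This is an illustration of the arithmetic described above, not the TensorFlow op; `l2norm` is a helper defined here.

```python
import math


def l2norm(t):
    """L2-norm of a flat list of floats."""
    return math.sqrt(sum(v * v for v in t))


def clip_by_norm(t, clip_norm):
    """Rescale t so that its L2-norm is at most clip_norm."""
    norm = l2norm(t)
    if norm <= clip_norm:
        return list(t)          # already within the limit: unchanged
    return [v * clip_norm / norm for v in t]


small = clip_by_norm([0.3, 0.4], clip_norm=1.0)   # norm 0.5, left as-is
large = clip_by_norm([3.0, 4.0], clip_norm=1.0)   # norm 5.0, rescaled
```

Note that the direction of `large` is preserved; only its length shrinks to `clip_norm`.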
##### Args:
* `t`: A `Tensor`.
* `clip_norm`: A 0-D (scalar) `Tensor` > 0. A maximum clipping value.
* `name`: A name for the operation (optional).
##### Returns:
A clipped `Tensor`.
- - -
### `tf.clip_by_average_norm(t, clip_norm, name=None)`
Clips tensor values to a maximum average L2-norm.
Given a tensor `t`, and a maximum clip value `clip_norm`, this operation
normalizes `t` so that its average L2-norm is less than or equal to
`clip_norm`. Specifically, if the average L2-norm is already less than or
equal to `clip_norm`, then `t` is not modified. If the average L2-norm is
greater than `clip_norm`, then this operation returns a tensor of the same
type and shape as `t` with its values set to:
`t * clip_norm / l2norm_avg(t)`
In this case, the average L2-norm of the output tensor is `clip_norm`.
This operation is typically used to clip gradients before applying them with
an optimizer.
##### Args:
* `t`: A `Tensor`.
* `clip_norm`: A 0-D (scalar) `Tensor` > 0. A maximum clipping value.
* `name`: A name for the operation (optional).
##### Returns:
A clipped `Tensor`.
- - -
### `tf.clip_by_global_norm(t_list, clip_norm, use_norm=None, name=None)`
Clips values of multiple tensors by the ratio of the sum of their norms.
Given a tuple or list of tensors `t_list`, and a clipping ratio `clip_norm`,
this operation returns a list of clipped tensors `list_clipped`
and the global norm (`global_norm`) of all tensors in `t_list`. Optionally,
if you've already computed the global norm for `t_list`, you can specify
the global norm with `use_norm`.
To perform the clipping, the values t_list[i] are set to:
`t_list[i] * clip_norm / max(global_norm, clip_norm)`
where:
`global_norm = sqrt(sum([l2norm(t)**2 for t in t_list]))`
If `clip_norm > global_norm` then the entries in `t_list` remain as they are,
otherwise they're all shrunk by the global ratio.
Any of the entries of `t_list` that are of type None are ignored.
This is the correct way to perform gradient clipping (for example, see
R. Pascanu, T. Mikolov, and Y. Bengio, "On the difficulty of training
Recurrent Neural Networks". http://arxiv.org/abs/1211.5063)
However, it is slower than `clip_by_norm()` because all the parameters must be
ready before the clipping operation can be performed.
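The formulas above can be sketched in plain Python over a list of flat float lists. Keeping `None` entries in place in the output is an assumption of this sketch; the text only says that they are ignored.

```python
import math


def global_norm(t_list):
    """sqrt of the sum of squared elements over all non-None entries."""
    return math.sqrt(sum(v * v for t in t_list if t is not None for v in t))


def clip_by_global_norm(t_list, clip_norm, use_norm=None):
    """Scale every entry by clip_norm / max(global_norm, clip_norm)."""
    norm = global_norm(t_list) if use_norm is None else use_norm
    scale = clip_norm / max(norm, clip_norm)
    # None entries are passed through untouched (assumption of this sketch).
    clipped = [None if t is None else [v * scale for v in t] for t in t_list]
    return clipped, norm


grads = [[3.0, 0.0], None, [0.0, 4.0]]
clipped, norm = clip_by_global_norm(grads, clip_norm=1.0)
```

Because all entries share one scale factor, the relative sizes of the individual gradients are preserved, which is what distinguishes this from clipping each tensor separately with `clip_by_norm()`.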
##### Args:
* `t_list`: A tuple or list of mixed `Tensors`, `IndexedSlices`, or None.
* `clip_norm`: A 0-D (scalar) `Tensor` > 0. The clipping ratio.
* `use_norm`: A 0-D (scalar) `Tensor` of type `float` (optional). The global
norm to use. If not provided, `global_norm()` is used to compute the norm.
* `name`: A name for the operation (optional).
##### Returns:
* `list_clipped`: A list of `Tensors` of the same type as `t_list`.
* `global_norm`: A 0-D (scalar) `Tensor` representing the global norm.
##### Raises:
* `TypeError`: If `t_list` is not a sequence.
- - -
### `tf.global_norm(t_list, name=None)`
Computes the global norm of multiple tensors.
Given a tuple or list of tensors `t_list`, this operation returns the
global norm of the elements in all tensors in `t_list`. The global norm is
computed as:
`global_norm = sqrt(sum([l2norm(t)**2 for t in t_list]))`
Any entries in `t_list` that are of type None are ignored.
##### Args:
* `t_list`: A tuple or list of mixed `Tensors`, `IndexedSlices`, or None.
* `name`: A name for the operation (optional).
##### Returns:
A 0-D (scalar) `Tensor` of type `float`.
##### Raises:
* `TypeError`: If `t_list` is not a sequence.
## Decaying the learning rate
- - -
### `tf.train.exponential_decay(learning_rate, global_step, decay_steps, decay_rate, staircase=False, name=None)`
Applies exponential decay to the learning rate.
When training a model, it is often recommended to lower the learning rate as
the training progresses. This function applies an exponential decay function
to a provided initial learning rate. It requires a `global_step` value to
compute the decayed learning rate. You can just pass a TensorFlow variable
that you increment at each training step.
The function returns the decayed learning rate. It is computed as:
```python
decayed_learning_rate = learning_rate *
decay_rate ^ (global_step / decay_steps)
```
If the argument `staircase` is `True`, then `global_step / decay_steps` is an
integer division and the decayed learning rate follows a staircase function.
Example: decay every 100000 steps with a base of 0.96:
```python
...
global_step = tf.Variable(0, trainable=False)
starter_learning_rate = 0.1
learning_rate = tf.train.exponential_decay(starter_learning_rate, global_step,
                                           100000, 0.96, staircase=True)
optimizer = tf.train.GradientDescentOptimizer(learning_rate)
# Passing global_step to minimize() will increment it at each step.
optimizer.minimize(...my loss..., global_step=global_step)
```
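To see what values the schedule produces, the decay formula can be evaluated in plain Python. This sketch works on Python numbers only and is not the graph op; it reuses the parameter names from the signature above.

```python
def exponential_decay(learning_rate, global_step, decay_steps, decay_rate,
                      staircase=False):
    """Evaluate learning_rate * decay_rate ^ (global_step / decay_steps)."""
    if staircase:
        exponent = global_step // decay_steps        # integer division
    else:
        exponent = global_step / float(decay_steps)
    return learning_rate * decay_rate ** exponent


# Same settings as the example above, evaluated at step 250000.
smooth = exponential_decay(0.1, 250000, 100000, 0.96)
stepped = exponential_decay(0.1, 250000, 100000, 0.96, staircase=True)
```

With `staircase=True` the rate at step 250000 is `0.1 * 0.96 ** 2`, and it stays there until step 300000; without it the rate has already decayed slightly further.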
##### Args:
* `learning_rate`: A scalar `float32` or `float64` `Tensor` or a
Python number. The initial learning rate.
* `global_step`: A scalar `int32` or `int64` `Tensor` or a Python number.
Global step to use for the decay computation. Must not be negative.
* `decay_steps`: A scalar `int32` or `int64` `Tensor` or a Python number.
Must be positive. See the decay computation above.
* `decay_rate`: A scalar `float32` or `float64` `Tensor` or a
Python number. The decay rate.
* `staircase`: Boolean. If `True`, decay the learning rate at discrete intervals.
* `name`: string. Optional name of the operation. Defaults to 'ExponentialDecay'.
##### Returns:
A scalar `Tensor` of the same type as `learning_rate`. The decayed
learning rate.
## Moving Averages
Some training algorithms, such as GradientDescent and Momentum, often benefit
from maintaining a moving average of variables during optimization. Using the
moving averages for evaluations often improves results significantly.
- - -
### `class tf.train.ExponentialMovingAverage`
Maintains moving averages of variables by employing an exponential decay.
When training a model, it is often beneficial to maintain moving averages of
the trained parameters. Evaluations that use averaged parameters sometimes
produce significantly better results than the final trained values.
The `apply()` method adds shadow copies of trained variables and adds ops that
maintain a moving average of the trained variables in their shadow copies.
It is used when building the training model. The ops that maintain moving
averages are typically run after each training step.
The `average()` and `average_name()` methods give access to the shadow
variables and their names. They are useful when building an evaluation
model, or when restoring a model from a checkpoint file. They help use the
moving averages in place of the last trained values for evaluations.
The moving averages are computed using exponential decay. You specify the
decay value when creating the `ExponentialMovingAverage` object. The shadow
variables are initialized with the same initial values as the trained
variables. When you run the ops to maintain the moving averages, each
shadow variable is updated with the formula:
`shadow_variable -= (1 - decay) * (shadow_variable - variable)`
This is mathematically equivalent to the classic formula below, but the use
of an `assign_sub` op (the `"-="` in the formula) allows concurrent lockless
updates to the variables:
`shadow_variable = decay * shadow_variable + (1 - decay) * variable`
Reasonable values for `decay` are close to 1.0, typically in the
multiple-nines range: 0.999, 0.9999, etc.
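The equivalence of the two formulas is easy to confirm on scalars. The following plain-Python sketch applies both forms to the same sequence of values; the function names and the sample values are made up for illustration.

```python
def update_assign_sub(shadow, variable, decay):
    """The in-place form: shadow -= (1 - decay) * (shadow - variable)."""
    shadow -= (1 - decay) * (shadow - variable)
    return shadow


def update_classic(shadow, variable, decay):
    """The classic form: decay * shadow + (1 - decay) * variable."""
    return decay * shadow + (1 - decay) * variable


decay = 0.9
# Both shadows start at the variable's initial value, here 1.0.
a = b = 1.0
for value in [2.0, 4.0, 3.0]:
    a = update_assign_sub(a, value, decay)
    b = update_classic(b, value, decay)
```

Both `a` and `b` end at the same value up to floating-point rounding. A `decay` of 0.9 is used here only so the movement is visible after three steps.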
Example usage when creating a training model:
```python
# Create variables.
var0 = tf.Variable(...)
var1 = tf.Variable(...)
# ... use the variables to build a training model...
...
# Create an op that applies the optimizer. This is what we usually
# would use as a training op.
opt_op = opt.minimize(my_loss, var_list=[var0, var1])
# Create an ExponentialMovingAverage object
ema = tf.train.ExponentialMovingAverage(decay=0.9999)
# Create the shadow variables, and add ops to maintain moving averages
# of var0 and var1.
maintain_averages_op = ema.apply([var0, var1])
# Create an op that will update the moving averages after each training
# step. This is what we will use in place of the usual training op.
with tf.control_dependencies([opt_op]):
    training_op = tf.group(maintain_averages_op)
...train the model by running training_op...
```
There are two ways to use the moving averages for evaluations:
* Build a model that uses the shadow variables instead of the variables.
For this, use the `average()` method which returns the shadow variable
for a given variable.
* Build a model normally but load the checkpoint files to evaluate by using
the shadow variable names. For this use the `average_name()` method. See
the [Saver class](../../api_docs/python/train.md#Saver) for more
information on restoring saved variables.
Example of restoring the shadow variable values:
```python
# Create a Saver that loads variables from their saved shadow values.
shadow_var0_name = ema.average_name(var0)
shadow_var1_name = ema.average_name(var1)
saver = tf.train.Saver({shadow_var0_name: var0, shadow_var1_name: var1})
saver.restore(...checkpoint filename...)
# var0 and var1 now hold the moving average values
```
- - -
#### `tf.train.ExponentialMovingAverage.__init__(decay, num_updates=None, name='ExponentialMovingAverage')`
Creates a new ExponentialMovingAverage object.
The `apply()` method has to be called to create shadow variables and add
ops to maintain moving averages.
The optional `num_updates` parameter allows one to tweak the decay rate
dynamically. It is typical to pass the count of training steps, usually
kept in a variable that is incremented at each step, in which case the
decay rate is lower at the start of training. This makes moving averages
move faster. If passed, the actual decay rate used is:
`min(decay, (1 + num_updates) / (10 + num_updates))`
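A quick plain-Python evaluation of that expression shows how the effective decay ramps up; `effective_decay` is a hypothetical helper name.

```python
def effective_decay(decay, num_updates=None):
    """Return the decay actually used for a given update count."""
    if num_updates is None:
        return decay
    return min(decay, (1.0 + num_updates) / (10.0 + num_updates))


early = effective_decay(0.9999, 0)         # (1 + 0) / (10 + 0)
middle = effective_decay(0.9999, 90)       # (1 + 90) / (10 + 90)
late = effective_decay(0.9999, 1000000)    # capped by decay itself
```

Early in training the shadow variables therefore track the trained variables closely, and the configured `decay` only takes over once `num_updates` is large.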
##### Args:
* `decay`: Float. The decay to use.
* `num_updates`: Optional count of number of updates applied to variables.
* `name`: String. Optional prefix name to use for the name of ops added in
`apply()`.
- - -
#### `tf.train.ExponentialMovingAverage.apply(var_list=None)`
Maintains moving averages of variables.
`var_list` must be a list of `Variable` or `Tensor` objects. This method
creates shadow variables for all elements of `var_list`. Shadow variables
for `Variable` objects are initialized to the variable's initial value.
For `Tensor` objects, the shadow variables are initialized to 0.
Shadow variables are created with `trainable=False` and added to the
`GraphKeys.ALL_VARIABLES` collection. They will be returned by calls to
`tf.all_variables()`.
Returns an op that updates all shadow variables as described above.
Note that `apply()` can be called multiple times with different lists of
variables.
##### Args:
* `var_list`: A list of Variable or Tensor objects. The variables
and Tensors must be of types float32 or float64.
##### Returns:
An Operation that updates the moving averages.
##### Raises:
* `TypeError`: If the arguments are not all float32 or float64.
* `ValueError`: If the moving average of one of the variables is already
being computed.
- - -
#### `tf.train.ExponentialMovingAverage.average_name(var)`
Returns the name of the `Variable` holding the average for `var`.
The typical scenario for `ExponentialMovingAverage` is to compute moving
averages of variables during training, and restore the variables from the
computed moving averages during evaluations.
To restore variables, you have to know the name of the shadow variables.
That name and the original variable can then be passed to a `Saver()` object
to restore the variable from the moving average value with:
`saver = tf.train.Saver({ema.average_name(var): var})`
`average_name()` can be called whether or not `apply()` has been called.
##### Args:
* `var`: A `Variable` object.
##### Returns:
A string: the name of the variable that will be used or was used
by the `ExponentialMovingAverage` class to hold the moving average of
`var`.
- - -
#### `tf.train.ExponentialMovingAverage.average(var)`
Returns the `Variable` holding the average of `var`.
##### Args:
* `var`: A `Variable` object.
##### Returns:
A `Variable` object or `None` if the moving average of `var`
is not maintained.
## Coordinator and QueueRunner
See [Threading and Queues](../../how_tos/threading_and_queues/index.md)
for how to use threads and queues. For documentation on the Queue API,
see [Queues](../../api_docs/python/io_ops.md#queues).
- - -
### `class tf.train.Coordinator`
A coordinator for threads.
This class implements a simple mechanism to coordinate the termination of a
set of threads.
#### Usage:
```python
# Create a coordinator.
coord = Coordinator()
# Start a number of threads, passing the coordinator to each of them.
...start thread 1...(coord, ...)
...start thread N...(coord, ...)
# Wait for all the threads to terminate.
coord.join(threads)
```
Any of the threads can call `coord.request_stop()` to ask for all the threads
to stop. To cooperate with the requests, each thread must check for
`coord.should_stop()` on a regular basis. `coord.should_stop()` returns
`True` as soon as `coord.request_stop()` has been called.
A typical thread running with a Coordinator will do something like:
```python
while not coord.should_stop():
    ...do some work...
```
#### Exception handling:
A thread can report an exception to the Coordinator as part of the
`request_stop()` call. The exception will be re-raised from the
`coord.join()` call.
Thread code:
```python
try:
    while not coord.should_stop():
        ...do some work...
except Exception as e:
    coord.request_stop(e)
```
Main code:
```python
try:
    ...
    coord = Coordinator()
    # Start a number of threads, passing the coordinator to each of them.
    ...start thread 1...(coord, ...)
    ...start thread N...(coord, ...)
    # Wait for all the threads to terminate.
    coord.join(threads)
except Exception as e:
    ...exception that was passed to coord.request_stop()
```
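The snippets above leave the thread bodies elided. To show the whole protocol running end to end, here is a minimal sketch built on the standard `threading` module. `MiniCoordinator` is a hypothetical stand-in written for this example and is not `tf.train.Coordinator`; it omits the grace period and simply joins every thread.

```python
import threading


class MiniCoordinator:
    """Hypothetical stand-in for the protocol; not tf.train.Coordinator."""

    def __init__(self):
        self._stop = threading.Event()
        self._lock = threading.Lock()
        self._error = None

    def should_stop(self):
        return self._stop.is_set()

    def request_stop(self, ex=None):
        with self._lock:
            # Only the exception from the first stop request is recorded.
            if not self._stop.is_set():
                self._error = ex
                self._stop.set()

    def join(self, threads):
        for thread in threads:
            thread.join()
        # Re-raise the exception that was reported, if any.
        if self._error is not None:
            raise self._error


def worker(coord, index, counts):
    """Count until told to stop; worker 0 fails after five iterations."""
    try:
        while not coord.should_stop():
            counts[index] += 1
            if index == 0 and counts[index] == 5:
                raise ValueError("worker 0 failed")
    except Exception as e:
        coord.request_stop(e)


coord = MiniCoordinator()
counts = [0, 0, 0]
threads = [threading.Thread(target=worker, args=(coord, i, counts))
           for i in range(3)]
for thread in threads:
    thread.start()
try:
    coord.join(threads)
    reported = None
except ValueError as e:
    reported = str(e)
```

The failure in worker 0 stops the other two workers, because they poll `should_stop()`, and the same exception surfaces in the main thread from `join()`.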
#### Grace period for stopping:
After a thread has called `coord.request_stop()`, the other threads have a
fixed time to stop. This is called the 'stop grace period' and defaults to 2
minutes. If any of the threads is still alive after the grace period expires,
`coord.join()` raises a RuntimeError reporting the laggards.
```
try:
    ...
    coord = Coordinator()
    # Start a number of threads, passing the coordinator to each of them.
    ...start thread 1...(coord, ...)
    ...start thread N...(coord, ...)
    # Wait for all the threads to terminate, give them 10s grace period
    coord.join(threads, stop_grace_period_secs=10)
except RuntimeError:
    ...one of the threads took more than 10s to stop after request_stop()
    ...was called.
except Exception:
    ...exception that was passed to coord.request_stop()
```
- - -
#### `tf.train.Coordinator.__init__()`
Create a new Coordinator.
- - -
#### `tf.train.Coordinator.join(threads, stop_grace_period_secs=120)`
Wait for threads to terminate.
Blocks until all 'threads' have terminated or request_stop() is called.
After the threads stop, if an 'exc_info' was passed to request_stop, that
exception is re-raised.
Grace period handling: When request_stop() is called, threads are given
'stop_grace_period_secs' seconds to terminate. If any of them is still
alive after that period expires, a RuntimeError is raised. Note that if
an 'exc_info' was passed to request_stop() then it is raised instead of
that RuntimeError.
##### Args:
* `threads`: List of threading.Threads. The started threads to join.
* `stop_grace_period_secs`: Number of seconds given to threads to stop after
request_stop() has been called.
##### Raises:
* `RuntimeError`: If any thread is still alive after request_stop()
is called and the grace period expires.
- - -
#### `tf.train.Coordinator.request_stop(ex=None)`
Request that the threads stop.
After this is called, calls to should_stop() will return True.
##### Args:
* `ex`: Optional Exception, or Python 'exc_info' tuple as returned by
sys.exc_info(). If this is the first call to request_stop() the
corresponding exception is recorded and re-raised from join().
- - -
#### `tf.train.Coordinator.should_stop()`
Check if stop was requested.
##### Returns:
True if a stop was requested.
- - -
#### `tf.train.Coordinator.wait_for_stop(timeout=None)`
Wait till the Coordinator is told to stop.
##### Args:
* `timeout`: float. Sleep for up to that many seconds waiting for
should_stop() to become True.
##### Returns:
True if the Coordinator is told to stop, False if the timeout expired.
- - -
### `class tf.train.QueueRunner`
Holds a list of enqueue operations for a queue, each to be run in a thread.
Queues are a convenient TensorFlow mechanism to compute tensors
asynchronously using multiple threads. For example in the canonical 'Input
Reader' setup one set of threads generates filenames in a queue; a second set
of threads reads records from the files, processes them, and enqueues tensors
on a second queue; a third set of threads dequeues these input records to
construct batches and runs them through training operations.
There are several delicate issues when running multiple threads that way:
closing the queues in sequence as the input is exhausted, correctly catching
and reporting exceptions, etc.
The `QueueRunner`, combined with the `Coordinator`, helps handle these issues.
- - -
#### `tf.train.QueueRunner.__init__(queue, enqueue_ops)`
Create a QueueRunner.
On construction the `QueueRunner` adds an op to close the queue. That op
will be run if the enqueue ops raise exceptions.
When you later call the `create_threads()` method, the `QueueRunner` will
create one thread for each op in `enqueue_ops`. Each thread will run its
enqueue op in parallel with the other threads. The enqueue ops do not have
to all be the same op, but it is expected that they all enqueue tensors in
`queue`.
##### Args:
* `queue`: A `Queue`.
* `enqueue_ops`: List of enqueue ops to run in threads later.
- - -
#### `tf.train.QueueRunner.create_threads(sess, coord=None, daemon=False, start=False)`
Create threads to run the enqueue ops.
This method requires a session in which the graph was launched. It creates
a list of threads, optionally starting them. There is one thread for each
op passed in `enqueue_ops`.
The `coord` argument is an optional coordinator, that the threads will use
to terminate together and report exceptions. If a coordinator is given,
this method starts an additional thread to close the queue when the
coordinator requests a stop.
This method may be called again as long as all threads from a previous call
have stopped.
##### Args:
* `sess`: A `Session`.
* `coord`: Optional `Coordinator` object for reporting errors and checking
stop conditions.
* `daemon`: Boolean. If `True` make the threads daemon threads.
* `start`: Boolean. If `True` starts the threads. If `False` the
caller must call the `start()` method of the returned threads.
##### Returns:
A list of threads.
##### Raises:
* `RuntimeError`: If threads from a previous call to `create_threads()` are
still running.
- - -
#### `tf.train.QueueRunner.exceptions_raised`
Exceptions raised but not handled by the `QueueRunner` threads.
Exceptions raised in queue runner threads are handled in one of two ways
depending on whether or not a `Coordinator` was passed to
`create_threads()`:
* With a `Coordinator`, exceptions are reported to the coordinator and
forgotten by the `QueueRunner`.
* Without a `Coordinator`, exceptions are captured by the `QueueRunner` and
made available in this `exceptions_raised` property.
##### Returns:
A list of Python `Exception` objects. The list is empty if no exception
was captured. (No exceptions are captured when using a Coordinator.)
- - -
### `tf.train.add_queue_runner(qr, collection='queue_runners')`
Adds a `QueueRunner` to a collection in the graph.
When building a complex model that uses many queues it is often difficult to
gather all the queue runners that need to be run. This convenience function
allows you to add a queue runner to a well known collection in the graph.
The companion method `start_queue_runners()` can be used to start threads for
all the collected queue runners.
##### Args:
* `qr`: A `QueueRunner`.
* `collection`: A `GraphKey` specifying the graph collection to add
the queue runner to. Defaults to `GraphKeys.QUEUE_RUNNERS`.
- - -
### `tf.train.start_queue_runners(sess=None, coord=None, daemon=True, start=True, collection='queue_runners')`
Starts all queue runners collected in the graph.
This is a companion method to `add_queue_runner()`. It just starts
threads for all queue runners collected in the graph. It returns
the list of all threads.
##### Args:
* `sess`: `Session` used to run the queue ops. Defaults to the
default session.
* `coord`: Optional `Coordinator` for coordinating the started threads.
* `daemon`: Whether the threads should be marked as `daemons`, meaning
they don't block program exit.
* `start`: Set to `False` to only create the threads, not start them.
* `collection`: A `GraphKey` specifying the graph collection to
get the queue runners from. Defaults to `GraphKeys.QUEUE_RUNNERS`.
##### Returns:
A list of threads.
## Summary Operations
The following ops output
[`Summary`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/framework/summary.proto)
protocol buffers as serialized string tensors.
You can fetch the output of a summary op in a session, and pass it to
a [SummaryWriter](../../api_docs/python/train.md#SummaryWriter) to append it
to an event file. Event files contain
[`Event`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/util/event.proto)
protos that can contain `Summary` protos along with the timestamp and
step. You can then use TensorBoard to visualize the contents of the
event files. See [TensorBoard and
Summaries](../../how_tos/summaries_and_tensorboard/index.md) for more
details.
- - -
### `tf.scalar_summary(tags, values, collections=None, name=None)`
Outputs a `Summary` protocol buffer with scalar values.
The input `tags` and `values` must have the same shape. The generated
summary has a summary value for each tag-value pair in `tags` and `values`.
##### Args:
* `tags`: A 1-D `string` `Tensor`. Tags for the summaries.
* `values`: A 1-D `float32` or `float64` Tensor. Values for the summaries.
* `collections`: Optional list of graph collections keys. The new summary op is
added to these collections. Defaults to `[GraphKeys.SUMMARIES]`.
* `name`: A name for the operation (optional).
##### Returns:
A scalar `Tensor` of type `string`. The serialized `Summary` protocol
buffer.
- - -
### `tf.image_summary(tag, tensor, max_images=None, collections=None, name=None)`
Outputs a `Summary` protocol buffer with images.
The summary has up to `max_images` summary values containing images. The
images are built from `tensor` which must be 4-D with shape `[batch_size,
height, width, channels]` and where `channels` can be:
* 1: `tensor` is interpreted as Grayscale.
* 3: `tensor` is interpreted as RGB.
* 4: `tensor` is interpreted as RGBA.
The images have the same number of channels as the input tensor. Their values
are normalized, one image at a time, to fit in the range `[0, 255]`. The
op uses two different normalization algorithms:
* If the input values are all positive, they are rescaled so the largest one
is 255.
* If any input value is negative, the values are shifted so input value 0.0
is at 127. They are then rescaled so that either the smallest value is 0,
or the largest one is 255.
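The two normalization rules can be sketched in plain Python. The function name `normalize_image` is hypothetical, the input is a flat list of floats rather than a tensor, and the exact scale factor in the mixed-sign case is an assumption derived from the description above, not taken from the op's source.

```python
def normalize_image(values):
    """Map a flat list of floats to [0, 255] following the two rules above.

    The scale factor used when negative values are present is an assumption.
    """
    lo, hi = min(values), max(values)
    if lo >= 0:
        # No negative input: rescale so the largest value becomes 255.
        scale = 255.0 / hi if hi > 0 else 0.0
        return [v * scale for v in values]
    # Some value is negative: put 0.0 at 127, then use the largest scale
    # that keeps both ends inside [0, 255].
    scale = 127.0 / -lo
    if hi > 0:
        scale = min(scale, 128.0 / hi)
    return [127.0 + v * scale for v in values]


print(normalize_image([0.0, 1.0, 2.0]))   # [0.0, 127.5, 255.0]
print(normalize_image([-1.0, 0.0, 0.5]))  # [0.0, 127.0, 190.5]
```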
The `tag` argument is a scalar `Tensor` of type `string`. It is used to
build the `tag` of the summary values:
* If `max_images` is 1, the summary value tag is '*tag*/image'.
* If `max_images` is greater than 1, the summary value tags are
generated sequentially as '*tag*/image/0', '*tag*/image/1', etc.
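As a small illustration of the tag naming rule, the hypothetical helper below lists the summary value tags for a given base tag. It assumes the batch contains at least `max_images` elements.

```python
def image_summary_tags(tag, max_images):
    """Return the summary value tags described above for a base tag."""
    if max_images == 1:
        return ["%s/image" % tag]
    return ["%s/image/%d" % (tag, i) for i in range(max_images)]


print(image_summary_tags("input", 1))  # ['input/image']
print(image_summary_tags("input", 3))
```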
##### Args:
* `tag`: A scalar `Tensor` of type `string`. Used to build the `tag`
of the summary values.
* `tensor`: A 4-D `float32` `Tensor` of shape `[batch_size, height, width,
channels]` where `channels` is 1, 3, or 4.
* `max_images`: Max number of batch elements to generate images for.
* `collections`: Optional list of ops.GraphKeys. The collections to add the
summary to. Defaults to [ops.GraphKeys.SUMMARIES]
* `name`: A name for the operation (optional).
##### Returns:
A scalar `Tensor` of type `string`. The serialized `Summary` protocol
buffer.
- - -
### `tf.histogram_summary(tag, values, collections=None, name=None)`
Outputs a `Summary` protocol buffer with a histogram.
The generated
[`Summary`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/framework/summary.proto)
has one summary value containing a histogram for `values`.
This op reports an `OutOfRange` error if any value is not finite.
##### Args:
* `tag`: A `string` `Tensor`. 0-D. Tag to use for the summary value.
* `values`: A `float32` `Tensor`. Any shape. Values to use to build the
histogram.
* `collections`: Optional list of graph collections keys. The new summary op is
added to these collections. Defaults to `[GraphKeys.SUMMARIES]`.
* `name`: A name for the operation (optional).
##### Returns:
A scalar `Tensor` of type `string`. The serialized `Summary` protocol
buffer.
- - -
### `tf.nn.zero_fraction(value, name=None)`
Returns the fraction of zeros in `value`.
If `value` is empty, the result is `nan`.
This is useful in summaries to measure and report sparsity. For example,
    z = tf.nn.relu(...)
    summ = tf.scalar_summary('sparsity', tf.nn.zero_fraction(z))
##### Args:
* `value`: A tensor of numeric type.
* `name`: A name for the operation (optional).
##### Returns:
The fraction of zeros in `value`, with type `float32`.
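The computation itself is simple enough to sketch without TensorFlow. The plain-Python `zero_fraction` below is a hypothetical stand-in that works on a flat list and only mirrors the documented behavior, including the `nan` result for empty input.

```python
def zero_fraction(values):
    """Fraction of entries equal to zero; NaN for an empty input."""
    if not values:
        return float("nan")
    return sum(1 for v in values if v == 0) / float(len(values))


# A ReLU output is a typical input: negative activations become zeros.
relu_output = [max(0.0, x) for x in [-2.0, -0.5, 0.0, 1.5]]
print(zero_fraction(relu_output))  # 0.75
```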
- - -
### `tf.merge_summary(inputs, collections=None, name=None)`
Merges summaries.
This op creates a
[`Summary`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/framework/summary.proto)
protocol buffer that contains the union of all the values in the input
summaries.
When the Op is run, it reports an `InvalidArgument` error if multiple values
in the summaries to merge use the same tag.
##### Args:
* `inputs`: A list of `string` `Tensor` objects containing serialized `Summary`
protocol buffers.
* `collections`: Optional list of graph collections keys. The new summary op is
added to these collections. Defaults to `[GraphKeys.SUMMARIES]`.
* `name`: A name for the operation (optional).
##### Returns:
A scalar `Tensor` of type `string`. The serialized `Summary` protocol
buffer resulting from the merging.
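The merge semantics can be sketched by modelling each summary as a dictionary from tag to value. This representation and the `merge_summaries` name are assumptions made for illustration; `ValueError` stands in for the op's `InvalidArgument` error.

```python
def merge_summaries(summaries):
    """Union of (tag, value) pairs; duplicate tags are rejected."""
    merged = {}
    for summary in summaries:
        for tag, value in summary.items():
            if tag in merged:
                raise ValueError("duplicate tag: %r" % tag)
            merged[tag] = value
    return merged


merged = merge_summaries([{"loss": 0.25}, {"sparsity": 0.75}])
print(sorted(merged.items()))  # [('loss', 0.25), ('sparsity', 0.75)]
```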
- - -
### `tf.merge_all_summaries(key='summaries')`
Merges all summaries collected in the default graph.
##### Args:
* `key`: `GraphKey` used to collect the summaries. Defaults to
`GraphKeys.SUMMARIES`.
##### Returns:
If no summaries were collected, returns None. Otherwise returns a scalar
`Tensor` of type `string` containing the serialized `Summary` protocol
buffer resulting from the merging.
## Adding Summaries to Event Files
See [Summaries and
TensorBoard](../../how_tos/summaries_and_tensorboard/index.md) for an
overview of summaries, event files, and visualization in TensorBoard.
- - -
### `class tf.train.SummaryWriter`
Writes `Summary` protocol buffers to event files.
The `SummaryWriter` class provides a mechanism to create an event file in a
given directory and add summaries and events to it. The class updates the
file contents asynchronously. This allows a training program to call methods
to add data to the file directly from the training loop, without slowing down
training.
- - -
#### `tf.train.SummaryWriter.__init__(logdir, graph_def=None, max_queue=10, flush_secs=120)`
Creates a `SummaryWriter` and an event file.
On construction the summary writer creates a new event file in `logdir`.
This event file will contain `Event` protocol buffers constructed when you
call one of the following functions: `add_summary()`, `add_event()`, or
`add_graph()`.
If you pass a `graph_def` protocol buffer to the constructor it is added to
the event file. (This is equivalent to calling `add_graph()` later).
TensorBoard will pick the graph from the file and display it graphically so
you can interactively explore the graph you built. You will usually pass
the graph from the session in which you launched it:
```python
...create a graph...
# Launch the graph in a session.
sess = tf.Session()
# Create a summary writer, add the 'graph_def' to the event file.
# `logdir` stands for the directory where the event file should be written.
writer = tf.train.SummaryWriter(logdir, sess.graph_def)
```
The other arguments to the constructor control the asynchronous writes to
the event file:
* `flush_secs`: How often, in seconds, to flush the added summaries
and events to disk.
* `max_queue`: Maximum number of summaries or events pending to be
written to disk before one of the 'add' calls block.
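The role of `max_queue` can be illustrated with a bounded queue drained by a background thread. `MiniEventWriter` is a hypothetical sketch, not the `SummaryWriter` implementation: it appends to an in-memory list instead of a file and omits the periodic flush controlled by `flush_secs`.

```python
import queue
import threading


class MiniEventWriter(object):
    """Hypothetical writer that drains a bounded queue in a thread."""

    def __init__(self, max_queue=10):
        self._queue = queue.Queue(maxsize=max_queue)
        self.written = []  # stands in for the event file on disk
        self._thread = threading.Thread(target=self._drain)
        self._thread.daemon = True
        self._thread.start()

    def _drain(self):
        while True:
            event = self._queue.get()
            self.written.append(event)
            self._queue.task_done()

    def add_event(self, event):
        # Blocks only when `max_queue` events are already pending.
        self._queue.put(event)

    def flush(self):
        # Wait until every pending event has been handed to the "file".
        self._queue.join()


writer = MiniEventWriter(max_queue=10)
for step in range(25):
    writer.add_event(("loss", step))
writer.flush()
print(len(writer.written))  # 25
```

Because the training loop only pays for a queue insertion, it is slowed down only when the writer thread falls more than `max_queue` events behind.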
##### Args:
* `logdir`: A string. Directory where event file will be written.
* `graph_def`: A `GraphDef` protocol buffer.
* `max_queue`: Integer. Size of the queue for pending events and summaries.
* `flush_secs`: Number. How often, in seconds, to flush the
pending events and summaries to disk.
- - -
#### `tf.train.SummaryWriter.add_summary(summary, global_step=None)`
Adds a `Summary` protocol buffer to the event file.
This method wraps the provided summary in an `Event` protocol buffer
and adds it to the event file.
You can pass the output of any summary op, as-is, to this function. You
can also pass a `Summary` protocol buffer that you manufacture with your
own data. This is commonly done to report evaluation results in event
files.
##### Args:
* `summary`: A `Summary` protocol buffer, optionally serialized as a string.
* `global_step`: Number. Optional global step value to record with the
summary.
- - -
#### `tf.train.SummaryWriter.add_event(event)`
Adds an event to the event file.
##### Args:
* `event`: An `Event` protocol buffer.
- - -
#### `tf.train.SummaryWriter.add_graph(graph_def, global_step=None)`
Adds a `GraphDef` protocol buffer to the event file.
The graph described by the protocol buffer will be displayed by
TensorBoard. Most users pass a graph in the constructor instead.
##### Args:
* `graph_def`: A `GraphDef` protocol buffer.
* `global_step`: Number. Optional global step counter to record with the
graph.
- - -
#### `tf.train.SummaryWriter.flush()`
Flushes the event file to disk.
Call this method to make sure that all pending events have been written to
disk.
- - -
#### `tf.train.SummaryWriter.close()`
Flushes the event file to disk and closes the file.
Call this method when you do not need the summary writer anymore.
- - -
### `tf.train.summary_iterator(path)`
An iterator for reading `Event` protocol buffers from an event file.
You can use this function to read events written to an event file. It returns
a Python iterator that yields `Event` protocol buffers.
Example: Print the contents of an events file.
```python
for e in tf.train.summary_iterator(path_to_events_file):
    print(e)
```
Example: Print selected summary values.
```python
# This example supposes that the events file contains summaries with a
# summary value tag 'loss'. These could have been added by calling
# `add_summary()`, passing the output of a scalar summary op created
# with: `tf.scalar_summary(['loss'], loss_tensor)`.
for e in tf.train.summary_iterator(path_to_events_file):
    for v in e.summary.value:
        if v.tag == 'loss':
            print(v.simple_value)
```
See the protocol buffer definitions of
[Event](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/util/event.proto)
and
[Summary](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/framework/summary.proto)
for more information about their attributes.
##### Args:
* `path`: The path to an event file created by a `SummaryWriter`.
##### Yields:
`Event` protocol buffers.
## Training utilities
- - -
### `tf.train.global_step(sess, global_step_tensor)`
Small helper to get the global step.
```python
# Creates a variable to hold the global_step.
global_step_tensor = tf.Variable(10, trainable=False, name='global_step')
# Creates a session.
sess = tf.Session()
# Initializes the variable.
sess.run(global_step_tensor.initializer)
print('global_step: %s' % tf.train.global_step(sess, global_step_tensor))
# global_step: 10
```
##### Args:
* `sess`: A TensorFlow `Session` object.
* `global_step_tensor`: `Tensor` or the `name` of the operation that contains
the global step.
##### Returns:
The global step value.
- - -
### `tf.train.write_graph(graph_def, logdir, name, as_text=True)`
Writes a graph proto on disk.
The graph is written as a binary proto unless as_text is `True`.
```python
v = tf.Variable(0, name='my_variable')
sess = tf.Session()
tf.train.write_graph(sess.graph_def, '/tmp/my-model', 'train.pbtxt')
```
##### Args:
* `graph_def`: A `GraphDef` protocol buffer.
* `logdir`: Directory where to write the graph.
* `name`: Filename for the graph.
* `as_text`: If `True`, writes the graph as an ASCII proto.
================================================
FILE: SOURCE/extras/README.txt
================================================
This directory holds extra files we'd like to be able
to link to and serve from within tensorflow.org
================================================
FILE: SOURCE/faq.md
================================================
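# Frequently Asked Questions
## About This Chapter
This chapter is not a translation of the official documentation. It collects problems that TensorFlow enthusiasts ran into while installing and using TF.
## Questions and Answers
### (Example) Where is the official website?
[http://www.tensorflow.org/](http://www.tensorflow.org/)
### How do I install pip?
* Ubuntu (14.04)
    sudo apt-get update && sudo apt-get install -y python python-dev python-pip
* CentOS 7
    yum update -y && yum install -y python python-devel epel-release.noarch python-pip
* Mac OS X
    sudo easy_install pip
### `docker run -it b.gcr.io/tensorflow/tensorflow` fails
The registry that hosts this image is blocked in mainland China, so you need a proxy or VPN to reach it.
An exported copy of the image is available [here](http://pan.baidu.com/s/1bnyVrMR) (password: v9ts).
To load it:
    docker load < sensorflow.tar.gz
### SSLError: SSL_VERIFY_FAILED
Installing with pip from a URL may fail with the following error:
    SSLError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:590)
Workaround: download the wheel to your machine first, then run `pip install` on the local file.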
================================================
FILE: SOURCE/get_started/basic_usage.md
================================================
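# Basic Usage
To use TensorFlow you need to understand how TensorFlow:
* Represents computations as graphs.
* Executes graphs in the context of `Sessions`.
* Represents data as tensors.
* Maintains state with `Variables`.
* Uses feeds and fetches to get data into and out of arbitrary operations.
## Overview
TensorFlow is a programming system in which you represent computations as graphs. Nodes in
the graph are called *ops* (short for operations). An op takes zero or more `Tensors`,
performs some computation, and produces zero or more `Tensors`. A `Tensor` is a typed
multi-dimensional array. For example, you can represent a mini-batch of images as a 4-D
array of floating point numbers with dimensions `[batch, height, width, channels]`.
A TensorFlow graph is a *description* of computations. To compute anything, a graph must be
launched in a `Session`. A `Session` places the graph ops onto `Devices`, such as CPUs or
GPUs, and provides methods to execute them. These methods return the tensors produced by
ops. In Python the returned tensors are [numpy](http://www.numpy.org) `ndarray` objects;
in C and C++ they are `tensorflow::Tensor` instances.
## The computation graph
TensorFlow programs are usually structured into a construction phase and an execution phase.
The construction phase describes the steps of the computation as a graph. The execution phase
uses a session to execute ops in the graph.
For example, it is common to create a graph that represents and trains a neural network in the
construction phase, and then to execute the training ops in the graph repeatedly in the
execution phase.
TensorFlow can be used from C, C++, and Python programs. At present the Python library is the
easiest to use: it provides a large set of helper functions that simplify building graphs and
that are not yet available in the C and C++ libraries.
The session libraries are equivalent across the three languages.
### Building the graph
To build a graph, start with source ops. A source op needs no input, for example a
`Constant`. The output of a source op is passed to other ops that do computation.
In the Python library, op constructors return objects that stand for the output of the
constructed ops. You can pass these to other op constructors to use as inputs.
The TensorFlow Python library has a *default graph* to which op constructors add nodes. The
default graph is sufficient for many applications. See the
[Graph class](../api_docs/python/framework.md#Graph) documentation for how to manage
multiple graphs explicitly.
```python
import tensorflow as tf
# Create a constant op that produces a 1x2 matrix. The op is
# added as a node to the default graph.
#
# The value returned by the constructor represents the output of the constant op.
matrix1 = tf.constant([[3., 3.]])
# Create another constant op that produces a 2x1 matrix.
matrix2 = tf.constant([[2.],[2.]])
# Create a matmul op that takes 'matrix1' and 'matrix2' as inputs.
# The returned value, 'product', represents the result of the matrix multiplication.
product = tf.matmul(matrix1, matrix2)
```
The default graph now has three nodes: two `constant()` ops and one `matmul()` op. To actually
multiply the matrices and get the result of the multiplication, you must launch the graph in a session.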
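The split between the construction phase and the execution phase can be illustrated with a toy deferred-evaluation graph in plain Python. The `Node`, `constant`, `matmul`, and `run` names below are hypothetical and only mimic the idea; they are not how TensorFlow is implemented.

```python
class Node(object):
    """A hypothetical graph node: an operation plus its input nodes."""

    def __init__(self, fn, inputs=()):
        self.fn = fn
        self.inputs = list(inputs)


def constant(value):
    # A source op: no inputs, always produces `value`.
    return Node(lambda: value)


def matmul(a, b):
    def multiply(x, y):
        return [[sum(p * q for p, q in zip(row, col)) for col in zip(*y)]
                for row in x]
    return Node(multiply, [a, b])


def run(node):
    # Execution phase: evaluate the inputs first, then the node itself.
    return node.fn(*[run(i) for i in node.inputs])


# Construction phase: nothing is computed here.
matrix1 = constant([[3., 3.]])
matrix2 = constant([[2.], [2.]])
product = matmul(matrix1, matrix2)

# Execution phase.
print(run(product))  # [[12.0]]
```

Building `product` only records which nodes feed into it; the multiplication happens when `run(product)` walks the three nodes.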
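### Launching the graph in a session
Launching follows construction. To launch a graph, first create a `Session` object. Without
arguments, the session constructor launches the default graph.
See the [Session class](../api_docs/python/client.md#session-management) for the complete session API.
```python
# Launch the default graph.
sess = tf.Session()
# Call the 'run()' method of sess to execute the matmul op, passing 'product' as its argument.
# As noted above, 'product' represents the output of the matmul op. Passing it tells the
# method that we want to get the output of the matmul op back.
#
# The whole execution is automatic: the session supplies every input the op needs.
# Ops are typically run in parallel.
#
# The call 'run(product)' causes the execution of three ops in the graph: the two
# constant ops and the matmul op.
#
# The returned value 'result' is a numpy `ndarray` object.
result = sess.run(product)
print(result)
# ==> [[ 12.]]
# Close the session when we are done.
sess.close()
```
A `Session` should be closed after use to release resources. Besides calling close explicitly,
you can use a "with" block, which closes the session automatically.
```python
with tf.Session() as sess:
    result = sess.run([product])
    print(result)
```
The TensorFlow implementation translates the graph definition into operations distributed
across the available compute resources, such as the CPU or a GPU. In general you do not have to
specify CPUs or GPUs explicitly; TensorFlow detects them. If a GPU is found, TensorFlow uses the
first GPU for as many operations as possible.
If the machine has more than one GPU, the GPUs other than the first do not take part in the
computation by default. To use them, you must assign ops to them explicitly. Use
`with...Device` statements to specify which CPU or GPU runs the operations:
```python
with tf.Session() as sess:
    with tf.device("/gpu:1"):
        matrix1 = tf.constant([[3., 3.]])
        matrix2 = tf.constant([[2.],[2.]])
        product = tf.matmul(matrix1, matrix2)
        ...
```
Devices are identified by strings. The currently supported devices are:
* `"/cpu:0"`: The CPU of your machine.
* `"/gpu:0"`: The first GPU of your machine, if you have one.
* `"/gpu:1"`: The second GPU of your machine, and so on.
See the [Using GPUs](tensorflow-zh/SOURCE/how_tos/using_gpu.md) chapter for more information about TensorFlow and GPUs.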
## Interactive Usage
The Python examples in the documentation launch the graph with a
[`Session`](../api_docs/python/client.md#Session) and use the
[`Session.run()`](../api_docs/python/client.md#Session.run) method to execute operations.
For ease of use in interactive Python environments such as [IPython](http://ipython.org), you can
instead use the [`InteractiveSession`](../api_docs/python/client.md#InteractiveSession) class
in place of `Session`, and the [`Tensor.eval()`](../api_docs/python/framework.md#Tensor.eval)
and [`Operation.run()`](../api_docs/python/framework.md#Operation.run) methods in place of
`Session.run()`. This avoids having to keep a variable holding the session.
```python
# Enter an interactive TensorFlow session.
import tensorflow as tf
sess = tf.InteractiveSession()
x = tf.Variable([1.0, 2.0])
a = tf.constant([3.0, 3.0])
# Initialize 'x' using the run() method of its initializer op.
x.initializer.run()
# Add a sub op that subtracts 'a' from 'x'. Run it and print the result.
sub = tf.sub(x, a)
print(sub.eval())
# ==> [-2. -1.]
```
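## Tensor
TensorFlow programs use a tensor data structure to represent all data. Only tensors are passed
between operations in the computation graph. You can think of a TensorFlow tensor as an
n-dimensional array or list. A tensor has a static type, a rank, and a shape. To learn how
TensorFlow handles these concepts, see
[Rank, Shape, and Type](../resources/dims_types.md).
## Variables
Variables maintain state across executions of the graph. The following example shows a variable
serving as a simple counter. See the
[Variables](tensorflow-zh/how_tos/variables.md) chapter for more details.
```python
# Create a variable, initialized to the scalar value 0.
state = tf.Variable(0, name="counter")
# Create an op that adds one to `state`.
one = tf.constant(1)
new_value = tf.add(state, one)
update = tf.assign(state, new_value)
# Variables must be initialized by running an `init` op after the graph is launched.
# We first have to add the `init` op to the graph.
init_op = tf.initialize_all_variables()
# Launch the graph and run the ops.
with tf.Session() as sess:
    # Run the 'init' op.
    sess.run(init_op)
    # Print the initial value of 'state'.
    print(sess.run(state))
    # Run the op that updates 'state' and print 'state'.
    for _ in range(3):
        sess.run(update)
        print(sess.run(state))
# output:
# 0
# 1
# 2
# 3
```
The `assign()` operation in this code is part of the expression described by the graph, just
like the `add()` operation, so it does not actually perform the assignment until `run()`
executes the expression.
You typically represent the parameters of a statistical model as a set of variables. For example, you can store the weights of a neural network as a tensor in a variable.
During training you update this tensor by running a training graph repeatedly.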
## Fetch
To fetch the outputs of operations, execute the graph with a `run()` call on the `Session` object
and pass in the tensors to retrieve. In the previous example we fetched the single node
`state`, but you can also fetch multiple tensors:
```python
input1 = tf.constant(3.0)
input2 = tf.constant(2.0)
input3 = tf.constant(5.0)
intermed = tf.add(input2, input3)
mul = tf.mul(input1, intermed)
with tf.Session() as sess:
    result = sess.run([mul, intermed])
    print(result)
# output:
# [array([ 21.], dtype=float32), array([ 7.], dtype=float32)]
```
All the ops needed to produce the values of the requested tensors are run once, together, rather than once per requested tensor.
## Feed
The examples above introduce tensors into the computation graph by storing them in constants
and variables. TensorFlow also provides a feed mechanism that can temporarily replace the tensor
produced by any operation in the graph, in effect patching a tensor directly into the graph.
A feed temporarily replaces the output of an operation with a tensor value. You supply feed data as an argument to a `run()` call.
The feed is only used for the run call to which it is passed; when the call ends, the feed is gone. The most common use case is to designate specific operations as "feed" operations
by using tf.placeholder() to create them.
```python
input1 = tf.placeholder(tf.float32)
input2 = tf.placeholder(tf.float32)
output = tf.mul(input1, input2)
with tf.Session() as sess:
    print(sess.run([output], feed_dict={input1:[7.], input2:[2.]}))
# output:
# [array([ 14.], dtype=float32)]
```
A `placeholder()` operation generates an error if you do not supply a feed for it.
The MNIST fully-connected [feed tutorial](http://wiki.jikexueyuan.com/project/tensorflow-zh/tutorials/mnist_tf.html)
([source code](https://raw.githubusercontent.com/tensorflow/tensorflow/master/tensorflow/examples/tutorials/mnist/fully_connected_feed.py))
gives a larger-scale example of feeds.
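The feed idea can be sketched without TensorFlow as a lookup that overrides a node's value for one call. The `Placeholder`, `mul`, and `run` names below are hypothetical stand-ins written for illustration; `ValueError` models the error raised when a placeholder is not fed.

```python
class Placeholder(object):
    """Hypothetical stand-in for an op whose value must be fed at run time."""


def mul(a, b):
    # Build a node that multiplies its two inputs element-wise.
    return ("mul", a, b)


def run(node, feed_dict=None):
    feed_dict = feed_dict or {}
    if node in feed_dict:
        # A fed value temporarily replaces the output of this node.
        return feed_dict[node]
    if isinstance(node, Placeholder):
        raise ValueError("placeholder was not fed")
    _, a, b = node
    x, y = run(a, feed_dict), run(b, feed_dict)
    return [p * q for p, q in zip(x, y)]


input1 = Placeholder()
input2 = Placeholder()
output = mul(input1, input2)
print(run(output, feed_dict={input1: [7.], input2: [2.]}))  # [14.0]
```

The feed exists only for the duration of one `run` call: a second call without `feed_dict` fails because the placeholders have no value of their own.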
> Original: [Basic Usage](http://tensorflow.org/get_started/basic_usage.md) Translation: [@doc001](https://github.com/PFZheng) Proofreading: [@yangtze](https://github.com/sstruct)
================================================
FILE: SOURCE/get_started/introduction.md
================================================
# Introduction
The goal of this chapter is to get you up and running with TensorFlow!
Before we start, let's look at a piece of TensorFlow code written with the Python API,
so you get a first impression of what you are about to learn.
This short Python program generates some data in three dimensions and then fits a plane to it.
```python
import tensorflow as tf
import numpy as np
# Use NumPy to generate phony data, 100 points in total.
x_data = np.float32(np.random.rand(2, 100))  # Random input
y_data = np.dot([0.100, 0.200], x_data) + 0.300
# Construct a linear model.
#
b = tf.Variable(tf.zeros([1]))
W = tf.Variable(tf.random_uniform([1, 2], -1.0, 1.0))
y = tf.matmul(W, x_data) + b
# Minimize the mean squared error.
loss = tf.reduce_mean(tf.square(y - y_data))
optimizer = tf.train.GradientDescentOptimizer(0.5)
train = optimizer.minimize(loss)
# Initialize the variables.
init = tf.initialize_all_variables()
# Launch the graph.
sess = tf.Session()
sess.run(init)
# Fit the plane.
for step in range(0, 201):
    sess.run(train)
    if step % 20 == 0:
        print(step, sess.run(W), sess.run(b))
# Learns the best fit W: [[0.100 0.200]], b: [0.300]
```
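To show what the optimizer is doing, here is a sketch of the same plane fit using only the Python standard library. It is an illustration of plain full-batch gradient descent on the mean squared error under the same data model and learning rate, not a description of TensorFlow's internals; all variable names are chosen for this example.

```python
import random

random.seed(0)

# Phony data: 100 points on the plane y = 0.1 * x1 + 0.2 * x2 + 0.3.
xs = [(random.random(), random.random()) for _ in range(100)]
ys = [0.100 * x1 + 0.200 * x2 + 0.300 for x1, x2 in xs]

w = [random.uniform(-1.0, 1.0), random.uniform(-1.0, 1.0)]
b = 0.0
learning_rate = 0.5
n = float(len(xs))

for step in range(201):
    # Prediction errors for the current parameters.
    errors = [w[0] * x1 + w[1] * x2 + b - y for (x1, x2), y in zip(xs, ys)]
    # Gradients of the mean squared error with respect to w and b.
    grad_w0 = 2.0 * sum(e * x[0] for e, x in zip(errors, xs)) / n
    grad_w1 = 2.0 * sum(e * x[1] for e, x in zip(errors, xs)) / n
    grad_b = 2.0 * sum(errors) / n
    w[0] -= learning_rate * grad_w0
    w[1] -= learning_rate * grad_w1
    b -= learning_rate * grad_b
    if step % 20 == 0:
        print(step, w, b)
```

The printed values should approach `w = [0.1, 0.2]` and `b = 0.3`, the coefficients used to generate the data.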
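To whet your appetite further, we suggest you look at how TensorFlow solves a classic machine
learning problem. In the field of neural networks, the most classic problem is MNIST handwritten
digit classification. We offer two tutorials, one for newcomers to machine learning and one for
experts. If you have already trained many MNIST models in other software packages, read the
advanced tutorial (the red pill link). If you have never heard of MNIST, read the beginner tutorial
(the blue pill link). If you are somewhere in between, we suggest skimming the beginner tutorial first and then reading the advanced one.
> Original: [Introduction](http://tensorflow.org/get_started) Translation: [@doc001](https://github.com/PFZheng) Proofreading: [@yangtze](https://github.com/sstruct)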
================================================
FILE: SOURCE/get_started/os_setup.md
================================================
# Download and Setup
You can install TensorFlow with Pip, Docker, Virtualenv, or Anaconda, or build it from source.
## Pip Installation
[Pip](https://en.wikipedia.org/wiki/Pip_(package_manager)) is a package installation and management tool for Python.
The packages that are installed or upgraded during the TensorFlow installation are [listed here](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/tools/pip_package/setup.py).
First install pip (or pip3 for Python 3):
```bash
# Ubuntu/Linux 64-bit
$ sudo apt-get install python-pip python-dev
# Mac OS X
$ sudo easy_install pip
```
Install TensorFlow:
```bash
# Ubuntu/Linux 64-bit, CPU only, Python 2.7:
$ sudo pip install --upgrade https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-0.8.0-cp27-none-linux_x86_64.whl
# Ubuntu/Linux 64-bit, GPU enabled, Python 2.7. Requires CUDA toolkit 7.5 and CuDNN v4.
# For other versions, see "Install from sources" below.
$ sudo pip install --upgrade https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow-0.8.0-cp27-none-linux_x86_64.whl
# Mac OS X, CPU only:
$ sudo easy_install --upgrade six
$ sudo pip install --upgrade https://storage.googleapis.com/tensorflow/mac/tensorflow-0.8.0-py2-none-any.whl
```
For Python 3:
```bash
# Ubuntu/Linux 64-bit, CPU only, Python 3.4:
$ sudo pip3 install --upgrade https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-0.8.0-cp34-cp34m-linux_x86_64.whl
# Ubuntu/Linux 64-bit, GPU enabled, Python 3.4. Requires CUDA toolkit 7.5 and CuDNN v4.
# For other versions, see "Install from sources" below.
$ sudo pip3 install --upgrade https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow-0.8.0-cp34-cp34m-linux_x86_64.whl
# Mac OS X, CPU only:
$ sudo easy_install --upgrade six
$ sudo pip3 install --upgrade https://storage.googleapis.com/tensorflow/mac/tensorflow-0.8.0-py3-none-any.whl
```
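Note: if you previously installed a TensorFlow version older than 0.7.1, first uninstall TensorFlow and protobuf with `pip uninstall`, so that you get a clean installation built against the updated protobuf dependency.
You can then [test](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/g3doc/get_started/os_setup.md#test-the-tensorflow-installation) your installation.
## Docker Installation
We also support running TensorFlow via [Docker](http://docker.com/).
The advantage of this approach is that you do not need to worry about software dependencies.
First, [install Docker](http://docs.docker.com/engine/installation/). Once Docker
is up and running, you can start a container with this command: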
```bash
$ docker run -it b.gcr.io/tensorflow/tensorflow
```
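This command starts a container with TensorFlow and all its dependencies already installed.
### Additional images
The default Docker image contains only the minimal set of libraries needed to launch and run TensorFlow. We also
provide the following container, which you can install with the `docker run` command shown above:
* `b.gcr.io/tensorflow/tensorflow-full`: Contains a complete install of TensorFlow from source,
including all the tools needed to build and run TensorFlow. With this image you can experiment directly with the source code,
without installing any of the dependencies described above.
## VirtualEnv Installation
We recommend using [virtualenv](https://pypi.python.org/pypi/virtualenv) to create an isolated environment in which to install TensorFlow. This is optional, but it makes installation problems easier to diagnose.
First, install all the required tools:
```bash
# On Linux:
$ sudo apt-get install python-pip python-dev python-virtualenv
# On Mac:
$ sudo easy_install pip  # If pip is not already installed
$ sudo pip install --upgrade virtualenv
```
Next, set up a new virtualenv environment. To create it in the `~/tensorflow`
directory, run:
```bash
$ virtualenv --system-site-packages ~/tensorflow
$ cd ~/tensorflow
```
Then activate the virtualenv:
```bash
$ source bin/activate  # If using bash
$ source bin/activate.csh  # If using csh
(tensorflow)$  # Your prompt should change
```
Inside the virtualenv, install TensorFlow:
```bash
(tensorflow)$ pip install --upgrade <$url_to_binary.whl>
```
You can then run a TensorFlow program with commands like these:
```bash
(tensorflow)$ cd tensorflow/models/image/mnist
(tensorflow)$ python convolutional.py
# When you are done using TensorFlow:
(tensorflow)$ deactivate  # Deactivate the virtualenv
$  # Your prompt should change back
```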
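## Anaconda Installation
[Anaconda](https://www.continuum.io/why-anaconda) is a Python distribution for scientific computing that bundles many third-party libraries. Anaconda uses conda as its package manager and has its own [environment system](http://conda.pydata.org/docs/using/envs.html), similar to Virtualenv.
As with Virtualenv, conda keeps the dependencies needed by different Python projects in separate places, so a TensorFlow installation inside Anaconda does not override previously installed Python packages.
* Install Anaconda.
* Create a conda environment.
* Activate the environment and install TensorFlow in it.
* After installation, activate the conda environment each time you want to use TensorFlow.
Install Anaconda:
Follow the [instructions](https://www.continuum.io/downloads) on the Anaconda download page.
Create a conda environment named `tensorflow`: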
```bash
# Python 2.7
$ conda create -n tensorflow python=2.7
# Python 3.4
$ conda create -n tensorflow python=3.4
```
Activate the `tensorflow` environment, then use the pip inside it to install TensorFlow. Pass the `--ignore-installed` flag to prevent errors related to `easy_install`.
```bash
$ source activate tensorflow
(tensorflow)$ # Your prompt should change
# Ubuntu/Linux 64-bit, CPU only, Python 2.7:
(tensorflow)$ pip install --ignore-installed --upgrade https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-0.8.0rc0-cp27-none-linux_x86_64.whl
# Ubuntu/Linux 64-bit, GPU enabled, Python 2.7. Requires CUDA toolkit 7.5 and CuDNN v4.
# For other versions, see "Install from sources" below.
(tensorflow)$ pip install --ignore-installed --upgrade https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow-0.8.0rc0-cp27-none-linux_x86_64.whl
# Mac OS X, CPU only:
(tensorflow)$ pip install --ignore-installed --upgrade https://storage.googleapis.com/tensorflow/mac/tensorflow-0.8.0rc0-py2-none-any.whl
```
For Python 3.x:
```bash
$ source activate tensorflow
(tensorflow)$ # Your prompt should change
# Ubuntu/Linux 64-bit, CPU only, Python 3.4:
(tensorflow)$ pip install --ignore-installed --upgrade https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-0.8.0rc0-cp34-cp34m-linux_x86_64.whl
# Ubuntu/Linux 64-bit, GPU enabled, Python 3.4. Requires CUDA toolkit 7.5 and CuDNN v4.
# For other versions, see "Install from sources" below.
(tensorflow)$ pip install --ignore-installed --upgrade https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow-0.8.0rc0-cp34-cp34m-linux_x86_64.whl
# Mac OS X, CPU only:
(tensorflow)$ pip install --ignore-installed --upgrade https://storage.googleapis.com/tensorflow/mac/tensorflow-0.8.0rc0-py3-none-any.whl
```
With the conda environment activated, you can [test](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/g3doc/get_started/os_setup.md#test-the-tensorflow-installation) your installation.
When you are done using TensorFlow, deactivate the environment:
```bash
(tensorflow)$ source deactivate
$ # Your prompt should change back
```
Activate it again the next time you want to use TensorFlow :-)
```bash
$ source activate tensorflow
(tensorflow)$ # Your prompt should change.
# Run Python programs that use TensorFlow.
...
# When you are done using TensorFlow, deactivate the environment.
(tensorflow)$ source deactivate
```
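## Try your first TensorFlow program
### (Optional) Enable GPU Support
If you installed the GPU-enabled TensorFlow pip binary, you must make sure that
the correct versions of the CUDA SDK and CUDNN are installed on your system. See the [CUDA installation instructions](#install_cuda).
You also need to set the `LD_LIBRARY_PATH` and `CUDA_HOME` environment variables. Consider adding the commands
below to your `~/.bash_profile` so that they take effect on every login. Note that these commands
assume CUDA is installed in `/usr/local/cuda`: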
```bash
export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/usr/local/cuda/lib64"
export CUDA_HOME=/usr/local/cuda
```
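As a quick sanity check, the two variables can be inspected from Python before importing TensorFlow. The following is only a minimal sketch using the standard library: the helper name `check_cuda_env` is hypothetical, and it assumes the same `/usr/local/cuda` install path as the commands above.

```python
import os


def check_cuda_env(env, cuda_home="/usr/local/cuda"):
    """Return a list of problems found in the given environment mapping.

    `check_cuda_env` is a hypothetical helper, not part of TensorFlow.
    """
    problems = []
    if env.get("CUDA_HOME") != cuda_home:
        problems.append("CUDA_HOME is not set to " + cuda_home)
    # LD_LIBRARY_PATH is a colon-separated list of directories.
    lib_dirs = env.get("LD_LIBRARY_PATH", "").split(":")
    if cuda_home + "/lib64" not in lib_dirs:
        problems.append("LD_LIBRARY_PATH does not contain " + cuda_home + "/lib64")
    return problems


# Report anything that looks wrong in the current shell environment.
for problem in check_cuda_env(os.environ):
    print(problem)
```

An empty result only means the variables are set as described; it does not confirm that the CUDA libraries themselves are installed.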
### Run TensorFlow
Open a python terminal:
```bash
$ python
>>> import tensorflow as tf
>>> hello = tf.constant('Hello, TensorFlow!')
>>> sess = tf.Session()
>>> print sess.run(hello)
Hello, TensorFlow!
>>> a = tf.constant(10)
>>> b = tf.constant(32)
>>> print sess.run(a+b)
42
>>>
```
## Installing from sources
### Clone the TensorFlow repository
```bash
$ git clone --recurse-submodules https://github.com/tensorflow/tensorflow
```
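The `--recurse-submodules` flag is required; it fetches the protobuf library that TensorFlow depends on.
### Installation for Linux
#### Install Bazel
First, follow the [instructions](http://bazel.io/docs/install.html) to install Bazel's dependencies.
Then download the latest stable release for your operating system from the [Bazel releases page](https://github.com/bazelbuild/bazel/releases),
and finally run the following commands: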
```bash
$ chmod +x PATH_TO_INSTALL.SH
$ ./PATH_TO_INSTALL.SH --user
```
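Remember to replace `PATH_TO_INSTALL.SH` with the path of the installer you downloaded.
Add the executable path `output/bazel` to your `$PATH` environment variable.
#### Install other dependencies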
```bash
# For Python 2.7:
$ sudo apt-get install python-numpy swig python-dev python-wheel
# For Python 3.x:
$ sudo apt-get install python3-numpy swig python3-dev python3-wheel
```
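#### Optional: Install CUDA (GPU support on Linux)
To build and run TensorFlow with GPU support, you first need to install NVIDIA's Cuda Toolkit 7.0
and CUDNN 6.5 V2.
TensorFlow's GPU features only support NVIDIA cards with Compute Capability >= 3.5. Supported cards
include but are not limited to: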
* NVidia Titan
* NVidia Titan X
* NVidia K20
* NVidia K40
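##### Download and install Cuda Toolkit 7.0
[Download page](https://developer.nvidia.com/cuda-toolkit-70)
Install the toolkit into a location such as `/usr/local/cuda`.
##### Download and install CUDNN Toolkit 6.5
[Download page](https://developer.nvidia.com/rdp/cudnn-archive)
Uncompress the archive and copy the CUDNN files into the Cuda Toolkit 7.0 directory. Assuming the toolkit is installed
in `/usr/local/cuda`, run the following commands: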
``` bash
tar xvzf cudnn-6.5-linux-x64-v2.tgz
sudo cp cudnn-6.5-linux-x64-v2/cudnn.h /usr/local/cuda/include
sudo cp cudnn-6.5-linux-x64-v2/libcudnn* /usr/local/cuda/lib64
```
##### Configure TensorFlow's Cuda options
From the root of your source tree, run:
``` bash
$ ./configure
Do you wish to bulid TensorFlow with GPU support? [y/n] y
GPU support will be enabled for TensorFlow
Please specify the location where CUDA 7.0 toolkit is installed. Refer to
README.md for more details. [default is: /usr/local/cuda]: /usr/local/cuda
Please specify the location where CUDNN 6.5 V2 library is installed. Refer to
README.md for more details. [default is: /usr/local/cuda]: /usr/local/cuda
Setting up Cuda include
Setting up Cuda lib64
Setting up Cuda bin
Setting up Cuda nvvm
Configuration finished
```
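This creates symbolic links to the Cuda libraries on your system. Whenever the Cuda library paths change, you must
run this step again before invoking the bazel build command.
##### Build your target with GPU support
From the root of your source tree, run: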
```bash
$ bazel build -c opt --config=cuda //tensorflow/cc:tutorials_example_trainer
$ bazel-bin/tensorflow/cc/tutorials_example_trainer --use_gpu
# Lots of output. This example iteratively calculates the major eigenvalue of a 2x2 matrix on the GPU.
# The last few lines of output look similar to this.
000009/000005 lambda = 2.000000 x = [0.894427 -0.447214] y = [1.788854 -0.894427]
000006/000001 lambda = 2.000000 x = [0.894427 -0.447214] y = [1.788854 -0.894427]
000009/000009 lambda = 2.000000 x = [0.894427 -0.447214] y = [1.788854 -0.894427]
```
Note that GPU support is enabled with the build option "--config=cuda".
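To make the sample output easier to read, here is a small CPU-only sketch of power iteration in plain Python. It is not the code of the example trainer: the matrix `A` is an assumption, chosen only because its dominant eigenpair is consistent with the printed values (`lambda = 2`, `x = [0.894427 -0.447214]`, `y = A x`), and the function names are hypothetical.

```python
import math

# Assumed matrix: chosen so that its dominant eigenpair matches the sample
# output above. The actual matrix used by the trainer is not shown here.
A = [[3.0, 2.0],
     [-1.0, 0.0]]


def matvec(m, v):
    """Multiply a 2x2 matrix by a 2-vector."""
    return [m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1]]


def power_iteration(m, x, steps=60):
    """Repeatedly apply m and renormalize; returns (eigenvalue estimate, x)."""
    lam = 0.0
    for _ in range(steps):
        y = matvec(m, x)
        # Rayleigh quotient of the current vector estimates the eigenvalue.
        lam = (x[0] * y[0] + x[1] * y[1]) / (x[0] * x[0] + x[1] * x[1])
        norm = math.sqrt(y[0] ** 2 + y[1] ** 2)
        x = [y[0] / norm, y[1] / norm]
    return lam, x


lam, x = power_iteration(A, [1.0, 0.0])
y = matvec(A, x)
print("lambda = %f x = [%f %f] y = [%f %f]" % (lam, x[0], x[1], y[0], y[1]))
```

Each step multiplies by the matrix and renormalizes, so the component along the dominant eigenvector grows relative to the rest until `y` is simply `lambda` times `x`.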
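##### Known issues
* Although it is possible to build both the Cuda and non-Cuda configurations in the same source tree, we recommend
running "bazel clean" when switching between the two.
* You must run configure before running bazel build; otherwise the build fails with an error message. In the future,
we may fold the configure step into the build process to simplify things, provided bazel gains the features needed to support that.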
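### Installation for Mac OS X
Mac needs exactly the same software dependencies as Linux, but the installation process differs considerably. The following links
help you install those dependencies on Mac OS X:
#### Bazel
See the Mac OS X installation instructions on [this page](http://bazel.io/docs/install.html).
#### SWIG
[Mac OS X installation instructions](http://www.swig.org/Doc3.0/Preface.html#Preface_osx_installation).
Note: you need to install [PCRE](ftp://ftp.csx.cam.ac.uk/pub/software/programming/pcre/),
and *not* PCRE2.
#### Numpy
See the [installation instructions](http://docs.scipy.org/doc/numpy/user/install.html).
### Create the pip package and install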
```bash
$ bazel build -c opt //tensorflow/tools/pip_package:build_pip_package
$ bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg
# The name of the .whl file will depend on your platform.
$ pip install /tmp/tensorflow_pkg/tensorflow-0.5.0-cp27-none-linux_x86_64.whl
```
## Train your first TensorFlow neural net model
From the root of your source tree, run:
```bash
$ cd tensorflow/models/image/mnist
$ python convolutional.py
Succesfully downloaded train-images-idx3-ubyte.gz 9912422 bytes.
Succesfully downloaded train-labels-idx1-ubyte.gz 28881 bytes.
Succesfully downloaded t10k-images-idx3-ubyte.gz 1648877 bytes.
Succesfully downloaded t10k-labels-idx1-ubyte.gz 4542 bytes.
Extracting data/train-images-idx3-ubyte.gz
Extracting data/train-labels-idx1-ubyte.gz
Extracting data/t10k-images-idx3-ubyte.gz
Extracting data/t10k-labels-idx1-ubyte.gz
Initialized!
Epoch 0.00
Minibatch loss: 12.054, learning rate: 0.010000
Minibatch error: 90.6%
Validation error: 84.6%
Epoch 0.12
Minibatch loss: 3.285, learning rate: 0.010000
Minibatch error: 6.2%
Validation error: 7.0%
...
...
```
## Common Problems
### GPU-related issues
If you encounter the following error when trying to run a TensorFlow program:
```python
ImportError: libcudart.so.7.0: cannot open shared object file: No such file or directory
```
Make sure you have installed GPU support correctly; see the [Cuda installation section](#install_cuda).
### On Linux
If you encounter:
```python
...
"__add__", "__radd__",
^
SyntaxError: invalid syntax
```
Solution: make sure you are using Python 2.7.
### On Mac OS X
If you encounter:
```python
import six.moves.copyreg as copyreg
ImportError: No module named copyreg
```
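Solution: the protobuf version used by TensorFlow depends on `six-1.10.0`. However, Apple's default python environment
ships with `six-1.4.1`, which may be difficult to upgrade. There are several ways to fix this:
1. Upgrade the system-wide copy of `six`: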
```bash
sudo easy_install -U six
```
2. Install a separate copy of python via homebrew:
```bash
brew install python
```
3. Build or use TensorFlow within a [`virtualenv`](#virtualenv_install).
If you encounter:
```
>>> import tensorflow as tf
Traceback (most recent call last):
File "", line 1, in
File "/usr/local/lib/python2.7/site-packages/tensorflow/__init__.py", line 4, in
from tensorflow.python import *
File "/usr/local/lib/python2.7/site-packages/tensorflow/python/__init__.py", line 13, in
from tensorflow.core.framework.graph_pb2 import *
...
File "/usr/local/lib/python2.7/site-packages/tensorflow/core/framework/tensor_shape_pb2.py", line 22, in
serialized_pb=_b('\n,tensorflow/core/framework/tensor_shape.proto\x12\ntensorflow\"d\n\x10TensorShapeProto\x12-\n\x03\x64im\x18\x02 \x03(\x0b\x32 .tensorflow.TensorShapeProto.Dim\x1a!\n\x03\x44im\x12\x0c\n\x04size\x18\x01 \x01(\x03\x12\x0c\n\x04name\x18\x02 \x01(\tb\x06proto3')
TypeError: __init__() got an unexpected keyword argument 'syntax'
```
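This is caused by a conflicting protobuf installation; TensorFlow requires protobuf 3.0.0. Currently
the best solution is to make sure no older version of protobuf is installed. You can reinstall protobuf with the following command to resolve
the conflict: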
```bash
brew reinstall --devel protobuf
```
> Original: [Download and Setup](http://tensorflow.org/get_started/os_setup.md) Translated by: [@doc001](https://github.com/PFZheng) Proofread by: [@yangtze](https://github.com/sstruct)
================================================
FILE: SOURCE/how_tos/__init__.py
================================================
================================================
FILE: SOURCE/how_tos/adding_an_op/__init__.py
================================================
================================================
FILE: SOURCE/how_tos/adding_an_op/attr_examples.cc
================================================
#include <stdio.h>
#include "tensorflow/core/framework/op.h"
REGISTER_OP("RestrictedTypeExample").Attr("t: {int32, float, bool}");
REGISTER_OP("NumberType").Attr("t: numbertype");
REGISTER_OP("EnumExample").Attr("e: {'apple', 'orange'}");
REGISTER_OP("MinIntExample").Attr("a: int >= 2");
REGISTER_OP("TypeListExample").Attr("a: list({int32, float}) >= 3");
REGISTER_OP("AttrDefaultExample").Attr("i: int = 0");
REGISTER_OP("AttrDefaultExampleForAllTypes")
.Attr("s: string = 'foo'")
.Attr("i: int = 0")
.Attr("f: float = 1.0")
.Attr("b: bool = true")
.Attr("ty: type = DT_INT32")
.Attr("sh: shape = { dim { size: 1 } dim { size: 2 } }")
.Attr("te: tensor = { dtype: DT_INT32 int_val: 5 }")
.Attr("l_empty: list(int) = []")
.Attr("l_int: list(int) = [2, 3, 5, 7]");
int main(int argc, char* argv[]) {
printf("All registered ops:\n%s\n",
tensorflow::OpRegistry::Global()->DebugString(false).c_str());
return 0;
}
================================================
FILE: SOURCE/how_tos/adding_an_op/fact_test.py
================================================
"""Test that user ops can be used as expected."""
from __future__ import print_function
import tensorflow.python.platform
import tensorflow as tf
class FactTest(tf.test.TestCase):
  def test(self):
    with self.test_session():
      print(tf.user_ops.my_fact().eval())
if __name__ == '__main__':
  tf.test.main()
================================================
FILE: SOURCE/how_tos/adding_an_op/index.md
================================================
# Adding a New Op
PREREQUISITES:
* Some familiarity with C++.
* Must have [downloaded TensorFlow source](../../get_started/introduction.md),
and be able to build it.
If you'd like to incorporate an operation that isn't covered by the existing
library, you can create a custom Op. To incorporate your custom Op, you'll need
to:
* Register the new Op in a C++ file. The Op registration is independent of the
implementation, and describes the semantics of how the Op is invoked. For
example, it defines the Op name, and specifies its inputs and outputs.
* Implement the Op in C++. This implementation is called a "kernel", and there
can be multiple kernels for different architectures (e.g. CPUs, GPUs) or
input / output types.
* Create a Python wrapper. This wrapper is the public API to create the Op. A
default wrapper is generated from the Op registration, which can be used
directly or added to.
* Optionally, write a function to compute gradients for the Op.
* Optionally, write a function that describes the input and output shapes
for the Op. This allows shape inference to work with your Op.
* Test the Op, typically in Python.
## Contents
### [Adding a New Op](#AUTOGENERATED-adding-a-new-op)
* [Define the Op's interface](#define_interface)
* [Implement the kernel for the Op](#AUTOGENERATED-implement-the-kernel-for-the-op)
* [Generate the client wrapper](#AUTOGENERATED-generate-the-client-wrapper)
* [The Python Op wrapper](#AUTOGENERATED-the-python-op-wrapper)
* [The C++ Op wrapper](#AUTOGENERATED-the-c---op-wrapper)
* [Verify it works](#AUTOGENERATED-verify-it-works)
* [Validation](#Validation)
* [Op registration](#AUTOGENERATED-op-registration)
* [Attrs](#Attrs)
* [Attr types](#AUTOGENERATED-attr-types)
* [Polymorphism](#Polymorphism)
* [Inputs and Outputs](#AUTOGENERATED-inputs-and-outputs)
* [Backwards compatibility](#AUTOGENERATED-backwards-compatibility)
* [GPU Support](#mult-archs)
* [Implement the gradient in Python](#AUTOGENERATED-implement-the-gradient-in-python)
* [Implement a shape function in Python](#AUTOGENERATED-implement-a-shape-function-in-python)
## Define the Op's interface
You define the interface of an Op by registering it with the TensorFlow system.
In the registration, you specify the name of your Op, its inputs (types and
names) and outputs (types and names), as well as docstrings and
any [attrs](#Attrs) the Op might require.
To see how this works, suppose you'd like to create an Op that takes a tensor of
`int32`s and outputs a copy of the tensor, with all but the first element set to
zero. Create file [`tensorflow/core/user_ops`][user_ops]`/zero_out.cc` and
add a call to the `REGISTER_OP` macro that defines the interface for such an Op:
```c++
#include "tensorflow/core/framework/op.h"
REGISTER_OP("ZeroOut")
.Input("to_zero: int32")
.Output("zeroed: int32");
```
This `ZeroOut` Op takes one tensor `to_zero` of 32-bit integers as input, and
outputs a tensor `zeroed` of 32-bit integers.
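Before writing any C++, it can help to pin down the intended behaviour in a few lines of plain Python. This is only a reference model of the semantics described above, operating on a flat list rather than a tensor; `zero_out_reference` is a hypothetical name and not part of TensorFlow.

```python
def zero_out_reference(values):
    """Pure-Python model of ZeroOut: keep the first element, zero the rest."""
    result = [0] * len(values)
    if values:
        result[0] = values[0]  # an empty input simply yields an empty output
    return result


print(zero_out_reference([5, 4, 3, 2, 1]))
```

A model like this is convenient later as the expected value in a Python test for the real Op.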
> A note on naming: The name of the Op should be unique and CamelCase. Names
> starting with an underscore (`_`) are reserved for internal use.
## Implement the kernel for the Op
After you define the interface, provide one or more implementations of the Op.
To create one of these kernels, create a class that extends `OpKernel` and
overrides the `Compute` method. The `Compute` method provides one `context`
argument of type `OpKernelContext*`, from which you can access useful things
like the input and output tensors.
Add your kernel to the file you created above. The kernel might look something
like this:
```c++
#include "tensorflow/core/framework/op_kernel.h"
using namespace tensorflow;
class ZeroOutOp : public OpKernel {
 public:
  explicit ZeroOutOp(OpKernelConstruction* context) : OpKernel(context) {}
  void Compute(OpKernelContext* context) override {
    // Grab the input tensor
    const Tensor& input_tensor = context->input(0);
    auto input = input_tensor.flat<int32>();
    // Create an output tensor
    Tensor* output_tensor = NULL;
    OP_REQUIRES_OK(context, context->allocate_output(0, input_tensor.shape(),
                                                     &output_tensor));
    auto output = output_tensor->template flat<int32>();
    // Set all but the first element of the output tensor to 0.
    const int N = input.size();
    for (int i = 1; i < N; i++) {
      output(i) = 0;
    }
    // Preserve the first input value if possible.
    if (N > 0) output(0) = input(0);
  }
};
```
After implementing your kernel, you register it with the TensorFlow system. In
the registration, you specify different constraints under which this kernel
will run. For example, you might have one kernel made for CPUs, and a separate
one for GPUs.
To do this for the `ZeroOut` op, add the following to `zero_out.cc`:
```c++
REGISTER_KERNEL_BUILDER(Name("ZeroOut").Device(DEVICE_CPU), ZeroOutOp);
```
Once you
[build and reinstall TensorFlow](../../get_started/os_setup.md#create-pip), the
TensorFlow system can reference and use the Op when requested.
## Generate the client wrapper
### The Python Op wrapper
Python op wrappers are created automatically in
`bazel-genfiles/tensorflow/python/ops/gen_user_ops.py` for all ops placed in the
[`tensorflow/core/user_ops`][user_ops] directory when you build TensorFlow.
Those ops are imported into
[`tensorflow/python/user_ops/user_ops.py`][python-user_ops] with the statement:
```python
from tensorflow.python.ops.gen_user_ops import *
```
You may optionally use your own function instead. To do this, you first hide
the generated code for that op by adding its name to the `hidden` list in the
`"user_ops"` rule in
[`tensorflow/python/BUILD`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/python/BUILD):
```python
tf_gen_op_wrapper_py(
    name = "user_ops",
    hidden = [
        "Fact",
    ],
    require_shape_functions = False,
)
```
List your op next to `"Fact"`. Next you add your replacement function to
[`tensorflow/python/user_ops/user_ops.py`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/python/user_ops/user_ops.py).
Typically your function will call the generated function to actually add the op
to the graph. The hidden version of the generated function will be in the
`gen_user_ops` package and start with an underscore ("`_`"). For example:
```python
def my_fact():
  """Example of overriding the generated code for an Op."""
  return gen_user_ops._fact()
```
### The C++ Op wrapper
C++ op wrappers are created automatically for all ops placed in the
[`tensorflow/core/user_ops`][user_ops] directory, when you build TensorFlow. For
example, ops in `tensorflow/core/user_ops/zero_out.cc` will generate wrappers in
`bazel-genfiles/tensorflow/cc/ops/user_ops.{h,cc}`.
All generated wrappers for user ops are automatically
imported into [`tensorflow/cc/ops/standard_ops.h`][standard_ops-cc] with the
statement
```c++
#include "tensorflow/cc/ops/user_ops.h"
```
## Verify it works
A good way to verify that you've successfully implemented your Op is to write a
test for it. Create the file
`tensorflow/python/kernel_tests/zero_out_op_test.py` with the contents:
```python
import tensorflow as tf
class ZeroOutTest(tf.test.TestCase):
  def testZeroOut(self):
    with self.test_session():
      result = tf.user_ops.zero_out([5, 4, 3, 2, 1])
      self.assertAllEqual(result.eval(), [5, 0, 0, 0, 0])
```
Then run your test:
```sh
$ bazel test tensorflow/python:zero_out_op_test
```
## Validation
The example above assumed that the Op applied to a tensor of any shape. What
if it only applied to vectors? That means adding a check to the above OpKernel
implementation.
```c++
  void Compute(OpKernelContext* context) override {
    // Grab the input tensor
    const Tensor& input_tensor = context->input(0);
    OP_REQUIRES(context, TensorShapeUtils::IsVector(input_tensor.shape()),
                errors::InvalidArgument("ZeroOut expects a 1-D vector."));
    // ...
  }
```
This asserts that the input is a vector, and returns having set the
`InvalidArgument` status if it isn't. The
[`OP_REQUIRES` macro][validation-macros] takes three arguments:
* The `context`, which can either be an `OpKernelContext` or
`OpKernelConstruction` pointer (see
[`tensorflow/core/framework/op_kernel.h`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/framework/op_kernel.h)),
for its `SetStatus()` method.
* The condition. For example, there are functions for validating the shape
of a tensor in
[`tensorflow/core/public/tensor_shape.h`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/public/tensor_shape.h).
* The error itself, which is represented by a `Status` object, see
[`tensorflow/core/public/status.h`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/public/status.h). A
`Status` has both a type (frequently `InvalidArgument`, but see the list of
types) and a message. Functions for constructing an error may be found in
[`tensorflow/core/lib/core/errors.h`][validation-macros].
Alternatively, if you want to test whether a `Status` object returned from some
function is an error, and if so return it, use
[`OP_REQUIRES_OK`][validation-macros]. Both of these macros return from the
function on error.
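The control flow of these macros (check a condition, record an error status on the context, return early) can be modelled in a few lines of Python. This is a loose sketch of the pattern, not of the real macros: `FakeContext`, `set_status` and `compute` are hypothetical stand-ins, and a nested list is used here to represent a non-vector input.

```python
class FakeContext(object):
    """Hypothetical stand-in for OpKernelContext: it only records a status."""

    def __init__(self):
        self.status = "OK"

    def set_status(self, status):
        self.status = status


def compute(context, values):
    # Mirrors the OP_REQUIRES pattern: if the condition fails, record the
    # error on the context and return immediately without producing output.
    if any(isinstance(v, (list, tuple)) for v in values):
        context.set_status("InvalidArgument: ZeroOut expects a 1-D vector.")
        return None
    if not values:
        return []
    return [values[0]] + [0] * (len(values) - 1)


ctx = FakeContext()
print(compute(ctx, [[1, 2], [3, 4]]), ctx.status)
```

The key point is that the error travels through the context rather than through an exception or a return value, which is why both macros need the `context` argument.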
## Op registration
### Attrs
Ops can have attrs, whose values are set when the Op is added to a graph. These
are used to configure the Op, and their values can be accessed both within the
kernel implementation and in the types of inputs and outputs in the Op
registration. Prefer using an input instead of an attr when possible, since
inputs are more flexible. They can change every step, be set using a feed, etc.
Attrs are used for things that can't be done with inputs: any configuration
that affects the signature (number or type of inputs or outputs) or that
can't change from step-to-step.
You define an attr when you register the Op, by specifying its name and type
using the `Attr` method, which expects a spec of the form:
```
<name>: <attr-type-expr>
```
where `<name>` begins with a letter and can be composed of alphanumeric
characters and underscores, and `<attr-type-expr>` is a type expression of the
form [described below](#attr-types).
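For example, if you'd like the `ZeroOut` Op to preserve a user-specified index,
instead of only the 0th element, you can register the Op with an additional `int` attr, for instance by adding `.Attr("preserve_index: int")` to the `REGISTER_OP("ZeroOut")` call shown earlier.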
Your kernel can then access this attr in its constructor via the `context`
parameter:
```c++
class ZeroOutOp : public OpKernel {
 public:
  explicit ZeroOutOp(OpKernelConstruction* context) : OpKernel(context) {
    // Get the index of the value to preserve
    OP_REQUIRES_OK(context,
                   context->GetAttr("preserve_index", &preserve_index_));
    // Check that preserve_index is non-negative
    OP_REQUIRES(context, preserve_index_ >= 0,
                errors::InvalidArgument("Need preserve_index >= 0, got ",
                                        preserve_index_));
  }
  void Compute(OpKernelContext* context) override {
    // ...
  }
 private:
  int preserve_index_;
};
```
which can then be used in the `Compute` method:
```c++
  void Compute(OpKernelContext* context) override {
    // ...
    // Check that preserve_index is in range
    OP_REQUIRES(context, preserve_index_ < input.dimension(0),
                errors::InvalidArgument("preserve_index out of range"));
    // Set all the elements of the output tensor to 0
    const int N = input.size();
    for (int i = 0; i < N; i++) {
      output_flat(i) = 0;
    }
    // Preserve the requested input value
    output_flat(preserve_index_) = input(preserve_index_);
  }
```
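The combined effect of the constructor check and the range check in `Compute` can be summarized with a plain-Python model. This is a sketch of the semantics only: `zero_out_preserving` is a hypothetical function, and it raises `ValueError` where the kernel would set an `InvalidArgument` status.

```python
def zero_out_preserving(values, preserve_index=0):
    """Model of ZeroOut with a preserve_index attr, on a flat list."""
    # Corresponds to the check performed once, when the kernel is constructed.
    if preserve_index < 0:
        raise ValueError("Need preserve_index >= 0, got %d" % preserve_index)
    # Corresponds to the per-call check in Compute, which depends on the input.
    if preserve_index >= len(values):
        raise ValueError("preserve_index out of range")
    result = [0] * len(values)
    result[preserve_index] = values[preserve_index]
    return result


print(zero_out_preserving([5, 4, 3, 2, 1], preserve_index=2))
```

The split between the two checks reflects what is known when: the attr value is fixed when the Op is added to the graph, while the input size is only known at each step.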
> To preserve [backwards compatibility](#backwards-compatibility), you should
> specify a [default value](#default-values-constraints) when adding an attr to
> an existing op, for example `.Attr("preserve_index: int = 0")`.
### Attr types
The following types are supported in an attr:
* `string`: Any sequence of bytes (not required to be UTF8).
* `int`: A signed integer.
* `float`: A floating point number.
* `bool`: True or false.
* `type`: One of the (non-ref) values of [`DataType`][DataTypeString].
* `shape`: A [`TensorShapeProto`][TensorShapeProto].
* `tensor`: A [`TensorProto`][TensorProto].
* `list(<type>)`: A list of `<type>`, where `<type>` is one of the above types.
  Note that `list(list(<type>))` is invalid.
See also: [`op_def_builder.cc:FinalizeAttr`][FinalizeAttr] for a definitive list.
#### Default values & constraints
Attrs may have default values, and some types of attrs can have constraints. To
define an attr with constraints, you can use the following `<attr-type-expr>`s:
* `{'<string1>', '<string2>'}`: The value must be a string that has either the
  value `<string1>` or `<string2>`. The name of the type, `string`, is implied
  when you use this syntax. This emulates an enum:
```c++
REGISTER_OP("EnumExample")
.Attr("e: {'apple', 'orange'}");
```
* `{<type1>, <type2>}`: The value is of type `type`, and must be one of
  `<type1>` or `<type2>`, where `<type1>` and `<type2>` are supported
[tensor types](../../resources/dims_types.md#data-types). You don't specify
that the type of the attr is `type`. This is implied when you have a list of
types in `{...}`. For example, in this case the attr `t` is a type that must
be an `int32`, a `float`, or a `bool`:
```c++
REGISTER_OP("RestrictedTypeExample")
.Attr("t: {int32, float, bool}");
```
* There are shortcuts for common type constraints:
* `numbertype`: Type `type` restricted to the numeric (non-string and
non-bool) types.
* `realnumbertype`: Like `numbertype` without complex types.
* `quantizedtype`: Like `numbertype` but just the quantized number types.
The specific lists of types allowed by these are defined by the functions
(like `NumberTypes()`) in
[`tensorflow/core/framework/types.h`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/framework/types.h).
In this example the attr `t` must be one of the numeric types:
```c++
REGISTER_OP("NumberType")
.Attr("t: numbertype");
```
For this op:
```python
tf.number_type(t=tf.int32) # Valid
tf.number_type(t=tf.bool) # Invalid
```
* `int >= <n>`: The value must be an int whose value is greater than or equal to
  `<n>`, where `<n>` is a natural number.
For example, the following Op registration specifies that the attr `a` must
have a value that is at least `2`:
```c++
REGISTER_OP("MinIntExample")
.Attr("a: int >= 2");
```
* `list(<type>) >= <n>`: A list of type `<type>` whose length is greater than
  or equal to `<n>`.
For example, the following Op registration specifies that the attr `a` is a
list of types (either `int32` or `float`), and that there must be at least 3
of them:
```c++
REGISTER_OP("TypeListExample")
.Attr("a: list({int32, float}) >= 3");
```
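To make the constraint forms above concrete, the checks they imply can be written out as small Python predicates. These are illustrative models only, with hypothetical names; they are not how TensorFlow validates attrs internally, and types are represented here as plain strings.

```python
def check_enum(value, allowed):
    """Model of "e: {'apple', 'orange'}": value must be one of the strings."""
    return value in allowed


def check_min_int(value, minimum):
    """Model of "a: int >= 2": an integer no smaller than the minimum."""
    is_int = isinstance(value, int) and not isinstance(value, bool)
    return is_int and value >= minimum


def check_type_list(types, allowed, min_length):
    """Model of "a: list({int32, float}) >= 3": length and member checks."""
    return len(types) >= min_length and all(t in allowed for t in types)


print(check_enum("apple", {"apple", "orange"}))
print(check_min_int(1, 2))
print(check_type_list(["int32", "float", "int32"], {"int32", "float"}, 3))
```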
To set a default value for an attr (making it optional in the generated code),
add `= <default>` to the end, as in:
```c++
REGISTER_OP("AttrDefaultExample")
.Attr("i: int = 0");
```
The supported syntax of the default value is what would be used in the proto
representation of the resulting GraphDef definition.
Here are examples for how to specify a default for all types:
```c++
REGISTER_OP("AttrDefaultExampleForAllTypes")
.Attr("s: string = 'foo'")
.Attr("i: int = 0")
.Attr("f: float = 1.0")
.Attr("b: bool = true")
.Attr("ty: type = DT_INT32")
.Attr("sh: shape = { dim { size: 1 } dim { size: 2 } }")
.Attr("te: tensor = { dtype: DT_INT32 int_val: 5 }")
.Attr("l_empty: list(int) = []")
.Attr("l_int: list(int) = [2, 3, 5, 7]");
```
Note in particular that the values of type `type` use [the `DT_*` names
for the types](../../resources/dims_types.md#data-types).
### Polymorphism
#### Type Polymorphism
For ops that can take different types as input or produce different output
types, you can specify [an attr](#attrs) in
[an input or output type](#inputs-outputs) in the Op registration. Typically
you would then register an `OpKernel` for each supported type.
For instance, if you'd like the `ZeroOut` Op to work on `float`s
in addition to `int32`s, your Op registration might declare a type attr with `.Attr("T: {float, int32}")` and use it for both tensors, as in `.Input("to_zero: T")` and `.Output("zeroed: T")`.
Your Op registration now specifies that the input's type must be `float`, or
`int32`, and that its output will be the same type, since both have type `T`.
> A note on naming:{#naming} Inputs, outputs, and attrs generally should be
> given snake_case names. The one exception is attrs that are used as the type
> of an input or in the type of an input. Those attrs can be inferred when the
> op is added to the graph and so don't appear in the op's function. For
> example, this last definition of ZeroOut will generate a Python function that
> looks like:
>
> ```python
> def zero_out(to_zero, name=None):
>   """...
>   Args:
>     to_zero: A `Tensor`. Must be one of the following types:
>         `float32`, `int32`.
>     name: A name for the operation (optional).
>
>   Returns:
>     A `Tensor`. Has the same type as `to_zero`.
>   """
> ```
>
> If `to_zero` is passed an `int32` tensor, then `T` is automatically set to
> `int32` (well, actually `DT_INT32`). Those inferred attrs are given
> Capitalized or CamelCase names.
>
> Compare this with an op that has a type attr that determines the output
> type:
>
> ```c++
> REGISTER_OP("StringToNumber")
>     .Input("string_tensor: string")
>     .Output("output: out_type")
>     .Attr("out_type: {float, int32}")
>     .Doc(R"doc(
> Converts each string in the input Tensor to the specified numeric type.
> )doc");
> ```
>
> In this case, the user has to specify the output type, as in the generated
> Python:
>
> ```python
> def string_to_number(string_tensor, out_type=None, name=None):
>   """Converts each string in the input Tensor to the specified numeric type.
>
>   Args:
>     string_tensor: A `Tensor` of type `string`.
>     out_type: An optional `tf.DType` from: `tf.float32, tf.int32`.
>       Defaults to `tf.float32`.
>     name: A name for the operation (optional).
>
>   Returns:
>     A `Tensor` of type `out_type`.
>   """
> ```
```c++
#include "tensorflow/core/framework/op_kernel.h"
class ZeroOutInt32Op : public OpKernel {
  // as before
};
class ZeroOutFloatOp : public OpKernel {
 public:
  explicit ZeroOutFloatOp(OpKernelConstruction* context)
      : OpKernel(context) {}
  void Compute(OpKernelContext* context) override {
    // Grab the input tensor
    const Tensor& input_tensor = context->input(0);
    auto input = input_tensor.flat<float>();
    // Create an output tensor
    Tensor* output = NULL;
    OP_REQUIRES_OK(context,
                   context->allocate_output(0, input_tensor.shape(), &output));
    auto output_flat = output->template flat<float>();
    // Set all the elements of the output tensor to 0
    const int N = input.size();
    for (int i = 0; i < N; i++) {
      output_flat(i) = 0;
    }
    // Preserve the first input value
    if (N > 0) output_flat(0) = input(0);
  }
};
// Note that TypeConstraint<int32>("T") means that attr "T" (defined
// in the Op registration above) must be "int32" to use this kernel
// class.
REGISTER_KERNEL_BUILDER(
    Name("ZeroOut")
    .Device(DEVICE_CPU)
    .TypeConstraint<int32>("T"),
    ZeroOutInt32Op);
REGISTER_KERNEL_BUILDER(
    Name("ZeroOut")
    .Device(DEVICE_CPU)
    .TypeConstraint<float>("T"),
    ZeroOutFloatOp);
```
> To preserve [backwards compatibility](#backwards-compatibility), you should
> specify a [default value](#default-values-constraints) when adding an attr to
> an existing op, for example `.Attr("T: {float, int32} = DT_INT32")`.
Instead of writing another `OpKernel` with redundant code as above, often you
will be able to use a C++ template instead. You will still have one kernel
registration (`REGISTER_KERNEL_BUILDER` call) per overload.
```c++
template <typename T>
class ZeroOutOp : public OpKernel {
 public:
  explicit ZeroOutOp(OpKernelConstruction* context) : OpKernel(context) {}
  void Compute(OpKernelContext* context) override {
    // Grab the input tensor
    const Tensor& input_tensor = context->input(0);
    auto input = input_tensor.flat<T>();
    // Create an output tensor
    Tensor* output = NULL;
    OP_REQUIRES_OK(context,
                   context->allocate_output(0, input_tensor.shape(), &output));
    auto output_flat = output->template flat<T>();
    // Set all the elements of the output tensor to 0
    const int N = input.size();
    for (int i = 0; i < N; i++) {
      output_flat(i) = 0;
    }
    // Preserve the first input value
    if (N > 0) output_flat(0) = input(0);
  }
};
// Note that TypeConstraint<int32>("T") means that attr "T" (defined
// in the Op registration above) must be "int32" to use this template
// instantiation.
REGISTER_KERNEL_BUILDER(
    Name("ZeroOut")
    .Device(DEVICE_CPU)
    .TypeConstraint<int32>("T"),
    ZeroOutOp<int32>);
REGISTER_KERNEL_BUILDER(
    Name("ZeroOut")
    .Device(DEVICE_CPU)
    .TypeConstraint<float>("T"),
    ZeroOutOp<float>);
REGISTER_KERNEL_BUILDER(
    Name("ZeroOut")
    .Device(DEVICE_CPU)
    .TypeConstraint<double>("T"),
    ZeroOutOp<double>);
```
If you have more than a couple overloads, you can put the registration in a
macro.
```c++
#include "tensorflow/core/framework/op_kernel.h"
#define REGISTER_KERNEL(type)                                       \
  REGISTER_KERNEL_BUILDER(                                          \
      Name("ZeroOut").Device(DEVICE_CPU).TypeConstraint<type>("T"), \
      ZeroOutOp<type>)
REGISTER_KERNEL(int32);
REGISTER_KERNEL(float);
REGISTER_KERNEL(double);
#undef REGISTER_KERNEL
```
Depending on the list of types you are registering the kernel for, you may be
able to use a macro provided by
[`tensorflow/core/framework/register_types.h`][register_types]:
```c++
#include "tensorflow/core/framework/op_kernel.h"
#include "tensorflow/core/framework/register_types.h"
REGISTER_OP("ZeroOut")
.Attr("T: realnumbertype")
.Input("to_zero: T")
.Output("zeroed: T");
template <typename T>
class ZeroOutOp : public OpKernel { ... };
#define REGISTER_KERNEL(type)                                       \
  REGISTER_KERNEL_BUILDER(                                          \
      Name("ZeroOut").Device(DEVICE_CPU).TypeConstraint<type>("T"), \
      ZeroOutOp<type>)
TF_CALL_REAL_NUMBER_TYPES(REGISTER_KERNEL);
#undef REGISTER_KERNEL
```
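Conceptually, each registration adds an entry that is later looked up by Op name, device and the value of the type attr. The following Python sketch models that idea with a dictionary. It is an assumption-laden illustration, not TensorFlow's actual registry: `register_kernel`, `find_kernel` and `make_zero_out` are hypothetical names, and types and devices are plain strings.

```python
_KERNELS = {}  # maps (op name, device, dtype) to a kernel callable


def register_kernel(op_name, device, dtype, kernel):
    _KERNELS[(op_name, device, dtype)] = kernel


def find_kernel(op_name, device, dtype):
    try:
        return _KERNELS[(op_name, device, dtype)]
    except KeyError:
        raise LookupError(
            "No kernel for %s on %s with T=%s" % (op_name, device, dtype))


def make_zero_out(cast):
    """Plays the role of the C++ template: one body, many element types."""
    def kernel(values):
        out = [cast(0)] * len(values)
        if values:
            out[0] = cast(values[0])
        return out
    return kernel


# Plays the role of the registration macro: one entry per supported type.
for dtype, cast in (("int32", int), ("float", float), ("double", float)):
    register_kernel("ZeroOut", "CPU", dtype, make_zero_out(cast))

print(find_kernel("ZeroOut", "CPU", "float")([1.5, 2.5]))
```

A type with no registered entry fails at lookup time, which is the behaviour to expect when an Op is used with a type that no kernel was registered for.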
#### List Inputs and Outputs
In addition to being able to accept or produce different types, ops can consume
or produce a variable number of tensors.
In the next example, the attr `T` holds a *list* of types, and is used as the
type of both the input `in` and the output `out`. The input and output are
lists of tensors of that type (and the number and types of tensors in the output
are the same as the input, since both have type `T`).
```c++
REGISTER_OP("PolymorphicListExample")
.Attr("T: list(type)")
.Input("in: T")
.Output("out: T");
```
You can also place restrictions on what types can be specified in the list. In
this next case, the input is a list of `float` and `double` tensors. The Op
accepts, for example, input types `(float, double, float)` and in that case the
output type would also be `(float, double, float)`.
```c++
REGISTER_OP("ListTypeRestrictionExample")
.Attr("T: list({float, double})")
.Input("in: T")
.Output("out: T");
```
If you want all the tensors in a list to be of the same type, you might do
something like:
```c++
REGISTER_OP("IntListInputExample")
.Attr("N: int")
.Input("in: N * int32")
.Output("out: int32");
```
This accepts a list of `int32` tensors, and uses an `int` attr `N` to
specify the length of the list.
This can be made [type polymorphic](#type-polymorphism) as well. In the next
example, the input is a list of tensors (with length `"N"`) of the same (but
unspecified) type (`"T"`), and the output is a single tensor of matching type:
```c++
REGISTER_OP("SameListInputExample")
.Attr("N: int")
.Attr("T: type")
.Input("in: N * T")
.Output("out: T");
```
By default, tensor lists have a minimum length of 1. You can change that default
using
[a `">="` constraint on the corresponding attr](#default-values-constraints).
In this next example, the input is a list of at least 2 `int32` tensors:
```c++
REGISTER_OP("MinLengthIntListExample")
.Attr("N: int >= 2")
.Input("in: N * int32")
.Output("out: int32");
```
The same syntax works with `"list(type)"` attrs:
```c++
REGISTER_OP("MinimumLengthPolymorphicListExample")
.Attr("T: list(type) >= 3")
.Input("in: T")
.Output("out: T");
```
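For an input declared as `N * T`, both attrs are determined by the list that is passed in: `N` is its length and `T` is the common element type. A minimal Python model of that inference, including the minimum-length constraint, might look like the following. The function name is hypothetical and types are represented as plain strings.

```python
def infer_list_attrs(input_dtypes, min_length=1):
    """Model of an input declared as "in: N * T": returns (N, T)."""
    n = len(input_dtypes)
    if n < min_length:
        raise ValueError(
            "Expected at least %d tensors, got %d" % (min_length, n))
    # "N * T" requires every tensor in the list to share one type.
    if len(set(input_dtypes)) != 1:
        raise ValueError("All tensors in the list must have the same type")
    return n, input_dtypes[0]


print(infer_list_attrs(["int32", "int32", "int32"], min_length=2))
```

By contrast, a `list(type)` attr would record the whole sequence of types, so mixed lists such as `(float, double, float)` are allowed there.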
### Inputs and Outputs
To summarize the above, an Op registration can have multiple inputs and outputs:
```c++
REGISTER_OP("MultipleInsAndOuts")
.Input("y: int32")
.Input("z: float")
.Output("a: string")
.Output("b: int32");
```
Each input or output spec is of the form:
```
<name>: <io-type-expr>
```
where `<name>` begins with a letter and can be composed of alphanumeric
characters and underscores. `<io-type-expr>` is one of the following type
expressions:
* `<type>`, where `<type>` is a supported input type (e.g. `float`, `int32`,
`string`). This specifies a single tensor of the given type.
See
[the list of supported Tensor types](../../resources/dims_types.md#data-types).
```c++
REGISTER_OP("BuiltInTypesExample")
.Input("integers: int32")
.Input("complex_numbers: complex64");
```
* `<attr-type>`, where `<attr-type>` is the name of an [Attr](#attrs) with type
`type` or `list(type)` (with a possible type restriction). This syntax allows
for [polymorphic ops](#Polymorphism).
```c++
REGISTER_OP("PolymorphicSingleInput")
.Attr("T: type")
.Input("in: T");
REGISTER_OP("RestrictedPolymorphicSingleInput")
.Attr("T: {int32, int64}")
.Input("in: T");
```
Referencing an attr of type `list(type)` allows you to accept a sequence of
tensors.
```c++
REGISTER_OP("ArbitraryTensorSequenceExample")
.Attr("T: list(type)")
.Input("in: T")
.Output("out: T");
REGISTER_OP("RestrictedTensorSequenceExample")
.Attr("T: list({int32, int64})")
.Input("in: T")
.Output("out: T");
```
Note that the number and types of tensors in the output `out` are the same as
in the input `in`, since both are of type `T`.
* For a sequence of tensors with the same type: `<number> * <type>`, where
`<number>` is the name of an [Attr](#attrs) with type `int`. The `<type>` can
either be
[a specific type like `int32` or `float`](../../resources/dims_types.md#data-types),
or the name of an attr with type `type`. As an example of the first, this
Op accepts a list of `int32` tensors:
```c++
REGISTER_OP("Int32SequenceExample")
.Attr("NumTensors: int")
.Input("in: NumTensors * int32");
```
Whereas this Op accepts a list of tensors of any type, as long as they are all
the same:
```c++
REGISTER_OP("SameTypeSequenceExample")
.Attr("NumTensors: int")
.Attr("T: type")
.Input("in: NumTensors * T");
```
* For a reference to a tensor: `Ref(<type>)`, where `<type>` is one of the
previous types.
> A note on naming: Any attr used in the type of an input will be inferred. By
> convention those inferred attrs use capital names (like `T` or `N`).
> Otherwise inputs, outputs, and attrs have names like function parameters
> (e.g. `num_outputs`). For more details, see the
> [earlier note on naming](#naming).
For more details, see
[`tensorflow/core/framework/op_def_builder.h`][op_def_builder].
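The spec grammar summarized above can be illustrated with a small, standalone parser. This is only a sketch for reading the `<name>: <io-type-expr>` form; `parse_io_spec` is a hypothetical helper written for this page, not part of TensorFlow, and it does not validate that the type or attr names actually exist.

```python
import re

# Hypothetical helper (not a TensorFlow API): a name must begin with a letter
# and may contain alphanumeric characters and underscores.
_SPEC_RE = re.compile(r"^\s*([A-Za-z][A-Za-z0-9_]*)\s*:\s*(.+?)\s*$")


def parse_io_spec(spec):
    """Split '<name>: <io-type-expr>' into its parts."""
    match = _SPEC_RE.match(spec)
    if match is None:
        raise ValueError("malformed spec: %r" % spec)
    name, expr = match.groups()

    # 'Ref(<type>)' marks a reference to a tensor.
    is_ref = False
    ref = re.match(r"^Ref\((.+)\)$", expr)
    if ref:
        is_ref = True
        expr = ref.group(1).strip()

    # '<number> * <type>' marks a sequence of tensors of the same type.
    count_attr = None
    if "*" in expr:
        count_attr, expr = [part.strip() for part in expr.split("*", 1)]

    return {"name": name, "type": expr,
            "count_attr": count_attr, "is_ref": is_ref}
```

For instance, `parse_io_spec("in: N * T")` reports the input name `in`, the type (or type attr) `T`, and the length attr `N`.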
### Backwards compatibility
In general, changes to specifications must be backwards-compatible: changing the
specification of an Op must not break prior serialized GraphDefs constructed
from older specifications.
There are several ways to preserve backwards-compatibility.
1. Any new attrs added to an operation must have default values defined, and
with that default value the Op must have the original behavior. To change an
operation from not polymorphic to polymorphic, you *must* give a default
value to the new type attr to preserve the original signature by default. For
example, if your operation was:
```c++
REGISTER_OP("MyGeneralUnaryOp")
.Input("in: float")
.Output("out: float");
```
you can make it polymorphic in a backwards-compatible way using:
```c++
REGISTER_OP("MyGeneralUnaryOp")
.Input("in: T")
.Output("out: T")
.Attr("T: numerictype = float");
```
1. You can safely make a constraint on an attr less restrictive. For example,
you can change from `{int32, int64}` to `{int32, int64, float}` or from
`{"apple", "orange"}` to `{"apple", "banana", "orange"}`.
1. Namespace any new Ops you create, by prefixing the Op names with something
unique to your project. This avoids having your Op colliding with any Ops
that might be included in future versions of TensorFlow.
1. Plan ahead! Try to anticipate future uses for the Op. Some signature changes
can't be done in a compatible way (for example, adding an input, or making a
single input into a list).
If you cannot make your change to an operation backwards compatible, then
create a new operation with a new name with the new semantics.
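The rule about loosening constraints amounts to a subset check: every value accepted by the old constraint must still be accepted by the new one. The sketch below illustrates that reading only; `is_compatible_relaxation` is a hypothetical name and not something TensorFlow provides.

```python
def is_compatible_relaxation(old_allowed, new_allowed):
    """Return True if every previously allowed value is still allowed.

    Hypothetical illustration of the "less restrictive constraint" rule;
    both arguments are iterables of allowed attr values.
    """
    return set(old_allowed) <= set(new_allowed)


# Widening {int32, int64} to {int32, int64, float} keeps old graphs valid.
widened = is_compatible_relaxation(["int32", "int64"],
                                   ["int32", "int64", "float"])
# Dropping "orange" would reject graphs that used it.
narrowed = is_compatible_relaxation(["apple", "orange"], ["apple", "banana"])
```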
## GPU Support
You can implement different OpKernels and register one for CPU and another for
GPU, just like you can [register kernels for different types](#Polymorphism).
There are several examples of kernels with GPU support in
[`tensorflow/core/kernels/`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/kernels/).
Notice some kernels have a CPU version in a `.cc` file, a GPU version in a file
ending in `_gpu.cu.cc`, and some code shared in common in a `.h` file.
For example, the [`pad` op](../../api_docs/python/array_ops.md#pad) has
everything but the GPU kernel in [`tensorflow/core/kernels/pad_op.cc`][pad_op].
The GPU kernel is in
[`tensorflow/core/kernels/pad_op_gpu.cu.cc`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/kernels/pad_op_gpu.cu.cc),
and the shared code is a templated class defined in
[`tensorflow/core/kernels/pad_op.h`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/kernels/pad_op.h).
Note that even when the GPU kernel version of `pad` is used, it still
needs its `"paddings"` input in CPU memory. To mark that inputs or outputs are
kept on the CPU, add a `HostMemory()` call to the kernel registration, e.g.:
```c++
#define REGISTER_GPU_KERNEL(T)                         \
  REGISTER_KERNEL_BUILDER(Name("Pad")                  \
                              .Device(DEVICE_GPU)      \
                              .TypeConstraint<T>("T")  \
                              .HostMemory("paddings"), \
                          PadOp<GPUDevice, T>)
```
## Implement the gradient in Python
Given a graph of ops, TensorFlow uses automatic differentiation
(backpropagation) to add new ops representing gradients with respect to the
existing ops (see
[Gradient Computation](../../api_docs/python/train.md#gradient-computation)).
To make automatic differentiation work for new ops, you must register a gradient
function which computes gradients with respect to the ops' inputs given
gradients with respect to the ops' outputs.
Mathematically, if an op computes \\(y = f(x)\\) the registered gradient op
converts gradients \\(\partial / \partial y\\) with respect to \\(y\\) into
gradients \\(\partial / \partial x\\) with respect to \\(x\\) via the chain
rule:
$$\frac{\partial}{\partial x}
= \frac{\partial}{\partial y} \frac{\partial y}{\partial x}
= \frac{\partial}{\partial y} \frac{\partial f}{\partial x}.$$
In the case of `ZeroOut`, only one entry in the input affects the output, so the
gradient with respect to the input is a sparse "one hot" tensor. This is
expressed as follows:
```python
from tensorflow.python.framework import ops
from tensorflow.python.ops import array_ops
from tensorflow.python.ops import sparse_ops
@ops.RegisterGradient("ZeroOut")
def _zero_out_grad(op, grad):
  """The gradients for `zero_out`.
  Args:
    op: The `zero_out` `Operation` that we are differentiating, which we can use
      to find the inputs and outputs of the original op.
    grad: Gradient with respect to the output of the `zero_out` op.
  Returns:
    Gradients with respect to the input of `zero_out`.
  """
  to_zero = op.inputs[0]
  shape = array_ops.shape(to_zero)
  index = array_ops.zeros_like(shape)
  first_grad = array_ops.reshape(grad, [-1])[0]
  to_zero_grad = sparse_ops.sparse_to_dense(index, shape, first_grad, 0)
  return [to_zero_grad]  # List of one Tensor, since we have one input
```
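To see why the gradient is "one hot", the same computation can be mirrored on plain Python lists and compared against a finite-difference estimate. This is a sketch that stands in for tensors with flat lists; `zero_out`, `zero_out_grad`, and `numeric_grad` here are illustrative functions, not the TensorFlow ops or the `gradient_checker` module.

```python
def zero_out(values):
    """Keep the first element and zero the rest (flat list as a tensor)."""
    return [v if i == 0 else 0.0 for i, v in enumerate(values)]


def zero_out_grad(values, grad):
    """Gradient w.r.t. the input: only position 0 receives grad[0]."""
    return [grad[0] if i == 0 else 0.0 for i in range(len(values))]


def numeric_grad(values, grad, eps=1e-3):
    """Central-difference estimate of d(sum(grad * zero_out(x))) / dx[i]."""
    result = []
    for i in range(len(values)):
        hi = list(values)
        lo = list(values)
        hi[i] += eps
        lo[i] -= eps
        f_hi = sum(g * y for g, y in zip(grad, zero_out(hi)))
        f_lo = sum(g * y for g, y in zip(grad, zero_out(lo)))
        result.append((f_hi - f_lo) / (2 * eps))
    return result
```

Weighting the outputs by `grad` before differencing plays the role of the incoming gradient with respect to \\(y\\) in the chain rule above.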
Details about registering gradient functions with
[`ops.RegisterGradient`](../../api_docs/python/framework.md#RegisterGradient):
* For an op with one output, the gradient function will take an
[`Operation`](../../api_docs/python/framework.md#Operation) `op` and a
[`Tensor`](../../api_docs/python/framework.md#Tensor) `grad` and build new ops
out of the tensors
[`op.inputs[i]`](../../api_docs/python/framework.md#Operation.inputs),
[`op.outputs[i]`](../../api_docs/python/framework.md#Operation.outputs), and `grad`. Information
about any attrs can be found via
[`op.get_attr`](../../api_docs/python/framework.md#Operation.get_attr).
* If the op has multiple outputs, the gradient function will take `op` and
`grads`, where `grads` is a list of gradients with respect to each output.
The result of the gradient function must be a list of `Tensor` objects
representing the gradients with respect to each input.
* If there is no well-defined gradient for some input, such as for integer
inputs used as indices, the corresponding returned gradient should be
`None`. For example, for an op taking a floating point tensor `x` and an
integer index `i`, the gradient function would `return [x_grad, None]`.
* If there is no meaningful gradient for the op at all, use
`ops.NoGradient("OpName")` to disable automatic differentiation.
Note that at the time the gradient function is called, only the data flow graph
of ops is available, not the tensor data itself. Thus, all computation must be
performed using other TensorFlow ops, to be run at graph execution time.
## Implement a shape function in Python
The TensorFlow Python API has a feature called "shape inference" that provides
information about the shapes of tensors without having to execute the
graph. Shape inference is supported by "shape functions" that are registered for
each op type, and perform two roles: asserting that the shapes of the inputs are
compatible, and specifying the shapes for the outputs. A shape function is a
Python function that takes an
[`Operation`](../../api_docs/python/framework.md#Operation) as input, and
returns a list of
[`TensorShape`](../../api_docs/python/framework.md#TensorShape) objects (one per
output of the op). To register a shape function, apply the
[`tf.RegisterShape` decorator](../../api_docs/python/framework.md#RegisterShape)
to a shape function. For example, the
[`ZeroOut` op defined above](#define_interface) would have a shape function like
the following:
```python
@tf.RegisterShape("ZeroOut")
def _zero_out_shape(op):
  """Shape function for the ZeroOut op.
  This is the unconstrained version of ZeroOut, which produces an output
  with the same shape as its input.
  """
  return [op.inputs[0].get_shape()]
```
A shape function can also constrain the shape of an input. For the version of
[`ZeroOut` with a vector shape constraint](#Validation), the shape function
would be as follows:
```python
@tf.RegisterShape("ZeroOut")
def _zero_out_shape(op):
  """Shape function for the ZeroOut op.
  This is the constrained version of ZeroOut, which requires the input to
  have rank 1 (a vector).
  """
  input_shape = op.inputs[0].get_shape().with_rank(1)
  return [input_shape]
```
If your op is [polymorphic with multiple inputs](#Polymorphism), use the
properties of the operation to determine the number of shapes to check:
```python
@tf.RegisterShape("IntListInputExample")
def _int_list_input_example_shape(op):
  """Shape function for the "IntListInputExample" op.
  All inputs and the output are matrices of the same size.
  """
  output_shape = tf.TensorShape(None)
  for input in op.inputs:
    output_shape = output_shape.merge_with(input.get_shape().with_rank(2))
  return [output_shape]
```
Since shape inference is an optional feature, and the shapes of tensors may vary
dynamically, shape functions must be robust to incomplete shape information for
any of the inputs. The [`merge_with`](../../api_docs/python/framework.md)
method allows the caller to assert that two shapes are the same, even if either
or both of them do not have complete information. Shape functions are defined
for all of the
[standard Python ops](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/python/ops/),
and provide many different usage examples.
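The idea behind merging partially known shapes can be sketched without TensorFlow. In the illustration below a shape is a tuple whose entries may be `None` (unknown dimension), and `None` in place of the whole tuple means unknown rank. `merge_shapes` is a hypothetical stand-in written for this page; it is not the real `merge_with` method and may differ from it in details.

```python
def merge_shapes(a, b):
    """Combine two partially known shapes, or raise if they conflict.

    Hypothetical stand-in for the merging idea: None as a shape means
    unknown rank, None as a dimension means unknown size.
    """
    if a is None:
        return b
    if b is None:
        return a
    if len(a) != len(b):
        raise ValueError("ranks differ: %r vs %r" % (a, b))
    merged = []
    for dim_a, dim_b in zip(a, b):
        if dim_a is None:
            merged.append(dim_b)
        elif dim_b is None or dim_a == dim_b:
            merged.append(dim_a)
        else:
            raise ValueError("dimensions differ: %r vs %r" % (dim_a, dim_b))
    return tuple(merged)
```

Each input can contribute whatever it knows, so two half-known shapes such as `(2, None)` and `(None, 3)` merge into a fully known one.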
[core-array_ops]:https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/ops/array_ops.cc
[python-user_ops]:https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/python/user_ops/user_ops.py
[tf-kernels]:https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/kernels/
[user_ops]:https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/user_ops/
[pad_op]:https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/kernels/pad_op.cc
[standard_ops-py]:https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/python/ops/standard_ops.py
[standard_ops-cc]:https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/cc/ops/standard_ops.h
[python-BUILD]:https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/python/BUILD
[validation-macros]:https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/lib/core/errors.h
[op_def_builder]:https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/framework/op_def_builder.h
[register_types]:https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/framework/register_types.h
[FinalizeAttr]:https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/framework/op_def_builder.cc#FinalizeAttr
[DataTypeString]:https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/framework/types.cc#DataTypeString
[types-proto]:https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/framework/types.proto
[TensorShapeProto]:https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/framework/tensor_shape.proto
[TensorProto]:https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/framework/tensor.proto
================================================
FILE: SOURCE/how_tos/adding_an_op/zero_out_1_test.py
================================================
"""Test for version 1 of the zero_out op."""
import tensorflow.python.platform
import tensorflow as tf
from tensorflow.g3doc.how_tos.adding_an_op import gen_zero_out_op_1
class ZeroOut1Test(tf.test.TestCase):
  def test(self):
    with self.test_session():
      result = gen_zero_out_op_1.zero_out([5, 4, 3, 2, 1])
      self.assertAllEqual(result.eval(), [5, 0, 0, 0, 0])
if __name__ == '__main__':
  tf.test.main()
================================================
FILE: SOURCE/how_tos/adding_an_op/zero_out_2_test.py
================================================
"""Test for version 2 of the zero_out op."""
import tensorflow.python.platform
import tensorflow as tf
from tensorflow.g3doc.how_tos.adding_an_op import gen_zero_out_op_2
from tensorflow.g3doc.how_tos.adding_an_op import zero_out_grad_2
from tensorflow.python.kernel_tests import gradient_checker
class ZeroOut2Test(tf.test.TestCase):
  def test(self):
    with self.test_session():
      result = gen_zero_out_op_2.zero_out([5, 4, 3, 2, 1])
      self.assertAllEqual(result.eval(), [5, 0, 0, 0, 0])
  def test_grad(self):
    with self.test_session():
      shape = (5,)
      x = tf.constant([5, 4, 3, 2, 1], dtype=tf.float32)
      y = gen_zero_out_op_2.zero_out(x)
      err = gradient_checker.ComputeGradientError(x, shape, y, shape)
      self.assertLess(err, 1e-4)
if __name__ == '__main__':
  tf.test.main()
================================================
FILE: SOURCE/how_tos/adding_an_op/zero_out_3_test.py
================================================
"""Test for version 3 of the zero_out op."""
import tensorflow.python.platform
import tensorflow as tf
from tensorflow.g3doc.how_tos.adding_an_op import gen_zero_out_op_3
class ZeroOut3Test(tf.test.TestCase):
  def test(self):
    with self.test_session():
      result = gen_zero_out_op_3.zero_out([5, 4, 3, 2, 1])
      self.assertAllEqual(result.eval(), [5, 0, 0, 0, 0])
  def testAttr(self):
    with self.test_session():
      result = gen_zero_out_op_3.zero_out([5, 4, 3, 2, 1], preserve_index=3)
      self.assertAllEqual(result.eval(), [0, 0, 0, 2, 0])
  def testNegative(self):
    with self.test_session():
      result = gen_zero_out_op_3.zero_out([5, 4, 3, 2, 1], preserve_index=-1)
      with self.assertRaisesOpError("Need preserve_index >= 0, got -1"):
        result.eval()
  def testLarge(self):
    with self.test_session():
      result = gen_zero_out_op_3.zero_out([5, 4, 3, 2, 1], preserve_index=17)
      with self.assertRaisesOpError("preserve_index out of range"):
        result.eval()
if __name__ == '__main__':
  tf.test.main()
================================================
FILE: SOURCE/how_tos/adding_an_op/zero_out_grad_2.py
================================================
"""The gradient of the tutorial zero_out op."""
from tensorflow.python.framework import ops
from tensorflow.python.ops import array_ops
from tensorflow.python.ops import sparse_ops
@ops.RegisterGradient("ZeroOut")
def _zero_out_grad(op, grad):
  """The gradients for `zero_out`.
  Args:
    op: The `zero_out` `Operation` that we are differentiating, which we can use
      to find the inputs and outputs of the original op.
    grad: Gradient with respect to the output of the `zero_out` op.
  Returns:
    Gradients with respect to the input of `zero_out`.
  """
  to_zero = op.inputs[0]
  shape = array_ops.shape(to_zero)
  index = array_ops.zeros_like(shape)
  first_grad = array_ops.reshape(grad, [-1])[0]
  to_zero_grad = sparse_ops.sparse_to_dense(index, shape, first_grad, 0)
  return [to_zero_grad]  # List of one Tensor, since we have one input
================================================
FILE: SOURCE/how_tos/adding_an_op/zero_out_op_kernel_1.cc
================================================
#include "tensorflow/core/framework/op.h"
#include "tensorflow/core/framework/op_kernel.h"
using namespace tensorflow;
REGISTER_OP("ZeroOut")
.Input("to_zero: int32")
.Output("zeroed: int32")
.Doc(R"doc(
Zeros out all but the first value of a Tensor.
zeroed: A Tensor whose first value is identical to `to_zero`, and 0
otherwise.
)doc");
class ZeroOutOp : public OpKernel {
 public:
  explicit ZeroOutOp(OpKernelConstruction* context) : OpKernel(context) {}
  void Compute(OpKernelContext* context) override {
    // Grab the input tensor
    const Tensor& input_tensor = context->input(0);
    auto input = input_tensor.flat<int32>();
    // Create an output tensor
    Tensor* output_tensor = NULL;
    OP_REQUIRES_OK(context, context->allocate_output(0, input_tensor.shape(),
                                                     &output_tensor));
    auto output = output_tensor->template flat<int32>();
    // Set all but the first element of the output tensor to 0.
    const int N = input.size();
    for (int i = 1; i < N; i++) {
      output(i) = 0;
    }
    // Preserve the first input value.
    if (N > 0) output(0) = input(0);
  }
};
REGISTER_KERNEL_BUILDER(Name("ZeroOut").Device(DEVICE_CPU), ZeroOutOp);
================================================
FILE: SOURCE/how_tos/adding_an_op/zero_out_op_kernel_2.cc
================================================
#include "tensorflow/core/framework/op_kernel.h"
#include "tensorflow/core/framework/register_types.h"
using namespace tensorflow;
REGISTER_OP("ZeroOut")
.Attr("T: realnumbertype")
.Input("to_zero: T")
.Output("zeroed: T")
.Doc(R"doc(
Zeros out all but the first value of a Tensor.
zeroed: A Tensor whose first value is identical to `to_zero`, and 0
otherwise.
)doc");
REGISTER_OP("ZeroOut2")
.Attr("T: realnumbertype")
.Input("to_zero: T")
.Output("zeroed: T")
.Doc(R"doc(
Zeros out all but the first value of a Tensor.
zeroed: A Tensor whose first value is identical to `to_zero`, and 0
otherwise.
)doc");
REGISTER_OP("ZeroOut3")
.Attr("T: realnumbertype")
.Input("to_zero: T")
.Output("zeroed: T")
.Doc(R"doc(
Zeros out all but the first value of a Tensor.
zeroed: A Tensor whose first value is identical to `to_zero`, and 0
otherwise.
)doc");
template <typename T>
class ZeroOutOp : public OpKernel {
 public:
  explicit ZeroOutOp(OpKernelConstruction* context) : OpKernel(context) {}
  void Compute(OpKernelContext* context) override {
    // Grab the input tensor
    const Tensor& input_tensor = context->input(0);
    auto input = input_tensor.flat<T>();
    // Create an output tensor
    Tensor* output = NULL;
    OP_REQUIRES_OK(context,
                   context->allocate_output(0, input_tensor.shape(), &output));
    auto output_flat = output->template flat<T>();
    // Set all the elements of the output tensor to 0
    const int N = input.size();
    for (int i = 0; i < N; i++) {
      output_flat(i) = 0;
    }
    // Preserve the first input value
    if (N > 0) output_flat(0) = input(0);
  }
};
REGISTER_KERNEL_BUILDER(Name("ZeroOut")
                            .Device(DEVICE_CPU)
                            .TypeConstraint<float>("T"),
                        ZeroOutOp<float>);
REGISTER_KERNEL_BUILDER(Name("ZeroOut")
                            .Device(DEVICE_CPU)
                            .TypeConstraint<double>("T"),
                        ZeroOutOp<double>);
REGISTER_KERNEL_BUILDER(Name("ZeroOut")
                            .Device(DEVICE_CPU)
                            .TypeConstraint<int32>("T"),
                        ZeroOutOp<int32>);
#define REGISTER_KERNEL(type)                                       \
  REGISTER_KERNEL_BUILDER(                                          \
      Name("ZeroOut2").Device(DEVICE_CPU).TypeConstraint<type>("T"), \
      ZeroOutOp<type>)
REGISTER_KERNEL(float);
REGISTER_KERNEL(double);
REGISTER_KERNEL(int32);
#undef REGISTER_KERNEL
#define REGISTER_KERNEL(type) \
REGISTER_KERNEL_BUILDER( \
      Name("ZeroOut3").Device(DEVICE_CPU).TypeConstraint<type>("T"), \
      ZeroOutOp<type>)
TF_CALL_REAL_NUMBER_TYPES(REGISTER_KERNEL);
#undef REGISTER_KERNEL
================================================
FILE: SOURCE/how_tos/adding_an_op/zero_out_op_kernel_3.cc
================================================
#include "tensorflow/core/framework/op.h"
#include "tensorflow/core/framework/op_kernel.h"
using namespace tensorflow;
REGISTER_OP("ZeroOut")
.Attr("preserve_index: int = 0")
.Input("to_zero: int32")
.Output("zeroed: int32");
class ZeroOutOp : public OpKernel {
 public:
  explicit ZeroOutOp(OpKernelConstruction* context) : OpKernel(context) {
    // Get the index of the value to preserve
    OP_REQUIRES_OK(context,
                   context->GetAttr("preserve_index", &preserve_index_));
    // Check that preserve_index is non-negative
    OP_REQUIRES(context, preserve_index_ >= 0,
                errors::InvalidArgument("Need preserve_index >= 0, got ",
                                        preserve_index_));
  }
  void Compute(OpKernelContext* context) override {
    // Grab the input tensor
    const Tensor& input_tensor = context->input(0);
    auto input = input_tensor.flat<int32>();
    // Check that preserve_index is in range
    OP_REQUIRES(context, preserve_index_ < input.dimension(0),
                errors::InvalidArgument("preserve_index out of range"));
    // Create an output tensor
    Tensor* output_tensor = NULL;
    OP_REQUIRES_OK(context, context->allocate_output(0, input_tensor.shape(),
                                                     &output_tensor));
    auto output = output_tensor->template flat<int32>();
    // Set all the elements of the output tensor to 0
    const int N = input.size();
    for (int i = 0; i < N; i++) {
      output(i) = 0;
    }
    // Preserve the requested input value
    output(preserve_index_) = input(preserve_index_);
  }
 private:
  int preserve_index_;
};
REGISTER_KERNEL_BUILDER(Name("ZeroOut").Device(DEVICE_CPU), ZeroOutOp);
================================================
FILE: SOURCE/how_tos/adding_an_op.md
================================================
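# Adding a New Op
Prerequisites:
* Some familiarity with C++.
* You have [downloaded the TensorFlow source](tensorflow-zh/SOURCE/get_started/introduction.md#source) and are able to build it.
If the existing library does not cover an operation you need, you can create a
custom Op. To make a custom Op compatible with the existing library, you must:
* Register the new Op in a C++ file. The Op registration is independent of the
implementation, and describes how the Op is invoked. For example, it defines the
Op name and specifies its inputs and outputs.
* Implement the Op in C++. Each implementation is called a "kernel", and there
can be multiple kernels for different architectures (CPU, GPU, etc.) or
input/output types.
* Create a Python wrapper. This wrapper is the public API for creating the Op. A
default wrapper is generated from the Op registration; you can use it directly
or add your own.
* Optionally, write a function to compute gradients for the Op.
* Optionally, write a function that describes the input and output shapes of the
Op. This allows shape inference to work with your Op.
* Test the Op, typically in Python. If you define gradients, you can verify them with the Python [GradientChecker](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/python/kernel_tests/gradient_checker.py).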
## Contents
### [Adding a New Op](#AUTOGENERATED-adding-a-new-op)
* [Define the Op's interface](#define_interface)
* [Implement the kernel for the Op](#AUTOGENERATED-implement-the-kernel-for-the-op)
* [Generate the client wrapper](#AUTOGENERATED-generate-the-client-wrapper)
* [The Python Op wrapper](#AUTOGENERATED-the-python-op-wrapper)
* [The C++ Op wrapper](#AUTOGENERATED-the-c---op-wrapper)
* [Verify it works](#AUTOGENERATED-verify-it-works)
* [Validation](#Validation)
* [Op registration](#AUTOGENERATED-op-registration)
* [Attrs](#Attrs)
* [Attr types](#AUTOGENERATED-attr-types)
* [Polymorphism](#Polymorphism)
* [Inputs and outputs](#AUTOGENERATED-inputs-and-outputs)
* [Backwards compatibility](#AUTOGENERATED-backwards-compatibility)
* [GPU support](#mult-archs)
* [Implement the gradient in Python](#AUTOGENERATED-implement-the-gradient-in-python)
* [Implement a shape function in Python](#AUTOGENERATED-implement-a-shape-function-in-python)
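## Define the Op's interface
You define the interface of an Op by registering it with the TensorFlow system.
In the registration, you specify the name of your Op, its inputs (types and
names) and outputs (types and names), as well as docstrings and any
[attrs](#Attrs) the Op might require.
To see how this works, suppose you'd like to create an Op that takes a tensor of
`int32`s and outputs a copy of the tensor with all but the first element set to
zero. Create the file `tensorflow/core/user_ops/zero_out.cc` and add a call to
the `REGISTER_OP` macro that defines the interface for such an Op: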
```
#include "tensorflow/core/framework/op.h"
REGISTER_OP("ZeroOut")
.Input("to_zero: int32")
.Output("zeroed: int32");
```
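This `ZeroOut` Op takes one tensor `to_zero` of 32-bit integers as input, and
outputs a tensor `zeroed` of 32-bit integers.
> A note on naming: The name of the Op should be unique and CamelCase. Names starting with an underscore (`_`) are reserved for internal use.
## Implement the kernel for the Op
After you define the interface, provide one or more implementations of the Op.
To create one of these kernels, create a class that extends `OpKernel` and
overrides the `Compute` method. The `Compute` method provides one `context` argument of type `OpKernelContext*`, from which you can access useful things like the input and output tensors.
Add your kernel to the file you created above. The kernel might look something like this: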
```
#include "tensorflow/core/framework/op_kernel.h"
using namespace tensorflow;
class ZeroOutOp : public OpKernel {
 public:
  explicit ZeroOutOp(OpKernelConstruction* context) : OpKernel(context) {}
  void Compute(OpKernelContext* context) override {
    // Grab the input tensor.
    const Tensor& input_tensor = context->input(0);
    auto input = input_tensor.flat<int32>();
    // Create an output tensor.
    Tensor* output_tensor = NULL;
    OP_REQUIRES_OK(context, context->allocate_output(0, input_tensor.shape(),
                                                     &output_tensor));
    auto output = output_tensor->template flat<int32>();
    // Set all but the first element of the output tensor to 0.
    const int N = input.size();
    for (int i = 1; i < N; i++) {
      output(i) = 0;
    }
    // Preserve the first input value if possible.
    if (N > 0) output(0) = input(0);
  }
};
```
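After implementing your kernel, you register it with the TensorFlow system. In
the registration, you specify the constraints under which this kernel will run.
For example, you might have one kernel made for CPUs, and a separate one for GPUs.
To do this for the `ZeroOut` Op, add the following to `zero_out.cc`: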
```
REGISTER_KERNEL_BUILDER(Name("ZeroOut").Device(DEVICE_CPU), ZeroOutOp);
```
Once you [build and reinstall TensorFlow](tensorflow-zh/SOURCE/get_started/os_setup.md#create-pip),
the TensorFlow system can reference and use the Op when requested.
## Generate the client wrapper
### The Python Op wrapper
Python Op wrappers are created automatically in
`bazel-genfiles/tensorflow/python/ops/gen_user_ops.py` for all Ops placed in the
[`tensorflow/core/user_ops`][user_ops] directory when you build TensorFlow. Those Ops are imported into [`tensorflow/python/user_ops/user_ops.py`][python-user_ops]
with the statement:
```python
from tensorflow.python.ops.gen_user_ops import *
```
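You may optionally replace a generated function with your own. To do this, first hide the generated code for that Op
by adding its name to the `hidden` list of the `"user_ops"` rule in [`tensorflow/python/BUILD`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/python/BUILD):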
```python
tf_gen_op_wrapper_py(
name = "user_ops",
hidden = [
"Fact",
],
require_shape_functions = False,
)
```
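List your Op next to `"Fact"`. Next, add your replacement function to
[`tensorflow/python/user_ops/user_ops.py`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/python/user_ops/user_ops.py).
Typically your function will call the generated function to actually add the Op
to the graph. The hidden version of the generated function will be in the
`gen_user_ops` package and start with an underscore ("`_`"). For example: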
```python
def my_fact():
  """Example of overriding the generated code for an Op."""
  return gen_user_ops._fact()
```
### The C++ Op wrapper
C++ Op wrappers are created automatically for all Ops placed in the
[`tensorflow/core/user_ops`][user_ops] directory when you build TensorFlow. For example, Ops in `tensorflow/core/user_ops/zero_out.cc` will generate wrappers in `bazel-genfiles/tensorflow/cc/ops/user_ops.{h,cc}`.
All generated wrappers for user Ops are automatically imported into
[`tensorflow/cc/ops/standard_ops.h`][standard_ops-cc] with the statement:
```
#include "tensorflow/cc/ops/user_ops.h"
```
## Verify it works
A good way to verify that you've successfully implemented your Op is to write a
test for it. Create the file
`tensorflow/python/kernel_tests/zero_out_op_test.py` with the contents:
```python
import tensorflow as tf
class ZeroOutTest(tf.test.TestCase):
  def testZeroOut(self):
    with self.test_session():
      result = tf.user_ops.zero_out([5, 4, 3, 2, 1])
      self.assertAllEqual(result.eval(), [5, 0, 0, 0, 0])
```
Then run your test:
```sh
$ bazel test tensorflow/python:zero_out_op_test
```
## Validation
The example above assumed that the Op applied to a tensor of any shape. What if
it only applied to vectors?
That means adding a check to the above OpKernel implementation.
```
void Compute(OpKernelContext* context) override {
  // Grab the input tensor
  const Tensor& input_tensor = context->input(0);
  OP_REQUIRES(context, TensorShapeUtils::IsVector(input_tensor.shape()),
              errors::InvalidArgument("ZeroOut expects a 1-D vector."));
  // ...
}
```
This asserts that the input is a vector, and returns having set the `InvalidArgument` status if it isn't.
The [`OP_REQUIRES` macro][validation-macros] takes three arguments:
* The `context`, which can either be an `OpKernelContext` or `OpKernelConstruction` pointer
(see [`tensorflow/core/framework/op_kernel.h`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/framework/op_kernel.h)),
for its `SetStatus()` method.
* The condition. For example, there are functions for validating the shape of a tensor in [`tensorflow/core/public/tensor_shape.h`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/public/tensor_shape.h).
* The error raised when the condition fails, which is represented by a `Status` object, see
[`tensorflow/core/public/status.h`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/public/status.h).
A `Status` has both a type (frequently `InvalidArgument`, but other types are possible) and a message.
Functions for constructing an error may be found in [`tensorflow/core/lib/core/errors.h`][validation-macros].
Alternatively, if you want to test whether a `Status` object returned from some function is an error, use [`OP_REQUIRES_OK`][validation-macros].
Both of these macros return from the function on error.
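## Op registration
### Attrs
Ops can have attrs, whose values are set when the Op is added to a graph. These are used to configure the Op, and their values can be accessed both within the kernel implementation and in the types of inputs and outputs in the Op registration. Prefer using an input instead of an attr when possible, since inputs are more flexible: they can change every step,
be set using a feed, and so on. Attrs are used for things that can't be done with inputs: any configuration that affects the signature (number or type of inputs or outputs)
or that can't change from step to step.
You define an attr when you register the Op, by specifying its name and type using the `Attr` method, which expects a spec of the form:
```
<name>: <attr-type-expr>
```
where `<name>` begins with a letter and can be composed of alphanumeric characters and underscores, and `<attr-type-expr>` is a type expression
of the form [described below](#attr-types).
For example, if you'd like the `ZeroOut` Op to preserve a user-specified index, instead of only the 0th element, you can register the Op like so: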
```
REGISTER_OP("ZeroOut")
.Attr("preserve_index: int")
.Input("to_zero: int32")
.Output("zeroed: int32");
```
Your kernel can then access this attr in its constructor via the `context` parameter:
```
class ZeroOutOp : public OpKernel {
 public:
  explicit ZeroOutOp(OpKernelConstruction* context) : OpKernel(context) {
    // Get the index of the value to preserve
    OP_REQUIRES_OK(context,
                   context->GetAttr("preserve_index", &preserve_index_));
    // Check that preserve_index is non-negative
    OP_REQUIRES(context, preserve_index_ >= 0,
                errors::InvalidArgument("Need preserve_index >= 0, got ",
                                        preserve_index_));
  }
  void Compute(OpKernelContext* context) override {
    // ...
  }
 private:
  int preserve_index_;
};
```
which can then be used in the `Compute` method:
```
void Compute(OpKernelContext* context) override {
  // ...
  // Check that preserve_index is in range
  OP_REQUIRES(context, preserve_index_ < input.dimension(0),
              errors::InvalidArgument("preserve_index out of range"));
  // Set all the elements of the output tensor to 0
  const int N = input.size();
  for (int i = 0; i < N; i++) {
    output_flat(i) = 0;
  }
  // Preserve the requested input value
  output_flat(preserve_index_) = input(preserve_index_);
}
```
> To preserve [backwards compatibility](#backwards-compatibility), you should specify a
> [default value](#default-values-constraints) when adding an attr to an existing Op:
```
REGISTER_OP("ZeroOut")
.Attr("preserve_index: int = 0")
.Input("to_zero: int32")
.Output("zeroed: int32");
```
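The behavior of the attr-based `ZeroOut` can be summarized with a plain-Python reference. This is only a sketch of the semantics described above, operating on a flat list instead of a tensor; `zero_out_reference` is a hypothetical name, and it raises `ValueError` where the kernel would report `InvalidArgument`.

```python
def zero_out_reference(values, preserve_index=0):
    """Reference semantics for ZeroOut with a preserve_index attr.

    Hypothetical pure-Python stand-in: `values` is a flat list and the
    default of 0 mirrors the backwards-compatible attr default.
    """
    # Checked once, like the validation in the kernel constructor.
    if preserve_index < 0:
        raise ValueError("Need preserve_index >= 0, got %d" % preserve_index)
    # Checked per call, like the range check in Compute.
    if preserve_index >= len(values):
        raise ValueError("preserve_index out of range")
    output = [0] * len(values)
    output[preserve_index] = values[preserve_index]
    return output
```

With the default of `0`, callers that never pass `preserve_index` get the original "keep the first element" behavior.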
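### Attr types
The following types are supported in an attr:
* `string`: Any sequence of bytes (not required to be UTF8).
* `int`: A signed integer.
* `float`: A floating point number.
* `bool`: True or false.
* `type`: One of the (non-ref) values of [`DataType`][DataTypeString].
* `shape`: A [`TensorShapeProto`][TensorShapeProto].
* `tensor`: A [`TensorProto`][TensorProto].
* `list(<type>)`: A list of `<type>`, where `<type>` is one of the above types.
Note that `list(list(<type>))` is invalid.
See also [`op_def_builder.cc:FinalizeAttr`][FinalizeAttr] for a definitive list.
#### Default values & constraints
Attrs may have default values, and some types of attrs can have constraints. To define an attr with constraints, you can use the following
`<attr-type-expr>`s:
* `{'<string1>', '<string2>'}`: The value must be a string that has either the value `<string1>` or `<string2>`.
The name of the type, `string`, is implied when you use this syntax. This emulates an enum: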
```
REGISTER_OP("EnumExample")
.Attr("e: {'apple', 'orange'}");
```
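* `{<type1>, <type2>}`: The value is of type `type`, and must be one of `<type1>` or `<type2>`, where
`<type1>` and `<type2>` are supported [tensor types](tensorflow-zh/SOURCE/resources/dims_types.md#data-types).
You don't specify that the type of the attr is `type`. This is implied when you have a list of types in `{...}`. For example, in this case
the attr `t` is a type that must be an `int32`, a `float`, or a `bool`: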
```
REGISTER_OP("RestrictedTypeExample")
.Attr("t: {int32, float, bool}");
```
* There are shortcuts for common type constraints:
* `numbertype`: Type `type` restricted to the numeric (non-string and non-bool) types.
* `realnumbertype`: Like `numbertype` without complex types.
* `quantizedtype`: Like `numbertype` but just the quantized number types.
The specific lists of types allowed by these are defined by the functions (like `NumberTypes()`) in
[`tensorflow/core/framework/types.h`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/framework/types.h).
In this example the attr `t` must be one of the numeric types:
```
REGISTER_OP("NumberType")
.Attr("t: numbertype");
```
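For this Op:
```python
tf.number_type(t=tf.int32)  # Valid
tf.number_type(t=tf.bool)   # Invalid
```
* `int >= <n>`: The value must be an int whose value is greater than or equal to `<n>`, where `<n>` is a natural number.
For example, the following Op registration specifies that the attr `a` must have a value that is at least `2`: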
```
REGISTER_OP("MinIntExample")
.Attr("a: int >= 2");
```
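* `list(<type>) >= <n>`: A list of type `<type>` whose length is greater than or equal to `<n>`.
For example, the following Op registration specifies that the attr `a` is a list of types (either `int32` or `float`), and that there must be at least 3 of them: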
```
REGISTER_OP("TypeListExample")
.Attr("a: list({int32, float}) >= 3");
```
To set a default value for an attr (making it optional in the generated code), add `= <default>` to the end,
as in:
```
REGISTER_OP("AttrDefaultExample")
.Attr("i: int = 0");
```
默认值支持的语法将在最终 GraphDef 定义的 protobuf 表示中被使用.
下面是给所有类型赋予默认值的例子:
```
REGISTER_OP("AttrDefaultExampleForAllTypes")
.Attr("s: string = 'foo'")
.Attr("i: int = 0")
.Attr("f: float = 1.0")
.Attr("b: bool = true")
.Attr("ty: type = DT_INT32")
.Attr("sh: shape = { dim { size: 1 } dim { size: 2 } }")
.Attr("te: tensor = { dtype: DT_INT32 int_val: 5 }")
.Attr("l_empty: list(int) = []")
.Attr("l_int: list(int) = [2, 3, 5, 7]");
```
请特别注意那些类型值里面包含的 [`DT_*` 名称](tensorflow-zh/SOURCE/resources/dims_types.md#data-types).
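The constraint forms above can be modelled in a few lines of plain Python. This is only an illustrative sketch: `check_attr` and its parameters are hypothetical names, not part of TensorFlow, and the real validation is done by the C++ framework.

```python
def check_attr(value, allowed=None, minimum=None):
    """Validate one attr value against an optional allowed set and minimum.

    `allowed` models `{'apple', 'orange'}` or `{int32, float}`.
    `minimum` models `int >= n` for ints and `list(...) >= n` for lists.
    """
    if isinstance(value, list):
        # For lists, the minimum applies to the length and `allowed` to each item.
        if minimum is not None and len(value) < minimum:
            return False
        return all(allowed is None or item in allowed for item in value)
    if allowed is not None and value not in allowed:
        return False
    if minimum is not None and value < minimum:
        return False
    return True


# "e: {'apple', 'orange'}"
enum_ok = check_attr("apple", allowed={"apple", "orange"})
# "a: int >= 2"
min_int_ok = check_attr(2, minimum=2)
# "a: list({int32, float}) >= 3", with dtypes written as plain strings
type_list_ok = check_attr(["int32", "float", "int32"],
                          allowed={"int32", "float"}, minimum=3)
```

Representing dtypes as strings keeps the sketch free of any TensorFlow dependency; all three results above are `True`.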
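### Polymorphism
#### Type Polymorphism
For Ops that can take different types as input or produce different output types, you can specify [an attr](#attrs) in the input or output type of the Op registration.
Typically you would then register an `OpKernel` for each supported type.
For instance, if you'd like the `ZeroOut` Op to work on `float`s in addition to `int32`s, the Op registration might look like: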
```
REGISTER_OP("ZeroOut")
.Attr("T: {float, int32}")
.Input("to_zero: T")
.Output("zeroed: T");
```
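This Op registration now specifies that the input's type must be `float` or `int32`, and that
the output will have the same type, since both are declared with type `T`.
> A note on naming:{#naming} Inputs, outputs, and attrs should generally be given snake_case names.
> The one exception is attrs that are used as the type of an input or as part of the type of an input.
> Those attrs can be inferred when the Op is added to the graph, so they don't appear in the Op's function.
> For example, this last definition of ZeroOut generates a Python function that looks like:
```python
def zero_out(to_zero, name=None):
  """...
  Args:
    to_zero: A `Tensor`. Must be one of the following types:
        `float32`, `int32`.
    name: A name for the operation (optional).
  Returns:
    A `Tensor`. Has the same type as `to_zero`.
  """
```
> If `to_zero` is passed an `int32` tensor, then `T` is automatically
> set to `int32` (actually `DT_INT32`). Those inferred attrs are given Capitalized
> or CamelCase names.
>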
> Compare this with an Op that has a type attr that determines the output type:
```
REGISTER_OP("StringToNumber")
    .Input("string_tensor: string")
    .Output("output: out_type")
    .Attr("out_type: {float, int32}")
    .Doc(R"doc(
Converts each string in the input Tensor to the specified numeric type.
)doc");
```
> In this case, the user has to specify the output type, as in the generated Python:
```python
def string_to_number(string_tensor, out_type=None, name=None):
  """Converts each string in the input Tensor to the specified numeric type.
  Args:
    string_tensor: A `Tensor` of type `string`.
    out_type: An optional `tf.DType` from: `tf.float32, tf.int32`.
      Defaults to `tf.float32`.
    name: A name for the operation (optional).
  Returns:
    A `Tensor` of type `out_type`.
  """
```
```
#include "tensorflow/core/framework/op_kernel.h"

class ZeroOutInt32Op : public OpKernel {
  // as before
};

class ZeroOutFloatOp : public OpKernel {
 public:
  explicit ZeroOutFloatOp(OpKernelConstruction* context)
      : OpKernel(context) {}

  void Compute(OpKernelContext* context) override {
    // Grab the input tensor
    const Tensor& input_tensor = context->input(0);
    auto input = input_tensor.flat<float>();

    // Create an output tensor
    Tensor* output = NULL;
    OP_REQUIRES_OK(context,
                   context->allocate_output(0, input_tensor.shape(), &output));
    auto output_flat = output->template flat<float>();

    // Set all the elements of the output tensor to 0
    const int N = input.size();
    for (int i = 0; i < N; i++) {
      output_flat(i) = 0;
    }

    // Preserve the first input value
    if (N > 0) output_flat(0) = input(0);
  }
};

// Note that TypeConstraint<int32>("T") means that attr "T" (defined
// in the Op registration above) must be "int32" to use this kernel.
REGISTER_KERNEL_BUILDER(
    Name("ZeroOut")
    .Device(DEVICE_CPU)
    .TypeConstraint<int32>("T"),
    ZeroOutInt32Op);
REGISTER_KERNEL_BUILDER(
    Name("ZeroOut")
    .Device(DEVICE_CPU)
    .TypeConstraint<float>("T"),
    ZeroOutFloatOp);
```
> To preserve [backwards compatibility](#backwards-compatibility), you must specify a
> [default value](#default-values-constraints) when adding an attr to an existing Op:
```
REGISTER_OP("ZeroOut")
    .Attr("T: {float, int32} = DT_INT32")
    .Input("to_zero: T")
    .Output("zeroed: T");
```
Let's say you wanted to add more types, for example `double`:
```
REGISTER_OP("ZeroOut")
    .Attr("T: {float, double, int32}")
    .Input("to_zero: T")
    .Output("zeroed: T");
```
Instead of writing another `OpKernel` with redundant code for each new type, you can often use a C++ template.
You still need one kernel registration (a `REGISTER_KERNEL_BUILDER` call) per overload.
```
template <typename T>
class ZeroOutOp : public OpKernel {
 public:
  explicit ZeroOutOp(OpKernelConstruction* context) : OpKernel(context) {}

  void Compute(OpKernelContext* context) override {
    // Grab the input tensor
    const Tensor& input_tensor = context->input(0);
    auto input = input_tensor.flat<T>();

    // Create an output tensor
    Tensor* output = NULL;
    OP_REQUIRES_OK(context,
                   context->allocate_output(0, input_tensor.shape(), &output));
    auto output_flat = output->template flat<T>();

    // Set all the elements of the output tensor to 0
    const int N = input.size();
    for (int i = 0; i < N; i++) {
      output_flat(i) = 0;
    }

    // Preserve the first input value
    if (N > 0) output_flat(0) = input(0);
  }
};

// Note that TypeConstraint<int32>("T") means that attr "T" (defined
// in the Op registration above) must be "int32" to use this
// template instantiation.
REGISTER_KERNEL_BUILDER(
    Name("ZeroOut")
    .Device(DEVICE_CPU)
    .TypeConstraint<int32>("T"),
    ZeroOutOp<int32>);
REGISTER_KERNEL_BUILDER(
    Name("ZeroOut")
    .Device(DEVICE_CPU)
    .TypeConstraint<float>("T"),
    ZeroOutOp<float>);
REGISTER_KERNEL_BUILDER(
    Name("ZeroOut")
    .Device(DEVICE_CPU)
    .TypeConstraint<double>("T"),
    ZeroOutOp<double>);
```
If you have more than a couple of overloads, you can put the registration in a macro.
```
#include "tensorflow/core/framework/op_kernel.h"

#define REGISTER_KERNEL(type)                                       \
  REGISTER_KERNEL_BUILDER(                                          \
      Name("ZeroOut").Device(DEVICE_CPU).TypeConstraint<type>("T"), \
      ZeroOutOp<type>)

REGISTER_KERNEL(int32);
REGISTER_KERNEL(float);
REGISTER_KERNEL(double);

#undef REGISTER_KERNEL
```
Depending on the list of types you are registering the kernel for, you may be able to use a macro provided by
[`tensorflow/core/framework/register_types.h`][register_types]:
```
#include "tensorflow/core/framework/op_kernel.h"
#include "tensorflow/core/framework/register_types.h"

REGISTER_OP("ZeroOut")
    .Attr("T: realnumbertype")
    .Input("to_zero: T")
    .Output("zeroed: T");

template <typename T>
class ZeroOutOp : public OpKernel { ... };

#define REGISTER_KERNEL(type)                                       \
  REGISTER_KERNEL_BUILDER(                                          \
      Name("ZeroOut").Device(DEVICE_CPU).TypeConstraint<type>("T"), \
      ZeroOutOp<type>)

TF_CALL_REAL_NUMBER_TYPES(REGISTER_KERNEL);

#undef REGISTER_KERNEL
```
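To make the "one template, one registration per type" idea concrete, here is a small Python analogue. It is a sketch under stated assumptions, not TensorFlow code: `KERNELS`, `register_kernel`, `make_zero_out`, and `run` are hypothetical names that stand in for the kernel registry, `REGISTER_KERNEL_BUILDER`, the `ZeroOutOp<T>` template, and kernel lookup.

```python
# Hypothetical registry: maps (op name, device, value of attr "T") to a kernel.
KERNELS = {}


def register_kernel(op_name, device, type_constraint):
    """Decorator that plays the role of REGISTER_KERNEL_BUILDER in this sketch."""
    def decorator(fn):
        KERNELS[(op_name, device, type_constraint)] = fn
        return fn
    return decorator


def make_zero_out(cast):
    """Plays the role of the ZeroOutOp<T> template: one body, many types."""
    def zero_out(values):
        out = [cast(0)] * len(values)
        if values:
            out[0] = cast(values[0])  # preserve the first input value
        return out
    return zero_out


# Plays the role of the REGISTER_KERNEL(type) macro: one registration per type.
for type_name, cast in (("int32", int), ("float", float), ("double", float)):
    register_kernel("ZeroOut", "CPU", type_name)(make_zero_out(cast))


def run(op_name, device, dtype, values):
    """Look up the kernel whose type constraint matches `dtype` and run it."""
    key = (op_name, device, dtype)
    if key not in KERNELS:
        raise LookupError("no kernel registered for %r" % (key,))
    return KERNELS[key](values)
```

Calling `run("ZeroOut", "CPU", "bool", [True])` raises `LookupError` here, mirroring the fact that an Op can only run for a type that has a registered kernel.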
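#### List inputs and outputs
In addition to accepting or producing different types, Ops can consume or produce a variable number of tensors.
In the next example, the attr `T` holds a *list* of types and is used as the type of both the input `in` and the output `out`.
The input and output are lists of tensors of those types. Because both have type `T`, the number and types
of tensors in the output are the same as in the input.
```
REGISTER_OP("PolymorphicListExample")
    .Attr("T: list(type)")
    .Input("in: T")
    .Output("out: T");
```
You can also restrict which types may appear in the list. In the next case, the input is a list of
`float` and `double` tensors. The Op accepts, for example,
input types `(float, double, float)`, and in that case the output types would also
be `(float, double, float)`.
```
REGISTER_OP("ListTypeRestrictionExample")
    .Attr("T: list({float, double})")
    .Input("in: T")
    .Output("out: T");
```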
If you want all the tensors in a list to be of the same type, you might do something like:
```
REGISTER_OP("IntListInputExample")
    .Attr("N: int")
    .Input("in: N * int32")
    .Output("out: int32");
```
This accepts a list of `int32` tensors and uses an `int` attr `N`
to specify the length of the list.
This can be combined with [type polymorphism](#type-polymorphism) as well. In the next example,
the input is a list of tensors (with length `"N"`) of the same (but unspecified) type `"T"`, and the output is a single tensor of that type:
```
REGISTER_OP("SameListInputExample")
    .Attr("N: int")
    .Attr("T: type")
    .Input("in: N * T")
    .Output("out: T");
```
By default, tensor lists have a minimum length of 1. You can change that default using
[a `">="` constraint on the corresponding attr](#default-values-constraints):
```
REGISTER_OP("MinLengthIntListExample")
    .Attr("N: int >= 2")
    .Input("in: N * int32")
    .Output("out: int32");
```
The same syntax works with `"list(type)"` attrs:
```
REGISTER_OP("MinimumLengthPolymorphicListExample")
    .Attr("T: list(type) >= 3")
    .Input("in: T")
    .Output("out: T");
```
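The two list forms, `N * <type>` and `list(type)`, differ in what gets inferred from the inputs: a length in the first case, a list of types in the second. The following pure-Python sketch illustrates that difference; `check_list_input` is a hypothetical helper and dtypes are written as plain strings.

```python
def check_list_input(dtypes, element_type=None, min_length=1):
    """Return the attrs inferred from a list input, or raise ValueError.

    With `element_type` set, this models "in: N * <type>" and infers N.
    Without it, this models "in: T" with "T: list(type)" and infers T.
    `min_length` models the ">= n" constraint (1 by default).
    """
    if len(dtypes) < min_length:
        raise ValueError("expected at least %d tensors, got %d"
                         % (min_length, len(dtypes)))
    if element_type is None:
        return {"T": list(dtypes)}
    for dtype in dtypes:
        if dtype != element_type:
            raise ValueError("expected %s, got %s" % (element_type, dtype))
    return {"N": len(dtypes)}


# "in: N * int32" with "N: int >= 2"
n_attrs = check_list_input(["int32", "int32"], element_type="int32", min_length=2)
# "in: T" with "T: list(type) >= 3"
t_attrs = check_list_input(["float", "double", "float"], min_length=3)
```

Here `n_attrs` is `{"N": 2}` and `t_attrs` is `{"T": ["float", "double", "float"]}`.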
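### Inputs and outputs
To summarize the above, an Op registration can have multiple inputs and outputs:
```
REGISTER_OP("MultipleInsAndOuts")
    .Input("y: int32")
    .Input("z: float")
    .Output("a: string")
    .Output("b: int32");
```
Each input or output spec is of the form:
```
<name>: <io-type-expr>
```
where `<name>` begins with a letter and can be composed of alphanumeric characters and underscores. `<io-type-expr>` is one of
the following type expressions:
* `<type>`, where `<type>` is a supported input type (for example `float`, `int32`, `string`). This specifies a single tensor of the given type.
  See [the list of supported Tensor types](tensorflow-zh/SOURCE/resources/dims_types.md#data-types).
  ```
  REGISTER_OP("BuiltInTypesExample")
      .Input("integers: int32")
      .Input("complex_numbers: complex64");
  ```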
* `<attr-type>`, where `<attr-type>` is the name of an [attr](#attrs) with type `type` or `list(type)` (with a
  possible type restriction). This syntax allows for [polymorphic Ops](#Polymorphism).
  ```
  REGISTER_OP("PolymorphicSingleInput")
      .Attr("T: type")
      .Input("in: T");
  REGISTER_OP("RestrictedPolymorphicSingleInput")
      .Attr("T: {int32, int64}")
      .Input("in: T");
  ```
  Referencing an attr of type `list(type)` allows you to accept a sequence of tensors.
  ```
  REGISTER_OP("ArbitraryTensorSequenceExample")
      .Attr("T: list(type)")
      .Input("in: T")
      .Output("out: T");
  REGISTER_OP("RestrictedTensorSequenceExample")
      .Attr("T: list({int32, int64})")
      .Input("in: T")
      .Output("out: T");
  ```
  Note that the number and types of tensors in the output `out` are the same as in the input `in`, since both are of type `T`.
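* `<number> * <type>`, a sequence of tensors of the same type, where `<number>` is the name of an attr with type `int`.
  `<type>` can be either [a specific type such as `int32` or `float`](tensorflow-zh/SOURCE/resources/dims_types.md#data-types),
  or the name of an attr with type `type`. As an example of the first, this Op accepts a list of `int32` tensors:
  ```
  REGISTER_OP("Int32SequenceExample")
      .Attr("NumTensors: int")
      .Input("in: NumTensors * int32");
  ```
  whereas this Op accepts a list of tensors of any type, as long as they are all the same:
  ```
  REGISTER_OP("SameTypeSequenceExample")
      .Attr("NumTensors: int")
      .Attr("T: type")
      .Input("in: NumTensors * T");
  ```
* For a reference to a tensor: `Ref(<type>)`, where `<type>` is one of the previous types.
> A note on naming: Any attr used in the type of an input can be inferred. By convention, those inferred
> attrs use capitalized names (like `T` or `N`). Other inputs, outputs, and attrs are named like
> function parameters. See the [earlier note on naming](#naming) for more details.
For more details, see [`tensorflow/core/framework/op_def_builder.h`][op_def_builder].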
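### Backwards compatibility
In general, changes to specifications must be backwards compatible: after an Op's specification changes, serialized GraphDefs constructed from the older specification must still work.
There are several ways to preserve backwards compatibility:
1. Any new attrs added to an Op must have default values defined, and with that default value the Op must keep its original behavior. To change an operation from non-polymorphic to polymorphic,
   you *must* give a default value to the new type attr to preserve the original signature. For example, if your operation was:
   ```
   REGISTER_OP("MyGeneralUnaryOp")
       .Input("in: float")
       .Output("out: float");
   ```
   you can make it polymorphic in a backwards-compatible way using:
   ```
   REGISTER_OP("MyGeneralUnaryOp")
       .Input("in: T")
       .Output("out: T")
       .Attr("T: numbertype = DT_FLOAT");
   ```
2. You can safely make a constraint on an attr less restrictive. For example, you can change `{int32, int64}` to `{int32, int64, float}`,
   or change `{"apple", "orange"}` to `{"apple", "banana", "orange"}`.
3. Namespace any new Ops you create by prefixing the Op names with something unique to your project. This
   keeps your Op from colliding with Ops that may be included in future versions of TensorFlow.
4. Plan ahead! Try to anticipate future uses of the Op. Some signature changes cannot be made in a compatible way
   (for example, adding a new input, or turning an existing single input into a list).
If you cannot make your change to an operation backwards compatible, create a new operation with a new name that has the new semantics.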
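The role of the default value in rule 1 can be shown with a small model of loading an old graph. This is a sketch only: `NEW_OP_DEF`, `OLD_NODE`, and `add_default_attrs` are hypothetical, simplified stand-ins (plain dicts) for an Op registration, a serialized node, and the step that fills in missing attrs.

```python
import copy

# Hypothetical, simplified stand-ins for an Op registration and a serialized node.
NEW_OP_DEF = {"name": "MyGeneralUnaryOp", "attr_defaults": {"T": "DT_FLOAT"}}
OLD_NODE = {"op": "MyGeneralUnaryOp", "attr": {}}  # written before "T" existed


def add_default_attrs(node, op_def):
    """Return a copy of `node` with every missing attr set to its default."""
    filled = copy.deepcopy(node)
    for attr_name, default in op_def["attr_defaults"].items():
        filled["attr"].setdefault(attr_name, default)
    return filled


upgraded = add_default_attrs(OLD_NODE, NEW_OP_DEF)
```

Because the default is `DT_FLOAT`, the upgraded node behaves exactly like the original `float`-only Op. Without a default, the old node would have no value for `T` and could not be interpreted.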
## GPU Support
You can implement different OpKernels and register one for CPU and another for GPU, just as you can [register kernels for different types](#Polymorphism).
There are several examples of kernels with GPU support in [`tensorflow/core/kernels/`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/kernels/).
Notice that some kernels have a CPU version in a `.cc` file, a GPU version in a file ending in `_gpu.cu.cc`, and shared code in a `.h` file.
For example, the [`pad` op](tensorflow-zh/SOURCE/api_docs/python/array_ops.md#pad) has everything but the GPU kernel
in [`tensorflow/core/kernels/pad_op.cc`][pad_op]. The GPU kernel is in [`tensorflow/core/kernels/pad_op_gpu.cu.cc`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/kernels/pad_op_gpu.cu.cc),
and the shared code is a templated class defined in [`tensorflow/core/kernels/pad_op.h`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/kernels/pad_op.h).
One thing to note: even when the GPU kernel version of `pad` is used, it still needs its `"paddings"` input in CPU (host) memory.
To mark that inputs or outputs are kept on the CPU, add a `HostMemory()` call to the kernel registration,
for example:
```
#define REGISTER_GPU_KERNEL(T)                         \
  REGISTER_KERNEL_BUILDER(Name("Pad")                  \
                              .Device(DEVICE_GPU)      \
                              .TypeConstraint<T>("T")  \
                              .HostMemory("paddings"), \
                          PadOp<GPUDevice, T>)
```
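## Implementing the gradient in Python
Given a graph of Ops, TensorFlow uses automatic differentiation (backpropagation) to add new Ops representing gradients with respect to
the existing Ops (see [Gradient Computation](tensorflow-zh/SOURCE/api_docs/python/train.md#gradient-computation)).
To make automatic differentiation work for new Ops, you must register a gradient function that computes gradients with respect to the Op's
inputs, given gradients with respect to the Op's outputs.
Mathematically, if an Op computes \\(y = f(x)\\), the registered gradient Op converts gradients \\(\partial / \partial y\\)
with respect to \\(y\\) into gradients \\(\partial / \partial x\\) with respect to \\(x\\) via the chain rule:
$$\frac{\partial}{\partial x}
= \frac{\partial}{\partial y} \frac{\partial y}{\partial x}
= \frac{\partial}{\partial y} \frac{\partial f}{\partial x}.$$
In the case of `ZeroOut`, only one entry in the input affects the output, so the gradient with respect to the input is a sparse "one hot"
tensor with a single nonzero entry. This is expressed as follows: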
```python
from tensorflow.python.framework import ops
from tensorflow.python.ops import array_ops
from tensorflow.python.ops import sparse_ops

@ops.RegisterGradient("ZeroOut")
def _zero_out_grad(op, grad):
  """The gradients for `zero_out`.

  Args:
    op: The `zero_out` `Operation` that we are differentiating, which we can use
      to find the inputs and outputs of the original Op.
    grad: Gradient with respect to the output of the `zero_out` Op.

  Returns:
    Gradients with respect to the input of `zero_out`.
  """
  to_zero = op.inputs[0]
  shape = array_ops.shape(to_zero)
  index = array_ops.zeros_like(shape)
  first_grad = array_ops.reshape(grad, [-1])[0]
  to_zero_grad = sparse_ops.sparse_to_dense(index, shape, first_grad, 0)
  return [to_zero_grad]  # List of one Tensor, since we have one input
```
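Details about registering gradient functions with [`ops.RegisterGradient`](tensorflow-zh/SOURCE/api_docs/python/framework.md#RegisterGradient):
* For an Op with one output, the gradient function takes an [`Operation`](tensorflow-zh/SOURCE/api_docs/python/framework.md#Operation) `op`
  and a [`Tensor`](tensorflow-zh/SOURCE/api_docs/python/framework.md#Tensor) `grad`, and builds new Ops out of the tensors
  [`op.inputs[i]`](tensorflow-zh/SOURCE/api_docs/python/framework.md#Operation.inputs),
  [`op.outputs[i]`](tensorflow-zh/SOURCE/api_docs/python/framework.md#Operation.outputs),
  and `grad`. Information about any attrs can be found via [`op.get_attr`](tensorflow-zh/SOURCE/api_docs/python/framework.md#Operation.get_attr).
* If the Op has multiple outputs, the gradient function takes `op` and `grads`, where `grads` is a list of
  gradients with respect to each output. The result of the gradient function must be a list of `Tensor` objects
  representing the gradients with respect to each input.
* If there is no well-defined gradient for some input, such as an integer input used as an index, the corresponding returned gradient should be `None`.
  For example, for an Op taking a floating point tensor `x` and an integer index `i`, the gradient function would return
  `[x_grad, None]`.
* If there is no meaningful gradient for the Op at all, use `ops.NoGradient("OpName")` to disable automatic differentiation.
Note that when the gradient function is called, only the data flow graph of Ops is available, not the tensor data itself. All computation
must therefore be expressed with other TensorFlow Ops, which run at graph execution time.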
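A gradient function is easy to get subtly wrong, so it can help to compare it against a finite-difference estimate. The sketch below does this for `ZeroOut` on plain Python lists. It is not TensorFlow code: `zero_out`, `zero_out_grad`, and `numeric_grad` are hypothetical stand-ins for the Op, its registered gradient, and a numerical checker.

```python
def zero_out(x):
    """Forward pass: keep the first element, zero the rest."""
    out = [0.0] * len(x)
    if x:
        out[0] = x[0]
    return out


def zero_out_grad(x, grad):
    """Backward pass: only x[0] affects the output, so only it receives gradient."""
    dx = [0.0] * len(x)
    if x:
        dx[0] = grad[0]
    return dx


def numeric_grad(f, x, grad, eps=1e-6):
    """Central-difference estimate of d(sum(grad * f(x))) / dx."""
    def loss(v):
        return sum(g * y for g, y in zip(grad, f(v)))
    result = []
    for i in range(len(x)):
        up = list(x)
        up[i] += eps
        down = list(x)
        down[i] -= eps
        result.append((loss(up) - loss(down)) / (2 * eps))
    return result


x = [3.0, 1.0, 2.0]
upstream = [0.5, 0.25, 0.125]
analytic = zero_out_grad(x, upstream)
numeric = numeric_grad(zero_out, x, upstream)
```

The analytic result has a single nonzero entry, `upstream[0]`, in position 0, which is the "one hot" structure described above; the numeric estimate agrees up to floating point error.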
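## Implementing a shape function in Python
The TensorFlow Python API has a feature called "shape inference" that provides information about the shapes of tensors without executing the graph.
Shape inference is supported by "shape functions" registered for each Op type, which have two roles: asserting that the shapes of
the inputs are compatible, and specifying the shapes of the outputs. A shape function takes an [`Operation`](tensorflow-zh/SOURCE/api_docs/python/framework.md#Operation)
as input and returns a list of [`TensorShape`](tensorflow-zh/SOURCE/api_docs/python/framework.md#TensorShape)
objects (one per output). To register a shape function, apply the [`tf.RegisterShape` decorator](tensorflow-zh/SOURCE/api_docs/python/framework.md#RegisterShape)
to it. For example, the [`ZeroOut` Op defined above](#define_interface) would have a shape function like the following:
```python
@tf.RegisterShape("ZeroOut")
def _zero_out_shape(op):
  """Shape function for the ZeroOut Op.

  This is the unconstrained version of ZeroOut, which produces an output
  with the same shape as its input.
  """
  return [op.inputs[0].get_shape()]
```
A shape function can also constrain the shape of an input. For the version of [`ZeroOut` with a vector shape constraint](#Validation), the shape function would be:
```python
@tf.RegisterShape("ZeroOut")
def _zero_out_shape(op):
  """Shape function for the ZeroOut Op.

  This is the constrained version of ZeroOut, which requires the input to
  have rank 1 (a vector).
  """
  input_shape = op.inputs[0].get_shape().with_rank(1)
  return [input_shape]
```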
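If your Op is [polymorphic with multiple inputs](#Polymorphism), use the properties of the operation to determine the number of shapes to check:
```
@tf.RegisterShape("IntListInputExample")
def _int_list_input_example_shape(op):
  """Shape function for the "IntListInputExample" Op.

  All inputs and the output are matrices of the same size.
  """
  output_shape = tf.TensorShape(None)
  for input in op.inputs:
    output_shape = output_shape.merge_with(input.get_shape().with_rank(2))
  return [output_shape]
```
Since shape inference is an optional feature, and the shapes of tensors may vary dynamically, shape functions must be robust to incomplete
shape information for any of the inputs. The [`merge_with`](tensorflow-zh/SOURCE/api_docs/python/framework.md) method lets
the caller assert that two shapes are the same, even if either or both of them are only partially known. Shape functions are defined for all of the [standard Python Ops](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/python/ops/),
and they provide many different usage examples.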
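To show what merging partially known shapes means, here is a pure-Python sketch of the idea. It is an illustration of the semantics described above, not the `TensorShape.merge_with` implementation: `merge_shapes` is a hypothetical function, and `None` stands for "unknown" (either an unknown rank or an unknown dimension).

```python
def merge_shapes(a, b):
    """Merge two partially known shapes, raising ValueError if they conflict.

    A shape is either None (unknown rank) or a list whose entries are ints
    or None (unknown dimension).
    """
    if a is None:
        return b
    if b is None:
        return a
    if len(a) != len(b):
        raise ValueError("ranks differ: %r vs %r" % (a, b))
    merged = []
    for dim_a, dim_b in zip(a, b):
        if dim_a is not None and dim_b is not None and dim_a != dim_b:
            raise ValueError("dimensions differ: %r vs %r" % (dim_a, dim_b))
        merged.append(dim_b if dim_a is None else dim_a)
    return merged


# Mirrors the loop in the shape function above: start from "unknown" and
# fold in every input shape.
output_shape = None
for input_shape in ([None, 3], [2, None], [2, 3]):
    output_shape = merge_shapes(output_shape, input_shape)
```

Each merge can only add information, so after the loop `output_shape` is `[2, 3]`, even though no single partially known input carried both dimensions until the last one.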
[core-array_ops]:https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/ops/array_ops.cc
[python-user_ops]:https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/python/user_ops/user_ops.py
[tf-kernels]:https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/kernels/
[user_ops]:https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/user_ops/
[pad_op]:https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/kernels/pad_op.cc
[standard_ops-py]:https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/python/ops/standard_ops.py
[standard_ops-cc]:https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/cc/ops/standard_ops.h
[python-BUILD]:https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/python/BUILD
[validation-macros]:https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/lib/core/errors.h
[op_def_builder]:https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/framework/op_def_builder.h
[register_types]:https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/framework/register_types.h
[FinalizeAttr]:https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/framework/op_def_builder.cc#FinalizeAttr
[DataTypeString]:https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/framework/types.cc#DataTypeString
[types-proto]:https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/framework/types.proto
[TensorShapeProto]:https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/framework/tensor_shape.proto
[TensorProto]:https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/framework/tensor.proto
> Original: [Adding a New Op](http://www.tensorflow.org/how_tos/adding_an_op/index.html#adding-a-new-op) Translation: [@doc001](https://github.com/PFZheng) Proofreading: [@ZHNathanielLee](https://github.com/ZHNathanielLee)
================================================
FILE: SOURCE/how_tos/graph_viz/index.md
================================================
# TensorBoard: Graph Visualization
TensorFlow computation graphs are powerful but complicated. The graph visualization can help you understand and debug them. Here's an example of the visualization at work.

*Visualization of a TensorFlow graph.*
To see your own graph, run TensorBoard pointing it to the log directory of the job, click on the graph tab on the top pane and select the appropriate run using the menu at the upper left corner. For in depth information on how to run TensorBoard and make sure you are logging all the necessary information, see [Summaries and TensorBoard](../../how_tos/summaries_and_tensorboard/index.md).
## Name scoping and nodes
Typical TensorFlow graphs can have many thousands of nodes--far too many to see
easily all at once, or even to lay out using standard graph tools. To simplify,
variable names can be scoped and the visualization uses this information to
define a hierarchy on the nodes in the graph. By default, only the top of this
hierarchy is shown. Here is an example that defines three operations under the
`hidden` name scope using
[`tf.name_scope`](../../api_docs/python/framework.md#name_scope):
```python
import tensorflow as tf

with tf.name_scope('hidden') as scope:
  a = tf.constant(5, name='alpha')
  W = tf.Variable(tf.random_uniform([1, 2], -1.0, 1.0), name='weights')
  b = tf.Variable(tf.zeros([1]), name='biases')
```
This results in the following three op names:
* *hidden*/alpha
* *hidden*/weights
* *hidden*/biases
By default, the visualization will collapse all three into a node labeled `hidden`.
The extra detail isn't lost. You can double-click, or click
on the orange `+` sign in the top right to expand the node, and then you'll see
three subnodes for `alpha`, `weights` and `biases`.
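The hierarchy that the visualization derives from op names can be sketched in plain Python. This is an illustration of the idea, not TensorBoard's code; `build_hierarchy` is a hypothetical helper that splits names on `/`.

```python
def build_hierarchy(op_names):
    """Nest op names such as "hidden/alpha" into a tree of dicts."""
    root = {}
    for name in op_names:
        node = root
        for part in name.split("/"):
            node = node.setdefault(part, {})
    return root


tree = build_hierarchy(["hidden/alpha", "hidden/weights", "hidden/biases"])
top_level = sorted(tree)           # what the collapsed view shows
expanded = sorted(tree["hidden"])  # what expanding the `hidden` node reveals
```

Only the top level of the tree (`hidden`) is shown at first; expanding it exposes the three children.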
Here's a real-life example of a more complicated node in its initial and
expanded states.
Initial view of top-level name scope pool_1. Clicking on the orange + button on the top right or double-clicking on the node itself will expand it.
Expanded view of pool_1 name scope. Clicking on the orange - button on the top right or double-clicking on the node itself will collapse the name scope.
Grouping nodes by name scopes is critical to making a legible graph. If you're
building a model, name scopes give you control over the resulting visualization.
**The better your name scopes, the better your visualization.**
The figure above illustrates a second aspect of the visualization. TensorFlow
graphs have two kinds of connections: data dependencies and control
dependencies. Data dependencies show the flow of tensors between two ops and
are shown as solid arrows, while control dependencies use dotted lines. In the
expanded view (right side of the figure above) all the connections are data
dependencies with the exception of the dotted line connecting `CheckNumerics`
and `control_dependency`.
There's a second trick to simplifying the layout. Most TensorFlow graphs have a
few nodes with many connections to other nodes. For example, many nodes might
have a control dependency on an initialization step. Drawing all edges
between the `init` node and its dependencies would create a very cluttered
view.
To reduce clutter, the visualization separates out all high-degree nodes to an
*auxiliary* area on the right and doesn't draw lines to represent their edges.
Instead of lines, we draw small *node icons* to indicate the connections.
Separating out the auxiliary nodes typically doesn't remove critical
information since these nodes are usually related to bookkeeping functions.
Node conv_1 is connected to save. Note the little save node icon on its right.
save has a high degree, and will appear as an auxiliary node. The connection with conv_1 is shown as a node icon on its left. To further reduce clutter, since save has a lot of connections, we show the first 5 and abbreviate the others as ... 12 more.
One last structural simplification is *series collapsing*. Sequential
motifs--that is, nodes whose names differ by a number at the end and have
isomorphic structures--are collapsed into a single *stack* of nodes, as shown
below. For networks with long sequences, this greatly simplifies the view. As
with hierarchical nodes, double-clicking expands the series.
A collapsed view of a node sequence.
A small piece of the expanded view, after double-click.
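The name-based half of series collapsing can be sketched as follows. This is a simplified illustration: `collapse_series` is a hypothetical helper that only groups names differing by a trailing number, and it does not check that the grouped nodes have isomorphic structures, which the visualization also requires.

```python
import re
from collections import defaultdict


def collapse_series(names, min_size=2):
    """Group names that differ only by a trailing number.

    Returns (collapsed, singles): a dict from shared prefix to member names,
    and a list of names that stay as individual nodes.
    """
    series = defaultdict(list)
    singles = []
    for name in names:
        match = re.match(r"^(.*?)_?(\d+)$", name)
        if match:
            series[match.group(1)].append(name)
        else:
            singles.append(name)
    collapsed = {}
    for prefix, members in series.items():
        if len(members) >= min_size:
            collapsed[prefix] = members
        else:
            singles.extend(members)  # a lone numbered node is not a series
    return collapsed, singles


collapsed, singles = collapse_series(
    ["layer_1", "layer_2", "layer_3", "init", "pool_1"])
```

Here the three `layer_*` nodes form one stack, while `init` and the lone `pool_1` remain separate nodes.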
Finally, as one last aid to legibility, the visualization uses special icons
for constants and summary nodes. To summarize, here's a table of node symbols:
Symbol | Meaning
--- | ---
 | *High-level* node representing a name scope. Double-click to expand a high-level node.
 | Sequence of numbered nodes that are not connected to each other.
 | Sequence of numbered nodes that are connected to each other.
 | An individual operation node.
 | A constant.
 | A summary node.
 | Edge showing the data flow between operations.
 | Edge showing the control dependency between operations.
 | A reference edge showing that the outgoing operation node can mutate the incoming tensor.
## Interaction
Navigate the graph by panning and zooming. Click and drag to pan, and use a
scroll gesture to zoom. Double-click on a node, or click on its `+` button, to
expand a name scope that represents a group of operations. To easily keep
track of the current viewpoint when zooming and panning, there is a minimap in
the bottom right corner.
To close an open node, double-click it again or click its `-` button. You can
also click once to select a node. It will turn a darker color, and details
about it and the nodes it connects to will appear in the info card at the upper
right corner of the visualization.
Info card showing detailed information for the conv2 name scope. The inputs and outputs are combined from the inputs and outputs of the operation nodes inside the name scope. For name scopes no attributes are shown.
Info card showing detailed information for the DecodeRaw operation node. In addition to inputs and outputs, the card shows the device and the attributes associated with the current operation.
Selection can also be helpful in understanding high-degree nodes. Select any
high-degree node, and the corresponding node icons for its other connections
will be selected as well. This makes it easy, for example, to see which nodes
are being saved--and which aren't.
Clicking on a node name in the info card will select it. If necessary, the
viewpoint will automatically pan so that the node is visible.
Finally, you can choose two color schemes for your graph, using the color menu
above the legend. The default *Structure View* shows structure: when two
high-level nodes have the same structure, they appear in the same color of the
rainbow. Uniquely structured nodes are gray. There's a second view, which shows
what device the different operations run on. Name scopes are colored
proportionally to the fraction of devices for the operations inside them.
The images below give an illustration for a piece of a real-life graph.
Structure view: The gray nodes have unique structure. The orange conv1 and conv2 nodes have the same structure, and analogously for nodes with other colors.
Device view: Name scopes are colored proportionally to the fraction of devices of the operation nodes inside them. Here, purple means GPU and the green is CPU.
Original: [TensorBoard: Graph Visualization](../images/index.html#tensorboard-graph-visualization)
Translation: [@Warln](https://github.com/Warln) Proofreading: [lucky521](https://github.com/lucky521)
================================================
FILE: SOURCE/how_tos/new_data_formats/Custom Data Readers.md
================================================
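# Custom Data Readers
Prerequisites:
* Some familiarity with C++.
* You must have
[downloaded the TensorFlow source](../../get_started/os_setup.md#source), and be able to build it.
We divide the task of supporting a file format into two pieces:
* File formats: We use a *Reader* Op to read a *record* (which can be any string) from a file.
* Record formats: We use decoder or parsing Ops to turn a string record into tensors usable by TensorFlow.
For example, to read a
[CSV file](https://en.wikipedia.org/wiki/Comma-separated_values), we use
[a Reader for text files](../../api_docs/python/io_ops.md#TextLineReader)
followed by [an Op that parses CSV data from a line of text](../../api_docs/python/io_ops.md#decode_csv).
## Contents
### [Custom Data Readers](#AUTOGENERATED-custom-data-readers)
* [Writing a Reader for a file format](#AUTOGENERATED-writing-a-reader-for-a-file-format)
* [Writing an Op for a record format](#AUTOGENERATED-writing-an-op-for-a-record-format)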
## Writing a Reader for a file format
A `Reader` is something that reads records from a file. There are some examples of Reader Ops already built into TensorFlow:
* [tf.TFRecordReader](../../api_docs/python/io_ops.md#TFRecordReader)
([source in kernels/tf_record_reader_op.cc](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/kernels/tf_record_reader_op.cc))
* [tf.FixedLengthRecordReader](../../api_docs/python/io_ops.md#FixedLengthRecordReader)
([source in kernels/fixed_length_record_reader_op.cc](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/kernels/fixed_length_record_reader_op.cc))
* [tf.TextLineReader](../../api_docs/python/io_ops.md#TextLineReader)
([source in kernels/text_line_reader_op.cc](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/kernels/text_line_reader_op.cc))
These all expose the same interface; the only differences are in their constructors. The most important method is `read`.
It takes a queue argument, from which it gets the next filename whenever it needs one (for example, when the `read` Op first runs, or
when the previous `read` consumed the last record of a file). It produces two scalar tensors: a string key and a string value.
To create a new reader called `SomeReader`, you will need to:
1. In C++, define a subclass of
[tensorflow::ReaderBase](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/kernels/reader_base.h) called
`SomeReader`.
2. In C++, register a new reader Op and kernel with the name `"SomeReader"`.
3. In Python, define a subclass of [tf.ReaderBase](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/python/ops/io_ops.py) called `SomeReader`.
You can put all the C++ code in a file in
`tensorflow/core/user_ops/some_reader_op.cc`. The code to read a file will live in a descendant of the C++ `ReaderBase` class, which is defined in [tensorflow/core/kernels/reader_base.h](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/kernels/reader_base.h).
You will need to implement the following methods:
* `OnWorkStartedLocked`: open the next file
* `ReadLocked`: read a record or report EOF/error
* `OnWorkFinishedLocked`: close the current file
* `ResetLocked`: get a clean slate after, for example, an error
These method names end in "Locked" because `ReaderBase` acquires a mutex before calling any of them, so you generally don't have to worry about thread safety (though that only protects the members of the class, not global state).
For `OnWorkStartedLocked`, the name of the file to open is the value returned by the `current_work()` method. `ReadLocked` has this signature:
```c++
Status ReadLocked(string* key, string* value, bool* produced, bool* at_end)
```
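If `ReadLocked` successfully reads a record from the file, it should fill in:
* `*key`: an identifier for the record that could be used to find this record again. It can include the filename from `current_work()`, with a record number or other information appended.
* `*value`: the contents of the record.
* `*produced`: set to `true`.
If you hit the end of the file (EOF), set `*at_end` to `true`. In either case, return `Status::OK()`. If there is an error, simply return it using one of the helper functions from
[tensorflow/core/lib/core/errors.h](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/lib/core/errors.h), without modifying any arguments.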
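The `ReadLocked` contract is easier to see in a toy model. The sketch below is written in Python rather than C++ and is not a TensorFlow class: `LineReaderSketch` and `read_locked` are hypothetical names, the "file" is an in-memory string, and the four output parameters are returned as a tuple.

```python
class LineReaderSketch(object):
    """Toy reader that yields one line per record from an in-memory "file"."""

    def __init__(self, filename, contents):
        self._filename = filename
        self._lines = contents.splitlines()
        self._line_number = 0

    def read_locked(self):
        """Return (key, value, produced, at_end), mirroring ReadLocked's outputs."""
        if self._line_number >= len(self._lines):
            # End of file: nothing is produced, only at_end is set.
            return None, None, False, True
        value = self._lines[self._line_number]
        self._line_number += 1
        # The key combines the filename with a record number, as suggested above.
        key = "%s:%d" % (self._filename, self._line_number)
        return key, value, True, False


reader = LineReaderSketch("data.csv", "a,b\nc,d")
first = reader.read_locked()
second = reader.read_locked()
third = reader.read_locked()
```

The first two calls each produce a record with `produced` set, and the third reports `at_end` without producing anything.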
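Next you will create the actual Reader Op. It will help if you are familiar with [the adding an Op how-to](../../how_tos/adding_an_op/index.md). The main steps are:
* Register the Op.
* Define and register an `OpKernel`.
To register the Op, you will use a `REGISTER_OP` call defined in
[tensorflow/core/framework/op.h](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/framework/op.h).
Reader Ops never take any input and always have a single output of type `Ref(string)`. They should always call `SetIsStateful()`, and have string
`container` and `shared_name` attrs. You may optionally define additional attrs for configuration, or include documentation in a `Doc`. For examples, see
[tensorflow/core/ops/io_ops.cc](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/ops/io_ops.cc), for instance: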
```c++
#include "tensorflow/core/framework/op.h"
REGISTER_OP("TextLineReader")
.Output("reader_handle: Ref(string)")
.Attr("skip_header_lines: int = 0")
.Attr("container: string = ''")
.Attr("shared_name: string = ''")
.SetIsStateful()
.Doc(R"doc(
A Reader that outputs the lines of a file delimited by '\n'.
)doc");
```
To define an `OpKernel`, Readers can take the shortcut of descending from `ReaderOpKernel`, defined in [tensorflow/core/framework/reader_op_kernel.h](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/framework/reader_op_kernel.h), and implement a constructor that calls `SetReaderFactory`.
After defining your class, you will need to register it using `REGISTER_KERNEL_BUILDER(...)`.
An example with no attrs:
```c++
#include "tensorflow/core/framework/reader_op_kernel.h"
class TFRecordReaderOp : public ReaderOpKernel {
public:
explicit TFRecordReaderOp(OpKernelConstruction* context)
: ReaderOpKernel(context) {
Env* env = context->env();
SetReaderFactory([this, env]() { return new TFRecordReader(name(), env); });
}
};
REGISTER_KERNEL_BUILDER(Name("TFRecordReader").Device(DEVICE_CPU),
TFRecordReaderOp);
```
An example with attrs:
```c++
#include "tensorflow/core/framework/reader_op_kernel.h"
class TextLineReaderOp : public ReaderOpKernel {
public:
explicit TextLineReaderOp(OpKernelConstruction* context)
: ReaderOpKernel(context) {
int skip_header_lines = -1;
OP_REQUIRES_OK(context,
context->GetAttr("skip_header_lines", &skip_header_lines));
OP_REQUIRES(context, skip_header_lines >= 0,
errors::InvalidArgument("skip_header_lines must be >= 0 not ",
skip_header_lines));
Env* env = context->env();
SetReaderFactory([this, skip_header_lines, env]() {
return new TextLineReader(name(), skip_header_lines, env);
});
}
};
REGISTER_KERNEL_BUILDER(Name("TextLineReader").Device(DEVICE_CPU),
TextLineReaderOp);
```
The last step is to add the Python wrapper. You will import `tensorflow.python.ops.io_ops` in
[tensorflow/python/user_ops/user_ops.py](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/python/user_ops/user_ops.py) and add a descendant of [io_ops.ReaderBase](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/python/ops/io_ops.py).
```python
from tensorflow.python.framework import ops
from tensorflow.python.ops import common_shapes
# Generated wrappers for the Ops in tensorflow/core/user_ops; this import path
# is an assumption based on user_ops.py and may differ in your checkout.
from tensorflow.python.ops import gen_user_ops
from tensorflow.python.ops import io_ops

class SomeReader(io_ops.ReaderBase):

    def __init__(self, name=None):
        rr = gen_user_ops.some_reader(name=name)
        super(SomeReader, self).__init__(rr)


ops.NoGradient("SomeReader")
ops.RegisterShape("SomeReader")(common_shapes.scalar_shape)
```
You can see some examples in
[tensorflow/python/ops/io_ops.py](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/python/ops/io_ops.py).
## Writing an Op for a record format
Generally this is an ordinary Op that takes a scalar string record as input, so follow [the instructions to add an Op](../../how_tos/adding_an_op/index.md). You may optionally take a scalar string key as input and include it in error messages that report improperly formatted data.
Examples of Ops useful for decoding records:
* [tf.parse_single_example](../../api_docs/python/io_ops.md#parse_single_example)
(and
[tf.parse_example](../../api_docs/python/io_ops.md#parse_example))
* [tf.decode_csv](../../api_docs/python/io_ops.md#decode_csv)
* [tf.decode_raw](../../api_docs/python/io_ops.md#decode_raw)
Note that it can be useful to use multiple Ops to decode a particular record format. For example, you may have an image saved as a string in a
[tf.train.Example protocol buffer](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/example/example.proto).
Depending on the format of that image, you might take the corresponding output from a
[tf.parse_single_example](../../api_docs/python/io_ops.md#parse_single_example) Op and call [tf.decode_jpeg](../../api_docs/python/image.md#decode_jpeg),
[tf.decode_png](../../api_docs/python/image.md#decode_png), or
[tf.decode_raw](../../api_docs/python/io_ops.md#decode_raw). It is common to take the output of `tf.decode_raw` and use [tf.slice](../../api_docs/python/array_ops.md#slice) and
[tf.reshape](../../api_docs/python/array_ops.md#reshape) to extract pieces.
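The decode, slice, and reshape pattern can be illustrated without TensorFlow. The sketch below uses only the standard library; `decode_record` is a hypothetical function, and the record layout (one label byte followed by `height * width` pixel bytes) is an assumption made for the example.

```python
import struct


def decode_record(record, height, width):
    """Split a raw record into a label and a height x width grid of pixels."""
    # Like decode_raw with a uint8 output type: bytes become integers.
    values = list(struct.unpack("%dB" % len(record), record))
    # Like slice: the first byte is the label, the rest are pixels.
    label = values[0]
    pixels = values[1:1 + height * width]
    if len(pixels) != height * width:
        raise ValueError("record too short for a %dx%d image" % (height, width))
    # Like reshape: turn the flat pixel list into rows.
    image = [pixels[r * width:(r + 1) * width] for r in range(height)]
    return label, image


record = bytes(bytearray([7, 1, 2, 3, 4, 5, 6]))
label, image = decode_record(record, height=2, width=3)
```

With this layout, `label` is `7` and `image` is `[[1, 2, 3], [4, 5, 6]]`.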
> Original: [Custom Data Readers](http://tensorflow.org/how_tos/new_data_formats/index.html#custom-data-readers) Translation: [@derekshang](https://github.com/derekshang) Proofreading:
================================================
FILE: SOURCE/how_tos/new_data_formats/index.md
================================================
# Custom Data Readers
PREREQUISITES:
* Some familiarity with C++.
* Must have
[downloaded TensorFlow source](../../get_started/os_setup.md#source), and be
able to build it.
We divide the task of supporting a file format into two pieces:
* File formats: We use a *Reader* Op to read a *record* (which can be any
string) from a file.
* Record formats: We use decoder or parsing Ops to turn a string record
into tensors usable by TensorFlow.
For example, to read a
[CSV file](https://en.wikipedia.org/wiki/Comma-separated_values), we use
[a Reader for text files](../../api_docs/python/io_ops.md#TextLineReader)
followed by
[an Op that parses CSV data from a line of text](../../api_docs/python/io_ops.md#decode_csv).
## Contents
### [Custom Data Readers](#AUTOGENERATED-custom-data-readers)
* [Writing a Reader for a file format](#AUTOGENERATED-writing-a-reader-for-a-file-format)
* [Writing an Op for a record format](#AUTOGENERATED-writing-an-op-for-a-record-format)
## Writing a Reader for a file format
A `Reader` is something that reads records from a file. There are some examples
of Reader Ops already built into TensorFlow:
* [`tf.TFRecordReader`](../../api_docs/python/io_ops.md#TFRecordReader)
([source in `kernels/tf_record_reader_op.cc`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/kernels/tf_record_reader_op.cc))
* [`tf.FixedLengthRecordReader`](../../api_docs/python/io_ops.md#FixedLengthRecordReader)
([source in `kernels/fixed_length_record_reader_op.cc`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/kernels/fixed_length_record_reader_op.cc))
* [`tf.TextLineReader`](../../api_docs/python/io_ops.md#TextLineReader)
([source in `kernels/text_line_reader_op.cc`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/kernels/text_line_reader_op.cc))
You can see that these all expose the same interface; the only differences
are in their constructors. The most important method is `read`.
It takes a queue argument, which is where it gets filenames to
read from whenever it needs one (e.g. when the `read` op first runs, or
the previous `read` reads the last record from a file). It produces
two scalar tensors: a string key and a string value.
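To make the `read` contract concrete, here is a minimal pure-Python sketch. It is not TensorFlow code: `ToyLineReader`, its `read` method, and `demo_filename_queue` are hypothetical names used only for illustration. The sketch takes the next filename from a queue only when no file is open or the previous file is exhausted, and it returns a string key and a string value.

```python
import collections
import os
import tempfile


class ToyLineReader(object):
    """Hypothetical stand-in for a Reader: one record per line of text."""

    def __init__(self):
        self._file = None
        self._filename = None
        self._line_number = 0

    def read(self, filename_queue):
        """Returns (key, value); pulls a new filename only when needed."""
        while True:
            if self._file is None:
                # First read, or the previous file hit EOF.
                self._filename = filename_queue.popleft()
                self._file = open(self._filename)
                self._line_number = 0
            line = self._file.readline()
            if line:
                self._line_number += 1
                key = "%s:%d" % (self._filename, self._line_number)
                return key, line.rstrip("\n")
            # EOF: close the file and move on to the next queued filename.
            self._file.close()
            self._file = None


def demo_filename_queue():
    """Writes two small files and reads three records across them."""
    tmp_dir = tempfile.mkdtemp()
    paths = []
    for name, text in [("a.txt", "1,2\n3,4\n"), ("b.txt", "5,6\n")]:
        path = os.path.join(tmp_dir, name)
        with open(path, "w") as f:
            f.write(text)
        paths.append(path)
    reader = ToyLineReader()
    queue = collections.deque(paths)
    return [reader.read(queue) for _ in range(3)]
```

The third call crosses a file boundary: the reader notices EOF on the first file, takes the second filename from the queue, and returns its first line.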
To create a new reader called `SomeReader`, you will need to:
1. In C++, define a subclass of
[`tensorflow::ReaderBase`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/kernels/reader_base.h)
called `SomeReader`.
2. In C++, register a new reader op and kernel with the name `"SomeReader"`.
3. In Python, define a subclass of [`tf.ReaderBase`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/python/ops/io_ops.py) called `SomeReader`.
You can put all the C++ code in a file in
`tensorflow/core/user_ops/some_reader_op.cc`. The code to read a file will live
in a descendant of the C++ `ReaderBase` class, which is defined in
[`tensorflow/core/kernels/reader_base.h`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/kernels/reader_base.h).
You will need to implement the following methods:
* `OnWorkStartedLocked`: open the next file
* `ReadLocked`: read a record or report EOF/error
* `OnWorkFinishedLocked`: close the current file, and
* `ResetLocked`: get a clean slate after, e.g., an error
These methods have names ending in "Locked" since `ReaderBase` makes sure
to acquire a mutex before calling any of these methods, so you generally don't
have to worry about thread safety (though that only protects the members of the
class, not global state).
For `OnWorkStartedLocked`, the name of the file to open is the value returned by
the `current_work()` method. `ReadLocked` has this signature:
```c++
Status ReadLocked(string* key, string* value, bool* produced, bool* at_end)
```
If `ReadLocked` successfully reads a record from the file, it should fill in:
* `*key`: with an identifier for the record that a human could use to find
this record again. You can include the filename from `current_work()`
and append a record number or a similar locator.
* `*value`: with the contents of the record.
* `*produced`: set to `true`.
If you hit the end of a file (EOF), set `*at_end` to `true`. In either case,
return `Status::OK()`. If there is an error, simply return it using one of the
helper functions from
[`tensorflow/core/lib/core/errors.h`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/lib/core/errors.h)
without modifying any arguments.
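The contract above can be modelled in a few lines of plain Python. This is only a sketch under stated assumptions: `ToyReaderImpl` and `read_all` are hypothetical names, the out-parameters of `ReadLocked` are represented as a returned tuple, and the driver loop is a loose guess at how a base class might call these hooks, not the actual `ReaderBase` logic.

```python
class ToyReaderImpl(object):
    """Hypothetical Python analogue of a ReaderBase subclass (not TF API)."""

    def __init__(self, files):
        self._files = files      # dict: filename -> list of string records
        self._records = None
        self._index = 0

    def on_work_started(self, current_work):
        # Analogue of OnWorkStartedLocked: open the next file.
        self._records = self._files[current_work]
        self._index = 0

    def read(self, current_work):
        """Analogue of ReadLocked; returns (key, value, produced, at_end)."""
        if self._index >= len(self._records):
            return None, None, False, True   # EOF: only at_end is set.
        value = self._records[self._index]
        key = "%s:%d" % (current_work, self._index)
        self._index += 1
        return key, value, True, False       # Record: only produced is set.

    def on_work_finished(self):
        # Analogue of OnWorkFinishedLocked: close the current file.
        self._records = None


def read_all(impl, work_items):
    """Driver loop: start work, read until at_end, then finish work."""
    out = []
    for work in work_items:
        impl.on_work_started(work)
        while True:
            key, value, produced, at_end = impl.read(work)
            if produced:
                out.append((key, value))
            if at_end:
                break
        impl.on_work_finished()
    return out
```

An empty file simply reports `at_end` on the first read and contributes no records.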
Next you will create the actual Reader op. It will help if you are familiar
with [the adding an op how-to](../../how_tos/adding_an_op/index.md). The main steps
are:
* Register the op.
* Define and register an `OpKernel`.
To register the op, you will use a `REGISTER_OP` call defined in
[`tensorflow/core/framework/op.h`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/framework/op.h).
Reader ops never take any input and always have a single output with type
`Ref(string)`. They should always call `SetIsStateful()`, and have string
`container` and `shared_name` attrs. You may optionally define additional attrs
for configuration or include documentation in a `Doc`. For examples, see
[`tensorflow/core/ops/io_ops.cc`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/ops/io_ops.cc),
e.g.:
```c++
#include "tensorflow/core/framework/op.h"

REGISTER_OP("TextLineReader")
    .Output("reader_handle: Ref(string)")
    .Attr("skip_header_lines: int = 0")
    .Attr("container: string = ''")
    .Attr("shared_name: string = ''")
    .SetIsStateful()
    .Doc(R"doc(
A Reader that outputs the lines of a file delimited by '\n'.
)doc");
```
To define an `OpKernel`, Readers can use the shortcut of descending from
`ReaderOpKernel`, defined in
[`tensorflow/core/framework/reader_op_kernel.h`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/framework/reader_op_kernel.h),
and implement a constructor that calls `SetReaderFactory`. After defining
your class, you will need to register it using `REGISTER_KERNEL_BUILDER(...)`.
An example with no attrs:
```c++
#include "tensorflow/core/framework/reader_op_kernel.h"

class TFRecordReaderOp : public ReaderOpKernel {
 public:
  explicit TFRecordReaderOp(OpKernelConstruction* context)
      : ReaderOpKernel(context) {
    Env* env = context->env();
    SetReaderFactory([this, env]() { return new TFRecordReader(name(), env); });
  }
};

REGISTER_KERNEL_BUILDER(Name("TFRecordReader").Device(DEVICE_CPU),
                        TFRecordReaderOp);
```
An example with attrs:
```c++
#include "tensorflow/core/framework/reader_op_kernel.h"

class TextLineReaderOp : public ReaderOpKernel {
 public:
  explicit TextLineReaderOp(OpKernelConstruction* context)
      : ReaderOpKernel(context) {
    int skip_header_lines = -1;
    OP_REQUIRES_OK(context,
                   context->GetAttr("skip_header_lines", &skip_header_lines));
    OP_REQUIRES(context, skip_header_lines >= 0,
                errors::InvalidArgument("skip_header_lines must be >= 0 not ",
                                        skip_header_lines));
    Env* env = context->env();
    SetReaderFactory([this, skip_header_lines, env]() {
      return new TextLineReader(name(), skip_header_lines, env);
    });
  }
};

REGISTER_KERNEL_BUILDER(Name("TextLineReader").Device(DEVICE_CPU),
                        TextLineReaderOp);
```
The last step is to add the Python wrapper. You will import
`tensorflow.python.ops.io_ops` in
[`tensorflow/python/user_ops/user_ops.py`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/python/user_ops/user_ops.py)
and add a descendant of [`io_ops.ReaderBase`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/python/ops/io_ops.py).
```python
from tensorflow.python.framework import ops
from tensorflow.python.ops import common_shapes
from tensorflow.python.ops import io_ops


# `gen_user_ops` is assumed to be in scope already inside user_ops.py.
class SomeReader(io_ops.ReaderBase):

    def __init__(self, name=None):
        rr = gen_user_ops.some_reader(name=name)
        super(SomeReader, self).__init__(rr)


ops.NoGradient("SomeReader")
ops.RegisterShape("SomeReader")(common_shapes.scalar_shape)
```
You can see some examples in
[`tensorflow/python/ops/io_ops.py`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/python/ops/io_ops.py).
## Writing an Op for a record format
Generally this is an ordinary op that takes a scalar string record as input, so
follow [the instructions to add an Op](../../how_tos/adding_an_op/index.md). You may
optionally take a scalar string key as input, and include that in error messages
reporting improperly formatted data. That way users can more easily track down
where the bad data came from.
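The value of passing the key along is easiest to see in a small pure-Python sketch. This is not a TensorFlow Op: `decode_int_csv` is a hypothetical helper that parses one CSV line of integers and uses the optional `key` only to enrich its error message.

```python
import csv


def decode_int_csv(record, key=""):
    """Parses a CSV line of integers; `key` is only used in error messages."""
    fields = next(csv.reader([record]))
    try:
        return [int(field) for field in fields]
    except ValueError as e:
        # Report which record was bad so it can be traced back to its file.
        raise ValueError("Malformed record %r (key %r): %s" % (record, key, e))
```

With a key such as `data.csv:7` (filename plus record number), the error message points directly at the offending line.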
Examples of Ops useful for decoding records:
* [`tf.parse_single_example`](../../api_docs/python/io_ops.md#parse_single_example)
(and
[`tf.parse_example`](../../api_docs/python/io_ops.md#parse_example))
* [`tf.decode_csv`](../../api_docs/python/io_ops.md#decode_csv)
* [`tf.decode_raw`](../../api_docs/python/io_ops.md#decode_raw)
Note that it can be useful to use multiple Ops to decode a particular record
format. For example, you may have an image saved as a string in
[a `tf.train.Example` protocol buffer](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/example/example.proto).
Depending on the format of that image, you might take the corresponding output
from a
[`tf.parse_single_example`](../../api_docs/python/io_ops.md#parse_single_example)
op and call [`tf.decode_jpeg`](../../api_docs/python/image.md#decode_jpeg),
[`tf.decode_png`](../../api_docs/python/image.md#decode_png), or
[`tf.decode_raw`](../../api_docs/python/io_ops.md#decode_raw). It is common to
take the output of `tf.decode_raw` and use
[`tf.slice`](../../api_docs/python/array_ops.md#slice) and
[`tf.reshape`](../../api_docs/python/array_ops.md#reshape) to extract pieces.
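The decode-then-slice-then-reshape pattern can be illustrated without TensorFlow. The sketch below assumes a hypothetical fixed-length record layout (one label byte followed by `height * width` pixel bytes); `decode_labeled_image` is an illustrative name, and the list operations only mimic what `tf.decode_raw`, `tf.slice`, and `tf.reshape` would do on tensors.

```python
def decode_labeled_image(record, height, width):
    """Splits a fixed-length record into (label, rows of pixel values)."""
    values = list(bytearray(record))    # like decode_raw(..., uint8)
    expected = 1 + height * width
    if len(values) != expected:
        raise ValueError("Expected %d bytes, got %d" % (expected, len(values)))
    label = values[0]                   # like slicing off the first byte
    pixels = values[1:]                 # like slicing off the remainder
    # Like reshaping a flat vector to [height, width].
    image = [pixels[r * width:(r + 1) * width] for r in range(height)]
    return label, image
```

For a 2x3 image, a 7-byte record yields the label from byte 0 and two rows of three pixels each.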
================================================
FILE: SOURCE/how_tos/new_data_formats.md
================================================
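# Custom Data Readers
Prerequisites:
* Some familiarity with C++.
* You must have [downloaded the TensorFlow source](tensorflow-zh/SOURCE/get_started/os_setup.md#source) and be able to build it.
We divide the task of supporting a file format into two pieces:
* File formats: We use a *Reader* Op to read a *record* (which can be any string) from a file.
* Record formats: We use decoder or parsing Ops to turn a string record into tensors usable by TensorFlow.
For example, to read a
[CSV file](https://en.wikipedia.org/wiki/Comma-separated_values), we use
[a Reader for text files](tensorflow-zh/SOURCE/api_docs/python/io_ops.md#TextLineReader)
followed by [an Op that parses CSV data from a line of text](tensorflow-zh/SOURCE/api_docs/python/io_ops.md#decode_csv).
## Contents
### [Custom Data Readers](#AUTOGENERATED-custom-data-readers)
* [Writing a Reader for a file format](#AUTOGENERATED-writing-a-reader-for-a-file-format)
* [Writing an Op for a record format](#AUTOGENERATED-writing-an-op-for-a-record-format)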
## Writing a Reader for a file format
A Reader is something that reads records from a file. TensorFlow already has several Reader Ops built in:
* [tf.TFRecordReader](tensorflow-zh/SOURCE/api_docs/python/io_ops.md#TFRecordReader)
([source in kernels/tf_record_reader_op.cc](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/kernels/tf_record_reader_op.cc))
* [tf.FixedLengthRecordReader](tensorflow-zh/SOURCE/api_docs/python/io_ops.md#FixedLengthRecordReader)
([source in kernels/fixed_length_record_reader_op.cc](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/kernels/fixed_length_record_reader_op.cc))
* [tf.TextLineReader](tensorflow-zh/SOURCE/api_docs/python/io_ops.md#TextLineReader)
([source in kernels/text_line_reader_op.cc](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/kernels/text_line_reader_op.cc))
You can see that these Readers all expose the same interface; the only differences are in their constructors. The most important method is Read.
It takes a queue argument, from which it gets a filename to read whenever it needs one (e.g. when the Read Op first runs, or when the previous Read consumed the last record of a file). It produces two scalar tensors: a string key and a string value.
To create a new Reader called SomeReader, you will need to:
1. In C++, define a subclass of
[tensorflow::ReaderBase](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/kernels/reader_base.h) called
"SomeReader".
2. In C++, register a new Reader Op and Kernel with the name "SomeReader".
3. In Python, define a subclass of [tf.ReaderBase](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/python/ops/io_ops.py) called "SomeReader".
You can put all the C++ code in the file
`tensorflow/core/user_ops/some_reader_op.cc`. The code that reads a file lives in a descendant of the C++ ReaderBase class, which is defined in [tensorflow/core/kernels/reader_base.h](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/kernels/reader_base.h).
You will need to implement the following methods:
* OnWorkStartedLocked: open the next file
* ReadLocked: read a record or report EOF/error
* OnWorkFinishedLocked: close the current file
* ResetLocked: get a clean slate after, e.g., an error
These method names end in "Locked" because ReaderBase makes sure to acquire a mutex before calling any of them, so you generally don't have to worry about thread safety (though that only protects the members of the class, not global state).
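The "Locked" naming convention can be sketched in plain Python with `threading.Lock`. All names here (`ToyReaderBase`, `CountingReader`, `demo_locked_reads`) are hypothetical and only illustrate the pattern: the base class takes the lock, then calls a hook that may touch member state freely.

```python
import threading


class ToyReaderBase(object):
    """Hypothetical base class: takes the lock, then calls the *_locked hook."""

    def __init__(self):
        self._mu = threading.Lock()

    def read(self):
        with self._mu:                  # The lock is held for the whole hook.
            return self._read_locked()

    def _read_locked(self):
        raise NotImplementedError


class CountingReader(ToyReaderBase):
    """Subclass whose member state is protected by the base-class lock."""

    def __init__(self):
        super(CountingReader, self).__init__()
        self._count = 0

    def _read_locked(self):
        self._count += 1
        return self._count


def demo_locked_reads(num_threads=8, reads_per_thread=500):
    """Reads from several threads; every read sees a distinct counter value."""
    reader = CountingReader()
    results = []                        # list.append is atomic in CPython

    def worker():
        for _ in range(reads_per_thread):
            results.append(reader.read())

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(results)
```

Because the subclass only touches its own members inside `_read_locked`, it needs no locking of its own; any global state it used would still need separate protection.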
For OnWorkStartedLocked, the name of the file to open is the value returned by the `current_work()` method. ReadLocked has this signature:
```
Status ReadLocked(string* key, string* value, bool* produced, bool* at_end)
```
If ReadLocked successfully reads a record from the file, it should fill in:
* `*key`: an identifier for the record that a human could use to find this record again. You can include the filename from `current_work()` and append a record number or similar information.
* `*value`: the contents of the record.
* `*produced`: set to true.
If you hit the end of the file (EOF), set `*at_end` to true. In either case, return Status::OK(). If there is an error, simply return it using one of the helper functions from
[tensorflow/core/lib/core/errors.h](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/lib/core/errors.h), without modifying any arguments.
Next you will create the actual Reader Op. It will help if you are already familiar with [adding a new Op](tensorflow-zh/SOURCE/how_tos/adding_an_op/index.md). The main steps are:
* Register the Op.
* Define and register an OpKernel.
To register the Op, you will use a REGISTER_OP call, which is defined in
[tensorflow/core/framework/op.h](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/framework/op.h).
Reader Ops never take any input and always have a single output of type Ref(string). They should always call SetIsStateful() and have string
container and shared_name attrs. You may optionally define additional attrs for configuration, or include documentation in a Doc. For examples, see
[tensorflow/core/ops/io_ops.cc](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/ops/io_ops.cc), e.g.:
```
#include "tensorflow/core/framework/op.h"

REGISTER_OP("TextLineReader")
    .Output("reader_handle: Ref(string)")
    .Attr("skip_header_lines: int = 0")
    .Attr("container: string = ''")
    .Attr("shared_name: string = ''")
    .SetIsStateful()
    .Doc(R"doc(
A Reader that outputs the lines of a file delimited by '\n'.
)doc");
```
To define an OpKernel, Readers can use the shortcut of descending from ReaderOpKernel, defined in [tensorflow/core/framework/reader_op_kernel.h](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/framework/reader_op_kernel.h), and implement a constructor that calls SetReaderFactory.
After defining your class, you will need to register it using REGISTER_KERNEL_BUILDER(...).
An example with no attrs:
```
#include "tensorflow/core/framework/reader_op_kernel.h"

class TFRecordReaderOp : public ReaderOpKernel {
 public:
  explicit TFRecordReaderOp(OpKernelConstruction* context)
      : ReaderOpKernel(context) {
    Env* env = context->env();
    SetReaderFactory([this, env]() { return new TFRecordReader(name(), env); });
  }
};

REGISTER_KERNEL_BUILDER(Name("TFRecordReader").Device(DEVICE_CPU),
                        TFRecordReaderOp);
```
An example with attrs:
```
#include "tensorflow/core/framework/reader_op_kernel.h"

class TextLineReaderOp : public ReaderOpKernel {
 public:
  explicit TextLineReaderOp(OpKernelConstruction* context)
      : ReaderOpKernel(context) {
    int skip_header_lines = -1;
    OP_REQUIRES_OK(context,
                   context->GetAttr("skip_header_lines", &skip_header_lines));
    OP_REQUIRES(context, skip_header_lines >= 0,
                errors::InvalidArgument("skip_header_lines must be >= 0 not ",
                                        skip_header_lines));
    Env* env = context->env();
    SetReaderFactory([this, skip_header_lines, env]() {
      return new TextLineReader(name(), skip_header_lines, env);
    });
  }
};

REGISTER_KERNEL_BUILDER(Name("TextLineReader").Device(DEVICE_CPU),
                        TextLineReaderOp);
```
The last step is to add the Python wrapper. You will import tensorflow.python.ops.io_ops in
[tensorflow/python/user_ops/user_ops.py](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/python/user_ops/user_ops.py) and add a descendant of [io_ops.ReaderBase](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/python/ops/io_ops.py).
```
from tensorflow.python.framework import ops
from tensorflow.python.ops import common_shapes
from tensorflow.python.ops import io_ops


# `gen_user_ops` is assumed to be in scope already inside user_ops.py.
class SomeReader(io_ops.ReaderBase):

    def __init__(self, name=None):
        rr = gen_user_ops.some_reader(name=name)
        super(SomeReader, self).__init__(rr)


ops.NoGradient("SomeReader")
ops.RegisterShape("SomeReader")(common_shapes.scalar_shape)
```
You can see some examples in
[tensorflow/python/ops/io_ops.py](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/python/ops/io_ops.py).
## Writing an Op for a record format
Generally this is an ordinary Op that takes a scalar string record as input, so follow [the instructions for adding an Op](tensorflow-zh/SOURCE/how_tos/adding_an_op/index.md). You may optionally take a scalar string key as input and include it in error messages that report improperly formatted data.
Examples of Ops useful for decoding records:
* [tf.parse_single_example](tensorflow-zh/SOURCE/api_docs/python/io_ops.md#parse_single_example)
(and [tf.parse_example](tensorflow-zh/SOURCE/api_docs/python/io_ops.md#parse_example))
* [tf.decode_csv](tensorflow-zh/SOURCE/api_docs/python/io_ops.md#decode_csv)
* [tf.decode_raw](tensorflow-zh/SOURCE/api_docs/python/io_ops.md#decode_raw)
Note that it can be useful to combine several Ops to decode a particular record format. For example, you may have an image saved as a string in
[a tf.train.Example protocol buffer](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/example/example.proto).
Depending on the format of that image, you might take the corresponding output of a
[tf.parse_single_example](tensorflow-zh/SOURCE/api_docs/python/io_ops.md#parse_single_example) Op and call [tf.decode_jpeg](tensorflow-zh/SOURCE/api_docs/python/image.md#decode_jpeg),
[tf.decode_png](tensorflow-zh/SOURCE/api_docs/python/image.md#decode_png), or
[tf.decode_raw](tensorflow-zh/SOURCE/api_docs/python/io_ops.md#decode_raw). It is common to take the output of tf.decode_raw and use [tf.slice](tensorflow-zh/SOURCE/api_docs/python/array_ops.md#slice) and
[tf.reshape](tensorflow-zh/SOURCE/api_docs/python/array_ops.md#reshape) to extract pieces.
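Choosing a decoder "depending on the format of the image" can be sketched as follows. In practice the format of a data set is usually known ahead of time; this pure-Python sketch instead sniffs the leading magic bytes, purely for illustration. `pick_image_decoder` and the two constants are hypothetical names, and the returned strings merely name the Op one would call.

```python
JPEG_MAGIC = b"\xff\xd8\xff"            # JPEG start-of-image marker
PNG_MAGIC = b"\x89PNG\r\n\x1a\n"        # 8-byte PNG file signature


def pick_image_decoder(image_bytes):
    """Returns the name of the decode Op that matches the encoded bytes."""
    if image_bytes.startswith(JPEG_MAGIC):
        return "decode_jpeg"
    if image_bytes.startswith(PNG_MAGIC):
        return "decode_png"
    # No recognised header: treat the payload as raw pixel bytes.
    return "decode_raw"
```

The raw fallback is an assumption of this sketch; a production pipeline would more likely reject unknown formats explicitly.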
> Original: [Custom Data Readers](http://tensorflow.org/how_tos/new_data_formats/index.html#custom-data-readers) Translation: [@derekshang](https://github.com/derekshang) Proofreading: [Wiki](https://github.com/jikexueyuanwiki)
================================================
FILE: SOURCE/how_tos/overview.md
================================================
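# Overview
## Variables: Creation, Initialization, Saving, and Restoring
TensorFlow Variables are in-memory buffers that hold tensors. This section describes how to use them to create, save, and update model parameters during training.
[View Tutorial](../how_tos/variables.md)
## TensorFlow Mechanics 101
Using MNIST handwritten digit recognition as a small example, this tutorial walks step by step through the details of training large-scale models with the TensorFlow infrastructure.
[View Tutorial](../tutorials/mnist_tf.md)
## TensorBoard: Visualizing Learning
TensorBoard is a useful visualization tool when training and evaluating a model. This tutorial explains how to build and run TensorBoard, and how to add Summary ops that automatically write data to the event files TensorBoard uses.
[View Tutorial](../how_tos/summaries_and_tensorboard.md)
## TensorBoard: Graph Visualization
This tutorial describes how to use the graph visualizer in TensorBoard to help you understand the dataflow graph and debug it.
[View Tutorial](../how_tos/graph_viz.md)
## Reading Data
This tutorial describes the three main methods of getting data into a TensorFlow program: Feeding, Reading, and Preloading.
[View Tutorial](../how_tos/reading_data.md)
## Threading and Queues
This tutorial describes the various constructs TensorFlow provides to make asynchronous and concurrent training easier.
[View Tutorial](../how_tos/threading_and_queues.md)
## Adding a New Op
TensorFlow already provides a large suite of node operations that you can freely use in your graph; this tutorial covers the details of adding a custom op.
[View Tutorial](../how_tos/adding_an_op.md)
## Custom Data Readers
If you have a sizable custom data set, you may want to extend TensorFlow's Data Readers so that they can read your data directly in its native format.
[View Tutorial](../how_tos/new_data_formats.md)
## Using GPUs
This tutorial describes how to construct and run models on multiple GPUs.
[View Tutorial](../how_tos/using_gpu.md)
## Sharing Variables
When deploying large models on multiple GPUs, or when unrolling complex LSTMs or RNNs, it is often necessary to access the same Variable objects from different locations in the model construction code. The Variable Scope mechanism is designed to make this easier.
[View Tutorial](../how_tos/variable_scope/index.md)
Original: [How-to](http://tensorflow.org/how_tos/index.html)
Translation: [Terence Cooper](https://github.com/TerenceCooper)
Proofreading: [lonlonago](https://github.com/lonlonago)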
================================================
FILE: SOURCE/how_tos/reading_data/__init__.py
================================================
================================================
FILE: SOURCE/how_tos/reading_data/convert_to_records.py
================================================
"""Converts MNIST data to TFRecords file format with Example protos."""
from __future__ import print_function
import os
import tensorflow.python.platform
import numpy
import tensorflow as tf
from tensorflow.g3doc.tutorials.mnist import input_data
TRAIN_IMAGES = 'train-images-idx3-ubyte.gz' # MNIST filenames
TRAIN_LABELS = 'train-labels-idx1-ubyte.gz'
TEST_IMAGES = 't10k-images-idx3-ubyte.gz'
TEST_LABELS = 't10k-labels-idx1-ubyte.gz'
tf.app.flags.DEFINE_string('directory', 'data',
'Directory to download data files and write the '
'converted result')
tf.app.flags.DEFINE_integer('validation_size', 5000,
'Number of examples to separate from the training '
'data for the validation set.')
FLAGS = tf.app.flags.FLAGS
def _int64_feature(value):
  return tf.train.Feature(int64_list=tf.train.Int64List(value=[value]))


def _bytes_feature(value):
  return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))


def convert_to(images, labels, name):
  num_examples = labels.shape[0]
  if images.shape[0] != num_examples:
    raise ValueError("Images size %d does not match label size %d." %
                     (images.shape[0], num_examples))
  rows = images.shape[1]
  cols = images.shape[2]
  depth = images.shape[3]

  filename = os.path.join(FLAGS.directory, name + '.tfrecords')
  print('Writing', filename)
  writer = tf.python_io.TFRecordWriter(filename)
  for index in range(num_examples):
    image_raw = images[index].tostring()
    example = tf.train.Example(features=tf.train.Features(feature={
        'height': _int64_feature(rows),
        'width': _int64_feature(cols),
        'depth': _int64_feature(depth),
        'label': _int64_feature(int(labels[index])),
        'image_raw': _bytes_feature(image_raw)}))
    writer.write(example.SerializeToString())
  # Close the writer so that buffered records are flushed to disk.
  writer.close()


def main(argv):
  # Get the data.
  train_images_filename = input_data.maybe_download(
      TRAIN_IMAGES, FLAGS.directory)
  train_labels_filename = input_data.maybe_download(
      TRAIN_LABELS, FLAGS.directory)
  test_images_filename = input_data.maybe_download(
      TEST_IMAGES, FLAGS.directory)
  test_labels_filename = input_data.maybe_download(
      TEST_LABELS, FLAGS.directory)

  # Extract it into numpy arrays.
  train_images = input_data.extract_images(train_images_filename)
  train_labels = input_data.extract_labels(train_labels_filename)
  test_images = input_data.extract_images(test_images_filename)
  test_labels = input_data.extract_labels(test_labels_filename)

  # Generate a validation set.
  validation_images = train_images[:FLAGS.validation_size, :, :, :]
  validation_labels = train_labels[:FLAGS.validation_size]
  train_images = train_images[FLAGS.validation_size:, :, :, :]
  train_labels = train_labels[FLAGS.validation_size:]

  # Convert to Examples and write the result to TFRecords.
  convert_to(train_images, train_labels, 'train')
  convert_to(validation_images, validation_labels, 'validation')
  convert_to(test_images, test_labels, 'test')


if __name__ == '__main__':
  tf.app.run()
================================================
FILE: SOURCE/how_tos/reading_data/fully_connected_preloaded.py
================================================
"""Trains the MNIST network using preloaded data in a constant.
Command to run this py_binary target:
bazel run -c opt \
<...>/tensorflow/g3doc/how_tos/reading_data:fully_connected_preloaded
"""
from __future__ import print_function
import os.path
import time
import tensorflow.python.platform
import numpy
import tensorflow as tf
from tensorflow.g3doc.tutorials.mnist import input_data
from tensorflow.g3doc.tutorials.mnist import mnist
# Basic model parameters as external flags.
flags = tf.app.flags
FLAGS = flags.FLAGS
flags.DEFINE_float('learning_rate', 0.01, 'Initial learning rate.')
flags.DEFINE_integer('num_epochs', 2, 'Number of epochs to run trainer.')
flags.DEFINE_integer('hidden1', 128, 'Number of units in hidden layer 1.')
flags.DEFINE_integer('hidden2', 32, 'Number of units in hidden layer 2.')
flags.DEFINE_integer('batch_size', 100, 'Batch size. '
'Must divide evenly into the dataset sizes.')
flags.DEFINE_string('train_dir', 'data', 'Directory to put the training data.')
flags.DEFINE_boolean('fake_data', False, 'If true, uses fake data '
'for unit testing.')
def run_training():
  """Train MNIST for a number of epochs."""
  # Get the sets of images and labels for training, validation, and
  # test on MNIST.
  data_sets = input_data.read_data_sets(FLAGS.train_dir, FLAGS.fake_data)

  # Tell TensorFlow that the model will be built into the default Graph.
  with tf.Graph().as_default():
    with tf.name_scope('input'):
      # Input data
      input_images = tf.constant(data_sets.train.images)
      input_labels = tf.constant(data_sets.train.labels)

      image, label = tf.train.slice_input_producer(
          [input_images, input_labels], num_epochs=FLAGS.num_epochs)
      label = tf.cast(label, tf.int32)
      images, labels = tf.train.batch(
          [image, label], batch_size=FLAGS.batch_size)

    # Build a Graph that computes predictions from the inference model.
    logits = mnist.inference(images, FLAGS.hidden1, FLAGS.hidden2)

    # Add to the Graph the Ops for loss calculation.
    loss = mnist.loss(logits, labels)

    # Add to the Graph the Ops that calculate and apply gradients.
    train_op = mnist.training(loss, FLAGS.learning_rate)

    # Add the Op to compare the logits to the labels during evaluation.
    eval_correct = mnist.evaluation(logits, labels)

    # Build the summary operation based on the TF collection of Summaries.
    summary_op = tf.merge_all_summaries()

    # Create a saver for writing training checkpoints.
    saver = tf.train.Saver()

    # Create the op for initializing variables.
    init_op = tf.initialize_all_variables()

    # Create a session for running Ops on the Graph.
    sess = tf.Session()

    # Run the Op to initialize the variables.
    sess.run(init_op)

    # Instantiate a SummaryWriter to output summaries and the Graph.
    summary_writer = tf.train.SummaryWriter(FLAGS.train_dir,
                                            graph_def=sess.graph_def)

    # Start input enqueue threads.
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(sess=sess, coord=coord)

    # And then after everything is built, start the training loop.
    try:
      step = 0
      while not coord.should_stop():
        start_time = time.time()

        # Run one step of the model.
        _, loss_value = sess.run([train_op, loss])

        duration = time.time() - start_time

        # Write the summaries and print an overview fairly often.
        if step % 100 == 0:
          # Print status to stdout.
          print('Step %d: loss = %.2f (%.3f sec)' % (step, loss_value,
                                                     duration))
          # Update the events file.
          summary_str = sess.run(summary_op)
          summary_writer.add_summary(summary_str, step)

        # Save a checkpoint periodically.
        if (step + 1) % 1000 == 0:
          print('Saving')
          saver.save(sess, FLAGS.train_dir, global_step=step)

        # Increment the step counter exactly once per iteration.
        step += 1
    except tf.errors.OutOfRangeError:
      print('Saving')
      saver.save(sess, FLAGS.train_dir, global_step=step)
      print('Done training for %d epochs, %d steps.' % (FLAGS.num_epochs, step))
    finally:
      # When done, ask the threads to stop.
      coord.request_stop()

    # Wait for threads to finish.
    coord.join(threads)
    sess.close()


def main(_):
  run_training()


if __name__ == '__main__':
  tf.app.run()
================================================
FILE: SOURCE/how_tos/reading_data/fully_connected_preloaded_var.py
================================================
"""Trains the MNIST network using preloaded data stored in a variable.
Command to run this py_binary target:
bazel run -c opt \
<...>/tensorflow/g3doc/how_tos/reading_data:fully_connected_preloaded_var
"""
from __future__ import print_function
import os.path
import time
import tensorflow.python.platform
import numpy
import tensorflow as tf
from tensorflow.g3doc.tutorials.mnist import input_data
from tensorflow.g3doc.tutorials.mnist import mnist
# Basic model parameters as external flags.
flags = tf.app.flags
FLAGS = flags.FLAGS
flags.DEFINE_float('learning_rate', 0.01, 'Initial learning rate.')
flags.DEFINE_integer('num_epochs', 2, 'Number of epochs to run trainer.')
flags.DEFINE_integer('hidden1', 128, 'Number of units in hidden layer 1.')
flags.DEFINE_integer('hidden2', 32, 'Number of units in hidden layer 2.')
flags.DEFINE_integer('batch_size', 100, 'Batch size. '
'Must divide evenly into the dataset sizes.')
flags.DEFINE_string('train_dir', 'data', 'Directory to put the training data.')
flags.DEFINE_boolean('fake_data', False, 'If true, uses fake data '
'for unit testing.')
def run_training():
  """Train MNIST for a number of epochs."""
  # Get the sets of images and labels for training, validation, and
  # test on MNIST.
  data_sets = input_data.read_data_sets(FLAGS.train_dir, FLAGS.fake_data)

  # Tell TensorFlow that the model will be built into the default Graph.
  with tf.Graph().as_default():
    with tf.name_scope('input'):
      # Input data
      images_initializer = tf.placeholder(
          dtype=data_sets.train.images.dtype,
          shape=data_sets.train.images.shape)
      labels_initializer = tf.placeholder(
          dtype=data_sets.train.labels.dtype,
          shape=data_sets.train.labels.shape)
      input_images = tf.Variable(
          images_initializer, trainable=False, collections=[])
      input_labels = tf.Variable(
          labels_initializer, trainable=False, collections=[])

      image, label = tf.train.slice_input_producer(
          [input_images, input_labels], num_epochs=FLAGS.num_epochs)
      label = tf.cast(label, tf.int32)
      images, labels = tf.train.batch(
          [image, label], batch_size=FLAGS.batch_size)

    # Build a Graph that computes predictions from the inference model.
    logits = mnist.inference(images, FLAGS.hidden1, FLAGS.hidden2)

    # Add to the Graph the Ops for loss calculation.
    loss = mnist.loss(logits, labels)

    # Add to the Graph the Ops that calculate and apply gradients.
    train_op = mnist.training(loss, FLAGS.learning_rate)

    # Add the Op to compare the logits to the labels during evaluation.
    eval_correct = mnist.evaluation(logits, labels)

    # Build the summary operation based on the TF collection of Summaries.
    summary_op = tf.merge_all_summaries()

    # Create a saver for writing training checkpoints.
    saver = tf.train.Saver()

    # Create the op for initializing variables.
    init_op = tf.initialize_all_variables()

    # Create a session for running Ops on the Graph.
    sess = tf.Session()

    # Run the Op to initialize the variables.
    sess.run(init_op)
    sess.run(input_images.initializer,
             feed_dict={images_initializer: data_sets.train.images})
    sess.run(input_labels.initializer,
             feed_dict={labels_initializer: data_sets.train.labels})

    # Instantiate a SummaryWriter to output summaries and the Graph.
    summary_writer = tf.train.SummaryWriter(FLAGS.train_dir,
                                            graph_def=sess.graph_def)

    # Start input enqueue threads.
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(sess=sess, coord=coord)

    # And then after everything is built, start the training loop.
    try:
      step = 0
      while not coord.should_stop():
        start_time = time.time()

        # Run one step of the model.
        _, loss_value = sess.run([train_op, loss])

        duration = time.time() - start_time

        # Write the summaries and print an overview fairly often.
        if step % 100 == 0:
          # Print status to stdout.
          print('Step %d: loss = %.2f (%.3f sec)' % (step, loss_value,
                                                     duration))
          # Update the events file.
          summary_str = sess.run(summary_op)
          summary_writer.add_summary(summary_str, step)

        # Save a checkpoint periodically.
        if (step + 1) % 1000 == 0:
          print('Saving')
          saver.save(sess, FLAGS.train_dir, global_step=step)

        # Increment the step counter exactly once per iteration.
        step += 1
    except tf.errors.OutOfRangeError:
      print('Saving')
      saver.save(sess, FLAGS.train_dir, global_step=step)
      print('Done training for %d epochs, %d steps.' % (FLAGS.num_epochs, step))
    finally:
      # When done, ask the threads to stop.
      coord.request_stop()

    # Wait for threads to finish.
    coord.join(threads)
    sess.close()


def main(_):
  run_training()


if __name__ == '__main__':
  tf.app.run()
================================================
FILE: SOURCE/how_tos/reading_data/fully_connected_reader.py
================================================
"""Train and Eval the MNIST network.
This version is like fully_connected_feed.py but uses data converted
to a TFRecords file containing tf.train.Example protocol buffers.
See tensorflow/g3doc/how_tos/reading_data.md#reading-from-files
for context.
YOU MUST run convert_to_records before running this (but you only need to
run it once).
"""
from __future__ import print_function
import os.path
import time
import tensorflow.python.platform
import numpy
import tensorflow as tf
from tensorflow.g3doc.tutorials.mnist import mnist
# Basic model parameters as external flags.
flags = tf.app.flags
FLAGS = flags.FLAGS
flags.DEFINE_float('learning_rate', 0.01, 'Initial learning rate.')
flags.DEFINE_integer('num_epochs', 2, 'Number of epochs to run trainer.')
flags.DEFINE_integer('hidden1', 128, 'Number of units in hidden layer 1.')
flags.DEFINE_integer('hidden2', 32, 'Number of units in hidden layer 2.')
flags.DEFINE_integer('batch_size', 100, 'Batch size.')
flags.DEFINE_string('train_dir', 'data', 'Directory with the training data.')
# Constants used for dealing with the files, matches convert_to_records.
TRAIN_FILE = 'train.tfrecords'
VALIDATION_FILE = 'validation.tfrecords'
def read_and_decode(filename_queue):
  reader = tf.TFRecordReader()
  _, serialized_example = reader.read(filename_queue)
  features = tf.parse_single_example(
      serialized_example,
      dense_keys=['image_raw', 'label'],
      # Defaults are not specified since both keys are required.
      dense_types=[tf.string, tf.int64])
  # Convert from a scalar string tensor (whose single string has
  # length mnist.IMAGE_PIXELS) to a uint8 tensor with shape
  # [mnist.IMAGE_PIXELS].
  image = tf.decode_raw(features['image_raw'], tf.uint8)
  image.set_shape([mnist.IMAGE_PIXELS])
  # OPTIONAL: Could reshape into a 28x28 image and apply distortions
  # here.  Since we are not applying any distortions in this
  # example, and the next step expects the image to be flattened
  # into a vector, we don't bother.
  # Convert from [0, 255] -> [-0.5, 0.5] floats.
  image = tf.cast(image, tf.float32) * (1. / 255) - 0.5
  # Convert label from a scalar int64 tensor (the type it was parsed
  # as above) to an int32 scalar.
  label = tf.cast(features['label'], tf.int32)
  return image, label
def inputs(train, batch_size, num_epochs):
  """Reads input data num_epochs times.
  Args:
    train: Selects between the training (True) and validation (False) data.
    batch_size: Number of examples per returned batch.
    num_epochs: Number of times to read the input data, or 0/None to
       train forever.
  Returns:
    A tuple (images, labels), where:
    * images is a float tensor with shape [batch_size, mnist.IMAGE_PIXELS]
      in the range [-0.5, 0.5].
    * labels is an int32 tensor with shape [batch_size] with the true label,
      a number in the range [0, mnist.NUM_CLASSES).
    Note that a tf.train.QueueRunner is added to the graph, which
    must be run using e.g. tf.train.start_queue_runners().
  """
  if not num_epochs: num_epochs = None
  filename = os.path.join(FLAGS.train_dir,
                          TRAIN_FILE if train else VALIDATION_FILE)
  with tf.name_scope('input'):
    filename_queue = tf.train.string_input_producer(
        [filename], num_epochs=num_epochs)
    # Even when reading in multiple threads, share the filename
    # queue.
    image, label = read_and_decode(filename_queue)
    # Shuffle the examples and collect them into batch_size batches.
    # (Internally uses a RandomShuffleQueue.)
    # We run this in two threads to avoid being a bottleneck.
    images, sparse_labels = tf.train.shuffle_batch(
        [image, label], batch_size=batch_size, num_threads=2,
        capacity=1000 + 3 * batch_size,
        # Ensures a minimum amount of shuffling of examples.
        min_after_dequeue=1000)
    return images, sparse_labels
def run_training():
  """Train MNIST for a number of steps."""
  # Tell TensorFlow that the model will be built into the default Graph.
  with tf.Graph().as_default():
    # Input images and labels.
    images, labels = inputs(train=True, batch_size=FLAGS.batch_size,
                            num_epochs=FLAGS.num_epochs)
    # Build a Graph that computes predictions from the inference model.
    logits = mnist.inference(images,
                             FLAGS.hidden1,
                             FLAGS.hidden2)
    # Add to the Graph the loss calculation.
    loss = mnist.loss(logits, labels)
    # Add to the Graph operations that train the model.
    train_op = mnist.training(loss, FLAGS.learning_rate)
    # The op for initializing the variables.
    init_op = tf.initialize_all_variables()
    # Create a session for running operations in the Graph.
    sess = tf.Session()
    # Initialize the variables (the trained variables and the
    # epoch counter).
    sess.run(init_op)
    # Start input enqueue threads.
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(sess=sess, coord=coord)
    try:
      step = 0
      while not coord.should_stop():
        start_time = time.time()
        # Run one step of the model.  The return values are
        # the activations from the `train_op` (which is
        # discarded) and the `loss` op.  To inspect the values
        # of your ops or variables, you may include them in
        # the list passed to sess.run() and the value tensors
        # will be returned in the tuple from the call.
        _, loss_value = sess.run([train_op, loss])
        duration = time.time() - start_time
        # Print an overview fairly often.
        if step % 100 == 0:
          print('Step %d: loss = %.2f (%.3f sec)' % (step, loss_value,
                                                     duration))
        step += 1
    except tf.errors.OutOfRangeError:
      print('Done training for %d epochs, %d steps.' % (FLAGS.num_epochs, step))
    finally:
      # When done, ask the threads to stop.
      coord.request_stop()
    # Wait for threads to finish.
    coord.join(threads)
    sess.close()
def main(_):
  run_training()
if __name__ == '__main__':
  tf.app.run()
================================================
FILE: SOURCE/how_tos/reading_data/index.md
================================================
# Reading data
There are three main methods of getting data into a TensorFlow program:
* Feeding: Python code provides the data when running each step.
* Reading from files: an input pipeline reads the data from files
at the beginning of a TensorFlow graph.
* Preloaded data: a constant or variable in the TensorFlow graph holds
all the data (for small data sets).
## Contents
### [Reading data](#AUTOGENERATED-reading-data)
* [Feeding](#Feeding)
* [Reading from files](#AUTOGENERATED-reading-from-files)
  * [Filenames, shuffling, and epoch limits](#AUTOGENERATED-filenames--shuffling--and-epoch-limits)
  * [File formats](#AUTOGENERATED-file-formats)
  * [Preprocessing](#AUTOGENERATED-preprocessing)
  * [Batching](#AUTOGENERATED-batching)
  * [Creating threads to prefetch using `QueueRunner` objects](#QueueRunner)
  * [Filtering records or producing multiple examples per record](#AUTOGENERATED-filtering-records-or-producing-multiple-examples-per-record)
  * [Sparse input data](#AUTOGENERATED-sparse-input-data)
* [Preloaded data](#AUTOGENERATED-preloaded-data)
* [Multiple input pipelines](#AUTOGENERATED-multiple-input-pipelines)
## Feeding
TensorFlow's feed mechanism lets you inject data into any Tensor in a
computation graph. A Python computation can thus feed data directly into the
graph.
Supply feed data through the `feed_dict` argument to a `run()` or `eval()` call
that initiates computation.
```python
with tf.Session():
  input = tf.placeholder(tf.float32)
  classifier = ...
  print(classifier.eval(feed_dict={input: my_python_preprocessing_fn()}))
```
While you can replace any Tensor with feed data, including variables and
constants, the best practice is to use a
[`placeholder` op](../../api_docs/python/io_ops.md#placeholder) node. A
`placeholder` exists solely to serve as the target of feeds. It is not
initialized and contains no data. A placeholder generates an error if
it is executed without a feed, so you won't forget to feed it.
An example using `placeholder` and feeding to train on MNIST data can be found
in
[`tensorflow/g3doc/tutorials/mnist/fully_connected_feed.py`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/g3doc/tutorials/mnist/fully_connected_feed.py),
and is described in the [MNIST tutorial](../../tutorials/mnist/tf/index.md).
## Reading from files
A typical pipeline for reading records from files has the following stages:
1. The list of filenames
2. *Optional* filename shuffling
3. *Optional* epoch limit
4. Filename queue
5. A Reader for the file format
6. A decoder for a record read by the reader
7. *Optional* preprocessing
8. Example queue
### Filenames, shuffling, and epoch limits
For the list of filenames, use either a constant string Tensor (like
`["file0", "file1"]` or `[("file%d" % i) for i in range(2)]`) or the
[`tf.train.match_filenames_once`
function](../../api_docs/python/io_ops.md#match_filenames_once).
Pass the list of filenames to the [`tf.train.string_input_producer`
function](../../api_docs/python/io_ops.md#string_input_producer).
`string_input_producer` creates a FIFO queue for holding the filenames until
the reader needs them.
`string_input_producer` has options for shuffling and setting a maximum number
of epochs. A queue runner adds the whole list of filenames to the queue once
for each epoch, shuffling the filenames within an epoch if `shuffle=True`.
This procedure provides a uniform sampling of files, so that examples are not
under- or over- sampled relative to each other.
The queue runner works in a thread separate from the reader that pulls
filenames from the queue, so the shuffling and enqueuing process does not
block the reader.
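The per-epoch behaviour described above can be imitated in plain Python. The following is a minimal sketch, not TensorFlow code: `filename_epochs` is a hypothetical helper that yields every filename exactly once per epoch and shuffles within each epoch only, which is why no file is over- or under-sampled.

```python
import random


def filename_epochs(filenames, num_epochs=None, shuffle=True, seed=None):
    """Yield filenames epoch by epoch; each file appears once per epoch."""
    rng = random.Random(seed)
    epoch = 0
    # num_epochs=None means "produce filenames forever".
    while num_epochs is None or epoch < num_epochs:
        order = list(filenames)
        if shuffle:
            rng.shuffle(order)  # shuffle within this epoch only
        for name in order:
            yield name
        epoch += 1


produced = list(
    filename_epochs(["file0", "file1", "file2"], num_epochs=2, seed=0))
first_epoch, second_epoch = produced[:3], produced[3:]
```

Both `first_epoch` and `second_epoch` contain all three filenames, possibly in different orders.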
### File formats
Select the reader that matches your input file format and pass the filename
queue to the reader's read method. The read method outputs a key identifying
the file and record (useful for debugging if you have some weird records), and
a scalar string value. Use one (or more) of the decoder and conversion ops to
decode this string into the tensors that make up an example.
#### CSV files
To read text files in [comma-separated value (CSV)
format](https://tools.ietf.org/html/rfc4180), use a
[`TextLineReader`](../../api_docs/python/io_ops.md#TextLineReader) with the
[`decode_csv`](../../api_docs/python/io_ops.md#decode_csv) operation. For example:
```python
filename_queue = tf.train.string_input_producer(["file0.csv", "file1.csv"])
reader = tf.TextLineReader()
key, value = reader.read(filename_queue)
# Default values, in case of empty columns. Also specifies the type of the
# decoded result.
record_defaults = [[1], [1], [1], [1], [1]]
col1, col2, col3, col4, col5 = tf.decode_csv(
    value, record_defaults=record_defaults)
# The columns are scalars, so pack them into one 1-D feature tensor
# (tf.concat cannot join scalars).
features = tf.pack([col1, col2, col3, col4])
with tf.Session() as sess:
  # Start populating the filename queue.
  coord = tf.train.Coordinator()
  threads = tf.train.start_queue_runners(coord=coord)
  for i in range(1200):
    # Retrieve a single instance:
    example, label = sess.run([features, col5])
  coord.request_stop()
  coord.join(threads)
```
Each execution of `read` reads a single line from the file. The
`decode_csv` op then parses the result into a list of tensors. The
`record_defaults` argument determines the type of the resulting tensors and
sets the default value to use if a value is missing in the input string.
You must call `tf.train.start_queue_runners` to populate the queue before
you call `run` or `eval` to execute the `read`. Otherwise `read` will
block while it waits for filenames from the queue.
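To make the role of `record_defaults` concrete, here is a minimal plain-Python sketch of the same idea. `decode_csv_line` is a hypothetical helper, not the TensorFlow op: it assumes each default is a one-element list, as in the example above, and uses the Python type of the default to convert each field.

```python
import csv


def decode_csv_line(line, record_defaults):
    """Parse one CSV line into a list of typed values.

    Empty fields take the default for their column; non-empty fields are
    converted to the type of that default.
    """
    fields = next(csv.reader([line]))
    if len(fields) != len(record_defaults):
        raise ValueError("expected %d fields, got %d"
                         % (len(record_defaults), len(fields)))
    decoded = []
    for field, (default,) in zip(fields, record_defaults):
        decoded.append(default if field == "" else type(default)(field))
    return decoded


record_defaults = [[1], [1], [1], [1], [1]]
# The second column is empty, so it falls back to its default value.
row = decode_csv_line("5,,7,8,0", record_defaults)
features, label = row[:4], row[4]
```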
#### Fixed length records
To read binary files in which each record is a fixed number of bytes, use
[`tf.FixedLengthRecordReader`](../../api_docs/python/io_ops.md#FixedLengthRecordReader)
with the [`tf.decode_raw`](../../api_docs/python/io_ops.md#decode_raw) operation.
The `decode_raw` op converts from a string to a uint8 tensor.
For example, [the CIFAR-10 dataset](http://www.cs.toronto.edu/~kriz/cifar.html)
uses a file format where each record is represented using a fixed number of
bytes: 1 byte for the label followed by 3072 bytes of image data. Once you have
a uint8 tensor, standard operations can slice out each piece and reformat as
needed. For CIFAR-10, you can see how to do the reading and decoding in
[`tensorflow/models/image/cifar10/cifar10_input.py`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/models/image/cifar10/cifar10_input.py)
and described in
[this tutorial](../../tutorials/deep_cnn/index.md#prepare-the-data).
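As an illustration of the record layout only, the following plain-Python sketch splits a byte string into CIFAR-10-style records of 1 label byte plus 3072 image bytes. The constant and function names are assumptions made for this sketch; it does not use `FixedLengthRecordReader` or `decode_raw`.

```python
LABEL_BYTES = 1
IMAGE_BYTES = 3072
RECORD_BYTES = LABEL_BYTES + IMAGE_BYTES


def split_records(data):
    """Split raw bytes into a list of (label, image_bytes) pairs."""
    if len(data) % RECORD_BYTES != 0:
        raise ValueError("data is not a whole number of records")
    records = []
    for start in range(0, len(data), RECORD_BYTES):
        record = bytearray(data[start:start + RECORD_BYTES])
        label = record[0]                    # first byte is the label
        image = bytes(record[LABEL_BYTES:])  # remaining bytes are pixels
        records.append((label, image))
    return records


# Two fake records: label 3 with all-zero pixels, label 7 with all-255 pixels.
fake_file = bytes(bytearray([3] + [0] * IMAGE_BYTES + [7] + [255] * IMAGE_BYTES))
records = split_records(fake_file)
```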
#### Standard TensorFlow format
Another approach is to convert whatever data you have into a supported format.
This approach makes it easier to mix and match data sets and network
architectures. The recommended format for TensorFlow is a TFRecords file
containing
[`tf.train.Example` protocol buffers](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/example/example.proto)
(which contain
[`Features`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/example/feature.proto)
as a field). You write a little program that gets your data, stuffs it in an
`Example` protocol buffer, serializes the protocol buffer to a string, and then
writes the string to a TFRecords file using the
[`tf.python_io.TFRecordWriter` class](../../api_docs/python/python_io.md#TFRecordWriter).
For example,
[`tensorflow/g3doc/how_tos/reading_data/convert_to_records.py`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/g3doc/how_tos/reading_data/convert_to_records.py)
converts MNIST data to this format.
To read a file of TFRecords, use
[`tf.TFRecordReader`](../../api_docs/python/io_ops.md#TFRecordReader) with
the [`tf.parse_single_example`](../../api_docs/python/io_ops.md#parse_single_example)
decoder. The `parse_single_example` op decodes the example protocol buffers into
tensors. An MNIST example using the data produced by `convert_to_records` can be
found in
[`tensorflow/g3doc/how_tos/reading_data/fully_connected_reader.py`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/g3doc/how_tos/reading_data/fully_connected_reader.py),
which you can compare with the `fully_connected_feed` version.
### Preprocessing
You can then do any preprocessing of these examples you want. This would be any
processing that doesn't depend on trainable parameters. Examples include
normalization of your data, picking a random slice, adding noise or distortions,
etc. See
[`tensorflow/models/image/cifar10/cifar10.py`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/models/image/cifar10/cifar10.py)
for an example.
### Batching
At the end of the pipeline we use another queue to batch together examples for
training, evaluation, or inference. For this we use a queue that randomizes the
order of examples, using the
[`tf.train.shuffle_batch` function](../../api_docs/python/io_ops.md#shuffle_batch).
Example:
```python
def read_my_file_format(filename_queue):
  reader = tf.SomeReader()
  key, record_string = reader.read(filename_queue)
  example, label = tf.some_decoder(record_string)
  processed_example = some_processing(example)
  return processed_example, label
def input_pipeline(filenames, batch_size, num_epochs=None):
  filename_queue = tf.train.string_input_producer(
      filenames, num_epochs=num_epochs, shuffle=True)
  example, label = read_my_file_format(filename_queue)
  # min_after_dequeue defines how big a buffer we will randomly sample
  #   from -- bigger means better shuffling but slower start up and more
  #   memory used.
  # capacity must be larger than min_after_dequeue and the amount larger
  #   determines the maximum we will prefetch.  Recommendation:
  #   min_after_dequeue + (num_threads + a small safety margin) * batch_size
  min_after_dequeue = 10000
  capacity = min_after_dequeue + 3 * batch_size
  example_batch, label_batch = tf.train.shuffle_batch(
      [example, label], batch_size=batch_size, capacity=capacity,
      min_after_dequeue=min_after_dequeue)
  return example_batch, label_batch
```
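The effect of `min_after_dequeue` can be seen in a small plain-Python sketch. This is not how `tf.train.shuffle_batch` is implemented; `shuffle_batches` is a hypothetical single-threaded helper that only mimics the sampling behaviour. It keeps at least `min_after_dequeue` elements buffered while input remains, and drains the buffer once the input is exhausted.

```python
import random


def shuffle_batches(examples, batch_size, min_after_dequeue, seed=None):
    """Yield shuffled batches sampled from a bounded-below buffer."""
    rng = random.Random(seed)
    source = iter(examples)
    pool = []
    batch = []
    exhausted = False
    while True:
        # Refill so that removing one element leaves at least
        # min_after_dequeue elements behind.
        while not exhausted and len(pool) <= min_after_dequeue:
            try:
                pool.append(next(source))
            except StopIteration:
                exhausted = True
        if not pool:
            break
        # Remove a uniformly random element from the pool.
        index = rng.randrange(len(pool))
        pool[index], pool[-1] = pool[-1], pool[index]
        batch.append(pool.pop())
        if len(batch) == batch_size:
            yield batch
            batch = []
    # A trailing partial batch is dropped in this sketch.


batches = list(
    shuffle_batches(range(100), batch_size=10, min_after_dequeue=20, seed=0))
```

A larger `min_after_dequeue` mixes examples from further apart in the input, at the cost of buffering more elements before the first batch is available.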
If you need more parallelism or shuffling of examples between files, use
multiple reader instances using the
[`tf.train.shuffle_batch_join` function](../../api_docs/python/io_ops.md#shuffle_batch_join).
For example:
```python
def read_my_file_format(filename_queue):
  # Same as above
def input_pipeline(filenames, batch_size, read_threads, num_epochs=None):
  filename_queue = tf.train.string_input_producer(
      filenames, num_epochs=num_epochs, shuffle=True)
  example_list = [read_my_file_format(filename_queue)
                  for _ in range(read_threads)]
  min_after_dequeue = 10000
  capacity = min_after_dequeue + 3 * batch_size
  example_batch, label_batch = tf.train.shuffle_batch_join(
      example_list, batch_size=batch_size, capacity=capacity,
      min_after_dequeue=min_after_dequeue)
  return example_batch, label_batch
```
You still only use a single filename queue that is shared by all the readers.
That way we ensure that the different readers use different files from the same
epoch until all the files from the epoch have been started. (It is also usually
sufficient to have a single thread filling the filename queue.)
An alternative is to use a single reader via the
[`tf.train.shuffle_batch` function](../../api_docs/python/io_ops.md#shuffle_batch)
with `num_threads` bigger than 1. All of the threads then read from a single file at
the same time (which is still faster than reading with 1 thread), instead of from N files at once.
This can be important:
* If you have more reading threads than input files, to avoid the risk that
you will have two threads reading the same example from the same file near
each other.
* Or if reading N files in parallel causes too many disk seeks.
How many threads do you need? The `tf.train.shuffle_batch*` functions add a
summary to the graph that indicates how full the example queue is. If you have
enough reading threads, that summary will stay above zero. You can
[view your summaries as training progresses using TensorBoard](../../how_tos/summaries_and_tensorboard/index.md).
### Creating threads to prefetch using `QueueRunner` objects
The short version: many of the `tf.train` functions listed above add
[`QueueRunner`](../../api_docs/python/train.md#QueueRunner) objects to your
graph. These require that you call
[`tf.train.start_queue_runners`](../../api_docs/python/train.md#start_queue_runners)
before running any training or inference steps, or it will hang forever. This
will start threads that run the input pipeline, filling the example queue so
that the dequeue to get the examples will succeed. This is best combined with a
[`tf.train.Coordinator`](../../api_docs/python/train.md#Coordinator) to cleanly
shut down these threads when there are errors. If you set a limit on the number
of epochs, that will use an epoch counter that will need to be initialized. The
recommended code pattern combining these is:
```python
# Create the graph, etc.
init_op = tf.initialize_all_variables()
# Create a session for running operations in the Graph.
sess = tf.Session()
# Initialize the variables (like the epoch counter).
sess.run(init_op)
# Start input enqueue threads.
coord = tf.train.Coordinator()
threads = tf.train.start_queue_runners(sess=sess, coord=coord)
try:
  while not coord.should_stop():
    # Run training steps or whatever
    sess.run(train_op)
except tf.errors.OutOfRangeError:
  print('Done training -- epoch limit reached')
finally:
  # When done, ask the threads to stop.
  coord.request_stop()
# Wait for threads to finish.
coord.join(threads)
sess.close()
```
#### Aside: What is happening here?
First we create the graph. It will have a few pipeline stages that are
connected by queues. The first stage will generate filenames to read and enqueue
them in the filename queue. The second stage consumes filenames (using a
`Reader`), produces examples, and enqueues them in an example queue. Depending
on how you have set things up, you may actually have a few independent copies of
the second stage, so that you can read from multiple files in parallel. At the
end of these stages is an enqueue operation, which enqueues into a queue that
the next stage dequeues from. We want to start threads running these enqueuing
operations, so that our training loop can dequeue examples from the example
queue.
The helpers in `tf.train` that create these queues and enqueuing operations add
a [`tf.train.QueueRunner`](../../api_docs/python/train.md#QueueRunner) to the
graph using the
[`tf.train.add_queue_runner`](../../api_docs/python/train.md#add_queue_runner)
function. Each `QueueRunner` is responsible for one stage, and holds the list of
enqueue operations that need to be run in threads. Once the graph is
constructed, the
[`tf.train.start_queue_runners`](../../api_docs/python/train.md#start_queue_runners)
function asks each QueueRunner in the graph to start its threads running the
enqueuing operations.
If all goes well, you can now run your training steps and the queues will be
filled by the background threads. If you have set an epoch limit, at some point
an attempt to dequeue examples will get a
[`tf.errors.OutOfRangeError`](../../api_docs/python/client.md#OutOfRangeError). This
is the TensorFlow equivalent of "end of file" (EOF) -- this means the epoch
limit has been reached and no more examples are available.
The last ingredient is the
[`Coordinator`](../../api_docs/python/train.md#Coordinator). This is responsible
for letting all the threads know if anything has signalled a shut down. Most
commonly this would be because an exception was raised, for example one of the
threads got an error when running some operation (or an ordinary Python
exception).
For more about threading, queues, QueueRunners, and Coordinators
[see here](../../how_tos/threading_and_queues/index.md).
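The interplay of enqueue threads and a coordinator can be sketched with the Python standard library alone. `SimpleCoordinator` and `enqueue_loop` below are hypothetical stand-ins written for this illustration; they are not the `tf.train.Coordinator` or `QueueRunner` implementations, only the same stop-flag pattern.

```python
import queue
import threading


class SimpleCoordinator(object):
    """Shares a single stop flag between all threads."""

    def __init__(self):
        self._stop_event = threading.Event()

    def should_stop(self):
        return self._stop_event.is_set()

    def request_stop(self):
        self._stop_event.set()

    def join(self, threads):
        for thread in threads:
            thread.join()


def enqueue_loop(coord, example_queue, items):
    """Producer thread body: enqueue items until done or asked to stop."""
    for item in items:
        enqueued = False
        while not enqueued and not coord.should_stop():
            try:
                example_queue.put(item, timeout=0.05)
                enqueued = True
            except queue.Full:
                pass  # queue is full; re-check the stop flag and retry
        if not enqueued:
            return  # a stop was requested


coord = SimpleCoordinator()
example_queue = queue.Queue(maxsize=4)
threads = [threading.Thread(target=enqueue_loop,
                            args=(coord, example_queue, range(1000)))]
for thread in threads:
    thread.start()

consumed = []
try:
    # The "training loop": dequeue ten examples, then stop.
    while not coord.should_stop() and len(consumed) < 10:
        consumed.append(example_queue.get(timeout=1.0))
finally:
    coord.request_stop()
coord.join(threads)
```

Without the producer thread having been started, the first `get` would have nothing to dequeue, which mirrors the hang described above when `start_queue_runners` is not called.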
#### Aside: How clean shut-down when limiting epochs works
Imagine you have a model that has set a limit on the number of epochs to train
on. That means that the thread generating filenames will only run that many
times before generating an `OutOfRange` error. The QueueRunner will catch that
error, close the filename queue, and exit the thread. Closing the queue does two
things:
* Any future attempt to enqueue in the filename queue will generate an error.
At this point there shouldn't be any threads trying to do that, but this
is helpful when queues are closed due to other errors.
* Any current or future dequeue will either succeed (if there are enough
elements left) or fail (with an `OutOfRange` error) immediately. They won't
block waiting for more elements to be enqueued, since by the previous point
that can't happen.
The point is that when the filename queue is closed, there will likely still be
many filenames in that queue, so the next stage of the pipeline (with the reader
and other preprocessing) may continue running for some time. Once the filename
queue is exhausted, though, the next attempt to dequeue a filename (e.g. from a
reader that has finished with the file it was working on) will trigger an
`OutOfRange` error. In this case, though, you might have multiple threads
associated with a single QueueRunner. If this isn't the last thread in the
QueueRunner, the `OutOfRange` error just causes the one thread to exit. This
allows the other threads, which are still finishing up their last file, to
proceed until they finish as well. (Assuming you are using a
[`tf.train.Coordinator`](../../api_docs/python/train.md#Coordinator),
other types of errors will cause all the threads to stop.) Only once all the reader
threads have hit the `OutOfRange` error does the next queue, the example
queue, get closed.
Again, the example queue will have some elements queued, so training will
continue until those are exhausted. If the example queue is a
[`RandomShuffleQueue`](../../api_docs/python/io_ops.md#RandomShuffleQueue), say
because you are using `shuffle_batch` or `shuffle_batch_join`, it normally will
avoid ever having fewer than its `min_after_dequeue` attr elements
buffered. However, once the queue is closed that restriction will be lifted and
the queue will eventually empty. At that point the actual training threads,
when they try to dequeue from the example queue, will start getting `OutOfRange`
errors and exiting. Once all the training threads are done,
[`tf.train.Coordinator.join`](../../api_docs/python/train.md#Coordinator.join)
will return and you can exit cleanly.
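The two consequences of closing a queue can be modelled in a few lines of plain Python. This is a single-threaded sketch under stated assumptions: `ClosableQueue` and `OutOfRange` are hypothetical names, and where a real queue would block, the sketch raises an error instead.

```python
import collections


class OutOfRange(Exception):
    """Stand-in for the error raised when a closed queue runs dry."""


class ClosableQueue(object):
    def __init__(self):
        self._items = collections.deque()
        self._closed = False

    def enqueue(self, item):
        if self._closed:
            # Any enqueue after close fails.
            raise RuntimeError("enqueue on a closed queue")
        self._items.append(item)

    def close(self):
        self._closed = True

    def dequeue(self):
        if self._items:
            # Elements that were already queued can still be dequeued.
            return self._items.popleft()
        if self._closed:
            # Closed and empty: fail immediately rather than wait.
            raise OutOfRange("queue is closed and empty")
        raise RuntimeError("a real queue would block here")


filename_queue = ClosableQueue()
for name in ["file0", "file1"]:
    filename_queue.enqueue(name)
filename_queue.close()  # the epoch limit has been reached

drained = []
finished = False
try:
    while True:
        drained.append(filename_queue.dequeue())
except OutOfRange:
    finished = True
```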
### Filtering records or producing multiple examples per record
Instead of examples with shapes `[x, y, z]`, you will produce a batch of
examples with shape `[batch, x, y, z]`. The batch size can be 0 if you want to
filter this record out (maybe it is in a hold-out set?), or bigger than 1 if you
are producing multiple examples per record. Then simply set `enqueue_many=True`
when calling one of the batching functions (such as `shuffle_batch` or
`shuffle_batch_join`).
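A plain-Python sketch of the same idea: each record is turned into a list of zero or more examples, and every element of that list is queued individually before batching. The helper names and the filtering rule are made up for this illustration and do not correspond to TensorFlow functions.

```python
def expand_record(record):
    """Return zero or more examples for one record."""
    if record % 3 == 0:
        return []                 # filtered out: a "batch" of size 0
    return [record, -record]      # two examples from a single record


def batch_many(records, batch_size):
    """Queue every example of every record, then emit full batches."""
    pending = []
    for record in records:
        pending.extend(expand_record(record))
        while len(pending) >= batch_size:
            yield pending[:batch_size]
            pending = pending[batch_size:]
    # Leftover examples that do not fill a batch are dropped in this sketch.


batches = list(batch_many(range(1, 8), batch_size=4))
```

Records 3 and 6 contribute nothing, and the examples of record 7 do not fill a batch, so `batches` holds two batches of four examples.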
### Sparse input data
SparseTensors don't play well with queues. If you use SparseTensors you have
to decode the string records using
[`tf.parse_example`](../../api_docs/python/io_ops.md#parse_example) **after**
batching (instead of using `tf.parse_single_example` before batching).
## Preloaded data
This is only used for small data sets that can be loaded entirely in memory.
There are two approaches:
* Store the data in a constant.
* Store the data in a variable, that you initialize and then never change.
Using a constant is a bit simpler, but uses more memory (since the constant is
stored inline in the graph data structure, which may be duplicated a few times).
```python
training_data = ...
training_labels = ...
with tf.Session():
  input_data = tf.constant(training_data)
  input_labels = tf.constant(training_labels)
  ...
```
To instead use a variable, you need to also initialize it after the graph has been built.
```python
training_data = ...
training_labels = ...
with tf.Session() as sess:
  data_initializer = tf.placeholder(dtype=training_data.dtype,
                                    shape=training_data.shape)
  label_initializer = tf.placeholder(dtype=training_labels.dtype,
                                     shape=training_labels.shape)
  input_data = tf.Variable(data_initializer, trainable=False, collections=[])
  input_labels = tf.Variable(label_initializer, trainable=False, collections=[])
  ...
  sess.run(input_data.initializer,
           feed_dict={data_initializer: training_data})
  sess.run(input_labels.initializer,
           feed_dict={label_initializer: training_labels})
```
Setting `trainable=False` keeps the variable out of the
`GraphKeys.TRAINABLE_VARIABLES` collection in the graph, so we won't try and
update it when training. Setting `collections=[]` keeps the variable out of the
`GraphKeys.VARIABLES` collection used for saving and restoring checkpoints.
Either way, the
[`tf.train.slice_input_producer` function](../../api_docs/python/io_ops.md#slice_input_producer)
can be used to produce a slice at a time. This shuffles the examples across an
entire epoch, so further shuffling when batching is undesirable. So instead of
using the `shuffle_batch` functions, we use the plain
[`tf.train.batch` function](../../api_docs/python/io_ops.md#batch). To use
multiple preprocessing threads, set the `num_threads` parameter to a number
bigger than 1.
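The combination of per-epoch shuffling and plain batching can be sketched in plain Python. `slice_batches` is a hypothetical helper, not `tf.train.slice_input_producer`: it shuffles the example indices once per epoch, keeps each example paired with its label, and for simplicity never lets a batch span two epochs.

```python
import random


def slice_batches(data, labels, batch_size, num_epochs, seed=None):
    """Yield (data_batch, label_batch) pairs from preloaded lists."""
    rng = random.Random(seed)
    for _ in range(num_epochs):
        order = list(range(len(data)))
        rng.shuffle(order)  # one shuffle across the entire epoch
        # Plain batching: take consecutive slices, no further shuffling.
        for start in range(0, len(order) - batch_size + 1, batch_size):
            picked = order[start:start + batch_size]
            yield [data[i] for i in picked], [labels[i] for i in picked]


training_data = ["a", "b", "c", "d", "e", "f"]
training_labels = [0, 1, 2, 3, 4, 5]
batches = list(slice_batches(training_data, training_labels,
                             batch_size=2, num_epochs=2, seed=0))
```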
An MNIST example that preloads the data using constants can be found in
[`tensorflow/g3doc/how_tos/reading_data/fully_connected_preloaded.py`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/g3doc/how_tos/reading_data/fully_connected_preloaded.py),
and one that preloads the data using variables can be found in
[`tensorflow/g3doc/how_tos/reading_data/fully_connected_preloaded_var.py`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/g3doc/how_tos/reading_data/fully_connected_preloaded_var.py).
You can compare these with the `fully_connected_feed` and
`fully_connected_reader` versions above.
## Multiple input pipelines
Commonly you will want to train on one dataset and evaluate (or "eval") on
another. One way to do this is to actually have two separate processes:
* The training process reads training input data and periodically writes
checkpoint files with all the trained variables.
* The evaluation process restores the checkpoint files into an inference
model that reads validation input data.
This is what is done in
[the example CIFAR-10 model](../../tutorials/deep_cnn/index.md#save-and-restore-checkpoints). This has a couple of benefits:
* The eval is performed on a single snapshot of the trained variables.
* You can perform the eval even after training has completed and exited.
You can have the train and eval in the same graph in the same process, and share
their trained variables. See
[the shared variables tutorial](../../how_tos/variable_scope/index.md).
================================================
FILE: SOURCE/how_tos/reading_data.md
================================================
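# Reading data
There are three ways for a TensorFlow program to read data:
* Feeding: Python code supplies the data at each step of the TensorFlow program.
* Reading from files: an input pipeline at the beginning of the TensorFlow graph reads the data from files.
* Preloaded data: a constant or variable defined in the TensorFlow graph holds all the data (only suitable for small data sets).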
## Contents
### [Reading data](#AUTOGENERATED-reading-data)
* [Feeding](#Feeding)
* [Reading from files](#AUTOGENERATED-reading-from-files)
  * [Filenames, shuffling, and epoch limits](#AUTOGENERATED-filenames--shuffling--and-epoch-limits)
  * [File formats](#AUTOGENERATED-file-formats)
  * [Preprocessing](#AUTOGENERATED-preprocessing)
  * [Batching](#AUTOGENERATED-batching)
  * [Creating threads to prefetch using `QueueRunner` objects](#QueueRunner)
  * [Filtering records or producing multiple examples per record](#AUTOGENERATED-filtering-records-or-producing-multiple-examples-per-record)
  * [Sparse input data](#AUTOGENERATED-sparse-input-data)
* [Preloaded data](#AUTOGENERATED-preloaded-data)
* [Multiple input pipelines](#AUTOGENERATED-multiple-input-pipelines)
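## Feeding
TensorFlow's feed mechanism lets you inject data into any tensor in a computation graph, so a Python computation can set data directly into the TensorFlow graph.
Supply the feed data through the `feed_dict` argument of the `run()` or `eval()` call that starts the computation.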
```python
with tf.Session():
  input = tf.placeholder(tf.float32)
  classifier = ...
  print(classifier.eval(feed_dict={input: my_python_preprocessing_fn()}))
```
Although you can replace any tensor with feed data, including constants and variables, the best practice is to use a [`placeholder` op](tensorflow-zh/SOURCE/api_docs/python/io_ops.md#placeholder) node. A `placeholder` node exists solely to serve as the target of feeds. It is uninitialized when declared and contains no data. If it is executed without a feed, TensorFlow raises an error, so do not forget to feed every `placeholder`.
An example that uses `placeholder` and feeding to train on MNIST data can be found in [`tensorflow/g3doc/tutorials/mnist/fully_connected_feed.py`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/g3doc/tutorials/mnist/fully_connected_feed.py), and is also described in the [MNIST tutorial](tensorflow-zh/SOURCE/tutorials/mnist/tf/index.md).
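## Reading from files
A typical pipeline for reading records from files has the following stages:
1. The list of filenames
2. *Optional* filename shuffling
3. *Optional* epoch limit
4. Filename queue
5. A reader for the input file format
6. A decoder for a record
7. *Optional* preprocessing
8. Example queue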
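### Filenames, shuffling, and epoch limits
To produce the list of filenames, use either a string tensor (such as `["file0", "file1"]` or `[("file%d" % i) for i in range(2)]`) or the [`tf.train.match_filenames_once` function](tensorflow-zh/SOURCE/api_docs/python/io_ops.md#match_filenames_once).
Pass the list of filenames to the [`tf.train.string_input_producer` function](tensorflow-zh/SOURCE/api_docs/python/io_ops.md#string_input_producer). `string_input_producer` creates a FIFO queue that holds the filenames until the reader needs them.
`string_input_producer` has options for shuffling the filenames and for setting a maximum number of epochs. A `QueueRunner` adds the whole list of filenames to the filename queue once per epoch, shuffling the filenames if `shuffle=True`. This procedure samples the files uniformly, so it produces a balanced filename queue.
This `QueueRunner` works in a thread separate from the reader, so shuffling the filenames and pushing them into the filename queue does not block the reader.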
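### File formats
Select the reader that matches your file format and pass the filename queue to the reader's `read` method. The `read` method outputs a key identifying the file and the record within it (very useful for debugging), and a scalar string value. Use one or more of the decoder and conversion ops to decode this string into the tensors that make up an example.
#### CSV files
To read data from CSV files, use a [`TextLineReader`](tensorflow-zh/SOURCE/api_docs/python/io_ops.md#TextLineReader) with the [`decode_csv`](tensorflow-zh/SOURCE/api_docs/python/io_ops.md#decode_csv) operation, as in the following example: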
```python
filename_queue = tf.train.string_input_producer(["file0.csv", "file1.csv"])
reader = tf.TextLineReader()
key, value = reader.read(filename_queue)
# Default values, in case of empty columns. Also specifies the type of the
# decoded result.
record_defaults = [[1], [1], [1], [1], [1]]
col1, col2, col3, col4, col5 = tf.decode_csv(
    value, record_defaults=record_defaults)
# The columns are scalars, so pack them into one 1-D feature tensor
# (tf.concat cannot join scalars).
features = tf.pack([col1, col2, col3, col4])
with tf.Session() as sess:
  # Start populating the filename queue.
  coord = tf.train.Coordinator()
  threads = tf.train.start_queue_runners(coord=coord)
  for i in range(1200):
    # Retrieve a single instance:
    example, label = sess.run([features, col5])
  coord.request_stop()
  coord.join(threads)
```
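Each execution of `read` reads one line from the file. The `decode_csv` op parses that line into a list of tensors. The `record_defaults` argument determines the types of the resulting tensors and supplies the default value to use when a value is missing from the input.
You must call `tf.train.start_queue_runners` to populate the queue with filenames before you call `run` or `eval` to execute the `read`. Otherwise `read` blocks until the filename queue holds a value.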
#### Fixed length records
To read fixed-length records from binary files, use [`tf.FixedLengthRecordReader`](tensorflow-zh/SOURCE/api_docs/python/io_ops.md#FixedLengthRecordReader) with the [`tf.decode_raw`](tensorflow-zh/SOURCE/api_docs/python/io_ops.md#decode_raw) operation. The `decode_raw` op converts a string into a uint8 tensor.
For example, [the CIFAR-10 dataset](http://www.cs.toronto.edu/~kriz/cifar.html) uses a file format in which every record has a fixed length: one byte for the label followed by 3072 bytes of image data. Once you have a uint8 tensor, standard operations can slice out each piece and reformat it as needed. Example code can be found in [`tensorflow/models/image/cifar10/cifar10_input.py`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/models/image/cifar10/cifar10_input.py), and it is described in [this tutorial](tensorflow-zh/SOURCE/tutorials/deep_cnn/index.md#prepare-the-data).
#### Standard TensorFlow format
Another approach is to convert whatever data you have into a format that TensorFlow supports. This approach makes it easier to mix and match data sets and network architectures. The recommended format is a TFRecords file containing [`tf.train.Example` protocol buffers](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/example/example.proto) (which contain [`Features`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/example/feature.proto) as a field). You write a small program that gets your data, fills in an `Example` protocol buffer, serializes the protocol buffer to a string, and writes the string to a TFRecords file using the [`tf.python_io.TFRecordWriter` class](tensorflow-zh/SOURCE/api_docs/python/python_io.md#TFRecordWriter). [`tensorflow/g3doc/how_tos/reading_data/convert_to_records.py`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/g3doc/how_tos/reading_data/convert_to_records.py) is an example of this.
To read a TFRecords file, use [`tf.TFRecordReader`](tensorflow-zh/SOURCE/api_docs/python/io_ops.md#TFRecordReader) with the [`tf.parse_single_example`](tensorflow-zh/SOURCE/api_docs/python/io_ops.md#parse_single_example) decoder. The `parse_single_example` op decodes `Example` protocol buffers into tensors. An MNIST example that uses the data produced by `convert_to_records` can be found in [`tensorflow/g3doc/how_tos/reading_data/fully_connected_reader.py`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/g3doc/how_tos/reading_data/fully_connected_reader.py), which you can compare with the `fully_connected_feed` version.
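### Preprocessing
You can apply any preprocessing you want to the input examples, as long as it does not depend on trainable parameters. [`tensorflow/models/image/cifar10/cifar10.py`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/models/image/cifar10/cifar10.py) has examples of normalizing data, picking a random slice, adding noise or distortions, and so on.
### Batching
At the end of the input pipeline, another queue batches examples together for training, evaluation, or inference. For this we use the [`tf.train.shuffle_batch` function](tensorflow-zh/SOURCE/api_docs/python/io_ops.md#shuffle_batch), which randomizes the order of the examples in the queue.
Example: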
```
def read_my_file_format(filename_queue):
    reader = tf.SomeReader()
    key, record_string = reader.read(filename_queue)
    example, label = tf.some_decoder(record_string)
    processed_example = some_processing(example)
    return processed_example, label

def input_pipeline(filenames, batch_size, num_epochs=None):
    filename_queue = tf.train.string_input_producer(
        filenames, num_epochs=num_epochs, shuffle=True)
    example, label = read_my_file_format(filename_queue)
    # min_after_dequeue defines how big a buffer we will randomly sample
    #   from -- bigger means better shuffling but slower start up and more
    #   memory used.
    # capacity must be larger than min_after_dequeue and the amount larger
    #   determines the maximum we will prefetch.  Recommendation:
    #   min_after_dequeue + (num_threads + a small safety margin) * batch_size
    min_after_dequeue = 10000
    capacity = min_after_dequeue + 3 * batch_size
    example_batch, label_batch = tf.train.shuffle_batch(
        [example, label], batch_size=batch_size, capacity=capacity,
        min_after_dequeue=min_after_dequeue)
    return example_batch, label_batch
```
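The capacity recommendation in the comments above is simple arithmetic, sketched here as a small helper. The function name `recommended_capacity` and the default `safety_margin=2` are assumptions; the default was chosen only so that the single-thread case reproduces the `3 * batch_size` term used in the example.

```python
def recommended_capacity(min_after_dequeue, batch_size,
                         num_threads=1, safety_margin=2):
    """min_after_dequeue + (num_threads + a small safety margin) * batch_size."""
    return min_after_dequeue + (num_threads + safety_margin) * batch_size


single_thread = recommended_capacity(10000, batch_size=128)
four_threads = recommended_capacity(10000, batch_size=128, num_threads=4)
```

The part above `min_after_dequeue` is the prefetch headroom, so it grows with the number of threads filling the queue.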
If you need more parallelism or more shuffling of examples between files, use multiple reader instances with the [`tf.train.shuffle_batch_join` function](tensorflow-zh/SOURCE/api_docs/python/io_ops.md#shuffle_batch_join).
Example:
```
def read_my_file_format(filename_queue):
    # Same as above
    pass

def input_pipeline(filenames, batch_size, read_threads, num_epochs=None):
    filename_queue = tf.train.string_input_producer(
        filenames, num_epochs=num_epochs, shuffle=True)
    example_list = [read_my_file_format(filename_queue)
                    for _ in range(read_threads)]
    min_after_dequeue = 10000
    capacity = min_after_dequeue + 3 * batch_size
    example_batch, label_batch = tf.train.shuffle_batch_join(
        example_list, batch_size=batch_size, capacity=capacity,
        min_after_dequeue=min_after_dequeue)
    return example_batch, label_batch
```
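In this example you still use only a single filename queue, shared by all the readers. This ensures that the different readers read different files from the same epoch until all the files of that epoch have been started. (A single thread is usually sufficient to fill the filename queue.)
An alternative is to use a single reader via the [`tf.train.shuffle_batch` function](tensorflow-zh/SOURCE/api_docs/python/io_ops.md#shuffle_batch) with `num_threads` set to a value greater than 1. This reads from only one file at a time (but still faster than with a single thread), instead of from several files at once. This can matter in two cases:
* If you have more reading threads than input files, it avoids the risk of two threads reading the same example from the same file close together.
* If reading several files in parallel would cause too many disk seeks.
How many reading threads do you need? The `tf.train.shuffle_batch*` functions add a summary to the graph that indicates how full the example queue is. If you have enough reading threads, that summary will stay above zero. You can view your summaries as training progresses using [TensorBoard](tensorflow-zh/SOURCE/how_tos/summaries_and_tensorboard/index.md).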
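### Creating threads to prefetch using `QueueRunner` objects
The short version: many of the `tf.train` functions listed above add [`QueueRunner`](../../api_docs/python/train.md#QueueRunner) objects to your graph. You must call [`tf.train.start_queue_runners`](../../api_docs/python/train.md#start_queue_runners) before running any training step, or the graph will hang forever. [`tf.train.start_queue_runners`](../../api_docs/python/train.md#start_queue_runners) starts the threads that run the input pipeline, filling the example queue so that the dequeue operation can obtain examples. This is best combined with a [`tf.train.Coordinator`](../../api_docs/python/train.md#Coordinator), which cleanly shuts down these threads when an error occurs. If you set a limit on the number of epochs, an epoch counter is used, and it must be initialized. The recommended code pattern is: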
```python
# Create the graph, etc.
init_op = tf.initialize_all_variables()

# Create a session for running operations in the Graph.
sess = tf.Session()

# Initialize the variables (like the epoch counter).
sess.run(init_op)

# Start input enqueue threads.
coord = tf.train.Coordinator()
threads = tf.train.start_queue_runners(sess=sess, coord=coord)

try:
    while not coord.should_stop():
        # Run training steps or whatever
        sess.run(train_op)
except tf.errors.OutOfRangeError:
    print('Done training -- epoch limit reached')
finally:
    # When done, ask the threads to stop.
    coord.request_stop()

# Wait for threads to finish.
coord.join(threads)
sess.close()
```
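#### Aside: What is happening here?
First we create the graph. It consists of a few pipeline stages connected by queues. The first stage generates the filenames to read and enqueues them in the filename queue. The second stage consumes filenames (using a `Reader`), produces examples, and enqueues them in an example queue. Depending on how you have set things up, you may actually have several independent copies of the second stage, so that you can read from multiple files in parallel. At the end of each stage is an enqueue operation, which enqueues into a queue that the next stage dequeues from. We want to start threads that run these enqueue operations, so that our training loop can keep dequeuing examples from the example queue.
The helpers in `tf.train` that create these queues and enqueue operations add a [`tf.train.QueueRunner`](../../api_docs/python/train.md#QueueRunner) to the graph using the [`tf.train.add_queue_runner`](../../api_docs/python/train.md#add_queue_runner) function. Each `QueueRunner` is responsible for one stage and holds the list of enqueue operations that need to be run in threads. Once the graph is constructed, the [`tf.train.start_queue_runners`](../../api_docs/python/train.md#start_queue_runners) function asks each `QueueRunner` in the graph to start its threads running the enqueue operations.
If all goes well, you can now run your training steps while the queues are filled by the background threads. If you have set an epoch limit, at some point an attempt to dequeue examples will raise a [`tf.errors.OutOfRangeError`](../../api_docs/python/client.md#OutOfRangeError). This is the TensorFlow equivalent of "end of file" (EOF): the epoch limit has been reached and no more examples are available.
The last ingredient is the [`Coordinator`](../../api_docs/python/train.md#Coordinator). It is responsible for letting all the threads know when anything has signalled a shutdown. Most commonly this happens because an exception was raised, for example when one of the threads hit an error while running some operation (or an ordinary Python exception).
For more about threading, queues, QueueRunners, and Coordinators, [see here](../../how_tos/threading_and_queues/index.md).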
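#### Aside: How does a clean shutdown work when limiting epochs?
Imagine you have a model with a limit on the number of epochs to train on. This means that the thread generating filenames will only run that many times before generating an `OutOfRange` error. The `QueueRunner` will catch that error, close the filename queue, and exit the thread. Closing the queue does two things:
* Any future attempt to enqueue in the filename queue will generate an error. At this point no thread should be trying to do that, but this is helpful when queues are closed because of other errors.
* Any current or future dequeue will either succeed (if enough elements are left) or fail immediately (with an `OutOfRange` error). It will not block waiting for more elements to be enqueued, since the previous point guarantees that this cannot happen.
The point is that when the filename queue is closed, there will likely still be many filenames in it, so the next stage of the pipeline (the reader and other preprocessing) may continue running for some time. Once the filename queue is exhausted, though, the next attempt to dequeue a filename (for example, from a reader that has finished the file it was working on) will trigger an `OutOfRange` error. In this case you might have multiple threads associated with a single QueueRunner. If this is not the last thread in the QueueRunner, the `OutOfRange` error just causes that one thread to exit. This allows the other threads, which are still finishing their last file, to proceed until they are done as well. (Assuming you are using a [`tf.train.Coordinator`](../../api_docs/python/train.md#Coordinator), other types of errors will cause all the threads to stop.) Only once all the reader threads have hit the `OutOfRange` error does the next queue, the example queue, get closed.
Again, the example queue will have some elements queued, so training will continue until those are exhausted. If the example queue is a [`RandomShuffleQueue`](../../api_docs/python/io_ops.md#RandomShuffleQueue), say because you are using `shuffle_batch` or `shuffle_batch_join`, it normally avoids ever having fewer than `min_after_dequeue` elements buffered. However, once the queue is closed that restriction is lifted and the queue will eventually empty. At that point the actual training threads, when they try to dequeue from the example queue, will start getting `OutOfRange` errors and exit. Once all the training threads are done, [`tf.train.Coordinator.join`](../../api_docs/python/train.md#Coordinator.join) will return and you can exit cleanly.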
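The close semantics described above can be sketched with a tiny single-threaded stand-in. This is an illustration only, not TensorFlow's queue implementation: the class and exception names are hypothetical, and where a real queue would block, this sketch raises instead so that it stays runnable without threads.

```python
import collections


class QueueClosedError(Exception):
    """Hypothetical stand-in for the error raised when enqueuing to a closed queue."""


class OutOfRange(Exception):
    """Hypothetical stand-in for TensorFlow's OutOfRange error."""


class ClosableQueue(object):
    def __init__(self):
        self._items = collections.deque()
        self._closed = False

    def enqueue(self, item):
        if self._closed:
            raise QueueClosedError("queue is closed")
        self._items.append(item)

    def close(self):
        self._closed = True

    def dequeue(self):
        if self._items:
            return self._items.popleft()   # remaining elements are still served
        if self._closed:
            raise OutOfRange("queue is closed and empty")
        raise RuntimeError("a real queue would block here until an item arrives")


filenames = ClosableQueue()
for name in ["a.tfrecords", "b.tfrecords"]:
    filenames.enqueue(name)
filenames.close()   # the epoch limit was reached

drained = []
try:
    while True:
        drained.append(filenames.dequeue())
except OutOfRange:
    pass            # the downstream stage exits once the queue is drained
```

Closing does not discard queued elements: the consumer drains both filenames first and only then sees the end-of-data error.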
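### Filtering records or producing multiple examples per record
Instead of examples with shape `[x, y, z]`, you produce a batch of examples with shape `[batch, x, y, z]`. The batch size can be 0 if you want to filter this record out (perhaps it belongs to a hold-out set), or greater than 1 if you are producing multiple examples per record. Then simply set `enqueue_many=True` when calling one of the batching functions (such as `shuffle_batch` or `shuffle_batch_join`).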
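As a rough plain-Python analogy (not the TensorFlow API), each record maps to a list of zero or more examples, and the enqueue step treats every element of that list as a separate example. The record fields and the negated-copy rule below are invented for illustration.

```python
def examples_from_record(record):
    """Return a batch (list) of zero or more examples for one record."""
    if record["held_out"]:
        return []  # batch size 0: the record is filtered out
    # Batch size 2: the original value plus a negated copy (a made-up augmentation).
    return [record["value"], -record["value"]]


def enqueue_many(example_queue, batch):
    """Treat each element of the batch as a separate example."""
    example_queue.extend(batch)


records = [
    {"value": 1, "held_out": False},
    {"value": 2, "held_out": True},
    {"value": 3, "held_out": False},
]
example_queue = []
for record in records:
    enqueue_many(example_queue, examples_from_record(record))
```

The held-out record contributes nothing to the queue, while each remaining record contributes two examples.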
### Sparse input data
SparseTensors do not work well with queues. If you use SparseTensors, you have to decode the string records using [`tf.parse_example`](../../api_docs/python/io_ops.md#parse_example) **after** batching (instead of using `tf.parse_single_example` **before** batching).
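## Preloaded data
This is only suitable for small data sets that can be loaded entirely into memory. There are two approaches:
* Store the data in a constant.
* Store the data in a variable that you initialize and then never change.
Using a constant is a bit simpler, but uses more memory (since the constant is stored inline in the graph data structure, which may be duplicated a few times).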
```python
training_data = ...
training_labels = ...
with tf.Session():
    input_data = tf.constant(training_data)
    input_labels = tf.constant(training_labels)
    ...
```
To use a variable instead, you also need to initialize it after the graph has been built.
```python
training_data = ...
training_labels = ...
with tf.Session() as sess:
    data_initializer = tf.placeholder(dtype=training_data.dtype,
                                      shape=training_data.shape)
    label_initializer = tf.placeholder(dtype=training_labels.dtype,
                                       shape=training_labels.shape)
    input_data = tf.Variable(data_initializer, trainable=False, collections=[])
    input_labels = tf.Variable(label_initializer, trainable=False, collections=[])
    ...
    sess.run(input_data.initializer,
             feed_dict={data_initializer: training_data})
    sess.run(input_labels.initializer,
             feed_dict={label_initializer: training_labels})
```
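Setting `trainable=False` keeps the variable out of the `GraphKeys.TRAINABLE_VARIABLES` collection in the graph, so we will not try to update it during training. Setting `collections=[]` keeps the variable out of the `GraphKeys.VARIABLES` collection, which is used for saving and restoring checkpoints.
Either way, the [`tf.train.slice_input_producer`](../../api_docs/python/io_ops.md#slice_input_producer) function can be used to produce one slice at a time. It shuffles the examples across an entire epoch, so further shuffling when batching is unnecessary. Instead of the `shuffle_batch` functions, we therefore use the plain [`tf.train.batch` function](../../api_docs/python/io_ops.md#batch). To use multiple preprocessing threads, set the `num_threads` parameter to a number greater than 1.
An MNIST example that preloads the data using constants can be found in [`tensorflow/g3doc/how_tos/reading_data/fully_connected_preloaded.py`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/g3doc/how_tos/reading_data/fully_connected_preloaded.py), and one that preloads the data using variables can be found in [`tensorflow/g3doc/how_tos/reading_data/fully_connected_preloaded_var.py`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/g3doc/how_tos/reading_data/fully_connected_preloaded_var.py). You can compare these with the `fully_connected_feed` and `fully_connected_reader` versions described above.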
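The idea of shuffling once per epoch and then batching without further shuffling can be sketched in plain Python. This only illustrates the data flow and is not how `slice_input_producer` or `tf.train.batch` are implemented; the helper name `epoch_batches` and the toy data are assumptions. One simplification: the sketch emits a short final batch, whereas a queue-based batcher would keep filling it from the next epoch.

```python
import random


def epoch_batches(data, labels, batch_size, rng):
    """Shuffle the example indices once, then emit batches in that order."""
    order = list(range(len(data)))
    rng.shuffle(order)
    for start in range(0, len(order), batch_size):
        idx = order[start:start + batch_size]
        # Data and labels are sliced with the same indices, so pairs stay aligned.
        yield [data[i] for i in idx], [labels[i] for i in idx]


data = [10, 11, 12, 13, 14]
labels = [0, 1, 0, 1, 0]          # here the label is simply value % 2
batches = list(epoch_batches(data, labels, batch_size=2, rng=random.Random(0)))
```

Every example appears exactly once per epoch, which is why a second shuffle in the batching step adds nothing.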
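## Multiple input pipelines
Commonly you will want to train on one dataset and evaluate (or "eval") on another. One way to do this is to have two separate processes:
* The training process reads training input data and periodically writes checkpoint files containing all the trained variables.
* The evaluation process restores the checkpoint files into an inference model that reads validation input data.
This is what is done in [the example CIFAR-10 model](../../tutorials/deep_cnn/index.md#save-and-restore-checkpoints). It has a couple of benefits:
* The eval is performed on a single snapshot of the trained variables.
* You can perform the eval even after training has completed and exited.
You can also have the train and eval steps in the same graph in the same process, and share their trained variables. See [the shared variables tutorial](../../how_tos/variable_scope/index.md).
Original: [Reading data](https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/how_tos/reading_data/index.md) Translation: [volvet](https://github.com/volvet) and [zhangkom](https://github.com/zhangkom) Proofreading: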
================================================
FILE: SOURCE/how_tos/summaries_and_tensorboard/index.md
================================================
# TensorBoard: Visualizing Learning
The computations you'll use TensorBoard for - like training a massive
deep neural network - can be complex and confusing. To make it easier to
understand, debug, and optimize TensorFlow programs, we've included a suite of
visualization tools called TensorBoard. You can use TensorBoard to visualize
your TensorFlow graph, plot quantitative metrics about the execution of your
graph, and show additional data like images that pass through it. When
TensorBoard is fully configured, it looks like this:

## Serializing the data
TensorBoard operates by reading TensorFlow events files, which contain summary
data that you can generate when running TensorFlow. Here's the general
lifecycle for summary data within TensorBoard.
First, create the TensorFlow graph that you'd like to collect summary
data from, and decide which nodes you would like to annotate with
[summary operations](../../api_docs/python/train.md#summary-operations).
For example, suppose you are training a convolutional neural network for
recognizing MNIST digits. You'd like to record how the learning rate
varies over time, and how the objective function is changing. Collect these by
attaching [`scalar_summary`](../../api_docs/python/train.md#scalar_summary) ops
to the nodes that output the learning rate and loss respectively. Then, give
each `scalar_summary` a meaningful `tag`, like `'learning rate'` or `'loss
function'`.
Perhaps you'd also like to visualize the distributions of activations coming
off a particular layer, or the distribution of gradients or weights. Collect
this data by attaching
[`histogram_summary`](../../api_docs/python/train.md#histogram_summary) ops to
the gradient outputs and to the variable that holds your weights, respectively.
For details on all of the summary operations available, check out the docs on
[summary operations](../../api_docs/python/train.md#summary-operations).
Operations in TensorFlow don't do anything until you run them, or an op that
depends on their output. And the summary nodes that we've just created are
peripheral to your graph: none of the ops you are currently running depend on
them. So, to generate summaries, we need to run all of these summary nodes.
Managing them by hand would be tedious, so use
[`tf.merge_all_summaries`](../../api_docs/python/train.md#merge_all_summaries)
to combine them into a single op that generates all the summary data.
Then, you can just run the merged summary op, which will generate a serialized
`Summary` protobuf object with all of your summary data at a given step.
Finally, to write this summary data to disk, pass the summary protobuf to a
[`tf.train.SummaryWriter`](../../api_docs/python/train.md#SummaryWriter).
The `SummaryWriter` takes a logdir in its constructor - this logdir is quite
important, it's the directory where all of the events will be written out.
Also, the `SummaryWriter` can optionally take a `GraphDef` in its constructor.
If it receives one, then TensorBoard will visualize your graph as well.
Now that you've modified your graph and have a `SummaryWriter`, you're ready to
start running your network! If you want, you could run the merged summary op
every single step, and record a ton of training data. That's likely to be more
data than you need, though. Instead, consider running the merged summary op
every hundred steps or so, as in the following code example.
```python
merged_summary_op = tf.merge_all_summaries()
summary_writer = tf.train.SummaryWriter('/tmp/mnist_logs', sess.graph)
total_step = 0
while training:
    total_step += 1
    sess.run(training_op)
    if total_step % 100 == 0:
        summary_str = sess.run(merged_summary_op)
        summary_writer.add_summary(summary_str, total_step)
```
You're now all set to visualize this data using TensorBoard.
## Launching TensorBoard
To run TensorBoard, use the command

    python tensorflow/tensorboard/tensorboard.py --logdir=path/to/log-directory

where `logdir` points to the directory where the `SummaryWriter` serialized its
data. If this `logdir` directory contains subdirectories which contain
serialized data from separate runs, then TensorBoard will visualize the data
from all of those runs. Once TensorBoard is running, navigate your web browser
to `localhost:6006` to view the TensorBoard.
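One way to get the multi-run layout described above is to give each run its own subdirectory under a common root. The sketch below only creates such a directory tree with the standard library; the helper name `run_logdir`, the run names, and the temporary root are assumptions, and no TensorFlow call is involved.

```python
import os
import tempfile


def run_logdir(root, run_name):
    """Create (if needed) and return the log directory for one run."""
    path = os.path.join(root, run_name)
    if not os.path.isdir(path):
        os.makedirs(path)
    return path


# A temporary root stands in for something like /tmp/mnist_logs.
root = tempfile.mkdtemp(prefix="mnist_logs_")
run_dirs = [run_logdir(root, name) for name in ("run1", "run2")]
```

Each run would then pass its own directory to its `SummaryWriter`, and TensorBoard would be started with `--logdir` pointing at the root.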
If you have pip installed TensorBoard, you can use the simpler command

    tensorboard --logdir=/path/to/log-directory

When looking at TensorBoard, you will see the navigation tabs in the top right
corner. Each tab represents a set of serialized data that can be visualized.
For any tab you are looking at, if the logs being looked at by TensorBoard do
not contain any data relevant to that tab, a message will be displayed
indicating how to serialize data that is applicable to that tab.
For in-depth information on how to use the *graph* tab to visualize your graph,
see [TensorBoard: Visualizing your graph](../../how_tos/graph_viz/index.md).
================================================
FILE: SOURCE/how_tos/summaries_and_tensorboard.md
================================================
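# TensorBoard: Visualizing Learning
The computations you'll use TensorBoard for, like training a massive deep neural network, can be complex and confusing.
To make it easier to understand, debug, and optimize TensorFlow programs, we've included a suite of visualization tools called TensorBoard. You can use TensorBoard to visualize your TensorFlow graph, plot quantitative metrics about the execution of your graph, and show additional data like images that pass through it.
When TensorBoard is fully configured, it looks like this: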

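## Serializing the data
TensorBoard operates by reading TensorFlow events files, which contain summary data that you can generate when running TensorFlow. Here's the general lifecycle for summary data within TensorBoard.
First, create the TensorFlow graph that you'd like to collect summary data from, and decide which nodes you would like to annotate with [summary operations](../api_docs/python/train.md#summary-operations).
For example, suppose you are training a convolutional neural network for recognizing MNIST digits. You'd like to record how the learning rate varies over time, and how the objective function is changing. Collect these by attaching [scalar_summary](../api_docs/python/train.md#scalar_summary) ops to the nodes that output the learning rate and loss respectively. Then give each `scalar_summary` a meaningful `tag`, like `'learning rate'` or `'loss function'`.
Perhaps you'd also like to visualize the distribution of activations coming off a particular layer, or the distribution of gradients or weights. Collect this data by attaching [histogram_summary](../api_docs/python/train.md#histogram_summary) ops to the gradient outputs and to the variable that holds your weights, respectively.
For details on all of the summary operations available, check out the docs on [summary operations](../api_docs/python/train.md#summary-operations).
Operations in TensorFlow don't do anything until you run them, or an op that depends on their output. The summary nodes we've just created are peripheral to your graph: none of the ops you are currently running depend on them. So, to generate summaries, we need to run all of these summary nodes. Managing them by hand would be tedious, so use [tf.merge_all_summaries](../api_docs/python/train.md#merge_all_summaries) to combine them into a single op.
Then you can just run the merged summary op, which generates a serialized `Summary` protobuf object with all of your summary data at a given step. Finally, to write this summary data to disk, pass the summary protobuf to a [tf.train.SummaryWriter](../api_docs/python/train.md#SummaryWriter).
The `SummaryWriter` takes a logdir in its constructor. This logdir is quite important: it is the directory where all of the events will be written out. The `SummaryWriter` can also optionally take a `GraphDef` in its constructor. If it receives one, TensorBoard will visualize your graph as well.
Now that you've modified your graph and have a `SummaryWriter`, you're ready to start running your network! If you want, you could run the merged summary op every single step and record a ton of training data. That's likely to be more data than you need, though. Instead, consider running the merged summary op every hundred steps or so, as in the following code example.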
```python
merged_summary_op = tf.merge_all_summaries()
summary_writer = tf.train.SummaryWriter('/tmp/mnist_logs', sess.graph)
total_step = 0
while training:
    total_step += 1
    sess.run(training_op)
    if total_step % 100 == 0:
        summary_str = sess.run(merged_summary_op)
        summary_writer.add_summary(summary_str, total_step)
```
You're now all set to visualize this data using TensorBoard.
## Launching TensorBoard
To run TensorBoard, use the command
```
python tensorflow/tensorboard/tensorboard.py --logdir=path/to/log-directory
```
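where `logdir` points to the directory where the `SummaryWriter` serialized its data. If this `logdir` directory contains subdirectories with serialized data from separate runs, TensorBoard will visualize the data from all of those runs. Once TensorBoard is running, navigate your web browser to `localhost:6006` to view it.
If you have pip installed TensorBoard, you can use the simpler command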
```
tensorboard --logdir=/path/to/log-directory
```
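When looking at TensorBoard, you will see the navigation tabs in the top right corner. Each tab represents a set of serialized data that can be visualized. For any tab you are looking at, if the logs being read by TensorBoard do not contain any data relevant to that tab, a message will be displayed indicating how to serialize data that is applicable to that tab.
For in-depth information on how to use the *graph* tab to visualize your graph, see [TensorBoard: Visualizing your graph](./graph_viz.md).
Original: [TensorBoard:Visualizing Learning](http://tensorflow.org/how_tos/summaries_and_tensorboard/index.html#tensorboard-visualizing-learning) Translation: [thylaco1eo](https://github.com/thylaco1eo) Proofreading: [lucky521](https://github.com/lucky521)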
================================================
FILE: SOURCE/how_tos/threading_and_queues/index.md
================================================
# Threading and Queues
Queues are a powerful mechanism for asynchronous computation using TensorFlow.
Like everything in TensorFlow, a queue is a node in a TensorFlow graph. It's a
stateful node, like a variable: other nodes can modify its content. In
particular, nodes can enqueue new items into the queue, or dequeue existing
items from the queue.
To get a feel for queues, let's consider a simple example. We will create a
"first in, first out" queue (`FIFOQueue`) and fill it with zeros.
Then we'll construct a graph
that takes an item off the queue, adds one to that item, and puts it back on the
end of the queue. Slowly, the numbers on the queue increase.
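The behaviour of this example can be imitated in plain Python with a `collections.deque`. This is only an analogy for the data flow: it is not a TensorFlow graph, and the queue length of three and the helper name `step` are assumptions.

```python
import collections

# A "first in, first out" queue filled with zeros.
queue = collections.deque([0, 0, 0])


def step(q):
    """Take an item off the front, add one, and put it back on the end."""
    item = q.popleft()
    q.append(item + 1)


for _ in range(4):
    step(queue)
```

After four steps every element has been incremented once and the first of them twice, so the numbers on the queue rise slowly, one item at a time.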
`Enqueue`, `EnqueueMany`, and `Dequeue` are special nodes. They take a pointer
to the queue instead of a normal value, allowing them to change it. We recommend
you think of these as being like methods of the queue. In fact, in the Python
API, they are methods of the queue object (e.g. `q.enqueue(...)`).
Now that you have a bit of a feel for queues, let's dive into the details...
## Queue Use Overview
Queues, such as `FIFOQueue` and `RandomShuffleQueue`, are important TensorFlow
objects for computing tensors asynchronously in a graph.
For example, a typical input architecture is to use a `RandomShuffleQueue` to
prepare inputs for training a model:
* Multiple threads prepare training examples and push them in the queue.
* A training thread executes a training op that dequeues mini-batches from the
queue.
This architecture has many benefits, as highlighted in the
[Reading data how to](../reading_data), which also gives an overview of
functions that simplify the construction of input pipelines.
The TensorFlow `Session` object is multithreaded, so multiple threads can
easily use the same session and run ops in parallel. However, it is not always
easy to implement a Python program that drives threads as described above. All
threads must be able to stop together, exceptions must be caught and
reported, and queues must be properly closed when stopping.
TensorFlow provides two classes to help:
[tf.train.Coordinator](../../api_docs/python/train.md#Coordinator) and
[tf.train.QueueRunner](../../api_docs/python/train.md#QueueRunner). These two classes
are designed to be used together. The `Coordinator` class helps multiple threads
stop together and report exceptions to a program that waits for them to stop.
The `QueueRunner` class is used to create a number of threads cooperating to
enqueue tensors in the same queue.
## Coordinator
The Coordinator class helps multiple threads stop together.
Its key methods are:
* `should_stop()`: returns True if the threads should stop.
* `request_stop()`: requests that threads should stop.
* `join()`: waits until the specified threads have stopped.
You first create a `Coordinator` object, and then create a number of threads
that use the coordinator. The threads typically run loops that stop when
`should_stop()` returns `True`.
Any thread can decide that the computation should stop. It only has to call
`request_stop()` and the other threads will stop as `should_stop()` will then
return `True`.
```python
# Thread body: loop until the coordinator indicates a stop was requested.
# If some condition becomes true, ask the coordinator to stop.
def MyLoop(coord):
    while not coord.should_stop():
        ...do something...
        if ...some condition...:
            coord.request_stop()

# Main code: create a coordinator.
coord = tf.train.Coordinator()

# Create 10 threads that run 'MyLoop()'
# Note the trailing comma: args must be a tuple.
threads = [threading.Thread(target=MyLoop, args=(coord,)) for i in range(10)]

# Start the threads and wait for all of them to stop.
for t in threads:
    t.start()
coord.join(threads)
```
Obviously, the coordinator can manage threads doing very different things.
They don't have to be all the same as in the example above. The coordinator
also has support to capture and report exceptions. See the [Coordinator class](../../api_docs/python/train.md#Coordinator) documentation for more details.
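To see the stop protocol run end to end without TensorFlow, here is a minimal standard-library stand-in for the three methods listed above. `SimpleCoordinator` and `count_until_stopped` are hypothetical names; the real `Coordinator` also captures and reports exceptions, which this sketch leaves out.

```python
import threading


class SimpleCoordinator(object):
    """Hypothetical stand-in mirroring should_stop / request_stop / join."""

    def __init__(self):
        self._stop_event = threading.Event()

    def should_stop(self):
        return self._stop_event.is_set()

    def request_stop(self):
        self._stop_event.set()

    def join(self, threads):
        for t in threads:
            t.join()


def count_until_stopped(coord, counter, lock, limit):
    """Thread body: increment a shared counter until the limit is reached."""
    while not coord.should_stop():
        with lock:
            if counter[0] >= limit:
                coord.request_stop()   # any thread may ask everyone to stop
            else:
                counter[0] += 1


coord = SimpleCoordinator()
counter = [0]
lock = threading.Lock()
threads = [threading.Thread(target=count_until_stopped,
                            args=(coord, counter, lock, 100))
           for _ in range(4)]
for t in threads:
    t.start()
coord.join(threads)
```

Whichever thread first notices that the limit has been reached requests the stop, and the other threads leave their loops the next time they check `should_stop()`.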
## QueueRunner
The `QueueRunner` class creates a number of threads that repeatedly run an
enqueue op. These threads can use a coordinator to stop together. In
addition, a queue runner runs a *closer thread* that automatically closes the
queue if an exception is reported to the coordinator.
You can use a queue runner to implement the architecture described above.
First build a graph that uses a `Queue` for input examples. Add ops that
process examples and enqueue them in the queue. Add training ops that start by
dequeueing from the queue.
```python
example = ...ops to create one example...
# Create a queue, and an op that enqueues examples one at a time in the queue.
queue = tf.RandomShuffleQueue(...)
enqueue_op = queue.enqueue(example)
# Create a training graph that starts by dequeuing a batch of examples.
inputs = queue.dequeue_many(batch_size)
train_op = ...use 'inputs' to build the training part of the graph...
```
In the Python training program, create a `QueueRunner` that will run a few
threads to process and enqueue examples. Create a `Coordinator` and ask the
queue runner to start its threads with the coordinator. Write a training loop
that also uses the coordinator.
```
# Create a queue runner that will run 4 threads in parallel to enqueue
# examples.
qr = tf.train.QueueRunner(queue, [enqueue_op] * 4)

# Launch the graph.
sess = tf.Session()
# Create a coordinator, launch the queue runner threads.
coord = tf.train.Coordinator()
enqueue_threads = qr.create_threads(sess, coord=coord, start=True)
# Run the training loop, controlling termination with the coordinator.
for step in range(1000000):
    if coord.should_stop():
        break
    sess.run(train_op)
# When done, ask the threads to stop.
coord.request_stop()
# And wait for them to actually do it.
coord.join(enqueue_threads)
```
## Handling Exceptions
Threads started by queue runners do more than just run the enqueue ops. They
also catch and handle exceptions generated by queues, including
`OutOfRangeError` which is used to report that a queue was closed.
A training program that uses a coordinator must similarly catch and report
exceptions in its main loop.
Here is an improved version of the training loop above.
```python
try:
    for step in range(1000000):
        if coord.should_stop():
            break
        sess.run(train_op)
except Exception as e:
    # Report exceptions to the coordinator.
    coord.request_stop(e)

# Terminate as usual.  It is innocuous to request stop twice.
coord.request_stop()
coord.join(enqueue_threads)
```
Original: ([Threading and Queues](https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/how_tos/threading_and_queues/index.md)) Translation: ([zhangkom](https://github.com/zhangkom)) Proofreading:
================================================
FILE: SOURCE/how_tos/threading_and_queues.md
================================================
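# Threading and Queues
Queues are a powerful mechanism for asynchronous computation using TensorFlow.
Like everything in TensorFlow, a queue is a node in a TensorFlow graph. It's a stateful node, like a variable: other nodes can modify its content. In particular, nodes can enqueue new items at the rear of the queue, or dequeue existing items from its front.
To get a feel for queues, let's consider a simple example. We will create a "first in, first out" queue (`FIFOQueue`) and fill it with zeros. Then we'll construct a graph that takes an item off the front of the queue, adds one to that item, and puts it back on the end of the queue. Slowly, the numbers on the queue increase.
`Enqueue`, `EnqueueMany`, and `Dequeue` are special nodes. They take a pointer to the queue instead of a normal value, allowing them to change it. We recommend you think of these as being like methods of the queue. In fact, in the Python API, they are methods of the queue object (e.g. `q.enqueue(...)`).
Now that you have a bit of a feel for queues, let's dive into the details...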
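## Queue Use Overview
Queues, such as `FIFOQueue` and `RandomShuffleQueue`, are important TensorFlow objects for computing tensors asynchronously in a graph.
For example, a typical input architecture is to use a `RandomShuffleQueue` to prepare inputs for training a model:
* Multiple threads prepare training examples and push them into the queue.
* A training thread executes a training op that dequeues mini-batches from the queue.
This architecture has many benefits, as highlighted in the [Reading data how to](../reading_data), which also gives an overview of functions that simplify the construction of input pipelines.
The TensorFlow `Session` object is multithreaded, so multiple threads can easily use the same session and run ops in parallel. However, it is not always easy to implement a Python program that drives threads in this way. All threads must be able to stop together, exceptions must be caught and reported, and queues must be properly closed when stopping.
TensorFlow provides two classes to help: [tf.train.Coordinator](tensorflow-zh/SOURCE/api_docs/python/train.md#Coordinator) and
[tf.train.QueueRunner](tensorflow-zh/SOURCE/api_docs/python/train.md#QueueRunner). These two classes are designed to be used together. The `Coordinator` class helps multiple threads stop together and report exceptions to a program that waits for them to stop. The `QueueRunner` class is used to create a number of threads that cooperate to enqueue tensors in the same queue.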
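## Coordinator
The Coordinator class helps multiple threads stop together.
Its key methods are:
* `should_stop()`: returns True if the threads should stop.
* `request_stop()`: requests that the threads stop.
* `join()`: waits until the specified threads have stopped.
You first create a `Coordinator` object, and then create a number of threads that use the coordinator. The threads typically run loops that stop when `should_stop()` returns `True`.
Any thread can decide that the computation should stop. It only has to call `request_stop()`, and the other threads will stop because `should_stop()` will then return `True`.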
```python
# Thread body: loop until the coordinator indicates a stop was requested.
# If some condition becomes true, ask the coordinator to stop.
def MyLoop(coord):
    while not coord.should_stop():
        ...do something...
        if ...some condition...:
            coord.request_stop()

# Main code: create a coordinator.
coord = tf.train.Coordinator()

# Create 10 threads that run 'MyLoop()'
# Note the trailing comma: args must be a tuple.
threads = [threading.Thread(target=MyLoop, args=(coord,)) for i in range(10)]

# Start the threads and wait for all of them to stop.
for t in threads:
    t.start()
coord.join(threads)
```
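Obviously, the coordinator can manage threads doing very different things. They don't all have to be the same as in the example above. The coordinator also has support for capturing and reporting exceptions. See the [Coordinator class](tensorflow-zh/SOURCE/api_docs/python/train.md#Coordinator) documentation for more details.
## QueueRunner
The `QueueRunner` class creates a number of threads that repeatedly run an enqueue op. These threads can use a coordinator to stop together. In addition, a queue runner runs a *closer thread* that automatically closes the queue if an exception is reported to the coordinator.
You can use a queue runner to implement the architecture described above.
First build a graph that uses a queue for input examples. Add ops that process examples and enqueue them in the queue. Add training ops that start by dequeuing from the queue.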
```python
example = ...ops to create one example...
# Create a queue, and an op that enqueues examples one at a time in the queue.
queue = tf.RandomShuffleQueue(...)
enqueue_op = queue.enqueue(example)
# Create a training graph that starts by dequeuing a batch of examples.
inputs = queue.dequeue_many(batch_size)
train_op = ...use 'inputs' to build the training part of the graph...
```
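In the Python training program, create a `QueueRunner` that will run a few threads to process and enqueue examples. Create a `Coordinator` and ask the queue runner to start its threads with the coordinator. Write a training loop that also uses the coordinator to control when the `QueueRunner` threads stop.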
```
# Create a queue runner that will run 4 threads in parallel to enqueue
# examples.
qr = tf.train.QueueRunner(queue, [enqueue_op] * 4)

# Launch the graph.
sess = tf.Session()
# Create a coordinator, launch the queue runner threads.
coord = tf.train.Coordinator()
enqueue_threads = qr.create_threads(sess, coord=coord, start=True)
# Run the training loop, controlling termination with the coordinator.
for step in range(1000000):
    if coord.should_stop():
        break
    sess.run(train_op)
# When done, ask the threads to stop.
coord.request_stop()
# And wait for them to actually do it.
coord.join(enqueue_threads)
```
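The same producer and consumer shape can be tried out with only the Python standard library. This is a rough analogy under stated assumptions, not the `QueueRunner` implementation: a bounded `queue.Queue` stands in for the TensorFlow queue, a `threading.Event` stands in for the coordinator, and the `producer` function and the batch and step counts are invented.

```python
import queue
import threading


def producer(q, stop_event, worker_id):
    """Enqueue examples until a stop is requested."""
    n = 0
    while not stop_event.is_set():
        try:
            # The timeout lets a blocked producer notice the stop request.
            q.put((worker_id, n), timeout=0.05)
            n += 1
        except queue.Full:
            continue


q = queue.Queue(maxsize=8)
stop_event = threading.Event()
threads = [threading.Thread(target=producer, args=(q, stop_event, i))
           for i in range(4)]
for t in threads:
    t.start()

# "Training loop": each step dequeues a mini-batch of 3 examples.
batches = []
for step in range(5):
    batches.append([q.get() for _ in range(3)])

# When done, ask the threads to stop, and wait for them to actually do it.
stop_event.set()
for t in threads:
    t.join()
```

The bounded queue is what keeps the four producers from running arbitrarily far ahead of the consumer.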
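## Handling Exceptions
Threads started by queue runners do more than just run the enqueue ops. They also catch and handle exceptions generated by queues, including `OutOfRangeError`, which is used to report that a queue was closed.
A training program that uses a `Coordinator` must similarly catch and report exceptions in its main loop.
Here is an improved version of the training loop above.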
```python
try:
    for step in range(1000000):
        if coord.should_stop():
            break
        sess.run(train_op)
except Exception as e:
    # Report exceptions to the coordinator.
    coord.request_stop(e)

# Terminate as usual.  It is innocuous to request stop twice.
coord.request_stop()
coord.join(enqueue_threads)
```
Original: [Threading and Queues](https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/how_tos/threading_and_queues/index.md) Translation: [zhangkom](https://github.com/zhangkom) Proofreading: [volvet](https://github.com/volvet)
================================================
FILE: SOURCE/how_tos/using_gpu/index.md
================================================
# Using GPUs
## Supported devices
On a typical system, there are multiple computing devices. In TensorFlow, the
supported device types are `CPU` and `GPU`. They are represented as
`strings`. For example:
* `"/cpu:0"`: The CPU of your machine.
* `"/gpu:0"`: The GPU of your machine, if you have one.
* `"/gpu:1"`: The second GPU of your machine, etc.
If a TensorFlow operation has both CPU and GPU implementations, the
GPU devices will be given priority when the operation is assigned to
a device. For example, `matmul` has both CPU and GPU kernels. On a
system with devices `cpu:0` and `gpu:0`, `gpu:0` will be selected to run
`matmul`.
## Logging device placement
To find out which devices your operations and tensors are assigned to, create
the session with `log_device_placement` configuration option set to `True`.
```python
# Creates a graph.
a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a')
b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b')
c = tf.matmul(a, b)
# Creates a session with log_device_placement set to True.
sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
# Runs the op.
print(sess.run(c))
```
You should see the following output:
```
Device mapping:
/job:localhost/replica:0/task:0/gpu:0 -> device: 0, name: Tesla K40c, pci bus
id: 0000:05:00.0
b: /job:localhost/replica:0/task:0/gpu:0
a: /job:localhost/replica:0/task:0/gpu:0
MatMul: /job:localhost/replica:0/task:0/gpu:0
[[ 22. 28.]
[ 49. 64.]]
```
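The numeric result in this output can be checked by hand. The plain-Python sketch below multiplies the same 2x3 and 3x2 matrices; the helper name `matmul` is just an illustration and does not involve TensorFlow or any device placement.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


a = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]      # shape [2, 3]
b = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]    # shape [3, 2]
c = matmul(a, b)
```

For instance, the top-left entry is 1*1 + 2*3 + 3*5 = 22, matching the first value printed by the session.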
## Manual device placement
If you would like a particular operation to run on a device of your
choice instead of what's automatically selected for you, you can use
`with tf.device` to create a device context such that all the operations
within that context will have the same device assignment.
```python
# Creates a graph.
with tf.device('/cpu:0'):
    a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a')
    b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b')
c = tf.matmul(a, b)
# Creates a session with log_device_placement set to True.
sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
# Runs the op.
print(sess.run(c))
```
You will see that now `a` and `b` are assigned to `cpu:0`.
```
Device mapping:
/job:localhost/replica:0/task:0/gpu:0 -> device: 0, name: Tesla K40c, pci bus
id: 0000:05:00.0
b: /job:localhost/replica:0/task:0/cpu:0
a: /job:localhost/replica:0/task:0/cpu:0
MatMul: /job:localhost/replica:0/task:0/gpu:0
[[ 22. 28.]
[ 49. 64.]]
```
## Using a single GPU on a multi-GPU system
If you have more than one GPU in your system, the GPU with the lowest ID will be
selected by default. If you would like to run on a different GPU, you will need
to specify the preference explicitly:
```python
# Creates a graph.
with tf.device('/gpu:2'):
    a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a')
    b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b')
    c = tf.matmul(a, b)
# Creates a session with log_device_placement set to True.
sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
# Runs the op.
print(sess.run(c))
```
If the device you have specified does not exist, you will get
`InvalidArgumentError`:
```
InvalidArgumentError: Invalid argument: Cannot assign a device to node 'b':
Could not satisfy explicit device specification '/gpu:2'
[[Node: b = Const[dtype=DT_FLOAT, value=Tensor, _device="/gpu:2"]()]]
```
If you would like TensorFlow to automatically choose an existing and
supported device to run the operations in case the specified one doesn't
exist, you can set `allow_soft_placement` to `True` in the configuration
option when creating the session.
```python
# Creates a graph.
with tf.device('/gpu:2'):
    a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a')
    b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b')
    c = tf.matmul(a, b)
# Creates a session with allow_soft_placement and log_device_placement set
# to True.
sess = tf.Session(config=tf.ConfigProto(
    allow_soft_placement=True, log_device_placement=True))
# Runs the op.
print(sess.run(c))
```
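The difference between strict and soft placement can be pictured with a small pure-Python sketch. This is not TensorFlow's placer: `resolve_device`, the `available` list, and the fallback order (first existing GPU, otherwise the CPU) are assumptions made purely for illustration.

```python
# Toy model of strict vs. soft device placement (not TensorFlow's actual placer).
def resolve_device(requested, available, allow_soft_placement=False):
    """Return the device an op would run on in this simplified model."""
    if requested in available:
        return requested
    if not allow_soft_placement:
        # Corresponds to the InvalidArgumentError case described in the text.
        raise ValueError(
            "Could not satisfy explicit device specification '%s'" % requested)
    # Assumed fallback order: prefer an existing GPU, otherwise use the CPU.
    gpus = sorted(d for d in available if d.startswith('/gpu:'))
    return gpus[0] if gpus else '/cpu:0'


available = ['/cpu:0', '/gpu:0']
print(resolve_device('/gpu:0', available))                             # /gpu:0
print(resolve_device('/gpu:2', available, allow_soft_placement=True))  # /gpu:0
```

Without `allow_soft_placement=True`, the second call in this sketch raises instead of falling back, which is the behaviour the error message in this section illustrates.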
## Using multiple GPUs
If you would like to run TensorFlow on multiple GPUs, you can construct your
model in a multi-tower fashion where each tower is assigned to a different GPU.
For example:
```python
# Creates a graph.
c = []
for d in ['/gpu:2', '/gpu:3']:
  with tf.device(d):
    a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3])
    b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2])
    c.append(tf.matmul(a, b))
with tf.device('/cpu:0'):
  sum = tf.add_n(c)
# Creates a session with log_device_placement set to True.
sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
# Runs the op.
print(sess.run(sum))
```
You will see the following output.
```
Device mapping:
/job:localhost/replica:0/task:0/gpu:0 -> device: 0, name: Tesla K20m, pci bus
id: 0000:02:00.0
/job:localhost/replica:0/task:0/gpu:1 -> device: 1, name: Tesla K20m, pci bus
id: 0000:03:00.0
/job:localhost/replica:0/task:0/gpu:2 -> device: 2, name: Tesla K20m, pci bus
id: 0000:83:00.0
/job:localhost/replica:0/task:0/gpu:3 -> device: 3, name: Tesla K20m, pci bus
id: 0000:84:00.0
Const_3: /job:localhost/replica:0/task:0/gpu:3
Const_2: /job:localhost/replica:0/task:0/gpu:3
MatMul_1: /job:localhost/replica:0/task:0/gpu:3
Const_1: /job:localhost/replica:0/task:0/gpu:2
Const: /job:localhost/replica:0/task:0/gpu:2
MatMul: /job:localhost/replica:0/task:0/gpu:2
AddN: /job:localhost/replica:0/task:0/cpu:0
[[ 44. 56.]
[ 98. 128.]]
```
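The numbers in the last two lines of this output can be checked by hand. The sketch below redoes the arithmetic in plain Python, without TensorFlow; the helper `matmul` and the names `towers` and `total` are made up for this illustration.

```python
# Pure-Python check of the tower arithmetic; no TensorFlow required.
def matmul(x, y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(xi * yj for xi, yj in zip(row, col)) for col in zip(*y)]
            for row in x]


a = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]      # the six values with shape [2, 3]
b = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]    # the six values with shape [3, 2]

towers = [matmul(a, b) for _ in range(2)]   # one identical product per tower
# Element-wise sum of the tower results, which is the role add_n plays.
total = [[sum(vals) for vals in zip(*rows)] for rows in zip(*towers)]
print(towers[0])  # [[22.0, 28.0], [49.0, 64.0]]
print(total)      # [[44.0, 56.0], [98.0, 128.0]]
```

Each tower computes the same 2x2 product, so the summed result is exactly twice the single-GPU result.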
The [cifar10 tutorial](../../tutorials/deep_cnn/index.md) is a good example
demonstrating how to do training with multiple GPUs.
================================================
FILE: SOURCE/how_tos/using_gpu.md
================================================
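# Using GPUs
## Supported devices
On a typical system there are multiple computing devices. TensorFlow supports two device types, CPU and GPU,
and identifies them with strings. For example:
* `"/cpu:0"`: The CPU of your machine.
* `"/gpu:0"`: The GPU of your machine, if you have one.
* `"/gpu:1"`: The second GPU of your machine, and so on.
If a TensorFlow operation has both CPU and GPU implementations, the GPU is given priority when the operation is assigned to a device. For example, `matmul` has both CPU
and GPU kernels. On a system with devices `cpu:0` and `gpu:0`, the `matmul` operation will be assigned to `gpu:0`.
## Logging device placement
To find out which devices your operations and tensors are assigned to, create the session with the `log_device_placement` configuration option set to `True`.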
```python
# Creates a graph.
a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a')
b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b')
c = tf.matmul(a, b)
# Creates a session with log_device_placement set to True.
sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
# Runs the op.
print(sess.run(c))
```
You should see the following output:
```
Device mapping:
/job:localhost/replica:0/task:0/gpu:0 -> device: 0, name: Tesla K40c, pci bus
id: 0000:05:00.0
b: /job:localhost/replica:0/task:0/gpu:0
a: /job:localhost/replica:0/task:0/gpu:0
MatMul: /job:localhost/replica:0/task:0/gpu:0
[[ 22. 28.]
[ 49. 64.]]
```
## Manual device placement
If you would like a particular operation to run on a device of your choice instead of the one selected automatically, you can use `with tf.device`
to create a device context; all operations created within that context are assigned to the same device.
```python
# Creates a graph.
with tf.device('/cpu:0'):
  a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a')
  b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b')
c = tf.matmul(a, b)
# Creates a session with log_device_placement set to True.
sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
# Runs the op.
print(sess.run(c))
```
You will see that `a` and `b` are now assigned to `cpu:0`.
```
Device mapping:
/job:localhost/replica:0/task:0/gpu:0 -> device: 0, name: Tesla K40c, pci bus
id: 0000:05:00.0
b: /job:localhost/replica:0/task:0/cpu:0
a: /job:localhost/replica:0/task:0/cpu:0
MatMul: /job:localhost/replica:0/task:0/gpu:0
[[ 22. 28.]
[ 49. 64.]]
```
## Using a single GPU on a multi-GPU system
If you have more than one GPU in your system, the GPU with the lowest ID is selected by default. If you would like to run on a different GPU, you need to state that preference explicitly:
```python
# Creates a graph.
with tf.device('/gpu:2'):
  a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a')
  b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b')
  c = tf.matmul(a, b)
# Creates a session with log_device_placement set to True.
sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
# Runs the op.
print(sess.run(c))
```
If the device you have specified does not exist, you will get an `InvalidArgumentError`:
```
InvalidArgumentError: Invalid argument: Cannot assign a device to node 'b':
Could not satisfy explicit device specification '/gpu:2'
[[Node: b = Const[dtype=DT_FLOAT, value=Tensor, _device="/gpu:2"]()]]
```
If you would like TensorFlow to automatically choose an existing and supported device when the specified one does not exist, set `allow_soft_placement` to `True` in the configuration option when creating the session.
```python
# Creates a graph.
with tf.device('/gpu:2'):
  a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a')
  b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b')
  c = tf.matmul(a, b)
# Creates a session with allow_soft_placement and log_device_placement set
# to True.
sess = tf.Session(config=tf.ConfigProto(
    allow_soft_placement=True, log_device_placement=True))
# Runs the op.
print(sess.run(c))
```
## Using multiple GPUs
If you would like to run TensorFlow on multiple GPUs, you can construct your model in a multi-tower fashion,
where each tower is assigned to a different GPU. For example:
```python
# Creates a graph.
c = []
for d in ['/gpu:2', '/gpu:3']:
  with tf.device(d):
    a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3])
    b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2])
    c.append(tf.matmul(a, b))
with tf.device('/cpu:0'):
  sum = tf.add_n(c)
# Creates a session with log_device_placement set to True.
sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
# Runs the op.
print(sess.run(sum))
```
You will see the following output:
```
Device mapping:
/job:localhost/replica:0/task:0/gpu:0 -> device: 0, name: Tesla K20m, pci bus
id: 0000:02:00.0
/job:localhost/replica:0/task:0/gpu:1 -> device: 1, name: Tesla K20m, pci bus
id: 0000:03:00.0
/job:localhost/replica:0/task:0/gpu:2 -> device: 2, name: Tesla K20m, pci bus
id: 0000:83:00.0
/job:localhost/replica:0/task:0/gpu:3 -> device: 3, name: Tesla K20m, pci bus
id: 0000:84:00.0
Const_3: /job:localhost/replica:0/task:0/gpu:3
Const_2: /job:localhost/replica:0/task:0/gpu:3
MatMul_1: /job:localhost/replica:0/task:0/gpu:3
Const_1: /job:localhost/replica:0/task:0/gpu:2
Const: /job:localhost/replica:0/task:0/gpu:2
MatMul: /job:localhost/replica:0/task:0/gpu:2
AddN: /job:localhost/replica:0/task:0/cpu:0
[[ 44. 56.]
[ 98. 128.]]
```
The [cifar10 tutorial](tensorflow-zh/SOURCE/tutorials/deep_cnn/index.md) is a good example demonstrating how to train with multiple GPUs.
> Original: [using_gpu](http://tensorflow.org/how_tos/using_gpu/index.md)
Translation: [@lianghyv](https://github.com/lianghyv) Proofreading: [Wiki](https://github.com/jikexueyuanwiki)
================================================
FILE: SOURCE/how_tos/variable_scope/index.md
================================================
# Sharing Variables
You can create, initialize, save, and load single variables in the way described in the [Variables HowTo](../../how_tos/variables/index.md). But when building complex models you often need to share large sets of variables, and you may want to initialize all of them in one place. This tutorial shows how to do this using `tf.variable_scope()` and `tf.get_variable()`.
## The Problem
Imagine you create a simple model for image filters, similar to our [Convolutional Neural Networks Tutorial](../../tutorials/deep_cnn/index.md) model but with only two convolutions (to keep the example simple). If you use just `tf.Variable`, as explained in the [Variables HowTo](../../how_tos/variables/index.md), your model might look like this.
```python
def my_image_filter(input_images):
    conv1_weights = tf.Variable(tf.random_normal([5, 5, 32, 32]),
        name="conv1_weights")
    conv1_biases = tf.Variable(tf.zeros([32]), name="conv1_biases")
    conv1 = tf.nn.conv2d(input_images, conv1_weights,
        strides=[1, 1, 1, 1], padding='SAME')
    relu1 = tf.nn.relu(conv1 + conv1_biases)
    conv2_weights = tf.Variable(tf.random_normal([5, 5, 32, 32]),
        name="conv2_weights")
    conv2_biases = tf.Variable(tf.zeros([32]), name="conv2_biases")
    conv2 = tf.nn.conv2d(relu1, conv2_weights,
        strides=[1, 1, 1, 1], padding='SAME')
    return tf.nn.relu(conv2 + conv2_biases)
```
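As you can easily imagine, models quickly get much more complicated than this one, and even here we already have four different variables: `conv1_weights`, `conv1_biases`, `conv2_weights`, and `conv2_biases`.
The problem grows when you want to reuse this model. Assume you want to apply your image filter to two different images, `image1` and `image2`. You want both images processed by the same filter with the same parameters. You can call `my_image_filter()` twice, but this will create two sets of variables: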
```python
# First call creates one set of variables.
result1 = my_image_filter(image1)
# Another set is created in the second call.
result2 = my_image_filter(image2)
```
A common way to share variables is to create them in a separate piece of code and pass them to the functions that use them, for example by using a dictionary:
```python
variables_dict = {
    "conv1_weights": tf.Variable(tf.random_normal([5, 5, 32, 32]),
        name="conv1_weights"),
    "conv1_biases": tf.Variable(tf.zeros([32]), name="conv1_biases"),
    # ... etc. ...
}
def my_image_filter(input_images, variables_dict):
    conv1 = tf.nn.conv2d(input_images, variables_dict["conv1_weights"],
        strides=[1, 1, 1, 1], padding='SAME')
    relu1 = tf.nn.relu(conv1 + variables_dict["conv1_biases"])
    conv2 = tf.nn.conv2d(relu1, variables_dict["conv2_weights"],
        strides=[1, 1, 1, 1], padding='SAME')
    return tf.nn.relu(conv2 + variables_dict["conv2_biases"])
# The 2 calls to my_image_filter() now use the same variables
result1 = my_image_filter(image1, variables_dict)
result2 = my_image_filter(image2, variables_dict)
```
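While convenient, creating variables as above, outside of the model code, breaks encapsulation:
* The code that builds the graph must document the names, types, and shapes of the variables to create.
* When the code changes, the callers may have to create more, fewer, or different variables.
One way to address the problem is to use classes to create a model, where the classes take care of managing the variables they need.
For a lighter solution that does not involve classes, TensorFlow provides a *Variable Scope* mechanism that makes it easy to share named variables while constructing a graph.
## Variable Scope Example
The Variable Scope mechanism in TensorFlow consists of two main functions:
* `tf.get_variable(<name>, <shape>, <initializer>)`:
Creates or returns a variable with a given name.
* `tf.variable_scope(<scope_name>)`:
Manages namespaces for names passed to `tf.get_variable()`.
The function `tf.get_variable()` is used to get or create a variable instead of a direct call to `tf.Variable`. It uses an *initializer* instead of passing the value directly, as `tf.Variable` does. An initializer is a function that takes a shape and provides a tensor with that shape. Here are some initializers available in TensorFlow:
* `tf.constant_initializer(value)` initializes everything to the provided value,
* `tf.random_uniform_initializer(a, b)` initializes uniformly from [a, b],
* `tf.random_normal_initializer(mean, stddev)` initializes from the normal distribution with the given mean and standard deviation.
To see how `tf.get_variable()` solves the problem discussed above, let's refactor the code that created one convolution into a separate function, named `conv_relu`: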
```python
def conv_relu(input, kernel_shape, bias_shape):
    # Create variable named "weights".
    weights = tf.get_variable("weights", kernel_shape,
        initializer=tf.random_normal_initializer())
    # Create variable named "biases".
    biases = tf.get_variable("biases", bias_shape,
        initializer=tf.constant_initializer(0.0))
    conv = tf.nn.conv2d(input, weights,
        strides=[1, 1, 1, 1], padding='SAME')
    return tf.nn.relu(conv + biases)
```
This function uses the short names `"weights"` and `"biases"`. We would like to use it for both `conv1` and `conv2`, but the variables need to have different names. This is where `tf.variable_scope()` comes into play: it pushes a namespace for the variables.
```python
def my_image_filter(input_images):
    with tf.variable_scope("conv1"):
        # Variables created here will be named "conv1/weights", "conv1/biases".
        relu1 = conv_relu(input_images, [5, 5, 32, 32], [32])
    with tf.variable_scope("conv2"):
        # Variables created here will be named "conv2/weights", "conv2/biases".
        return conv_relu(relu1, [5, 5, 32, 32], [32])
```
Now, let's see what happens when we call `my_image_filter()` twice.
```
result1 = my_image_filter(image1)
result2 = my_image_filter(image2)
# Raises ValueError(... conv1/weights already exists ...)
```
As you can see, `tf.get_variable()` checks that already existing variables are not shared by accident. If you want to share them, you need to say so by calling `reuse_variables()` as follows.
```
with tf.variable_scope("image_filters") as scope:
    result1 = my_image_filter(image1)
    scope.reuse_variables()
    result2 = my_image_filter(image2)
```
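This is a good way to share variables: lightweight and safe.
## How Does Variable Scope Work?
### Understanding `tf.get_variable()`
To understand variable scope, it is necessary to first fully understand how `tf.get_variable()` works.
Here is how `tf.get_variable` is usually called.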
```python
v = tf.get_variable(name, shape, dtype, initializer)
```
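The call does one of two things, depending on the scope it is called in. Here are the two cases.
* Case 1: the scope is set for creating new variables, as indicated by `tf.get_variable_scope().reuse == False`.
In this case, `v` will be a newly created `tf.Variable` with the provided shape and data type. The full name of the created variable is the current variable scope name + the provided `name`, and a check is performed to ensure that no variable with this full name exists yet. If a variable with this full name already exists, the function raises a `ValueError`. If a new variable is created, it is initialized to the value `initializer(shape)`. For example:
```python
with tf.variable_scope("foo"):
    v = tf.get_variable("v", [1])
assert v.name == "foo/v:0"
```
* Case 2: the scope is set for reusing variables, as indicated by `tf.get_variable_scope().reuse == True`.
In this case, the call searches for an already existing variable whose full name equals the current variable scope name + the provided `name`. If no such variable exists, a `ValueError` is raised. If the variable is found, it is returned. For example:
```python
with tf.variable_scope("foo"):
    v = tf.get_variable("v", [1])
with tf.variable_scope("foo", reuse=True):
    v1 = tf.get_variable("v", [1])
assert v1 == v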
```
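The two cases can be mimicked in a few lines of plain Python. The sketch below is a toy model, not TensorFlow's implementation: the module-level `_store` dictionary, the explicit `scope_name` and `reuse` arguments, and the list used as a stand-in for a variable are all assumptions made for illustration.

```python
# Toy re-implementation of the two get_variable cases (not TensorFlow code).
_store = {}  # maps full variable name -> stored value


def get_variable(scope_name, name, reuse, initializer=lambda: 0.0):
    full_name = scope_name + "/" + name if scope_name else name
    if reuse:
        # Case 2: the variable must already exist and is returned as-is.
        if full_name not in _store:
            raise ValueError("Variable %s does not exist" % full_name)
        return _store[full_name]
    # Case 1: the variable must not exist yet and is created.
    if full_name in _store:
        raise ValueError("Variable %s already exists" % full_name)
    _store[full_name] = [initializer()]
    return _store[full_name]


v = get_variable("foo", "v", reuse=False)
v1 = get_variable("foo", "v", reuse=True)
assert v1 is v
```

Calling `get_variable("foo", "v", reuse=False)` a second time raises `ValueError` in this sketch, which corresponds to the accidental-sharing check described for Case 1.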
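### Basics of `tf.variable_scope()`
Knowing how `tf.get_variable()` works makes it easy to understand variable scope. The primary function of a variable scope is to carry a name that is used as a prefix for variable names, and a reuse flag that distinguishes the two cases described above. Nesting variable scopes appends their names in a way analogous to how directories work:
```python
with tf.variable_scope("foo"):
    with tf.variable_scope("bar"):
        v = tf.get_variable("v", [1])
assert v.name == "foo/bar/v:0"
```
The current variable scope can be retrieved using `tf.get_variable_scope()`, and its `reuse` flag can be set to `True` by calling `tf.get_variable_scope().reuse_variables()`:
```python
with tf.variable_scope("foo"):
    v = tf.get_variable("v", [1])
    tf.get_variable_scope().reuse_variables()
    v1 = tf.get_variable("v", [1])
assert v1 == v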
```
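Note that you *cannot* set the `reuse` flag to `False`. The reason is to allow composing functions that create models. Imagine you write a function `my_image_filter(inputs)` as before. Someone calling the function in a variable scope with `reuse=True` would expect all inner variables to be reused as well. Allowing `reuse=False` to be forced inside the function would break this contract and make it hard to share parameters in this way.
Even though you cannot set `reuse` to `False` explicitly, you can enter a reusing variable scope and then exit it, going back to a non-reusing one. This is done by passing `reuse=True` when opening a variable scope. Note also that, for the same reason as above, the `reuse` parameter is inherited. So when you open a reusing variable scope, all of its sub-scopes are reusing too.
```python
with tf.variable_scope("root"):
    # At start, the scope is not reusing.
    assert tf.get_variable_scope().reuse == False
    with tf.variable_scope("foo"):
        # Opened a sub-scope, still not reusing.
        assert tf.get_variable_scope().reuse == False
    with tf.variable_scope("foo", reuse=True):
        # Explicitly opened a reusing scope.
        assert tf.get_variable_scope().reuse == True
        with tf.variable_scope("bar"):
            # Now sub-scope inherits the reuse flag.
            assert tf.get_variable_scope().reuse == True
    # Exited the reusing scope, back to a non-reusing one.
    assert tf.get_variable_scope().reuse == False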
```
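The name-prefix and reuse-inheritance rules can be modelled with a simple stack. The following is only an illustrative sketch in plain Python; the `_scopes` list, the `(full_name, reuse)` tuples, and the helper `current_scope` are hypothetical and say nothing about how TensorFlow stores its scopes internally.

```python
import contextlib

# Toy scope stack: each entry is (full_name, reuse). Not TensorFlow code.
_scopes = [("", False)]


@contextlib.contextmanager
def variable_scope(name, reuse=None):
    parent_name, parent_reuse = _scopes[-1]
    full_name = parent_name + "/" + name if parent_name else name
    # reuse can only be switched on; otherwise it is inherited from the parent.
    _scopes.append((full_name, True if reuse else parent_reuse))
    try:
        yield _scopes[-1]
    finally:
        _scopes.pop()


def current_scope():
    return _scopes[-1]


with variable_scope("root"):
    with variable_scope("foo", reuse=True):
        with variable_scope("bar"):
            inner = current_scope()
    outer = current_scope()

print(inner)  # ('root/foo/bar', True)
print(outer)  # ('root', False)
```

Leaving the reusing scope pops it from the stack, which is why `outer` is back to a non-reusing scope even though nothing ever set `reuse` to `False`.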
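### Capturing variable scope
In all the examples above, we shared parameters only because their names agreed, that is, because we opened a reusing variable scope with exactly the same string. In more complex cases, it can be useful to pass a variable scope object rather than rely on getting the names right. To this end, variable scopes can be captured and used instead of names when opening a new variable scope.
```python
with tf.variable_scope("foo") as foo_scope:
    v = tf.get_variable("v", [1])
with tf.variable_scope(foo_scope):
    w = tf.get_variable("w", [1])
with tf.variable_scope(foo_scope, reuse=True):
    v1 = tf.get_variable("v", [1])
    w1 = tf.get_variable("w", [1])
assert v1 == v
assert w1 == w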
```
When opening a variable scope using a previously existing scope, we jump out of the current variable scope prefix to an entirely different one. This is fully independent of where we do it.
```python
with tf.variable_scope("foo") as foo_scope:
    assert foo_scope.name == "foo"
with tf.variable_scope("bar"):
    with tf.variable_scope("baz") as other_scope:
        assert other_scope.name == "bar/baz"
        with tf.variable_scope(foo_scope) as foo_scope2:
            assert foo_scope2.name == "foo"  # Not changed.
```
### Initializers in variable scope
Using `tf.get_variable()` lets you write functions that create or reuse variables and can be called transparently from outside. But what if we want to change the initializer of the created variables? Do we need to pass an extra argument to every function that creates variables? What about the most common case, when we want to set the default initializer for all variables in all functions in one place? To help with these cases, a variable scope can carry a default initializer. It is inherited by sub-scopes and passed to each `tf.get_variable()` call, but it is overridden if another initializer is specified explicitly.
```python
with tf.variable_scope("foo", initializer=tf.constant_initializer(0.4)):
    v = tf.get_variable("v", [1])
    assert v.eval() == 0.4  # Default initializer as set above.
    w = tf.get_variable("w", [1], initializer=tf.constant_initializer(0.3))
    assert w.eval() == 0.3  # Specific initializer overrides the default.
    with tf.variable_scope("bar"):
        v = tf.get_variable("v", [1])
        assert v.eval() == 0.4  # Inherited default initializer.
    with tf.variable_scope("baz", initializer=tf.constant_initializer(0.2)):
        v = tf.get_variable("v", [1])
        assert v.eval() == 0.2  # Changed default initializer.
```
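The lookup order for initializers (explicit argument first, then the nearest enclosing scope that sets a default) can be written down as a tiny function. This is a sketch under simplifying assumptions: `effective_initializer` is a made-up helper, and initializers are represented by their constant values rather than by real initializer objects.

```python
# Toy lookup of the effective initializer; names here are hypothetical.
def effective_initializer(scope_defaults, explicit=None):
    """scope_defaults lists each enclosing scope's default, outermost first
    (None where a scope sets no default of its own)."""
    if explicit is not None:
        return explicit          # an explicit initializer always wins
    for default in reversed(scope_defaults):
        if default is not None:
            return default       # nearest enclosing scope with a default
    return None                  # no default set anywhere


print(effective_initializer([0.4]))         # 0.4, default of "foo"
print(effective_initializer([0.4], 0.3))    # 0.3, explicit override
print(effective_initializer([0.4, None]))   # 0.4, "bar" inherits from "foo"
print(effective_initializer([0.4, 0.2]))    # 0.2, "baz" changes the default
```

The four calls correspond, in order, to the four assertions in the TensorFlow example for this section.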
### Names of ops in `tf.variable_scope()`
We have discussed how `tf.variable_scope` governs the names of variables. But how does it influence
the names of other ops in the scope? It is natural that ops created inside a variable scope should also share that name. For this reason, `with tf.variable_scope("name")` implicitly opens a `tf.name_scope("name")`. For example:
```python
with tf.variable_scope("foo"):
    x = 1.0 + tf.get_variable("v", [1])
assert x.op.name == "foo/add"
```
Name scopes can be opened in addition to a variable scope; they then affect only the names of ops, not the names of variables.
```python
with tf.variable_scope("foo"):
    with tf.name_scope("bar"):
        v = tf.get_variable("v", [1])
        x = 1.0 + v
assert v.name == "foo/v:0"
assert x.op.name == "foo/bar/add"
```
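When opening a variable scope using a captured object instead of a string, we do not alter the current name scope for ops.
## Examples of Use
Here are pointers to a few files that make use of variable scope. In particular, it is heavily used for
[recurrent neural networks](https://zh.wikipedia.org/wiki/%E9%80%92%E5%BD%92%E7%A5%9E%E7%BB%8F%E7%BD%91%E7%BB%9C) and `sequence-to-sequence` models.
File | What's in it?
--- | ---
`models/image/cifar10.py` | Model for detecting objects in images.
`models/rnn/rnn_cell.py` | Cell functions for recurrent neural networks.
`models/rnn/seq2seq.py` | Functions for building `sequence-to-sequence` models.
Original: [Sharing Variables](http://www.tensorflow.org/how_tos/variable_scope/index.md) Translation: [nb312](https://github.com/nb312)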
================================================
FILE: SOURCE/how_tos/variable_scope.md
================================================
# Sharing Variables
You can create, initialize, save, and load single variables in the way described in the [Variables HowTo](tensorflow-zh/SOURCE/how_tos/variables/index.md). But when building complex models you often need to share large sets of variables, and you may want to initialize all of them in one place. This tutorial shows how to do this using `tf.variable_scope()` and `tf.get_variable()`.
## The Problem
Imagine you create a simple model for image filters, similar to our [Convolutional Neural Networks Tutorial](tensorflow-zh/SOURCE/tutorials/deep_cnn/index.md) model but with only two convolutions (to keep the example simple). If you use just `tf.Variable`, as explained in the [Variables HowTo](tensorflow-zh/SOURCE/how_tos/variables/index.md), your model might look like this.
```python
def my_image_filter(input_images):
    conv1_weights = tf.Variable(tf.random_normal([5, 5, 32, 32]),
        name="conv1_weights")
    conv1_biases = tf.Variable(tf.zeros([32]), name="conv1_biases")
    conv1 = tf.nn.conv2d(input_images, conv1_weights,
        strides=[1, 1, 1, 1], padding='SAME')
    relu1 = tf.nn.relu(conv1 + conv1_biases)
    conv2_weights = tf.Variable(tf.random_normal([5, 5, 32, 32]),
        name="conv2_weights")
    conv2_biases = tf.Variable(tf.zeros([32]), name="conv2_biases")
    conv2 = tf.nn.conv2d(relu1, conv2_weights,
        strides=[1, 1, 1, 1], padding='SAME')
    return tf.nn.relu(conv2 + conv2_biases)
```
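As you can easily imagine, models quickly get much more complicated than this one, and even here we already have four different variables: `conv1_weights`, `conv1_biases`, `conv2_weights`, and `conv2_biases`.
The problem grows when you want to reuse this model. Assume you want to apply your image filter to two different images, `image1` and `image2`. You want both images processed by the same filter with the same parameters. You can call `my_image_filter()` twice, but this will create two sets of variables: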
```python
# First call creates one set of variables.
result1 = my_image_filter(image1)
# Another set is created in the second call.
result2 = my_image_filter(image2)
```
A common way to share variables is to create them in a separate piece of code and pass them to the functions that use them, for example by using a dictionary:
```python
variables_dict = {
    "conv1_weights": tf.Variable(tf.random_normal([5, 5, 32, 32]),
        name="conv1_weights"),
    "conv1_biases": tf.Variable(tf.zeros([32]), name="conv1_biases"),
    # ... etc. ...
}
def my_image_filter(input_images, variables_dict):
    conv1 = tf.nn.conv2d(input_images, variables_dict["conv1_weights"],
        strides=[1, 1, 1, 1], padding='SAME')
    relu1 = tf.nn.relu(conv1 + variables_dict["conv1_biases"])
    conv2 = tf.nn.conv2d(relu1, variables_dict["conv2_weights"],
        strides=[1, 1, 1, 1], padding='SAME')
    return tf.nn.relu(conv2 + variables_dict["conv2_biases"])
# The 2 calls to my_image_filter() now use the same variables
result1 = my_image_filter(image1, variables_dict)
result2 = my_image_filter(image2, variables_dict)
```
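While convenient, creating variables as above, outside of the model code, breaks encapsulation:
* The code that builds the graph must document the names, types, and shapes of the variables to create.
* When the code changes, the callers may have to create more, fewer, or different variables.
One way to address the problem is to use classes to create a model, where the classes take care of managing the variables they need.
For a lighter solution that does not involve classes, TensorFlow provides a *Variable Scope* mechanism that makes it easy to share named variables while constructing a graph.
## Variable Scope Example
The Variable Scope mechanism in TensorFlow consists of two main functions:
* `tf.get_variable(<name>, <shape>, <initializer>)`:
Creates or returns a variable with a given name.
* `tf.variable_scope(<scope_name>)`:
Manages namespaces for names passed to `tf.get_variable()`.
The function `tf.get_variable()` is used to get or create a variable instead of a direct call to `tf.Variable`. It uses an *initializer* instead of passing the value directly, as `tf.Variable` does. An initializer is a function that takes a shape and provides a tensor with that shape. Here are some initializers available in TensorFlow:
* `tf.constant_initializer(value)` initializes everything to the provided value,
* `tf.random_uniform_initializer(a, b)` initializes uniformly from [a, b],
* `tf.random_normal_initializer(mean, stddev)` initializes from the normal distribution with the given mean and standard deviation.
To see how `tf.get_variable()` solves the problem discussed above, let's refactor the code that created one convolution into a separate function, named `conv_relu`: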
```python
def conv_relu(input, kernel_shape, bias_shape):
    # Create variable named "weights".
    weights = tf.get_variable("weights", kernel_shape,
        initializer=tf.random_normal_initializer())
    # Create variable named "biases".
    biases = tf.get_variable("biases", bias_shape,
        initializer=tf.constant_initializer(0.0))
    conv = tf.nn.conv2d(input, weights,
        strides=[1, 1, 1, 1], padding='SAME')
    return tf.nn.relu(conv + biases)
```
This function uses the short names `"weights"` and `"biases"`. We would like to use it for both `conv1` and `conv2`, but the variables need to have different names. This is where `tf.variable_scope()` comes into play: it pushes a namespace for the variables.
```python
def my_image_filter(input_images):
    with tf.variable_scope("conv1"):
        # Variables created here will be named "conv1/weights", "conv1/biases".
        relu1 = conv_relu(input_images, [5, 5, 32, 32], [32])
    with tf.variable_scope("conv2"):
        # Variables created here will be named "conv2/weights", "conv2/biases".
        return conv_relu(relu1, [5, 5, 32, 32], [32])
```
Now, let's see what happens when we call `my_image_filter()` twice.
```
result1 = my_image_filter(image1)
result2 = my_image_filter(image2)
# Raises ValueError(... conv1/weights already exists ...)
```
As you can see, `tf.get_variable()` checks that already existing variables are not shared by accident. If you want to share them, you need to say so by calling `reuse_variables()` as follows.
```
with tf.variable_scope("image_filters") as scope:
    result1 = my_image_filter(image1)
    scope.reuse_variables()
    result2 = my_image_filter(image2)
```
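This is a good way to share variables: lightweight and safe.
## How Does Variable Scope Work?
### Understanding `tf.get_variable()`
To understand variable scope, it is necessary to first fully understand how `tf.get_variable()` works.
Here is how `tf.get_variable` is usually called.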
```python
v = tf.get_variable(name, shape, dtype, initializer)
```
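The call does one of two things, depending on the scope it is called in. Here are the two cases.
* Case 1: the scope is set for creating new variables, as indicated by `tf.get_variable_scope().reuse == False`.
In this case, `v` will be a newly created `tf.Variable` with the provided shape and data type. The full name of the created variable is the current variable scope name + the provided `name`, and a check is performed to ensure that no variable with this full name exists yet. If a variable with this full name already exists, the function raises a `ValueError`. If a new variable is created, it is initialized to the value `initializer(shape)`. For example:
```python
with tf.variable_scope("foo"):
    v = tf.get_variable("v", [1])
assert v.name == "foo/v:0"
```
* Case 2: the scope is set for reusing variables, as indicated by `tf.get_variable_scope().reuse == True`.
In this case, the call searches for an already existing variable whose full name equals the current variable scope name + the provided `name`. If no such variable exists, a `ValueError` is raised. If the variable is found, it is returned. For example:
```python
with tf.variable_scope("foo"):
    v = tf.get_variable("v", [1])
with tf.variable_scope("foo", reuse=True):
    v1 = tf.get_variable("v", [1])
assert v1 == v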
```
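### Basics of `tf.variable_scope()`
Knowing how `tf.get_variable()` works makes it easy to understand variable scope. The primary function of a variable scope is to carry a name that is used as a prefix for variable names, and a reuse flag that distinguishes the two cases described above. Nesting variable scopes appends their names in a way analogous to how directories work:
```python
with tf.variable_scope("foo"):
    with tf.variable_scope("bar"):
        v = tf.get_variable("v", [1])
assert v.name == "foo/bar/v:0"
```
The current variable scope can be retrieved using `tf.get_variable_scope()`, and its `reuse` flag can be set to `True` by calling `tf.get_variable_scope().reuse_variables()`:
```python
with tf.variable_scope("foo"):
    v = tf.get_variable("v", [1])
    tf.get_variable_scope().reuse_variables()
    v1 = tf.get_variable("v", [1])
assert v1 == v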
```
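Note that you *cannot* set the `reuse` flag to `False`. The reason is to allow composing functions that create models. Imagine you write a function `my_image_filter(inputs)` as before. Someone calling the function in a variable scope with `reuse=True` would expect all inner variables to be reused as well. Allowing `reuse=False` to be forced inside the function would break this contract and make it hard to share parameters in this way.
Even though you cannot set `reuse` to `False` explicitly, you can enter a reusing variable scope and then exit it, going back to a non-reusing one. This is done by passing `reuse=True` when opening a variable scope. Note also that, for the same reason as above, the `reuse` parameter is inherited. So when you open a reusing variable scope, all of its sub-scopes are reusing too.
```python
with tf.variable_scope("root"):
    # At start, the scope is not reusing.
    assert tf.get_variable_scope().reuse == False
    with tf.variable_scope("foo"):
        # Opened a sub-scope, still not reusing.
        assert tf.get_variable_scope().reuse == False
    with tf.variable_scope("foo", reuse=True):
        # Explicitly opened a reusing scope.
        assert tf.get_variable_scope().reuse == True
        with tf.variable_scope("bar"):
            # Now sub-scope inherits the reuse flag.
            assert tf.get_variable_scope().reuse == True
    # Exited the reusing scope, back to a non-reusing one.
    assert tf.get_variable_scope().reuse == False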
```
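### Capturing variable scope
In all the examples above, we shared parameters only because their names agreed, that is, because we opened a reusing variable scope with exactly the same string. In more complex cases, it can be useful to pass a variable scope object rather than rely on getting the names right. To this end, variable scopes can be captured and used instead of names when opening a new variable scope.
```python
with tf.variable_scope("foo") as foo_scope:
    v = tf.get_variable("v", [1])
with tf.variable_scope(foo_scope):
    w = tf.get_variable("w", [1])
with tf.variable_scope(foo_scope, reuse=True):
    v1 = tf.get_variable("v", [1])
    w1 = tf.get_variable("w", [1])
assert v1 == v
assert w1 == w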
```
When opening a variable scope using a previously existing scope, we jump out of the current variable scope prefix to an entirely different one. This is fully independent of where we do it.
```python
with tf.variable_scope("foo") as foo_scope:
    assert foo_scope.name == "foo"
with tf.variable_scope("bar"):
    with tf.variable_scope("baz") as other_scope:
        assert other_scope.name == "bar/baz"
        with tf.variable_scope(foo_scope) as foo_scope2:
            assert foo_scope2.name == "foo"  # Not changed.
```
### Initializers in variable scope
Using `tf.get_variable()` lets you write functions that create or reuse variables and can be called transparently from outside. But what if we want to change the initializer of the created variables? Do we need to pass an extra argument to every function that creates variables? What about the most common case, when we want to set the default initializer for all variables in all functions in one place? To help with these cases, a variable scope can carry a default initializer. It is inherited by sub-scopes and passed to each `tf.get_variable()` call, but it is overridden if another initializer is specified explicitly.
```python
with tf.variable_scope("foo", initializer=tf.constant_initializer(0.4)):
    v = tf.get_variable("v", [1])
    assert v.eval() == 0.4  # Default initializer as set above.
    w = tf.get_variable("w", [1], initializer=tf.constant_initializer(0.3))
    assert w.eval() == 0.3  # Specific initializer overrides the default.
    with tf.variable_scope("bar"):
        v = tf.get_variable("v", [1])
        assert v.eval() == 0.4  # Inherited default initializer.
    with tf.variable_scope("baz", initializer=tf.constant_initializer(0.2)):
        v = tf.get_variable("v", [1])
        assert v.eval() == 0.2  # Changed default initializer.
```
### Names of ops in `tf.variable_scope()`
We have discussed how `tf.variable_scope` governs the names of variables. But how does it influence
the names of other ops in the scope? It is natural that ops created inside a variable scope should also share that name. For this reason, `with tf.variable_scope("name")` implicitly opens a `tf.name_scope("name")`. For example:
```python
with tf.variable_scope("foo"):
    x = 1.0 + tf.get_variable("v", [1])
assert x.op.name == "foo/add"
```
Name scopes can be opened in addition to a variable scope; they then affect only the names of ops, not the names of variables.
```python
with tf.variable_scope("foo"):
    with tf.name_scope("bar"):
        v = tf.get_variable("v", [1])
        x = 1.0 + v
assert v.name == "foo/v:0"
assert x.op.name == "foo/bar/add"
```
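When opening a variable scope using a captured object instead of a string, we do not alter the current name scope for ops.
## Examples of Use
Here are pointers to a few files that make use of variable scope. In particular, it is heavily used for
[recurrent neural networks](https://zh.wikipedia.org/wiki/%E9%80%92%E5%BD%92%E7%A5%9E%E7%BB%8F%E7%BD%91%E7%BB%9C) and `sequence-to-sequence` models.
File | What's in it?
--- | ---
`models/image/cifar10.py` | Model for detecting objects in images.
`models/rnn/rnn_cell.py` | Cell functions for recurrent neural networks.
`models/rnn/seq2seq.py` | Functions for building `sequence-to-sequence` models.
Original: [Sharing Variables](http://www.tensorflow.org/how_tos/variable_scope/index.md) Translation: [nb312](https://github.com/nb312) Proofreading: [Wiki](https://github.com/jikexueyuanwiki)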
================================================
FILE: SOURCE/how_tos/variables/index.md
================================================
# Variables: Creation, Initialization, Saving, and Loading
When you train a model, you use [variables](../../api_docs/python/state_ops.md)
to hold and update parameters. Variables are in-memory buffers containing
tensors. They must be explicitly initialized and can be saved to disk during
and after training. You can later restore saved values to exercise or analyse
the model.
This document references the following TensorFlow classes. Follow the links to
their reference manual for a complete description of their API:
* The [`tf.Variable`](../../api_docs/python/state_ops.md#Variable) class.
* The [`tf.train.Saver`](../../api_docs/python/state_ops.md#Saver) class.
## Creation
When you create a [Variable](../../api_docs/python/state_ops.md) you pass a
`Tensor` as its initial value to the `Variable()` constructor. TensorFlow
provides a collection of ops that produce tensors often used for initialization
from [constants or random values](../../api_docs/python/constant_op.md).
Note that all these ops require you to specify the shape of the tensors. That
shape automatically becomes the shape of the variable. Variables generally
have a fixed shape, but TensorFlow provides advanced mechanisms to reshape
variables.
```python
# Create two variables.
weights = tf.Variable(tf.random_normal([784, 200], stddev=0.35),
name="weights")
biases = tf.Variable(tf.zeros([200]), name="biases")
```
Calling `tf.Variable()` adds several ops to the graph:
* A `variable` op that holds the variable value.
* An initializer op that sets the variable to its initial value. This is
actually a `tf.assign` op.
* The ops for the initial value, such as the `zeros` op for the `biases`
variable in the example are also added to the graph.
The value returned by `tf.Variable()` is an instance of the Python class
`tf.Variable`.
## Initialization
Variable initializers must be run explicitly before other ops in your model can
be run. The easiest way to do that is to add an op that runs all the variable
initializers, and run that op before using the model.
You can alternatively restore variable values from a checkpoint file, see
below.
Use `tf.initialize_all_variables()` to add an op to run variable initializers.
Only run that op after you have fully constructed your model and launched it in
a session.
```python
# Create two variables.
weights = tf.Variable(tf.random_normal([784, 200], stddev=0.35),
name="weights")
biases = tf.Variable(tf.zeros([200]), name="biases")
...
# Add an op to initialize the variables.
init_op = tf.initialize_all_variables()
# Later, when launching the model
with tf.Session() as sess:
  # Run the init operation.
  sess.run(init_op)
  ...
  # Use the model
  ...
```
### Initialization from another Variable
You sometimes need to initialize a variable from the initial value of another
variable. As the op added by `tf.initialize_all_variables()` initializes all
variables in parallel you have to be careful when this is needed.
To initialize a new variable from the value of another variable, use the other
variable's `initialized_value()` method. You can use the initialized value
directly as the initial value for the new variable, or you can use it like any
other tensor to compute a value for the new variable.
```python
# Create a variable with a random value.
weights = tf.Variable(tf.random_normal([784, 200], stddev=0.35),
name="weights")
# Create another variable with the same value as 'weights'.
w2 = tf.Variable(weights.initialized_value(), name="w2")
# Create another variable with twice the value of 'weights'
w_twice = tf.Variable(weights.initialized_value() * 2.0, name="w_twice")
```
### Custom Initialization
The convenience function `tf.initialize_all_variables()` adds an op to
initialize *all variables* in the model. You can also pass it an explicit list
of variables to initialize. See the
[Variables Documentation](../../api_docs/python/state_ops.md) for more options,
including checking if variables are initialized.
## Saving and Restoring
The easiest way to save and restore a model is to use a `tf.train.Saver` object.
The constructor adds `save` and `restore` ops to the graph for all, or a
specified list, of the variables in the graph. The saver object provides
methods to run these ops, specifying paths for the checkpoint files to write to
or read from.
### Checkpoint Files
Variables are saved in binary files that, roughly, contain a map from variable
names to tensor values.
When you create a `Saver` object, you can optionally choose names for the
variables in the checkpoint files. By default, it uses the value of the
[`Variable.name`](../../api_docs/python/state_ops.md#Variable.name) property for
each variable.
### Saving Variables
Create a `Saver` with `tf.train.Saver()` to manage all variables in
the model.
```python
# Create some variables.
v1 = tf.Variable(..., name="v1")
v2 = tf.Variable(..., name="v2")
...
# Add an op to initialize the variables.
init_op = tf.initialize_all_variables()
# Add ops to save and restore all the variables.
saver = tf.train.Saver()
# Later, launch the model, initialize the variables, do some work, save the
# variables to disk.
with tf.Session() as sess:
  sess.run(init_op)
  # Do some work with the model.
  ...
  # Save the variables to disk.
  save_path = saver.save(sess, "/tmp/model.ckpt")
  print("Model saved in file: %s" % save_path)
```
### Restoring Variables
The same `Saver` object is used to restore variables. Note that when you
restore variables from a file you do not have to initialize them beforehand.
```python
# Create some variables.
v1 = tf.Variable(..., name="v1")
v2 = tf.Variable(..., name="v2")
...
# Add ops to save and restore all the variables.
saver = tf.train.Saver()
# Later, launch the model, use the saver to restore variables from disk, and
# do some work with the model.
with tf.Session() as sess:
  # Restore variables from disk.
  saver.restore(sess, "/tmp/model.ckpt")
  print("Model restored.")
  # Do some work with the model
  ...
```
### Choosing which Variables to Save and Restore
If you do not pass any argument to `tf.train.Saver()` the saver handles all
variables in the graph. Each one of them is saved under the name that was
passed when the variable was created.
It is sometimes useful to explicitly specify names for variables in the
checkpoint files. For example, you may have trained a model with a variable
named `"weights"` whose value you want to restore in a new variable named
`"params"`.
It is also sometimes useful to only save or restore a subset of the variables
used by a model. For example, you may have trained a neural net with 5 layers,
and you now want to train a new model with 6 layers, restoring the parameters
from the 5 layers of the previously trained model into the first 5 layers of
the new model.
You can easily specify the names and variables to save by passing to the
`tf.train.Saver()` constructor a Python dictionary: keys are the
names to use, values are the variables to manage.
Notes:
* You can create as many saver objects as you want if you need to save and
restore different subsets of the model variables. The same variable can be
listed in multiple saver objects; its value is only changed when the saver
`restore()` method is run.
* If you only restore a subset of the model variables at the start
of a session, you have to run an initialize op for the other variables. See
[`tf.initialize_variables()`](../../api_docs/python/state_ops.md#initialize_variables)
for more information.
```python
# Create some variables.
v1 = tf.Variable(..., name="v1")
v2 = tf.Variable(..., name="v2")
...
# Add ops to save and restore only 'v2' using the name "my_v2"
saver = tf.train.Saver({"my_v2": v2})
# Use the saver object normally after that.
...
```
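The idea that a checkpoint is roughly a map from names to values, and that the dictionary passed to the saver decides which name each variable is stored under, can be illustrated without TensorFlow. The following is a minimal conceptual sketch, not the `tf.train.Saver` implementation: `ToySaver`, the one-element lists standing in for variables, and the pickle file format are all assumptions made for illustration.

```python
import os
import pickle
import tempfile


class ToySaver(object):
    """Hypothetical stand-in for a saver; NOT how tf.train.Saver is implemented."""

    def __init__(self, names_to_variables):
        # Keys are the names used in the checkpoint file; values are
        # one-element lists acting as mutable "variables".
        self._names_to_variables = dict(names_to_variables)

    def save(self, path):
        # The checkpoint is a plain map from name to current value.
        checkpoint = {}
        for name, var in self._names_to_variables.items():
            checkpoint[name] = var[0]
        with open(path, "wb") as f:
            pickle.dump(checkpoint, f)
        return path

    def restore(self, path):
        with open(path, "rb") as f:
            checkpoint = pickle.load(f)
        # Only the variables managed by this saver are overwritten.
        for name, var in self._names_to_variables.items():
            var[0] = checkpoint[name]


path = os.path.join(tempfile.mkdtemp(), "toy.ckpt")

# First program: the value is saved under the checkpoint name "weights".
weights = [[0.1, 0.2, 0.3]]
ToySaver({"weights": weights}).save(path)

# Second program: restore the value stored as "weights" into `params`.
params = [None]
ToySaver({"weights": params}).restore(path)
print(params[0])
```

The second saver mirrors the `"weights"` to `"params"` scenario described above: the key selects the entry in the file, and the value selects which in-memory object receives it.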
================================================
FILE: SOURCE/how_tos/variables.md
================================================
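# Variables: Creation, Initialization, Saving, and Loading
When you train a model, you use [variables](https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/api_docs/python/state_ops.md) to store and update parameters. Variables are in-memory buffers containing tensors. They must be explicitly initialized, and they can be saved to disk during and after training. You can later restore the saved values to exercise or analyze the model.
This document references the following two TensorFlow classes. Follow the links for a complete description of their API:
- The [`tf.Variable`](https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/api_docs/python/state_ops.md#Variable) class
- The [`tf.train.Saver`](https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/api_docs/python/state_ops.md#Saver) class
## Creation
When you create a [variable](https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/api_docs/python/state_ops.md), you pass a `Tensor` as its initial value to the `Variable()` constructor. TensorFlow provides a collection of ops that produce tensors for initialization from [constants or random values](https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/api_docs/python/constant_op.md).
Note that all these ops require you to specify the shape of the tensors. That shape automatically becomes the shape of the variable. Variables generally have a fixed shape, but TensorFlow provides advanced mechanisms to reshape variables.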
```python
# Create two variables.
weights = tf.Variable(tf.random_normal([784, 200], stddev=0.35),
name="weights")
biases = tf.Variable(tf.zeros([200]), name="biases")
```
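Calling `tf.Variable()` adds several ops to the graph:
- A `Variable` op that holds the variable value.
- An initializer op that sets the variable to its initial value. This is actually a `tf.assign` op.
- The ops for the initial value, such as the `zeros` op for the `biases` variable in the example, are also added to the graph.
The value returned by `tf.Variable()` is an instance of the Python class `tf.Variable`.
## Initialization
Variable initializers must be run explicitly before other ops in your model can be run. The easiest way to do that is to add an op that runs all the variable initializers, and to run that op before using the model.
Alternatively, you can restore variable values from a checkpoint file; see below.
Use `tf.initialize_all_variables()` to add an op that runs the variable initializers. Only run that op after you have fully constructed your model and launched it in a session.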
```python
# Create two variables.
weights = tf.Variable(tf.random_normal([784, 200], stddev=0.35),
name="weights")
biases = tf.Variable(tf.zeros([200]), name="biases")
...
# Add an op to initialize the variables.
init_op = tf.initialize_all_variables()
# Later, when launching the model
with tf.Session() as sess:
  # Run the init operation.
  sess.run(init_op)
  ...
  # Use the model
  ...
```
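### Initialization from Another Variable
You sometimes need to initialize a variable from the initial value of another variable. Because the op added by `tf.initialize_all_variables()` initializes all variables in parallel, you have to be careful when this is needed.
To initialize a new variable from the value of another variable, use the other variable's `initialized_value()` method. You can use the initialized value directly as the initial value for the new variable, or you can use it as any other tensor to compute a value for the new variable.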
```python
# Create a variable with a random value.
weights = tf.Variable(tf.random_normal([784, 200], stddev=0.35),
name="weights")
# Create another variable with the same value as 'weights'.
w2 = tf.Variable(weights.initialized_value(), name="w2")
# Create another variable with twice the value of 'weights'.
w_twice = tf.Variable(weights.initialized_value() * 2.0, name="w_twice")
```
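### Custom Initialization
The convenience function `tf.initialize_all_variables()` adds an op to initialize *all variables* in the model. To initialize only an explicit list of variables, use `tf.initialize_variables()` instead. See the [Variables Documentation](https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/api_docs/python/state_ops.md) for more options, including checking whether variables are initialized.
## Saving and Restoring
The easiest way to save and restore a model is to use a `tf.train.Saver` object. The constructor adds `save` and `restore` ops to the graph for all, or a specified list, of the variables in the graph. The saver object provides methods to run these ops, specifying paths for the checkpoint files to write to or read from.
### Checkpoint Files
Variables are saved in binary files that, roughly, contain a map from variable names to tensor values.
When you create a `Saver` object, you can optionally choose names for the variables in the checkpoint files. By default, it uses the value of the [`Variable.name`](https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/api_docs/python/state_ops.md#Variable.name) property for each variable.
### Saving Variables
Create a `Saver` with `tf.train.Saver()` to manage all variables in the model.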
```python
# Create some variables.
v1 = tf.Variable(..., name="v1")
v2 = tf.Variable(..., name="v2")
...
# Add an op to initialize the variables.
init_op = tf.initialize_all_variables()
# Add ops to save and restore all the variables.
saver = tf.train.Saver()
# Later, launch the model, initialize the variables, do some work, save the
# variables to disk.
with tf.Session() as sess:
  sess.run(init_op)
  # Do some work with the model.
  ...
  # Save the variables to disk.
  save_path = saver.save(sess, "/tmp/model.ckpt")
  print("Model saved in file: %s" % save_path)
```
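### Restoring Variables
The same `Saver` object is used to restore variables. Note that when you restore variables from a file you do not have to initialize them beforehand.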
```python
# Create some variables.
v1 = tf.Variable(..., name="v1")
v2 = tf.Variable(..., name="v2")
...
# Add ops to save and restore all the variables.
saver = tf.train.Saver()
# Later, launch the model, use the saver to restore variables from disk, and
# do some work with the model.
with tf.Session() as sess:
  # Restore variables from disk.
  saver.restore(sess, "/tmp/model.ckpt")
  print("Model restored.")
  # Do some work with the model
  ...
```
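### Choosing which Variables to Save and Restore
If you do not pass any argument to `tf.train.Saver()`, the saver handles all variables in the graph. Each one of them is saved under the name that was passed when the variable was created.
It is sometimes useful to explicitly specify names for variables in the checkpoint files. For example, you may have trained a model with a variable named `"weights"` whose value you want to restore in a new variable named `"params"`.
It is also sometimes useful to only save or restore a subset of the variables used by a model. For example, you may have trained a neural net with 5 layers, and you now want to train a new model with 6 layers, restoring the parameters from the 5 layers of the previously trained model into the first 5 layers of the new model.
You can easily specify the names and variables to save by passing a Python dictionary to the `tf.train.Saver()` constructor: keys are the names to use, values are the variables to manage.
Notes:
- You can create as many saver objects as you want if you need to save and restore different subsets of the model variables. The same variable can be listed in multiple saver objects; its value is only changed when the saver `restore()` method is run.
- If you only restore a subset of the model variables at the start of a session, you have to run an initialize op for the other variables. See [`tf.initialize_variables()`](https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/api_docs/python/state_ops.md#initialize_variables) for more information.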
```python
# Create some variables.
v1 = tf.Variable(..., name="v1")
v2 = tf.Variable(..., name="v2")
...
# Add ops to save and restore only 'v2' using the name "my_v2"
saver = tf.train.Saver({"my_v2": v2})
# Use the saver object normally after that.
...
```
> Original: [http://tensorflow.org/how_tos/variables/index.html](http://tensorflow.org/how_tos/variables/index.html)
> Translation: [赵屹华](https://github.com/zhyhooo) Proofreading: [Wiki](https://github.com/jikexueyuanwiki)
================================================
FILE: SOURCE/images/getting_started.dot
================================================
digraph Dependencies {
node [shape = oval];
"predictions: MatMul()" -> "data: Concat()"
"data: Concat()" -> data_left
"data: Concat()" -> data_right
"predictions: MatMul()" -> "weight_matrix: Reshape()"
"weight_matrix: Reshape()" -> "new_weights: Add()"
"new_weights: Add()" -> weights
"new_weights: Add()" -> deltas
"update: Assign()" -> weights
"update: Assign()" -> "new_weights: Add()"
"InitializeAllVariables()" -> weights
"InitializeAllVariables()" -> init_value
}
================================================
FILE: SOURCE/index.md
================================================
# TensorFlow
## Introduction
TensorFlow™ is an open source software library for numerical computation
using data flow graphs. Nodes in the graph represent mathematical operations,
while the graph edges represent the multidimensional data arrays (tensors) that
flow between them. This flexible architecture allows you to deploy computation
to one or more CPUs or GPUs in a desktop, server, or mobile device without
rewriting code. TensorFlow was originally developed by researchers and
engineers working on the Google Brain team within Google's Machine Intelligence
research organization for the purposes of conducting machine learning and deep
neural networks research. The system is general enough to be applicable in a
wide variety of other domains as well. The following documents show you how
to set up and use the TensorFlow system.
## Table of Contents
================================================
FILE: SOURCE/personal.md
================================================
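## Personal Notes on Learning TensorFlow
Note: the links in this chapter are contributors' personal notes on learning TensorFlow, many of them blog posts, intended to help readers better understand the official documentation from a practical perspective.
Example:
- [My TensorFlow learning blog](http://wiki.jikexueyuan.com)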
================================================
FILE: SOURCE/resource.md
================================================
## TensorFlow Resources
- [Official Google blog post announcing the open-sourcing of TensorFlow](https://googleblog.blogspot.com/2015/11/tensorflow-smarter-machine-learning-for.html)
- [TensorFlow white paper (PDF download)](http://download.tensorflow.org/paper/whitepaper2015.pdf)
- [Jeff Dean introduces TensorFlow (video)](https://www.youtube.com/watch?v=90-S1M7Ny_o&t=21m2s)
- [Scikit Flow, a simplified interface for TensorFlow](https://github.com/google/skflow)
- [TensorFlow usage examples](https://github.com/aymericdamien/TensorFlow-Examples)
- [Comparison of TensorFlow with mxnet and caffe, by @chenrudan](http://chenrudan.github.io/blog/2015/11/18/comparethreeopenlib.html)
- [Comparison of memory usage and performance of TensorFlow with torch7, caffe, and mxnet, by @mli](https://github.com/dmlc/mxnet/blob/master/doc/overview_chn.md)
================================================
FILE: SOURCE/resources/bib.md
================================================
# BibTeX Citation
If you use TensorFlow in your research and would like to cite the TensorFlow system, we suggest you cite the white paper.
```
@misc{tensorflow2015-whitepaper,
title={{TensorFlow}: Large-Scale Machine Learning on Heterogeneous Systems},
url={http://tensorflow.org/},
note={Software available from tensorflow.org},
author={
Mart\'{\i}n~Abadi and
Ashish~Agarwal and
Paul~Barham and
Eugene~Brevdo and
Zhifeng~Chen and
Craig~Citro and
Greg~S.~Corrado and
Andy~Davis and
Jeffrey~Dean and
Matthieu~Devin and
Sanjay~Ghemawat and
Ian~Goodfellow and
Andrew~Harp and
Geoffrey~Irving and
Michael~Isard and
Yangqing~Jia and
Rafal~Jozefowicz and
Lukasz~Kaiser and
Manjunath~Kudlur and
Josh~Levenberg and
Dan~Man\'{e} and
Rajat~Monga and
Sherry~Moore and
Derek~Murray and
Chris~Olah and
Mike~Schuster and
Jonathon~Shlens and
Benoit~Steiner and
Ilya~Sutskever and
Kunal~Talwar and
Paul~Tucker and
Vincent~Vanhoucke and
Vijay~Vasudevan and
Fernanda~Vi\'{e}gas and
Oriol~Vinyals and
Pete~Warden and
Martin~Wattenberg and
Martin~Wicke and
Yuan~Yu and
Xiaoqiang~Zheng},
year={2015},
}
```
In textual form:
```
Martín Abadi, Ashish Agarwal, Paul Barham, Eugene Brevdo,
Zhifeng Chen, Craig Citro, Greg S. Corrado, Andy Davis,
Jeffrey Dean, Matthieu Devin, Sanjay Ghemawat, Ian Goodfellow,
Andrew Harp, Geoffrey Irving, Michael Isard, Rafal Jozefowicz, Yangqing Jia,
Lukasz Kaiser, Manjunath Kudlur, Josh Levenberg, Dan Mané, Mike Schuster,
Rajat Monga, Sherry Moore, Derek Murray, Chris Olah, Jonathon Shlens,
Benoit Steiner, Ilya Sutskever, Kunal Talwar, Paul Tucker,
Vincent Vanhoucke, Vijay Vasudevan, Fernanda Viégas,
Oriol Vinyals, Pete Warden, Martin Wattenberg, Martin Wicke,
Yuan Yu, and Xiaoqiang Zheng.
TensorFlow: Large-scale machine learning on heterogeneous systems,
2015. Software available from tensorflow.org.
```
> Original: http://tensorflow.org/resources/bib.md Translation: [Jim-Zenn](https://github.com/Jim-Zenn) Proofreading: [Wiki](https://github.com/jikexueyuanwiki)
================================================
FILE: SOURCE/resources/dims_types.md
================================================
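# Tensor Ranks, Shapes, and Types
TensorFlow programs use a tensor data structure to represent all data. You can think of a TensorFlow tensor as an n-dimensional array or list. A tensor has a static type and dynamic dimensions. Tensors are what is passed between nodes in the computation graph.
## Rank
In the TensorFlow system, tensors are described by a unit of dimensionality known as *rank*. Tensor rank is not the same as matrix rank. Tensor rank (sometimes referred to as *order*, *degree*, or *n-dimension*) is the number of dimensions of the tensor. For example, the following tensor (defined as a Python list) has a rank of 2: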
```
t = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
```
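A rank two tensor is what we typically think of as a matrix, and a rank one tensor is a vector. For a rank two tensor you can access any element with the syntax `t[i, j]`. For a rank three tensor you would address an element with `t[i, j, k]`.
Rank | Math entity | Python example
--- | --- | ---
0 | Scalar (magnitude only) | `s = 483`
1 | Vector (magnitude and direction) | `v = [1.1, 2.2, 3.3]`
2 | Matrix (table of numbers) | `m = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]`
3 | 3-Tensor (cube of numbers) | `t = [[[2], [4], [6]], [[8], [10], [12]], [[14], [16], [18]]]`
n | n-Tensor (you get the idea) | `....`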
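## Shape
The TensorFlow documentation uses three notational conventions to describe tensor dimensionality: rank, shape, and dimension number. The following table shows how these relate to one another:
Rank | Shape | Dimension number | Example
--- | --- | --- | ---
0 | [] | 0-D | A 0-D tensor. A scalar.
1 | [D0] | 1-D | A 1-D tensor with shape [5].
2 | [D0, D1] | 2-D | A 2-D tensor with shape [3, 4].
3 | [D0, D1, D2] | 3-D | A 3-D tensor with shape [1, 4, 3].
n | [D0, D1, ... Dn-1] | n-D | A tensor with shape [D0, D1, ... Dn-1].
Shapes can be represented via Python lists or tuples of ints, or with the [`TensorShape` class](../api_docs/python/framework.md#TensorShape).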
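The relationship between rank and shape in the tables above can be checked on plain Python nested lists. This is a minimal sketch that does not use TensorFlow; `shape_of` and `rank_of` are hypothetical helper names introduced only for illustration.

```python
def shape_of(value):
    """Return the shape of a rectangular nested list as a tuple of ints."""
    if not isinstance(value, (list, tuple)):
        return ()  # a scalar has rank 0 and an empty shape
    if len(value) == 0:
        return (0,)
    inner = shape_of(value[0])
    for item in value[1:]:
        if shape_of(item) != inner:
            raise ValueError("ragged nested list has no well-defined shape")
    return (len(value),) + inner


def rank_of(value):
    """The rank is the number of dimensions, i.e. the length of the shape."""
    return len(shape_of(value))


s = 483
v = [1.1, 2.2, 3.3]
m = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
t = [[[2], [4], [6]], [[8], [10], [12]], [[14], [16], [18]]]

for value in (s, v, m, t):
    print(rank_of(value), shape_of(value))
```

The four values are the Python examples from the rank table, so the printed ranks run from 0 to 3.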
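## Data types
In addition to dimensionality, tensors have a data type. You can assign any one of the following data types to a tensor:
Data type | Python type | Description
--- | --- | ---
`DT_FLOAT` | `tf.float32` | 32-bit floating point.
`DT_DOUBLE` | `tf.float64` | 64-bit floating point.
`DT_INT64` | `tf.int64` | 64-bit signed integer.
`DT_INT32` | `tf.int32` | 32-bit signed integer.
`DT_INT16` | `tf.int16` | 16-bit signed integer.
`DT_INT8` | `tf.int8` | 8-bit signed integer.
`DT_UINT8` | `tf.uint8` | 8-bit unsigned integer.
`DT_STRING` | `tf.string` | Variable-length byte array. Each element of a tensor is a byte array.
`DT_BOOL` | `tf.bool` | Boolean.
`DT_COMPLEX64` | `tf.complex64` | Complex number made of two 32-bit floating point numbers: real and imaginary parts.
`DT_QINT32` | `tf.qint32` | 32-bit signed integer used in quantized ops.
`DT_QINT8` | `tf.qint8` | 8-bit signed integer used in quantized ops.
`DT_QUINT8` | `tf.quint8` | 8-bit unsigned integer used in quantized ops.
Original: [Tensor Ranks, Shapes, and Types](http://www.tensorflow.org/resources/dims_types.md)
Translation: [nb312](https://github.com/nb312)
Proofreading: [lonlonago]( https://github.com/lonlonago)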
================================================
FILE: SOURCE/resources/faq.md
================================================
# Frequently Asked Questions
This document provides answers to some of the frequently asked questions about TensorFlow. If you have a question that is not covered here, you might find an answer in the [community resources](tensorflow-zh/SOURCE/resoreces/index.md).
## Contents
### [Frequently Asked Questions](#AUTOGENERATED-frequently-asked-questions)
* [Building a TensorFlow graph](#AUTOGENERATED-building-a-tensorflow-graph)
* [Running a TensorFlow computation](#AUTOGENERATED-running-a-tensorflow-computation)
* [Variables](#AUTOGENERATED-variables)
* [Tensor shapes](#AUTOGENERATED-tensor-shapes)
* [TensorBoard](#AUTOGENERATED-tensorboard)
* [Extending TensorFlow](#AUTOGENERATED-extending-tensorflow)
* [Miscellaneous](#AUTOGENERATED-miscellaneous)
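## Building a TensorFlow graph
See also the
[API documentation on building graphs](../api_docs/python/framework.md).
#### Why does `c = tf.matmul(a, b)` not execute the matrix multiplication immediately?
In the TensorFlow Python API, `a`, `b`, and `c` are
[`Tensor`](../api_docs/python/framework.md#Tensor) objects. A `Tensor` object is a symbolic handle to the result of an operation; it does not actually hold the values of the operation's output.
Instead, TensorFlow encourages users to build up complicated expressions (such as entire neural networks and their gradients) as a dataflow graph.
You then hand the computation of the entire dataflow graph to a TensorFlow [`Session`](../api_docs/python/client.md#Session),
which is able to execute the whole computation much more efficiently than executing the operations one by one.
#### How are devices named?
The supported device names are `"/device:CPU:0"` (or `"/cpu:0"`) for the CPU device, and `"/device:GPU:i"` (or `"/gpu:i"`) for the *i*th GPU device.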
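The naming scheme can be made concrete with a small string-formatting helper. This is only a sketch of the pattern described in the answer; `device_name` is a hypothetical function introduced for illustration and is not part of the TensorFlow API.

```python
def device_name(device_type, index=0, short=False):
    """Build a device string such as "/device:GPU:1" or its short form "/gpu:1"."""
    device_type = device_type.upper()
    if device_type not in ("CPU", "GPU"):
        raise ValueError("expected 'CPU' or 'GPU', got %r" % device_type)
    if short:
        return "/%s:%d" % (device_type.lower(), index)
    return "/device:%s:%d" % (device_type, index)


print(device_name("cpu"))
print(device_name("gpu", 1))
print(device_name("gpu", 1, short=True))
```

A string built this way is the kind of `name` argument that `with tf.device(name):` expects.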
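#### How do I place operations on a particular device?
To place a group of operations on a device, create them within a [`with tf.device(name):`](../api_docs/python/framework.md#device) context.
See the how-to documentation on [using GPUs with TensorFlow](../how_tos/using_gpu/index.md) for details of how TensorFlow assigns operations to devices, and the [CIFAR-10 tutorial](../tutorials/deep_cnn/index.md) for an example model that uses multiple GPUs.
#### What are the different types of tensors that are available?
TensorFlow supports a variety of different data types and tensor shapes. See the [ranks, shapes, and types reference](../resources/dims_types.md) for more details.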
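## Running a TensorFlow computation
See also the
[API documentation on running graphs](../api_docs/python/client.md).
#### What's the deal with feeding and placeholders?
Feeding is a mechanism in the TensorFlow Session API that allows you to substitute different values for one or more tensors at run time.
The `feed_dict` argument to [`Session.run()`](../api_docs/python/client.md#Session.run) is a dictionary
that maps [`Tensor`](../api_docs/python/framework.md) objects to numpy arrays (and some other types),
which are used as the values of those tensors in the execution of a step.
Often, you have certain tensors, such as inputs, that will always be fed. The [`tf.placeholder()`](../api_docs/python/io_ops.md#placeholder) op allows you to define tensors that *must* be fed, and optionally allows you to constrain their shape as well. See the [beginners' MNIST tutorial](../tutorials/mnist/beginners/index.md) for an example of how placeholders and feeding can be used to provide the training data for a neural network.
#### What is the difference between `Session.run()` and `Tensor.eval()`?
If `t` is a [`Tensor`](../api_docs/python/framework.md#Tensor) object, [`t.eval()`](../api_docs/python/framework.md#Tensor.eval) is shorthand for [`sess.run(t)`](../api_docs/python/client.md#Session.run) (where `sess` is the current [default session](../api_docs/python/client.md#get_default_session)).
The following two snippets of code are equivalent: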
```python
# Using `Session.run()`.
sess = tf.Session()
c = tf.constant(5.0)
print(sess.run(c))
# Using `Tensor.eval()`.
c = tf.constant(5.0)
with tf.Session():
  print(c.eval())
```
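In the second example, the session acts as a [context manager](https://docs.python.org/2.7/reference/compound_stmts.html#with), which installs it as the default session for the lifetime of the `with` block. The context manager approach can lead to more concise code for simple use cases (like unit tests); if your code deals with multiple graphs and sessions, it may be more straightforward to make explicit calls to `Session.run()`.
#### Do Sessions have a lifetime? What about intermediate tensors?
Sessions can own resources, such as [variables](../api_docs/python/state_ops.md#Variable), [queues](../api_docs/python/io_ops.md#QueueBase), and
[readers](../api_docs/python/io_ops.md#ReaderBase); these resources can use a significant amount of memory. These resources (and the associated memory) are released when the session is closed by calling [`Session.close()`](../api_docs/python/client.md#Session.close).
The intermediate tensors that are created as part of a call to [`Session.run()`](../api_docs/python/client.md) are freed at or before the end of the call.
#### Can I run distributed training on multiple computers?
The initial open-source release of TensorFlow supports multiple devices (CPUs and GPUs) in a single computer.
We are working on a distributed version as well: if you are interested, please let us know so we can prioritize accordingly.
#### Does the runtime parallelize parts of graph execution?
The TensorFlow runtime parallelizes graph execution across many different dimensions:
* The individual ops have parallel implementations, using multiple cores in a CPU or multiple threads in a GPU.
* Independent nodes in a TensorFlow graph can run in parallel on multiple devices, which makes it possible to speed up [CIFAR-10 training using multiple GPUs](../tutorials/deep_cnn/index.md).
* The Session API allows multiple concurrent steps (i.e. calls to [Session.run()](../api_docs/python/client.md#Session.run)) in parallel.
This enables the runtime to get higher throughput if a single step does not use all of the resources in your computer.
#### Which client languages are supported in TensorFlow?
TensorFlow is designed to support multiple client languages. Currently, the best-supported client language is [Python](../api_docs/python/index.md). The [C++ client API](../api_docs/cc/index.md) provides an interface for launching graphs and running steps; we also have an experimental API for [building graphs in C++](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/cc/tutorials/example_trainer.cc).
We would like to support more client languages, as determined by community interest. TensorFlow has a [C-based client API](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/public/tensor_c_api.h) that makes it easy to build a client in many different languages. We invite contributions of new language bindings.
#### Does TensorFlow make use of all the devices (GPUs and CPUs) available on my machine?
TensorFlow supports multiple GPUs and CPUs. See the how-to documentation on [using GPUs with TensorFlow](../how_tos/using_gpu/index.md) for details of how TensorFlow assigns operations to devices, and the [CIFAR-10 tutorial](../tutorials/deep_cnn/index.md) for an example model that uses multiple GPUs.
Note that TensorFlow only uses GPU devices with a compute capability greater than 3.5.
#### Why does `Session.run()` hang when using a reader or a queue?
The [reader](../api_docs/python/io_ops.md#ReaderBase) and [queue](../api_docs/python/io_ops.md#QueueBase) classes provide special operations that can *block* until input (or free space in a bounded queue) becomes available. These operations allow you to build sophisticated [input pipelines](../how_tos/reading_data/index.md), at the cost of making the TensorFlow computation somewhat more complicated. See the how-to documentation on [using `QueueRunner` objects to drive queues and readers](../how_tos/reading_data/index.md#QueueRunners) for more information on how to use them.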
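## Variables
See also the how-to documentation on [variables](../how_tos/variables/index.md)
and [variable scopes](../how_tos/variable_scope/index.md), and the [API documentation for variables](../api_docs/python/state_ops.md).
#### What is the lifetime of a variable?
A variable is created when you first run the [`tf.Variable.initializer`](../api_docs/python/state_ops.md#Variable.initializer) operation for that variable in a session. It is destroyed when that [`session is closed`](../api_docs/python/client.md#Session.close).
#### How do variables behave when they are concurrently accessed?
Variables allow concurrent read and write operations. The value read from a variable may change if it is concurrently updated. By default, concurrent assignment operations to a variable are allowed to run with no mutual exclusion. To acquire a lock when assigning to a variable, pass `use_locking=True` to [`Variable.assign()`](../api_docs/python/state_ops.md#Variable.assign).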
## Tensor shapes
See also the
[`TensorShape` API documentation](../api_docs/python/framework.md#TensorShape).
#### In Python, how do I determine the shape of a tensor?
In TensorFlow, a tensor has both a static shape and a dynamic shape. The static shape can be read using the [`tf.Tensor.get_shape()`](../api_docs/python/framework.md#Tensor.get_shape) method: this shape is inferred from the operations that were used to create the tensor, and may be [partially complete](../api_docs/python/framework.md#TensorShape). If the static shape is not fully defined, the dynamic shape of a tensor `t` can be determined by evaluating [`tf.shape(t)`](../api_docs/python/array_ops.md#shape).
#### What is the difference between `x.set_shape()` and `x = tf.reshape(x)`?
The [`tf.Tensor.set_shape()`](../api_docs/python/framework.md) method updates the static shape of a `Tensor` object, and it is typically used to provide additional shape information when this cannot be inferred directly. It does not change the dynamic shape of the tensor.
The [`tf.reshape()`](../api_docs/python/array_ops.md#reshape) operation creates a new tensor with a different dynamic shape.
#### How do I build a graph that works with variable batch sizes?
It is often useful to build a graph that works with variable batch sizes, for example so that the same code can be used for (mini-)batch training and single-instance inference. The resulting graph can be [saved as a protocol buffer](../api_docs/python/framework.md#Graph.as_graph_def) and [imported into another program](../api_docs/python/framework.md#import_graph_def).
When building a variable-size graph, the most important thing to remember is not to encode the batch size as a Python constant, but instead to use a symbolic `Tensor` to represent it. The following tips may be useful:
* Use [`batch_size = tf.shape(input)[0]`](../api_docs/python/array_ops.md#shape) to extract the batch dimension from a `Tensor` called `input`, and store it in a `Tensor` called `batch_size`.
* Use [`tf.reduce_mean()`](../api_docs/python/math_ops.md#reduce_mean) instead of `tf.reduce_sum(...) / batch_size`.
* If you use [placeholders for feeding input](../how_tos/reading_data/index.md#Feeding), you can specify a variable batch dimension by creating the placeholder with [`tf.placeholder(..., shape=[None, ...])`](../api_docs/python/io_ops.md#placeholder). The `None` element of the shape corresponds to a variable-sized dimension.
## TensorBoard
#### How can I visualize a TensorFlow graph?
See the [graph visualization tutorial](../how_tos/graph_viz/index.md).
#### What is the simplest way to send data to TensorBoard?
Add summary ops to your TensorFlow graph, and use a [`SummaryWriter`](../api_docs/python/train.md#SummaryWriter) to write those summaries to a log directory. Then, start TensorBoard using the following command:
`python tensorflow/tensorboard/tensorboard.py --logdir=path/to/log-directory`
For more details, see the [Summaries and TensorBoard tutorial](../how_tos/summaries_and_tensorboard/index.md).
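## Extending TensorFlow
See also the how-to documentation on [adding a new operation to TensorFlow](../how_tos/adding_an_op/index.md).
#### My data is in a custom format. How do I read it using TensorFlow?
There are two main options for dealing with data in a custom format.
The easier option is to write parsing code in Python that transforms the data into a numpy array, then feed a [`tf.placeholder()`](../api_docs/python/io_ops.md#placeholder) tensor with that data. See the documentation on [using placeholders for input](../how_tos/reading_data/index.md#Feeding) for more details. This approach is easy to get up and running, but the parsing can be a performance bottleneck.
The more efficient option is to [add a new op written in C++](../how_tos/adding_an_op/index.md) that parses your data format.
The [guide to handling new data formats](../how_tos/new_data_formats/index.md) has more information about the steps for doing this.
#### How do I define an operation that takes a variable number of inputs?
The TensorFlow op registration mechanism allows you to define inputs that are a single tensor, a list of tensors with the same type (for example when adding together a variable-length list of tensors), or a list of tensors with different types (for example when enqueuing a tuple of tensors to a queue). See the how-to documentation on [adding an op with a list of inputs or outputs](../how_tos/adding_an_op/index.md#list-input-output) for more details of how to define these different input types.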
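## Miscellaneous
#### Does TensorFlow work with Python 3?
We have only tested TensorFlow using Python 2.7. We are aware of some changes that will be required for Python 3 compatibility, and welcome contributions towards this effort.
#### What is TensorFlow's coding style convention?
The TensorFlow Python API adheres to the [PEP8](https://www.python.org/dev/peps/pep-0008/) conventions.\*
In particular, we use `CamelCase` names for classes, and `snake_case` names for functions, methods, and properties. We also adhere to the
[Google Python style guide](https://google.github.io/styleguide/pyguide.html).
The TensorFlow C++ code base adheres to the [Google C++ style guide](http://google.github.io/styleguide/cppguide.html).
(\* With one exception: we use 2-space indentation instead of 4-space indentation.)
Original: [Frequently Asked Questions](http://tensorflow.org/resources/faq.md) Translation: [Terence Cooper](https://github.com/TerenceCooper) Proofreading: [Wiki](https://github.com/jikexueyuanwiki)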
================================================
FILE: SOURCE/resources/glossary.md
================================================
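# Glossary
### Broadcasting operation
An operation that uses [numpy-style broadcasting](http://docs.scipy.org/doc/numpy/user/basics.broadcasting.html) to make the shapes of its tensor arguments compatible.
### Device
A piece of hardware that can run computation and has its own address space, like a GPU or CPU.
### eval
A method of `Tensor` that returns the value of the `Tensor`, triggering any graph computation required to determine the value. It may only be called on a `Tensor` in a graph that has been launched in a session.
### Feed
TensorFlow's mechanism for patching a tensor directly into any node in a graph launched in a session. You apply feeds when you trigger the execution of a graph, not when you build the graph. A feed temporarily replaces a node with a tensor value. You supply feed data as an argument to a `run()` or `eval()` call that initiates computation. After the run the feed disappears and the original node definition remains. You usually designate specific nodes to be "feed" nodes by using `tf.placeholder()` to create them. See [Basic Usage](https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/get_started/basic_usage.md) for more information.
### Fetch
TensorFlow's mechanism for retrieving tensors from a graph launched in a session. You retrieve fetches when you trigger the execution of a graph, not when you build the graph. To fetch the tensor value of a node or nodes, execute the graph with a `run()` call on the `Session` object and pass a list of the nodes to retrieve. See [Basic Usage](https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/get_started/basic_usage.md) for more information.
### Graph
Describes a computation as a directed acyclic graph (DAG). Nodes in the graph represent operations that must be performed. Edges in the graph represent either data or control dependencies. `GraphDef` is the protocol buffer used to describe a graph to the system (it is the API), and consists of a collection of `NodeDef`s. A `GraphDef` may be converted to a graph object that is easier to operate on.
### IndexedSlices
In the Python API, TensorFlow's representation of a tensor that is sparse along only its first dimension. If the tensor is k-dimensional, an `IndexedSlices` instance logically represents a collection of (k-1)-dimensional slices along the tensor's first dimension. The indices of the slices are stored concatenated into a single 1-dimensional vector, and the corresponding slices are concatenated to form a single k-dimensional tensor. Use
`SparseTensor` if the sparsity is not restricted to the first dimension.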
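The layout described in the IndexedSlices entry can be sketched on plain nested lists. This is an illustration of the idea only, assuming a 2-D input where "empty" means an all-zero row; `to_indexed_slices` and `to_dense` are hypothetical helpers, not TensorFlow functions.

```python
def to_indexed_slices(dense):
    """Split a 2-D nested list into (indices, slices, dense_shape).

    Only rows containing a non-zero entry are kept; `indices` records
    which row of the dense tensor each kept slice came from.
    """
    indices = []
    slices = []
    for i, row in enumerate(dense):
        if any(x != 0 for x in row):
            indices.append(i)
            slices.append(list(row))
    dense_shape = (len(dense), len(dense[0]) if dense else 0)
    return indices, slices, dense_shape


def to_dense(indices, slices, dense_shape):
    """Rebuild the dense 2-D list from its indexed-slices form."""
    rows, cols = dense_shape
    dense = [[0] * cols for _ in range(rows)]
    for i, row in zip(indices, slices):
        dense[i] = list(row)
    return dense


dense = [[0, 0], [1, 2], [0, 0], [3, 4]]
indices, slices, dense_shape = to_indexed_slices(dense)
print(indices)
print(slices)
print(to_dense(indices, slices, dense_shape) == dense)
```

Here the 1-dimensional `indices` vector and the stacked `slices` have the same leading length, which matches the description of slices concatenated along the first dimension.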
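### Node
An element of a graph.
Describes how to invoke a specific operation as one node in a specific computation graph, including the values for any attributes needed to configure the operation. For operations that are polymorphic, the attributes include sufficient information to completely determine the signature of the node. See `graph.proto` for details.
### Operation/op
In the TensorFlow runtime: a type of computation such as `add`, `matmul`, or `concat`. You can add new ops to the runtime as described in [how to add an op](https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/how_tos/adding_an_op/index.md).
In the Python API: a node in the graph. Ops are represented by instances of the class [tf.Operation](https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/api_docs/python/framework.md#Operation). The `type` property of an `Operation` indicates the kind of operation the node runs, such as `add` or `matmul`.
### Run
The action of executing ops in a launched graph. Requires that the graph be launched in a session.
In the Python API: a method of the `Session` class, [tf.Session.run](https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/api_docs/python/client.md#Session). You pass the tensors to feed and fetch to the `run()` call.
In the C++ API: a method of the [tensorflow::Session](https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/api_docs/python/client.md#Session) class.
### Session
To launch a graph, you first create a `Session` object. A session provides methods to execute ops in the graph.
In the Python API: [tf.Session](https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/api_docs/python/client.md#Session).
In the C++ API: [tensorflow::Session](https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/api_docs/cc/ClassSession.md), a class used to launch a graph and run operations.
### Shape
The dimensions of a tensor and their sizes.
In a launched graph: a property of the tensors that flow between nodes. Some ops have strong requirements on the shape of their inputs and report errors if these are not met.
In the Python API: an attribute of a Python `Tensor` in the graph construction API. During construction, the shape of a tensor may be only partially known, or even unknown. See [tf.TensorShape](https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/api_docs/python/framework.md#TensorShape).
In the C++ API: a class used to represent the shape of tensors, [tensorflow::TensorShape](https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/api_docs/cc/ClassTensorShape.md).
### SparseTensor
In the Python API, TensorFlow's representation of a tensor that is sparse in arbitrary positions. A `SparseTensor` stores only the non-empty values along with their indices, using a dictionary-of-keys format. In other words, if there are m non-empty values, it maintains a length-m vector of values and a matrix with m rows of indices. For efficiency, `SparseTensor` requires the indices to be sorted along increasing dimension number, i.e. in row-major order. Use `IndexedSlices` if the sparsity is only along the first dimension.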
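The "length-m vector of values plus m rows of indices" layout from the SparseTensor entry can be sketched on a plain 2-D nested list. This is a conceptual illustration only, treating zero as "empty"; `to_sparse` is a hypothetical helper and does not construct a real `SparseTensor`.

```python
def to_sparse(dense):
    """Return (indices, values, shape) for a 2-D nested list.

    Scanning row by row and then column by column yields the indices
    already sorted in row-major order.
    """
    indices = []
    values = []
    for i, row in enumerate(dense):
        for j, x in enumerate(row):
            if x != 0:
                indices.append([i, j])  # one row of the index matrix
                values.append(x)        # the matching non-empty value
    shape = [len(dense), len(dense[0]) if dense else 0]
    return indices, values, shape


dense = [[1, 0, 0, 0],
         [0, 0, 2, 0],
         [0, 0, 0, 0]]
indices, values, shape = to_sparse(dense)
print(indices)
print(values)
print(shape)
```

With m = 2 non-empty values, `values` has length 2 and `indices` has 2 rows, one per value, each row holding a full coordinate.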
### Tensor
A `Tensor` is a typed multi-dimensional array. For example, a 4-D array of floating point numbers can represent a mini-batch of images with dimensions [batch, height, width, channel].
In a launched graph: the type of the data that flows between nodes.
In the Python API: the class used to represent the inputs and outputs of ops added to the graph, [tf.Tensor](https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/api_docs/python/framework.md#Tensor). Instances of this class do not hold data.
In the C++ API: the class used to represent tensors returned from a [Session::Run()](https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/api_docs/cc/ClassSession.md) call, [tensorflow::Tensor](https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/api_docs/cc/ClassTensor.md). Instances of this class hold data.
Original: [Glossary](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/g3doc/resources/glossary.md)
Translation: [leege100](https://github.com/leege100)
Proofreading: [lonlonago](https://github.com/lonlonago)
================================================
FILE: SOURCE/resources/overview.md
================================================
# Additional Resources
## TensorFlow White Paper
Additional details about the TensorFlow programming model and the underlying implementation can be found in this white paper:
* [TensorFlow: Large-scale machine learning on heterogeneous systems](http://download.tensorflow.org/paper/whitepaper2015.pdf)
### Citation
If you use TensorFlow in your research and would like to cite the TensorFlow system, we suggest you cite the paper above. You can use this [BibTeX entry](../resources/bib.md). As the project progresses, we may update the suggested citation with new papers.
## Community
### Discussions
* [GitHub](https://github.com/tensorflow/tensorflow)
* [Stack Overflow](https://stackoverflow.com/questions/tagged/tensorflow)
* [TensorFlow discussion mailing list](https://groups.google.com/a/tensorflow.org/d/forum/discuss)
### Report Issues
* [TensorFlow issues](https://github.com/tensorflow/tensorflow/issues)
### Development
* If you are interested in contributing to TensorFlow, please
[read the contributing guide](https://github.com/tensorflow/tensorflow/blob/master/CONTRIBUTING.md).
> Original: [http://tensorflow.org/resources/bib.md](http://tensorflow.org/resources/bib.md)
>
> Translation: [Jim-Zenn](https://github.com/Jim-Zenn)
>
> Proofreading: [lonlonago]( https://github.com/lonlonago)
================================================
FILE: SOURCE/resources/uses.md
================================================
# Example Uses
This page describes some of the current uses of the TensorFlow system.
> If you are using TensorFlow for research, for education, or in some product,
> we would love to add something about your usage here.
> Please feel free to email us a brief description of how you are using TensorFlow,
> or send us a pull request to add an entry to this file.
Listed below are some of the uses of TensorFlow.
* **RankBrain**
    * **Organization**: Google
    * **Domain**: Information Retrieval
    * **Description**: A large-scale deployment of deep neural nets for search ranking on www.google.com.
    * **More info**: ["Google Turning Over Its Lucrative Search to AI Machines"](http://www.bloomberg.com/news/articles/2015-10-26/google-turning-its-lucrative-web-search-over-to-ai-machines)
* **Inception Image Classification Model**
    * **Organization**: Google
    * **Description**: Research on high-accuracy computer vision models, building on the model that won the 2014 ImageNet image classification challenge (ILSVRC 2014).
    * **More info**: Baseline model described in [Arxiv paper](http://arxiv.org/abs/1409.4842)
* **SmartReply**
    * **Organization**: Google
    * **Description**: Deep LSTM model to automatically generate email responses.
    * **More info**: [Google research blog post](http://googleresearch.blogspot.com/2015/11/computer-respond-to-this-email.html)
* **Massively Multitask Networks for Drug Discovery**
    * **Organization**: Google and Stanford University
    * **Domain**: Drug discovery
    * **Description**: A deep neural network model for identifying promising drug candidates.
    * **More info**: [Arxiv paper](http://arxiv.org/abs/1502.02072)
* **On-Device Computer Vision for OCR**
    * **Organization**: Google
    * **Description**: On-device computer vision model that performs optical character recognition (OCR) to enable real-time translation.
    * **More info**: [Google Research blog post](http://googleresearch.blogspot.com/2015/07/how-google-translate-squeezes-deep.html)
Original: [http://tensorflow.org/resources/uses.md](http://tensorflow.org/resources/uses.md)
Translation: [andyiac](https://github.com/andyiac)
Proofreading: [lonlonago](https://github.com/lonlonago)
================================================
FILE: SOURCE/tutorials/BUILD
================================================
# Description:
# Top-level tutorials files
package(default_visibility = ["//tensorflow:internal"])
licenses(["notice"]) # Apache 2.0
exports_files(["LICENSE"])
filegroup(
name = "all_files",
srcs = glob(
["**/*"],
exclude = [
"**/METADATA",
"**/OWNERS",
],
),
)
================================================
FILE: SOURCE/tutorials/__init__.py
================================================
================================================
FILE: SOURCE/tutorials/deep_cnn/cifar_tensorboard.html
================================================
TensorBoard Demo
================================================
FILE: SOURCE/tutorials/deep_cnn/index.md
================================================
# Convolutional Neural Networks
> **NOTE:** This tutorial is intended for *advanced* users of TensorFlow
and assumes expertise and experience in machine learning.
## Overview
CIFAR-10 classification is a common benchmark problem in machine learning. The
problem is to classify RGB 32x32 pixel images across 10 categories:
```airplane, automobile, bird, cat, deer, dog, frog, horse, ship, and truck.```

For more details refer to the [CIFAR-10 page](http://www.cs.toronto.edu/~kriz/cifar.html)
and a [Tech Report](http://www.cs.toronto.edu/~kriz/learning-features-2009-TR.pdf)
by Alex Krizhevsky.
### Goals
The goal of this tutorial is to build a relatively small convolutional neural
network (CNN) for recognizing images. In the process, this tutorial:
1. Highlights a canonical organization for network architecture,
training and evaluation.
2. Provides a template for constructing larger and more sophisticated models.
CIFAR-10 was selected because it is complex enough to exercise
much of TensorFlow's ability to scale to large models. At the same time,
the model is small enough to train fast, which is ideal for trying out
new ideas and experimenting with new techniques.
### Highlights of the Tutorial
The CIFAR-10 tutorial demonstrates several important constructs for
designing larger and more sophisticated models in TensorFlow:
* Core mathematical components including [convolution](
../../api_docs/python/nn.md#conv2d), [rectified linear activations](
../../api_docs/python/nn.md#relu), [max pooling](
../../api_docs/python/nn.md#max_pool) and [local response normalization](
../../api_docs/python/nn.md#local_response_normalization).
* [Visualization](../../how_tos/summaries_and_tensorboard/index.md)
of network activities during training, including input images,
losses and distributions of activations and gradients.
* Routines for calculating the
[moving average](../../api_docs/python/train.md#ExponentialMovingAverage)
of learned parameters and using these averages
during evaluation to boost predictive performance.
* Implementation of a
[learning rate schedule](../../api_docs/python/train.md#exponential_decay)
that systematically decrements over time.
* Prefetching [queues](../../api_docs/python/io_ops.md#shuffle_batch)
for input
data to isolate the model from disk latency and expensive image pre-processing.
We also provide a multi-GPU version of the model which demonstrates:
* Configuring a model to train across multiple GPU cards in parallel.
* Sharing and updating variables among multiple GPUs.
We hope that this tutorial provides a launch point for building larger CNNs for
vision tasks on TensorFlow.
### Model Architecture
The model in this CIFAR-10 tutorial is a multi-layer architecture consisting of
alternating convolutions and nonlinearities. These layers are followed by fully
connected layers leading into a softmax classifier. The model follows the
architecture described by
[Alex Krizhevsky](https://code.google.com/p/cuda-convnet/), with a few
differences in the top few layers.
This model achieves a peak performance of about 86% accuracy within a few hours
of training time on a GPU. Please see [below](#evaluating-a-model) and the code
for details. It consists of 1,068,298 learnable parameters and requires about
19.5M multiply-add operations to compute inference on a single image.
## Code Organization
The code for this tutorial resides in
[`tensorflow/models/image/cifar10/`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/models/image/cifar10/).
File | Purpose
--- | ---
[`cifar10_input.py`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/models/image/cifar10/cifar10_input.py) | Reads the native CIFAR-10 binary file format.
[`cifar10.py`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/models/image/cifar10/cifar10.py) | Builds the CIFAR-10 model.
[`cifar10_train.py`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/models/image/cifar10/cifar10_train.py) | Trains a CIFAR-10 model on a CPU or GPU.
[`cifar10_multi_gpu_train.py`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/models/image/cifar10/cifar10_multi_gpu_train.py) | Trains a CIFAR-10 model on multiple GPUs.
[`cifar10_eval.py`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/models/image/cifar10/cifar10_eval.py) | Evaluates the predictive performance of a CIFAR-10 model.
## CIFAR-10 Model
The CIFAR-10 network is largely contained in
[`cifar10.py`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/models/image/cifar10/cifar10.py).
The complete training
graph contains roughly 765 operations. We find that we can make the code most
reusable by constructing the graph with the following modules:
1. [**Model inputs:**](#model-inputs) `inputs()` and `distorted_inputs()` add
operations that read and preprocess CIFAR images for evaluation and training,
respectively.
1. [**Model prediction:**](#model-prediction) `inference()`
adds operations that perform inference, i.e. classification, on supplied images.
1. [**Model training:**](#model-training) `loss()` and `train()`
add operations that compute the loss,
gradients, variable updates and visualization summaries.
### Model Inputs
The input part of the model is built by the functions `inputs()` and
`distorted_inputs()` which read images from the CIFAR-10 binary data files.
These files contain fixed byte length records, so we use
[`tf.FixedLengthRecordReader`](../../api_docs/python/io_ops.md#FixedLengthRecordReader).
See [Reading Data](../../how_tos/reading_data/index.md#reading-from-files) to
learn more about how the `Reader` class works.
The images are processed as follows:
* They are cropped to 24 x 24 pixels, centrally for evaluation or
[randomly](../../api_docs/python/image.md#random_crop) for training.
* They are [approximately whitened](../../api_docs/python/image.md#per_image_whitening)
to make the model insensitive to dynamic range.
For training, we additionally apply a series of random distortions to
artificially increase the data set size:
* [Randomly flip](../../api_docs/python/image.md#random_flip_left_right) the image from left to right.
* Randomly distort the [image brightness](../../api_docs/python/image.md#random_brightness).
* Randomly distort the [image contrast](../../api_docs/python/image.md#tf_image_random_contrast).
Please see the [Images](../../api_docs/python/image.md) page for the list of
available distortions. We also attach an
[`image_summary`](../../api_docs/python/train.md#image_summary) to the images
so that we may visualize them in TensorBoard. This is a good practice to verify
that inputs are built correctly.
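As a rough illustration of the evaluation-time preprocessing described above, here is a minimal sketch in plain Python. It is not the code from `cifar10_input.py`: the helper names `central_crop` and `whiten` are hypothetical, the image is a synthetic single-channel grid, and the floor on the standard deviation is our reading of how `per_image_whitening` avoids dividing by zero on uniform images.

```python
import math


def central_crop(image, size):
    """Return the central size x size window of a 2-D list of pixel rows."""
    height, width = len(image), len(image[0])
    top = (height - size) // 2
    left = (width - size) // 2
    return [row[left:left + size] for row in image[top:top + size]]


def whiten(image):
    """Shift pixels to zero mean and scale them to roughly unit variance."""
    pixels = [p for row in image for p in row]
    n = len(pixels)
    mean = sum(pixels) / n
    variance = sum((p - mean) ** 2 for p in pixels) / n
    # Assumed floor: keeps a perfectly uniform image from dividing by zero.
    adjusted_std = max(math.sqrt(variance), 1.0 / math.sqrt(n))
    return [[(p - mean) / adjusted_std for p in row] for row in image]


# Synthetic 32x32 single-channel "image"; each value encodes its position.
image = [[float(32 * r + c) for c in range(32)] for r in range(32)]
cropped = central_crop(image, 24)
whitened = whiten(cropped)
```

The random distortions used for training would be applied between these two steps; they are omitted here to keep the sketch deterministic.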
Reading images from disk and distorting them can use a non-trivial amount of
processing time. To prevent these operations from slowing down training, we run
them inside 16 separate threads which continuously fill a TensorFlow
[queue](../../api_docs/python/io_ops.md#shuffle_batch).
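The idea of decoupling preprocessing from training can be sketched with Python's standard `queue` and `threading` modules. This is only an analogy to the TensorFlow queue, not its implementation: `preprocess`, `producer` and `run_pipeline` are hypothetical names, the "images" are integers, and the thread count and capacity are arbitrary.

```python
import queue
import threading


def preprocess(record):
    """Stand-in for reading and distorting one image (hypothetical)."""
    return record * 2


def producer(records, out_queue):
    """Preprocess records and enqueue them; put() blocks while the queue is full."""
    for record in records:
        out_queue.put(preprocess(record))


def run_pipeline(num_threads=4, records_per_thread=25, capacity=16):
    """Fill a bounded queue from several threads and drain it in one consumer."""
    out_queue = queue.Queue(maxsize=capacity)
    threads = []
    for t in range(num_threads):
        start = t * records_per_thread
        records = range(start, start + records_per_thread)
        thread = threading.Thread(target=producer, args=(records, out_queue))
        thread.daemon = True
        thread.start()
        threads.append(thread)
    total = num_threads * records_per_thread
    # The "training loop": it only ever waits on the queue, never on the disk.
    consumed = [out_queue.get() for _ in range(total)]
    for thread in threads:
        thread.join()
    return consumed
```

The consumer sees examples in whatever order the threads produce them, so the output order is not deterministic.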
### Model Prediction
The prediction part of the model is constructed by the `inference()` function
which adds operations to compute the *logits* of the predictions. That part of
the model is organized as follows:
Layer Name | Description
--- | ---
`conv1` | [convolution](../../api_docs/python/nn.md#conv2d) and [rectified linear](../../api_docs/python/nn.md#relu) activation.
`pool1` | [max pooling](../../api_docs/python/nn.md#max_pool).
`norm1` | [local response normalization](../../api_docs/python/nn.md#local_response_normalization).
`conv2` | [convolution](../../api_docs/python/nn.md#conv2d) and [rectified linear](../../api_docs/python/nn.md#relu) activation.
`norm2` | [local response normalization](../../api_docs/python/nn.md#local_response_normalization).
`pool2` | [max pooling](../../api_docs/python/nn.md#max_pool).
`local3` | [fully connected layer with rectified linear activation](../../api_docs/python/nn.md).
`local4` | [fully connected layer with rectified linear activation](../../api_docs/python/nn.md).
`softmax_linear` | linear transformation to produce logits.
Here is a graph generated from TensorBoard describing the inference operation:
> **EXERCISE**: The output of `inference()` is a set of un-normalized logits. Try editing
the network architecture to return normalized predictions using
[`tf.nn.softmax()`](../../api_docs/python/nn.md#softmax).
The `inputs()` and `inference()` functions provide all the components
necessary to perform evaluation on a model. We now shift our focus towards
building operations for training a model.
> **EXERCISE:** The model architecture in `inference()` differs slightly from
the CIFAR-10 model specified in
[cuda-convnet](https://code.google.com/p/cuda-convnet/). In particular, the top
layers of the cuda-convnet model are locally connected and not fully connected.
Try editing the architecture to exactly replicate the locally connected
layers of that model.
### Model Training
The usual method for training a network to perform N-way classification is
[multinomial logistic regression](https://en.wikipedia.org/wiki/Multinomial_logistic_regression),
also known as *softmax regression*. Softmax regression applies a
[softmax](../../api_docs/python/nn.md#softmax) nonlinearity to the
output of the network and calculates the
[cross-entropy](../../api_docs/python/nn.md#softmax_cross_entropy_with_logits)
between the normalized predictions and a
[1-hot encoding](../../api_docs/python/sparse_ops.md#sparse_to_dense) of the label.
For regularization, we also apply the usual
[weight decay](../../api_docs/python/nn.md#l2_loss) losses to all learned
variables. The objective function for the model is the sum of the cross entropy
loss and all these weight decay terms, as returned by the `loss()` function.
We visualize it in TensorBoard with a [scalar_summary](../../api_docs/python/train.md#scalar_summary):
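To make the objective concrete, here is a minimal sketch in plain Python rather than the TensorFlow code in `cifar10.py`. The function names and the single `weight_decay` coefficient are assumptions made for illustration; `l2_loss` uses the half-sum-of-squares convention of `tf.nn.l2_loss`.

```python
import math


def softmax(logits):
    """Normalize logits into probabilities (shifted for numerical stability)."""
    shift = max(logits)
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, label):
    """Cross-entropy between softmax(logits) and a 1-hot encoding of label."""
    return -math.log(softmax(logits)[label])


def l2_loss(weights):
    """Half the sum of squared weights."""
    return sum(w * w for w in weights) / 2.0


def total_loss(batch_logits, batch_labels, weight_groups, weight_decay):
    """Mean cross-entropy over the batch plus one decay term per weight group."""
    ce = sum(cross_entropy(l, y) for l, y in zip(batch_logits, batch_labels))
    mean_ce = ce / len(batch_labels)
    decay = sum(weight_decay * l2_loss(w) for w in weight_groups)
    return mean_ce + decay
```

With a 1-hot label only one term of the cross-entropy sum survives, which is why `cross_entropy` needs just the probability of the true class.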

We train the model using the standard
[gradient descent](https://en.wikipedia.org/wiki/Gradient_descent)
algorithm (see [Training](../../api_docs/python/train.md) for other methods)
with a learning rate that
[exponentially decays](../../api_docs/python/train.md#exponential_decay)
over time.
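The decay schedule can be written out in a few lines. This is a sketch of the formula behind `exponential_decay` as we understand it, in plain Python; the numeric values in the usage note are illustrative and are not the settings used in `cifar10.py`.

```python
def exponential_decay(initial_rate, global_step, decay_steps, decay_rate,
                      staircase=False):
    """Return initial_rate * decay_rate ** (global_step / decay_steps).

    With staircase=True the exponent is truncated to an integer, so the
    rate drops in discrete steps instead of decaying continuously.
    """
    if staircase:
        exponent = global_step // decay_steps
    else:
        exponent = global_step / float(decay_steps)
    return initial_rate * decay_rate ** exponent
```

For example, with an initial rate of 0.1, `decay_steps=1000` and `decay_rate=0.1`, the staircase schedule stays at 0.1 for the first 1000 steps and then drops to roughly 0.01.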

The `train()` function adds the operations needed to minimize the objective by
calculating the gradient and updating the learned variables (see
[`GradientDescentOptimizer`](../../api_docs/python/train.md#GradientDescentOptimizer)
for details). It returns an operation that executes all the calculations
needed to train and update the model for one batch of images.
## Launching and Training the Model
Now that we have built the model, let's launch it and run the training operation with
the script `cifar10_train.py`.
```shell
python cifar10_train.py
```
**NOTE:** The first time you run any target in the CIFAR-10 tutorial,
the CIFAR-10 dataset is automatically downloaded. The data set is ~160MB
so you may want to grab a quick cup of coffee for your first run.
You should see the output:
```shell
Filling queue with 20000 CIFAR images before starting to train. This will take a few minutes.
2015-11-04 11:45:45.927302: step 0, loss = 4.68 (2.0 examples/sec; 64.221 sec/batch)
2015-11-04 11:45:49.133065: step 10, loss = 4.66 (533.8 examples/sec; 0.240 sec/batch)
2015-11-04 11:45:51.397710: step 20, loss = 4.64 (597.4 examples/sec; 0.214 sec/batch)
2015-11-04 11:45:54.446850: step 30, loss = 4.62 (391.0 examples/sec; 0.327 sec/batch)
2015-11-04 11:45:57.152676: step 40, loss = 4.61 (430.2 examples/sec; 0.298 sec/batch)
2015-11-04 11:46:00.437717: step 50, loss = 4.59 (406.4 examples/sec; 0.315 sec/batch)
...
```
The script reports the total loss every 10 steps as well as the speed at which
the last batch of data was processed. A few comments:
* The first batch of data can be inordinately slow (e.g. several minutes) as the
preprocessing threads fill up the shuffling queue with 20,000 processed CIFAR
images.
* The reported loss is the average loss of the most recent batch. Remember that
this loss is the sum of the cross entropy and all weight decay terms.
* Keep an eye on the processing speed of a batch. The numbers shown above were
obtained on a Tesla K40c. If you are running on a CPU, expect slower performance.
> **EXERCISE:** When experimenting, it is sometimes annoying that the first
training step can take so long. Try decreasing the number of images that
initially fill up the queue. Search for `NUM_EXAMPLES_PER_EPOCH_FOR_TRAIN`
in `cifar10.py`.
`cifar10_train.py` periodically [saves](../../api_docs/python/state_ops.md#Saver)
all model parameters in
[checkpoint files](../../how_tos/variables/index.md#saving-and-restoring)
but it does *not* evaluate the model. The checkpoint file
will be used by `cifar10_eval.py` to measure the predictive
performance (see [Evaluating a Model](#evaluating-a-model) below).
If you followed the previous steps, then you have now started training
a CIFAR-10 model. [Congratulations!](https://www.youtube.com/watch?v=9bZkp7q19f0)
The terminal text returned from `cifar10_train.py` provides minimal insight into
how the model is training. We want more insight into the model during training:
* Is the loss *really* decreasing or is that just noise?
* Is the model being provided appropriate images?
* Are the gradients, activations and weights reasonable?
* What is the learning rate currently at?
[TensorBoard](../../how_tos/summaries_and_tensorboard/index.md) provides this
functionality, displaying data exported periodically from `cifar10_train.py` via
a
[`SummaryWriter`](../../api_docs/python/train.md#SummaryWriter).
For instance, we can watch how the distribution of activations and degree of
sparsity in `local3` features evolve during training:
Individual loss functions, as well as the total loss, are particularly
interesting to track over time. However, the loss exhibits a considerable amount
of noise due to the small batch size employed by training. In practice we find
it extremely useful to visualize their moving averages in addition to their raw
values. See how the scripts use
[`ExponentialMovingAverage`](../../api_docs/python/train.md#ExponentialMovingAverage)
for this purpose.
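The smoothing itself is a one-line update. The following plain-Python sketch shows the idea; the class name `MovingAverage`, its `update` method and the decay value are hypothetical and are not the `ExponentialMovingAverage` API.

```python
class MovingAverage(object):
    """Keep an exponentially decayed average ("shadow" value) per name."""

    def __init__(self, decay):
        self.decay = decay
        self.shadow = {}

    def update(self, name, value):
        """Fold a new raw value into the average and return the smoothed value."""
        if name not in self.shadow:
            # The first observation initializes the shadow value.
            self.shadow[name] = value
        else:
            self.shadow[name] = (self.decay * self.shadow[name]
                                 + (1.0 - self.decay) * value)
        return self.shadow[name]
```

A decay close to 1 gives a smoother but slower-moving curve; a smaller decay tracks the raw loss more closely.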
## Evaluating a Model
Let us now evaluate how well the trained model performs on a hold-out data set.
The model is evaluated by the script `cifar10_eval.py`. It constructs the model
with the `inference()` function and uses all 10,000 images in the evaluation set
of CIFAR-10. It calculates the *precision at 1:* how often the top prediction
matches the true label of the image.
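Precision at 1 is simple enough to state in code. This is a plain-Python sketch with a hypothetical function name, operating on lists of logits rather than tensors:

```python
def precision_at_1(batch_logits, labels):
    """Fraction of examples whose highest-scoring class equals the true label."""
    correct = 0
    for logits, label in zip(batch_logits, labels):
        predicted = max(range(len(logits)), key=lambda i: logits[i])
        if predicted == label:
            correct += 1
    return correct / float(len(labels))
```

Because only the arg-max matters, the un-normalized logits give the same result as softmax probabilities would.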
To monitor how the model improves during training, the evaluation script runs
periodically on the latest checkpoint files created by `cifar10_train.py`.
```shell
python cifar10_eval.py
```
> Be careful not to run the evaluation and training binary on the same GPU or
else you might run out of memory. Consider running the evaluation on
a separate GPU if available or suspending the training binary while running
the evaluation on the same GPU.
You should see the output:
```shell
2015-11-06 08:30:44.391206: precision @ 1 = 0.860
...
```
The script merely returns the precision @ 1 periodically -- in this case
it returned 86% accuracy. `cifar10_eval.py` also
exports summaries that may be visualized in TensorBoard. These summaries
provide additional insight into the model during evaluation.
The training script calculates the
[moving average](../../api_docs/python/train.md#ExponentialMovingAverage)
version of all learned variables. The evaluation script substitutes
all learned model parameters with the moving average version. This
substitution boosts model performance at evaluation time.
> **EXERCISE:** Employing averaged parameters may boost predictive performance
by about 3% as measured by precision @ 1. Edit `cifar10_eval.py` to not employ
the averaged parameters for the model and verify that the predictive performance
drops.
## Training a Model Using Multiple GPU Cards
Modern workstations may contain multiple GPUs for scientific computation.
TensorFlow can leverage this environment to run the training operation
concurrently across multiple cards.
Training a model in a parallel, distributed fashion requires
coordinating training processes. In what follows, a *model replica*
is one copy of a model training on a subset of data.
Naively employing asynchronous updates of model parameters
leads to sub-optimal training performance
because an individual model replica might be trained on a stale
copy of the model parameters. Conversely, employing fully synchronous
updates will be as slow as the slowest model replica.
In a workstation with multiple GPU cards, each GPU will have similar speed
and contain enough memory to run an entire CIFAR-10 model. Thus, we opt to
design our training system in the following manner:
* Place an individual model replica on each GPU.
* Update model parameters synchronously by waiting for all GPUs to finish
processing a batch of data.
Here is a diagram of this model:
Note that each GPU computes inference as well as the gradients for a unique
batch of data. This setup effectively permits dividing up a larger batch
of data across the GPUs.
This setup requires that all GPUs share the model parameters. A well-known
fact is that transferring data to and from GPUs is quite slow. For this
reason, we decide to store and update all model parameters on the CPU (see
green box). A fresh set of model parameters is transferred to the GPU
when a new batch of data is processed by all GPUs.
The GPUs are synchronized in operation. All gradients are accumulated from
the GPUs and averaged (see green box). The model parameters are updated with
the gradients averaged across all model replicas.
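The synchronous update can be sketched without any GPU at all. In the plain-Python sketch below, each "tower" is represented only by the gradients it reports; the function names and the dictionary layout are assumptions made for illustration, not the structure used in `cifar10_multi_gpu_train.py`.

```python
def average_gradients(tower_grads):
    """Average per-variable gradients over all towers.

    tower_grads is a list with one dict per tower, each mapping a variable
    name to a list of gradient values.
    """
    averaged = {}
    for name in tower_grads[0]:
        per_tower = [grads[name] for grads in tower_grads]
        averaged[name] = [sum(values) / len(per_tower)
                          for values in zip(*per_tower)]
    return averaged


def apply_gradients(params, grads, learning_rate):
    """One gradient descent step on the shared (CPU-resident) parameters."""
    return {name: [p - learning_rate * g
                   for p, g in zip(params[name], grads[name])]
            for name in params}
```

Every tower reads the same `params` before its batch and contributes one entry to `tower_grads`; the update happens once per step, after all towers have reported.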
### Placing Variables and Operations on Devices
Placing operations and variables on devices requires some special
abstractions.
The first abstraction we require is a function for computing inference and
gradients for a single model replica. In the code we term this abstraction
a "tower". We must set two attributes for each tower:
* A unique name for all operations within a tower.
[`tf.name_scope()`](../../api_docs/python/framework.md#name_scope) provides
this unique name by prepending a scope. For instance, all operations in
the first tower are prepended with `tower_0`, e.g. `tower_0/conv1/Conv2D`.
* A preferred hardware device to run the operation within a tower.
[`tf.device()`](../../api_docs/python/framework.md#device) specifies this. For
instance, all operations in the first tower reside within `device('/gpu:0')`
scope indicating that they should be run on the first GPU.
All variables are pinned to the CPU and accessed via
[`tf.get_variable()`](../../api_docs/python/state_ops.md#get_variable)
in order to share them in a multi-GPU version.
See how-to on [Sharing Variables](../../how_tos/variable_scope/index.md).
### Launching and Training the Model on Multiple GPU cards
If you have several GPU cards installed on your machine you can use them to
train the model faster with the `cifar10_multi_gpu_train.py` script. It is a
variation of the training script that parallelizes the model across multiple GPU
cards.
```shell
python cifar10_multi_gpu_train.py --num_gpus=2
```
The training script should output:
```shell
Filling queue with 20000 CIFAR images before starting to train. This will take a few minutes.
2015-11-04 11:45:45.927302: step 0, loss = 4.68 (2.0 examples/sec; 64.221 sec/batch)
2015-11-04 11:45:49.133065: step 10, loss = 4.66 (533.8 examples/sec; 0.240 sec/batch)
2015-11-04 11:45:51.397710: step 20, loss = 4.64 (597.4 examples/sec; 0.214 sec/batch)
2015-11-04 11:45:54.446850: step 30, loss = 4.62 (391.0 examples/sec; 0.327 sec/batch)
2015-11-04 11:45:57.152676: step 40, loss = 4.61 (430.2 examples/sec; 0.298 sec/batch)
2015-11-04 11:46:00.437717: step 50, loss = 4.59 (406.4 examples/sec; 0.315 sec/batch)
...
```
Note that the number of GPU cards used defaults to 1. Additionally, if only 1
GPU is available on your machine, all computations will be placed on it, even if
you ask for more.
> **EXERCISE:** The default setting for `cifar10_train.py` is to
run with a batch size of 128. Try running `cifar10_multi_gpu_train.py` on 2 GPUs
with a batch size of 64 and compare the training speed.
## Next Steps
[Congratulations!](https://www.youtube.com/watch?v=9bZkp7q19f0) You have
completed the CIFAR-10 tutorial.
If you are now interested in developing and training your own image
classification system, we recommend forking this tutorial and replacing
components to address your image classification problem.
> **EXERCISE:** Download the
[Street View House Numbers (SVHN)](http://ufldl.stanford.edu/housenumbers/) data set.
Fork the CIFAR-10 tutorial and swap in the SVHN as the input data. Try adapting
the network architecture to improve predictive performance.
================================================
FILE: SOURCE/tutorials/deep_cnn.md
================================================
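# Convolutional Neural Networks
> **NOTE:** This tutorial is intended for experienced users of TensorFlow and assumes expertise and experience in machine learning.
## Overview
CIFAR-10 classification is a common benchmark problem in machine learning. The task is to classify 32x32 pixel RGB images across 10 categories:
```airplane, automobile, bird, cat, deer, dog, frog, horse, ship, and truck.```

For more details, refer to the [CIFAR-10 page](http://www.cs.toronto.edu/~kriz/cifar.html) and the [Tech Report](http://www.cs.toronto.edu/~kriz/learning-features-2009-TR.pdf) by Alex Krizhevsky.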
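### Goals
The goal of this tutorial is to build a relatively small convolutional neural network for recognizing images. In the process, this tutorial:
1. Highlights a canonical organization for network architecture, training and evaluation.
2. Provides a template for constructing larger and more sophisticated models.
CIFAR-10 was selected because it is complex enough to exercise much of TensorFlow's functionality and to scale to larger models. At the same time, the model is small enough to train quickly, which makes it well suited to trying out new ideas and testing new techniques.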
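### Highlights of the Tutorial
The CIFAR-10 tutorial demonstrates several important constructs for building larger and more sophisticated models in TensorFlow:
* Core mathematical components, including [convolution](https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/api_docs/python/nn.md#conv2d), [rectified linear activations](https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/api_docs/python/nn.md#relu), [max pooling](https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/api_docs/python/nn.md#max_pool) and [local response normalization](https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/api_docs/python/nn.md#local_response_normalization);
* [Visualization](https://github.com/jikexueyuanwiki/tensorflow-zh/tree/master/SOURCE/how_tos/summaries_and_tensorboard/index.md) of network activity during training, including input images, losses, and the distributions of activations and gradients;
* Routines for calculating the [moving average](https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/api_docs/python/train.md#ExponentialMovingAverage) of learned parameters and using these averages during evaluation to boost predictive performance;
* A [learning rate schedule](https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/api_docs/python/train.md#exponential_decay) that decreases the learning rate over time;
* Prefetching [queues](https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/api_docs/python/io_ops.md#shuffle_batch) for input data, which isolate the model from disk latency and expensive image pre-processing.
We also provide a multi-GPU version of the model, which demonstrates:
* Configuring a model to train across multiple GPU cards in parallel.
* Sharing and updating variables among multiple GPUs.
We hope that this tutorial provides a starting point for building larger CNNs for vision tasks on TensorFlow.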
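### Model Architecture
The model in this tutorial is a multi-layer architecture consisting of alternating convolutions and nonlinearities. These layers are followed by fully connected layers leading into a softmax classifier. Apart from the top few layers, the model largely follows the architecture described by [Alex Krizhevsky](https://code.google.com/p/cuda-convnet/).
This model reaches a peak accuracy of about 86% after a few hours of training on a GPU. Please see [below](#evaluating-a-model) and the code for details. The model has 1,068,298 learnable parameters and requires about 19.5M multiply-add operations to classify a single image.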
## Code Organization
The code for this tutorial resides in [`tensorflow/models/image/cifar10/`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/models/image/cifar10/).
File | Purpose
--- | ---
[`cifar10_input.py`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/models/image/cifar10/cifar10_input.py) | Reads the native CIFAR-10 binary file format.
[`cifar10.py`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/models/image/cifar10/cifar10.py) | Builds the CIFAR-10 model.
[`cifar10_train.py`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/models/image/cifar10/cifar10_train.py) | Trains a CIFAR-10 model on a CPU or GPU.
[`cifar10_multi_gpu_train.py`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/models/image/cifar10/cifar10_multi_gpu_train.py) | Trains a CIFAR-10 model on multiple GPUs.
[`cifar10_eval.py`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/models/image/cifar10/cifar10_eval.py) | Evaluates the predictive performance of a CIFAR-10 model.
## CIFAR-10 Model
The code for the CIFAR-10 network is largely contained in
[`cifar10.py`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/models/image/cifar10/cifar10.py).
The complete training graph contains roughly 765 operations. We find that we can make the code most reusable by constructing the graph with the following modules:
1. [**Model inputs:**](#model-inputs) `inputs()` and `distorted_inputs()` add operations that read and preprocess CIFAR images for evaluation and training, respectively.
2. [**Model prediction:**](#model-prediction) `inference()` adds operations that perform inference, i.e. classification, on supplied images.
3. [**Model training:**](#model-training) `loss()` and `train()` add operations that compute the loss, gradients, variable updates and visualization summaries.
### Model Inputs
The input part of the model is built by the functions `inputs()` and `distorted_inputs()`, which read images from the CIFAR-10 binary data files. Because each image record has a fixed byte length, we use [`tf.FixedLengthRecordReader`](https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/api_docs/python/io_ops.md#FixedLengthRecordReader). See [Reading Data](https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/how_tos/reading_data/index.md#reading-from-files) to learn more about how the `Reader` class works.
The images are processed as follows:
* They are cropped to 24x24 pixels, centrally for evaluation or [randomly](https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/api_docs/python/image.md#random_crop) for training.
* They are [approximately whitened](https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/api_docs/python/image.md#per_image_whitening) to make the model insensitive to dynamic range.
For training, we additionally apply a series of random distortions to artificially increase the data set size:
* [Randomly flip](https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/api_docs/python/image.md#random_flip_left_right) the image from left to right.
* Randomly distort the [image brightness](https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/api_docs/python/image.md#random_brightness).
* Randomly distort the [image contrast](https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/api_docs/python/image.md#tf_image_random_contrast).
See the [Images](https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/api_docs/python/image.md) page for the list of available distortions. We also attach an [`image_summary`](https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/api_docs/python/train.md#image_summary) to the images so that we can view them in TensorBoard. This is a useful way to verify that the inputs are built correctly.
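Individual loss functions, as well as the total loss, are particularly interesting to track during training. However, the loss exhibits a considerable amount of noise due to the small batch size used in training. In practice we find the moving averages of the losses more informative than their raw values. See how the scripts use [`ExponentialMovingAverage`](https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/api_docs/python/train.md#ExponentialMovingAverage) for this purpose.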
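## Evaluating a Model
Let us now evaluate how well the trained model performs on a hold-out data set. The model is evaluated by the script `cifar10_eval.py`. It constructs the model with the `inference()` function and uses all 10,000 images in the evaluation set of CIFAR-10. It calculates the *precision at 1:* how often the top prediction matches the true label of the image.
To monitor how the model improves during training, the evaluation script runs periodically on the latest checkpoint files created by `cifar10_train.py`.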
```shell
python cifar10_eval.py
```
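> NOTE: Be careful not to run the evaluation and training binaries on the same GPU at the same time, or you might run out of memory. Consider running the evaluation on a separate GPU if available, or suspending the training binary while running the evaluation on the same GPU.
You should see output like the following: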
```shell
2015-11-06 08:30:44.391206: precision @ 1 = 0.860
...
```
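The script merely returns the precision @ 1 periodically; in this case it returned 86% accuracy. `cifar10_eval.py` also exports summaries that may be visualized in TensorBoard. These summaries provide additional insight into the model during evaluation.
The training script calculates the [moving average](https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/api_docs/python/train.md#ExponentialMovingAverage) version of all learned variables. The evaluation script substitutes all learned model parameters with the moving average version. This substitution boosts model performance at evaluation time.
> **EXERCISE:** Employing averaged parameters may boost predictive performance by about 3% as measured by precision @ 1. Edit `cifar10_eval.py` to not employ the averaged parameters for the model and verify that the predictive performance drops.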
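## Training a Model Using Multiple GPU Cards
Modern workstations may contain multiple GPUs for scientific computation. TensorFlow can leverage this environment to run the training operation across multiple GPU cards.
Training a model in a parallel, distributed fashion requires coordinating training processes. In what follows, the term *model replica* refers to one copy of a model training on a subset of data.
Naively employing asynchronous updates of model parameters leads to sub-optimal training performance, because an individual model replica might be trained on a stale copy of the model parameters. Conversely, employing fully synchronous updates will be as slow as the slowest model replica.
In a workstation with multiple GPU cards, each GPU will have similar speed and contain enough memory to run an entire CIFAR-10 model. Thus, we opt to design our training system in the following manner:
* Place an individual model replica on each GPU.
* Update model parameters synchronously by waiting for all GPUs to finish processing a batch of data.
Here is a diagram of this model:
Note that each GPU computes inference as well as the gradients for a unique batch of data. This setup effectively permits dividing up a larger batch of data across the GPUs.
This setup requires that all GPUs share the model parameters. However, transferring data between GPUs is well known to be quite slow, so we store and update all model parameters on the CPU (see green box). A fresh set of model parameters is transferred to each GPU before it processes a new batch of data.
The GPUs are synchronized in operation. All gradients are accumulated from the GPUs and averaged (see green box). The model parameters are updated with the gradients averaged across all model replicas.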
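### Placing Variables and Operations on Devices
Placing operations and variables on devices requires some special abstractions.
The first abstraction we require is a function for computing inference and gradients for a single model replica. In the code we term this abstraction a "tower". We must set two attributes for each tower:
* A unique name for all operations within a tower. [`tf.name_scope()`](https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/api_docs/python/framework.md#name_scope) provides this unique name by prepending a scope. For instance, all operations in the first tower are prepended with `tower_0`, e.g. `tower_0/conv1/Conv2D`.
* A preferred hardware device to run the operations within a tower. [`tf.device()`](https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/api_docs/python/framework.md#device) specifies this. For instance, all operations in the first tower reside within the `device('/gpu:0')` scope, indicating that they should be run on the first GPU.
To share variables across multiple GPUs, all variables are pinned to the CPU and accessed via [`tf.get_variable()`](https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/api_docs/python/state_ops.md#get_variable). See [Sharing Variables](https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/how_tos/variables/index.md) to learn how to share variables.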
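### Launching and Training the Model on Multiple GPU Cards
If you have several GPU cards installed on your machine, you can use them to train the model faster with the `cifar10_multi_gpu_train.py` script. It is a variation of the training script that parallelizes the model across multiple GPU cards.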
```shell
python cifar10_multi_gpu_train.py --num_gpus=2
```
The training script should output:
```shell
Filling queue with 20000 CIFAR images before starting to train. This will take a few minutes.
2015-11-04 11:45:45.927302: step 0, loss = 4.68 (2.0 examples/sec; 64.221 sec/batch)
2015-11-04 11:45:49.133065: step 10, loss = 4.66 (533.8 examples/sec; 0.240 sec/batch)
2015-11-04 11:45:51.397710: step 20, loss = 4.64 (597.4 examples/sec; 0.214 sec/batch)
2015-11-04 11:45:54.446850: step 30, loss = 4.62 (391.0 examples/sec; 0.327 sec/batch)
2015-11-04 11:45:57.152676: step 40, loss = 4.61 (430.2 examples/sec; 0.298 sec/batch)
2015-11-04 11:46:00.437717: step 50, loss = 4.59 (406.4 examples/sec; 0.315 sec/batch)
...
```
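Note that the number of GPU cards used defaults to 1. Additionally, if only 1 GPU is available on your machine, all computations will be placed on it, even if you ask for more.
> **EXERCISE:** The default batch size in `cifar10_train.py` is 128. Try running `cifar10_multi_gpu_train.py` on 2 GPUs with a batch size of 64 and compare the training speed.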
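## Next Steps
[Congratulations!](https://www.youtube.com/watch?v=9bZkp7q19f0) You have completed the CIFAR-10 tutorial.
If you are interested in developing and training your own image classification system, we recommend forking this tutorial and replacing components to address your image classification problem.
> **EXERCISE:** Download the [Street View House Numbers (SVHN)](http://ufldl.stanford.edu/housenumbers/) data set. Fork the CIFAR-10 tutorial and swap in the SVHN as the input data. Try adapting the network architecture to improve predictive performance.
> Original: [Convolutional Neural Networks](http://tensorflow.org/tutorials/deep_cnn/index.md) Translation: [oskycar](https://github.com/oskycar) Proofreading: [KK4SBB](https://github.com/zhyhooo)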
================================================
FILE: SOURCE/tutorials/mandelbrot/index.md
================================================
# Mandelbrot Set
Visualizing the Mandelbrot set doesn't have anything to do with machine
learning, but it makes for a fun example of how one can use TensorFlow for
general mathematics. This is actually a pretty naive implementation of the
visualization, but it makes the point. (We may end up providing a more
elaborate implementation down the line to produce more truly beautiful images.)
Note: This tutorial was originally prepared as an IPython notebook.
## Basic Setup
We'll need a few imports to get started.
```python
# Import libraries for simulation
import tensorflow as tf
import numpy as np
# Imports for visualization
import PIL.Image
from cStringIO import StringIO
from IPython.display import clear_output, Image, display
import scipy.ndimage as nd
```
Now we'll define a function to actually display the image once we have
iteration counts.
```python
def DisplayFractal(a, fmt='jpeg'):
  """Display an array of iteration counts as a
     colorful picture of a fractal."""
  a_cyclic = (6.28*a/20.0).reshape(list(a.shape)+[1])
  img = np.concatenate([10+20*np.cos(a_cyclic),
                        30+50*np.sin(a_cyclic),
                        155-80*np.cos(a_cyclic)], 2)
  img[a==a.max()] = 0
  a = img
  a = np.uint8(np.clip(a, 0, 255))
  f = StringIO()
  PIL.Image.fromarray(a).save(f, fmt)
  display(Image(data=f.getvalue()))
```
## Session and Variable Initialization
For playing around like this, we often use an interactive session, but a regular
session would work as well.
```python
sess = tf.InteractiveSession()
```
It's handy that we can freely mix NumPy and TensorFlow.
```python
# Use NumPy to create a 2D array of complex numbers on [-2,1]x[-1.3,1.3]
Y, X = np.mgrid[-1.3:1.3:0.005, -2:1:0.005]
Z = X+1j*Y
```
Now we define and initialize TensorFlow tensors.
```python
xs = tf.constant(Z.astype("complex64"))
zs = tf.Variable(xs)
ns = tf.Variable(tf.zeros_like(xs, "float32"))
```
TensorFlow requires that you explicitly initialize variables before using them.
```python
tf.initialize_all_variables().run()
```
## Defining and Running the Computation
Now we specify more of the computation...
```python
# Compute the new values of z: z^2 + x
zs_ = zs*zs + xs
# Have we diverged with this new value?
not_diverged = tf.complex_abs(zs_) < 4
# Operation to update the zs and the iteration count.
#
# Note: We keep computing zs after they diverge! This
# is very wasteful! There are better, if a little
# less simple, ways to do this.
#
step = tf.group(
  zs.assign(zs_),
  ns.assign_add(tf.cast(not_diverged, "float32"))
)
```
... and run it for a couple hundred steps
```python
for i in range(200): step.run()
```
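For intuition about what `ns` ends up holding, here is the same update rule for a single point in plain Python, with no TensorFlow involved. The function name `iteration_count` is ours. Unlike the graph above, this sketch stops as soon as the point diverges; for points in the grid used here that should give the same count, since once `|z|` reaches 4 it only keeps growing.

```python
def iteration_count(c, steps=200, threshold=4.0):
    """Count the steps for which z -> z*z + c stays below the threshold."""
    z = c  # mirrors zs being initialized to xs
    count = 0
    for _ in range(steps):
        z = z * z + c
        if abs(z) >= threshold:
            break  # diverged: later steps would not add to the count
        count += 1
    return count
```

Points inside the set, such as `0j` or `-1+0j`, never diverge and reach the full count of 200; those are the pixels that `DisplayFractal` paints black.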
Let's see what we've got.
```python
DisplayFractal(ns.eval())
```

Not bad!
================================================
FILE: SOURCE/tutorials/mandelbrot.md
================================================
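# Mandelbrot Set
Although visualizing the Mandelbrot set has nothing to do with machine learning, it is a fun example of applying TensorFlow to mathematics more broadly. In fact, this is a very straightforward use of TensorFlow for visualization. (We may eventually provide a more elaborate implementation that generates truly more beautiful images.)
Note: This tutorial uses an IPython notebook.
## Basic Setup
First, we need to import some libraries.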
```python
# Import libraries for simulation
import tensorflow as tf
import numpy as np
# Imports for visualization
import PIL.Image
from cStringIO import StringIO
from IPython.display import clear_output, Image, display
import scipy.ndimage as nd
```
Now we'll define a function to display the image computed from the iteration counts.
```python
def DisplayFractal(a, fmt='jpeg'):
  """Display the iteratively computed, colorful fractal image."""
  a_cyclic = (6.28*a/20.0).reshape(list(a.shape)+[1])
  img = np.concatenate([10+20*np.cos(a_cyclic),
                        30+50*np.sin(a_cyclic),
                        155-80*np.cos(a_cyclic)], 2)
  img[a==a.max()] = 0
  a = img
  a = np.uint8(np.clip(a, 0, 255))
  f = StringIO()
  PIL.Image.fromarray(a).save(f, fmt)
  display(Image(data=f.getvalue()))
```
## Session and Variable Initialization
For convenience, we often use an interactive session, but a regular session works just as well.
```python
sess = tf.InteractiveSession()
```
It's handy that we can freely mix NumPy and TensorFlow.
```python
# Use NumPy to create a 2D array of complex numbers on [-2,1]x[-1.3,1.3]
Y, X = np.mgrid[-1.3:1.3:0.005, -2:1:0.005]
Z = X+1j*Y
```
Now we define and initialize a set of TensorFlow tensors.
```python
xs = tf.constant(Z.astype("complex64"))
zs = tf.Variable(xs)
ns = tf.Variable(tf.zeros_like(xs, "float32"))
```
TensorFlow requires that you explicitly initialize variables before using them.
```python
tf.initialize_all_variables().run()
```
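## Defining and Running the Computation
Now we specify more of the computation...
```python
# Compute the new values of z: z^2 + x
zs_ = zs*zs + xs
# Have we diverged with this new value?
not_diverged = tf.complex_abs(zs_) < 4
# Operation to update the zs and the iteration count.
#
# Note: We keep computing zs after they diverge, which is very wasteful!
# There are better, if slightly less simple, ways to do this.
#
step = tf.group(
  zs.assign(zs_),
  ns.assign_add(tf.cast(not_diverged, "float32"))
  )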
```
... and run it for a couple hundred steps
```python
for i in range(200): step.run()
```
Let's see what we've got.
```python
DisplayFractal(ns.eval())
```

Not bad!
> Original: [Mandelbrot Set](http://tensorflow.org/tutorials/mandelbrot/index.md) Translation: [ericxk](https://github.com/ericxk) Proofreading: [tensorfly](https://github.com/tensorfly)
================================================
FILE: SOURCE/tutorials/mnist/__init__.py
================================================
================================================
FILE: SOURCE/tutorials/mnist/beginners/index.md
================================================
# MNIST For ML Beginners
*This tutorial is intended for readers who are new to both machine learning and
TensorFlow. If you already
know what MNIST is, and what softmax (multinomial logistic) regression is,
you might prefer this [faster paced tutorial](../../../tutorials/mnist/pros/index.md).*
When one learns how to program, there's a tradition that the first thing you do
is print "Hello World." Just like programming has Hello World, machine learning
has MNIST.
MNIST is a simple computer vision dataset. It consists of images of handwritten
digits like these:
It also includes labels for each image, telling us which digit it is. For
example, the labels for the above images are 5, 0, 4, and 1.
In this tutorial, we're going to train a model to look at images and predict
what digits they are. Our goal isn't to train a really elaborate model that
achieves state-of-the-art performance -- although we'll give you code to do that
later! -- but rather to dip a toe into using TensorFlow. As such, we're going
to start with a very simple model, called a Softmax Regression.
The actual code for this tutorial is very short, and all the interesting
stuff happens in just three lines. However, it is very
important to understand the ideas behind it: both how TensorFlow works and the
core machine learning concepts. Because of this, we are going to very carefully
work through the code.
## The MNIST Data
The MNIST data is hosted on
[Yann LeCun's website](http://yann.lecun.com/exdb/mnist/). For your
convenience, we've included some Python code to download and install the data
automatically. You can either download
[the code](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/g3doc/tutorials/mnist/input_data.py)
and import it as below, or simply copy and paste it in.
```python
import input_data
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)
```
The downloaded data is split into three parts: 55,000 data points of training
data (`mnist.train`), 10,000 points of test data (`mnist.test`), and 5,000
points of validation data (`mnist.validation`). This
split is very important: it's essential in machine learning that we
have separate data which we don't learn from so that we can make sure
that what we've learned actually generalizes!
As mentioned earlier, every MNIST data point has two parts: an image of a
handwritten digit and a corresponding label. We will call the images "xs" and
the labels "ys". Both the training set and test set contain xs and ys, for
example the training images are `mnist.train.images` and the train labels are
`mnist.train.labels`.
Each image is 28 pixels by 28 pixels. We can interpret this as a big array of
numbers:
We can flatten this array into a vector of 28x28 = 784 numbers. It doesn't
matter how we flatten the array, as long as we're consistent between images.
From this perspective, the MNIST images are just a bunch of points in a
784-dimensional vector space, with a
[very rich structure](http://colah.github.io/posts/2014-10-Visualizing-MNIST/)
(warning: computationally intensive visualizations).
Flattening the data throws away information about the 2D structure of the image.
Isn't that bad? Well, the best computer vision methods do exploit this
structure, and we will in later tutorials. But the simple method we will be
using here, a softmax regression, won't.
The result is that `mnist.train.images` is a tensor (an n-dimensional array) with a
shape of `[55000, 784]`. The first dimension indexes the images and the second
dimension indexes the pixels in each image. Each entry in the tensor is the
pixel intensity between 0 and 1, for a particular pixel in a particular image.
The corresponding labels in MNIST are numbers between 0 and 9, describing
which digit a given image is of.
For the purposes of this tutorial, we're going to want our labels
as "one-hot vectors". A one-hot vector is a vector which is 0 in most
dimensions, and 1 in a single dimension. In this case, the \\(n\\)th digit will be
represented as a vector which is 1 in the \\(n\\)th dimension. For example, 3 would be \\([0,0,0,1,0,0,0,0,0,0]\\).
Consequently, `mnist.train.labels` is a
`[55000, 10]` array of floats.
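As a small illustration of the encoding, the following NumPy sketch turns integer labels into one-hot rows. The helper name `to_one_hot` is made up for this example; it is not part of the tutorial's code.

```python
import numpy as np

def to_one_hot(labels, num_classes=10):
    """Convert integer class labels to one-hot rows (illustrative helper)."""
    labels = np.asarray(labels)
    one_hot = np.zeros((labels.shape[0], num_classes), dtype=np.float32)
    # Set a single 1 per row, in the column given by the label.
    one_hot[np.arange(labels.shape[0]), labels] = 1.0
    return one_hot

encoded = to_one_hot([3, 0, 9])
print(encoded[0])  # the row for the digit 3
```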
We're now ready to actually make our model!
## Softmax Regressions
We know that every image in MNIST is a digit, whether it's a zero or a nine. We
want to be able to look at an image and give probabilities for it being each
digit. For example, our model might look at a picture of a nine and be 80% sure
it's a nine, but give a 5% chance to it being an eight (because of the top loop)
and a bit of probability to all the others because it isn't sure.
This is a classic case where a softmax regression is a natural, simple model.
If you want to assign probabilities to an object being one of several different
things, softmax is the thing to do. Even later on, when we train more
sophisticated models, the final step will be a layer of softmax.
A softmax regression has two steps: first we add up the evidence of our input
being in certain classes, and then we convert that evidence into probabilities.
To tally up the evidence that a given image is in a particular class, we do a
weighted sum of the pixel intensities. The weight is negative if that pixel
having a high intensity is evidence against the image being in that class,
and positive if it is evidence in favor.
The following diagram shows the weights one model learned for each of these
classes. Red represents negative weights, while blue represents positive
weights.
We also add some extra evidence called a bias. Basically, we want to be able
to say that some things are more likely independent of the input. The result is
that the evidence for a class \\(i\\) given an input \\(x\\) is:
$$\text{evidence}_i = \sum_j W_{i,~ j} x_j + b_i$$
where \\(W\_i\\) is the weights and \\(b\_i\\) is the bias for class \\(i\\),
and \\(j\\) is an index for summing over the pixels in our input image \\(x\\).
We then convert the evidence tallies into our predicted probabilities
\\(y\\) using the "softmax" function:
$$y = \text{softmax}(\text{evidence})$$
Here softmax is serving as an "activation" or "link" function, shaping
the output of our linear function into the form we want -- in this case, a
probability distribution over 10 cases.
You can think of it as converting tallies
of evidence into probabilities of our input being in each class.
It's defined as:
$$\text{softmax}(x) = \text{normalize}(\exp(x))$$
If you expand that equation out, you get:
$$\text{softmax}(x)_i = \frac{\exp(x_i)}{\sum_j \exp(x_j)}$$
But it's often more helpful to think of softmax the first way:
exponentiating its inputs and then normalizing them. The exponentiation
means that one unit more evidence increases the weight given to any hypothesis
multiplicatively. And conversely, having one less unit of evidence means that a
hypothesis gets a fraction of its earlier weight. No hypothesis ever has zero
or negative weight. Softmax then normalizes these weights, so that they add up
to one, forming a valid probability distribution. (To get more intuition about
the softmax function, check out the
[section](http://neuralnetworksanddeeplearning.com/chap3.html#softmax)
on it in Michael Nielsen's book, complete with an interactive visualization.)
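To make the "exponentiate, then normalize" reading concrete, here is a minimal NumPy sketch. Subtracting the maximum before exponentiating is an extra numerical-stability step added here, not something the text describes; it leaves the result unchanged.

```python
import numpy as np

def softmax(evidence):
    """Exponentiate the evidence, then normalize so the entries sum to one."""
    # Added for numerical stability; does not change the output.
    shifted = evidence - np.max(evidence)
    weights = np.exp(shifted)
    return weights / np.sum(weights)

evidence = np.array([1.0, 2.0, 3.0])
probs = softmax(evidence)
```

Each extra unit of evidence multiplies the weight by a constant factor (here \\(e\\)), so `probs[2] / probs[1]` equals `probs[1] / probs[0]`.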
You can picture our softmax regression as looking something like the following,
although with a lot more \\(x\\)s. For each output, we compute a weighted sum of
the \\(x\\)s, add a bias, and then apply softmax.
If we write that out as equations, we get:
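For an illustrative case with only three inputs and three outputs (sizes assumed here purely for readability), the equations read:

$$\begin{bmatrix} y_1 \\ y_2 \\ y_3 \end{bmatrix} = \text{softmax}\left(\begin{bmatrix} W_{1,1}x_1 + W_{1,2}x_2 + W_{1,3}x_3 + b_1 \\ W_{2,1}x_1 + W_{2,2}x_2 + W_{2,3}x_3 + b_2 \\ W_{3,1}x_1 + W_{3,2}x_2 + W_{3,3}x_3 + b_3 \end{bmatrix}\right)$$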
We can "vectorize" this procedure, turning it into a matrix multiplication
and vector addition. This is helpful for computational efficiency. (It's also
a useful way to think.)
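For an illustrative case with three inputs and three outputs (sizes assumed here purely for readability), the vectorized form is:

$$\begin{bmatrix} y_1 \\ y_2 \\ y_3 \end{bmatrix} = \text{softmax}\left(\begin{bmatrix} W_{1,1} & W_{1,2} & W_{1,3} \\ W_{2,1} & W_{2,2} & W_{2,3} \\ W_{3,1} & W_{3,2} & W_{3,3} \end{bmatrix} \cdot \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} + \begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix}\right)$$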
More compactly, we can just write:
$$y = \text{softmax}(Wx + b)$$
## Implementing the Regression
To do efficient numerical computing in Python, we typically use libraries like
NumPy that do expensive operations such as matrix multiplication outside Python,
using highly efficient code implemented in another language.
Unfortunately, there can still be a lot of overhead from switching back to
Python for every operation. This overhead is especially bad if you want to run
computations on GPUs or in a distributed manner, where there can be a high cost
to transferring data.
TensorFlow also does its heavy lifting outside Python,
but it takes things a step further to avoid this overhead.
Instead of running a single expensive operation independently
from Python, TensorFlow lets us describe a graph of interacting operations that
run entirely outside Python. (Approaches like this can be seen in a few
machine learning libraries.)
To use TensorFlow, we need to import it.
```python
import tensorflow as tf
```
We describe these interacting operations by manipulating symbolic variables.
Let's create one:
```python
x = tf.placeholder("float", [None, 784])
```
`x` isn't a specific value. It's a `placeholder`, a value that we'll input when
we ask TensorFlow to run a computation. We want to be able to input any number
of MNIST images, each flattened into a 784-dimensional vector. We represent
this as a 2d tensor of floating point numbers, with a shape `[None, 784]`.
(Here `None` means that a dimension can be of any length.)
We also need the weights and biases for our model. We could imagine treating
these like additional inputs, but TensorFlow has an even better way to handle
it: `Variable`.
A `Variable` is a modifiable tensor that lives in TensorFlow's graph of
interacting
operations. It can be used and even modified by the computation. For machine
learning applications, one generally has the model parameters be `Variable`s.
```python
W = tf.Variable(tf.zeros([784,10]))
b = tf.Variable(tf.zeros([10]))
```
We create these `Variable`s by giving `tf.Variable` the initial value of the
`Variable`: in this case, we initialize both `W` and `b` as tensors full of
zeros. Since we are going to learn `W` and `b`, it doesn't matter very much
what they initially are.
Notice that `W` has a shape of `[784, 10]` because we want to multiply the
784-dimensional image vectors by it to produce 10-dimensional vectors of
evidence for the different classes. `b` has a shape of `[10]` so we can add it
to the output.
We can now implement our model. It only takes one line!
```python
y = tf.nn.softmax(tf.matmul(x,W) + b)
```
First, we multiply `x` by `W` with the expression `tf.matmul(x,W)`. This is
flipped from when we multiplied them in our equation, where we had \\(Wx\\), as a
small trick
to deal with `x` being a 2D tensor with multiple inputs. We then add `b`, and
finally apply `tf.nn.softmax`.
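The flipped multiplication can be checked in plain NumPy. The sketch below uses random stand-in data (the batch of 5 images and the weight values are made up) and shows that the row form `batch.dot(W)` gives, for each image, the same evidence as the column form from the equation, where the equation's \\(W\\) corresponds to the transpose of the `[784, 10]` array used in the code.

```python
import numpy as np

rng = np.random.RandomState(0)
batch = rng.rand(5, 784)          # 5 hypothetical flattened images, one per row
W = rng.randn(784, 10) * 0.01     # arbitrary weights, for illustration only
b = np.zeros(10)

# Row form, as in the code: each row of `batch` is one image.
evidence = batch.dot(W) + b       # shape (5, 10); b is broadcast over the rows

# Column form, as in the equation, applied to the first image alone.
evidence_first = W.T.dot(batch[0]) + b

def softmax_rows(e):
    """Apply softmax independently to every row."""
    e = e - e.max(axis=1, keepdims=True)
    w = np.exp(e)
    return w / w.sum(axis=1, keepdims=True)

y = softmax_rows(evidence)        # one probability distribution per image
```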
That's it. It only took us one line to define our model, after a couple short
lines of setup. That isn't because TensorFlow is designed to make a softmax
regression particularly easy: it's just a very flexible way to describe many
kinds of numerical computations, from machine learning models to physics
simulations. And once defined, our model can be run on different devices:
your computer's CPU, GPUs, and even phones!
## Training
In order to train our model, we need to define what it means for the model to
be good. Well, actually, in machine learning we typically define what it means
for a model to be bad, called the cost or loss, and then try to minimize how bad
it is. But the two are equivalent.
One very common, very nice cost function is "cross-entropy." Surprisingly,
cross-entropy arises from thinking about information-compressing codes in
information theory, but it winds up being an important idea in lots of areas,
from gambling to machine learning. It's defined as:
$$H_{y'}(y) = -\sum_i y'_i \log(y_i)$$
Where \\(y\\) is our predicted probability distribution, and \\(y'\\) is the true
distribution (the one-hot vector we'll input). In some rough sense, the
cross-entropy is measuring how inefficient our predictions are for describing
the truth. Going into more detail about cross-entropy is beyond the scope of
this tutorial, but it's well worth
[understanding](http://colah.github.io/posts/2015-09-Visual-Information/).
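A small numeric example may help with the intuition. In this NumPy sketch (the probability values are made up), a confident correct prediction gets a lower cross-entropy than an unsure one; with a one-hot \\(y'\\), the sum reduces to minus the log of the probability assigned to the true class.

```python
import numpy as np

def cross_entropy(y_true, y_pred):
    """H_{y'}(y) = -sum_i y'_i * log(y_i) for a single example."""
    return -np.sum(y_true * np.log(y_pred))

y_true = np.array([0.0, 0.0, 1.0])            # one-hot: the true class is index 2
confident = np.array([0.05, 0.05, 0.90])
unsure = np.array([0.40, 0.30, 0.30])

loss_confident = cross_entropy(y_true, confident)   # equals -log(0.90)
loss_unsure = cross_entropy(y_true, unsure)         # equals -log(0.30)
```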
To implement cross-entropy we need to first add a new placeholder to input
the correct answers:
```python
y_ = tf.placeholder("float", [None,10])
```
Then we can implement the cross-entropy, \\(-\sum y'\log(y)\\):
```python
cross_entropy = -tf.reduce_sum(y_*tf.log(y))
```
First, `tf.log` computes the logarithm of each element of `y`. Next, we multiply
each element of `y_` with the corresponding element of `tf.log(y)`. Finally,
`tf.reduce_sum` adds all the elements of the tensor. (Note that this isn't
just the cross-entropy of the truth with a single prediction, but the sum of the
cross-entropies for all 100 images we looked at. How well we are doing on 100
data points is a much better description of how good our model is than a single
data point.)
Now that we know what we want our model to do, it's very easy to have TensorFlow
train it to do so.
Because TensorFlow knows the entire graph of your computations, it
can automatically use the [backpropagation
algorithm](http://colah.github.io/posts/2015-08-Backprop/)
to efficiently determine how your variables affect the cost you ask it to minimize.
Then it can apply your choice of optimization algorithm to modify the variables
and reduce the cost.
```python
train_step = tf.train.GradientDescentOptimizer(0.01).minimize(cross_entropy)
```
In this case, we ask TensorFlow to minimize `cross_entropy` using the gradient
descent algorithm with a learning rate of 0.01. Gradient descent is a simple
procedure, where TensorFlow simply shifts each variable a little bit in the
direction that reduces the cost. But TensorFlow also provides
[many other optimization algorithms](../../../api_docs/python/train.md#optimizers): using one is as simple as
tweaking one line.
What TensorFlow actually does here, behind the scenes, is it adds new operations
to your graph which
implement backpropagation and gradient descent. Then it gives you back a
single operation which, when run, will do a step of gradient descent training,
slightly tweaking your variables to reduce the cost.
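To give a feel for what such a training operation computes, here is a hand-written sketch of one gradient descent step for softmax regression in NumPy. It is not what TensorFlow generates internally. It relies on the standard result that the gradient of the summed cross-entropy with respect to the evidence is `probs - Y`, and the data, sizes, and helper names are made up for illustration.

```python
import numpy as np

def softmax_rows(e):
    """Apply softmax independently to every row."""
    e = e - e.max(axis=1, keepdims=True)
    w = np.exp(e)
    return w / w.sum(axis=1, keepdims=True)

def loss_and_grads(X, Y, W, b):
    """Summed cross-entropy and its gradients for softmax regression."""
    probs = softmax_rows(X.dot(W) + b)
    loss = -np.sum(Y * np.log(probs))
    delta = probs - Y                     # gradient with respect to the evidence
    return loss, X.T.dot(delta), delta.sum(axis=0)

rng = np.random.RandomState(0)
X = rng.rand(20, 4)                              # 20 made-up inputs with 4 features
Y = np.eye(3)[rng.randint(0, 3, size=20)]        # made-up one-hot labels, 3 classes
W, b = np.zeros((4, 3)), np.zeros(3)

loss_before, grad_W, grad_b = loss_and_grads(X, Y, W, b)
learning_rate = 0.01
W -= learning_rate * grad_W                      # shift each variable a little
b -= learning_rate * grad_b                      # in the direction that lowers the cost
loss_after, _, _ = loss_and_grads(X, Y, W, b)
```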
Now we have our model set up to train. One last thing before we launch it,
we have to add an operation to initialize the variables we created:
```python
init = tf.initialize_all_variables()
```
We can now launch the model in a `Session`, and run the operation that
initializes the variables:
```python
sess = tf.Session()
sess.run(init)
```
Let's train -- we'll run the training step 1000 times!
```python
for i in range(1000):
  batch_xs, batch_ys = mnist.train.next_batch(100)
  sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})
```
Each step of the loop, we get a "batch" of one hundred random data points from
our training set. We run `train_step`, feeding in the batch data to replace
the `placeholder`s.
Using small batches of random data is called stochastic training -- in
this case, stochastic gradient descent. Ideally, we'd like to use all our data
for every step of training because that would give us a better sense of what
we should be doing, but that's expensive. So, instead, we use a different subset
every time. Doing this is cheap and has much of the same benefit.
## Evaluating Our Model
How well does our model do?
Well, first let's figure out where we predicted the correct label. `tf.argmax`
is an extremely useful function which gives you the index of the highest entry
in a tensor along some axis. For example, `tf.argmax(y,1)` is the label our
model thinks is most likely for each input, while `tf.argmax(y_,1)` is the
correct label. We can use `tf.equal` to check if our prediction matches the
truth.
```python
correct_prediction = tf.equal(tf.argmax(y,1), tf.argmax(y_,1))
```
That gives us a list of booleans. To determine what fraction are correct, we
cast to floating point numbers and then take the mean. For example,
`[True, False, True, True]` would become `[1,0,1,1]` which would become `0.75`.
```python
accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
```
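The same argmax, compare, and average steps can be tried on a tiny made-up example in NumPy, which reproduces the `[True, False, True, True]` case mentioned above:

```python
import numpy as np

# Made-up predictions for 4 inputs over 3 classes, with their one-hot labels.
y_pred = np.array([[0.1, 0.8, 0.1],
                   [0.6, 0.3, 0.1],
                   [0.2, 0.2, 0.6],
                   [0.7, 0.2, 0.1]])
y_true = np.array([[0, 1, 0],
                   [0, 0, 1],
                   [0, 0, 1],
                   [1, 0, 0]])

# Compare the most likely predicted class with the true class, per row.
correct = np.argmax(y_pred, axis=1) == np.argmax(y_true, axis=1)
# Cast booleans to floats and average them to get the fraction correct.
accuracy = np.mean(correct.astype(np.float32))
```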
Finally, we ask for our accuracy on our test data.
```python
print(sess.run(accuracy, feed_dict={x: mnist.test.images, y_: mnist.test.labels}))
```
This should be about 91%.
Is that good? Well, not really. In fact, it's pretty bad. This is because we're
using a very simple model. With some small changes, we can get to
97%. The best models can get to over 99.7% accuracy! (For more information, have
a look at this
[list of results](http://rodrigob.github.io/are_we_there_yet/build/classification_datasets_results.html).)
What matters is that we learned from this model. Still, if you're feeling a bit
down about these results, check out [the next tutorial](../../../tutorials/overview.md) where we
do a lot better, and learn how to build more sophisticated models using
TensorFlow!
================================================
FILE: SOURCE/tutorials/mnist/download/index.md
================================================
# MNIST Data Download
Code: [tensorflow/g3doc/tutorials/mnist/](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/g3doc/tutorials/mnist/)
The goal of this tutorial is to show how to download the dataset files required
for handwritten digit classification using the (classic) MNIST data set.
## Tutorial Files
This tutorial references the following files:
File | Purpose
--- | ---
[`input_data.py`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/g3doc/tutorials/mnist/input_data.py) | The code to download the MNIST dataset for training and evaluation.
## Prepare the Data
MNIST is a classic problem in machine learning. The problem is to look at
greyscale 28x28 pixel images of handwritten digits and determine which digit
the image represents, for all the digits from zero to nine.

For more information, refer to [Yann LeCun's MNIST page](http://yann.lecun.com/exdb/mnist/)
or [Chris Olah's visualizations of MNIST](http://colah.github.io/posts/2014-10-Visualizing-MNIST/).
### Download
[Yann LeCun's MNIST page](http://yann.lecun.com/exdb/mnist/)
also hosts the training and test data for download.
File | Purpose
--- | ---
[`train-images-idx3-ubyte.gz`](http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz) | training set images - 55000 training images, 5000 validation images
[`train-labels-idx1-ubyte.gz`](http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz) | training set labels matching the images
[`t10k-images-idx3-ubyte.gz`](http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz) | test set images - 10000 images
[`t10k-labels-idx1-ubyte.gz`](http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz) | test set labels matching the images
In the `input_data.py` file, the `maybe_download()` function will ensure these
files are downloaded into a local data folder for training.
The folder name is specified in a flag variable at the top of the
`fully_connected_feed.py` file and may be changed to fit your needs.
### Unpack and Reshape
The files themselves are not in any standard image format and are manually
unpacked (following the instructions available at the website) by the
`extract_images()` and `extract_labels()` functions in `input_data.py`.
The image data is extracted into a 2d tensor of: `[image index, pixel index]`
where each entry is the intensity value of a specific pixel in a specific
image, rescaled from `[0, 255]` to `[0.0, 1.0]`. The "image index" corresponds
to an image in the dataset, counting up from zero to the size of the dataset.
And the "pixel index" corresponds to a specific pixel in that image, ranging
from zero to the number of pixels in the image.
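As a rough sketch of this reshaping step, the NumPy snippet below flattens a stand-in array of decoded images and rescales it, following the conversion in `input_data.py`, which maps `[0, 255]` to `[0.0, 1.0]`. The array `raw` and its contents are made up for illustration.

```python
import numpy as np

# Hypothetical stand-in for decoded image data: 2 images of 28x28 pixels, depth 1.
raw = np.zeros((2, 28, 28, 1), dtype=np.uint8)
raw[0, 0, 0, 0] = 255
raw[1, 27, 27, 0] = 128

# [image index, y, x, depth] -> [image index, pixel index]
flat = raw.reshape(raw.shape[0], raw.shape[1] * raw.shape[2])
# uint8 intensities in [0, 255] -> float32 values in [0.0, 1.0]
flat = flat.astype(np.float32) * (1.0 / 255.0)
```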
The 60000 examples in the `train-*` files are then split into 55000 examples
for training and 5000 examples for validation. For all of the 28x28
pixel greyscale images in the datasets the image size is 784 and so the output
tensor for the training set images is of shape `[55000, 784]`.
The label data is extracted into a 1d tensor of: `[image index]`
with the class identifier for each example as the value. For the training set
labels, this would then be of shape `[55000]`.
### DataSet Object
The underlying code will download, unpack, and reshape images and labels for
the following datasets:
Dataset | Purpose
--- | ---
`data_sets.train` | 55000 images and labels, for primary training.
`data_sets.validation` | 5000 images and labels, for iterative validation of training accuracy.
`data_sets.test` | 10000 images and labels, for final testing of trained accuracy.
The `read_data_sets()` function returns an object holding a `DataSet`
instance for each of these three sets of data, exposed as the attributes
`train`, `validation`, and `test`. The `DataSet.next_batch()`
method can be used to fetch a tuple of `batch_size` images and their matching
labels, to be fed into the running TensorFlow session.
```python
images_feed, labels_feed = data_set.next_batch(FLAGS.batch_size)
```
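To illustrate the batching idea, here is a simplified stand-in class. `TinyDataSet` is a made-up name and this is not the tutorial's `DataSet` implementation. It hands out consecutive slices and reshuffles images and labels together once an epoch is used up.

```python
import numpy as np

class TinyDataSet(object):
    """Illustrative stand-in for `DataSet`; not the tutorial's implementation."""

    def __init__(self, images, labels):
        self.images, self.labels = images, labels
        self.index = 0
        self.epochs_completed = 0

    def next_batch(self, batch_size):
        if self.index + batch_size > len(self.images):
            # Epoch finished: reshuffle images and labels together, then restart.
            perm = np.random.permutation(len(self.images))
            self.images, self.labels = self.images[perm], self.labels[perm]
            self.index = 0
            self.epochs_completed += 1
        start, self.index = self.index, self.index + batch_size
        return self.images[start:self.index], self.labels[start:self.index]

data_set = TinyDataSet(np.arange(10).reshape(10, 1), np.arange(10))
batches = [data_set.next_batch(5) for _ in range(3)]
```

Applying the same permutation to both arrays keeps every image paired with its label across epochs.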
================================================
FILE: SOURCE/tutorials/mnist/fully_connected_feed.py
================================================
"""Trains and Evaluates the MNIST network using a feed dictionary.
TensorFlow install instructions:
https://tensorflow.org/get_started/os_setup.html
MNIST tutorial:
https://tensorflow.org/tutorials/mnist/tf/index.html
"""
from __future__ import print_function
# pylint: disable=missing-docstring
import os.path
import time
import tensorflow.python.platform
import numpy
import tensorflow as tf
from tensorflow.g3doc.tutorials.mnist import input_data
from tensorflow.g3doc.tutorials.mnist import mnist
# Basic model parameters as external flags.
flags = tf.app.flags
FLAGS = flags.FLAGS
flags.DEFINE_float('learning_rate', 0.01, 'Initial learning rate.')
flags.DEFINE_integer('max_steps', 2000, 'Number of steps to run trainer.')
flags.DEFINE_integer('hidden1', 128, 'Number of units in hidden layer 1.')
flags.DEFINE_integer('hidden2', 32, 'Number of units in hidden layer 2.')
flags.DEFINE_integer('batch_size', 100, 'Batch size. '
'Must divide evenly into the dataset sizes.')
flags.DEFINE_string('train_dir', 'data', 'Directory to put the training data.')
flags.DEFINE_boolean('fake_data', False, 'If true, uses fake data '
'for unit testing.')
def placeholder_inputs(batch_size):
  """Generate placeholder variables to represent the input tensors.
  These placeholders are used as inputs by the rest of the model building
  code and will be fed from the downloaded data in the .run() loop, below.
  Args:
    batch_size: The batch size will be baked into both placeholders.
  Returns:
    images_placeholder: Images placeholder.
    labels_placeholder: Labels placeholder.
  """
  # Note that the shapes of the placeholders match the shapes of the full
  # image and label tensors, except the first dimension is now batch_size
  # rather than the full size of the train or test data sets.
  images_placeholder = tf.placeholder(tf.float32, shape=(batch_size,
                                                         mnist.IMAGE_PIXELS))
  labels_placeholder = tf.placeholder(tf.int32, shape=(batch_size))
  return images_placeholder, labels_placeholder
def fill_feed_dict(data_set, images_pl, labels_pl):
  """Fills the feed_dict for training the given step.
  A feed_dict takes the form of:
  feed_dict = {
      <placeholder>: <tensor of values to be passed for placeholder>,
      ....
  }
  Args:
    data_set: The set of images and labels, from input_data.read_data_sets()
    images_pl: The images placeholder, from placeholder_inputs().
    labels_pl: The labels placeholder, from placeholder_inputs().
  Returns:
    feed_dict: The feed dictionary mapping from placeholders to values.
  """
  # Create the feed_dict for the placeholders filled with the next
  # `batch size` examples.
  images_feed, labels_feed = data_set.next_batch(FLAGS.batch_size,
                                                 FLAGS.fake_data)
  feed_dict = {
      images_pl: images_feed,
      labels_pl: labels_feed,
  }
  return feed_dict
def do_eval(sess,
            eval_correct,
            images_placeholder,
            labels_placeholder,
            data_set):
  """Runs one evaluation against the full epoch of data.
  Args:
    sess: The session in which the model has been trained.
    eval_correct: The Tensor that returns the number of correct predictions.
    images_placeholder: The images placeholder.
    labels_placeholder: The labels placeholder.
    data_set: The set of images and labels to evaluate, from
      input_data.read_data_sets().
  """
  # And run one epoch of eval.
  true_count = 0  # Counts the number of correct predictions.
  steps_per_epoch = int(data_set.num_examples / FLAGS.batch_size)
  num_examples = steps_per_epoch * FLAGS.batch_size
  for step in xrange(steps_per_epoch):
    feed_dict = fill_feed_dict(data_set,
                               images_placeholder,
                               labels_placeholder)
    true_count += sess.run(eval_correct, feed_dict=feed_dict)
  precision = float(true_count) / float(num_examples)
  print(' Num examples: %d Num correct: %d Precision @ 1: %0.04f' %
        (num_examples, true_count, precision))
def run_training():
  """Train MNIST for a number of steps."""
  # Get the sets of images and labels for training, validation, and
  # test on MNIST.
  data_sets = input_data.read_data_sets(FLAGS.train_dir, FLAGS.fake_data)
  # Tell TensorFlow that the model will be built into the default Graph.
  with tf.Graph().as_default():
    # Generate placeholders for the images and labels.
    images_placeholder, labels_placeholder = placeholder_inputs(
        FLAGS.batch_size)
    # Build a Graph that computes predictions from the inference model.
    logits = mnist.inference(images_placeholder,
                             FLAGS.hidden1,
                             FLAGS.hidden2)
    # Add to the Graph the Ops for loss calculation.
    loss = mnist.loss(logits, labels_placeholder)
    # Add to the Graph the Ops that calculate and apply gradients.
    train_op = mnist.training(loss, FLAGS.learning_rate)
    # Add the Op to compare the logits to the labels during evaluation.
    eval_correct = mnist.evaluation(logits, labels_placeholder)
    # Build the summary operation based on the TF collection of Summaries.
    summary_op = tf.merge_all_summaries()
    # Create a saver for writing training checkpoints.
    saver = tf.train.Saver()
    # Create a session for running Ops on the Graph.
    sess = tf.Session()
    # Run the Op to initialize the variables.
    init = tf.initialize_all_variables()
    sess.run(init)
    # Instantiate a SummaryWriter to output summaries and the Graph.
    summary_writer = tf.train.SummaryWriter(FLAGS.train_dir,
                                            graph_def=sess.graph_def)
    # And then after everything is built, start the training loop.
    for step in xrange(FLAGS.max_steps):
      start_time = time.time()
      # Fill a feed dictionary with the actual set of images and labels
      # for this particular training step.
      feed_dict = fill_feed_dict(data_sets.train,
                                 images_placeholder,
                                 labels_placeholder)
      # Run one step of the model. The return values are the activations
      # from the `train_op` (which is discarded) and the `loss` Op. To
      # inspect the values of your Ops or variables, you may include them
      # in the list passed to sess.run() and the value tensors will be
      # returned in the tuple from the call.
      _, loss_value = sess.run([train_op, loss],
                               feed_dict=feed_dict)
      duration = time.time() - start_time
      # Write the summaries and print an overview fairly often.
      if step % 100 == 0:
        # Print status to stdout.
        print('Step %d: loss = %.2f (%.3f sec)' % (step, loss_value, duration))
        # Update the events file.
        summary_str = sess.run(summary_op, feed_dict=feed_dict)
        summary_writer.add_summary(summary_str, step)
      # Save a checkpoint and evaluate the model periodically.
      if (step + 1) % 1000 == 0 or (step + 1) == FLAGS.max_steps:
        saver.save(sess, FLAGS.train_dir, global_step=step)
        # Evaluate against the training set.
        print('Training Data Eval:')
        do_eval(sess,
                eval_correct,
                images_placeholder,
                labels_placeholder,
                data_sets.train)
        # Evaluate against the validation set.
        print('Validation Data Eval:')
        do_eval(sess,
                eval_correct,
                images_placeholder,
                labels_placeholder,
                data_sets.validation)
        # Evaluate against the test set.
        print('Test Data Eval:')
        do_eval(sess,
                eval_correct,
                images_placeholder,
                labels_placeholder,
                data_sets.test)
def main(_):
  run_training()
if __name__ == '__main__':
  tf.app.run()
================================================
FILE: SOURCE/tutorials/mnist/input_data.py
================================================
"""Functions for downloading and reading MNIST data."""
from __future__ import print_function
import gzip
import os
import urllib
import numpy
SOURCE_URL = 'http://yann.lecun.com/exdb/mnist/'
def maybe_download(filename, work_directory):
  """Download the data from Yann's website, unless it's already here."""
  if not os.path.exists(work_directory):
    os.mkdir(work_directory)
  filepath = os.path.join(work_directory, filename)
  if not os.path.exists(filepath):
    filepath, _ = urllib.urlretrieve(SOURCE_URL + filename, filepath)
    statinfo = os.stat(filepath)
    print('Successfully downloaded', filename, statinfo.st_size, 'bytes.')
  return filepath
def _read32(bytestream):
  dt = numpy.dtype(numpy.uint32).newbyteorder('>')
  return numpy.frombuffer(bytestream.read(4), dtype=dt)
def extract_images(filename):
  """Extract the images into a 4D uint8 numpy array [index, y, x, depth]."""
  print('Extracting', filename)
  with gzip.open(filename) as bytestream:
    magic = _read32(bytestream)
    if magic != 2051:
      raise ValueError(
          'Invalid magic number %d in MNIST image file: %s' %
          (magic, filename))
    num_images = _read32(bytestream)
    rows = _read32(bytestream)
    cols = _read32(bytestream)
    buf = bytestream.read(rows * cols * num_images)
    data = numpy.frombuffer(buf, dtype=numpy.uint8)
    data = data.reshape(num_images, rows, cols, 1)
    return data
def dense_to_one_hot(labels_dense, num_classes=10):
  """Convert class labels from scalars to one-hot vectors."""
  num_labels = labels_dense.shape[0]
  index_offset = numpy.arange(num_labels) * num_classes
  labels_one_hot = numpy.zeros((num_labels, num_classes))
  labels_one_hot.flat[index_offset + labels_dense.ravel()] = 1
  return labels_one_hot
def extract_labels(filename, one_hot=False):
  """Extract the labels into a 1D uint8 numpy array [index]."""
  print('Extracting', filename)
  with gzip.open(filename) as bytestream:
    magic = _read32(bytestream)
    if magic != 2049:
      raise ValueError(
          'Invalid magic number %d in MNIST label file: %s' %
          (magic, filename))
    num_items = _read32(bytestream)
    buf = bytestream.read(num_items)
    labels = numpy.frombuffer(buf, dtype=numpy.uint8)
    if one_hot:
      return dense_to_one_hot(labels)
    return labels
class DataSet(object):

  def __init__(self, images, labels, fake_data=False):
    if fake_data:
      self._num_examples = 10000
    else:
      assert images.shape[0] == labels.shape[0], (
          "images.shape: %s labels.shape: %s" % (images.shape,
                                                 labels.shape))
      self._num_examples = images.shape[0]
      # Convert shape from [num examples, rows, columns, depth]
      # to [num examples, rows*columns] (assuming depth == 1)
      assert images.shape[3] == 1
      images = images.reshape(images.shape[0],
                              images.shape[1] * images.shape[2])
      # Convert from [0, 255] -> [0.0, 1.0].
      images = images.astype(numpy.float32)
      images = numpy.multiply(images, 1.0 / 255.0)
    self._images = images
    self._labels = labels
    self._epochs_completed = 0
    self._index_in_epoch = 0

  @property
  def images(self):
    return self._images

  @property
  def labels(self):
    return self._labels

  @property
  def num_examples(self):
    return self._num_examples

  @property
  def epochs_completed(self):
    return self._epochs_completed

  def next_batch(self, batch_size, fake_data=False):
    """Return the next `batch_size` examples from this data set."""
    if fake_data:
      fake_image = [1.0 for _ in xrange(784)]
      fake_label = 0
      return [fake_image for _ in xrange(batch_size)], [
          fake_label for _ in xrange(batch_size)]
    start = self._index_in_epoch
    self._index_in_epoch += batch_size
    if self._index_in_epoch > self._num_examples:
      # Finished epoch
      self._epochs_completed += 1
      # Shuffle the data
      perm = numpy.arange(self._num_examples)
      numpy.random.shuffle(perm)
      self._images = self._images[perm]
      self._labels = self._labels[perm]
      # Start next epoch
      start = 0
      self._index_in_epoch = batch_size
      assert batch_size <= self._num_examples
    end = self._index_in_epoch
    return self._images[start:end], self._labels[start:end]
def read_data_sets(train_dir, fake_data=False, one_hot=False):
  class DataSets(object):
    pass
  data_sets = DataSets()
  if fake_data:
    data_sets.train = DataSet([], [], fake_data=True)
    data_sets.validation = DataSet([], [], fake_data=True)
    data_sets.test = DataSet([], [], fake_data=True)
    return data_sets
  TRAIN_IMAGES = 'train-images-idx3-ubyte.gz'
  TRAIN_LABELS = 'train-labels-idx1-ubyte.gz'
  TEST_IMAGES = 't10k-images-idx3-ubyte.gz'
  TEST_LABELS = 't10k-labels-idx1-ubyte.gz'
  VALIDATION_SIZE = 5000
  local_file = maybe_download(TRAIN_IMAGES, train_dir)
  train_images = extract_images(local_file)
  local_file = maybe_download(TRAIN_LABELS, train_dir)
  train_labels = extract_labels(local_file, one_hot=one_hot)
  local_file = maybe_download(TEST_IMAGES, train_dir)
  test_images = extract_images(local_file)
  local_file = maybe_download(TEST_LABELS, train_dir)
  test_labels = extract_labels(local_file, one_hot=one_hot)
  # Hold out the first VALIDATION_SIZE training examples for validation.
  validation_images = train_images[:VALIDATION_SIZE]
  validation_labels = train_labels[:VALIDATION_SIZE]
  train_images = train_images[VALIDATION_SIZE:]
  train_labels = train_labels[VALIDATION_SIZE:]
  data_sets.train = DataSet(train_images, train_labels)
  data_sets.validation = DataSet(validation_images, validation_labels)
  data_sets.test = DataSet(test_images, test_labels)
  return data_sets
================================================
FILE: SOURCE/tutorials/mnist/mnist.py
================================================
"""Builds the MNIST network.
Implements the inference/loss/training pattern for model building.
1. inference() - Builds the model as far as is required for running the network
forward to make predictions.
2. loss() - Adds to the inference model the layers required to generate loss.
3. training() - Adds to the loss model the Ops required to generate and
apply gradients.
This file is used by the various "fully_connected_*.py" files and not meant to
be run.
TensorFlow install instructions:
https://tensorflow.org/get_started/os_setup.html
MNIST tutorial:
https://tensorflow.org/tutorials/mnist/tf/index.html
"""
import math
import tensorflow.python.platform
import tensorflow as tf
# The MNIST dataset has 10 classes, representing the digits 0 through 9.
NUM_CLASSES = 10
# The MNIST images are always 28x28 pixels.
IMAGE_SIZE = 28
IMAGE_PIXELS = IMAGE_SIZE * IMAGE_SIZE
def inference(images, hidden1_units, hidden2_units):
  """Build the MNIST model up to where it may be used for inference.

  Args:
    images: Images placeholder, from inputs().
    hidden1_units: Size of the first hidden layer.
    hidden2_units: Size of the second hidden layer.

  Returns:
    softmax_linear: Output tensor with the computed logits.
  """
  # Hidden 1
  with tf.name_scope('hidden1') as scope:
    weights = tf.Variable(
        tf.truncated_normal([IMAGE_PIXELS, hidden1_units],
                            stddev=1.0 / math.sqrt(float(IMAGE_PIXELS))),
        name='weights')
    biases = tf.Variable(tf.zeros([hidden1_units]),
                         name='biases')
    hidden1 = tf.nn.relu(tf.matmul(images, weights) + biases)
  # Hidden 2
  with tf.name_scope('hidden2') as scope:
    weights = tf.Variable(
        tf.truncated_normal([hidden1_units, hidden2_units],
                            stddev=1.0 / math.sqrt(float(hidden1_units))),
        name='weights')
    biases = tf.Variable(tf.zeros([hidden2_units]),
                         name='biases')
    hidden2 = tf.nn.relu(tf.matmul(hidden1, weights) + biases)
  # Linear
  with tf.name_scope('softmax_linear') as scope:
    weights = tf.Variable(
        tf.truncated_normal([hidden2_units, NUM_CLASSES],
                            stddev=1.0 / math.sqrt(float(hidden2_units))),
        name='weights')
    biases = tf.Variable(tf.zeros([NUM_CLASSES]),
                         name='biases')
    logits = tf.matmul(hidden2, weights) + biases
  return logits
def loss(logits, labels):
  """Calculates the loss from the logits and the labels.

  Args:
    logits: Logits tensor, float - [batch_size, NUM_CLASSES].
    labels: Labels tensor, int32 - [batch_size].

  Returns:
    loss: Loss tensor of type float.
  """
  # Convert from sparse integer labels in the range [0, NUM_CLASSES)
  # to 1-hot dense float vectors (that is we will have batch_size vectors,
  # each with NUM_CLASSES values, all of which are 0.0 except there will
  # be a 1.0 in the entry corresponding to the label).
  batch_size = tf.size(labels)
  labels = tf.expand_dims(labels, 1)
  indices = tf.expand_dims(tf.range(0, batch_size, 1), 1)
  concated = tf.concat(1, [indices, labels])
  onehot_labels = tf.sparse_to_dense(
      concated, tf.pack([batch_size, NUM_CLASSES]), 1.0, 0.0)
  cross_entropy = tf.nn.softmax_cross_entropy_with_logits(logits,
                                                          onehot_labels,
                                                          name='xentropy')
  loss = tf.reduce_mean(cross_entropy, name='xentropy_mean')
  return loss
def training(loss, learning_rate):
  """Sets up the training Ops.

  Creates a summarizer to track the loss over time in TensorBoard.

  Creates an optimizer and applies the gradients to all trainable variables.

  The Op returned by this function is what must be passed to the
  `sess.run()` call to cause the model to train.

  Args:
    loss: Loss tensor, from loss().
    learning_rate: The learning rate to use for gradient descent.

  Returns:
    train_op: The Op for training.
  """
  # Add a scalar summary for the snapshot loss.
  tf.scalar_summary(loss.op.name, loss)
  # Create the gradient descent optimizer with the given learning rate.
  optimizer = tf.train.GradientDescentOptimizer(learning_rate)
  # Create a variable to track the global step.
  global_step = tf.Variable(0, name='global_step', trainable=False)
  # Use the optimizer to apply the gradients that minimize the loss
  # (and also increment the global step counter) as a single training step.
  train_op = optimizer.minimize(loss, global_step=global_step)
  return train_op
def evaluation(logits, labels):
  """Evaluate the quality of the logits at predicting the label.

  Args:
    logits: Logits tensor, float - [batch_size, NUM_CLASSES].
    labels: Labels tensor, int32 - [batch_size], with values in the
      range [0, NUM_CLASSES).

  Returns:
    A scalar int32 tensor with the number of examples (out of batch_size)
    that were predicted correctly.
  """
  # For a classifier model, we can use the in_top_k Op.
  # It returns a bool tensor with shape [batch_size] that is true for
  # the examples where the label was in the top k (here k=1)
  # of all logits for that example.
  correct = tf.nn.in_top_k(logits, labels, 1)
  # Return the number of true entries.
  return tf.reduce_sum(tf.cast(correct, tf.int32))
================================================
FILE: SOURCE/tutorials/mnist/mnist_softmax.py
================================================
"""A very simple MNIST classifier.
See extensive documentation at ??????? (insert public URL)
"""
from __future__ import print_function
# Import data
import input_data
mnist = input_data.read_data_sets("/tmp/data/", one_hot=True)
import tensorflow as tf
sess = tf.InteractiveSession()
# Create the model
x = tf.placeholder("float", [None, 784])
W = tf.Variable(tf.zeros([784,10]))
b = tf.Variable(tf.zeros([10]))
#y = tf.nn.softmax(tf.matmul(x,W) + b) # this would lead to an error because of log(0)
y = tf.nn.log_softmax(tf.matmul(x,W) + b)
# Define loss and optimizer
y_ = tf.placeholder("float", [None,10])
#cross_entropy = -tf.reduce_sum(y_*tf.log(y))
cross_entropy = -tf.reduce_sum(y_*y)
train_step = tf.train.GradientDescentOptimizer(0.01).minimize(cross_entropy)
# Train
tf.initialize_all_variables().run()
for i in range(1000):
  batch_xs, batch_ys = mnist.train.next_batch(100)
  train_step.run({x: batch_xs, y_: batch_ys})
# Test trained model
correct_prediction = tf.equal(tf.argmax(y,1), tf.argmax(y_,1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
print(accuracy.eval({x: mnist.test.images, y_: mnist.test.labels}))
================================================
FILE: SOURCE/tutorials/mnist/pros/index.md
================================================
# Deep MNIST for Experts
TensorFlow is a powerful library for doing large-scale numerical computation.
One of the tasks at which it excels is implementing and training deep neural
networks.
In this tutorial we will learn the basic building blocks of a TensorFlow model
while constructing a deep convolutional MNIST classifier.
*This introduction assumes familiarity with neural networks and the MNIST
dataset. If you don't have
a background with them, check out the
[introduction for beginners](../../../tutorials/mnist/beginners/index.md).*
## Setup
Before we create our model, we will first load the MNIST dataset, and start a
TensorFlow session.
### Load MNIST Data
For your convenience, we've included
[a script](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/g3doc/tutorials/mnist/input_data.py)
which automatically downloads and imports the MNIST dataset. It will create a
directory `'MNIST_data'` in which to store the data files.
```python
import input_data
mnist = input_data.read_data_sets('MNIST_data', one_hot=True)
```
Here `mnist` is a lightweight class which stores the training, validation, and
testing sets as NumPy arrays.
It also provides a function for iterating through data minibatches, which we
will use below.
### Start TensorFlow InteractiveSession
TensorFlow relies on a highly efficient C++ backend to do its computation. The
connection to this backend is called a session. The common usage for TensorFlow
programs is to first create a graph and then launch it in a session.
Here we instead use the convenience `InteractiveSession` class, which
makes TensorFlow more flexible about how you
structure your code.
It allows you to interleave operations which build a
[computation graph](../../../get_started/basic_usage.md#the-computation-graph)
with ones that run the graph.
This is particularly convenient when working in interactive contexts like
IPython.
If you are not using an `InteractiveSession`, then you should build
the entire computation graph before starting a session and [launching the
graph](../../../get_started/basic_usage.md#launching-the-graph-in-a-session).
```python
import tensorflow as tf
sess = tf.InteractiveSession()
```
#### Computation Graph
To do efficient numerical computing in Python, we typically use libraries like
NumPy that do expensive operations such as matrix multiplication outside Python,
using highly efficient code implemented in another language.
Unfortunately, there can still be a lot of overhead from switching back to
Python after every operation. This overhead is especially bad if you want to run
computations on GPUs or in a distributed manner, where there can be a high cost
to transferring data.
TensorFlow also does its heavy lifting outside Python,
but it takes things a step further to avoid this overhead.
Instead of running a single expensive operation independently
from Python, TensorFlow lets us describe a graph of interacting operations that
run entirely outside Python.
This approach is similar to that used in Theano or Torch.
The role of the Python code is therefore to build this external computation
graph, and to dictate which parts of the computation graph should be run. See
the
[Computation Graph](../../../get_started/basic_usage.md#the-computation-graph)
section of
[Basic Usage](../../../get_started/basic_usage.md)
for more detail.
## Build a Softmax Regression Model
In this section we will build a softmax regression model with a single linear
layer. In the next section, we will extend this to the case of softmax
regression with a multilayer convolutional network.
### Placeholders
We start building the computation graph by creating nodes for the
input images and target output classes.
```python
x = tf.placeholder("float", shape=[None, 784])
y_ = tf.placeholder("float", shape=[None, 10])
```
Here `x` and `y_` aren't specific values. Rather, they are each a `placeholder`
-- a value that we'll input when we ask TensorFlow to run a computation.
The input images `x` will consist of a 2d tensor of floating point numbers.
Here we assign it a `shape` of `[None, 784]`, where `784` is the dimensionality of
a single flattened MNIST image, and `None` indicates that the first dimension,
corresponding to the batch size, can be of any size.
The target output classes `y_` will also consist of a 2d tensor,
where each row is a one-hot 10-dimensional vector indicating
which digit class the corresponding MNIST image belongs to.
The `shape` argument to `placeholder` is optional, but it allows TensorFlow
to automatically catch bugs stemming from inconsistent tensor shapes.
### Variables
We now define the weights `W` and biases `b` for our model. We could imagine treating
these like additional inputs, but TensorFlow has an even better way to handle
them: `Variable`.
A `Variable` is a value that lives in TensorFlow's computation graph.
It can be used and even modified by the computation. In machine
learning applications, one generally has the model parameters be `Variable`s.
```python
W = tf.Variable(tf.zeros([784,10]))
b = tf.Variable(tf.zeros([10]))
```
We pass the initial value for each parameter in the call to `tf.Variable`.
In this case, we initialize both `W` and `b` as tensors full of
zeros. `W` is a 784x10 matrix (because we have 784 input features
and 10 outputs) and `b` is a 10-dimensional vector (because we have 10 classes).
Before `Variable`s can be used within a session, they must be initialized using
that session.
This step takes the initial values (in this case tensors full of zeros) that
have already been specified, and assigns them to each `Variable`. This can be
done for all `Variable`s at once.
```python
sess.run(tf.initialize_all_variables())
```
### Predicted Class and Cost Function
We can now implement our regression model. It only takes one line!
We multiply the vectorized input images `x` by the weight matrix `W`, add
the bias `b`, and compute the softmax probabilities that are assigned to each
class.
```python
y = tf.nn.softmax(tf.matmul(x,W) + b)
```
The cost function to be minimized during training can be specified just as
easily. Our cost function will be the cross-entropy between the target and the
model's prediction.
```python
cross_entropy = -tf.reduce_sum(y_*tf.log(y))
```
Note that `tf.reduce_sum` sums across all images in the minibatch, as well as
all classes. We are computing the cross entropy for the entire minibatch.
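To make the formula concrete, here is a minimal pure-Python sketch (no TensorFlow) of softmax followed by the summed cross-entropy for a tiny minibatch. The helper names `softmax` and `cross_entropy_sum` and the input values are illustrative assumptions, not part of any library. Subtracting the row maximum before exponentiating is a common numerical-stability trick that the one-line model above does not include.

```python
import math

def softmax(logits):
    """Softmax of one row of logits, shifted by the row max for stability."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy_sum(targets, probs):
    """Sum of -t * log(p) over every image and every class in the minibatch."""
    return -sum(t * math.log(p)
                for t_row, p_row in zip(targets, probs)
                for t, p in zip(t_row, p_row)
                if t > 0)  # terms with t == 0 contribute nothing; skipping them avoids log(0)

# Two examples, three classes (hypothetical values).
logits = [[2.0, 1.0, 0.1], [0.0, 0.0, 0.0]]
targets = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
probs = [softmax(row) for row in logits]
loss = cross_entropy_sum(targets, probs)
```

With one-hot targets, only the probability assigned to the true class of each example enters the sum, so the loss shrinks as those probabilities approach 1.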
## Train the Model
Now that we have defined our model and training cost function, it is
straightforward to train using TensorFlow.
Because TensorFlow knows the entire computation graph, it
can use automatic differentiation to find the gradients of the cost with
respect to each of the variables.
TensorFlow has a variety of
[built-in optimization algorithms](../../../api_docs/python/train.md#optimizers).
For this example, we will use steepest gradient descent, with a step length of
0.01, to descend the cross entropy.
```python
train_step = tf.train.GradientDescentOptimizer(0.01).minimize(cross_entropy)
```
What TensorFlow actually did in that single line was to add new operations to
the computation graph. These operations included ones to compute gradients,
compute parameter update steps, and apply update steps to the parameters.
The returned operation `train_step`, when run, will apply the gradient
descent updates to the parameters. Training the model can therefore be
accomplished by repeatedly running `train_step`.
```python
for i in range(1000):
  batch = mnist.train.next_batch(50)
  train_step.run(feed_dict={x: batch[0], y_: batch[1]})
```
In each training iteration we load 50 training examples. We then run the
`train_step` operation, using `feed_dict` to replace the `placeholder` tensors
`x` and `y_` with the training examples.
Note that you can replace any tensor in your computation graph using `feed_dict`
-- it's not restricted to just `placeholder`s.
### Evaluate the Model
How well did our model do?
First we'll figure out where we predicted the correct label. `tf.argmax`
is an extremely useful function which gives you the index of the highest entry
in a tensor along some axis. For example, `tf.argmax(y,1)` is the label our
model thinks is most likely for each input, while `tf.argmax(y_,1)` is the
true label. We can use `tf.equal` to check if our prediction matches the
truth.
```python
correct_prediction = tf.equal(tf.argmax(y,1), tf.argmax(y_,1))
```
That gives us a list of booleans. To determine what fraction are correct, we
cast to floating point numbers and then take the mean. For example,
`[True, False, True, True]` would become `[1,0,1,1]` which would become `0.75`.
```python
accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
```
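The same arithmetic can be checked without TensorFlow. The sketch below is a plain-Python stand-in for the `argmax`, `equal`, `cast` and `reduce_mean` steps; the function name `accuracy` and the sample rows are made up for illustration.

```python
def accuracy(predicted_rows, target_rows):
    """Fraction of rows whose arg-max index matches between the two lists."""
    def argmax(row):
        return max(range(len(row)), key=lambda i: row[i])
    # List of booleans, one per example.
    correct = [argmax(p) == argmax(t)
               for p, t in zip(predicted_rows, target_rows)]
    # Cast booleans to floats and take the mean.
    return sum(1.0 if c else 0.0 for c in correct) / len(correct)

# Hypothetical predictions and one-hot labels for four two-class examples.
preds = [[0.1, 0.9], [0.8, 0.2], [0.3, 0.7], [0.6, 0.4]]
truth = [[0, 1], [0, 1], [0, 1], [1, 0]]
result = accuracy(preds, truth)  # matches are [True, False, True, True]
```

Here `result` is `0.75`, the same value as in the `[True, False, True, True]` example above.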
Finally, we can evaluate our accuracy on the test data. This should be about
91% correct.
```python
print(accuracy.eval(feed_dict={x: mnist.test.images, y_: mnist.test.labels}))
```
## Build a Multilayer Convolutional Network
Getting 91% accuracy on MNIST is bad. It's almost embarrassingly bad. In this
section, we'll fix that, jumping from a very simple model to something moderately
sophisticated: a small convolutional neural network. This will get us to around
99.2% accuracy -- not state of the art, but respectable.
### Weight Initialization
To create this model, we're going to need to create a lot of weights and biases.
One should generally initialize weights with a small amount of noise for
symmetry breaking, and to prevent 0 gradients. Since we're using ReLU neurons,
it is also good practice to initialize them with a slightly positive initial
bias to avoid "dead neurons." Instead of doing this repeatedly while we build
the model, let's create two handy functions to do it for us.
```python
def weight_variable(shape):
  initial = tf.truncated_normal(shape, stddev=0.1)
  return tf.Variable(initial)

def bias_variable(shape):
  initial = tf.constant(0.1, shape=shape)
  return tf.Variable(initial)
```
### Convolution and Pooling
TensorFlow also gives us a lot of flexibility in convolution and pooling
operations. How do we handle the boundaries? What is our stride size?
In this example, we're always going to choose the vanilla version.
Our convolutions use a stride of one and are zero padded so that the
output is the same size as the input. Our pooling is plain old max pooling
over 2x2 blocks. To keep our code cleaner, let's also abstract those operations
into functions.
```python
def conv2d(x, W):
  return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')

def max_pool_2x2(x):
  return tf.nn.max_pool(x, ksize=[1, 2, 2, 1],
                        strides=[1, 2, 2, 1], padding='SAME')
```
### First Convolutional Layer
We can now implement our first layer. It will consist of convolution, followed
by max pooling. The convolution will compute 32 features for each 5x5 patch.
Its weight tensor will have a shape of `[5, 5, 1, 32]`. The first two
dimensions are the patch size, the next is the number of input channels, and
the last is the number of output channels. We will also have a bias vector with
a component for each output channel.
```python
W_conv1 = weight_variable([5, 5, 1, 32])
b_conv1 = bias_variable([32])
```
To apply the layer, we first reshape `x` to a 4d tensor, with the second and
third dimensions corresponding to image width and height, and the final
dimension corresponding to the number of color channels.
```python
x_image = tf.reshape(x, [-1,28,28,1])
```
We then convolve `x_image` with the weight tensor, add the
bias, apply the ReLU function, and finally max pool.
```python
h_conv1 = tf.nn.relu(conv2d(x_image, W_conv1) + b_conv1)
h_pool1 = max_pool_2x2(h_conv1)
```
### Second Convolutional Layer
In order to build a deep network, we stack several layers of this type. The
second layer will have 64 features for each 5x5 patch.
```python
W_conv2 = weight_variable([5, 5, 32, 64])
b_conv2 = bias_variable([64])
h_conv2 = tf.nn.relu(conv2d(h_pool1, W_conv2) + b_conv2)
h_pool2 = max_pool_2x2(h_conv2)
```
### Densely Connected Layer
Now that the image size has been reduced to 7x7, we add a fully-connected layer
with 1024 neurons to allow processing on the entire image. We reshape the tensor
from the pooling layer into a batch of vectors,
multiply by a weight matrix, add a bias, and apply a ReLU.
```python
W_fc1 = weight_variable([7 * 7 * 64, 1024])
b_fc1 = bias_variable([1024])
h_pool2_flat = tf.reshape(h_pool2, [-1, 7*7*64])
h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)
```
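The 7x7 figure and the `7 * 7 * 64` input width follow from the layer settings chosen earlier. As a minimal sketch, assuming the usual rule that `'SAME'` padding gives an output size of `ceil(input / stride)`, the spatial size can be traced in plain Python. The helper name `same_output_size` is illustrative only.

```python
import math

def same_output_size(size, stride):
    """Spatial output size under 'SAME' padding: ceil(size / stride)."""
    return int(math.ceil(float(size) / stride))

size = 28                                # MNIST images are 28x28
trace = [size]
for _ in range(2):                       # two convolution + pooling stages
    size = same_output_size(size, 1)     # stride-1 convolution keeps the size
    size = same_output_size(size, 2)     # 2x2 max pool with stride 2 halves it
    trace.append(size)

flat_features = size * size * 64         # 64 output channels after the second stage
```

The trace is `[28, 14, 7]`, and `flat_features` is 3136, which is the first dimension of `W_fc1`.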
#### Dropout
To reduce overfitting, we will apply dropout before the readout layer.
We create a `placeholder` for the probability that a neuron's output is kept
during dropout. This allows us to turn dropout on during training, and turn it
off during testing.
TensorFlow's `tf.nn.dropout` op automatically handles scaling neuron outputs in
addition to masking them, so dropout just works without any additional scaling.
```python
keep_prob = tf.placeholder("float")
h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)
```
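The scaling mentioned above can be illustrated without TensorFlow. This is a rough sketch of the idea, assuming each output is kept with probability `keep_prob` and scaled by `1 / keep_prob`, and zeroed otherwise; the function name `dropout` and the sample values are hypothetical, and this is not the library's implementation.

```python
import random

def dropout(values, keep_prob, rng):
    """Keep each value with probability keep_prob, scaled by 1 / keep_prob; zero the rest."""
    return [v / keep_prob if rng.random() < keep_prob else 0.0 for v in values]

rng = random.Random(0)                     # fixed seed so the sketch is repeatable
activations = [1.0] * 10000
dropped = dropout(activations, 0.5, rng)
mean_after = sum(dropped) / len(dropped)   # stays close to the original mean of 1.0
unchanged = dropout([3.0, 4.0], 1.0, rng)  # keep_prob = 1.0 leaves the input as is
```

Because surviving outputs are scaled up, the expected activation is the same with and without dropout, which is why feeding `keep_prob: 1.0` at test time needs no further correction.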
### Readout Layer
Finally, we add a softmax layer, just like for the one layer softmax regression
above.
```python
W_fc2 = weight_variable([1024, 10])
b_fc2 = bias_variable([10])
y_conv = tf.nn.softmax(tf.matmul(h_fc1_drop, W_fc2) + b_fc2)
```
### Train and Evaluate the Model
How well does this model do?
To train and evaluate it we will use code that is nearly identical to that for
the simple one-layer softmax network above.
The differences are that: we will replace the steepest gradient descent
optimizer with the more sophisticated ADAM optimizer; we will include the
additional parameter `keep_prob` in `feed_dict` to control the dropout rate;
and we will add logging to every 100th iteration in the training process.
```python
cross_entropy = -tf.reduce_sum(y_*tf.log(y_conv))
train_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)
correct_prediction = tf.equal(tf.argmax(y_conv,1), tf.argmax(y_,1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
sess.run(tf.initialize_all_variables())
for i in range(20000):
  batch = mnist.train.next_batch(50)
  if i % 100 == 0:
    # Report accuracy on the current batch with dropout disabled.
    train_accuracy = accuracy.eval(feed_dict={
        x: batch[0], y_: batch[1], keep_prob: 1.0})
    print("step %d, training accuracy %g" % (i, train_accuracy))
  train_step.run(feed_dict={x: batch[0], y_: batch[1], keep_prob: 0.5})

print("test accuracy %g" % accuracy.eval(feed_dict={
    x: mnist.test.images, y_: mnist.test.labels, keep_prob: 1.0}))
```
The final test set accuracy after running this code should be approximately 99.2%.
We have learned how to quickly and easily build, train, and evaluate a
fairly sophisticated deep learning model using TensorFlow.
================================================
FILE: SOURCE/tutorials/mnist/tf/index.md
================================================
# TensorFlow Mechanics 101
Code: [tensorflow/g3doc/tutorials/mnist/](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/g3doc/tutorials/mnist/)
The goal of this tutorial is to show how to use TensorFlow to train and
evaluate a simple feed-forward neural network for handwritten digit
classification using the (classic) MNIST data set. The intended audience for
this tutorial is experienced machine learning users interested in using
TensorFlow.
These tutorials are not intended for teaching Machine Learning in general.
Please ensure you have followed the instructions to [install TensorFlow](../../../get_started/os_setup.md).
## Tutorial Files
This tutorial references the following files:
File | Purpose
--- | ---
[`mnist.py`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/g3doc/tutorials/mnist/mnist.py) | The code to build a fully-connected MNIST model.
[`fully_connected_feed.py`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/g3doc/tutorials/mnist/fully_connected_feed.py) | The main code to train the built MNIST model against the downloaded dataset using a feed dictionary.
Simply run the `fully_connected_feed.py` file directly to start training:
`python fully_connected_feed.py`
## Prepare the Data
MNIST is a classic problem in machine learning. The problem is to look at
greyscale 28x28 pixel images of handwritten digits and determine which digit
the image represents, for all the digits from zero to nine.

For more information, refer to [Yann LeCun's MNIST page](http://yann.lecun.com/exdb/mnist/)
or [Chris Olah's visualizations of MNIST](http://colah.github.io/posts/2014-10-Visualizing-MNIST/).
### Download
At the top of the `run_training()` function, the `input_data.read_data_sets()`
function will ensure that the correct data has been downloaded to your local
training folder and then unpack that data to return an object whose attributes
are `DataSet` instances.
```python
data_sets = input_data.read_data_sets(FLAGS.train_dir, FLAGS.fake_data)
```
**NOTE**: The `fake_data` flag is used for unit-testing purposes and may be
safely ignored by the reader.
Dataset | Purpose
--- | ---
`data_sets.train` | 55000 images and labels, for primary training.
`data_sets.validation` | 5000 images and labels, for iterative validation of training accuracy.
`data_sets.test` | 10000 images and labels, for final testing of trained accuracy.
For more information about the data, please read the [Download](../../../tutorials/mnist/download/index.md)
tutorial.
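As a rough illustration of what "unpacking" involves, the image files begin with a header of four big-endian 32-bit unsigned integers: a magic number (2051 for image files), the image count, and the row and column counts. The sketch below decodes such a header with the standard `struct` module. The function name `parse_idx_image_header` is hypothetical, and the header bytes are synthesized here rather than read from a real download.

```python
import struct

def parse_idx_image_header(header_bytes):
    """Decode the 16-byte header of an MNIST image file (four big-endian uint32)."""
    magic, num_images, rows, cols = struct.unpack('>IIII', header_bytes[:16])
    if magic != 2051:
        raise ValueError('Invalid magic number %d in MNIST image file' % magic)
    return num_images, rows, cols

# A synthetic header for illustration, not bytes read from a real file.
fake_header = struct.pack('>IIII', 2051, 60000, 28, 28)
num_images, rows, cols = parse_idx_image_header(fake_header)
```

The 60000 images of the training file are what get split into the 55000 training and 5000 validation examples listed in the table above.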
### Inputs and Placeholders
The `placeholder_inputs()` function creates two [`tf.placeholder`](../../../api_docs/python/io_ops.md#placeholder)
ops that define the shape of the inputs, including the `batch_size`, to the
rest of the graph and into which the actual training examples will be fed.
```python
images_placeholder = tf.placeholder(tf.float32, shape=(batch_size,
                                                       IMAGE_PIXELS))
labels_placeholder = tf.placeholder(tf.int32, shape=(batch_size))
```
Further down, in the training loop, the full image and label datasets are
sliced to fit the `batch_size` for each step, matched with these placeholder
ops, and then passed into the `sess.run()` function using the `feed_dict`
parameter.
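The slicing step can be sketched in plain Python. The class below is a simplified, hypothetical stand-in modeled on `DataSet.next_batch()` from `input_data.py`: it hands out consecutive slices and, when too few examples remain, counts an epoch, reshuffles, and starts again from the beginning. The name `BatchIterator` and the seed parameter are assumptions made for this illustration.

```python
import random

class BatchIterator(object):
    """Hypothetical helper: yields consecutive slices, reshuffling each epoch."""

    def __init__(self, examples, labels, seed=0):
        assert len(examples) == len(labels)
        self._examples = list(examples)
        self._labels = list(labels)
        self._index = 0
        self.epochs_completed = 0
        self._rng = random.Random(seed)

    def next_batch(self, batch_size):
        assert batch_size <= len(self._examples)
        start = self._index
        self._index += batch_size
        if self._index > len(self._examples):
            # Not enough examples left: finish the epoch, reshuffle, restart.
            self.epochs_completed += 1
            order = list(range(len(self._examples)))
            self._rng.shuffle(order)
            self._examples = [self._examples[i] for i in order]
            self._labels = [self._labels[i] for i in order]
            start = 0
            self._index = batch_size
        end = self._index
        return self._examples[start:end], self._labels[start:end]

batches = BatchIterator(range(10), range(10))
first = batches.next_batch(4)    # examples 0..3
second = batches.next_batch(4)   # examples 4..7
third = batches.next_batch(4)    # only 2 left, so a new shuffled epoch begins
```

Note that the leftover examples at the end of an epoch are skipped rather than padded, so every batch has exactly `batch_size` rows, which is what a placeholder with a fixed `batch_size` requires.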
## Build the Graph
After creating placeholders for the data, the graph is built from the
`mnist.py` file according to a 3-stage pattern: `inference()`, `loss()`, and
`training()`.
1. `inference()` - Builds the graph as far as is required for running
the network forward to make predictions.
1. `loss()` - Adds to the inference graph the ops required to generate
loss.
1. `training()` - Adds to the loss graph the ops required to compute
and apply gradients.
### Inference
The `inference()` function builds the graph as far as needed to
return the tensor that would contain the output predictions.
It takes the images placeholder as input and builds on top
of it a pair of fully connected layers with ReLU activation followed by a ten
node linear layer specifying the output logits.
Each layer is created beneath a unique [`tf.name_scope`](../../../api_docs/python/framework.md#name_scope)
that acts as a prefix to the items created within that scope.
```python
with tf.name_scope('hidden1') as scope:
```
Within the defined scope, the weights and biases to be used by each of these
layers are generated into [`tf.Variable`](../../../api_docs/python/state_ops.md#Variable)
instances, with their desired shapes:
```python
weights = tf.Variable(
    tf.truncated_normal([IMAGE_PIXELS, hidden1_units],
                        stddev=1.0 / math.sqrt(float(IMAGE_PIXELS))),
    name='weights')
biases = tf.Variable(tf.zeros([hidden1_units]),
                     name='biases')
```
When, for instance, these are created under the `hidden1` scope, the unique
name given to the weights variable would be "`hidden1/weights`".
Each variable is given initializer ops as part of its construction.
In this most common case, the weights are initialized with the
[`tf.truncated_normal`](../../../api_docs/python/constant_op.md#truncated_normal)
and given their shape of a 2-D tensor with
the first dim representing the number of units in the layer from which the
weights connect and the second dim representing the number of
units in the layer to which the weights connect. For the first layer, named
`hidden1`, the dimensions are `[IMAGE_PIXELS, hidden1_units]` because the
weights are connecting the image inputs to the hidden1 layer. The
`tf.truncated_normal` initializer draws random values from a truncated normal
distribution with a given mean and standard deviation.
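To give a feel for this kind of initializer, here is a small standard-library sketch. It assumes, as the op's documentation describes, that values more than two standard deviations from the mean are discarded and re-drawn. The function name `truncated_normal` below is a local illustration rather than the TensorFlow op, and the sample count of 128 is an arbitrary choice.

```python
import random

def truncated_normal(count, mean, stddev, rng):
    """Draw normal samples, re-drawing any that fall more than 2 stddev from the mean."""
    samples = []
    while len(samples) < count:
        value = rng.gauss(mean, stddev)
        if abs(value - mean) <= 2.0 * stddev:
            samples.append(value)
    return samples

rng = random.Random(0)           # fixed seed so the sketch is repeatable
fan_in = 784                     # IMAGE_PIXELS for the first hidden layer
stddev = 1.0 / (fan_in ** 0.5)   # same scaling as 1.0 / math.sqrt(float(IMAGE_PIXELS))
weights_row = truncated_normal(128, 0.0, stddev, rng)
```

Scaling the standard deviation by the inverse square root of the number of inputs keeps the initial pre-activation values at a moderate magnitude regardless of layer width.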
Then the biases are initialized with [`tf.zeros`](../../../api_docs/python/constant_op.md#zeros)
to ensure they start with all zero values, and their shape is simply the number
of units in the layer to which they connect.
The graph's three primary ops -- two [`tf.nn.relu`](../../../api_docs/python/nn.md#relu)
ops wrapping [`tf.matmul`](../../../api_docs/python/math_ops.md#matmul)
for the hidden layers and one extra `tf.matmul` for the logits -- are then
created, each in turn, with their `tf.Variable` instances connected to the
input placeholder or the output tensor of the layer beneath each.
```python
hidden1 = tf.nn.relu(tf.matmul(images, weights) + biases)
```
```python
hidden2 = tf.nn.relu(tf.matmul(hidden1, weights) + biases)
```
```python
logits = tf.matmul(hidden2, weights) + biases
```
Finally, the `logits` tensor that will contain the output is returned.
### Loss
The `loss()` function further builds the graph by adding the required loss
ops.
First, the values from the `labels_placeholder` are encoded as a tensor of 1-hot
values. For example, if the class identifier is '3' the value is converted to:
`[0, 0, 0, 1, 0, 0, 0, 0, 0, 0]`
```python
batch_size = tf.size(labels)
labels = tf.expand_dims(labels, 1)
indices = tf.expand_dims(tf.range(0, batch_size, 1), 1)
concated = tf.concat(1, [indices, labels])
onehot_labels = tf.sparse_to_dense(
    concated, tf.pack([batch_size, NUM_CLASSES]), 1.0, 0.0)
```
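The graph code above builds `[row, label]` index pairs and then scatters a 1.0 into an otherwise zero matrix at each pair. A plain-Python sketch of the same idea follows; the helper name `scatter_pairs` is hypothetical and only mimics the role that `tf.sparse_to_dense` plays here.

```python
def scatter_pairs(index_pairs, shape, on_value, off_value):
    """Build a dense rows x cols matrix of off_value, setting on_value at each (row, col)."""
    rows, cols = shape
    dense = [[off_value] * cols for _ in range(rows)]
    for row, col in index_pairs:
        dense[row][col] = on_value
    return dense

labels = [3, 0]                                                # two example class ids
index_pairs = [[i, label] for i, label in enumerate(labels)]   # plays the role of `concated`
onehot = scatter_pairs(index_pairs, (len(labels), 10), 1.0, 0.0)
```

The first row of `onehot` has its single 1.0 at index 3, matching the `[0, 0, 0, 1, 0, 0, 0, 0, 0, 0]` example given above.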
A [`tf.nn.softmax_cross_entropy_with_logits`](../../../api_docs/python/nn.md#softmax_cross_entropy_with_logits)
op is then added to compare the output logits from the `inference()` function
and the 1-hot labels.
```python
cross_entropy = tf.nn.softmax_cross_entropy_with_logits(logits,
                                                        onehot_labels,
                                                        name='xentropy')
```
It then uses [`tf.reduce_mean`](../../../api_docs/python/math_ops.md#reduce_mean)
to average the cross entropy values across the batch dimension (the first
dimension) as the total loss.
```python
loss = tf.reduce_mean(cross_entropy, name='xentropy_mean')
```
And the tensor that will then contain the loss value is returned.
> Note: Cross-entropy is an idea from information theory that allows us
> to describe how bad it is to believe the predictions of the neural network,
> given what is actually true. For more information, read the blog post Visual
> Information Theory (http://colah.github.io/posts/2015-09-Visual-Information/)
### Training
The `training()` function adds the operations needed to minimize the loss via
gradient descent.
Firstly, it takes the loss tensor from the `loss()` function and hands it to a
[`tf.scalar_summary`](../../../api_docs/python/train.md#scalar_summary),
an op for generating summary values into the events file when used with a
`SummaryWriter` (see below). In this case, it will emit the snapshot value of
the loss every time the summaries are written out.
```python
tf.scalar_summary(loss.op.name, loss)
```
Next, we instantiate a [`tf.train.GradientDescentOptimizer`](../../../api_docs/python/train.md#GradientDescentOptimizer)
responsible for applying gradients with the requested learning rate.
```python
optimizer = tf.train.GradientDescentOptimizer(FLAGS.learning_rate)
```
We then generate a single variable to contain a counter for the global
training step and the [`minimize()`](../../../api_docs/python/train.md#Optimizer.minimize)
op is used to both update the trainable weights in the system and increment the
global step. This is, by convention, known as the `train_op` and is what must
be run by a TensorFlow session in order to induce one full step of training
(see below).
```python
global_step = tf.Variable(0, name='global_step', trainable=False)
train_op = optimizer.minimize(loss, global_step=global_step)
```
The tensor containing the outputs of the training op is returned.
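As a rough illustration of what one run of a `train_op` accomplishes, the following pure-Python sketch applies gradient descent to a toy one-parameter loss and increments a step counter alongside the weight update. The loss `(w - 3)^2` and the function `train_step` are assumptions chosen for illustration; the real optimizer computes gradients for every trainable variable in the graph.

```python
def train_step(w, global_step, learning_rate):
    """One gradient-descent step on the toy loss (w - 3)^2."""
    grad = 2.0 * (w - 3.0)            # derivative of the loss with respect to w
    w = w - learning_rate * grad      # update the "trainable weight"
    return w, global_step + 1         # and increment the step counter

w, global_step = 0.0, 0
for _ in range(100):
    w, global_step = train_step(w, global_step, learning_rate=0.1)
```

After 100 steps `w` is very close to the minimizer 3.0 and `global_step` equals 100, mirroring how the tutorial's counter tracks the number of training steps taken.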
## Train the Model
Once the graph is built, it can be iteratively trained and evaluated in a loop
controlled by the user code in `fully_connected_feed.py`.
### The Graph
At the top of the `run_training()` function is a Python `with` statement that
indicates all of the built ops are to be associated with the default
global [`tf.Graph`](../../../api_docs/python/framework.md#Graph)
instance.
```python
with tf.Graph().as_default():
```
A `tf.Graph` is a collection of ops that may be executed together as a group.
Most TensorFlow uses will only need to rely on the single default graph.
More complicated uses with multiple graphs are possible, but beyond the scope of
this simple tutorial.
### The Session
Once all of the build preparation has been completed and all of the necessary
ops generated, a [`tf.Session`](../../../api_docs/python/client.md#Session)
is created for running the graph.
```python
sess = tf.Session()
```
Alternatively, a `Session` may be created in a `with` block for scoping:
```python
with tf.Session() as sess:
```
The empty parameter to session indicates that this code will attach to
(or create if not yet created) the default local session.
Immediately after creating the session, all of the `tf.Variable`
instances are initialized by calling [`sess.run()`](../../../api_docs/python/client.md#Session.run)
on their initialization op.
```python
init = tf.initialize_all_variables()
sess.run(init)
```
The [`sess.run()`](../../../api_docs/python/client.md#Session.run)
method will run the complete subset of the graph that
corresponds to the op(s) passed as parameters. In this first call, the `init`
op is a [`tf.group`](../../../api_docs/python/control_flow_ops.md#group)
that contains only the initializers for the variables. None of the rest of the
graph is run here; that happens in the training loop below.
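The idea that a run call executes only the part of the graph needed for the requested ops can be sketched with a toy dependency graph in plain Python. The `graph` dictionary and the `run` helper are hypothetical and are not how TensorFlow is implemented; they only illustrate the principle.

```python
# Each node maps to (function, list of input node names). Hypothetical toy graph.
graph = {
    "a": (lambda: 2.0, []),
    "b": (lambda: 3.0, []),
    "sum": (lambda x, y: x + y, ["a", "b"]),
    "unused": (lambda x: x * 100.0, ["b"]),
}

def run(fetch, executed):
    """Evaluate `fetch` and, recursively, only the nodes it depends on."""
    fn, inputs = graph[fetch]
    args = [run(name, executed) for name in inputs]
    executed.append(fetch)  # record which nodes actually ran
    return fn(*args)

executed = []
result = run("sum", executed)
```

Fetching `"sum"` evaluates `"a"`, `"b"`, and `"sum"`; the `"unused"` node is never touched. This sketch does not cache shared inputs, which a real runtime would do.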
### Train Loop
After initializing the variables with the session, training may begin.
The user code controls the training per step, and the simplest loop that
can do useful training is:
```python
for step in xrange(max_steps):
    sess.run(train_op)
```
However, this tutorial is slightly more complicated in that it must also slice
up the input data for each step to match the previously generated placeholders.
#### Feed the Graph
For each step, the code will generate a feed dictionary that will contain the
set of examples on which to train for the step, keyed by the placeholder
ops they represent.
In the `fill_feed_dict()` function, the given `DataSet` is queried for its next
`batch_size` set of images and labels, and tensors matching the placeholders are
filled containing the next images and labels.
```python
images_feed, labels_feed = data_set.next_batch(FLAGS.batch_size)
```
A Python dictionary object is then generated with the placeholders as keys and
the representative feed tensors as values.
```python
feed_dict = {
images_placeholder: images_feed,
labels_placeholder: labels_feed,
}
```
This is passed into the `sess.run()` function's `feed_dict` parameter to provide
the input examples for this step of training.
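The batching and feed-dictionary pattern can be sketched without TensorFlow. `ToyDataSet` below is a hypothetical stand-in for the tutorial's `DataSet` class (it assumes simple wrap-around at the end of the data and no shuffling), and plain strings stand in for the placeholder keys.

```python
class ToyDataSet(object):
    """Hypothetical stand-in for the tutorial's DataSet class."""

    def __init__(self, images, labels):
        self._images = images
        self._labels = labels
        self._index = 0

    def next_batch(self, batch_size):
        """Return the next `batch_size` examples, wrapping around at the end."""
        start = self._index
        if start + batch_size > len(self._images):
            start = 0  # not enough examples left: start a new epoch
        end = start + batch_size
        self._index = end
        return self._images[start:end], self._labels[start:end]

def fill_feed_dict(data_set, images_key, labels_key, batch_size):
    """Build a dict mapping placeholder keys to the next batch of data."""
    images_feed, labels_feed = data_set.next_batch(batch_size)
    return {images_key: images_feed, labels_key: labels_feed}

data = ToyDataSet(images=[[0.1], [0.2], [0.3], [0.4]], labels=[1, 2, 3, 4])
first = fill_feed_dict(data, "images", "labels", batch_size=2)
second = fill_feed_dict(data, "images", "labels", batch_size=2)
third = fill_feed_dict(data, "images", "labels", batch_size=2)
```

Each call consumes the next slice of the data, so successive training steps see different examples; the third call wraps around to the beginning.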
#### Check the Status
The code specifies two values to fetch in its run call: `[train_op, loss]`.
```python
for step in xrange(FLAGS.max_steps):
    feed_dict = fill_feed_dict(data_sets.train,
                               images_placeholder,
                               labels_placeholder)
    _, loss_value = sess.run([train_op, loss],
                             feed_dict=feed_dict)
```
Because there are two values to fetch, `sess.run()` returns a tuple with two
items. Each `Tensor` in the list of values to fetch corresponds to a numpy
array in the returned tuple, filled with the value of that tensor during this
step of training. Since `train_op` is an `Operation` with no output value, the
corresponding element in the returned tuple is `None` and, thus,
discarded. However, the value of the `loss` tensor may become NaN if the model
diverges during training, so we capture this value for logging.
Assuming that the training runs fine without NaNs, the training loop also
prints a simple status text every 100 steps to let the user know the state of
training.
```python
if step % 100 == 0:
    print 'Step %d: loss = %.2f (%.3f sec)' % (step, loss_value, duration)
```
#### Visualize the Status
In order to emit the events files used by [TensorBoard](../../../how_tos/summaries_and_tensorboard/index.md),
all of the summaries (in this case, only one) are collected into a single op
during the graph building phase.
```python
summary_op = tf.merge_all_summaries()
```
And then after the session is created, a [`tf.train.SummaryWriter`](../../../api_docs/python/train.md#SummaryWriter)
may be instantiated to write the events files, which
contain both the graph itself and the values of the summaries.
```python
summary_writer = tf.train.SummaryWriter(FLAGS.train_dir,
graph_def=sess.graph_def)
```
Lastly, the events file will be updated with new summary values every time the
`summary_op` is run and the output passed to the writer's `add_summary()`
function.
```python
summary_str = sess.run(summary_op, feed_dict=feed_dict)
summary_writer.add_summary(summary_str, step)
```
When the events files are written, TensorBoard may be run against the training
folder to display the values from the summaries.

**NOTE**: For more info about how to build and run TensorBoard, please see the accompanying tutorial [TensorBoard: Visualizing Your Training](../../../how_tos/summaries_and_tensorboard/index.md).
#### Save a Checkpoint
In order to emit a checkpoint file that may be used to later restore a model
for further training or evaluation, we instantiate a
[`tf.train.Saver`](../../../api_docs/python/state_ops.md#Saver).
```python
saver = tf.train.Saver()
```
In the training loop, the [`saver.save()`](../../../api_docs/python/state_ops.md#Saver.save)
method will periodically be called to write a checkpoint file to the training
directory with the current values of all the trainable variables.
```python
saver.save(sess, FLAGS.train_dir, global_step=step)
```
At some later point, training might be resumed by using the
[`saver.restore()`](../../../api_docs/python/state_ops.md#Saver.restore)
method to reload the model parameters.
```python
saver.restore(sess, FLAGS.train_dir)
```
## Evaluate the Model
Every thousand steps, the code will attempt to evaluate the model against the
training, validation, and test datasets. The `do_eval()` function is called
three times, once for each dataset.
```python
print 'Training Data Eval:'
do_eval(sess,
eval_correct,
images_placeholder,
labels_placeholder,
data_sets.train)
print 'Validation Data Eval:'
do_eval(sess,
eval_correct,
images_placeholder,
labels_placeholder,
data_sets.validation)
print 'Test Data Eval:'
do_eval(sess,
eval_correct,
images_placeholder,
labels_placeholder,
data_sets.test)
```
> Note that more complicated usage would usually sequester the `data_sets.test`
> to only be checked after significant amounts of hyperparameter tuning. For
> the sake of a simple little MNIST problem, however, we evaluate against all of
> the data.
### Build the Eval Graph
Before opening the default Graph, the test data should have been fetched by
calling the `get_data(train=False)` function with the parameter set to grab
the test dataset.
```python
test_all_images, test_all_labels = get_data(train=False)
```
Before entering the training loop, the Eval op should have been built
by calling the `evaluation()` function from `mnist.py` with the same
logits/labels parameters as the `loss()` function.
```python
eval_correct = mnist.evaluation(logits, labels_placeholder)
```
The `evaluation()` function simply generates a [`tf.nn.in_top_k`](../../../api_docs/python/nn.md#in_top_k)
op that can automatically score each model output as correct if the true label
can be found in the K most-likely predictions. In this case, we set the value
of K to 1 to only consider a prediction correct if it is for the true label.
```python
eval_correct = tf.nn.in_top_k(logits, labels, 1)
```
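For intuition, the top-K check can be written in a few lines of plain Python. This is only a sketch of the idea, not the TensorFlow op: the names `in_top_k` and `precision_at_k` are reused here for illustration, and this version counts a tie with the true label's score as correct.

```python
def in_top_k(logits, label, k):
    """True if `label` is among the k highest-scoring classes.

    A prediction counts as correct when fewer than k classes score
    strictly higher than the true label.
    """
    target = logits[label]
    higher = sum(1 for v in logits if v > target)
    return higher < k

def precision_at_k(batch_logits, batch_labels, k=1):
    """Fraction of examples whose true label is in the top k predictions."""
    true_count = sum(in_top_k(row, label, k)
                     for row, label in zip(batch_logits, batch_labels))
    return float(true_count) / len(batch_labels)

batch_logits = [[0.1, 2.0, 0.3], [1.5, 0.2, 0.9], [0.0, 0.1, 0.2]]
batch_labels = [1, 2, 2]
```

With K set to 1, the second example is scored as wrong because class 0 outranks the true class 2; with K set to 2 it would be scored as correct.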
### Eval Output
One can then create a loop for filling a `feed_dict` and calling `sess.run()`
against the `eval_correct` op to evaluate the model on the given dataset.
```python
for step in xrange(steps_per_epoch):
    feed_dict = fill_feed_dict(data_set,
                               images_placeholder,
                               labels_placeholder)
    true_count += sess.run(eval_correct, feed_dict=feed_dict)
```
The `true_count` variable simply accumulates all of the predictions that the
`in_top_k` op has determined to be correct. From there, the precision may be
calculated by dividing by the total number of examples.
```python
precision = float(true_count) / float(num_examples)
print ' Num examples: %d Num correct: %d Precision @ 1: %0.02f' % (
num_examples, true_count, precision)
```
================================================
FILE: SOURCE/tutorials/mnist_beginners.md
================================================
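# MNIST For ML Beginners
*This tutorial is intended for readers who are new to both machine learning and TensorFlow. If you already know about MNIST and softmax regression, you can read this [faster-paced tutorial](./mnist_pros.md) instead.*
When we learn how to program, the first thing we usually do is print "Hello World". Just as programming has Hello World, machine learning has MNIST.
MNIST is an entry-level computer vision dataset. It consists of images of handwritten digits:
> Original: [Overview](http://tensorflow.org/tutorials) Translation: [@doc001](https://github.com/PFZheng) Proofreading: [@eric_xu](https://github.com/ericxk)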
================================================
FILE: SOURCE/tutorials/pdes/index.md
================================================
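# Partial Differential Equations
***TensorFlow*** isn't only for machine learning; it can also be used for simulation. Here we simulate a few raindrops falling onto a square pond as a way of introducing the basics of using ***TensorFlow*** to simulate the behavior of a partial differential equation.
>Note: This tutorial was originally prepared as an **IPython** notebook.
>>Translator's note: For background on partial differential equations, the translator recommends the course [**MIT OpenCourseWare: Multivariable Calculus**](http://open.163.com/special/opencourse/multivariable.html) on [**NetEase Open Courses**](http://open.163.com/).
## Basic Setup
First, we need to import a few libraries.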
```python
#Import libraries for simulation
import tensorflow as tf
import numpy as np
#Imports for visualization
import PIL.Image
from cStringIO import StringIO
from IPython.display import clear_output, Image, display
```
Next, we need a function that displays the state of the pond's surface as an image.
```python
def DisplayArray(a, fmt='jpeg', rng=[0,1]):
    """Display an array as a picture."""
    a = (a - rng[0])/float(rng[1] - rng[0])*255
    a = np.uint8(np.clip(a, 0, 255))
    f = StringIO()
    PIL.Image.fromarray(a).save(f, fmt)
    display(Image(data=f.getvalue()))
```
Finally, for convenience we open an interactive ***TensorFlow*** session. To make the code easier to reuse later, it could also be put in an executable ***Python*** file.
```python
sess = tf.InteractiveSession()
```
## Defining the Computation Functions
```python
def make_kernel(a):
    """Transform a 2D array into a convolution kernel"""
    a = np.asarray(a)
    a = a.reshape(list(a.shape) + [1,1])
    return tf.constant(a, dtype=tf.float32)

def simple_conv(x, k):
    """A simplified 2D convolution operation"""
    x = tf.expand_dims(tf.expand_dims(x, 0), -1)
    y = tf.nn.depthwise_conv2d(x, k, [1, 1, 1, 1], padding='SAME')
    return y[0, :, :, 0]

def laplace(x):
    """Compute the 2D laplacian of an array"""
    laplace_k = make_kernel([[0.5, 1.0, 0.5],
                             [1.0, -6., 1.0],
                             [0.5, 1.0, 0.5]])
    return simple_conv(x, laplace_k)
```
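To see what the Laplacian kernel above computes, here is a small pure-Python sketch that applies the same 3x3 weights to a grid with zero padding at the borders. It is an illustration only and assumes zero padding matches the `'SAME'` behavior closely enough for intuition; the names `LAPLACE_K`, `flat`, and `spike` are made up for this example.

```python
LAPLACE_K = [[0.5, 1.0, 0.5],
             [1.0, -6.0, 1.0],
             [0.5, 1.0, 0.5]]

def laplace(grid):
    """Apply the 3x3 kernel to every cell, treating out-of-range cells as 0."""
    n_rows, n_cols = len(grid), len(grid[0])
    out = [[0.0] * n_cols for _ in range(n_rows)]
    for r in range(n_rows):
        for c in range(n_cols):
            total = 0.0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < n_rows and 0 <= cc < n_cols:
                        total += LAPLACE_K[dr + 1][dc + 1] * grid[rr][cc]
            out[r][c] = total
    return out

flat = [[1.0] * 5 for _ in range(5)]    # a perfectly flat surface
spike = [[0.0] * 5 for _ in range(5)]   # a single raised point
spike[2][2] = 1.0
```

Because the kernel weights sum to zero, a flat surface has zero Laplacian away from the borders, while a single raised point yields a strongly negative value at its center and positive values around it.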
## Defining the Partial Differential Equation
First, we create a perfect 500 × 500 square pond, just like the ones found in nature.
```python
N = 500
```
Next, we create the pond and a few raindrops that are about to hit it.
```python
# Initial Conditions -- some rain drops hit a pond
# Set everything to zero
u_init = np.zeros([N, N], dtype="float32")
ut_init = np.zeros([N, N], dtype="float32")
# Some rain drops hit a pond at random points
for n in range(40):
    a,b = np.random.randint(0, N, 2)
    u_init[a,b] = np.random.uniform()
DisplayArray(u_init, rng=[-0.1, 0.1])
```

Now let's specify the details of the differential equation.
```python
# Parameters:
# eps -- time resolution
# damping -- wave damping
eps = tf.placeholder(tf.float32, shape=())
damping = tf.placeholder(tf.float32, shape=())
# Create variables for simulation state
U = tf.Variable(u_init)
Ut = tf.Variable(ut_init)
# Discretized PDE update rules
U_ = U + eps * Ut
Ut_ = Ut + eps * (laplace(U) - damping * Ut)
# Operation to update the state
step = tf.group(
U.assign(U_),
Ut.assign(Ut_))
```
## Running the Simulation
To watch the simulation unfold, we run it with a simple **for** loop.
```python
# Initialize state to initial conditions
tf.initialize_all_variables().run()
# Run 1000 steps of PDE
for i in range(1000):
    # Step simulation
    step.run({eps: 0.03, damping: 0.04})
    # Visualize every 50 steps
    if i % 50 == 0:
        clear_output()
        DisplayArray(U.eval(), rng=[-0.1, 0.1])
```

Look! Raindrops fall on the pond and ripples spread, just as they do in real life.
> Original: [http://tensorflow.org/tutorials/pdes/index.md](http://tensorflow.org/tutorials/pdes/index.md) Translation: [@wangaicc](https://github.com/wangaicc) Proofreading:
================================================
FILE: SOURCE/tutorials/pdes.md
================================================
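# Partial Differential Equations
***TensorFlow*** isn't only for machine learning; it can also be used for simulation. Here we simulate a few raindrops falling onto a square pond as a way of introducing the basics of using ***TensorFlow*** to simulate the behavior of a partial differential equation.
>Note: This tutorial was originally prepared as an **IPython** notebook.
>>Translator's note: For background on partial differential equations, the translator recommends the course [**MIT OpenCourseWare: Multivariable Calculus**](http://open.163.com/special/opencourse/multivariable.html) on [**NetEase Open Courses**](http://open.163.com/).
## Basic Setup
First, we need to import a few libraries.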
```python
# Import libraries for simulation
import tensorflow as tf
import numpy as np
# Imports for visualization
import PIL.Image
from cStringIO import StringIO
from IPython.display import clear_output, Image, display
```
Next, we need a function that displays the state of the pond's surface as an image.
```python
def DisplayArray(a, fmt='jpeg', rng=[0,1]):
    """Display an array as a picture."""
    a = (a - rng[0])/float(rng[1] - rng[0])*255
    a = np.uint8(np.clip(a, 0, 255))
    f = StringIO()
    PIL.Image.fromarray(a).save(f, fmt)
    display(Image(data=f.getvalue()))
```
Finally, for convenience we open an interactive ***TensorFlow*** session. To make the code easier to reuse later, it could also be put in an executable ***Python*** file.
```python
sess = tf.InteractiveSession()
```
## Defining the Computation Functions
```python
def make_kernel(a):
    """Transform a 2D array into a convolution kernel"""
    a = np.asarray(a)
    a = a.reshape(list(a.shape) + [1,1])
    return tf.constant(a, dtype=tf.float32)

def simple_conv(x, k):
    """A simplified 2D convolution operation"""
    x = tf.expand_dims(tf.expand_dims(x, 0), -1)
    y = tf.nn.depthwise_conv2d(x, k, [1, 1, 1, 1], padding='SAME')
    return y[0, :, :, 0]

def laplace(x):
    """Compute the 2D laplacian of an array"""
    laplace_k = make_kernel([[0.5, 1.0, 0.5],
                             [1.0, -6., 1.0],
                             [0.5, 1.0, 0.5]])
    return simple_conv(x, laplace_k)
```
## Defining the Partial Differential Equation
First, we create a perfect 500 × 500 square pond, just like the ones found in nature.
```python
N = 500
```
Next, we create the pond and a few raindrops that are about to hit it.
```python
# Initial Conditions -- some rain drops hit a pond
# Set everything to zero
u_init = np.zeros([N, N], dtype="float32")
ut_init = np.zeros([N, N], dtype="float32")
# Some rain drops hit a pond at random points
for n in range(40):
    a,b = np.random.randint(0, N, 2)
    u_init[a,b] = np.random.uniform()
DisplayArray(u_init, rng=[-0.1, 0.1])
```

Now let's specify the details of the differential equation.
```python
# Parameters:
# eps -- time resolution
# damping -- wave damping
eps = tf.placeholder(tf.float32, shape=())
damping = tf.placeholder(tf.float32, shape=())
# Create variables for simulation state
U = tf.Variable(u_init)
Ut = tf.Variable(ut_init)
# Discretized PDE update rules
U_ = U + eps * Ut
Ut_ = Ut + eps * (laplace(U) - damping * Ut)
# Operation to update the state
step = tf.group(
U.assign(U_),
Ut.assign(Ut_))
```
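The discretized update rules above amount to one explicit time step of a damped wave equation. The sketch below shows the same arithmetic in pure Python on a one-dimensional surface; it is a simplification made for illustration, assuming a 1-D second difference in place of the 2-D kernel and computing both new values from the old state. The helper names `laplace_1d` and `step` are hypothetical.

```python
def laplace_1d(u):
    """Second difference of u, with zero padding at both ends."""
    padded = [0.0] + list(u) + [0.0]
    return [padded[i - 1] - 2.0 * padded[i] + padded[i + 1]
            for i in range(1, len(padded) - 1)]

def step(u, ut, eps, damping):
    """One explicit update of height u and velocity ut from the old state."""
    lap = laplace_1d(u)
    new_u = [ui + eps * uti for ui, uti in zip(u, ut)]
    new_ut = [uti + eps * (li - damping * uti) for uti, li in zip(ut, lap)]
    return new_u, new_ut

u = [0.0, 0.0, 1.0, 0.0, 0.0]   # a single "raindrop" in the middle
ut = [0.0] * 5                   # the surface starts at rest
u1, ut1 = step(u, ut, eps=0.03, damping=0.04)
```

After one step the heights are unchanged because the surface started at rest, but the velocities already show the drop being pulled down while its neighbors are pushed up, which is what produces ripples on later steps.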
## Running the Simulation
To watch the simulation unfold, we run it with a simple **for** loop.
```python
# Initialize state to initial conditions
tf.initialize_all_variables().run()
# Run 1000 steps of PDE
for i in range(1000):
    # Step simulation
    step.run({eps: 0.03, damping: 0.04})
    # Visualize every 50 steps
    if i % 50 == 0:
        clear_output()
        DisplayArray(U.eval(), rng=[-0.1, 0.1])
```

Look! Raindrops fall on the pond and ripples spread, just as they do in real life.
> Original: [http://tensorflow.org/tutorials/pdes/index.md](http://tensorflow.org/tutorials/pdes/index.md) Translation: [@wangaicc](https://github.com/wangaicc) Proofreading: [@tensorfly](https://github.com/tensorfly)
================================================
FILE: SOURCE/tutorials/recurrent/index.md
================================================
# Recurrent Neural Networks
## Introduction
Take a look at [this great article](http://colah.github.io/posts/2015-08-Understanding-LSTMs/)
for an introduction to recurrent neural networks and LSTMs in particular.
## Language Modeling
In this tutorial we will show how to train a recurrent neural network on
the challenging task of language modeling. The goal is to fit a
probabilistic model that assigns probabilities to sentences. It does so by
predicting the next word in a text given a history of previous words. For this
purpose we will use the Penn Tree Bank (PTB) dataset, which is a popular
benchmark for measuring the quality of these models, while being small and
relatively fast to train.
Language modeling is key to many interesting problems such as speech
recognition, machine translation, or image captioning. It is also fun;
take a look [here](http://karpathy.github.io/2015/05/21/rnn-effectiveness/).
For the purpose of this tutorial, we will reproduce the results from
[Zaremba et al., 2014](http://arxiv.org/abs/1409.2329), which achieves very
good results on the PTB dataset.
## Tutorial Files
This tutorial references the following files from `models/rnn/ptb`:
File | Purpose
--- | ---
`ptb_word_lm.py` | The code to train a language model on the PTB dataset.
`reader.py` | The code to read the dataset.
## Download and Prepare the Data
The data required for this tutorial is in the data/ directory of the
PTB dataset from Tomas Mikolov's webpage:
http://www.fit.vutbr.cz/~imikolov/rnnlm/simple-examples.tgz
The dataset is already preprocessed and contains 10000 different words overall,
including the end-of-sentence marker and a special symbol (`<unk>`) for rare
words. In `reader.py`, we convert all of them to unique integer identifiers
to make them easy for the neural network to process.
## The Model
### LSTM
The core of the model consists of an LSTM cell that processes one word at a
time and computes probabilities of the possible continuations of the sentence.
The memory state of the network is initialized with a vector of zeros and gets
updated after reading each word. Also, for computational reasons, we will
process data in mini-batches of size `batch_size`.
The basic pseudocode looks as follows:
```python
lstm = rnn_cell.BasicLSTMCell(lstm_size)
# Initial state of the LSTM memory.
state = tf.zeros([batch_size, lstm.state_size])
loss = 0.0
for current_batch_of_words in words_in_dataset:
    # The value of state is updated after processing each batch of words.
    output, state = lstm(current_batch_of_words, state)
    # The LSTM output can be used to make next word predictions
    logits = tf.matmul(output, softmax_w) + softmax_b
    probabilities = tf.nn.softmax(logits)
    loss += loss_function(probabilities, target_words)
```
### Truncated Backpropagation
In order to make the learning process tractable, it is a common practice to
truncate the gradients for backpropagation to a fixed number (`num_steps`)
of unrolled steps.
This is easy to implement by feeding inputs of length `num_steps` at a time and
doing a backward pass after each iteration.
A simplified version of the code for the graph creation for truncated
backpropagation:
```python
# Placeholder for the inputs in a given iteration.
words = tf.placeholder(tf.int32, [batch_size, num_steps])
lstm = rnn_cell.BasicLSTMCell(lstm_size)
# Initial state of the LSTM memory.
initial_state = state = tf.zeros([batch_size, lstm.state_size])
for i in range(num_steps):
    # The value of state is updated after processing each batch of words.
    output, state = lstm(words[:, i], state)
    # The rest of the code.
    # ...
final_state = state
```
And this is how to implement an iteration over the whole dataset:
```python
# A numpy array holding the state of LSTM after each batch of words.
numpy_state = initial_state.eval()
total_loss = 0.0
for current_batch_of_words in words_in_dataset:
    numpy_state, current_loss = session.run([final_state, loss],
        # Initialize the LSTM state from the previous iteration.
        feed_dict={initial_state: numpy_state, words: current_batch_of_words})
    total_loss += current_loss
```
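The state-carrying pattern in the loop above can be illustrated with a toy recurrence in plain Python. This sketch only models the forward pass: the update `state = 0.5 * state + x` is an arbitrary stand-in for the LSTM, and the helpers `run_chunk` and `iterate_dataset` are hypothetical names. It shows that feeding `num_steps` inputs at a time while passing the final state back in as the next initial state reproduces the state of one long unrolled pass.

```python
def run_chunk(initial_state, chunk):
    """Unroll a toy recurrence, state = 0.5 * state + x, over one chunk."""
    state = initial_state
    for x in chunk:
        state = 0.5 * state + x
    return state

def iterate_dataset(sequence, num_steps):
    """Feed `num_steps` inputs at a time, carrying the state between chunks."""
    state = 0.0  # zero initial state, as in the tutorial
    for start in range(0, len(sequence), num_steps):
        state = run_chunk(state, sequence[start:start + num_steps])
    return state

sequence = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
```

What truncation changes is the backward pass: gradients would flow only within each chunk of `num_steps` inputs, not across the chunk boundaries.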
### Inputs
The word IDs will be embedded into a dense representation (see the
[Vector Representations Tutorial](../../tutorials/word2vec/index.md)) before feeding to
the LSTM. This allows the model to efficiently represent the knowledge about
particular words. It is also easy to write:
```python
# embedding_matrix is a tensor of shape [vocabulary_size, embedding_size]
word_embeddings = tf.nn.embedding_lookup(embedding_matrix, word_ids)
```
The embedding matrix will be initialized randomly and the model will learn to
differentiate the meaning of words just by looking at the data.
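Conceptually, an embedding lookup is just row selection from the embedding matrix. The pure-Python sketch below illustrates that idea with a made-up three-word vocabulary; it is not the TensorFlow implementation, and the values in `embedding_matrix` are arbitrary.

```python
def embedding_lookup(embedding_matrix, word_ids):
    """Return one embedding row per word ID."""
    return [embedding_matrix[i] for i in word_ids]

# vocabulary_size = 3, embedding_size = 2 (arbitrary illustrative values)
embedding_matrix = [[0.0, 0.1],
                    [1.0, 1.1],
                    [2.0, 2.1]]
word_embeddings = embedding_lookup(embedding_matrix, [2, 0, 2])
```

Repeated word IDs map to the same row, so every occurrence of a word shares one learned vector.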
### Loss Function
We want to minimize the average negative log probability of the target words:
$$ \text{loss} = -\frac{1}{N}\sum_{i=1}^{N} \ln p_{\text{target}_i} $$
It is not very difficult to implement but the function
`sequence_loss_by_example` is already available, so we can just use it here.
The typical measure reported in the papers is average per-word perplexity (often
just called perplexity), which is equal to
$$e^{-\frac{1}{N}\sum_{i=1}^{N} \ln p_{\text{target}_i}} = e^{\text{loss}} $$
and we will monitor its value throughout the training process.
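The two formulas above translate directly into a few lines of plain Python. In this sketch, `target_probs` is assumed to hold the probability the model assigned to each correct target word; the function names are illustrative.

```python
import math

def average_loss(target_probs):
    """Average negative log probability of the target words."""
    return -sum(math.log(p) for p in target_probs) / len(target_probs)

def perplexity(target_probs):
    """Per-word perplexity, the exponential of the average loss."""
    return math.exp(average_loss(target_probs))
```

A model that always assigns probability 1/4 to the correct word has a perplexity of 4, which can be read as being as uncertain as a uniform choice among four words.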
### Stacking multiple LSTMs
To give the model more expressive power, we can add multiple layers of LSTMs
to process the data. The output of the first layer will become the input of
the second and so on.
We have a class called `MultiRNNCell` that makes the implementation seamless:
```python
lstm = rnn_cell.BasicLSTMCell(lstm_size)
stacked_lstm = rnn_cell.MultiRNNCell([lstm] * number_of_layers)
initial_state = state = stacked_lstm.zero_state(batch_size, tf.float32)
for i in range(num_steps):
    # The value of state is updated after processing each batch of words.
    output, state = stacked_lstm(words[:, i], state)
    # The rest of the code.
    # ...
final_state = state
```
## Compile and Run the Code
First, the library needs to be built. To compile it on CPU:
```
bazel build -c opt tensorflow/models/rnn/ptb:ptb_word_lm
```
And if you have a fast GPU, run the following:
```
bazel build -c opt --config=cuda tensorflow/models/rnn/ptb:ptb_word_lm
```
Now we can run the model:
```
bazel-bin/tensorflow/models/rnn/ptb/ptb_word_lm \
--data_path=/tmp/simple-examples/data/ --alsologtostderr --model small
```
There are 3 supported model configurations in the tutorial code: "small",
"medium" and "large". They differ in the size of the LSTMs and in
the set of hyperparameters used for training.
The larger the model, the better results it should get. The `small` model should
be able to reach perplexity below 120 on the test set and the `large` one below
80, though it might take several hours to train.
## What Next?
There are several tricks that we haven't mentioned that make the model better,
including:
* decreasing learning rate schedule,
* dropout between the LSTM layers.
Study the code and modify it to improve the model even further.
================================================
FILE: SOURCE/tutorials/recurrent.md
================================================
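# Recurrent Neural Networks
## Introduction
See [this great article](http://colah.github.io/posts/2015-08-Understanding-LSTMs/) for an introduction to recurrent neural networks (RNNs) and LSTMs.
## Language Modeling
This tutorial shows how to train a recurrent neural network on the challenging task of language modeling. The goal is to fit a probabilistic model that assigns probabilities to sentences. It does so by predicting the next word given the words that came before. We will use the PTB (Penn Tree Bank) dataset, a popular benchmark for measuring the quality of such models that is also small and relatively fast to train.
Language modeling is key to many interesting problems, such as speech recognition, machine translation, and image captioning. It is also fun; take a look [here](http://karpathy.github.io/2015/05/21/rnn-effectiveness/).
This tutorial aims to reproduce the results of [Zaremba et al., 2014](http://arxiv.org/abs/1409.2329), which achieves very good results on the PTB dataset.
## Tutorial Files
This tutorial uses the following files from `models/rnn/ptb`:
File | Purpose
--- | ---
`ptb_word_lm.py` | Trains a language model on the PTB dataset.
`reader.py` | Reads the dataset.
## Download and Prepare the Data
The data required for this tutorial is in the data/ directory of the PTB dataset from Tomas Mikolov's webpage: `http://www.fit.vutbr.cz/~imikolov/rnnlm/simple-examples.tgz`.
The dataset is already preprocessed and contains 10000 different words overall, including the end-of-sentence marker and a special symbol (`<unk>`) for rare words. In `reader.py` we convert all of them to unique integer identifiers to make them easy for the neural network to process.
## The Model
### LSTM
The core of the model is an LSTM cell that processes one word at a time and computes the probabilities of the possible continuations of the sentence. The memory state of the network is initialized with a vector of zeros and is updated after reading each word. Also, for computational reasons, we process data in mini-batches of size `batch_size`.
The basic pseudocode looks as follows: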
```python
lstm = rnn_cell.BasicLSTMCell(lstm_size)
# Initial state of the LSTM memory.
state = tf.zeros([batch_size, lstm.state_size])
loss = 0.0
for current_batch_of_words in words_in_dataset:
    # The value of state is updated after processing each batch of words.
    output, state = lstm(current_batch_of_words, state)
    # The LSTM output can be used to make next word predictions
    logits = tf.matmul(output, softmax_w) + softmax_b
    probabilities = tf.nn.softmax(logits)
    loss += loss_function(probabilities, target_words)
```
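### Truncated Backpropagation
To keep the learning process tractable, it is common practice to truncate the gradients for backpropagation to a fixed number (`num_steps`) of unrolled steps.
This is easy to implement by feeding inputs of length `num_steps` at a time and doing a backward pass after each iteration.
A simplified version of the graph-creation code for truncated backpropagation: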
```python
# Placeholder for the inputs in a given iteration.
words = tf.placeholder(tf.int32, [batch_size, num_steps])
lstm = rnn_cell.BasicLSTMCell(lstm_size)
# Initial state of the LSTM memory.
initial_state = state = tf.zeros([batch_size, lstm.state_size])
for i in range(num_steps):
    # The value of state is updated after processing each batch of words.
    output, state = lstm(words[:, i], state)
    # The rest of the code.
    # ...
final_state = state
```
And this is how to implement an iteration over the whole dataset:
```python
# A numpy array holding the state of LSTM after each batch of words.
numpy_state = initial_state.eval()
total_loss = 0.0
for current_batch_of_words in words_in_dataset:
    numpy_state, current_loss = session.run([final_state, loss],
        # Initialize the LSTM state from the previous iteration.
        feed_dict={initial_state: numpy_state, words: current_batch_of_words})
    total_loss += current_loss
```
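### Inputs
Before being fed to the LSTM, the word IDs are embedded into a dense representation (see the [Vector Representations Tutorial](tensorflow-zh/SOURCE/tutorials/word2vec/index.md)). This lets the model represent knowledge about particular words efficiently, and it is also easy to write: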
```python
# embedding_matrix is a tensor of shape [vocabulary_size, embedding_size]
word_embeddings = tf.nn.embedding_lookup(embedding_matrix, word_ids)
```
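The embedding matrix is initialized randomly, and the model learns to differentiate the meaning of words just by looking at the data.
### Loss Function
We want to minimize the average negative log probability of the target words:
$$ \text{loss} = -\frac{1}{N}\sum_{i=1}^{N} \ln p_{\text{target}_i} $$
It is not very difficult to implement, but the function `sequence_loss_by_example` is already available, so we can use it directly.
The typical measure reported in papers is average per-word perplexity, which is equal to
$$e^{-\frac{1}{N}\sum_{i=1}^{N} \ln p_{\text{target}_i}} = e^{\text{loss}} $$
and we will monitor its value throughout the training process.
### Stacking multiple LSTMs
To give the model more expressive power, we can add multiple layers of LSTMs to process the data. The output of the first layer becomes the input of the second, and so on.
The `MultiRNNCell` class makes the implementation seamless: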
```python
lstm = rnn_cell.BasicLSTMCell(lstm_size)
stacked_lstm = rnn_cell.MultiRNNCell([lstm] * number_of_layers)
initial_state = state = stacked_lstm.zero_state(batch_size, tf.float32)
for i in range(num_steps):
    # The value of state is updated after processing each batch of words.
    output, state = stacked_lstm(words[:, i], state)
    # The rest of the code.
    # ...
final_state = state
```
## Compile and Run the Code
First, the library needs to be built. To compile it on CPU:
```
bazel build -c opt tensorflow/models/rnn/ptb:ptb_word_lm
```
And if you have a fast GPU, run the following:
```
bazel build -c opt --config=cuda tensorflow/models/rnn/ptb:ptb_word_lm
```
Now we can run the model:
```
bazel-bin/tensorflow/models/rnn/ptb/ptb_word_lm \
--data_path=/tmp/simple-examples/data/ --alsologtostderr --model small
```
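There are 3 supported model configurations in the tutorial code: "small",
"medium" and "large". They differ in the size of the LSTMs and in the set of hyperparameters used for training.
The larger the model, the better the results it should get. The `small` model should be able to reach a perplexity below 120 on the test set and the `large` one below 80, though it might take several hours to train.
## What Next?
There are several tricks for improving the model that we haven't mentioned, including:
* a decreasing learning rate schedule,
* dropout between the LSTM layers.
Study the code and modify it to improve the model even further.
Original: [Recurrent Neural Networks](http://tensorflow.org/tutorials/recurrent/index.md)
Translation: [Warln](https://github.com/Warln)
Proofreading: [HongyangWang](https://github.com/wanghong-yang)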
================================================
FILE: SOURCE/tutorials/seq2seq/index.md
================================================
# Sequence-to-Sequence Models
Recurrent neural networks can learn to model language, as already discussed
in the [RNN Tutorial](../../tutorials/recurrent/index.md)
(if you did not read it, please go through it before proceeding with this one).
This raises an interesting question: could we condition the generated words on
some input and generate a meaningful response? For example, could we train
a neural network to translate from English to French? It turns out that
the answer is *yes*.
This tutorial will show you how to build and train such a system end-to-end.
You can start by running this binary.
```
bazel run -c opt <...>/models/rnn/translate/translate.py
--data_dir [your_data_directory]
```
It will download English-to-French translation data from the
[WMT'15 Website](http://www.statmt.org/wmt15/translation-task.html),
prepare it for training, and train. It takes about 20GB of disk space,
and a while to download and prepare (see [later](#run_it) for details),
so you can start and leave it running while reading this tutorial.
This tutorial references the following files from `models/rnn`.
File | What's in it?
--- | ---
`seq2seq.py` | Library for building sequence-to-sequence models.
`translate/seq2seq_model.py` | Neural translation sequence-to-sequence model.
`translate/data_utils.py` | Helper functions for preparing translation data.
`translate/translate.py` | Binary that trains and runs the translation model.
## Sequence-to-Sequence Basics
A basic sequence-to-sequence model, as introduced in
[Cho et al., 2014](http://arxiv.org/pdf/1406.1078v3.pdf),
consists of two recurrent neural networks (RNNs): an *encoder* that
processes the input and a *decoder* that generates the output.
This basic architecture is depicted below.
Each box in the picture above represents a cell of the RNN, most commonly
a GRU cell or an LSTM cell (see the [RNN Tutorial](../../tutorials/recurrent/index.md)
for an explanation of those). Encoder and decoder can share weights or,
as is more common, use a different set of parameters. Multi-layer cells
have been successfully used in sequence-to-sequence models too, e.g. for
translation [Sutskever et al., 2014](http://arxiv.org/abs/1409.3215).
In the basic model depicted above, every input has to be encoded into
a fixed-size state vector, as that is the only thing passed to the decoder.
To allow the decoder more direct access to the input, an *attention* mechanism
was introduced in [Bahdanau et al., 2014](http://arxiv.org/abs/1409.0473).
We will not go into the details of the attention mechanism (see the paper);
suffice it to say that it allows the decoder to peek into the input at every
decoding step. A multi-layer sequence-to-sequence network with LSTM cells and
attention mechanism in the decoder looks like this.
## TensorFlow seq2seq Library
As you can see above, there are many different sequence-to-sequence
models. Each of these models can use different RNN cells, but all
of them accept encoder inputs and decoder inputs. This motivates
the interfaces in the TensorFlow seq2seq library (`models/rnn/seq2seq.py`).
The basic RNN encoder-decoder sequence-to-sequence model works as follows.
```python
outputs, states = basic_rnn_seq2seq(encoder_inputs, decoder_inputs, cell)
```
In the above call, `encoder_inputs` is a list of tensors representing inputs
to the encoder, i.e., corresponding to the letters *A, B, C* in the first
picture above. Similarly, `decoder_inputs` is a list of tensors representing
inputs to the decoder, *GO, W, X, Y, Z* in the first picture.
The `cell` argument is an instance of the `models.rnn.rnn_cell.RNNCell` class
that determines which cell will be used inside the model. You can use
an existing cell, such as `GRUCell` or `LSTMCell`, or you can write your own.
Moreover, `rnn_cell` provides wrappers to construct multi-layer cells,
add dropout to cell inputs or outputs, or to do other transformations;
see the [RNN Tutorial](../../tutorials/recurrent/index.md) for examples.
The call to `basic_rnn_seq2seq` returns two values: `outputs` and `states`.
Both of them are lists of tensors of the same length as `decoder_inputs`.
Naturally, `outputs` correspond to the outputs of the decoder in each time-step,
in the first picture above that would be *W, X, Y, Z, EOS*. The returned
`states` represent the internal state of the decoder at every time-step.
In many applications of sequence-to-sequence models, the output of the decoder
at time t is fed back and becomes the input of the decoder at time t+1. At test
time, when decoding a sequence, this is how the sequence is constructed.
During training, on the other hand, it is common to provide the correct input
to the decoder at every time-step, even if the decoder made a mistake before.
Functions in `seq2seq.py` support both modes using the `feed_previous` argument.
For example, let's analyze the following use of an embedding RNN model.
```python
outputs, states = embedding_rnn_seq2seq(
    encoder_inputs, decoder_inputs, cell,
    num_encoder_symbols, num_decoder_symbols,
    output_projection=None, feed_previous=False)
```
In the `embedding_rnn_seq2seq` model, all inputs (both `encoder_inputs` and
`decoder_inputs`) are integer-tensors that represent discrete values.
They will be embedded into a dense representation (see the
[Vector Representations Tutorial](../../tutorials/word2vec/index.md) for more details
on embeddings), but to construct these embeddings we need to specify
the maximum number of discrete symbols that will appear: `num_encoder_symbols`
on the encoder side, and `num_decoder_symbols` on the decoder side.
In the above invocation, we set `feed_previous` to False. This means that the
decoder will use `decoder_inputs` tensors as provided. If we set `feed_previous`
to True, the decoder would only use the first element of `decoder_inputs`.
All other tensors from this list would be ignored, and instead the previous
output of the decoder would be used. This is used for decoding translations
in our translation model, but it can also be used during training, to make
the model more robust to its own mistakes, similar
to [Bengio et al., 2015](http://arxiv.org/pdf/1506.03099v2.pdf).
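
The difference between the two modes can be sketched with a toy decoder loop in plain Python. This only illustrates the control flow and is not the code in `seq2seq.py`: `toy_step` and `decode` are hypothetical names, and the "decoder" is a stand-in arithmetic function rather than an RNN.

```python
def toy_step(prev_symbol):
  """Hypothetical stand-in for one decoder step: input symbol id -> output id."""
  return (prev_symbol * 2 + 1) % 7


def decode(decoder_inputs, feed_previous):
  """Run the toy decoder for len(decoder_inputs) time-steps."""
  outputs = []
  for t, given in enumerate(decoder_inputs):
    if feed_previous and t > 0:
      # Ignore the provided input; feed back the decoder's own previous output.
      step_input = outputs[-1]
    else:
      # Use the provided input (always the case for the first step).
      step_input = given
    outputs.append(toy_step(step_input))
  return outputs


decoder_inputs = [0, 3, 5, 2]  # the first element plays the role of GO
teacher_forced = decode(decoder_inputs, feed_previous=False)
free_running = decode(decoder_inputs, feed_previous=True)
```

With `feed_previous=True` only `decoder_inputs[0]` influences the result; the remaining elements merely determine how many steps are run.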
One more important argument used above is `output_projection`. If not specified,
the outputs of the embedding model will be tensors of shape batch-size by
`num_decoder_symbols` as they represent the logits for each generated symbol.
When training models with large output vocabularies, i.e., when
`num_decoder_symbols` is large, it is not practical to store these large
tensors. Instead, it is better to return smaller output tensors, which will
later be projected onto a large output tensor using `output_projection`.
This allows us to use our seq2seq models with a sampled softmax loss, as described
in [Jean et al., 2015](http://arxiv.org/pdf/1412.2007v2.pdf).
In addition to `basic_rnn_seq2seq` and `embedding_rnn_seq2seq` there are a few
more sequence-to-sequence models in `seq2seq.py`, take a look there. They all
have similar interfaces, so we will not describe them in detail. We will use
`embedding_attention_seq2seq` for our translation model below.
## Neural Translation Model
While the core of the sequence-to-sequence model is constructed by
the functions in `models/rnn/seq2seq.py`, there are still a few tricks
that are worth mentioning that are used in our translation model in
`models/rnn/translate/seq2seq_model.py`.
### Sampled softmax and output projection
For one, as already mentioned above, we want to use sampled softmax to
handle large output vocabulary. To decode from it, we need to keep track
of the output projection. Both the sampled softmax loss and the output
projections are constructed by the following code in `seq2seq_model.py`.
```python
if num_samples > 0 and num_samples < self.target_vocab_size:
  w = tf.get_variable("proj_w", [size, self.target_vocab_size])
  w_t = tf.transpose(w)
  b = tf.get_variable("proj_b", [self.target_vocab_size])
  output_projection = (w, b)

  def sampled_loss(inputs, labels):
    labels = tf.reshape(labels, [-1, 1])
    return tf.nn.sampled_softmax_loss(w_t, b, inputs, labels, num_samples,
                                      self.target_vocab_size)
```
First, note that we only construct a sampled softmax if the number of samples
(512 by default) is smaller than the target vocabulary size. For vocabularies
smaller than 512 it might be a better idea to just use a standard softmax loss.
Then, as you can see, we construct an output projection. It is a pair,
consisting of a weight matrix and a bias vector. If used, the rnn cell
will return vectors of shape batch-size by `size`, rather than batch-size
by `target_vocab_size`. To recover logits, we need to multiply by the weight
matrix and add the biases, as is done in lines 124-126 in `seq2seq_model.py`.
```python
if output_projection is not None:
  self.outputs[b] = [tf.matmul(output, output_projection[0]) +
                     output_projection[1] for ...]
```
### Bucketing and padding
In addition to sampled softmax, our translation model also makes use
of *bucketing*, which is a method to efficiently handle sentences of
different lengths. Let us first clarify the problem. When translating
English to French, we will have English sentences of different lengths L1
on input, and French sentences of different lengths L2 on output. Since
the English sentence is passed as `encoder_inputs`, and the French sentence
comes as `decoder_inputs` (prefixed by a GO symbol), we should in principle
create a seq2seq model for every pair (L1, L2+1) of lengths of an English
and French sentence. This would result in an enormous graph consisting of
many very similar subgraphs. On the other hand, we could just pad every
sentence with a special PAD symbol. Then we'd need only one seq2seq model,
for the padded lengths. But on shorter sentences our model would be inefficient,
encoding and decoding many PAD symbols that are useless.
As a compromise between constructing a graph for every pair of lengths and
padding to a single length, we use a number of *buckets* and pad each sentence
to the length of the bucket above it. In `translate.py` we use the following
default buckets.
```python
buckets = [(5, 10), (10, 15), (20, 25), (40, 50)]
```
This means that if the input is an English sentence with 3 tokens,
and the corresponding output is a French sentence with 6 tokens,
then they will be put in the first bucket and padded to length 5 for
encoder inputs, and length 10 for decoder inputs. If we have an English
sentence with 8 tokens and the corresponding French sentence has 18 tokens,
then they will not fit into the (10, 15) bucket, and so the (20, 25) bucket
will be used, i.e. the English sentence will be padded to 20, and the French
one to 25.
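
The bucket choice described above can be sketched as a small helper. This is an illustration rather than the code in `translate.py`: `pick_bucket` is a hypothetical name, and the rule that the decoder side needs room for two extra symbols (`GO` and `EOS`) is an assumption made here for the sketch.

```python
buckets = [(5, 10), (10, 15), (20, 25), (40, 50)]


def pick_bucket(source_len, target_len, buckets):
  """Return the index of the smallest bucket that fits, or None if none does.

  Assumption: decoder inputs hold the target tokens plus GO and EOS,
  so the decoder side needs target_len + 2 positions.
  """
  for bucket_id, (source_size, target_size) in enumerate(buckets):
    if source_len <= source_size and target_len + 2 <= target_size:
      return bucket_id
  return None


short_pair = pick_bucket(3, 6, buckets)    # the 3-token / 6-token example
longer_pair = pick_bucket(8, 18, buckets)  # the 8-token / 18-token example
too_long = pick_bucket(100, 100, buckets)  # fits no bucket
```

Because the buckets are listed from smallest to largest, the first match is also the one that wastes the least padding.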
Remember that when constructing decoder inputs we prepend the special `GO`
symbol to the input data. This is done in the `get_batch()` function in
`seq2seq_model.py`, which also reverses the input English sentence.
Reversing the inputs was shown to improve results for the neural translation
model in [Sutskever et al., 2014](http://arxiv.org/abs/1409.3215).
To put it all together, imagine we have the sentence "I go.", tokenized
as `["I", "go", "."]` as input and the sentence "Je vais." as output,
tokenized `["Je", "vais", "."]`. It will be put in the (5, 10) bucket,
with encoder inputs representing `[PAD PAD "." "go" "I"]` and decoder
inputs `[GO "Je" "vais" "." EOS PAD PAD PAD PAD PAD]`.
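
The padding and reversal in this example can be reproduced with a few lines of Python. This is a minimal sketch, not the actual `get_batch()` implementation: `make_inputs` is a hypothetical helper, it works on string tokens instead of integer ids, and it assumes the sentence pair already fits into the given bucket.

```python
PAD, GO, EOS = "PAD", "GO", "EOS"


def make_inputs(source_tokens, target_tokens, bucket):
  """Build padded encoder and decoder inputs for one sentence pair."""
  encoder_size, decoder_size = bucket
  # Pad the source up to the bucket size, then reverse the whole sequence.
  encoder_pad = [PAD] * (encoder_size - len(source_tokens))
  encoder_inputs = list(reversed(source_tokens + encoder_pad))
  # The decoder gets GO, the target tokens, EOS, and then padding.
  decoder_core = [GO] + target_tokens + [EOS]
  decoder_inputs = decoder_core + [PAD] * (decoder_size - len(decoder_core))
  return encoder_inputs, decoder_inputs


enc, dec = make_inputs(["I", "go", "."], ["Je", "vais", "."], (5, 10))
```

Padding before reversing is what places the `PAD` symbols at the front of the encoder inputs, so the real tokens sit right next to the start of decoding.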
## Let's Run It
To train the model described above, we need a large English-French corpus.
We will use the *10^9-French-English corpus* from the
[WMT'15 Website](http://www.statmt.org/wmt15/translation-task.html)
for training, and the 2013 news test from the same site as development set.
Both data-sets will be downloaded to `data_dir` and training will start,
saving checkpoints in `train_dir`, when this command is run.
```
bazel run -c opt <...>/models/rnn/translate:translate
--data_dir [your_data_directory] --train_dir [checkpoints_directory]
--en_vocab_size=40000 --fr_vocab_size=40000
```
It takes about 18GB of disk space and several hours to prepare the training
corpus. It is unpacked, vocabulary files are created in `data_dir`, and then
the corpus is tokenized and converted to integer ids. Note the parameters
that determine vocabulary sizes. In the example above, all words outside
the 40K most common ones will be converted to an `UNK` token representing
unknown words. So if you change vocabulary size, the binary will re-map
the corpus to token-ids again.
After the data is prepared, training starts. Default parameters in `translate`
are set to quite large values. Large models trained over a long time give good
results, but it might take too long or use too much memory for your GPU.
You can request to train a smaller model as in the following example.
```
bazel run -c opt <...>/models/rnn/translate:translate
--data_dir [your_data_directory] --train_dir [checkpoints_directory]
--size=256 --num_layers=2 --steps_per_checkpoint=50
```
The above command will train a model with 2 layers (the default is 3),
each layer with 256 units (default is 1024), and will save a checkpoint
every 50 steps (the default is 200). You can play with these parameters
to find out how large a model can be to fit into the memory of your GPU.
During training, every `steps_per_checkpoint` steps the binary will print
out statistics from recent steps. With the default parameters (3 layers
of size 1024), first messages look like this.
```
global step 200 learning rate 0.5000 step-time 1.39 perplexity 1720.62
eval: bucket 0 perplexity 184.97
eval: bucket 1 perplexity 248.81
eval: bucket 2 perplexity 341.64
eval: bucket 3 perplexity 469.04
global step 400 learning rate 0.5000 step-time 1.38 perplexity 379.89
eval: bucket 0 perplexity 151.32
eval: bucket 1 perplexity 190.36
eval: bucket 2 perplexity 227.46
eval: bucket 3 perplexity 238.66
```
You can see how long each step takes (just under 1.4 seconds), the perplexity
on the training set, and the perplexities on the development set
for each bucket. After about 30K steps, we see perplexities on short
sentences (bucket 0 and 1) going into single digits.
Since the training corpus contains ~22M sentences, one epoch (going through
the training data once) takes about 340K steps with batch-size of 64. At this
point the model can be used for translating English sentences to French
using the `--decode` option.
```
bazel run -c opt <...>/models/rnn/translate:translate --decode
--data_dir [your_data_directory] --train_dir [checkpoints_directory]
Reading model parameters from /tmp/translate.ckpt-340000
> Who is the president of the United States?
Qui est le président des États-Unis ?
```
## What Next?
The example above shows how you can build your own English-to-French
translator, end-to-end. Run it and see how the model performs for yourself.
While it has reasonable quality, the default parameters will not give you
the best translation model. Here are a few things you can improve.
First of all, we use a very primitive tokenizer, the `basic_tokenizer` function
in `data_utils`. A better tokenizer can be found on the
[WMT'15 Website](http://www.statmt.org/wmt15/translation-task.html).
Using that tokenizer, and a larger vocabulary, should improve your translations.
Also, the default parameters of the translation model are not tuned.
You can try changing the learning rate, decay, or initializing the weights
of your model in a different way. You can also change the default
`GradientDescentOptimizer` in `seq2seq_model.py` to a more advanced one, such
as `AdagradOptimizer`. Try these things and see how they improve your results!
Finally, the model presented above can be used for any sequence-to-sequence
task, not only for translation. Even if you want to transform a sequence to
a tree, for example to generate a parsing tree, the same model as above can
give state-of-the-art results, as demonstrated in
[Vinyals & Kaiser et al., 2015](http://arxiv.org/abs/1412.7449).
So you can not only build your own translator, you can also build a parser,
a chat-bot, or any program that comes to your mind. Experiment!
================================================
FILE: SOURCE/tutorials/seq2seq.md
================================================
# Sequence-to-Sequence Models
Recurrent neural networks can learn to model language, as already discussed
in the [RNN Tutorial](tensorflow-zh/SOURCE/tutorials/recurrent/index.md)
(if you did not read it, please go through it before proceeding with this one).
This raises an interesting question: could we condition the generated words on
some input and generate a meaningful response? For example, could we train
a neural network to translate from English to French? It turns out that
the answer is *yes*.
This tutorial will show you how to build and train such a system end-to-end.
You can start by running this binary.
```
bazel run -c opt <...>/models/rnn/translate:translate
--data_dir [your_data_directory]
```
It will download English-to-French translation data from the
[WMT'15 Website](http://www.statmt.org/wmt15/translation-task.html),
prepare it for training and train. It takes about 20GB of disk space,
and a while to download and prepare (see [later](#run_it) for details),
so you can start and leave it running while reading this tutorial.
This tutorial references the following files from `models/rnn`.
File | What's in it?
--- | ---
`seq2seq.py` | Library for building sequence-to-sequence models.
`translate/seq2seq_model.py` | Neural translation sequence-to-sequence model.
`translate/data_utils.py` | Helper functions for preparing translation data.
`translate/translate.py` | Binary that trains and runs the translation model.
## Sequence-to-Sequence Basics
A basic sequence-to-sequence model, as introduced in
[Cho et al., 2014](http://arxiv.org/pdf/1406.1078v3.pdf),
consists of two recurrent neural networks (RNNs): an *encoder* that
processes the input and a *decoder* that generates the output.
This basic architecture is depicted below.
Each box in the picture above represents a cell of the RNN, most commonly
a GRU cell or an LSTM cell (see the [RNN Tutorial](tensorflow-zh/SOURCE/tutorials/recurrent/index.md)
for an explanation of those). Encoder and decoder can share weights or,
as is more common, use a different set of parameters. Multi-layer cells
have been successfully used in sequence-to-sequence models too, e.g. for
translation [Sutskever et al., 2014](http://arxiv.org/abs/1409.3215).
In the basic model depicted above, every input has to be encoded into
a fixed-size state vector, as that is the only thing passed to the decoder.
To allow the decoder more direct access to the input, an *attention* mechanism
was introduced in [Bahdanau et al., 2014](http://arxiv.org/abs/1409.0473).
We will not go into the details of the attention mechanism (see the paper);
suffice it to say that it allows the decoder to peek into the input at every
decoding step. A multi-layer sequence-to-sequence network with LSTM cells and
attention mechanism in the decoder looks like this.
## TensorFlow seq2seq Library
As you can see above, there are many different sequence-to-sequence
models. Each of these models can use different RNN cells, but all
of them accept encoder inputs and decoder inputs. This motivates
the interfaces in the TensorFlow seq2seq library (`models/rnn/seq2seq.py`).
The basic RNN encoder-decoder sequence-to-sequence model works as follows.
```python
outputs, states = basic_rnn_seq2seq(encoder_inputs, decoder_inputs, cell)
```
In the above call, `encoder_inputs` are a list of tensors representing inputs
to the encoder, i.e., corresponding to the letters *A, B, C* in the first
picture above. Similarly, `decoder_inputs` are tensors representing inputs
to the decoder, *GO, W, X, Y, Z* on the first picture.
The `cell` argument is an instance of the `models.rnn.rnn_cell.RNNCell` class
that determines which cell will be used inside the model. You can use
an existing cell, such as `GRUCell` or `LSTMCell`, or you can write your own.
Moreover, `rnn_cell` provides wrappers to construct multi-layer cells,
add dropout to cell inputs or outputs, or to do other transformations;
see the [RNN Tutorial](tensorflow-zh/SOURCE/tutorials/recurrent/index.md) for examples.
The call to `basic_rnn_seq2seq` returns two values: `outputs` and `states`.
Both of them are lists of tensors of the same length as `decoder_inputs`.
Naturally, `outputs` correspond to the outputs of the decoder in each time-step,
in the first picture above that would be *W, X, Y, Z, EOS*. The returned
`states` represent the internal state of the decoder at every time-step.
In many applications of sequence-to-sequence models, the output of the decoder
at time t is fed back and becomes the input of the decoder at time t+1. At test
time, when decoding a sequence, this is how the sequence is constructed.
During training, on the other hand, it is common to provide the correct input
to the decoder at every time-step, even if the decoder made a mistake before.
Functions in `seq2seq.py` support both modes using the `feed_previous` argument.
For example, let's analyze the following use of an embedding RNN model.
```python
outputs, states = embedding_rnn_seq2seq(
    encoder_inputs, decoder_inputs, cell,
    num_encoder_symbols, num_decoder_symbols,
    output_projection=None, feed_previous=False)
```
In the `embedding_rnn_seq2seq` model, all inputs (both `encoder_inputs` and
`decoder_inputs`) are integer-tensors that represent discrete values.
They will be embedded into a dense representation (see the
[Vector Representations Tutorial](tensorflow-zh/SOURCE/tutorials/word2vec/index.md) for more details
on embeddings), but to construct these embeddings we need to specify
the maximum number of discrete symbols that will appear: `num_encoder_symbols`
on the encoder side, and `num_decoder_symbols` on the decoder side.
In the above invocation, we set `feed_previous` to False. This means that the
decoder will use `decoder_inputs` tensors as provided. If we set `feed_previous`
to True, the decoder would only use the first element of `decoder_inputs`.
All other tensors from this list would be ignored, and instead the previous
output of the decoder would be used. This is used for decoding translations
in our translation model, but it can also be used during training, to make
the model more robust to its own mistakes, similar
to [Bengio et al., 2015](http://arxiv.org/pdf/1506.03099v2.pdf).
One more important argument used above is `output_projection`. If not specified,
the outputs of the embedding model will be tensors of shape batch-size by
`num_decoder_symbols` as they represent the logits for each generated symbol.
When training models with large output vocabularies, i.e., when
`num_decoder_symbols` is large, it is not practical to store these large
tensors. Instead, it is better to return smaller output tensors, which will
later be projected onto a large output tensor using `output_projection`.
This allows us to use our seq2seq models with a sampled softmax loss, as described
in [Jean et al., 2015](http://arxiv.org/pdf/1412.2007v2.pdf).
In addition to `basic_rnn_seq2seq` and `embedding_rnn_seq2seq` there are a few
more sequence-to-sequence models in `seq2seq.py`, take a look there. They all
have similar interfaces, so we will not describe them in detail. We will use
`embedding_attention_seq2seq` for our translation model below.
## Neural Translation Model
While the core of the sequence-to-sequence model is constructed by
the functions in `models/rnn/seq2seq.py`, there are still a few tricks
that are worth mentioning that are used in our translation model in
`models/rnn/translate/seq2seq_model.py`.
### Sampled softmax and output projection
For one, as already mentioned above, we want to use sampled softmax to
handle large output vocabulary. To decode from it, we need to keep track
of the output projection. Both the sampled softmax loss and the output
projections are constructed by the following code in `seq2seq_model.py`.
```python
if num_samples > 0 and num_samples < self.target_vocab_size:
  w = tf.get_variable("proj_w", [size, self.target_vocab_size])
  w_t = tf.transpose(w)
  b = tf.get_variable("proj_b", [self.target_vocab_size])
  output_projection = (w, b)

  def sampled_loss(inputs, labels):
    labels = tf.reshape(labels, [-1, 1])
    return tf.nn.sampled_softmax_loss(w_t, b, inputs, labels, num_samples,
                                      self.target_vocab_size)
```
First, note that we only construct a sampled softmax if the number of samples
(512 by default) is smaller than the target vocabulary size. For vocabularies
smaller than 512 it might be a better idea to just use a standard softmax loss.
Then, as you can see, we construct an output projection. It is a pair,
consisting of a weight matrix and a bias vector. If used, the rnn cell
will return vectors of shape batch-size by `size`, rather than batch-size
by `target_vocab_size`. To recover logits, we need to multiply by the weight
matrix and add the biases, as is done in lines 124-126 in `seq2seq_model.py`.
```python
if output_projection is not None:
  self.outputs[b] = [tf.matmul(output, output_projection[0]) +
                     output_projection[1] for ...]
```
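
To make the shapes concrete, here is a small sketch of the projection step in plain Python lists, with no TensorFlow involved. The helper name `project` and all numeric values are made up for illustration; the toy sizes are batch-size 2, `size` 3 and `target_vocab_size` 4.

```python
def project(outputs, w, b):
  """Compute outputs * w + b for a batch of row vectors stored as lists."""
  logits = []
  for row in outputs:
    logits.append([
        sum(row[k] * w[k][j] for k in range(len(w))) + b[j]
        for j in range(len(b))
    ])
  return logits


# Decoder outputs of shape batch-size by size (illustrative values).
outputs = [[1.0, 0.0, 2.0],
           [0.0, 1.0, 1.0]]
# Weight matrix of shape size by target_vocab_size, and a bias vector.
w = [[1.0, 0.0, 0.0, 1.0],
     [0.0, 1.0, 0.0, 1.0],
     [0.0, 0.0, 1.0, 1.0]]
b = [0.5, 0.0, -0.5, 0.0]

# Logits of shape batch-size by target_vocab_size.
logits = project(outputs, w, b)
```

The model only has to keep the narrow batch-size by `size` outputs around during training; the wide logits are produced on demand when decoding.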
### Bucketing and padding
In addition to sampled softmax, our translation model also makes use
of *bucketing*, which is a method to efficiently handle sentences of
different lengths. Let us first clarify the problem. When translating
English to French, we will have English sentences of different lengths L1
on input, and French sentences of different lengths L2 on output. Since
the English sentence is passed as `encoder_inputs`, and the French sentence
comes as `decoder_inputs` (prefixed by a GO symbol), we should in principle
create a seq2seq model for every pair (L1, L2+1) of lengths of an English
and French sentence. This would result in an enormous graph consisting of
many very similar subgraphs. On the other hand, we could just pad every
sentence with a special PAD symbol. Then we'd need only one seq2seq model,
for the padded lengths. But on shorter sentences our model would be inefficient,
encoding and decoding many PAD symbols that are useless.
As a compromise between constructing a graph for every pair of lengths and
padding to a single length, we use a number of *buckets* and pad each sentence
to the length of the bucket above it. In `translate.py` we use the following
default buckets.
```python
buckets = [(5, 10), (10, 15), (20, 25), (40, 50)]
```
This means that if the input is an English sentence with 3 tokens,
and the corresponding output is a French sentence with 6 tokens,
then they will be put in the first bucket and padded to length 5 for
encoder inputs, and length 10 for decoder inputs. If we have an English
sentence with 8 tokens and the corresponding French sentence has 18 tokens,
then they will not fit into the (10, 15) bucket, and so the (20, 25) bucket
will be used, i.e. the English sentence will be padded to 20, and the French
one to 25.
Remember that when constructing decoder inputs we prepend the special `GO`
symbol to the input data. This is done in the `get_batch()` function in
`seq2seq_model.py`, which also reverses the input English sentence.
Reversing the inputs was shown to improve results for the neural translation
model in [Sutskever et al., 2014](http://arxiv.org/abs/1409.3215).
To put it all together, imagine we have the sentence "I go.", tokenized
as `["I", "go", "."]` as input and the sentence "Je vais." as output,
tokenized `["Je", "vais", "."]`. It will be put in the (5, 10) bucket,
with encoder inputs representing `[PAD PAD "." "go" "I"]` and decoder
inputs `[GO "Je" "vais" "." EOS PAD PAD PAD PAD PAD]`.
## Let's Run It
To train the model described above, we need a large English-French corpus.
We will use the *10^9-French-English corpus* from the
[WMT'15 Website](http://www.statmt.org/wmt15/translation-task.html)
for training, and the 2013 news test from the same site as development set.
Both data-sets will be downloaded to `data_dir` and training will start,
saving checkpoints in `train_dir`, when this command is run.
```
bazel run -c opt <...>/models/rnn/translate:translate
--data_dir [your_data_directory] --train_dir [checkpoints_directory]
--en_vocab_size=40000 --fr_vocab_size=40000
```
It takes about 18GB of disk space and several hours to prepare the training
corpus. It is unpacked, vocabulary files are created in `data_dir`, and then
the corpus is tokenized and converted to integer ids. Note the parameters
that determine vocabulary sizes. In the example above, all words outside
the 40K most common ones will be converted to an `UNK` token representing
unknown words. So if you change vocabulary size, the binary will re-map
the corpus to token-ids again.
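
The effect of the vocabulary size on the token ids can be sketched as follows. This is a simplified illustration, not the code in `data_utils.py`: `build_vocab` and `to_ids` are hypothetical helpers, and reserving id 0 for `UNK` is an assumption of this sketch.

```python
import collections

UNK = "UNK"


def build_vocab(sentences, vocab_size):
  """Map the most common tokens to ids; id 0 is reserved for UNK here."""
  counts = collections.Counter(tok for sent in sentences for tok in sent)
  vocab = {UNK: 0}
  for token, _ in counts.most_common(vocab_size - 1):
    vocab[token] = len(vocab)
  return vocab


def to_ids(sentence, vocab):
  """Replace each token by its id, falling back to the UNK id."""
  return [vocab.get(tok, vocab[UNK]) for tok in sentence]


corpus = [["the", "cat", "sat"], ["the", "dog", "sat"], ["the", "cat", "ran"]]
vocab = build_vocab(corpus, vocab_size=4)
ids = to_ids(["the", "dog", "ran"], vocab)
```

Here "dog" and "ran" fall outside the 3 most common words, so both map to the `UNK` id. A different `vocab_size` changes which words survive, which is why the corpus has to be re-mapped.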
After the data is prepared, training starts. Default parameters in `translate`
are set to quite large values. Large models trained over a long time give good
results, but it might take too long or use too much memory for your GPU.
You can request to train a smaller model as in the following example.
```
bazel run -c opt <...>/models/rnn/translate:translate
--data_dir [your_data_directory] --train_dir [checkpoints_directory]
--size=256 --num_layers=2 --steps_per_checkpoint=50
```
The above command will train a model with 2 layers (the default is 3),
each layer with 256 units (default is 1024), and will save a checkpoint
every 50 steps (the default is 200). You can play with these parameters
to find out how large a model can be to fit into the memory of your GPU.
During training, every `steps_per_checkpoint` steps the binary will print
out statistics from recent steps. With the default parameters (3 layers
of size 1024), first messages look like this.
```
global step 200 learning rate 0.5000 step-time 1.39 perplexity 1720.62
eval: bucket 0 perplexity 184.97
eval: bucket 1 perplexity 248.81
eval: bucket 2 perplexity 341.64
eval: bucket 3 perplexity 469.04
global step 400 learning rate 0.5000 step-time 1.38 perplexity 379.89
eval: bucket 0 perplexity 151.32
eval: bucket 1 perplexity 190.36
eval: bucket 2 perplexity 227.46
eval: bucket 3 perplexity 238.66
```
You can see how long each step takes (just under 1.4 seconds), the perplexity
on the training set, and the perplexities on the development set
for each bucket. After about 30K steps, we see perplexities on short
sentences (bucket 0 and 1) going into single digits.
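
As a reading aid for these numbers: perplexity is conventionally the exponential of the average per-token cross-entropy loss. The sketch below assumes that convention with natural logarithms (the text above does not spell out how the binary computes it), and `perplexity` is a hypothetical helper.

```python
import math


def perplexity(avg_loss):
  """Perplexity for an average per-token cross-entropy loss measured in nats."""
  return math.exp(avg_loss)


# A uniform guess over a 40000-word vocabulary has loss ln(40000),
# which gives a perplexity equal to the vocabulary size.
uniform_ppl = perplexity(math.log(40000))

# Conversely, a perplexity of 184.97 corresponds to this average loss.
loss_at_184 = math.log(184.97)
```

Under this convention, single-digit perplexity means the average loss has dropped below ln(10), roughly 2.3 nats per token.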
Since the training corpus contains ~22M sentences, one epoch (going through
the training data once) takes about 340K steps with batch-size of 64. At this
point the model can be used for translating English sentences to French
using the `--decode` option.
```
bazel run -c opt <...>/models/rnn/translate:translate --decode
--data_dir [your_data_directory] --train_dir [checkpoints_directory]
Reading model parameters from /tmp/translate.ckpt-340000
> Who is the president of the United States?
Qui est le président des États-Unis ?
```
## What Next?
The example above shows how you can build your own English-to-French
translator, end-to-end. Run it and see how the model performs for yourself.
While it has reasonable quality, the default parameters will not give you
the best translation model. Here are a few things you can improve.
First of all, we use a very primitive tokenizer, the `basic_tokenizer` function
in `data_utils`. A better tokenizer can be found on the
[WMT'15 Website](http://www.statmt.org/wmt15/translation-task.html).
Using that tokenizer, and a larger vocabulary, should improve your translations.
Also, the default parameters of the translation model are not tuned.
You can try changing the learning rate, decay, or initializing the weights
of your model in a different way. You can also change the default
`GradientDescentOptimizer` in `seq2seq_model.py` to a more advanced one, such
as `AdagradOptimizer`. Try these things and see how they improve your results!
Finally, the model presented above can be used for any sequence-to-sequence
task, not only for translation. Even if you want to transform a sequence to
a tree, for example to generate a parsing tree, the same model as above can
give state-of-the-art results, as demonstrated in
[Vinyals & Kaiser et al., 2015](http://arxiv.org/abs/1412.7449).
So you can not only build your own translator, you can also build a parser,
a chat-bot, or any program that comes to your mind. Experiment!
================================================
FILE: SOURCE/tutorials/word2vec/__init__.py
================================================
================================================
FILE: SOURCE/tutorials/word2vec/index.md
================================================
# Vector Representations of Words
In this tutorial we look at the word2vec model by [Mikolov et al](http://papers.nips.cc/paper/5021-distributed-representations-of-words-and-phrases-and-their-compositionality.pdf). This model is used for learning vector representations of words, called "word embeddings".
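## Highlights
This tutorial is meant to highlight the interesting, substantive parts of building a word2vec model in TensorFlow.
* We start with the motivation for representing words as vectors.
* We look at the intuition behind the model and how it is trained (with a little math to back it up).
* We also show a simple implementation of the model in TensorFlow.
* Finally, we look at ways to make this simple version perform better.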
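We walk through the code step by step later in the tutorial, but if you would rather dive straight in, you can look at the minimal implementation in
[tensorflow/examples/tutorials/word2vec/word2vec_basic.py](https://www.tensorflow.org/code/tensorflow/examples/tutorials/word2vec/word2vec_basic.py). This basic example contains the code needed to download some data, train on it briefly and display the result. Once you are comfortable with the basic version, you can move on to
[tensorflow/models/embedding/word2vec.py](https://www.tensorflow.org/code/tensorflow/models/embedding/word2vec.py), a more serious implementation that also showcases some more advanced TensorFlow features, such as how to use threads efficiently to feed data into a text model and how to checkpoint during training.
But first, let's look at why we would want to learn word embeddings at all. If you are already an expert on word embeddings, feel free to skip this section and get your hands dirty with the details.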
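## Motivation: Why Learn Word Embeddings?
Image and audio processing systems work with rich, high-dimensional datasets encoded as vectors of the individual raw pixel intensities of an image or the power spectral density coefficients of an audio signal. For tasks such as object or speech recognition, all the information we need is already contained in the raw data (humans, after all, perform these tasks from the raw data). Natural language processing systems, however, traditionally treat words as discrete atomic symbols: "cat" may be represented as `Id537` and "dog" as `Id143`. These encodings are arbitrary and provide no information about the relationships that may exist between different words. In other words, when processing data about "dogs" the model can make little use of what it has learned about "cats" (for example, that both are animals, four-legged, kept as pets, and so on). Representing words as unique, discrete ids furthermore leads to data sparsity, which means we need more data to train statistical models successfully. Vector representations of words can overcome some of these obstacles.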
Et voila! As expected, similar words end up clustered together. A more heavyweight implementation of word2vec that uses more of TensorFlow's advanced features can be found in
[tensorflow/models/embedding/word2vec.py](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/models/embedding/word2vec.py).
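## Evaluating Embeddings: Analogical Reasoning
Embeddings are useful for a wide variety of prediction tasks in NLP. Short of training a full part-of-speech or named-entity model, one simple way to evaluate embeddings is to use them directly to predict syntactic and semantic relationships, for example by solving questions of the form `king is to queen as father is to ?`. This is called *analogical reasoning*; see [Mikolov and colleagues](http://msr-waypoint.com/en-us/um/people/gzweig/Pubs/NAACL2013Regularities.pdf). The dataset can be downloaded from
https://word2vec.googlecode.com/svn/trunk/questions-words.txt.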
To see how we do this evaluation, have a look at how the `build_eval_graph()` and
`eval()` functions are used in
[tensorflow/models/embedding/word2vec.py](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/models/embedding/word2vec.py).
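
The idea behind analogical reasoning can be sketched in a few lines of plain Python. This is not the evaluation code from `word2vec.py`: the helpers `cosine` and `analogy` are hypothetical, and the two-dimensional embeddings are made up by hand so that the arithmetic is easy to follow.

```python
import math


def cosine(u, v):
  """Cosine similarity between two vectors given as lists."""
  dot = sum(a * b for a, b in zip(u, v))
  norm_u = math.sqrt(sum(a * a for a in u))
  norm_v = math.sqrt(sum(b * b for b in v))
  return dot / (norm_u * norm_v)


def analogy(a, b, c, embeddings):
  """Answer 'a is to b as c is to ?' with the word closest to b - a + c."""
  target = [y - x + z
            for x, y, z in zip(embeddings[a], embeddings[b], embeddings[c])]
  candidates = [w for w in embeddings if w not in (a, b, c)]
  return max(candidates, key=lambda w: cosine(embeddings[w], target))


# Made-up embeddings: the first axis loosely means "royal", the second "female".
embeddings = {
    "king": [1.0, 0.0],
    "queen": [1.0, 1.0],
    "father": [0.0, 0.1],
    "mother": [0.0, 1.1],
    "apple": [-1.0, 0.0],
}
answer = analogy("king", "queen", "father", embeddings)
```

Real embeddings have hundreds of dimensions and a vocabulary of many thousands of words, but the query is the same: find the nearest neighbour of the offset vector, excluding the three question words.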
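The choice of hyperparameters strongly influences accuracy on this task. Getting the model to perform really well requires a very large training dataset, careful tuning of the hyperparameters and tricks such as subsampling, all of which are beyond the scope of this tutorial.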
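## Optimizing the Implementation
The simple example above showcases the flexibility of TensorFlow. For instance, we can easily swap `tf.nn.nce_loss()` for the off-the-shelf `tf.nn.sampled_softmax_loss()` as the training objective. If you have a new idea for a loss function, you can write the expression for the new objective by hand in TensorFlow and let the optimizer compute its derivatives. This flexibility is invaluable while exploring a machine learning model, when we want to try out many ideas quickly and pick the best one.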
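Once you have a model structure you are satisfied with, it may be worth optimizing the implementation so that it runs more efficiently (and covers more data in less time). For example, the simple code used in this tutorial is slower than it could be, because we use Python to read and feed the data items, each of which requires very little work on the TensorFlow back-end. If you find that your model is seriously bottlenecked on input data, you may want to implement a custom data reader for your problem, as described in
[New Data Formats](../../how_tos/new_data_formats/index.md). For the Skip-Gram
model, we have already done this as an example in
[tensorflow/models/embedding/word2vec.py](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/models/embedding/word2vec.py).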
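If your model is no longer I/O bound and you want still more performance, you can write your own TensorFlow ops, as described in
[Adding a New Op](../../how_tos/adding_an_op/index.md). Again, we provide an example of this for the Skip-Gram model in
[tensorflow/models/embedding/word2vec_optimized.py](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/models/embedding/word2vec_optimized.py). Feel free to benchmark these implementations against each other to measure the performance improvement at each stage.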
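## Conclusion
In this tutorial we covered the word2vec model, an effective model for learning word embeddings. We explained why embeddings are useful and discussed how to train the model efficiently with TensorFlow. Overall, we hope this example has shown how TensorFlow gives you the flexibility you need for early experimentation, and the control over the internals that you need later when optimizing the model.
Original: [Vector Representation of Words](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/g3doc/tutorials/word2vec/index.md) Translator: [btpeter](https://github.com/btpeter) Proofreader: waiwaizheng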
================================================
FILE: SOURCE/tutorials/word2vec/word2vec_basic.py
================================================
from __future__ import print_function
import tensorflow.python.platform
import collections
import math
import numpy as np
import os
import random
import tensorflow as tf
import urllib
import zipfile
# Step 1: Download the data.
url = 'http://mattmahoney.net/dc/'
def maybe_download(filename, expected_bytes):
  """Download a file if not present, and make sure it's the right size."""
  if not os.path.exists(filename):
    filename, _ = urllib.urlretrieve(url + filename, filename)
  statinfo = os.stat(filename)
  if statinfo.st_size == expected_bytes:
    print('Found and verified', filename)
  else:
    print(statinfo.st_size)
    raise Exception(
        'Failed to verify ' + filename + '. Can you get to it with a browser?')
  return filename
filename = maybe_download('text8.zip', 31344016)
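# Read the data into a list of words.
def read_data(filename):
  f = zipfile.ZipFile(filename)
  try:
    # The archive holds a single text file; split its contents on whitespace.
    return f.read(f.namelist()[0]).split()
  finally:
    # Close the archive even though we return from inside the try block.
    f.close()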
words = read_data(filename)
print('Data size', len(words))
# Step 2: Build the dictionary and replace rare words with UNK token.
vocabulary_size = 50000
def build_dataset(words):
  count = [['UNK', -1]]
  count.extend(collections.Counter(words).most_common(vocabulary_size - 1))
  dictionary = dict()
  for word, _ in count:
    dictionary[word] = len(dictionary)
  data = list()
  unk_count = 0
  for word in words:
    if word in dictionary:
      index = dictionary[word]
    else:
      index = 0  # dictionary['UNK']
      unk_count = unk_count + 1
    data.append(index)
  count[0][1] = unk_count
  reverse_dictionary = dict(zip(dictionary.values(), dictionary.keys()))
  return data, count, dictionary, reverse_dictionary
data, count, dictionary, reverse_dictionary = build_dataset(words)
del words # Hint to reduce memory.
print('Most common words (+UNK)', count[:5])
print('Sample data', data[:10])
data_index = 0
# Step 4: Function to generate a training batch for the skip-gram model.
def generate_batch(batch_size, num_skips, skip_window):
  global data_index
  assert batch_size % num_skips == 0
  assert num_skips <= 2 * skip_window
  batch = np.ndarray(shape=(batch_size), dtype=np.int32)
  labels = np.ndarray(shape=(batch_size, 1), dtype=np.int32)
  span = 2 * skip_window + 1  # [ skip_window target skip_window ]
  buffer = collections.deque(maxlen=span)
  for _ in range(span):
    buffer.append(data[data_index])
    data_index = (data_index + 1) % len(data)
  # Integer division keeps the loop bound an int on both Python 2 and 3.
  for i in range(batch_size // num_skips):
    target = skip_window  # target label at the center of the buffer
    targets_to_avoid = [ skip_window ]
    for j in range(num_skips):
      # Pick a context position that has not been used for this center word.
      while target in targets_to_avoid:
        target = random.randint(0, span - 1)
      targets_to_avoid.append(target)
      batch[i * num_skips + j] = buffer[skip_window]
      labels[i * num_skips + j, 0] = buffer[target]
    # Slide the window one word to the right.
    buffer.append(data[data_index])
    data_index = (data_index + 1) % len(data)
  return batch, labels
batch, labels = generate_batch(batch_size=8, num_skips=2, skip_window=1)
for i in range(8):
  print(batch[i], '->', labels[i, 0])
  print(reverse_dictionary[batch[i]], '->', reverse_dictionary[labels[i, 0]])
# Step 5: Build and train a skip-gram model.
batch_size = 128
embedding_size = 128 # Dimension of the embedding vector.
skip_window = 1 # How many words to consider left and right.
num_skips = 2 # How many times to reuse an input to generate a label.
# We pick a random validation set to sample nearest neighbors. Here we limit the
# validation samples to the words that have a low numeric ID, which by
# construction are also the most frequent.
valid_size = 16 # Random set of words to evaluate similarity on.
valid_window = 100 # Only pick dev samples in the head of the distribution.
valid_examples = np.array(random.sample(xrange(valid_window), valid_size))
num_sampled = 64 # Number of negative examples to sample.
graph = tf.Graph()
with graph.as_default():
  # Input data.
  train_inputs = tf.placeholder(tf.int32, shape=[batch_size])
  train_labels = tf.placeholder(tf.int32, shape=[batch_size, 1])
  valid_dataset = tf.constant(valid_examples, dtype=tf.int32)
  # Construct the variables.
  embeddings = tf.Variable(
      tf.random_uniform([vocabulary_size, embedding_size], -1.0, 1.0))
  nce_weights = tf.Variable(
      tf.truncated_normal([vocabulary_size, embedding_size],
                          stddev=1.0 / math.sqrt(embedding_size)))
  nce_biases = tf.Variable(tf.zeros([vocabulary_size]))
  # Look up embeddings for inputs.
  embed = tf.nn.embedding_lookup(embeddings, train_inputs)
  # Compute the average NCE loss for the batch.
  # tf.nn.nce_loss automatically draws a new sample of the negative labels each
  # time we evaluate the loss.
  loss = tf.reduce_mean(
      tf.nn.nce_loss(nce_weights, nce_biases, train_labels, embed,
                     num_sampled, vocabulary_size))
  # Construct the SGD optimizer using a learning rate of 1.0.
  optimizer = tf.train.GradientDescentOptimizer(1.0).minimize(loss)
  # Compute the cosine similarity between minibatch examples and all embeddings.
  norm = tf.sqrt(tf.reduce_sum(tf.square(embeddings), 1, keep_dims=True))
  normalized_embeddings = embeddings / norm
  valid_embeddings = tf.nn.embedding_lookup(
      normalized_embeddings, valid_dataset)
  similarity = tf.matmul(
      valid_embeddings, normalized_embeddings, transpose_b=True)
# Step 6: Begin training
num_steps = 100001
with tf.Session(graph=graph) as session:
  # We must initialize all variables before we use them.
  tf.initialize_all_variables().run()
  print("Initialized")
  average_loss = 0
  for step in xrange(num_steps):
    batch_inputs, batch_labels = generate_batch(
        batch_size, num_skips, skip_window)
    feed_dict = {train_inputs : batch_inputs, train_labels : batch_labels}
    # We perform one update step by evaluating the optimizer op (including it
    # in the list of returned values for session.run()).
    _, loss_val = session.run([optimizer, loss], feed_dict=feed_dict)
    average_loss += loss_val
    if step % 2000 == 0:
      if step > 0:
        average_loss = average_loss / 2000
      # The average loss is an estimate of the loss over the last 2000 batches.
      print("Average loss at step ", step, ": ", average_loss)
      average_loss = 0
    # Note that this is expensive (~20% slowdown if computed every 500 steps).
    if step % 10000 == 0:
      sim = similarity.eval()
      for i in xrange(valid_size):
        valid_word = reverse_dictionary[valid_examples[i]]
        top_k = 8  # number of nearest neighbors
        nearest = (-sim[i, :]).argsort()[1:top_k+1]
        log_str = "Nearest to %s:" % valid_word
        for k in xrange(top_k):
          close_word = reverse_dictionary[nearest[k]]
          log_str = "%s %s," % (log_str, close_word)
        print(log_str)
  final_embeddings = normalized_embeddings.eval()
# Step 7: Visualize the embeddings.
def plot_with_labels(low_dim_embs, labels, filename='tsne.png'):
  assert low_dim_embs.shape[0] >= len(labels), "More labels than embeddings"
  plt.figure(figsize=(18, 18))  # in inches
  for i, label in enumerate(labels):
    x, y = low_dim_embs[i,:]
    plt.scatter(x, y)
    plt.annotate(label,
                 xy=(x, y),
                 xytext=(5, 2),
                 textcoords='offset points',
                 ha='right',
                 va='bottom')
  plt.savefig(filename)
try:
  from sklearn.manifold import TSNE
  import matplotlib.pyplot as plt
  tsne = TSNE(perplexity=30, n_components=2, init='pca', n_iter=5000)
  plot_only = 500
  low_dim_embs = tsne.fit_transform(final_embeddings[:plot_only,:])
  # Row i of the embedding matrix belongs to word id i, so look labels up by id
  # instead of relying on the (arbitrary) key order of the dictionary.
  labels = [reverse_dictionary[i] for i in xrange(plot_only)]
  plot_with_labels(low_dim_embs, labels)
except ImportError:
  print("Please install sklearn and matplotlib to visualize embeddings.")
================================================
FILE: SOURCE/tutorials/word2vec.md
================================================
# Vector Representations of Words
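In this tutorial we look at the word2vec model described by [Mikolov et al](http://papers.nips.cc/paper/5021-distributed-representations-of-words-and-phrases-and-their-compositionality.pdf). The model is used to learn vector representations of words, called "word embeddings".
## Highlights
This tutorial highlights the interesting, substantive parts of building a word2vec model in TensorFlow.
* We start with the motivation for representing words as vectors.
* We look at the intuition behind the model and how it is trained (with a little math along the way).
* We also show a simple implementation of the model in TensorFlow.
* Finally, we look at ways to make this simple version perform better.
We walk through the code step by step later in the tutorial, but if you would rather dive straight in, you can find a minimal implementation in
[tensorflow/g3doc/tutorials/word2vec/word2vec_basic.py](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/g3doc/tutorials/word2vec/word2vec_basic.py). This basic example contains the code needed to download some data, train on it briefly, and display the result. Once you are comfortable with the basic version, you can move on to
[tensorflow/models/embedding/word2vec.py](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/models/embedding/word2vec.py), a more complete implementation that also shows some more advanced TensorFlow features, such as how to use threads to feed data into a text model efficiently and how to checkpoint during training.
But first, let's look at why we would want to learn word embeddings at all. If you are already an expert on embeddings, feel free to skip this section and get your hands dirty with the details.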
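## Motivation: Why Learn Word Embeddings?
Image and audio processing systems work with rich, high-dimensional datasets encoded as vectors of the raw pixel intensities of an image or, for example, the power spectral density coefficients of an audio signal. For tasks such as object or speech recognition, all the information needed to perform the task is already encoded in the raw data (humans, after all, perform these tasks from the raw data). Natural language processing systems, however, traditionally treat words as discrete atomic symbols: "cat" may be represented as `Id537` and "dog" as `Id143`. These encodings are arbitrary and provide no information about the relationships that may exist between words. In other words, when processing data about "dogs" the model can use very little of what it has learned about "cats" (for example, that both are animals, have four legs, and can be pets). Representing words as unique, discrete ids also leads to data sparsity, which usually means that more data is needed to train statistical models successfully. Vector representations of words can overcome some of these obstacles.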
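The contrast between discrete ids and vector representations can be made concrete with a small sketch. Everything in it is an assumption for illustration: the three-word vocabulary, the id assignment, and the dense vectors are made-up numbers rather than trained embeddings. It shows that one-hot encodings of distinct ids are always orthogonal (cosine similarity 0), whereas dense vectors can place related words close together.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def one_hot(index, size):
    """Vector of zeros with a single 1.0 at position `index`."""
    return [1.0 if i == index else 0.0 for i in range(size)]

# Discrete ids (arbitrary): every pair of distinct words looks equally unrelated.
vocab = {"cat": 0, "dog": 1, "car": 2}
sim_ids_cat_dog = cosine(one_hot(vocab["cat"], 3), one_hot(vocab["dog"], 3))
sim_ids_cat_car = cosine(one_hot(vocab["cat"], 3), one_hot(vocab["car"], 3))

# Dense vectors (hypothetical values): related words point in similar directions.
dense = {
    "cat": [0.9, 0.8, 0.1],
    "dog": [0.8, 0.9, 0.2],
    "car": [0.1, 0.0, 0.9],
}
sim_dense_cat_dog = cosine(dense["cat"], dense["dog"])
sim_dense_cat_car = cosine(dense["cat"], dense["car"])

print("ids:   cat~dog =", sim_ids_cat_dog, " cat~car =", sim_ids_cat_car)
print("dense: cat~dog =", sim_dense_cat_dog, " cat~car =", sim_dense_cat_car)
```

With the id encoding a model has no way to transfer what it knows about "cat" to "dog"; with dense vectors the similarity is available directly from the representation.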
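```
// Wrap the code region with three ` characters above and below
public class Person {
    // Indent code with four spaces; do not use tabs
}
```
Result:
```
// Wrap the code region with three ` characters above and below
public class Person {
    // Indent code with four spaces; do not use tabs
}
```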
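## Using Anchors
Writing example:
```
Jikexueyuan is the largest online IT vocational education platform in China. [[anchor name]](#footnode)
[anchor name] My example.
```
Result:
Jikexueyuan is the largest online IT vocational education platform in China. [[anchor name]](#footnode)
[anchor name] My example.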
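## Special Notes
1. Put a space on both sides between English letters and Chinese text (no space between English text and punctuation).
2. Put a space on both sides of highlighted English text.
3. Leave a blank line above the opening \`\`\` of a code block and a blank line below the closing \`\`\`.
4. Use Chinese punctuation between Chinese text and short English phrases.
5. An article may contain only one top-level `#` heading.
6. Leave one space after the `#` characters of a heading.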
## Tables
Writing example:
```
| Prefix | Framework |
| ------------- |:-------------:|
| NS | Foundation (OS X and iOS) and Application Kit (OS X) |
| UI | UIKit (iOS) |
| AB | Address Book |
| CA | Core Animation |
| CI | Core Image |
```
Result:
| Prefix | Framework |
| ------------- |:-------------:|
| NS | Foundation (OS X and iOS) and Application Kit (OS X) |
| UI | UIKit (iOS) |
| AB | Address Book |
| CA | Core Animation |
| CI | Core Image |
================================================
FILE: tex_pdf/api/c4s00.tex
================================================
\section{Overview}\label{api_overview}
TensorFlow has APIs available in several languages both for constructing
and executing a TensorFlow graph. The Python API is at present the most
complete and the easiest to use, but the C++ API may offer some
performance advantages in graph execution, and supports deployment to
small devices such as Android.
Over time, we hope that the TensorFlow community will develop front ends
for languages like Go, Java, JavaScript, Lua, R, and perhaps others. With
\href{http://swig.org}{SWIG}, it's relatively easy to develop a
TensorFlow interface for your favorite language.
Note: Many practical aspects of usage are covered in the Mechanics tab,
and some additional documentation not specific to any particular
language API is available in the Resources tab.
================================================
FILE: tex_pdf/api/cc/ClassEnv.md
================================================
# Class `tensorflow::Env`
An interface used by the tensorflow implementation to access operating system functionality like the filesystem etc.
Callers may wish to provide a custom Env object to get fine-grained control.
All Env implementations are safe for concurrent access from multiple threads without any external synchronization.
## Member Summary
* [`tensorflow::Env::Env()`](#tensorflow_Env_Env)
* [`virtual tensorflow::Env::~Env()`](#virtual_tensorflow_Env_Env)
* [`virtual Status tensorflow::Env::NewRandomAccessFile(const string &fname, RandomAccessFile **result)=0`](#virtual_Status_tensorflow_Env_NewRandomAccessFile)
* Creates a brand new random access read-only file with the specified name.
* [`virtual Status tensorflow::Env::NewWritableFile(const string &fname, WritableFile **result)=0`](#virtual_Status_tensorflow_Env_NewWritableFile)
* Creates an object that writes to a new file with the specified name.
* [`virtual Status tensorflow::Env::NewAppendableFile(const string &fname, WritableFile **result)=0`](#virtual_Status_tensorflow_Env_NewAppendableFile)
* Creates an object that either appends to an existing file, or writes to a new file (if the file does not exist to begin with).
* [`virtual bool tensorflow::Env::FileExists(const string &fname)=0`](#virtual_bool_tensorflow_Env_FileExists)
* Returns true iff the named file exists.
* [`virtual Status tensorflow::Env::GetChildren(const string &dir, std::vector< string > *result)=0`](#virtual_Status_tensorflow_Env_GetChildren)
* Stores in *result the names of the children of the specified directory. The names are relative to "dir".
* [`virtual Status tensorflow::Env::DeleteFile(const string &fname)=0`](#virtual_Status_tensorflow_Env_DeleteFile)
* Deletes the named file.
* [`virtual Status tensorflow::Env::CreateDir(const string &dirname)=0`](#virtual_Status_tensorflow_Env_CreateDir)
* Creates the specified directory.
* [`virtual Status tensorflow::Env::DeleteDir(const string &dirname)=0`](#virtual_Status_tensorflow_Env_DeleteDir)
* Deletes the specified directory.
* [`virtual Status tensorflow::Env::GetFileSize(const string &fname, uint64 *file_size)=0`](#virtual_Status_tensorflow_Env_GetFileSize)
* Stores the size of fname in *file_size.
* [`virtual Status tensorflow::Env::RenameFile(const string &src, const string &target)=0`](#virtual_Status_tensorflow_Env_RenameFile)
* Renames file src to target. If target already exists, it will be replaced.
* [`virtual uint64 tensorflow::Env::NowMicros()=0`](#virtual_uint64_tensorflow_Env_NowMicros)
* Returns the number of micro-seconds since some fixed point in time. Only useful for computing deltas of time.
* [`virtual void tensorflow::Env::SleepForMicroseconds(int micros)=0`](#virtual_void_tensorflow_Env_SleepForMicroseconds)
* Sleeps/delays the thread for the prescribed number of micro-seconds.
* [`virtual Thread* tensorflow::Env::StartThread(const ThreadOptions &thread_options, const string &name, std::function< void()> fn) TF_MUST_USE_RESULT=0`](#virtual_Thread_tensorflow_Env_StartThread)
* Returns a new thread that is running fn() and is identified (for debugging/performance-analysis) by "name".
* [`static Env* tensorflow::Env::Default()`](#static_Env_tensorflow_Env_Default)
* Returns a default environment suitable for the current operating system.
## Member Details
#### `tensorflow::Env::Env()`
#### `virtual tensorflow::Env::~Env()`
#### `virtual Status tensorflow::Env::NewRandomAccessFile(const string &fname, RandomAccessFile **result)=0`
Creates a brand new random access read-only file with the specified name.
On success, stores a pointer to the new file in *result and returns OK. On failure stores NULL in *result and returns non-OK. If the file does not exist, returns a non-OK status.
The returned file may be concurrently accessed by multiple threads.
#### `virtual Status tensorflow::Env::NewWritableFile(const string &fname, WritableFile **result)=0`
Creates an object that writes to a new file with the specified name.
Deletes any existing file with the same name and creates a new file. On success, stores a pointer to the new file in *result and returns OK. On failure stores NULL in *result and returns non-OK.
The returned file will only be accessed by one thread at a time.
#### `virtual Status tensorflow::Env::NewAppendableFile(const string &fname, WritableFile **result)=0`
Creates an object that either appends to an existing file, or writes to a new file (if the file does not exist to begin with).
On success, stores a pointer to the new file in *result and returns OK. On failure stores NULL in *result and returns non-OK.
The returned file will only be accessed by one thread at a time.
#### `virtual bool tensorflow::Env::FileExists(const string &fname)=0`
Returns true iff the named file exists.
#### `virtual Status tensorflow::Env::GetChildren(const string &dir, std::vector< string > *result)=0`
Stores in *result the names of the children of the specified directory. The names are relative to "dir".
Original contents of *results are dropped.
#### `virtual Status tensorflow::Env::DeleteFile(const string &fname)=0`
Deletes the named file.
#### `virtual Status tensorflow::Env::CreateDir(const string &dirname)=0`
Creates the specified directory.
#### `virtual Status tensorflow::Env::DeleteDir(const string &dirname)=0`
Deletes the specified directory.
#### `virtual Status tensorflow::Env::GetFileSize(const string &fname, uint64 *file_size)=0`
Stores the size of fname in *file_size.
#### `virtual Status tensorflow::Env::RenameFile(const string &src, const string &target)=0`
Renames file src to target. If target already exists, it will be replaced.
#### `virtual uint64 tensorflow::Env::NowMicros()=0`
Returns the number of micro-seconds since some fixed point in time. Only useful for computing deltas of time.
#### `virtual void tensorflow::Env::SleepForMicroseconds(int micros)=0`
Sleeps/delays the thread for the prescribed number of micro-seconds.
#### `virtual Thread* tensorflow::Env::StartThread(const ThreadOptions &thread_options, const string &name, std::function< void()> fn) TF_MUST_USE_RESULT=0`
Returns a new thread that is running fn() and is identified (for debugging/performance-analysis) by "name".
Caller takes ownership of the result and must delete it eventually (the deletion will block until fn() stops running).
#### `static Env* tensorflow::Env::Default()`
Returns a default environment suitable for the current operating system.
Sophisticated users may wish to provide their own Env implementation instead of relying on this default environment.
The result of Default() belongs to this library and must never be deleted.
================================================
FILE: tex_pdf/api/cc/ClassEnvWrapper.md
================================================
# Class `tensorflow::EnvWrapper`
An implementation of Env that forwards all calls to another Env .
May be useful to clients who wish to override just part of the functionality of another Env .
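The forwarding pattern described above can be sketched as follows. This is a Python analogy only, not the TensorFlow C++ API: the classes `DefaultEnv`, `EnvWrapper`, and `FakeClockEnv` and their snake_case methods are hypothetical stand-ins invented for this example. The wrapper forwards every call to its target, and a subclass overrides just the one piece of behavior it cares about (here, the clock).

```python
import os
import time

class DefaultEnv(object):
    """Hypothetical stand-in for a default environment."""
    def file_exists(self, fname):
        return os.path.exists(fname)
    def now_micros(self):
        return int(time.time() * 1000000)

class EnvWrapper(object):
    """Forwards every call to `target`; subclasses override only what they need."""
    def __init__(self, target):
        self._target = target
    def target(self):
        return self._target
    def file_exists(self, fname):
        return self._target.file_exists(fname)
    def now_micros(self):
        return self._target.now_micros()

class FakeClockEnv(EnvWrapper):
    """Overrides only the clock, e.g. to make timing-dependent code deterministic."""
    def __init__(self, target, start_micros=0):
        EnvWrapper.__init__(self, target)
        self._now = start_micros
    def now_micros(self):
        return self._now
    def advance(self, micros):
        self._now += micros

env = FakeClockEnv(DefaultEnv(), start_micros=100)
env.advance(50)
print(env.now_micros())            # value comes from the override
print(env.file_exists(os.curdir))  # everything else is forwarded to the target
```

The design choice is the usual one for decorators: code that accepts an environment object does not need to know whether it received the default one or a wrapper around it.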
## Member Summary
* [`tensorflow::EnvWrapper::EnvWrapper(Env *t)`](#tensorflow_EnvWrapper_EnvWrapper)
* Initializes an EnvWrapper that delegates all calls to *t.
* [`virtual tensorflow::EnvWrapper::~EnvWrapper()`](#virtual_tensorflow_EnvWrapper_EnvWrapper)
* [`Env* tensorflow::EnvWrapper::target() const`](#Env_tensorflow_EnvWrapper_target)
* Returns the target to which this Env forwards all calls.
* [`Status tensorflow::EnvWrapper::NewRandomAccessFile(const string &f, RandomAccessFile **r) override`](#Status_tensorflow_EnvWrapper_NewRandomAccessFile)
* Creates a brand new random access read-only file with the specified name.
* [`Status tensorflow::EnvWrapper::NewWritableFile(const string &f, WritableFile **r) override`](#Status_tensorflow_EnvWrapper_NewWritableFile)
* Creates an object that writes to a new file with the specified name.
* [`Status tensorflow::EnvWrapper::NewAppendableFile(const string &f, WritableFile **r) override`](#Status_tensorflow_EnvWrapper_NewAppendableFile)
* Creates an object that either appends to an existing file, or writes to a new file (if the file does not exist to begin with).
* [`bool tensorflow::EnvWrapper::FileExists(const string &f) override`](#bool_tensorflow_EnvWrapper_FileExists)
* Returns true iff the named file exists.
* [`Status tensorflow::EnvWrapper::GetChildren(const string &dir, std::vector< string > *r) override`](#Status_tensorflow_EnvWrapper_GetChildren)
* Stores in *result the names of the children of the specified directory. The names are relative to "dir".
* [`Status tensorflow::EnvWrapper::DeleteFile(const string &f) override`](#Status_tensorflow_EnvWrapper_DeleteFile)
* Deletes the named file.
* [`Status tensorflow::EnvWrapper::CreateDir(const string &d) override`](#Status_tensorflow_EnvWrapper_CreateDir)
* Creates the specified directory.
* [`Status tensorflow::EnvWrapper::DeleteDir(const string &d) override`](#Status_tensorflow_EnvWrapper_DeleteDir)
* Deletes the specified directory.
* [`Status tensorflow::EnvWrapper::GetFileSize(const string &f, uint64 *s) override`](#Status_tensorflow_EnvWrapper_GetFileSize)
* Stores the size of fname in *file_size.
* [`Status tensorflow::EnvWrapper::RenameFile(const string &s, const string &t) override`](#Status_tensorflow_EnvWrapper_RenameFile)
* Renames file src to target. If target already exists, it will be replaced.
* [`uint64 tensorflow::EnvWrapper::NowMicros() override`](#uint64_tensorflow_EnvWrapper_NowMicros)
* Returns the number of micro-seconds since some fixed point in time. Only useful for computing deltas of time.
* [`void tensorflow::EnvWrapper::SleepForMicroseconds(int micros) override`](#void_tensorflow_EnvWrapper_SleepForMicroseconds)
* Sleeps/delays the thread for the prescribed number of micro-seconds.
* [`Thread* tensorflow::EnvWrapper::StartThread(const ThreadOptions &thread_options, const string &name, std::function< void()> fn) override`](#Thread_tensorflow_EnvWrapper_StartThread)
* Returns a new thread that is running fn() and is identified (for debugging/performance-analysis) by "name".
## Member Details
#### `tensorflow::EnvWrapper::EnvWrapper(Env *t)`
Initializes an EnvWrapper that delegates all calls to *t.
#### `virtual tensorflow::EnvWrapper::~EnvWrapper()`
#### `Env* tensorflow::EnvWrapper::target() const`
Returns the target to which this Env forwards all calls.
#### `Status tensorflow::EnvWrapper::NewRandomAccessFile(const string &f, RandomAccessFile **r) override`
Creates a brand new random access read-only file with the specified name.
On success, stores a pointer to the new file in *result and returns OK. On failure stores NULL in *result and returns non-OK. If the file does not exist, returns a non-OK status.
The returned file may be concurrently accessed by multiple threads.
#### `Status tensorflow::EnvWrapper::NewWritableFile(const string &f, WritableFile **r) override`
Creates an object that writes to a new file with the specified name.
Deletes any existing file with the same name and creates a new file. On success, stores a pointer to the new file in *result and returns OK. On failure stores NULL in *result and returns non-OK.
The returned file will only be accessed by one thread at a time.
#### `Status tensorflow::EnvWrapper::NewAppendableFile(const string &f, WritableFile **r) override`
Creates an object that either appends to an existing file, or writes to a new file (if the file does not exist to begin with).
On success, stores a pointer to the new file in *result and returns OK. On failure stores NULL in *result and returns non-OK.
The returned file will only be accessed by one thread at a time.
#### `bool tensorflow::EnvWrapper::FileExists(const string &f) override`
Returns true iff the named file exists.
#### `Status tensorflow::EnvWrapper::GetChildren(const string &dir, std::vector< string > *r) override`
Stores in *result the names of the children of the specified directory. The names are relative to "dir".
Original contents of *results are dropped.
#### `Status tensorflow::EnvWrapper::DeleteFile(const string &f) override`
Deletes the named file.
#### `Status tensorflow::EnvWrapper::CreateDir(const string &d) override`
Creates the specified directory.
#### `Status tensorflow::EnvWrapper::DeleteDir(const string &d) override`
Deletes the specified directory.
#### `Status tensorflow::EnvWrapper::GetFileSize(const string &f, uint64 *s) override`
Stores the size of fname in *file_size.
#### `Status tensorflow::EnvWrapper::RenameFile(const string &s, const string &t) override`
Renames file src to target. If target already exists, it will be replaced.
#### `uint64 tensorflow::EnvWrapper::NowMicros() override`
Returns the number of micro-seconds since some fixed point in time. Only useful for computing deltas of time.
#### `void tensorflow::EnvWrapper::SleepForMicroseconds(int micros) override`
Sleeps/delays the thread for the prescribed number of micro-seconds.
#### `Thread* tensorflow::EnvWrapper::StartThread(const ThreadOptions &thread_options, const string &name, std::function< void()> fn) override`
Returns a new thread that is running fn() and is identified (for debugging/performance-analysis) by "name".
Caller takes ownership of the result and must delete it eventually (the deletion will block until fn() stops running).
================================================
FILE: tex_pdf/api/cc/ClassRandomAccessFile.md
================================================
# Class `tensorflow::RandomAccessFile`
A file abstraction for randomly reading the contents of a file.
## Member Summary
* [`tensorflow::RandomAccessFile::RandomAccessFile()`](#tensorflow_RandomAccessFile_RandomAccessFile)
* [`virtual tensorflow::RandomAccessFile::~RandomAccessFile()`](#virtual_tensorflow_RandomAccessFile_RandomAccessFile)
* [`virtual Status tensorflow::RandomAccessFile::Read(uint64 offset, size_t n, StringPiece *result, char *scratch) const =0`](#virtual_Status_tensorflow_RandomAccessFile_Read)
* Reads up to "n" bytes from the file starting at "offset".
## Member Details
#### `tensorflow::RandomAccessFile::RandomAccessFile()`
#### `virtual tensorflow::RandomAccessFile::~RandomAccessFile()`
#### `virtual Status tensorflow::RandomAccessFile::Read(uint64 offset, size_t n, StringPiece *result, char *scratch) const =0`
Reads up to "n" bytes from the file starting at "offset".
"scratch[0..n-1]" may be written by this routine. Sets "*result" to the data that was read (including if fewer than "n" bytes were successfully read). May set "*result" to point at data in "scratch[0..n-1]", so "scratch[0..n-1]" must be live when "*result" is used.
On OK returned status: "n" bytes have been stored in "*result". On non-OK returned status: [0..n] bytes have been stored in "*result".
Returns `OUT_OF_RANGE` if fewer than n bytes were stored in "*result" because of EOF.
Safe for concurrent use by multiple threads.
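The "read up to n bytes, report `OUT_OF_RANGE` on a short read" contract can be illustrated with a small Python sketch. It is an analogy under stated assumptions, not the TensorFlow implementation: the helper `read_at`, the string constants `OK` and `OUT_OF_RANGE`, and the temporary demo file are all made up for this example, and the `scratch` buffer of the C++ signature has no counterpart here.

```python
import os
import tempfile

OK = "OK"
OUT_OF_RANGE = "OUT_OF_RANGE"

def read_at(path, offset, n):
    """Read up to n bytes starting at offset; report OUT_OF_RANGE on a short read."""
    with open(path, "rb") as f:
        f.seek(offset)
        data = f.read(n)
    # Fewer than n bytes means we hit end-of-file before the request was satisfied.
    status = OK if len(data) == n else OUT_OF_RANGE
    return status, data

# Demo on a 10-byte temporary file.
fd, demo_path = tempfile.mkstemp()
try:
    with os.fdopen(fd, "wb") as f:
        f.write(b"0123456789")
    full = read_at(demo_path, 2, 4)   # request lies entirely inside the file
    short = read_at(demo_path, 8, 4)  # request runs past the end of the file
finally:
    os.remove(demo_path)

print(full)
print(short)
```

Note that the short read still returns the bytes that were available, which mirrors the statement that on a non-OK status between 0 and n bytes have been stored in the result.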
================================================
FILE: tex_pdf/api/cc/ClassSession.md
================================================
# Class `tensorflow::Session`
A Session instance lets a caller drive a TensorFlow graph computation.
When a Session is created with a given target, a new Session object is bound to the universe of resources specified by that target. Those resources are available to this session to perform computation described in the GraphDef. After extending the session with a graph, the caller uses the Run() API to perform the computation and potentially fetch outputs as Tensors.
Example:
```c++
tensorflow::GraphDef graph;
// ... Create or load graph into "graph".
// This example uses the default options which connects
// to a local runtime.
tensorflow::SessionOptions options;
std::unique_ptr<tensorflow::Session>
session(tensorflow::NewSession(options));
// Create the session with this graph.
tensorflow::Status s = session->Create(graph);
if (!s.ok()) { ... }
// Run the graph and fetch the first output of the "output"
// operation, and also run to but do not return anything
// for the "update_state" operation.
std::vector<tensorflow::Tensor> outputs;
s = session->Run({}, {"output:0"}, {"update_state"}, &outputs);
if (!s.ok()) { ... }
// Map the output as a flattened float tensor, and do something
// with it.
auto output_tensor = outputs[0].flat<float>();
if (output_tensor(0) > 0.5) { ... }
// Close the session to release the resources associated with
// this session.
session->Close();
```
A Session allows concurrent calls to Run() , though a Session must be created / extended by a single thread.
Only one thread must call Close() , and Close() must only be called after all other calls to Run() have returned.
## Member Summary
* [`virtual Status tensorflow::Session::Create(const GraphDef &graph)=0`](#virtual_Status_tensorflow_Session_Create)
* Create the graph to be used for the session.
* [`virtual Status tensorflow::Session::Extend(const GraphDef &graph)=0`](#virtual_Status_tensorflow_Session_Extend)
* Adds operations to the graph that is already registered with the Session .
* [`virtual Status tensorflow::Session::Run(const std::vector< std::pair< string, Tensor > > &inputs, const std::vector< string > &output_tensor_names, const std::vector< string > &target_node_names, std::vector< Tensor > *outputs)=0`](#virtual_Status_tensorflow_Session_Run)
* Runs the graph with the provided input tensors and fills `outputs` for the endpoints specified in `output_tensor_names`. Runs to but does not return Tensors for the nodes in `target_node_names`.
* [`virtual Status tensorflow::Session::Close()=0`](#virtual_Status_tensorflow_Session_Close)
* Closes this session.
* [`virtual tensorflow::Session::~Session()`](#virtual_tensorflow_Session_Session)
## Member Details
#### `virtual Status tensorflow::Session::Create(const GraphDef &graph)=0`
Create the graph to be used for the session.
Returns an error if this session has already been created with a graph. To re-use the session with a different graph, the caller must Close() the session first.
#### `virtual Status tensorflow::Session::Extend(const GraphDef &graph)=0`
Adds operations to the graph that is already registered with the Session .
The names of new operations in "graph" must not exist in the graph that is already registered.
#### `virtual Status tensorflow::Session::Run(const std::vector< std::pair< string, Tensor > > &inputs, const std::vector< string > &output_tensor_names, const std::vector< string > &target_node_names, std::vector< Tensor > *outputs)=0`
Runs the graph with the provided input tensors and fills `outputs` for the endpoints specified in `output_tensor_names`. Runs to but does not return Tensors for the nodes in `target_node_names`.
The order of tensors in `outputs` will match the order provided by `output_tensor_names`.
If `Run` returns `OK()`, then `outputs->size()` will be equal to `output_tensor_names.size()`. If `Run` does not return `OK()`, the state of `outputs` is undefined.
REQUIRES: The name of each Tensor of the input or output must match a "Tensor endpoint" in the `GraphDef` passed to ` Create() `.
REQUIRES: outputs is not nullptr if `output_tensor_names` is non-empty.
#### `virtual Status tensorflow::Session::Close()=0`
Closes this session.
Closing a session releases the resources used by this session on the TensorFlow runtime (specified during session creation by the ` SessionOptions::target ` field).
#### `virtual tensorflow::Session::~Session()`
================================================
FILE: tex_pdf/api/cc/ClassStatus.md
================================================
# Class `tensorflow::Status`
## Member Summary
* [`tensorflow::Status::Status()`](#tensorflow_Status_Status)
* Create a success status.
* [`tensorflow::Status::~Status()`](#tensorflow_Status_Status)
* [`tensorflow::Status::Status(tensorflow::error::Code code, tensorflow::StringPiece msg)`](#tensorflow_Status_Status)
* Create a status with the specified error code and msg as a human-readable string containing more detailed information.
* [`tensorflow::Status::Status(const Status &s)`](#tensorflow_Status_Status)
* Copy the specified status.
* [`void tensorflow::Status::operator=(const Status &s)`](#void_tensorflow_Status_operator_)
* [`bool tensorflow::Status::ok() const`](#bool_tensorflow_Status_ok)
* Returns true iff the status indicates success.
* [`tensorflow::error::Code tensorflow::Status::code() const`](#tensorflow_error_Code_tensorflow_Status_code)
* [`const string& tensorflow::Status::error_message() const`](#const_string_tensorflow_Status_error_message)
* [`bool tensorflow::Status::operator==(const Status &x) const`](#bool_tensorflow_Status_operator_)
* [`bool tensorflow::Status::operator!=(const Status &x) const`](#bool_tensorflow_Status_operator_)
* [`void tensorflow::Status::Update(const Status &new_status)`](#void_tensorflow_Status_Update)
* If ` ok() `, stores `new_status` into `*this`. If `!ok()`, preserves the current status, but may augment with additional information about `new_status`.
* [`string tensorflow::Status::ToString() const`](#string_tensorflow_Status_ToString)
* Return a string representation of this status suitable for printing. Returns the string `"OK"` for success.
* [`static Status tensorflow::Status::OK()`](#static_Status_tensorflow_Status_OK)
## Member Details
#### `tensorflow::Status::Status()`
Create a success status.
#### `tensorflow::Status::~Status()`
#### `tensorflow::Status::Status(tensorflow::error::Code code, tensorflow::StringPiece msg)`
Create a status with the specified error code and msg as a human-readable string containing more detailed information.
#### `tensorflow::Status::Status(const Status &s)`
Copy the specified status.
#### `void tensorflow::Status::operator=(const Status &s)`
#### `bool tensorflow::Status::ok() const`
Returns true iff the status indicates success.
#### `tensorflow::error::Code tensorflow::Status::code() const`
#### `const string& tensorflow::Status::error_message() const`
#### `bool tensorflow::Status::operator==(const Status &x) const`
#### `bool tensorflow::Status::operator!=(const Status &x) const`
#### `void tensorflow::Status::Update(const Status &new_status)`
If ` ok() `, stores `new_status` into `*this`. If `!ok()`, preserves the current status, but may augment with additional information about `new_status`.
This is a convenient way of keeping track of the first error encountered. Instead of `if (overall_status.ok()) overall_status = new_status;`, use `overall_status.Update(new_status);`.
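The first-error pattern can be sketched without linking against TensorFlow. The block below is a minimal illustration, not the library's implementation: `MiniStatus`, `FirstError`, and `CheckFirstErrorWins` are hypothetical names, and the stand-in models only the `ok()`, `error_message()`, and `Update()` behaviour described above.

```c++
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for tensorflow::Status, reduced to what Update() needs.
class MiniStatus {
 public:
  MiniStatus() : ok_(true) {}  // Success status.
  explicit MiniStatus(std::string msg) : ok_(false), msg_(std::move(msg)) {}

  bool ok() const { return ok_; }
  const std::string& error_message() const { return msg_; }

  // Stores new_status only while *this is still OK, so the first error wins.
  void Update(const MiniStatus& new_status) {
    if (ok()) *this = new_status;
  }

 private:
  bool ok_;
  std::string msg_;
};

// Folds a sequence of step results into the first error encountered.
MiniStatus FirstError(const std::vector<MiniStatus>& steps) {
  MiniStatus overall;
  for (const MiniStatus& s : steps) overall.Update(s);
  return overall;
}

// Usage: later errors do not overwrite the first one.
void CheckFirstErrorWins() {
  MiniStatus overall = FirstError(
      {MiniStatus(), MiniStatus("disk full"), MiniStatus("timeout")});
  assert(!overall.ok());
  assert(overall.error_message() == "disk full");
}
```

The real `Update()` may additionally augment the preserved status with information about `new_status`; the stand-in omits that.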
#### `string tensorflow::Status::ToString() const`
Return a string representation of this status suitable for printing. Returns the string `"OK"` for success.
#### `static Status tensorflow::Status::OK()`
================================================
FILE: tex_pdf/api/cc/ClassTensor.md
================================================
# Class `tensorflow::Tensor`
Represents an n-dimensional array of values.
##Member Summary
* [`tensorflow::Tensor::Tensor()`](#tensorflow_Tensor_Tensor)
* Default Tensor constructor. Creates a 1-dimension, 0-element float tensor.
* [`tensorflow::Tensor::Tensor(DataType type, const TensorShape &shape)`](#tensorflow_Tensor_Tensor)
* Creates a Tensor of the given `type` and `shape`.
* [`tensorflow::Tensor::Tensor(Allocator *a, DataType type, const TensorShape &shape)`](#tensorflow_Tensor_Tensor)
* Creates a tensor with the input `type` and `shape`, using the allocator `a` to allocate the underlying buffer.
* [`tensorflow::Tensor::Tensor(DataType type)`](#tensorflow_Tensor_Tensor)
* Creates an uninitialized Tensor of the given data type.
* [`tensorflow::Tensor::Tensor(const Tensor &other)`](#tensorflow_Tensor_Tensor)
* Copy constructor.
* [`tensorflow::Tensor::~Tensor()`](#tensorflow_Tensor_Tensor)
* [`DataType tensorflow::Tensor::dtype() const`](#DataType_tensorflow_Tensor_dtype)
* Returns the data type.
* [`const TensorShape& tensorflow::Tensor::shape() const`](#const_TensorShape_tensorflow_Tensor_shape)
* Returns the shape of the tensor.
* [`int tensorflow::Tensor::dims() const`](#int_tensorflow_Tensor_dims)
* Convenience accessor for the tensor shape.
* [`int64 tensorflow::Tensor::dim_size(int d) const`](#int64_tensorflow_Tensor_dim_size)
* Convenience accessor for the tensor shape.
* [`int64 tensorflow::Tensor::NumElements() const`](#int64_tensorflow_Tensor_NumElements)
* Convenience accessor for the tensor shape.
* [`bool tensorflow::Tensor::IsSameSize(const Tensor &b) const`](#bool_tensorflow_Tensor_IsSameSize)
* [`bool tensorflow::Tensor::IsInitialized() const`](#bool_tensorflow_Tensor_IsInitialized)
* Has this Tensor been initialized?
* [`size_t tensorflow::Tensor::TotalBytes() const`](#size_t_tensorflow_Tensor_TotalBytes)
* Returns the estimated memory usage of this tensor.
* [`Tensor& tensorflow::Tensor::operator=(const Tensor &other)`](#Tensor_tensorflow_Tensor_operator_)
* Assign operator. This tensor shares other's underlying storage.
* [`bool tensorflow::Tensor::CopyFrom(const Tensor &other, const TensorShape &shape) TF_MUST_USE_RESULT`](#bool_tensorflow_Tensor_CopyFrom)
* Copy the other tensor into this tensor and reshape it.
* [`Tensor tensorflow::Tensor::Slice(int64 dim0_start, int64 dim0_limit) const`](#Tensor_tensorflow_Tensor_Slice)
* Slice this tensor along the 1st dimension.
* [`bool tensorflow::Tensor::FromProto(const TensorProto &other) TF_MUST_USE_RESULT`](#bool_tensorflow_Tensor_FromProto)
* Parse `other` and construct the tensor.
* [`bool tensorflow::Tensor::FromProto(Allocator *a, const TensorProto &other) TF_MUST_USE_RESULT`](#bool_tensorflow_Tensor_FromProto)
* [`void tensorflow::Tensor::AsProtoField(TensorProto *proto) const`](#void_tensorflow_Tensor_AsProtoField)
* Fills in `proto` with `*this` tensor's content.
* [`void tensorflow::Tensor::AsProtoTensorContent(TensorProto *proto) const`](#void_tensorflow_Tensor_AsProtoTensorContent)
* [`TTypes< T >::Vec tensorflow::Tensor::vec()`](#TTypes_T_Vec_tensorflow_Tensor_vec)
* Return the tensor data as an `Eigen::Tensor` with the type and sizes of this ` Tensor `.
* [`TTypes< T >::Matrix tensorflow::Tensor::matrix()`](#TTypes_T_Matrix_tensorflow_Tensor_matrix)
* [`TTypes< T, NDIMS >::Tensor tensorflow::Tensor::tensor()`](#TTypes_T_NDIMS_Tensor_tensorflow_Tensor_tensor)
* [`TTypes< T >::Flat tensorflow::Tensor::flat()`](#TTypes_T_Flat_tensorflow_Tensor_flat)
* Return the tensor data as an `Eigen::Tensor` of the data type and a specified shape.
* [`TTypes< T >::UnalignedFlat tensorflow::Tensor::unaligned_flat()`](#TTypes_T_UnalignedFlat_tensorflow_Tensor_unaligned_flat)
* [`TTypes< T >::Matrix tensorflow::Tensor::flat_inner_dims()`](#TTypes_T_Matrix_tensorflow_Tensor_flat_inner_dims)
* [`TTypes< T >::Matrix tensorflow::Tensor::flat_outer_dims()`](#TTypes_T_Matrix_tensorflow_Tensor_flat_outer_dims)
* [`TTypes< T, NDIMS >::Tensor tensorflow::Tensor::shaped(gtl::ArraySlice< int64 > new_sizes)`](#TTypes_T_NDIMS_Tensor_tensorflow_Tensor_shaped)
* [`TTypes< T, NDIMS >::UnalignedTensor tensorflow::Tensor::unaligned_shaped(gtl::ArraySlice< int64 > new_sizes)`](#TTypes_T_NDIMS_UnalignedTensor_tensorflow_Tensor_unaligned_shaped)
* [`TTypes< T >::Scalar tensorflow::Tensor::scalar()`](#TTypes_T_Scalar_tensorflow_Tensor_scalar)
* Return the Tensor data as a `TensorMap` of fixed size 1: `TensorMap<TensorFixedSize<T, 1>>`.
* [`TTypes< T >::ConstVec tensorflow::Tensor::vec() const`](#TTypes_T_ConstVec_tensorflow_Tensor_vec)
* Const versions of all the methods above.
* [`TTypes< T >::ConstMatrix tensorflow::Tensor::matrix() const`](#TTypes_T_ConstMatrix_tensorflow_Tensor_matrix)
* [`TTypes< T, NDIMS >::ConstTensor tensorflow::Tensor::tensor() const`](#TTypes_T_NDIMS_ConstTensor_tensorflow_Tensor_tensor)
* [`TTypes< T >::ConstFlat tensorflow::Tensor::flat() const`](#TTypes_T_ConstFlat_tensorflow_Tensor_flat)
* [`TTypes< T >::UnalignedConstFlat tensorflow::Tensor::unaligned_flat() const`](#TTypes_T_UnalignedConstFlat_tensorflow_Tensor_unaligned_flat)
* [`TTypes< T >::ConstMatrix tensorflow::Tensor::flat_inner_dims() const`](#TTypes_T_ConstMatrix_tensorflow_Tensor_flat_inner_dims)
* [`TTypes< T >::ConstMatrix tensorflow::Tensor::flat_outer_dims() const`](#TTypes_T_ConstMatrix_tensorflow_Tensor_flat_outer_dims)
* [`TTypes< T, NDIMS >::ConstTensor tensorflow::Tensor::shaped(gtl::ArraySlice< int64 > new_sizes) const`](#TTypes_T_NDIMS_ConstTensor_tensorflow_Tensor_shaped)
* [`TTypes< T, NDIMS >::UnalignedConstTensor tensorflow::Tensor::unaligned_shaped(gtl::ArraySlice< int64 > new_sizes) const`](#TTypes_T_NDIMS_UnalignedConstTensor_tensorflow_Tensor_unaligned_shaped)
* [`TTypes< T >::ConstScalar tensorflow::Tensor::scalar() const`](#TTypes_T_ConstScalar_tensorflow_Tensor_scalar)
* [`string tensorflow::Tensor::SummarizeValue(int64 max_entries) const`](#string_tensorflow_Tensor_SummarizeValue)
* Render the first `max_entries` values in `*this` into a string.
* [`string tensorflow::Tensor::DebugString() const`](#string_tensorflow_Tensor_DebugString)
* A human-readable summary of the tensor suitable for debugging.
* [`void tensorflow::Tensor::FillDescription(TensorDescription *description) const`](#void_tensorflow_Tensor_FillDescription)
* [`StringPiece tensorflow::Tensor::tensor_data() const`](#StringPiece_tensorflow_Tensor_tensor_data)
* Returns a `StringPiece` mapping the current tensor's buffer.
##Member Details
#### `tensorflow::Tensor::Tensor()`
Default Tensor constructor. Creates a 1-dimension, 0-element float tensor.
#### `tensorflow::Tensor::Tensor(DataType type, const TensorShape &shape)`
Creates a Tensor of the given `type` and `shape`.
The underlying buffer is allocated using a `CPUAllocator`.
#### `tensorflow::Tensor::Tensor(Allocator *a, DataType type, const TensorShape &shape)`
Creates a tensor with the input `type` and `shape`, using the allocator `a` to allocate the underlying buffer.
`a` must outlive this Tensor.
#### `tensorflow::Tensor::Tensor(DataType type)`
Creates an uninitialized Tensor of the given data type.
#### `tensorflow::Tensor::Tensor(const Tensor &other)`
Copy constructor.
#### `tensorflow::Tensor::~Tensor()`
#### `DataType tensorflow::Tensor::dtype() const`
Returns the data type.
#### `const TensorShape& tensorflow::Tensor::shape() const`
Returns the shape of the tensor.
#### `int tensorflow::Tensor::dims() const`
Convenience accessor for the tensor shape.
For all shape accessors, see comments for relevant methods of ` TensorShape ` in ` tensor_shape.h `.
#### `int64 tensorflow::Tensor::dim_size(int d) const`
Convenience accessor for the tensor shape.
#### `int64 tensorflow::Tensor::NumElements() const`
Convenience accessor for the tensor shape.
#### `bool tensorflow::Tensor::IsSameSize(const Tensor &b) const`
#### `bool tensorflow::Tensor::IsInitialized() const`
Has this Tensor been initialized?
#### `size_t tensorflow::Tensor::TotalBytes() const`
Returns the estimated memory usage of this tensor.
#### `Tensor& tensorflow::Tensor::operator=(const Tensor &other)`
Assign operator. This tensor shares other's underlying storage.
#### `bool tensorflow::Tensor::CopyFrom(const Tensor &other, const TensorShape &shape) TF_MUST_USE_RESULT`
Copy the other tensor into this tensor and reshape it.
This tensor shares other's underlying storage. Returns `true` iff `other.shape()` has the same number of elements as the given `shape`.
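The element-count condition can be illustrated on plain dimension lists. This is a sketch under stated assumptions: the free functions `NumElements` and `CanCopyFromWithShape` are hypothetical helpers written for this example, not part of the TensorFlow API, and they only model the documented return value of `CopyFrom()`.

```c++
#include <cassert>
#include <cstdint>
#include <vector>

// Number of elements implied by a list of dimension sizes (1 for a scalar).
int64_t NumElements(const std::vector<int64_t>& dims) {
  int64_t n = 1;
  for (int64_t d : dims) {
    assert(d >= 0);  // Dimension sizes are never negative.
    n *= d;
  }
  return n;
}

// Mirrors the documented condition: the copy-and-reshape succeeds iff the
// source shape and the requested shape hold the same number of elements.
bool CanCopyFromWithShape(const std::vector<int64_t>& other_dims,
                          const std::vector<int64_t>& new_dims) {
  return NumElements(other_dims) == NumElements(new_dims);
}
```

For example, a 4 x 3 x 5 tensor (60 elements) could be viewed as 12 x 5 but not as 4 x 8.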
#### `Tensor tensorflow::Tensor::Slice(int64 dim0_start, int64 dim0_limit) const`
Slice this tensor along the 1st dimension.
I.e., the returned tensor satisfies `returned[i, ...] == this[dim0_start + i, ...]`. The returned tensor shares the underlying tensor buffer with this tensor.
NOTE: Depending on the shape, the returned tensor may not satisfy the same alignment requirement as this tensor. The caller must check the returned tensor's alignment before calling methods that have an alignment requirement (e.g., ` flat() `, `tensor()`).
REQUIRES: ` dims() ` >= 1

REQUIRES: `0 <= dim0_start <= dim0_limit <= dim_size(0)`
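The slicing relation and the buffer sharing can be sketched with a small row-major stand-in. Everything here is hypothetical and for illustration only: `MiniMatrix` and `MakeIota` are not TensorFlow types, the stand-in is restricted to two dimensions, and it does not model alignment.

```c++
#include <cassert>
#include <cstdint>
#include <memory>
#include <vector>

// Hypothetical stand-in: a 2-D row-major float tensor that shares its buffer.
struct MiniMatrix {
  std::shared_ptr<std::vector<float>> buffer;  // Shared; Slice never copies it.
  int64_t offset;  // Index of element [0, 0] within *buffer.
  int64_t rows;
  int64_t cols;

  float& at(int64_t r, int64_t c) const {
    return (*buffer)[offset + r * cols + c];
  }

  // returned[i, ...] == this[dim0_start + i, ...]
  MiniMatrix Slice(int64_t dim0_start, int64_t dim0_limit) const {
    assert(0 <= dim0_start && dim0_start <= dim0_limit && dim0_limit <= rows);
    return MiniMatrix{buffer, offset + dim0_start * cols,
                      dim0_limit - dim0_start, cols};
  }
};

// Builds a rows x cols matrix filled with 0, 1, 2, ... in row-major order.
MiniMatrix MakeIota(int64_t rows, int64_t cols) {
  auto buf = std::make_shared<std::vector<float>>(rows * cols);
  for (int64_t i = 0; i < rows * cols; ++i) (*buf)[i] = static_cast<float>(i);
  return MiniMatrix{buf, 0, rows, cols};
}
```

Because the slice starts at a non-zero offset into the same buffer, a write through the slice is visible in the original. The same offset is why a real slice may not meet the alignment requirement of the tensor it was taken from.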
#### `bool tensorflow::Tensor::FromProto(const TensorProto &other) TF_MUST_USE_RESULT`
Parse `other` and construct the tensor.
Returns `true` iff the parsing succeeds. If the parsing fails, the state of `*this` is unchanged.
#### `bool tensorflow::Tensor::FromProto(Allocator *a, const TensorProto &other) TF_MUST_USE_RESULT`
#### `void tensorflow::Tensor::AsProtoField(TensorProto *proto) const`
Fills in `proto` with `*this` tensor's content.
` AsProtoField() ` fills in the repeated field for `proto.dtype()`, while `AsProtoTensorContent()` encodes the content in `proto.tensor_content()` in a compact form.
#### `void tensorflow::Tensor::AsProtoTensorContent(TensorProto *proto) const`
#### `TTypes< T >::Vec tensorflow::Tensor::vec()`
Return the tensor data as an `Eigen::Tensor` with the type and sizes of this ` Tensor `.
Use these methods when you know the data type and the number of dimensions of the Tensor and you want an `Eigen::Tensor` automatically sized to the ` Tensor ` sizes. The implementation `CHECK`-fails if either the type or the sizes mismatch.
Example:
```c++
typedef float T;
Tensor my_mat(...built with Shape{rows: 3, cols: 5}...);
auto mat = my_mat.matrix<T>();      // 2D Eigen::Tensor, 3 x 5.
auto mat = my_mat.tensor<T, 2>();   // 2D Eigen::Tensor, 3 x 5.
auto vec = my_mat.vec<T>();         // CHECK fails as my_mat is 2D.
auto vec = my_mat.tensor<T, 3>();   // CHECK fails as my_mat is 2D.
auto mat = my_mat.matrix<int32>();  // CHECK fails as type mismatch.
```
#### `TTypes< T >::Matrix tensorflow::Tensor::matrix()`
#### `TTypes< T, NDIMS >::Tensor tensorflow::Tensor::tensor()`
#### `TTypes< T >::Flat tensorflow::Tensor::flat()`
Return the tensor data as an `Eigen::Tensor` of the data type and a specified shape.
These methods allow you to access the data with the dimensions and sizes of your choice. You do not need to know the number of dimensions of the Tensor to call them. However, they `CHECK` that the type matches and that the requested dimensions create an `Eigen::Tensor` with the same number of elements as the tensor.
Example:
```c++
typedef float T;
Tensor my_ten(...built with Shape{planes: 4, rows: 3, cols: 5}...);
// 1D Eigen::Tensor, size 60:
auto flat = my_ten.flat<T>();
// 2D Eigen::Tensor 12 x 5:
auto inner = my_ten.flat_inner_dims<T>();
// 2D Eigen::Tensor 4 x 15:
auto outer = my_ten.shaped<T, 2>({4, 15});
// CHECK fails, bad num elements:
auto outer = my_ten.shaped<T, 2>({4, 8});
// 3D Eigen::Tensor 6 x 5 x 2:
auto weird = my_ten.shaped<T, 3>({6, 5, 2});
// CHECK fails, type mismatch:
auto bad = my_ten.flat<int32>();
```
#### `TTypes< T >::UnalignedFlat tensorflow::Tensor::unaligned_flat()`
#### `TTypes< T >::Matrix tensorflow::Tensor::flat_inner_dims()`
Returns the data as an Eigen::Tensor with 2 dimensions, collapsing all Tensor dimensions but the last one into the first dimension of the result.
#### `TTypes< T >::Matrix tensorflow::Tensor::flat_outer_dims()`
Returns the data as an Eigen::Tensor with 2 dimensions, collapsing all Tensor dimensions but the first one into the last dimension of the result.
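The two collapsing rules can be written out as shape arithmetic. The sketch below is illustrative only: `Shape2D`, `FlatInnerDims`, and `FlatOuterDims` are hypothetical free functions operating on plain dimension lists, and they compute just the resulting 2-D sizes, not a view of any data.

```c++
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// (rows, cols) of the resulting 2-D view.
using Shape2D = std::pair<int64_t, int64_t>;

// All dimensions but the last collapse into the first result dimension.
Shape2D FlatInnerDims(const std::vector<int64_t>& dims) {
  assert(!dims.empty());
  int64_t rows = 1;
  for (std::size_t i = 0; i + 1 < dims.size(); ++i) rows *= dims[i];
  return Shape2D(rows, dims.back());
}

// All dimensions but the first collapse into the last result dimension.
Shape2D FlatOuterDims(const std::vector<int64_t>& dims) {
  assert(!dims.empty());
  int64_t cols = 1;
  for (std::size_t i = 1; i < dims.size(); ++i) cols *= dims[i];
  return Shape2D(dims.front(), cols);
}
```

For the 4 x 3 x 5 tensor used in the `flat()` example, the inner form is 12 x 5 and the outer form is 4 x 15; both keep all 60 elements.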
#### `TTypes< T, NDIMS >::Tensor tensorflow::Tensor::shaped(gtl::ArraySlice< int64 > new_sizes)`
#### `TTypes< T, NDIMS >::UnalignedTensor tensorflow::Tensor::unaligned_shaped(gtl::ArraySlice< int64 > new_sizes)`
#### `TTypes< T >::Scalar tensorflow::Tensor::scalar()`
Return the Tensor data as a `TensorMap` of fixed size 1: `TensorMap<TensorFixedSize<T, 1>>`.
Using ` scalar() ` allows the compiler to perform optimizations as the size of the tensor is known at compile time.
#### `TTypes< T >::ConstVec tensorflow::Tensor::vec() const`
Const versions of all the methods above.
#### `TTypes< T >::ConstMatrix tensorflow::Tensor::matrix() const`
#### `TTypes< T, NDIMS >::ConstTensor tensorflow::Tensor::tensor() const`
#### `TTypes< T >::ConstFlat tensorflow::Tensor::flat() const`
#### `TTypes< T >::UnalignedConstFlat tensorflow::Tensor::unaligned_flat() const`
#### `TTypes< T >::ConstMatrix tensorflow::Tensor::flat_inner_dims() const`
#### `TTypes< T >::ConstMatrix tensorflow::Tensor::flat_outer_dims() const`